Knowledge Base

NORMALIZE_PHONE

Phone Number Normalization in NQL

Overview

This reference provides a concise guide to using the NORMALIZE_PHONE User-Defined Function (UDF) in NQL to standardize phone numbers into one of several recognized formats, including E.164, national, international, or RFC3966. By using this function, you can ensure consistent phone number representations across data sources and workflows.


Function: NORMALIZE_PHONE

Arguments

  1. format (string):
    The desired phone number format. Options include:
    • E164: Returns the phone number in the E.164 international format (e.g., +15145555555).
    • NATIONAL: Returns the phone number in the national format (e.g., (514) 555-5555 for a US/Canada number).
    • INTERNATIONAL: Returns the phone number in a more readable international format (e.g., +1 514-555-5555).
    • RFC3966: Returns the phone number in RFC3966 format (e.g., tel:+1-514-555-5555).
  2. phone_number (string):
    The input phone number to normalize.
  3. region (string, optional):
    A region code (e.g., US, CA) used to infer the country code when it is not explicitly included in the input number.

What It Does

  1. Removes extraneous characters:
    Strips out spaces, hyphens, parentheses, and other symbols.
  2. Applies region-specific rules:
    When a region is provided, the UDF will use localized rules (e.g., for North American numbers) to infer a country code if missing.
  3. Converts to specified format:
    Depending on the format argument, the function will produce an E.164, NATIONAL, INTERNATIONAL, or RFC3966 output.
  4. Handles invalid input:
    Returns NULL for invalid or unparseable phone numbers to preserve data integrity.

Examples

Example 1: Basic Usage

SELECT 
    NORMALIZE_PHONE('E164', '5145555555', 'CA')       AS "Normalized Phone",
    NORMALIZE_PHONE('NATIONAL', '+15145555555', NULL) AS "National Format",
    NORMALIZE_PHONE('E164', 'NOT VALID', NULL)        AS "Invalid Phone"
FROM company_data.test_normalize_phone;

Expected Output

Input PhoneRegionNormalized PhoneNational FormatInvalid Phone
5145555555CA+15145555555--
+15145555555NULL-(514) 555-5555-
NOT VALIDNULL--NULL
  • In the first case, 5145555555 with region CA becomes +15145555555 in E.164 format.
  • In the second case, +15145555555 is converted to the NATIONAL format (assuming a North American region).
  • In the third case, an invalid input returns NULL.

Example 2: Multiple Format Outputs

The following query normalizes the same phone number into E.164, NATIONAL, INTERNATIONAL, and RFC3966 formats side by side:

SELECT 
    "PHONE",
    NORMALIZE_PHONE('E164', "PHONE", 'US')          AS "NORMALIZED_PHONE_E164",
    NORMALIZE_PHONE('NATIONAL', "PHONE", 'US')      AS "NORMALIZED_PHONE_NATIONAL",
    NORMALIZE_PHONE('INTERNATIONAL', "PHONE", 'US') AS "NORMALIZED_PHONE_INTERNATIONAL",
    NORMALIZE_PHONE('RFC3966', "PHONE", 'US')       AS "NORMALIZED_PHONE_RFC3966"
FROM company_data.verisk_tci_v3;

Illustrative Output

PHONENORMALIZED_PHONE_E164NORMALIZED_PHONE_NATIONALNORMALIZED_PHONE_INTERNATIONALNORMALIZED_PHONE_RFC3966
15145555555+15145555555(514) 555-5555+1 514-555-5555tel:+1-514-555-5555
+15145556666+15145556666(514) 555-6666+1 514-555-6666tel:+1-514-555-6666
...............

Use these formats as needed—E.164 for standardized storage and identity resolution, NATIONAL for local usage, INTERNATIONAL for human-readable global display, and RFC3966 for SIP/VoIP systems or URLs.


Key Benefits

  1. Standardization: Ensures that all phone numbers conform to a consistent structure across your data pipelines.
  2. Global Compatibility: E.164 is recognized globally, making it ideal for cross-border applications and identity resolution.
  3. Flexibility: NATIONAL, INTERNATIONAL, and RFC3966 formats cover a wide range of use cases, from user-facing displays to telephony URIs.
  4. Error Handling: Automatically returns NULL for invalid or unparseable inputs, helping maintain high data quality.

Notes

  • E.164 phone numbers can have up to 15 digits (excluding the + sign).
  • Providing the correct region code is critical for numbers lacking a country prefix.
  • As a best practice, test your queries in the NQL Editor to catch potential errors and validate outputs before running them in production.

Related Resources

< Back
Rosetta

Hi! I’m Rosetta, your big data assistant. Ask me anything! If you want to talk to one of our wonderful human team members, let me know! I can schedule a call for you.