NORMALIZE_EMAIL

Email Normalization in NQL

Overview

This reference explains how to use the NORMALIZE_EMAIL user-defined function (UDF) in NQL to standardize email addresses for consistent data processing.

Function: NORMALIZE_EMAIL

Arguments

  • email (string): The email address to normalize.

What It Does

The NORMALIZE_EMAIL UDF standardizes email addresses by:

  1. Converting the email address to lowercase.
  2. Removing leading and trailing whitespace.
  3. For Gmail addresses:
    • Removing all periods (.) in the local part.
    • Removing the + symbol and everything after it in the local part.
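
The steps above can be sketched in Python as follows. This only illustrates the documented behavior and is not the UDF's actual implementation. The function name normalize_email, the exact match on the gmail.com domain, and the handling of input without an @ are assumptions made for the sketch.

```python
def normalize_email(email: str) -> str:
    """Hypothetical sketch of the documented steps; not the real UDF code."""
    # Steps 1 and 2: trim surrounding whitespace and lowercase the address.
    cleaned = email.strip().lower()

    # Split on the last "@" into local part and domain.
    local, sep, domain = cleaned.rpartition("@")
    if not sep:
        # Assumption: input without "@" is returned trimmed and lowercased.
        return cleaned

    # Step 3: Gmail-only rules (assumption: "Gmail" means exactly gmail.com).
    if domain == "gmail.com":
        local = local.split("+", 1)[0]   # drop "+" and everything after it
        local = local.replace(".", "")   # drop periods in the local part

    return f"{local}@{domain}"


print(normalize_email("  User.Name+promo@gmail.com "))
print(normalize_email("JOHNDOE@Yahoo.COM"))
```

In this sketch a non-Gmail address keeps its periods and + suffix; only case and surrounding whitespace change.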

Example

NQL Query

SELECT "EMAIL", 
       NORMALIZE_EMAIL("EMAIL") AS "NORMALIZED_EMAIL"
FROM company_data.test_normalize_email;

Expected Output

Original Email               Normalized Email
User.Name+promo@gmail.com    username@gmail.com
admin@Example.com            admin@example.com
JOHNDOE@Yahoo.COM            johndoe@yahoo.com
user.name+news@gmail.com     username@gmail.com
alice@domain.com             alice@domain.com

Key Benefits

  • Consistency: Normalized email addresses ensure consistent results in operations like hashing and deduplication.
  • Data Cleanliness: Removes extraneous formatting variations.
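
To illustrate why normalization matters for hashing and deduplication, here is a small Python sketch. It is an illustration under assumptions, not how NQL works internally: normalize_email is a hypothetical stand-in for the UDF, email_hash is a hypothetical helper, and SHA-256 is chosen only as an example hash.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Hypothetical stand-in for the UDF, following the documented steps."""
    cleaned = email.strip().lower()
    local, sep, domain = cleaned.rpartition("@")
    if not sep:
        return cleaned  # assumption: no "@" means nothing more to do
    if domain == "gmail.com":  # assumption: exact domain match
        local = local.split("+", 1)[0].replace(".", "")
    return f"{local}@{domain}"


def email_hash(email: str) -> str:
    """Hypothetical helper: SHA-256 hex digest of the UTF-8 encoded address."""
    return hashlib.sha256(email.encode("utf-8")).hexdigest()


raw = [
    "User.Name+promo@gmail.com",
    " user.name+news@gmail.com ",
    "admin@Example.com",
]

# Hashing raw values treats every formatting variant as a different address.
raw_hashes = {email_hash(e) for e in raw}

# Hashing normalized values collapses variants of the same mailbox.
normalized_hashes = {email_hash(normalize_email(e)) for e in raw}

print(len(raw_hashes), len(normalized_hashes))  # prints "3 2"
```

The two Gmail variants hash to the same value only after normalization, which is what makes deduplication on the hashed column reliable.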

Notes

  • Use the NQL Editor to validate your query before execution.
  • The period and + rules apply only to Gmail addresses; lowercasing and whitespace trimming apply to addresses on every domain.