Understanding Dataset Details in Narrative's Platform

Overview

This guide provides an in-depth look at the Dataset Details section in Narrative's platform, exploring each tab's unique functions and insights to help you manage and analyze your datasets effectively.

Overview Tab

The Overview tab offers high-level information about the dataset, including:

  • Display Name: The dataset's name as it appears in the system.
  • Unique Name: A system-generated unique identifier.
  • Dataset ID: A unique numerical ID for reference.
  • File Type: Indicates the type of file (e.g., CSV, JSONL).
  • Query Engine: Specifies the engine used for data processing.
  • Creation Date: Date and time of dataset creation or latest update.

This tab provides essential metadata for quick dataset identification.

Statistics Tab

The Statistics tab is used for Quality Assurance (QA) and gives insights into the dataset’s structure and completeness:

  • Latest Ingestion Details: Information on the last ingestion date, total records, disk size, and file count.
  • Column Statistics: Summarizes each column so users can assess completeness, null counts, and value distribution.

This tab is useful for understanding the data quality and distribution within each column.
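
To make the column metrics concrete, here is a minimal Python sketch of how completeness, null count, and value distribution could be computed for one column of sample rows. It is an illustration only, not how the platform computes its statistics: the function name column_stats, the sample rows, and the choice to treat empty strings as nulls are all assumptions.

```python
from collections import Counter


def column_stats(rows, column):
    """Return total, null count, completeness, and top values for one column."""
    values = [row.get(column) for row in rows]
    total = len(values)
    # Assumption for this sketch: None and empty strings both count as nulls.
    non_null = [v for v in values if v is not None and v != ""]
    null_count = total - len(non_null)
    completeness = len(non_null) / total if total else 0.0
    return {
        "total": total,
        "null_count": null_count,
        "completeness": completeness,
        # The three most frequent non-null values, as (value, count) pairs.
        "top_values": Counter(non_null).most_common(3),
    }


# Hypothetical sample rows, not real platform data.
rows = [
    {"id": 1, "gender": "F"},
    {"id": 2, "gender": "M"},
    {"id": 3, "gender": None},
    {"id": 4, "gender": "F"},
]

stats = column_stats(rows, "gender")
print(stats)
```

A low completeness value or an unexpected top value in a sketch like this corresponds to the kind of issue the Statistics tab is meant to surface during QA.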

Schema Tab

In the Schema tab, users can verify and edit the structure of their dataset:

  • Property Name: Column names within the dataset.
  • Data Type: Data type for each column (e.g., String, Integer).
  • Display Name: The name used for display purposes.
  • Description: Optional descriptions for each column.
  • Required, Queryable, and Sensitive Flags: Indicate whether a column is mandatory, available for querying, or contains sensitive information.
  • Edit Option: Allows modifications to column details as needed.

The Schema tab is essential for ensuring data structure accuracy and compliance with internal standards.
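
The fields listed above can be pictured as one record per column. The Python sketch below models that record and shows how the Required and Sensitive flags might be used in a simple check. The class name ColumnSchema, the default flag values, and the example columns are hypothetical and do not reflect the platform's actual schema format.

```python
from dataclasses import dataclass


@dataclass
class ColumnSchema:
    """One schema entry, mirroring the fields shown in the Schema tab."""
    property_name: str
    data_type: str
    display_name: str = ""
    description: str = ""
    # Default flag values are assumptions made for this sketch.
    required: bool = False
    queryable: bool = True
    sensitive: bool = False


def missing_required(schema, row):
    """List required properties that are absent or None in a row."""
    return [
        col.property_name
        for col in schema
        if col.required and row.get(col.property_name) is None
    ]


# Hypothetical columns for illustration.
schema = [
    ColumnSchema("user_id", "String", required=True),
    ColumnSchema("age", "Integer"),
    ColumnSchema("email", "String", queryable=False, sensitive=True),
]

problems = missing_required(schema, {"age": 30})
sensitive_columns = [col.property_name for col in schema if col.sensitive]
print(problems, sensitive_columns)
```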

Sample Data Tab

The Sample Data tab provides a preview of dataset contents:

  • Dataset Sample: Displays a limited number of rows for quick review.
  • Sorting and Filtering: Options to sort columns or filter rows, helping users understand the data’s structure and values.

This tab is particularly useful for verifying data format and content without downloading the dataset.

Mappings Tab

The Mappings tab displays transformations applied to standardize the dataset to Narrative’s attribute framework:

  • Rosetta Stone Mappings: Lists standard mappings such as HL7 Gender Identity or custom mappings.
  • Attributes: Shows mapped attributes within the dataset.
  • Source, State, and Scope: Specifies data origin, active status, and privacy scope.
  • Expression: Contains the NQL expression used for normalization (e.g., mapping gender values).

Tip: Use the "Suggest Mappings" feature, powered by Rosetta AI, to automatically identify and create relevant mappings. You can also create mappings manually by clicking the "Create Mappings" button.

The Mappings tab helps ensure compatibility with Narrative's data standards.
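
As a rough illustration of what a normalization expression does, the Python sketch below maps raw gender values onto a small set of standard labels. It is not NQL and does not reproduce any Rosetta Stone mapping: the function name normalize_gender, the lookup table, and the target labels are assumptions chosen for the example.

```python
# Hypothetical target labels; the real attribute's allowed values may differ.
GENDER_MAP = {
    "m": "male",
    "male": "male",
    "f": "female",
    "female": "female",
}


def normalize_gender(raw):
    """Map a raw source value to a standard label, falling back to 'unknown'."""
    if raw is None:
        return "unknown"
    # Trim whitespace and ignore case before looking the value up.
    key = str(raw).strip().lower()
    return GENDER_MAP.get(key, "unknown")


raw_values = ["M", " female ", "F", None, "n/a"]
normalized = [normalize_gender(v) for v in raw_values]
print(normalized)
```

The fallback to a single catch-all label is a design choice for this sketch; a real mapping may instead leave unrecognized values unmapped.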

Jobs Tab

The Jobs tab provides an overview of all jobs associated with the dataset:

  • Job List: Displays the list of jobs linked to this dataset.
  • Creation and Runtime: Each job includes details on when it was created and its runtime duration.

Any dataset created through NQL relies on jobs to process its data, and this tab is where users can track that job history.

Access Rules Tab

The Access Rules tab defines dataset access permissions:

  • Rules Overview: Lists permissions that control user access to the dataset.
  • Price, Type, and Status: Information about pricing, access type (e.g., internal-only), and current status.

This tab allows administrators to manage data accessibility within the organization.

Ingestion History Tab

The Ingestion History tab tracks past ingestions for the dataset:

  • Snapshot ID: Unique identifier for each ingestion event.
  • Ingestion Date: Shows the exact date and time of each ingestion.
  • Records and Bytes Added: Tracks the number of records and bytes ingested.

This tab is useful for auditing changes in dataset content over time.
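
One way to audit growth over time is to accumulate the records and bytes added by each snapshot in date order. The Python sketch below shows that calculation on hand-written data. The field names (snapshot_id, ingested_at, records_added, bytes_added) and all values are hypothetical and are not an export format provided by the platform.

```python
from datetime import datetime


def running_totals(snapshots):
    """Return (snapshot_id, cumulative_records, cumulative_bytes) in date order."""
    ordered = sorted(
        snapshots, key=lambda s: datetime.fromisoformat(s["ingested_at"])
    )
    total_records = 0
    total_bytes = 0
    totals = []
    for snap in ordered:
        total_records += snap["records_added"]
        total_bytes += snap["bytes_added"]
        totals.append((snap["snapshot_id"], total_records, total_bytes))
    return totals


# Hypothetical ingestion history, deliberately out of order.
snapshots = [
    {"snapshot_id": 102, "ingested_at": "2024-01-12T09:30:00",
     "records_added": 250, "bytes_added": 13000},
    {"snapshot_id": 101, "ingested_at": "2024-01-05T10:00:00",
     "records_added": 1000, "bytes_added": 52000},
    {"snapshot_id": 103, "ingested_at": "2024-02-01T08:00:00",
     "records_added": 0, "bytes_added": 0},
]

history = running_totals(snapshots)
for snapshot_id, records, size in history:
    print(snapshot_id, records, size)
```

A snapshot that adds zero records, like the last one here, is the kind of entry worth a closer look during an audit.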

Connections Tab

The Connections tab enables users to distribute the dataset to external destinations:

  • Available Connectors: Lists available connectors, such as Trade Desk and Google DV360.
  • Connect Button: Allows users to initiate connections with one or more external systems.

This tab is the starting point for delivering the dataset to other platforms and tools.
