Setting Up a Managed S3 Bucket in Narrative
Introduction
After creating your dataset in Narrative, you might want to upload data directly or automate dataset deliveries for larger datasets or regular updates. To facilitate this, Narrative utilizes Amazon Simple Storage Service (S3), a scalable, secure, and high-speed object storage service offered by Amazon Web Services (AWS). This guide explains how to create a managed AWS S3 bucket within Narrative’s account for seamless data ingestion.
Prerequisites
- AWS Account: An active AWS account.
- AWS CLI (Optional): AWS Command Line Interface installed and configured on your local machine.
- Basic Understanding of AWS S3 and IAM: Familiarity with AWS services will aid in the setup process.
What is Amazon S3?
Amazon S3 (Simple Storage Service) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Data is stored as objects within buckets, and you can control access to your data using AWS Identity and Access Management (IAM) policies and bucket policies.
Creating Your Managed S3 Bucket
Step 1: AWS Account Setup
Ensure you have an AWS account. If not, create one following AWS’s account creation guide. Your AWS account will be used to configure access permissions to the managed S3 bucket and to upload data.
Step 2: Generate a Bucket in Narrative
- Navigate to Sources: Log in to the Narrative platform and go to the Sources section under My Data.
- Create New Source: Click on "Create New Data Source" and select "Managed AWS S3 Bucket" as the source type. This option allows you to use a bucket hosted within Narrative’s AWS account.
- Configuration Details: Provide the following information:
  - AWS Account ID: Enter your 12-digit AWS Account ID. This is required to grant your account access to the managed bucket. You can find your Account ID in the AWS Console under My Account.
  - Resource ID: A unique identifier for your bucket (e.g., your company name in lowercase without spaces). This ID is used to create the bucket name and must be unique across AWS.
  - Access Type: Choose one of the following:
    - Bucket Policy (Recommended): Ideal for granting access using bucket policies directly. Suitable for straightforward access control.
    - IAM Role: Narrative creates an IAM role for bucket access, which you assume to interact with the bucket. You may add an External ID for enhanced security, preventing the "confused deputy" problem.
- Create the Bucket: Click "Create" to set up the managed S3 bucket.
Note: The Access Type you choose will determine how you configure access in the following steps.
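Before submitting the form, it can help to sanity-check the two identifiers locally. The following is a minimal POSIX shell sketch, not part of Narrative or the AWS CLI: both function names are hypothetical, and the Resource ID rule (lowercase letters, digits, and hyphens, starting and ending with a letter or digit) is an assumption drawn from general S3 bucket naming rules rather than a documented Narrative requirement.

```shell
# Hypothetical helpers for checking the values entered in Step 2.

# Succeeds only if the argument is exactly 12 digits.
is_valid_account_id() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[0-9]{12}$'
}

# Assumed rule: lowercase letters, digits, and hyphens,
# beginning and ending with a letter or digit.
is_valid_resource_id() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}
```

For example, is_valid_resource_id acme-data succeeds, while is_valid_resource_id "Acme Data" fails because of the uppercase letters and the space.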
Access Types Explained
Bucket Policy Access
A Bucket Policy is a resource-based AWS Identity and Access Management (IAM) policy that allows you to define who can access your S3 bucket and objects. This method is ideal for:
- Cross-Account Access: Granting permissions to another AWS account.
- Direct Object Operations: Allows the use of standard AWS S3 commands like aws s3 cp without additional configuration.
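To make the idea of a resource-based policy concrete, the sketch below prints a generic cross-account bucket policy. It is only an illustration: this guide does not show the policy Narrative actually applies, and the function name, the Sid values, and the choice of actions (taken from the permissions mentioned in this guide) are assumptions.

```shell
# Hypothetical helper: prints an illustrative cross-account bucket policy.
# Usage: print_bucket_policy <your-12-digit-account-id> <bucket-name>
print_bucket_policy() {
  account_id=$1
  bucket=$2
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowListFromCustomerAccount",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${account_id}:root" },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Sid": "AllowPutFromCustomerAccount",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${account_id}:root" },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

Note the two resource forms: s3:ListBucket applies to the bucket ARN itself, while s3:PutObject applies to the objects under it. A policy like this delegates access to the account as a whole, so the IAM user or role you upload with still needs its own permissions for the same actions.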
IAM Role Access
An IAM Role is an AWS identity with permission policies that determine what the identity can and cannot do in AWS. This method:
- Security: Provides secure, temporary access to AWS resources.
- Managed Access: Narrative controls the IAM role, and you assume the role to access the bucket.
- Additional Configuration: Requires setting up AWS CLI profiles or SDK configurations to assume the role.
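To show what "assuming the role" involves, here is a rough sketch that requests temporary credentials with aws sts assume-role. The function names, the session name narrative-upload, and the DRY_RUN switch are illustrative assumptions; the role ARN and External ID are whatever Narrative provides. In day-to-day use a CLI profile is usually more convenient than calling STS by hand.

```shell
# Runs the given command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper.
# Usage: assume_narrative_role <role-arn> [external-id]
# With --output text, the three credential fields are printed on one line.
assume_narrative_role() {
  role_arn=$1
  external_id=${2:-}
  set -- aws sts assume-role \
    --role-arn "$role_arn" \
    --role-session-name narrative-upload \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text
  # Only pass --external-id when one was provided.
  if [ -n "$external_id" ]; then
    set -- "$@" --external-id "$external_id"
  fi
  run "$@"
}
```

Setting DRY_RUN=1 before calling assume_narrative_role prints the command without contacting AWS, which is a simple way to review it first.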
Configuring Bucket Access
Option A: Setting Up Bucket Policy Access
- Wait for Narrative's Configuration: Narrative will update the bucket policy to grant your AWS account access based on the AWS Account ID you provided.
- Verify Access Permissions: Ensure your IAM user or role has the necessary permissions (s3:ListBucket, s3:PutObject, etc.) to interact with S3 buckets.
- Uploading Data: Use the AWS CLI or SDKs to upload data to the managed bucket.
  - AWS CLI Command Example:
    aws s3 cp file.csv s3://narrative-managed-bucket-name/your_directory/ --acl bucket-owner-full-control
  - Explanation: s3://narrative-managed-bucket-name/your_directory/ is the path to your managed bucket. --acl bucket-owner-full-control ensures Narrative has full control over the uploaded objects.
  Important: Always include the --acl bucket-owner-full-control flag to grant Narrative access to the objects you upload.
- Verifying Upload: List the contents of the bucket to confirm your upload:
    aws s3 ls s3://narrative-managed-bucket-name/your_directory/
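The upload and verification steps for Bucket Policy access can be combined into one small function. This is a sketch under stated assumptions: the function names and the DRY_RUN switch are hypothetical, and the bucket name and directory are the placeholders used in this guide.

```shell
# Runs the given command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper.
# Usage: upload_and_verify <local-file> <bucket> <directory>
upload_and_verify() {
  file=$1
  bucket=$2
  prefix=${3%/}   # drop a trailing slash so the path has exactly one
  dest="s3://$bucket/$prefix/"
  # Upload with the ACL that gives the bucket owner full control.
  run aws s3 cp "$file" "$dest" --acl bucket-owner-full-control || return 1
  # List the uploaded key to confirm it arrived.
  run aws s3 ls "$dest$(basename "$file")"
}
```

A call such as upload_and_verify file.csv narrative-managed-bucket-name your_directory runs the same two commands shown above, stopping early if the upload fails.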
Option B: Setting Up IAM Role Access
- Receive IAM Role Details from Narrative: Narrative will provide you with the Role ARN and External ID (if applicable).
- Configure AWS CLI Profile: Edit your AWS CLI configuration file (~/.aws/config) to add a new profile:
    [profile narrative-role]
    role_arn = arn:aws:iam::NarrativeAWSAccountID:role/NarrativeRoleName
    source_profile = default
    external_id = ExternalIDProvidedByNarrative
  Replace:
  - NarrativeAWSAccountID with the AWS Account ID provided by Narrative.
  - NarrativeRoleName with the role name provided.
  - ExternalIDProvidedByNarrative with the External ID if one was provided.
- Assuming the Role: The AWS CLI will automatically assume the role when you specify the profile.
  - List Bucket Contents:
    aws s3 ls s3://narrative-managed-bucket-name/ --profile narrative-role
  - Uploading Data:
    aws s3 cp file.csv s3://narrative-managed-bucket-name/your_directory/ --profile narrative-role
  Note: When using IAM Role Access, you don't need to specify the --acl bucket-owner-full-control flag. The role belongs to Narrative's AWS account, so objects uploaded through it are already owned by that account.
- Verify Upload: Confirm that your data has been uploaded successfully.
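If you would rather script the profile setup than edit the configuration file by hand, something along these lines could work. It is a minimal sketch: the function name is hypothetical, the profile name narrative-role and source_profile = default follow the example in this guide, and the role ARN and External ID must come from Narrative.

```shell
# Hypothetical helper: appends the narrative-role profile to an AWS CLI config file.
# Usage: add_narrative_profile <config-file> <role-arn> [external-id]
add_narrative_profile() {
  config_file=$1
  role_arn=$2
  external_id=${3:-}
  # Refuse to add the same profile twice.
  if [ -f "$config_file" ] && grep -Fqx '[profile narrative-role]' "$config_file"; then
    echo "profile narrative-role already exists in $config_file" >&2
    return 1
  fi
  {
    printf '\n[profile narrative-role]\n'
    printf 'role_arn = %s\n' "$role_arn"
    printf 'source_profile = default\n'
    # The external_id line is written only when a value was supplied.
    if [ -n "$external_id" ]; then
      printf 'external_id = %s\n' "$external_id"
    fi
  } >> "$config_file"
}
```

Pointing the function at a scratch file first, rather than ~/.aws/config, makes it easy to inspect the result before touching your real configuration.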
Important Considerations
AWS S3 Basics
- Buckets: Containers for storing objects (files). Bucket names must be globally unique and follow DNS naming conventions.
- Objects: Files stored in buckets. Each object can be up to 5TB in size.
- Regions: S3 buckets are created in specific AWS regions. Ensure you're interacting with the correct region.
Permissions and Access Control
- IAM Policies: Define permissions for users and roles within your AWS account.
- Bucket Policies: Attached directly to the bucket to grant permissions to principals (users, roles, accounts).
- ACLs (Access Control Lists): Control access to individual objects. Using --acl bucket-owner-full-control ensures the bucket owner has full permissions on the object.
Troubleshooting
- Access Denied Errors: Verify that your IAM user or role has the necessary permissions and that the bucket policy includes your AWS Account ID.
- Incorrect Region: Ensure that you're operating in the correct AWS region where the bucket is located.
- Bucket Does Not Exist: Double-check the bucket name and ensure it matches the name provided by Narrative.
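The three cases above usually show up as distinct error codes in the AWS CLI's error output. The sketch below maps a captured error message to the matching hint; the function name and hint wording are hypothetical, and the list of codes is an illustrative subset rather than a complete catalogue.

```shell
# Hypothetical helper: maps AWS CLI error text to a troubleshooting hint.
# Usage: explain_s3_error "<stderr text from an aws s3 command>"
# Example of capturing stderr only:
#   err=$(aws s3 ls s3://narrative-managed-bucket-name/ 2>&1 >/dev/null) || explain_s3_error "$err"
explain_s3_error() {
  case "$1" in
    *AccessDenied*)
      echo "access-denied: check your IAM permissions and the AWS Account ID given to Narrative" ;;
    *NoSuchBucket*)
      echo "no-such-bucket: check the bucket name against the one provided by Narrative" ;;
    *PermanentRedirect*|*AuthorizationHeaderMalformed*)
      echo "wrong-region: retry with --region set to the bucket's region" ;;
    *)
      echo "unknown: re-run the command with --debug for more detail" ;;
  esac
}
```

If you are unsure which account your credentials belong to, aws sts get-caller-identity prints the account ID in use, which you can compare with the one you entered in Narrative.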
Reviewing and Finalizing the Setup
- Review Your Settings:
  - AWS Account ID: Confirm it's correct.
  - Resource ID: Ensure it meets AWS bucket naming requirements.
  - Access Type: Verify whether you chose Bucket Policy or IAM Role.
- Finalize Creation: After verifying all details, click "Create" to set up the managed S3 bucket.
- Test Access: Before proceeding, test your access to the bucket by listing its contents or uploading a test file.
Next Steps
With your managed bucket successfully set up and accessible, you're ready to start the data ingestion process into Narrative.
- Automate Uploads: For regular data updates, consider scripting your uploads using the AWS CLI or SDKs.
- Monitor Uploads: Keep track of your data uploads and check for any errors or access issues.
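As a starting point for automation, the sketch below uploads every CSV file in a local directory and reports failures so that a scheduler or wrapper script can react to them. The function names, the *.csv filter, and the DRY_RUN switch are assumptions for illustration. The --acl flag matches Bucket Policy access; with IAM Role access you would pass --profile narrative-role instead.

```shell
# Runs the given command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper.
# Usage: upload_directory <local-dir> <bucket> <directory>
# Returns non-zero if any upload fails.
upload_directory() {
  dir=$1
  bucket=$2
  prefix=${3%/}
  failures=0
  for file in "$dir"/*.csv; do
    [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
    if run aws s3 cp "$file" "s3://$bucket/$prefix/" --acl bucket-owner-full-control; then
      echo "uploaded: $file" >&2
    else
      echo "FAILED: $file" >&2
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

Status messages go to stderr, so redirecting stderr to a log file gives a simple record of what was uploaded and what failed.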
Additional Resources
- AWS S3 Documentation: Amazon S3 Documentation
- AWS IAM Documentation: AWS IAM Documentation
- AWS CLI Installation Guide: Installing the AWS CLI
- Understanding S3 Bucket Policies: AWS Blog on S3 Bucket Policies