Overview
AWS S3 Table Buckets provide a fully managed catalog for Apache Iceberg tables. When you use S3 Tables with Embucket, AWS handles metadata management, indexing, and catalog operations automatically.
S3 Table Buckets are purpose-built for analytics workloads and include features like automatic compaction, metadata caching, and optimized query performance.
Benefits of S3 Tables
- Managed Metadata: AWS handles metadata storage and availability
- Automatic Optimization: Built-in compaction and optimization
- Integrated Permissions: Native IAM integration for access control
- High Availability: AWS-managed infrastructure with multi-AZ support
- Performance: Optimized data layout and metadata caching
Configuration
Basic Setup
Define an S3 Tables volume in your metastore.yaml configuration:
Configuration Parameters
Unique identifier for this volume. Used to reference the volume in database definitions.
Must be s3-tables for AWS S3 Table Buckets.
Optional database name to create automatically. If provided, Embucket creates a database associated with this volume on startup.
The Amazon Resource Name (ARN) of your S3 Table Bucket. Format: arn:aws:s3tables:REGION:ACCOUNT_ID:bucket/BUCKET_NAME
AWS credentials for accessing the S3 Table Bucket. See Authentication below.
Custom S3 Tables endpoint URL. Only needed for testing or non-standard AWS configurations.
Authentication
Access Key Credentials
The most common authentication method uses AWS access keys:
Session Token Support
For temporary credentials or assumed roles, include a session token:
IAM Permissions Required
Your AWS credentials need the following permissions:
Understanding the ARN
The S3 Tables ARN uniquely identifies your table bucket:
- Region: Used for API endpoint configuration
- Account ID: For IAM permission validation
- Bucket Name: The underlying S3 bucket for data storage
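The three components can be pulled out of an ARN with a small parser. The following is a minimal Python sketch, not part of Embucket: `TableBucketArn` and `parse_table_bucket_arn` are hypothetical names, and the pattern assumes the standard `aws` partition and a 12-digit account ID.

```python
import re
from typing import NamedTuple


class TableBucketArn(NamedTuple):
    """Components of an S3 Tables bucket ARN (hypothetical helper type)."""
    region: str
    account_id: str
    bucket_name: str


# Follows the documented format:
#   arn:aws:s3tables:REGION:ACCOUNT_ID:bucket/BUCKET_NAME
# Assumptions: only the "aws" partition is accepted, and the account ID
# is the usual 12 digits.
_ARN_RE = re.compile(
    r"arn:aws:s3tables:"
    r"(?P<region>[a-z0-9-]+):"
    r"(?P<account_id>\d{12}):"
    r"bucket/(?P<bucket_name>[^/\s]+)"
)


def parse_table_bucket_arn(arn: str) -> TableBucketArn:
    """Split an S3 Tables bucket ARN into region, account ID, and bucket name."""
    match = _ARN_RE.fullmatch(arn)
    if match is None:
        raise ValueError(f"not a valid S3 Tables bucket ARN: {arn!r}")
    return TableBucketArn(**match.groupdict())
```

Calling `parse_table_bucket_arn` on a regular S3 bucket ARN such as `arn:aws:s3:::my-bucket` raises `ValueError`, which makes it a quick way to catch a pasted ARN of the wrong service before starting Embucket.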
Example Queries
Once configured, query your S3 Tables catalog using standard SQL:
List Schemas
List Tables in a Schema
Query Table Data
Aggregate Queries
Docker Deployment
Mount your configuration file when running Embucket in Docker:
Complete Example
Here’s a full configuration with S3 Tables and a database:
metastore.yaml
The should_refresh: true flag tells Embucket to periodically sync the catalog metadata with S3 Tables to detect new tables or schema changes.
Troubleshooting
Connection Issues
If Embucket cannot connect to S3 Tables:
- Verify your ARN format is correct
- Check that IAM credentials have required permissions
- Ensure the region in the ARN matches your table bucket’s region
- Validate network connectivity to AWS S3 Tables endpoints
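The region and network checks above can be scripted from the machine that runs Embucket. This is a rough Python sketch under stated assumptions: the helper names are hypothetical, and the hostname pattern `s3tables.REGION.amazonaws.com` is assumed to be the regional endpoint for the standard AWS partition. If you configured a custom endpoint, test that host instead.

```python
import socket


def region_from_arn(arn: str) -> str:
    """Return the region field (4th colon-separated field) of an ARN."""
    return arn.split(":")[3]


def default_s3tables_host(region: str) -> str:
    # Assumption: standard-partition endpoint naming; other partitions
    # or custom endpoints will differ.
    return f"s3tables.{region}.amazonaws.com"


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach(default_s3tables_host(region_from_arn(arn)))` returning `False` points to a DNS, firewall, or proxy problem rather than a credentials problem. A successful TCP connection only shows that the endpoint is reachable; it says nothing about permissions.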
Credential Validation
Embucket validates S3 Tables credentials on startup by calling GetTableBucket. If this fails, check:
- Access key ID format (20 alphanumeric characters)
- Secret access key format (40 Base64 characters)
- IAM policy allows the s3tables:GetTableBucket action
- Table bucket exists and ARN is correct
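The two format checks in this list are easy to automate before starting Embucket. Below is a minimal Python sketch; `credential_format_problems` is a hypothetical helper, and it tests only the shape of the strings (length and character set), not whether AWS accepts them.

```python
import re

# 20 alphanumeric characters, as described in the checklist above.
_ACCESS_KEY_ID_RE = re.compile(r"[A-Za-z0-9]{20}")
# 40 characters from the standard Base64 alphabet (A-Z, a-z, 0-9, +, /).
_SECRET_KEY_RE = re.compile(r"[A-Za-z0-9+/]{40}")


def credential_format_problems(access_key_id: str, secret_access_key: str) -> list[str]:
    """Return a list of format problems; an empty list means both look well-formed."""
    problems = []
    if _ACCESS_KEY_ID_RE.fullmatch(access_key_id) is None:
        problems.append("access key ID is not 20 alphanumeric characters")
    if _SECRET_KEY_RE.fullmatch(secret_access_key) is None:
        problems.append("secret access key is not 40 Base64 characters")
    return problems
```

With the placeholder keys from the AWS documentation, `credential_format_problems("AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")` returns an empty list. A non-empty result often means a stray space, quote, or newline was copied into the configuration file along with the key.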
Next Steps
- Query Your Data: Learn SQL query syntax
- Metastore Config: Complete configuration reference