The metastore configuration file defines volumes, databases, schemas, and tables to bootstrap when Embucket starts. This YAML file is specified via the `--metastore-config` flag or the `METASTORE_CONFIG` environment variable.
Schema Structure
Volumes
Volumes define storage backends for Iceberg tables.
Unique identifier for the volume.
Volume type. Options: `s3`, `s3tables`, `file`, `memory`.
Optional database name to auto-create for this volume.
Whether to refresh the volume metadata on startup.
S3 Volume
AWS region for the S3 bucket.
S3 bucket name. Must contain only alphanumeric characters, hyphens, or underscores.
Custom S3 endpoint URL. Must start with `http://` or `https://`.
AWS credentials object.
S3 Credentials
Access Key Credentials:
Must be `access_key` for access key credentials.
AWS access key ID (20-character alphanumeric string).
AWS secret access key (40-character Base64-like string).
Optional AWS session token for temporary credentials.
Token Credentials:
Must be `token` for token-based credentials.
S3 Tables Volume
Amazon S3 Tables bucket ARN. Format: `arn:aws:s3tables:region:account-id:bucket/bucket-name`
Custom endpoint URL for S3 Tables. Must start with `http://` or `https://`.
AWS credentials (same format as S3 volume credentials).
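The ARN format above can be checked and split into its parts before it goes into the configuration. The following is a minimal sketch, not Embucket's own validation: the function name `parse_s3tables_arn` is hypothetical, and the character classes for each segment (lowercase region, 12-digit account ID, lowercase bucket name) are assumptions, since the documentation gives only the overall shape.

```python
import re
from typing import NamedTuple


class S3TablesArn(NamedTuple):
    region: str
    account_id: str
    bucket_name: str


# Assumed segment patterns; the documented format is only
# arn:aws:s3tables:region:account-id:bucket/bucket-name.
_ARN_RE = re.compile(
    r"arn:aws:s3tables:"
    r"(?P<region>[a-z0-9-]+):"
    r"(?P<account_id>\d{12}):"
    r"bucket/(?P<bucket_name>[a-z0-9][a-z0-9-]*)"
)


def parse_s3tables_arn(arn: str) -> S3TablesArn:
    """Split an S3 Tables bucket ARN into its parts, or raise ValueError."""
    match = _ARN_RE.fullmatch(arn)
    if match is None:
        raise ValueError(f"not an S3 Tables bucket ARN: {arn!r}")
    return S3TablesArn(**match.groupdict())
```

Calling `parse_s3tables_arn` on a plain S3 ARN or a truncated string raises `ValueError`, which makes malformed values visible before Embucket is started.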
File Volume
Local filesystem path for the volume.
Memory Volume
Memory volumes have no additional configuration beyond `type: memory`.
Databases
Database identifier name.
Volume identifier to use for this database. Must reference a defined volume.
Whether to refresh the database metadata on startup.
Schemas
Database name for this schema.
Schema name.
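The cross-references described so far (a database names a volume, a schema names a database) can be checked on a parsed configuration before startup. This is a rough sketch under stated assumptions: the function `check_references` is hypothetical, and the dictionary keys used here (`volumes`, `databases`, `schemas`, `ident`, `volume`, `database`, `schema`) are illustrative placeholders, not confirmed Embucket field names.

```python
from typing import Any


def check_references(config: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the references line up.

    All key names are assumptions made for this sketch.
    """
    errors: list[str] = []

    volume_idents: set[str] = set()
    database_idents: set[str] = set()
    for volume in config.get("volumes", []):
        ident = volume["ident"]
        if ident in volume_idents:
            errors.append(f"duplicate volume ident: {ident}")
        volume_idents.add(ident)
        # A volume may name a database to auto-create (assumed key).
        if volume.get("database"):
            database_idents.add(volume["database"])

    for database in config.get("databases", []):
        if database["volume"] not in volume_idents:
            errors.append(
                f"database {database['ident']} references "
                f"unknown volume {database['volume']}"
            )
        database_idents.add(database["ident"])

    for schema in config.get("schemas", []):
        if schema["database"] not in database_idents:
            errors.append(
                f"schema {schema['schema']} references "
                f"unknown database {schema['database']}"
            )
    return errors
```

Collecting every problem in a list, rather than stopping at the first one, lets a single run report all broken references in the file.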
Tables
Tables can be pre-registered from existing Iceberg metadata.
Database name for this table.
Schema name for this table.
Table name.
S3 or file URL to the Iceberg metadata JSON file.
Complete Examples
Basic Memory Volume
S3 Volume with Access Keys
S3 Tables Volume
File Volume
Multiple Volumes and Pre-registered Tables
Custom S3 Endpoint (MinIO)
Session Token Credentials
Validation Rules
- Volume idents must be unique within the configuration
- Bucket names must only contain alphanumeric characters, hyphens, or underscores
- Bucket names must not start or end with a hyphen or underscore
- AWS Access Key IDs must be 20 character alphanumeric strings
- AWS Secret Access Keys must be 40 character Base64-like strings
- S3 Tables ARNs must follow format: `arn:aws:s3tables:region:account-id:bucket/bucket-name`
- Endpoints must start with `http://` or `https://`
- Database volumes must reference existing volume idents
- Schema databases must reference existing database idents
- Table metadata locations must be accessible from the volume’s object store
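The purely syntactic rules for bucket names, credentials, and endpoints can be approximated with a few regular expressions. This is a sketch rather than Embucket's actual validator: the helper names are hypothetical, and reading "Base64-like" as the character set `A-Z a-z 0-9 + /` is an assumption.

```python
import re

# Assumed patterns derived from the prose rules above.
# Bucket: alphanumerics, hyphens, underscores; no leading/trailing - or _.
_BUCKET_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9_-]*[A-Za-z0-9])?")
# Access key ID: exactly 20 alphanumeric characters.
_ACCESS_KEY_ID_RE = re.compile(r"[A-Za-z0-9]{20}")
# Secret access key: exactly 40 characters from an assumed Base64 alphabet.
_SECRET_ACCESS_KEY_RE = re.compile(r"[A-Za-z0-9+/]{40}")


def is_valid_bucket(name: str) -> bool:
    return _BUCKET_RE.fullmatch(name) is not None


def is_valid_access_key_id(value: str) -> bool:
    return _ACCESS_KEY_ID_RE.fullmatch(value) is not None


def is_valid_secret_access_key(value: str) -> bool:
    return _SECRET_ACCESS_KEY_RE.fullmatch(value) is not None


def is_valid_endpoint(url: str) -> bool:
    # Only the scheme prefix is checked, matching the documented rule.
    return url.startswith(("http://", "https://"))
```

For example, `is_valid_bucket("my-bucket_01")` passes, while `is_valid_bucket("-bucket")` and `is_valid_endpoint("localhost:9000")` do not.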
Bootstrap Behavior
- Volumes are created first in the order defined
- Databases are created and linked to their volumes
- Schemas are created within databases (default `public` schema is auto-created)
- Tables are registered from their metadata locations
- If items already exist, they are skipped (idempotent)
- The `should_refresh` flag triggers metadata reload for S3 Tables volumes
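The ordering and skip-if-exists behavior listed above can be modeled with a small in-memory simulation. This is an illustrative sketch only, not Embucket's implementation: the `bootstrap` function, the `store` dictionary standing in for the metastore, and all key names (`ident`, `database`, `schema`, `table`) are assumptions, as is creating the `public` schema at the moment its database is created. The `should_refresh` flag is not modeled.

```python
from typing import Any


def bootstrap(config: dict[str, Any], store: dict[str, dict]) -> list[str]:
    """Apply a parsed config to an in-memory store and return an action log.

    `store` is a stand-in for the metastore; key names are assumed.
    """
    log: list[str] = []

    def create(kind: str, key: str, value: dict) -> None:
        items = store.setdefault(kind, {})
        if key in items:
            # Existing items are left untouched, so reruns are idempotent.
            log.append(f"skip {kind} {key}")
        else:
            items[key] = value
            log.append(f"create {kind} {key}")

    # 1. Volumes, in the order defined.
    for volume in config.get("volumes", []):
        create("volumes", volume["ident"], volume)
    # 2. Databases, each with its default public schema (assumed timing).
    for database in config.get("databases", []):
        create("databases", database["ident"], database)
        create("schemas", f"{database['ident']}.public", {})
    # 3. Explicitly configured schemas.
    for schema in config.get("schemas", []):
        create("schemas", f"{schema['database']}.{schema['schema']}", schema)
    # 4. Tables registered from their metadata locations.
    for table in config.get("tables", []):
        key = f"{table['database']}.{table['schema']}.{table['table']}"
        create("tables", key, table)
    return log
```

Running `bootstrap` twice against the same `store` produces only `create` entries the first time and only `skip` entries the second, which is the idempotence property described in the list.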