The embucketd binary runs the Embucket server and is configured through command-line flags. Every flag can also be set through an environment variable.

Usage

embucketd [OPTIONS]

Server Configuration

--metastore-config
path
default:"none"
Path to a YAML config describing the volumes and databases used to seed the metastore.
Environment variable: METASTORE_CONFIG
embucketd --metastore-config /path/to/config.yaml
--host
string
default:"localhost"
Host address to bind the server to.
Environment variable: BUCKET_HOST
embucketd --host 0.0.0.0
--port
integer
default:"3000"
Port number to bind the server to.
Environment variable: BUCKET_PORT
embucketd --port 8080
--idle-timeout-seconds
integer
default:"18000"
Service idle timeout in seconds (5 hours by default).
Environment variable: IDLE_TIMEOUT_SECONDS
embucketd --idle-timeout-seconds 3600

Query Execution

--max-concurrency-level
integer
default:"8"
Maximum number of queries that can run simultaneously.
Environment variable: MAX_CONCURRENCY_LEVEL
embucketd --max-concurrency-level 16
--query-timeout-secs
integer
default:"1200"
Maximum duration in seconds a single query is allowed to run (20 minutes by default).
Environment variable: QUERY_TIMEOUT_SECS
embucketd --query-timeout-secs 600
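Both timeout flags take plain seconds, so shell arithmetic can keep longer durations readable. The following is a minimal sketch; the variable names are illustrative and not part of embucketd, and the chosen durations are arbitrary.

```shell
# Express durations in hours or minutes and let the shell convert to seconds.
idle_timeout=$((2 * 60 * 60))   # 2 hours
query_timeout=$((10 * 60))      # 10 minutes

# Print the resulting command line; drop `echo` to actually start the server.
echo embucketd \
  --idle-timeout-seconds "$idle_timeout" \
  --query-timeout-secs "$query_timeout"
```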
--max-concurrent-table-fetches
integer
default:"2"
Maximum number of concurrent requests for fetching table details.
Environment variable: MAX_CONCURRENT_TABLE_FETCHES
embucketd --max-concurrent-table-fetches 4

Memory Management

--mem-pool-type
enum
default:"greedy"
Memory pool type for query execution.
Environment variable: MEM_POOL_TYPE
Options:
  • greedy - Allocates memory aggressively
  • fair - Distributes memory fairly among queries
embucketd --mem-pool-type fair
--mem-pool-size-mb
integer
default:"none"
Maximum memory pool size in megabytes. If not set, system defaults are used.
Environment variable: MEM_POOL_SIZE_MB
embucketd --mem-pool-size-mb 4096
--mem-enable-track-consumers-pool
boolean
default:"false"
Wrap the memory pool in a TrackConsumersPool to track per-consumer memory usage.
Environment variable: MEM_ENABLE_TRACK_CONSUMERS_POOL
embucketd --mem-enable-track-consumers-pool
--disk-pool-size-mb
integer
default:"none"
Maximum disk pool size in megabytes for spilling operations.
Environment variable: DISK_POOL_SIZE_MB
embucketd --disk-pool-size-mb 10240
--alloc-tracing
boolean
default:"false"
Enable memory allocation tracing for debugging.
Environment variable: ALLOC_TRACING
embucketd --alloc-tracing

Data Formats and Parsing

--data-format
string
default:"json"
Data serialization format used by the Snowflake v1 API.
Environment variable: DATA_FORMAT
embucketd --data-format json
--sql-parser-dialect
string
default:"snowflake"
SQL parser dialect to use.
Environment variable: SQL_PARSER_DIALECT
Options: snowflake, postgres, mysql, generic
embucketd --sql-parser-dialect postgres

Authentication

--auth-demo-user
string
default:"embucket"
Username for demo authentication mode.
Environment variable: AUTH_DEMO_USER
embucketd --auth-demo-user admin
--auth-demo-password
string
default:"embucket"
Password for demo authentication mode.
Environment variable: AUTH_DEMO_PASSWORD
embucketd --auth-demo-password secret123
--jwt-secret
string
default:"none"
JWT secret used for authentication.
Environment variable: JWT_SECRET
Warning: Keep this value secret. The environment variable is unset automatically once the server has loaded it.
embucketd --jwt-secret your-secret-key
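On most Unix-like systems, command-line arguments are visible to other local users through process listings, so passing the secret through JWT_SECRET is usually the safer route. The sketch below generates a random value and exports it. It assumes any sufficiently random string is acceptable as a secret, which this page does not confirm.

```shell
# Generate 32 random bytes and hex-encode them (64 characters).
# Assumption: embucketd accepts an arbitrary string as the JWT secret.
JWT_SECRET="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
export JWT_SECRET

# Start the server without putting the secret on the command line.
# The guard keeps the snippet harmless where embucketd is not installed.
if command -v embucketd >/dev/null 2>&1; then
  embucketd
fi
```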

AWS SDK Configuration

--aws-sdk-connect-timeout-secs
integer
default:"3"
AWS SDK connection timeout in seconds.
Environment variable: AWS_SDK_CONNECT_TIMEOUT_SECS
embucketd --aws-sdk-connect-timeout-secs 5
--aws-sdk-operation-timeout-secs
integer
default:"30"
AWS SDK operation timeout in seconds.
Environment variable: AWS_SDK_OPERATION_TIMEOUT_SECS
embucketd --aws-sdk-operation-timeout-secs 60
--aws-sdk-operation-attempt-timeout-secs
integer
default:"10"
AWS SDK operation attempt timeout in seconds.
Environment variable: AWS_SDK_OPERATION_ATTEMPT_TIMEOUT_SECS
embucketd --aws-sdk-operation-attempt-timeout-secs 15

Iceberg Configuration

--iceberg-create-table-timeout-secs
integer
default:"30"
Iceberg table creation timeout in seconds.
Environment variable: ICEBERG_CREATE_TABLE_TIMEOUT_SECS
embucketd --iceberg-create-table-timeout-secs 60
--iceberg-catalog-timeout-secs
integer
default:"10"
Iceberg catalog operation timeout in seconds.
Environment variable: ICEBERG_CATALOG_TIMEOUT_SECS
embucketd --iceberg-catalog-timeout-secs 20

Object Store Configuration

--object-store-timeout-secs
integer
default:"10"
Object store operation timeout in seconds.
Environment variable: OBJECT_STORE_TIMEOUT_SECS
embucketd --object-store-timeout-secs 30
--object-store-connect-timeout-secs
integer
default:"3"
Object store connection timeout in seconds.
Environment variable: OBJECT_STORE_CONNECT_TIMEOUT_SECS
embucketd --object-store-connect-timeout-secs 5

Observability

--tracing-level
enum
default:"info"
Tracing level for logs. Can be overridden by the RUST_LOG environment variable.
Environment variable: TRACING_LEVEL
Options: off, info, debug, trace
embucketd --tracing-level debug
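The RUST_LOG override described for this flag can be sketched as follows. It assumes RUST_LOG accepts a bare level name such as debug, which is the usual convention for Rust logging filters but is not confirmed by this page.

```shell
# RUST_LOG, when set, is documented to override --tracing-level.
# Assumption: a bare level name ("debug") is a valid RUST_LOG value here.
export RUST_LOG=debug

# The flag asks for info, but the override above should take effect instead.
# The guard keeps the snippet harmless where embucketd is not installed.
if command -v embucketd >/dev/null 2>&1; then
  embucketd --tracing-level info
fi
```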
--tracing-span-processor
enum
default:"batch-span-processor"
Tracing span processor type.
Environment variable: span_processor
Options:
  • batch-span-processor
  • batch-span-processor-experimental-async-runtime
embucketd --tracing-span-processor batch-span-processor
--otel-exporter-otlp-protocol
string
default:"grpc"
OpenTelemetry exporter protocol.
Environment variable: OTEL_EXPORTER_OTLP_PROTOCOL
embucketd --otel-exporter-otlp-protocol http

Examples

Basic Server Startup

embucketd --host 0.0.0.0 --port 3000

Production Configuration

embucketd \
  --host 0.0.0.0 \
  --port 3000 \
  --metastore-config /etc/embucket/config.yaml \
  --max-concurrency-level 32 \
  --mem-pool-size-mb 8192 \
  --query-timeout-secs 3600 \
  --tracing-level info

Using Environment Variables

export BUCKET_HOST="0.0.0.0"
export BUCKET_PORT="3000"
export MAX_CONCURRENCY_LEVEL="16"
export MEM_POOL_SIZE_MB="4096"
export TRACING_LEVEL="debug"

embucketd
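When the same settings are reused across runs, they can live in a file that is sourced before the server starts. This is a minimal sketch using standard shell features only; the file name embucket.env is hypothetical, and embucketd itself is not known to read such a file.

```shell
# Write the settings to a file (the name embucket.env is hypothetical).
cat > embucket.env <<'EOF'
BUCKET_HOST=0.0.0.0
BUCKET_PORT=3000
MAX_CONCURRENCY_LEVEL=16
EOF

# `set -a` marks every variable assigned while it is active for export,
# so sourcing the file makes each setting visible to child processes.
set -a
. ./embucket.env
set +a

# The guard keeps the snippet harmless where embucketd is not installed.
if command -v embucketd >/dev/null 2>&1; then
  embucketd
fi
```

Keeping the values in a file avoids repeating export lines and makes it easy to keep a separate file per environment.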

Memory-Optimized Configuration

embucketd \
  --mem-pool-type fair \
  --mem-pool-size-mb 16384 \
  --disk-pool-size-mb 51200 \
  --mem-enable-track-consumers-pool