Workflow Examples#
This guide lists all of Minitrino’s modules as well as some of the workflows that the tool is best suited for.
Overview#
Module Documentation#
Each module has a readme associated with it. The list below points to the
readme files for each module.
Enterprise vs Open Source Modules#
Minitrino supports both Trino (open-source) and Starburst Enterprise modules. The table below shows which modules require a Starburst license.
IMAGE=starburst :::
Module Type |
Module Name |
License |
Description |
|---|---|---|---|
Admin |
cache-service |
⭐ Enterprise |
Materialized view caching and table scan redirection |
Admin |
data-products |
⭐ Enterprise |
Data products management and governance |
Admin |
file-group-provider |
OSS |
File-based group provider for authorization |
Admin |
insights |
⭐ Enterprise |
Starburst Insights analytics and monitoring UI |
Admin |
ldap-group-provider |
⭐ Enterprise |
LDAP-based group integration |
Admin |
minio |
OSS |
S3-compatible object storage (MinIO) |
Admin |
mysql-event-listener |
OSS |
MySQL-based audit logging and event tracking |
Admin |
resource-groups |
OSS |
Resource management and query queueing |
Admin |
results-cache |
⭐ Enterprise |
Query result caching for improved performance |
Admin |
scim |
⭐ Enterprise |
SCIM 2.0 user and group synchronization |
Admin |
session-property-manager |
OSS |
Manage session properties and defaults |
Admin |
spooling-protocol |
OSS |
Client-side result spooling protocol |
Admin |
starburst-gateway |
⭐ Enterprise |
Starburst Gateway for query federation |
Catalog |
clickhouse |
OSS |
ClickHouse database connector |
Catalog |
db2 |
⭐ Enterprise |
IBM Db2 database connector |
Catalog |
delta-lake |
OSS |
Delta Lake table format support |
Catalog |
elasticsearch |
OSS |
Elasticsearch search engine connector |
Catalog |
faker |
OSS |
Faker data generator for testing |
Catalog |
hive |
OSS |
Hive Metastore with MinIO storage |
Catalog |
iceberg |
OSS |
Apache Iceberg REST catalog |
Catalog |
iceberg-hms |
⭐ Enterprise |
Iceberg with Hive Metastore catalog |
Catalog |
mariadb |
OSS |
MariaDB database connector |
Catalog |
mysql |
OSS |
MySQL database connector |
Catalog |
pinot |
OSS |
Apache Pinot real-time analytics connector |
Catalog |
postgres |
OSS |
PostgreSQL database connector |
Catalog |
sqlserver |
OSS |
Microsoft SQL Server connector |
Catalog |
stargate |
⭐ Enterprise |
Stargate data federation connector |
Catalog |
stargate-parallel |
⭐ Enterprise |
Stargate with parallel query execution |
Security |
biac |
⭐ Enterprise |
Built-in access control (BIAC) |
Security |
file-access-control |
OSS |
File-based authorization rules |
Security |
kerberos |
OSS |
Kerberos authentication integration |
Security |
ldap |
OSS |
LDAP authentication |
Security |
oauth2 |
OSS |
OAuth 2.0 authentication |
Security |
password-file |
OSS |
Password file-based authentication |
Security |
tls |
OSS |
TLS/SSL encryption for secure connections |
Using Enterprise Modules:
Enterprise modules require:
Starburst Enterprise distribution:
IMAGE=starburstValid Starburst license file (see license configuration)
# Example: Provision with enterprise module
minitrino -e IMAGE=starburst -e LIC_PATH=~/starburstdata.license provision -m insights
Administration Modules
Catalog Modules
Security Modules
CLI Examples
Choosing a Trino or Starburst Version
Each Minitrino release uses a default Trino version, specified in
lib/minitrino.env via the CLUSTER_VER variable. Unless overridden by an
environment variable, this is the version that will be used for all provision
commands.
To use Starburst instead of Trino, set the IMAGE environment variable to
starburst. Starburst Enterprise (SEP) releases are based on Trino releases.
So, SEP release 400-e directly maps to Trino release 400. To clearly
demonstrate the relationship, here are few more examples:
SEP
413-e.9-> Trino413SEP
446-e-> Trino446SEP
453-e.4-> Trino453SEP
460-e-> Trino460
Nearly all Trino plugins are exposed in SEP, so anything documented in the Trino docs should be configurable in the related SEP image.
Run Commands in Verbose Mode
It is recommended to run the majority of Minitrino commands in verbose mode. To
do so, simply add the -v option to any command:
minitrino -v ...
List Modules
To see which modules exist (without needing to reference this wiki), make use of
the modules command.
List all modules:
minitrino modules
List all modules of a given type (one of admin, catalog, security):
minitrino modules --type admin
minitrino modules --type catalog
minitrino modules --type security
The --json option can be used to dump all metadata tied to a module(s), such
as the module’s directory, its Docker Compose file location, and a JSON
representation of the entire Docker Compose file.
Using jq, the command can be manipulated to show many different groupings of
information.
minitrino modules -m ldap --json
# Return the module directory
minitrino modules -m ${module} --json | jq .${module}.module_dir
# Return the module's Compose file location
minitrino modules -m ${module} --json | jq .${module}.yaml_dict
Provision an Environment
When a certain CLUSTER_VER is deployed for the first time, Minitrino must
first build the image. This typically takes ~ 5 minutes depending on the quality
of your network. Once a given CLUSTER_VER is built, it will be reused for all
future commands specifying the same version.
Provision a single-node cluster using the default Trino version:
minitrino -v provision
Provision a multi-node cluster with two worker nodes:
minitrino -v provision --workers 2
Provision the postgres catalog module with a specific Trino version:
minitrino -v -e CLUSTER_VER=${VER} provision -m postgres
Provision with Starburst instead of Trino:
minitrino -v -e IMAGE=starburst -e CLUSTER_VER=476-e provision -m postgres
Provision the hive catalog module with two worker nodes:
minitrino -v provision -m hive --workers 2
Append the running Hive environment with the iceberg module and downsize to
one worker (the hive module will be included automatically):
minitrino -v provision -m iceberg --workers 1
Provision Trino with password authentication, which will include the tls
module as a dependency and expose the service on https://localhost:8443:
minitrino -v provision -m password-file
Provision multiple password authenticators in tandem:
minitrino -v provision -m ldap -m password-file
Access the UI
All environments expose the Starburst service on localhost:8080, meaning you
can visit the web UI directly, or you can connect external clients, such as
DBeaver, to the localhost service.
The coordinator container can be directly accessed via:
docker exec -it minitrino-default bash
Note: The default cluster name is default. If you provisioned with a
custom cluster name (e.g., --cluster my-cluster), replace minitrino-default
with minitrino-my-cluster.
Worker Provisioning Overview
When you provision an environment with one or more workers, the following events take place:
The coordinator container is deployed and any relevant bootstrap scripts are executed inside of it.
The coordinator is restarted.
Once the coordinator is up, the worker containers are deployed, and the coordinator’s
/mnt/etc/directory is compressed, copied, and extracted to all of the worker containers.The workers’
config.propertiesfiles are overwritten with basic configurations for connectivity to the coordinator.
This ensures that any distributed files, such as catalog files, are placed on every container in the cluster. It also ensures that coordinator-specific configurations do not remain on the workers.
Modify Files in a Running Container
You can modify files inside a running container. For example:
# Update coordinator logging settings
docker exec -it minitrino-default bash
echo "io.trino=DEBUG" >> /mnt/etc/log.properties
exit
docker restart minitrino-default
# Update worker logging settings (assuming default cluster name)
docker exec -it minitrino-worker-1-default bash
echo "io.trino=DEBUG" >> /mnt/etc/log.properties
exit
docker restart minitrino-worker-1-default
Restarting the container allows Trino to register the configuration change.
Note: Replace default with your cluster name if using a custom cluster.
Access the Trino CLI
docker exec -it minitrino-default bash
trino-cli --debug --user admin --execute "SELECT * FROM tpch.tiny.customer LIMIT 10"
Restart Cluster Containers
Restart all containers in the cluster without reprovisioning:
minitrino restart
This is useful when you’ve made configuration changes and need to reload the cluster without tearing it down completely.
View Cluster Resources
View all resources associated with your cluster:
minitrino resources
Filter by specific resource types:
# View only containers
minitrino resources --container
# View only volumes
minitrino resources --volume
# View only images
minitrino resources --image
# View only networks
minitrino resources --network
Shut Down an Environment
minitrino down
To skip graceful shutdown and stop all containers immediately, run:
minitrino down --sig-kill
The default behavior of the down command removes containers. To stop the
containers instead of removing them, run:
minitrino down --keep
Remove Minitrino Resources
Minitrino creates volumes and pulls/builds various images to support each
module. All resources are labeled with project-specific metadata, so all
remove commands will target Docker resources specifically tied to Minitrino
modules.
Remove all Minitrino-labeled images (requires --cluster all or -c all):
minitrino -c all remove --images
Remove images from a specific module:
minitrino remove --images \
--label org.minitrino.module.${MODULE_TYPE}.${MODULE}=true
Where ${MODULE_TYPE} is one of: admin, catalog, security.
Remove all Minitrino-labeled volumes:
minitrino remove --volumes
Remove volumes from a specific module:
minitrino remove --volumes \
--label org.minitrino.module.${MODULE_TYPE}.${MODULE}=true
Snapshot a Customized Module
Users designing and customizing their own modules can persist them for later
usage with the snapshot command. For example, if a user modifies the hive
module, they can persist their changes by running:
minitrino snapshot --name ${SNAPSHOT_NAME} -m hive
By default, the snapshot file is placed in
${LIB_PATH}/${SNAPSHOT_NAME}.tar.gz. The snapshot can be saved to a different
directory by passing the --directory option to the snapshot command.
Point to a Starburst License File for Enterprise Modules
Enterprise Module Reference table above for a complete list of enterprise vs open-source modules. :::
Pass an environment variable directly to the command:
minitrino -e LIC_PATH=/path/to/starburstdata.license provision -m insights
Export the environment variable to the shell:
export LIC_PATH=/path/to/starburstdata.license
minitrino provision -m insights
Add the variable to the minitrino.cfg file:
minitrino config
# Edit the config file
[config]
LIC_PATH=~/work/license/starburstdata.license
# Enterprise modules now automatically receive the license file
minitrino provision -m insights
More information about environment variables can be found here.
Modify the Trino config.properties and jvm.config Files
Many modules may change the Trino’s config.properties and jvm.config files.
There are two supported ways to modify these files.
Method One: Docker Compose Environment Variables
Minitrino has special support for two Trino-specific environment variables:
CONFIG_PROPERTIES and JVM_CONFIG. Below is an example of setting these
variables in a Docker Compose file:
minitrino:
environment:
CONFIG_PROPERTIES: |-
insights.jdbc.url=jdbc:postgresql://postgresdb:5432/insights
insights.jdbc.user=admin
insights.jdbc.password=password
insights.persistence-enabled=true
JVM_CONFIG: |-
-Xlog:gc:/var/log/sep-gc-%t.log:time:filecount=10
Method Two: Environment Variables
Environment variables can be exported prior to executing a provision command:
export CONFIG_PROPERTIES=$'query.max-stage-count=85\ndiscovery.uri=http://trino:8080'
export JVM_CONFIG=$'-Xmx2G\n-Xms1G'
minitrino -v provision
Note that multiple configs can be separated with a newline, though $'...'
syntax must be used, as it allows you to include escape sequences like \n that
are interpreted as actual newlines.
As an alternative to using export to set environment variables, they can also
be passed directly to the provision command:
minitrino -v \
--env CONFIG_PROPERTIES=$'query.max-stage-count=85\ndiscovery.uri=http://trino:8080' \
--env JVM_CONFIG=$'-Xmx2G\n-Xms1G' \
provision
Single configs are simpler and do not require $'...' syntax:
minitrino -v \
--env CONFIG_PROPERTIES='query.max-stage-count=85' \
--env JVM_CONFIG='-Xmx2G' \
provision
Method Three: Bootstrap Scripts
The config.properties and jvm.config files can be modified directly with a
module bootstrap script.
Bootstrap Scripts
Bootstrap scripts allow you to customize container initialization with arbitrary shell commands. These scripts execute at specific lifecycle stages and can modify configurations, load data, or initialize external services.
How Bootstrap Scripts Work
Minitrino copies bootstrap scripts from the library to the container
Scripts execute in two phases:
before_startandafter_startContainer restarts after each bootstrap execution
Idempotency checks: Minitrino checksums scripts and only re-executes if content changes
Important: Bootstrap scripts do NOT replace the container’s entrypoint. They augment the startup process.
Execution Context
User and Permissions:
Scripts initially run as root
After execution, ownership of
/etc/${CLUSTER_DIST}changes to service user (trino/starburst)Scripts can use
sudofor privileged operationsFiles created should be owned by
${SERVICE_USER}for persistence
Execution Phases:
before_start():Runs before Trino/Starburst server starts
Use for: config generation, certificate setup, schema initialization
Trino/Starburst service is NOT available yet
after_start():Runs after Trino/Starburst is accepting queries
Use for: data loading, SQL execution, post-startup validation
Trino/Starburst service IS available (can use
trino-cli)
Available Environment Variables
Your bootstrap script has access to these key variables:
$CLUSTER_DIST # "trino" or "starburst"
$CLUSTER_VER # Version number (e.g., "476")
$CLUSTER_NAME # Cluster name (e.g., "default")
$SERVICE_USER # Service user ("trino" or "starburst")
$HOSTNAME # Container hostname
$COORDINATOR # "true" for coordinator, "false" for workers
All Docker Compose environment variables are also available.
Available Tools and Utilities
Bootstrap scripts have access to a rich toolset:
Networking:
curl,wget,ping,telnetData:
jq(JSON processor)Trino:
trino-cli(for SQL execution inafter_start)Security:
keytool(Java keystore),openssl,ldap-utils,krb5-userUtilities:
wait-for-it(wait for services),vim,tree,lessPython: Python 3 with pip (can install additional packages)
Creating a Bootstrap Script
1. Create the script directory:
mkdir -p lib/modules/${type}/${module}/resources/bootstrap/
2. Write the script:
#!/usr/bin/env bash
set -euxo pipefail
before_start() {
echo "Executing before Trino starts..."
# Example: Modify config file
echo "discovery.uri=http://custom-coordinator:8080" >> /etc/${CLUSTER_DIST}/config.properties
# Example: Wait for external service
wait-for-it postgres:5432 --strict --timeout=30
# Example: Generate certificates (as root)
keytool -genkeypair -alias mykey ...
# Example: Fix ownership
chown -R ${SERVICE_USER}:root /etc/${CLUSTER_DIST}/custom/
}
after_start() {
echo "Executing after Trino started..."
# Example: Load data via SQL
trino-cli --user admin --execute "CREATE SCHEMA IF NOT EXISTS hive.default"
# Example: Validate setup
if trino-cli --user admin --execute "SELECT 1"; then
echo "Cluster is ready!"
else
echo "Cluster validation failed!"
exit 1
fi
}
3. Reference in Docker Compose YAML:
services:
minitrino:
environment:
MINITRINO_BOOTSTRAP: bootstrap.sh
4. Test the module:
minitrino -v provision -m ${module}
Bootstrap Script Best Practices
Idempotency
Always check if operations were already completed:
before_start() {
# Check if index already exists
if curl -s http://elasticsearch:9200/_cat/indices | grep -q 'myindex'; then
echo "Index already exists, skipping creation"
return 0
fi
# Create index
curl -XPUT http://elasticsearch:9200/myindex ...
}
Error Handling
Use set -e to exit on errors, but handle expected failures:
set -euxo pipefail # Exit on error, undefined vars, pipe failures
before_start() {
# This will exit script if command fails
keytool -genkeypair ...
# Handle expected failures
if ! some_optional_command; then
echo "Optional command failed, continuing..."
fi
}
External Service Dependencies
Wait for services to be ready:
before_start() {
# Wait for service to be available
wait-for-it elasticsearch:9200 --strict --timeout=60 -- \
echo "Elasticsearch is up"
# Alternative: manual retry logic
retries=30
until curl -s http://postgres:5432 > /dev/null 2>&1; do
retries=$((retries - 1))
if [ $retries -eq 0 ]; then
echo "Postgres failed to start"
exit 1
fi
sleep 1
done
}
File Ownership
Ensure files are owned by the service user:
before_start() {
# Create files as root
echo "config-value" > /etc/${CLUSTER_DIST}/custom.properties
# Fix ownership for service user
chown ${SERVICE_USER}:root /etc/${CLUSTER_DIST}/custom.properties
chmod 644 /etc/${CLUSTER_DIST}/custom.properties
}
Using Python in Bootstraps
Install packages and run Python scripts:
before_start() {
# Install Python packages
sudo pip install requests faker
# Run inline Python
python3 << 'EOF'
import requests
response = requests.get('http://service:8080/health')
print(f"Health check: {response.status_code}")
EOF
# Or execute external script
python3 /mnt/bootstrap/mymodule/myscript.py
}
Debugging Bootstrap Scripts
View Bootstrap Logs
# Check if bootstraps completed
docker logs minitrino-default 2>&1 | grep BOOTSTRAP
# View bootstrap script execution
docker logs minitrino-default 2>&1 | grep -A 50 "before_start"
Manually Execute Bootstrap
# Enter container
docker exec -it minitrino-default bash
# View bootstrap script
cat /tmp/minitrino/bootstrap/mymodule/bootstrap.sh
# Execute manually
bash -x /tmp/minitrino/bootstrap/mymodule/bootstrap.sh
Force Re-execution
Minitrino tracks bootstrap checksums to avoid re-running unchanged scripts:
# Remove checksum file to force re-execution
docker exec minitrino-default rm /etc/trino/.minitrino/bootstrap_checksums.json
# Restart container (will re-run bootstraps)
docker restart minitrino-default
Check Bootstrap Files
# View bootstrap directory structure
docker exec minitrino-default tree /tmp/minitrino/bootstrap/
# Check what's mounted
docker exec minitrino-default ls -la /mnt/bootstrap/
Real-World Examples
Example 1: TLS Certificate Generation
From security/tls/resources/cluster/bootstrap.sh:
before_start() {
local ssl_dir=/mnt/etc/tls
# Check if certificates exist
if [[ -f "${ssl_dir}/keystore.jks" ]]; then
echo "TLS artifacts already exist, skipping generation"
return 0
fi
# Generate keystore
keytool -genkeypair \
-alias minitrino \
-keyalg RSA \
-keystore "${ssl_dir}/keystore.jks" \
-storepass changeit \
-validity 3650 \
-dname "CN=*.minitrino.com"
# Export certificate
keytool -export \
-alias minitrino \
-keystore "${ssl_dir}/keystore.jks" \
-file "${ssl_dir}/minitrino_cert.cer" \
-storepass changeit
}
Example 2: Data Loading
From catalog/elasticsearch/resources/cluster/bootstrap-es.sh:
before_start() {
# Wait for Elasticsearch
wait-for-it elasticsearch:9200 --strict --timeout=60
# Check if index exists
if curl -s http://elasticsearch:9200/_cat/indices | grep -q 'user'; then
echo "Index already exists, skipping"
return 0
fi
# Create index
curl -XPUT http://elasticsearch:9200/user -H 'Content-Type: application/json' -d '{
"settings": { "number_of_replicas": 0 }
}'
# Install Python dependencies
sudo pip install faker requests
# Generate and load data
python3 << 'EOF'
import requests
from faker import Faker
fake = Faker()
for i in range(1, 500):
user = {"name": fake.name(), "email": fake.email()}
requests.post("http://elasticsearch:9200/user/_doc/" + str(i), json=user)
EOF
}
Example 3: Configuration File Modification
before_start() {
# Append to JVM config
echo "-Djava.security.krb5.conf=/etc/krb5.conf" >> /etc/${CLUSTER_DIST}/jvm.config
# Modify config.properties
cat << EOF >> /etc/${CLUSTER_DIST}/config.properties
http-server.authentication.krb5.service-name=trino
http-server.authentication.krb5.keytab=/etc/trino.keytab
EOF
# Ensure ownership
chown -R ${SERVICE_USER}:root /etc/${CLUSTER_DIST}/
}
Testing Bootstrap Scripts Locally
Before adding to a module, test bootstrap logic locally:
# 1. Provision cluster without custom bootstrap
minitrino -v provision -m base-module
# 2. Copy test script into container
docker cp my-bootstrap.sh minitrino-default:/tmp/test-bootstrap.sh
# 3. Execute manually
docker exec -it minitrino-default bash -x /tmp/test-bootstrap.sh
# 4. Verify results
docker exec -it minitrino-default cat /etc/trino/config.properties
When Bootstrap Scripts Fail
If provisioning hangs or fails during bootstrap:
Check logs:
docker logs minitrino-default 2>&1 | tail -100Look for errors: Search for “ERROR”, “failed”, or stack traces
Verify external services: Ensure dependencies (DBs, etc.) are running
Test interactively: Enter container and run script manually
Simplify script: Comment out sections to isolate the failure
Check permissions: Verify files/directories are accessible
After fixing issues, destroy and re-provision:
minitrino down
minitrino remove --volumes
minitrino -v provision -m ${module}
More Examples
Additional bootstrap examples can be found in the library:
TLS setup:
lib/modules/security/tls/resources/cluster/bootstrap.shLDAP configuration:
lib/modules/security/ldap/resources/cluster/bootstrap.shOAuth2 setup:
lib/modules/security/oauth2/resources/cluster/bootstrap.shBIAC initialization:
lib/modules/security/biac/resources/cluster/bootstrap.shElasticsearch data loading:
lib/modules/catalog/elasticsearch/resources/cluster/bootstrap-es.sh