Data Storage in Kestra

Understand where different data components (inputs, outputs, logs, etc.) are stored in Kestra's architecture.

Overview

Kestra processes and stores various data components, including flow definitions, workflow inputs, outputs, logs, execution metadata, and more. Understanding where these components are stored is beneficial for optimizing performance, configuring persistence, and integrating with external storage solutions.

Kestra stores data in one of two places: a Repository (a database such as PostgreSQL) or Internal Storage, which defaults to your local storage but can be configured to use an object store such as an AWS S3 bucket or MinIO.

You can read more about Kestra's architecture and Internal Storage in their dedicated documentation.

Data Storage Components

The table below lists the main Kestra data components, where each one is stored, and what it contains.

Data Component	Storage Location	Description
Flows & Definitions	Repository	Flows, tasks, and their configurations are stored in the database.
Namespace	Repository	Namespaces are used to organize workflows and manage access to secrets, plugin defaults, and variables.
Namespace Files	Internal Storage	Namespace Files store code and configuration files directly in Kestra's internal storage backend.
Executions & Metadata	Repository	Each execution, including status, timestamps, and execution metadata, is stored in the database.
Inputs	Internal Storage	Inputs provided to a flow execution are kept in internal storage.
Input Files	Internal Storage	Additional files to pass to any script or CLI task.
Outputs	Internal Storage	Outputs from tasks are stored in Kestra's internal storage system, separate from the database.
Output Files	Internal Storage	Generated files available for download and usable in downstream tasks.
Key-Value Pairs	Internal Storage & Repository (Metadata only)	KV Store holds data in a convenient, key-value format. You can create them directly from the UI, via dedicated tasks, Terraform, or through the API.
Logs & Audit Logs (Enterprise Edition)	Repository	All logs generated by tasks are stored in the database.
Task State & Variables	Repository	Dynamic variables and task states within an execution are stored and retrieved as needed.
Secrets	Repository or External Secret Manager	Secrets can be managed through Kestra's internal database or external secret managers like AWS Secrets Manager, HashiCorp Vault, or Google Secret Manager.
Queues	Repository or Kafka	Internal communication between Kestra server components.
Triggers	Repository	Triggers are event-based mechanisms to automate the execution of your workflows.
User Administration	Repository	This includes RBAC and user management information such as invitations, groups, and roles.

Kestra Internal Storage

Kestra uses Internal Storage to handle incoming and outgoing files in a scalable way. It stores the files generated during a flow execution and is used to pass data between tasks. Execution outputs and artifacts such as output files are stored separately from the database, which allows efficient retrieval of task results while keeping the database small.
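As a rough illustration of how a file moves between tasks through Internal Storage, the flow below downloads a file and hands its storage URI to a script task. This is a minimal sketch: the flow id, namespace, and download URL are hypothetical placeholders, and property names should be checked against the plugin documentation for your Kestra version.

```yaml
id: internal_storage_example      # hypothetical flow id
namespace: company.team           # hypothetical namespace

tasks:
  # The downloaded file is saved to Internal Storage; the task output `uri`
  # points at the stored object rather than holding the file content.
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://example.com/data.csv   # placeholder URL

  # The script task receives the stored file by referencing that URI.
  - id: count_lines
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.csv: "{{ outputs.download.uri }}"
    commands:
      - wc -l data.csv
```

Only the URI travels through the execution context; the file itself stays in the storage backend.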

Used for: Task inputs and outputs, temporary execution data, and artifacts such as Input, Output, and Namespace files.
KV Store: Key-Value pairs are kept in Internal Storage because they may contain sensitive information. The storage can be either your local storage or a private cloud bucket. The database holds only metadata about each pair, such as the key, the file URI, and attributes like TTL, creation date, and last-updated timestamp.
Storage Backend: By default, Kestra's internal storage is your local storage, but it can be configured to use cloud storage options for production such as:
- AWS S3
- Google Cloud Storage
- Azure Blob Storage
- MinIO
- any S3-compatible storage
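The KV Store item above can be sketched with a small flow that writes a pair and reads it back. The flow id, namespace, key, and value are hypothetical, and the exact set of supported properties may differ between Kestra versions.

```yaml
id: kv_example                    # hypothetical flow id
namespace: company.team           # hypothetical namespace

tasks:
  # Writes the value to Internal Storage; the Repository keeps only
  # metadata such as the key and the TTL.
  - id: set_kv
    type: io.kestra.plugin.core.kv.Set
    key: my_key
    value: "hello"
    ttl: P30D                     # ISO 8601 duration, here 30 days
    overwrite: true

  # Reads the value back with the kv() expression function.
  - id: read_kv
    type: io.kestra.plugin.core.log.Log
    message: "{{ kv('my_key') }}"
```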

Configuring Internal Storage

You can configure Kestra's Internal Storage backend in your Kestra configuration, for example in the docker-compose.yaml file. The following example uses AWS S3:

yaml

kestra:
  storage:
    type: s3
    s3:
      # Backend settings are nested under the key matching the storage type
      bucket: "kestra-internal-storage"
      region: "us-east-1"
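For a self-hosted object store, a MinIO configuration might look like the sketch below. All values are placeholders, and the available properties should be confirmed in the Configuration documentation for your version.

```yaml
kestra:
  storage:
    type: minio
    minio:
      endpoint: "<your-endpoint>"      # placeholder host name
      port: "<your-port>"              # placeholder port
      accessKey: "<your-access-key>"   # placeholder credential
      secretKey: "<your-secret-key>"   # placeholder credential
      bucket: "<your-bucket>"          # placeholder bucket name
```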

Check out the Configuration documentation for more on internal storage configuration.

Data Storage Additional Information

Flows & Execution Metadata

Stored in PostgreSQL, MySQL, Microsoft SQL Server (Enterprise Edition), or H2 (not recommended for distributed components) as structured data.
Includes:
- Flow definitions
- Execution details
- Execution Queues
- Historical metadata
Accessible via the Kestra API and UI.

Flows and execution data are stored in a database so that they are persisted and remain available as historical records.
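As a sketch of how the Repository is selected, the configuration below points both the repository and the queue at PostgreSQL. The JDBC URL, username, and password are placeholder values for a local setup, not recommended settings.

```yaml
kestra:
  repository:
    type: postgres        # flows, executions, and logs go to this database
  queue:
    type: postgres        # Open Source: the queue uses the same database

datasources:
  postgres:
    url: jdbc:postgresql://localhost:5432/kestra   # placeholder URL
    driverClassName: org.postgresql.Driver
    username: kestra                               # placeholder credential
    password: k3str4                               # placeholder credential
```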

Logs

Kestra Open Source: Stored in the database.
Kestra Enterprise Edition: Can use the same architecture as Kestra Open Source but also supports an Elasticsearch backend for storing logs.
- In the Enterprise Edition, Audit Logs are also stored in the database.
Logs can be accessed via the API, UI, or through external logging systems when integrated (e.g., Log Shipper).

Queues

Kestra Open Source: Stored in the database.
Kestra Enterprise Edition: Can use the database, as in Kestra Open Source, but also supports Kafka in place of the database for messaging between server components.
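In the Enterprise Edition, switching the queue to Kafka could look roughly like this. The broker address is a placeholder, and the snippet assumes a Kafka cluster is already reachable from the Kestra server components.

```yaml
kestra:
  queue:
    type: kafka           # replaces the database-backed queue
  kafka:
    client:
      properties:
        bootstrap.servers: "localhost:9092"   # placeholder broker address
```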

Secrets Management

Secrets can be stored in:
- Kestra's database (default).
- External Secret Managers, including AWS Secrets Manager, Google Secret Manager, Azure Key Vault, and HashiCorp Vault.
Secrets are encrypted and never exposed in logs.

You can select the secret manager for your Kestra instance in your configuration file. For example, to use AWS Secrets Manager, add the following:

yaml

kestra:
  secret:
    type: aws-secret-manager
    awsSecretManager:
      accessKeyId: mysuperaccesskey
      secretKeyId: mysupersecretkey
      sessionToken: mysupersessiontoken
      region: us-east-1
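Once a secret backend is configured, a flow can read a secret with the secret() expression function. The sketch below is illustrative only: the flow id, namespace, URL, and the secret name API_TOKEN are hypothetical.

```yaml
id: secret_example                # hypothetical flow id
namespace: company.team           # hypothetical namespace

tasks:
  # The secret is resolved at runtime from the configured backend.
  - id: call_api
    type: io.kestra.plugin.core.http.Request
    uri: https://example.com/api  # placeholder URL
    method: GET
    headers:
      Authorization: "Bearer {{ secret('API_TOKEN') }}"
```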

For more configurations, check out the Secret Managers documentation.

Database Maintenance

Because the database can accumulate large volumes of execution data and logs over time, use Purge tasks to remove data that is no longer needed. Regular purging helps both performance and storage capacity.
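A scheduled purge flow along the following lines is one way to do this. It is a sketch, assuming a one-month retention window and a daily schedule; the flow id and namespace are hypothetical, and the retention period should match your own requirements.

```yaml
id: purge                         # hypothetical flow id
namespace: company.team           # hypothetical namespace

tasks:
  # Removes executions older than one month; logs are handled by the next task.
  - id: purge_executions
    type: io.kestra.plugin.core.execution.PurgeExecutions
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
    purgeLog: false

  # Removes logs older than one month.
  - id: purge_logs
    type: io.kestra.plugin.core.log.PurgeLogs
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"

triggers:
  # Runs the purge once a day.
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"
```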

Conclusion

Kestra's storage architecture ensures efficient separation of execution data, logs, and artifacts. While the database handles structured execution metadata, Internal Storage is used for inputs, outputs, and task-generated files, preventing database overload. For large-scale deployments, cloud-based storage solutions can be used to optimize performance.

If you need to isolate these data components from one another, for example between separate business units, check out the Governance section of the Enterprise Edition to learn more about tenants.

For more details about storage architecture, refer to the Architecture and Internal Storage documentation.
