Data Storage in Kestra

Understand where different data components (inputs, outputs, logs, etc.) are stored in Kestra's architecture.

Overview

Kestra processes and stores various data components, including flow definitions, workflow inputs, outputs, logs, execution metadata, and more. Understanding where these components are stored is beneficial for optimizing performance, configuring persistence, and integrating with external storage solutions.

Kestra data is stored either in a database such as PostgreSQL or in internal storage. Internal storage defaults to your local storage but can be configured to use an object store such as an S3 bucket or MinIO. You can read more about Kestra's architecture and internal storage in their dedicated documentation.

Data Storage Components

The table below lists many of the Kestra data components, where each one is stored, and what it is.

| Data Component | Storage Location | Description |
| --- | --- | --- |
| Flows & Definitions | Database | Flows, tasks, and their configurations are stored in the database. |
| Namespace | Database | Namespaces are used to organize workflows and manage access to secrets, plugin defaults, and variables. |
| Namespace Files | Internal Storage | Namespace Files store code and configuration files directly in Kestra's internal storage backend. |
| Executions & Metadata | Database | Each execution, including status, timestamps, and execution metadata, is stored in the database. |
| Inputs | Internal Storage | Inputs provided to a flow execution are kept in internal storage. |
| Input Files | Internal Storage | Additional files to pass to any script or CLI task. |
| Outputs | Internal Storage | Outputs from tasks are stored in Kestra’s internal storage system, separate from the database. |
| Output Files | Internal Storage | Generated files available for download and usable in downstream tasks. |
| Key-Value Pairs | Internal Storage & Database (Metadata only) | KV Store holds data in a convenient, key-value format. You can create them directly from the UI, via dedicated tasks, Terraform, or through the API. |
| Logs & Audit Logs (Enterprise) | Database | All logs generated by tasks are stored in the database. |
| Task State & Variables | Database | Dynamic variables and task states within an execution are stored and retrieved as needed. |
| Secrets | Database or External Secret Manager | Secrets can be managed through Kestra’s internal database or external secret managers like AWS Secrets Manager, HashiCorp Vault, or Google Secret Manager. |
| Queues | Database | Internal communication between Kestra server components. |
| Triggers | Database | Triggers are event-based mechanisms to automate the execution of your workflows. |
| User Administration | Database | This includes RBAC and user management information such as invitations, groups, and roles. |

Kestra Internal Storage

Kestra uses Internal Storage to handle incoming and outgoing files in a scalable way. It stores files generated during a flow execution and is used to pass data between tasks. Execution outputs and artifacts such as output files are stored separately from the database. This allows efficient retrieval of task results while keeping the database small.
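
To illustrate how a file passes between tasks through internal storage, here is a minimal flow sketch. The flow id, namespace, and URL are hypothetical placeholders. The sketch assumes the core Download and Log tasks, with the download task exposing the stored file's internal storage URI as an output.

```yaml
id: internal_storage_example        # hypothetical flow id
namespace: company.team             # hypothetical namespace

tasks:
  # Downloads a file; the result is written to internal storage, not the database
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://example.com/data.csv   # placeholder URL

  # A downstream task refers to the stored file by its internal storage URI
  - id: log_uri
    type: io.kestra.plugin.core.log.Log
    message: "File stored at {{ outputs.download.uri }}"
```

Only the URI travels between tasks; the file content itself stays in the configured storage backend.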

  • Used for: Task inputs and outputs, temporary execution data, and artifacts such as Input, Output, and Namespace files.
  • KV Store: Internal storage is used to store Key-Value pairs, as they may contain sensitive information. This can be either your local storage or a private cloud bucket. The database only contains metadata about each object, such as the key, the file URI, the TTL, the creation date, and the last updated timestamp.
  • Storage Backend: By default, Kestra’s internal storage is your local storage, but it can be configured to use cloud storage options for production such as:
    • Amazon S3
    • Google Cloud Storage
    • Azure Blob Storage
    • Any S3-compatible storage
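
As a sketch of the KV Store in use, the flow below writes a key-value pair with a dedicated task and reads it back with the `kv()` expression function. The flow id, namespace, key, and value are hypothetical. The `ttl` property is assumed to accept an ISO 8601 duration.

```yaml
id: kv_store_example                # hypothetical flow id
namespace: company.team             # hypothetical namespace

tasks:
  # The value is written to internal storage; the database keeps only metadata (key, URI, TTL, ...)
  - id: set_kv
    type: io.kestra.plugin.core.kv.Set
    key: my_key                     # hypothetical key
    value: my_value                 # hypothetical value
    ttl: P30D                       # assumed ISO 8601 duration: expire after 30 days

  # Reads the value back from the namespace's KV Store
  - id: read_kv
    type: io.kestra.plugin.core.log.Log
    message: "{{ kv('my_key') }}"
```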

Configuring Internal Storage

You can configure Kestra’s Internal Storage backend in your Kestra configuration, for example within the docker-compose.yaml file. The following example uses S3:

yaml
kestra:
  storage:
    type: s3
    # Backend-specific settings are nested under a key named after the storage type
    s3:
      bucket: "kestra-internal-storage"
      region: "us-east-1"

Check out the Configuration documentation for more on internal storage configuration.
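
For an S3-compatible backend such as MinIO, the configuration might look like the sketch below. All values are placeholders. The key names are an assumption based on the pattern of nesting backend settings under the storage type, so verify them against the Configuration documentation for your Kestra version.

```yaml
kestra:
  storage:
    type: minio
    minio:
      endpoint: "minio.example.internal"   # placeholder host
      port: "9000"                         # placeholder port
      accessKey: "<your-access-key>"       # placeholder credential
      secretKey: "<your-secret-key>"       # placeholder credential
      bucket: "kestra-internal-storage"    # placeholder bucket name
```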

Data Storage Additional Information

Flows & Execution Metadata

  • Stored in PostgreSQL, MySQL, or H2 (not recommended for distributed components) as structured data.
  • Includes:
    • Flow definitions
    • Execution details
    • Execution Queues
    • Historical metadata
  • Accessible via the Kestra API and UI.

Flows and execution data are stored in a database so that they persist across restarts and remain available as execution history.
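
A minimal sketch of pointing Kestra at a PostgreSQL database follows. The JDBC URL, username, and password are placeholders. The sketch assumes the repository and queue are both backed by the same database, as in the open-source setup described here.

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://localhost:5432/kestra   # placeholder host and database name
    driverClassName: org.postgresql.Driver
    username: kestra                               # placeholder username
    password: "<your-password>"                    # placeholder password

kestra:
  repository:
    type: postgres   # flows, executions, logs, and other metadata
  queue:
    type: postgres   # messaging between server components
```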

Logs

  • Kestra Open Source: Stored in the database.
  • Kestra Enterprise: Can use the same architecture as Kestra Open Source but also supports an Elasticsearch backend for storing logs.
    • In the Enterprise Edition, Audit Logs are also stored in the database.
  • Logs can be accessed via the UI, API, or through external logging systems when integrated (e.g., Log Shipper).

Queues

  • Kestra Open Source: Stored in the database.
  • Kestra Enterprise: Can use the database, as in Kestra Open Source, but also supports Kafka in place of the database for messaging between server components.
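
As a hedged sketch, switching the queue to Kafka in the Enterprise Edition might look like the following. The broker address is a placeholder, and the key layout is an assumption, so check it against the Enterprise Edition configuration reference before relying on it.

```yaml
kestra:
  queue:
    type: kafka                                  # use Kafka instead of the database for queues
  kafka:
    client:
      properties:
        bootstrap.servers: "localhost:9092"      # placeholder broker address
```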

Secrets Management

  • Secrets can be stored in:
    • Kestra’s database (default).
    • External secret managers, including AWS Secrets Manager, Google Secret Manager, and HashiCorp Vault.
  • Secrets are encrypted and never exposed in logs.

You can choose the secret manager for your Kestra instance in your configuration file. For example, to use AWS Secrets Manager, add the following:

yaml
kestra:
  secret:
    type: aws-secret-manager
    awsSecretManager:
      accessKeyId: mysuperaccesskey
      secretKeyId: mysupersecretkey
      sessionToken: mysupersessiontoken
      region: us-east-1

For more configurations, check out the Secret Managers documentation.
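
Whichever backend holds the secret, a flow reads it through the `secret()` expression function. The sketch below assumes a secret named `API_TOKEN` already exists. The flow id, namespace, URL, and secret name are all hypothetical.

```yaml
id: secret_example                  # hypothetical flow id
namespace: company.team             # hypothetical namespace

tasks:
  # The secret value is resolved at runtime from the configured secret backend
  - id: call_api
    type: io.kestra.plugin.core.http.Request
    uri: https://example.com/api    # placeholder URL
    headers:
      Authorization: "Bearer {{ secret('API_TOKEN') }}"   # hypothetical secret name
```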

Database Maintenance

Because the database accumulates execution data and logs over time, use Purge tasks to remove data that is no longer needed. This keeps the instance tidy and helps both performance and storage capacity.
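
A scheduled purge flow could be sketched as follows. The flow id, namespace, schedule, and one-month retention window are illustrative choices, not recommendations. The sketch assumes the core PurgeExecutions and PurgeLogs tasks are available in your Kestra version.

```yaml
id: purge_old_data                  # hypothetical flow id
namespace: system                   # hypothetical namespace

tasks:
  # Removes executions that ended more than one month ago
  - id: purge_executions
    type: io.kestra.plugin.core.execution.PurgeExecutions
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
    purgeLog: false                 # logs are handled by the next task

  # Removes logs older than one month
  - id: purge_logs
    type: io.kestra.plugin.core.log.PurgeLogs
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"

triggers:
  # Runs the purge once a day
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"
```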

Conclusion

Kestra’s storage architecture ensures efficient separation of execution data, logs, and artifacts. While the database handles structured execution metadata, internal storage is used for inputs, outputs, and task-generated files, preventing database overload. For large-scale deployments, cloud-based storage solutions can be used to optimize performance.

If the data components listed need to be broken out into, for example, separate Business Units, check out the Governance section of the Enterprise Edition to learn more about tenants.

For more details about storage architecture, refer to the Architecture and Internal Storage documentation.
