# Kestra Complete Documentation

> Full content snapshot of all Kestra documentation pages. For the curated index, see /llms.txt.
> Append .md to any kestra.io/docs/* URL to retrieve that page as plain Markdown.

Total pages: 488

---

# Kestra Docs: Your Infinitely-Scalable Orchestration Platform

URL: https://kestra.io/docs

> The official documentation for Kestra, an open-source orchestration platform to automate business-critical workflows.

import HomePageButtons from "~/components/content/HomePageButtons.astro"
import WhatsNew from "~/components/common/WhatsNew.astro"
import SupportLinks from "~/components/content/SupportLinks.astro"
import PluginCount from "~/components/content/PluginCount.vue"
import ChildCard from "~/components/docs/ChildCard.astro"

## What is Kestra?

Kestra is an open-source, infinitely-scalable **orchestration platform** that enables all engineers to manage **business-critical workflows** declaratively in code. Thanks to plugins and an embedded code editor with Git and Terraform integrations, Kestra makes scheduled and event-driven workflows easy.
---

# Administrator Guide: Operate and Secure Your Cluster

URL: https://kestra.io/docs/administrator-guide

> The complete Administrator Guide for operating, securing, and scaling your Kestra cluster in production.

import ChildCard from "~/components/docs/ChildCard.astro"

The Administrator Guide covers everything you need to know about managing your Kestra cluster.

- Check the **[Installation Guide](../02.installation/index.mdx)** for details on how to **install Kestra** in your preferred environment.
- Check the **[Configuration Guide](../configuration/index.mdx)** for details on how to **configure Kestra** based on your specific needs.

---

# Backup and Restore Kestra: Flows, Secrets, and Executions

URL: https://kestra.io/docs/administrator-guide/backup-and-restore

> Learn how to perform full or metadata-only backups and restores of your Kestra instance for disaster recovery and migration.

Back up and restore your Kestra instance.

Kestra provides a backup feature for **metadata**. In addition, you can back up and restore the underlying database and internal storage if a metadata-only backup is not sufficient.

:::alert{type="info"}
The commands in the next section assume Kestra runs locally on the host. If you run Kestra in Docker, see the [container example](#example-backup-and-restore-inside-docker) below.
:::

## Metadata-only Backup & Restore (Enterprise Edition)

Since 0.19, [Kestra Enterprise Edition](../../oss-vs-paid/index.md) provides **metadata** backup and restore. You can back up metadata from one Kestra instance and restore it into another — even across different Kestra versions or repository/queue backends.

Perform metadata backup and restore while Kestra is paused to ensure consistency. As a best practice, enable [Maintenance Mode](../../07.enterprise/05.instance/maintenance-mode/index.md) (available since 0.21) before starting.
A metadata backup includes all data **not** related to executions: blueprints, flows, namespaces, roles, secrets (for JDBC and Elasticsearch secrets-manager backends), security integrations, settings, templates, tenants, triggers, users, and access bindings. To include execution-related data, use the `--include-data` flag.

### Metadata backup

To back up instance metadata, run:

```bash
kestra backups create FULL
```

`FULL` backs up the entire instance. To back up a single tenant (when multi-tenancy is enabled), use `TENANT`. In `TENANT` mode, only the selected tenant’s data is included (global users/tenants are excluded).

To back up only specific resources, use the `--resources` flag. For example, to back up the [KV Store](../../06.concepts/05.kv-store/index.md):

```bash
kestra backups create --resources KV_STORE
```

Other resources include: `FLOW`, `NAMESPACE_FILE`, `TRIGGER`, `LOG`, `SECRET`, and more.

By default, backups are encrypted with the embedded Kestra encryption key. You can change this behavior with:

- `--tenant` (for `TENANT` backups): the tenant name to back up. Defaults to the “default” tenant.
- `--encryption-key`: a custom encryption key to use instead of the embedded key.
- `--no-encryption`: disable encryption (not recommended; metadata may contain sensitive information).

:::badge{version=">=0.22" editions="EE"}
:::

- `--include-data`: include execution data (executions, logs, metrics, audit logs). By default, execution data is excluded due to potential size.
- `--internal-log`: set the level for internal logs to include in the backup.
- `-l, --log-level`: set the backup log level (`TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`). Default: `INFO`.

When you start the backup process from the command line, you will see logs like the following, which include a backup summary and the URI of the Kestra internal-storage file where the backup is stored.
```plaintext
2024-09-17 16:33:12,706 INFO create io.kestra.ee.backup.BackupService Backup summary: [BINDING: 3, BLUEPRINT: 1, FLOW: 13, GROUP: 1, NAMESPACE: 1, ROLE: 6, SECRET: 1, SECURITY_INTEGRATION: 0, SETTING: 1, TENANT: 1, TENANT_ACCESS: 2, TRIGGER: 2, USER: 1]
2024-09-17 16:33:12,706 INFO create io.kestra.ee.backup.BackupService Backup instance created in 508 ms
Backup created: kestra:///backups/full/backup-20240917163312.kestra
```

### Metadata restore

To restore an instance from a metadata backup, run the following command using the internal-storage URI returned by the backup:

```bash
kestra backups restore kestra:///backups/full/backup-20240917163312.kestra
```

You can use the following command-line parameters:

- `--encryption-key`: specify a custom encryption key instead of the embedded Kestra key.
- `--to-tenant`: restore the backup into a different tenant.

Starting the restore process from the command line displays logs like the following, which include backup information and a restore summary.
```plaintext
2024-09-17 16:41:06,065 INFO restore io.kestra.ee.backup.BackupService Restoring kestra:///backups/full/backup-20240917163312.kestra
2024-09-17 16:41:06,149 INFO restore io.kestra.ee.backup.BackupService Restoring FULL backup from Kestra version 0.19.0-SNAPSHOT created at 2024-09-17T16:33:12.700099909
2024-09-17 16:41:06,150 INFO restore io.kestra.ee.backup.BackupService Backup summary: [BINDING: 3, BLUEPRINT: 1, FLOW: 13, GROUP: 1, NAMESPACE: 1, ROLE: 6, SECRET: 1, SECURITY_INTEGRATION: 0, SETTING: 1, TENANT: 1, TENANT_ACCESS: 2, TRIGGER: 2, USER: 1]
2024-09-17 16:41:07,182 INFO restore io.kestra.ee.backup.BackupService Restore summary: [BINDING: 3, BLUEPRINT: 1, FLOW: 13, GROUP: 1, NAMESPACE: 1, ROLE: 6, SECRET: 1, SECURITY_INTEGRATION: 0, SETTING: 1, TENANT: 1, TENANT_ACCESS: 2, USER: 1, TRIGGER: 2]
Backup restored from URI: kestra:///backups/full/backup-20240917163312.kestra
```

### Example: Backup and restore inside Docker

If Kestra runs in Docker, use `docker exec` and `docker cp` to move the backup file in and out of the container:

```bash
## Create a full backup (with execution data) from inside the container
docker exec your_container bash -c "./kestra backups create FULL --include-data --no-encryption"

## Copy the backup file from the container to a local directory
docker cp your_container:/app/storage/backups/full/backup123.kestra .

## After upgrading Kestra, copy the backup back into the container
docker cp ./backup123.kestra your_container:/app/storage/backups/full/

## Restore the backup from inside the container
docker exec your_container bash -c "./kestra backups restore kestra:///backups/full/backup123.kestra"
```

## Full backup and restore with backend tools

### Backup & Restore with the JDBC Backend

With the JDBC backend, Kestra can be backed up and restored using the database's native backup tools.

#### Backup & Restore for PostgreSQL

First, stop Kestra to ensure the database is in a stable state.
Although `pg_dump` allows you to back up a running PostgreSQL database, it's always better to perform backups offline when possible.

Next, run the following command (replace the `<username>` and `<db_name>` placeholders with your own values):

```bash
pg_dump -h localhost -p 5432 -U <username> -d <db_name> -F tar -f kestra.tar
```

To restore the backup to a new database, use `pg_restore`:

```bash
pg_restore -h localhost -p 5432 -U <username> -d <db_name> kestra.tar
```

Finally, restart Kestra.

#### Backup & Restore for MySQL

First, stop Kestra to ensure the database is in a stable state. Although MySQL's `mysqldump` allows you to back up a running MySQL database, it's always better to perform backups offline when possible.

Next, run the following command to back up the database (replace the `<username>`, `<password>`, and `<db_name>` placeholders with your own values):

```bash
mysqldump -h localhost -P 3306 -u <username> -p'<password>' <db_name> > kestra.sql
```

To restore the backup to a new database, use the following command:

```bash
mysql -h localhost -P 3306 -u <username> -p'<password>' <db_name> < kestra.sql
```

The `< kestra.sql` part tells MySQL to read and execute the SQL statements contained in the `kestra.sql` backup file as input.

Finally, restart Kestra.

### Backup & Restore with the Elasticsearch and Kafka Backend

With the Elasticsearch and Kafka backend, Kestra can be backed up and restored using Elasticsearch snapshots. Kafka will be reinitialized with the information from Elasticsearch.

This guide assumes you have already configured a snapshot repository in Elasticsearch named `my_snapshot_repository`. Elasticsearch provides several [backup options](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html). Use basic snapshot and restore operations via the Elasticsearch API.
First, create an Elasticsearch snapshot named `kestra`:

```bash
## Kibana Dev Tools (Console) or curl (adjust host/auth as needed)
PUT _snapshot/my_snapshot_repository/kestra?wait_for_completion=true
```

Next, delete all Kestra indices (prefixed with `kestra_` by default) and recreate them using the snapshot:

```bash
POST _snapshot/my_snapshot_repository/kestra/_restore
{
  "indices": "kestra_*"
}
```

If you need to start from a fresh Kafka cluster, reindex Kafka from Elasticsearch with:

```bash
kestra sys-ee restore-queue
```

Since some execution information is stored only in Kafka, not all pending executions may be restarted.

Finally, restart Kestra.

### Backup & Restore of Internal Storage

Kestra’s internal storage can be either a local filesystem or object storage.

- **Local filesystem**: back up/restore the storage directory with your standard filesystem tools.
- **Managed object storage**: enable cross-region replication (often sufficient for DR) or use the provider’s backup tooling.
- **Self-hosted object storage (e.g., MinIO)**: use a tool like [Restic](https://blog.min.io/back-up-restic-minio/) and/or configure replication.

---

# Fix Basic Authentication Issues in Kestra

URL: https://kestra.io/docs/administrator-guide/basic-auth-troubleshooting

> Troubleshoot common issues with Basic Authentication in Kestra, including configuration and login problems.

Troubleshoot issues with Basic Authentication.

Every open-source instance of Kestra requires Basic Authentication (`username` and `password`). You can configure credentials via the Setup Page in the UI (http://localhost:8080/ui/main/setup) or manually in the configuration file under `basic-auth` (recommended for production):

```yaml
kestra:
  server:
    basic-auth:
      username: admin@kestra.io
      password: Admin1234
```

Since Basic Authentication is now required, the `enabled` flag is ignored and should no longer be used. Credentials must be configured to access the Kestra UI or API.
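Programmatic API clients authenticate with a standard HTTP Basic `Authorization` header built from these credentials. The sketch below shows the mechanics in Python; the credentials match the sample configuration above, while the `/api/v1/flows/search` path is an illustrative assumption, not a documented endpoint:

```python
import base64
import urllib.request

# Illustrative credentials -- use the values from your own basic-auth configuration.
username = "admin@kestra.io"
password = "Admin1234"

# HTTP Basic Auth: base64-encode "username:password" and send it as a header.
token = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {"Authorization": f"Basic {token}"}

# Hypothetical request against a local instance (the API path is an assumption).
request = urllib.request.Request(
    "http://localhost:8080/api/v1/flows/search", headers=headers
)
# urllib.request.urlopen(request)  # uncomment when a Kestra instance is running
```

Any HTTP client (curl's `-u user:pass`, for example) performs the same encoding under the hood.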
For new users, simply follow the Setup Page that appears when you first start the Kestra UI. For production deployments, set a valid email address and a strong password in the configuration file.

There are four possible scenarios for existing users.

### Scenario 1: The `enabled` flag is set to `true`

```yaml
kestra:
  server:
    basic-auth:
      enabled: true
      username: admin@kestra.io
      password: Admin1234
```

In this case, the following occurs:

- Now that authentication is required, it is always enabled. Therefore, the `enabled` flag is ignored whether it is `true` or `false`.
- The Setup page **will not** appear when starting Kestra because `username` and `password` are set. You will be prompted to log in with those credentials.
- If either `username` or `password` is missing, Kestra starts with the Setup page and prompts you to create credentials. These values are used for all future logins.
- If `username` or `password` is invalid, Kestra shows an error and prompts you to update the credentials to valid values.

### Scenario 2: The `enabled` flag is set to `false`

```yaml
kestra:
  server:
    basic-auth:
      enabled: false
```

In this case, the following occurs:

- Now that authentication is required, it is always enabled. Therefore, the `enabled` flag is ignored whether it is `true` or `false`.
- On first startup, the Setup page appears and prompts you to create credentials. These are stored in the Kestra database in the **Settings** table under the key `kestra.server.basic-auth` and are used for all subsequent logins.

### Scenario 3: No `basic-auth` configuration is added

If no `basic-auth` configuration is defined:

- The Setup page appears the first time you start Kestra, and you will need to create valid credentials. The credentials are stored in your Kestra database in the **Settings** table under the key `kestra.server.basic-auth` and are used for all future sessions.
:::alert{type="info"}
If you forget your credentials, update the `username` and `password` in the configuration file. The configuration file always takes precedence over values set from the Setup page.
:::

### Scenario 4: Using Authorization headers instead of cookies

Kestra’s API accepts both `Authorization: Basic ...` headers and cookies for authentication. However, the **UI only works with cookie-based authentication** and ignores the `Authorization` header. If you are running in an environment where headers are injected automatically (e.g., via a proxy or authentication middleware), you have two options:

- Use a proxy or middleware to translate the `Authorization` header into a `BASIC_AUTH` cookie before forwarding to the Kestra UI.
- Use a browser extension (e.g., [ModHeader](https://modheader.com/)) to inject the `BASIC_AUTH` cookie directly.

This limitation does not affect API usage, which continues to accept both headers and cookies.

---

# Docker-in-Docker Behind a Proxy: Kestra on Kubernetes

URL: https://kestra.io/docs/administrator-guide/dind-behind-proxy

> Configure Docker-in-Docker (DinD) to run securely behind a corporate or MITM proxy within your Kestra deployment.

Configure Docker-in-Docker (DinD) to run behind a proxy in a Kubernetes-based Kestra deployment.

This guide describes how to configure Docker-in-Docker (DinD) to work **behind a corporate or MITM (Man-in-the-Middle) proxy** in a **rootless** setup within a Kestra deployment.

## Why configure CA certs and proxies for DinD?

Docker-in-Docker (DinD) runs a Docker daemon inside a container, allowing it to build and run other containers. Kestra relies on DinD for certain task types that require Docker runtime isolation. If your environment uses a proxy that intercepts HTTPS traffic (such as an MITM proxy), Docker must **trust the proxy’s CA certificate** when pulling images from remote registries (such as Docker Hub or private registries).
Without this, you'll see errors like:

```plaintext
x509: certificate signed by unknown authority
```

## Prerequisites

1. **Create a ConfigMap for the Docker daemon configuration.** This should include your `daemon.json` with proxy settings. Create a file `daemon.json`:

   ```json
   {
     "proxies": {
       "http-proxy": "http://mitmproxy.default.svc.cluster.local:8000",
       "https-proxy": "http://mitmproxy.default.svc.cluster.local:8000",
       "no-proxy": "localhost,127.0.0.1,.svc,.cluster.local,your.nexus.domain.com,kestra-minio"
     }
   }
   ```

   Apply the ConfigMap:

   ```bash
   kubectl create configmap dind-daemon-config \
     --from-file=daemon.json=./daemon.json \
     -n kestra
   ```

2. **Create a ConfigMap for the MITM proxy CA certificate.** Assuming you have the CA file saved as `mitmproxy-ca.crt`, run:

   ```bash
   kubectl create configmap dind-ca-certs \
     --from-file=ca.crt=./mitmproxy-ca.crt \
     -n kestra
   ```

3. **Kestra configuration.** Here is a configuration sample you can include in your Helm `values.yaml`:

   ```yaml
   configurations:
     application:
       kestra:
         plugins:
           configurations:
             - type: io.kestra.plugin.scripts.runner.docker.Docker
               values:
                 volume-enabled: true
   common:
     extraVolumes:
       - name: docker-daemon-config
         configMap:
           name: dind-daemon-config
       - name: ca-cert-volume
         configMap:
           name: dind-ca-certs
     extraVolumeMounts:
       - name: docker-daemon-config
         mountPath: /home/rootless/.config/docker
         readOnly: true
       - name: ca-cert-volume
         mountPath: /home/rootless/.config/docker/certs.d/mitmproxy.default.svc.cluster.local:8000
         readOnly: true
       - name: ca-cert-volume
         mountPath: /home/rootless/mitmproxy
         readOnly: true
   dind:
     enabled: true
     base:
       rootless:
         image:
           repository: docker
           tag: dind-rootless
           pullPolicy: IfNotPresent
         securityContext:
           runAsUser: 1000
           runAsGroup: 1000
         args:
           - --log-level=fatal
           - --group=1000
         socketPath: /dind/
         tmpPath: /tmp/
         resources: {}
         extraEnv:
           - name: SSL_CERT_FILE
             value: /home/rootless/mitmproxy/ca.crt
   ```

Here, `volume-enabled: true` ensures that the CA certificate is mounted from the DinD pod into any container deployed by a Kestra task.

## DinD in action

This configuration allows the DinD pod to pull the required container images through the MITM proxy. For Kestra tasks that run in Docker containers (e.g., `io.kestra.plugin.scripts.shell.Script`), you also need to set the `HTTPS_PROXY` environment variable and trust the certificate using `beforeCommands`, as shown below. For consistency across tasks, consider configuring these settings as plugin defaults.

```yaml
id: mitm_proxy
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Script
    containerImage: alpine/curl
    beforeCommands:
      - apk add --no-cache ca-certificates
      - update-ca-certificates
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      volumes:
        - /home/rootless/mitmproxy/ca.crt:/usr/local/share/ca-certificates/mitmproxy.crt
    env:
      HTTPS_PROXY: "mitmproxy.default.svc.cluster.local:8000"
    script: |
      curl https://httpbin.org/get
```

## How it Works

- `daemon.json`: tells Docker which proxy settings to use.
- `certs.d`: directory where Docker looks for custom CA certificates to trust registries.
- `SSL_CERT_FILE`: overrides the TLS stack used by the Docker daemon to trust the MITM CA.
- `HTTP_PROXY`, `HTTPS_PROXY`, `NO_PROXY`: standard proxy environment variables for networking.

---

# High Availability in Kestra: Scale Workers and Webservers

URL: https://kestra.io/docs/administrator-guide/high-availability

> Design and configure Kestra for High Availability (HA) to ensure fault tolerance and continuous operation in production.

Kestra is designed for high availability and fault tolerance. This page explains how to configure your deployment to ensure continuous operation.

Highly available systems are built to keep running even in the event of component or infrastructure failures. This is achieved by eliminating single points of failure and introducing redundancy across critical services.
In Kestra, high availability is achieved by running multiple instances of each core component — including the `webserver` (API), `scheduler`, `executor`, `indexer`, and `workers`. This ensures that if one instance fails, the system can continue to operate without interruption.

:::alert{type="info"}
This architecture requires a [Kafka and Elasticsearch deployment](../../08.architecture/index.mdx#architecture-with-kafka-and-elasticsearch-backend), which is designed to be highly available and fault-tolerant.
:::

## Scaling the components

The following components can be scaled horizontally by increasing the number of replicas in your [Helm chart values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml):

- Webserver
- Scheduler
- Executor
- Worker
- Indexer

Additionally, the Elasticsearch and Kafka clusters can be scaled out as needed to handle large volumes of data. Finally, internal storage (such as S3) is highly available and fault-tolerant by design.

:::alert{type="info"}
Ensure that the underlying host system is also tuned for high availability and fault tolerance. For example, adjusting the Linux kernel parameter `net.ipv4.tcp_retries2` can reduce [TCP retransmission times](https://access.redhat.com/solutions/726753).
:::

## Load balancing

To guarantee high availability, deploy a load balancer to distribute incoming requests across multiple `webserver` instances. This prevents downtime if any single instance fails, allowing the system to continue operating without interruption.

---

# JVM CPU Limits for Kestra on Kubernetes

URL: https://kestra.io/docs/administrator-guide/jvm-cpu-limits

> Configure the Kestra Helm chart to force the JVM to honor Kubernetes CPU limits, preventing pods from over-consuming resources.

Force the JVM to match Kubernetes CPU limits through the Kestra Helm chart.
Kestra pods on some Kubernetes clusters can exceed their CPU and memory limits because the Java Virtual Machine (JVM) does not always read cgroup data correctly. This guide explains how to align the JVM with [Kubernetes](../../02.installation/03.kubernetes/index.md) constraints using the Kestra Helm chart and the `-XX:ActiveProcessorCount` flag. For broader deployment guidance, see [High Availability](../high-availability/index.md) and [Monitoring](../03.monitoring/index.md).

## Why JVM sizing can ignore Kubernetes limits

Kubernetes enforces container limits with cgroups, but the JVM may still detect the host capacity instead of the constrained container. When that happens:

- Netty and other pools size themselves for the full node.
- The pod uses more memory and threads than its limits allow and can be killed by the OOM killer.

This behavior depends on the cluster runtime and cgroup configuration, so a consistent fix must live in the Helm chart rather than in the application code.

## Use `-XX:ActiveProcessorCount` to align CPU detection

`-XX:ActiveProcessorCount` lets you tell the JVM how many CPUs to see. Using it inside the Kestra container makes internal pools scale to the CPU count that matches your Kubernetes limits:

```bash
java -XX:ActiveProcessorCount=2 -jar kestra.jar
```

Because many clusters already expose cgroup data correctly, the Helm chart keeps this flag optional and configurable.

## Configure the Kestra Helm chart

The chart adds a dedicated JVM section in `values.yaml`:

```yaml
common:
  jvm:
    forceActiveProcessors:
      enabled: false
      count: "auto"   # "auto" or "value"
      value: 2        # only used when count = "value"
    extraOpts: ""
```

- `enabled`: toggle the feature (disabled by default).
- `count`:
  - `"auto"` derives the CPU count from `resources.limits.cpu`.
  - `"value"` uses a fixed number.
- `value`: CPU count when `count` is set to `"value"`.
- `extraOpts`: additional JVM flags; the chart prepends `-XX:ActiveProcessorCount` when enabled.
### Derive CPU count automatically

Auto mode reads `resources.limits.cpu`, supports values such as `"250m"`, `"1"`, or `"1.5"`, converts them to an integer CPU count (minimum 1), and injects:

```plaintext
-XX:ActiveProcessorCount=<count>
```

Example:

```yaml
common:
  resources:
    limits:
      cpu: "250m"
  jvm:
    forceActiveProcessors:
      enabled: true
      count: "auto"
```

This yields `KESTRA_JAVA_OPTS="-XX:ActiveProcessorCount=1"` for the pod.

### Provide an explicit CPU value

If you prefer a fixed number, switch to `"value"`:

```yaml
common:
  jvm:
    forceActiveProcessors:
      enabled: true
      count: "value"
      value: 3
```

This sets `KESTRA_JAVA_OPTS="-XX:ActiveProcessorCount=3"`, which can be useful to keep the JVM more conservative than the container limit.

### Override per component

Different components can use different CPU counts. Component overrides take precedence over the global setting:

```yaml
common:
  jvm:
    forceActiveProcessors:
      enabled: true
      count: "value"
      value: 2
deployments:
  standalone:
    enabled: true
    jvm:
      forceActiveProcessors:
        enabled: true
        count: "value"
        value: 5
```

## How the chart applies the setting

- The Helm helper computes the CPU count from the global `common.jvm.forceActiveProcessors`, any component override, and the component `resources.limits.cpu` (falling back to `common.resources.limits.cpu`).
- It builds `KESTRA_JAVA_OPTS`, adding `-XX:ActiveProcessorCount=<count>` when enabled and appending `extraOpts`.
- The container exports `KESTRA_JAVA_OPTS`, and the Kestra start script runs `exec java ${KESTRA_JAVA_OPTS} ${JAVA_OPTS} ...`.

## When to enable it

Enable `forceActiveProcessors` when pods hit OOMs or thread pools scale as if the full node is available. Start with auto mode so the JVM mirrors your Kubernetes CPU limits:

```yaml
common:
  resources:
    limits:
      cpu: "2"
  jvm:
    forceActiveProcessors:
      enabled: true
      count: "auto"
```

If your pods already respect limits, keep the feature disabled.
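The auto-mode conversion can be sketched in Python. This is an illustrative model of the logic described above, not the Helm template itself, and the rounding direction for fractional cores (rounding up here) is an assumption; the documented `"250m"` → 1 case holds either way because of the minimum of 1:

```python
import math

def active_processor_count(cpu_limit: str) -> int:
    """Model of auto mode: convert a Kubernetes CPU quantity
    ("250m", "1", "1.5") to an integer CPU count, minimum 1."""
    if cpu_limit.endswith("m"):
        cpus = int(cpu_limit[:-1]) / 1000  # millicores to cores
    else:
        cpus = float(cpu_limit)
    # Rounding fractional cores up is an assumption of this sketch.
    return max(1, math.ceil(cpus))

print(active_processor_count("250m"))  # a 250m limit maps to 1 CPU
print(active_processor_count("2"))
```

With a `"250m"` limit this produces `-XX:ActiveProcessorCount=1`, matching the example above.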
Combine this setting with your existing Helm configuration in [High Availability](../high-availability/index.md) to scale components safely, and monitor the impact using [Prometheus metrics](../prometheus-metrics/index.md).

---

# MITM Proxy: Inspect Kestra's Outbound HTTPS Traffic

URL: https://kestra.io/docs/administrator-guide/mitm-proxy-configuration

> Configure Kestra to route outbound HTTPS traffic through a Man-in-the-Middle (MITM) proxy for secure environments.

Configure outbound HTTP/S traffic through an MITM proxy in Kestra.

This guide explains how to route and inspect Kestra's outbound HTTP/S traffic using an MITM proxy.

## Why use an MITM proxy

In secured or restricted environments, it’s common to route outbound HTTP/S traffic through a **Man-in-the-Middle (MITM) proxy** for auditing, inspection, or policy enforcement. For this to work, clients (Kestra) must:

- Trust the proxy’s CA certificate.
- Route outbound traffic through the proxy.
- Configure the JVM and any auxiliary daemons (e.g., the Docker daemon) to use the proxy and truststore.

:::alert{type="info"}
**Security note:** An MITM proxy intercepts TLS traffic. Only enable this in controlled environments and with appropriate approvals.
:::

## Prerequisites

### 1. Create a Java truststore with the MITM CA certificate

Import the MITM CA certificate into a Java keystore so the JVM trusts intercepted TLS connections:

```bash
keytool -importcert -alias mitmproxy-ca -storepass changeit -keystore truststore.jks -trustcacerts -file mitmproxy-ca.crt -noprompt
```

:::alert{type="info"}
Tip: prefer a strong password instead of `changeit` in production. You can also use PKCS12 by setting `-deststoretype PKCS12`.
:::

### 2. (Kubernetes) Create a Secret containing the truststore

Create a Kubernetes secret from the `truststore.jks`:

```bash
kubectl create secret generic kestra-ssl --from-file=truststore.jks -n kestra
```

This secret will be mounted into Kestra pods.
## Configuring Kestra to use the MITM proxy

You must update the [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) and ensure the truststore is available inside the container. Below are suggested changes for both Kubernetes (Helm) and Docker Compose deployments.

### 1. Micronaut / Kestra configuration

Add proxy settings and truststore configuration to your [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) (merged via Helm `configurations.application` or a config file):

```yaml
## values.yaml
configurations:
  application:
    micronaut:
      http:
        client:
          proxy-address: "your.proxy.net:8000"
          proxy-type: HTTP
      server:
        ssl:
          clientAuthentication: want
          trustStore:
            path: "file:/app/ssl/truststore.jks"
            password: "changeit"
            type: "JKS"
```

### 2. Mount the truststore inside the container

**Kubernetes (Helm `values.yaml`)**

```yaml
common:
  extraVolumeMounts:
    - name: ssl-secret
      mountPath: "/app/ssl"
      readOnly: true
  extraVolumes:
    - name: ssl-secret
      secret:
        secretName: kestra-ssl
```

**Docker Compose**

```yaml
services:
  kestra:
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - tmp-kestra:/tmp/kestra-wd
      - ./ssl:/app/ssl # ensure ./ssl/truststore.jks exists on host
```

### 3. JVM environment variables (JAVA_OPTS)

**Kubernetes (`values.yaml`)**

```yaml
common:
  extraEnv:
    - name: JAVA_OPTS
      value: >-
        -Djavax.net.ssl.trustStore=/app/ssl/truststore.jks
        -Djavax.net.ssl.trustStorePassword=changeit
        -Djavax.net.ssl.trustStoreType=JKS
        -Dhttp.proxyHost=your.proxy.net
        -Dhttp.proxyPort=8000
        -Dhttps.proxyHost=your.proxy.net
        -Dhttps.proxyPort=8000
        -Dhttp.nonProxyHosts=localhost|127.0.0.1|kubernetes.default.svc|.svc|.cluster.local|your.nexus.domain.com|kestra-minio
```

**Docker Compose**

```yaml
services:
  kestra:
    environment:
      - JAVA_OPTS=-Djavax.net.ssl.trustStore=/app/ssl/truststore.jks -Djavax.net.ssl.trustStorePassword=changeit -Djavax.net.ssl.trustStoreType=JKS -Dhttp.proxyHost=your.proxy.net -Dhttp.proxyPort=8000 -Dhttps.proxyHost=your.proxy.net -Dhttps.proxyPort=8000 -Dhttp.nonProxyHosts=localhost|127.0.0.1|your.nexus.domain.com
```

## Troubleshooting

1. **TLS handshake errors**: verify that `truststore.jks` contains the correct CA (`keytool -list -keystore truststore.jks`).
2. **Requests not reaching the proxy**: confirm that `http.proxyHost` / `https.proxyHost` and `http.nonProxyHosts` are correct.
3. **Docker image pull failures**: add the MITM CA to the Docker daemon certs (`/etc/docker/certs.d/.../ca.crt`).
4. **Debugging TLS**: temporarily enable `-Djavax.net.debug=ssl,handshake`.

---

# Kestra Monitoring: Prometheus, Alerts, and Health Checks

URL: https://kestra.io/docs/administrator-guide/monitoring

> Monitor and alert on Kestra health. Best practices for setting up Prometheus metrics, health checks, and failure notifications for your instance.

This page provides best practices for setting up alerting and monitoring in your Kestra instance.

Failure alerts are essential. When a production workflow fails, you should be notified immediately.
To implement failure alerting, you can use Kestra’s built-in notification tasks, such as:

- [Slack](/plugins/plugin-slack)
- [Microsoft Teams](/plugins/plugin-teams)
- [Email](/plugins/plugin-mail)

Technically, you can add custom failure alerts to each flow separately using the `errors` tasks:

```yaml
id: onFailureAlert
namespace: company.team

tasks:
  - id: fail
    type: io.kestra.plugin.core.execution.Fail

errors:
  - id: slack
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    messageText: "Failure alert for flow `{{ flow.namespace }}.{{ flow.id }}` with ID `{{ execution.id }}`. Here is a bit more context about why the execution failed: `{{ errorLogs() }}`"
```

However, this can lead to boilerplate code when this `errors` configuration is duplicated across multiple flows. For centralized namespace-level alerting, create a dedicated monitoring workflow with a notification task and a Flow trigger.

Below is an example workflow that automatically sends a Slack alert as soon as any flow in the namespace `company.analytics` fails or finishes with warnings.

```yaml
id: failureAlertToSlack
namespace: company.monitoring

tasks:
  - id: send
    type: io.kestra.plugin.slack.notifications.SlackExecution
    url: "{{ secret('SLACK_WEBHOOK') }}"
    channel: "#general"
    executionId: "{{trigger.executionId}}"

triggers:
  - id: listen
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatus
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespace
        namespace: company.analytics
        prefix: true
```

Adding this single flow ensures that you receive a Slack alert on any flow failure in the `company.analytics` namespace. Here is an example alert notification:

![alert notification](../../03.tutorial/06.errors/alert-notification.png)

:::alert{type="warning"}
To send this alert on failure across multiple namespaces, add an `OrCondition` to the `conditions` list.
See the example below:

```yaml
id: alert
namespace: company.system

tasks:
  - id: send
    type: io.kestra.plugin.slack.notifications.SlackExecution
    url: "{{ secret('SLACK_WEBHOOK') }}"
    channel: "#general"
    executionId: "{{trigger.executionId}}"

triggers:
  - id: listen
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatus
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.Or
        conditions:
          - type: io.kestra.plugin.core.condition.ExecutionNamespace
            namespace: company.product
            prefix: true
          - type: io.kestra.plugin.core.condition.ExecutionFlow
            flowId: cleanup
            namespace: company.system
```
:::

The example above works correctly. However, if you list the conditions without using `OrCondition`, no alerts will be sent because Kestra tries to match all conditions simultaneously. Since there’s no overlap between them, the conditions cancel each other out. See the example below:

```yaml
id: bad_example
namespace: company.monitoring
description: This example will not work

tasks:
  - id: send
    type: io.kestra.plugin.slack.notifications.SlackExecution
    url: "{{ secret('SLACK_WEBHOOK') }}"
    channel: "#general"
    executionId: "{{trigger.executionId}}"

triggers:
  - id: listen
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatus
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespace
        namespace: company.product
        prefix: true
      - type: io.kestra.plugin.core.condition.ExecutionFlow
        flowId: cleanup
        namespace: company.system
```

Here, there's no overlap between the two conditions: the first matches only executions in the `company.product` namespace, while the second matches only executions of the `cleanup` flow in the `company.system` namespace. To match executions of the `cleanup` flow in the `company.system` namespace **or** any execution in the `company.product` namespace, use `OrCondition`.
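The matching behavior can be modeled in a few lines of Python. This is an illustrative sketch of the semantics described above, not Kestra's actual condition classes: top-level conditions are combined with AND, so two mutually exclusive conditions never both match, while an `Or` group matches when any of its children does.

```python
# Illustrative model of trigger-condition evaluation (not Kestra's implementation).
# A condition is a predicate over an execution; the trigger fires only when ALL
# top-level conditions match, while an "Or" condition matches if ANY child does.

def namespace_condition(namespace, prefix=False):
    return lambda ex: (ex["namespace"].startswith(namespace) if prefix
                       else ex["namespace"] == namespace)

def flow_condition(namespace, flow_id):
    return lambda ex: ex["namespace"] == namespace and ex["flow"] == flow_id

def or_condition(*children):
    return lambda ex: any(child(ex) for child in children)

def trigger_fires(conditions, execution):
    return all(cond(execution) for cond in conditions)

cleanup_run = {"namespace": "company.system", "flow": "cleanup"}

# Bad example: both conditions listed at the top level -> ANDed, never both true.
bad = [namespace_condition("company.product", prefix=True),
       flow_condition("company.system", "cleanup")]

# Good example: the same two conditions wrapped in an Or group.
good = [or_condition(namespace_condition("company.product", prefix=True),
                     flow_condition("company.system", "cleanup"))]

print(trigger_fires(bad, cleanup_run))   # False: the conditions cancel each other out
print(trigger_fires(good, cleanup_run))  # True: the Or branch matches
```

The same execution that silently fails to match in the flat list fires the trigger once the conditions are grouped under `Or`.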
## Monitoring

Kestra exposes a monitoring endpoint on port 8081 by default. You can change this port using the `endpoints.all.port` property in the [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md).

This monitoring endpoint provides invaluable information for troubleshooting and monitoring, including Prometheus metrics and several of Kestra's internal routes. For instance, as long as your Kestra instance is healthy, the `/health` endpoint exposed by default on port 8081 (e.g., http://localhost:8081/health) returns a response similar to the one below:

```json
{
  "name": "kestra",
  "status": "UP",
  "details": {
    "jdbc": {
      "name": "kestra",
      "status": "UP",
      "details": {
        "jdbc:postgresql://postgres:5432/kestra": {
          "name": "kestra",
          "status": "UP",
          "details": {
            "database": "PostgreSQL",
            "version": "15.3 (Debian 15.3-1.pgdg110+1)"
          }
        }
      }
    },
    "compositeDiscoveryClient()": {
      "name": "kestra",
      "status": "UP",
      "details": {
        "services": {}
      }
    },
    "service": {
      "name": "kestra",
      "status": "UP"
    },
    "diskSpace": {
      "name": "kestra",
      "status": "UP",
      "details": {
        "total": 204403494912,
        "free": 13187035136,
        "threshold": 10485760
      }
    }
  }
}
```

## Prometheus

Kestra exposes [Prometheus](https://prometheus.io/) metrics on the `/prometheus` endpoint. This endpoint is compatible with Prometheus and can be scraped by any Prometheus-based monitoring system.

For more details about Prometheus setup, refer to the [Monitoring with Grafana & Prometheus](../../15.how-to-guides/monitoring/index.md) article.

:::alert{type="info"}
For a complete list of available metrics, refer to the [Prometheus metrics page](../prometheus-metrics/index.md).
:::

### Kestra's metrics

Use Kestra's internal metrics to configure custom alerts. Each metric provides multiple time series with tags that let you track at least the namespace and flow, plus additional tags depending on the task. Kestra metrics use the prefix `kestra`.
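As an illustration of scraping these metrics, here is a minimal Prometheus scrape configuration sketch (the job name and target are assumptions — adjust the host and port to your deployment):

```yaml
scrape_configs:
  - job_name: kestra            # hypothetical job name
    metrics_path: /prometheus   # Kestra's Prometheus endpoint
    static_configs:
      - targets: ["localhost:8081"]  # default management port
```

Once scraped, the metrics listed below appear with the `kestra` prefix; the exact exported names also depend on Prometheus naming conventions (e.g., unit and type suffixes), so verify them against your instance's `/prometheus` output.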
The `kestra` prefix can be changed using the `kestra.metrics.prefix` property in the [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md). Each task type can expose custom metrics that will also be exposed on Prometheus.

#### Worker

|Metrics|Type|Description|
|-|-|-|
|worker.running.count|`GAUGE`|Number of tasks currently running|
|worker.started.count|`COUNTER`|Count of tasks started|
|worker.retried.count|`COUNTER`|Count of tasks retried|
|worker.ended.count|`COUNTER`|Count of tasks completed|
|worker.ended.duration|`TIMER`|Duration of tasks completed|
|worker.job.running|`GAUGE`|Count of currently running worker jobs|
|worker.job.pending|`GAUGE`|Count of currently pending worker jobs|
|worker.job.thread|`GAUGE`|Total worker job thread count|

:::alert{type="info"}
The `worker.job.pending`, `worker.job.running`, and `worker.job.thread` metrics are intended for autoscaling [worker servers](../../08.architecture/02.server-components/index.md#worker).
:::

#### Executor

|Metrics|Type|Description|
|-|-|-|
|executor.taskrun.next.count|`COUNTER`|Count of tasks found|
|executor.taskrun.ended.duration|`TIMER`|Duration of tasks completed|
|executor.workertaskresult.count|`COUNTER`|Count of task results sent by a worker|
|executor.execution.started.count|`COUNTER`|Count of executions started|
|executor.execution.end.count|`COUNTER`|Count of executions completed|
|executor.execution.duration|`TIMER`|Duration of executions completed|
|executor.flowable.execution.count|`COUNTER`|Count of flowable tasks executed|
|executor.execution.popped.count|`COUNTER`|Count of executions popped|
|executor.execution.queued.count|`COUNTER`|Count of executions queued|
|executor.thread.count|`COUNTER`|Count of executor threads|

#### Indexer

|Metrics|Type|Description|
|-|-|-|
|indexer.count|`COUNTER`|Count of index requests sent to a repository|
|indexer.duration|`DURATION`|Duration of index requests sent to a repository|

#### Scheduler

|Metrics|Type|Description|
|-|-|-|
|scheduler.trigger.count|`COUNTER`|Count of triggers|
|scheduler.evaluate.running.count|`COUNTER`|Count of trigger evaluations currently running|
|scheduler.evaluate.duration|`TIMER`|Duration of trigger evaluation|

#### JDBC Queue

|Metrics|Type|Description|
|-|-|-|
|queue.big_message.count|`COUNTER`|Count of big messages|
|queue.produce.count|`COUNTER`|Count of produced messages|
|queue.receive.duration|`TIMER`|Duration to receive and consume a batch of messages|
|queue.poll.size|`GAUGE`|Size of a poll to the queue (message batch size)|

### Other metrics

Kestra also exposes all internal metrics from the following sources: - [Micronaut](https://micronaut-projects.github.io/micronaut-micrometer/latest/guide/) -
[Kafka](https://kafka.apache.org/documentation/#remote_jmx) - Thread pools of the application - JVM See the [Micronaut documentation](https://micronaut-projects.github.io/micronaut-micrometer/latest/guide/) for more information. ## Grafana and Kibana Kestra uses Elasticsearch to store all executions and metrics. You can create a dashboard with [Grafana](https://grafana.com/) or [Kibana](https://www.elastic.co/kibana) to monitor the health of your Kestra instance. Share your dashboard with [the community](/slack). Below is an example Grafana dashboard you can use as a starting point: ![grafana](./grafana.png) :::collapse{title="Grafana Dashboard JSON"} ```json { "annotations": { "list": [ { "builtIn": 1, "datasource": { "type": "grafana", "uid": "-- Grafana --" }, "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "type": "dashboard" } ] }, "editable": true, "fiscalYearStartMonth": 0, "graphTooltip": 0, "id": 1862, "links": [], "panels": [ { "collapsed": false, "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, "id": 3, "panels": [], "repeat": "namespace", "title": "INSTANCE: $namespace", "type": "row" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "light-blue" } ] }, "unit": "core" }, "overrides": [] }, "gridPos": { "h": 3, "w": 4, "x": 0, "y": 1 }, "id": 6, "options": { "colorMode": "background_solid", "graphMode": "area", "justifyMode": "center", "orientation": "auto", "percentChangeColorMode": "standard", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showPercentChange": false, "textMode": "auto", "wideLayout": true }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": 
"sum(kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"})", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "cpu requests", "type": "stat" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "light-blue" } ] }, "unit": "decbytes" }, "overrides": [] }, "gridPos": { "h": 3, "w": 4, "x": 4, "y": 1 }, "id": 5, "options": { "colorMode": "background_solid", "graphMode": "area", "justifyMode": "auto", "orientation": "auto", "percentChangeColorMode": "standard", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showPercentChange": false, "textMode": "auto", "wideLayout": true }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "sum(kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"})", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "memory requests", "type": "stat" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "decimals": 2, "mappings": [], "max": 1, "min": 0, "thresholds": { "mode": "absolute", "steps": [ { "color": "light-green" }, { "color": "light-orange", "value": 0.25 }, { "color": "light-red", "value": 0.75 } ] }, "unit": "percentunit" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 8, "y": 1 }, "id": 4, "options": { "displayMode": "lcd", "legend": { "calcs": [], "displayMode": "list", "placement": "bottom", "showLegend": false }, "maxVizHeight": 20, "minVizHeight": 20, "minVizWidth": 8, "namePlacement": "auto", "orientation": "horizontal", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, 
"showUnfilled": true, "sizing": "manual", "text": { "titleSize": 12, "valueSize": 16 }, "valueMode": "color" }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "avg(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", container!=\"\"}[2m])) by (container) / avg(kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"}) by (container)", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "cpu consumptions / usages vs requests", "type": "bargauge" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "decimals": 2, "mappings": [], "max": 1, "min": 0, "thresholds": { "mode": "absolute", "steps": [ { "color": "light-green" }, { "color": "light-orange", "value": 0.25 }, { "color": "light-red", "value": 0.75 } ] }, "unit": "percentunit" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 16, "y": 1 }, "id": 12, "options": { "displayMode": "lcd", "legend": { "calcs": [], "displayMode": "list", "placement": "bottom", "showLegend": false }, "maxVizHeight": 20, "minVizHeight": 20, "minVizWidth": 8, "namePlacement": "auto", "orientation": "horizontal", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showUnfilled": true, "sizing": "manual", "text": { "titleSize": 12, "valueSize": 16 }, "valueMode": "color" }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "avg(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", container!=\"\"}[2m])) by (container) / avg(kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"cpu\"}) by (container)", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "cpu overloads / 
usages vs limits", "type": "bargauge" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "light-blue" } ] }, "unit": "currencyEUR" }, "overrides": [] }, "gridPos": { "h": 3, "w": 4, "x": 0, "y": 4 }, "id": 16, "options": { "colorMode": "background_solid", "graphMode": "area", "justifyMode": "center", "orientation": "auto", "percentChangeColorMode": "standard", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showPercentChange": false, "textMode": "auto", "wideLayout": true }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "monthly_namespace_cost{exported_namespace=\"$namespace\"}", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "monthly cost", "type": "stat" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "light-blue" } ] }, "unit": "currencyEUR" }, "overrides": [] }, "gridPos": { "h": 3, "w": 4, "x": 4, "y": 4 }, "id": 17, "options": { "colorMode": "background_solid", "graphMode": "area", "justifyMode": "center", "orientation": "auto", "percentChangeColorMode": "standard", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showPercentChange": false, "textMode": "auto", "wideLayout": true }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "monthly_namespace_slack{exported_namespace=\"$namespace\"}", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "monthly slack", "type": "stat" }, { 
"datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "custom": { "axisPlacement": "auto", "fillOpacity": 50, "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineWidth": 0, "spanNulls": false }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" }, { "color": "red", "value": 1 } ] } }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 0, "y": 7 }, "id": 8, "options": { "alignValue": "left", "legend": { "displayMode": "list", "placement": "bottom", "showLegend": false }, "mergeValues": true, "rowHeight": 0.8, "showValue": "never", "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "count(kube_pod_container_status_last_terminated_reason{reason=\"OOMKilled\", namespace=\"$namespace\"}) or vector(0)", "legendFormat": "OOM Killed", "range": true, "refId": "OOMKilled" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "count(ALERTS{alertstate=\"pending\", namespace=\"$namespace\"}) or vector(0)", "hide": false, "instant": false, "legendFormat": "Alerts Pending", "range": true, "refId": "Alerts Pending" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "count(ALERTS{alertstate=\"firing\", namespace=\"$namespace\"}) or vector(0)", "hide": false, "instant": false, "legendFormat": "Alerts Firing", "range": true, "refId": "Alerts Firing" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "count(avg(container_memory_usage_bytes{namespace=\"$namespace\", container!=\"\"}) by (container) / avg(kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"}) by (container) > 0.8) or vector(0)", "hide": false, "instant": 
false, "legendFormat": "Memory Limits 80%", "range": true, "refId": "Memory Limits" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "count(ALERTS{alertname=\"ProbesStatusError\", alertstate=\"firing\", job=\"blackbox-$namespace\"} == 1) or vector(0)", "hide": false, "instant": false, "legendFormat": "Blackbox", "range": true, "refId": "Blackbox" } ], "title": "sanity checks", "type": "state-timeline" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "decimals": 2, "mappings": [], "max": 1, "min": 0, "thresholds": { "mode": "absolute", "steps": [ { "color": "green" }, { "color": "light-orange", "value": 0.25 }, { "color": "light-red", "value": 0.75 } ] }, "unit": "percentunit" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 8, "y": 7 }, "id": 7, "options": { "displayMode": "lcd", "legend": { "calcs": [], "displayMode": "list", "placement": "bottom", "showLegend": false }, "maxVizHeight": 20, "minVizHeight": 20, "minVizWidth": 8, "namePlacement": "auto", "orientation": "horizontal", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showUnfilled": true, "sizing": "manual", "text": { "titleSize": 12, "valueSize": 16 }, "valueMode": "color" }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "avg(container_memory_usage_bytes{namespace=\"$namespace\", container!=\"\"}) by (container) / avg(kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"}) by (container)", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "memory consumptions / usages vs requests", "type": "bargauge" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "decimals": 2, 
"mappings": [], "max": 1, "min": 0, "thresholds": { "mode": "absolute", "steps": [ { "color": "green" }, { "color": "light-orange", "value": 0.25 }, { "color": "light-red", "value": 0.75 } ] }, "unit": "percentunit" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 16, "y": 7 }, "id": 10, "options": { "displayMode": "lcd", "legend": { "calcs": [], "displayMode": "list", "placement": "bottom", "showLegend": false }, "maxVizHeight": 20, "minVizHeight": 20, "minVizWidth": 8, "namePlacement": "auto", "orientation": "horizontal", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showUnfilled": true, "sizing": "manual", "text": { "titleSize": 12, "valueSize": 16 }, "valueMode": "color" }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "exemplar": false, "expr": "avg(container_memory_usage_bytes{namespace=\"$namespace\", container!=\"\"}) by (container) / avg(kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"}) by (container)", "instant": true, "legendFormat": "{{container}}", "range": false, "refId": "A" } ], "title": "memory overloads / usages vs limits", "type": "bargauge" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 10, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "smooth", "lineWidth": 2, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { 
"color": "green" } ] }, "unit": "core" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 0, "y": 13 }, "id": 2, "options": { "legend": { "calcs": [], "displayMode": "table", "placement": "right", "showLegend": true, "width": 200 }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "avg(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", container!=\"\", pod!~\"^app-bootstrap.*\"}[2m])) by (container)", "legendFormat": "{{container}}", "range": true, "refId": "A" } ], "title": "cpu usages", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 10, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "smooth", "lineWidth": 2, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" }, { "color": "red", "value": 80 } ] }, "unit": "decbytes" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 8, "y": 13 }, "id": 1, "options": { "legend": { "calcs": [], "displayMode": "table", "placement": "right", "showLegend": true, "width": 200 }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "avg(container_memory_usage_bytes{namespace=\"$namespace\", 
container!=\"\", container!=\"certresolver\", pod!~\"^app-bootstrap.*\"}) by (namespace, container)", "legendFormat": "{{container}}", "range": true, "refId": "A" } ], "title": "memory usages", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 10, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "smooth", "lineWidth": 2, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" } ] }, "unit": "binBps" }, "overrides": [] }, "gridPos": { "h": 6, "w": 8, "x": 16, "y": 13 }, "id": 13, "options": { "legend": { "calcs": [], "displayMode": "table", "placement": "right", "showLegend": true, "width": 200 }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "avg by (namespace) (\n rate(container_network_receive_bytes_total{namespace=\"$namespace\"}[2m])\n * on (namespace,pod) group_left ()\n topk by (namespace,pod) (\n 1,\n max by (namespace,pod) (kube_pod_info{host_network=\"false\"})\n )\n)", "legendFormat": "receive_bytes", "range": true, "refId": "receive_bytes" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "avg by (namespace) (\n rate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[2m])\n * on (namespace,pod) group_left ()\n topk by 
(namespace,pod) (\n 1,\n max by (namespace,pod) (kube_pod_info{host_network=\"false\"})\n )\n)", "hide": false, "instant": false, "legendFormat": "transmit_bytes", "range": true, "refId": "transmit_bytes" } ], "title": "network", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 10, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "linear", "lineStyle": { "fill": "solid" }, "lineWidth": 2, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "light-green" } ] }, "unit": "none" }, "overrides": [] }, "gridPos": { "h": 8, "w": 12, "x": 0, "y": 19 }, "id": 14, "options": { "legend": { "calcs": [], "displayMode": "table", "placement": "right", "showLegend": true }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "avg(increase(kestra_worker_ended_duration_seconds_count{namespace=\"$namespace\"}[5m])) by (flow_id) > 0", "legendFormat": "{{ flow_id }}", "range": true, "refId": "indexer_message_in_count" } ], "title": "flows activities", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": 
"auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 10, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "smooth", "lineWidth": 2, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "light-green" } ] }, "unit": "none" }, "overrides": [] }, "gridPos": { "h": 8, "w": 12, "x": 12, "y": 19 }, "id": 20, "options": { "legend": { "calcs": [], "displayMode": "list", "placement": "bottom", "showLegend": false }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "sum(kestra_worker_job_running{namespace=\"$namespace\"})", "legendFormat": "{{ flow_id }}", "range": true, "refId": "indexer_message_in_count" } ], "title": "worker jobs running", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" } ] }, "unit": "decbytes" }, "overrides": [] }, "gridPos": { "h": 5, "w": 4, "x": 0, "y": 27 }, "id": 19, "options": { "colorMode": "background_solid", "graphMode": "area", "justifyMode": "center", "orientation": "auto", "percentChangeColorMode": "standard", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showPercentChange": false, "textMode": "auto", "wideLayout": true }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "sum(jvm_memory_used_bytes{namespace=\"$namespace\"})", "hide": false, "instant": false, 
"legendFormat": "__auto", "range": true, "refId": "A" } ], "title": "jvm memory", "type": "stat" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" } ] }, "unit": "decbytes" }, "overrides": [] }, "gridPos": { "h": 5, "w": 20, "x": 4, "y": 27 }, "id": 18, "options": { "colorMode": "value", "graphMode": "area", "justifyMode": "auto", "orientation": "auto", "percentChangeColorMode": "standard", "reduceOptions": { "calcs": [ "lastNotNull" ], "fields": "", "values": false }, "showPercentChange": false, "textMode": "auto", "wideLayout": true }, "pluginVersion": "12.0.1", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "editorMode": "code", "expr": "avg(jvm_memory_used_bytes{namespace=\"$namespace\"}) by (id)", "hide": false, "instant": false, "legendFormat": "__auto", "range": true, "refId": "A" } ], "title": "jvm memory used", "type": "stat" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 0, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "linear", "lineWidth": 1, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" }, { "color": "red", "value": 80 } ] } }, "overrides": [] }, "gridPos": { "h": 9, "w": 12, "x": 0, "y": 32 }, "id": 21, "options": { "legend": { "calcs": [], "displayMode": 
"table", "placement": "right", "showLegend": true }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "editorMode": "code", "expr": "avg(rate(http_server_requests_seconds_count{namespace=\"$namespace\", uri!~\"(UNMATCHED_URI|REDIRECTION|/health|/prometheus|/health/{selector})\"}[2m])) by (uri)", "legendFormat": "__auto", "range": true, "refId": "A" } ], "title": "http_server_requests_seconds_count", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisBorderShow": false, "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "barWidthFactor": 0.6, "drawStyle": "line", "fillOpacity": 0, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "insertNulls": false, "lineInterpolation": "linear", "lineWidth": 1, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green" }, { "color": "red", "value": 80 } ] } }, "overrides": [] }, "gridPos": { "h": 9, "w": 12, "x": 12, "y": 32 }, "id": 22, "options": { "legend": { "calcs": [], "displayMode": "table", "placement": "right", "showLegend": true }, "tooltip": { "hideZeros": false, "mode": "single", "sort": "none" } }, "pluginVersion": "12.0.1", "targets": [ { "editorMode": "code", "expr": "avg(rate(http_server_requests_seconds_count{namespace=\"$namespace\", uri=~\"(REDIRECTION|UNMATCHED_URI)\"}[2m])) by (uri)", "legendFormat": "__auto", "range": true, "refId": "A" } ], "title": "http_server_requests_seconds_count", "type": "timeseries" } ], "preload": false, "refresh": "10s", "schemaVersion": 41, "tags": [], "templating": { "list": [ { 
"current": { "text": "Prometheus Sample", "value": "Prometheus Sample" }, "label": "Datasource", "name": "datasource", "options": [ { "selected": false, "text": "Prometheus Sample 1", "value": "Prometheus Sample 1" }, { "selected": false, "text": "Prometheus Sample 2", "value": "Prometheus Sample 2" }, { "selected": true, "text": "Prometheus Sample", "value": "Prometheus Sample" } ], "query": "Prometheus Sample 1,Prometheus Sample 2,Prometheus Sample", "type": "custom" }, { "current": { "text": "kestra", "value": "kestra" }, "datasource": { "type": "prometheus", "uid": "${datasource}" }, "definition": "label_values(kestra_jdbc_query_duration_seconds_count,kestra_instance)", "includeAll": false, "label": "KestraServer", "name": "KestraServer", "options": [], "query": { "qryType": 1, "query": "label_values(kestra_jdbc_query_duration_seconds_count,kestra_instance)", "refId": "PrometheusVariableQueryEditor-VariableQuery" }, "refresh": 1, "regex": "", "type": "query" }, { "current": { "text": "All", "value": [ "$__all" ] }, "datasource": { "type": "prometheus", "uid": "${datasource}" }, "definition": "label_values(kestra_jdbc_query_duration_seconds_count{kestra_instance=\"$KestraServer\"},kestra_cloud_instance_name)", "includeAll": true, "label": "Instance", "multi": true, "name": "instance", "options": [], "query": { "qryType": 1, "query": "label_values(kestra_jdbc_query_duration_seconds_count{kestra_instance=\"$KestraServer\"},kestra_cloud_instance_name)", "refId": "PrometheusVariableQueryEditor-VariableQuery" }, "refresh": 1, "regex": "", "type": "query" }, { "current": { "text": "All", "value": "$__all" }, "datasource": { "type": "prometheus", "uid": "${datasource}" }, "definition": "label_values(kestra_jdbc_query_duration_seconds_count{kestra_instance=~\"$KestraServer\", kestra_cloud_instance_name=~\"$instance\"},namespace)", "hide": 2, "includeAll": true, "label": "Namespace", "multi": true, "name": "namespace", "options": [], "query": { "qryType": 1, "query": 
"label_values(kestra_jdbc_query_duration_seconds_count{kestra_instance=~\"$KestraServer\", kestra_cloud_instance_name=~\"$instance\"},namespace)", "refId": "PrometheusVariableQueryEditor-VariableQuery" }, "refresh": 1, "regex": "", "type": "query" } ] }, "time": { "from": "now-1h", "to": "now" }, "timepicker": {}, "timezone": "UTC", "title": "Sample Kestra Dashboard", "uid": "sample_dashboard_uid", "version": 1 } ``` ::: ## Kestra endpoints Kestra exposes internal endpoints on the management port (8081 by default) to provide status corresponding to the [server type](../../08.architecture/02.server-components/index.md): * `/worker`: will expose all currently running tasks on this worker. * `/scheduler`: will expose all currently scheduled flows on this scheduler with the next date. * `/kafkastreams`: will expose all [Kafka Streams](https://kafka.apache.org/documentation/streams/) states and aggregated store lag. * `/kafkastreams/{clientId}/lag`: will expose details lag for a `clientId`. * `/kafkastreams/{clientId}/metrics`: will expose details metrics for a `clientId`. ## Other Micronaut default endpoints Since Kestra is based on [Micronaut](https://micronaut.io), the [default Micronaut endpoints](https://docs.micronaut.io/latest/guide/index.html#providedEndpoints) are enabled by default on port 8081: * `/info` [Info Endpoint](https://docs.micronaut.io/snapshot/guide/index.html#infoEndpoint) with git status information. * `/health` [Health Endpoint](https://docs.micronaut.io/snapshot/guide/index.html#healthEndpoint) usable as an external heathcheck for the application. * `/loggers` [Loggers Endpoint](https://docs.micronaut.io/snapshot/guide/index.html#loggersEndpoint) allows changing logger level at runtime. * `/metrics` [Metrics Endpoint](https://docs.micronaut.io/snapshot/guide/index.html#metricsEndpoint) metrics in JSON format. * `/env` [Environment Endpoint](https://docs.micronaut.io/snapshot/guide/index.html#environmentEndpoint) to debug configuration files. 
You can disable some endpoints following the above Micronaut configuration.

## Debugging techniques

Here are several debugging techniques administrators can use to investigate issues.

### Enable verbose logs

Kestra has some [management endpoints](../03.monitoring/index.md#other-micronaut-default-endpoints), including one that allows changing logging verbosity at runtime. Inside the container (or locally if the standalone jar is used), send this command to enable very verbose logging:

```shell
curl -i -X POST -H "Content-Type: application/json" \
  -d '{ "configuredLevel": "TRACE" }' \
  http://localhost:8081/loggers/io.kestra
```

Alternatively, you can change logging levels in the configuration files:

```yaml
logger:
  levels:
    io.kestra.core.runners: TRACE
```

### Capture a Java dump

Kestra runs on a JRE rather than a JDK, so JVM monitoring tools are not included. Install [Jattach](https://github.com/jattach/jattach#installation) first:

:::alert{type="info"}
Jattach is included in the Kestra image, so there is no need to install it separately. If you're running an older version, follow the steps below.
:::

```shell
curl -L -o jattach https://github.com/jattach/jattach/releases/download/v2.2/jattach
chmod +x jattach
```

- First, find the PID of the Kestra process; it's usually `1` in Docker installations.
- Get JVM information with `jattach <pid> jcmd VM.info > vminfo`.
- Get a heap histogram with `jattach <pid> inspectheap > inspectheap`.
- Get a heap dump with `jattach <pid> dumpheap dumpheap.hprof` (the JVM writes the dump to the given file).
- Get a thread dump with `jattach <pid> threaddump > threaddump`.

Alternatively, you can request a thread dump via the `/threaddump` endpoint available on the management port (8081 if not configured otherwise).

---

# OpenTelemetry for Kestra: Traces, Metrics, and Logs
URL: https://kestra.io/docs/administrator-guide/open-telemetry

> Implement observability in Kestra with OpenTelemetry to export traces, metrics, and logs to your preferred monitoring tools.
**Observability** refers to understanding a system's internal state by analyzing its outputs. In software, this means examining telemetry data — such as traces, metrics, and logs — to gain insights into system behavior.

**OpenTelemetry** is a vendor-neutral, tool-agnostic framework and toolkit for creating and managing telemetry data. It helps implement observability in software applications. OpenTelemetry defines three different kinds of telemetry data:

- **Traces** provide a high-level view of what happens when a request is made to an application. A trace can contain multiple [spans](https://opentelemetry.io/docs/concepts/signals/traces/#spans).
- **Metrics** are measurements of a service captured at runtime.
- **Logs** are timestamped text records, either structured (recommended) or unstructured, with optional metadata.

Kestra supports all three kinds of telemetry data via OpenTelemetry-compatible exporters. For more details, see the [OpenTelemetry official documentation](https://opentelemetry.io/docs/).

## Traces

:::alert{type="info"}
Exporting trace data in Kestra is currently a Beta feature.
:::

The first step is to enable distributed traces inside the [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) file:

```yaml
micronaut:
  otel:
    enabled: true
kestra:
  traces:
    root: DEFAULT # Enable traces inside Kestra flow executions
otel:
  traces:
    exporter: otlp # Only otlp is supported for now
  exporter:
    otlp:
      endpoint: http://localhost:4317 # Replace with the address of your own collector
```

When enabled, Kestra instruments:

- All calls to its API
- All flow executions (one span per task execution, plus one span for each execution message processed by the Executor)
- External HTTP calls made by the HTTP tasks (including tasks that use the Kestra HTTP client)

### Trace correlation

Kestra propagates the trace context so that traces are correlated:

- The API call trace correlates with the execution it creates.
- Flow execution traces correlate with parent flows when the `Subflow` or `ForEachItem` task is used.
- External HTTP calls include the standard propagation header for downstream correlation.

### Example: Jaeger with Docker Compose

Enable [Jaeger](https://www.jaegertracing.io), an OpenTelemetry-compatible tracing platform, with Kestra in a Docker Compose configuration file:

```yaml
services:
  # Postgres is included here as a dependency for Kestra during local testing
  postgres:
    image: postgres:14.13
    environment:
      POSTGRES_DB: kestra_unit
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: k3str4
    ports:
      - 5432:5432
    restart: on-failure

  jaeger-all-in-one:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686" # Jaeger UI
      - "14268:14268" # OpenTracing (optional)
      - "4317:4317"   # OTLP gRPC receiver
      - "4318:4318"   # OTLP HTTP receiver
      - "14250:14250" # External otel-collector (optional)
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    restart: on-failure
```

The following screenshot shows three correlated traces:

- One created from the API call that creates the execution
- One created from an execution of a flow named `opentelemetry_parent`, which has spans for tasks including a `Subflow`
- One created from the `opentelemetry_basic` flow execution

![Example of correlated traces in Jaeger](./opentelemetry_traces.png)

### Disabling traces

You can disable traces for flows while keeping API traces:

```yaml
kestra:
  traces:
    root: DISABLED
```

You can also disable traces per component (experimental).
For example, disabling only Executor spans:

```yaml
kestra:
  traces:
    root: DEFAULT
    categories:
      io.kestra.core.runners.Executor: DISABLED
```

#### Supported categories

| Category                                 | Description                                   |
|------------------------------------------|-----------------------------------------------|
| `io.kestra.core.runners.Executor`        | Spans for each message in the execution queue |
| `io.kestra.core.runners.Worker`          | Spans for each runnable task execution        |
| `io.kestra.plugin.core.flow.Subflow`     | Spans for each `Subflow` task execution       |
| `io.kestra.plugin.core.flow.ForEachItem` | Spans for each `ForEachItem` task execution   |

## Metrics

To send metrics to an OpenTelemetry-compatible collector, add the following parameters to your [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) file:

```yaml
micronaut:
  metrics:
    export:
      otlp:
        enabled: true
        url: http://localhost:4318/v1/metrics # Replace with your collector URL
```

For example, you can configure an OpenTelemetry Collector to forward metrics to Prometheus:

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: "0.0.0.0:9464"

# The Collector only runs components that are wired into a pipeline:
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```

## Logs

To send logs to an OpenTelemetry-compatible collector, use the [LogShipper](../../07.enterprise/02.governance/logshipper/index.md) with the built-in OpenTelemetry log exporter.

:::alert{type="warning"}
LogShipper is only available in the Kestra **Enterprise Edition**.
:::

The following flow sends logs from all flows to a collector daily:

```yaml
id: log_shipper
namespace: company.team

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"

tasks:
  - id: log_export
    type: io.kestra.plugin.ee.core.log.LogShipper
    logExporters:
      - id: OTLPLogExporter
        type: io.kestra.plugin.ee.opentelemetry.LogExporter
        otlpEndpoint: http://localhost:4318/v1/logs # Replace with your collector URL
```

---

# Prometheus Metrics for Kestra: Reference and /prometheus Endpoint
URL: https://kestra.io/docs/administrator-guide/prometheus-metrics

> Explore the available Prometheus metrics in Kestra to monitor the performance and health of your orchestration cluster.

This page provides an overview of all available [Prometheus](https://prometheus.io/) metrics in Kestra. Prometheus metrics are enabled by default in Kestra, in contrast to [OpenTelemetry](../open-telemetry/index.md), which must be explicitly enabled in the configuration file. Metrics include custom metrics defined within the application and framework-provided metrics collected via [Micrometer](https://micrometer.io/).

Each Prometheus metric is described with its purpose and the type of data it represents. You can access these metrics via the `http://localhost:8081/prometheus` endpoint in Kestra. Example output from the Prometheus endpoint:

```plaintext
# HELP executor_active_threads The approximate number of threads that are actively executing tasks
# TYPE executor_active_threads gauge
executor_active_threads 4
```

:::alert{type="info"}
For deeper details on Micrometer metrics integration, see the [Micronaut Micrometer documentation](https://micronaut-projects.github.io/micronaut-micrometer/latest/guide/).
:::

## Kestra

### Kestra Executor Metrics

Executor server exclusive:

* `kestra_executor_execution_delay_created_count_total` (counter): The total number of execution delays created by the Executor.
* `kestra_executor_execution_duration_seconds` (summary): Execution duration inside the Executor.
* `kestra_executor_execution_duration_seconds_max` (gauge): Maximum observed execution duration inside the Executor.
* `kestra_executor_execution_end_count_total` (counter): The total number of executions ended by the Executor.
* `kestra_executor_execution_message_process_seconds` (summary): Duration of a single execution message processed by the Executor.
* `kestra_executor_execution_message_process_seconds_max` (gauge): Maximum observed duration of a single execution message processed by the Executor.
* `kestra_executor_execution_started_count_total` (counter): The total number of executions started by the Executor.
* `kestra_executor_flowable_execution_count_total` (counter): The total number of flowable tasks executed by the Executor.
* `kestra_executor_taskrun_created_count_total` (counter): The total number of tasks created by the Executor.
* `kestra_executor_taskrun_ended_count_total` (counter): The total number of tasks ended by the Executor.
* `kestra_executor_taskrun_ended_duration_seconds` (summary): Task duration inside the Executor.
* `kestra_executor_taskrun_ended_duration_seconds_max` (gauge): Maximum observed task duration inside the Executor.
* `kestra_executor_thread_count` (gauge): The number of executor threads.
* `kestra_executor_worker_job_resubmit_count_total` (counter): The total number of worker jobs resubmitted to the Worker by the Executor.

### Kestra Indexer Metrics

Indexer server exclusive:

* `kestra_indexer_message_in_count_total` (counter): Total number of records received by the Indexer.
* `kestra_indexer_message_out_count_total` (counter): Total number of records indexed by the Indexer.
* `kestra_indexer_request_count_total` (counter): Total number of batches of records received by the Indexer.
* `kestra_indexer_request_duration_seconds` (summary): Batch of records duration inside the Indexer.
* `kestra_indexer_request_duration_seconds_max` (gauge): Maximum observed batch of records duration inside the Indexer.

### Kestra Scheduler Metrics

Scheduler server exclusive:

* `kestra_scheduler_evaluate_count_total` (counter): Total number of triggers evaluated by the Scheduler.
* `kestra_scheduler_evaluation_loop_duration_seconds` (summary): Trigger evaluation loop duration inside the Scheduler.
* `kestra_scheduler_evaluation_loop_duration_seconds_max` (gauge): Maximum observed trigger evaluation loop duration inside the Scheduler.
* `kestra_scheduler_loop_count_total` (counter): Total number of evaluation loops executed by the Scheduler.

### Kestra Worker Metrics

Worker server exclusive:

* `kestra_worker_ended_count_total` (counter): The total number of tasks ended by the Worker.
* `kestra_worker_ended_duration_seconds` (summary): Task run duration inside the Worker.
* `kestra_worker_ended_duration_seconds_max` (gauge): Maximum observed task run duration inside the Worker.
* `kestra_worker_job_pending` (gauge): The number of jobs (tasks or triggers) pending to be run by the Worker.
* `kestra_worker_job_running` (gauge): The number of jobs (tasks or triggers) currently running inside the Worker.
* `kestra_worker_job_thread` (gauge): The number of worker threads.
* `kestra_worker_queued_duration_seconds` (summary): Task queued duration inside the Worker.
* `kestra_worker_queued_duration_seconds_max` (gauge): Maximum observed task queued duration inside the Worker.
* `kestra_worker_running_count` (gauge): The number of tasks currently running inside the Worker.
* `kestra_worker_started_count_total` (counter): The total number of tasks started by the Worker.

### Kestra JDBC Metrics

Various Kestra-specific database queries:

* `kestra_jdbc_query_duration_seconds` (summary): Duration of database queries.
* `kestra_jdbc_query_duration_seconds_max` (gauge): Maximum observed query duration.
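Prometheus summaries such as the JDBC metric above expose `_sum` and `_count` series, so the average database query latency over the last five minutes can be derived in PromQL (a sketch; adjust the range window to your scrape interval):

```plaintext
rate(kestra_jdbc_query_duration_seconds_sum[5m])
  / rate(kestra_jdbc_query_duration_seconds_count[5m])
```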
### Kestra Queue Metrics

For each internal queue:

* `kestra_queue_big_message_count_total` (counter): Big messages in the queue.
* `kestra_queue_message_lag_count` (gauge): Total number of messages in the queue that are not yet consumed.
* `kestra_queue_poll_size` (gauge): Size of a poll to the queue (message batch size).
* `kestra_queue_produce_count_total` (counter): Total number of produced messages.
* `kestra_queue_receive_duration_seconds` (summary): Queue duration to receive and consume a batch of messages.
* `kestra_queue_receive_duration_seconds_max` (gauge): Maximum observed queue duration to receive and consume a batch of messages.

## Cache metrics

Micronaut web server caching overview:

* `cache_size` (gauge): Current number of entries in the cache. Approximate depending on cache type.

## HikariCP Connection Pool Metrics

Database connection pool status:

* `hikaricp_connections` (gauge): Total number of connections in the pool.
* `hikaricp_connections_acquire_seconds` (summary): Time taken to acquire connections.
* `hikaricp_connections_acquire_seconds_max` (gauge): Maximum time observed for acquiring a connection.
* `hikaricp_connections_active` (gauge): Number of currently active connections.
* `hikaricp_connections_creation_seconds` (summary): Time taken to create new connections.
* `hikaricp_connections_creation_seconds_max` (gauge): Maximum observed connection creation time.
* `hikaricp_connections_idle` (gauge): Number of idle connections.
* `hikaricp_connections_max` (gauge): Maximum connections allowed in the pool.
* `hikaricp_connections_min` (gauge): Minimum connections maintained in the pool.
* `hikaricp_connections_pending` (gauge): Threads waiting to acquire a connection.
* `hikaricp_connections_timeout_total` (counter): Total count of connection timeouts.
* `hikaricp_connections_usage_seconds` (summary): Time connections are in use.
* `hikaricp_connections_usage_seconds_max` (gauge): Maximum observed connection usage time.
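All of the metrics on this page are served from the management port, so scraping them only requires pointing Prometheus at the `/prometheus` path. A minimal sketch (the `kestra:8081` target and 15-second interval are assumptions to adapt to your environment):

```yaml
scrape_configs:
  - job_name: kestra
    metrics_path: /prometheus
    scrape_interval: 15s
    static_configs:
      - targets: ["kestra:8081"] # Kestra management port
```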
## HTTP Client Metrics

Outbound HTTP requests:

* `http_client_requests_seconds` (summary): Duration of HTTP client requests.
* `http_client_requests_seconds_max` (gauge): Maximum observed client request duration.

## HTTP Server Metrics

Inbound HTTP requests to Micronaut web server endpoints:

* `http_server_requests_seconds` (summary): Duration of HTTP server requests.
* `http_server_requests_seconds_max` (gauge): Maximum observed server request duration.

## JVM

### Java executor pool metrics

Various asynchronous task executors:

* `executor_active_threads` (gauge): The approximate number of threads that are actively executing tasks.
* `executor_completed_tasks_total` (counter): The approximate total number of tasks that have completed execution.
* `executor_idle_seconds` (summary): Time threads have spent idle in the executor pool.
* `executor_idle_seconds_max` (gauge): Maximum idle time observed for a thread.
* `executor_pool_core_threads` (gauge): The core number of threads for the pool.
* `executor_pool_max_threads` (gauge): The maximum allowed number of threads in the pool.
* `executor_pool_size_threads` (gauge): The current number of threads in the pool.
* `executor_queue_remaining_tasks` (gauge): The number of additional elements that this queue can ideally accept without blocking.
* `executor_queued_tasks` (gauge): The approximate number of tasks that are queued for execution.
* `executor_seconds` (summary): Time tasks have spent executing.
* `executor_seconds_max` (gauge): Maximum execution time observed for a task.

### JVM Buffer Pool Metrics

Overview of Java buffer pool statistics:

* `jvm_buffer_count_buffers` (gauge): An estimate of the number of buffers in the pool.
* `jvm_buffer_memory_used_bytes` (gauge): An estimate of the memory that the Java virtual machine is using for this buffer pool.
* `jvm_buffer_total_capacity_bytes` (gauge): An estimate of the total capacity of the buffers in this pool.
### JVM Class Loading Metrics

Overview of Java class loading activity:

* `jvm_classes_loaded_classes` (gauge): The number of classes that are currently loaded in the Java virtual machine.
* `jvm_classes_unloaded_classes_total` (counter): The number of classes unloaded in the Java virtual machine.

### JVM Garbage Collection (GC) Metrics

Overview of runtime Java GC:

* `jvm_gc_concurrent_phase_time_seconds` (summary): Time spent in concurrent phase.
* `jvm_gc_concurrent_phase_time_seconds_max` (gauge): Maximum observed time spent in concurrent phase.
* `jvm_gc_live_data_size_bytes` (gauge): Size of long-lived heap memory pool after reclamation.
* `jvm_gc_max_data_size_bytes` (gauge): Max size of long-lived heap memory pool.
* `jvm_gc_memory_allocated_bytes_total` (counter): Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next.
* `jvm_gc_memory_promoted_bytes_total` (counter): Count of positive increases in the size of the old generation memory pool before GC to after GC.
* `jvm_gc_pause_seconds` (summary): Time spent in GC pause.
* `jvm_gc_pause_seconds_max` (gauge): Maximum observed time spent in GC pause.

### JVM Memory Metrics

Overview of various Java memory regions:

* `jvm_memory_committed_bytes` (gauge): The amount of memory in bytes that is committed for the Java virtual machine to use.
* `jvm_memory_max_bytes` (gauge): The maximum amount of memory in bytes that can be used for memory management.
* `jvm_memory_used_bytes` (gauge): The amount of used memory.

### JVM Thread Metrics

Java threading model:

* `jvm_threads_daemon_threads` (gauge): The current number of live daemon threads.
* `jvm_threads_live_threads` (gauge): The current number of live threads, including both daemon and non-daemon threads.
* `jvm_threads_peak_threads` (gauge): The peak live thread count since the Java virtual machine started or the peak was reset.
* `jvm_threads_started_threads_total` (counter): The total number of application threads started in the JVM.
* `jvm_threads_states_threads` (gauge): The current number of threads in each thread state.

## Logback metrics

Logger emitted events by log level:

* `logback_events_total` (counter): Log events enabled by the effective log level.

## Runtime metrics

### Process Metrics

Kestra from an OS process point of view:

* `process_cpu_time_ns_total` (counter): The "cpu time" used by the Java Virtual Machine process.
* `process_cpu_usage` (gauge): The "recent cpu usage" for the Java Virtual Machine process.
* `process_files_max_files` (gauge): The maximum file descriptor count.
* `process_files_open_files` (gauge): The open file descriptor count.
* `process_start_time_seconds` (gauge): Start time of the process since the Unix epoch.
* `process_uptime_seconds` (gauge): The uptime of the Java virtual machine.

### System Metrics

Runtime resources overview:

* `system_cpu_count` (gauge): The number of processors available to the Java virtual machine.
* `system_cpu_usage` (gauge): The "recent cpu usage" of the system the application is running in.
* `system_load_average_1m` (gauge): The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors, averaged over a period of time.

---

# Purge Executions, Logs, and Files in Kestra
URL: https://kestra.io/docs/administrator-guide/purge

> Reclaim storage by purging old executions, logs, KV entries, and files in Kestra. Configure scheduled purge jobs to keep your database lean in production.

Use purge tasks to remove old executions, logs, and key-value pairs, helping reduce storage usage.
To keep storage optimized, use [`io.kestra.plugin.core.execution.PurgeExecutions`](/plugins/core/execution/io.kestra.plugin.core.execution.purgeexecutions), [`io.kestra.plugin.core.log.PurgeLogs`](/plugins/core/log/io.kestra.plugin.core.log.purgelogs), and [`io.kestra.plugin.core.kv.PurgeKV`](/plugins/core/kv/io.kestra.plugin.core.kv.purgekv).

- `PurgeExecutions`: deletes execution records
- `PurgeLogs`: removes both `Execution` and `Trigger` logs in bulk
- `PurgeKV`: deletes expired keys, either globally or scoped to specific namespaces

Together, these replace the legacy `io.kestra.plugin.core.storage.Purge` task with a **faster and more reliable process (~10x faster)**.

:::alert{type="info"}
The [Enterprise Edition](../../07.enterprise/index.mdx) also includes [`PurgeAuditLogs`](../../07.enterprise/02.governance/06.audit-logs/index.md#how-to-purge-audit-logs).
:::

The flow below purges executions and logs:

```yaml
id: purge
namespace: company.myteam
description: |
  This flow will remove all executions and logs older than 1 month.
  We recommend running it daily to prevent storage issues.

tasks:
  - id: purge_executions
    type: io.kestra.plugin.core.execution.PurgeExecutions
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
    purgeLog: false

  - id: purge_logs
    type: io.kestra.plugin.core.log.PurgeLogs
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"
```

## Purge Key-value pairs

The example below purges expired Key-value pairs from the `company` Namespace. It's set up as a flow in the [`system`](../../06.concepts/system-flows/index.md) Namespace.

```yaml
id: purge_kv_store
namespace: system

tasks:
  - id: purge_kv
    type: io.kestra.plugin.core.kv.PurgeKV
    expiredOnly: true
    namespaces:
      - company
    includeChildNamespaces: true
```

:::alert{type="warning"}
Purge tasks permanently delete data. Always test in non-production environments first.
:::

## Auto-delete expired key-value pairs

Rather than creating a system flow to regularly purge Key-value pairs, you can add a global configuration to your Kestra Configuration/Application file that auto-deletes expired key-value pairs:

```yaml
kestra:
  kv:
    purge-expired:
      enabled: true        # default true
      initial-delay: PT5S  # default PT6H
      fixed-delay: PT5S    # default PT6H
      batch-size: 10       # default 1000
```

## Purge Namespace files

The example below purges old versions of Namespace files for a Namespace tree (parent + child Namespaces). Use a `filePattern` and specify the `behavior` (e.g., keep the last N versions and/or delete versions older than a given date):

```yaml
id: purge_namespace_files
namespace: system

tasks:
  - id: purge_files
    type: io.kestra.plugin.core.namespace.PurgeFiles
    namespaces:
      - company
    includeChildNamespaces: true
    filePattern: "**/*.sql"
    behavior:
      type: version
      before: "2025-01-01T00:00:00Z"
```

Refer to the [PurgeFiles documentation](/plugins/core/namespace/io.kestra.plugin.core.namespace.purgefiles) for more details.

## Purge assets and lineage (retention)

Use the `io.kestra.plugin.ee.assets.PurgeAssets` task to enforce asset retention without touching executions or logs. By default, this task purges assets, asset usage events (execution view), and asset lineage events (for asset exporters) matching the filters. You can configure it to only purge specific types of records.

**Filters:**

| Property | Description |
| --- | --- |
| `namespace` | Filter by namespace. Supports prefix matching (e.g., `company.data` matches `company.data.staging`). |
| `assetId` | Filter by a specific asset ID. |
| `assetType` | Filter by one or more asset types (e.g., `io.kestra.plugin.ee.assets.Table`). |
| `metadataQuery` | Filter by metadata key-value pairs. |
| `endDate` | **(required)** Purge records created or updated before this date (ISO 8601). |
**Purge scope:**

| Property | Default | Description |
| --- | --- | --- |
| `purgeAssets` | `true` | Whether to purge the asset records themselves. |
| `purgeAssetUsages` | `true` | Whether to purge asset usage events (execution view). |
| `purgeAssetLineages` | `true` | Whether to purge asset lineage events. |

**Outputs:** `purgedAssetsCount`, `purgedAssetUsagesCount`, `purgedAssetLineagesCount`.

Example: purge old VM assets on a monthly schedule.

```yaml
id: asset_retention_policy
namespace: company.infra

triggers:
  - id: monthly_cleanup
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 0 1 * *"

tasks:
  - id: purge_old_vms
    type: io.kestra.plugin.ee.assets.PurgeAssets
    assetType:
      - io.kestra.plugin.ee.assets.VM
    endDate: "{{ now() | dateAdd(-180, 'DAYS') }}"
```

## Purge tasks vs. UI deletion

Purge tasks perform **hard deletion**, permanently removing records and reclaiming storage. In contrast, deleting items in the UI is a **soft deletion** — the data is hidden but retained (e.g., revision history and past executions can reappear if a flow with the same ID is recreated). This distinction matters for compliance and troubleshooting: purge flows are best for cleaning up space, while UI deletions preserve history for auditability.

:::alert{type="warning"}
Purge tasks do not affect Kestra's [internal queues](../../08.architecture/01.main-components/index.md#queue). Queue retention is managed separately via the [Runtime and Storage configuration](../../configuration/02.runtime-and-storage/index.md) for JDBC or the [Enterprise and Advanced configuration](../../configuration/06.enterprise-and-advanced/index.md) for Kafka.
:::

:::collapse{title="Renamed Purge Tasks in 0.18.0"}
We've [improved](https://github.com/kestra-io/kestra/pull/4298) the mechanism of the **Purge tasks** to make them more performant and reliable — some tasks have been renamed to reflect their enhanced functionality.
Here are the main `Purge` plugin changes in Kestra 0.18.0:

- `io.kestra.plugin.core.storage.Purge` has been renamed to `io.kestra.plugin.core.execution.PurgeExecutions` to reflect that it only purges data related to executions (e.g., it doesn't include trigger logs; use the `PurgeLogs` task for those). An alias has been added so that using the old task type will still work, but it will emit a warning. Use the new task type going forward.
- `io.kestra.plugin.core.storage.PurgeExecution` has been renamed to `io.kestra.plugin.core.storage.PurgeCurrentExecutionFiles` to reflect that it purges all data from the current execution, including inputs and outputs. An alias has been added for backward compatibility, but update your flows to use the new task type.
:::

---

# Software and Hardware Requirements to Run Kestra
URL: https://kestra.io/docs/administrator-guide/requirements

> Check Kestra system requirements. Verify software prerequisites (Java, DB) and hardware recommendations for running Kestra effectively.

This page outlines the software and hardware requirements for running Kestra.

## Software requirements

The tables below list the software requirements for Kestra.

### Java Runtime

| Kestra Edition | Required version | Note |
|--------------------------|----------------------------------|------|
| Open Source / Enterprise | Runtime JDK 25; source/target 21 | Default: Java 25 (Eclipse Temurin); compiled with `--release 21` |

### Queue and Repository

Kestra Open Source supports PostgreSQL or MySQL for the queue and repository components.
Kestra Enterprise Edition (EE) provides two options:

- Use the same JDBC configuration as Open Source for standard deployments
- Use Kafka with Elasticsearch or OpenSearch for large-scale deployments

| Kestra Edition   | Database           | Required version             | Note              |
|------------------|--------------------|------------------------------|-------------------|
| OSS / Enterprise | **PostgreSQL**     | >=14                         | Default: `latest` |
| OSS / Enterprise | **MySQL**          | >=8 (except version 8.0.31)  | Default: 8.3.2    |
| Enterprise       | **Apache Kafka**   | >=3                          |                   |
| Enterprise       | **Elasticsearch**  | >=7                          |                   |
| Enterprise       | **OpenSearch**     | >=2                          |                   |

:::alert{type="warning"}
MySQL deployments must have the **time zone tables loaded**. If the time zone data is missing, the scheduler can misfire or skip runs. Follow the MySQL guide to install time zone information to avoid deployment issues: [Time Zone Support → Load the Time Zone Tables](https://dev.mysql.com/doc/refman/8.0/en/time-zone-support.html#time-zone-installation).
:::

### Internal Storage

| Kestra Edition   | Storage Provider   | Required version | Note |
|------------------|--------------------|------------------|------|
| OSS / Enterprise | MinIO              | >=8              |      |
| OSS / Enterprise | Google Cloud GCS   | N/A              |      |
| OSS / Enterprise | AWS S3             | N/A              |      |
| OSS / Enterprise | Azure Blob Storage | N/A              |      |

## Hardware requirements

A Kestra standalone server requires at least 4 GiB of memory and 2 vCPUs. To use script tasks, the server must support Docker-in-Docker (this is why, for example, AWS ECS Fargate is not supported).

For guidance on allocating memory and CPU for different architecture components, [contact us](/demo). We can help size your deployment based on your expected workload.
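As a sketch of enforcing those minimums for a standalone container (Docker Compose syntax; the image tag, limit values, and socket mount are assumptions to adapt to your deployment):

```yaml
services:
  kestra:
    image: kestra/kestra:latest
    command: server standalone
    cpus: 2        # at least 2 vCPUs
    mem_limit: 4g  # at least 4 GiB of memory
    volumes:
      # Script tasks need access to a Docker daemon
      - /var/run/docker.sock:/var/run/docker.sock
```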
---

# Security Hardening for Kestra: Network and Process Isolation
URL: https://kestra.io/docs/administrator-guide/security-hardening

> Best practices for hardening Kestra security, including network isolation, host-level controls, and plugin validation.

Security hardening options for Kestra. By design, Kestra allows arbitrary HTTP calls and script execution. To prevent misuse of link-local metadata services (IMDS), isolate and block access at the network layer:

- **Network ACLs or security groups** — configure your VPC or firewall to deny all requests to link-local ranges (e.g., `169.254.169.254/32`).
- **Dedicated orchestration subnet** — place Kestra workers in a private subnet with no route to management or metadata services.
- **Egress proxy or NAT gateway filtering** — route all outbound traffic through a proxy or gateway that can enforce allow-lists and block link-local IPs.

## Host-level isolation

Running workflows in isolated environments reduces the impact of potentially malicious flows:

- **Container sandboxes** — launch each flow execution in its own container (for example, a Docker container or Kubernetes Pod) with minimal privileges.
- **Ephemeral compute** — use Kestra's native [Task Runners](../../07.enterprise/04.scalability/task-runners/index.md) to auto-scale ephemeral compute nodes, which are destroyed after each run to ensure no residual state.
- **Minimum host permissions** — grant only the OS-level rights required for the runtime; avoid mounting cloud credential files or granting host-level IAM roles directly.

## Plugin and code validation

To prevent the execution of malicious code, you can implement several strategies:

- **Plugin configuration** — use Kestra's plugin architecture, including [Plugin Versioning](../../07.enterprise/05.instance/versioned-plugins/index.md), to control which plugins are allowed and [which should be prohibited](../../07.enterprise/02.governance/worker-isolation/index.md).
- **CI/CD validation** — implement a custom [Flow Validation step in your CI/CD pipeline](../../version-control-cicd/cicd/index.md) to scan task definitions for disallowed patterns (e.g., `169.254.169.254`) and block merging if detected. - **Java Security (EE-only)** — Enterprise Edition users can define security policies to restrict access to untrusted files, plugins, or network resources. ## Documentation and audit - **User guidance** — update onboarding materials and runbooks to highlight metadata-blocking best practices when deploying a new Kestra environment. - **Periodic review** — include network and host configuration checks in your security audit cycle to verify link-local ranges remain blocked. --- # Server Heartbeats and Job Recovery in Kestra URL: https://kestra.io/docs/administrator-guide/server-lifecycle > Understand Kestra's server liveness mechanism, heartbeats, and how it handles component failures and recovery. Kestra is separated into several components that can be deployed independently or inside a single process (a standalone deployment). These components are called **server components** or just **servers**. See the [server components](../../08.architecture/02.server-components/index.md) and [deployment](../../08.architecture/03.deployment-architecture/index.md) sections for more information. Kestra has a built-in liveness mechanism. Each server sends a periodic heartbeat stored inside the database, and other servers check whether the server is still alive. When a server is not alive, Kestra runs maintenance routines such as [worker job resubmission](#worker-job-resubmission). ## The liveness mechanism When a server starts, it sends a heartbeat to the database with a `RUNNING` status. When it stops, it first transitions to `TERMINATING`. If the server has pending tasks, it waits up to the configured `kestra.server.terminationGracePeriod` for them to finish. 
If it completes within that window, it sends a `TERMINATED_GRACEFULLY` heartbeat; otherwise the process is terminated with status `TERMINATED_FORCED`. Other servers detect missing heartbeats and run maintenance tasks: - Workers resubmit pending jobs to another worker before transitioning to `NOT_RUNNING`. - Other server types transition to `NOT_RUNNING` immediately. By default, liveness checks run every 10 seconds. `NOT_RUNNING` servers transition to `INACTIVE` at the next liveness check. If a server does not send a heartbeat within `kestra.server.liveness.timeout`, it is marked `DISCONNECTED` and then `NOT_RUNNING` at the next check. If that server is still alive, it self-terminates after detecting that other components classified it as `NOT_RUNNING`, preventing “resurrection.” For configuration details, see the [Runtime and Storage configuration](../../configuration/02.runtime-and-storage/index.md). ## Worker job resubmission The Worker has a special behavior that allows it to resubmit jobs that were not completed due to a server termination or other failures. **Worker jobs** are tasks or triggers currently executing on a worker. When the Executor sends a task to a worker, it creates an entry in the worker job store. Workers remove the entry once they complete the task. The same logic applies to triggers evaluated by the Scheduler. The liveness mechanism resubmits pending jobs from a terminating worker to another worker, ensuring each job runs at least once. Configure this behavior via `kestra.server.workerTaskRestartStrategy`: - `AFTER_TERMINATION_GRACE_PERIOD` (default): wait another grace period before resubmitting jobs, preventing a terminated worker from returning. - `IMMEDIATELY`: resubmit jobs right away. - `NEVER`: never resubmit jobs (tasks remain incomplete and flows stay `RUNNING`). :::alert{type="info"} This resubmission mechanism also applies to **Realtime Triggers**.
If the worker running a Realtime Trigger listener is stopped gracefully, Kestra waits for the `terminationGracePeriod` before reassigning the trigger to another worker. See [Worker failover for Realtime Triggers](../../05.workflow-components/07.triggers/05.realtime-trigger/index.md#worker-failover-for-realtime-triggers) for more details. ::: Resubmitted task runs show multiple attempts in the UI. ![resubmitted task run](./taskrun-resubmitted-attempts.png) In the timeline, one of the states will be `RESUBMITTED`. ![resubmitted task run states](./taskrun-resubmitted-states.png) ## Instance view (EE only) Kestra Enterprise exposes an instance dashboard (**Administration → Instance**) that summarizes heartbeats, liveness status, and maintenance activity across clusters. See the [instance dashboard documentation](../../07.enterprise/05.instance/index.mdx) for a walkthrough. --- # SSL/TLS Configuration: Enable HTTPS for Kestra URL: https://kestra.io/docs/administrator-guide/ssl-configuration > Configure SSL/TLS encryption for Kestra to secure the UI and API access using self-signed or CA-signed certificates. Configure secure access to the Kestra UI via HTTPS. This page explains how to configure secure access via HTTPS to the Kestra UI. ## Why use SSL/TLS encryption In short, adding TLS encryption to your environment provides the following benefits: - Data is encrypted in transit, preventing sensitive data from being intercepted in "man-in-the-middle" attacks. - TLS adds a layer of trust by ensuring users know the URL they access is genuine (e.g., `https://mycompany.kestra.com/ui` is verified as an internal site). For further details, Cloudflare has a good write-up on [why you should use https](https://www.cloudflare.com/en-gb/learning/ssl/why-use-https/). ## Creating self-signed certificates To get started in lower environments, you can create self-signed certificates using the OpenSSL library.
Full details on the steps and how to examine the certificates and keys in more detail can be found in this [Micronaut article](https://guides.micronaut.io/latest/micronaut-security-x509-maven-groovy.html). :::alert{type="info"} While self-signed certificates encrypt traffic, they are considered unsuitable for production usage. They are deemed untrustworthy, as they do not come from a trusted Certificate Authority (CA) such as [Let's Encrypt](https://letsencrypt.org/). Follow your organization's best practices when choosing a CA provider. ::: ```bash ## Create a folder which will be later mounted to the kestra container mkdir -p /app/ssl cd /app/ssl ``` ```bash ## Create CA in PEM format along with private key openssl req -x509 -sha256 -days 365 -newkey rsa:4096 \ -keyout cacert.key -out cacert.pem \ -subj '/CN=example.kestra.com/C=IE/O=kestra' \ -passout pass:changeit ## Create certificate signing request openssl req -newkey rsa:4096 \ -keyout server.key -out server.csr \ -subj '/CN=example.kestra.com/C=IE/O=kestra' \ -passout pass:changeit ## Create the server configuration which will be used to sign the certificate cat <<< 'authorityKeyIdentifier=keyid,issuer basicConstraints=CA:FALSE subjectAltName = @alt_names [alt_names] DNS.1 = localhost' > server.conf ## sign certificate openssl x509 -req -CA cacert.pem -CAkey cacert.key \ -in server.csr -out server.pem -days 365 \ -CAcreateserial -extfile server.conf \ -passin pass:changeit ## Create server.p12 openssl pkcs12 -export -out server.p12 -name "localhost" \ -inkey server.key -in server.pem \ -passin pass:changeit \ -passout pass:changeit ## Create keystore.p12 with JDK keytool keytool -importkeystore -srckeystore server.p12 \ -srcstoretype pkcs12 -destkeystore keystore.p12 \ -deststoretype pkcs12 \ -deststorepass changeit -srcstorepass changeit ## Create truststore.jks keytool -import -trustcacerts -noprompt -alias ca \ -ext san=dns:localhost,ip:127.0.0.1 \ -file cacert.pem -keystore truststore.jks \ -storepass 
changeit -keypass changeit ``` ## Sample Kestra configuration with SSL enabled Enable HTTPS through the `micronaut` configuration settings. These are set at the root level within the [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md). :::alert{type="info"} Ensure that you expose the secure port of the connection if different from the default port. ::: ```yaml kestra: image: registry.kestra.io/docker/kestra:latest pull_policy: always user: "root" command: server standalone --worker-thread=128 volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - tmp-kestra:/tmp/kestra-wd - /app/ssl:/app/ssl ports: - "8443:8443" environment: KESTRA_CONFIGURATION: | micronaut: security: x509: enabled: false ssl: enabled: true server: ssl: port: 8443 enabled: true clientAuthentication: want keyStore: path: file:/app/ssl/server.p12 password: changeit type: PKCS12 trustStore: path: file:/app/ssl/truststore.jks password: changeit type: JKS datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 kestra: server: basic-auth: enabled: false username: "admin@kestra.io" # it must be a valid email address password: kestra repository: type: postgres storage: type: local local: base-path: "/app/storage" queue: type: postgres tasks: tmp-dir: path: /tmp/kestra-wd/tmp ports: - "8443:8443" ``` ## Outbound SSL configuration If Kestra tasks make outbound calls to other services, secure the process by configuring SSL for outbound traffic. 
You can accomplish this in your [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) file by passing the following JVM options in the `JAVA_OPTS` environment variable: ```yaml JAVA_OPTS: "-Djavax.net.ssl.trustStore=/app/ssl/truststore.jks -Djavax.net.ssl.trustStorePassword=changeit" ``` Below is an example configuration file with the newly added environment variable: ```yaml kestra: image: registry.kestra.io/docker/kestra:latest pull_policy: always user: "root" command: server standalone --worker-thread=128 volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - tmp-kestra:/tmp/kestra-wd - /app/ssl:/app/ssl ports: - "8443:8443" environment: JAVA_OPTS: "-Djavax.net.ssl.trustStore=/app/ssl/truststore.jks -Djavax.net.ssl.trustStorePassword=changeit" # Add in the JVM options as an environment variable KESTRA_CONFIGURATION: | micronaut: security: x509: enabled: false ssl: enabled: true server: ssl: port: 8443 enabled: true clientAuthentication: want keyStore: path: file:/app/ssl/server.p12 password: changeit type: PKCS12 trustStore: path: file:/app/ssl/truststore.jks password: changeit type: JKS datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 kestra: server: basic-auth: enabled: false username: "admin@kestra.io" # it must be a valid email address password: kestra repository: type: postgres storage: type: local local: base-path: "/app/storage" queue: type: postgres tasks: tmp-dir: path: /tmp/kestra-wd/tmp ports: - "8443:8443" ``` ## Enabling CSRF protection Cross-site request forgery (CSRF) is an attack where a malicious website or email tricks a user's browser into performing unwanted actions on a trusted site while authenticated. To enable CSRF protection, you must ensure that your instance has TLS/SSL enabled. 
Once this is configured, add the following to your configuration file: ```yaml micronaut: security: csrf: enabled: true ``` This setting enables CSRF protection on all endpoints that reach `/api/.*`. ## Configuring SSL with Kubernetes For Kubernetes deployments, you can enable HTTPS either by configuring TLS at the Ingress level or by using self-signed certificates at the application level. ### Using ingress with TLS termination (recommended for production) Most cloud providers expect TLS termination at the ingress controller. Here's how to configure HTTPS using Let's Encrypt certificates: 1. **Install cert-manager** (automates certificate management — to select a different version, check the [available releases on GitHub](https://github.com/cert-manager/cert-manager/releases)): ```bash kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.1/cert-manager.yaml ``` 2. **Create a Let's Encrypt issuer** (replace `your-email@example.com`): ```yaml apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: name: letsencrypt-prod spec: acme: server: https://acme-v02.api.letsencrypt.org/directory email: your-email@example.com privateKeySecretRef: name: letsencrypt-prod solvers: - http01: ingress: class: nginx # Update for your ingress controller ``` 3. **Configure Ingress with TLS** (Azure AKS example): ```yaml apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: kestra-ingress annotations: cert-manager.io/cluster-issuer: letsencrypt-prod nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" spec: tls: - hosts: - kestra.yourdomain.com secretName: kestra-tls rules: - host: kestra.yourdomain.com http: paths: - path: / pathType: Prefix backend: service: name: kestra-service port: number: 80 ``` ### Using self-signed certificates (for testing) 1. **Generate certificates** using the OpenSSL commands from the previous section. 2. 
**Create TLS secret**: ```bash kubectl create secret tls kestra-tls \ --cert=server.pem \ --key=server.key ``` 3. **Reference the secret in your Ingress**: ```yaml spec: tls: - hosts: - kestra.yourdomain.com secretName: kestra-tls ``` ### Application-level SSL configuration For environments where ingress TLS termination isn't available: 1. **Create secret with SSL files**: ```bash kubectl create secret generic kestra-ssl \ --from-file=keystore.p12 \ --from-file=truststore.jks ``` 2. **Configure Kestra deployment**: ```yaml env: - name: KESTRA_CONFIGURATION value: | micronaut: server: ssl: enabled: true port: 8443 keyStore: path: file:/app/ssl/keystore.p12 password: changeit type: PKCS12 volumeMounts: - name: ssl-secret mountPath: "/app/ssl" volumes: - name: ssl-secret secret: secretName: kestra-ssl ``` 3. **Expose HTTPS port** in your service: ```yaml ports: - name: https port: 8443 targetPort: 8443 ``` :::alert{type="warning"} Production deployments on cloud platforms such as Azure AKS typically require valid certificates from trusted CAs for SSO integration. Self-signed certificates may work for testing but aren't suitable for production use. ::: ### Verifying the configuration Check certificate validity with: ```bash kubectl get certificate kestra-tls -w ``` Expected output: ```plaintext NAME READY SECRET AGE kestra-tls True kestra-tls 5m ``` --- # Troubleshoot Kestra: Kubernetes, Docker, and Startup Issues URL: https://kestra.io/docs/administrator-guide/troubleshooting > Solutions for common Kestra issues, including pod restarts, unprocessable executions, and Docker-in-Docker problems. Common issues and fixes for Kestra deployments. ## CrashLoopBackoff when restarting all pods **Question:** "When I restart all Kubernetes pods at once, they get stuck in a `CrashLoopBackoff` for a number of minutes before eventually resolving — why does it happen?" 
This is likely caused by Java startup behavior, which can consume a lot of resources and cause liveness probes to fail. Since Java loads many classes at startup, pods may restart multiple times before stabilizing. Setting the CPU limit to 2 times the request can improve startup time and resolve failing health checks. ## Unprocessable execution Sometimes, executions cannot be processed. In such cases, you can instruct Kestra to skip them. Start the executor server (or the standalone server if not using a multi-component deployment) with a list of execution identifiers to skip: ```sh kestra server executor --skip-executions 6FSPERUe1JwbYmMmdwRlgV,5iLGjTLOHAVGUGlsesFaMb ``` You can also skip executions at broader levels: 1. **Flows** — Skip all executions of one or more flows: ```sh kestra server executor "--skip-flows=tenant|namespace|flowA,tenant|namespace|flowB" ``` Example: ```sh kestra server executor "--skip-flows=companyA|production-data|daily-data-sync" ``` :::alert{type="info"} Replace `tenant` and `namespace` with the correct values for the flow. ::: 2. **Namespaces** — Skip all executions within specific namespaces: ```sh kestra server executor "--skip-namespaces=tenant|myNamespaceA,tenant|myNamespaceB" ``` Example: ```sh kestra server executor "--skip-namespaces=companyA|production-data" ``` :::alert{type="info"} Replace `tenant` with the correct values for the namespace. ::: 3. 
**Tenants** — Skip all executions associated with specific tenants: ```sh kestra server executor "--skip-tenants=tenantA,tenantB" ``` Example: ```sh kestra server executor "--skip-tenants=companyA" ``` ## Docker in Docker (DinD) If you face issues using Docker in Docker (e.g., with [Script tasks](../../16.scripts/index.mdx) using the `DOCKER` runner), start troubleshooting by attaching to the DinD container: ```sh docker run -it --privileged docker:dind sh ``` From there, use: - `docker logs <container-id>` to view logs - `docker inspect <container-id>` to get environment, network, and configuration details These commands help diagnose misconfigurations. ## Docker in Docker using Helm charts On some Kubernetes deployments, using DinD with our default Helm charts can result in errors such as: ```bash Device "ip_tables" does not exist. ip_tables 24576 4 iptable_raw,iptable_mangle,iptable_nat,iptable_filter modprobe: can't change directory to '/lib/modules': No such file or directory error: attempting to run rootless dockerd but need 'kernel.unprivileged_userns_clone' (/proc/sys/kernel/unprivileged_userns_clone) set to 1 ``` To fix this, switch DinD to insecure (privileged) mode by setting the following values: ```yaml dind: mode: 'insecure' base: insecure: image: tag: dind args: - --log-level=fatal securityContext: runAsUser: 0 runAsGroup: 0 ``` ## DinD on a Mac with Apple silicon (ARM) If you see errors like: ```plaintext java.io.IOException: com.sun.jna.LastErrorException: [111] Connection refused ``` it may be caused by running Docker in Docker on ARM-based Macs. Try using an embedded Docker server as shown below: :::collapse{title="Example docker-compose.yml"} ```yaml ## volumes omitted for brevity services: postgres: image: postgres # ...
dind: image: docker:dind privileged: true environment: DOCKER_HOST: unix://dind/docker.sock command: - --log-level=fatal volumes: - dind-socket:/dind - tmp-data:/tmp/kestra-wd kestra: image: kestra/kestra:latest entrypoint: /bin/bash user: "root" # dev only — not for production command: - -c - /app/kestra server standalone --worker-thread=128 volumes: - kestra-data:/app/storage - dind-socket:/dind - tmp-data:/tmp/kestra-wd environment: KESTRA_CONFIGURATION: | datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 kestra: storage: type: local local: base-path: "/app/storage" queue: type: postgres tasks: tmp-dir: path: /tmp/kestra-wd/tmp ports: - "8080:8080" - "8081:8081" ``` ::: ## tmp directory errors ("No such file or directory") If you encounter errors such as `"No such file or directory"` related to the tmp directory, it usually means the directory is not mounted correctly. In your `docker-compose.yml`, ensure the `tmpDir` path matches the mounted volume: ```yaml kestra: tasks: tmpDir: path: /home/kestra/tmp ``` Example volume configuration: ```yaml volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /home/kestra:/home/kestra ``` This ensures Kestra can properly access the tmp directory. --- # Upgrade Kestra: Rolling Updates, Migrations, and Rollback URL: https://kestra.io/docs/administrator-guide/upgrades > Best practices for upgrading Kestra, performing rolling updates, and rolling back to previous versions safely. Kestra evolves quickly. This page explains how to upgrade your installation.
## How to upgrade Kestra To upgrade Kestra, follow these steps: 1. Perform a database backup (optional but recommended). 2. Read the [release notes](https://github.com/kestra-io/kestra/releases) to understand the changes in the new version. 3. Perform a rolling upgrade of Kestra components. For Kubernetes, upgrade the Kestra Helm chart as described in “Rolling upgrades in Kubernetes,” below. 4. Apply any actions noted in the release notes (for example, update configuration files or adjust deprecated features). ## How to roll back Kestra to a previous version Sometimes you might need to roll back Kestra to a previous version. Follow these steps: 1. Perform a database backup (optional but recommended). 2. Stop all Kestra components. 3. Restore from a backup. 4. Restart with the older version. Check the [Backup and Restore](../backup-and-restore/index.md) section for more information on how to back up and restore Kestra, and [Maintenance Mode](../../07.enterprise/05.instance/maintenance-mode/index.md) to pause your Kestra instance for maintenance, upgrade, and backup tasks. :::alert{type="warning"} We strongly recommend avoiding downgrades. To prevent surprises, test the new version in a non-production environment before upgrading. If you must roll back, closely follow the steps above. ::: ## Where you can find the release changelog You can find the changelog on the main repository’s [Releases](https://github.com/kestra-io/kestra/releases) page. It lists changes, new features, and bug fixes for each release, as well as any breaking changes. For a high-level overview, see the release [blog posts](/blogs). ## How to identify breaking changes in a release In addition to bug fixes and enhancements, the release notes include a `Breaking Changes` section. It lists changes that may require adjustments to your code or Kestra configuration, with links to [migration docs](../../11.migration-guide/index.mdx).
:::alert{type="warning"} The `Breaking Changes` section appears at the end of the [release notes](https://github.com/kestra-io/kestra/releases). Review it before upgrading. ::: ## How to minimize downtime when updating Kestra If you run Kestra as separate components, you should: - Stop the executors and the scheduler - Stop the workers — a graceful shutdown waits for active jobs to finish. The default is `kestra.server.terminationGracePeriod = '5m'`, configurable in your [Runtime and Storage configuration](../../configuration/02.runtime-and-storage/index.md). - If the job finishes within five minutes, the worker shuts down immediately. Otherwise, the task is killed and restarts when the worker restarts. - Stop the webserver (and the indexer if using EE with Kafka). All components support graceful shutdown, so no data is lost. Afterward, update and restart everything in the opposite order (or in any order, as components are independent). :::alert{type="info"} The webserver hosts the API, so stop and then start it immediately to avoid downtime. After that, restart the other components so flow executions can resume. ::: ## How to stick to a specific Kestra version If you want to stick to a specific Kestra version, you can pin the [Docker image tag](https://hub.docker.com/r/kestra/kestra/tags) to a specific release. Here are some examples: - `kestra/kestra:v0.21.4-no-plugins` includes the 0.21.4 release without any plugins - `kestra/kestra:v0.21.4` includes the 0.21.4 release with all plugins - `kestra/kestra:v0.19.0-no-plugins` includes the 0.19 release without any plugins - `kestra/kestra:v0.19.0` includes the 0.19 release with all plugins. You can also create a custom image with your own plugins and dependencies, as explained in the [Docker installation](../../02.installation/02.docker/index.md).
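To enforce version pinning, a CI step can reject floating tags before deployment. A minimal sketch; the `tag` value is an example placeholder, so substitute whatever your deployment actually references:

```shell
#!/bin/sh
# Reject floating tags such as `latest` or `develop` so that
# deployments stay reproducible across environments.
tag="kestra/kestra:v0.21.4-no-plugins"   # example value; substitute your own
version=${tag##*:}                       # strip the repository prefix
case "$version" in
  v[0-9]*.[0-9]*.[0-9]*) echo "pinned to $version" ;;  # prints: pinned to v0.21.4-no-plugins
  *) echo "refusing floating tag '$version'"; exit 1 ;;
esac
```

The same pattern check works for a Helm `image.tag` value or a Docker Compose image reference.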
## Migrating a standalone installation If you use a manual standalone installation with Java, you can download the Kestra binary for a specific version from the Assets section of the corresponding [Release](https://github.com/kestra-io/kestra/releases) page. The image below shows how you can download the binary for the 0.14.1 release. ![download_kestra_binary](./download_kestra_binary.png) Once you’ve downloaded the binary, start Kestra with the following command: ```bash ./kestra-VERSION server standalone ``` ## Migrating an installation with Docker If you use Docker, change the [Docker image tag](https://hub.docker.com/r/kestra/kestra/tags) to the desired version and restart the container(s) or Kubernetes pod(s). ### Docker Compose If you use Docker Compose, update your compose file to the desired [Docker image tag](https://hub.docker.com/r/kestra/kestra/tags) and run `docker compose up -d` to restart the container(s). ## Migrating in Kubernetes using Helm If you use Helm, set the [Helm chart `image.tag` value](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) to the desired version. For example: ```bash helm upgrade kestra kestra/kestra --set image.tag=v1.0.0 ``` For more complex configurations that include multiple changes, consider using a custom values file: 1. First, create a `values.yaml` file that contains the settings you want to adjust. 2. Then, use the `helm upgrade` command with the `-f` flag to specify your custom values file: ```sh helm upgrade kestra kestra/kestra -f values.yaml ``` ## Rolling upgrades in Kubernetes Upgrading Kestra on Kubernetes depends on your deployment rollout strategy. Every service can be rolled out without downtime, except workers, which need special attention. During rollout, each component creates a new pod (the old one keeps running). After the new pod passes health checks, Kubernetes shuts down the previous pod, resulting in zero downtime.
Upgrading workers is more involved because they handle data-processing tasks that can run from seconds to hours. Define the desired behavior for in-flight tasks. By default, Kestra workers wait for all task runs to complete before shutting down during a migration. You can override this behavior if needed. Kestra [Helm charts](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) expose a `terminationGracePeriodSeconds` setting (60 seconds by default) that defines how long to wait before force-killing the worker. If the worker has no running tasks, or finishes them before the grace period, it shuts down immediately. If the pod cannot finish tasks before `terminationGracePeriodSeconds`, Kubernetes kills the pod, and those tasks are resubmitted to another worker. If a worker exits unexpectedly, the executor detects it and resubmits unfinished task runs to a new worker. The same behavior applies when a pod is terminated at `terminationGracePeriodSeconds`. ## Where can I find migration guides The [Migrations section](../../11.migration-guide/index.mdx) details deprecated features and explains how to migrate to the new behavior. For all breaking changes, the migration guides are linked in the [release notes](https://github.com/kestra-io/kestra/releases). ## How to stay informed about new releases You can get notified about new releases in the following ways: 1. Subscribe to notifications in the `#announcements` channel in the [Slack](/slack) community. 2. Follow us on [X (Twitter)](https://twitter.com/kestra_io) 3. Follow us on [LinkedIn](https://www.linkedin.com/company/kestra/) 4. Subscribe to the [Kestra newsletter](/blogs) 5.
Subscribe to Release notifications on the [main GitHub repository](https://github.com/kestra-io/kestra), as shown in the image below: ![release_notification_github](./release_notifications_github.png) ## Database migrations There are two types of database migrations: automatic and manual. ### Automatic database migration Kestra uses [Flyway](https://flywaydb.org/) to automatically perform database migrations when the server starts. Flyway version-controls schema changes and stores the current version in the `flyway_schema_history` table. On startup, it compares the current version with the target and runs any required migrations — no manual intervention needed. ### Manual database migration Sometimes a manual database migration is useful, especially when you have a large database and you want to perform the migration before upgrading Kestra to avoid long downtime. For example, when migrating from v0.12.0 to v0.13.0, all indexes are rebuilt due to multi-tenancy (`tenant_id` is added to most tables). With a large JDBC-backed database, this can take hours. In such cases, run `kestra sys database migrate` manually before starting Kestra. Run this command with the same configuration as your Kestra instance. Depending on whether you deploy Kestra using Docker or Kubernetes, this command can be launched via a `docker exec` or a `kubectl exec` command. There are two ways to initiate the manual database migration: 1. Keep Kestra running in an old version. Then, stop Kestra and launch the command on the new version. 2. Start Kestra on the new version with automatic schema migration disabled: `flyway.datasources.postgres.enabled=false` (if your database is not Postgres, replace `postgres` with your DB type). Then run: `kestra sys database migrate`.
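If you prefer to disable the automatic migration through the environment rather than a config file, Micronaut's usual convention applies: dots become underscores and the name is upper-cased. A small sketch of the mapping (assumes standard Micronaut environment-variable property resolution; verify against your deployment):

```shell
#!/bin/sh
# Derive the environment-variable form of a Micronaut property:
# dots -> underscores, then upper-case the whole name.
prop="flyway.datasources.postgres.enabled"
env_name=$(printf '%s' "$prop" | tr '.' '_' | tr '[:lower:]' '[:upper:]')
echo "${env_name}=false"   # prints: FLYWAY_DATASOURCES_POSTGRES_ENABLED=false
```

Setting that variable on the container before startup has the same effect as the `flyway.datasources.postgres.enabled=false` property shown above.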
Example: run the command via `docker exec`: ```bash docker exec your_container_id bash ./kestra sys database migrate --help ``` Here is the output of that `--help` command: ```bash Usage: kestra sys database migrate [-hVv] [--internal-log] [-c=<config>] [-l=<logLevel>] [-p=<pluginsPath>] Force database schema migration. Kestra uses Flyway to manage database schema evolution, this command will run Flyway then exit. -c, --config=<config> Path to a configuration file, default: /root/.kestra/config.yml) -h, --help Show this help message and exit. --internal-log Change also log level for internal log, default: false) -l, --log-level=<logLevel> Change log level (values: TRACE, DEBUG, INFO, WARN, ERROR; default: INFO) -p, --plugins=<pluginsPath> Path to plugins directory, default: ./plugins) -v, --verbose Change log level. Multiple -v options increase the verbosity. -V, --version Print version information and exit. ``` ## Getting help If you have questions about the upgrade process: - If you are a [Kestra Enterprise](/enterprise) customer, submit a [support ticket](https://support.kestra.io/). - Or reach out [via Slack](/slack). For further help, [contact us](/contact-us) for assistance with migration based on your environment and use case. --- # Anonymous Usage Reporting in Kestra: Enable or Disable URL: https://kestra.io/docs/administrator-guide/usage > Learn about anonymous usage reporting in Kestra and how to configure or disable data collection. Configuration options for the usage report. The `kestra.anonymous-usage-report.enabled` option is mandatory: decide whether to share anonymous data to help improve Kestra. - `kestra.anonymous-usage-report.enabled`: (default true) - `kestra.anonymous-usage-report.initial-delay`: (default 5m) - `kestra.anonymous-usage-report.fixed-delay`: (default 1h) The collected data can be found [here](https://github.com/kestra-io/kestra/tree/develop/core/src/main/java/io/kestra/core/models/collectors). We collect only **anonymous data** that allows us to understand how you use Kestra.
The data collected includes: - **host data:** CPU, RAM, OS, JVM, and a machine fingerprint. - **plugins data:** plugins installed and their current versions. - **flow data:** namespace count, flow count, the task type and the trigger type used. - **execution data:** execution and task run counts for the last two days, with counts and durations grouped by status. - **UI interaction:** data to help us understand user experience in the interface. - **common data:** server type, version, time zone, environment, start time, and URL. --- # Webserver URL, Reverse Proxy, and Forward Proxy Setup URL: https://kestra.io/docs/administrator-guide/webserver-url > Configure the Kestra webserver URL and proxy settings to ensure correct link generation and access behind reverse proxies. Configure the URL of your Kestra webserver. Some notification services require a URL configuration to add links from alert messages. Use a full URI with a trailing `/` (excluding `ui` or `api`). ```yaml kestra: url: https://www.my-host.com/kestra/ ``` ## Proxy configuration In networking, a **forward proxy** acts on behalf of clients to control **outbound traffic**, while a **reverse proxy** acts on behalf of servers to control **inbound traffic** and may also provide features such as load balancing and SSL encryption. A forward proxy serves as an intermediary for requests from clients seeking resources from other servers (such as the Kestra API for retrieving blueprints and plugin documentation), while a reverse proxy sits in front of one or more web servers, intercepting client requests before they reach the server. ### Forward proxy configuration In a forward proxy, the client connects to the proxy server, requesting some service (such as Kestra API) available from a different server. To set up a proxy in your Kestra installation, adjust the `micronaut.http.services.api` configuration to include a proxy address, username, and password. 
This will allow you to make requests to the Kestra API through the proxy to fetch data for the Kestra UI, such as Blueprints. Here is how you can adjust your `config.yml` file to include the necessary configuration: ```yaml micronaut: http: services: api: url: https://api.kestra.io proxy-type: http proxy-address: my.company.proxy.address:port proxy-username: "username" proxy-password: "password" follow-redirects: true ``` See the [Micronaut HttpClient Configuration](https://docs.micronaut.io/latest/guide/configurationreference.html#io.micronaut.http.client.DefaultHttpClientConfiguration) for more details on configuring `DefaultHttpClientConfiguration` in your `config.yml` file. Another way to authenticate is by providing `micronaut.http.client.proxy-authorization: Basic ` and `micronaut.http.services.*.proxy-authorization: Basic `, which prevents the password from being displayed in plain text in the config file. ### Reverse proxy configuration Reverse proxies hide the server’s identity from clients and may perform tasks such as load balancing, authentication, decryption, and caching. A reverse proxy acts on behalf of the server, taking requests from the external network, and directing them to the internal server(s) that can fulfill those requests. To display executions in real-time when hosting Kestra behind a reverse proxy, enable [Server-sent events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events). On some reverse proxies, such as Nginx, you need to disable buffering to enable real-time updates. 
Here is a working configuration: ```bash location / { proxy_pass http://localhost:; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade"; proxy_read_timeout 600s; proxy_redirect off; proxy_set_header Host $http_host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Protocol $scheme; # Needed for SSE proxy_buffering off; proxy_cache off; } ``` To access Kestra via a separate context path, add the following to your Kestra startup configuration (for example, to serve the UI at `mycompany.com/kestra`): ```yaml micronaut: server: context-path: "/kestra" ``` Then, modify your Nginx configuration above as follows: ```bash server { listen 80; server_name mycompany.com; location /kestra { proxy_pass http://:/kestra; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade"; proxy_read_timeout 600s; proxy_redirect off; proxy_set_header Host $http_host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Protocol $scheme; # Needed for SSE proxy_buffering off; proxy_cache off; } } ``` --- # AI Tools in Kestra: Copilot, Agents & RAG Workflows URL: https://kestra.io/docs/ai-tools > Learn how Kestra's AI Copilot, AI Agents, and Agent Skills can accelerate your workflow creation and enable autonomous orchestration. import ChildCard from "~/components/docs/ChildCard.astro" Create, refine, and orchestrate workflows using natural language or autonomous decision-making. ## Learn how Kestra AI tools accelerate orchestration Kestra provides two AI-powered features — **AI Copilot** and **AI Agents** — that extend how workflows can be created and executed. Additionally, **Agent Skills** let you bring Kestra expertise to external AI coding agents.
## AI Copilot AI Copilot allows users to generate and refine flow definitions from natural language prompts. Instead of manually writing YAML, you can describe the desired behavior (for example, _“Make a REST API call to https://kestra.io/api/mock and allow failure”_) and Copilot will generate the corresponding flow code. The generated YAML can then be reviewed, accepted, or modified. Copilot can also update existing flows incrementally, such as adding tasks or adjusting triggers, without affecting unrelated parts of the flow. ## AI Agents AI Agents provide autonomous orchestration capabilities. An AI Agent task uses a large language model (LLM), optional memory, and configured tools such as web search, task execution, or flow calling. The agent can dynamically decide which actions to take, loop until conditions are satisfied, and adapt based on new information. Unlike static flows that follow a fixed sequence, agents operate adaptively while remaining observable and fully defined as code. ## Agent Skills Agent Skills are structured knowledge files that teach external AI coding agents — such as Claude Code, Cursor, and Windsurf — how to generate Kestra flows and operate Kestra environments using `kestractl`. Unlike AI Copilot (which works inside the Kestra UI) or AI Agents (which run inside flows), Agent Skills bring Kestra expertise directly to your editor or terminal. ## Summary Together, these approaches offer complementary ways to work with AI: - **AI Copilot**: speeds up flow creation and modification by translating natural language instructions into YAML. - **AI Agents**: enable adaptive orchestration patterns where task sequences are not predetermined but are chosen dynamically at runtime. - **Agent Skills**: give external AI coding agents structured knowledge to generate valid Kestra flows and operate environments from your development tools. 
AI Copilot and AI Agents are built into Kestra, while Agent Skills extend Kestra expertise to the external tools you already use. --- # Agent Skills – Operate Kestra from AI Coding Agents URL: https://kestra.io/docs/ai-tools/agent-skills > Give AI coding agents like Claude Code, Cursor, and Windsurf structured knowledge to generate Kestra flows and operate Kestra environments using kestractl. Give AI coding agents structured knowledge to generate Kestra flows and operate Kestra environments. ## What are Agent Skills Agent Skills are structured knowledge files (`SKILL.md`) that teach external AI coding agents how to work with Kestra. They provide the context, commands, and guardrails an agent needs to generate valid flow YAML or operate a Kestra environment via the CLI. Unlike [AI Copilot](../ai-copilot/index.md), which works inside the Kestra UI, Agent Skills bring Kestra expertise to the tools you already use in your editor or terminal — Claude Code, Cursor, Windsurf, OpenAI Codex, and others. Unlike [AI Agents](../ai-agents/index.md), which are autonomous tasks running inside Kestra flows, Agent Skills equip your external coding agent with Kestra-specific knowledge so it can help you build and operate flows from your development environment. Agent Skills follow an emerging standard for giving AI tools domain-specific knowledge. Learn more at [agentskills.io](https://agentskills.io/home), the community hub for agent skills across tools and domains. ## Available Skills Kestra provides two skills in the [kestra-io/agent-skills](https://github.com/kestra-io/agent-skills) repository. ### kestra-flow Generate, modify, or debug Kestra Flow YAML grounded in the live flow schema — the same approach used by Kestra's AI Copilot. 
**Use when:** - Generating a new flow from a description - Modifying or extending an existing flow - Debugging invalid YAML or incorrect task/trigger references **Covers:** - Fetching and validating against the live flow schema from `https://api.kestra.io/v1/plugins/schemas/flow` - Schema-validated task and trigger generation - Partial modifications that touch only the relevant part of a flow - Guardrails: no invented types, no hardcoded secrets, correct looping and trigger patterns **Example prompt:** ```plaintext Use kestra-flow to write a flow that polls a REST API every 30 minutes and stores the result in KV store. ``` ### kestra-ops Operate Kestra using `kestractl` for flow, execution, namespace, and namespace-file operations. **Use when:** - Validating or deploying flows - Triggering executions and checking status - Managing namespaces and namespace files (`nsfiles`) - Configuring or switching CLI contexts **Covers:** - Context and auth setup (`config add`, `config use`, `config show`) - Flow operations: list, get, validate, deploy - Execution monitoring: run with `--wait`, get status - Namespace file management: list, get, upload, delete - Production guardrails: validate before deploy, confirm destructive actions, avoid exposing credentials **Example prompt:** ```plaintext Use kestra-ops to validate and deploy all flows in ./flows to prod.namespace with fail-fast enabled. 
``` ## Prerequisites - **AI coding agent**: Claude Code, Cursor, Windsurf, OpenAI Codex, OpenCode, or any agent that supports skill files - **For kestra-flow**: `curl` and network access to `https://api.kestra.io` - **For kestra-ops**: [`kestractl`](../../kestra-cli/kestractl/index.md) installed with valid credentials ## Setup The easiest way to install Kestra agent skills is with [skills.sh](https://skills.sh) — it auto-detects your AI coding agent and places the skill files in the right location: ```bash npx skills add kestra-io/agent-skills ``` This works with Claude Code, Cursor, Windsurf, OpenAI Codex, and other agents that support skill files. The CLI detects which agent you’re using and installs the `SKILL.md` files into the correct directory (e.g. `.claude/skills/` for Claude Code, `.cursor/rules/` for Cursor). ### Manual installation You can also manually download skill files from the [kestra-io/agent-skills](https://github.com/kestra-io/agent-skills) repository. Each skill is a `SKILL.md` file under `skills//`. For example, to add the `kestra-ops` skill to Claude Code: ```bash mkdir -p .claude/skills/kestra-ops curl -sL https://raw.githubusercontent.com/kestra-io/agent-skills/main/skills/kestra-ops/SKILL.md \ -o .claude/skills/kestra-ops/SKILL.md ``` Repeat for any other skill you need (e.g. `kestra-flow`). Adjust the target directory for your agent — `.cursor/rules/` for Cursor, `.agents/skills/` for OpenAI Codex, etc. ## Example Workflows ### Generate a flow with kestra-flow Ask your agent to create a flow that polls an API on a schedule and persists the result: ```plaintext Use kestra-flow to write a flow in namespace company.data that fetches https://api.example.com/metrics every 30 minutes and stores the response in KV store under the key "latest_metrics". ``` The agent will fetch the live schema, generate valid YAML with a `Schedule` trigger and `io.kestra.plugin.core.kv.Set` task, and output ready-to-deploy flow code. 
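For illustration, here is a hand-written sketch of the kind of flow such a prompt could yield. The agent's actual output may differ; the flow `id`, task `id`s, and the `{{ outputs.fetch.body }}` expression below are assumptions for this sketch, while the namespace, URL, KV key, and task/trigger types come from the prompt and description above:

```yaml
id: fetch_metrics
namespace: company.data

tasks:
  # Fetch the metrics endpoint
  - id: fetch
    type: io.kestra.plugin.core.http.Request
    uri: https://api.example.com/metrics

  # Persist the HTTP response body in the KV store
  - id: store
    type: io.kestra.plugin.core.kv.Set
    key: latest_metrics
    value: "{{ outputs.fetch.body }}"

triggers:
  # Run every 30 minutes
  - id: every_30_minutes
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/30 * * * *"
```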
### Validate and deploy with kestra-ops Ask your agent to validate local flow files and deploy them: ```plaintext Use kestra-ops to validate all flows in ./flows, then deploy them to prod.pipelines namespace with --override and --fail-fast. ``` The agent will run `kestractl flows validate ./flows/`, confirm results, and then run `kestractl flows deploy` with the requested flags. ### Run a flow and report results with kestra-ops Ask your agent to trigger an execution and summarize the outcome: ```plaintext Use kestra-ops to run nightly-refresh in analytics.jobs namespace, wait for completion, and report the execution status. ``` The agent will run `kestractl executions run analytics.jobs nightly-refresh --wait`, then summarize the execution result. ## Creating Custom Skills You can create your own skills following the same `SKILL.md` format. Each skill file should include: - **Frontmatter** with `name`, `description`, and `compatibility` - **When to use** — trigger conditions for the skill - **Required inputs** — what context the agent needs - **Workflow** — step-by-step instructions - **Guardrails** — safety rules and constraints - **Example prompts** — realistic usage examples See the [contributing guidelines](https://github.com/kestra-io/agent-skills) in the repository for more details. --- # AI Agents in Kestra – Autonomous Orchestration URL: https://kestra.io/docs/ai-tools/ai-agents > Build autonomous AI agents in Kestra for LLM-powered orchestration. Create agents that think, remember, and use tools like web search for complex workflows. Launch autonomous processes with an LLM, memory, and tools. ## Build autonomous AI agents in Kestra Add autonomous AI-driven tasks to flows that can think, remember, and dynamically orchestrate tools and tasks.
An AI Agent is an autonomous system that uses a Large Language Model (LLM). Each run combines a **system message** and a **prompt**. The system message defines the agent's role and behavior, while the prompt carries the actual user input for that execution. Together, they guide the agent's response. With AI Agents, workflows are no longer limited to a predefined sequence of tasks. An AI Agent task launches an autonomous process with the help of an LLM, memory, and tools such as web search, task execution, and flow calling, and can dynamically decide which actions to take and in what order. Unlike traditional flows, an AI Agent can loop tasks until a condition is met, adapt to new information, and orchestrate complex multi-step objectives on its own. This enables agentic orchestration patterns in Kestra, where agents can operate independently or collaborate in multi-agent systems, all while remaining fully observable and manageable in code. To start using this feature, you can add an [**AI Agent**](/plugins/plugin-ai/agent) task to your flow. The AI Agent will then use the tools you provide to achieve its goal, leveraging capabilities such as web search, task execution, and flow calling. Thanks to memory, your AI Agent can remember information across executions to provide context for future tasks and subsequent prompts. ## AI Agent flow example
To demonstrate, below is a flow that summarizes arbitrary text with controllable length and language. Each component of the flow is broken down. ```yaml id: simple_summarizer_agent namespace: company.ai inputs: - id: summary_length displayName: Summary Length type: SELECT defaults: medium values: - short - medium - long - id: language displayName: Language ISO code type: SELECT defaults: en values: - en - fr - de - es - it - ru - ja - id: text type: STRING displayName: Text to summarize defaults: | Kestra is an open-source orchestration platform that: - Allows you to define workflows declaratively in YAML - Allows non-developers to automate tasks with a no-code interface - Keeps everything versioned and governed, so it stays secure and auditable - Extends easily for custom use cases through plugins and custom scripts. Kestra follows a "start simple and grow as needed" philosophy. You can schedule a basic workflow in a few minutes, then later add Python scripts, Docker containers, or complicated branching logic if the situation calls for it. tasks: - id: multilingual_agent type: io.kestra.plugin.ai.agent.AIAgent systemMessage: | You are a precise technical assistant. Produce a {{ inputs.summary_length }} summary in {{ inputs.language }}. Keep it factual, remove fluff, and avoid marketing language. If the input is empty or non-text, return a one-sentence explanation. 
Output format: - 1-2 sentences for 'short' - 2-5 sentences for 'medium' - Up to 5 paragraphs for 'long' prompt: | Summarize the following content: {{ inputs.text }} - id: english_brevity type: io.kestra.plugin.ai.agent.AIAgent prompt: Generate exactly 1 sentence English summary of "{{ outputs.multilingual_agent.textOutput }}" pluginDefaults: - type: io.kestra.plugin.ai.agent.AIAgent values: provider: type: io.kestra.plugin.ai.provider.GoogleGemini modelName: gemini-2.5-flash apiKey: "{{ secret('GEMINI_API_KEY') }}" configuration: logRequests: true logResponses: true responseFormat: type: TEXT ``` ### Inputs The goal of the AI Agent is to summarize text. The flow uses three inputs -- `summary_length`, `language`, and `text` -- to control the length, language, and source text for the summary. All inputs have a default value, and more or fewer can be defined and referenced in downstream agentic tasks with [expressions](../../expressions/index.mdx), depending on the use case. When executing the flow, all the inputs can be selected or modified from the defaults. ![AI Agent Flow Inputs](./ai-agent-inputs.png) Continuing below for reference, we select `short` for the summary length and German (`de`) for the summary language. ### Tasks In the flow, there are two tasks using the [AI Agent plugin](/plugins/plugin-ai/agent): `multilingual_agent` and `english_brevity`. The first task, `multilingual_agent`, includes the `systemMessage` property, which sets the system message sent to the LLM provider. The system message references the input selections for the desired summary length and the language in which to generate the summary. It also defines the output format for `short`, `medium`, and `long` summaries. Now that the AI Agent is familiar with its role, the `prompt` property tells it what to do: summarize the input text. Taking a look at the output for a short summary, the `multilingual_agent` task does provide a 1–2 sentence summary of Kestra in German.
![AI Agent Initial Summary](./ai-agent-summary.png) Following `multilingual_agent` is the `english_brevity` task, which only needs a `prompt` because the `systemMessage` moves downstream in the flow. Whether you need a shorter English translation or the original summary was generated in a different language, the `english_brevity` task adapts the output to match. In the execution context, the output is abbreviated and limited to exactly one sentence per the prompt. ![AI Agent Abbreviated Summary](./ai-agent-brevity.png) These outputs can then be passed on as notifications or system messages to external tools or subflows within Kestra. Other useful outputs include `tokenUsage` to compare different providers for the same tasks. For more examples and details about properties, outputs, and definitions, refer to the [AI Agent plugin documentation](/plugins/plugin-ai/agent). ### Plugin defaults Each task using the AI Agent requires the `provider` property. To simplify the flow-building experience, consider using [Kestra's AI Copilot](../ai-copilot/index.md); to ensure consistency and avoid repetition, use [Plugin Defaults](../../05.workflow-components/09.plugin-defaults/index.md). Additionally, secure your provider API key either through the [Key-Value Store](../../06.concepts/05.kv-store/index.md) or as a [Secret](../../06.concepts/04.secret/index.md) if using [Kestra Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md). --- # AI Copilot in Kestra – Generate and Edit Flows URL: https://kestra.io/docs/ai-tools/ai-copilot > Use Kestra AI Copilot to generate and edit flows with natural language prompts. Get AI-assisted suggestions for tasks, triggers, and configurations. Build and modify flows directly from natural language prompts. ## Create and edit flows with AI Copilot The AI Copilot can generate and iteratively edit declarative flow code with AI-assisted suggestions.
The AI Copilot is designed to help build and modify flows directly from natural language prompts. Describe what you are trying to build, and Copilot will generate the YAML flow code for you to accept or adjust. Once your initial flow is created, you can iteratively refine it with Copilot’s help, adding new tasks or adjusting triggers without touching unrelated parts of the flow. Everything stays as code and in Kestra's usual declarative syntax. Copilot is available anywhere you build in Kestra — Flows, Apps, Unit tests, and Dashboards — so you can keep iterating with the same AI assistant across the product surface. You can type prompts or click the microphone button in the Copilot panel to dictate them with speech-to-text directly from the UI. Copilot grounds its suggestions in your Namespace metadata. It automatically reads Plugin Defaults, Variables, Secrets, and Key-Value pairs configured in the current Namespace, so prompts like "Create a task that integrates with MongoDB" can reuse your existing `pluginDefaults`, stored credentials, or variables without extra hints. ## Configuration To add Copilot to your flow editor, add the following to your [Enterprise and Advanced configuration](../../configuration/06.enterprise-and-advanced/index.md). The `providers` array lets you register multiple LLMs and pick a default (`isDefault: true`): ```yaml kestra: ai: enabled: true # set to false to disable AI Copilot entirely providers: - id: gemini display-name: Gemini - Private type: gemini configuration: model-name: gemini-2.5-flash api-key: YOUR_GEMINI_API_KEY - id: gpt display-name: Open AI type: openai isDefault: true configuration: model-name: gpt-4 api-key: YOUR_OPENAI_API_KEY ``` :::alert{type="info"} Legacy single-provider configs (`kestra.ai.type` + provider block) still work, but the `providers` array lets you register multiple providers and choose a default (`isDefault: true`). 
::: ### Disabling AI Copilot To fully disable the AI Copilot — including the built-in fallback to the `api.kestra.io` service — set `kestra.ai.enabled` to `false`: ```yaml kestra: ai: enabled: false ``` When disabled, the Copilot UI will not appear and all AI endpoints will be deactivated. The property defaults to `true`. ### Multiple providers When multiple providers are configured, users can switch models from a dropdown in the Copilot UI instead of relying only on the default. Replace `api-key` with your provider credentials. Copilot appears in the top right corner of the flow editor. Optionally, you can add the following properties inside each provider `configuration` block (availability varies by provider): - `temperature`: Controls randomness in responses — lower values make outputs more focused and deterministic, while higher values increase creativity and variability. - `topP` (nucleus sampling): Ranges from 0.0–1.0; lower values (0.1–0.3) produce safer, more focused responses for technical tasks, while higher values (0.7–0.9) encourage more creative and varied outputs. - `topK`: Typically ranges from 1–200+ depending on the API; lower values restrict choices to a few predictable tokens, while higher values allow more options and greater variety in responses. - `maxOutputTokens`: Sets the maximum number of tokens the model can generate, capping the response length. - `logRequests`: Creates logs in Kestra for LLM requests. - `logResponses`: Creates logs in Kestra for LLM responses. - `baseURL`: Specifies the endpoint address where the LLM API is hosted. - `clientPem`: (Required for mTLS) PEM bundle with client cert + private key (e.g., `cat client.crt.pem client.key.pem > client-bundle.pem`). Used for mutual TLS. - `caPem`: CA PEM file to add a custom CA without `trustAll`. Usually not needed since hosts already trust the CA. - `customHeaders`: Specify custom HTTP headers for authentication and routing through internal AI gateways. 
Custom headers should be passed as a map inside the property. - `timeout`: Specifies the maximum duration to wait for an AI model API request to complete before timing out. ISO 8601 duration format (Java Duration): `PT30S` = 30 seconds. You can set it per provider to enforce strict SLAs. :::alert{type="info"} Enterprise Edition includes an [RBAC permission](../../07.enterprise/03.auth/rbac/index.md) that lets administrators allow or disallow Copilot usage per role at tenant or namespace scope. ::: ![AI Copilot](./ai-copilot.png) :::alert{type="info"} The open-source version supports only Google Gemini models. Enterprise Edition users can configure any LLM provider, including Amazon Bedrock, Anthropic, Azure OpenAI, DeepSeek, Google Gemini, Google Vertex AI, Mistral, and all open-source models supported by Ollama. Navigate down to the Enterprise configurations section for your provider. If you use a different provider, please [reach out to us](https://kestra.io/demo) and we'll add it. ::: ## Build flows with Copilot
In the above demo, we want to create a flow that uses a [Python script](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.script) to fetch New York City weather data. To get started, open the Copilot and write a prompt. For example: ```txt Create a flow with a Python script that fetches weather data for New York City ``` Once prompted, the Copilot generates YAML directly in the flow editor that can be accepted or refused in the bottom right corner. ![Copilot Suggestion](./copilot-suggestion.png) If accepted, the flow is created and can be saved for execution, iterated on manually, or continually iterated upon by the Copilot. For example, you want a trigger added to the flow to run it on a schedule. Reopen the Copilot and prompt it with the desired trigger setup such as: ```txt Add a trigger to run the flow every day at 9 AM ``` The Copilot again makes a suggestion to add to the flow, but only in the targeted section, in this case a `triggers` block. This is also the case if you want the Copilot only to consider a specific task, input, plugin default, and so on. ![Copilot Trigger Iteration](./copilot-trigger.png) You can continuously collaborate with Copilot until the flow is exactly as you imagined. If accepted, suggestions are always declaratively written and manageable as code. You can keep track of the revision history using the built-in Revisions tab or with the help of Git Sync. ## Fix with AI With Copilot configured, there is also the added benefit of consulting Copilot to resolve execution errors from the Logs and Gantt views. For failed tasks, you can open the task and click the three dots to "**Fix with AI**". This option reopens the flow editor with the Copilot automatically prompted with the error context to help resolve any issues with the task. 
![Fix with AI](./fix-with-ai-gantt.png) ## Starter prompts To get started with Copilot, here are some example prompts to test, iterate on, and use as a starting point for collaboratively building flows with AI in Kestra: :::collapse{title="Example prompts to get started"} ```markdown - Create a flow that runs a dbt build command on DuckDB - Create a flow cloning https://github.com/kestra-io/dbt-example Git repository from a main branch, then add a dbt CLI task using DuckDB backend that will run dbt build command for that cloned repository using my_dbt_project profile and dev target. The dbt project is located in the root directory so no dbt project needs to be configured. - Create a flow that sends a POST request to https://dummyjson.com/products/add - Send a POST request to https://dummyjson.com/products/add - Write a Python script that sends a POST request to https://dummyjson.com/products/add - Write a Node.js script that sends a POST request to https://dummyjson.com/products/add - Create a flow with a Python script that fetches weather data for New York City - Make a REST API call to https://kestra.io/api/mock and allow failure - Create a flow that logs "Hello from AI" to the console - Create a flow that returns Hello as output - Create a flow that outputs Hello as value - Run a flow every 10 minutes - Run a flow every day at 9 AM - Run a shell command echo 'Hello Docker' in a Docker container - Run a command python main.py in a Docker container - Run a script main.py stored as namespace file - Build a Docker image from an inline Dockerfile and push it to a GitHub Container Registry - Build a Docker image from an inline Dockerfile and push it to a DockerHub Container Registry - Create a flow that adds a string KV pair called MYKEY with value myvalue to namespace company - Fetch value for KV pair called MYKEY from namespace company - Create a flow that downloads a file mydata.csv from S3 bucket named mybucket - Create a flow that downloads all files from the 
folder kestra/plugins/ from S3 bucket mybucket in us-east-1 - Send a Slack notification that approval is needed and Pause the flow for manual approval - Send a Slack alert whenever any execution from namespace company fails - Fetch value for string kv pair called mykey from Redis - Fetch value for mykey from Redis - Set value for mykey in Redis to myvalue - Sync all flows and scripts for selected namespaces from Git to Kestra - Create a flow that clones a Git repository and runs a Python script - Export a Postgres table called mytable to a CSV file - Query a Postgres table called mytable - Find documents in a MongoDB collection called mycollection - Load documents into a MongoDB mycollection using a file from input mydata - Trigger an Airbyte connection sync and retry it up to 3 times - Run an Airflow DAG called mydag - Orchestrate an Ansible playbook stored in Namespace Files - Run a DuckDB query that reads a CSV file - Fetch AWS ECR authorization token to push Docker images to Amazon ECR - Run a flow whenever 5 records are available in Kafka topic mytopic - Submit a run for a Databricks job ``` ::: ## Enterprise Edition Copilot configurations Enterprise Edition users can configure any LLM provider, including Amazon Bedrock, Anthropic, Azure OpenAI, DeepSeek, Google Gemini, Google Vertex AI, Mistral, OpenAI, OpenRouter, and all open-source models supported by Ollama. Add one or more of the snippets below as entries inside `kestra.ai.providers` (set `isDefault: true` on the default provider). Each configuration has slight differences, so adjust it for your provider. Only non-thinking modes are supported. If the used LLM is a pure thinking model (one that possesses thinking ability and cannot be disabled), the generated Flow will be incorrect and contain thinking elements. 
### Amazon Bedrock ```yaml kestra: ai: providers: - id: bedrock display-name: Amazon Bedrock type: bedrock configuration: model-name: amazon.nova-lite-v1:0 access-key-id: BEDROCK_ACCESS_KEY_ID secret-access-key: BEDROCK_SECRET_ACCESS_KEY ``` ### Anthropic ```yaml kestra: ai: providers: - id: anthropic display-name: Anthropic type: anthropic configuration: model-name: claude-opus-4-1-20250805 api-key: CLAUDE_API_KEY ``` ### Azure OpenAI ```yaml kestra: ai: providers: - id: azure-openai display-name: Azure OpenAI type: azure-openai configuration: model-name: gpt-4o-2024-11-20 api-key: AZURE_OPENAI_API_KEY tenant-id: AZURE_TENANT_ID client-id: AZURE_CLIENT_ID client-secret: AZURE_CLIENT_SECRET endpoint: "https://your-resource.openai.azure.com/" ``` ### Deepseek ```yaml kestra: ai: providers: - id: deepseek display-name: DeepSeek type: deepseek configuration: model-name: deepseek-chat api-key: DEEPSEEK_API_KEY base-url: "https://api.deepseek.com/v1" ``` ### Google Gemini ```yaml kestra: ai: providers: - id: gemini display-name: Google Gemini type: gemini configuration: model-name: gemini-2.5-flash api-key: YOUR_GEMINI_API_KEY ``` ### Google Vertex AI ```yaml kestra: ai: providers: - id: vertex display-name: Google Vertex AI type: googlevertexai configuration: model-name: gemini-2.5-flash project: GOOGLE_PROJECT_ID location: GOOGLE_CLOUD_REGION endpoint: VERTEX-AI-ENDPOINT ``` ### Mistral ```yaml kestra: ai: providers: - id: mistral display-name: Mistral type: mistralai configuration: model-name: mistral:7b api-key: MISTRALAI_API_KEY base-url: "https://api.mistral.ai/v1" ``` ### Ollama ```yaml kestra: ai: providers: - id: ollama display-name: Ollama type: ollama configuration: model-name: llama3 base-url: http://localhost:11434 ``` :::alert{type="info"} If Ollama is running locally on your host machine while Kestra is running inside a container, connection errors may occur when using `localhost`. 
In this case, use the Docker internal network URL instead — for example, set the base URL to `http://host.docker.internal:11434`. ::: :::alert{type="info"} Some Ollama model names can be confusing. For example, at the time of writing, the model `qwen3:30b-a3b` is pointing to SHA `ad815644918f`, which is the `qwen3:30b-a3b-thinking-2507-q4_K_M` model behind the scenes. This is a thinking model whose thinking cannot be disabled. Please double-check that the chosen model has a non-thinking version or that a toggle is available. ::: ### OpenAI ```yaml kestra: ai: providers: - id: openai display-name: OpenAI type: openai configuration: model-name: gpt-5-nano api-key: OPENAI_API_KEY base-url: https://api.openai.com/v1 ``` ### OpenRouter ```yaml kestra: ai: providers: - id: openrouter display-name: OpenRouter type: openrouter configuration: api-key: OPENROUTER_API_KEY base-url: "https://openrouter.ai/api/v1" model-name: "anthropic/claude-sonnet-4" ``` --- # RAG Workflows in Kestra – Retrieval-Augmented Generation URL: https://kestra.io/docs/ai-tools/ai-rag-workflows > Build Retrieval-Augmented Generation (RAG) workflows in Kestra to ground LLM responses in your own data or web search results. Ask questions, get data-backed answers with RAG. ## Build retrieval-augmented generation workflows Retrieval Augmented Generation (RAG) enhances LLM responses by grounding them in your own data instead of relying solely on the model’s internal knowledge. It works by retrieving relevant document embeddings and combining them with the user’s prompt to produce accurate, context-aware outputs. Chat with your data using RAG in Kestra. This example shows how to use **Retrieval Augmented Generation (RAG)** in Kestra to ground Large Language Model (LLM) responses in your own data. The flow ingests documents, stores embeddings in the KV Store, and contrasts responses from a plain LLM prompt with RAG-enabled responses, demonstrating how RAG reduces hallucinations and improves accuracy.
## RAG flow example ```yaml id: rag namespace: company.ai tasks: - id: ingest type: io.kestra.plugin.ai.rag.IngestDocument provider: type: io.kestra.plugin.ai.provider.GoogleGemini modelName: gemini-embedding-exp-03-07 apiKey: "{{ secret('GEMINI_API_KEY') }}" embeddings: type: io.kestra.plugin.ai.embeddings.KestraKVStore drop: true fromExternalURLs: - https://raw.githubusercontent.com/kestra-io/docs/refs/heads/main/content/blogs/release-0-24.md - id: parallel type: io.kestra.plugin.core.flow.Parallel tasks: - id: chat_without_rag type: io.kestra.plugin.ai.completion.ChatCompletion provider: type: io.kestra.plugin.ai.provider.GoogleGemini messages: - type: USER content: Which features were released in Kestra 0.24? - id: chat_with_rag type: io.kestra.plugin.ai.rag.ChatCompletion chatProvider: type: io.kestra.plugin.ai.provider.GoogleGemini embeddingProvider: type: io.kestra.plugin.ai.provider.GoogleGemini modelName: gemini-embedding-exp-03-07 embeddings: type: io.kestra.plugin.ai.embeddings.KestraKVStore systemMessage: You are a helpful assistant that can answer questions about Kestra. prompt: Which features were released in Kestra 0.24? pluginDefaults: - type: io.kestra.plugin.ai.provider.GoogleGemini values: apiKey: "{{ secret('GEMINI_API_KEY') }}" modelName: gemini-2.5-flash ``` ### How it works This flow first ingests external documents into the Kestra KV Store by generating embeddings with a chosen LLM provider. Those embeddings act as a searchable index. When you ask a question, Kestra can either pass the raw prompt directly to the LLM (without RAG) or augment the prompt with the most relevant information retrieved from the embeddings (with RAG). By grounding the model’s response in actual data, Kestra reduces the likelihood of hallucinations and ensures answers remain accurate and contextual to your source material. ### Without RAG vs.
with RAG Without RAG, the model answers based only on its pretraining and may produce plausible but inaccurate results if the requested details are not part of its training knowledge. With RAG, the model supplements its reasoning by retrieving embeddings stored in the KV Store and using them as context, producing responses directly tied to the ingested documents. Use RAG when you need AI responses anchored in current, domain-specific, or external data sources. ## RAG with web search example This example shows how to combine Retrieval Augmented Generation (RAG) with a web search content retriever to answer questions using both stored knowledge and up-to-date external sources. ```yaml id: rag_with_websearch_content_retriever namespace: company.ai tasks: - id: chat_with_rag_and_websearch_content_retriever type: io.kestra.plugin.ai.rag.ChatCompletion chatProvider: type: io.kestra.plugin.ai.provider.GoogleGemini modelName: gemini-2.5-flash apiKey: "{{ secret('GEMINI_API_KEY') }}" contentRetrievers: - type: io.kestra.plugin.ai.retriever.TavilyWebSearch apiKey: "{{ secret('TAVILY_API_KEY') }}" systemMessage: You are a helpful assistant that can answer questions about Kestra. prompt: What is the latest release of Kestra? ``` The flow uses the `TavilyWebSearch` ([Tavily](https://www.tavily.com/)) retriever to fetch the latest information from the web and provides it as context to the `ChatCompletion` task. By grounding the LLM’s response in real-time search results, Kestra can answer questions such as “What is the latest release of Kestra?” with accurate, current data. ### Comparison: Static RAG vs. Web Search RAG - Static RAG (e.g., with document ingestion) is ideal when you want to ground responses in a fixed knowledge base, such as internal documentation or policies. - Web Search RAG extends this by retrieving fresh, dynamic content from the internet, making it better for answering time-sensitive or evolving questions like product releases or recent events. 
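With static RAG, the ingested knowledge base can go stale. One option is to re-run ingestion on a schedule. The following sketch reuses the `IngestDocument` task and `Schedule` trigger shown on this page; the flow id and cron expression are illustrative, and it assumes the same `GEMINI_API_KEY` secret:

```yaml
id: rag_reingest
namespace: company.ai

tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
      apiKey: "{{ secret('GEMINI_API_KEY') }}"
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    drop: true # rebuild the index on every run
    fromExternalURLs:
      - https://raw.githubusercontent.com/kestra-io/docs/refs/heads/main/content/blogs/release-0-24.md

triggers:
  - id: weekly
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 6 * * 1" # every Monday at 06:00
```

Scheduled re-ingestion keeps a static knowledge base reasonably fresh without reaching for web search.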
For more properties, examples, and implementations, refer to the [Kestra RAG documentation](/plugins/plugin-ai/rag). --- # AI Workflows in Kestra: Orchestrate with Any LLM URL: https://kestra.io/docs/ai-tools/ai-workflows > Orchestrate AI workflows in Kestra with any LLM provider. Connect to OpenAI, Anthropic, Google, and more to build intelligent automation pipelines. Build AI workflows with your preferred LLM. ## Orchestrate AI workflows with your preferred LLM Kestra provides plugins for multiple LLM providers and continues to add more with each release. You can design flows that use your chosen model and integrate AI into orchestration workflows.
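At its simplest, an AI task in a flow specifies which provider plugin to use, which model, and how to authenticate. A minimal sketch based on the Gemini task used later on this page (exact property names vary per plugin, so check each plugin's documentation):

```yaml
id: minimal_ai_task
namespace: company.ai

tasks:
  - id: ask_ai
    type: io.kestra.plugin.gemini.StructuredOutputCompletion # provider plugin and task
    apiKey: "{{ secret('GEMINI_API_KEY') }}"                 # provider access key
    model: "gemini-2.5-flash-preview-05-20"                  # provider model
    prompt: "Summarize today's weather in one sentence."
    jsonResponseSchema: |
      {
        "type": "object",
        "properties": { "content": { "type": "string" } }
      }
```

Swapping providers usually means changing the `type`, `model`, and credential properties while the rest of the flow stays the same.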
The following examples demonstrate Kestra AI plugins for a variety of workflows. You can adapt each example to your chosen provider. Three key properties are important to understand: - `type`: Defines the LLM provider plugin and task (e.g., `ChatCompletion` with OpenAI). - `apiKey`: Access key for the provider – store this as a [key-value pair](../../06.concepts/05.kv-store/index.md) in Kestra Open Source or as a [secret](../../06.concepts/04.secret/index.md) in Enterprise Edition. - `model`: Specifies the provider model. Models vary in performance, cost, and capabilities, so choose the one that best fits your use case. Different provider plugins may include additional properties beyond those shown in the examples. Refer to each plugin’s documentation for a complete list. Common properties include `prompt`, `messages`, and `jsonResponseSchema`. :::collapse{title="Check whether the weather is suitable for sports every day using Gemini"} This flow checks the daily wind conditions in Cambridgeshire and uses Google Gemini to decide whether it is suitable to go sailing. If the wind speed falls within the preferred range (above 10 knots and below 30 knots), the flow notifies you in Slack with the recommendation and automatically blocks your calendar for the day with an 'Out of office – gone sailing' event. It runs every morning at `8:00` AM on a schedule. ```yaml id: check_weather namespace: company.ai tasks: - id: ask_ai type: io.kestra.plugin.gemini.StructuredOutputCompletion apiKey: "{{ secret('GEMINI_API_KEY') }}" model: "gemini-2.5-flash-preview-05-20" prompt: "I like to go sailing when the wind is above 10 knots but below 30 knots. I sail in Cambridgeshire. If the wind is within that range, I want to know if I should go sailing or not.
Also tell me the current wind speed" jsonResponseSchema: | { "type": "object", "properties": { "content": { "type": "string" }, "wind": { "type": "number" }, "go_sailing": { "type": "boolean" } } } - id: if type: io.kestra.plugin.core.flow.If condition: "{{ outputs.ask_ai['predictions'] | first | jq('.go_sailing') | first }}" then: - id: notify_me type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_WEBHOOK') }}" payload: | { "text": "{{ outputs.ask_ai['predictions'] | first | jq('.content') | first }}" } - id: block_calendar type: io.kestra.plugin.googleworkspace.calendar.InsertEvent calendarId: "{{ secret('CALENDAR_ID') }}" serviceAccount: "{{ secret('GOOGLE_SA') }}" summary: Out of office description: "Gone sailing because the wind is {{ outputs.ask_ai['predictions'] | first | jq('.wind') | first }} knots" startTime: dateTime: "{{ now() | date(\"yyyy-MM-dd'T'09:00:00+01:00\") }}" timeZone: "Europe/London" endTime: dateTime: "{{ now() | date(\"yyyy-MM-dd'T'18:00:00+01:00\") }}" timeZone: "Europe/London" creator: email: wrussell@kestra.io triggers: - id: check_daily type: io.kestra.plugin.core.trigger.Schedule cron: "0 8 * * *" ``` ::: :::collapse{title="Create tasks with natural language prompts using DeepSeek and Todoist"} This flow turns natural language prompts into structured Todoist tasks using an AI model. Each item is parsed into a title, description, and due date, then automatically created in your Todoist workspace via the REST API. ```yaml id: add_tasks_to_todoist namespace: company.ai inputs: - id: prompt type: STRING displayName: What would you like to add to your task list?
description: List out all the things you need to get done defaults: I need to get my prescription on Friday afternoon and go shopping afterwards tasks: - id: create_task_fields type: io.kestra.plugin.deepseek.ChatCompletion apiKey: '{{ secret("DEEPSEEK_API_KEY") }}' modelName: deepseek-chat messages: - type: SYSTEM content: You are going to help to write a todo list inside of Todoist. I need you to return any user messages as tasks in JSON format only. There might be multiple tasks. The current time is '{{ now() }}' - type: USER content: "{{ inputs.prompt }}" jsonResponseSchema: | { "type": "object", "properties": { "tasks": { "type": "array", "items": { "type": "object", "properties": { "title": { "type": "string" }, "description": { "type": "string" }, "due_date": { "type": "string", "format": "date-time", "description": "Due date of the task (as an RFC 3339 timestamp)." } } } } } } - id: create_tasks type: io.kestra.plugin.core.flow.ForEach values: "{{ outputs.create_task_fields.response | jq('.tasks') | first }}" tasks: - id: create_task type: io.kestra.plugin.core.http.Request uri: https://api.todoist.com/rest/v2/tasks method: POST contentType: application/json headers: Authorization: "Bearer {{ secret('TODOIST_API_TOKEN') }}" body: | { "content": "{{ taskrun.value | jq('.title') | first }}", "description": "{{ taskrun.value | jq('.description') | first }}", "due_datetime": "{{ taskrun.value | jq('.due_date') | first }}" } ``` ::: :::collapse{title="Generate an image with OpenAI with human approval"} This flow generates an image from a user prompt, sends it to a Discord channel for review, and waits for approval. If approved, the image is finalized and logged; if rejected, the user can provide feedback to regenerate a new image, which is then shared again on Discord.
```yaml id: gen_img_approval namespace: company.ai inputs: - id: image_prompt type: STRING variables: discord_webhook: "https://discord.com/api/webhooks/URL" tasks: - id: gen_img type: io.kestra.plugin.core.flow.Subflow namespace: demo flowId: generate_image inputs: openai_prompt: "{{ inputs.image_prompt }}" - id: send_image type: io.kestra.plugin.discord.DiscordExecution content: "Are you happy with the image: {{ outputs.gen_img.outputs.image }}. Approve it here: http://localhost:8082/ui/executions/{{flow.namespace}}/{{flow.id}}/{{execution.id}} " url: "{{ vars.discord_webhook }}" - id: wait_for_approval type: io.kestra.plugin.core.flow.Pause onResume: - id: approve description: Are you happy with the photo or not? type: BOOLEAN - id: feedback description: Write the prompt again with more detail type: STRING - id: try_again type: io.kestra.plugin.core.flow.If condition: "{{ outputs.wait_for_approval.onResume.approve }}" then: - id: approved type: io.kestra.plugin.core.log.Log message: "Final photo: {{ outputs.gen_img.outputs.image }}" else: - id: retry type: io.kestra.plugin.core.flow.Subflow namespace: demo flowId: generate_image inputs: openai_prompt: "{{ outputs.wait_for_approval.onResume.feedback }}" - id: send_new_image type: io.kestra.plugin.discord.DiscordExecution content: "Here's the new image with your feedback: {{ outputs.retry.outputs.image }}" url: "{{ vars.discord_webhook }}" ``` ::: :::collapse{title="Summarize Git commits from the past week using Ollama"} This flow automatically summarizes Git commits from the past week in a specified repository and branch. Each Friday at `15:00` UTC, it generates a plain-text summary using Ollama and posts it to Slack, keeping teams updated on project progress. 
```yaml id: ai-summarize-weekly-git-commits namespace: company.ai inputs: - id: repository type: URI defaults: https://github.com/kestra-io/blueprints description: Repository to summarize last week's progress - id: branch type: STRING defaults: main description: Git branch to summarize last week's progress tasks: - id: wdir type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: clone_repo type: io.kestra.plugin.git.Clone branch: "{{ inputs.branch }}" url: "{{ inputs.repository }}" - id: fetch_commits type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: bitnami/git:latest commands: # 0. Set safe.directory for Git to avoid "dubious ownership" errors - git config --global --add safe.directory "$(pwd)" # 1. Deepen clone if shallow - git fetch --unshallow origin {{ inputs.branch }} || true # 2. Update main branch - git fetch origin {{ inputs.branch }} # 3. Fetch commits from the last 7 days (weekly) - git log origin/{{ inputs.branch }} --since="7 days ago" --pretty=format:"%h %ad %s" --date=short > commits.txt # 4. Show how many were found - echo "Fetched $(wc -l < commits.txt) commits from the last 7 days" outputFiles: - commits.txt - id: summarize_commits type: io.kestra.plugin.ollama.cli.OllamaCLI enableModelCaching: true modelCachePath: "{{ kv('OLLAMA_CACHE_PATH') }}" commands: - "ollama run gemma3:1b \"Summarize the following Git commits into a clear and concise weekly development update for users. Output plain text for Slack, no markdown or extra formatting. Ensure no markdown syntax like **bold text** in the response — stick to plain text! Here are the commit messages: {{ read(outputs.fetch_commits.outputFiles['commits.txt']) }}\" > output.txt" outputFiles: - output.txt - id: slack type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_WEBHOOK') }}" payload: | {{ { "text": "This week's repository updates for " ~ inputs.repository ~ ". 
" ~ read(outputs.summarize_commits.outputFiles['output.txt']) } }} triggers: - id: weekly-trigger type: io.kestra.plugin.core.trigger.Schedule cron: "0 15 * * 5" # Every Friday at 15:00 (3:00 PM) UTC ``` ::: --- # API Reference: Enterprise and Open Source Editions URL: https://kestra.io/docs/api-reference > Access the complete API reference for both Kestra Open Source and Enterprise editions to integrate and automate your workflows. import ChildTableOfContents from "~/components/content/ChildTableOfContents.astro" ## Choose the right Kestra API reference --- # Cloud & Enterprise API Reference for Kestra URL: https://kestra.io/docs/api-reference/enterprise > Comprehensive API reference for Kestra Cloud and Enterprise editions, including advanced features like authentication and governance. import ApiDocEE from "~/components/content/ApiDocee.astro" API Reference of Kestra Cloud & Enterprise. ## Explore the Kestra Cloud and Enterprise API --- # SDK Language Clients for the Kestra API URL: https://kestra.io/docs/api-reference/kestra-sdk > Explore official Kestra SDKs for Java, Python, and Node.js to interact with the Kestra API and build custom applications. import ChildCard from "~/components/docs/ChildCard.astro" Interact with Kestra's API via language SDKs. ## Interact with Kestra using language SDKs There are [official Kestra SDKs](https://github.com/kestra-io/client-sdk) for Java, JavaScript, and Python. These SDKs provide a convenient way to interact with Kestra's API and build custom applications on top of it. SDK-based plugins now support an `DEFAULT`/`AUTO` authentication mode that pulls a default service account globally or from the current [Namespace](../../07.enterprise/02.governance/07.namespace-management/index.md#default-service-account-for-sdk-plugins) (or [Tenant](../../07.enterprise/02.governance/tenants/index.md#default-service-account-for-sdk-plugins)). 
Configure those defaults in the UI, or set a global fallback under `tasks.sdk.authentication` in your [Configuration Basics](../../configuration/01.configuration-basics/index.md). --- # Java SDK for Kestra: Client Setup and Examples URL: https://kestra.io/docs/api-reference/kestra-sdk/java-sdk > Integrate Kestra with Java using the official SDK. Learn to set up the client, configure authentication, and programmatically create and execute workflows. Interact with Kestra's API via Java SDK. ## Use the Kestra Java SDK ## Installation Choose the installation method that matches your environment. ### For Maven users Add this dependency to your project's **POM** file: ```xml <dependency> <groupId>io.kestra</groupId> <artifactId>kestra-api-client</artifactId> <version>1.0.0</version> <scope>compile</scope> </dependency> ``` ### For Gradle users Add this dependency to your **build.gradle** file: ```groovy implementation "io.kestra:kestra-api-client:1.0.0" ``` ### Manual installation If you prefer to install the JAR manually, first generate it: ```shell ./gradlew publishToMavenLocal ``` --- ## Getting started Initialize the `KestraClient` and reuse it across your application. Run this minimal example to verify your client setup: ```java import java.util.*; import io.kestra.client.KestraClient; // Adjust import to your SDK package public class GettingStarted { // Instantiate the client once and reuse it (e.g., as a singleton) private static final KestraClient CLIENT = KestraClient.builder() .url("http://localhost:8080") .basicAuth("root@root.com", "Root!1234") // or .tokenAuth("...") if you use tokens .build(); public static void main(String[] args) { // A lightweight example to confirm that the client was initialized System.out.println("KestraClient initialized: " + (CLIENT != null)); } } ``` :::alert{type="info"} **Notes** - Set `.url(...)` to your Kestra API base URL (for example, `http://localhost:8080`). - Configure either **basic** or **bearer** authentication to match your environment. - Construct the client **once** (singleton/DI) and reuse it for all API calls.
::: --- ## Create a flow Create a flow by sending the YAML definition as a string. This matches what you’d define in the UI, but through the SDK. ```java import java.util.*; import io.kestra.client.KestraClient; public class FlowsExamples { private static final KestraClient CLIENT = KestraClient.builder() .url("http://localhost:8080") .basicAuth("root@root.com", "Root!1234") .build(); public static void createFlow() { String tenant = "main"; String flowBody = """ id: myflow namespace: my.namespace inputs: - id: key type: STRING defaults: 'empty' tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! 🚀 """; CLIENT.flows().createFlow(tenant, flowBody); System.out.println("Flow created: my.namespace/myflow"); } } ``` :::alert{type="info"} **Important** - `flowBody` must be **valid YAML** for a Kestra flow. Invalid YAML or missing required fields will return a `4xx`. - Set the correct `tenant` for multi-tenant environments. - On success, the API returns the created flow (including metadata and source); you may log/inspect it as needed. ::: --- ## Update a flow Update by sending the full YAML for the flow (including the same `id`/`namespace`), then calling `updateFlow`. ```java import java.util.*; import io.kestra.client.KestraClient; public class FlowsUpdates { private static final KestraClient CLIENT = KestraClient.builder() .url("http://localhost:8080") .basicAuth("root@root.com", "Root!1234") .build(); public static void updateFlow() { String flowId = "myflow"; String namespace = "my.namespace"; String tenant = "main"; String updatedFlowBody = """ id: myflow namespace: my.namespace inputs: - id: key type: STRING defaults: 'empty' tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Updated! 
🚀 """; CLIENT.flows().updateFlow(namespace, flowId, tenant, updatedFlowBody); System.out.println("Flow updated: my.namespace/myflow"); } } ``` :::alert{type="info"} **Tips** - Send the **full** YAML for updates (id/namespace must match the target). - Keep your flow YAML in source control for diffing/auditing alongside code. - If you frequently change only a few fields, consider templating your YAML in code. ::: --- ## Execute a flow Trigger an execution and optionally pass inputs, labels, or scheduling parameters. You can choose to block (`wait=true`) until completion or return immediately. ```java import java.util.*; import io.kestra.client.KestraClient; import io.kestra.client.types.ExecutionKind; // Adjust to your SDK model package public class ExecutionsExamples { private static final KestraClient CLIENT = KestraClient.builder() .url("http://localhost:8080") .basicAuth("root@root.com", "Root!1234") .build(); public static void createExecution() { String flowId = "myflow"; String namespace = "my.namespace"; String tenant = "main"; Boolean wait = false; // non-blocking call List labels = List.of("label1:created"); Integer revision = null; // latest String scheduleDate = null; // or ISO-8601 string, e.g. "2025-11-01T10:00:00Z" List breakpoints = List.of(); // task ids to pause at (for debugging) ExecutionKind kind = ExecutionKind.NORMAL; Map variables = Map.of(); // flow variables (if any) Map inputs = new HashMap<>(); inputs.put("key", "value"); // matches the flow `inputs` definition var exec = CLIENT.executions() .createExecution(namespace, flowId, wait, tenant, labels, revision, scheduleDate, breakpoints, kind, variables, inputs); System.out.println("Execution started: " + exec.getId()); } } ``` :::alert{type="info"} **Notes** - `wait=true` blocks until the execution finishes (useful for [synchronous flows/test runners](../../../15.how-to-guides/synchronous-executions-api/index.md#synchronous-executions-api)). 
- Use [`labels`](../../../05.workflow-components/08.labels/index.md) (e.g., `team:platform`) for search, routing, or reporting. - `scheduleDate` allows delayed start. - `breakpoints` pause at specific task IDs to debug step-by-step. ::: --- ## Follow (stream) an execution Stream execution events/logs as they happen. This is useful for building live console output or CI visibility. ```java import java.util.*; import io.kestra.client.KestraClient; public class ExecutionStreaming { private static final KestraClient CLIENT = KestraClient.builder() .url("http://localhost:8080") .basicAuth("root@root.com", "Root!1234") .build(); public static void followExecution() { String executionId = "yourExecutionId"; String tenant = "main"; CLIENT.executions().followExecution(executionId, tenant) .doOnNext(execution -> { System.out.printf("Event: %s | Status: %s%n", execution.getId(), execution.getState()); }) .doOnError(err -> { System.err.println("Stream error: " + err.getMessage()); }) .doOnComplete(() -> { System.out.println("Execution stream completed."); }) .subscribe(); } } ``` :::alert{type="info"} **Tips** - Use `followExecution` in interactive tools or long-running services to surface progress in real time. - The first event is an empty keepalive payload — skip it before processing subsequent updates. - If you only need the final result, poll the execution by ID instead of streaming. - Consider backoff/retry logic when streaming over unstable networks. 
::: --- ## Putting it together (recommended structure) Create one utility class to hold your client and reuse it everywhere: ```java import io.kestra.client.KestraClient; public final class KestraClients { private KestraClients() {} public static final KestraClient INSTANCE = KestraClient.builder() .url(System.getenv().getOrDefault("KESTRA_URL", "http://localhost:8080")) // Choose one auth mechanism: .basicAuth( System.getenv().getOrDefault("KESTRA_USER", "root@root.com"), System.getenv().getOrDefault("KESTRA_PASS", "Root!1234") ) // .tokenAuth(System.getenv("KESTRA_TOKEN")) .build(); } ``` Then, in your feature classes: ```java public class MyFlows { public void create() { KestraClients.INSTANCE.flows().createFlow("main", "...yaml..."); } } ``` :::alert{type="info"} **Best practices** - Prefer **one** `KestraClient` per application (share via DI or a static holder). - Externalize **URL** and **auth** via environment variables or your config system. - Keep flow YAML as code (templates/strings) under version control for traceability. - Use **labels** and consistent naming for easier search, dashboards, and governance. ::: --- # JavaScript SDK for Kestra: Client Setup and Examples URL: https://kestra.io/docs/api-reference/kestra-sdk/javascript-sdk > Integrate Kestra with JavaScript using the official SDK. Install the library, configure the client, and programmatically create and execute workflows. Interact with Kestra's API via the JavaScript SDK. ## Install the JavaScript SDK This guide shows how to create and execute flows programmatically with the JavaScript SDK. Before starting, ensure your Kestra instance is reachable (for example via `KESTRA_API_URL`), and keep credentials in environment variables or an `.env` file: ```bash KESTRA_API_URL=http://localhost:8080 KESTRA_USERNAME=root@root.com KESTRA_PASSWORD=Root!1234 # KESTRA_TOKEN=... 
# optional if you use token auth instead of basic auth ``` Install the SDK (and `dotenv` if you want to load `.env` automatically): ```shell npm install @kestra-io/kestra-sdk npm install dotenv --save-dev ``` :::alert{type="info"} **Notes** - Prefer environment variables over hardcoding credentials. - Use **either** username/password (basic auth) or an access token (bearer). - Reuse a single `KestraClient` instance throughout your application. ::: ### Configure the client Initialize the client once and share it: ```javascript import 'dotenv/config'; import KestraClient from '@kestra-io/kestra-sdk'; const client = new KestraClient( process.env.KESTRA_API_URL ?? 'http://localhost:8080', process.env.KESTRA_TOKEN ?? '', // accessToken (preferred if set) process.env.KESTRA_USERNAME ?? 'root@root.com', process.env.KESTRA_PASSWORD ?? 'Root!1234' ); export default client; ``` --- ## Create a flow Send the flow definition as YAML. This mirrors what you would define in the UI. ```javascript import client from './client.js'; // the shared client above async function createFlow() { const tenant = 'main'; const body = `id: my_flow namespace: my_namespace tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! 🚀 `; const created = await client.flowsApi.createFlow(tenant, body); console.log('Flow created:', created?.id ?? 'my_flow'); } createFlow().catch(console.error); ``` :::alert{type="info"} **Important** - `body` must be valid flow YAML. Invalid YAML or missing fields returns a `4xx`. - Ensure the correct `tenant` for multi-tenant setups. - The response contains the created flow (including metadata and source). ::: --- ## Update a flow Send the full YAML (including the same `id` and `namespace`) to update a flow. 
```javascript import 'dotenv/config'; import client from './client.js'; async function updateFlow() { const tenant = 'main'; const namespace = 'company.team'; const id = 'my_flow'; const body = `id: ${id} namespace: ${namespace} tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! with update 🚀 `; const updated = await client.flowsApi.updateFlow(namespace, id, tenant, body); console.log('Flow updated:', updated?.id ?? `${namespace}/${id}`); } updateFlow().catch(console.error); ``` :::alert{type="info"} **Tips** - Provide the **full** YAML on update; partial payloads are not merged. - Keep flow YAML in source control for versioning and code review. - Reuse the same `tenant`/`namespace`/`id` to target the correct flow. ::: --- ## Delete a flow Remove a flow by `namespace`/`id`/`tenant`. ```javascript import 'dotenv/config'; import client from './client.js'; async function deleteFlow() { const tenant = 'main'; const namespace = 'company.team'; const id = 'my_flow'; const deleted = await client.flowsApi.deleteFlow(namespace, id, tenant); console.log('Flow deleted:', deleted || 'No data returned'); } deleteFlow().catch(console.error); ``` :::alert{type="info"} **Notes** - Deleting a flow removes its definition; executions remain in history unless separately deleted. - Ensure you target the correct `tenant` before deleting. ::: --- ## Execute a flow Trigger an execution and optionally pass inputs, labels, or scheduling parameters. ```javascript import client from './client.js'; async function executeFlow() { const tenant = 'main'; const namespace = 'company.team'; const flowId = 'my_flow'; const wait = true; // set false for a non-blocking call const exec = await client.executionsApi.createExecution(namespace, flowId, wait, tenant); console.log('Execution started:', exec?.id ?? 'No data returned'); } executeFlow().catch(console.error); ``` :::alert{type="info"} **Notes** - `wait=true` blocks until the execution finishes (handy for tests/CLI). 
- You can also pass labels, schedule dates, breakpoints, variables, and inputs — see the method signature for optional parameters. - For multi-tenant setups, set the correct `tenant` value. ::: --- ## Delete an execution Delete an execution and optionally purge logs, metrics, and internal storage. ```javascript import client from './client.js'; async function deleteExecution() { const executionId = '6nN8Eqt0sq5gXJDj6NjfgO'; const tenant = 'main'; const opts = { deleteLogs: true, deleteMetrics: true, deleteStorage: true, }; const deleted = await client.executionsApi.deleteExecution(executionId, tenant, opts); console.log('Execution deleted:', deleted || 'No data returned'); } deleteExecution().catch(console.error); ``` :::alert{type="info"} **Notes** - Use the flags to remove associated logs/metrics/storage when needed. - Ensure you target the correct `tenant` and execution ID before deleting. ::: --- ## Follow (stream) an execution Stream execution events/logs for live feedback. ```javascript import client from './client.js'; async function followExecution() { const executionId = 'your-execution-id'; const tenant = 'main'; const stream = client.executionsApi.followExecution(executionId, tenant); stream.onmessage = (event) => { const data = JSON.parse(event.data || '{}'); if (!data || !data.state) return; // first message may be empty (keepalive) console.log(`Event: ${data.id} | Status: ${data.state.current}`); }; stream.onerror = (err) => { console.error('Stream error:', err); stream.close(); }; } followExecution().catch(console.error); ``` :::alert{type="info"} **Tips** - The first SSE payload is an empty keepalive — skip it before processing updates. - If you only need the final result, poll the execution by ID instead of streaming. - Add retry/backoff when streaming over unstable networks. ::: --- ## Best practices - **Reuse your client:** Create one `KestraClient` and share it across your app. 
- **Externalize config:** Keep URL/auth in env vars or your config system. - **Validate YAML:** Invalid flow YAML returns `422` responses. - **Automate:** Combine `createFlow` + `createExecution` for CI/CD pipelines. - **Label consistently:** Use labels for governance, search, and routing. --- # Python SDK for Kestra: Client Setup and Examples URL: https://kestra.io/docs/api-reference/kestra-sdk/python-sdk > Integrate Kestra with your Python applications. Learn to set up the Kestra Python SDK, configure the client, and programmatically create and execute workflows. Interact with Kestra's API via Python SDK. ## Use the Kestra Python SDK programmatically ## Install the Python SDK This guide demonstrates how to use the Kestra Python SDK to create and execute flows programmatically. Before starting, ensure your Kestra instance is running and accessible via the `KESTRA_HOST` environment variable. You can store credentials in an `.env` file: ```bash KESTRA_HOST=http://localhost:8080 KESTRA_USERNAME=admin@kestra.io KESTRA_PASSWORD=Admin1234 ``` ### Set up your environment Create a virtual environment and install the [Kestra Python SDK](https://github.com/kestra-io/client-sdk/blob/main/README_PYTHON_SDK.md). `kestrapy` is the core package. ```shell uv venv source .venv/bin/activate uv pip install kestrapy uv pip install python-dotenv # Optional: for loading .env variables automatically ``` :::alert{type="info"} **Tip:** Using `python-dotenv` allows you to store credentials securely and load them automatically when your script runs. ::: ### Configure the client Import and initialize the client with the credentials from your `.env` file: ```python import os from dotenv import load_dotenv from kestrapy import Configuration, KestraClient load_dotenv() # loads KESTRA_HOST, KESTRA_USERNAME, KESTRA_PASSWORD configuration = Configuration( host=os.getenv("KESTRA_HOST", "http://localhost:8080"), username=os.getenv("KESTRA_USERNAME"), password=os.getenv("KESTRA_PASSWORD") ) kestra_client = KestraClient(configuration) ``` :::alert{type="info"} **Notes:** - Use `.env` or environment variables for credentials (avoid hardcoding).
- Configure either **basic** or **token-based** authentication. - Reuse a single `KestraClient` instance throughout your application. ::: --- ## Create a flow Use the following Python script to create a simple flow with a `Sleep` task. This example uses the [`create_flow` method](https://github.com/kestra-io/client-sdk/blob/main/python-sdk/docs/FlowsApi.md#create_flow). ```python def create_flow(): tenant = "main" body = """ id: my_flow namespace: my_namespace tasks: - id: hello type: io.kestra.plugin.core.flow.Sleep duration: PT1S """ created = kestra_client.flows.create_flow(tenant=tenant, body=body) print(f"Flow created: {created.id}") ``` :::alert{type="info"} **Notes:** - `body` must be valid YAML for a Kestra flow. - If a flow with the same `id`, `namespace`, and `tenant` already exists, use `update_flow` instead. - The response contains metadata for the created flow. ::: --- ## Update a flow Use the following Python script to update an existing flow. ```python def update_flow(): tenant = "main" body = """ id: my_flow namespace: my_namespace tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "Updated message!" """ updated = kestra_client.flows.update_flow( id="my_flow", namespace="my_namespace", tenant=tenant, body=body ) print(f"Flow updated: {updated.id}") ``` :::alert{type="info"} **Notes:** - You must provide the same `id`, `namespace`, and `tenant` as the target flow. - Updating requires sending the full YAML, including all inputs, tasks, and metadata. - Invalid YAML or missing fields will return a `4xx` error. ::: --- ## Execute a flow Execute flows programmatically using the [`create_execution` method](https://github.com/kestra-io/client-sdk/blob/main/python-sdk/docs/ExecutionsApi.md#create_execution). 
```python def create_execution(): tenant = "main" execution = kestra_client.executions.create_execution( tenant=tenant, namespace="my_namespace", flow_id="my_flow", wait=True, inputs={"input_id": "value"} ) print(f"Execution started: {execution.id}") ``` :::alert{type="info"} **Notes:** - `wait=True` blocks the call until the execution completes. Use `wait=False` for asynchronous runs. - `inputs` correspond to the flow’s defined input parameters. - The response includes execution details and the unique execution ID. ::: --- ## Follow an execution You can stream live execution updates using the `follow_execution` method. ```python def follow_execution(): tenant = "main" execution = kestra_client.executions.create_execution( namespace="my_namespace", id="my_flow", wait=False, tenant=tenant ) for event in kestra_client.executions.follow_execution( execution_id=execution.id, tenant=tenant ): print(event.state.current) ``` :::alert{type="info"} **Notes:** - Use `follow_execution` to monitor running flows in real-time. - The stream yields execution state updates (e.g., RUNNING, SUCCESS, FAILED). - The first SSE payload is intentionally empty; it acts as a keepalive so you can ignore it before processing subsequent events. - Use this method in CI/CD, CLI tools, or real-time dashboards. ::: --- ## Best practices - **Reuse your client:** Initialize one `KestraClient` per application and share it. - **Avoid hardcoding credentials:** Use `.env` or environment variables. - **Validate YAML before submission:** Invalid syntax causes `422` responses. - **Automate your workflows:** Combine `create_flow` and `create_execution` for full CI/CD automation. --- # Open Source API Reference for Kestra URL: https://kestra.io/docs/api-reference/open-source > Detailed API documentation for Kestra Open Source edition, covering endpoints for flows, executions, and triggers. import ApiDoc from "~/components/content/ApiDoc.astro" API Reference of the Open-Source edition of Kestra. 
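For quick scripts, or environments where installing an SDK is not an option, the API can also be called directly over HTTP. Below is a minimal sketch using only the Python standard library; the `/api/v1/{tenant}/flows/{namespace}` path convention and the basic-auth credentials are illustrative assumptions to adapt to your instance:

```python
import base64
import json
import urllib.request


def flows_url(base_url: str, tenant: str, namespace: str) -> str:
    """Build the conventional URL for listing flows in a namespace."""
    return f"{base_url.rstrip('/')}/api/v1/{tenant}/flows/{namespace}"


def list_flows(base_url: str, tenant: str, namespace: str,
               username: str, password: str) -> list:
    """Fetch all flows in a namespace using HTTP basic auth."""
    request = urllib.request.Request(flows_url(base_url, tenant, namespace))
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

For example, `list_flows("http://localhost:8080", "main", "products", "admin@kestra.io", "Admin1234")` returns the flow definitions of the `products` namespace as parsed JSON.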
## Explore the Kestra Open Source API --- # Architecture in Kestra: Components and Deployment Models URL: https://kestra.io/docs/architecture > Overview of Kestra's Architecture. Explore the scalable, event-driven design connecting server components, storage, and external systems. Kestra's architecture is designed to be scalable and fault-tolerant. Depending on your needs, you can choose between two different architectures: **JDBC** and **Kafka**. ## Choose the right Kestra architecture The following diagram shows the main components of Kestra using the JDBC backend. ![Kestra JDBC Architecture](./jdbc.gif "Kestra Architecture") Here are the components and their interactions: 1. **JDBC Backend**: the data storage layer used for orchestration metadata 2. **Server**: the central part of the system, composed of: - [**Webserver**](./02.server-components/index.md#webserver): serves both the [API](../api-reference/index.mdx) and the [User Interface](../09.ui/index.mdx) - [**Scheduler**](./02.server-components/index.md#scheduler): schedules [workflows](../05.workflow-components/01.flow/index.md) and handles all [triggers](../05.workflow-components/07.triggers/index.mdx) except for the flow triggers (see below) - [**Executor**](./02.server-components/index.md#executor): responsible for the orchestration logic including [flow triggers](../05.workflow-components/07.triggers/02.flow-trigger/index.md) - [**Worker**](./02.server-components/index.md#worker): one or multiple processes that carry out the heavy computation of [runnable tasks](../05.workflow-components/01.tasks/01.runnable-tasks/index.md) and [Polling Triggers](../05.workflow-components/07.triggers/04.polling-trigger/index.md). For privacy reasons, workers are the only components that interact with the user's infrastructure, including the [Internal Storage](./data-components/index.md#internal-storage) and external services. 3. 
**User**: interacts with the system via [UI](../09.ui/index.mdx) and [API](../api-reference/index.mdx)
4. **User's Infrastructure**: private infrastructure components that are part of the user's environment, which Kestra interacts with:
   - [**Internal Storage**](./data-components/index.md#internal-storage): object storage system within the user's infrastructure (e.g. AWS S3, Google Cloud Storage, or Azure Blob Storage)
   - **External Services**: third-party APIs or services outside of Kestra which Workers might interact with to process data within a given task

The arrows indicate the direction of communication. The JDBC Backend connects to the Server, which in turn interacts with the User's Infrastructure. The User interacts with the system through the API and UI.

Depending on your database backend, the JDBC driver (for example, the [PostgreSQL JDBC Driver](https://jdbc.postgresql.org/documentation/ssl/#configuring-the-client)) can provide an encrypted connection with some configuration.

### Scalability with JDBC

The scalable design of the architecture allows you to run multiple instances of the [Webserver](./02.server-components/index.md#webserver), [Executor](./02.server-components/index.md#executor), [Worker](./02.server-components/index.md#worker), and [Scheduler](./02.server-components/index.md#scheduler) to handle increased load. As your workload increases, more instances of the required components can be added to the system to distribute the load and maintain performance.

The JDBC Backend can be scaled too, either through clustering or sharding, to handle larger volumes of data and a higher number of requests from the [Server components](./02.server-components/index.md). Most cloud providers offer managed database services that can be scaled up and down as needed.

## Architecture with Kafka and Elasticsearch backend

The following diagram shows the main components of Kestra using the [Kafka](https://kafka.apache.org/) and [Elasticsearch](https://www.elastic.co/elasticsearch) backend.
![Kestra OSS Architecture](./kafka.gif "Kestra Architecture") :::alert{type="info"} This architecture is only available in the [Enterprise Edition](../07.enterprise/01.overview/01.enterprise-edition/index.md) of Kestra. See [Open Source vs Enterprise](../oss-vs-paid/index.md) for a comparison of editions. ::: This architecture provides enhanced scalability, high availability, and fault tolerance for large-scale deployments. 1. **Kafka**: serves as the messaging backend, which communicates between different components of the system and provides scalability and fault tolerance 2. **Microservices**: This layer includes several services: - [**Webserver**](./02.server-components/index.md#webserver): serves the [API](../api-reference/index.mdx) and the [User Interface](../09.ui/index.mdx) for interaction with the system - [**Scheduler**](./02.server-components/index.md#scheduler): schedules [workflows](../05.workflow-components/01.flow/index.md) and processes all [triggers](../05.workflow-components/07.triggers/index.mdx) except for the flow triggers - [**Executor**](./02.server-components/index.md#executor): handles the orchestration logic, including [flow triggers](../05.workflow-components/07.triggers/02.flow-trigger/index.md) - [**Indexer**](./02.server-components/index.md#indexer): indexes data from Kafka to Elasticsearch for quick retrieval and search (optional component since [Kestra v0.20](../11.migration-guide/v0.20.0/elasticsearch-indexer/index.md)) - [**Worker**](./02.server-components/index.md#worker): runs [runnable tasks](../05.workflow-components/01.tasks/01.runnable-tasks/index.md) and interacts with the user's infrastructure 3. **User**: engages with the system through the Webserver's [UI](../09.ui/index.mdx) and [API](../api-reference/index.mdx) 4. **Elasticsearch**: acts as a search and UI backend, storing [logs](./data-components/index.md#logs), execution history, and enabling fast data retrieval 5. 
**User's Infrastructure**: private infrastructure components that are part of the user's environment, which Kestra interacts with: - [**Internal Storage**](./data-components/index.md#internal-storage): object storage system where user's data is stored (e.g. AWS S3, Google Cloud Storage, or Azure Blob Storage) - **External Services**: APIs or services that Workers might interact with during task processing ### Scalability with Kafka and Elasticsearch Kafka's messaging backend handles large volumes of data and scales horizontally. You can run multiple instances of Workers, Schedulers, Webservers, and Executors to distribute load, ensure fault tolerance, and maintain performance as demand increases. Elasticsearch contributes to scalability by providing a horizontally scalable UI backend that can efficiently search across large amounts of data. ## Comparison between JDBC and Kafka architectures The main difference between the **JDBC** and **Kafka** architectures is the data layer (_JDBC Database vs. Kafka and Elasticsearch_). :::alert{type="info"} You can use the [Enterprise Edition](../07.enterprise/01.overview/01.enterprise-edition/index.md) with a JDBC database backend for smaller deployments. It's often more practical to start with JDBC and migrate to Kafka and Elasticsearch as your deployment grows. ::: The **Worker** is the only component communicating with your private data sources to extract and transform data. The Worker also interacts with [**Internal Storage**](./data-components/index.md#internal-storage) to persist intermediary results and store the final task run outputs. All components of the **application layer** (including the Worker, Executor, and Scheduler) are decoupled and stateless, communicating with each other through the [**Queue**](./01.main-components/index.md#queue) (Kafka/JDBC). You can deploy and scale them independently. 
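Because the components are decoupled and stateless, each one can be launched, and replicated, as its own process. A deployment sketch using the Kestra CLI server subcommands (the exact processes and counts you run depend on your workload):

```shell
# Run each server component as an independent, horizontally scalable process
kestra server executor    # orchestration logic; scale for very high execution volumes
kestra server worker      # task execution; scale for compute-heavy workloads
kestra server scheduler   # trigger evaluation
kestra server webserver   # API and UI
```

Running several copies of the same subcommand on different hosts is how you scale that component horizontally.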
The **Webserver** communicates with the (Elasticsearch/JDBC) [Repository](./01.main-components/index.md#repository) to serve data for Kestra UI and API. The **data layer** is decoupled from the application layer and provides a separation between: - storing your private data processing artifacts — **Internal Storage** is used to store outputs of your executions; you can think of Internal Storage as your own private AWS S3 bucket - storing execution metadata — (Kafka/JDBC) [**Queue**](./01.main-components/index.md#queue) is used as the orchestration backend - storing logs and user-facing data — the (Elasticsearch/JDBC) [**Repository**](./01.main-components/index.md#repository) is used to store data needed to serve Kestra UI and API. The Indexer, available only in the [Enterprise Edition](../07.enterprise/01.overview/01.enterprise-edition/index.md), indexes content from Kafka topics (_such as the flows and executions topics_) to the Elasticsearch repositories. Because the Queue and Repository are separate in the Kafka architecture, executions continue even if Elasticsearch experiences downtime. ## Components in detail The following sections provide more details about the components of the architecture. import ChildCard from "~/components/docs/ChildCard.astro" --- # Data Storage Components in Kestra Architecture URL: https://kestra.io/docs/architecture/data-components > Dive into Kestra's Data Architecture. Learn how inputs, outputs, logs, and metadata are stored across Repositories and Internal Storage systems. Understand where different data components ([inputs](../../05.workflow-components/05.inputs/index.md), [outputs](../../05.workflow-components/06.outputs/index.md), logs, and more) are stored in Kestra’s architecture. Kestra processes and stores a variety of data, including [flow definitions](../../05.workflow-components/01.flow/index.md), workflow inputs, outputs, logs, execution metadata, and more. 
Understanding how these components are stored helps optimize performance, configure persistence, and integrate with external storage systems. Kestra data is stored in either the [repository](../01.main-components/index.md#repository), such as PostgreSQL, or in [internal storage](../data-components/index.md#internal-storage). By default, internal storage is local, but you can configure it to use services like [AWS S3](https://aws.amazon.com/s3/) or [MinIO](https://min.io/). :::alert{type="info"} See [Kestra architecture](../../08.architecture/03.deployment-architecture/index.md) and [internal storage](../data-components/index.md#internal-storage) for more details. ::: ## Data storage components The table below outlines key data components, where they are stored, and their purpose. | Data component | Storage location | Description | |--------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------| --- | | **Flows & definitions** | Repository | Stores flows, tasks, and their configurations. | | **Namespaces** | Repository | Organizes workflows and manages secrets, plugin defaults, and variables. | | **Namespace files** | Internal storage | Stores code and configuration files in Kestra’s storage backend. | | **Executions & metadata** | Repository | Stores execution details including status, timestamps, and metadata. | | **Inputs** | Internal storage | Stores files provided as inputs to a flow execution. | | **Input files** | Internal storage | Stores additional files for script or CLI tasks. | | **Outputs** | Internal storage | Stores outputs from tasks, separate from the database. | | **Output files** | Internal storage | Stores generated files for download and reuse in downstream tasks. | | **Key-value pairs** | Internal storage & repository (metadata only) | KV store holds data in key-value format. 
Metadata is recorded in the repository. | | **Logs & [audit logs](../../07.enterprise/02.governance/06.audit-logs/index.md) (Enterprise Edition)** | Repository | Stores logs generated by tasks. | | **Task state & variables** | Repository | Stores dynamic variables and task states during executions. | | **Secrets** | Repository or external [secret manager](../../07.enterprise/02.governance/secrets-manager/index.md) | Stores secrets internally or integrates with services like AWS Secrets Manager, Vault, or Google Secret Manager. | | **Queues** | Repository or Kafka | Handles internal communication between Kestra components. | | **Triggers** | Repository | Stores definitions of event-based triggers. | | **User administration** | Repository | Stores RBAC, user management, and related metadata. | ## Internal storage **Internal storage** is used for handling files during executions. It ensures efficient input and output management without burdening the database. - **Purpose**: Handles inputs, outputs, temporary execution data, and artifacts such as [namespace files](../../06.concepts/02.namespace-files/index.md). - **KV store**: Stores key-value pairs in internal storage, with metadata in the repository. Metadata includes the key, URI, TTL, and timestamps. - **Backends**: By default, Kestra uses local storage, but for production you can configure cloud storage such as: - [AWS S3](https://aws.amazon.com/s3/) - [Google Cloud Storage](https://cloud.google.com/storage) - [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/) - [MinIO](https://min.io/) - Any S3-compatible service ### Configuring internal storage Example `docker-compose.yaml` configuration for AWS S3: ```yaml kestra: storage: type: s3 bucket: "kestra-internal-storage" region: "us-east-1" ``` For full details, see [internal storage configuration](../../08.architecture/data-components/index.md#internal-storage). 
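For a self-hosted S3-compatible backend, the configuration follows the same shape. A sketch for MinIO (key names follow the MinIO storage plugin; verify them against your Kestra version):

```yaml
kestra:
  storage:
    type: minio
    minio:
      endpoint: http://minio.internal
      port: 9000
      accessKey: my-access-key
      secretKey: my-secret-key
      region: us-east-1
      bucket: kestra-internal-storage
```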
## Additional information

### Flows and execution metadata

- Stored in **PostgreSQL**, **MySQL**, or **H2** (not recommended for distributed components).
- Includes:
  - Flow definitions
  - Execution details
  - Execution queues
  - Historical metadata
- Accessible via the [Kestra API](../../api-reference/index.mdx) and [UI](../../09.ui/index.mdx).

### Logs

- **Open source**: Logs are stored in the database.
- **Enterprise Edition**: Supports Elasticsearch as a log backend, in addition to the database.
- Audit logs are stored in the repository.
- Logs can be accessed through the API, UI, or external logging integrations such as the [log shipper](../../07.enterprise/02.governance/logshipper/index.md).

### Queues

- **Open source**: Stored in the database.
- **Enterprise Edition**: Can use Kafka for inter-component messaging.

### Secrets management

Secrets can be stored in Kestra’s database or in external managers like [AWS Secrets Manager](https://aws.amazon.com/secrets-manager/), [Google Secret Manager](https://cloud.google.com/secret-manager), [Azure Key Vault](https://azure.microsoft.com/products/key-vault/), or [HashiCorp Vault](https://developer.hashicorp.com/vault/docs/secrets/kv/kv-v2).

Example configuration for AWS Secrets Manager:

```yaml
kestra:
  secret:
    type: aws-secret-manager
    aws-secret-manager:
      access-key-id: my-access-key
      secret-key-id: my-secret-key
      session-token: my-session-token
      region: us-east-1
```

See [secret managers](../../07.enterprise/02.governance/secrets-manager/index.md) for more.

### Database maintenance

Use [purge tasks](../../10.administrator-guide/purge/index.md) to free up storage and maintain performance as databases accumulate execution and log data.
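As an illustration, such a purge can run as a scheduled system flow. A sketch using the core purge and schedule plugins (the task type, properties, and 30-day retention window are assumptions to check against the purge documentation):

```yaml
id: purge_old_data
namespace: system
tasks:
  - id: purge_executions
    type: io.kestra.plugin.core.execution.PurgeExecutions
    # Delete executions older than 30 days
    endDate: "{{ now() | dateAdd(-30, 'DAYS') }}"
triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 3 * * *"
```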
To further separate data across business units or environments, see the [governance features](../../07.enterprise/02.governance/index.mdx) in the [Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md), including [tenants](../../07.enterprise/02.governance/tenants/index.md). --- # Deployment Architectures in Kestra: JDBC and Kafka URL: https://kestra.io/docs/architecture/deployment-architecture > Choose your Kestra deployment architecture. Compare Standalone (JDBC), Medium (Database), and High-Availability (Kafka & Elasticsearch) models. Examples of deployment architectures, depending on your needs. Kestra is a Java application distributed as an executable. It supports multiple deployment options: - [Docker](../../02.installation/02.docker/index.md) - [Kubernetes](../../02.installation/03.kubernetes/index.md) - Manual deployment Kestra’s plugin system allows you to choose the dependency types that best match your requirements. Below are three common deployment architectures. ## Small-sized deployment ![Kestra Standalone Architecture](./archi-diagram-small.png "Kestra Standalone Architecture") For small-scale deployments, you can use the Kestra **standalone server**, which runs all server components in a single process. This architecture has no scaling capability. In this setup, a database is the only dependency, minimizing the stack to maintain. Supported databases include: - PostgreSQL - MySQL - H2 ## Medium-sized deployment ![Kestra Architecture](./archi-diagram-medium-sized-deployement.png "Kestra Architecture") For medium-scale deployments, where high availability is not required, Kestra can be run with a relational database (Postgres or MySQL) as the only dependency. H2 is not recommended in distributed setups. 
- Supported databases: PostgreSQL and MySQL - All server components communicate through the database In this mode, if components are distributed across multiple hosts, you must use a shared [internal storage](../data-components/index.md#internal-storage) implementation such as [Google Cloud Storage](../../02.installation/09.gcp-vm/index.md), [AWS S3](../../02.installation/08.aws-ec2/index.md), or [Azure Blob Storage](../../02.installation/10.azure-vm/index.md). ## High-availability deployment ![Kestra High Availability Architecture](./archi-diagram.png "Kestra High Availability Architecture") For high throughput and full horizontal and vertical scaling, the database is replaced with Kafka and Elasticsearch. This architecture removes single points of failure and enables scaling of all server components. - Dependencies: Kafka and Elasticsearch - Available only in the [Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md) As with medium deployments, a distributed [internal storage](../data-components/index.md#internal-storage) solution is required if components run on different hosts. ### Kafka [Kafka](https://kafka.apache.org/) is the backbone of high availability mode, powering communication and scalability. #### Kafka executor The [executor](../02.server-components/index.md#executor) runs as a [Kafka Streams](https://kafka.apache.org/documentation/streams/) application. It: - Processes all events from Kafka in order - Maintains the internal state of executions - Merges task run results from [workers](../02.server-components/index.md#worker) - Detects failed workers and resubmits their tasks Executors scale horizontally within the limits of Kafka partitions. Since executors perform lightweight operations, they typically require minimal resources unless handling very high execution volumes. 
#### Kafka worker The [worker](../02.server-components/index.md#worker) runs as a [Kafka consumer](https://kafka.apache.org/documentation/#consumerapi). It: - Processes tasks assigned by executors - Runs tasks in an internal thread pool - Scales horizontally, with multiple instances across servers If a worker fails, the executor detects it and resubmits the tasks to another available worker. ### Elasticsearch [Elasticsearch](https://www.elastic.co/elasticsearch) acts as the database for Kestra’s [webserver](../02.server-components/index.md#webserver), providing fast search, aggregation, and retrieval of flows, executions, and logs. It is only required in high availability mode and is used exclusively by the [API and UI](../../09.ui/index.mdx). --- # Main Components of Kestra Architecture URL: https://kestra.io/docs/architecture/main-components > Understand Kestra's core architecture. Dive into main components like the Repository, Queue, Internal Storage, and Plugin system. Technical overview of Kestra’s main components: internal storage, queue, repository, and plugins. Kestra relies on the following internal components: - **Internal storage**: stores flow data such as task outputs and flow inputs. - **Queue**: enables internal communication between Kestra server components. - **Repository**: persists flows, templates, executions, logs, and all other internal objects. - **Plugins**: extend Kestra’s core with additional task and trigger types, storage implementations, and data transformations. Each component has multiple implementations depending on deployment architecture. Some require additional plugins. ## Internal storage The **internal storage** is a dedicated system that handles files of any size during flow executions. It manages both inputs and outputs, enabling scalable file sharing between tasks. 
### Purpose Internal storage is used to: - Save files generated during a [flow’s execution](../../05.workflow-components/03.execution/index.md) and pass them between tasks via [outputs](../../05.workflow-components/06.outputs/index.md). - Automatically persist [flow inputs](../../05.workflow-components/05.inputs/index.md) of type `FILE`. - Provide download links for stored files in the **Outputs** tab of an execution. Files can be retrieved in the execution context using `{{ outputs.task_id.output_attribute }}` (often the `uri` property). Kestra fetches the file automatically when referenced. Execution metadata — including storage file paths — is recorded in the **repository**. ### Storage types By default, Kestra uses **local storage**, which stores files on the host filesystem. This option is simple but not scalable and is usually not recommended for production (unless for standalone deployments). :::alert{type="warning"} Local storage behavior differs between standalone and distributed deployments: - ✅ **Standalone**: Local storage with persistent volumes is OK - ❌ **Distributed with ReadWriteOnce**: NOT recommended for distributed services - ✅ **Distributed with ReadWriteMany**: OK for distributed services (rarely available) - ❌ **Host storage sharing**: NOT recommended — difficult to achieve reliably When `ReadWriteMany` is unavailable, use cloud storage (S3, GCS, Azure) or distributed object storage (MinIO, Ceph, SeaweedFS, Garage). ::: Scalable alternatives are available as plugins: - [Storage MinIO](https://github.com/kestra-io/storage-minio) — supports [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/), and other S3-compatible systems. - [Storage GCS](https://github.com/kestra-io/storage-gcs) — for [Google Cloud Storage](https://cloud.google.com/storage). - [Storage Azure](https://github.com/kestra-io/storage-azure) — for [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). 
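The output-passing mechanism described above can be illustrated with a two-task flow: the first task writes a file to internal storage, and the second references it through the `uri` output (plugin types and properties are examples to verify against the plugin reference):

```yaml
id: pass_file_between_tasks
namespace: demo
tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://example.com/data.csv   # fetched file lands in internal storage
  - id: log_location
    type: io.kestra.plugin.core.log.Log
    # `uri` points at the file in internal storage, not a local path
    message: "File stored at {{ outputs.download.uri }}"
```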
For details, see [Runtime and Storage](../../configuration/02.runtime-and-storage/index.md). ## Queue The **queue** is used internally for communication between Kestra’s server components. Each repository type has a matching queue implementation: - **In-memory queue** — must be used with the in-memory repository. - **Database queue** — must be used with the database repository. - **Kafka queue** — must be used with the Elasticsearch repository. **Only available in the [Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md).** ## Repository The **repository** persists all internal objects, including flows, executions, logs, and templates. Each repository type must be paired with its corresponding queue: - **In-memory repository** — must be used with the in-memory queue. - **Database repository** — must be used with the database queue. - **Elasticsearch repository** — must be used with the Kafka queue. **Only available in the [Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md).** ## Plugins Kestra’s core only provides basic functionality. A [plugin ecosystem](/plugins) extends the platform with: - New task and trigger types. - Alternative implementations of core components (e.g., storage backends). - Integrations with external systems and data transformation utilities. A wide range of plugins is already available, and the ecosystem continues to expand. --- # Multi-Tenancy in Kestra: Tenant Isolation Model URL: https://kestra.io/docs/architecture/multi-tenancy > Understand Kestra's Multi-tenancy architecture. Learn how tenant isolation works for flows, data, and resources in a single Enterprise instance. Multi-tenancy allows you to manage **multiple environments** (e.g., dev, staging, prod) in a single Kestra instance. Multi-tenancy is a software architecture in which a single instance of software serves multiple tenants. 
You can think of it as running multiple virtual instances in a single physical instance.

You can use multi-tenancy to **separate resources** between business units, teams, or customers. This feature requires the [Enterprise Edition](../../07.enterprise/index.mdx).

## How does multi-tenancy work in Kestra

Multi-tenancy is enabled by default and cannot be disabled. All resources (such as flows, triggers, executions, RBAC, and more) are isolated by the tenant. This means that you can have a flow with the same identifier and the same namespace in multiple tenants at the same time. Data stored inside the [Internal Storage](../data-components/index.md#internal-storage) is also isolated by tenants.

End-users can use the tenant selection dropdown menu from the [UI](../../09.ui/index.mdx) to see tenants they have access to. Users can switch between tenants from this dropdown. Each UI page also includes the tenant ID in the URL (e.g., `https://demo.kestra.io/ui/yourTenantId/executions/namespace/flow/executionId`).

![Tenants selection dropdown](./tenants-select.png "Tenants selection dropdown")

Most [API](../../api-reference/index.mdx) endpoints also include the tenant identifier. The exception is instance-level endpoints such as `/configs`, `/license-info`, or `/banners`, which require Superadmin access. For example, the URL of the API operation to list flows of the `products` namespace is `/api/v1/{your_tenant_id}/flows/products`. See the [Enterprise Edition API Guide](../../api-reference/01.enterprise/index.mdx) for details.

:::alert{type="warning"}
Tenants must be created upfront, and a user needs to be granted access to use a specific tenant.
:::

---

# Server Components in Kestra Architecture Explained
URL: https://kestra.io/docs/architecture/server-components

> Explore Kestra server components. Learn about the Executor, Worker, Scheduler, Webserver, and Indexer roles in the orchestration engine.

Detailed breakdown of the server components behind Kestra.
Kestra consists of multiple server components that can be scaled independently. Each server component interacts with internal components ([Internal Storage](../data-components/index.md#internal-storage), [Queue](../01.main-components/index.md#queue), and [Repository](../01.main-components/index.md#repository)). ## Executor The **Executor** is a lightweight server component responsible for processing all executions and orchestrating the next tasks to run. It does not perform heavy computations itself, instead deferring actual task execution to [Workers](#worker). The Executor plays a central role in coordinating workflows based on the information it receives from the [Scheduler](#scheduler) and the [Queue](../01.main-components/index.md#queue). It handles specific types of tasks, such as: - [Flowable Tasks](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md) - [Flow Triggers](../../05.workflow-components/07.triggers/02.flow-trigger/index.md) - Templates *(deprecated)* - Listeners *(deprecated)* Although the Executor oversees all executions, it never interacts directly with your data. Because of its low resource usage, the Executor rarely needs to be scaled. However, in deployments with a very large number of executions, you can scale Executors horizontally to meet demand. ## Worker The **Worker** is a server component responsible for executing all [runnable tasks](../../05.workflow-components/01.tasks/01.runnable-tasks/index.md) and [Polling Triggers](../../05.workflow-components/07.triggers/04.polling-trigger/index.md). These are received from the [Executor](#executor) and the [Scheduler](#scheduler), respectively. Workers are highly configurable and designed to handle a wide range of workloads — from simple API calls to heavy computational tasks. Internally, each Worker functions as a configurable thread pool, allowing you to define the number of threads per instance based on your workload requirements. 
You can deploy multiple Worker instances across different servers to scale horizontally. This flexibility enables efficient handling of parallel executions, especially in high-throughput environments. Because Workers directly execute tasks and triggers, they are the **only** server components that require access to external systems — such as databases, REST APIs, message brokers, and any other services your flows interact with. :::alert{type="info"} Looking for runtime status? The **Instance – Services** view shows live health for each component. See [Instance – services](../../07.enterprise/05.instance/index.mdx#services). ::: ## Worker Group (EE) In the [Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md), [Worker Groups](../../07.enterprise/04.scalability/worker-group/index.md) allow tasks and [Polling Triggers](../../05.workflow-components/07.triggers/04.polling-trigger/index.md) to be executed on specific worker sets. They can be beneficial in various scenarios, such as using compute instances with GPUs, executing tasks on a specific OS, restricting backend access, and region-specific execution. A default worker group is recommended per [tenant](../10.multi-tenancy/index.md) or namespace. To specify a worker group for a task, use the `workerGroup.key` property in the task definition to point the task to a specific worker group key. If no worker group is specified, the task will be executed on the default worker group. :::alert{type="info"} Worker Groups are available in Kestra Enterprise Edition only, not in Kestra Cloud. ::: ## Scheduler The **Scheduler** is a server component responsible for managing all [triggers](../../05.workflow-components/07.triggers/index.mdx) — except for [Flow Triggers](../../05.workflow-components/07.triggers/02.flow-trigger/index.md), which are handled by the [Executor](#executor). The Scheduler continuously evaluates trigger conditions and determines when a flow should start. 
When a trigger is satisfied, the Scheduler submits the flow to the Executor for execution. For [Polling Triggers](../../05.workflow-components/07.triggers/04.polling-trigger/index.md), the Scheduler checks them at their configured evaluation interval. If the polling conditions are met, it sends the execution — along with trigger metadata — to the [Worker](#worker) for execution. Polling Triggers have specific constraints: - They cannot be evaluated concurrently. - They cannot be reevaluated while a previous execution from the same trigger is still running. Internally, the Scheduler checks every second to determine whether any trigger needs evaluation. :::alert{type="info"} By default, Kestra handles all date and time values using your system's timezone. You can override this behavior using [JVM options](../../configuration/02.runtime-and-storage/index.md) ::: ## Indexer The **Indexer** is responsible for reading content from Kafka topics — such as flows and executions — and indexing it into Elasticsearch. This component enables [low-latency querying](../../11.migration-guide/v0.20.0/elasticsearch-indexer/index.md) when using Kafka and Elasticsearch together. By default, the Indexer runs as part of the [Web Server](#webserver). However, you can choose to run the Web Server independently without the Indexer by using the `server webserver --no-indexer` CLI option. The Indexer is required for deployments that rely on Kafka and Elasticsearch, particularly in **Kestra Enterprise Edition** and **Kestra Cloud**. ## Webserver The **Webserver** is the entry point for all external communications with Kestra. It is responsible for serving both the [User Interface (UI)](../../09.ui/index.mdx) and the [REST API](../../api-reference/index.mdx). It consists of two main modules: - **API**: Exposes all [REST endpoints](../../api-reference/index.mdx) for interacting with Kestra — including triggering executions, retrieving flow data, managing tasks, and more. 
- **UI**: Serves the [Kestra web interface](../../09.ui/index.mdx), enabling users to design, monitor, and manage workflows visually. The Webserver primarily interacts with the [Repository](../01.main-components/index.md#repository) to serve content through the API and UI. It also connects to the [Queue](../01.main-components/index.md#queue) to submit new executions and stream real-time updates on flow progress. :::alert{type="info"} As long as the [Queue](../01.main-components/index.md#queue) is operational, most server components — including the Webserver — will continue to function. While the Repository is essential for rendering the UI, workloads can still be processed even if the Repository is temporarily unavailable. ::: --- # Workflow Best Practices in Kestra: Design & Patterns URL: https://kestra.io/docs/best-practices > Best practices for building reliable workflows in Kestra, including guidance on choosing the right patterns and making sound design decisions. import ChildCard from "~/components/docs/ChildCard.astro" Best practices for building reliable workflows in Kestra, including guidance on choosing the right patterns for common design decisions. ## Apply best practices for Kestra workflows Kestra often provides multiple ways to achieve the same outcome. This section helps you choose the approach that best fits your use case, with guidance on design decisions, implementation patterns, and tradeoffs. --- # Business Unit Separation in Kestra Enterprise URL: https://kestra.io/docs/best-practices/business-unit-separation > Strategies for isolating business units in Kestra Enterprise using Tenants and Namespaces for security and governance. Kestra Enterprise provides two primary levels of isolation within an instance: tenants and namespaces. 
## Separate business units with tenants and namespaces Choosing between tenants and namespaces for separating business units depends on the required level of data isolation, access control, and visibility into cross-workflow dependencies. ## When to use multiple tenants A [tenant](../../07.enterprise/02.governance/tenants/index.md) is an **isolated environment** within a Kestra instance. Tenants have their own **fully isolated resources**, including flows, RBAC policies, secrets, variables, plugin defaults, and more. Users exist globally across the instance but can have different roles and permissions per tenant. You can configure dedicated resources for each tenant: - Dedicated internal storage (e.g., a separate S3 bucket per tenant) - Dedicated secrets manager backend (e.g., a separate Vault or AWS Secrets Manager per tenant) - Dedicated worker groups (e.g., a pool of workers used exclusively by a specific tenant) - Flows, executions, and logs are **isolated** between tenants by default :::alert{type="info"} Worker groups are not yet available in Kestra Cloud; they are supported only in Kestra Enterprise Edition. ::: ### Use cases for tenants - **Customer separation**: For multi-tenant SaaS setups requiring strict isolation between customers, assign each customer its own tenant rather than a dedicated namespace. - **Fully isolated teams**: When even administrators should not have visibility into other teams' workflows, tenants provide the highest level of isolation. **Note:** Since tenants are **fully isolated**, there is no cross-tenant visibility. If you need to share flows (e.g., team A runs a subflow from team B) or manage interdependent workflows, namespaces are a better option. ## When to use multiple namespaces A [namespace](../../07.enterprise/02.governance/07.namespace-management/index.md) is a logical grouping within a tenant that organizes teams and projects while providing fine-grained access control. 
Unlike tenants, namespaces allow cross-visibility of flows and dependencies. You can configure dedicated resources for each namespace: - Dedicated internal storage (e.g., a separate S3 bucket per namespace) - Dedicated secrets manager backend (e.g., a separate Vault or AWS Secrets Manager per namespace) - Dedicated worker groups (e.g., a pool of workers used exclusively by a specific namespace) - Flows, executions, and logs are **shared** across namespaces; isolation is managed through RBAC permissions ### Use cases for namespaces - **Team-based organization**: Separate flows and resources by team within the same tenant, maintaining visibility for users with appropriate permissions. - **Project-based organization**: Create separate namespaces for projects that need limited isolation while retaining workflow visibility. - **Dependency management**: Namespaces support cross-team dependencies (e.g., subflows or triggers), simplifying dependency tracking. - **RBAC control**: Namespaces allow granular role-based access. A user might have `READ` access in one namespace and full CRUD permissions in another. ## Summary of when to use tenants vs. 
namespaces | Feature | Tenant | Namespace | |------------------------------|------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------| | **Full isolation** | ✅ Yes | ❌ No (unless configured via RBAC) | | **Cross-visibility** | ❌ No (tenants don't share flows) | ✅ Yes (namespaces can share flows and dependencies) | | **RBAC control** | ✅ Yes (separate roles per tenant) | ✅ Yes (roles can be restricted to namespaces) | | **Secrets manager backend** | ✅ Optional dedicated backend per tenant | ✅ Optional dedicated backend per namespace | | **Internal storage backend** | ✅ Optional dedicated backend per tenant | ✅ Optional dedicated backend per namespace | | **Best for** | Customer separation or strict business unit isolation | Team or project isolation with centralized governance and shared dependencies | ## Recommendations - Use **tenants** when strict isolation is required, and no cross-team dependencies exist. - Use **namespaces** to organize business units or teams that require centralized governance and cross-team visibility. - Configure **dedicated secrets and internal storage backends** at the lowest necessary level (tenant or namespace) to follow the principle of least privilege and simplify management (e.g., applying S3 lifecycle policies per team). - Prefer **namespace isolation** over tenant isolation when workflows depend on each other to maintain visibility across dependencies. For a hands-on guide on how to use RBAC to separate business units at the namespace level, see the video below:
--- # Choosing Where to Store Sensitive and Shared Values URL: https://kestra.io/docs/best-practices/credentials-vs-secrets-vs-kv-store > Learn how to choose between Kestra credentials, secrets, and the KV Store for authentication, sensitive values, and runtime state. How to choose between credentials, secrets, and the KV Store for sensitive values, authentication, and shared state in Kestra. ## Choose the right place for sensitive values and shared state Kestra gives you multiple ways to manage authentication and sensitive values: - [credentials](../../07.enterprise/03.auth/credentials/index.md) - [secrets](../../06.concepts/04.secret/index.md) - the [KV Store](../../06.concepts/05.kv-store/index.md) These options serve different purposes. Choosing the right one improves security, reduces duplication, and makes workflows easier to maintain. ## Quick recommendation Use this rule of thumb: - Use **credentials** for reusable server-to-server authentication where Kestra should mint or refresh short-lived access tokens for you. - Use **secrets** for protected static values such as API keys, passwords, client secrets, private keys, and connection strings. - Use the **KV Store** for dynamic values that must be created, updated, or read at runtime across executions or flows. ## Comparison table | If you need to store... 
| Prefer | Why | | --- | --- | --- | | A reusable OAuth2 or GitHub App authentication object | Credentials | Kestra can mint and refresh access tokens at runtime | | A password, API key, client secret, private key, or connection string | Secrets | Sensitive static values should be protected and not stored in flow code | | A value created by one flow and reused later by another flow | KV Store | It is designed for runtime state shared across flows | | A value that changes during execution and must be updated programmatically | KV Store | Flows can read and write KV pairs dynamically | | Sensitive material used by a credential, such as a client secret or private key | Secrets | Credentials should reference secrets rather than embed raw secret values | | A non-sensitive setting such as region, endpoint, or bucket name | Task properties, [variables](../../05.workflow-components/04.variables/index.md), or [plugin defaults](../../05.workflow-components/09.plugin-defaults/index.md) | These are configuration values, not authentication objects | ## When to use credentials Use credentials when you need a reusable server-to-server authentication object that Kestra manages for you. Credentials are designed for authentication patterns such as: - OAuth2 `client_credentials` - OAuth2 JWT Bearer - OAuth2 `private_key_jwt` - GitHub App authentication Why credentials are the right choice here: - They let you define the authentication object once and reuse it across tasks. - Kestra can mint and refresh access tokens at runtime. - Workflows only need the current access token via `{{ credential('name') }}`. - Sensitive material used by the credential can stay in secrets. Best practice: - Use credentials for managed authentication objects, not as a generic replacement for secrets. - Scope credentials at the namespace or [tenant](../../07.enterprise/02.governance/tenants/index.md) level based on the required reuse. 
For multi-team setups, align that scope with your [namespace management](../../07.enterprise/02.governance/07.namespace-management/index.md) model. - Reference secrets inside credentials for client secrets, private keys, or certificates. For more details, see [Credentials](../../07.enterprise/03.auth/credentials/index.md). ## When to use secrets Use secrets when the value is sensitive and should not be committed to source control or exposed broadly in flow definitions. Typical examples: - database passwords - API keys - cloud access keys - OAuth client secrets - SSH private keys - service account JSON - webhook secrets Why secrets are the default for sensitive values: - They keep sensitive data out of flow YAML. - They can be managed centrally, including through an [external secrets manager](../../07.enterprise/02.governance/secrets-manager/index.md). - They reduce accidental exposure when flows are shared across teams. - They are the right source for secret values referenced by credentials and task properties. Best practice: - Store secrets at the lowest [namespace](../../07.enterprise/02.governance/07.namespace-management/index.md) level that still supports the required reuse. - Avoid placing broadly scoped secrets at the root [namespace](../../07.enterprise/02.governance/07.namespace-management/index.md) unless they truly need to be inherited everywhere. If you need inheritance without allowing downstream edits, consider [read-only secrets](../../07.enterprise/02.governance/read-only-secrets/index.md). - Avoid logging secrets or transforming them in ways that could bypass masking. For more details, see [Secrets](../../06.concepts/04.secret/index.md) and [Best Practices for Secrets in Kestra](../9.secrets-management/index.md). ## When to use the KV Store Use the KV Store for dynamic state, not for secret management. 
The KV Store is a good fit when a value: - is created during an execution - must be updated by flows - needs to be read later by other flows or later executions - represents state rather than static configuration Typical examples: - checkpoints - cursors and offsets - last processed timestamp - feature flags or runtime switches managed by workflows - identifiers of external resources created by one execution and reused later Why the KV Store is usually the wrong place for secrets: - Secrets and credentials are usually static or centrally managed, while KV pairs are designed for runtime mutation. - Flows can update KV pairs, which increases the risk of accidental overwrite for sensitive data. - The KV Store is better suited to application state than authentication material. If you need a value to be both sensitive and mutable, stop and review the design carefully. In most cases, that indicates a workflow state problem or a secret-lifecycle problem that should be solved more explicitly. ## Recommended patterns ### Pattern 1: Credential plus secrets Use a credential for the authentication flow and secrets for the sensitive material that backs it. Example: - OAuth2 credential stored in Kestra - `client_secret` or private key referenced from a secret This is the preferred pattern when Kestra should mint or refresh tokens for you. ### Pattern 2: Secrets plus non-sensitive configuration Use secrets for the confidential part and task properties, [variables](../../05.workflow-components/04.variables/index.md), or [plugin defaults](../../05.workflow-components/09.plugin-defaults/index.md) for the rest. Example: - `password` from `secret('DB_PASSWORD')` - `host`, `port`, and `database` from [variables](../../05.workflow-components/04.variables/index.md) or [plugin defaults](../../05.workflow-components/09.plugin-defaults/index.md) ### Pattern 3: Secret plus KV Store Use secrets for authentication and the KV Store for runtime state. 
Example: - API key from a secret - `last_processed_cursor` from the KV Store This is common in [polling triggers](../../05.workflow-components/07.triggers/04.polling-trigger/index.md), ingestion, and synchronization flows. ### Pattern 4: Plugin defaults plus secrets Use [plugin defaults](../../05.workflow-components/09.plugin-defaults/index.md) to centralize repeated connection settings, while referencing secrets for the sensitive fields. This is often the cleanest approach for large teams because it reduces duplication without putting secret material in the flow body. ## Anti-patterns Avoid these patterns: - Storing passwords, API keys, or client secrets directly in flow YAML - Using secrets where a managed credential would better handle token minting and refresh - Using the KV Store as the default place for secret material - Putting broad secrets at the root namespace when only one team or project needs them - Mixing static configuration, secret material, and mutable runtime state without a clear reason ## Decision guide Ask these questions: 1. Is the value sensitive? If yes, start with **secrets**. 2. Do you need Kestra to mint or refresh an access token for a supported authentication flow? If yes, use **credentials**, backed by **secrets** for the sensitive inputs. 3. Does the value need to be updated dynamically by flows? If yes, consider the **KV Store**. 4. Is the value just a protected static value such as an API key, password, or private key? If yes, use **secrets**. 5. Is the value stable non-sensitive configuration reused across many tasks? If yes, consider [**plugin defaults**](../../05.workflow-components/09.plugin-defaults/index.md), [variables](../../05.workflow-components/04.variables/index.md), or [namespace-level configuration](../../07.enterprise/02.governance/07.namespace-management/index.md). ## Summary - **Credentials** are for reusable managed authentication objects that mint or refresh tokens. 
- **Secrets** are for sensitive static values and for secret inputs referenced by credentials. - **KV Store** is for dynamic runtime state shared across flows or executions. In most cases, the right answer is not one feature alone, but a combination: - credentials for token-based authentication - secrets for sensitive inputs - [plugin defaults](../../05.workflow-components/09.plugin-defaults/index.md) or [variables](../../05.workflow-components/04.variables/index.md) for non-sensitive configuration - KV Store for changing state --- # Expressions with Namespace Files in Kestra URL: https://kestra.io/docs/best-practices/expressions-with-namespace-files > Learn how to pass Kestra expressions to scripts stored in Namespace Files using environment variables or CLI arguments. Learn how to pass expressions to Namespace Files. ## Use expressions with namespace files You can write scripts inline in your flow and include expressions within them. The following example shows a flow that contains an expression in the inline script: ```yaml id: expressions_inline namespace: company.team inputs: - id: uri type: URI defaults: https://www.google.com/ tasks: - id: inline_script type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest script: | import requests url = "{{ inputs.uri }}" response = requests.get(url) if response.status_code == 200: print(response.text) else: print(f"Failed to retrieve the webpage. Status code: {response.status_code}") ``` This approach is convenient for scripts specific to a flow, but it does not allow the use of separate files for your code. Using separate files has several benefits: - Multiple files can be used, which is common in larger projects. - Long code blocks are easier to maintain when separated from the workflow. - Files can be written and tested locally, then synced to Kestra through Git. 
- The same files can be reused across multiple flows, avoiding code duplication. You cannot directly use [expressions](../../expressions/index.mdx) inside [Namespace Files](../../06.concepts/02.namespace-files/index.md), as they will not be rendered or executed outside of Kestra. With that being said, you can use the combination of the `render()` and `read()` functions in script tasks like in the following example (see the `render()` function [migration guide](../../11.migration-guide/v0.14.0/recursive-rendering/index.md)): ```yaml id: expression_render_example namespace: company.team inputs: - id: param1 type: STRING defaults: myInput tasks: - id: hello type: io.kestra.plugin.scripts.python.Script script: "{{ render(read('main.py')) }}" ``` With a Python namespace file using: ```python print("Here is my input displayed using expression in the python script: {{ inputs.param1 }}") ``` The expressions will be rendered in the logs. However, while possible, this is not necessarily best practice, as the script would only work inside Kestra. The following two methods are recommended best practice for passing expressions to your code: 1. [Using Namespace Files with environment variables](#using-namespace-files-with-environment-variables) 2. [Using Namespace Files with CLI arguments](#using-namespace-files-with-cli-arguments) In either case, you need to add your code as a [Namespace File](../../06.concepts/02.namespace-files/index.md). You can do this using the [Editor](../../09.ui/01.flows/index.md) or by importing it directly. ![namespace_file](./namespace_file.png) ## Using namespace files with environment variables You can pass inputs as environment variables using expressions. 
The following example uses the input `uri` and passes it to the task `code` as an environment variable so the Python code can access it: ```yaml id: expressions_env_vars namespace: company.team inputs: - id: uri type: URI defaults: https://www.google.com/ tasks: - id: code type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest commands: - python main.py env: URI: "{{ inputs.uri }}" ``` Inside the Python code, use `os.environ` to retrieve the environment variable: ```python import requests import os ## Perform the GET request response = requests.get(os.environ['URI']) ## Check if the request was successful if response.status_code == 200: print(response.text) else: print(f"Failed to retrieve the webpage. Status code: {response.status_code}") ``` This method keeps the code in a separate file while avoiding the maintenance challenges of long inline scripts. ## Using namespace files with CLI arguments You can also pass arguments directly to your code during execution. In Python, the `argparse` module can be used to handle these arguments. First, modify your code to accept arguments as follows: ```python import argparse import requests ## Set up command-line argument parsing parser = argparse.ArgumentParser(description="Fetch the content of a given URL") parser.add_argument("url", type=str, help="The URL to fetch") args = parser.parse_args() ## Perform the GET request response = requests.get(args.url) ## Check if the request was successful if response.status_code == 200: print(response.text) else: print(f"Failed to retrieve the webpage. Status code: {response.status_code}") ``` Next, pass the arguments to your code using expressions. 
The expressions will be rendered, and the evaluated values passed to the script via `argparse`: ```yaml id: expressions_argparse namespace: company.team inputs: - id: uri type: URI defaults: https://www.google.com/ tasks: - id: hello type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest commands: - python main.py "{{ inputs.uri }}" ``` This method makes the code slightly longer due to the argument-handling logic, but it offers better reusability. The same script can be used in multiple flows without duplicating code. --- # Choose the Right Fetch Pattern in Kestra URL: https://kestra.io/docs/best-practices/fetch-patterns > Learn when to use Download, HTTP Request, or script-based fetching in Kestra, and how to choose the right pattern for files, APIs, and custom integrations. Choose the simplest fetch pattern that matches the shape, size, and the amount of control you need. ## Decision guide Use `Download` when: - You need to retrieve a file over HTTP or HTTPS. - The result is naturally a file, such as CSV, JSON, ZIP, or a binary artifact. - The result is a large payload. - You want the response body streamed directly to Kestra internal storage. - Downstream tasks should consume a file URI rather than an in-memory response body. Use `Request` when: - You need to call an HTTP API and inspect the response directly. - You need to work with status codes, headers, form data, JSON payloads, or authentication options. - The response is small enough to treat as task output, less than 10 MB. - You are orchestrating an API call, not implementing a full client. Use a script task when: - You need custom pagination, signing, retries, throttling, or SDK-specific logic. - You need to combine multiple requests before producing one output. 
- The integration logic is complex enough that inline HTTP task configuration becomes harder to maintain than code. - You want to write a file after custom processing and pass that file to downstream tasks. :::alert{type="info"} Prefer built-in tasks first. Use scripts when you genuinely need custom fetching logic, not just because HTTP can also be called from Python or Shell. ::: ## Start with dedicated plugins If Kestra already provides a plugin for the system you want to integrate with, prefer that plugin over generic HTTP tasks or scripts. Dedicated plugins are usually the best choice because they: - model the external system more clearly - reduce custom request construction and parsing - make flows easier to read and maintain - often provide better task outputs and task-specific validation Use `Request` or script-based fetching when: - no dedicated plugin exists - the plugin does not yet support the endpoint or behavior you need - you need temporary or highly custom integration logic If a dedicated plugin exists but does not meet your needs, open a [GitHub issue](https://github.com/kestra-io) on that plugin repository rather than silently replacing it with long-term custom code. That helps improve the plugin for the next user facing the same limitation. ## Quick rule of thumb - If you are fetching a file, start with `Download`. - If you are calling an API endpoint, start with `Request`. - If you need custom client behavior, move to a script task. ## Comparison table | If your goal is... 
| Prefer | Why | | --- | --- | --- | | Download a file and store it for downstream tasks | [`Download`](/plugins/core/http/io.kestra.plugin.core.http.download) | It streams the response body directly to internal storage and returns a file URI | | Call an API and inspect status, headers, or body | [`Request`](/plugins/core/http/io.kestra.plugin.core.http.request) | It exposes structured HTTP outputs such as `code`, `headers`, and `body` | | Implement custom pagination, retries, signing, or SDK logic | A script task from [`plugin-scripts`](/docs/scripts) | It gives you full code control and can still emit files or structured outputs | ## Use `Download` for file-oriented retrieval `Download` is the best default when the remote resource should be treated as a file. This task is implemented in Kestra's core to stream the response body directly to Kestra internal storage and return a `uri`, `code`, `headers`, and `length`. That makes it a good fit for file ingestion pipelines. Best practice: - Use `Download` when the next task expects a file URI. - Prefer it over `Request` for large payloads or binary files. - Let downstream conversion or processing tasks read the file from internal storage. ### Example: download a CSV file and convert it ```yaml id: fetch_with_download namespace: company.team tasks: - id: download_orders type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: csv_to_ion type: io.kestra.plugin.serdes.csv.CsvToIon from: "{{ outputs.download_orders.uri }}" - id: log_download_status type: io.kestra.plugin.core.log.Log message: "Downloaded {{ outputs.download_orders.length }} bytes with status {{ outputs.download_orders.code }}" ``` Use this pattern when the remote system is serving a file and your workflow should continue from a persisted file URI. ## Use `Request` for API-oriented retrieval `Request` is the right fit when you need HTTP semantics, not just file retrieval. 
This task is implemented in Kestra's core as a generic HTTP client. It is designed to render URLs, headers, request bodies, form data, and auth options, then expose structured outputs such as response code, headers, and body. Best practice: - Use `Request` for JSON APIs, form posts, or authenticated HTTP endpoints. - Keep response bodies small enough to work with as task output. - Prefer dedicated plugins when one exists for the external system. - Prefer `Download` instead when the response is really a file or a large payload. ### Example: call an API and use the JSON response ```yaml id: fetch_with_request namespace: company.team tasks: - id: api type: io.kestra.plugin.core.http.Request uri: https://dummyjson.com/products method: GET - id: log_status type: io.kestra.plugin.core.log.Log message: "API status: {{ outputs.api.code }}" - id: extract_first_product type: io.kestra.plugin.core.log.Log message: "{{ outputs.api.body | jq('.products[0].title') }}" ``` Use this pattern when the API response itself is the thing you want to orchestrate around. ## Use script-based fetching for custom integration logic [Script tasks](../../16.scripts/index.mdx) are the right choice when the fetch logic starts to look like application code. This is especially true for: - multi-step authentication - cursor-based or token-based pagination - API-specific retry or backoff behavior - combining multiple responses into one normalized output Best practice: - Keep orchestration in YAML and only move the integration-specific logic into code. - Emit files with `outputFiles` when the result should be persisted and reused. - Use explicit dependencies rather than assuming packages are already installed. - Do not hide simple one-request logic inside a script when `Request` is enough. 
### Example: fetch paginated API data with Python and emit one file ```yaml id: fetch_with_script namespace: company.team tasks: - id: fetch_all_products type: io.kestra.plugin.scripts.python.Script containerImage: python:3.13-slim dependencies: - requests outputFiles: - all_products.json script: | import json import requests all_products = [] limit = 10 skip = 0 while True: response = requests.get( "https://dummyjson.com/products", params={"limit": limit, "skip": skip}, timeout=30 ) response.raise_for_status() payload = response.json() products = payload.get("products", []) if not products: break all_products.extend(products) skip += limit if skip >= payload.get("total", 0): break with open("all_products.json", "w") as f: json.dump(all_products, f) - id: preview_output type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.fetch_all_products.outputFiles['all_products.json']) }}" ``` Use this pattern when your fetch logic needs real code, but the workflow should still continue from a persisted file in Kestra. ## `Download` vs `Request` Choose between them based on the shape of the response: - If the response should be treated as a file, use `Download`. - If the response should be treated as HTTP output data, use `Request`. For example: - fetching `orders.csv` for downstream conversion is a `Download` use case - calling `/products` and branching on the JSON payload is a `Request` use case ## `Request` vs script-based fetching Choose between them based on how much client logic you need. Use `Request` when: - one request is enough - auth and headers are straightforward - the response can be handled directly with outputs and expressions Use a script when: - you need loops, pagination, or multiple dependent requests - the external API requires custom signing or SDK usage - you want to normalize or aggregate responses before downstream tasks Rule of thumb: - `Request` is for orchestrating API calls. - A script is for implementing API client behavior. 
## Anti-patterns Avoid these patterns: - using a Python or Shell script just to download one file from a public URL - using `Request` for very large file downloads that should go straight to storage - storing large API payloads in task outputs when a file-based pattern would scale better - hiding complex integration logic in a long inline script without clear output artifacts - skipping dedicated plugins when they already solve the target integration cleanly ## Recommended patterns ### Pattern 1: File ingestion Use `Download`, then pass the returned URI to a conversion or processing task. ### Pattern 2: API orchestration Use `Request`, then branch, transform, or log based on `code`, `headers`, or `body`. ### Pattern 3: Custom fetch client Use a script task, then persist the result with `outputFiles` so downstream tasks still operate on Kestra-managed files. ## Summary - Start with built-in tasks. - Use `Download` for file retrieval. - Use `Request` for API calls and structured HTTP responses. - Use script-based fetching only when you need custom client logic that built-in tasks do not provide. For task-level details, see [`Download`](/plugins/core/http/io.kestra.plugin.core.http.download), [`Request`](/plugins/core/http/io.kestra.plugin.core.http.request), and the [`plugin-scripts` documentation](/docs/scripts). --- # Flow Best Practices: Performance and Reliability URL: https://kestra.io/docs/best-practices/flows > Design Kestra flows for optimal performance and reliability by managing task count, data volume, and parallelism. How to design your workflows for optimal performance. ## Understanding what an execution is in Kestra A flow execution in Kestra is an object that contains: - All **TaskRuns** for that flow, each with: - Their attempts, including: - Metrics - State history - Their outputs - Their state history Internally: - Each TaskRun belongs to the same execution context, which holds all task data for the entire flow. 
- The Kestra Executor reads each TaskRun’s status changes (typically three per task: **CREATED**, **RUNNING**, and **SUCCESS**). - For each state transition, the Executor must: - Fetch the serialized flow execution context over the network. - Deserialize it, determine the next task(s), then reserialize the context. - Send the updated serialized context back over the network. The larger the flow execution context, the longer the serialization process takes. Depending on the internal queue and repository implementation, there may be a hard limit on the size of this context, since it’s stored as a single row or message (often around **1 MB**). To avoid hitting this limit, do not store large amounts of data inside the flow execution context. ## Tasks in the same execution While a flow can contain many tasks, it’s not recommended to include a large number of tasks within a single execution. A flow can contain either manually defined tasks or dynamically generated ones. While [ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach) and [ForEachItem](/plugins/core/flow/io.kestra.plugin.core.flow.foreachitem) are powerful for looping over results, they can create hundreds of TaskRuns if used on large datasets. For example, a nested loop of 20 × 20 tasks results in **400 TaskRuns**. :::alert{type="warning"} Flows with **over 100 tasks** tend to experience performance degradation and longer execution times. ::: To avoid this, consider breaking the workflow into subflows using the [Subflow task](../../05.workflow-components/10.subflows/index.md). Since a `Subflow` task creates a new execution, its tasks are **isolated** and do not affect the parent flow’s performance. ## Managing output data volume Some tasks allow you to fetch outputs from previous tasks and reuse them in subsequent ones. While powerful, this feature **should not be used to transfer large amounts of data**. 
For example, the [Query](/plugins/plugin-gcp/bigquery/io.kestra.plugin.gcp.bigquery.query) task in BigQuery has a `fetch` property that retrieves query results as an output attribute. If the query returns a large dataset, the result will be stored in the execution context — meaning it will be serialized and deserialized on each task state change, severely impacting performance.

This feature is best suited for small datasets, such as querying a few rows to feed into a [Switch](/plugins/core/flow/io.kestra.plugin.core.flow.switch) or [ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach) task.

:::alert{type="info"}
For large data volumes, use the `store` property instead. Stored outputs are written to Kestra’s internal storage, and only the file URI is referenced in the execution context.
:::

Some plugins have outputs that include both a `value` and a `uri`. For these plugins, the `store` property defaults to `false`, which is appropriate only for small data volumes. Set `store: true` for larger data volumes so that the output exposes a file URI instead.

When `store` is `false` (the default), the output includes a `value`, accessible through a Pebble expression such as `"{{ outputs.task.value }}"`. When `store` is `true`, `value` is not accessible; instead, the file URI is accessible through a Pebble expression such as `"{{ outputs.task.uri }}"`. `value` and `uri` are never available at the same time, and trying to access `value` when `store: true` will cause an execution error.

## Parallel tasks

The [Parallel](/plugins/core/flow/io.kestra.plugin.core.flow.parallel) task helps reduce overall flow duration by running multiple branches simultaneously. By default, **all parallel tasks start at the same time**, unless you set the `concurrent` property. The only limit is the number of worker threads configured in your environment.
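As a minimal sketch of capping that fan-out (task ids and messages are illustrative), the `concurrent` property limits how many branches run at once:

```yaml
id: parallel_branches
namespace: company.team

tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    concurrent: 2 # at most two branches run at the same time
    tasks:
      - id: branch_a
        type: io.kestra.plugin.core.log.Log
        message: Running branch A

      - id: branch_b
        type: io.kestra.plugin.core.log.Log
        message: Running branch B

      - id: branch_c
        type: io.kestra.plugin.core.log.Log
        message: Running branch C
```

Here the third branch starts only after one of the first two finishes, which keeps worker threads and downstream systems from being saturated.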
Be mindful of external system limits such as API rate restrictions or connection quotas — running too many parallel tasks may overload those systems. ## Task duration By default, Kestra **does not limit task duration** unless explicitly stated in a task’s documentation. Long-running or infinite processes will continue indefinitely. You can control task runtime for [Runnable Tasks](../../05.workflow-components/01.tasks/01.runnable-tasks/index.md) using the `timeout` property, which accepts [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) durations (e.g., `PT5M` for five minutes). This helps prevent stalled executions and ensures resource efficiency. ## Flow trigger on state change Kestra can automatically start a flow as soon as another flow completes. This makes it easy to create dependencies between flows, even when they are owned by different teams. For example, a flow can trigger based on the `state` of another flow’s execution. There are multiple ways to configure this behavior, but one approach is recommended as a best practice. Take the following two triggers polling one specific flow: one using `preconditions.flows.states` to define the required `states` and the other using the `states` property. **Option 1** ```yaml triggers: - id: release type: io.kestra.plugin.core.trigger.Flow preconditions: id: flows flows: - namespace: company.release flowId: parent states: - SUCCESS ``` or **Option 2** ```yaml triggers: - id: release type: io.kestra.plugin.core.trigger.Flow states: - SUCCESS preconditions: id: flows flows: - namespace: company.release flowId: parent ``` While both configurations will work, **Option 1** is the recommended approach. It is more performant and declarative compared to **Option 2**, especially when working with flow triggers dependent on state. 
--- # ForEach vs ForEachItem in Kestra: When to Use Each URL: https://kestra.io/docs/best-practices/foreach-and-foreachitem > Learn when to use ForEach or ForEachItem in Kestra, how they scale differently, and how to access their outputs correctly in downstream tasks. Use `ForEach` and `ForEachItem` for different scaling and orchestration patterns. ## Choose the right loop primitive Both tasks iterate over multiple items, but they do it in different ways: - `ForEach` creates child task runs inside the same execution. - `ForEachItem` creates one subflow execution per batch of items. That design difference affects performance, restart behavior, and how you access outputs. ## Decision guide Use `ForEach` when: - You already have a small list in memory, such as an input, a small JSON array, or a small fetched result. - The work for each item is lightweight. - You want to share outputs between sibling tasks inside the loop. - You want a simple loop without introducing a subflow. Use `ForEachItem` when: - You need to process a large dataset or file. - You want to split data into batches and scale processing through subflows. - You need better isolation, troubleshooting, and restart behavior for individual batches. - The data already lives in Kestra internal storage, or can be written there first. :::alert{type="warning"} `ForEach` can generate many task runs in a single execution. For large fan-out or nested loops, prefer `ForEachItem` or a `Subflow`-based design to avoid oversized execution contexts and slower orchestration. ::: :::alert{type="info"} `ForEachItem` expects `items` to be a Kestra internal storage URI, for example `{{ outputs.extract.uri }}` or a `FILE` input. If your source data is a regular JSON array, Excel file, Parquet file, or another non line-oriented format, convert it first. ::: ## `Subflow` vs `ForEachItem` `Subflow` and `ForEachItem` both create child executions, but they solve different orchestration problems. 
Use `Subflow` when: - You want to trigger one child flow once. - You already know the exact inputs to pass to that child flow. - You want execution isolation without batching or iteration. - You are decomposing a large workflow into smaller reusable modules. Use `ForEachItem` when: - You want to start many child flow executions from one dataset or file. - You need batching by `rows`, `partitions`, or `bytes`. - You want to process file-backed items incrementally at scale. - You want Kestra to merge outputs from multiple child executions. Rule of thumb: - `Subflow` is one child execution for one unit of work. - `ForEachItem` is many child executions for many units of work. For example, if you need to process one uploaded file in a dedicated child flow, use `Subflow`. If you need to split that file into many batches and process each batch in its own child flow execution, use `ForEachItem`. ## Understand the main difference `ForEach` iterates over a list of values and exposes: - `{{ taskrun.value }}` for the current value - `{{ taskrun.iteration }}` for the zero-based loop index `ForEachItem` iterates over batches of file-backed items and exposes: - `{{ taskrun.items }}` for the current batch file URI - `{{ taskrun.iteration }}` for the zero-based batch index In practice: - `ForEach` is best when the iteration value itself is the thing you want to work with. - `ForEachItem` is best when each iteration should receive a file or batch and hand it off to a subflow. ## Best practices for `ForEach` - Keep the `values` list small to moderate in size. - Use `concurrencyLimit` deliberately rather than leaving fan-out unbounded. - If each iteration needs multiple tasks in parallel, put a `Parallel` task inside the loop instead of expecting child tasks to run concurrently by default. - If iterating over JSON objects, remember that `taskrun.value` is a JSON string. Use `fromJson(taskrun.value)` to access properties. 
- When referencing outputs from sibling tasks inside the same loop iteration, use `outputs.task_id[taskrun.value]`. ### Example: use sibling outputs correctly inside `ForEach` ```yaml id: foreach_outputs namespace: company.team tasks: - id: enrich_regions type: io.kestra.plugin.core.flow.ForEach values: ["north", "south", "west"] concurrencyLimit: 2 tasks: - id: metadata type: io.kestra.plugin.core.output.OutputValues values: region: "{{ taskrun.value }}" bucket: "landing-{{ taskrun.value }}" - id: build_message type: io.kestra.plugin.core.debug.Return format: "Load {{ outputs.metadata[taskrun.value].values.region }} into {{ outputs.metadata[taskrun.value].values.bucket }}" - id: log_one_result type: io.kestra.plugin.core.log.Log message: "{{ outputs.build_message['north'].value }}" ``` Why this pattern works: - Inside the loop, `outputs.metadata[taskrun.value]` reads the output from the current iteration. - Outside the loop, `outputs.build_message['north'].value` reads the output for one specific loop value. ### Example: iterate over JSON objects safely ```yaml id: foreach_json namespace: company.team tasks: - id: process_users type: io.kestra.plugin.core.flow.ForEach values: - {"id": 101, "email": "a@example.com"} - {"id": 102, "email": "b@example.com"} tasks: - id: log_user type: io.kestra.plugin.core.log.Log message: "User {{ fromJson(taskrun.value).id }} -> {{ fromJson(taskrun.value).email }}" ``` ## Best practices for `ForEachItem` - Store the dataset in internal storage first and pass its URI to `items`. - If your source file is CSV, JSON, Excel, or another external format, convert it to ION before passing it to `ForEachItem`. - Batch by `rows`, `partitions`, or `bytes` based on how the downstream subflow processes data. - Design the subflow so it can be rerun independently for one batch. - Prefer passing `taskrun.items` to a `FILE` input in the subflow. - If the parent flow must depend on child results, keep `wait: true`. 
- If a child failure should fail the parent task, keep `transmitFailed: true`. ### Example: process a file in batches with `ForEachItem` This pattern is recommended when each batch should run in its own execution. ```yaml id: parent_foreachitem namespace: company.team tasks: - id: download_orders_csv type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: orders_to_ion type: io.kestra.plugin.serdes.csv.CsvToIon from: "{{ outputs.download_orders_csv.uri }}" - id: process_batches type: io.kestra.plugin.core.flow.ForEachItem items: "{{ outputs.orders_to_ion.uri }}" batch: rows: 2 namespace: company.team flowId: process_order_batch wait: true transmitFailed: true inputs: orders_file: "{{ taskrun.items }}" - id: log_merged_outputs_uri type: io.kestra.plugin.core.log.Log message: "{{ outputs.process_batches_merge.subflowOutputs }}" - id: preview_merged_outputs type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.process_batches_merge.subflowOutputs) }}" ``` And the subflow: ```yaml id: process_order_batch namespace: company.team inputs: - id: orders_file type: FILE tasks: - id: inspect_batch type: io.kestra.plugin.core.log.Log message: "{{ read(inputs.orders_file) }}" outputs: - id: batch_summary type: STRING value: "{{ 'Processed batch content: ' ~ read(inputs.orders_file) }}" ``` Here, `orders_file` is a batch file generated from the ION output of `CsvToIon`. Each subflow execution receives one batch file through `{{ taskrun.items }}`. ## Use `ForEachItem` outputs correctly `ForEachItem` is best consumed through its internal helper task outputs: - `{{ outputs.task_id_split.splits }}` contains the file listing generated batch URIs. - `{{ outputs.task_id_merge.subflowOutputs }}` contains a file with the merged outputs from the child subflows. 
If your `ForEachItem` task id is `process_batches`, those become: - `{{ outputs.process_batches_split.splits }}` - `{{ outputs.process_batches_merge.subflowOutputs }}` This is different from `ForEach`, where you typically access outputs by loop value, such as `outputs.inner['north'].value`. ### Example: consume merged subflow outputs If the subflow defines typed flow outputs, `ForEachItem` merges them into a file exposed by the internal merge task. In the example above, each child execution returns a `batch_summary` string, and the merge task gathers those subflow outputs into a single file. ```yaml id: parent_read_merged_outputs namespace: company.team tasks: - id: download_orders_csv type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: orders_to_ion type: io.kestra.plugin.serdes.csv.CsvToIon from: "{{ outputs.download_orders_csv.uri }}" - id: process_batches type: io.kestra.plugin.core.flow.ForEachItem items: "{{ outputs.orders_to_ion.uri }}" batch: rows: 2 namespace: company.team flowId: process_order_batch wait: true transmitFailed: true inputs: orders_file: "{{ taskrun.items }}" - id: log_merged_outputs_uri type: io.kestra.plugin.core.log.Log message: "{{ outputs.process_batches_merge.subflowOutputs }}" - id: preview_merged_outputs type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.process_batches_merge.subflowOutputs) }}" ``` Use `{{ outputs.process_batches_merge.subflowOutputs }}` when a downstream task needs the collected outputs from all child subflows. If you want to inspect the merged file content directly, use `read(outputs.process_batches_merge.subflowOutputs)`. ## Common mistakes to avoid - Do not use `ForEach` for very large datasets just because the input started as a JSON array. - Do not pass a non-storage path or raw inline content to `ForEachItem.items`; it must be a Kestra internal storage URI. 
- Do not assume sibling task outputs in `ForEach` use the plain `outputs.task_id.value` syntax; inside the loop, use `outputs.task_id[taskrun.value]`. - Do not expect `ForEach` child tasks to run in parallel unless you either set loop concurrency or add a `Parallel` task inside the loop. - Do not forget that `taskrun.iteration` starts at `0` for both `ForEach` and `ForEachItem`. ## Recommended rule of thumb Use `ForEach` for orchestration over a relatively small list of values. Use `ForEachItem` for data processing over file-backed items or batches, especially when you need scale, restartability, or subflow isolation. For API details, see the [ForEach plugin documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreach), the [ForEachItem plugin documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreachitem), and the [Outputs documentation](../../05.workflow-components/06.outputs/index.md). --- # Dev to Production in Kestra: Promote Flows Safely URL: https://kestra.io/docs/best-practices/from-dev-to-prod > Recommended patterns for promoting Kestra flows from development to production environments using Git and CI/CD. Common patterns for deploying flows from development to production environments. ## Move flows from development to production safely
## Development environment A best practice with Kestra is to maintain a dedicated **development instance** where users can create and test flows safely. This environment acts as a sandbox, allowing experimentation without impacting production or critical business operations. You can set up a development environment in one of the following ways: - Install Kestra locally using [Docker Compose](../../02.installation/03.docker-compose/index.md) - Deploy Kestra on a [Kubernetes cluster](../../02.installation/03.kubernetes/index.md) accessible to users and isolated from production workloads ## Production environment The production instance should be secured and tightly controlled, as it runs critical workflows that directly impact end users. A common best practice is to **limit access** to production systems. Two areas should be considered: - User access - Flow deployments ### User access In **Kestra Enterprise**, user management is streamlined through [RBAC](../../07.enterprise/03.auth/rbac/index.md) and [SSO](../../07.enterprise/03.auth/sso/index.md). Administrators can define fine-grained access using role policies such as *Admin* or *Viewer*, ensuring proper access control across all resources. Learn more in the [Enterprise documentation](../../07.enterprise/index.mdx). For open-source users, it’s recommended to run a **restricted production instance**, accessible only to administrators and your [CI/CD system](../../version-control-cicd/cicd/index.md). ### Flow deployments Kestra supports several deployment strategies: - [Via the UI](../../09.ui/01.flows/index.md) - [Git synchronization](../../version-control-cicd/04.git/index.md) - [CI/CD pipelines](../../version-control-cicd/cicd/index.md) - [Terraform](../../13.terraform/index.mdx) - [API](../../api-reference/index.mdx) Choose a method that aligns with your organization’s existing deployment processes. 
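As an illustration of the CI/CD strategy listed above, here is a sketch using Kestra's official GitHub Actions. The workflow name, `./flows` directory layout, and `KESTRA_HOSTNAME` secret are assumptions for this example:

```yaml
name: deploy-kestra-flows
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Validate all flows before attempting to deploy them
      - uses: kestra-io/validate-action@master
        with:
          directory: ./flows
          resource: flow
          server: ${{ secrets.KESTRA_HOSTNAME }}

      # Deploy the validated flows to the production instance
      - uses: kestra-io/deploy-action@master
        with:
          namespace: company.team
          directory: ./flows
          resource: flow
          server: ${{ secrets.KESTRA_HOSTNAME }}
```

Because the job runs only on pushes to `main`, flows reach production exclusively through reviewed and merged pull requests.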
A common approach is to move flows from development to production using a **version control system** (such as Git) combined with **CI/CD automation**. In this pattern, developers commit flow changes to Git. Once the pull request is approved, the CI/CD system deploys the updated flows to the production instance. Flows can be committed to Git using: - Manual export or copy-paste from the UI - The [`git.PushFlows` task](/plugins/plugin-git/io.kestra.plugin.git.pushflows) CI/CD deployment to production can be automated with: - GitHub Actions, GitLab CI/CD, Jenkins, or Azure DevOps - Terraform - The Kestra CLI For more details on CI/CD automation, refer to the [CI/CD guide](../../version-control-cicd/cicd/index.md). ## Git example
You can use the [`git.SyncFlows` task](/plugins/plugin-git/io.kestra.plugin.git.syncflows) together with a [trigger](../../05.workflow-components/07.triggers/index.mdx) to automatically pull flows from the `main` branch of your Git repository. This enables Kestra to manage the synchronization process directly, minimizing the need for external tools. You can schedule synchronization using: - A [Schedule trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md) to pull flows at regular intervals (e.g., nightly) - A [Webhook trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) to pull updates whenever new commits are pushed to `main` See the [dedicated SyncFlows guide](../../15.how-to-guides/syncflows/index.md) for details. To push flows from development to Git, use the [`git.PushFlows` task](/plugins/plugin-git/io.kestra.plugin.git.pushflows). This ensures flows are validated before being saved — Kestra will reject invalid flows automatically. You can also automate pull requests with the [`create.Pulls` task](/plugins/plugin-github/github-pull-requests/io.kestra.plugin.github.pulls.create), which creates a PR to `main` for review before deploying to production. :::alert{type="info"} While Kestra validates flow syntax, it does not detect logical or runtime errors. Always test flows thoroughly before promoting them to production. ::: ## CI/CD example
CI/CD pipelines can automatically deploy flows from Git to your Kestra production instance when changes are merged into the `main` branch. For GitHub, Kestra provides an official [Deploy Action](../../version-control-cicd/cicd/01.github-action/index.md) that uses the Kestra Server CLI behind the scenes to perform deployments. You can pair this with the [Validate Action](../../version-control-cicd/cicd/01.github-action/index.md), which checks that flows are valid before merging. By enforcing required status checks on pull requests, you can prevent invalid flows from being merged and deployed to production. :::alert{type="info"} If a flow contains invalid syntax, the **Deploy Action** will fail. ::: --- # Version Control with Git in Kestra URL: https://kestra.io/docs/best-practices/git > Best practices for using Git with Kestra for version control, including SyncFlows, PushFlows, and CI/CD integration. Best practices for version control with Git in Kestra. ## Use Git effectively with Kestra By default, **all Kestra flows are automatically versioned** using [Revisions](../../06.concepts/03.revision/index.md). You don't need an additional version control system to track changes to your flows. Kestra automatically creates a new revision each time you save a flow, allowing you to view change history, compare revisions, and restore previous versions at any time. However, if you want to use Git to manage your [flows](../../05.workflow-components/01.flow/index.md) and [namespace files](../../06.concepts/02.namespace-files/index.md), you can do so with Kestra’s built-in Git integration.
There are multiple ways to use Git with Kestra:

- The [git.SyncFlows](/plugins/plugin-git/io.kestra.plugin.git.syncflows) pattern enables a GitOps approach, using Git as the single source of truth for your flows.
- The [git.SyncNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.syncnamespacefiles) pattern enables GitOps for your namespace files.
- The [git.PushFlows](/plugins/plugin-git/io.kestra.plugin.git.pushflows) pattern allows you to edit flows from the UI and automatically commit and push changes to Git — ideal if you prefer using the built-in editor while keeping your code synchronized with Git.
- The [git.PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles) pattern allows you to edit namespace files from the UI and push updates to Git.
- The [CI/CD](../../version-control-cicd/cicd/index.md) pattern is ideal if you prefer managing the CI/CD process manually (for example, using GitHub Actions or Terraform) while keeping Git as the single source of truth for your code.
- The [Clone](https://kestra.io/plugins/plugin-git/io.kestra.plugin.git.clone) task clones a repository directly into a flow, making scripts available for execution.
- The [TenantSync](/plugins/plugin-git/io.kestra.plugin.git.tenantsync) task synchronizes all namespaces in a tenant, including flows, files, apps, tests, and dashboards.
- The [NamespaceSync](/plugins/plugin-git/io.kestra.plugin.git.namespacesync) task synchronizes objects within a single namespace with your Git repository.

The diagram below illustrates how to choose the right pattern for your workflow:

![git](../../version-control-cicd/04.git/git.png)

For a detailed comparison of these patterns, see the [Version Control with Git](../../version-control-cicd/04.git/index.md) page.
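As a minimal sketch of the `git.SyncFlows` pattern described above (the repository URL, credentials, and nightly schedule are illustrative assumptions):

```yaml
id: sync_flows_from_git
namespace: company.team

tasks:
  - id: sync
    type: io.kestra.plugin.git.SyncFlows
    url: https://github.com/your-org/your-flows-repo # placeholder repository
    branch: main
    targetNamespace: company.team
    includeChildNamespaces: true
    username: your-git-username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"

triggers:
  - id: nightly
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 2 * * *" # pull from Git every night at 02:00
```

With `includeChildNamespaces: true`, flows stored in nested repository directories are synchronized into the corresponding child namespaces under `company.team`.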
--- # Managing Environments in Kestra: Dev, Prod & Tenants URL: https://kestra.io/docs/best-practices/manage-environments > Best practices for managing Kestra environments, including separate instances for development and production, and using tenants. Kestra users can manage their environments with different levels of granularity. Kestra has three main concepts: instance, tenant, and namespace. ## Manage Kestra instances, tenants, and namespaces An instance is a full deployment of Kestra. A best practice is to have at least two separate instances: one for development and one for production. The development instance serves as a sandbox for testing and experimentation, while the production instance supports critical operations and should only be accessible to administrators. Large organizations sometimes have three or four environments. In such cases, it's best to use the [Kestra Enterprise Edition](../../oss-vs-paid/index.md) to manage all instances effectively, benefiting from improved governance, security, and scalability. ## When to use multiple tenants A [tenant](../../07.enterprise/02.governance/tenants/index.md) is a logical separation within an instance. You can think of tenants as isolated Kestra projects that share instance resources. A single instance can have multiple tenants. Tenants are useful when Kestra manages operations for different customers or teams. For example, a company with ten customers could assign each one to a separate tenant. Similarly, an international organization could use tenants to separate workflows by country. Tenants can also be used to isolate environments for different engineering teams within the same development instance. :::alert{type="info"} Each tenant uses the same underlying instance resources. Therefore, it is not recommended to use tenants to separate development and production environments. If the underlying instance goes down, all tenants will be affected. 
::: ## When to use multiple namespaces Namespaces are useful for organizing your flows. They can help structure projects by domain or team. Namespaces can also be used as lightweight “environments” for getting started, especially for open-source users who don’t need to manage multiple instances. However, this approach is not recommended for critical operations, since an issue in one namespace could impact production flows. --- # Managing pip Dependencies in Kestra: Docker & Caching URL: https://kestra.io/docs/best-practices/managing-pip-dependencies > Efficiently manage Python pip dependencies in Kestra using custom Docker images, server startup installs, or caching. Learn how to manage pip package dependencies in your flows. ## Manage Python dependencies efficiently Your Python code may require `pip` package dependencies. How you manage these dependencies can affect the execution time of your flows. If you install `pip` packages within `beforeCommands`, the packages will be downloaded and installed each time the task runs. This can significantly increase the duration of workflow executions. The following sections describe several ways to manage `pip` package dependencies efficiently in your flows. ## Using a custom Docker image Instead of using the base Python Docker image and installing dependencies through `beforeCommands`, you can create a custom Docker image that includes Python and all required `pip` packages. Since the dependencies are built into the image, they do not need to be downloaded and installed at runtime. This reduces overhead and ensures that execution time is dedicated solely to running your Python code. For example, if your Python script depends on `pandas`, you can use a container image that already includes it, such as `ghcr.io/kestra-io/pydata:latest`. 
This eliminates the need to install dependencies using `beforeCommands`: ```yaml id: docker_dependencies namespace: company.team tasks: - id: code type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest script: | import pandas as pd df = pd.read_csv('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv') total_revenue = df['total'].sum() ``` ## Installing pip package dependencies at server startup Another way to avoid installing dependencies during every execution is to preinstall them before starting the Kestra server. For a standalone Kestra server, you can run: ```bash pip install requests pandas polars && ./kestra server standalone --worker-thread=16 ``` If you are running Kestra with Docker, create a Dockerfile and install dependencies using the `RUN` command. Set the `USER` to `root` to allow package installation: ```dockerfile FROM kestra/kestra:latest USER root RUN pip install requests pandas polars CMD ["server", "standalone"] ``` In your Docker Compose configuration, replace the `image` property with `build: .` to use your custom Dockerfile instead of the official image from Docker Hub. Also, remove the `command` property, since the `CMD` instruction in your Dockerfile now handles it: ```yaml services: ... kestra: build: . ... ``` When you start Kestra using Docker Compose, the Python dependencies will already be included in the container. In both installation methods, you must run Python tasks using the [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md) to ensure the code can access the dependencies installed in the Kestra server process. 
You can verify that the dependencies are installed with the following example: ```yaml id: list_dependencies namespace: company.team tasks: - id: check type: io.kestra.plugin.scripts.python.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - pip list ``` ## Using cache files In a `WorkingDirectory` task, you can create a virtual environment using the [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md), install all required `pip` dependencies, and cache the `venv` folder. This ensures the dependencies are reused in subsequent executions, eliminating the need for repeated installations. For more details, see the [caching](../../06.concepts/12.caching/index.md) page. The example below demonstrates how to cache the `venv` folder: ```yaml id: python_cached_dependencies namespace: company.team tasks: - id: working_dir type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: python_script type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.core.runner.Process beforeCommands: - python -m venv venv - . venv/bin/activate - pip install pandas script: | import pandas as pd print(pd.__version__) cache: patterns: - venv/** ttl: PT24H ``` By using one of these techniques, you can avoid reinstalling dependencies for each execution and reduce overall execution time. --- # Naming Conventions in Kestra: Flows and Namespaces URL: https://kestra.io/docs/best-practices/naming-conventions > Learn the best practices for naming namespaces, flows, tasks, and other identifiers in Kestra for a clean and scalable hierarchy. Common naming conventions to keep your flows and tasks well-organized and consistent in Kestra. ## Name namespaces and flows consistently Follow a `company.team` structure for namespaces to maintain a clean, scalable, and consistent hierarchy across your workflows. This approach helps with: 1. Centralized governance for credentials and configurations 2. 
Easier sharing of variables, plugin defaults, and secrets across teams 3. Simplified Git synchronization ## Why use the `company.team` structure By defining a **root namespace named after your company**, you can centralize management of [plugin defaults](../../05.workflow-components/09.plugin-defaults/index.md), [variables](../../05.workflow-components/04.variables/index.md), and [secrets](../../06.concepts/04.secret/index.md). These configurations can then be inherited by all namespaces under that root. This structure also simplifies [Git synchronization](../../version-control-cicd/04.git/index.md). You can maintain a single synchronization flow that updates all namespaces under your company root. The next level — named after your team (e.g., `company.team`) — allows for shared governance and visibility at the team level. From there, you can further divide namespaces by project, system, or other logical hierarchies. When synced with Git, this nested structure maps directly to nested directories in your repository. ## Example namespace structure ```plaintext mycompany ├── mycompany.marketing │ ├── mycompany.marketing.projectA │ └── mycompany.marketing.projectB └── mycompany.sales ├── mycompany.sales.projectC └── mycompany.sales.projectD ``` ## Should you use environment-specific namespaces? Avoid **environment-specific namespaces** (e.g., `dev`, `staging`, `prod`) because they can introduce several issues: - **Shared risk:** Development workflows can unintentionally impact production. - **Configuration drift:** Duplicating configurations across environments can lead to inconsistencies. Instead, run **separate Kestra instances** (or tenants in Enterprise Edition) for development and production. ## Summary Using a `company.team` namespace structure creates a clear, maintainable hierarchy that mirrors your organization’s structure and simplifies Git synchronization. 
To separate environments reliably, use distinct Kestra instances or tenants rather than environment-based namespaces. ## ID naming convention Use a consistent naming pattern across all identifiers in Kestra, including: - Flows - Tasks - Inputs - Outputs - Triggers ### Valid characters and subscript notation Kestra does not enforce a specific naming style, but IDs must match the regex pattern: `^[a-zA-Z0-9][a-zA-Z0-9_-]*` This means: - Only letters, numbers, underscores `_`, and hyphens `-` are allowed. - If you use hyphens (e.g., `kebab-case`), reference IDs using **subscript notation**, such as: `{{ outputs.task_id["your-custom-value"].attribute }}` :::alert{type="info"} Use **snake_case** or **camelCase** instead of `kebab-case` to avoid the need for subscript notation and improve readability. ::: ### Snake case Snake case is popular among Python developers, especially in data and AI workflows. Here’s an example using `snake_case` for all IDs: ```yaml id: api_python_sql namespace: prod.marketing.attribution inputs: - id: api_endpoint type: URL defaults: https://dummyjson.com/products tasks: - id: fetch_products type: io.kestra.plugin.core.http.Request uri: "{{ inputs.api_endpoint }}" - id: transform_in_python type: io.kestra.plugin.scripts.python.Script containerImage: python:slim beforeCommands: - pip install polars outputFiles: - "products.csv" script: | import polars as pl data = {{ outputs.fetch_products.body | jq('.products') | first }} df = pl.from_dicts(data) df.glimpse() df.select(["brand", "price"]).write_csv("products.csv") - id: sql_query type: io.kestra.plugin.jdbc.duckdb.Query inputFiles: in.csv: "{{ outputs.transform_in_python.outputFiles['products.csv'] }}" sql: | SELECT brand, round(avg(price), 2) as avg_price FROM read_csv_auto('{{ workingDir }}/in.csv', header=True) GROUP BY brand ORDER BY avg_price DESC; fetchType: STORE outputs: - id: final_result value: "{{ outputs.sql_query.uri }}" triggers: - id: daily_at_9am type: 
io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * *" ``` ### Camel case Camel case is common in Java and JavaScript ecosystems. Here’s the same example using `camelCase`: ```yaml id: apiPythonSql namespace: prod.marketing.attribution inputs: - id: apiEndpoint type: URL defaults: https://dummyjson.com/products tasks: - id: fetchProducts type: io.kestra.plugin.core.http.Request uri: "{{ inputs.apiEndpoint }}" - id: transformInPython type: io.kestra.plugin.scripts.python.Script containerImage: python:slim beforeCommands: - pip install polars outputFiles: - "products.csv" script: | import polars as pl data = {{ outputs.fetchProducts.body | jq('.products') | first }} df = pl.from_dicts(data) df.glimpse() df.select(["brand", "price"]).write_csv("products.csv") - id: sqlQuery type: io.kestra.plugin.jdbc.duckdb.Query inputFiles: in.csv: "{{ outputs.transformInPython.outputFiles['products.csv'] }}" sql: | SELECT brand, round(avg(price), 2) as avgPrice FROM read_csv_auto('{{ workingDir }}/in.csv', header=True) GROUP BY brand ORDER BY avgPrice DESC; fetchType: STORE outputs: - id: finalResult value: "{{ outputs.sqlQuery.uri }}" triggers: - id: dailyAt9am type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * *" ``` Both conventions are valid — choose the one that best matches your team’s coding standards. --- # Managing and Purging Flow Outputs in Kestra URL: https://kestra.io/docs/best-practices/outputs > Best practices for managing flow outputs in Kestra, including purging large files and handling conditional outputs efficiently. Best practices for handling flow outputs, including purging large outputs and conditionally returning outputs. ## Handle flow outputs safely When a flow can return different outputs depending on certain conditions, you can use an expression in the `outputs` section. This allows you to conditionally return the output of task `A` if it wasn’t skipped, or the output of task `B` otherwise. 
```yaml id: conditionallyReturnOutputs namespace: company.team inputs: - id: runTask type: BOOLEAN defaults: true tasks: - id: taskA runIf: "{{ inputs.runTask }}" type: io.kestra.plugin.core.debug.Return format: Hello World! - id: taskB type: io.kestra.plugin.core.debug.Return format: Fallback output outputs: - id: flowOutput type: STRING value: "{{ tasks.taskA.state != 'SKIPPED' ? outputs.taskA.value : outputs.taskB.value }}" ``` ## Purging large output files If a flow generates large output files that are not needed after execution, you can use the `io.kestra.plugin.core.storage.PurgeCurrentExecutionFiles` task to delete those files from internal storage. In the example below, the flow downloads a large file from an HTTP API and uploads it to an S3 bucket. Once the file is uploaded, it’s no longer needed locally, so the `PurgeCurrentExecutionFiles` task is used to remove it from internal storage. ```yaml id: extractLoadPurge namespace: company.team tasks: - id: extract type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: load type: io.kestra.plugin.aws.s3.Upload from: "{{ outputs.extract.uri }}" bucket: myBucket key: largeFiles/orders.csv - id: purge type: io.kestra.plugin.core.storage.PurgeCurrentExecutionFiles ``` --- # Data Retention and Purging in Kestra: Choose a Strategy URL: https://kestra.io/docs/best-practices/purging-data > Learn how to choose the right purge strategy in Kestra for executions, logs, key-value pairs, Namespace files, assets, and other retained data. How to choose the right purge strategy for operational data, retained artifacts, and mutable state in Kestra. ## Choose the right purge strategy Kestra stores different kinds of data for different reasons. Some data supports execution history and troubleshooting, some supports runtime state, and some supports retained files or assets. Because of that, purging data in Kestra is not a single decision. 
The right approach depends on: - what kind of data you want to remove - why the data exists - how long you need to keep it - whether you want hard deletion or to hide it from the UI ## Quick recommendation Use this rule of thumb: - purge **executions** and **logs** to control operational storage growth - purge **KV pairs** only when they represent expired runtime state - purge **Namespace files** only when you need version retention on file history - purge **assets and lineage** only when you are enforcing a retention policy for asset metadata - do not rely on UI deletion when your goal is storage reclamation or permanent deletion ## Comparison table | If you want to remove... | Prefer | Why | | --- | --- | --- | | Old execution records | [`PurgeExecutions`](/plugins/core/execution/io.kestra.plugin.core.execution.purgeexecutions) | It permanently deletes execution metadata and related execution data | | Old execution and trigger logs | [`PurgeLogs`](/plugins/core/log/io.kestra.plugin.core.log.purgelogs) | It is designed for bulk log cleanup | | Expired runtime state in the KV Store | [`PurgeKV`](/plugins/core/kv/io.kestra.plugin.core.kv.purgekv) or automatic KV expiration purge | It removes stale KV entries without treating them as static configuration | | Old Namespace file versions | [`PurgeFiles`](/plugins/core/namespace/io.kestra.plugin.core.namespace.purgefiles) | It applies retention rules to Namespace files and their versions | | Old asset records, usages, or lineage data | [`PurgeAssets`](../../10.administrator-guide/purge/index.md#purge-assets-and-lineage-retention) | It applies retention to asset-related records without touching executions or logs | | A flow, namespace, or other object only in the UI | UI deletion | It hides records, but does not perform the same hard deletion as purge tasks | ## Use purge tasks for retention, not for routine cleanup by hand If data should be deleted on a recurring basis, treat it as a retention policy rather than a 
manual maintenance task. Best practice: - define retention periods explicitly - schedule purge flows - keep those flows in a central administrative or `system` namespace - test purge behavior in non-production environments first This makes retention predictable and easier to review. ## When to purge executions and logs Use [`PurgeExecutions`](/plugins/core/execution/io.kestra.plugin.core.execution.purgeexecutions) and [`PurgeLogs`](/plugins/core/log/io.kestra.plugin.core.log.purgelogs) when your main goal is to reduce the footprint of historical operational data. This is usually the right choice when: - you no longer need old execution history for troubleshooting - old logs are consuming storage - you already have another system for long-term observability or audit retention Best practice: - set separate retention periods for executions and logs if your teams use them differently - avoid deleting recent data that is still useful for troubleshooting failed workflows - run purge flows on a schedule instead of waiting for storage pressure Relevant blueprints: - [Purge execution data including logs, metrics and outputs on a schedule](https://kestra.io/blueprints/purge) - [Purge disk space interactively](https://kestra.io/blueprints/purge-disk-space-interactively) ## When to purge KV pairs Use [`PurgeKV`](/plugins/core/kv/io.kestra.plugin.core.kv.purgekv) only for runtime state that has expired or is no longer valid. The KV Store is best used for mutable state such as: - cursors - offsets - checkpoints - last processed timestamps Best practice: - set TTLs where possible - enable automatic purging of expired KV pairs when that matches your operational model - avoid using KV purge as a substitute for redesigning unclear state lifecycles Relevant blueprint: - [Purge old KV pairs from the KV Store](https://kestra.io/blueprints/kv-store-purge) If a value is actually configuration or a secret, it probably does not belong in the KV Store in the first place. 
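The scheduled retention pattern described above for executions and logs can be sketched as a flow like the following, similar to the linked purge blueprint. The namespace, retention period, and schedule are illustrative assumptions — align them with your own retention policy:

```yaml
id: scheduled_purge
namespace: system

tasks:
  # Permanently delete execution metadata older than the retention period
  - id: purge_executions
    type: io.kestra.plugin.core.execution.PurgeExecutions
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
    purgeLog: false  # logs are handled by the dedicated task below

  # Bulk-delete logs, with a retention period that can differ from executions
  - id: purge_logs
    type: io.kestra.plugin.core.log.PurgeLogs
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 1 * * *"
```

Keeping this flow in a central `system` namespace, as recommended above, makes the retention policy visible and easy to review.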
## When to purge Namespace files Use [`PurgeFiles`](/plugins/core/namespace/io.kestra.plugin.core.namespace.purgefiles) when your goal is to manage retention of Namespace file versions rather than execution data. This is useful when: - teams frequently update scripts or SQL files stored as Namespace files - you want to keep only the most recent versions - you want to remove versions older than a given date Best practice: - scope purge rules with `filePattern` so you do not delete unrelated files - define whether you want date-based or version-based retention - be careful when applying purge rules across parent and child namespaces ## When to purge assets and lineage Use [`PurgeAssets`](../../10.administrator-guide/purge/index.md#purge-assets-and-lineage-retention) when you need retention for asset metadata, asset usage records, or lineage data. This is different from purging executions or logs. Asset retention is its own concern and should be managed separately. Best practice: - filter by namespace, asset type, or metadata when possible - define retention based on operational or compliance requirements - purge only the records you intend to remove, especially if you want to keep lineage or usage data longer than the assets themselves [`PurgeAssets`](../../10.administrator-guide/purge/index.md#purge-assets-and-lineage-retention) is an Enterprise Edition feature. ## Purge tasks vs. UI deletion Do not treat purge tasks and UI deletion as equivalent. - purge tasks perform hard deletion and reclaim storage - UI deletion is a soft deletion and preserves underlying history Use purge tasks when you need permanent removal or storage reduction. Use UI deletion when you only want to remove an item from the visible working set without changing retention at the storage level. ## What purge tasks do not cover Purge tasks are not a universal retention mechanism for every internal component. 
In particular: - purge tasks do not manage internal queue retention - queue retention is configured separately depending on your backend Avoid assuming that a purge flow alone covers all retained system data. ## Recommended patterns ### Pattern 1: Scheduled operational retention Create a scheduled purge flow for executions and logs with a clearly defined retention period. This is the most common pattern for controlling storage growth. ### Pattern 2: State lifecycle management for KV pairs Use KV TTLs and expired-key cleanup for temporary runtime state rather than accumulating state indefinitely. ### Pattern 3: Version retention for Namespace files Apply [`PurgeFiles`](/plugins/core/namespace/io.kestra.plugin.core.namespace.purgefiles) with explicit namespace scope and file patterns when teams manage reusable scripts or SQL as Namespace files. ### Pattern 4: Separate retention policies by data type Use different purge flows or schedules for executions, logs, KV pairs, Namespace files, and assets. This keeps retention aligned with how each type of data is actually used. ## Anti-patterns Avoid these patterns: - using one retention period for every type of data without considering how the data is used - relying on manual cleanup only after storage becomes a problem - using UI deletion when you actually need hard deletion - purging KV pairs that are really standing in for missing configuration or poor state design - running broad Namespace file purges without a `filePattern` or namespace scope ## Decision guide Ask these questions: 1. Is the data operational history, mutable state, retained files, or asset metadata? Choose the purge task that matches that data type. 2. Do you need hard deletion or only to remove an item from the UI? If you need permanent deletion, use a purge task. 3. Should the data expire automatically based on age or lifecycle? If yes, define a retention policy and schedule it. 4. 
Is the data still needed for troubleshooting, auditability, or compliance? If yes, shorten retention carefully rather than purging broadly. 5. Are you trying to solve a storage problem or a modeling problem? If the data should never have been long-lived state, fix the design instead of only adding a purge flow. ## Summary - Use purge tasks based on the type of data you want to remove. - Treat retention as a deliberate operational policy, not an afterthought. - Use hard-deletion purge tasks for permanent cleanup and storage reclamation. - Keep separate retention strategies for executions, logs, KV pairs, Namespace files, and assets. For the underlying purge tasks and configuration options, see [Purge old data in Kestra](../../10.administrator-guide/purge/index.md). --- # Secrets Management in Kestra: Avoid Accidental Exposure URL: https://kestra.io/docs/best-practices/secrets-management > Best practices for securely managing and using secrets in Kestra workflows to prevent accidental exposure. A quick guide to [secrets](../../07.enterprise/02.governance/secrets/index.md) management best practices in Kestra. ## Manage secrets securely in Kestra Kestra provides a built-in [secret manager](../../07.enterprise/02.governance/secrets-manager/index.md) with obfuscation capabilities, but it’s important to understand its limitations and follow best practices to minimize the risk of secret exposure. ## Secret obfuscation in logs is best effort Kestra attempts to mask secrets in logs and during expression evaluation, but masking is not foolproof. Current log obfuscation replaces full secret matches with `****`. However, if a secret is modified — for example, through substring extraction, concatenation, encoding, or interpolation — it may bypass obfuscation and appear in logs. Refer to the [Filter Reference](../../expressions/03.filters/index.mdx) for a list of possible transformations. 
For example, the following flow uses `jq()` in a log message to return a partial value associated with a secret: ```yaml id: secret_test namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "You can see my secret token {{ secret('jsonSecret') | jq('.token') }}" ``` In the logs, the token value `SUPER_SECRET` is exposed: ![Secret Log](./secret-log.png) **Best practices:** - Never log secrets intentionally. - Avoid passing secrets into string manipulation expressions that could expose partial values. - Treat all secrets as sensitive, even in debug or test workflows. ## Understand expression evaluation limits When using the **Debug Expression** tool in the **Outputs** tab of an execution, Kestra forbids direct calls to `secret()` to prevent leaks. Logs are more permissive because tasks can emit any property, but this also increases the risk of accidental exposure. Avoid using secrets in Log tasks. If they are required, ensure you understand the risks and limitations before doing so. ## Avoid root-level secret placement Secrets defined at the root namespace are inherited by all sub-namespaces. This can unintentionally broaden accessibility and increase exposure risk. **Best practices:** - Store secrets at the lowest namespace level necessary for their use. - Use granular RBAC permissions to control who can access secrets and which workflows can use them. ## Design workflows to limit exposure Follow these practices when designing workflows: - Pass secrets only to tasks that require them. - Avoid exposing secrets in user-facing outputs or debug messages. - Where possible, design tasks to reference secrets rather than embedding raw values directly. --- # Kestra Brand Assets: Logos and Visual Identity URL: https://kestra.io/docs/brand-assets > Download official Kestra brand assets including logos, color palettes, and usage guidelines for presentations, blog posts, partner pages, and marketing. 
import DownloadLogoPack from '~/components/content/DownloadLogoPack.vue' import CardLogos from '~/components/content/CardLogos.vue' ![logo](./kestra-logo.png) ## Our Story ![timeline](./our-story.png) Kestra strives to be a simple yet powerful orchestration platform, enabling our clients to manage complex workflows with the same agility as a conductor who leads a symphony. This is how our logo was born, which embodies Kestra’s ability to orchestrate a wide variety of workflows anywhere at any scale. ## Logos Click on the link below to download the logo pack in PNG and SVG: --- # Core Concepts in Kestra: Architecture and Templating URL: https://kestra.io/docs/concepts > Core concepts of Kestra orchestration. Reference guide for architecture, data handling, templating, and key terminology used throughout the platform. import ChildCard from "~/components/docs/ChildCard.astro" This section lists key concepts and templating expressions. You can treat this section as a lookup reference anytime you need more details about a specific concept or expression. ## Explore Key Concepts --- # Backfill in Kestra: Replay Missed Schedules URL: https://kestra.io/docs/concepts/backfill > Replay missed schedule intervals with Kestra Backfills. Rerun historical executions between a start and end date to reprocess data or recover from gaps. Backfills are replays of missed schedule intervals between a defined start and end date. Let's take the following flow as an example: ```yaml id: scheduled_flow namespace: company.team tasks: - id: label type: io.kestra.plugin.core.execution.Labels labels: # label to track scheduled date scheduledDate: "{{trigger.date ?? execution.startDate}}" - id: external_system_export type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - echo "processing data for {{trigger.date ?? 
execution.startDate}}" - sleep $((RANDOM % 5 + 1)) triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/30 * * * *" ``` This flow runs every 30 minutes. However, imagine that your source system had an outage for 5 hours. The flow will miss 10 executions. To replay these missed executions, you can use the backfill feature. Ensure the backfill’s start and end dates encompass every missed schedule, so the trigger can replay each execution. Note that a backfill does not replay only the missed executions within the time window; executions that already completed successfully in that window are replayed as well. To target specific executions rather than a time window and avoid duplication, use [Replay](../10.replay/index.md). :::alert{type="info"} **All missed schedules are automatically recovered by default** if the Kestra server is down. The missed schedules will be executed as soon as Kestra is back up because of the `recoverMissedSchedules: ALL` property default. If you have configured this differently in your global Kestra configuration or specifically on a trigger, a Backfill achieves the same behavior. Read more about `recoverMissedSchedules` in the [dedicated documentation](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md#recover-missed-schedules). ::: To backfill the missed executions, go to the **Triggers** tab on the Flow's detail page and click on the **Backfill executions** button. ![Backfill a Trigger](./backfill1.png) You can then select the start and end date for the backfill. Additionally, you can set custom labels for the backfill executions to help you identify them in the future.
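As a reference for the `recoverMissedSchedules` property mentioned in the note above, it can be set directly on a Schedule trigger. The sketch below is illustrative; `ALL` is the default behavior:

```yaml
triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/30 * * * *"
    # ALL (default): replay every missed schedule once Kestra is back up
    # LAST: replay only the most recent missed schedule
    # NONE: skip missed schedules entirely
    recoverMissedSchedules: NONE
```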
You can pause and resume the backfill process at any time: ![Pause Backfill](./backfill3.png) And by clicking on the **Details** button, you can see more details about that backfill process: ![Backfill Details](./backfill2.png) :::alert{type="info"} Backfill executions will not be processed if the associated trigger is disabled. ::: ## Delete a backfill You can delete a Backfill from the **Administration - Triggers** view. Select the triggers whose backfills you want to delete if you no longer need to replay the missed executions. ![Delete Backfills](./delete-backfills.png) Deleting a backfill only cancels the scheduled catch-up executions. For example, if you defined a `* * * * *` schedule and backfilled the last five minutes, removing that backfill prevents those five replayed runs from being emitted. This is different from **Delete trigger**, which clears the trigger state itself — effectively recreating the trigger so it starts evaluating from the current time. Use **Delete backfill** to stop pending replays, and **Delete trigger** when you need to reset a stuck trigger or start it fresh. ## Trigger backfill via an API call ### Using cURL You can invoke the backfill executions using the `cURL` call as follows: ```sh curl -X PUT http://localhost:8080/api/v1/main/triggers \ -H "Authorization: Bearer $KESTRA_API_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "namespace": "company.team", "flowId": "myflow", "triggerId": "schedule", "backfill": { "start": "2025-04-29T11:30:00Z", "end": null, "labels": [ { "key": "reason", "value": "outage" } ] } }' ``` In the `backfill` attribute, you need to provide the start time for the backfill; the end time can be optionally provided. You can provide inputs to the flow with `inputs`, as well as assign labels to the backfill executions by providing key-value pairs in the `labels` section. In the example, the `reason:outage` label makes it clear what caused the need to backfill. 
Other attributes in this PUT call are `flowId`, `namespace`, and `triggerId`, corresponding to the flow to backfill. Check out the [API Reference](../../api-reference/02.open-source/index.mdx) for further backfill operations via the API. ### Using a service account :::badge{version=">=0.15" editions="EE,Cloud"} ::: For Enterprise and Cloud users, the same process as above can be done with [Service Accounts](../../07.enterprise/03.auth/service-accounts/index.md), so no human user needs to be involved. In this case, you must specify the Tenant to use in the request header and definition: `X-KESTRA-TENANT` and `tenantId`. In the example, we use a Tenant named `production`. ```sh curl -X PUT http://localhost:8080/api/v1/main/triggers \ -H "Authorization: Bearer $KESTRA_API_TOKEN" \ -H "X-Kestra-Tenant: production" \ -H "Content-Type: application/json" \ -d '{ "namespace": "company.team", "flowId": "myflow", "triggerId": "schedule", "tenantId": "production", "backfill": { "start": "2025-04-29T11:30:00Z", "end": null, "labels": [ { "key": "reason", "value": "outage" } ] } }' ``` To use a Service Account, go to **Administration -> IAM -> Service Accounts**. From the Service Accounts tab, create a Service Account, generate an API Token, copy the token, and give the Service Account the appropriate access to backfill a flow. Use this API token in your `cURL` instead of a user's token.
### Using Python requests You can invoke the backfill executions using Python `requests` as follows: ```python import requests import json url = 'http://localhost:8080/api/v1/main/triggers' headers = { 'Content-Type': 'application/json' } data = { "backfill": { "start": "2025-06-03T06:30:00.000Z", "end": None, "inputs": None, "labels": [ { "key": "reason", "value": "outage" } ] }, "flowId": "myflow", "namespace": "company.team", "triggerId": "schedule" } response = requests.put(url, headers=headers, data=json.dumps(data)) print(response.status_code) print(response.text) ``` With this code, you will be invoking the backfill for the `myflow` flow under the `company.team` namespace, based on the `schedule` trigger ID within the flow. The number of backfills that will be executed will depend on the schedule present in the `schedule` trigger and the `start` and `end` times mentioned in the backfill. When the `end` time is null, as in this case, the present time is used as the end time. --- # Blueprints in Kestra: Reusable Workflow Templates URL: https://kestra.io/docs/concepts/blueprints > Explore Kestra Blueprints — ready-to-use workflow templates that help you get started faster. Browse community and Enterprise blueprints for any automation. Ready-to-use examples designed to kickstart your workflow.
Blueprints are a curated, organized, and searchable catalog of ready-to-use examples designed to help you kickstart your workflow. Each Blueprint combines code and documentation and can be assigned several tags for organization and discoverability. All Blueprints are validated and documented. You can easily customize and integrate them into your new or existing flows with a single click on the "Use" button. To see more, check out the [Blueprints library](/blueprints). ![Blueprint](./blueprints.png) ## Community blueprints We refer to all Blueprints available in the open-source product as Community Blueprints, as they are guided by community feedback and represent common usage patterns we see among open-source users and contributors. Community Blueprints are particularly helpful when you're getting started with a new use case, integration, or with Kestra in general because they reflect fairly standardized workflow patterns. All Blueprints are verified by the Kestra team, but everyone is welcome to contribute new Blueprints or suggest improvements to the existing ones using [the following GitHub issue template](https://github.com/kestra-io/kestra/issues/new?assignees=&labels=blueprint&projects=&template=blueprint.yml). ### Where to find Blueprints Blueprints are accessible from two places in the UI: 1. The left navigation sidebar ![Blueprint UI](./blueprints2.png) 2. A dedicated tab in the flow code editor named **Blueprints**, showing your source code and Blueprints side by side. ![Flow Editor Blueprints](./blueprints3.png) ### How to find the right Blueprint Once you are on the Blueprints page, you can: - **Search** Blueprints for a specific use case or integration, e.g., Snowflake, BigQuery, DuckDB, Slack, ETL, ELT, Pandas, GPU, Git, Python, Docker, Redis, MongoDB, dbt, Airbyte, Fivetran, etc. 
- **Filter** by one or multiple tags, e.g., filter for Docker to see various ways to run containers in your flow, or filter for Notifications to see several options for configuring alerts on success or failure. ## Custom blueprints :::alert{type="info"} This feature requires the [Enterprise Edition](../../07.enterprise/index.mdx). ::: Apart from Community Blueprints, you can create custom Blueprints available only to your organization. You can use them to share, centralize, and document commonly used workflows in your team. Read more in the [Custom Blueprints](../../07.enterprise/02.governance/custom-blueprints/index.md) documentation. --- # Caching in Kestra: Speed Up Repeated Tasks URL: https://kestra.io/docs/concepts/caching > Speed up repeated tasks with file caching in Kestra. Use the WorkingDirectory task to cache dependencies and skip redundant downloads across flow executions. Manage file caching inside Kestra. Kestra provides file caching, which is especially useful when you work with sizable package dependencies that don't change often. ## Cache files in a `WorkingDirectory` task The file caching functionality on the `WorkingDirectory` task allows you to cache a subset of files to speed up your workflow execution. :::alert{type="info"} Kestra can only cache files installed or created as part of the script tasks if the script uses a `PROCESS` runner. If the script uses a `DOCKER` runner, the files will not be cached and the `WorkingDirectory` task will [throw an error](https://github.com/kestra-io/kestra/issues/2233): `Unable to execute WorkingDirectory post actions`. ::: ### Use cases for file caching File caching is useful if you want to install some `pip` or `npm` packages before running your script. You can cache the `node_modules` or Python `venv` folder to avoid re-installing the dependencies on each run. 
To do that, add a `cache` to your `WorkingDirectory` task. The `cache` property accepts a list of glob `patterns` to match files to cache. The cache is automatically invalidated after a specified time-to-live, set via the `ttl` property, which accepts a duration.

```yaml
id: caching_files
namespace: company.team

tasks:
  - id: working_dir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    cache:
      patterns:
        - some_directory/**
      ttl: PT1H
```

### How does it work under the hood

Kestra packages the files that need to be cached and stores them in the internal storage. When the task is executed again, the cached files are retrieved, initializing the working directory with their contents.

### Node.js example

Below is an example of a flow that installs the `colors` package before running a Node.js script. The `node_modules` folder is cached for one hour.

```yaml
id: node_cached_dependencies
namespace: company.team

tasks:
  - id: working_dir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    cache:
      patterns:
        - node_modules/**
      ttl: PT1H
    tasks:
      - id: node_script
        type: io.kestra.plugin.scripts.node.Script
        beforeCommands:
          - npm install colors
        script: |
          const colors = require("colors");
          console.log(colors.red("Hello"));
```

### Python example

Below is an example of a flow that installs the `pandas` package before running a Python script. The `venv` folder is cached for one day.
```yaml
id: python_cached_dependencies
namespace: company.team

tasks:
  - id: working_dir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: python_script
        type: io.kestra.plugin.scripts.python.Script
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        beforeCommands:
          - python -m venv venv
          - source venv/bin/activate
          - pip install pandas
        script: |
          import pandas as pd
          print(pd.__version__)
    cache:
      patterns:
        - venv/**
      ttl: PT24H
```

### How to invalidate the cache

Here is how cache invalidation works:

- After the first run, the files are cached.
- The next time the task is executed:
  - If the `ttl` hasn't passed, the files are retrieved from the cache.
  - If the `ttl` has passed, the cache is invalidated and no files are retrieved from it; because the cache is no longer present, the `npm install` command from the `beforeCommands` property will take a bit longer to execute.
- If you edit the task and change the `ttl` to:
  - a longer duration, e.g., `PT5H` — the files will be cached for five hours using the new `ttl` duration
  - a shorter duration, e.g., `PT5M` — the cache will be invalidated after five minutes using the new `ttl` duration.

The `ttl` is evaluated at runtime. If the most recently set `ttl` has passed relative to the last task run's execution date, the cache is invalidated and the files are no longer retrieved from the cache.

---

# File Access in Kestra: Local and Namespace Files
URL: https://kestra.io/docs/concepts/file-access

> Access local and namespace files in Kestra using the universal file protocol. Learn how to read, write, and share files between tasks and namespaces.

Access local and namespace files in Kestra with the universal file protocol.

Kestra supports a universal file protocol that simplifies how you reference files in your flows. This protocol provides more consistent and flexible handling of local and [namespace files](../02.namespace-files/index.md) in your flows.
You can still reference files inline by defining the file name and its content directly in YAML, but you can also use `nsfile:///` and `file:///` URIs to reference files stored as namespace files or on the host machine. The example flow below shows a task demonstrating the various file access methods:

```yaml
id: protocol
namespace: company.team

tasks:
  - id: inline_file
    type: io.kestra.plugin.scripts.python.Commands
    inputFiles:
      hello.py: |
        x = "Hello world!"
        print(x)

  - id: local_file
    type: io.kestra.plugin.scripts.python.Commands
    inputFiles:
      hello.py: file:///scripts/hello.py

  - id: namespace_file_from_the_same_namespace
    type: io.kestra.plugin.scripts.python.Commands
    inputFiles:
      hello.py: nsfile:///scripts/hello.py

  - id: namespace_file_from_other_namespace
    type: io.kestra.plugin.scripts.python.Commands
    inputFiles:
      hello.py: nsfile://company/scripts/hello.py

pluginDefaults:
  - type: io.kestra.plugin.scripts.python.Commands
    values:
      taskRunner:
        type: io.kestra.plugin.core.runner.Process
      commands:
        - python hello.py
```

### Allowed paths

To use the `file:///` scheme, you need to bind-mount the host directory containing the files into the Docker container running Kestra, and set the `kestra.local-files.allowed-paths` configuration property to allow access to that directory. For example, if you want to read files from the `scripts` folder on your host machine, you can add the following to your `kestra.yml` configuration:

```yaml
kestra:
  image: kestra/kestra:latest
  volumes:
    - /Users/yourdir/scripts:/scripts # Bind-mount the host directory
  ...
  environment:
    # Allow access to the /scripts directory in the Kestra container
    KESTRA_CONFIGURATION: |
      kestra:
        local-files:
          allowed-paths:
            - /scripts
```

If you see the following error:

```plaintext
java.lang.SecurityException: The path /scripts/hello.py is not authorized.
Only files inside the working directory are allowed by default, other paths must be allowed either globally inside the Kestra configuration using the `kestra.local-files.allowed-paths` property, or by plugin using the `allowed-paths` plugin configuration.
```

it means that you have not configured the allowed paths correctly. Ensure that the host directory is bind-mounted into the container and that the `kestra.local-files.allowed-paths` configuration property includes the path to that directory.

### Protocol reference

Here is a reference of the file protocol:

1. Use `file:///path/to/file.txt` to reference local files on the host machine from explicitly allowed paths.
2. Use `nsfile:///path/to/file.txt` to reference files stored in the current namespace. Note that this form uses three slashes (an empty authority after `nsfile://`) to indicate that you are referencing a file in the current namespace. Namespace inheritance doesn't apply here: if you specify `nsfile:///path/to/file.txt` in a flow from the `company.team` namespace and Kestra can't find the file there, Kestra won't look for it in the parent `company` namespace unless you explicitly specify the parent namespace in the path, e.g., `nsfile://company/path/to/file.txt`.
3. Use `nsfile://your.infinitely.nested.namespace/path/to/file.txt` to reference files stored in another namespace, provided that the current namespace has permission to access it. Note how this form uses two slashes followed by the namespace name to indicate that you are referencing a file in a different namespace. Under the hood, Kestra EE uses the Allowed Namespaces concept to check permissions to read that file.
4. Kestra also uses the `kestra:///` scheme for internal storage files. If you need to reference files stored in the internal storage, you can use the `kestra:///path/to/file.txt` protocol.
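In practice, `kestra:///` URIs usually come from task outputs: a task that writes a file to internal storage returns such a URI, which downstream tasks can pass to `read()`. A minimal sketch of that pattern, reusing the `Download` task and dataset URL that appear in other examples in these docs:

```yaml
id: internal_storage_read
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: log_content
    type: io.kestra.plugin.core.log.Log
    # outputs.download.uri resolves to a kestra:///... internal storage URI
    message: "{{ read(outputs.download.uri) }}"
```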
### Usage with the `read()` function

You can also use the `read()` function to read namespace files or local files in tasks that expect content rather than a path to a script or SQL query. For example, if you want to read a SQL query from a namespace file, you can use the `read()` function as follows:

```yaml
id: query
namespace: demo

tasks:
  - id: duckdb
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: "{{ read('nsfile:///query.sql') }}"
```

For local files on the host, you can use the `file:///` scheme:

```yaml
id: query
namespace: demo

tasks:
  - id: duckdb
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: "{{ read('file:///query.sql') }}"
```

### Namespace files as default FILE-type inputs

One of the benefits of this protocol is that you can reference Namespace Files as default FILE-type inputs in your flows. See the example below, which reads a namespace file, `hello.txt`, from the `demo` namespace and logs its content.

```yaml
id: file_input
namespace: demo

inputs:
  - id: myfile
    type: FILE
    defaults: nsfile:///hello.txt

tasks:
  - id: print_file_content
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(inputs.myfile) }}"
```

---

# KV Store in Kestra: Persist Shared State
URL: https://kestra.io/docs/concepts/kv-store

> Build stateful workflows with the Kestra KV Store. Persist and share key-value pairs across flows and executions for dynamic configuration and shared state.

Build stateful workflows with the KV Store.
Kestra's workflows are stateless by design. All workflow executions and task runs are isolated from each other by default to avoid any unintended side effects. When you pass data between tasks, you do so explicitly by passing outputs from one task to another, and that data is stored transparently in Kestra's internal storage. This stateless execution model ensures that workflows are idempotent and can be executed anywhere in parallel at scale.

However, in certain scenarios, your workflow might need to share data beyond passing outputs from one task to another. For example, you might want to persist data across executions or even across different workflows. This is where the Key Value (KV) store comes into play.

The KV Store allows you to store any data in a convenient key-value format. You can create KV pairs directly from the UI, via dedicated tasks, Terraform, or through the API. The KV Store is a powerful tool that allows you to build stateful workflows and share data across executions and workflows.

## How the KV Store fits into Kestra's architecture

Kestra's architecture has been designed to offer a transparent separation between orchestration and data processing capabilities. Kestra's [Executor](../../08.architecture/02.server-components/index.md#executor) is responsible for executing tasks and workflows without directly interacting with the user's infrastructure. The Executor relies on [Workers](../../08.architecture/02.server-components/index.md#worker), which are stateless processes that carry out the computation of runnable tasks and polling triggers. For privacy reasons, Workers are the only components that interact with the user's infrastructure, including the internal storage and external services.

Given that data persisted in the KV Store might contain sensitive information, the **KV Store has been built on top of Kestra's internal storage**.
This ensures that all values are stored in your private cloud storage bucket, and Kestra's database only contains metadata about the object, such as the key, the file URI, and any attached metadata like TTL, creation date, and last-updated timestamp. In short, the KV Store gives you full control and privacy over your data, and Kestra only stores metadata about the KV pairs.

## Keys and Values

`Keys` are arbitrary strings. Keys can contain:

- uppercase and lowercase characters
- standard ASCII characters

`Values` are stored as ION files in Kestra's internal storage. Values are strongly typed and can be one of the following types:

- string
- number
- boolean
- datetime
- date
- duration
- JSON.

For each KV pair, you can set a `Time to Live` (TTL) to avoid cluttering your storage with data that may only be relevant for a limited time.

## Namespace binding

Key-value pairs are defined at the namespace level, and you can access them from the namespace page in the UI in the KV Store tab. You can create and read KV pairs across namespaces as long as those namespaces are [allowed](../../07.enterprise/02.governance/07.namespace-management/index.md#allowed-namespaces).

## UI: How to Create, Read, Update and Delete KV pairs from the UI

Kestra follows an Everything as Code philosophy while also supporting management from the UI. Therefore, you can create, read, update, and delete KV pairs both from the UI and from code. Here is a list of the different ways to manage KV pairs:

1. **Kestra UI**: select a Namespace and go to the KV Store tab — from here, you can create, edit, and delete KV pairs.
2. **Task in a flow**: use the `io.kestra.plugin.core.kv.Set`, `io.kestra.plugin.core.kv.Get`, and `io.kestra.plugin.core.kv.Delete` tasks to create, read, and delete KV pairs in a flow.
3. **Kestra's API**: use our HTTP REST API to create, read, and delete KV pairs.
4. **Kestra's Terraform provider**: use the `kestra_kv` resource to create, read, and delete KV pairs.
5. **Pebble function**: use the `kv()` function to retrieve a value by key in a flow.
6. **GitHub Actions**: create, read, and delete KV pairs in your CI/CD pipeline.
7. **kestractl**: use `kestractl kv` to list, set, update, get, and delete KV pairs from the command line. See the [kestractl docs](../../kestra-cli/kestractl/index.md) for setup.

The sections below provide detailed instructions on how to create and manage KV pairs using each of these methods.

### Create new KV pairs from the UI

You can create, read, update, and delete KV pairs from the UI in the following way:

1. Navigate to the `Namespaces` page from the left navigation menu and select the namespace where you want to create the KV pair.

![navigate_to_namespace](./navigate_to_namespace.png)

2. Go to the `KV Store` tab. This is where you can see all the KV pairs associated with this namespace.

![navigate_to_keystore](./navigate_to_keystore.png)

3. Click on the `New Key-Value` button in the top right corner to create a new KV pair. Enter a name for the `Key` and assign a suitable `Type` for the value — it can be a string, number, boolean, datetime, date, duration, or JSON.

![create_kv_pair](./create_kv_pair.png)

4. Enter the value in the `Value` field.
5. Optionally, you can configure a Time to Live (TTL) for the KV pair. The dropdown contains some standard durations. You can also select `Custom duration` to enter a custom duration as a string in ISO 8601 duration format.
6. Finally, `Save` the changes. Your new KV pair should now be displayed in the list of KV pairs for that namespace.

### Update, Delete, and Copy KV pairs from the UI

You can edit, delete, or copy any KV pair by clicking on the associated button on the right side of each KV pair. The copy option copies the [Pebble expression for the KV pair](#read-kv-pairs-with-pebble) (i.e., `{{ kv('YOUR_KEY') }}`) so you can use it directly in your flow.
![edit_delete_kv_pair](./edit_delete_kv_pair.png)

## CODE: How to Create, Read, Update and Delete KV pairs in your flow code

### Create a new KV pair with the `Set` task in a flow

To create a KV pair from a flow, you can use the `io.kestra.plugin.core.kv.Set` task. Below is an example of how to create a KV pair in a flow:

```yaml
id: add_kv_pair
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: set_kv
    type: io.kestra.plugin.core.kv.Set
    key: my_key
    value: "{{ outputs.download.uri }}"
    namespace: company.team # the current namespace of the flow is used by default
    overwrite: true # whether to overwrite or fail if a value for that key already exists; default true
    ttl: P30D # optional Time to Live (TTL) for the KV pair

  - id: set_simple_kv
    type: io.kestra.plugin.core.kv.Set
    key: simple_string
    value: hello from Kestra

  - id: set_json_kv
    type: io.kestra.plugin.core.kv.Set
    key: json_kv
    value: |
      {
        "author": "Rick Astley",
        "song": "Never Gonna Give You Up"
      }

  - id: get_kv
    type: io.kestra.plugin.core.output.OutputValues
    values:
      my_key: "{{ kv('my_key') }}"
      simple_string: "{{ kv('simple_string') }}"
      favorite_song: "{{ json(kv('json_kv')).song }}"
```

You can use the `io.kestra.plugin.core.kv.Set` task to create or modify any KV pair. When modifying existing values, you can leverage the `overwrite` boolean parameter to control whether to overwrite the existing value or fail if a value for that key already exists. By default, the `overwrite` parameter is set to `true` so that the existing value is always updated.

### Read KV pairs with Pebble

The easiest way to retrieve a value by key is to use the `{{ kv('YOUR_KEY') }}` Pebble function.
Below is the full syntax of that function:

```twig
{{ kv(key='your_key_name', namespace='your_namespace_name', errorOnMissing=false) }}
```

Assuming that you retrieve the key in a flow in the same namespace as the one for which the key was created, you can simply use `"{{ kv('my_key') }}"` to retrieve the value:

```yaml
id: read_kv_pair
namespace: company.team

tasks:
  - id: log_key
    type: io.kestra.plugin.core.log.Log
    message: "{{ kv('my_key') }}"
```

When retrieving the key from another namespace, you can use the following syntax:

```yaml
id: read_kv_pair_from_another_namespace
namespace: company.team

tasks:
  - id: log_key_from_another_namespace
    type: io.kestra.plugin.core.log.Log
    message: "{{ kv('my_key', 'kestra.engineering.myproject') }}"
```

By default, when you try to retrieve a key that doesn't exist, the task using the `"{{ kv('non_existing_key') }}"` expression will fail with an error. If you prefer the task to run without error when the key doesn't exist, set the `errorOnMissing` parameter to `false` (the expression will simply return `null`):

```yaml
id: read_non_existing_kv_pair
namespace: company.team

tasks:
  - id: log_key_from_another_namespace
    type: io.kestra.plugin.core.debug.Return
    format: "{{ kv('non_existing_key', errorOnMissing=false) }}"
```

The argument names can be omitted for brevity as long as you pass the arguments in positional order, i.e., `{{ kv(key='your_key_name', namespace='your_namespace_name', errorOnMissing=false) }}` is equivalent to `{{ kv('your_key_name', 'your_namespace_name', false) }}`:

```yaml
id: read_non_existing_kv_pair
namespace: company.team

tasks:
  - id: log_key_from_another_namespace
    type: io.kestra.plugin.core.debug.Return
    format: "{{ kv('my_key', 'kestra.engineering.myproject', false) }}"
```

### Read KV pairs with the `Get` task

You can also retrieve the value of any KV pair using the `Get` task.
The `Get` task produces the `value` output, which you can use in subsequent tasks. This option is a little more verbose, but it has two benefits:

1. More declarative syntax
2. Useful when you need to pass the current state of that value to multiple downstream tasks

```yaml
id: get_kv_pair
namespace: company.team

tasks:
  - id: get
    type: io.kestra.plugin.core.kv.Get
    key: my_key
    namespace: company.team
    errorOnMissing: false

  - id: log_key_get
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.get.value }}"
```

### Read and parse JSON-type values from KV pairs

To parse JSON values in Kestra's templated expressions, make sure to wrap the `kv()` call in the `json()` function like the following: `"{{ json(kv('your_json_key')).json_property }}"`. The following example demonstrates how to parse values from JSON-type KV pairs in a flow:

```yaml
id: kv_json_flow
namespace: company.team

tasks:
  - id: set_json_kv
    type: io.kestra.plugin.core.kv.Set
    key: favorite_song
    value: |
      {
        "author": "Rick Astley",
        "song": "Never Gonna Give You Up",
        "album": {
          "name": "Whenever You Need Somebody",
          "release_date": "1987-11-16"
        }
      }

  - id: parse_json_kv
    type: io.kestra.plugin.core.log.Log
    message:
      - "Author: {{ json(kv('favorite_song')).author }}"
      - "Song: {{ json(kv('favorite_song')).song }}"
      - "Album name: {{ json(kv('favorite_song')).album.name }}"
      - "Album release date: {{ json(kv('favorite_song')).album.release_date }}"

  - id: get
    type: io.kestra.plugin.core.kv.Get
    key: favorite_song

  - id: parse_json_from_kv
    type: io.kestra.plugin.core.log.Log
    message: "Album name: {{ json(outputs.get.value).album.name }}"
```

### Read keys by prefix with the `GetKeys` task

If you want to check whether values already exist for a given key prefix, you can search keys by prefix:

```yaml
id: get_keys_by_prefix
namespace: company.team

tasks:
  - id: get
    type: io.kestra.plugin.core.kv.GetKeys
    prefix: "test_"
    namespace: company.team

  - id: log_key_prefix
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.get.keys }}"
```

The output is a list of keys; if no keys are found, an empty list is returned.

### Delete a KV pair with the `Delete` task

The `io.kestra.plugin.core.kv.Delete` task produces the boolean output `deleted` to confirm whether a given KV pair was deleted:

```yaml
id: delete_kv_pair
namespace: company.team

tasks:
  - id: kv
    type: io.kestra.plugin.core.kv.Delete
    key: my_key
    namespace: company.team
    errorOnMissing: false

  - id: check_if_deleted
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.kv.deleted }}"
```

---

## API: How to Create, Read, Update and Delete KV pairs via REST API

Let's look at how you can interact with the KV Store via the REST API.

### Create a KV pair

The API call to set a KV pair follows this structure:

```bash
curl -X PUT -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/{namespace}/kv/{key} -d ''
```

For example:

```bash
curl -X PUT -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/company.team/kv/my_key -d '"Hello World"'
```

The above `curl` command creates the KV pair with the key `my_key` and the string value `Hello World` in the `company.team` namespace. The API call does not return any response body.
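When scripting these calls, note that the request body must be valid JSON: plain string values are JSON-quoted (as in `-d '"Hello World"'` above), while JSON-type values are serialized objects. A minimal sketch of building such a request with Python's standard library (the URL layout mirrors the `curl` examples above; actually sending the request requires a reachable Kestra instance):

```python
import json
from urllib.request import Request

def build_kv_put(base_url: str, namespace: str, key: str, value) -> Request:
    """Build a PUT request for the KV endpoint; the value is JSON-encoded."""
    url = f"{base_url}/api/v1/main/namespaces/{namespace}/kv/{key}"
    body = json.dumps(value).encode()  # '"Hello World"' for strings, '{...}' for objects
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})

# A string value is sent as a JSON-quoted string:
req = build_kv_put("http://localhost:8080", "company.team", "my_key", "Hello World")
```

Submitting the request, e.g., with `urllib.request.urlopen(req)`, would then create the pair just like the `curl` call above.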
### Read all keys in the namespace

You can get all KV pairs using:

```bash
curl -X GET -H "Content-Type: application/json" http://localhost:8080/api/v1/main/kv/
```

You can also use `filters` to get all KV pairs from a specific namespace (replace `namespace-name`):

```bash
curl -G "http://localhost:8080/api/v1/main/kv" \
  --data-urlencode "filters[namespace][EQUALS]=namespace-name" \
  -H "Authorization: Bearer "
```

Older versions of Kestra may use the path to specify a namespace:

```bash
curl -X GET -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/{namespace}/kv/{key}
```

:::alert{type="info"}
As a general tip, your Kestra instance exposes an interactive API reference at https:///api which lists all available endpoints for your installed version.
:::

The output is returned as a JSON array of all keys in the namespace:

```json
[
  {"key":"my_key","creationDate":"2024-07-27T06:10:33.422Z","updateDate":"2024-07-27T06:11:08.911Z"},
  {"key":"test_key","creationDate":"2024-07-27T04:37:18.196Z","updateDate":"2024-07-27T04:37:18.196Z"}
]
```

### Delete a KV pair

You can delete any KV pair using the following API call:

```bash
curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/{namespace}/kv/{key}
```

This call returns a boolean indicating whether the key was deleted.
For example, the following `curl` command returns `false` because the key `non_existing_key` does not exist:

```bash
curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/company.team/kv/non_existing_key
```

However, when we try to delete a key `my_key` which exists in the `company.team` namespace, the same API call returns `true`:

```bash
curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/company.team/kv/my_key
```

---

## TERRAFORM: How to Create, Read, Update and Delete KV pairs via Terraform

### Create a KV pair

You can create a KV pair via Terraform by using the `kestra_kv` resource. Below is an example of how to create a KV pair:

```hcl
resource "kestra_kv" "my_key" {
  namespace = "company.team"
  key       = "my_key"
  value     = "Hello World"
  type      = "STRING"
}
```

### Read a KV pair

You can read a KV pair via Terraform by using the `kestra_kv` data source. Below is an example of how to read a KV pair:

```hcl
data "kestra_kv" "new" {
  namespace = "company.team"
  key       = "my_key"
}
```

As with anything in Terraform, you can manage the state of your KV resources by adjusting the Terraform code and running the `terraform apply` command to create, update, or delete your KV pairs.

---

# Namespace Files in Kestra: Manage Project Assets
URL: https://kestra.io/docs/concepts/namespace-files

> Manage Namespace Files in Kestra and use them in your flows. Store scripts, configs, and assets at the namespace level for centralized file management.

Manage Namespace Files and how to use them in your flows.
Namespace Files are files tied to a given namespace. You can think of Namespace Files as the equivalent of a project in your local IDE or a copy of your Git repository. Namespace Files can hold Python files, R or Node.js scripts, SQL queries, dbt or Terraform projects, and much more. You can synchronize your Git repository with a specific namespace to orchestrate dbt, Terraform, Ansible, or any other project that contains code and configuration files.

Once you add any file to a namespace, you can reference it inside your flows using the `read()` function in EVERY task or trigger from the same namespace. For instance, if you add a SQL query called `my_query.sql` to the `queries` directory in the `company.team` namespace, you can reference it in any `Query` task or any JDBC Trigger like so: `{{ read('queries/my_query.sql') }}`.

Here is an example showing how you can use the `read()` function in a [ClickHouse Trigger](/plugins/plugin-jdbc-clickhouse/io.kestra.plugin.jdbc.clickhouse.trigger) to read a SQL query stored as a Namespace File:

```yaml
id: jdbc_trigger
namespace: company.team

tasks:
  - id: for_each_row
    type: io.kestra.plugin.core.flow.ForEach
    values: "{{ trigger.rows }}"
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ json(taskrun.value) }}"

triggers:
  - id: query_trigger
    type: io.kestra.plugin.jdbc.clickhouse.Trigger
    interval: "PT5M"
    url: jdbc:clickhouse://127.0.0.1:56982/
    username: "{{ secret('CLICKHOUSE_USERNAME') }}"
    password: "{{ secret('CLICKHOUSE_PASSWORD') }}"
    sql: "{{ read('queries/my_query.sql') }}" # 🚀 The read() function reads the content of the file as a string!
    fetchType: FETCH
```

:::alert{type="info"}
The `namespaceFiles.enabled: true` property is not required here — it is only needed to inject an entire directory of namespace files into the working directory of a script task. If you only need to read a file's contents, use `read()` without mounting; mounting is for when the task needs files on disk.
:::

## Why use Namespace Files

Namespace Files offer a simple way to organize your code and configuration files. Before Namespace Files, you had to store your code and configuration files in a Git repository and then clone that repository at runtime using the `git.Clone` task. With Namespace Files, you can store your code and configuration files directly in Kestra's internal storage backend. That storage backend can be your local directory or an S3 bucket to ensure maximum security and privacy.

Namespace Files make it easy to:

- orchestrate Python, R, Node.js, SQL, and more without having to worry about code dependencies, packaging, and deployments — simply add your code in the embedded Code Editor or sync your Git repository with a given namespace
- manage your code for a given project or team in one place, even if those files are stored in different Git repositories or even different Git providers
- share your code and configuration files between workflows and team members in your organization
- orchestrate complex projects that require the code to be separated into multiple scripts, queries, or modules.

## How to add Namespace Files

### Embedded code editor

While creating or editing a flow, you can access Namespace Files from the **Namespace Files** tab. You can easily write, import, or paste custom scripts, queries, and configuration files.

To start, add a new file (e.g., a Python script). Add a folder named `scripts` and a file called `hello.py` with the following content:

```python
print("Hello from the Editor!")
```

Once you have added the file, you can use it in your flow:

```yaml
id: editor
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
    commands:
      - python scripts/hello.py
```

The **Execute** button allows you to run your flow directly from the Code Editor. Click on the **Execute** button to run your flow.
You then see the Execution running, and once you navigate to the **Logs** tab, you should see a friendly message `Hello from the Editor!` in the logs.

### Namespace Files Revision History

Namespace Files include revision history just like flows, so you can inspect or roll back earlier uploads without leaving the Editor.

- The first upload of a path is stored as `queries/my_query.sql` and treated as version 0 for backward compatibility.
- Each subsequent upload keeps `queries/my_query.sql` as the latest version while adding suffixed revisions such as `queries/my_query.sql.v1`, `queries/my_query.sql.v2`, and so on.
- Older revisions remain available under their suffixed filenames, letting you compare and restore as needed.

To access a file's revision history, right-click on the file.

![Namespace file revision history](./namespace-file-revision-history.png)

From the history, you can view, compare, and restore prior versions.

![Restore namespace file revision placeholder](./namespace-file-restore.png)

From the **Revisions** list, you can delete a given revision or all revisions older than the selected one. You will be prompted to confirm this choice, as there is no way to restore a revision once it has been deleted.

To keep your version history clean, you can purge a chosen number of Namespace File revisions or revisions older than a certain date. Refer to the [Purge documentation](../../10.administrator-guide/purge/index.md#purge-namespace-files).

### PushNamespaceFiles and SyncNamespaceFiles tasks

There are two tasks that help you automatically manage your namespace files with Git. This allows you to sync the latest changes from a Git repository.
This example pushes Namespace Files you already have in Kestra to a Git repository for you:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushNamespaceFiles
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/scripts
    branch: dev
    namespace: company.team
    files:
      - "example.py"
    gitDirectory: _files
    commitMessage: "add namespace files"
    dryRun: true
```

This example syncs Namespace Files from a Git repository to your Kestra instance:

```yaml
id: sync_files_from_git
namespace: system

tasks:
  - id: sync_files
    type: io.kestra.plugin.git.SyncNamespaceFiles
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/scripts
    branch: main
    namespace: git
    gitDirectory: _files
    dryRun: true
```

Check out the dedicated guides for more information:

- [PushNamespaceFiles](../../15.how-to-guides/pushnamespacefiles/index.md)
- [SyncNamespaceFiles](../../15.how-to-guides/syncnamespacefiles/index.md)

### GitHub Actions CI/CD

Use the official Kestra [GitHub Actions](../../version-control-cicd/cicd/01.github-action/index.md) to upload namespace files directly from your repository. This is ideal for promoting configuration, scripts, or other assets that live alongside your code.
Example workflow deploying the `scripts/` folder to the `prod` namespace using the `deploy-namespace-files` action:

```yaml
name: Kestra Namespace Files
on: [push]

jobs:
  upload-namespace-files:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5

      - name: Upload scripts folder to prod
        uses: kestra-io/github-actions/deploy-namespace-files@main
        with:
          localPath: ./scripts   # folder in the repo
          namespacePath: scripts # destination path in the namespace
          namespace: prod
          server: ${{ secrets.KESTRA_HOSTNAME }}
          # Choose one auth method:
          # apiToken: ${{ secrets.KESTRA_API_TOKEN }} # Enterprise Edition
          user: ${{ secrets.KESTRA_USERNAME }} # Basic auth
          password: ${{ secrets.KESTRA_PASSWORD }}
```

:::alert{type="info"}
- Store credentials as GitHub Secrets. Provide `tenant` when targeting multi-tenant Enterprise environments.
- Ensure the service account role grants namespace file permissions (and `FLOWS` when deploying flows) to your target namespace.
:::

### Terraform provider

You can use the `kestra_namespace_file` resource from the official [Kestra Terraform Provider](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs) to deploy all your custom script files from a specific directory to a given Kestra namespace. Below is a simple example showing how you can synchronize an entire directory of scripts from the directory `src` with the `company.team` namespace using Terraform:

```hcl
resource "kestra_namespace_file" "prod_scripts" {
  for_each  = fileset(path.module, "src/**")
  namespace = "company.team"
  filename  = each.value # or "/${each.value}"
  content   = file(each.value)
}
```

### Deploy namespace files via kestractl

You can upload namespace files from the command line using [kestractl](../../kestra-cli/kestractl/index.md).
The following example synchronizes an entire local directory with the `prod` namespace: ```bash kestractl nsfiles upload prod ./scripts --override ``` To upload to a specific path within the namespace rather than the root: ```bash kestractl nsfiles upload prod ./assets --path resources --override --fail-fast ``` The `--override` flag replaces existing files; `--fail-fast` stops on the first error rather than continuing. `kestractl nsfiles` also supports `list`, `get`, and `delete` for inspecting and removing individual files. Run `kestractl nsfiles --help` for the full reference. ## How to use Namespace Files in your flows There are multiple ways to use Namespace Files in your flows. You can use the `read()` function to read the content of a file as a string, point to the file path in the supported tasks, or use a dedicated task to retrieve it as an output. :::alert{type="info"} Kestra 0.24 introduced a universal file protocol that simplifies accessing files — local or namespace — in your flow. For more details, refer to the [File Access documentation page](../file-access/index.md). ::: Usually, pointing to a file location, rather than reading the file's content, is required when you want to use a file as an input to a CLI command (e.g., in a `Commands` task such as `io.kestra.plugin.scripts.python.Commands` or `io.kestra.plugin.scripts.node.Commands`). In all other cases, the `read()` function can be used to read the content of a file as a string (e.g., in `Query` or `Script` tasks). You can also use the `io.kestra.plugin.core.flow.WorkingDirectory` task to read namespace files and then use them in child tasks that require a file path in CLI commands, for example: `python scripts/hello.py`. ### The `read()` function The script in the first section used the `read()` function to read the content of the `scripts/hello.py` file as a string using the expression `"{{ read('scripts/hello.py') }}"`. 
It's important to remember that this function reads **the content of the file as a string**. Therefore, you should use it only in tasks that expect a string as input, such as `io.kestra.plugin.scripts.python.Script` or `io.kestra.plugin.scripts.node.Script`, rather than `io.kestra.plugin.scripts.python.Commands` or `io.kestra.plugin.scripts.node.Commands`.

The `read()` function allows you to read the content of a Namespace File stored in Kestra's internal storage backend. It takes a single argument: the absolute path to the file you want to read. The path must point to a file stored in the **same namespace** as the flow you are executing.

In this example, we have a namespace file called `example.txt` that contains the text `Hello, World!`. We can print its content to the logs using `{{ read('example.txt') }}`:

```yaml
id: files
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ read('example.txt') }}"
```

### `namespaceFiles.enabled` on supported tasks

With supported tasks, such as the `io.kestra.plugin.scripts` group, we can access files by their path after enabling the task to read namespace files.
Below is a simple `weather.py` script that reads a secret to talk to a Weather Data API:

```python
import requests

api_key = '{{ secret("WEATHER_DATA_API_KEY") }}'
url = f"https://api.openweathermap.org/data/2.5/weather?q=Paris&APPID={api_key}"
weather_data = requests.get(url)
print(weather_data.json())
```

Next is a flow that uses the script:

```yaml
id: weather_data
namespace: company.team

tasks:
  - id: get_weather
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
      include:
        - scripts/weather.py
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/pydata:latest
    commands:
      - python scripts/weather.py
```

#### `namespaceFiles` property

The example above uses the `include` field so that only the `scripts/weather.py` file is accessible to the task. We can control which namespace files are available to our flow with the `namespaceFiles` property.

`namespaceFiles` has several configurable attributes:

- `enabled`: when set to `true`, makes all files in that namespace visible to the task
- `include`: specifies files you want to be accessible to the task
- `exclude`: specifies files you don't want to be accessible to the task
- `namespaces`: specifies a list of namespaces to search for files.
- `ifExists`: specifies what to do when a Namespace File already exists in the working directory
- `folderPerNamespace`: a boolean property that mounts each namespace's files in a separate directory rather than mounting all files at the root of the working directory (set to `false` by default)

The `namespaces` attribute can be used as in the following example:

```yaml
id: namespace_files_example
namespace: dev.test

tasks:
  - id: namespace
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
      namespaces:
        - "dev.test"
        - "company"
    commands:
      - python test.py

  - id: namespace2
    type: io.kestra.plugin.scripts.python.Script
    namespaceFiles:
      enabled: true
    script: "{{ read('test.py') }}"
```

Files are loaded in the order the namespaces are listed, and only the latest version of a file is kept: if a file with the same name is present in both the first and the second namespace, only the version from the second namespace is loaded. In the first task, the `test.py` file from the `company` namespace is used because namespaces are loaded from top to bottom, so when multiple namespaces contain a file with the same name, the last listed namespace takes priority. For the second task, the `test.py` file from the `dev.test` namespace is used because no namespace has been specified in the `read()` function. If you want to fetch the `test.py` script from a different namespace, you need to define it explicitly, as follows: `"{{ read('test.py', namespace='company.team') }}"`.
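Building on the attributes described above, below is a minimal sketch combining `folderPerNamespace` with `namespaces`. The flow id, task id, and the `ls -R` inspection command are illustrative, and the exact directory layout may vary by instance:

```yaml
id: folder_per_namespace_example
namespace: dev.test

tasks:
  - id: list_files
    type: io.kestra.plugin.scripts.shell.Commands
    namespaceFiles:
      enabled: true
      folderPerNamespace: true  # mount each namespace's files in its own directory
      namespaces:
        - "dev.test"
        - "company"
    commands:
      - ls -R  # inspect where each namespace's files were mounted
```

With `folderPerNamespace: false` (the default), all files would instead land at the root of the working directory, and the last listed namespace would win on name collisions.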
The `ifExists` attribute controls what happens when a task mounts a Namespace File that already exists in the working directory. It has four possible options:

- `OVERWRITE`: the default; adds a debug log noting that the file was overwritten
- `FAIL`: logs an ERROR and fails the task
- `WARN`: logs a WARNING but continues running the execution
- `IGNORE`: doesn't overwrite the file or log any warnings

For example, in the following flow, the second instance of `sample_python.py` will overwrite the first:

```yaml
id: test_workdir_issue
namespace: prod

tasks:
  - id: git_wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone
        type: io.kestra.plugin.git.Clone
        branch: main
        url: https://github.com/kestra-io/examples

      - id: python_command_1
        type: io.kestra.plugin.scripts.python.Commands
        namespaceFiles:
          enabled: true
        commands:
          - python scripts/sample_python.py

      - id: python_command_2
        type: io.kestra.plugin.scripts.python.Commands
        namespaceFiles:
          enabled: true
          ifExists: OVERWRITE
        commands:
          - python scripts/sample_python.py
```

### Namespace tasks

You can use the Namespace tasks to upload, download, and delete namespace files in Kestra. In the example below, we have a namespace file called `example.ion` that we want to convert to a `.csv` file. We can use the `DownloadFiles` task to generate an output that contains the file so we can easily pass it dynamically to the `IonToCsv` task.
```yaml
id: files
namespace: company.team

tasks:
  - id: namespace
    type: io.kestra.plugin.core.namespace.DownloadFiles
    namespace: company.team
    files:
      - example.ion

  - id: ion_to_csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{ outputs.namespace.files['/example.ion'] }}"
```

Read more about the tasks below:

- [UploadFiles](/plugins/core/namespace/io.kestra.plugin.core.namespace.uploadfiles)
- [DownloadFiles](/plugins/core/namespace/io.kestra.plugin.core.namespace.downloadfiles)
- [DeleteFiles](/plugins/core/namespace/io.kestra.plugin.core.namespace.deletefiles)

## Include / exclude namespace files

You can selectively include or exclude namespace files. Let's say that you have multiple namespace files present: `file1.txt`, `file2.txt`, `file3.json`, and `file4.yml`. You can selectively include multiple files using the `include` attribute under `namespaceFiles`, as shown below:

```yaml
id: include_namespace_files
namespace: company.team

tasks:
  - id: include_files
    type: io.kestra.plugin.scripts.shell.Commands
    namespaceFiles:
      enabled: true
      include:
        - file1.txt
        - file3.json
    commands:
      - ls
```

The `include_files` task lists all the included files. In the example above, these are `file1.txt` and `file3.json`, as only those were included from the namespace through `include`.

The `exclude` attribute works the other way around: it makes all the namespace files available except those specified under `exclude`.

```yaml
id: exclude_namespace_files
namespace: company.team

tasks:
  - id: exclude_files
    type: io.kestra.plugin.scripts.shell.Commands
    namespaceFiles:
      enabled: true
      exclude:
        - file1.txt
        - file3.json
    commands:
      - ls
```

The `exclude_files` task from the above flow lists `file2.txt` and `file4.yml`, that is, all the namespace files except those that were excluded using `exclude`.

### Pattern matching rules for `include` / `exclude`

- Patterns that do **not** start with `/` are automatically prefixed with `**/`, so they match recursively (e.g., `file1.txt` becomes `**/file1.txt`).
- Patterns that start with `/` match from the namespace root only (e.g., `/config/settings.json`).
- You can force explicit pattern types with `glob:` or `regex:`:
  - `glob:/src/**/*.py`
  - `regex:^src/.*\.py$`

Examples:

```yaml
namespaceFiles:
  enabled: true
  include:
    # Root-only matches
    - /file1.txt
    - /config/settings.json

    # Recursive matches (auto **/ prefix)
    - file1.txt  # becomes **/file1.txt
    - src/**     # becomes **/src/**

    # Explicit glob
    - glob:/src/**/*.py
    - glob:config/*.json

    # Regex
    - regex:^src/.*\.py$
    - regex:.*test.*\.json
```

:::alert{type="warning"}
Patterns without a leading `/` are automatically prefixed with `**/`. Use `/…` or explicit `glob:`/`regex:` patterns if you want root-only matching. Patterns that already contain `**` (for example `sg_base_etl/**`) may be unintentionally transformed; use `/sg_base_etl/**` or `glob:/sg_base_etl/**` as a workaround.
:::

---

# Pebble Templating in Kestra: Dynamic Variables
URL: https://kestra.io/docs/concepts/pebble
> Dynamically render variables, inputs, and outputs in Kestra using Pebble templating. Use expressions to build flexible, data-driven workflows.

Dynamically render variables, inputs and outputs.

Pebble is a Java templating engine inspired by [Twig](https://twig.symfony.com/), with a syntax similar to the [Python Jinja Template Engine](https://palletsprojects.com/p/jinja/). Kestra uses it to dynamically render variables, inputs, and outputs within the execution context.
## Reading inputs

When using the `inputs` property in a flow, you can access the corresponding values via the `inputs` variable in your tasks.

```yaml
id: input_string
namespace: company.team

inputs:
  - id: name
    type: STRING

tasks:
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: "Hello 👋, my name is {{ inputs.name }}"
```

## Reading task outputs

Most of Kestra's tasks expose output values. You can access those outputs in other tasks using the `outputs.<taskId>.<outputName>` syntax. Every task's outputs can be found in the corresponding task documentation.

In the example below, we use the `value` output of the `io.kestra.plugin.core.debug.Return` task in the downstream task.

```yaml
id: input_string
namespace: company.team

inputs:
  - id: name
    type: STRING

tasks:
  - id: say_hello
    type: io.kestra.plugin.core.debug.Return
    format: "Hello 👋, my name is {{ inputs.name }}"

  - id: can_you_repeat
    type: io.kestra.plugin.core.log.Log
    message: '{{ outputs.say_hello.value }}'
```

## Dynamically render a task with `TemplatedTask`

Since Kestra 0.16.0, you can use the `TemplatedTask` task to fully template all task properties using Pebble. This way, all task properties and their values can be dynamically rendered based on your custom inputs, variables, and outputs from other tasks.
Below is an example of how to use the [TemplatedTask](/plugins/core/templating/io.kestra.plugin.core.templating.templatedtask) to create a Databricks job using dynamic properties: ```yaml id: templated_databricks_job namespace: company.team inputs: - id: host type: STRING - id: clusterId type: STRING - id: taskKey type: STRING - id: pythonFile type: STRING - id: sparkPythonTaskSource type: ENUM defaults: WORKSPACE values: - GIT - WORKSPACE - id: maxWaitTime type: STRING defaults: "PT30M" tasks: - id: templated_spark_job type: io.kestra.plugin.core.templating.TemplatedTask spec: | type: io.kestra.plugin.databricks.job.CreateJob authentication: token: "{{ secret('DATABRICKS_API_TOKEN') }}" host: "{{ inputs.host }}" jobTasks: - existingClusterId: "{{ inputs.clusterId }}" taskKey: "{{ inputs.taskKey }}" sparkPythonTask: pythonFile: "{{ inputs.pythonFile }}" sparkPythonTaskSource: "{{ inputs.sparkPythonTaskSource }}" waitForCompletion: "{{ inputs.maxWaitTime }}" ``` Note how in this example, the `waitForCompletion` property is templated using Pebble even though that property is not dynamic. The same is true for the `sparkPythonTaskSource` property. Without the `TemplatedTask` task, you would not be able to pass those values from inputs. --- ## Date formatting Pebble can be very useful for making small transformations on the fly without the need to use Python or another dedicated programming language. For instance, we can use the `date` filter to format date values: `'{{ inputs.my_date | date("yyyyMMdd") }}'` ## Coalesce operator to conditionally use trigger or execution date Most of the time, a flow will be triggered automatically. Either on schedule or based on external events. It’s common to use the date of the execution to process the corresponding data and make the flow dependent on time. With Pebble, you can use the `trigger.date` to get the date of the executed trigger. Still, sometimes you may want to manually execute a flow. 
In this case, the `trigger.date` variable won’t be suitable. In this scenario, you can use the `execution.startDate` variable, which returns the execution start date. To support both use cases, use the coalesce operator `??`. The example below shows how to apply it in a flow.

```yaml
id: pebble_date_trigger
namespace: company.team

tasks:
  - id: return_date
    type: io.kestra.plugin.core.debug.Return
    format: '{{ trigger.date ?? execution.startDate | date("yyyy-MM-dd") }}'

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "* * * * *"
```

## Parsing objects & lists using jq

Sometimes, outputs return nested objects or lists. To parse those elements, you may leverage `jq`. jq lets you slice, filter, map, and transform structured data with the same ease that `sed`, `awk`, `grep`, and similar Linux commands let you manipulate strings.

Consider the following flow:

```yaml
id: object_example
namespace: company.team

inputs:
  - id: data
    type: JSON
    defaults: '{"value": [1, 2, 3]}'

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "{{ inputs.data }}"
```

The expression `{{ inputs.data.value }}` returns the list `[1, 2, 3]`. The expression `{{ inputs.data.value | jq(".[1]") | first }}` returns `2`: `jq(".[1]")` accesses the second value of the list and returns an array with one element, and `first` then accesses the value itself.

`{{ inputs | jq(".data.value[1]") | first }}` also works, because jq can parse any object in the Kestra context.

You can troubleshoot complex Pebble expressions using the **Debug Expression** button in the Outputs tab of a flow's Execution page in the UI. It's helpful to validate how complex objects will be parsed.

## Using conditions in Pebble

In some tasks, such as the `If` or `Switch` tasks, you need to provide some conditions.
You can use the Pebble syntax to use previous task outputs within those conditions: ```yaml id: test-object namespace: company.team inputs: - id: data type: JSON defaults: '{"value": [1, 2, 3]}' tasks: - id: if type: io.kestra.plugin.core.flow.If condition: '{{ inputs.data.value | jq(".[2]") | first == 3}}' then: - id: when_true type: io.kestra.plugin.core.log.Log message: 'Condition was true' else: - id: when_false type: io.kestra.plugin.core.log.Log message: 'Condition was false' ``` --- # Replay Executions in Kestra: Rerun from Any Task URL: https://kestra.io/docs/concepts/replay > Replay Kestra workflow executions from any chosen task run. Re-trigger failed or incomplete executions without starting from scratch for faster recovery. Replay allows you to re-run a workflow execution from any chosen task run.
By using Replay, you can re-run a workflow execution from any selected task run. To do that, simply go to the Gantt view of the chosen workflow execution (it doesn't need to be a Failed execution, it can be an execution in any state) and click on the task run you want to re-run. Additionally, you can re-run an execution or bulk executions from the **Executions** tab with the option to use the latest revision. ![replay6](./replay6.png) Replays are extremely useful for iterative development and reprocessing data. Imagine the following scenario: you have a workflow that extracts a large compressed CSV dataset and you want to transform it into a Parquet file with a specific schema. ```yaml id: divvy_tripdata namespace: company.team variables: file_id: "{{ execution.startDate | dateAdd(-3, 'MONTHS') | date('yyyyMM') }}" tasks: - id: get_zipfile type: io.kestra.plugin.core.http.Download uri: "https://divvy-tripdata.s3.amazonaws.com/{{ render(vars.file_id) }}-divvy-tripdata.zip" - id: unzip type: io.kestra.plugin.compress.ArchiveDecompress algorithm: ZIP from: "{{ outputs.get_zipfile.uri }}" - id: convert type: io.kestra.plugin.serdes.csv.CsvToIon from: "{{outputs.unzip.files[render(vars.file_id) ~ '-divvy-tripdata.csv']}}" - id: to_parquet type: io.kestra.plugin.serdes.avro.IonToAvro # render(vars.file_id) from: "{{ outputs.convert.uri }}" datetimeFormat: "yy-MM-dd' 'HH:mm:ss" schema: | { "type": "record", "name": "Ride", "namespace": "com.example.bikeshare", "fields": [ {"name": "ride_id", "type": "string"}, {"name": "rideable_type", "type": "string"}, {"name": "started_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}, {"name": "ended_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}, {"name": "start_station_name", "type": "string"}, {"name": "start_station_id", "type": "string"}, {"name": "end_station_name", "type": "string"}, {"name": "end_station_id", "type": "string"}, {"name": "start_lat", "type": "double"}, {"name": "start_lng", 
"type": "double"}, { "name": "end_lat", "type": ["null", "double"], "default": null }, { "name": "end_lng", "type": ["null", "double"], "default": null }, {"name": "member_casual", "type": "string"} ] } ``` When you run the above workflow, you should see an error in the `to_parquet` task. From the logs, you are able to see that the error is due to a misconfigured date format in the `datetimeFormat` field — in fact, the date format should have a full year, not just a two-digit year: `"yyyy-MM-dd' 'HH:mm:ss"`. You ask [AI](../../ai-tools/ai-copilot/index.md) to fix the flow for you, or you correct the error yourself in the workflow code and save it. ![Fix with AI](./replay-ai-fix.png) ![AI Suggestion](./ai-suggestion.png) :::collapse{title="Full corrected flow code"} ```yaml id: divvy_tripdata namespace: company.team variables: file_id: "{{ execution.startDate | dateAdd(-3, 'MONTHS') | date('yyyyMM') }}" tasks: - id: get_zipfile type: io.kestra.plugin.core.http.Download uri: "https://divvy-tripdata.s3.amazonaws.com/{{ render(vars.file_id) }}-divvy-tripdata.zip" - id: unzip type: io.kestra.plugin.compress.ArchiveDecompress algorithm: ZIP from: "{{ outputs.get_zipfile.uri }}" - id: convert type: io.kestra.plugin.serdes.csv.CsvToIon from: "{{outputs.unzip.files[render(vars.file_id) ~ '-divvy-tripdata.csv']}}" - id: to_parquet type: io.kestra.plugin.serdes.parquet.IonToParquet from: "{{ outputs.convert.uri }}" datetimeFormat: "yyyy-MM-dd HH:mm:ss.SSS" schema: | { "type": "record", "name": "Ride", "namespace": "com.example.bikeshare", "fields": [ {"name": "ride_id", "type": "string"}, {"name": "rideable_type", "type": "string"}, {"name": "started_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}, {"name": "ended_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}, {"name": "start_station_name", "type": "string"}, {"name": "start_station_id", "type": "string"}, {"name": "end_station_name", "type": "string"}, {"name": "end_station_id", "type": 
"string"}, {"name": "start_lat", "type": "double"}, {"name": "start_lng", "type": "double"}, { "name": "end_lat", "type": ["null", "double"], "default": null }, { "name": "end_lng", "type": ["null", "double"], "default": null }, {"name": "member_casual", "type": "string"} ] } ``` ::: Now, you can go to the previously failed Execution and click on the `to_parquet` task run to re-run it (either from the Gantt or from the Logs view). ![Replay Task](./replay-task.png) Now select the latest revision of the flow code that contains the fix. ![Latest Revision](./latest-revision.png) This re-runs the task with the new (corrected!) revision of the flow code. You can inspect the logs and verify that the task now completes successfully. The attempt number increments to show that this is a new run of the task. ![Error Free](./task-count.png) The **Overview** tab will additionally show the new attempt number and the new revision of the flow code that was used during Replay. ![Replay Execution Overview](./replay-execution-overview.png) Replay lets you re-run a failed task with the corrected flow code without rerunning tasks that already completed successfully. --- # Flow Revisions in Kestra: Versioning and Rollbacks URL: https://kestra.io/docs/concepts/revision > Track and manage flow versions in Kestra with built-in revision history. Roll back to any previous version to undo changes and maintain reliability. Manage versions of flows.
Flows are versioned by default. Whenever you make any changes to your flows, a new revision is created. This allows you to rollback to a previous version of your flow if needed. If you navigate to a specific flow and go to the **Revisions** tab, you will see a list of all revisions of that flow. You can then compare the differences between two revisions side-by-side or line-by-line and rollback to a previous revision if needed. ![revisions](./revisions.png) --- # Secrets in Kestra: Store Sensitive Values Securely URL: https://kestra.io/docs/concepts/secret > Store and access sensitive information securely in Kestra. Use Secrets to protect API keys, passwords, and credentials without exposing plain-text values. Store sensitive information securely. Secrets are a mechanism that allows you to securely store sensitive information, such as passwords and API keys, and retrieve them in your flows.
To retrieve secrets in a flow, use the `secret()` function, e.g., `"{{ secret('API_TOKEN') }}"`. You can leverage your existing secrets manager as a secrets backend. Your flows often need to interact with external systems. To do that, they need to programmatically authenticate using passwords or API keys. Secrets help you securely store such variables and avoid hard-coding sensitive information within your workflow code. You can leverage the `secret()` function to retrieve sensitive variables within your flow code. ## When should I use Secrets? Use **Secrets** for static sensitive values such as API keys, passwords, webhook URLs, certificates, and long-lived tokens. Use [Credentials](../../07.enterprise/03.auth/credentials/index.md) when Kestra needs to manage reusable server-to-server authentication for supported integrations, such as minting or refreshing short-lived access tokens at runtime. In short: - use **Secrets** for protected values - use **Credentials** for managed authentication objects Credentials can also reference Secrets for sensitive inputs such as client secrets, private keys, and certificates. ## Secrets in the Enterprise Edition From the **Secrets** tab, you can edit, delete, and copy your secret to your clipboard as a Pebble expression for use in a flow, such as `"{{ secret('API_TOKEN') }}"`. ![Secrets EE](./secrets-ee-0.png) ### Adding a new Secret from the UI If you are using a managed Kestra version, you can add **new Secrets** directly from the UI. In the left navigation menu, go to **Namespaces** and select the namespace to which you want to add a new secret. Next, add a new secret within the Secrets tab. ![Secrets EE](./secrets-ee-1.png) Here, we add a new secret with a key `MY_SECRET`. You can also include a short description and tags. 
![Secrets EE - new Secret](./secrets-ee-2.png) ### Using secrets in your flows For a concrete example of using secrets in flows, check out our dedicated [How-To Guide on Secrets](../../15.how-to-guides/secrets/index.md). ### Secret management backends Kestra [Enterprise Edition](../../07.enterprise/index.mdx) provides additional secret management backends and integrations with secrets managers. See the [Secrets Manager](../../07.enterprise/02.governance/secrets-manager/index.md) page for more details. ## Secrets in the Open-Source version When using the open-source version, sensitive variables can be managed using base64-encoded environment variables. The section below demonstrates several ways to encode those values and use them in your Kestra instance. ### Manual encoding using a CLI command Imagine that so far, you were setting the following environment variable: ```bash export MYPASSWORD=myPrivateCode ``` Below is how you can encode the sensitive value of that environment variable: ```bash echo -n "myPrivateCode" | base64 ``` This outputs the value: `bXlQcml2YXRlQ29kZQ==` To use that value as a Secret in your Kestra instance, you would need to add a prefix `SECRET_` to the variable key (here: `SECRET_MYPASSWORD`) and set that key to the encoded value: ```bash export SECRET_MYPASSWORD=bXlQcml2YXRlQ29kZQ== ``` If you want to add the environment variable to the `kestra` container section in a [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml#L22), it would look as follows: ```yaml kestra: image: kestra/kestra:latest environment: SECRET_MYPASSWORD: bXlQcml2YXRlQ29kZQ== ``` This secret can be used in a flow using the `{{ secret('MYPASSWORD') }}` syntax, and it will be base64-decoded during flow execution. 
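As a minimal sketch of the syntax above (the flow id, task id, and log message are illustrative), a flow can reference the decoded value anywhere a Pebble expression is allowed. Here, the standard Pebble `length` filter is used so the log reports the size of the value rather than echoing the secret itself:

```yaml
id: use_secret
namespace: company.team

tasks:
  - id: check_password
    type: io.kestra.plugin.core.log.Log
    # secret('MYPASSWORD') returns the base64-decoded value of SECRET_MYPASSWORD
    message: "The password is {{ secret('MYPASSWORD') | length }} characters long"
```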
Make sure not to include the prefix `SECRET_` when calling the `secret('MYPASSWORD')` function, as this prefix is only there in the environment variable definition to prevent Kestra from treating other system variables as secrets (for better performance and increased security). Lastly, if you want to reference any non-encoded environment variables in your flow definitions, you can always use the syntax `{{ envs.lowercase_environment_variable_key }}`. :::alert{type="warning"} Kestra has built-in protection to prevent its logs from revealing any encoded secret you have defined. ::: ### Convert all variables in an `.env` file The previous section showed the process for one Secret, but if you have tens or hundreds of them, then the `.env` is better suited. Let's assume that you have an `.env` file with the following content: ```bash MYPASSWORD=password GITHUB_ACCESS_TOKEN=mypat AWS_ACCESS_KEY_ID=myawsaccesskey AWS_SECRET_ACCESS_KEY=myawssecretaccesskey ``` Make sure to keep the last line empty, otherwise the bash script below won't encode the last secret `AWS_SECRET_ACCESS_KEY` correctly. Using the bash script shown below, you can: 1. Encode all values using base64-encoding 2. Add a `SECRET_` prefix to all environment variable names 3. 
Store the result as `.env_encoded`:

```bash
while IFS='=' read -r key value; do
  echo "SECRET_$key=$(echo -n "$value" | base64)"
done < .env > .env_encoded
```

The `.env_encoded` file should look as follows:

```bash
SECRET_MYPASSWORD=cGFzc3dvcmQ=
SECRET_GITHUB_ACCESS_TOKEN=bXlwYXQ=
SECRET_AWS_ACCESS_KEY_ID=bXlhd3NhY2Nlc3NrZXk=
SECRET_AWS_SECRET_ACCESS_KEY=bXlhd3NzZWNyZXRhY2Nlc3NrZXk=
```

Then, in your Docker Compose file, you can replace:

```yaml
kestra:
  image: kestra/kestra:latest
  env_file:
    - .env
```

with the encoded version of the file:

```yaml
kestra:
  image: kestra/kestra:latest
  env_file:
    - .env_encoded
```

---

# Data Storage in Kestra: How Task Data Is Managed
URL: https://kestra.io/docs/concepts/storage
> Understand how Kestra stores and processes task data. Learn about internal storage, file handling, and how outputs are passed between tasks in your workflows.

Manage data processed by tasks.

Kestra's primary purpose is to orchestrate data processing via tasks, so data is central to each flow's execution. Depending on the task, data can be stored inside the execution context or inside Kestra's internal storage. You can also manually store data inside Kestra's KV store by using [dedicated tasks](/plugins/core/kv/io.kestra.plugin.core.kv.set).

Some tasks give you the choice of where you want to store the data, usually using a `fetchType` property or the three `fetch`/`fetchOne`/`store` properties. For example, using the DynamoDB Query task:

```yaml
id: query
type: io.kestra.plugin.aws.dynamodb.Query
tableName: persons
keyConditionExpression: id = :id
expressionAttributeValues:
  :id: "1"
fetchType: FETCH
```

The `fetchType` property can have four values:

- `FETCH_ONE`: fetches the first row and sets it as a task output attribute (the `row` attribute for DynamoDB); the data is stored inside the execution context.
- `FETCH`: fetches all rows and sets them as a task output attribute (the `rows` attribute for DynamoDB); the data is stored inside the execution context.
- `STORE`: stores all rows inside Kestra's internal storage. The internal storage returns a URI, usually set in the task output attribute `uri`, that can be used to retrieve the file from the internal storage.
- `NONE`: does nothing.

The three `fetch`/`fetchOne`/`store` properties do the same thing, but via three separate task properties instead of a single one.

## Storing data

### Storing data inside the flow execution context

Data can be stored as variables inside the flow execution context. This can be convenient for sharing data between tasks. To do so, tasks store data as [output attributes](../../05.workflow-components/06.outputs/index.md) that are then available inside the flow via Pebble expressions like `{{outputs.taskName.attributeName}}`.

Be careful: if the data is significant in size, it increases the size of the flow execution context, which can lead to slow execution and a larger execution storage footprint inside Kestra's repository.

:::alert{type="warning"}
Depending on the Kestra internal queue and repository implementation, there can be a hard limit on the size of the flow execution context, as it is stored as a single row/message. Usually, this limit is around 1MB, so it is important to avoid storing large amounts of data inside the flow execution context.
:::

### Storing data inside the internal storage

Kestra has an internal storage that can store data of any size. By default, the internal storage uses the host filesystem, but plugins exist to use other implementations like Amazon S3, Google Cloud Storage, or Microsoft Azure Blob Storage. See [Runtime and Storage](../../configuration/02.runtime-and-storage/index.md).

When using the internal storage, data is, by default, stored using the [Amazon Ion](https://amazon-ion.github.io/ion-docs/) format.

Tasks that can store data inside the internal storage usually have an output attribute named `uri` that can be used to access this file in following tasks.
The following example uses the [DynamoDB Query](/plugins/plugin-aws/dynamodb/io.kestra.plugin.aws.dynamodb.query) task to query a table and the [FTP Upload](/plugins/plugin-fs/ftp-file-transfer-protocol/io.kestra.plugin.fs.ftp.upload) task to send the retrieved rows to an external FTP server. ```yaml tasks: - id: query type: io.kestra.plugin.aws.dynamodb.Query tableName: persons keyConditionExpression: id = :id expressionAttributeValues: :id: "1" fetchType: STORE - id: upload type: io.kestra.plugin.fs.ftp.Upload host: localhost port: 80 from: "{{ outputs.query.uri }}" to: "/upload/file.ion" ``` If you need to access data from the internal storage, you can use the `read()` function to read the file's content as a string. Dedicated tasks allow managing the files stored inside the internal storage: - [Concat](/plugins/core/storage/io.kestra.plugin.core.storage.concat): concat multiple files. - [Delete](/plugins/core/storage/io.kestra.plugin.core.storage.delete): delete a file. - [Size](/plugins/core/storage/io.kestra.plugin.core.storage.size): get the size of a file. - [Split](/plugins/core/storage/io.kestra.plugin.core.storage.split): split a file into multiple files depending on the size of the file or the number of rows. :::alert{type="warning"} This should be the main method for storing and carrying large data from task to task. As an example, if you know that a [HTTP Request](/plugins/core/http/io.kestra.plugin.core.http.request) returns a heavy payload, you should consider using [HTTP Download](/plugins/core/http/io.kestra.plugin.core.http.download) along with a [Serdes](/plugins/plugin-serdes) instead of carrying raw data in [Flow Execution Context](#storing-data-inside-the-flow-execution-context) ::: ### Storing data inside the KV store Dedicated tasks can store data inside Kestra's KV store. The KV store transparently uses Kestra's internal storage as its backend store. 
The KV store allows storing data that is shared by all executions within the same namespace. You can think of it as a key/value store dedicated to a namespace.

The following tasks are available:

- [Set](/plugins/core/kv/io.kestra.plugin.core.kv.set): set a key/value pair.
- [Get](/plugins/core/kv/io.kestra.plugin.core.kv.get): get the value for a key.
- [Delete](/plugins/core/kv/io.kestra.plugin.core.kv.delete): delete a key/value pair.

Example:

```yaml
tasks:
  - id: set_data
    type: io.kestra.plugin.core.kv.Set
    key: name
    value: John Doe

  - id: get_data
    type: io.kestra.plugin.core.kv.Get
    key: name
```

In the next example, the flow uses `Set`, `Get` and `Delete` on the data:

:::collapse{title="Example Flow"}
```yaml
id: kv_store_example
namespace: company.team

tasks:
  - id: set_data
    type: io.kestra.plugin.core.kv.Set
    key: user_name
    value: John Doe

  - id: get_data
    type: io.kestra.plugin.core.kv.Get
    key: user_name

  - id: log_state
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.get_data.value }}"

  - id: set_new_data
    type: io.kestra.plugin.core.kv.Set
    key: user_name
    value: Bob Smith

  - id: get_new_data
    type: io.kestra.plugin.core.kv.Get
    key: user_name

  - id: log_new_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.get_new_data.value }}"

  - id: delete_data
    type: io.kestra.plugin.core.kv.Delete
    key: user_name

  - id: get_deleted_data
    type: io.kestra.plugin.core.kv.Get
    description: You will not get any data as the corresponding key is deleted in the earlier task.
    key: user_name
```

When we `Set` a new value for `user_name`, we have to use another `Get` task to retrieve the most up-to-date value, and then reference that `Get` task's `id` in the log below to show the latest value. The same applies to the `Delete` task: to show that the value has been deleted, we try to retrieve data from the key deleted in the `delete_data` task.
:::

## Processing data

For basic data processing, you can leverage Kestra's [Pebble templating engine](../../expressions/index.mdx).
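For instance, a minimal sketch of Pebble-based processing in a `Log` task (a hypothetical flow using the built-in `title` and `length` filters):

```yaml
id: pebble_processing
namespace: company.team

tasks:
  - id: transform
    type: io.kestra.plugin.core.log.Log
    # Pebble filters transform values inline: capitalize a string, count list items
    message: "{{ 'john doe' | title }} has {{ [1, 2, 3] | length }} orders"
```

Filters like these are handy for light reshaping of task outputs before passing them downstream, without spinning up a script task.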
For more complex data transformations, Kestra offers various data processing plugins including transform tasks or custom scripts.

### Converting files

Files from the internal storage can be converted from/to the Ion format to/from another format using the [Serdes](/plugins/plugin-serdes) plugin.

The following formats are currently available: Avro, CSV, JSON, XML, and Parquet. Each format offers a **reader** to read an Ion serialized data file and write it in the target format, and a **writer** to read a file in a specific format and write it as an Ion serialized data file.

For example, to convert an Ion file to CSV, then back to Ion:

```yaml
tasks:
  - id: query
    type: io.kestra.plugin.aws.dynamodb.Query
    tableName: persons
    keyConditionExpression: id = :id
    expressionAttributeValues:
      :id: "1"
    fetchType: STORE

  - id: convertToCsv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{outputs.query.uri}}"

  - id: convertBackToIon
    type: io.kestra.plugin.serdes.csv.CsvToIon
    from: "{{ outputs.convertToCsv.uri }}"
```

### Processing data using scripts

Kestra can launch Python, R, Node.js, Shell, PowerShell, and Go scripts. Depending on the `runner`, they can run directly in a local process on the host or inside Docker containers.

Those script tasks are available in the [Scripts Plugin](https://github.com/kestra-io/plugin-scripts). Below is documentation for each of them:

- The [Python](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.script) task runs a Python script in a Docker container or in a local process.
- The [Node](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.script) task runs a Node.js script in a Docker container or in a local process.
- The [R](/plugins/plugin-script-r/io.kestra.plugin.scripts.r.script) task runs an R script in a Docker container or in a local process.
- The [Shell](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.script) task executes a single Shell command, or a list of commands that you provide.
- The [PowerShell](/plugins/plugin-script-powershell/io.kestra.plugin.scripts.powershell.script) task executes a single PowerShell command, or a list of commands that you provide.
- The [Go (Script)](/plugins/plugin-script-go/io.kestra.plugin.scripts.go.script) task executes a single multi-line script, while the [Go (Commands)](/plugins/plugin-script-go/io.kestra.plugin.scripts.go.commands) task executes a list of commands that you provide.

The following example queries the BigQuery public dataset with Wikipedia page views to find the top 10 pages, converts the result to CSV, and uses the CSV file inside a Python task for further transformations using Pandas.

```yaml
id: wikipedia-top-ten-python-panda
namespace: company.team
description: analyze top 10 Wikipedia pages

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    sql: |
      SELECT DATETIME(datehour) as date, title, views
      FROM `bigquery-public-data.wikipedia.pageviews_2023`
      WHERE DATE(datehour) = current_date() and wiki = 'en'
      ORDER BY datehour desc, views desc
      LIMIT 10
    store: true
    projectId: geller
    serviceAccount: "{{envs.gcp_creds}}"

  - id: write-csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{outputs.query.uri}}"

  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    inputFiles:
      data.csv: "{{outputs['write-csv'].uri}}"
    tasks:
      - id: pandas
        type: io.kestra.plugin.scripts.python.Script
        containerImage: ghcr.io/kestra-io/pydata:latest
        script: |
          import pandas as pd
          from kestra import Kestra

          df = pd.read_csv("data.csv")
          views = df['views'].sum()
          Kestra.outputs({'views': int(views)})
```

Kestra offers several plugins for ingesting and transforming data — check [the Plugin list](/plugins) for more details.

Make sure to also check:

1. The [Script documentation](../../16.scripts/index.mdx) for a detailed overview of how to work with Python, R, Node.js, Shell and PowerShell scripts, and how to integrate them with Git and Docker.
2.
The [Blueprints](/blueprints) catalog — simply search for the relevant language (e.g., Python, R, Rust) or use case (*ETL, Git, dbt, etc.*) to find the relevant examples.

### Processing data using file transform

Kestra can process data **row by row** using file transform tasks. The transformation is done with a small script written in Python, JavaScript, or Groovy.

- The [GraalVM Python FileTransform](/plugins/plugin-graalvm/python-graalvm-tasks-on-graalvm/io.kestra.plugin.graalvm.python.filetransform) task allows transforming rows with Python.
- The [GraalVM JavaScript FileTransform](/plugins/plugin-graalvm/javascript-tasks-on-graalvm/io.kestra.plugin.graalvm.js.filetransform) task allows transforming rows with JavaScript.
- The [Groovy Script](/plugins/plugin-script-groovy/io.kestra.plugin.scripts.groovy.script) task allows running scripts with Groovy.

The following example queries the BigQuery public dataset for Wikipedia pages, converts it row by row with the GraalVM JavaScript FileTransform, and writes it to a CSV file.
```yaml
id: wikipedia-top-ten-file-transform
namespace: company.team
description: A flow that loads wikipedia top 10 EN pages

tasks:
  - id: query-top-ten
    type: io.kestra.plugin.gcp.bigquery.Query
    sql: |
      SELECT DATETIME(datehour) as date, title, views
      FROM `bigquery-public-data.wikipedia.pageviews_2023`
      WHERE DATE(datehour) = current_date() and wiki = 'en'
      ORDER BY datehour desc, views desc
      LIMIT 10
    store: true

  - id: file-transform
    type: io.kestra.plugin.graalvm.js.FileTransform
    from: "{{outputs['query-top-ten'].uri}}"
    script: |
      logger.info('row: {}', row)

      if (row['title'] === 'Main_Page' || row['title'] === 'Special:Search' || row['title'] === '-') {
        // remove unneeded rows
        row = null
      } else {
        // add a 'time' column
        row['time'] = String(row['date']).substring(11)
        // modify the 'date' column to only keep the date part
        row['date'] = String(row['date']).substring(0, 10)
      }

  - id: write-csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{outputs['file-transform'].uri}}"
```

:::alert{type="info"}
The script can access a logger to log messages. Each row is available in a `row` variable where each column is accessible using the dictionary notation `row['columnName']`.
:::

## Purging data
The PurgeExecution task can purge all the files stored inside the internal storage by a flow execution. It can be used at the end of a flow to purge all its generated files.

```yaml
tasks:
  - id: purge-execution
    type: io.kestra.plugin.core.storage.PurgeExecution
```

The execution context itself is not available after the end of the execution and is automatically deleted from Kestra's repository after a retention period (seven days by default) that can be changed; see [Runtime and Storage](../../configuration/02.runtime-and-storage/index.md).

Also, the [Purge](/plugins/core) task can be used to purge storages, logs, and executions of previous executions. For example, this flow purges all of these every day:

```yaml
id: purge
namespace: company.team

tasks:
  - id: purge
    type: io.kestra.plugin.core.storage.Purge
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 0 * * *"
```

## FAQ

### Internal storage FAQ

#### How to read a file from internal storage as a string

The `read()` function expects a `path` argument that points to a namespace file or an internal storage URI. Note that when using inputs, outputs, or trigger variables, you don't need any extra quotation marks. Here is how you can use those variables with the `read()` function:

- `{{ read(inputs.file) }}` for a FILE-type input variable named `file`
- `{{ read(outputs.mytaskid.uri) }}` for an output `uri` from a task named `mytaskid`
- `{{ read(trigger.uri) }}` for a `uri` of many triggers including Kafka, AWS SQS, GCP Pub/Sub, etc.
- `{{ read(trigger.objects | jq('.[].uri')) }}` for a `uri` of a trigger that returns a list of detected objects, e.g. AWS S3, GCP GCS, etc.

Note that the `read()` function can only read files within the same execution. If you try to read a file from a previous execution, you will get an Unauthorized error.
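As an illustration of the `trigger.uri` case above, here is a minimal sketch assuming the AWS SQS trigger; the queue URL and region are hypothetical placeholders, and the trigger's required properties may vary by plugin version.

```yaml
id: log_sqs_message
namespace: company.team

tasks:
  - id: log_content
    type: io.kestra.plugin.core.log.Log
    # read the file produced by the trigger and log its content as a string
    message: "{{ read(trigger.uri) }}"

triggers:
  - id: sqs
    type: io.kestra.plugin.aws.sqs.Trigger
    queueUrl: "https://sqs.eu-central-1.amazonaws.com/000000000000/orders" # placeholder
    region: eu-central-1 # placeholder
```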
:::collapse{title="Example using a FILE-type inputs variable"}
```yaml
id: read_file_as_string
namespace: company.team

inputs:
  - id: file
    type: FILE

tasks:
  - id: log_internal_storage_uri
    type: io.kestra.plugin.core.log.Log
    message: "{{ inputs.file }}"

  - id: log_file_content
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(inputs.file) }}"
```
:::

:::collapse{title="Example with the ForEachItem task reading a file's content as a string"}
When using the `ForEachItem` task, you can use the `read()` function to read the content of a file as a string. This is especially useful when you want to pass the content of a file as a raw string as an input to a subflow.

Below is a simple subflow example that uses a string input:

```yaml
id: subflow_raw_string_input
namespace: company.team

inputs:
  - id: string_input
    type: STRING
    defaults: hey there

tasks:
  - id: for_each_item
    type: io.kestra.plugin.core.debug.Return
    format: "{{ inputs.string_input }}"
```

Because the `ForEachItem` task splits the `items` file into batches of smaller files (one file per row by default), you can use the `read()` function to read the content of that file for a given batch as a string value and pass it as an input to the subflow shown above.

```yaml
id: parent_flow
namespace: company.team

tasks:
  - id: extract
    type: io.kestra.plugin.jdbc.duckdb.Queries
    sql: |
      INSTALL httpfs;
      LOAD httpfs;
      SELECT *
      FROM read_csv_auto('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv', header=True);
    store: true

  - id: each_raw
    type: io.kestra.plugin.core.flow.ForEachItem
    items: "{{ outputs.extract.outputs[0].uri }}"
    namespace: company.team
    flowId: subflow_raw_string_input
    inputs:
      string_input: "{{ read(taskrun.items) }}"
```
:::

#### How to read a Namespace File as a string?

So far, you've seen how to read a file from the internal storage as a string. You can use the same `read()` function to read a Namespace File as a string.
This is especially useful when you want to execute a Python script or a long SQL query stored in a dedicated SQL file.

The `read()` function takes the absolute path to the file you want to read. The path must point to a file stored in the **same namespace** as the flow you are executing.

Below is a simple example showing how you can read a file named `hello.py` stored in the `scripts` directory of the `company.team` namespace:

```yaml
id: hello
namespace: company.team

tasks:
  - id: my_python_script
    type: io.kestra.plugin.scripts.python.Script
    script: "{{ read('scripts/hello.py') }}"
```

The same syntax applies to SQL queries, custom scripts, and many more. Check the [Namespace Files](../../06.concepts/02.namespace-files/index.md) documentation for more details.

#### How to read a file from the internal storage as a JSON object?

You can use the Pebble function `{{ fromJson(myvar) }}` and the `{{ myvar | toJson }}` filter to process JSON data.

:::collapse{title="The fromJson() function"}
The function is used to convert a string to a JSON object. For example, the following Pebble expression converts the string `{"foo": [42, 43, 44]}` to a JSON object and then returns the first value of the `foo` key, which is `42`:

```yaml
{{ fromJson('{"foo": [42, 43, 44]}').foo[0] }}
```

You can use the `read()` function to read the content of a file as a string and then apply the `fromJson()` function to convert it to a JSON object. Afterwards, you can read the value of a specific key in that JSON object.
For example, the following flow downloads a JSON file, reads its content as a string, converts it to a JSON object, and then extracts a nested value:

```yaml
id: extract_json
namespace: company.team

tasks:
  - id: extract
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/json/app_events.json

  - id: read_as_string
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.extract.uri) }}"

  - id: read_as_json
    type: io.kestra.plugin.core.log.Log
    message: "{{ fromJson(read(outputs.extract.uri)) }}"

  - id: parse_json_elements
    type: io.kestra.plugin.core.log.Log
    message: "{{ fromJson(read(outputs.extract.uri)) | jq('map(.detail | fromjson | .message)') | first }}"
```

The above flow downloads a JSON file via an HTTP request, reads its content as a string, converts it to a JSON object, and then, in another task, parses the JSON object and returns the value of a nested key.
:::

:::collapse{title="The toJson filter"}
You can use the `toJson` filter to convert any variable to a JSON string. You can think of it as the reverse of what the `fromJson()` function does. The example below shows how you can convert a list of numbers to a JSON string `'[1, 2, 3]'` using the `| toJson` filter:

```yaml
{{ [1, 2, 3] | toJson }}
```

:::alert{type="info"}
You would typically never use the `| toJson` filter in combination with the `read()` function. Anytime you need to read a file's content and then convert it to a JSON object, use a combination of the `read()` function and the `fromJson()` function instead.
:::
:::

---
# System Flows in Kestra: Automate Maintenance
URL: https://kestra.io/docs/concepts/system-flows

> Automate platform maintenance with System Flows in Kestra. Schedule cleanup, monitoring, and admin tasks that run on a fixed cadence automatically.

Automate maintenance workflows with System Flows.
System Flows periodically execute background operations that keep your platform running but which you would generally prefer to keep out of sight. These flows automate maintenance workflows, such as:

1. Sending [alert notifications](/blueprints/failure-alert-slack)
2. Creating automated support tickets when critical workflows fail
3. [Purging logs](/blueprints/purge) and removing old executions or internal storage files to save space
4. Syncing code from Git or pushing code to Git
5. Automatically [releasing flows](/blueprints/copy-flows-to-new-tenant) from development to QA and staging environments

We refer to these as **System Flows** because by default they are only visible within the `system` namespace and to users with appropriate access. If you prefer, you can use a different namespace name instead of `system` by overriding the following [Plugins and Execution configuration](../../configuration/04.plugins-and-execution/index.md):

```yaml
kestra:
  systemFlows:
    namespace: system
```

To access System Flows, navigate to the **Namespaces** section in the UI. The `system` namespace is pinned at the top for quick access.

![system_namespace](./system-namespace.png)

From this section, you’ll find the **System Blueprints** tab, which provides fully customizable templates that you can modify to suit your organization’s needs.

![system_blueprints](./system-blueprints.png)

:::alert{type="info"}
Keep in mind that System Flows are not restricted to System Blueprints — any valid Kestra flow can become a System Flow if it's added to the `system` namespace.
:::

System flow executions appear across the Dashboard, Flows, and Executions pages, each with a multi-select **Scope** filter (`User`, `System`) so you can view user-facing and system executions separately or together.
![system_filter](./system-filters.png)

In terms of permissions, the `system` namespace is open by default, but using the namespace-level RBAC functionality in the Enterprise Edition, you can restrict access to the `system` namespace only to Admins, while assigning `company.*` namespaces to your general user base.

---
# System Labels in Kestra: Reserved Admin Metadata
URL: https://kestra.io/docs/concepts/system-labels

> Use system and hidden labels in Kestra for admin metadata. Understand how internal labels differ from user labels and how they affect filtering.

Special labels for system use only.

System Labels and Hidden Labels are reserved for storing metadata used by administrators to manage and monitor Kestra. These labels are hidden in the UI by default.

To view executions with a specific Hidden Label, you must explicitly filter for it using the `Labels` filter, such as `system.correlationId: 6WuLA1vh9lpFsGyrkuVRYb`.

![correlationId](./correlationId.png)

The table will then show the execution connected to that ID.

![Correlation ID Filter Result](./correlationId-filter-result.png)

## Hidden labels

Hidden Labels are labels excluded from the UI by default. You can configure which prefixes should be hidden via the `kestra.hidden-labels.prefixes` configuration. For example, to hide labels starting with `admin.`, `internal.`, and `system.`, you can use the following configuration in your `application.yaml`:

```yaml
kestra:
  hidden-labels:
    prefixes:
      - system.
      - internal.
      - admin.
```

By default, System Labels (prefixed with `system.`) are hidden. To display them, simply remove the `system.` prefix from the list of hidden prefixes.

## System labels

System Labels are labels prefixed with `system.` that serve specific purposes. Below are the available System Labels. For a step-by-step guide on using `system.correlationId` specifically as an idempotency key, see [Idempotency with correlation IDs](../../15.how-to-guides/idempotency/index.md).
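Before looking at the individual System Labels, here is a minimal hypothetical flow showing how a hidden label is attached in practice; it assumes the `internal.` prefix is listed under `kestra.hidden-labels.prefixes` as in the configuration example above.

```yaml
id: labeled_flow
namespace: company.team

labels:
  # hidden in the UI when `internal.` is a configured hidden prefix
  internal.owner: data-team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Labeled execution
```

Executions of this flow carry the label, but you will only see it by explicitly filtering for it, as described above.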
### `system.correlationId`

- Automatically set for every execution and propagated to downstream executions created by `Subflow` or `ForEachItem` tasks
- Represents the ID of the first execution in a chain of executions, enabling tracking of execution lineage
- Can also be set to a stable business key and used as an idempotency key for flows that must not process the same event twice
- Use this label to filter all executions originating from a specific parent execution or business event. For example, if a parent flow triggers multiple subflows, filtering by the parent's `system.correlationId` displays all related executions.

:::alert{type="info"}
The Execution API supports setting this label at execution creation but not modification.
:::

---

### `system.username`

- Automatically set for every execution and contains the username of the user who triggered the execution
- Useful for auditing and identifying who initiated specific executions

### `system.readOnly`

- Used to mark a flow as read-only, disabling the flow editor in the UI
- Helps prevent modifications to critical workflows, such as production flows managed through CI/CD pipelines

**Example:**

```yaml
id: read_only_flow
namespace: company.team

labels:
  system.readOnly: true

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: Hello from a read-only flow!
```

Once this label is set, the editor for this flow will be disabled in the UI.

![readOnly](./readOnly.png)

:::alert{type="info"}
In the Enterprise Edition, updating a read-only flow server-side is restricted to service accounts or API keys.
:::

---
# Configure Kestra: Settings, Environments & Defaults
URL: https://kestra.io/docs/configuration

> Learn where Kestra configuration lives and how to edit it. Covers runtime, storage, observability, security, plugins, and enterprise-only settings.
import ChildCard from "~/components/docs/ChildCard.astro"

Use this page when you need to know where Kestra configuration is defined, how it is overridden, and which part of the config tree controls a given capability.

## Where Kestra configuration lives

Kestra configuration is YAML-based, but the place you edit it depends on how you run Kestra.

| Deployment style | Where you usually edit configuration |
| --- | --- |
| Local Docker or Docker Compose | The `KESTRA_CONFIGURATION` environment variable or your `docker-compose.yml` |
| Kubernetes | Your Helm values or rendered manifests |
| VM or standalone server | A YAML config file plus environment variables such as `JAVA_OPTS` or `KESTRA_*` |
| Managed or shared environments | A checked-in deployment config, then environment-specific overrides through environment variables or platform secrets |

Environment variables override file-based configuration, so many teams keep a shared base config in YAML and inject deployment-specific values at runtime.

## Start here

- Use [Configuration Basics](./01.configuration-basics/index.md) to understand config sources, overrides, environment variable conversion, and the minimal setup required to boot Kestra.
- Use [Runtime and Storage](./02.runtime-and-storage/index.md) for repository, queue, datasource, storage, server, and JVM settings.
- Use [Observability and Networking](./03.observability-and-networking/index.md) for telemetry, logs, metrics, Micronaut, endpoints, and webserver behavior.
- Use [Plugins and Execution](./04.plugins-and-execution/index.md) for plugin installation, plugin defaults, retries, tasks, system flows, templates, and file access.
- Use [Security and Secrets](./05.security-and-secrets/index.md) for encryption, secret backends, auth-related security settings, and server hardening.
- Use [Enterprise and Advanced Features](./06.enterprise-and-advanced/index.md) for EE license settings, Elasticsearch, Kafka, indexer, AI Copilot, and air-gapped deployments.
## Common tasks

| If you need to... | Start here |
| --- | --- |
| Find the file or deployment surface where config is edited | [Configuration Basics](./01.configuration-basics/index.md) |
| Configure queue, repository, datasource, storage, or server settings | [Runtime and Storage](./02.runtime-and-storage/index.md) |
| Tune logs, metrics, telemetry, endpoints, CORS, SSL, or UI web settings | [Observability and Networking](./03.observability-and-networking/index.md) |
| Configure plugin defaults, task temp storage, retries, or local flow sync | [Plugins and Execution](./04.plugins-and-execution/index.md) |
| Set up encryption, secret backends, RBAC-adjacent security, or server auth | [Security and Secrets](./05.security-and-secrets/index.md) |
| Configure Enterprise-only platform services such as Kafka, Elasticsearch, AI Copilot, or air-gapped operation | [Enterprise and Advanced Features](./06.enterprise-and-advanced/index.md) |

## Browse configuration docs

---
# Configuration Basics in Kestra: YAML & Env Overrides
URL: https://kestra.io/docs/configuration/configuration-basics

> Learn where Kestra configuration is defined, how YAML and environment variables interact, and what minimal settings are needed to start a Kestra instance.

Use this page first if you are not sure where Kestra configuration is actually edited in your environment.

## Configuration sources

Kestra reads configuration from YAML. In practice, teams usually provide it in one of these ways:

- a YAML file mounted or bundled with the Kestra process
- the `KESTRA_CONFIGURATION` environment variable
- inline YAML inside Docker Compose
- Helm values or Kubernetes manifests

Environment variables override file-based configuration, so many teams keep a shared YAML base config in version control and inject deployment-specific values at runtime.

## Minimal boot configuration

Most deployments need to decide at least these three things:

1. repository type
2. queue type
3.
internal storage type

Example:

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://postgres:5432/kestra
    driver-class-name: org.postgresql.Driver
    username: kestra
    password: k3str4

kestra:
  repository:
    type: postgres
  queue:
    type: postgres
  storage:
    type: local
  url: "http://localhost:8080/"
```

These three choices drive the rest of the deployment:

- `kestra.repository.type` controls the persistence backend for core metadata.
- `kestra.queue.type` must be compatible with the repository type.
- `kestra.storage.type` controls where Kestra stores internal files and task artifacts.

## Environment variable conversion

Convert YAML keys to environment variables like this:

- replace dots (`.`) with underscores
- replace hyphens (`-`) with underscores
- convert camelCase boundaries to underscores
- uppercase everything
- prefix Kestra-specific keys with `KESTRA_`

Examples:

| Configuration value | Resulting properties |
| --- | --- |
| `MYAPP_MYSTUFF` | `myapp.mystuff`, `myapp-mystuff` |
| `MY_APP_MY_STUFF` | `my.app.my.stuff`, `my.app.my-stuff`, `my-app.my.stuff`, `my-app.my-stuff`, and similar variants |

File-based configuration:

```yaml
datasources:
  postgres:
    username: kestra
```

becomes:

```bash
DATASOURCES_POSTGRES_USERNAME=kestra
```

And:

```yaml
kestra:
  storage:
    s3:
      accessKey: myKey
```

or:

```yaml
kestra:
  storage:
    s3:
      access-key: myKey
```

becomes:

```bash
KESTRA_STORAGE_S3_ACCESS_KEY=myKey
```

Common patterns:

```bash
MICRONAUT_SERVER_PORT=8080
DATASOURCES_POSTGRES_USERNAME=kestra
KESTRA_STORAGE_TYPE=s3
KESTRA_URL=https://kestra.example.com
```

## SDK default authentication

SDK-based plugins can use default authentication if configured. Kestra resolves credentials in this order:

1. namespace-level default service account
2. tenant-level default service account
3.
global SDK defaults

Example:

```yaml
tasks:
  sdk:
    authentication:
      username: ${kestra.server.basic-auth.username}
      password: ${kestra.server.basic-auth.password}
      # token: ${KESTRA_API_TOKEN}
```

If no namespace, tenant, or global default is configured, SDK-based tasks that use `DEFAULT` or `AUTO` authentication fail because no API credentials are available.

## What belongs on the other configuration pages

- Use [Runtime and Storage](../02.runtime-and-storage/index.md) for datasources, queue, repository, internal storage, JVM, environment metadata, and global variables.
- Use [Observability and Networking](../03.observability-and-networking/index.md) for logs, metrics, Micronaut, endpoints, access logs, SSL, and CORS.
- Use [Plugins and Execution](../04.plugins-and-execution/index.md) for plugin installation, plugin defaults, retries, local flow sync, templates, and execution behavior.
- Use [Security and Secrets](../05.security-and-secrets/index.md) for encryption, secret backends, auth hardening, and liveness settings.
- Use [Enterprise and Advanced](../06.enterprise-and-advanced/index.md) for EE license, Kafka, Elasticsearch, indexer, AI Copilot, and air-gapped deployments.

## Next steps

- Need repository, datasource, storage, JVM, or server settings: [Runtime and Storage](../02.runtime-and-storage/index.md)
- Need logs, metrics, or SSL settings: [Observability and Networking](../03.observability-and-networking/index.md)
- Need secret backends or advanced EE infrastructure: [Security and Secrets](../05.security-and-secrets/index.md) and [Enterprise and Advanced](../06.enterprise-and-advanced/index.md)

---
# Enterprise & Advanced Configuration in Kestra
URL: https://kestra.io/docs/configuration/enterprise-and-advanced

> Configure Enterprise-only Kestra settings. Manage licenses, Elasticsearch, Kafka, indexer behavior, UI custom links, AI Copilot, and air-gapped deployments.
Use this page for configuration areas that are either Enterprise-specific or advanced platform concerns.

## Enterprise platform settings

This page groups together settings that are important but not part of a normal OSS-style runtime setup. If the instance is not using EE features, you can ignore most of this page.

This area includes:

- Enterprise license configuration
- Enterprise Java security
- UI sidebar customization
- historical multi-tenancy and default tenant settings
- custom links in the UI

EE license configuration:

```yaml
kestra:
  ee:
    license:
      id:
      fingerprint:
      key: |
```

Kestra validates the license on startup. The `fingerprint` is also required for versioned plugins.

EE Java security lets you restrict filesystem access and thread creation. Three controls are available:

- `forbidden-paths` — disallows read/write on listed filesystem paths
- `authorized-class-prefix` — limits which classes are allowed to create threads
- `forbidden-class-prefix` — blocks specific classes from creating threads

```yaml
kestra:
  ee:
    java-security:
      enabled: true
      forbidden-paths:
        - /etc/
      authorized-class-prefix:
        - io.kestra.plugin.core
        - io.kestra.plugin.gcp
```

Use `forbidden-class-prefix` when you want to block a specific plugin family from spawning threads rather than maintaining an allowlist:

```yaml
kestra:
  ee:
    java-security:
      enabled: true
      forbidden-class-prefix:
        - io.kestra.plugin.scripts
```

Use EE Java security carefully. It is a platform hardening feature, so the goal is to narrow what plugin code is allowed to touch, not to tune routine runtime behavior.

UI customization examples:

```yaml
kestra:
  ee:
    right-sidebar:
      custom-links:
        internal-docs:
          title: "Internal Docs"
          url: "https://kestra.io/docs/"
```

```yaml
kestra:
  ee:
    left-sidebar:
      disabled-menus:
        - "Blueprints/Flow Blueprints"
```

The old multi-tenancy and default-tenant configuration was removed in `0.23.0`; keep it in mind only for migration work.
## Elasticsearch, Kafka, and indexing

This section is really about one architectural choice: running Kestra on the Kafka plus Elasticsearch stack instead of the simpler JDBC-backed setup. If you are on PostgreSQL or MySQL only, much of this page will not apply.

These settings cover the advanced repository and queue stack used in Enterprise deployments:

- Elasticsearch repository settings
- Kafka client and topic settings
- Kafka message protection
- indexer behavior

Use this section when you are running the Kafka plus Elasticsearch architecture instead of a JDBC-only deployment.

Minimal Elasticsearch repository configuration:

```yaml
kestra:
  elasticsearch:
    client:
      http-hosts: "http://localhost:9200"
  repository:
    type: elasticsearch
```

Start by proving the minimal connection first. After that, add auth, SSL handling, index prefixes, or rotation only when the deployment model requires them.

With authentication:

```yaml
kestra:
  elasticsearch:
    client:
      http-hosts:
        - "http://node-1:9200"
        - "http://node-2:9200"
      basic-auth:
        username: ""
        password: ""
  repository:
    type: elasticsearch
```

Related advanced Elasticsearch settings include:

- `trust-all-ssl` for self-signed development clusters
- custom index prefixes
- daily, weekly, monthly, or yearly index rotation

Minimal Kafka queue configuration:

```yaml
kestra:
  kafka:
    client:
      properties:
        bootstrap.servers: "localhost:9092"
  queue:
    type: kafka
```

Kafka tuning is usually about cluster shape rather than syntax. Partition count limits how much component-level concurrency you can achieve, while replication settings should match your broker topology and HA expectations.
This page also covers: - SSL-secured Kafka clients - default topic partition and replication settings - consumer, producer, and stream defaults - custom topic names and topic properties - consumer and topic prefixes for shared clusters - Kafka Streams local state directory - message protection for oversized Kafka messages Representative advanced Kafka settings: ```yaml kestra: kafka: client: properties: bootstrap.servers: "localhost:9092" security.protocol: SSL defaults: topic: partitions: 3 replication-factor: 3 topics: executions: properties: retention.ms: 604800000 ``` Use client properties for transport and auth, `defaults` for cluster-wide topic behavior, and `topics.*.properties` only when one topic needs behavior that differs from the rest. Full SSL client configuration with keystores: ```yaml kestra: kafka: client: properties: bootstrap.servers: "host:port" security.protocol: "SSL" ssl.endpoint.identification.algorithm: "" ssl.key.password: "" ssl.keystore.location: "/etc/ssl/private/keystore.p12" ssl.keystore.password: "" ssl.keystore.type: "PKCS12" ssl.truststore.location: "/etc/ssl/private/truststore.jks" ssl.truststore.password: "" queue: type: kafka ``` Consumer, producer, and stream defaults: ```yaml kestra: kafka: defaults: consumer: properties: isolation.level: "read_committed" auto.offset.reset: "earliest" enable.auto.commit: "false" producer: properties: acks: "all" compression.type: "lz4" max.request.size: "10485760" stream: properties: processing.guarantee: "exactly_once" replication.factor: "${kestra.kafka.defaults.topic.replication-factor}" acks: "all" compression.type: "lz4" max.request.size: "10485760" state.dir: "/tmp/kafka-streams" ``` Client loggers for debugging message flow: ```yaml kestra: kafka: client: loggers: - level: INFO type: PRODUCER topic-regexp: "kestra_(executions|workertaskresult)" key-regexp: .*parallel.* value-regexp: .*parallel.* ``` :::alert{type="warning"} Client loggers have a heavy performance impact. 
Use them only for short-lived debugging sessions. ::: Shared-cluster deployments often also need prefixes or dedicated topic names to avoid collisions with other tenants or environments. To reject oversized Kafka messages early: ```yaml kestra: kafka: message-protection: enabled: true limit: 1048576 ``` Indexer settings control batch indexing from Kafka into Elasticsearch: ```yaml kestra: indexer: batch-size: 500 batch-duration: PT1S ``` If indexing falls behind, tune indexer batch settings before changing flow definitions. Those settings control how aggressively Kafka-backed events are flushed into Elasticsearch. ## AI and isolated environments These are the most optional settings on the page. They matter only if you are enabling Copilot integrations or operating Kestra in restricted network environments. This page also includes: - AI Copilot provider configuration - air-gapped instance settings ### AI Copilot Set `kestra.ai.enabled` to `false` to fully disable the AI Copilot, including the built-in fallback to `api.kestra.io`. Defaults to `true`. Enterprise Edition supports multiple providers in one configuration, which is useful when teams need both a default internal model and a fallback external model: ```yaml kestra: ai: enabled: true # set to false to disable AI Copilot entirely providers: - id: gemini display-name: Gemini - Private type: gemini configuration: model-name: gemini-2.5-flash api-key: YOUR_GEMINI_API_KEY - id: gpt display-name: OpenAI type: openai isDefault: true configuration: model-name: gpt-4 api-key: YOUR_OPENAI_API_KEY ``` Optional provider settings include `temperature`, `top-p`, `top-k`, `max-output-tokens`, `log-requests`, `log-responses`, and `base-url`. 
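For reference, the optional provider settings can be combined on a single provider. The sketch below reuses the Gemini provider from the example above; the sampling values and the `base-url` endpoint are illustrative placeholders, not recommendations:

```yaml
kestra:
  ai:
    providers:
      - id: gemini
        display-name: Gemini - Private
        type: gemini
        configuration:
          model-name: gemini-2.5-flash
          api-key: YOUR_GEMINI_API_KEY
          temperature: 0.2                # illustrative sampling settings
          top-p: 0.9
          max-output-tokens: 2048
          log-requests: false             # avoid logging prompts in shared environments
          base-url: https://llm-proxy.internal.example   # hypothetical internal endpoint
```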
### Air-gapped mode Use air-gapped mode when the UI and blueprint experience must avoid external dependencies: ```yaml kestra: ee: airgapped: true ``` When enabled, the UI hides or adapts features that normally depend on external services, such as hosted fonts, external blueprint sources, or embedded internet content. ### Execution data in internal storage If EE outputs and inputs must be isolated per tenant or namespace, store execution data in internal storage: ```yaml kestra: ee: execution-data: internal-storage: enabled: true ``` To enforce that behavior everywhere: ```yaml kestra: ee: execution-data: internal-storage: force-globally: true ``` ### Mail service Invitation and password-reset emails rely on the EE mail service: ```yaml kestra: ee: mail-service: host: host.smtp.io port: 587 username: user password: password from: configurable@mail.com from-name: Kestra auth: true starttls-enable: true ``` Use this page when an instance needs non-default enterprise infrastructure, custom UI platform behavior, or advanced deployment constraints rather than routine runtime configuration. ## Related configuration pages - Need secure runtime or secret backend settings: [Security and Secrets](../05.security-and-secrets/index.md) - Need queue, repository, storage, or JVM setup: [Runtime and Storage](../02.runtime-and-storage/index.md) --- # Observability & Networking Configuration in Kestra URL: https://kestra.io/docs/configuration/observability-and-networking > Configure telemetry, logs, metrics, Micronaut settings, endpoints, SSL, CORS, and webserver behavior in Kestra. Use this page for operational visibility and network-facing configuration. ## Observability Use this section when you need to understand what Kestra emits about itself, not when you are changing task behavior. The settings here are mostly for platform operators and anyone integrating Kestra with monitoring or logging systems.
Configuration areas in this group include: - anonymous telemetry - logger settings - access logs and log formatting - metrics and label-based metrics - Micronaut HTTP settings These settings are useful when you need to tune visibility, log volume, request handling, or integration with monitoring platforms. Anonymous usage reporting is enabled by default. Disable or tune it with: ```yaml kestra: anonymous-usage-report: enabled: false ``` ```yaml kestra: anonymous-usage-report: initial-delay: 5m fixed-delay: 1h ``` UI usage reporting is configured separately: ```yaml kestra: ui-anonymous-usage-report: enabled: false ``` ## Logs and access logging There are two different concerns here: application logs and HTTP access logs. Reach for `logger.levels` when you want to change verbosity inside Kestra, and Micronaut access logging when you want request-by-request HTTP visibility. Use `logger.levels` to adjust server log verbosity: ```yaml logger: levels: io.kestra.core.runners: TRACE org.apache.kafka: DEBUG ``` You can also suppress execution-scoped logs globally: ```yaml logger: levels: execution: 'OFF' task: 'OFF' trigger: 'OFF' ``` Or scope suppression to a specific flow, task, or trigger by appending the flow ID and optionally the task or trigger ID: ```yaml logger: levels: execution.hello-world: 'OFF' task.hello-world: 'OFF' trigger.hello-world: 'OFF' task.hello-world.log: 'OFF' trigger.hello-world.schedule: 'OFF' ``` Micronaut access logging is configured separately: ```yaml micronaut: server: netty: access-logger: enabled: true logger-name: io.kestra.webserver.access log-format: "[Date: {}] [Duration: {} ms] [Method: {}] [Url: {}] [Status: {}] [Length: {}] [Ip: {}] [Port: {}]" exclusions: - /ui/.+ - /health - /prometheus ``` Kestra uses [Logback](https://logback.qos.ch/) for logging. 
To use a custom `logback.xml`, pass it via `JAVA_OPTS`: ```shell export JAVA_OPTS="-Dlogback.configurationFile=file:/path/to/logback.xml" ``` The same mechanism enables structured output formats such as GCP structured logging or the ECS log format: define the corresponding appender and encoder in the custom `logback.xml`. ## Metrics and telemetry exports These settings are usually enabled with restraint. Metrics are broadly useful, but label-based metrics should stay limited to a small set of low-cardinality dimensions or they become expensive to store and query. Set a metrics prefix: ```yaml kestra: metrics: prefix: kestra ``` Add low-cardinality labels as metric tags: ```yaml kestra: metrics: labels: - country - environment ``` This creates a tag named `label_<key>` for each configured label key. When an execution does not have a configured label key, the tag value is set to `__none__`, which keeps the set of tag keys stable and avoids metric series fragmentation. For example, with `country` and `environment` configured, an execution that has `country=Germany` but no `environment` label produces: ```plaintext kestra_executions_total{flow_id="my-flow",namespace_id="default",state="SUCCESS",label_country="Germany",label_environment="__none__"} 1 ``` For traces, metrics, and logs exported through OpenTelemetry, use the dedicated [OpenTelemetry guide](../../10.administrator-guide/open-telemetry/index.md). ## Network and HTTP settings This section matters when Kestra is exposed behind a load balancer, reverse proxy, ingress, or private network boundary. If requests are not arriving with the expected URL, protocol, size limit, or auth behavior, the fix is often here.
Micronaut-backed settings cover: - server port - SSL - timeouts - upload size - base path - host resolution - CORS - management endpoints Common examples: ```yaml micronaut: server: port: 8086 ``` ```yaml micronaut: server: max-request-size: 10GB multipart: max-file-size: 10GB disk: true read-idle-timeout: 60m write-idle-timeout: 60m idle-timeout: 60m netty: max-chunk-size: 10MB ``` Reverse proxy support: ```yaml micronaut: server: context-path: "kestra-prd" host-resolution: host-header: Host protocol-header: X-Forwarded-Proto ``` Enable CORS: ```yaml micronaut: server: cors: enabled: true ``` Secure or move management endpoints: ```yaml endpoints: all: basic-auth: username: your-user password: your-password port: 8084 ``` SSL example: ```yaml micronaut: security: x509: enabled: true ssl: enabled: true server: ssl: client-authentication: need key-store: path: classpath:ssl/keystore.p12 password: ${KEYSTORE_PASSWORD} type: PKCS12 trust-store: path: classpath:ssl/truststore.jks password: ${TRUSTSTORE_PASSWORD} type: JKS ``` ## UI and webserver settings These settings are lighter-weight than the Micronaut server settings above. Use them when you are customizing the user-facing web experience rather than transport-level HTTP behavior. The webserver-related configuration also includes: - Google Analytics ID - additional HTML tags - mail server settings Examples: ```yaml kestra: webserver: google-analytics-id: G-XXXXXXXXXX ``` ```yaml kestra: webserver: html-head: - '<meta name="robots" content="noindex">' ``` The `<meta>` tag is only an illustration; `html-head` accepts any list of HTML tags to inject into the page head. Mail server settings are useful when you need platform emails for invitations and notifications.
## Typical use cases Use this section when you need to: - expose Kestra behind a reverse proxy - enable HTTPS - adjust access log format for GCP or ECS - configure Prometheus-style metrics ingestion - change management endpoint behavior --- # Plugins & Execution Configuration in Kestra URL: https://kestra.io/docs/configuration/plugins-and-execution > Configure plugin installation, plugin defaults, feature flags, retries, task settings, system flows, templates, and execution-related behavior in Kestra. Use this page when configuring how tasks, plugins, and execution-time behaviors work across your Kestra instance. ## Plugins This section is about how Kestra discovers and distributes plugin code. If a task type is missing, a plugin version needs to be pinned, or your organization uses a private artifact source, start here. This area includes: - installing plugins - custom Maven repositories - Enterprise plugin repositories - plugin defaults - forced plugin defaults - plugin security and allowed plugins - plugin management settings For many teams, this is the most important section after runtime setup because it centralizes behavior shared across many flows. Install a plugin from Maven repositories with: ```bash kestra plugins install io.kestra.plugin:plugin-script-python:LATEST ``` Add custom repositories: ```yaml kestra: plugins: repositories: central: url: https://repo.maven.apache.org/maven2/ google-artifact-registry: url: https://${GCP_REGISTRY_LOCATION}-maven.pkg.dev/${GCP_PROJECT_ID}/${GCP_REPOSITORY} basic-auth: username: oauth2accesstoken password: ${GCP_OAUTH_ACCESS_TOKEN} ``` Install EE plugins from the Kestra registry: ```yaml kestra: plugins: repositories: kestra-io: url: https://registry.kestra.io/maven basic-auth: username: ${kestra.ee.license.id:} password: ${kestra.ee.license.fingerprint:} ``` Most teams only need custom repositories if they publish private plugins or mirror public artifacts through an internal registry. 
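The install command above takes standard Maven coordinates. As a minimal illustration of the `group:artifact:version` form it expects (a sketch, not part of Kestra's CLI), the parts split like this:

```python
def parse_coordinate(coordinate: str) -> dict:
    """Split a Maven coordinate of the form group:artifact:version.

    Illustrative helper only, not Kestra code."""
    group, artifact, version = coordinate.split(":")
    return {"group": group, "artifact": artifact, "version": version}

# The coordinate used in the install command above:
print(parse_coordinate("io.kestra.plugin:plugin-script-python:LATEST"))
# {'group': 'io.kestra.plugin', 'artifact': 'plugin-script-python', 'version': 'LATEST'}
```

The version segment accepts either a pinned version or the `LATEST` keyword, as shown in the install example.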
## Global plugin defaults and shared behavior Use plugin defaults when many flows should inherit the same behavior. This is usually preferable to repeating the same task settings across dozens of flow definitions. Apply global defaults that flows can still override: ```yaml kestra: plugins: defaults: - type: io.kestra.plugin.core.log.Log values: level: ERROR ``` Use forced defaults when teams must not override the value: ```yaml kestra: plugins: defaults: - type: io.kestra.plugin.scripts.shell.Commands forced: true values: containerImage: ubuntu:latest taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker ``` :::alert{type="warning"} Plugin defaults are evaluated by the Executor and propagated to other components, so every server should use the same `kestra.plugins.defaults`. ::: Enable or preconfigure plugin features globally: ```yaml kestra: plugins: configurations: - type: io.kestra.plugin.core.flow.Subflow values: outputs: enabled: true - type: io.kestra.plugin.core.trigger.Schedule values: recoverMissedSchedules: NONE ``` You can also enable Docker task-runner volume mounting: ```yaml kestra: plugins: configurations: - type: io.kestra.plugin.scripts.runner.docker.Docker values: volumeEnabled: true ``` The examples in this section do different jobs: `defaults` applies reusable task values, while `configurations` enables or tunes plugin features that are not always expressed directly in a flow. ## Plugin security and management This section becomes relevant once you want governance over what can be installed or when plugin lifecycle is managed centrally instead of manually on each instance. 
In Enterprise Edition, you can restrict which plugins are allowed: ```yaml kestra: plugins: security: includes: - io.kestra.* excludes: - io.kestra.plugin.core.debug.Echo ``` Managed plugins are configured under `kestra.plugins.management`: ```yaml kestra: plugins: management: enabled: true remote-storage-enabled: true custom-plugins-enabled: true local-repository-path: /tmp/kestra/plugins-repository auto-reload-enabled: true auto-reload-interval: 60s default-version: LATEST ``` - `remote-storage-enabled`: store managed plugins in internal storage rather than on local disk - `auto-reload-enabled` / `auto-reload-interval`: check for updated plugins on a fixed interval - `default-version`: controls which plugin version is selected when no explicit version is pinned; accepts `LATEST`, `CURRENT`, `OLDEST`, `NONE`, or a specific version string ## Execution behavior These settings affect how the platform behaves around tasks and executions globally. Use them for platform-wide operational defaults, not for flow-specific logic. This part of the configuration also includes: - retries - temporary task storage - tutorial flows - system flows - local flow synchronization - enabling templates Global retries for internal storage and secret-manager calls: ```yaml kestra: retries: attempts: 5 delay: 1s max-delay: ~ multiplier: 2.0 ``` `max-delay` caps the maximum backoff interval. It is undefined by default, which means the delay grows without bound according to the multiplier. :::alert{type="warning"} These retries do not apply to tasks. For task-level retries across many plugins, use plugin defaults. ::: Example task-level retry default: ```yaml - type: io.kestra retry: type: constant interval: PT5M maxDuration: PT1H maxAttempts: 3 warningOnRetry: true ``` That distinction matters: `kestra.retries` protects platform integrations such as storage and secret backends, while task retry behavior should be managed through plugin defaults or the flow itself. 
Use `kestra.tasks.tmp-dir` when task runners need a predictable working directory on the host or inside a mounted volume: ```yaml kestra: tasks: tmp-dir: path: /tmp/kestra-wd/tmp ``` Ensure your container or VM volume mounts align with that path: ```yaml volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /home/kestra:/home/kestra - /tmp/kestra-wd:/tmp/kestra-wd ``` Reserve `system` for background workflows, or rename it if your organization already uses that namespace for something else: ```yaml kestra: system-flows: namespace: system ``` Disable tutorial flows outside trial or demo environments: ```yaml kestra: tutorial-flows: enabled: false ``` Templates are deprecated and disabled by default, but can still be re-enabled for migration work: ```yaml kestra: templates: enabled: true ``` Use Micronaut file watching only when you want local flow synchronization from disk into Kestra: ```yaml micronaut: io: watch: enabled: true paths: - /path/to/your/flows ``` ## Variables and rendering These settings influence expression rendering across the whole instance. They are linked here because they affect execution-time behavior, but they are documented in more depth on the runtime page. Relevant runtime-wide settings include: - environment variable prefixes - global variables - recursive rendering - template cache Those settings are documented in more detail on [Runtime and Storage](../02.runtime-and-storage/index.md), since they affect the whole instance and not just plugin behavior.
## Related docs - Flow-level plugin defaults: [Plugin Defaults](../../05.workflow-components/09.plugin-defaults/index.md) - Universal file access: [File Access](../../06.concepts/file-access/index.md) - Storage backends, JVM, and global variables: [Runtime and Storage](../02.runtime-and-storage/index.md) - Execution data isolation and enterprise-only runtime features: [Enterprise and Advanced](../06.enterprise-and-advanced/index.md) --- # Runtime & Storage Configuration in Kestra URL: https://kestra.io/docs/configuration/runtime-and-storage > Configure Kestra's repository, queue, datasource, internal storage, server runtime, JVM behavior, environment settings, and variables. Use this page when configuring the core runtime services that make Kestra run. ## Core setup decisions Every Kestra deployment must define: - repository type - queue type - internal storage type The common production path is PostgreSQL for queue and repository, plus an object store or durable internal storage backend. Queues and repositories must stay compatible: - in-memory queue with in-memory repository for local testing only - JDBC queue with H2, MySQL, or PostgreSQL repository - Kafka queue with Elasticsearch repository in Enterprise Edition ## Database and datasources Start here if you are choosing the persistence layer for a new Kestra instance or moving from a local setup to a durable environment. In most teams, this is the first configuration page they revisit after initial installation. Use `kestra.queue.type` and `kestra.repository.type` to select your backend: ```yaml kestra: queue: type: postgres repository: type: postgres ``` Then define the datasource: ```yaml datasources: postgres: url: jdbc:postgresql://localhost:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 ``` The examples below are intentionally minimal. Use them to confirm the backend choice and basic connection shape first, then add pooling and operational settings afterward. 
Minimal datasource examples: :::collapse{title="PostgreSQL"} ```yaml kestra: queue: type: postgres repository: type: postgres datasources: postgres: url: jdbc:postgresql://localhost:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 ``` ::: :::collapse{title="MySQL"} ```yaml kestra: queue: type: mysql repository: type: mysql datasources: mysql: url: jdbc:mysql://localhost:3306/kestra driver-class-name: com.mysql.cj.jdbc.Driver username: kestra password: k3str4 dialect: MYSQL ``` ::: :::collapse{title="H2"} ```yaml kestra: queue: type: h2 repository: type: h2 datasources: h2: url: jdbc:h2:mem:public;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE username: sa password: "" driver-class-name: org.h2.Driver ``` ::: Use H2 for local development. For production, prefer PostgreSQL, or MySQL if PostgreSQL is not an option. :::alert{type="info"} For PostgreSQL performance issues, consider `random_page_cost=1.1` and `kestra.queue.postgres.disable-seq-scan=true` if queue polling is choosing poor query plans. ::: ## Connection pooling and JDBC queue tuning Most users can keep the defaults here until they see either connection pressure or queue latency. This section matters most for larger deployments, split-component topologies, or databases that are already under load. Kestra uses HikariCP for datasource pooling. 
Common options include: | Property | Purpose | Default | | --- | --- | --- | | `maximum-pool-size` | Maximum number of open connections | `10` | | `minimum-idle` | Minimum number of idle connections | `10` | | `connection-timeout` | Max wait for a connection (ms) | `30000` | | `idle-timeout` | Max idle time (ms) | `600000` | | `max-lifetime` | Max connection lifetime (ms) | `1800000` | :::collapse{title="Full HikariCP property reference"} | Property | Type | Description | Default | | --- | --- | --- | --- | | `url` | String | JDBC connection string | — | | `username` | String | Database username | — | | `password` | String | Database password | — | | `catalog` | String | Default catalog | driver default | | `schema` | String | Default schema | driver default | | `transaction-isolation` | String | Default transaction isolation level | driver default | | `pool-name` | String | Pool name | `HikariPool-` | | `maximum-pool-size` | Int | Maximum number of open connections | `10` | | `minimum-idle` | Long | Minimum number of idle connections | `10` | | `connection-timeout` | Long | Max time to wait for a connection (ms) | `30000` | | `idle-timeout` | Long | Max time a connection can be idle (ms) | `600000` | | `max-lifetime` | Long | Max connection lifetime (ms) | `1800000` | | `validation-timeout` | Long | Max time to validate a connection (ms) | `5000` | | `initialization-fail-timeout` | Long | Timeout for pool initialization failure (ms) | `1` | | `leak-detection-threshold` | Long | Threshold before a connection leak is reported (ms) | `0` | | `connection-init-sql` | String | SQL executed on each new connection | `null` | | `connection-test-query` | String | Query used to validate connections | `null` | ::: Example: ```yaml datasources: postgres: url: jdbc:postgresql://localhost:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 maximum-pool-size: 20 minimum-idle: 10 ``` Rough connection planning: - standalone server: about 10 
connections - split components: about 40 connections - split components with 3 replicas: about 120 connections JDBC queues long-poll the `queues` table. Lower intervals reduce latency but increase database load: ```yaml kestra: jdbc: queues: poll-size: 100 min-poll-interval: 25ms max-poll-interval: 1000ms poll-switch-interval: 5s ``` The JDBC cleaner removes old queue rows: ```yaml kestra: jdbc: cleaner: initial-delay: 1h fixed-delay: 1h retention: 7d ``` To reject oversized JDBC messages before they create memory pressure: ```yaml kestra: jdbc: queues: message-protection: enabled: true limit: 1048576 ``` If you are not troubleshooting queue throughput or database pressure, you can usually leave the JDBC queue settings alone and return to them only when scaling. ## Internal storage Choose the storage backend based on durability and how workers exchange files. Local storage is easy to start with, but object storage is the safer default once you care about resilience or multiple instances. `kestra.storage.type` controls where Kestra stores internal files. Common options include: - `local` for local testing - `s3` - `gcs` - `azure` - `minio` - other object-storage-compatible backends The default local storage is fine for local testing but not for every production topology. The important distinction is whether every Kestra component can see the same files. Representative examples: ```yaml kestra: storage: type: local local: base-path: /app/storage ``` ```yaml kestra: storage: type: gcs ``` ### Local storage deployment guidance Local storage works well for standalone deployments with a persistent volume. In distributed deployments, it only works safely when all components share the same filesystem through a `ReadWriteMany` volume or an equivalent shared storage layer. If that shared filesystem does not exist, move to object storage instead of trying to share host paths between services. 
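For the standalone-with-persistent-volume case, a minimal Docker Compose sketch might look like this. The service and volume names are hypothetical; the mount target matches the `base-path` example above:

```yaml
# Illustrative fragment only: adapt the image tag and volume driver to your setup
services:
  kestra:
    image: kestra/kestra:latest
    volumes:
      - kestra-storage:/app/storage   # persists kestra.storage.local.base-path
volumes:
  kestra-storage:
```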
### Storage isolation Like secret isolation, storage isolation lets you prevent specific services from resolving internal-storage files: ```yaml kestra: storage: type: gcs isolation: enabled: true denied-services: - EXECUTOR ``` This is useful when you want orchestration components to reference files, but do not want every service process to fetch file contents directly. ### S3 Use S3 when Kestra runs in AWS or when another object store exposes a compatible API. ```yaml kestra: storage: type: s3 s3: endpoint: "" access-key: "" secret-key: "" region: "" bucket: "" force-path-style: false ``` If Kestra runs on EC2 or EKS with IAM roles, omit static credentials and keep only the region and bucket: ```yaml kestra: storage: type: s3 s3: region: "" bucket: "" ``` For cross-account access, use STS assume-role settings: ```yaml kestra: storage: type: s3 s3: region: "" bucket: "" sts-role-arn: "" sts-role-external-id: "" sts-role-session-name: "" sts-role-session-duration: "" sts-endpoint-override: "" ``` ### MinIO MinIO is a good self-hosted choice when you want object storage behavior without depending on a cloud provider: ```yaml kestra: storage: type: minio minio: endpoint: my.domain.com port: 9000 secure: false access-key: ${AWS_ACCESS_KEY_ID} secret-key: ${AWS_SECRET_ACCESS_KEY} region: "default" bucket: my-bucket part-size: 5MB ``` If MinIO uses `MINIO_DOMAIN`, enable `kestra.storage.minio.vhost: true` and keep `endpoint` set to the base domain rather than `bucket.domain`. ### SeaweedFS SeaweedFS fits teams that want a lightweight distributed object storage layer in self-managed environments: ```yaml kestra: storage: type: seaweedfs seaweedfs: filer-host: localhost filer-port: 18888 prefix: "" replication: "000" ``` ### Outscale Object Storage Outscale uses the MinIO-compatible backend type. 
The main thing that changes is the endpoint and the requirement to keep TLS enabled: ```yaml kestra: storage: type: minio minio: endpoint: https://oos.eu-west-2.outscale.com bucket: your-bucket-name accessKey: YOUR_ACCESS_KEY secretKey: YOUR_SECRET_KEY port: 443 secure: true ``` ### Azure Blob Storage Choose one Azure authentication method and keep the others unset: ```yaml kestra: storage: type: azure azure: endpoint: "https://unittestkt.blob.core.windows.net" container: storage connection-string: "" shared-key-account-name: "" shared-key-account-access-key: "" sas-token: "" ``` :::alert{type="info"} Disable hierarchical namespace on the target container. That Azure feature is not supported by the storage backend. ::: ### Google Cloud Storage Use GCS when the deployment already runs in GCP or when workload identity is easier to manage than static keys: ```yaml kestra: storage: type: gcs gcs: bucket: "" project-id: "" service-account: "" ``` If `service-account` is omitted, Kestra falls back to default GCP credentials, which is usually the right choice on GKE or GCE. ### Cloudflare R2 Use R2 as an S3-compatible object storage backend: ```yaml kestra: storage: type: cloudflare cloudflare: bucket: "" accountId: "" accessKeyId: "{{ secret('CLOUDFLARE_R2_ACCESS_KEY') }}" secretAccessKey: "{{ secret('CLOUDFLARE_R2_SECRET_KEY') }}" ``` Optional settings: - `path`: Prefix applied to all stored objects - `jurisdiction`: Restricts the bucket to a specific region (e.g. EU) and updates the endpoint accordingly - `endpointOverride`: Custom endpoint, typically used for testing with S3-compatible services ## Server, environment, and JVM settings These settings shape how the instance presents itself and how the Java process behaves at runtime. They are less about feature enablement and more about making the deployment fit its environment. 
Common runtime areas include: - `kestra.server.*` for basic auth and liveness - `kestra.url` for the instance URL - `kestra.environment.*` for environment display metadata - `JAVA_OPTS` for JVM tuning such as timezone and heap settings - `kestra.variables.*` for global variables and recursive rendering behavior Environment metadata shown in the UI: ```yaml kestra: environment: name: Production color: "#FCB37C" ``` JVM settings are usually passed through `JAVA_OPTS`: ```bash export JAVA_OPTS="-Duser.timezone=Europe/Paris -Xmx1g" ``` Common uses include: - setting `user.timezone` to control scheduling and log display - setting a fixed heap with `-Xmx` - configuring Java proxy settings for outbound access Global variables and rendering behavior also live here: ```yaml kestra: variables: env-vars-prefix: ENV_ globals: region: eu-west-1 recursive-rendering: true cache-enabled: true ``` `env-vars-prefix` controls which environment variables become available in expressions under `envs.*`. For example, `ENV_MY_VARIABLE` becomes `{{ envs.my_variable }}`. Use `globals` for values that need to be available in every flow, `recursive-rendering` only when you intentionally want pre-0.14 recursive behavior, and disable `cache-enabled` only while debugging template changes, trading extra CPU for renders that are never served from the template cache. Set `cache-size` to limit the number of cached templates (default `1000`): ```yaml kestra: variables: cache-size: 1000 ``` ## Optional runtime features These settings are not part of the core queue or repository setup, but they do matter in real deployments. Some notifications and generated links depend on `kestra.url` being set to the public base URL without `/ui` or `/api`: ```yaml kestra: url: https://www.my-host.com/kestra/ ``` The web UI can also be customized at runtime: ```yaml kestra: webserver: google-analytics: UA-12345678-1 html-head: | <meta name="robots" content="noindex"> ``` Use `html-head` sparingly for environment banners, extra CSS, or internal scripts that must load with the app shell; the `<meta>` tag above is only an illustration.
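The `env-vars-prefix` mapping described earlier on this page can be sketched as follows, assuming Kestra strips the configured prefix and lowercases the remainder (an illustration of the naming rule, not Kestra's code):

```python
def to_expression_name(env_var: str, prefix: str = "ENV_") -> str:
    """Map an environment variable name to its expression name under envs.*"""
    if not env_var.startswith(prefix):
        raise ValueError(f"{env_var} does not start with the configured prefix")
    return "envs." + env_var[len(prefix):].lower()

print(to_expression_name("ENV_MY_VARIABLE"))  # envs.my_variable
```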
To allow universal file access from host-mounted paths, both mount the directory and add it to the allowlist: ```yaml kestra: local-files: allowed-paths: - /scripts enable-preview: false ``` Without the allowlist, file-access URIs pointing at local host paths will be rejected even if the path is mounted into the container. ## Related configuration pages - Need logs, telemetry, metrics, endpoints, CORS, or SSL: [Observability and Networking](../03.observability-and-networking/index.md) - Need plugin defaults, retries, task temp storage, templates, or system flows: [Plugins and Execution](../04.plugins-and-execution/index.md) - Need secret backends or server hardening: [Security and Secrets](../05.security-and-secrets/index.md) --- # Security & Secrets Configuration in Kestra URL: https://kestra.io/docs/configuration/security-and-secrets > Configure encryption, secret backends, auth-related security settings, RBAC-adjacent platform security, and secure server behavior in Kestra. Use this page when you need to protect sensitive values or harden a Kestra deployment. ## Encryption This is the minimum security configuration most self-managed instances should think about early, because it determines whether sensitive flow values can be stored safely at rest. Kestra supports encryption of sensitive inputs and outputs at rest through `kestra.encryption.secret-key`. Example: ```yaml kestra: encryption: secret-key: BASE64_ENCODED_STRING_OF_32_RANDOM_BYTES ``` Generate a key with: ```bash openssl rand -base64 32 ``` Without `kestra.encryption.secret-key`, `SECRET` inputs and outputs fail at runtime because Kestra cannot encrypt the value at rest.
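If `openssl` is not available, an equivalent key can be produced with Python's standard library (a sketch; any source of 32 cryptographically random bytes, base64-encoded, works):

```python
import base64
import secrets

# 32 random bytes, base64-encoded: the same shape as `openssl rand -base64 32`
key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
print(len(key))  # 44, since base64-encoding 32 bytes yields 44 characters
```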
Example flow using `SECRET` types:

```yaml
id: my_secret_flow
namespace: company.team

inputs:
  - id: secret
    type: SECRET

tasks:
  - id: mytask
    type: io.kestra.plugin.core.log.Log
    message: task that needs the secret to connect to an external system

outputs:
  - id: secret_output
    type: SECRET
    value: "{{ inputs.secret }}"
```

## Secret backends

Choose the backend based on where your organization already stores secrets. In practice, most teams want Kestra to consume an existing cloud or Vault-based secret system rather than create a separate one just for workflows. Kestra can be configured to use a secrets backend through `kestra.secret.*`. This page covers:

- AWS Secrets Manager
- Azure Key Vault
- Google Secret Manager
- HashiCorp Vault
- JDBC
- secret tags
- secret cache
- isolation options

Base structure:

```yaml
kestra:
  secret:
    type: azure-key-vault
    azure-key-vault:
      client-secret:
        tenant-id: "id"
        client-id: "id"
        client-secret: "secret"
    isolation:
      enabled: true
      denied-services:
        - EXECUTOR
```

`isolation` is the key control to understand here: it limits which Kestra services are allowed to resolve secrets, which is useful when you want workers or executors to have narrower access than the whole platform. The Azure service principal referenced in the base structure above must have the following Key Vault access policy permissions: `Get`, `List`, `Set`, `Delete`, `Recover`, `Backup`, `Restore`, `Purge`.

Representative backend examples follow.

AWS Secrets Manager requires the following IAM permissions: `CreateSecret`, `DeleteSecret`, `DescribeSecret`, `GetSecretValue`, `ListSecrets`, `PutSecretValue`, `RestoreSecret`, `TagResource`, `UpdateSecret`.

```yaml
kestra:
  secret:
    type: aws-secret-manager
    aws-secret-manager:
      access-key-id: mysuperaccesskey
      secret-key-id: mysupersecret-key
      session-token: mysupersessiontoken
      region: us-east-1
```

Google Secret Manager requires the `roles/secretmanager.admin` role.
Omit `service-account` to fall back to `GOOGLE_APPLICATION_CREDENTIALS` or the environment's default credentials:

```yaml
kestra:
  secret:
    type: google-secret-manager
    google-secret-manager:
      project: gcp-project-id
      service-account: |
```

Elasticsearch secrets are additionally encrypted with AES. The key must be at least 32 characters:

```yaml
kestra:
  secret:
    type: elasticsearch
    elasticsearch:
      secret: "a-secure-32-character-minimum-key"
```

HashiCorp Vault (KV v2) supports Userpass, Token, and AppRole authentication.

Userpass:

```yaml
kestra:
  secret:
    type: vault
    vault:
      address: "http://localhost:8200"
      password:
        user: john
        password: foo
```

Token:

```yaml
kestra:
  secret:
    type: vault
    vault:
      address: "http://localhost:8200"
      token:
        token: your-secret-token
```

AppRole:

```yaml
kestra:
  secret:
    type: vault
    vault:
      address: "http://localhost:8200"
      app-role:
        path: approle
        role-id: your-role-id
        secret-id: your-secret-id
```

JDBC-backed secrets, secret tags, and secret caching are covered below.

### JDBC secret backend

Use JDBC-backed secrets only when secrets must stay inside the same database boundary as Kestra and you do not already have a dedicated secret manager:

```yaml
kestra:
  secret:
    type: jdbc
    jdbc:
      secret: "your-secret-key"
```

### Secret tags

Some backends let you scope lookups with tags so the same secret manager can serve multiple environments:

```yaml
kestra:
  secret:
    :
      tags:
        application: kestra-production
```

Tags are useful when secrets are selected by metadata rather than by one fixed path convention.

### Secret cache

Caching reduces repeated secret-manager calls for frequently used values:

```yaml
kestra:
  secret:
    cache:
      enabled: true
      maximum-size: 1000
      expire-after-write: 60s
```

Use a cache when executions hit the same secret names repeatedly, but keep TTLs conservative if secret rotation happens often.
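To make the cache semantics concrete, here is a minimal Python sketch of a cache honoring `maximum-size` and `expire-after-write`. This is an illustration of the configured behavior, not Kestra's implementation, and the eviction policy shown (oldest-written entry) is a simplifying assumption:

```python
import time

# Minimal sketch of the secret-cache semantics above: entries expire a fixed
# time after they are written, and the cache is capped at a maximum size.
class TtlCache:
    def __init__(self, maximum_size: int, expire_after_write: float):
        self.maximum_size = maximum_size
        self.ttl = expire_after_write
        self._data = {}  # key -> (value, written_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, written_at = entry
        # Expired entries are dropped on access.
        if time.monotonic() - written_at > self.ttl:
            del self._data[key]
            return None
        return value

    def put(self, key, value):
        if len(self._data) >= self.maximum_size and key not in self._data:
            # Evict the oldest-written entry when full (real caches may use LRU).
            oldest = min(self._data, key=lambda k: self._data[k][1])
            del self._data[oldest]
        self._data[key] = (value, time.monotonic())
```

A longer `expire-after-write` means fewer calls to the secret manager but a longer window during which a rotated secret is served stale, which is why the page recommends conservative TTLs under frequent rotation.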
![Secrets UI Configuration](../is-secrets-configuration.png)

## Security settings

This section is about hardening the running platform rather than managing secret values. Reach for it when you are locking down access, controlling invitations and roles, or tuning how the server reacts to component health issues. This group includes:

- super-admin behavior
- default roles
- invitation expiration
- password rules
- server basic auth
- deletion of configuration files

Server and endpoint hardening examples:

```yaml
kestra:
  server:
    basic-auth:
      username: admin@kestra.io
      password: change-me
      open-urls:
        - "/api/v1/main/executions/webhook/"
```

```yaml
endpoints:
  all:
    basic-auth:
      username: your-user
      password: your-password
```

### Super-admin

The super-admin account has the highest level of platform access and should be reserved for break-glass administration:

```yaml
kestra:
  security:
    super-admin:
      username: your_username
      password: ${KESTRA_SUPERADMIN_PASSWORD}
      tenant-admin-access:
        -
```

:::alert{type="warning"}
Never store clear-text passwords in config. Use environment variables or your platform secret mechanism.
:::

### Default role

Assign a default role to newly created users when you want them to land with a predictable permission set:

```yaml
kestra:
  security:
    default-role:
      name: default
      description: "Default role"
      permissions:
        FLOW: ["CREATE", "READ", "UPDATE", "DELETE"]
```

In multi-tenant environments, scope that role to one tenant:

```yaml
kestra:
  security:
    default-role:
      name: default
      description: "Default role"
      permissions:
        FLOW: ["CREATE", "READ", "UPDATE", "DELETE"]
      tenant-id: staging
```

:::alert{type="info"}
Place `default-role` under `kestra.security`, not `micronaut.security`.
:::

### Invitation expiration and password rules

Invitation links expire after seven days by default.
Extend them if user onboarding happens through slower approval processes:

```yaml
kestra:
  security:
    invitations:
      expire-after: P30D
```

For username/password auth, enforce password complexity explicitly:

```yaml
kestra:
  security:
    basic-auth:
      password-regexp: ""
```

### Delete configuration files after startup

If the runtime reads secrets from configuration files, delete them after startup so tasks cannot read them later from disk:

```yaml
kestra:
  configurations:
    delete-files-on-start: true
```

Liveness and heartbeat settings also belong here. The parameter constraints below affect cluster stability:

- `timeout` — must match across **all Executors**
- `initial-delay` — must match across **all Executors**
- `heartbeat-interval` — must be strictly less than `timeout`

Recommended settings for JDBC-backed (OSS) deployments:

```yaml
kestra:
  server:
    liveness:
      enabled: true
      interval: 5s
      timeout: 45s
      initial-delay: 45s
      heartbeat-interval: 3s
```

Recommended settings for Kafka-based (EE) deployments:

```yaml
kestra:
  server:
    liveness:
      timeout: 1m
      initial-delay: 1m
```

:::alert{type="warning"}
Worker liveness in Kafka mode is handled by Kafka's protocol guarantees, so you only need to set `timeout` and `initial-delay` for the EE stack.
:::

Heartbeat and restart behavior also belong here:

- `kestra.heartbeat.frequency` controls how often workers emit heartbeats. Default: `10s`.
- `kestra.heartbeat.heartbeat-missed` controls how many missed heartbeats mark a worker as dead. Default: `3`.
- `kestra.server.worker-task-restart-strategy` accepts `NEVER`, `IMMEDIATELY`, or `AFTER_TERMINATION_GRACE_PERIOD` and determines what happens to running worker tasks during shutdown.
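Since a violation of these constraints only surfaces as cluster instability at runtime, it can be worth checking a planned change before rollout. A minimal sketch (the helper names are hypothetical and not part of Kestra; it only handles the simple duration strings used on this page):

```python
# Hypothetical pre-rollout check for the liveness constraints listed above.
# Handles simple duration strings such as "3s", "45s", or "1m".

def parse_seconds(value: str) -> int:
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]

def check_liveness(cfg: dict) -> list:
    problems = []
    # heartbeat-interval must be strictly less than timeout.
    if "heartbeat-interval" in cfg:
        if parse_seconds(cfg["heartbeat-interval"]) >= parse_seconds(cfg["timeout"]):
            problems.append("heartbeat-interval must be strictly less than timeout")
    return problems

# The recommended JDBC (OSS) settings from this page pass the check:
jdbc = {"interval": "5s", "timeout": "45s", "initial-delay": "45s", "heartbeat-interval": "3s"}
print(check_liveness(jdbc))  # prints []
```

The cross-node constraints (`timeout` and `initial-delay` matching across all Executors) would need the configs from every node, which is why they are the ones most often broken during partial rollouts.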
Set the termination grace period long enough for tasks to exit cleanly:

```yaml
kestra:
  server:
    termination-grace-period: 5m
```

If the deployment regularly creates empty server instances, adjust how long terminated services are retained before being purged:

```yaml
kestra:
  server:
    service:
      purge:
        retention: 7d
```

:::alert{type="warning"}
Keep the external process manager timeout longer than Kestra's own termination grace period. Otherwise Kubernetes, Docker, or systemd can kill the process before graceful shutdown finishes.
:::

## Regex timeout

Kestra protects worker threads from ReDoS (catastrophic backtracking) by enforcing a timeout on all regex operations. This applies to [Pebble expression filters](../../expressions/index.mdx) (`regexMatch`, `regexReplace`, `regexExtract`, `replace` with `regexp=true`) and to `validator` patterns on `STRING` and `SECRET` inputs. The default timeout is **10 seconds**. To change it, set `kestra.regex.timeout` in your configuration:

```yaml
kestra:
  regex:
    timeout: 30s
```

It accepts any [ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations) string (e.g., `5s`, `PT30S`, `1m`).

:::alert{type="info"}
The timeout is set once at startup and cannot be changed at runtime without restarting the server.
:::

## Related docs

- Secrets manager concepts: [External Secrets Manager](../../07.enterprise/02.governance/secrets-manager/index.md)
- Enterprise auth and RBAC: [Authentication and Users](../../07.enterprise/03.auth/index.mdx)
- EE platform settings and advanced backends: [Enterprise and Advanced](../06.enterprise-and-advanced/index.md)

---

# Contribute to Kestra: Code, Docs, and Community

URL: https://kestra.io/docs/contribute-to-kestra

> Join the Kestra open-source community. Discover how to contribute to the codebase, improve documentation, and engage with other developers.

import ChildCard from "~/components/content/ChildCard.astro"

Contribute to the Kestra open-source project.
## How to contribute to Kestra

---

# Community Guidelines: How to Participate in Kestra

URL: https://kestra.io/docs/contribute-to-kestra/community-guidelines

> Read the Kestra Community Guidelines. Learn how to participate respectfully, ask for help effectively, and foster a welcoming and inclusive environment.

The Kestra community is a welcoming and inclusive place for everyone.

## Participate in the Kestra Community

1. **Be respectful to the Kestra community**
   1. Be respectful toward other members of the Slack community. Harassment will not be tolerated.
   2. Assume positive intent.
2. **Make it easy to help you**
   1. Share relevant flows (YAML), logs, and stack traces [formatted](https://slack.com/intl/en-gb/help/articles/202288908-Format-your-messages) in code blocks (avoid using screenshots).
   2. Share how you deployed Kestra:
      1. Deployment method (Standalone, Docker, Kubernetes, etc.)
      2. Kestra version
      3. Operating system and version
3. **Use relevant channels**
   1. Avoid posting the same question in multiple channels.
4. **Don't spam** — while we'll do our best to help you, there is no guaranteed timeline to answer your question. If you need support with SLA guarantees, [reach out to us](/demo).

If you have questions, feel free to ask in our [Slack community](/slack).

---

# Contribute to the Kestra Codebase: Issues and PRs

URL: https://kestra.io/docs/contribute-to-kestra/contributing

> Guide to contributing to the Kestra codebase. Learn how to report bugs, request features, build plugins, and submit pull requests to help improve the platform.

Contribute to the Kestra open-source project.

## Contribute to the Kestra codebase

You can contribute to Kestra in many ways, depending on your skills and interests. The issues with the label `good first issue` are a great place to start and get familiar with the codebase.
Check out the current list of [good first issues](https://github.com/search?q=org%3Akestra-io+label%3A%22good+first+issue%22+is%3Aopen&type=issues) and start contributing.
## Build a plugin

Check out our [Plugin Developer Guide](../../plugin-developer-guide/index.mdx) for instructions on how to build a new plugin.

## Contribute to the documentation

To contribute to the documentation, fork the [docs repository](https://github.com/kestra-io/docs/fork) and create a pull request with your changes. Check out the [Contribute to Kestra Documentation page](../04.docs-contributor-guide/index.mdx) for more information about building the documentation site locally, how we write the documentation, and contributing to the product and plugin documentation.

## Write a blog post

You can contribute an article about how you use Kestra to our [blog](/blogs). Email [hello@kestra.io](mailto:hello@kestra.io) to start the collaboration. If you wrote a post mentioning Kestra on your personal blog, we'd be happy to feature it in our community section.

## Other ways to show support

- Star Kestra on [GitHub](https://github.com/kestra-io/kestra).
- Follow us on [X](https://twitter.com/kestra_io) and [LinkedIn](https://www.linkedin.com/company/kestra).
- Join the [Slack](/slack) community.

## Build Kestra locally

### Requirements

The following dependencies are required to build Kestra locally:

- JDK 25 (runtime) with source/target set to Java 21
- Node 14+ and npm
- Docker & Docker Compose
- an IDE (IntelliJ IDEA, Eclipse, or VS Code)

To start contributing:

- [Fork](https://github.com/kestra-io/kestra/fork) the repository
- Clone the fork on your workstation:

```shell
git clone git@github.com:{YOUR_USERNAME}/kestra.git
cd kestra
```

### Backend development

The backend is built using [Micronaut](https://micronaut.io). Open the cloned repository in your favorite IDE. In many IDEs, the Gradle build will be detected and all dependencies will be downloaded. You can also build it from a terminal using `./gradlew build`; the Gradle wrapper will automatically download the correct Gradle version to use.
- Set your IDE language level to **Java 21** while using the **JDK 25** toolchain; builds are compiled with `--release 21`.
- You may need to enable Java annotation processors, since we use them heavily.
- The main class is `io.kestra.cli.App` from the module `kestra.cli.main`.
- Pass as program arguments the server you want to develop; for example, `server standalone` starts a standalone Kestra server.
- The IntelliJ IDEA configuration can be found in the screenshot below:

![Intellij Idea Configuration ](./standalone.png)

- `MICRONAUT_ENVIRONMENTS`: can be set to any string and loads a custom configuration file from `cli/src/main/resources/application-{env}.yml`
- `KESTRA_PLUGINS_PATH`: the path where you save plugins as JARs, loaded during the startup process
- If you encounter a **JavaScript memory heap out** error during startup, set the `NODE_OPTIONS` environment variable to a large value, for example `NODE_OPTIONS: --max-old-space-size=4096` or `NODE_OPTIONS: --max-old-space-size=8192`

![Intellij IDEA Configuration ](./node_option_env_var.png)

- You can also use the Gradle task `./gradlew runLocal`, which runs a standalone server with `MICRONAUT_ENVIRONMENTS=override` and the plugins path `local/plugins`
- The server starts by default on port 8080 and is reachable at `http://localhost:8080`.

If you want to launch all tests, you need Python and some packages installed on your machine. On Ubuntu, you can install them with the following commands:

```shell
sudo apt install python3 pip python3-venv
python3 -m pip install virtualenv
```

### Frontend development

All frontend code is located in the `/ui` folder. The front end uses [Vue.js](https://vuejs.org/). Deep knowledge of Vue.js is not required to contribute. To run Kestra's frontend in development mode, you will need Node.js version `22.12.0`; the repository has a `.nvmrc` file.

#### Initial setup

```shell
npm install
```

#### Run the frontend

```shell
npm run dev
```

This will start a local server on port `5173`.
You will need to open the Kestra UI in a browser at http://localhost:5173.

#### Open Storybook

You can also run [Storybook](https://storybook.js.org/) to view the components in isolation.

```shell
npm run storybook
```

This will start a local server on port `6006` and open Storybook in your default browser at http://localhost:6006.

You can also run all tests in the command line without opening a browser:

```shell
npm run test:unit
```

Even better, you can run one `test` file or `stories` file in isolation by specifying part of its name or path in the command:

```shell
npm run test:unit BarChart
```

### Set up the configuration to connect to the backend

Now that you can run the frontend, you will see a loading screen that runs forever: it is waiting for a backend to answer. To set it up, and to avoid CORS restrictions when using the local development npm server, configure the backend to allow the http://localhost:5173 origin in `cli/src/main/resources/application-override.yml`, using the following addition to your [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) YAML definition:

```yaml
micronaut:
  server:
    cors:
      enabled: true
      configurations:
        all:
          allowedOrigins:
            - http://localhost:5173
```

Then, you can run the backend via the Gradle task:

```shell
MICRONAUT_ENVIRONMENTS=override ./gradlew runLocal server standalone
```

This will start a local server on port 8080, accessible at `http://localhost:8080`.

### Set up Kestra frontend without building the backend from the source code

If you want to work on the frontend without having to install Java and everything else needed to run the Kestra application, you can start a Kestra [Docker container](https://docs.docker.com/engine/install/) and connect the frontend to it. To do so, first use the following [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml).
Save it as `docker-compose.yml` in a separate directory from the Git repository and run the following command in this new directory:

```shell
docker compose up
```

This starts Kestra running with PostgreSQL as the database. You can change the port or other configurations by updating the `docker-compose.yml` file. Finally, install the dependencies with `npm install`, and serve the UI with hot reload at http://localhost:5173 using the command `npm run dev`.

## Kestra devcontainer

Thanks to the Kestra community, if you are using VS Code, you can start development on either the frontend or backend with a bootstrapped Docker container, without the need to manually set up the environment. Check out the [README](https://github.com/kestra-io/kestra/tree/develop/.devcontainer) for setup instructions and the associated [Dockerfile](https://github.com/kestra-io/kestra/blob/develop/.devcontainer/Dockerfile) in the repository to get started.

## Code of conduct

This project and everyone participating in it is governed by the [Kestra Code of Conduct](https://github.com/kestra-io/kestra/blob/develop/.github/CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to [hello@kestra.io](mailto:hello@kestra.io).

### Legal notice

> When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content, and that the content you contribute may be provided under the project license.

### Submit issues

To submit feature requests or report bugs, please open an [issue](https://github.com/kestra-io/kestra/issues) on GitHub.

### Reporting bugs

Bug reports make Kestra better for everyone. We provide a preconfigured template for bugs to make it very clear what information we need. Before reporting a bug, please search for your issue in our [already reported bugs](https://github.com/kestra-io/kestra/issues?q=is%3Aissue+is%3Aopen+label%3Abug) to avoid raising a duplicate.
### Reporting security issues

Please do not create a public GitHub issue. If you've found a security issue, please email us directly at [security@kestra.io](mailto:security@kestra.io) instead of raising an issue.

### Requesting new features

Use our issue templates when opening new issues. They contain a few essential questions that help us understand the problem you are looking to solve. To see what has already been proposed by the community, you can refer to our [current issues board](https://github.com/kestra-io/kestra/issues?q=is%3Aissue+is%3Aopen+label%3Aenhancement).

---

# Kestra Docs Contributor Guide: Writer's Reference

URL: https://kestra.io/docs/contribute-to-kestra/docs-contributor-guide

> Help improve Kestra's documentation. This writer's guide covers local build setup, front matter conventions, and best practices for contributing to docs.

import ChildCard from "~/components/docs/ChildCard.astro"

Contribute to the Kestra Documentation. To contribute to the documentation, fork the [docs repository](https://github.com/kestra-io/docs/fork) and create a pull request with your changes.
## Build the documentation locally

The following dependencies are required to build Kestra docs locally:

- Node 14+ and npm
- An IDE (such as VS Code, IntelliJ, etc.)

To start contributing:

- [Fork](https://github.com/kestra-io/docs/fork) the repository
- Clone the fork on your workstation:

```shell
git clone git@github.com:{YOUR_USERNAME}/docs.git
cd docs
```

Use the following commands to serve the docs locally:

```shell
## install dependencies
npm install

## serve with hot reload at localhost:3001
npm run dev

## to generate static pages
npm run generate

## making a production build
npm run build
```

In addition to contributing content and understanding the overall structure of the documentation, it's important to become familiar with the custom Markdown components and patterns used throughout the Kestra Docs. For those contributing to the Kestra Plugin documentation, a basic understanding of plugin structure and Java syntax for doc strings is required. This guide is designed to help external contributors get up to speed with the tools, conventions, and components you'll encounter when contributing to Kestra's documentation.

The documentation is structured on multiple levels. The top level is an index page such as "Getting Started," "Workflow Components," and "Cloud & Enterprise Edition." This acts as a landing page for all content that falls under those high-level categories. To serve a visitor everything within that topic, we use a `ChildCard` component on the index page. This component is built from the `ChildCard.vue` file in the `components/content` directory. The index file's markdown looks like this:

```markdown
---
title: Getting Started
---

Follow the [Quickstart Guide](../../01.quickstart/index.md) to install Kestra and start building your first workflows.
```

And the page displays the following, with all the subtopics of "Getting Started" listed with their card and icon:

![Getting Started ChildCard](./child-card.png)

Note that when writing a standalone documentation page, the first sentence appears in the ChildCard view to introduce the topic. In the above example for the [Quickstart Guide](../../01.quickstart/index.md), this sentence is visible:

```markdown
Start Kestra in a Docker container and create your first flow.
```

Ideally, keep this first sentence as clear and concise as possible so it doesn't clutter the view on the card.

### Front matter

Each documentation page is expected to include several key front matter properties. We briefly mentioned one of them, **icon**, in the last section. For example, take our [Apps](../../07.enterprise/04.scalability/apps/index.md) page. This is the front matter specified on the markdown page:

```markdown
---
title: Apps in Kestra Enterprise – Build Frontends for Flows
description: Build custom Apps with Kestra. Create user-facing interfaces for workflows, enabling forms, approvals, and interactive data applications.
sidebarTitle: Apps
icon: /src/contents/docs/icons/admin.svg
editions: ["EE", "Cloud"]
version: ">= 0.20.0"
docId: apps
---
```

And this is the resulting view:

![Apps Front Matter](./apps-frontmatter.png)

Each property is described below.

#### title

`title` is the SEO title of the page. It appears in browser tabs, search results, and social previews. Write it to be descriptive and keyword-rich. For feature or concept pages this is typically the feature name; for how-to guides, make it clear about both the purpose and the tool involved (e.g., [Access Files on your Local Machine in Kestra](../../15.how-to-guides/access-local-files/index.md)). Use Title Case for `title` and `sidebarTitle`.

#### h1

`h1` controls the heading displayed at the top of the page. It is separate from `title` so the on-page heading can be written for readability while `title` is optimised for search.
If `h1` is omitted, `title` is used as the heading. Most pages define both:

```markdown
---
title: "Handle Errors in Kestra: Retries and Alerts"
h1: Build Resilient Workflows with Retries, Alerts & Failure Handling
---
```

#### description

`description` is the meta description shown in search results and link previews. Write one to two sentences (roughly 150–160 characters) summarising the page's content. Every page should include one.

#### sidebarTitle

`sidebarTitle` controls the label shown in the left-hand navigation. Keep it short — typically just the feature or topic name. Use Title Case.

#### icon

Icons are SVG files used to identify a certain tool or a general concept. They appear at the top of all documentation pages and in the ChildCard of the page. For example, this [Neon with Kestra guide](../../15.how-to-guides/neon/index.md) has the following properties:

```markdown
---
title: Connect Neon Database to Kestra
icon: /src/contents/docs/icons/neon.svg
stage: Intermediate
topics:
  - Integrations
---
```

And it appears on the site as follows:

![Neon Icon Display](./neon-icon.png)

The icon lives in the `public/docs/icons` folder and is specified as the [Neon](https://neon.tech/home) logo, so the correct logo shows for the tool. General icons, such as `api.svg` or `installation.svg`, are also available in the folder. If you contribute a guide incorporating a tool without an existing icon, place the appropriate SVG file in this folder and reference it in the front matter.

#### topics & stage

Our **How-To Guides** require a couple of extra front-matter properties to provide clarity to the site visitor about the guide's topic and level: `topics` and `stage`. Using the same example as above, you can see that the properties are set as `stage: Intermediate` and `topics: Integrations`.
```markdown
---
title: Connect Neon Database to Kestra
icon: /src/contents/docs/icons/neon.svg
stage: Intermediate
topics:
  - Integrations
---
```

These properties are `const` variables set in the `GuidesChildCard.vue` file of the repository. They have a set list to choose from when classifying a guide. For example, `stage` can be "Getting Started," "Intermediate," or "Advanced." `topics` can be any of a number of concepts such as "Scripting," "Kestra Concepts," "Best Practices," and more. If your guide doesn't fit into any of the existing topics, feel free to suggest a new one in a pull request.

#### editions

Kestra has three editions: Open Source, Enterprise, and Cloud. A feature or guide may be relevant only to one, two, or all editions, so we have a front-matter property to specify that right at the top of a page for the reader. For example, depending on the Kestra edition, there are different pages relevant to handling secrets. We have a [Kubernetes Secrets How-to Guide](../../15.how-to-guides/kubernetes-secrets/index.md) where we set the edition as `OSS` in the front matter:

```markdown
---
title: Set Up Secrets from a Helm Chart
icon: /src/contents/docs/icons/helm.svg
stage: Getting Started
topics:
  - Kestra Concepts
  - DevOps
editions: ["OSS"]
---
```

And we have a page for [Secrets](../../07.enterprise/02.governance/secrets/index.md) that is specifically for **Enterprise & Cloud** users:

```markdown
---
title: Secrets
icon: /src/contents/docs/icons/admin.svg
editions: ["EE", "Cloud"]
docId: secrets
---
```

#### version

Like `editions`, some Kestra features are only available from specific Kestra versions onwards. We use the `version` property in the front matter to identify this in the documentation. For example, [Worker Groups](../../07.enterprise/04.scalability/worker-group/index.md) are only available starting in Kestra version 0.10.0.
This is specified as follows:

```markdown
---
title: Worker Group
icon: /src/contents/docs/icons/admin.svg
editions: ["EE"]
version: ">= 0.10.0"
---
```

#### docId

One of Kestra's major benefits is its in-app contextual docs. This means that when constructing flows in the platform, you can access the documentation in the same interface without having to navigate to the browser to check against our documentation. This is done through the `docId` front matter. Kestra knows that you are working with Apps, and it can show you the relevant documentation without a task switch.

![In-App Docs](./in-app-docs.gif)

The same is true for all the main components of Kestra (e.g., Namespace, Flow, Blueprints, Plugins, etc.).

#### release

`release` is a front matter property only relevant for our [Migration Guides](../../11.migration-guide/index.mdx). These guides outline the need-to-know information for upgrading from one version of Kestra to another, such as a renamed feature or "Before and After" examples of an action in Kestra. Example configuration looks like this:

```markdown
---
title: Restarting parent flow
icon: /src/contents/docs/icons/migration-guide.svg
release: 0.21.0
editions: ["OSS", "EE"]
---
```

### Writing style

#### Headings

Use sentence case for all body headings — capitalise only the first word and proper nouns. Use Title Case for `title` and `sidebarTitle` in front matter. Avoid restating the page title as the first H2. The first heading should introduce the first distinct section, not repeat what the title already says.

#### Voice and tone

- Address the reader as "you" (second person).
- Use active voice and present tense for facts and product behavior.
- Be direct. Avoid filler phrases like "In this guide," "It's worth noting that," or "Simply."
- Do not use first-person plural ("we," "our"). Kestra docs address the reader, not the writing team.

### Customized text

We use several components to add customized text presentations in the documentation.
To differentiate important information from average text, we use three different levels of alert types: "info," "success," and "warning."

:::alert{type="info"}
This is important to note.
:::

:::alert{type="success"}
Yippee, it worked.
:::

:::alert{type="warning"}
This is a warning, but it's fine.

![this is fine](./this-is-fine.png)
:::

Use alerts sparingly. Reserve them for content that would cause failure or confusion if missed — a required prerequisite, a destructive side effect, or a non-obvious constraint. Avoid using them for general information that works just as well as a sentence in the body text.

Another helpful component we use is `:::collapse`. This tag keeps the documentation space-efficient by hiding long examples or other information that does not need to be visible when scrolling the page, while the reader can still open it to reveal its content. This is particularly useful for flows that could otherwise take up a lot of space on a page, or FAQ answers that may not be relevant to every reader and can be expanded as needed. Use the following syntax, with whatever should be collapsed placed between the colons and the title inline with `:::collapse`:

```markdown
:::collapse{title="Introduction to whatever is collapsed"}
Here is where the collapsed text goes.
:::
```

Here is a full example using a flow and subflow with a ForEach task:

:::collapse{title="Full Flow Example"}
Subflow:

```yaml
id: subflow
namespace: company.team

inputs:
  - id: items
    type: STRING

tasks:
  - id: for_each_item
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - cat "{{ inputs.items }}"

  - id: read
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(inputs.items) }}"
```

Below is a flow that uses the `ForEachItem` task to iterate over a list of items and run the `subflow` for a batch of 10 items at a time:

```yaml
id: each_parent
namespace: company.team

tasks:
  - id: extract
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: |
      INSTALL httpfs;
      LOAD httpfs;
      SELECT *
      FROM read_csv_auto('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv', header=True);
    store: true

  - id: each
    type: io.kestra.plugin.core.flow.ForEachItem
    items: "{{ outputs.extract.uri }}"
    batch:
      rows: 10
    namespace: company.team
    flowId: subflow
    wait: true
    transmitFailed: true
    inputs:
      items: "{{ taskrun.items }}"
```
:::

While a feature may be available after a specific Kestra version as indicated in the front matter, an additional capability may be added in a later version that doesn't match the front matter. We use the `:::badge` component to indicate this in the documentation for only a particular page section. This component can be used at any point on the page rather than solely at the top. The component has the following syntax, able to include both `version` and `editions` like the front matter:

```markdown
:::badge{version=">=0.15" editions="OSS,EE,Cloud"}
:::
```

:::badge{version=">=0.15" editions="OSS,EE,Cloud"}
:::

### Video container

In the documentation, we try to always have an accompanying video for the discussed feature. To ensure the YouTube video is embedded and displayed correctly and consistently on every page, we use a custom `video-container` div class.
Add the div after the page's introductory sentence and before the main content, for example (the embed URL below is a placeholder):

```html
<div class="video-container">
  <iframe src="https://www.youtube.com/embed/VIDEO_ID" title="Video title" allowfullscreen></iframe>
</div>
```

The `video-container` class is maintained in the repository's `docs.scss` file. Refer back to the top of this page or check out the [Contributing Guide](../03.contributing/index.md) to see an embedded video in action.

### Code blocks

When including code blocks in the documentation, specify which language the example is written in. Typically, in the Kestra documentation, example flows are included to demonstrate a feature; they are defined as a `yaml` code block. For example, see the following flow in markdown:
```yaml
id: getting_started
namespace: company.team
tasks:
  - id: hello_world
    type: io.kestra.plugin.core.log.Log
    message: Hello World!
```
The supported languages for code blocks are fully listed in the `useShiki.ts` file in the repository; if you need a new language added, you can make an addition there.

### How to use images

Images are a key part of the documentation. We keep the images used on a page within that page's directory. For example, the Apps documentation markdown page and its associated images are contained in the same folder, which keeps assets together and makes them easy to find and add. Taking this guide as an example, an image used earlier in the guide appears in the markdown as follows:

```markdown
![Apps Front Matter](./apps-frontmatter.png)
```

The image has a clear title and is located in the Apps folder. For this guide, all images are placed in this folder path so the organization stays clear and easy for other contributors to work with. This same practice is used for our blog and other parts of the website that are kept in the repository.

## Contribute to Kestra Plugin Documentation

Kestra Plugins each have their own documentation page on the website in [Plugins](/plugins). Each plugin also has in-app contextual documentation so that task and property definitions are easily usable while building flows. Plugin docs are maintained in separate repositories rather than in the product documentation. For example, if you want to contribute to the [OpenAI Plugin](/plugins/plugin-openai), you can find the documentation in the [OpenAI Plugin Repository](https://github.com/kestra-io/plugin-openai).

:::alert{type="info"}
All plugin repos are searchable from the central [Kestra GitHub](https://github.com/kestra-io). The name of the repository is in the URL of the plugin documentation page. For example, the OpenAI repo is called `plugin-openai`, which appears in the URL path `https://kestra.io/plugins/plugin-openai/io.kestra.plugin.openai.chatcompletion`. Simply searching for the tool's name should suffice, but the URL is a reliable fallback.
:::

To contribute to a plugin's documentation, fork the repository.
Once cloned, contributions are welcome to four key components of plugin task documentation: title, description, examples, and properties. Continuing with [OpenAI](/plugins/plugin-openai), the tasks include [ChatCompletion](/plugins/plugin-openai/io.kestra.plugin.openai.chatcompletion) and [CreateImage](/plugins/plugin-openai/io.kestra.plugin.openai.createimage).

![OpenAI Plugin Tasks](./openai-plugin-tasks.png)

Each task is in the path `src/main/java/io/kestra/plugin/openai`. The layout is similar for all other plugins (i.e., `src/main/java/io/kestra/plugin/`). To improve or add to the documentation, open the Java file for the task and edit the `@Schema`, `@Plugin`, and `@Example` doc strings.

:::alert{type="info"}
You do not need to be well-versed in Java to contribute to the plugin documentation. The doc strings are organized so that they are easy to work with, and we will review any contributions anyway, so have no fear. You can read more about how we instruct developers to document their plugins in the [Plugin Developer Guide](../../plugin-developer-guide/07.document/index.md).
:::

The plugin documentation will generally look like the following:

```java
@Schema(
    title = "Given a prompt, get a response from an LLM using OpenAI's Chat Completions API.",
    description = "For more information, refer to the [Chat Completions API docs](https://platform.openai.com/docs/guides/gpt/chat-completions-api)."
)
@Plugin(
    examples = {
        @Example(
            full = true,
            title = "Based on a prompt input, generate a completion response and pass it to a downstream task.",
            code = """
                id: openai
                namespace: company.team

                inputs:
                  - id: prompt
                    type: STRING
                    defaults: What is data orchestration?

                tasks:
                  - id: completion
                    type: io.kestra.plugin.openai.ChatCompletion
                    apiKey: "yourOpenAIapiKey"
                    model: gpt-4o
                    prompt: "{{ inputs.prompt }}"

                  - id: response
                    type: io.kestra.plugin.core.debug.Return
                    format: "{{ outputs.completion.choices[0].message.content }}"
                """
        ),
```

The key properties to consider are:

- `title`: A concise single sentence describing the task's objective, displayed in the Kestra in-app contextual docs.
- `description`: Additional information such as links to the external tool's documentation or best practices for using the task.
- `examples`: Flow examples that demonstrate the task in use. Best if it is a logical use case utilizing multiple Kestra features (e.g., [Triggers](../../05.workflow-components/07.triggers/index.mdx), [Inputs](../../05.workflow-components/05.inputs/index.md), [Outputs](../../05.workflow-components/06.outputs/index.md), etc.).

Similarly to the main plugin attributes, the properties are documented with a `title` and a `description`. For example, the [OpenAI ChatCompletion properties](/plugins/plugin-openai/io.kestra.plugin.openai.chatcompletion#properties-body):

```java
public class ChatCompletion extends AbstractTask implements RunnableTask<ChatCompletion.Output> {
    @Schema(
        title = "A list of messages comprising the conversation so far.",
        description = "Required if prompt is not set."
    )
    private Property<List<ChatMessage>> messages;

    @Schema(
        title = "The function call(s) the API can use when generating completions."
    )
    private Property<List<ChatFunction>> functions;

    @Schema(
        title = "The name of the function OpenAI should generate a call for.",
        description = "Enter a specific function name, or 'auto' to let the model decide. The default is auto."
    )
    private Property<String> functionCall;

    @Schema(
        title = "The prompt(s) to generate completions for. By default, this prompt will be sent as a `user` role.",
        description = "If not provided, make sure to set the `messages` property."
    )
```

To improve or add titles, descriptions, or examples, create a pull request or issue on the specific plugin repository.

## Contribute to Kestra Blueprints

The official Kestra Blueprints library can be found under [kestra.io/blueprints](/blueprints). Blueprints are a curated, organized, and searchable catalog of ready-to-use examples designed to help you kickstart your workflow. Each Blueprint combines code and documentation and can be assigned several tags for organization and discoverability.

To contribute a Blueprint or modify an existing one, clone the [Blueprints repository](https://github.com/kestra-io/blueprints). Within the repository, there are blueprints for [Apps](https://github.com/kestra-io/blueprints/tree/main/apps), [Dashboards](https://github.com/kestra-io/blueprints/tree/main/dashboards), and [Flows](https://github.com/kestra-io/blueprints/tree/main/flows).

All Blueprints are `yaml` files composed of the example Flow, App, or Dashboard and an `extend` property that specifies attributes such as `title` and `description` to propagate onto the website. For example, the [Getting Started with Kestra – a Data Engineering Pipeline](/blueprints/data-engineering-pipeline) Blueprint has the following `extend` property:

```yaml
extend:
  title: Getting started with Kestra — a Data Engineering Pipeline example
  description: |
    This flow is a simple example of a Kestra flow used for a data engineering
    use case. It downloads a JSON file, filters the data, and calculates the
    average price per brand.

    The flow has three tasks:
    1. The first task downloads a JSON file.
    2. The second task filters the data and writes it to a new JSON file.
    3. The third task reads the filtered data, calculates the average price
       per brand using DuckDB, and stores the result as a Kestra output which
       can be previewed and downloaded from the UI.
  tags:
    - Getting Started
    - API
    - Python
    - SQL
  ee: false
  demo: true
  metaTitle: Getting Started - Data Engineering Pipeline
  metaDescription: This flow represents a data engineering use case. It downloads
    a JSON file, filters the data in Python, and calculates the KPIs in SQL
    using DuckDB.
```

Check out the [full file](https://github.com/kestra-io/blueprints/blob/main/flows/data-engineering-pipeline.yaml) to see the Flow's YAML. For the Blueprint to be easily searchable, it is essential to include the appropriate `tags`. A complete list of tags is available on the [Blueprints homepage](/blueprints).

With the proper YAML and `extend` property, the Flow's topology will display interactively on the Blueprint page along with a **Copy source code** button and task icons.

![Blueprint Page](./blueprint-page.png)

To suggest new blueprints or improve existing ones, create a pull request or issue on the [Blueprints repository](https://github.com/kestra-io/blueprints).

---

# Cloud & Enterprise Edition: Features and Setup
URL: https://kestra.io/docs/enterprise

> Kestra Enterprise & Cloud. Explore advanced features like SSO, RBAC, Multi-tenancy, and High Availability for enterprise-grade orchestration.

import ChildCard from "~/components/docs/ChildCard.astro"

How to configure Kestra Enterprise Edition and Kestra Cloud.

## Enterprise and Cloud Overview

[Enterprise Edition](/enterprise) is a self-hosted version of Kestra deployed to your private infrastructure. It offers security and governance features including Multi-tenancy, Authentication, SSO, RBAC, Namespace-level management, distributed Worker Groups, Worker isolation, Secrets Manager integrations, Audit Logs, and more.

[Kestra Cloud](/cloud) is a fully managed version of Kestra Enterprise Edition, hosted and maintained by the Kestra team. It provides most of the features of the Enterprise Edition, plus the additional benefits of automatic updates, backups, and infrastructure monitoring.
## Key differences between Kestra Enterprise and Kestra Cloud

While Kestra Cloud is fully managed, it differs from Kestra Enterprise in several important ways, primarily around infrastructure control, customization, and direct access to backend components.

| Feature / Area | Kestra Cloud | Kestra Enterprise Edition |
| --- | --- | --- |
| **Infrastructure Control** | Fully managed by Kestra for simplicity and reliability | Full control and customization |
| **Backend Technology** | PostgreSQL JDBC only | Customizable: Kafka, PostgreSQL, MySQL, H2 (testing) |
| **Workers** | Managed worker pools sized for stability | Remote [Worker Groups](./04.scalability/worker-group/index.md), autoscaling |
| **Custom Internal Storage & Secrets** | No instance-level control, **tenant/namespace-level only** backends | Fully customizable backends |
| **Network Configuration** | Secure access over public Internet | Private networking (self-hosted, VPC peering, etc.) |
| **Backup Access** | Automatic backups handled by Kestra | Customer-controlled backups |
| **Plugins** | Curated plugin environment | Full plugin customization |
| **Identity Providers (IdP)** | Built-in Google, Microsoft, or Basic Authentication | Custom SSO/SCIM supported |
| **Log Retention** | Automatic retention protocol managed by Kestra | Unlimited (based on customer setup) |
| **Deployment Regions** | US & EU (Belgium) on GCP | Any cloud, any region |
| **Task Runners** | Compatible with most, [Process Task Runner](../task-runners/04.types/01.process-task-runner/index.md) excluded | Compatible with all task runners |

This section describes those features in detail and explains how to configure them.
If you're interested in learning more, check the [Open-Source and Enterprise Edition comparison](../oss-vs-paid/index.md), explore our [Pricing](/pricing), and [get in touch](/demo) to discuss your requirements.

---

# Auth & Users in Kestra Enterprise: RBAC, SSO
URL: https://kestra.io/docs/enterprise/auth

> Manage Authentication and Users in Kestra Enterprise. Overview of RBAC, SSO, API tokens, and service accounts for secure access control.

import ChildCard from "~/components/docs/ChildCard.astro"

Features for managing Authentication, Role-based Access Control, Users, and more in Kestra.

## Authentication and users – RBAC and access
--- # Enterprise API in Kestra: Endpoints and Auth URL: https://kestra.io/docs/enterprise/auth/api > Interact with the Kestra Enterprise API. Learn about available endpoints, authentication methods, and how to programmatically manage your Kestra instance. How to interact with the Kestra Enterprise Edition using the API. ## Kestra Enterprise API – endpoints and authentication
## Authentication

To authenticate with the Kestra API, you need to create an [API token](../api-tokens/index.md). You can create it directly from the Kestra UI. Once you have your API token, use it to authenticate with the API by passing it in the `Authorization` header as a `Bearer` token.

```bash
curl -X POST http://localhost:8080/api/v1/executions/company.team/hello_world \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```

## Browse the API Reference

For a full list of available API endpoints, refer to the [Enterprise Edition API Reference](../../../api-reference/01.enterprise/index.mdx).

---

# API Tokens in Kestra: Manage Programmatic Access
URL: https://kestra.io/docs/enterprise/auth/api-tokens

> Manage programmatic access with API Tokens in Kestra. Create and control tokens for users and service accounts to securely interact with the Kestra API.

How to manage API tokens in Kestra.
## API tokens – manage programmatic access API tokens authenticate requests to the Kestra API. You can create an API token for a user or a [service account](../service-accounts/index.md). ## Where you can use API tokens API tokens are used anytime you want to grant programmatic access to the Kestra API. To authenticate your custom API calls, you can pass a bearer token to the request header. For example, you can use API tokens to authenticate with the Kestra API from a CI/CD pipeline or from a custom application. Currently, we support API tokens as an authentication mechanism for the following services: 1. [GitHub Actions](https://github.com/kestra-io/deploy-action) 2. [Terraform Provider](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs) 3. [Kestra Server CLI](../../../kestra-cli/kestra-server/index.md) 4. [kestractl](../../../kestra-cli/kestractl/index.md) 5. [Kestra API](../api/index.md) ## How to create a User API token To create an API token, navigate to your profile in the bottom left corner of the Kestra UI and click on **+ Create API Token**. ![user-api-token](./user-api-token.png) Once in your profile, click **+ Create API Token** in the **Manage your API Tokens** section. ![create-api-token](./create-api-token.png) Fill in the form with the required information, including the `Name`, `Description`, and `Max age`. Once satisfied, click `Generate`: ![new-token-details](./new-token-details.png) :::alert{type="info"} **Note:** you can configure the token to expire after a certain period of time or to never expire. Also, there is a toggle called `Extended` that automatically prolongs the token's expiration date by the specified number of days (`Max Age`) if the token is actively used. This toggle is disabled by default. ::: Once you confirm the API token creation, the token will be generated and displayed in the UI. Make sure to copy the token and store it in a secure location, as it will not be displayed again. 
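For instance, once stored securely (here assumed to live in a `KESTRA_API_TOKEN` environment variable), the token can be attached to API calls from a script using the same `Authorization` header as the curl examples on this page. A minimal standard-library sketch; the `kestra_request` helper is illustrative, not part of Kestra:

```python
import os
import urllib.request

# Illustrative helper (not part of Kestra): build an authenticated request for
# the Kestra API, reading the token from an environment variable instead of
# hard-coding it. KESTRA_API_TOKEN is an assumed variable name.
def kestra_request(base_url: str, path: str, token: str, method: str = "POST") -> urllib.request.Request:
    request = urllib.request.Request(f"{base_url}{path}", method=method)
    request.add_header("Authorization", f"Bearer {token}")
    return request


req = kestra_request(
    "http://localhost:8080",
    "/api/v1/executions/company.team/hello_world",
    os.environ.get("KESTRA_API_TOKEN", "YOUR_API_TOKEN"),
)
print(req.get_header("Authorization"))  # e.g. Bearer YOUR_API_TOKEN
```

Sending the request (for example with `urllib.request.urlopen(req)`) then triggers the flow execution, exactly like the equivalent curl call.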
![copy-and-save](./copy-and-save.png)

## How to create a Service Account API token

To create an API token for a Service Account, navigate to the `Administration` section and click the `Service Accounts` page. Then, go to the `API Tokens` tab and click the `Create` button:

![api-token](./api-token.png)

Fill in the form with the required information including the `Name`, `Description`, and `Max age`. Once satisfied, click `Generate`:

![api-token2](./api-token2.png)

:::alert{type="info"}
**Note:** same as for a user token, you can configure the token to expire after a certain period of time or to never expire. Also, there is a toggle called `Extended` that will automatically prolong the token's expiration date by the specified number of days (`Max Age`) if the token is actively used. That toggle is disabled by default.
:::

Once you confirm the API token creation via the **Generate** button, the token will be generated and displayed in the UI. Make sure to copy the token and store it in a secure location as it will not be displayed again.

![api-token3](./api-token3.png)

## How to use an API token in an API request

To authenticate your custom API calls, pass a `Bearer` token to the request's `Authorization` header. Here is an example that will trigger a flow execution using the Kestra API:

```bash
curl -X POST http://localhost:8080/api/v1/executions/dev/hello-world \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```

---

# Authentication in Kestra Enterprise: OIDC Setup
URL: https://kestra.io/docs/enterprise/auth/authentication

> Configure Authentication in Kestra. Set up Basic Auth and OpenID Connect (OIDC) for secure user login and access management.

How to configure authentication for your Kestra instance.

## Authentication – configure login and OIDC
Kestra provides two authentication methods:

- Basic Auth – enabled by default
- OpenID Connect (OIDC)

By default, JWT token security is configured to use the default Kestra encryption key. If you haven't already configured it, generate a secret that is at least 256 bits and add it to your [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) as follows:

```yaml
kestra:
  encryption:
    secret-key: your-256-bits-secret
```

This secret must be the same across all your webserver instances and will be used to sign the JWT cookie and encode the refresh token. If you want to use different keys, you can configure them as follows:

```yaml
micronaut:
  security:
    token:
      jwt:
        generator:
          refresh-token:
            secret: refresh-token-256-bits-secret
        signatures:
          secret:
            generator:
              secret: signature-256-bits-secret
```

:::alert{type="info"}
**JWT configuration**
It is possible to change the JWT cookie behavior using the [Micronaut Cookie Token Reader](https://micronaut-projects.github.io/micronaut-security/latest/guide/#cookieToken) configuration. For example, you can define the cookie's maximum lifetime using `micronaut.security.token.cookie.cookie-max-age: P2D`.
:::

## Basic authentication

The default installation comes with no users defined. To create an administrator account, use the following CLI command:

```bash
./kestra auths users create --admin --username= --password= --tenant=
```

If you do not have multi-tenancy enabled, you can omit the `--tenant` parameter.

:::alert{type="info"}
Multi-tenancy is enabled by default, so make sure to include the `--tenant` parameter.
:::

## Single sign-on (SSO)

Single Sign-On (SSO) is an authentication process that allows users to access multiple applications with one set of login credentials (e.g., Sign in with Google). Kestra supports SSO using the OpenID Connect (OIDC) protocol, which is a simple identity layer built on top of the OAuth 2.0 protocol.
To enable OIDC in the application, enable OIDC in Micronaut:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        google:
          client-id: "{{ clientId }}"
          client-secret: "{{ clientSecret }}"
          openid:
            issuer: "{{ issuerUrl }}"
```

More information can be found in the [Micronaut OIDC configuration](https://micronaut-projects.github.io/micronaut-security/latest/guide/#openid-configuration). Check the [Single Sign-On documentation](../sso/index.md) for more details on how to configure SSO with Google, Microsoft, and other providers.

---

# Credentials in Kestra: Authenticate External Systems
URL: https://kestra.io/docs/enterprise/auth/credentials

> Authenticate to external systems securely with Kestra Credentials. Store and manage server-to-server auth tokens for use across flows and namespaces.

Authenticate to external systems securely.

## Credentials – Server to Server authentication for Flows

Credentials are a reusable way to configure server-to-server authentication once and use it across tasks. Instead of embedding token minting and refresh logic in each plugin, Kestra can mint and refresh access tokens at runtime, and you reference them in your workflow with a simple expression.

Many APIs are moving away from long-lived static API keys toward **short-lived tokens** (e.g., OAuth 2.0), which improves security and simplifies rotation and revocation. For simple static values (API keys, usernames/passwords), use [Secrets](../../../06.concepts/04.secret/index.md) directly.

Sensitive material used by a credential (client secrets, private keys, certificates) is referenced via [Secrets](../../../06.concepts/04.secret/index.md) so it can be stored in external or read-only secret managers (e.g., [Secrets Manager](../../02.governance/secrets-manager/index.md) or [Read-only Secrets](../../02.governance/read-only-secrets/index.md)) and never appears in plain text in the credential config.
---

## Availability and scope

Credentials can be accessed and created at:

- **Tenant level** (reusable across namespaces in that tenant)
- **Namespace level** (scoped to a single namespace)

During setup, Kestra lets you **test token retrieval** from the UI to ensure your configuration is correct.

---

## Use a credential in a flow

Use the `credential()` Pebble function to retrieve the **current access token** for a credential key.

```yaml
id: api_call
namespace: company.team

tasks:
  - id: request
    type: io.kestra.plugin.core.http.Request
    uri: https://api.example.com/v1/ping
    method: GET
    auth:
      type: BEARER
      token: "{{ credential('my_oauth') }}"
```

`credential()` returns the access token only. For non-sensitive configuration (e.g., hostnames, table names, feature flags), prefer [Variables](../../../05.workflow-components/04.variables/index.md).

---

## Credential types

Credentials cover common server-to-server authentication patterns, including:

- OAuth2 `client_credentials` (generic)
- OAuth2 JWT Bearer extension grant (`jwt_bearer`, RFC 7523)
- OAuth2 `private_key_jwt` (client authentication)
- GitHub App

Credentials can reference sensitive inputs via existing [Secrets](../../../06.concepts/04.secret/index.md) (e.g., client secrets, private keys, certificates), including secrets stored in an external or [read-only secrets manager](../../02.governance/read-only-secrets/index.md).

---

## Example: Google service account with JWT Bearer

The following example shows how to use a Google Cloud service account with an OAuth2 JWT Bearer credential in Kestra.

### 1. Create a service account in Google Cloud

In Google Cloud:

1. Go to **IAM & Admin** -> **Service Accounts**.
2. Create a new service account and grant it only the roles required for your use case.
3. Open the service account, go to **Keys**, and create a new **JSON** key.
4. Download the JSON key file.
From that JSON file, you will use:

- `client_email`
- `private_key`
- `private_key_id`
- `token_uri`

For more information, see the [Google service account guide](https://cloud.google.com/iam/docs/service-account-overview).

### 2. Create a secret for the private key

Store the private key from the downloaded JSON in a Kestra secret rather than embedding it directly in the credential. For example, create a secret named `GCP_PRIVATE_KEY` with the value of the `private_key` field from the JSON file. You can manage that secret from the Kestra UI or by using an external [Secrets Manager](../../02.governance/secrets-manager/index.md).

### 3. Create the credential in Kestra

In the Credentials UI, create a new credential with the following values:

- **Credential Type:** `OAUTH2`
- **Auth Config Type:** `JWT_BEARER`
- **Token Endpoint:** `https://oauth2.googleapis.com/token`
- **Issuer:** the `client_email` value from the JSON key
- **Subject:** use the service account email for a standard service account flow; for Google Workspace domain-wide delegation, use the delegated user instead
- **Private Key:** reference the `GCP_PRIVATE_KEY` secret
- **Key ID:** the `private_key_id` value from the JSON key
- **Algorithm:** `RS256`
- **Additional Claims:** add a `scope` claim containing the Google OAuth scopes required by the API, for example `https://www.googleapis.com/auth/cloud-platform.read-only`

For Google service accounts, the scope must be included in the JWT claims. If you need multiple scopes, provide them as a single space-delimited string in the `scope` claim, for example:

```plaintext
https://www.googleapis.com/auth/cloud-platform.read-only https://www.googleapis.com/auth/bigquery.readonly
```

:::alert{type="info"}
Use the **Test connection** action in the Credentials UI to confirm that Kestra can mint an access token before using the credential in a flow.
:::

### 4. Use the credential in a flow

Once the credential is saved, you can use it in a flow with the `credential()` Pebble function. The example below calls the Google Cloud Resource Manager API and sends the access token as a Bearer token in the request header:

```yaml
id: google_api_with_credential
namespace: company.team

inputs:
  - id: project_id
    type: STRING

tasks:
  - id: request
    type: io.kestra.plugin.core.http.Request
    method: GET
    uri: "https://cloudresourcemanager.googleapis.com/v1/projects/{{ inputs.project_id }}"
    options:
      auth:
        type: BEARER
        token: "{{ credential('gcp-service-account') }}"

  - id: log_result
    type: io.kestra.plugin.core.log.Log
    message: |
      code={{ outputs.request.code }}
      body={{ outputs.request.body }}
```

If the service account has the required permissions on the target project, the request should return `200` and the project metadata in the response body.

---

## Token lifecycle and caching

- Tokens are **not persisted**.
- The token cache is **in-memory only** (when enabled).
- Tokens are retrieved during **task execution** and refreshed based on the **Refresh before expiry** setting configured on the credential.
- Token caching can be **enabled or disabled** per credential.

:::alert{type="warning"}
Avoid storing long-lived secrets directly in flow YAML. Prefer credentials + secrets so Kestra can handle token minting/refresh and reduce exposure risk.
:::

---

## Credential hygiene

- **Least privilege:** scope credentials to the smallest set of permissions required.
- **Rotate regularly:** prefer short-lived tokens where possible; rotate long-lived keys.
- **Avoid leaking values:** don't print tokens or derived values (e.g., substrings) to logs; see [Best Practices for Secrets](../../../14.best-practices/9.secrets-management/index.md).

---

# Invitations in Kestra Enterprise: Onboard Users
URL: https://kestra.io/docs/enterprise/auth/invitations

> Onboard users easily with Invitations in Kestra.
Manage user access by sending email invitations to join specific tenants or the entire instance. Add new users to your Tenant or Instance by using the invitation process.
## Invitations – onboard users

Administrators can invite users with pre-configured RBAC permissions. Invitations can be emailed directly, and users can set up their accounts upon acceptance.

By default, if the [email server is configured in Kestra EE](../../../configuration/03.observability-and-networking/index.md), an email with an invitation link is sent. If the email server is not configured, you can manually share the link with invited users.

## How to Invite Users

1. Navigate to the **IAM** page in the **Tenant** section
2. Click on the **Users** tab
3. Click on the **+ Add** button
4. Fill in the user's email address, and select the desired group or attach the role directly — optionally restricting the permission to one or more namespaces
5. Click the **Add** button — this will send an email to the user with an invitation link, or display the link you can share with the user manually.

![Add User Interface](./invite1.png)

:::alert{type="info"}
You can check the box to **Create user directly (skip invitation)** if one is not required. This action is recommended only with third-party authentication such as SSO or LDAP.
:::

![invite2](./invite2.png)

## Accepting invitations

When a user receives an invitation, they can click on the link in the email to accept it. The user will be redirected to the Kestra login page, where they set up their account (i.e., create a password), or log in using SSO if it's enabled.

## Invite expiration time

Users have 7 days to accept the invitation. After this period, the invitation will expire and must be reissued. If you want to change the default expiration time, you can do so by setting the `expireAfter` property in the `kestra.security.invitations` section of your `application.yaml` file.
For example, to set the expiration time to 30 days, add the following configuration:

```yaml
kestra:
  security:
    invitations:
      expireAfter: P30D
```

---

# RBAC in Kestra Enterprise: Roles and Permissions
URL: https://kestra.io/docs/enterprise/auth/rbac

> Implement Role-Based Access Control (RBAC) in Kestra. Define granular permissions for users, groups, and service accounts to secure your platform.

How to manage access and permissions to your instance.
## RBAC – manage roles and permissions

Kestra Enterprise supports Role-Based Access Control (RBAC), allowing you to manage access to Tenants, Namespaces, Flows, and other resources. In Kestra you will find three types of entities:

* Users: Represent a **person**. To add users to your Kestra instance, you can do one of the following:
  - [Invite users](../invitations/index.md) to your instance or tenant from the UI
  - Sync users from an external identity provider using [SCIM](../scim/index.mdx)
  - Create users directly using [Terraform](../../../13.terraform/index.mdx)
* Groups: Represent a collection of **Users** and **Service Accounts**. Groups are a useful mechanism for providing the same roles to multiple Users or Service Accounts at once by binding a role to a Group.
* Service Accounts: Represent an **application**. They are considered Users when binding Role assignments.

All these entities can be assigned Roles, which define what resources the User, Group, or Service Account can access. Note that these entities don't belong to Namespaces, but their permissions can be limited to specific namespaces via Bindings (**IAM** page). The image below shows the relationship between Users, Groups, Service Accounts, Roles, and Bindings:

![bindings](./rbac.png)

## Roles and Bindings

A Role is a collection of permissions that can be assigned to Users, Service Accounts, or Groups. These permissions are defined by a combination of a **Permission** (e.g., `FLOWS`, `NAMESPACE`, `SECRET`, etc.) and an **Action** (e.g., `CREATE`). The **Role** itself does not grant any permissions.

Through the **IAM** page, you can assign a Role to a User, Service Account, or Group, which creates a **Binding**. This Binding grants the permissions defined by that Role to the User, Service Account, or Group. Select any IAM entity (User, Group, etc.), and assign the desired Role. There is no limit to the number of Roles that can be bound to an entity.
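As a rough sketch of this model, the following Python illustration (names and structure are ours, not Kestra's actual implementation) shows how a Binding combines a Role's Permission and Action pairs with an optional namespace scope, including the rule that a namespace-scoped Binding also covers child namespaces:

```python
# Illustrative model of Roles and Bindings, not Kestra's actual implementation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    permissions: frozenset  # set of (permission, action) pairs, e.g. ("FLOW", "READ")


@dataclass(frozen=True)
class Binding:
    role: Role
    namespaces: tuple = ()  # empty means the Binding is not limited to namespaces

    def allows(self, permission: str, action: str, namespace: str) -> bool:
        if (permission, action) not in self.role.permissions:
            return False
        if not self.namespaces:
            return True
        # a Binding tied to "prod" also covers child namespaces like "prod.engineering"
        return any(namespace == ns or namespace.startswith(ns + ".")
                   for ns in self.namespaces)


viewer = Role("Viewer", frozenset({("FLOW", "READ"), ("EXECUTION", "READ")}))
binding = Binding(viewer, namespaces=("prod",))

print(binding.allows("FLOW", "READ", "prod.engineering"))  # True: child namespace
print(binding.allows("FLOW", "CREATE", "prod"))            # False: action not granted
```

Note that the child-namespace check requires a full segment match (`prod.` as a prefix), so a Binding on `prod` does not accidentally cover an unrelated namespace like `production`.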
An entity can have zero, one, or more Roles attached, each granting specific permissions and optionally tied to one or more namespaces; make sure to test the resulting access with the [Impersonate](../rbac/index.md#impersonate) feature. Once a Role has been created, you can assign it to Users and Groups. Optionally, when you assign the Role to an entity (User, Group, or Service Account), you can limit the Binding to one or more specific Namespaces.

When a Binding is tied to a namespace, it automatically grants permissions to all child namespaces. For example, a User assigned a Role scoped to the `prod` namespace is automatically granted access to the `prod.engineering` namespace as well.

Note that you can [configure a default role](../../../configuration/05.security-and-secrets/index.md) so that all new Users are automatically assigned that Role. This is especially useful to grant a default set of permissions to all new Users who join your Kestra instance via [SSO](../sso/index.md).

## Impersonate

After assigning permissions to a User, Superadmins can impersonate Users to ensure their access is as intended. Impersonation immediately switches your view to that User's perspective and can easily be switched back to the Superadmin view – a seamless way to test RBAC in one context.

![Impersonate](./impersonate-user.png)

![Stop Impersonating User](./stop-impersonate-user.png)

### Permissions

A Permission is a resource that can be accessed by a User or Group.
Open the following to view all supported permissions:

:::collapse{title="Permissions"}
- `FLOW`
- `EXECUTION`
- `TEMPLATE`
- `NAMESPACE`
- `KVSTORE`
- `DASHBOARD`
- `SECRET`
- `CREDENTIAL`
- `GROUP`
- `ROLE`
- `BINDING`
- `AUDITLOG`
- `BLUEPRINT`
- `IMPERSONATE`
- `SETTING`
- `APP`
- `AI_COPILOT`
- `APPEXECUTION`
- `TEST`
- `ASSET`
- `USER`
- `SERVICE_ACCOUNT`
- `TENANT_ACCESS`
- `INVITATION`
- `GROUP_MEMBERSHIP`
- `CREDENTIALS`
:::

:::alert{type="warning"}
The `ME` and `APITOKEN` permissions were removed in [Kestra 0.24](../../../11.migration-guide/v0.24.0/endpoint-changes/index.md#rbac-updates).
:::

### Actions

An Action is the CRUD verb allowed on a given resource (Flow, Execution, Secret, KV, Namespace, etc.). Supported Actions map directly to HTTP operations:

- `CREATE` → typically `POST` the resource (e.g., create a flow, secret, KV entry).
- `READ` → `GET` to list or view the resource; no writes.
- `UPDATE` → `PUT`/`PATCH` to modify an existing resource; cannot create new ones.
- `DELETE` → `DELETE` to remove the resource.

Example (Flows):

- `CREATE` lets you `POST /api/v1/{tenant}/flows`
- `READ` lets you `GET /api/v1/{tenant}/flows/*`
- `UPDATE` lets you `PUT /api/v1/{tenant}/flows/{flowId}`
- `DELETE` lets you `DELETE /api/v1/{tenant}/flows/delete/by-ids`

:::alert{type="info"}
For a complete CRUD-to-endpoint mapping for every permission, see the [Permissions Reference](./permissions-reference/index.md).
:::

### Currently supported roles

By default, Kestra creates an **Admin** role that grants full access to **all resources**. In addition to **Admin**, Kestra ships with the managed Roles Developer, Editor, Launcher, and Viewer. Each Role's permissions can be viewed from **IAM - Roles**. Superadmins can create additional Roles with custom permission combinations on top of the Kestra-managed Roles, and Users can be assigned multiple Roles.

## Superadmin and Admin

Kestra provides two roles for managing your instance: Superadmin and Admin.
- Superadmin is a user type with elevated privileges for global control.
- Admin is a customizable role that grants full access to all resources (scoped to a tenant if multi-tenancy is enabled).

:::collapse{title="Summary"}
Here's a table summarizing the key differences between an Admin and a Super Admin:

| Feature | Admin (scoped to a tenant if enabled) | Super Admin |
|-------------------------------------|----------------------------------------------------|------------------------------------------------------|
| Access Level | Has all permissions by default (depends on the Role) | Manages tenants and IAM across all tenants |
| Tenant Management | No | Create/Update/Read/Delete tenants across all tenants |
| User/Role/Group/Bindings Management | Has the permission by default | Create/Update/Read/Delete across all tenants |
| Flow/Execution Management | Has the permission by default | No |
| Set Super Admin privilege | No | Yes |
:::

## Super Admin

Super Admin is a powerful user type. Use this privilege sparingly and only for use cases that require it, such as creating a new tenant, troubleshooting tenant issues, or helping a user with a problem. Even without any Role or Binding, a Super Admin can manage tenants, users, roles, and groups within a Kestra Enterprise instance.

There are multiple methods to create a Superadmin user.

### Through the UI

When you launch Kestra for the first time, if no prior action has been made through the CLI, you will be invited to set up Kestra through the [Setup Page](../../01.overview/02.setup/index.md). This interface invites you to create your first User, which is automatically assigned the `Superadmin` privilege.
### Through the CLI

To create a User with the Superadmin privilege from the [CLI](../../../kestra-cli/kestra-server/index.md), use the `--superadmin` option:

```bash
kestra auths users create admin@kestra.io TopSecret42 --superadmin

## schema: kestra auths users create \
--tenant= --superadmin
```

To set or revoke Superadmin privileges from the CLI, use the following:

```bash
kestra auths users set-superadmin user@email.com true # (use false to revoke)
```

### Configuration

A Super Admin can also be created from the configuration file using the configuration below:

```yaml
kestra:
  security:
    superAdmin:
      username:
      password:
      tenantAdminAccess:
        -
```

For more details, check the [Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) page.

## Grant/Revoke Super Admin permissions

:::alert{type="info"}
Note that you need to be a Superadmin yourself to grant or revoke the privilege.
:::

### Through the UI

You can grant or revoke the Superadmin privilege using the switch on the User Edit page.

![superadmin switch](./superadmin_switch.png)

### Through the CLI

To set or revoke the Superadmin privilege for an existing User from the [CLI](../../../kestra-cli/kestra-server/index.md), use the dedicated command:

```bash
## Set a user as Super Admin
kestra auths users set-superadmin admin@kestra.io true

## Revoke Super Admin privilege
kestra auths users set-superadmin admin@kestra.io false
```

## Admin

In Kestra, there is no dedicated Admin user type; instead, Kestra creates an **Admin** Role with all permissions. This Role can be assigned to any User, Service Account, or Group. This allows you to have different types of admins, to grant admin permissions to a whole group, and to revoke those admin permissions at any time without having to delete any group or user.

When multi-tenancy is enabled, Kestra assigns the Admin Role by default to the user who created the tenant.
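The "Admin is a Role, not a user type" idea can be sketched in a few lines of illustrative Python (the names and data structures below are invented for this example, not Kestra's): binding the Admin Role to a Group grants admin access to every member, and removing that single Binding revokes it without deleting any User:

```python
# Hypothetical sketch -- not Kestra's implementation. Bindings are modeled
# as (entity, role) pairs, where the entity may be a user or a group.
ADMIN_ROLE = "Admin"
groups = {"platform-team": {"alice", "bob"}}
bindings = {("platform-team", ADMIN_ROLE)}

def is_admin(user):
    # A user is an admin if they, or any group they belong to,
    # hold a Binding to the Admin Role.
    entities = {user} | {g for g, members in groups.items() if user in members}
    return any((e, ADMIN_ROLE) in bindings for e in entities)

print(is_admin("alice"))  # True: granted via the platform-team Binding
bindings.discard(("platform-team", ADMIN_ROLE))  # revoke for the whole group
print(is_admin("alice"))  # False: the user still exists, only access changed
```

Deleting the group-level Binding is the whole revocation: no user or group is removed in the process.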
:::alert{type="info"}
If you see an error when creating a new User or Service Account, it might be caused by a limit on your license. In that case, [reach out to us](/contact-us) to validate and optionally upgrade your license.
:::

## Creating a User with an Admin Role

### Through the UI

When launching Kestra for the first time, if no prior action has been made through the CLI, you will be invited to set up Kestra through the [Setup Page](../../01.overview/02.setup/index.md). This interface invites you to create the first User; Kestra automatically creates the Admin Role and binds that User to it. Later, you can create a new User or pick an existing User and assign the Admin Role to it from the Access page.

### Through the CLI

To create a User with an Admin Role from the CLI, use the `--admin` option:

```bash
kestra auths users create prod.admin@kestra.io TopSecret42 --admin

## schema: kestra auths users create --admin
```

## User lockout

Use the following configuration to change the lockout behavior after too many failed login attempts. By default, Kestra >= 0.22 locks the user for the `lock-duration` period after a `threshold` number of failed attempts performed within the `monitoring-window` duration. The snippet below lists the default values for those properties; you can adjust them based on your preferences:

```yaml
kestra:
  security:
    login:
      failed-attempts:
        threshold: 10
        monitoring-window: PT5M
        lock-duration: PT30M
```

The key attributes are:

- `threshold`: the number of allowed failed attempts before a user is locked out.
- `monitoring-window`: the period during which failed login attempts are counted before triggering a lock.
- `lock-duration`: how long the account remains locked.

A Super Admin can unlock a locked user manually by resetting their password from the user's detail page. In the above configuration, a user is allotted 10 failed login attempts in a 5-minute window before they are locked out.
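The failed-attempt bookkeeping described by these properties can be sketched as follows; this is an illustrative simulation of the documented defaults, not Kestra's actual code:

```python
from datetime import datetime, timedelta

# Illustrative simulation of the documented lockout defaults
# (threshold: 10, monitoring-window: PT5M, lock-duration: PT30M).
THRESHOLD = 10
MONITORING_WINDOW = timedelta(minutes=5)
LOCK_DURATION = timedelta(minutes=30)

class LoginTracker:
    def __init__(self):
        self.failures = []       # timestamps of recent failed attempts
        self.locked_until = None

    def record_failure(self, now):
        if self.locked_until and now < self.locked_until:
            return "locked"
        # Only failures inside the monitoring window count toward the lock.
        self.failures = [t for t in self.failures if now - t < MONITORING_WINDOW]
        self.failures.append(now)
        if len(self.failures) >= THRESHOLD:
            # The lock lasts for lock-duration from the triggering attempt.
            self.locked_until = now + LOCK_DURATION
            return "locked"
        return "ok"

t = LoginTracker()
start = datetime(2024, 1, 1, 12, 0)
for i in range(10):  # 10 failures within 5 minutes trips the lock
    status = t.record_failure(start + timedelta(seconds=10 * i))
print(status)  # prints "locked"
```

Failures spread out more widely than the monitoring window never accumulate to the threshold, so occasional typos do not lock an account.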
Locked-out users must wait 30 minutes to try again, be unlocked by an Admin, or reset their password by clicking the "Forgot password" link and following the instructions in the email.

## Change password

Users can change their password from their profile page, accessible via the profile icon in the bottom-left corner of the UI.

"Forgot Password" settings can be configured in your Kestra configuration under `basic-auth.password-reset`. Settings to consider are the cooldown time between reset requests and how many requests can be made in a given time window.

```yaml
kestra:
  security:
    basic-auth:
      password-reset:
        cooldown: PT5M # Minimum time required between two password reset emails for the same user
        rate-limit:
          max-requests: 10 # Maximum number of password reset requests allowed per client within the time window
          window: PT1H # Time window during which password reset requests are counted for rate limiting
```

### Reset password (by a Super Admin)

Kestra provides a "forgot password" functionality that your users can leverage to reset their password. It is available on the login page, where users can click the "Forgot password?" link. On top of that, a Super Admin can reset a user's password from the User Edit page by going to **Instance** - **IAM - Users**.

![Reset Password](./forgot-password.png)

![Superadmin Change Password](./create-user-password.png)

## RBAC FAQ

:::collapse{title="Why is Admin a Role rather than User type?"}
The Admin role is a collection of permissions that can be assigned to Users, Service Accounts, or Groups. This allows you to grant admin permissions to multiple users if needed, and you can revoke specific admin permissions at any time without having to delete the user. Admin roles can be assumed by multiple users or groups, and some users may later be granted a lower or higher permission boundary. In the same way, some users may initially be Admins and later have that permission revoked.
The Admin role enables all these patterns in a flexible way. You can think of Users as an **authentication** mechanism (who you are), and Roles as an **authorization** mechanism (what you are allowed to do). Decoupling authentication from authorization allows you to grant permissions to multiple users or groups at once by attaching a single Role to a Group.
:::

:::collapse{title="Why can't I edit an existing Binding?"}
A Binding is an immutable object. If a Binding no longer reflects the desired permissions, you can delete the existing Binding and create a new one for the same User, Service Account, or Group but with different Roles and/or namespaces. This is a safety feature to prevent accidental changes to existing permissions.
:::

:::collapse{title="What happens if you delete a Group?"}
All Users and Service Accounts assigned to that Group lose the permissions that were bound to the Group. However, the Users and Service Accounts themselves still exist.
:::

---

# RBAC Permissions Reference for Kestra Enterprise

URL: https://kestra.io/docs/enterprise/auth/rbac/permissions-reference

> Reference for Kestra RBAC permissions and CRUD actions mapped to API endpoints. Use this to configure precise access controls for users and service accounts.

This reference maps each RBAC Permission and Action to the Enterprise API endpoints that enforce it. Use it to design least-privilege roles and troubleshoot authorization errors.

## How to read this page

- Endpoints are grouped by Permission and CRUD Action.
- Endpoints marked with "any action" are accessible to any user who has the permission, regardless of which CRUD action (CREATE, READ, UPDATE, or DELETE) is assigned.
- Some endpoints require multiple permissions; notes call out additional checks.
- Namespace bindings apply to the namespace and all child namespaces.

## Permissions reference

:::collapse{title="FLOW"}
**Scope:** Namespace

**CRUD meaning**

- Create: create flows or namespace files; import flows.
- Read: view flows, revisions, tasks, graphs, dependencies; export flows; read namespace files and triggers. - Update: modify flow source, tasks, enable or disable flows; move namespace files. - Delete: delete flows or namespace files. **Endpoints** Create - POST `/api/v1/{tenant}/flows` (YAML) - POST `/api/v1/{tenant}/flows` (JSON, deprecated) - POST `/api/v1/{tenant}/flows/{namespace}` (bulk upsert; also requires UPDATE and DELETE) - POST `/api/v1/{tenant}/flows/import` (imports require CREATE + UPDATE per flow) - POST `/api/v1/{tenant}/namespaces/{namespace}/files/directory` - POST `/api/v1/{tenant}/namespaces/{namespace}/files` Read - GET `/api/v1/{tenant}/flows/{namespace}/{id}` - GET `/api/v1/{tenant}/flows/{namespace}/{id}/graph` - POST `/api/v1/{tenant}/flows/graph` (any action; no action check) - GET `/api/v1/{tenant}/flows/{namespace}/{id}/revisions` - GET `/api/v1/{tenant}/flows/{namespace}/{id}/tasks/{taskId}` - GET `/api/v1/{tenant}/flows/search` - GET `/api/v1/{tenant}/flows/{namespace}` - GET `/api/v1/{tenant}/flows/source` - GET `/api/v1/{tenant}/flows/{namespace}/{id}/dependencies` - GET `/api/v1/{tenant}/namespaces/{namespace}/dependencies` - GET `/api/v1/{tenant}/flows/distinct-namespaces` (any action; no action check) - POST `/api/v1/{tenant}/flows/validate` (any action; no action check) - POST `/api/v1/{tenant}/flows/validate/task` (JSON, any action; no action check) - POST `/api/v1/{tenant}/flows/validate/task` (YAML, any action; no action check) - POST `/api/v1/{tenant}/flows/validate/trigger` (any action; no action check) - GET `/api/v1/{tenant}/flows/export/by-query` - POST `/api/v1/{tenant}/flows/export/by-ids` - GET `/api/v1/{tenant}/flows/export/by-query/csv` - GET `/api/v1/{tenant}/namespaces/{namespace}/files/search` - GET `/api/v1/{tenant}/namespaces/{namespace}/files` - GET `/api/v1/{tenant}/namespaces/{namespace}/files/stats` - GET `/api/v1/{tenant}/namespaces/{namespace}/files/revisions` - GET 
`/api/v1/{tenant}/namespaces/{namespace}/files/directory` - GET `/api/v1/{tenant}/namespaces/{namespace}/files/export` - GET `/api/v1/{tenant}/triggers/search` - GET `/api/v1/{tenant}/triggers/{namespace}/{flowId}` - GET `/api/v1/{tenant}/triggers/export/by-query/csv` Update - PUT `/api/v1/{tenant}/flows/{namespace}/{id}` (YAML) - PUT `/api/v1/{tenant}/flows/{namespace}/{id}` (JSON, deprecated) - PATCH `/api/v1/{tenant}/flows/{namespace}/{id}/{taskId}` - POST `/api/v1/{tenant}/executions/{executionId}/eval/{taskRunId}` - POST `/api/v1/{tenant}/flows/bulk` - POST `/api/v1/{tenant}/flows/disable/by-query` - POST `/api/v1/{tenant}/flows/disable/by-ids` - POST `/api/v1/{tenant}/flows/enable/by-query` - POST `/api/v1/{tenant}/flows/enable/by-ids` - PUT `/api/v1/{tenant}/namespaces/{namespace}/files` Delete - DELETE `/api/v1/{tenant}/flows/{namespace}/{id}` - DELETE `/api/v1/{tenant}/flows/delete/by-query` - DELETE `/api/v1/{tenant}/flows/delete/by-ids` - DELETE `/api/v1/{tenant}/namespaces/{namespace}/files` Notes - Trigger update operations require EXECUTION permissions, but trigger routes also require FLOW permission at the route level. - Creating a flow in a new namespace also requires NAMESPACE CREATE. ::: :::collapse{title="EXECUTION"} **Scope:** Namespace **CRUD meaning** - Create: trigger or create executions; replay executions (creates new executions). - Read: view executions, graphs, logs, metrics, files, and exports. - Update: change state, pause or resume, restart, replay by ids, set labels, unqueue, force-run, update task run state. - Delete: delete executions and logs. 
**Endpoints** Create - POST `/api/v1/{tenant}/executions/trigger/{namespace}/{id}` (deprecated) - POST `/api/v1/{tenant}/executions/{namespace}/{id}` - POST `/api/v1/{tenant}/executions/{namespace}/{id}/validate` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/replay` - POST `/api/v1/{tenant}/executions/{executionId}/replay-with-inputs` - POST `/api/v1/{tenant}/executions/replay/by-query` (any action; no action check) - GET `/api/v1/{tenant}/executions/namespaces` (requires CREATE) - GET `/api/v1/{tenant}/executions/namespaces/{namespace}/flows` (requires CREATE) Read - GET `/api/v1/{tenant}/executions/search` - GET `/api/v1/{tenant}/executions` - GET `/api/v1/{tenant}/executions/{executionId}` - GET `/api/v1/{tenant}/executions/{executionId}/graph` - GET `/api/v1/{tenant}/executions/{executionId}/flow` - GET `/api/v1/{tenant}/executions/flows/{namespace}/{flowId}` - GET `/api/v1/{tenant}/executions/{executionId}/file` - GET `/api/v1/{tenant}/executions/{executionId}/file/metas` - GET `/api/v1/{tenant}/executions/{executionId}/file/preview` - GET `/api/v1/{tenant}/executions/{executionId}/follow` - GET `/api/v1/{tenant}/executions/{executionId}/follow-dependencies` - POST `/api/v1/{tenant}/executions/latest` (any action; no action check) - GET `/api/v1/{tenant}/executions/export/by-query/csv` - GET `/api/v1/{tenant}/logs/search` - GET `/api/v1/{tenant}/logs/{executionId}` - GET `/api/v1/{tenant}/logs/{executionId}/download` - GET `/api/v1/{tenant}/logs/{executionId}/follow` - GET `/api/v1/{tenant}/metrics/{executionId}` - GET `/api/v1/{tenant}/metrics/names/{namespace}/{flowId}` - GET `/api/v1/{tenant}/metrics/names/{namespace}/{flowId}/{taskId}` - GET `/api/v1/{tenant}/metrics/tasks/{namespace}/{flowId}` - GET `/api/v1/{tenant}/metrics/aggregates/{namespace}/{flowId}/{metric}` - GET `/api/v1/{tenant}/metrics/aggregates/{namespace}/{flowId}/{taskId}/{metric}` Update - POST `/api/v1/{tenant}/executions/{executionId}/restart` - POST 
`/api/v1/{tenant}/executions/restart/by-ids` - POST `/api/v1/{tenant}/executions/restart/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/state` - POST `/api/v1/{tenant}/executions/{executionId}/change-status` - POST `/api/v1/{tenant}/executions/change-status/by-ids` - POST `/api/v1/{tenant}/executions/change-status/by-query` (any action; no action check) - DELETE `/api/v1/{tenant}/executions/{executionId}/kill{?isOnKillCascade}` - DELETE `/api/v1/{tenant}/executions/kill/by-ids` - DELETE `/api/v1/{tenant}/executions/kill/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/resume/validate` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/resume` - POST `/api/v1/{tenant}/executions/{executionId}/resume-from-breakpoint` - POST `/api/v1/{tenant}/executions/resume/by-ids` - POST `/api/v1/{tenant}/executions/resume/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/pause` - POST `/api/v1/{tenant}/executions/pause/by-ids` - POST `/api/v1/{tenant}/executions/pause/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/labels` - POST `/api/v1/{tenant}/executions/labels/by-ids` - POST `/api/v1/{tenant}/executions/labels/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/unqueue` - POST `/api/v1/{tenant}/executions/unqueue/by-ids` - POST `/api/v1/{tenant}/executions/unqueue/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/{executionId}/force-run` - POST `/api/v1/{tenant}/executions/force-run/by-ids` - POST `/api/v1/{tenant}/executions/force-run/by-query` (any action; no action check) - POST `/api/v1/{tenant}/executions/replay/by-ids` (uses UPDATE in current implementation) Delete - DELETE `/api/v1/{tenant}/executions/{executionId}` - DELETE `/api/v1/{tenant}/executions/by-ids` - DELETE `/api/v1/{tenant}/executions/by-query` (any 
action; no action check) - DELETE `/api/v1/{tenant}/logs/{executionId}` - DELETE `/api/v1/{tenant}/logs/{namespace}/{flowId}` (any action; no action check) Notes - Webhook execution endpoints (`/executions/webhook/{namespace}/{id}/{key}`) are anonymous and are authorized by webhook key, not RBAC. - `GET /api/v1/{tenant}/logs/search` only checks that the EXECUTION permission exists (any action). ::: :::collapse{title="TEMPLATE"} **Scope:** Namespace **CRUD meaning** - Create: create templates or bulk update a namespace of templates. - Read: view templates, search, export, validate. - Update: update templates or bulk update a namespace of templates. - Delete: delete templates, bulk delete by query or ids. **Endpoints** Create - POST `/api/v1/{tenant}/templates` - POST `/api/v1/{tenant}/templates/{namespace}` (bulk update; also requires UPDATE and DELETE) - POST `/api/v1/{tenant}/templates/import` (requires FLOW CREATE + UPDATE) Read - GET `/api/v1/{tenant}/templates/{namespace}/{id}` - GET `/api/v1/{tenant}/templates/search` - GET `/api/v1/{tenant}/templates/distinct-namespaces` (any action; no action check) - POST `/api/v1/{tenant}/templates/validate` (any action; no action check) - GET `/api/v1/{tenant}/templates/export/by-query` - POST `/api/v1/{tenant}/templates/export/by-ids` Update - PUT `/api/v1/{tenant}/templates/{namespace}/{id}` - POST `/api/v1/{tenant}/templates/{namespace}` (bulk update; also requires CREATE and DELETE) Delete - DELETE `/api/v1/{tenant}/templates/{namespace}/{id}` - DELETE `/api/v1/{tenant}/templates/delete/by-query` - DELETE `/api/v1/{tenant}/templates/delete/by-ids` Notes - `POST /api/v1/{tenant}/templates/import` uses FLOW CREATE and UPDATE permissions in the current implementation. ::: :::collapse{title="NAMESPACE"} **Scope:** Namespace **CRUD meaning** - Create: create namespaces. - Read: view namespaces, inherited variables, inherited plugin defaults, and export plugin defaults. 
- Update: update namespace metadata and import plugin defaults. - Delete: delete namespaces. **Endpoints** Create - POST `/api/v1/{tenant}/namespaces` Read - POST `/api/v1/{tenant}/namespaces/autocomplete` - GET `/api/v1/{tenant}/namespaces/{id}` - GET `/api/v1/{tenant}/namespaces/search` - GET `/api/v1/{tenant}/namespaces/{id}/inherited-variables` - GET `/api/v1/{tenant}/namespaces/{id}/inherited-plugindefaults` - POST `/api/v1/{tenant}/namespaces/{id}/plugindefaults/export` Update - PUT `/api/v1/{tenant}/namespaces/{id}` - POST `/api/v1/{tenant}/namespaces/{id}/plugindefaults/import` Delete - DELETE `/api/v1/{tenant}/namespaces/{id}` ::: :::collapse{title="KVSTORE"} **Scope:** Namespace **CRUD meaning** - Create: create new KV entries. - Read: list or retrieve KV entries, including inherited entries. - Update: update existing KV entries. - Delete: delete KV entries. **Endpoints** Create - PUT `/api/v1/{tenant}/namespaces/{namespace}/kv/{key}` (creates if key does not exist) Read - GET `/api/v1/{tenant}/kv` (any action; no action check) - GET `/api/v1/{tenant}/namespaces/{namespace}/kv` (deprecated) - GET `/api/v1/{tenant}/namespaces/{namespace}/kv/inheritance` - GET `/api/v1/{tenant}/namespaces/{namespace}/kv/{key}` - GET `/api/v1/{tenant}/namespaces/{namespace}/kv/{key}/detail` Update - PUT `/api/v1/{tenant}/namespaces/{namespace}/kv/{key}` (updates if key exists) Delete - DELETE `/api/v1/{tenant}/namespaces/{namespace}/kv/{key}` - DELETE `/api/v1/{tenant}/namespaces/{namespace}/kv` Notes - The PUT endpoint chooses CREATE vs UPDATE based on whether the key already exists. ::: :::collapse{title="DASHBOARD"} **Scope:** Global (tenant) **CRUD meaning** - Create: create dashboards. - Read: view dashboards and charts. - Update: update dashboards and charts. - Delete: delete dashboards. 
**Endpoints** Create - POST `/api/v1/{tenant}/dashboards` Read - GET `/api/v1/{tenant}/dashboards` - GET `/api/v1/{tenant}/dashboards/{id}` - POST `/api/v1/{tenant}/dashboards/{id}/charts/{chartId}` - POST `/api/v1/{tenant}/dashboards/charts/preview` - POST `/api/v1/{tenant}/dashboards/validate` - POST `/api/v1/{tenant}/dashboards/validate/chart` - POST `/api/v1/{tenant}/dashboards/{id}/charts/{chartId}/export/to-csv` - POST `/api/v1/{tenant}/dashboards/charts/export/to-csv` Update - PUT `/api/v1/{tenant}/dashboards/{id}` Delete - DELETE `/api/v1/{tenant}/dashboards/{id}` Notes - Read endpoints rely on repository-level permission checks (any DASHBOARD action); action-specific READ checks are not enforced at the controller level. ::: :::collapse{title="SECRET"} **Scope:** Namespace **CRUD meaning** - Create: create secrets (implemented via UPDATE in current API). - Read: list and view secret metadata. - Update: update secret values or metadata. - Delete: delete secrets. **Endpoints** Read - GET `/api/v1/{tenant}/secrets` (any action; no action check) - GET `/api/v1/{tenant}/namespaces/{namespace}/secrets` - GET `/api/v1/{tenant}/namespaces/{namespace}/inherited-secrets` Update - PUT `/api/v1/{tenant}/namespaces/{namespace}/secrets` - PATCH `/api/v1/{tenant}/namespaces/{namespace}/secrets/{key}` Delete - DELETE `/api/v1/{tenant}/namespaces/{namespace}/secrets/{key}` Notes - No endpoint currently checks SECRET CREATE; secret creation is enforced via UPDATE on `PUT /namespaces/{namespace}/secrets`. ::: :::collapse{title="CREDENTIAL"} **Scope:** Namespace or global (tenant-level credentials) **CRUD meaning** - Create: create tenant or namespace credentials. - Read: list and view credentials. - Update: update credentials or test connections. - Delete: delete credentials. 
**Endpoints** Create - POST `/api/v1/{tenant}/credentials` - POST `/api/v1/{tenant}/namespaces/{namespace}/credentials` Read - GET `/api/v1/{tenant}/credentials` - GET `/api/v1/{tenant}/credentials/{id}` - GET `/api/v1/{tenant}/namespaces/{namespace}/credentials` - GET `/api/v1/{tenant}/namespaces/{namespace}/credentials/{name}` - GET `/api/v1/{tenant}/namespaces/{namespace}/credentials/inherited` Update - PUT `/api/v1/{tenant}/credentials/{id}` - POST `/api/v1/{tenant}/credentials/{id}/test` - PUT `/api/v1/{tenant}/namespaces/{namespace}/credentials/{name}` - POST `/api/v1/{tenant}/namespaces/{namespace}/credentials/{name}/test` Delete - DELETE `/api/v1/{tenant}/credentials/{id}` - DELETE `/api/v1/{tenant}/namespaces/{namespace}/credentials/{name}` ::: :::collapse{title="BLUEPRINT"} **Scope:** Global (tenant) **CRUD meaning** - Create: create custom blueprints. - Read: list or view custom blueprints and templates. - Update: update custom blueprints. - Delete: delete custom blueprints. **Endpoints** Create - POST `/api/v1/{tenant}/blueprints/flows` - POST `/api/v1/{tenant}/blueprints/custom` (deprecated) Read - GET `/api/v1/{tenant}/blueprints/custom` - GET `/api/v1/{tenant}/blueprints/custom/{id}` - GET `/api/v1/{tenant}/blueprints/custom/{id}/source` - GET `/api/v1/{tenant}/blueprints/custom/tags` - GET `/api/v1/{tenant}/blueprints/flow/{id}` - GET `/api/v1/{tenant}/blueprints/flows/{id}` - POST `/api/v1/{tenant}/blueprints/flows/{id}/use-template` Update - PUT `/api/v1/{tenant}/blueprints/flows/{id}` - PUT `/api/v1/{tenant}/blueprints/custom/{id}` (deprecated) Delete - DELETE `/api/v1/{tenant}/blueprints/flows/{id}` - DELETE `/api/v1/{tenant}/blueprints/custom/{id}` (deprecated) Notes - Community blueprint endpoints under `/api/v1/{tenant}/blueprints/community/...` do not use BLUEPRINT permission. ::: :::collapse{title="APP"} **Scope:** Global (tenant) with namespace checks on app definitions **CRUD meaning** - Create: create apps and import apps. 
- Read: view app source, search, export apps. - Update: update apps and enable or disable apps. - Delete: delete apps. **Endpoints** Create - POST `/api/v1/{tenant}/apps` - POST `/api/v1/{tenant}/apps/import` - POST `/api/v1/{tenant}/apps/preview` (requires global APP CREATE) Read - GET `/api/v1/{tenant}/apps/search` - GET `/api/v1/{tenant}/apps/catalog` (private apps also require APPEXECUTION READ) - GET `/api/v1/{tenant}/apps/tags` - GET `/api/v1/{tenant}/apps/{uid}` - POST `/api/v1/{tenant}/apps/export` Update - PUT `/api/v1/{tenant}/apps/{uid}` - POST `/api/v1/{tenant}/apps/{uid}/enable` - POST `/api/v1/{tenant}/apps/{uid}/disable` - POST `/api/v1/{tenant}/apps/enable` - POST `/api/v1/{tenant}/apps/disable` Delete - DELETE `/api/v1/{tenant}/apps/{uid}` - DELETE `/api/v1/{tenant}/apps` ::: :::collapse{title="APPEXECUTION"} **Scope:** Namespace (checked when app access is PRIVATE) **CRUD meaning** - Create: not used for apps (execution happens via app dispatch). - Read: view apps and read execution artifacts through apps. - Update: dispatch app actions and stream updates. - Delete: not used. **Endpoints** Read - GET `/api/v1/{tenant}/apps/view/{uid}` (PRIVATE apps require APPEXECUTION READ) - GET `/api/v1/{tenant}/apps/view/{id}/file/preview` - GET `/api/v1/{tenant}/apps/view/{id}/file/meta` - GET `/api/v1/{tenant}/apps/view/{id}/file/download` - GET `/api/v1/{tenant}/apps/view/{uid}/logs/download` Update - POST `/api/v1/{tenant}/apps/view/{id}/dispatch/{dispatch}` - GET `/api/v1/{tenant}/apps/view/{id}/streams/{stream}` Notes - App view endpoints are anonymous for PUBLIC apps; PRIVATE apps require APPEXECUTION permissions and, if configured, group membership. ::: :::collapse{title="ASSET"} **Scope:** Global (tenant) with namespace checks when an asset has a namespace **CRUD meaning** - Create: create assets. - Read: view assets, search assets, and dependency or usage graphs. - Update: not used (create or replace is done via POST). - Delete: delete assets. 
**Endpoints** Create - POST `/api/v1/{tenant}/assets` Read - GET `/api/v1/{tenant}/assets/{id}` - GET `/api/v1/{tenant}/assets/{id}/dependencies` - GET `/api/v1/{tenant}/assets/search` - GET `/api/v1/{tenant}/assets/usages/search` Delete - DELETE `/api/v1/{tenant}/assets/{id}` - DELETE `/api/v1/{tenant}/assets/by-ids` - DELETE `/api/v1/{tenant}/assets/by-query` ::: :::collapse{title="TEST"} **Scope:** Namespace **CRUD meaning** - Create: create tests or run tests. - Read: view tests and test results. - Update: update tests or enable or disable tests. - Delete: delete tests. **Endpoints** Create - POST `/api/v1/{tenant}/tests` - POST `/api/v1/{tenant}/tests/{namespace}/{id}/run` - POST `/api/v1/{tenant}/tests/run` Read - GET `/api/v1/{tenant}/tests/{namespace}/{id}` - GET `/api/v1/{tenant}/tests/search` - POST `/api/v1/{tenant}/tests/validate` - GET `/api/v1/{tenant}/tests/results/{id}` - POST `/api/v1/{tenant}/tests/results/search/last` - GET `/api/v1/{tenant}/tests/results/search` Update - PUT `/api/v1/{tenant}/tests/{namespace}/{id}` - POST `/api/v1/{tenant}/tests/disable/by-ids` - POST `/api/v1/{tenant}/tests/enable/by-ids` Delete - DELETE `/api/v1/{tenant}/tests/{namespace}/{id}` - DELETE `/api/v1/{tenant}/tests/by-ids` ::: :::collapse{title="AUDITLOG"} **Scope:** Global (tenant) **CRUD meaning** - Read: search and export audit logs; read resource history and diffs. **Endpoints** Read - GET `/api/v1/{tenant}/auditlogs/search` - POST `/api/v1/{tenant}/auditlogs/find` - GET `/api/v1/{tenant}/auditlogs/history/{detailId}` (requires READ on the underlying resource) - GET `/api/v1/{tenant}/auditlogs/{id}/diff` (requires READ on the underlying resource or AUDITLOG READ; superadmin-only for certain resources) - GET `/api/v1/{tenant}/auditlogs/export` Notes - Cross-tenant audit log endpoints under `/api/v1/auditlogs/...` are superadmin-only and are not controlled by AUDITLOG permissions. 
::: :::collapse{title="USER"} **Scope:** Global (tenant) **CRUD meaning** - Create, Read, Update, Delete: manage users via SCIM provisioning endpoints. **Endpoints** Create - POST `/api/v1/{tenant}/integrations/{integration}/scim/v2/Users` Read - GET `/api/v1/{tenant}/integrations/{integration}/scim/v2/Users` - GET `/api/v1/{tenant}/integrations/{integration}/scim/v2/Users/{id}` Update - PUT `/api/v1/{tenant}/integrations/{integration}/scim/v2/Users/{id}` - PATCH `/api/v1/{tenant}/integrations/{integration}/scim/v2/Users/{id}` Delete - DELETE `/api/v1/{tenant}/integrations/{integration}/scim/v2/Users/{id}` Notes - IAM user management endpoints under `/api/v1/users` are superadmin-only and do not use USER permissions. ::: :::collapse{title="SERVICE_ACCOUNT"} **Scope:** Global (tenant) **CRUD meaning** - Create: create service accounts. - Read: list or view service accounts and API tokens. - Update: update service accounts and create API tokens. - Delete: delete service accounts or API tokens. **Endpoints** Create - POST `/api/v1/{tenant}/service-accounts` Read - GET `/api/v1/{tenant}/service-accounts/{id}` - GET `/api/v1/{tenant}/service-accounts/{id}/api-tokens` Update - PUT `/api/v1/{tenant}/service-accounts/{id}` - POST `/api/v1/{tenant}/service-accounts/{id}/api-tokens` Delete - DELETE `/api/v1/{tenant}/service-accounts/{id}` - DELETE `/api/v1/{tenant}/service-accounts/{id}/api-tokens/{tokenId}` Notes - Superadmin-only service account endpoints under `/api/v1/service-accounts` do not use SERVICE_ACCOUNT permissions. ::: :::collapse{title="GROUP"} **Scope:** Global (tenant) **CRUD meaning** - Create, Read, Update, Delete: manage groups. 
**Endpoints** Create - POST `/api/v1/{tenant}/groups` Read - GET `/api/v1/{tenant}/groups/{id}` - GET `/api/v1/{tenant}/groups/search` - POST `/api/v1/{tenant}/groups/autocomplete` - POST `/api/v1/{tenant}/groups/ids` Update - PUT `/api/v1/{tenant}/groups/{id}` Delete - DELETE `/api/v1/{tenant}/groups/{id}` Notes - SCIM group endpoints under `/api/v1/{tenant}/integrations/{integration}/scim/v2/Groups` use GROUP permissions for CRUD. ::: :::collapse{title="GROUP_MEMBERSHIP"} **Scope:** Global (tenant) **CRUD meaning** - Create: add users to groups. - Read: list group members. - Update: update membership roles or replace a user's group list. - Delete: remove users from groups. **Endpoints** Create - PUT `/api/v1/{tenant}/groups/{id}/members/{userId}` Read - GET `/api/v1/{tenant}/groups/{id}/members` Update - PUT `/api/v1/{tenant}/groups/{id}/members/membership/{userId}` - PUT `/api/v1/{tenant}/users/{id}/groups` Delete - DELETE `/api/v1/{tenant}/groups/{id}/members/{userId}` Notes - Group owners can manage membership without GROUP_MEMBERSHIP permission; non-owners require it. ::: :::collapse{title="ROLE"} **Scope:** Global (tenant) **CRUD meaning** - Create, Read, Update, Delete: manage roles. **Endpoints** Create - POST `/api/v1/{tenant}/roles` Read - GET `/api/v1/{tenant}/roles/{id}` - GET `/api/v1/{tenant}/roles/search` - POST `/api/v1/{tenant}/roles/autocomplete` - POST `/api/v1/{tenant}/roles/ids` - GET `/api/v1/{tenant}/acls/permissions` (any action; no action check) - GET `/api/v1/{tenant}/acls/actions` (any action; no action check) Update - PUT `/api/v1/{tenant}/roles/{id}` Delete - DELETE `/api/v1/{tenant}/roles/{id}` ::: :::collapse{title="BINDING"} **Scope:** Global (tenant) **CRUD meaning** - Create, Read, Delete: manage bindings between users, groups, and roles. 
**Endpoints** Create - POST `/api/v1/{tenant}/bindings` - POST `/api/v1/{tenant}/bindings/bulk` Read - GET `/api/v1/{tenant}/bindings/{id}` - GET `/api/v1/{tenant}/bindings/search` Delete - DELETE `/api/v1/{tenant}/bindings/{id}` ::: :::collapse{title="INVITATION"} **Scope:** Global (tenant) **CRUD meaning** - Create: create invitations. - Read: list or view invitations. - Delete: delete invitations. **Endpoints** Create - POST `/api/v1/{tenant}/invitations` Read - GET `/api/v1/{tenant}/invitations/search` - GET `/api/v1/{tenant}/invitations/email/{email}` - GET `/api/v1/{tenant}/invitations/{id}` Delete - DELETE `/api/v1/{tenant}/invitations/{id}` ::: :::collapse{title="TENANT_ACCESS"} **Scope:** Global (tenant) **CRUD meaning** - Create: grant a user access to a tenant. - Read: list tenant access or fetch a user's tenant access. - Delete: revoke tenant access. **Endpoints** Create - PUT `/api/v1/{tenant}/tenant-access/{userId}` - POST `/api/v1/{tenant}/tenant-access` Read - GET `/api/v1/{tenant}/tenant-access` - POST `/api/v1/{tenant}/tenant-access/autocomplete` - GET `/api/v1/{tenant}/tenant-access/{userId}` Delete - DELETE `/api/v1/{tenant}/tenant-access/{userId}` Notes - `GET /tenant-access/{userId}` is allowed for the authenticated user without TENANT_ACCESS permission; all other access requires the permission. ::: :::collapse{title="IMPERSONATE"} **Scope:** Global (tenant) **CRUD meaning** - Read: allow impersonation via the API header. **Endpoints** Read - Use `X-Kestra-Impersonate: user@example.com` on authenticated requests (requires IMPERSONATE READ). Notes - The IAM endpoint `POST /api/v1/users/{id}/impersonate` is superadmin-only and does not use IMPERSONATE permission. ::: :::collapse{title="SETTING"} **Scope:** Global (tenant) **CRUD meaning** - Create, Read, Update, Delete: reserved for webserver settings. **Endpoints** - No API endpoints currently enforce SETTING permissions. 
::: :::collapse{title="AI_COPILOT"} **Scope:** Global (tenant) **CRUD meaning** - Read: access AI flow generation. **Endpoints** Read - POST `/api/v1/{tenant}/ai/generate/flow` (any action; no action check) ::: --- # SCIM Directory Sync in Kestra Enterprise URL: https://kestra.io/docs/enterprise/auth/scim > Automate user provisioning with SCIM Directory Sync. Synchronize users and groups from IdPs like Okta, Azure AD, and Keycloak to Kestra Enterprise. import ChildCard from "~/components/docs/ChildCard.astro" Sync users and groups from your Identity Provider (IdP) to Kestra using SCIM.
## SCIM directory sync SCIM (System for Cross-domain Identity Management) is an open-standard protocol designed to facilitate user identity management across multiple systems. It simplifies user provisioning, de-provisioning, and group synchronization between IdPs, such as Microsoft Entra ID or Okta, and service providers (SPs) such as Kestra. In layman's terms, SCIM allows you to automatically keep your users and groups in sync between your IdP and Kestra. Kestra explicitly relies on the SCIM 2.0 protocol for directory synchronization. ![System for Cross-domain Identity Management specification](./scim.png) ## Benefits of a Directory Sync with SCIM 1. **Automated provisioning and de-provisioning**: SCIM automates the provisioning and de-provisioning of users, creating a single source of truth (SSOT) for user identity data. Instead of manually creating and managing users in Kestra, you can synchronize them from your IdP. 2. **Consistency and compliance**: With SCIM, you can ensure consistency of identity information across systems and stay compliant with security and regulatory requirements. 3. **Governance at scale**: Managing users at scale across many applications can be difficult without a standardized method for identity synchronization. SCIM provides a scalable solution for managing user identities. ## Supported identity providers For a detailed guide on how to set up SCIM provisioning with a specific IdP, refer to the documentation for the respective provider. --- # authentik SCIM Provisioning in Kestra URL: https://kestra.io/docs/enterprise/auth/scim/authentik > Configure SCIM provisioning with authentik. Learn how to automatically sync users and groups from authentik to your Kestra Enterprise instance. Sync Users and Groups from authentik to Kestra using SCIM. ## authentik SCIM provisioning ## Prerequisites - **authentik Account**: An account with administrative privileges to configure SCIM provisioning. 
- **Enable multi-tenancy in Kestra**: Tenants must be enabled in Kestra to support SCIM provisioning. You can enable tenants by setting the `kestra.ee.tenants.enabled` configuration property to `true`: ```yaml kestra: ee: tenants: enabled: true ``` :::alert{type="info"} Tenants are enabled by default. Please refer to the [Migration Guide](../../../../11.migration-guide/v0.23.0/tenant-migration-ee/index.md) to assist with upgrading. ::: ## Kestra SCIM setup: create a new provisioning integration 1. In the Kestra UI, navigate to the `Tenant` → `IAM` → `SCIM Provisioning` page. 2. Click on the `Create` button in the top right corner of the page. 3. Fill in the following fields: - **Name**: Enter a name for the provisioning integration. - **Description**: Provide a brief description of the integration. - **Provisioning Type**: Currently, only SCIM 2.0 is supported — leave the default selection and click `Save`. ![scim1](./scim_authentik.png) The above steps will generate a SCIM endpoint URL and a Secret Token that you will use to authenticate authentik with the SCIM integration in Kestra. Save those details, as they will be needed in the next steps. ![scim2](./scim_authentik2.png) The endpoint should look as follows: ```plaintext https://<your-host>/api/v1/<tenant>/integrations/<integration_id>/scim/v2 ``` The Secret Token will be a long string (approximately 200 characters) used to authenticate requests from authentik to Kestra. ### Enable or disable SCIM integration Note that you can disable or completely remove the SCIM Integration at any time. When an integration is disabled, all incoming requests to that integration endpoint will be rejected. ![scim3](../okta/scim3.png) :::alert{type="info"} At first, you can disable the integration to configure your authentik SCIM integration, and then enable it once the configuration is complete. ::: ### IAM role and service account When creating a new Provisioning Integration, Kestra will automatically create two additional objects: 1.
Role `SCIMProvisioner` with the following permissions: - `GROUPS`: `CREATE`, `READ`, `UPDATE`, `DELETE` - `USERS`: `CREATE`, `READ`, `UPDATE` - `BINDINGS`: `CREATE`, `READ`, `UPDATE`, `DELETE` ![scim4](../okta/scim4.png) 2. Service Account with an API Token which was previously displayed as a Secret Token for the integration: ![scim5](../okta/scim5.png) :::alert{type="info"} Why doesn't the `SCIMProvisioner` role have the `DELETE` permission for `USERS`? This is because you cannot delete a user through our SCIM implementation. Users are global and SCIM provisioning is per tenant. When we receive a `DELETE` query for a user, we remove their tenant access, but the user itself remains in the system. ::: ## authentik SCIM 2.0 setup Configuring SCIM 2.0 follows a process similar to SSO — you'll need to create a new `Application`. Then, in the second step, select `SCIM` as the Provider Type. ![scim-for-authentik-7](./authentik7.png) In the `Protocol settings` section, enter the `URL` and `Secret Token` obtained from Kestra. :::alert{type="info"} If you are running authentik on a Mac machine with the [docker-compose installer](https://docs.goauthentik.io/docs/installation/docker-compose), make sure to replace `localhost` in your Kestra SCIM endpoint with `host.docker.internal`; otherwise, the sync won't work. Your URL should look as follows: `http://host.docker.internal:8080/api/v1/dev/integrations/zIRjRAMGvkammpeLVuyJl/scim/v2`. ::: ![scim-for-authentik-8](./authentik8.png) ## Test both SSO and SCIM by adding users and groups First, create `Users` and `Groups` in the `Directory` settings. ![scim-for-authentik-9](./authentik9.png) Then assign your user(s) to an existing group. ![scim-for-authentik-10](./authentik10.png) You can set a password for each authentik user to allow them to log in directly to Kestra with their username/email and password.
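Before wiring authentik up, you can sanity-check the integration endpoint yourself by listing provisioned users. A minimal sketch (the base URL reuses the placeholder example from this guide; the token value is hypothetical):

```python
# Sketch: building a SCIM 2.0 GET request against the Kestra integration endpoint.
# SCIM_BASE reuses the example URL from this guide; SECRET_TOKEN is a placeholder.
SCIM_BASE = "http://host.docker.internal:8080/api/v1/dev/integrations/zIRjRAMGvkammpeLVuyJl/scim/v2"
SECRET_TOKEN = "your-secret-token"

def scim_request(base: str, resource: str, token: str) -> tuple[str, dict]:
    """Return the URL and headers for a SCIM GET request (e.g., listing Users)."""
    headers = {
        "Authorization": f"Bearer {token}",  # the Secret Token from Kestra
        "Accept": "application/scim+json",   # SCIM 2.0 media type
    }
    return f"{base}/{resource}", headers

url, headers = scim_request(SCIM_BASE, "Users", SECRET_TOKEN)
# requests.get(url, headers=headers) should list the users synced so far;
# a 401 response usually means the token is wrong or the integration is disabled.
```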
![scim-for-authentik-11](./authentik11.png) Once groups and users are created, they should be visible in the Kestra UI under the `IAM` → `Users` and `Groups` sections. It’s best to log in as the default admin user and attach the desired `Role` to each group to ensure that the users have the necessary permissions. ![scim-for-authentik-12](./authentik12.png) Then, to verify access, log in as one of those new authentik users in a separate browser or incognito mode and verify that the user has the permissions you expect. ## Additional resources - [SCIM for authentik Documentation](https://docs.goauthentik.io/docs/providers/scim/) - [Manage applications in authentik Documentation](https://docs.goauthentik.io/docs/applications/manage_apps) --- # Keycloak SCIM Provisioning in Kestra URL: https://kestra.io/docs/enterprise/auth/scim/keycloak > Configure SCIM provisioning with Keycloak. Synchronize users and groups from Keycloak to Kestra Enterprise for centralized identity management. Sync users and groups from Keycloak to Kestra using SCIM. ## Keycloak SCIM provisioning ## Prerequisites - **Keycloak Account**: An account with administrative privileges is required to configure SCIM provisioning. - **Enable multi-tenancy in Kestra**: Tenants must be enabled in Kestra to support SCIM provisioning. You can enable tenants by setting the `kestra.ee.tenants.enabled` configuration property to `true`: ```yaml kestra: ee: tenants: enabled: true ``` :::alert{type="info"} Tenants are enabled by default. Please refer to the [Migration Guide](../../../../11.migration-guide/v0.23.0/tenant-migration-ee/index.md) to assist with upgrading. ::: ## Kestra SCIM setup: create a new provisioning integration 1. In the Kestra UI, navigate to the `Tenant` → `IAM` → `SCIM Provisioning` page. 2. Click on the `Create` button in the top right corner of the page. 3. Fill in the following fields: - **Name**: Enter a name for the provisioning integration. 
- **Description**: Provide a brief description of the integration. - **Provisioning Type**: Currently, only SCIM 2.0 is supported — leave the default selection and click `Save`. ![scim1](./scim1_keycloak.png) The steps above will generate a SCIM endpoint URL and a Secret Token that you will use to authenticate Keycloak with the SCIM integration in Kestra. Save those details, as you will need them in the next steps. ![scim2](../okta/scim2.png) The endpoint should look as follows: ```plaintext https://<your-host>/api/v1/<tenant>/integrations/<integration_id>/scim/v2 ``` The Secret Token is a long string (approx. 200 characters) used to authenticate requests from Keycloak to Kestra. ### Enable or Disable SCIM Integration Note that you can disable or completely remove the SCIM Integration at any time. When an integration is disabled, all incoming requests to that integration endpoint will be rejected. ![scim3](../okta/scim3.png) :::alert{type="info"} At first, you can disable the integration to configure your Keycloak SCIM integration, and then enable it once the configuration is complete. ::: ### IAM Role and Service Account When creating a new Provisioning Integration, Kestra will automatically create two additional objects: 1. Role `SCIMProvisioner` with the following permissions: - `GROUPS`: `CREATE`, `READ`, `UPDATE`, `DELETE` - `USERS`: `CREATE`, `READ`, `UPDATE` - `BINDINGS`: `CREATE`, `READ`, `UPDATE`, `DELETE` ![scim4](../okta/scim4.png) 2. Service Account with an API Token which was previously displayed as the Secret Token for the integration: ![scim5](../okta/scim5.png) :::alert{type="info"} Why doesn't the `SCIMProvisioner` role have the `DELETE` permission for `USERS`? This is because you cannot delete a user using our SCIM implementation. Users are global and SCIM provisioning is per tenant. When we receive a `DELETE` query for a user, we remove their tenant access, but the user itself remains in the system.
::: ## Keycloak SCIM setup Keycloak [does not provide](https://github.com/keycloak/keycloak/issues/13484) any built-in support for SCIM v2.0. Some [open-source solutions](https://github.com/mitodl/keycloak-scim/) support group synchronization but not user and membership synchronization. However, there are paid solutions such as [SCIM for Keycloak](https://scim-for-keycloak.de/) that allow you to extend Keycloak with SCIM. The setup shown below was validated with Kestra 0.18.0 and Keycloak 25.0.2; use the same or higher versions if possible. 1. **Obtain a License**: - Create a new account on: https://scim-for-keycloak.de/ - Purchase a free license (no VAT number or credit card is required for a free license). ![scim-for-keycloak-license](./keycloak1.png) 2. **Install the SCIM Provider Plugin**: - Download the plugin JAR file from the `Downloads` section in your Account (e.g. `scim-for-keycloak-kc-25-2.2.1-free.jar`). ![scim-for-keycloak-download](./keycloak2.png) - Place the JAR file in the `./providers` directory of your Keycloak installation (or in the current folder if Keycloak is deployed with Docker). - More information: [SCIM for Keycloak Installation](https://scim-for-keycloak.de/documentation/installation/install) 3. **Deploy Keycloak**: - Create a simple `docker-compose.yaml` file: ```yaml services: keycloak: container_name: keycloak image: quay.io/keycloak/keycloak:25.0.2 ports: - 8085:8085 environment: KEYCLOAK_ADMIN: admin KEYCLOAK_ADMIN_PASSWORD: admin KC_SPI_THEME_WELCOME_THEME: scim KC_SPI_REALM_RESTAPI_EXTENSION_SCIM_LICENSE_KEY: <your-license-key> command: ["start-dev", "--http-port=8085"] volumes: - ./providers:/opt/keycloak/providers network_mode: "host" # Optional: for accessing external Kestra ``` - Run `docker compose up` to start Keycloak. 4. **Configure SCIM for Keycloak**: - To synchronize Users and Groups from Keycloak to Kestra, connect to the `SCIM Administration Console` for Keycloak with SCIM.
![scim-for-keycloak-3](./keycloak3.png) - Enable SCIM for the Realm ![scim-for-keycloak-4](./keycloak4.png) - Note that `Bulk` and `Password synchronization` operations are currently not supported by Kestra and must be disabled in Keycloak. 5. **Create a SCIM Client**: - Navigate to the `Remote SCIM Provider` section - Fill in the `Base URL` field with your Kestra `SCIM Endpoint`: ![scim-for-keycloak-5](./keycloak5.png) - Fill in the `Authentication` field with your Kestra `Secret Token`: ![scim-for-keycloak-6](./keycloak6.png) 6. **Enable Provisioning**: - Now that everything is configured, you can toggle the `Enabled` field on in the Kestra Provisioning Integration to start syncing users and groups from Keycloak to Kestra. ## Additional resources - [SCIM for Keycloak Documentation](https://scim-for-keycloak.de/documentation/administration/scim-client) --- # Microsoft Entra ID SCIM Provisioning in Kestra URL: https://kestra.io/docs/enterprise/auth/scim/microsoft-entra-id > Set up SCIM provisioning with Microsoft Entra ID. Automatically sync users and groups from Entra ID to Kestra for streamlined user management. Sync users and groups from Microsoft Entra ID to Kestra using SCIM. ## Microsoft Entra ID SCIM provisioning ## Prerequisites - **Microsoft Entra ID Account**: An account with administrative privileges is required to configure SCIM provisioning. - **Enable multi-tenancy in Kestra**: Tenants must be enabled in Kestra to support SCIM provisioning. You can enable tenants by setting the `kestra.ee.tenants.enabled` configuration property to `true`: ```yaml kestra: ee: tenants: enabled: true ``` ## Kestra SCIM setup: create a new provisioning integration 1. In the Kestra UI, navigate to the `Tenant` → `IAM` → `SCIM Provisioning` page. 2. Click on the `Create` button in the top right corner of the page. 3. Fill in the following fields: - **Name**: Enter a name for the provisioning integration. - **Description**: Provide a brief description of the integration.
- **Provisioning Type**: Currently, only SCIM 2.0 is supported — leave the default selection and click `Save`. ![scim1](./scim1.png) The above steps will generate a SCIM endpoint URL and a Secret Token that you will use to authenticate Microsoft Entra ID with the SCIM integration in Kestra. Save those details, as they will be needed in the next steps. ![scim2](../okta/scim2.png) The endpoint should look as follows: ```plaintext https://<your-host>/api/v1/<tenant>/integrations/<integration_id>/scim/v2 ``` The Secret Token is a long string (approx. 200 characters) used to authenticate requests from Microsoft Entra ID to Kestra. ### Enable or Disable SCIM Integration Note that you can disable or completely remove the SCIM Integration at any time. When an integration is disabled, all incoming requests to that integration endpoint will be rejected. ![scim3](../okta/scim3.png) :::alert{type="info"} At first, you can disable the integration to configure your Microsoft Entra ID integration in the Azure portal, and then enable it once the configuration is complete. ::: ### IAM Role and Service Account When creating a new Provisioning Integration, Kestra will automatically create two additional objects: 1. Role `SCIMProvisioner` with the following permissions: - `GROUPS`: `CREATE`, `READ`, `UPDATE`, `DELETE` - `USERS`: `CREATE`, `READ`, `UPDATE` - `BINDINGS`: `CREATE`, `READ`, `UPDATE`, `DELETE` ![scim4](../okta/scim4.png) 2. Service Account with an API Token which was previously displayed as the Secret Token for the integration: ![scim5](../okta/scim5.png) :::alert{type="info"} Why doesn't the `SCIMProvisioner` role have the `DELETE` permission for `USERS`? This is because you cannot delete a user through our SCIM implementation. Users are global and SCIM provisioning is per tenant. When we receive a `DELETE` query for a user, we remove their tenant access, but the user itself remains in the system. ::: ## Microsoft Entra ID SCIM setup ### 1.
Register Kestra as an Enterprise Application: - Navigate to Microsoft Entra ID → Enterprise Applications. - Click on the `+ New application` button to create a new custom application. You can name the app "KestraSCIM" or any other relevant name. ![scim6](./scim6.png) ### 2. Configure SCIM Provisioning: - Go to the newly created Kestra application. - Select "Provisioning" and set the Provisioning Mode to "Automatic". - Enter the SCIM endpoint URL and the Secret Token provided by Kestra. Paste Kestra's SCIM endpoint URL into the Tenant URL field and the Secret Token into the Secret Token field. - Finally, click on `Test Connection` and on the `Save` button. ![scim7](./scim7.png) ### 3. Map User and Group Attributes: After entering and saving the **Admin Credentials** for the SCIM provisioning connection in Microsoft Entra ID (i.e., the Tenant URL and Secret Token), Azure will **enable the `Mappings` section** under the Provisioning settings. The **Mappings** section allows you to define how user and group attributes should flow between Microsoft Entra ID and Kestra. #### SCIM Schema Support in Kestra Kestra adheres to the [SCIM 2.0 specification (RFC 7643)](https://datatracker.ietf.org/doc/html/rfc7643#section-4), specifically supporting the following resource types: - **User Resource**: - Example attributes: `userName`, `name.givenName`, `name.familyName`, `emails`, `active` - **Group Resource**: - Example attributes: `displayName`, `members` #### Retrieve supported schemas Kestra exposes SCIM resource schemas through the `/Schemas` endpoint under the SCIM base URL. This allows Microsoft Entra ID to discover the required attributes automatically. ```plaintext GET /api/v1/<tenant>/integrations/<integration_id>/scim/v2/Schemas ``` :::alert{type="info"} Replace `<tenant>` with your actual tenant, and `<integration_id>` with your actual Kestra SCIM integration ID. ::: This endpoint returns a list of supported schemas and their attributes. Use it as a reference when configuring attribute mappings in Entra ID.
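To illustrate how the `/Schemas` response can guide your mappings, here is a sketch that pulls attribute names out of a SCIM `ListResponse`. The `sample` payload is a hand-trimmed illustration built from the example attributes listed above, not actual Kestra output:

```python
# Sketch: extracting attribute names from a SCIM /Schemas ListResponse.
# `sample` is an illustrative, hand-trimmed payload, not real Kestra output.
sample = {
    "schemas": ["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
    "Resources": [
        {"id": "urn:ietf:params:scim:schemas:core:2.0:User",
         "attributes": [{"name": "userName"}, {"name": "emails"}, {"name": "active"}]},
        {"id": "urn:ietf:params:scim:schemas:core:2.0:Group",
         "attributes": [{"name": "displayName"}, {"name": "members"}]},
    ],
}

def attribute_names(schemas_response: dict) -> dict[str, list[str]]:
    """Map each schema id to the attribute names it declares."""
    return {r["id"]: [a["name"] for a in r.get("attributes", [])]
            for r in schemas_response.get("Resources", [])}

print(attribute_names(sample))
```

Cross-checking this kind of listing against the Entra ID mapping screen makes it easier to spot attributes that should be removed or simplified.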
#### Configure user and group mapping To configure mappings: 1. Go to: **Microsoft Entra Admin Center** → **Enterprise Applications** → *Your Kestra App* → **Provisioning** → **Mappings** 2. Configure attribute mappings: - **For Users**: - Map source attributes such as `userPrincipalName`, `mail` to their SCIM equivalents. - **For Groups**: - Map attributes such as `displayName` - Ensure group `members` are synchronized properly. 3. Refer to the `/Schemas` endpoint response from Kestra to guide accurate mapping. 4. Use attribute expressions or transformations in Entra ID if needed (e.g., to format names or emails). :::alert{type="info"} By default, Azure will pre-populate the mapping with many Microsoft Entra ID attributes. You may need to **remove or simplify** some of these mappings if synchronization issues occur with users or groups in Kestra. ::: #### Test the Configuration After mappings are configured: - Trigger a **manual provisioning cycle** from the **Provisioning** tab. - Verify that **users and groups** are correctly created or updated in Kestra. - Review **provisioning logs** in Entra ID for any errors or warnings. ### 4. Enable Provisioning: - Once everything is configured, you can enable the provisioning integration toggle in the Kestra UI to start syncing users and groups from Microsoft Entra ID to Kestra. ## Additional resources - [Microsoft Entra ID SCIM Documentation](https://docs.microsoft.com/en-us/azure/active-directory/app-provisioning/) --- # Okta SCIM Provisioning in Kestra Enterprise URL: https://kestra.io/docs/enterprise/auth/scim/okta > Enable SCIM provisioning with Okta. Learn how to automatically synchronize Okta users and groups with your Kestra Enterprise instance. Sync users and groups from Okta to Kestra using SCIM. ## Okta SCIM provisioning ## Prerequisites - **Okta Account**: An account with administrative privileges is required to configure SCIM provisioning. 
- **Enable multi-tenancy in Kestra**: Tenants must be enabled in Kestra to support SCIM provisioning. You can enable tenants by setting the `kestra.ee.tenants.enabled` configuration property to `true`: ```yaml kestra: ee: tenants: enabled: true ``` :::alert{type="info"} Tenants are enabled by default. Please refer to the [Migration Guide](../../../../11.migration-guide/v0.23.0/tenant-migration-ee/index.md) to assist with upgrading. ::: ## Kestra SCIM setup: create a new provisioning integration 1. In the Kestra UI, navigate to the `Tenant` → `IAM` → `SCIM Provisioning` page. 2. Click on the `Create` button in the top right corner of the page. 3. Fill in the following fields: - **Name**: Enter a name for the provisioning integration. - **Description**: Provide a brief description of the integration. - **Provisioning Type**: Currently, only SCIM 2.0 is supported — leave the default selection and click `Save`. ![scim1](./scim1_okta.png) The above steps will generate a SCIM endpoint URL and a Secret Token that you will use to authenticate Okta with the SCIM integration in Kestra. Save those details, as you will need them in the next steps. ![scim2](./scim2.png) The endpoint should look as follows: ```plaintext https://<your-host>/api/v1/<tenant>/integrations/<integration_id>/scim/v2 ``` The Secret Token is a long string (approx. 200 characters) used to authenticate requests from Okta to Kestra. ### Enable or Disable SCIM Integration Note that you can disable or completely remove the SCIM Integration at any time. When an integration is disabled, all incoming requests to that integration endpoint will be rejected. ![scim3](./scim3.png) :::alert{type="info"} At first, you can disable the integration to configure your Okta SCIM integration, and then enable it once the configuration is complete. ::: ### IAM Role and Service Account When creating a new Provisioning Integration, Kestra will automatically create two additional objects: 1.
Role `SCIMProvisioner` with the following permissions: - `GROUPS`: `CREATE`, `READ`, `UPDATE`, `DELETE` - `USERS`: `CREATE`, `READ`, `UPDATE` - `BINDINGS`: `CREATE`, `READ`, `UPDATE`, `DELETE` ![scim4](./scim4.png) 2. Service Account with an API Token which was previously displayed as the Secret Token for the integration: ![scim5](./scim5.png) :::alert{type="info"} Why doesn't the `SCIMProvisioner` role have the `DELETE` permission for `USERS`? This is because you cannot delete a user through our SCIM implementation. Users are global and SCIM provisioning is per tenant. When we receive a `DELETE` query for a user, we remove their tenant access, but the user itself remains in the system. ::: --- ## Okta SCIM setup 1. **Create an App Integration**: - Navigate to Okta Admin Console → Applications → Applications. - Click on "Create App Integration" and then select: - Sign-in Method: **OIDC - OpenID Connect** - Application Type: Web Application - Then on the next page: - Give your application a name, e.g., `Kestra` - Grant Type: Client acting on behalf of itself → Client Credentials → True - Login - Sign-in redirect URIs → http://<your-host>/oauth/callback/okta - Sign-out redirect URIs → http://<your-host>/logout - Once the application is created, select it in the Applications view and take note of the client ID and client secret. ![okta1](./okta1.png) 2. **Configure Okta in Kestra**: - With the above client ID and secret, add the following in your Kestra Micronaut configuration: ```yaml micronaut: security: oauth2: enabled: true clients: okta: client-id: "CLIENT_ID" client-secret: "CLIENT-SECRET" openid: issuer: "https://{okta-account}.okta.com/" ``` - Enter the SCIM endpoint URL and API token provided by Kestra. 3.
**Configure SCIM 2.0 in Okta**: - In Okta, navigate to Applications → Applications → Browse App Catalog - Search for SCIM 2.0 - Select SCIM 2.0 Test App (OAuth Bearer Token) - in Sign-in options select Secure Web Authentication → user sets username/password - Click Done - Select the integration you have just created, then enter the `Provisioning` tab. - Fill in the SCIM 2.0 Base URL field with the endpoint URL you obtained from Kestra. Enter the Secret Token generated in Kestra into the `OAuth Bearer Token` field. - Finally, click `Test API Credentials` to verify the connection. ![okta2](./okta2.png) 4. **Map Attributes**: - Select “Push Groups” and choose the Groups you wish to push to Kestra. - Perform a test to ensure that the mappings are correct and data is syncing properly. 5. **Enable Provisioning**: - Enable the provisioning integration toggle in the Kestra UI to begin automatic synchronization of users and groups from Okta to Kestra. ## Additional resources - [Okta SCIM Documentation](https://developer.okta.com/docs/reference/scim/) --- # Service Accounts in Kestra Enterprise: CI/CD Auth URL: https://kestra.io/docs/enterprise/auth/service-accounts > Create and manage Service Accounts in Kestra. Securely authenticate external applications and CI/CD pipelines with programmatic access tokens. How to create and manage Service Accounts. ## Service accounts – non-human access
A Service Account represents an **application** that can access Kestra. It is not tied to a specific person and does not have personal information (such as the first name, last name, or email) attached to it. Instead, it only has a name, an optional description, an optional allocation to a group, and a list of Roles that grant it permissions to access specific resources. ## Service accounts vs. users In contrast to regular users, Service Accounts don't have a password and they do not have access to the Kestra UI — they only have programmatic API access to Kestra. You can think of Service Accounts as bots authenticating with Kestra using an API token. ## Creating a Service Account To create a new service account, go to the **Service Accounts** tab on the **IAM** page under the **Tenant** section and click the **Create** button. Fill in the form with the required information, including the name and description, and click **Save**: ![service_account_create](./service_account_create.png) Once you have created a service account, you can add a Role that will grant it permissions to specific resources. To do this, switch to the **Access** tab, click the **Add** button, and select the role you want to assign to the service account. ![Assign Service Account Role](./service_account_role.png) Finally, you can generate an API token for the service account by clicking the **Create API Token** button in the service account's details. This will generate a token that you can use to authenticate the service account with Kestra from external applications such as CI/CD pipelines (e.g., in Terraform provider configuration or GitHub Actions secrets). :::alert{type="info"} **Note:** You can configure the token to expire after a certain period of time or to never expire. Also, there is a toggle called `Extended` that will automatically prolong the token's expiration date by the specified number of days (`Max Age`) if the token is actively used. That toggle is disabled by default.
::: Once you confirm the API token creation via the **Generate** button, the token will be generated and displayed in the UI. Make sure to copy the token and store it in a secure location as it will not be displayed again. ![service_account_create_2](./service_account_create_2.png) ## Users vs. Service Accounts vs. API Tokens You can create an **API token** for a regular user as well. While Service Accounts are recommended for programmatic API access to Kestra from CI/CD or other external applications, it's often useful to create an API token for a regular user, so that programmatic actions performed by that user can be tracked and audited. ![service_account_create_3](./service_account_create_3.png) Therefore, the difference between a service account and a user is that a service account is designed for programmatic access and doesn't have a password or personal information attached to it. Instead, it is authenticated exclusively using an API token. A user, on the other hand, can interact with both the Kestra UI and the API, and can be authenticated using a password or an API token. ## The purpose of service accounts Service Accounts are intended for programmatic access to Kestra from any other application, such as CI/CD pipelines or your own custom APIs. For example, you can use the token **to authenticate with Kestra Terraform provider or Kestra's GitHub Actions CI/CD pipeline**. ## Allocating service accounts to groups Each Service Account can be attached to one or more Groups such as a group called “Bots” that centrally governs programmatic access for CI/CD across multiple projects with just one Role. This is useful to manage programmatic access used by Terraform, GitHub Actions, or other external applications, in one place by attaching a single Role to that Group. Speaking of CI/CD, note that Kestra currently supports authenticating with either a basic authentication user or an API token: 1. 
Use the `--api-token` CLI option to authenticate with a service account token:

```bash
./kestra namespace files update prod scripts . \
  --server=https://demo.kestra.io --api-token=yourtoken
```

2. Pass the `--user=user_email:password` flag to the CLI to authenticate with basic authentication credentials:

```bash
./kestra namespace files update prod scripts . \
  --server=https://demo.kestra.io --user=rick.astely@kestra.io:password42
```

## Service account name convention

When creating a new Service Account, make sure to follow the DNS naming convention. Specifically, the `name` property needs to:
- contain at most 63 characters
- contain only lowercase alphanumeric characters or hyphens (i.e., the `-` character)
- start with an alphanumeric character
- end with an alphanumeric character.

Some examples to make that clear:
- ✅ `my-service-account` is a valid name
- ✅ `my-service-account-1` is a valid name
- ❌ `MY_SERVICE_ACCOUNT` is not a valid name because it contains uppercase characters and underscores
- ❌ `myServiceAccount` is not a valid name because it contains uppercase characters and camel case
- ❌ `my-service-account-` is not a valid name because it ends with a hyphen.

**Why do we follow such a restrictive convention?** We follow the standard DNS-style pattern to be ready for potential future use cases where we could, for example, forward the service account name to a Kubernetes pod's labels. This way, we ensure that the service account name can be used in a variety of contexts without any issues.

---

# Single Sign-On in Kestra: Providers and Setup
URL: https://kestra.io/docs/enterprise/auth/sso

> Enable Single Sign-On (SSO) in Kestra Enterprise. Configure OIDC authentication with providers like Google, Microsoft, Okta, and Keycloak.

How to enable and set up SSO in your Kestra Enterprise instance.
## Configure single sign-on

Single Sign-On (SSO) is an authentication process that allows users to access multiple applications with a single set of login credentials (e.g., "Sign in with Google"). Kestra supports SSO using the OpenID Connect (OIDC) protocol, which is a simple identity layer built on top of the OAuth 2.0 protocol.
## Configuring single sign-on with OpenID Connect (OIDC)

To implement OIDC SSO, you'll need to configure the Micronaut framework that Kestra uses under the hood. Start by enabling OIDC in your `yaml` configuration file as follows:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        oidc-provider:
          client-id: "{{ clientId }}"
          client-secret: "{{ clientSecret }}"
          openid:
            issuer: "{{ issuerUrl }}"
```

Replace `oidc-provider` with your chosen provider's name, `{{ clientId }}` with your client ID, `{{ clientSecret }}` with your client secret, and `{{ issuerUrl }}` with your issuer URL. For more configuration details, refer to the [Micronaut OIDC configuration guide](https://micronaut-projects.github.io/micronaut-security/latest/guide/#openid-configuration).

## Provider guides

Check out our guides for specific SSO providers:
- [Google](./google-oidc/index.md)
- [Microsoft](./microsoft-oidc/index.md)
- [Keycloak](./keycloak/index.md)
- [Okta](./okta/index.md)
- [authentik](./authentik/index.md)

---

# Set Up authentik SSO in Kestra
URL: https://kestra.io/docs/enterprise/auth/sso/authentik

> Configure authentik SSO for Kestra. Enable seamless user authentication using authentik as your OpenID Connect provider.

Set up authentik SSO to manage authentication for users.

## Configure authentik SSO

In conjunction with SSO, check out the [authentik SCIM provisioning guide](../../scim/authentik/index.md).

### Install authentik

Authentik provides a simple docker-compose installer for testing purposes. Follow [the instructions](https://docs.goauthentik.io/docs/installation/docker-compose) and click on the [initial setup URL](http://docker.for.mac.localhost:9000/if/flow/initial-setup/) to create your first user.

![scim-for-authentik-user](./authentik1.png)

### Create Application and SSO Provider in authentik

On the left-hand side, select **Applications → Applications**.
For simplicity, we’ll use the **Create with Wizard** button, as this will create both an application and a provider.

![scim-for-authentik-2](./authentik2.png)

On the **Application Details** screen, fill in the application `name` and `slug`. Set both here to `kestra` and click `Next`.

![scim-for-authentik-3](./authentik3.png)

On the **Provider Type** screen, select **OAuth2/OIDC** and click **Next**.

![scim-for-authentik-4](./authentik4.png)

On the **Provider Configuration** screen:

1. In the **Authentication flow** field, select “default-authentication-flow (Welcome to authentik!)”.
2. In the **Authorization flow** field, select “default-provider-authorization-explicit-consent (Authorize Application)”.

![scim-for-authentik-5](./authentik5.png)

3. Keep the Client type as **Confidential**. Under the **Redirect URIs/Origins (RegEx)**, enter your Kestra host's `/oauth/callback/authentik` endpoint in the format `http://<host>:<port>/oauth/callback/authentik` (e.g., http://localhost:8080/oauth/callback/authentik) and then `Submit` the Application.

![scim-for-authentik-6](./authentik6.png)

Note the `Client ID` and `Client Secret` as you will need these to configure Kestra in the next step.

### Configure Authentik SSO in Kestra Settings

With the above Client ID and Secret, add the following in the `micronaut` configuration section:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        authentik:
          clientId: "CLIENT_ID"
          clientSecret: "CLIENT_SECRET"
          openid:
            issuer: "http://localhost:9000/application/o/kestra/"
```

You may need to adjust the above `issuer` URL if you named your application something other than `kestra`. Make sure to update that URL to match your application slug: `http://localhost:9000/application/o/<application-slug>/`.

### Configure a Default Role for your SSO users in Kestra Settings

To ensure that your SSO users have initial permissions within the Kestra UI, set up a default role for them.
Achieve this by adding the following configuration under the `kestra.security` section:

```yaml
kestra:
  security:
    defaultRole:
      name: default_admin_role
      description: "Default Admin Role"
      permissions:
        NAMESPACE: ["CREATE", "READ", "UPDATE", "DELETE"]
        ROLE: ["CREATE", "READ", "UPDATE", "DELETE"]
        GROUP: ["CREATE", "READ", "UPDATE", "DELETE"]
        EXECUTION: ["CREATE", "READ", "UPDATE", "DELETE"]
        AUDITLOG: ["CREATE", "READ", "UPDATE", "DELETE"]
        USER: ["CREATE", "READ", "UPDATE", "DELETE"]
        BINDING: ["CREATE", "READ", "UPDATE", "DELETE"]
        FLOW: ["CREATE", "READ", "UPDATE", "DELETE"]
        SECRET: ["CREATE", "READ", "UPDATE", "DELETE"]
        BLUEPRINT: ["CREATE", "READ", "UPDATE", "DELETE"]
        KVSTORE: ["CREATE", "READ", "UPDATE", "DELETE"]
  ee:
    tenants:
      enabled: true
      defaultTenant: false
```

:::alert{type="info"}
⚠️ Make sure that your `defaultRole` is added under the `kestra.security` section, not under `micronaut.security`. Also, ensure that the `defaultRole` has the necessary permissions for your users to interact with Kestra. The above configuration is just an example and you might want to restrict the permissions boundaries for production use.
:::

---

# Set Up Google OIDC SSO in Kestra
URL: https://kestra.io/docs/enterprise/auth/sso/google-oidc

> Set up Google OIDC SSO for Kestra. Authenticate users with their Google accounts using OpenID Connect for secure and easy access.

## Set up Google OIDC SSO

This guide provides step-by-step instructions to configure **OpenID Connect (OIDC) authentication using Google Identity Platform** and link it to [**Kestra Enterprise**](../../../index.mdx) for [Single Sign-On (SSO)](../index.md).

## Prerequisites

- **Google Cloud Project**: Ensure you have a Google Cloud project with billing enabled.
- **Administrator Access**: You need sufficient permissions to configure Identity Platform and manage identity providers.
- **Kestra Enterprise Edition**: Kestra SSO is available only in the Enterprise Edition.
Refer to the [Google OIDC setup documentation](https://cloud.google.com/identity-platform/docs/web/oidc) for more details.

---

## Step 1: Enable Identity Platform in Google Cloud

1. **Navigate to the Identity Platform**:
   - Go to the [Identity Platform page](https://console.cloud.google.com/identity) in the Google Cloud Console.
2. **Confirm your project**:
   - Make sure that you have the correct project selected to add an identity provider to.

---

## Step 2: Add an OIDC Provider in Google Cloud

1. **Access Identity Providers**:
   - In the Identity Platform menu, select **Providers**.
2. **Add a New Provider**:
   - Click on **Add a Provider**.
   - From the list, choose **OpenID Connect**.

![add-provider](./add-provider.png)

3. **Configure the OIDC Provider**:
   - **Grant type**: Select the Code Flow grant type.
   - **Provider Name**: Enter a display name for the OIDC provider.
   - **Client ID**: Enter the **Client ID** obtained from Google.
   - **Client Secret**: Enter the **Client Secret** associated with the Client ID.
   - **Issuer URL**: Provide the **Issuer URL** (e.g., `https://accounts.google.com`).
   - **Scopes**: Specify any additional scopes required by your application.

![oidc-details](./oidc-provider.png)

4. **Save the Configuration**:
   - Click **"Save"** to add the OIDC provider to your Identity Platform configuration.

---

## Step 3: Configure Kestra to Use Google as an OIDC SSO Provider

Now that Google is set up as an OIDC provider, we need to link it to Kestra.

1. **Navigate to the Kestra Configuration File**:
   - Locate the [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) file.
2.
**Add the OIDC Settings**:
   - Add the following configuration to enable Google as an OIDC provider for Kestra:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        google:
          client-id: "{{ clientId }}"
          client-secret: "{{ clientSecret }}"
          openid:
            issuer: 'https://accounts.google.com'
```

   - Replace `clientId` and `clientSecret` with the values from the Google Identity Platform.
   - Update the `redirectUri` with your Kestra instance URL.
   - Restart Kestra to apply the changes.

## Additional Resources

- [Managing SAML and OIDC Providers Programmatically](https://cloud.google.com/identity-platform/docs/managing-providers-programmatically)
- [Identity Platform Documentation](https://cloud.google.com/identity-platform/docs)

By following these steps, you can successfully set up OIDC authentication using Google Identity Platform, allowing users to sign in with their existing credentials via your chosen OIDC provider.

---

# Set Up Keycloak SSO in Kestra
URL: https://kestra.io/docs/enterprise/auth/sso/keycloak

> Integrate Keycloak SSO with Kestra. Configure OpenID Connect authentication to manage user access via your Keycloak identity provider.

Set up Keycloak SSO to manage authentication for users.

## Configure Keycloak SSO

In conjunction with SSO, check out the [Keycloak SCIM provisioning guide](../../scim/keycloak/index.md).

## Start a Keycloak service

If you don't have a Keycloak server already running, you can use a managed service like [Cloud IAM](https://app.cloud-iam.com). You can follow the steps described in the [Keycloak tutorial documentation](https://documentation.cloud-iam.com/get-started/complete-tutorial.html) to deploy a managed Keycloak cluster for free.
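If you'd rather test locally than use a managed service, you can also start Keycloak with Docker Compose. Below is a minimal, hypothetical local-testing sketch — the image tag, admin credentials, and port mapping are illustrative assumptions, not values from these docs:

```yaml
# Local testing only — not a production setup
services:
  keycloak:
    image: quay.io/keycloak/keycloak:24.0
    command: start-dev            # dev mode: HTTP only, embedded database
    environment:
      KEYCLOAK_ADMIN: admin
      KEYCLOAK_ADMIN_PASSWORD: admin   # change before exposing anywhere
    ports:
      - "9090:8080"               # admin console at http://localhost:9090
```

With a realm and client created on such an instance, the `issuer` in the Kestra configuration would take the form `http://localhost:9090/realms/{{yourRealm}}`.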
## Configure Keycloak client

Once in Keycloak, create a new client:

![Create Client](../../../../15.how-to-guides/keycloak/client1.png)

![Client Settings](../../../../15.how-to-guides/keycloak/client2.png)

Set `https://{{ yourKestraInstanceURL }}/oauth/callback/keycloak` as the valid redirect URI and `https://{{ yourKestraInstanceURL }}/logout` as the valid post-logout redirect URI.

![Redirect URI](../../../../15.how-to-guides/keycloak/redirect-uri.png)

## Kestra Configuration

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        keycloak:
          client-id: "{{clientId}}"
          client-secret: "{{clientSecret}}"
          openid:
            issuer: "https://{{keyCloakServer}}/realms/{{yourRealm}}"
    endpoints:
      logout:
        get-allowed: true
```

You can retrieve the `clientId` and `clientSecret` via the Keycloak user interface.

![Client ID](../../../../15.how-to-guides/keycloak/clientId.png)

![Client Secret](../../../../15.how-to-guides/keycloak/clientSecret.png)

Don't forget to set a default role in your [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) to streamline the process of onboarding new users.

```yaml
kestra:
  security:
    defaultRole:
      name: Editor
      description: Default Editor role
      permissions:
        FLOW: ["CREATE", "READ", "UPDATE", "DELETE"]
        EXECUTION:
          - CREATE
          - READ
          - UPDATE
          - DELETE
```

:::alert{type="info"}
Note: depending on the Keycloak configuration, you might want to tune the issuer URL.
:::

For more configuration details, refer to the [Keycloak OIDC configuration guide](https://guides.micronaut.io/latest/micronaut-oauth2-keycloak-gradle-java.html).

## Manage Groups via OIDC Claims

If you are unable to use [SCIM with Keycloak](../../scim/keycloak/index.md), you can configure Kestra to source user groups from OIDC claims. In this setup, Keycloak acts as the single source of truth for user group membership. This method requires creating a `groups` client scope that exposes group membership via a claim in the ID Token.
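For orientation, once the client scope and mapper described in the following steps are in place, the decoded ID Token payload carries the group claim alongside the standard OIDC claims. A hypothetical example — issuer, subject, email, and group names are made up:

```json
{
  "iss": "https://keycloak.example.com/realms/myrealm",
  "sub": "8f0e2a14-5b6c-4c9d-9a61-2f3e7d1c0b55",
  "email": "jane.doe@example.com",
  "groups": ["data-team", "platform-team"]
}
```

Kestra reads this claim via the `groups-claim-path` setting covered below.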
### Create a Groups Client Scope

In Keycloak, go to **Client Scopes** and click **Create Client Scope**. Name it `groups`, set Type to **Default**, and keep Protocol as **OpenID Connect**.

![Create Client Scope](../../../../15.how-to-guides/keycloak/01-groups_create_client_scope.png)

### Add a Group Membership Mapper

In the newly created `groups` scope, go to the **Mappers** tab and click **Configure a new mapper**.

![Add Mappers](../../../../15.how-to-guides/keycloak/02-add-mappers.png)

Select **Group Membership** from the list of available mapper types.

![Configure Mapper](../../../../15.how-to-guides/keycloak/03-configure-mappers.png)

Configure the mapper with the following settings:

- **Name**: `groups`
- **Token Claim Name**: `groups`
- **Full group path**: Off
- **Add to ID token**: On

![Mapper Details](../../../../15.how-to-guides/keycloak/04-mapper-details.png)

### Add the Client Scope to Your Client

Go to **Clients**, select your Kestra client, and add the `groups` client scope.

![Add Client Scope](../../../../15.how-to-guides/keycloak/05-add_client_scope.png)

### Configure Kestra

Update your Micronaut configuration to include `groups` in the scopes:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        keycloak:
          client-id: "{{clientId}}"
          client-secret: "{{clientSecret}}"
          openid:
            issuer: "https://{{keyCloakServer}}/realms/{{yourRealm}}"
          scopes: ["openid", "profile", "email", "groups"]
    endpoints:
      logout:
        get-allowed: true
```

Then configure Kestra to synchronize groups from the `groups` claim:

```yaml
kestra:
  security:
    oidc:
      groups-claim-path: "groups"
```

Once configured, Kestra will source user groups from the `groups` claim in the ID Token, with Keycloak as the single source of truth.

---

# LDAP Authentication in Kestra: Directory Login
URL: https://kestra.io/docs/enterprise/auth/sso/ldap

> Enable LDAP authentication in Kestra. Connect your existing LDAP directory to manage user login and group synchronization securely.
Enable LDAP authentication in Kestra to authenticate users against your existing directory and sync group memberships automatically.
## What is LDAP

Lightweight Directory Access Protocol (LDAP) allows applications to quickly query user information. Organizations use directories to store usernames, passwords, email addresses, and other static data. LDAP is an open, vendor-neutral protocol for accessing and managing that data. With Kestra, you can use an existing LDAP directory to authenticate users and sync them to groups with specific access permissions.

## Configuration

LDAP is configured under the security context of your [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) file. [LDAP with Micronaut](https://micronaut-projects.github.io/micronaut-security/4.11.3/guide/#ldap) supports `context`, `search`, and `groups` as core configuration properties out of the box. These properties define the connection context, user attribute mapping, and group filtering needed to synchronize users and their group memberships with Kestra.

The `user-attributes` section maps LDAP attributes such as `givenName`, `sn`, and `mail` to the corresponding Kestra user properties (first name, last name, and email). The examples below extend the base Micronaut LDAP configuration with these Kestra-specific mappings.
### Unix configuration

```yaml
micronaut:
  security:
    ldap:
      default:
        enabled: true
        user-attributes:
          firstName: givenName
          lastName: sn
          email: mail
        context:
          server: "ldap://localhost:389"
          manager-dn: "cn=admin,dc=example,dc=org"
          manager-password: "LDAP_ADMIN_PASSWORD"
        search:
          base: "ou=users,dc=example,dc=org"
          filter: "(mail={0})"
          attributes:
            - "uid"
            - "givenName"
            - "sn"
            - "mail"
        groups:
          enabled: true
          base: "ou=groups,dc=example,dc=org"
          filter: "(&(objectClass=posixGroup)(memberUid={0}))"
          filter-attribute: uid
```

### Windows configuration

```yaml
micronaut:
  security:
    ldap:
      default:
        enabled: true
        user-attributes:
          firstName: givenName
          lastName: sn
          email: userPrincipalName
        context:
          server: "ldaps://<host>:636" # ldap://<host>:389 for non-TLS
          manager-dn: "CN=********,CN=Users,DC=domain,DC=local"
          manager-password: "********"
        search:
          base: "DC=domain,DC=local"
          filter: "(userPrincipalName={0})"
          attributes:
            - "sAMAccountName"
            - "givenName"
            - "sn"
            - "userPrincipalName"
        groups:
          enabled: true
          base: "DC=domain,DC=local"
          filter: "(&(objectClass=group)(member={0}))"
          filter-attribute: dn
```

Key points for Windows Active Directory:

- **Login format**: the `userPrincipalName` filter requires users to log in with their full UPN, e.g. `john@domain.local`. If your users expect to log in with just their short username (e.g. `john`), change the filter to `(sAMAccountName={0})` and update the `email` attribute mapping accordingly.
- **Search base**: setting `search.base` and `groups.base` to the root domain (`DC=domain,DC=local`) covers users and groups across all OUs. Narrow these to a specific OU (e.g. `OU=Engineering,DC=domain,DC=local`) if you want to restrict access to a subset of your directory.
- **Group filter attribute**: AD `member` attributes store full DNs, so `filter-attribute: dn` is required. Without it, Micronaut defaults to `cn` and group membership lookups will silently return no results.
- **TLS**: use `ldaps://` on port 636 in production.
Plain `ldap://` on port 389 sends credentials in cleartext. If your AD uses a self-signed certificate, you must add it to the JVM truststore or configure certificate trust in your Kestra deployment.

#### Finding Windows Active Directory values

Use the following PowerShell commands on your Windows domain controller to look up the values needed for the configuration above.

**LDAP server hostname** (`context.server`)

```powershell
(Get-ADDomainController).HostName
```

Use the returned hostname as `ldaps://<hostname>:636` for TLS or `ldap://<hostname>:389` for non-TLS.

**Manager DN** (`context.manager-dn`)

```powershell
([adsisearcher]"(sAMAccountName=Administrator)").FindOne().Properties.distinguishedname
```

Replace `Administrator` with the service account you intend to use as the bind user. The returned distinguished name (DN) is the value for `manager-dn`.

**User distinguished name**

To look up the DN of a specific user (useful for verifying your `search.base`):

```powershell
Get-ADUser -Identity "JohnDoe" | Select-Object Name, DistinguishedName
```

**Groups for a user**

To list the groups a user belongs to (useful for planning your `groups.base` and `groups.filter`):

```powershell
Get-ADPrincipalGroupMembership -Identity "JohnDoe" | Select-Object Name, DistinguishedName
```

**Members of a group**

To verify the members of a specific group:

```powershell
Get-ADGroupMember -Identity "CN=Auto,OU=Distro,OU=Groups,DC=kestra,DC=local" | Select-Object sAMAccountName, Name
```

Replace the identity string with the DN of your target group.

## LDAP users in Kestra

Once LDAP is configured, when a user logs into Kestra for the first time, their credentials are validated against the LDAP directory, and a corresponding user is created in Kestra. If a matching account already exists in Kestra, the user is authenticated using their LDAP credentials. If they are a part of any groups specified in the directory, those groups will be added to Kestra.
If a group already exists in Kestra, the user is automatically added to it. If a user is added to a group after their initial login, they must log out and log back in for the new group assignment to sync, as synchronization occurs only at login.

Any user authenticated via LDAP will show `LDAP` as their Authentication method in the **IAM - Users** tab in Kestra.

![IAM Users tab showing LDAP as the authentication method for a user](./ldap-1.png)

Any updates to a user and their group access on the LDAP server will update in Kestra at the next synchronization (typically at the next login).

:::alert{type="warning"}
If a user is deleted from the LDAP server, they will lose access to Kestra at the next synchronization or login attempt.
:::

---

# Set Up Microsoft OIDC SSO in Kestra
URL: https://kestra.io/docs/enterprise/auth/sso/microsoft-oidc

> Configure Microsoft OIDC SSO for Kestra. Enable users to sign in with their Microsoft Entra ID (Azure AD) credentials using OpenID Connect.

## Set up Microsoft OIDC SSO

To configure Microsoft authentication, follow these steps:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        microsoft:
          client-id: "{{ clientId }}"
          client-secret: "{{ clientSecret }}"
          openid:
            issuer: 'https://login.microsoftonline.com/common/v2.0/'
```

To get your `client-id` and `client-secret`, refer to the [Microsoft Documentation](https://learn.microsoft.com/en-us/entra/identity-platform/v2-protocols-oidc).

## Using Microsoft Entra ID as an OIDC SSO provider

### Create an Enterprise Application

1. Visit the [Azure portal](https://portal.azure.com/).
2. Select **Microsoft Entra ID**.
3. Navigate to **App registrations**.
4. Click on **New registration** and provide the necessary details:
   - Enter a name for your application.
   - Set **Supported account types** (e.g., "Default Directory only - Single tenant").
   - Under **Redirect URI**, select *Web* and enter `https://{{ url }}/oauth/callback/microsoft`.
Be sure to use `https` and the actual URL of your webserver.

### Generate client secret

1. Go to **Certificates & secrets**.
2. Under **Client secrets**, click on **New client secret**.
3. Copy the generated secret and use it in the `{{ clientSecret }}` field in your [Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md).

### Kestra configuration

- Copy the **Application (client) ID** from the **Overview** section and use it as your `{{ clientId }}`.
- In the **Endpoints** section, locate the **OpenID Connect metadata document** URL. Remove the `.well-known/openid-configuration` suffix, and use the remaining base URL as your `{{ issuerUrl }}`. The final URL should look like `https://login.microsoftonline.com/{{ directory }}/v2.0/`.

Here's an example Microsoft OIDC configuration:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        microsoft:
          client-id: "{{ clientId }}"
          client-secret: "{{ clientSecret }}"
          openid:
            issuer: '{{ issuerUrl }}'
```

With these settings, Kestra is now configured to use OIDC for SSO with your chosen providers. Ensure that all placeholders are replaced with the actual values obtained during the provider's setup process.

---

# Set Up Okta OIDC SSO in Kestra
URL: https://kestra.io/docs/enterprise/auth/sso/okta

> Set up Okta OIDC SSO for Kestra. Securely authenticate users via Okta OpenID Connect for centralized access management.

## Set up Okta OIDC SSO

This guide provides step-by-step instructions to configure **OpenID Connect (OIDC) authentication using Okta** and link it to [**Kestra Enterprise**](../../../01.overview/index.mdx) for [Single Sign-On (SSO)](./index.md).

## Prerequisites

- **Okta Developer Account**: Ensure you have an Okta Developer Account or Organization.
- **Administrator Access**: You need sufficient permissions to configure Identity Platform and manage identity providers.
- **Kestra Enterprise Edition**: Kestra SSO is available only in the Enterprise Edition.
This guide covers Okta setup at a high level; refer to the [Okta OIDC setup documentation](https://help.okta.com/oie/en-us/content/topics/apps/apps_app_integration_wizard_oidc.htm) for more details.

## Step 1: Create an App Integration

Log in to your Okta account and select **Applications** from the left side menu.

![Okta Applications menu](./okta-1.png)

Next, select **Create App Integration**, select **OIDC - OpenID Connect** as the sign-in method and **Web Application** as the application type. Select **Next**, and you will be taken to configure the general settings of the new web app integration.

![Create App Integration with OIDC and Web Application selected](./okta-2.png)

## Step 2: Configure the web app integration

In the General Settings, give your App integration a name and set your grant type. For this example, we are using Authorization Code. You can open **Advanced Settings** to configure more sensitive grants. Okta has several direct-auth API grants, such as OTP, OOB, MFA OTP, and MFA OOB that you can select only if necessary.

![Okta app integration general settings with grant type selection](./okta-3.png)

Here, you also set the **Sign-in redirect URIs** and **Sign-out redirect URIs** for your App integration. For this example connecting to Kestra, we set a Sign-in redirect URI as `http://localhost:8080/oauth/callback/okta` and sign-out as `http://localhost:8080/logout`, but you can customize this to your environment.

Further down the page, you can configure optional **Trusted Origins**, and then choose the **Assignments** and the access settings for the App integration. We'll set the access to everyone in the organization, but you can set stricter access to only certain selected groups or skip for now. Lastly, we uncheck the setting to enable immediate access with Federation Broker Mode because we will give manual app access for this basic example. Finally, hit **Save**.
![Sign-in redirect URIs and assignments settings for Okta app](./okta-4.png)

## Step 3: Add test user to Okta app integration

To create a test user in your Okta Directory to test your app integration, in your Okta Admin Dashboard, navigate to **Directory > People**. Select **Add Person**.

![Add Person form in Okta Directory](./okta-7.png)

Enter the test user's details, including a password, and save the test user. In the **Directory**, select the new user, navigate to the **Applications** tab for the user, and choose **Assign Applications**.

![Assign Applications to user in Okta Directory](./okta-8.png)

Select the Kestra application name you created, enter any additional details for the user, and hit **Save**.

## Step 4: Connect to Kestra

Now that Okta is set up as an OIDC provider, we need to link it to Kestra. After saving your settings in the previous step, Okta will automatically redirect you to your integration. Here, you can collect your client credentials to connect to Kestra, **Client ID** and **Client Secret**.

![Client ID and Client Secret in Okta app integration](./okta-5.png)

After copying your **Client ID** and **Client Secret**, switch from the **General** tab to the **Sign On** tab. Here, you can configure your **OpenID Connect ID Token**. For this example, we will edit the issuer from Dynamic to our Okta URL. Click **Save** and copy the URL to be used in our [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) along with the Client ID and Client Secret.

![OpenID Connect ID Token issuer URL configuration in Okta](./okta-6.png)

1. **Navigate to the Kestra Configuration File**:
   - Locate the [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) file.
2.
**Add the OIDC Settings**:
   - Add the following configuration to enable Okta as an OIDC provider for Kestra:

```yaml
micronaut:
  security:
    oauth2:
      enabled: true
      clients:
        okta:
          client-id: "{{ clientId }}"
          client-secret: "{{ clientSecret }}"
          openid:
            issuer: 'https://<your-okta-domain>.okta.com'
```

   - Replace `clientId` and `clientSecret` with the values copied from the Okta App integration.
   - Replace the `issuer` with the issuer URL from your application's **Sign On** settings.
   - Restart Kestra to apply the changes and log in. On restart, you will now see Okta as an available login method.

![Okta login option on Kestra login page](./okta-9.png)

After logging in with the created user, navigate to the **Administration > IAM** tab, and you can see in the **Users** tab that the user can sign in with basic authentication as well as Okta.

![User shown with Okta authentication in IAM Users tab](./okta-10.png)

---

# Cloud & Enterprise FAQ: Licensing and Configuration
URL: https://kestra.io/docs/enterprise/ee-faq

> FAQ for Kestra Cloud and Enterprise. Find answers to common questions about licensing, configuration, session management, and enterprise features.

Frequently asked questions about the Cloud and Enterprise Edition of Kestra.

## Kestra Cloud & Enterprise FAQ – common questions

## My session expires too quickly. Is there a way to change the session expiration time?

Yes, there is! Add the following Micronaut setting to your [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md) to change the session expiration time to 10 hours:

```yaml
environment:
  KESTRA_CONFIGURATION: |
    micronaut:
      security:
        token:
          generator:
            access-token:
              expiration: 36000
          cookie:
            cookie-max-age: 10h
```

In Cloud, you might need to ask our support team to change this setting for you.

## How do I configure Kestra with my license details?

To use Kestra Enterprise Edition, you will need a valid license configured under the `kestra.ee.license` configuration.
The license is unique to your organization. If you need a license, please reach out to our Sales team at [sales@kestra.io](mailto:sales@kestra.io).

The license is set up using three configuration properties: `id`, `fingerprint`, and `key`.

- `kestra.ee.license.id`: license identifier.
- `kestra.ee.license.fingerprint`: license authentication.
- `kestra.ee.license.key`: license key.

```yaml
kestra:
  ee:
    license:
      id: <license-id>
      fingerprint: <license-fingerprint>
      key: |
        <license-key>
```

When you launch Kestra Enterprise Edition, it will check the license and display the validation step in the log.

## When should I use Secrets vs Credentials?

Use [Secrets](../../06.concepts/04.secret/index.md) when you need to store and reference sensitive values such as API keys, passwords, webhook URLs, or tokens in your flows and configuration. Secrets are the right choice when you want to inject a protected value with the `secret()` function or manage sensitive data centrally.

Use [Credentials](../03.auth/credentials/index.md) when a supported integration or plugin expects a reusable authentication object managed through the UI. Credentials are better suited to connection-level authentication that you want to define once and reuse across multiple flows.

In short: use **Secrets** for protected values, and use **Credentials** for managed authentication objects supported by Kestra integrations.

---

# Governance in Kestra Enterprise: Security and Control
URL: https://kestra.io/docs/enterprise/governance

> Give your team secured, isolated environments and control over workflows with tenants, audit logs, secrets and more.

import ChildCard from "~/components/docs/ChildCard.astro"

Give your team secured, isolated environments and control over workflows with tenants, audit logs, secrets and more.

## Governance – security and control

With tailored automation and precise access management, you can ensure compliance and efficiency at scale.
--- # Allowed & Restricted Plugins in Kestra Enterprise URL: https://kestra.io/docs/enterprise/governance/allowed-plugins > Control plugin usage in Kestra Enterprise. Configure allowed and restricted plugins to enforce security policies and compliance standards. How to configure Kestra to allow or restrict specific plugins. ## Allowed & Restricted Plugins Kestra comes with the full library of official plugins by default. However, in some cases you may want to restrict which plugins are available to specific teams or users. For example, you might allow a team to use only BigQuery tasks while blocking script execution. Kestra enables this by letting you define allowlists (`includes`) and blocklists (`excludes`) using plugin names or regular expressions. To allow specific plugins, add the `includes` attribute in your [Plugins and Execution configuration](../../../configuration/04.plugins-and-execution/index.md) file and list the approved plugins or use a regular expression. Below is an example that `includes` all plugins from the `io.kestra` package using a regular expression. ```yaml kestra: plugins: security: includes: - io.kestra.* ``` ## Restricted plugins To restrict certain plugins, add the `excludes` attribute in your [Plugins and Execution configuration](../../../configuration/04.plugins-and-execution/index.md) file and list the disallowed plugins or use a regular expression. Below is the previous example with `excludes` added to disallow the `io.kestra.plugin.core.debug.Echo` plugin. ```yaml kestra: plugins: security: includes: - io.kestra.* excludes: - io.kestra.plugin.core.debug.Echo ``` --- # Assets in Kestra: Track Lineage and Metadata URL: https://kestra.io/docs/enterprise/governance/assets > Use Assets in Kestra Enterprise to track workflow lineage and metadata. Manage resources like tables, files, and datasets across your data stack. Track and manage the resources your workflows create and use.
## Track workflow assets and lineage The Assets feature keeps a live inventory of resources that your workflows interact with. These resources can be database tables, virtual machines, files, or any other external system. Assets are captured automatically when tasks declare `assets.inputs` or `assets.outputs`; you can also add them manually from the **Assets** tab. Once created, you can view asset details, check which workflow runs created or modified them, and see how assets connect to each other across your workflows. This feature enables: - Shipping metadata to lineage providers (e.g., OpenLineage). - Populating dropdowns or Pebble inputs with live assets (e.g., available VMs). - Monitoring assets and their state. ## Asset definition Define assets directly on any task using the `assets` property. Each task can declare `inputs` assets (resources it reads) and `outputs` assets (resources it creates or modifies). Every asset includes these fields: | Field | Description | | --- | --- | | `id` | unique within a tenant | | `namespace` | each asset can be associated with a namespace for filtering and RBAC management | | `type` | use predefined Kestra types like `io.kestra.plugin.ee.assets.Table` or any custom string value | | `displayName` | optional human-readable name | | `description` | markdown-supported documentation | | `metadata` | map of key-value pairs for adding custom metadata to the given asset | ## Asset Identifier An asset is uniquely identified by its `id` and the tenant (`tenantId`) where you create it. You can attach a namespace to an asset to improve filtering and to restrict visibility so only users or groups with the appropriate RBAC can access the asset. ## Asset Type Asset types fall into two categories: - **Kestra-defined asset types**: These predefined types use the `io.kestra.core.models.assets` model and provide structured metadata fields specific to each asset type. 
In future iterations of the Assets feature, Kestra plugins will be able to automatically generate assets with these types and populate their metadata fields during task execution. For example, a database plugin could automatically create a `Table` asset with the system, database, and schema fields filled in based on the connection details. The current Kestra-defined asset types are the following: - `io.kestra.plugin.ee.assets.Dataset` - Represents a dataset asset managed by Kestra. - Metadata: `system`, `location`, `format` - `io.kestra.plugin.ee.assets.File` - Represents a file asset, such as documents, logs, or other file-based outputs. - Metadata: `system`, `path` - `io.kestra.plugin.ee.assets.Table` - Represents a database table asset with schema and data location metadata. - Metadata: `system`, `database`, `schema` - `io.kestra.plugin.ee.assets.VM` - Represents a virtual machine asset, including attributes like IP address and provider. - Metadata: `provider`, `region`, `state` - `io.kestra.core.models.assets.External` - Represents an external asset that exists outside of Kestra's managed resources. - This type is automatically assigned when you reference an asset in `assets.inputs` that doesn't already exist in Kestra. You don't need to explicitly set the type — Kestra will create the asset with the `External` type automatically. - This is useful for tracking dependencies on resources managed outside your workflows, such as external database tables, third-party APIs, or manually provisioned infrastructure. - **Free-form asset types**: You can define asset types using any custom string value to represent asset categories that fit your organization's needs. This lets you create and manage your own asset taxonomies, giving you flexibility to describe resources that are not covered by Kestra's standard models. These assets require manual definition and will not be auto-generated by plugins. 
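As a sketch of the free-form approach, the hypothetical flow below declares an output asset with a custom `DASHBOARD` type string; the type value, asset ID, and metadata keys are illustrative choices for your own taxonomy, not predefined Kestra types:

```yaml
id: publish_dashboard
namespace: company.analytics

tasks:
  - id: refresh
    type: io.kestra.plugin.core.log.Log
    message: "Dashboard refreshed"
    assets:
      outputs:
        # Free-form type: any custom string your organization standardizes on
        - id: sales_dashboard
          type: DASHBOARD # custom taxonomy value, not a Kestra-defined type
          displayName: Sales Dashboard
          description: "Weekly sales KPIs, refreshed by this flow"
          metadata:
            tool: metabase
            owner: analytics-team
```

Because the type is free-form, downstream filtering (e.g., in the `assets()` function or the Assets tab) relies on the exact string you choose, so agree on a naming convention before rolling it out across teams.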
## Quick start: minimal asset flow A small example that registers one output asset and logs its ID: ```yaml id: hello_assets namespace: company.team tasks: - id: write_file type: io.kestra.plugin.core.log.Log message: "Created report.csv" assets: outputs: - id: report.csv type: io.kestra.plugin.ee.assets.File metadata: path: s3://company/reports/report.csv - id: confirm type: io.kestra.plugin.core.log.Log message: "Asset recorded: {{ assets() | jq('.[] | {id: .id, type: .type, metadata: .metadata}') }}" ``` ## Auto-generated assets Some plugins support automatic asset generation when `assets.enableAuto: true` is set on a task. This removes the need to manually declare `assets.inputs` and `assets.outputs` — the plugin inspects its execution context and emits assets automatically: - **JDBC Query**: detects `CREATE TABLE` statements and emits a single `io.kestra.plugin.ee.assets.Table` output; JDBC URL populates `system` and `database`. - **Ansible CLI**: parses `inventory` hosts as `inputs` of type `io.kestra.core.models.assets.External`, marking the infrastructure targets the playbook runs against. - **dbt CLI**: parses `manifest.json` to emit each model as an `io.kestra.plugin.ee.assets.Table` output with `database`, `schema`, `name`, and lineage edges based on `depends_on`. 
:::collapse{title="JDBC Query auto-generated assets"} ```yaml id: jdbc_create_trips namespace: company.team tasks: - id: create_trips_table type: io.kestra.plugin.jdbc.sqlite.Query url: jdbc:sqlite:myfile.db outputDbFile: true sql: | CREATE TABLE IF NOT EXISTS trips ( VendorID INTEGER, passenger_count INTEGER, trip_distance REAL ); assets: enableAuto: true ``` ::: :::collapse{title="Ansible CLI auto-generated assets"} ```yaml id: ansible_playbook namespace: company.team tasks: - id: ansible_task type: io.kestra.plugin.ansible.cli.AnsibleCLI inputFiles: inventory.ini: | localhost ansible_connection=local myplaybook.yml: | --- - hosts: localhost tasks: - name: Print Hello World debug: msg: "Hello, World!" assets: enableAuto: true commands: - ansible-playbook -i inventory.ini myplaybook.yml ``` ::: :::collapse{title="dbt CLI auto-generated assets"} ```yaml id: dbt_build_duckdb namespace: company.team tasks: - id: dbt type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: clone_repository type: io.kestra.plugin.git.Clone url: https://github.com/kestra-io/dbt-example branch: main - id: dbt_build type: io.kestra.plugin.dbt.cli.DbtCLI taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/dbt-duckdb:latest commands: - dbt deps - dbt build - dbt run profiles: | my_dbt_project: outputs: dev: type: duckdb path: ":memory:" fixed_retries: 1 threads: 16 timeout_seconds: 300 target: dev assets: enableAuto: true ``` ::: ## Operational automation Assets go beyond lineage: you can manage lifecycle, react to events, and automate remediation directly from flows: - Imperative lifecycle tasks to create/update, list, and delete assets (`Set`, `List`, `Delete`). - Event-based triggers with `EventTrigger` that react to asset lifecycle events (`CREATED`, `UPDATED`, `DELETED`, `USED`). - Freshness monitoring with `FreshnessTrigger` to detect stale assets and launch workflows automatically. 
- Flexible scoping by asset ID, namespace, type, and metadata filters. - Actionable trigger context (`event`, `eventTime`, `lastUpdated`, `staleDuration`, `checkTime`) to drive alerts, routing, and recovery. **Trigger use mapping** | Trigger | Primary use | | --- | --- | | `EventTrigger` | React instantly to asset lifecycle events (`CREATED`, `UPDATED`, `DELETED`, `USED`). | | `FreshnessTrigger` | Poll assets on an interval to detect staleness and launch remediation. | ### Operational controls and triggers Use asset tasks and triggers to automate lifecycle, governance, and freshness checks directly from flows. :::collapse{title="Advanced: event-driven automation"} ```yaml id: asset_event_driven_pipeline namespace: company.data tasks: - id: transform_to_mart type: io.kestra.plugin.core.flow.Subflow namespace: company.data flowId: create_mart_tables inputs: source_asset_id: "{{ trigger.asset.id }}" source_event: "{{ trigger.asset.event }}" event_time: "{{ trigger.asset.eventTime }}" triggers: - id: staging_table_event type: io.kestra.plugin.ee.assets.EventTrigger namespace: company.data assetType: io.kestra.plugin.ee.assets.Table events: - CREATED - UPDATED metadataQuery: - field: model_layer type: EQUAL_TO value: staging ``` ::: :::collapse{title="Advanced: audit deletions"} ```yaml id: audit_asset_deletions namespace: company.security tasks: - id: log_deletion type: io.kestra.plugin.jdbc.postgresql.Query sql: | INSERT INTO audit_log (asset_id, asset_type, namespace, event, event_time) VALUES ( '{{ trigger.asset.id }}', '{{ trigger.asset.type }}', '{{ trigger.asset.namespace }}', '{{ trigger.asset.event }}', '{{ trigger.asset.eventTime }}' ) triggers: - id: asset_deletion_event type: io.kestra.plugin.ee.assets.EventTrigger events: - DELETED ``` ::: :::collapse{title="Advanced: freshness monitoring"} ```yaml id: stale_assets_monitor namespace: company.monitoring tasks: - id: log_stale type: io.kestra.plugin.core.log.Log message: > Found {{ trigger.assets | length }} 
stale assets. First asset: {{ trigger.assets[0].id ?? 'n/a' }}. Stale for: {{ trigger.assets[0].staleDuration ?? 'n/a' }}. triggers: - id: stale_assets type: io.kestra.plugin.ee.assets.FreshnessTrigger maxStaleness: PT24H interval: PT1H ``` ::: :::collapse{title="Advanced: scoped freshness checks"} ```yaml id: prod_assets_freshness namespace: company.monitoring tasks: - id: trigger_remediation type: io.kestra.plugin.core.flow.Subflow namespace: company.data flowId: refresh_marts inputs: asset_id: "{{ trigger.assets[0].id }}" last_updated: "{{ trigger.assets[0].lastUpdated }}" stale_duration: "{{ trigger.assets[0].staleDuration }}" triggers: - id: stale_prod_marts type: io.kestra.plugin.ee.assets.FreshnessTrigger namespace: company.data assetType: TABLE maxStaleness: PT6H interval: PT30M metadataQuery: - field: environment type: EQUAL_TO value: prod - field: model_layer type: EQUAL_TO value: mart ``` ::: :::collapse{title="Advanced: lifecycle tasks"} ```yaml id: asset_lifecycle_ops namespace: company.data tasks: - id: upsert_asset type: io.kestra.plugin.ee.assets.Set namespace: assets.data assetId: customers_by_country assetType: TABLE displayName: Customers by Country assetDescription: Customer distribution by country metadata: owner: data-team environment: prod - id: list_assets type: io.kestra.plugin.ee.assets.List namespace: assets.data types: - TABLE metadataQuery: - field: owner type: EQUAL_TO value: data-team fetchType: FETCH - id: delete_asset type: io.kestra.plugin.ee.assets.Delete assetId: customers_by_country ``` ::: ## Data Pipeline Use Cases :::collapse{title="Advanced: data pipeline examples"} Assets are essential for tracking data lineage in analytics and data engineering workflows. The following examples demonstrate how to use assets for simple table creation and complex multi-layer data pipelines. ### Example 1: Simple Table Creation **Scenario**: You're creating a new database table from scratch. 
This is a foundational asset with no upstream dependencies. ```yaml id: pipeline_with_assets namespace: company.team tasks: - id: create_trips_table type: io.kestra.plugin.jdbc.sqlite.Queries url: jdbc:sqlite:myfile.db outputDbFile: true sql: | CREATE TABLE IF NOT EXISTS trips ( VendorID INTEGER, passenger_count INTEGER, trip_distance REAL ); INSERT INTO trips (VendorID, passenger_count, trip_distance) VALUES (1, 1, 1.5), (1, 2, 2.3), (2, 1, 0.8), (2, 3, 3.1); assets: outputs: - id: trips namespace: "{{ flow.namespace }}" type: io.kestra.plugin.ee.assets.Table metadata: database: sqlite table: trips ``` **Key points**: - There are no `inputs` assets as this is a source table with no dependencies - The `trips` table is registered as an output asset that downstream workflows can reference - Metadata captures the database type and table name for easier discovery ### Example 2: Multi-Layer Data Pipeline **Scenario**: You're building a modern data stack with staging and mart layers. The staging layer reads from an external source, and the mart layer creates aggregated analytics tables. 
```yaml id: data_pipeline_assets namespace: kestra.company.data tasks: - id: create_staging_layer_asset type: io.kestra.plugin.jdbc.duckdb.Query sql: | CREATE TABLE IF NOT EXISTS trips AS select VendorID, passenger_count, trip_distance from sample_data.nyc.taxi limit 10; assets: inputs: - id: sample_data.nyc.taxi outputs: - id: trips namespace: "{{flow.namespace}}" type: io.kestra.plugin.ee.assets.Table metadata: model_layer: staging - id: for_each type: io.kestra.plugin.core.flow.ForEach values: - passenger_count - trip_distance tasks: - id: create_mart_layer_asset type: io.kestra.plugin.jdbc.duckdb.Query sql: SELECT AVG({{taskrun.value}}) AS avg_{{taskrun.value}} FROM trips; assets: inputs: - id: trips outputs: - id: avg_{{taskrun.value}} type: io.kestra.plugin.ee.assets.Table namespace: "{{flow.namespace}}" metadata: model_layer: mart pluginDefaults: - type: io.kestra.plugin.jdbc.duckdb values: url: "jdbc:duckdb:md:my_db?motherduck_token={{ secret('MOTHERDUCK_TOKEN') }}" fetchType: STORE ``` **What's happening in this pipeline**: 1. **External Source Tracking**: The `create_staging_layer_asset` task references `sample_data.nyc.taxi` as an input asset, even though it's managed outside this workflow. This establishes lineage to external data sources. 2. **Staging Layer**: The `trips` table is created and registered with `model_layer: staging` metadata. This becomes an intermediate asset that mart layers will consume. 3. **Dynamic Mart Creation**: The `ForEach` task generates two mart tables: - `avg_passenger_count` - `avg_trip_distance` Both declare `trips` as an input, creating a clear dependency chain. 4. **Complete Lineage Graph**: Kestra automatically builds the dependency graph. 
**Benefits of this approach**: - **Impact Analysis**: If `sample_data.nyc.taxi` changes, you can instantly see that it affects 3 downstream assets - **Layer Organization**: Filter assets by `model_layer` to view only staging or mart tables - **Dependency Tracking**: Know exactly which tables depend on others before making schema changes - **Audit Trail**: Track which workflows created each table and when
::: --- ## Infrastructure Use Case: Team Bucket Provisioning :::collapse{title="Advanced: infrastructure provisioning"} Assets are particularly valuable for infrastructure management scenarios. This example demonstrates how a DevOps team can provision cloud resources and track their usage across different teams. **Scenario**: Your DevOps team needs to create dedicated S3 buckets for multiple teams (Business, Data, Finance, Product). By registering these buckets as assets during provisioning, you establish a clear lineage of which workflows and executions interact with each infrastructure component. The following flow creates S3 buckets for selected teams and registers them as assets: ```yaml id: infra_assets namespace: kestra.company.infra inputs: - id: teams type: MULTISELECT values: - Business - Data - Finance - Product tasks: - id: for_each type: io.kestra.plugin.core.flow.ForEach values: "{{ inputs.teams }}" tasks: - id: create_bucket type: io.kestra.plugin.aws.cli.AwsCLI commands: - aws s3 mb s3://kestra-{{ taskrun.value | slugify }}-bucket assets: outputs: - id: kestra-{{ taskrun.value | slugify }}-bucket type: AWS_BUCKET metadata: provider: s3 address: s3://kestra-{{ taskrun.value | slugify }}-bucket pluginDefaults: - type: io.kestra.plugin.aws values: accessKeyId: "{{ secret('AWS_ACCESS_KEY') }}" secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" region: "{{ secret('AWS_REGION') }}" allowFailure: true ``` This flow dynamically creates buckets (e.g., `kestra-data-bucket`, `kestra-finance-bucket`) and registers each as an `AWS_BUCKET` asset with relevant metadata. Once the infrastructure is provisioned, teams can reference these assets in their workflows. 
Here's how the Data team uses their bucket: ```yaml id: upload_file namespace: kestra.company.data tasks: - id: download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/jaffle-csv/raw_customers.csv - id: aws_upload type: io.kestra.plugin.aws.s3.Upload bucket: kestra-data-bucket from: '{{ outputs.download.uri }}' key: raw_customer.csv assets: inputs: - id: kestra-data-bucket outputs: - id: raw_customer type: io.kestra.plugin.ee.assets.File metadata: owner: data pluginDefaults: - type: io.kestra.plugin.aws values: accessKeyId: "{{ secret('AWS_ACCESS_KEY') }}" secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" region: "{{ secret('AWS_REGION') }}" ``` In this workflow: - The `aws_upload` task declares `kestra-data-bucket` as an **input asset**, linking it to the infrastructure provisioned earlier - It also creates an **output asset** (`raw_customer`) representing the uploaded file - This establishes a complete lineage chain: infrastructure creation → data upload → file asset **Benefits**: With this approach, you can easily answer questions like: - Which teams are using which buckets? - What files have been uploaded to each bucket? - Which workflows and executions have interacted with a specific infrastructure component? - When was this infrastructure resource created and by which flow? ::: ## Populate dropdowns and app inputs The `assets()` Pebble function allows you to query and retrieve assets dynamically in your workflows. This is particularly useful for populating dropdown inputs or dynamically selecting resources based on filters. ### Function signature ```plaintext assets(type: string, namespace: string, metadata: map) ``` ### Parameters | Parameter | Type | Required | Description | | --- | --- | --- | --- | | `type` | string | No | Filter assets by type (e.g., `"io.kestra.core.models.assets.Table"`). If omitted, returns all assets. | | `namespace` | string | No | Filter assets by namespace. 
| | `metadata` | map | No | Filter assets by metadata key-value pairs (e.g., `{"key": "value"}`). | ### Return value Returns an array of asset objects. Each asset object contains the following properties: - `tenantId` - The tenant ID where the asset is created - `namespace` - The namespace the asset belongs to - `id` - The asset identifier - `type` - The asset type - `metadata` - Map of custom metadata key-value pairs - `created` - ISO 8601 timestamp when the asset was created - `updated` - ISO 8601 timestamp when the asset was last updated - `deleted` - Boolean indicating if the asset has been deleted ### Examples **Populate a multiselect dropdown with table assets:** ```yaml id: select_assets namespace: company.team inputs: - id: assets type: MULTISELECT expression: '{{ assets(type="io.kestra.core.models.assets.Table") | jq(".[].id") }}' tasks: - id: for_each type: io.kestra.plugin.core.flow.ForEach values: "{{inputs.assets}}" tasks: - id: log type: io.kestra.plugin.core.log.Log message: "{{taskrun.value}}" ``` **Filter assets by namespace:** ```yaml inputs: - id: staging_tables type: MULTISELECT expression: '{{ assets(type="io.kestra.core.models.assets.Table", namespace="company.team") | jq(".[].id") }}' ``` **Filter assets by metadata:** ```yaml inputs: - id: mart_tables type: MULTISELECT expression: '{{ assets(metadata={"model_layer": "mart"}) | jq(".[].id") }}' ``` **Get all assets and extract metadata:** ```yaml id: list_assets_metadata namespace: company.team tasks: - id: list_all_assets type: io.kestra.plugin.core.log.Log message: "{{ assets() | jq('.[] | {id: .id, type: .type, metadata: .metadata}') }}" ``` ## Export assets with AssetShipper The `AssetShipper` task allows you to export asset metadata to external systems for lineage tracking, monitoring, or integration with data catalogs. You can ship assets to files or to lineage providers like OpenLineage. ### Export assets to file Export asset metadata to a file in either ION or JSON format. 
This is useful for archiving, auditing, or importing into other systems. ```yaml id: ship_asset_to_file namespace: kestra.company.data tasks: - id: export_assets type: io.kestra.plugin.ee.assets.AssetShipper assetExporters: - id: file_exporter type: io.kestra.plugin.ee.assets.FileAssetExporter format: ION ``` You can change the `format` property to `JSON` if you prefer a more widely-compatible format. ### Export assets to OpenLineage Ship asset metadata to an OpenLineage-compatible lineage provider. This requires mapping Kestra asset fields to OpenLineage conventions. ```yaml id: ship_asset_to_openlineage namespace: kestra.company.data tasks: - id: export_to_lineage type: io.kestra.plugin.ee.assets.AssetShipper assetExporters: - id: openlineage_exporter type: io.kestra.plugin.ee.openlineage.OpenLineageAssetExporter uri: http://host.docker.internal:5000 mappings: io.kestra.plugin.ee.assets.Table: namespace: namespace ``` The `mappings` property defines how Kestra asset metadata fields map to OpenLineage dataset facets. Each asset type can have its own mapping configuration. For more information about OpenLineage dataset facets and available fields, see the [OpenLineage Dataset Facets documentation](https://openlineage.io/docs/spec/facets/dataset-facets/). ## Purge assets and lineage (retention) Use the `io.kestra.plugin.ee.assets.PurgeAssets` task to enforce asset retention without touching executions or logs. By default, this task purges assets, asset usage events (execution view), and asset lineage events (for asset exporters) matching the filters. You can configure it to only purge specific types of records. **Filters:** | Property | Description | | --- | --- | | `namespace` | Filter by namespace. Supports prefix matching (e.g., `company.data` matches `company.data.staging`). | | `assetId` | Filter by a specific asset ID. | | `assetType` | Filter by one or more asset types (e.g., `io.kestra.plugin.ee.assets.Table`). 
| | `metadataQuery` | Filter by metadata key-value pairs. | | `endDate` | **(required)** Purge records created or updated before this date (ISO 8601). | **Purge scope:** | Property | Default | Description | | --- | --- | --- | | `purgeAssets` | `true` | Whether to purge the asset records themselves. | | `purgeAssetUsages` | `true` | Whether to purge asset usage events (execution view). | | `purgeAssetLineages` | `true` | Whether to purge asset lineage events. | **Outputs:** `purgedAssetsCount`, `purgedAssetUsagesCount`, `purgedAssetLineagesCount`. Example: purge old VM assets on a monthly schedule. ```yaml id: asset_retention_policy namespace: company.infra triggers: - id: monthly_cleanup type: io.kestra.plugin.core.trigger.Schedule cron: "0 0 1 * *" tasks: - id: purge_old_vms type: io.kestra.plugin.ee.assets.PurgeAssets assetType: - io.kestra.plugin.ee.assets.VM endDate: "{{ now() | dateAdd(-180, 'DAYS') }}" ``` --- # Audit Logs in Kestra: Governance and Compliance URL: https://kestra.io/docs/enterprise/governance/audit-logs > Ensure compliance with Kestra Audit Logs. Track and monitor all user activities, flow executions, and system changes for security and governance. How to use Audit Logs to govern activities in your Kestra instance.
## Audit logs – governance and compliance guide Audit Logs record all activities performed in your Kestra instance by users and service accounts. By reviewing Audit Logs, system administrators can track user activity, and security teams can investigate incidents and ensure compliance with regulatory requirements. ## Why are Audit Logs important? The audit log table in Kestra serves as a historical record that developers and system administrators can use to track changes, monitor system usage, and verify system activity. It's a transparency tool that tracks the sequence of activities, ensuring accountability for actions taken and providing data for troubleshooting and analysis. Given that Audit Logs are immutable, they can also be used to detect and investigate security incidents. If you run a Kestra edition with the Elasticsearch backend, you can also use Kibana to search and visualize your logs. ## How to access Audit Logs You can access Audit Logs from the **Tenant** section in the UI. That UI page provides a detailed table of recorded events, capturing the actions taken within the system: ![Audit Logs](./audit_logs.png) Each row in the table represents a distinct event with several columns providing specific details: - **Resource Type** column categorizes the resource that the event is associated with, such as editing a flow (FLOW) or executing it (EXECUTION). - **Action** indicates whether a given resource has been created, updated, or deleted. - **Actor** identifies who performed the action. The user can be a human, a system, or a service account. - **Details** section offers an in-depth description of the event, including identifiers such as the `id`, `namespace`, `flowId`, `executionId`, revision, etc. — those fields depend on the type of resource the event is associated with. - **Date** represents the timestamp of when the event occurred. 
- **Changes** shows two buttons: one to view the revision and a second to link you directly to the resource that created the log. ## How to see a full diff of a specific event To see a full diff of a specific event, click on the icon in the **Changes** column. The expanded view shows the full diff of the event side-by-side, including the `before` and `after` states of a given resource: ![Changes Diff](./changes_diff.png) ## How to use the Details filter to search for specific Audit Log events The `Details` filter allows you to flexibly search for any Audit Log event using the `key:value` format. It's a tag-based system which works the same way as [Execution Labels](../../../05.workflow-components/08.labels/index.md). For example, you can filter for all events related to a specific namespace by typing `namespace:your_namespace`: ![Filter by Namespace](./audit-logs-filter.png) To further filter for a specific event, you can click on the relevant tag in the `Details` column, and it automatically adds the filter to the view. ## How to Purge Audit Logs The Enterprise Edition of Kestra generates an audit log for _every action_ taken on the platform. While these logs are essential for tracking changes and ensuring compliance, they can accumulate over time and take up a significant amount of space in the database. The `PurgeAuditLogs` task removes old audit logs that are no longer needed. You can set a date range for the logs you want to delete, choose a specific `namespace`, and even filter by `resources` or `actions` (`CREATE`, `READ`, `UPDATE`, `DELETE`). :::alert{type="info"} Additional types of **Purge tasks** are described in the [dedicated section](../../../10.administrator-guide/purge/index.md). 
::: Here is the recommended way to implement the audit logs retention policy that purges audit logs older than one month: ```yaml id: audit_log_cleanup namespace: system tasks: - id: purge_audit_logs type: io.kestra.plugin.ee.core.log.PurgeAuditLogs description: Purge audit logs older than 1 month endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}" ``` Note how the above flow is added to the `system` namespace, which is the default namespace for System Flows. This ensures that this maintenance flow and its executions are hidden from the main UI, making them only visible within the `system` namespace that can be managed by platform administrators. Combining the [System Flows](../../../06.concepts/system-flows/index.md) functionality with the `PurgeAuditLogs` task provides a simple way to manage your audit logs as code and from the UI, ensuring you keep them as long as you need to stay compliant while keeping your database clean and performant. ## Export audit logs Audit logs can be forwarded to an external monitoring system such as Datadog, AWS CloudWatch, Google Operational Suite, and more with the [Audit Log Shipper task](../logshipper/index.md#audit-log-shipper). --- # Custom Blueprints in Kestra Enterprise: Templates URL: https://kestra.io/docs/enterprise/governance/custom-blueprints > Create Custom Blueprints in Kestra Enterprise. Standardize workflows with private templates, promoting reuse and best practices across your organization. How to create and manage Custom Blueprints. # Custom Blueprints in Kestra Enterprise – Private Templates
In addition to the publicly available [Community Blueprints](../../../06.concepts/07.blueprints/index.md), Kestra allows you to create **Custom Blueprints**—private, reusable workflow templates tailored to your team. These blueprints help centralize orchestration patterns, document best practices, and streamline collaboration across your organization. You can think of Custom Blueprints as your team's internal App Store, offering a wide range of integrations and validated workflow patterns tailored to your needs. ### How to create a new custom blueprint From the left navigation menu, go to **Blueprints**. Then, select the **Custom Blueprints** tab. Click on **Create**. Add a title, description, and the contents of the flow. You can add as many tags as you want. Then click on the **Create** button. ![New Custom Blueprint](./blueprint-org-2.png) You can edit Blueprints at any time, for example, to add new tasks or expand the documentation. ## Templated Blueprints Templated Blueprints allow you to create reusable, configurable workflows that users can instantiate without editing YAML. Instead of copying and modifying Blueprints, users fill in guided inputs and Kestra generates the complete flow automatically. Platform teams build templates once; business users instantiate them by filling in a form rather than editing YAML. **How It Works:** Templated Blueprints use [Pebble templating](../../../06.concepts/06.pebble/index.md), with custom delimiters to avoid conflicts with Kestra expressions. ### Define Template Arguments Template arguments define the inputs users must provide. To add them to your Blueprint, use the `extend` key with a `templateArguments` section: ```yaml extend: templateArguments: - id: values displayName: An array of values type: MULTISELECT values: - value1 - value2 - value3 ``` All Kestra [input types](../../../05.workflow-components/05.inputs/index.md) and their validation rules are supported. 
These arguments automatically generate a UI form when the blueprint is instantiated.

### Use Template Arguments

Templated blueprints use the Pebble templating engine. To avoid conflicts with Kestra expressions (`{{ }}`), template arguments use custom delimiters: `<<` and `>>`.

Template arguments are accessed using the `arg` prefix. For example, if you have a template argument with `id: my_custom_field`, you can use it in your flow as follows:

```yaml
tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: Hello << arg.my_custom_field >>
```

### Loops and Conditions

You can dynamically generate multiple tasks, inputs, variables, or triggers through for-loops and if/else conditions using the `<% %>` syntax. For example, the following loop creates one log task for each value of the `values` template argument.

```yaml
extend:
  templateArguments:
    - id: values
      displayName: An array of values
      type: MULTISELECT
      values:
        - value1
        - value2
        - value3

id: myflow
namespace: company.team

tasks:
  <% for value in arg.values %>
  - id: log_<< value >>
    type: io.kestra.plugin.core.log.Log
    message: Hello << value >>
  <% endfor %>
```

This allows you to dynamically generate tasks or include them conditionally. Solutions such as templatized Terraform configurations or Python-based DAG factories remain valid ways to address similar templating needs, but Templated Custom Blueprints offer a more direct, simpler, and integrated approach within the Kestra platform.
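Conditions work the same way. The sketch below, assuming a hypothetical BOOLEAN argument named `enable_schedule`, includes a Schedule trigger only when the user opts in:

```yaml
extend:
  templateArguments:
    - id: enable_schedule     # hypothetical argument, shown for illustration
      displayName: Run daily?
      type: BOOLEAN

id: myflow
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello

<% if arg.enable_schedule %>
triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"
<% endif %>
```

When `enable_schedule` is false, the rendered flow simply contains no `triggers` section.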
### Example: Data Ingestion Template

Here's an example showing a Templated Blueprint that generates data ingestion workflows based on user selections:

:::collapse{title="Template Definition"}

```yaml
id: data-ingest
namespace: kestra.data

extend:
  templateArguments:
    - id: domains
      displayName: Domains
      type: MULTISELECT
      values:
        - Online Shop
        - Manufacture
        - HR
        - Finance
    - id: target
      type: SELECT
      values:
        - Postgres
        - Oracle
    - id: env
      type: SELECT
      values:
        - dev
        - staging
        - prod

tasks:
  - id: parallel_<< arg.env >>
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      <% for domain in arg.domains %>
      - id: sequential_<< domain | slugify >>
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: << domain | slugify >>-download
            type: io.kestra.plugin.jdbc.postgresql.CopyOut
            sql: SELECT * FROM public.<< domain | slugify >>
          - id: << domain | slugify >>-ingest
            <% if arg.target == 'Oracle' %>
            type: io.kestra.plugin.jdbc.oracle.Batch
            from: "{{ outputs.<< domain | slugify >>-download.uri }}"
            table: public.<< domain | slugify >>
            <% elseif arg.target == 'Postgres' %>
            type: io.kestra.plugin.jdbc.postgresql.CopyIn
            from: "{{ outputs.<< domain | slugify >>-download.uri }}"
            url: jdbc:postgresql://sample_<< arg.target | lower >>:5432/<< arg.env >>
            table: public.<< domain | slugify >>
            <% endif %>
      <% endfor %>

pluginDefaults:
  - type: io.kestra.plugin.jdbc.postgresql
    values:
      url: jdbc:postgresql://sample_postgres:5432/<< arg.env >>
      username: '{{ secret("POSTGRES_USERNAME") }}'
      password: '{{ secret("POSTGRES_PASSWORD") }}'
      format: CSV
  - type: io.kestra.plugin.jdbc.oracle.Batch
    values:
      url: jdbc:oracle:thin:@<< arg.env >>:49161:XE
      username: '{{ secret("ORACLE_USERNAME") }}'
      password: '{{ secret("ORACLE_PASSWORD") }}'
```
:::

:::collapse{title="Generated Flow (after template rendering)"}

After selecting `env: dev`, `domains: [HR, Manufacture]`, and `target: Oracle`, the template generates this complete workflow:

```yaml
id: data-ingest
namespace: kestra.data

tasks:
  - id: parallel_dev
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: sequential_hr
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: hr-download
            type: io.kestra.plugin.jdbc.postgresql.CopyOut
            sql: SELECT * FROM public.hr
          - id: hr-ingest
            type: io.kestra.plugin.jdbc.oracle.Batch
            from: "{{ outputs.hr-download.uri }}"
            table: public.hr
      - id: sequential_manufacture
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: manufacture-download
            type: io.kestra.plugin.jdbc.postgresql.CopyOut
            sql: SELECT * FROM public.manufacture
          - id: manufacture-ingest
            type: io.kestra.plugin.jdbc.oracle.Batch
            from: "{{ outputs.manufacture-download.uri }}"
            table: public.manufacture

pluginDefaults:
  - type: io.kestra.plugin.jdbc.postgresql
    values:
      url: jdbc:postgresql://sample_postgres:5432/dev
      username: '{{ secret("POSTGRES_USERNAME") }}'
      password: '{{ secret("POSTGRES_PASSWORD") }}'
      format: CSV
  - type: io.kestra.plugin.jdbc.oracle.Batch
    values:
      url: jdbc:oracle:thin:@dev:49161:XE
      username: '{{ secret("ORACLE_USERNAME") }}'
      password: '{{ secret("ORACLE_PASSWORD") }}'
```
:::

---

# Log Shipper in Kestra Enterprise: Centralize Logs

URL: https://kestra.io/docs/enterprise/governance/logshipper

> Centralize monitoring with Kestra Log Shipper. Export workflow and audit logs to Datadog, Splunk, Elastic, AWS S3, and other observability platforms. Manage and distribute logs across your entire infrastructure.
## Log shipper – centralize logs

Log Shipper can distribute Kestra logs from across your instance to an external logging platform. Log synchronization fetches logs and batches them into optimized chunks automatically. The batch process is done intelligently through defined synchronization points. Once batched, the Log Shipper delivers consistent and reliable data to your monitoring platform.

Log Shipper is built on top of [Kestra plugins](/plugins), ensuring it can integrate with popular logging platforms and expand as more plugins are developed. Supported observability platforms include Elasticsearch, Datadog, New Relic, Azure Monitor, Google Operational Suite, AWS CloudWatch, Splunk, OpenSearch, and OpenTelemetry.

## Log shipper properties

The Log Shipper plugin has several key properties to define where the logs should be sent and how they are batched. Below is a list of the definable properties and their purpose:

- `logExporters` - This property is required, and it specifies the platform where the logs will be exported. It supports a list of entries, allowing you to export logs to different platforms at once.
- `logLevelFilter` - Specifies the minimum log level to send, with the default being `INFO`. With `INFO`, all log levels `INFO` and above (`WARNING` and `ERROR`) are batched. If you only want warnings and errors, set this property to `WARNING`, and so on.
- `lookbackPeriod` - Determines the fetch period for logs to be sent. For example, with the default value of `P1D`, all logs generated between now and one day ago are batched.
- `namespace` - Sets the task to only gather logs from a specific Kestra [Namespace](../../../05.workflow-components/02.namespace/index.md). If not specified, all instance logs are fetched.
- `offsetKey` - Specifies the prefix of the [Key Value (KV) store](../../../06.concepts/05.kv-store/index.md) key that contains the last execution's end fetched date. By default, this is set to `LogShipper-state`.
You can change this key store name to reset the last fetched date if, for example, you want to export previously exported logs.
- `delete` - Boolean property that, when set to `true`, deletes the batched logs as part of the task run. By default, this property is set to `false`.

## How log shipper works

Let's take a look at a simple example of a Log Shipper task that fetches logs and exports them to AWS CloudWatch, Google Operational Suite, and Azure Monitor at the same time.

```yaml
id: logShipper
namespace: system

tasks:
  - id: shipLogs
    type: io.kestra.plugin.ee.core.log.LogShipper
    logLevelFilter: INFO
    lookbackPeriod: P1D
    offsetKey: logShipperOffset
    logExporters:
      - id: awsCloudWatch
        type: io.kestra.plugin.ee.aws.cloudwatch.LogExporter
        accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
        secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
        region: us-east-1
        logGroupName: kestra
        logStreamName: production
        chunk: 5000
      - id: googleOperationalSuite
        type: io.kestra.plugin.ee.gcp.operationalsuite.LogExporter
        projectId: my-gcp-project
        chunk: 2000
      - id: azureMonitor
        type: io.kestra.plugin.ee.azure.monitor.LogExporter
        endpoint: https://endpoint-host.ingest.monitor.azure.com
        tenantId: "{{ secret('AZURE_TENANT_ID') }}"
        clientId: "{{ secret('AZURE_CLIENT_ID') }}"
        clientSecret: "{{ secret('AZURE_CLIENT_SECRET') }}"
        ruleId: dcr-69f0b123041d4d6e9f2bf72aad0b62cf
        streamName: kestraLogs
        chunk: 1000
```

The plugin starts by identifying the starting timestamp and checking whether a last processed log exists. If it does, the plugin uses the `offsetKey` to fetch logs from the database. If it does not, the plugin uses the current time minus the `lookbackPeriod` to fetch logs from the database. The logs are then distributed to the exporters in chunks of 5000, 2000, and 1000 for AWS CloudWatch, Google Operational Suite, and Azure Monitor, respectively. Once the logs are distributed, the offset key in the Key Value store is updated.
![Log Shipper Flow Chart](./logshipper-flow-chart.png) ## Log shipper examples The Log Shipper integrates with many popular observability platforms. Below are a couple of example flows using a Kestra core plugin as well as external platform plugins. ### Kestra `FileLogExporter` The following example uses Kestra's core `FileLogExporter` plugin to synchronize the logs of the `company.team` namespace. The `synchronize_logs` task outputs a file, and the log file `uri` is passed as an expression in the `upload` task to then upload the logs to an S3 bucket. ```yaml id: log_shipper_file namespace: system tasks: - id: synchronize_logs type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: LogShipper-local-demo delete: false namespace: company.team logExporters: - id: file type: io.kestra.plugin.ee.core.log.FileLogExporter format: JSON # default ION maxLinesPerFile: 100 - id: upload type: io.kestra.plugin.aws.s3.Upload accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" from: "{{ outputs.synchronize_logs.outputs.file.uri }}" key: logs/kestra.txt bucket: kestra-log-demo-bucket region: eu-west-2 ``` ### Datadog The below example demonstrates an execution that runs a daily log synchronization and distribution of logs with [Datadog](https://www.datadoghq.com/) using the default property settings. 
```yaml
id: log_shipper
namespace: company.team

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"

tasks:
  - id: log_export
    type: io.kestra.plugin.ee.core.log.LogShipper
    logLevelFilter: INFO
    lookbackPeriod: P1D
    delete: false
    logExporters:
      - id: DatadogLogExporter
        type: io.kestra.plugin.ee.datadog.LogExporter
        basePath: '{{ secret("DATADOG_INSTANCE_URL") }}'
        apiKey: '{{ secret("DATADOG_API_KEY") }}'
```

The batched logs directly populate your Datadog instance like in the following screenshot:

![Datadog Logs](./logshipper_datadog.png)

### AWS CloudWatch

This example exports logs to [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/). The following flow triggers a daily batch and exports it to [CloudWatch](https://docs.aws.amazon.com/cloudwatch/):

```yaml
id: log_shipper
namespace: company.team

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"

tasks:
  - id: log_export
    type: io.kestra.plugin.ee.core.log.LogShipper
    logLevelFilter: INFO
    lookbackPeriod: P1D
    offsetKey: log_shipper_aws_cloudwatch_state
    delete: false
    logExporters:
      - id: aws_cloudwatch
        type: io.kestra.plugin.ee.aws.cloudwatch.LogExporter
        accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
        secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
        region: "{{ vars.region }}"
        logGroupName: kestra
        logStreamName: kestra-log-stream
```

The logs are viewable in the interface of the specified Log Group and can be examined like in the following screenshot:

![AWS CloudWatch Logs](./logshipper_aws_cloudwatch.png)

### AWS S3

This example exports logs to [AWS S3](https://aws.amazon.com/s3/).
The following example flow triggers a daily batch and exports to AWS's S3 object storage: ```yaml id: log_shipper namespace: system triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: log_export type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D logExporters: - id: S3LogExporter type: io.kestra.plugin.ee.aws.s3.LogExporter accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" region: "{{ vars.region }}" format: JSON bucket: logbucket logFilePrefix: kestra-log-file maxLinesPerFile: 1000000 ``` ### Google Operational Suite This example exports logs to [Google Cloud Observability](https://cloud.google.com/products/observability). The following example flow triggers a daily batch and exports to Google Cloud Platform's observability monitor: ```yaml id: log_shipper namespace: company.team triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: shipLogs type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: logShipperOffset delete: false logExporters: - id: googleOperationalSuite type: io.kestra.plugin.ee.gcp.operationalsuite.LogExporter projectId: my-gcp-project ``` This example exports logs to [Google Cloud Storage](https://cloud.google.com/storage?hl=en). 
The following example flow triggers a daily batch and exports to Google Cloud Storage: ```yaml id: log_shipper namespace: company.team triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: log_export type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D logExporters: - id: GCPLogExporter type: io.kestra.plugin.ee.gcp.gcs.LogExporter projectId: myProjectId format: JSON maxLinesPerFile: 10000 bucket: my-bucket logFilePrefix: kestra-log-file ``` ### Azure Monitor This example exports logs to [Azure Monitor](https://learn.microsoft.com/en-us/azure/azure-monitor/overview). The following example flow triggers a daily batch and export to Azure Monitor: ```yaml id: log_shipper namespace: company.team triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: shipLogs type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: logShipperOffset delete: false logExporters: - id: azureMonitor type: io.kestra.plugin.ee.azure.monitor.LogExporter endpoint: https://endpoint-host.ingest.monitor.azure.com tenantId: "{{ secret('AZURE_TENANT_ID') }}" clientId: "{{ secret('AZURE_CLIENT_ID') }}" clientSecret: "{{ secret('AZURE_CLIENT_SECRET') }}" ruleId: dcr-69f0b123041d4d6e9f2bf72aad0b62cf streamName: kestraLogs ``` ### Azure Blob Storage This example exports logs to [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/). 
The following example flow triggers a daily batch and export to Azure Blob Storage: ```yaml id: log_shipper namespace: company.team triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: log_export type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D logExporters: - id: AzureLogExporter type: io.kestra.plugin.ee.azure.storage.LogExporter endpoint: https://myblob.blob.core.windows.net/ tenantId: tenant_id clientId: client_id clientSecret: client_secret containerName: logs format: JSON logFilePrefix: kestra-log-file maxLinesPerFile: 1000000 ``` ### Elasticsearch This example exports logs to [Elasticsearch](https://www.elastic.co). The following example flow triggers a daily batch and export to [Elasticsearch Observability platform](https://www.elastic.co/observability). ```yaml id: logShipper namespace: system triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: shipLogs type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: logShipperOffset delete: false logExporters: - id: elasticsearch type: io.kestra.plugin.elasticsearch.LogExporter indexName: kestra-logs connection: basicAuth: password: "{{ secret('ES_PASSWORD') }}" username: kestra_user hosts: - https://elastic.example.com:9200 ``` ### New Relic This example exports logs to [New Relic](https://newrelic.com/). The following example flow triggers a daily batch and export to the [New Relic Observability Platform](https://newrelic.com/platform). 
```yaml id: logShipper namespace: system triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: shipLogs type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: logShipperOffset delete: false logExporters: - id: newRelic type: io.kestra.plugin.ee.newrelic.LogExporter basePath: https://log-api.newrelic.com apiKey: "{{ secret('NEWRELIC_API_KEY') }}" ``` ### Splunk This example exports logs to [Splunk](https://www.splunk.com/). The following example flow triggers a daily batch and export to [Splunk Observability Cloud](https://www.splunk.com/en_us/products/observability-cloud.html). ```yaml id: log_shipper namespace: system triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: log_export type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: logShipperOffset delete: false logExporters: - id: SplunkLogExporter type: io.kestra.plugin.ee.splunk.LogExporter host: https://example.splunkcloud.com:8088 token: "{{ secret('SPLUNK_API_KEY') }}" ``` ### OpenSearch This example exports logs to [OpenSearch](https://opensearch.org/) database. The following example flow triggers a daily batch and export to [OpenSearch Observability platform](https://opensearch.org/platform/observability/index.html). ```yaml id: log_shipper namespace: system triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" tasks: - id: logSync type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO lookbackPeriod: P1D offsetKey: logShipperOffset delete: false logExporters: - id: OpensearchLogExporter type: io.kestra.plugin.ee.opensearch.LogExporter connection: hosts: - "http://localhost:9200/" indexName: "logs" ``` ### OpenTelemetry This example exports logs to [OpenTelemetry](https://opentelemetry.io/). 
The following example flow triggers a daily batch and export to an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/).

```yaml
id: logShipper
namespace: system

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"

tasks:
  - id: shipLogs
    type: io.kestra.plugin.ee.core.log.LogShipper
    logLevelFilter: INFO
    lookbackPeriod: P1D
    offsetKey: logShipperOffset
    delete: false
    logExporters:
      - id: openTelemetry
        type: io.kestra.plugin.ee.opentelemetry.LogExporter
        otlpEndpoint: http://otel-collector:4318/v1/logs
        authorizationHeaderName: Authorization
        authorizationHeaderValue: "Bearer {{ secret('OTEL_TOKEN') }}"
```

### Graylog

This example exports logs to [Graylog](https://graylog.org/). The following example flow triggers a daily batch and sends logs to Graylog using a GELF HTTP input. Refer to the [Graylog Plugin Documentation](/plugins/plugin-ee-graylog) for more property details.

```yaml
id: log_shipper
namespace: system

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"

tasks:
  - id: log_export
    type: io.kestra.plugin.ee.core.log.LogShipper
    logLevelFilter: INFO
    lookbackPeriod: P1D
    logExporters:
      - id: GraylogExporter
        type: io.kestra.plugin.ee.graylog.LogExporter
        endpoint: "http://localhost:12201/gelf"
        graylogHost: "Kestra"
        chunk: 1000
```

## Audit log shipper

To send [Audit Logs](../06.audit-logs/index.md) to an external system, use the Audit Log Shipper task type. The Audit Log Shipper task extracts logs from the Kestra backend and loads them to desired destinations including Datadog, Elasticsearch, New Relic, OpenTelemetry, AWS CloudWatch, Google Operational Suite, and Azure Monitor.

The Audit Log Shipper uses similar properties to the execution Log Shipper, except that the `resources` property replaces the `logLevelFilter` property.

- `logExporters` - This property is required, and it specifies the platform where the audit logs will be exported.
It supports a list of entries, allowing you to export logs to different platforms at once.
- `resources` - Specifies the Kestra resources to ship audit logs for (e.g., FLOW, EXECUTION, USER, or KV STORE)
- `lookbackPeriod` - Determines the fetch period for audit logs to be sent. For example, with the default value of `P1D`, all audit logs generated between now and one day ago are batched.
- `offsetKey` - Specifies the [key](../../../06.concepts/05.kv-store/index.md) that contains the last fetched date. By default, Kestra uses the key `LogShipper-state`. You can change the value of that KV pair if you want to export previously fetched logs again.
- `delete` - Boolean property that, when set to `true`, deletes the logs from Kestra’s database immediately after successful export, helping optimize storage by removing logs that no longer need to reside in Kestra’s metadata store. By default, this property is set to `false`.

The below workflow ships Audit Logs to multiple destinations using each of the supported monitoring systems.
```yaml id: Audit-logShipper namespace: system tasks: - id: shipLogs type: io.kestra.plugin.ee.core.log.AuditLogShipper resources: - FLOW - EXECUTION lookbackPeriod: P1D offsetKey: logShipperOffset logExporters: - id: file type: io.kestra.plugin.ee.core.log.FileLogExporter - id: awsCloudWatch type: io.kestra.plugin.ee.aws.cloudwatch.LogExporter accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" region: us-east-1 logGroupName: kestra logStreamName: production - id: googleOperationalSuite type: io.kestra.plugin.ee.gcp.operationalsuite.LogExporter projectId: my-gcp-project - id: azureMonitor type: io.kestra.plugin.ee.azure.monitor.LogExporter endpoint: https://endpoint-host.ingest.monitor.azure.com tenantId: "{{ secret('AZURE_TENANT_ID') }}" clientId: "{{ secret('AZURE_CLIENT_ID') }}" clientSecret: "{{ secret('AZURE_CLIENT_SECRET') }}" ruleId: dcr-69f0b123041d4d6e9f2bf72aad0b62cf streamName: kestraLogs - id: datadog type: io.kestra.plugin.ee.datadog.LogExporter basePath: https://http-intake.logs.datadoghq.eu apiKey: "{{ secret('DATADOG_API_KEY') }}" - id: elasticsearch type: io.kestra.plugin.elasticsearch.LogExporter indexName: kestra-logs connection: basicAuth: password: "{{ secret('ES_PASSWORD') }}" username: kestra_user hosts: - https://elastic.example.com:9200 - id: newRelic type: io.kestra.plugin.ee.newrelic.LogExporter basePath: https://log-api.newrelic.com apiKey: "{{ secret('NEWRELIC_API_KEY') }}" - id: openTelemetry type: io.kestra.plugin.ee.opentelemetry.LogExporter otlpEndpoint: http://otel-collector:4318/v1/logs authorizationHeaderName: Authorization authorizationHeaderValue: "Bearer {{ secret('OTEL_TOKEN') }}" triggers: - id: dailySchedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 0 * * *" disabled: true ``` --- # Namespace Management in Kestra Enterprise: Isolation URL: https://kestra.io/docs/enterprise/governance/namespace-management > Secure your Kestra instance with Namespace Management. 
Configure isolated environments, manage secrets, and set Namespace-level plugin defaults. How to manage secrets, variables, and plugin defaults at the Namespace level.
## Namespace management – secure configuration

Kestra is a [multi-tenant](../../02.governance/tenants/index.md) platform. Each tenant can have multiple Namespaces, and each Namespace provides additional isolation and security.

Namespaces provide:

- Logical isolation of resources on top of instance- or tenant-level isolation
- Fine-grained access control for secrets, variables, and task configurations

Namespaces are particularly useful in environments with many users, teams, projects, and applications.

## Namespace-level features

The Namespace page allows you to configure secrets, plugin defaults, and variables that can be used within any flow in that Namespace. It allows your organization to centrally manage secrets, variables, and task configuration while providing fine-grained access control to those resources.

Since Kestra supports [everything as code and from the UI](https://youtu.be/dU3p6Jf5fMw?si=bqNWS1e3_if-mePS), you can manage Namespaces from the UI or programmatically (e.g., via our [Terraform provider](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs)).

### Secrets

On the Namespaces page, select the Namespace where you want to define the secrets and go to the **Secrets** tab. Here, you will see all existing secrets associated with this Namespace. Click on the **Add a secret** button in the top right corner of the page.

![add_secret.png](./add_secret.png)

Define the secret by entering its key and value. Save the secret by clicking the **Save** button at the bottom. The secret key should now appear on the **Secrets** tab. You can edit the secret's value or delete the secret by clicking the appropriate button towards the right of the secret row.

You can reference the secret in the flow by using its key, for example, `"{{ secret('MYSQL_PASSWORD') }}"`.
For APIs that issue short-lived access tokens (e.g., OAuth2), create a [Credential](../../03.auth/credentials/index.md) that relies on these secrets and fetch the token in flows with `{{ credential('your_credential_key') }}`.

Here is how you can use it in a flow:

```yaml
id: query-mysql
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.mysql.Query
    url: jdbc:mysql://localhost:3306/test
    username: root
    password: "{{ secret('MYSQL_PASSWORD') }}"
    sql: select * from employees
    fetchOne: true
```

:::alert{type="info"}
Make sure to only use the secret in flows defined in the same Namespace (or a child Namespace) as your secret.
:::

When building new flows in a Namespace, Namespace secrets are accessible from the **Secrets** tab. Open the tab to view all available Namespace secret key names.

### Plugin defaults

Plugin Defaults can also be defined at the Namespace level. These plugin defaults are then applied to all tasks of the corresponding type defined in the flows under the same Namespace.

On the Namespaces page, select the Namespace where you want to define the plugin defaults and navigate to the **Plugin defaults** tab. You can add the plugin defaults here and save the changes by clicking the **Save** button at the bottom of the page.

![Define Plugin Defaults](./plugindefaults-namespaces.png)

You can reference secrets and variables defined within the same Namespace in the plugin defaults.
In the example below, you no longer need to add the `password` property for the MySQL query task as it's defined in your Namespace-level `pluginDefaults`: ```yaml id: query-mysql namespace: company.team tasks: - id: query type: io.kestra.plugin.jdbc.mysql.Query url: jdbc:mysql://localhost:3306/test username: root sql: select * from employees fetchOne: true ``` ### Default service account for SDK plugins Namespaces can now provide **default authentication credentials** that [SDK-based plugins](/plugins/plugin-kestra) use to run tasks such as [List all Namespaces](/plugins/plugin-kestra/kestra-namespaces/io.kestra.plugin.kestra.namespaces.list). This allows tasks relying on the [Kestra SDK](../../../api-reference/kestra-sdk/index.mdx) to call the API without hard-coding credentials inside the flow. On the Namespace **Edit** page, open the **Default authentication** section and choose either: - **API token** (recommended), or - **Basic auth** (username/password) ### Variables Variables defined at the Namespace level can be used in any flow defined under the same Namespace using the syntax: `{{ namespace.variable_name }}`. On the Namespaces page, select the Namespace where you want to define the variables. Go to the **Variables** tab. You can now define the variables on this page. Save the changes by clicking the **Save** button at the bottom of the page. ![define_variables.png](./define_variables.png) Here is an example flow where the Namespace variable is used: ```yaml id: query-mysql namespace: company.team tasks: - id: query type: io.kestra.plugin.jdbc.mysql.Query url: jdbc:mysql://localhost:3306/test username: "{{ namespace.mysql_user }}" sql: select * from employees fetchOne: true ``` When building new flows in a Namespace, Namespace variables are accessible from the **Variables** tab. Open the tab to view all available Namespace variables and their associated values. 
![Namespace Variables Tab](./namespace-variable-tab.png)

## Creating Namespaces

### From the UI

The video below shows how you can create a Namespace from the Kestra UI. After creating a Namespace, we're adding:

- several new secrets
- a nested Namespace variable that references one of these secrets
- a list of plugin defaults that apply those pre-configured secrets and variables to all tasks from the AWS and Git plugins.
### From Terraform The following example reproduces the UI steps using Terraform, so that you know how to perform the same steps both from the UI and programmatically. To create a Namespace from Terraform, use the [kestra_namespace](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs) resource. First, configure your Terraform backend and add Kestra as a required provider: ```hcl terraform { backend "s3" { bucket = "kestraio" key = "terraform.tfstate" region = "us-east-1" } required_providers { kestra = { source = "kestra-io/kestra" version = "~>0.14" } } } provider "kestra" { url = var.kestra_host username = var.kestra_user password = var.kestra_password tenant_id = var.kestra_tenant_id # only if you are using multi-tenancy } ``` You can add a file `main.tf` to your Terraform project with the following content: ```hcl resource "kestra_namespace" "marketing" { namespace_id = "marketing" description = "Namespace for the marketing team" } ``` The only required property is the `namespace_id`, which is the name of the Namespace. The `description` and all other properties are optional. #### Adding variables and plugin defaults to a Namespace Terraform resource You can add variables and plugin defaults directly to the Namespace resource by pointing to the YAML configuration files. 
First, create the `variables_marketing.yml` file: ```yaml github: token: "{{ secret('GITHUB_TOKEN') }}" ``` Then, create another file for `task_defaults_marketing.yml`: ```yaml - type: io.kestra.plugin.aws values: accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" region: us-east-1 secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" - type: io.kestra.plugin.git values: password: "{{ render(namespace.github.token) }}" username: your-github-username ``` Finally, reference those files in your Namespace resource definition: ```hcl resource "kestra_namespace" "marketing" { namespace_id = "marketing" description = "Namespace for the marketing team" variables = file("variables_marketing.yml") task_defaults = file("task_defaults_marketing.yml") } ``` #### Adding secrets to a Namespace using Terraform To programmatically add secrets to your Namespace via [Terraform](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs), you can use the [kestra_namespace_secret](../../../13.terraform/resources/namespace_secret/index.md) resource. 
Here is an example of adding multiple secrets to the `marketing` Namespace: ```hcl resource "kestra_namespace_secret" "github_token" { namespace = "marketing" secret_key = "GITHUB_TOKEN" secret_value = var.github_token } resource "kestra_namespace_secret" "aws_access_key_id" { namespace = "marketing" secret_key = "AWS_ACCESS_KEY_ID" secret_value = var.aws_access_key_id } resource "kestra_namespace_secret" "aws_secret_access_key" { namespace = "marketing" secret_key = "AWS_SECRET_ACCESS_KEY" secret_value = var.aws_secret_access_key } ``` Before referencing variables in your Terraform configuration, make sure to define them in your `variables.tf` file: ```hcl variable "github_token" { type = string sensitive = true } variable "aws_access_key_id" { type = string sensitive = true } variable "aws_secret_access_key" { type = string sensitive = true } variable "kestra_user" { type = string sensitive = true } variable "kestra_password" { type = string sensitive = true } variable "kestra_host" { type = string sensitive = false default = "http://your_kestra_host:8080" # Change this to your Kestra host URL } variable "kestra_tenant_id" { type = string sensitive = false default = "kestra-tech" } ``` And add your secrets to the `terraform.tfvars` file: ```hcl github_token = "your-github-token" aws_access_key_id = "your-aws-access-key-id" aws_secret_access_key = "your-aws-secret-access-key" kestra_user = "your-kestra-user" kestra_password = "your-kestra-password" ``` ## Allowed Namespaces When you navigate to any Namespace and go to the Edit tab, you can explicitly configure which Namespaces are allowed to access flows and other resources related to that Namespace. By default, all Namespaces are allowed: ![allowed-namespaces](./allowed-namespaces.png) However, you can restrict that access if you want only specific Namespaces (or no Namespace at all) to trigger its corresponding resources. 
![allowed-namespaces-2](./allowed-namespaces-2.png) --- # Read-Only Secret Manager in Kestra Enterprise URL: https://kestra.io/docs/enterprise/governance/read-only-secrets > Enhance security with Read-Only Secret Managers in Kestra. Integrate external secret stores like Vault or AWS Secrets Manager in immutable mode. Integrate external secrets managers in a read-only mode.
## Read-only secret manager When integrating an external [secrets manager](../secrets-manager/index.md) with Kestra, you may want to ensure that those secrets cannot be modified within Kestra, maintaining immutability. Currently, read-only secrets can be configured for [AWS Secret Manager](../secrets-manager/index.md#aws-secrets-manager-configuration), [Azure Key Vault](../secrets-manager/index.md#azure-key-vault-configuration), [Google Secret Manager](../secrets-manager/index.md#google-secret-manager-configuration), and [Vault](../secrets-manager/index.md#vault-configuration). :::alert{type="info"} Need short-lived tokens while keeping secrets immutable? Use a [Credential](../../03.auth/credentials/index.md); it mints tokens from your read-only secrets and surfaces them at runtime via `credential()`. ::: ## Configure read-only secrets Read-only secrets can be configured globally in the configuration file as well as enabled from the UI at the [Tenant](../tenants/index.md) and the [Namespace](../../../05.workflow-components/02.namespace/index.md) level. To enable for a specific Tenant, toggle the setting on in the **Dedicated secrets manager** configuration. ![read-only-secrets-8](./read-only-secrets-8.png) To enable for a specific Namespace, toggle the setting on in the **Dedicated secrets manager** configuration of the **Edit** tab. ![read-only-secrets-1](./read-only-secrets-1.png) Secrets will display a lock icon to indicate read-only status, and the **Create New Secret** button will no longer be visible. ![read-only-secrets-4](./read-only-secrets-4.png) To configure globally, add `read-only: true` to the configuration of your external secret manager like in the examples below. ### AWS Secret Manager For compatibility with Kestra, ensure that your AWS secrets are stored as plain text in AWS Secrets Manager and not as key-value pairs. 
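Such a plain-text secret can be created and tagged from the AWS CLI; the sketch below uses placeholder names and values (the `namespace`, `key`, and `prefix` tags tell Kestra which namespace the secret belongs to and which key references it):

```shell
# Create a plain-text secret (not a key/value JSON document) and tag it
# so Kestra can map it to a namespace and a secret key.
# The secret name and token value below are placeholders.
aws secretsmanager create-secret \
  --name marketing-github-token \
  --secret-string "ghp_example_token" \
  --tags Key=namespace,Value=marketing Key=key,Value=GITHUB_TOKEN Key=prefix,Value=kestra
```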
The following example shows the configuration for AWS Secret Manager with a read-only secrets backend: ```yaml kestra: secret: type: aws-secret-manager read-only: true aws-secret-manager: access-key-id: mysuperaccesskey secret-key-id: mysupersecretkey region: us-east-1 ``` When adding a secret in AWS, you will need to specify the following tags: - `namespace`: the namespace this secret should appear in. - `key`: the key you will use to access the secret inside your workflow. - `prefix`: used to store secrets separately. Defaults to `kestra` if the secret is created inside Kestra. :::alert{type="info"} The secret name in AWS is not displayed inside Kestra, so set it to something that is easy to distinguish from other secrets. ::: ### Azure Key Vault The following example shows the configuration for Azure Key Vault with a read-only secrets backend: ```yaml kestra: secret: type: azure-key-vault read-only: true azure-key-vault: tenantId: "id" clientId: "id" clientSecret: "secret" ``` ### Google Secret Manager The following example shows the configuration for Google Secret Manager with a read-only secrets backend: ```yaml kestra: secret: type: google-secret-manager read-only: true google-secret-manager: project: gcp-project-id service-account: | Paste the contents of the service account JSON key file here. ``` ### Vault With [Vault](../secrets-manager/index.md#vault-configuration), secrets are stored in a unique structure that can vary depending on the organization and version of Vault. Typically, there is a Secret Engine that hosts different Secrets with specific paths. Those Secrets are the paths to subkeys that are the actual key-value pairs, such as a Username or Password for a service (e.g., `MY_SECRET = MY_SECRET_PASSWORD`). 
Here’s an example directory structure of a Vault secret engine used with Kestra: ```plaintext secret/ ├── app1/ │ ├── db/ <-- SECRET │ │ ├── DATABASE_USERNAME # Subkey │ │ ├── DATABASE_PASSWORD # Subkey │ ├── api/ <-- SECRET │ ├── keys # Subkey │ ├── API_TOKEN # Subkey ├── app2/ ├── config ``` - `secret`: This is the secret engine. - `app1` and `app2`: These are the path names to the secrets. This could be for example separate business units or applications. - `db`, `api`, and `config`: These are the secret names visible in the Kestra UI. `api` could be the Vault Secret that contains all API Keys for an application's external services. - `DATABASE_USERNAME`, `DATABASE_PASSWORD`, `keys`, `API_TOKEN`: These are the `subkey` key value pairs that can be used in a Kestra flow. To configure access to secrets under `app1`, use the following [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) with the added property `secret-path-prefix`: ```yaml address: https://my-vault:8200/ root-engine: secret secret-path-prefix: app1 token: token: my-vault-access-token ``` This configuration gives Kestra access to the `db` and `api` secrets, as they are the secrets on the `app1` path. In a flow, to access the value for the subkey `API_TOKEN`, you write the `secret()` function with the specified parameters `{{ secret('api', subkey='API_TOKEN') }}`. ## Vault full example The following steps are a full example of configuring Vault as your secret manager with read-only secrets enabled. This example uses [KV Secrets Engine - Version 2 with Vault Enterprise](../secrets-manager/index.md#kv-secrets-engine---version-2), so `root-engine` and `namespace` are used as optional properties. In Vault, we have a Secrets Engine named `business-unit` in the `admin` namespace that hosts the path to our database password that we want to use to [add a table and populate with data in Neon](../../../15.how-to-guides/neon/index.md). 
![read-only-secrets-2](./read-only-secrets-2.png) In Kestra, we can now navigate to the Namespace we want to set up Vault as a secrets manager for and enter the configuration details: ![read-only-secrets-3](./read-only-secrets-3.png) After saving, we can move to the Secrets tab and see which paths we have access to. Notice the lock icon indicating that read-only is successfully turned on. No new secrets can be created from Kestra, and existing secrets are not editable. ![read-only-secrets-4](./read-only-secrets-4.png) In Vault, we know `my-app` is the secret that hosts the subkey we are looking for, in this case, `NEON_PASSWORD`. ![read-only-secrets-5](./read-only-secrets-5.png) Now to use in our flow, we need to use the `secret()` function with the name of our secret `my-app` and the `subkey` parameter set to the key of the secret value we want to use, which in this case is `NEON_PASSWORD`. :::collapse{title="Expand for a Flow yaml"} ```yaml id: neon-db namespace: company.team tasks: - id: download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: create_columns type: io.kestra.plugin.jdbc.postgresql.Queries sql: | ALTER TABLE kestra_example_secret ADD COLUMN order_id int, ADD COLUMN customer_name text, ADD COLUMN customer_email text, ADD COLUMN product_id int, ADD COLUMN price double precision, ADD COLUMN quantity int, ADD COLUMN total double precision; - id: copy_in type: io.kestra.plugin.jdbc.postgresql.CopyIn table: "kestra_example_secret" from: "{{ outputs.download.uri }}" header: true columns: [order_id,customer_name,customer_email,product_id,price,quantity,total] delimiter: "," pluginDefaults: - forced: true type: io.kestra.plugin.jdbc.postgresql values: url: jdbc:postgresql://ep-ancient-flower-a2e73um1-pooler.eu-central-1.aws.neon.tech/neondb?user=neondb_owner&password={{secret('my-app', subkey='NEON_PASSWORD')}} ``` ::: ![read-only-secrets-6](./read-only-secrets-6.png) After saving 
the flow and executing, we can see that Kestra successfully accessed the correct value from Vault and added 100 rows to our Neon database. ![read-only-secrets-7](./read-only-secrets-7.png) ## Filter secrets by tags When integrating an external secrets manager in read-only mode, you can filter which secrets are visible in Kestra by matching [tags](../secrets-manager/index.md#default-tags). This is supported for AWS Secrets Manager, Azure Key Vault, and Google Secret Manager. - Set `read-only: true` and configure `filter-on-tags.tags` as a map of key/value pairs to match. Below are example configurations for AWS Secrets Manager, Azure Key Vault, and Google Secret Manager: ```yaml kestra: secret: type: aws-secret-manager read-only: true aws-secret-manager: filter-on-tags: tags: application: kestra-production ``` ```yaml kestra: secret: type: azure-key-vault read-only: true azure-key-vault: filter-on-tags: tags: application: kestra-production ``` ```yaml kestra: secret: type: google-secret-manager read-only: true google-secret-manager: filter-on-tags: tags: application: kestra-production ``` ## Filter secrets by prefix For AWS Secrets Manager, you can also filter secrets by a name prefix when using read-only mode. Use `filter-on-prefix.prefix` to select secrets whose names start with the given prefix and `filter-on-prefix.keep-prefix` to control whether the prefix is kept in the Kestra secret key. ```yaml kestra: secret: type: aws-secret-manager read-only: true aws-secret-manager: filter-on-prefix: prefix: prod_ keep-prefix: true ``` --- # Secrets in Kestra Enterprise: Manage Sensitive Data URL: https://kestra.io/docs/enterprise/governance/secrets > Manage sensitive data securely in Kestra Enterprise. Create, use, and govern secrets within your workflows and integrations. How to create and manage Secrets in the Enterprise Edition. 
## Secrets – manage sensitive data Secrets are used to store confidential information such as passwords, API keys, and other sensitive data that must not be exposed as plain text. Secrets managed in Kestra are encrypted at rest and in transit to guarantee that your sensitive information is secure. :::alert{type="info"} Need short-lived OAuth-style tokens or app-to-app auth? Define a [Credential](../../03.auth/credentials/index.md) that mints/refreshes tokens from your secrets and injects them at runtime via `credential()`. :::
## How to create a new Secret From the left navigation menu, go to **Namespaces**. Select a namespace and click on the **Secrets** tab. Then, click on the **Create** button to add a new secret. ## How are Secrets different between the Open-Source and Enterprise Editions? The Open Source Edition does not include built-in secrets management. However, you can pass special base64-encoded environment variables to your Kestra instance to store sensitive information. These environment variables can still be accessed in your flows using the `secret()` function, just like in the Enterprise Edition. :::alert{type="info"} Since there is no real notion of Secrets Management in the Open Source Edition, you will need to manage the lifecycle of these environment variables manually. This means that you will need to restart your Kestra instance to update or delete a Secret. We recommend planning these operations carefully to avoid any downtime, or contact us about upgrading to the Enterprise Edition to gain access to full secrets management features, including integration with external [Secrets Managers](../secrets-manager/index.md). ::: For more, check out our [secrets documentation](../../../06.concepts/04.secret/index.md), the [credentials guide](../../03.auth/credentials/index.md) for short-lived OAuth-style access tokens, and our [secrets best practices guide](../../../14.best-practices/9.secrets-management/index.md). --- # External Secrets Manager in Kestra: AWS, Azure, GCP URL: https://kestra.io/docs/enterprise/governance/secrets-manager > Secure sensitive data in Kestra with External Secrets Managers. Integrate with AWS, Azure, Google Cloud, Vault, and more for robust secret management. How to configure a secrets manager. ## Configure external secrets manager Kestra integrates with various secret managers to provide secure storage and handling of sensitive data. Kestra respects your privacy. Therefore, secrets are persisted externally in a backend of your choice. 
Workers fetch them at runtime and keep them only in memory. You can add, modify, or delete secrets from the **Secrets** tab of any given namespace in the Kestra UI or programmatically via [Terraform](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs/resources/namespace_secret). :::alert{type="info"} If you need short-lived OAuth-style access tokens, create a [Credential](../../03.auth/credentials/index.md) that mints/refreshes tokens using the secrets stored in your external manager, then call it with `credential()` in flows. :::
## AWS Secrets Manager Configuration To use [AWS Secrets Manager](https://aws.amazon.com/secrets-manager/) as a secrets backend, make sure your AWS IAM user or role has the required permissions, including `CreateSecret`, `DeleteSecret`, `DescribeSecret`, `GetSecretValue`, `ListSecrets`, `PutSecretValue`, `RestoreSecret`, `TagResource`, and `UpdateSecret`. You can configure the authentication to AWS Cloud in multiple ways: - Use `accessKeyId`, `secretKeyId`, and `region` properties. - Include a `sessionToken` alongside the above credentials. - If the above properties are not set, Kestra will use the default AWS authentication in the same way as AWS CLI handles it (i.e., trying to use the AWS CLI profile or the default environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION`). ```yaml kestra: secret: type: aws-secret-manager aws-secret-manager: access-key-id: mysuperaccesskey secret-key-id: mysupersecretkey sessionToken: mysupersessiontoken region: us-east-1 ``` Additionally, you can configure the following properties: - **Prefix**: `kestra.secret.aws-secret-manager.prefix` is an optional property to store secrets separately for a different namespace, tenant, or instance. If configured, Kestra prefixes all secret keys with that value. This allows sharing a single secrets backend across multiple Kestra instances. - **Endpoint Override**: `kestra.secret.aws-secret-manager.endpoint-override` is an optional property to replace the default AWS endpoint with an AWS-compatible service such as [MinIO](https://min.io/). 
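Combining the optional properties above, a configuration pointing at a MinIO-compatible endpoint with a per-instance prefix might look like this (the endpoint and prefix values are illustrative):

```yaml
kestra:
  secret:
    type: aws-secret-manager
    aws-secret-manager:
      access-key-id: mysuperaccesskey
      secret-key-id: mysupersecretkey
      region: us-east-1
      # optional: isolate this instance's secrets within a shared backend
      prefix: instance-a
      # optional: AWS-compatible service such as MinIO
      endpoint-override: http://localhost:9000
```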
## Azure Key Vault configuration To configure [Azure Key Vault](https://azure.microsoft.com/products/key-vault/) as your secrets backend, make sure Kestra's user or service principal (`clientId`) has the necessary permissions, including: - `"Get"` - `"List"` - `"Set"` - `"Delete"` - `"Recover"` - `"Backup"` - `"Restore"` - `"Purge"` Then, paste the `clientSecret` from the Azure portal to the `clientSecret` property in the configuration below. ```yaml kestra: secret: type: azure-key-vault azure-key-vault: tenantId: "id" clientId: "id" clientSecret: "secret" ``` If no credentials are set in the above configuration, Kestra uses the default Azure authentication (the same mechanism as the Azure CLI). Additionally, you can configure the following properties: - **Vault Name**: `kestra.secret.azure-key-vault.vault-name` is the name of the Azure Key Vault. - **Key Vault URI**: `kestra.secret.azure-key-vault.key-vault-uri` is an optional property allowing you to replace the Azure Key Vault name with a full URL. - **Prefix**: `kestra.secret.azure-key-vault.prefix` is an optional property to store secrets separately for a different namespace, tenant, or instance. If configured, Kestra prefixes all secret keys with that value, which is useful when sharing one vault across multiple Kestra instances. ## Elasticsearch configuration The Elasticsearch backend stores secrets with an additional layer of security using [AES encryption](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard). You need to provide a cryptographic key (a string at least 32 characters long) in order to encrypt and decrypt secrets stored in Elasticsearch. ```yaml kestra: secret: type: elasticsearch elasticsearch: secret: "a-secure-32-character-minimum-key" ``` For a Kestra instance deployed using the Kafka/Elasticsearch backend, you can use the same configuration. Your secret key should be encrypted. 
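One simple way to generate a key that meets the 32-character minimum is with OpenSSL; any sufficiently long random string works:

```shell
# 16 random bytes, hex-encoded, yield exactly 32 characters.
KEY=$(openssl rand -hex 16)
echo "${#KEY}"
```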
You can find an example key in our [Security and Secrets configuration documentation](../../../configuration/05.security-and-secrets/index.md). ## Google Secret Manager configuration To leverage [Google Secret Manager](https://cloud.google.com/secret-manager) as your secrets backend, you need to create a **service account** with the [`roles/secretmanager.admin`](https://cloud.google.com/secret-manager/docs/access-control) permission. To configure the secret manager in read-only mode, the `roles/secretmanager.secretAccessor` permission is sufficient. Paste the contents of the service account JSON key file to the `serviceAccount` property in the configuration below. Alternatively, set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to point to the credentials file. ```yaml kestra: secret: type: google-secret-manager google-secret-manager: project: gcp-project-id serviceAccount: | { "type": "service_account", "project_id": "gcp-project-id", "private_key_id": "...", "private_key": "...", ... } ``` If you opt for authentication using the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, make sure that it's set on all worker nodes. Keep in mind that this authentication method is less secure than using the `serviceAccount` property. If no credentials are set in the above configuration, Kestra will use the default Google authentication, akin to the Google Cloud SDK. Additionally, you can configure the `kestra.secret.google-secret-manager.prefix` property to store secrets separately for a different namespace, tenant, or instance. If configured, Kestra will prefix all secret keys with that value. The main purpose of a prefix is to share the same secret manager between multiple Kestra instances. 
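The `roles/secretmanager.admin` grant mentioned above can be made with the gcloud CLI; in this sketch, the project ID and service-account address are placeholders:

```shell
# Grant Secret Manager admin rights to the service account Kestra will use.
# Use roles/secretmanager.secretAccessor instead for read-only mode.
gcloud projects add-iam-policy-binding gcp-project-id \
  --member="serviceAccount:kestra@gcp-project-id.iam.gserviceaccount.com" \
  --role="roles/secretmanager.admin"
```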
When configuring the secret manager using the UI, either under Namespace or Tenant, you only need to configure the `project` and `serviceAccount` YAML configuration: ![GCP Secret Manager Configuration via UI](./gcp-secret-configuration.png) ## HashiCorp Vault configuration Kestra currently supports the [KV secrets engine - version 2](https://developer.hashicorp.com/vault/docs/secrets/kv/kv-v2) as a secrets backend. If you are considering alternative HashiCorp Vault secrets engines, please note the following: - [Vault's database secrets engine](https://developer.hashicorp.com/vault/docs/secrets/databases), often referred to as "dynamic secrets", is not supported, as Kestra needs long-term secret storage. - The [Vault Secrets Operator on Kubernetes](https://developer.hashicorp.com/vault/tutorials/kubernetes/vault-secrets-operator) creates a Kubernetes secret which is compatible with Kestra with some additional steps. If you are interested in this option, [reach out to us](/demo) and we can advise on how to set this up. Follow the steps below to configure the [KV Secrets Engine - Version 2](https://www.vaultproject.io/docs/secrets/kv/kv-v2) as your secrets backend. ### KV Secrets Engine - Version 2 To authenticate Kestra with [HashiCorp Vault](https://www.vaultproject.io/), you can use Userpass, Token, AppRole, or Kubernetes [Auth Methods](https://developer.hashicorp.com/vault/docs/auth), all of which require full [read and write policies](https://www.vaultproject.io/docs/concepts/policies). You can optionally change `rootEngine` or `namespace` (_if you use Vault Enterprise_). 1. Here is how you can set up [Userpass Auth Method](https://www.vaultproject.io/docs/auth/userpass) in your Kestra configuration: ```yaml kestra: secret: type: vault vault: address: "http://localhost:8200" password: user: john password: foo ``` 2. 
Here is how you can set up [Token Auth Method](https://www.vaultproject.io/docs/auth/token) in your Kestra configuration: ```yaml kestra: secret: type: vault vault: address: "http://localhost:8200" token: token: ``` 3. Here is how you can set up [AppRole Auth Method](https://www.vaultproject.io/docs/auth/approle) in your Kestra configuration: ```yaml kestra: secret: type: vault vault: address: "http://localhost:8200" appRole: path: approle roleId: secretId: ``` 4. Finally, here is how you can set up [Kubernetes Auth Method](https://www.vaultproject.io/docs/auth/kubernetes) in your Kestra configuration: ```yaml kestra: secret: type: vault vault: address: "http://localhost:8200" kubernetes: path: "kubernetes" # defaults to "kubernetes" role: "kestra" # <-- the Vault K8s auth role name to use ``` Additionally, you can configure the following properties: - **Address**: `kestra.secret.vault.address` is a fully qualified address with scheme and port to your Vault instance. - **Namespace**: `kestra.secret.vault.namespace` is an optional configuration available on [Vault Enterprise Pro](https://learn.hashicorp.com/vault/operations/namespaces) allowing you to set a global namespace for the Vault server instance. - **Engine Version**: `kestra.secret.vault.engine-version` is an optional property allowing you to set the KV Secrets Engine version of the Vault server instance. Default is `2`. - **Root Engine**: `kestra.secret.vault.root-engine` is an optional property allowing you to set the KV Secrets Engine of the Vault server instance. Default is `secret`. Using the Token method with a custom Root Engine has the following configuration: ```yaml kestra: secret: type: vault vault: token: token: YOUR_TOKEN address: http://vault:8200 rootEngine: dev ``` In Vault, `rootEngine: dev` translates to your KV secret engine type with the path set to "dev". 
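On the Vault side, a matching KV v2 engine at path `dev` could be prepared with the Vault CLI; the paths and values below are illustrative:

```shell
# Enable a KV version 2 secrets engine at path "dev" to match rootEngine: dev.
vault secrets enable -path=dev -version=2 kv

# Write a sample secret; Kestra stores its secrets under
# TENANT_ID/NAMESPACE/.../SECRET_NAME within this engine.
vault kv put dev/internal/company/team/MY_SECRET value=example
```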
![Vault Secret UI](./kv-secret-engine.png) And any secret that you create from Kestra would be placed under the following structure: `TENANT_ID/NAMESPACE_PARENT/NAMESPACE_CHILD/NAMESPACE_GRANDCHILD/SECRET_NAME`. Assuming a Tenant ID of `internal` and a `company.team` Namespace, Vault will show the following: ![Vault Secret Structure](./secret-structure.png) ## CyberArk Configuration Kestra integrates with [CyberArk](https://www.cyberark.com/products/secrets-management/) as a secrets backend. CyberArk stores your secrets externally, and Kestra workers retrieve them at runtime and keep them only in memory. To use CyberArk, configure the CyberArk endpoint and credentials. This configuration can be set globally in your [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) or per-namespace using the **Secrets** tab with a dedicated secret manager. ```yaml kestra: secret: type: cyberark cyberark: address: https://your-cyberark-host username: YOUR_USERNAME password: YOUR_PASSWORD ``` **Configuration properties:** * **address**: The CyberArk API base URL. * **username**: Username used to authenticate to CyberArk. * **password**: Password used to authenticate to CyberArk. ## Doppler configuration Kestra integrates with [Doppler](https://api.doppler.com) as a secrets backend. Doppler securely stores your secrets and exposes them through its API, which Kestra workers access at runtime. Secrets are only kept in memory by Kestra and are never persisted internally. To use Doppler, generate a Doppler service token with access to the desired project and config. Then, add the following configuration either globally in your [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) or per-namespace using the **Secrets** tab with a dedicated secret manager. 
```yaml kestra: secret: type: doppler doppler: token: YOUR_TOKEN config: kestra_unit_test project: kestra_unit_test secretNamePrefix: kestra ``` **Configuration properties:** * **token**: Your Doppler service token. * **project**: The Doppler project containing the secrets. * **config**: The Doppler config/environment to read from. * **secretNamePrefix**: Optional prefix added to all secret keys to avoid collisions and share a Doppler backend across multiple Kestra instances or namespaces. ## 1Password Configuration Kestra integrates with 1Password as a secrets backend. Under the hood, it relies on the [1Password Connect API](https://developer.1password.com/docs/connect/api-reference/) to read and manage secrets securely. Workers access secrets at runtime and store them only in memory. To use 1Password, you need a running 1Password Connect server and a Connect token with access to the target vault. Then, add the following configuration either globally in your [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) or per-namespace using the **Secrets** tab with a dedicated secret manager. ```yaml kestra: secret: type: 1password 1password: address: http://localhost:18080 token: YOUR_TOKEN vaultId: YOUR_VAULT_ID ``` **Configuration properties:** * **address**: The base URL of your 1Password Connect server. * **token**: Your 1Password Connect API token. * **vaultId**: The ID of the vault containing your secrets. ## BeyondTrust Configuration Kestra integrates with BeyondTrust Password Safe (Secrets Safe) as an external secrets backend. Secrets are stored securely in BeyondTrust using the [Secrets Safe API](https://docs.beyondtrust.com/bips/v24.3/docs/secrets-safe-api), and Kestra workers retrieve them at runtime and keep them only in memory. 
```yaml kestra: secret: type: beyondtrust beyondtrust: address: https://beyondtrust.example.com apiKey: YOUR_API_KEY runAs: domain\\service-account folderId: YOUR_SECRETS_SAFE_FOLDER_ID ``` **Configuration properties:** * **address**: The base URL of the BeyondTrust Password Safe instance. * **apiKey**: API key used to authenticate with BeyondTrust. * **runAs**: User context to run API calls as (e.g. domain\\username). * **folderId**: Secrets Safe folder ID where Kestra secrets are stored. ## Delinea Secret Server Configuration Kestra integrates with [Delinea Secret Server](https://delinea.com/products/secret-server) as an external secrets backend. Secrets are stored securely in Delinea Secret Server, and Kestra workers retrieve them at runtime and keep them only in memory. ```yaml kestra: secret: type: delinea delinea: address: https://your-delinea-instance.secretservercloud.com username: YOUR_USERNAME password: YOUR_PASSWORD folderId: YOUR_FOLDER_ID secretTemplateId: YOUR_TEMPLATE_ID ``` **Configuration properties:** - **address**: The base URL of your Delinea Secret Server instance. - **username**: Username used to authenticate to Delinea Secret Server. - **password**: Password used to authenticate to Delinea Secret Server. - **domain**: Optional. Active Directory domain for on-premise deployments using domain accounts. - **folderId**: The folder ID in Delinea Secret Server where Kestra secrets are stored. Required for write operations. - **secretTemplateId**: The secret template ID used when creating new secrets. Required for write operations. ## JDBC (Postgres, H2, MySQL) Secret Manager Kestra also supports an internal secrets backend. For the JDBC backend (H2, PostgreSQL, or MySQL), the following configuration sets up the secrets backend: ```yaml kestra: secret: type: jdbc jdbc: secret: ``` Your secret key should be encrypted. 
You can find an example key in our [Security and Secrets configuration documentation](../../../configuration/05.security-and-secrets/index.md). ## Default tags For each secret manager, you can configure the default tags that will be added to all newly created or updated secrets. Configuration example: ```yaml kestra: secret: <secret-manager-type>: # a map of default key/value tags tags: application: kestra-production ``` Tags can be used as filters on your secrets in read-only mode. Refer to the [Read-only Secret Manager documentation](../read-only-secrets/index.md#filter-secrets-by-tags) for more details. ## Enable caching If you use a secret manager provided by a cloud service provider, it may be worth enabling the secret cache to reduce the number of calls to the secret manager API. Configuration example: ```yaml kestra: secret: cache: enabled: true maximum-size: 1000 expire-after-write: 60s ``` * **`kestra.secret.cache.enabled`**: Specifies whether to enable caching for secrets. * **`kestra.secret.cache.maximum-size`**: The maximum number of entries the cache may contain. * **`kestra.secret.cache.expire-after-write`**: Specifies that each entry should be automatically removed from the cache once this duration has elapsed after the entry's creation. --- # Multi-Tenancy in Kestra: Configure Tenants URL: https://kestra.io/docs/enterprise/governance/tenants > Enable Multi-Tenancy in Kestra Enterprise. Isolate resources, flows, and users across different teams or projects within a single Kestra instance. How to enable multi-tenancy in your Kestra instance.
## Multi-tenancy – configure and manage tenants A tenant represents an **isolated environment within a single Kestra instance**. Each tenant functions as a separate entity with its own resources, such as flows, triggers, or executions. Multi-tenancy enables different teams, projects, or customers to operate independently within the same Kestra instance, ensuring data privacy, security, and separation of resources between business units, teams, or customers. For example, you can have a `dev` tenant for development, a `staging` tenant for testing, and a `prod` tenant for production. :::alert{type="info"} You can think of multi-tenancy as running multiple virtual instances in a single physical instance of [Kestra Cloud](/cloud) or [Kestra Enterprise Edition](../../01.overview/01.enterprise-edition/index.md). ::: All resources (such as [flows](../../../05.workflow-components/01.flow/index.md), [triggers](../../../05.workflow-components/07.triggers/index.mdx), [executions](../../../05.workflow-components/03.execution/index.md), [RBAC](../../03.auth/rbac/index.md), and more) are isolated by the tenant. This means that you can have a flow with the same identifier and the same namespace in multiple tenants at the same time. Data stored inside the internal storage is also separated by tenants. End-users can use the tenant selection dropdown menu from the [UI](../../../09.ui/index.mdx) to see tenants they have access to. It allows users to switch between tenants easily. Each UI page includes the tenant ID in the URL (e.g., `https://demo.kestra.io/ui/yourTenantId/executions/namespace/flow/executionId`). ![Tenants selection dropdown](./tenants.png) Most [API](../../../api-reference/index.mdx) endpoints also include the tenant identifier. The exception to that is instance-level endpoints such as `/configs`, `/license-info`, or `/banners`, which require Superadmin access. 
For example, the URL of the API operation to list flows of the `products` namespace is `/api/v1/{your_tenant_id}/flows/products`. You can check the [Enterprise Edition API Guide](../../../api-reference/01.enterprise/index.mdx) for more information. Tenants must be created upfront, and a user needs to be granted access to use a specific tenant. ## Key benefits of multi-tenancy 1. **Data Isolation**: each tenant's data, configuration, and code is isolated and inaccessible to other tenants. 2. **Resource Isolation**: each tenant's resources are isolated from other tenants — including flows, triggers, executions, logs, audit logs, secrets, etc. 3. **Simple Configuration**: create new tenants at any time, each providing a fresh, fully isolated workspace accessible from your existing Kestra instance. 4. **Intuitive UI Navigation**: the UI provides a dropdown as well as tenant identifiers included in the URL to make switching between tenants seamless. ## Creating and Managing Tenants Tenants in Kestra can be managed in various ways: from the UI, CLI, API, or Terraform. ### Creating a Tenant from the UI Tenants can be created and managed directly through Kestra's user interface. Go to **Instance -> Tenants**. Then, click on the **Create** button: ![create tenant from the UI](./tenant-create.png) Fill in the form and click **Save**: ![create tenant from the UI](./tenant-create-2.png) The user who creates a tenant is automatically granted the Admin Role for that tenant. You may need to refresh the UI to see updated Roles. ### Creating a Tenant from the CLI Kestra provides CLI commands for tenant creation. 
The following command creates a tenant with the identifier `stage` and the name `Staging`: ```bash kestra tenants create --tenant stage --name "Staging" ``` Running `kestra tenants create --help` shows you all available properties: ```bash $ kestra tenants create --help Usage: kestra tenants create [-hVv] [--internal-log] [--admin-username=] [-c=] [-l=] [--name=] [-p=] [--tenant=] create a tenant and assign admin roles to an existing admin user --admin-username= Username of an existing admin user that will be admin of this tenant -c, --config= Path to a configuration file, default: /Users/anna/.kestra/config.yml) -h, --help Show this help message and exit. --internal-log Change also log level for internal log, default: false) -l, --log-level= Change log level (values: TRACE, DEBUG, INFO, WARN, ERROR; default: INFO) --name= tenant description -p, --plugins= Path to plugins directory, default: /Users/anna/dev/plugins) --tenant= tenant identifier -v, --verbose Change log level. Multiple -v options increase the verbosity. -V, --version Print version information and exit. ``` ### Creating a Tenant from the API Tenants can be managed programmatically via Kestra's [API](../../../api-reference/01.enterprise/index.mdx#post-/api/v1/tenants). Here is an example of an API call for creating a tenant: ```bash curl -X POST "https://demo.kestra.io/api/v1/tenants" \ -H "accept: application/json" \ -H "Content-Type: application/json" \ -d "{ \"id\": \"stage\", \"name\": \"staging\", \"deleted\": false}" ``` ### Creating a Tenant from Terraform Tenants can be managed via Infrastructure as Code using [Kestra's Terraform provider](../../../13.terraform/resources/tenant/index.md). 
:::alert{type="info"} This example assumes you have already configured the [Kestra Terraform Provider](../../../13.terraform/index.mdx). ::: Here is an example of a Terraform configuration for creating a tenant: ```hcl resource "kestra_tenant" "stage" { tenant_id = "stage" name = "staging" } ``` ### Deleting a tenant Deleting a tenant removes all associated resources, including flows, namespaces, apps, dashboards, and roles. Execution data, logs, metrics, and audit logs are retained in the database, and they can be purged if needed with their corresponding [Purge tasks](../../../10.administrator-guide/purge/index.md). :::alert{type="warning"} Deleting a tenant is irreversible. All resources under the tenant will be permanently removed, except for logs and execution history stored in the database. ::: Key-value pairs and namespace files will not be deleted, as they are persisted in internal storage. ### Admin role assignment Regardless of which of the above methods you use to create a tenant, the User who creates it is automatically assigned the Admin Role, which grants that user admin rights on that tenant. Note that there is an exception to this rule if a tenant is created by a Superadmin. In that case, the Superadmin has to explicitly assign the Admin Role for that tenant to themselves or any other User, Service Account, or Group. ### Dedicated storage and secrets backend per tenant By default, each tenant uses the same [runtime and storage configuration](../../../configuration/02.runtime-and-storage/index.md) and [secrets backend](../secrets-manager/index.md) configured for your Kestra instance. If you need more isolation, you can configure a dedicated storage and secrets backend per tenant. This can be useful if each of your tenants serves different customers and you need to ensure complete data isolation between them. 
To configure a dedicated storage and secrets backend per tenant, navigate to **Instance → Tenants** in the UI and click on the **Details** button of the tenant you'd like to configure. Then, select the storage and secrets backend you want to use for that tenant: ![tenants-dedicated-internal-storage](./tenants-dedicated-internal-storage.png) For storage configuration examples, refer to [Runtime and Storage](../../../configuration/02.runtime-and-storage/index.md) in the configuration guide. ![tenants-dedicated-secrets-manager](./tenants-dedicated-secrets-manager.png) For the different secret managers' configurations, refer to the [Secret Managers documentation](../secrets-manager/index.md). :::alert{type="warning"} Make sure to use `camelCase` notation. For example, if you want to use the `GCS` storage backend, you should use `projectId` as the property name rather than `project-id`. ::: ### Isolate Kestra services When using [Dedicated Storage or Secret backends](../tenants/index.md#dedicated-storage-and-secrets-backend-per-tenant), you can isolate specific [Kestra services](../../../08.architecture/02.server-components/index.md) to prevent them from accessing the storage or secret backend. For example, you may not want the [Webserver](../../../08.architecture/02.server-components/index.md#webserver) to be able to access the dedicated internal storage. This isolation is intended for Kestra instances where multiple teams or organizations share access, but storage or secret data access must be limited to specific segments. The configuration uses the `deniedServices` property, which takes a list of the services to isolate. 
Take the following as an example using `storage` (this can be replaced with `secret` for a dedicated secret backend), where the Executor and Webserver must be isolated: ```yaml kestra: storage: # or secret isolation: enabled: true deniedServices: [EXECUTOR, WEBSERVER] ``` For additional configuration details, refer to the dedicated [Security and Secrets](../../../configuration/05.security-and-secrets/index.md) and [Runtime and Storage](../../../configuration/02.runtime-and-storage/index.md) pages in the configuration guide. :::alert{type="info"} If this feature is enabled, some UI or flow execution capabilities may not work as expected. If unsure, contact support. ::: ### Default service account for SDK plugins Each tenant can define **default authentication credentials** used by [SDK-based plugins](/plugins/plugin-kestra). Configure this in the Tenant settings (API token or basic auth). [Namespaces](../07.namespace-management/index.md#default-service-account-for-sdk-plugins) can override it with their own default service account; otherwise, the tenant-level default is used. If neither is set, SDK plugins require the authentication properties to be defined directly in the tasks. --- # Unit Tests in Kestra Enterprise: Validate Flows URL: https://kestra.io/docs/enterprise/governance/unit-tests > Validate workflows with Unit Tests in Kestra Enterprise. Create test suites, mock tasks, and assert flow behavior to ensure reliability before production. Build tests to ensure proper flow behavior. Tests let you verify that your flow behaves as expected, without cluttering your instance with test executions that run every task. For example, a unit test designed to mock the notification task of a flow ensures the configuration is correct without spamming dummy notifications to the recipient. They also let you isolate testing to specific changes to a task, rather than executing the entire flow.
## Flow unit tests Each test runs a single flow and checks its outcomes against your **assertions**, helping you avoid regressions when you change the flow later. Each **test case** creates a new transient execution, making it easy to run multiple tests in parallel without one test case affecting another. Use **fixtures** to mock specific tasks or inputs by returning predefined outputs and states without executing the tasks. Unit tests are configured for and connected to their respective flows. To create a new unit test, use either the **Tests** tab in the left-hand side panel of the Kestra UI or the **Tests** tab of a flow. When creating tests, you can open the YAML for both the test and its flow side by side.
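At its core, a test file names the flow it targets and declares a list of `testCases`, each combining optional `fixtures` with `assertions`. The following minimal sketch illustrates that shape — the flow, task, and output identifiers here are purely illustrative placeholders:

```yaml
id: myflow_test                 # illustrative test id
namespace: company.team         # must match the flow's namespace
flowId: myflow                  # the (hypothetical) flow under test
testCases:
  - id: output_should_not_be_null
    type: io.kestra.core.tests.flow.UnitTest
    fixtures:
      tasks:
        - id: notify            # mocked: this task will not actually run
          description: "don't send the real notification"
    assertions:
      - value: "{{ outputs.transform.value }}"
        isNotNull: true
```

Concrete examples built on a real flow follow in the Configuration section below.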
Once tests are created, they can all be viewed from the **Tests** tab with their respective Id, Namespace, Tested Flow, and current State listed. Additionally, tests can be run from this view with expandable results. ![Tests Interface](./unit-test-interface.png) The following diagram illustrates the structure of flows and unit tests together in Kestra: ![Tests Tree Diagram](./unittest.png) ## Configuration Unit tests are written in YAML, like flows. A test is made up of `testCases`, and each test case is made up of `fixtures` and `assertions`. Fixtures can target **files**, **inputs**, **tasks**, or **triggers** depending on what you need to mock or override. Like flows, you can write unit tests as code, in No Code, or with the [AI Copilot](../../../ai-tools/ai-copilot/index.md). - A **fixture** refers to the setup required before a test runs, such as initializing objects or configuring environments, to ensure the test has a consistent starting state. - An **assertion** is a statement that checks whether a specific condition is true during the test. If the condition is false, the test fails, indicating an issue with the code being tested; if it is true, the expectation is met. Common fixture types: - **files**: provide inline files or namespace file URIs the flow can read. - **inputs**: set flow input values without changing the flow definition. - **tasks**: skip or mock task execution, override outputs, or force a state. - **triggers**: simulate an incoming event (e.g., webhook payload) that starts the flow. :::alert{type="warning"} If you don't specify any fixtures, the test will run the entire flow as in production, executing all tasks and producing outputs as usual. ::: The following flow: 1. sends a Slack message when it starts 2. extracts data from an API 3. transforms the returned data 4. 
loads the transformed data into BigQuery ```yaml id: etl_daily_products_bigquery namespace: company.team tasks: - id: send_slack_message_started type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "https://kestra.io/api/mock" # To use this example, replace the url with your own Slack webhook payload: | { "text": "{{ flow.namespace }}.{{ flow.id }}: Daily products flow has started" } - id: extract type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/json/orders.json - id: transform_to_products_name type: io.kestra.plugin.core.debug.Return format: "{{ fromJson(read(outputs.extract.uri)) | jq('.Account.Order[].Product[].\"Product Name\"') }}" - id: transform_to_uppercase type: io.kestra.plugin.core.debug.Return format: "{{ fromJson(outputs.transform_to_products_name.value) | upper }}" - id: load type: io.kestra.plugin.gcp.bigquery.Load from: "{{ outputs.transform_to_uppercase.value }}" destinationTable: "my_project.my_dataset.my_table" format: JSON ``` This example test suite shows two common patterns: letting one transformation run normally while mocking side effects, and fully mocking an upstream task to isolate a downstream transformation. 
```yaml id: etl_daily_products_bigquery_testsuite namespace: company.team flowId: etl_daily_products_bigquery testCases: - id: extract_should_return_data type: io.kestra.core.tests.flow.UnitTest fixtures: tasks: - id: send_slack_message_started description: "don't send Slack message" - id: load description: "don't load data into BigQuery" assertions: - value: "{{outputs.transform_to_uppercase.value}}" isNotNull: true - id: extract_should_transform_product_names_to_uppercase_mocked type: io.kestra.core.tests.flow.UnitTest fixtures: tasks: - id: send_slack_message_started description: "don't send Slack message" - id: load description: "don't load data into BigQuery" - id: extract description: "don't fetch data from API" - id: transform_to_products_name outputs: value: | [ "my-product-1" ] assertions: - value: "{{outputs.transform_to_uppercase.value}}" contains: "MY-PRODUCT-1" ``` The `id` is unique to the test suite, and the `namespace` and `flowId` must match the intended flow. When you create a test from a flow, those values are filled in automatically. The `testCases` property contains the `fixtures` and `assertions` for each test case. In the first test case, `extract_should_return_data`, the `fixtures` replace the Slack alert and the BigQuery data load, so the test neither clutters a Slack channel with test alert messages nor a BigQuery table with test data, while still testing the overall design of the flow. The `assertions` property defines the conditions for success or failure. In the example, the test aims to ensure that the outputs from the `transform_to_uppercase` task are not null. After running the test, we can see the results for the `extract_should_return_data` test by expanding the results. ![Test case 1 results](./test-case-1.png) The assertion passed because the `extract` task downloaded data from the API and returned product names, so the asserted value was not null. 
Additionally, since we did not include a fixture for the `transform_to_uppercase` task, we can see that the returned product names were also transformed successfully to uppercase in the assertion's actual result. Because we wrote the test suite with two test cases, both executed during the run. For more isolation, you could separate test cases into multiple tests of the flow as needed. While we know from the previous test that the uppercase transformation was successful, you may not want to extract actual data during testing, as it could add load to an external service or send unnecessary alerts. To mitigate this and test only the transformation, we added the `extract` and `transform_to_products_name` fixtures in the second test case, `extract_should_transform_product_names_to_uppercase_mocked`. The `extract` fixture prevents the API call, and the `transform_to_products_name` fixture simulates the task's return value with a mock output, `my-product-1`, all in lowercase. After running, we can see that the assertion passed: the actual result, `MY-PRODUCT-1`, was transformed to uppercase and matches the expected result defined in the `assertions` property of the test. ![Test case 2 results](./test-case-2.png) Unlike regular flow runs, test executions are not stored on the Executions page, which avoids cluttering that space with unnecessary execution details. To view an execution made from a test, you can open the test case and click on the link for the ExecutionId. ![Test Execution Details](./test-execution.png) ## Unit test with a namespace file You can also simulate flows with namespace files that contain scripts, test data, or any other file content. In the previous example, you can add a namespace file that contains sample data from the production API endpoint so you do not need to make any API calls during testing. This avoids extra cost and unnecessary calls to external services. 
Use the following flow: ```yaml id: etl_download_file namespace: company.team tasks: - id: extract type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/json/orders.json method: GET - id: transform_to_products_name type: io.kestra.plugin.core.debug.Return format: "{{ fromJson(read(outputs.extract.uri)) | jq('.Account.Order[].Product[].\"Product Name\"') }}" - id: transform_to_uppercase type: io.kestra.plugin.core.debug.Return format: "{{ fromJson(outputs.transform_to_products_name.value) | upper }}" - id: load_result_to_outgoing_api type: io.kestra.plugin.core.log.Log message: "{{ outputs.transform_to_uppercase.this_task_should_not_be_run }}" ``` Then add a namespace file in the `company.team` namespace that mimics the API response format. For example, add `my-namespace-file-with-products.json` to the `company.team` namespace: ```json { "Account": { "Account Name": "Firefly", "Order": [ { "OrderID": "order103", "Product": [ { "Product Name": "Bowler Hat", "ProductID": 858383, "SKU": "0406654608", "Description": { "Colour": "Purple", "Width": 300, "Height": 200, "Depth": 210, "Weight": 0.75 }, "Price": 34.45, "Quantity": 2 }, { "Product Name": "Trilby hat", "ProductID": 858236, "SKU": "0406634348", "Description": { "Colour": "Orange", "Width": 300, "Height": 200, "Depth": 210, "Weight": 0.6 }, "Price": 21.67, "Quantity": 1 } ] } ] } } ``` This test uses the namespace file as mocked task output so the transformation runs against sample data instead of making the API request: ```yaml id: etl_mockfile_from_ns namespace: company.team flowId: etl_download_file testCases: - id: extract_should_transform_productNames_to_uppercase_with_mocked_file type: io.kestra.core.tests.flow.UnitTest fixtures: tasks: - id: extract description: "mock extract data file" outputs: uri: "{{ fileURI('my-namespace-file-with-products.json') }}" # this file is a namespace file in the same namespace - id: load_result_to_outgoing_api description: 
"don't send end output" assertions: - value: "{{outputs.transform_to_uppercase.value}}" equalTo: "[BOWLER HAT, TRILBY HAT]" ``` With a combination of namespace files and tests, you can target specific components of your flow for correct functionality without consuming external resources or unnecessarily communicating with external hosts for scripts or files. ## Inline file fixture If you prefer not to use a namespace file for the file fixture in the test, you can also write the file contents inline with the `files` property to achieve the same result: ```yaml id: etl_mockfile_from_ns namespace: company.team flowId: etl_download_file testCases: - id: extract_should_transform_product_names_to_uppercase_with_mocked_file type: io.kestra.core.tests.flow.UnitTest fixtures: files: products.json: | { "Account": { "Account Name": "Firefly", "Order": [ { "OrderID": "order103", "Product": [ { "Product Name": "Bowler Hat", "ProductID": 858383, "SKU": "0406654608", "Description": { "Colour": "Purple", "Width": 300, "Height": 200, "Depth": 210, "Weight": 0.75 }, "Price": 34.45, "Quantity": 2 }, { "Product Name": "Trilby hat", "ProductID": 858236, "SKU": "0406634348", "Description": { "Colour": "Orange", "Width": 300, "Height": 200, "Depth": 210, "Weight": 0.6 }, "Price": 21.67, "Quantity": 1 } ] } ] } } tasks: - id: extract description: "mock extract data file" outputs: # the URI references the inline file defined in the `files` fixture above uri: "{{files['products.json']}}" ``` ## Trigger fixture example When your flow is kicked off by a trigger, you can mock the trigger payload directly in the test so you don't have to hit the real endpoint. The example below stubs a webhook trigger payload and asserts that the flow receives it as expected. 
Example flow: ```yaml id: return-flow-webhook namespace: io.kestra.tests triggers: - id: webhook type: io.kestra.plugin.core.trigger.Webhook key: webhook tasks: - id: return_summary type: io.kestra.plugin.core.output.OutputValues values: body: "{{ trigger.body }}" ``` Example unit test: ```yaml id: simple-webhook-test-suite-1-id namespace: io.kestra.tests description: assert flow is returning the input value as output flowId: return-flow-webhook testCases: - id: test_case_1 type: io.kestra.core.tests.flow.UnitTest fixtures: trigger: id: webhook type: io.kestra.plugin.core.trigger.Webhook variables: body: webhook assertions: - value: "{{ trigger.body }}" equalTo: "webhook" ``` What this test does: it mocks the webhook trigger payload (`body: webhook`), skips any real HTTP call, runs the flow once, and asserts that the flow receives the mocked payload via `trigger.body`. Because fixtures create a transient execution, the test is fast, isolated, and leaves no execution history clutter. ## Mock task output files When testing flows that include script tasks (such as Shell, Python, or other scripts) that generate output files, you can mock these output files in your test fixtures. This is particularly useful when: - You want to test downstream tasks that parse or process output files without running the actual script - The script is expensive or time-consuming to execute - You want to test specific edge cases by providing controlled output file content For example, consider a flow where a shell script generates an output file that is later processed by another task: ```yaml id: shell_output namespace: company.team tasks: - id: generate_output_file type: io.kestra.plugin.scripts.shell.Script taskRunner: type: io.kestra.plugin.core.runner.Process outputFiles: - out.txt script: | echo "Processing data..." 
> out.txt echo "Result: SUCCESS" >> out.txt - id: parse_output type: io.kestra.plugin.core.log.Log message: "Output file content: {{ read(outputs.generate_output_file.outputFiles['out.txt']) }}" ``` You can create a unit test that mocks the output file content without executing the shell script: ```yaml id: test_shell_output flowId: shell_output namespace: company.team testCases: - id: mock_shell_output type: io.kestra.core.tests.flow.UnitTest description: Mock shell script output file to test downstream processing fixtures: files: mocked_output.txt: | Processing data... Result: SUCCESS tasks: - id: generate_output_file state: SUCCESS description: "don't run the shell script, mock its output" outputs: outputFiles: out.txt: "{{files['mocked_output.txt']}}" assertions: - value: "{{ outputs.generate_output_file.outputFiles['out.txt'] }}" isNotNull: true ``` In this example: 1. The `files` property defines inline file content (`mocked_output.txt`) that will be used as the mocked output 2. The task fixture for `generate_output_file` specifies `state: SUCCESS` to mark the task as successful without execution 3. The `outputs.outputFiles` property maps the expected output file name (`out.txt`) to the mocked file content using the `files` reference 4. Downstream tasks can read the mocked output file as if the script had actually run This approach allows you to test the complete flow logic while avoiding the overhead and complexity of executing actual scripts during testing. ## Available assertion operators While the above example uses `isNotNull` and `contains` as assertion operators, there are many more that can be used when designing unit tests for your flows. The complete list is as follows: | **Operator** | **Description of the assertion operator** | | -------------------- | ------------------------------------------------------------------------------------------------- | | isNotNull | Asserts the value is not null, e.g. 
`isNotNull: true` | | isNull | Asserts the value is null, e.g. `isNull: true` | | equalTo | Asserts the value is equal to the expected value, e.g. `equalTo: 200` | | notEqualTo | Asserts the value is not equal to the specified value, e.g. `notEqualTo: 200` | | endsWith | Asserts the value ends with the specified suffix, e.g. `endsWith: .json` | | startsWith | Asserts the value starts with the specified prefix, e.g. `startsWith: prod-` | | contains | Asserts the value contains the specified substring, e.g. `contains: success` | | greaterThan | Asserts the value is greater than the specified value, e.g. `greaterThan: 10` | | greaterThanOrEqualTo | Asserts the value is greater than or equal to the specified value, e.g. `greaterThanOrEqualTo: 5` | | lessThan | Asserts the value is less than the specified value, e.g. `lessThan: 100` | | lessThanOrEqualTo | Asserts the value is less than or equal to the specified value, e.g. `lessThanOrEqualTo: 20` | | in | Asserts the value is in the specified list of values, e.g. `in: [200, 201, 202]` | | notIn | Asserts the value is not in the specified list of values, e.g. `notIn: [404, 500]` | ## Assert on execution outputs In addition to asserting on task outputs, you can assert on flow-level execution outputs. To assert on execution outputs, use the `{{ execution.outputs.your_output_id }}` syntax in your test assertions. This allows you to verify that flow outputs match the expected values. 
The following example assumes there is a flow that outputs a value: ```yaml id: flow_outputs_demo namespace: demo tasks: - id: mytask type: io.kestra.plugin.core.output.OutputValues values: myvalue: kestra outputs: - id: myvalue type: STRING value: "{{ outputs.mytask.values.myvalue }}" ``` Then, create a unit test for this flow that asserts the output value as follows: ```yaml id: test_flow_outputs_demo flowId: flow_outputs_demo namespace: demo testCases: - id: flow_output type: io.kestra.core.tests.flow.UnitTest assertions: - value: "{{ execution.outputs.myvalue }}" equalTo: kestra ``` When you run this test, Kestra will execute the flow and verify that the output value matches the expected value. If the assertion fails, the test will be marked as failed, and you can inspect the execution logs to see what went wrong. --- # Worker Isolation in Kestra Enterprise: Separation URL: https://kestra.io/docs/enterprise/governance/worker-isolation > Enforce security with Worker Isolation in Kestra. Isolate execution environments, file systems, and processes for secure multi-tenant operations. How to configure worker isolation in Kestra. ## Worker isolation – enforce separation When dealing with multiple teams, you can add extra security measures to your Kestra instance to isolate access so that there is no shared file system, only certain plugins can create worker threads, and script tasks are isolated. ## Java security By default, Kestra uses a shared worker to handle workloads. This is fine for most use cases. However, when using a shared Kestra instance between multiple teams, this can allow people to access temporary files created by Kestra with powerful tasks like [Groovy](/plugins/plugin-script-groovy), [GraalVM Python](/plugins/plugin-graalvm/python-graalvm), and more. This is because the worker shares the same file system. 
You can use the following to opt in to real isolation of file systems using advanced Kestra EE Java security: ```yaml kestra: ee: javaSecurity: enabled: true forbiddenPaths: - /etc/ authorizedClassPrefix: - io.kestra.plugin.core - io.kestra.plugin.gcp ``` To only limit access to certain plugins on a Worker without requiring file path protection, you can also consider configuring Kestra with [Allowed & Restricted plugins](../allowed-plugins/index.md). ### `kestra.ee.java-security.forbidden-paths` This is a list of paths on the file system that the Kestra Worker will be forbidden to read or write to. This can help protect [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) files and ensure security for audits and compliance. With this property configured, you can reduce the directories that a Worker can access, such as protecting the folders where the global Kestra configuration or `~/.aws/credentials` are stored. ### `kestra.ee.java-security.authorized-class-prefix` This is a list of classes that can create threads. Here you can set a list of class prefixes (namespaces) that will be allowed; all others will be refused. For example, [GCP plugins](/plugins/plugin-gcp) need to create a thread in order to reach the GCP API. Since this whole plugin is deemed safe, you can whitelist it. ### `kestra.ee.java-security.forbidden-class-prefix` This is a list of classes that can't create any threads. Other plugins will be authorized. ```yaml kestra: ee: javaSecurity: enabled: true forbiddenClassPrefix: - io.kestra.plugin.scripts ``` :::alert{type="warning"} Currently, all the official Kestra plugins are safe to be whitelisted **except** [all scripts plugins](../../../16.scripts/00.languages/index.md), since they allow custom code that can read from and write to the file system. Do not add these to the `authorized-class-prefix`. 
::: ## Scripting isolation You can provide global plugin defaults using the `kestra.plugins.defaults` configuration. Those will be applied to each task on your cluster **if a property is not defined** on flows or tasks. Plugin defaults ensure a property is defined at a default value for these tasks. ```yaml kestra: plugins: defaults: - type: io.kestra.plugin.core.log.Log values: level: ERROR ``` For [Bash tasks](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.script) and other script tasks in the core, we advise you to enforce `io.kestra.plugin.scripts.runner.docker.Docker` isolation by configuring forced global plugin defaults: ```yaml kestra: plugins: defaults: - type: io.kestra.plugin.scripts.shell.Commands forced: true values: containerImage: ubuntu:latest taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker ``` Forced plugin defaults: - Ensure a property is set globally for a task, and no task can override it. - Are critical for security and governance — for example, to enforce Shell tasks to run as Docker containers. :::alert{type="warning"} You will need to add forced defaults for all script plugin tasks (such as Python and Node) to be sure that no task can bypass the Docker isolation. ::: --- # Instance Management in Kestra Enterprise: Health URL: https://kestra.io/docs/enterprise/instance > Manage your Kestra Instance. Monitor service health, handle upgrades, configure maintenance mode, and manage global settings from a centralized dashboard. import ChildCard from "~/components/docs/ChildCard.astro" The Instance menu gives you a centralized view of your Kestra deployment's health along with controls for upgrades, maintenance, and user notifications.
## Overview The **Instance** menu provides centralized control over your platform’s infrastructure so administrators can monitor service health, manage configurations, and communicate announcements (like planned maintenance downtime) to users without relying on additional observability tools. The **System Overview** tab gives a high-level snapshot of your instance’s operational status. Here, you can: - View **License Details**: validity, usage statistics, and installed secrets and storage plugins. ![Instance System Overview](./instance-system-overview.png) ## Services Kestra tracks the health of critical components, including: - **Workers**: Execute tasks. - **Schedulers**: Trigger workflows. - **Executors**: Manage task execution. - **Webservers**: Host the UI and API. ![Instance Overview Table](./instance-table.png) Each service displays: - **State**: Whether the service is active (`RUNNING`) or unresponsive. - **Host Name**: Identifier of the server/pod (e.g., `kafka-ee-preview-79fb7755f8-zhlhq`). - **Server Type**: For example, STANDALONE. - **Version** - **Start Date** - **Health Check Date** ### Service states - **RUNNING** — service is active and healthy. - **MAINTENANCE** — intentionally paused; executions are blocked until resumed. - **DISCONNECTED** — lost a required dependency (e.g., DB/queue) and may recover or shut down. - **TERMINATING** — shutting down; tries to drain work before stopping. - **TERMINATED_GRACEFULLY / TERMINATED_FORCED** — shutdown completed (clean or forced), moves to `NOT_RUNNING`. - **NOT_RUNNING / INACTIVE** — post-shutdown, final states reported in the UI. States come from the service lifecycle. Typical flow: - **CREATED** → **RUNNING** after a clean start; **MAINTENANCE** can be set directly for planned work. - **DISCONNECTED** signals a lost dependency (DB/queue) and may recover to **RUNNING** or proceed to shutdown. 
- **TERMINATING** attempts graceful stop; it ends in **TERMINATED_GRACEFULLY** or **TERMINATED_FORCED**, then **NOT_RUNNING** → **INACTIVE**. ### Server information and liveness Click on a service instance to view technical details useful for debugging: - **Hostname**: Identifier of the server/pod (e.g., `kafka-ee-preview-79fb7755f8-zhlhq`). - **Session Timeout**: Time before an unresponsive service is marked offline (e.g., `60 seconds`). - **Heartbeat Interval**: The expected time between heartbeats. - **Last Heartbeat**: Timestamp of the latest health check. - **Termination Grace Period**: The expected time for this service to complete all its tasks before initiating a graceful shutdown. ![Services Overview](./services-overview.png) Additional tabs include **Configuration**, which displays port configuration, and **Metrics**, such as CPU Usage and Executor Thread Count: ![Service Metrics](./service-metrics.png) An **Events Timeline** gives an overview of the service's lifecycle: ![Service Events](./service-events.png) ## Announcements Notify users about planned maintenance or updates: 1. **Create Announcements**: Specify a title, message, and date range. 2. **Choose Type**: Define the severity of the announcement (e.g., `info`, `warning`, `error`). [Announcements](./announcements/index.md) appear in the UI during the selected period, ensuring users stay informed. ## Maintenance Mode From the **Instance - Services** tab, enter [Maintenance Mode](../05.instance/maintenance-mode/index.md) to temporarily pause all workflows and services for upgrades: - Services enter a paused state, and new executions are blocked. - Combine it with the Announcements feature so users see a maintenance banner while running workflows gracefully terminate. ## Worker Groups Create [Worker Groups](../04.scalability/worker-group/index.md) to isolate workloads or delegate tasks to specific workers: - **Add Worker Groups**: Define groups with specific resource limits or labels. 
- **Assign Tasks**: Route workflows to designated groups via a worker group key within a task or trigger.

## Audit Logs

View [Audit Logs](../02.governance/06.audit-logs/index.md) at a glance to monitor actions taken by users on all resource types in the instance.

![Instance Audit Logs](./instance-audit-logs.png)

## Versioned Plugins

View all installed [Versioned Plugins](../05.instance/versioned-plugins/index.md) on the instance and upgrade, install, or uninstall as needed.

![Instance Versioned Plugins](./instance-versioned-plugins.png)

---
# Announcements in Kestra Enterprise: In-App Banners
URL: https://kestra.io/docs/enterprise/instance/announcements

> Broadcast messages with Kestra Announcements. Create in-app banners to notify users about maintenance, updates, or important system information.

Communicate planned maintenance or incidents with in-app banners.
## Announcements – in-app banners

Announcements allow you to notify your users about any important events such as planned maintenance downtime.

## How to create an announcement

To add a custom in-app banner, go to the **Instance → Announcements** tab.

![Announcement Tab](./instance-announcements.png)

As a user with an Admin role, you can configure the following within each announcement:

- **Message**: the text to display in the banner
- **Type**: the type of banner to display (**INFO, WARNING, ERROR**)
- The **START** and **END** dates during which the announcement should be displayed.

![Create Announcement](./create-announcement.png)

![Display Announcement](./display-announcement.png)

---
# Kill Switch in Kestra Enterprise: Stop Executions
URL: https://kestra.io/docs/enterprise/instance/kill-switch

> Use Kill Switch in Kestra Enterprise to immediately kill, cancel, or ignore executions by scope, with scheduling, audit logs, and in-app banners.

Kill Switch is an operational safety lever that lets administrators stop misbehaving executions directly from the UI.

## Why a Kill Switch exists

A runaway flow, a bad deployment, or a tenant-specific incident can flood workers with problematic executions. The Kill Switch lets administrators halt or quarantine those executions instantly, without pausing the entire platform or touching infrastructure. Use it when you need to:

- Contain impact quickly while you ship a fix or rollback.
- Target only the affected tenant/namespace/flow/execution instead of stopping everything.
- Keep an auditable record of who intervened, when, and why.
- Surface a visible banner so impacted users know what happened.

Kill Switch replaces the CLI-only `--skip-executions` and `--skip-flows` flags with a scoped, auditable administration interface.

## Configure a Kill Switch

To configure a Kill Switch, navigate to your **Instance → Kill Switch** section in Kestra.
From there, name the Kill Switch (e.g., `Kill Switch – Payments Namespace Outage (TEMP)`) and configure the switch's specifications.

![Create a Kill Switch](./create-kill-switch.png)

### Kill Switch types

| Type | Behavior |
|------|----------|
| **KILL** | Kills running executions after the current task completes; any remaining tasks in the execution will not run. New executions are transitioned to the `KILLED` state instantly. |
| **CANCEL** | Blocks new executions; lets current task runs finish before marking the execution `CANCELLED`. |
| **IGNORE** | Ignores all messages for matching executions—use as a last resort when an execution cannot be killed or cancelled. |

For **KILL** and **CANCEL**, executions receive a [system label](../../../06.concepts/system-labels/index.md) identifying which Kill Switch applied.

### Scope

Scope sets the reach of the Kill Switch, with **Tenant** being the most inclusive and **Execution** the most specific. A **Namespace** scope requires a **Tenant**, and a **Flow** scope requires a **Tenant** and **Namespace**. The UI automatically adjusts to show only the relevant scope requirements depending on your first selection. All possible scopes are listed below:

- **Tenant**
- **Namespace**
- **Flow**
- **Execution**

### Scheduling

The Kill Switch requires a **Start Date** and can be kept open-ended if needed.

- Mandatory **Start Date** (default: now)
- Optional **End Date**
- Enable/disable from the **Kill Switch** tab at any time

### Description

Admins can optionally include a free-text reason stored with the Kill Switch and surfaced in banners to document the incident or change request.

## Lifecycle and audit

Creation and updates are written to [**Audit Logs**](../../02.governance/06.audit-logs/index.md), and every state change—create, enable, disable, or archive—is recorded. Deleting a Kill Switch performs a soft delete, so the archived entry remains visible for traceability.
## Announcement banner Kill Switches raise contextual banners to alert affected users. A namespace-scoped Kill Switch shows the banner only to users working in that namespace, while a tenant-scoped one surfaces the banner across the UI for all users in the tenant. ## CLI compatibility The CLI remains for open-source parity, with renamed flags to match the behavior: ```bash # Old --skip-executions / --skip-flows # New --ignore-executions / --ignore-flows ``` ## Relationship to maintenance mode [Maintenance Mode](../maintenance-mode/index.md) pauses the platform broadly (queues new executions, lets running ones finish). Kill Switch keeps services up and targets specific tenants/namespaces/flows/executions to stop or ignore problematic runs—an operational tool rather than a platform pause. --- # Maintenance Mode in Kestra Enterprise: Safe Upgrades URL: https://kestra.io/docs/enterprise/instance/maintenance-mode > Safely upgrade with Kestra Maintenance Mode. Pause new executions while allowing running tasks to complete for seamless system updates. Prepare your Kestra instance for maintenance or migration. Maintenance Mode is an enterprise feature designed to transition your Kestra instance into a paused state to conduct maintenance operations such as platform updates.
## Maintenance mode – pause for upgrades

Maintenance Mode addresses a common challenge faced by organizations running numerous workflows: finding the right moment to perform platform updates without disrupting ongoing operations.

When activated, Maintenance Mode introduces a controlled state where:

- The [executor](../../../08.architecture/02.server-components/index.md#executor) stops processing new executions and automatically queues new flow executions.
- Existing executions are allowed to complete gracefully ([workers](../../../08.architecture/02.server-components/index.md#worker) complete their current tasks without picking up new ones).
- The platform continues to accept and schedule new executions, storing them for later processing ([web server](../../../08.architecture/02.server-components/index.md#webserver) and [scheduler](../../../08.architecture/02.server-components/index.md#scheduler) components remain active, ensuring no requests are lost).
- New executions are queued for processing after maintenance concludes.

## Access maintenance mode

Maintenance Mode is accessible via the **Instance** menu section of the Kestra UI. You can switch to maintenance mode in the **Services** tab by clicking the **enter maintenance mode** button. This triggers a confirmation prompt and displays information regarding the transition into maintenance mode.

![Enter Maintenance Mode](./maintenance-mode.png)

After completing all maintenance operations, you can exit maintenance mode with the same button and confirm that you want to switch back to a live state of your Kestra instance.

---
# Versioned Plugins in Kestra Enterprise: Multi-Version
URL: https://kestra.io/docs/enterprise/instance/versioned-plugins

> Manage plugin versions in Kestra Enterprise. Install multiple versions of the same plugin to support legacy flows while upgrading others safely.

Use multiple versions of a plugin depending on your instance requirements and upgrade path.
## Versioned plugins – manage plugin upgrades
Versioned plugins simplify the upgrade process. They allow you to pin older plugin versions to your production and legacy flows while using the latest version for newer flows, enabling granular version management in your Kestra instance.

## Configuration

Versioned plugins support several properties that can be modified in your Kestra configuration:

- `remoteStorageEnabled`: Specifies whether remote storage is enabled (i.e., plugins are stored on the internal storage).
- `localRepositoryPath`: The local path where managed plugins will be synced.
- `autoReloadEnabled`: Whether the server should periodically rescan repositories for new or removed plugins.
- `autoReloadInterval`: How often to rescan (duration, e.g., `60s`).
- `defaultVersion`: The version to use when none is specified in a flow. Accepted values: `LATEST`, `CURRENT`, `OLDEST`, `NONE`, or an explicit version (e.g., `0.20.0`).

An example configuration looks as follows:

```yaml
kestra:
  plugins:
    management:
      enabled: true # setting to false will make the Versioned Plugins tab disappear + API will return an error
      remoteStorageEnabled: true
      customPluginsEnabled: true # setting to false will disable installing or uploading custom plugins
      localRepositoryPath: /tmp/kestra/plugins-repository
      autoReloadEnabled: true
      autoReloadInterval: 60s
      defaultVersion: LATEST
```

### Allow-list URLs

To use Versioned Plugins properly, the following three URLs must be allowed by your network configuration:

- https://repo.maven.apache.org/maven2/
- https://registry.kestra.io/maven/
- https://api.kestra.io/

A default configuration looks like:

```yaml
kestra:
  plugins:
    repositories:
      central:
        url: https://repo.maven.apache.org/maven2/
      kestra:
        url: https://registry.kestra.io/maven
```

Refer to the [Plugins and Execution](../../../configuration/04.plugins-and-execution/index.md) page in the Configuration guide for custom Maven repositories.
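If your instance pulls plugins through a private mirror, the same `repositories` block can point at it. Below is a sketch assuming a hypothetical internal repository; the `internal` name and Nexus URL are placeholders, not Kestra defaults:

```yaml
kestra:
  plugins:
    repositories:
      central:
        url: https://repo.maven.apache.org/maven2/
      kestra:
        url: https://registry.kestra.io/maven
      internal: # hypothetical private repository name
        url: https://nexus.example.com/repository/maven-releases/ # hypothetical URL
```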
With remote storage enabled, installed plugins are stored in a plugins repository under the `_plugins/repository` path. For example, the paths below show the storage for the 0.19.0 and 0.20.0 versions of the Shell script plugin:

```bash
_plugins/repository/io_kestra_plugin__plugin-script-shell__0_19_0
_plugins/repository/io_kestra_plugin__plugin-script-shell__0_19_0.jar
_plugins/repository/io_kestra_plugin__plugin-script-shell__0_20_0
_plugins/repository/io_kestra_plugin__plugin-script-shell__0_20_0.jar
```

Artifact files are renamed using the format `groupId__artifactId__version` so they are easily parseable (dots `.` are replaced with `_` in `groupId` and `version`).

For locally stored plugins, the path is configured by the `localRepositoryPath` attribute (e.g., `/tmp/kestra/plugins-repository`). The local repository also contains a JSON `plugins.meta` file with metadata about remote plugins; it is used for synchronization, so only plugins with detected changes are synced. For example:

```bash
├── io_kestra_plugin__plugin-kafka__0_20_0.jar
├── io_kestra_plugin__plugin-script-shell__0_20_0.jar
├── io_kestra_plugin__plugin-terraform__0_20_0.jar
├── io_kestra_plugin__plugin-transform-grok__0_20_0.jar
└── plugins.meta
```

## Configuration for EE-specific plugins

Some plugins are available only in the Enterprise Edition (EE) of Kestra. To install EE-specific plugins, make sure that your [Enterprise and Advanced configuration](../../../configuration/06.enterprise-and-advanced/index.md) has the `kestra.ee.license.fingerprint` property set (in addition to the `kestra.ee.license.id` and `kestra.ee.license.key` properties). The `kestra.ee.license.fingerprint` property is used to verify that the EE license is valid and allows you to use EE-specific plugins.

## Install versioned plugins

Versioned plugins can be installed from the Kestra UI as well as programmatically.
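The renaming scheme can be sketched as a small shell helper. This is illustrative only; `to_repo_name` is a hypothetical function, not a Kestra command:

```shell
# Illustrative sketch of the artifact naming scheme described above.
# to_repo_name is a hypothetical helper, not part of Kestra.
to_repo_name() {
  local group="${1//./_}"   # dots in groupId become underscores
  local artifact="$2"       # artifactId is kept as-is
  local version="${3//./_}" # dots in version become underscores
  echo "${group}__${artifact}__${version}.jar"
}

to_repo_name io.kestra.plugin plugin-script-shell 0.19.0
# → io_kestra_plugin__plugin-script-shell__0_19_0.jar
```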
### From the UI Below is a video demonstration walking through each step from installation to application in a flow.
Here are the steps again, listed one by one. Both Kestra official plugins and custom plugins can be installed from the UI.

Navigate to the **Instance > Versioned Plugins** section. Click **+ Install** to open the full library of available plugins.

![versioned-plugins-1](./versioned-plugins-1.png)

From the list, search for the plugin to install and select its version.

![versioned-plugins-2](./versioned-plugins-2.png)

After installing plugins, the full list of versioned plugins is displayed. Kestra alerts you when a newer version of your plugin is available and allows you to upgrade by installing the latest version. When upgrading, the previous version of the plugin is preserved, and a separate, fresh installation of the latest version is added.

![versioned-plugins-3](./versioned-plugins-3.png)

For a custom plugin, after clicking **+ Install**, switch from Official plugin to Custom plugin. You need to specify two identifiers for each custom plugin installation:

- Group ID: The group identifier of the plugin to be installed.
- Artifact ID: The artifact identifier of the plugin to be installed.

![versioned-plugins-5](./versioned-plugins-4.png)

Instead of installing a new plugin, you can **Upload** a plugin by choosing a valid Java archive file (`.jar`).

![versioned-plugins-4](./versioned-plugins-5.png)

### From the API

Only Super Admin users can install versioned plugins with the API. To install a versioned plugin, send a POST request authenticated either with your username and password via `-u` or with an [API token](../../03.auth/api-tokens/index.md).
With Kestra username and password:

```bash
curl -X POST http://0.0.0.0:8080/api/v1/cluster/versioned-plugins/install \
-u 'admin@kestra.io:kestra' \
-H "Content-Type: application/json" \
-d '{"plugins":["io.kestra.plugin:plugin-airbyte:0.21.0"]}'
```

With an API token:

```bash
curl -X POST http://0.0.0.0:8080/api/v1/cluster/versioned-plugins/install \
-H "Authorization: Bearer YOUR-API-TOKEN" \
-H "Content-Type: application/json" \
-d '{"plugins":["io.kestra.plugin:plugin-airbyte:0.21.0"]}'
```

To uninstall a versioned plugin, use the following DELETE request:

```bash
curl -X DELETE http://0.0.0.0:8080/api/v1/cluster/versioned-plugins/uninstall \
-u 'admin@kestra.io:kestra' \
-H "Content-Type: application/json" \
-d '{"plugins":["io.kestra.plugin:plugin-airbyte:0.21.0"]}'
```

To list all available versions of a plugin, use the resolve endpoint:

```bash
curl -X POST http://0.0.0.0:8080/api/v1/cluster/versioned-plugins/resolve \
-u 'admin@kestra.io:kestra' \
-H "Content-Type: application/json" \
-d '{"plugins":["io.kestra.plugin:plugin-airbyte:0.21.0"]}'
```

If you want to install a newer plugin version, use the install request with the specified version or use `LATEST` instead of the version number. This creates a second, separate installation of the plugin, so you can keep using an old version in production flows and test the newer version in development.
```bash
curl -X POST http://0.0.0.0:8080/api/v1/cluster/versioned-plugins/install \
-u 'admin@kestra.io:kestra' \
-H "Content-Type: application/json" \
-d '{"plugins":["io.kestra.plugin:plugin-airbyte:LATEST"]}'
```

### From the CLI

To install versioned plugins from the [Kestra CLI](../../../kestra-cli/kestra-server/index.md), use the following command:

```bash
./kestra plugins install --locally=false io.kestra.plugin:plugin-jdbc-duckdb:0.21.2
```

The `--locally` flag specifies whether the plugin should be installed locally or according to your Kestra configuration, where remote storage can be enabled.

- `--locally=true` installs the plugin locally.
- `--locally=false` checks whether `remoteStorageEnabled` is set; if so, plugins are downloaded and pushed directly to the [configured runtime and storage backend](../../../configuration/02.runtime-and-storage/index.md).

## `version` property in a Flow

In Flow tasks or triggers, you can specify the version of the plugin to use with the `version` property. For example, if the instance has both the 0.21.0 and 0.22.0 versions of the Shell script plugin installed, the version to use can be specified in the flow as follows:

```yaml
id: shell_script_example
namespace: company.team

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: shell_script_task
    type: io.kestra.plugin.scripts.shell.Script
    version: "0.21.0"
    outputFiles:
      - first.txt
    script: |
      echo "The current execution is : {{ execution.id }}"
      echo "1" >> first.txt
      cat {{ outputs.http_download.uri }}
```

The `version` property also accepts special, case-insensitive values, as in the configuration file:

- `LATEST` (or `latest`): To use the latest available version of a Kestra plugin.
- `OLDEST` (or `oldest`): To use the oldest available version of a Kestra plugin.
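Besides pinning the version on a single task, a flow-level default can pin every task of a given type via `pluginDefaults`. A minimal sketch, where the flow ID and the pinned version are illustrative:

```yaml
id: legacy_flow # illustrative flow
namespace: company.team

pluginDefaults:
  # pin every Shell script task in this flow to 0.21.0 (illustrative version)
  - type: io.kestra.plugin.scripts.shell.Script
    values:
      version: "0.21.0"

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.shell.Script
    script: echo "runs with the pinned plugin version"
```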
When multiple versions of a plugin are available, Kestra resolves the version to use by following this priority order:

1. **Task-Level**: Using the version specified in the `version` property.
2. **Flow-Level**: Using the plugin's default version defined at the flow level.
3. **Namespace-Level**: Using the plugin's default version defined for the namespace.
4. **Instance-Level**: Using the value set in `kestra.plugins.management.defaultVersion` (default: `LATEST`). This property can be set to `NONE` to enforce that a version is always explicitly defined.

**Note**: Kestra falls back to `LATEST` for core plugins if no version can be resolved. For other plugins, if no version can be resolved, the Flow is considered invalid.

:::alert{type="info"}
The version is resolved both at flow creation time and at execution time to ensure the correct plugin version is used during both stages. This means that a Task/Trigger can only be deserialized after ensuring that all default versions are properly resolved.
:::

---
# Enterprise Edition in Kestra: Architecture and Setup
URL: https://kestra.io/docs/enterprise/overview

> Overview of the Enterprise Edition with an introduction to our enterprise-level features and initial setup guide.

import ChildCard from "~/components/docs/ChildCard.astro"

Overview of the Enterprise Edition with an introduction to our enterprise-level features and initial setup guide.

## Kestra Enterprise overview – architecture and setup

Kestra Enterprise Edition builds on the open-source version by offering more granular access control, enhanced data isolation, improved performance and high-availability architecture, and enterprise-level support from our team. To learn more, explore the sections below or follow the setup guide to get started.
--- # Enterprise Features in Kestra: High-Availability URL: https://kestra.io/docs/enterprise/overview/enterprise-edition > Learn about the Enterprise Edition and how it can help you run Kestra securely and reliably at scale. Learn about the Enterprise Edition and how it can help you run Kestra securely and reliably at scale. ## Kestra Enterprise features – high-availability platform Designed for production workloads with high security and compliance requirements, deployed wherever you need. ## Key Features Kestra Enterprise is built on top of the [Open Source Edition](https://github.com/kestra-io/kestra) but features a different architecture. Below are the key differences between the two. ⚡️**High Availability**: Kestra Enterprise is designed to be highly available and fault-tolerant. It uses a **Kafka** cluster as a backend for event-driven orchestration and **Elasticsearch** for storing logs and metrics. This not only improves performance but also eliminates single points of failure and enables the system to scale for large workloads. ⚡️**Multi-Tenancy**: The Enterprise Edition supports multi-tenancy, enabling separate environments for different teams or projects. Each tenant is fully isolated, can have its own access control policies, and can optionally run with Worker Isolation and dedicated worker groups to prevent cross-tenant contention. ⚡️**Security and Access Control**: Kestra Enterprise supports Single Sign-On (SSO) and Role-Based Access Control (RBAC), enabling you to integrate with your existing identity provider and manage user access to workflows and resources. Enforce plugin allow-lists, apply read-only secrets for least privilege, and lean on audit logs for full traceability. 
⚡️**Enterprise Features**: Audit Logs, Custom Blueprints, Namespace-level secrets/variables and plugin defaults, Assets packaging, declarative Unit Tests for flows, Versioned Plugins for safe upgrades, and operational safeguards like the Kill Switch and in-product Announcements.

⚡️**Secrets Management**: Kestra Enterprise securely stores and manages secrets. It supports read-only secrets for sensitive values and integrates with existing secret managers such as AWS Secrets Manager, Azure Key Vault, Elasticsearch, Google Secret Manager, HashiCorp Vault, Doppler, 1Password, and more to come.

⚡️**Support**: The Enterprise Edition comes with guaranteed SLAs and priority support.

⚡️**Onboarding**: We provide onboarding and training for your team to ensure a fast and confident start.

If you're interested in learning more, [get in touch!](/demo)

:::alert{type="info"}
**Kestra Cloud:** If you’re unable to host Kestra Enterprise yourself, you can try Kestra Cloud — a fully managed SaaS solution hosted by the Kestra team. Kestra Cloud is currently in early access. If you are interested in trying it out, [sign up here](/cloud).
:::

---
# Migrate from OSS to Kestra Enterprise Edition
URL: https://kestra.io/docs/enterprise/overview/migrate-from-oss

> Migrate your Kestra OSS instance to Enterprise Edition. Learn how to export flows, data, and settings before importing them into Kestra Enterprise.

How to migrate your flows and data from Kestra Open Source to Enterprise Edition.

## Migrate from Open Source to Enterprise Edition

When you start **Kestra Enterprise Edition**, you can bring your existing flows from the open-source version. This guide covers how to export and import flows, and what to keep in mind for other resources.

## Export and import flows

Kestra provides a built-in export/import mechanism for flows:

1. In the **Open Source** UI, go to **Settings**.
2. Click **Export All Flows** to download a single `.zip` file containing all your flows.
3. 
In the **Enterprise Edition** UI, go to **Flows**. 4. Click **Import** and select the `.zip` file you downloaded. This will import all flows into your Enterprise Edition instance. ## Namespace files, KV store, and other resources **Namespace Files** and the **Key-Value Store** data are not included in the flow export. If you rely on these, you will need to migrate them manually. For **Namespace Files**, re-upload the files through the Enterprise Edition UI or use the [API](../../../api-reference/index.mdx). For the **KV Store**, recreate the entries in your new instance. ## What's next Once your flows are imported, you can start using enterprise features such as [RBAC](../../03.auth/rbac/index.md), [Secrets Management](../../02.governance/secrets/index.md), [Worker Groups](../../04.scalability/worker-group/index.md), and more. --- # Set Up Kestra Enterprise: License and First Tenant URL: https://kestra.io/docs/enterprise/overview/setup > Configure your Kestra Enterprise instance. Activate your license, create the first tenant, and complete the initial setup to start using Enterprise features. How to set up Kestra Enterprise Edition. ## Set up Kestra Enterprise – license and first tenant These setup instructions guide you through the initial configuration of your instance. When you launch Kestra Enterprise Edition for the first time, Kestra will prompt you to configure your instance. This includes setting up your first tenant, creating your first user, and starting the Kestra UI. ## Prerequisites To use Kestra Enterprise Edition, you will need a valid license configured under the `kestra.ee.license` configuration. The license is unique to your organization. If you need a license, please reach out to our Sales team at [sales@kestra.io](mailto:sales@kestra.io). The license is set up using three configuration properties: `id`, `fingerprint`, and `key`. - `kestra.ee.license.id`: license identifier. - `kestra.ee.license.fingerprint`: license authentication. 
This is required to use [Versioned Plugins](../../05.instance/versioned-plugins/index.md).
- `kestra.ee.license.key`: license key.

```yaml
kestra:
  ee:
    license:
      id:
      fingerprint:
      key: |
```

When you launch Kestra Enterprise Edition, it will check the license and display the validation step in the log.

## Step 1: Validate configuration

The first screen shows the main configuration of your instance. It displays:

- whether `multitenancy` is enabled
- whether `default tenant` is enabled — if yes, you can skip Step 2 (creating your first tenant)
- which `database` backend is configured (e.g., PostgreSQL or Elasticsearch)
- which `queue` backend is configured (e.g., PostgreSQL or Kafka)
- which `internal storage` backend is configured (e.g., S3, GCS, Azure Blob Storage, MinIO, or local storage)
- which `secret` backend is configured (e.g., Vault, AWS Secrets Manager, Elasticsearch, or not set up yet)

![Instance configuration validation screen](./setup_page1.png)

This step asks you to confirm whether your configuration is valid. If not, you can correct the configuration, restart the instance, and start the setup from scratch.

## Step 2: Create your first tenant

If `multitenancy` is enabled, Kestra will prompt you to create your first tenant. If you choose to create a tenant, you will be asked to input the Tenant ID and Tenant Name, for example:

- tenant id: `stage`
- tenant name: `Staging Environment`

If you enabled a default tenant, you can skip this step.

![Create first tenant form with ID and name fields](./setup_page2.png)

## Step 3: Create your first user

Now that you have your instance configured, you will create your first user. This user will have a [Superadmin](../../03.auth/rbac/index.md#super-admin) role for the instance and will be able to manage tenants, users, and roles.
![Create first Superadmin user form](./setup_page3.png)

## Step 4: Start Kestra UI

Once your tenant and user are configured, Kestra will launch the UI and log you into your new tenant as the first user.

![Kestra UI launched after completing setup](./setup_page4.png)

---
# Install Kestra Enterprise from Standalone JAR
URL: https://kestra.io/docs/enterprise/overview/standalone-server-installation

> Install Kestra Enterprise on a standalone server using an executable JAR file. Run the platform without Docker where containerization is unavailable.

Install Kestra on a standalone server with a simple executable file.

## Run Kestra Enterprise from a standalone JAR

To deploy Kestra without Docker, there's a standalone JAR available that allows deployment in any environment with JVM version 21+.

## Instructions

The following is a quick start guide to get your Kestra Enterprise Edition up and running in standalone mode.

## Standalone JAR

Download the latest version of the Kestra EE JAR from: [http://registry.kestra.io/exe/latest](http://registry.kestra.io/exe/latest)

**Credentials:**
- **Username**: `license-id`
- **Password**: `fingerprint`

:::alert{type="info"}
Make sure to store your credentials in an `application.yaml` file.
:::

This provides a single JAR file that can be used to start Kestra. Store the file in your execution environment as `kestra` and make it executable. On Linux or macOS, make the file executable with:

```bash
chmod +x kestra-ee-VERSION # Replace VERSION with your version
```

Or with a file path:

```bash
mv kestra-ee-VERSION /usr/local/bin/kestra # Replace with your version and execution environment file path
chmod +x /usr/local/bin/kestra
```

The file can then be run with:

```bash
./kestra-ee-VERSION server standalone # Replace VERSION with your version
```

:::alert{type="info"}
You need to provide a configuration with a connection to a database.
:::

For Windows users:

```powershell
java -jar kestra-ee-VERSION # Replace VERSION with your version
```

Or with a file path, assuming execution from the current directory:

```powershell
java -jar kestra-ee-VERSION server standalone -c ./application.yaml -p ./plugins --port=8080 # Replace VERSION with your version
```

## Plugins

In standalone JAR deployments, all plugins must be downloaded separately. Kestra EE provides a command to install all available plugins:

```shell
## Install all available plugins
kestra plugins install --all
```

This installs task plugins in the `plugins` directory. To install them elsewhere, specify a path with the `-p` argument. Additional Enterprise Edition plugins that are not task-related may also be required, such as secret or storage plugins.

## Secret plugins

Secret plugins must be downloaded from the Kestra registry using the same credentials, and placed in your `plugins` directory.

| Secret Service | Download Link |
| :------------- | :------------- |
| Vault | https://registry.kestra.io/maven/io/kestra/ee/secret/secret-vault/0.24.0/secret-vault-0.24.0.jar |
| AWS | https://registry.kestra.io/maven/io/kestra/ee/secret/secret-aws/0.24.0/secret-aws-0.24.0.jar |
| GCP | https://registry.kestra.io/maven/io/kestra/ee/secret/secret-gcp/0.24.0/secret-gcp-0.24.0.jar |
| Azure | https://registry.kestra.io/maven/io/kestra/ee/secret/secret-azure/0.24.0/secret-azure-0.24.0.jar |

## MinIO Internal Storage

To enable MinIO storage, install the storage plugin:

```shell
## Install MinIO internal storage plugin
kestra plugins install io.kestra.storage:storage-minio:LATEST
```

## Enterprise deployment configuration

For the full list of configuration options, refer to the [Configuration Reference](https://kestra.io/docs/configuration).
To enable Kestra Enterprise features, configure the following parameters: | Configuration Parameter | Required | Documentation Link | Description | | :---------------------- | :------- |:-----------------------------------------------------------------------------------------------------------------------------------| :---------- | | Enterprise License | Yes | [Enterprise and Advanced Features](../../../configuration/06.enterprise-and-advanced/index.md) | License information for the Kestra instance | | Multi-tenancy | Yes | [Enterprise and Advanced Features](../../../configuration/06.enterprise-and-advanced/index.md) | Enables/disables multi-tenancy (required for SCIM) | | Secret Manager | Yes | [Security and Secrets](../../../configuration/05.security-and-secrets/index.md) | Configure a secret manager in RW or RO mode | | Encryption Key | Yes | [Security and Secrets](../../../configuration/05.security-and-secrets/index.md) | Key to encrypt inputs/outputs in flows | | Security | No | [Security and Secrets](../../../configuration/05.security-and-secrets/index.md) | Configure Super Admin (also settable in UI on startup) | | User Invitations | No | [Runtime and Storage](../../../configuration/02.runtime-and-storage/index.md), [Observability and Networking](../../../configuration/03.observability-and-networking/index.md) | Required for email invitations (not needed with LDAP/SCIM) | | SSO | No | [SSO](../../03.auth/sso/index.md) | Configure OIDC provider | | LDAP | No | [LDAP](../../03.auth/sso/ldap/index.md) | Connect to an existing LDAP provider | | SCIM | No | [SCIM](../../03.auth/scim/index.mdx) | Sync user/group membership with SCIM 2.0 | ## Starting Kestra Kestra can be started in **standalone mode** or in a **distributed setup** for production. Make sure to have a database configured and your Enterprise credentials stored in the `application.yaml` file. 
## Standalone server ```shell kestra server standalone -c ./application.yaml -p ./plugins --port=8080 ``` This starts Kestra as a standalone service on port `8080`. ## Distributed mode For production usage, Kestra should run in distributed mode for scalability and high availability. Each component can run independently across servers, with shared access to the same database (no TCP communication is required between components). Example with all components on one server: ```shell kestra server webserver -c ./application.yaml -p ./plugins --port=8080 kestra server scheduler -c ./application.yaml -p ./plugins --port=8081 kestra server worker -c ./application.yaml -p ./plugins --port=8082 kestra server executor -c ./application.yaml -p ./plugins --port=8083 ``` --- # Scale Kestra Enterprise: Worker Groups and Apps URL: https://kestra.io/docs/enterprise/scalability > Scale Kestra Enterprise with advanced features. Explore Worker Groups, Task Runners, and Apps to enhance performance, isolation, and productivity. import ChildCard from "~/components/docs/ChildCard.astro" The following topics describe Kestra features that help scale and enhance the productivity of your orchestration workflows such as Apps and Worker Groups. ## Scale Kestra Enterprise – worker groups, task runners, apps --- # Apps in Kestra Enterprise: Frontends for Flows URL: https://kestra.io/docs/enterprise/scalability/apps > Build custom Apps with Kestra. Create user-facing interfaces for workflows, enabling forms, approvals, and interactive data applications. Build custom UIs to interact with Kestra from the outside world.
## Apps – build frontends for Flows Apps let you use your Kestra workflows as the backend for custom applications. Within each app, you can specify custom frontend blocks, such as forms for data entry, output displays, approval buttons, or markdown blocks. **Flows** act as the **backend**, processing data and executing tasks, while **Apps** serve as the **frontend**, allowing anyone to interact with your workflows regardless of their technical background. Business users can trigger new workflow executions, manually approve workflows that are paused, submit data to automated processes using simple forms, and view the execution results. You can think of Apps as **custom UIs for flows**. They are useful both for external-facing forms and for internal workflows such as approvals, requests, and guided operations. --- ## Common App use cases Most Apps fall into one of these two patterns: - **Execution forms**: users submit a form that starts a new execution with input parameters. For example, a requester can specify resources that need to be provisioned, and those inputs feed directly into a flow. - **Approval or resume interfaces**: users review a paused execution and approve, reject, or resume it. For example, a platform team can validate a provisioning request before the flow continues. ## App benefits Apps offer custom UIs on top of your Kestra workflows. Often, workflows are designed for non-technical users, and creating custom frontends for each of these workflows can be a lot of work. Imagine having to build and serve a frontend, connect it to Kestra’s API, validate user inputs, handle responses, manage workflow outputs, and deal with authentication and authorization — all from scratch. Apps generate a custom UI for any flow without custom frontend development. 
Here are some common scenarios where a custom UI is useful: - **Manual Approval**: workflows that need manual approval, such as provisioning resources, granting access to services, deploying apps, validating data results, or reviewing AI-generated outputs. - **Report Generation**: workflows where business users request data and receive a downloadable CSV or Excel file. - **IT Helpdesk**: workflows that accept bug reports, feature requests, or other tickets, and automatically forward the ticket to the relevant team. - **User Feedback & Signups**: workflows that collect feedback or allow users to sign up for events or email lists. - **Data Entry**: workflows where business users enter data that is processed and either sent back to them or stored in a database. Apps let non-technical users interact with workflows without editing YAML or flow configuration. ## How App stages map to execution progress Apps render different blocks based on the current execution state. This is useful when you want the page to guide users through the full lifecycle of a request, from submission to approval to delivery. | App stage | What the user sees | What usually happens in the flow | |-----------|--------------------|----------------------------------| | `OPEN` | The initial form or landing page | No execution exists yet. The user is about to submit a request. | | `CREATED` | Optional confirmation that the request was accepted | Kestra created the execution and is about to start processing it. | | `RUNNING` | Progress text, logs, loading indicators, or intermediate outputs | Tasks are actively running. | | `PAUSE` | Approval or review screen | The flow is waiting on a paused task or a manual decision. | | `RESUME` | Post-approval confirmation and follow-up details | The paused execution was resumed and continues running. | | `SUCCESS` | Final outputs, download links, or next-step buttons | The execution completed successfully. 
| | `FAILURE`, `ERROR`, `FALLBACK` | Error messages, logs, retry guidance, escalation links | The execution did not complete as expected. | For example, a VM request app might start with an `OPEN` form, move to `RUNNING` while Kestra validates the request, switch to `PAUSE` while a platform engineer reviews the requested size and environment, then show `SUCCESS` once the VM has been provisioned. This stage-based layout is what makes Apps easier for non-technical users: they don't need to understand workflow internals, only the current step of their request. --- ## Common App patterns The examples below are a good starting point when designing your own App: - **FTP upload portal**: give users a simple upload form while Kestra handles the backend credentials and transfer logic. See the [business user Apps blog example](../../../../blogs/use-case-apps/index.md#requests--review). - **Self-serve analytics request**: let users choose a dimension and time range, run a query and chart generation flow, and return the generated output on `SUCCESS`. See the [dynamic self-serve example](../../../../blogs/use-case-apps/index.md#dynamic-self-serve). - **AI-assisted intake or user research assistant**: collect free-form context from a sales, product, or support team member, run an LLM-backed flow, and display the suggested answer or categorization back in the App. See the [everyday automation example](../../../../blogs/use-case-apps/index.md#simple-interfaces-for-everyday-automation). - **VM or infrastructure request**: collect the requested environment, size, region, and justification on `OPEN`, show validation progress on `RUNNING`, pause for approval on `PAUSE`, then display the created VM details on `SUCCESS`. This pattern also fits the infrastructure workflows described in the [infrastructure automation blog](../../../../blogs/infra-automation/index.md). 
- **Human-in-the-loop review**: display task outputs, logs, or model results, then let an approver accept or reject the execution from the same screen. When in doubt, start by mapping the user journey first: 1. What should the user submit? 2. What should they see while the flow is running? 3. Does the flow need approval or review? 4. What is the final outcome you want to show back in the App? Once you know those answers, it becomes much easier to choose the right blocks for each stage. If you want inspiration beyond the examples on this page, browse the Apps-focused posts in the [blog section](../../../../blogs/introducing-apps/index.md) and [solutions content](../../../../blogs/use-case-apps/index.md). --- ## Creating Apps in code
To create a new app, go to the **Apps** page in the main UI and click **+ Create**. Add your app configuration as YAML and click **Save**. Like flows, apps have multiple editor views — you can configure the app while viewing documentation, previewing the layout, or searching the blueprint repository. You can set `disabled: true` in the YAML to create an app in an inactive state. A disabled app does not appear in the catalog and cannot be opened via its URL until you enable it. This is useful for staging an app before you are ready to release it. ![App Editor Views](./app-editor-views.png) ### App to run a Hello World flow Apps serve as custom UIs for workflows, so you need to first create a flow. Here is a simple configuration for a parameterized flow that logs a message when triggered: ```yaml id: myflow namespace: company.team inputs: - id: user type: STRING defaults: World tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello {{ inputs.user }} ``` Then add an app that triggers that flow: ```yaml id: hello_world_form type: io.kestra.plugin.ee.apps.Execution displayName: Hello World Form namespace: company.team flowId: myflow access: type: PUBLIC layout: - on: OPEN blocks: - type: io.kestra.plugin.ee.apps.core.blocks.Markdown content: | ## Say hello Enter a name and submit the form. - type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionForm - type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionButton text: Submit - on: SUCCESS blocks: - type: io.kestra.plugin.ee.apps.core.blocks.Alert style: SUCCESS showIcon: true content: Your request completed successfully. - type: io.kestra.plugin.ee.apps.execution.blocks.Logs ``` You can find a related example in the [enterprise-edition-examples repository](https://github.com/kestra-io/enterprise-edition-examples/blob/main/apps/06_hello_world_app.yaml). This app is `PUBLIC`, so anyone with the URL can access it without requiring login. 
Alternatively, you can set the `access` type to `PRIVATE` to restrict the app to specific users. With `PUBLIC` access, this app is perfect for building **public forms** that anyone in the world can access.

### App to request and download data

Let's create a flow that fetches the relevant dataset based on user input: [flow source code](https://github.com/kestra-io/enterprise-edition-examples/blob/main/flows/company.team.get_data.yaml).

Now, from the Apps page, you can create a new app that allows users to select the data they want to download: [app source code](https://github.com/kestra-io/enterprise-edition-examples/blob/main/apps/05_request_data_form.yaml).

This app is perfect for reporting and analytics use cases where users can request data and download the results.

### App to request a VM and get it approved

One common enterprise use case is a self-service infrastructure request. A requester fills out a form with the VM size, environment, and justification. Kestra validates the request, pauses for approval, and resumes the flow only after the request is approved.

Add a flow simulating a request for compute resources that needs manual approval: [flow source code](https://github.com/kestra-io/enterprise-edition-examples/blob/main/flows/company.team.request_resources.yaml).

Then, add your app configuration to create a form that requests the VM and routes it through the approval process: [app source code](https://github.com/kestra-io/enterprise-edition-examples/blob/main/apps/03_compute_resources_approval.yaml).

In practice, that app often uses the following stages:

- `OPEN`: request form with VM size, environment, owner, and business justification.
- `RUNNING`: validation of the request, available quotas, tags, or naming conventions.
- `PAUSE`: approval screen for the platform, security, or operations team.
- `RESUME` or `SUCCESS`: confirmation that the request was approved and the VM is being created or is ready to use.
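Assuming the backing flow pauses on a manual-approval task, a trimmed-down layout for such an approval app might look like the following sketch (IDs, text, and the flow name are illustrative):

```yaml
id: compute_resources_approval
type: io.kestra.plugin.ee.apps.Execution
displayName: VM Request Form
namespace: company.team
flowId: request_resources # hypothetical flow that pauses for approval
access:
  type: PRIVATE
layout:
  - on: OPEN
    blocks:
      - type: io.kestra.plugin.ee.apps.core.blocks.Markdown
        content: "## Request a VM"
      - type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionForm
      - type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionButton
        text: Submit request
  - on: RUNNING
    blocks:
      - type: io.kestra.plugin.ee.apps.core.blocks.Loading
  - on: PAUSE
    blocks:
      - type: io.kestra.plugin.ee.apps.execution.blocks.ResumeExecutionButton
        text: Approve
        style: SUCCESS
      - type: io.kestra.plugin.ee.apps.execution.blocks.CancelExecutionButton
        text: Reject
        style: DANGER
  - on: SUCCESS
    blocks:
      - type: io.kestra.plugin.ee.apps.core.blocks.Alert
        style: SUCCESS
        showIcon: true
        content: Your VM request was approved.
```

Each `on` stage maps to one of the execution states listed earlier, so the requester, the approver, and the final confirmation all share a single app URL.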
This pattern also works for adjacent use cases such as database access requests, sandbox environment creation, firewall rule approvals, or SaaS account provisioning. --- ## Creating Apps without code Like flows, Apps can also be created using the no-code editor. Every element available in code — such as blocks, properties, and configuration options — is fully supported in the no-code interface. When you build or update an App in the no-code editor, those changes are immediately reflected in the code view, preserving the declarative YAML definition behind the scenes. This ensures consistency between visual and code-first approaches, allowing teams to switch seamlessly between them without losing control, readability, or versioning. ![Apps No Code](./app-no-code.png) --- ## App catalog The App Catalog is where users can find available apps. You can filter apps by name, type, namespace, or tags. From this page, you can also create new apps, edit existing ones, enable or disable individual apps, or delete them. ![apps_catalog](./apps_catalog.png) Kestra provides a direct access URL to the Apps Catalog in the format `http://your_host/ui/your_tenant/apps/catalog`. Any Kestra user with at least `APP`-Read and `APPEXECUTION`-Read permissions in that tenant can reach this URL (adding all `APPEXECUTION` permissions is recommended). The catalog page requires authentication, so it is never publicly accessible. Users see only the apps they are permitted to see based on their RBAC permissions. You can limit visibility to specific groups by setting the `groups` property in the `access` block: ```yaml access: catalog: true type: PRIVATE groups: - Admins ``` ### Hiding an app from the catalog Setting `catalog: false` removes the app from the browseable catalog while keeping its direct URL fully functional. Use this when you want to share an app with a specific audience via URL without surfacing it to everyone who can browse the catalog. 
```yaml
access:
  catalog: false
  type: PRIVATE
```

### Managing apps in bulk

From the Apps Catalog, you can select multiple apps and enable, disable, or delete them in a single operation. Bulk operations report partial failures individually so you can see which apps were affected and which were not.

You can also export a selection of apps as a ZIP archive (`kestra-{tenant}-apps.zip`) and import that archive — or a multi-document YAML file — into another tenant or environment. The export produces one `{namespace}-{id}.yaml` file per app. On import, each app is validated independently; errors are reported per file so a single bad app does not block the rest.

### Customize the Apps Catalog

You can customize your Apps Catalog to align with organization branding by navigating to the **Tenant** tab and then **Apps Catalog**.

![Apps Catalog Customization](./apps-catalog-customization.png)

Here, you can give your catalog a display title, set a primary banner color, and upload a banner image (typically an organization logo).

:::alert{type="info"}
Currently, the uploaded banner image must be an `.svg` file.
:::

Once saved, navigate to the Apps Catalog to see your branding:

![Apps Catalog Branding](./customized-catalog.png)

From the Apps Catalog, you can also access the customization settings directly at any time by clicking on the **gear icon**.

---

## App tags

You can add custom tags to organize and filter apps in the App Catalog. For example, you might tag apps with `DevOps`, `data-team`, or `project-x`. You can then filter apps by tags to quickly find the apps you are looking for.

---

## App expiration

You can limit an app to a specific time window using the `expiration` property. Once the window closes, the app is filtered out of the catalog and blocks new submissions — existing executions are unaffected.
```yaml
id: survey_form
type: io.kestra.plugin.ee.apps.Execution
displayName: Q2 Survey
namespace: company.team
flowId: survey_processor
access:
  type: PUBLIC
expiration:
  startDate: "2025-06-01T00:00:00Z"
  endDate: "2025-06-30T23:59:59Z"
layout:
  - on: OPEN
    blocks:
      - type: io.kestra.plugin.ee.apps.core.blocks.Markdown
        content: "## Please complete the survey before the end of June."
      - type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionForm
      - type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionButton
        text: Submit
```

Both fields are optional:

- Omit `startDate` and the app is available immediately.
- Omit `endDate` and the app never expires.
- Omit `expiration` entirely and the app stays active indefinitely.

Expiration is evaluated against the server clock at the moment a user opens or submits the app.

---

## App thumbnails

Design Apps with thumbnails to clearly display their intended use case or function to catalog users.

To add a thumbnail to your app, upload an image file as a [namespace file](../../../06.concepts/02.namespace-files/index.md) to the same namespace as the App's connected flow. For example, add an `.svg` (other image formats such as `.jpg` or `.png` also work) to the `company.team` namespace. The example below adds `kestra-icon.svg`.

![Image Namespace File](./app-namespace-file.png)

In your app code, add the `thumbnail` string property and point it at the correct namespace file using `nsfiles:///`. For example:

```yaml
id: request_data_form
type: io.kestra.plugin.ee.apps.Execution
displayName: Form to request and download data
namespace: company.team
flowId: get_data
thumbnail: "nsfiles:///kestra-icon.svg" # Point this property to the correct namespace file.
access: type: PRIVATE tags: - Reporting - Analytics ``` Once added, navigate to the Apps Catalog, and a new thumbnail will display on the connected app to help designate its use case: ![App with thumbnail](./app-with-icon.png) --- ## App URL Each app has a unique URL that you can share with others. When someone opens the URL, they see the app and can submit requests. You can share the URL with team members, customers, or partners. The URL format is: `https://yourHost/ui/tenantId/apps/appUid`, for example `http://localhost:8080/ui/release/apps/5CS8qsm7YTif4PWuAUWHQ5`. You can copy the URL from the Apps Catalog page in the Kestra UI. :::alert{type="info"} App URL generation relies on the `kestra.url` server configuration property. If this property is not set, generated links may be broken or missing. Set it to the externally reachable base URL of your Kestra instance, for example `kestra.url: https://kestra.example.com`. ::: ### App expressions From within flows, you can generate app URLs using the Enterprise-only `appLink` expression. See [Workflow Functions](../../../expressions/04.functions/04.workflow/index.mdx) for parameters and examples. --- ## App access and RBAC permissions Each app has an `access` block that controls who can open and submit it. ### Public access When an app is set to `PUBLIC`, anyone with the URL can open the form and submit requests without logging in. This is suitable for public-facing forms, surveys, or intake pages you share via email or embed on a website. :::alert{type="info"} For `PUBLIC` apps, execution IDs exposed through file download or log links are encrypted so that anonymous users cannot reference executions outside the app. ::: ### Private access for using apps When an app is set to `PRIVATE`, only authenticated users with the `APPEXECUTION` permission on the app’s namespace can open or submit it. 
You can further narrow access to specific IAM groups using the `groups` field: ```yaml access: type: PRIVATE groups: - DataOps - Finance ``` Group membership is checked at runtime on every request. Users who belong to at least one listed group are granted access; users outside those groups are denied even if they have `APPEXECUTION` permission on the namespace. If `groups` is omitted, any authenticated user with `APPEXECUTION` permission on the namespace can use the app. The `APPEXECUTION` permission is also namespace-scoped. A user with `APPEXECUTION` on `company.team` cannot dispatch an app in `company.other`, even if both apps appear in the same catalog view. This makes the `PRIVATE` + `groups` combination useful when you want to allow a specific group of business stakeholders or external partners to use an app without giving them access to the broader Kestra UI. ### Private access for building apps The `APP` permission controls who can create, read, update, or delete apps within a tenant. Like `APPEXECUTION`, it can be scoped to specific namespaces. Unlike `APPEXECUTION`, which governs the ability to submit requests through an app, `APP` governs the ability to build and manage apps. --- ## App executions Each time a user creates an execution by submitting a form in the app, a new execution is generated with the system label `system.app` and a value of `yourAppId`. For example, to filter all executions created by the `computeResourcesForm` app, you can search for `system.app:computeResourcesForm` in the label filter. For every execution, you can track the user inputs, see the current state, view logs, and check the outputs — all from the Kestra UI. This lets you observe, troubleshoot and manage issues with your apps just as you would with any other workflow execution in Kestra. --- ## App layout blocks Each app is made up of blocks that define the layout and content of the app. 
You can add blocks for markdown text, forms, buttons, logs, inputs, outputs, and more. The blocks are displayed in a specific order based on the app’s state (e.g. on `OPEN`, `RUNNING`, `SUCCESS`, `FAILURE`, `PAUSE`, `RESUME`).

By combining different blocks, you can create a custom UI that guides users through the app’s workflow. For example, you could start with a markdown block that explains the purpose of the app, followed by a form block for users to enter their inputs, and a button block to submit the request. You can also add blocks to display execution logs, outputs, and buttons for approving or rejecting paused workflows.

| Block type | Available on | Properties | Example |
|------------|--------------|------------|---------|
| `Markdown` | OPEN, CREATED, RUNNING, PAUSE, RESUME, SUCCESS, FAILURE, FALLBACK | `content` | `- type: io.kestra.plugin.ee.apps.core.blocks.Markdown`<br>`  content: "## Please validate the request. Inspect the logs and outputs below. Then, approve or reject the request."` |
| `RedirectTo` | OPEN, CREATED, RUNNING, PAUSE, RESUME, SUCCESS, FAILURE, ERROR, FALLBACK | `url`: redirect URL<br>`delay`: delay as an ISO-8601 duration | `- type: io.kestra.plugin.ee.apps.core.blocks.RedirectTo`<br>`  url: "https://kestra.io/docs"`<br>`  delay: "PT60S"` |
| `CreateExecutionForm` | OPEN | None | `- type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionForm` |
| `ResumeExecutionForm` | PAUSE | None | `- type: io.kestra.plugin.ee.apps.execution.blocks.ResumeExecutionForm` |
| `CreateExecutionButton` | OPEN | `text`<br>`style`: DEFAULT, SUCCESS, DANGER, INFO<br>`size`: SMALL, MEDIUM, LARGE | `- type: io.kestra.plugin.ee.apps.execution.blocks.CreateExecutionButton`<br>`  text: "Submit"`<br>`  style: "SUCCESS"`<br>`  size: "MEDIUM"` |
| `CancelExecutionButton` | CREATED, RUNNING, PAUSE | `text`<br>`style`: DEFAULT, SUCCESS, DANGER, INFO<br>`size`: SMALL, MEDIUM, LARGE | `- type: io.kestra.plugin.ee.apps.execution.blocks.CancelExecutionButton`<br>`  text: "Reject"`<br>`  style: "DANGER"`<br>`  size: "SMALL"` |
| `ResumeExecutionButton` | PAUSE | `text`<br>`style`: DEFAULT, SUCCESS, DANGER, INFO<br>`size`: SMALL, MEDIUM, LARGE | `- type: io.kestra.plugin.ee.apps.execution.blocks.ResumeExecutionButton`<br>`  text: "Approve"`<br>`  style: "SUCCESS"`<br>`  size: "LARGE"` |
| `ExecutionInputs` | PAUSE, RESUME, SUCCESS, FAILURE | `filter`: include, exclude | `- type: io.kestra.plugin.ee.apps.execution.blocks.Inputs`<br>`  filter:`<br>`    include: []`<br>`    exclude: []` |
| `ExecutionOutputs` | PAUSE, RESUME, SUCCESS, FAILURE | `filter`: include, exclude | `- type: io.kestra.plugin.ee.apps.execution.blocks.Outputs`<br>`  filter:`<br>`    include: []`<br>`    exclude: []` |
| `ExecutionLogs` | PAUSE, RESUME, SUCCESS, FAILURE, FALLBACK | `filter`: logLevel, taskIds | `- type: io.kestra.plugin.ee.apps.execution.blocks.Logs`<br>`  filter:`<br>`    logLevel: "INFO"`<br>`    taskIds: []` |
| `Loading` | RUNNING | None | `- type: io.kestra.plugin.ee.apps.core.blocks.Loading` |
| `Alert` | FAILURE | `style`: SUCCESS, WARNING, ERROR, INFO<br>`showIcon`: true, false | `- type: io.kestra.plugin.ee.apps.core.blocks.Alert`<br>`  style: "WARNING"`<br>`  showIcon: true`<br>`  content: "An error occurred!"` |
| `Button` | SUCCESS, FAILURE | `text`<br>`url`<br>`style`: DEFAULT, SUCCESS, DANGER, INFO | `- type: io.kestra.plugin.ee.apps.core.blocks.Button`<br>`  text: "More examples"`<br>`  url: "https://github.com/kestra-io/examples"`<br>`  style: "INFO"` |
| `TaskOutputs` | RUNNING, PAUSE, RESUME, SUCCESS | `outputs`: list of outputs with `displayName`, `value`, and `type` | `- type: io.kestra.plugin.ee.apps.execution.blocks.TaskOutputs`<br>`  outputs:`<br>`    - displayName: My Task Output`<br>`      value: "{{ outputs.test.value }}"`<br>`      type: FILE` |

Everything is customizable, from the text and style of buttons to the messages displayed before and after submissions.

### File preview and download

The `Outputs` and `TaskOutputs` blocks can render file download links for outputs stored in Kestra's internal storage. File preview, metadata, and download are only available when:

- The app type is `io.kestra.plugin.ee.apps.Execution`.
- The layout includes an `Outputs` or `TaskOutputs` block.
- The storage path belongs to an execution that the app has access to.

By default, file preview shows the first 100 rows. You can change this server-side with `kestra.server.preview.initial-rows` (default `100`) and cap it with `kestra.server.preview.max-rows` (default `5000`).

### Log download

The `ExecutionLogs` block renders an inline log viewer. When a `Logs` block is present in the layout, users can also download the full log file directly from the app. Log download is only available for `Execution`-type apps that include a `Logs` block in their layout.

---

# Task Runners in Kestra Enterprise: Offload Compute
URL: https://kestra.io/docs/enterprise/scalability/task-runners

> Optimize compute with Kestra Task Runners. Offload intensive tasks to Docker, Kubernetes, AWS Batch, and other remote environments for scalability.

Task Runner capabilities and supported plugins.

## Task runners – offload and isolate compute

[Task Runners](../../../task-runners/index.mdx) offer a powerful way to offload compute-intensive tasks to remote environments.
## Task runner types

There are a number of task runner types. The [Docker](../../../task-runners/04.types/02.docker-task-runner/index.md) and [Process](../../../task-runners/04.types/01.process-task-runner/index.md) task runners are included in the Open Source edition. All other types require an [Enterprise Edition](./index.md) license or a [Kestra Cloud](/cloud) account.

Enterprise Edition Task Runners:

- [Kubernetes](../../../task-runners/04.types/03.kubernetes-task-runner/index.md)
- [AWS Batch](../../../task-runners/04.types/04.aws-batch-task-runner/index.md)
- [Azure Batch](../../../task-runners/04.types/05.azure-batch-task-runner/index.md)
- [Google Batch](../../../task-runners/04.types/06.google-batch-task-runner/index.md)
- [Google Cloud Run](../../../task-runners/04.types/07.google-cloudrun-task-runner/index.md)

## Task runners vs Worker Groups

[Task Runners](../../../task-runners/index.mdx) and [Worker Groups](../worker-group/index.md) both **offload compute-intensive tasks to dedicated workers**. However, **worker groups have a broader scope**, applying to **all tasks** in Kestra, whereas **task runners** are limited to **scripting tasks** (Python, R, JavaScript, Shell, dbt, etc. — see the full list in the [Task Runner Overview](../../../task-runners/index.mdx)).

Worker groups can be used with any plugin. For instance, if you need to query an on-premise SQL Server database running on a different server than Kestra, your SQL Server Query task can target a worker with access to that server. Additionally, worker groups can fulfill the same use case as task runners by distributing the load of scripting tasks to dedicated workers with the necessary resources and dependencies (_incl. hardware, region, network, operating system_).

You can read more about the differences on the [dedicated Task Runners vs. Worker Groups page](../../../task-runners/03.task-runners-vs-worker-groups/index.md).
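To make the distinction concrete, a scripting task delegates its compute through the `taskRunner` property. The sketch below offloads a Python task to Kubernetes; the EE runner type name and its `namespace` property are assumptions based on the EE plugin naming convention, so verify them against the Kubernetes task runner page:

```yaml
id: k8s_offload
namespace: company.team

tasks:
  - id: heavy_python
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: python:3.11-slim
    taskRunner:
      # Assumed EE type name; check the Kubernetes task runner docs.
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: kestra-workers # Kubernetes namespace for the pod
    commands:
      - python train.py
```

The same flow could instead target a worker group by replacing `taskRunner` with `workerGroup.key`, which is the trade-off the comparison above describes.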
--- # Worker Groups in Kestra Enterprise: Target Workers URL: https://kestra.io/docs/enterprise/scalability/worker-group > Manage workloads with Kestra Worker Groups. Target specific workers for tasks based on hardware, region, or security requirements for optimized execution. How to configure Worker Groups in Kestra Enterprise Edition. ## Worker groups – configure targeted workers A Worker Group is a set of workers that can be explicitly targeted for task execution or polling trigger evaluation. For example, tasks that require heavy resources can be isolated to a Worker Group designed to handle that load, and tasks that perform best on a specific Operating System can be optimized to run on a Worker Group designed for them. :::alert{type="info"} Please note that Worker Groups are not yet available in Kestra Cloud, only in Kestra Enterprise Edition. :::
## Creating Worker Groups from the UI

:::badge{version=">=0.19" editions="EE"}
:::

To create a new Worker Group, navigate to the **Instance** page, go to the **Worker Groups** tab, and click on the `+ Add Worker Group` button. Then, set a **Key**, a **Description**, and optionally **Allowed Tenants** for that worker group. You can also accomplish this via the API, CLI, or Terraform.

![Create Worker Group UI](./create-worker-group.png)

## Starting workers for a Worker Group

Once a worker group key is created, start a worker with `kestra server worker --worker-group {workerGroupKey}` to assign it to that worker group. You can also assign a default worker group at the namespace and tenant level.

![Worker Group UI](./worker-group-ui.png)

The Worker Groups UI tracks the health of worker groups, showing how many workers are polling for tasks within each worker group. This gives you visibility into which worker groups are active and the number of active workers.

![Worker Group UI Details](./worker-group-details.png)

:::alert{type="info"}
To pass the `--worker-group` flag at startup, run each server component independently and apply the flag when starting the worker component. To set this up, read more about running [Kestra with separated server components](../../../kestra-cli/kestra-server/index.md#kestra-with-server-components-in-different-services).
:::

## Using Worker Groups

To assign a worker group, add the `workerGroup.key` property to the task or the polling trigger. A default worker group can also be configured at the `namespace` or `tenant` level.

Worker groups can be defined at the flow level, and the flow editor validates worker group keys when creating flows from the UI. If the provided key doesn’t exist, the syntax validation will prevent the flow from being saved.
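Putting the startup pieces together, a worker is attached to a group at launch by reusing the `-c` and `--worker-group` flags shown earlier (the configuration file path is illustrative):

```shell
# Start a worker that only picks up tasks and polling triggers
# targeting the "gpu" worker group
kestra server worker \
  -c ./application.yaml \
  --worker-group gpu
```

A worker started without `--worker-group` serves the default worker group instead.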
Below is an example flow configuration with a worker group: ```yaml id: worker_group namespace: company.team tasks: - id: wait type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - sleep 10 workerGroup: key: gpu ``` If the `workerGroup.key` property is not provided, all tasks and polling triggers are executed on the default worker group. That default worker group doesn't have a dedicated key. A `workerGroup.key` can also be assigned dynamically using `inputs` like in the following example: ```yaml id: worker_group_dynamic namespace: company.team inputs: - id: my_worker_group type: STRING tasks: - id: workerGroup type: io.kestra.plugin.core.debug.Return format: "{{ taskrun.startDate }}" workerGroup: key: "{{ inputs.my_worker_group }}" ``` If the expression resolves to `null` or a blank string, the task is routed to the default worker group — the same behavior as omitting `workerGroup` entirely. This makes `null` a useful sentinel for conditional routing: ```yaml id: worker_group_conditional namespace: company.team inputs: - id: use_gpu type: BOOLEAN defaults: false tasks: - id: train type: io.kestra.plugin.core.debug.Return format: "{{ taskrun.startDate }}" workerGroup: key: "{{ inputs.use_gpu ? 'gpu' : null }}" ``` When `inputs.use_gpu` is `false`, the key resolves to `null` and the task runs on the default worker group. When `true`, it targets the `gpu` worker group. ## Worker Group fallback behavior :::badge{version=">=0.20" editions="EE"} ::: By default, a task configured to run on a given worker will wait for the worker to be available (i.e., `workerGroup.fallback: WAIT`). If you prefer to fail the task when the worker is not available, set `workerGroup.fallback: FAIL`. 
```yaml
id: worker_group
namespace: company.team

tasks:
  - id: wait
    type: io.kestra.plugin.core.flow.Sleep
    duration: PT0S
    workerGroup:
      key: gpu
      fallback: FAIL
```

Possible values for `workerGroup.fallback` are `WAIT` (default), `FAIL`, or `CANCEL`:

- `WAIT`: The task will wait for the worker to be available and will remain in a `CREATED` state until the worker picks it up.
- `FAIL`: The task run will be terminated immediately if the worker is not available, and the execution will be marked as `FAILED`.
- `CANCEL`: The task run will be gracefully terminated, and the execution will be marked as `KILLED` without an error.

You can set a custom `workerGroup.key` and `workerGroup.fallback` per plugin type and/or per namespace using `pluginDefaults`. When fallback behavior is set in multiple places, Kestra resolves which action to take by following this priority order:

1. **Flow-level**: Uses the behavior specified in the task's `fallback` property within the flow.
2. **Namespace-level**: Uses the behavior set in the Namespace settings.
3. **Tenant-level**: Uses the behavior set in the Tenant settings.

### Fallback behavior at the namespace level

Namespaces can be configured with a default `fallback` behavior. Set it when creating a namespace manually, or modify it later in the **Edit** tab of the namespace.

![Configure Worker Group for a Namespace](./worker-group-namespace.png)

### Fallback behavior at the tenant level

Tenants can be configured with a default `fallback` behavior. Set it when creating a tenant, or modify it later in the tenant's properties.

![Configure Worker Group for a Tenant](./worker-group-tenant.png)

## When to use Worker Groups

Here are common use cases in which Worker Groups can be beneficial:

- Execute tasks and polling triggers on specific compute instances (e.g., a VM with a GPU and preconfigured CUDA drivers).
- Execute tasks and polling triggers on a worker with a specific Operating System (e.g., a Windows server).
- Restrict backend access to a set of workers (firewall rules, private networks, etc.). - Execute tasks and polling triggers close to a remote backend (region selection). You can configure plugin groups to use a specific worker group. In this example, all [script tasks](../../../16.scripts/index.mdx) are set to run on the `gpu` worker group: ```yaml id: worker_group namespace: company.team tasks: - id: wait type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - sleep 10 - id: python_gpu type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true commands: - python ml_on_gpu.py pluginDefaults: - forced: false type: io.kestra.plugin.scripts values: workerGroup: key: gpu ``` ### Distant workers You can use a Worker Group to designate a worker to execute **any** task on a remote resource. Additionally, you may want an **always-on** worker that stays available for execution-intensive workloads. The distant worker use case requires a connection to the Kestra metastore. It addresses always-on, execution-intensive workloads as well as workloads that must run in an external environment. ![Distant Worker Architecture](./distant-worker.png) ### Task runners If you are using scripting tasks, you can set up a Worker Group of task runners to leverage **on-demand** cloud resources for intensive workloads. For example, you can dedicate a Worker Group to executing on AWS Batch or Kubernetes. This is particularly useful for script task workloads with bursts in resource demand. ![Task Runner Architecture](./task-runners.png) ### Data isolation Worker Groups are a strong fit for **Data Isolation** use cases. Multi-tenancy requirements may demand strict isolation of remote resources such as key vaults. Worker Groups enable you to split out dedicated workers per tenant. In the architecture below, it is not possible to execute tasks on worker 1 from tenant 3. 
![Data Isolation Architecture](./data-isolation.png) :::alert{type="warning"} Even if you are using worker groups, we strongly recommend having at least one worker in the default worker group. ::: ## Load balancing Whether you leverage worker groups or not, Kestra will balance the load across all available workers. The primary difference is that with worker groups, you can target **specific** workers for task execution or polling trigger evaluation. A worker is part of a worker group if it is started with the `--worker-group workerGroupKey` argument. There's a slight difference between Kafka and JDBC architectures in terms of load balancing: - The Kafka architecture relies on the Kafka consumer group protocol — each worker group uses a different consumer group, therefore each worker group balances its load independently. - For JDBC, each worker within a group will poll the `queues` database table using the same poll query. All workers within the same worker group will poll for task runs and polling triggers in a FIFO manner. ### Central queue to distribute task runs and polling triggers In both JDBC and Kafka architectures, we leverage a Central Queue to ensure that tasks and polling triggers are executed only once and in the right order. Here's how it works: - Jobs (task runs and polling triggers) are submitted to a centralized queue. The queue acts as a holding area for all incoming jobs. - Workers periodically poll the central queue to check for available jobs. When a worker becomes free, it requests the next job from the queue. - The Kestra backend keeps track of job assignments to workers to ensure reliable execution and prevent duplicate processing. ### What if multiple workers from the same Worker Group poll for jobs from the central queue? Whether the jobs (task runs and polling triggers) are evenly distributed among workers depends on several factors: 1. 
The order in which workers poll the queue will affect distribution — workers that poll the queue first will get jobs first (FIFO). 2. Variations in worker compute capabilities (and their processing speeds) can cause uneven job distribution. Faster workers will complete jobs and return to poll the queue more quickly than slower workers. --- # Expressions in Kestra: Pebble Syntax and Variables URL: https://kestra.io/docs/expressions > Learn how to work with Kestra expressions using the execution context, Pebble syntax, filters, functions, and operators. import ChildCard from "~/components/docs/ChildCard.astro" Use expressions to dynamically set values in flows using `{{ ... }}` syntax backed by the Pebble templating engine.
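As a minimal sketch of that syntax (the flow and input IDs here are illustrative), a single task property can combine an input value, a function, and a filter:

```yaml
id: expression_intro
namespace: company.team

inputs:
  - id: name
    type: STRING
    defaults: World

tasks:
  - id: greet
    type: io.kestra.plugin.core.debug.Return
    # Renders the input plus the current date at execution time
    format: "Hello, {{ inputs.name }}! Today is {{ now() | date('yyyy-MM-dd') }}"
```

Everything between `{{` and `}}` is evaluated by Pebble against the execution context before the task runs.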
## Common tasks | If you need to... | Start here | | --- | --- | | Access `inputs`, `outputs`, `vars`, `trigger`, or `namespace` values | [Execution Context](./01.context/index.mdx) | | Access secrets or credentials at runtime | [Data Access Functions](./04.functions/02.data-access/index.mdx) | | Format dates, parse JSON, or transform strings | [Filter Reference](./03.filters/index.mdx) | | Render nested expressions or inspect the full context | [Rendering Functions](./04.functions/01.rendering/index.mdx) | | Write loops, conditions, fallbacks, and comparisons | [Pebble Syntax](./02.syntax/index.mdx) | | Build or debug a multiline or nested expression | [Pebble Syntax](./02.syntax/index.mdx#multiline-json-bodies) and [render()](./04.functions/01.rendering/index.mdx#render) | --- # Kestra Expression Context: Inputs, Outputs & Variables URL: https://kestra.io/docs/expressions/context > Reference for all variables available inside Kestra expressions at runtime — flow metadata, inputs, outputs, trigger values, secrets, and namespace variables. Use this page to find out what data is available inside `{{ ... }}` at runtime — including flow metadata, inputs, outputs, trigger values, secrets, and namespace variables. ## Understand the execution context Kestra expressions combine the [Pebble templating engine](/docs/concepts/pebble) with the execution context to dynamically render flow properties. The execution context usually includes: - `flow` - `execution` - `inputs` - `outputs` - `labels` - `tasks` - `trigger` when the flow was started by a trigger - `vars` when the flow defines variables - `namespace` in Enterprise Edition when namespace variables are configured - `envs` for environment variables - `globals` for global configuration values :::alert{type="info"} To inspect the full runtime context, use `{{ printContext() }}` in the Debug Expression console. ::: The Debug Expression console is available in the Kestra UI under **Executions → Logs → Debug Expression**. 
Enter any expression and evaluate it against the live execution context without modifying the flow.
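For example, pasting either of the following into the console returns a rendered value immediately; both use only the context variables and functions documented on this page:

```twig
{{ execution.startDate | date("yyyy-MM-dd HH:mm:ss") }}
{{ printContext() }}
```

The first renders a single formatted value; the second dumps the entire available context, which is handy when you are unsure which keys exist for the current execution.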
## Default execution context variables | Parameter | Description | | --- | --- | | `{{ flow.id }}` | Identifier of the flow | | `{{ flow.namespace }}` | Namespace of the flow | | `{{ flow.tenantId }}` | Tenant identifier in Enterprise Edition | | `{{ flow.revision }}` | Flow revision number | | `{{ execution.id }}` | Unique execution identifier | | `{{ execution.startDate }}` | Start date of the execution | | `{{ execution.state }}` | Current execution state | | `{{ execution.originalId }}` | Original execution ID preserved across replays | | `{{ task.id }}` | Current task identifier | | `{{ task.type }}` | Fully qualified class name of the current task | | `{{ taskrun.id }}` | Current task run identifier | | `{{ taskrun.startDate }}` | Start date of the current task run | | `{{ taskrun.attemptsCount }}` | Retry and restart attempt count | | `{{ taskrun.parentId }}` | Parent task run identifier for nested tasks | | `{{ taskrun.value }}` | Current loop or flowable value | | `{{ parent.taskrun.value }}` | Value of the nearest parent task run | | `{{ parent.outputs }}` | Outputs of the nearest parent task run | | `{{ parents }}` | List of parent task runs | | `{{ labels }}` | Execution labels accessible by key | Example: ```yaml id: expressions namespace: company.team tasks: - id: debug_expressions type: io.kestra.plugin.core.debug.Return format: | taskId: {{ task.id }} date: {{ execution.startDate | date("yyyy-MM-dd HH:mm:ss.SSSSSS") }} ``` ## Trigger variables When the execution is started by a `Schedule` trigger: | Parameter | Description | | --- | --- | | `{{ trigger.date }}` | Date of the current schedule | | `{{ trigger.next }}` | Date of the next schedule | | `{{ trigger.previous }}` | Date of the previous schedule | When the execution is started by a `Flow` trigger: | Parameter | Description | | --- | --- | | `{{ trigger.executionId }}` | ID of the triggering execution | | `{{ trigger.namespace }}` | Namespace of the triggering flow | | `{{ trigger.flowId }}` 
| ID of the triggering flow | | `{{ trigger.flowRevision }}` | Revision of the triggering flow | ## Environment and global variables Kestra provides access to environment variables prefixed with `ENV_` by default, unless configured otherwise in the [runtime and storage configuration](/docs/configuration/runtime-and-storage). - reference `ENV_FOO` as `{{ envs.foo }}` - reference the configured environment name as `{{ kestra.environment }}` - reference the configured Kestra URL as `{{ kestra.url }}` - reference global variables from configuration as `{{ globals.foo }}` ## Flow variables and inputs Use flow-level variables with `vars.*`: ```yaml id: flow_variables namespace: company.team variables: my_variable: "my_value" tasks: - id: print_variable type: io.kestra.plugin.core.debug.Return format: "{{ vars.my_variable }}" ``` Use inputs with `inputs.*`: ```yaml id: render_inputs namespace: company.team inputs: - id: myInput type: STRING tasks: - id: myTask type: io.kestra.plugin.core.debug.Return format: "{{ inputs.myInput }}" ``` ## Secrets, credentials, namespace variables, and outputs Use `secret()` to inject secret values at runtime: ```yaml tasks: - id: myTask type: io.kestra.plugin.core.debug.Return format: "{{ secret('MY_SECRET') }}" ``` Use `credential()` in Enterprise Edition to inject a short-lived token from a managed [Credential](/docs/enterprise/auth/credentials): ```yaml tasks: - id: request type: io.kestra.plugin.core.http.Request method: GET uri: https://api.example.com/v1/ping auth: type: BEARER token: "{{ credential('my_oauth') }}" ``` `credential()` returns the short-lived token only. The credential itself is managed in the Kestra UI. Use namespace variables in Enterprise Edition with `namespace.*`. To set them up: 1. Open the Kestra UI and navigate to **Namespaces**. 2. Select the namespace where the flow runs. 3. Open the **Variables** tab. 4. Add a key-value pair such as `github.token` with the desired value. 
Reference namespace variables in expressions using dot notation: ```yaml format: "{{ namespace.github.token }}" ``` If a namespace variable itself contains Pebble, evaluate it with `render()`: ```yaml format: "{{ render(namespace.github.token) }}" ``` Use outputs with `outputs.taskId.attribute`: ```yaml message: | First: {{ outputs.first.value }} Second: {{ outputs['second-task'].value }} ``` :::alert{type="info"} If a task ID or output key contains a hyphen, use bracket notation such as `outputs['second-task']`. To avoid that, prefer `camelCase` or `snake_case`. ::: --- # Kestra Filter Reference: Transform Expression Values URL: https://kestra.io/docs/expressions/filters > Complete reference for Kestra Pebble filters — JSON, collections, strings, dates, and YAML. Use filters to transform values with the pipe syntax. import ChildCard from "~/components/docs/ChildCard.astro" Use filters when you need to transform a value with the pipe syntax: `{{ value | filterName(...) }}`. ## Filter categories - [JSON and structured data](./01.json/index.mdx) — `toJson`, `toIon`, `jq` - [Numbers and collections](./02.collections/index.mdx) — `abs`, `number`, `first`, `last`, `sort`, `chunk`, `distinct`, and more - [Strings](./03.strings/index.mdx) — `lower`, `upper`, `replace`, `slugify`, `base64encode`, regex filters, and more - [Dates](./04.dates/index.mdx) — `date`, `dateAdd`, `timestamp`, `timestampMilli`, and precision variants - [YAML](./05.yaml/index.mdx) — `yaml`, `indent`, `nindent` ## Choosing the right filter quickly | If you need to... 
| Use | | --- | --- | | Parse or transform JSON payloads | `toJson`, `jq`, `first` | | Provide a fallback string or value | `default` | | Format a date | `date` | | Offset a date | `dateAdd` | | Split or join text | `split`, `join` | | Normalize casing | `lower`, `upper`, `title`, `capitalize` | | Convert a value to a string | `string` | | Sort a collection | `sort`, `rsort` | | Count items in a collection | `length` | | Get unique values | `distinct` | | Encode or decode Base64 | `base64encode`, `base64decode` | | Hash a string | `sha1`, `sha512`, `md5` | | Convert to a number | `number` | | Render YAML in a templated task | `yaml`, `indent`, `nindent` | --- # Number and Collection Filters in Kestra URL: https://kestra.io/docs/expressions/filters/collections > Reference for Kestra's number and collection filters — abs, number, first, last, sort, chunk, distinct, slice, merge, flatten, keys, values, and more. These filters are the everyday cleanup tools for expression values. Use them when you already have the right data but need to reformat it, count it, sort it, or coerce it into the type another task expects. ## `abs` Returns the absolute value of a number: ```twig {{ -7 | abs }} {# output: 7 #} ``` ## `number` Parses a string into a numeric type. Supports `INT`, `FLOAT`, `LONG`, `DOUBLE`, `BIGDECIMAL`, and `BIGINTEGER`. When no type is specified, the type is inferred: ```twig {{ "12.3" | number | className }} {# output: java.lang.Float #} {{ "9223372036854775807" | number('BIGDECIMAL') | className }} {# output: java.math.BigDecimal #} ``` Use `BIGDECIMAL` or `BIGINTEGER` when values exceed standard long or double precision. ## `className` Returns the Java class name of an object. 
Useful for debugging type inference when combined with `number`: ```twig {{ "12.3" | number | className }} {# output: java.lang.Float #} ``` ## `numberFormat` Formats a number using a Java `DecimalFormat` pattern: ```twig {{ 3.141592653 | numberFormat("#.##") }} {# output: 3.14 #} ``` ## `first` and `last` Returns the first or last element of a collection, or the first or last character of a string: ```twig {{ ['apple', 'banana', 'cherry'] | first }} {# output: apple #} {{ ['apple', 'banana', 'cherry'] | last }} {# output: cherry #} {{ 'Kestra' | first }} {# output: K #} {{ 'Kestra' | last }} {# output: a #} ``` ## `length` Returns the number of elements in a collection, or the number of characters in a string: ```twig {{ ['apple', 'banana'] | length }} {# output: 2 #} {{ 'Kestra' | length }} {# output: 6 #} ``` ## `join` Concatenates a collection into a single string with an optional delimiter: ```twig {{ ['apple', 'banana', 'cherry'] | join(', ') }} {# output: apple, banana, cherry #} ``` ## `split` Splits a string into a list using a delimiter. 
The delimiter is a regex, so escape special characters: ```twig {{ 'apple,banana,cherry' | split(',') }} {# output: ['apple', 'banana', 'cherry'] #} {{ 'a.b.c' | split('\\.') }} ``` The optional `limit` argument controls how many splits are performed: - **Positive**: limits the array size; the last entry contains the remaining content - **Zero**: no limit; trailing empty strings are discarded - **Negative**: no limit; trailing empty strings are included ```twig {{ 'apple,banana,cherry,grape' | split(',', 2) }} {# output: ['apple', 'banana,cherry,grape'] #} ``` ## `sort` and `rsort` Sort a collection in ascending or descending order: ```twig {{ [3, 1, 2] | sort }} {# output: [1, 2, 3] #} {{ [3, 1, 2] | rsort }} {# output: [3, 2, 1] #} ``` ## `reverse` Reverses the order of a collection: ```twig {{ [1, 2, 3] | reverse }} {# output: [3, 2, 1] #} ``` ## `chunk` Splits a collection into groups of a specified size: ```twig {{ [1, 2, 3, 4, 5] | chunk(2) }} {# output: [[1, 2], [3, 4], [5]] #} ``` ## `distinct` Returns only unique values from a collection: ```twig {{ [1, 2, 2, 3, 1] | distinct }} {# output: [1, 2, 3] #} ``` ## `slice` Extracts a portion of a collection or string using `fromIndex` (inclusive) and `toIndex` (exclusive): ```twig {{ ['apple', 'banana', 'cherry'] | slice(1, 2) }} {# output: [banana] #} {{ 'Kestra' | slice(1, 3) }} {# output: es #} ``` ## `merge` Merges two collections into one: ```twig {{ [1, 2] | merge([3, 4]) }} {# output: [1, 2, 3, 4] #} ``` ## `flatten` Removes one level of nesting from a collection: ```twig {{ [[1, 2], [3, 4], [5]] | flatten }} {# output: [1, 2, 3, 4, 5] #} ``` ## `keys` and `values` Return the keys or values of a map: ```twig {{ {'foo': 'bar', 'baz': 'qux'} | keys }} {# output: [foo, baz] #} {{ {'foo': 'bar', 'baz': 'qux'} | values }} {# output: [bar, qux] #} ``` --- # Date and Time Filters in Kestra Expressions URL: https://kestra.io/docs/expressions/filters/dates > Reference for Kestra's date and time filters — date, 
dateAdd, timestamp, timestampMilli, timestampMicro, and timestampNano — for formatting dates and converting to Unix timestamps. These are the most common filters in scheduled flows and integrations. Reach for them whenever a downstream system expects a specific date format or timestamp precision rather than Kestra's native datetime value. ## `date` ```twig {{ execution.startDate | date("yyyy-MM-dd") }} ``` You can also provide existing and target formats with named arguments: ```twig {{ stringDate | date(existingFormat="yyyy-MMMM-d", format="yyyy/MMMM/d") }} ``` When you are formatting an already parsed datetime, only `format` is usually needed. Use `existingFormat` when the source is still a plain string. ### Time zones Specify a target time zone when downstream systems require a local representation rather than UTC: ```twig {{ now() | date("yyyy-MM-dd'T'HH:mm:ssX", timeZone="UTC") }} ``` Supported arguments include: - `format` - `existingFormat` - `timeZone` - `locale` ## `dateAdd` Adds or subtracts time from a date. Arguments: - `amount`: integer specifying how much to add or subtract - `unit`: time unit such as `DAYS`, `HOURS`, `MONTHS`, or `YEARS` ```twig {{ now() | dateAdd(-1, 'DAYS') }} ``` ## Timestamp helpers Convert a date to a Unix timestamp at a specific precision: - `timestamp` — seconds - `timestampMilli` — milliseconds - `timestampMicro` — microseconds - `timestampNano` — nanoseconds :::alert{type="warning"} `timestampMicro` previously returned a nanosecond-precision value due to a bug. If you are migrating an older flow, verify the precision your downstream system expects. ::: All timestamp filters accept the same arguments as the `date` filter: `existingFormat` and `timeZone`. ```twig {{ now() | timestamp(timeZone="Europe/Paris") }} {{ now() | timestampMilli(timeZone="Asia/Kolkata") }} ``` Supported date formats include standard Java `DateTimeFormatter` patterns and shortcuts such as `iso`, `sql`, `iso_date_time`, and `iso_zoned_date_time`. 
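As a sketch, a shortcut name can be passed where a format pattern string is expected (this assumes shortcut names are accepted in the `format` position; the exact output shape depends on the source value and time zone):

```twig
{{ now() | date("iso") }}
{{ execution.startDate | date("iso_date_time") }}
```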
## Worked example ```yaml id: temporal_dates namespace: company.team tasks: - id: print_status type: io.kestra.plugin.core.log.Log message: - "Present timestamp: {{ now() }}" - "Formatted timestamp: {{ now() | date('yyyy-MM-dd') }}" - "Previous day: {{ now() | dateAdd(-1, 'DAYS') }}" - "Next day: {{ now() | dateAdd(1, 'DAYS') }}" - "Timezone (seconds): {{ now() | timestamp(timeZone='Asia/Kolkata') }}" - "Timezone (microseconds): {{ now() | timestampMicro(timeZone='Asia/Kolkata') }}" - "Timezone (milliseconds): {{ now() | timestampMilli(timeZone='Asia/Kolkata') }}" - "Timezone (nanoseconds): {{ now() | timestampNano(timeZone='Asia/Kolkata') }}" ``` This kind of example is a good sanity check when you are validating timestamp precision before sending values to an external API. --- # JSON and Structured Data Filters in Kestra URL: https://kestra.io/docs/expressions/filters/json > Reference for Kestra's JSON and structured data filters — toJson, toIon, and jq — for serializing, reshaping, and extracting fields from task outputs and API responses. Use these filters when the value you already have is structured and you need to reshape it, serialize it, or extract one field from a larger payload. They are especially common when working with task outputs and API responses. ## `toJson` Convert an object into JSON: ```twig {{ [1, 2, 3] | toJson }} {{ true | toJson }} {{ "foo" | toJson }} ``` ## `toIon` Convert an object into Ion: ```twig {{ myObject | toIon }} ``` ## `jq` Apply a JQ expression to a value. 
The result is always an array, so combine it with `first` when appropriate: ```twig {{ outputs | jq('.task1.value') | first }} ``` Examples: ```twig {{ [1, 2, 3] | jq('.') }} {{ [1, 2, 3] | jq('.[0]') | first }} ``` Example flow using `jq` inside a `ForEach`: ```yaml id: jq_with_foreach namespace: company.team tasks: - id: generate type: io.kestra.plugin.core.debug.Return format: | [ {"name": "alpha", "value": 1}, {"name": "bravo", "value": 2} ] - id: foreach type: io.kestra.plugin.core.flow.ForEach values: "{{ fromJson(outputs.generate.value) }}" tasks: - id: log_filtered type: io.kestra.plugin.core.log.Log message: | Name: {{ fromJson(taskrun.value).name }} Doubled value: {{ fromJson(taskrun.value) | jq('.value * 2') | first }} ``` The practical rule with `jq` is that it is great for extracting or transforming a small part of a larger payload, but it is usually overkill when plain dot access already gets you the value you need. ## Worked JSON payload example This larger example is useful when you need to mix accessors, math, collection helpers, and JSON-aware filters in one expression flow: ```yaml id: json_payload_example namespace: company.team inputs: - id: payload type: JSON defaults: |- { "name": "John Doe", "score": { "English": 72, "Maths": 88, "French": 95, "Spanish": 85, "Science": 91 }, "address": { "city": "Paris", "country": "France" }, "graduation_years": [2020, 2021, 2022, 2023] } tasks: - id: print_status type: io.kestra.plugin.core.log.Log message: - "Student name: {{ inputs.payload.name }}" - "Score in languages: {{ inputs.payload.score.English + inputs.payload.score.French + inputs.payload.score.Spanish }}" - "Total subjects: {{ inputs.payload.score | length }}" - "Total score: {{ inputs.payload.score | values | jq('reduce .[] as $num (0; .+$num)') | first }}" - "Complete address: {{ inputs.payload.address.city }}, {{ inputs.payload.address.country | upper }}" - "Started college in: {{ inputs.payload.graduation_years | first }}" - "Completed 
college in: {{ inputs.payload.graduation_years | last }}" ``` Use a pattern like this when the payload already arrives as JSON input and you want to keep the manipulation inside expressions instead of adding a preprocessing task. --- # String Filters in Kestra Expressions URL: https://kestra.io/docs/expressions/filters/strings > Reference for Kestra's string filters — casing, trimming, encoding, hashing, regex, and substring extraction. Use them for display formatting, filename shaping, and API-compatible encodings. String filters are where most small presentation fixes happen. They are usually the right tool for display formatting, filename shaping, templated messages, and API-compatible encodings. ## Case and whitespace `lower`, `upper`, `title`, and `capitalize` normalize casing. `trim` removes leading and trailing whitespace. ```twig {{ "LOUD TEXT" | lower }} {# loud text #} {{ "quiet text" | upper }} {# QUIET TEXT #} {{ "article title" | title }} {# Article Title #} {{ "hello world" | capitalize }} {# Hello world #} {{ " padded " | trim }} {# padded #} ``` ## `abbreviate` Truncates a string to a maximum length and appends an ellipsis. The length argument includes the ellipsis: ```twig {{ "this is a long sentence." | abbreviate(7) }} {# this... #} ``` Useful when you need to keep log messages or notification subjects within a character limit. ## `replace` Substitutes one or more substrings using a map. Pass `regexp=true` to use regex patterns in the keys: ```twig {{ "I like %this% and %that%." | replace({'%this%': foo, '%that%': "bar"}) }} ``` ## `substringBefore`, `substringAfter`, and their `Last` variants Extract the portion of a string before or after a delimiter. 
The `Last` variants match the final occurrence: ```twig {{ "a.b.c" | substringBefore(".") }} {# a #} {{ "a.b.c" | substringAfter(".") }} {# b.c #} {{ "a.b.c" | substringBeforeLast(".") }} {# a.b #} {{ "a.b.c" | substringAfterLast(".") }} {# c #} ``` These are particularly useful for extracting file extensions, path segments, or identifier prefixes from task output values. ## `slugify` Converts a string into a URL-safe slug: ```twig {{ "Hello World!" | slugify }} {# hello-world #} ``` ## `default` Returns a fallback value when the expression is null or empty: ```twig {{ user.phoneNumber | default("No phone number") }} ``` ## `startsWith` Returns `true` if the string begins with the given prefix: ```twig {{ "kestra://file.csv" | startsWith("kestra://") }} {# true #} ``` ## `endsWith` Returns `true` if the string ends with the given suffix: ```twig {{ "report.csv" | endsWith(".csv") }} {# true #} ``` ## Encoding and hashing `base64encode` and `base64decode` handle Base64 encoding. `urlencode` and `urldecode` percent-encode strings for use in URLs. `sha1`, `sha512`, and `md5` produce hex-encoded hashes of the corresponding algorithms. ```twig {{ "test" | base64encode }} {# output: dGVzdA== #} {{ "dGVzdA==" | base64decode }} {# output: test #} {{ "The string ü@foo-bar" | urlencode }} {# output: The+string+%C3%BC%40foo-bar #} {{ "The+string+%C3%BC%40foo-bar" | urldecode }} {# output: The string ü@foo-bar #} {{ "test" | sha1 }} {{ "test" | sha512 }} {{ "test" | md5 }} ``` ## `string` Coerces any value to its string representation: ```twig {{ 42 | string }} ``` Use this when chaining filters that expect string input on a value that may arrive as a number or boolean. ## `escapeChar` Escapes special characters in a string. 
The `type` argument controls which style of escaping is applied: `single`, `double`, or `shell`: ```twig {{ "Can't be here" | escapeChar('single') }} {# output: Can\'t be here #} ``` ## Regex filters `regexMatch(regex)` returns `true` if the input contains a substring matching the pattern. `regexReplace(regex, replacement)` replaces all matching substrings. `regexExtract(regex, group)` returns the first match or a specific capture group (`group` defaults to `0`; returns `null` if no match): ```twig {{ "hello world" | regexMatch("w[a-z]+") }} {# output: true #} {{ "2024-01-15" | regexReplace("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1") }} {# output: 15/01/2024 #} {{ "order-12345-done" | regexExtract("\\d+") }} {# output: 12345 #} {{ "2024-01-15" | regexExtract("(\\d{4})-(\\d{2})-(\\d{2})", 1) }} {# output: 2024 #} ``` :::alert{type="warning"} Regex filter operations are subject to a **10-second timeout** to prevent ReDoS (catastrophic backtracking). If a pattern takes longer than the limit, the task fails with an error message. Patterns with nested quantifiers such as `(a+)+` applied to large inputs are most likely to trigger this. Use anchored, non-ambiguous patterns to avoid it. The timeout can be adjusted with [`kestra.regex.timeout`](../../../configuration/05.security-and-secrets/index.md#regex-timeout) in your Kestra configuration. 
::: ## Worked string filter example This flow builds a sanitized filename and a display-safe summary from a raw input title: ```yaml id: string_filter_example namespace: company.team inputs: - id: title type: STRING defaults: " Quarterly Report: Q1 2025 (FINAL) " tasks: - id: format_output type: io.kestra.plugin.core.log.Log message: - "Trimmed: {{ inputs.title | trim }}" - "Normalized: {{ inputs.title | trim | lower }}" - "Slug (for filename): {{ inputs.title | trim | slugify }}" - "Abbreviated (for subject line): {{ inputs.title | trim | abbreviate(30) }}" - "Prefix check: {{ inputs.title | trim | startsWith('Quarterly') }}" - "After colon: {{ inputs.title | trim | substringAfter(':') | trim }}" ``` --- # YAML Filters in Kestra Expressions URL: https://kestra.io/docs/expressions/filters/yaml > Reference for Kestra's YAML filters — yaml, indent, and nindent — for parsing and formatting YAML in templated tasks, Kubernetes manifests, and config-management patterns. Use YAML filters when you are generating configuration or manifest-style text inside a task. They are less common in simple flows, but very useful in templated Kubernetes, Docker, or config-management patterns. ## `yaml` Parse YAML into an object: ```twig {{ "foo: bar" | yaml }} ``` This is especially useful in templated tasks where the source data starts as text but later expressions need object-style access. 
### Example: using `yaml` in a templated task ```yaml id: yaml_filter_example namespace: company.team tasks: - id: yaml_filter type: io.kestra.plugin.core.log.Log message: | {{ "foo: bar" | yaml }} {{ {"key": "value"} | yaml }} ``` ## `indent` and `nindent` Useful when generating templated YAML or embedding structured content: ```twig {{ labels | yaml | indent(4) }} {{ variables.yaml_data | yaml | nindent(4) }} ``` ### Example with `indent` and `nindent` ```yaml id: templated_task_example namespace: company.team labels: example: test variables: yaml_data: | key1: value1 key2: value2 tasks: - id: yaml_with_indent type: io.kestra.plugin.core.templating.TemplatedTask spec: | id: example-task type: io.kestra.plugin.core.log.Log message: | Metadata: {{ labels | yaml | indent(4) }} Variables: {{ variables.yaml_data | yaml | nindent(4) }} ``` Use `indent` when the first line is already in place and only following lines need alignment. Use `nindent` when you need to start a fresh indented block on the next line. --- # Kestra Function Reference: Generate and Retrieve Values URL: https://kestra.io/docs/expressions/functions > Complete reference for Kestra Pebble functions — rendering, data access, parsing, workflow helpers, utilities, and date/calendar functions. import ChildCard from "~/components/docs/ChildCard.astro" Use functions when you need to generate or retrieve a value dynamically with syntax such as `{{ functionName(...) }}`. Functions are best thought of as helpers that either fetch something, compute something, or force evaluation behavior that plain variables and filters cannot provide on their own. 
## Function groups - [Rendering and debugging](./01.rendering/index.mdx) — `render()`, `renderOnce()`, `printContext()`, template inheritance helpers - [Data access](./02.data-access/index.mdx) — `secret()`, `credential()`, `read()`, `fileURI()`, `kv()`, `encrypt()`, `decrypt()` - [Data parsing](./03.parsing/index.mdx) — `fromJson()`, `fromIon()`, `yaml()` - [Workflow helpers](./04.workflow/index.mdx) — `errorLogs()`, `currentEachOutput()`, `tasksWithState()`, `iterationOutput()`, `parentOutput()`, `appLink()` - [Utilities](./05.utilities/index.mdx) — `now()`, `uuid()`, `randomInt()`, `http()`, `fileSize()`, `fileExists()`, and more - [Date and calendar](./06.dates/index.mdx) — `isWeekend()`, `isPublicHoliday()`, `dayOfWeek()`, `monthOfYear()`, and more ## Worked example This flow uses several runtime functions together: `now()` for a timestamp, `uuid()` for a unique run identifier, `secret()` for a credential, and `render()` to evaluate a namespace variable containing Pebble: ```yaml id: function_reference_example namespace: company.team tasks: - id: log_context type: io.kestra.plugin.core.log.Log message: - "Run ID: {{ uuid() }}" - "Started at: {{ now() | date('yyyy-MM-dd HH:mm:ss') }}" - "API key: {{ secret('MY_API_KEY') }}" - "Config value: {{ render(namespace.my_config) }}" ``` --- # Data Access Functions in Kestra Expressions URL: https://kestra.io/docs/expressions/functions/data-access > Reference for Kestra's data access functions — secret(), credential(), read(), fileURI(), kv(), encrypt(), and decrypt() — for resolving secrets, files, and stored values at runtime. These functions bridge expressions to external or stored data. Use them when the value is not already present in the execution context and must be resolved at runtime. 
## `secret()` Use `secret()` for sensitive values that should not appear in the flow definition: ```twig {{ secret('API_KEY') }} {{ secret('GITHUB_ACCESS_TOKEN') }} ``` ## `credential()` In Enterprise Edition, use `credential()` to inject a short-lived token from a managed credential: ```twig {{ credential('my_oauth') }} ``` `credential()` returns the token only, while the credential definition itself is managed in the Kestra UI: ```yaml tasks: - id: request type: io.kestra.plugin.core.http.Request method: GET uri: https://api.example.com/v1/ping auth: type: BEARER token: "{{ credential('my_oauth') }}" ``` ## `read()` `read()` is the simplest way to turn a file URI back into inline content for a later expression: ```twig {{ read(outputs.someTask.uri) }} {{ read('subdir/file.txt') }} ``` `read()` accepts both namespace files and internal-storage URIs, which makes it useful after download or transformation tasks that write files as outputs. ## `fileURI()` Returns the internal URI of a namespace file without reading its contents. Use `fileURI()` when a task parameter expects a URI rather than inline content: ```twig {{ fileURI('my_file.txt') }} ``` Use `read()` instead when you need to embed the file contents inline in a later expression. ## `kv()` Reads a value from the KV store by key. The namespace defaults to the flow's namespace; set `errorOnMissing` to `false` to return `null` instead of throwing when the key is absent: ```twig {{ kv('MY_KEY') }} {{ kv('MY_KEY', 'other.namespace') }} {{ kv('OPTIONAL_KEY', namespace, false) }} ``` Arguments: - `key` — the KV store key - `namespace` — defaults to the flow's namespace - `errorOnMissing` — defaults to `true` ## `encrypt()` and `decrypt()` Encrypt and decrypt string values using Kestra's encryption service. 
Both require a `key` argument that identifies which encryption key to use: ```twig {{ encrypt('MY_ENCRYPTION_KEY', inputs.sensitiveValue) }} {{ decrypt('MY_ENCRYPTION_KEY', outputs.encryptTask.value) }} ``` --- # Date and Calendar Functions in Kestra Expressions URL: https://kestra.io/docs/expressions/functions/dates > Reference for Kestra's date and calendar functions — isWeekend(), isPublicHoliday(), isDayWeekInMonth(), dayOfWeek(), dayOfMonth(), monthOfYear(), and hourOfDay() — for scheduling and routing logic. Use these functions when you need to make scheduling or routing decisions based on the calendar — for example, skipping runs on weekends or public holidays. ## `isWeekend()` Returns `true` if the date falls on Saturday or Sunday: ```twig {{ isWeekend(trigger.date) }} ``` ## `isPublicHoliday()` Checks against a country's public holiday calendar. `countryCode` is an ISO 3166-1 alpha-2 code; `subDivision` is optional and accepts ISO 3166-2 codes: ```twig {{ isPublicHoliday(trigger.date, 'US') }} {{ isPublicHoliday(trigger.date, 'DE', 'DE-BY') }} ``` ## `isDayWeekInMonth()` Returns `true` if the date is the Nth occurrence of the given weekday in its month. 
`position` accepts `FIRST`, `SECOND`, `THIRD`, `FOURTH`, or `LAST`: ```twig {{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') }} ``` ## `dayOfWeek()` Returns the uppercase day name such as `MONDAY`: ```twig {{ dayOfWeek(trigger.date) }} ``` ## `dayOfMonth()` Returns the day of the month as an integer (1–31): ```twig {{ dayOfMonth(trigger.date) }} ``` ## `monthOfYear()` Returns the month as an integer (1–12): ```twig {{ monthOfYear(trigger.date) }} ``` ## `hourOfDay()` Returns the hour as an integer (0–23): ```twig {{ hourOfDay(execution.startDate) }} ``` --- # Data Parsing Functions in Kestra Expressions URL: https://kestra.io/docs/expressions/functions/parsing > Reference for Kestra's data parsing functions — fromJson(), fromIon(), and yaml() — for deserializing task outputs and working with structured data in expressions. These helpers are most useful when a task output is still a serialized string and you want to treat it like structured data in later expressions. ## `fromJson()` Parses a JSON string into an object so you can access its fields with dot or bracket notation: ```twig {{ fromJson(outputs.myTask.value).name }} {{ fromJson('[1, 2, 3]')[0] }} ``` Use `fromJson()` when a task output arrives as a serialized JSON string rather than a structured object. To go the other direction, use the [`toJson` filter](../03.filters/01.json/index.mdx#tojson). ## `fromIon()` Use `fromIon()` when a previous task or serializer produces Ion rather than JSON: ```twig {{ fromIon(read(outputs.serialize.uri)).someField }} ``` ## `yaml()` Parses a YAML string into an object so you can access its fields with dot or array notation: ```twig {{ yaml('foo: [666, 1, 2]').foo[0] }} ``` `yaml()` is available both as a function and as a filter (`{{ value | yaml }}`). Use the function form when you are working with a raw YAML string literal or a variable containing YAML text. 
See the [`yaml` filter](../03.filters/05.yaml/index.mdx) for additional options including `indent` and `nindent` for template formatting. --- # Rendering and Debugging Functions in Kestra URL: https://kestra.io/docs/expressions/functions/rendering > Reference for Kestra's rendering and debugging functions — render(), renderOnce(), printContext(), block(), and parent() — for evaluating nested expressions and inspecting execution context. This group matters when expressions stop behaving the way you expect. `render()` and `printContext()` are often the quickest way to understand whether a value is missing, nested, or still just a string. ## `render()` Use `render()` when a variable itself contains Pebble and must be evaluated: ```twig {{ render(namespace.github.token) }} {{ render("{{ trigger.date ?? execution.startDate | date('yyyy-MM-dd') }}") }} ``` Without `render()`, namespace or flow variables that contain Pebble are treated as plain strings. This pattern is especially useful with namespace variables, composed flow variables, and fallback logic based on trigger context: ```yaml variables: trigger_or_yesterday: "{{ trigger.date ?? (execution.startDate | dateAdd(-1, 'DAYS')) }}" tasks: - id: yesterday type: io.kestra.plugin.core.log.Log message: "{{ render(vars.trigger_or_yesterday) }}" ``` ## `renderOnce()` Equivalent to `render(expression, recursive=false)`. Use `renderOnce()` when you need one extra evaluation pass but do not want recursive expansion to keep walking nested Pebble content: ```twig {{ renderOnce(namespace.github.token) }} ``` This makes it the safer choice when nested values may themselves contain Pebble that should stay unevaluated. ## `printContext()` Outputs the full execution context as a string. 
Use it in the Debug Expression console to inspect every variable available at that point in the execution: ```twig {{ printContext() }} ``` This is the fastest way to discover the exact key names and structure of `inputs`, `outputs`, `trigger`, and other context variables when an expression is not resolving as expected. ## Template inheritance helpers These are less common than runtime-oriented helpers, but they matter when you are using Pebble blocks and template inheritance directly. ### `block()` `block()` renders the contents of a named block multiple times. It is different from the Pebble `block` tag, which declares the block: ```twig {% block "post" %}content{% endblock %} {{ block("post") }} ``` ### `parent()` Use `parent()` inside an overriding block to include the original block content from the parent template: ```twig {% extends "parent.peb" %} {% block "content" %} child content {{ parent() }} {% endblock %} ``` --- # Utility Functions in Kestra Expressions URL: https://kestra.io/docs/expressions/functions/utilities > Reference for Kestra's utility functions — now(), uuid(), randomInt(), range(), http(), fileSize(), fileExists(), isFileEmpty(), and more — for generating values and inspecting files at runtime. ## `now()` Returns the current datetime. Accepts a `timeZone` argument: ```twig {{ now() }} {{ now(timeZone="Europe/Paris") }} ``` ## `max()` and `min()` Returns the largest or smallest of its arguments: ```twig {{ max(5, 10, 15) }} {# output: 15 #} {{ min(5, 10, 15) }} {# output: 5 #} ``` ## `range()` Generates a list of integers up to and including `end`. 
The step defaults to 1: ```twig {{ range(0, 3) }} {# output: [0, 1, 2, 3] #} {{ range(0, 6, 2) }} {# output: [0, 2, 4, 6] #} ``` ## `uuid()` Generates a UUID in URL-safe base62 encoding: ```twig {{ uuid() }} ``` ## `id()` Generates a short unique ID using Kestra's internal ID utility: ```twig {{ id() }} ``` ## `ksuid()` Generates a K-Sortable Unique Identifier (timestamp-prefixed, base62-encoded). Useful when sort order by creation time matters: ```twig {{ ksuid() }} ``` ## `nanoId()` Generates a NanoID. `length` defaults to 21 and `alphabet` defaults to alphanumeric plus `-_`: ```twig {{ nanoId() }} {{ nanoId(length=10) }} ``` ## `randomInt()` Generates a random integer. The upper bound is **excluded**: ```twig {{ randomInt(1, 10) }} {# generates a random integer from 1 to 9 (10 is excluded) #} ``` ## `randomPort()` Picks an available local port. Useful in test or dev container flows: ```twig {{ randomPort() }} ``` ## `http()` Fetches a remote payload directly from an expression: ```twig {{ http(uri = 'https://dummyjson.com/products/categories') | jq('.[].slug') }} ``` Use it sparingly. It is convenient for dynamic dropdowns and lightweight lookups, but task-level HTTP calls are usually easier to observe and retry. ## `fileSize()` Returns the size in bytes of a file from internal storage: ```twig {{ fileSize(outputs.download.uri) }} ``` ## `fileExists()` Returns `true` if the file exists: ```twig {{ fileExists(outputs.download.uri) }} ``` ## `isFileEmpty()` Returns `true` if the file has no content: ```twig {{ isFileEmpty(outputs.download.uri) }} ``` --- # Workflow Helper Functions in Kestra Expressions URL: https://kestra.io/docs/expressions/functions/workflow > Reference for Kestra's workflow and execution helper functions — errorLogs(), currentEachOutput(), tasksWithState(), iterationOutput(), parentOutput(), and appLink(). 
This group is more situational, but it becomes valuable in complex flows where you need to inspect sibling results, build links back into Kestra, or summarize failures. ## `errorLogs()` Prints all error logs from the current execution: ```twig {{ errorLogs() }} ``` It is most useful in `errors` blocks, where you need a compact summary of what failed without manually traversing task state objects. ## `currentEachOutput()` Use it inside `ForEach` flows to avoid manual `taskrun.value` indexing: ```twig {{ currentEachOutput(outputs.make_data).values.data }} ``` ## `tasksWithState()` Returns a list of task run objects matching the given state. Use it in error handlers or notifications to report which tasks failed: ```twig {{ tasksWithState('FAILED') }} ``` Useful for building conditional logic or failure summaries based on task outcomes. ## `iterationOutput()` Retrieves the output of a specific iteration from a previous task. Both arguments are optional — `taskId` defaults to the current task and `iteration` defaults to the previous iteration: ```twig {{ iterationOutput(outputs.myTask).value }} {{ iterationOutput(outputs.myTask, 2).value }} ``` ## `parentOutput()` Retrieves the output of a parent task. The optional `index` argument specifies which ancestor to target; omitting it returns the direct parent's output: ```twig {{ parentOutput() }} {{ parentOutput(1) }} ``` ## `appLink()` Enterprise Edition's `appLink()` builds links back to Kestra Apps: ```twig {{ appLink(appId='com.example.my-app') }} {{ appLink(baseUrl=true) }} ``` Use it in notifications when you want recipients to jump directly into the related app rather than the generic flow UI. --- # Pebble Syntax in Kestra: Tags, Operators & Control Flow URL: https://kestra.io/docs/expressions/syntax > Complete reference for writing Kestra expressions — delimiters, attribute access, nested rendering, control flow, comparisons, logic operators, and Pebble type tests. 
Use this page when you need help writing expressions — delimiters, attribute access, nested rendering, control flow, fallback patterns, comparisons, logic operators, and type tests. ## Pebble basics Pebble templates use two primary delimiters: - `{{ ... }}` to output the result of an expression - `{% ... %}` to control template flow with tags such as `if`, `for`, or `set` Examples: ```twig {{ flow.id }} {% if inputs.region == "eu" %}Europe{% endif %} ``` To escape Pebble syntax literally, use the `raw` tag described in [Tags](#raw). ## Accessing values Use dot notation for standard property access: ```twig {{ foo.bar }} ``` Use bracket notation for special characters or indexed access: ```twig {{ foo['foo-bar'] }} {{ items[0] }} ``` :::alert{type="warning"} If a task ID, output key, or attribute contains a hyphen, use bracket notation. To avoid that, prefer `camelCase` or `snake_case`. ::: ## Parsing nested expressions Kestra renders expressions once by default. If a variable contains Pebble that should be evaluated later, use `render()`: ```yaml variables: trigger_or_yesterday: "{{ trigger.date ?? (execution.startDate | dateAdd(-1, 'DAYS')) }}" input_or_yesterday: "{{ inputs.mydate ?? (execution.startDate | dateAdd(-1, 'DAYS')) }}" tasks: - id: yesterday type: io.kestra.plugin.core.log.Log message: "{{ render(vars.trigger_or_yesterday) }}" - id: input_or_yesterday type: io.kestra.plugin.core.log.Log message: "{{ render(vars.input_or_yesterday) }}" ``` This pattern is especially useful with namespace variables, composed flow variables, and fallback logic based on trigger context. ### Multiline JSON bodies When an HTTP request body contains multiline user input, avoid partial string interpolation. Instead, build the whole payload as a single Pebble expression so JSON escaping happens correctly. 
```yaml id: multiline_input_passed_to_json_body namespace: company.team inputs: - id: title type: STRING defaults: This is my title - id: message type: STRING defaults: |- This is my long multiline message. - id: priority type: INT defaults: 5 tasks: - id: hello type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock method: POST body: | {{ { "title": inputs.title, "message": inputs.message, "priority": inputs.priority } | toJson }} ``` ## Common syntax patterns ### Comments Use Pebble comments with `{# ... #}`: ```twig {# This is a comment #} {{ "Visible content" }} ``` In YAML, continue to use `#` for comments outside the expression itself. ### Literals and collections Pebble supports: - strings: `"Hello World"` - numbers such as `100 + 10l * 2.5` - booleans: `true`, `false` - null: `null` - lists: `["apple", "banana"]` - maps: `{"apple":"red", "banana":"yellow"}` ### Named arguments Filters, functions, and macros can accept named arguments: ```twig {{ stringDate | date(existingFormat="yyyy-MMMM-d", format="yyyy/MMMM/d") }} ``` ## Control flow and fallbacks Common patterns: - `if` and `elseif` for branching - `for` for iteration - `??` for fallback values - `? :` for ternary expressions Examples: ```twig {{ inputs.mydate ?? (execution.startDate | dateAdd(-1, 'DAYS')) }} ``` ```twig {% for article in articles %} {{ article.title }} {% else %} No articles available. {% endfor %} ``` Inside a `for` loop, Pebble provides a `loop` object with properties such as `loop.index`, `loop.first`, `loop.last`, and `loop.length`. For the full table and examples, see [for](#for). ```twig {% if category == "news" %} {{ news }} {% elseif category == "sports" %} {{ sports }} {% else %} Select a category {% endif %} ``` ## Operators ### Comparisons Supported comparison operators: - `==` - `!=` - `<` - `>` - `<=` - `>=` ```twig {% if execution.state == "SUCCESS" %} Flow completed successfully. {% endif %} {% if taskrun.attemptsCount >= 3 %} Max retries reached. 
{% endif %} ``` ### Logic and boolean checks Use: - `and` - `or` - `not` - `is` - `contains` Use parentheses to group expressions and make precedence explicit: ```twig {% if 2 is even and 3 is odd %} ... {% endif %} {% if (3 is not even) and (2 is odd or 3 is even) %} ... {% endif %} ``` ### `contains` Checks whether an item exists within a list, string, map, or array: ```twig {% if ["apple", "pear", "banana"] contains "apple" %} ... {% endif %} ``` For maps, `contains` checks for a matching key: ```twig {% if {"apple": "red", "banana": "yellow"} contains "banana" %} ... {% endif %} ``` To check for multiple items at once, pass a list on the right-hand side: ```twig {% if ["apple", "pear", "banana", "peach"] contains ["apple", "peach"] %} ... {% endif %} ``` `contains` also works inline in output expressions: ```twig {{ inputs.mainString contains inputs.subString }} ``` ### `isIn` Use `isIn` to test whether a value matches any item in a list. It reads more clearly than chaining multiple equality checks in `runIf`, SLAs, or alert conditions: ```twig {{ execution.state isIn ['SUCCESS', 'KILLED', 'CANCELLED'] }} ``` ### Math and concatenation Use: - `+`, `-`, `*`, `/`, `%` - `~` for string concatenation Example: ```twig {{ "apple" ~ "pear" ~ "banana" }} {{ 2 + 2 / (10 % 3) * (8 - 1) }} ``` ### Fallbacks and conditionals Use: - `??` for null-coalescing: returns the first non-null value - `???` for undefined-coalescing: returns the right-hand side only when the left is undefined (not just null) - `? :` for ternary expressions Examples: ```twig {{ foo ?? bar ?? "default" }} {# first non-null value #} {{ foo ??? "default" }} {# only if foo is undefined #} {{ foo == null ? bar : baz }} {{ foo ?? bar ?? raise }} {# raises an exception if all are undefined #} ``` For detailed null vs undefined behavior, see the [Handling null and undefined values](/docs/how-to-guides/null-values) guide. ### Operator precedence Pebble operators are evaluated in this order: 1. `.` 2. `|` 3. 
`%`, `/`, `*` 4. `-`, `+` 5. `==`, `!=`, `>`, `<`, `>=`, `<=` 6. `is`, `is not` 7. `and` 8. `or` ## Tags Pebble tags are enclosed in `{% %}` and control template flow. ### `set` Defines a variable in the template context: ```twig {% set header = "Welcome Page" %} {{ header }} {# output: Welcome Page #} ``` ### `if` Evaluates conditional logic. Use `elseif` and `else` for multiple branches: ```twig {% if users is empty %} No users available. {% elseif users.length == 1 %} One user found. {% else %} Multiple users found. {% endif %} ``` ### `for` Iterates over arrays, maps, or any `java.lang.Iterable`. **Iterating over a list:** ```twig {% for user in users %} {{ user.name }} lives in {{ user.city }}. {% else %} No users found. {% endfor %} ``` The `else` block runs when the collection is empty. **Iterating over a map:** ```twig {% for entry in map %} {{ entry.key }}: {{ entry.value }} {% endfor %} ``` **Loop special variables:** Inside any `for` loop, Pebble provides a `loop` object with these properties: | Variable | Description | | --- | --- | | `loop.index` | Zero-based index of the current iteration | | `loop.length` | Total number of items in the iterable | | `loop.first` | `true` on the first iteration | | `loop.last` | `true` on the last iteration | | `loop.revindex` | Number of iterations remaining | Example: ```twig {% for user in users %} {{ loop.index }}: {{ user.name }}{% if loop.last %} (last){% endif %} {% endfor %} ``` ### `filter` Applies a filter to a block of content. Filters can be chained: ```twig {% filter upper %} hello {% endfilter %} {# output: HELLO #} {% filter lower | title %} hello world {% endfilter %} {# output: Hello World #} ``` ### `raw` Prevents Pebble from parsing its content — useful when you need to output literal `{{ }}` syntax: ```twig {% raw %}{{ user.name }}{% endraw %} {# output: {{ user.name }} #} ``` ### `macro` Defines a reusable template snippet. 
Macros only have access to their own arguments by default: ```twig {% macro input(type="text", name, value="") %} type: "{{ type }}", name: "{{ name }}", value: "{{ value }}" {% endmacro %} {{ input(name="country") }} {# output: type: "text", name: "country", value: "" #} ``` To access variables from the outer template context, pass `_context` explicitly: ```twig {% set foo = "bar" %} {% macro display(_context) %} {{ _context.foo }} {% endmacro %} {{ display(_context) }} {# output: bar #} ``` ### `block` Defines a named, reusable template block. Use the `block()` function to render the block elsewhere: ```twig {% block "header" %} Introduction {% endblock %} {{ block("header") }} ``` ## Tests Tests are used with `is` and `is not` to perform type and value checks. ### `defined` Checks whether a variable exists in the context (regardless of its value): ```twig {% if missing is not defined %} Variable is not defined. {% endif %} ``` ### `empty` Returns `true` when a variable is null, an empty string, an empty collection, or an empty map: ```twig {% if user.email is empty %} No email on record. {% endif %} ``` ### `null` Checks whether a variable is null: ```twig {% if user.email is null %} ... {% endif %} {% if name is not null %} ... {% endif %} ``` ### `even` and `odd` Check whether an integer is even or odd: ```twig {% if 2 is even %} ... {% endif %} {% if 3 is odd %} ... {% endif %} ``` ### `iterable` Returns `true` when a variable implements `java.lang.Iterable`. Use this to guard a `for` loop when the collection may not always be present: ```twig {% if users is iterable %} {% for user in users %} {{ user.name }} {% endfor %} {% endif %} ``` ### `json` Returns `true` when a variable is a valid JSON string: ```twig {% if '{"test": 1}' is json %} ... {% endif %} ``` ### `map` Returns `true` when a variable is a map: ```twig {% if {"apple": "red", "banana": "yellow"} is map %} ... 
{% endif %} ``` --- # Kestra Glossary: Terms and Definitions URL: https://kestra.io/docs/glossary > Glossary of Kestra and declarative orchestration terms. Definitions for flows, tasks, triggers, namespaces, and key concepts used across the platform. A list of terms useful for understanding Kestra and declarative orchestration. ## A - [Apps](#apps) - custom user interfaces (UIs) or frontends for workflows, allowing your users to interact with Kestra from the outside world. Apps can trigger workflows or enable human-in-the-loop workflows. Available on [Enterprise Edition](../07.enterprise/04.scalability/apps/index.md). - [Approval Apps](#approval-apps) - Apps that enable forms for approving or rejecting paused workflows. - [Form Apps](#form-apps) - Apps that allow you to create forms that can trigger workflows with input parameters. ## B - [Backfill](#backfill) - replays of missed schedule intervals between a defined start and end date. All missed schedules are automatically recovered by default if the Kestra server is down. Learn how to manage and configure [backfills](../06.concepts/08.backfill/index.md). - [Blueprints](#blueprints) - ready-to-use examples with code and documentation designed to kickstart your workflow. [Blueprints](../06.concepts/07.blueprints/index.md) typically include multiple plugins. ## C - [Concurrency](#concurrency) - a flow-level property that limits the number of executions of a specific flow that can run simultaneously. Learn when to use [concurrency](../05.workflow-components/14.concurrency/index.md). - [Connector sprawl](#connector-sprawl) - the uncontrolled proliferation of integrations, or connectors, in an organization. [Connector sprawl](https://kestra.io/docs/tutorial/outputs#pass-outputs-between-tasks) can create security, operational, and maintenance issues. Kestra's architecture around outputs and internal storage works to prevent these risks. 
- [Context](#context) - typically referred to as "execution context"; a collection of variables and metadata that allows for dynamic rendering of flow properties during a workflow's execution. ## D - [Declarative](#declarative) - An approach where you describe _what_ a workflow should accomplish rather than _how_ to achieve it, expressing logic without describing control flow. - [Declarative orchestration](#declarative-orchestration) - A declarative orchestrator is a system that allows you to define and manage complex workflows using a high-level, descriptive language. Instead of specifying the exact steps and sequences to achieve a specific outcome, a declarative orchestrator lets you define the desired end state and the system figures out how to reach it. ## E - [Events](#events) - in orchestration, an event is something that happens, internal or external to the system, to start a flow. - [Internal Events](#internal-events) - events that originate inside the Kestra platform, such as scheduled CRON triggers, and start a flow. - [External Events](#external-events) - events that originate outside the Kestra platform and start a flow. - [Execution](#execution) - a single run of a flow, existing in a specific state. - [Execution context](#execution-context) - a collection of variables and metadata that allows for dynamic rendering of flow properties during a workflow's execution. - [Expressions](#expressions) - accessing and using variables in flows, combining the Pebble templating engine with the execution context to dynamically render flow properties. [Expressions](../expressions/index.mdx) allow you to dynamically set values within your workflows. Expression syntax uses curly braces, e.g., `{{ your_expression }}`. 
## F - [Flowable Tasks](#flowable-tasks) - [Flowable tasks](../05.workflow-components/01.tasks/00.flowable-tasks/index.md) control orchestration logic — running tasks or subflows in parallel, creating loops, and handling conditional branching. They do not run heavy operations. - [Flows](#flows) - Flows act as a backend, processing data and executing tasks. Flows are versioned by default. [Flows](../05.workflow-components/01.flow/index.md) and workflows are often used interchangeably. ## I - [Inputs](#inputs) - dynamic values passed to the flow at runtime. Flow inputs are stored in the execution context and accessed with `{{ inputs.parameter_name }}`. Learn more about [inputs](../05.workflow-components/05.inputs/index.md). ## K - [KV Store](#kv-store) - also known as [Key Value Store](../06.concepts/05.kv-store/index.md), allows you to store any data in a key-value format. These values can be shared across executions and different workflows to provide persistent data. ## N - [Namespace](#namespace) - separates projects, teams, and environments to logically group things and provide structure. Working with languages like Java, you may have encountered the concept of [namespaces](../05.workflow-components/02.namespace/index.md) implemented as packages. - [Namespace File](#namespace-file) - files tied to a specific namespace, serving as project assets. They are analogous to a project in a local IDE or a copy of a Git repository. Learn more about [namespace files](../06.concepts/02.namespace-files/index.md). ## O - [Orchestration](#orchestration) - a process or a tool that automates, manages, and coordinates various workflows and tasks across different services, systems, or applications. It functions like a conductor of an orchestra, ensuring all components perform in harmony, following a predefined sequence or set of rules. - [Outputs](#outputs) - a mechanism to pass data between tasks and flows. 
They can be accessed by all downstream tasks and flows using dynamic properties (e.g., `{{ outputs.task_id.attribute_name }}`). Learn more about [outputs](../05.workflow-components/06.outputs/index.md). ## P - [Pebble Templating Engine](#pebble-templating-engine) - inspired by the Java templating engine, using `.` notation to access nested properties. [Pebble](../06.concepts/06.pebble/index.md) is used to dynamically render variables, inputs, and outputs within the execution context. - [Plugin](#plugin) - the building blocks of tasks in Kestra that offer integrations to different systems and functionality. [Plugins](../05.workflow-components/02.plugins/index.md) power every task and trigger in Kestra. ## R - [Replay](#replay) - re-run a workflow execution from any chosen task, useful for iterative development and reprocessing data. Learn more about [replay](../06.concepts/10.replay/index.md). - [Revision](#revision) - any changes to a flow create a new version of that flow, otherwise known as a [revision](../06.concepts/03.revision/index.md). - [Runnable Tasks](#runnable-tasks) - [Runnable tasks](../05.workflow-components/01.tasks/01.runnable-tasks/index.md) handle data processing, such as file system operations, API calls, and database queries. They can be compute-intensive and are executed by workers. Most tasks are runnable. ## S - [Secrets](#secrets) - sensitive information stored securely. [Secrets](../06.concepts/04.secret/index.md) can be retrieved and used within Kestra flows using the `secret()` function (e.g., `{{ secret('API_TOKEN') }}`). - [Sibling task](#sibling-task) - A sibling task is a task that shares a common parent task with other tasks, like in the `tasks` list inside a loop. - [Subflow](#subflow) - Subflows let you build modular and reusable workflow components. They work like function calls: executing a [subflow](../05.workflow-components/10.subflows/index.md) creates a new flow run from within another flow. 
- [System flows](#system-flows) - System flows automate maintenance workflows. Any valid Kestra flow can become a [System Flow](../06.concepts/system-flows/index.md) if it’s added to the `system` namespace. ## T - [Task runner](#task-runner) - extensible, pluggable system within Kestra capable of executing your tasks in arbitrary remote environments, to offload computationally intensive tasks. Learn more about [task runners](../task-runners/01.overview/index.md). - [Tasks](#tasks) - atomic actions in a flow. [Tasks](../05.workflow-components/01.tasks/index.mdx) are a required element in a flow and can be [Flowable Tasks](#flowable-tasks) or [Runnable Tasks](#runnable-tasks). - [Time To Live (TTL)](#ttl) - the expiration time, or how long something like a token, secret, or key-value pair remains available. - [Triggers](#triggers) - a mechanism that automatically starts the execution of a flow. There are five core trigger types: schedule, flow, webhook, polling, realtime. [Triggers](../05.workflow-components/07.triggers/index.mdx) are scheduled or event-based. ## W - [Worker group](#worker-group) - offload compute-intensive tasks to dedicated workers, but at a broader scope than task runners. Available in [Enterprise Edition](../07.enterprise/04.scalability/worker-group/index.md). - [Workers](#workers) - a Kestra server component responsible for executing all runnable tasks and polling triggers. --- # Kestra How-to Guides: Hands-On Workflow Tutorials URL: https://kestra.io/docs/how-to-guides > Explore our collection of hands-on guides to learn how to integrate tools, manage workflows, and master Kestra's features. import GuidesChildCard from "~/components/docs/GuidesChildCard.astro" Learn Kestra with our hands-on guides. ## Find a Guide Adjust the filters based on your needs or search directly. 
--- # Access Local Files in Kestra: Bind Mounts Guide URL: https://kestra.io/docs/how-to-guides/access-local-files > Access and process files stored on your local machine within Kestra workflows using bind mounts and the Process task runner. Access locally stored files on your machine inside Kestra workflows. In Kestra, you can access files stored on your local machine from within your flows. This is useful when you have a directory of files to process or scripts to execute without needing to copy them into Kestra.
## Setting up Kestra with Docker If you're running Kestra with [Docker](../../02.installation/02.docker/index.md), you’ll need to create a bind mount to a local directory on your machine so that Kestra can access those files inside the container. In your [Docker Compose](../../02.installation/03.docker-compose/index.md) file, add the absolute path of the local directory and define its mount point inside the container. In this example, the local path `/Users/username/Documents/files` is mounted to `/files` inside the container using `- /Users/username/Documents/files:/files`. Add this under the `volumes` section of your Docker Compose file: ```yaml ... kestra: image: kestra/kestra:latest pull_policy: always user: "root" command: server standalone volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd - /Users/username/Documents/files:/files ... ``` You can now access any files or directories within `/Users/username/Documents/files` from inside Kestra under the `/files` path. ## Accessing files inside Script tasks By default, a Script task runs inside a [Docker Task Runner](../../task-runners/04.types/02.docker-task-runner/index.md). To access local files, change the Task Runner type to [Process](../../task-runners/04.types/01.process-task-runner/index.md), so it runs as a subprocess on your Kestra instance: ```yaml id: process namespace: company.team tasks: - id: hello type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cat /files/myfile.txt ``` --- # Configure Alerts in Kestra URL: https://kestra.io/docs/how-to-guides/alerting > Configure alerts in Kestra to notify you of workflow failures via Slack, PagerDuty, or other platforms using subflows and flow triggers. Configure alerts that fire whenever a workflow fails.
Alerting is essential to keeping production systems reliable. Kestra makes it easy with multiple ways to attach alerts to workflows so you always know what’s happening. Kestra’s Notifications plugin group ships tasks for popular platforms such as Slack, Teams, and PagerDuty, making it straightforward to configure alerts directly inside workflows. ![notifications](./notifications.png) Each platform exposes two task types: - **Execution task** – sends execution metadata directly in the notification, including a link to the run, ID, namespace, flow name, start time, duration, and final status. - **Send task** – sends a custom message, useful when you want to describe the state of a specific task or output. For this walkthrough we’ll use the `SlackExecution` task to send a detailed execution summary. ## `errors` property If we add the task directly to a workflow, it runs every time — which isn’t useful. Instead, place it in the `errors` block so it only fires when the execution fails. Just like the `tasks` block, define `SlackExecution` under `errors`: ```yaml errors: - id: alert type: io.kestra.plugin.slack.notifications.SlackExecution channel: "#general" url: "{{ secret('SLACK_WEBHOOK') }}" ``` When executed, it looks like this in Slack: ![slack](./slack.png) Use `errors` when you only want failure alerts. If you need different notifications for different final states such as `SUCCESS`, `FAILED`, or `WARNING`, use [`afterExecution`](../../05.workflow-components/20.afterexecution/index.md) instead. ## Subflows Copying that snippet into every flow is repetitive and hard to maintain. Instead, move the alerting logic into a subflow and reference it from any workflow that needs alerts. Move the `errors` tasks into their own subflow so the `errors` block only calls that subflow. Update the alert logic once and every consumer benefits. 
Subflow containing the alert logic: ```yaml id: slack_alert namespace: system tasks: - id: alert type: io.kestra.plugin.slack.notifications.SlackExecution channel: "#general" url: "{{ secret('SLACK_WEBHOOK') }}" ``` Parent flow that calls the subflow only when an error occurs: ```yaml errors: - id: alert type: io.kestra.plugin.core.flow.Subflow flowId: slack_alert namespace: system ``` ## Flow trigger Subflows cut down on duplication, but you still need the `errors` block in every flow. For a fully centralized approach, use a **Flow trigger** that reacts to execution status. Trigger conditions let you target specific states, such as `FAILED` or `WARNING`, and you can define separate triggers per status if needed. ```yaml id: failure_alert_slack namespace: system tasks: - id: send_alert type: io.kestra.plugin.slack.notifications.SlackExecution url: "{{ secret('SLACK_WEBHOOK') }}" channel: "#general" executionId: "{{ trigger.executionId }}" triggers: - id: on_failure type: io.kestra.plugin.core.trigger.Flow conditions: - type: io.kestra.plugin.core.condition.ExecutionStatus in: - FAILED - WARNING ``` With multiple options for automatic alerting in Kestra, you can choose the level of centralization and customization that fits each use case. --- # Audit Machines and Tool Versions with Ansible in Kestra URL: https://kestra.io/docs/how-to-guides/ansible > Use Ansible playbooks orchestrated by Kestra to audit machine resources, check tool versions, and automate infrastructure updates. Run Ansible playbooks from Kestra and coordinate downstream infrastructure tasks. Ansible is an agentless automation tool that uses YAML playbooks to describe desired state and apply it over SSH or APIs. Teams rely on it to install software, manage configs, update systems, and provision cloud infrastructure. 
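To make the declarative style concrete before the longer report playbook, here is a minimal sketch of a playbook: you describe the state you want, and Ansible only changes what does not already match. The package and file names here are illustrative assumptions, not taken from a real inventory.

```yaml
# Minimal sketch: desired state, not imperative steps (names are illustrative).
- name: Ensure curl is present and a marker file exists
  hosts: all
  become: true
  tasks:
    - name: Install curl (no change if already installed)
      ansible.builtin.package:
        name: curl
        state: present
    - name: Write a marker file with fixed content and mode
      ansible.builtin.copy:
        dest: /etc/provisioned-by-ansible
        content: "managed\n"
        mode: "0644"
```

Running the same playbook twice reports no changes the second time, which is the idempotency the report playbook below relies on.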
## System report playbook (cross-platform) This playbook audits a host without assuming the OS, captures diagnostics, and upgrades `python3` when needed using the appropriate package manager (`apt`, `yum`, or Homebrew). It writes a JSON report to `./system_info.json`. You can extend the same pattern to a real-world fleet by adding an inventory of servers or laptops, running over SSH instead of `localhost`, and inserting more version/presence checks for tools your team depends on (e.g., `node`, `aws`, `kubectl`). In multi-machine mode, facts and JSON outputs can be aggregated centrally to spot drift and trigger remediations. :::collapse{title="View the playbook"} ```yaml --- - name: Collect and report system information (system agnostic) hosts: localhost connection: local gather_facts: true vars: system_info_output: "./system_info.json" python3_min_version: "3.11.0" tasks: - name: Show basic system summary ansible.builtin.debug: msg: - "Hostname: {{ ansible_facts['hostname'] | default('unknown') }}" - "OS family: {{ ansible_facts['os_family'] | default('unknown') }}" - "Distribution: {{ ansible_facts['distribution'] | default('') }} {{ ansible_facts['distribution_version'] | default('') }}" - "Kernel: {{ ansible_facts['kernel'] | default('unknown') }}" - "Architecture: {{ ansible_facts['architecture'] | default('unknown') }}" - "CPU(s): {{ ansible_facts['processor_vcpus'] | default('unknown') }}" - "Total RAM (MB): {{ ansible_facts['memtotal_mb'] | default('unknown') }}" - "Primary IP: {{ ansible_facts['default_ipv4']['address'] | default('unknown') }}" # ----------------------------- # Extra checks / diagnostics # ----------------------------- - name: Check overall disk usage (df -h) ansible.builtin.command: df -h register: disk_usage changed_when: false failed_when: false # in case df is not available - name: Check load average and uptime ansible.builtin.command: uptime register: uptime_cmd changed_when: false failed_when: false - name: Show top 5 memory-hungry 
processes ansible.builtin.shell: | ps aux | head -n 1 ps aux | sort -nrk 4 | head -n 5 register: top_mem_processes changed_when: false failed_when: false # ----------------------------- # Python3 detection & version # ----------------------------- - name: Check if python3 is installed ansible.builtin.command: python3 --version register: python3_check failed_when: false changed_when: false - name: Parse python3 version ansible.builtin.set_fact: python3_installed: "{{ python3_check.rc == 0 }}" python3_version: >- {{ (python3_check.stdout.split()[1]) if (python3_check.rc == 0 and (python3_check.stdout | length > 0)) else 'unknown' }} - name: Debug python3 detection ansible.builtin.debug: msg: - "python3 installed: {{ python3_installed }}" - "python3 version: {{ python3_version }}" # ----------------------------- # OS family convenience flags # ----------------------------- - name: Set OS family flags ansible.builtin.set_fact: os_family: "{{ ansible_facts['os_family'] | default('Unknown') }}" is_debian: "{{ ansible_facts['os_family'] == 'Debian' }}" is_redhat: "{{ ansible_facts['os_family'] == 'RedHat' }}" is_darwin: "{{ ansible_facts['os_family'] == 'Darwin' }}" # ----------------------------- # Decide if python3 upgrade is needed # ----------------------------- - name: Decide if python3 upgrade is needed ansible.builtin.set_fact: python3_needs_upgrade: >- {{ python3_installed and python3_version != 'unknown' and (python3_version is version(python3_min_version, '<')) }} - name: Debug python3 upgrade decision ansible.builtin.debug: msg: - "Minimum required python3 version: {{ python3_min_version }}" - "Current python3 version: {{ python3_version }}" - "Needs upgrade: {{ python3_needs_upgrade }}" - name: Initialize python3 upgrade result ansible.builtin.set_fact: python3_upgrade_result: manager: "none" attempted: false note: "No upgrade attempted yet." 
# ----------------------------- # Debian / Ubuntu path (apt) # ----------------------------- - name: Upgrade python3 via apt if needed (Debian family) ansible.builtin.apt: name: python3 state: latest update_cache: yes when: - is_debian - python3_needs_upgrade register: python3_upgrade_apt - name: Record python3 upgrade result for Debian family ansible.builtin.set_fact: python3_upgrade_result: >- {{ python3_upgrade_result | combine( { 'manager': 'apt', 'attempted': python3_needs_upgrade, 'note': ( python3_needs_upgrade | ternary( 'python3 upgrade handled by apt on Debian-based system (see play output).', 'python3 already meets minimum version; apt upgrade not required.' ) ) }, recursive=True ) }} when: is_debian # ----------------------------- # RedHat / CentOS / Fedora path (yum) # ----------------------------- - name: Upgrade python3 via yum if needed (RedHat family) ansible.builtin.yum: name: python3 state: latest when: - is_redhat - python3_needs_upgrade register: python3_upgrade_yum - name: Record python3 upgrade result for RedHat family ansible.builtin.set_fact: python3_upgrade_result: >- {{ python3_upgrade_result | combine( { 'manager': 'yum', 'attempted': python3_needs_upgrade, 'note': ( python3_needs_upgrade | ternary( 'python3 upgrade handled by yum on RedHat-based system (see play output).', 'python3 already meets minimum version; yum upgrade not required.' 
) ) }, recursive=True ) }} when: is_redhat # ----------------------------- # macOS path (Homebrew) # ----------------------------- - name: Check if Homebrew is installed (macOS) ansible.builtin.command: brew --version register: brew_check failed_when: false changed_when: false when: is_darwin - name: Upgrade python via Homebrew if needed (macOS) ansible.builtin.command: brew upgrade python when: - is_darwin - python3_needs_upgrade - brew_check.rc == 0 register: python3_upgrade_brew changed_when: true - name: Record python3 upgrade result for macOS ansible.builtin.set_fact: python3_upgrade_result: >- {{ python3_upgrade_result | combine( { 'manager': (brew_check.rc == 0) | ternary('brew', 'none'), 'attempted': (python3_needs_upgrade and brew_check.rc == 0), 'note': ( (not python3_needs_upgrade) | ternary( 'python3 already meets minimum version; brew upgrade not required.', ( (brew_check.rc == 0) | ternary( 'python upgrade handled by Homebrew on macOS (see play output).', 'Homebrew not available; cannot upgrade python on macOS.' 
) ) ) ) }, recursive=True ) }} when: is_darwin # ----------------------------- # Build & write combined report # ----------------------------- - name: Build combined system info structure ansible.builtin.set_fact: full_system_info: collected_at: "{{ ansible_facts['date_time']['iso8601'] | default('') }}" hostname: "{{ ansible_facts['hostname'] | default('') }}" os: family: "{{ ansible_facts['os_family'] | default('') }}" distribution: "{{ ansible_facts['distribution'] | default('') }}" version: "{{ ansible_facts['distribution_version'] | default('') }}" release: "{{ ansible_facts['distribution_release'] | default('') }}" kernel: "{{ ansible_facts['kernel'] | default('') }}" hardware: architecture: "{{ ansible_facts['architecture'] | default('') }}" cpu_model: "{{ ansible_facts['processor'][1] | default('') if ansible_facts.get('processor') else '' }}" vcpus: "{{ ansible_facts['processor_vcpus'] | default(0) }}" memtotal_mb: "{{ ansible_facts['memtotal_mb'] | default(0) }}" network: default_ipv4: "{{ ansible_facts['default_ipv4'] | default({}) }}" all_ipv4: "{{ ansible_facts['all_ipv4_addresses'] | default([]) }}" interfaces: "{{ ansible_facts['interfaces'] | default([]) }}" storage: mounts: "{{ ansible_facts['mounts'] | default([]) }}" virtualization: type: "{{ ansible_facts['virtualization_type'] | default('') }}" role: "{{ ansible_facts['virtualization_role'] | default('') }}" diagnostics: disk_usage: "{{ disk_usage.stdout | default('') }}" uptime: "{{ uptime_cmd.stdout | default('') }}" top_mem_processes: "{{ top_mem_processes.stdout | default('') }}" python3: installed: "{{ python3_installed }}" version: "{{ python3_version }}" minimum_required: "{{ python3_min_version }}" needs_upgrade: "{{ python3_needs_upgrade }}" upgrade: "{{ python3_upgrade_result }}" ansible_facts: "{{ ansible_facts }}" - name: Write full system info to JSON file ansible.builtin.copy: dest: "{{ system_info_output }}" content: "{{ full_system_info | to_nice_json }}" mode: "0600" - name: 
Print location of saved system info ansible.builtin.debug: msg: - "Full system information written to: {{ system_info_output }}" - "You can inspect it with: jq '.' {{ system_info_output }} (if jq is installed)" ``` ::: ### What this playbook covers It gathers the usual suspects (OS family, distro, kernel, CPU, RAM, IP), then pulls quick diagnostics like disk usage, uptime, and top memory processes. It checks `python3` and, if it's older than `3.11.0`, upgrades it with the package manager appropriate to the machine's OS (`apt`, `yum`, or Homebrew). Each play in the playbook generates a log and output: for example, there are logs for each diagnostic check, for `python3` detection and version parsing, and for building and writing the combined report. The image below shows an example output targeting a local machine where `python3` is installed (`python3_installed`), but the Python version is `"3.10.4"`. ![Ansible Python Check](./ansible-python-check.png) Ansible also reports `"python3_needs_upgrade": true` and, depending on the detected OS, upgrades accordingly. ![Ansible Python Needs Upgrade](./ansible-python-needs-upgrade.png) Everything from the Python upgrade to the other machine diagnostics is aggregated in `system_info.json` with mode `0600`, giving you a tidy, readable report. The playbook can of course be adapted for other checks; it demonstrates what becomes possible when you combine Ansible with Kestra. ### Run it locally Ensure Ansible is installed, save the YAML as `system_info.yml`, run it against localhost, and inspect the output: - `ansible-playbook -i localhost, -c local system_info.yml` - Optionally inspect the JSON: `jq '.' 
system_info.json` The diagnostics report captured looks like the following (macOS): ```json { "diagnostics": { "disk_usage": [ "Filesystem Size Used Avail Capacity iused ifree %iused Mounted on", "/dev/disk3s1 466Gi 128Gi 318Gi 29% 1453290 4882459910 0% /" ], "uptime": "18:42 up 5 days, 7:31, 4 users, load averages: 2.34 2.11 1.98", "top_mem_processes": [ "USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND", "jdoe 4287 23.5 9.8 9876544 823456 ?? R 9:12PM 0:21.43 /Applications/Chrome", "jdoe 1562 7.3 5.4 6453320 455121 ?? S 7:58AM 12:11.01 /usr/bin/python3 myscript.py", "_windowser 991 3.8 3.8 5432100 315789 ?? S Fri11AM 5:45.22 WindowServer", "root 72 1.2 2.2 4321000 190233 ?? S Sun09AM 3:12.90 /usr/libexec/trustd", "jdoe 2178 0.9 1.6 3876543 131442 ?? S Sat08PM 1:03.07 Slack" ] } } ``` And the machine information outputs the following for a local macOS machine: ```plaintext TASK [Show basic system summary] ************************************************************************************************************************************* ok: [localhost] => { "msg": [ "Hostname: Mac", "OS: Darwin MacOSX 15.6.1", "Kernel: 24.6.0", "Architecture: arm64", "CPU(s): 10", "Total RAM (MB): 24576", "Primary IP: 10.0.0.42" ] } ``` ### Run it from Kestra Embed the playbook inline in your flow's YAML and collect the report with a single [Ansible CLI task](/plugins/plugin-ansible/cli/io.kestra.plugin.ansible.cli.ansiblecli): ```yaml id: system_report namespace: company.team tasks: - id: system_info type: io.kestra.plugin.ansible.cli.AnsibleCLI inputFiles: playbook.yml: | # paste the playbook above inventory.ini: | localhost ansible_connection=local outputFiles: - system_info.json containerImage: cytopia/ansible:latest-tools commands: - ansible-playbook -i inventory.ini playbook.yml ``` Or, keep the playbook as a [Namespace File](../../06.concepts/02.namespace-files/index.md) and reference it directly with the same [Ansible CLI 
task](/plugins/plugin-ansible/cli/io.kestra.plugin.ansible.cli.ansiblecli). ![Namespace Files](./flow-namespace-files.png) Also add the `inventory.ini` file to the Namespace (`localhost ansible_connection=local`). For simplicity, this guide checks the local machine, but of course this example can be expanded to utilize Ansible's capability to SSH into multiple servers and perform operations: ```yaml id: system_report namespace: company.team tasks: - id: system_info type: io.kestra.plugin.ansible.cli.AnsibleCLI namespaceFiles: enabled: true outputFiles: - system_info.json containerImage: cytopia/ansible:latest-tools commands: - ansible-playbook -i inventory.ini system_info.yml ``` After the run, the `outputFiles` property allows you to preview or download `system_info.json` from the task outputs and feed it into downstream checks or dashboards. ![Ansible File Output](./ansible-outputs.png) ### Upload the report to S3 Extend the Namespace File flow with an [S3 Upload task](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.upload) and store credentials in [secrets](../../06.concepts/04.secret/index.md): ```yaml id: system_report_to_s3 namespace: company.team tasks: - id: system_info type: io.kestra.plugin.ansible.cli.AnsibleCLI namespaceFiles: enabled: true outputFiles: - system_info.json containerImage: cytopia/ansible:latest-tools commands: - ansible-playbook -i inventory.ini system_info.yml - id: upload_output_to_s3 type: io.kestra.plugin.aws.s3.Upload region: "{{ secret('AWS_DEFAULT_REGION') }}" accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" bucket: "{{ secret('S3_BUCKET_NAME') }}" key: "system_reports/{{ execution.id }}/system_info.json" from: "{{ outputs.system_info.outputFiles['system_info.json'] }}" ``` ![S3 Bucket Upload](./ansible-s3-upload.png) The `upload_output_to_s3` task pushes the generated JSON to S3 using secrets for credentials and bucket name; reuse `outputFiles` expressions anywhere you need the 
file. ### Add a Slack notification To notify the relevant channels separately, add the [Slack Incoming Webhook task](/plugins/plugin-slack/slack-notifications/io.kestra.plugin.slack.notifications.slackincomingwebhook) after the upload with a message alerting that "Machine X" had outdated software and was upgraded. You can swap Slack for any other notifier in the Plugin catalog or chain multiple notifications if needed: ```yaml - id: slack_notification type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_WEBHOOK_URL') }}" messageText: "Machine `{{ flow.id }}` had outdated Python and an upgrade took place during execution `{{ execution.id }}`. Report available at S3: `{{ outputs.upload_output_to_s3.key }}`" ``` ![Slack Notification](./ansible-slack-notification.png) ### Trigger it (scheduled or event-driven) Lastly, add a trigger so the flow runs automatically — either on a schedule ([Schedule trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md)) or from an external event ([Webhook trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md)): ```yaml triggers: - id: nightly_audit type: io.kestra.plugin.core.trigger.Schedule cron: "0 2 * * *" # every night at 2 AM # Or, event-driven example (e.g., HTTP webhook from your MDM/ITSM): # - id: mdm_webhook # type: io.kestra.plugin.core.trigger.Webhook ``` A trigger allows you to build a historical log of machine health in S3 and Slack without manual runs. ### Wrap up Ansible handles host-level automation — collecting facts, checking software package versions, remediating with the right package manager, and so much more. Kestra now orchestrates the run, stores secrets, uploads the JSON report to S3, and notifies Slack (or your preferred channel) so teams see when upgrades occur. 
Together they scale this cross-platform playbook from one laptop to a fleet, with repeatable runs and downstream integrations ready to consume the results. --- # Detect Ansible Config Drift with Kestra URL: https://kestra.io/docs/how-to-guides/ansible-config-drift > Detect configuration drift across your infrastructure using Ansible and Kestra, and get alerted via Slack on changes. Keep configs consistent and surface drift without manual checks using Ansible and Kestra. Use Ansible to enforce a required environment variable across multiple hosts and have Kestra alert you in Slack when a change occurs. ## Files to store as Namespace Files Ansible expects two files: an inventory (`inventory.ini`) and a playbook (here `myplaybook.yml`). To use them with Kestra, they can either be stored as [Namespace Files](../../06.concepts/02.namespace-files/index.md) or written inline in the flow code. This example uses Namespace Files. - `inventory.ini` (replace with your hosts and users; keys shown as placeholders): ```ini [servers] server1.example.test ansible_user=admin ansible_ssh_private_key_file=~/.ssh/id_rsa server2.example.test ansible_user=admin ansible_ssh_private_key_file=~/.ssh/id_rsa server3.example.test ansible_user=admin ansible_ssh_private_key_file=~/.ssh/id_rsa ``` - `myplaybook.yml` (enforce `MY_APP_MODE` and refresh the shell): ```yaml --- - name: Ensure environment variable is set correctly hosts: servers become: true tasks: - name: Ensure MY_APP_MODE is set lineinfile: path: /home/{{ ansible_user }}/.bashrc regexp: '^MY_APP_MODE=' line: 'MY_APP_MODE=production' state: present notify: Refresh environment handlers: - name: Refresh environment shell: . 
/home/{{ ansible_user }}/.bashrc changed_when: false ``` ## Flow: run Ansible and alert on drift This flow runs the playbook with the [Ansible CLI task](/plugins/plugin-ansible/cli/io.kestra.plugin.ansible.cli.ansiblecli), inspects each host result in a [`ForEach`](/plugins/core/flow/io.kestra.plugin.core.flow.foreach), and posts a Slack alert only when a host was changed using the [Slack Incoming Webhook task](/plugins/plugin-slack/slack-notifications/io.kestra.plugin.slack.notifications.slackincomingwebhook). The schedule trigger is disabled by default — enable it to run nightly. ```yaml id: ansible_config_drift namespace: company.team tasks: - id: set_up_env type: io.kestra.plugin.ansible.cli.AnsibleCLI namespaceFiles: enabled: true taskRunner: type: io.kestra.plugin.core.runner.Process ansibleConfig: | [defaults] interpreter_python = auto_silent log_path={{ workingDir }}/log callback_plugins = ./callback_plugins stdout_callback = kestra_logger commands: - ansible-playbook -i inventory.ini myplaybook.yml - id: loop_hosts type: io.kestra.plugin.core.flow.ForEach values: "{{ outputs.set_up_env.vars.outputs }}" tasks: - id: check_drift type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook runIf: "{{ taskrun.value | jq('.changed') | first == true }}" url: "{{ secret('SLACK_WEBHOOK') }}" payload: | { "text": "Configuration updated - {{ taskrun.value | jq('.msg') | first ?? 
null }}" } triggers: - id: check_nightly type: io.kestra.plugin.core.trigger.Schedule cron: 0 3 * * * disabled: true ``` The execution generates logs for every play in the playbook, giving clear results for monitoring: ![Ansible Config Drift Logs](./config-drift-logs.png) In the execution outputs, you can examine results and debug the expressions to use in downstream tasks or subflows: ![Ansible Config Drift Outputs](./config-drift-outputs.png) ## Why this matters This pattern enforces a critical env var across a fleet to catch drift quickly, streams Ansible output in structured form via `stdout_callback = kestra_logger`, and alerts only on changed hosts to keep Slack noise low. Keeping the playbook and inventory as Namespace Files means you can version and reuse them across flows without hardcoding paths in each run. You can expand this pattern to check multiple config files, package versions, or CIS controls per host, while letting Kestra handle scheduling, secrets, notifications, and downstream tasks (tickets, S3 archiving, SIEM) so Ansible remediation and orchestration stay tightly linked. --- # Extend Kestra with the API URL: https://kestra.io/docs/how-to-guides/api > Discover how to extend Kestra by using its API to create flows, trigger executions, manage KV store entries, and handle namespace files. Extend Kestra by using the API.
Kestra is API-first, so it’s straightforward to connect external systems to your flows or call the platform directly. This guide focuses on the Kestra API itself and how you can extend or integrate Kestra from other services. ## Using the API Reference The docs include references for both the [Open Source](../../api-reference/02.open-source/index.mdx) and [Cloud & Enterprise](../../api-reference/01.enterprise/index.mdx) APIs so you know exactly what endpoints are available. Opening the [Open Source reference](../../api-reference/02.open-source/index.mdx) shows a structured layout that’s easy to scan: ![api_reference](./api_reference.png) ## Making Requests with Authentication If you have [Basic Auth enabled](../../configuration/05.security-and-secrets/index.md) or you’re using the [Enterprise Edition](/enterprise), authenticate each request. With cURL you can pass credentials via `-u username:password`. The example below uses the defaults from the [Kestra Docker Compose](../../02.installation/03.docker-compose/index.md): ```bash curl -X POST -u 'admin@kestra.io:kestra' http://localhost:8080/api/v1/executions/company.team/hello_world ``` Enterprise users can generate [API tokens](../../07.enterprise/03.auth/api-tokens/index.md) and send them as Bearer headers: ```bash curl -X POST http://localhost:8080/api/v1/executions/company.team/hello_world \ -H "Authorization: Bearer YOUR_API_TOKEN" ``` The remaining examples assume authentication is disabled. ## Create a Flow To create a flow via API, open the **Flows** section and look for the `/api/v1/main/flows` [POST endpoint](https://kestra.io/docs/api-reference/open-source#post-/api/v1/flows). It expects a YAML payload containing the flow definition. The request body uses Content-Type `application/x-yaml`: ```yaml id: created_by_api namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! 
🚀 ``` Send the request with [cURL](https://en.wikipedia.org/wiki/CURL): ```bash curl -X POST http://localhost:8080/api/v1/main/flows -H "Content-Type:application/x-yaml" -d "id: created_by_api namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! 🚀" ``` The response looks like this: ```json { "id": "created_by_api", "namespace": "company.team", "revision": 1, "disabled": false, "deleted": false, "tasks": [ { "id": "hello", "type": "io.kestra.plugin.core.log.Log", "message": "Hello World! \uD83D\uDE80" } ], "source": "id: created_by_api\nnamespace: company.team\n\ntasks:\n - id: hello\n type: io.kestra.plugin.core.log.Log\n message: Hello World! \uD83D\uDE80" } ``` ## Execute a Flow To execute a flow, provide the namespace and flow ID. The sample flow below (`hello_world`) lives in the `company.team` namespace and accepts a string input: ```yaml id: hello_world namespace: company.team inputs: - id: greeting type: STRING defaults: hey tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "{{ inputs.greeting }}" ``` Because the input has a default, we can call the [POST endpoint](https://kestra.io/docs/api-reference/open-source#post-/api/v1/executions/-namespace-/-id-) `/api/v1/main/executions/{namespace}/{id}` without providing additional data: ```bash curl -X POST \ http://localhost:8080/api/v1/main/executions/company.team/hello_world ``` To override inputs, send them as form data with `-F`: ```bash curl -X POST \ http://localhost:8080/api/v1/main/executions/company.team/hello_world \ -F greeting="hey there" ``` The response includes execution metadata and a link to the UI: ```json { "id": "MYkTmLrI36s10iVXHwRbR", "namespace": "company.team", "flowId": "hello_world", "flowRevision": 10, "inputs": { "greeting": "hey" }, "labels": [ { "key": "system.correlationId", "value": "MYkTmLrI36s10iVXHwRbR" } ], "state": { "current": "CREATED", "histories": [ { "state": "CREATED", "date": "2024-11-21T16:31:27.943162175Z" 
} ], "duration": 0.044177500, "startDate": "2024-11-21T16:31:27.943162175Z" }, "originalId": "MYkTmLrI36s10iVXHwRbR", "deleted": false, "metadata": { "attemptNumber": 1, "originalCreatedDate": "2024-11-21T16:31:27.943194342Z" }, "url": "http://localhost:8080//ui/executions/company.team/hello_world/MYkTmLrI36s10iVXHwRbR" } ``` For end-to-end idempotency using a stable business key, set `system.correlationId` when you create the execution and add a guard as shown in [Idempotency with correlation IDs](../idempotency/index.md). See the [Executions documentation](../../05.workflow-components/03.execution/index.md#execute-a-flow-via-an-api-call) for additional examples. ## Get Information from an Execution The execution response returns the execution ID, which you can use to fetch additional details once the run completes. Using `MYkTmLrI36s10iVXHwRbR` from the earlier example, call the [GET endpoint](https://kestra.io/docs/api-reference/open-source#get-/api/v1/executions/-executionId-) `/api/v1/main/executions/{executionId}`: ```bash curl -X GET http://localhost:8080/api/v1/main/executions/MYkTmLrI36s10iVXHwRbR ``` The response includes state transitions, durations, and outputs: :::collapse{title="Response Body"} ```json { "id": "MYkTmLrI36s10iVXHwRbR", "namespace": "company.team", "flowId": "hello_world", "flowRevision": 10, "taskRunList": [ { "id": "1ZSXuswTiOeLggIwxT3V98", "executionId": "MYkTmLrI36s10iVXHwRbR", "namespace": "company.team", "flowId": "hello_world", "taskId": "hello", "attempts": [ { "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-11-21T16:31:29.463Z" }, { "state": "RUNNING", "date": "2024-11-21T16:31:29.463Z" }, { "state": "SUCCESS", "date": "2024-11-21T16:31:29.512Z" } ], "duration": 0.049000000, "startDate": "2024-11-21T16:31:29.463Z", "endDate": "2024-11-21T16:31:29.512Z" } } ], "outputs": {}, "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-11-21T16:31:28.455Z" }, { "state": 
"RUNNING", "date": "2024-11-21T16:31:29.448Z" }, { "state": "SUCCESS", "date": "2024-11-21T16:31:29.512Z" } ], "duration": 1.057000000, "startDate": "2024-11-21T16:31:28.455Z", "endDate": "2024-11-21T16:31:29.512Z" } } ], "inputs": { "greeting": "hey" }, "labels": [ { "key": "system.correlationId", "value": "MYkTmLrI36s10iVXHwRbR" } ], "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-11-21T16:31:27.943Z" }, { "state": "RUNNING", "date": "2024-11-21T16:31:28.463Z" }, { "state": "SUCCESS", "date": "2024-11-21T16:31:30.474Z" } ], "duration": 2.531000000, "startDate": "2024-11-21T16:31:27.943Z", "endDate": "2024-11-21T16:31:30.474Z" }, "originalId": "MYkTmLrI36s10iVXHwRbR", "deleted": false, "metadata": { "attemptNumber": 1, "originalCreatedDate": "2024-11-21T16:31:27.943Z" } } ``` ::: Modify the flow to emit an output: ```yaml id: hello_world namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.debug.Return format: "This is an output" ``` Fetching execution `59uQXHbkMy5YwHEDom72Xv` now shows the output payload: :::collapse{title="Response Body"} ```json { "id": "59uQXHbkMy5YwHEDom72Xv", "namespace": "company.team", "flowId": "hello_world", "flowRevision": 13, "taskRunList": [ { "id": "4G8EJhI2IwTdlHYi250h7m", "executionId": "59uQXHbkMy5YwHEDom72Xv", "namespace": "company.team", "flowId": "hello_world", "taskId": "hello", "attempts": [ { "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-11-21T17:09:42.016Z" }, { "state": "RUNNING", "date": "2024-11-21T17:09:42.016Z" }, { "state": "SUCCESS", "date": "2024-11-21T17:09:42.045Z" } ], "duration": 0.029000000, "startDate": "2024-11-21T17:09:42.016Z", "endDate": "2024-11-21T17:09:42.045Z" } } ], "outputs": { "value": "This is an output" }, "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-11-21T17:09:40.937Z" }, { "state": "RUNNING", "date": "2024-11-21T17:09:41.967Z" }, { "state": "SUCCESS", "date": 
"2024-11-21T17:09:42.053Z" } ], "duration": 1.116000000, "startDate": "2024-11-21T17:09:40.937Z", "endDate": "2024-11-21T17:09:42.053Z" } } ], "labels": [ { "key": "system.correlationId", "value": "59uQXHbkMy5YwHEDom72Xv" } ], "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-11-21T17:09:40.204Z" }, { "state": "RUNNING", "date": "2024-11-21T17:09:40.942Z" }, { "state": "SUCCESS", "date": "2024-11-21T17:09:42.994Z" } ], "duration": 2.790000000, "startDate": "2024-11-21T17:09:40.204Z", "endDate": "2024-11-21T17:09:42.994Z" }, "originalId": "59uQXHbkMy5YwHEDom72Xv", "deleted": false, "metadata": { "attemptNumber": 1, "originalCreatedDate": "2024-11-21T17:09:40.204Z" }, "scheduleDate": "2024-11-21T17:09:40.181Z" } ``` ::: ## Accessing the KV Store Kestra’s [KV Store](../../06.concepts/05.kv-store/index.md) keeps flows stateful. You can create, update, and delete entries via the API — either from code running inside a flow or from external systems. Add a key/value pair with the [PUT endpoint](https://kestra.io/docs/api-reference/open-source#put-/api/v1/namespaces/-namespace-/kv/-key-) `/api/v1/main/namespaces/{namespace}/kv/{key}`. The example below writes `"Hello, World"` to `my_key` in the `company.team` namespace: ```bash curl -X PUT -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/company.team/kv/my_key -d '"Hello, World"' ``` Verify in Kestra that the entry exists: ![kv_api](./kv_api.png) Update the value by sending a different body, for example `"This is a modified value"`: ```bash curl -X PUT -H "Content-Type: application/json" http://localhost:8080/api/v1/main/namespaces/company.team/kv/my_key -d '"This is a modified value"' ``` Kestra shows the key as updated: ![modified_kv](./modified_kv.png) Opening the entry reveals the new value. 
![modified_value_kv](./modified_value_kv.png) Fetch the value with the [GET endpoint](https://kestra.io/docs/api-reference/open-source#get-/api/v1/namespaces/-namespace-/kv/-key-) `/api/v1/main/namespaces/{namespace}/kv/{key}`: ```bash curl -X GET http://localhost:8080/api/v1/main/namespaces/company.team/kv/my_key ``` The response contains the type and value: ```json { "type": "STRING", "value": "This is a modified value" } ``` See the [KV Store documentation](../../06.concepts/05.kv-store/index.md#api-how-to-create-read-update-and-delete-kv-pairs-via-rest-api) for more operations. ## Get and Upload Namespace Files Beyond flows, you can manage namespace files via the API. Use the [GET endpoint](https://kestra.io/docs/api-reference/open-source#get-/api/v1/namespaces/-namespace-/files/directory) `/api/v1/main/namespaces/{namespace}/files/directory` to list files in a namespace: ![files](./files.png) For `company.team`: ```bash curl -X GET http://localhost:8080/api/v1/main/namespaces/company.team/files/directory ``` The response is an array of file metadata: ```json [ { "type": "File", "size": 13, "fileName": "example.txt", "lastModifiedTime": 1731430406183, "creationTime": 1731430400773 }, { "type": "File", "size": 27, "fileName": "example.js", "lastModifiedTime": 1731415024668, "creationTime": 1730997234841 }, { "type": "File", "size": 19, "fileName": "example.sh", "lastModifiedTime": 1731415024667, "creationTime": 1730997234839 }, { "type": "File", "size": 171, "fileName": "example.ion", "lastModifiedTime": 1731430044778, "creationTime": 1731430012804 }, { "type": "File", "size": 21, "fileName": "example.py", "lastModifiedTime": 1731415024667, "creationTime": 1729781670534 } ] ``` Use the [GET endpoint](https://kestra.io/docs/api-reference/open-source#get-/api/v1/namespaces/-namespace-/files) `/api/v1/main/namespaces/{namespace}/files` to fetch file contents. Example request for `example.txt`: ```bash curl -X GET
'http://localhost:8080/api/v1/main/namespaces/company.team/files?path=example.txt' ``` which returns: ```plaintext Hello, World! ``` Upload files using the [POST endpoint](https://kestra.io/docs/api-reference/open-source#post-/api/v1/namespaces/-namespace-/files) `/api/v1/main/namespaces/{namespace}/files`. The example below uploads `api_example.py` with the following content: ```python import requests r = requests.get("https://kestra.io") print(r.status_code) ``` Run: ```bash curl -X POST 'http://localhost:8080/api/v1/main/namespaces/company.team/files?path=api_example.py' -H "Content-Type: multipart/form-data" -F "fileContent=@api_example.py" ``` :::alert{type="info"} Ensure `fileContent` has the correct path to your file. ::: After the upload, the file appears in the Namespace editor: ![upload_file](./upload_file.png) --- # Use Azure Managed Workload Identity with Kestra URL: https://kestra.io/docs/how-to-guides/azure-workload-id > Configure Azure Workload Identity on Kestra Enterprise to securely access Azure resources like Key Vault without managing secrets. How to use Azure Workload Identity to provide access to resources such as Azure Key Vault in Kestra. :::alert{type="info"} This page is only relevant for the Enterprise Edition of Kestra. For Cloud-based secret manager integrations, contact us at sales@kestra.io or chat with us in our [Slack community](https://kestra.io/slack). ::: ## Prerequisites To follow this guide, you will need the following: 1. [Kestra Enterprise Edition](https://kestra.io/docs/enterprise) 2. An Azure account 3. [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/) installed 4. Kubernetes tools (kubectl & helm) 5.
Permissions to provision the following: - [AKS Cluster](https://azure.microsoft.com/en-us/products/kubernetes-service/) - [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/) - [User-assigned managed identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview) This guide is based on the official Azure documentation on Workload Identity — it's best to read [this Azure guide](https://learn.microsoft.com/en-us/azure/aks/workload-identity-deploy-cluster) first for full context. Here, we'll focus on enabling this feature in Kestra. ## Variables Define the following variables and update them to match your environment. ```shell ## Managed User Identity Name ID_NAME="kestra-managed-user" ## Azure Resource Group RESOURCE_GROUP="demo" ## Physical location you wish to provision resources LOCATION="eastus" ## The name of your Azure Kubernetes Cluster AKS_NAME="demo-cluster" ## The name of your Azure Key Vault KEYVAULT_NAME="my-demo-vault" ## The name you wish to provide to the Kubernetes Service Account linked to the managed identity SERVICE_ACCOUNT_NAME="kestra-sa" ## The namespace to deploy the service account. Use the same location as your Kestra deployment SERVICE_ACCOUNT_NAMESPACE="default" ## The Federated ID credential for linking the OIDC issuer to the service account FEDERATED_IDENTITY_CREDENTIAL_NAME="kestra-fed-cred" ``` ## Create the resources First, create the following main resources: 1. The Key Vault 2. The Managed Identity 3. The AKS cluster. Once these have been provisioned, there are several identifiers we must capture for later use. ### Azure Key Vault This creates an Azure Key Vault. By default this will be created with RBAC (role-based access control) enabled which is the recommended configuration. 
```shell az keyvault create \ --name $KEYVAULT_NAME \ --resource-group $RESOURCE_GROUP \ --location $LOCATION ``` ### Managed Identity This creates the user-assigned managed identity used to provision access to resources within the Kubernetes cluster. ```shell az identity create --name $ID_NAME \ --resource-group $RESOURCE_GROUP ``` ### AKS Cluster ```shell az aks create \ --resource-group $RESOURCE_GROUP \ --name $AKS_NAME \ --enable-oidc-issuer \ --enable-workload-identity \ --node-count 1 \ --generate-ssh-keys ``` ### Setting identifiers from new resources Once all the above have been created, capture the following information in variables for later use: ```shell OBJECT_ID=$(az identity show --name $ID_NAME --resource-group $RESOURCE_GROUP --query 'principalId' --output tsv) MANAGED_CLIENT_ID=$(az identity show --name $ID_NAME --resource-group $RESOURCE_GROUP --query clientId --output tsv) AKS_OIDC_ISSUER="$(az aks show --name "${AKS_NAME}" --resource-group "${RESOURCE_GROUP}" --query "oidcIssuerProfile.issuerUrl" --output tsv)" ``` ## Link Identity Resources One of the more challenging aspects of this setup is correctly linking together the various resources. This section covers how to tie the managed identity to the resources to allow access by the Kestra application. ### Create a role assignment for the managed identity This is one of the most critical steps, as it sets the permissions the managed identity has on the Key Vault. Because Kestra needs to read and write secrets in the vault, the "Key Vault Secrets Officer" role provides least-privilege access for this operation. Further details on this role can be found [in Azure's RBAC guide](https://learn.microsoft.com/en-us/azure/key-vault/general/rbac-guide?tabs=azure-cli#azure-built-in-roles-for-key-vault-data-plane-operations).
```shell az role assignment create \ --assignee-object-id $OBJECT_ID \ --role "Key Vault Secrets Officer" \ --scope $(az keyvault show --name $KEYVAULT_NAME --query id -o tsv) ``` ### Create the service account in the AKS Cluster First, we must switch context to the newly created AKS cluster: ```shell az aks get-credentials --resource-group $RESOURCE_GROUP --name $AKS_NAME ``` Next, create a service account in the same namespace where you deploy Kestra, annotated with the client ID of the managed identity (as described in the linked Azure workload identity guide):

```shell
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: "${MANAGED_CLIENT_ID}"
  name: "${SERVICE_ACCOUNT_NAME}"
  namespace: "${SERVICE_ACCOUNT_NAMESPACE}"
EOF
```

Finally, create the federated identity credential that links the cluster's OIDC issuer to the service account: ```shell az identity federated-credential create \ --name $FEDERATED_IDENTITY_CREDENTIAL_NAME \ --identity-name $ID_NAME \ --resource-group $RESOURCE_GROUP \ --issuer $AKS_OIDC_ISSUER \ --subject system:serviceaccount:$SERVICE_ACCOUNT_NAMESPACE:$SERVICE_ACCOUNT_NAME \ --audience api://AzureADTokenExchange ``` --- > Set up a local Ceph cluster using cephadm and expose it to Kestra via MinIO Gateway for S3-compatible object storage. This guide demonstrates how to deploy a local Ceph cluster using [`cephadm`](https://docs.ceph.com/en/latest/cephadm/) and expose an S3-compatible endpoint (Rados Gateway). MinIO will act as a gateway to Ceph, and Kestra will continue to use MinIO as its object storage. --- :::alert{type="warning"} This guide is intended for **local testing only**. It sets up a single-node Ceph cluster using `cephadm` and exposes it via MinIO in gateway mode. This configuration is **not suitable for production** use. ::: ## Install `cephadm` Install `cephadm` and dependencies: ```sh curl --silent --remote-name https://download.ceph.com/keys/release.asc gpg --no-default-keyring --keyring ./ceph-release.gpg --import release.asc sudo apt update sudo apt install cephadm ``` Verify installation: ```sh cephadm version ``` 🔗 [Full installation reference](https://docs.ceph.com/en/latest/cephadm/install/#installing-cephadm) --- ## Enable SSH locally `cephadm` uses SSH to manage hosts, even in local single-node setups.
Ensure `sshd` is running: ```sh sudo apt install openssh-server sudo systemctl enable ssh sudo systemctl start ssh ``` Test the connection: ```sh ssh root@localhost ``` --- ## Bootstrap the Ceph Cluster Use `--mon-ip 127.0.0.1` and skip network autodetection: ```sh sudo cephadm bootstrap --mon-ip 127.0.0.1 --skip-mon-network ``` This sets up: - MON, MGR - SSH key for managing the host - Admin keyring --- ### 📋 Check Ceph status ```sh sudo cephadm shell -- ceph -s ``` > The `ceph` CLI is only available inside the `cephadm` shell. --- ## Enable Rados Gateway (S3 endpoint) Ceph RGW provides a fully compatible S3 interface. First, find your actual hostname: ```sh hostname ``` Then deploy RGW on that hostname (e.g., `kestra`): ```sh sudo cephadm shell -- ceph orch apply rgw default kestra ``` :::alert{type="warning"} The second argument **must match your system's hostname**. Using `default` or a wrong name will result in an `Unknown hosts` error. ::: Verify RGW is running: ```sh sudo cephadm shell -- ceph orch ps ``` Look for a line like: ```plaintext rgw.default.kestra.xxxxxx kestra *:80 running (...) ``` Confirm RGW is listening: ```sh ss -tuln | grep ':80' ``` --- ## Create a Ceph S3 User Generate credentials for MinIO to use: ```sh sudo cephadm shell -- radosgw-admin user create --uid="demo" --display-name="Demo User" ``` Copy the `access_key` and `secret_key` from the output. --- ## Connect MinIO to Ceph (Gateway Mode) MinIO proxies all S3 requests to Ceph RGW. ### `docker-compose.yml` ```yaml version: '3.8' services: minio: image: minio/minio:latest container_name: minio-ceph-gateway command: gateway s3 http://host.docker.internal:80 environment: MINIO_ROOT_USER: ABCDEF1234567890 MINIO_ROOT_PASSWORD: abc/xyz890foobar== ports: - "9000:9000" restart: always ``` > Replace `MINIO_ROOT_USER` and `MINIO_ROOT_PASSWORD` with the credentials from the RGW user you just created. 
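If you script this credential handoff, you can pull the access and secret keys straight out of the JSON that `radosgw-admin user create` prints. A minimal sketch, assuming the output contains a `keys` array with `access_key`/`secret_key` fields (the JSON below is an illustrative, truncated example — real output has more fields):

```python
import json

# Illustrative (truncated) radosgw-admin JSON output with the sample credentials from this guide.
radosgw_output = json.loads("""
{
  "user_id": "demo",
  "display_name": "Demo User",
  "keys": [
    {"user": "demo", "access_key": "ABCDEF1234567890", "secret_key": "abc/xyz890foobar=="}
  ]
}
""")

# The first entry in "keys" holds the S3 credentials MinIO should run with.
key = radosgw_output["keys"][0]
print(f"MINIO_ROOT_USER={key['access_key']}")
print(f"MINIO_ROOT_PASSWORD={key['secret_key']}")
```

In practice you would pipe the real `radosgw-admin` output into such a script (or `jq`) instead of hard-coding it.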
--- ## Validate with MinIO Client ```sh mc alias set ceph http://localhost:9000 ABCDEF1234567890 abc/xyz890foobar== mc mb ceph/kestra-bucket mc ls ceph ``` --- ## Use in Kestra (no changes) Your existing `application-psql.yml` remains valid: ```yaml storage: type: minio minio: endpoint: localhost port: 9000 bucket: kestra-bucket access-key: ABCDEF1234567890 secret-key: abc/xyz890foobar== ``` Kestra will talk to MinIO as usual, and MinIO will write to Ceph transparently. --- ## Test with a Flow ```yaml id: ceph_test_flow namespace: company.team tasks: - id: py_outputs type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest outputFiles: - ceph-output.json script: | import json from kestra import Kestra data = {'message': 'stored in Ceph'} Kestra.outputs(data) with open('ceph-output.json', 'w') as f: json.dump(data, f) ``` Validate the output: ```sh mc cat ceph/kestra-bucket/main/company/team/ceph_test_flow/... ``` Expected: ```json {"message": "stored in Ceph"} ``` --- ## Cleanup a Broken Cluster If the bootstrap process fails and the cluster is partially created, you can remove it with: ```sh sudo cephadm rm-cluster --force --zap-osds --fsid ``` 📚 Docs: [Purging a cluster](https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster) --- ## References - 🧰 [cephadm Install Guide](https://docs.ceph.com/en/latest/cephadm/install/) - 🔐 [RGW User Management](https://docs.ceph.com/en/latest/radosgw/admin/#user-management) - 🎯 [MinIO Gateway S3](https://docs.min.io/docs/minio-gateway-for-s3.html) --- You now have a local Ceph cluster backing MinIO for object storage, and Kestra continues to function without any change in configuration. --- # Use Cloudflare R2 with MinIO Gateway for Kestra URL: https://kestra.io/docs/how-to-guides/cloudflare-r2 > Configure Cloudflare R2 as an S3-compatible object storage backend for Kestra using MinIO Gateway. 
This guide demonstrates how to use **Cloudflare R2** as an object storage backend through an S3-compatible interface, exposed to **Kestra** via a **MinIO Gateway**. This setup enables Kestra to continue using S3 storage without requiring configuration changes. --- :::alert{type="warning"} This guide assumes that **MinIO runs locally in gateway mode** to access Cloudflare R2. It is intended for **local development and QA environments**, and is **not optimized for production deployments**. ::: ## Create an R2 Bucket Log into [Cloudflare Dashboard](https://dash.cloudflare.com/) and create a new R2 bucket: 1. Navigate to **R2 → Create Bucket** 2. Choose a name like `kestra-bucket` --- ## Generate Access Keys Go to **API Tokens → R2 Keys** and create a new key pair: - `access_key_id`: Your user access key - `secret_access_key`: Your secret key Be sure to save these credentials securely. --- ## Retrieve the R2 Endpoint Cloudflare R2 provides a static S3-compatible endpoint: ```plaintext https://<ACCOUNT_ID>.r2.cloudflarestorage.com ``` Replace `<ACCOUNT_ID>` with your Cloudflare account ID, found in the R2 dashboard. --- ## Set Up MinIO Gateway to R2 MinIO will act as a gateway, forwarding all S3 traffic to Cloudflare R2. ### `docker-compose.yml` ```yaml version: '3.8' services: minio: image: minio/minio:latest container_name: minio-r2-gateway command: gateway s3 https://<ACCOUNT_ID>.r2.cloudflarestorage.com environment: MINIO_ROOT_USER: <ACCESS_KEY_ID> MINIO_ROOT_PASSWORD: <SECRET_ACCESS_KEY> ports: - "9000:9000" restart: always ``` > Replace `<ACCOUNT_ID>`, `<ACCESS_KEY_ID>`, and `<SECRET_ACCESS_KEY>` with your actual Cloudflare and access values.
--- ## Validate Setup with MinIO Client Install the [MinIO Client (mc)](https://min.io/docs/minio/linux/reference/minio-mc.html): ```sh mc alias set r2 http://localhost:9000 <ACCESS_KEY_ID> <SECRET_ACCESS_KEY> mc mb r2/kestra-bucket mc ls r2 ``` --- ## Configure Kestra (No Changes Required) Since Kestra supports MinIO-compatible S3 endpoints, no changes to your configuration are required: ```yaml storage: type: minio minio: endpoint: localhost port: 9000 bucket: kestra-bucket access-key: <ACCESS_KEY_ID> secret-key: <SECRET_ACCESS_KEY> ``` Kestra will interact with MinIO, which in turn proxies to R2. --- ## Test with a Flow ```yaml id: r2_test_flow namespace: company.team tasks: - id: write_output type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest outputFiles: - r2-output.json script: | import json from kestra import Kestra data = {'message': 'stored in R2'} Kestra.outputs(data) with open('r2-output.json', 'w') as f: json.dump(data, f) ``` Then, verify the file was stored correctly using: ```sh mc cat r2/kestra-bucket/main/company/team/r2_test_flow/... ``` Expected output: ```json {"message": "stored in R2"} ``` --- ## References - 🌩️ [Cloudflare R2 Docs](https://developers.cloudflare.com/r2/) - 🔐 [R2 Access Keys](https://developers.cloudflare.com/api/) - 🧰 [MinIO Gateway for S3](https://min.io/docs/minio/linux/gateway/s3.html) --- You now have Cloudflare R2 configured as your object storage backend for Kestra, fully integrated via MinIO Gateway. --- # Add Conditional Branching in Kestra URL: https://kestra.io/docs/how-to-guides/conditional-branching > Master conditional branching in Kestra workflows using the Switch task to direct execution paths based on dynamic input values. How to use the Switch task to branch the flow based on a value. Conditional branching is a process in which the execution of a task is directed along different paths based on specific values.
In a flow, it allows for decision-making, where different tasks are executed depending on the value provided. This guide shows how to use Kestra's `Switch` task to control your flow based on a value. Depending on the value passed, the flow branches to different task `cases`. If there is no matching value, Kestra uses the `defaults` branch. ## Prerequisites Before you begin: - Deploy [Kestra](../../02.installation/index.mdx) in your preferred development environment. - Ensure you have a [basic understanding of how to run Kestra flows.](../../03.tutorial/index.mdx) ## Example 1: Conditional Branching with Input Strings This flow template serves as an introductory example to understand how the `Switch` task works within Kestra. The flow dynamically branches to different tasks depending on the input string. To see the flow in action, define the `switch` flow as shown below: ```yaml id: switch namespace: company.team inputs: - id: string type: STRING tasks: - id: parent-seq type: io.kestra.plugin.core.flow.Switch value: "{{inputs.string}}" cases: FIRST: - id: first type: io.kestra.plugin.core.debug.Return format: "{{task.id}} > {{taskrun.startDate}}" SECOND: - id: second1 type: io.kestra.plugin.core.debug.Return format: "{{task.id}} > {{taskrun.startDate}}" - id: second2 type: io.kestra.plugin.core.debug.Return format: "{{task.id}} > {{taskrun.startDate}}" THIRD: - id: third1 type: io.kestra.plugin.core.flow.Sequential tasks: - id: failed type: io.kestra.plugin.core.execution.Fail errors: - id: error1 type: io.kestra.plugin.core.debug.Return format: "Error Trigger ! {{task.id}}" defaults: - id: default type: io.kestra.plugin.core.debug.Return format: "{{task.id}} > {{taskrun.startDate}}" outputs: - id: extracted type: STRING value: "{{ outputs.first ?? outputs.second1 ?? outputs.third1 ?? outputs.default }}" ``` Save and execute the `switch` flow. You can input `FIRST`, `SECOND`, `THIRD`, or any other input string to see the flow in action. 
When executed, the above flow runs one branch of tasks based on the input string you provide. Within the flow: - `inputs`: Takes a string input to determine which case to execute. - `tasks`: Handles the input string with the following values: - `id: parent-seq`: Uses the `Switch` task to evaluate the input string and execute the matching case: - `FIRST`: Executes task `first`, returning its ID and start time. - `SECOND`: Executes two tasks (`second1` and `second2`), both returning their task ID and start time. - `THIRD`: Runs a sequence of tasks where the `failed` task triggers an error and executes the `error1` task, which logs an error message. - `defaults`: If the input doesn't match any cases, it runs the `default` task and logs the task ID and start time. - `outputs`: Extracts and logs the output from one of the executed tasks (`first`, `second1`, `third1`, or `default`). ## Example 2: Conditional Branching with Kestra’s Website Status To see the `Switch` task in action without an input string, we’ll create a flow that makes a decision based on the status of an HTTP request to Kestra’s website. To follow along, define the `kestra-http-switch` flow as described below: ```yaml id: kestra-http-switch namespace: company.team tasks: - id: check_kestra_site type: io.kestra.plugin.scripts.python.Script outputFiles: - site_status.txt script: | import requests response = requests.head("https://kestra.io") with open('site_status.txt', 'w') as f: f.write(str(response.status_code)) - id: decide_site_status type: io.kestra.plugin.core.flow.Switch value: "{{ read(outputs.check_kestra_site.outputFiles['site_status.txt']) }}" cases: "200": - id: log-site-up type: io.kestra.plugin.core.log.Log message: "Kestra website is up and running. Status: 200" "404": - id: log-site-down type: io.kestra.plugin.core.log.Log message: "Kestra website not found.
Status: 404" defaults: - id: unknown-status type: io.kestra.plugin.core.log.Log message: "Received unexpected status code: {{ read(outputs.check_kestra_site.outputFiles['site_status.txt']) }}" outputs: - id: status-output type: STRING value: "{{ outputs.log-site-up ?? outputs.log-site-down ?? outputs.unknown-status }}" ``` Save and execute the `kestra-http-switch` flow. When executed, the above flow checks the status of Kestra’s website and logs a message depending on the response code returned. Within the flow: - `tasks`: Handles the status check of the Kestra website with the following tasks: - `id: check_kestra_site`: Executes a Python script to send a HEAD request to the Kestra website and writes the HTTP status code to a `site_status.txt` file. - `id: decide_site_status`: Utilizes the `Switch` task to evaluate the HTTP status code from the `check_kestra_site` task: - If the status code is `"200"`, it logs a message indicating the site is up. - If the status code is `"404"`, it logs a message indicating the site is not found. - If an unexpected status code is received, it falls back to the `defaults` branch with a message indicating unknown status. - `outputs`: Extracts and logs the output status message based on the logs generated from the `Switch` task. ## Next Steps You have implemented conditional branching with the `Switch` task, using the `switch` flow to check your input strings and the `kestra-http-switch` flow to check Kestra’s website status. The `Switch` task can further be applied across various use cases to support your flows.
Further resources about the `Switch` task: - [Kestra’s official Switch task plugin documentation](/plugins/core/flow/io.kestra.plugin.core.flow.switch) - [Kestra’s Blueprint Switch task use cases](/blueprints/switch) --- # Build a Custom Plugin for Kestra URL: https://kestra.io/docs/how-to-guides/custom-plugin > Learn how to build, package, and test a custom Kestra plugin in Java to extend Kestra's capabilities for your specific needs. Build your own Custom Plugin for Kestra. This tutorial walks through building a custom plugin for Kestra. ## Use-case for Custom Plugin We will be building a plugin that fetches the data for a given pokemon. We will use the API provided by [PokeAPI.co](https://pokeapi.co/) to fetch the pokemon's details: `https://pokeapi.co/api/v2/pokemon/{pokemon_name}`. The API provides detailed information about any pokemon. We will fetch a few fields like the ability names, base experience, height, and move names, and showcase them in the output of our plugin. The plugin task will accept the pokemon name and return the selected fields in the output. This is how the task should look: ```yaml id: fetch_details type: io.kestra.plugin.pokemon.Fetch pokemon: pikachu ``` ## Requirements You will need the following installed on your machine before proceeding: * [Java](https://java.com) 21 or later. * [IntelliJ IDEA](https://www.jetbrains.com/idea/) (or any other Java IDE; we only provide help for IntelliJ IDEA). * [Gradle](https://gradle.org/) (included most of the time with the IDE). #### Create a new plugin Here are the steps: 1. Go to the [plugin-template](https://github.com/kestra-io/plugin-template) repository. 2. Click on *Use this template*. 3. Choose the GitHub account you want to link and the repository name for the new plugin. 4. Clone the new repository: `git clone git@github.com:{{user}}/{{name}}.git`. 5. Open the cloned directory in IntelliJ IDEA. 6.
Enable [annotations processors](https://www.jetbrains.com/help/idea/annotation-processors-support.html). 7. If you are using IntelliJ IDEA older than 2020.3, install the [Lombok plugin](https://plugins.jetbrains.com/plugin/6317-lombok); in newer versions, it is included by default. Once you have completed the steps above, you should see a similar directory structure: ![Structure](../../plugin-developer-guide/00.setup/plugins-architecture.png) As you can see, there is one generated plugin: the `Example` class representing the `Example` plugin (a task). A project typically hosts multiple plugins. We call a project a group of plugins, and you can have multiple sub-groups inside a project by splitting plugins into different packages. Each package that has a plugin class is a sub-group of plugins. ## Gradle Configuration We use [Gradle](https://gradle.org/) as a build tool. ### Mandatory configuration The first thing we need to configure is the plugin name and the class package. 1. In `settings.gradle`, replace `rootProject.name = 'plugin-template'` with the plugin name `rootProject.name = 'plugin-pokemon'`. 2. Change the class package: by default, the template provides the package `io.kestra.plugin.templates`. Rename the `templates` folder in `src/main/java` & `src/test/java` to `pokemon`, and change the first line in `Example.java`, `ExampleRunnerTest.java`, and `ExampleTest.java` to `package io.kestra.plugin.pokemon;`. 3. In `build.gradle`: a. replace `description 'Plugin template for Kestra'` with `description 'Plugin pokemon for Kestra'`. b. In the `dependencies` section, add a dependency that we will use in our plugin task: `implementation group: 'com.googlecode.json-simple', name: 'json-simple', version: '1.1.1'` c.
Change the `jar` section to the following: ```groovy jar { manifest { attributes( "X-Kestra-Name": project.name, "X-Kestra-Title": "Pokemon", "X-Kestra-Group": project.group + ".pokemon", "X-Kestra-Description": project.description, "X-Kestra-Version": project.version ) } } ``` ## Develop Fetch Task ### Create Pokemon class In `src/main/java/io/kestra/plugin/pokemon`, we will create a new class `Pokemon.java`. This will be used to map the JSON output of the pokemon API to the Java class. We only need to add the fields that we are interested in, and ignore the rest. :::collapse{title="Here is how the Pokemon.java file should look"} ```java package io.kestra.plugin.pokemon; import java.util.*; import lombok.*; @Data public class Pokemon { List<DetailedAbility> abilities; long base_experience; long height; List<DetailedMove> moves; } @Data class DetailedAbility { Ability ability; } @Data class Ability { String name; } @Data class DetailedMove { Move move; } @Data class Move { String name; } ``` ::: ### Runnable Task We will rename the Java file from `Example.java` to `Fetch.java`. In this file, we will put in the appropriate schema for the plugin, including the inputs and output of the plugin. This will help us generate documentation for the plugin too. Also, we will include a few examples to help users understand how to use the plugin. The class should extend `Task` and implement `RunnableTask` for it to be considered a plugin task. `RunnableTask` takes a generic type representing the output class. The output class should implement `io.kestra.core.models.tasks.Output`. The crux of the task logic resides in the `run` method. This method, defined by the `RunnableTask` interface, takes a `RunContext` as an argument and returns an instance of the `Output` class. In the `run` method, we use the name of the pokemon, and make a call to the pokemon API. The fetched response is then mapped to the Pokemon class using the `ObjectMapper`.
The resulting `Pokemon` object is then transformed into the `Fetch.Output` class and returned. :::collapse{title="Here is a pokemon Fetch task that will fetch the details of a given pokemon"} ```java package io.kestra.plugin.pokemon; import io.kestra.core.models.annotations.Plugin; import io.swagger.v3.oas.annotations.media.Schema; import lombok.*; import lombok.experimental.SuperBuilder; import org.apache.commons.lang3.StringUtils; import io.kestra.core.models.annotations.PluginProperty; import io.kestra.core.models.tasks.RunnableTask; import io.kestra.core.models.tasks.Task; import io.kestra.core.runners.RunContext; import org.slf4j.Logger; import com.fasterxml.jackson.databind.ObjectMapper; import com.fasterxml.jackson.databind.DeserializationFeature; import java.io.*; import java.net.*; import java.util.*; import io.kestra.plugin.pokemon.Pokemon; @SuperBuilder @ToString @EqualsAndHashCode @Getter @NoArgsConstructor @Schema( title = "Fetch the pokemon details.", description = "Fetches all the details about the given pokemon." ) @Plugin( examples = { @io.kestra.core.models.annotations.Example( title = "Fetching the details for pikachu", code = { "pokemon: pikachu" } ) } ) public class Fetch extends Task implements RunnableTask<Fetch.Output> { @Schema( title = "Name of the pokemon.", description = "Name of the pokemon for which details need to be fetched."
) @PluginProperty(dynamic = true) // The property can be rendered with {{ }} template expressions @Builder.Default private String pokemon = "pikachu"; @Override public Fetch.Output run(RunContext runContext) throws Exception { Logger logger = runContext.logger(); ObjectMapper om = new ObjectMapper().configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false); String inputPokemon = runContext.render(pokemon); StringBuilder result = new StringBuilder(); URL url = new URL("https://pokeapi.co/api/v2/pokemon/" + inputPokemon); HttpURLConnection conn = (HttpURLConnection) url.openConnection(); conn.setRequestMethod("GET"); try (BufferedReader reader = new BufferedReader( new InputStreamReader(conn.getInputStream()))) { for (String line; (line = reader.readLine()) != null; ) { result.append(line); } } Pokemon pokemonObject = om.readValue(result.toString(), Pokemon.class); List<String> abilities = new ArrayList<>(); for(DetailedAbility detailedAbility: pokemonObject.abilities) { abilities.add(detailedAbility.ability.name); } List<String> moves = new ArrayList<>(); for(DetailedMove detailedMove: pokemonObject.moves) { moves.add(detailedMove.move.name); } return Output.builder() .abilities(abilities) .baseExperience(pokemonObject.base_experience) .height(pokemonObject.height) .moves(moves) .build(); } /** * Input or Output can be nested as you need */ @Builder @Getter public static class Output implements io.kestra.core.models.tasks.Output { @Schema( title = "Abilities of the pokemon." ) private final List<String> abilities; @Schema( title = "Base experience of the pokemon." ) private final long baseExperience; @Schema( title = "Height of the pokemon." ) private final long height; @Schema( title = "Moves of the pokemon." ) private final List<String> moves; } } ``` ::: ### Compile the plugin Now that the plugin is developed, package and test it on a Kestra instance. Use the included Gradle task to build the plugin.
To build your plugin, execute the `./gradlew shadowJar` command from the plugin directory. The resulting JAR file will be generated in the `build/libs` directory. To use this plugin in your Kestra instance, add this JAR to the [Kestra plugins path](../../kestra-cli/kestra-server/index.md#plugin-commands). ## Writing unit tests Rename the files `ExampleRunnerTest.java` and `ExampleTest.java` to `FetchRunnerTest.java` and `FetchTest.java` respectively. Under the `src/test/resources/flows` folder, rename `example.yaml` to `pokemonFetch.yaml`. Use the following flow in `pokemonFetch.yaml`: :::collapse{title="Contents of pokemonFetch.yaml"} ```yaml id: pokemonFetch namespace: company.team tasks: - id: fetch-pikachu type: io.kestra.plugin.pokemon.Fetch pokemon: "pikachu" - id: fetch-gengar type: io.kestra.plugin.pokemon.Fetch pokemon: "gengar" ``` ::: Update `FetchRunnerTest.java` to load `pokemonFetch.yaml` and run the flow, then assert that all tasks were executed. :::collapse{title="Contents of FetchRunnerTest.java"} ```java package io.kestra.plugin.pokemon; import io.kestra.core.junit.annotations.KestraTest; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; import io.kestra.core.models.executions.Execution; import io.kestra.core.repositories.LocalFlowRepositoryLoader; import io.kestra.core.runners.RunnerUtils; import io.kestra.core.runners.StandAloneRunner; import jakarta.inject.Inject; import java.io.IOException; import java.net.URISyntaxException; import java.util.Map; import java.util.Objects; import java.util.concurrent.TimeoutException; import static org.hamcrest.MatcherAssert.assertThat; import static org.hamcrest.Matchers.hasSize; import static org.hamcrest.Matchers.is; /** * This test will load all flows located in `src/test/resources/flows/` * and will run an in-memory runner to be able to test a full flow.
 * There is also a configuration file in `src/test/resources/application.yml` that is used only by the
 * full runner test to configure the in-memory runner.
 */
@KestraTest
class FetchRunnerTest {
    @Inject
    protected StandAloneRunner runner;

    @Inject
    protected RunnerUtils runnerUtils;

    @Inject
    protected LocalFlowRepositoryLoader repositoryLoader;

    @BeforeEach
    protected void init() throws IOException, URISyntaxException {
        repositoryLoader.load(Objects.requireNonNull(FetchRunnerTest.class.getClassLoader().getResource("flows")));
        this.runner.run();
    }

    @SuppressWarnings("unchecked")
    @Test
    void flow() throws TimeoutException {
        Execution execution = runnerUtils.runOne(null, "company.team", "pokemonFetch");

        assertThat(execution.getTaskRunList(), hasSize(2));
    }
}
```
:::

Test the plugin logic in `FetchTest.java` by creating the input, invoking the task, and verifying the output.

:::collapse{title="Contents of FetchTest.java"}
```java
package io.kestra.plugin.pokemon;

import com.google.common.collect.ImmutableMap;
import io.micronaut.test.extensions.junit5.annotation.MicronautTest;
import org.apache.commons.lang3.StringUtils;
import org.junit.jupiter.api.Test;
import io.kestra.core.runners.RunContext;
import io.kestra.core.runners.RunContextFactory;
import jakarta.inject.Inject;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;

/**
 * This test only covers the main task. It allows you to send any input
 * parameters to the task and test its output easily.
 */
@MicronautTest
class FetchTest {
    @Inject
    private RunContextFactory runContextFactory;

    @Test
    void run() throws Exception {
        RunContext runContext = runContextFactory.of(ImmutableMap.of("variable", "gengar"));

        Fetch task = Fetch.builder()
            .pokemon("{{ variable }}")
            .build();

        Fetch.Output runOutput = task.run(runContext);

        assertThat(runOutput.getBaseExperience(), is(250L));
        assertThat(runOutput.getHeight(), is(15L));
        assertThat(runOutput.getAbilities().size(), is(1));
        assertThat(runOutput.getMoves().size(), is(123));
    }
}
```
:::

### Running the tests

You can run the tests from IntelliJ IDEA, or from the terminal using the command:

```bash
./gradlew test
```

## Plugin in Action

Now that the plugin is developed and tested, it's time to see the plugin in action.

### Use a custom docker image with your plugin

Add this `Dockerfile` to the root of your plugin project:

```dockerfile
FROM kestra/kestra:develop
COPY build/libs/* /app/plugins/
```

Assuming you are in the root directory of your plugin, you can build and run the image with the following command:

```bash
./gradlew shadowJar && docker build -t kestra-custom . && docker run --rm -p 8080:8080 kestra-custom server local
```

You can now navigate to http://localhost:8080 and start using your custom plugin.

### Execute the plugin and check the Output

Create a new flow, and use this newly-built plugin's task in the flow. Here is a sample flow:

```yaml
id: pokemonFetch
namespace: company.team

tasks:
  - id: fetch-pikachu
    type: io.kestra.plugin.pokemon.Fetch
    pokemon: "pikachu"
```

On executing the flow, navigate to the `Outputs` tab to view the output.

![custom_plugin_output](./custom_plugin_output.png)

You are now all set to build more plugins and explore Kestra to its fullest!

---

# Use Dataform in Kestra

URL: https://kestra.io/docs/how-to-guides/dataform

> Orchestrate DataForm transformations in Kestra. Schedule and run DataForm jobs in your data pipeline for reliable, version-controlled SQL-based data modeling.
Run transformations on BigQuery using Dataform in Kestra

Dataform is a modern data pipeline tool based on the Extract-Load-Transform (ELT) approach. It was acquired by Google Cloud and has been integrated into BigQuery. Like other ELT tools, Dataform handles the transformation step across different warehouses; supported data stores include BigQuery, Snowflake, and Redshift. One advantage of Dataform is that transformations are written in SQL, empowering roles like Data Analysts and Data Scientists to perform them. Being SQL-based also makes it easy for anyone to onboard onto Dataform.

## Using Dataform with Kestra

There are two ways to create a Dataform project for use with Kestra:

1. Create the Dataform project in GitHub, clone the GitHub project in Kestra, and then run it using the [DataformCLI](/plugins/plugin-dataform/cli/io.kestra.plugin.dataform.cli.dataformcli) task.
2. Create the Dataform project in Kestra using [Namespace Files](../../../docs/06.concepts/02.namespace-files/index.md), and then run it using the [DataformCLI](/plugins/plugin-dataform/cli/io.kestra.plugin.dataform.cli.dataformcli) task. You can later choose to push the Namespace Files to a GitHub repository using the [PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles) task.

This guide covers both methods for transforming data using Dataform in Kestra for BigQuery.
### Using GitHub repository

Here is how you can pull an existing project from a GitHub repository and run it with the DataformCLI task:

```yaml
id: dataform
namespace: company.team

tasks:
  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repo
        type: io.kestra.plugin.git.Clone
        url: https://github.com/dataform-co/dataform-example-project-bigquery

      - id: transform
        type: io.kestra.plugin.dataform.cli.DataformCLI
        beforeCommands:
          - npm install @dataform/core
          - dataform compile
        env:
          GOOGLE_APPLICATION_CREDENTIALS: "sa.json"
        inputFiles:
          sa.json: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
          .df-credentials.json: |
            {
              "projectId": "",
              "location": "us"
            }
        commands:
          - dataform run --dry-run
```

The `clone_repo` task pulls the repository with the Dataform project, and the `transform` task executes the Dataform project.

### Creating the Dataform project in Kestra

First, create and save the Kestra flow. The flow contains the following tasks:

1. HTTP Download task that downloads the `orders.csv` file from an HTTP URL.
2. BigQuery CreateTable task that creates the `orders` table in the `ecommerce` dataset.
3. BigQuery Load task that loads the `orders.csv` contents into the BigQuery `orders` table.
4. DataformCLI task that runs the Dataform project, created later using Namespace Files. The project creates the `stg_orders` BigQuery view based on the `orders` BigQuery table.
```yaml
id: dataform_project
namespace: company.team

tasks:
  - id: orders_http_download
    type: io.kestra.plugin.core.http.Download
    description: Download orders.csv using HTTP Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: create_orders_table
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    description: Create orders table in BigQuery
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    projectId:
    dataset: ecommerce
    table: orders
    tableDefinition:
      type: TABLE
      schema:
        fields:
          - name: order_id
            type: INT64
          - name: customer_name
            type: STRING
          - name: customer_email
            type: STRING
          - name: product_id
            type: INT64
          - name: price
            type: FLOAT64
          - name: quantity
            type: INT64
          - name: total
            type: FLOAT64

  - id: load_orders_table
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Load orders table with data from orders.csv
    from: "{{ outputs.orders_http_download.uri }}"
    projectId:
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    destinationTable: ".ecommerce.orders"
    format: CSV
    csvOptions:
      fieldDelimiter: ","
      skipLeadingRows: 1

  - id: dataform_cli
    type: io.kestra.plugin.dataform.cli.DataformCLI
    beforeCommands:
      - npm install @dataform/core
      - dataform compile
    namespaceFiles:
      enabled: true
    env:
      GOOGLE_APPLICATION_CREDENTIALS: "sa.json"
    inputFiles:
      sa.json: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
      .df-credentials.json: |
        {
          "projectId": "",
          "location": "us"
        }
    commands:
      - dataform run
```

Once the flow is saved, navigate to the Editor, and create a file `package.json` with the contents:

```json
{
  "dependencies": {
    "@dataform/core": "2.3.0"
  }
}
```

This file is not required for Kestra execution, as the dependency is installed via `beforeCommands`. It is required if you push the namespace files to a GitHub repository so you can run the project in other ways.

Next, create `dataform.json`.
```json
{
  "warehouse": "bigquery",
  "defaultSchema": "ecommerce",
  "defaultDatabase": "",
  "defaultLocation": "us"
}
```

Most often, the `database` is the same as the GCP project ID.

Create a `definitions` folder. Inside it, create a file `orders.sqlx` to define the `orders` table as the source table:

```javascript
config {
  type: "declaration",
  database: "",
  schema: "ecommerce",
  name: "orders",
  description: "raw orders table"
}
```

Next, create `stg_orders.sqlx` under the `definitions` folder to define the `stg_orders` view:

```javascript
config {
  type: "view", // Specify whether this model will create a table or a view
  schema: "ecommerce",
  database: ""
}

select
  order_id,
  customer_name,
  customer_email,
  product_id,
  price,
  quantity,
  total
from ${ref("orders")}
```

That's it! We are now ready to run the flow. Once the flow runs successfully, you can go to the BigQuery console, and ensure that the view `stg_orders` has been created.

This is how we can run Dataform for BigQuery in Kestra. These instructions can also help you integrate the DataformCLI task with other data stores like Snowflake, Redshift, Postgres, and more.

---

# Manage dbt Projects with Kestra's Code Editor

URL: https://kestra.io/docs/how-to-guides/dbt

> Clone dbt projects from Git, edit models in Kestra's Code Editor, run tests, and push changes back to Git for seamless dbt project management.

Edit dbt code from Kestra's Code Editor

Kestra's built-in Code Editor allows you to manage dbt projects by cloning the Git repository with the dbt code and uploading it to your Kestra namespace. You can make changes to the dbt models directly from the Kestra UI, test them as part of an end-to-end workflow, and push the changes to the desired Git branch when you are ready.
## Clone a dbt project from Git

This flow pulls a dbt project from Git and uploads it to Kestra as Namespace Files:

```yaml
id: upload_dbt_project
namespace: company.datateam.dbt

tasks:
  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: git_clone
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/dbt-example
        branch: master

      - id: upload
        type: io.kestra.plugin.core.namespace.UploadFiles
        namespace: "{{ flow.namespace }}"
        files:
          - "glob:**/dbt/**"
```

You can use this flow as an initial setup:

1. Add this flow in the Kestra UI
2. Save it
3. Execute the flow
4. Click on the `Files` sidebar in the code editor to view the uploaded dbt files.

![dbt-code-editor](./dbt-code-editor.png)

## Run dbt CLI commands

Create a flow that runs dbt CLI commands:

```yaml
id: dbt_build
namespace: company.datateam.dbt

inputs:
  - id: dbt_command
    type: SELECT
    allowCustomValue: true
    defaults: dbt build --project-dir dbt --profiles-dir dbt --no-partial-parse --target prod
    values:
      - dbt build --project-dir dbt --profiles-dir dbt --no-partial-parse --target prod
      - dbt build --project-dir dbt --profiles-dir dbt --no-partial-parse --target prod --select state:modified+ --defer --state ./target --target-path ./dev

tasks:
  - id: dbt
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    projectDir: dbt
    commands:
      - "{{ inputs.dbt_command }}"
    loadManifest:
      key: manifest.json
      namespace: "{{ flow.namespace }}"
    storeManifest:
      key: manifest.json
      namespace: "{{ flow.namespace }}"
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
```

The `namespaceFiles` property lets you run dbt commands on the files uploaded to the namespace. This allows you to test the dbt models without having to build the entire project every time. Execute the flow using the default value for the `dbt_command` input.

## Edit dbt files

You can now open the dbt files in the Code Editor and make changes as needed.
For example, add a new model `my_third_dbt_model.sql`:

```sql
select *
from {{ ref('my_first_dbt_model') }}
where id = 2
```

![dbt-code-editor](./dbt-code-editor-2.png)

When you now run the flow using the second dropdown value for the `dbt_command` input, only the new model will be built. This allows you to test the changes quickly and iterate faster.

## Push changes to Git

Once you are satisfied with the changes, you can push them to your desired branch of the same Git repository using the [PushNamespaceFiles](../pushnamespacefiles/index.md) task.

```yaml
id: push_dbt_to_git
namespace: company.datateam.dbt

inputs:
  - id: commit_message
    type: STRING
    defaults: "Changes to dbt from Kestra"

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushNamespaceFiles
    namespace: "{{ flow.namespace }}"
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/scripts
    branch: dev
    gitDirectory: dbt
    commitMessage: "{{ inputs.commit_message }}"
```

Adjust the `url`, `branch`, and `gitDirectory` properties to match your dbt Git repository structure. If the branch does not exist, it will be created. If you want to test this step more incrementally, you can set the `dryRun` property to `true` to validate the changes before committing them to Git.

---

# Use Debezium Tasks and Triggers in Kestra

URL: https://kestra.io/docs/how-to-guides/debezium

> Enable Change Data Capture (CDC) in your databases to use Debezium tasks and triggers in Kestra for real-time data ingestion.

To use Debezium tasks and triggers, perform the necessary database setup described below.

## Creating a user

A Debezium MySQL connector requires a MySQL user account. This MySQL user must have appropriate permissions on all databases for which the Debezium MySQL connector captures changes.

**Prerequisites**

- A MySQL server.
- Basic knowledge of SQL commands.

**Procedure**

1.
Create the MySQL user:

```sql
mysql> CREATE USER 'user'@'localhost' IDENTIFIED BY 'password';
```

2. Grant the required permissions to the user:

```sql
mysql> GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'user' IDENTIFIED BY 'password';
```

For a description of the required permissions, see [Descriptions of user permissions](https://debezium.io/documentation/reference/3.0/connectors/mysql.html#permissions-explained-mysql-connector).

:::alert{type="info"}
If using a hosted option such as Amazon RDS or Amazon Aurora that does not allow a global read lock, table-level locks are used to create the consistent snapshot. In this case, you need to also grant `LOCK TABLES` permissions to the user that you created. See [snapshots](https://debezium.io/documentation/reference/3.0/connectors/mysql.html#mysql-snapshots) for more details.
:::

3. Finalize the user's permissions:

```sql
mysql> FLUSH PRIVILEGES;
```

## Enabling the binlog

You must enable binary logging for MySQL replication. The binary logs record transaction updates in a way that enables replicas to propagate those changes.

**Prerequisites**

- A MySQL server.
- Appropriate MySQL user privileges.

**Procedure**

1. Check whether the `log-bin` option is enabled:

```sql
-- for MySQL 5.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM information_schema.global_variables WHERE variable_name='log_bin';
-- for MySQL 8.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
```

2. If the binlog is `OFF`, add the following properties to the configuration file for the MySQL server:

```ini
server-id = 223344 # Querying variable is called server_id, e.g. SELECT variable_value FROM information_schema.global_variables WHERE variable_name='server_id';
log_bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL
binlog_expire_logs_seconds = 864000
```

3.
Confirm your changes by checking the binlog status once more:

```sql
-- for MySQL 5.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM information_schema.global_variables WHERE variable_name='log_bin';
-- for MySQL 8.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
```

4. If you run MySQL on Amazon RDS, you must enable automated backups for your database instance for binary logging to occur. If the database instance is not configured to perform automated backups, the binlog is disabled, even if you apply the settings described in the previous steps.

## Enabling GTIDs

Global transaction identifiers (GTIDs) uniquely identify transactions that occur on a server within a cluster. Though not required for a Debezium MySQL connector, using GTIDs simplifies replication and enables you to more easily confirm if primary and replica servers are consistent. GTIDs are available in MySQL 5.6.5 and later. See the [MySQL documentation](https://dev.mysql.com/doc/refman/8.2/en/replication-options-gtids.html#option_mysqld_gtid-mode) for more details.

**Prerequisites**

- A MySQL server.
- Basic knowledge of SQL commands.
- Access to the MySQL configuration file.

**Procedure**

1. Enable `gtid_mode` in the MySQL configuration file:

```ini
gtid_mode=ON
```

2. Enable `enforce_gtid_consistency`:

```ini
enforce_gtid_consistency=ON
```

3. Confirm the changes:

```sql
mysql> show global variables like '%GTID%';
```

**Result**

```
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| enforce_gtid_consistency | ON    |
| gtid_mode                | ON    |
+--------------------------+-------+
```

## Configuring session timeouts

When an initial consistent snapshot is made for large databases, your established connection could time out while the tables are being read.
You can prevent this behavior by configuring `interactive_timeout` and `wait_timeout` in your MySQL configuration file.

**Prerequisites**

- A MySQL server.
- Basic knowledge of SQL commands.
- Access to the MySQL configuration file.

**Procedure**

1. Configure `interactive_timeout`:

```ini
interactive_timeout=
```

2. Configure `wait_timeout`:

```ini
wait_timeout=
```

## Enabling query log events

You might want to see the original SQL statement for each binlog event. Enabling the `binlog_rows_query_log_events` option in the MySQL configuration file allows you to do this. This option is available in MySQL 5.6 and later.

**Prerequisites**

- A MySQL server.
- Basic knowledge of SQL commands.
- Access to the MySQL configuration file.

**Procedure**

1. Enable `binlog_rows_query_log_events` in MySQL:

```ini
binlog_rows_query_log_events=ON
```

`binlog_rows_query_log_events` is set to a value that enables or disables support for including the original SQL statement in the binlog entry:

- `ON` = enabled
- `OFF` = disabled

## Validating binlog row value options

Verify the setting of the `binlog_row_value_options` variable in the database. To enable the connector to consume **UPDATE** events, this variable must be set to a value other than `PARTIAL_JSON`.

**Prerequisites**

- A MySQL server.
- Basic knowledge of SQL commands.
- Access to the MySQL configuration file.

**Procedure**

1. Check the current variable value:

```sql
mysql> show global variables where variable_name = 'binlog_row_value_options';
```

**Result**

```
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| binlog_row_value_options |       |
+--------------------------+-------+
```

If the value of the variable is set to `PARTIAL_JSON`, run the following command to unset it:

```sql
mysql> SET @@global.binlog_row_value_options="";
```

## Running Debezium tasks on MySQL

You are now all set to run the Debezium MySQL based tasks and triggers.
Here is an example flow using Debezium MySQL Realtime Trigger:

```yaml
id: debezium_mysql
namespace: company.team

tasks:
  - id: send_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.mysql.RealtimeTrigger
    serverId: 123456789
    hostname: 127.0.0.1
    port: 63306
    username: mysql_user
    password: mysql_passwd
```

Debezium MySQL Realtime Trigger collects the records from change data capture as and when they occur. The flow can then process these records as required.

## Debezium with PostgreSQL

For Debezium to work with PostgreSQL, you need to enable logical decoding through the write-ahead log (WAL) on the PostgreSQL server, i.e. set `wal_level` to `logical`. PostgreSQL's [logical decoding](https://www.postgresql.org/docs/current/static/logicaldecoding-explanation.html) feature was introduced in version 9.4. It is a mechanism that allows the extraction of the changes that were committed to the transaction log and the processing of these changes in a user-friendly manner.

### Local PostgreSQL Installation

Before using the PostgreSQL connector to monitor the changes committed on a PostgreSQL server, decide which logical decoding plug-in you intend to use. If you plan not to use the native pgoutput logical replication stream support, then you must install the logical decoding plug-in into the PostgreSQL server. Afterward, enable a replication slot, and configure a user with sufficient privileges to perform the replication.

If your database is hosted by a service such as [Heroku Postgres](https://www.heroku.com/postgres) you might be unable to install the plug-in. If so, and if you are using PostgreSQL 10+, you can use the pgoutput decoder support to capture changes in your database. If that is not an option, you are unable to use Debezium with your database.

### PostgreSQL in the Cloud

#### PostgreSQL on Amazon RDS

It is possible to capture changes in a PostgreSQL database that is running in [Amazon RDS](https://aws.amazon.com/rds/).
To do this: - Set the instance parameter `rds.logical_replication` to `1`. - Verify that the `wal_level` parameter is set to `logical` by running the query `SHOW wal_level` as the database RDS master user. This might not be the case in multi-zone replication setups. You cannot set this option manually. It is automatically changed when the `rds.logical_replication` parameter is set to `1`. If the `wal_level` is not set to `logical` after you make the preceding change, it is probably because the instance has to be restarted after the parameter group change. Restarts occur during your maintenance window, or you can initiate a restart manually. - Set the Debezium `plugin.name` parameter to `pgoutput`. - Initiate logical replication from an AWS account that has the `rds_replication` role. The role grants permissions to manage logical slots and to stream data using logical slots. By default, only the master user account on AWS has the `rds_replication` role on Amazon RDS. To enable a user account other than the master account to initiate logical replication, you must grant the account the rds_replication role. For example, `grant rds_replication to `. You must have `superuser` access to grant the `rds_replication` role to a user. To enable accounts other than the master account to create an initial snapshot, you must grant `SELECT` permission to the accounts on the tables to be captured. For more information about security for PostgreSQL logical replication, see the [PostgreSQL documentation](https://www.postgresql.org/docs/current/logical-replication-security.html). #### PostgreSQL on Azure It is possible to use Debezium with [Azure Database for PostgreSQL](https://docs.microsoft.com/azure/postgresql/), which has support for the `pgoutput` logical decoding plug-in, which is supported by Debezium. Set the Azure replication support to `logical`. 
You can use the [Azure CLI](https://docs.microsoft.com/en-us/azure/postgresql/concepts-logical#using-azure-cli) or the [Azure Portal](https://docs.microsoft.com/en-us/azure/postgresql/concepts-logical#using-azure-portal) to configure this. For example, to use the Azure CLI, here are the `az postgres server` commands that you need to execute: ```bash az postgres server configuration set --resource-group mygroup --server-name myserver --name azure.replication_support --value logical az postgres server restart --resource-group mygroup --name myserver ``` #### PostgreSQL on CrunchyBridge It is possible to use Debezium with [CrunchyBridge](https://crunchybridge.com/); logical replication is already turned on. The `pgoutput` plugin is available. You will have to create a replication user and provide correct privileges. :::alert{type="info"} While using the `pgoutput` plug-in, it is recommended that you configure `filtered` as the `publication.autocreate.mode`. If you use `all_tables`, which is the default value for `publication.autocreate.mode`, and the publication is not found, the connector tries to create one by using `CREATE PUBLICATION FOR ALL TABLES;`, but this fails due to lack of permissions. ::: ### Installing the logical decoding output plug-in :::alert{type="info"} For more detailed instructions about setting up and testing logical decoding plug-ins, see [Logical Decoding Output Plug-in Installation for PostgreSQL](https://debezium.io/documentation/reference/3.0/postgres-plugins.html). ::: Starting with PostgreSQL 9.4, the only way to read changes to the write-ahead-log is to install a logical decoding output plug-in. Plug-ins are written in C, compiled, and installed on the machine that runs the PostgreSQL server. Plug-ins use a number of PostgreSQL specific APIs, as described by the [PostgreSQL documentation](https://www.postgresql.org/docs/current/static/logicaldecoding-output-plugin.html). 
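If you use the built-in `pgoutput` plug-in, no server-side installation is required, but the connector still needs a user with replication privileges and, typically, a publication to stream from. Here is a minimal sketch with hypothetical role and publication names — adjust the schema and privileges to your own tables:

```sql
-- Hypothetical names; run as a superuser on the PostgreSQL server.
-- Create a login role that can initiate logical replication.
CREATE ROLE debezium WITH REPLICATION LOGIN PASSWORD 'dbz_passwd';

-- Allow the connector to read the tables it snapshots and captures.
GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium;

-- With pgoutput, changes are streamed through a publication.
CREATE PUBLICATION dbz_publication FOR ALL TABLES;
```

Creating the publication up front avoids the permission failure described in the `publication.autocreate.mode` note above.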
The PostgreSQL connector works with one of Debezium’s supported logical decoding plug-ins to receive change events from the database in either the [Protobuf format](https://github.com/google/protobuf) or the [pgoutput](https://github.com/postgres/postgres/blob/master/src/backend/replication/pgoutput/pgoutput.c) format. The `pgoutput` plugin comes out-of-the-box with the PostgreSQL database. For more details on using Protobuf via the `decoderbufs` plug-in, see the plug-in [documentation](https://github.com/debezium/postgres-decoderbufs/blob/main/README.md) which discusses its requirements, limitations, and how to compile it. For simplicity, Debezium also provides a container image based on the upstream PostgreSQL server image, on top of which it compiles and installs the plug-ins. You can [use this image](https://github.com/debezium/container-images/tree/main/postgres/13) as an example of the detailed steps required for the installation. :::alert{type="warning"} The Debezium logical decoding plug-ins have been installed and tested on only Linux machines. For Windows and other operating systems, different installation steps might be required. ::: ### Running Debezium tasks on PostgreSQL Once the WAL is enabled, you can run the Debezium PostgreSQL based tasks and triggers. Here is an example flow using Debezium PostgreSQL Realtime Trigger: ```yaml id: debezium_postgres namespace: company.team tasks: - id: send_data type: io.kestra.plugin.core.log.Log message: "{{ trigger.data }}" triggers: - id: realtime type: io.kestra.plugin.debezium.postgres.RealtimeTrigger database: postgres hostname: 127.0.0.1 port: 65432 username: postgres password: pg_passwd ``` Debezium PostgreSQL Realtime Trigger will collect the records from the change data capture as and when they occur. The flow can then process these records as required. 
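Because the trigger emits one execution per captured record, downstream tasks can branch on the record's content. Here is a minimal sketch assuming a hypothetical `status` column in the captured table — the keys available under `trigger.data` depend on the columns of your own tables:

```yaml
id: debezium_postgres_router
namespace: company.team

tasks:
  - id: route
    type: io.kestra.plugin.core.flow.If
    # `status` is a hypothetical column used for illustration
    condition: "{{ trigger.data.status == 'shipped' }}"
    then:
      - id: shipped
        type: io.kestra.plugin.core.log.Log
        message: "Shipped order captured: {{ trigger.data }}"
    else:
      - id: other
        type: io.kestra.plugin.core.log.Log
        message: "Other change captured: {{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.postgres.RealtimeTrigger
    database: postgres
    hostname: 127.0.0.1
    port: 65432
    username: postgres
    password: pg_passwd
```

The same pattern applies to the MySQL and SQL Server triggers shown in the other sections.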
## Using Debezium with Microsoft SQL Server For Debezium to capture change events from SQL Server tables, a SQL Server administrator with the necessary privileges must first run a query to enable CDC on the database. The administrator must then enable CDC for each table that you want Debezium to capture. :::alert{type="info"} By default, JDBC connections to Microsoft SQL Server are protected by SSL encryption. If SSL is not enabled for a SQL Server database, or if you want to connect to the database without using SSL, you can disable SSL by setting the value of the `database.encrypt` property in connector configuration to `false`. ::: After CDC is applied, it captures all of the `INSERT`, `UPDATE`, and `DELETE` operations that are committed to the tables for which CDC is enabled. The Debezium connector can then capture these events and emit them to Kafka topics. ### Enabling CDC on the SQL Server database Before you can enable CDC for a table, you must enable it for the SQL Server database. A SQL Server administrator enables CDC by running a system stored procedure. System stored procedures can be run by using SQL Server Management Studio, or by using Transact-SQL. **Prerequisites** - You are a member of the sysadmin fixed server role for the SQL Server. - You are a db_owner of the database. - The SQL Server Agent is running. :::alert{type="info"} The SQL Server CDC feature processes changes that occur in user-created tables only. You cannot enable CDC on the SQL Server master database. ::: **Procedure** 1. From the **View** menu in SQL Server Management Studio, click **Template Explorer**. 2. In the **Template Browser**, expand **SQL Server Templates**. 3. Expand **Change Data Capture > Configuration** and then click **Enable Database for CDC**. 4. In the template, replace the database name in the `USE` statement with the name of the database that you want to enable for CDC. 5. Run the stored procedure `sys.sp_cdc_enable_db` to enable the database for CDC. 
After the database is enabled for CDC, a schema with the name `cdc` is created, along with a CDC user, metadata tables, and other system objects.

The following example shows how to enable CDC for the database `MyDB`:

```sql
USE MyDB
GO
EXEC sys.sp_cdc_enable_db
GO
```

### Enabling CDC on a SQL Server table

A SQL Server administrator must enable change data capture on the source tables that you want Debezium to capture. The database must already be enabled for CDC. To enable CDC on a table, a SQL Server administrator runs the stored procedure `sys.sp_cdc_enable_table` for the table. The stored procedures can be run by using SQL Server Management Studio, or by using Transact-SQL. SQL Server CDC must be enabled for every table that you want to capture.

**Prerequisites**

- CDC is enabled on the SQL Server database.
- The SQL Server Agent is running.
- You are a member of the `db_owner` fixed database role for the database.

**Procedure**

1. From the **View** menu in SQL Server Management Studio, click **Template Explorer**.
2. In the **Template Browser**, expand **SQL Server Templates**.
3. Expand **Change Data Capture > Configuration**, and then click **Enable Table Specifying Filegroup Option**.
4. In the template, replace the table name in the `USE` statement with the name of the table that you want to capture.
5. Run the stored procedure `sys.sp_cdc_enable_table`.

The following example shows how to enable CDC for the table `MyTable`:

```sql
USE MyDB
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'MyTable',
@role_name = N'MyRole',
@filegroup_name = N'MyDB_CT',
@supports_net_changes = 0
GO
```

**source_name**: Specifies the name of the table that you want to capture.

**role_name**: Specifies a role `MyRole` to which you can add users to whom you want to grant `SELECT` permission on the captured columns of the source table. Users in the `sysadmin` or `db_owner` role also have access to the specified change tables.
Set the value of `@role_name` to `NULL` to allow only members of the `sysadmin` or `db_owner` roles to have full access to captured information.

**filegroup_name**: Specifies the filegroup where SQL Server places the change table for the captured table. The named filegroup must already exist. It is best not to locate change tables in the same filegroup that you use for source tables.

### Verifying that the user has access to the CDC table

A SQL Server administrator can run a system stored procedure to query a database or table to retrieve its CDC configuration information. The stored procedures can be run by using SQL Server Management Studio, or by using Transact-SQL.

**Prerequisites**

- You have `SELECT` permission on all of the captured columns of the capture instance. Members of the `db_owner` database role can view information for all of the defined capture instances.
- You have membership in any gating roles that are defined for the table information that the query includes.

**Procedure**

1. From the **View** menu in SQL Server Management Studio, click **Object Explorer**.
2. From the **Object Explorer**, expand **Databases**, and then expand your database object, for example, `MyDB`.
3. Expand **Programmability > Stored Procedures > System Stored Procedures**.
4. Run the `sys.sp_cdc_help_change_data_capture` stored procedure to query the table. Queries should not return empty results.

The following example runs the stored procedure `sys.sp_cdc_help_change_data_capture` on the database `MyDB`:

```sql
USE MyDB;
GO
EXEC sys.sp_cdc_help_change_data_capture
GO
```

The query returns configuration information for each table in the database that is enabled for CDC and that contains change data that the caller is authorized to access. If the result is empty, verify that the user has privileges to access both the capture instance and the CDC tables.

### SQL Server on Azure

The Debezium SQL Server connector can be used with SQL Server on Azure.
Refer to [this example](https://learn.microsoft.com/en-us/samples/azure-samples/azure-sql-db-change-stream-debezium/azure-sql%2D%2Dsql-server-change-stream-with-debezium/) for configuring CDC for SQL Server on Azure and using it with Debezium.

### SQL Server Always On

The SQL Server connector can capture changes from an Always On read-only replica.

**Prerequisites**

- Change data capture is configured and enabled on the primary node. SQL Server does not support CDC directly on replicas.
- The configuration option `database.applicationIntent` is set to `ReadOnly`. This is required by SQL Server.

When Debezium detects this configuration option, it responds by taking the following actions:

- Sets `snapshot.isolation.mode` to `snapshot`, which is the only transaction isolation mode supported for read-only replicas.
- Commits the (read-only) transaction in every execution of the streaming query loop, which is necessary to get the latest view of CDC data.

### Running Debezium tasks on Microsoft SQL Server

You are now all set to run the Debezium Microsoft SQL Server tasks and triggers. Here is an example flow using the Debezium Microsoft SQL Server Realtime Trigger:

```yaml
id: debezium_sqlserver
namespace: company.team

tasks:
  - id: send_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.sqlserver.RealtimeTrigger
    hostname: 127.0.0.1
    port: 61433
    username: sa
    password: password
    database: deb
```

The Debezium Microsoft SQL Server Realtime Trigger collects records from change data capture as they occur. The flow can then process these records as required.

## Debezium with MongoDB

The MongoDB connector uses MongoDB’s change streams to capture the changes, so the connector works only with MongoDB replica sets or with sharded clusters where each shard is a separate replica set.
See the MongoDB documentation for setting up a [replica set](https://docs.mongodb.com/manual/replication/) or [sharded cluster](https://docs.mongodb.com/manual/sharding/). Also, be sure to understand how to enable [access control and authentication](https://docs.mongodb.com/manual/tutorial/deploy-replica-set-with-keyfile-access-control/#deploy-repl-set-with-auth) with replica sets.

You must also have a MongoDB user that has the appropriate roles to read the `admin` database where the oplog can be read. Additionally, the user must be able to read the `config` database in the configuration server of a sharded cluster and must have the `listDatabases` privilege action. When change streams are used (the default), the user also must have the cluster-wide privilege actions `find` and `changeStream`.

When you intend to use pre-images to populate the `before` field, you need to first enable `changeStreamPreAndPostImages` for a collection using `db.createCollection()`, `create`, or `collMod`.

### MongoDB in the Cloud

You can use the Debezium connector for MongoDB with [MongoDB Atlas](https://www.mongodb.com/atlas/database). Note that MongoDB Atlas only supports secure connections via SSL, i.e. the [`mongodb.ssl.enabled`](https://debezium.io/documentation/reference/3.0/connectors/mongodb.html#mongodb-property-mongodb-ssl-enabled) connector option must be set to `true`.

### Running Debezium tasks on MongoDB

You are now all set to run the Debezium MongoDB tasks and triggers.
Here is an example flow using the Debezium MongoDB Realtime Trigger:

```yaml
id: debezium_mongodb
namespace: company.team

tasks:
  - id: send_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.mongodb.RealtimeTrigger
    snapshotMode: INITIAL
    connectionString: mongodb://mongo_user:mongo_passwd@mongos0.example.com:27017,mongos1.example.com:27017/
```

The Debezium MongoDB Realtime Trigger collects records from change data capture as they occur. The flow can then process these records as required.

---

# Build Dynamic Flows in Kestra

URL: https://kestra.io/docs/how-to-guides/dynamic-flows

> Create dynamic Kestra flows at runtime using inputs and Pebble expressions. Generate flow configurations on the fly for data-driven, adaptive orchestration.

Implement dynamic flows in Kestra.

## Dynamic Flows using Inputs

In this method, we create a flow as a template, and the dynamic values in the template can then be filled in using Kestra inputs to generate the desired flow. Let us see this with the help of an example.

Here, we create a sample flow that downloads CSV data using the HTTP Download task and then loads the data into a PostgreSQL table. Such a dynamic flow can be helpful when new HTTP URLs are generated on a regular cadence, and you need to pull in the latest data from each new HTTP URL to upload to a table.

The flow takes the HTTP URL and the PostgreSQL database connection details as inputs. This makes the flow dynamic, as the same flow can be used with different HTTP URLs and different PostgreSQL databases and tables.
```yaml
id: dynamic_flow
namespace: company.team

inputs:
  - id: http_url
    type: STRING
    defaults: "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv"

  - id: postgres_host
    type: STRING
    defaults: "localhost"

  - id: postgres_port
    type: STRING
    defaults: "5432"

  - id: postgres_db
    type: STRING
    defaults: "postgres"

  - id: postgres_table
    type: STRING

  - id: postgres_username
    type: STRING

  - id: postgres_password
    type: STRING

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: "{{ inputs.http_url }}"

  - id: copyin
    type: io.kestra.plugin.jdbc.postgresql.CopyIn
    url: "jdbc:postgresql://{{ inputs.postgres_host }}:{{ inputs.postgres_port }}/{{ inputs.postgres_db }}"
    username: "{{ inputs.postgres_username }}"
    password: "{{ inputs.postgres_password }}"
    format: CSV
    from: "{{ outputs.http_download.uri }}"
    table: "{{ inputs.postgres_table }}"
    header: true
```

As can be seen from the above flow, it is dynamic because all of its important parameters are controlled via inputs.

## Dynamic Flow using Code

We can write code in any language to generate the dynamic flow, and then upload the flow to Kestra. Let us understand this with the help of an example.

We will create a dynamic flow using Python that downloads a CSV file using the HTTP Download task and uploads the contents into a PostgreSQL table. Say we want to extract the data from multiple HTTP URLs and upload the data to a corresponding PostgreSQL table. We can, in parallel, start the process of downloading the data from each HTTP URL and uploading it to a PostgreSQL table.
For two items, products and orders, this is how our flow should look:

```yaml
id: dynamic_flow
namespace: company.team

tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: sequential_task_0
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: http_download_0
            type: io.kestra.plugin.core.http.Download
            uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv

          - id: postgres_upload_0
            type: io.kestra.plugin.jdbc.postgresql.CopyIn
            url: jdbc:postgresql://{{ kv('postgres_host', 'company.infra') }}:{{ kv('postgres_port', 'company.infra') }}/{{ kv('postgres_db', 'company.infra') }}
            username: "{{ secret('POSTGRES_USERNAME') }}"
            password: "{{ secret('POSTGRES_PASSWORD') }}"
            format: CSV
            from: '{{ outputs.http_download_0.uri }}'
            table: products
            header: true

      - id: sequential_task_1
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: http_download_1
            type: io.kestra.plugin.core.http.Download
            uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

          - id: postgres_upload_1
            type: io.kestra.plugin.jdbc.postgresql.CopyIn
            url: jdbc:postgresql://{{ kv('postgres_host', 'company.infra') }}:{{ kv('postgres_port', 'company.infra') }}/{{ kv('postgres_db', 'company.infra') }}
            username: "{{ secret('POSTGRES_USERNAME') }}"
            password: "{{ secret('POSTGRES_PASSWORD') }}"
            format: CSV
            from: '{{ outputs.http_download_1.uri }}'
            table: orders
            header: true
```

To generate this flow dynamically for any number of items, use the following Python code.

```python
import os
from ruamel.yaml import YAML

## Get the items from the environment variable and split them by commas
items = os.getenv('EXTRACT_ITEMS', "products,orders").split(",")

def http_download_task(idx, item):
    """Create HTTP Download task based on the item.
    The task id and uri will get dynamically generated based on `idx` and `item` respectively.
    """
    return {
        "id": f"http_download_{str(idx)}",
        "type": "io.kestra.plugin.core.http.Download",
        "uri": f"https://huggingface.co/datasets/kestra/datasets/raw/main/csv/{item}.csv"
    }

def postgres_upload_task(idx, item):
    """Create postgres CopyIn task to upload the data from CSV to the corresponding postgres table.
    """
    return {
        "id": f"postgres_upload_{str(idx)}",
        "type": "io.kestra.plugin.jdbc.postgresql.CopyIn",
        "url": "jdbc:postgresql://" + "{{ kv('postgres_host', 'company.infra') }}:{{ kv('postgres_port', 'company.infra') }}/{{ kv('postgres_db', 'company.infra') }}",
        "username": "{{ secret('POSTGRES_USERNAME') }}",
        "password": "{{ secret('POSTGRES_PASSWORD') }}",
        "format": "CSV",
        "from": "{{ outputs.http_download_" + str(idx) + ".uri }}",
        "table": f"{item}",
        "header": True
    }

def create_sequential_task(idx, task_list):
    """Create Sequential task for every item which will have two tasks:
    1. Download the CSV data using HTTP Download task
    2. Upload the CSV file into the corresponding postgres table using CopyIn task
    """
    return {
        "id": f"sequential_task_{str(idx)}",
        "type": "io.kestra.plugin.core.flow.Sequential",
        "tasks": task_list
    }

tasks_per_item = []

## Iterate over the items and generate Sequential task for each item, and append it to `tasks_per_item`
for idx, item in enumerate(items):
    sequential_tasks = []
    sequential_tasks.append(http_download_task(idx, item))
    sequential_tasks.append(postgres_upload_task(idx, item))
    sequential_task = create_sequential_task(idx, sequential_tasks)
    tasks_per_item.append(sequential_task)

## Generate the dynamic flow
kestra_flow = {
    "id": os.getenv('FLOW_ID', "postgres_upload_flow"),
    "namespace": os.getenv('FLOW_NAMESPACE', "company.team"),
    "tasks": [
        {
            "id": "parallel",
            "type": "io.kestra.plugin.core.flow.Parallel",
            "tasks": tasks_per_item
        }
    ]
}

yaml = YAML()
yaml.indent(mapping=2, sequence=4, offset=2)
yaml.preserve_quotes = True

## Write the generated dynamic flow in yaml format in `kestra_flow.yaml` file
output_path = "kestra_flow.yaml"
with open(output_path, "w") as f:
    yaml.dump(kestra_flow, f)
```

The above Python code will generate a dynamic flow with multiple Sequential tasks that download the data from an HTTP URL and upload the CSV into the corresponding PostgreSQL table. You can write the above code in a namespace file, say `dynamic_flow.py`.

Next, we write a Kestra flow that invokes the `dynamic_flow.py` Python script and loads the generated dynamic flow into Kestra.

```yaml
id: generate_dynamic_flow
namespace: company.team

inputs:
  - id: flow_id
    type: STRING
    description: Name for the dynamic flow to be created
    defaults: dynamic_flow

  - id: flow_namespace
    type: STRING
    description: Namespace in which the dynamic flow is to be created
    defaults: company.team

  - id: extract_items
    type: STRING
    description: Comma separated list of items to be extracted
    defaults: products,orders

  - id: kestra_host
    type: STRING
    description: Your Kestra hostname
    defaults: "localhost:8080"

tasks:
  - id: generate_kestra_flow
    type: io.kestra.plugin.scripts.python.Commands
    env:
      FLOW_ID: "{{ inputs.flow_id }}"
      FLOW_NAMESPACE: "{{ inputs.flow_namespace }}"
      EXTRACT_ITEMS: "{{ inputs.extract_items }}"
    beforeCommands:
      - pip install -q ruamel.yaml
    namespaceFiles:
      enabled: true
    inputFiles:
      script.py: "{{ read('dynamic_flow.py') }}"
    commands:
      - python script.py
    outputFiles:
      - "*.yaml"

  - id: create_flow
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      flow.yaml: "{{ outputs.generate_kestra_flow.outputFiles['kestra_flow.yaml'] }}"
    beforeCommands:
      - apt-get update
      - apt-get -y install curl
    commands:
      - curl -X POST http://{{inputs.kestra_host}}/api/v1/main/flows/import -F fileUpload=@flow.yaml
      - echo "Executing the flow from http://{{inputs.kestra_host}}/ui/flows/edit/{{ inputs.flow_namespace }}/{{ inputs.flow_id }}"

  - id: subflow
    type: io.kestra.plugin.core.flow.Subflow
    namespace: "{{ inputs.flow_namespace }}"
    flowId: "{{ inputs.flow_id }}"
    wait: true
    transmitFailed: true
```

The flow has the following tasks:

1. `generate_kestra_flow`: This task invokes `dynamic_flow.py` and generates the dynamic flow by running the Python code. The environment variables `FLOW_ID`, `FLOW_NAMESPACE`, and `EXTRACT_ITEMS` are provided in the task and are used by the Python script to dynamically generate the flow.
2. `create_flow`: This task creates the flow in Kestra by uploading the YAML file containing the dynamically generated flow.
3. `subflow`: This task triggers the newly created dynamic flow.

Thus, you can create a dynamic flow by generating it in the language of your choice and then loading it into Kestra.

## Dynamic Flow using Terraform

Yet another way of generating dynamic flows in Kestra is using Terraform templates. Check out the detailed guide on implementing dynamic flows using [Terraform templating](../../15.how-to-guides/terraform-templating/index.md).

---

# Create a Dynamic Dropdown for Inputs

URL: https://kestra.io/docs/how-to-guides/dynamic-inputs

> Create dynamic dropdown menus for flow inputs that populate from external sources like databases or APIs using the KV store or HTTP functions.

Support a dynamic dropdown for inputs based on data from an external source.

In this guide, we show how you can create a dynamic dropdown list for inputs. The dropdown retrieves its values from an external source. You can do this either by storing the values in the [KV store](../../06.concepts/05.kv-store/index.md) or by directly integrating the external source with the HTTP Pebble function, `http()`.

## Update KV store on schedule

To get started, we create a flow that fetches the data from the external source and sets the value in the KV store. The value will be in the form of a list of strings. In this example, the flow fetches data from a PostgreSQL table on an hourly schedule. You can change the `cron` property to run at a different frequency depending on how frequently you expect the data at the source to change.
If the external source is a database that supports change data capture, as in this case where we use a PostgreSQL table, you can also use the [debezium trigger](/plugins/plugin-debezium-postgres/io.kestra.plugin.debezium.postgres.trigger) to update the KV store immediately.

```yaml
id: update_kv_store
namespace: company.team

tasks:
  - id: fetch_departments
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "jdbc:postgresql://{{ secret('POSTGRES_HOST') }}:5432/postgres"
    username: "{{ secret('POSTGRES_USERNAME') }}"
    password: "{{ secret('POSTGRES_PASSWORD') }}"
    sql: select department_name from departments
    fetchType: FETCH

  - id: department_key
    type: io.kestra.plugin.core.kv.Set
    key: "{{ task.id }}"
    kvType: JSON
    value: "{{ outputs.fetch_departments.rows | jq('.[].department_name') }}"

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 */1 * * *"
```

This is how the KV store will look after executing the above flow.

![kv_store_content](./kv_store_content.png)

## Flow supporting Dynamic Inputs

Let us now create the flow that supports a dynamic dropdown for inputs powered by the KV store key.

```yaml
id: dynamic_input_flow
namespace: company.team

inputs:
  - id: department
    displayName: Department Name
    type: SELECT
    expression: "{{ kv('department_key') }}"

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "The selected department is {{ inputs.department }}"
```

When you execute this flow, the `department` input will have a dropdown that contains the values fetched from the `department_key` key in the KV store.

![dynamic_dropdown](./dynamic_dropdown.png)

## Dynamic Inputs with HTTP function

With the `http()` function, you can make `SELECT` and `MULTISELECT` inputs dynamic by fetching options from an external API. This proves valuable when the data used in dropdowns changes frequently or when you already have an API serving that data for existing applications.
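The `jq(...)` filters used in these expressions pluck one field out of every element of a JSON array. A rough Python equivalent of the `jq('.[].department_name')` filter above, applied to a hypothetical sample of query rows (the department names are illustrative, not from any real source):

```python
import json

# Hypothetical sample shaped like the rows returned by the fetch_departments task
rows_json = json.dumps([
    {"department_name": "Engineering"},
    {"department_name": "Finance"},
    {"department_name": "Marketing"},
])

# Equivalent of the Pebble expression:
#   {{ outputs.fetch_departments.rows | jq('.[].department_name') }}
# jq('.[].department_name') maps every element of the array to one field,
# producing the list of strings that ends up in the KV store.
departments = [row["department_name"] for row in json.loads(rows_json)]
print(departments)  # ['Engineering', 'Finance', 'Marketing']
```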
The example below demonstrates how to create a flow with two dynamic dropdowns: one for selecting a product category and another for selecting a product from that category. The first dropdown fetches the product categories from an external HTTP API. The second dropdown makes another HTTP call to dynamically retrieve products matching the selected category.

```yaml
id: dynamic_dropdowns
namespace: company.team

inputs:
  - id: category
    type: SELECT
    expression: "{{ http(uri = 'https://dummyjson.com/products/categories') | jq('.[].slug') }}"

  - id: product
    type: SELECT
    dependsOn:
      inputs:
        - category
    expression: "{{ http(uri = 'https://dummyjson.com/products/category/' + inputs.category) | jq('.products[].title') }}"

tasks:
  - id: display_selection
    type: io.kestra.plugin.core.log.Log
    message: |
      You selected Category: {{ inputs.category }}
      And Product: {{ inputs.product }}
```

---

Dynamic inputs are useful for flows using authenticated API requests like the following:

```yaml
id: approversFlow
namespace: company.team

inputs:
  - id: executionIdsToBeApproved
    type: MULTISELECT
    expression: >-
      {{ http(
        uri = 'http://localhost:8080/api/v1/internal/executions/search?state=PAUSED',
        method = 'GET',
        contentType = 'application/json',
        headers={
          'User-Agent': 'kestra',
          'Connection': 'keep-alive',
          'Authorization': 'Bearer ' ~ secret("bearerToken")
        }
      ) | jq('.results[] | "ExecutionId: \(.id), FlowId: \(.flowId), RequestedBy: \(.labels[] | select(.key == "system.username").value) InputParams: \( .inputs | to_entries | map("\(.key):\(.value)") | join(" ") )"') }}

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello World! 🚀
```

:::alert{type="info"}
When using `http()` inside an `expression` with secrets in headers (e.g., an authenticated API request), use named arguments and string concatenation ([Pebble Literals](https://pebbletemplates.io/wiki/guide/basic-usage/#literals)). The key to the syntax is to concatenate strings with the `~` operator.
:::

---

# Build ETL Pipelines in Kestra

URL: https://kestra.io/docs/how-to-guides/etl-pipelines

> Build end-to-end ETL pipelines in Kestra. Extract data from any source, transform it, and load to your target data warehouse with full observability.

Build ETL pipelines in Kestra using DuckDB, Python and Task Runners.

This tutorial demonstrates building different ETL pipelines in Kestra.

:::alert{type="info"}
We have used an AWS access key and secret key in the example workflows below. To learn more about these keys and how to obtain them, refer to the [AWS guide on secret access keys](https://aws.amazon.com/blogs/security/wheres-my-secret-access-key/). Once we have these keys, we can store them in the [KV Store](../../06.concepts/05.kv-store/index.md) or as [Secrets](../../06.concepts/04.secret/index.md).
:::

## Using DuckDB

DuckDB transforms data directly using SQL queries. In the example below, we fetch CSV files, perform the join transformation using the DuckDB Query task, store the result, upload the detailed orders onto S3, perform another transformation on the stored result, and finally upload the file as CSV onto S3.
```yaml
id: etl_using_duckdb
namespace: company.team

tasks:
  - id: download_orders_csv
    type: io.kestra.plugin.core.http.Download
    description: Download orders.csv file
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: download_products_csv
    type: io.kestra.plugin.core.http.Download
    description: Download products.csv file
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv

  - id: get_detailed_orders
    type: io.kestra.plugin.jdbc.duckdb.Query
    description: Perform JOIN transformation using DuckDB
    inputFiles:
      orders.csv: "{{ outputs.download_orders_csv.uri }}"
      products.csv: "{{ outputs.download_products_csv.uri }}"
    sql: |
      SELECT o.order_id, o.customer_name, o.customer_email, o.product_id,
             o.price, o.quantity, o.total,
             p.product_name, p.product_category, p.brand
      FROM read_csv_auto('{{ workingDir }}/orders.csv', header=True) o
      JOIN read_csv_auto('{{ workingDir }}/products.csv', header=True) p
      ON o.product_id = p.product_id
      ORDER BY order_id ASC;
    store: true

  - id: ion_to_csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    description: Convert the result into CSV
    from: "{{ outputs.get_detailed_orders.uri }}"

  - id: upload_detailed_orders_to_s3
    type: io.kestra.plugin.aws.s3.Upload
    description: Upload the resulting CSV file onto S3
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-central-1"
    from: "{{ outputs.ion_to_csv.uri }}"
    bucket: "my_bucket"
    key: "orders/detailed_orders"

  - id: get_orders_per_product
    type: io.kestra.plugin.jdbc.duckdb.Query
    description: Perform aggregation using DuckDB
    inputFiles:
      detailed_orders.csv: "{{ outputs.ion_to_csv.uri }}"
    sql: |
      SELECT product_id,
             COUNT(product_id) as order_count,
             SUM(quantity) as product_count,
             CAST(SUM(total) AS DECIMAL(10,2)) AS order_total
      FROM read_csv_auto('{{ workingDir }}/detailed_orders.csv', header=True)
      GROUP BY product_id
      ORDER BY product_id ASC
    store: true

  - id: get_orders_per_product_csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    description: Convert the result into CSV
    from: "{{ outputs.get_orders_per_product.uri }}"

  - id: upload_orders_per_product_to_s3
    type: io.kestra.plugin.aws.s3.Upload
    description: Upload the resulting CSV file onto S3
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-central-1"
    from: "{{ outputs.get_orders_per_product_csv.uri }}"
    bucket: "my_bucket"
    key: "orders/orders_per_product"
```

Similar Query tasks can be performed on other databases such as Snowflake and Postgres.

## Using Python

You can choose to perform ETL using Python (pandas) and then run it as a Python script. The ETL performed using [DuckDB](#using-duckdb) above can be performed using Python as shown in the example flow below.

```yaml
id: python_etl
namespace: company.team

tasks:
  - id: etl
    type: io.kestra.plugin.scripts.python.Script
    description: Python ETL Script
    beforeCommands:
      - pip install requests pandas
    script: |
      import io
      import requests
      import pandas as pd

      def _extract(url):
          csv_data = requests.get(url).content
          return pd.read_csv(io.StringIO(csv_data.decode('utf-8')), header=0)

      def run_etl():
          orders_data = _extract("https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv")
          products_data = _extract("https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv")

          # perform join transformation
          detailed_orders = orders_data.merge(products_data, how='left', left_on='product_id', right_on='product_id')
          detailed_orders.to_csv("detailed_orders.csv")

          # perform aggregation
          orders_per_product = detailed_orders.groupby('product_id').agg(order_count=('product_id', 'count'), product_count=('quantity', 'sum'), order_total=('total', 'sum')).sort_values('product_id')
          orders_per_product['order_total'] = orders_per_product['order_total'].apply(lambda x: float("{:.2f}".format(x)))
          orders_per_product.to_csv("orders_per_product.csv")

      if __name__ == "__main__":
          run_etl()
    outputFiles:
      - detailed_orders.csv
      - orders_per_product.csv

  - id: upload_detailed_orders_to_s3
    type: io.kestra.plugin.aws.s3.Upload
    description: Upload the resulting CSV file onto S3
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-central-1"
    from: "{{ outputs.etl.outputFiles['detailed_orders.csv'] }}"
    bucket: "my_bucket"
    key: "orders/detailed_orders"

  - id: upload_orders_per_product_to_s3
    type: io.kestra.plugin.aws.s3.Upload
    description: Upload the resulting CSV file onto S3
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-central-1"
    from: "{{ outputs.etl.outputFiles['orders_per_product.csv'] }}"
    bucket: "my_bucket"
    key: "orders/orders_per_product"
```

## Using Batch Task Runners

When the Python scripts become more compute-intensive or memory-intensive, it is advisable to run them on remote batch compute resources using Batch Task Runners. Kestra provides a variety of [Batch Task Runners](../../07.enterprise/04.scalability/task-runners/index.md#task-runner-types). Here is an example of how the ETL Python script can be run on an AWS Batch Task Runner.
```yaml id: aws_batch_task_runner_etl namespace: company.team tasks: - id: python_etl_on_aws_task_runner type: io.kestra.plugin.scripts.python.Script description: Run Python ETL script on the AWS Batch Task Runner containerImage: python:3.11-slim taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch region: eu-central-1 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" computeEnvironmentArn: "arn:aws:batch:eu-central-1:01234567890:compute-environment/kestraFargateEnvironment" jobQueueArn: "arn:aws:batch:eu-central-1:01234567890:job-queue/kestraJobQueue" executionRoleArn: "arn:aws:iam::01234567890:role/kestraEcsTaskExecutionRole" taskRoleArn: arn:aws:iam::01234567890:role/ecsTaskRole bucket: kestra-product-de beforeCommands: - pip install requests pandas script: | import io import requests import pandas as pd def _extract(url): csv_data = requests.get(url).content return pd.read_csv(io.StringIO(csv_data.decode('utf-8')), header=0) def run_etl(): orders_data = _extract("https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv") products_data = _extract("https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv") # perform join transformation detailed_orders = orders_data.merge(products_data, how='left', left_on='product_id', right_on='product_id') detailed_orders.to_csv("detailed_orders.csv") # perform aggregation orders_per_product = detailed_orders.groupby('product_id').agg(order_count= ('product_id', 'count'), product_count=('quantity', 'sum'), order_total=('total', 'sum')).sort_values('product_id') orders_per_product['order_total'] = orders_per_product['order_total'].apply(lambda x: float("{:.2f}".format(x))) orders_per_product.to_csv("orders_per_product.csv") if __name__ == "__main__": run_etl() outputFiles: - detailed_orders.csv - orders_per_product.csv - id: upload_detailed_orders_to_s3 type: io.kestra.plugin.aws.s3.Upload description: Upload the resulting CSV file onto S3 accessKeyId:
"{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" region: "eu-central-1" from: "{{ outputs.python_etl.outputFiles('detailed_orders.csv') }}" bucket: "my_bucket" key: "orders/detailed_orders" - id: upload_orders_per_product_to_s3 type: io.kestra.plugin.aws.s3.Upload description: Upload the resulting CSV file onto S3 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" region: "eu-central-1" from: "{{ outputs.python_etl.outputFiles('orders_per_product.csv') }}" bucket: "my_bucket" key: "orders/orders_per_product" ``` ## Using dbt You can create a similar pipeline based on an ELT model using dbt via Kestra, using namespace files for the dbt models. This example uses dbt + BigQuery to perform the ELT process: it loads data from an HTTP request to Hugging Face into BigQuery tables, performs join and aggregate transformations using dbt, and then queries the resulting tables. ```yaml id: dbt_transformations namespace: kestra.engineering.bigquery.dbt tasks: - id: orders_http_download type: io.kestra.plugin.core.http.Download description: Download orders.csv using HTTP Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: products_http_download type: io.kestra.plugin.core.http.Download description: Download products.csv using HTTP Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv - id: create_orders_table type: io.kestra.plugin.gcp.bigquery.CreateTable description: Create orders table in BigQuery serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" projectId: dataset: ecommerce table: orders tableDefinition: type: TABLE schema: fields: - name: order_id type: INT64 - name: customer_name type: STRING - name: customer_email type: STRING - name: product_id type: INT64 - name: price type: FLOAT64 - name: quantity type: INT64 - name: total type: FLOAT64 - id: create_products_table type: 
io.kestra.plugin.gcp.bigquery.CreateTable description: Create products table in BigQuery. serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" projectId: dataset: ecommerce table: products tableDefinition: type: TABLE schema: fields: - name: product_id type: INT64 - name: product_name type: STRING - name: product_category type: STRING - name: brand type: STRING - id: load_orders_table type: io.kestra.plugin.gcp.bigquery.Load description: Load orders table with data from orders.csv from: "{{ outputs.orders_http_download.uri }}" projectId: serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" destinationTable: ".ecommerce.orders" format: CSV csvOptions: fieldDelimiter: "," skipLeadingRows: 1 - id: load_products_table type: io.kestra.plugin.gcp.bigquery.Load description: Load products table with data from products.csv from: "{{ outputs.products_http_download.uri }}" projectId: serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" destinationTable: ".ecommerce.products" format: CSV csvOptions: fieldDelimiter: "," skipLeadingRows: 1 - id: dbt type: io.kestra.plugin.dbt.cli.DbtCLI description: Use dbt build to perform the dbt transformations inputFiles: sa.json: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/dbt-bigquery:latest namespaceFiles: enabled : true profiles: | bq_dbt_project: outputs: dev: type: bigquery method: service-account dataset: ecommerce project: keyfile: sa.json location: US priority: interactive threads: 16 timeout_seconds: 300 fixed_retries: 1 target: dev commands: - dbt deps - dbt build - id: query_detailed_orders type: io.kestra.plugin.gcp.bigquery.Query description: Query the newly generated detailed_orders BigQuery table serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" projectId: sql: | SELECT * FROM .ecommerce.detailed_orders store: true - id: query_orders_per_product type: io.kestra.plugin.gcp.bigquery.Query description: 
Query the newly generated orders_per_product BigQuery table serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}" projectId: sql: | SELECT * FROM .ecommerce.orders_per_product store: true ``` Here are the files that you should create in the Kestra editor. First, create the `dbt_project.yml` file, and put the following contents into it. ```yaml name: 'bq_dbt_project' version: '1.0.0' config-version: 2 profile: 'bq_dbt_project' model-paths: ["models"] analysis-paths: ["analyses"] test-paths: ["tests"] seed-paths: ["seeds"] macro-paths: ["macros"] snapshot-paths: ["snapshots"] clean-targets: - "target" - "dbt_packages" models: bq_dbt_project: example: +materialized: view ``` Next, create a `models` folder. All the upcoming files will be created under the `models` folder. Create `sources.yml`, which defines the source tables referenced in other models. ```yaml version: 2 sources: - name: ecommerce database: schema: ecommerce tables: - name: orders - name: products ``` Next, create two files — `stg_orders.sql` and `stg_products.sql` — which materialize as views on top of the source tables: **stg_orders.sql** ```sql {{ config(materialized="view") }} select order_id, customer_name, customer_email, product_id, price, quantity, total from {{ source('ecommerce', 'orders') }} ``` **stg_products.sql** ```sql {{ config(materialized="view") }} select product_id, product_name, product_category, brand from {{ source('ecommerce', 'products') }} ``` Next, create `detailed_orders.sql`, which creates the `detailed_orders` table by joining the `stg_orders` and `stg_products` views on `product_id`: ```sql {{ config(materialized="table") }} select o.order_id, o.customer_name, o.customer_email, o.product_id, p.product_name, p.product_category, p.brand, o.price, o.quantity, o.total from {{ ref('stg_orders') }} o join {{ ref('stg_products') }} p on o.product_id = p.product_id ``` Next, create `orders_per_product.sql`, which creates the `orders_per_product` table by aggregating the
`detailed_orders` table: ```sql {{ config(materialized="table") }} select product_id, COUNT(product_id) as order_count, SUM(quantity) as product_count, SUM(total) as order_total from {{ ref('detailed_orders') }} group by product_id order by product_id asc ``` With this, all the dbt models are ready. You can now execute the flow. The flow will generate the `detailed_orders` and `orders_per_product` tables. You can view the contents of these tables by going to the Outputs of the last two tasks. ## Using Spark We can perform the same ETL process using Spark. The flow for performing the same transformation using Spark will look as follows: ```yaml id: spark_python_submit namespace: kestra.engineering.spark tasks: - id: python_submit type: io.kestra.plugin.spark.PythonSubmit runner: DOCKER docker: networkMode: host user: root master: spark://localhost:7077 args: - "10" mainScript: | from pyspark.sql import SparkSession from pyspark import SparkFiles orders_url = "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv" products_url = "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv" if __name__ == "__main__": spark = SparkSession.builder.appName("EcommerceApp").getOrCreate() # Distribute the CSV files to the Spark workers (requires an active session) spark.sparkContext.addFile(orders_url) spark.sparkContext.addFile(products_url) # Create orders dataframe based on orders.csv orders_df = spark.read.csv("file://" + SparkFiles.get("orders.csv"), inferSchema=True, header=True) # Create products dataframe based on products.csv products_df = spark.read.csv("file://" + SparkFiles.get("products.csv"), inferSchema=True, header=True) # Create detailed_orders by joining orders_df and products_df detailed_orders_df = orders_df.join(products_df, orders_df.product_id == products_df.product_id, "left") # Print the contents of detailed_orders_df detailed_orders_df.show(10) spark.stop() ``` --- # Validate and Deploy Flows with GitHub Actions URL: https://kestra.io/docs/how-to-guides/github-actions > Automate the
validation and deployment of your Kestra flows using GitHub Actions for a robust CI/CD pipeline. How to use GitHub Actions to automatically validate and deploy your flows to Kestra.
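Before wiring up CI, it can help to see what flow validation means at its simplest. The hedged Python sketch below performs a crude local pre-check for the required top-level keys (`id`, `namespace`, `tasks`) in a flow's YAML source; it is illustrative only and is no substitute for the Validate Flows Action described in this guide, which validates flows against a real Kestra server.

```python
import re

# Crude pre-check: a Kestra flow must define id, namespace, and tasks
# at the top level. This regex-based check is illustrative only; real
# validation (schema, task types, expressions) happens server-side.
REQUIRED_KEYS = ("id", "namespace", "tasks")

def missing_keys(flow_source: str) -> list[str]:
    """Return required top-level keys absent from a flow's YAML source."""
    # ^(\w+): only matches unindented keys, i.e. top-level YAML mappings.
    present = {
        m.group(1)
        for m in re.finditer(r"^(\w+):", flow_source, flags=re.MULTILINE)
    }
    return [key for key in REQUIRED_KEYS if key not in present]

good = "id: hello\nnamespace: company.team\ntasks:\n  - id: log\n    type: io.kestra.plugin.core.log.Log\n    message: hi\n"
bad = "id: hello\ntasks: []\n"

print(missing_keys(good))  # []
print(missing_keys(bad))   # ['namespace']
```

A check like this can run in a pre-commit hook, but only the server-side validation performed by the Action catches indentation mistakes, unknown task types, and invalid expressions.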
If you're version controlling your flows in a Git repository, it can be useful to automatically validate that they're in the correct format before merging into your `main` branch. On top of that, you can automatically deploy your flows in your `main` branch to your Kestra instance. There are three GitHub Actions available: - [Validate Flows](https://github.com/kestra-io/github-actions/tree/main/validate-flows) - Validate your flows before deploying anything. - [Deploy Flows](https://github.com/kestra-io/github-actions/tree/main/deploy-flows) - Deploy your flows to your Kestra server. - [Deploy Namespace Files](https://github.com/kestra-io/github-actions/tree/main/deploy-namespace-files) - Deploy namespace files to your Kestra server. ## Validate Your Flows The Validate Flows Action sets up a workflow to check all flows in the specified `directory` when a commit is pushed to `main` or a Pull Request is opened for the `main` branch. For the full list of inputs, see the [GitHub Actions reference](../../version-control-cicd/cicd/01.github-action/index.md#validate-flows-action-inputs). In the example below: 1. Triggers when a commit is pushed to `main` or when a PR is opened for the `main` branch. 2. Checks out the repository so we can access the files in later steps. 3. Uses the Validate Flows Action to check all the flows in the `./kestra/flows` directory. ```yaml name: Kestra CI/CD on: push: branches: [ "main" ] pull_request: branches: [ "main" ] jobs: validate: runs-on: ubuntu-latest name: Kestra validate steps: - name: Checkout repo content uses: actions/checkout@v4 - name: Validate all flows uses: kestra-io/github-actions/validate-flows@main with: directory: ./kestra/flows server: https://server-url.com ``` ## Deploy Your Flows The Deploy Flows Action sets up a workflow to deploy when new commits are pushed to the `main` branch. Specify a `directory` containing your flows and optionally a `namespace` to deploy them to. 
For the full list of inputs, see the [GitHub Actions reference](../../version-control-cicd/cicd/01.github-action/index.md#deploy-flows-action-inputs). If you want to deploy flows to multiple namespaces, you can add multiple steps using the Deploy Flows Action, each with a different `namespace` and `directory`. In the example below: 1. Triggers when commits are pushed to `main`. 2. Checks out the repository so we can access the files in later steps. 3. Deploys flows from `kestra/flows` to the `company.team` namespace in the Kestra instance. ```yaml name: Kestra CI/CD on: push: branches: [ "main" ] jobs: deploy: runs-on: ubuntu-latest name: Kestra deploy steps: - name: Checkout repo content uses: actions/checkout@v4 - name: Deploy flows uses: kestra-io/github-actions/deploy-flows@main with: namespace: company.team directory: ./kestra/flows server: https://server-url.com ``` ## Deploy Namespace Files Using the Deploy Namespace Files Action, you can deploy configuration files or other resources to a namespace. This is useful for managing shared files across your flows. In the example below: 1. Triggers when commits are pushed to `main`. 2. Checks out the repository so we can access the files in later steps. 3. Deploys a configuration file to the `company.team` namespace. ```yaml name: Kestra CI/CD on: push: branches: [ "main" ] jobs: deploy-nsfiles: runs-on: ubuntu-latest name: Kestra deploy namespace files steps: - name: Checkout repo content uses: actions/checkout@v4 - name: Deploy namespace files uses: kestra-io/github-actions/deploy-namespace-files@main with: namespace: company.team localPath: ./config/app.yaml namespacePath: config/app.yaml server: https://server-url.com ``` ## Authentication If you have [authentication](../../configuration/05.security-and-secrets/index.md) enabled in your Kestra instance, you will need to add additional properties so your action can authenticate with your instance. 
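For context, a `user` and `password` pair is standard HTTP Basic authentication: the credentials are joined with a colon and base64-encoded into an `Authorization` header value. A minimal Python illustration of that encoding (the credentials here are placeholders; in practice the Action builds the header for you from GitHub Secrets):

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the HTTP Basic Authorization header value for user:password."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

# Placeholder credentials for illustration only; in GitHub Actions these
# come from secrets, never from source control.
print(basic_auth_header("admin", "secret"))  # Basic YWRtaW46c2VjcmV0
```

This is why credentials must be stored as secrets: anyone with the encoded header can trivially decode the password.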
### Basic Authentication If you have basic authentication enabled with a username and password (e.g. on the Open Source Edition), you can add the `user` and `password` properties to your action using [GitHub Secrets](https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions): ```yaml name: Kestra CI/CD on: push: branches: [ "main" ] jobs: deploy: runs-on: ubuntu-latest name: Kestra deploy steps: - name: Checkout repo content uses: actions/checkout@v4 - name: Deploy flows uses: kestra-io/github-actions/deploy-flows@main with: namespace: company.team directory: ./kestra/flows server: https://server-url.com user: ${{ secrets.KESTRA_USERNAME }} password: ${{ secrets.KESTRA_PASSWORD }} ``` As you can see, the `user` and `password` are added as secrets with the expression syntax `${{ secrets.name }}` to prevent you from committing these to your repository. ### API Token Authentication If you're using the [Enterprise Edition](../../oss-vs-paid/index.md), you can use an [API Token](../../07.enterprise/03.auth/api-tokens/index.md) instead: ```yaml name: Kestra CI/CD on: push: branches: [ "main" ] jobs: deploy: runs-on: ubuntu-latest name: Kestra deploy steps: - name: Checkout repo content uses: actions/checkout@v4 - name: Deploy flows uses: kestra-io/github-actions/deploy-flows@main with: namespace: company.team directory: ./kestra/flows server: https://server-url.com apiToken: ${{ secrets.KESTRA_API_TOKEN }} ``` ## Set Up a Branch Ruleset If you're working in a team, it can be useful to set up a [Ruleset](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-rulesets/about-rulesets) on your `main` branch to prevent broken flows from being deployed accidentally to your production instance. To enable this, go to the **Settings** of your repository on GitHub and go to **Rules** then **Rulesets**. In here, we can create a new branch ruleset. 
The goal of this ruleset is to protect the `main` branch, as our GitHub Action will automatically deploy any flows in this branch to our Kestra instance. To achieve this, we can set the following branch rules: - Require a pull request before merging - No commits can be made directly to the `main` branch - Require status checks to pass - Requires our Validate Flows Action to pass before we can merge our Pull Requests ![ruleset](./ruleset.png) With these enabled, we are required to make a Pull Request before our flows end up in production. This enables us to run our validate check and require that to pass before we can merge any pull requests. ![pr](./pr.png) In the example above, the flow had incorrect indentation, so it failed the validate check. As a result, the Pull Request cannot be merged until the flow is fixed. --- # Back Up GitHub Repos with Kestra Playground URL: https://kestra.io/docs/how-to-guides/github-repo-backup > Automate GitHub repository backups with Kestra. Schedule periodic exports of your repos to cloud storage with built-in error handling and audit logging. Clone every repository in the `kestra-io` GitHub organization, zip each repo, and upload the archives to Google Cloud Storage (GCS) for safekeeping. --- ## Why run this backup? Organizations often mirror source control data outside GitHub to satisfy compliance, enable disaster recovery drills, or seed analytics and search workloads. This flow collects every repository, produces portable zip artifacts, and stores them in GCS so you have an off-platform copy you can restore or inspect independently of GitHub. This flow has potentially long-running operations, so to test individual tasks efficiently, we use the [Playground](../../09.ui/10.playground/index.md) feature to ensure each component works before a production execution. --- ## Prerequisites - GitHub personal access token stored as `GITHUB_TOKEN`. - GCP service account JSON key stored as `GCP_SERVICE_ACCOUNT`.
- A target bucket such as `gs://your_gcs_bucket/kestra-backups/`. - The [Google Cloud Storage plugin](/plugins/plugin-gcp/cloud-storage-gcs/io.kestra.plugin.gcp.gcs.upload) available to your workers. --- ## Choosing a fetch mode The `repositories.Search` task exposes a `fetchType` property that controls how search results reach downstream tasks: | `fetchType` | Output field | Best for | |---|---|---| | `FETCH` | `rows` — a list of result objects directly in the task output | Moderate result sets where you want to use Pebble expressions immediately | | `FETCH_ONE` | `row` — only the first result | Lookups where you expect a single match | | `STORE` (default) | `uri` — an Ion file written to Kestra internal storage | Large result sets, auditing, or when you need to persist the raw data | | `NONE` | _(empty)_ | Triggering a side-effect without needing results | Use `FETCH` when you want to feed results directly into a ForEach loop with a simple Pebble expression. Use `STORE` when the result set may be large, when you want the raw file persisted in internal storage for inspection or reuse, or when downstream tasks need to read the data multiple times. --- ## Flow Definition — FETCH mode `fetchType: FETCH` places results directly in `outputs.search_kestra_repos.rows` as a list of objects. The ForEach `values` expression reads from that list without any file I/O step. ```yaml id: github_repo_backup namespace: company.team description: Clones Kestra GitHub repositories and backs them up to Google Cloud Storage. tasks: - id: search_kestra_repos type: io.kestra.plugin.github.repositories.Search description: Search for all repositories under the 'kestra-io' GitHub organization. query: "user:kestra-io" fetchType: FETCH oauthToken: "{{ secret('GITHUB_TOKEN') }}" - id: for_each_repo type: io.kestra.plugin.core.flow.ForEach description: Iterate over each found repository. 
values: "{{ outputs.search_kestra_repos.rows | jq('.[].clone_url') }}" tasks: - id: working_dir type: io.kestra.plugin.core.flow.WorkingDirectory description: Create a temporary working directory for cloning and zipping each repository. tasks: - id: clone_repo type: io.kestra.plugin.git.Clone description: Clone the current repository from GitHub. url: "{{ taskrun.value }}" directory: "{{ taskrun.value | split('/') | last | split('.') | first }}" - id: zip_repo type: io.kestra.plugin.scripts.shell.Commands description: Zip the cloned repository's contents. beforeCommands: - apk add zip > /dev/null 2>&1 || true commands: - | REPO_DIR="{{ outputs.clone_repo.directory }}" REPO_NAME="{{ REPO_DIR | split('/') | last }}" cd "${REPO_DIR}" zip -r "../${REPO_NAME}.zip" . outputFiles: - "{{ outputs.clone_repo.directory | split('/') | last }}.zip" - id: upload_to_gcs type: io.kestra.plugin.gcp.gcs.Upload description: Upload the zipped repository to Google Cloud Storage. from: "{{ outputs.zip_repo.outputFiles['' ~ (outputs.clone_repo.directory | split('/') | last) ~ '.zip'] }}" to: "gs://your_gcs_bucket/kestra-backups/{{ outputs.clone_repo.directory | split('/') | last }}.zip" serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}" ``` --- ## Flow Definition — STORE mode `fetchType: STORE` (the default) writes results to an Ion file in Kestra internal storage and returns a `uri`. Use this variant when the result set is large, when you want to retain the raw file for auditing or reuse across multiple tasks, or when you are integrating with tasks that consume a storage URI directly. ```yaml id: github_repo_backup namespace: company.team description: Clones Kestra GitHub repositories and backs them up to Google Cloud Storage. tasks: - id: search_kestra_repos type: io.kestra.plugin.github.repositories.Search description: Search for all repositories under the 'kestra-io' GitHub organization. 
query: "user:kestra-io" fetchType: STORE oauthToken: "{{ secret('GITHUB_TOKEN') }}" - id: for_each_repo type: io.kestra.plugin.core.flow.ForEach description: Iterate over each found repository. values: "{{ outputs.search_kestra_repos.uri | internalStorage.get() | jq('.[].clone_url') }}" tasks: - id: working_dir type: io.kestra.plugin.core.flow.WorkingDirectory description: Create a temporary working directory for cloning and zipping each repository. tasks: - id: clone_repo type: io.kestra.plugin.git.Clone description: Clone the current repository from GitHub. url: "{{ taskrun.value }}" directory: "{{ taskrun.value | split('/') | last | split('.') | first }}" - id: zip_repo type: io.kestra.plugin.scripts.shell.Commands description: Zip the cloned repository's contents. beforeCommands: - apk add zip > /dev/null 2>&1 || true commands: - | REPO_DIR="{{ outputs.clone_repo.directory }}" REPO_NAME="{{ REPO_DIR | split('/') | last }}" cd "${REPO_DIR}" zip -r "../${REPO_NAME}.zip" . outputFiles: - "{{ outputs.clone_repo.directory | split('/') | last }}.zip" - id: upload_to_gcs type: io.kestra.plugin.gcp.gcs.Upload description: Upload the zipped repository to Google Cloud Storage. from: "{{ outputs.zip_repo.outputFiles['' ~ (outputs.clone_repo.directory | split('/') | last) ~ '.zip'] }}" to: "gs://your_gcs_bucket/kestra-backups/{{ outputs.clone_repo.directory | split('/') | last }}.zip" serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}" ``` --- ## How It Works Both variants share the same structure. The only difference is how the search results move from the `search_kestra_repos` task to the `for_each_repo` loop. With `FETCH`, results live in `outputs.search_kestra_repos.rows` as a native list — no file read needed. With `STORE`, results are written to an Ion file and the ForEach expression reads the file via `internalStorage.get()` before applying the `jq` filter. In both cases: 1. `search_kestra_repos` fetches all repositories in the `kestra-io` organization. 2. 
`for_each_repo` loops over each `clone_url` extracted from the results. 3. `working_dir` isolates each iteration, keeping cloned data and archives scoped to a temporary folder. 4. `clone_repo` clones the current repository URL. 5. `zip_repo` compresses the cloned repository and exposes the zip file through `outputFiles` so the next task can read it. 6. `upload_to_gcs` uploads each archive to the chosen bucket path using the GCP service account key. Secrets supply tokens to GitHub and GCP at runtime without embedding credentials in the flow definition. For a smaller dry run, narrow the search query (for example, add `topic:cli`) or slice the list with `jq('.[0:2].clone_url')` in either mode to process only a few repositories. --- ## Use Playground to Test Safely Playground mode helps you validate expensive steps incrementally. Start with the search task to confirm authentication and inspect the results before any cloning. When refining the zip or upload steps, slice the list to a single repository so you can replay those tasks without hitting GitHub or GCS repeatedly. Because Playground keeps prior task outputs, you can iterate on shell commands and storage paths while reusing the same search result and clone, keeping feedback fast and low-risk. Playground mode lets you validate one task at a time without running the whole backup loop. Follow [the Playground guide](../../09.ui/10.playground/index.md) and use this flow as follows: 1. Toggle Playground mode in the editor. 2. Run only `search_kestra_repos` to confirm your GitHub token works and inspect the search output. If the task fails in Playground due to a misconfigured secret, you catch the issue before any full execution is attempted. Update the secret, then run it in Playground again to verify that it's correct. 3.
Temporarily limit the `values` expression to a single repository while you iterate — with FETCH, use `jq('.[0:1].clone_url')`; with STORE, use `jq('.[0:1].clone_url')` after `internalStorage.get()`. 4. Play `zip_repo` and `upload_to_gcs` individually inside Playground; Kestra reuses outputs from previous played tasks, so you avoid recloning every repository. 5. When satisfied, revert any temporary limits and use **Run all tasks** for a full backup execution. This approach prevents unnecessary GitHub calls and GCS writes while you refine the flow logic. --- You now have a reusable flow that continuously backs up the `kestra-io` GitHub organization to GCS with secrets-managed authentication and a safe Playground workflow for testing. --- # Run Go Inside Your Flows URL: https://kestra.io/docs/how-to-guides/golang > Run Go code directly within Kestra flows for high-performance scripting using the Go plugin or inline scripts. Run Go code directly inside your Flows and generate outputs.
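Throughout this page, scripts communicate results back to Kestra by printing a JSON document wrapped in `::` markers to stdout. That envelope is language-agnostic, so before looking at the Go examples, here is its shape sketched in Python (the keys shown are arbitrary sample values, mirroring the Go examples further down):

```python
import json

def kestra_outputs_line(outputs: dict) -> str:
    """Wrap an outputs dict in Kestra's ::{}:: stdout envelope."""
    # Compact separators match the compact JSON printed by the Go examples.
    return "::" + json.dumps({"outputs": outputs}, separators=(",", ":")) + "::"

line = kestra_outputs_line({"test": "value", "int": 2})
print(line)  # ::{"outputs":{"test":"value","int":2}}::
```

Anything a task prints in this form is parsed by Kestra and exposed as output variables to later tasks, regardless of the language that produced it.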
Go is a powerful programming language often used for cloud-native development, CLI utilities, and more. As Go is compiled, it's often much more performant than Python, making it a great alternative for heavy compute workloads. Combining Go's and Kestra's performance, you can build incredibly fast workflows. This guide walks you through how to get Go running in a workflow, how to manage input and output files, and how to pass outputs and metrics back to Kestra for use in later tasks. ## Commands Task There is an official Go plugin with a `Commands` task and an inline `Script` task. This example executes a Namespace file using `Commands`: ```yaml id: golang_commands namespace: company.team tasks: - id: go type: io.kestra.plugin.scripts.go.Commands namespaceFiles: enabled: true commands: - go run main.go ``` The `main.go` file contains a simple print statement: ```go package main import "fmt" func main() { fmt.Println("hello world") } ``` You'll need to add your Golang code using the built-in Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file. You can read more about the Go Commands type in the [Plugin documentation](/plugins/plugin-script-go/io.kestra.plugin.scripts.go.commands). ## Script You can also add your Golang code inline using the `Script` task. ```yaml id: golang_script namespace: company.team tasks: - id: go type: io.kestra.plugin.scripts.go.Script script: | package main import "fmt" func main() { fmt.Println("hello world") } ``` You can also use expressions directly inside your Go code. In this example, inputs are embedded directly into the code: ```yaml id: golang_script_expression namespace: company.team inputs: - id: message type: STRING defaults: "Hello, World!"
- id: number type: INT defaults: 4 tasks: - id: go type: io.kestra.plugin.scripts.go.Script script: | package main import "fmt" func main() { fmt.Println("Message: {{ inputs.message }}") fmt.Println("Number: {{ inputs.number }}") } ``` You can read more about the Go Script type in the [Plugin documentation](/plugins/plugin-script-go/io.kestra.plugin.scripts.go.script). ## Handling Outputs If you want to get a variable or file from your Golang code, you can use an [output](../../05.workflow-components/06.outputs/index.md). ### Variable Output You can get the JSON outputs from the Golang script using the `::{}::` pattern. Here is an example: ```yaml id: golang_outputs namespace: company.team tasks: - id: go type: io.kestra.plugin.scripts.go.Script script: | package main import "fmt" func main() { fmt.Println("::{\"outputs\":{\"test\":\"value\",\"int\":2,\"bool\":true,\"float\":3.65}}::") } ``` All the output variables can be viewed in the Outputs tab of the execution. ![golang_outputs](./outputs.png) You can refer to the outputs in another task as shown in the example below: ```yaml id: golang_outputs_usage namespace: company.team tasks: - id: go type: io.kestra.plugin.scripts.go.Script script: | package main import "fmt" func main() { fmt.Println("::{\"outputs\":{\"test\":\"value\",\"int\":2,\"bool\":true,\"float\":3.65}}::") } - id: return type: io.kestra.plugin.core.debug.Return format: '{{ outputs.go.vars.test }}' ``` ### File Output In your Golang code, write a file to the filesystem. You'll need to add the `outputFiles` property to your flow and list the files you want to output. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md). The example below writes an `output.txt` file containing the text "hello go".
We can then refer to the file using the syntax `{{ outputs.{task_id}.outputFiles[''] }}`, and read the contents of the file using the `read()` function. ```yaml id: golang_script namespace: company.team tasks: - id: go type: io.kestra.plugin.scripts.go.Script outputFiles: - output.txt script: | package main import ( "os" ) func check(e error) { if e != nil { panic(e) } } func main() { d1 := []byte("hello go") err := os.WriteFile("output.txt", d1, 0644) check(err) } - id: log type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.go.outputFiles['output.txt']) }}" ``` ## Handling Metrics You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your Golang code. Metrics use the same `::{}::` pattern as outputs. This example demonstrates both the counter and timer metrics. ```yaml id: golang namespace: company.team tasks: - id: go type: io.kestra.plugin.scripts.go.Script script: | package main import "fmt" func main() { fmt.Println("There are 20 products in the cart") fmt.Println("::{\"outputs\":{\"productCount\":20}}::") fmt.Println("::{\"metrics\":[{\"name\":\"productCount\",\"type\":\"counter\",\"value\":20}]}::") fmt.Println("::{\"metrics\":[{\"name\":\"purchaseTime\",\"type\":\"timer\",\"value\":32.44}]}::") } ``` Once this has executed, both metrics can be viewed under **Metrics**. ![metrics](./metrics.png) --- # Configure a Google Service Account in Kestra URL: https://kestra.io/docs/how-to-guides/google-credentials > Securely configure Google Service Accounts in Kestra to authenticate and access Google Cloud resources and Workspace apps. Set Up a Google Service Account in Kestra. When you're using Google Cloud (and for some Google Workspace apps), you're going to need to authenticate in Kestra. The best way to do this is by using a Service Account. However, there are a few ways you can set this up.
This guide will walk you through the best way to get your service account working correctly in Kestra. ## Create Service Account in Google Cloud In Google Cloud, head to `IAM` and then `Service Accounts`. Here you can add the specific roles to the service account before creating it (this will depend on your use case). Once you've done that, you can go to `Keys` and click on `Add Key`. From the dropdown, select `Create New Key`. Select the Key type as `JSON` and click on `Create`. Download the key file, as we'll need it shortly. For more information on Google Cloud Service Accounts, see the [documentation](https://cloud.google.com/iam/docs/service-account-overview). ## Configuring a task with a Service Account In Kestra, you can paste the service account JSON directly into the task property. This is useful for testing purposes: ```yaml - id: upload type: io.kestra.plugin.googleworkspace.drive.Upload from: "{{ inputs.file }}" parents: - "1HuxzpLt1b0111MuKMgy8wAv-m9Myc1E_" name: "My awesome CSV" contentType: "text/csv" mimeType: "application/vnd.google-apps.spreadsheet" serviceAccount: | { "type": "service_account", "project_id": "...", "private_key_id": "...", "private_key": "...", "client_email": "...", "client_id": "...", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "...", "universe_domain": "googleapis.com" } ``` :::alert{type="warning"} This is not recommended as you might expose your key. We'd recommend using [secrets](#add-service-account-as-a-secret) to store your Service Account JSON. ::: ## Add Service Account as a Secret Add the Service Account with the `serviceAccount` property to any Google Cloud or Workspaces task. To do this, add it as a secret to Kestra. There are several ways to add secrets; this guide uses environment variables linked to the Docker Compose file.
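If you'd like to double-check an encoded value before handing it to Docker Compose, the transformation the shell commands in this guide perform with `base64 -w 0` can be reproduced in Python: encode the raw JSON without line wrapping, prefix the key with `SECRET_`, and confirm it decodes back to the original. The payload below is a dummy placeholder, not a real service account.

```python
import base64
import json

# A dummy service-account payload for illustration; use your real
# downloaded service-account.json in practice.
service_account = json.dumps({"type": "service_account", "project_id": "my-project"})

# Kestra expects env-file secrets as SECRET_<NAME>=<base64 value>.
# b64encode never wraps lines, matching `base64 -w 0`.
encoded = base64.b64encode(service_account.encode()).decode()
env_line = f"SECRET_GCP_SERVICE_ACCOUNT={encoded}"
print(env_line)

# Round-trip check: decoding must yield the original JSON intact.
assert base64.b64decode(encoded).decode() == service_account
```

The no-wrapping detail matters: a wrapped base64 value spans multiple lines and silently breaks the `.env` file format.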
For more information on how secrets work, see the [secrets page](../../06.concepts/04.secret/index.md). Once you have the service account file downloaded, you can rename it to `service-account.json`. Then we'll encode the service account JSON and store it inside a file named `.env_encoded` which will hold all of our encoded secrets: ```bash echo SECRET_GCP_SERVICE_ACCOUNT=$(cat service-account.json | base64 -w 0) >> .env_encoded ``` If you already have an existing `.env` file, you can use the following bash script: ```bash #!/bin/bash ENV_FILENAME=.env_encoded while IFS='=' read -r key value; do echo "SECRET_$key=$(echo -n "$value" | base64 -w 0)"; done < .env > $ENV_FILENAME ## Encodes the service account file without line wrapping to make sure the whole JSON value is intact. echo "SECRET_GCP_SERVICE_ACCOUNT=$(cat service-account.json | base64 -w 0)" >> $ENV_FILENAME ``` You can then set the `.env_encoded` file in your `docker-compose.yml`: ```yaml kestra: env_file: .env_encoded ``` ## Access Service Account in Kestra You can now access this in Kestra with the following Pebble expression: ```yaml "{{ secret('GCP_SERVICE_ACCOUNT') }}" ``` We can then add it to the `serviceAccount` property like so: ```yaml - id: upload type: io.kestra.plugin.googleworkspace.drive.Upload from: "{{ inputs.file }}" parents: - "1HuxzpLt1b0111MuKMgy8wAv-m9Myc1E_" name: "My awesome CSV" contentType: "text/csv" mimeType: "application/vnd.google-apps.spreadsheet" serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}" ``` The same secret works for GCP plugin tasks, such as a BigQuery query: ```yaml - id: fetch type: io.kestra.plugin.gcp.bigquery.Query fetch: true sql: | SELECT 1 as id, "John" as name UNION ALL SELECT 2 as id, "Doe" as name serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}" ``` ## Set the Service Account with `PluginDefaults` If you're using multiple tasks that will require the service account secret, you can set up a Plugin Default to apply this property to all tasks of this type.
For example: ```yaml tasks: - id: upload type: io.kestra.plugin.googleworkspace.drive.Upload from: "{{ inputs.file }}" parents: - "1HuxzpLt1b0111MuKMgy8wAv-m9Myc1E_" name: "My awesome CSV" contentType: "text/csv" mimeType: "application/vnd.google-apps.spreadsheet" pluginDefaults: - type: io.kestra.plugin.googleworkspace.drive.Upload values: serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}" ``` ## Configuring Secrets in the Enterprise Edition In Kestra Enterprise Edition, secrets can be managed directly from the UI, meaning there's no need to encode them in base64. To learn more about this, see the [secrets page](../../06.concepts/04.secret/index.md#secrets-in-the-enterprise-edition). ## `GOOGLE_APPLICATION_CREDENTIALS` Another option is to set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable on the nodes running Kestra; it must point to an application credentials file. Warning: it must be identical on all worker nodes, which can raise security concerns. This approach is not advised, as you'll need to mount the JSON file into Docker, which isn't always possible depending on how you've set up Kestra. To set it up, ensure Kestra has access to the JSON file containing the service account details. If you're using Docker, you'll need to create a bind mount like the example below: ```yaml kestra: image: kestra/kestra:latest pull_policy: always user: "root" command: server standalone volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd - ~/.gcp/workflow-orchestration-credentials.json:/.gcp/credentials.json ... ``` The example uses a file at `~/.gcp/workflow-orchestration-credentials.json`. Update this path to the location of your JSON file. It maps it to `/.gcp/credentials.json` inside the container, which we'll need to reference in the environment variable.
After that, add an environment variable under `environment` called `GOOGLE_APPLICATION_CREDENTIALS` ```yaml environment: GOOGLE_APPLICATION_CREDENTIALS: '/.gcp/credentials.json' KESTRA_CONFIGURATION: | ... ``` :::collapse{title="Full Docker Compose with GOOGLE_APPLICATION_CREDENTIALS"} Here is a full Docker Compose that you can use to add a service account using the environment variable `GOOGLE_APPLICATION_CREDENTIALS`: ```yaml volumes: postgres-data: driver: local kestra-data: driver: local services: postgres: image: postgres:18 volumes: - postgres-data:/var/lib/postgresql environment: POSTGRES_DB: kestra POSTGRES_USER: kestra POSTGRES_PASSWORD: k3str4 healthcheck: test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"] interval: 30s retries: 10 kestra: image: kestra/kestra:latest pull_policy: always user: "root" command: server standalone volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd - ~/.gcp/workflow-orchestration-credentials.json:/.gcp/credentials.json environment: GOOGLE_APPLICATION_CREDENTIALS: '/.gcp/credentials.json' KESTRA_CONFIGURATION: | datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 kestra: server: basic-auth: enabled: false username: "admin@kestra.io" # it must be a valid email address password: kestra repository: type: postgres storage: type: local local: base-path: "/app/storage" tutorial-flows: enabled: false queue: type: postgres tasks: tmp-dir: path: /tmp/kestra-wd/tmp url: http://localhost:8080/ ports: - "8080:8080" - "8081:8081" depends_on: postgres: condition: service_started ``` ::: ## Google App Passwords For some Google applications, such as Gmail, you won't use a service account for authentication. Instead, you'll use a normal username and password associated with a Google account. However, this doesn't work if your account has two-factor authentication enabled. 
In that case, you'll need to generate an **App Password**. You can do this by going to **Manage your Google Account**, then **Security**. Select the **App Passwords** option, and you'll be able to generate a new one. This can be used wherever you would normally enter your password to connect it to Kestra.

:::alert{type="info"}
If your account is associated with Google Workspaces, you might need your Administrator to enable App Passwords in the Admin Console.
:::

---

# Connect Google Sheets to Kestra

URL: https://kestra.io/docs/how-to-guides/google-sheets

> Integrate Google Sheets with Kestra workflows. Read spreadsheet data, write pipeline outputs, and trigger flows automatically from sheet updates.

Learn step-by-step how to read data from a Google Sheet in a Kestra flow.

You can use any Google Sheet for this tutorial. If you do not have a Google Sheet, you can create one as follows:

1. Download the [orders CSV dataset](https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv) and save it locally as an `orders.csv` file.
2. Create a new Google Sheet.
3. Navigate to the `File` menu at the top, and select the `Import` option.
4. Navigate to the `Upload` tab, and click on the `Browse` button.
5. Select the recently created `orders.csv` file, and click on the `Open` button at the bottom of the popup.
6. On the `Import file` popup, choose the import location `Replace spreadsheet` and the separator type `Detect automatically`. In this case, it does not matter whether you check or uncheck the box `Convert text to numbers, dates, and formulas`. Click on the `Import data` button.
7. The contents of the file will be populated in the spreadsheet.
8. Give the spreadsheet an appropriate title, and name the sheet containing the order records `orders`.
![sheet_import](./sheet_import.png)
![sheet_upload](./sheet_upload.png)
![sheet_import_data](./sheet_import_data.png)
![sheet_uploaded_data](./sheet_uploaded_data.png)

Now that the spreadsheet is ready, let us assign the appropriate authorization for it:

1. Go to the GCP console and navigate to the [IAM service](https://console.cloud.google.com/iam-admin/iam).
2. Select `Service accounts` from the left navigation menu.
3. Click on `Create Service Account` at the top. You may choose to use an existing service account, in which case you can skip the next step.
4. Enter an appropriate service account name, service account ID (the auto-populated value should be good to start with), and service account description, then click on `Done`.

![create_service_account_1](./create_service_account_1.png)
![create_service_account_2](./create_service_account_2.png)

The new service account has been created. Next, add a key to the service account:

1. Click on the corresponding service account from the Service Accounts page.
2. Navigate to the `Keys` tab, and click on `Add Key` -> `Create new key`.
3. On the `Create private key` popup, select the `JSON` option, and click on `Create`.
4. This will download the service account JSON file to your local computer.
5. Provide this JSON file's content as the secret.
   a. With Kestra EE, provide the secret key `GCP_SERVICE_ACCOUNT_JSON` and the file contents as the value.
   b. For a Docker-based Kestra instance, convert the JSON file's contents into base64-encoded format using `cat .json | base64`, and then provide the secret value as part of the environment file to the Docker instance: `SECRET_GCP_SERVICE_ACCOUNT_JSON=`.

Detailed instructions on creating a service account can also be found in the [Google credentials guide](../google-credentials/index.md).

![create_new_key](./create_new_key.png)

We will now provide access to the spreadsheet for the service account:
1. Copy the email corresponding to the service account from the Service Accounts page.
2. Go to the spreadsheet, and click on the `Share` button on the top right.
3. Add the service account email in the `Add people, groups, and calendar events` text box.
4. You can give the `Viewer` access to the service account.
5. Click on `Done`.

Let us now enable the Google Sheets API in the GCP console:

1. On the GCP console, search for the `Google Sheets API` service, or directly navigate to the [Google Sheets API page](https://console.cloud.google.com/marketplace/product/google/sheets.googleapis.com).
2. Check whether the Google Sheets API is already enabled. If not, you will see an `Enable` button on the page. Click on the `Enable` button.

![enable_google_sheets_api](./enable_google_sheets_api.png)

With this, we are all set to access the Google Spreadsheet from a Kestra flow. Here is what the Kestra flow might look like:

```yaml
id: gsheet
namespace: company.team

tasks:
  - id: read_gsheet
    type: io.kestra.plugin.googleworkspace.sheets.Read
    description: Read data from Google Spreadsheet
    spreadsheetId: "1U4AoiUrqiVaSIVcm_TwDc9RoKOdCULNGWxuC1vmDT_A"
    store: true
    valueRender: FORMATTED_VALUE
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    header: true
```

The `spreadsheetId` in the flow is the ID present in the spreadsheet URL. For the URL `https://docs.google.com/spreadsheets/d/1U4AoiUrqiVaSIVcm_TwDc9RoKOdCULNGWxuC1vmDT_A/edit`, the `spreadsheetId` is `1U4AoiUrqiVaSIVcm_TwDc9RoKOdCULNGWxuC1vmDT_A`.

`store: true` means the values read from the spreadsheet will be stored as a file in Kestra's internal storage. If you only want to fetch the results without storing them as a file, use `fetch: true` instead.

The `serviceAccount` value is fetched from the secret store; its value is the service account key's JSON file contents.
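The ID-from-URL rule described above is easy to automate if your flow receives full spreadsheet URLs. Here is a small Node.js sketch; the helper name is ours for illustration and is not part of Kestra or the Google Workspace plugin:

```javascript
// Extract the spreadsheetId from a Google Sheets URL.
// The ID is the path segment that follows "/spreadsheets/d/".
function spreadsheetIdFromUrl(url) {
    const match = url.match(/\/spreadsheets\/d\/([\w-]+)/);
    if (!match) {
        throw new Error(`Not a Google Sheets URL: ${url}`);
    }
    return match[1];
}

console.log(spreadsheetIdFromUrl(
    "https://docs.google.com/spreadsheets/d/1U4AoiUrqiVaSIVcm_TwDc9RoKOdCULNGWxuC1vmDT_A/edit"
)); // prints the spreadsheetId used in the flow above
```

This kind of helper is useful when the spreadsheet URL arrives as an input and you want to pass only the ID to the `Read` task.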
The `header: true` setting means that the first line of the input contains the column headers.

If you only want a few selected sheets to be read, you can provide the array of sheets via the `selectedSheetsTitle` attribute as follows:

```yaml
selectedSheetsTitle:
  - orders
  - products
```

Here is the output of executing the above Kestra flow:

![gsheet_read_output](./gsheet_read_output.png)

This is how Kestra's Google Workspace plugin can be used to read a spreadsheet with the Sheets [`Read`](/plugins/plugin-googleworkspace/sheets/io.kestra.plugin.googleworkspace.sheets.read) task.

---

# Make HTTP Requests Inside Your Flows

URL: https://kestra.io/docs/how-to-guides/http-request

> Make HTTP requests inside Kestra workflows. Call REST APIs, fetch remote data, and chain API responses as inputs to downstream tasks.

Make HTTP Requests to fetch data and generate outputs.
You can make HTTP Requests directly inside a flow and get outputs from the responses. In this guide, we'll walk you through what HTTP Requests are and how you can use the most common request methods in Kestra.

## What is an HTTP Request?

Hypertext Transfer Protocol (better known as HTTP) [requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages#http_requests) are messages sent between a client and a server to request something. Requests can send or request data, with the most common methods being GET, POST, PUT, and DELETE. We can use these directly in Kestra to interact with third-party systems and make our workflows more powerful.

| Request Method | Description |
| - | - |
| GET | Used to retrieve data from a server |
| POST | Used to create new data on a server |
| PUT | Used to replace data on a server |
| DELETE | Used to delete data on a server |

There are many other request methods too, which you can read more about on the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods).

When you make a request, you will receive a [response](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages#http_responses) from the server with the answer. This answer can drive Kestra automations. First, let's look at what makes up a request and its response.

### Status Code

When you make a request, you'll receive a response with a status code. This tells you whether your request was successful or not. The format follows:

| Status Codes | Description |
| - | - |
| 100 - 199 | Informational |
| 200 - 299 | Successful |
| 300 - 399 | Redirection |
| 400 - 499 | Client error |
| 500 - 599 | Server error |

A few common ones you might have seen include:

- 200: OK - Request was successful.
- 404: Not Found - Request reached the server but the resource wasn't found. A common one you see when you go to a page on a website that doesn't exist.
- 500: Internal Server Error - Request reached the server but the server was unable to process it.
Usually means the server has thrown an error.

You can read the full list of status codes on the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status).

### Headers

Each request also has a set of Request Headers, which provide additional information for the request, such as what client the user is using and the type of content that is sent with the request. You can read more about HTTP Headers on the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/). The response will also have headers following the same structure.

### Body

Lastly, requests can also have a Request Body, which contains all the data we want to send as part of our request. For example, if you wanted to add a user to a system, you would include their information, such as name and email, in the body. Bodies are fundamental for POST and PUT requests, which are used for creating and updating data on other systems, but other methods like GET don't have them. You can read more about the Request Body on the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/API/Request/body).

When you receive your [response](https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages#body_2), it might also have a body, for example the response to a GET request. This is helpful as you can receive data, such as JSON, which you can use in your workflows.

## How can I make HTTP Requests?

You can make requests by putting a URL directly into your browser, especially for GET requests, but it can be challenging to specify the body and headers for other methods, such as POST and PUT requests. There are a variety of tools that make this easier, such as [Postman](https://www.postman.com/) and [cURL](https://curl.se/).

In the example below, we can use Postman to make a POST Request to [dummyjson.com](https://dummyjson.com), which will give us some dummy data.
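Before diving into the Postman example, note that the status-code ranges from the table earlier can be captured in a small helper for scripted response handling. This is a sketch of our own for illustration; the function name is not part of Kestra or any library:

```javascript
// Map an HTTP status code to its class, following the ranges in the table above.
function statusClass(code) {
    if (code >= 100 && code < 200) return "informational";
    if (code >= 200 && code < 300) return "successful";
    if (code >= 300 && code < 400) return "redirection";
    if (code >= 400 && code < 500) return "client error";
    if (code >= 500 && code < 600) return "server error";
    return "unknown";
}

console.log(statusClass(200)); // "successful"
console.log(statusClass(404)); // "client error"
```

A helper like this is handy later when branching on the status code returned by a Request task.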
We can use the `/products/add` route to add a new product by providing a body like this:

```json
{
    "title": "Kestra Pen"
}
```

In Postman, add the URL `https://dummyjson.com/products/add`, set the request type to `POST`, add the body above as a `raw` option, and change the type to JSON. Then press send:

![postman](./postman.png)

We can also do the same with cURL by using the command below:

```bash
curl -X POST https://dummyjson.com/products/add \
  -H 'Content-Type: application/json' \
  -d '{ "title": "Kestra Pen" }'
```

The arguments used are:

- `-X {method} {url}` specifies the HTTP method we want to use, in this case a POST request, as well as the URL we make the request to.
- `-H {header-type}` specifies the headers we want to use.
- `-d {body}` provides the body we want to send.

We get the same response that we got in Postman:

```json
{
    "id": 101,
    "title": "Kestra Pen"
}
```

While these tools are very useful for testing APIs, it can be challenging to automate requests with them, as well as integrate them with other platforms.

## Making HTTP Requests in Kestra

This is where Kestra comes in, enabling us to automate requests alongside other tasks! Below, we'll cover how you can make a `GET`, `POST`, `PUT`, and `DELETE` request directly in your flow.

To make a request, use the task type `io.kestra.plugin.core.http.Request`. For more information on the task type, head over to the [dedicated documentation](/plugins/core/http/io.kestra.plugin.core.http.request).

### GET Request

Making a `GET` Request in Kestra is super useful if you want to fetch up-to-date data from a server and then perform computation directly on it without needing to manually intervene.
In this example, our flow makes a `GET` Request to collect a JSON of products and prints the output to the logs:

```yaml
id: http_get_request_example
namespace: company.team
description: Make a HTTP Request and Handle the Output

tasks:
  - id: send_data
    type: io.kestra.plugin.core.http.Request
    uri: https://dummyjson.com/products
    method: GET
    contentType: application/json

  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.send_data.body }}"
```

We can see the response from the Logs task in the Logs page:

![http_get_logs](./http_get_logs.png)

To view task outputs without Log tasks, use the Outputs page in the UI:

![http_get_outputs](./http_get_outputs.png)

Here, we are using the [Debug Expression](../../05.workflow-components/06.outputs/index.md#using-debug-expression) option, which lets us view specific outputs by using an expression, just as we would to output a dynamic value in a Log task, but after the flow has executed. This is very useful if you're trying to debug tasks and figure out what outputs were generated.

### POST Request

Using our `POST` Request example from earlier, we can recreate it directly in Kestra. We can use our `GET` request example above as a template and build from that.
We'll need to change the following properties:

- `uri` changes to `https://dummyjson.com/products/add`
- `method` changes to `POST`
- `body` is added, containing the data we want to send to the server

```yaml
id: http_post_request_example
namespace: company.team
description: Make a HTTP Request and Handle the Output

inputs:
  - id: payload
    type: JSON
    defaults: |
      { "title": "Kestra Pen" }

tasks:
  - id: send_data
    type: io.kestra.plugin.core.http.Request
    uri: https://dummyjson.com/products/add
    method: POST
    contentType: application/json
    body: "{{ inputs.payload }}"

  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.send_data.body }}"
```

We define the request body as an input so it's easier to remember what it is, change it at execution time, and reuse it in multiple places if we decide to make multiple requests with the same body.

:::alert{type="info"}
If your body message input spans multiple lines, the best practice is to use a Pebble expression to convert it to JSON and avoid escaping issues. For more details, check out this [multiline JSON example with Pebble](../../expressions/02.syntax/index.mdx#multiline-json-bodies).
:::

When we execute this as a `POST` request, this is the response we receive using the same Debug Expression option in the Outputs page:

![http_post_outputs](./http_post_outputs.png)

As we can see, this generates the same output as our earlier example, but with the added benefit that we can pass this data to later tasks to perform computation if we want to!

### PUT Request

Similar to the `POST` request, change the `method` property to `PUT`. Since the `PUT` request replaces content, adjust the body with the data to update.
From the `GET` request, `id` 1 is an `iPhone 9`; change it to an `iPhone 10`:

```yaml
id: http_put_request_example
namespace: company.team
description: Make a HTTP Request and Handle the Output

inputs:
  - id: payload
    type: JSON
    defaults: |
      {"title": "iPhone 10"}

tasks:
  - id: send_data
    type: io.kestra.plugin.core.http.Request
    uri: https://dummyjson.com/products/1
    method: PUT
    contentType: application/json
    body: "{{ inputs.payload }}"

  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.send_data.body }}"
```

As we can see, the response body shows our updated title field:

![http_put_outputs](./http_put_outputs.png)

### DELETE Request

We can also remove a product from the list by using a `DELETE` Request. This example is very similar to the `GET` Request, as we don't need to provide a body.

```yaml
id: http_delete_request_example
namespace: company.team
description: Make a HTTP Request and Handle the Output

inputs:
  - id: product_id
    type: INT

tasks:
  - id: send_data
    type: io.kestra.plugin.core.http.Request
    uri: "https://dummyjson.com/products/{{ inputs.product_id }}"
    method: DELETE
    contentType: application/json

  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.send_data.body }}"
```

Adding an input lets you specify which product to remove by providing the `id` at execution.

![http_delete_input](./http_delete_input.png)

As expected, we get the desired output:

![http_delete_outputs](./http_delete_outputs.png)

---

# Prevent Duplicate Executions with Correlation IDs

URL: https://kestra.io/docs/how-to-guides/idempotency

> Use `system.correlationId` as an idempotency key to group related executions, trace execution lineage across subflows, and skip duplicate processing.

This guide covers two patterns: setting the correlation ID at execution creation for API-triggered flows, and using a custom label for webhook-triggered flows where the key only becomes available after the execution starts.
:::alert{type="info"}
This guide applies to Kestra Enterprise. API token authentication and tenant-scoped endpoints are Enterprise features.
:::

## How `system.correlationId` works

- A built-in system label present on every execution.
- Defaults to the execution's own ID and propagates to subflows automatically — all executions in a lineage share the root execution's `system.correlationId`.
- Can be set to any stable business identifier at execution creation time, such as a payment intent, message UUID, or event key.
- Immutable once the execution is created — it cannot be changed mid-flow.

## When to use it

Use `system.correlationId` when the same business event might arrive more than once and you need to process it only once:

- payment processing triggered via API
- event-driven pipelines consuming messages from a queue
- any flow where the caller controls the trigger and holds the stable key at creation time

For **webhook-triggered flows**, where the idempotency key arrives in the request headers after the execution is already created, use a custom label instead. See [Webhook-triggered flows](#webhook-triggered-flows).

## Set the correlation ID at execution creation

Pass `system.correlationId` as a `labels` query parameter when creating the execution. Labels use `key:value` format. Replace `{your-tenant}` with your tenant ID (visible in **Administration → Tenants**).

```bash
curl -X POST "http://localhost:8080/api/v1/{your-tenant}/executions/company.team/payments?labels=system.correlationId:payment-ORD-123" \
  -H "Authorization: Bearer {your-api-token}"
```

Use this approach when the caller already holds the stable business key.

## Short-circuit duplicates early

After setting the correlation ID at creation, check whether a successful execution with that key already exists. If one does, skip the current run.
Pass the idempotency key as both the `labels` query parameter and an input so the flow can reference it in the duplicate check:

```bash
curl -X POST "http://localhost:8080/api/v1/{your-tenant}/executions/company.team/payment_flow_guarded?labels=system.correlationId:payment-ORD-123" \
  -H "Authorization: Bearer {your-api-token}" \
  -F "idempotencyKey=payment-ORD-123"
```

```yaml
id: payment_flow_guarded
namespace: company.team

inputs:
  - id: idempotencyKey
    type: STRING

tasks:
  - id: check_existing
    type: io.kestra.plugin.core.http.Request
    uri: "http://localhost:8080/api/v1/{{ kv('KESTRA_TENANT') }}/executions/search?filters[labels][EQUALS][system.correlationId]={{ inputs.idempotencyKey }}&filters[namespace][EQUALS]=company.team&filters[flowId][EQUALS]=payment_flow_guarded&filters[state][IN]=SUCCESS&size=1"
    method: GET
    headers:
      Authorization: "Bearer {{ secret('KESTRA_API_TOKEN') }}"

  - id: maybe_skip
    type: io.kestra.plugin.core.flow.If
    condition: "{{ not (outputs.check_existing.body contains '\"total\":0') }}"
    then:
      - id: skip
        type: io.kestra.plugin.core.log.Log
        message: "Duplicate {{ inputs.idempotencyKey }} skipped; already succeeded"
    else:
      - id: continue
        type: io.kestra.plugin.core.log.Log
        message: "First time for {{ inputs.idempotencyKey }}, proceed"
```

Store your tenant ID and API token as a [KV pair](../../06.concepts/05.kv-store/index.md) and [Secret](../../06.concepts/04.secret/index.md) respectively.

## Webhook-triggered flows

`system.correlationId` is assigned automatically when the execution is created and cannot be changed afterwards. For webhook-triggered flows, the provider's idempotency key is only available once the execution has started. Store it in a **custom label** using the [Labels task](/plugins/core/tasks/executions/io.kestra.plugin.core.execution.Labels), then use that label for the duplicate check.
```yaml
id: payment_webhook
namespace: company.team

variables:
  idem_key: "{{ trigger.headers['Idempotency-Key'] | first }}"

tasks:
  - id: set_idempotency_key
    type: io.kestra.plugin.core.execution.Labels
    labels:
      idempotency.key: "{{ vars.idem_key }}"

  - id: check_existing
    type: io.kestra.plugin.core.http.Request
    uri: "http://localhost:8080/api/v1/{{ kv('KESTRA_TENANT') }}/executions/search?filters[labels][EQUALS][idempotency.key]={{ vars.idem_key }}&filters[namespace][EQUALS]=company.team&filters[flowId][EQUALS]=payment_webhook&filters[state][IN]=SUCCESS&size=1"
    method: GET
    headers:
      Authorization: "Bearer {{ secret('KESTRA_API_TOKEN') }}"

  - id: maybe_skip
    type: io.kestra.plugin.core.flow.If
    condition: "{{ not (outputs.check_existing.body contains '\"total\":0') }}"
    then:
      - id: skip
        type: io.kestra.plugin.core.log.Log
        message: "Duplicate {{ vars.idem_key }} skipped"
    else:
      - id: process_payment
        type: io.kestra.plugin.core.log.Log
        message: "Charge payment for {{ vars.idem_key }}"

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: payment-events
```

## Operate with correlation IDs

- **UI filtering:** In Executions, add the label filter `system.correlationId:your-key` to view the entire lineage.
- **API search:** Use `filters[labels][EQUALS][system.correlationId]={value}` and `filters[state][IN]=SUCCESS` as query parameters in the Executions search API to audit or detect duplicates programmatically.
- **Subflows:** The value propagates automatically, so downstream executions share the same `system.correlationId` without additional configuration.

:::alert{type="warning"}
`system.correlationId` identifies and groups executions for the same business event, but it does not prevent duplicate processing on its own. Pair it with an explicit duplicate check as shown in the examples in this guide. The duplicate check is not atomic.
If two executions with the same key start simultaneously, both may pass the SUCCESS check before either completes — neither will be in SUCCESS state yet. For strict once-only guarantees under concurrent load, enforce uniqueness at the system that triggers the execution (message broker deduplication, database unique constraint, or API gateway idempotency).
:::

## Quick checklist

- [ ] Pick a stable business key (payment intent, message ID, event key).
- [ ] For API-triggered flows: set `system.correlationId` at execution creation via the `labels` query parameter.
- [ ] For webhook-triggered flows: store the provider's idempotency key as a custom label via the Labels task.
- [ ] Add an early guard to skip if a successful execution with the same key already exists.
- [ ] Filter by `system.correlationId` in the UI or API for audit and lineage troubleshooting.

---

# Pass Inputs via an API Call

URL: https://kestra.io/docs/how-to-guides/inputs-api

> Learn how to pass dynamic inputs to Kestra flow executions via API calls to parameterize your workflows at runtime.

Passing Inputs via an API Call

Inputs allow you to dynamically pass data to your execution at runtime. For a detailed overview of inputs, see the [Inputs](../../05.workflow-components/05.inputs/index.md) documentation page.

## Example

If you want to trigger a flow and change the value of an input, you can do so by triggering your flow via the API and passing your new input in the form data. Take the following flow as an example:

```yaml
id: inputs_demo
namespace: company.team

inputs:
  - id: user
    type: STRING
    defaults: Rick Astley

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hey there, {{ inputs.user }}
```

This flow has the input `user`, which we can modify via an API call. We can do that by triggering this flow and passing our new input using the form data.
```yaml
id: input_api
namespace: company.team

tasks:
  - id: basic_auth_api
    type: io.kestra.plugin.core.http.Request
    uri: http://host.docker.internal:8080/api/v1/main/executions/dev/inputs_demo
    method: POST
    contentType: multipart/form-data
    formData:
      user: John Doe
```

The above example assumes you are running Kestra locally in Docker. If you are running Kestra in a different environment, replace `http://host.docker.internal:8080` with the correct URL.

If you configured basic authentication for your Kestra instance, you can add the `basicAuthUser` and `basicAuthPassword` options to the `Request` task:

```yaml
id: api_call
namespace: company.team

tasks:
  - id: basic_auth_api
    type: io.kestra.plugin.core.http.Request
    uri: http://host.docker.internal:8080/api/v1/main/executions/dev/inputs_demo
    options:
      basicAuthUser: admin
      basicAuthPassword: admin
    method: POST
    contentType: multipart/form-data
    formData:
      user: John Doe
```

When you execute the `api_call` flow, it will trigger the `inputs_demo` flow with the new `user` input.

![input_api_log](./input_api_log.png)

---

# Validate Inputs with the Enum Data Type

URL: https://kestra.io/docs/how-to-guides/inputs-enum

> Use ENUM-type inputs in Kestra to restrict flow parameters to a predefined set of values, improving validation and reducing runtime configuration errors.

Input validation with the Enum data type

Inputs allow you to dynamically pass data to your execution at runtime. For a detailed overview of inputs, see the [Inputs](../../05.workflow-components/05.inputs/index.md) documentation page.

## Input validation with the Enum data type

The following example shows how to use the `ENUM` input type and the `Switch` task to validate user input and conditionally branch the flow based on the input value.

```yaml
id: orchestrate_everything
namespace: company.team

inputs:
  - id: use_case
    description: What do you want to orchestrate?
    type: ENUM
    defaults: Data pipelines
    values:
      - Data pipelines
      - Microservices
      - Business processes
      - Marketing automation

tasks:
  - id: conditional_branching
    type: io.kestra.plugin.core.flow.Switch
    value: "{{ inputs.use_case }}"
    cases:
      Data pipelines:
        - id: data_pipelines
          type: io.kestra.plugin.core.log.Log
          message: Managing important data products
      Microservices:
        - id: microservices
          type: io.kestra.plugin.core.log.Log
          message: Orchestrating critical applications
      Business processes:
        - id: business_processes
          type: io.kestra.plugin.core.log.Log
          message: Orchestrating critical applications
      Marketing automation:
        - id: marketing_automation
          type: io.kestra.plugin.core.log.Log
          message: Orchestrating critical applications
```

You can add an arbitrary number of cases to the `Switch` task, and each case can contain one or more tasks.

By using the `defaults` attribute, you can specify a default input value that will be prefilled in the dropdown menu in the UI when executing the flow.

:::alert{type="info"}
It's not possible to launch a workflow execution without selecting a value from the dropdown menu. The requirement to select a value guarantees that the flow is only executed with valid input `values` defined by the `ENUM` type.
:::

---

# Run JavaScript Inside Your Flows

URL: https://kestra.io/docs/how-to-guides/javascript

> Run JavaScript and Node.js scripts in Kestra. Install npm packages at runtime and pass outputs between tasks using inputs and variables.

Run Node.js code directly in your flows and generate outputs.

You can execute NodeJS code in a flow either by writing your NodeJS inline or by executing a `.js` file. You can also get outputs and metrics from your NodeJS code.
In this example, the flow installs the required npm packages, makes an API request to fetch data, and uses the NodeJS Kestra library to generate outputs and metrics from this data.

## Scripts

If you want to write a short amount of NodeJS code to perform a task, you can use the `io.kestra.plugin.scripts.node.Script` type to write it directly inside your flow. This allows you to keep everything in one place.

```yaml
id: js_scripts
namespace: company.team
description: This flow will install the npm package in a Docker container, and use kestra's NodeJS Script task to run the script.

tasks:
  - id: run_nodejs_script
    type: io.kestra.plugin.scripts.node.Script
    beforeCommands:
      - npm install requestify
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      containerImage: node:slim
    script: |
      const requestify = require('requestify');

      function GetDockerImageDownloads(imageName){
          // Queries the Docker Hub API to get the number of downloads for a specific Docker image.
          var url = `https://hub.docker.com/v2/repositories/${imageName}/`;
          console.log(url);
          requestify.get(url)
              .then(function(response) {
                  const result = JSON.parse(response.body);
                  console.log(result['pull_count']);
                  return result['pull_count'];
              })
              .catch(function(error) {
                  console.log(error);
              });
      }

      GetDockerImageDownloads("kestra/kestra")
```

You can read more about the Script type in the [Plugin documentation](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.script).

## Commands

If you would prefer to put your NodeJS code in a `.js` file (e.g. your code is much longer or spread across multiple files), you can run the previous example using the `io.kestra.plugin.scripts.node.Commands` type:

```yaml
id: js_commands
namespace: company.team
description: This flow will install the npm package in a Docker container, and use kestra's NodeJS Commands task to run the script.
tasks:
  - id: run_nodejs_commands
    type: io.kestra.plugin.scripts.node.Commands
    namespaceFiles:
      enabled: true
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      containerImage: node:slim
    beforeCommands:
      - npm install requestify
    commands:
      - node docker_image_downloads.js
```

You'll need to add your JavaScript code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag of the `namespaceFiles` property to `true` so Kestra can access the file.

You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.commands).

## Handling Outputs

If you want to get a variable or file from your JavaScript code, you can use an [output](../../05.workflow-components/06.outputs/index.md). To pass your variables to Kestra, install the [`@kestra-io/libs` npm package](https://npm.io/package/@kestra-io/libs).

```bash
npm install @kestra-io/libs
```

### Variable Output

You'll need to use the `Kestra` class to pass your variables to Kestra as outputs. Using the `outputs` method, you can pass a dictionary of variables where the `key` is the name of the output you'll reference in Kestra. Using the same example as above, we can pass the number of downloads as an output.

```javascript
const requestify = require('requestify');
const Kestra = require('@kestra-io/libs');

function GetDockerImageDownloads(imageName){
    // Queries the Docker Hub API to get the number of downloads for a specific Docker image.
    var url = `https://hub.docker.com/v2/repositories/${imageName}/`;
    console.log(url);
    requestify.get(url)
        .then(function(response) {
            const result = JSON.parse(response.body);
            Kestra.outputs({"pull_count": result['pull_count']});
            return result['pull_count'];
        })
        .catch(function(error) {
            console.log(error);
        });
}

GetDockerImageDownloads("kestra/kestra");
```

Once your NodeJS file has executed, you'll be able to access the outputs in later tasks, as seen below:

```yaml
id: outputs_nodejs
namespace: company.team
description: This flow will install the npm package in a Docker container, and use kestra's NodeJS Commands task to run the script.

tasks:
  - id: run_nodejs_commands
    type: io.kestra.plugin.scripts.node.Commands
    namespaceFiles:
      enabled: true
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      containerImage: node:slim
    beforeCommands:
      - npm install requestify
      - npm install @kestra-io/libs
    commands:
      - node outputs_nodejs.js

  - id: log_downloads
    type: io.kestra.plugin.core.log.Log
    message: "Number of downloads: {{ outputs.run_nodejs_commands.vars.pull_count }}"
```

_This example works for both `io.kestra.plugin.scripts.node.Script` and `io.kestra.plugin.scripts.node.Commands`._

### File Output

Inside of your JavaScript code, write a file to the system. You'll need to add the `outputFiles` property to your flow and list the file you're trying to access. In this case, we want to access `downloads.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes a `.txt` file containing the number of downloads, similar to the output we used earlier.
We can then read the content of the file using the syntax `{{ outputs.{task_id}.outputFiles['downloads.txt'] }}`:

```yaml
id: js_outputs_files_scripts
namespace: company.team
description: This flow will install the npm package in a Docker container, and use kestra's NodeJS library to generate outputs (number of downloads of the Kestra Docker image).

tasks:
  - id: nodejs_outputs
    type: io.kestra.plugin.scripts.node.Script
    beforeCommands:
      - npm install requestify
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: node:slim
    outputFiles:
      - downloads.txt
    script: |
      const requestify = require('requestify');
      const fs = require('fs');

      function GetDockerImageDownloads(imageName){
          // Queries the Docker Hub API to get the number of downloads for a specific Docker image.
          var url = `https://hub.docker.com/v2/repositories/${imageName}/`;
          console.log(url);
          requestify.get(url)
              .then(function(response) {
                  const result = JSON.parse(response.body);
                  fs.writeFile("downloads.txt", result['pull_count'].toString(), (err) => {
                      if (err) throw err;
                  });
                  return result['pull_count'];
              })
              .catch(function(error) {
                  console.log(error);
              });
      }

      GetDockerImageDownloads("kestra/kestra");
```

We can also preview the file directly in the Outputs tab.

![outputs](./outputs.png)

_This example works for both `io.kestra.plugin.scripts.node.Script` and `io.kestra.plugin.scripts.node.Commands`._

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your NodeJS code. In this example, we can use the `Date` class to measure the execution time of the function and then pass this to Kestra so it can be viewed in the Metrics tab. No flow changes are needed.

```javascript
const Kestra = require('@kestra-io/libs');
const requestify = require('requestify');

function GetDockerImageDownloads(imageName){
    // Queries the Docker Hub API to get the number of downloads for a specific Docker image.
    var url = `https://hub.docker.com/v2/repositories/${imageName}/`;
    console.log(url);
    requestify.get(url)
        .then(function(response) {
            const result = JSON.parse(response.body);
            Kestra.outputs({"pull_count": result['pull_count']});
            return result['pull_count'];
        })
        .catch(function(error) {
            console.log(error);
        });
}

const start = new Date().getTime();
GetDockerImageDownloads("kestra/kestra");
const end = new Date().getTime();
const duration = (end - start) / 1000;
Kestra.timer('duration', duration);
```

Once this has executed, `duration` will be viewable under **Metrics**.

![metrics](./metrics.png)

## Execute GraalVM Task

Kestra also supports GraalVM integration, allowing you to execute JavaScript code directly on the JVM, with the potential for performance improvements. There are currently two tasks:

- [Eval](/plugins/plugin-graalvm/javascript-tasks-on-graalvm/io.kestra.plugin.graalvm.js.eval)
- [FileTransform](/plugins/plugin-graalvm/javascript-tasks-on-graalvm/io.kestra.plugin.graalvm.js.filetransform)

In this example, the `Eval` task is used to manipulate data from a previous task. As GraalVM can polyfill from Java, we can use the `int()` function to convert the string into an integer. Additionally, using the `outputs` property simplifies the process of fetching variables from JavaScript and accessing them inside Kestra. It is useful if you want to manipulate data and pass the new format to another task.

```yaml
id: parse_json_data
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: http://xkcd.com/info.0.json

  - id: graal
    type: io.kestra.plugin.graalvm.js.Eval
    outputs:
      - data
    script: |
      data = {{ read(outputs.download.uri) }}
      data["next_month"] = int(data["month"]) + 1
```

---
# Connect Web Apps to Kestra via Webhooks
URL: https://kestra.io/docs/how-to-guides/js-webhook

> Integrate your web applications with Kestra using Webhook triggers to start workflows from your frontend or backend code.

Integrate Kestra into your JavaScript App using Webhooks.
Kestra's API-first design lets you build web applications that use Kestra as a backend server. This is useful when a request from your website needs to start a workflow execution. For example, in an online shop, Kestra can receive incoming orders and start processing them. In this guide, we'll walk through how to set up Kestra to receive webhooks, and how to build a basic JavaScript application with React.js that makes requests to it.
## Configuring CORS To allow requests to Kestra from a JavaScript application running locally, enable CORS in your Kestra configuration: ```yaml micronaut: server: cors: enabled: true ``` More information can be found in the [Observability and Networking configuration](../../configuration/03.observability-and-networking/index.md). ## Building a Workflow with a Webhook Trigger The JavaScript application needs a workflow with a [Webhook Trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) to receive requests and start executions. Once we've added it, we can add any tasks to run. In this example, we have a log message that will log the request body field `dataField` from the webhook: ```yaml id: webhook_example namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "{{ trigger.body.dataField ?? 'null' }}" triggers: - id: webhook type: io.kestra.plugin.core.trigger.Webhook key: abcdefg ``` ## Building our JavaScript application with React.js In this example, I am using React.js to interact with Kestra but this will work with any web framework that can make requests. Create the application using `create-react-app`: ```bash npx create-react-app example ``` Start it with: ```bash npm start ``` Navigate to the application at `localhost:3000`. With the application running, modify `App.js` to make a request to Kestra. First, install axios to make the POST request: ```bash npm install axios ``` Once we've done that, we can add a `useState` hook to help us make our request and handle the response, specifically handle the request body state: ```js function App() { const [formData, setFormData] = useState({}); const handleSubmit = async (e) => { e.preventDefault(); try { await axios.post('http://localhost:8080/api/v1/main/executions/webhook/company.team/webhook_example/abcdefg', formData).then(response => { console.log(response.data) }); } catch (error) { console.error('Error:', error); } }; return (

    <div className="App">
      <header className="App-header">
        <h1>Kestra Webhook Example</h1>
      </header>
    </div>

); } ``` Get the Webhook URL by navigating to **Triggers** at the top of the flow in Kestra and hovering over the webhook icon on the right: ![trigger_copy](./trigger_copy.png) This example makes a request with data from a form (added later) using the `useState` hook, storing state in `formData` and updating it using `setFormData`. Add UI elements to set `formData`, make the request, and display the response. Modify the JSX in the return statement to include a form that handles the request: ```js function App() { ... return (

    <div className="App">
      <header className="App-header">
        <h1>Kestra Webhook Example</h1>
        <h2>Send a message to Kestra</h2>
        <form onSubmit={handleSubmit}>
          <input type="text" name="dataField" onChange={handleChange} />
          <button type="submit">Submit</button>
        </form>
      </header>
    </div>

);
}
```

The form uses `onSubmit` and `onChange` to call functions. We use `onSubmit` on the form to call our newly added `handleSubmit` function when the **Submit** button is pressed. However, we don't have a function to automatically add our text to our state variable `formData`. We can add a new function called `handleChange` which will use our state updater function `setFormData` to update `formData` every time new text is added to the input. This means that when we press **Submit**, the text is ready to be sent in a request body.

```js
function App() {
  const [formData, setFormData] = useState({});

  const handleSubmit = async (e) => {
    e.preventDefault();
    try {
      await axios.post('http://localhost:8080/api/v1/main/executions/webhook/company.team/webhook_example/abcdefg', formData).then(response => {
        console.log(response.data)
      });
    } catch (error) {
      console.error('Error:', error);
    }
  };

  const handleChange = (e) => {
    setFormData({
      ...formData,
      [e.target.name]: e.target.value
    });
  };

  return (

    <div className="App">
      <header className="App-header">
        <h1>Kestra Webhook Example</h1>
        <h2>Send a message to Kestra</h2>
        <form onSubmit={handleSubmit}>
          <input type="text" name="dataField" onChange={handleChange} />
          <button type="submit">Submit</button>
        </form>
      </header>
    </div>

); } ``` Now our example will collect the data in the `input` field as `dataField` and send it in our request as a key value pair: `dataField: {the input value}`. For example, if I type "Hello" and press **Submit**, it will send the body `{dataField: "Hello"}`. ![js-final](./js-final.png) The last thing to add now is to display the response back to the user. We can add another state variable called `responseData` to handle the response from the request. We can add it into our JSX to only display if we have a response: ```js {responseData.id &&

  <p>Execution ID: {responseData.id}</p>

}
```

In this case, `id` is the Execution ID of the Execution that was started by the webhook request. Now that we've added the response handling, our full `App.js` should look like this:

```js
import React, { useState } from 'react';
import axios from 'axios';
import './App.css';

function App() {
  const [formData, setFormData] = useState({});
  const [responseData, setResponseData] = useState({});

  const handleSubmit = async (e) => {
    e.preventDefault();
    try {
      await axios.post('http://localhost:8080/api/v1/main/executions/webhook/company.team/webhook_example/abcdefg', formData).then(response => {
        setResponseData(response.data)
      });
    } catch (error) {
      console.error('Error:', error);
    }
  };

  const handleChange = (e) => {
    setFormData({
      ...formData,
      [e.target.name]: e.target.value
    });
  };

  return (

    <div className="App">
      <header className="App-header">
        <h1>Kestra Webhook Example</h1>
        <h2>Send a message to Kestra</h2>
        <form onSubmit={handleSubmit}>
          <input type="text" name="dataField" onChange={handleChange} />
          <button type="submit">Submit</button>
        </form>
        {responseData.id &&
          <p>Execution ID: {responseData.id}</p>
        }
      </header>
    </div>
);
}

export default App;
```

![js-response](./js-response.png)

This will:

1. Display a form with an input and a button.
2. Make a request to Kestra with the input data as our request body.
3. Receive the response and display it to the user.

When we type in a value and press **Submit**, we can see a new execution is created in Kestra and our request body was received and used in our Log task:

![kestra-logs](./kestra-logs.png)

:::collapse{title="CSS Styling"}
These are the CSS styles used in the example:

```css
.App {
  text-align: center;
}

.App-header {
  background-color: #4b0aaa;
  min-height: 100vh;
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;
  font-size: calc(10px + 2vmin);
  color: white;
}

.App h1 {
  font-size: 34px;
}

.App h2 {
  font-size: 24px;
}

.App p {
  font-size: 16px;
}
```
:::

---
# Work with JSON in Kestra
URL: https://kestra.io/docs/how-to-guides/json

> Learn how to interact with JSON data in Kestra workflows, including parsing, accessing nested fields, and using jq expressions.

Interact with JSONs using expressions. APIs often use JSON bodies to send data. Being able to interact with them in your workflows is crucial to any API-related orchestration.
## Making a Request Inside of Your Workflow The API `https://kestra.io/api/mock` will return a JSON body that looks like the following: ```json { "title":"Success", "method":"GET", "params":{}, "code":200, "createdAt":"2025-07-04T15:42:29.545Z", "body":"Request processed successfully" } ``` Kestra can make a request to this API using the `Request` task. This will give us an output called `body` containing our JSON body. To access this in later tasks, we can use an [expression](../../expressions/index.mdx) like `{{ outputs.request.body }}`. This will return the full body: ```yaml id: json_demo namespace: company.team tasks: - id: request type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock - id: log type: io.kestra.plugin.core.log.Log message: "My response: {{ outputs.request.body }}" ``` The log message returns: ```json My response: {"title":"Success","method":"GET","params":{},"code":200,"createdAt":"2025-07-04T16:36:44.193Z","body":"Request processed successfully"} ``` ## Accessing Part of the Body However, if the body is large, we may only want to access a certain part of it. To do this, `jq` is required as the expression returns a string, not a JSON. Using `jq`, the JSON can be parsed and accessed: ```yaml {{ outputs.request.body | jq('.title') | first }} ``` This will access the key `title` from the JSON. `jq` will return the result in an array when used within an expression. To access the value, add the function `first` to the end of the expression to remove it from the array. We can put that into the example: ```yaml id: json_demo namespace: company.team tasks: - id: request type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock - id: log type: io.kestra.plugin.core.log.Log message: "My response: {{ outputs.request.body | jq('.title') | first }}" ``` The log message says `My response: Success`. ## Nested JSON If the JSON you're working with has multiple levels, you can extend the `jq` expression. 
In this example, the API `https://kestra.io/api/mock?example=test` has additional parameters which return the following body with nesting: ```json { "title": "Success", "method": "GET", "params": { "example": "test" }, "code": 200, "createdAt": "2025-07-04T16:39:02.871Z", "body": "Request processed successfully" } ``` The `jq` expression can be extended as follows to access `example`: ```yaml {{ outputs.request.body | jq('.params.example') | first }} ``` It looks like this when added to the workflow: ```yaml id: json_demo namespace: company.team tasks: - id: request type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock?example=test - id: log type: io.kestra.plugin.core.log.Log message: "My response: {{ outputs.request.body | jq('.params.example') | first }}" ``` The log message returns `My response: test`. ## Debugging Expressions You can use [Debug Expression](../../05.workflow-components/06.outputs/index.md#using-debug-expression) to test expressions without running your workflow. This is useful for inspecting different parts of a JSON structure. ![debug_outputs](./json1.png) --- # Run Julia Inside Your Flows URL: https://kestra.io/docs/how-to-guides/julia > Execute Julia scripts in Kestra for scientific computing and data analysis. Use Docker to manage Julia dependencies and pass results between tasks. Run Julia code directly in your flows and generate outputs. Julia is renowned for high-performance numerical analysis and computational science. Leverage Kestra to orchestrate your Julia scripts, enhancing their capabilities in large-scale analytics and machine learning applications. From data ingestion to complex numerical simulations, Kestra streamlines your Julia workflows, accelerating development and deployment. This guide is going to walk you through how to get Julia running in a workflow, how to manage input and output files, and how you can pass outputs and metrics back to Kestra to use in later tasks. 
## Executing Julia inside Kestra Kestra has an official plugin for Julia allowing you to execute Julia code in a flow by either writing your Julia inline or by executing a `.jl` file. You can get outputs and metrics from your Julia code too. ### Scripts If you want to write a short amount of Julia code to perform a task, you can use the `io.kestra.plugin.scripts.julia.Script` type to write it directly inside your flow. This allows you to keep everything in one place. ```yaml id: julia_script namespace: company.team description: This flow runs the Julia script. tasks: - id: http_download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: julia_script_task type: io.kestra.plugin.scripts.julia.Script script: | println("The current execution is {{ execution.id }}") # Read the file downloaded in `http_download` task lines = readlines("{{ outputs.http_download.uri }}") println(lines) ``` You can read more about the Scripts type in the [Plugin documentation](/plugins/plugin-script-julia/io.kestra.plugin.scripts.julia.script) ### Commands If you would prefer to put your Julia code in a `.jl` file (e.g. your code is much longer or spread across multiple files), you can run the previous example using the `io.kestra.plugin.scripts.julia.Commands` type: ```yaml id: julia_commands namespace: company.team tasks: - id: run_julia type: io.kestra.plugin.scripts.julia.Commands namespaceFiles: enabled: true commands: - julia main.jl ``` You'll need to add your Julia code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file. You can also have the Julia code written inline. 
```yaml id: julia_commands namespace: company.team tasks: - id: http_download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: run_julia type: io.kestra.plugin.scripts.julia.Commands inputFiles: orders.csv: "{{ read(outputs.http_download.uri) }}" main.jl: | println("The current execution is {{ execution.id }}") # Read the file downloaded in `http_download` task lines = readlines("orders.csv") println(lines) commands: - julia main.jl ``` You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-julia/io.kestra.plugin.scripts.julia.commands). ## Handling Outputs If you want to get a variable or file from your Julia script, you can use an [output](../../05.workflow-components/06.outputs/index.md). ### Variable Output You can get the JSON outputs from the Julia commands / script using the `::{}::` pattern. Here is an example: ```yaml id: julia_outputs namespace: company.team description: This flow runs the Julia script, and outputs the variable. tasks: - id: julia_outputs_task type: io.kestra.plugin.scripts.julia.Script script: | println("::{\"outputs\":{\"test\":\"value\",\"int\":2,\"bool\":true,\"float\":3.65}}::") ``` All the output variables can be viewed in the Outputs tab of the execution. ![julia_outputs](./outputs.png) You can refer to the outputs in another task as shown in the example below: ```yaml id: julia_outputs_usage namespace: company.team description: This flow runs the Julia script, and outputs the variable. 
tasks:
  - id: julia_outputs_task
    type: io.kestra.plugin.scripts.julia.Script
    script: |
      println("::{\"outputs\":{\"test\":\"value\",\"int\":2,\"bool\":true,\"float\":3.65}}::")

  - id: return
    type: io.kestra.plugin.core.debug.Return
    format: '{{ outputs.julia_outputs_task.vars.test }}'
```

_This example works for both `io.kestra.plugin.scripts.julia.Script` and `io.kestra.plugin.scripts.julia.Commands`._

### File Output

Inside your Julia script, write a file to the system. You'll need to add the `outputFiles` property to your flow and list the files you want to output. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes an `output.txt` file containing the "Hello World" text. We can then reference the file using the syntax `{{ outputs.{task_id}.outputFiles['output.txt'] }}`, and read the contents of the file using the `read()` function.

```yaml
id: julia_output_file
namespace: company.team
description: This flow runs the Julia script to output a file.

tasks:
  - id: julia_outputs_task
    type: io.kestra.plugin.scripts.julia.Script
    outputFiles:
      - output.txt
    script: |
      open("output.txt", "w") do file
          write(file, "Hello World")
      end

  - id: log_output
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.julia_outputs_task.outputFiles['output.txt']) }}"
```

_This example works for both `io.kestra.plugin.scripts.julia.Script` and `io.kestra.plugin.scripts.julia.Commands`._

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your Julia script. Metrics use the same `::{}::` pattern as outputs. This example demonstrates both the counter and timer metrics.

```yaml
id: julia_metrics
namespace: company.team
description: This flow runs the Julia script, and outputs the metrics.
tasks: - id: julia_metrics_task type: io.kestra.plugin.scripts.julia.Script script: | println("There are 20 products in the cart") println("::{\"outputs\":{\"productCount\":20}}::") println("::{\"metrics\":[{\"name\":\"productCount\",\"type\":\"counter\",\"value\":20}]}::") println("::{\"metrics\":[{\"name\":\"purchaseTime\",\"type\":\"timer\",\"value\":32.44}]}::") ``` Once this has executed, both the metrics can be viewed under **Metrics**. ![metrics](./metrics.png) --- # Configure Keycloak SSO in Kestra URL: https://kestra.io/docs/how-to-guides/keycloak > Integrate Keycloak with Kestra for SSO. Configure OpenID Connect authentication to secure your Kestra instance with an external identity provider. Set up Keycloak SSO to manage authentication for users. If you don't have a Keycloak server already running, you can use a managed service like [Cloud IAM](https://app.cloud-iam.com). You can follow the [Cloud IAM getting started tutorial](https://documentation.cloud-iam.com/get-started/complete-tutorial.html) to deploy a managed Keycloak cluster for free. ## Configure Keycloak client Once in Keycloak, you need to create a client: ![alt text](./client1.png) ![alt text](./client2.png) Set `https://{{ yourKestraInstanceURL }}/oauth/callback/keycloak` as Valid redirect URIs and `https://{{ yourKestraInstanceURL }}/logout` as Valid post logout redirect URIs. ![alt text](./redirect-uri.png) ## Kestra Configuration ```yaml micronaut: security: oauth2: enabled: true clients: keycloak: client-id: "{{clientId}}" client-secret: "{{clientSecret}}" openid: issuer: "https://{{keyCloakServer}}/auth/realms/{{yourRealm}}" endpoints: logout: get-allowed: true ``` You can retrieve `clientId` and `clientSecret` via the Keycloak user interface. 
![alt text](./clientId.png) ![alt text](./clientSecret.png) Don't forget to set a default role in your [Kestra Security and Secrets configuration](../../configuration/05.security-and-secrets/index.md) to streamline the process of adding new users. ```yaml kestra: security: defaultRole: name: Editor description: Default Editor role permissions: FLOW: ["CREATE", "READ", "UPDATE", "DELETE"] EXECUTION: - CREATE - READ - UPDATE - DELETE ``` > Note: Depending on the Keycloak configuration, you might want to tune the issuer URL. For more configuration details, refer to the [Keycloak OIDC configuration guide](https://guides.micronaut.io/latest/micronaut-oauth2-keycloak-gradle-java.html). --- # Set Up Secrets from a Helm Chart URL: https://kestra.io/docs/how-to-guides/kubernetes-secrets > Learn how to pass secrets to your Kestra deployment via Helm Chart using environment variables or Kubernetes Secrets. How to add Kestra Secrets to your Helm Chart deployment. :::alert{type="info"} This page is only relevant for the Open-Source edition of Kestra. For the Enterprise Edition, you can use the built-in [Secrets](../../07.enterprise/02.governance/secrets/index.md) functionality allowing you to securely store your secrets in an [external secret manager](../../07.enterprise/02.governance/secrets-manager/index.md) of your choice. ::: ## Pass environment variables directly The simplest way to pass secrets to Kestra is to use environment variables referenced using the `common.extraEnv` property. Each environment variable's key must start with `SECRET_`. To add two secrets to your Helm Chart: 1. `DB_USERNAME` with the value `admin` 2. `DB_PASSWORD` with the value `password` You can set them directly in your Helm Chart `values.yaml` as follows: ```yaml deployments: standalone: enabled: true common: extraEnv: - name: SECRET_DB_USERNAME value: "admin" - name: SECRET_DB_PASSWORD value: "password" ``` :::alert{type="info"} Note how each environment variable's key starts with `SECRET_`. 
This is important for Kestra to recognize them as secrets. ::: Now, install or upgrade your Helm Chart: ```shell helm repo add kestra https://helm.kestra.io/ helm install kestra kestra/kestra -f values.yaml ## or if you already have Kestra installed: helm upgrade kestra kestra/kestra -f values.yaml export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=kestra,app.kubernetes.io/instance=kestra,app.kubernetes.io/component=standalone" -o jsonpath="{.items[0].metadata.name}") kubectl port-forward $POD_NAME 8080:8080 ``` To test that the secrets have been correctly set, go to the UI e.g. http://localhost:8080 and create a new flow: ```yaml id: secret_test namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.output.OutputValues values: username: "{{ secret('DB_USERNAME') }}" password: "{{ secret('DB_PASSWORD') }}" ``` Execute the flow and check the output values in the Outputs tab in the UI. You should see the values `admin` and `password`. --- ## Pass environment variables from a Kubernetes Secret If you want to define your secrets in a Kubernetes Secret, you can use the `common.extraEnvFrom` property in your Helm Chart. This property allows you to reference an existing Kubernetes Secret and pass its values as environment variables to Kestra. Here is an example of a Kubernetes Secret definition: ```yaml apiVersion: v1 kind: Secret metadata: name: db-creds type: Opaque stringData: SECRET_DB_USERNAME: admin SECRET_DB_PASSWORD: password ``` First, create the Secret in your Kubernetes cluster: ```shell kubectl apply -f secret.yaml ``` Then, reference this secret in your Helm Chart `values.yaml`: ```yaml deployments: standalone: enabled: true common: extraEnvFrom: - secretRef: name: db-creds ``` Redeploy your Helm Chart: ```shell helm upgrade kestra kestra/kestra -f values.yaml ``` And test the secrets in a flow as described in the previous section. 
In this method, the Kubernetes Secret's keys must start with `SECRET_` to be recognized as Kestra Secrets. --- ## Use Kubernetes Secrets as Kestra Secrets with `configurations.secrets` An alternative is to mount an entire Kubernetes Secret as a [Kestra configuration](../../configuration/01.configuration-basics/index.md) file using the `configurations.secrets` property. For example, in `values.yaml`: ```yaml configurations: secrets: - name: db-creds key: db.yml ``` And in your Helm chart, define the secret in `extraManifests`: ```yaml extraManifests: - apiVersion: v1 kind: Secret metadata: name: db-creds stringData: db.yml: | kestra: datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra username: admin password: password ``` This method avoids the need for encoding and allows you to configure secrets in YAML format directly. --- ## Summary - Use `common.extraEnv` for simple inline secrets. - Use `common.extraEnvFrom` to load secrets from existing Kubernetes Secret objects. - Use `configurations.secrets` when you want to mount YAML-based secrets as part of Kestra's configuration. Choose the method that best fits your security and deployment requirements. --- # Synchronize Local Flows in Kestra URL: https://kestra.io/docs/how-to-guides/local-flow-sync > Synchronize Kestra flows from a local directory to your development instance for a seamless developer experience using file watching. Sync Flows from a local directory. How to synchronize flows from a local directory on a local development instance.
## Configure your instance

:::alert{type="warning"}
This feature is intended for local development only, which is why you cannot connect to a remote Kestra instance.
:::

When developing on a local Kestra instance, it can be more convenient to keep your flows in a local directory, perhaps synchronized with a GitHub repository on your local machine, and have Kestra load them automatically.

Below is the minimal configuration to enable local flow synchronization:

```yaml
micronaut:
  io:
    watch:
      enabled: true
      paths:
        - /path/to/your/flows
```

Multiple paths can be provided, and nested files will also be watched. Files must end with `.yml` or `.yaml` to be considered flows, and only valid flows will be loaded; invalid flows are ignored. Files created locally should follow the `<tenant>_<namespace>.<flow_id>.yml` or `<tenant>_<namespace>_<flow_id>.yml` naming convention to be loaded properly. For open-source users, the `tenantID` is always `main`. Flows created inside the UI will be written at the root of the first path supplied in the configuration.

:::alert{type="info"}
If you are using the docker-compose installation, you will need to mount a volume so the Kestra container can access your local folder.

```yaml
volumes:
  # ... other volumes
  - ./local_folder:/docker_folder
environment:
  KESTRA_CONFIGURATION: |
    micronaut:
      io:
        watch:
          enabled: true
          paths:
            - /docker_folder
```
:::

## Details

At startup, every file in the watched directory is loaded into the database. Then, every flow that does not yet exist in the watched directory is created in the first path supplied in the configuration.

When a file is created, updated, or deleted in the watched directory, Kestra automatically loads the flow into the database, or removes it if the file is deleted. Conversely, if a flow is created, updated, or deleted in the UI, the corresponding file is created, updated, or deleted in the watched directory.

In the Kestra UI you cannot change a flow's ID or namespace, but in a file you can; in that case, the previous flow is deleted and a new one is created.
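To make this concrete, here is a minimal sketch of a flow file you could place in the watched directory (the `id`, `namespace`, and log message are illustrative, not from the original guide):

```yaml
# Hypothetical minimal flow file saved in the watched directory.
# The id and namespace below are illustrative; adjust them to your setup.
id: hello_local
namespace: company.team

tasks:
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from a locally synced flow!
```

Once saved, the watcher loads the flow into the database and it appears in the UI like any other flow.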
--- # Long-Running Tasks on Kubernetes in Kestra URL: https://kestra.io/docs/how-to-guides/long-running-intensive-tasks > Execute long-running and resource-intensive tasks on Kubernetes using Kestra's podCreate task or Kubernetes Task Runners. Schedule long running and intensive processes with Kestra on Kubernetes. Long running tasks hold strong importance in the world of automation. They can range from data processing, machine learning, and data analytics to batch processing, ETL, and more. While these tasks are essential for business operations, they can be resource-intensive and time-consuming while requiring specific hardware. To execute these tasks efficiently, you need a robust and scalable infrastructure that can handle the workload effectively. Kestra offers various task execution solutions such as Docker, local processes, and Kubernetes. See [Task Runners](../../07.enterprise/04.scalability/task-runners/index.md) for more details. This guide focuses on executing long-running tasks on **Kubernetes** using Kestra. Kubernetes pods are a great fit due to the control and flexibility they provide. With Kubernetes, you can precisely define resource requirements, permissions, namespace, handle workload identity, and ensure proper networking for your tasks. Pods can also access other Kubernetes services hosted on the cluster such as databases, storage, and applications. As an example, this guide uses a [dbt job](https://docs.getdbt.com/docs/running-a-dbt-project/run-your-dbt-projects) to demonstrate how Kestra executes complex tasks on Kubernetes with resource requirements. ## Schedule task in a Kubernetes pod using podCreate Kestra's [podCreate](/plugins/plugin-kubernetes/kubernetes-core/io.kestra.plugin.kubernetes.core.podcreate) task allows you to launch a Kubernetes pod directly by providing the complete Kubernetes YAML configuration as an input. This gives you full control over the pod’s specifications, such as CPU, memory, image, or node selector. 
Here is an example of a dbt job that runs on Kubernetes using Kestra:

```yaml
id: my-dbt-job
namespace: dev

tasks:
  - id: dbt-command
    type: io.kestra.plugin.kubernetes.PodCreate
    # Retry the task if it fails
    retry:
      behavior: RETRY_FAILED_TASK
      maxAttempts: 2
      type: constant
      interval: PT5M
      warningOnRetry: true
    namespace: kestra
    # Define the commands to run
    inputFiles:
      dbt-commands.sh: |
        #!/bin/bash
        # Exit on error
        set -eo pipefail
        # Clone the dbt example repository
        git clone --depth 1 https://github.com/dbt-labs/jaffle_shop_duckdb.git --branch duckdb --single-branch
        # Copy the dbt example repository to the working directory
        cp -a jaffle_shop_duckdb/. .
        # dbt commands to run
        dbt deps
        dbt build
    # Define the pod specification using the Kubernetes YAML syntax
    spec:
      restartPolicy: Never
      containers:
        - name: dbt-duckdb
          image: ghcr.io/kestra-io/dbt-duckdb:latest
          # Specify resource requirements
          resources:
            requests:
              cpu: "300m"
              memory: "500Mi"
          # Run the script in the container
          command:
            - "/bin/bash"
            - "{{workingDir}}/dbt-commands.sh"
      # Node selector to run the pod on a specific node
      nodeSelector: {}
```

This flow will:

- create a Kubernetes pod in the `kestra` namespace with the specified resource requirements: 300m CPU and 500Mi memory
- clone the dbt example repository inside the pod
- run the `dbt deps` and `dbt build` commands

![dbt-pod-create](./pod_create_dbt.png)

At the end of the execution, the pod is deleted, and the logs remain available in the Kestra UI.

![dbt-pod-deleted-after-success](./pod_create_delete.png)

## Embrace Kestra versatility with Kubernetes Task Runners

While podCreate provides deep control, it takes away all the benefits of Kestra's rich plugin ecosystem, the [dbt plugin](/plugins/plugin-dbt/dbt-cli/io.kestra.plugin.dbt.cli.dbtcli) in this case. It can also be cumbersome to manage a complex Kubernetes pod YAML specification for each task, especially when you have multiple commands to run.
Kestra’s Task Runners let you define workflows that benefit from the plugin system, using familiar plugins while still taking advantage of Kubernetes to secure and scale tasks effectively. Task Runners also let you test a task locally using Docker or Process before deploying it on Kubernetes.

The same example would look like this using a Task Runner:

```yaml
id: dbt-task-runner
namespace: dev

tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: kestra
      resources:
        request:
          cpu: "300m"
          memory: "500Mi"
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    commands:
      - git clone --depth 1 https://github.com/dbt-labs/jaffle_shop_duckdb.git --branch duckdb --single-branch
      - cp -a jaffle_shop_duckdb/. .
      - dbt deps
      - dbt build
```

![dbt-task-runner](./task_runner_dbt.png)

To test this flow on a local Kestra instance, change the `taskRunner` type to `io.kestra.plugin.scripts.runner.docker.Docker`:

```yaml
id: dbt-task-runner
namespace: dev

tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    commands:
      - git clone --depth 1 https://github.com/dbt-labs/jaffle_shop_duckdb.git --branch duckdb --single-branch
      - cp -a jaffle_shop_duckdb/. .
      - dbt deps
      - dbt build
```

This flexibility allows you to test your tasks locally before deploying them on Kubernetes.

## Conclusion

Kestra provides a flexible way to execute long-running and intensive tasks on Kubernetes. With Kestra’s Task Runners, you can define workflows that use the plugin system while taking advantage of Kubernetes to secure and scale tasks effectively. If needed, you can also use the `podCreate` task to launch a Kubernetes pod directly by providing the complete Kubernetes YAML configuration as input.
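One last tip: when several tasks in a flow share the same runner configuration, you can declare it once with plugin defaults instead of repeating the `taskRunner` block per task. The following is a minimal sketch under that assumption, reusing the `kestra` namespace and dbt image from the examples above; the flow id and task split are illustrative:

```yaml
id: dbt-task-runner-defaults
namespace: dev

# Apply the same Kubernetes runner and image to every DbtCLI task in this flow.
# forced defaults to false, so an individual task can still override these values.
pluginDefaults:
  - type: io.kestra.plugin.dbt.cli.DbtCLI
    values:
      containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
      taskRunner:
        type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
        namespace: kestra

tasks:
  - id: dbt_deps
    type: io.kestra.plugin.dbt.cli.DbtCLI
    commands:
      - git clone --depth 1 https://github.com/dbt-labs/jaffle_shop_duckdb.git --branch duckdb --single-branch
      - cp -a jaffle_shop_duckdb/. .
      - dbt deps

  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    commands:
      - dbt build
```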
---

# Loop Over a List of Values

URL: https://kestra.io/docs/how-to-guides/loop

> Learn how to iterate over lists of values in Kestra workflows using the ForEach task to execute tasks for each item efficiently.

How to iterate over a list of values in your flow.

In this guide, you will learn how to iterate over a list of values using the `ForEach` task. This task enables you to loop through a list of values and execute specific tasks for each value in the list. This approach is useful for scenarios where multiple similar tasks need to be run for different inputs.

## Prerequisites

Before you begin:

- Deploy [Kestra](../../02.installation/index.mdx) in your preferred development environment.
- Ensure you have a [basic understanding of how to run Kestra flows](../../03.tutorial/index.mdx).

## Loop over nested lists of values

This example demonstrates how to use `ForEach` to loop over a list of strings and then loop through a nested list for each string. You can access the current iteration value using the variable `{{ taskrun.value }}` or `{{ parent.taskrun.value }}` if you are in a nested child task. Additionally, you can access the batch or iteration number with `{{ taskrun.iteration }}`.
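Before working with nesting, here is a minimal single-level sketch of these expressions in action (the flow id and list values are illustrative):

```yaml
id: foreach_minimal
namespace: company.team

tasks:
  - id: loop
    type: io.kestra.plugin.core.flow.ForEach
    values: '["apple", "banana", "cherry"]'
    tasks:
      - id: log_item
        type: io.kestra.plugin.core.log.Log
        # taskrun.value is the current item; taskrun.iteration is its position
        message: "value: {{ taskrun.value }} (iteration {{ taskrun.iteration }})"
```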
To see the flow in action, define the `each_nested` flow as shown below:

```yaml
id: each_nested
namespace: company.team

tasks:
  - id: 1_each
    type: io.kestra.plugin.core.flow.ForEach
    values: '["s1", "s2", "s3"]'
    tasks:
      - id: 1-1_return
        type: io.kestra.plugin.core.debug.Return
        format: "{{task.id}} > {{taskrun.value}} > {{taskrun.startDate}}"
      - id: 1-2_each
        type: io.kestra.plugin.core.flow.ForEach
        values: '["a a", "b b"]'
        tasks:
          - id: 1-2-1_return
            type: io.kestra.plugin.core.debug.Return
            format: "{{task.id}} > {{taskrun.value}} > {{taskrun.startDate}}"
          - id: 1-2-2_return
            type: io.kestra.plugin.core.debug.Return
            format: "{{task.id}} > {{ outputs['1-2-1_return'].s1[taskrun.value].value }} >> get {{ outputs['1-2-1_return']['s1'][taskrun.value].value }} > {{taskrun.startDate}}"
      - id: 1-3_return
        type: io.kestra.plugin.core.debug.Return
        format: "{{task.id}} > {{ outputs['1-1_return'][taskrun.value].value }} > {{taskrun.startDate}}"
  - id: 2_return
    type: io.kestra.plugin.core.debug.Return
    format: "{{task.id}} > {{outputs['1-2-1_return'].s1['a a'].value}}"
```

Save and execute the `each_nested` flow. The above flow, when executed, iterates over a nested list of values, logging messages at each level of iteration to track the processing of both the outer and inner list items.

Within the flow:

- `1_each`: Uses the `ForEach` task to iterate over the list `["s1", "s2", "s3"]`. For each value, it runs the nested tasks defined within.
- `1-1_return`: Logs the task ID, the current list value, and the task run start time.
- `1-2_each`: Iterates over a second list `["a a", "b b"]` and runs a set of tasks for each value in this nested list.
- `1-2-1_return`: Logs the task ID, the nested list value, and the start time of the task run.
- `1-2-2_return`: Logs a custom output from `1-2-1_return`, which shows how to access outputs from previous iterations within the nested loop.
- `1-3_return`: Logs the output from `1-1_return` after the inner loop is completed and displays the corresponding value processed in the outer loop.
- `2_return`: Fetches the output from the nested loop (`1-2-1_return` for the value `a a`) and logs it.

## Next steps

Now that you've seen how to loop over a list of values using `ForEach`, you can apply this technique to any scenario where multiple iterations of similar tasks are needed. You can further extend this flow by:

- Adding more complex nested loops.
- Using dynamic input values instead of hardcoded lists.
- Logging or processing additional data from each iteration.

For more advanced use cases, refer to Kestra’s official [ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach) task documentation and the [Best Practices for ForEach and ForEachItem](../../14.best-practices/11.foreach-and-foreachitem/index.md) guide, which covers how to access sibling task outputs inside and outside the loop, when to use `ForEachItem` instead, and common mistakes to avoid.

---

# Safeguard Microservices with Unit Tests

URL: https://kestra.io/docs/how-to-guides/microservices-unit-tests

> Write unit tests for Kestra Enterprise workflows. Create test suites, mock task dependencies, and assert flow behavior before deploying to production.

Build an automated guardrail that pings a microservice endpoint, alerts Slack when it fails, and runs only when its unit tests pass.

Modern microservices and API backends often expose health endpoints. With Kestra you can monitor those endpoints, write unit tests to validate the monitoring flow, and gate downstream automations on the test results.
This guide walks through:

- Creating a flow that checks an HTTP endpoint and notifies Slack when it is down
- Writing Enterprise Edition unit tests that cover both success and failure paths
- Triggering a downstream flow only when the test suite passes

## Prerequisites

- Kestra Enterprise Edition (required for [Unit Tests](../../07.enterprise/02.governance/unit-tests/index.md) and the `RunTest` task)
- A Slack Incoming Webhook URL (or another channel supported by the Notifications plugin)
- A Kestra API token stored as `KESTRA_API_TOKEN` (used by the test runner flow)

## Step 1: Monitor the API endpoint

Create the following flow in your namespace to send an alert when the target server is unreachable:

```yaml
id: microservices-and-apis
namespace: tutorial
description: Microservices and APIs

inputs:
  - id: server_uri
    type: URI
    defaults: https://kestra.io

  - id: slack_webhook_uri
    type: URI
    defaults: https://kestra.io/api/mock

tasks:
  - id: http_request
    type: io.kestra.plugin.core.http.Request
    uri: "{{ inputs.server_uri }}"
    options:
      allowFailed: true

  - id: check_status
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs.http_request.code != 200 }}"
    then:
      - id: server_unreachable_alert
        type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
        url: "{{ inputs.slack_webhook_uri }}"
        payload: |
          {
            "channel": "#alerts",
            "text": "The server {{ inputs.server_uri }} is down!"
          }
    else:
      - id: healthy
        type: io.kestra.plugin.core.log.Log
        message: Everything is fine!
```

![Microservices Flow Code](./monitoring-flow-code.png)

This flow issues an HTTP request, lets it fail gracefully (`allowFailed: true`), then either sends a Slack alert or logs a healthy status.

Breakdown of the components:

- **Inputs**
  - `server_uri`: parameterizes the target so you can reuse the flow for staging, production, or any other health endpoint.
  - `slack_webhook_uri`: stores the Slack webhook that receives alerts without hardcoding secrets in the flow body.
  Instead of an input, you can also use the [KV Store](../../06.concepts/05.kv-store/index.md) or a [secret](../../06.concepts/04.secret/index.md) in the `url` property.
- **`http_request` task**: performs the status check and captures the HTTP code; `allowFailed` ensures the flow continues even if the request fails.
- **`check_status` conditional**: branches on the HTTP response, triggering the Slack alert when the service is down or logging “Everything is fine!” when the endpoint returns 200.

## Step 2: Add unit tests

Next, define unit tests to cover both outcomes. Save the snippet below as a test resource in the same namespace.

```yaml
id: test_microservices_and_apis
flowId: microservices-and-apis
namespace: tutorial

testCases:
  - id: server_should_be_reachable
    type: io.kestra.core.tests.flow.UnitTest
    fixtures:
      inputs:
        server_uri: https://kestra.io
    assertions:
      - value: "{{ outputs.http_request.code }}"
        equalTo: 200

  - id: server_should_be_unreachable
    type: io.kestra.core.tests.flow.UnitTest
    fixtures:
      inputs:
        server_uri: https://kestra.io/bad-url
      tasks:
        - id: server_unreachable_alert
          description: no Slack message from tests
    assertions:
      - value: "{{ outputs.http_request.code }}"
        notEqualTo: 200
```

![Unit Test Code](./unit-test-code.png)

Each test case supplies fixtures (inputs and optional task overrides) and assertions. The second test disables the Slack call while still confirming that the alert path runs when the endpoint fails.

Breakdown:

- **Test definition**: `id`, `flowId`, and `namespace` tie this test suite to the flow created in Step 1.
- **`server_should_be_reachable` case**: feeds a valid `server_uri` and asserts the HTTP response code is 200.
- **`server_should_be_unreachable` case**: points to a bad URL, stubs the Slack task so no message is sent during testing (avoiding channel noise from test messages), and asserts the HTTP code differs from 200.
## Step 3: Run downstream logic only when tests pass

Finally, create a control flow that executes the test suite and gates additional work on the result. The `RunTest` task returns a boolean in `outputs.run_test.result.state`.

```yaml
id: run_if_tests_pass
namespace: tutorial

tasks:
  - id: run_test
    type: io.kestra.plugin.kestra.ee.tests.RunTest
    auth:
      apiToken: "{{ secret('KESTRA_API_TOKEN') }}"
    namespace: tutorial
    testId: test_microservices_and_apis

  - id: run_if_tests_pass
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs.run_test.result.state }}"
    then:
      - id: log
        type: io.kestra.plugin.core.log.Log
        message: hello
```

![Downstream Logic Flow Code](./downstream-logic-flow-code.png)

Replace the final `log` task with deployments, escalations, or other automations that should run only after the tests succeed.

Breakdown:

- **`run_test` task**: invokes the Enterprise Edition `RunTest` plugin with an API token, namespace, and test ID; the result includes a `state` boolean.
- **`run_if_tests_pass` conditional**: checks that boolean before proceeding, ensuring downstream work executes only when all test cases pass.

## Step 4: Execute the tests

Run the unit tests from the Kestra UI or CLI to verify both assertions pass. A successful run confirms the monitor behaves correctly without sending Slack noise during testing.

![Unit Test Assertions](./unit-test-run.png)

## Next steps

- Expand the monitoring flow to cover multiple endpoints by looping over inputs or using a namespace file.
- Send alerts to PagerDuty, Teams, or email by swapping the Slack task for a different Notifications plugin.
- Wire the gated flow into your CI/CD pipeline so every deployment validates critical monitors before rollout.

---

# Configure Local MinIO Storage for Kestra

URL: https://kestra.io/docs/how-to-guides/minio

> Configure MinIO as a local object storage backend for Kestra using Docker and the MinIO client for development and testing.
Set up and verify a local [MinIO](https://min.io/) storage backend for Kestra using the `mc` CLI and Docker.

---

:::alert{type="warning"}
This guide is intended for **local development and testing only**. MinIO is configured in gateway mode and exposed on `localhost`, without TLS or public access. **Do not use this setup in production** without additional security measures (e.g., HTTPS, access controls, and network isolation).
:::

## Install and Configure `mc` (MinIO Client)

Download and install the MinIO Client (`mc`) tool using the following command:

```sh
curl https://dl.min.io/client/mc/release/linux-amd64/mc --create-dirs -o $HOME/minio-binaries/mc && \
chmod +x $HOME/minio-binaries/mc && \
export PATH=$PATH:$HOME/minio-binaries/
```

### Remove and Recreate Local Alias

Remove any existing local alias:

```sh
mc alias remove local
```

Recreate the alias with your MinIO access credentials:

```sh
mc alias set local http://localhost:9000 YOUR_ACCESS_KEY YOUR_SECRET_KEY
```

### Create a Local Bucket

Create the bucket where outputs will be stored:

```sh
mc mb local/your-bucket
```

## Start MinIO Server

Run the MinIO Docker container using the dedicated CI Compose file (e.g., from [kestra-io/storage-minio](https://github.com/kestra-io/storage-minio/)):

```sh
docker compose -f docker-compose-ci.yml up
```

## Configure Kestra for MinIO Storage

Update your `application-psql.yml` (or other relevant configuration file) under the `kestra:` section:

```yaml
storage:
  type: minio
  minio:
    endpoint: localhost
    port: 9000
    bucket: your-bucket
    access-key: YOUR_ACCESS_KEY
    secret-key: YOUR_SECRET_KEY
```

## Launch Kestra

Start Kestra as usual. Ensure the updated configuration file is correctly mounted or included.
## Test with a Flow that Produces Outputs

Here is a sample flow that generates output files and logs intermediate data:

```yaml
id: alligator_743987
namespace: company.team

tasks:
  - id: pass_output
    type: io.kestra.plugin.core.debug.Return
    format: hello

  - id: py_outputs
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/pydata:latest
    outputFiles:
      - myoutput.json
    script: |
      import json
      from kestra import Kestra

      my_kv_pair = {'mykey': 'from Kestra'}
      Kestra.outputs(my_kv_pair)

      with open('myoutput.json', 'w') as f:
          json.dump(my_kv_pair, f)

  - id: take_inputs
    type: io.kestra.plugin.core.log.Log
    message: |
      data from previous tasks: {{ outputs.pass_output.value }} and {{ outputs.py_outputs.vars.mykey }}

  - id: check_output_file
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - cat {{ outputs.py_outputs.outputFiles['myoutput.json'] }}
```

## Verify Output in MinIO Bucket

You can now validate that the output file is stored in the MinIO bucket:

```sh
mc cat local/your-bucket/main/company/team/alligator-743987/executions/23z9cJWEa23kNAxu6sm0CT/tasks/py-outputs/5kxYRM7UqUurvnpVNvHca7/1noPFEiCFGPf2hcqjVzywu-myoutput.json
```

Replace the following placeholders with your own values:

- the bucket name (here `your-bucket`)
- the path (namespace) (here `company/team`)
- the flow id (here `alligator-743987`)
- the execution id (here `23z9cJWEa23kNAxu6sm0CT`)
- the task id (here `py-outputs`)
- and finally the output file name (here `1noPFEiCFGPf2hcqjVzywu-myoutput.json`)

The result should look like:

```json
{"mykey": "from Kestra"}
```

You have successfully configured and validated MinIO as a local storage backend for Kestra.
---

# Configure Monitoring with Grafana and Prometheus

URL: https://kestra.io/docs/how-to-guides/monitoring

> Set up comprehensive monitoring for Kestra using Prometheus for metrics scraping and Grafana for visualization and dashboards.

Set up Prometheus and Grafana for monitoring Kestra.
Kestra exposes [Prometheus](https://prometheus.io/) metrics at port 8081 on the endpoint `/prometheus`. This endpoint can be used by any compatible monitoring system.

Use the [docker-compose.yml](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml) file and start Kestra using the command:

```sh
docker compose up
```

Once Kestra is up and running, view the available metrics at `http://localhost:8081/prometheus` in your browser. The metrics should appear as below:

![prometheus_metrics](./prometheus_metrics.png)

Create a few flows and execute them to generate some metrics for visualization. You can also add triggers to the flows to check the metrics corresponding to executions happening on a regular basis.

## Setting up Prometheus

With metrics available from Kestra, set up a Prometheus server to scrape them and store them in a time-series DB. Create a `prometheus.yml` file for scraping the metrics:

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    metrics_path: /prometheus
    static_configs:
      - targets: ["<host>:8081"]
```

Be sure to replace `<host>` in the last line with the appropriate host, e.g. `localhost:8081` or `host.docker.internal:8081`.

:::alert{type="info"}
If you're running everything in Docker on the same machine, you will need to change your host address to `host.docker.internal` rather than localhost.
:::

We can start the Prometheus server using the following docker command in the same directory as `prometheus.yml`:

```sh
docker run -d -p 9090:9090 -v ./prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
```

Note, in this last command you may have to add `--add-host=host.docker.internal:host-gateway` to ensure your Prometheus endpoint is shown as `UP` (you can check it in the [targets](http://localhost:9090/targets)).

You can now go to `http://localhost:9090/graph` and try out visualizing some metrics using PromQL.
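The same PromQL expressions can also drive Prometheus alerting rules. Here is a minimal sketch of a rule file; the rule name, threshold, and duration are illustrative choices, not Kestra recommendations, and the metric is one of the execution counters Kestra exposes:

```yaml
# rules.yml: reference it from prometheus.yml with
#   rule_files:
#     - rules.yml
groups:
  - name: kestra-alerts
    rules:
      - alert: KestraNoExecutionsStarted
        # Fires when no executions have started over the last 30 minutes
        expr: rate(kestra_executor_execution_started_count_total[30m]) == 0
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "No Kestra executions started in the last 30 minutes"
```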
Here is one of the graphs for the `kestra_executor_execution_started_count_total` metric:

![promql_graph](./promql_graph.png)

## Setting up Grafana

Let us now move on to setting up Grafana. Start by installing Grafana using Docker via the following command:

```sh
docker run -d -p 3000:3000 --name=grafana grafana/grafana-enterprise
```

You can open the Grafana server at `http://localhost:3000`. The default credentials are `admin` as both username and password.

Once logged into Grafana, click on the hamburger menu on the top left and go to **Connections -> Data Sources**.

![grafana_data_sources](./grafana_data_sources.png)

### Add Data Source

Click on the **Add new Data Source** button at the top right, and select **Prometheus** from the time series databases list. In the **Prometheus server URL** text box, enter `http://<host>:9090`, replacing `<host>` with your Prometheus host (e.g. `localhost` or `host.docker.internal`). All other configuration can be left at its defaults. Click the **Save and Test** button at the bottom and confirm that the connection to the Prometheus database is successful.

## Add Dashboard

We are now all set to create the Grafana dashboard. For this, click on the **+** button on the top of the page to add a **New Dashboard** to Grafana. Save the dashboard with an appropriate name. Then, click on **Add visualization**, and select **prometheus** as the data source.

We will create a Gauge that shows the number of tasks that are presently running. For this, select **Gauge** as the Visualization in the top right corner. In the PromQL metrics explorer text box, you can write `sum(kestra_worker_running_count)`. Click the **Run queries** button to ensure the Gauge shows the number.

Head back to Kestra and create a number of tasks that will execute for a long time.
The example below will sleep for 60 seconds:

```yaml
id: sleep
namespace: company.team

tasks:
  - id: sleep_task
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - sleep 60
```

Now that we have some long-running tasks in progress, we can check that the Gauge correctly reflects the count. You can now put an appropriate title in the Panel options that says **Tasks running**. This is how your Grafana dashboard should look:

![grafana_tasks_running_gauge](./grafana_tasks_running_gauge.png)

Click on **Save** and **Apply** to add this gauge to the dashboard. Similarly, you can keep adding more graphs to your dashboard. Here is an example dashboard for Kestra metrics.

![kestra_metrics_dashboard](./kestra_metrics_dashboard.png)

The [Alerting & Monitoring](../../10.administrator-guide/03.monitoring/index.md#grafana-and-kibana) section includes an import-ready Grafana dashboard definition.

---

# MultipleCondition Listener in Kestra: How It Works

URL: https://kestra.io/docs/how-to-guides/multiplecondition-listener

> Configure MultipleCondition triggers in Kestra to start flows only when multiple conditions are met for precise event-driven orchestration.

How to set up a Flow to only trigger when multiple conditions are met.

In this tutorial, we’ll explore how to set up a flow in Kestra that only triggers when multiple conditions are met. Specifically, we will create a flow that only executes if two other flows, `multiplecondition_flow_a` and `multiplecondition_flow_b`, have executed successfully within the last 24 hours.

## Why Use Multiple Condition Listeners?

The `MultipleCondition` listener allows you to build more complex workflows that depend on the success of several flows. For example, if you have two dependent tasks or processes that need to succeed before triggering another process, this listener ensures that the next workflow is only executed when both conditions are met within a specific time window.
## Activation Process Overview

The listener will trigger under the following conditions:

1. Both `multiplecondition-flow-a` and `multiplecondition-flow-b` must have successful executions.
2. The listener checks if both flows succeeded within the last 24 hours.
3. If the conditions are met, the flow is activated, and the conditions reset.
4. Future executions will only re-trigger the flow if both flows succeed again within another 24-hour window.

## How the Process Works

1. Time Window (P1D or 24 hours):
   - The `MultipleCondition` listener checks if both flows (`multiplecondition-flow-a` and `multiplecondition-flow-b`) have been executed successfully within the past 24 hours.
2. Resetting Conditions:
   - Once the listener triggers, the conditions reset, meaning that even if one of the flows succeeds again, the listener won't trigger until both flows succeed within a new 24-hour period.
3. Flow Dependency:
   - This is particularly useful when you have flows that depend on each other or when the successful execution of multiple workflows is a prerequisite for a downstream task.

## First Flow: `multiplecondition_flow_a`

This is the first flow that the listener will check for success.

```yaml
id: multiplecondition_flow_a
namespace: company.team

description: |
  This flow will start `multiplecondition_listener` if `MultipleCondition` is validated

tasks:
  - id: only
    type: io.kestra.plugin.core.debug.Return
    format: "from parents: {{ execution.id }}"
```

This flow is a simple one that returns the execution ID as output. The listener checks whether this flow has executed successfully within the past 24 hours.

## Second Flow: `multiplecondition_flow_b`

This is the second flow that the listener will check for success.
```yaml
id: multiplecondition_flow_b
namespace: company.team

description: |
  This flow will start `multiplecondition_listener` if `MultipleCondition` is validated

tasks:
  - id: only
    type: io.kestra.plugin.core.debug.Return
    format: "from parents: {{ execution.id }}"
```

Just like `multiplecondition_flow_a`, this flow also returns its execution ID. The listener will wait for both this and the first flow to succeed before activating the final flow.

## Final Flow with Trigger: `multiplecondition_listener`

The final flow is where we define the trigger that listens to both `multiplecondition_flow_a` and `multiplecondition_flow_b`.

```yaml
id: multiplecondition_listener
namespace: company.team

description: |
  This flow will start only if `multiplecondition_flow_a` and `multiplecondition_flow_b` are successful during the last 24h.

tasks:
  - id: only_listener
    type: io.kestra.plugin.core.debug.Return
    format: "children"

triggers:
  - id: multiple_listen_flow
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatus
        in:
          - SUCCESS
      - id: multiple
        type: io.kestra.plugin.core.condition.MultipleCondition
        window: P1D
        windowAdvance: P0D
        conditions:
          flow_a:
            type: io.kestra.plugin.core.condition.ExecutionFlow
            namespace: company.team
            flowId: multiplecondition_flow_a
          flow_b:
            type: io.kestra.plugin.core.condition.ExecutionFlow
            namespace: company.team
            flowId: multiplecondition_flow_b
```

## Explanation of the Flow

1. Tasks Section:
   - The task `only_listener` outputs a static value (`children`) when the trigger conditions are met. This part can be customized to perform more complex tasks after the conditions are satisfied.
2. Triggers Section:
   - The `multiple_listen_flow` trigger listens for both `multiplecondition_flow_a` and `multiplecondition_flow_b`.
   - Execution Status Condition: Ensures that only successful executions (status `SUCCESS`) are considered.
   - MultipleCondition: This condition checks that both `flow_a` and `flow_b` have successfully completed within the last 24 hours (`P1D`).
3. Window:
   - The `window: P1D` ensures that the listener checks for executions within the past 24 hours.
   - The `windowAdvance: P0D` parameter ensures that the time window starts immediately, without any delay.

## Expected Output

When both `multiplecondition_flow_a` and `multiplecondition_flow_b` succeed within 24 hours, the listener will trigger `multiplecondition_listener`, and you will see output similar to this:

`only listener > children`

## Common Pitfalls and Troubleshooting

1. **Conditions Not Met**: If the flow doesn't trigger, ensure both `multiplecondition_flow_a` and `multiplecondition_flow_b` have completed successfully within the time window.
2. **Incorrect Output Reference**: Verify the flow IDs and namespaces to ensure the trigger is referencing the correct flows.

## Conclusion

In this tutorial, we’ve demonstrated how to set up a `MultipleCondition` listener that checks for the success of multiple flows within a specified time window. This is a powerful feature for managing complex workflows that depend on the successful execution of multiple tasks. By using this listener, you can ensure that downstream processes are only triggered when all necessary upstream conditions are met.

---

# Namespace Variables vs. KV Store in Kestra

URL: https://kestra.io/docs/how-to-guides/namespace-variables-vs-kvstore

> Understand the differences between Namespace Variables and the KV Store in Kestra to choose the right storage for your data.

When to store key-value pairs as namespace-level Variables vs. KV store
When navigating to a namespace in the Kestra UI, you can see two tabs: Variables and KV Store. Both allow you to store key-value pairs, but there are some significant differences in how those are handled and stored, and when you should use one over the other.

## Variables: use when you need to inherit values from parent namespaces

Variables are typically intended for slowly changing values. Think of the database hostname, the bucket name in a cloud storage service, or the name of a shared queue in a message broker. These values are typically set once and then used across multiple flows and tasks.

To add those, navigate to the Variables tab in the namespace and paste your key-value pairs as shown below:

```yaml
POSTGRES_HOSTNAME: my-postgres-prod-hostname
DATALAKE_S3_BUCKET_NAME: my-datalake-s3-bucket-name
RABBITMQ_QUEUE_NAME: my-rabbitmq-queue-name
GCP_PROJECT_ID: my-gcp-project-id
GITHUB_REPO_URL: https://github.com/kestra-io/kestra
```

The additional benefit of using Variables is that they can be grouped to simplify some configurations. For example, you can group all database-related connection variables under a `postgres` prefix and access them using e.g. `{{ namespace.postgres.hostname }}` in your flows and tasks.

```yaml
postgres:
  hostname: my-postgres-prod-hostname
  port: 5432
  username: my-postgres-prod-username

dataLake:
  s3BucketName: my-datalake-s3-bucket-name
  region: us-east-1
```

You may notice that the Variables can be defined using the uppercased `SNAKE_CASE` convention, as well as `camelCase` or any other convention you prefer.

Storing those values as Variables in a namespace allows you to:

1. Set them once by a DevOps engineer or a system administrator.
2. Centrally govern them in a single place (e.g. to update a database host or port, bucket names, regions, etc.).
3. Inherit them from a parent namespace (e.g. the `company` namespace) to all child namespaces (e.g. `company.myteam`, `company.myteam.myproject`).
4. Group them to simplify configurations of database connections, cloud storage services, message brokers, etc.

This means that if you have a variable `POSTGRES_HOSTNAME` set in a parent namespace `company`, you can use `{{namespace.POSTGRES_HOSTNAME}}` in a child namespace `company.myteam` and `company.myteam.myproject` (and all other infinitely nested namespaces) without having to worry where in the namespace hierarchy that value is managed.

## KV Store: use when you need to store ephemeral or dynamic values

Trying to use KV Store for the above use case would also work, but you would always need to remember to include the pointer to the namespace under which that key-value pair is stored (unless using the same one as the flow). This is because KV Store is not inherited from parent to child namespaces. Example:

```yaml
{{ kv('POSTGRES_HOSTNAME', 'company') }}
```

The KV Store is more suited for storing ephemeral or dynamic values. Think of the last scraped timestamp, the offset of a Kafka consumer group, or the most recently processed file name. These values are typically set and updated by the workflow itself.

Using KV Store for those use cases is better than Variables because KV pairs can be set and updated at runtime, while Variables are typically set once, centrally governed by Kestra Admins, and inherited from parent namespaces to reuse centrally governed configuration across multiple flows and tasks.

## Recap

- **Variables**: use for slowly changing configuration values that are set once, updated fairly infrequently and inherited from parent namespaces by indefinitely nested child namespaces (e.g. `company`, `company.myteam`, `company.myteam.myproject`).
- **KV Store**: use for ephemeral or dynamic values that are set and/or updated at runtime.
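To make the runtime set-and-update pattern concrete, here is a minimal sketch of a flow that reads a key and then updates it with the core KV `Set` task. It assumes the key `last_scraped_timestamp` was initialized beforehand, and the exact task properties are worth double-checking against the KV plugin reference:

```yaml
id: kv_runtime_update
namespace: company.team

tasks:
  # Read the watermark stored by the previous run
  - id: read_last
    type: io.kestra.plugin.core.log.Log
    message: "Last scraped at: {{ kv('last_scraped_timestamp') }}"

  # ...the actual scraping logic would go here...

  # Update the key so the next run picks up the new watermark
  - id: update_last
    type: io.kestra.plugin.core.kv.Set
    key: last_scraped_timestamp
    value: "{{ execution.startDate }}"
```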
Here are some examples to consolidate your understanding:

- **Variables**: `POSTGRES_HOSTNAME`, `DATALAKE_S3_BUCKET_NAME`, `RABBITMQ_QUEUE_NAME`, `GCP_PROJECT_ID`, `GITHUB_REPO_URL`
- **KV Store**: `last_scraped_timestamp`, `kafka_consumer_group_offset`, `last_processed_file_name`.

---

# Connect a Neon Database to Kestra

URL: https://kestra.io/docs/how-to-guides/neon

> Connect your serverless Neon PostgreSQL database to Kestra workflows to query, ingest, and manage data seamlessly.

Connect your Neon serverless database to your workflows using the PostgreSQL plugin.

Neon is an open-source database company whose mission is to take everything that developers love about Postgres — reliability, performance, extensibility — and deliver it as a serverless product.

Before you begin, ensure you have a [Neon account](https://neon.tech/home) set up and a [Kestra installation](../../02.installation/index.mdx) running.

## Setting up a Database in Neon

Once you've logged into Neon, you'll need to set up a project where you'll give it a name, select your desired PostgreSQL version, and select your cloud provider and region.

![neon-1](./neon-1.png)

Once your project is created, you'll arrive at the Project Dashboard page. From here, you can connect to your database, import data, get sample data, view database content, and much more.

![neon-2](./neon-2.png)

## Connecting Neon to Kestra

To have Kestra supply the data, connect to your database. Leave the Branch, Compute, Database, and Role as their defaults, or adjust as needed. Click on the **Connection string** dropdown list and select Java. This is the connection string used in Kestra to connect to the Neon database. Make note of the password and save it for later steps.

![neon-3](./neon-3.png)

With a database set up in Neon, create a table for the incoming data. Click on **Tables** on the left sidebar.

![neon-4](./neon-4.png)

Next, click on the '+' icon to add a table, name it, and create it.
You can leave just the default `id` column or add the columns of your data set now. Kestra will alter the table, so leave it empty for now.

![neon-5](./neon-5.png)

With the setup in Neon done, we can go to Kestra to set up our connection. While there's no official Neon plugin, we can connect using the [PostgreSQL plugin](/plugins/plugin-jdbc-postgres), which supports a number of tasks such as `Query`, `CopyIn`, and `CopyOut`.

To connect, we can copy the URL provided before. To prevent exposing the password in our flow, take the password saved earlier and store it as a [secret](../../06.concepts/04.secret/index.md). Then, in the URL, switch out the password for the secret expression: `{{ secret('NEON_PASSWORD') }}`.

By using [Plugin Defaults](../../05.workflow-components/09.plugin-defaults/index.md), we can configure our connection to Neon once for all tasks in our flow rather than individually for each task. Once configured, our connection in Kestra will look like the example below:

```yaml
pluginDefaults:
  - forced: true
    type: io.kestra.plugin.jdbc.postgresql
    values:
      url: "jdbc:postgresql://ep-gentle-tree-a25pyhxb-pooler.eu-central-1.aws.neon.tech/neondb?user=neondb_owner&password={{ secret('NEON_PASSWORD') }}&sslmode=require"
```

:::alert{type="info"}
You can also use the `username` and `password` properties rather than combining it all into the `url` property:

```yaml
pluginDefaults:
  - forced: true
    type: io.kestra.plugin.jdbc.postgresql
    values:
      url: "jdbc:postgresql://ep-gentle-tree-a25pyhxb-pooler.eu-central-1.aws.neon.tech/neondb"
      username: "neondb_owner"
      password: "{{ secret('NEON_PASSWORD') }}"
```
:::

## Copying a CSV File into Neon in a Flow

Using this [example CSV](https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv), we can copy the data into our table directly from Kestra.
You can either set up the columns directly in Neon in the earlier steps or add a task in Kestra to add them automatically like this:

```yaml
id: neon_db_add_columns
namespace: company.team

tasks:
  - id: create_columns
    type: io.kestra.plugin.jdbc.postgresql.Queries
    sql: |
      ALTER TABLE kestra_example
        ADD COLUMN order_id int,
        ADD COLUMN customer_name text,
        ADD COLUMN customer_email text,
        ADD COLUMN product_id int,
        ADD COLUMN price double precision,
        ADD COLUMN quantity int,
        ADD COLUMN total double precision;

pluginDefaults:
  - forced: true
    type: io.kestra.plugin.jdbc.postgresql
    values:
      url: "jdbc:postgresql://ep-gentle-tree-a25pyhxb-pooler.eu-central-1.aws.neon.tech/neondb?user=neondb_owner&password={{ secret('NEON_PASSWORD') }}&sslmode=require"
```

Once your columns are configured, you can use the [CopyIn](/plugins/plugin-jdbc-postgres/io.kestra.plugin.jdbc.postgresql.copyin) task combined with the [HTTP Download](/plugins/core/http/io.kestra.plugin.core.http.download) task to download the CSV file and copy it directly into the table. As we set up the database connection with our [Plugin Defaults](#connecting-neon-to-kestra), the CopyIn task will connect directly and copy the CSV file into the database.
```yaml
id: neon_db_copyin
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: copy_in
    type: io.kestra.plugin.jdbc.postgresql.CopyIn
    table: "kestra_example"
    from: "{{ outputs.download.uri }}"
    header: true
    columns: [order_id,customer_name,customer_email,product_id,price,quantity,total]
    delimiter: ","

pluginDefaults:
  - forced: true
    type: io.kestra.plugin.jdbc.postgresql
    values:
      url: "jdbc:postgresql://ep-gentle-tree-a25pyhxb-pooler.eu-central-1.aws.neon.tech/neondb?user=neondb_owner&password={{ secret('NEON_PASSWORD') }}&sslmode=require"
```

Once this flow completes, we can view the contents of our database in Neon:

![neon-6](./neon-6.png)

---

# Integrate Notion Webhooks with Kestra
URL: https://kestra.io/docs/how-to-guides/notion-webhook

> Automate Notion database updates and send Slack notifications by triggering Kestra flows via Notion webhooks.

Use Notion webhooks to trigger Kestra flows when pages or databases are updated in your Notion workspace. This guide shows you how to create a workflow that responds to Notion database changes, retrieves page details, and sends notifications to Slack when new tasks are assigned.

## Prerequisites

Before you begin, you need:

- A Notion workspace with a database
- A [Notion integration](https://www.notion.so/my-integrations) with access to your database
- A Slack workspace with webhook capabilities ([Slack Webhook Documentation](https://api.slack.com/messaging/webhooks))
- Access to your Notion API token and Slack webhook URL

## Create a Notion integration

1. Go to [Notion's My Integrations page](https://www.notion.so/my-integrations)
2. Click **"New integration"**
3. Give your integration a name and select your workspace
4. Copy the **Internal Integration Token** - you'll need this for the `NOTION_API_KEY` secret

## Share your database with the integration

1. Open your Notion database
2. Click the **"..."** menu in the top right
3. Select **"Add connections"**
4. Find and select your integration
5. Click **"Confirm"** to grant access

## Set up secrets in Kestra

Store your sensitive credentials as [secrets](../../06.concepts/04.secret/index.md) or [key-value](../../06.concepts/05.kv-store/index.md) pairs:

1. Navigate to your namespace in the Kestra UI
2. Go to the **Secrets** tab (alternatively, go to the **KV Store** tab and do the same)
3. Create these secrets:
   - `NOTION_API_KEY`: Your Notion integration token
   - `SLACK_WEBHOOK_URL`: Your Slack incoming webhook URL

## Create the webhook flow

Create a flow that listens for Notion webhook events and processes them:

```yaml
id: notion-webhook
namespace: company.team

tasks:
  - id: get_notion_page_details
    type: io.kestra.plugin.notion.page.Read
    apiToken: "{{ secret('NOTION_API_KEY') }}"
    pageId: "{{ trigger.body.entity.id }}"

  - id: send_slack_alert
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK_URL') }}"
    messageText: "New task titled {{ outputs.get_notion_page_details | jq('.properties.Button.title[0].text.content') | first }} assigned to {{ outputs.get_notion_page_details | jq('.properties.Assignee.multi_select[0].name') | first }} on the Product team Notion board! Link: {{ outputs.get_notion_page_details.url }}"

triggers:
  - id: notion_new_task_webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: my-notion-product-alert-key # Replace with a secure key
```

:::alert{type="warning"}
Replace `my-notion-product-alert-key` with a secure, randomly generated key. Consider storing this as a [secret](../../06.concepts/04.secret/index.md) or [key-value pair](../../06.concepts/05.kv-store/index.md) for better security.
:::

## Configure Notion webhooks

Set up webhooks directly in your Notion integration:

1. Go to your [Notion integration settings](https://www.notion.so/my-integrations)
2. Select your integration
3. Navigate to the **"Webhooks"** section
4. Click **"Add webhook"**
5. Enter your Kestra webhook URL (see format below)
6. Select the events you want to listen for:
   - `page.property_values.updated` - When page properties change
   - `page.created` - When new pages are created
   - `database.created` - When new databases are created
7. Click **"Create"** to save the webhook

![Notion Integration UI](./kestra-webhook-notion.png)

For more details, see the [Notion Webhooks API documentation](https://developers.notion.com/reference/webhooks).

## Webhook URL format

Your Kestra webhook URL follows this pattern:

```plaintext
http://your-kestra-host:8080/api/v1/main/executions/webhook/{namespace}/{flow_id}/{key}
```

For this example:

- **Namespace**: `company.team`
- **Flow ID**: `notion-webhook`
- **Key**: `my-notion-product-alert-key`

Complete URL:

```plaintext
http://your-kestra-host:8080/api/v1/main/executions/webhook/company.team/notion-webhook/my-notion-product-alert-key
```

You can copy your webhook URL directly from the Kestra UI from the **Triggers** tab and paste it in Notion:

![Copy Webhook URL](./copy-webhook-url.png)

## Testing the integration

Test your webhook flow manually:

```bash
curl -X POST \
  http://your-kestra-host:8080/api/v1/main/executions/webhook/company.team/notion-webhook/my-notion-product-alert-key \
  -H "Content-Type: application/json" \
  -d '{"entity": {"id": "your-notion-page-id"}}'
```

Replace `your-notion-page-id` with an actual page ID from your Notion database.

## Understanding the flow

The flow performs these steps:

1. **Webhook trigger**: Listens for incoming webhook requests from Notion on the specified endpoint
2. **Get page details**: Uses the [Notion plugin](/plugins/plugin-notion) to fetch complete page information from Notion
3. **Send notification**: Extracts the task title and assignee information, then sends a formatted message to Slack

## Customizing the flow

### Different Notion properties

Modify the Slack message to use different Notion properties.
Common property types include:

```yaml
## For title properties
title: "{{ outputs.get_notion_page_details | jq('.properties.Title.title[0].text.content') | first }}"

## For select properties
status: "{{ outputs.get_notion_page_details | jq('.properties.Status.select.name') | first }}"

## For date properties
due_date: "{{ outputs.get_notion_page_details | jq('.properties.DueDate.date.start') | first }}"

## For people properties
assignee: "{{ outputs.get_notion_page_details | jq('.properties.Assignee.people[0].name') | first }}"
```

### Adding conditional logic

Add conditions to process only specific types of changes:

```yaml
tasks:
  - id: check_status
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs.get_notion_page_details | jq('.properties.Status.select.name') | first == 'In Progress' }}"
    then:
      - id: send_slack_alert
        type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
        url: "{{ secret('SLACK_WEBHOOK_URL') }}"
        messageText: "Task moved to In Progress: {{ outputs.get_notion_page_details | jq('.properties.Title.title[0].text.content') | first }}"
```

### Multiple notification channels

Send notifications to different channels based on the assignee or project:

```yaml
tasks:
  - id: send_to_team_channel
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs.get_notion_page_details | jq('.properties.Project.select.name') | first == 'Product' }}"
    then:
      - id: product_team_notification
        type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
        url: "{{ secret('PRODUCT_SLACK_WEBHOOK_URL') }}"
        messageText: "New product task assigned!"
    else:
      - id: general_notification
        type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
        url: "{{ secret('GENERAL_SLACK_WEBHOOK_URL') }}"
        messageText: "New task assigned!"
```

:::alert{type="info"}
Keep in mind that the above examples are additional tasks to add to the flow and not standalone flows. You need to add `id` and `namespace` properties to execute them standalone.
:::

## Security considerations

- Use strong, randomly generated webhook keys
- Store all sensitive tokens as [secrets](../../06.concepts/04.secret/index.md) or [key-value pairs](../../06.concepts/05.kv-store/index.md)
- Consider implementing request validation in your webhook handler
- Regularly rotate your API tokens and webhook URLs

## Related resources

- [Webhook triggers](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md)
- [Notion plugin documentation](https://kestra.io/plugins/plugin-notion)
- [Slack notifications](../../15.how-to-guides/slack-webhook/index.md)
- [Secrets management](../../06.concepts/04.secret/index.md)
- [Expression language guide](../../06.concepts/06.pebble/index.md)

---

# Handle Null and Undefined Values in Kestra
URL: https://kestra.io/docs/how-to-guides/null-values

> Handle null and missing values in Kestra flows. Check for null inputs, set defaults, and use conditional logic to manage undefined task outputs.

How to use the null coalescing operator to handle null and undefined values.
The null coalescing operator is a binary operator that returns its left-hand value if it's not null; otherwise, it returns its right-hand value. You can think of it as a way to provide a default value when the left-hand value is null.

```yaml
"{{ null ?? now() | date('yyyy-MM-dd') }}"
```

In this example, since the left-hand side of the `??` operator is `null`, the right-hand side will be returned. The `now()` function will be called, and the result will be formatted as a date string in the `yyyy-MM-dd` format.

## Processing date values

Imagine that you have a flow that processes data between two dates. You want to provide default values for the start and end dates if they're not provided as inputs. You can use the null coalescing operator to set default values for the start and end dates. In this example, the start date is set to one month ago, and the end date is set to today, effectively processing data for the last month by default.

```yaml
id: process_data_between_dates
namespace: company.team

inputs:
  - id: start_date
    type: DATE
    required: false
    description: Start date to fetch data from

  - id: end_date
    type: DATE
    required: false
    description: End date to fetch data from

variables:
  start_date: "{{ inputs.start_date ?? now() | dateAdd(-1, 'MONTHS') | date('yyyy-MM-dd') }}"
  end_date: "{{ inputs.end_date ?? now() | date('yyyy-MM-dd') }}"

tasks:
  - id: process_data_between_dates
    type: io.kestra.plugin.core.log.Log
    message: processing data from {{ render(vars.start_date) }} to {{ render(vars.end_date) }}
```

Use the `render` function to recursively render variables containing Pebble expressions.

## Providing default values for optional and undefined inputs

The null-coalescing operator `??` will return the right-hand value if the left-hand value is null or undefined (e.g. an `input` or a `variable` that has not been defined). This behavior is useful when you want to provide default values for optional inputs and for dynamic properties that may not be defined.
If you want to return the right-hand side only if the left-hand side is undefined, you can use the `???` operator instead of `??`. The example below shows how to use both the `??` and `???` operators to set defaults for optional or undefined values.

```yaml
id: provide_default_values
namespace: company.team

inputs:
  - id: optional_input
    type: STRING
    required: false
    description: An optional input

tasks:
  - id: coalesce_optional_input
    type: io.kestra.plugin.core.log.Log
    message: |
      Expression: inputs.optional_input ?? 'mydefault'
      Left-hand value: null
      Right-hand value: 'mydefault'
      Operator used: '??'
      This expression "{{ inputs.optional_input ?? 'mydefault' }}" will return 'mydefault' because the coalesce-operator '??' returns the right-hand value if the left-hand value is null or undefined. Only if you provide a value at runtime, that value will be used instead of 'mydefault'.

  - id: coalesce_undefined_input
    type: io.kestra.plugin.core.log.Log
    message: |
      Expression: inputs.undefined_input ?? 'mydefault'
      Left-hand value: undefined
      Right-hand value: 'mydefault'
      Operator used: '??'
      The expression "{{ inputs.undefined_input ?? 'mydefault' }}" will return 'mydefault' because the coalesce-operator '??' returns the right-hand value if the left-hand value is null or undefined.

  - id: coalesce_only_undefined_input_1
    type: io.kestra.plugin.core.log.Log
    message: |
      Expression: inputs.undefined_input ??? 'mydefault'
      Left-hand value: undefined
      Right-hand value: 'mydefault'
      Operator used: '???'
      The expression "{{ inputs.undefined_input ??? 'mydefault' }}" will return 'mydefault' because the undefined-coalesce-operator '???' returns the right-hand value if the left-hand value is undefined.

  - id: coalesce_only_undefined_input_2
    type: io.kestra.plugin.core.log.Log
    message: |
      Expression: inputs.optional_input ??? 'mydefault'
      Left-hand value: null
      Right-hand value: 'mydefault'
      Operator used: '???'
      The expression "{{ inputs.optional_input ??? 'mydefault' }}" will return "", i.e. no value (null), because optional_input is defined and the undefined-coalesce-operator '???' only returns the right-hand value if the left-hand value is undefined.

  - id: both_operators_combined
    type: io.kestra.plugin.core.log.Log
    message: |
      Expression: (inputs.optional_input ??? 'mydefault') ?? 'other_default'
      Left-hand value: null
      Right-hand value: 'other_default'
      Operator used: '??'
      The expression "{{ (inputs.optional_input ??? 'mydefault') ?? 'other_default' }}" will return 'other_default' because the first expression using the undefined-coalesce-operator '???' will return null, and the coalesce-operator '??' will return the default value 'other_default'.
```

## Processing Trigger values

When using a Trigger, you can use the `{{ trigger }}` expression in your flow. However, this expression is undefined if you execute your flow manually. Here's an example of a Webhook trigger that might receive a body of data. You can use the null coalescing operator to handle cases where the body of data is different from what's expected:

```yaml
id: webhook_example
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.body.dataField ?? 'data' }}"

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: abcdefg
```

You can also use this for Schedule triggers. Here's an example that uses the date when the Schedule trigger starts an execution, combined with the null coalescing operator to fall back to an input value if the execution is started manually.

```yaml
id: scheduling
namespace: company.team

inputs:
  - id: country
    type: STRING
    defaults: US

  - id: date
    type: DATETIME
    required: false
    defaults: 2023-12-24T14:00:00.000Z

tasks:
  - id: check_if_business_date
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
    commands:
      - python schedule.py "{{ trigger.date ?? inputs.date }}" {{ inputs.country }}
    beforeCommands:
      - pip install workalendar
    taskRunner:
      type: io.kestra.plugin.core.runner.Process

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: business day - continuing the flow...

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 14 25 12 *
```

---

# Parallel vs. Sequential Tasks in Kestra
URL: https://kestra.io/docs/how-to-guides/parallel-vs-sequential

> Choose between parallel and sequential task execution in Kestra. Understand trade-offs and dependency management to design efficient workflow patterns.

When to use parallel tasks and when to use sequential tasks in Kestra.
## Parallel Tasks

The following flow has 6 tasks wrapped in a `Parallel` task. Since the `concurrent` property is set to 3, Kestra will run up to 3 of them concurrently; as soon as any of the three running tasks completes, the next task will start. The `last` task will run after all the tasks in the `Parallel` task group have completed.

```yaml
id: parallel
namespace: company.team

tasks:
  - id: parent
    type: io.kestra.plugin.core.flow.Parallel
    concurrent: 3
    tasks:
      - id: t1
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 1'
      - id: t2
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 1'
      - id: t3
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 1'
      - id: t4
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 1'
      - id: t5
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 1'
      - id: t6
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 1'
  - id: last
    type: io.kestra.plugin.core.debug.Return
    format: "{{task.id}} > {{taskrun.startDate}}"
```

## Sequential Tasks

This flow will start two sequential task groups in parallel. The addition of the `Sequential` task ensures that the tasks within each group will run one after the other.
The `last` task will run after all the tasks in the `Sequential` task groups have completed.

```yaml
id: sequential
namespace: company.team

description: |
  This flow will start the 2 sequential tasks in parallel and those will launch tasks one after the other.

tasks:
  - id: parent
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: seq1
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: t1
            type: io.kestra.plugin.scripts.shell.Commands
            taskRunner:
              type: io.kestra.plugin.core.runner.Process
            commands:
              - 'echo "running {{task.id}}"'
              - 'sleep 1'
          - id: t2
            type: io.kestra.plugin.scripts.shell.Commands
            taskRunner:
              type: io.kestra.plugin.core.runner.Process
            commands:
              - 'echo "running {{task.id}}"'
              - 'sleep 1'
          - id: t3
            type: io.kestra.plugin.scripts.shell.Commands
            taskRunner:
              type: io.kestra.plugin.core.runner.Process
            commands:
              - 'echo "running {{task.id}}"'
              - 'sleep 1'
      - id: seq2
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: t4
            type: io.kestra.plugin.scripts.shell.Commands
            taskRunner:
              type: io.kestra.plugin.core.runner.Process
            commands:
              - 'echo "running {{task.id}}"'
              - 'sleep 1'
          - id: t5
            type: io.kestra.plugin.scripts.shell.Commands
            taskRunner:
              type: io.kestra.plugin.core.runner.Process
            commands:
              - 'echo "running {{task.id}}"'
              - 'sleep 1'
          - id: t6
            type: io.kestra.plugin.scripts.shell.Commands
            taskRunner:
              type: io.kestra.plugin.core.runner.Process
            commands:
              - 'echo "running {{task.id}}"'
              - 'sleep 1'
  - id: last
    type: io.kestra.plugin.core.debug.Return
    format: "{{task.id}} > {{taskrun.startDate}}"
```

---

# Pause and Resume Flows in Kestra
URL: https://kestra.io/docs/how-to-guides/pause-resume

> Pause and resume Kestra workflow executions on demand. Use manual triggers, scheduled waits, and approval gates to control flow progression at runtime.

How to Pause and Resume your flows. Here are common scenarios where the Pause and Resume feature is particularly useful:

1. **Output Validation**: you can pause a workflow to check the logs and view the generated outputs before processing downstream tasks.
2. **Manual Approval**: the execution can wait for manual approval, e.g. after validating that a file has been correctly uploaded to an external system.
3. **Human-in-the-loop**: you can pause a workflow execution to perform a human task before resuming the execution, e.g. to validate a trained machine learning model before deploying it to production.

## How to pause and resume a workflow

```yaml
id: pause_resume
namespace: company.team

tasks:
  - id: pause
    type: io.kestra.plugin.core.flow.Pause

  - id: after_pause
    type: io.kestra.plugin.core.log.Log
    message: Execution has been resumed!
```

The `Pause` task will pause the execution, and the `Log` task will run only once the workflow has been resumed.

## Pausing and resuming a workflow from the UI

You can either use the Pause task or manually pause from the Execution overview page. Once the execution is paused, you can inspect the current logs and outputs. Then, you can resume it from the UI by clicking on the `Resume` button in the `Overview` tab:

![pause_resume](./pause_resume.png)

## Bulk-resuming paused workflows

You can bulk-resume paused workflows from the `Executions` page by selecting the workflows you want to resume and clicking on the `Resume` button:

![pause_resume2](./pause_resume2.png)

This feature is useful when you have multiple paused workflows and want to resume them all at once.

:::alert{type="warning"}
Select only workflows in the `PAUSED` state, as the `Resume` button will not work if you select workflows in other states.
:::

### Manual Approval Process

Below, you can see an example of a workflow that sends a Slack message requesting approval for a vacation request to a manager. The workflow execution is paused until the manager resumes it with custom input values. Those input values indicate whether the request was approved and the reason for the decision.
```yaml
id: vacation_approval_process
namespace: company.team

inputs:
  - id: request.name
    type: STRING
    defaults: Rick Astley

  - id: request.start_date
    type: DATE
    defaults: 2024-07-01

  - id: request.end_date
    type: DATE
    defaults: 2024-07-07

  - id: slack_webhook_uri
    type: URI
    defaults: https://kestra.io/api/mock

tasks:
  - id: sendApprovalRequest
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ inputs.slack_webhook_uri }}"
    payload: |
      {
        "channel": "#vacation",
        "text": "Validate holiday request for {{ inputs.request.name }}. To approve the request, click on the `Resume` button here http://localhost:28080/ui/executions/{{flow.namespace}}/{{flow.id}}/{{execution.id}}"
      }

  - id: waitForApproval
    type: io.kestra.plugin.core.flow.Pause
    onResume:
      - id: approved
        description: Approve the request?
        type: BOOLEAN
        defaults: true

      - id: reason
        description: Reason for approval or rejection?
        type: STRING
        defaults: Approved

  - id: approve
    type: io.kestra.plugin.core.http.Request
    uri: https://kestra.io/api/mock
    method: POST
    contentType: application/json
    body: "{{ inputs.request }}"

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: Status is {{ outputs.waitForApproval.onResume.reason }}. Process finished with {{ outputs.approve.body }}
```

When you click on the `Resume` button in the UI, you will be prompted to provide the approval status and the reason for the decision. The workflow will then continue with the provided input values.

![pause_resume_1](./pause_resume_1.png)

After the Execution has been resumed, any downstream task can access the `onResume` inputs using the `outputs` of the `Pause` task:

![pause_resume_2](./pause_resume_2.png)

---

# Run Perl Inside Your Flows
URL: https://kestra.io/docs/how-to-guides/perl

> Execute Perl scripts inside Kestra workflows. Run automation and text-processing tasks with Perl, using Docker containers for clean dependency isolation.

Run Perl code directly in your flows and generate outputs.
There isn't an official Perl plugin, but we can use the Shell `Commands` task to execute arbitrary commands inside a Docker container. We can also specify a container image that contains the necessary libraries to run the specific programming language. In this example, we're using the Docker Task Runner with the `perl:latest` image so that Perl can be executed.

```yaml
id: perl_commands
namespace: company.team

tasks:
  - id: perl
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: perl:latest
    namespaceFiles:
      enabled: true
    commands:
      - chmod +x main.pl
      - perl main.pl
```

The `main.pl` file contains a simple print statement:

```perl
#!/usr/bin/perl
print "Hello World";
```

You'll need to add your Perl code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file.

You can also have the Perl code written inline using the `inputFiles` property:

```yaml
id: perl_commands
namespace: company.team

tasks:
  - id: perl
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: perl:latest
    inputFiles:
      main.pl: |
        #!/usr/bin/perl
        print "Hello World";
    commands:
      - chmod +x main.pl
      - perl main.pl
```

You can read more about the Shell Commands type in the [Plugin documentation](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.commands).

## Handling Outputs

If you want to get a variable or file from your Perl code, you can use an [output](../../05.workflow-components/06.outputs/index.md).

### Variable Output

You can get the JSON outputs from the Perl script using the `::{}::` pattern.
Here is an example:

```yaml
id: perl_outputs
namespace: company.team

tasks:
  - id: perl
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: perl:latest
    inputFiles:
      main.pl: |
        #!/usr/bin/perl
        print '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::';
    commands:
      - chmod +x main.pl
      - perl main.pl
```

All the output variables can be viewed in the Outputs tab of the execution.

![perl_outputs](./outputs.png)

You can refer to the outputs in another task as shown in the example below:

```yaml
id: perl_outputs
namespace: company.team

tasks:
  - id: perl
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: perl:latest
    inputFiles:
      main.pl: |
        #!/usr/bin/perl
        print '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::';
    commands:
      - chmod +x main.pl
      - perl main.pl

  - id: return
    type: io.kestra.plugin.core.debug.Return
    format: '{{ outputs.perl.vars.test }}'
```

### File Output

Inside of your Perl code, write a file to the system. You'll need to add the `outputFiles` property to your flow and list the files you're trying to put out. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes an `output.txt` file containing the "Hello World" text. We can then refer to the file using the syntax `{{ outputs.{task_id}.outputFiles[''] }}`, and read the contents of the file using the `read()` function.
```yaml
id: perl_script
namespace: company.team

tasks:
  - id: perl
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: perl:latest
    inputFiles:
      main.pl: |
        #!/usr/bin/perl
        use strict;
        use warnings;

        # Open the file for writing
        open(my $fh, '>', 'output.txt') or die "Cannot open file: $!";

        # Write to the file
        print $fh "Hello World";

        # Close the file
        close($fh);

        print "Successfully wrote to the file.\n";
    outputFiles:
      - output.txt
    commands:
      - chmod +x main.pl
      - perl main.pl

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.perl.outputFiles['output.txt']) }}"
```

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your Perl code. Metrics use the same `::{}::` pattern as outputs. This example demonstrates both the counter and timer metrics.

```yaml
id: perl_metrics
namespace: company.team

tasks:
  - id: perl
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: perl:latest
    inputFiles:
      main.pl: |
        #!/usr/bin/perl
        print "There are 20 products in the cart\n";
        print "::{\"outputs\":{\"productCount\":20}}::\n";
        print "::{\"metrics\":[{\"name\":\"productCount\",\"type\":\"counter\",\"value\":20}]}::\n";
        print "::{\"metrics\":[{\"name\":\"purchaseTime\",\"type\":\"timer\",\"value\":32.44}]}::\n";
    commands:
      - chmod +x main.pl
      - perl main.pl
```

Once this has executed, both metrics can be viewed under **Metrics**.

![metrics](./metrics.png)

---

# Populate Your Instance with Sample Data
URL: https://kestra.io/docs/how-to-guides/populate-demo-data

> Populate your Kestra instance with demo data. Use sample flows and datasets to explore features, test integrations, and validate your setup.

Quickly populate your Kestra instance with realistic demo flows and executions using a single SQL script.
This is useful for demos, testing dashboards, taking screenshots, or exploring Kestra's UI with meaningful data.

## Prerequisites

- Kestra running via [Docker Compose](../../02.installation/03.docker-compose/index.md) with a **PostgreSQL** backend
- Access to the `postgres` container via `docker compose exec`

## What Gets Inserted

The script creates a fully populated instance with:

| Category | Details |
|----------|---------|
| **Flows** | 10 flows across 6 namespaces (`acme`, `acme.sales`, `acme.company.data`, `acme.operations`, `acme.marketing`, `acme.finance`) |
| **Executions** | ~224 executions spread over the past 7 days |
| **States** | Realistic distribution: ~70% SUCCESS, ~10% FAILED, ~8% WARNING, ~5% RUNNING, ~4% CANCELLED, ~3% RETRIED |
| **Timing** | Weighted toward business hours (8 AM–6 PM), with some evening and night runs |

## How to Run

Download the SQL script and pipe it into the PostgreSQL container:

```bash
cat seed_demo_data.sql | docker compose exec -T postgres psql -U kestra
```

:::alert{type="info"}
The `-T` flag disables pseudo-TTY allocation, which is required when piping input to `docker compose exec`.
:::

## Key Properties

- **Idempotent** — uses `ON CONFLICT DO NOTHING`, so it's safe to run multiple times without duplicating data.
- **Multi-tenant aware** — all records are created under the `main` tenant ID.
- **Deterministic IDs** — execution IDs are generated with `md5(flow_id + day_offset + index)`, ensuring consistent results across runs.
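To see why the IDs stay stable across runs, you can reproduce the hashing idea in the shell. The concatenation format below is illustrative, not the exact expression the script uses:

```shell
# Hash the same (flow_id, day_offset, index) tuple twice: the digest is
# identical each time, so re-running the seed script derives the same
# execution IDs and ON CONFLICT DO NOTHING skips the existing rows.
id1=$(printf '%s' 'hello-world|3|7' | md5sum | cut -d' ' -f1)
id2=$(printf '%s' 'hello-world|3|7' | md5sum | cut -d' ' -f1)
echo "$id1"
[ "$id1" = "$id2" ] && echo "deterministic"   # prints: deterministic
```

Any change to the flow id, day offset, or index yields a new hash, so distinct executions never collide.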
## Full SQL Script :::collapse{title="View the full SQL script"} ```sql -- ============================================================================= -- Kestra Demo Seed Data -- Inserts ~10 flows and ~180 executions for the past 7 days -- Idempotent: safe to re-run (ON CONFLICT DO NOTHING) -- ============================================================================= BEGIN; -- ============================================================ -- FLOWS -- ============================================================ -- 1. hello-world (acme) INSERT INTO flows (key, value, source_code) VALUES ( 'acme_hello-world_1', '{"id":"hello-world","namespace":"acme","tenantId":"main","revision":1,"deleted":false,"description":"Hello World","tasks":[{"id":"first_task","type":"io.kestra.plugin.core.debug.Return","format":"thrilled"},{"id":"second_task","type":"io.kestra.plugin.scripts.shell.Commands","commands":["sleep 0.42","echo ''::{ \"outputs\":{\"returned_data\":\"mydata\"}}::''"]},{"id":"hello_world","type":"io.kestra.plugin.core.log.Log","message":"Welcome to Acme, {{ inputs.user }}!\nWe are {{ outputs.first_task.value }} to have you here!"}],"inputs":[{"id":"user","type":"STRING","defaults":"Rick Astley"}],"triggers":[{"id":"daily","type":"io.kestra.plugin.core.trigger.Schedule","cron":"0 9 * * *","disabled":true}]}'::jsonb, 'id: hello-world namespace: acme description: Hello World inputs: - id: user type: STRING defaults: Rick Astley tasks: - id: first_task type: io.kestra.plugin.core.debug.Return format: thrilled - id: second_task type: io.kestra.plugin.scripts.shell.Commands commands: - sleep 0.42 - echo ''::{"outputs":{"returned_data":"mydata"}}::'' - id: hello_world type: io.kestra.plugin.core.log.Log message: | Welcome to Acme, {{ inputs.user }}! We are {{ outputs.first_task.value }} to have you here! triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: 0 9 * * * disabled: true' ) ON CONFLICT (key) DO NOTHING; -- 2. 
customer_onboarding (acme.sales) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.sales_customer_onboarding_1', '{"id":"customer_onboarding","namespace":"acme.sales","tenantId":"main","revision":1,"deleted":false,"description":"Automated customer onboarding workflow","tasks":[{"id":"welcome_message","type":"io.kestra.plugin.core.log.Log","message":"Starting onboarding process"},{"id":"generate_customer_id","type":"io.kestra.plugin.core.output.OutputValues","values":{"customer_id":"ACME-20260317"}},{"id":"send_welcome_email","type":"io.kestra.plugin.core.log.Log","message":"Welcome email sent"},{"id":"complete","type":"io.kestra.plugin.core.log.Log","message":"Customer onboarding completed successfully!"}],"inputs":[{"id":"customer_name","type":"STRING","required":true},{"id":"customer_email","type":"STRING","required":true}]}'::jsonb, 'id: customer_onboarding namespace: acme.sales description: Automated customer onboarding workflow inputs: - id: customer_name type: STRING required: true - id: customer_email type: STRING required: true tasks: - id: welcome_message type: io.kestra.plugin.core.log.Log message: Starting onboarding process - id: generate_customer_id type: io.kestra.plugin.core.output.OutputValues values: customer_id: ACME-20260317 - id: send_welcome_email type: io.kestra.plugin.core.log.Log message: Welcome email sent - id: complete type: io.kestra.plugin.core.log.Log message: Customer onboarding completed successfully!' ) ON CONFLICT (key) DO NOTHING; -- 3. 
monthly_sales_report (acme.sales) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.sales_monthly_sales_report_1', '{"id":"monthly_sales_report","namespace":"acme.sales","tenantId":"main","revision":1,"deleted":false,"description":"Generate monthly sales performance report","tasks":[{"id":"fetch_sales_data","type":"io.kestra.plugin.core.log.Log","message":"Fetching sales data"},{"id":"calculate_metrics","type":"io.kestra.plugin.core.output.OutputValues","values":{"total_sales":"85000","num_customers":"342","avg_deal_size":"2480"}},{"id":"generate_report","type":"io.kestra.plugin.core.log.Log","message":"Monthly Sales Report generated"}],"triggers":[{"id":"monthly_schedule","type":"io.kestra.plugin.core.trigger.Schedule","cron":"0 9 1 * *","disabled":true}]}'::jsonb, 'id: monthly_sales_report namespace: acme.sales description: Generate monthly sales performance report tasks: - id: fetch_sales_data type: io.kestra.plugin.core.log.Log message: Fetching sales data - id: calculate_metrics type: io.kestra.plugin.core.output.OutputValues values: total_sales: "85000" num_customers: "342" avg_deal_size: "2480" - id: generate_report type: io.kestra.plugin.core.log.Log message: Monthly Sales Report generated triggers: - id: monthly_schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 1 * *" disabled: true' ) ON CONFLICT (key) DO NOTHING; -- 4. 
data_pipeline_assets (acme.company.data) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.company.data_data_pipeline_assets_1', '{"id":"data_pipeline_assets","namespace":"acme.company.data","tenantId":"main","revision":1,"deleted":false,"tasks":[{"id":"create_staging_layer_asset","type":"io.kestra.plugin.jdbc.duckdb.Query","sql":"CREATE TABLE IF NOT EXISTS trips AS select VendorID, passenger_count, trip_distance from sample_data.nyc.taxi limit 10;"},{"id":"for_each","type":"io.kestra.plugin.core.flow.ForEach","values":["passenger_count","trip_distance"],"tasks":[{"id":"create_mart_layer_asset","type":"io.kestra.plugin.jdbc.duckdb.Query","sql":"SELECT AVG({{taskrun.value}}) AS avg_{{taskrun.value}} FROM trips;"}]}]}'::jsonb, 'id: data_pipeline_assets namespace: acme.company.data tasks: - id: create_staging_layer_asset type: io.kestra.plugin.jdbc.duckdb.Query sql: | CREATE TABLE IF NOT EXISTS trips AS select VendorID, passenger_count, trip_distance from sample_data.nyc.taxi limit 10; - id: for_each type: io.kestra.plugin.core.flow.ForEach values: - passenger_count - trip_distance tasks: - id: create_mart_layer_asset type: io.kestra.plugin.jdbc.duckdb.Query sql: SELECT AVG({{taskrun.value}}) AS avg_{{taskrun.value}} FROM trips;' ) ON CONFLICT (key) DO NOTHING; -- 5. 
system_health_check (acme.operations) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.operations_system_health_check_1', '{"id":"system_health_check","namespace":"acme.operations","tenantId":"main","revision":1,"deleted":false,"description":"Monitor system health and performance","tasks":[{"id":"check_api_endpoints","type":"io.kestra.plugin.core.log.Log","message":"Checking API endpoint availability..."},{"id":"check_database","type":"io.kestra.plugin.core.log.Log","message":"Checking database connections..."},{"id":"check_services","type":"io.kestra.plugin.core.log.Log","message":"Checking microservices status..."},{"id":"calculate_uptime","type":"io.kestra.plugin.core.output.OutputValues","values":{"api_uptime":"99.95","db_response_time":"12","services_healthy":"10"}},{"id":"generate_report","type":"io.kestra.plugin.core.log.Log","message":"System Health Report generated"}],"triggers":[{"id":"hourly_check","type":"io.kestra.plugin.core.trigger.Schedule","cron":"0 * * * *","disabled":true}]}'::jsonb, 'id: system_health_check namespace: acme.operations description: Monitor system health and performance tasks: - id: check_api_endpoints type: io.kestra.plugin.core.log.Log message: Checking API endpoint availability... - id: check_database type: io.kestra.plugin.core.log.Log message: Checking database connections... - id: check_services type: io.kestra.plugin.core.log.Log message: Checking microservices status... - id: calculate_uptime type: io.kestra.plugin.core.output.OutputValues values: api_uptime: "99.95" db_response_time: "12" services_healthy: "10" - id: generate_report type: io.kestra.plugin.core.log.Log message: System Health Report generated triggers: - id: hourly_check type: io.kestra.plugin.core.trigger.Schedule cron: "0 * * * *" disabled: true' ) ON CONFLICT (key) DO NOTHING; -- 6. 
inventory_check (acme.operations) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.operations_inventory_check_1', '{"id":"inventory_check","namespace":"acme.operations","tenantId":"main","revision":1,"deleted":false,"description":"Daily inventory level monitoring","tasks":[{"id":"scan_inventory","type":"io.kestra.plugin.core.log.Log","message":"Scanning inventory levels across all warehouses..."},{"id":"check_levels","type":"io.kestra.plugin.core.output.OutputValues","values":{"total_items":"8542","low_stock_items":"23","out_of_stock":"2"}},{"id":"generate_alerts","type":"io.kestra.plugin.core.log.Log","message":"Inventory Status Report generated"},{"id":"notify_purchasing","type":"io.kestra.plugin.core.log.Log","message":"Reorder notifications sent to purchasing team"}],"triggers":[{"id":"daily_check","type":"io.kestra.plugin.core.trigger.Schedule","cron":"0 7 * * *","disabled":true}]}'::jsonb, 'id: inventory_check namespace: acme.operations description: Daily inventory level monitoring tasks: - id: scan_inventory type: io.kestra.plugin.core.log.Log message: Scanning inventory levels across all warehouses... - id: check_levels type: io.kestra.plugin.core.output.OutputValues values: total_items: "8542" low_stock_items: "23" out_of_stock: "2" - id: generate_alerts type: io.kestra.plugin.core.log.Log message: Inventory Status Report generated - id: notify_purchasing type: io.kestra.plugin.core.log.Log message: Reorder notifications sent to purchasing team triggers: - id: daily_check type: io.kestra.plugin.core.trigger.Schedule cron: "0 7 * * *" disabled: true' ) ON CONFLICT (key) DO NOTHING; -- 7. 
email_campaign_trigger (acme.marketing) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.marketing_email_campaign_trigger_1', '{"id":"email_campaign_trigger","namespace":"acme.marketing","tenantId":"main","revision":1,"deleted":false,"description":"Trigger email marketing campaigns","tasks":[{"id":"validate_campaign","type":"io.kestra.plugin.core.log.Log","message":"Validating campaign"},{"id":"calculate_audience","type":"io.kestra.plugin.core.output.OutputValues","values":{"audience_size":"4500"}},{"id":"send_campaign","type":"io.kestra.plugin.core.log.Log","message":"Sending campaign"},{"id":"track_metrics","type":"io.kestra.plugin.core.log.Log","message":"Campaign sent successfully. Tracking metrics..."}],"inputs":[{"id":"campaign_name","type":"STRING","defaults":"Monthly Newsletter"},{"id":"target_segment","type":"SELECT","values":["All Customers","Premium Customers","Trial Users","Inactive Users"],"defaults":"All Customers"}]}'::jsonb, 'id: email_campaign_trigger namespace: acme.marketing description: Trigger email marketing campaigns inputs: - id: campaign_name type: STRING defaults: "Monthly Newsletter" - id: target_segment type: SELECT values: - All Customers - Premium Customers - Trial Users - Inactive Users defaults: "All Customers" tasks: - id: validate_campaign type: io.kestra.plugin.core.log.Log message: Validating campaign - id: calculate_audience type: io.kestra.plugin.core.output.OutputValues values: audience_size: "4500" - id: send_campaign type: io.kestra.plugin.core.log.Log message: Sending campaign - id: track_metrics type: io.kestra.plugin.core.log.Log message: Campaign sent successfully. Tracking metrics...' ) ON CONFLICT (key) DO NOTHING; -- 8. 
social_media_analytics (acme.marketing) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.marketing_social_media_analytics_1', '{"id":"social_media_analytics","namespace":"acme.marketing","tenantId":"main","revision":1,"deleted":false,"description":"Aggregate social media performance metrics","tasks":[{"id":"fetch_twitter_metrics","type":"io.kestra.plugin.core.log.Log","message":"Fetching Twitter/X metrics..."},{"id":"fetch_linkedin_metrics","type":"io.kestra.plugin.core.log.Log","message":"Fetching LinkedIn metrics..."},{"id":"aggregate_data","type":"io.kestra.plugin.core.output.OutputValues","values":{"total_impressions":"32000","total_engagement":"1250","follower_growth":"87"}},{"id":"generate_insights","type":"io.kestra.plugin.core.log.Log","message":"Social Media Weekly Report generated"}],"triggers":[{"id":"weekly_report","type":"io.kestra.plugin.core.trigger.Schedule","cron":"0 10 * * 1","disabled":true}]}'::jsonb, 'id: social_media_analytics namespace: acme.marketing description: Aggregate social media performance metrics tasks: - id: fetch_twitter_metrics type: io.kestra.plugin.core.log.Log message: Fetching Twitter/X metrics... - id: fetch_linkedin_metrics type: io.kestra.plugin.core.log.Log message: Fetching LinkedIn metrics... - id: aggregate_data type: io.kestra.plugin.core.output.OutputValues values: total_impressions: "32000" total_engagement: "1250" follower_growth: "87" - id: generate_insights type: io.kestra.plugin.core.log.Log message: Social Media Weekly Report generated triggers: - id: weekly_report type: io.kestra.plugin.core.trigger.Schedule cron: "0 10 * * 1" disabled: true' ) ON CONFLICT (key) DO NOTHING; -- 9. 
invoice_processing (acme.finance) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.finance_invoice_processing_1', '{"id":"invoice_processing","namespace":"acme.finance","tenantId":"main","revision":1,"deleted":false,"description":"Process and validate invoices","tasks":[{"id":"validate_invoice","type":"io.kestra.plugin.core.log.Log","message":"Validating invoice"},{"id":"check_approval_needed","type":"io.kestra.plugin.core.output.OutputValues","values":{"needs_approval":"true","approver":"CFO"}},{"id":"process_payment","type":"io.kestra.plugin.core.log.Log","message":"Processing payment"},{"id":"send_confirmation","type":"io.kestra.plugin.core.log.Log","message":"Payment confirmation sent"}],"inputs":[{"id":"invoice_number","type":"STRING","required":true},{"id":"amount","type":"FLOAT","required":true},{"id":"vendor_name","type":"STRING","required":true}]}'::jsonb, 'id: invoice_processing namespace: acme.finance description: Process and validate invoices inputs: - id: invoice_number type: STRING required: true - id: amount type: FLOAT required: true - id: vendor_name type: STRING required: true tasks: - id: validate_invoice type: io.kestra.plugin.core.log.Log message: Validating invoice - id: check_approval_needed type: io.kestra.plugin.core.output.OutputValues values: needs_approval: "true" approver: CFO - id: process_payment type: io.kestra.plugin.core.log.Log message: Processing payment - id: send_confirmation type: io.kestra.plugin.core.log.Log message: Payment confirmation sent' ) ON CONFLICT (key) DO NOTHING; -- 10. 
quarterly_financial_report (acme.finance) INSERT INTO flows (key, value, source_code) VALUES ( 'acme.finance_quarterly_financial_report_1', '{"id":"quarterly_financial_report","namespace":"acme.finance","tenantId":"main","revision":1,"deleted":false,"description":"Generate quarterly financial statements","tasks":[{"id":"gather_financial_data","type":"io.kestra.plugin.core.log.Log","message":"Gathering financial data"},{"id":"calculate_financials","type":"io.kestra.plugin.core.output.OutputValues","values":{"revenue":"842000","expenses":"510000","profit_margin":"24.5"}},{"id":"generate_report","type":"io.kestra.plugin.core.log.Log","message":"Quarterly Financial Report generated"},{"id":"distribute_report","type":"io.kestra.plugin.core.log.Log","message":"Report distributed to executive team"}],"triggers":[{"id":"quarterly_schedule","type":"io.kestra.plugin.core.trigger.Schedule","cron":"0 8 1 1,4,7,10 *","disabled":true}]}'::jsonb, 'id: quarterly_financial_report namespace: acme.finance description: Generate quarterly financial statements tasks: - id: gather_financial_data type: io.kestra.plugin.core.log.Log message: Gathering financial data - id: calculate_financials type: io.kestra.plugin.core.output.OutputValues values: revenue: "842000" expenses: "510000" profit_margin: "24.5" - id: generate_report type: io.kestra.plugin.core.log.Log message: Quarterly Financial Report generated - id: distribute_report type: io.kestra.plugin.core.log.Log message: Report distributed to executive team triggers: - id: quarterly_schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 8 1 1,4,7,10 *" disabled: true' ) ON CONFLICT (key) DO NOTHING; -- ============================================================ -- EXECUTIONS -- Generated via PL/pgSQL to create ~180 executions over 7 days -- ============================================================ DO $$ DECLARE -- Flow definitions: flow_id, namespace, avg_duration_seconds, daily_frequency flow_configs TEXT[][] := ARRAY[ 
ARRAY['hello-world', 'acme', '8', '5'], ARRAY['customer_onboarding', 'acme.sales', '15', '4'], ARRAY['monthly_sales_report', 'acme.sales', '120', '2'], ARRAY['data_pipeline_assets', 'acme.company.data', '300', '3'], ARRAY['system_health_check', 'acme.operations', '25', '8'], ARRAY['inventory_check', 'acme.operations', '45', '3'], ARRAY['email_campaign_trigger', 'acme.marketing', '90', '2'], ARRAY['social_media_analytics', 'acme.marketing', '180', '1'], ARRAY['invoice_processing', 'acme.finance', '30', '3'], ARRAY['quarterly_financial_report','acme.finance', '600', '1'] ]; -- State distribution weights (cumulative out of 100): -- SUCCESS=70, FAILED=80, WARNING=88, RUNNING=93, CANCELLED=97, RETRIED=100 state_thresholds INT[] := ARRAY[70, 80, 88, 93, 97, 100]; state_names TEXT[] := ARRAY['SUCCESS', 'FAILED', 'WARNING', 'RUNNING', 'CANCELLED', 'RETRIED']; base_date TIMESTAMP; exec_id TEXT; flow_id TEXT; flow_ns TEXT; avg_dur INT; daily_freq INT; exec_start TIMESTAMP; exec_end TIMESTAMP; duration_secs INT; duration_iso TEXT; state TEXT; rand_val INT; day_offset INT; hour_val INT; minute_val INT; histories_json TEXT; exec_json TEXT; hour_weight FLOAT; i INT; j INT; BEGIN -- Base date = 7 days ago at midnight UTC base_date := date_trunc('day', NOW() - INTERVAL '7 days'); FOR i IN 1..array_length(flow_configs, 1) LOOP flow_id := flow_configs[i][1]; flow_ns := flow_configs[i][2]; avg_dur := flow_configs[i][3]::INT; daily_freq := flow_configs[i][4]::INT; FOR day_offset IN 0..6 LOOP -- Fixed daily count for idempotency (deterministic loop bounds) FOR j IN 1..daily_freq LOOP -- Generate hour weighted toward business hours (8-18) hour_weight := random(); IF hour_weight < 0.7 THEN -- 70% during business hours 8-18 hour_val := 8 + (random() * 10)::INT; ELSIF hour_weight < 0.9 THEN -- 20% during evening 18-23 hour_val := 18 + (random() * 5)::INT; ELSE -- 10% during night 0-7 hour_val := (random() * 7)::INT; END IF; minute_val := (random() * 59)::INT; exec_start := base_date + 
(day_offset || ' days')::INTERVAL + (hour_val || ' hours')::INTERVAL + (minute_val || ' minutes')::INTERVAL + ((random() * 59)::INT || ' seconds')::INTERVAL; -- Duration: vary between 30% and 250% of avg duration_secs := greatest(2, (avg_dur * (0.3 + random() * 2.2))::INT); -- Pick state based on distribution rand_val := (random() * 99)::INT + 1; state := 'SUCCESS'; FOR k IN 1..array_length(state_thresholds, 1) LOOP IF rand_val <= state_thresholds[k] THEN state := state_names[k]; EXIT; END IF; END LOOP; -- RUNNING executions have no end date IF state = 'RUNNING' THEN exec_end := NULL; -- Make start_date recent (within last 30 min) exec_start := NOW() - (random() * 30 || ' minutes')::INTERVAL; duration_iso := 'PT' || (EXTRACT(EPOCH FROM (NOW() - exec_start))::INT) || 'S'; ELSE exec_end := exec_start + (duration_secs || ' seconds')::INTERVAL; -- Build ISO 8601 duration IF duration_secs >= 3600 THEN duration_iso := 'PT' || (duration_secs / 3600) || 'H' || ((duration_secs % 3600) / 60) || 'M' || (duration_secs % 60) || 'S'; ELSIF duration_secs >= 60 THEN duration_iso := 'PT' || (duration_secs / 60) || 'M' || (duration_secs % 60) || 'S'; ELSE duration_iso := 'PT' || duration_secs || 'S'; END IF; END IF; -- Deterministic ID based on flow + day + index for idempotency exec_id := md5(flow_id || '_' || day_offset::TEXT || '_' || j::TEXT); -- Build state histories JSON IF state = 'RUNNING' THEN histories_json := '[{"state":"CREATED","date":"' || to_char(exec_start, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RUNNING","date":"' || to_char(exec_start + INTERVAL '100 milliseconds', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"}]'; ELSIF state = 'RETRIED' THEN histories_json := '[{"state":"CREATED","date":"' || to_char(exec_start, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RUNNING","date":"' || to_char(exec_start + INTERVAL '100 milliseconds', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"FAILED","date":"' || to_char(exec_start + (duration_secs / 2 || 
' seconds')::INTERVAL, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RETRYING","date":"' || to_char(exec_start + (duration_secs / 2 || ' seconds')::INTERVAL + INTERVAL '1 second', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RUNNING","date":"' || to_char(exec_start + (duration_secs / 2 || ' seconds')::INTERVAL + INTERVAL '2 seconds', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RETRIED","date":"' || to_char(exec_end, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"}]'; ELSIF state = 'CANCELLED' THEN histories_json := '[{"state":"CREATED","date":"' || to_char(exec_start, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RUNNING","date":"' || to_char(exec_start + INTERVAL '100 milliseconds', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"KILLING","date":"' || to_char(exec_end - INTERVAL '1 second', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"CANCELLED","date":"' || to_char(exec_end, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"}]'; ELSE -- SUCCESS, FAILED, WARNING histories_json := '[{"state":"CREATED","date":"' || to_char(exec_start, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"RUNNING","date":"' || to_char(exec_start + INTERVAL '100 milliseconds', 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"},' || '{"state":"' || state || '","date":"' || to_char(exec_end, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '"}]'; END IF; -- Build full execution JSON exec_json := '{"id":"' || exec_id || '",' || '"namespace":"' || flow_ns || '",' || '"tenantId":"main",' || '"flowId":"' || flow_id || '",' || '"flowRevision":1,' || '"deleted":false,' || '"state":{' || '"current":"' || state || '",' || '"startDate":"' || to_char(exec_start, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '",' || CASE WHEN exec_end IS NOT NULL THEN '"endDate":"' || to_char(exec_end, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') || '",' ELSE '' END || '"duration":"' || duration_iso || '",' || '"histories":' || histories_json || '},' || '"taskRunList":[]}'; INSERT INTO executions (key, value) VALUES 
(exec_id, exec_json::jsonb) ON CONFLICT (key) DO NOTHING; END LOOP; -- j (executions per day) END LOOP; -- day_offset END LOOP; -- i (flows) END $$; COMMIT; ``` ::: --- # Run PowerShell Inside Your Flows URL: https://kestra.io/docs/how-to-guides/powershell > Run PowerShell scripts in Kestra. Automate Windows administration, call Azure APIs, and integrate Microsoft services into your automation pipelines. Run PowerShell code in your flow. PowerShell is commonly used for automating the management of systems and resources. With Kestra, you can effortlessly automate builds and tests for production systems, as well as manage cloud configurations and resources. Kestra's robust orchestration capabilities ensure that your PowerShell scripts run smoothly and efficiently, streamlining your infrastructure. This guide is going to walk you through how to get PowerShell running in a workflow, how to manage input and output files, and how you can pass outputs and metrics back to Kestra to use in later tasks. You can execute PowerShell code in a flow by either writing your PowerShell code inline or by executing a `.ps1` file. You can get outputs and metrics from your PowerShell code too. ## Scripts If you want to write a short amount of PowerShell code to perform a task, you can use the `io.kestra.plugin.scripts.powershell.Script` type to write it directly inside your flow. This allows you to keep everything in one place. ```yaml id: powershell_script namespace: company.team description: This flow runs the PowerShell script. 
tasks: - id: http_download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: powershell_script_task type: io.kestra.plugin.scripts.powershell.Script script: | Write-Output "The current execution is {{ execution.id }}" # Read the file downloaded in `http_download` task $content = Get-Content "{{ outputs.http_download.uri }}" $content ``` You can read more about the Scripts type in the [Plugin documentation](/plugins/plugin-script-powershell/io.kestra.plugin.scripts.powershell.script) ## Commands If you would prefer to put your PowerShell code in a `.ps1` file (e.g. your code is much longer or spread across multiple files), you can run the previous example using the `io.kestra.plugin.scripts.powershell.Commands` type: ```yaml id: powershell_commands namespace: company.team tasks: - id: run_powershell type: io.kestra.plugin.scripts.powershell.Commands namespaceFiles: enabled: true commands: - ./main.ps1 ``` The contents of the `main.ps1` file can be: ```powershell Write-Output "Hello World" ``` You'll need to add your PowerShell code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file. You can also have the PowerShell code written inline. 
```yaml id: powershell_commands namespace: company.team tasks: - id: http_download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: run_powershell type: io.kestra.plugin.scripts.powershell.Commands inputFiles: orders.csv: "{{ read(outputs.http_download.uri) }}" main.ps1: | Write-Output "The current execution is {{ execution.id }}" # Read the file $content = Get-Content "orders.csv" $content commands: - ./main.ps1 ``` You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-powershell/io.kestra.plugin.scripts.powershell.commands). ## Handling Outputs If you want to get a variable or file from your PowerShell script, you can use an [output](../../05.workflow-components/06.outputs/index.md). ### Variable Output You can output JSON variables from your PowerShell Script or Commands tasks using the `::{}::` pattern. Here is an example: ```yaml id: powershell_outputs namespace: company.team description: This flow runs the PowerShell script, and outputs the variable. tasks: - id: powershell_outputs_task type: io.kestra.plugin.scripts.powershell.Script script: | Write-Output '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' ``` All the output variables can be viewed in the Outputs tab of the execution. ![powershell_outputs](./outputs.png) You can refer to the outputs in another task as shown in the example below: ```yaml id: powershell_outputs namespace: company.team description: This flow runs the PowerShell script, and outputs the variable.
tasks: - id: powershell_outputs_task type: io.kestra.plugin.scripts.powershell.Script script: | Write-Output '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' - id: return type: io.kestra.plugin.core.debug.Return format: '{{ outputs.powershell_outputs_task.vars.test }}' ``` _This example works for both `io.kestra.plugin.scripts.powershell.Script` and `io.kestra.plugin.scripts.powershell.Commands`._ ### File Output Inside your PowerShell script, write a file to the system. You'll need to add the `outputFiles` property to your flow and list the files you want to output. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md). The example below writes an `output.txt` file containing the "Hello World" text. We can then refer to the file using the syntax `{{ outputs.{task_id}.outputFiles[''] }}`, and read the contents of the file using the `read()` function. ```yaml id: powershell_output_file namespace: company.team description: This flow runs the PowerShell script to output a file. tasks: - id: powershell_outputs_task type: io.kestra.plugin.scripts.powershell.Script outputFiles: - output.txt script: | Set-Content -Path "output.txt" -Value "Hello World" - id: log_output type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.powershell_outputs_task.outputFiles['output.txt']) }}" ``` _This example works for both `io.kestra.plugin.scripts.powershell.Script` and `io.kestra.plugin.scripts.powershell.Commands`._ ## Handling Metrics You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your PowerShell script. Metrics use the same `::{}::` pattern as outputs. This example demonstrates both counter and timer metrics.
```yaml id: powershell_metrics namespace: company.team description: This flow runs the PowerShell script, and outputs the metrics. tasks: - id: powershell_metrics_task type: io.kestra.plugin.scripts.powershell.Script script: | Write-Output 'There are 20 products in the cart' Write-Output '::{"outputs":{"productCount":20}}::' Write-Output '::{"metrics":[{"name":"productCount","type":"counter","value":20}]}::' Write-Output '::{"metrics":[{"name":"purchaseTime","type":"timer","value":32.44}]}::' ``` Once this has executed, both metrics can be viewed under **Metrics**. ![powershell_metrics](./metrics.png) --- # Trigger a Flow on a Prometheus Alert URL: https://kestra.io/docs/how-to-guides/prometheus-alert-trigger > Connect Prometheus alerts to Kestra to automatically trigger flows via webhooks when specific metrics thresholds are breached. Connect Prometheus alerts to Kestra to trigger flows. ## Monitoring with Prometheus and Triggering Flows in Kestra This guide explains how to connect Prometheus to Kestra and configure a workflow that is triggered by Prometheus alerts. This guide covers the basics and is intended as a starting point for production workflows. You will: 1. Integrate Prometheus with Kestra 2. Configure Prometheus Alertmanager to send alerts via webhook 3. Use a Webhook Trigger in a Kestra flow --- ## Connect Prometheus to Kestra Kestra natively supports integration with Prometheus for metric scraping and visualization. Kestra exposes [Prometheus](https://prometheus.io/) metrics at port 8081 on the endpoint `/prometheus`. This endpoint can be used by any compatible monitoring system. Follow the setup steps in the [Kestra Monitoring Guide](../monitoring/index.md). Once Kestra is up and running, view the available metrics at `http://localhost:8081/prometheus` in your browser.
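The `/prometheus` endpoint serves metrics in the Prometheus text exposition format: one metric per line, with optional labels and a sample value. As a rough illustration of that shape (Prometheus handles the actual scraping for you), here is a minimal, simplified parser; the metric name is taken from Kestra's metric list, while the label value `company.team` is illustrative:

```python
import re

# Minimal sketch: parse one line of the Prometheus text exposition format
# into (metric name, labels dict, value). Real scrapers use a full parser
# (this one ignores escaping and commas inside quoted label values); it only
# illustrates the shape of the data Kestra exposes on :8081/prometheus.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)

def parse_metric_line(line: str):
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    labels = {}
    if m.group("labels"):
        for pair in m.group("labels").split(","):
            key, value = pair.split("=", 1)
            labels[key] = value.strip('"')
    return m.group("name"), labels, float(m.group("value"))

sample = 'kestra_executor_execution_started_count_total{namespace="company.team"} 42.0'
print(parse_metric_line(sample))
```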
## Configure Prometheus to Scrape Kestra Add the Kestra metrics endpoint to your Prometheus configuration (`prometheus.yml`): ```yaml global: scrape_interval: 15s evaluation_interval: 15s scrape_configs: - job_name: "prometheus" metrics_path: /prometheus static_configs: - targets: [":8081"] ``` Be sure to put the appropriate host in the last line, e.g., `localhost:8081` or `host.docker.internal:8081`. Restart Prometheus for the changes to take effect. :::alert{type="info"} If you're running everything with Docker on the same machine, you will need to change your host address to `host.docker.internal` rather than localhost, or the name of the container. ::: --- ## Create a Prometheus Alert and Webhook Receiver To trigger a Kestra flow on a Prometheus alert, configure [Prometheus Alertmanager](https://github.com/prometheus/alertmanager) to send a webhook to Kestra. You can [download Alertmanager and Prometheus](https://prometheus.io/download/) from the official site, or run the services with Docker Compose, as in the example below: ```yaml services: prometheus: image: prom/prometheus privileged: true volumes: - ./prometheus.yml:/etc/prometheus/prometheus.yml - ./alertmanager/alert.rules:/alertmanager/alert.rules command: - '--config.file=/etc/prometheus/prometheus.yml' ports: - '9090:9090' node-exporter: image: prom/node-exporter ports: - '9100:9100' alertmanager: image: prom/alertmanager privileged: true volumes: - ./alertmanager/alertmanager.yml:/alertmanager.yml command: - '--config.file=/alertmanager.yml' ports: - '9093:9093' ``` You can verify Prometheus is up and running by going to `http://localhost:9090/graph` and visualizing some metrics using PromQL. Below is a graph of the `kestra_executor_execution_started_count_total` metric: ![Prometheus metric graph](../monitoring/promql_graph.png) Similarly, you can go to `http://localhost:9093/status` and see that the Alertmanager is ready.
![Alertmanager Status](./alertmanager-status.png)

### Step 1: Define a Prometheus Alert

In your `prometheus.yml` file, add the Alertmanager target and the rules files. The `prometheus.yml` configuration now looks as follows:

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    metrics_path: /prometheus
    static_configs:
      - targets: [":8081"]

## Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093' # Replace with your host name (e.g., host.docker.internal)

## Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "/alertmanager/alert.rules"
```

Create a simple rule to alert on high CPU usage or another metric exposed by Kestra. Refer to the full list at [Kestra Prometheus Metrics](../../10.administrator-guide/prometheus-metrics/index.md):

```yaml
groups:
  - name: alert.rules
    rules:
      - alert: HighCPUUsage
        expr: system_cpu_usage == 1.0
        for: 1m
        labels:
          severity: "critical"
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
```

You can also use a simple, generic instance-down alert such as:

```yaml
groups:
  - name: alert.rules
    rules:
      # Alert for any instance that is unreachable for >5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
```

Test different metrics and statuses for what fits your use case. Save these rules in a file such as `alert.rules.yml` and configure Prometheus to load it, as in the `prometheus.yml` above:

```yaml
rule_files:
  - "/alertmanager/alert.rules"
```

:::alert{type="info"}
Ensure that the path listed under `rule_files` matches the file where your rule groups are defined so that Prometheus loads the alerts correctly.
:::

To check that your rules are picked up by Prometheus, go to `http://localhost:9090/rules`.
![Prometheus Rules](./alert-rules.png)

From there, you can see a list of the rules set in the `alert.rules.yml` file:

![Prometheus Rules List](./alert-rules-list.png)

### Step 2: Configure Alertmanager to Use a Webhook

Now that all the services are connected, edit `alertmanager.yml` to send alerts to a Kestra webhook:

```yaml
route:
  receiver: 'kestra-webhook'

receivers:
  - name: 'kestra-webhook'
    webhook_configs:
      - url: 'https:///api/v1/triggers/webhook'
        send_resolved: true
```

Ensure your Alertmanager is restarted and using this configuration.

---

## Create a Kestra Webhook-Triggered Flow

Now create a Kestra flow that is triggered by a Prometheus alert via [webhook](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) from the rule definitions specified in the `alert.rules.yml` file.

### Example Flow Definition

```yaml
id: prometheus-alert
namespace: system

triggers:
  - id: from-prometheus
    type: io.kestra.plugin.core.trigger.Webhook
    key: prometheus

tasks:
  - id: log-alert
    type: io.kestra.plugin.core.log.Log
    message: "Received alert: {{ trigger.body }}"
```

Once the flow is written, you can verify the trigger is active from the **Flows -> Triggers** tab in the UI.

![Prometheus Webhook Trigger](./prometheus-webhook-trigger.png)

### How It Works

- The `Webhook` trigger listens for HTTP POST requests to:

```text
https:///api/v1/triggers/webhook/prometheus
```

- Prometheus Alertmanager sends alerts to this endpoint.
- The flow is executed with the alert content available as `{{ trigger.body }}`.

---

For more on the Webhook trigger, see the [Kestra Webhook Trigger Docs](https://kestra.io/docs/workflow-components/triggers/webhook-trigger). Again, see the [full list of metrics Kestra exposes to Prometheus](../../10.administrator-guide/prometheus-metrics/index.md).
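For reference, the body that Alertmanager POSTs (and that your flow receives as `{{ trigger.body }}`) follows Alertmanager's documented webhook payload format: a top-level `status`, `receiver`, and an `alerts` array whose entries carry `labels` and `annotations`. A minimal Python sketch of picking the firing alert names out of such a payload (the sample values are illustrative, matching the `HighCPUUsage` rule above):

```python
# Abridged sample of Alertmanager's webhook payload; field names follow
# the documented webhook format, values here are illustrative only.
sample_body = {
    "version": "4",
    "status": "firing",
    "receiver": "kestra-webhook",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighCPUUsage", "severity": "critical"},
            "annotations": {"summary": "High CPU usage on localhost:8081"},
        }
    ],
}

def firing_alert_names(body: dict) -> list:
    """Return the alertname label of every firing alert in the payload."""
    return [
        alert["labels"].get("alertname", "unknown")
        for alert in body.get("alerts", [])
        if alert.get("status") == "firing"
    ]

print(firing_alert_names(sample_body))  # → ['HighCPUUsage']
```

Inside a flow, you would apply the same kind of selection with Pebble expressions on `{{ trigger.body }}` or pass the body into a Python task.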
---

# Push Flows to a Git Repository

URL: https://kestra.io/docs/how-to-guides/pushflows

> Use the PushFlows task to push your Kestra flows to a Git repository directly from the UI, keeping your development and production in sync.

Push your Flows to a Git Repository with the PushFlows Task.
## How it works

The [PushFlows](/plugins/plugin-git/io.kestra.plugin.git.pushflows) task is a powerful integration that allows you to **push your code to Git from the UI while still managing this process entirely in code**!

Kestra unifies the development experience between the UI and code so you can combine the best of both worlds without sacrificing the benefits of version control. You can **build your flows** in a development namespace using all **productivity features of the Kestra UI** (_such as the built-in code editor, autocompletion, syntax validation, documentation, blueprint examples, live-updating topology view, output previews, replays, and execution and revision history_) and then **push them to Git** after you have tested and validated them.

The task pushes one or more flows from a given namespace (and optionally also child namespaces) to any Git-based version control system. Additionally, the `dryRun` property helps you see what files will be added, modified, or deleted without overwriting the files on Git yet.

The following examples cover common patterns for the `PushFlows` task.

## Before you begin

Before you start using the `PushFlows` task, ensure the following prerequisites are in place:

1. A Git repository where you want to push your flows.
2. A Personal Access Token (PAT) for Git authentication.
3. A running Kestra instance, version 0.17.0 or later, with the PAT stored as a [secret](../../06.concepts/04.secret/index.md) within the Kestra instance.

## Using the `dryRun` property

Start by creating a single `hello_world` flow in the `company.team` namespace and pushing it to a Git repository. Initially set the `dryRun` property to `true` to validate the changes before committing them to Git.
```yaml
id: hello_world
namespace: company.team

inputs:
  - id: greeting
    type: STRING
    defaults: kestra

tasks:
  - id: welcome
    type: io.kestra.plugin.core.log.Log
    message: welcome to {{ inputs.greeting }}
```

Here is a system flow that will push the `hello_world` flow to a Git repository:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushFlows
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/flows
    branch: develop
    flows:
      - hello_world
    sourceNamespace: company.team
    targetNamespace: prod
    gitDirectory: flows
    commitMessage: "changes to kestra flows"
    dryRun: true
```

Given that the `dryRun` property is set to `true`, the task will only output modifications without pushing any flows to Git yet:

![git1](./git1.png)

## Pushing a single flow to Git

Set the `dryRun` property to `false` and push the `hello_world` flow to Git:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushFlows
    ...
    dryRun: false
```

You should see the following log message:

![git2.png](./git2.png)

And here is what you should see in the Outputs tab:

![git3.png](./git3.png)

When you click on the commit URL from the logs or from the Outputs tab, you'll be redirected to the commit page on GitHub:

![git4.png](./git4.png)

Now, you can create a pull request and merge the changes to the main branch.
![git5_pr.png](./git5_pr.png)

## Pushing all flows from a single namespace to Git

To push all flows from a given namespace to Git, create two more flows in the `company.team` namespace:

```yaml
id: flow1
namespace: company.team

tasks:
  - id: test
    type: io.kestra.plugin.core.log.Log
    message: this is too easy
```

The `flow2` flow is just a copy of `flow1` with a different flow ID and message:

```yaml
id: flow2
namespace: company.team

tasks:
  - id: test
    type: io.kestra.plugin.core.log.Log
    message: the simplest dev-to-prod workflow ever
```

![git6_all_flows.png](./git6_all_flows.png)

Adjust the system flow to push all flows from the `company.team` namespace to the `develop` branch:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushFlows
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/flows
    branch: develop
    sourceNamespace: company.team
    targetNamespace: prod
    gitDirectory: flows
    commitMessage: "push all development flows to Git and create a PR"
    dryRun: true
```

Setting `dryRun` to `true` shows what files will be added, modified, or deleted based on the Git version without overwriting the files in Git yet:

![git7.png](./git7.png)

Now if you change the `dryRun` property to `false` and run the system flow again, you should see all three flows being pushed to the `flows` directory on the `develop` branch with the exact commit message we specified in the `commitMessage` property:

![git8.png](./git8.png)

## Pushing all flows including child namespaces

Finally, we get to the fun part of pushing all flows from the `company.team` namespace **including all child namespaces**. Kestra will automatically create a subfolder for each child namespace and push the flows there to keep everything organized.

Create two more flows in the `company.team.tutorial` namespace:

1.
`hello_world_1` flow:

```yaml
id: hello_world_1
namespace: company.team.tutorial

inputs:
  - id: greeting
    type: STRING
    defaults: hey

tasks:
  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: hello on {{ inputs.greeting }}
```

2. `hello_world_2` flow:

```yaml
id: hello_world_2
namespace: company.team.tutorial

inputs:
  - id: greeting
    type: STRING
    defaults: hey

tasks:
  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: hello on {{ inputs.greeting }}
```

To include all child namespaces in our Git commit, we only need to add the `includeChildNamespaces` property, set to `true`:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushFlows
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/flows
    branch: develop
    sourceNamespace: company.team
    targetNamespace: prod
    gitDirectory: flows
    commitMessage: "push all flows"
    includeChildNamespaces: true
```

When you run this final system flow, you should see the following output:

![git9.png](./git9.png)

And here is a confirmation that all flows from the `company.team` namespace and its child namespaces have been pushed to the Git repository:

![git10.png](./git10.png)

Here is a simple table to illustrate how flows are mapped to files in the Git repository:

| Flow          | Source namespace      | Git directory path               |
|---------------|-----------------------|----------------------------------|
| hello_world   | company.team          | flows/hello_world.yml            |
| flow1         | company.team          | flows/flow1.yml                  |
| flow2         | company.team          | flows/flow2.yml                  |
| hello_world_1 | company.team.tutorial | flows/tutorial/hello_world_1.yml |
| hello_world_2 | company.team.tutorial | flows/tutorial/hello_world_2.yml |

You can see that each child namespace is represented as a subfolder in the Git repository, and all flows are neatly organized in their respective directories.
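The mapping in the table above can be sketched as a small function: strip the `sourceNamespace` prefix, turn any remaining namespace segments into subfolders under `gitDirectory`, and append the flow ID as a `.yml` file. This is a hypothetical simplification of the behavior described here, not the plugin's actual implementation:

```python
# Hypothetical sketch of the flow-to-path mapping shown in the table above.
# Not the PushFlows plugin's actual code.

def git_path(flow_id: str, namespace: str, source_namespace: str,
             git_directory: str = "flows") -> str:
    """Map a flow to its Git file path: child-namespace segments become subfolders."""
    parts = [git_directory]
    if namespace != source_namespace and namespace.startswith(source_namespace + "."):
        # e.g. company.team.tutorial under company.team -> subfolder "tutorial"
        child = namespace[len(source_namespace) + 1:]
        parts.extend(child.split("."))
    parts.append(f"{flow_id}.yml")
    return "/".join(parts)

print(git_path("hello_world", "company.team", "company.team"))
# → flows/hello_world.yml
print(git_path("hello_world_1", "company.team.tutorial", "company.team"))
# → flows/tutorial/hello_world_1.yml
```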
## Extra notes

- The `flows` property allows you to specify a list of regex strings that declare which flows should be included in the Git commit. By default, all flows from the specified `sourceNamespace` are pushed (and optionally adjusted to match the `targetNamespace` before pushing to Git). If you want to push only the current flow, you can use the `{{ flow.id }}` expression or specify the flow ID explicitly, e.g. `myflow`. You can also use this property to push only specific flows, giving you full flexibility to customize this task to your preferred deployment strategy.
- The `branch` property allows you to specify the branch to which files should be committed and pushed. If the branch doesn't exist yet, it will be created.
- The `commitMessage` property allows you to specify the Git commit message. You can use templating to include dynamic values in your commit message.
- The `gitDirectory` property allows you to specify the directory to which flows should be pushed. If not set, flows are pushed to a Git directory named `_flows`, optionally with subdirectories named after the child namespaces. If you prefer, you can specify an arbitrary path, e.g. `kestra/flows`, to push flows to that specific Git directory.
- If you omit the `targetNamespace`, the `sourceNamespace` is used as the `targetNamespace` by default. The `targetNamespace` is an optional mechanism to help you prepare your development flows to be merged into the production branch/namespace. If you set the `targetNamespace`, the `sourceNamespace` in the source code is overwritten by the `targetNamespace` so that you can sync the flows to production.
- If you try to add the Personal Access Token (PAT) directly in your source code in the `password` property, you will get an error message. This is a safety mechanism to prevent you and your users from accidentally exposing your PAT in the source code.
You should store the PAT as a Kestra Secret, environment variable, namespace variable, or as a SECRET-type input in your flow.

![git11_credential_detected.png](./git11_credential_detected.png)

- Git does not guarantee the order of push operations to a remote repository, which can lead to potential conflicts when multiple users or flows attempt to push changes simultaneously. To minimize the risk of data loss and merge conflicts, it is strongly recommended to use sequential workflows or push changes to separate branches.

---

# Push Namespace Files to a Git Repository

URL: https://kestra.io/docs/how-to-guides/pushnamespacefiles

> Push your namespace files, such as scripts and configuration, from Kestra to a Git repository to maintain version control.

Push files in your namespace to a Git Repository with the PushNamespaceFiles Task.
## How it works

The [PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles) task is a powerful integration that allows you to **push your namespace files to Git from the UI while still managing this process entirely in code**!

Kestra unifies the development experience between the UI and code so you can combine the best of both worlds without sacrificing the benefits of version control. The process is simple: you can **build your flows and files** in a development namespace using all **productivity features of the Kestra UI** (_such as the built-in code editor, autocompletion, syntax validation, documentation, blueprint examples, live-updating topology view, output previews, replays, and execution and revision history_) and then **push them to Git** after you have tested and validated them.

The task pushes one or more files from a given namespace (and optionally also child namespaces) to any Git-based version control system. Additionally, the `dryRun` property helps you see what files will be added, modified, or deleted without overwriting the files on Git yet.

The following examples cover common patterns for the `PushNamespaceFiles` task.

## Before you begin

Before you start using the `PushNamespaceFiles` task, ensure the following prerequisites are in place:

1. A Git repository where you want to push your files.
2. A Personal Access Token (PAT) for Git authentication.
3. A running Kestra instance, version 0.17.0 or later, with the PAT stored as a [secret](../../06.concepts/04.secret/index.md) within the Kestra instance.

## Using the `dryRun` property

Start by creating a single `example.py` file in the `company.team` namespace and pushing it to a Git repository. Initially set the `dryRun` property to `true` to validate changes before committing to Git. You'll need a flow already in the `company.team` namespace to be able to create a new file.
```python
print("Hello, World")
```

Here is a system flow that will push the `example.py` file to a Git repository:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushNamespaceFiles
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/scripts
    branch: dev
    namespace: company.team
    files:
      - "example.py"
    gitDirectory: _files
    commitMessage: "add namespace files"
    dryRun: true
```

Given that the `dryRun` property is set to `true`, the task will only output modifications without pushing any files to Git yet:

![git1](./git1.png)

## Pushing a single file to Git

Set the `dryRun` property to `false` and push the `example.py` file to Git:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushNamespaceFiles
    ...
    dryRun: false
```

You should see the following log message:

![git2.png](./git2.png)

And here is what you should see in the Outputs tab:

![git3.png](./git3.png)

When you click on the commit URL from the logs or from the Outputs tab, you'll be redirected to the commit page on GitHub:

![git4.png](./git4.png)

Now, you can create a pull request and merge the changes to the main branch.
![git5_pr.png](./git5_pr.png)

## Pushing all files from a single namespace to Git

To push all files from a given namespace to Git, create two more files in the `company.team` namespace:

`example.sh` file:

```sh
echo "Hello, World"
```

`example.js` file:

```js
console.log("Hello, World")
```

![git6_all_files.png](./git6_all_files.png)

Adjust the system flow to push all files from the `company.team` namespace to the `dev` branch:

```yaml
id: push_to_git
namespace: system

tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushNamespaceFiles
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    url: https://github.com/git_username/scripts
    branch: dev
    namespace: company.team
    gitDirectory: _files
    commitMessage: "push all namespace files and create a PR"
    dryRun: true
```

Again, we can set the `dryRun` property to `true` to see what files will be added, modified, or deleted based on the Git version without overwriting the files in Git yet:

![git7.png](./git7.png)

Now if you change the `dryRun` property to `false` and run the system flow again, you should see all three files being pushed to the `_files` directory on the `dev` branch with the exact commit message we specified in the `commitMessage` property:

![git8.png](./git8.png)

## Extra notes

- Git does not guarantee the order of push operations to a remote repository, which can lead to potential conflicts when multiple users or flows attempt to push changes simultaneously. To minimize the risk of data loss and merge conflicts, it is strongly recommended to use sequential workflows or push changes to separate branches.

---

# Run Python Inside Your Flows

URL: https://kestra.io/docs/how-to-guides/python

> Run Python scripts in Kestra. Install pip packages at runtime, execute code in Docker containers, and pass data between tasks using inputs and outputs.

Run Python code directly in your flows and generate outputs.
You can execute Python code in a flow by either writing your Python inline or by executing a `.py` file. You can also get outputs and metrics from your Python code.
In this example, the flow installs the required pip packages, makes an API request to fetch data, and uses the Kestra Python library to generate outputs and metrics using this data.

## Managing Dependencies

Managing Python dependencies can be frustrating. There are three ways you can manage your dependencies in Kestra:

- Install with pip using `beforeCommands`
- Set Container Image with Docker Task Runner
- Build Docker Image and set it with Docker Task Runner

For more information, see the [dedicated guide](../python-dependencies/index.md).

## Script Task

If you want to write a short amount of Python to perform a task, you can use the `io.kestra.plugin.scripts.python.Script` type to write it directly in your flow configuration. This allows you to keep everything in one place.

```yaml
id: python_scripts
namespace: company.team
description: This flow will install the pip package in a Docker container, and use kestra's Python library to generate outputs (number of downloads of the Kestra Docker image) and metrics (duration of the script).

tasks:
  - id: outputs_metrics
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    dependencies:
      - requests
    script: |
      import requests

      def get_docker_image_downloads(image_name: str = "kestra/kestra"):
          """Queries the Docker Hub API to get the number of downloads for a specific Docker image."""
          url = f"https://hub.docker.com/v2/repositories/{image_name}/"
          response = requests.get(url)
          data = response.json()
          downloads = data.get('pull_count', 'Not available')
          return downloads

      downloads = get_docker_image_downloads()
```

You can also include expressions directly in your Python code.
In this example, an input is used in the Python method:

```yaml
id: python_scripts_expression_input
namespace: company.team
description: This flow will install the pip package in a Docker container, and use kestra's Python library to generate outputs (number of downloads of the Kestra Docker image) and metrics (duration of the script).

inputs:
  - id: image_name
    type: STRING
    defaults: kestra/kestra

tasks:
  - id: outputs_metrics
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    dependencies:
      - requests
    script: |
      import requests

      def get_docker_image_downloads():
          """Queries the Docker Hub API to get the number of downloads for a specific Docker image."""
          url = f"https://hub.docker.com/v2/repositories/{{ inputs.image_name }}/"
          response = requests.get(url)
          data = response.json()
          downloads = data.get('pull_count', 'Not available')
          return downloads

      downloads = get_docker_image_downloads()
```

## Commands Task

If you would prefer to put your Python code in a `.py` file (e.g. your code is much longer or spread across multiple files), you can run the previous example using the `io.kestra.plugin.scripts.python.Commands` type:

```yaml
id: python_commands
namespace: company.team
description: This flow will install the pip package in a Docker container, and use kestra's Python library to generate outputs (number of downloads of the Kestra Docker image) and metrics (duration of the script).

tasks:
  - id: outputs_metrics
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    dependencies:
      - requests
    commands:
      - python outputs_metrics.py
```

You'll need to add your Python code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it.
You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file.

You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.commands).

## Handling Outputs
If you want to get a variable or file from your Python code, you can use an [output](../../05.workflow-components/06.outputs/index.md).

Install the [`kestra` Python module](https://pypi.org/project/kestra/) to pass your variables to Kestra. This Kestra Python client provides functionality to interact with the Kestra server for sending metrics, outputs, and logs, and for executing/polling flows. For example, the Kestra ION extra (`kestra[ion]`) provides a method to read files and convert them to a list of dictionaries, which can be converted into a dataframe in Python (using any Python library supporting dataframes, e.g., Pandas or Polars). See the [README](https://github.com/kestra-io/libs/blob/main/python/README.md) for more details and examples.

```bash
pip install kestra
```

### Variable Output

You'll need to use the `Kestra` class to pass your variables to Kestra as outputs. Using the `outputs` method, you can pass a dictionary of variables where the `key` is the name of the output you'll reference in Kestra. Using the same example as above, we can pass the number of downloads as an output.
```python
from kestra import Kestra
import requests

def get_docker_image_downloads(image_name: str = "kestra/kestra"):
    """Queries the Docker Hub API to get the number of downloads for a specific Docker image."""
    url = f"https://hub.docker.com/v2/repositories/{image_name}/"
    response = requests.get(url)
    data = response.json()
    downloads = data.get('pull_count', 'Not available')
    return downloads

downloads = get_docker_image_downloads()

outputs = {
    'downloads': downloads
}
Kestra.outputs(outputs)
```

Once your Python file has executed, you'll be able to access the outputs in later tasks as seen below:

```yaml
id: python_outputs
namespace: company.team

tasks:
  - id: outputs_metrics
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    dependencies:
      - requests
      - kestra
    commands:
      - python outputs_metrics.py

  - id: log_downloads
    type: io.kestra.plugin.core.log.Log
    message: "Number of downloads: {{ outputs.outputs_metrics.vars.downloads }}"
```

_This example works for both `io.kestra.plugin.scripts.python.Script` and `io.kestra.plugin.scripts.python.Commands`._

### File Output

Inside of your Python code, write a file to the system. You'll need to add the `outputFiles` property to your flow and list the file you're trying to access. In this case, we want to access `downloads.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes a `.txt` file containing the number of downloads, similar to the output we used earlier.
We can then read the content of the file using the syntax `{{ outputs.{task_id}.outputFiles['downloads.txt'] }}`:

```yaml
id: python_output_files
namespace: company.team

tasks:
  - id: outputs_metrics
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    dependencies:
      - requests
    outputFiles:
      - downloads.txt
    script: |
      import requests

      def get_docker_image_downloads(image_name: str = "kestra/kestra"):
          """Queries the Docker Hub API to get the number of downloads for a specific Docker image."""
          url = f"https://hub.docker.com/v2/repositories/{image_name}/"
          response = requests.get(url)
          data = response.json()
          downloads = data.get('pull_count', 'Not available')
          return downloads

      downloads = get_docker_image_downloads()

      # Generate a file with the output
      f = open("downloads.txt", "a")
      f.write(str(downloads))
      f.close()

  - id: log_downloads
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - cat {{ outputs.outputs_metrics.outputFiles['downloads.txt'] }}
```

_This example works for both `io.kestra.plugin.scripts.python.Script` and `io.kestra.plugin.scripts.python.Commands`._

## Capture Logs
If your Python code needs to log something to the console, use the `Kestra.logger()` method from the [Kestra pip package](https://github.com/kestra-io/libs) to instantiate a `logger` object. This logger is configured to correctly capture all Python log levels and send them to the Kestra backend.

```yaml
id: python_logs
namespace: company.team

tasks:
  - id: python_logger
    type: io.kestra.plugin.scripts.python.Script
    allowFailure: true
    script: |
      import time
      from kestra import Kestra

      logger = Kestra.logger()

      logger.debug("DEBUG is used for diagnostic info.")
      time.sleep(0.5)
      logger.info("INFO confirms normal operation.")
      time.sleep(0.5)
      logger.warning("WARNING signals something unexpected.")
      time.sleep(0.5)
      logger.error("ERROR indicates a serious issue.")
      time.sleep(0.5)
      logger.critical("CRITICAL means a severe failure.")
```

When we execute the above example, we can see Kestra correctly captures the log levels in the Logs view:

![logs](./logs.png)

You can read more about the Python Script task in the [Plugin documentation](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.script).

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your Python code. In this example, we can use the `time` module to time the execution of the function and then pass this to Kestra so it can be viewed in the Metrics tab. You don't need to modify your flow for this to work.
```python
from kestra import Kestra
import requests
import time

start = time.perf_counter()

def get_docker_image_downloads(image_name: str = "kestra/kestra"):
    """Queries the Docker Hub API to get the number of downloads for a specific Docker image."""
    url = f"https://hub.docker.com/v2/repositories/{image_name}/"
    response = requests.get(url)
    data = response.json()
    downloads = data.get('pull_count', 'Not available')
    return downloads

downloads = get_docker_image_downloads()
end = time.perf_counter()

outputs = {
    'downloads': downloads
}
Kestra.outputs(outputs)
Kestra.timer('duration', end - start)
```

Once this has executed, `duration` will be viewable under **Metrics**.

![metrics](./metrics.png)

## Execute Flows in Python

Inside of your Python code, you can execute flows. This is useful if you want to manage your orchestration directly in Python rather than using the Kestra flow editor. However, using [Subflows](../../05.workflow-components/10.subflows/index.md) to execute flows from other flows provides a more integrated experience.

You can trigger a flow execution by calling the `execute()` method. Here is an example for the same `python_scripts` flow in the namespace `example` as above:

```python
import os

from kestra import Flow

os.environ["KESTRA_HOSTNAME"] = "http://host.docker.internal:8080" # Set this when executing this Python code inside a Kestra flow

flow = Flow()
flow.execute('example', 'python_scripts', {'greeting': 'hello from Python'})
```

Read more about it on the [execution page](../../05.workflow-components/03.execution/index.md).

## Automate Python with Triggers

You can combine your Python code with a trigger to automatically execute your code. There are a few key ways you can automate it:

- Run on a schedule
- Run when a webhook is called
- Run when a file is available in a data lake or storage bucket
### Run on a schedule

You can use the [Schedule Trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md) to run your flow on a routine. You can pass the date that the trigger executed to your code through an expression. This is useful when using backfills, as it allows you to pass the date the execution was supposed to run from the schedule directly to your code, rather than the current time. For example, this helps when fetching a daily report from a specific date in the past:

```yaml
id: schedule_code
namespace: company.team

tasks:
  - id: python
    type: io.kestra.plugin.scripts.python.Script
    outputFiles:
      - "*.txt"
    script: |
      date = f"{{ trigger.date | date("yyyy-MM-dd") }}"
      report_content = f"Daily Report - {date}\nSales: $5000\nUsers: 200"

      with open(f"daily_report_{date}.txt", "w") as file:
          file.write(report_content)

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 8 * * *"
```

### Run when a webhook is called

You can use the [Webhook Trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) to run your flow when a webhook is called. You can also call the webhook with a body, which can be passed to your code through an expression or environment variable:

```yaml
id: python_webhook
namespace: company.team

tasks:
  - id: python
    type: io.kestra.plugin.scripts.python.Script
    script: |
      response = {{ trigger.body ?? '' }}
      print(f"{response['first_name']} {response['last_name']}")

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: abcdefg
```

### Run when a file is available in a data lake or storage bucket

You can use a [Polling Trigger](../../05.workflow-components/07.triggers/04.polling-trigger/index.md), such as the [S3 Trigger](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.trigger), to run your flow when a new file arrives in an S3 bucket.
This is useful if you have a data pipeline that can start once the data is available and begin transforming it with Python:

```yaml
id: s3_trigger
namespace: company.team

tasks:
  - id: process_data
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:slim
    dependencies:
      - pandas
      - kestra
    inputFiles:
      input.csv: "{{ read(trigger.objects[0].uri) }}"
    outputFiles:
      - data.csv
    script: |
      import os
      import pandas as pd
      from kestra import Kestra

      df = pd.read_csv('input.csv')
      df['discounted_total'] = df['total'] * 0.9
      df.to_csv('data.csv')

triggers:
  - id: watch
    type: io.kestra.plugin.aws.s3.Trigger
    interval: "PT1S"
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-west-2"
    bucket: "kestra-python-s3"
    action: DELETE
    filter: FILES
    maxKeys: 1
```

## Execute GraalVM Task

Kestra also supports GraalVM integration, allowing you to execute Python code directly on the JVM, with the potential for performance improvements. There are currently two tasks:

- [Eval](/plugins/plugin-graalvm/python-graalvm-tasks-on-graalvm/io.kestra.plugin.graalvm.python.eval)
- [FileTransform](/plugins/plugin-graalvm/python-graalvm-tasks-on-graalvm/io.kestra.plugin.graalvm.python.filetransform)

In this example, the `Eval` task is used to manipulate data from a previous task. GraalVM makes it easy to generate outputs from variables in Python using the `outputs` property. This is useful if you want to manipulate data and pass the new format to another task.
```yaml id: parse_json_data namespace: company.team tasks: - id: download type: io.kestra.plugin.core.http.Download uri: http://xkcd.com/info.0.json - id: graal type: io.kestra.plugin.graalvm.python.Eval outputs: - data script: | data = {{ read(outputs.download.uri) }} data["next_month"] = int(data["month"]) + 1 ``` --- # Manage Python Dependencies in Kestra URL: https://kestra.io/docs/how-to-guides/python-dependencies > Learn various ways to manage Python dependencies in Kestra, including using pip, virtual environments, caching, and custom Docker images. Manage your Python dependencies in Kestra.
Managing Python dependencies can be frustrating. There are several ways you can manage your dependencies in Kestra.

## Install with pip using `beforeCommands`

In your `Script` and `Commands` tasks, you can add a list of commands to run before your code under the `beforeCommands` property. This works well for installing packages with `pip` or setting up a virtual environment:

```yaml
id: beforecommands
namespace: company.team

tasks:
  - id: code
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    beforeCommands:
      - python3 -m venv .venv
      - . .venv/bin/activate
      - pip install pandas kestra
    script: |
      import pandas as pd
      from kestra import Kestra

      df = pd.read_csv('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv')
      total_revenue = df['total'].sum()
      Kestra.outputs({"total": total_revenue})
```

By using a [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md), we speed up execution because the task doesn't need to pull a container image to run in a container.

## Cache dependencies

:::badge{version=">=0.23" editions="OSS,EE"}
:::

Since Kestra 0.23, you can also use the `dependencies` property, which allows you to cache Python dependencies across multiple executions. With this feature, Python dependencies are cached and reused across executions of different flows. The [uv package manager](../python-uv/index.md) installs the dependencies on the [worker](../../08.architecture/02.server-components/index.md#worker) under the hood. These cached dependencies are then available for subsequent executions, leading to performance improvements.

This method is recommended for smaller tasks that require only a few dependencies which you don't want to install each time. For more complex workflows, you can continue to use `beforeCommands`.

The relevant properties are `dependencies`, which lists the dependencies (e.g., pandas), and `dependencyCacheEnabled`, which, when set to `true`, enables caching of dependencies across tasks. In the example below, the first execution installs the dependencies, but each subsequent execution of this flow, or any other flow relying on these packages, will show improved performance.

```yaml
id: python_dependencies
namespace: company.team

tasks:
  - id: python
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.13-slim
    dependencies:
      - pandas
      - kestra
    script: |
      from kestra import Kestra
      import pandas as pd

      data = {
          'Name': ['Alice', 'Bob', 'Charlie'],
          'Age': [25, 30, 35]
      }
      df = pd.DataFrame(data)

      print(df)
      print("Average age:", df['Age'].mean())

      Kestra.outputs({"average_age": df['Age'].mean()})
```

## Set Container Image with Docker Task Runner

If you would prefer to run your task in a container, you can set the Task Runner to Docker and specify a container image with the appropriate dependencies bundled in. Our previous example used `pandas`, which is bundled into `ghcr.io/kestra-io/pydata:latest`, one of the ready-to-go images available on our [GitHub](https://github.com/orgs/kestra-io/packages?repo_name=examples).

```yaml
id: container_image
namespace: company.team

tasks:
  - id: code
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/pydata:latest
    script: |
      import pandas as pd
      from kestra import Kestra

      df = pd.read_csv('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv')
      total_revenue = df['total'].sum()
      Kestra.outputs({"total": total_revenue})
```

## Build Docker Image and set it with Docker Task Runner

If an image with the required dependencies isn't available, build your own using the `docker.Build` task. Specify a Dockerfile that uses a `python:3.10` image as the base and installs the required dependencies on top. The example below uses `pip install` to install both `kestra` and `pandas`. Once the image is built, reference it in an expression in the Python task:

```yaml
id: container_image_build
namespace: company.team

tasks:
  - id: build
    type: io.kestra.plugin.docker.Build
    dockerfile: |
      FROM python:3.10
      RUN pip install --upgrade pip
      RUN pip install --no-cache-dir kestra pandas
    tags:
      - python_image

  - id: code
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      pullPolicy: NEVER
    containerImage: "{{ outputs.build.imageId }}"
    script: |
      import pandas as pd
      from kestra import Kestra

      df = pd.read_csv('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv')
      total_revenue = df['total'].sum()
      Kestra.outputs({"total": total_revenue})
```

## Build Custom Packages

You can also build packages directly in Kestra and then use that package across different flows in the same namespace. This works for packaged archives such as zip files and wheels; here's an example that generates a `.tar.gz` package:

```yaml
id: build_tar_gz
namespace: company

tasks:
  - id: sync_code_to_kestra
    type: io.kestra.plugin.git.SyncNamespaceFiles
    disabled: true # already synced files
    namespace: "{{ flow.namespace }}"
    gitDirectory: .
    url: https://github.com/anna-geller/python-in-kestra
    branch: main
    username: anna-geller
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"

  - id: build
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
    beforeCommands:
      - pip install build
    commands:
      - python -m build
    outputFiles:
      - "**/*.tar.gz"

  - id: upload
    type: io.kestra.plugin.core.namespace.UploadFiles
    namespace: company.sales
    filesMap:
      "etl-0.1.0.tar.gz": "{{ outputs.build.outputFiles['dist/etl-0.1.0.tar.gz'] }}"
```

The package can be used in a separate workflow:

```yaml
id: install_from_zip
namespace: company.sales

inputs:
  - id: date
    type: STRING
    defaults: 12/24/2024
    displayName: Delivery Date

tasks:
  - id: python
    type: io.kestra.plugin.scripts.python.Script
    namespaceFiles:
      enabled: true
    beforeCommands:
      - pip install etl-0.1.0.tar.gz
    script: |
      import etl.utils as etl

      out = etl.standardize_date_format("{{ inputs.date }}")
      print(out)
```

---

# Manage Python Dependencies with uv in Kestra

URL: https://kestra.io/docs/how-to-guides/python-uv

> Use uv in Kestra to manage Python dependencies and virtual environments for faster and more reliable script execution.

Manage your Python dependencies in Kestra using `uv`.
`uv` is a new Python package and project manager designed to be extremely fast. Written in Rust, it aims to fix some of the pitfalls of pip while combining multiple Python dependency management tools, such as `virtualenv` and `poetry`, into one unified tool. `uv` can be used in Kestra to install dependencies as well as manage virtual environments in combination with the [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md).

## Install Dependencies

By default, `uv` is installed in our default Python image `kestrapy`, so any time you use a Python `Commands` or `Script` task with the [Docker Task Runner](../../task-runners/04.types/02.docker-task-runner/index.md), it is preinstalled. If you're using a different image, or you'd prefer to use the [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md), you can install `uv` using `beforeCommands` with `pip install uv`.

```yaml
id: python_example
namespace: company.team

tasks:
  - id: code_process
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    beforeCommands:
      - pip install uv 2> /dev/null
    script: |
      print("Hello, World!")
```

By default, `uv` looks for a virtual environment to install dependencies into, but this is not required when using the [Docker Task Runner](../../task-runners/04.types/02.docker-task-runner/index.md), as the container already provides the isolation we would get from a virtual environment. To override this behavior, add the `--system` flag to the install command.

```yaml
id: python_example
namespace: company.team

tasks:
  - id: code
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    beforeCommands:
      - uv pip install pandas --system 2> /dev/null
    script: |
      import pandas as pd
      from kestra import Kestra

      df = pd.read_csv('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv')
      total_revenue = df['total'].sum()
      Kestra.outputs({"total": total_revenue})
```

If you're using the [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md), you can use `uv` to create a virtual environment with `uv venv`. Once that completes, you can run `uv pip install`, and it will install the dependencies into that virtual environment without you needing to activate it first.

```yaml
id: python_example
namespace: company.team

tasks:
  - id: code_process
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    beforeCommands:
      - pip install uv 2> /dev/null
      - uv venv 2> /dev/null
      - uv pip install pandas kestra 2> /dev/null
      - . .venv/bin/activate
    script: |
      import pandas as pd
      from kestra import Kestra

      df = pd.read_csv('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv')
      total_revenue = df['total'].sum()
      Kestra.outputs({"total": total_revenue})
```

## Install with a custom Docker image

If you have multiple workflows using `uv`, you can install it on the Kestra server by creating a custom Docker image for Kestra. Here's an example of a Dockerfile based on the Kestra image that installs `uv` on top of it:

```dockerfile
FROM kestra/kestra:latest
USER root
RUN pip install uv
CMD ["server", "standalone"]
```

Learn more about [installing pip package dependencies at server startup](../../14.best-practices/4.managing-pip-dependencies/index.md#installing-pip-package-dependencies-at-server-startup).
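If you keep your dependencies in a `requirements.txt` file, `uv pip` can install from it as well. Below is a minimal sketch of that pattern; the `requirements.txt` content is provided inline through `inputFiles` purely for illustration, and in practice you would typically sync it as a namespace file instead:

```yaml
id: uv_requirements
namespace: company.team

tasks:
  - id: code
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    inputFiles:
      # Illustrative requirements file; replace with your own pinned versions
      requirements.txt: |
        pandas
        kestra
    beforeCommands:
      - pip install uv 2> /dev/null
      - uv venv 2> /dev/null
      - uv pip install -r requirements.txt 2> /dev/null
      - . .venv/bin/activate
    script: |
      import pandas as pd
      print(pd.__version__)
```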
---

# Run R Inside Your Flows

URL: https://kestra.io/docs/how-to-guides/r

> Run R scripts in Kestra for statistical computing and data analysis. Use Docker to manage package dependencies and capture outputs for downstream tasks.

Run R code directly in your flows and generate outputs.

R is essential for statistical analysis, visualization, and data manipulation. With Kestra, you can effortlessly automate data ingestion, conduct complex statistical analysis, and handle real-time data processing. Kestra's robust orchestration capabilities ensure that your R scripts run smoothly and efficiently, streamlining your data-driven projects.

This guide walks you through how to get R running in a workflow, how to manage input and output files, and how to pass outputs and metrics back to Kestra for use in later tasks.

Kestra has an official plugin for R that lets you execute R code in a flow, either by writing your R code inline or by executing an `.R` file. You can get outputs and metrics from your R code too.

## Scripts

If you want to write a short amount of R code to perform a task, you can use the `io.kestra.plugin.scripts.r.Script` type to write it directly in your flow. This allows you to keep everything in one place.

```yaml
id: r_script
namespace: company.team
description: This flow runs the R script.

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: r_script_task
    type: io.kestra.plugin.scripts.r.Script
    script: |
      print("The current execution is {{ execution.id }}")

      # Read the file downloaded in `http_download` task
      data <- read.csv("{{ outputs.http_download.uri }}", header=TRUE)
      print(data)
```

You can read more about the Script type in the [Plugin documentation](/plugins/plugin-script-r/io.kestra.plugin.scripts.r.script).

## Commands

If you would prefer to put your R code in an `.R` file (e.g. your code is much longer or spread across multiple files), you can run the previous example using the `io.kestra.plugin.scripts.r.Commands` type:

```yaml
id: r_commands
namespace: company.team

tasks:
  - id: run_r
    type: io.kestra.plugin.scripts.r.Commands
    namespaceFiles:
      enabled: true
    commands:
      - Rscript main.R
```

The contents of the `main.R` file can be:

```r
print("Hello World")
```

You'll need to add your R code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file.

You can also have the R code written inline:

```yaml
id: r_commands
namespace: company.team

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: run_r
    type: io.kestra.plugin.scripts.r.Commands
    inputFiles:
      orders.csv: "{{ read(outputs.http_download.uri) }}"
      main.R: |
        print("The current execution is {{ execution.id }}")

        # Read the file
        data <- read.csv("orders.csv", header=TRUE)
        print(data)
    commands:
      - Rscript main.R
```

You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-r/io.kestra.plugin.scripts.r.commands).

## Handling Outputs

If you want to get a variable or file from your R script, you can use an [output](../../05.workflow-components/06.outputs/index.md).

### Variable Output

You can get JSON outputs from the R commands / script using the `::{}::` pattern. Here is an example:

```yaml
id: r_outputs
namespace: company.team
description: This flow runs the R script, and outputs the variable.

tasks:
  - id: r_outputs_task
    type: io.kestra.plugin.scripts.r.Script
    script: |
      cat('::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::')
```

All the output variables can be viewed in the Outputs tab of the execution.

![r_outputs](./outputs.png)

You can refer to the outputs in another task as shown in the example below:

```yaml
id: r_outputs
namespace: company.team
description: This flow runs the R script, and outputs the variable.

tasks:
  - id: r_outputs_task
    type: io.kestra.plugin.scripts.r.Script
    script: |
      cat('::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::')

  - id: return
    type: io.kestra.plugin.core.debug.Return
    format: '{{ outputs.r_outputs_task.vars.test }}'
```

_This example works for both `io.kestra.plugin.scripts.r.Script` and `io.kestra.plugin.scripts.r.Commands`._

### File Output

Inside your R script, write a file to the system. You'll need to add the `outputFiles` property to your flow and list the files you want to output. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes an `output.txt` file containing the "Hello World" text. We can then refer to the file using the syntax `{{ outputs.{task_id}.outputFiles[''] }}`, and read the contents of the file using the `read()` function.

```yaml
id: r_output_file
namespace: company.team
description: This flow runs the R script to output a file.

tasks:
  - id: r_outputs_task
    type: io.kestra.plugin.scripts.r.Script
    outputFiles:
      - output.txt
    script: |
      writeLines("Hello World", "output.txt")

  - id: log_output
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.r_outputs_task.outputFiles['output.txt']) }}"
```

_This example works for both `io.kestra.plugin.scripts.r.Script` and `io.kestra.plugin.scripts.r.Commands`._

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your R script. Metrics use the same `::{}::` pattern as outputs. This example demonstrates both the counter and timer metrics.

```yaml
id: r_metrics
namespace: company.team
description: This flow runs the R script, and puts out the metrics.

tasks:
  - id: r_metrics_task
    type: io.kestra.plugin.scripts.r.Script
    script: |
      print('There are 20 products in the cart')
      cat('::{"outputs":{"productCount":20}}::\n')
      cat('::{"metrics":[{"name":"productCount","type":"counter","value":20}]}::\n')
      cat('::{"metrics":[{"name":"purchaseTime","type":"timer","value":32.44}]}::\n')
```

Once this has executed, both metrics can be viewed under **Metrics**.

![metrics](./metrics.png)

---

# Realtime Triggers in Kestra: Kafka, SQS, Pub/Sub

URL: https://kestra.io/docs/how-to-guides/realtime-triggers

> React to events instantly with Kestra's Realtime Triggers for Kafka, Pulsar, AWS SQS, GCP Pub/Sub, and Azure Event Hubs.

React to events as they happen with millisecond latency.

As soon as you add a Realtime Trigger to your workflow, Kestra starts an always-on thread that listens to the external system for new events. When a new event occurs, Kestra starts a workflow execution to process the event. The following examples show how to implement Realtime Triggers for common messaging systems.

## Apache Kafka

To set up Apache Kafka locally, follow the instructions in the [official documentation](https://kafka.apache.org/quickstart).
Once Apache Kafka is installed, you can create the `logs` topic and start producing data into it using the following commands:

```bash
## Create topic
$ bin/kafka-topics.sh --create --topic logs --bootstrap-server localhost:9092

## Produce data into Kafka topic
$ bin/kafka-console-producer.sh --topic logs --bootstrap-server localhost:9092
> Hello World
```

You can use the Apache Kafka [RealtimeTrigger](/plugins/plugin-kafka/io.kestra.plugin.kafka.realtimetrigger) in the Kestra flow as follows:

```yaml
id: kafka
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.value }}"

triggers:
  - id: realtime_trigger
    type: io.kestra.plugin.kafka.RealtimeTrigger
    topic: logs
    properties:
      bootstrap.servers: localhost:9092
    serdeProperties:
      valueDeserializer: STRING
    groupId: kestraConsumerGroup
```

When any message is pushed into the `logs` Kafka topic, this flow is triggered immediately.

## Apache Pulsar

To set up Apache Pulsar locally, you can install the [standalone cluster](https://pulsar.apache.org/docs/next/getting-started-standalone/) or the [Docker cluster](https://pulsar.apache.org/docs/next/getting-started-docker/). For the Docker cluster, you can run the `pulsar-admin` commands from the Apache Pulsar Docker container.

Run the following commands to create the topic and produce data to it:

1) Set up a tenant: `bin/pulsar-admin tenants create apache`
2) Create a namespace: `bin/pulsar-admin namespaces create apache/pulsar`
3) Create a topic: `bin/pulsar-admin topics create-partitioned-topic apache/pulsar/logs -p 4`
4) Produce data to the topic: `bin/pulsar-client produce apache/pulsar/logs -m '--Hello World--' -n 1`

You can use the Apache Pulsar [RealtimeTrigger](/plugins/plugin-pulsar/io.kestra.plugin.pulsar.realtimetrigger) in the Kestra flow as follows:

```yaml
id: pulsar
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.value }}"

triggers:
  - id: realtime_trigger
    type: io.kestra.plugin.pulsar.RealtimeTrigger
    topic: apache/pulsar/logs
    uri: pulsar://localhost:26650
    subscriptionName: kestra_trigger_sub
```

When any message is pushed into the `apache/pulsar/logs` Pulsar topic, this flow is triggered immediately.

## AWS SQS

We will first create the SQS queue from the AWS Console. You can also use the AWS CLI for this purpose. This is how you can create the SQS queue from the console:

![sqs_create_queue](./sqs_create_queue.png)

You only need to enter the Queue name. The rest of the configuration can be kept as-is; click on "Create Queue" at the bottom of the page. You can now send messages to this queue by clicking on the "Send and receive messages" button at the top of the page.

![sqs_send_and_receive_messages](./sqs_send_and_receive_messages.png)

On the Send and receive messages page, enter the Message body under the Send message section, and click on the "Send message" button to send the message to the queue.

![sqs_send_message](./sqs_send_message.png)

You can use the AWS SQS [RealtimeTrigger](/plugins/plugin-aws/sqs/io.kestra.plugin.aws.sqs.realtimetrigger) in the Kestra flow as follows:

```yaml
id: aws-sqs
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: "realtime_trigger"
    type: io.kestra.plugin.aws.sqs.RealtimeTrigger
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-central-1"
    queueUrl: "https://sqs.eu-central-1.amazonaws.com/000000000000/logs"
```

When any message is pushed into the `logs` SQS queue, this flow is triggered immediately.

## GCP Pub/Sub

We will first create the Pub/Sub topic from the GCP console. For this, click on the "Create topic" button on the GCP Pub/Sub console. On the Create topic page, enter the topic name `logs` in the Topic ID text box and leave the rest of the settings as default. Ensure the "Add a default subscription" checkbox is ticked. Click on the "CREATE" button at the bottom. This creates the `logs` Pub/Sub topic with the default subscription `logs-sub`.

![pubsub_create_topic](./pubsub_create_topic.png)

![pubsub_navigate_to_messages](./pubsub_navigate_to_messages.png)

Navigate to the "MESSAGES" tab. On this tab, click on the "PUBLISH MESSAGE" button.

![pubsub_publish_message_button](./pubsub_publish_message_button.png)

On the Publish message popup, enter the message you would like to publish to the topic, and click on the "PUBLISH" button at the bottom of the page. This publishes the message to the Pub/Sub topic.

![pubsub_publish_message](./pubsub_publish_message.png)

You can use the GCP Pub/Sub [RealtimeTrigger](/plugins/plugin-gcp/pubsub/io.kestra.plugin.gcp.pubsub.realtimetrigger) in the Kestra flow as follows:

```yaml
id: gcp-pubsub
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: trigger
    type: io.kestra.plugin.gcp.pubsub.RealtimeTrigger
    projectId: test-project-id
    topic: logs
    subscription: logs-sub
```

When any message is published to the `logs` Pub/Sub topic, this flow is triggered immediately.

## Azure Event Hubs

Create an Event Hub and a container for checkpoint storage:

1. Go to [Event Hubs](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.EventHub%2Fnamespaces) in the Azure portal.
2. Click on "Create" to create an Event Hubs namespace.
3. On the Create Namespace page, choose an appropriate Subscription and Resource Group.
4. Enter an appropriate Namespace name, Location, Pricing tier, and Throughput units.
5. Click on "Review + Create". Once the validation is successful, click on "Create".
6. Once the Event Hubs namespace is created, click on the namespace.
7. On that namespace's page, click on the "+ Event Hub" button to create an Event Hub.
8. Enter an appropriate Name for the Event Hub. You can change the remaining settings as per your requirements.
9. Click on "Review + Create". Once the validation is successful, click on "Create".
10. On the Event Hubs namespace page, you can now see the newly created Event Hub.
11. On the namespace page, click on "Shared access policies" in the left menu bar.
12. Click on "RootManageSharedAccessKey".
13. In the popup that appears, copy the "Connection string – primary key" to be used later in the Kestra flow.

With this, the Event Hub is created.

![event_hubs_create_namespace_1](./event_hubs_create_namespace_1.png)

![event_hubs_create_namespace_2](./event_hubs_create_namespace_2.png)

![event_hubs_create_namespace_3](./event_hubs_create_namespace_3.png)

![event_hubs_create_namespace_4](./event_hubs_create_namespace_4.png)

![event_hubs_create_event_hub_1](./event_hubs_create_event_hub_1.png)

![event_hubs_create_event_hub_2](./event_hubs_create_event_hub_2.png)

![event_hubs_create_event_hub_3](./event_hubs_create_event_hub_3.png)

![event_hubs_create_event_hub_4](./event_hubs_create_event_hub_4.png)

![event_hubs_create_event_hub_5](./event_hubs_create_event_hub_5.png)

![event_hubs_create_event_hub_6](./event_hubs_create_event_hub_6.png)

14. Create the container. Go to [Storage accounts](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.Storage%2FStorageAccounts).
15. Click on "Create storage account".
16. On the "Create storage account" page, choose an appropriate Subscription and Resource Group.
17. Enter an appropriate Storage account name, Region, Performance, and Redundancy.
18. Click on "Review + Create". Once the validation is successful, click on "Create".
19. Once the storage account is created, click on the storage account name.
20. On the storage account page, navigate in the left menu bar to "Data storage", and then to "Containers".
21. Click on the "+ Container" button to create a container.
22. Enter an appropriate name for the container, and click "Create".
23. Once the container is created, navigate to "Access keys" under "Security + networking" in the left menu bar.
24. For the key, click on the "Show" button for the connection string and note it down to be used later in the Kestra flow.

![azure_create_storage_account_1](./azure_create_storage_account_1.png)

![azure_create_storage_account_2](./azure_create_storage_account_2.png)

![azure_create_storage_account_3](./azure_create_storage_account_3.png)

![azure_create_container_1](./azure_create_container_1.png)

![azure_create_container_2](./azure_create_container_2.png)

![azure_create_container_3](./azure_create_container_3.png)

![azure_create_container_4](./azure_create_container_4.png)

![azure_create_container_5](./azure_create_container_5.png)

Now that all the setup is ready in Azure, start the Kestra cluster with the environment variables `SECRET_EVENTHUBS_CONNECTION` and `SECRET_BLOB_CONNECTION` containing the base64-encoded values of the Event Hubs connection string and the Blob connection string, respectively.

The Kestra flow with the Azure Event Hubs Realtime Trigger will look as follows:

```yaml
id: TriggerFromAzureEventHubs
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello there! I received {{ trigger.body }} from Azure EventHubs!

triggers:
  - id: readFromEventHubs
    type: io.kestra.plugin.azure.eventhubs.RealtimeTrigger
    eventHubName: kestra
    namespace: kestra-namespace
    connectionString: "{{ secret('EVENTHUBS_CONNECTION') }}"
    bodyDeserializer: JSON
    consumerGroup: "$Default"
    checkpointStoreProperties:
      containerName: kestralogs
      connectionString: "{{ secret('BLOB_CONNECTION') }}"
```

On the Event Hub's page, click on "Generate Data" under "Features" in the left menu bar. Choose an appropriate Content-Type from the drop-down, and enter the payload you want to push to the Event Hub. When you click on the "Send" button at the bottom, the payload is pushed to the Event Hub. The flow is triggered immediately, and you can see the corresponding execution in Kestra.

![event_hubs_generate_data_1](./event_hubs_generate_data_1.png)

![event_hubs_generate_data_2](./event_hubs_generate_data_2.png)

Realtime triggers let you react to events in real time to orchestrate business-critical processes.

---

# Revision History and Rollback in Kestra

URL: https://kestra.io/docs/how-to-guides/rollback-and-revision-history

> Use Kestra's revision history to track changes, compare flow versions, and easily rollback to previous configurations.

Use revision history to roll back to an older version of a flow.
Kestra stores revision history, which allows you to roll back to any older version of a flow. Navigate to the "Revisions" tab on the flow's page to view older versions. By default, the page opens a comparison of the current version of the flow against the previous version.

![revision_comparison](./revision_comparison.png)

You can compare any two versions by choosing the appropriate revision number from the drop-down on each side, allowing you to see the changes made between the two selected versions.

![revision_dropdown](./revision_dropdown.png)

There is a `Restore` button allowing you to roll back to the selected version. The `Restore` button is disabled for the current live version, as there is nothing to restore.

![restore_option](./restore_option.png)

---

# Run Ruby Inside Your Flows

URL: https://kestra.io/docs/how-to-guides/ruby

> Execute Ruby scripts in Kestra. Automate tasks with Ruby code, install gems at runtime, and pass outputs to downstream tasks for flexible scripting.

Run Ruby code directly in your flows and generate outputs.

Ruby is well known for web development but has many other powerful use cases too, such as automation, web scraping, data processing, and command-line tools. With Kestra, you can effortlessly automate data ingestion as well as manage complex automations. Kestra's robust orchestration capabilities ensure that your Ruby scripts run smoothly and efficiently, streamlining your data-driven projects.

This guide walks you through how to get Ruby running in a workflow, how to manage input and output files, and how to pass outputs and metrics back to Kestra for use in later tasks.

Kestra has an official plugin for Ruby that lets you execute Ruby code in a flow, either by writing your Ruby code inline or by executing an `.rb` file. You can get outputs and metrics from your Ruby code too.
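If your script depends on third-party gems, you can install them at runtime with the `beforeCommands` property, mirroring the pip approach used in the Python guides. Below is a minimal sketch; the `httparty` gem and the request URL are illustrative assumptions, not part of the official examples:

```yaml
id: ruby_gems
namespace: company.team

tasks:
  - id: code
    type: io.kestra.plugin.scripts.ruby.Script
    beforeCommands:
      - gem install httparty # hypothetical gem, chosen for illustration
    script: |
      require "httparty"

      # Fetch a sample endpoint and print the HTTP status code
      response = HTTParty.get("https://dummyjson.com/products/1")
      puts response.code
```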
## Scripts If you want to write a short amount of Ruby code to perform a task, you can use the `io.kestra.plugin.scripts.ruby.Script` type to write it directly in your flow. This allows you to keep everything in one place. ```yaml id: ruby_output_file namespace: company.team description: This flow runs the Ruby script to output a file. tasks: - id: ruby_outputs_task type: io.kestra.plugin.scripts.ruby.Script outputFiles: - output.txt script: | File.open("output.txt", "w") do |file| file.write("Hello World") end - id: log_output type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.ruby_outputs_task.outputFiles['output.txt']) }}" ``` You can read more about the Scripts type in the [Plugin documentation](/plugins/plugin-script-ruby/io.kestra.plugin.scripts.ruby.script) ## Commands If you would prefer to put your Ruby code in a `.rb` file (e.g. your code is much longer or spread across multiple files), you can run the previous example using the `io.kestra.plugin.scripts.ruby.Commands` type: ```yaml id: ruby_commands namespace: company.team tasks: - id: run_ruby type: io.kestra.plugin.scripts.ruby.Commands namespaceFiles: enabled: true commands: - ruby main.rb ``` The contents of the `main.rb` file can be: ```ruby puts "Hello World" ``` You'll need to add your Ruby code using the Editor or [sync it using Git](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file. You can also have the Ruby code written inline. 
```yaml id: ruby_commands namespace: company.team tasks: - id: http_download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: run_ruby type: io.kestra.plugin.scripts.ruby.Commands inputFiles: orders.csv: "{{ read(outputs.http_download.uri) }}" main.rb: | puts "The current execution is {{ execution.id }}" # Read the file downloaded in `http_download` task lines = File.readlines("orders.csv") puts lines commands: - ruby main.rb ``` You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-ruby/io.kestra.plugin.scripts.ruby.commands). ## Handling Outputs If you want to get a variable or file from your Ruby script, you can use an [output](../../05.workflow-components/06.outputs/index.md). ### Variable Output You can get the JSON outputs from the Ruby commands / script using the `::{}::` pattern. Here is an example: ```yaml id: ruby_outputs namespace: company.team description: This flow runs the Ruby script, and outputs the variable. tasks: - id: ruby_outputs_task type: io.kestra.plugin.scripts.ruby.Script script: | puts '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' ``` All the output variables can be viewed in the Outputs tab of the execution. ![ruby_outputs](./outputs.png) You can refer to the outputs in another task as shown in the example below: ```yaml id: ruby_outputs namespace: company.team description: This flow runs the Ruby script, and outputs the variable. tasks: - id: ruby_outputs_task type: io.kestra.plugin.scripts.ruby.Script script: | puts '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' - id: return type: io.kestra.plugin.core.debug.Return format: '{{ outputs.ruby_outputs_task.vars.test }}' ``` _This example works for both `io.kestra.plugin.scripts.ruby.Script` and `io.kestra.plugin.scripts.ruby.Commands`._ ### File Output Inside of your Ruby script, write a file to the system. 
You'll need to add the `outputFiles` property to your flow and list the files you're trying to put out. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes an `output.txt` file containing the "Hello World" text. We can then refer to the file using the syntax `{{ outputs.{task_id}.outputFiles[''] }}`, and read the contents of the file using the `read()` function.

```yaml
id: ruby_output_file
namespace: company.team

description: This flow runs the Ruby script to output a file.

tasks:
  - id: ruby_outputs_task
    type: io.kestra.plugin.scripts.ruby.Script
    outputFiles:
      - output.txt
    script: |
      File.open("output.txt", "w") do |file|
        file.write("Hello World")
      end

  - id: log_output
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.ruby_outputs_task.outputFiles['output.txt']) }}"
```

_This example works for both `io.kestra.plugin.scripts.ruby.Script` and `io.kestra.plugin.scripts.ruby.Commands`._

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your Ruby script. Metrics use the same `::{}::` pattern as outputs. This example demonstrates both the counter and timer metrics.

```yaml
id: ruby_metrics
namespace: company.team

description: This flow runs the Ruby script, and outputs the metrics.

tasks:
  - id: ruby_metrics_task
    type: io.kestra.plugin.scripts.ruby.Script
    script: |
      puts 'There are 20 products in the cart'
      puts '::{"outputs":{"productCount":20}}::'
      puts '::{"metrics":[{"name":"productCount","type":"counter","value":20}]}::'
      puts '::{"metrics":[{"name":"purchaseTime","type":"timer","value":32.44}]}::'
```

Once this has executed, both metrics can be viewed under **Metrics**.
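Rather than hand-writing these JSON strings, you can build the `::{}::` payloads with Ruby's standard `json` library, which keeps the quoting correct as payloads grow. A minimal sketch — the names and values are illustrative:

```ruby
require "json"

# Build the outputs payload programmatically instead of hard-coding the JSON string
outputs_line = "::#{JSON.generate({ "outputs" => { "productCount" => 20 } })}::"

# Metrics use the same delimiter pattern, with a list of metric objects
metrics_line = "::#{JSON.generate({ "metrics" => [{ "name" => "productCount", "type" => "counter", "value" => 20 }] })}::"

puts outputs_line
puts metrics_line
```

Printing these lines from the `script` block produces the same output and counter metric as the hard-coded example above.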
![metrics](./metrics.png)

## Execute GraalVM Task

Kestra also supports GraalVM integration, allowing you to execute Ruby code directly on the JVM, with the potential for performance improvements. There are currently two tasks:

- [Eval](/plugins/plugin-graalvm/ruby-graalvm/io.kestra.plugin.graalvm.ruby.eval)
- [FileTransform](/plugins/plugin-graalvm/ruby-graalvm/io.kestra.plugin.graalvm.ruby.filetransform)

In this example, the `Eval` task is used to manipulate data from a previous task. GraalVM makes it easy to generate outputs from variables in Ruby using the `outputs` property. This is useful if you want to manipulate data and pass the new format to another task.

```yaml
id: parse_json_data
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: http://xkcd.com/info.0.json

  - id: graal
    type: io.kestra.plugin.graalvm.ruby.Eval
    outputs:
      - data
    script: |
      data = {{ read(outputs.download.uri) }}
      data["next_month"] = '{{ read(outputs.download.uri) | jq(".month") | first }}'.to_i + 1
      return {data: data}
```

---

# Run Rust Inside Your Flows

URL: https://kestra.io/docs/how-to-guides/rust

> Execute Rust code directly within your Kestra flows using Docker to leverage Rust's performance for your data processing tasks.

Run Rust code directly in your flows and generate outputs.

Rust has jumped in popularity over the past few years, mainly due to its performance and reliability in production settings. Compared to Python, Rust is a great choice for performance-critical workloads, so it might be a good fit for your flows.

This guide walks you through how to get Rust running in a workflow, how to manage input and output files, and how to pass outputs and metrics back to Kestra to use in later tasks. There isn't an official Rust plugin, but we can use the `Shell` `Commands` task to execute arbitrary commands in a Docker container.
We can also specify a container image that contains the necessary libraries to run the specific programming language. In this example, we're using the Docker Task Runner with the `rust:latest` image so that Rust code can be executed.

```yaml
id: rust_commands
namespace: company.team

tasks:
  - id: rust
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: rust:latest
    namespaceFiles:
      enabled: true
    commands:
      - rustc main.rs && ./main
```

The `main.rs` file contains a simple print statement:

```rust
fn main() {
    println!("Hello World");
}
```

You'll need to add your Rust code using the built-in Editor or [using our Git plugin](../../version-control-cicd/04.git/index.md) so Kestra can see it. You'll also need to set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file.

You can also add your Rust code inline using the `inputFiles` property.

```yaml
id: rust_commands
namespace: company.team

tasks:
  - id: rust
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: rust:latest
    inputFiles:
      main.rs: |
        fn main() {
            println!("Hello World!");
        }
    commands:
      - rustc main.rs && ./main
```

You can read more about the Shell Commands type in the [Plugin documentation](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.commands).

## Handling Outputs

Your Rust code can generate file-based [outputs](../../05.workflow-components/06.outputs/index.md). In your Rust code, write a file to the local directory. Then, use the `outputFiles` property to point Kestra to the path of those [output files](../../16.scripts/06.outputs-metrics/index.md). In this example, an `output.txt` file containing the text "Hello World" is written to the local directory.
To read that output file in another downstream task, you can use the syntax `{{ outputs.{task_id}.outputFiles[''] }}`. If you need a file's content as a string rather than a file path, you can wrap that expression in a `read()` function, e.g. `{{ read(outputs.mytask.outputFiles['output.txt']) }}`.

```yaml
id: rust_script
namespace: company.team

tasks:
  - id: rust
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: rust:latest
    inputFiles:
      main.rs: |
        use std::fs::File;
        use std::io::Write; // For the `write_all` method

        fn main() -> std::io::Result<()> {
            // Create or open the file `output.txt` in write mode
            let mut file = File::create("output.txt")?;

            // Write the string "Hello World" to the file
            file.write_all(b"Hello World")?;

            // Confirm successful write operation
            println!("Successfully wrote to the file.");

            Ok(())
        }
    outputFiles:
      - output.txt
    commands:
      - rustc main.rs && ./main

  - id: read_file
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.rust.outputFiles['output.txt']) }}"
```

## Orchestrate with Rust

Rust is a great choice for performance-critical workloads. If you're working with huge datasets, Rust could be a good choice for ETL. Below is an example of how you can set up Rust in Kestra to perform an ETL process.

The example flow uses a Rust image created using the following [sample ETL project](https://github.com/kestra-io/examples/tree/main/examples/rust). The image contains the CLI command `etl` to allow us to start the process.

```yaml
id: rust_in_container
namespace: company.team

tasks:
  - id: rust
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/rust:latest
    outputFiles:
      - "*.csv"
    commands:
      - etl
```

Once the container finishes execution, you'll be able to download all CSV files generated by the Rust container from the Outputs tab.
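Beyond files, the `Shell` `Commands` task also scans stdout for the same `::{}::` pattern shown in the script-plugin guides, so a Rust binary can report variables back to Kestra by printing it. A minimal sketch — the `rows` value and function name are illustrative:

```rust
// Emit a Kestra output from Rust by printing the ::{}:: convention to stdout.
// Doubled braces escape literal { and } inside the format string.
fn kestra_outputs(rows: u64) -> String {
    format!("::{{\"outputs\":{{\"rows\":{}}}}}::", rows)
}

fn main() {
    println!("{}", kestra_outputs(1000));
}
```

Running this binary as one of the `commands` would make `{{ outputs.rust.vars.rows }}` available to downstream tasks.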
Kestra makes it easy to process heavy compute workloads while providing an intuitive interface to access the results.

:::alert{type="info"}
The `ghcr.io/kestra-io/rust:latest` image is public, so you can directly use the example shown above.
:::

---

# Build SecOps Workflows with Kestra

URL: https://kestra.io/docs/how-to-guides/secops-with-kestra

> Automate security operations with Kestra. Build SecOps workflows for incident response, vulnerability scanning, and compliance automation.

Operationalize SecOps benchmarks with Kestra.

This how-to shows how to operationalize SecOps benchmarks with Kestra. You will download a CIS benchmark, store control recommendations as settings, and orchestrate compliance scans and automated remediation across multiple controls and teams.

## Prerequisites

- Access to the CIS benchmark for your target operating system (Ubuntu 24.04 LTS in this example)
- A Kestra namespace strategy for SecOps (for example `company.security.cis.linux.ubuntu.22-04-lts.devops`)
- SSH access (public key) to the target VMs you plan to scan/remediate
- Appropriate secrets configured in Kestra for usernames, private keys, and webhook triggers

## Step 1: Download the Benchmark

1. Go to [https://downloads.cisecurity.org/#/](https://downloads.cisecurity.org/#/) and download the **CIS_Ubuntu_Linux_24.04_LTS_Benchmark_v1.0.0** (or the benchmark that matches your OS).
2. Review the controls you plan to enforce and note the recommended settings.

![Benchmark download placeholder](./secops-benchmark.png)

## Step 2: Define the Namespace and Settings Structure

1. Decide how to segment namespaces per team or environment. Examples:
   - `company.security.cis.linux.ubuntu.22-04-lts.devops`
   - `company.security.cis.linux.ubuntu.22-04-lts.dataeng`
2. Create settings ([KV pairs](../../06.concepts/05.kv-store/index.md)) for every control you want to validate.
For instance, controls under section **1.6**:

![Configure Command Line Warning Banners](./command-line-warning-banners.png)

They can be stored following this hierarchy:

```plaintext
1
├── 1.1
│   ├── 1.1.1
│   │   ├── 1.1.1.1
│   │   └── 1.1.1.2
└── 1.6
    └── 1.6.4
```

3. Use consistent KV naming so any flow can dynamically fetch a control setting. Example naming convention: `control-1-1_6-1_6_4` for control **1.6.4**.
4. Store the recommended permission string or configuration snippet for each control. Control 1.6.4, for example, ensures `/etc/motd` permissions follow security guidance.

![KV structure placeholder](./secops-kv-structure.png)

Repeat this process for every control you intend to enforce. The walkthrough below focuses on 1.6.4, 1.6.5, and 1.6.6.

## Step 3: Store Secrets for VM Access

1. Add [secrets](../../06.concepts/04.secret/index.md) for the SSH username (`vmUser`) and private key (`vmKey`) used to connect to the VM.
2. Store any additional secrets (for example, webhook secrets) you will reference in flows and triggers.

![vmKey Secret](./vmKey-secret.png)

![vmUser Secret](./vmUser-secret.png)

## Step 4: Model the Parent Flow

Design a flow that evaluates each control, remediates if required, and proceeds to the next control. At a high level the logic looks like this:

```plaintext
Start
→ Execute Control 1.6.4
→ Assess Compliance
→ If compliant → Move to next control
→ If not compliant → Remediate → Re-assess → Next control
```

![Flow Topology](./flow-topology.jpeg)

## Step 5: Create Reusable Control Subflows

Create a subflow per control so you can reuse the same logic across namespaces. The example below implements control **1.6.5**. Note how periods in the control number are converted to underscores for IDs (for example, `1_6_5`).

```yaml
id: control-1-1_6-1_6_5
namespace: company.security.cis.linux.ubuntu.22-04-lts.devops

inputs:
  - id: remediateControls
    description: Toggle ON to auto-remediate non-compliant controls.
    displayName: Auto Remediate
    type: BOOL
    defaults: true

  - id: ipAddress
    type: STRING
    defaults: localhost

variables:
  COMPLIANT: Compliant
  NOT_COMPLIANT: Not Compliant

tasks:
  # Retrieve the recommended configuration from the KV store
  - id: getConfiguration
    type: io.kestra.plugin.core.kv.Get
    key: "{{ render(flow.id) }}"

  # Assess the current VM state
  - id: assess-1_6_5
    type: io.kestra.plugin.fs.ssh.Command
    host: "{{ inputs.ipAddress }}"
    authMethod: PUBLIC_KEY
    username: "{{ secret('vmUser') }}"
    privateKey: "{{ secret('vmKey') }}"
    commands:
      - 'echo $(stat -Lc "Access: (%#a/%A) Uid: ( %u/ %U) Gid: ( %g/ %G)" /etc/issue) > output.log'
      - echo '::{"outputs":{"result":"'$(cat output.log)'"}}::'

  - id: status-1_6_5
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs['assess-1_6_5']['vars']['result'] == outputs.getConfiguration.value }}"
    then:
      - id: compliant-1_6_5
        type: io.kestra.plugin.core.debug.Return
        format: "{{ vars.COMPLIANT }}"
    else:
      - id: doRemediate
        type: io.kestra.plugin.core.flow.If
        condition: "{{ inputs.remediateControls == true }}"
        then:
          - id: remediate-1_6_5
            type: io.kestra.plugin.fs.ssh.Command
            host: "{{ inputs.ipAddress }}"
            username: "{{ secret('vmUser') }}"
            privateKey: "{{ secret('vmKey') }}"
            authMethod: PUBLIC_KEY
            commands:
              - sudo chown root:root $(readlink -e /etc/issue)
              - sudo chmod u-x,go-wx $(readlink -e /etc/issue)

          - id: remediateResult-1_6_5
            type: io.kestra.plugin.core.debug.Return
            format: "{{ vars.COMPLIANT }}"
        else:
          - id: not-compliant-1_6_5
            type: io.kestra.plugin.core.debug.Return
            format: "{{ vars.NOT_COMPLIANT }}"

## Return output for the parent flow
outputs:
  - id: complianceStatus-1_6_5
    type: STRING
    value: "{{ outputs['compliant-1_6_5']['value'] ?? outputs['remediateResult-1_6_5']['value'] ?? outputs['not-compliant-1_6_5']['value'] ?? 'Error' }}"
```

Repeat the same pattern for controls **1.6.4** and **1.6.6**.
## Step 6: Assemble the Parent Flow

Use the subflows inside a parent orchestration that evaluates each control sequentially within a `Parallel` task (with concurrency set to 1). This lets you retrigger individual control branches without re-running the entire benchmark.

```yaml
id: csrRevamped
namespace: company.security.cis.linux.ubuntu.22-04-lts.devops

inputs:
  - id: remediateControls
    description: Toggle ON to auto-remediate non-compliant controls.
    displayName: Auto Remediate
    type: BOOL
    defaults: true

  - id: ipAddress
    displayName: IP Address
    description: Host on which the scan must run.
    type: STRING
    defaults: localhost

tasks:
  - id: section-1-1_6
    type: io.kestra.plugin.core.flow.Parallel
    # Tasks run in parallel but concurrency is limited to 1 so each control
    # can be retriggered independently without re-running downstream steps.
    concurrent: 1
    tasks:
      - id: trigger-1-1_6-1_6_4
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: control-1-1_6-1_6_4
            type: io.kestra.plugin.core.flow.Subflow
            namespace: "{{ flow.namespace }}"
            flowId: control-1-1_6-1_6_4
            inputs:
              ipAddress: "{{ inputs.ipAddress }}"
              remediateControls: "{{ inputs.remediateControls }}"
            wait: true
            transmitFailed: true

          - id: logStatus-1-1_6-1_6_4
            type: io.kestra.plugin.core.log.Log
            message: "{{ outputs['control-1-1_6-1_6_4'].outputs['complianceStatus-1_6_4'] }}"

      - id: trigger-1-1_6-1_6_5
        type: io.kestra.plugin.core.flow.Sequential
        tasks:
          - id: control-1-1_6-1_6_5
            type: io.kestra.plugin.core.flow.Subflow
            namespace: "{{ flow.namespace }}"
            flowId: control-1-1_6-1_6_5
            inputs:
              ipAddress: "{{ inputs.ipAddress }}"
              remediateControls: "{{ inputs.remediateControls }}"
            wait: true
            transmitFailed: true

          - id: logStatus-1-1_6-1_6_5
            type: io.kestra.plugin.core.log.Log
            message: "{{ outputs['control-1-1_6-1_6_5'].outputs['complianceStatus-1_6_5'] }}"

## These triggers will be demonstrated in the VM creation and ServiceNow tutorial
triggers:
  - id: vmCreateFromServiceNow
    type: io.kestra.plugin.core.trigger.Webhook
    key: "{{ secret('webHookTriggerSecret') }}"

  - id: postVMCreation
    type: io.kestra.plugin.core.trigger.Flow
    inputs:
      ipAddress: "{{ trigger.outputs.externalIPAddress }}"
    preconditions:
      id: vmCreationSuccess
      flows:
        - namespace: company.ops.it
          flowId: createVMRevamped
          states: [ SUCCESS, WARNING ]
```

## Step 7: Review the Topology

Each control runs in parallel but only one at a time because `concurrent: 1`. This makes it easy to rerun non-compliant controls individually without re-running the entire benchmark.

![Trigger Topology placeholder](./trigger-topology.png)

## Demo

1. **Execute the flow.** Observe the initial compliance check.

   ![Flow execution](./flow-execution.png)

2. **Check the results.** Review the compliance summary.

   ![Result summary](./result-summary.png)

3. **Inspect the subflow.** Confirm whether the VM was already compliant.

   ![Subflow placeholder](./subflow-summary.png)

4. **Force a drift.** Change the VM setting for control `1_6_5` (for example, from `644` to `664`).

   ![Permission change](./permission-change.png)

5. **Retrigger only control `1_6_5`.**

   ![Retrigger](./retrigger.png)

6. **Review the logs.** Verify that remediation executed for `1_6_5`.

   ![Remediation logs](./remediation-logs.png)

7. **Validate the VM permissions.** Confirm they returned to `644`.

   ![Permission validation placeholder](./permission-validation.png)

## Result

You have enforced CIS benchmark controls through Kestra, combined compliance assessment with optional remediation, and validated that individual controls can be retriggered independently.

---

# Configure Secrets in Kestra

URL: https://kestra.io/docs/how-to-guides/secrets

> Learn how to securely configure and use secrets in Kestra to protect sensitive information like passwords and API keys in your flows.

Learn how to securely configure and use secrets in Kestra.
Secrets are sensitive values that should not be exposed in plain text, such as passwords, API tokens, access keys, or other confidential information. For a detailed overview, see the [Secrets](../../06.concepts/04.secret/index.md) documentation.

This guide demonstrates how to add secrets to your Kestra server using an environment file (`.env`). If you prefer a simpler, UI-based experience, see the [Enterprise Edition](../../oss-vs-paid/index.md), which allows managing secrets per namespace directly from the web interface — without modifying server configuration files.

---

## Using secrets in Kestra

### Step 1: Create a `.env` file

Start by defining your secrets in a standard environment file:

```bash
POSTGRES_PASSWORD=actual_postgres_password
OPENAI_KEY=actual_openai_key
AWS_ACCESS_KEY=actual_aws_access_key
AWS_SECRET_KEY=actual_aws_secret_key
```

### Step 2: Encode and prefix your secrets

Kestra expects all secret keys to be **prefixed with `SECRET_`** and their values **base64-encoded**. The resulting `.env_encoded` file should look like this:

```bash
SECRET_POSTGRES_PASSWORD=base64_encoded_postgres_password
SECRET_OPENAI_KEY=base64_encoded_openai_key
SECRET_AWS_ACCESS_KEY=base64_encoded_aws_access_key
SECRET_AWS_SECRET_KEY=base64_encoded_aws_secret_key
```

To generate this file automatically, use the following Bash script:

```bash
while IFS='=' read -r key value; do
  echo "SECRET_$key=$(echo -n "$value" | base64)"
done < .env > .env_encoded
```

This script:

1. Base64-encodes all values.
2. Adds the `SECRET_` prefix to all variable names.
3. Saves the result as `.env_encoded`.

You can verify the output by opening `.env_encoded` — it should look like the example above.
Alternatively, you can manually write the file using macros to encode secrets dynamically:

```bash
SECRET_POSTGRES_PASSWORD={{ "actual_postgres_password" | base64encode }}
SECRET_OPENAI_KEY={{ "actual_openai_key" | base64encode }}
SECRET_AWS_ACCESS_KEY={{ "actual_aws_access_key" | base64encode }}
SECRET_AWS_SECRET_KEY={{ "actual_aws_secret_key" | base64encode }}
```

---

### Step 3: Point Docker to the encoded file

Update your `docker-compose.yaml` to use the `.env_encoded` file:

```yaml
kestra:
  image: kestra/kestra:latest
  env_file:
    - .env_encoded
```

This ensures your secrets are loaded when Kestra starts.

---

### Step 4: Use secrets in a flow

Once your secrets are loaded, reference them in your flows using the `secret()` function — **without** including the `SECRET_` prefix.

For example, this flow connects to PostgreSQL using `SECRET_POSTGRES_PASSWORD` and uploads query results to AWS S3 using `SECRET_AWS_ACCESS_KEY` and `SECRET_AWS_SECRET_KEY`.

```yaml
id: postgres_to_s3
namespace: company.team

tasks:
  - id: fetch
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: "{{ secret('POSTGRES_PASSWORD') }}"
    sql: select id, first_name, last_name, city from users
    fetchType: STORE

  - id: write_to_s3
    type: io.kestra.plugin.aws.s3.Upload
    accessKeyId: "{{ secret('AWS_ACCESS_KEY') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY') }}"
    region: "eu-central-1"
    from: "{{ outputs.fetch.uri }}"
    bucket: "kestra-bucket"
    key: "data/users.csv"
```

---

### How secrets are resolved

When you reference a secret using `{{ secret('POSTGRES_PASSWORD') }}`, Kestra automatically:

1. Finds the corresponding environment variable (e.g., `SECRET_POSTGRES_PASSWORD`).
2. Base64-decodes its value.
3. Injects it securely into the execution context.

This keeps sensitive values out of your flow definitions and logs. Keep in mind that base64 is an encoding, not encryption, so restrict access to the `.env_encoded` file itself.
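You can check this encode/decode round trip yourself in plain shell — the value below is a dummy:

```shell
# Encode a value the way the .env_encoded file stores it
encoded=$(printf '%s' "actual_postgres_password" | base64)

# Decode it back, the way Kestra does when resolving secret('POSTGRES_PASSWORD')
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "$decoded"
```

Using `printf '%s'` rather than `echo -n` avoids portability issues with the `-n` flag across shells.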
---

# Install Only Selected Plugins in Kestra OSS

URL: https://kestra.io/docs/how-to-guides/selected-plugin-installation

> Learn how to install specific Kestra plugins in the open-source version for a lightweight build and faster startup using the -no-plugins Docker image.

Install a selection of Kestra plugins in the open-source version. Pick and choose Kestra plugins to create lightweight builds and achieve a faster startup.

This guide explains how to:

- Install specific plugins when using the `-no-plugins` Docker image
- Understand plugin versioning across Open Source and [Enterprise](../../07.enterprise/01.overview/01.enterprise-edition/index.md)
- Automate plugin installation using Docker Compose
- Link to plugin documentation and versioning support

See also: [Versioned Plugins in Kestra Enterprise](../../07.enterprise/05.instance/versioned-plugins/index.md).

## Plugin basics in Kestra Open Source

Kestra plugins are distributed as individual JAR files and loaded at runtime. Plugins are not embedded by default in `-no-plugins` Docker images. You can:

- Download specific [plugin JARs](https://repo.maven.apache.org/maven2/io/kestra/plugin/) manually or via `kestra plugins install`.
- Mount them into `/app/plugins/` in your [Docker Compose](../../02.installation/03.docker-compose/index.md) setup.

## Install plugins via `kestra plugins install`

You can install any plugin using:

```bash
kestra plugins install io.kestra.plugin:plugin-dbt:LATEST
```

This will download the [plugin JAR from Maven Central](https://repo.maven.apache.org/maven2/io/kestra/plugin/) into `/app/plugins`. Just replace `plugin-dbt` with whichever plugin you'd like to download (e.g., `plugin-script-python`, `plugin-aws`, etc.).

You can run this inside a container (interactively or as part of a Dockerfile) to build custom plugin bundles.
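If you prefer baking plugins into a custom image rather than installing them at startup, the same command can run at build time. A minimal, untested sketch — it assumes the `kestra` binary is on the image's `PATH`, as in the Compose entrypoint example on this page:

```dockerfile
# Sketch: bake selected plugins into a custom image at build time
FROM kestra/kestra:latest-no-plugins

# Download the plugin JARs into /app/plugins during the image build
RUN kestra plugins install io.kestra.plugin:plugin-dbt:LATEST && \
    kestra plugins install io.kestra.plugin:plugin-script-python:LATEST
```

The resulting image starts with only those plugins available, with no install step at container startup.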
## Automate plugin selection with Docker Compose

If you're using the `kestra/kestra:*-no-plugins` image and want to add only selected plugins:

### Option 1: Use `kestra plugins install` inside the container

```yaml
services:
  kestra:
    image: kestra/kestra:latest-no-plugins
    entrypoint: /bin/sh -c "
      kestra plugins install io.kestra.plugin:plugin-dbt:LATEST && \
      kestra plugins install io.kestra.plugin:plugin-scripts:LATEST && \
      kestra server standalone"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./storage:/app/storage
```

### Option 2: Preload plugin JARs locally

You can copy only the JARs you need from a full Kestra image:

```bash
docker run --rm -d --name kestra-temp kestra/kestra:latest
docker cp kestra-temp:/app/plugins/. ./local-plugins
docker rm -f kestra-temp
```

Then remove unwanted plugins:

```bash
rm ./local-plugins/*unwanted-plugin*.jar
```

And mount your plugin folder:

```yaml
volumes:
  - ./local-plugins:/app/plugins
```

You may also use a scripted alias to automate this process. Below is an example for reference:

```bash
alias dl="rm -rf ./jar-plugins/* && docker run -d kestra/kestra:develop server local \
  | xargs -I {} sh -c 'docker cp {}:/app/plugins ./jar-plugins && docker rm -f {}'"
```

## Plugin versioning in Enterprise

In Kestra Open Source, plugins must be installed at the latest compatible version.
In Kestra Enterprise, you can:

- Pin specific plugin versions
- Upload custom plugin binaries per tenant
- Enable version-aware workflows

Learn more about versioned plugins in Enterprise: [Versioned Plugins](../../07.enterprise/05.instance/versioned-plugins/index.md)

## Best practices

| Use Case                   | Recommendation                                       |
| -------------------------- | ---------------------------------------------------- |
| Minimal runtime image      | Use `kestra/kestra:*-no-plugins` with mounted JARs   |
| Dynamic plugin setup       | Use `kestra plugins install` in entrypoint           |
| Controlled plugin versions | Use Enterprise with versioned plugins                |
| Custom plugin development  | Build and copy plugins into `/app/plugins/` manually |

---

# Trigger Kestra Flows from ServiceNow

URL: https://kestra.io/docs/how-to-guides/servicenow-trigger

> Integrate ServiceNow with Kestra by triggering flows via webhooks from ServiceNow Service Catalog items for automated fulfillment.

Execute Kestra flows with a ServiceNow webhook trigger.

ServiceNow often acts as the front door for enterprise automation. This guide shows how to let analysts request an on-demand compliance scan from a ServiceNow catalog item while Kestra executes the workflow behind the scenes through a webhook trigger.

:::alert{type="info"}
This guide assumes the existence of a flow like in our [SecOps with Kestra guide](../secops-with-kestra/index.md).
:::

## Prerequisites

- A ServiceNow instance with Flow Designer access
- A Kestra tenant with a flow exposed through a webhook trigger
- The webhook URL, namespace, and token for the Kestra flow

## What You Will Build

- A Service Catalog item (`complianceScanAndRemediate`) that collects the host IP and remediation preferences
- Catalog variables that persist the user input
- A reusable ServiceNow Action that calls the Kestra webhook
- A Flow Designer flow that ties the catalog submission to the Action

## Step 1: Create the Catalog Item

1. Sign in to ServiceNow as an administrator and navigate to **Service Catalog → Catalog Definitions → Maintain Items**.

   ![Maintain Items Interface](./maintain-items-interface.png)

2. Select **New** and provide the basic metadata:
   - **Name**: `complianceScanAndRemediate`
   - **Catalogs**: *Service Catalog*
   - **Category**: *Services*
   - **Fulfillment automation level**: *Fully automated*
3. Fill in the **Short description** and **Description**, adjust any Portal settings you do not need, and click **Save**.

   ![Catalog item form](./catalog-item-form.png)

## Step 2: Add Catalog Variables

1. In the Variables related list, choose **New** and create the primary inputs:
   - **Type**: *Single Line Text*
   - **Question**: *IP Address*
   - **Name**: `ipAddress`
   - **Mandatory**: enabled

   ![Single Line Text Variable](./single-line-text-variable.png)

2. Create an additional variable for remediation control, for example:
   - **Type**: *Single Line Text* (or *Boolean* if you prefer a toggle)
   - **Question**: *Auto remediate* (Name `autoRemediate`)

   ![Autoremediate Variable](./autoremediate-variable.png)

3. (Optional) Add a multi-choice variable if you want to offer canned scan profiles. Define the choices under the **Choices** related list once the variable has been saved.

   ![Multi-choice Variable](./multi-choice-variable.png)

4. Click **Update** to persist the catalog item changes.

## Step 3: Build the Script Action

Navigate to the **Workflow Studio**:

![Workflow Studio](./workflow-studio.png)

1. Open **Flow Designer → Action** and create a new Action named `triggerKestraWebhook` in the **Service Catalog** category.
2. Add two Action inputs: `ipAddress` and `remediateControls`.

   ![Action Inputs](./action-inputs.png)

3. Insert a **Script** step, expose the same inputs to that step, and paste the following code, updating the endpoint with your Kestra domain, tenant, namespace, flow ID, and webhook token.
Store any secrets (such as the token) in ServiceNow Credential or Connection records rather than hardcoding them.

```javascript
(function execute(inputs, outputs) {
  outputs.error = "";

  try {
    var request = new sn_ws.RESTMessageV2();
    request.setHttpMethod("post");
    request.setEndpoint("https://{YOUR.KESTRA.DOMAIN}/api/v1/{TENANT}/executions/webhook/{NAMESPACE}/{FLOW_ID}/{WEBHOOK_TOKEN}");
    request.setRequestHeader("Content-Type", "application/json");
    request.setRequestHeader("Accept", "application/json");

    var body = {
      ipAddress: inputs.ipAddress,
      remediateControls: inputs.remediateControls
    };

    request.setRequestBody(JSON.stringify(body));

    var response = request.execute();
    var httpStatus = response.getStatusCode();
    var responseBody = response.getBody();

    gs.info("Kestra webhook response status: " + httpStatus);
    gs.info("Kestra webhook body: " + responseBody);

    outputs.responseBody = responseBody;
    outputs.statusCode = httpStatus;
  } catch (error) {
    gs.error("Kestra webhook failed: " + error.message);
    outputs.error = error.message;
  }
})(inputs, outputs);
```

4. Define Script outputs for `responseBody`, `statusCode`, and `error`, then map them to Action outputs so downstream flows can inspect the response.

   ![Script Outputs](./script-outputs.png)

5. Publish the Action.

   ![Action Outputs](./action-outputs.png)

## Step 4: Create the ServiceNow Flow

1. In Flow Designer, create a flow named `catalogSubmissionFlow`.
2. Select the **Service Catalog** trigger so the flow runs whenever the catalog item is submitted.
3. Add the **Get Catalog Variables** action and configure it to:
   - Use the **Requested Item record** from the trigger as the submitted request
   - Limit the template to the `complianceScanAndRemediate` catalog item
   - Return all of the variables you created earlier
4. Add the `triggerKestraWebhook` Action to the flow and map each Action input to the corresponding variable output from the previous step.
5. Activate the flow.
## Step 5: Connect the Catalog Item to the Flow

In the **Workflow Editor**, click on **New → Flow**:

![Flow designer](./flow-designer.png)

1. Name the flow `catalogSubmissionFlow` and give it a description.

   ![catalogSubmissionFlow form](./catalog-submission-flow-form.png)

2. Set the Trigger as **Service Catalog**.
3. In Actions, get the Catalog Variables.

   ![Get Catalog Variables](./get-catalog-variables.png)

4. Set Action Inputs.

   ![Set Action Inputs](./set-action-inputs.png)

5. Set Template Catalog Items: click on the magnifying glass and select `complianceScanAndRemediate`.

   ![Set Template Catalog Items](./set-template-catalog-items.png)

6. Set the Catalog Variables and **Save**.

   ![Set Catalog Variables](./set-catalog-variables.png)

7. Add an **Action** and search for `triggerKestraWebhook`:

   ![Add an Action](./add-action.png)

8. Under **Action Inputs**, for `ipAddress`, click the wand icon to select **Get Catalog Variables → `ipAddress`**, then repeat for Auto Remediate.

   ![Add Action Inputs](./add-action-inputs.png)

## Validate the End-to-End Run

1. Open your catalog item under **Catalog Definitions → Maintain Items → `complianceScanAndRemediate`**.
2. Go to **Process Engine**, and under **Flow** select `catalogSubmissionFlow`.

   ![Process Engine](./process-engine.png)

3. Click on **Update**, then try the workflow.

   ![Update and Try](./update-and-try.png)

4. Submit the request and navigate to **System Log → All**:

   ![System Log](./system-log.png)

   The webhook will be triggered:

   ![Triggered Webhook](./triggered-webhook.png)

5. Navigate to Kestra, and view the **Flow Executions** tab:

   ![Kestra execution](./kestra-execution.png)

## Conclusion

By fronting Kestra with a ServiceNow catalog item, you let users stay inside their familiar ITSM portal while still benefiting from Kestra's orchestration capabilities.
The same pattern works for any flow that exposes a webhook trigger — swap in different inputs, reuse the Action, and tailor the downstream automation without changing the ServiceNow experience. --- # Run Shell Scripts Inside Your Flows URL: https://kestra.io/docs/how-to-guides/shell > Run Bash and shell scripts in Kestra workflows. Execute multi-step commands, chain scripts with pipes, and handle errors in automated shell tasks. Run Shell scripts directly in your flows and generate outputs. You can execute bash script in a flow by either writing your Shell commands inline or by executing a `.sh` file. You can get outputs and metrics from your Shell script too. ## Scripts If you want to write a series of commands together to form a small script, and run that script as a task in the flow, you can use the `io.kestra.plugin.scripts.shell.Script`. ```yaml id: shell_script namespace: company.team description: This flow runs the shell script. tasks: - id: shell_script_task type: io.kestra.plugin.scripts.shell.Script containerImage: badouralix/curl-jq script: | # invoke a GET call on an API and extract information from the JSON response downloads=$(curl https://hub.docker.com/v2/repositories/kestra/kestra/ | jq -r '.pull_count') echo "Downloads: ${downloads}" ``` You can read more about the Scripts type in the [Plugin documentation](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.script). ## Commands You could also choose to provide the series of Shell commands in the task, and get the same result. Here is an example of how you can run the previous example using the `io.kestra.plugin.scripts.shell.Commands` type: ```yaml id: shell_commands namespace: company.team description: This flow runs the shell commands. 
tasks: - id: http_download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: shell_commands_task type: io.kestra.plugin.scripts.shell.Commands commands: - echo "The current execution is {{ execution.id }}" - cat {{ outputs.http_download.uri }} ``` You can also put a Shell script in a separate `.sh` file, and invoke the script as a command. For example, we have a script file called `hello.sh` that contains: ```bash echo "Hi there! This is an example of executing a Shell script file." sleep 2 echo "I am back from sleep" ``` You can now invoke this script as one of the commands in the `io.kestra.plugin.scripts.shell.Commands` task. Note that we have set the `enabled` flag for the `namespaceFiles` property to `true` so Kestra can access the file. ```yaml id: shell_invoke_file namespace: company.team description: This flow runs the shell script file. tasks: - id: shell_invoke_file_task type: io.kestra.plugin.scripts.shell.Commands namespaceFiles: enabled: true commands: - sh hello.sh ``` You can read more about the Commands type in the [Plugin documentation](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.commands). ## Handling Outputs If you want to get a variable or file from your Shell script, you can use an [output](../../05.workflow-components/06.outputs/index.md). ### Variable Output You can get the JSON outputs from the Shell commands / script using the `::{}::` pattern. Here is an example: ```yaml id: shell_outputs namespace: company.team description: This flow runs the shell command, and outputs the variable. tasks: - id: shell_outputs_task type: io.kestra.plugin.scripts.shell.Commands commands: - echo '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' ``` All the output variables can be viewed in the Outputs tab of the execution. 
![shell_outputs](./outputs.png)

You can refer to the outputs in another task as shown in the example below:

```yaml
id: shell_outputs_usage
namespace: company.team

description: This flow runs the shell command, and outputs the variable.

tasks:
  - id: shell_outputs_task
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - echo '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::'

  - id: return
    type: io.kestra.plugin.core.debug.Return
    format: '{{ outputs.shell_outputs_task.vars.test }}'
```

_This example works for both `io.kestra.plugin.scripts.shell.Script` and `io.kestra.plugin.scripts.shell.Commands`._

### File Output

Inside your Shell script, you can write a file to the filesystem. You'll need to add the `outputFiles` property to your flow and list the files you want to output. In this case, we want to output `output.txt`. More information on the formats you can use for this property can be found in [Script Output Metrics](../../16.scripts/06.outputs-metrics/index.md).

The example below writes an `output.txt` file containing the "Hello world" text, similar to the output we used earlier. We can then refer to the file using the syntax `{{ outputs.{task_id}.outputFiles[''] }}`, and read the contents of the file using the `read()` function.

```yaml
id: shell_output_file
namespace: company.team

description: This flow runs the shell command to output a file.

tasks:
  - id: shell_outputs_task
    type: io.kestra.plugin.scripts.shell.Commands
    outputFiles:
      - output.txt
    commands:
      - echo 'Hello world' > output.txt

  - id: log_output
    type: io.kestra.plugin.core.log.Log
    message: "{{ read(outputs.shell_outputs_task.outputFiles['output.txt']) }}"
```

_This example works for both `io.kestra.plugin.scripts.shell.Script` and `io.kestra.plugin.scripts.shell.Commands`._

## Handling Metrics

You can also get [metrics](../../16.scripts/06.outputs-metrics/index.md#outputs-and-metrics-in-script-and-commands-tasks) from your Shell script. Metrics use the same `::{}::` pattern as outputs.
This example demonstrates both the counter and timer metrics.

```yaml
id: shell_metrics
namespace: company.team

description: This flow runs the shell command, and outputs the metrics.

tasks:
  - id: shell_outputs_task
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - echo 'There are 20 products in the cart'
      - echo '::{"outputs":{"productCount":20}}::'
      - echo '::{"metrics":[{"name":"productCount","type":"counter","value":20}]}::'
      - echo '::{"metrics":[{"name":"purchaseTime","type":"timer","value":32.44}]}::'
```

Once this has executed, both metrics can be viewed under **Metrics**.

![metrics](./metrics.png)

---

# Migrate from Shipyard to Kestra

URL: https://kestra.io/docs/how-to-guides/shipyard-migration

> A comprehensive guide for migrating workflows from Shipyard to Kestra, mapping concepts like Fleets and Vessels to Flows and Tasks.

Migrate from Shipyard to Kestra.

This is a guide for users who are considering migrating their workflows from [Shipyard](https://www.shipyardapp.com/) to [Kestra](https://kestra.io/).

[Kestra](https://kestra.io/) is a language-agnostic orchestration platform allowing users to build workflows as code and from the UI. Similarly to Shipyard, Kestra uses YAML for workflow logic, and its extensive plugin ecosystem makes migration straightforward.
## Technical Glossary

| Shipyard Concept | Equivalent Concept in Kestra | Description |
|------------------|------------------------------|-------------|
| [Fleet](https://www.shipyardapp.com/docs/reference/fleets/fleets-overview/) | [Flow](../../05.workflow-components/01.flow/index.md) | a container for tasks, their inputs, outputs, handling of errors and overall orchestration logic |
| [Vessel](https://www.shipyardapp.com/docs/reference/vessels/) | [Task](../../05.workflow-components/01.tasks/index.mdx) | a discrete action within a flow, capable of taking inputs and variables from the flow, and producing outputs for downstream consumption by end users and other tasks |
| [Project](https://www.shipyardapp.com/docs/reference/projects/) | [Namespace](../../05.workflow-components/02.namespace/index.md) | a logical grouping of flows, used to organize workflows and manage access to secrets, plugin defaults and variables |
| [Triggers](https://www.shipyardapp.com/docs/reference/triggers/triggers-overview/) | [Triggers](../../05.workflow-components/07.triggers/index.mdx) | a mechanism that automates the execution of a flow; triggers can be scheduled or event-based |
| [Blueprints](https://www.shipyardapp.com/docs/blueprint-library/) | [Blueprints](/blueprints) | a collection of premade templates ready to be used in your workflows; Blueprints work very similarly between both platforms — the main difference is that Shipyard's blueprints are like plugins in Kestra since they are used to run a task (vessel), while Kestra's blueprints are more comprehensive, often containing multiple tasks composed together to accomplish a use case end-to-end.
|
| [Inputs](https://www.shipyardapp.com/docs/reference/inputs/) | [Inputs](../../05.workflow-components/05.inputs/index.md) | a list of dynamic values passed to the flow at runtime; the main difference between both is that Shipyard's inputs are provided to the task (i.e. vessel), while Kestra's inputs are defined at a flow (i.e. fleet) level |
| UI | [UI](../../09.ui/index.mdx) | Shipyard's UI allows building workflows via drag-and-drop and autogenerates a YAML configuration; in Kestra, users typically write the YAML configuration first and can then optionally modify the workflow or add new tasks from low-code UI forms. |

## Getting Started with Kestra

To get started, follow the [Quickstart Guide](../../01.quickstart/index.md) to install Kestra and start building your first workflows.

## How to Migrate

Every fleet in Shipyard generates a YAML configuration. You can retrieve it from the UI as shown below, or get it from a version control system such as Git if you maintained one for Shipyard.

![shipyard_yaml_configuration](./shipyard_yaml_configuration.png)

For every vessel in the fleet, try to find a matching [Kestra Plugin](/plugins). For example, the equivalent of the **Amazon S3 - Delete Files** vessel in Shipyard is [io.kestra.plugin.aws.s3.Delete](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.delete) and [io.kestra.plugin.aws.s3.DeleteList](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.deletelist). In the same fashion as you would configure a vessel, you can configure a task in Kestra. Use the built-in task documentation in the Kestra UI to help you configure all task properties (the **Source and documentation** view). Find plugins directly within the built-in UI editor using the auto-complete feature. Each task documentation comes with an example and a detailed description of each task property.

![documentation_view](./documentation_view.png)

There is no concept of **connections** in Kestra. By default, all tasks are executed sequentially.
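For instance, in this minimal sketch the two tasks run strictly one after another, in the order listed, with no connection wiring needed (the task ids are hypothetical):

```yaml
id: sequential_example
namespace: company.team

tasks:
  # Runs first
  - id: extract
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  # Runs only after the previous task completes
  - id: log_result
    type: io.kestra.plugin.core.log.Log
    message: "Downloaded file: {{ outputs.extract.uri }}"
```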
To adjust the execution logic, e.g., to run some tasks in parallel, wrap your tasks in [flowable tasks](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md). As always, the combination of the [core documentation](../../index.mdx), [Plugin documentation](/plugins) and [Blueprints](/blueprints) will help you figure out how to do that.

Once you have the fleet equivalent (i.e. a flow) ready in Kestra, you can use the **Source and topology view** to validate whether your Kestra flow matches the connections in your Shipyard fleet.

![topology_view](./topology_view.png)

You can now Save and Execute your flow. Then, check the Logs, Gantt and Outputs tabs of your Execution to validate that your workflow behaves as expected.

## Need Help?

Check out our extensive [plugin catalog](/plugins) for descriptions and examples of each task and trigger. Use our [blueprints](/blueprints) for guidance on creating various flows.

For assistance, join our free [Slack community](/slack) and ask your questions in the `#help` channel. We respond to every message!

---

# Slack Events API with Kestra: Trigger Flows

URL: https://kestra.io/docs/how-to-guides/slack-webhook

> Trigger Kestra flows based on Slack events using the Slack Events API and Webhook triggers to automate responses and interactions.

Trigger Kestra flows based on Slack events.

The Slack Events API allows you to build apps that respond to events from Slack. For example, you can trigger a custom action anytime a user joins a channel or when someone reacts to a message with a specific emoji.

## Create a Slack App

To use the Slack Events API, you'll need to create a Slack app. You can do this from the [Slack API website](https://api.slack.com/apps).
First, click on the "Create New App" button:

![Create New App button on the Slack API website](./img.png)

Choose the option "From scratch":

![Choose From scratch option](./img_1.png)

Then, give your app a name and select the workspace where you want to install it:

![App name and workspace selection form](./img_2.png)

Now, you need to enable the "Event Subscriptions" feature:

![Enable Event Subscriptions toggle](./img_3.png)

In the "Subscribe to bot events" section, you can add events you want to listen to.

![Subscribe to bot events section](./img_4.png)

For example, you can listen to the `app_mention` and `reaction_added` events:

![app_mentions and reaction_added events selected](./img_5.png)

## Create a flow with a Webhook trigger

You can now create a Kestra flow that will listen to the events you've subscribed to:

```yaml
id: slack_events
namespace: prod

tasks:
  - id: process_slack_event
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.body }}"

triggers:
  - id: slack_event
    type: io.kestra.plugin.core.trigger.Webhook
    key: superStrongSecretKey42
```

:::alert{type="warning"}
The **webhook key** cannot contain any **special characters** — only letters and digits. Also, consider it a secret that you should keep safe. You can use Kestra's [Secrets](../../06.concepts/04.secret/index.md) to store it securely.
:::

Now, the only part left is to create a simple app that listens to Slack events and forwards them to your Kestra flow via the Webhook trigger. We'll look at how to do this using Python and FastAPI. For deployments, we'll show two options:

1. Using Modal for easy deployment
2. Using ngrok to expose our local FastAPI server to the internet. You can replace ngrok with any other deployment method you prefer.

## Deploy a Slack app with Modal

First, sign up for a free account on [Modal](https://modal.com/).
Then, go to your Settings: ![Modal Settings page](./img_6.png) And create a new API token: ![Create new API token in Modal](./img_7.png) You will see a similar command: ```bash modal token set --token-id ak-zzzzzzzzz --token-secret as-zzzzzzzzz ``` Now, create the following flow in Kestra and replace the token ID and token secret with the ones you got from Modal. You can use Kestra's [Secrets](../../06.concepts/04.secret/index.md) to store those securely. Also, replace `your_kestra_host` with your Kestra host URL in the `slack.py` file. ```yaml id: slack_app namespace: prod tasks: - id: modal_slack_app type: io.kestra.plugin.modal.cli.ModalCLI commands: - modal deploy slack.py env: MODAL_TOKEN_ID: "{{ secret('MODAL_TOKEN_ID') }}" MODAL_TOKEN_SECRET: "{{ secret('MODAL_TOKEN_SECRET') }}" inputFiles: slack.py: | import logging from fastapi import FastAPI, Request, BackgroundTasks from fastapi.responses import JSONResponse from modal import Image, Stub, asgi_app import requests web_app = FastAPI() stub = Stub("slack_app") image = Image.debian_slim().pip_install("requests") logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) def process_event(event): # TODO adjust the URL below to your Kestra Webhook URL url = "http://your_kestra_host:8080/api/v1/main/executions/webhook/prod/slack_events/superStrongSecretKey42" headers = {"Content-Type": "application/json"} response = requests.post(url, headers=headers, json=event) logger.info(f"Forwarding event response: {response.status_code} - {response.text}") @web_app.post("/slack/events") async def slack_events(request: Request, background_tasks: BackgroundTasks): json_data = await request.json() if "challenge" in json_data: logger.info("Received Slack challenge event") return JSONResponse(content={"challenge": json_data["challenge"]}) logger.info(f"Received event: {json_data}") # Process the event asynchronously background_tasks.add_task(process_event, json_data) # Respond immediately to Slack 
logger.info("Responding immediately to Slack") return JSONResponse(content={"status": "ok"}) @stub.function(image=image) @asgi_app() def fastapi_app(): return web_app ``` :::alert{type="info"} If you don't like adding the Python script inline in the YAML file, you can enable `namespaceFiles` and add the Python code in the embedded Code Editor in a separate file e.g. called `slack.py` and reference it in the flow as shown below: ```yaml id: slack_app namespace: prod tasks: - id: modal_slack_app type: io.kestra.plugin.modal.cli.ModalCLI namespaceFiles: enabled: true commands: - modal deploy slack.py env: MODAL_TOKEN_ID: "{{ secret('MODAL_TOKEN_ID') }}" MODAL_TOKEN_SECRET: "{{ secret('MODAL_TOKEN_SECRET') }}" ``` ![Namespace files enabled in Kestra code editor](./img_8.png) ::: Once you execute that flow, you will see the endpoint to your app in the logs: ![Modal app deployment endpoint shown in Kestra logs](./img_9.png) Go back to Slack and add the URL to the "Request URL" field in the "Event Subscriptions" section. Add `slack/events` at the end of the URL, e.g.: ```bash https://anna-geller--slack-app-fastapi-app.modal.run/slack/events ``` You should see the `Verified` message. Hit `Save Changes` and you're all set! ![Verified status in Slack Event Subscriptions request URL](./img_10.png) ## Install the Slack app to a Workspace and test it First, we need to install the app to the workspace. Go to "Install App" and click on "Install to Workspace": ![Install App menu in Slack app settings](./img_11.png) ![Install to Workspace button](./img_12.png) Now you can test the integration by mentioning your app in a channel. For example, you can write a hello message `hello @kestra`: ![Hello message mentioning the Kestra app in a Slack channel](./img_13.png) Confirm to invite the app to the channel and congratulate yourself with the "Nicely done!" 
emoji 🙌:

![Confirm invite and reaction added emoji in Slack](./img_14.png)

You should see that both events (`app_mention` and `reaction_added`) have triggered an execution of your Kestra flow:

![app_mention event triggered Kestra execution](./img_15.png)

![reaction_added event triggered Kestra execution](./img_16.png)

Now it's up to you to automate your daily operations with Slack and Kestra!

## Example automation: AI Chatbot

You can extend the `slack_events` flow to automate your daily business operations. To do something more useful than just logging the Slack event, you can create a flow that listens to the `app_mention` event and responds to that message with a GPT-4 chatbot.

First, create an incoming webhook in your Slack app:

![Add incoming webhook in Slack app features](./img_17.png)

![Incoming webhook added to Slack app](./img_18.png)

Copy the webhook URL:

![Copy incoming webhook URL](./img_19.png)

...and paste it into the `url` field of the `SlackIncomingWebhook` task in the flow below:

```yaml
id: slack_events
namespace: prod

tasks:
  - id: if_app_mention
    type: io.kestra.plugin.core.flow.If
    condition: "{{ trigger.body.event.type == 'app_mention' }}"
    then:
      - id: gpt
        type: io.kestra.plugin.openai.ChatCompletion
        apiKey: "{{ secret('OPENAI_API_KEY') }}"
        model: gpt-4-0125-preview
        messages:
          - role: system
            content: The user will refer to you as <@{{ trigger.body.authorizations[0].user_id }}>. You get a prompt from a user and provide a concise answer.
        prompt: "{{ trigger.body.event.text ??
null }}" - id: slack type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_WEBHOOK_URL') }}" payload: | {"channel":"{{ trigger.body.event.channel }}","text":"{{ outputs.gpt.choices[0].message.content }}"} else: - id: other_events type: io.kestra.plugin.core.log.Log message: "{{ trigger.body }}" triggers: - id: slack_event type: io.kestra.plugin.core.trigger.Webhook key: superStrongSecretKey42 ``` :::alert{type="info"} The `SlackIncomingWebhook` task also has the `messageText` property that can be used instead of the `payload` property, depending on the task's requirements. ::: And here is the result: ![GPT-4 chatbot responding to app mention in Slack](./img_20.png) ![Kestra flow execution triggered by Slack app mention](./img_21.png) --- ## Local testing with ngrok If you don't want to host your app on Modal, you can use ngrok to expose your local server to the internet. First, install ngrok: ```bash brew install ngrok/ngrok/ngrok ``` Then, [sign up](https://ngrok.com/) for a free account and then authenticate your terminal, as explained in the [Getting Started guide](https://dashboard.ngrok.com/get-started/setup/): ```bash ngrok config add-authtoken long_token_string ``` Create a FastAPI app in a file called `main.py`: ```python from fastapi import FastAPI, Request from fastapi.responses import JSONResponse import requests app = FastAPI() @app.post("/slack/events") async def slack_events(request: Request): json_data = await request.json() # Slack URL Verification Challenge if "challenge" in json_data: return JSONResponse(content={"challenge": json_data["challenge"]}) print("Received an event") print(json_data) # URL of your Kestra flow webhook url = "http://your_kestra_host:8080/api/v1/main/executions/webhook/prod/slack_events/superStrongSecretKey42" headers = { "Content-Type": "application/json", } response = requests.post(url, headers=headers, json=json_data) print(response.text) return JSONResponse( content={"status": 
response.status_code, "response": response.text}
    )
```

Then, set up your FastAPI server:

```bash
pip install fastapi uvicorn requests
uvicorn main:app --reload --port 3000
ngrok http http://localhost:3000
```

This will expose your local server to the internet. You should see a similar URL:

```bash
https://0913-31-18-152-123.ngrok-free.app
```

Go back to your Slack app and add the URL to the "Request URL" field in the "Event Subscriptions" section. Add `slack/events` at the end of the URL, e.g.:

```bash
https://0913-31-18-152-123.ngrok-free.app/slack/events
```

The rest of the process is the same as with Modal. You can now adjust the flow `slack_events` referenced in the FastAPI code and start automating various processes based on Slack events.

## Other deployment options

You can deploy that Slack app in many other ways including:

- an on-prem VM
- a serverless approach with [AWS Lambda](https://www.youtube.com/watch?v=rpVLOVeky6A), Google Cloud Functions, or Azure Functions
- a containerized approach with AWS Fargate, Google Cloud Run, or Azure Container Instances
- a Kubernetes deployment.

And of course, you can use any other programming language and framework to build your Slack app. The only requirement is to forward the Slack events to your Kestra flow via the Webhook trigger.

---

# Use SQLMesh to Run dbt Projects

URL: https://kestra.io/docs/how-to-guides/sqlmesh

> Orchestrate SQLMesh transformations in Kestra. Run and schedule SQLMesh plans as part of your data pipeline for version-controlled, SQL-first modeling.

Using SQLMesh to run dbt projects with Kestra.

SQLMesh is an open-source Python data transformation and modeling framework. It automates everything needed to run a scalable data transformation platform. SQLMesh works with a variety of [engines and orchestrators](https://sqlmesh.readthedocs.io/en/stable/integrations/overview/). SQLMesh enables data teams to efficiently run and deploy data transformations written in SQL or Python.
This guide shows how to run dbt projects on BigQuery using SQLMesh with Kestra.

## Example

Our flow will perform the following steps:

1. Download `orders.csv` using the HTTP Download task.
2. Create the table in BigQuery.
3. Upload the data from the CSV file into the BigQuery table.
4. Create a dbt project which will create the BigQuery view from the BigQuery table.
5. Create a SQLMeshCLI task that will run the dbt project.

SQLMesh supports integration with a variety of tools like Airflow, dbt, dlt, etc. One of the common use cases of SQLMesh is to run dbt projects. You can choose to pull your dbt project from a Git repository as mentioned in the [How-to guide on dbt](../dbt/index.md) or create [namespace files](../../06.concepts/02.namespace-files/index.md) for the project. This guide creates the complete project using namespace files built up step by step. You can later choose to push all the namespace files to a GitHub repository using [PushNamespaceFiles](../pushnamespacefiles/index.md).

### Creating the flow with the SQLMeshCLI task

Create tasks for each step:

```yaml
id: sqlmesh_transform
namespace: company.team

tasks:
  - id: orders_http_download
    type: io.kestra.plugin.core.http.Download
    description: Download orders.csv using HTTP Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: create_orders_table
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    description: Create orders table in BigQuery
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    projectId:
    dataset: ecommerce
    table: orders
    tableDefinition:
      type: TABLE
      schema:
        fields:
          - name: order_id
            type: INT64
          - name: customer_name
            type: STRING
          - name: customer_email
            type: STRING
          - name: product_id
            type: INT64
          - name: price
            type: FLOAT64
          - name: quantity
            type: INT64
          - name: total
            type: FLOAT64

  - id: load_orders_table
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Load orders table with data from orders.csv
    from: "{{ outputs.orders_http_download.uri }}"
    projectId:
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    destinationTable: ".ecommerce.orders"
    format: CSV
    csvOptions:
      fieldDelimiter: ","
      skipLeadingRows: 1

  - id: sqlmesh_transform
    type: io.kestra.plugin.sqlmesh.cli.SQLMeshCLI
    description: Use SQLMesh to run the dbt project
    inputFiles:
      sa.json: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    namespaceFiles:
      enabled: true
    beforeCommands:
      - pip install "sqlmesh[bigquery]"
      - pip install dbt-bigquery
    commands:
      - sqlmesh init -t dbt
      - sqlmesh plan --auto-apply
```

It's important that we have the following properties configured:

- The `namespaceFiles` property has `enabled` set to `true` to ensure that the task has access to your namespace files.
- Provide the GCP service account JSON file so that the task can connect to your GCP account to access BigQuery. See the [dedicated guide](../google-credentials/index.md) on how to add it. This file is referenced in the dbt project file.
- Install the `sqlmesh[bigquery]` and `dbt-bigquery` dependencies with `beforeCommands`. These allow SQLMesh and dbt to perform operations on BigQuery.

Once the task is created and configured correctly, save the flow.
### Creating dbt project

Now go to the Editor, and create a new file called `profiles.yml` with the following content:

```yaml
bq_dbt_project:
  outputs:
    dev:
      type: bigquery
      method: service-account
      dataset: ecommerce
      project:
      keyfile: sa.json
      location: US
      priority: interactive
      threads: 16
      timeout_seconds: 300
      fixed_retries: 1
  target: dev
```

Next, we will create `dbt_project.yml` with the following content:

```yaml
name: 'bq_dbt_project'
version: '1.0.0'
config-version: 2

profile: 'bq_dbt_project'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:
  - "target"
  - "dbt_packages"

models:
  bq_dbt_project:
    example:
      +materialized: view
      +start: Nov 10 2024
```

:::alert{type="info"}
`models` require a start date for backfilling data through use of the `start` configuration parameter.
:::

Now create a folder called `models` in the namespace. In the `models` folder, create `sources.yml` to define the source models:

```yaml
version: 2

sources:
  - name: ecommerce
    database:
    schema: ecommerce
    tables:
      - name: orders
```

Lastly, create `stg_orders.sql` to materialize the `stg_orders` view for the `orders` table:

```sql
{{ config(materialized="view") }}

select
  order_id,
  customer_name,
  customer_email,
  product_id,
  price,
  quantity,
  total
from {{ source('ecommerce', 'orders') }}
```

That's it! We are now ready to run the flow. Once the flow runs successfully, you can go to the BigQuery console and verify that the view `stg_orders` has been created.

This is how we can run SQLMeshCLI for the dbt project. These instructions can also help you integrate the SQLMeshCLI task with other SQLMesh [integrations and execution engines](https://sqlmesh.readthedocs.io/en/stable/integrations/dbt/).
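As an optional check, you could also verify the view from a Kestra flow instead of the console. The sketch below assumes the same dataset and secret names used above; the flow id and task ids are hypothetical.

```yaml
id: verify_stg_orders
namespace: company.team

tasks:
  # Query the view created by the dbt project and fetch a single row
  - id: count_orders
    type: io.kestra.plugin.gcp.bigquery.Query
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    sql: SELECT COUNT(*) AS order_count FROM ecommerce.stg_orders
    fetchOne: true

  # Log the row count returned by the query above
  - id: log_count
    type: io.kestra.plugin.core.log.Log
    message: "stg_orders contains {{ outputs.count_orders.row.order_count }} rows"
```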
---

# Subflow Retries, Restarts, and Replays in Kestra

URL: https://kestra.io/docs/how-to-guides/subflow-executions

> Best practices for configuring retries, restarts, and replays in subflow executions to ensure efficient error handling and resumption.

How to configure your flows so that failed subflow executions resume correctly without rerunning successful tasks.

---

When working with subflows, it’s important to understand the difference between retries at the **Subflow task level** and retries at the **flow level** within the subflow. This guide explains how to manage retries, restarts, and replays in subflow executions to avoid unnecessary re-execution of completed tasks.

## Flow-level vs. Subflow-level retries

### Subflow task-level retry

When you define a retry on the `Subflow` task, it controls how the **Subflow task** itself is retried within the parent flow. For example:

```yaml
id: parent_flow
namespace: company.team

tasks:
  - id: subflow
    type: io.kestra.plugin.core.flow.Subflow
    namespace: company.team
    flowId: my_subflow
    wait: true
    retry:
      type: constant
      maxAttempts: 3
      interval: PT1S
```

In this case, the retry applies to the `Subflow` task in the parent flow. When the task fails, the entire subflow execution is retried from the beginning. This means all subflow tasks will re-run within each retried execution, including tasks that already succeeded.

### Flow-level retry inside the subflow

To retry the execution from the **failed task** within a subflow (without rerunning tasks that already succeeded), configure the `retry` property **on the subflow flow definition**, not on the Subflow task. This allows the subflow execution to restart from the failed task rather than from the start.
Example: ```yaml id: my_subflow namespace: company.team retry: maxAttempts: 3 behavior: RETRY_FAILED_TASK type: constant interval: PT1S tasks: - id: start type: io.kestra.plugin.core.log.Log message: This task will succeed and won't be retried - id: fail type: io.kestra.plugin.core.execution.Fail runIf: "{{ randomInt(lower=0, upper=2) == 1 }}" errorMessage: Bad value returned! - id: end type: io.kestra.plugin.core.log.Log message: This task will only run if the fail task succeeds ``` When this flow fails, only the failed task (`fail` in this example) will be retried. Tasks that already succeeded (`start`) will not run again. ## Recommended configuration In most cases, you should: * Define **`retry`** at the **flow level inside the subflow** (not on the Subflow task) * Use **`behavior: RETRY_FAILED_TASK`** to resume from the failed task (recommended to avoid rerunning tasks that already succeeded) * Use **`behavior: CREATE_NEW_EXECUTION`** ONLY if you want to always restart the subflow execution from the beginning. ### Example: Parent flow calling a subflow ```yaml id: my_parent_flow namespace: company.team tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: my_subflow wait: true - id: hello type: io.kestra.plugin.core.log.Log message: Success ``` When `my_subflow` is configured with `behavior: RETRY_FAILED_TASK`, it automatically restarts from the failed task during retries. The parent flow does not need additional configuration. ## Replays and restarts * **Replay**: You can replay the parent execution from the failed subflow task if the subflow defines `behavior: RETRY_FAILED_TASK`. * **Restart**: When you restart the parent execution where a subflow task failed from the UI or API, this will restart the entire child execution from the beginning (regardless of the subflow task definition), unless the subflow YAML defines `behavior: RETRY_FAILED_TASK` as flow-level retry configuration. 
## Summary

* Use **flow-level retry** inside the subflow for fine-grained restart control.
* Use **`behavior: RETRY_FAILED_TASK`** to continue from the failed task.
* Avoid configuring conflicting retry behaviors between the parent flow and the subflow.
* Use Subflow task-level retries only if you want to **create an entirely new subflow execution** on each retry attempt: when `retry` is defined at the `Subflow` task level, it **always** creates an entirely new execution from the start rather than restarting the existing child execution from the failed task, regardless of the `behavior` configured at the flow level in the subflow.

---

# Connect a Supabase Database to Kestra

URL: https://kestra.io/docs/how-to-guides/supabase-db

> Learn how to connect your Supabase Database to Kestra workflows using the PostgreSQL plugin to query, copy, and manage your data.

Connect your Supabase Database to your workflows using the PostgreSQL plugin.
:::alert{type="info"}
There is a dedicated [Supabase plugin](/plugins/plugin-supabase) that replaces these steps.
:::

Supabase is an open-source Backend-as-a-Service (BaaS) platform that helps developers build applications faster and more efficiently. It provides a number of services, including hosted PostgreSQL databases, which can be used within flows in Kestra.

Before you begin, ensure you have a [Supabase account](https://supabase.com/) set up and a [Kestra installation](../../02.installation/index.mdx) running.

## Setting up a Database in Supabase

Once you've logged into Supabase, you'll need to set up an organization where you will create projects to access resources such as a database.

![supabase-1](./supabase-1.png)

Once your organization is created, you'll be prompted to create a new project. Set a password for this project to use later for authenticating with the database in Kestra.

![supabase-2](./supabase-2.png)

Once your project is created, you will be able to access resources in Supabase. Head to the menu on the left side and select **Database**. You will be prompted to create a new table in your database, as well as configure any columns you want to use. Leave the columns blank for now and modify them later once you know what data to copy into the database.

![supabase-3](./supabase-3.png)

## Connecting Supabase to Kestra

Now that we have a database set up in Supabase, we can move into Kestra to set up our connection. We can connect using the [PostgreSQL plugin](/plugins/plugin-jdbc-postgres), which supports a number of tasks such as `Query`, `CopyIn`, and `CopyOut`.

Inside of Supabase, select the **Connect** button at the top to get information about our database connection. Select **Type** and change this to JDBC. This will give us three ways of connecting with a Connection String. As we're only connecting to the database when our workflow runs, the Transaction pooler is a good option to use.
![supabase-4](./supabase-4.png) To connect, we can copy the URL provided for the Transaction pooler and replace `[YOUR-PASSWORD]` with the password set earlier. To prevent exposing the password in our flow, store it as a [secret](../../06.concepts/04.secret/index.md). By using [Plugin Defaults](../../05.workflow-components/09.plugin-defaults/index.md), we can configure our connection to Supabase once for all tasks in our flow rather than individually for each task. Once configured, our connection in Kestra will look like the example below: ```yaml pluginDefaults: - forced: true type: io.kestra.plugin.jdbc.postgresql values: url: "jdbc:postgresql://aws-0-eu-west-2.pooler.supabase.com:6543/postgres?user=postgres.nqxaafovehwkjapsqqlk&password={{ secret('SUPABASE_PASSWORD') }}" ``` :::alert{type="info"} You can also use the `username` and `password` properties rather than combining it all into the `url` property: ```yaml pluginDefaults: - forced: true type: io.kestra.plugin.jdbc.postgresql values: url: "jdbc:postgresql://aws-0-eu-west-2.pooler.supabase.com:6543/postgres" username: "postgres.nqxaafovehwkjapsqqlk" password: "{{ secret('SUPABASE_PASSWORD') }}" ``` ::: ## Copying a CSV File into Supabase DB in a Flow Using this [example CSV](https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv), we can copy the data into our table directly from Kestra. 
You can either set up the columns directly in Supabase or add a task in Kestra to add them automatically like this: ```yaml id: supabase_db_add_columns namespace: company.team tasks: - id: create_columns type: io.kestra.plugin.jdbc.postgresql.Queries sql: | ALTER TABLE kestra_example ADD COLUMN order_id int, ADD COLUMN customer_name text, ADD COLUMN customer_email text, ADD COLUMN product_id int, ADD COLUMN price double precision, ADD COLUMN quantity int, ADD COLUMN total double precision; pluginDefaults: - forced: true type: io.kestra.plugin.jdbc.postgresql values: url: "jdbc:postgresql://aws-0-eu-west-2.pooler.supabase.com:6543/postgres?user=postgres.nqxaafovehwkjapsqqlk&password={{ secret('SUPABASE_PASSWORD') }}" ``` Once your columns are configured, you can use the [CopyIn](/plugins/plugin-jdbc-postgres/io.kestra.plugin.jdbc.postgresql.copyin) task combined with the [HTTP Download](/plugins/core/http/io.kestra.plugin.core.http.download) task to download the CSV file and copy it directly into our database. As we set up the database connection with our [Plugin Defaults](#connecting-supabase-to-kestra), the CopyIn task will connect directly and copy the CSV file into the database. 
```yaml id: supabase_db_copyin namespace: company.team tasks: - id: download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: copy_in type: io.kestra.plugin.jdbc.postgresql.CopyIn table: "kestra_example" from: "{{ outputs.download.uri }}" header: true columns: [order_id,customer_name,customer_email,product_id,price,quantity,total] delimiter: "," pluginDefaults: - forced: true type: io.kestra.plugin.jdbc.postgresql values: url: "jdbc:postgresql://aws-0-eu-west-2.pooler.supabase.com:6543/postgres?user=postgres.nqxaafovehwkjapsqqlk&password={{ secret('SUPABASE_PASSWORD') }}" ``` Once this flow completes, we can view the contents of our database in Supabase: ![supabase-5](./supabase-5.png) --- # Sync Flows from a Git Repository URL: https://kestra.io/docs/how-to-guides/syncflows > Automatically sync your flows from a Git repository to Kestra using the SyncFlows task, enabling GitOps and version control for your workflows. Sync flows from a Git Repository to Kestra with the SyncFlows Task.
The [SyncFlows](/plugins/plugin-git/io.kestra.plugin.git.syncflows) task is a powerful integration that allows you to **sync your code with Git from the UI while still managing this process entirely in code**! Kestra unifies the development experience between the UI and code so you can combine the best of both worlds without sacrificing the benefits of version control. The task syncs one or more flows from a Git repository on a schedule or anytime you push a change to a given Git branch. ## Before you begin Before you start using the `SyncFlows` task, ensure the following prerequisites are in place: 1. A Git repository where you want to sync your flows. If you haven't pushed any flows yet, see the [guide using the PushFlows task](../pushflows/index.md). 2. A Personal Access Token (PAT) for Git authentication. 3. A running Kestra instance on version 0.17.0 or later with the PAT stored as a [secret](../../06.concepts/04.secret/index.md) within the Kestra instance. ## Using the `dryRun` property Here is a system flow that will sync the `git` namespace with flows from the repository in the `flows` directory. ```yaml id: sync_flows_from_git namespace: system tasks: - id: sync_flows type: io.kestra.plugin.git.SyncFlows username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/git_username/flows branch: main targetNamespace: git gitDirectory: flows dryRun: true ``` Given that the `dryRun` property is set to `true`, the task will only output changes from the Git repository without syncing any flows to Kestra yet: ![git1](./git1.png) The files listed are the same ones we added in the [PushFlows guide](../pushflows/index.md). ## Sync all flows to a single namespace from Git Set the `dryRun` property to `false` and sync the repository with Kestra: ```yaml id: sync_flows_from_git namespace: system tasks: - id: sync_flows type: io.kestra.plugin.git.SyncFlows ... 
dryRun: false ``` You should see the same flows from the earlier log now in Kestra: ![git2.png](./git2.png) A full list is also available in the Outputs tab: ![git3.png](./git3.png) ## Sync all flows including child namespaces You can also sync all flows in child namespaces. In the repository, there is a sub-folder called `tutorial` with more flows. Sync those as well by adding the `includeChildNamespaces` property and setting it to `true`. ```yaml id: sync_flows_from_git namespace: system tasks: - id: sync_flows type: io.kestra.plugin.git.SyncFlows username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/kestra-io/flows branch: main targetNamespace: git gitDirectory: flows includeChildNamespaces: true ``` After executing, all flows — including those from the `tutorial` child namespace — are synced into Kestra: ![git4.png](./git4.png) The Outputs tab shows the same result: ![git5.png](./git5.png) ## Set up a schedule A common use case for this task is to set up a routine schedule to keep Kestra in sync with the Git repository. Add a [Schedule trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md). This example has a cron expression to execute once every hour: ```yaml id: sync_flows_from_git namespace: system tasks: - id: sync_flows type: io.kestra.plugin.git.SyncFlows username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/git_username/flows branch: main targetNamespace: git gitDirectory: flows triggers: - id: every_full_hour type: io.kestra.plugin.core.trigger.Schedule cron: "0 * * * *" ``` ## Automatically sync when a change is pushed to Git You can also automate the syncing process by adding a [Webhook trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) and creating a Webhook on your GitHub repository to trigger the flow every time something is pushed to the repository. 
This is useful for keeping Kestra always in sync with the repository. ```yaml id: sync_flows_from_git namespace: system tasks: - id: sync_flows type: io.kestra.plugin.git.SyncFlows username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/kestra-io/flows targetNamespace: git gitDirectory: flows triggers: - id: gh_webhook type: io.kestra.plugin.core.trigger.Webhook key: abcdefg ``` To set up this webhook, go to your GitHub repository's Settings, head to Webhooks, and create a new webhook: ![webhook1.png](./webhook1.png) For the Payload URL, use the following format: ```plaintext https://{your_hostname}/api/v1/main/executions/webhook/system/sync_flows_from_git/abcdefg ``` This requires your hostname to be publicly accessible. If you want to test this without having to deploy Kestra first, you can use a tool like [ngrok](https://ngrok.com/) to tunnel Kestra so GitHub can reach it. As we're putting the secret in the URL, we can leave the Secret field blank. Save and test by committing something to the Git repository. ![webhook2.png](./webhook2.png) The most recent execution was triggered by the Webhook, keeping Kestra in sync with the Git repository automatically. If you also want to sync your files, see the [guide on syncing namespace files](../syncnamespacefiles/index.md). ## Extra notes - The `branch` property allows you to specify the branch from which flows should be synced. - The `gitDirectory` property allows you to specify the directory from which flows should be synced. If not set, flows will be synced from the Git directory named `_flows`, optionally including subdirectories named after the child namespaces. If you prefer, you can specify an arbitrary path, e.g. `kestra/flows`, allowing you to sync flows from that specific Git directory. - If you try to add the Personal Access Token (PAT) directly in your source code in the `password` property, you will get an error message. 
This is a safety mechanism to prevent you and your users from accidentally exposing your PAT in the source code. You should store the PAT as a Kestra Secret, environment variable, namespace variable, or SECRET-type input in your flow. --- # Synchronous Executions API in Kestra URL: https://kestra.io/docs/how-to-guides/synchronous-executions-api > Trigger Kestra workflow executions synchronously via the REST API. Wait for completion and retrieve outputs in a single API call for real-time integrations. Manage the Executions API Synchronously. There are many use cases where you may want to trigger a flow and get the flow's output in the API's response. In other words, you want the Executions API to behave synchronously. ## Executions API The Executions API is capable of creating a parametrized flow execution. Say you have the following flow: ```yaml id: myflow namespace: company.team tasks: - id: mytask type: io.kestra.plugin.core.debug.Return format: hello from kestra outputs: - id: mydata type: STRING value: "{{ outputs.mytask.value }}" description: return some data ``` You invoke this flow using the Executions API as follows: ```bash curl -X POST http://localhost:8080/api/v1/main/executions/company.team/myflow ``` By default, the Executions API is asynchronous. 
It will invoke the execution of the flow, and return immediately with a response that includes the Execution ID and the time at which the execution was created: ```json { "id": "1KWLxLeaXEXNDaXWP7YSKA", "namespace": "company.team", "flowId": "myflow", "flowRevision": 1, "state": { "current": "CREATED", "histories": [ { "state": "CREATED", "date": "2024-07-12T05:07:28.447110427Z" } ], "duration": "PT0.002939292S", "startDate": "2024-07-12T05:07:28.447110427Z" }, "originalId": "1KWLxLeaXEXNDaXWP7YSKA", "deleted": false, "metadata": { "attemptNumber": 1, "originalCreatedDate": "2024-07-12T05:07:28.447113302Z" } } ``` ## Synchronous Executions API To wait for an execution to finish and return the flow outputs in the response, call the Executions API with the `wait=true` query parameter. This makes the API call synchronous, and you will receive all outputs in the response that are explicitly exposed in the flow. You can invoke the Executions API in a synchronous fashion as follows: ```bash curl -X POST 'http://localhost:8080/api/v1/main/executions/company.team/myflow?wait=true' ``` Here is the output of this API invocation: ```json { "id": "24znmto07B2ZGrI9IQoSSH", "namespace": "company.team", "flowId": "myflow", "flowRevision": 1, "taskRunList": [ { "id": "4536yghIDGwqeRWZEE7AEE", "executionId": "24znmto07B2ZGrI9IQoSSH", "namespace": "company.team", "flowId": "myflow", "taskId": "mytask", "attempts": [ { "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-07-12T05:13:42.140Z" }, { "state": "RUNNING", "date": "2024-07-12T05:13:42.140Z" }, { "state": "SUCCESS", "date": "2024-07-12T05:13:42.142Z" } ], "duration": "PT0.002S", "endDate": "2024-07-12T05:13:42.142Z", "startDate": "2024-07-12T05:13:42.140Z" } } ], "outputs": { "value": "hello from kestra" }, "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-07-12T05:13:42.011Z" }, { "state": "RUNNING", "date": "2024-07-12T05:13:42.140Z" }, { "state": 
"SUCCESS", "date": "2024-07-12T05:13:42.144Z" } ], "duration": "PT0.133S", "endDate": "2024-07-12T05:13:42.144Z", "startDate": "2024-07-12T05:13:42.011Z" } } ], "outputs": { "mydata": "hello from kestra" # ✅ this is the data that we returned in the flow }, "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2024-07-12T05:13:41.789Z" }, { "state": "RUNNING", "date": "2024-07-12T05:13:42.012Z" }, { "state": "SUCCESS", "date": "2024-07-12T05:13:42.335Z" } ], "duration": "PT0.546S", "endDate": "2024-07-12T05:13:42.335Z", "startDate": "2024-07-12T05:13:41.789Z" }, "originalId": "24znmto07B2ZGrI9IQoSSH", "deleted": false, "metadata": { "attemptNumber": 1, "originalCreatedDate": "2024-07-12T05:13:41.789Z" } } ``` As expected, the API response returned the outputs from the flow. It also contains all execution states. ## Authentication You can use the same authentication mechanisms for this API call as for Kestra's other APIs. ### Basic Authentication First, base64-encode your username and password. You can do this using the following command: ```bash echo -n "username:password" | base64 ``` Then, you can use the encoded string in the `Authorization` header: ```bash curl -X POST 'http://localhost:8080/api/v1/main/executions/company.team/myflow?wait=true' -H 'Authorization: Basic <base64-encoded-string>' ``` ### API Token If you're on the Enterprise Edition, you can use the API token for authentication. You can use the API token in the `Authorization` header as follows: ```bash curl -X POST 'http://localhost:8080/api/v1/main/executions/company.team/myflow?wait=true' -H 'Authorization: Bearer YOUR_API_TOKEN' ``` Usually, you would need to include your tenant ID in the URL. 
Here is an example: ```bash curl -X POST 'http://localhost:8080/api/v1/{tenant_id}/executions/company.team/myflow?wait=true' -H 'Authorization: Bearer YOUR_API_TOKEN' ``` --- # Sync Namespace Files from a Git Repository URL: https://kestra.io/docs/how-to-guides/syncnamespacefiles > Sync your namespace files, such as scripts and configuration, from a Git repository to Kestra using the SyncNamespaceFiles task. Sync files from a Git Repository to Kestra with SyncNamespaceFiles Task.
The [SyncNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.syncnamespacefiles) task is a powerful integration that allows you to **sync your namespace files with Git from the UI while still managing this process entirely in code**! Kestra unifies the development experience between the UI and code so you can combine the best of both worlds without sacrificing the benefits of version control. The task syncs one or more namespace files from a Git repository on a schedule or anytime you push a change to a given Git branch. ## Before you begin Before you start using the `SyncNamespaceFiles` task, ensure the following prerequisites are in place: 1. A Git repository where you want to sync your files. If you haven't pushed any files yet, see the [guide using the PushNamespaceFiles task](../pushnamespacefiles/index.md). 2. A Personal Access Token (PAT) for Git authentication. 3. A running Kestra instance on version 0.17.0 or later with the PAT stored as a [secret](../../06.concepts/04.secret/index.md) within the Kestra instance. ## Using the `dryRun` property Here is a system flow that will sync the `git` namespace with files from the repository in the `_files` directory. ```yaml id: sync_files_from_git namespace: system tasks: - id: sync_files type: io.kestra.plugin.git.SyncNamespaceFiles username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/git_username/scripts branch: main namespace: git gitDirectory: _files dryRun: true ``` Given that the `dryRun` property is set to `true`, the task will only output changes from the Git repository without syncing any files to Kestra yet: ![git1](./git1.png) The files listed are the same ones we added in the [PushNamespaceFiles guide](../pushnamespacefiles/index.md). 
## Sync all files to a single namespace from Git Set the `dryRun` property to `false` and sync the repository with Kestra: ```yaml id: sync_files_from_git namespace: system tasks: - id: sync_files type: io.kestra.plugin.git.SyncNamespaceFiles ... dryRun: false ``` You should see the same files from the earlier log now in Kestra: ![git2](./git2.png) A full list is also available in the Outputs tab: ![git3](./git3.png) ## Set up a schedule A common use case for this task is to set up a routine schedule to keep Kestra in sync with the Git repository. Add a [Schedule trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md). This example has a cron expression to execute once every hour: ```yaml id: sync_files_from_git namespace: system tasks: - id: sync_files type: io.kestra.plugin.git.SyncNamespaceFiles username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/git_username/scripts branch: main namespace: git gitDirectory: _files triggers: - id: every_full_hour type: io.kestra.plugin.core.trigger.Schedule cron: "0 * * * *" ``` ## Automatically sync when a change is pushed to Git You can also automate the syncing process by adding a [Webhook trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) and creating a Webhook on your GitHub repository to trigger the flow every time something is pushed to the repository. This is useful for keeping Kestra always in sync with the repository. 
```yaml id: sync_files_from_git namespace: system tasks: - id: sync_files type: io.kestra.plugin.git.SyncNamespaceFiles username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/git_username/scripts branch: main namespace: git gitDirectory: _files triggers: - id: gh_webhook type: io.kestra.plugin.core.trigger.Webhook key: abcdefg ``` To set up this webhook, go to your GitHub repository's Settings, head to Webhooks, and create a new webhook: ![webhook1.png](./webhook1.png) For the Payload URL, use the following format: ```plaintext https://{your_hostname}/api/v1/main/executions/webhook/system/sync_files_from_git/abcdefg ``` This requires your hostname to be publicly accessible. If you want to test this without having to deploy Kestra first, you can use a tool like [ngrok](https://ngrok.com/) to tunnel Kestra so GitHub can reach it. As we're putting the secret in the URL, we can leave the Secret field blank. Save and test by committing something to the Git repository. ![webhook2.png](./webhook2.png) The most recent execution was triggered by the Webhook, keeping Kestra in sync with the Git repository automatically. If you also want to sync your flows, see the [guide on syncing flows](../syncflows/index.md). --- # Modularize Triggers and Schedules with Terraform URL: https://kestra.io/docs/how-to-guides/terraform-modules-for-triggers > Scale your Kestra codebase by modularizing triggers and schedules using Terraform templates for reusable and consistent definitions. Scale your codebase using Terraform to template and make scheduling a breeze As shown in the [terraform templating](../terraform-templating/index.md) guide, you can use Terraform to template and define flows. Managing triggers and schedules can be a **tedious task**, especially when many flows reuse the same trigger schedules and create **peak hours**. 
This guide will show you how to use Terraform to define triggers and schedules for your flows with modularity. Note: we created the [kestra-flows-template](https://github.com/kestra-io/kestra-flows-template) repo so you can start directly from a very scalable codebase. ## Code structure ```plaintext . └── environment/ ├── development ├── production/ # Contains subfolders defining Kestra flows resources │ ├── airbyte/ │ └── ... ├── modules/ # Terraform modules to be used in environments │ ├── trigger_cron/ │ ├── trigger_cron_hourly_random/ │ ├── trigger_flow/ │ ├── trigger_webhook/ │ └── ... ``` Use `null_resource` to create reusable resources that keep your trigger definitions DRY (Don't Repeat Yourself). With Terraform version >= 1.4, you can use `terraform_data` instead. For backward compatibility, this guide uses `null_resource`. ## Example of Cron schedule implementation Below is an example implementation of a Terraform module that defines a cron schedule trigger. `triggers.yml` ```yaml triggers: - id: ${cron-name} type: io.kestra.plugin.core.trigger.Schedule cron: "${cron-expression}" lateMaximumDelay: "${late-maximum-delay}" ``` `main.tf` ```hcl resource "null_resource" "trigger_cron" { triggers = { value = templatefile("${path.module}/triggers.yml", { cron-name = var.cron_name cron-expression = var.cron_expression late-maximum-delay = var.late_maximum_delay }) } } ``` `variables.tf` ```hcl variable "cron_expression" { type = string description = "Cron expression or supported expression like : @hourly" default = null } variable "cron_name" { type = string description = "Provide a description of your Cron expression for simplicity" default = null } variable "late_maximum_delay" { type = string description = "Allows disabling auto-backfill: if the schedule didn't start within this delay, the execution will be skipped." 
} ``` `outputs.tf` ```hcl output "trigger_content" { value = null_resource.trigger_cron.triggers.value } ``` Usage of this module would look like this: ```hcl module "trigger_purge" { source = "../../../../modules/trigger_cron" cron_expression = "0 0 * * 0" cron_name = "weekly_kestra_purge" late_maximum_delay = "PT1H" } module "my_flow_module" { source = "../../../../modules/my_flow_module" trigger = module.trigger_purge.trigger_content } ``` Module `my_flow_module` will use the trigger defined in the `trigger_purge` module. ## Scaling your codebase with Terraform trigger modules You can check an example [using a cron trigger fully in Terraform](https://github.com/kestra-io/kestra-flows-template/blob/b6937f9d95970a4e909687eb64936f5ea3f02c1c/environment/production/dbt/jaffle_shop_classic.tf#L28). --- # Terraform for Flow Modularity in Kestra URL: https://kestra.io/docs/how-to-guides/terraform-templating > Combine Kestra with Terraform for IaC workflows. Template and provision cloud resources automatically as part of your CI/CD automation pipelines. Scale your codebase using Terraform to template and define flows This guide shows how to use Terraform's HCL (HashiCorp Configuration Language) templating features in a Kestra codebase. To make your codebase accessible to users unfamiliar with Kestra syntax, encapsulate most of the logic and DSL (domain-specific language) into [Terraform modules](https://developer.hashicorp.com/terraform/language/modules). This quick tutorial will show you how the templating capabilities brought by Terraform can help you: - DRY (Do Not Repeat Yourself) your codebase - Facilitate onboarding on Kestra - Incorporate extra modularity - Implement complex pipelines while keeping syntax clear You can check the [kestra-flows-template](https://github.com/kestra-io/kestra-flows-template) repo which contains a set of modules and subflows to help you get started with Terraform. 
This guide covers creating a Terraform module and a subflow, and how to use them in your codebase. ## Code structure ```plaintext . └── environment/ ├── development ├── production/ # Contains subfolders defining Kestra flows resources │ ├── airbyte/ │ ├── dbt/ │ ├── triggers/ │ ├── main.tf # Instantiate each folder (airbyte, dbt ...) │ └── ... ├── modules/ # Terraform modules to be used in environments │ ├── airbyte_sync/ │ ├── trigger_cron/ │ └── ... └── subflows/ # Kestra subflows ├── main.tf ├── sub_cloud_sql_airbyte_query.yml └── ... ``` Modules are folders under the `modules` folder and can be instantiated in either the `development` or `production` environment. They expose only the variables that are meant to be changed. Inside a module, you can define a `main.tf` file that will define the resources to be created. ## Creating a module, example with Airbyte Create a module that defines a Kestra flow to sync data from Airbyte. ## Tree structure of a Terraform module ```plaintext . └── airbyte_sync/ ├── main.tf ├── tasks.yml └── variables.tf ``` ### `main.tf` contains the kestra_flow terraform resource, which will define the flow using a templated YAML file ```hcl resource "kestra_flow" "airbyte_sync" { keep_original_source = true flow_id = var.flow_id namespace = var.namespace content = join("", [ yamlencode({ id = var.flow_id namespace = var.namespace labels = var.priority != null ? 
merge(var.labels, { priority = var.priority }) : var.labels description = var.description }), templatefile("${path.module}/tasks.yml", { description = var.description airbyte-url = var.airbyte_url airbyte-connections = var.airbyte_connections max-duration = var.max_sync_duration late-maximum-delay = var.late_maximum_delay cron-expression = var.cron_expression }), var.trigger, ]) } ``` ### `variables.tf` contains all the variables that can be passed to the module, with appropriate validation and descriptions ```hcl variable "airbyte_connections" { description = "List of Airbyte connections to trigger : id (can be found in URL), name is whatever makes sense" type = list(object({ name = string id = string })) validation { condition = length(var.airbyte_connections) > 0 && length([ for o in var.airbyte_connections : true if length(regexall("^[A-Za-z_]+$", o.name)) > 0 ]) == length(var.airbyte_connections) error_message = "At least one connection should be provided, and connection names should not contain hyphens." } } variable "flow_id" { type = string } variable "description" { type = string } variable "namespace" { type = string default = "blueprint" } variable "airbyte_url" { type = string } variable "trigger" { type = string description = "String containing triggers sections of the flow" default = "" } variable "max_sync_duration" { type = string description = "Tell Kestra to wait logs for this max duration" default = "" } variable "labels" { type = map(string) default = null description = "Labels to apply to the flow" } variable "priority" { type = string default = null description = "Priority tag to apply to the flow" } variable "cron_expression" { type = string description = "Cron expression or supported expression like : @hourly" default = null } variable "late_maximum_delay" { type = string description = "Allows disabling auto-backfill: if the schedule didn't start within this delay, the execution will be skipped." 
} ``` ### `tasks.yml`: flow definition with Terraform `templatefile` jinja-like syntax ```yaml tasks: ## Here we leverage the Terraform templating capabilities to generate the tasks ## Using jinja-like syntax, we can loop over the list of connections and generate tasks for each of them %{ for connection in airbyte-connections ~} - id: "trigger_${connection.name}" type: io.kestra.plugin.airbyte.connections.Sync connectionId: ${connection.id} url: "${airbyte-url}" httpTimeout: "PT1M" wait: false - id: "check_${connection.name}" type: io.kestra.plugin.airbyte.connections.CheckStatus url: "${airbyte-url}" jobId: "{{ outputs.trigger_${connection.name}.jobId }}" pollFrequency: "PT1M" httpTimeout: "PT1M" retry: type: constant interval: PT1M maxAttempts: 5 %{ if length(max-duration) > 0} maxDuration: "${max-duration}" %{ endif } %{ endfor ~} triggers: - id: cron_trigger type: io.kestra.plugin.core.trigger.Schedule cron: "${cron-expression}" lateMaximumDelay: "${late-maximum-delay}" ``` ## Using the module in a Terraform environment Using the module will look like this: ```hcl module "stripe_events_incremental" { source = "../../../modules/airbyte_sync" flow_id = "stripe_events" priority = "high" namespace = local.namespace description = "Stripe Events" airbyte_connections = [ { name = "stripe_events_incremental" id = module.airbyte_connection_stripe_offical.connection_id } ] max_sync_duration = "PT30M" airbyte_url = var.airbyte_url cron_expression = "@hourly" late_maximum_delay = "PT1H" } ``` It is now easy to instantiate the module in your `main.tf` file, and to expose only the variables that are meant to be changed: - `flow_id`: the flow id - `namespace`: the namespace to save the flow in - `description`: the description - `airbyte_connections`: the list of Airbyte connections to trigger in a linear order - `max_sync_duration`: the maximum duration to wait for logs - `airbyte_url`: the Airbyte URL of the instance - `cron_expression`: the cron expression to trigger the flow - 
`late_maximum_delay`: the maximum delay to wait for the flow to start, in case of missed schedules (backfill) If you change how the underlying tasks are implemented, you can modify the Terraform module without changing its interface (variables). ## Subflow example: query and display results for a given Postgres database Subflows are a way to encapsulate logic and make it reusable across your codebase. Here is an example of a subflow that will query a Cloud SQL instance: ```yaml id: query_my_postgres_database namespace: company.team description: "Query Postgres database and display results in logs" inputs: - id: sqlQuery type: STRING defaults: "SELECT * FROM public.jobs ORDER BY created_at desc limit 1" # SQL query example tasks: - id: query_data type: io.kestra.plugin.jdbc.postgresql.Query url: jdbc:postgresql://MY_HOST/MY_DATABASE username: MY_USER password: "{{ secrets.get('my-postgres-password') }}" sql: "{{ inputs.sqlQuery }}" fetchType: FETCH - id: show_result type: io.kestra.plugin.core.log.Log message: | {% for row in outputs.query_data.rows %} {%- for key in row.keySet() -%} {{key}} : {{row.get(key)}} | {%- endfor -%} \n {% endfor %} ## To make it easier to use the results in another flow ## we expose the query result by using `outputs` outputs: - id: query_result value: "{{ outputs.query_data.rows }}" type: JSON ``` You can either execute this subflow as is, or use it in another flow to avoid repeating the same logic. 
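The `show_result` task's Pebble template prints each row as `key : value |` pairs, one line per row. Its effect can be approximated in plain Python to see what the logs will contain (the sample rows below are invented for illustration):

```python
def render_rows(rows: list[dict]) -> str:
    """Approximate the subflow's Pebble template: one line per row, 'key : value | ' per column."""
    return "\n".join(
        "".join(f"{key} : {value} | " for key, value in row.items())
        for row in rows
    )

# Invented sample data standing in for outputs.query_data.rows
sample = [{"id": 1, "status": "done"}, {"id": 2, "status": "running"}]
print(render_rows(sample))
```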
Executing the subflow will prompt you to enter the SQL query you want to execute: ![Subflow execution](./01-execute_sublow_query_my_postgres.png) ## Using the subflow in a flow ```yaml - id: query_last_job type: io.kestra.core.tasks.flows.Subflow namespace: company.team flowId: query_my_postgres_database inputs: sqlQuery: "SELECT * FROM public.jobs ORDER BY created_at desc limit 1" wait: true transmitFailed: true - id: use_result type: io.kestra.core.tasks.debugs.Return # Use the query result from the subflow format: "{{ outputs.query_last_job.outputs.query_result }}" ``` 1. Connection details are stored in the subflow, and only the SQL query is exposed to the user. 1. The subflow natively displays results in logs for easy debugging. 1. Outputs of the subflow can be used in the parent flow, since the subflow's `outputs` section exposes `outputs.query_data.rows` as `query_result`. > Note: `wait: true` will wait for the subflow to finish before continuing the flow execution. `transmitFailed: true` will transmit the failed status of the subflow to the parent flow. The parent flow's logs will display tasks from the subflow directly: ![Subflow execution from parent flow](./02-execute_sublow_from_parent_flow.png) ## Subflows vs Terraform templating Subflows hide unnecessary details from their users, abstracting connection details, logging, and the like for a given set of tasks. Terraform modules allow you to define complex flows in a modular way. They also support passing outputs from one Terraform resource to another across systems (e.g., an Airbyte Terraform resource output into a Kestra module input variable) and strongly validate inputs, which is not possible with subflows. ## Conclusion Terraform templating is a powerful way to define flows in a modular way and to expose only the variables that are meant to be changed. It is a great way to make your codebase more maintainable and to facilitate onboarding for users unfamiliar with Kestra syntax. 
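To see what Terraform's `templatefile` loop in `tasks.yml` actually renders, the `%{ for }` expansion can be approximated in plain Python. This is only a sketch of the rendering step; the connection name, ID, and Airbyte URL below are invented placeholders:

```python
# Approximate Terraform's `%{ for connection in airbyte-connections }` expansion.
# Connection names/IDs and the Airbyte URL are invented placeholders.
connections = [
    {"name": "stripe_events_incremental", "id": "00000000-0000-0000-0000-000000000000"},
]
airbyte_url = "https://airbyte.example.com"  # hypothetical

rendered = "tasks:\n"
for c in connections:
    rendered += (
        f'- id: "trigger_{c["name"]}"\n'
        f"  type: io.kestra.plugin.airbyte.connections.Sync\n"
        f"  connectionId: {c['id']}\n"
        f'  url: "{airbyte_url}"\n'
        f"  wait: false\n"
        f'- id: "check_{c["name"]}"\n'
        f"  type: io.kestra.plugin.airbyte.connections.CheckStatus\n"
        f'  url: "{airbyte_url}"\n'
        # Double braces in the f-string emit literal {{ }} for the Kestra expression
        f'  jobId: "{{{{ outputs.trigger_{c["name"]}.jobId }}}}"\n'
    )
print(rendered)
```

Each entry in `connections` yields one `Sync`/`CheckStatus` task pair, which is exactly what the Terraform loop produces for each element of `airbyte_connections`.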
---
# Kestra with Pulumi's Terraform Provider
URL: https://kestra.io/docs/how-to-guides/using-pulumis-terraform-provider
> Integrate Kestra infrastructure management into your Pulumi projects using Pulumi's Terraform Provider bridge.

Utilize Pulumi's Terraform Provider to manage Kestra infrastructure. This post outlines the process of leveraging Pulumi's terraform-provider package to seamlessly integrate the Kestra Terraform provider into your Pulumi projects. This approach allows you to manage Kestra resources using the familiar Pulumi infrastructure-as-code workflow, even if the provider isn't officially published in the main Pulumi Registry.

## About the example repository

The [pulumi-kestra-example](https://github.com/japerry911/pulumi-kestra-example) repository is a hands-on example that shows how to provision and manage Kestra resources with Pulumi using a Python-based provider and SDK generated locally. The repo includes:

- a Pulumi project YAML
- a complete example flow and namespace
- an app

These demonstrate a real-world use case: uploading a file to Google Cloud Storage via a Kestra flow and app. Note that the flow.yaml does not perform an actual GCS file upload; that portion is commented out, as this post prioritizes understanding of the Pulumi terraform-provider.

## Step-by-step process

Follow these steps to set up your environment and begin managing Kestra resources with Pulumi:

1. Clone/Fork the Example Repository
   - Start by cloning or forking the example repository, which provides a foundational structure for your project: https://github.com/japerry911/pulumi-kestra-example
2. Download and Install Pulumi
   - If you haven't already, install the Pulumi CLI on your system:
     - Mac: `brew install pulumi/tap/pulumi`
     - Linux: `curl -fsSL https://get.pulumi.com | sh`
     - Windows: Refer to the official Pulumi documentation for installation instructions: [Pulumi Docs](https://www.pulumi.com/docs/get-started/download-install/)
3.
Create a Pulumi Python Project
   - For this example, we'll be using Python. You can create a new Pulumi Python project with the following command: `pulumi new python`
   - During the project creation process, you will be prompted to:
     - Log in with your browser or an access token.
     - Fill in project-specific options such as project name, stack, and preferred package management tool.
4. Add the Terraform Provider
   - Pulumi's `pulumi package add terraform-provider` command is a powerful feature that utilizes local packages. It instantly generates a language-specific SDK for any existing Terraform or OpenTofu provider directly within your project. This means you can use providers in your Pulumi code even if they are not officially published in the main Pulumi Registry.
   - Execute the following command in your terminal to add the Kestra Terraform provider: `pulumi package add terraform-provider kestra-io/kestra`
   - This command downloads the specified provider (kestra-io/kestra) and creates all the necessary wrapper code in a local directory (e.g., ./sdks/), enabling you to immediately manage that provider's resources as part of your Pulumi infrastructure.
5. Install the Local SDK
   - Now that you have a local SDK in your project's `sdks` folder, you need to install it into your local Python virtual environment. (If you're using a different language project, follow the equivalent installation steps for that language.)
   - Add the SDK path to your requirements.txt file: `echo sdks/kestra >> requirements.txt`
   - Install the project dependencies: `pulumi install`
6. Create and Fill .env File
   - Create a .env file based on a .env.local template. This file will hold your Kestra secrets and provider URL.
```bash
## Kestra secrets
## API Token is required (Enterprise-only),
## or Username AND Password are required
KESTRA_API_TOKEN=
KESTRA_USERNAME=
KESTRA_PASSWORD=

## Kestra Provider URL for Provider declaration
KESTRA_PROVIDER_URL=
```

   - Fill in the appropriate values for `KESTRA_API_TOKEN`, `KESTRA_USERNAME`, `KESTRA_PASSWORD`, and `KESTRA_PROVIDER_URL` based on your Kestra instance edition.
7. Prepare for Resource Building
   - With the local Pulumi SDK for the Kestra Terraform provider set up and installed, install some additional Python packages before defining your resources:
     - Activate your Python environment: `source venv/bin/activate`
     - Install python-dotenv and PyYAML: `pip install python-dotenv PyYAML`
8. Build and Deploy
   - You are now ready to build and deploy your Kestra resources using Pulumi!
   - Execute the following command: `pulumi up`
   - This will initiate the deployment process, and Pulumi will provision your Kestra resources as defined in your project.

## What did we provision?
## Conclusion

By following these steps, you can effectively integrate the Kestra Terraform provider into your Pulumi workflows, allowing for robust and consistent management of your Kestra infrastructure.

Thank you for reading, Happy Coding!

---
# Access Values Between Flows
URL: https://kestra.io/docs/how-to-guides/values-between-flows
> Share data across Kestra flows using Subflows, KV Store, and Namespace Variables. Learn best patterns for passing values between different workflows.

How to access values across different flows.

Sometimes it's useful to store values so they can be used across multiple flows. Whether that's configuration or state generated by similar flows, accessing values between flows has many benefits. There are multiple ways to do that in Kestra, each with different advantages depending on the use case.
There are three different ways you can access values across different flows:

1. Subflows
2. KV Store
3. Namespace Variables

## Subflows

Using [Subflows](../../05.workflow-components/10.subflows/index.md), you can execute one flow from another flow. As part of that, you can pass inputs from the parent flow to the subflow and retrieve outputs generated from it. This is useful if you want multiple flows to execute together and interact directly with one another. However, this doesn't work if you want one flow to generate a value and another flow to use it later when it executes.

In this example, our parent flow passes the [variable](../../05.workflow-components/04.variables/index.md) `debug` into the subflow as an [input](../../05.workflow-components/05.inputs/index.md). On top of that, the subflow returns an [output](../../05.workflow-components/06.outputs/index.md) `subflow_output` too.

```yaml
id: parent_flow
namespace: company.team

variables:
  debug: true

tasks:
  - id: subflow
    type: io.kestra.plugin.core.flow.Subflow
    flowId: subflow
    namespace: company.team
    inputs:
      debug: "{{ vars.debug }}"

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.subflow.outputs.subflow_output }}"
```

In the subflow, the output is explicitly defined to make it accessible in the parent flow. This example uses the input to generate the output sent to the parent flow.

```yaml
id: subflow
namespace: company.team

inputs:
  - id: debug
    type: BOOLEAN

tasks:
  - id: return
    type: io.kestra.plugin.core.debug.Return
    format: "Subflow: {{ inputs.debug }}"

outputs:
  - id: subflow_output
    type: STRING
    value: "{{ outputs.return.value }}"
```

## KV Store

Using the [KV Store](../../06.concepts/05.kv-store/index.md), you can set and get values across different flows. This is good if you want to store values without flows directly interacting with one another, like they do with Subflows.
Flows can use the Get and Set tasks to make themselves stateful, allowing one flow to store the state and another to access it when it wants. However, this approach isn't ideal if you don't want these values to be modified by the flows directly.

You can use the `io.kestra.plugin.core.kv.Set` task, as well as the UI, to manage the values in the KV Store. To access them, use the `io.kestra.plugin.core.kv.Get` task, which returns them as an output.

```yaml
id: kv_store
namespace: company.team

variables:
  debug: true

tasks:
  - id: set
    type: io.kestra.plugin.core.kv.Set
    key: debug
    value: "{{ vars.debug }}"
    namespace: "{{ flow.namespace }}"

  - id: get
    type: io.kestra.plugin.core.kv.Get
    key: debug

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.get.value }}"
```

## Namespace Variables

:::alert{type="info"}
This is an [Enterprise Edition](../../07.enterprise/index.mdx) feature.
:::

Using [Namespace Variables](../../07.enterprise/02.governance/07.namespace-management/index.md), you can define values that can be accessed between flows in a namespace, similar to the KV Store. However, these can only be set on the [Namespace page](../../07.enterprise/02.governance/07.namespace-management/index.md). This is useful if you want to access values across flows but do not want to update them dynamically inside your flows.

For example, we can define our variables as key-value pairs in our Namespace:

```yaml
debug: true
state: failed
hello: world
```

We can access them using the `{{ namespace.var_key }}` expression, where `var_key` is the key of our key-value pair.
```yaml
id: global_variables
namespace: company.team

variables:
  debug: true

tasks:
  - id: debug_1
    type: io.kestra.plugin.core.log.Log
    message: "Namespace: {{ namespace.state }}"

  - id: debug_2
    type: io.kestra.plugin.core.log.Log
    message: "Local: {{ vars.debug }}"
```

---
# Set Up Webhooks to Trigger Flows
URL: https://kestra.io/docs/how-to-guides/webhooks
> Trigger Kestra workflows via webhooks. Configure webhook listeners to start flows in response to GitHub events, Slack commands, or any HTTP POST request.

Execute flows using the Webhook Trigger.

Webhooks are HTTP requests that are triggered by an event. They are useful for telling another application to do something, such as starting the execution of a Flow in Kestra. If your provider sends an idempotency key header (e.g., `Idempotency-Key`), map it to `system.correlationId` and add a duplicate guard as shown in [Idempotency with correlation IDs](../idempotency/index.md) to prevent double-processing.

## Using Webhooks in Kestra

You can use webhooks to trigger an execution of your flow in Kestra. To do this, we can make a [trigger](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) with the type `io.kestra.plugin.core.trigger.Webhook`. Once we've done this, we can add a `key` property, which can be random, as this will be used to trigger the webhook. In the example, the `key` is set to `1KERKzRQZSMtLdMdNI7Nkr`, which is what we put at the end of our webhook URL to trigger it.

```yaml
id: webhook_example
namespace: company.team

description: |
  Example flow for a webhook trigger.
  This endpoint doesn't need any login / password and is secured by a `key` that is different for every flow.

tasks:
  - id: out
    type: io.kestra.plugin.core.debug.Return
    format: "{{ trigger | json }}"

triggers:
  - id: webhook_trigger
    type: io.kestra.plugin.core.trigger.Webhook
    # the required key to start this flow - might be passed as a secret
    key: 1KERKzRQZSMtLdMdNI7Nkr
```

The format of the Webhook URL follows: `https://{your_hostname}/api/v1/main/executions/webhook/{namespace}/{flow_id}/{key}` where:

- `your_hostname` is the domain or IP of your server, e.g. example.com
- `namespace` is `company.team`
- `flow_id` is `webhook_example`
- `key` is `1KERKzRQZSMtLdMdNI7Nkr`

With this information, you can test your flow by running the following command in the terminal to trigger the flow:

```bash
curl http://localhost:8080/api/v1/main/executions/webhook/company.team/webhook_example/1KERKzRQZSMtLdMdNI7Nkr
```

You can also copy the formed Webhook URL from the **Triggers** tab.
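Webhook calls can also carry a request payload. As a minimal sketch (the flow id and key below are illustrative, not from the original page), a flow can read a POSTed body through the `trigger` expression:

```yaml
id: webhook_body_example
namespace: company.team

tasks:
  - id: log_payload
    type: io.kestra.plugin.core.log.Log
    # the parsed request body is exposed on the trigger variable
    message: "Received payload: {{ trigger.body }}"

triggers:
  - id: webhook_trigger
    type: io.kestra.plugin.core.trigger.Webhook
    key: 4wjtkzwVGBM9yKnjm3yv8r
```

You could then test it with `curl -X POST -H 'Content-Type: application/json' -d '{"hello": "world"}' http://localhost:8080/api/v1/main/executions/webhook/company.team/webhook_body_example/4wjtkzwVGBM9yKnjm3yv8r`.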
## Webhooks in Kestra EE

Use Kestra Secrets to store the webhook key.

From the left navigation menu on the Kestra UI, navigate to `Namespaces`. Click on the namespace under which you want to create the flow with the webhook trigger. We will use the `company.team` namespace for this example. On the corresponding namespace page, navigate to the `Secrets` tab. Click on the `New secret` button at the top, and create a new secret with `Key` as `WEBHOOK_KEY` (you may choose any appropriate name) and `Secret` as the webhook key value. Let us use `1KERKzRQZSMtLdMdNI7Nkr` for this example. Once you've done that, save the secret.

![navigate_to_secrets](./navigate_to_secrets.png)

![assign_secret_value](./assign_secret_value.png)

Create the flow in the same namespace where you defined the `WEBHOOK_KEY` secret. The flow will use the webhook trigger, like this:

```yaml
id: webhook_ee_example
namespace: company.team

description: |
  Example flow for a webhook trigger in Kestra EE.
  This endpoint doesn't need any login / password and is secured by a `key` that is different for every flow.

tasks:
  - id: out
    type: io.kestra.plugin.core.debug.Return
    format: "{{ trigger | json }}"

triggers:
  - id: webhook_trigger
    type: io.kestra.plugin.core.trigger.Webhook
    # the required key to start this flow - might be passed as a secret
    key: "{{ secret('WEBHOOK_KEY') }}"
```

In the `triggers` section of the flow, the secret is referenced in the `key` as `{{ secret('WEBHOOK_KEY') }}` rather than hardcoding the webhook key directly.

The format of the Webhook URL follows: `https://{your_hostname}/api/v1/{tenant_id}/executions/webhook/{namespace}/{flow_id}/{key}` where:

- `your_hostname` is the domain or IP of your server, e.g.
example.com
- `tenant_id` is the tenant ID belonging to your Kestra EE account
- `namespace` is `company.team`
- `flow_id` is `webhook_ee_example`
- `key` is `1KERKzRQZSMtLdMdNI7Nkr`

With this information, you can test your flow by running the following command in the terminal to trigger the flow:

```bash
curl http://my.kestra.cloud/api/v1/my_tenant/executions/webhook/company.team/webhook_ee_example/1KERKzRQZSMtLdMdNI7Nkr
```

---
# Install Kestra: Docker, Kubernetes, VM, and JAR
URL: https://kestra.io/docs/installation
> Overview of Kestra installation methods, including Docker, Kubernetes, Virtual Machines, and Standalone JAR.

import ChildCard from "~/components/docs/ChildCard.astro"

Install Kestra using the method that fits your environment. You can deploy Kestra from a laptop or on-prem server to a distributed cluster in a public cloud.

Some plugins such as the [Script plugin](../16.scripts/index.mdx) require Docker-in-Docker (DinD), which is not supported in some environments like AWS Fargate. For production, use Kubernetes or a virtual machine.

If you're looking for a fully-managed orchestration platform without the overhead of infrastructure maintenance, Kestra Cloud is currently in early access; [sign up here](/cloud).

The easiest way to install Kestra locally is to use [Docker](./02.docker/index.md).

## Choose how to install Kestra for your environment

---
# Deploy Kestra on AWS EC2 – RDS and S3 Backend
URL: https://kestra.io/docs/installation/aws-ec2
> Install Kestra on AWS EC2 using Amazon RDS for the database and S3 for internal storage backend.

Install Kestra on AWS EC2 with a PostgreSQL RDS database and S3 internal storage backend.

:::alert{type="info"}
Prefer a one-click option? You can launch Kestra directly from the [AWS Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-uilmngucs45cg).
:::
## Prerequisites

- Basic knowledge about using a command line interface
- Basic knowledge about EC2, S3, and PostgreSQL

You can find the corresponding [full Terraform configuration in this repository](https://github.com/kestra-io/deployment-templates/tree/main/aws).

## Step 1: Create an EC2 instance & install Docker

First, create an EC2 instance. To do so, [go to the AWS console and choose EC2](https://eu-north-1.console.aws.amazon.com/ec2/home).

1. Give a name to your instance.
2. Choose Ubuntu as your OS.
3. Instance type: Kestra requires at least 4GiB of memory and 2 vCPUs to run correctly. Choosing t3.medium is a good starting point.
4. Create a key pair to securely connect to your instance. This key is needed to connect through SSH in the following steps.
5. Create a security group that allows SSH traffic from your IP and also allows HTTPS traffic.

![ec2 creation](./ec2_setup1.png)

![ec2 key pair](./ec2_setup2.png)

![ec2 network subgroup](./ec2_setup3.png)

You can now click on **Launch instance** and wait a few seconds for the instance to be up and running.

Once running, open a terminal on your laptop and connect to your instance through SSH: `ssh -i <your-key.pem> ubuntu@<instance-public-ip>`

Kestra can be run directly from the `.jar` binary or using Docker. We use Docker here for quicker setup:

1. Install Docker on the EC2 instance. [You can find the last updated instruction on the Docker website](https://docs.docker.com/engine/install/ubuntu/).
2. [Install Docker Compose](https://docs.docker.com/compose/install/).

To check your installation, run `docker version` and `docker compose version`. You're now ready to download and launch the Kestra server.
## Step 2: Download and run Kestra

Download the official Docker Compose file:

```bash
curl -o docker-compose.yml \
https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```

Use an editor such as Vim to modify the `docker-compose.yml`, set basic authentication to `true`, and configure your basic authentication credentials to secure your Kestra instance. Make sure to add a valid email address too.

```yaml
kestra:
  server:
    basic-auth:
      enabled: true
      username: admin@kestra.io # it must be a valid email address
      password: kestra
```

Next, use the following command to start the Kestra server:

```bash
docker compose up -d
```

## Step 3: Allow external traffic

Kestra is now running, and the Kestra server exposes traffic on the `8080` port. To connect through your web browser, update the inbound traffic rules in the EC2 security group.

Go to the EC2 console and select Security Group. Choose the security group attached to your EC2 instance and add a new inbound rule to open access to the `8080` port. If you did not select an existing security group when creating the instance, the security group will be prefixed with "launch-wizard-".

If you want to only allow traffic coming from your IP address, set the source to your own IP. If you want to make it open to the entire Internet, leave it at `0.0.0.0/0`.

:::alert{type="warning"}
Note that if you haven't set up basic authentication in the previous step, your Kestra instance will be publicly available to anyone without any access restriction.
:::

![ec2 inbound rules](./ec2_security_group_port_inbound_rules.png)

You can now access your Kestra instance and create, edit, and run Flows.

## Step 4: Use AWS RDS PostgreSQL as a database backend

This first installation relies on a PostgreSQL database running alongside the Kestra server on the EC2 instance (see the PostgreSQL service running in Docker Compose). For a simple proof of concept (PoC), you can keep the PostgreSQL database running in Docker.
However, for a production-grade installation, we recommend a managed database service such as [AWS RDS](https://aws.amazon.com/rds/).

**Create an AWS RDS database**

1. Go to the [RDS console](https://eu-north-1.console.aws.amazon.com/rds/home).
2. Create a database and choose PostgreSQL (Kestra also supports MySQL, but PostgreSQL is recommended).
3. Set a username and password.
4. On the connectivity configuration, choose “Connect to an EC2 compute resource” and choose your EC2 instance.
5. Also select the existing DB subnet group and existing VPC security group, and choose the one attached to your EC2 instance.
6. Fine-tune instance class and storage type to avoid incurring unnecessary AWS costs. For a first step, a small PostgreSQL instance is enough.
7. Click create and wait for completion.

![RDS setup](./rds_setup1.png)

![RDS password](./rds_setup2.png)

![RDS connectivity](./rds_setup3.png)

**Create Kestra database**

Before attaching your Kestra server to the new database backend, initialize the database with a base schema as follows:

1. Connect to your EC2 instance with ssh.
2. Install a PostgreSQL client: `sudo apt-get install postgresql-client`.
3. Create the Kestra database: `createdb -h <your-rds-endpoint> -U <your-username> -p 5432 kestra`.

**Update Kestra configuration**

In the Docker Compose configuration, edit the `datasources` property of the Kestra service in the following way:

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://<your-rds-endpoint>:5432/kestra
    driver-class-name: org.postgresql.Driver
    username: <your-username>
    password: <your-password>
```

Because you now use the RDS service, you do not need the PostgreSQL service anymore. Remove it from the `docker-compose.yml` file. For the changes to take effect, restart the Docker services with `docker compose restart` or `docker compose up -d`.

## Step 5: Use AWS S3 for storage

By default, internal storage is implemented using the local file system. This section guides you through changing the storage backend to S3 to ensure more reliable, durable, and scalable storage.

1.
Go to the S3 console and create a bucket.
2. Go to IAM and create a new User Group with AWS S3 full access.
3. Create a new user and attach it to the user group.
4. For the new user, go to **Security Credentials** and create an access key. Choose “Application running on an AWS compute service” and retrieve the access and secret keys.
5. Edit the Kestra storage configuration.

```yaml
kestra:
  storage:
    type: s3
    s3:
      access-key: "<your-aws-access-key-id>"
      secret-key: "<your-aws-secret-access-key>"
      region: "<your-aws-region>"
      bucket: "<your-s3-bucket-name>"
```

6. Restart the Docker services.

![S3 iam](./IAM-usergroup.png)

For more information on S3 storage configuration, check out the [Runtime and Storage configuration guide](../../configuration/02.runtime-and-storage/index.md).

## Next steps

This setup provides the easiest starting point for running Kestra in production on a single machine. For a deployment to a distributed cluster, check the [Kubernetes deployment guide](../03.kubernetes/index.md).

Reach out via [Slack](/slack) if you encounter any issues, or if you have any questions regarding deploying Kestra to production. Make sure to also check the [CI/CD guide](../../version-control-cicd/cicd/index.md) to automate your workflow deployments based on changes in Git.

---
# Deploy Kestra on Azure VM – Azure Database Backend
URL: https://kestra.io/docs/installation/azure-vm
> Deploy Kestra on an Azure Virtual Machine using Azure Database for PostgreSQL and Azure Blob Storage.

Install Kestra on an Azure VM with Azure Database for PostgreSQL as the database backend and Azure Blob Storage as the internal storage backend.

:::alert{type="info"}
Prefer an Azure-native option? You can deploy Kestra directly from the [Azure Marketplace listing](https://marketplace.microsoft.com/en-us/product/AzureApplication/kestra_technologies.kestra-open-source-official).
:::

## Deploy Kestra on an Azure VM with Azure Database

Prerequisites:

- Basic command-line interface (CLI) skills.
- Familiarity with Azure and PostgreSQL.
## Create an Azure VM

First, create a virtual machine using Azure Virtual Machines. To do so, go to the Azure portal and choose [Virtual Machines](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.Compute%2FVirtualMachines).

1. Click **Create** and select **Azure Virtual Machine**.
2. Choose an appropriate **Subscription** and **Resource Group**.
3. Give a name for your VM, and choose a **Region** where it should be launched.
4. For **Availability options**, choose **Availability zone**, and keep the default availability zone.
5. For **Image**, choose **Ubuntu Server 22.04 LTS - x64 Gen2**, and **x64** as the VM architecture.
6. Kestra requires at least 4GiB of memory and 2 vCPUs to run correctly. Choosing the **Size** as **Standard_D2s_v3** is a good starting point.
7. Select **SSH public key** as the **Authentication type**.
8. You can keep the default `azureuser` as the **Username**.
9. For **SSH public key source**, you can select **Generate new key pair** and provide an appropriate name for the key pair.
10. For **Public inbound ports**, choose **Allow selected ports** and from the **Select inbound ports** dropdown, select **HTTPS** and **SSH**.
11. Click **Review + Create**.
12. You can now review the configurations and click on **Create**. On the **Generate new key pair** popup, click **Download private key and create resource**.

![vm setup1](./vm_setup1.png)

![vm setup2](./vm_setup2.png)

![vm setup3](./vm_setup3.png)

Wait until the virtual machine is up and running.

![vm setup4](./vm_setup4.png)

## Install Docker

In your terminal, run the following commands to SSH into the virtual machine:

```shell
chmod 400 <your-key.pem>
ssh -i <your-key.pem> azureuser@<vm-public-ip>
```

Kestra can be started using a `.jar` binary or Docker. In this guide, we’ll use Docker for a quick setup:

1. Install Docker on the Azure VM instance. You can find the last updated [instruction on the Docker website](https://docs.docker.com/engine/install/ubuntu/).
2.
[Install Docker Compose](https://docs.docker.com/compose/install/).

To check your installation, run `sudo docker version` and `sudo docker compose version`. You're now ready to download and launch the Kestra server.

## Install Kestra

Download the official Docker Compose file:

```bash
curl -o docker-compose.yml \
https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```

Use an editor such as Vim to modify the `docker-compose.yml`, set basic authentication to `true`, and configure your basic authentication credentials to secure your Kestra instance.

```yaml
kestra:
  server:
    basic-auth:
      enabled: true
      username: admin@kestra.io # it must be a valid email address
      password: kestra
```

Next, use the following command to start the Kestra server:

```bash
sudo docker compose up -d
```

## Allow external traffic

Kestra is now running, and the Kestra server exposes traffic on the `8080` port. To connect through your web browser, update the inbound traffic rules in the Azure security group.

1. Go to the Virtual Machines console and select the recently created virtual machine.
2. On the left-side navigation menu, click **Networking**.
3. Under the **Inbound port rules** tab, click the **Add inbound port rule** button.
4. In the **Add inbound security rule** page, put **Destination port ranges** as `8080`. You can keep the default values for the remaining properties. Finally, click **Add** at the bottom of the page.

If you want to only allow traffic coming from your local machine, set the **Source** to your own IP address. To open the instance to the entire Internet, leave it as **Any**.

![vm choose_networking](./vm_choose_networking.png)

![vm inbound_port](./vm_inbound_port.png)

:::alert{type="warning"}
If you haven’t set up basic authentication, your Kestra instance will be publicly accessible to anyone without authentication.
:::

You can now access your Kestra instance and start developing flows.
## Launch Azure Database

This first installation relies on a PostgreSQL database running alongside the Kestra server on the VM instance (see the PostgreSQL service running in Docker Compose). For a simple proof of concept (PoC), you can keep the PostgreSQL database running in Docker. However, for a production-grade installation, we recommend a managed database service such as Azure Database for PostgreSQL servers.

**Launch a database using Azure Database for PostgreSQL servers**

1. Go to the [Azure Database for PostgreSQL servers](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.DBforPostgreSQL%2Fservers).
2. Click on **Create Azure Database for PostgreSQL server** (Kestra also supports MySQL, but PostgreSQL is recommended).
3. Choose an appropriate **Subscription** and **Resource Group**.
4. Put an appropriate **Server name** and select the preferred **Region**.
5. Choose the latest **PostgreSQL version**. We recommend version 17.
6. Select the **Workload type** as per your requirement.
7. Choose **Authentication method** as **PostgreSQL authentication only**.
8. Provide an appropriate **Admin username** and **Password**, and re-enter the password in **Confirm password**.
9. Click **Next: Networking**.
10. Check the checkbox for **Allow public access from any Azure service within Azure to this server**.
11. Click **Review + Create**. Review the configurations and click **Create**.
12. Wait for the database to be provisioned.

![db_setup1](./db_setup1.png)

![db_setup2](./db_setup2.png)

![db_setup3](./db_setup3.png)

**Create a Kestra database**

1. Go to the database overview page and click **Databases** from the left-side navigation menu.
2. Click **Add**.
3. Put an appropriate database name and click **Save** at the top.
**Update Kestra configuration**

In the `docker-compose.yml` file, edit the `datasources` property of the Kestra service to point Kestra to your Azure database:

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://<your-azure-db-host>:5432/<your-database-name>
    driver-class-name: org.postgresql.Driver
    username: <your-admin-username>
    password: <your-admin-password>
```

Because you now use the "Azure Database for PostgreSQL servers" service, you don't need the PostgreSQL Docker service anymore. Remove it from the `docker-compose.yml` file. For the changes to take effect, restart the Docker services with `sudo docker compose restart` or `sudo docker compose up -d`.

## Configure Azure Blob Storage

By default, internal storage is implemented using the local file system. This section guides you through changing the storage backend to Blob Storage to ensure more reliable, durable, and scalable storage.

1. Go to [Storage Accounts](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.Storage%2FStorageAccounts).
2. Click **Create**.
3. Choose an appropriate **Subscription** and **Resource Group**.
4. Put an appropriate **Storage account name** and select the preferred **Region**.
5. Select **Performance** and **Redundancy** as per your requirement.
6. Click **Review**, and after reviewing the configurations, click **Create**.
7. Click on the newly created storage account.
8. On the storage account overview page, click **Containers** from the left-side navigation menu.
9. Click the **Create** button at the top to create a new container.
10. Put an appropriate name for the container and click **Create**. A new container will be created.
11. Now, click **Access keys** from the left-side navigation menu.
12. For one of the keys, either key1 or key2, click **Show** for the **Connection string** and click the **Copy to clipboard** button.
13. Make a note of the connection string for later use. We will require this for configuring the storage backend.
14. Edit the Kestra storage configuration in the `docker-compose.yml` file.
```yaml
kestra:
  storage:
    type: azure
    azure:
      container: "<your-container-name>"
      endpoint: "https://<your-storage-account-name>.blob.core.windows.net/"
      connection-string: "<your-connection-string>"
```

For the changes to take effect, restart the Docker services with `sudo docker compose restart` or `sudo docker compose up -d`.

For more information on Azure Blob storage configuration, check out the [Runtime and Storage configuration guide](../../configuration/02.runtime-and-storage/index.md).

## Next steps

This setup provides a simple starting point for running Kestra in production on a single machine. For a deployment to a distributed Kubernetes cluster, check the [Azure AKS deployment guide](../06.kubernetes-azure-aks/index.md).

Reach out via [Slack](/slack) if you encounter any issues or have any questions regarding deploying Kestra to production. Also, check the [CI/CD guide](../../version-control-cicd/cicd/index.md) to automate your workflow deployments based on changes in Git.

---
# Deploy Kestra on DigitalOcean – Managed DB Setup
URL: https://kestra.io/docs/installation/digitalocean-droplet
> Install Kestra on a DigitalOcean Droplet with Managed Database and Spaces Object Storage for a cloud-native setup.

Install Kestra on a DigitalOcean Droplet with Managed Database as the database backend.

## Prerequisites

- Basic knowledge about using a command line interface
- Basic knowledge about DigitalOcean and PostgreSQL

## Create a DigitalOcean Droplet

Go to the DigitalOcean portal and choose [Droplets](https://www.digitalocean.com/products/droplets) from the left navigation bar.

1. On the Droplets page, click **Create Droplet**.
2. Choose an appropriate region.
3. Choose `Ubuntu` as the OS image with the latest version.
4. Kestra requires at least 4 GiB of memory and 2 vCPUs. The `Basic` plan with `Regular` CPU and 4 GiB / 2 vCPU is a good starting point.
5. You can choose an appropriate authentication method: SSH Key or Password based.
6. Provide an appropriate hostname and click on the `Create Droplet` button at the bottom.
![droplet_setup1](./droplet_setup1.png)
![droplet_setup2](./droplet_setup2.png)
![droplet_setup3](./droplet_setup3.png)
![droplet_setup4](./droplet_setup4.png)

Wait until the virtual machine is up and running. From the Droplets page, you can navigate to the recently created Droplet and open the machine's console by clicking the `Console` button at the top.

![droplet_setup5](./droplet_setup5.png)

## Install Docker

Once in the console terminal, you can run the commands to install Kestra. Kestra can be started directly from a `.jar` binary or using Docker. We use Docker here for a quick setup:

1. Install Docker on the Droplet. [You can find the latest instructions on the Docker website](https://docs.docker.com/engine/install/ubuntu/).
2. [Install Docker Compose](https://docs.docker.com/compose/install/).

To check your installation, run `sudo docker version` and `sudo docker compose version`. You're now ready to download and launch the Kestra server.

## Install Kestra

Download the official Docker Compose file:

```bash
curl -o docker-compose.yml \
  https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```

Use an editor such as Vim to modify the `docker-compose.yml`, set basic authentication to `true`, and configure your basic authentication credentials to secure your Kestra instance.

```yaml
kestra:
  server:
    basic-auth:
      enabled: true
      username: admin@kestra.io # it must be a valid email address
      password: kestra
```

Next, use the following command to start the Kestra server:

```bash
sudo docker compose up -d
```

You can now access the Kestra UI at `http://<droplet-ip>:8080` and start developing flows.

## Launch DigitalOcean Database

This first installation relies on a PostgreSQL database running alongside the Kestra server on the VM instance (see the PostgreSQL service in the Docker Compose file). For a simple proof of concept (PoC), you can keep the PostgreSQL database running in Docker.
However, for a production-grade installation, we recommend a managed database service such as DigitalOcean Database.

**Launch a PostgreSQL database using [DigitalOcean Database](https://www.digitalocean.com/products/managed-databases-postgresql)**

1. Go to the [DigitalOcean Databases](https://cloud.digitalocean.com/databases) page.
2. Click the `Create Database` button at the top.
3. Choose an appropriate region and select `PostgreSQL` as the database engine (Kestra also supports MySQL, but PostgreSQL is recommended).
4. Choose the database configuration as per your requirements.
5. Provide an appropriate database cluster name.
6. Click the `Create Database Cluster` button at the bottom of the page.
7. Wait for the database to be provisioned. Generally, this takes around 5 minutes.

![db_setup1](./db_setup1.png)
![db_setup2](./db_setup2.png)
![db_setup3](./db_setup3.png)

8. Once the database is ready, click the `Get Started` button.
9. In the `Add trusted sources` dropdown, select your computer (in case you want to connect to this database from the PostgreSQL client running on your computer) and the `kestra-host` Droplet created in the earlier section.
10. Click `Allow these inbound sources only`.
11. On this page, ensure `Public network` is selected at the top. Take note of the connection details that appear, and click `Continue`.
12. On the next page, click the `Great, I'm done` button.

![db_setup4](./db_setup4.png)
![db_setup5](./db_setup5.png)
![db_setup6](./db_setup6.png)
![db_setup7](./db_setup7.png)

**Create a Kestra database**

1. Go to the database overview page and navigate to the `Users & Databases` tab.
2. Under `Databases`, type an appropriate database name and click `Save`.
![db_setup8](./db_setup8.png)
![db_setup9](./db_setup9.png)

**Update Kestra configuration**

In the Docker Compose configuration, edit the `datasources` property of the Kestra service to point Kestra to your DigitalOcean database (replace the placeholder values with your own):

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://<your-database-host>:25060/<your-database-name>
    driver-class-name: org.postgresql.Driver
    username: doadmin
    password: <your-password>
```

Because you now use the database powered by "DigitalOcean Database", you don't need the PostgreSQL Docker service anymore. Remove it from the `docker-compose.yml` file. You'll also need to delete the `depends_on` section at the end of the YAML file:

```yaml
depends_on:
  postgres:
    condition: service_started
```

To apply the changes, restart the Docker services with `sudo docker compose restart` or `sudo docker compose up -d`.

## Configure Spaces Object Storage

By default, internal storage is implemented using the local file system. This section guides you through changing the storage backend to Spaces Object Storage for more reliable, durable, and scalable storage.

First, create the access key and secret key that can be used to connect to Spaces Object Storage.

1. Navigate to the `API` page from the left navigation menu.
2. Go to the `Spaces Keys` tab.
3. Click the `Generate New Key` button.
4. Provide an appropriate name for the Spaces access key and click `Create Access Key`.
5. A new access key with the given name will be generated. Take note of the secret key, as you will not be able to retrieve it later.

![spaces_api1](./spaces_api1.png)
![spaces_api2](./spaces_api2.png)
![spaces_api3](./spaces_api3.png)

Let's create a bucket in the Spaces Object Storage.

1. Go to the [Spaces Object Storage](https://cloud.digitalocean.com/spaces) page. You can also navigate to it from the left navigation menu.
2. Click the `Create Spaces Bucket` button.
3. Choose an appropriate data center region.
4. Enter an appropriate, unique Spaces Bucket name and select the project in which the Spaces Bucket needs to be created.
5. Click `Create a Spaces Bucket` at the bottom to create the Spaces Bucket.
6. Once the bucket is created, go to the bucket's page and note down the `Original Endpoint`.
7. Edit the Kestra storage configuration in the `docker-compose.yml` file.

```yaml
kestra:
  storage:
    type: minio
    minio:
      endpoint: "<your-spaces-endpoint>"
      port: "443"
      secure: true
      access-key: "<your-access-key>"
      secret-key: "<your-secret-key>"
      region: "FRA1"
      bucket: "<your-bucket-name>"
```

To apply the changes, restart the Docker services with `sudo docker compose restart` or `sudo docker compose up -d`.

![spaces_object_storage1](./spaces_object_storage1.png)
![spaces_object_storage2](./spaces_object_storage2.png)
![spaces_object_storage3](./spaces_object_storage3.png)
![spaces_object_storage4](./spaces_object_storage4.png)

## Next steps

This setup provides a simple starting point for running Kestra in production on a single machine. Reach out via [Slack](/slack) if you encounter any issues or have any questions about deploying Kestra to production. Make sure to also check the [CI/CD guide](../../version-control-cicd/cicd/index.md) to automate your workflow deployments based on changes in Git.

---

# Run Kestra with Docker – Single-Container Setup

URL: https://kestra.io/docs/installation/docker

> Run Kestra in a single Docker container for quick testing and development, with options for custom configuration.

Start Kestra using a single Docker container.
## Install Kestra with a single Docker Container

Once you have Docker running, you can start Kestra with a single command (*if you're running on Windows, make sure to use [WSL](https://docs.docker.com/desktop/wsl/)*):

```bash
docker run --pull=always --rm -it -p 8080:8080 --user=root \
  --name kestra \
  -v kestra_data:/app/storage \
  -v kestra_db:/app/data \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/tmp \
  kestra/kestra:latest server local
```

Open http://localhost:8080 in your browser to launch the UI and start building your first flows.

:::alert{type="info"}
The above command starts Kestra with an embedded H2 database. Storage files are stored on the `kestra_data` Docker volume, and the H2 database is persisted on the `kestra_db` Docker volume. For production-ready persistence with a PostgreSQL database and more configurability, follow the [Docker Compose installation](../03.docker-compose/index.md).
:::

:::alert{type="info"}
**Enterprise Edition images** — log in to the private registry with your license credentials before pulling images:

```bash
docker login registry.kestra.io --username $LICENSEID --password $FINGERPRINT
```

Use `registry.kestra.io/docker/kestra-ee:latest` for the newest image, or pin a specific version such as `registry.kestra.io/docker/kestra-ee:v1.0`. Review the [Enterprise documentation](../../07.enterprise/index.mdx) and [configuration requirements](../../07.enterprise/05.instance/index.mdx) for additional setup guidance. Compare editions in [Open Source vs Enterprise](../../oss-vs-paid/index.md) if you are deciding between versions.
:::

## Configuration

### Using a configuration file

You can adjust Kestra's configuration using a file mounted to the Docker container as a bind volume.
First, create a configuration `.yml` file like the example below:

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://postgres:5432/kestra
    driver-class-name: org.postgresql.Driver
    username: kestra
    password: k3str4
kestra:
  server:
    basic-auth:
      enabled: false
      username: "admin@kestra.io" # It must be a valid email address
      password: kestra
  repository:
    type: postgres
  storage:
    type: local
    local:
      base-path: "/app/storage"
  queue:
    type: postgres
  tasks:
    tmp-dir:
      path: "/tmp/kestra-wd/tmp"
  url: "http://localhost:8080/"
```

:::alert{type="info"}
This configuration is taken from the official [docker-compose.yaml](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml#L33) file and uses a PostgreSQL database; refer to that file for the most up-to-date version.
:::

After creating the configuration file, update the command to mount the file to the container and start Kestra. We also adjust the Kestra command to start a standalone server, as we now have a PostgreSQL database as a backend.

```bash
docker run --pull=always --rm -it -p 8080:8080 --user=root \
  -v $PWD/application.yaml:/etc/config/application.yaml \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/tmp kestra/kestra:latest server standalone --config /etc/config/application.yaml
```

### Using the `KESTRA_CONFIGURATION` environment variable

You can adjust the [Kestra configuration](../../configuration/01.configuration-basics/index.md) by passing the `KESTRA_CONFIGURATION` variable to the Docker container via the `-e` option. This environment variable must be a valid YAML string. Managing a large configuration via a single YAML string can be tedious; consider using a configuration file instead.
First, define an environment variable:

```bash
export KESTRA_CONFIGURATION=$'
datasources:
  postgres:
    url: jdbc:postgresql://postgres:5432/kestra
    driver-class-name: org.postgresql.Driver
    username: kestra
    password: k3str4
kestra:
  server:
    basic-auth:
      enabled: false
      username: "admin@kestra.io" # it must be a valid email address
      password: kestra
  repository:
    type: postgres
  storage:
    type: local
    local:
      base-path: "/app/storage"
  queue:
    type: postgres
  tasks:
    tmp-dir:
      path: /tmp/kestra-wd/tmp
  url: http://localhost:8080/'
```

:::alert{type="info"}
This configuration is taken from the official [docker-compose.yaml](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml#L33) file and uses a PostgreSQL database; refer to that file for the most up-to-date version.
:::

Once configured, pass the `KESTRA_CONFIGURATION` environment variable in a Docker command and adjust the Kestra command to run the standalone server:

```bash
docker run --pull=always --rm -it -p 8080:8080 --user=root \
  -e KESTRA_CONFIGURATION="$KESTRA_CONFIGURATION" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/tmp kestra/kestra:latest server standalone
```

## Official Docker images

The official Kestra Docker images are available on [DockerHub](https://hub.docker.com/r/kestra/kestra) for both `linux/amd64` and `linux/arm64` platforms. Two image variants are available:

- `kestra/kestra:*`
- `kestra/kestra:*-no-plugins`

Both variants are based on the [`eclipse-temurin:21-jre`](https://hub.docker.com/_/eclipse-temurin) Docker image. The `kestra/kestra:*` images include all Kestra [plugins](/plugins) in their **latest versions**. The `kestra/kestra:*-no-plugins` images do not contain any plugins. Use the `kestra/kestra:*` version to access all available plugins.

## Docker image tags

The following tags are available for each Docker image (append `-no-plugins` to any image to exclude all but Kestra core plugins):

- `latest`: The most recent stable release (rolling tag).
Intended for trying new features; not an LTS. Support ends when the next stable release (~2 months) becomes available.
- `latest-lts`: The current Long-Term Support release (rolling tag). Tracks the active LTS line (updates roughly every 6 months to the new LTS) and receives fixes for ~1 year.
- `v<major>.<minor>`: Minor-series floating tag (e.g., `v1.0`) that always points to the latest patch of that series (e.g., `v1.0.5`). Use it when you want automatic patch updates but want to stay on a minor line.
- `v<major>.<minor>.<patch>`: Immutable tag for an exact version (e.g., `v1.0.1`). Never changes; **best for locked-down production.**
- `develop`: Nightly/continuous build from the `develop` branch. Unstable and not recommended for production; use it only for testing.

The **default Kestra image** `kestra/kestra:latest` already includes **all plugins**. To use a lightweight version of Kestra without plugins, add the `-no-plugins` suffix.

### Recommended images for production

For production deployments, choose one of the following:

**Latest stable version** to stay most up to date while remaining stable (note that this is a rolling tag that changes quite frequently):

- `kestra/kestra:latest` — latest stable with all plugins
- `kestra/kestra:latest-no-plugins` — latest stable without plugins

**Pinned versions** for maximum stability:

- `kestra/kestra:v<major>.<minor>.<patch>` — all plugins included
- `kestra/kestra:v<major>.<minor>.<patch>-no-plugins` — no bundled plugins, only Kestra core plugins

**LTS rolling tag** if you want automatic updates within the LTS line:

- `kestra/kestra:latest-lts`
- `kestra/kestra:latest-lts-no-plugins`

### Recommended images for development

For development or testing new features:

- `kestra/kestra:latest` — latest stable with all plugins
- `kestra/kestra:latest-no-plugins` — latest stable without plugins
- `kestra/kestra:develop` / `kestra/kestra:develop-no-plugins` — daily builds with unreleased features, unstable

## Build a custom Docker image

If the base or full image doesn't contain package dependencies you need, you can build a
custom image by using the Kestra base image and adding the required binaries and dependencies.

### Add custom binaries

The following `Dockerfile` creates a new image from the Kestra base image and adds the `golang` binary along with Python packages:

```dockerfile
ARG IMAGE_TAG=latest
FROM kestra/kestra:$IMAGE_TAG

RUN mkdir -p /app/plugins && \
    apt-get update -y && \
    apt-get install -y --no-install-recommends golang && \
    apt-get install -y pip && \
    pip install pandas==2.0.3 requests==2.31.0 && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /var/tmp/*
```

### Add plugins to a Docker image

By default, the base Docker image `kestra/kestra:latest` contains all plugins (unless you use the `kestra/kestra:latest-no-plugins` version). You can add specific plugins to the base image and build a custom image. The following `Dockerfile` creates an image from the base image and adds the `plugin-aws`, `storage-gcs`, and `plugin-gcp` plugins using the command `kestra plugins install`:

```dockerfile
ARG IMAGE_TAG=latest-no-plugins
FROM kestra/kestra:$IMAGE_TAG

RUN /app/kestra plugins install \
    io.kestra.plugin:plugin-aws:LATEST \
    io.kestra.storage:storage-gcs:LATEST \
    io.kestra.plugin:plugin-gcp:LATEST
```

### Add custom plugins to a Docker image

The above `Dockerfile` installs plugins that have already been published to [Maven Central](https://central.sonatype.com/). If you are developing a custom plugin, make sure to build it following our [plugin developer guide](../../plugin-developer-guide/index.mdx).
Once the `shadowJar` is built, add it to the plugins directory:

```dockerfile
ARG IMAGE_TAG=latest
FROM kestra/kestra:$IMAGE_TAG

RUN mkdir -p /app/plugins
COPY /build/libs/*.jar /app/plugins
```

### Add custom plugins from a Git repository

If you would like to build custom plugins from a specific Git repository, you can use the following approach:

```dockerfile
FROM openjdk:17-slim AS stage-build
WORKDIR /
USER root
RUN apt-get update -y
RUN apt-get install git -y && \
    git clone https://github.com/kestra-io/plugin-aws.git
RUN cd plugin-aws && ./gradlew :shadowJar

FROM kestra/kestra:latest

## https://github.com/WASdev/ci.docker/issues/194#issuecomment-433519379
USER root

RUN mkdir -p /app/plugins && \
    apt-get update -y && \
    apt-get install -y --no-install-recommends golang && \
    apt-get install -y pip && \
    pip install pandas==2.0.3 requests==2.31.0 && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /var/tmp/*

RUN rm -rf /app/plugins/plugin-aws-*.jar
COPY --from=stage-build /plugin-aws/build/libs/plugin-aws-*.jar /app/plugins
```

This multi-stage Docker build allows you to override a plugin that has already been installed. In this example, the AWS plugin is already included in the `kestra/kestra:latest` image by default, but it is overridden by the plugin built in the first Docker build stage.

---

# Deploy Kestra with Docker Compose – PostgreSQL

URL: https://kestra.io/docs/installation/docker-compose

> Get started with Kestra quickly using Docker Compose with a PostgreSQL backend for a robust local or server deployment.

Start Kestra with a PostgreSQL database backend by using a Docker Compose file.
## Prerequisites

- Install [Docker](https://docs.docker.com/engine/install/) before you begin.
- Make sure [Docker Compose](https://docs.docker.com/compose/install/) is available in your Docker installation.

## Download the Docker Compose file

Download the Docker Compose file using the following command on Linux and macOS:

```bash
curl -o docker-compose.yml \
  https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```

On Windows, use the following command:

```powershell
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml" -OutFile "docker-compose.yml"
```

You can also download the [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml) manually and save it as `docker-compose.yml`.

## Launch Kestra

Use the following command to start the Kestra server:

```bash
docker compose up -d
```

Open the URL `http://localhost:8080` in your browser to launch the UI.

:::alert{type="info"}
**Enterprise Edition images** — log in to the private registry with your license credentials before pulling images:

```bash
docker login registry.kestra.io --username $LICENSEID --password $FINGERPRINT
```

Use `registry.kestra.io/docker/kestra-ee:latest` for the newest image, or pin a specific version such as `registry.kestra.io/docker/kestra-ee:v1.0`. See the [Enterprise documentation](../../07.enterprise/index.mdx) and [configuration requirements](../../07.enterprise/05.instance/index.mdx) for deployment prerequisites. Compare editions in [Open Source vs Enterprise](../../oss-vs-paid/index.md) if you are deciding between versions.
:::

### Adjusting the configuration

The command from the previous section starts a standalone server, with all architectural components running in one JVM. The [configuration](../../configuration/01.configuration-basics/index.md) lives in the `KESTRA_CONFIGURATION` environment variable of the Kestra container.
You can update that environment variable inside the Docker Compose file or pass it as a Docker CLI argument.

:::alert{type="info"}
If you want to extend your Docker Compose file, modify container networking, or if you have any other issues using this Docker Compose file, check the [Troubleshooting Guide](../../10.administrator-guide/16.troubleshooting/index.md). For running Kestra in Docker Compose with each server component as a separate service, see the [multi-component Docker Compose example](../../kestra-cli/kestra-server/index.md#kestra-with-server-components-in-different-services).
:::

### Use a configuration file

If you want to use a configuration file instead of the `KESTRA_CONFIGURATION` environment variable, update the default `docker-compose.yml`. First, create a configuration file containing the `KESTRA_CONFIGURATION` environment variable defined in the `docker-compose.yml` file. You can name it `application.yaml`.

Next, update the `kestra` service in the `docker-compose.yml` file to mount this file into the container and start up Kestra using the `--config` option:

```yaml
## [...]
  kestra:
    image: kestra/kestra:latest
    pull_policy: always
    # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra, which run without a root user.
    user: "root"
    command: server standalone --worker-thread=128 --config /etc/config/application.yaml
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
      - $PWD/application.yaml:/etc/config/application.yaml
    ports:
      - "8080:8080"
      - "8081:8081"
    depends_on:
      postgres:
        condition: service_started
```

:::alert{type="info"}
Check out all of our available [Docker image tags](./../02.docker/index.md#docker-image-tags) to see which one is best for your use case.
::: ### Configure networking in Docker Compose The [default Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml) does not configure networking for the Kestra containers. This means you cannot access services exposed via `localhost` on your local machine, such as another Docker container with a mapped port. Your machine and the Docker container operate on different networks. To use a locally exposed service from the Kestra container, use the `host.docker.internal` hostname or `172.17.0.1`. The `host.docker.internal` address lets you reach your host machine's services from the container. Alternatively, you can use a Docker network. By default, your Kestra container is placed in a `default` network. You can add your custom services to the `docker-compose.yml` file provided by Kestra and use the service aliases, which are the keys in `services`, to reach them. A better approach may be to create a new network such as `kestra_net` and add your services to it. Then add that network to the `networks` section of the `kestra` service. With this configuration, you can access your exposed ports through `localhost`. The example below shows how you can add `iceberg-rest`, `minio`, and `mc` (i.e., MinIO client) to your Kestra Docker Compose file. 
:::collapse{title="Example"}

```yaml
volumes:
  postgres-data:
    driver: local
  kestra-data:
    driver: local

networks:
  kestra_net:

services:
  postgres:
    image: postgres
    volumes:
      - postgres-data:/var/lib/postgresql
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: k3str4
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"]
      interval: 30s
      timeout: 10s
      retries: 10
    networks:
      kestra_net:

  iceberg-rest:
    image: tabulario/iceberg-rest
    ports:
      - 8181:8181
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
      - CATALOG_WAREHOUSE=s3://warehouse/
      - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
      - CATALOG_S3_ENDPOINT=http://minio:9000
    networks:
      kestra_net:

  minio:
    image: minio/minio
    container_name: minio
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=password
      - MINIO_DOMAIN=minio
    networks:
      kestra_net:
        aliases:
          - warehouse.minio
    ports:
      - 9001:9001
      - 9000:9000
    command: ["server", "/data", "--console-address", ":9001"]

  mc:
    depends_on:
      - minio
    image: minio/mc
    container_name: mc
    networks:
      kestra_net:
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    entrypoint: >
      /bin/sh -c "
      until (/usr/bin/mc config host add minio http://minio:9000 admin password) do echo '...waiting...' && sleep 1; done;
      /usr/bin/mc rm -r --force minio/warehouse;
      /usr/bin/mc mb minio/warehouse;
      /usr/bin/mc policy set public minio/warehouse;
      tail -f /dev/null
      "

  kestra:
    image: kestra/kestra:latest
    pull_policy: always
    entrypoint: /bin/bash
    # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra, which run without a root user.
    user: "root"
    command:
      - -c
      - /app/kestra server standalone --worker-thread=128
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driver-class-name: org.postgresql.Driver
            username: kestra
            password: k3str4
        kestra:
          server:
            basic-auth:
              username: admin
              password: kestra
          repository:
            type: postgres
          storage:
            type: minio
            minio:
              endpoint: http://minio
              port: 9000
              access-key: admin
              secret-key: password
              region: us-east-1
              bucket: warehouse
          queue:
            type: postgres
          tasks:
            tmp-dir:
              path: /tmp/kestra-wd/tmp
          url: http://localhost:8080/
    ports:
      - "8080:8080"
      - "8081:8081"
    depends_on:
      postgres:
        condition: service_started
    networks:
      kestra_net:
```
:::

Finally, you can use `host` network mode for the `kestra` service. This makes the container use your host network, so it can reach all exposed ports. In that case, change `services.kestra.environment.KESTRA_CONFIGURATION.datasources.postgres.url` to `jdbc:postgresql://localhost:5432/kestra`. This is the easiest way to reach all ports, but it can be a security risk. See the example below using `network_mode: host`.
:::collapse{title="Example"}

```yaml
volumes:
  kestra-data:
    driver: local

services:
  kestra:
    image: kestra/kestra:latest
    pull_policy: always
    entrypoint: /bin/bash
    network_mode: host
    environment:
      JAVA_OPTS: "--add-opens java.base/java.nio=ALL-UNNAMED"
      NODE_OPTIONS: "--max-old-space-size=4096"
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://localhost:5432/kestra
            driver-class-name: org.postgresql.Driver
            username: kestra
            password: k3str4
        kestra:
          server:
            basic-auth:
              username: admin
              password: kestra
          anonymous-usage-report:
            enabled: true
          repository:
            type: postgres
          storage:
            type: local
            local:
              base-path: "/app/storage"
          queue:
            type: postgres
          tasks:
            tmp-dir:
              path: /tmp/kestra-wd/tmp
            scripts:
              docker:
                volume-enabled: true
            defaults: # Example demonstrating global pluginDefaults
              - type: io.kestra.plugin.airbyte.connections.Sync
                url: http://host.docker.internal:8000/
                username: airbyte
                password: password
          url: http://localhost:8080/
          variables:
            env-vars-prefix: "" # To avoid requiring KESTRA_ prefix on env vars
```
:::

### PostgreSQL 16 incompatibility error

By default, the Docker Compose template uses the latest image for PostgreSQL. However, if you initialized your Kestra database on an older version of PostgreSQL, you might encounter the following error:

```plaintext
The data directory was initialized by PostgreSQL version 16, which is not compatible with this version 17.0 (Debian 17.0-1.pgdg120+1).
```

To resolve this, specify a specific tag for the PostgreSQL image in your Docker Compose file.
In the example below, we specify `16`, as the database in the error above was initialized by PostgreSQL 16:

```yaml
services:
  postgres:
    image: postgres:16
```

### SIGILL in Java Runtime Environment on macOS M4 chip

Add the following environment variable to your Kestra container: `-e JAVA_OPTS="-XX:UseSVE=0"`:

```bash
docker run --pull=always --rm -it -p 8080:8080 --user=root -e JAVA_OPTS="-XX:UseSVE=0" --name kestra -v kestra_data:/app/storage -v kestra_db:/app/data -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp kestra/kestra:latest server local
```

To apply the same setting in a Docker Compose file:

```yaml
services:
  kestra:
    image: kestra/kestra:latest
    environment:
      JAVA_OPTS: "-XX:UseSVE=0"
```

## Kestra with server components in different services

Server components can run independently from each other; each communicates with the others through the database. The `kestra server` command starts each server component individually:

- `kestra server executor`
- `kestra server worker`
- `kestra server indexer`
- `kestra server scheduler`
- `kestra server webserver`

For more details on Kestra server commands, check out the [Server CLI documentation](../../kestra-cli/kestra-server/index.md).

Here is an example Docker Compose configuration file running Kestra services with replicas on the Postgres database backend.
```yaml
volumes:
  postgres-data:
    driver: local
  kestra-data:
    driver: local

services:
  postgres:
    image: postgres
    volumes:
      - postgres-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: k3str4
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"]
      interval: 30s
      timeout: 10s
      retries: 10

  kestra-scheduler:
    image: kestra/kestra:latest
    deploy:
      replicas: 2
    pull_policy: if_not_present
    user: "root"
    command: server scheduler
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: &common_configuration |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driver-class-name: org.postgresql.Driver
            username: kestra
            password: k3str4
        kestra:
          server:
            basic-auth:
              username: "admin@kestra.io"
              password: kestra
          repository:
            type: postgres
          storage:
            type: local
            local:
              base-path: "/app/storage"
          queue:
            type: postgres
          tasks:
            tmp-dir:
              path: /tmp/kestra-wd/tmp
    ports:
      - "8082-8083:8081"
    depends_on:
      postgres:
        condition: service_started

  kestra-worker:
    image: kestra/kestra:latest
    deploy:
      replicas: 2
    pull_policy: if_not_present
    user: "root"
    command: server worker
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: *common_configuration
    ports:
      - "8084-8085:8081"
    depends_on:
      postgres:
        condition: service_started

  kestra-executor:
    image: kestra/kestra:latest
    deploy:
      replicas: 2
    pull_policy: if_not_present
    user: "root"
    command: server executor
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: *common_configuration
    ports:
      - "8086-8087:8081"
    depends_on:
      postgres:
        condition: service_started

  kestra-webserver:
    image: kestra/kestra:latest
    deploy:
      replicas: 1
    pull_policy: if_not_present
    user: "root"
    command: server webserver
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: *common_configuration
      KESTRA_URL: http://localhost:8080/
    ports:
      - "8080:8080"
      - "8081:8081"
    depends_on:
      postgres:
        condition: service_started
```

---

# Deploy Kestra on GCP VM – Cloud SQL and GCS

URL: https://kestra.io/docs/installation/gcp-vm

> Deploy Kestra on a Google Cloud Platform (GCP) VM with Cloud SQL and Google Cloud Storage (GCS).

Install Kestra on a GCP VM with Cloud SQL as the database backend and Cloud Storage as the internal storage backend.
## Prerequisites

- Basic command-line interface (CLI) skills.
- Familiarity with Compute Engine, Cloud Storage, and PostgreSQL.

## Create a VM instance

First, create a VM instance using Compute Engine. To do so, [go to the GCP console and choose Compute Engine](https://console.cloud.google.com/compute/instances).

1. Click the **Create Instance** button at the top.
2. Give a name to your instance.
3. Choose an appropriate Region and Zone.
4. Choose the **General Purpose** machine of the **E2** series.
5. Machine type: Kestra requires at least 4 GiB of memory and 2 vCPUs to run correctly. The **Preset** machine type **e2-standard-2** is a good starting point.
6. Click **Change** in the **Boot Disk** section to change the image.
7. Under the "Public Images" tab, choose **Ubuntu** as the operating system and the **Ubuntu 22.04 LTS** version.
8. Continue with the **Allow default access** access scope, and select **Allow HTTPS traffic** in the Firewall section.

![vm creation_1](./vm_setup1.png)
![vm creation 2](./vm_setup2.png)
![change_boot_disk_image](./vm_setup3.png)
![vm_creation_3](./vm_setup4.png)

You can now click **Create** and wait a few seconds for the VM instance to be up and running.

## Install Docker

Click the **SSH** button on the right side of the VM instance details to SSH into the VM instance terminal. Click the **Authorize** button in the pop-up to authorize the SSH connection into the VM instance.

![ssh_into_vm](./ssh_into_vm.png)

Kestra can be started directly from a `.jar` binary or using Docker. We use Docker here for a quicker setup. Install Docker on the GCP VM instance. You can find the latest [instructions on the Docker website](https://docs.docker.com/engine/install/ubuntu/).

To check your installation, run `sudo docker version` and `sudo docker compose version`. You're now ready to download and launch the Kestra server.
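Before moving on, it can help to confirm that everything the next steps rely on is actually on the PATH. A minimal, hypothetical preflight helper (the `need` function is a local convenience, not part of Kestra or Docker):

```shell
# Hypothetical helper: check that a command exists before continuing.
need() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 - install it before continuing" >&2
    return 1
  fi
}

need sh                # always present; demonstrates the success path
need docker || echo "docker is missing; revisit the Install Docker step"
```

On a correctly prepared VM, `need docker` prints `found: docker` and the fallback message never appears.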
## Install Kestra

Download the official Docker Compose file:

```bash
curl -o docker-compose.yml \
  https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```

Use an editor such as Vim to modify the `docker-compose.yml`, set basic authentication to `true`, and configure your basic authentication credentials to secure your Kestra instance:

```yaml
kestra:
  server:
    basic-auth:
      enabled: true
      username: admin@kestra.io # it must be a valid email address
      password: kestra
```

To reach the UI from your browser, you may need to open port 8080: from the VM's network interface details, go to the firewall policies and create a firewall rule allowing ingress TCP traffic on port 8080, as shown in the screenshots below.

![vm_network_details_option](./vm_network_details_option.png)
![vm_network_interface_details](./vm_network_interface_details.png)
![vm_firewall_policies](./vm_firewall_policies.png)
![vm_create_firewall_rule](./vm_create_firewall_rule.png)

:::alert{type="warning"}
Note that if you haven't set up basic authentication in the previous step, your Kestra instance will be publicly accessible to anyone without authentication.
:::

You can now access your Kestra instance and start developing flows.

## Launch Cloud SQL

This first installation relies on a PostgreSQL database running alongside the Kestra server on the VM instance (see the PostgreSQL service running in Docker Compose). For a simple proof of concept (PoC), you can keep the PostgreSQL database running in Docker. However, for a production-grade installation, we recommend a managed database service such as Cloud SQL.

**Create a Cloud SQL database**

1. Go to the [Cloud SQL console](https://console.cloud.google.com/sql/instances).
2. Click **Choose PostgreSQL** (Kestra also supports MySQL, but PostgreSQL is recommended).
3. Put an appropriate Instance ID and password for the admin user `postgres`.
4. Select the latest PostgreSQL version from the dropdown.
5. Choose **Enterprise Plus** or **Enterprise** edition based on your requirements.
6. Choose an appropriate preset among **Production**, **Development**, or **Sandbox** as per your requirement.
7. Choose the appropriate region and zonal availability.
8. 
Expand **Show Configuration Options** at the bottom of the page.

![db_choices](./db_choices.png)
![db_setup](./db_setup.png)
![db_show_config](./db_show_config_options.png)

**Enable VM connection to database**

1. Expand the **Connections** section from the dropdown.
2. Uncheck Public IP and check Private IP. If this is your first time using a Private IP connection, you will be prompted to **Set up Connection**.
3. You will then need to choose **Enable API** in the pop-out on the right-hand side.
4. Choose **Use an automatically allocated IP range** and click **Continue**.
5. Click **Create Connection**.

![db_connections](./db_connections.png)
![db_enable_api](./db_enable_api.png)
![db_auto_allocate](./db_auto_allocate.png)
![db_create_connection](./db_create_connection.png)

**Enable deletion**

If you are just testing or would like to be able to delete your instance and all of its data, expand **Data Protection** on the left-hand side and make sure **Enable deletion protection** is UNCHECKED.

![db_deletion_protection](./db_deletion_protection.png)

**Create database user**

1. Go to the database overview page and click **Users** from the left-side navigation menu.
2. Click **Add User Account**.
3. Put an appropriate username and password and click **Add**.

![db_users](./db_users.png)
![db_user_creation](./db_user_creation.png)

**Create Kestra database**

1. Go to the database overview page and click **Databases** from the left-side navigation menu.
2. Click **Create Database**.
3. Put an appropriate database name and click **Create**.
**Update Kestra configuration**

In the Docker Compose configuration, edit the `datasources` property of the Kestra service in the following way (replace the placeholders with your Cloud SQL private IP, database name, and user credentials):

```yaml
datasources:
  postgres:
    url: jdbc:postgresql://<db-private-ip>:5432/<db-name>
    driver-class-name: org.postgresql.Driver
    username: <db-username>
    password: <db-password>
```

And delete the `depends_on` section at the end of the YAML file:

```yaml
depends_on:
  postgres:
    condition: service_started
```

Since you're now using Cloud SQL, you no longer need the PostgreSQL Docker service. Remove it from the `docker-compose.yml` file.

For the changes to take effect, restart the Docker services with `sudo docker compose restart` or `sudo docker compose up -d`.

## Configure GCS

By default, internal storage is implemented using the local file system. This section guides you through changing the storage backend to Cloud Storage to ensure more reliable, durable, and scalable storage.

1. Go to the Cloud Storage console and create a bucket.
2. Go to IAM and select **Service Accounts** from the left-side navigation menu.
3. On the Service Accounts page, click **Create Service Account** at the top of the page.
4. Put the appropriate Service account name and Service account description, and grant the service account **Storage Admin** access. Click **Done**.
5. On the Service Accounts page, click the newly created service account.
6. On the newly created service account page, go to the **Keys** tab at the top of the page and click **Add Key**. From the dropdown, select **Create New Key**.
7. Select the Key type as **JSON** and click **Create**. The JSON key file for the service account will get downloaded.
8. Use the stringified JSON for the configuration. You can generate it with a bash command such as `cat <service-account-key>.json | jq '@json'`.
9. Edit the Kestra storage configuration:

```yaml
kestra:
  storage:
    type: gcs
    gcs:
      bucket: "<your-bucket-name>"
      project-id: "<your-project-id>"
      service-account: |
        <stringified-service-account-JSON>
```

To apply the changes, restart the Docker services with `sudo docker compose restart` or `sudo docker compose up -d`.
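If `jq` is not available, "stringifying" the key file simply means encoding its entire content as a single JSON string literal. Here is a minimal Python sketch of the same operation; the key-file name is a placeholder for whatever the GCP console downloaded for you:

```python
import json

def stringify_key_file(path: str) -> str:
    """Return the key file's content as one JSON string literal,
    i.e. the "stringified JSON" form used in the Kestra GCS configuration."""
    with open(path) as f:
        return json.dumps(f.read())

# Example usage (the filename is hypothetical):
# print(stringify_key_file("service-account-key.json"))
```

Paste the resulting single-line string into the `service-account` field of the configuration above.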
## Next steps

This setup provides the easiest starting point for running Kestra in production on a single machine. For a deployment to a distributed cluster on GCP, check the [GKE Kubernetes deployment guide](../05.kubernetes-gcp-gke/index.md).

Reach out via [Slack](/slack) if you encounter any issues or have any questions regarding deploying Kestra to production. Also, check out the [CI/CD guide](../../version-control-cicd/cicd/index.md) to automate your workflow deployments based on changes in Git.

---

# Deploy on Kubernetes with Helm in Kestra
URL: https://kestra.io/docs/installation/kubernetes

> Deploy Kestra on Kubernetes using the official Helm chart, scalable for production with PostgreSQL and object storage.

Install Kestra in a Kubernetes cluster using a Helm chart.
## Prerequisites

- **kubectl** — to interact with your cluster
- **Helm** — to install and manage charts

Refer to the respective documentation if these tools are not yet installed.

## Helm chart repository

Kestra maintains three Helm charts:

1. **`kestra`** — production-ready chart. No dependencies included. Best suited for production deployments with customizable database and storage.
2. **`kestra-starter`** — includes PostgreSQL and Versity (S3-like storage) for evaluation only. Great for getting started quickly and experimenting with Kestra.
3. **`kestra-operator`** — installs the Enterprise Edition Kubernetes Operator.

Chart sources:

- Repository: [helm.kestra.io](https://helm.kestra.io/)
- Source code: [kestra helm chart](https://github.com/kestra-io/kestra/tree/develop/charts/kestra)
- ArtifactHub: [kestra](https://artifacthub.io/packages/helm/kestra/kestra) · [kestra-starter](https://artifacthub.io/packages/helm/kestra/kestra-starter)

:::alert{type="info"}
All default image tags are listed in the [Docker installation guide](../02.docker/index.md).
:::

### Chart configuration resources

To understand available configuration options and compare versions:

- **Compare versions**: See differences between two Helm chart versions on [ArtifactHub](https://artifacthub.io/packages/helm/kestra/kestra?modal=values) using the values comparison modal.
- **Full values reference**: Review all available configuration options in the [values.yaml](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) file on GitHub.

### Starter chart dependencies

The `kestra-starter` chart installs:

- Versity (object storage)
- PostgreSQL (database)

These are not suitable for production.
### Enterprise Edition

To deploy the Enterprise Edition, authenticate before pulling images:

```bash
docker login registry.kestra.io --username $LICENSEID --password $FINGERPRINT
```

Use:

- `registry.kestra.io/docker/kestra-ee:latest`
- or a pinned version such as `registry.kestra.io/docker/kestra-ee:v1.0`

Review [Enterprise requirements](../../07.enterprise/05.instance/index.mdx) before deploying. Compare editions in [Open Source vs Enterprise](../../oss-vs-paid/index.md) if you are deciding between versions.

:::alert{type="info"}
To manage flows declaratively using CRDs, install the [Kestra Kubernetes Operator](../../version-control-cicd/cicd/07.kubernetes-operator/index.md) (Enterprise Edition).
:::

## Install Kestra

Add the chart repository:

```bash
helm repo add kestra https://helm.kestra.io/
helm repo update
```

Install the `kestra-starter` chart:

```bash
helm install my-kestra kestra/kestra-starter
```

This deploys pods for Kestra, PostgreSQL (database), and Versity (storage).

Alternatively, install the `kestra` production chart:

```bash
helm install my-kestra kestra/kestra
```

This deploys Kestra in **standalone mode**—all core components run in a single pod.

:::alert{type="warning"}
The `kestra` chart does not include PostgreSQL or object storage. Configure these before production deployment.
:::

## Access the Kestra UI

To list all pods, run:

```bash
kubectl get pods -n default -l app.kubernetes.io/name=kestra
```

If you installed the `kestra-starter` chart, you will likely see something like:

```perl
my-kestra-kestra-starter-xxxxxx-xxxxx   Running
my-kestra-postgresql-0                  Running
my-kestra-versity-0                     Running
```

The pod you want to port-forward is the **Kestra standalone pod**, usually named:

```perl
my-kestra-kestra-starter-xxxxx
```

If your release is `my-kestra`, the label selector will reliably find it.
Export the pod name:

```bash
export POD_NAME=$(kubectl get pods \
  -l "app.kubernetes.io/name=kestra,app.kubernetes.io/instance=my-kestra,app.kubernetes.io/component=standalone" \
  -o jsonpath="{.items[0].metadata.name}")
```

Check it with:

```bash
echo $POD_NAME
```

Port-forward the UI:

```bash
kubectl port-forward $POD_NAME 8080:8080
```

Open **http://localhost:8080** in your browser and create your user.

## Scaling Kestra on Kubernetes

For production deployments, run each Kestra component in its own pod for improved scalability and resource isolation.

Example `values.yaml`:

```yaml
deployments:
  webserver:
    enabled: true
  executor:
    enabled: true
  indexer:
    enabled: true
  scheduler:
    enabled: true
  worker:
    enabled: true
  standalone:
    enabled: false
```

Apply changes:

```bash
helm upgrade my-kestra kestra/kestra -f values.yaml
```

Validate pod layout:

```bash
kubectl get pods -l app.kubernetes.io/name=kestra
```

## Configuration

Kestra configuration is provided through Helm values and rendered into ConfigMaps and Secrets.

### Minimal example (H2 database for testing only)

```yaml
configurations:
  application:
    kestra:
      queue:
        type: h2
      repository:
        type: h2
      storage:
        type: local
        local:
          basePath: "/app/storage"
    datasources:
      h2:
        url: jdbc:h2:mem:public;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
        username: kestra
        password: kestra
        driverClassName: org.h2.Driver
```

## Using secrets

Secrets can be mounted into Kestra through the `secrets` section and referenced via manifests.

Example: enabling Kafka using a Secret

```yaml
configurations:
  application:
    kestra:
      queue:
        type: kafka
  secrets:
    - name: kafka-server
      key: kafka.yml
```

Secret manifest:

```yaml
extraManifests:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: kafka-server
    stringData:
      kafka.yml: |
        kestra:
          kafka:
            client:
              properties:
                bootstrap.servers: "localhost:9092"
```

## Environment variables

Use `extraEnv` or `extraEnvFrom` to load values from existing Secrets or ConfigMaps.
Example:

```yaml
common:
  extraEnvFrom:
    - secretRef:
        name: basic-auth-secret
```

Secret manifest:

```yaml
extraManifests:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: basic-auth-secret
    stringData:
      basic-auth.yml: |
        kestra:
          server:
            basic-auth:
              enabled: true
              username: admin@localhost.com
              password: ChangeMe1234!
```

## Docker-in-Docker (DinD)

Kestra workers support rootless Docker-in-Docker by default. Some clusters restrict this. On Google Kubernetes Engine (GKE), using a node pool based on `UBUNTU_CONTAINERD` works well with rootless Docker DinD.

### Disable rootless mode

Some clusters only support a root version of DinD. To enable insecure (privileged) mode instead, use the [insecure mode](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) Helm values:

```yaml
dind:
  # -- Enable Docker-in-Docker (dind) sidecar.
  # @section -- kestra dind
  enabled: true
  # -- Dind mode (rootless or insecure).
  # @section -- kestra dind
  mode: 'rootless'
  base:
    # -- Rootless dind configuration.
    # @section -- kestra dind rootless
    rootless:
      image:
        repository: docker
        pullPolicy: IfNotPresent
        tag: dind-rootless
      securityContext:
        privileged: true
        runAsUser: 1000
        runAsGroup: 1000
      args:
        - --log-level=fatal
        - --group=1000
    # -- Insecure dind configuration (privileged).
    # @section -- kestra dind insecure
    insecure:
      image:
        repository: docker
        pullPolicy: IfNotPresent
        tag: dind-rootless
      securityContext:
        privileged: true
        runAsUser: 0
        runAsGroup: 0
        allowPrivilegeEscalation: true
        capabilities:
          add:
            - SYS_ADMIN
            - NET_ADMIN
            - DAC_OVERRIDE
            - SETUID
            - SETGID
      args:
        - '--log-level=fatal'
```

### Troubleshooting DinD

If you encounter errors like the following on some Kubernetes deployments:

```bash
Device "ip_tables" does not exist.
ip_tables 24576 4 iptable_raw,iptable_mangle,iptable_nat,iptable_filter
modprobe: can't change directory to '/lib/modules': No such file or directory
error: attempting to run rootless dockerd but need 'kernel.unprivileged_userns_clone' (/proc/sys/kernel/unprivileged_userns_clone) set to 1
```

Attach to the DinD container to inspect it:

```bash
docker run -it --privileged docker:dind sh
docker logs <container-id>
docker inspect <container-id>
```

## Disable DinD and use Kubernetes task runner

To avoid using `root` to spin up containers via DinD, disable DinD by setting the following [Helm chart values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/README.md#kestra-dind):

```yaml
dind:
  enabled: false
```

Use the Kubernetes task runner as the default method for running [script tasks](../../16.scripts/index.mdx):

```yaml
pluginDefaults:
  - type: io.kestra.plugin.scripts
    forced: true
    values:
      taskRunner:
        type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
        # ... your Kubernetes runner configuration
```

---

# Deploy on AWS EKS with RDS and S3 in Kestra
URL: https://kestra.io/docs/installation/kubernetes-aws-eks

> Deploy Kestra on Amazon EKS with RDS PostgreSQL and S3 for a scalable, cloud-native orchestration platform.

Deploy Kestra to AWS EKS with a PostgreSQL RDS database and an S3 internal storage backend.

## Prerequisites

- Basic command-line interface (CLI) skills.
- Familiarity with AWS EKS, RDS, S3, and Kubernetes.

## Launch an EKS Cluster

First, install [eksctl](https://eksctl.io/) and [kubectl](https://kubernetes.io/docs/tasks/tools/). After installing both, you can create the EKS cluster. There are plenty of configuration options available with `eksctl`, but the default settings are sufficient for this guide. Run the following command to create a cluster named `my-kestra-cluster`:

```shell
eksctl create cluster --name my-kestra-cluster --region us-east-1
```

Wait for the cluster to be created.
Once it is confirmed that the cluster is up and that your kubecontext points to the cluster, run the following command:

```shell
kubectl get svc
```

## Launch AWS RDS for PostgreSQL

Navigate to the RDS console to create a PostgreSQL database. Once your database is created, configure the settings, ensuring the database is accessible from your EKS cluster. Make note of the database endpoint and port after creation for later use.

## Prepare an AWS S3 Bucket

Create a private S3 bucket (i.e., with public access blocked). Keep a record of the bucket name, as this is needed for the [Kestra runtime and storage configuration](../../configuration/02.runtime-and-storage/index.md).

## Install Kestra on AWS EKS

Add the Kestra Helm chart repository and install Kestra:

```shell
helm repo add kestra https://helm.kestra.io/
helm install my-kestra kestra/kestra
```

In the deployment configuration, integrate RDS and S3 as the database and storage backends, respectively. Set the database connection under `datasources` and S3 details under `storage` in your Helm values.

Here is how you can configure RDS in the [Helm chart's values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml):

```yaml
configurations:
  application:
    kestra:
      queue:
        type: postgres
      repository:
        type: postgres
    datasources:
      postgres:
        url: jdbc:postgresql://<your-rds-endpoint>:5432/kestra
        driver-class-name: org.postgresql.Driver
        username: <your-db-username>
        password: <your-db-password>
```

Add the S3 configuration in the [Helm chart's values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) like in the following example:

```yaml
configurations:
  application:
    kestra:
      storage:
        type: s3
        s3:
          access-key: "<your-aws-access-key-id>"
          secret-key: "<your-aws-secret-access-key>"
          region: "<your-aws-region>"
          bucket: "<your-s3-bucket-name>"
```

To apply these configurations, use the following command:

```bash
helm upgrade my-kestra kestra/kestra -f values.yaml
```

## Access Kestra UI

To access the Kestra UI, implement an ingress controller.
You can install the AWS Load Balancer (ALB) Controller via Helm:

```shell
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=my-kestra-cluster \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller
```

Once the ALB is configured and deployed, access the Kestra UI using the ALB endpoint.

## Next steps

Reach out via [Slack](/slack) if you encounter any issues or have questions about deploying Kestra to production.

---

# Deploy on Azure AKS: PostgreSQL and Blob Storage
URL: https://kestra.io/docs/installation/kubernetes-azure-aks

> Run Kestra on Azure Kubernetes Service (AKS) with Azure Database for PostgreSQL and Blob Storage for enterprise-grade orchestration.

Deploy Kestra to Azure AKS with Azure Database for PostgreSQL as the database backend and Blob Storage as the internal storage backend.

## Prerequisites

- Basic command-line interface (CLI) skills
- Familiarity with Azure AKS, PostgreSQL, Blob Storage, and Kubernetes

## Launch an AKS Cluster

First, log in to Azure using `az login`. Run the following command to create an AKS cluster named `my-kestra-cluster`:

```shell
az aks create \
  --resource-group <your-resource-group> \
  --name my-kestra-cluster \
  --enable-managed-identity \
  --node-count 1 \
  --generate-ssh-keys
```

Confirm that the cluster is up. Run the following command to set your kubecontext to the newly created cluster:

```shell
az aks get-credentials --resource-group <your-resource-group> --name my-kestra-cluster
```

You can now confirm that your kubecontext points to the AKS cluster using:

```shell
kubectl get svc
```

## Install Kestra on Azure AKS

Add the Kestra Helm chart repository and install Kestra:

```shell
helm repo add kestra https://helm.kestra.io/
helm install my-kestra kestra/kestra
```

## Launch Azure Database for PostgreSQL servers

This first installation relies on a PostgreSQL database running alongside the Kestra server, in a separate pod.
For a production-grade installation, we recommend a managed database service such as [Azure Database for PostgreSQL servers](https://azure.microsoft.com/en-gb/products/postgresql/).

**Launch a database using Azure Database for PostgreSQL servers**

1. Go to the [Azure Database for PostgreSQL servers page](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.DBforPostgreSQL%2Fservers).
2. Click **Create Azure Database for PostgreSQL server** (Kestra also supports MySQL, but PostgreSQL is recommended).
3. Choose an appropriate **Subscription** and **Resource Group**.
4. Put an appropriate **Server name** and select the preferred **Region**.
5. Choose the latest **PostgreSQL version**. We recommend version 17.
6. Select the **Workload type** as per your requirement.
7. Choose **Authentication method** as **PostgreSQL authentication only**.
8. Provide an appropriate **Admin username** and **Password**.
9. Click **Next: Networking**.
10. Check the box for **Allow public access from any Azure service within Azure to this server**.
11. Click **Review + Create**. Review the configurations and click **Create**.
12. Wait for the database to be provisioned.

![db_setup1](./db_setup1.png)
![db_setup2](./db_setup2.png)
![db_setup3](./db_setup3.png)

**Create a Kestra database**

1. Go to the database overview page and click **Databases** from the left-side navigation menu.
2. Click **Add**.
3. Put an appropriate database name and click **Save** at the top.
**Update Kestra configuration**

Configure Azure Database in the [Helm chart's values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) like in the following example:

```yaml
configurations:
  application:
    kestra:
      queue:
        type: postgres
      repository:
        type: postgres
    datasources:
      postgres:
        url: jdbc:postgresql://<your-db-server-host>:5432/<your-db-name>
        driver-class-name: org.postgresql.Driver
        username: <your-db-username>
        password: <your-db-password>
```

To apply the changes, run:

```shell
helm upgrade my-kestra kestra/kestra -f values.yaml
```

## Prepare an Azure Blob Storage container

This section guides you on how to change the storage backend to Azure Blob Storage.

1. Go to [Storage Accounts](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.Storage%2FStorageAccounts).
2. Click **Create**.
3. Choose an appropriate **Subscription** and **Resource Group**.
4. Put an appropriate **Storage account name** and select the preferred **Region**.
5. Select **Performance** and **Redundancy** as per your requirement.
6. Click **Review** and, after reviewing the configurations, click **Create**.
7. Click the newly created storage account.
8. On the storage account overview page, click **Containers** from the left-side navigation menu.
9. Click **Create** at the top to create a new container.
10. Put an appropriate name for the container and click **Create**. A new container will be created.
11. Now, click **Access keys** from the left-side navigation menu.
12. For one of the keys, either key1 or key2, click **Show** for the **Connection string** and click the **Copy to clipboard** button.
13. Make a note of the connection string for later use. We will require this for configuring the storage backend.
14. 
Add Blob Storage configuration in the [Helm chart's values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) like in the following example:

```yaml
configurations:
  application:
    kestra:
      storage:
        type: azure
        azure:
          container: "<your-container-name>"
          endpoint: "https://<your-storage-account-name>.blob.core.windows.net/"
          connectionString: "<your-connection-string>"
```

To apply the changes, run:

```shell
helm upgrade my-kestra kestra/kestra -f values.yaml
```

## Access Kestra UI

:::alert{type="info"}
Note: You must create an [Application Gateway in Azure](https://portal.azure.com/#view/Microsoft_Azure_Network/LoadBalancingHubMenuBlade/~/applicationgateways) for creating an ingress controller.
:::

Implement an ingress controller for access. You can install the AKS Load Balancer Controller via Helm:

```shell
helm install aks-load-balancer-controller application-gateway-kubernetes-ingress/ingress-azure \
  --set appgw.name=kestra-application-gateway \
  --set appgw.resourceGroup=<your-resource-group> \
  --set appgw.subscriptionId=<your-subscription-id> \
  --set appgw.shared=false \
  --set armAuth.type=servicePrincipal \
  --set armAuth.secretJSON=$(az ad sp create-for-rbac --role Contributor --scopes /subscriptions/<your-subscription-id>/resourceGroups/<your-resource-group> --sdk-auth | base64 -w0) \
  --set rbac.enabled=true \
  --set verbosityLevel=3 \
  --set kubernetes.watchNamespace=default \
  --set aksClusterConfiguration.apiServerAddress=<your-aks-api-server-address>
```

Once the load balancer is deployed, you can access the Kestra UI through the ALB URL.

## Next steps

Reach out via [Slack](/slack) if you encounter any issues or have any questions regarding deploying Kestra to production.

---

# Deploy on GCP GKE: CloudSQL and Cloud Storage
URL: https://kestra.io/docs/installation/kubernetes-gcp-gke

> Install Kestra on Google Kubernetes Engine (GKE) using CloudSQL and Google Cloud Storage for a robust GCP deployment.

Deploy Kestra to GCP GKE with CloudSQL as the database backend and Google Cloud Storage as the internal storage backend.

## Prerequisites

- Basic command-line interface (CLI) skills.
- Familiarity with GCP GKE, PostgreSQL, GCS, and Kubernetes.

## Launch a GKE cluster

First, log in to GCP using `gcloud init`. Run the following command to create a GKE cluster named `my-kestra-cluster`:

```shell
gcloud container clusters create my-kestra-cluster --region=europe-west3
```

Confirm that the cluster is up by using the GCP console.

:::alert{type="info"}
Before proceeding, check whether the `gke-gcloud-auth-plugin` plugin is already installed:

```shell
gke-gcloud-auth-plugin --version
```

If the output displays version information, skip this step. Otherwise, install the authentication plugin using:

```shell
gcloud components install gke-gcloud-auth-plugin
```
:::

Run the following command to have your kubecontext point to the newly created cluster:

```shell
gcloud container clusters get-credentials my-kestra-cluster --region=europe-west3
```

You can now confirm that your kubecontext points to the GKE cluster using:

```shell
kubectl get svc
```

## Install Kestra on GCP GKE

Add the Kestra Helm chart repository and install Kestra:

```shell
helm repo add kestra https://helm.kestra.io/
helm install my-kestra kestra/kestra
```

## Workload Identity setup

If you are using Google Cloud Workload Identity, you can annotate your Kubernetes service account in the Helm chart configuration. This allows Kestra to automatically use the associated GCP service account for authentication. To configure this, add the following to your `values.yaml` file:

```yaml
serviceAccount:
  create: true
  name: <your-k8s-service-account-name>
  annotations:
    iam.gke.io/gcp-service-account: "<your-gcp-service-account>@<your-project-id>.iam.gserviceaccount.com"
```

Alternatively, you can apply the annotation directly when you install Kestra using Helm:

```shell
helm install my-kestra kestra/kestra \
  --set serviceAccount.annotations.iam.gke.io/gcp-service-account=<your-gcp-service-account>@<your-project-id>.iam.gserviceaccount.com
```

This configuration links your Kubernetes service account to the GCP service account, enabling Workload Identity for secure access to Google Cloud resources.
## Launch CloudSQL

1. Go to the [Cloud SQL console](https://console.cloud.google.com/sql/instances).
2. Click **Choose PostgreSQL** (Kestra also supports MySQL, but PostgreSQL is recommended).
3. Put an appropriate Instance ID and password for the admin user `postgres`.
4. Select the latest PostgreSQL version from the dropdown.
5. Choose **Enterprise Plus** or **Enterprise** edition based on your requirements.
6. Choose an appropriate preset among **Production**, **Development**, or **Sandbox** as per your requirement.
7. Choose the appropriate region and zonal availability.
8. Click **Create** and wait for completion.

![db_choices](../09.gcp-vm/db_choices.png)
![db_setup](../09.gcp-vm/db_setup.png)

**Enable VM connection to database**

1. Go to the database overview page and click **Connections** from the left-side navigation menu.
2. Go to the **Networking** tab and click **Add a Network**.
3. In the New Network section, add an appropriate name like **Kestra VM** and enter your GKE pods' IP address range in the network.
4. Click **Done** in the section.
5. Click **Save** on the page.

![db_connections](../09.gcp-vm/db_connections.png)
![db_add_a_network](../09.gcp-vm/db_create_connection.png)

**Create database user**

1. Go to the database overview page and click **Users** from the left-side navigation menu.
2. Click **Add User Account**.
3. Put an appropriate username and password and click **Add**.

![db_users](../09.gcp-vm/db_users.png)
![db_user_creation](../09.gcp-vm/db_user_creation.png)

**Create Kestra database**

1. Go to the database overview page and click **Databases** from the left-side navigation menu.
2. Click **Create Database**.
3. Put an appropriate database name and click **Create**.
**Update Kestra configuration**

Configure the CloudSQL database in the [Helm chart's values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml) like in the following example:

```yaml
configurations:
  application:
    kestra:
      queue:
        type: postgres
      repository:
        type: postgres
    datasources:
      postgres:
        url: jdbc:postgresql://<your-db-ip>:5432/<your-db-name>
        driver-class-name: org.postgresql.Driver
        username: <your-db-username>
        password: <your-db-password>
```

To apply the changes, run:

```shell
helm upgrade my-kestra kestra/kestra -f values.yaml
```

## Prepare a GCS bucket

This section guides you on how to change the storage backend to Cloud Storage to ensure more reliable, durable, and scalable storage.

1. Go to the Cloud Storage console and create a bucket.
2. Go to IAM and select **Service Accounts** from the left-side navigation menu.
3. On the Service Accounts page, click **Create Service Account** at the top of the page.
4. Put the appropriate Service account name and Service account description, and grant the service account **Storage Admin** access. Click **Done**.
5. On the Service Accounts page, click the newly created service account.
6. On the newly created service account page, go to the **Keys** tab at the top of the page and click **Add Key**. From the dropdown, select **Create New Key**.
7. Select the Key type as **JSON** and click **Create**. The JSON key file for the service account will be downloaded.
8. Use the stringified JSON for the configuration. You can generate it with a bash command such as `cat <service-account-key>.json | jq '@json'`.
9. Edit the Kestra storage configuration in the [Helm chart's values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/values.yaml).
:::alert{type="info"}
*Note: If you want to use a Kubernetes service account configured with Workload Identity, you don't need to provide anything for `serviceAccount`, as it will be autodetected from the pod configuration if it is set up correctly.*
:::

```yaml
configurations:
  application:
    kestra:
      storage:
        type: gcs
        gcs:
          bucket: "<your-bucket-name>"
          project-id: "<your-project-id>"
          service-account: |
            "<stringified-service-account-JSON>"
```

To apply the changes, run:

```shell
helm upgrade my-kestra kestra/kestra -f values.yaml
```

You can validate the Google Cloud setup by executing the example flow below with a file input and then checking that the file is correctly uploaded to Google Cloud Storage.

```yaml
id: inputs
namespace: company.team

inputs:
  - id: file
    type: FILE

tasks:
  - id: validator
    type: io.kestra.plugin.core.log.Log
    message: User {{ inputs.file }}
```

## Commented-out examples in values.yaml

The `values.yaml` file includes commented-out examples for secrets, database, and other configuration options. Uncomment and adjust them as needed. Example:

```yaml
## Example configuration for secrets:
## configurations:
##   application:
##     kestra:
##       queue:
##         type: h2
##       repository:
##         type: h2
##       storage:
##         type: local
##         local:
##           base-path: "/app/storage"
##     datasources:
##       h2:
##         url: jdbc:h2:mem:public;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
##         username: kestra
##         password: ""
##         driver-class-name: org.h2.Driver
##   configmaps:
##     - name: kestra-others
##       key: others.yml
##   secrets:
##     - name: kestra-basic-auth
##       key: basic-auth.yml
```

The example above demonstrates how to configure secrets, queue and repository types, and an H2 datasource. Uncomment and adjust the relevant sections for your setup.

## Next steps

Reach out via [Slack](/slack) if you encounter any issues or have any questions regarding deploying Kestra to production.
---

# Deploy with Podman Compose in Kestra: Postgres
URL: https://kestra.io/docs/installation/podman-compose

> Deploy Kestra using Podman Compose with PostgreSQL, offering a rootless container alternative to Docker.

Start Kestra with a PostgreSQL database backend using Podman Compose.

## Deploy Kestra with Podman Compose

Make sure you have already installed:

- [Podman](https://podman.io/docs/installation)
- [Podman Compose](https://github.com/containers/podman-compose?tab=readme-ov-file#installation)

## Download the Docker Compose file

Download the Docker Compose file using the following command:

```bash
curl -o docker-compose.yml \
  https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```

If you don't have `curl` installed, you can download the [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml) manually and save it as `docker-compose.yml`.

:::alert{type="info"}
Podman Compose works with the provided Docker Compose file out of the box.
::: ## Launch Kestra in Root Mode :::alert{type="warning"} To avoid errors during `podman machine init` (`Error: exec: "qemu-img": executable file not found in $PATH`) and during `podman machine start` (`Error: could not find "gvproxy"...`), install both qemu and gvproxy first: - `brew install qemu` on macOS (Podman already ships with gvproxy and virtiofsd) - `sudo apt install qemu-utils qemu-system-x86 virtiofsd` on Debian / Ubuntu - `sudo dnf install qemu-img qemu-system-x86 podman-gvproxy virtiofsd` on Fedora / CentOS / RHEL - `sudo pacman -S qemu qemu-system-x86 virtiofsd` on Arch Linux (Podman already ships with gvproxy) Then, you may also have to edit the `~/.config/containers/containers.conf` file, replacing `/path/to/bin` with the result of `dirname $(which gvproxy)`: ```toml [engine] helper_binaries_dir = ["/path/to/bin"] ``` ::: Use the following command to create a Podman machine, start it up, and launch Kestra on it: ```bash podman machine init --cpus 2 --rootful -v /tmp:/tmp -v $PWD:$PWD podman machine start podman compose up -d # Optional steps if you have SSH-related issues with `podman compose up -d`: # podman machine inspect podman-machine-default --format '{{.SSHConfig.IdentityPath}}' # ssh-add $(podman machine inspect podman-machine-default --format '{{.SSHConfig.IdentityPath}}') # ssh -v -p 46719 core@127.0.0.1 echo "hello" # accept the tunnel between Podman and localhost # # Optional step if you see the error: "Cannot connect to the Docker daemon at [path]. Is the docker daemon running?" # export DOCKER_HOST='unix:///run/user/1000/podman/podman-machine-default-api.sock' # replace with the [path] shown in the error ``` :::alert{type="info"} Podman executes containers through a VM on your local machine. To access local volumes from your container, you need to mount them into the Podman VM, hence the `-v /tmp:/tmp -v $PWD:$PWD` arguments. 
Note: Check if you have an existing podman VM on your local machine by navigating to the 'Resources' tab in Podman Desktop or running the command `podman machine list` in your terminal. If you have an existing VM, ensure the required volumes are mounted as expected. If that does not work, you can [recreate the podman VM](https://stackoverflow.com/questions/69298356/how-to-mount-a-volume-from-a-local-machine-on-podman) with volumes mounted and then run Kestra. ::: Open the URL `http://localhost:8080` in your browser to launch the UI. ### Adjusting the configuration The command above starts a *standalone* server (all architecture components in one JVM). The [configuration](../../configuration/01.configuration-basics/index.md) is done inside the `KESTRA_CONFIGURATION` environment variable of the Kestra container. You can update the environment variable inside the Docker Compose file or pass it via the Docker command line argument. :::alert{type="info"} If you want to extend your Docker Compose file, modify container networking, or if you have any other issues using this Docker Compose file, check the [Troubleshooting Guide](../../10.administrator-guide/16.troubleshooting/index.md). For running Kestra with Docker Compose with each server component as a separate service, see the [multi-component Docker Compose example](../../kestra-cli/kestra-server/index.md). ::: ### Use a configuration file If you want to use a configuration file instead of the `KESTRA_CONFIGURATION` environment variable to configure Kestra, you can update the default `docker-compose.yml`. First, create a configuration file, for example named `application.yaml`, and populate it with the content of the `KESTRA_CONFIGURATION` environment variable defined in the `docker-compose.yml` file. Next, update the `kestra` service in the `docker-compose.yml` file to mount this file into the container and to use the `--config` option: ```yaml ## [...] 
kestra: image: kestra/kestra:latest pull_policy: always # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra, which run without a root user. user: "root" command: server standalone --worker-thread=128 --config /etc/config/application.yaml volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd - $PWD/application.yaml:/etc/config/application.yaml ports: - "8080:8080" - "8081:8081" depends_on: postgres: condition: service_started ``` --- # Run from Standalone JAR in Kestra: No Docker URL: https://kestra.io/docs/installation/standalone-server > Run Kestra directly from a standalone executable JAR file, suitable for environments where Docker is not available. Install Kestra on a standalone server with a simple executable file. ## Run Kestra from the Standalone JAR To deploy Kestra without Docker, there's a standalone JAR available that allows deployment in any environment with JVM version 21+. Make sure that you have [Java](https://adoptium.net/en-GB/temurin/releases) installed on your machine. The latest JAR can be downloaded [via Kestra API](https://api.kestra.io/v1/versions/download). This is an executable JAR. For Linux & MacOS, run it with `./kestra-VERSION`. For example, to launch Kestra: - In local mode (with an H2 local file database), run `./kestra-VERSION server local`. - In standalone mode (you need to provide a configuration with a connection to a database), run `./kestra-VERSION server standalone`. For more information on database configuration, check out the [Runtime and Storage configuration guide](../../configuration/02.runtime-and-storage/index.md). :::alert{type="warning"} The JAR version ships without any [plugins](/plugins). You need to install them manually with the `kestra plugins install directory_with_plugins/` command. 
Alternatively, point to a directory with the plugins in the configuration file or an environment variable `KESTRA_PLUGINS_PATH` (e.g., `KESTRA_PLUGINS_PATH=/Users/anna/dev/plugins`). ::: ## Configuration You can configure Kestra in multiple ways: - **Configuration file** – point to a YAML file with `--config` (or `-c`). - **Environment variable** – set the entire YAML payload in `KESTRA_CONFIGURATION`. Example using a dedicated file: ```shell ./kestra server local --config confs/application.yaml ``` By default, Kestra looks for `${HOME}/.kestra/config.yaml`. Use absolute paths for clarity if the config lives elsewhere. When using `KESTRA_CONFIGURATION`, ensure a `confs/` directory exists in the working directory: Kestra persists the generated configuration file there on startup. Quote multi-line values (as shown in the [Docker deployment guide](../02.docker/index.md#using-the-kestra_configuration-environment-variable)) so the YAML structure remains intact. Configuration options are available in the [Configuration landing page](../../configuration/index.mdx). You can also review the default settings on [GitHub](https://github.com/kestra-io/kestra/blob/develop/cli/src/main/resources/application.yaml). ## Deploy as a systemd service On [systemd](https://systemd.io/)-based systems, Kestra can be deployed as a systemd service. Below is a basic unit file template: ```systemd [Unit] Description=Kestra Event-Driven Declarative Orchestrator Documentation=https://kestra.io/docs/ After=network-online.target [Service] Type=simple ExecStart=/bin/sh /kestra- server standalone User= Group= RestartSec=5 Restart=always ## Send SIGTERM to the main Kestra process and wait up to 'TimeoutStopSec' for child processes in the cgroup to finish; ## if there are any remaining running processes in the cgroup, send SIGKILL to all of them. 
KillMode=mixed TimeoutStopSec=150 ## Treat received SIGTERM as 'inactive' SuccessExitStatus=143 ## The syslog tag SyslogIdentifier=kestra [Install] WantedBy=multi-user.target ``` ## Install plugins from a Docker image To copy the plugins from a Docker container to your local machine, you can use the following commands: ```bash id=$(docker create kestra/kestra:develop) docker cp $id:/app/kestra kestra docker cp $id:/app/plugins plugins docker rm $id ./kestra server local ``` ## Installation on Windows For comprehensive Windows-specific installation steps including plugin installation, see the [Windows installation guide](../13.windows/index.md). See below for a basic configuration example. ### Configuration Kestra is configured via a YAML file passed with `--config` (or `-c`). If no flag is provided, Kestra looks for `%USERPROFILE%\.kestra\config.yaml` by default. A minimal configuration for local testing: ```yaml kestra: repository: type: memory storage: type: local local: base-path: "C:\\kestra\\storage" queue: type: memory ``` For a production setup or Enterprise installation, your configuration requires at minimum: ```yaml kestra: encryption: secret-key: "" # generate with: openssl rand -base64 32 secret: type: jdbc jdbc: secret: "" repository: type: postgres queue: type: postgres storage: type: s3 s3: endpoint: "" access-key: "" secret-key: "" region: "" bucket: "" ee: license: id: "" fingerprint: "" key: | datasources: postgres: url: jdbc:postgresql://:/ driver-class-name: org.postgresql.Driver username: "" password: "" ``` For the full list of configuration options, see the [Configuration guide](../../configuration/index.mdx). --- # Install Kestra on Windows – Standalone JAR Setup URL: https://kestra.io/docs/installation/windows > Run Kestra on Windows using the standalone executable JAR. Covers plugin installation, configuration, and running a local or standalone server. Run Kestra on Windows using the standalone executable JAR — no Docker required. 
One use case for this setup is running a Windows remote worker as part of a [worker group](../../07.enterprise/04.scalability/worker-group/index.md), allowing Windows-native scripts such as PowerShell or batch commands to be executed within a broader Kestra deployment. For a production-grade setup on Windows, consider running Kestra through [Docker Compose with WSL 2](../03.docker-compose/index.md) instead.
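To illustrate that worker-group use case, the flow sketch below pins a PowerShell task to a Windows worker. Assumptions worth flagging: the `windows` worker-group key is a made-up example, worker groups are an Enterprise feature, and the Process task runner is used so the task runs natively rather than in a Docker container:

```yaml
id: windows_powershell
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.powershell.Commands
    # Run directly as a process on the worker (no Docker required on the Windows host).
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    # Route to workers started with `--worker-group windows` (Enterprise only; example key).
    workerGroup:
      key: windows
    commands:
      - Write-Output "Hello from a Windows worker"
```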
## Prerequisites 1. Install [Java JRE 21](https://adoptium.net/temurin/releases/?os=windows&version=21) — download the `x64` MSI installer and run it. 2. Download the latest Kestra binary from the [Releases](https://github.com/kestra-io/kestra/releases) page — find it under **Assets** for the desired version (e.g., `kestra-1.3.6`). 3. Rename the downloaded file to `kestra.bat`. 4. Open **CMD** in the directory containing `kestra.bat`. ## Plugin installation The standalone JAR ships without plugins. You must install them before starting the server, otherwise tasks that rely on plugins will fail. ### Install specific plugins Install only the plugins you need by listing them explicitly: ```bat .\kestra.bat plugins install io.kestra.plugin:plugin-script-powershell:LATEST io.kestra.plugin:plugin-script-python:LATEST ``` Find plugin identifiers in the [full plugins list](https://github.com/kestra-io/kestra/blob/develop/.plugins). ### Install all plugins To install every available plugin at once, use the `--all` flag: ```bat .\kestra.bat plugins install --all ``` :::alert{type="info"} Installing all plugins downloads approximately 3 GB of files sequentially and can take a significant amount of time. Prefer installing specific plugins when possible. ::: ### Use a plugins directory Point Kestra to a directory of pre-downloaded plugin JARs using the `KESTRA_PLUGINS_PATH` environment variable: ```bat set KESTRA_PLUGINS_PATH=C:\kestra\plugins ``` ## Start the server Use `server local` for a quick local setup backed by an H2 embedded database. To connect to an external database (PostgreSQL or MySQL), use `server standalone` and provide a full configuration file. ```bat .\kestra.bat server local --config path\to\your\config.yaml ``` The `.\` prefix is required when running a file from the current directory in CMD. Once started, Kestra is accessible at [localhost:8080](http://localhost:8080). Log in with your email and password. 
:::alert{type="warning"} `server local` is suitable for local testing only — do not use it in production. ::: ## Configuration Kestra is configured via a YAML file passed with `--config` (or `-c`). If no flag is provided, Kestra looks for `%USERPROFILE%\.kestra\config.yaml` by default. A minimal configuration for local testing: ```yaml kestra: repository: type: memory storage: type: local local: base-path: "C:\\kestra\\storage" queue: type: memory ``` For a production setup or Enterprise installation, your configuration requires at minimum: ```yaml kestra: encryption: secret-key: "" # generate with: openssl rand -base64 32 secret: type: jdbc jdbc: secret: "" repository: type: postgres queue: type: postgres storage: type: s3 s3: endpoint: "" access-key: "" secret-key: "" region: "" bucket: "" ee: license: id: "" fingerprint: "" key: | datasources: postgres: url: jdbc:postgresql://:/ driver-class-name: org.postgresql.Driver username: "" password: "" ``` The JDBC secret backend stores secrets in the same PostgreSQL database as Kestra. For cloud-based alternatives such as AWS Secrets Manager, Azure Key Vault, or Google Secret Manager, see the [external secrets manager documentation](../../07.enterprise/02.governance/secrets-manager/index.md). For all secret and encryption configuration options, see the [Security and secrets guide](../../configuration/05.security-and-secrets/index.md). For the full list of configuration options, see the [Configuration guide](../../configuration/index.mdx). ## Alternative: Docker Compose with WSL 2 For a production-ready setup on Windows, Docker Compose running inside WSL 2 is the recommended approach. It pairs Kestra with a PostgreSQL container and avoids the limitations of the H2 embedded database. **Prerequisites:** - [Docker Desktop](https://www.docker.com/products/docker-desktop/) with the **WSL 2 backend** enabled. 
In Docker Desktop, go to **Settings → Resources → WSL Integration** and confirm you are using WSL 2 rather than the Windows backend. Docker runs significantly better for Kestra under WSL 2. - A WSL 2 distribution installed and running (e.g. Ubuntu via the Microsoft Store). Once those are in place, follow the [Docker Compose installation guide](../03.docker-compose/index.md). It includes a curl command to download the example compose file and a pre-configured PostgreSQL container. --- # CLI Guide: kestractl vs Kestra Server CLI URL: https://kestra.io/docs/kestra-cli > Understand when to use kestractl versus the Kestra Server CLI, and access both command references. import ChildCard from "~/components/docs/ChildCard.astro" Use this section to choose the right CLI for the task and jump to the full command references. ## Choose the right CLI Kestra provides two command-line interfaces with different scopes: | CLI | Best for | Scope | | --- | --- | --- | | `kestractl` | Day-to-day platform operations through the API | Covers flows, executions, namespaces, and namespace files, with capabilities aligned with what the Kestra API supports. | | `kestra` (Server CLI) | Runtime and infrastructure operations | Covers server startup and component processes (`server` commands), system/database operations, plugin lifecycle, and maintenance tasks that require direct access to the runtime environment and backend services. | In short, use `kestractl` for API-level resource management and automation, and use the Kestra Server CLI for server, process or database operations. --- # Server CLI in Kestra: Commands and Options URL: https://kestra.io/docs/kestra-cli/kestra-server > Reference guide for Kestra CLI commands to manage servers, flows, plugins, and configurations. Use `kestra` to interact with the Kestra server and database ## Use the Kestra server CLI effectively This page includes CLI commands and options for both Open Source and Enterprise editions. 
Enterprise-only operations are marked with (EE) where relevant. ## Installation The Kestra Server CLI (`kestra`) is not a separate tool to install. It is the same executable used to run Kestra server components, so you get it through your Kestra installation method. - **Docker / Docker Compose / Kubernetes**: the CLI is already included in the `kestra/kestra` image. - **Standalone JAR**: the downloaded executable (`./kestra-VERSION`) is the CLI. - **Managed environments (e.g. Kestra Cloud)**: host-level server commands are typically not available. Use [`kestractl`](../kestractl/index.md) for API-level operations. To install Kestra first, follow one of these guides: - [Docker](../../02.installation/02.docker/index.md) - [Docker Compose](../../02.installation/03.docker-compose/index.md) - [Kubernetes](../../02.installation/03.kubernetes/index.md) - [Standalone JAR](../../02.installation/12.standalone-server/index.md) Examples of the same CLI in each mode: ```bash # Docker container docker exec -it kestra /app/kestra plugins list # Standalone binary ./kestra-VERSION plugins list ``` --- ## Global options These options can be used with **any** Kestra CLI command. - `-v, --verbose` — Increase log verbosity (use `-vv` for more). - `-l, --log-level` — Set a specific level: `TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`. - `--internal-log` — Also change the level for internal logs. - `-c, --config` — Path to a configuration file (default: `~/.kestra/config.yml`). - `-p, --plugins` — Path to the plugins directory. **Examples** ```bash kestra plugins list -vv kestra plugins install --log-level DEBUG ``` ## API options Available for commands that talk to the server API. - `--server` — Kestra server URL (default: `http://localhost:8080`). - `--headers` — Add custom headers (``). - `--user` — Basic auth (`user:password`). - `--tenant` — Tenant identifier (**EE only**). - `--api-token` — API token (**EE only**). 
**Examples** ```bash kestra flow list --server http://my-kestra:8080 kestra flow list --user admin:secret ``` --- ## `kestra` (top-level) ```bash Usage: kestra [-hV] [COMMAND] Options: -h, --help Show this help message and exit. -V, --version Print version information and exit. Commands: plugins handle plugins server start Kestra servers (see `--flow-path` below for preloading flows) flow handle flows template handle templates sys handle systems maintenance configs handle configs namespace handle namespaces auths handle auths sys-ee handle kestra ee systems maintenance tenants handle tenants migrate handle migrations backups (EE) handle metadata backups and restore ``` ### Preload flows at startup Use the `--flow-path` (or `-f`) flag to load all flows from a directory when starting Kestra so they’re available immediately: ```bash kestra server standalone --flow-path /path/to/flows ``` Point this to a folder of YAML flow definitions; Kestra will load them at startup into the namespaces declared in each file. --- ## Configuration commands ### `kestra configs properties` Display the effective configuration properties. ```bash kestra configs properties ``` --- ## Flow commands ### `kestra flow validate` Validate a flow file. **Input**: `file` (path) ```bash kestra flow validate /path/to/my-flow.yml ``` ### `kestra flow test` Run a flow locally with specific inputs, helping you test its logic without deploying it to the server. **Inputs**: `file` (path), `inputs` (key value pairs; absolute path for file inputs) ```bash kestra flow test /path/to/my-flow.yml myInput1 value1 ``` ### `kestra flow dot` Generate a DOT graph from a flow file, which you can use with a visualization tool to create a visual diagram of your flow's structure. ```bash kestra flow dot /path/to/my-flow.yml ``` ### `kestra flow export` Export flows to a ZIP file. 
**Inputs**: `--namespace` (optional), `directory` (path to export into) ```bash kestra flow export --namespace my-namespace /path/to/export-directory ``` ### `kestra flow update` Update a single flow on the server from a local file. You must specify the flow's namespace and its unique ID. **Inputs**: `flowFile` (path), `namespace` (string), `id` (string) ```bash kestra flow update /path/to/my-updated-flow.yml my-namespace my-flow-id ``` ### `kestra flow updates` Bulk update flows from a directory. Point the command to a directory, and Kestra will create or update all the flows it finds. The `--delete` flag removes any flows on the server that are no longer in the specified directory. **Inputs**: `directory` (path), `--delete` (optional), `--namespace` (optional) ```bash kestra flow updates /path/to/my-flows --delete --namespace my-namespace ``` ### `kestra flow namespace update` Update **all** flows within a namespace from a directory. **Option**: `--override-namespaces` (optional) ```bash kestra flow namespace update --override-namespaces /path/to/flows ``` ### `kestra flow create` Create a new flow from a YAML file. ```bash kestra flow create /path/to/new-flow.yml ``` ### `kestra flow delete` Delete a flow. **Inputs**: `namespace`, `id` ```bash kestra flow delete my-namespace my-flow-id ``` --- ## Migration commands ### `kestra migrate default-tenant` Migrate all resources without tenant to a new tenant (multi-tenant setups). **Options**: `--tenant-id`, `--tenant-name`, `--dry-run` ```bash kestra migrate default-tenant --tenant-id my-tenant --tenant-name "My Tenant" --dry-run ``` --- ## Namespace commands ### `kestra namespace files update` Sync namespace files from a local directory. **Inputs**: `namespace`, `from` (local path), `to` (remote path, default `/`), `--delete` (optional) ```bash kestra namespace files update my-namespace /path/to/local/files / --delete ``` ### `kestra namespace kv update` Set/update a key in the namespace KV store. 
Set an expiration time, specify the data type, and even read the value from a file. **Inputs**: `namespace`, `key`, `value` **Options**: `-e, --expiration`, `-t, --type`, `-f, --file-value` ```bash kestra namespace kv update my-ns my-key "my-value" -e 1d ``` --- ## Plugin commands ### `kestra plugins install` Install one or more plugins by Maven coordinates. **Options**: `--locally` (default true), `--all`, `--repositories` ```bash kestra plugins install io.kestra.plugin.jdbc:mysql:1.2.3 ``` ### `kestra plugins uninstall` Uninstall one or more plugins. ```bash kestra plugins uninstall io.kestra.plugin.jdbc:mysql:1.2.3 ``` ### `kestra plugins list` List installed plugins. **Option**: `--core` to include core task plugins ```bash kestra plugins list --core ``` ### `kestra plugins doc` Generate documentation for installed plugins. **Inputs**: `output` (default: `./docs`) **Options**: `--core`, `--icons`, `--schema` ```bash kestra plugins doc ./docs --core ``` ### `kestra plugins search` Search for available plugins. ```bash kestra plugins search jdbc ``` --- ## Server commands ### `kestra server executor` Start the executor. **Options**: `--ignore-executions` (list) ```bash kestra server executor ``` ### `kestra server indexer` Start the indexer. ```bash kestra server indexer ``` ### `kestra server scheduler` Start the scheduler. ```bash kestra server scheduler ``` ### `kestra server standalone` Start a standalone server (all core services). ```bash kestra server standalone ``` ### `kestra server webserver` Start the webserver. **Option**: `--no-tutorials` to disable auto-loading tutorials ```bash kestra server webserver --no-tutorials ``` ### `kestra server worker` Start a worker. **Options**: `-t, --thread` (max threads), `-g, --worker-group` (EE only) ```bash kestra server worker --thread 16 ``` ### `kestra server local` Start a local dev server. 
```bash kestra server local ``` ## Kestra with server components in different services Server components can run independently from each other. Each of them communicate through the database. Below is an example Docker Compose configuration file running Kestra services with replicas on the PostgreSQL database backend. :::collapse{title="Docker Compose Example"} ```yaml volumes: postgres-data: driver: local kestra-data: driver: local services: postgres: image: postgres volumes: - postgres-data:/var/lib/postgresql/data environment: POSTGRES_DB: kestra POSTGRES_USER: kestra POSTGRES_PASSWORD: k3str4 healthcheck: test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"] interval: 30s timeout: 10s retries: 10 kestra-scheduler: image: kestra/kestra:latest deploy: replicas: 2 pull_policy: if_not_present user: "root" command: server scheduler volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd environment: KESTRA_CONFIGURATION: &common_configuration | datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 kestra: server: basic-auth: enabled: false username: "admin@kestra.io" password: kestra repository: type: postgres storage: type: local local: base-path: "/app/storage" queue: type: postgres tasks: tmp-dir: path: /tmp/kestra-wd/tmp ports: - "8082-8083:8081" depends_on: postgres: condition: service_started kestra-worker: image: kestra/kestra:latest deploy: replicas: 2 pull_policy: if_not_present user: "root" command: server worker volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd environment: KESTRA_CONFIGURATION: *common_configuration ports: - "8084-8085:8081" depends_on: postgres: condition: service_started kestra-executor: image: kestra/kestra:latest deploy: replicas: 2 pull_policy: if_not_present user: "root" command: server executor volumes: - 
kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd environment: KESTRA_CONFIGURATION: *common_configuration ports: - "8086-8087:8081" depends_on: postgres: condition: service_started kestra-webserver: image: kestra/kestra:latest deploy: replicas: 1 pull_policy: if_not_present user: "root" command: server webserver volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd environment: KESTRA_CONFIGURATION: *common_configuration KESTRA_URL: http://localhost:8080/ ports: - "8080:8080" - "8081:8081" depends_on: postgres: condition: service_started ``` ::: In production, you might run a similar pattern by either: 1. Running Kestra services on dedicated machines. For example, running the webserver, the scheduler, and the executor on one VM and running one or more workers on other instances. 2. Using Kubernetes and Helm charts. Read more about how to set these up [in the Kubernetes installation documentation](../../02.installation/03.kubernetes/index.md). --- ## System commands ### `kestra sys reindex` Reindex records (currently only `flow`). **Option**: `--type` ```bash kestra sys reindex --type flow ``` ### `kestra sys submit-queued-execution` Submit all queued executions to the executor. ```bash kestra sys submit-queued-execution ``` ### `kestra sys database migrate` Force database schema migration (Flyway). ```bash kestra sys database migrate ``` ### `kestra sys state-store migrate` Migrate old state store files to the Key-Value (KV) Store. ```bash kestra sys state-store migrate ``` --- ## Auths (EE) ### `kestra auths users create` Create a user. 
**Inputs**: `username` (required), `password` (optional) **Options**: `--groups`, `--tenant`, `--admin`, `--superadmin`, `--if-not-exists` ```bash kestra auths users create --superadmin --tenant=default admin Admin_password@123 ``` ### `kestra auths users create-basic-auth` Create or replace a basic auth password for a user. ```bash kestra auths users create-basic-auth alice ``` ### `kestra auths users refresh` Refresh users to update their properties. ```bash kestra auths users refresh ``` ### `kestra auths users set-superadmin` Set or remove Superadmin status. **Inputs**: `user`, `isSuperAdmin` (true|false) ```bash kestra auths users set-superadmin alice true ``` ### `kestra auths users email-replace-username` Set the username as the email for every user. ```bash kestra auths users email-replace-username ``` ### `kestra auths users sync-access` Sync users' access with the fallback tenant (for enabling multi-tenancy). ```bash kestra auths users sync-access ``` --- ## Backups (EE) ### `kestra backups create` Create a metadata backup. **Inputs**: `type` (`FULL` | `TENANT`) **Options**: `--tenant`, `--encryption-key`, `--no-encryption`, `--include-data` ```bash kestra backups create FULL --no-encryption ``` ### `kestra backups restore` Restore a metadata backup. **Input**: `uri` (Kestra internal storage URI) **Options**: `--encryption-key`, `--to-tenant` ```bash kestra backups restore kestra:///backups/full/backup-20240917163312.kestra ``` --- ## Systems (EE) ### kestra sys-ee restore-flow-listeners Restores the state-store for FlowListeners. Useful after restoring a flow queue. **Inputs** - `--timeout` (option): Timeout in seconds before quitting (default: 60). **Example Usage** ```bash kestra-ee sys-ee restore-flow-listeners --timeout 120 ``` --- ### kestra sys-ee restore-queue Sends all data from a repository to Kafka. Useful for restoring all resources after a backup. **Inputs** - `--no-recreate` (option): Don't drop and recreate the Kafka topic. 
- `--no-flows` (option): Don't send flows. - `--no-templates` (option): Don't send templates. **Example Usage** ```bash kestra-ee sys-ee restore-queue --no-flows ``` --- ### kestra sys-ee reset-concurrency-limit Resets the concurrency limit stored on the Kafka runner. **Inputs** None **Example Usage** ```bash kestra-ee sys-ee reset-concurrency-limit ``` ## Tenants (EE) ### `kestra tenants create` Create a tenant and assign admin roles to an existing admin user. **Inputs**: `tenantId`, `tenantName` **Option**: `--admin-username` ```bash kestra tenants create tenantA "Tenant A" --admin-username alice ``` --- # kestractl: Kestra CLI for Flows and Executions URL: https://kestra.io/docs/kestra-cli/kestractl > Manage Kestra flows, executions, namespaces, and files from the command line. The kestractl CLI provides full control over your instance without a UI. Use `kestractl` to interact with the Kestra host API for flows, executions, namespaces, namespace files, and key-value pairs. For server components, plugins, and system maintenance commands, see the [Kestra Server CLI](../kestra-server/index.mdx). ## Installation ```bash curl -fsSL https://raw.githubusercontent.com/kestra-io/kestractl/main/install-scripts/install.sh | bash ``` ## Quick Setup ### Open Source (basic auth) ```bash kestractl config add default http://localhost:8080 main --username YOUR_USERNAME --password YOUR_PASSWORD --default ``` ### Enterprise (API token) ```bash kestractl config add default https://kestra.example.com production --token YOUR_TOKEN --default ``` Your configuration is saved at `~/.kestractl/config.yaml` and the default context is used automatically. ```yaml contexts: default: host: https://kestra.example.com tenant: production auth_method: token token: YOUR_TOKEN default_context: default ``` For basic auth contexts, `auth_method` is `basicAuth` and the file stores `username` and `password` instead of a token. 
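For comparison, a basic-auth context stored in the same `~/.kestractl/config.yaml` might look like the following sketch (illustrative values):

```yaml
contexts:
  local:
    host: http://localhost:8080
    tenant: main
    auth_method: basicAuth
    username: YOUR_USERNAME
    password: YOUR_PASSWORD
default_context: local
```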
## Usage ### Examples ```bash # Deploy flows, then run one and wait for completion kestractl flows deploy ./flows --namespace prod --override --fail-fast kestractl executions run prod nightly-refresh --wait --output json ``` ```bash # Sync namespace files for a release kestractl nsfiles upload prod ./assets --path resources --override --fail-fast ``` ```bash # List flows as JSON kestractl flows list my.namespace --output json ``` ### Command groups - `config`: manage authentication contexts. - `flows`: list, get, deploy, and validate flows. - `executions`: run and inspect executions. - `namespaces`: list and filter namespaces. - `nsfiles`: list, get, upload, and delete namespace files. - `kv`: list, set, update, get, and delete key-value pairs. Note: `kv list` requires token auth and returns 401 with basic auth. Use `kestractl --help` for the full command reference. ## Configuration ### Global flags - `--host` - Kestra host URL - `--tenant` - Tenant name - `--token` / `-t` - API token (Enterprise) - `--username` - Basic auth username (Open Source) - `--password` - Basic auth password (Open Source) - `--output` / `-o` - Output format (`table` or `json`) - `--config` - Custom config file path (default: `~/.kestractl/config.yaml`) - `--verbose` / `-v` - Verbose output (warning: prints credentials in HTTP requests) ### Config file and contexts Manage contexts with `kestractl config add`, `kestractl config show`, `kestractl config use`, and `kestractl config remove`. ### Environment variables Environment variables override config file settings. Use either `KESTRACTL_TOKEN` or `KESTRACTL_USERNAME` and `KESTRACTL_PASSWORD`. ```bash export KESTRACTL_HOST=http://localhost:8080 export KESTRACTL_TENANT=main export KESTRACTL_TOKEN=YOUR_TOKEN export KESTRACTL_USERNAME=admin export KESTRACTL_PASSWORD=admin export KESTRACTL_OUTPUT=json ``` ### Configuration precedence 1. **Command-line flags** (`--host`, `--token`, etc.) 2. 
**Environment variables** (`KESTRACTL_HOST`, `KESTRACTL_TOKEN`, etc.) 3. **Config file** (`~/.kestractl/config.yaml` or custom via `--config`) 4. **Default values** ### Override per command ```bash kestractl flows get my.namespace my-flow \ --host https://kestra.example.com \ --tenant production \ --token YOUR_TOKEN ``` --- # Kestra Migration Guide: Version Upgrades & Changes URL: https://kestra.io/docs/migration-guide > Comprehensive migration guide for Kestra, covering version upgrades, deprecated features, and breaking changes. import ChildCard from "~/components/docs/ChildCard.astro" ## Version Migration Guides Migrate Kestra smoothly with in-depth guides. Upgrades and migrations are sometimes necessary. This section covers what's being phased out and what you can use instead. --- # Kestra 0.11.0 Migration Guide: What Changed URL: https://kestra.io/docs/migration-guide/v0.11.0 > Migration guide for Kestra 0.11.0. Covers deprecated features including removal of core script tasks and templates, plus step-by-step upgrade instructions. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.11.0 Deprecated features and migration guides for 0.11.0 and onwards. --- # Script Tasks Moved to Plugins in Kestra 0.11.0 URL: https://kestra.io/docs/migration-guide/v0.11.0/core-script-tasks > Migration guide for moving from core script tasks to dedicated script plugins in Kestra 0.11.0. ## Script tasks moved to dedicated plugins Script tasks included in the core plugin have been deprecated in 0.11.0 and moved to dedicated plugins. Previously, there were scripting tasks inside the core plugin (the plugin that offers core tasks and is always included in any Kestra distribution). Since the introduction of the new [Script tasks](../../../16.scripts/index.mdx) in dedicated plugins, the old core scripting tasks have been deprecated and moved out of the core plugin.
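As an illustration of what such a migration looks like in practice (the flow ids and echoed strings below are hypothetical, and the old `io.kestra.core.tasks.scripts.Bash` type name is an assumption based on the pre-0.11.0 core plugin), a flow using the deprecated core Bash task can be rewritten with the Shell `Commands` task from the dedicated shell plugin:

```yaml
# Before: deprecated core script task
id: legacy_script
namespace: company.team
tasks:
  - id: hello
    type: io.kestra.core.tasks.scripts.Bash
    commands:
      - echo "Hello from the deprecated core task"
```

```yaml
# After: replacement task from plugin-script-shell,
# which runs in a separate Docker container by default
id: migrated_script
namespace: company.team
tasks:
  - id: hello
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - echo "Hello from the new Shell Commands task"
```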
If you use one of these tasks, you should migrate to the new ones, which offer improved scripting capabilities and run by default in a separate Docker container. If you still want to use one of the old tasks, and you install plugins manually rather than using one of our `*-full` Docker images, you must install the new plugin, which now also includes the old deprecated tasks. Here is the list of the old tasks with their new location and the replacement tasks: - The [Bash](/plugins/plugin-script-shell) task is now located inside the `plugin-script-shell` plugin and replaced by the Shell [Commands](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.commands) and [Script](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.script) tasks. - The [Node](/plugins/plugin-script-node/io.kestra.plugin.core.scripts.node) task is now located inside the `plugin-script-node` plugin and replaced by the Node [Commands](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.commands) and [Script](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.script) tasks. - The [Python](/plugins/plugin-script-python) task is now located inside the `plugin-script-python` plugin and replaced by the Python [Commands](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.commands) and [Script](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.script) tasks. --- # Templates Deprecated in Kestra 0.11.0: Migrate to Subflows URL: https://kestra.io/docs/migration-guide/v0.11.0/templates > Information on the deprecation of Templates in Kestra 0.11.0 and how to migrate to Subflows. ## Deprecation of Templates Since 0.11.0, Templates are deprecated and disabled by default. Use subflows instead. If you still rely on templates, you can re-enable them in your [Plugins and Execution configuration](../../../configuration/04.plugins-and-execution/index.md). 1.
Subflows are more powerful — they provide the same functionality as templates while being more flexible. For instance, `inputs` are not allowed in a template because a template is only a list of tasks that get copied to another flow that references it. In contrast, when invoking a subflow, you can parametrize it with custom parameters. This way, subflows allow you to define workflow logic once and invoke it from other flows with custom parameters. 2. Subflows are more transparently reflected in the topology view and don't require copying tasks. If you are using templates and are not ready to migrate to subflows yet, add the following [Plugins and Execution configuration](../../../configuration/04.plugins-and-execution/index.md) option to still be able to use them: ```yaml kestra: templates: enabled: true ``` ## Templates :warning: A typical template has an ID, a namespace, and a list of tasks. Here is an example template: ```yaml id: mytemplate namespace: company.team tasks: - id: workingDir type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: bash type: io.kestra.plugin.scripts.shell.Commands commands: - mkdir -p out - echo "Hello from 1" >> out/output1.txt - echo "Hello from 2" >> out/output2.txt - echo "Hello from 3" >> out/output3.txt - echo "Hello from 4" >> out/output4.txt taskRunner: type: io.kestra.plugin.core.runner.Process - id: out type: io.kestra.plugin.core.storage.LocalFiles outputs: - out/** - id: each type: io.kestra.plugin.core.flow.ForEach concurrencyLimit: 0 values: "{{ outputs.out.uris | jq('.[]') }}" tasks: - id: path type: io.kestra.plugin.core.debug.Return format: "{{taskrun.value}}" - id: contents type: io.kestra.plugin.scripts.shell.Commands commands: - cat "{{taskrun.value}}" taskRunner: type: io.kestra.plugin.core.runner.Process ``` You can trigger it in a flow using the `io.kestra.plugin.core.flow.Template` task: ```yaml id: templatedFlow namespace: company.team tasks: - id: first type:
io.kestra.plugin.core.log.Log message: first task - id: template type: io.kestra.plugin.core.flow.Template namespace: company.team templateId: mytemplate - id: last type: io.kestra.plugin.core.log.Log message: last task ``` This example shows that templates are quite restrictive — you can only invoke them as-is. You cannot set custom input values, and there is no link from this flow to the template. In contrast, subflows can be parametrized, and you can navigate to the subflow in the topology view. From the 0.11.0 release, you can also expand and collapse a subflow (child flow) to inspect the available tasks directly from the parent flow. ## Subflows ✅ To migrate from a template to a subflow, you can create a flow that is a 1:1 copy of your template. This flow can then be invoked as a subflow the same way you used to invoke a template (only using a different task). In our example, we can create a new flow called `mytemplate` in a namespace `dev`. This flow will be invoked from a parent flow as a subflow. Then, to create a child flow (a subflow), you only need to change the following values in the `templatedFlow`: - Change the `io.kestra.plugin.core.flow.Template` task type to `io.kestra.plugin.core.flow.Subflow` - Change the `templateId` to `flowId`. 
See the example below showing how you can invoke a subflow from a parent flow: ```yaml id: parentFlow namespace: company.team tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: mytemplate ``` And here is a complete example showing how a template task can be migrated to a subflow task: ```yaml id: parentFlow namespace: company.team tasks: - id: first type: io.kestra.plugin.core.log.Log message: first task - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: mytemplate - id: last type: io.kestra.plugin.core.log.Log message: last task ``` If your subflow has input parameters and you want to override them when calling the subflow, you can configure them as follows: ```yaml id: parentFlow namespace: company.team tasks: - id: first type: io.kestra.plugin.core.log.Log message: first task - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: mytemplate inputs: myIntegerParameter: 42 myStringParameter: hello world! - id: last type: io.kestra.plugin.core.log.Log message: last task ``` ## Side-by-side comparison You can look at both a flow with a template task and a flow with a subflow task side by side to see the difference in syntax: ![template-vs-subflow](./template-vs-subflow.png) If you still have questions about migrating from templates to subflows, reach out via [Community Slack](/slack). ## Documentation of the deprecated feature Templates are lists of tasks that can be shared between flows. You can define a template and call it from other flows, allowing them to share a list of tasks and keep these tasks updated without changing your flow. All tasks in a template will be executed sequentially; you can provide the same tasks that are found in a *standard* flow, including an *errors* branch. Templates can have arguments passed via the `args` property — see the [Template Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.template). 
### Example Below is a flow sample that will include a template: ```yaml id: with-template namespace: company.team inputs: - id: store type: STRING required: true tasks: - id: render-template type: io.kestra.plugin.core.flow.Template namespace: company.team templateId: template-example args: renamedStore: "{{ inputs.store }}" ``` If the template is defined like so: ```yaml id: template-example namespace: company.team tasks: - id: task-defined-by-template type: io.kestra.plugin.core.debug.Return format: "{{ parent.outputs.args.renamedStore }}" ``` It will result in a flow similar to the following: ```yaml id: with-template namespace: company.team tasks: - id: render-template type: io.kestra.plugin.core.flow.Sequential tasks: - id: task-defined-by-template type: io.kestra.plugin.core.debug.Return format: "{{ inputs.store }}" ``` All tasks from the template will be *copied* at runtime. :::alert{type="warning"} From the template, you can access all execution context variables. However, this is discouraged; it is best to use the `args` property to rename variables from the global context to the template's local one. ::: ### Templates UI If enabled, you can inspect Templates on the **Templates** page. ![Kestra User Interface Templates Page](./06-Templates.png) A **Template** page allows you to edit the template via a YAML editor. ![Kestra User Interface Template Page](./07-Templates-Template.png) --- # Kestra 0.12.0 Migration Guide: What Changed URL: https://kestra.io/docs/migration-guide/v0.12.0 > Overview of changes and migration actions for upgrading to Kestra version 0.12.0. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.12.0 Deprecated features and migration guides for 0.12.0 and onwards. --- # Listeners Deprecated in Kestra 0.12.0: Use Flow Triggers URL: https://kestra.io/docs/migration-guide/v0.12.0/listeners > Information on the deprecation of Listeners in Kestra 0.12.0 and the transition to Flow triggers.
## Deprecation of Listeners Listeners are deprecated and disabled by default starting from the 0.12.0 release. Use [Flow triggers](../../../05.workflow-components/07.triggers/02.flow-trigger/index.md) instead. 1. The listener is a **redundant** concept. Flow triggers allow you to do all that listeners can accomplish and more. The only difference between listeners and triggers is that listeners are defined inline within the same flow code and are, therefore, more tightly coupled with the flow. In contrast, a Flow trigger is defined in a separate, independent flow and can simultaneously listen to executions of multiple flows that satisfy specific `conditions`. This gives you more flexibility. 2. It is an extra concept to learn — an unnecessary one if you already know Flow triggers. 3. It's a hard-to-grasp concept — listeners can launch tasks *outside* of the flow, i.e., tasks that will not be considered part of the flow but are defined *within* it. Additionally, the results of listeners will not change the execution status of the flow, so having them defined within the flow has caused some confusion in the past. 4. Currently, listeners are mainly used to send failure (or success) notifications, and Kestra already has two concepts allowing you to do that: `triggers` and `errors`. Having **three** choices for such a standard use case has led to confusion about when to use which of them.
If you are using listeners and you are not ready to migrate to Flow triggers yet, add the following [Plugins and Execution configuration](../../../configuration/04.plugins-and-execution/index.md) option to still be able to use listeners: ```yaml kestra: listeners: enabled: true ``` Also add the following plugin defaults to your configuration to ensure that conditions work properly after upgrading to any version after 0.12.0: ```yaml kestra: tasks: defaults: - type: io.kestra.plugin.core.condition.DateTimeBetweenCondition values: date: "{{ now() }}" - type: io.kestra.plugin.core.condition.DayWeekCondition values: date: "{{ now(format='iso_local_date') }}" - type: io.kestra.plugin.core.condition.DayWeekInMonthCondition values: date: "{{ now(format='iso_local_date') }}" - type: io.kestra.plugin.core.condition.TimeBetweenCondition values: date: "{{ now(format='iso_offset_time') }}" - type: io.kestra.plugin.core.condition.WeekendCondition values: date: "{{ now(format='iso_local_date') }}" ``` Due to listeners' deprecation, the default behavior of various `io.kestra.core.models.conditions`-type conditions changed to use `{{trigger.date}}` as the default value for the `date` property instead of `"{{ now(format='iso_local_date') }}"`; the plugin defaults above restore the previous values.
## Listeners :warning: Here is an example of a fairly typical listener used to implement error notifications: ```yaml id: alert_to_slack namespace: prod.monitoring tasks: - id: fail type: io.kestra.plugin.core.execution.Fail listeners: - tasks: - id: slack type: io.kestra.plugin.slack.notifications.SlackExecution url: "{{ secret('SLACK_WEBHOOK') }}" channel: "#general" executionId: "{{ execution.id }}" conditions: - type: io.kestra.plugin.core.condition.ExecutionStatusCondition in: - FAILED - WARNING ``` This flow will fail and the listener tasks will be triggered anytime the flow reaches the specified execution status condition — here, the `FAILED` status. The next section shows how you can accomplish the same using Flow triggers. ## Flow trigger ✅ To migrate from a listener to a Flow trigger, create a new flow. Add a trigger of type `io.kestra.plugin.core.trigger.Flow` and move the condition e.g. `ExecutionStatusCondition` to the trigger conditions. Finally, move the list of tasks from listeners to `tasks` in the flow. The example below will explain it better than words: ```yaml id: alert_to_slack namespace: prod.monitoring tasks: - id: slack type: io.kestra.plugin.slack.notifications.SlackExecution url: "{{ secret('SLACK_WEBHOOK') }}" channel: "#general" executionId: "{{trigger.executionId}}" triggers: - id: execution_status_events type: io.kestra.plugin.core.trigger.Flow conditions: - type: io.kestra.plugin.core.condition.ExecutionStatusCondition in: - FAILED - WARNING - type: io.kestra.plugin.core.condition.ExecutionFlowCondition namespace: prod flowId: demo ``` That flow trigger listens to the execution status of the following flow: ```yaml id: demo namespace: prod tasks: - id: fail type: io.kestra.plugin.core.execution.Fail ``` Anytime you execute that `demo` flow, the Slack notification will be sent via the Flow trigger. Additionally, the **Dependencies** tab of both flows will make it clear that they depend on each other. 
## Side-by-side comparison You can look at both a flow with a listener and a flow with a Flow trigger side by side to see the syntax difference: ![listeners-vs-flow-triggers](./listeners-vs-flow-triggers.png) If you still have questions about migrating from listeners to flow triggers, reach out via [Community Slack](/slack). ## Documentation of the deprecated feature Listeners are special branches of a flow that can listen to the current flow and launch tasks *outside the flow*. The result of a listener's tasks will not change the execution status of a flow. The listener's tasks are run at the end of the flow. Listeners are usually used to send notifications or handle special end-task behavior that should not be considered part of the main flow. ### Example listener > A listener that sends a Slack notification for a failed task (this would require the Slack plugin). ```yaml listeners: - tasks: - id: sendSlackAlert type: io.kestra.plugin.slack.notifications.SlackExecution url: https://hooks.slack.com/services/XXX/YYY/ZZZ conditions: - type: io.kestra.plugin.core.condition.ExecutionStatusCondition in: - FAILED ``` ### Properties **`conditions`** * **Type:** ==array== * **SubType:** ==Condition== * **Required:** ❌ > A list of Conditions that must be validated to execute the listener `tasks`. If you don't provide any conditions, the listeners will always be executed. **`tasks`** * **Type:** ==array== * **SubType:** ==Task== * **Required:** ❌ > A list of tasks that will be executed at the end of the flow. The status of these tasks will not impact the main execution and will not change the execution status even if they fail. > > You can use any tasks you need here, even Flowable ones. > All task `id`s must be unique across the whole flow, including the main `tasks` and `errors`. **`description`** * **Type:** ==string== * **Required:** ❌ > Description for documentation.
--- # Kestra 0.13.0 Migration Guide: What Changed URL: https://kestra.io/docs/migration-guide/v0.13.0 > Migration guide for Kestra 0.13.0. Covers deprecated features and required steps including syncing user access to the default tenant for multitenancy. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.13.0 Deprecated features and migration guides for 0.13.0 and onwards. --- # Sync User Access to Default Tenant in Kestra 0.13.0 URL: https://kestra.io/docs/migration-guide/v0.13.0/default-tenant > Instructions for syncing user access to the default tenant following the introduction of multitenancy in Kestra 0.13.0. ## Sync Users Access to a Default Tenant Adjusting users' access to the default tenant. In the [v0.13.0 release](../../../../blogs/2023-11-16-release-0-13/index.md), Kestra introduced multitenancy. As a result, user access is now managed at the tenant level. ## Migration After upgrading to v0.13.0, you will need to adjust your users' access to make it consistent with the new multitenancy model. To make this process easier, there is a new `kestra-ee auths users sync-access` command available in the Kestra CLI that allows you to automatically sync users' access to a default tenant. Run the following command: ```bash kestra-ee auths users sync-access ``` Here is a detailed command usage for reference: ```bash Usage: kestra-ee auths users sync-access [-hVv] [--internal-log] [-c=] [-l=] [-p=] Sync users access with the default Tenant. This command is designed to be used when enabling multi-tenancy on an existing Kestra instance, in this case the existing user will need to have their access synchronized if they need access to the default tenants (groups and roles will be synchronized) -c, --config= Path to a configuration file Default: /home/kestra/.kestra/config.yml -h, --help Show this help message and exit. 
--internal-log Change also log level for internal log -l, --log-level= Change log level (values: TRACE, DEBUG, INFO, WARN, ERROR) Default: INFO -p, --plugins= Path to plugins directory Default: /app/plugins -v, --verbose Change log level. Multiple -v options increase the verbosity. -V, --version Print version information and exit. ``` ## Summary Running the `kestra-ee auths users sync-access` command will perform the necessary migration to make your users' access consistent with the new multitenancy model. --- # Kestra 0.14.0 Migration Guide: What Changed URL: https://kestra.io/docs/migration-guide/v0.14.0 > Migration guide for Kestra 0.14.0. Covers breaking changes including the groupList rename, non-recursive Pebble rendering, and required workflow adaptations. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.14.0 Deprecated features and migration guides for 0.14.0 and onwards. --- # Groups API Change in Kestra 0.14.0: groupList Rename URL: https://kestra.io/docs/migration-guide/v0.14.0/group-list > API changes in Kestra 0.14.0. The groups property is renamed to groupList, and groupId must now be unique to prevent duplicate groups across tenants. ## Change in managing Groups via the API This change affects the way you manage groups via the API. In the [v0.14.0 release](../../../../blogs/2024-01-22-release-0-14/index.md), the Groups API structure changed to prevent duplicate groups from being created. Before Kestra v0.14.0, you could create multiple groups with the same name. Since this can lead to confusion in a multitenant environment, this behavior is now prevented. ## Migration The `groups` property has been renamed to `groupList` and the `groupId` now needs to be unique. ```yaml groupList: - groupId: yourGroupId membership: MEMBER ``` :::alert{type="info"} If you use Kestra UI or manage users and groups via Terraform, this change does not affect you. This change only affects customers who manage groups programmatically via the API. 
::: ## Summary The main change is that you can no longer create multiple groups with the same name. If you try to edit a group whose name exists more than once, you will be prompted to rename the group to a unique name. --- # Non-Recursive Pebble Rendering in Kestra 0.14.0 URL: https://kestra.io/docs/migration-guide/v0.14.0/recursive-rendering > Guide to the new non-recursive Pebble expression rendering and the usage of the render() function. ## Recursive rendering of Pebble expressions Since 0.14.0, the default rendering behavior of Kestra's templating engine is **not recursive**. Before the 0.14 release, Kestra's templating engine rendered all expressions **recursively**. While recursive rendering enabled many flexible usage patterns, it also opened up the door to some unintended behavior. For example, if you wanted to parse JSON elements of a webhook payload that contained a templated string from other applications (such as GitHub Actions or dbt core), the recursive rendering would attempt to parse those expressions, resulting in an error. The 0.14.0 release changed the default rendering behavior to **not recursive** and introduced a new `render()` function that gives you more control over which expressions should be rendered and how. ## The new `render()` function The syntax for the `render()` function is as follows: ```yaml {{ render(expression_string, recursive=true) }} # if false, render only once ``` Here is a simple usage example: ```yaml id: render_variables_recursively namespace: company.team variables: trigger_var: "{{ trigger.date ??
execution.startDate | date('yyyy-MM-dd') }}" tasks: - id: parse_date type: io.kestra.plugin.core.debug.Return format: "{{ render(vars.trigger_var) }}" # this will print the recursively-rendered variable triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/1 * * * *" ``` --- ## Migrating a 0.13.0 flow to the new rendering behavior in 0.14.0 As you can see in the above example, wrapping the Pebble expression in the `render()` function is all that is needed to migrate existing flows to Kestra 0.14.0. However, if you have many flows that use the previous recursive rendering behavior, you may perform that migration later. A boolean `recursiveRendering` configuration is available to keep the previous recursive rendering behavior while you migrate your flows. ## How to keep the previous behavior To keep the previous (recursive) behavior, add the following configuration: ```yaml kestra: variables: recursiveRendering: true # default: false ``` This is an instance-level configuration, so you don't need any changes in your code. Migrating flows to the new rendering behavior as soon as possible is recommended; the explicit rendering behavior is more intuitive and less error-prone. --- # Kestra 0.15.0 Migration Guide: What Changed URL: https://kestra.io/docs/migration-guide/v0.15.0 > Migration guide for Kestra 0.15.0. Covers scheduleConditions deprecation, input name-to-id rename, subflow output behavior changes, and Micronaut 4.3 compat. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.15.0 Deprecated features and migration guides for 0.15.0 and onwards. --- # Inputs name Renamed to id in Kestra 0.15.0 URL: https://kestra.io/docs/migration-guide/v0.15.0/inputs-name > Notice regarding the change of the name property to id for flow inputs in Kestra 0.15.0. ## Inputs `name` changed to `id` The `name` property of `inputs` is deprecated in favor of `id` for consistency with the rest of the flow configuration.
The change is non-breaking, so existing flows do not need to be changed immediately to migrate to 0.15.0. Use the `id` property for new flows. The `name` property will be removed in the future. :::alert{type="info"} All you need to do is to rename the `name` to `id` in your flow configuration — no other changes are required. ::: To make the change clear, here is how inputs were defined before Kestra 0.15.0: ```yaml id: myflow namespace: company.team inputs: - name: beverage type: STRING defaults: coffee - name: quantity type: INTEGER defaults: 1 ``` ## After Kestra 0.15.0 Here is how inputs are defined after Kestra 0.15.0: ```yaml id: myflow namespace: company.team inputs: - id: beverage type: STRING defaults: coffee - id: quantity type: INTEGER defaults: 1 ``` --- # Micronaut 4.3 Migration in Kestra 0.15.0: Plugin Update URL: https://kestra.io/docs/migration-guide/v0.15.0/micronaut4 > Guide for migrating custom plugins to be compatible with Micronaut 4.3 in Kestra 0.15.0. ## Migration to Micronaut 4.3 Kestra 0.15.0 has been migrated to Micronaut 4.3 for improved security. This page explains how to make your custom plugins compatible with this new version. Custom plugins need to be migrated to Micronaut 4.3 to be compatible with Kestra 0.15.0 and later. 
Upgrade your `gradle.properties` to the following library versions: ```properties version=0.15.0-SNAPSHOT kestraVersion=[0.15,) micronautVersion=4.3.0 lombokVersion=1.18.30 ``` In `build.gradle`, update the dependencies as follows: ```groovy dependencies { // lombok annotationProcessor "org.projectlombok:lombok:$lombokVersion" compileOnly "org.projectlombok:lombok:$lombokVersion" // micronaut annotationProcessor platform("io.micronaut.platform:micronaut-platform:$micronautVersion") annotationProcessor "io.micronaut:micronaut-inject-java" annotationProcessor "io.micronaut.validation:micronaut-validation-processor" compileOnly platform("io.micronaut.platform:micronaut-platform:$micronautVersion") compileOnly "io.micronaut:micronaut-inject" compileOnly "io.micronaut.validation:micronaut-validation" // kestra compileOnly group: "io.kestra", name: "core", version: kestraVersion // other libs go here } ``` Some libraries are no longer included by default in Micronaut 4.3. For instance: - if you use Jackson in your custom plugin, you need to add `compileOnly "io.micronaut:micronaut-jackson-databind"`. - if you use the HTTP client, you need to add `compileOnly "io.micronaut:micronaut-http-client"`.
Remove the following Gradle configuration, as Kestra now uses SLF4J 2: ```groovy configurations.all { resolutionStrategy { force("org.slf4j:slf4j-api:1.7.36") } } ``` Test dependencies require an adjustment as well: ```groovy dependencies { // lombok testAnnotationProcessor "org.projectlombok:lombok:" + lombokVersion testCompileOnly 'org.projectlombok:lombok:' + lombokVersion // micronaut testAnnotationProcessor platform("io.micronaut.platform:micronaut-platform:$micronautVersion") testAnnotationProcessor "io.micronaut:micronaut-inject-java" testAnnotationProcessor "io.micronaut.validation:micronaut-validation-processor" testImplementation platform("io.micronaut.platform:micronaut-platform:$micronautVersion") testImplementation "io.micronaut.test:micronaut-test-junit5" // test deps needed only to have a runner testImplementation group: "io.kestra", name: "core", version: kestraVersion testImplementation group: "io.kestra", name: "repository-memory", version: kestraVersion testImplementation group: "io.kestra", name: "runner-memory", version: kestraVersion testImplementation group: "io.kestra", name: "storage-local", version: kestraVersion // test testImplementation "org.junit.jupiter:junit-jupiter-engine" testImplementation "org.hamcrest:hamcrest:2.2" testImplementation "org.hamcrest:hamcrest-library:2.2" } ``` ### Jakarta migration Adjust the imports from `javax.*` to `jakarta.*` — this is due to the migration from Java EE to Jakarta EE. Some IDEs do this automatically. For example, IntelliJ has a command `Refactor` -> `Migrate Packages and Classes` -> `Java EE to Jakarta EE`. Alternatively, you can use the [OpenRewrite](https://docs.openrewrite.org/recipes/java/migrate/jakarta/javaxmigrationtojakarta) project. ### Project Reactor migration Our reactive stack has been migrated from the deprecated RxJava 2 to Project Reactor. If your plugin uses RxJava, migrate it to Project Reactor.
Replace the library `io.micronaut.rxjava2:micronaut-rxjava2` with `io.micronaut.reactor:micronaut-reactor`. Then, update your code to use the Project Reactor types: - `Flux` (instead of `Flowable`) - `Mono` (instead of `Single`). Lastly, if you were using the reactive HTTP client, replace `io.micronaut.rxjava2:micronaut-rxjava2-http-client` with `io.micronaut.reactor:micronaut-reactor-http-client`. --- # scheduleConditions Deprecated in Kestra 0.15.0 URL: https://kestra.io/docs/migration-guide/v0.15.0/schedule-conditions > Deprecation of the scheduleConditions property in favor of conditions for Schedule triggers in Kestra 0.15.0. ## Schedule Conditions: `scheduleConditions` deprecated for `conditions` The `scheduleConditions` property of the `Schedule` trigger is deprecated. Instead, use `conditions` to define custom scheduling conditions. This change is non-breaking, so existing flows do not need to be changed immediately to migrate to 0.15.0. Use the `conditions` property for new flows. The `scheduleConditions` property will be removed in the future. :::alert{type="info"} All you need to do is to rename `scheduleConditions` to `conditions` in your flow configuration — no other changes are required. ::: To make the change clear, here is how scheduling conditions were defined before Kestra 0.15.0: ```yaml id: beverage_order namespace: company.team inputs: - id: beverage type: STRING defaults: coffee tasks: - id: order_beverage type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock method: POST contentType: application/json formData: beverage: "{{inputs.beverage}}" - id: set_labels type: io.kestra.plugin.core.execution.Labels labels: date: "{{trigger.date ??
execution.startDate | date('yyyy-MM-dd')}}" beverage: "{{inputs.beverage}}" triggers: - id: workday type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * *" scheduleConditions: - type: io.kestra.plugin.core.condition.Not conditions: - type: io.kestra.plugin.core.condition.Weekend - id: weekend type: io.kestra.plugin.core.trigger.Schedule cron: "0 20 * * *" scheduleConditions: - type: io.kestra.plugin.core.condition.Weekend inputs: beverage: beer ``` The above flow has two triggers, `workday` and `weekend`. 1. The `workday` trigger is scheduled to run on workdays to order a coffee at 9 am. 2. The `weekend` trigger is scheduled to run on weekends to order a beer at 8 pm. ## Behavior after Kestra 0.15.0 Here is the same flow with the `scheduleConditions` property replaced by `conditions`: ```yaml id: beverage_order namespace: company.team inputs: - id: beverage type: STRING defaults: coffee tasks: - id: order_beverage type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock method: POST contentType: application/json formData: beverage: "{{inputs.beverage}}" - id: set_labels type: io.kestra.plugin.core.execution.Labels labels: date: "{{trigger.date ?? execution.startDate | date('yyyy-MM-dd')}}" beverage: "{{inputs.beverage}}" triggers: - id: workday type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * *" conditions: - type: io.kestra.plugin.core.condition.Not conditions: - type: io.kestra.plugin.core.condition.Weekend - id: weekend type: io.kestra.plugin.core.trigger.Schedule cron: "0 20 * * *" conditions: - type: io.kestra.plugin.core.condition.Weekend inputs: beverage: beer ``` --- # Subflow Outputs Behavior Change in Kestra 0.15.0 URL: https://kestra.io/docs/migration-guide/v0.15.0/subflow-outputs > Learn how subflow output behavior changed in Kestra 0.15.0 and how to adopt flow-level outputs for better decoupling and maintainability. ## Subflow outputs behavior The `outputs` property of a parent flow's `Subflow` task is deprecated.
Instead, use flow `outputs` to pass data between flows. If you are on Kestra 0.14.4 or earlier, passing data between subflows required using the `outputs` property within the parent flow's `Subflow` task.

## Example

Consider a subflow (also called a child flow) with a task `mytask` generating an output called `value`:

```yaml
id: flow_outputs
namespace: company.team

tasks:
  - id: mytask
    type: io.kestra.plugin.core.debug.Return
    format: this is a task output used as a final flow output
```

To access this output in a different task within the same flow, you would use the syntax `{{outputs.mytask.value}}`. However, to access this output in a parent flow, you would need to define the output in the `outputs` property within the parent flow's `Subflow` task as follows:

```yaml
id: parent_flow
namespace: company.team

tasks:
  - id: subflow
    type: io.kestra.plugin.core.flow.Subflow
    flowId: flow_outputs
    namespace: company.team
    wait: true
    outputs: # 🚨 this property is deprecated in Kestra 0.15.0
      final: "{{ outputs.mytask.value }}"

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.subflow.outputs.final }}"
```

You can see that the `outputs` property is used to define the output of the subflow, stored in a variable named `final` (_the names of the keys are arbitrary_). This approach is not ideal, as **you need to know the internals of the subflow to access its outputs**. Also, it's not clear to the consumer what type of data is being passed. This is why this property is deprecated in Kestra 0.15.0.

## How to keep the old subflow outputs behavior

Before looking at how the same is achieved in Kestra 0.15.0, here is how to keep this behavior **if you are not ready to migrate** to the new subflow outputs behavior.
To keep the old behavior with the `outputs` property, you can set the following configuration in your `application.yml`:

```yaml
kestra:
  plugins:
    configurations:
      - type: io.kestra.plugin.core.flow.Subflow
        values:
          outputs:
            enabled: true # for backward-compatibility -- false by default
      - type: io.kestra.plugin.core.flow.Flow
        values:
          outputs:
            enabled: true # for backward-compatibility -- false by default
```

Once the `outputs` configuration is set to `enabled: true`, you can use the old behavior of defining `outputs` within the Subflow or Flow task in the parent flow.

## Improved subflow outputs in Kestra 0.15.0

### Why the change?

Kestra 0.15.0 introduced flow-level `outputs` to make it easier to pass data between flows.

Until now, the parent flow had to know the internals of the subflow to access its outputs. This introduced a **tight coupling**: the parent flow was **dependent on the subflow's internal logic**, which can change over time, potentially breaking the parent flow. It also **exposed all outputs** from child flows (producers) to all parent flows (consumers), which is not always desirable.

### Benefits of the new subflow outputs

Now, you have **more control** over which subflow outputs you want to expose to other flows. The parent flow does not need to know the internals of the child flow — it can access the subflow outputs by key. This **more decoupled** approach means that the parent flow is less dependent on the subflow, and **the subflow can change its implementation without breaking the parent flow**.

You can think of flow outputs as **data contracts** between flows. The subflow defines what data it produces, and the parent flow defines what data it consumes. This makes it easier to understand the dataflow between workflows and improves the maintainability of both flows over time.
### How to use the new subflow outputs Since 0.15.0, the flow can produce `outputs` by defining them in the flow file. Here is an example of a flow that produces an output: ```yaml id: flow_outputs namespace: company.team tasks: - id: mytask type: io.kestra.plugin.core.debug.Return format: this is a task output used as a final flow output outputs: - id: final type: STRING value: "{{ outputs.mytask.value }}" ``` You can see that outputs are defined as a list of key-value pairs. The `id` is the name of the output attribute (which must be unique within a flow), and the `value` is the value of the output. The `type` lets you define the expected type of the output. You can also add a `description` to the output. You will see the output of the flow on the **Executions** page in the **Overview** tab. ![subflow_output](../../../05.workflow-components/06.outputs/subflow_output.png) Here is how you can access the flow output in the parent flow: ```yaml id: parent_flow namespace: company.team tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow flowId: flow_outputs namespace: company.team wait: true - id: log_subflow_output type: io.kestra.plugin.core.log.Log message: "{{ outputs.subflow.outputs.final }}" ``` In the example above, the `subflow` task produces an output attribute `final`. This output attribute is then used in the `log_subflow_output` task. :::alert{type="info"} Note how the `outputs` are set twice within the `"{{outputs.subflow.outputs.final}}"`: 1. once to access outputs of the `subflow` task 2. once to access the outputs of the subflow itself — specifically, the `final` output. 
::: Here is what you will see in the Outputs tab of the **Executions** page in the parent flow: ![subflow_output_parent](../../../05.workflow-components/06.outputs/subflow_output_parent.png) --- # Kestra 0.17.0 Migration Guide: Renamed Plugins URL: https://kestra.io/docs/migration-guide/v0.17.0 > Migration guide for Kestra version 0.17.0, covering renamed plugins and discovery mechanism changes. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.17.0 Deprecated features and migration guides for 0.17.0 and onwards. --- # JSON Serialization Change in Kestra 0.17.0: NON_NULL URL: https://kestra.io/docs/migration-guide/v0.17.0/json-objects-serialization > Adapt Kestra flows to the NON_NULL JSON serialization strategy introduced in 0.17. Understand changes from NON_DEFAULT and how to update Pebble expressions. ## JSON Object Serialization How to adapt flows to the `NON_NULL` JSON serialization strategy. Kestra 0.17 migrates away from the previously used `NON_DEFAULT` JSON serialization strategy to fix various limitations and make the flow behavior more user-friendly. This change makes empty lists or maps serialized instead of being undefined. Adapting [Pebble expressions](../../../expressions/index.mdx) relying on the previously existing behavior is necessary to keep the functionality untouched. There are three main cases where Pebble expressions might be affected: 1) [Operators, Tags, and Tests](../../../expressions/02.syntax/index.mdx#tags) 2) [Pebble Syntax](../../../expressions/02.syntax/index.mdx) 3) [Conditions in Pebble](../../../06.concepts/06.pebble/index.md#using-conditions-in-pebble) ## 0.16 ```yaml id: inputsV16 namespace: company.team inputs: - id: optionalInput type: STRING required: false tasks: - id: testNullCoalescing type: io.kestra.core.tasks.log.Log message: "=>{{ inputs.optionalInput ?? 
'undefined' }}<=" # =>undefined<= - id: testOutputsMapPrepare type: io.kestra.plugin.scripts.python.Script script: "print('test')" - id: testOutputsMap type: io.kestra.core.tasks.log.Log message: "=>{{ outputs.testOutputsMapPrepare.outputFiles ?? 'empty' }}<=" # =>empty<= - id: testCondition type: io.kestra.core.tasks.flows.If condition: "{{ outputs.testOutputsMapPrepare.outputFiles is defined }}" then: - id: logOutputFiles type: io.kestra.core.tasks.log.Log message: "found" else: - id: logNoOutputFiles type: io.kestra.core.tasks.log.Log message: "not found" # not found ``` ## 0.17 ```yaml id: inputsV17 namespace: company.team inputs: - id: optionalInput type: STRING required: false tasks: - id: testNullCoalescing type: io.kestra.plugin.core.log.Log message: "=>{{ inputs.optionalInput ?? 'undefined' }}<=" # =><= - id: testOutputsMapPrepare type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: python:3.11-slim script: "print('test')" - id: testOutputsMap type: io.kestra.plugin.core.log.Log message: "=>{{ outputs.testOutputsMapPrepare.outputFiles ?? 'empty' }}<=" # =>{}<= - id: testCondition type: io.kestra.core.tasks.flows.If condition: "{{ outputs.testOutputsMapPrepare.outputFiles is defined }}" then: - id: logOutputFiles type: io.kestra.plugin.core.log.Log message: "found" # found else: - id: logNoOutputFiles type: io.kestra.plugin.core.log.Log message: "not found" ``` For more information, you can refer the [Improved serialization of JSON objects](../../../../blogs/2024-06-04-release-0-17/index.md#improved-serialization-of-json-objects) page. --- # LocalFiles & outputDir Deprecated in Kestra 0.17.0 URL: https://kestra.io/docs/migration-guide/v0.17.0/local-files > Guide to migrating from deprecated LocalFiles and outputDir to inputFiles and outputFiles in Kestra 0.17.0. ## Deprecation of LocalFiles and outputDir Migrate from `LocalFiles` and `outputDir` to `inputFiles` and `outputFiles`. 
The `LocalFiles` and `outputDir` are deprecated due to overlapping functionality that already exists using `inputFiles` and `outputFiles` on the `WorkingDirectory` and [script](../../../16.scripts/index.mdx) tasks. 1. **outputDir**: the `{{ outputDir }}` expression has been deprecated due to overlapping functionality available through the `outputFiles` property which is more flexible. 2. **LocalFiles**: the `LocalFiles` feature was initially introduced to allow injecting additional files into the script task's `WorkingDirectory`. However, this feature was confusing as there is nothing local about these files, and with the introduction of `inputFiles` to the `WorkingDirectory`, it became redundant. We recommend using the `inputFiles` property instead of `LocalFiles` to inject files into the script task's `WorkingDirectory`. The example below demonstrates how to do that: ```yaml id: apiJSONtoMongoDB namespace: company.team tasks: - id: inlineScript type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: python:3.11-slim beforeCommands: - pip install requests kestra > /dev/null outputFiles: - output.json inputFiles: query.sql: | SELECT sum(total) as total, avg(quantity) as avg_quantity FROM sales; script: | import requests import json from kestra import Kestra with open('query.sql', 'r') as input_file: sql = input_file.read() response = requests.get('https://api.github.com') data = response.json() with open('output.json', 'w') as output_file: json.dump(data, output_file) Kestra.outputs({'receivedSQL': sql, 'status': response.status_code}) - id: loadToMongoDB type: io.kestra.plugin.mongodb.Load connection: uri: mongodb://host.docker.internal:27017/ database: local collection: github from: "{{ outputs.inlineScript.outputFiles['output.json'] }}" ``` ## Examples To help you migrate your flows, here's a few examples of how you might update your flow to use the new format in 0.17.0. 
### `outputDir` #### Before Previously, you would specify `{{ outputDir }}` as you save the file. ```yaml id: getting_started_output namespace: company.team inputs: - id: api_url type: STRING defaults: https://dummyjson.com/products tasks: - id: api type: io.kestra.plugin.fs.http.Request uri: "{{ inputs.api_url }}" - id: python type: io.kestra.plugin.scripts.python.Script docker: image: python:slim beforeCommands: - pip install polars script: | import polars as pl data = {{outputs.api.body | jq('.products') | first}} df = pl.from_dicts(data) df.glimpse() df.select(["brand", "price"]).write_csv("{{outputDir}}/products.csv") ``` #### After Now you can remove this, and just specify the file name in the `outputFiles` properties. ```yaml id: getting_started_output namespace: company.team inputs: - id: api_url type: STRING defaults: https://dummyjson.com/products tasks: - id: api type: io.kestra.plugin.fs.http.Request uri: "{{ inputs.api_url }}" - id: python type: io.kestra.plugin.scripts.python.Script containerImage: python:slim beforeCommands: - pip install polars outputFiles: - "products.csv" script: | import polars as pl data = {{outputs.api.body | jq('.products') | first}} df = pl.from_dicts(data) df.glimpse() df.select(["brand", "price"]).write_csv("products.csv") ``` ### `LocalFiles` #### Before Previously, you would add a separate `LocalFiles` task inside of the `WorkingDirectory` task to specify your inputs for later tasks. 
```yaml id: pip namespace: company.team tasks: - id: wdir type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: pip type: io.kestra.plugin.core.storage.LocalFiles inputs: requirements.txt: | kestra>=0.6.0 pandas>=1.3.5 requests>=2.31.0 - id: pythonScript type: io.kestra.plugin.scripts.python.Script docker: image: python:3.11-slim beforeCommands: - pip install -r requirements.txt > /dev/null script: | import requests import kestra import pandas as pd print(f"requests version: {requests.__version__}") print(f"pandas version: {pd.__version__}") methods = [i for i in dir(kestra.Kestra) if not i.startswith("_")] print(f"Kestra methods: {methods}") ``` #### After In 0.17.0, you can specify your input files by using the `inputFiles` property from the `WorkingDirectory` task, removing the need for the `LocalFiles` task all together. ```yaml id: pip namespace: company.team tasks: - id: wdir type: io.kestra.plugin.core.flow.WorkingDirectory inputFiles: requirements.txt: | kestra>=0.6.0 pandas>=1.3.5 requests>=2.31.0 tasks: - id: pythonScript type: io.kestra.plugin.scripts.python.Script containerImage: python:3.11-slim beforeCommands: - pip install -r requirements.txt > /dev/null script: | import requests import kestra import pandas as pd print(f"requests version: {requests.__version__}") print(f"pandas version: {pd.__version__}") methods = [i for i in dir(kestra.Kestra) if not i.startswith("_")] print(f"Kestra methods: {methods}") ``` --- # Plugin Discovery Mechanism Change in Kestra 0.17.0 URL: https://kestra.io/docs/migration-guide/v0.17.0/plugin-discovery-mechanism > Changes to the plugin discovery mechanism in Kestra 0.17.0 using Java Service Loader. ## Plugin Discovery Mechanism Kestra 0.17.0 uses a new mechanism to discover and load plugins. If you use custom plugins, follow this guide to make the necessary adjustments. Plugins are now discovered and loaded using the standard *Java Service Loader*. 
So far, Kestra heavily relied on the _Bean Introspection_ mechanism provided by the Micronaut Framework for loading plugins (_Micronaut is the JVM-based API framework used by Kestra_). However, this implementation encountered limitations in maintaining backward compatibility of plugins during major Micronaut version upgrades, and was limiting the ability to introduce future enhancements around the plugin mechanism. This new implementation reduces the number of dependencies required for developing custom plugins, and plugins now load twice as fast. Finally, this change is part of a wider effort to improve the developer experience around plugins, and to reduce Micronaut's exposure outside the Kestra core. This change introduces minor breaking changes to how custom plugins are built. Below are the changes required to migrate to Kestra 0.17.0. ## Micronaut Dependencies For most plugin implementations, all Micronaut libs can be removed from the `compileOnly` dependencies in the `build.gradle` file. However, Micronaut is still required to use the utility classes provided by Kestra for running unit-tests. ## Kestra's Annotation Processor Kestra requires a new annotation processor to be configured in the `build.gradle` file of your project (or `pom.xml` for Maven). ```groovy annotationProcessor group: "io.kestra", name: "processor", version: kestraVersion ``` The role of this processor is to automatically manage the `META-INF/services` file needed by Java to discover your plugins. ## Custom Validators Kestra allows you to develop a custom constraint validator using the standard Java API for bean validation (i.e., JSR-380), which is used to validate the properties of custom tasks. :::alert{type="warning"} The custom validator must now implement the standard `jakarta.validation.ConstraintValidator` instead of the interface provided by Micronaut: `io.micronaut.validation.validator.constraints.ConstraintValidator`. 
:::

In addition, custom validation annotations should now strictly adhere to the Java bean validation specification — see the example below.

Kestra 0.16.6 and before:

```java
// file: io.kestra.plugins.custom.CustomNotEmpty.java
@Retention(RetentionPolicy.RUNTIME)
@Constraint(validatedBy = CustomNotEmptyValidator.class)
public @interface CustomNotEmpty {
    String message() default "invalid";
}
```

```java
// file: io.kestra.plugins.custom.CustomNotEmptyValidator.java
import io.micronaut.validation.validator.constraints.ConstraintValidator;
import io.micronaut.validation.validator.constraints.ConstraintValidatorContext;
// ...

@Singleton
@Introspected
public class CustomNotEmptyValidator implements ConstraintValidator<CustomNotEmpty, String> {
    @Override
    public boolean isValid(
        @Nullable String value,
        @NonNull AnnotationValue<CustomNotEmpty> annotationMetadata,
        @NonNull ConstraintValidatorContext context) {
        if (value == null) {
            return true; // nulls are allowed according to spec
        } else if (value.length() < 2) {
            context.messageTemplate("string must have at least two characters");
            return false;
        } else {
            return true;
        }
    }
}
```

Kestra 0.17.0 and later:

```java
// file: io.kestra.plugins.custom.CustomNotEmpty.java
@Retention(RetentionPolicy.RUNTIME)
@Constraint(validatedBy = CustomNotEmptyValidator.class)
public @interface CustomNotEmpty {
    String message() default "invalid";
    Class<?>[] groups() default {};
    Class<? extends Payload>[] payload() default {};
}
```

```java
// file: io.kestra.plugins.custom.CustomNotEmptyValidator.java
import jakarta.validation.ConstraintValidator;
import jakarta.validation.ConstraintValidatorContext;
// ...

@Singleton
@Introspected
public class CustomNotEmptyValidator implements ConstraintValidator<CustomNotEmpty, String> {
    @Override
    public boolean isValid(String value, ConstraintValidatorContext context) {
        if (value == null) {
            return true; // nulls are allowed according to spec
        } else if (value.length() < 2) {
            context.disableDefaultConstraintViolation();
            context
                .buildConstraintViolationWithTemplate("string must have at least two characters")
                .addConstraintViolation();
            return false;
        } else {
            return true;
        }
    }
}
```

---

# Renamed Plugins in Kestra 0.17.0: Update Your Flows
URL: https://kestra.io/docs/migration-guide/v0.17.0/renamed-plugins

> List of renamed plugins and task runners in Kestra 0.17.0 and how to update your flows.

## Renamed Plugins

Many core plugins have been renamed in Kestra 0.17.0, and `taskDefaults` are now `pluginDefaults`. While these are non-breaking changes, update your flows to use the new names.

Multiple plugin types have been moved to a new package structure under `io.kestra.plugin.core` to make the plugin system more consistent and intuitive.

:::alert{type="warning"}
Kestra also renamed `taskDefaults` to `pluginDefaults` to highlight that you can set default values for all plugins (_including triggers, task runners and more_), not just tasks.
:::

All of these are non-breaking changes as Kestra uses **aliases** for backward compatibility. You will see a friendly warning in the UI code editor if you use the old names.

![renamed-core-plugins](./renamed-core-plugins.png)

It's worth taking a couple of minutes to rename those in your flows to future-proof your code.
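Because the renames are purely mechanical, they are easy to script if you manage many flows as YAML files. The sketch below is an illustration only (not an official Kestra tool): it applies a handful of renames from the tables that follow via whole-identifier string replacement — extend the `RENAMES` mapping with whichever entries your flows use.

```python
import re

# Mapping of deprecated type names to their Kestra 0.17.0 replacements.
# Only a few entries are shown here -- extend it from the tables below.
RENAMES = {
    "io.kestra.core.tasks.log.Log": "io.kestra.plugin.core.log.Log",
    "io.kestra.core.tasks.debugs.Return": "io.kestra.plugin.core.debug.Return",
    "io.kestra.core.models.triggers.types.Schedule": "io.kestra.plugin.core.trigger.Schedule",
    "io.kestra.plugin.fs.http.Request": "io.kestra.plugin.core.http.Request",
}

def rename_plugin_types(flow_source: str) -> str:
    """Return the flow source with deprecated plugin types renamed.

    Matches whole type identifiers only, so already-renamed types such as
    io.kestra.plugin.core.log.Log are left untouched.
    """
    for old, new in RENAMES.items():
        # negative lookbehind/lookahead prevent matching inside a longer identifier
        flow_source = re.sub(rf"(?<![\w.]){re.escape(old)}(?![\w.])", new, flow_source)
    return flow_source
```

Run it over each flow file (e.g. `path.write_text(rename_plugin_types(path.read_text()))`) and review the resulting diff before committing.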
## Renamed Core Plugins Here is the schema showing how the core abstractions have been renamed: - `io.kestra.core.models.conditions.types.*` → `io.kestra.plugin.core.condition.*` - `io.kestra.core.models.triggers.types.*` → `io.kestra.plugin.core.trigger.*` - `io.kestra.core.models.tasks.runners.types.*` → `io.kestra.plugin.core.runner.*` - `io.kestra.core.tasks.storages.*` → `io.kestra.plugin.core.storage.*` - `io.kestra.core.tasks.*.*` → `io.kestra.plugin.core.*.*` - `io.kestra.plugin.fs.http.*` → `io.kestra.plugin.core.http.*` Below you can see the full list of renamed plugins: | Old Name Before Kestra 0.17.0 | New Name After Kestra 0.17.0 | |----------------------------------------------------------------------|---------------------------------------------------------------| | `io.kestra.core.models.conditions.types.DateTimeBetweenCondition` | `io.kestra.plugin.core.condition.DateTimeBetweenCondition` | | `io.kestra.core.models.conditions.types.DayWeekCondition` | `io.kestra.plugin.core.condition.DayWeekCondition` | | `io.kestra.core.models.conditions.types.DayWeekInMonthCondition` | `io.kestra.plugin.core.condition.DayWeekInMonthCondition` | | `io.kestra.core.models.conditions.types.ExecutionFlowCondition` | `io.kestra.plugin.core.condition.ExecutionFlowCondition` | | `io.kestra.core.models.conditions.types.ExecutionLabelsCondition` | `io.kestra.plugin.core.condition.ExecutionLabelsCondition` | | `io.kestra.core.models.conditions.types.ExecutionNamespaceCondition` | `io.kestra.plugin.core.condition.ExecutionNamespaceCondition` | | `io.kestra.core.models.conditions.types.ExecutionStatusCondition` | `io.kestra.plugin.core.condition.ExecutionStatusCondition` | | `io.kestra.core.models.conditions.types.FlowCondition` | `io.kestra.plugin.core.condition.FlowCondition` | | `io.kestra.core.models.conditions.types.FlowNamespaceCondition` | `io.kestra.plugin.core.condition.FlowNamespaceCondition` | | `io.kestra.core.models.conditions.types.HasRetryAttemptCondition` | 
`io.kestra.plugin.core.condition.HasRetryAttemptCondition` | | `io.kestra.core.models.conditions.types.MultipleCondition` | `io.kestra.plugin.core.condition.MultipleCondition` | | `io.kestra.core.models.conditions.types.NotCondition` | `io.kestra.plugin.core.condition.NotCondition` | | `io.kestra.core.models.conditions.types.OrCondition` | `io.kestra.plugin.core.condition.OrCondition` | | `io.kestra.core.models.conditions.types.PublicHolidayCondition` | `io.kestra.plugin.core.condition.PublicHolidayCondition` | | `io.kestra.core.models.conditions.types.TimeBetweenCondition` | `io.kestra.plugin.core.condition.TimeBetweenCondition` | | `io.kestra.core.models.conditions.types.VariableCondition` | `io.kestra.plugin.core.condition.ExpressionCondition` | | `io.kestra.core.models.conditions.types.WeekendCondition` | `io.kestra.plugin.core.condition.WeekendCondition` | | `io.kestra.core.models.tasks.runners.types.ProcessTaskRunner` | `io.kestra.plugin.core.runner.Process` | | `io.kestra.core.models.triggers.types.Flow` | `io.kestra.plugin.core.trigger.Flow` | | `io.kestra.core.models.triggers.types.Schedule` | `io.kestra.plugin.core.trigger.Schedule` | | `io.kestra.core.models.triggers.types.Webhook` | `io.kestra.plugin.core.trigger.Webhook` | | `io.kestra.core.tasks.debugs.Echo` | `io.kestra.plugin.core.debug.Echo` | | `io.kestra.core.tasks.debugs.Return` | `io.kestra.plugin.core.debug.Return` | | `io.kestra.core.tasks.executions.Counts` | `io.kestra.plugin.core.execution.Count` | | `io.kestra.core.tasks.executions.Fail` | `io.kestra.plugin.core.execution.Fail` | | `io.kestra.core.tasks.executions.Labels` | `io.kestra.plugin.core.execution.Labels` | | `io.kestra.core.tasks.flows.AllowFailure` | `io.kestra.plugin.core.flow.AllowFailure` | | `io.kestra.core.tasks.flows.Dag` | `io.kestra.plugin.core.flow.Dag` | | `io.kestra.core.tasks.flows.EachParallel` | `io.kestra.plugin.core.flow.EachParallel` | | `io.kestra.core.tasks.flows.EachSequential` | 
`io.kestra.plugin.core.flow.EachSequential` | | `io.kestra.core.tasks.flows.Flow` | `io.kestra.plugin.core.flow.Subflow` | | `io.kestra.core.tasks.flows.ForEachItem` | `io.kestra.plugin.core.flow.ForEachItem` | | `io.kestra.core.tasks.flows.If` | `io.kestra.plugin.core.flow.If` | | `io.kestra.core.tasks.flows.Parallel` | `io.kestra.plugin.core.flow.Parallel` | | `io.kestra.core.tasks.flows.Pause` | `io.kestra.plugin.core.flow.Pause` | | `io.kestra.core.tasks.flows.Sequential` | `io.kestra.plugin.core.flow.Sequential` | | `io.kestra.core.tasks.flows.Subflow` | `io.kestra.plugin.core.flow.Subflow` | | `io.kestra.core.tasks.flows.Switch` | `io.kestra.plugin.core.flow.Switch` | | `io.kestra.core.tasks.flows.Template` | `io.kestra.plugin.core.flow.Template` | | `io.kestra.core.tasks.flows.WorkingDirectory` | `io.kestra.plugin.core.flow.WorkingDirectory` | | `io.kestra.core.tasks.log.Fetch` | `io.kestra.plugin.core.log.Fetch` | | `io.kestra.core.tasks.log.Log` | `io.kestra.plugin.core.log.Log` | | `io.kestra.core.tasks.states.Delete` | `io.kestra.plugin.core.state.Delete` | | `io.kestra.core.tasks.states.Get` | `io.kestra.plugin.core.state.Get` | | `io.kestra.core.tasks.states.Set` | `io.kestra.plugin.core.state.Set` | | `io.kestra.core.tasks.storages.Concat` | `io.kestra.plugin.core.storage.Concat` | | `io.kestra.core.tasks.storages.DeduplicateItems` | `io.kestra.plugin.core.storage.DeduplicateItems` | | `io.kestra.core.tasks.storages.Delete` | `io.kestra.plugin.core.storage.Delete` | | `io.kestra.core.tasks.storages.FilterItems` | `io.kestra.plugin.core.storage.FilterItems` | | `io.kestra.core.tasks.storages.LocalFiles` | `io.kestra.plugin.core.storage.LocalFiles` | | `io.kestra.core.tasks.storages.Purge` | `io.kestra.plugin.core.storage.Purge` | | `io.kestra.core.tasks.storages.PurgeExecution` | `io.kestra.plugin.core.storage.PurgeExecution` | | `io.kestra.core.tasks.storages.Reverse` | `io.kestra.plugin.core.storage.Reverse` | | `io.kestra.core.tasks.storages.Size` | 
`io.kestra.plugin.core.storage.Size` | | `io.kestra.core.tasks.storages.Split` | `io.kestra.plugin.core.storage.Split` | | `io.kestra.core.tasks.templating.TemplatedTask` | `io.kestra.plugin.core.templating.TemplatedTask` | | `io.kestra.core.tasks.trigger.Toggle` | `io.kestra.plugin.core.trigger.Toggle` | | `io.kestra.plugin.fs.http.Download` | `io.kestra.plugin.core.http.Download` | | `io.kestra.plugin.fs.http.Request` | `io.kestra.plugin.core.http.Request` | | `io.kestra.plugin.fs.http.Trigger` | `io.kestra.plugin.core.http.Trigger` | ## Renamed Serdes Plugins [Serialization tasks](https://github.com/kestra-io/kestra/issues/2298) have also been renamed from `Readers` and `Writers` to explicit conversion tasks to make it clear that these tasks convert from or to [Ion](https://amazon-ion.github.io/ion-docs/) — the primary data format used in Kestra to serialize data between tasks and storage systems. For example, `CsvReader` is now `CsvToIon` and `CsvWriter` is now `IonToCsv`. A full list of the renamed serialization tasks: - `CsvReader` → `CsvToIon` - `CsvWriter` → `IonToCsv` - `JsonReader` → `JsonToIon` - `JsonWriter` → `IonToJson` - `AvroReader` → `AvroToIon` - `AvroWriter` → `IonToAvro` - `XmlReader` → `XmlToIon` - `XmlWriter` → `IonToXml` - `ParquetReader` → `ParquetToIon` - `ParquetWriter` → `IonToParquet` The table shows full paths of the renamed serialization tasks: | Old Path Before Kestra 0.17.0 | New Path After Kestra 0.17.0 | |-------------------------------------------------|------------------------------------------------| | `io.kestra.plugin.serdes.csv.CsvReader` | `io.kestra.plugin.serdes.csv.CsvToIon` | | `io.kestra.plugin.serdes.csv.CsvWriter` | `io.kestra.plugin.serdes.csv.IonToCsv` | | `io.kestra.plugin.serdes.json.JsonReader` | `io.kestra.plugin.serdes.json.JsonToIon` | | `io.kestra.plugin.serdes.json.JsonWriter` | `io.kestra.plugin.serdes.json.IonToJson` | | `io.kestra.plugin.serdes.avro.AvroReader` | `io.kestra.plugin.serdes.avro.AvroToIon` | 
| `io.kestra.plugin.serdes.avro.AvroWriter` | `io.kestra.plugin.serdes.avro.IonToAvro` | | `io.kestra.plugin.serdes.xml.XmlReader` | `io.kestra.plugin.serdes.xml.XmlToIon` | | `io.kestra.plugin.serdes.xml.XmlWriter` | `io.kestra.plugin.serdes.xml.IonToXml` | | `io.kestra.plugin.serdes.parquet.ParquetReader` | `io.kestra.plugin.serdes.parquet.ParquetToIon` | | `io.kestra.plugin.serdes.parquet.ParquetWriter` | `io.kestra.plugin.serdes.parquet.IonToParquet` | ## Renamed Task Runners Task runners have also been renamed for readability. For example, `io.kestra.plugin.aws.runner.AwsBatchTaskRunner` is now `io.kestra.plugin.ee.aws.runner.Batch`. The updated names are as follows: | Old Path Before Kestra 0.17.0 | New Path After Kestra 0.17.0 | |---------------------------------------------------------------|-------------------------------------------------| | `io.kestra.core.models.tasks.runners.types.ProcessTaskRunner` | `io.kestra.plugin.core.runner.Process` | | `io.kestra.plugin.scripts.runner.docker.DockerTaskRunner` | `io.kestra.plugin.scripts.runner.docker.Docker` | | `io.kestra.plugin.ee.kubernetes.runner.KubernetesTaskRunner` | `io.kestra.plugin.ee.kubernetes.runner.Kubernetes` | | `io.kestra.plugin.ee.aws.runner.AwsBatchTaskRunner` | `io.kestra.plugin.ee.aws.runner.Batch` | | `io.kestra.plugin.ee.azure.runner.AzureBatchTaskRunner` | `io.kestra.plugin.ee.azure.runner.Batch` | | `io.kestra.plugin.ee.gcp.runner.GcpBatchTaskRunner` | `io.kestra.plugin.ee.gcp.runner.Batch` | | `io.kestra.plugin.ee.gcp.runner.GcpCloudRunTaskRunner` | `io.kestra.plugin.ee.gcp.runner.CloudRun` | ## Renamed Redis Triggers and Tasks The Redis plugin has been updated to make it easier to extend and maintain. 
The following classes have been renamed: | Old Path Before Kestra 0.17.0 | New Path After Kestra 0.17.0 | |--------------------------------------|-----------------------------------------------| | `io.kestra.plugin.redis.ListPop` | `io.kestra.plugin.redis.list.ListPop` | | `io.kestra.plugin.redis.ListPush` | `io.kestra.plugin.redis.list.ListPush` | | `io.kestra.plugin.redis.TriggerList` | `io.kestra.plugin.redis.list.Trigger` | | - | `io.kestra.plugin.redis.list.RealtimeTrigger` | | `io.kestra.plugin.redis.Publish` | `io.kestra.plugin.redis.pubsub.Publish` | | `io.kestra.plugin.redis.Get` | `io.kestra.plugin.redis.string.Get` | | `io.kestra.plugin.redis.Set` | `io.kestra.plugin.redis.string.Set` | | `io.kestra.plugin.redis.Delete` | `io.kestra.plugin.redis.string.Delete` | --- # Volume Mount Migration in Kestra 0.17.0: Plugin Config URL: https://kestra.io/docs/migration-guide/v0.17.0/volume-mount > Guide to migrating from the deprecated volume-enabled property to plugin configuration in Kestra 0.17.0. ## Volume Mount How to migrate `volume-enabled` to the plugin configuration. The docker volume mount, by setting the property `kestra.tasks.scripts.docker.volume-enabled` to `true`, has been deprecated since 0.17.0. Use the plugin configuration `volume-enabled` for the Docker runner plugin instead. This change is implemented in a non-breaking way, so you don't need to immediately change the way you use the docker volume mount. In case you use this older method for mounting the volume, you will receive the following deprecation warning: :::alert{type="warning"} The `kestra.tasks.scripts.docker.volume-enabled` is deprecated. Use the plugin configuration `volume-enabled` instead. 
:::

Make the following change in the [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml) to mount the volume:

```yaml
kestra:
  image: kestra/kestra:latest
  pull_policy: always
  user: "root"
  env_file:
    - .env
  command: server standalone --worker-thread=128
  volumes:
    - kestra-data:/app/storage
    - /var/run/docker.sock:/var/run/docker.sock
    - /tmp/kestra-wd:/tmp/kestra-wd:rw
  environment:
    KESTRA_CONFIGURATION: |
      datasources:
        postgres:
          url: jdbc:postgresql://postgres:5432/kestra
          driver-class-name: org.postgresql.Driver
          username: kestra
          password: k3str4
      kestra:
        server:
          basic-auth:
            enabled: false
            username: "admin@kestra.io" # it must be a valid email address
            password: kestra
        repository:
          type: postgres
        storage:
          type: local
          local:
            base-path: "/app/storage"
        queue:
          type: postgres
        tasks:
          tmp-dir:
            path: /tmp/kestra-wd/tmp
        plugins:
          configurations:
            - type: io.kestra.plugin.scripts.runner.docker.Docker
              values:
                volume-enabled: true # 👈 this is the relevant setting
```

For more information, refer to the [Bind mount](../../../16.scripts/index.mdx) page.

---

# Kestra 0.18.0 Migration Guide: What Changed
URL: https://kestra.io/docs/migration-guide/v0.18.0

> Migration guide for Kestra 0.18.0. Covers the runner to taskRunner transition and Terraform task_defaults to plugin_defaults configuration changes.

import ChildCard from "~/components/docs/ChildCard.astro"

## 0.18.0

Deprecated features and migration guides for 0.18.0 and onwards.

---

# runner Deprecated in Kestra 0.18.0: Use taskRunner
URL: https://kestra.io/docs/migration-guide/v0.18.0/runners

> Guide to migrating from the deprecated runner property to the more flexible taskRunner property.

## Deprecation of runner property in favor of taskRunner

How to migrate from `runner` to `taskRunner`.

Task Runners are a pluggable system that allows you to offload the execution of your tasks to different environments.
With the general availability of `taskRunner` in Kestra 0.18.0, the [runner](../../../16.scripts/03.task-runners/index.md) property is deprecated. Task Runners provide more flexibility and control over how your tasks are executed, allowing you to run your code in various remote environments by: 1. Leveraging task runner plugins [managed by Kestra](/demo) 2. Building your own task runner plugins customized to your needs. ## Migration To migrate from the `runner` property to `taskRunner`, update your flow code as follows: 1. Replace the `runner` property with `taskRunner`. 2. If you were using the `DOCKER` runner with a custom Docker image, replace the `docker.image` property with the `containerImage` property. 3. Update any other properties in the `taskRunner` configuration as needed, e.g. to configure Docker image pull policies, CPU and memory limits, or to provide credentials to private Docker registries. :::alert{type="info"} All other script task properties, such as `beforeCommands`, `commands`, `inputFiles`, `outputFiles`, `interpreter`, `env`, `workerGroup`, and more, remain the same. **You only need to replace the `runner` property with `taskRunner` and adjust the Docker image configuration if needed.** ::: The following examples clarify the migration. ### From `PROCESS` runner to `taskRunner` If you were using the `PROCESS` runner to execute your tasks in local processes, add the `taskRunner` property with the `type` set to `io.kestra.plugin.core.runner.Process`. 
Before (the old way): ```yaml id: example_with_runner namespace: company.team tasks: - id: script type: io.kestra.plugin.scripts.python.Script runner: PROCESS script: | from kestra import Kestra data = dict(message="Hello from Kestra!", release="0.17.0") print(data.get("message")) Kestra.outputs(data) ``` After (the current way): ```yaml id: example_with_taskRunner namespace: company.team tasks: - id: script type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.core.runner.Process script: | from kestra import Kestra data = dict(message="Hello from Kestra!", release="0.18.0") print(data.get("message")) Kestra.outputs(data) ``` ### From `DOCKER` runner to `taskRunner` If you were using the `DOCKER` runner to run your scripts in a Docker container, add the `taskRunner` property with the `type` set to `io.kestra.plugin.scripts.runner.docker.Docker`. Before (the old way): ```yaml id: example_with_runner namespace: company.team tasks: - id: script type: io.kestra.plugin.scripts.python.Script runner: DOCKER docker: image: ghcr.io/kestra-io/kestrapy:latest pullPolicy: IF_NOT_PRESENT cpu: cpus: 1 memory: memory: "512MB" script: | from kestra import Kestra data = dict(message="Hello from Kestra!", release="0.17.0") print(data.get("message")) Kestra.outputs(data) ``` After (the current way): ```yaml id: example_with_taskRunner namespace: company.team tasks: - id: script type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker pullPolicy: IF_NOT_PRESENT cpu: cpus: 1 memory: memory: "512MB" containerImage: ghcr.io/kestra-io/kestrapy:latest script: | from kestra import Kestra data = dict(message="Hello from Kestra!", release="0.18.0") print(data.get("message")) Kestra.outputs(data) ``` Note how the `containerImage` is now a top-level property of each script task. This makes the configuration more flexible, as the image changes more often than the standard runner configuration. 
--- # Terraform task_defaults to plugin_defaults in 0.18.0 URL: https://kestra.io/docs/migration-guide/v0.18.0/tf-task-defaults > Migrate from Terraform task_defaults to plugin_defaults in Kestra 0.18.0. Update your Terraform configurations to use the new plugin_defaults property. ## Deprecation of Terraform task_defaults in favor of plugin_defaults How to migrate `task_defaults` to `plugin_defaults` for the Kestra Terraform Provider. In the [v0.17.0 release](../../../../blogs/2024-06-04-release-0-17/index.md), Task Defaults was renamed to [Plugin Defaults](../../../05.workflow-components/09.plugin-defaults/index.md) to better reflect its purpose. As a result, the 0.18.0 version of the [Terraform Provider](../../../13.terraform/index.mdx) now uses the property `plugin_defaults` instead of `task_defaults` in the `kestra_namespace` resource. To migrate, replace `task_defaults` with `plugin_defaults` in your Terraform configuration before upgrading your Kestra Terraform provider. --- # Kestra 0.19.0 Migration Guide: State Store to KV Store URL: https://kestra.io/docs/migration-guide/v0.19.0 > Kestra 0.19.0 Migration Guide. Overview of deprecated features and steps to upgrade to 0.19.0, with a focus on State Store to KV Store migration. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.19.0 Deprecated features and migration guides for 0.19.0 and onwards. --- # State Store Deprecated in Kestra 0.19.0: Use KV Store URL: https://kestra.io/docs/migration-guide/v0.19.0/state-store > Migrate from State Store to KV Store in Kestra 0.19.0. Learn why State Store is deprecated and how to transition to the KV Store for better data management. ## Deprecation of State Store in favor of KV Store How to migrate from State Store to KV Store. The State Store is a mechanism used under the hood by Kestra to store the state of a task execution as a file in internal storage.
With the general availability of the [KV Store](../../../06.concepts/05.kv-store/index.md) in Kestra 0.18.0, the State Store is deprecated starting with Kestra 0.19.0. ## Why the change? State Store was difficult to troubleshoot and manage. There was no way to view what data was actually stored in the State Store from the UI/API, and the data stored there was tied to a given flow execution, making it challenging to manage the lifecycle of the data. The KV Store provides more flexibility and control over the data persisted during your task execution, allowing you to: - set a type for each key (e.g. string, number, boolean, datetime, date, duration, JSON), - view the data from the UI, - query the persisted values via key from your flows or via API, - manage the lifecycle for each key via TTL. ## State Store tasks The [State Store tasks](/plugins/core#state) are deprecated in favor of equivalent [KV Store tasks](/plugins/core#kv). The table below shows a mapping of the deprecated State Store tasks to the KV Store tasks. | State Store task | KV Store task | |------------------|---------------| | `io.kestra.plugin.core.state.Get` | `io.kestra.plugin.core.kv.Get` | | `io.kestra.plugin.core.state.Set` | `io.kestra.plugin.core.kv.Set` | | `io.kestra.plugin.core.state.Delete` | `io.kestra.plugin.core.kv.Delete` | ## How to migrate All plugins that used State Store now use KV Store under the hood. This includes: - all [Singer plugins](/plugins) - all [Debezium plugins](https://github.com/kestra-io/plugin-debezium) - [CloudQuery plugin](/plugins/plugin-cloudquery) If you were using one of those plugins, run this command after upgrading to Kestra 0.19.0: ```bash /app/kestra sys state-store migrate ``` Additionally, if you were using the [State Store tasks](/plugins/core#state) directly in your flows, update them to use the equivalent [KV Store tasks](/plugins/core#kv).
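As a sketch of what that flow-level change can look like, here is a minimal flow using the KV Store tasks from the mapping table above. The property names (`key`, `value`) follow the KV Store plugin reference; verify them against the plugin documentation for your Kestra version:

```yaml
id: kv_store_migration_example
namespace: company.team

tasks:
  # Persist a value under a key (replaces io.kestra.plugin.core.state.Set)
  - id: set_kv
    type: io.kestra.plugin.core.kv.Set
    key: last_run
    value: "{{ execution.startDate }}"

  # Read the value back (replaces io.kestra.plugin.core.state.Get)
  - id: get_kv
    type: io.kestra.plugin.core.kv.Get
    key: last_run
```

Unlike State Store data, the `last_run` key is then visible and editable in the KV Store UI for the namespace, and can be given a TTL.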
--- # Kestra 0.20.0 Migration Guide: KV, Kafka & Config URL: https://kestra.io/docs/migration-guide/v0.20.0 > Kestra 0.20.0 Migration Guide. Detailed instructions for upgrading, including KV function changes, Kafka queue restore, and configuration updates. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.20.0 Deprecated features and migration guides for 0.20.0 and onwards. --- # Cluster Monitoring Permissions Change in Kestra 0.20.0 URL: https://kestra.io/docs/migration-guide/v0.20.0/cluster-monitoring > Cluster Monitoring permissions update in Kestra 0.20.0. Access to the Instance/Cluster Monitoring page now requires `SUPER_ADMIN` privileges. ## Different permissions for accessing Cluster Monitoring Migrating permissions for accessing Cluster Monitoring Before the 0.20.0 release, the permission `INFRASTRUCTURE` was required to access the Cluster Monitoring page. The page has been renamed `Instance` and now provides more information and offers new features. Access to this page now requires `SUPER_ADMIN` status. --- # Conditions Renamed in Kestra 0.20.0: Update Your Flows URL: https://kestra.io/docs/migration-guide/v0.20.0/conditions-renamed > Condition renaming in Kestra 0.20.0. Update your flows to use the new condition names (e.g., `ExecutionStatus` instead of `ExecutionStatusCondition`). ## Conditions renamed Migrating Flow trigger conditions All conditions [have been renamed](https://github.com/kestra-io/kestra/pull/6032) to drop the `Condition` suffix. Aliases are in place, so all flows will still work, but you will see an information note recommending that you upgrade to the new name. Conditions that were already deprecated haven't been renamed, to avoid extra overhead on your end.
Examples of renamed conditions: - `io.kestra.plugin.core.condition.ExecutionStatusCondition` → `io.kestra.plugin.core.condition.ExecutionStatus` - `io.kestra.plugin.core.condition.ExecutionNamespaceCondition` → `io.kestra.plugin.core.condition.ExecutionNamespace` - `io.kestra.plugin.core.condition.ExecutionLabelsCondition` → `io.kestra.plugin.core.condition.ExecutionLabels` --- # Custom Plugins Migration in Kestra 0.20.0: namespace Param URL: https://kestra.io/docs/migration-guide/v0.20.0/custom-plugins > Custom Plugin migration for Kestra 0.20.0. Update plugins to handle the mandatory `namespace` parameter in internal storage methods. ## Custom plugins Migrating custom plugins The internal storage [now takes](https://github.com/kestra-io/kestra/pull/6022) a `namespace` parameter on all its methods. Plugins must pass this parameter, but it can safely be set to `null` in tests that use the internal storage directly. Plugins are normally not affected if they use the `runContext().storage()` method, which has been updated to automatically use the execution's namespace. --- # Elasticsearch Indexer Change in Kestra 0.20.0 Enterprise URL: https://kestra.io/docs/migration-guide/v0.20.0/elasticsearch-indexer > Elasticsearch Indexer changes in Kestra 0.20.0 (Enterprise). The webserver now embeds the indexer by default for Kafka backends, simplifying deployment. ## Elasticsearch indexer Migration guide for the Elasticsearch indexer Starting with 0.20, if you are using the Kafka backend, there is no need to start an external indexer, as the webserver will start an embedded indexer automatically. However, you can still start a dedicated indexer if you want, and disable the embedded one by starting the webserver with `--no-indexer`. A dedicated indexer should only be needed at very high throughput, when you want the UI to reflect execution information with very low latency.
Most of the time, the webserver embedded indexer should be enough. --- # KV Namespace Access Change in Kestra 0.20.0: Permissions URL: https://kestra.io/docs/migration-guide/v0.20.0/kv-function > KV Function security update in Kestra 0.20.0. Ensure proper namespace access permissions when retrieving Key-Value pairs from different namespaces. ## Retrieving KV pairs from other namespaces Migrating usage of KV functions The `kv()` Pebble function was missing an allowed-namespace check when a namespace is passed to the function, e.g. `{{ kv('MY_KEY', 'differentNamespace') }}`. This check has been added in the 0.20 release. If you use the `kv()` function to get a KV from a different namespace in the Enterprise Edition, ensure access to that namespace is allowed (this happens by default unless explicitly restricted). --- # Restore Kafka Queue in Kestra 0.20.0 Enterprise URL: https://kestra.io/docs/migration-guide/v0.20.0/restore-kafka-queue > Kafka Queue Restore for Kestra 0.20.0 (Enterprise). Instructions to run the `sys-ee restore-queue` command to sync flow source code for plugin defaults. ## Restore Kafka queue Migration guide for Kafka backend users Due to a change in how Kestra handles plugin defaults, the flow source needs to be available to the Kestra Executor. This change syncs the flow source code with the queue, allowing the executor to apply `pluginDefaults` defined in the flow YAML configuration more efficiently. For users with a Kafka backend, this migration can be performed by running the following CLI command: ```bash ./kestra sys-ee restore-queue --no-recreate --no-templates --no-triggers --no-namespaces --no-tenants ``` If you are using Kestra with the JDBC backend, this change doesn't apply to you, and you don't need to run the above command. --- # Server Configuration Changes in Kestra 0.20.0 URL: https://kestra.io/docs/migration-guide/v0.20.0/server-configuration > Server Configuration updates in Kestra 0.20.0.
Mail service config moved to `kestra.ee` and Secret Manager configuration is now mandatory. ## Server configuration In Kestra < 0.20.0, email server configuration lived under `kestra.mail-service`. Given that it's used only within the Enterprise Edition (for resetting passwords and sending invites), we moved it to `kestra.ee.mail-service`. ## Required Secret Manager In Kestra < 0.20.0, if you were not using the Enterprise Edition secret manager, Kestra automatically fell back to the open-source version and required no configuration. In Kestra 0.20.0 and later, Kestra requires a secret manager configuration, while still falling back to the open-source implementation if a secret does not exist in the EE secret manager. Add one of these secret manager configurations depending on the backend: > JDBC Backend ```yaml kestra: secret: type: jdbc jdbc: secret: "your-secret-key" ``` > Kafka + ElasticSearch Backend ```yaml kestra: secret: type: elasticsearch elasticsearch: secret: "your-secret-key" ``` --- # Usernames Replaced by Emails in Kestra 0.20.0 Enterprise URL: https://kestra.io/docs/migration-guide/v0.20.0/username-replaced-by-email > Usernames replaced by emails in Kestra 0.20.0 (Enterprise). Run the migration command to update user identifiers to valid email addresses. ## Usernames replaced by email addresses Replace usernames by email addresses Starting with Kestra 0.20, Kestra mandates that usernames be valid email addresses. If this is not the case within your instance, you can run the following CLI command to replace a username with the corresponding email for each user in the instance: ```bash ./kestra auths users email-replace-username ``` If the email address is not set for the user or is invalid, Kestra will log those usernames so you can address those edge cases manually. If the username is already an email address, the above command will additionally set that value within the email property.
--- # Worker Group Fallback Change in Kestra 0.20.0 Enterprise URL: https://kestra.io/docs/migration-guide/v0.20.0/worker-group-fallback > Worker Group Fallback changes in Kestra 0.20.0 (Enterprise). Configure fallback: FAIL to retain previous behavior when no workers are available in a group. ## Fallback on unhealthy workers Migrating usage of worker group keys By default, a task configured to run on a worker group where no workers are available will wait for a worker to become available. The previous behavior was to fail. This behavior is configurable. To keep the previous behavior, set the `fallback` behavior to `FAIL`: ```yaml - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! workerGroup: key: wg1 fallback: FAIL # possible values are WAIT (default), FAIL or CANCEL ``` To set a custom `workerGroup` `key` and `fallback` behavior per plugin type and/or namespace, use `pluginDefaults`. --- # Kestra 0.21.0 Migration Guide: Secrets & Logging URL: https://kestra.io/docs/migration-guide/v0.21.0 > Kestra 0.21.0 Migration Guide. Information on new features like restarting parent flows, secret function behavior changes, and logging updates. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.21.0 Deprecated features and migration guides for 0.21.0 and onwards. --- # Default Git Branch Changed to main in Kestra 0.21.0 URL: https://kestra.io/docs/migration-guide/v0.21.0/default-git-branch > Default Git Branch update in Kestra 0.21.0. Git tasks now default to `main` instead of `kestra`. Update flows relying on the old default. ## Default Git Branch Changed default Git branch name from `kestra` to `main`. The default branch within Git tasks has been renamed from `kestra` to `main` ([PR #98](https://github.com/kestra-io/plugin-git/pull/98)).
Update any workflows that implicitly rely on the former default branch within [PushFlows](/plugins/plugin-git/io.kestra.plugin.git.pushflows), [PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles), and [SyncNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.syncnamespacefiles). Here is an example before and after the change. ### Before 0.21.0 ```yaml id: push_to_git namespace: system tasks: - id: commit_and_push type: io.kestra.plugin.git.PushFlows gitDirectory: _flows url: https://github.com/kestra-io/scripts # required string username: git_username # required string needed for Auth with Git password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" branch: main # optional, uses "kestra" by default ``` ### After 0.21.0 ```yaml id: push_to_git namespace: system tasks: - id: commit_and_push type: io.kestra.plugin.git.PushFlows gitDirectory: _flows url: https://github.com/kestra-io/scripts # required string username: git_username # required string needed for Auth with Git password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" branch: main # optional, uses "main" by default ``` --- # Parent Flow Restart Behavior in Kestra 0.21.0 URL: https://kestra.io/docs/migration-guide/v0.21.0/restarting-parent-flow > Restart behavior change in Kestra 0.21.0. Parent flows now restart failed subflows by default. Configure `restartBehavior` to change this if needed. ## Restarting parent flow Restarting Parent Flow with Failed Subflow When restarting an execution, `Subflow` or `ForEachItem` tasks now restart the existing failed subflow execution rather than creating a new one. This behavior is configurable via the new `restartBehavior` enum property; setting it to `NEW_EXECUTION` retains the previous behavior ([PR #6799](https://github.com/kestra-io/kestra/pull/6799); [Issue #6722](https://github.com/kestra-io/kestra/issues/6722)).
A `system.restarted: true` label is added during restart for tracking, and the underlying subflow execution storage table is retained to avoid migration issues (scheduled for removal in v0.22). ## Example To keep the previous behavior of creating a new subflow execution when restarting the parent flow, set the `restartBehavior` property to `NEW_EXECUTION`: ```yaml id: parent namespace: company.team tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: child restartBehavior: NEW_EXECUTION # or RETRY_FAILED ``` The default behavior is `RETRY_FAILED`, which restarts the existing failed subflow execution when restarting the parent flow. --- # Secret Function Change: Missing Keys Now Throw Errors URL: https://kestra.io/docs/migration-guide/v0.21.0/secret-function > Secret function update in Kestra 0.21.0 (OSS). Fetching missing secrets now throws an exception instead of returning null, matching Enterprise behavior. ## Retrieving non-existing secrets Changed handling of non-existing secrets. Fetching a non-existing secret using the `secret()` function now throws an exception instead of returning `null` in the open-source version, aligning the open-source behavior with the behavior in the Enterprise Edition. --- # stderr Log Level Change: WARNING to ERROR in 0.21.0 URL: https://kestra.io/docs/migration-guide/v0.21.0/stderr-log-level > Script Task logging update in Kestra 0.21.0. Output to `stderr` is now logged as ERROR level instead of WARNING. ## Log level for stderr output STDERR Logged at ERROR Level in Script Tasks Script tasks now log output sent to `stderr` at the ERROR level instead of WARNING ([PR #6383](https://github.com/kestra-io/kestra/pull/6383); [Issue #190](https://github.com/kestra-io/plugin-scripts/issues/190)). 
## Example Here is an example of a script task that logs an error message to `stderr`: ```yaml id: error_logs_demo namespace: company.team tasks: - id: fail type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.core.runner.Process script: | raise ValueError("An error occurred: This is a manually raised exception.") errors: - id: alert type: io.kestra.plugin.core.log.Log message: list of error logs — {{ errorLogs() }} ``` ## Before 0.21.0 Here is the output of the `fail` task before the change: ![Script task stderr output logged at WARNING level before 0.21.0](./stderr-log-level1.png) ## After 0.21.0 Here is the output of the `fail` task after the change: ![Script task stderr output logged at ERROR level after 0.21.0](./stderr-log-level2.png) --- # ME and APITOKEN Permissions in Kestra 0.21.0 RBAC URL: https://kestra.io/docs/migration-guide/v0.21.0/token-permissions > RBAC updates in Kestra 0.21.0 (Enterprise). New `ME` and `APITOKEN` permissions for managing user profiles and API tokens. Update custom roles accordingly. ## ME and APITOKEN user permissions `ME` and `APITOKEN` permissions added to RBAC. Additional permissions were introduced for creating Users and Groups, allowing better control over personal data management and API Token creation for programmatic access. ## After 0.21 The `ME` and `APITOKEN` permissions were added in version 0.21.0. After upgrading to 0.21.0 or later, Admins must update any custom roles with these permissions as needed. Any roles managed by Kestra that need these permissions have them automatically applied in the upgrade. `ME:READ` permission is added to all Kestra-managed roles. All users will be able to access profile information. Only the Admin role will be configured with: - `ME: [CREATE, READ, UPDATE]`: Change profile data. - `APITOKEN: [CREATE, READ, UPDATE, DELETE]`: Control user API access. `ME:DELETE` is currently not supported. A user cannot delete their own account.
In the [Kestra API](../../../api-reference/01.enterprise/index.mdx), the Users API `/api/v1/main/users/password` changed to `/api/v1/main/me/password`. --- # Kestra 0.22.0 Migration Guide – Changes & Actions URL: https://kestra.io/docs/migration-guide/v0.22.0 > Migration guide for Kestra 0.22.0. Covers Azure Log Exporter split, default tenant deprecation, account lockout policy, and Service Account API changes. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.22.0 Deprecated features and migration guides for 0.22.0 and onwards. --- # Azure Log Exporter Split in Kestra 0.22.0 URL: https://kestra.io/docs/migration-guide/v0.22.0/azure-log-exporter > Azure Log Exporter split in Kestra 0.22.0 (Enterprise). Update configurations to use specific plugins for Azure Monitor or Azure Blob Storage. ## Azure Log Exporter Azure Log Exporter plugin is now split into two plugins. The log exporter plugin for Azure `io.kestra.plugin.ee.azure.LogExporter`, introduced in Kestra 0.21, was split into two plugins: 1. `io.kestra.plugin.ee.azure.monitor.LogExporter` for exporting logs to Azure Monitor. 2. `io.kestra.plugin.ee.azure.storage.LogExporter` for exporting logs to Azure Blob Storage. This reflects that you can now export your logs to Azure using either Azure Monitor or Azure Blob Storage. ## Before Kestra 0.22 Before Kestra 0.22, the `io.kestra.plugin.ee.azure.LogExporter` plugin would export logs to Azure Monitor.
```yaml id: log_shipper namespace: company.team tasks: - id: log_export type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO batchSize: 1000 lookbackPeriod: P1D logExporters: - id: AzureLogExporter type: io.kestra.plugin.ee.azure.LogExporter endpoint: https://endpoint-host.ingest.monitor.azure.com tenantId: tenant_id clientId: client_id clientSecret: client_secret ruleId: dcr-69f0b123041d4d6e9f2bf72aad0b62cf streamName: Custom-JSONLogs ``` ## After Kestra 0.22 In Kestra 0.22, you can now choose to export logs to Azure Monitor or Azure Blob Storage, or both. ```yaml id: log_shipper namespace: company.team tasks: - id: log_export type: io.kestra.plugin.ee.core.log.LogShipper logLevelFilter: INFO batchSize: 1000 lookbackPeriod: P1D logExporters: - id: AzureMonitorLogExporter type: io.kestra.plugin.ee.azure.monitor.LogExporter endpoint: https://endpoint-host.ingest.monitor.azure.com tenantId: tenant_id clientId: client_id clientSecret: client_secret ruleId: dcr-69f0b123041d4d6e9f2bf72aad0b62cf streamName: Custom-JSONLogs - id: AzureBlobLogExporter type: io.kestra.plugin.ee.azure.storage.LogExporter endpoint: https://myblob.blob.core.windows.net/ tenantId: tenant_id clientId: client_id clientSecret: client_secret containerName: logs format: JSON logFilePrefix: kestra-log-file maxLinesPerFile: 1000000 chunk: 1000 ``` --- # Default Tenant Deprecated: Multi-Tenancy Now Default URL: https://kestra.io/docs/migration-guide/v0.22.0/default-tenant > Deprecation of the default tenant functionality and enablement of multi-tenancy by default in Enterprise Edition. ## Default Tenant & Multi-Tenancy Default tenant is deprecated and multi-tenancy is enabled by default. [Multi-tenancy](../../../07.enterprise/02.governance/tenants/index.md) was introduced in Kestra 0.13.
For backward compatibility with older versions (≤0.12), you could use the concept of a [default tenant](../../../07.enterprise/02.governance/tenants/index.md), which imitated the multitenancy feature with the so-called `null`-tenant or `default` tenant. In Kestra 0.22, the default tenant functionality is deprecated and will be removed in the future. Sufficient migration time will be provided, along with a detailed migration guide for customers still using the default tenant. :::alert{type="warning"} Prior to Kestra 0.22, `tenants.enabled` was by default set to `false` and `defaultTenant` was set to `true`. Starting from Kestra 0.22, `tenants.enabled` is set to `true` and `defaultTenant` is set to `false` by default. ::: ## How to keep the default tenant for now To continue using the default tenant, set the `defaultTenant` configuration flag to `true` and `tenants.enabled` to `false` in your `kestra.yml` configuration file: ```yaml kestra: ee: tenants: enabled: false defaultTenant: true ``` In Kestra 0.22 and higher, `defaultTenant` is **no longer enabled by default**, so set that configuration option to `true` to keep using the default tenant. ### Before 0.22.0 Here is the default multi-tenancy configuration before 0.22.0: ```yaml kestra: ee: tenants: enabled: false defaultTenant: true ``` ### After 0.22.0 Here is the default multi-tenancy configuration after 0.22.0: ```yaml kestra: ee: tenants: enabled: true defaultTenant: false ``` --- # Service Account API Changes in Kestra 0.22.0 EE URL: https://kestra.io/docs/migration-guide/v0.22.0/ee-api-changes > Changes to the Service Account API in Enterprise Edition to support non-unique names across tenants. ## Enterprise Edition API changes Service Account name uniqueness is no longer enforced. Before Kestra 0.22, the Service Account name had to be globally unique within the instance. 
As a result, attempting to create a Service Account `cicd` in a `dev` tenant would raise an error `"Username already exists"` if your `prod` tenant also had a Service Account with the name `cicd`. To support multiple service accounts with the same name, the `username` property was renamed to `name` in the JSON payload for the following REST API endpoint: `POST /api/v1{/tenant}/users/service-accounts{/id}`. ## Before Kestra 0.22 Here is an example of a JSON payload sent to the REST API endpoint `POST /api/v1{/tenant}/users/service-accounts{/id}` in Kestra 0.21: ```json { "username": "cicd", "password": "Admin2025" } ``` ## After Kestra 0.22 Here is an example of a JSON payload sent to the REST API endpoint `POST /api/v1{/tenant}/users/service-accounts{/id}` in Kestra 0.22: ```json { "name": "cicd", "password": "Admin2025" } ``` --- # Account Lockout After Failed Login Attempts (0.22.0) URL: https://kestra.io/docs/migration-guide/v0.22.0/failed-attempts-lockout > Introduction of account lockout after multiple failed login attempts in Enterprise Edition for enhanced security. ## Failed Attempts Lockout Too many failed login attempts now lock the user's account To improve the security of your Enterprise Edition instance, Kestra now automatically locks user accounts after a `threshold` number of failed login attempts made within `monitoring-window`. The number of failed attempts, the monitoring window used to track them, and how long the account remains locked are all configurable. ```yaml kestra: security: login: failed-attempts: threshold: 10 # the number of failed attempts before lockout monitoring-window: PT5M # period to count failed attempts lock-duration: PT30M # period the account remains locked ``` :::alert{type="info"} This change only applies to users who use LDAP or basic authentication, not SSO users. ::: A Super Admin can unlock the user manually by resetting their password from the user's detail page.
The user can also unlock their account by resetting their password using the "Forgot password" link on the login page and following the instructions in the email. --- # Helm Chart Health Check Path Changes in 0.22.0 URL: https://kestra.io/docs/migration-guide/v0.22.0/healthcheck-paths > Update to the health check paths in Kestra's Helm Chart for improved Kubernetes probe reliability. ## Helm Chart Health Check Paths Change in the health check paths for Kestra's Helm Chart Before [this Helm Charts PR](https://github.com/kestra-io/helm-charts/pull/62/files), both probes pointed to `/health`. This caused Kubernetes to restart the pod when an external component was unavailable. To resolve this, the values file was updated to configure liveness and readiness probes to use the health paths recommended by Micronaut: - Liveness probe now points to `/health/liveness` - Readiness probe now points to `/health/readiness`. ## Before Kestra 0.22 - Liveness probe: `/health` - Readiness probe: `/health` ## After Kestra 0.22 - Liveness probe: `/health/liveness` - Readiness probe: `/health/readiness` --- # KV Function Now Errors on Missing Keys in 0.22.0 URL: https://kestra.io/docs/migration-guide/v0.22.0/kv-error-on-missing > Change in default behavior of the kv() function to throw an error when a key is missing in Kestra 0.22.0. ## KV function errors on missing key New default behavior of the KV function Before Kestra 0.22, the `kv()` function had the property `errorOnMissing` set to `false` by default. It was changed to `true` to align with the rest of the system; for example, the `secret()` function throws an error when the secret is missing. If you want to keep the previous behavior of returning `null` without an error when attempting to fetch non-existing KV pairs, use the syntax `"{{kv('NON_EXISTING_KV_PAIR', errorOnMissing=false)}}"`.
## Before Kestra 0.22 ```yaml id: myflow namespace: company.team tasks: - id: this_will_return_null type: io.kestra.plugin.core.log.Log message: Hello {{kv('NON_EXISTING_KV_PAIR')}} # Hello ``` ## After Kestra 0.22 This flow will fail because the `kv()` function will throw an error when the key is missing: ```yaml id: myflow namespace: company.team tasks: - id: this_will_fail type: io.kestra.plugin.core.log.Log message: Hello {{kv('NON_EXISTING_KV_PAIR')}} # Error ``` To keep the previous behavior, use the `errorOnMissing=false` syntax: ```yaml id: myflow namespace: company.team tasks: - id: this_will_return_null type: io.kestra.plugin.core.log.Log message: Hello {{kv('NON_EXISTING_KV_PAIR', errorOnMissing=false)}} # Hello ``` --- # Plugin 'version' Property Renamed in Kestra 0.22.0 URL: https://kestra.io/docs/migration-guide/v0.22.0/renamed-version-property > Renaming of the version property in several plugins to reserve the keyword for Kestra's internal plugin management. ## Version property renamed Renamed version property in many plugins With the introduction of plugin versioning, Kestra reserves the `version` keyword for internal use to specify the plugin version. As a result, the `version` property was renamed for a few plugins that already used it, including the following: - `io.kestra.plugin.elasticsearch.Get` → renamed as `docVersion` - `io.kestra.plugin.opensearch.Get` → renamed as `docVersion` - `io.kestra.plugin.mqtt.RealtimeTrigger` → renamed as `mqttVersion` - `io.kestra.plugin.mqtt.Trigger` → renamed as `mqttVersion` - `io.kestra.plugin.serdes.parquet.IonToParquet` → renamed as `parquetVersion` :::alert{type="warning"} Custom plugins need an equivalent rename for any property named `version`, which is now reserved for plugin management. Any custom plugin that uses a `version` property will not compile until you rename it.
::: ## Ensure Kestra can access the `_plugins/` directory Upgrading to Kestra 0.22.0 requires a change in the way plugins are stored and managed. The [plugin versioning system](../../../07.enterprise/05.instance/versioned-plugins/index.md) requires a global internal storage configuration, because plugins are now stored in a global internal storage location. This is true even if you are using a dedicated internal storage backend for each tenant. Under the hood, plugins are now stored in the [Internal Storage](../../../08.architecture/data-components/index.md#internal-storage) under the path `_plugins/repository/`. Therefore, the service account or credentials you use in your [Runtime and Storage configuration](../../../configuration/02.runtime-and-storage/index.md) **must have permissions to access the `_plugins` directory in the global instance internal storage (e.g. your S3 bucket)**. If you are using a service account or an IAM role, ensure it has access to these resources. Alternatively you can temporarily disable this feature using the following configuration: ```yaml kestra: plugins: management: enabled: false ``` --- # Kestra 0.23.0 Migration Guide – Mandatory Multitenancy URL: https://kestra.io/docs/migration-guide/v0.23.0 > Overview of changes and migration guides for Kestra version 0.23.0, including mandatory multitenancy. import ChildCard from "~/components/docs/ChildCard.astro" ## 0.23.0 Deprecated features and migration guides for 0.23.0 and onwards. :::alert{type="warning"} Export all your flows before starting the upgrade process to have a backup in case the upgrade doesn't go as planned. Go to your Profile → Settings → scroll down and click "Export All Flows". This will download a ZIP file containing all your flows. Tenant is now required; `defaultTenant` (null tenant) is no longer supported. Kestra now always requires a tenant context in both OSS and Enterprise editions. 
A migration is required to upgrade to 0.23:

- [Open Source](./tenant-migration-oss/index.md)
- [Enterprise](./tenant-migration-ee/index.md)
:::

---

# BOOLEAN Input Deprecated: Switch to BOOL in 0.23.0

URL: https://kestra.io/docs/migration-guide/v0.23.0/boolean-input-change

> Deprecation of the BOOLEAN-type input in favor of the new BOOL-type toggle input.

## The BOOLEAN-type input is deprecated in favor of BOOL

The Java-style `BOOLEAN` input, which allowed three states (true, false, or not defined), caused confusion and bugs, so it is now deprecated in favor of the `BOOL` input, which is a toggle (can only be true or false). Read more in the GitHub issue: [#8225](https://github.com/kestra-io/kestra/issues/8225).

The following example inputs demonstrate the difference:

```yaml
inputs:
  - id: boolean
    type: BOOLEAN # Deprecated as of version 0.23.0
    defaults: true
    displayName: "A boolean input"

  - id: bool
    type: BOOL # Included in version 0.23.0 and later
    defaults: true
    displayName: "A boolean input displayed as a toggle."
```

---

# Env Variable Prefix Changed: KESTRA_ to ENV_ (0.23.0)

URL: https://kestra.io/docs/migration-guide/v0.23.0/default-env-prefix

> Information on the change of default environment variable prefix from KESTRA_ to ENV_ for improved security.

## Default environment variable prefix changed from KESTRA_ to ENV_ for security

Kestra [previously defaulted](https://github.com/kestra-io/kestra-ee/issues/3131) to autoloading environment variables with the prefix `KESTRA_` into flows. This posed a security risk, as Micronaut allows overriding configuration using environment variables and translates non-alphanumeric characters (such as `:` in `kestra:storage:type`) into underscores, producing env vars like `KESTRA_STORAGE_TYPE`. If a sensitive value (e.g., a storage password) was provided via an environment variable starting with `KESTRA_`, it would be available in all flows, increasing the risk of secret exposure.
## 0.23 change - The default prefix for autoloaded environment variables is now `ENV_` (instead of `KESTRA_`). - Any variable you want to expose in flows must now start with `ENV_` by default (unless you configure a custom prefix). - The `KESTRA_CONFIGURATION` env var still uses `KESTRA_` as the configuration key (**unchanged**). ## How to use In your Docker Compose or environment configuration, set environment variables using the `ENV_` prefix rather than the `KESTRA_` prefix: ```yaml kestra: image: kestra/kestra:latest environment: ENV_MY_VARIABLE: extra variable value ENV_NEW_VARIABLE: new variable value KESTRA_CONFIGURATION: kestra: variables: env-vars-prefix: "ENV_" # this is now the default as of v0.23 ``` You can reference these in your flows as `{{ envs.my_variable }}` and `{{ envs.new_variable }}`. To use a custom prefix e.g., `PROD_`: ```yaml kestra: image: kestra/kestra:latest environment: PROD_MY_VARIABLE: extra variable value KESTRA_CONFIGURATION: kestra: variables: env-vars-prefix: "PROD_" ``` ## Required Action Review and update any existing environment variables that used the `KESTRA_` prefix for flow variables and use `ENV_` or your custom prefix instead. No changes are needed for configuration properties that use `KESTRA_CONFIGURATION`. For more details, check the [Runtime and Storage configuration](../../../configuration/02.runtime-and-storage/index.md). --- # Docker pullPolicy Default Changed to IF_NOT_PRESENT URL: https://kestra.io/docs/migration-guide/v0.23.0/default-pull-policy > Details on the change of default pullPolicy for Docker-based tasks to IF_NOT_PRESENT. ## The default pullPolicy for Docker-based tasks changed Due to the new [Docker Hub pull usage and limits](https://docs.docker.com/docker-hub/usage/pulls/), all the Docker-based Kestra tasks have their default `pullPolicy` updated from `ALWAYS` to `IF_NOT_PRESENT` to avoid any pull limit issue. 
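If a task must always pull a moving tag such as `latest`, you can pin the old behavior per task. Below is a minimal sketch, assuming the `pullPolicy` property of the Docker task runner (flow and task ids are illustrative):

```yaml
id: always_pull
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu:latest
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      pullPolicy: ALWAYS # explicitly restore the pre-0.23 default for this task
    commands:
      - echo "Hello World!"
```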
Read more about the change in the [GitHub issue](https://github.com/kestra-io/plugin-scripts/issues/230).

Previously, the following flow would have the `pullPolicy` default to `ALWAYS`:

```yaml
id: docker_script_runner
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      cpu:
        cpus: 1
    commands:
      - echo "Hello World!"
```

Now, the plugin defaults to `IF_NOT_PRESENT`. This also applies to all other Docker-based tasks from the `plugin-docker` group, such as `io.kestra.plugin.docker.Run`.

![Default Docker Runner Pull Policy](./pullPolicy-default.png)

---

# Flow Trigger Now Reacts to PAUSED State by Default

URL: https://kestra.io/docs/migration-guide/v0.23.0/flow-trigger-paused-state

> Details on the Flow trigger now reacting to the PAUSED state by default in Kestra 0.23.0.

## Flow trigger now also reacts to PAUSED state by default

In addition to the terminated states, the Flow trigger now also reacts to the `PAUSED` state. This makes it easier to respond to a paused workflow, for example, by sending alerts to the right stakeholders so they can manually approve and resume paused workflow executions.

Using the following flow with a Flow trigger as an example:

```yaml
id: react_to_states
namespace: company

tasks:
  - id: hello
    ...

triggers:
  - id: mytrigger
    type: io.kestra.plugin.core.trigger.Flow
    preconditions:
      id: flow
      flows:
        - namespace: company
```

## Before

This flow would be triggered for each terminated execution in the `company` namespace.

## After

From 0.23 on, this flow will be triggered for each **terminated** and `PAUSED` execution in the `company` namespace.

---

# Internal Storage Path Migration for S3 and GCS (0.23.0)

URL: https://kestra.io/docs/migration-guide/v0.23.0/internal-storage-migration

> Migration guide for S3 and GCS users to handle the removal of leading root slashes in internal storage paths.
## Internal Storage Migration Guide for S3 and GCS Users

For users of S3 or GCS as internal storage, Kestra now removes the leading root slash in all storage paths. Storage keys now have a single slash separator, not a double slash. This helps display internal storage objects [in various cloud storage interfaces](https://github.com/kestra-io/kestra/issues/3933).

Below is an example of how the storage path looks before and after the change (note the double slash before the namespace `company`):

- Before 0.23: `gs://ee-default-22//company/team/_files/test.txt`
- After 0.23: `gs://ee-default-22/company/team/_files/test.txt`

Run the following script for your provider. Otherwise, Kestra won’t be able to find your internal storage files.

:::alert{type="warning"}
Before taking any action to fix the double slash issue, Open-source users **MUST** follow the steps in the [OSS Tenant Migration Guide](../tenant-migration-oss/index.md) and Enterprise users **MUST** follow the steps in the [EE Tenant Migration Guide](../tenant-migration-ee/index.md).
:::

## GCS storage root slash migration script

```bash
gcloud storage cp -r "gs://mybucket//*" gs://mybucket/
```

After running the script, the old files can be removed using:

```bash
gsutil rm -r "gs://mybucket//**"
```

## S3 root slash migration script

```bash
#!/bin/bash
BUCKET="mybucket"

aws s3 ls s3://$BUCKET --recursive | awk '{print $4}' | grep '^/' | grep -v '/$' | while read -r key; do
  # Strip the leading slash
  clean_key="${key#/}"

  echo "Copying s3://$BUCKET/$key → s3://$BUCKET/$clean_key"

  # Copy to new key without leading slash
  aws s3 cp "s3://$BUCKET/$key" "s3://$BUCKET/$clean_key"

  # Optional: Delete original after copy succeeds
  # aws s3 rm "s3://$BUCKET/$key"
done

echo "Migration finished!"
```

## Migrating Files Using Graphical User Interfaces (GUI)

For users who prefer not to use command-line scripts, migration can be accomplished with graphical tools.
Most S3-compatible providers (including AWS S3 and Cloudflare R2) allow you to move or copy files directly in their web interfaces:

1. **Log in** to the AWS S3 or Cloudflare R2 management console.
2. **Navigate** to your bucket.
3. Use the console’s object browser to **locate files** with leading double slashes in the key name (they may appear as objects or folders starting with `/`).
4. Use the **copy or move action** to duplicate the object to the correct key (without the leading slash), then **delete the original** if needed.

*Note: Some consoles may hide leading slashes or display objects as folders. Double-check object keys if you're unsure.*

---

# JDBC autocommit Property Removed from Query Tasks

URL: https://kestra.io/docs/migration-guide/v0.23.0/jdbc-autocommit

> Announcement of the removal of the autocommit property from JDBC Query and Queries tasks.

## The autocommit property removed from JDBC Query and Queries tasks

The `autocommit` property [has been removed](https://github.com/kestra-io/plugin-jdbc/issues/550) from both the Query and Queries tasks in the [JDBC plugin](https://github.com/kestra-io/plugin-jdbc).

## **Reason for change**

The `Query` task executes a single statement and does not support multi-step transactions; autocommit is not relevant. The `Queries` task processes all contained queries within a single transaction by default; autocommit has no effect.

## **Impact**

The `autocommit` property is no longer configurable in either task. You must remove any usage of the `autocommit` property in your existing flows, as using it will raise an error.

---

# LoopUntil checkFrequency Default Values Changed

URL: https://kestra.io/docs/migration-guide/v0.23.0/loop-until-defaults

> Information on the changed default values for the LoopUntil task's checkFrequency property.
## LoopUntil task changed default values for checkFrequency

The default behavior of the `LoopUntil` core task [has changed](https://github.com/kestra-io/kestra/issues/9152#issuecomment-2929847060) as follows:

```json
{
  "maxIterations": null,
  "maxDuration": null,
  "interval": "PT1M"
}
```

## Before

Previously, `LoopUntil` capped executions at 100 iterations and 1 hour duration (`maxIterations: 100`, `maxDuration: PT1H`, `interval: PT1S`). This was intended to prevent runaway loops from impacting instance stability, especially with frequent (1s) intervals.

## After

**What’s changed**:

- The default configuration no longer enforces arbitrary limits on iterations and duration.
- The new default uses a 1-minute interval (`PT1M`), which greatly reduces the risk of instance performance issues, even with no iteration or duration limits.

**Backwards compatibility**: If you want to retain the previous default limits to prevent potentially long-running loops, add the following to your global plugin defaults:

```yaml
pluginDefaults:
  - forced: true
    type: io.kestra.plugin.core.flow.LoopUntil
    values:
      checkFrequency:
        maxIterations: 100
        maxDuration: PT1H
        interval: PT1S
```

Adding that plugin default restores the earlier behavior and prevents any breaking change for your existing flows.

---

# Python Script Tasks Now Use python:3.13-slim Image

URL: https://kestra.io/docs/migration-guide/v0.23.0/python-script-image

> Information on the change of default Docker image for Python script tasks to the official python:3.13-slim.

## Python script tasks now use official python:3.13-slim image

Kestra previously used a custom `ghcr.io/kestra-io/kestrapy:latest` image containing `kestra` and `amazon-ion` pip packages. The tasks now use the official `python:3.13-slim` image by default.
To maintain the previous behavior, add those packages using the `dependencies` property and they will be installed at runtime (and cached): ```yaml id: python_demo namespace: company.team tasks: - id: python type: io.kestra.plugin.scripts.python.Script dependencies: - kestra - amazon-ion - requests script: | from kestra import Kestra import requests response = requests.get('https://kestra.io') print(response.status_code) Kestra.outputs({'status': response.status_code, 'text': response.text}) ``` --- # Script Tasks: WARNING State Removed for ERROR Logs URL: https://kestra.io/docs/migration-guide/v0.23.0/script-warnings > Information on the removal of the WARNING state for script tasks when ERROR logs are present. ## No more WARNING state on script tasks when ERROR logs are present The task-run state is no longer set to `Warning` if the script task emits ERROR or WARNING logs, and the `warningOnStdErr` [property is deprecated](https://github.com/kestra-io/plugin-scripts/issues/233). Script tasks now always report a **SUCCESS** state if the Docker container exits with code 0, and a **FAILED** state for any non-zero exit code — ERROR or WARNING logs no longer influence the task run state. 
Example flow: ```yaml id: loguru namespace: company.team inputs: - id: nr_logs type: INT defaults: 100 tasks: - id: reproducer type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest script: | from loguru import logger from faker import Faker import time import sys logger.remove() logger.add(sys.stdout, level="DEBUG") logger.add(sys.stderr, level="DEBUG") def generate_logs(fake, num_logs): logger.error("Starting to generate log messages") for _ in range(num_logs): log_message = fake.sentence() logger.warning(log_message) time.sleep(0.01) logger.warning("Finished generating log messages") if __name__ == "__main__": faker_ = Faker() generate_logs(faker_, int("{{ inputs.nr_logs }}")) ``` Before this change, the flow would end in WARNING; now it ends in SUCCESS. --- # SQL Server Backend Removed in Kestra 0.23.0 URL: https://kestra.io/docs/migration-guide/v0.23.0/sql-server-backend > Announcement regarding the removal of support for the SQL Server backend in Kestra 0.23.0. ## SQL Server backend is no longer supported Support for the SQL Server backend has been dropped. SQL Server was an available backend option, but due to low demand, it is no longer maintained. --- # Superadmin Property Migration: Manual User Refresh URL: https://kestra.io/docs/migration-guide/v0.23.0/superadmin-refresh > Required action for Enterprise Edition users to refresh user data for the new Superadmin property handling. ## Manual user refresh to migrate Superadmin property The handling of `Superadmin` users in Kestra Cloud and Enterprise Edition has changed. Previously, the `Superadmin` status was determined by the user type (`SUPER_ADMIN`). In version 0.23, this is now managed through a dedicated property (`isSuperAdmin`). 
This change enables new use cases such as assigning a `Superadmin` permission to a Service Account as well as sending an invite with `Superadmin` permissions, but it also impacts user role detection for existing users. **Required action**: All EE customers must run the following CLI command after upgrading to 0.23: ```shell kestra auths users refresh ``` This command migrates and refreshes user data to correctly assign `Superadmin` status under the new property-based model. ## Impact - Existing Enterprise and Cloud users with the type `SUPER_ADMIN` will not automatically have the new `isSuperAdmin` property set unless you run the migration command after upgrading to 0.23. - This may result in users unexpectedly losing `Superadmin` privileges. If you see this happening, run `kestra auths users refresh` from the CLI to resolve the missing access. --- # EE Migration: defaultTenant to Mandatory Multitenancy URL: https://kestra.io/docs/migration-guide/v0.23.0/tenant-migration-ee > Comprehensive migration guide for Enterprise Edition users to transition from defaultTenant to mandatory multitenancy. ## Enterprise Migration Guide from defaultTenant to Multitenancy Kestra now requires a tenant context across both the OSS and EE versions. For Enterprise users, this affects default tenants and their associated configuration properties. ## Enterprise Edition (EE) changes ### Tenant System Now Always Enabled The configuration properties `kestra.ee.tenants.enabled` and `kestra.ee.tenants.defaultTenant` have been removed, as tenants are now mandatory and must be manually created. ### New Configuration Property With this change, there is a new configuration property: `kestra.ee.tenants.fallbackTenant: tenant-id`. This property is used to route non-tenant-specific API calls to a fallback tenant. This does not rewrite the route but internally assigns the tenant. ### Compatibility Layer In OSS, URIs are transformed to include the main `tenantId` directly in the API routes. 
In EE, the fallback tenant is injected into the request header without rerouting the endpoints — `/api/v1/...` is not mapped to `/api/v1/fallbackTenant/...`. This manual tenant header injection will be removed in a future version.

### Migration Script

:::alert{type="warning"}
Before running the following migration scripts, you must completely shut down all server components of your Kestra application. Running these scripts while the application is active may result in data corruption or migration failures.
:::

The following command will migrate the `defaultTenant` to a newly created tenant. Thus, you need to provide both the `--tenant-id` and the `--tenant-name` (both are required). Use `--dry-run` to simulate the migration.

Before running the migrate script, do a complete database dump to preserve a restore point in case of any issues during the process.

```shell
kestra migrate default-tenant \
  --tenant-id=tenant \
  --tenant-name="Tenant Name" \
  [--dry-run]
```

:::alert{type="warning"}
The migration command is also required for customers that have the following configuration:

```yaml
kestra:
  tenants:
    defaultTenant: true
    enabled: false
```
:::

:::alert{type="info"}
If you are using Helm for deployment, you can use an init container to run the migration:

```yaml
initContainers:
  - name: kestra-migrate
    image: registry.kestra.io/docker/kestra-ee:v0.23.0
    command: ['sh', '-c', 'exec /app/kestra migrate default-tenant --tenant-id migrated --tenant-name migrated']
```

You can remove it after a successful run (it only needs to be executed once).
:::

Migrating some tables can take a long time; you can use `--excludes=table1,table2` to exclude some tables from the migration and update them manually.

### Kafka Queue Handling

If your queue is Kafka, queues will be recreated after migration. No manual action is needed — Kestra recreates the queue automatically. To start fresh, it is strongly recommended to use a new Kafka cluster for your queues.
### Elasticsearch repository handling

If your repository is Elasticsearch, your instance likely stores a large amount of execution data. Exclude executions, logs, and metrics from the migration and update them manually.

```shell
kestra migrate default-tenant \
  --tenant-id=tenant \
  --tenant-name="Tenant Name" \
  --excludes=executions,logs,metrics \
  [--dry-run]
```

After the migration, use the following Kibana-style query to migrate the excluded indices asynchronously. This can take a long time.

```
POST /kestra_executions/_update_by_query?wait_for_completion=false
{
  "query": {
    "bool": {
      "must_not": [
        {"exists": {"field": "tenantId"}}
      ]
    }
  },
  "script": {
    "source": "ctx._source.tenantId = 'tenant'",
    "lang": "painless"
  }
}
```

Adapt the index name to match your configuration; if you are using aliases, you can use an index wildcard, for example `kestra-executions-*`. Run the query for each excluded index. Each call returns a task identifier that you can use to check the status of the update with the following Kibana-style query:

```
GET /_tasks/
```

## Internal storage migration guide from `defaultTenant` to a tenant

This section explains how to migrate internal storage data to ensure the tenant ID is included and properly queried by the application. Migration can be done via the provided scripts or directly through the management console of your cloud storage provider.

### **Who needs to perform this migration?**

- Enterprise users who used to rely on the `defaultTenant` need to run this script as well.

:::alert{type="info"}
The provided commands use a list of existing tenant names (`main`, `tenant1`, `tenant2`). Update these in the scripts to match your actual tenant names.
:::

## Local storage

If you use both `defaultTenant` and specific tenants, you need to specify all existing tenant IDs in the list here `[[ "$bn" == "main" || "$bn" == "tenant1" || "$bn" == "tenant2" ]]`, and replace those names with your existing tenant IDs.
Also replace `main` in `base-path/main/` with your target tenant ID.

```bash
#!/bin/bash
for f in base-path/*; do
  bn=$(basename "$f")
  [[ "$bn" == "main" || "$bn" == "tenant1" || "$bn" == "tenant2" ]] || { rsync -a "$f/" base-path/main/"$bn"/ && rm -rf "$f"; }
done
```

If you used to rely on `defaultTenant` with no multitenancy enabled, use the following script:

```bash
#!/bin/bash
for f in base-path/*; do
  bn=$(basename "$f")
  [[ "$bn" == "main" ]] || { rsync -a "$f/" base-path/main/"$bn"/ && rm -rf "$f"; }
done
```

- Your `base-path` is configured under the configuration section `kestra.storage.local.base-path`.
- Replace `main` with the appropriate tenant ID.

## MinIO Storage

For MinIO, keep the `undefined` option due to the different handling of storage paths.

### Enterprise Users

```bash
#!/bin/bash
# List of known tenant folders. If you use defaultTenant with no multitenancy enabled,
# you only need one listed tenant ID (i.e., main) and undefined.
for f in $(mc ls myminio/mybucket | awk '{print $NF}' | sed 's|/$||'); do
  if [[ "$f" != "main" && "$f" != "tenant" && "$f" != "undefined" ]]; then
    echo "Moving $f → tenantId/"
    mc mv --recursive "myminio/mybucket/$f" "myminio/mybucket/tenantId/"
  fi
done
```

- Replace `mybucket` with the bucket name from `kestra.storage.minio.bucket`.

## Azure Blob Storage

```bash
#!/bin/bash

## Set your Azure Storage account and bucket (container) name
ACCOUNT_NAME="myaccount"
BUCKET_NAME="mybucket"

## Configurable destination tenant (default: 'main')
DEST_TENANT="${1:-main}"

## List of tenant folders to skip (don't move).
## If you use defaultTenant with no multitenancy enabled, you only need one listed tenant ID (i.e., main).
TENANTS=("main" "tenant1" "tenant2")
## Get all blob names blob_names=$(az storage blob list --account-name "$ACCOUNT_NAME" --container-name "$BUCKET_NAME" --query "[].name" --output tsv) ## Separate top-level files and folders top_files=() top_folders=() for name in $blob_names; do if [[ "$name" == */* ]]; then top_folder=$(echo "$name" | cut -d'/' -f1) top_folders+=("$top_folder") else top_files+=("$name") fi done ## Deduplicate folder list unique_folders=($(printf "%s\n" "${top_folders[@]}" | sort | uniq)) ## Remove from top_files any that match folder names clean_files=() for file in "${top_files[@]}"; do skip=false for folder in "${unique_folders[@]}"; do if [[ "$file" == "$folder" ]]; then skip=true break fi done if [ "$skip" = false ]; then clean_files+=("$file") fi done ## Process top-level files for file in "${clean_files[@]}"; do skip=false for tenant in "${TENANTS[@]}"; do if [[ "$file" == "$tenant" ]]; then skip=true break fi done if [ "$skip" = false ]; then echo "Copying single file $file -> $DEST_TENANT/$file" az storage blob copy start \ --account-name "$ACCOUNT_NAME" \ --destination-container "$BUCKET_NAME" \ --destination-blob "$DEST_TENANT/$file" \ --source-uri "$(az storage blob url --account-name "$ACCOUNT_NAME" --container-name "$BUCKET_NAME" --name "$file" -o tsv)" fi done ## Process top-level folders (batch copy) for folder in "${unique_folders[@]}"; do skip=false for tenant in "${TENANTS[@]}"; do if [[ "$folder" == "$tenant" ]]; then skip=true break fi done if [ "$skip" = false ]; then echo "Batch copying $folder/* -> $DEST_TENANT/" az storage blob copy start-batch \ --account-name "$ACCOUNT_NAME" \ --destination-container "$BUCKET_NAME" \ --destination-path "$DEST_TENANT" \ --source-container "$BUCKET_NAME" \ --pattern "$folder/*" fi done echo "Migration finished!" ``` - `BUCKET_NAME` is configured under `kestra.storage.azure.container`. 
## S3 Storage ```bash #!/bin/bash BUCKET="mybucket" DEST_TENANT="${1:-main}" TENANTS=("main" "tenant1" "tenant2") # List of known tenant folders. If you use defaultTenant with no multitenancy enabled, you only need one listed tenant ID (i.e., main). echo "Starting S3 tenant migration → destination tenant: $DEST_TENANT" ## List all keys, no leading slash aws s3 ls s3://$BUCKET --recursive | awk '{print $4}' | sed 's|^/||' | grep -v '^$' | while read -r key; do # Check top-level folder or file top_level=$(echo "$key" | cut -d'/' -f1) # Skip if key is already under an existing tenant skip=false for tenant in "${TENANTS[@]}"; do if [[ "$top_level" == "$tenant" ]]; then skip=true break fi done if [ "$skip" = false ]; then new_key="$DEST_TENANT/$key" echo "Copying s3://$BUCKET/$key → s3://$BUCKET/$new_key" # Copy object to tenant folder aws s3 cp "s3://$BUCKET/$key" "s3://$BUCKET/$new_key" fi done echo "Tenant migration finished!" ``` - `BUCKET` is configured under `kestra.storage.s3.bucket`. ## GCS storage ```bash #!/bin/bash BUCKET="gs://bucket" DEST_TENANT="${1:-main}" # Default tenant is 'main' if not specified TENANTS=("main" "tenant1" "tenant2") # List of known tenant folders. If you use defaultTenant with no multitenancy enabled, you only need one listed tenant ID (i.e., main). 
echo "Starting GCS tenant migration on $BUCKET → destination tenant: $DEST_TENANT" ## Get all object keys (strip bucket prefix) all_keys=$(gsutil ls "$BUCKET/**" | sed "s|$BUCKET/||") ## Collect top-level folders and files declare -A top_folders declare -a top_files for key in $all_keys; do # Skip folder markers (end with /) if [[ "$key" == */ ]]; then top_folder=$(echo "$key" | cut -d'/' -f1) top_folders["$top_folder"]=1 else top_level=$(echo "$key" | cut -d'/' -f1) if [[ "$key" != */* ]]; then # Root-level file (no folder) top_files+=("$key") else top_folders["$top_level"]=1 fi fi done ## Process top-level files for file in "${top_files[@]}"; do skip=false for tenant in "${TENANTS[@]}"; do if [[ "$file" == "$tenant" ]]; then skip=true break fi done if [ "$skip" = false ]; then new_key="$DEST_TENANT/$file" echo "Copying file $BUCKET/$file → $BUCKET/$new_key" gsutil cp "$BUCKET/$file" "$BUCKET/$new_key" # Optional: gsutil rm "$BUCKET/$file" fi done ## Process top-level folders for folder in "${!top_folders[@]}"; do skip=false for tenant in "${TENANTS[@]}"; do if [[ "$folder" == "$tenant" ]]; then skip=true break fi done if [ "$skip" = false ]; then echo "Batch copying folder $BUCKET/$folder/** → $BUCKET/$DEST_TENANT/" gsutil cp -r "$BUCKET/$folder" "$BUCKET/$DEST_TENANT/" # Optional: gsutil rm -r "$BUCKET/$folder" fi done echo "Tenant migration finished!" ``` - `BUCKET` is configured under `kestra.storage.gcs.bucket`. ### Migrating files using graphical user interfaces (GUI) For users who prefer not to use command-line scripts or are limited by their environment (e.g., Windows Server without shell access), migration can be accomplished with graphical tools. Below are guidelines for each storage type. --- #### Windows: Using File Explorer If your internal storage is a local directory (or a network drive), you can manually move or copy files to migrate them to the right tenant folder: 1. 
**Open File Explorer** and go to your base storage path (as configured in `kestra.storage.local.base-path`). 2. **Identify all folders and files** at the root level that are *not* already under a tenant folder (e.g., “main”, “tenant1”, “tenant2”). Example: If your structure is ```plaintext base-path/ main/ tenant1/ foo/ bar/ ``` You need to move `foo/` and `bar/` into `main/` or your target tenant directory. 3. **Select** the folders and files to migrate, right-click, and choose **Cut** (or **Copy** if you want to keep the original temporarily). 4. **Paste** them into the appropriate tenant folder (e.g., `main/`). The result should be: ```plaintext base-path/ main/ foo/ bar/ tenant1/ ``` 5. **Delete** the original folders/files from the root after confirming the migration. --- #### Local storage on macOS 1. **Open Finder** and navigate to your base storage directory. 2. **Locate folders and files** at the root level not already under your tenant folders. 3. **Drag and drop** each folder or file into the appropriate tenant folder (e.g., “main”). 4. **Verify** the migration by checking that only tenant folders exist at the root. 5. **Remove** the original files/folders if you used Copy. --- #### S3/MinIO/Cloudflare R2: Using management console for S3-compatible storage Most S3-compatible providers (including AWS S3, MinIO, and Cloudflare R2) allow you to move or copy files directly in their web interfaces: 1. **Open** the management console for your S3-compatible storage provider. 2. **Navigate** to your bucket. 3. **Locate all objects** at the root of the bucket (not under any tenant folder such as “main” or “tenant1”). 4. For each object or folder: * In S3 console, use the **Move** function to relocate it into the correct tenant folder (e.g., move `foo/bar.txt` → `main/foo/bar.txt`). * If your R2/MinIO/Ceph console does not support move/rename in-place, you may need to copy the object to the new location and then delete the original. 5. 
**Verify** that all data now resides under the tenant folder.

![s3 migration](./s3_migrate.png)

---

# OSS Migration: Introducing the defaultTenant Context

URL: https://kestra.io/docs/migration-guide/v0.23.0/tenant-migration-oss

> Migration guide for Open-Source Edition users to introduce the mandatory defaultTenant context in Kestra 0.23.0.

## Open-Source Migration Guide to introduce defaultTenant

Kestra now requires a tenant context in the OSS version.

## Open-Source Edition Changes

### Default Tenant

A fixed, non-configurable tenant named "main" is now always used in the open-source version. The tenant is not stored in the database and does not impact the user experience in the UI or building flows.

### Breaking change

All Open-source API URIs now include the tenantId:

**Before**: `/api/v1/...`

**0.23 & onwards**: `/api/v1/main/...`

Temporarily, there is a compatibility layer implemented to map `/api/v1/...` to `/api/v1/main/...` to ease the transition, but this compatibility layer will eventually be removed in a future Kestra version.

### Migration Script

:::alert{type="warning"}
Before running the following migration scripts, you must completely shut down the main Kestra application. Running these scripts while the application is active may result in data corruption or migration failures.
:::

To add the tenantId field across your existing database (flows, executions, logs, etc.), use:

```shell
kestra migrate default-tenant --dry-run
```

:::alert{type="info"}
Before running the migrate script, do a complete database dump to preserve a restore point in case of any issues during the process.

- Use `--dry-run` to preview changes without modifying data.
- Re-run without the flag to execute the migration.
:::

:::alert{type="info"}
If you are using Helm for deployment, you can use an init container to run the migration:

```yaml
initContainers:
  - name: kestra-migrate
    image: kestra/kestra:v0.23.0
    command: ['sh', '-c', 'exec /app/kestra migrate default-tenant --tenant-id migrated --tenant-name migrated']
```

You can remove the init container after a successful run; the migration only needs to be executed once.
:::

Migrating some tables can take a long time. Use `--excludes=table1,table2` to exclude specific tables from the migration and update them manually.

## Internal storage migration guide from `defaultTenant` to a tenant

This section explains how to migrate internal storage data so that the tenant ID is included in storage paths and properly queried by the application. Migration can be done via the provided scripts or directly through the management console of your cloud storage provider.

### **Who needs to perform this migration?**

- All OSS users need to run the migration script to ensure that the tenant ID is included in the internal storage paths.

## Local storage

The following script ensures that the `main` tenant ID is added to the internal storage path for your configuration. For OSS, this ID is immutable, so there is no need to adjust the name or path.

```bash
for f in base-path/*; do
  bn=$(basename "$f")
  [[ "$bn" == "main" ]] || { rsync -a "$f/" "base-path/main/$bn/" && rm -rf "$f"; }
done
```

- Your `base-path` is configured under the configuration section `kestra.storage.local.base-path`.
- For OSS users, the destination tenant ID is always `main`, so keep `base-path/main/` intact.

## MinIO Storage

For MinIO, keep the `undefined` folder in place due to MinIO's different handling of storage paths.
### OSS Users

```bash
for f in $(mc ls myminio/mybucket | awk '{print $NF}' | sed 's|/$||'); do
  if [[ "$f" != "main" && "$f" != "undefined" ]]; then
    echo "Moving $f → main/"
    mc mv --recursive "myminio/mybucket/$f" "myminio/mybucket/main/"
  fi
done
```

- Replace `mybucket` with the bucket name from `kestra.storage.minio.bucket`.

## Azure Blob Storage

```bash
#!/bin/bash

## Set your Azure Storage account and bucket (container) name
ACCOUNT_NAME="myaccount"
BUCKET_NAME="mybucket"

## Configurable destination tenant (default: 'main')
DEST_TENANT="main"

## List of tenant folders to skip (don't move)
TENANTS=("main")

## Get all blob names
blob_names=$(az storage blob list --account-name "$ACCOUNT_NAME" --container-name "$BUCKET_NAME" --query "[].name" --output tsv)

## Separate top-level files and folders
top_files=()
top_folders=()
for name in $blob_names; do
  if [[ "$name" == */* ]]; then
    top_folder=$(echo "$name" | cut -d'/' -f1)
    top_folders+=("$top_folder")
  else
    top_files+=("$name")
  fi
done

## Deduplicate folder list
unique_folders=($(printf "%s\n" "${top_folders[@]}" | sort | uniq))

## Remove from top_files any that match folder names
clean_files=()
for file in "${top_files[@]}"; do
  skip=false
  for folder in "${unique_folders[@]}"; do
    if [[ "$file" == "$folder" ]]; then
      skip=true
      break
    fi
  done
  if [ "$skip" = false ]; then
    clean_files+=("$file")
  fi
done

## Process top-level files
for file in "${clean_files[@]}"; do
  skip=false
  for tenant in "${TENANTS[@]}"; do
    if [[ "$file" == "$tenant" ]]; then
      skip=true
      break
    fi
  done
  if [ "$skip" = false ]; then
    echo "Copying single file $file -> $DEST_TENANT/$file"
    az storage blob copy start \
      --account-name "$ACCOUNT_NAME" \
      --destination-container "$BUCKET_NAME" \
      --destination-blob "$DEST_TENANT/$file" \
      --source-uri "$(az storage blob url --account-name "$ACCOUNT_NAME" --container-name "$BUCKET_NAME" --name "$file" -o tsv)"
  fi
done

## Process top-level folders (batch copy)
for folder in "${unique_folders[@]}"; do
  skip=false
  for tenant in "${TENANTS[@]}"; do
    if [[ "$folder" == "$tenant" ]]; then
      skip=true
      break
    fi
  done
  if [ "$skip" = false ]; then
    echo "Batch copying $folder/* -> $DEST_TENANT/"
    az storage blob copy start-batch \
      --account-name "$ACCOUNT_NAME" \
      --destination-container "$BUCKET_NAME" \
      --destination-path "$DEST_TENANT" \
      --source-container "$BUCKET_NAME" \
      --pattern "$folder/*"
  fi
done

echo "Migration finished!"
```

- `BUCKET_NAME` is configured under `kestra.storage.azure.container`.
- For OSS users, the destination tenant is always `main`.

## S3 Storage

```bash
#!/bin/bash

BUCKET="mybucket"
DEST_TENANT="main"
TENANTS=("main")

echo "Starting S3 tenant migration → destination tenant: $DEST_TENANT"

## List all keys, no leading slash
aws s3 ls s3://$BUCKET --recursive | awk '{print $4}' | sed 's|^/||' | grep -v '^$' | while read -r key; do
  # Check top-level folder or file
  top_level=$(echo "$key" | cut -d'/' -f1)

  # Skip if key is already under an existing tenant
  skip=false
  for tenant in "${TENANTS[@]}"; do
    if [[ "$top_level" == "$tenant" ]]; then
      skip=true
      break
    fi
  done

  if [ "$skip" = false ]; then
    new_key="$DEST_TENANT/$key"
    echo "Copying s3://$BUCKET/$key → s3://$BUCKET/$new_key"
    # Copy object to tenant folder
    aws s3 cp "s3://$BUCKET/$key" "s3://$BUCKET/$new_key"
  fi
done

echo "Tenant migration finished!"
```

- `BUCKET` is configured under `kestra.storage.s3.bucket`.
- For OSS users, the destination tenant is always `main`.
## GCS Storage

```bash
#!/bin/bash

BUCKET="gs://bucket"
DEST_TENANT="main"
TENANTS=("main")

echo "Starting GCS tenant migration on $BUCKET → destination tenant: $DEST_TENANT"

## Get all object keys (strip bucket prefix)
all_keys=$(gsutil ls "$BUCKET/**" | sed "s|$BUCKET/||")

## Collect top-level folders and files
declare -A top_folders
declare -a top_files
for key in $all_keys; do
  # Skip folder markers (end with /)
  if [[ "$key" == */ ]]; then
    top_folder=$(echo "$key" | cut -d'/' -f1)
    top_folders["$top_folder"]=1
  else
    top_level=$(echo "$key" | cut -d'/' -f1)
    if [[ "$key" != */* ]]; then
      # Root-level file (no folder)
      top_files+=("$key")
    else
      top_folders["$top_level"]=1
    fi
  fi
done

## Process top-level files
for file in "${top_files[@]}"; do
  skip=false
  for tenant in "${TENANTS[@]}"; do
    if [[ "$file" == "$tenant" ]]; then
      skip=true
      break
    fi
  done
  if [ "$skip" = false ]; then
    new_key="$DEST_TENANT/$file"
    echo "Copying file $BUCKET/$file → $BUCKET/$new_key"
    gsutil cp "$BUCKET/$file" "$BUCKET/$new_key"
    # Optional: gsutil rm "$BUCKET/$file"
  fi
done

## Process top-level folders
for folder in "${!top_folders[@]}"; do
  skip=false
  for tenant in "${TENANTS[@]}"; do
    if [[ "$folder" == "$tenant" ]]; then
      skip=true
      break
    fi
  done
  if [ "$skip" = false ]; then
    echo "Batch copying folder $BUCKET/$folder/** → $BUCKET/$DEST_TENANT/"
    gsutil cp -r "$BUCKET/$folder" "$BUCKET/$DEST_TENANT/"
    # Optional: gsutil rm -r "$BUCKET/$folder"
  fi
done

echo "Tenant migration finished!"
```

- `BUCKET` is configured under `kestra.storage.gcs.bucket`.
- For OSS users, the destination tenant is always `main`.

### Migrating Files Using Graphical User Interfaces (GUI)

For users who prefer not to use command-line scripts or are limited by their environment (e.g., Windows Server without shell access), migration can be accomplished with graphical tools. Below are guidelines for each storage type.
---

#### Windows: Using File Explorer

If your internal storage is a local directory (or a network drive), you can manually move or copy files to migrate them into the `main` tenant folder:

1. **Open File Explorer** and navigate to your storage root directory as configured in `kestra.storage.local.base-path`.
2. **Identify all folders and files** at the root level that are *not* already under the `main` folder. For example:

   ```plaintext
   base-path/
     main/
     foo/
     bar/
   ```

   You need to move `foo/` and `bar/` into `main/`.
3. **Select** all such folders/files, right-click and **Cut** (or **Copy**).
4. **Paste** into the `main` folder, e.g., `base-path/main/`.
5. **Delete** the originals from the root after confirming successful migration.

---

#### Local Storage on macOS

1. **Open Finder** and go to your base storage directory.
2. **Select all files and folders** at the root that are not already in the `main` directory.
3. **Drag and drop** them into the `main` folder.
4. **Verify** that only the `main` folder remains at the root (along with its content).
5. **Remove** the originals if you used Copy instead of Move.

---

#### S3/MinIO/Cloudflare R2: Using Management Console for S3-compatible Storage

Most S3-compatible providers (AWS S3, MinIO, Cloudflare R2) allow file operations through their web UI:

1. **Log in** to your S3-compatible storage console and open the bucket (`kestra.storage.s3.bucket` or `kestra.storage.minio.bucket`).
2. **Locate all objects** at the root level (not under the `main` prefix/folder).
3. For each such file or folder:
   * Use the **Move** or **Rename** function to move it to the `main/` prefix (e.g., move `foo/file.txt` → `main/foo/file.txt`).
   * If your console only allows copy, use **Copy** and then delete the original.
4. **Verify** that all files are now organized under the `main/` folder.
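After a manual GUI migration it is easy to leave strays behind. For local storage, a quick check that nothing but `main/` remains at the root can be sketched as follows (the demo directory is illustrative — point `BASE_PATH` at your real `kestra.storage.local.base-path`):

```shell
#!/bin/sh
# Sanity check after migration: report anything still sitting at the root
# of the base path other than the `main` tenant folder.
# The default path below is a demo value, not a Kestra default.
BASE_PATH="${BASE_PATH:-/tmp/kestra-base-path-demo}"

# Demo setup only — remove these two lines when checking a real base path:
# one migrated folder plus one leftover stray at the root.
mkdir -p "$BASE_PATH/main/foo" "$BASE_PATH/leftover"

leftovers=$(find "$BASE_PATH" -mindepth 1 -maxdepth 1 ! -name main)
if [ -n "$leftovers" ]; then
  echo "Still at root (move these under main/):"
  echo "$leftovers"
else
  echo "OK: only main/ remains at the root."
fi
```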
---

# Tenant Path Removed from Superadmin API Routes (EE)

URL: https://kestra.io/docs/migration-guide/v0.23.0/tenant-segment-removed-from-superadmin-apis

> Details on the removal of the tenant path segment from Superadmin API routes in Enterprise Edition.

## Removal of tenant from Superadmin API routes

The `{tenant}` parameter has been removed from several API routes related to tenant management in Enterprise Edition. This change affects only EE users who interact with these endpoints programmatically (i.e., via direct API calls).

## Reason for change

These routes are relevant only to `Superadmin` users, who can see and manage all tenants:

* The `{tenant}` path parameter was unnecessary and led to confusion, as all access control is based on the authenticated user's privileges (i.e., their tenant access), not the path.
* The endpoints now reflect the actual access model: actions depend on the `Superadmin` context, not on a specified `{tenant}` in the path.

## Changed endpoints

The following API endpoints have been updated to remove the `{tenant}` path segment:

* `/api/v1/{tenant}/clusters` → `/api/v1/clusters`
* `/api/v1/{tenant}/tenants` → `/api/v1/tenants`
* `/api/v1/{tenant}/tenants/bindings/` → `/api/v1/tenants/bindings/`
* `/api/v1/{tenant}/tenants/{resourceTenant}/group` → `/api/v1/tenants/{resourceTenant}/group`
* `/api/v1/{tenant}/tenants/{resourceTenant}/invitations` → `/api/v1/tenants/{resourceTenant}/invitations`
* `/api/v1/{tenant}/tenants/{resourceTenant}/namespaces` → `/api/v1/tenants/{resourceTenant}/namespaces`
* `/api/v1/{tenant}/tenants/{resourceTenant}/roles` → `/api/v1/tenants/{resourceTenant}/roles`
* `/api/v1/{tenant}/tenants/{resourceTenant}/users/` → `/api/v1/tenants/{resourceTenant}/users/`

## How to migrate

* If you are using these endpoints programmatically, update your API clients to remove the `{tenant}` path segment.
* Access remains limited to `Superadmin` users only.
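If a client has many call sites, the path update can be scripted rather than edited by hand. The following is an illustrative sketch (the helper and regex are not a Kestra API) that strips the tenant segment from the affected Superadmin routes while leaving other routes untouched:

```python
import re

# Illustrative client-side helper: rewrite old-style Superadmin routes such as
# /api/v1/{tenant}/tenants/... or /api/v1/{tenant}/clusters to the new
# tenant-less form. All other routes are returned unchanged.
SUPERADMIN_ROOTS = {"tenants", "clusters"}

def strip_tenant_segment(path: str) -> str:
    m = re.match(r"^(/api/v1)/([^/]+)/(.+)$", path)
    if not m:
        return path
    prefix, tenant, rest = m.groups()
    # Only rewrite when the segment after the tenant is a Superadmin root
    # and the path has not already been migrated.
    if rest.split("/", 1)[0] in SUPERADMIN_ROOTS and tenant not in SUPERADMIN_ROOTS:
        return f"{prefix}/{rest}"
    return path

print(strip_tenant_segment("/api/v1/acme/tenants/bindings/"))  # /api/v1/tenants/bindings/
print(strip_tenant_segment("/api/v1/acme/flows/search"))       # unchanged
```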
---

# Kestra 0.24.0 Migration Guide – Changes and Actions

URL: https://kestra.io/docs/migration-guide/v0.24.0

> Migration guide for Kestra 0.24.0. Covers mandatory Basic Authentication for OSS, IAM and API endpoint changes, and the LangChain4j to Plugin AI migration.

import ChildCard from "~/components/docs/ChildCard.astro"

## 0.24.0

Deprecated features and migration guides for 0.24.0 and onwards.

---

# Basic Authentication Now Required in Kestra OSS 0.24.0

URL: https://kestra.io/docs/migration-guide/v0.24.0/basic-authentication

> Notice regarding mandatory Basic Authentication for all Kestra Open-Source instances starting in version 0.24.0.

## Required Basic Authentication

Basic authentication (`username` and `password`) is now required to enhance security on open-source instances. All users must log in to access the Kestra UI and API, even if they are running Kestra locally or in a development environment. This change is designed to prevent unauthorized access to your Kestra instance and ensure that only authenticated users can view and manage flows.

The credentials can be configured from the Setup Page in the UI (http://localhost:8080/ui/main/setup), or you can set them manually in the [Kestra Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) file under `basic-auth` (recommended for production):

```yaml
kestra:
  server:
    basic-auth:
      username: admin@kestra.io
      password: Admin1234
```

Now that basic authentication is required, the `enabled` flag is ignored (ideally, stop using it), and credentials must be set to interact with the Kestra UI or API. For new users, follow the Setup Page that appears when you start the Kestra UI. For production deployments, set a valid email address and a strong password in the configuration file.

If you upgrade to version 0.24, there are three possible scenarios for existing users.
For the details, refer to the [Basic Authentication Troubleshooting guide](../../../10.administrator-guide/basic-auth-troubleshooting/index.md).

---

# FILE Input API: Capture Filename on Upload (0.24.0)

URL: https://kestra.io/docs/migration-guide/v0.24.0/capture-filename

> Information on the requirement to use part name and filename for uploading FILE-type inputs via API in Kestra 0.24.0.

## Capture filename on input type FILE

To upload a file for an input of type `FILE`, you should now use the part **name** for the input and the part **filename** attribute for the file name. For example, when using `cURL` to start an execution for the following flow:

```yaml
id: file_flow
namespace: company.team

inputs:
  - id: fileInput
    type: FILE

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "{{inputs.fileInput}}"
```

**Before 0.24 - now deprecated**

```bash
curl -v "http://localhost:8080/api/v1/executions/company.team/file_flow" \
  -H "Content-Type:multipart/form-data" \
  -F "files=@/tmp/test.txt;filename=fileInput"
```

**Since 0.24**

```bash
curl -v "http://localhost:8080/api/v1/executions/company.team/file_flow" \
  -H "Content-Type:multipart/form-data" \
  -F "fileInput=@/tmp/test.txt;filename=test.txt"
```

---

# IAM and API Endpoint Changes in Kestra 0.24.0

URL: https://kestra.io/docs/migration-guide/v0.24.0/endpoint-changes

> Details on significant IAM and API endpoint revisions in Kestra 0.24.0 for improved security and management.

## IAM and API Endpoint Changes

To streamline API usage, reduce ambiguity, and improve security and manageability for large organizations, the IAM and related API endpoints have been significantly revised in 0.24. These changes consolidate user, group, and role management around explicit, well-defined routes and permissions, and remove redundant or confusing API paths.
## Global API Changes

- `/v1/api/{tenant}/me` moved to `/v1/api/me`
- `/v1/api/cluster` moved to `/v1/api/instance`
- All `/v1/api/{tenant}/users` endpoints are removed:
  - Use `/v1/api/users` (instance-level, Superadmin only)
  - Use `/v1/api/{tenant}/tenant-access` for tenant access management
  - Use `/v1/api/{tenant}/service-accounts` for service account management
- All Superadmin endpoints under `/v1/api/tenants/{tenant}/groups`, `/bindings`, `/roles`, `/invitations`, and `/namespaces` are removed.

## Role APIs

- `GET /v1/api/tenants/{tenant}/roles/[search|autocomplete]` now only returns operation-relevant fields; `tenantId`, `deleted`, `description`, and `permissions` have been removed.
- `GET /v1/api/tenants/{tenant}/roles/{id}` now only returns relevant fields; `tenantId` and `deleted` are removed.
- `POST/PUT /v1/api/tenants/{tenant}/roles`: the request body now excludes `id`, `tenantId`, and `deleted`.

## Group APIs

- `GET /v1/api/tenants/{tenant}/groups/[search|autocomplete]` now only returns `id` and `name`.
- `GET /v1/api/tenants/{tenant}/groups/{id}` now only returns `id`, `name`, and `description`.
- `POST/PUT /v1/api/tenants/{tenant}/groups`: the request body now excludes `id`, `tenantId`, and `deleted`.
- `GET /v1/api/tenants/{tenant}/groups/{groupId}/members` and `/members/{userId}` now return only `id`, `username`, `displayName`, and `groups`.

## RBAC Updates

- Permissions `API_TOKEN` and `ME` are removed.
- New permissions:
  - `SERVICE_ACCOUNT` for managing service accounts
  - `INVITATION` for managing invitations
  - `TENANT_ACCESS` for managing users in a tenant
  - `GROUP_MEMBERSHIP` for group membership management
- The `USER` permission is now only required for SCIM integration.

## Action Required for applications built on top of Kestra API

- Update any API clients or scripts that interact with affected endpoints.
- Review permission assignments and RBAC configurations to use the updated permissions.
- For file uploads, ensure the request format matches the new requirements.

---

# LangChain4j to Plugin AI Migration in Kestra 0.24.0

URL: https://kestra.io/docs/migration-guide/v0.24.0/renaming-langchain4j-plugin-ai

> Guide to migrating from the LangChain4j plugin to the new Plugin AI package in Kestra 0.24.0.

## Migrate from LangChain4j Plugin to Plugin AI

The LangChain4j plugin has been renamed and is now available as Plugin AI, with a new repository and package namespace. Starting in Kestra 0.24, you must update all flows using `langchain4j` to reference the new `plugin-ai` package. No functional changes were made; this is a rename for clarity and future extensibility. The new Plugin AI repository is [https://github.com/kestra-io/plugin-ai](https://github.com/kestra-io/plugin-ai).

## Required Migration Steps

- Update all flow definitions:
  - Change `type:` and `provider:` values from `io.kestra.plugin.langchain4j.*` to `io.kestra.plugin.ai.*`
  - Adjust any provider types under `provider:` to use the new namespace, as shown in the examples below.

### Before 0.24 (using langchain4j):

```yaml
id: chat_completion
namespace: company.team

inputs:
  - id: prompt
    type: STRING

tasks:
  - id: chat_completion
    type: io.kestra.plugin.langchain4j.completion.ChatCompletion
    provider:
      type: io.kestra.plugin.langchain4j.provider.GoogleGemini
      apiKey: "{{secret('GOOGLE_API_KEY')}}"
      modelName: gemini-2.5-flash
    messages:
      - type: SYSTEM
        content: You are a helpful assistant, answer concisely, avoid overly casual language or unnecessary verbosity.
      - type: USER
        content: "{{inputs.prompt}}"
```

### After migrating to 0.24 (using plugin-ai):

```yaml
id: chat_completion
namespace: company.team

inputs:
  - id: prompt
    type: STRING

tasks:
  - id: chat_completion
    type: io.kestra.plugin.ai.completion.ChatCompletion
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ secret('GOOGLE_API_KEY') }}"
      modelName: gemini-2.5-flash
    messages:
      - type: SYSTEM
        content: You are a helpful assistant, answer concisely, avoid overly casual language or unnecessary verbosity.
      - type: USER
        content: "{{inputs.prompt}}"
```

## Additional Notes

- The new namespace applies to all AI providers and task types previously under `langchain4j`.
- No configuration changes are needed apart from the updated type paths.

---

# maxAttempt renamed maxAttempts

URL: https://kestra.io/docs/migration-guide/v0.24.0/retries-maxAttempts

> Announcement of the renaming of the maxAttempt retry property to maxAttempts for grammatical correctness.

## maxAttempt renamed maxAttempts

For [retries](../../../05.workflow-components/12.retries/index.md), the `maxAttempt` property has been renamed to `maxAttempts` (with the old name kept as an alias) to promote proper English grammar. This is a non-breaking change, but update all flows to use the correctly named property as a long-term safeguard.

## Before

The following example defines a retry for the `retry_sample` task with a maximum of 5 attempts every 15 minutes:

```yaml
- id: retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  timeout: PT10M
  retry:
    type: constant
    maxAttempt: 5 # This name will still work, but it is recommended to search and replace in your flows.
    interval: PT15M
```

## After

The following example defines a retry for the `retry_sample` task with a maximum of 5 attempts every 15 minutes:

```yaml
- id: retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  timeout: PT10M
  retry:
    type: constant
    maxAttempts: 5 # The correct, long-term naming convention
    interval: PT15M
```

---

# Kestra 1.0.0 Migration Guide – Milestone Release Changes

URL: https://kestra.io/docs/migration-guide/v1.0.0

> Comprehensive guide to changes and migration actions for the milestone Kestra 1.0.0 release.

import ChildCard from "~/components/docs/ChildCard.astro"

## 1.0.0

Deprecated features and migration guides for 1.0.0 and onwards.

---

# Custom Plugin Package Structure Changes in Kestra 1.0

URL: https://kestra.io/docs/migration-guide/v1.0.0/custom-plugin-packages

> Internal package structure changes for custom plugin developers in Kestra 1.0.0.

## Internal Package Structure Changes (Custom Plugins Only)

This change affects only users building custom plugins or using the Java library in tests.

- `io.kestra.core.runners.StandAloneRunner` → replaced by `io.kestra.core.runners.TestRunner`.
- `io.kestra.core.schedulers.AbstractScheduler` → replaced by `io.kestra.scheduler.AbstractScheduler`.

For plugin tests using the Scheduler or Worker directly, add the new Gradle modules:

```groovy
testImplementation group: "io.kestra", name: "scheduler"
testImplementation group: "io.kestra", name: "worker"
```

Tests using `@ExecuteFlow` remain unaffected. There is no impact for UI/API users.

---

# Helm Chart Updates in Kestra 1.0.0 for Production

URL: https://kestra.io/docs/migration-guide/v1.0.0/helm-charts

> Major updates and restructuring of Kestra Helm charts for production-grade deployments in version 1.0.0.

## Helm Chart Updates

Kestra's Helm charts have been updated to be more comprehensive for production environments while also offering charts for starter use cases.
Previously, a single chart deployed one standalone Kestra service with one replica (all Kestra server components in a single pod) with preinstalled dependencies such as PostgreSQL and MinIO — helpful to get started but typically unnecessary for production.

The Kestra Operator is also available as a custom Kubernetes Operator that reads Resource Definitions to conduct various actions in Kestra. We also restructured configurations and values to be more comprehensive and production grade.

There are now three charts: `kestra` (production chart), `kestra-starter` (starter chart with dependencies), and `kestra-operator` (Enterprise-only custom Kubernetes operator).

:::alert{type="info"}
Breaking changes have been made to the Helm chart to support the new features and improvements in Kestra 1.0.0. Review the following changes carefully before upgrading.
:::

## `kestra`

This chart is intended for production deployments. Here is how you can install it under the release name `my-kestra`:

```bash
$ helm repo add kestra https://helm.kestra.io/
$ helm install my-kestra kestra/kestra --version 1.0.0
```

PostgreSQL, MinIO, Kafka, and Elasticsearch have been removed from the chart dependencies. You can now use your own managed services or deploy them separately. To install Kestra with dependencies, use the `kestra-starter` chart, but you will then need to manage those dependencies yourself.

## Deployment configuration

Most of the deployment configuration options have been restructured. There is now a `common` entry in the `values.yaml` — compare the `Before` and `After` sections below.

### Before

```yaml
nodeSelector: {}
tolerations: []
affinity: {}
extraVolumeMounts: []
extraVolumes: []
extraEnv: []
## more...
```

### After

```yaml
common:
  nodeSelector: {}
  tolerations: []
  affinity: {}
  extraVolumeMounts: []
  extraVolumes: []
  extraEnv: []
  # more...
```

You can override all those configuration options in the `deployments` entry in the `values.yaml` file.
```yaml
deployments:
  standalone:
    nodeSelector: {}
    tolerations: []
    affinity: {}
    extraVolumeMounts: []
    extraVolumes: []
    extraEnv: []
    # more...
```

## Custom configuration files

The method for providing custom configuration files to Kestra has changed. It is now all under the `configurations` entry in the `values.yaml` file.

### Before

```yaml
### This creates a config map of the Kestra configuration
configuration: {}
## Example: Setting the plugin defaults for the Docker runner
## kestra:
##   plugins:
##     configurations:
##       - type: io.kestra.plugin.scripts.runner.docker.Docker
##         values:
##           volume-enabled: true

### This will create a Kubernetes Secret for the values provided
## This will be appended to kestra-secret with the key application-secrets.yml
secrets: {}
## Example: Store your postgres backend credentials in a secret
## secrets:
##   kestra:
##     datasources:
##       postgres:
##         username: pguser
##         password: mypass123
##         url: jdbc:postgresql://pghost:5432/db

### Load Kestra configuration from existing secret
## Here this assumes the secret is already deployed and the following apply:
## 1. The secret type is "Opaque"
## 2. The secret has a single key
## 3. The value of the secret is the Kestra configuration.
externalSecret: {}
#secretName: secret-name
#key: application-kestra.yml

### configuration files
## This option allows you to reference existing local files to configure Kestra, e.g.
configurationPath:
## configurationPath: /app/application.yml,/app/application-secrets.yml

extraConfigMapEnvFrom:
# - name: my-existing-configmap-no-prefix
# - name: my-existing-configmap-with-prefix
#   prefix: KESTRA_

extraSecretEnvFrom:
# - name: my-existing-no-prefix
# - name: my-existing-with-prefix
#   prefix: SECRET_
```

### After

```yaml
configurations:
  application:
    kestra:
      queue:
        type: h2
      repository:
        type: h2
      storage:
        type: local
        local:
          base-path: "/app/storage"
      datasources:
        h2:
          url: jdbc:h2:mem:public;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
          username: kestra
          password: ""
          driver-class-name: org.h2.Driver
  configmaps:
    - name: kestra-others
      key: others.yml
  secrets:
    - name: kestra-basic-auth
      key: basic-auth.yml
```

There is no longer any need to manage `configurationPath:`; it is handled automatically by the chart. If you need to add extra environment variables from existing `ConfigMaps` or `Secrets`, you can still use `extraEnv` and `extraEnvFrom` under the `common` entry.

## Managing Docker in Docker (dind)

The way `dind` is managed has been updated. It is now under the `dind` entry in the `values.yaml`. A `dind.mode` option is now available to choose between `rootless` and `insecure`; `rootless` is the default and recommended mode.

For a full list of values, refer to the [Values](https://github.com/kestra-io/kestra/blob/develop/charts/kestra/README.md#values) in the chart's source code.

---

# Input Default Values Are Now Dynamically Rendered

URL: https://kestra.io/docs/migration-guide/v1.0.0/inputs-defaults-property

> Details on the change to dynamic rendering for input default values in Kestra 1.0.0.

## Input defaults are now dynamic

The `defaults` property of all inputs is now dynamic. This change has implications for users who use a Pebble [expression](../../../expressions/index.mdx) as a default value.
Consider this use case:

```yaml
id: session
namespace: company.team

inputs:
  - id: sessionId
    type: STRING
    defaults: "{{ execution.id }}"

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "This is my session id: {{render(inputs.sessionId)}}"
```

Given that the `defaults` are now dynamically rendered, the above flow will fail in Kestra 0.24 and higher, unless you move the expression to the tasks as follows:

```yaml
id: session
namespace: company.team

inputs:
  - id: sessionId
    type: STRING
    required: false

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "This is my session id: {{ inputs.sessionId ?? execution.id }}"
```

---

# PurgeAuditLogs: 'permissions' Renamed to 'resources'

URL: https://kestra.io/docs/migration-guide/v1.0.0/purge-audit-logs

> Renaming of the permissions property to resources in the PurgeAuditLogs task for consistency.

## Audit Log's permissions are renamed to resources

The `permissions` property used by the [PurgeAuditLogs](/plugins/core/log-ee/io.kestra.plugin.ee.core.log.purgeauditlogs) task is now called `resources`, aligning with the [AuditLogShipper](/plugins/core/log-ee/io.kestra.plugin.ee.core.log.auditlogshipper) task. The functionality remains the same, but you will need to change the property name in any flows using the task.

Additionally, in the `/api/v1{/tenant}/auditlog/search` API, the filter on `permissions` is replaced by `resources`. This change is also reflected in the UI filter on the **Audit Logs** page.

---

# Reserved Keywords Cannot Be Used as Flow IDs (1.0.0)

URL: https://kestra.io/docs/migration-guide/v1.0.0/reserved-flow-ids

> Announcement of reserved keywords that can no longer be used as Flow IDs in Kestra 1.0.0.

## Reserved keywords cannot be used as Flow IDs

Starting with Kestra 1.0, certain keywords are reserved and **cannot be used as Flow IDs**. These identifiers collide with internal API endpoints and are therefore restricted.
**Reserved keywords:**

```plaintext
pause
resume
force-run
change-status
kill
executions
search
source
disable
enable
```

If your flows use one of these IDs, you will not be able to edit them after upgrading. To avoid disruption, you must rename the flows **before upgrading**. See the [commit introducing this change](https://github.com/kestra-io/kestra/commit/d4e7b0cde4cf5cfad99b3fb39bff5728e056a049) for details.

## Migration

### 1. Identify impacted flows

Check if any flows are using one of the reserved keywords as their `id`.

```yaml
id: pause # ❌ Invalid in 1.0
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "This flow will break after upgrade"
```

### 2. Copy the flow under a new ID

Create a new flow with a different `id` that does not use a reserved keyword.

```yaml
id: session_pause # ✅ Valid in 1.0
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "This flow works in 1.0"
```

### 3. Remove the old flow

Once you have validated the new flow, delete the old flow with the reserved keyword `id`.

## Recommendation

Perform this migration **before upgrading to Kestra 1.0**; otherwise, you will not be able to edit affected flows after the upgrade.

---

# Singer Tap Plugin Removed: Migrate to Airbyte or dlt

URL: https://kestra.io/docs/migration-guide/v1.0.0/singer-plugin

> Guide to migrating from the removed Singer tap plugin to supported alternatives like Airbyte or dlt.

## Singer Tap Plugin Removal

Singer support was deprecated in Kestra 0.24 and is fully removed in Kestra 1.0. This guide walks you through migrating existing Singer pipelines to supported alternatives in Kestra, such as [Airbyte](/plugins/plugin-airbyte), [dlt](/blueprints?page=1&size=24&q=dlt), and [CloudQuery](/plugins/plugin-cloudquery).

## Why is Singer support being removed?

Singer was once a promising open-source technology for building and sharing data connectors. However:

- It is no longer actively maintained.
- After Meltano shut down, there is no longer a company backing the ecosystem.
- As a result, compatibility, security, and reliability cannot be guaranteed going forward.

To ensure Kestra users have reliable, well-maintained data ingestion options, migrate to other open-source alternatives. For example, Kestra provides plugins for:

- Airbyte: large connector ecosystem for databases, SaaS apps, and warehouses. Runs in both **Cloud** and **OSS** modes.
- dlt: flexible **Python-based** ingestion framework, great for custom pipelines and lightweight ingestion.
- CloudQuery: purpose-built for cloud asset discovery and syncing metadata to databases or warehouses.

## Migration checklist

1. Identify your Singer taps/targets and their data sources/destinations.
2. Check Kestra’s supported plugins for Airbyte, dlt, or CloudQuery equivalents.
3. Configure connections:
   - For Airbyte, set up connections in the Airbyte UI and reference `connectionId` in Kestra.
   - For dlt, define your pipeline script and run it as a Python task in Kestra.
   - For CloudQuery, configure your ingestion spec in YAML and execute it with the Sync or CLI tasks.
4. Update secrets in Kestra (API tokens, database credentials).
5. Run test migrations and validate data consistency.
6. Remove old Singer flows after successful migration.

These plugins are fully integrated with Kestra, making the transition straightforward and ensuring your ingestion pipelines remain reliable.

---

# Kestra 1.1.0 Migration Guide – Deprecated Features

URL: https://kestra.io/docs/migration-guide/v1.1.0

> Migration guide for Kestra 1.1.0. Covers ForEachItem index change, new prefill property, KV Store metadata migration, and Task Runs UI removal from EE.

import ChildCard from "~/components/docs/ChildCard.astro"

## 1.1.0

Deprecated features and migration guides for 1.1.0 and onwards.
---
# ForEachItem Iteration Now Starts at 0 Instead of 1
URL: https://kestra.io/docs/migration-guide/v1.1.0/foreach-item
> Notice regarding ForEachItem iteration starting at 0 instead of 1 for consistency.

## ForEachItem now starts iteration at 0 instead of 1

`ForEachItem` now starts iteration at 0 instead of 1 to align with the starting iteration value of `ForEach`. If you use `{{ taskrun.iteration }}` in a flow with `ForEachItem`, the starting value is now 0 instead of 1.

---
# KV Store and Secrets Metadata Migration in Kestra 1.1
URL: https://kestra.io/docs/migration-guide/v1.1.0/kv-secrets-metadata-migration
> Required metadata migration for Key-Value Store and Secrets to enable efficient indexing and search.

## Key-Value Store and Secrets Metadata Migration

Version **1.1.0** improves the backend logic that powers **Key-Value Pairs** and **Secrets** (Enterprise Edition) search in the Kestra UI. Previously, the UI fetched *all* stored pairs, which could become resource-intensive and inefficient in environments with a large number of entries. To enhance performance and scalability, this release introduces **metadata indexing** that allows the backend to query these resources more efficiently.

## Impact

Because of this change, you must run a metadata migration when upgrading to version **1.1.0** (or later). This ensures existing Key-Value and Secrets (EE) data are correctly indexed for the new query structure. When upgrading, include the migration command (`/app/kestra migrate metadata`) in your startup configuration.
For example, if you’re using **Docker Compose**, start your container with the newest version image and add the migration command in `command` as follows:

```yaml
kestra:
  image: registry.kestra.io/docker/kestra:latest
  command: migrate metadata kv
```

Then run the Secrets metadata migration the same way:

```yaml
kestra:
  image: registry.kestra.io/docker/kestra:latest
  command: migrate metadata secrets
```

:::alert{type="info"}
Secrets metadata migration is only necessary for Enterprise users. Open-source users will see an error: `❌ Secrets Metadata migration failed: Secret migration is not needed in the OSS version`.
:::

Once the migration is complete, the container will stop automatically. You can then switch back to the usual command to run the server:

```yaml
kestra:
  image: registry.kestra.io/docker/kestra:latest
  command: server standalone --worker-thread=128
```

Similarly, for Kubernetes installations, run a pod with the migration commands (`/app/kestra migrate metadata kv && /app/kestra migrate metadata secrets`) so the KV Store and Secrets databases are updated. Then, restart your normal pod for Kestra server components without the migration commands.

:::alert{type="warning"}
If you upgrade to **1.1.0** without running the migration script, the **Key-Value Store** and **Secrets** pages in the UI will appear empty. This is only a **UI issue** — your flows and tasks will continue to run normally and access their values as expected. To fix the UI display, run the migration command above. It’s safe to execute this migration **retroactively** after the upgrade if needed.
:::

---
# New 'prefill' Property for Inputs: Breaking Change
URL: https://kestra.io/docs/migration-guide/v1.1.0/prefill-inputs
> Introduction of the new prefill property for inputs to allow editable initial values.

## Breaking change

If you have flows with the input property `defaults`, then `required` can no longer be `false`. This combination will throw an error, as inputs with a default value must be required.
Previously this combination was valid; any flows with inputs configured this way must be refactored so that inputs using `defaults` also set `required: true`.

## New prefill Property for Inputs

A new `prefill` property has been added to input definitions to let users start with an initial value that can be cleared or set to `null` when the input is not required.

**What changed:**

Inputs can now define a `prefill` value, which works like an editable default. Unlike `defaults`, a `prefill` value does not persist if the user removes it. This allows workflows to support optional inputs that start with a suggestion but can still be reset to `null` at runtime.

**Impact:**

This update clarifies how `required`, `defaults`, and `prefill` behave together:

* `prefill` and `defaults` cannot be used on the same input.
* Use `prefill` when `required: false` and the user should be able to clear the value.
* Use `defaults` when `required: true` or when the value must always have a non-null default.

As noted above, `defaults` cannot be combined with `required: false`.

**Example:**

```yaml
inputs:
  - id: nullable_string_with_prefilled_default
    type: STRING
    prefill: "This is a prefilled value you can remove (set to null if needed)"
    required: false
```

**Migration:**

No migration is required to adopt `prefill`. For optional inputs that previously used `defaults` but need to allow clearing or null values, switch those definitions to `prefill` instead.

---
# Query Task Now Supports Only One SQL Statement
URL: https://kestra.io/docs/migration-guide/v1.1.0/query-task
> JDBC Query tasks in Kestra 1.1.0 now accept only a single SQL statement. Learn how to split multi-statement queries to update your flows for compatibility.

## The Query Task Now Supports Only One SQL Statement

The `Query` task in **plugin-jdbc** now supports only a **single SQL statement** per execution.
Any workflows that include multiple SQL statements separated by semicolons (`;`) within a single `Query` task will now **fail**. ### What changed Previously, the `Query` task accepted multiple SQL statements separated by semicolons. This has been removed to ensure consistent transactional behavior and compatibility across database providers. To execute multiple statements: * Use the **`Queries`** task for multi-statement operations. * Or split SQL statements into individual `Query` tasks. ### Impact Workflows containing multiple SQL statements in one `Query` task will fail with the following error: ```plaintext Query task supports only a single SQL statement. Use the Queries task to run multiple statements. ``` ### Migration To update affected workflows: 1. Replace the `Query` task with the `Queries` task when multiple SQL statements need to be executed together. 2. Or, split the SQL statements into separate `Query` tasks. ### Example (will fail) ```yaml id: queries namespace: company tasks: - id: query type: io.kestra.plugin.jdbc.sqlite.Query allowFailure: true description: "This will fail with error: Query task supports only a single SQL statement. Use the Queries task to run multiple statements." 
url: jdbc:sqlite:kestra.db fetchType: STORE sql: | CREATE TABLE IF NOT EXISTS features ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, description TEXT NOT NULL, release_version TEXT NOT NULL, edition TEXT NOT NULL ); DELETE FROM features; ``` ### Example (fixed using `Queries` task) ```yaml id: queries namespace: company tasks: - id: queries type: io.kestra.plugin.jdbc.sqlite.Queries url: jdbc:sqlite:kestra.db fetchType: STORE sql: | CREATE TABLE IF NOT EXISTS features ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, description TEXT NOT NULL, release_version TEXT NOT NULL, edition TEXT NOT NULL ); DELETE FROM features; INSERT INTO features (name, description, release_version, edition) VALUES ('Worker Groups', 'Allows targeting specific tasks or triggers to run on specific remote workers for better scalability and resource management.', '0.10', 'Enterprise'), ('Realtime Triggers', 'Supports triggering event-driven workflows in real-time.', '0.17', 'Open-Source'), ('Task Runners', 'Provides on-demand remote execution environments for running tasks.', '0.16', 'Open-Source'), ('KV Store', 'Adds key-value storage for persisting data across workflow executions.', '0.18', 'Open-Source'), ('SCIM Directory Sync', 'Allows synchronization of users and groups from Identity Providers.', '0.18', 'Enterprise'); SELECT * FROM features ORDER BY release_version; ``` --- # Task Runs UI Page Removed in Kestra EE 1.1.0 URL: https://kestra.io/docs/migration-guide/v1.1.0/task-runs-ui > Announcement of the removal of the Task Runs page from the Enterprise Edition UI. ## Task Runs UI Page Removed The **Task Runs** page has been removed from the Enterprise Edition UI. 
**Reason for removal:**

* The page presented confusing granularity (task-level runs displayed on execution-level detail pages).
* Filters were incomplete, and performance degraded when handling large datasets.
* The feature was only available on the Kafka & Elasticsearch backend, which caused confusion for customers on other backends.
* Customer interviews confirmed the page was not being actively used.

**Impact:**

* No replacement is required, as all relevant execution and task run details remain accessible through the Execution detail view.
* If you previously accessed Task Runs, use the **Executions** page and drill down into individual task runs through the Gantt and Logs views for task-level information.

---
# Webhook Execution API Return Type Changed in 1.1.0
URL: https://kestra.io/docs/migration-guide/v1.1.0/webhook-response
> Information on the change of the Webhook Execution API return type to a generic response.

## Webhook Execution API Return Type Changed

The return type of the Webhook Execution API endpoint has been updated from a **typed response** to a **generic response** to support broader use cases and improve extensibility.

**What changed:**

The API method previously returned a strongly typed `WebhookResponse` object. It now returns a generic HTTP response body (`HttpResponse`).

**Impact:**

Any custom integrations, SDK consumers, or extensions that previously relied on the `WebhookResponse` type in the response body will need to adjust their handling logic:

* Direct access to `WebhookResponse` methods or fields will no longer compile.
* You must handle the response body dynamically and verify its type at runtime if necessary.

**Migration:**

* Treat the HTTP response body as a generic object.
* Add runtime type checks before casting if your code depends on specific response fields.
* Update any logic that assumes a fixed `WebhookResponse` structure.
This change ensures greater flexibility in future webhook response handling but requires updates to any consuming code that previously depended on a fixed response type. --- # Migration Guide for Kestra 1.2.0 – Changes & Actions URL: https://kestra.io/docs/migration-guide/v1.2.0 > Overview of changes and migration actions required for upgrading to Kestra version 1.2.0. import ChildCard from "~/components/docs/ChildCard.astro" Deprecated features and migration guides for 1.2.0 and onwards. --- # Namespace Files Metadata Migration in Kestra 1.2.0 URL: https://kestra.io/docs/migration-guide/v1.2.0/namespace-file-migration > Migration guide for Namespace Files metadata in Kestra 1.2.0, optimizing search and scalability. The backend now indexes Namespace Files metadata to optimize search and scalability ## Overview Version 1.2 changes how Namespace Files metadata are handled. The backend now **indexes Namespace Files metadata** to optimize search and scalability, replacing the previous approach of fetching all stored files directly from storage (which could be slow and inefficient for large datasets). Additionally, Namespace Files can now be **versioned and restored**. If you are upgrading to 1.2.0 from an earlier version, you must run the Namespace Files metadata migration. This migration is required for **both runtime execution and UI visualization** — without it, Namespace Files may not be accessible: - In flows/tasks that use Namespace Files (e.g., `namespaceFiles` or `read()`). - In the Kestra UI (files missing, not browsable, or not selectable where expected). ## Required command :::alert{type="warning"} Before running the migration command, stop the main Kestra application to avoid inconsistent reads/writes during the metadata update. ::: Run the following command once: ```shell /app/kestra migrate metadata nsfiles ``` :::alert{type="info"} Running the migration after the upgrade is safe and will restore the missing UI data immediately. 
:::

After it completes, restart Kestra and verify:

- A task using `namespaceFiles` can `read()` the expected files.
- Namespace Files are visible and accessible in the UI.

## Docker Compose

For **Docker Compose** setups, run the migration command by overriding the container command:

```yaml
kestra:
  image: registry.kestra.io/docker/kestra:latest
  command: migrate metadata nsfiles
```

After the migration completes, revert to the standard startup command to run the server, e.g., `server standalone --worker-thread=128`.

## Kubernetes

For **Kubernetes** deployments, create a one-time Pod (or Job) to run the same migration command **before restarting** your regular Kestra server Pods.

---
# Notification Plugins Split in Kestra 1.2.0 (Non-Breaking)
URL: https://kestra.io/docs/migration-guide/v1.2.0/notifications-plugin-split
> Guide to migrating to the split notification plugins in Kestra 1.2.0, allowing for more granular plugin management.

Kestra 1.2.0 splits the monolithic notifications plugin into provider-specific plugins. With this change, you can include only what you need and, in Enterprise, pin versions per provider. You do not need to update your flows (aliases are in place), and executions will run as usual; just keep the new plugin names in mind when installing or searching for notification plugins going forward.

# Migrate to the split notification plugins

## New plugin packages

Replace the single notifications bundle with the individual plugins listed below. Install only the providers you use.
Examples:

- Slack → [`plugin-slack`](/plugins/plugin-slack)
- Email (SMTP) → [`plugin-email`](/plugins/plugin-email)
- PagerDuty → [`plugin-pagerduty`](/plugins/plugin-pagerduty)
- Microsoft 365 / Teams → [`plugin-microsoft365`](/plugins/plugin-microsoft365)
- Google Chat → [`plugin-googleworkspace`](/plugins/plugin-googleworkspace)
- Meta (Messenger & WhatsApp) → [`plugin-meta`](/plugins/plugin-meta)

Other providers (Twilio/Segment, Opsgenie, SendGrid, Sentry, Squadcast, X, Zenduty, Zulip, Discord, Telegram, LINE, etc.) follow the same naming pattern (`plugin-` followed by the provider name).

## What to do

1. **Update plugin sources**
   - OSS: download or reference the new provider plugins you need and remove the legacy notifications plugin from your plugins directory.
   - EE: specify the desired provider plugins (and versions) in your plugin configuration instead of the monolithic bundle. See [versioned plugins](../../../07.enterprise/05.instance/versioned-plugins/index.md) to pin and control upgrades per provider.
2. **Keep flow definitions unchanged (where possible)**
   Task and trigger type aliases remain in place, so flows generally do **not** need YAML changes as long as the matching provider plugin is installed. If a task fails to load, confirm the plugin is present and restart the worker/webserver.
3. **Review plugin defaults and secrets**
   Move any `pluginDefaults` or secrets you set for notifications to the corresponding provider plugin type (for example, Slack defaults now rely on `plugin-slack` being available).

## Notes for Enterprise deployments

- You can pin versions per provider to control blast radius (for example, keep `plugin-slack` at a known version while upgrading `plugin-email`).
- Ensure all workers in a worker group share the same plugin set so tasks resolve consistently.

---
# Migration Guide for Kestra 1.3.0 – Changes & Actions
URL: https://kestra.io/docs/migration-guide/v1.3.0
> Migration guide for Kestra 1.3.0.
Overview of deprecated features, required upgrade steps, and breaking changes when upgrading to Kestra 1.3.0. import ChildCard from "~/components/docs/ChildCard.astro" Deprecated features and migration guides for 1.3.0 and onwards. --- # Enterprise License Upgrade in Kestra 1.3.0 URL: https://kestra.io/docs/migration-guide/v1.3.0/ee-license-upgrade > Upgrade your Kestra Enterprise license in version 1.3.0. Follow the step-by-step migration guide to apply the new license format and avoid service disruption. ## Overview Kestra 1.3.0 included required changes in the Kestra licensing API. Licenses issued before 1.3.0 will not be accepted by 1.3.0+ nodes. Every customer upgrading to 1.3.0 must install a newly generated [license](../../../configuration/06.enterprise-and-advanced/index.md). If you are planning to upgrade to `v1.3.0`, reach out to Customer Success or Sales to receive a compatible license. The change required will be to update the license ID, key, and fingerprint with the new value prior to deploying Kestra v1.3.0. ## Breaking change in 1.3.0 - **Startup enforcement:** Nodes running 1.3.0+ will refuse to start with legacy licenses. A valid license is required at boot. - **Scope:** Applies to all new installs and upgrades to 1.3.0+. ## Action required for existing customers 1. Before upgrading to 1.3.0, contact Customer Success or Sales to generate a new license for each environment. 2. Install the new license on all Kestra Enterprise nodes (controllers, executors, webapps) before restarting on 1.3.0. 3. Verify startup succeeds and feature access aligns with the entitlements in the new license. --- # File-Listing Plugins Default to 25 Results in 1.3.0 URL: https://kestra.io/docs/migration-guide/v1.3.0/file-listing-default-limit > All plugins that list files now cap results at 25 by default to protect execution size and database load; set an explicit limit if you need more. ## Context Plugins that list files (local FS, SFTP, S3/GCS/Azure buckets, etc.) 
could previously return every matching object. In production, unbounded listings have produced executions with thousands of files, leading to huge `executions.value` payloads, overloaded MySQL/PostgreSQL storage, and, in the worst cases, worker or server crashes.

## What changed

- A **default limit of 25 files** (e.g., `maxFiles: 25`) now applies to all file-listing plugins (tasks and triggers).
- The limit is **configurable** on each plugin — set it explicitly when you need more than 25 results.
- The change is **breaking** for users who relied on unbounded listings; those flows will now see truncated results unless a higher limit is provided.

## Who is affected

Any flow or trigger that lists files and expects more than 25 results in one invocation (for example, listing a bucket, large directory, or prefix/path on SFTP, local filesystems, S3/GCS/Azure, or similar storage backends).

## How to migrate

1) **Audit file-listing usages.** Identify tasks/triggers that list files (search for list-type tasks in your flow YAMLs, such as S3/GCS/Azure/FS/SFTP “List” operations).
2) **Set an explicit limit where needed.** Set the plugin’s limit property (commonly named `limit`, sometimes `maxResults`/`maxKeys`, depending on the plugin) to the value you need. Refer to the plugin’s documentation for the exact property names.
3) **Prefer bounded values.** Raise the limit only as high as necessary; extremely large limits can still produce oversized executions and heavy database writes.
4) **Consider pagination/partitioning.** Where possible, paginate by prefix/date folder or break listings into smaller batches to avoid large single executions.
5) **Verify after upgrade.** Run validation or a dry-run listing in lower environments to confirm the new limit returns the expected number of files.

---
# LTS Migration: Kestra 1.0 to 1.3 Upgrade Guide
URL: https://kestra.io/docs/migration-guide/v1.3.0/lts-migration
> Consolidated migration guide for users upgrading directly from Kestra 1.0 LTS to 1.3 LTS.
## Overview Kestra **1.3.0** is a Long-Term Support (LTS) release. If you are upgrading directly from the previous LTS release (**1.0**), you must run the metadata migrations that were introduced in the intermediate **1.1** and **1.2** releases. This guide consolidates all required steps into a single walkthrough so you can upgrade in one pass. ## Required metadata migrations Three migration commands must be run — **all three are required** for a complete upgrade from 1.0 to 1.3: | Command | Introduced in | Scope | |---|---|---| | `kestra migrate metadata kv` | 1.1.0 | Key-Value Store metadata indexing | | `kestra migrate metadata secrets` | 1.1.0 | Secrets metadata indexing (Enterprise Edition only) | | `kestra migrate metadata nsfiles` | 1.2.0 | Namespace Files metadata indexing | :::alert{type="info"} The `secrets` migration applies only to **Enterprise Edition** users. Open-source users can skip it — running it on OSS will produce an exception that can be safely ignored. ::: ## Order of operations 1. **Stop Kestra** — shut down all running Kestra server components to avoid inconsistent reads/writes during migration. 2. **Run all three migration commands** (see examples below). 3. **Restart Kestra** with the standard server command. ## Docker Compose Run each migration command by temporarily overriding the container command. Execute them one at a time — each command exits automatically when complete. 
```yaml # Step 1 – KV metadata kestra: image: registry.kestra.io/docker/kestra:latest command: migrate metadata kv ``` ```yaml # Step 2 – Secrets metadata (Enterprise Edition only) kestra: image: registry.kestra.io/docker/kestra:latest command: migrate metadata secrets ``` ```yaml # Step 3 – Namespace Files metadata kestra: image: registry.kestra.io/docker/kestra:latest command: migrate metadata nsfiles ``` After all three migrations complete, revert to the standard startup command: ```yaml kestra: image: registry.kestra.io/docker/kestra:latest command: server standalone --worker-thread=128 ``` ## Kubernetes For Kubernetes deployments, create a one-time **Job** (or Pod) that runs all three commands before restarting your regular Kestra server Pods: ```shell /app/kestra migrate metadata kv \ && /app/kestra migrate metadata secrets \ && /app/kestra migrate metadata nsfiles ``` Once the Job completes successfully, roll out your updated Kestra server Pods as usual. ## What happens if you skip these migrations | Skipped migration | Impact | |---|---| | `kv` | The **Key-Value Store** page in the UI appears empty. Flows continue to work — this is a UI-only issue. | | `secrets` | The **Secrets** page in the UI appears empty (EE only). Flows continue to work — this is a UI-only issue. | | `nsfiles` | **Namespace Files are inaccessible** both in flows/tasks (e.g., `namespaceFiles`, `read()`) and in the UI. | :::alert{type="warning"} Unlike the KV and Secrets migrations (which only affect UI display), the Namespace Files migration affects **runtime execution**. Skipping it can break flows that depend on Namespace Files. ::: All three migrations are safe to run **retroactively** — if you have already upgraded and notice missing data, running the commands will restore it immediately. 
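Such a one-time Job can be sketched roughly as follows. This is an illustrative outline only: the Job name and shell wrapper are assumptions, and you must reuse the image tag, environment variables, and configuration mounts (database connection, internal storage) from your existing Kestra server Pods for the migration to reach the right backend.

```yaml
# Illustrative one-time migration Job; names and mounts are assumptions.
# Reuse the image, env, and config volumes from your Kestra server Pods.
apiVersion: batch/v1
kind: Job
metadata:
  name: kestra-lts-metadata-migration
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.kestra.io/docker/kestra:latest
          command: ["/bin/sh", "-c"]
          args:
            - >-
              /app/kestra migrate metadata kv
              && /app/kestra migrate metadata secrets
              && /app/kestra migrate metadata nsfiles
```

Because `restartPolicy: Never` and `backoffLimit: 0` are set, a failed migration surfaces immediately instead of retrying; inspect the Job's logs before rolling out the upgraded server Pods.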
## Further reading - [Key-Value Store and Secrets Metadata Migration (1.1.0)](/docs/migration-guide/v1.1.0/kv-secrets-metadata-migration) - [Namespace Files Metadata Migration (1.2.0)](/docs/migration-guide/v1.2.0/namespace-file-migration) --- # No Code in Kestra: Build Flows & Dashboards Visually URL: https://kestra.io/docs/no-code > Discover Kestra's No Code capabilities for building flows and dashboards visually, empowering non-technical users. import ChildCard from "~/components/docs/ChildCard.astro" Build in Kestra without touching YAML. ## Build flows and dashboards without YAML --- # No Code Dashboards in Kestra: Design Without YAML URL: https://kestra.io/docs/no-code/no-code-dashboards > Build interactive dashboards in Kestra using the No Code editor to visualize your data without writing YAML. Build Dashboards without writing YAML. ## Create dashboards with the No Code editor The No Code Dashboard editor lets you design Kestra Dashboards directly in the UI using structured forms. It’s ideal for teams that want to create insightful dashboards quickly, empower non-developers to contribute, and maintain a smooth handoff to code. You can switch between No Code and YAML views at any time — the editor automatically generates schema-validated YAML and stays synchronized with the live preview and documentation panels. --- ## Why choose No Code (or combine it with code) - **Speed & accessibility**: Start creating dashboards without writing YAML — perfect for analysts, operators, or anyone new to Kestra. - **Visual clarity**: Live previews and real-time updates let you “see” your dashboard evolve as you edit. - **Consistency & governance**: Form-based configuration aligns with widget schemas and validation rules, ensuring consistent data representation and design standards across teams. - **No ceiling**: When you need advanced customization, switch to YAML to add filters, queries, or layout logic — all while keeping everything in sync within the same editor. 
## The multi-panel dashboard editor

- **No Code View:** Form-based editing of dashboard widgets, layout, and data sources. Changes automatically generate YAML in real time.
- **Dashboard Code View (YAML):** Full-featured editor with autocompletion, validation, and file sidebar.
- **Charts Tab:** Displays all currently saved charts in the dashboard, providing a complete overview of existing visualizations.
- **Preview Tab:** Shows in-progress charts as you design, allowing you to instantly review updates before saving.
- **Documentation & Blueprints Panels:** In-context documentation and ready-to-use dashboard examples to help you get started faster.
## Quick start: create your first dashboard in No Code To start building dashboards, navigate to the **Dashboards** tab. Click the **Default Dashboard** button at the top of the page (the name may vary depending on your instance) and select **+ Create Dashboard**. ![Create Dashboard](./create-dashboard.png) Next, you’ll see the Dashboard YAML editor. Select the **No Code** tab to open the No Code panel editor. It will appear alongside the YAML editor so you can view both as you work. ![No Code Dashboard Editor](./no-code-dashboards.png) ## Build a chart This example walks through creating a dashboard for the last year, starting with an **Executions Success Ratio KPI Chart**. Begin by giving your dashboard an ID, title, description, and time window. If the YAML editor is open, you’ll see every change in the No Code form instantly reflected in code. ![Time Window](./time-window.png) Once those fields are set, click **+ Add** in the **charts** block to create your first chart. The first step is to choose a chart type. In this example, select **KPI Chart**. Each chart type has its own set of options and data representations — see the [Chart Plugin documentation](/plugins/core/chart) for full details. While editing, you can open the **Documentation** tab to view chart-specific guidance without leaving the editor. ![Documentation Multi-Panel](./documentation-view.png) After selecting **KPI Chart**, give the chart an ID and select the type of data. In this example, choose **Executions**. To ensure all executions are captured, set the `field` property to `ID` and the `agg` property to `COUNT`. Optionally, set a display name or label for readability. ![KPI Chart](./kpi-chart.png) Next, add a filter for the execution data. Click **+ Add** under the data numerator section. For an execution success ratio, choose `IN` for the `type`, with `values` set to `STRING`. Add the value `SUCCESS` so the chart only considers successful executions. 
Under **Optional Properties**, set the `field` to `STATE`. ![Add Numerator](./add-numerator.png) Now that the correct data is connected to the chart, return to the `charts` No Code tab and open **Optional Properties**. Set the `displayName` and change `numberType` to `PERCENTAGE` so the chart shows a ratio rather than a flat count. Adjust the `width` to your preference — a value of `3` is recommended for this type of chart. ![Chart Options](./chart-options.png) Once configured, open the **Preview** tab to view your chart. If satisfied with the result, click **Save** and continue building additional charts. For example, you can copy the YAML generated for the KPI Success Chart, paste it into the YAML editor, and replace `SUCCESS` with `FAILED` to create a chart for the execution failure ratio. ![Chart Preview](./chart-preview.png) For more example charts, see the [Dashboards documentation page](../../09.ui/00.dashboard/index.md#create-a-new-custom-dashboard-as-code). ## Best practices ### Organize dashboards by purpose Group related charts into dashboards that serve a clear goal — for example, separate dashboards for **system health**, **execution performance**, and **user activity**. This keeps data focused and easier to interpret. ### Use consistent naming conventions Use clear, descriptive IDs for dashboards and charts. A good pattern is `team_metric_type`, such as `dataops_executions_latency` or `marketing_pipeline_health`. Consistent naming helps when searching, versioning, or exporting dashboards as code. ### Leverage YAML for advanced logic When you need to reuse queries, apply filters dynamically, or control layout programmatically, switch to YAML mode. If a chart is similar to another with a small tweak of the filter or field, copy and paste the YAML to quickly change the field rather than rebuild with No Code forms. The No Code editor keeps everything synchronized, so you can safely go back and forth without losing your structure. 
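To illustrate that copy-and-tweak workflow, the KPI chart from the walkthrough above corresponds to dashboard YAML roughly along these lines. Treat this as a sketch: the chart `type` identifiers and the nesting of the data, numerator, and chart options are approximations, so check the YAML your editor generates and the Chart Plugin documentation for the exact schema.

```yaml
# Sketch of the KPI success-ratio chart from the walkthrough above;
# type identifiers and property nesting may differ in your Kestra version.
id: executions_overview
title: Executions overview
timeWindow:
  default: P365D # last year
charts:
  - id: executions_success_ratio
    type: io.kestra.plugin.core.dashboard.chart.KPI # approximate type name
    chartOptions:
      displayName: Success ratio
      numberType: PERCENTAGE
      width: 3
    data:
      type: io.kestra.plugin.core.dashboard.data.Executions # approximate type name
      columns:
        field: ID
        agg: COUNT
      numerator:
        - type: IN
          field: STATE
          values:
            - SUCCESS
```

To create the failure-ratio variant mentioned above, copy this chart block and replace `SUCCESS` with `FAILED`.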
### Preview frequently Use the **Preview** tab to verify data bindings and chart outputs before saving. This ensures filters and aggregations are configured correctly and helps catch mismatched fields early. --- # No Code Flow Building in Kestra: Skip the YAML URL: https://kestra.io/docs/no-code/no-code-flow-building > Create and edit Kestra flows visually using the No Code flow builder, streamlining workflow development. Build flows without touching YAML. ## Design flows in the No Code editor The No Code editor lets you design Kestra flows directly in the UI using structured forms. It’s ideal for teams who want to move fast, enable non-developers, and still keep a clean handoff to code. You can switch between No Code and YAML at any time — the editor generates schema-validated YAML for you and stays in sync with topology and documentation panels.
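For instance, a flow assembled entirely in the No Code editor is saved as ordinary Kestra YAML; a minimal result might look like this (the flow ID, namespace, and message here are illustrative):

```yaml
# Minimal flow as generated by the No Code editor (identifiers are illustrative).
id: hello_no_code
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: Built in the No Code editor, stored as plain YAML
```

Because the output is plain YAML, flows built visually can be exported, versioned in Git, and edited in code later without any conversion step.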
## Why choose No Code (or combine it with code)

- **Speed & onboarding**: Start building flows without learning YAML first; power users can drop into code view whenever needed.
- **Clarity**: Live topology and documentation help you “see” the flow while you edit.
- **Consistency & governance**: UI-driven forms align with plugin schemas and validation, reducing drift.
- **No ceiling**: When you outgrow forms, switch to YAML, add files/scripts, and keep everything in one place.

:::alert{type="info"}
In addition to No Code flow building, you can avoid writing YAML from scratch by using the Kestra [AI Copilot](../../ai-tools/ai-copilot/index.md) to generate a starting point for you.
:::

## The multi-panel flow editor

- **No Code View:** Form-based editing of tasks, inputs, triggers, and flow structure. Changes auto-generate YAML in real time.
- **Flow Code View (YAML):** Full editor with autocompletion, validation, and file sidebar.
- **Topology View:** Visual structure (DAG-like) that updates as you build.
- **Documentation & Blueprints Panels:** In-context docs and ready-to-use examples.

## Quick start: first flow in No Code

1. **Create a flow** from **Flows → + Create**; confirm namespace and identifiers.
2. **Open No Code view** by selecting the view from the editor panel and add your first task (e.g., a plugin from the catalog). Browse or search the catalog; select a plugin to reveal its form fields.

![No Code Panel View](./no-code-flow-panel.png)

Use the **Documentation** panel for property descriptions and examples. With multi-panel editing, you can close, open, and reposition any view at any time. For example, below, the Slack plugin documentation is open alongside the No Code editor while the Flow Code YAML editor is closed.

![No Code Documentation View](./multi-panel.png)

3. **Configure inputs** by clicking **+ Add** in the inputs section of the No Code editor. This opens a new tab for configuring your input, with all required and optional properties available.
As you finish an input, you can close the tab or navigate back to the **No-code** tab and click **+ Add** again to create another input. If you leave the **Flow Code** YAML view open, you can watch the YAML update in real time as you add inputs. ![No Code Input Configuration](./no-code-inputs.png)

4. **Configure task properties** via forms; dynamic vs. static fields are indicated by the plugin’s schema/docs. Each task opens a No Code tab and propagates code as you select and configure properties. Property fields can even autocomplete expressions for previously configured inputs. ![No Code Task Configuration](./no-code-tasks.png)
5. **Add flow logic** (If/Switch/ForEach/Subflow) tasks to control execution paths.
6. **Add a trigger** (e.g., schedule, file-event) to automate runs. ![No Code Trigger Configuration](./no-code-trigger.png)
7. **Add additional flow components** such as [outputs](../../05.workflow-components/06.outputs/index.md), [retry](../../05.workflow-components/12.retries/index.md), [SLA](../../05.workflow-components/18.sla/index.md), [afterExecution](../../05.workflow-components/20.afterexecution/index.md), [Plugin Defaults](../../05.workflow-components/09.plugin-defaults/index.md), and every other [Kestra flow component](../../05.workflow-components/index.mdx). Everything possible with code can be done with No Code. ![Additional Flow Components](./additional-components.png)
8. **Validate & run**: save, then execute from the UI to see logs and results.

## Switching between No Code and YAML

- **Round-trip editing:** Edits in forms update YAML instantly; edits in YAML reflect back in No Code.
- **When to switch:**
  - Complex expressions or advanced plugin fields
  - Bulk edits across many tasks
  - Importing flows from repos/CI
- **Export/Share:** Use the **Actions** menu to export YAML or copy the flow.

## FAQ

- **Can I build everything No Code?** Most flows, yes; complex cases may be faster in YAML. You can mix both.
- **Do I lose control vs.
YAML?** No — the No Code editor writes standard Kestra YAML that you can export, version, and run anywhere Kestra runs.

---

# Open-Source vs. Enterprise Edition of Kestra
URL: https://kestra.io/docs/oss-vs-paid

> Compare Kestra Open-Source and Enterprise editions to choose the right solution for your orchestration, security, and scalability needs.

Understand the differences between Kestra's Open-Source and Enterprise Editions, and learn how the commercial offering supports teams running mission-critical workflows at scale.

## Choose the right Kestra edition

Kestra's Open-Source Edition provides a foundation for workflow automation — it's best suited for solo developers or small teams exploring workflow orchestration.

The [Enterprise Edition](../07.enterprise/index.mdx) adds enterprise-grade security, scalability, and governance features required by organizations managing complex workflows across multiple teams or environments. It includes advanced authentication and access controls (SSO, SCIM, and RBAC), multi-tenancy, high availability, dedicated secrets managers and storage backends per team, dedicated worker groups or on-demand remote task runners, audit logs, service accounts, apps, revision history for every resource, maintenance mode, log shipper, cluster monitoring, backup and restore, and dedicated support with SLAs. Newer safeguards include assets packaging, versioned plugins, read-only secrets, plugin allow-listing, worker isolation, and built-in flow unit tests. In short, everything you need for production deployments with strict compliance or reliability requirements is available in the Enterprise Edition.

---

## Security and Access Control

The Open-Source Edition supports basic authentication, suitable for one-person projects or small teams with shared credentials.
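For context, open-source basic authentication is a single shared credential set in the instance configuration — a minimal sketch (values are placeholders; verify the exact keys against the configuration reference for your Kestra version):

```yaml
kestra:
  server:
    basic-auth:
      enabled: true
      username: admin@company.com  # one shared login for the whole instance
      password: change-me          # placeholder — keep out of version control
```

This model works for a single trusted team but offers no per-user identity, which is where the Enterprise features below come in.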
In contrast, the Enterprise Edition has an easy way to add collaborators via [invitations](../07.enterprise/03.auth/invitations/index.md) and manage permissions at scale using [SCIM Directory Sync](../07.enterprise/03.auth/scim/index.mdx). It integrates with many identity providers via [Single Sign-On (SSO)](../07.enterprise/03.auth/sso/index.md) and **OpenID Connect (OIDC)**, simplifying user management for large teams. [Role-Based Access Control (RBAC)](../07.enterprise/03.auth/rbac/index.md) lets you define granular permissions at user, group and namespace level, e.g. restricting developer access to specific namespaces while granting auditors read-only access. [Namespace-level secrets management](../07.enterprise/02.governance/secrets/index.md) ensures that sensitive credentials stay isolated between projects. [Service accounts](../07.enterprise/03.auth/service-accounts/index.md) and [API tokens](../07.enterprise/03.auth/api-tokens/index.md) enable secure automation, such as [CI/CD pipelines](../version-control-cicd/index.mdx) deploying workflows without requiring user credentials. For organizations using external [secrets managers](../07.enterprise/02.governance/secrets-manager/index.md) such as Azure Key Vault or HashiCorp Vault, Enterprise Edition integrates directly with these systems. [SCIM directory sync](../07.enterprise/03.auth/scim/index.mdx) automates user (de)provisioning at scale, reducing administrative overhead when onboarding or offboarding team members. Enterprise-only safeguards include [read-only secrets](../07.enterprise/02.governance/read-only-secrets/index.md) for least-privilege access and [allowed plugins](../07.enterprise/02.governance/allowed-plugins/index.md) to centrally control which plugins may run. --- ## Governance and Compliance Enterprise Edition provides [audit logs](../07.enterprise/02.governance/06.audit-logs/index.md) that track every user action and resource change, which are critical in highly regulated industries. 
Logs can be automatically exported to observability platforms such as Datadog or Elasticsearch using the [Log Shipper](../07.enterprise/02.governance/logshipper/index.md).

[Multi-tenancy](../07.enterprise/02.governance/tenants/index.md) allows you to create fully isolated environments, e.g. separate tenants for specific [teams or business units](../14.best-practices/8.business-unit-separation/index.md). Each tenant can use separate secrets managers or dedicated internal storage backends (e.g., AWS S3 for Tenant A, GCS for Tenant B). [Worker Groups](../07.enterprise/04.scalability/worker-group/index.md) ensure tasks from different tenants run on separate infrastructure, reducing the risk of resource contention or cross-tenant breaches. [Worker Isolation](../07.enterprise/02.governance/worker-isolation/index.md) adds hard isolation policies when you need stricter separation. Encryption safeguards data at rest and in transit, meeting regulatory standards.

---

## Scalability and Reliability

The Open-Source Edition runs by default on a single server, which can become a bottleneck for large workloads. Enterprise Edition can use Kafka and Elasticsearch for distributed event processing, enabling horizontal scaling and high throughput. High Availability (HA) architecture eliminates single points of failure — if a worker node fails, tasks automatically reroute to healthy nodes.

[Worker Groups](../07.enterprise/04.scalability/worker-group/index.md) let you assign tasks to specialized infrastructure. For example, GPU-heavy machine learning workflows can target a worker group with NVIDIA GPUs, while ETL jobs run on cost-optimized spot instances. [Task Runners](../07.enterprise/04.scalability/task-runners/index.md) offload compute-intensive scripts on-demand to Kubernetes or cloud batch services such as Azure Batch, Google Cloud Run, or AWS ECS Fargate, preventing resource contention and making it easy to scale cost-effectively.
:::alert{type="info"}
Please note that Worker Groups are not yet available in Kestra Cloud, only in Kestra Enterprise Edition.
:::

[Maintenance Mode](../07.enterprise/05.instance/maintenance-mode/index.md) allows safe upgrades: new executions queue while in-progress tasks complete gracefully, avoiding abrupt workflow termination. [Cluster monitoring](../07.enterprise/05.instance/index.mdx) provides real-time visibility into resource usage, helping teams proactively address infrastructure bottlenecks. Additionally, using **Custom Dashboards**, you can create custom views to track specific metrics, logs, or executions.

The [Backup and Restore](../10.administrator-guide/backup-and-restore/index.md) feature eliminates the risk of data loss or corruption during upgrades, allowing you to recover from accidental deletions or system failures. [Versioned Plugins](../07.enterprise/05.instance/versioned-plugins/index.md) let you pin plugin versions per environment for safe rollouts, while the [Kill Switch](../07.enterprise/05.instance/kill-switch/index.md) can pause risky changes instantly. [Announcements](../07.enterprise/05.instance/announcements/index.md) provide in-product notifications for maintenance or policy updates.

---

## Productivity and Collaboration

[Custom Blueprints](../07.enterprise/02.governance/custom-blueprints/index.md) act as reusable workflow templates, e.g. a **standardized** data ingestion pattern that all teams can consistently adopt. **Full-text search across task runs** speeds up navigation — e.g. engineers can quickly find logs for a failed Python script without manually filtering through thousands of executions. **Centralized** namespace-level [plugin defaults](../07.enterprise/02.governance/07.namespace-management/index.md) simplify configuration.
A [namespace-wide setting](../07.enterprise/02.governance/07.namespace-management/index.md) on a root namespace might **define AWS credentials** that are automatically inherited, and optionally also enforced, by all child namespaces, eliminating redundant code and allowing admins to centrally govern secrets and plugin configurations. **Impersonation** lets admins validate permissions by temporarily assuming a user’s role, which significantly helps with troubleshooting access management issues.

[Apps](../07.enterprise/04.scalability/apps/index.md) turn workflows into user-friendly interfaces. A finance team can build a self-service tool for expense approvals, where non-technical stakeholders submit requests via a form. Approved requests automatically trigger downstream tasks to process payments. [Assets](../07.enterprise/02.governance/01.assets/index.md) package reusable files and artifacts alongside flows, and [Unit Tests](../07.enterprise/02.governance/unit-tests/index.md) let teams validate flows early to prevent regressions.

---

## Support and Services

Enterprise Edition includes **SLAs with guaranteed response times** for support tickets, which is critical for teams running 24/7 operations. Onboarding support helps customize Kestra to your stack and deployment requirements. Enterprise customers’ feature requests are prioritized over those from open-source users, and they get early access to beta features and roadmap previews, allowing teams to plan upgrades around upcoming capabilities. The dedicated customer portal provides direct access to Kestra’s engineering team for architecture reviews or best practices.
--- ## When to Choose Enterprise Edition **Stick with Open-Source if:** - You’re a solo developer - You’re prototyping or running non-critical workflows - Your team has minimal compliance requirements - You can manage secrets and access controls manually **Upgrade to Enterprise if:** - Multiple users or teams share the same Kestra instance - Workflows handle sensitive data (PII, financial records) - Downtime would impact business operations - You need to meet audit or regulatory standards --- ## How Upgrading Works Switching to Enterprise involves adding a license key to your configuration and restarting Kestra — no code changes required. All existing workflows and plugins remain compatible. For hybrid setups, you can run Open-Source and Enterprise instances side-by-side during transition periods. --- # Performance in Kestra: Benchmarks and Tuning URL: https://kestra.io/docs/performance > Overview of Kestra performance resources, including benchmarks and tuning guides to help you optimize your orchestration platform. import ChildCard from "~/components/docs/ChildCard.astro" Learn how to measure and optimize Kestra’s performance. The Benchmark guide shows real-world execution metrics for both the Open Source and Enterprise editions, while the Performance Tuning guide explains configuration options for the Worker, JDBC backend, and Kafka backend to balance throughput, latency, and resource usage. ## Measure and tune Kestra performance --- # Benchmarks: Orchestration Throughput & Latency URL: https://kestra.io/docs/performance/benchmark > View performance benchmarks for Kestra's orchestration throughput and latency across Open Source and Enterprise editions. Kestra is an orchestration platform: you define a flow, and Kestra orchestrates it. Flows can range from lightweight tasks running in milliseconds to complex scripts in containers that run for tens of minutes. 
## See Kestra orchestration benchmark results This benchmark focuses on **orchestration performance**, including dispatching to the Kestra Worker, rather than workload execution, which varies by use case. To isolate orchestration performance, we use workflows with fast tasks, such as: - `io.kestra.plugin.core.log.Log` — logs a single message. - `io.kestra.plugin.core.output.OutputValues` — produces a single output (simulating a data-oriented workflow). ## Test environment Benchmarks were run on a Google Cloud **e2-standard-4** VM (4 vCPUs, 16 GB RAM) with two setups: 1. **Kestra Open Source (OSS)** — Postgres 16 backend (4 vCPUs, 16 GB RAM). Database runs remotely to simulate production. 2. **Kestra Enterprise Edition (EE)** — Kafka backend (4 vCPUs, 16 GB RAM). Kafka and Elasticsearch run on separate VMs. :::alert{type="info"} Benchmark results are for Kestra 1.2.0. ::: --- ## Benchmark 1 -- simple flow **Description** Triggered by a Webhook. Contains two tasks: 1. Outputs a variable. 2. Logs that variable. 
```yaml
id: benchmark01
namespace: benchmarks

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: benchmark

inputs:
  - id: name
    type: STRING
    defaults: World

tasks:
  - id: concatenate
    type: io.kestra.plugin.core.output.OutputValues
    values:
      message: Hello {{ inputs.name }}
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.concatenate.values.message }}"
```

**Results for Kestra OSS**

![Kestra OSS - Benchmark01](./bench01-OSS.png "Kestra OSS Benchmark01 results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 250 | 500 | 0.17 |
| 500 | 1000 | 0.17 |
| 1000 | 2000 | 0.19 |
| 1500 | 3000 | 0.26 |
| 2000 | 4000 | 2.5 |

**Results for Kestra EE**

![Kestra EE - Benchmark01](./bench01-EE.png "Kestra EE Benchmark01 results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 250 | 500 | 0.24 |
| 500 | 1000 | 0.25 |
| 1000 | 2000 | 0.26 |
| 1500 | 3000 | 0.29 |
| 2000 | 4000 | 0.28 |
| 2500 | 5000 | 0.29 |
| 3000 | 6000 | 0.32 |
| 3500 | 7000 | 1.17 |
| 4000 | 8000 | 1.3 |
| 4500 | 9000 | 1.9 |

**Key takeaways**

- At 250 executions/min (500 tasks/min), execution latency is approximately 170ms — similar to single execution time.
- Kestra OSS (JDBC backend) sustains up to 1500 executions/min (3000 tasks/min) with an execution duration of less than 1s, which is what we could realistically target for such a workflow.
- Kestra EE (Kafka backend) sustains up to 4000 executions/min (8000 tasks/min).
- Kestra EE has slightly higher latency at low throughput but supports much higher throughput than Kestra OSS.

## Benchmark 2 -- complex flow

**Description**

Triggered by a Webhook. Contains 5 `If` tasks with 2 subtasks each (only one executes per run). This creates 10 task runs per execution and stresses the Executor.
```yaml
id: benchmark02
namespace: benchmarks

inputs:
  - id: condition
    type: BOOL
    defaults: true

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: benchmark

tasks:
  - id: if1
    type: io.kestra.plugin.core.flow.If
    condition: "{{inputs.condition}}"
    then:
      - id: hello-true-1
        type: io.kestra.plugin.core.log.Log
        message: Hello True 1
    else:
      - id: hello-false-1
        type: io.kestra.plugin.core.log.Log
        message: Hello False 1
  - id: if2
    type: io.kestra.plugin.core.flow.If
    condition: "{{inputs.condition}}"
    then:
      - id: hello-true-2
        type: io.kestra.plugin.core.log.Log
        message: Hello True 2
    else:
      - id: hello-false-2
        type: io.kestra.plugin.core.log.Log
        message: Hello False 2
  - id: if3
    type: io.kestra.plugin.core.flow.If
    condition: "{{inputs.condition}}"
    then:
      - id: hello-true-3
        type: io.kestra.plugin.core.log.Log
        message: Hello True 3
    else:
      - id: hello-false-3
        type: io.kestra.plugin.core.log.Log
        message: Hello False 3
  - id: if4
    type: io.kestra.plugin.core.flow.If
    condition: "{{inputs.condition}}"
    then:
      - id: hello-true-4
        type: io.kestra.plugin.core.log.Log
        message: Hello True 4
    else:
      - id: hello-false-4
        type: io.kestra.plugin.core.log.Log
        message: Hello False 4
  - id: if5
    type: io.kestra.plugin.core.flow.If
    condition: "{{inputs.condition}}"
    then:
      - id: hello-true-5
        type: io.kestra.plugin.core.log.Log
        message: Hello True 5
    else:
      - id: hello-false-5
        type: io.kestra.plugin.core.log.Log
        message: Hello False 5
```

**Results for Kestra OSS**

![Kestra OSS - Benchmark02](./bench02-OSS.png "Kestra OSS Benchmark02 results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 100 | 1000 | 0.7 |
| 200 | 2000 | 0.7 |
| 300 | 3000 | 0.8 |
| 400 | 4000 | 1.5 |
| 500 | 5000 | 15 |

**Results for Kestra EE**

![Kestra EE - Benchmark02](./bench02-EE.png "Kestra EE Benchmark02 results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 100 | 1000 | 1.3 |
| 200 | 2000 | 1.4 |
| 300 | 3000 | 1.4 |
| 400 | 4000 | 1.5 |
| 500 | 5000 | 1.7 |
| 600 | 6000 | 1.8 |
| 700 | 7000 | 2.3 |
| 800 | 8000 | 5.3 |

**Key takeaways**

- At 100 executions/min (1000 tasks/min), execution latency is approximately 700ms — similar to single execution time.
- Kestra OSS (JDBC backend) sustains up to 400 executions/min (4000 tasks/min) with an execution duration of less than 3s, which is what we could realistically target for such a workflow.
- Kestra EE (Kafka backend) sustains up to 700 executions/min (7000 tasks/min).
- The Kestra Executor's processing capability is independent of the type of tasks it processes; the number of tasks per minute sustained in this benchmark is the same as in the first benchmark.

## Benchmark 3 -- large `ForEach` loop

**Description**

Executes 100 iterations of a `ForEach` loop with unbounded concurrency.

```yaml
id: benchmark03
namespace: benchmarks

tasks:
  - id: foreach
    type: io.kestra.plugin.core.flow.ForEach
    values: "{{range(1, 100)}}"
    concurrencyLimit: 0
    tasks:
      - id: output
        type: io.kestra.plugin.core.output.OutputValues
        values:
          some: value
```

**Observations**

The `ForEach` task is executed on each iteration, resulting in 200 task runs. On average, the execution time on the OSS JDBC backend is 5s, that is, about 40 tasks/s or 3600 tasks/min, which is on par with the throughput of the previous benchmarks.

On the EE Kafka backend, the average execution time is 8s, that is, about 25 tasks/s or 1500 tasks/min. This is lower than the throughput in the previous benchmarks because a single flow with many task runs creates a large execution context, which is costly to orchestrate.

## Benchmark 4 -- realtime trigger with JSON transformation

**Description**

Consumes messages from a Kafka topic in real time, transforms them with the JSONata `TransformValue` task, and outputs new data in the `OutputValues` task.
This triples the size of the data in the execution context.

```yaml
id: benchmark04
namespace: benchmarks

triggers:
  - id: kafka-logs
    type: io.kestra.plugin.kafka.RealtimeTrigger
    topic: test_kestra
    properties:
      bootstrap.servers: localhost:9092
    groupId: myGroup

tasks:
  - id: transform
    type: io.kestra.plugin.transform.jsonata.TransformValue
    from: "{{trigger.value}}"
    expression: |
      $.{
        "order_id": order_id,
        "customer_name": first_name & ' ' & last_name,
        "address": address.city & ', ' & address.country,
        "total_price": $sum(items.(quantity * price_per_unit))
      }
  - id: hello
    type: io.kestra.plugin.core.output.OutputValues
    values:
      log: "{{outputs.transform.value}}"
```

Benchmarked with:

- Small messages (~1.6 KB)
- Medium messages (~16 KB)
- Large messages (~160 KB)

**Results for Kestra OSS**

With 1.6 KB small-sized messages:

![Kestra OSS - Benchmark04 - Small messages](./bench04-OSS-small.png "Kestra OSS Benchmark04 with small messages results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 500 | 1000 | 0.19 |
| 1000 | 2000 | 0.20 |
| 1500 | 3000 | 0.31 |
| 2000 | 4000 | 5.3 |

With 16 KB medium-sized messages:

![Kestra OSS - Benchmark04 - Medium messages](./bench04-OSS-medium.png "Kestra OSS Benchmark04 with medium messages results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 500 | 1000 | 0.21 |
| 750 | 1500 | 0.23 |
| 1000 | 2000 | 0.27 |
| 1250 | 2500 | 0.36 |
| 1500 | 3000 | 4.6 |

With 160 KB large-sized messages:

![Kestra OSS - Benchmark04 - Big messages](./bench04-OSS-big.png "Kestra OSS Benchmark04 with big messages results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 250 | 500 | 0.48 |
| 375 | 750 | 0.77 |
| 500 | 1000 | 15 |
| 625 | 1250 | 25 |

**Results for Kestra EE**

With 1.6 KB small-sized messages:

![Kestra EE -
Benchmark04 - Small messages](./bench04-EE-small.png "Kestra EE Benchmark04 with small messages results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 500 | 1000 | 0.26 |
| 1000 | 2000 | 0.28 |
| 1500 | 3000 | 0.30 |
| 2000 | 4000 | 0.29 |
| 2500 | 5000 | 0.31 |
| 3000 | 6000 | 0.36 |
| 3500 | 7000 | 0.48 |
| 4000 | 8000 | 0.48 |

With 16 KB medium-sized messages:

![Kestra EE - Benchmark04 - Medium messages](./bench04-EE-medium.png "Kestra EE Benchmark04 with medium messages results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 500 | 1000 | 0.26 |
| 750 | 1500 | 0.28 |
| 1000 | 2000 | 0.30 |
| 1250 | 2500 | 0.31 |
| 1500 | 3000 | 0.31 |
| 1750 | 3500 | 0.36 |
| 2000 | 4000 | 0.36 |
| 2250 | 4500 | 0.38 |
| 2500 | 5000 | 0.61 |

With 160 KB large-sized messages:

![Kestra EE - Benchmark04 - Big messages](./bench04-EE-big.png "Kestra EE Benchmark04 with big messages results")

| Executions (per minute) | Tasks (per minute) | Execution Latency (in seconds) |
|:--|:--|:--|
| 250 | 500 | 0.28 |
| 375 | 750 | 0.36 |
| 500 | 1000 | 0.29 |
| 625 | 1250 | 0.33 |
| 750 | 1500 | 0.97 |

**Key takeaways**

- Small messages: similar performance to Benchmark 1, which is expected.
- Medium messages: Kestra sustains up to 1250 executions/min (2500 tasks/min) with an execution duration of less than 1s, which is what we could realistically target for such a workflow.
- Large messages: performance starts to degrade significantly, which is expected due to the increased Worker workload and the Executor's sensitivity to execution size.
- EE sustains higher throughput than Kestra OSS (8000 tasks/min vs. 3000 tasks/min) in real-time scenarios for small messages.
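You can reproduce a rough version of these latency measurements against your own instance. The sketch below fires Benchmark 1's webhook trigger in a loop and reports latency percentiles; the endpoint path follows Kestra's webhook URL pattern (`/api/v1/executions/webhook/{namespace}/{flowId}/{key}`), but the host, any tenant prefix, and authentication are assumptions — adjust them for your deployment, or use the official Benchmarks repository on GitHub for the full harness.

```python
import time
import urllib.request


def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (seconds)."""
    ranked = sorted(samples)
    index = max(0, min(len(ranked) - 1, round(p / 100 * len(ranked)) - 1))
    return ranked[index]


def fire_webhook(base_url, namespace, flow_id, key):
    """POST to the flow's webhook trigger and return the round-trip time in seconds."""
    url = f"{base_url}/api/v1/executions/webhook/{namespace}/{flow_id}/{key}"
    request = urllib.request.Request(url, method="POST")
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # drain the body so the connection completes
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical local instance running the benchmark01 flow shown above.
    samples = [
        fire_webhook("http://localhost:8080", "benchmarks", "benchmark01", "benchmark")
        for _ in range(100)
    ]
    print(f"p50={percentile(samples, 50):.3f}s p99={percentile(samples, 99):.3f}s")
```

Note this measures API round-trip time, not end-to-end execution latency as reported in the tables above, which is measured inside Kestra; it is still useful for spotting saturation trends as you raise the request rate.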
## Conclusion

Kestra is a platform, not just a framework. It provides orchestration plus logging, metrics, retries, SLAs, error handling, governance, and observability. While this adds overhead compared to lightweight tools, performance is balanced with feature richness.

Kestra is designed for high performance in workflow orchestration and task dispatching, ensuring minimal time spent in the orchestrator and more time in actual task execution. Thanks to continuous performance tuning by the engineering team, Kestra remains among the fastest workflow orchestrators, release after release.

Kestra is built to scale horizontally. When a use case demands it, add Executor/Worker nodes to increase throughput.

:::alert{type="info"}
This page is updated with each new Kestra release. For more details on how to run this benchmark yourself, refer to our [Benchmarks](https://github.com/kestra-io/benchmarks) repository on GitHub.
:::

---

# Performance Tuning in Kestra: Workers, JDBC, Kafka
URL: https://kestra.io/docs/performance/performance-tuning

> Tips and best practices for tuning Kestra performance, covering Worker configuration, JDBC backend, and Kafka settings.

Not all workloads are the same, so Kestra is configured to balance throughput (the ability to process many executions in parallel) and latency (the ability to process executions quickly) without using excessive resources.

## Tune Kestra for throughput and latency

This page introduces performance tuning options for the Kestra orchestrator. Each comes with trade-offs, so make sure you understand them before applying them. Before you read on, please familiarize yourself with the [Kestra architecture](../../08.architecture).

## Worker

The [Worker](../../08.architecture/02.server-components/index.md#worker) executes your tasks, and tuning it depends on the type of workloads you run. The most useful configuration is the number of Worker threads, which defaults to 8 times the number of available cores.
You can increase it to boost parallel task execution. Depending on how you start Kestra, use one of the following methods:

- If using the standalone server, set `--worker-thread` on the standalone command line.
- If using separate component processes, set `--thread` on the Worker command line.
- If using our Helm chart, set `deployments.worker.workerThreads` in the values.

## JDBC backend

The JDBC backend is composed of two main components: the JDBC queue and the JDBC executor.

### JDBC queue

The JDBC queue is the most performance-critical component of the JDBC backend. It can be configured with the following options:

```yaml
kestra:
  jdbc:
    queues:
      minPollInterval: 25ms
      maxPollInterval: 500ms
      pollSwitchInterval: 60s
      pollSize: 100
```

The JDBC queue polls a database table for new messages; by default, this poll happens every 25 ms with a limit of 100 records. This default setup provides low latency (25 ms) and batching for reasonable throughput. To avoid wasting resources when there are no messages, the queue progressively switches to polling every 500 ms if no messages are processed for 60 seconds.

You can configure:

- `minPollInterval`: Lowering it reduces latency but increases database load.
- `maxPollInterval`: Lowering it reduces latency when a workflow is executed on an idle instance, but increases database load while the instance is idle.
- `pollSwitchInterval`: Increasing this value helps prevent the queue from entering an idle state too quickly, ensuring that new executions are picked up promptly when they arrive.
- `pollSize`: Lowering it may reduce latency but also reduces throughput. Increasing it does the opposite: it may increase latency, but it also increases throughput.

### JDBC executor

The performance of the JDBC executor is tightly coupled to the performance of the JDBC queue.
You can configure the number of threads used by the Executor with the following configuration:

```yaml
kestra:
  jdbc:
    executor:
      thread-count: 0
```

By default, it is 0, which means the number of available CPUs. Two thread pools are started, effectively using 2 times the number of available CPUs by default.

## Kafka backend

By default, Kestra sets the Kafka partition count to 16 with a replication factor of 1. Because Kafka is not the primary storage, increasing the replication factor is optional; all data can be re-created from Elasticsearch if needed. Note that because the partition count is 16, starting more than 16 instances of a Kestra component (16 Workers, 16 Executors, etc.) provides no additional benefit. If you plan to exceed this, increase the partition count.

This is the default topic configuration:

```yaml
kestra:
  kafka:
    defaults:
      topic:
        partitions: 16
        replication-factor: 1
```

You can configure any Kafka producer and consumer properties recommended in standard Kafka application tuning to improve performance. They are configurable via `kestra.kafka.defaults.consumer.properties` and `kestra.kafka.defaults.producer.properties` for the standard consumer and producer properties, and `kestra.kafka.defaults.stream.properties` for Kafka Streams.

The most impactful properties are `poll.ms` and `commit.interval.ms`, which Kestra reduces from the Kafka default of 100 ms to 25 ms. You can decrease them further at the cost of more resources used by the broker.

This is the default properties configuration:

```yaml
kestra:
  kafka:
    defaults:
      stream:
        properties:
          poll.ms: 25 # Kafka default: 100
          commit.interval.ms: 25 # Kafka default: 100
```

You can also set those properties on a per-topic basis.

You can also configure the number of threads used by Kafka Streams (`num.stream.threads`), which is 1 by default. You can increase it for better resource utilization if you process a high number of small executions.
However, if you process many large executions, increasing it can raise memory usage; benchmark carefully before increasing it.

```yaml
kestra:
  kafka:
    stream:
      properties:
        num.stream.threads: 4 # Default: 1
```

---

# Size & Scale Kestra: Executors, Workers, Schedulers
URL: https://kestra.io/docs/performance/sizing-and-scaling-infrastructure

> Guidance on sizing and scaling your Kestra infrastructure, including Executors, Workers, and Schedulers, for optimal performance.

Kestra is designed to scale from lightweight workflows to enterprise-scale orchestration with thousands of task runs per minute. Choosing the right infrastructure depends on your workload patterns, execution volume, and latency requirements. This page provides practical guidance on how to size your Kestra deployment, how many Executors and Workers you need, and how to scale and tune performance over time.

## Size and scale your Kestra deployment

## Core concepts

Before diving into numbers, it helps to understand how Kestra executes work:

1. **Executors** orchestrate workflows: they run workflow logic via flowable tasks, delegate tasks to the right Worker nodes, and manage execution state and concurrency.
2. **Workers** run the tasks themselves: from lightweight logging to long-running scripts or container workloads.
3. **Schedulers** handle triggers such as scheduled events, webhook calls, or polling external resources.
4. **Webservers** provide the API and UI; they handle user interactions, including processing execution inputs.

Performance depends on balancing **throughput** (task runs per minute) and **latency** (how quickly executions start and complete) given your infrastructure.

---

## Baseline infrastructure recommendations

Start with the CPU and memory allocation listed below based on your expected number of task runs per minute. As your workload grows, you can later scale vertically (add more CPU/RAM per node) or horizontally (add more nodes).
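Horizontal scaling with the official Helm chart can be sketched as values overrides. The `workerThreads` key is mentioned in the Performance Tuning guide; the `replicaCount` keys are assumptions about the chart layout — verify them against the values file of your chart version:

```yaml
# Illustrative Helm values sketch — keys other than workerThreads are assumed.
deployments:
  executor:
    replicaCount: 2     # horizontal scaling: more Executor nodes
  worker:
    replicaCount: 3     # more Worker nodes for task throughput
    workerThreads: 64   # vertical tuning: threads per Worker
```

The same vertical/horizontal trade-off applies to any deployment method: raise per-node threads and CPU/RAM first, then add nodes when a single node saturates.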
### Up to 1,000 task runs/minute

The table below provides a baseline for a typical Kestra deployment handling up to 1,000 task runs per minute.

| Component | CPU | RAM | Notes |
|------------|-----|-----|-------|
| Webserver | 1 | 2 GB | Add CPU for heavy API usage or large file uploads. |
| Executor | 2 | 2 GB | Add +2 CPU / +2 GB per additional 1,000 task runs/min. |
| Scheduler | 1 | 2 GB | Add +1 CPU / +1 GB per additional 1,000 triggers. |
| Worker | 2 | 4 GB | Heavily workload-dependent. Use the [Worker Calculator](#worker-sizing-methodology). |

### More than 1,000 task runs/minute

For workloads exceeding 1,000 task runs per minute, increase Executor and Worker resources as follows:

| Component | CPU | RAM | Notes |
|------------|-----|-----|-------|
| Webserver | 1 | 2 GB | Scale for traffic. |
| Executor | 4 | 4 GB | Add +2 CPU / +2 GB per additional 1,000 task runs/min. |
| Scheduler | 1 | 2 GB | Scale with triggers. |
| Worker | 4 | 8 GB | Scale based on the [Worker Calculator](#worker-sizing-methodology). |

:::alert{type="info"}
**Extra Tips**:
1. It is best practice to provision at least one extra node per component for maintenance and one extra node for high availability.
2. These guidelines are starting points. Monitor your actual usage and adjust based on performance metrics.
3. Use the above guidelines in conjunction with the [Worker Calculator](https://v0-worker-calculator.vercel.app/) to determine the right number of Workers for your workload.
:::

**Why don't we recommend exact VM instance types?** Compute offerings vary across cloud providers, on-prem environments, and each company's internal requirements, so we focus on simple CPU/memory recommendations you can apply wherever you deploy Kestra.

---

## How many Executors and Workers do I need?

### Workers

Workers scale with workload type and concurrency:

- **Lightweight tasks** such as simple API calls triggering remote jobs or variable manipulation → high concurrency per Worker.
- **Heavy tasks** such as complex scripts, running containers, and heavy data transformations → fewer concurrent tasks per Worker.

👉 Use the [Worker Calculator](#worker-sizing-methodology) to compute the number of Workers based on the task runs, triggers, and average task duration you expect.

### Executors

Executors scale with [orchestration load](https://kestra.io/docs/performance/benchmark):

- **Simple flows (few tasks)**: ~1500 executions/min with the JDBC backend, ~2000 with the Kafka backend before latency rises.
- **Complex flows (many tasks per execution)**: throughput is lower (300–400 exec/min), as each execution spawns many task runs.

**Add Executors when**:

- Execution latency exceeds a few seconds under normal load.
- You consistently exceed 1,000 task runs/min per 2 vCPUs allocated.

#### Kestra Executor throughput factors

Below are key factors affecting Executor throughput, with examples from our [Benchmarks](https://kestra.io/docs/performance/benchmark):

| **Factor** | **Effect on Throughput** | **Examples from Benchmarks** |
|---|---|---|
| **Number of tasks per execution** | More tasks per execution increase orchestration overhead and reduce the sustainable execution rate. | Benchmark 2: with 10 task runs/execution, OSS sustains ~300 exec/min, EE ~400 exec/min. |
| **Execution context size** | Larger payloads or many task runs per execution make orchestration heavier and lower throughput. | Benchmark 3: `ForEach` loop with 200 tasks sustains ~20 task runs/s (10–11s total). Benchmark 4: large 160 KB messages cause latency spikes (OSS: 34s at 625 exec/min). |
| **Execution rate (load)** | A higher incoming execution rate raises latency once system limits are reached. | Benchmark 1 (simple flow, OSS): latency jumps from 0.9s at 1500 exec/min to 24s at 2000 exec/min. |
| **System resources** | Fixed CPU/RAM constrain throughput; more resources or horizontal scaling (extra Executors/Workers) increase capacity. | All benchmarks run on 4 vCPUs, 16 GB RAM. Scaling beyond this would allow higher sustained throughput. |
| **Concurrency & scheduling** | Unbounded concurrency leads to orchestration bottlenecks; limiting concurrency helps control overhead. | Benchmark 3: unbounded `ForEach` (100 concurrent iterations) slows orchestration to ~20 task runs/s. |
| **Backend type** | The Kafka backend (EE) sustains higher execution rates before latency increases; Postgres (EE & OSS) saturates earlier. However, the differences aren't dramatic — we recommend JDBC Postgres as a baseline backend. | Benchmark 1: OSS sustains ~1500 exec/min (<1s latency), EE sustains ~2000 exec/min (<0.5s latency). |

#### Improving Executor throughput

To improve Executor throughput:

- Keep **tasks per execution** low where possible; e.g., a flow with 5 tasks delivers higher throughput than a flow with 10 tasks. Use subflows to break up complex flows.
- Watch **execution context size**: avoid passing large raw payloads between tasks, use external storage (e.g. `store: true` on tasks), and avoid large loops in a single flow.
- Apply **concurrency limits** in your flow configuration to avoid orchestration bottlenecks.
- Scale horizontally with more Executors/Workers for higher loads.

---

## Backend considerations

- **JDBC/Postgres backend (Enterprise and OSS)**: simpler to operate, with low latency for up to ~1,000 task runs/min.
[Performance tuning](../../performance/performance-tuning/index.md) involves adjusting JDBC queue polling intervals and executor threads beyond scaling the infrastructure.
- **Kafka backend (Enterprise)**: required for higher throughput, real-time triggers, and scaling beyond ~2,000 task runs/min. Ensure enough partitions are allocated (≥ number of Executors/Workers) for full parallelism.

---

## Scaling the Webserver for many UI users

When a large number of business users access the Kestra UI — for example to trigger flows manually, monitor executions, or browse logs — the **Webserver is rarely the bottleneck**. The Webserver is a stateless component that serves the UI and REST API. It can be scaled horizontally by simply adding more replicas behind a load balancer.

### Estimating concurrent users

Not all registered users are active at the same time. A common rule of thumb for internal tools is a 10:1 to 100:1 ratio between total users and concurrent users:

| Total users | Estimated concurrent users |
|-------------|---------------------------|
| 100 | 1 – 10 |
| 800 | 8 – 80 |
| 5,000 | 50 – 500 |

A single Webserver instance (1 vCPU / 2 GB RAM) can comfortably handle **dozens of concurrent users**. For larger deployments or high API usage (bulk trigger calls, large file uploads), add Webserver replicas.

### The real bottleneck: the database

When many users simultaneously browse dashboards, execution lists, or large log outputs, **the database becomes the primary concern**, not the Webserver itself. Each page load in the UI can trigger several queries against your execution and log tables, which grow continuously over time.

To keep the database healthy under this type of usage:

- **Purge execution history regularly**: use [purge tasks](../../administrator-guide/purge/index.md) to delete old executions, logs, and storage files. At high throughput, execution data can accumulate quickly — terabytes per year is not uncommon.
- **Reduce the default dashboard time range**: shorter default periods (e.g. last 24h instead of last 7 days) reduce the volume of data scanned on each dashboard load.
- **Monitor slow queries**: track query latency on your database to identify execution or log queries that benefit from index tuning or data retention changes.

:::alert{type="info"}
If you are using a managed Postgres instance (e.g. Cloud SQL on GCP), ensure that your instance tier is sized to handle the read load from concurrent UI users in addition to the write load from flow executions.
:::

---

## Scaling and performance tuning

### When to scale

- **Executors**: scale when orchestration latency grows with load.
- **Workers**: scale when task runs start lagging or queue for too long.
- **Webservers**: scale with API/UI traffic, especially if handling large file inputs and trigger payloads.
- **Schedulers**: scale with the number of triggers and their frequency.

### Tuning options

- **JDBC backend**: adjust `minPollInterval`, `maxPollInterval`, and `pollSize` to trade off latency vs. DB load.
- **Executor threads**: increase beyond the default (0.5 × CPU cores) to raise concurrency.
- **Kafka backend**: tune `poll.ms` and `commit.interval.ms` for lower latency at the cost of broker load.
- **Worker threads**: set based on workload (4–16 threads per core).

More details in our [Performance Tuning guide](../../performance/performance-tuning/index.md).

---

## Worker sizing methodology

Use this formula (applied in the [Worker Calculator](https://v0-worker-calculator.vercel.app/)):

- `Total Threads Needed = Realtime Triggers + Polling Triggers + Task Execution`
- `Workers Needed = ⌈Total Threads ÷ Threads per Worker⌉`

Where:

- **Realtime triggers**: 1 thread is needed for each realtime trigger.
- **Polling triggers**: depends on duration × frequency.
- **Task execution**: depends on the number of task runs per day, average duration, and active hours.
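The formula above can be sketched in Python. This is a minimal sketch: the parameter names and the arrival-rate × duration treatment of task-execution threads are my assumptions, and the official Worker Calculator may model polling triggers in more detail:

```python
import math

def workers_needed(realtime_triggers: int, polling_trigger_threads: int,
                   daily_task_runs: int, avg_task_seconds: float,
                   active_hours: float, threads_per_worker: int,
                   buffer: float = 0.25) -> int:
    """Sketch of: Total Threads = Realtime + Polling + Task Execution,
    Workers = ceil(Total Threads / Threads per Worker), plus a 20-30%
    production buffer. Parameter names are illustrative assumptions."""
    # Each realtime trigger permanently occupies one thread.
    realtime_threads = realtime_triggers
    # Task-execution threads ~= arrival rate x average task duration,
    # concentrated into the active hours rather than averaged over 24h.
    runs_per_second = daily_task_runs / (active_hours * 3600)
    task_threads = math.ceil(runs_per_second * avg_task_seconds)
    total_threads = realtime_threads + polling_trigger_threads + task_threads
    workers = math.ceil(total_threads / threads_per_worker)
    return math.ceil(workers * (1 + buffer))
```

For example, 100,000 task runs concentrated in a 6-hour window with a 5-second average duration, 2 polling-trigger threads, and 16 threads per Worker yields 3 Workers after the buffer.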
:::alert{type="info"}
**Rule of thumb**: Start with the calculated number of Workers, then add a **20–30% buffer** for production workloads.
:::

---

## Best practices for long-term performance

- **Benchmark early**: Test flows with representative workloads using our [Benchmarks](https://kestra.io/docs/performance/benchmark). Refer to the README in the [Benchmarks repo](https://github.com/kestra-io/benchmarks) for setup instructions.
- **Monitor resource usage**: Track CPU, memory, and thread utilization. Scale before bottlenecks appear.
- **Account for data growth**: At >1,000 task runs/min, **Postgres storage can grow quickly** (terabytes per year). **Purge execution history regularly**.
- **Plan for peaks**: Use active hours in the Worker Calculator to size for peak load periods rather than averaging across a full day.

  **Example:** Suppose you expect **120,000 task runs per day**, but most of them (100,000) occur during a **6-hour nightly ETL window**.

  - If you divide 120,000 task runs by 24 hours, the calculator would estimate **~5,000 task runs/hour** → leading to a modest Worker requirement.
  - If you set Active Hours to 6, the calculator will distribute those 100,000 tasks across just 6 hours → **~16,600 task runs/hour**, which means **3× more Workers** are required to handle the load without queuing.
  - Use **Active Hours** to check whether your infrastructure can absorb **spikes** when they actually happen, instead of under-estimating based on averages.

- **Tune cautiously**: Each tuning option has trade-offs; always validate in staging before applying to production.
- **Maintain High Availability (HA)**: Run at least two nodes per component (Webserver, Executor, Worker, Scheduler).

:::alert{type="info"}
🔧 **Why no separate Indexer service is needed**: each Webserver replica has an Indexer component running as a background process. Adding a second Webserver (for HA) also **doubles indexing throughput** without introducing a new component to deploy and manage.
This reduces complexity while still providing high availability and adequate indexing throughput for most workloads.
:::

---

## Quick reference checklist

- Start with baseline sizing based on task runs/minute. Use **1,000 task runs/min** as a key threshold.
- Use the [Worker Calculator](https://v0-worker-calculator.vercel.app/) to estimate the required number of Workers.
- Scale Executors with orchestration load (Flowable tasks).
- Scale Workers with task execution load (Runnable tasks).
- Monitor latency, throughput, and resource usage continuously.
- Add a 20–30% capacity buffer in production.
- Ensure HA by running ≥2 nodes per component.

With these guidelines, you can right-size Kestra for your workload today and scale confidently as your orchestration needs grow.

---

# Plugin Developer Guide: Build & Publish in Kestra
URL: https://kestra.io/docs/plugin-developer-guide
> Comprehensive guide for developers to build, test, and publish custom plugins for Kestra.

import ChildCard from "~/components/docs/ChildCard.astro"

Browse [Kestra's integrations](/plugins) and learn how to create your own plugins.

## Build, test, and publish Kestra plugins

Plugins are the building blocks of Kestra's tasks and triggers. They encompass components interacting with external systems and performing the actual work in your flows. Kestra comes prepackaged with hundreds of [plugins](/plugins), and you can also develop your own custom plugins.

To integrate with your internal systems and processes, you can build custom plugins. If you think it could be useful to others, consider contributing your plugin to our open-source community.

---

# Develop a Kestra Condition Plugin
URL: https://kestra.io/docs/plugin-developer-guide/condition
> Develop custom Condition plugins for Kestra to control flow execution logic based on specific criteria.

Here is how you can develop a new [Condition](../../05.workflow-components/07.triggers/index.mdx#conditions).
## Build a condition plugin for Kestra

:::collapse{title="Here is a simple condition example that validates the current flow:"}
```java
@SuperBuilder
@ToString
@EqualsAndHashCode
@Getter
@NoArgsConstructor
@Schema(
    title = "Condition for a specific flow"
)
@Plugin(
    examples = {
        @Example(
            full = true,
            code = {
                "- conditions:",
                "  - type: io.kestra.plugin.core.condition.FlowCondition",
                "    namespace: company.team",
                "    flowId: my-current-flow"
            }
        )
    }
)
public class FlowCondition extends Condition {
    @NotNull
    @Schema(title = "The namespace of the flow")
    public String namespace;

    @NotNull
    @Schema(title = "The flow ID")
    public String flowId;

    @Override
    public boolean test(ConditionContext conditionContext) {
        return conditionContext.getFlow().getNamespace().equals(this.namespace)
            && conditionContext.getFlow().getId().equals(this.flowId);
    }
}
```
:::

You just need to extend `Condition` and implement the `boolean test(ConditionContext conditionContext)` method. Your condition can declare any properties, just like a task; validation and documentation work the same way. The `test` method receives a `ConditionContext` that exposes:

- `conditionContext.getFlow()`: the current flow.
- `conditionContext.getExecution()`: the current execution, which can be null for [Triggers](../04.trigger/index.md).
- `conditionContext.getRunContext()`: a RunContext to render your properties.

This method must return a boolean to validate the condition.

## Documentation

Remember to document your conditions. For this, we provide a set of annotations explained in the [Document each plugin](../07.document/index.md) section.

---

# Plugin Contribution Guidelines for Kestra
URL: https://kestra.io/docs/plugin-developer-guide/contribution-guidelines
> Guidelines for contributing to Kestra plugins. Covers PR rules, code quality, HTTP and JSON conventions, test requirements, and how to add new plugins.

This page outlines the guidelines to follow when contributing to **Kestra plugins**.
It helps ensure contributions are:

- easy to review
- easy to QA
- consistent with Kestra conventions
- safe and maintainable over time

---

## General Guidelines

Follow these baseline rules for every pull request.

- PR title and commits follow [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/).
- Add a `closes #ISSUE_ID` or `fixes #ISSUE_ID` in the description if the PR relates to an open issue.
- Documentation is updated (plugin docs from `@Schema` for properties and outputs, `@Plugin` with examples, a `README.md` file with basic knowledge and specifics).
- Setup instructions are included if needed (API keys, accounts, etc.).
- Prefix all rendered properties with `r`, not `rendered` (e.g., `rHost`).
- Use `runContext.logger()` to log important information at the right level (DEBUG, INFO, WARN, or ERROR).

---

## Properties

Ensure properties are declared and validated consistently.

- Prefer the `Property` carrier type; use `@PluginProperty(dynamic = true)` only when dynamic rendering is required.
- **Every property must declare a group** via `@PluginProperty(group = "...")`. Use the standard groups below; introduce a custom group (e.g. `"logging"`, `"schema registry"`) only when none of the standard ones fit.

| Group | What belongs here |
|---|---|
| `"main"` | Required properties and primary-intent properties (sql, query, prompt, commands, script, action, …) |
| `"connection"` | Endpoint, account, and authentication properties |
| `"source"` | Input origin and source location |
| `"processing"` | Filtering, selection, and data-shaping options |
| `"execution"` | Runner/runtime/environment controls |
| `"destination"` | Output destination and write target |
| `"reliability"` | Retries, failure handling, safety and consistency knobs |
| `"advanced"` | Expert-level or rarely-changed options |

- Mandatory properties must be annotated with `@NotNull` and checked during rendering.
- You can model JSON data with a simple `Property` carrier type.
---

## HTTP

Use the shared HTTP client for all outbound requests.

- You must use Kestra’s internal HTTP client from `io.kestra.core.http.client`.

---

## JSON

Use the standard serializers and avoid breaking changes from upstream APIs.

- If you are serializing a response from an external API, you may have to add `@JsonIgnoreProperties(ignoreUnknown = true)` at the mapped class level to prevent crashes when providers add new fields.
- You must use the Jackson mappers provided by core (`io.kestra.core.serializers`).

---

## New plugins / subplugins

Keep new packages aligned with project conventions and metadata.

- Ensure your new plugin is configured as described in the [Gradle mandatory configuration guide](https://kestra.io/docs/plugin-developer-guide/gradle#mandatory-configuration).
- Add a `package-info.java` under each sub package respecting [this format](https://github.com/kestra-io/plugin-odoo/blob/main/src/main/java/io/kestra/plugin/odoo/package-info.java) and choosing the right category.
- Every time you use `runContext.metric(...)`, you have to add a `@Metric` annotation ([see this doc](https://kestra.io/docs/plugin-developer-guide/document#document-the-plugin-metrics)).
- The docs don't support having tasks/triggers in both the root package (e.g. `io.kestra.plugin.kubernetes`) and a sub package (e.g. `io.kestra.plugin.kubernetes.kubectl`): put either all tasks/triggers in the root package or only in sub packages.
- Add icons in SVG format under `src/main/resources/icons`, at full size (not as thumbnails):
  - `plugin-icon.svg`
  - One icon per package, e.g. `io.kestra.plugin.aws.svg`
  - For subpackages, e.g. `io.kestra.plugin.aws.s3`, add `io.kestra.plugin.aws.s3.svg`

  See the [Elasticsearch Search.java example](https://github.com/kestra-io/plugin-elasticsearch/blob/master/src/main/java/io/kestra/plugin/elasticsearch/Search.java#L76).
- Use `"{{ secret('YOUR_SECRET') }}"` in examples for sensitive information such as an API key.
- If you are fetching data (one, many, or too many records), you must add a `fetchType` property to support `FETCH_ONE`, `FETCH`, and even `STORE` to store large amounts of data in the internal storage.
- Align the closing `"""` of example blocks with the flow id.
- Update the existing `index.yaml` for the main plugin, and for each new subpackage add a metadata file named exactly after the subpackage (e.g. `s3.yaml` for `io.kestra.plugin.aws.s3`) under `src/main/resources/metadata/`, following the same schema.

---

## Tests

Cover behavior and provide evidence of local validation.

- Unit tests added or updated to cover the change (using the `RunContext` to actually run tasks).
- Add sanity checks if possible with a YAML flow inside `src/test/resources/flows`.
- Prefer **Testcontainers** when possible to avoid running extra Docker services that can consume a lot of GitHub Actions runner disk space (and to reduce flakiness and CI setup complexity).
- If Testcontainers is not suitable, avoid disabling tests for CI. Instead, configure a local environment with `.github/setup-unit.sh` (made executable with `chmod +x setup-unit.sh`), which can be executed both locally and in CI, along with a new `docker-compose-ci.yml` file (do **not** edit the existing `docker-compose.yml`). If needed, create an executable (`chmod +x cleanup-unit.sh`) `cleanup-unit.sh` to remove potentially costly resources (tables, datasets, etc.).
- Provide screenshots from your local QA / tests in the PR description. The goal is to use the plugin JAR directly in the Kestra UI and verify that it integrates well.

---

## Outputs

Ensure outputs are minimal and non-duplicative.

- Do not send back as outputs the same information you already have in your properties.
- If you do not have any output, use `VoidOutput`.
- Do not output the same information twice (e.g., a status code and an error code saying the same thing).
---

# Document Your Kestra Plugin with Annotations
URL: https://kestra.io/docs/plugin-developer-guide/document
> Document your Kestra plugins using annotations and schemas to generate documentation for the UI and website.

Here is how you can document your plugin.

## Document your plugin for Kestra

First, let us recall the organization of a plugin project:

- The Gradle project can contain several plugins; we call this a group of plugins.
- The package in which a plugin is written is called a sub-group of plugins. Sometimes there is only one sub-group, in which case the group and the sub-group are the same.
- Each class is a plugin that provides one task, trigger, condition, etc.

The plugin documentation will be used on the Kestra website and in the Kestra UI. We provide a way to document each level of a plugin project.

## Document the plugin group

### Manifest attributes

Kestra uses custom manifest attributes to provide top-level group documentation. The following manifest attributes are used to document the group of plugins:

- `X-Kestra-Title`: by default, the Gradle `project.name` property is used.
- `X-Kestra-Group`: by default, the Gradle `group.id` property with an additional group name is used.
- `X-Kestra-Description`: by default, the Gradle `project.description` property is used.
- `X-Kestra-Version`: by default, the Gradle `project.version` property is used.

If you follow the plugin structure of the template on GitHub, you should have something like this:

:::collapse{title="Example"}
```groovy
group "io.kestra.plugin"
description 'Google Cloud Platform (GCP) plugins for Kestra.'

// [...]

jar {
    manifest {
        attributes(
            "X-Kestra-Title": project.name,
            "X-Kestra-Group": project.group + ".gcp",
            "X-Kestra-Description": project.description,
            "X-Kestra-Version": project.version
        )
    }
}
```
:::

As you can see, the most important documentation attribute is the `description`, which should be a short sentence describing the plugins.
### Additional markdown files

You can add additional markdown files in the `src/main/resources/doc` directory.

If there is a file named after the plugin group in `src/main/resources/doc/`, it will be inlined inside the main documentation page as the long description for the group of plugins. For example, for the GCP group of plugins, the file is `src/main/resources/doc/io.kestra.plugin.gcp`, and it contains authentication information that applies to all tasks.

If there are files inside the `src/main/resources/doc/guides` directory, they are listed in a `Guides` section of the documentation for the group of plugins.

### Group Icon

It is possible to provide an icon representing the whole plugin group. If there is an [SVG file](https://www.w3.org/Graphics/SVG/) `src/main/resources/icons/plugin-icon.svg`, it will be used as the group icon.

## Document the plugin sub-groups

Each sub-group can be documented via the `io.kestra.core.models.annotations.PluginSubGroup` annotation, which must be defined at the package level in a `package-info.java` file.

The `@PluginSubGroup` annotation allows setting:

- The sub-group `title`. If not set, the name of the sub-group will be used.
- The sub-group `description`, which is a short sentence introducing the sub-group.
- The sub-group `categories`, which is a list of `PluginCategory`. If not set, the category `MISC` will be used.

For example, for the GCP BigQuery sub-group:

```java
@PluginSubGroup(
    title = "BigQuery",
    description = "This sub-group of plugins contains tasks for accessing Google Cloud BigQuery.\n" +
        "BigQuery is a completely serverless and cost-effective enterprise data warehouse.",
    categories = { PluginSubGroup.PluginCategory.DATABASE, PluginSubGroup.PluginCategory.CLOUD }
)
package io.kestra.plugin.gcp.bigquery;

import io.kestra.core.models.annotations.PluginSubGroup;
```

### Sub-Group Icon

Each plugin sub-group can define an icon representing the plugins contained in the sub-group.
If there is an SVG file named after the sub-group in `src/main/resources/icons/`, it will be used as the icon for the corresponding plugins. For example, for the GCP BigQuery sub-group, the `src/main/resources/icons/io.kestra.plugin.gcp.bigquery.svg` file is used.

## Document each plugin

Plugin documentation will generate a [JSON Schema](https://json-schema.org/) that will be used to validate flows. It also generates documentation for both the UI and the website (see the `kestra plugins doc` command).

### Document the plugin class

Each plugin class must be documented via the following:

- The `io.kestra.core.models.annotations.Plugin` annotation, which allows providing examples.
- The `io.swagger.v3.oas.annotations.media.Schema` annotation, whose `title` attribute is used as the plugin description.

For example, the `Query` task of the PostgreSQL group of plugins is documented as follows:

```java
@Schema(
    title = "Query a PostgreSQL server."
)
@Plugin(
    examples = {
        @Example(
            full = true,
            title = "Execute a query.",
            code = """
                id: query_postgres
                namespace: company.team

                tasks:
                  - id: query
                    type: io.kestra.plugin.jdbc.postgresql.Query
                    url: jdbc:postgresql://127.0.0.1:56982/
                    username: pg_user
                    password: pg_password
                    sql: |
                      select concert_id, available, a, b, c, d, play_time, library_record,
                             floatn_test, double_test, real_test, numeric_test, date_type,
                             time_type, timez_type, timestamp_type, timestampz_type,
                             interval_type, pay_by_quarter, schedule, json_type, blob_type
                      from pgsql_types
                    fetchType: FETCH
                    """
        )
    }
)
```

For convenience, the `code` attribute of the `@Example` annotation is a list of strings; each string will be a line of the example. That avoids concatenating multi-line strings in a single attribute. You can add multiple examples if needed.

### Document the plugin properties

Declare inputs with the `Property` carrier type whenever possible, and document them with the `io.swagger.v3.oas.annotations.media.Schema` annotation plus validation rules from `javax.validation.constraints.*`.
Use `io.kestra.core.models.annotations.PluginProperty` only when you need its metadata attributes—`group`, `hidden`, or `internalStorageURI`—that are not available on `Property`.

The Swagger `@Schema` annotation contains many attributes that can be used to document plugin properties. The most useful are:

- `title`: a short description of a property.
- `description`: a long description of a property.
- `anyOf`: a list of allowed sub-types of a property. Use it when the property type is an interface, an abstract class, or a class inside a hierarchy of classes to denote the possible sub-types. This should be set when the property type is `Object`.

The `@Schema` and `@PluginProperty` annotations can be used on fields, methods, or classes.

Many tasks can take input from multiple sources on the same property. They usually have a single `from` property that can be a string representing a file in the Kestra storage, a single object, or a list of objects. To document such a property, you can use `anyOf` this way:

```java
@NotNull
@Schema(
    title = "The source of the published data.",
    description = "Can be an internal storage URI, a list of Pub/Sub messages, or a single Pub/Sub message.",
    anyOf = {String.class, Message[].class, Message.class}
)
private Object from;
```

:::alert{type="info"}
Due to limitations in how JSON Schema works, you cannot add `@Schema` on both a Java enum type and a plugin property that uses this type. We advise avoiding `@Schema` on enumerations.
:::

## Document the plugin outputs

Outputs should be documented with the `io.swagger.v3.oas.annotations.media.Schema` annotation in the same way as plugin properties. Please read the section above for more information.

Only use the annotation mentioned above. Never use `@PluginProperty` on an output.

## Document the plugin metrics

Tasks can expose metrics; to document them, you must add a `@Metric` annotation instance for each metric in the `@Plugin` annotation instance of the task.
For example, to document two metrics, a **length** metric of type `counter` and a **duration** metric of type `timer`, you can use the following:

```java
@Plugin(
    metrics = {
        @Metric(name = "length", type = Counter.TYPE),
        @Metric(name = "duration", type = Timer.TYPE)
    }
)
```

## JSON Schema Usage for Flow Validation

Kestra provides a JSON Schema to validate your flow definitions. This ensures that your flows are correctly structured and helps catch errors early in the development process.

### JSON Schema in VSCode

To use the JSON Schema in Visual Studio Code (VSCode), follow these steps:

1. Install the [YAML extension](https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml) by Red Hat.
2. Open your VSCode settings (`Ctrl+,` or `Cmd+,`).
3. Search for `YAML: Schemas` and click on `Edit in settings.json`.
4. Add the following configuration to associate the Kestra JSON Schema with your flow files:

```json
{
  "yaml.schemas": {
    "https://your-kestra-instance.com/api/v1/main/schemas/flow.json": "/*.yaml"
  }
}
```

Replace `https://your-kestra-instance.com/api/v1/main/schemas/flow.json` with the actual URL of your Kestra JSON Schema.

### Example of Using JSON Schema in Flow Editor

Here is an example of how to use the JSON Schema in the flow editor:

```yaml
id: example_flow
namespace: example_namespace

tasks:
  - id: example_task
    type: io.kestra.core.tasks.log.Log
    message: "Hello, World!"
```

When you open this flow in the editor, the JSON Schema will validate the structure and provide autocompletion and error checking.

### Globally Available Location for JSON Schema

The JSON Schema for Kestra flows is available at the following URL:

```plaintext
https://your-kestra-instance.com/api/v1/main/schemas/flow.json
```

Replace `https://your-kestra-instance.com` with the actual URL of your Kestra instance.

### Generating and Using JSON Schema for Plugins

To generate a JSON Schema for your plugin, you can use the `kestra plugins doc` command.
This command will generate a JSON Schema file for your plugin, which can be used to validate your plugin's configuration. Here is an example of how to use the `kestra plugins doc` command:

```sh
kestra plugins doc --plugin io.kestra.plugin.yourplugin --output yourplugin-schema.json
```

This will generate a JSON Schema file named `yourplugin-schema.json` for your plugin. You can then use this JSON Schema file to validate your plugin's configuration in the same way as the flow JSON Schema.

---

# Gradle Configuration for Kestra Plugins
URL: https://kestra.io/docs/plugin-developer-guide/gradle
> Configure Gradle for Kestra plugin development, including dependencies, mandatory settings, and shadow jar creation.

We use [Gradle](https://gradle.org/) as a build tool. This page will help you configure Gradle for your plugin.

## Configure Gradle for Kestra plugins

Start by setting the core project metadata and required configuration.

## Mandatory configuration

The first things you need to configure are the plugin name and the class package.

1. In `settings.gradle`, replace `rootProject.name = 'plugin-template'` with your plugin name.
2. Change the class package: by default, the template provides the package `io.kestra.plugin.templates`; just rename the folders in `src/main/java` and `src/test/java`.
3. Change the package name in `build.gradle`: replace `group "io.kestra.plugin.templates"` with your package name.

Now you can start [developing your task](../03.task/index.md) or look at other optional Gradle configuration.

## Include some dependencies on plugins

You can add many dependencies to your plugins; they will be isolated in the Kestra runtime. Thanks to this isolation, we ensure that two different versions of the same library will not clash or cause runtime errors due to missing methods.

The `build.gradle` includes the Kestra core library, which covers most task needs, via `compileOnly group: "io.kestra", name: "core", version: kestraVersion`.
If your plugin needs some dependencies, you can add as many as you want; they will be isolated. You just need to add them as `api` dependencies:

```groovy
api group: 'com.google.code.gson', name: 'gson', version: '2.8.6'
```

---

# Build and Publish a Kestra Plugin
URL: https://kestra.io/docs/plugin-developer-guide/publish
> Learn how to build and publish your Kestra plugins to Maven Central using Gradle and GitHub Actions.

Use the included Gradle task to build the plugin. Then, you can publish it to [Maven Central](https://central.sonatype.com).

## Build and publish your Kestra plugin

Start by building the plugin locally before publishing it.

## Build a plugin

To build your plugin, execute the `./gradlew shadowJar` command from the plugin directory. The resulting JAR file will be generated in the `build/libs` directory. To use this plugin in your Kestra instance, add the JAR to the [Kestra plugins path](../../kestra-cli/kestra-server/index.md#plugin-commands).

### Use a custom Docker image with your plugin

Add this `Dockerfile` to the root of your plugin project:

```dockerfile
FROM kestra/kestra:develop

COPY build/libs/* /app/plugins/
```

You can build and run the image with the following command, assuming you're in the root directory of your plugin:

`./gradlew shadowJar && docker build -t kestra-custom . && docker run --rm -p 8080:8080 kestra-custom server local`

You can now navigate to http://localhost:8080 and start using your custom plugin. Feel free to adapt the Dockerfile to your needs (e.g., if you plan to use multiple custom plugins, include all build directories in it).

## Publish a plugin

Next are the steps to publish your plugin to Maven Central.

### GitHub Actions

The plugin template includes a [GitHub Actions](https://github.com/features/actions) workflow to test and publish your plugin. You can extend it by adding any additional testing or deployment steps.
### Publish to Maven Central

The template includes a Gradle task that publishes the plugin to Maven Central. You need a Maven Central account to publish your plugin. You only need to configure `gradle.properties` with all required properties:

```properties
sonatypeUsername=
sonatypePassword=
signing.keyId=
signing.password=
signing.secretKeyRingFile=
```

There is a pre-configured GitHub Actions workflow in the `.github/workflows/main.yml` file that you can customize to your needs:

:::collapse{title="Example"}
```yaml
## Publish
- name: Publish package to Sonatype
  if: github.ref == 'refs/heads/master'
  env:
    ORG_GRADLE_PROJECT_sonatypeUsername: ${{ secrets.SONATYPE_USER }}
    ORG_GRADLE_PROJECT_sonatypePassword: ${{ secrets.SONATYPE_PASSWORD }}
    SONATYPE_GPG_KEYID: ${{ secrets.SONATYPE_GPG_KEYID }}
    SONATYPE_GPG_PASSWORD: ${{ secrets.SONATYPE_GPG_PASSWORD }}
    SONATYPE_GPG_FILE: ${{ secrets.SONATYPE_GPG_FILE }}
  run: |
    echo "signing.keyId=${SONATYPE_GPG_KEYID}" > ~/.gradle/gradle.properties
    echo "signing.password=${SONATYPE_GPG_PASSWORD}" >> ~/.gradle/gradle.properties
    echo "signing.secretKeyRingFile=${HOME}/.gradle/secring.gpg" >> ~/.gradle/gradle.properties
    echo ${SONATYPE_GPG_FILE} | base64 -d > ~/.gradle/secring.gpg
    ./gradlew publishToSonatype

## Release
- name: Release package to Maven Central
  if: startsWith(github.ref, 'refs/tags/v')
  env:
    ORG_GRADLE_PROJECT_sonatypeUsername: ${{ secrets.SONATYPE_USER }}
    ORG_GRADLE_PROJECT_sonatypePassword: ${{ secrets.SONATYPE_PASSWORD }}
    SONATYPE_GPG_KEYID: ${{ secrets.SONATYPE_GPG_KEYID }}
    SONATYPE_GPG_PASSWORD: ${{ secrets.SONATYPE_GPG_PASSWORD }}
    SONATYPE_GPG_FILE: ${{ secrets.SONATYPE_GPG_FILE }}
  run: |
    echo "signing.keyId=${SONATYPE_GPG_KEYID}" > ~/.gradle/gradle.properties
    echo "signing.password=${SONATYPE_GPG_PASSWORD}" >> ~/.gradle/gradle.properties
    echo "signing.secretKeyRingFile=${HOME}/.gradle/secring.gpg" >> ~/.gradle/gradle.properties
    echo ${SONATYPE_GPG_FILE} | base64 -d > ~/.gradle/secring.gpg
    ./gradlew publishToSonatype closeAndReleaseSonatypeStagingRepository
```
:::

---

# Set Up for Kestra Plugin Development

URL: https://kestra.io/docs/plugin-developer-guide/setup

> Set up your development environment for creating Kestra plugins, including Java, IntelliJ IDEA, and Gradle configuration.

Set up your environment for Plugin Development.

## Prepare your environment for plugin development

This quick setup guide gets you ready to build and publish Kestra plugins.

## Plugin Template

To get started with building a new plugin, use the [plugin-template](https://github.com/kestra-io/plugin-template), which comes prepackaged with the standardized repository structure and deployment workflows. That template creates a project hosting a group of plugins; we usually create multiple subplugins for a given service. For example, there's only one plugin for AWS, but it includes many subplugins for specific AWS services.

:::alert{type="warning"}
The Kestra plugin library **version** must align with your Kestra instance. You may encounter validation issues during flow creation (e.g., an `Invalid bean` response with status 422) when some plugins are on an older version of the Kestra plugin library. In that case, you may want to update the file `plugin-yourplugin/gradle.properties` and set the `version` property to the correct Kestra version like below:

```properties
version=0.20.0-SNAPSHOT
kestraVersion=[0.20,)
```

Your plugin version doesn't have to match the Kestra version; however, Kestra's official plugins always match Kestra's minor version, and following the same convention is a best practice. Then rebuild and publish the plugin.
:::

## Requirements

Kestra plugin development requires:

* [Java](https://java.com) 25 or later.
* [IntelliJ IDEA](https://www.jetbrains.com/idea/) (or any other Java IDE; we only provide help for IntelliJ IDEA).
* [Gradle](https://gradle.org/) (included most of the time with the IDE).

## Create a new plugin

Here are the steps:

1.
Go to the [plugin-template](https://github.com/kestra-io/plugin-template) repository.
2. Click on *Use this template*.
3. Choose the GitHub account you want to link and the repository name for the new plugin.
4. Clone the new repository: `git clone git@github.com:{{user}}/{{name}}.git`.
5. Open the cloned directory in IntelliJ IDEA.
6. Enable [annotations processors](https://www.jetbrains.com/help/idea/annotation-processors-support.html).
7. If you are using IntelliJ IDEA < 2020.03, install the [lombok plugins](https://plugins.jetbrains.com/plugin/6317-lombok) (otherwise, it's included by default).

Once you've completed the steps above, you should see a similar directory structure:

![Structure](./plugins-architecture.png)

As you can see, there is one generated plugin: the `Example` class representing the `Example` plugin (a task). A project typically hosts multiple plugins. A project is a group of plugins, and you can have multiple sub-groups inside a project by splitting plugins into different packages. Each package that contains a plugin class is a sub-group of plugins.

## Plugin icons

Plugin icons need to be added in the SVG format — see an example [here in the JIRA plugin](https://github.com/kestra-io/plugin-jira/commit/64393190281c9001eb8f57b412a0d7d74f986d41).

**Where can you find icons?**

- For proprietary systems, Wikipedia is a good source of SVG icons.
- For AWS services, the [AWS icons](https://awsicons.dev/) site is a great resource.
- [Google Fonts Icons](https://fonts.google.com/icons)
- [Feather Icons](https://feathericons.com/)

---

# Develop a Kestra Task Plugin

URL: https://kestra.io/docs/plugin-developer-guide/task

> Step-by-step guide to developing custom Task plugins for Kestra, including properties, run logic, outputs, and validation.

Here are the instructions to develop a new task.

## Develop a task plugin for Kestra

Start with a simple runnable example to understand the required structure.
## Runnable task

:::collapse{title="Here is a simple Runnable Task that reverses a string"}
```java
@SuperBuilder
@ToString
@EqualsAndHashCode
@Getter
@NoArgsConstructor
@Schema(
    title = "Reverse a string",
    description = "Reverse all letters from a string"
)
public class ReverseString extends Task implements RunnableTask<ReverseString.Output> {
    @Schema(
        title = "The base string you want to reverse"
    )
    private Property<String> string;

    @Override
    public ReverseString.Output run(RunContext runContext) throws Exception {
        Logger logger = runContext.logger();

        String render = runContext.render(string).as(String.class).orElse(null);
        logger.debug(render);

        return Output.builder()
            .reversed(StringUtils.reverse(render))
            .build();
    }

    @Builder
    @Getter
    public static class Output implements io.kestra.core.models.tasks.Output {
        @Schema(
            title = "The reversed string"
        )
        private final String reversed;
    }
}
```
:::

:::alert{type="info"}
All optional properties are displayed within the "Optional properties" section in the No-Code Editor in the Kestra UI.
:::

Look at this more closely.

### Class annotations

```java
@SuperBuilder
@ToString
@EqualsAndHashCode
@Getter
@NoArgsConstructor
```

These annotations are required to make your plugin work with Kestra. They are [Lombok](https://projectlombok.org/) annotations and allow Kestra and its internal serialization to work properly.

### Class declaration

```java
public class ReverseString extends Task implements RunnableTask<ReverseString.Output>
```

* `ReverseString` is the name of your task; it can be used in Kestra with `type: io.kestra.plugin.templates.ReverseString` (aka: `{{package}}.{{className}}`).
* The task class must extend `Task`, the base class for all tasks.
* The task class must implement `RunnableTask<T>`, as it's a task that runs on the Worker, and must declare its output type, here `ReverseString.Output`.

### Properties

```java
private Property<String> string;
```

All task properties must be declared as task class attributes.
They will be passed to the task by the flow at execution time. If you want your attribute to be dynamic, you need to wrap its type in the `Property` type. Dynamic properties are explained [later](#dynamic-properties-rendering).

For example, this is valid YAML for using this task. It uses an output from a previous task as its property, which is possible thanks to dynamic property rendering inside the task via `runContext.render(string).as(String.class).orElse(null)`:

```yaml
type: io.kestra.plugin.templates.ReverseString
string: "{{ outputs.previousTask.name }}"
```

You can declare as many properties as you want.

:::alert{type="warning"}
The `version` property is a core property reserved for [plugin management](../../07.enterprise/05.instance/versioned-plugins/index.md#version-property-in-a-flow). Custom plugins will fail to compile if they use this property, so you must rename it to something else.
:::

You can use any type serializable by [Jackson](https://github.com/FasterXML/jackson) for your properties (e.g., Double, Boolean, ...). You can create any class as long as the class is serializable.

#### Properties validation

Properties can be validated using `jakarta.validation.constraints.*` annotations. When the user creates a flow, your task properties will be validated before insertion, preventing any wrong flow definition from being saved. The default available annotations are:

- `@Positive`
- `@PositiveOrZero`
- `@Negative`
- `@NegativeOrZero`
- `@AssertFalse`
- `@AssertTrue`
- `@Max`
- `@Min`
- `@NotBlank`
- `@NotNull`
- `@Null`
- `@NotEmpty`
- `@Past`
- `@PastOrPresent`
- `@Future`
- `@FutureOrPresent`

The validation must be added to the inner type:

```java
private Property<@Min(1) @Max(30) Integer> integer;
```

You can also create your own custom validation.
To do so, first define the annotation as follows:

```java
@Retention(RetentionPolicy.RUNTIME)
@Constraint(validatedBy = { CronExpressionValidator.class })
public @interface CronExpression {
    String message() default "invalid cron expression ({validatedValue})";

    Class<?>[] groups() default {};

    Class<? extends Payload>[] payload() default {};
}
```

Then, define a custom validator:

```java
public class CronExpressionValidator implements ConstraintValidator<CronExpression, String> {
    private static final CronParser CRON_PARSER = new CronParser(CronDefinitionBuilder.instanceDefinitionFor(CronType.UNIX));

    @Override
    public boolean isValid(String value, ConstraintValidatorContext context) {
        if (value == null) {
            return true; // nulls are allowed according to spec
        }

        try {
            Cron parse = CRON_PARSER.parse(value);
            parse.validate();
        } catch (IllegalArgumentException e) {
            return false;
        }

        return true;
    }
}
```

### Run

```java
@Override
public ReverseString.Output run(RunContext runContext) throws Exception {
    Logger logger = runContext.logger();

    String render = runContext.render(string).as(String.class).orElse(null);
    logger.debug(render);

    return Output.builder()
        .reversed(StringUtils.reverse(render))
        .build();
}
```

The `run` method is where your task's main logic does all the work. You can use any Java code here, with any required libraries, as long as you have declared them in the [Gradle configuration](../02.gradle/index.md).

#### Log

```java
Logger logger = runContext.logger();
```

You can access a logger via the run context; it provides a logger for the current execution. Do not create your own custom logger, so that logs can be tracked inside Kestra.

#### Dynamic properties rendering

```java
String rendered = runContext.render(string).as(String.class).orElse(null);
```

Kestra supports [expressions](../../expressions/index.mdx) as task parameters. To use them, your task attribute must be encapsulated into the `Property` carrier type.
A dynamic property must be rendered before usage; this uses our templating engine, Pebble, to render the property into the target type. Rendering properties carried by the `Property` type via the run context is null-safe: it returns an empty `Optional`, or an empty collection for lists and maps.

Dynamic properties support all Java types that can be serialized via Jackson. For example, to use a `Duration`:

```java
// property definition
private Property<Duration> duration;

@Override
public Output run(RunContext runContext) throws Exception {
    Duration rendered = runContext.render(duration).as(Duration.class).orElse(null);
    // [...]
}
```

You can provide a default value at property definition time via `Property.of()`:

```java
@Builder.Default
private Property<Duration> duration = Property.of(Duration.ofSeconds(10));
```

Lists are supported via `Property<List<T>>`, for example:

```java
// property definition
private Property<List<String>> list;

@Override
public Output run(RunContext runContext) throws Exception {
    List<String> rendered = runContext.render(list).asList(String.class);
    // [...]
}
```

Maps are supported via `Property<Map<K, V>>`, for example:

```java
// property definition
private Property<Map<String, String>> map;

@Override
public Output run(RunContext runContext) throws Exception {
    Map<String, String> rendered = runContext.render(map).asMap(String.class, String.class);
    // [...]
}
```

Kestra uses a special type to carry data in a flexible way: `Data`. `Data` can be built via three different properties: a URI (pointing to a Kestra internal storage file), a list of maps (for defining multiple items), or a map (for a single item). Thanks to this, the task user can pass data in a very flexible way; we strongly encourage you to use this type when it fits your needs. Here is an example that defines a `Data<Message>` attribute; at run time, you need to render this property and map each message from a `Map`.
It uses a Project Reactor `Flux` under the hood to process items one by one in a reactive manner, which allows handling an arbitrary number of items. When coupled with our internal storage files, it can process arbitrarily large files with billions of items if needed:

```java
// property definition
@NotNull
private Data<Message> data;

@Override
public Output run(RunContext runContext) throws Exception {
    List<Message> outputMessages = data.flux(runContext, Message.class, message -> Message.fromMap(message))
        .collectList()
        .block();
    // [...]
}
```

At flow definition time, the user chooses how to provide the `data` property. To use it with an internal storage file (in the ION format), which allows processing the file line by line:

```yaml
type: io.kestra.plugin.myplugin.MyTask
data:
  fromURI: "{{inputs.file}}"
```

To use it with a single item as a map:

```yaml
type: io.kestra.plugin.myplugin.MyTask
data:
  fromMap:
    prop1: "{{inputs.value1}}"
    prop2: "{{inputs.value2}}"
```

To use it with a list of items as a list of maps:

```yaml
type: io.kestra.plugin.myplugin.MyTask
data:
  fromList:
    - prop1: "{{inputs.value1}}"
      prop2: "{{inputs.value2}}"
    - prop1: "{{inputs.value3}}"
      prop2: "{{inputs.value4}}"
```

#### Static properties or the old `@PluginProperty` annotation

Prefer the `Property` carrier type for inputs. If you can't use it (for example, you need legacy dynamic rendering support), you can still declare the task property this way:

```java
@PluginProperty
private String string;
```

Use the `@PluginProperty` annotation when you cannot rely on the `Property` carrier type, so that the task JSON Schema and documentation are generated correctly. When a field must be rendered dynamically, add `@PluginProperty(dynamic = true)`. Keep this fallback for those edge cases; `Property` remains the recommended default.

#### Kestra internal storage

The run context object has a `storage()` method that allows accessing the Kestra internal storage.
It also has a `workingDir()` method, which allows managing files inside the task working directory. It is important to use it for all file manipulations: to prevent any task from escaping its working directory for security reasons, the `WorkingDir` object will protect your task against that.

Example of reading a file from the internal storage:

```java
final URI from = new URI(runContext.render(this.from).as(String.class).orElseThrow());
final InputStream inputStream = runContext.storage().getFile(from);
```

It returns an `InputStream` that reads the file from the internal storage.

You can also write files to Kestra's internal storage using `runContext.storage().putFile(File file)`. The local file must be created inside the working directory and will be deleted after being put inside the internal storage.

```java
File file = runContext.workingDir().createFile("items.csv");
// [...] -> fill the file
URI uri = runContext.storage().putFile(file);
// return the uri inside an Output so it can be used by other tasks
```

If a file with the same name already exists, the call to `createFile()` will fail. To avoid that, you can use `runContext.workingDir().createTempFile(".csv")` instead, which generates a unique file name for you.

`runContext.storage().putFile()` returns a URI pointing to the file inside the internal storage. For this file to be usable by other tasks and available inside the execution outputs, you must return the URI as one of your task outputs.
### Outputs

```java
public class ReverseString extends Task implements RunnableTask<ReverseString.Output> {
    @Override
    public ReverseString.Output run(RunContext runContext) throws Exception {
        return Output.builder()
            .reversed(StringUtils.reverse(render))
            .build();
    }

    @Builder
    @Getter
    public static class Output implements io.kestra.core.models.tasks.Output {
        @Schema(
            title = "The reversed string"
        )
        private final String reversed;
    }
}
```

Each task must return an object of type `io.kestra.core.models.tasks.Output` with the output properties that will be available to the next tasks and as execution outputs. You can add as many properties as you want; just keep in mind that outputs need to be serializable.

At execution time, outputs can be accessed by downstream tasks by leveraging output expressions, e.g. `{{ outputs.task_id.output_attribute }}`.

Task outputs are stored inside the execution context. They are designed to hold task execution metadata, not data; to store data, use an internal storage file and return the file URI inside the output.

If your task doesn't have any outputs, you can use `io.kestra.core.models.tasks.VoidOutput` and return `null`:

```java
public class NoOutput extends Task implements FlowableTask<VoidOutput> {
    @Override
    public VoidOutput run(RunContext runContext) throws Exception {
        return null;
    }
}
```

### Exception

In the `run` method, you can throw any Java `Exception`; it will be caught by Kestra and will fail the execution. We advise you to throw any exception that can fail your task as soon as possible.

### Metrics

You can expose metrics to add observability to your task. Metrics will be recorded within the execution, and can be accessed via the UI or as [Prometheus metrics](../../10.administrator-guide/03.monitoring/index.md#prometheus).

There are two kinds of metrics available:

- `Counter`: `Counter.of("your.counter", count, tags);` with args
  - `String name`: The name of the metric
  - `Double|Long|Integer|Float count`: the associated counter
  - `String...
tags`: a list of tags associated with your metric
- `Timer`: `Timer.of("your.duration", duration, tags);`
  - `String name`: The name of the metric
  - `Duration duration`: the recorded duration
  - `String... tags`: a list of tags associated with your metric

To save metrics with the execution, you need to use `runContext.metric(metric)`.

#### Name

Metric names must be lowercase, separated by dots.

#### Tags

Tags must be pairs of tag keys and values. An example of two valid tags (`zone` and `location`) is:

```java
Counter.of("your.counter", count, "zone", "EU", "location", "France");
```

### Documentation

Remember to document your tasks. For this, we provide a set of annotations explained in the [Document each plugin](../07.document/index.md) section.

## Flowable Task

[Flowable tasks](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md) are the most complex tasks to develop and will usually be available from Kestra core. You will rarely need to create a flowable task yourself.

:::alert{type="warning"}
When developing such tasks, you must make them fault-tolerant, as an exception thrown by a flowable task can endanger the Kestra instance and lead to inconsistencies in the flow execution.
:::

Keep in mind that a flowable task will be evaluated very frequently inside the Executor and must have low CPU usage; no I/O should be done by this kind of task.

In the future, complete documentation will be available here. In the meantime, you can find all the actual [Flowable task source files](https://github.com/kestra-io/kestra/tree/develop/core/src/main/java/io/kestra/plugin/core/flow) for inspiration when developing Sequential or Parallel tasks.

---

# Develop a Kestra Trigger Plugin

URL: https://kestra.io/docs/plugin-developer-guide/trigger

> Learn how to develop custom Trigger plugins for Kestra, including Polling and Realtime triggers.

Here is how you can develop a [Trigger](../../05.workflow-components/07.triggers/index.mdx).
## Develop a trigger plugin for Kestra

:::collapse{title="The Trigger example below will create an execution randomly"}
```java
@SuperBuilder
@ToString
@EqualsAndHashCode
@Getter
@NoArgsConstructor
public class Trigger extends AbstractTrigger implements PollingTriggerInterface, TriggerOutput<Trigger.Random> {
    @Builder.Default
    private final Duration interval = Duration.ofSeconds(60);

    protected Double min = 0.5;

    @Override
    public Optional<Execution> evaluate(ConditionContext conditionContext, TriggerContext context) {
        RunContext runContext = conditionContext.getRunContext();
        Logger logger = conditionContext.getRunContext().logger();

        double random = Math.random();
        if (random < this.min) {
            return Optional.empty();
        }

        Execution execution = Execution.builder()
            .id(runContext.getTriggerExecutionId())
            .namespace(context.getNamespace())
            .flowId(context.getFlowId())
            .flowRevision(context.getFlowRevision())
            .state(new State())
            .trigger(ExecutionTrigger.of(
                this,
                Trigger.Random.builder().random(random).build()
            ))
            .build();

        return Optional.of(execution);
    }

    @Builder
    @Getter
    public static class Random implements io.kestra.core.models.tasks.Output {
        private Double random;
    }
}
```
:::

You need to implement `PollingTriggerInterface` and its `Optional<Execution> evaluate(ConditionContext conditionContext, TriggerContext context)` method.

You can have any properties that you want, like for any task (validation, documentation, etc.), and everything works the same way.

The `evaluate` method receives these arguments:

- `ConditionContext conditionContext`: a `ConditionContext`, which includes various properties such as the `RunContext` used to render your properties.
- `TriggerContext context`: the context of this call (flow, execution, trigger, date).

In this method, you can add any logic that you want: connect to a database, connect to remote file systems, and so on. You don't have to take care of resources; Kestra runs this method in its own thread.
This method must return an `Optional<Execution>` with:

- `Optional.empty()`: if the condition is not validated.
- `Optional.of(execution)`: with the execution created if the condition is validated.

You have to provide an `Output` for any output needed (result of a query, result of a file system listing, etc.) that will be available to the flow tasks within the `{{ trigger.* }}` variables.

:::alert{type="warning"}
The trigger must free the resource for the next evaluation. This method is called at each interval, and if the conditions are met, an execution is created. To avoid triggering repeatedly on the same data, move the file or remove the record from the database; take an action that prevents infinite triggering.
:::

## Realtime Triggers

To create a Realtime Trigger, your plugin class must:

1. **Implement the Interface:** Extend the `RealtimeTriggerInterface`. This interface requires the implementation of the `evaluate` method.
2. **Overwrite the `evaluate` Method:** Implement the following method signature:

```java
// Method required by RealtimeTriggerInterface
Publisher<Execution> evaluate(ConditionContext conditionContext, TriggerContext triggerContext) throws Exception;
```

*Parameters:*
- The method accepts two essential context objects:
  ⮕ `ConditionContext conditionContext`: Provides access to the flow's runtime environment, including the `RunContext` (used for logging, accessing configuration, and accessing storage).
  ⮕ `TriggerContext triggerContext`: Provides metadata about the flow, such as the flow ID, namespace, and tenant ID.

*Return Type and Purpose:*
- The *return type*, `Publisher<Execution>`, is a reactive stream (from Reactive Streams) that allows the trigger to continuously listen for events.
  ⮕ *Purpose*: Unlike Polling Triggers, this method *maintains a connection or subscription* to the external system (e.g., a Kafka topic). Every time the publisher emits a new `Execution` object, the flow is instantly triggered.
⮕ *Implementation Guidance*: Your implementation should focus on the logic required to subscribe to the external system and emit a new `Execution` object whenever a relevant event is received. ## Documentation Remember to document your triggers. For this, we provide a set of annotations explained in the [Document each plugin](../07.document/index.md) section. --- # Add Unit Tests for Kestra Plugins URL: https://kestra.io/docs/plugin-developer-guide/unit-tests > Learn how to write unit tests for your Kestra plugins using JUnit and the Kestra testing framework. To avoid regression, add unit tests for all your tasks. There are two main ways to test your tasks. In both cases, annotate your tests with `@KestraTest` so the required Kestra components start correctly. ## Unit test a RunnableTask This is the most common way to test a `RunnableTask`. You create your `RunnableTask`, test its `run()` method, and assert on its output or exception. This example shows a task test that builds the task, creates a `RunContext`, runs the task directly, and asserts on the output: :::collapse{title="Example"} ```java @KestraTest class ExampleTest { @Inject private RunContextFactory runContextFactory; @Test void run() throws Exception { RunContext runContext = runContextFactory.of(Map.of("variable", "John Doe")); Example task = Example.builder() .format("Hello {{ variable }}") .build(); Example.Output runOutput = task.run(runContext); assertThat(runOutput.getChild().getValue(), is(StringUtils.reverse("Hello John Doe"))); } } ``` ::: This works like any other Java unit test. You can use additional dependencies, helper methods, and Docker containers when needed. Kestra tests are Micronaut tests, so you can inject any bean into them. ## Unit test with a full flow If you want to add some unit tests with a full flow (which can be necessary in some rare cases, for example, for a `FlowableTask`), you will use the `@ExecuteFlow` annotation. 
This example shows a flow-level test that executes a test flow from `src/test/resources` and asserts on the resulting execution:

:::collapse{title="Example"}
```java
@KestraTest(startRunner = true) // This annotation starts an embedded Kestra for tests
class ExampleRunnerTest {
    @Test
    @ExecuteFlow("flows/example.yaml")
    void flow(Execution execution) throws TimeoutException, QueueException {
        assertThat(execution.getTaskRunList(), hasSize(3));
        assertThat(((Map<String, Object>) execution.getTaskRunList().get(2).getOutputs().get("child")).get("value"), is("task-id"));
    }
}
```
:::

- `@KestraTest(startRunner = true)` will start Kestra with an in-memory backend.
- `@ExecuteFlow("flows/example.yaml")` will load the flow from the `src/test/resources/flows/example.yaml` file and execute it.
- The created execution is then available for test method parameter injection so that you can make assertions on it.

To make it work, you need an `application.yml` file with this minimum configuration:

```yaml
kestra:
  repository:
    type: memory
  queue:
    type: memory
  storage:
    type: local
    local:
      base-path: /tmp/unittest
```

And these dependencies in your `build.gradle`:

```groovy
testAnnotationProcessor group: "io.kestra", name: "processor", version: kestraVersion
testImplementation group: "io.kestra", name: "core", version: kestraVersion
testImplementation group: "io.kestra", name: "tests", version: kestraVersion
testImplementation group: "io.kestra", name: "repository-memory", version: kestraVersion
testImplementation group: "io.kestra", name: "runner-memory", version: kestraVersion
testImplementation group: "io.kestra", name: "storage-local", version: kestraVersion
```

This enables the in-memory runner and runs your flow without any other dependencies such as Kafka. If you created your project from our plugin template, these are usually already included.
--- # Quickstart Guide: Run Your First Kestra Workflow URL: https://kestra.io/docs/quickstart > Get started with Kestra in minutes by launching Kestra locally with Docker and running your first workflow. Launch Kestra locally, create a simple flow, and run your first execution in a few minutes. ## Run your first Kestra workflow with Docker
## Prerequisites - Install Docker in your environment. We recommend [Docker Desktop](https://docs.docker.com/get-docker/). - If you use Windows, make sure [WSL](https://docs.docker.com/desktop/wsl/) is enabled. ## Start Kestra Once Docker is running, start Kestra with a single command: ```bash docker run --pull=always --rm -it -p 8080:8080 --user=root \ --name kestra \ -v kestra_data:/app/storage \ -v kestra_db:/app/data \ -v /var/run/docker.sock:/var/run/docker.sock \ -v /tmp:/tmp \ kestra/kestra:latest server local ``` If you re-run the command and Docker reports `You have to remove (or rename) that container to be able to reuse that name.`, remove the old container with `docker rm -f kestra` or pick a different `--name`. :::collapse{title="This command does the following:"} - starts Kestra on port `8080` - stores local files in the `kestra_data` Docker volume - persists the H2 database in the `kestra_db` Docker volume - mounts `/tmp` and the Docker socket so script and container tasks can run locally ::: Open `http://localhost:8080` in your browser to launch the UI, create your user, and take the product tour to begin building your first flow.
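With the UI open, you can try a first flow right away. Below is a minimal hello-world sketch you could paste into the flow editor; the `id` and `namespace` values are illustrative, and the example assumes the core `Log` task available in recent Kestra versions:

```yaml
id: hello_world
namespace: company.team

tasks:
  # Log a message to the execution logs
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from Kestra!
```

Saving and executing this flow should print the message in the execution's logs tab.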

:::alert{type="info"}
The above command starts Kestra with an embedded H2 database. Storage files are kept in the `kestra_data` Docker volume, and the H2 database is persisted in the `kestra_db` Docker volume. For production-ready persistence with a PostgreSQL database and more configurability, follow the [Docker Compose installation](../02.installation/03.docker-compose/index.md).
:::

## Next steps

Congrats! You've taken the product tour, executed your first flow, and familiarized yourself with Kestra. Next, you can follow the documentation in this order to build on what you've learned so far:

- Continue with a [Tutorial](../03.tutorial/index.mdx) to add inputs, outputs, triggers, and more task types.
- Follow the full [Installation guide](../02.installation/index.mdx) for persistent local or distributed setups.
- Explore the available [Plugins](/plugins) to integrate with external systems, and begin orchestrating your applications, microservices, and processes.
- [Contribute to Kestra](../04.contribute-to-kestra/index.mdx) – whether a developer or not, we value outside contributions of all kinds: plugins, features, documentation, feature requests, and bug reports. Get involved!

---

# Releases & LTS Policy in Kestra: Cadence and Support

URL: https://kestra.io/docs/releases

> Information on Kestra's release cadence, versioning strategy, and Long-Term Support (LTS) policy.

Track Kestra's long‑term support (LTS) releases alongside the fast cadence of feature releases.
## Current releases Kestra maintains two tracks: | Type | Version | Release Date | Supported Until | Release Notes | |---------|---------|--------------|---------------------------|---------------| | LTS | 1.3 | 2026‑03‑03 | 2027‑03 | [GitHub Release](https://github.com/kestra-io/kestra/releases/tag/v1.3.0) | | Feature | 1.2 | 2026‑01‑13 | Support ended by LTS 1.3 | [GitHub Release](https://github.com/kestra-io/kestra/releases/tag/v1.2.0) | | Feature | 1.1 | 2025‑11‑04 | Support ended by LTS 1.3 | [GitHub Release](https://github.com/kestra-io/kestra/releases/tag/v1.1.0) | | LTS | 1.0 | 2025‑09‑09 | 2026‑09 | [GitHub Release](https://github.com/kestra-io/kestra/releases/tag/v1.0.0) | :::alert{type="info"} Runtime prerequisite: Kestra 1.3 requires Java 25 or later (Eclipse Temurin recommended). Upgrade Java before adopting a new LTS or feature line to avoid startup/runtime issues. ::: ## Release model Kestra follows a structured release strategy with three release types: ### Patch releases (x.y.Z) - Weekly backports — every Tuesday - Security fixes and non-breaking bug fixes - Applied to all supported release lines (Latest Feature Release + LTS) ### Feature releases (x.Y.z) - Deliver new functionality every ~2 months - Candidates for future LTS promotion ### LTS (Long-Term Support) releases - Promoted from stable feature releases every ~6 months - Supported for 1 year with all security and bug fixes - Recommended for production deployments ## LTS support model Each LTS release receives support for 1 year. All security fixes, bug fixes, and patches are automatically applied. Up to 2 LTS versions can be active at the same time. Both receive the same bug and security fixes. The newer LTS also includes features released between the two versions. ## Backports and bug fixes Bugs are fixed in the latest current version line first, then backported to all active LTS versions. Changes are carried forward where technically feasible. 
To identify backported changes, check the [GitHub releases](https://github.com/kestra-io/kestra/releases). Release notes indicate which fixes are included in each version.

## Tracking the latest LTS

LTS versions are clearly labeled in the [GitHub release notes](https://github.com/kestra-io/kestra/releases), and we also publish a `latest-lts` Docker tag (see [Docker image tags](../02.installation/02.docker/index.md#docker-image-tags) for the full tag list):

```bash
kestra/kestra:latest-lts                          # Open Source
registry.kestra.io/docker/kestra-ee:latest-lts    # Enterprise Edition
```

:::alert{type="info"}
For production environments, we strongly recommend pinning a specific version number (instead of using `latest` or `latest-lts`) so your deployments remain stable and avoid unplanned upgrades. For example:

```bash
kestra/kestra:1.0.3                          # Open Source
registry.kestra.io/docker/kestra-ee:1.0.3    # Enterprise Edition
```
:::

## Plugin releases

Plugin releases are **fully decoupled from Kestra Core**. Plugins follow their own versioning and release cycle.

### Kestra Core Minimal Compatible Version

Every plugin version declares a **Kestra Core Minimal Compatible Version**: the plugin works with Kestra Core versions at or above this value.

### Plugin versioning: Major, Minor, Patch

**Patch (x.y.Z)**

A patch release is made mainly for bug fixes.

**Minor (x.Y.z)**

A minor release is made for:

- A new feature introduced in the plugin (e.g., a new task)
- A breaking change, usually due to a change in the Kestra Core version

**Major (X.y.z)**

A major release is made only for important breaking changes where users must update their task definitions.

### Plugins and LTS

**Plugins don't have LTS versions.** The Kestra Core Minimal Compatible Version indicates which plugin version is supported for a given Kestra Core LTS.
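In other words, a plugin is compatible when your Kestra Core version is at or above the plugin's declared minimum. The sketch below illustrates that comparison; the helper is hypothetical (not a Kestra API) and ignores pre-release suffixes:

```python
def parse_version(version: str) -> tuple[int, ...]:
    # "1.0.3" -> (1, 0, 3); pre-release suffixes are ignored for simplicity
    return tuple(int(part) for part in version.split(".")[:3])


def is_compatible(core_version: str, minimal_compatible_version: str) -> bool:
    # A plugin build works with Kestra Core versions at or above
    # its declared Kestra Core Minimal Compatible Version.
    return parse_version(core_version) >= parse_version(minimal_compatible_version)


print(is_compatible("1.0.3", "0.19.0"))   # True
print(is_compatible("0.18.2", "0.19.0"))  # False
```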
To find the right plugin version for your Kestra installation, visit the dedicated plugin page, where the Kestra Core Minimal Compatible Version is listed. You can also check the plugin's GitHub releases page and match the compatible version with your Kestra version.

:::alert{type="info"}
Need an earlier version's status? Check the [GitHub releases archive](https://github.com/kestra-io/kestra/releases).
:::

---

# Run Scripts in Kestra: Multi-Language Tasks & Runners

URL: https://kestra.io/docs/scripts

> Comprehensive guide on running scripts in any language with Kestra, covering tasks, runners, and dependencies.

import ChildCard from "~/components/docs/ChildCard.astro"

Kestra is language agnostic. Write your business logic in any language.

## Run scripts in any language with Kestra

You can orchestrate custom business logic written in any language, and you can also build custom plugins in Java.
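As a minimal illustration, the flow below runs an inline Python script in a Docker container; the namespace and script body are placeholders:

```yaml
id: hello_python
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.11-slim
    script: |
      print("Hello from Kestra!")
```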
There are dedicated plugins for `Python`, `R`, `Julia`, `Ruby`, `Node.js`, `PowerShell`, and `Shell`, and you can run any other language using the `Shell` plugin.

By default, these tasks run in individual Docker containers (taskRunner type: `io.kestra.plugin.scripts.runner.docker.Docker`). You can override that default if you prefer to run your scripts in a local process (taskRunner type: `io.kestra.plugin.core.runner.Process`) instead. If you use the [Enterprise Edition](../07.enterprise/index.mdx), you can also run your scripts on [dedicated remote workers](../07.enterprise/04.scalability/worker-group/index.md) by specifying a `workerGroup` property, or use other [Task Runner types](../task-runners/04.types/index.mdx) for AWS, GCP, Azure, and Kubernetes.

The following pages dive into the details of each task runner, supported programming languages, and how to manage dependencies.

---

# Bind Mount Scripts into Kestra – Run Local Code

URL: https://kestra.io/docs/scripts/bind-mount

> Bind-mount locally stored scripts into Kestra containers to execute code from your filesystem without importing it.

Use bind-mount to execute locally stored scripts.

## Bind-mount local scripts into Kestra

To run a script stored locally, you can bind-mount it into your Kestra container. Bind-mounting local scripts to the Kestra server also makes them available to the Docker containers running the script tasks. This is useful when you want to test a script without using Namespace Files.

First, ensure that your [Plugins and Execution configuration](../../configuration/04.plugins-and-execution/index.md) in the [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml) allows volume mounting.
Below is an example with the intended setting in the final line: ```yaml kestra: image: kestra/kestra:latest pull_policy: always user: "root" env_file: - .env command: server standalone --worker-thread=128 volumes: - kestra-data:/app/storage - /var/run/docker.sock:/var/run/docker.sock - /tmp/kestra-wd:/tmp/kestra-wd:rw environment: KESTRA_CONFIGURATION: | datasources: postgres: url: jdbc:postgresql://postgres:5432/kestra driver-class-name: org.postgresql.Driver username: kestra password: k3str4 kestra: server: basic-auth: enabled: false username: "admin@kestra.io" # it must be a valid email address password: kestra repository: type: postgres storage: type: local local: base-path: "/app/storage" queue: type: postgres tasks: tmp-dir: path: /tmp/kestra-wd/tmp plugins: configurations: - type: io.kestra.plugin.scripts.runner.docker.Docker values: volume-enabled: true # 👈 this is the relevant setting ``` With that setting, you can point the script task to any script on your local file system: ```yaml id: pythonVolume namespace: company.team tasks: - id: anyPythonScript type: io.kestra.plugin.scripts.python.Commands containerImage: ghcr.io/kestra-io/pydata:latest taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker volumes: - /Users/anna/gh/KESTRA_REPOS/scripts:/app commands: - python /app/etl/parametrized.py ``` This flow points the Python task running in a Docker container to [this ETL script](https://github.com/kestra-io/scripts/blob/main/etl/parametrized.py). --- # Commands vs Script Tasks in Kestra URL: https://kestra.io/docs/scripts/commands-vs-scripts > Understand the differences between Script and Commands tasks in Kestra and when to use each for your workflows. Types of tasks for executing programming languages. ## Decide between Script and Commands tasks
For each of the [supported languages](../00.languages/index.md) (e.g., Python, R, Node.js, Shell), Kestra provides two task types: **Script** and **Commands**. 1. **Script** tasks are written inline in the YAML flow configuration. They’re best for short scripts and make it easy to pass data from flow inputs and other tasks to your code. 2. **Commands** tasks are better for longer or multi-file scripts, typically added to Kestra as [namespace files](../../06.concepts/02.namespace-files/index.md). The table below gives an overview of script-related tasks and example configuration snippets. | Language | Default `image` | `beforeCommands` example | `Script` example | `Commands` example | |------------|----------------------------------------------------|-------------------------------------------------|--------------------------------------|------------------------------| | Python | python | pip install requests kestra | print("Hello, World!") | python hello.py | | R | r-base | Rscript -e "install.packages('dplyr')" | print("Hello, World!") | Rscript hello.R | | Julia | julia | julia -e 'using Pkg; Pkg.add("CSV")' | println("Hello, World!") | julia hello.jl | | Ruby | ruby | gem install httparty | puts "Hello, World!" | ruby hello.rb | | Node.js | node | npm install json2csv | console.log('Hello, World!'); | node hello.js | | Shell | ubuntu | apt-get install curl | echo "Hello, World!" | ./hello.bash | | PowerShell | mcr.microsoft.com/powershell | Install-Module -Name ImportExcel | Write-Output "Hello, World!" 
| .\hello.ps1 | | Go | golang | go mod init go_script | println("Hello, World!") | go run hello.go | | Deno | denoland/deno | N/A | console.log("Hello from Kestra!") | deno run main.ts | | Lua | nickblah/lua | N/A | print("Hello from Kestra!") | lua -e 'print("Hello from Kestra!")' | | Bun | oven/bun | bun add cowsay | console.log("Hello, World!") | bun run index.ts | | PHP | php | N/A | echo "Hello, World!"; | php main.php | | Perl | perl | N/A | print "Hello from Kestra!\n"; | perl -e 'print "Hello from Kestra!\n"' | | Groovy | groovy | N/A | println "Hello, $name!" | groovy HelloWorld.groovy | **Full class names:** - [io.kestra.plugin.scripts.python.Commands](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.commands) - [io.kestra.plugin.scripts.python.Script](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.script) - [io.kestra.plugin.scripts.r.Commands](/plugins/plugin-script-r/io.kestra.plugin.scripts.r.commands) - [io.kestra.plugin.scripts.r.Script](/plugins/plugin-script-r/io.kestra.plugin.scripts.r.script) - [io.kestra.plugin.scripts.julia.Commands](/plugins/plugin-script-julia/io.kestra.plugin.scripts.julia.commands) - [io.kestra.plugin.scripts.julia.Script](/plugins/plugin-script-julia/io.kestra.plugin.scripts.julia.script) - [io.kestra.plugin.scripts.ruby.Commands](/plugins/plugin-script-ruby/io.kestra.plugin.scripts.ruby.commands) - [io.kestra.plugin.scripts.ruby.Script](/plugins/plugin-script-ruby/io.kestra.plugin.scripts.ruby.script) - [io.kestra.plugin.scripts.node.Commands](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.commands) - [io.kestra.plugin.scripts.node.Script](/plugins/plugin-script-node/io.kestra.plugin.scripts.node.script) - [io.kestra.plugin.scripts.shell.Commands](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.commands) - [io.kestra.plugin.scripts.shell.Script](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.script) - 
[io.kestra.plugin.scripts.powershell.Commands](/plugins/plugin-script-powershell/io.kestra.plugin.scripts.powershell.commands) - [io.kestra.plugin.scripts.powershell.Script](/plugins/plugin-script-powershell/io.kestra.plugin.scripts.powershell.script) - [io.kestra.plugin.scripts.go.Script](/plugins/plugin-script-go/io.kestra.plugin.scripts.go.script) - [io.kestra.plugin.scripts.go.Commands](/plugins/plugin-script-go/io.kestra.plugin.scripts.go.commands) - [io.kestra.plugin.scripts.deno.Script](/plugins/plugin-script-deno/io.kestra.plugin.scripts.deno.script) - [io.kestra.plugin.scripts.deno.Commands](/plugins/plugin-script-deno/io.kestra.plugin.scripts.deno.commands) - [io.kestra.plugin.scripts.bun.Script](/plugins/plugin-script-bun/io.kestra.plugin.scripts.bun.script) - [io.kestra.plugin.scripts.bun.Commands](/plugins/plugin-script-bun/io.kestra.plugin.scripts.bun.commands) - [io.kestra.plugin.scripts.php.Script](/plugins/plugin-script-php/io.kestra.plugin.scripts.php.script) - [io.kestra.plugin.scripts.php.Commands](/plugins/plugin-script-php/io.kestra.plugin.scripts.php.commands) - [io.kestra.plugin.scripts.perl.Script](/plugins/plugin-script-perl/io.kestra.plugin.scripts.perl.script) - [io.kestra.plugin.scripts.perl.Commands](/plugins/plugin-script-perl/io.kestra.plugin.scripts.perl.commands) - [io.kestra.plugin.scripts.groovy.Script](/plugins/plugin-script-groovy/io.kestra.plugin.scripts.groovy.script) - [io.kestra.plugin.scripts.groovy.Commands](/plugins/plugin-script-groovy/io.kestra.plugin.scripts.groovy.commands) Check available [blueprints](/blueprints) to get started. ## When to use `Script` over `Commands`? **Advantages of Script:** - **Simplicity:** script code is stored and **versioned** with the flow’s revision history alongside orchestration logic. - **Easy templating:** when the workflow is defined in a single file, it’s straightforward to access inputs, variables, and pass outputs to downstream tasks. **Disadvantages of Script (vs. 
Commands):** - **Readability:** long inline scripts are harder to read and test than code in separate files (which also benefit from the embedded code editor). - **Complex use cases:** for multi-file projects or shared modules, use **Commands** instead. **Recommendation:** use **Commands** for advanced production workloads; **Script** is great for simple use cases and quick iteration. ## Examples Below is the same Python logic implemented with both **Script** and **Commands** tasks. ### Script task ```yaml id: python_script namespace: company.team tasks: - id: python type: io.kestra.plugin.scripts.python.Script beforeCommands: - pip install requests kestra script: | from kestra import Kestra import requests response = requests.get('https://kestra.io') print(response.status_code) Kestra.outputs({'status': response.status_code, 'text': response.text}) ``` ### Commands task ```yaml id: python_commands namespace: company.team tasks: - id: python type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true include: - main.py beforeCommands: - pip install requests kestra commands: - python main.py ``` `main.py`: ```python from kestra import Kestra import requests response = requests.get('https://kestra.io') print(response.status_code) Kestra.outputs({'status': response.status_code, 'text': response.text}) ``` ## Pass values into code You can pass values into your code using expressions. 
Below, the expression is used directly inside a **Script** task: ```yaml id: python_script_dynamic namespace: company.team inputs: - id: uri type: STRING defaults: https://kestra.io tasks: - id: python type: io.kestra.plugin.scripts.python.Script beforeCommands: - pip install requests kestra script: | from kestra import Kestra import requests response = requests.get('{{ inputs.uri }}') print(response.status_code) Kestra.outputs({'status': response.status_code, 'text': response.text}) ``` To pass values into **Commands** tasks, use environment variables: ```yaml id: python_commands_dynamic namespace: company.team inputs: - id: uri type: STRING defaults: https://kestra.io tasks: - id: python type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true include: - main.py beforeCommands: - pip install requests kestra commands: - python main.py env: URI: "{{ inputs.uri }}" ``` `main.py`: ```python from kestra import Kestra import requests import os response = requests.get(os.environ['URI']) print(response.status_code) Kestra.outputs({'status': response.status_code, 'text': response.text}) ``` --- # Build a Custom Docker Image for Script Tasks URL: https://kestra.io/docs/scripts/custom-docker-image > Build and use custom Docker images to package dependencies and environments for your Kestra script tasks. Build a custom Docker image for your script tasks. ## Build custom Docker images for scripts ## Use Kestra base image You can bake all dependencies needed for your script tasks directly into Kestra's base image. Here is an example installing Python dependencies: ```dockerfile FROM kestra/kestra:latest USER root RUN apt-get update -y && apt-get install pip -y RUN pip install --no-cache-dir pandas requests boto3 ``` Then, point to that Dockerfile in your [`docker-compose.yml` file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml): ```yaml services: kestra: build: context: . 
dockerfile: Dockerfile image: kestra-python:latest ``` Once you start Kestra containers using `docker compose up -d`, you can create a flow that directly runs Python tasks with your custom dependencies using the `PROCESS` runner: ```yaml id: python_process namespace: company.team tasks: - id: custom_dependencies type: io.kestra.plugin.scripts.python.Script runner: PROCESS script: | import pandas as pd import requests import boto3 print(f"Pandas version: {pd.__version__}") print(f"Requests version: {requests.__version__}") print(f"Boto3 version: {boto3.__version__}") ``` ## Building a custom Docker image for your script tasks Imagine you use [the following flow](/blueprints/zip-to-parquet): ```yaml id: zip_to_python namespace: company.team variables: file_id: "{{ execution.startDate | dateAdd(-3, 'MONTHS') | date('yyyyMM') }}" tasks: - id: get_zipfile type: io.kestra.plugin.core.http.Download uri: "https://divvy-tripdata.s3.amazonaws.com/{{ render(vars.file_id) }}-divvy-tripdata.zip" - id: unzip type: io.kestra.plugin.compress.ArchiveDecompress algorithm: ZIP from: "{{ outputs.get_zipfile.uri }}" - id: parquet_output type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest env: FILE_ID: "{{ render(vars.file_id) }}" inputFiles: "{{ outputs.unzip.files }}" script: | import os import pandas as pd file_id = os.environ["FILE_ID"] file = f"{file_id}-divvy-tripdata.csv" df = pd.read_csv(file) df.to_parquet(f"{file_id}.parquet") outputFiles: - "*.parquet" ``` The Python task requires pandas to be installed. Pandas is a large library, and it's not included in the default `python` image. In this case, you have the following options: 1. Install pandas in the `beforeCommands` property of the Python task. 2. Use one of our pre-built images that already include pandas, such as the `ghcr.io/kestra-io/pydata:latest` image. 3. Build your own custom Docker image that includes pandas. 
### 1) Installing pandas in the `beforeCommands` property ```yaml id: install_pandas_at_runtime namespace: company.team tasks: - id: custom_dependencies type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.core.runner.Process beforeCommands: - pip install pyarrow pandas script: | import pandas as pd print(f"Pandas version: {pd.__version__}") ``` ### 2) Using one of our pre-built images ```yaml id: use_prebuilt_image namespace: company.team tasks: - id: custom_dependencies type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest script: | import pandas as pd print(f"Pandas version: {pd.__version__}") ``` ### 3) Building a custom Docker image If you want to build a custom Docker image for some of your scripts, first create a Dockerfile: ```dockerfile FROM python:3.11-slim RUN pip install --upgrade pip RUN pip install --no-cache-dir kestra requests pyarrow pandas amazon-ion ``` Then, build the image: ```bash docker build -t kestra-custom:latest . 
``` Finally, use that image in your flow: ```yaml id: zip_to_python namespace: company.team variables: file_id: "{{ execution.startDate | dateAdd(-3, 'MONTHS') | date('yyyyMM') }}" tasks: - id: get_zipfile type: io.kestra.plugin.core.http.Download uri: "https://divvy-tripdata.s3.amazonaws.com/{{ render(vars.file_id) }}-divvy-tripdata.zip" - id: unzip type: io.kestra.plugin.compress.ArchiveDecompress algorithm: ZIP from: "{{ outputs.get_zipfile.uri }}" - id: parquet_output type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker pullPolicy: NEVER # ⚡️ Use the local image instead of pulling it from DockerHub containerImage: kestra-custom:latest # ⚡️ Use your custom image here env: FILE_ID: "{{ render(vars.file_id) }}" inputFiles: "{{ outputs.unzip.files }}" script: | import os import pandas as pd file_id = os.environ["FILE_ID"] file = f"{file_id}-divvy-tripdata.csv" df = pd.read_csv(file) df.to_parquet(f"{file_id}.parquet") outputFiles: - "*.parquet" ``` The `pullPolicy: NEVER` property ensures that Kestra uses the local image instead of trying to pull it from DockerHub. If you want to run languages other than Python using a custom Docker image, here is an example with [Go](../00.languages/index.md#run-any-language-using-a-custom-docker-image). --- # Git Clone Task – Fetch Repos for Script Workflows URL: https://kestra.io/docs/scripts/git-clone > Use the Git Clone task to fetch repositories into Kestra's working directory and process files in your workflows. Clone a Git repository and use the files in your tasks. ## Clone Git repositories into working directories This task clones a Git repository into a working directory, and then enables you to use the files from that repository in downstream tasks. ## `Git` plugin To use the `io.kestra.plugin.git.Clone` task in your flow, add it as the first child task of the `WorkingDirectory` task. 
Otherwise, you'll get an error: `Destination path "xyz" already exists and is not an empty directory`. This happens because you can only clone a Git repository into an empty working directory.

### Add `io.kestra.plugin.git.Clone` as the first task in a `WorkingDirectory`

Adding the `io.kestra.plugin.git.Clone` task directly as the first child task of the `WorkingDirectory` task ensures that you clone your repository into an empty directory before any other task generates output artifacts.

### Private Git repositories

Typically, you want to use `io.kestra.plugin.git.Clone` with a private Git repository. Make sure to:

1. Add your organization/user name as `username`
2. Generate your access token and provide it in the `password` property

Below you can find links to instructions on how to generate an access token for the relevant Git platform:

- [GitHub](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens)
- [GitLab](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html)
- [Bitbucket](https://support.atlassian.com/bitbucket-cloud/docs/create-a-repository-access-token/)
- [AWS CodeCommit](https://docs.aws.amazon.com/codecommit/latest/userguide/auth-and-access-control.html)
- [Azure DevOps](https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-authenticate?view=azure-devops&tabs=Windows)

---

# Inline Scripts in Docker: Write Code Directly in Tasks

URL: https://kestra.io/docs/scripts/inline-scripts-in-docker

> Write scripts directly inside Kestra task definitions and run them in Docker. No file imports needed: inline code executes with full container isolation.

Writing code directly inside your task.

## Write inline scripts inside Docker tasks

To get started with a Script task, paste your custom script inline in your YAML workflow definition along with any other configuration.
```yaml
id: api_json_to_mongodb
namespace: company.team

tasks:
  - id: extract
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.11-slim
    beforeCommands:
      - pip install requests kestra > /dev/null
    outputFiles:
      - "*.json"
    script: |
      import requests
      import json
      from kestra import Kestra

      response = requests.get("https://api.github.com")
      data = response.json()

      with open("output.json", "w") as output_file:
          json.dump(data, output_file)

      Kestra.outputs({"status": response.status_code})

  - id: load
    type: io.kestra.plugin.mongodb.Load
    connection:
      uri: mongodb://host.docker.internal:27017/
    database: local
    collection: github
    from: "{{ outputs.extract.outputFiles['output.json'] }}"
    description: "you can start MongoDB using: docker run -d mongo"
```

The example above uses a Python script added as a multiline string in the `script` property. The script fetches data from an API and stores it as a JSON file in Kestra's internal storage using the `outputFiles` property. The `Kestra.outputs` method captures additional output variables, such as the API response status code.

The optional `containerImage` property specifies the Docker image used to run the script. If you don't specify an image, Kestra uses the default image for the language you are using. In the above example, we use the `python:3.11-slim` image.

You can also *optionally* use the `beforeCommands` property to install libraries used in your inline script. Above, the command `pip install requests kestra` installs `pip` packages not available in the base image `python:3.11-slim`.

---

# Input and Output Files in Script Tasks

URL: https://kestra.io/docs/scripts/input-output-files

> Manage input and output files in Kestra script tasks. Pass extra files using inputFiles (including Namespace Files) and capture task outputs with outputFiles.

Manage Input and Output files with your scripts.
## Handle input and output files in scripts You can pass additional files to any script or CLI task using the `inputFiles` property: ```yaml id: ansible namespace: company.team tasks: - id: ansible_task type: io.kestra.plugin.ansible.cli.AnsibleCLI inputFiles: inventory.ini: | localhost ansible_connection=local myplaybook.yml: | --- - hosts: localhost tasks: - name: Print Hello World debug: msg: "Hello, World!" containerImage: cytopia/ansible:latest-tools commands: - ansible-playbook -i inventory.ini myplaybook.yml ``` You can also use [Namespace Files](../../06.concepts/02.namespace-files/index.md) as follows: ```yaml id: ansible namespace: company.team tasks: - id: ansible_task type: io.kestra.plugin.ansible.cli.AnsibleCLI namespaceFiles: enabled: true inputFiles: inventory.ini: "{{ read('inventory.ini') }}" myplaybook.yml: "{{ read('myplaybook.yml') }}" containerImage: cytopia/ansible:latest-tools commands: - ansible-playbook -i inventory.ini myplaybook.yml ``` ### Using input files to pass data from a trigger to a script task Another use case for input files is when your custom scripts need input coming from other tasks or triggers. 
Consider the following example flow that runs when a new object with the prefix `"raw/"` arrives in the S3 bucket `"declarative-orchestration"`: ```yaml id: s3TriggerCommands namespace: company.team description: process CSV file from S3 trigger tasks: - id: wdir type: io.kestra.plugin.core.flow.WorkingDirectory inputFiles: data.csv: "{{ trigger.objects | jq('.[].uri') | first }}" outputFiles: - "*.csv" - "*.parquet" tasks: - id: cloneRepo type: io.kestra.plugin.git.Clone url: https://github.com/kestra-io/examples branch: main - id: python type: io.kestra.plugin.scripts.python.Commands description: this script reads a file `data.csv` from the S3 trigger taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/pydata:latest commands: - python scripts/clean_messy_dataset.py triggers: - id: waitForS3object type: io.kestra.plugin.aws.s3.Trigger bucket: declarative-orchestration maxKeys: 1 interval: PT1S filter: FILES action: MOVE prefix: raw/ moveTo: key: archive/raw/ accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" region: "{{ secret('AWS_DEFAULT_REGION') }}" ``` Nothing is hardcoded specifically to Kestra in the Python script from GitHub. That script remains pure Python that you can run anywhere. Kestra's trigger logic is stored along with orchestration and infrastructure configuration in the YAML flow definition. This separation of concerns (*i.e., not mixing orchestration and business logic*) makes your code easier to test and keeps your business logic vendor-agnostic. ## Output files To generate files in your script and make them available for download and use in downstream tasks, use the `outputFiles` property. ### Generating outputs from a script task using `outputFiles` :::alert{type="info"} From 0.17.0, `outputDir` has been deprecated. Use the `outputFiles` property instead. 
:::

The `outputFiles` property lets you specify a list of files to persist in Kestra's internal storage. Here is an example:

```yaml
id: output_text_files
namespace: company.team

tasks:
  - id: python_output
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    outputFiles:
      - "*.txt"
    script: |
      f = open("myfile.txt", "a")
      f.write("Hi, this is output from a script 👋")
      f.close()

  - id: read_output
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - cat {{ outputs.python_output.outputFiles['myfile.txt'] }}
```

The `outputFiles` property supports [glob patterns](https://en.wikipedia.org/wiki/Glob_(programming)). A subsequent task can access an output file using the syntax `{{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}`.

---

# Install Dependencies at Runtime for Script Tasks

URL: https://kestra.io/docs/scripts/installing-dependencies

> Learn how to install dependencies at runtime for your script tasks using `beforeCommands` or prebuilt Docker images.

Install dependencies at runtime using `beforeCommands`.

## Install script dependencies at runtime

There are several ways of installing custom packages for your workflows. This page shows how to install dependencies at runtime using the `beforeCommands` property.

## Installing dependencies using `beforeCommands`

While you could bake all your package dependencies into a custom container image, it's often convenient to install a few additional packages at runtime without building separate images. The `beforeCommands` property can be used for that purpose.
### pip install package Here is a simple example installing `pip` packages `requests` and `kestra` before starting the script: ```yaml id: pip namespace: company.team tasks: - id: before_commands type: io.kestra.plugin.scripts.python.Script containerImage: python:3.11-slim beforeCommands: - pip install requests kestra > /dev/null script: | import requests import kestra kestra_modules = [i for i in dir(kestra.Kestra) if not i.startswith("_")] print(f"Requests version: {requests.__version__}") print(f"Kestra modules: {kestra_modules}") ``` ### pip install -r requirements.txt This example clones a Git repository that contains a `requirements.txt` file. The script task uses `beforeCommands` to install those packages. Lastly, a task lists recently installed packages to validate that this process works as expected: ```yaml id: python_requirements_file namespace: company.team tasks: - id: wdir type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: cloneRepository type: io.kestra.plugin.git.Clone url: https://github.com/kestra-io/examples branch: main - id: print_requirements type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cat requirements.txt - id: list_installed_packages type: io.kestra.plugin.scripts.python.Commands containerImage: python:3.11-slim beforeCommands: - pip install -r requirements.txt > /dev/null commands: - ls -lt $(python -c "import site; print(site.getsitepackages()[0])") | head -n 20 ``` And here is a simple version where we add the `requirements.txt` file using the `inputFiles` property: ```yaml id: python_requirements_file namespace: company.team tasks: - id: list_installed_packages type: io.kestra.plugin.scripts.python.Script env: PIP_ROOT_USER_ACTION: ignore inputFiles: requirements.txt: | polars requests kestra containerImage: python:3.11-slim beforeCommands: - pip install --upgrade pip - pip install -r requirements.txt > /dev/null script: | from kestra import Kestra import 
pkg_resources import re with open('requirements.txt', 'r') as file: # find package names without versions required_packages = {re.match(r'^\s*([a-zA-Z0-9_-]+)', line).group(1) for line in file if line.strip()} installed_packages = [(d.project_name, d.version) for d in pkg_resources.working_set] kestra_outputs = {} for name, version in installed_packages: if name in required_packages: kestra_outputs[name] = version Kestra.outputs(kestra_outputs) ``` Shown in the example above, the `WorkingDirectory` task is usually only needed if you use the `git.Clone` task. In most other cases, you can use the `inputFiles` property to add files to the script's working directory. ### Run any language with Process task runner To run languages other than Python directly with the [Process Task Runner](../../task-runners/04.types/01.process-task-runner/index.md) you need to install it before executing the code. Here is an example using Go: ```yaml id: antelope_355074 namespace: company.team tasks: - id: script type: io.kestra.plugin.scripts.go.Script taskRunner: type: io.kestra.plugin.core.runner.Process beforeCommands: - wget -qO- https://go.dev/dl/go1.24.3.linux-amd64.tar.gz | tar -C /usr/local -xzf - && echo 'export PATH=$PATH:/usr/local/go/bin' > /etc/profile.d/golang.sh && export PATH=$PATH:/usr/local/go/bin - go mod init go_script - go get github.com/go-gota/gota/dataframe - go mod tidy script: | package main import ( "os" "github.com/go-gota/gota/dataframe" "github.com/go-gota/gota/series" ) func main() { names := series.New([]string{"Alice", "Bob", "Charlie"}, series.String, "Name") ages := series.New([]int{25, 30, 35}, series.Int, "Age") df := dataframe.New(names, ages) file, _ := os.Create("output.csv") df.WriteCSV(file) defer file.Close() } outputFiles: - output.csv ``` ## Using Kestra's prebuilt images Many data engineering use cases require performing fairly standardized tasks such as: - processing data with `pandas` - transforming data with `dbt-core` (*using a dbt 
adapter for your data warehouse*) - making API calls with the `requests` library To solve those common challenges, the [kestra-io/examples](https://github.com/orgs/kestra-io/packages?repo_name=examples) repository provides several **public** Docker images with the latest versions of those common packages. Many [Blueprints](/blueprints) use those public images by default. The images are hosted in GitHub Container Registry managed by Kestra's team and those images follow the naming `ghcr.io/kestra-io/packageName:latest`. ### Example: running R script in Docker Here is a simple example using the `ghcr.io/kestra-io/rdata:latest` Docker image provided by Kestra to analyze the built-in `mtcars` dataset using `dplyr` and `arrow` R libraries: ```yaml id: rCars namespace: company.team tasks: - id: r type: io.kestra.plugin.scripts.r.Script containerImage: ghcr.io/kestra-io/rdata:latest outputFiles: - "*.csv" - "*.parquet" script: | library(dplyr) library(arrow) data(mtcars) # Load mtcars data print(head(mtcars)) final <- mtcars %>% summarise( avg_mpg = mean(mpg), avg_disp = mean(disp), avg_hp = mean(hp), avg_drat = mean(drat), avg_wt = mean(wt), avg_qsec = mean(qsec), avg_vs = mean(vs), avg_am = mean(am), avg_gear = mean(gear), avg_carb = mean(carb) ) final %>% print() write.csv(final, "final.csv") mtcars_clean <- na.omit(mtcars) # remove rows with NA values write_parquet(mtcars_clean, "mtcars_clean.parquet") ``` Installation of R libraries is time-consuming. From a technical standpoint, you could install custom R packages at runtime as follows: ```yaml id: rCars namespace: company.team tasks: - id: r type: io.kestra.plugin.scripts.r.Script containerImage: ghcr.io/kestra-io/rdata:latest beforeCommands: - Rscript -e "install.packages(c('dplyr', 'arrow'))" > /dev/null 2>&1 ``` However, that flow above might take up to 30 minutes, depending on the R packages you install. Prebuilt Docker images such as `ghcr.io/kestra-io/rdata:latest` can help you iterate much faster. 
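Prebuilt images are convenient for fast iteration; for production, pinning exact package versions in your own image makes builds reproducible. Below is a minimal Dockerfile sketch: the base image comes from the example above, while the `remotes` approach and the version numbers are assumptions to replace with the versions you have tested:

```dockerfile
# Start from Kestra's prebuilt R data image (see the example above)
FROM ghcr.io/kestra-io/rdata:latest

# Pin exact package versions for reproducible production builds
# (the versions below are placeholders)
RUN Rscript -e "install.packages('remotes', repos = 'https://cloud.r-project.org')" && \
    Rscript -e "remotes::install_version('dplyr', version = '1.1.4', repos = 'https://cloud.r-project.org')"
```

Reference the resulting image from the `containerImage` property of your R task.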
Before moving to production, you can build your custom images with the exact package versions that you need. --- # Supported Programming Languages in Kestra URL: https://kestra.io/docs/scripts/languages > See which languages have dedicated Kestra script plugins and how to run other languages with Shell tasks and Docker. Kestra lets you run code in many languages inside your workflows. ## Choose the right way to run a language Use a dedicated script plugin when Kestra provides one for your language. Use the Shell plugin when you need to run another language or compile code inside a container.
## Languages with dedicated plugins Kestra provides dedicated script plugins for these languages: - [Python](/plugins/plugin-script-python) - [R](/plugins/plugin-script-r) - [Node.js](/plugins/plugin-script-node) - [Shell](/plugins/plugin-script-shell) - [PowerShell](/plugins/plugin-script-powershell) - [Julia](/plugins/plugin-script-julia) - [Ruby](/plugins/plugin-script-ruby) - [Go](/plugins/plugin-script-go) - [Deno](/plugins/plugin-script-deno) - [Lua](/plugins/plugin-script-lua) - [Bun](/plugins/plugin-script-bun) - [PHP](/plugins/plugin-script-php) - [Perl](/plugins/plugin-script-perl) - [Groovy](/plugins/plugin-script-groovy) Each of these plugins provides two task types: - `Script` for short inline code in your flow definition. - `Commands` for code stored in files or split across multiple commands. Here is a minimal example that uses the Python `Script` task: ```yaml id: python_script_example namespace: company.team tasks: - id: run_python type: io.kestra.plugin.scripts.python.Script containerImage: python:3.12-slim script: | print("Hello from Python") ``` Use the `Commands` task when you want to run a file from `namespaceFiles` or execute multiple commands in sequence: ```yaml id: python_commands_example namespace: company.team tasks: - id: run_python_file type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true containerImage: python:3.12-slim commands: - python hello.py ``` ## Language-specific guides Use these guides for complete examples, outputs, metrics, and dependency management: - [Run Python inside your flows](../../15.how-to-guides/python/index.md) - [Run R inside your flows](../../15.how-to-guides/r/index.md) - [Run JavaScript inside your flows](../../15.how-to-guides/javascript/index.md) - [Run Shell scripts inside your flows](../../15.how-to-guides/shell/index.md) - [Run PowerShell inside your flows](../../15.how-to-guides/powershell/index.md) - [Run Julia inside your flows](../../15.how-to-guides/julia/index.md) - [Run Go 
inside your flows](../../15.how-to-guides/golang/index.md) - [Run Perl inside your flows](../../15.how-to-guides/perl/index.md) - [Run Rust inside your flows](../../15.how-to-guides/rust/index.md) ## Run other languages with the Shell plugin Use `io.kestra.plugin.scripts.shell.Commands` when your language does not have a dedicated plugin or when you want to compile and run code in a container. This approach works best when: - the language runtime is available in the container image - your commands can be executed from a shell - you want to use `namespaceFiles` or `inputFiles` for source code For outputs and metrics, use the same `::{}::` syntax documented for Shell tasks. See [Shell outputs and metrics](../06.outputs-metrics/index.md#shell). ### Rust example This example compiles and runs a Rust file stored in `namespaceFiles`: ```yaml id: rust_example namespace: company.team tasks: - id: run_rust type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: rust:1.82 namespaceFiles: enabled: true commands: - rustc hello_world.rs - ./hello_world ``` The `hello_world.rs` file can contain: ```rust fn main() { println!("Hello, World!"); } ``` See the full [Rust guide](../../15.how-to-guides/rust/index.md) for outputs and file handling. 
### Java example If you need to run Java code without building a custom plugin, you can compile and run it from a container: ```yaml id: java_example namespace: company.team tasks: - id: run_java type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: eclipse-temurin:21 namespaceFiles: enabled: true commands: - javac HelloWorld.java - java HelloWorld ``` ```java class HelloWorld { public static void main(String[] args) { System.out.println("Hello, World!"); } } ``` ### TypeScript example You can run TypeScript with the Node.js plugin by compiling it to JavaScript first: ```yaml id: typescript_example namespace: company.team tasks: - id: run_typescript type: io.kestra.plugin.scripts.node.Commands namespaceFiles: enabled: true containerImage: node:22-slim commands: - npm install --save-dev typescript - npx tsc example.ts - node example.js ``` The `example.ts` file can contain: ```typescript type User = { name: string; age: number; }; function isAdult(user: User): boolean { return user.age >= 18; } const justine: User = { name: "Justine", age: 23, }; console.log(isAdult(justine)); ``` For more background, see the official [Node.js with TypeScript guide](https://nodejs.org/en/learn/getting-started/nodejs-with-typescript). ## Use a custom Docker image for extra dependencies Use a custom Docker image when you need tools or libraries that are not present in the default runtime image. The Dockerfile below adds Go to a Kestra image: ```dockerfile FROM kestra/kestra:latest USER root RUN apt-get update -y && apt-get install -y wget && \ wget -qO- https://go.dev/dl/go1.24.3.linux-amd64.tar.gz | tar -C /usr/local -xzf - && \ echo 'export PATH=$PATH:/usr/local/go/bin' > /etc/profile.d/golang.sh ENV PATH="/usr/local/go/bin:${PATH}" ``` Point your `docker-compose.yml` file to that Dockerfile: ```yaml services: kestra: build: context: . 
dockerfile: Dockerfile image: kestra-go:latest ``` After you start Kestra with `docker compose up -d`, you can run Go code with the `Process` task runner: ```yaml id: golang_process namespace: company.team tasks: - id: go_custom_dependencies type: io.kestra.plugin.scripts.go.Script taskRunner: type: io.kestra.plugin.core.runner.Process beforeCommands: - go mod init go_script - go get github.com/go-gota/gota/dataframe - go mod tidy script: | package main import ( "os" "github.com/go-gota/gota/dataframe" "github.com/go-gota/gota/series" ) func main() { names := series.New([]string{"Alice", "Bob", "Charlie"}, series.String, "Name") ages := series.New([]int{25, 30, 35}, series.Int, "Age") df := dataframe.New(names, ages) file, _ := os.Create("output.csv") defer file.Close() df.WriteCSV(file) } outputFiles: - output.csv ``` ## Related pages - [Scripts overview](../index.mdx) - [Commands vs. scripts](../01.commands-vs-scripts/index.md) - [Task runners](../03.task-runners/index.md) - [Installing dependencies](../05.installing-dependencies/index.md) - [Outputs and metrics](../06.outputs-metrics/index.md) --- # Logging from Scripts – Send Logs to Kestra URL: https://kestra.io/docs/scripts/logging > Learn how to send logs from your Python, Node.js, and Shell scripts directly to Kestra's backend during execution. Send logs back to Kestra. ## Log from scripts to Kestra Your scripts can log to Kestra's backend during flow execution. This logs events occurring during execution of a flow. ## Logging from Script and Commands tasks The [Scripts Plugin](https://github.com/kestra-io/plugin-scripts) provides convenient methods to log to the Kestra backend during flow Execution. Under the hood, Kestra tracks logs from script tasks by searching standard output and standard error for `::{}::` patterns that specify log messages using a JSON request payload. 
Below is an example showing `logs` as a list of dictionaries: ```json { "logs": [ { "level": "DEBUG", "message": "Hello World from logs!" }, { "level": "INFO", "message": "Hello World!" } ] } ``` ## Python The example below shows how you can log from your Python script to Kestra's backend at runtime: ```python from kestra import Kestra logger = Kestra.logger() logger.debug("Hello World from logs!") logger.info("Hello World!") ``` Here is a more comprehensive example in a flow: ```yaml id: logFromPython namespace: company.team tasks: - id: py type: io.kestra.plugin.scripts.python.Script script: | from kestra import Kestra logger = Kestra.logger() logger.info("Py task is alive!") ``` ## Node.js Node.js follows the same syntax for sending logs as Python. First, install the [npm package](https://www.npmjs.com/package/@kestra-io/libs), which can be done with `beforeCommands`: ```yaml beforeCommands: - npm i @kestra-io/libs ``` Then use the `require` function to import the Kestra package and emit logs: ```js const Kestra = require("@kestra-io/libs"); const logger = Kestra.logger(); logger.debug("Hello World from logs!"); logger.info("Hello World!"); ``` ## Shell To log from a Shell task, wrap the JSON payload with double colons, e.g. '::{"logs":[{"level":"DEBUG","message":"Hello World!"}]}::', as shown in the following examples: ```bash echo '::{"logs":[{"level":"DEBUG","message":"Hello World from logs!"},{"level":"INFO","message":"Hello World!"}]}::' ``` The JSON payload should be provided without any spaces.
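Because Kestra only scans standard output and standard error for this pattern, any language can emit it. Here is a minimal Python sketch (the helper name `kestra_log` is our own, not part of the Kestra library) that builds the payload with compact JSON separators so the result contains no spaces:

```python
import json

def kestra_log(level: str, message: str) -> str:
    """Build a ::{}:: log payload with no spaces in the JSON."""
    payload = {"logs": [{"level": level, "message": message}]}
    return "::" + json.dumps(payload, separators=(",", ":")) + "::"

# Printing the payload to stdout is what lets Kestra capture it
print(kestra_log("INFO", "Hello World!"))
```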
Here is a comprehensive example in a flow: ```yaml id: shell_script namespace: company.team tasks: - id: shell_script type: io.kestra.plugin.scripts.shell.Script containerImage: ubuntu script: | echo '::{"logs":[{"level":"INFO","message":"Shell task is alive!"}]}::' ``` --- # Script Outputs & Metrics: Send Data Back to Kestra URL: https://kestra.io/docs/scripts/outputs-metrics > Send outputs and metrics from your scripts back to Kestra to track metadata, pass data between tasks, and visualize performance. Send Outputs and Metrics back to Kestra. ## Send outputs and metrics from scripts Your scripts can send outputs and metrics to Kestra's backend during flow execution. This allows you to track custom metadata and visualize it across multiple executions of a flow. ## How to emit `outputs` and `metrics` from script tasks The `outputFiles` property is useful to send files generated in a script to Kestra's internal storage so that these files can be used in downstream tasks or exposed as downloadable artifacts. However, `outputs` can also be simple key-value pairs that contain metadata generated in your scripts. Many tasks from Kestra plugins emit certain outputs by default. You can inspect which outputs are generated by each task or trigger in the respective plugin documentation. For instance, check out [this plugin documentation](/plugins/core/http/io.kestra.plugin.core.http.download#outputs) to see the outputs generated by the HTTP Download task. Once the flow is executed, the Outputs tab will list the output metadata as key-value pairs. Run the example below to see it in action: ```yaml id: download namespace: company.team tasks: - id: http type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv ``` This example automatically emits output metadata, such as the status `code`, file `uri`, and request `headers`, because those properties have been preconfigured on that plugin's task.
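Once emitted, those outputs can be referenced by downstream tasks through expressions. As a minimal sketch, a task like the following could be appended to the flow above (the `log_uri` task id is an assumption; it uses the core `Log` task to print the downloaded file's `uri` output):

```yaml
  - id: log_uri
    type: io.kestra.plugin.core.log.Log
    message: "Downloaded file stored at {{ outputs.http.uri }}"
```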
However, in your custom script, you can decide what metadata you want to send to Kestra to make that metadata visible in the UI. ### Outputs and metrics in Script and Commands tasks The [Scripts Plugin](https://github.com/kestra-io/plugin-scripts) provides convenient methods to send outputs and metrics to the Kestra backend during flow Execution. Under the hood, Kestra tracks outputs and metrics from script tasks by searching standard output and standard error for `::{}::` patterns that allow you to specify outputs and metrics using a JSON request payload: - `::{}::` for JSON objects. > `outputs` require a **dictionary**, while `metrics` expect a **list of dictionaries**. Below is an example showing an `outputs` object with key-value pairs: ```json { "outputs": { "key": "value", "exampleList": [1, 2, 3], "tags": { "s3Bucket": "declarative-orchestration", "region": "us-east-1" } } } ``` If using the [Enterprise or Cloud Edition](../../07.enterprise/index.mdx), you can encrypt script task outputs with `::{"encryptedOutputs":{"key":"value"}}::`. For a full example, check out the section in our [Outputs guide](../../05.workflow-components/06.outputs/index.md#encrypted-outputs-from-script-tasks). Here is the representation of a `metrics` object. It's a **list of dictionaries**: ```plaintext "metrics": [ { "name": "myMetric", // mandatory, the name of the metrics "type": "counter", // mandatory, "counter" or "timer" metric type "value": 42, // mandatory, Double or Integer value "tags": { // optional list of tags "readOnly": true, "location": "US" } } ] ``` Both outputs and metrics can optionally include a list of tags that expose internal details. ### Metric types: `counter` and `timer` There are two metric types: 1. `counter`, expressed in **Integer** or **Double** data type, measures a countable number of rows/bytes/objects processed in a given task. 2. 
`timer`, expressed in **Double** data type, measures the number of `seconds` to process specific computation in your flow. Below you can find examples of `outputs` and `metrics` definition for each language. ### Python The example below shows how you can add simple key-value pairs in your Python script to send custom metrics and outputs to Kestra's backend at runtime: ```python from kestra import Kestra Kestra.outputs({'data': data, 'nr': 42}) Kestra.counter('nr_rows', len(df), tags={'file': filename}) Kestra.timer('ingestion_duration', duration, tags={'file': filename}) ``` The `Kestra.outputs({"key": "value"})` takes a dictionary of key-value pairs, while the metrics such as **Counter** and **Timer** take the metric name, metric value, and a dictionary of tags as positional arguments, for example: - `Kestra.counter("countable_int_metric_name", 42, tags={"key": "value"})` - `Kestra.timer("countable_double_metric_name", 42.42, tags={"key": "value"})` Here is a more comprehensive example in a flow: ```yaml id: outputsMetricsPython namespace: company.team inputs: - id: attempts type: INT defaults: 10 tasks: - id: py type: io.kestra.plugin.scripts.python.Script containerImage: ghcr.io/kestra-io/pydata:latest script: | import timeit from kestra import Kestra attempts = {{inputs.attempts}} modules = ['pandas', 'requests', 'kestra', 'faker', 'csv', 'random'] results = {} for module in modules: time_taken = timeit.timeit(f'import {module}', number=attempts) results[module] = time_taken Kestra.timer(module, time_taken, tags=dict(nr_attempts=attempts)) Kestra.outputs(results) ``` ### Node.js Node.js follows the same syntax for sending outputs and metrics as in Python. 
You need to install the [npm package](https://www.npmjs.com/package/@kestra-io/libs), which can be done with `beforeCommands`: ```yaml beforeCommands: - npm i @kestra-io/libs ``` Next, use the `require()` function or import the package: ```js const Kestra = require("@kestra-io/libs"); Kestra.outputs({data: 'data', nr: 42, mybool: true, myfloat: 3.65}); Kestra.counter('metric_name', 100, {partition: 'file1'}); Kestra.timer('timer1', (callback) => {setTimeout(callback, 1000)}, {tag1: 'hi'}); Kestra.timer('timer2', 2.12, {tag1: 'from', tag2: 'kestra'}); ``` ### Shell To send outputs and metrics from a Shell task, wrap a JSON payload (i.e., a map/dictionary) with double colons, e.g. '::{"outputs":{"key":"value"}}::' or '::{"metrics":[{"name":"count","type":"counter","value":1,"tags":{"key":"value"}}]}::', as shown in the following examples: ```shell ## 1. send outputs with different data types echo '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' ## 2. send a counter with tags echo '::{"metrics":[{"name":"count","type":"counter","value":1,"tags":{"tag1":"i","tag2":"win"}}]}::' ## 3. send a timer with tags echo '::{"metrics":[{"name":"time","type":"timer","value":2.12,"tags":{"tag1":"i","tag2":"destroy"}}]}::' ``` The JSON payload should be provided without any spaces. Here is a comprehensive example in a flow: ```yaml id: shell_script namespace: company.team tasks: - id: shell_script type: io.kestra.plugin.scripts.shell.Script containerImage: ubuntu script: | echo '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::' echo '::{"metrics":[{"name":"count","type":"counter","value":1,"tags":{"tag1":"i","tag2":"win"}}]}::' echo '::{"metrics":[{"name":"time","type":"timer","value":2.12,"tags":{"tag1":"i","tag2":"destroy"}}]}::' ``` --- ## When to use metrics and when to use outputs?
If you want to track task-run metadata across multiple executions of a flow, and this metadata is of an arbitrary data type (*it might be a string, a list of dictionaries, or even a file*), use `outputs` rather than `metrics`. Metrics can only be used with numerical values. ### Use cases for `outputs`: results of a task of any data type Outputs are task-run artifacts. They are generated as a **result** of a given task. Outputs can be used for two reasons: 1. To **pass data** between tasks 2. To **generate result artifacts** for observability and auditability, e.g., to track specific metadata or to share downloadable file artifacts with business stakeholders. ### Using outputs to pass data between tasks Outputs can be used to pass data between tasks. One task can generate some outputs, and another task can use that value: ```yaml id: outputsInputs namespace: company.team tasks: - id: passOutput type: io.kestra.plugin.core.debug.Return format: "hello world!" - id: takeInput type: io.kestra.plugin.core.debug.Return format: "data from previous task - {{ outputs.passOutput.value }}" ``` ### Use cases for `metrics`: numerical values that can be aggregated and visualized across Executions Metrics are intended to track custom **numeric** (metric type: `counter`) or **duration** (metric type: `timer`) attributes that you can visualize across flow executions, such as the number of rows or bytes processed in a task. Metrics are expressed as numerical values of `integer` or `double` data type. Examples of metadata you may want to track as `metrics`: - the **number of rows** processed in a given task (e.g., during data ingestion or transformation) - the **accuracy score** of a trained ML model to compare this result across multiple workflow runs (*e.g., you can see the average or max value across multiple executions*) - other pieces of **metadata** that you can track across executions of a flow (e.g., the duration of a certain function execution within a Python ETL script).
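Whichever you choose, both shapes travel over the same `::{}::` stdout convention described earlier. Here is a minimal Python sketch (the `to_payload` helper is our own, not part of the Kestra library) showing an `outputs` dictionary and a `metrics` list side by side:

```python
import json

def to_payload(payload: dict) -> str:
    # Compact separators keep the JSON free of spaces, as required
    return "::" + json.dumps(payload, separators=(",", ":")) + "::"

# outputs: a dictionary of key-value pairs
print(to_payload({"outputs": {"rows": 100}}))
# metrics: a list of dictionaries with name, type, and value
print(to_payload({"metrics": [{"name": "rows", "type": "counter", "value": 100}]}))
```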
--- # Task Runners in Scripts: Control Execution Environment URL: https://kestra.io/docs/scripts/task-runners > Manage execution environments for your scripts using Kestra's Task Runners, including Docker and Process runners. Manage the environment in which your code is executed with Task Runners. ## Manage task runners for script execution Task Runners are extensible, pluggable systems capable of executing your tasks in arbitrary remote environments. Each `taskRunner` is identified by its `type`. The [Process](../../task-runners/04.types/01.process-task-runner/index.md) and [Docker](../../task-runners/04.types/02.docker-task-runner/index.md) task runners are fully open-source and located within the [Kestra repository](https://github.com/kestra-io/kestra). By default, Kestra runs all script tasks using the Docker task runner.
Here's an example of the Docker Task Runner configured to use the `centos` container image: ```yaml id: docker_task_runner namespace: company.team tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands containerImage: centos taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker cpu: cpus: 1 commands: - echo "Hello World!" ``` You can learn more in the [Docker Task Runner documentation](../../task-runners/04.types/02.docker-task-runner/index.md). Here's an example of the Process Task Runner: ```yaml id: process_task_runner namespace: company.team tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - echo "Hello World!" ``` You can learn more in the [Process Task Runner documentation](../../task-runners/04.types/01.process-task-runner/index.md). Find out more information on the dedicated [Task Runners](../../task-runners/index.mdx) documentation. --- # Working Directory Task – Share Files Across Scripts URL: https://kestra.io/docs/scripts/working-directory > Share files across script tasks using Kestra's WorkingDirectory task. Group related scripts to read and write to a shared filesystem within one execution. Run multiple tasks in the same working directory sequentially. ## Share files with the WorkingDirectory task This task runs multiple tasks sequentially in the same working directory. It is useful when you want to share files from Namespace Files or from a Git repository across multiple tasks. ## When to use the `WorkingDirectory` task By default, all Kestra tasks are **stateless**. If one task generates files, those files won’t be available in downstream tasks unless they are persisted in internal storage. Upon each task completion, the temporary directory for the task is purged. 
This behavior is generally useful, as it keeps your environment clean and dependency-free, and it avoids the potential privacy or security issues of exposing data generated by one task to other processes. Despite the benefits of stateless execution, in certain scenarios, **statefulness** is **desirable**. Imagine that you want to execute several Python scripts, and each of them generates some output data. Another script combines that data as part of an ETL/ML process. Executing those related tasks in the same working directory and **sharing state** between them is helpful for the following reasons: - You can attach namespace files to the `WorkingDirectory` task and use them in all downstream tasks. This allows you to work the same way you would on your local machine, where you can import modules from the same directory. - Within a `WorkingDirectory`, you can **clone** your entire **GitHub branch** with multiple modules and configuration files needed to run several scripts and **reuse** them across multiple downstream tasks. - You can **execute** multiple scripts **sequentially** on the same worker or in the same container, minimizing latency. - **Output artifacts** of each task (such as CSV, JSON, or Parquet files you generate in your script) are directly available to other tasks without having to persist them within the internal storage. This is because all child tasks of the `WorkingDirectory` task share the same file system. The `WorkingDirectory` task allows you to: 1. Share files from Namespace Files or from a Git repository across multiple tasks 2. Run multiple tasks sequentially in the same working directory 3.
Share data across multiple tasks without having to persist it in internal storage For more detail, see the [plugin documentation](/plugins/core/flow/io.kestra.plugin.core.flow.workingdirectory) ## Example In this example, the flow sequentially executes Shell Scripts and Shell Commands in the same working directory using a local Process Task Runner. ```yaml id: shell_scripts namespace: company.team tasks: - id: working_directory type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: create_csv_file type: io.kestra.plugin.scripts.shell.Script taskRunner: type: io.kestra.plugin.core.runner.Process script: | #!/bin/bash echo "Column1,Column2,Column3" > file.csv for i in {1..10} do echo "$i,$RANDOM,$RANDOM" >> file.csv done - id: inspect_file type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cat file.csv - id: filter_file type: io.kestra.plugin.scripts.shell.Commands description: select only the first five rows of the second column taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cut -d ',' -f 2 file.csv | head -n 6 ``` --- # Task Runners in Kestra: Offload & Isolate Compute URL: https://kestra.io/docs/task-runners > Overview of Kestra Task Runners, enabling you to offload and isolate task execution across various environments. import ChildCard from "~/components/docs/ChildCard.astro" Task Runners are an extensible, pluggable system capable of executing your tasks in arbitrary remote environments. ## Offload and isolate compute with task runners Many data processing tasks are **computationally intensive** and require a lot of resources (_such as CPU, GPU, and memory_). Instead of provisioning always-on servers, Task Runners can execute your code on **dynamically provisioned compute instances** in the cloud, such as AWS ECS Fargate, Azure Batch, Google Batch, auto-scaled Kubernetes clusters, and more. 
All you have to do to offload your task execution to a remote environment is to specify the `taskRunner` type in your task configuration. Each `type` of a task runner is a **plugin** with its own schema. The built-in code editor provides documentation, autocompletion, and syntax validation for all task runner plugin properties to ensure correctness, standardization, and consistency. :::alert{type="info"} Some task runner plugins are available only in the [Enterprise Edition](../07.enterprise/index.mdx). If you want to try them out, please [reach out](/demo). See [Open Source vs Enterprise](../oss-vs-paid/index.md) to compare editions. :::
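In practice, specifying a runner is a single property on the task. Here is a minimal sketch using the open-source Docker runner (the flow and task ids are arbitrary):

```yaml
id: offload_example
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    commands:
      - echo "Hello from a task runner"
```

To move the same task to another environment, only the `taskRunner` block changes; the commands stay the same.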
--- # Task Runner Benefits: Resource Control & Flexibility URL: https://kestra.io/docs/task-runners/benefits > Explore the benefits of using Task Runners in Kestra for isolated execution, resource control, and deployment flexibility. Discover how Task Runners simplify resource allocation, environment management, and deployment across environments. ## Docker in development, Kubernetes in production Many Kestra users develop their scripts locally using **Docker containers** and deploy the same code in production as **Kubernetes pods**. Thanks to the `taskRunner` property, switching between environments is seamless. Below is an example showing how you can combine `pluginDefaults` with the `taskRunner` property to use Docker during development and Kubernetes in production — without changing your code. ### 1. Development environment (namespace / tenant / instance) ```yaml pluginDefaults: - type: io.kestra.plugin.scripts values: taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker pullPolicy: IF_NOT_PRESENT # In dev, only pull the image when needed cpu: cpus: 1 memory: memory: 512Mi ``` ### 2. Production environment (namespace / tenant / instance) ```yaml pluginDefaults: - type: io.kestra.plugin.scripts values: taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes namespace: company.team pullPolicy: ALWAYS # Always pull the latest image in production config: username: "{{ secret('K8S_USERNAME') }}" masterUrl: "{{ secret('K8S_MASTER_URL') }}" caCert: "{{ secret('K8S_CA_CERT') }}" clientCert: "{{ secret('K8S_CLIENT_CERT') }}" clientKey: "{{ secret('K8S_CLIENT_KEY') }}" resources: # Can be overridden by a specific task if needed request: cpu: "500m" # Request 1/2 CPU (500 milliCPU) memory: "256Mi" # Request 256 MB of memory ``` :::alert{type="info"} Notice that the `containerImage` property is not part of the `taskRunner` configuration — it’s defined at the task level instead. 
This makes configurations more flexible, as container images typically change more often than the runner setup. For instance, a dbt plugin might require a different image from a Python script, while both can share the same runner configuration. ::: ## Centralized configuration management The combination of `pluginDefaults` and `taskRunner` enables centralized management of your task runner configuration. For example, you can define AWS credentials at the namespace level for the `Batch` task runner plugin: ```yaml pluginDefaults: - type: io.kestra.plugin.ee.aws.runner.Batch values: accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" region: "us-east-1" ``` This approach ensures consistency and eliminates repetitive configuration across multiple workflows. ## Documentation and autocompletion Each task runner is a self-contained plugin with its own icon, documentation, and property schema. The built-in Kestra code editor provides **autocompletion**, **inline documentation**, and **syntax validation** for all runner properties. Clicking on the runner’s name in the editor opens its documentation sidebar for quick reference. ![docker_runner](./docker_runner.png) ## Full customization: build your own Task Runner For advanced use cases, you can create a [custom task runner plugin](../../15.how-to-guides/custom-plugin/index.md) tailored to your environment. Simply build it as a JAR file and add it to the `plugins` directory. Once Kestra restarts, your custom runner will appear as an available option in any script task. --- # Task Runner Capabilities & Supported Plugins in Kestra URL: https://kestra.io/docs/task-runners/overview > Learn about Kestra Task Runners capabilities and supported plugins for executing tasks in diverse environments. Understand the capabilities of Task Runners and the plugins that support them. 
## Understand task runner capabilities Task Runners provide a flexible and efficient way to execute compute-intensive workloads across different environments. Whether you’re running scripts locally, on Kubernetes, or on cloud platforms like AWS, Azure, or Google Cloud, Task Runners ensure consistent, isolated, and configurable task execution. The table below outlines the main capabilities of Task Runners in Kestra. | Capability | Description | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | **Fine-grained resource allocation** | Gain full control over compute resources — allocate the precise amount of CPU, memory, or GPU to individual tasks. | | **Flexible deployment patterns** | Deploy tasks across diverse environments, including AWS ECS Fargate, Azure Batch, Google Batch, Kubernetes, and more. You can even mix different runners within a single workflow. | | **No vendor lock-in** | Built on a modular plugin system, Task Runners let you run workloads on any cloud or on-prem infrastructure — without being tied to a specific provider. | | **Task isolation** | Each task runs in a fully isolated container environment, preventing conflicts and ensuring consistent performance. | | **Development-to-production consistency**| Develop locally using Docker containers and seamlessly deploy the same code to production in Kubernetes or cloud environments — just by changing one property. | | **Centralized configuration management** | Define and manage runner configurations globally using `pluginDefaults`. This allows you to govern credentials and environment settings at the namespace or organization level. | | **Built-in documentation and validation**| Each Task Runner plugin includes a schema. 
The Kestra code editor offers inline documentation, autocompletion, and syntax validation for every property, ensuring correctness and standardization. | | **No code changes required** | Move between environments — from local to cloud — without altering your business logic or code. | | **Fully customizable** | Extend functionality by developing your own Task Runner plugin tailored to your infrastructure and deployment requirements. | ## Supported plugins Task Runners are primarily used in tasks from the [Script Plugin](https://github.com/kestra-io/plugin-scripts) and its related sub-plugins. These tasks support execution of custom scripts or command sets via the `taskRunner` property. ### Supported script-based plugins - [Python](/plugins/plugin-script-python) - [Node.js](/plugins/plugin-script-node) - [Go](/plugins/plugin-script-go) - [Shell](/plugins/plugin-script-shell) - [PowerShell](/plugins/plugin-script-powershell) - [R](/plugins/plugin-script-r) - [Julia](/plugins/plugin-script-julia) - [Ruby](/plugins/plugin-script-ruby) - [Deno](/plugins/plugin-script-deno) - [Lua](/plugins/plugin-script-lua) - [Bun](/plugins/plugin-script-bun) - [PHP](/plugins/plugin-script-php) - [Perl](/plugins/plugin-script-perl) - [Groovy](/plugins/plugin-script-groovy) - [dbt](/plugins/plugin-dbt) - [SQLMesh](/plugins/plugin-sqlmesh) - [Ansible](/plugins/plugin-ansible) - [Terraform](/plugins/plugin-terraform) - [Modal](/plugins/plugin-modal) - [AWS CLI](/plugins/plugin-aws/cli/io.kestra.plugin.aws.cli.awscli) - [GCloud CLI](/plugins/plugin-gcp/cli/io.kestra.plugin.gcp.cli.gcloudcli) - [Azure CLI](/plugins/plugin-azure/cli/io.kestra.plugin.azure.cli.azcli) Whenever you see a task capable of executing a `script` or a series of `commands`, it’s a script-based task that can leverage a `taskRunner` to define where and how that task runs. 
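Every plugin in this list accepts the same `taskRunner` property on its script and command tasks. As a minimal sketch using the open-source Docker runner (the flow id and namespace below are hypothetical):

```yaml
id: docker_runner_example        # hypothetical flow id
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.12-slim
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker  # explicit, though Docker is the default
    script: |
      print("Hello from a task runner!")
```

Swapping the runner `type` (for example, to a Kubernetes or cloud batch runner) changes where the container runs without touching the script itself.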
--- # Task Runners vs Worker Groups – When to Use Each URL: https://kestra.io/docs/task-runners/task-runners-vs-worker-groups > Learn when to use Task Runners versus Worker Groups in Kestra for optimal compute resource management and isolation. Find out when to use Task Runners or Worker Groups. ## Choose between task runners and worker groups
[Task Runners](../index.mdx) and [Worker Groups](../../07.enterprise/04.scalability/worker-group/index.md) both **offload compute-intensive tasks to dedicated workers**. However, **worker groups have a broader scope**, applying to **all tasks** in Kestra, whereas **task runners** are limited to **scripting tasks** (Python, R, JavaScript, Shell, dbt, etc. — see the [full list here](../01.overview/index.md#supported-script-based-plugins)). Worker groups can be used with any plugin. For instance, if you need to query an on-premise SQL Server database running on a different server than Kestra, your SQL Server Query task can target a worker with access to that server. Additionally, worker groups can fulfill the same use case as task runners by distributing the load of scripting tasks to dedicated workers with the necessary resources and dependencies (_incl. hardware, region, network, operating system_). ## Key differences Worker groups are always-on servers that can run any task in Kestra, while task runners are ephemeral containers that are spun up only when a task is executed. This has implications for latency and cost: - Worker groups run on dedicated servers, so they can start executing tasks immediately with millisecond latency. Task runners, on the other hand, need to be spun up before they can execute a task, which can introduce latency of up to several minutes. For example, the AWS Batch task runner can take up to 50 seconds to register a task definition and start a container on AWS ECS Fargate. With the Google Batch task runner, it can take up to 90 seconds if you don't use a compute reservation because GCP spins up a new compute instance for each task run. - Task runners can be more cost-effective for infrequent short-lived tasks, while worker groups are more cost-effective for frequent and long-running tasks. - Worker Groups work at the task level, whereas Task Runners are only available for some task types, such as Scripts, Commands, and CLI tasks.
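To make the cost trade-off concrete, here is an illustrative back-of-the-envelope calculation in Python. All prices are hypothetical assumptions, not actual AWS or Kestra pricing; the point is only the shape of the comparison, not the numbers:

```python
# Illustrative break-even sketch: ephemeral task runners (pay per run)
# vs. an always-on worker group server (pay around the clock).
# All prices below are hypothetical assumptions.

def monthly_cost_task_runner(runs_per_month: int, minutes_per_run: float,
                             price_per_vcpu_minute: float = 0.0007) -> float:
    """Pay only while each ephemeral container runs."""
    return runs_per_month * minutes_per_run * price_per_vcpu_minute

def monthly_cost_worker_group(price_per_hour: float = 0.03) -> float:
    """Pay for the dedicated server 24/7, whether tasks run or not."""
    return price_per_hour * 24 * 30

# A nightly 10-minute job: ephemeral runners are far cheaper.
infrequent = monthly_cost_task_runner(runs_per_month=30, minutes_per_run=10)

# A 4-minute job firing every 5 minutes: the always-on worker wins.
frequent = monthly_cost_task_runner(runs_per_month=8640, minutes_per_run=4)

always_on = monthly_cost_worker_group()
print(f"infrequent: ${infrequent:.2f}, frequent: ${frequent:.2f}, "
      f"always-on worker: ${always_on:.2f}")
```

Under these assumed prices, the ephemeral runner is cheaper for the nightly job while the always-on worker is cheaper for the high-frequency one; the latency difference (milliseconds vs. up to minutes) is independent of price.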
The table below summarizes the differences between task runners and worker groups. | | Task Runners | Worker Groups | |-----------------------|---------------------------------------|---------------------------------------------| | **Scope** | Limited to scripting tasks | Applicable to all tasks in Kestra | | **Use Cases** | Scripting tasks (Python, R, etc.) | Any task, including database queries | | **Deployment** | Ephemeral containers | Always-on servers | | **Resource Handling** | Spins up as needed | Constantly available | | **Latency** | High latency (seconds, up to minutes) | Low latency (milliseconds) | | **Cost Efficiency** | Suitable for infrequent tasks | Suitable for frequent or long-running tasks | :::alert{type="info"} Worker Groups are not yet available in Kestra Cloud, only in Kestra Enterprise Edition. ::: ## Use cases Here are common use cases in which **Worker Groups** can be beneficial: - Execute tasks and polling triggers on specific servers (e.g., a VM with access to your on-premise database or a server with preconfigured CUDA drivers). - Execute tasks and polling triggers on a worker with a specific Operating System (e.g., a Windows server configured with specific software needed for a task). - Restrict backend access to a set of workers (firewall rules, private networks, etc.). Here are common use cases in which **Task Runners** can be beneficial: - Offload compute-intensive tasks to compute resources provisioned on-demand. - Run tasks that temporarily require more resources than usual e.g., during a backfill or a nightly batch job. - Run tasks that require specific dependencies or hardware (e.g., GPU, memory, etc.). ## Usage ### Worker Groups usage First, start the worker with the `--worker-group myWorkerGroupKey` flag. It's important for the new worker to have a configuration similar to that of your principal Kestra server and to have access to the same backend database and internal storage. 
The configuration file will be passed via the `--config` flag, as shown in the example below. ```shell kestra server worker --worker-group=myWorkerGroupKey --config=/path/to/kestra-config.yaml ``` To assign a task to the desired worker group, add a `workerGroup.key` property. This will ensure that the task or polling trigger is executed on a worker in the specified worker group. ```yaml id: myflow namespace: company.team tasks: - id: gpu type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true commands: - python ml_on_gpu.py workerGroup: key: myWorkerGroupKey ``` A default worker group can also be configured at the namespace level so that all tasks and polling triggers in that namespace are executed on workers in that worker group by default. ![default_worker_group](./default_worker_group.png) ### Task Runners usage To use a task runner, add a `taskRunner` property to your task configuration and choose the desired `type` of task runner. For example, to use the AWS Batch task runner, you would configure your task as follows: ```yaml id: aws_ecs_fargate_python namespace: company.team tasks: - id: run_python type: io.kestra.plugin.scripts.python.Script containerImage: ghcr.io/kestra-io/pydata:latest taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch computeEnvironmentArn: "arn:aws:batch:eu-west-1:707969873520:compute-environment/kestraFargateEnvironment" jobQueueArn: "arn:aws:batch:eu-west-1:707969873520:job-queue/kestraJobQueue" executionRoleArn: "arn:aws:iam::707969873520:role/kestraEcsTaskExecutionRole" taskRoleArn: "arn:aws:iam::707969873520:role/ecsTaskRole" accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" region: eu-west-1 bucket: kestra-ie script: | import platform import socket import sys def print_environment_info(): print("Hello from AWS Batch and kestra!") print(f"Host's network name: {platform.node()}") print(f"Python version: {platform.python_version()}") print(f"Platform 
information (instance type): {platform.platform()}") print(f"OS/Arch: {sys.platform}/{platform.machine()}") try: hostname = socket.gethostname() ip_address = socket.gethostbyname(hostname) print(f"Host IP Address: {ip_address}") except socket.error as e: print("Unable to obtain IP address.") if __name__ == '__main__': print_environment_info() ``` --- # Task Runner Types – Choose the Right Execution Backend URL: https://kestra.io/docs/task-runners/types > Explore the different types of Task Runners available in Kestra, including Docker, Process, Kubernetes, and cloud-based runners. import ChildCard from "~/components/docs/ChildCard.astro" This section lists all task runners available in Kestra. Each `taskRunner` is identified by its `type`. The [Process](./01.process-task-runner/index.md) and [Docker](./02.docker-task-runner/index.md) task runners are fully open-source and located within the [Kestra repository](https://github.com/kestra-io/kestra). By default, Kestra runs all script tasks using the Docker task runner.
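To override that default, change the `type` of the `taskRunner` property. For example, the open-source Process runner executes commands directly on the worker host instead of in a container. A minimal sketch (flow id and namespace are hypothetical):

```yaml
id: process_runner_example       # hypothetical flow id
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process  # no container; runs as a host process
    commands:
      - echo "Hello from the Process task runner"
```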
All other plugins such as the [AWS Batch](./04.aws-batch-task-runner/index.md), [Google Batch](./06.google-batch-task-runner/index.md), [Google Cloud Run](./07.google-cloudrun-task-runner/index.md), [Azure Batch](./05.azure-batch-task-runner/index.md), [Kubernetes](./03.kubernetes-task-runner/index.md), and more planned for the future, are managed by Kestra and require an [Enterprise Edition](../../07.enterprise/index.mdx) license. If you want to try them out, please [reach out](/demo). --- # AWS Batch Task Runner – Run Tasks on ECS Fargate, EC2, or EKS URL: https://kestra.io/docs/task-runners/types/aws-batch-task-runner > Execute Kestra tasks as AWS Batch jobs on ECS Fargate, EC2, or EKS for scalable and serverless compute. Run tasks as AWS Batch jobs on ECS Fargate, EC2, or EKS compute environments. ## Offload tasks to AWS Batch To launch tasks on AWS Batch, you need to understand three key concepts: 1. **Compute environment** — mandatory; it won’t be created by the task. The compute environment defines the infrastructure for your tasks and can be ECS Fargate, EC2, or EKS. 2. **Job queue** — optional; it will be created by the task if not specified. Creating a queue adds some latency to the script’s runtime. 3. **Job** — created by the task runner; contains information about the image, commands, and resources to use. :::alert{type="info"} To get started quickly, use [this blueprint](/blueprints/aws-batch-terraform-git) to provision all required resources for running containers on ECS Fargate. ::: ## How does the AWS Batch task runner work? To support `inputFiles`, `namespaceFiles`, and `outputFiles`, the task runner creates sidecar containers that handle S3 file transfers alongside the main container. The approach differs by compute environment type. **ECS (Fargate and EC2):** Uses [multi-container ECS jobs](https://docs.aws.amazon.com/batch/latest/userguide/multi-container-jobs.html) with three containers per job: 1. 
A _before_-container that uploads input files to S3. 2. The _main_ container that fetches input files into the `{{ workingDir }}` directory and runs the task. 3. An _after_-container that fetches output files using `outputFiles` to make them available from the Kestra UI for download and preview. **EKS:** Uses [EKS job definitions](https://docs.aws.amazon.com/batch/latest/userguide/jobs-eks.html) with a Kubernetes pod. Sidecar containers run as pod containers using the same S3-based file transfer pattern. The main container command is wrapped in `/bin/sh -c`, so the container image must include `/bin/sh`. Since the working directory of the container isn’t known in advance, you must define the working and output directories explicitly. For example, use `cat {{ workingDir }}/myFile.txt` instead of `cat myFile.txt`. ### Exit codes The task runner maps AWS Batch job statuses to exit codes as follows: | AWS Batch status | Exit code | |---|---| | `SUCCEEDED` | `0` | | `FAILED` | `1` | | `RUNNING` | `2` | | `RUNNABLE` | `3` | | `PENDING` | `4` | | `STARTING` | `5` | | `SUBMITTED` | `6` | | Unknown | `-1` | ## Minimum permissions required To submit and monitor AWS Batch jobs, the IAM principal used by Kestra needs permission to create, tag, inspect, and clean up Batch job definitions and jobs. It also needs permission to pass the ECS roles used by the job and to read the AWS Batch log group. 
The following policy is the minimum set required by the task runner: ```json { "Version": "2012-10-17", "Statement": [ { "Action": [ "logs:DescribeLogGroups", "batch:TagResource", "batch:SubmitJob", "batch:RegisterJobDefinition", "batch:ListJobs", "batch:DescribeJobs", "batch:DescribeJobDefinitions", "batch:DescribeComputeEnvironments", "batch:DeregisterJobDefinition", "batch:TerminateJob", "batch:CreateJobQueue", "batch:UpdateJobQueue", "batch:DeleteJobQueue", "batch:DescribeJobQueues" ], "Effect": "Allow", "Resource": "*" }, { "Action": [ "iam:PassRole" ], "Effect": "Allow", "Resource": [ "", "", "" ] }, { "Action": [ "logs:StartLiveTail" ], "Effect": "Allow", "Resource": "arn:aws:logs:eu-central-1::log-group:/aws/batch/job" } ] } ``` :::alert{type="info"} The `batch:CreateJobQueue`, `batch:UpdateJobQueue`, `batch:DeleteJobQueue`, and `batch:DescribeJobQueues` permissions are only required when `jobQueueArn` is not configured — the task runner will create and clean up a job queue automatically in that case. If you always provide a `jobQueueArn`, you can omit those four permissions. ::: Replace ``, ``, ``, and `` with the values from your AWS account. If you use a different region, update the CloudWatch Logs ARN accordingly. :::alert{type="info"} The `iam:PassRole` entries for `` and `` apply to **ECS compute environments only**. For EKS, these roles are ignored — use `serviceAccountName` with [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) to grant IAM permissions to your EKS pods instead. ::: ### S3 permissions when using `bucket` When you set the `bucket` property, the Kestra worker itself (not the ECS task container) uploads `inputFiles` and `namespaceFiles` to S3 before the job starts and downloads `outputFiles` after it finishes. It also deletes the working-directory prefix from the bucket on cleanup. 
The Kestra IAM principal therefore needs the following additional permissions when `bucket` is configured: ```json { "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket" ], "Effect": "Allow", "Resource": "*" } ] } ``` The ECS task container separately needs S3 access via its `taskRoleArn` to read input files and write output files at runtime. Refer to the [Create the `ecsTaskRole` IAM role](#create-the-ecstaskrole-iam-role) section for the task-level policy. For EKS compute environments, grant S3 access to the pod's IAM role via IRSA and set `serviceAccountName` on the task runner. ## Resource sizing ### Default resources By default, each job runs with `1 vCPU` and `2048 MiB` of memory. Override this with the `resources` property: ```yaml taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch # ... resources: request: cpu: "2" memory: "4096" ``` ### Fargate CPU and memory constraints AWS Fargate enforces strict combinations of vCPU and memory. The task runner validates these at runtime and will throw an error if an invalid combination is used. | vCPU | Allowed memory (MiB) | |---|---| | `0.25` | 512, 1024, 2048 | | `0.5` | 1024, 2048, 3072, 4096 | | `1` | 2048, 3072, 4096, 5120, 6144, 7168, 8192 | | `2` | 4096 – 16384 (increments of 1024) | | `4` | 8192 – 30720 (increments of 1024) | | `8` | 16384 – 61440 (increments of 4096) | | `16` | 32768 – 122880 (increments of 8192) | For EC2 compute environments, the vCPU value must be a whole integer (e.g. `"1"`, `"2"`) and must be ≥ 1. For EKS compute environments, CPU is specified as a decimal (e.g. `"0.5"`, `"1"`) and memory as an integer in MiB. The Fargate combination restrictions above do not apply. ### Sidecar container resources When `inputFiles`, `namespaceFiles`, or `outputFiles` are used, the task runner adds sidecar containers that handle S3 file transfers. 
Default sidecar resources are: - **ECS Fargate**: `0.25 vCPU` / `512 MiB` - **ECS EC2**: `1 vCPU` / `128 MiB` On Fargate, AWS Batch enforces resource limits at the **task level**. To keep the overall task resources equal to the value set in `resources.request`, the sidecar resources are automatically subtracted from the main container. For example, with `resources.request = 1 vCPU / 2048 MiB` and one sidecar at the default `0.25 vCPU / 512 MiB`, the main container will receive `0.75 vCPU / 1536 MiB`. If your `resources.request` is too small to accommodate the sidecars, the task runner will throw an error at startup. You can either increase `resources.request` or override sidecar sizing with `sidecarResources`: ```yaml taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch # ... resources: request: cpu: "1" memory: "2048" sidecarResources: request: cpu: "0.25" memory: "512" ``` :::alert{type="info"} Fargate always assigns a public IP address to each task. If your subnets do not have a route to the internet (no internet gateway or NAT gateway), the containers will not be able to pull Docker images from public registries. ::: For EKS compute environments, sidecar resource limits are applied at the container level rather than the pod level, so the task-level resource subtraction described above does not apply. 
## How to run tasks on AWS ECS Fargate The example below demonstrates how to use the AWS Batch task runner to offload Python scripts to a serverless container running on AWS ECS Fargate: ```yaml id: aws_batch_runner namespace: company.team tasks: - id: scrape_environment_info type: io.kestra.plugin.scripts.python.Script containerImage: ghcr.io/kestra-io/pydata:latest taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch region: eu-central-1 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" computeEnvironmentArn: "arn:aws:batch:eu-central-1:707969873520:compute-environment/kestraFargateEnvironment" jobQueueArn: "arn:aws:batch:eu-central-1:707969873520:job-queue/kestraJobQueue" executionRoleArn: "arn:aws:iam::707969873520:role/kestraEcsTaskExecutionRole" taskRoleArn: arn:aws:iam::707969873520:role/ecsTaskRole bucket: kestra-product-de namespaceFiles: enabled: true outputFiles: - "*.json" script: | import platform import socket import sys import json from kestra import Kestra print("Hello from AWS Batch and kestra!") def print_environment_info(): print(f"Host's network name: {platform.node()}") print(f"Python version: {platform.python_version()}") print(f"Platform information (instance type): {platform.platform()}") print(f"OS/Arch: {sys.platform}/{platform.machine()}") env_info = { "host": platform.node(), "platform": platform.platform(), "OS": sys.platform, "python_version": platform.python_version(), } Kestra.outputs(env_info) filename = "{{ workingDir }}/environment_info.json" with open(filename, "w") as json_file: json.dump(env_info, json_file, indent=4) if __name__ == "__main__": print_environment_info() ``` :::alert{type="info"} For a full list of available properties, see the [AWS plugin documentation](/plugins/plugin-ee-aws/aws-batch-task-runner/io.kestra.plugin.ee.aws.runner.batch) or view them in the built-in Code Editor in the Kestra UI. 
::: ## How to run tasks on AWS Batch with EKS The example below shows how to run a shell command using an EKS compute environment. The container image must include `/bin/sh`. Use `serviceAccountName` with [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) to grant the pod access to AWS services like S3 — `taskRoleArn` and `executionRoleArn` are ignored for EKS. ```yaml id: run_container_on_eks namespace: company.team variables: region: us-east-1 compute_environment_arn: arn:aws:batch:us-east-1:123456789:compute-environment/kestraEksEnvironment job_queue_arn: arn:aws:batch:us-east-1:123456789:job-queue/kestraEksQueue tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands containerImage: amazonlinux:2 taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch region: "{{ vars.region }}" accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" computeEnvironmentArn: "{{ vars.compute_environment_arn }}" jobQueueArn: "{{ vars.job_queue_arn }}" serviceAccountName: kestra-sa commands: - echo "Hello from AWS Batch on EKS" ``` :::alert{type="warning"} CloudWatch log streaming for EKS requires the EKS cluster to have CloudWatch logging configured, for example via [Fluent Bit](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-logs-FluentBit.html) or the CloudWatch agent. Without cluster-level logging, task logs will not appear in Kestra. ::: To set up an EKS cluster for use with AWS Batch, follow the [AWS getting started with AWS Batch on Amazon EKS](https://docs.aws.amazon.com/batch/latest/userguide/getting-started-eks.html) guide. ## Full step-by-step guide: setting up AWS Batch from scratch To use the AWS Batch task runner, you must configure resources in your AWS account. You can set up the environment in two ways: 1. Using Terraform to provision all necessary resources using a simple `terraform apply` command. 2. 
Creating the resources step by step from the AWS Management Console.
### Before you begin You will need: 1. An AWS account. 2. A Kestra Enterprise Edition instance running version 0.18.0 or later with AWS credentials stored as [secrets](../../../06.concepts/04.secret/index.md). --- ### Terraform setup Follow the instructions in the [aws-batch README](https://github.com/kestra-io/deployment-templates/blob/main/aws/terraform/aws-batch/README.md) in the [terraform-deployments-templates](https://github.com/kestra-io/deployment-templates/tree/main) repository to provision resources using Terraform. You can also use [this blueprint](/blueprints/aws-batch-terraform-git), which creates all required resources in a single Kestra workflow execution. Here is a list of resources that will be created: - **AWS Security Group:** a security group for AWS Batch jobs with egress to the internet (required to be able to download public Docker images in your script tasks). - **AWS IAM Roles and Policies:** IAM roles and policies for AWS Batch and ECS Task Execution, including permissions for S3 access (S3 is used to store input and output files for container access). - **AWS Batch Compute Environment:** a managed ECS Fargate compute environment named `kestraFargateEnvironment`. - **AWS Batch Job Queue:** a job queue named `kestraJobQueue` for submitting batch jobs. --- ### AWS Management Console setup #### Create the `ecsTaskExecutionRole` IAM role Create an execution role that allows AWS Batch to manage resources on your behalf. 1. Open the [IAM console](https://console.aws.amazon.com/iam). 2. In the navigation menu, choose **Roles**. 3. Choose **Create role**. 4. In the **Select trusted entity**, choose **Custom trust policy** and paste the following trust policy JSON: ```json { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "Service": "ecs-tasks.amazonaws.com" }, "Action": "sts:AssumeRole" } ] } ``` ![iam](./iam.png) 5. Click on **Next** and add the `AmazonECSTaskExecutionRolePolicy`. 6. 
Then, for **Role Name**, enter `ecsTaskExecutionRole`. 7. Finally, click on **Create role**. ![create_role](./create_role.png) Make sure to copy the ARN of the role. You will need it later. ![role_arn](./role_arn.png) #### Create the `ecsTaskRole` IAM role On top of the Execution Role, we will also need a Task Role that includes S3 access permissions to store files. First, we'll need to create a policy the role can use for accessing S3. 1. Open the [IAM console](https://console.aws.amazon.com/iam). 2. In the navigation menu, choose **Policies**. 3. Select **JSON** and paste the following into the **Policy editor**: ```json { "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket" ], "Effect": "Allow", "Resource": "*" } ] } ``` ![policy1](./policy1.png) 4. Select **Next** and type in a name for the policy, such as `ecsTaskRoleS3Policy`. 5. Once you're done, select **Create policy**. ![policy2](./policy2.png) Now create a new role with the same trust policy as above. Attach the new policy before completing. 1. Open the [IAM console](https://console.aws.amazon.com/iam). 2. In the navigation menu, choose **Roles**. 3. Choose **Create role**. 4. In the **Select trusted entity**, choose **Custom trust policy** and paste the following trust policy JSON: ```json { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "Service": "ecs-tasks.amazonaws.com" }, "Action": "sts:AssumeRole" } ] } ``` 5. Click on **Next**. 6. Search for the new policy and check the box on the left. Once you've done this, select **Next**. ![role_permission](./role_permission.png) 7. Then, for **Role Name**, enter `ecsTaskRole`. 8. Finally, click on **Create role**. #### AWS Batch setup Go to the AWS Batch console. ![batch4_search](./batch4_search.png) Then, click on **Get Started**.
If you don't see the **Get Started** button, add `#firstRun` to the URL: ![batch4_firstrun](./batch4_firstrun.png) Follow the wizard to create a new compute environment. ![batch4_jobtype](./batch4_jobtype.png) You should see the following text recommending the use of Fargate: > "We recommend using Fargate in most scenarios. Fargate launches and scales the compute to closely match the resource requirements that you specify for the container. With Fargate, you don't need to over-provision or pay for additional servers. You also don't need to worry about the specifics of infrastructure-related parameters such as instance type. When the compute environment needs to be scaled up, jobs that run on Fargate resources can get started more quickly. Typically, it takes a few minutes to spin up a new Amazon EC2 instance. However, jobs that run on Fargate can be provisioned in about 30 seconds. The exact time required depends on several factors, including container image size and number of jobs. [Learn more](https://docs.aws.amazon.com/batch/latest/userguide/fargate.html)." We will follow that advice and use Fargate for this tutorial. #### Step 1: Select Orchestration type Select **Fargate** and click on **Next**. #### Step 2: Create a compute environment Add a name for your compute environment — here, we chose `kestra`. You can keep the default settings for everything. Select the VPC and subnets you want to use — you can use the default VPC and subnets and the default VPC security group. Then, click on Next. ![batch5](./batch5.png) #### Step 3: Create a job queue Now we can create a job queue. Here, we also name it `kestra`. You can keep the default settings. Then, click on Next: ![batch6](./batch6.png) #### Step 4: Create a job definition Finally, create a job definition. Here, we name it also `kestra`. Under Execution role, select the role we created earlier (`ecsTaskExecutionRole`). 
Besides that, you can keep default settings for everything else (we adjusted the image to ``ghcr.io/kestra-io/pydata:latest`` but that's totally optional). Then, click on **Next**: ![batch7](./batch7.png) #### Step 5: Create a job Finally, create a job named `kestra`. Click **Next** to review settings: ![batch8](./batch8.png) #### Step 6: Review and create Review your settings and click on **Create resources**: ![batch9](./batch9.png) Once you see this message, you are all set: ![batch10](./batch10.png) #### Copy and apply the ARN to your Kestra configuration Copy the ARN of the compute environment and job queue. You will need to add these to your Kestra configuration. ![batch11](./batch11.png) ![batch12](./batch12.png) ### Create an S3 Bucket Create an S3 bucket to store input and output files. To do this, open **S3** → **Create bucket**. ![s3_create](./s3_create.png) Next you'll need to add a name and leave everything else as a default value. ![s3_bucket_name](./s3_bucket_name.png) Scroll to the bottom and select **Create bucket**. Now that we have a bucket, we'll need to add the name into Kestra. ### Run your Kestra task on AWS ECS Fargate Fill in the ARNs of the compute environment and job queue in your Kestra configuration. 
Here is an example of a flow that uses the `aws.runner.Batch` to run a Python script on AWS ECS Fargate to get environment information and print it to the logs: ```yaml id: aws_batch_runner namespace: company.team variables: compute_environment_arn: arn:aws:batch:us-east-1:123456789:compute-environment/kestra job_queue_arn: arn:aws:batch:us-east-1:123456789:job-queue/kestra execution_role_arn: arn:aws:iam::123456789:role/ecsTaskExecutionRole task_role_arn: arn:aws:iam::123456789:role/ecsTaskRole tasks: - id: send_data type: io.kestra.plugin.scripts.python.Script containerImage: ghcr.io/kestra-io/pydata:latest taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch region: us-east-1 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" computeEnvironmentArn: "{{ vars.compute_environment_arn }}" jobQueueArn: "{{ vars.job_queue_arn }}" executionRoleArn: "{{ vars.execution_role_arn }}" taskRoleArn: "{{ vars.task_role_arn }}" bucket: kestra-us script: | import platform import socket import sys print("Hello from AWS Batch and kestra!") def print_environment_info(): print(f"Host's network name: {platform.node()}") print(f"Python version: {platform.python_version()}") print(f"Platform information (instance type): {platform.platform()}") print(f"OS/Arch: {sys.platform}/{platform.machine()}") try: hostname = socket.gethostname() ip_address = socket.gethostbyname(hostname) print(f"Host IP Address: {ip_address}") except socket.error as e: print("Unable to obtain IP address.") if __name__ == '__main__': print_environment_info() ``` When you execute this task, the environment information appears in the logs generated by the Python script: ![logs](./logs.png) ## Advanced configuration The task runner exposes several optional properties for tuning behavior and authentication. ### Polling and timeouts | Property | Default | Description | |---|---|---| | `waitUntilCompletion` | `PT1H` | Maximum duration to wait for the job to complete. 
If the task defines a `timeout`, that value takes precedence. AWS Batch will automatically terminate the job when this duration is reached. | | `completionCheckInterval` | `PT5S` | How often Kestra polls AWS Batch for job status. Lower values reduce latency for short jobs; higher values reduce API call volume for long-running jobs. | ### Job lifecycle | Property | Default | Description | |---|---|---| | `resume` | `true` | When `true`, if the Kestra worker is restarted while a job is running, it will reconnect to the existing job rather than submitting a new one. Requires a `jobQueueArn` to be configured. | | `delete` | `true` | When `true`, the job definition, any auto-created job queue, and the S3 working-directory prefix are deleted after the job completes. Set to `false` to retain resources for debugging — note that a task retry may then reconnect to the previous (failed) job. | ### EKS: service account and IRSA For EKS compute environments, use `serviceAccountName` to attach a Kubernetes service account to the pod. Annotate the service account with an IAM role ARN to enable [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) — this is the recommended way to grant pods access to AWS services such as S3. | Property | Description | |---|---| | `serviceAccountName` | Name of the Kubernetes service account to attach to the EKS pod. Use with IRSA for IAM authorization. Ignored for ECS compute environments. | `taskRoleArn` and `executionRoleArn` are ignored when the compute environment is EKS. 
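The properties above can be combined in a single runner block. A sketch tuning polling and lifecycle behavior (the values are illustrative; the property names come from the tables above):

```yaml
taskRunner:
  type: io.kestra.plugin.ee.aws.runner.Batch
  # ... compute environment, job queue, and credentials as in the examples above
  waitUntilCompletion: PT2H       # allow jobs up to 2 hours before termination
  completionCheckInterval: PT30S  # poll less often to reduce API call volume
  resume: true                    # reconnect to a running job after a worker restart (requires jobQueueArn)
  delete: false                   # keep the job definition and S3 files for debugging
```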
### STS role assumption

Instead of static `accessKeyId` / `secretKeyId` credentials, you can authenticate via AWS STS `AssumeRole` for cross-account access or short-lived credentials:

```yaml
taskRunner:
  type: io.kestra.plugin.ee.aws.runner.Batch
  region: eu-central-1
  stsRoleArn: "arn:aws:iam::123456789012:role/kestra-batch-role"
  stsRoleExternalId: "{{ secret('STS_EXTERNAL_ID') }}"
  stsRoleSessionName: kestra-session
  computeEnvironmentArn: "arn:aws:batch:eu-central-1:123456789012:compute-environment/kestraFargateEnvironment"
```

| Property | Description |
|---|---|
| `stsRoleArn` | ARN of the IAM role to assume. |
| `stsRoleExternalId` | External ID for the trust policy (optional). |
| `stsRoleSessionName` | Session name tag attached to the assumed-role session (optional). |
| `stsEndpointOverride` | Override the STS endpoint URL (optional, useful in GovCloud or custom environments). |
| `stsRoleSessionDuration` | Duration of the assumed-role session (optional; defaults to the AWS minimum). |

---

# Azure Batch Task Runner: Run Tasks on Azure Containers

URL: https://kestra.io/docs/task-runners/types/azure-batch-task-runner

> Offload Kestra tasks to Azure Batch to run large-scale parallel and high-performance computing applications efficiently.

Run tasks as containers on Azure Batch VMs.

## Offload tasks to Azure Batch

This task runner deploys a container for the task in a specified Azure Batch pool. To launch a task on Azure Batch, there are two main concepts to understand:

1. **Pool** — mandatory; not created by the task. This is a pool composed of nodes where your task can run.
2. **Job** — created by the task runner; contains information about which image, commands, and resources to use.
## How the Azure Batch task runner works

To support `inputFiles`, `namespaceFiles`, and `outputFiles`, the Azure Batch task runner relies on [resource files](https://learn.microsoft.com/en-us/azure/batch/resource-files) and [output files](https://learn.microsoft.com/en-us/rest/api/batchservice/task/add?view=rest-batchservice-2023-11-01&tabs=HTTP), which transit through [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs).

Since the working directory of the container is not known in advance, you must explicitly define both the working directory and the output directory when using the Azure Batch runner. For example, use `cat {{ workingDir }}/myFile.txt` rather than `cat myFile.txt`.

The following Pebble expressions and environment variables are available inside the container at runtime:

- `{{ workingDir }}` / `WORKING_DIR` — the working directory path inside the container where input files are staged.
- `{{ outputDir }}` / `OUTPUT_DIR` — the output directory path inside the working directory where output files must be written.
- `{{ bucketPath }}` / `BUCKET_PATH` — the path in Azure Blob Storage allocated for this task run, used to stage input and output files.

## Authentication requirements

The Azure Batch task runner described on this page uses the following credentials:

- a Batch account access key (`account`, `accessKey`, `endpoint`)
- an Azure Blob Storage connection string (`blobStorage.connectionString`)

Because the runner uses shared keys rather than Azure RBAC to submit jobs, this page does not define a minimum Azure role assignment in the same way as the AWS Batch or Google Cloud Run task runners. Instead, the minimum requirement for Kestra is:

- access to a Batch account key for the target Batch account
- access to a Blob Storage connection string for the container used to stage input and output files

If your organization uses Azure RBAC to control who can create or rotate those credentials, manage that access outside Kestra as part of your platform setup.

### Alternative blob storage authentication

Instead of a `connectionString`, you can authenticate to blob storage using a shared key directly. In that case, provide the following properties under `blobStorage` instead:

- `containerName` — the name of the blob container.
- `endpoint` — the blob storage endpoint URL (e.g. `https://<storage-account>.blob.core.windows.net`).
- `sharedKeyAccountName` — the storage account name.
- `sharedKeyAccountAccessKey` — the storage account access key.

Example:

```yaml
blobStorage:
  containerName: "{{ vars.containerName }}"
  endpoint: "{{ secret('AZURE_BLOB_ENDPOINT') }}"
  sharedKeyAccountName: "{{ secret('AZURE_STORAGE_ACCOUNT_NAME') }}"
  sharedKeyAccountAccessKey: "{{ secret('AZURE_STORAGE_ACCOUNT_KEY') }}"
```

:::alert{type="info"}
If the Kestra Worker running this task is terminated, the Azure Batch job will continue running until completion. Upon restart, the Worker will resume processing on the existing job unless `resume` is set to `false`.
:::

## Key properties

| Property | Default | Description |
|---|---|---|
| `poolId` | — | **Required.** ID of the Azure Batch pool on which to run the job. |
| `delete` | `true` | Whether to delete the job upon completion. If `false`, a task retry could reconnect to a previous failed job. |
| `resume` | `true` | Whether to reconnect to an existing job if one already exists (useful after a Worker restart). |
| `waitUntilCompletion` | `PT1H` | Maximum duration to wait for job completion. Overridden by the task-level `timeout` property if set. |
| `completionCheckInterval` | `PT5S` | How often Kestra polls Azure Batch for job completion. Lower values increase API call frequency. |
| `streamLogs` | `false` | Enable log streaming during task execution. Useful with `timeout` — ensures logs up to the termination point are captured. |
| `registry` | — | Private container registry configuration for pulling images that require authentication. |
| `blobStorage` | — | Blob storage configuration for staging input/output files. Required when using `inputFiles`, `outputFiles`, or `namespaceFiles`. |

## A full flow example

```yaml
id: azure_batch_runner
namespace: company.team

variables:
  poolId: "poolId"
  containerName: "containerName"

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.azure.runner.Batch
      account: "{{ secret('AZURE_ACCOUNT') }}"
      accessKey: "{{ secret('AZURE_ACCESS_KEY') }}"
      endpoint: "{{ secret('AZURE_ENDPOINT') }}"
      poolId: "{{ vars.poolId }}"
      blobStorage:
        containerName: "{{ vars.containerName }}"
        connectionString: "{{ secret('AZURE_CONNECTION_STRING') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from Azure Batch and Kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = 'environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        if __name__ == '__main__':
            print_environment_info()
```

:::alert{type="info"}
For a full list of properties available in the Azure Batch task runner, see the [Azure plugin documentation](/plugins/plugin-ee-azure/runner/io.kestra.plugin.ee.azure.runner.batch) or view them in the built-in Code Editor in the Kestra UI.
:::

## Full step-by-step guide: setting up Azure Batch from scratch
### Before you begin

Before starting, ensure you have the following:

1. A Microsoft Azure account.
2. A Kestra instance (version 0.16.0 or later) with Azure credentials stored as [secrets](../../../06.concepts/04.secret/index.md) or environment variables.

### Azure portal setup

#### Create a Batch account and Azure Storage account

Once logged into your Azure account, search for **Batch accounts** and select the first option under **Services**.

![search](./search.png)

On that page, select **Create** to make a new account.

![create-account](./create-account.png)

Select the appropriate resource group, then fill in the **Account name** and **Location** fields. Next, click **Select a storage account**.

![new-account](./new-account.png)

If you don’t have an existing storage account, click **Create new** and type a name (e.g., `mybatchstorage`). Leave the other settings as defaults and select **OK**.

![storage-account](./storage-account.png)

![create-storage-account](./create-storage-account.png)

After the details are filled in, click **Review + create** and then **Create** to finish creating the Batch account.

![account-created](./account-created.png)

Once the account is created, you’ll see a **Deployment succeeded** message. Select **Go to resource** to open the account.

#### Create a pool

With your Batch account ready, you can create a pool of compute nodes in which Kestra will run tasks. On the Batch account page, select **Pools** from the left navigation menu, then click **Add** at the top.

![pools-menu](./pools-menu.png)

On the **Add pool** page, enter a **Pool ID**.

![pool-name](./pool-name.png)

Under **Operating System**:

- Select **Publisher**: `microsoft-azure-batch`
- Select **Offer**: `ubuntu-server-container`
- Select **Sku**: `20-04-lts`

![os](./os.png)

Scroll to **Node size** and select **Standard_A1_v2**, which provides 1 vCPU and 2 GB of memory. Enter **2** for **Target dedicated nodes**.

![node-size](./node-size.png)

Once complete, select **OK** to create the pool.

#### Create an access key

In your Batch account, go to **Settings** → **Keys**. Generate a new set of keys. You will need:

- `Batch account` for `account`
- `Account endpoint` for `endpoint`
- `Primary access key` for `accessKey`

#### Create blob storage

Search for **Storage accounts** and select your newly created account. Under **Data storage**, select **Containers**, then click **+ Container** to make a new one.

![data-storage](./data-storage.png)

Enter a name for the container and select **Create**.

![create-container](./create-container.png)

Now that you’ve created your Batch account, storage account, pool, and container, you can create your flow in Kestra.

### Creating your flow

Below is an example flow that runs a Python file called `main.py` on an Azure Batch task runner. At the top of the `io.kestra.plugin.scripts.python.Commands` task, you’ll define the task runner properties:

```yaml
containerImage: ghcr.io/kestra-io/pydata:latest
taskRunner:
  type: io.kestra.plugin.ee.azure.runner.Batch
  account: "{{ secret('AZURE_ACCOUNT') }}"
  accessKey: "{{ secret('AZURE_ACCESS_KEY') }}"
  endpoint: "{{ secret('AZURE_ENDPOINT') }}"
  poolId: "{{ vars.poolId }}"
  blobStorage:
    containerName: "{{ vars.containerName }}"
    connectionString: "{{ secret('AZURE_CONNECTION_STRING') }}"
```

Here you can provide Azure details such as `account`, `accessKey`, `endpoint`, `poolId`, and `blobStorage`. These can be added as [secrets](../../../06.concepts/04.secret/index.md) and [variables](../../../05.workflow-components/04.variables/index.md).

```yaml
id: azure_batch_runner
namespace: company.team

variables:
  poolId: "poolId"
  containerName: "containerName"

tasks:
  - id: get_env_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.azure.runner.Batch
      account: "{{ secret('AZURE_ACCOUNT') }}"
      accessKey: "{{ secret('AZURE_ACCESS_KEY') }}"
      endpoint: "{{ secret('AZURE_ENDPOINT') }}"
      poolId: "{{ vars.poolId }}"
      blobStorage:
        containerName: "{{ vars.containerName }}"
        connectionString: "{{ secret('AZURE_CONNECTION_STRING') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from Azure Batch and Kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = 'environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        if __name__ == '__main__':
            print_environment_info()
```

When you execute the flow, you can see the task runner logs in Kestra:

![logs](./logs.png)

You can also view the created task runner in the Azure Portal:

![batch-jobs](./batch-jobs.png)

Once the task is complete, Azure automatically shuts down the runner. You can view the generated outputs in the **Outputs** tab in Kestra, which includes the information produced by the Azure Batch task runner from the Python script:

![outputs](./outputs.png)

---

# Docker Task Runner – Run Tasks in Containers

URL: https://kestra.io/docs/task-runners/types/docker-task-runner

> Isolate task execution in Docker containers using Kestra's Docker Task Runner for consistent environments.

Run tasks as Docker containers.

## Run tasks in containers with the Docker runner
The following example shows how to use the Docker task runner to execute commands in a Docker container:

```yaml
id: docker_task_runner
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      cpu:
        cpus: 1
    commands:
      - echo "Hello World!"
```

Once you specify the `taskRunner` type, you get autocompletion and validation for the runner-specific properties. In the example above, the task allocates one CPU to the container.

![docker_runner](../../02.benefits/docker_runner.png)

## Docker task runner properties

The only required property when using the Docker task runner is `containerImage`, which must be set on the script task. The image can come from a public or private registry.

Additionally, when using the Docker task runner, you can configure memory allocation, volumes, environment variables, and more. For a full list of available properties, refer to the [Docker plugin documentation](/plugins/plugin-script-python/io.kestra.plugin.scripts.runner.docker.docker) or explore them in the built-in Code Editor in the Kestra UI.

:::alert{type="info"}
The Docker task runner executes the script task as a container in a Docker-compatible engine. This means you can use it to run scripts within a Kubernetes cluster using Docker-In-Docker (DinD) or in a local Docker engine.
:::

---

## Task runner behavior in a failure scenario

In general, each task runner container initiated by Kestra will **continue running until the task completes**, even if the Kestra worker is terminated (for example, due to a crash). However, there are a few caveats depending on how Kestra and the task runner are deployed.

### Kestra running in a Docker container, task runner running in DinD

When Kestra runs in a Docker container and uses DinD for task runners, terminating the Kestra container will also terminate the DinD container and any running task containers inside it. No container is automatically restarted.

### Kestra running in Kubernetes, task runner running in DinD

When Kestra and DinD are deployed in the same pod in a Kubernetes environment, the pod will restart if the Kestra Worker fails. This ensures that both the DinD container and any task runner containers are restarted.

### Kestra deployed with Docker Compose, task runner running in DinD

When using Docker Compose, Kestra and DinD containers can be managed independently. Restarting the Kestra container does **not** automatically restart the DinD container. Therefore, task runners running inside DinD may continue running even if Kestra is restarted.

---

## Insecure registry

The Docker task runner relies on the Docker daemon for registry configuration. If you need to use an insecure (HTTP) registry, it must be configured at the **daemon level**, not within the task or flow — make sure that the insecure registry is configured on the host machine where your Kestra server is running.

For example, if your registry is at `10.10.1.5:5000`, add the following to `/etc/docker/daemon.json`, then restart the Docker daemon:

```json
{
  "insecure-registries": ["10.10.1.5:5000"]
}
```

Important notes:

* Do **not** include `http://` or `https://` in the registry address.
* The `insecure-registries` setting cannot be provided in the task configuration; it has no effect there.
* Restart the Docker daemon after updating this file.

You can then reference your registry directly in the flow:

```yaml
id: docker_example
namespace: demo

tasks:
  - id: my_command
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: 10.10.1.5:5000/my-image
    commands:
      - echo "Hello World!"
```

---

# Google Batch Task Runner: Run Tasks on Cloud VMs

URL: https://kestra.io/docs/task-runners/types/google-batch-task-runner

> Execute Kestra tasks using Google Batch to provision and manage compute resources on Google Cloud efficiently.
Run tasks as containers on Google Cloud VMs.

## Offload tasks to Google Cloud Batch

The Google Batch task runner deploys a container for each task on a specified Google Cloud Batch VM. To launch tasks on Google Cloud Batch, you should understand five main concepts:

1. **Machine type** — a required property that defines the compute machine type where the task will be deployed. If no `reservation` is specified, a new compute instance will be created for each batch, which can add up to a minute of startup latency.
2. **Reservation** — an optional property that lets you reserve virtual machines in advance to avoid the delay of provisioning new instances for every task.
3. **Network interfaces** — optional; if not specified, the runner will use the default network interface.
4. **Compute resources** — an optional property that overrides CPU (in milliCPU), memory (in MiB), and boot disk size per task, independently of the machine type. Defaults are 2000 milliCPU (2 vCPU) and 2048 MiB. Values must stay compatible with the chosen machine type — for example, `n2-standard-2` provides 2 vCPUs and 8 GiB of memory, so `cpu` must not exceed `2000` and `memory` must not exceed `8192`.

   ```yaml
   computeResource:
     cpu: "1000" # 1 vCPU in milliCPU
     memory: "1024" # 1 GiB in MiB
     bootDisk: "20GiB"
   ```

5. **Task retries** — use `maxRetryCount` (0–10, default 0) to have Google Batch automatically retry a failed task container before marking the job as failed. Combine with `lifecyclePolicies` for fine-grained control over which exit codes trigger a retry.

## How the Google Batch task runner works

To support `inputFiles`, `namespaceFiles`, and `outputFiles`, the Google Batch task runner performs the following actions:

- Mounts a volume from a GCS bucket.
- Uploads input files to the bucket before launching the container.
- Downloads output files from the bucket after the container finishes.
- Alternatively, any file written to `{{ outputDir }}` (accessible via the `OUTPUT_DIR` environment variable) is automatically captured as an output — useful when the set of output files is not known in advance.

:::alert{type="warning"}
Unlike other task runners, the Google Batch task runner executes commands from the **root directory** (`/`), not the working directory. You must always reference files using the `{{ workingDir }}` expression or the `WORKING_DIR` environment variable — for example, `python {{ workingDir }}/main.py` instead of `python main.py`.
:::

The following Pebble expressions and environment variables are available inside the task:

| Pebble expression | Environment variable | Description |
|---|---|---|
| `{{ workingDir }}` | `WORKING_DIR` | Path to the task's working directory where input files are placed |
| `{{ outputDir }}` | `OUTPUT_DIR` | Path to the output directory; files written here are automatically captured |
| `{{ bucketPath }}` | `BUCKET_PATH` | GCS URI of the task's staging folder in the configured bucket |

:::alert{type="info"}
If the Kestra worker processing this task is restarted, the Batch job continues running on GCP. When the worker comes back up, it automatically reattaches to the existing job (matched by labels) rather than creating a new one. Set `resume: false` to disable this behavior.
:::

By default, the task runner deletes the Batch job and all staging files from the GCS bucket once the task completes. Set `delete: false` to retain them for inspection — but be aware that stale jobs may be reused by the `resume` logic on the next run.
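The `resume`, `delete`, and `maxRetryCount` behavior described above can be tuned directly on the task runner. The sketch below is illustrative only: the project ID, region, and bucket are placeholders, and required properties such as the machine type are omitted for brevity.

```yaml
taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.Batch
  projectId: my-project       # placeholder
  region: europe-west9
  bucket: my-staging-bucket   # placeholder
  resume: false       # always submit a fresh Batch job instead of reattaching
  delete: false       # keep the job and staged GCS files for inspection
  maxRetryCount: 2    # let Google Batch retry the container up to twice
```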
## Example flow

```yaml
id: gcp_batch_runner
namespace: company.team

variables:
  region: europe-west9

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      region: "{{ vars.region }}"
      bucket: "{{ secret('GCS_BUCKET') }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from GCP Batch and kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = '{{ workingDir }}/environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        if __name__ == '__main__':
            print_environment_info()
```

:::alert{type="info"}
For a full list of available properties, see the [Google Batch plugin documentation](/plugins/plugin-ee-gcp/google-cloud-task-runner/io.kestra.plugin.ee.gcp.runner.batch) or explore the configuration in the built-in Code Editor in the Kestra UI.
:::

---

## Full setup guide: running Google Batch from scratch

### Before you begin

You'll need the following prerequisites:

1. A Google Cloud account.
2. A Kestra instance (version 0.16.0 or later) with Google credentials stored as [secrets](../../../06.concepts/04.secret/index.md) or set as environment variables.

### Required IAM roles

The service account used by Kestra needs the following roles:

| Role | Purpose |
|---|---|
| `roles/batch.jobsEditor` | Create and manage Batch jobs |
| `roles/logging.viewer` | Stream task logs from Cloud Logging |
| `roles/storage.objectAdmin` | Read and write staging files in the GCS bucket |
| `roles/iam.serviceAccountUser` | Allow Batch to run jobs as the Compute Engine service account |

### Google Cloud Console setup

#### Create a project

If you don't already have one, create a new project in the Google Cloud Console.

![project](../../04.types/07.google-cloudrun-task-runner/project.png)

Once created, ensure your new project is selected in the top navigation bar.

![project_selection](../../04.types/07.google-cloudrun-task-runner/project-selection.png)

#### Enable the Batch API

Navigate to the **APIs & Services** section and search for **Batch API**. Enable it so Kestra can create and manage Batch jobs.

![batchapi](./batchapi.png)

After enabling the API, you'll be prompted to create credentials for integration.

#### Create a service account

Once the Batch API is active, create a service account to allow Kestra to access GCP resources. Follow the prompt for **Application data**, which will generate a new service account.

![api-credentials-1](./api-credentials-1.png)

Give the service account a descriptive name.

![sa-1](./sa-1.png)

Assign the following roles:

- **Batch Job Editor**
- **Logs Viewer**
- **Storage Object Admin**

![roles](./roles.png)

Next, create a key for this service account by going to **Keys → Add Key**, and choose **JSON**. This will generate credentials you can add to Kestra as a secret or directly into your flow configuration.

![create-key](./create-key.png)

See [Google credentials guide](../../../15.how-to-guides/google-credentials/index.md) for more details.

Grant this service account access to the **Compute Engine default service account** by navigating to **IAM & Admin → Service Accounts → Permissions → Grant Access**, then assigning the **Service Account User** role.

![compute](./compute.png)

#### Create a storage bucket

Search for "Bucket" in the Cloud Console and create a new GCS bucket. You can keep the default configuration for now.

![bucket](../../04.types/07.google-cloudrun-task-runner/bucket.png)

### Create a flow

Below is a sample flow that runs a Python file (`main.py`) using the Google Batch Task Runner. The `taskRunner` section defines properties such as the project, region, and bucket.

:::alert{type="info"}
By default, the task runner uses the default network configuration of your Google Cloud project. If none exists, you can configure connectivity manually using the `networkInterfaces` property. See the [Google Cloud Batch Task Runner documentation](https://kestra.io/plugins/plugin-ee-gcp/google-cloud-task-runner/io.kestra.plugin.ee.gcp.runner.batch#properties_networkInterfaces-body) for details.
:::

```yaml
id: gcp_batch_runner
namespace: company.team

variables:
  region: europe-west2

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/kestrapy:latest
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      region: "{{ vars.region }}"
      bucket: "{{ secret('GCS_BUCKET') }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from GCP Batch and kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = '{{ workingDir }}/environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        print_environment_info()
```

When you execute the flow, the logs will show the task runner being created:

![logs](./logs.png)

You can also confirm job creation directly in the Google Cloud Console:

![batch-jobs](./batch-jobs.png)

After the task completes, the runner automatically shuts down. You can review output artifacts in Kestra's **Outputs** tab:

![outputs](./outputs.png)

---

# Google Cloud Run Task Runner: Serverless Task Execution

URL: https://kestra.io/docs/task-runners/types/google-cloudrun-task-runner

> Run Kestra tasks as serverless containers on Google Cloud Run for scalable and managed execution.

Run tasks as containers on Google Cloud Run.
## Overview

The Google Cloud Run task runner deploys the container for each task as a Cloud Run Job. When no file operations are configured, the container starts in the root directory — use the `{{ workingDir }}` Pebble expression or the `WORKING_DIR` environment variable to reference input files and outputs rather than relying on the current directory.

## File handling

To use the `inputFiles`, `outputFiles`, or `namespaceFiles` properties, set the `bucket` property. The bucket acts as an intermediary storage layer:

- Input and namespace files are uploaded to the bucket before the task runs.
- Output files are stored in the bucket during execution and made available for download and preview in the Kestra UI afterward.

The task runner creates a unique folder for each run. You can access this folder using the `{{ bucketPath }}` Pebble expression or the `BUCKET_PATH` environment variable.

### Syncing the full working directory

Set `syncWorkingDirectory: true` to download the entire working directory after task completion, which is useful when tasks produce files dynamically without knowing their names in advance:

```yaml
taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.CloudRun
  projectId: "{{ secret('GCP_PROJECT_ID') }}"
  region: europe-west2
  bucket: "{{ secret('GCP_BUCKET') }}"
  serviceAccount: "{{ secret('GOOGLE_SA') }}"
  syncWorkingDirectory: true
```

## IAM permissions

The service account used by Kestra must be able to create Cloud Run jobs, view logs, and access the Cloud Storage bucket used for staging files. Grant the following IAM roles to the Kestra service account:

- **Cloud Run Developer**
- **Logs Viewer**
- **Storage Admin** if you use `inputFiles`, `outputFiles`, or `namespaceFiles` with a `bucket`

If Cloud Run jobs execute as the Compute Engine default service account, also grant the Kestra service account the **Service Account User** role on that service account.
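Tying the pieces above together, a minimal end-to-end flow using this runner could look like the sketch below. It uses only properties shown on this page; the secret names, region, and bucket are placeholders for your own values.

```yaml
id: cloudrun_runner_example
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.CloudRun
      projectId: "{{ secret('GCP_PROJECT_ID') }}"  # placeholder secret
      region: europe-west2                         # placeholder region
      bucket: "{{ secret('GCP_BUCKET') }}"         # needed only when using file handling
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - echo "Hello from Cloud Run, working dir is {{ workingDir }}"
```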
## Termination behavior :::alert{type="warning"} If the Kestra Worker running this task is terminated, the Cloud Run job continues to run until completion. This prevents interruptions due to Worker crashes. If you manually stop the execution from the Kestra UI, the Cloud Run job is terminated to avoid unnecessary costs. _(This behavior is under development; track progress [on GitHub](https://github.com/kestra-io/plugin-gcp/issues/381))._ ::: By default, jobs are deleted after the task completes. When a task is resubmitted, the runner reattaches to an existing job for the same task run rather than creating a new one. Use `delete: false` to keep the job for inspection after completion, or `resume: false` to force a new job on every execution attempt: ```yaml taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west2 serviceAccount: "{{ secret('GOOGLE_SA') }}" delete: false resume: false ``` ## Container resources Use the `resources` property to set CPU and memory limits on the Cloud Run container. Both `cpu` and `memory` accept static values or Pebble expressions: ```yaml taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west2 serviceAccount: "{{ secret('GOOGLE_SA') }}" resources: cpu: "2" memory: "4Gi" ``` CPU accepts whole vCPUs (`1`, `2`, `4`, `8`) or millicores (`1000m`). Memory accepts standard SI suffixes (`512Mi`, `1Gi`, `2Gi`). 
Both fields support Pebble expressions, so you can size containers dynamically from flow inputs: ```yaml inputs: - id: cpu_count type: INT defaults: 2 - id: memory_gb type: INT defaults: 4 tasks: - id: run type: io.kestra.plugin.scripts.shell.Commands containerImage: ubuntu taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west2 serviceAccount: "{{ secret('GOOGLE_SA') }}" resources: cpu: "{{ inputs.cpu_count }}" memory: "{{ inputs.memory_gb }}Gi" commands: - echo "Running with {{ inputs.cpu_count }} vCPUs and {{ inputs.memory_gb }}Gi memory" ``` ## Timeout and polling Three properties control how long the runner waits and how often it checks job status: | Property | Default | Description | |---|---|---| | `waitUntilCompletion` | `PT1H` | Maximum wall-clock time before the job is timed out. The task's own `timeout` takes precedence when set. | | `completionCheckInterval` | `PT5S` | How often to poll the Cloud Run API for job status. Lower values reduce latency for short jobs; higher values reduce API calls for long ones. | | `waitForLogInterval` | `PT5S` | Extra time to stream late log entries after job completion. | ```yaml taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west2 serviceAccount: "{{ secret('GOOGLE_SA') }}" waitUntilCompletion: PT4H completionCheckInterval: PT30S waitForLogInterval: PT10S ``` ## Job retries `maxRetries` controls the number of Cloud Run task-level retries (default `3`). These are retries within the Cloud Run execution itself, not Kestra-level task retries. 
Set to `0` to disable: ```yaml taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west2 serviceAccount: "{{ secret('GOOGLE_SA') }}" maxRetries: 0 ``` ## VPC networking ### VPC Access Connector Route job traffic through a Serverless VPC Access Connector to reach private resources such as Cloud SQL or internal services. Both `vpcAccessConnector` and `vpcEgress` must be set together: ```yaml taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west1 serviceAccount: "{{ secret('GOOGLE_SA') }}" vpcAccessConnector: projects/my-project/locations/europe-west1/connectors/my-connector vpcEgress: PRIVATE_RANGES_ONLY ``` `vpcEgress` accepts `PRIVATE_RANGES_ONLY` (only private RFC 1918 ranges use the connector) or `ALL_TRAFFIC` (all outbound traffic uses the connector). ### Direct VPC Egress Direct VPC Egress connects the job to a VPC network without a connector. Set `network` and/or `subnetwork` using full resource paths. `vpcAccessConnector` and Direct VPC Egress are mutually exclusive: ```yaml taskRunner: type: io.kestra.plugin.ee.gcp.runner.CloudRun projectId: "{{ secret('GCP_PROJECT_ID') }}" region: europe-west1 serviceAccount: "{{ secret('GOOGLE_SA') }}" network: projects/my-project/global/networks/my-vpc subnetwork: projects/my-project/regions/europe-west1/subnetworks/my-subnet vpcEgress: ALL_TRAFFIC ``` ## Mounting GCS buckets as volumes Use `volumes` to mount one or more GCS buckets directly into the container at specified paths. 
This is independent of the `bucket` property used for file transfer, and is useful for mounting reference datasets or shared outputs:

```yaml
taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.CloudRun
  projectId: "{{ secret('GCP_PROJECT_ID') }}"
  region: europe-west1
  serviceAccount: "{{ secret('GOOGLE_SA') }}"
  volumes:
    - bucket: my-reference-data-bucket
      mountPath: /data
      readOnly: true
    - bucket: my-output-bucket
      mountPath: /output
```

Each entry requires `bucket` (the GCS bucket name) and `mountPath` (the absolute path inside the container). Set `readOnly: true` for read-only access; the default is read-write.

## Service account configuration

The Cloud Run task runner uses three service account properties with distinct roles:

| Property | Purpose |
|---|---|
| `serviceAccount` | JSON key used for Kestra API calls to create and manage Cloud Run jobs. Also used as the job execution identity when `runtimeServiceAccount` is not set. |
| `runtimeServiceAccount` | Email of the service account that the Cloud Run container runs as (`--service-account` equivalent). Controls what GCP resources the container can access at runtime. When set, takes precedence over `serviceAccount` for job execution identity. |
| `impersonatedServiceAccount` | Email of a service account to impersonate for API calls (`--impersonate-service-account` equivalent). Applies to job creation and management, not the container runtime. |

Use `runtimeServiceAccount` when the container needs a different identity from the service account Kestra uses to manage jobs:

```yaml
taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.CloudRun
  projectId: "{{ secret('GCP_PROJECT_ID') }}"
  region: europe-west2
  serviceAccount: "{{ secret('GOOGLE_SA') }}"
  runtimeServiceAccount: container-runner@my-project.iam.gserviceaccount.com
```

## Example flows

### Basic example

The following example runs a shell command inside a Cloud Run container:

```yaml
id: new-shell
namespace: company.team

variables:
  projectId: myProjectId
  region: europe-west2

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.CloudRun
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - echo "Hello World"
```

### Example with file inputs and outputs

The following flow uploads an input file to GCS, runs a shell command, and retrieves the output:

```yaml
id: new-shell-with-file
namespace: company.team

variables:
  projectId: myProjectId
  region: europe-west2
  bucket: myBucketName

inputs:
  - id: file
    type: FILE

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{ inputs.file }}"
    outputFiles:
      - out.txt
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.CloudRun
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      bucket: "{{ vars.bucket }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - cp {{ workingDir }}/data.txt {{ workingDir }}/out.txt
```

:::alert{type="info"}
For a complete list of properties available in the Cloud Run task runner, see the [GCP plugin documentation](/plugins/plugin-ee-gcp/google-cloud-task-runner/io.kestra.plugin.ee.gcp.runner.cloudrun) or explore the configuration in the built-in Code Editor in the Kestra UI.
:::

## How to run tasks on Google Cloud Run
### Before you begin Ensure you have the following: 1. A Google Cloud account. 2. A Kestra instance with Google credentials stored as [secrets](../../../06.concepts/04.secret/index.md) or as environment variables. ### Google Cloud Console setup #### Create a project If you don't already have one, create a new project in the Google Cloud Console. ![project](./project.png) Ensure the new project is selected in the top navigation bar. ![project_selection](./project-selection.png) #### Enable the Cloud Run Admin API Open **APIs & Services → Enable APIs and Services**, then search for and enable **Cloud Run Admin API**. ![cloudrunapi](./cloudrunapi.png) #### Create a service account After enabling the API, create a service account to allow Kestra to access Cloud Run resources. In the search bar, find **Service Accounts** and select **Create Service Account**. ![sa-1](./sa-1.png) Assign the following roles to the service account: - **Cloud Run Developer** - **Logs Viewer** - **Storage Admin** (required for file upload and download) ![roles](./roles.png) Refer to the [Google credentials guide](../../../15.how-to-guides/google-credentials/index.md) for details on adding the service account to Kestra as a secret. To grant access to the Compute Engine default service account, go to **IAM & Admin → Service Accounts → Permissions → Grant Access**, and assign the **Service Account User** role to your new service account. ![compute](./compute.png) #### Create a storage bucket Search for "Bucket" in the Cloud Console and create a new GCS bucket. Use default settings unless otherwise required. 
![bucket](./bucket.png)

### Create a flow

The following flow runs a shell script using the Google Cloud Run task runner and copies a file under a new name:

```yaml
id: new-shell-with-file
namespace: company.team

variables:
  region: europe-west2

inputs:
  - id: file
    type: FILE

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{ inputs.file }}"
    outputFiles:
      - out.txt
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.CloudRun
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      region: "{{ vars.region }}"
      bucket: "{{ secret('GCP_BUCKET') }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - cp {{ workingDir }}/data.txt {{ workingDir }}/out.txt
```

When you execute the flow, the Kestra logs confirm that the Cloud Run job was created and started:

![logs](./logs.png)

You can also verify job creation in the Google Cloud Console:

![jobs](./jobs.png)

After the task completes, the Cloud Run job is automatically deleted to free up resources.

---

# Kubernetes Task Runner – Run Tasks as K8s Pods

URL: https://kestra.io/docs/task-runners/types/kubernetes-task-runner

> Run Kestra tasks as Kubernetes pods with the K8s Task Runner. Configure pod templates, namespaces, and resource limits for scalable container-based execution.

Run tasks as Kubernetes pods.

## Overview

This plugin is available only in the [Enterprise Edition](../../../07.enterprise/01.overview/01.enterprise-edition/index.md) (EE) and Kestra Cloud.

The task runner is container-based, so the `containerImage` property must be set.

To access the task's working directory, use either the `{{ workingDir }}` Pebble expression or the `WORKING_DIR` environment variable. Input files and namespace files are available in this directory.
To generate output files, you can either:

- Use the `outputFiles` property of the task and create a file with the same name in the task's working directory, or
- Create any file in the output directory, accessible via the `{{ outputDir }}` Pebble expression or the `OUTPUT_DIR` environment variable.

When the Kestra Worker running this task is terminated, the pod continues until completion. After restarting, the Worker resumes processing on the existing pod unless `resume` is set to `false`.

If your cluster is configured with [RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/), the service account running your pod must have the following authorizations:

- `pods`: get, create, delete, watch, list
- `pods/log`: get, watch
- `pods/exec`: get, watch

The following role grants these authorizations:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: task-runner
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "create", "delete", "watch", "list"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["get", "watch"]
```

Use the `serviceAccountName` property to assign a custom service account to the pod. When omitted, the namespace default service account is used, which must carry the required RBAC permissions above.

## How to use the Kubernetes task runner

The following example connects to a cluster using certificate-based authentication, uploads an input file, runs a shell command, and retrieves the output:

```yaml
id: kubernetes_task_runner
namespace: company.team

description: |
  To get the kubeconfig file, run: `kubectl config view --minify --flatten`.
  Then, copy the values to the configuration below.
Here is how Kubernetes task runner properties (on the left) map to the kubeconfig file's properties (on the right): - clientKeyData: client-key-data - clientCertData: client-certificate-data - caCertData: certificate-authority-data - masterUrl: server - oauthToken: token (if using OAuth, e.g., GKE/EKS) inputs: - id: file type: FILE tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands inputFiles: data.txt: "{{ inputs.file }}" outputFiles: - "*.txt" containerImage: centos taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: clientKeyData: "{{ secret('K8S_CLIENT_KEY_DATA') }}" clientCertData: "{{ secret('K8S_CLIENT_CERT_DATA') }}" caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" masterUrl: https://docker-for-desktop:6443 commands: - echo "Hello from a Kubernetes task runner!" - cp data.txt out.txt ``` :::alert{type="info"} To deploy Kubernetes with Docker Desktop, see the [Docker Desktop Kubernetes guide](https://docs.docker.com/desktop/kubernetes/#install-and-turn-on-kubernetes). To install `kubectl`, see the [kubectl installation guide](https://kubernetes.io/docs/tasks/tools/#kubectl). :::
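If you manage cluster RBAC yourself, the role from the overview can be bound to a dedicated service account that the runner then assigns to task pods via `serviceAccountName`. A minimal sketch (the `task-runner-sa` name, `default` namespace, and binding name are placeholders):

```yaml
# A dedicated service account for task pods, bound to the "task-runner"
# Role shown in the overview. All names here are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: task-runner-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: task-runner-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: task-runner-sa
    namespace: default
roleRef:
  kind: Role
  name: task-runner   # the Role defined in the overview
  apiGroup: rbac.authorization.k8s.io
```

With the binding in place, set `serviceAccountName: task-runner-sa` on the Kubernetes task runner so pods carry the required permissions instead of relying on the namespace default service account.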
## File handling When a task has `inputFiles` or `namespaceFiles` configured, Kestra adds an **init container** to the pod as a synchronization gate. The Worker transfers files directly to the pod using `kubectl cp`, then signals the init container, which exits and allows the main container to start. When a task has `outputFiles` configured, a **sidecar container** is added to the pod that waits for the main container to finish before downloading output files back to the Worker. All containers in the pod share an in-memory `emptyDir` volume for file exchange. ### Syncing the full working directory By default, only files listed in `outputFiles` are downloaded after task completion. Set `syncWorkingDirectory: true` to download the entire working directory, which is useful when tasks produce files dynamically without knowing their names in advance: ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes syncWorkingDirectory: true config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" ``` ## Failure scenarios If a task is resubmitted (for example, due to a retry or a Worker crash), the new Worker reattaches to the existing (or completed) pod instead of starting a new one. Set `resume: false` to force a new pod to be created on every execution attempt rather than reattaching to an existing pod. By default, pods are deleted after the task completes. 
Set `delete: false` to keep the pod alive after completion, which is useful when debugging failures — you can then inspect the pod with `kubectl exec` or `kubectl logs`: ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes delete: false config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" ``` ### Exec timeout and residual `InterruptedIOException` errors The sequence diagram below illustrates a failure mode that occurs when the `waitUntilRunning` timeout expires while the OkHttp dispatcher is still retrying the `/exec` WebSocket upgrade in the background. ```mermaid sequenceDiagram participant W as Kestra Worker (Main Thread) participant OK as OkHttp Dispatcher (Background Threads) participant API as EKS API Server (Control Plane) participant P as Target Task Pod (Worker Node) Note over W: Task Start: io.kestra.plugin.ee.kubernetes.runner.Kubernetes W->>API: 1. POST /api/v1/namespaces/default/pods (Create Pod) API-->>P: Schedule & Initialize Container Note over W: Wait for 'waitUntilRunning' (Default PT10M) W->>OK: 2. Initiate /exec Handshake (File/Marker Upload) loop Background Retry Loop OK->>API: 3. GET /api/v1/.../exec (WebSocket Upgrade) API-->>OK: 500 Internal Server Error (Kubelet/Node not ready) Note over OK: Wait for retry interval end Note over W: 4. Main Thread Timeout Reached W->>W: Mark TaskRun as FAILED par Cleanup Phase W->>API: 5. DELETE /api/v1/namespaces/default/pods/{name} API-->>P: Terminate Pod W->>W: 6. SHUTDOWN OkHttp Thread Pool (Executor) and Residual Logging Note over OK: 7. Background Thread wakes for Attempt 3 OK->>OK: Thread INTERRUPTED (Pool is Terminated) Note right of OK: Log: java.io.InterruptedIOException: executor rejected Note right of OK: Log: ERROR Stop retry, attempts 3 elapsed after 24 seconds end ``` The task is already marked `FAILED` at step 4. 
The `java.io.InterruptedIOException: executor rejected` and `ERROR Stop retry` log lines emitted at step 7 are residual — they confirm the cleanup path ran correctly and can be safely ignored.

If the `waitUntilRunning` timeout fires before the pod is ready (for example, due to slow image pulls or kubelet initialization on a cold node), increase the value to give the cluster more time:

```yaml
taskRunner:
  type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
  waitUntilRunning: PT20M
  config:
    masterUrl: https://docker-for-desktop:6443
    caCertData: "{{ secret('K8S_CA_CERT_DATA') }}"
```

## Specifying resource requests

Use the `resources` property to set CPU and memory requests and limits on the main task container. Both `cpu` and `memory` accept static values or Pebble expressions, so you can drive them from flow inputs at runtime.

The following example sizes the pod dynamically based on inputs:

```yaml
id: kubernetes_resources
namespace: company.team

inputs:
  - id: cpu_count
    type: INT
    defaults: 2

  - id: memory_per_cpu
    type: INT
    defaults: 4

tasks:
  - id: python_script
    type: io.kestra.plugin.scripts.python.Script
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: default
      pullPolicy: ALWAYS
      config:
        masterUrl: https://docker-for-desktop:6443
        caCertData: "{{ secret('K8S_CA_CERT_DATA') }}"
        clientCertData: "{{ secret('K8S_CLIENT_CERT_DATA') }}"
        clientKeyData: "{{ secret('K8S_CLIENT_KEY_DATA') }}"
      resources:
        request:
          cpu: "{{ inputs.cpu_count }}"
          memory: "{{ inputs.cpu_count * inputs.memory_per_cpu }}Gi"
        limit:
          cpu: "{{ inputs.cpu_count }}"
          memory: "{{ inputs.cpu_count * inputs.memory_per_cpu }}Gi"
    outputFiles:
      - "*.json"
    script: |
      import platform
      import socket
      import sys
      import json
      from kestra import Kestra

      print("Hello from a Kubernetes runner!")

      host = platform.node()
      py_version = platform.python_version()
      platform_info = platform.platform()
      os_arch = f"{sys.platform}/{platform.machine()}"

      def print_environment_info():
          print(f"Host name: {host}")
          print(f"Python version: {py_version}")
          print(f"Platform: {platform_info}")
          print(f"OS/Arch: {os_arch}")

          env_info = {
              "host": host,
              "platform": platform_info,
              "os_arch": os_arch,
              "python_version": py_version,
          }
          Kestra.outputs(env_info)

          with open("environment_info.json", "w") as json_file:
              json.dump(env_info, json_file, indent=4)

      if __name__ == "__main__":
          print_environment_info()
```

:::alert{type="info"}
For a full list of Kubernetes task runner properties, see the [Kubernetes plugin documentation](/plugins/plugin-ee-kubernetes/io.kestra.plugin.ee.kubernetes.runner.kubernetes) or explore them in the built-in Code Editor in the Kestra UI.
:::

## Timeout configuration

Three properties control how long the runner waits at different stages of pod execution:

| Property | Default | Description |
|---|---|---|
| `waitUntilRunning` | `PT10M` | Maximum time to wait for the pod to be scheduled, the image to be pulled, and containers to start. |
| `waitUntilCompletion` | `PT1H` | Wall-clock timeout for task execution when the task itself has no `timeout` set. |
| `waitForLogs` | `PT30S` | Extra time after containers exit to allow the log stream to flush completely. |

Increase `waitUntilRunning` for clusters that pull large images or have slow scheduling. Increase `waitUntilCompletion` for long-running tasks. Decrease `waitForLogs` when you know logs are always flushed quickly and want to reduce idle time at the end of each task.

```yaml
taskRunner:
  type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
  waitUntilRunning: PT20M
  waitUntilCompletion: PT4H
  waitForLogs: PT10S
  config:
    masterUrl: https://docker-for-desktop:6443
    caCertData: "{{ secret('K8S_CA_CERT_DATA') }}"
```

## Pod and container customization

The Kubernetes task runner exposes several properties for customizing the pod spec beyond standard options like `resources` and `namespace`.
These are advanced properties intended for cases such as security hardening, shared volumes, custom sidecars, or node scheduling constraints. ### `podSpec` — overlay the full pod spec `podSpec` accepts a freeform YAML map that is merged into the generated pod's spec. Use it for anything not covered by a first-class property: tolerations, affinity, priority classes, additional volumes, or user-defined sidecar containers. Any container listed under `podSpec.containers` whose name is **not** `"main"` is added as a user-defined sidecar alongside the Kestra main container. A container named `"main"` has its fields (such as `ports` and `env`) merged as defaults into the Kestra-built main container, with Kestra-injected values taking precedence on collision. ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" podSpec: tolerations: - key: "gpu" operator: "Exists" effect: "NoSchedule" affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: cloud.google.com/gke-accelerator operator: Exists volumes: - name: shared-data emptyDir: {} containers: - name: sidecar image: busybox command: ["sh", "-c", "while true; do sleep 5; done"] volumeMounts: - name: shared-data mountPath: /data ``` Template expressions, including `{{ workingDir }}` (which resolves to `/kestra/working-dir` when file I/O is enabled), are supported inside `podSpec`. ### `containerSpec` — augment the main container `containerSpec` is merged into the Kestra-generated main container. Use it for additional environment variables, a custom security context, or other per-container settings. Kestra-injected values (such as `WORKING_DIR` and `OUTPUT_DIR`) always take precedence over values defined here. 
```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" containerSpec: securityContext: runAsNonRoot: true runAsUser: 1000 allowPrivilegeEscalation: false env: - name: MY_CUSTOM_VAR value: "hello" ``` ### `containerDefaultSpec` — apply settings to all containers `containerDefaultSpec` is merged into every container in the pod: the main task container, any user-defined sidecars in `podSpec.containers`, and the Kestra file-transfer init and sidecar containers. This is the right place for settings that must be uniform across all containers, such as: - `volumeMounts` — mount a shared volume into every container without repeating it per container - `securityContext` — enforce a consistent security posture - `resources` — set default resource requests that individual containers can override - `env` — inject common environment variables Container-specific values always win over the defaults. For list fields (`volumeMounts`, `env`), defaults are prepended to any container-specific entries. ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" containerDefaultSpec: securityContext: allowPrivilegeEscalation: false volumeMounts: - name: docker-socket mountPath: /var/run/docker.sock podSpec: volumes: - name: docker-socket hostPath: path: /var/run/docker.sock ``` ### `fileSideCarSpec` — customize file transfer containers `fileSideCarSpec` is merged only into the init and sidecar containers that Kestra uses for file transfer. 
Use it when the file transfer containers need different settings from the main container, such as a stricter security context or an additional volume mount: ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" fileSideCarSpec: securityContext: runAsNonRoot: true runAsUser: 65534 ``` ### `fileSidecar` — file transfer container image and resources The `fileSidecar` property controls the container image, resource requests, and default spec used by the init and sidecar containers that handle file transfer. By default these containers use `busybox`. Use this when your cluster's security policy restricts which images can be used, or when you need to limit the resources consumed by file transfer: ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" fileSidecar: image: gcr.io/my-project/busybox:latest resources: requests: cpu: "100m" memory: "32Mi" limits: cpu: "200m" memory: "64Mi" ``` `fileSidecar.defaultSpec` applies additional container spec fields to the file transfer containers only, and takes precedence over `containerDefaultSpec` for those containers: ```yaml fileSidecar: image: busybox defaultSpec: securityContext: allowPrivilegeEscalation: false readOnlyRootFilesystem: true volumeMounts: - name: tmp mountPath: /tmp ``` ## OAuth token refresh for long-running tasks When authenticating via `oauthToken`, the token is used as-is for the lifetime of the task runner. This works for short tasks but fails with an `Unauthorized` error for tasks that run longer than the token's validity period (typically one hour on GKE and EKS). Use `oauthTokenProvider` to automatically refresh the token. 
The provider executes a Kestra task each time the Kubernetes client needs to re-authenticate, and caches the result for a configurable duration to avoid unnecessary token fetches: ```yaml taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes config: masterUrl: "https://{{ outputs.metadata.endpoint }}" caCertData: "{{ outputs.metadata.masterAuth.clusterCertificate }}" oauthTokenProvider: cache: PT5M task: type: io.kestra.plugin.gcp.auth.OauthAccessToken output: "{{ accessToken.tokenValue }}" ``` | Property | Default | Description | |---|---|---| | `task` | — | Any Kestra `RunnableTask` whose output contains the token. | | `output` | — | A Pebble expression evaluated against the task's output map to extract the token string. | | `cache` | `PT5M` | How long the fetched token is reused before the provider runs the task again. Set to `PT0S` to disable caching. | ## Using plugin defaults to avoid repetition You can use `pluginDefaults` to avoid repeating configuration across multiple tasks. 
For example, you can set the `pullPolicy` to `ALWAYS` for all tasks in a namespace: ```yaml id: k8s_taskrunner namespace: company.team tasks: - id: parallel type: io.kestra.plugin.core.flow.Parallel tasks: - id: run_command type: io.kestra.plugin.scripts.python.Commands containerImage: ghcr.io/kestra-io/kestrapy:latest commands: - pip show kestra - id: run_python type: io.kestra.plugin.scripts.python.Script containerImage: ghcr.io/kestra-io/pydata:latest script: | import socket ip_address = socket.gethostbyname(socket.gethostname()) print("Hello from Kubernetes and Kestra!") print(f"Host IP Address: {ip_address}") pluginDefaults: - type: io.kestra.plugin.scripts.python forced: true values: taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes namespace: default pullPolicy: ALWAYS config: masterUrl: https://docker-for-desktop:6443 caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" clientCertData: "{{ secret('K8S_CLIENT_CERT_DATA') }}" clientKeyData: "{{ secret('K8S_CLIENT_KEY_DATA') }}" ``` ## Guides The following guides can help you set up the Kubernetes task runner on different platforms. ### Google Kubernetes Engine (GKE)
#### Before you begin 1. A Google Cloud account. 2. A Kestra instance with Google credentials stored as [secrets](../../../06.concepts/04.secret/index.md) or environment variables. #### Set up Google Cloud In Google Cloud, perform the following steps: 1. Create and select a project. 2. Create a GKE cluster. 3. Enable the Kubernetes Engine API. 4. Set up the `gcloud` CLI with `kubectl`. 5. Create a service account. :::alert{type="info"} To authenticate with Google Cloud, create a service account and add a JSON key to Kestra. Read more in our [Google credentials guide](../../../15.how-to-guides/google-credentials/index.md). For GKE, ensure the `Kubernetes Engine default node service account` role is assigned to your service account. ::: #### Creating a flow The following flow authenticates with GKE using a service account OAuth token: ```yaml id: gke_task_runner namespace: company.team tasks: - id: metadata type: io.kestra.plugin.gcp.gke.ClusterMetadata clusterId: kestra-dev-gke clusterZone: "europe-west1" clusterProjectId: kestra-dev - id: auth type: io.kestra.plugin.gcp.auth.OauthAccessToken - id: pod type: io.kestra.plugin.scripts.shell.Commands containerImage: ubuntu commands: - echo "Hello from a Kubernetes task runner!" taskRunner: type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes namespace: default config: caCertData: "{{ outputs.metadata.masterAuth.clusterCertificate }}" masterUrl: "https://{{ outputs.metadata.endpoint }}" oauthToken: "{{ outputs.auth.accessToken['tokenValue'] }}" ``` :::alert{type="info"} For tasks that run longer than one hour, replace the static `oauthToken` with an `oauthTokenProvider` so that the token is refreshed automatically. See [OAuth token refresh for long-running tasks](#oauth-token-refresh-for-long-running-tasks). 
:::

Use the `gcloud` CLI to get credentials such as `masterUrl` and `caCertData`:

```bash
gcloud container clusters get-credentials clustername --region myregion --project projectid
```

Replace the placeholders with your own values:

- `clustername`: the name of your cluster (the `clusterId` task property).
- `myregion`: the region of your cluster, for example `europe-west2` (the `clusterZone` task property).
- `projectid`: the ID of your Google Cloud project (the `clusterProjectId` task property).

After running the command, view your config with `kubectl config view --minify --flatten` to retrieve values such as `caCertData`, `masterUrl`, and `username`.

### Amazon Elastic Kubernetes Service (EKS)

The following flow authenticates with EKS using an OAuth token:

```yaml
id: eks_task_runner
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      config:
        caCertData: "{{ secret('K8S_CA_CERT_DATA') }}"
        masterUrl: https://xxx.xxx.region.eks.amazonaws.com
        username: arn:aws:eks:region:xxx:cluster/cluster_name
        oauthToken: "{{ secret('K8S_OAUTH_TOKEN') }}"
    commands:
      - echo "Hello from a Kubernetes task runner!"
```

---

# Process Task Runner – Run Tasks as Local Processes

URL: https://kestra.io/docs/task-runners/types/process-task-runner

> Run Kestra tasks as local processes on the worker node for fast execution without container overhead.

Run tasks as local processes.

## Run tasks locally with the Process runner
The following example shows a shell script configured with the Process task runner, which runs a command as a child process on the Kestra host:

```yaml
id: process_task_runner
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - echo "Hello World!"
```

The [Process task runner](/plugins/core/runner/io.kestra.plugin.core.runner.process) has no additional configuration properties — only the `type` is required.

:::alert{type="info"}
Script tasks default to the Docker task runner if no `taskRunner` is specified. Set `taskRunner.type` to `io.kestra.plugin.core.runner.Process` explicitly when you want local process execution.
:::

## Working directory and file handling

Before each execution, Kestra prepares a working directory on the worker host. The following variables give your script access to it:

| Variable | Type | Description |
|---|---|---|
| `{{ workingDir }}` | Pebble expression | Absolute path to the working directory |
| `WORKING_DIR` | Environment variable | Same path, available to any subprocess |
| `{{ outputDir }}` | Pebble expression | Dedicated output directory (only when `outputDirectoryEnabled` is set on the task) |
| `OUTPUT_DIR` | Environment variable | Same path as `{{ outputDir }}` when available |

**Input files** declared with `inputFiles` and **namespace files** are copied into the working directory before the process starts. **Output files** declared with `outputFiles` must be written to the working directory to be captured by Kestra.

Because the Process runner sets the process working directory to Kestra's working directory, **relative paths work without needing the `{{ workingDir }}` prefix** in commands.
The following self-contained example writes a file and captures it as an output: ```yaml id: process_output_file namespace: company.team tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands outputFiles: - out.txt taskRunner: type: io.kestra.plugin.core.runner.Process commands: - echo "Hello from Process runner" > out.txt ``` You can also pass an input file into the task and capture a transformed result. The input file is placed in the working directory under the key name you specify in `inputFiles`: ```yaml id: process_input_output_files namespace: company.team inputs: - id: file type: FILE tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands inputFiles: data.txt: "{{inputs.file}}" outputFiles: - out.txt taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cp data.txt out.txt ``` ## Installing dependencies with beforeCommands Use `beforeCommands` to install dependencies before the main script runs. When running inside the Kestra Docker image (which manages its own Python environment), pass `--break-system-packages` to avoid conflicts: ```yaml id: python_with_deps namespace: company.team inputs: - id: url type: URI defaults: https://jsonplaceholder.typicode.com/todos/1 tasks: - id: transform type: io.kestra.plugin.scripts.python.Script taskRunner: type: io.kestra.plugin.core.runner.Process beforeCommands: - pip install kestra requests --break-system-packages script: | import requests from kestra import Kestra url = "{{ inputs.url }}" response = requests.get(url) print('Status Code:', response.status_code) Kestra.outputs(response.json()) ``` :::alert{type="info"} By default, Python is the only programming language installed in the Kestra Docker image. If you are running Kestra directly on a host machine, whatever is installed on that machine is available to the process. To use other languages, ensure their runtimes are installed in your environment before running Kestra. 
::: ## Logs and exit codes Both stdout and stderr are captured and streamed to Kestra's task logs in real time. Any non-zero exit code causes the task to fail with a `TaskException` reporting the exit code. ## Behavior on worker interruption When the Kestra worker is stopped or interrupted: 1. The running process **and all its child processes** are killed immediately. 2. The task is marked as failed. 3. When the worker restarts, Kestra re-queues the task execution from the beginning. There is no process-level resume — the task restarts from scratch on the next attempt. ## Considerations - **No isolation**: Unlike the Docker task runner, the Process runner provides no container boundary. Subprocesses run with the same OS user and permissions as the Kestra worker JVM, inherit the worker's full environment, and share host CPU, memory, and disk with other running tasks. - **Platform support**: Works on Linux, macOS, and Windows. - **Dependencies**: Any tool or library used by the script must already be installed on the worker host (or installed via `beforeCommands`). ## Benefits The Process task runner is well suited when you need to: - Access local files or hardware (e.g., GPUs, local databases) - Reuse locally configured software libraries or virtual environments - Avoid the overhead of container startup ## Combining task runners with Worker Groups You can combine the Process task runner with [Worker Groups](../../../07.enterprise/04.scalability/worker-group/index.md) to run tasks on dedicated servers that have specific software libraries or configurations. This combination allows you to leverage the compute resources of your Worker Groups while running tasks as local processes, without the overhead of containerization. 
The following example demonstrates how to combine the Process task runner with Worker Groups to fully leverage the GPU resources of a dedicated server: ```yaml id: python_on_gpu namespace: company.team tasks: - id: gpu_intensive_ai_workload type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true commands: - python main.py workerGroup: key: gpu taskRunner: type: io.kestra.plugin.core.runner.Process ``` :::alert{type="info"} Worker Groups are an Enterprise Edition feature. To try them out, please [reach out](/demo). ::: --- # Kestra Terraform Provider: Manage Resources as IaC URL: https://kestra.io/docs/terraform > Manage Kestra resources using the official Terraform provider for Infrastructure as Code (IaC) and automation. import ChildTableOfContents from "~/components/content/ChildTableOfContents.astro" Manage resources and their underlying infrastructure with the official [Terraform provider](https://registry.terraform.io/providers/kestra-io/kestra/latest) to facilitate CI/CD and infrastructure management for your Kestra resources. - [Resources](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs/resources/binding) - [Data Sources](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs/data-sources/binding) --- # Terraform Provider in Kestra: Data Sources Index URL: https://kestra.io/docs/terraform/data-sources > The kestra_index data source allows you to read index in Kestra using Terraform. import ChildCard from "~/components/docs/ChildCard.astro" --- # Terraform Provider in Kestra: Read Bindings URL: https://kestra.io/docs/terraform/data-sources/binding > The kestra_binding data source allows you to read bindings in Kestra using Terraform. 
## Terraform Data Source: kestra_binding

Use this data source to access information about an existing Kestra Binding.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
data "kestra_binding" "example" {
  binding_id = "65DsawPfiJPkTkZJIPX6jQ"
}
```

## Schema

### Required

- `binding_id` (String) The binding id.

### Read-Only

- `external_id` (String) The binding external id.
- `id` (String) The ID of this resource.
- `namespace` (String) The linked namespace.
- `role_id` (String) The role id.
- `tenant_id` (String) The tenant id.
- `type` (String) The binding type.

---

# Terraform Provider in Kestra: Read Flows
URL: https://kestra.io/docs/terraform/data-sources/flow

> The kestra_flow data source allows you to read flows in Kestra using Terraform.

## Terraform Data Source: kestra_flow

Use this data source to access information about an existing Kestra Flow.

## Example usage

```hcl
data "kestra_flow" "example" {
  namespace = "company.team"
  flow_id   = "my-flow"
}
```

## Schema

### Required

- `flow_id` (String) The flow id.
- `namespace` (String) The namespace.

### Read-Only

- `content` (String) The flow content as yaml.
- `id` (String) The ID of this resource.
- `revision` (Number) The flow revision.
- `tenant_id` (String) The tenant id.

---

# Terraform Provider in Kestra: Read Groups
URL: https://kestra.io/docs/terraform/data-sources/group

> The kestra_group data source allows you to read groups in Kestra using Terraform.

## Terraform Data Source: kestra_group

Use this data source to access information about an existing Kestra Group.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
data "kestra_group" "example" {
  group_id = "68xAawPfiJPkTkZJIPX6jQ"
}
```

## Schema

### Required

- `group_id` (String) The group.

### Optional

- `namespace` (String) The linked namespace.
### Read-Only

- `description` (String) The group description.
- `id` (String) The ID of this resource.
- `name` (String) The group name.
- `tenant_id` (String) The tenant id.

---

# Terraform Provider in Kestra: Read KV Entries
URL: https://kestra.io/docs/terraform/data-sources/kv

> The kestra_kv data source allows you to read KV entries in Kestra using Terraform.

## Terraform Data Source: kestra_kv

Use this data source to access the value of an existing Key-Value pair.

## Schema

### Required

- `key` (String) The key to fetch the value for.
- `namespace` (String) The namespace of the Key-Value pair.

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.
- `type` (String) The type of the value. One of STRING, NUMBER, BOOLEAN, DATETIME, DATE, DURATION, JSON.
- `value` (String) The fetched value.

---

# Terraform Provider in Kestra: Read Namespaces
URL: https://kestra.io/docs/terraform/data-sources/namespace

> The kestra_namespace data source allows you to read namespaces in Kestra using Terraform.

## Terraform Data Source: kestra_namespace

Use this data source to access information about an existing Kestra Namespace.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
data "kestra_namespace" "example" {
  namespace_id = "company.team"
}
```

## Schema

### Required

- `namespace_id` (String) The namespace.

### Read-Only

- `allowed_namespaces` (List of Object) The allowed namespaces. (see [below for nested schema](#nestedatt--allowed_namespaces))
- `description` (String) The namespace friendly description.
- `id` (String) The ID of this resource.
- `outputs_in_internal_storage` (Boolean) Whether outputs are stored in internal storage.
- `plugin_defaults` (String) The namespace plugin defaults.
- `secret_configuration` (Map of String) The secret configuration.
- `secret_isolation` (List of Object) Secret isolation configuration (same shape as storage_isolation). (see [below for nested schema](#nestedatt--secret_isolation))
- `secret_read_only` (Boolean) Whether secrets are read-only in this namespace.
- `secret_type` (String) The secret type.
- `storage_configuration` (Map of String) The storage configuration.
- `storage_isolation` (List of Object) Storage isolation configuration. (see [below for nested schema](#nestedatt--storage_isolation))
- `storage_type` (String) The storage type.
- `tenant_id` (String) The tenant id.
- `variables` (String) The namespace variables.
- `worker_group` (List of Object) The worker group. (see [below for nested schema](#nestedatt--worker_group))

### Nested schema for `allowed_namespaces`

Read-Only:

- `namespace` (String)

### Nested schema for `secret_isolation`

Read-Only:

- `denied_services` (List of String)
- `enabled` (Boolean)

### Nested schema for `storage_isolation`

Read-Only:

- `denied_services` (List of String)
- `enabled` (Boolean)

### Nested schema for `worker_group`

Read-Only:

- `fallback` (String)
- `key` (String)

---

# Terraform Provider in Kestra: Read Namespace Files
URL: https://kestra.io/docs/terraform/data-sources/namespace_file

> The kestra_namespace_file data source allows you to read namespace files in Kestra using Terraform.

## Terraform Data Source: kestra_namespace_file

Use this data source to access information about an existing Namespace File.

## Example usage

```hcl
data "kestra_namespace_file" "example" {
  namespace = "company.team"
  filename  = "myscript.py"
}
```

## Schema

### Required

- `filename` (String) The filename of the namespace file.
- `namespace` (String) The namespace of the namespace file resource.

### Read-Only

- `content` (String) The file content, expected to be a UTF-8 encoded string.
- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.
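Attributes of these data sources can be referenced anywhere else in a Terraform configuration. As a minimal, illustrative sketch (the block labels and the output name are hypothetical, not part of the provider schema), the fetched namespace file content could be exposed as a Terraform output:

```hcl
# Fetch an existing namespace file from Kestra
data "kestra_namespace_file" "script" {
  namespace = "company.team"
  filename  = "myscript.py"
}

# Expose the read-only content attribute for inspection or reuse
output "script_content" {
  value = data.kestra_namespace_file.script.content
}
```

The same pattern applies to any read-only attribute listed in the schemas above.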
---

# Terraform Provider in Kestra: Read Roles
URL: https://kestra.io/docs/terraform/data-sources/role

> The kestra_role data source allows you to read roles in Kestra using Terraform.

## Terraform Data Source: kestra_role

Use this data source to access information about an existing Kestra Role.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
data "kestra_role" "example" {
  role_id = "3kcvnr27ZcdHXD2AUvIe7z"
}
```

## Schema

### Required

- `role_id` (String) The role.

### Optional

- `is_default` (Boolean) The role is the default one at user creation. Only one role can be the default; the role most recently created or updated with this set to true is kept as the default. Defaults to `false`.
- `namespace` (String) The linked namespace.

### Read-Only

- `description` (String) The role description.
- `id` (String) The ID of this resource.
- `name` (String) The role name.
- `permissions` (Set of Object) The role permissions. (see [below for nested schema](#nestedatt--permissions))
- `tenant_id` (String) The tenant id.

### Nested schema for `permissions`

Read-Only:

- `permissions` (List of String)
- `type` (String)

---

# Terraform Provider in Kestra: Read Service Accounts
URL: https://kestra.io/docs/terraform/data-sources/service_account

> The kestra_service_account data source allows you to read service accounts in Kestra using Terraform.

## Terraform Data Source: kestra_service_account

Use this data source to access information about an existing Kestra Service Account.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
data "kestra_service_account" "example" {
  id = "68xAawPfiJPkTkZJIPX6jQ"
}
```

## Schema

### Required

- `id` (String) The service account id.

### Read-Only

- `description` (String) The service account description.
- `groups` (Block Set) The service account group.
(see [below for nested schema](#nestedblock--groups))
- `name` (String) The service account name.
- `super_admin` (Boolean) Whether the service account is a super admin.

### Nested schema for `groups`

Read-Only:

- `id` (String) The group id.

---

# Terraform Provider in Kestra: Read SA API Tokens
URL: https://kestra.io/docs/terraform/data-sources/service_account_api_tokens

> The kestra_service_account_api_tokens data source allows you to read service account API tokens in Kestra using Terraform.

## Terraform Data Source: kestra_service_account_api_tokens

Use this data source to access information about the API tokens of a Kestra Service Account.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Schema

### Required

- `service_account_id` (String) The ID of the Service Account owning the API Tokens.

### Read-Only

- `api_tokens` (Set of Object) The API tokens of the Service Account. (see [below for nested schema](#nestedatt--api_tokens))
- `id` (String) The ID of this resource.

### Nested schema for `api_tokens`

Read-Only:

- `description` (String)
- `exp` (String)
- `expired` (Boolean)
- `extended` (Boolean)
- `iat` (String)
- `last_used` (String)
- `name` (String)
- `token_id` (String)
- `token_prefix` (String)

---

# Terraform Provider in Kestra: Read Templates
URL: https://kestra.io/docs/terraform/data-sources/template

> The kestra_template data source allows you to read templates in Kestra using Terraform.

## Terraform Data Source: kestra_template

Use this data source to access information about an existing Kestra Template.

## Example usage

```hcl
data "kestra_template" "example" {
  namespace   = "company.team"
  template_id = "my-template"
}
```

## Schema

### Required

- `namespace` (String) The namespace.
- `template_id` (String) The template id.

### Read-Only

- `content` (String) The template content as yaml.
- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.
--- # Terraform Provider in Kestra: Read Tenants URL: https://kestra.io/docs/terraform/data-sources/tenant > The kestra_tenant data source allows you to read tenants in Kestra using Terraform. ## Terraform Data Source: kestra_tenant Use this data source to access information about an existing Kestra Tenant. :::alert{type="info"} This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise) ::: ## Example usage ```hcl data "kestra_tenant" "example" { tenant_id = "my-tenant" } ``` ## Schema ### Required - `tenant_id` (String) The tenant id. ### Read-Only - `id` (String) The ID of this resource. - `name` (String) The tenant name. - `outputs_in_internal_storage` (Boolean) Whether outputs are stored in internal storage. - `require_existing_namespace` (Boolean) Whether the tenant requires existing namespaces. - `secret_configuration` (Map of String) The secret configuration. - `secret_isolation` (List of Object) Secret isolation configuration (same shape as storage_isolation). (see [below for nested schema](#nestedatt--secret_isolation)) - `secret_read_only` (Boolean) Whether secrets are read-only in this tenant. - `secret_type` (String) The secret type. - `storage_configuration` (Map of String) The storage configuration. - `storage_isolation` (List of Object) Storage isolation configuration. (see [below for nested schema](#nestedatt--storage_isolation)) - `storage_type` (String) The storage type. - `worker_group` (List of Object) The worker group. 
(see [below for nested schema](#nestedatt--worker_group)) ### Nested schema for `secret_isolation` Read-Only: - `denied_services` (List of String) - `enabled` (Boolean) ### Nested schema for `storage_isolation` Read-Only: - `denied_services` (List of String) - `enabled` (Boolean) ### Nested schema for `worker_group` Read-Only: - `fallback` (String) - `key` (String) --- # Terraform Provider in Kestra: Read Tests URL: https://kestra.io/docs/terraform/data-sources/test > The kestra_test data source allows you to read tests in Kestra using Terraform. ## Terraform Data Source: kestra_test Test data source ## Schema ### Required - `namespace` (String) The Test namespace - `test_id` (String) The Test id ### Read-Only - `content` (String) The actual Test YAML content --- # Terraform Provider in Kestra: Read Users URL: https://kestra.io/docs/terraform/data-sources/user > The kestra_user data source allows you to read users in Kestra using Terraform. ## Terraform Data Source: kestra_user Use this data source to access information about an existing Kestra User. :::alert{type="info"} This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise) ::: ## Example usage ```hcl data "kestra_user" "example" { user_id = "68xAawPfiJPkTkZJIPX6jQ" } ``` ## Schema ### Required - `user_id` (String) The user. ### Optional - `namespace` (String) The linked namespace. ### Read-Only - `description` (String) The user description. - `email` (String) The user email. - `first_name` (String) The user first name. - `groups` (List of String) The user global roles in yaml string. - `id` (String) The ID of this resource. - `last_name` (String) The user last name. - `tenant_id` (String) The tenant id. - `username` (String) The user name. --- # Terraform: Read User API Tokens in Kestra URL: https://kestra.io/docs/terraform/data-sources/user_api_tokens > The kestra_user_api_tokens data source allows you to read user API tokens in Kestra using Terraform. 
## Terraform Data Source: kestra_user_api_tokens Use this data source to access information about the API tokens of a Kestra User. :::alert{type="info"} This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise) ::: ## Schema ### Required - `user_id` (String) The ID of the user owning the API Tokens. ### Read-Only - `api_tokens` (Set of Object) The API tokens of the user. (see [below for nested schema](#nestedatt--api_tokens)) - `id` (String) The ID of this resource. ### Nested schema for `api_tokens` Read-Only: - `description` (String) - `exp` (String) - `expired` (Boolean) - `extended` (Boolean) - `iat` (String) - `last_used` (String) - `name` (String) - `token_id` (String) - `token_prefix` (String) --- # Terraform Provider in Kestra: Read Worker Groups URL: https://kestra.io/docs/terraform/data-sources/worker_group > The kestra_worker_group data source allows you to read worker groups in Kestra using Terraform. ## Terraform Data Source: kestra_worker_group Use this data source to access information about an existing Kestra Worker Group. :::alert{type="info"} This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise) ::: ## Schema ### Required - `id` (String) The worker group id. - `key` (String) The worker group key. ### Read-Only - `allowed_tenants` (String) The list of tenants allowed to use the worker group. - `description` (String) The worker group description. --- # Kestra Terraform Provider Guides URL: https://kestra.io/docs/terraform/guides > Explore guides on using the Kestra Terraform provider to manage your resources as code. import ChildCard from "~/components/docs/ChildCard.astro" --- # Kestra Terraform Provider: Configuration Options URL: https://kestra.io/docs/terraform/guides/configurations ## Kestra 1.0.x compatibility :::alert{type="danger"} **Warning:** Kestra Terraform provider 1.0.x is only compatible with Kestra 1.0.x and above. 
:::

Additionally, to manage Kestra 1.0.x with Terraform, you must use Kestra Terraform provider 1.0.x.

### Breaking changes in 1.0.x

Several breaking changes were introduced between 0.24.x and 1.0.x, especially around IAM.

## Example usage

```hcl
provider "kestra" {
  # mandatory, the Kestra webserver/standalone URL
  url = "http://localhost:8080"

  # optional basic auth username
  username = "john"

  # optional basic auth password
  password = "my-password"

  # optional jwt token (EE)
  jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Iktlc3RyYS5pbyIsImlhdCI6MTUxNjIzOTAyMn0.hm2VKztDJP7CUsI69Th6Y5NLEQrXx7OErLXay55GD5U"

  # optional tenant id (EE)
  tenant_id = "the-tenant-id"

  # optional extra headers
  extra_headers = {
    x-pipeline    = "*****"
    authorization = "Bearer *****"
  }
}
```

## Schema

### Optional

- `api_token` (String, Sensitive) The API token (EE)
- `extra_headers` (Map of String) Extra headers to add to every request
- `jwt` (String, Sensitive) The JWT token (EE)
- `keep_original_source` (Boolean) Keep the original source code, including comments and indentation. Setting this to false is now deprecated and will be removed in the future.
- `password` (String, Sensitive) The basic-auth password
- `tenant_id` (String) The tenant id (EE)
- `timeout` (Number) The timeout (in seconds) for http requests
- `url` (String) The endpoint url
- `username` (String) The basic-auth username
There are two ways to handle YAML for a flow:

* `keep_original_source = true` (default): the raw YAML is sent and saved in Kestra as-is.
* `keep_original_source = false`: the YAML is encoded as JSON before being sent to the server, so comments and indentation are handled by the server.

**This property must be set at the provider level.**

:::alert{type="danger"}
With `keep_original_source = false`, the Terraform provider has no awareness of tasks or plugins and cannot know their default values. Most conversion logic runs on the Kestra server. If you see a diff that **is always present** (even just after apply), your Terraform flow definition likely has a minor difference from what the server returns. In that case, **copy the source from the Kestra UI** into your Terraform files to eliminate the diff.
:::

Terraform provides many functions for working with YAML content:

## Simple multiline string example

Use a [Heredoc String](https://www.terraform.io/docs/language/expressions/strings.html#heredoc-strings) for a simple multiline string:

```hcl
resource "kestra_flow" "example" {
  namespace = "company.team"
  flow_id   = "my-flow"
  content   = <<EOT
tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello World
EOT
}
```

---

# Terraform Provider in Kestra: Resources Index
URL: https://kestra.io/docs/terraform/resources

> The kestra_index resource allows you to manage index in Kestra using Terraform.

import ChildCard from "~/components/docs/ChildCard.astro"

---

# Terraform: Manage Apps in Kestra
URL: https://kestra.io/docs/terraform/resources/app

> The kestra_app resource allows you to manage apps in Kestra using Terraform.

## Terraform Resource: kestra_app

Manages an App resource.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Schema

### Required

- `source` (String) The source text.

### Read-Only

- `id` (String) The ID of this resource.
- `uid` (String) The unique identifier.

---

# Terraform: Manage Bindings in Kestra
URL: https://kestra.io/docs/terraform/resources/binding

> The kestra_binding resource allows you to manage bindings in Kestra using Terraform.
## Terraform Resource: kestra_binding

Manages a Kestra Binding.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_binding" "example" {
  type        = "GROUP"
  external_id = "68xAawPfiJPkTkZJIPX6jQ"
  role_id     = "3kcvnr27ZcdHXD2AUvIe7z"
  namespace   = "company.team"
}
```

## Schema

### Required

- `external_id` (String) The binding external id.
- `role_id` (String) The role id.
- `type` (String) The binding type.

### Optional

- `namespace` (String) The linked namespace.

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_binding.example {{binding_id}}
```

---

# Terraform: Manage Dashboards in Kestra
URL: https://kestra.io/docs/terraform/resources/dashboard

> The kestra_dashboard resource allows you to manage dashboards in Kestra using Terraform.

## Terraform Resource: kestra_dashboard

Manages a Dashboard resource.

## Schema

### Required

- `source_code` (String) The source code text.

### Read-Only

- `id` (String) The unique identifier.

---

# Terraform: Manage Flows in Kestra
URL: https://kestra.io/docs/terraform/resources/flow

> The kestra_flow resource allows you to manage flows in Kestra using Terraform.

## Terraform Resource: kestra_flow

Manages a Kestra Flow.

## Example usage

```hcl
resource "kestra_flow" "example" {
  namespace = "company.team"
  flow_id   = "my-flow"
  content   = <<EOT
tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello World
EOT
}
```

## Schema

### Required

- `content` (String) The flow full content in yaml string.
- `flow_id` (String) The flow id.
- `namespace` (String) The flow namespace.

### Read-Only

- `id` (String) The ID of this resource.
- `revision` (Number) The flow revision.
- `tenant_id` (String) The tenant id.
## Import Import is supported using the following syntax: ```shell terraform import kestra_flow.example {{namespace}}/{{flow_id}} ``` --- # Terraform: Manage Groups in Kestra URL: https://kestra.io/docs/terraform/resources/group > The kestra_group resource allows you to manage groups in Kestra using Terraform. ## Terraform Resource: kestra_group Manages a Kestra Group. :::alert{type="info"} This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise) ::: ## Example usage ```hcl resource "kestra_group" "example" { namespace = "company.team" name = "Friendly name" description = "Friendly description" } ``` ## Schema ### Required - `name` (String) The group name. ### Optional - `description` (String) The group description. - `namespace` (String) The linked namespace. ### Read-Only - `id` (String) The ID of this resource. - `tenant_id` (String) The tenant id. ## Import Import is supported using the following syntax: ```shell terraform import kestra_group.example {{group_id}} ``` --- # Terraform: Manage KV Entries in Kestra URL: https://kestra.io/docs/terraform/resources/kv > The kestra_kv resource allows you to manage KV entries in Kestra using Terraform. ## Terraform Resource: kestra_kv Manages a Kestra Key-value pair. ## Schema ### Required - `key` (String) The key of the pair. - `namespace` (String) The namespace of the Key-Value pair. - `value` (String) The fetched value. ### Optional - `type` (String) The type of the value. If not provided, Kestra will attempt to deduce the type based on the value. Useful in case you provide numbers, booleans, dates or json that you want to be stored as string. Accepted values are: STRING, NUMBER, BOOLEAN, DATETIME, DATE, DURATION, JSON. ### Read-Only - `id` (String) The ID of this resource. - `tenant_id` (String) The tenant id. 
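The `kestra_kv` resource page above does not ship with a usage example. Based purely on the schema, a minimal sketch (names and values are illustrative) could look like this:

```hcl
resource "kestra_kv" "example" {
  namespace = "company.team"
  key       = "my_key"
  value     = "42"
  # optional; without it, Kestra attempts to deduce the type from the value
  type      = "NUMBER"
}
```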
---

# Terraform: Manage Namespaces in Kestra
URL: https://kestra.io/docs/terraform/resources/namespace

> The kestra_namespace resource allows you to manage namespaces in Kestra using Terraform.

## Terraform Resource: kestra_namespace

Manages a Kestra Namespace.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_namespace" "example" {
  namespace_id = "company.team"
  description  = "Friendly description"
  variables    = <<EOT
key: value
EOT
}
```

## Schema

### Required

- `namespace_id` (String) The namespace.

### Optional

- `allowed_namespaces` (Block List) The allowed namespaces. (see [below for nested schema](#nestedblock--allowed_namespaces))
- `description` (String) The namespace friendly description.
- `outputs_in_internal_storage` (Boolean) Whether outputs are stored in internal storage.
- `plugin_defaults` (String) The namespace plugin defaults in yaml string.
- `secret_configuration` (Map of String) The secret configuration.
- `secret_isolation` (Block List, Max: 1) Secret isolation configuration (same shape as storage_isolation). (see [below for nested schema](#nestedblock--secret_isolation))
- `secret_read_only` (Boolean) Whether secrets are read-only in this namespace.
- `secret_type` (String) The secret type.
- `storage_configuration` (Map of String) The storage configuration.
- `storage_isolation` (Block List, Max: 1) Storage isolation configuration. (see [below for nested schema](#nestedblock--storage_isolation))
- `storage_type` (String) The storage type.
- `variables` (String) The namespace variables in yaml string.
- `worker_group` (Block List, Max: 1) The worker group. (see [below for nested schema](#nestedblock--worker_group))

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.

### Nested schema for `allowed_namespaces`

Required:

- `namespace` (String) The namespace.
### Nested schema for `secret_isolation`

Optional:

- `denied_services` (List of String) List of denied services for secret isolation.
- `enabled` (Boolean) Enable secret isolation.

### Nested schema for `storage_isolation`

Optional:

- `denied_services` (List of String) List of denied services for isolation.
- `enabled` (Boolean) Enable storage isolation.

### Nested schema for `worker_group`

Required:

- `key` (String) The worker group key.

Optional:

- `fallback` (String) The fallback strategy.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_namespace.example {{namespace}}
```

---

# Terraform: Manage Namespace Files in Kestra
URL: https://kestra.io/docs/terraform/resources/namespace_file

> The kestra_namespace_file resource allows you to manage namespace files in Kestra using Terraform.

## Terraform Resource: kestra_namespace_file

Manages a Kestra Namespace File.

## Example usage

```hcl
resource "kestra_namespace_file" "example" {
  namespace = "company.team"
  filename  = "/path/my-file.sh"
  content   = <<EOT
echo "Hello World"
EOT
}
```

## Schema

### Required

- `filename` (String) The path to the namespace file that will be created. Missing parent directories will be created. If the file already exists, it will be overridden with the given content.
- `namespace` (String) The namespace of the namespace file resource.

### Optional

- `content` (String) Content to store in the file, expected to be a UTF-8 encoded string.

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_namespace_file.example {{namespace}}/{{filename}}
```

---

# Terraform: Manage Namespace Secrets in Kestra
URL: https://kestra.io/docs/terraform/resources/namespace_secret

> The kestra_namespace_secret resource allows you to manage namespace secrets in Kestra using Terraform.

## Terraform Resource: kestra_namespace_secret

Manages a Kestra Namespace Secret.
:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_namespace_secret" "example" {
  namespace    = "company.team"
  secret_key   = "MY_KEY"
  secret_value = "my-r34l-53cr37"
}
```

## Schema

### Required

- `namespace` (String) The namespace.
- `secret_key` (String) The namespace secret key.
- `secret_value` (String, Sensitive) The namespace secret value.

### Optional

- `secret_description` (String) The namespace secret description.
- `secret_tags` (Map of String) The namespace secret tags.

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.

---

# Terraform: Manage Roles in Kestra
URL: https://kestra.io/docs/terraform/resources/role

> The kestra_role resource allows you to manage roles in Kestra using Terraform.

## Terraform Resource: kestra_role

Manages a Kestra Role.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_role" "example" {
  name        = "Friendly name"
  description = "Friendly description"

  permissions {
    type        = "FLOW"
    permissions = ["READ", "UPDATE"]
  }

  permissions {
    type        = "EXECUTION"
    permissions = ["READ", "UPDATE"]
  }
}
```

## Schema

### Required

- `name` (String) The role name.

### Optional

- `description` (String) The role description.
- `is_default` (Boolean) The role is the default one at user creation. Only one role can be the default; the role most recently created or updated with this set to true is kept as the default. Defaults to `false`.
- `namespace` (String) The linked namespace.
- `permissions` (Block Set) The role permissions. (see [below for nested schema](#nestedblock--permissions))

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.

### Nested schema for `permissions`

Required:

- `permissions` (List of String) The permissions for this type.
- `type` (String) The type of permission.
## Import

Import is supported using the following syntax:

```shell
terraform import kestra_role.example {{role_id}}
```

---

# Terraform: Manage Security Integrations in Kestra

URL: https://kestra.io/docs/terraform/resources/security_integration

> The kestra_security_integration resource allows you to manage security integrations in Kestra using Terraform.

## Terraform Resource: kestra_security_integration

Manages a Kestra Security Integration. When imported, the URI and secret token are not provided.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Schema

### Required

- `name` (String) The name of the security integration.
- `type` (String) The type of the security integration.

### Optional

- `description` (String) The description of the security integration.

### Read-Only

- `id` (String) The ID of this resource.
- `secret_token` (String, Sensitive) The secret token of the security integration.
- `tenant_id` (String) The tenant id.
- `uid` (String) The unique identifier of the security integration.
- `uri` (String) The URL of the security integration.

---

# Terraform: Manage Service Accounts in Kestra

URL: https://kestra.io/docs/terraform/resources/service_account

> The kestra_service_account resource allows you to manage service accounts in Kestra using Terraform.

## Terraform Resource: kestra_service_account

Manages a Kestra Service Account.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_service_account" "example" {
  name        = "my-service-account"
  description = "Friendly description"
}
```

## Schema

### Required

- `name` (String) The service account name.

### Optional

- `description` (String) The service account description.
- `groups` (Block Set) The service account group. (see [below for nested schema](#nestedblock--groups))
- `super_admin` (Boolean) Is the service account a super admin.
### Read-Only

- `id` (String) The ID of this resource.

### Nested schema for `groups`

Required:

- `id` (String) The group id.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_service_account.example {{user_id}}
```

---

# Terraform: Manage Service Account API Tokens

URL: https://kestra.io/docs/terraform/resources/service_account_api_token

> The kestra_service_account_api_token resource allows you to manage service account API tokens in Kestra using Terraform.

## Terraform Resource: kestra_service_account_api_token

Manages a Kestra Service Account API Token.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Schema

### Required

- `description` (String) The API token description.
- `max_age` (String) The time the token remains valid since creation (ISO 8601 duration format).
- `name` (String) The API token display name.
- `service_account_id` (String) The ID of the Service Account owning the API Token.

### Optional

- `extended` (Boolean) Specify whether the expiry date is automatically moved forward by max age whenever the token is used. Defaults to `false`.

### Read-Only

- `full_token` (String, Sensitive) The full API token.
- `id` (String) The ID of this resource.

---

# Terraform: Manage Templates in Kestra

URL: https://kestra.io/docs/terraform/resources/template

> The kestra_template resource allows you to manage templates in Kestra using Terraform.

## Terraform Resource: kestra_template

Manages a Kestra Template.

## Example usage

```hcl
resource "kestra_template" "example" {
  namespace   = "company.team"
  template_id = "my-template"
  content     = <<EOT
tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello World!
EOT
}
```

## Schema

### Required

- `content` (String) The template full content in yaml string.
- `namespace` (String) The template namespace.
- `template_id` (String) The template id.

### Read-Only

- `id` (String) The ID of this resource.
- `tenant_id` (String) The tenant id.
## Import

Import is supported using the following syntax:

```shell
terraform import kestra_template.example {{namespace}}/{{template_id}}
```

---

# Terraform: Manage Tenants in Kestra

URL: https://kestra.io/docs/terraform/resources/tenant

> The kestra_tenant resource allows you to manage tenants in Kestra using Terraform.

## Terraform Resource: kestra_tenant

Manages a Kestra Tenant.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_tenant" "example" {
  tenant_id = "my-tenant"
  name      = "My Tenant"
}
```

## Schema

### Required

- `tenant_id` (String) The tenant id.

### Optional

- `name` (String) The tenant name.
- `outputs_in_internal_storage` (Boolean) Whether outputs are stored in internal storage.
- `require_existing_namespace` (Boolean) Whether the tenant requires an existing namespace.
- `secret_configuration` (Map of String) The secret configuration.
- `secret_isolation` (Block List, Max: 1) Secret isolation configuration (same shape as storage_isolation). (see [below for nested schema](#nestedblock--secret_isolation))
- `secret_read_only` (Boolean) Whether secrets are read-only in this tenant.
- `secret_type` (String) The secret type.
- `storage_configuration` (Map of String) The storage configuration.
- `storage_isolation` (Block List, Max: 1) Storage isolation configuration. (see [below for nested schema](#nestedblock--storage_isolation))
- `storage_type` (String) The storage type.
- `worker_group` (Block List, Max: 1) The worker group. (see [below for nested schema](#nestedblock--worker_group))

### Read-Only

- `id` (String) The ID of this resource.

### Nested schema for `secret_isolation`

Optional:

- `denied_services` (List of String) List of denied services for secret isolation.
- `enabled` (Boolean) Enable secret isolation.

### Nested schema for `storage_isolation`

Optional:

- `denied_services` (List of String) List of denied services for isolation.
- `enabled` (Boolean) Enable storage isolation.

### Nested schema for `worker_group`

Required:

- `fallback` (String) The fallback strategy.
- `key` (String) The worker group key.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_tenant.example {{tenant_id}}
```

---

# Terraform: Manage Tests in Kestra

URL: https://kestra.io/docs/terraform/resources/test

> The kestra_test resource allows you to manage tests in Kestra using Terraform.

## Terraform Resource: kestra_test

Test resource

## Schema

### Required

- `content` (String) The actual Test YAML content.
- `namespace` (String) The Test namespace.
- `test_id` (String) The Test id.

---

# Terraform: Manage Users in Kestra

URL: https://kestra.io/docs/terraform/resources/user

> The kestra_user resource allows you to manage users in Kestra using Terraform.

## Terraform Resource: kestra_user

Manages a Kestra User.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_user" "example" {
  email       = "john@doe.com"
  namespace   = "company.team"
  description = "Friendly description"
  first_name  = "John"
  last_name   = "Doe"
  groups      = ["4by6NvSLcPXFhCj8nwbZOM"]
}
```

## Schema

### Required

- `email` (String) The user email.

### Optional

- `description` (String) The user description.
- `first_name` (String) The user first name.
- `groups` (List of String) The user groups id.
- `last_name` (String) The user last name.
- `namespace` (String) The linked namespace.

### Read-Only

- `id` (String) The ID of this resource.
- `username` (String) The user name. Always the email.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_user.example {{user_id}}
```

---

# Terraform: Manage User API Tokens in Kestra

URL: https://kestra.io/docs/terraform/resources/user_api_token

> The kestra_user_api_token resource allows you to manage user API tokens in Kestra using Terraform.
## Terraform Resource: kestra_user_api_token

Manages a Kestra User API Token.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_user_api_token" "example" {
  user_id     = "4by6NvSLcPXFhCj8nwbZOM"
  name        = "test-token"
  description = "Test token"
  max_age     = "PT1H"
  extended    = false
}
```

## Schema

### Required

- `description` (String) The API token description.
- `max_age` (String) The time the token remains valid since creation (ISO 8601 duration format).
- `name` (String) The API token display name.
- `user_id` (String) The ID of the user owning the API Token.

### Optional

- `extended` (Boolean) Specify whether the expiry date is automatically moved forward by max age whenever the token is used. Defaults to `false`.

### Read-Only

- `full_token` (String, Sensitive) The full API token.
- `id` (String) The ID of this resource.

## Import

Import is supported using the following syntax:

```shell
terraform import kestra_user_api_token.example {{user_id}}
```

---

# Terraform: Manage User Passwords in Kestra

URL: https://kestra.io/docs/terraform/resources/user_password

> The kestra_user_password resource allows you to manage user passwords in Kestra using Terraform.

## Terraform Resource: kestra_user_password

Manages a Kestra User Basic Auth Password.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Example usage

```hcl
resource "kestra_user_password" "example" {
  user_id  = "4by6NvSLcPXFhCj8nwbZOM"
  password = "my-random-password"
}
```

## Schema

### Required

- `password` (String, Sensitive) The user password.
- `user_id` (String) The user id.

### Read-Only

- `id` (String) The ID of this resource.

---

# Terraform: Manage Worker Groups in Kestra

URL: https://kestra.io/docs/terraform/resources/worker_group

> The kestra_worker_group resource allows you to manage worker groups in Kestra using Terraform.
## Terraform Resource: kestra_worker_group

Manages a Kestra Worker Group.

:::alert{type="info"}
This resource is only available on the [Enterprise Edition](https://kestra.io/enterprise)
:::

## Schema

### Required

- `key` (String) The worker group key.

### Optional

- `allowed_tenants` (List of String) The list of tenants allowed to use the worker group.
- `description` (String) The worker group description.

### Read-Only

- `id` (String) The ID of this resource.

---

# Tutorial: Build Kestra Flows Step by Step

URL: https://kestra.io/docs/tutorial

> Follow our step-by-step tutorial to learn Kestra's core concepts and build your first workflows from scratch.

import ChildCard from "~/components/docs/ChildCard.astro"

Use this tutorial to learn Kestra’s core concepts and build flows step by step.
## Learn Kestra by building Flows step by step

You can use Kestra to:

- Run workflows on demand, event-driven, or on a schedule.
- Interact with any system or language through plugins and tasks.
- Orchestrate microservices, batch jobs, scripts, SQL queries, data syncs, dbt or Spark jobs, and other processes.

This tutorial starts with a simple flow that makes an API request and then expands to use more of Kestra's core building blocks: Inputs, downstream tasks, scripts, Outputs, and Triggers. Later sections cover conditional flow logic, parallel task execution, and error handling.

---

# Handle Errors in Kestra: Retries and Alerts

URL: https://kestra.io/docs/tutorial/errors

> Build resilient Kestra workflows with robust error handling. Configure retries, set up alerts, and manage failures at the flow and namespace levels.

Handle errors with automatic retries and notifications. Failure is inevitable. Kestra offers automatic retries and error handling to help you build resilient workflows.
## Handle errors with retries and alerts

By default, if any task fails, the execution stops and is marked as failed. For more control over error handling, you can add the `errors` property, `AllowFailure` tasks, or automatic retries.

The `errors` property allows you to execute one or more actions before terminating the flow (e.g., sending an email or a Slack message to your team). The property is named `errors` because it is triggered when errors occur within a flow.

You can implement error handling at the flow level or namespace level:

1. **Flow-level**: Useful to implement custom alerting for a specific flow or task. This can be accomplished by adding the `errors` property.
2. **Namespace-level**: Useful to send a notification for any failed Execution within a given namespace. This approach allows you to implement centralized error handling for all flows within a given namespace.

## Flow-level error handling using `errors`

The `errors` property of a flow accepts a list of tasks to execute when an error occurs. You can add as many tasks as you want, and they will be executed sequentially, similar to the `tasks` block.

The following example flow automatically sends a Slack alert via the [SlackIncomingWebhook](/plugins/plugin-slack/slack-notifications/io.kestra.plugin.slack.notifications.slackincomingwebhook) task whenever the flow fails:

```yaml
id: unreliable_flow
namespace: company.team

tasks:
  - id: fail
    type: io.kestra.plugin.core.execution.Fail

errors:
  - id: alert_on_failure
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}" # https://hooks.slack.com/services/xyz/xyz/xyz
    messageText: "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"
```

:::alert{type="info"}
Note that we hide the Slack Webhook URL with a secret expression to avoid exposing private endpoints or credentials. Follow our [Open-source Secrets guide](../../15.how-to-guides/secrets/index.md) or check out the [Enterprise Edition](../../oss-vs-paid/index.md) to incorporate your own external [Secrets Manager](../../07.enterprise/02.governance/secrets-manager/index.md).
:::

Taking our flow from earlier stages, we can add a Slack alert on an execution error like the following:

```yaml
id: getting_started_category_check
namespace: company.team

inputs:
  - id: category
    type: SELECT
    displayName: Select a category
    values: ['beauty', 'notebooks']
    defaults: 'beauty'

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "https://dummyjson.com/products/category/{{ inputs.category }}"
    method: GET

  - id: check_products
    type: io.kestra.plugin.core.flow.If
    condition: "{{ json(outputs.api.body).products | length > 0 }}"
    then:
      - id: log_status
        type: io.kestra.plugin.core.log.Log
        message: "Found {{ json(outputs.api.body).products | length }} products for category {{ inputs.category }}"

      - id: python
        type: io.kestra.plugin.scripts.python.Script
        containerImage: python:slim
        dependencies:
          - polars
        outputFiles:
          - "products.csv"
        script: |
          import polars as pl
          data = {{ outputs.api.body | jq('.products') | first }}
          df = pl.from_dicts(data)
          df.glimpse()
          df.select(["title", "brand", "price", "rating"]).write_csv("products.csv")

      - id: sqlQuery
        type: io.kestra.plugin.jdbc.duckdb.Queries
        inputFiles:
          in.csv: "{{ outputs.python.outputFiles['products.csv'] }}"
        sql: |
          SELECT brand, round(avg(price), 2) AS avg_price, count(*) AS cnt
          FROM read_csv_auto('{{ workingDir }}/in.csv', header=True)
          GROUP BY brand
          ORDER BY avg_price DESC;
        store: true
    else:
      - id: when_false
        type: io.kestra.plugin.core.log.Log
        message: "No products found for category {{ inputs.category }}."

errors:
  - id: alert_on_failure
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    messageText: "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"

triggers:
  - id: every_monday_at_10_am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 10 * * 1
```

Now if there is an error, say our API endpoint is unreachable, we'll get a Slack alert notifying the team to investigate. For more, check the [error handling](../../05.workflow-components/11.errors/index.md) page.

## Namespace-level error handling using a flow trigger

To get notified on a workflow failure, you can leverage Kestra's built-in notification tasks, including:

- [Slack](/plugins/plugin-slack)
- [Microsoft Teams](/plugins/plugin-teams)
- [Email](/plugins/plugin-mail)

For centralized namespace-level alerting, add a dedicated monitoring workflow with one of the notification tasks above and a Flow trigger. Below is an example workflow that automatically sends a Slack alert as soon as any flow in the namespace `company.team` fails or finishes with warnings.

```yaml
id: failure_alert
namespace: system

tasks:
  - id: send
    type: io.kestra.plugin.slack.notifications.SlackExecution
    url: "{{ secret('SLACK_WEBHOOK') }}"
    executionId: "{{ trigger.executionId }}"

triggers:
  - id: listen
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatus
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespace
        namespace: company.team
        prefix: true
```

Adding this flow ensures you receive a Slack alert for any flow failure in the `company.team` namespace.

![alert notification](./alert-notification.png)

## Retries

When working with external systems, transient errors are common. For example, a file may not be available yet, an API might be temporarily unreachable, or a database can be under maintenance. In such cases, retries can often resolve the issue without human intervention.
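To make this concrete, here is a minimal sketch of a task that retries a flaky HTTP call with exponential backoff; the flow id and the specific duration values are illustrative, and `maxInterval` is assumed to cap the growing delay for the exponential type (the retry types and properties are covered in detail below):

```yaml
id: transient_error_retry # illustrative flow id
namespace: company.team

tasks:
  - id: flaky_api
    type: io.kestra.plugin.core.http.Request
    uri: https://dummyjson.com/products # endpoint reused from this tutorial
    retry:
      type: exponential   # the delay grows after each failed attempt
      interval: PT5S      # first retry after 5 seconds
      maxInterval: PT1M   # assumed property: cap the backoff at 1 minute
      maxAttempts: 5
      maxDuration: PT10M
```

If the endpoint recovers within one of the retry windows, the execution succeeds without any human intervention.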
### Configuring retries

Each task can be retried a certain number of times and in a specific way. Use the `retry` property with the desired type of retry. The following types of retries are currently supported:

- **Constant**: The task will be retried every X seconds/minutes/hours/days.
- **Exponential**: The task will also be retried every X seconds/minutes/hours/days but with an exponential backoff (i.e., an exponentially growing time interval between each retry attempt).
- **Random**: The task will be retried every X seconds/minutes/hours/days with a random delay (i.e., a random time interval between each retry attempt).

In this example, the task is retried up to 5 times within a total duration of 1 minute, with a constant 2-second interval between attempts.

```yaml
id: retries
namespace: company.team

tasks:
  - id: fail_four_times
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - 'if [ "{{ taskrun.attemptsCount }}" -eq 4 ]; then exit 0; else exit 1; fi'
    retry:
      type: constant
      interval: PT2S
      maxAttempts: 5
      maxDuration: PT1M
      warningOnRetry: false

errors:
  - id: will_never_run
    type: io.kestra.plugin.core.debug.Return
    format: This will never be executed as retries will fix the issue
```

### Adding a retry configuration to our tutorial workflow

Returning to the first iteration of our flow from the [Fundamentals](../01.fundamentals/index.md) section, we can add a retry configuration to the `api` task. API calls are prone to transient errors, so we will retry that task up to 10 times, for at most 1 hour of total duration, every 20 seconds (i.e., with a constant interval of 20 seconds between retry attempts).
```yaml
id: retry_request
namespace: company.team

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: https://dummyjson.com/products
    retry:
      type: constant
      interval: PT20S
      maxDuration: PT1H
      maxAttempts: 10
      warningOnRetry: true
```

With the complete example looking something like the following:

```yaml
id: getting_started_category_check
namespace: company.team

inputs:
  - id: category
    type: SELECT
    displayName: Select a category
    values: ['beauty', 'notebooks']
    defaults: 'beauty'

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "https://dummyjson.com/products/category/{{ inputs.category }}"
    method: GET
    retry:
      type: constant
      interval: PT20S
      maxDuration: PT1H
      maxAttempts: 10
      warningOnRetry: true

  - id: check_products
    type: io.kestra.plugin.core.flow.If
    condition: "{{ json(outputs.api.body).products | length > 0 }}"
    then:
      - id: log_status
        type: io.kestra.plugin.core.log.Log
        message: "Found {{ json(outputs.api.body).products | length }} products for category {{ inputs.category }}"

      - id: python
        type: io.kestra.plugin.scripts.python.Script
        containerImage: python:slim
        dependencies:
          - polars
        outputFiles:
          - "products.csv"
        script: |
          import polars as pl
          data = {{ outputs.api.body | jq('.products') | first }}
          df = pl.from_dicts(data)
          df.glimpse()
          df.select(["title", "brand", "price", "rating"]).write_csv("products.csv")

      - id: sqlQuery
        type: io.kestra.plugin.jdbc.duckdb.Queries
        inputFiles:
          in.csv: "{{ outputs.python.outputFiles['products.csv'] }}"
        sql: |
          SELECT brand, round(avg(price), 2) AS avg_price, count(*) AS cnt
          FROM read_csv_auto('{{ workingDir }}/in.csv', header=True)
          GROUP BY brand
          ORDER BY avg_price DESC;
        store: true
    else:
      - id: when_false
        type: io.kestra.plugin.core.log.Log
        message: "No products found for category {{ inputs.category }}."

errors:
  - id: alert_on_failure
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    messageText: "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"

triggers:
  - id: every_monday_at_10_am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 10 * * 1
```

## Next steps

Congrats! 🎉 You've completed the tutorial. Next, you can dive into:

- [Architecture](../../08.architecture/index.mdx)
- [Workflow Components](../../05.workflow-components/index.mdx)
- [Key concepts](../../06.concepts/index.mdx)
- [Plugins](/plugins) to integrate with external systems
- [Deployments](../../10.administrator-guide/index.mdx) to deploy Kestra to production
- [Scripts](../../16.scripts/index.mdx)

---

# Flowable Tasks in Kestra: Branch, Loop, Parallelize

URL: https://kestra.io/docs/tutorial/flowable

> Master Kestra's Flowable tasks to control workflow logic. Learn how to implement branching, loops, and parallel execution for complex orchestration scenarios.

Run tasks or subflows in parallel, create loops, and conditional branching.
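Before diving in, here is a minimal sketch of the kind of logic Flowable tasks enable: two tasks running in parallel via the `Parallel` task, followed by a task that waits for both branches. The flow id, task ids, and log messages are illustrative.

```yaml
id: parallel_example # illustrative flow id
namespace: company.team

tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      # both branches start at the same time
      - id: branch_a
        type: io.kestra.plugin.core.log.Log
        message: Running branch A
      - id: branch_b
        type: io.kestra.plugin.core.log.Log
        message: Running branch B

  # runs only after both parallel branches finish
  - id: after_both
    type: io.kestra.plugin.core.log.Log
    message: Both branches finished
```

The rest of this page walks through branching with `If`, looping with `ForEach` and `LoopUntil`, and parallelism in detail.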
The example flow from earlier in this tutorial extracts data from an API, processes it in a Python script, executes a SQL query, and generates a downloadable artifact on a predefined schedule. Many real-world use cases require branching, looping, or running several tasks simultaneously. Kestra handles these requirements with Flowable tasks.

Tasks from the [Core Flow plugin](/plugins/core/flow) control flow logic. Use them to run tasks in parallel or sequentially, branch conditionally, iterate over items, pause, or allow specific tasks to fail without stopping the execution. For example, you can use the [If task](/plugins/core/flow/io.kestra.plugin.core.flow.if) to specify your conditions and define what action to take based on whether those conditions are met.

The example below redesigns the flow to use a `SELECT` input for product category rather than a `STRING` URI, while still calling [dummyjson](https://dummyjson.com). An API request is made based on the selected category — `beauty` or `notebooks` (one does not exist). The `check_products` If task has a `condition` of `"{{ json(outputs.api.body).products | length > 0 }}"` (i.e., checking whether the API body is not empty and contains at least one product). The log message then depends on whether the actual product category exists or not. The `then` property defines the action for a true condition, and the `else` property defines the action for a false result.
```yaml
id: getting_started
namespace: company.team

inputs:
  - id: category
    type: SELECT
    displayName: Select a category
    values: ['beauty', 'notebooks']
    defaults: 'beauty'

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "https://dummyjson.com/products/category/{{ inputs.category }}"
    method: GET

  - id: check_products
    type: io.kestra.plugin.core.flow.If
    condition: "{{ json(outputs.api.body).products | length > 0 }}"
    then:
      - id: log_status
        type: io.kestra.plugin.core.log.Log
        message: "Found {{ json(outputs.api.body).products | length }} products for category {{ inputs.category }}"

      - id: python
        type: io.kestra.plugin.scripts.python.Script
        containerImage: python:slim
        dependencies:
          - polars
        outputFiles:
          - "products.csv"
        script: |
          import polars as pl
          data = {{ outputs.api.body | jq('.products') | first }}
          df = pl.from_dicts(data)
          df.glimpse()
          # Keep a simple view for this category
          df.select(["title", "brand", "price"]).write_csv("products.csv")

      - id: sqlQuery
        type: io.kestra.plugin.jdbc.duckdb.Query
        inputFiles:
          in.csv: "{{ outputs.python.outputFiles['products.csv'] }}"
        sql: |
          SELECT brand, round(avg(price), 2) AS avg_price, count(*) AS cnt
          FROM read_csv_auto('{{ workingDir }}/in.csv', header=True)
          GROUP BY brand
          ORDER BY avg_price DESC;
        store: true
    else:
      - id: when_false
        type: io.kestra.plugin.core.log.Log
        message: "No products found for category {{ inputs.category }}."

triggers:
  - id: every_monday_at_10_am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 10 * * 1
```

Execute the flow twice, once with `beauty` and once with `notebooks`, to examine the results.

## Add a loop to a flow using Flowable tasks

A common orchestration pattern is operating on a set of values. Kestra offers several approaches depending on your use case. The standalone examples below demonstrate each type.

### ForEach

The **ForEach** flowable task executes a group of tasks for each value in the list.
There are many ways to implement ForEach for complex looping operations, possibly incorporating conditional flowable tasks or subtasks. See more examples in the [ForEach documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreach).

As an introduction to the feature, the below example demonstrates using ForEach to make an API call to [OpenLibrary](https://openlibrary.org/dev/docs/api/search) to get a list of associated titles for each author in the list. The values are defined as a JSON string or an array, i.e., a list of string values `["value1", "value2"]` or a list of key-value pairs `[{"key": "value1"}, {"key": "value2"}]`. You can access the current iteration value using the variable `{{ taskrun.value }}`:

```yaml
id: for_loop_example
namespace: tutorial

tasks:
  - id: for_each
    type: io.kestra.plugin.core.flow.ForEach
    values: ["pynchon", "dostoyevsky", "hedayat"]
    tasks:
      - id: api
        type: io.kestra.plugin.core.http.Request
        uri: "https://openlibrary.org/search.json?author={{ taskrun.value }}&sort=new"
```

After execution, the Gantt view shows separate runs for each of the three listed authors in the task.

![forEach example](./for-each-author.png)

### LoopUntil

You can also loop until an external system reports a healthy status. The `LoopUntil` task reruns its child tasks until a condition becomes `true`, which is helpful for polling APIs or long-running jobs. Key options:

- `condition` — evaluated after each run and can reference the latest child outputs (for example `{{ outputs.healthCheck.code }}`).
- `tasks` — the steps executed on every loop iteration.
- `checkFrequency` — optional guardrails controlling the poll interval plus maximum iterations or duration.
```yaml
id: loop_until_health_check
namespace: tutorial

tasks:
  - id: loop
    type: io.kestra.plugin.core.flow.LoopUntil
    condition: "{{ outputs.healthCheck.code == 200 }}"
    checkFrequency:
      interval: PT30S
      maxIterations: 50
    tasks:
      - id: healthCheck
        type: io.kestra.plugin.core.http.Request
        method: GET
        uri: https://kestra.io
```

This flow checks an HTTP endpoint every 30 seconds and stops either when it returns 200 or after 50 attempts, whichever comes first. You can reference the child task outputs (here `outputs.healthCheck.code`) inside the `condition` expression. See the [LoopUntil task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.loopuntil) for additional options.

## Add parallelism using Flowable tasks

A common orchestration requirement is executing independent processes **in parallel**. For example, you can process data for each partition in parallel. This can significantly speed up the processing time.

The flow below uses the `ForEach` flowable task to execute a list of `tasks` in parallel:

1. The `concurrencyLimit` property with value `0` makes the list of `tasks` execute in parallel.
2. The `values` property defines the list of items to iterate over.
3. The `tasks` property defines the list of tasks to execute for each item in the list. You can access the iteration value using the `{{ taskrun.value }}` variable.
```yaml
id: python_partitions
namespace: company.team
description: Process partitions in parallel

tasks:
  - id: getPartitions
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/pydata:latest
    script: |
      from kestra import Kestra
      partitions = [f"file_{nr}.parquet" for nr in range(1, 10)]
      Kestra.outputs({'partitions': partitions})

  - id: processPartitions
    type: io.kestra.plugin.core.flow.ForEach
    concurrencyLimit: 0
    values: '{{ outputs.getPartitions.vars.partitions }}'
    tasks:
      - id: partition
        type: io.kestra.plugin.scripts.python.Script
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        dependencies:
          - kestra
        script: |
          import random
          import time
          from kestra import Kestra

          filename = '{{ taskrun.value }}'
          print(f"Reading and processing partition (unknown)")
          nr_rows = random.randint(1, 1000)
          processing_time = random.randint(1, 20)
          time.sleep(processing_time)
          Kestra.counter('nr_rows', nr_rows, {'partition': filename})
          Kestra.timer('processing_time', processing_time, {'partition': filename})
```

These examples, while simple, demonstrate the flexibility of flowable tasks in both simple and complex workflows. To learn more about flowable tasks and see more examples, check out the full [Flowable tasks documentation](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md).

Next, we'll explore error handling in a flow.

---

# Build a Hello World Flow in Kestra

URL: https://kestra.io/docs/tutorial/fundamentals

> Build your first Hello World flow in Kestra. Follow a step-by-step tutorial to learn declarative workflow design and core Kestra concepts from scratch.

Start by building a simple Hello World flow.

:::alert{type="info"}
If you haven't already, follow the [Quickstart Guide](../../01.quickstart/index.md) or check the detailed [Installation Guide](../../02.installation/index.mdx).
:::

## Build your first Hello World Flow
## Flows [Flows](../../05.workflow-components/01.flow/index.md) are defined in a declarative YAML syntax to keep the orchestration code portable and language-agnostic. Each flow consists of three required components: `id`, `namespace`, and `tasks`. 1. `id` is the unique identifier of the flow. 2. `namespace` separates projects, teams, and environments. 3. `tasks` is a list of tasks executed in order. Here are those three components in a YAML file: ```yaml id: getting_started namespace: company.team tasks: - id: hello_world type: io.kestra.plugin.core.log.Log message: Hello World! ``` The `id` of a flow must be unique within its namespace. For example: - ✅ You **can** have a flow named `getting_started` in `company.team1` and another flow named `getting_started` in `company.team2`. - ❌ You **cannot** have two flows named `getting_started` in `company.team` at the same time. The combination of `id` and `namespace` is the unique identifier for a flow. ### Namespaces [Namespaces](../../05.workflow-components/02.namespace/index.md) are used to group flows and provide structure. Keep in mind that a flow’s allocation to a namespace is immutable. Once a flow is created, you cannot change its namespace. If you need to change the namespace of a flow, create a new flow within the desired namespace and delete the old flow. ### Labels To add another layer of organization, use [labels](../../05.workflow-components/08.labels/index.md) to group flows with key-value pairs. In short, labels are customizable tags to simplify monitoring and filtering of flows and executions. For example, taking the flow above, we can add a label with the key `tag` to define the flow as `Getting Started`: ```yaml id: getting_started namespace: company.team labels: tag: Getting Started tasks: - id: hello_world type: io.kestra.plugin.core.log.Log message: Hello World! 
``` ### Descriptions You can optionally add a [description](../../05.workflow-components/15.descriptions/index.md) property to document your flow's purpose or other useful information. The `description` is a string that supports **markdown** syntax. This markdown description is rendered and displayed in the UI. :::alert{type="info"} You can also add a `description` property to `tasks` and `triggers` to document all the components of your workflow. ::: Here is the same flow as before, but with labels and descriptions: ```yaml id: getting_started namespace: company.team description: | # Getting Started Let's `write` some **markdown** - [first flow](https://t.ly/Vemr0) 🚀 labels: tag: Getting Started tasks: - id: hello_world type: io.kestra.plugin.core.log.Log message: Hello World! description: | ## About this task This task prints "Hello World!" to the logs. ``` Learn more about flows in the [Flows page](../../05.workflow-components/01.flow/index.md). ## Tasks Now that you know how to document and organize your flows, it's time to get to the core of orchestration: tasks. [Tasks](../../05.workflow-components/01.tasks/index.mdx) are atomic actions in your flows. You can design your tasks to be small and granular, such as fetching data from a REST API or running a self-contained Python script. However, tasks can also represent large and complex processes, like triggering containerized processes or long-running batch jobs (e.g., using dbt, Spark, AWS Batch, Azure Batch, etc.) and waiting for their completion. ### Task execution order Tasks are defined as a **list**. By default, all tasks in the list will be executed **sequentially** — the second task will start as soon as the first one finishes successfully. Kestra provides additional **customization** to run tasks **in parallel**, iterate (_sequentially or in parallel_) over a list of items, or allow **specific tasks to fail** without stopping the flow. 
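As a quick taste of what's ahead, here is a minimal sketch of two tasks running at the same time with the core `Parallel` task (an illustration only — parallelism is covered in depth later in the tutorial):

```yaml
id: parallel_example
namespace: company.team

tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      # both Log tasks start at the same time instead of one after the other
      - id: task_a
        type: io.kestra.plugin.core.log.Log
        message: Running task A
      - id: task_b
        type: io.kestra.plugin.core.log.Log
        message: Running task B
```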
These kinds of actions are called [**Flowable**](../05.flowable/index.md) tasks because they define the flow logic. We'll cover Flowable tasks in more detail later in the tutorial, but for now it is good to know they exist.

A task in Kestra must have an `id` and a `type`. This is similar to how a flow must have an `id` and a `namespace`. Other task properties depend on the task type. You can think of a task as a step in a flow that executes a specific action, such as running a Python or Node.js script in a Docker container or loading data from a database.

We've shown a Log task in some example flows before, and below is the same flow with an additional Python script task added. The Log task runs first and then the Python task (copy and run it yourself to see the results):

```yaml
id: getting_started
namespace: company.team

description: |
  # Getting Started
  Let's `write` some **markdown** - [first flow](https://t.ly/Vemr0) 🚀

labels:
  tag: Getting Started

tasks:
  - id: hello_world
    type: io.kestra.plugin.core.log.Log
    message: Hello World!
    description: |
      ## About this task
      This task prints "Hello World!" to the logs.

  - id: python
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:slim
    script: |
      print("Hello World!")
```

### Iterate quickly with Playground

When you want to tweak a flow step by step without rerunning everything, use the **Playground** in the editor. It lets you play tasks one at a time, keep prior outputs, and iterate like a notebook. See the short guide in [UI → Playground](../../09.ui/10.playground/index.md) and try it with the `getting_started` example above before moving on.

### Autocompletion

Kestra supports [hundreds of tasks](/plugins) integrating with various external systems. It's neither necessary nor possible to memorize every task or property.
Use the shortcut `CTRL + SPACE` on Windows/Linux or `fn + control + SPACE` on macOS to trigger autocompletion and list available tasks or properties for a given task.

Kestra also has built-in documentation accessible through the UI for Flow, Task, and Trigger properties, so you don't have to context switch between building a flow and learning the ins and outs of a component.

:::alert{type="info"}
If you want to comment out or uncomment part of your code, use `CTRL + /` on Windows/Linux or `⌘ + /` on macOS. All available keyboard shortcuts are listed in the code editor context menu.
:::

## Create and run a flow

Up to this point, we have shown some flows to run and get familiar with. Now, let's create a flow to use throughout the rest of the tutorial.

Open the **Flows** view and click **+ Create**:

![Create flow](./create_button.png)

Paste the following code into the Flow editor:

```yaml
id: getting_started
namespace: company.team

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: https://dummyjson.com/products
```

Then, hit the **Save** button.

![Create flow](./save_button.png)

This flow has a single task that fetches data from the [dummyjson](https://dummyjson.com/) API via an [HTTP Request task](/plugins/core/http/io.kestra.plugin.core.http.request).

Run it to see the output.

![New execution](./new_execution.png)

After execution, you’ll be directed to the Gantt view to see the stages of your flow’s progress. In this simple example, we see the API request successfully execute.

![gantt view](./gantt-view.png)

While fetching data is a great first step, it is just that: a first step. In the next sections, you'll explore the other critical components of Kestra flows: Inputs, Outputs, Triggers, and more.

---

# Add Inputs to Kestra Workflows

URL: https://kestra.io/docs/tutorial/inputs

> Discover how to add dynamic inputs to your Kestra workflows to make them flexible and reusable across different scenarios.
Instead of hardcoding values in your flow, use inputs to make your workflows more dynamic and reusable. ## Make Flows dynamic with Inputs
## Defining inputs

Similar to `tasks`, `inputs` is defined as a list of key-value pairs. Each input must have an `id` and a `type`. You can also set `defaults` for each input. Setting default values is recommended, especially when running on a schedule.

For example, an input might be the name of a user you'd like to send an autogenerated message to, as in the following flow:

```yaml
id: inputs_demo
namespace: company.team

inputs:
  - id: user
    type: STRING
    defaults: Zoyd Wheeler

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hey there, {{ inputs.user }}
```

In the Log task, you'll notice `inputs.user`. To retrieve an input value, you need to identify the input in an [expression](../../expressions/index.mdx). In Kestra, double curly braces `{{ }}` are used to wrap an expression. For an input, follow this general `{{ inputs.input_id }}` syntax.

In the example above, the input `id` is set to `user`, and it's referenced in the task message as `{{ inputs.user }}`. Leverage [autocompletion](../01.fundamentals/index.md#autocompletion) in the flow editor to use expressions; they can be tricky at first, so let Kestra do the hard work.

:::alert{type="info"}
Hit the `Backspace` or `Delete` key while building your expression? Use the keyboard shortcut to bring autocomplete back again: `CTRL + SPACE`.
:::

Try running the above flow with different values for the `user` input. You can do this by clicking on the **Execute** button and then typing a new string value in the prompt for whatever name you'd like.

![Inputs](./inputs.png)

:::alert{type="info"}
The plural form `defaults` is used instead of `default` for two reasons. First, `default` is a reserved keyword in Java, so it cannot be used. Second, the property can hold a JSON object or array, allowing you to define multiple default values at once.
:::

## Input types

Here are the most common input types:

| Type    | Description                                                                                           |
|---------|-------------------------------------------------------------------------------------------------------|
| STRING  | It can be any string value. Strings are not parsed; they are passed as-is to any task that uses them.  |
| INT     | It can be any valid integer number (without decimals).                                                 |
| BOOLEAN | It must be either `true` or `false`.                                                                   |

This is a very basic list to get started. Check the [Inputs documentation](../../05.workflow-components/05.inputs/index.md) for an extensive list of supported input types and properties.

## Parameterize your flow

With basic inputs covered, you can parameterize the flow created earlier in [Fundamentals](../01.fundamentals/index.md#create-and-run-a-flow). In the example below, we provide the URL of the API as an input rather than hardcoding it into the Request task's `uri` property. This allows you to change the URL at execution time without modifying the flow itself.

```yaml
id: getting_started
namespace: company.team

inputs:
  - id: api_url
    type: STRING
    defaults: https://dummyjson.com/products

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "{{ inputs.api_url }}"
```

To learn more about input types, properties, and more advanced uses, check out the full [Inputs documentation](../../05.workflow-components/05.inputs/index.md).

Next, we check out flow results and how they can be used.

---

# Pass Outputs Between Tasks in Kestra

URL: https://kestra.io/docs/tutorial/outputs

> Learn how to pass data between tasks and flows in Kestra using Outputs, enabling complex data processing pipelines.

Outputs let you pass data between tasks and flows.

## Pass Outputs between Tasks

Tasks and flows can generate outputs that are passed to downstream processes. To do this, Kestra uses internal storage.
Tasks from the `io.kestra.plugin.core.storage` category, along with [Outputs](../../05.workflow-components/06.outputs/index.md), interact with internal storage. You can think of internal storage as your own private S3 bucket.

This storage layer helps avoid connector sprawl. For example, the PostgreSQL plugin can extract data and load it into internal storage. Other tasks can then load that data into Snowflake, BigQuery, or Redshift — or process it with another plugin — without direct point-to-point connections.

Let's check out Outputs in practice.
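Before diving into retrieval, the extract-and-share pattern above can be sketched as follows. This is a rough illustration — the connection details are placeholders, and the task properties shown are assumptions; check each plugin's documentation for exact signatures:

```yaml
id: internal_storage_pattern
namespace: company.team

tasks:
  # Extract data from PostgreSQL; the result file lands in internal storage
  - id: extract
    type: io.kestra.plugin.jdbc.postgresql.CopyOut
    url: jdbc:postgresql://localhost:5432/mydb # illustrative connection string
    username: "{{ secret('DB_USERNAME') }}"
    password: "{{ secret('DB_PASSWORD') }}"
    format: CSV
    sql: SELECT * FROM orders

  # Any downstream task can pick the file up from internal storage by its URI —
  # no direct connection between PostgreSQL and the consumer is needed
  - id: process
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:slim
    inputFiles:
      orders.csv: "{{ outputs.extract.uri }}"
    script: |
      with open("orders.csv") as f:
          print(sum(1 for _ in f), "rows extracted")
```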
## How to retrieve outputs

To see which outputs have been generated during a flow execution, go to the **Outputs** tab on the Execution page:

![Output of our previous download](./output.png)

Outputs are useful for troubleshooting and auditing. Additionally, you can use outputs to:
- share **downloadable artifacts** with business stakeholders (e.g., a table generated by a SQL query or a CSV file generated by a Python script)
- **pass data** between decoupled processes (e.g., pass a subflow's outputs or a file detected by an S3 trigger to downstream tasks)

Similar to Inputs, use [expressions](../../expressions/index.mdx) to access Outputs in downstream tasks. Use the syntax `{{ outputs.task_id.output_property }}` to retrieve a specific output value of a task. If your task `id` contains one or more hyphens (`-`), wrap the task `id` in square brackets, for example: `{{ outputs['task-id'].output_property }}`.

Referring back to our flow, you'll have seen that one of the outputs is the API status code (e.g., 200). We can access this output in a downstream task with an expression in a Log task, for example:

```yaml
id: getting_started
namespace: company.team

inputs:
  - id: api_url
    type: STRING
    defaults: https://dummyjson.com/products

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "{{ inputs.api_url }}"

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "API Status Code: {{ outputs.api.code }}"
```

## Use outputs in your flow

While returning the status code is useful for unit testing a flow, typically when fetching data from a REST API we want to use the data itself. Kestra stores that fetched data in the internal storage and makes it available to downstream tasks using the `body` output argument. Use the `{{ outputs.task_id.body }}` syntax to process that fetched data in a downstream task. To demonstrate, we can add a [Python script](/plugins/plugin-script-python/io.kestra.plugin.scripts.python.script) task to our flow below.
:::alert{type="info"}
Kestra is language-agnostic — run custom scripts in any flow. You can run Python, Node.js, R, Julia, and an ever-growing number of languages; or execute commands in shell or PowerShell. Check out the [Scripts documentation](../../16.scripts/index.mdx) for more!
:::

```yaml
id: getting_started_output
namespace: company.team

inputs:
  - id: api_url
    type: STRING
    defaults: https://dummyjson.com/products

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "{{ inputs.api_url }}"

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "API Status Code: {{ outputs.api.code }}"

  - id: python
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:slim
    beforeCommands:
      - pip install polars
    outputFiles:
      - "products.csv"
    script: |
      import polars as pl
      data = {{outputs.api.body | jq('.products') | first}}
      df = pl.from_dicts(data)
      df.glimpse()
      df.select(["brand", "price"]).write_csv("products.csv")
```

This flow processes data using Polars and stores the result as a CSV file.

![File Outputs](./tutorial-outputs-python.png)

:::alert{type="info"}
To avoid package dependency conflicts, the Python task runs in an **independent Docker container**. You can optionally provide a **custom Docker image** from a private container registry, or use a public Python image from DockerHub. The `beforeCommands` argument allows you to install any custom package dependencies — here, we install [Polars](https://www.pola.rs/). Use as many commands as needed to prepare the containerized environment for script execution.
:::

## Debug expressions

When referencing the output from the previous task, this flow uses the `jq` [language](https://en.wikipedia.org/wiki/Jq_(programming_language)) to extract the `products` array from the API response — `jq` is available in all Kestra tasks without having to install it.
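To make the pipeline concrete, here is a step-by-step breakdown with illustrative values. Note that Kestra's `jq` filter returns its results wrapped in an array, which is why `first` is needed to unwrap the single match:

```yaml
# Given a response body like {"products": [{"brand": "A"}], "total": 1}:
#
#   {{ outputs.api.body }}                            -> the raw JSON response
#   {{ outputs.api.body | jq('.products') }}          -> [[{"brand": "A"}]]  (results wrapped in an array)
#   {{ outputs.api.body | jq('.products') | first }}  -> [{"brand": "A"}]    (the products array itself)
```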
You can test `{{ outputs.task_id.body | jq('.products') | first }}` and any other output parsing expression using the built-in expressions evaluator on the **Outputs** page: ![Debug Expression](./eval_expressions.png) ## Passing data between tasks So now our flow is able to handle different API endpoints through an input, extract that API's data as an output, and pass that output to a custom Python script to package the data into a usable CSV file. Let's add another task to process the CSV file generated by the Python script task. We can pass the file from internal storage to the `io.kestra.plugin.jdbc.duckdb.Query` ([DuckDB Plugin](/plugins/plugin-jdbc-duckdb)) task to run a SQL query on the CSV file and store the result as a downloadable artifact in internal storage. ```yaml id: getting_started namespace: company.team inputs: - id: api_url type: STRING defaults: https://dummyjson.com/products tasks: - id: api type: io.kestra.plugin.core.http.Request uri: "{{ inputs.api_url }}" - id: python type: io.kestra.plugin.scripts.python.Script containerImage: python:slim beforeCommands: - pip install polars outputFiles: - "products.csv" script: | import polars as pl data = {{ outputs.api.body | jq('.products') | first }} df = pl.from_dicts(data) df.glimpse() df.select(["brand", "price"]).write_csv("products.csv") - id: sqlQuery type: io.kestra.plugin.jdbc.duckdb.Query inputFiles: in.csv: "{{ outputs.python.outputFiles['products.csv'] }}" sql: | SELECT brand, round(avg(price), 2) as avg_price FROM read_csv_auto('{{ workingDir }}/in.csv', header=True) GROUP BY brand ORDER BY avg_price DESC; store: true ``` This example flow passes data between tasks using Outputs. The `inputFiles` argument of the `io.kestra.plugin.jdbc.duckdb.Query` task allows you to pass files from internal storage to the task. The `store: true` property ensures that the result of the SQL query is stored in the internal storage and can be previewed and downloaded from the Outputs tab. 
![Preview](./preview.png)

This flow extracts data from an API, processes it in a Python script, executes a SQL query, and generates a downloadable artifact.

:::alert{type="info"}
If you encounter any issues while executing the above flow, this might be a Docker-related issue (e.g., a Docker-in-Docker setup, which can be difficult to configure on Windows). Set the task runner type to `io.kestra.plugin.core.runner.Process` to run the Python script task in the same process as the flow rather than in a Docker container, as shown in the example below. This will avoid any Docker-related issues.
:::

```yaml
id: getting_started
namespace: company.team

inputs:
  - id: api_url
    type: STRING
    defaults: https://dummyjson.com/products

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "{{ inputs.api_url }}"

  - id: python
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process # Runs the Python script in the same process as the flow rather than a Docker container
    beforeCommands:
      - pip install polars
    outputFiles:
      - "products.csv"
    script: |
      import polars as pl
      data = {{ outputs.api.body | jq('.products') | first }}
      df = pl.from_dicts(data)
      df.glimpse()
      df.select(["brand", "price"]).write_csv("products.csv")

  - id: sqlQuery
    type: io.kestra.plugin.jdbc.duckdb.Query
    inputFiles:
      in.csv: "{{ outputs.python.outputFiles['products.csv'] }}"
    sql: |
      SELECT brand, round(avg(price), 2) as avg_price
      FROM read_csv_auto('{{ workingDir }}/in.csv', header=True)
      GROUP BY brand
      ORDER BY avg_price DESC;
    store: true
```

Outputs can be used in many ways to connect tasks and flows. To learn more about Outputs, check out the full [Outputs documentation](../../05.workflow-components/06.outputs/index.md).

Next, instead of relying on a manual click, we'll set up the flow to run automatically with a Trigger.

---

# Add Triggers to Automate Kestra Flows

URL: https://kestra.io/docs/tutorial/triggers

> Automate Kestra flows with triggers.
Schedule workflows or trigger them based on events like file uploads, API calls, or other flow completions. Triggers automatically start your flow based on events or a schedule. ## Automate Flows with Triggers A trigger can be a scheduled date, the arrival of a new file, a new message in a queue, the completion of another flow's execution and much more.
## Defining triggers Like `inputs` and `tasks`, use the `triggers` keyword in the flow to define a list of triggers. You can have several triggers attached to a flow. The `trigger` definition is similar to a task definition — it contains an `id`, a `type`, and additional properties specific to the trigger type. To get started, take a look at the flow below. The `schedule_trigger` defines a `cron` expression to run every day at 10 AM. The [Schedule trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md) is great for nightly jobs and other static schedule-oriented workflows. However, Kestra does not limit you to schedule-based orchestration. The workflow below also includes a `flow_trigger` that automatically starts the `getting_started` flow whenever the `first_flow` defined in the conditions finishes executing. In other words, a flow can be triggered by time-based schedules, by events, or by several triggers at once. For example, you can react to a change in a [Google Sheet](/plugins/plugin-googleworkspace/sheets/io.kestra.plugin.googleworkspace.sheets.sheetmodifiedtrigger), a new file in an [S3 bucket](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.trigger), a [PostgreSQL database](/plugins/plugin-jdbc-postgres/io.kestra.plugin.jdbc.postgresql.trigger) query result, or even when an [email is received](/plugins/plugin-email/io.kestra.plugin.email.realtimetrigger) in real time. ```yaml id: getting_started namespace: company.team tasks: - id: hello_world type: io.kestra.plugin.core.log.Log message: Hello World! triggers: - id: schedule_trigger type: io.kestra.plugin.core.trigger.Schedule cron: 0 10 * * * - id: flow_trigger type: io.kestra.plugin.core.trigger.Flow conditions: - type: io.kestra.plugin.core.condition.ExecutionFlow namespace: company.team flowId: first_flow ``` :::alert{type="info"} Schedules default to UTC. To use a different time zone, set the `timezone` property on the `Schedule` trigger (for example, `America/New_York`). 
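A minimal fragment (illustrative):

```yaml
triggers:
  - id: daily_at_10_ny
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 10 * * *
    timezone: America/New_York # IANA time zone identifier
```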
:::

## Add a trigger to your flow

Building on the example flow from the previous pages, we can add one of the above triggers to the flow. For example, the following trigger ensures our flow runs every Monday at 10 AM to get the latest product data.

```yaml
triggers:
  - id: every_monday_at_10_am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 10 * * 1
```

The `getting_started` flow now runs every Monday at 10 AM, starting the week with the latest product data.

```yaml
id: getting_started
namespace: company.team

inputs:
  - id: api_url
    type: STRING
    defaults: https://dummyjson.com/products

tasks:
  - id: api
    type: io.kestra.plugin.core.http.Request
    uri: "{{ inputs.api_url }}"

  - id: python
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:slim
    beforeCommands:
      - pip install polars
    outputFiles:
      - "products.csv"
    script: |
      import polars as pl
      data = {{ outputs.api.body | jq('.products') | first }}
      df = pl.from_dicts(data)
      df.glimpse()
      df.select(["brand", "price"]).write_csv("products.csv")

  - id: sqlQuery
    type: io.kestra.plugin.jdbc.duckdb.Query
    inputFiles:
      in.csv: "{{ outputs.python.outputFiles['products.csv'] }}"
    sql: |
      SELECT brand, round(avg(price), 2) as avg_price
      FROM read_csv_auto('{{ workingDir }}/in.csv', header=True)
      GROUP BY brand
      ORDER BY avg_price DESC;
    store: true

triggers:
  - id: every_monday_at_10_am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: 0 10 * * 1
```

With a trigger added to a flow, you can now see the trigger's details in the flow's **Triggers** tab.

![Flow Triggers Tab](./flow-triggers.png)

To learn more about Triggers, check out the full [Triggers documentation](../../05.workflow-components/07.triggers/index.mdx).

Next up, we'll check out Flowable tasks – ways to loop, condition, and parallelize tasks.

---

# Kestra UI Guide: Dashboards, Flows & Logs

URL: https://kestra.io/docs/ui

> Navigate the Kestra User Interface. A guide to the main UI components including Flows, Executions, Logs, and Administration pages.
import ChildCard from "~/components/docs/ChildCard.astro" Kestra comes with a rich web user interface located by default on port 8080. When you first navigate to the Kestra UI, you will see the **Welcome** page. ![Kestra User Interface Welcome Page](./01-Welcome.png) On this page, click on **Start Product Tour** to open the Kestra **Guided Tour**, which will guide you through creating and executing your first flow step by step. On the left menu, you will see the following UI pages: - The **Dashboards** page contains dashboards of visualizations for flow execution data and other metrics. - The **Flows** page shows you all of your flows, where you can create, edit, and execute them. - The **Executions** page provides a view to inspect and manage previous executions. - The **Logs** page gives access to all task logs from previous executions. - The **Namespaces** page lists all available namespaces and allows specific configurations to be set at the namespace level. - The **Blueprints** page provides a catalog of ready-to-use flow examples. - The **Plugins** page provides a catalog of plugins you can use inside your flows. - The **Tenant** page provides a system overview, the full KV Store, a list of all triggers, and concurrency limits. The [Kestra Enterprise Edition](../oss-vs-paid/index.md) comes with additional functionalities provided by the Kestra UI: - The **Apps** page takes you to your list of Apps and is also where you create new Apps. - The **Tests** page takes you to your flow unit tests where you can view, edit, and create test assertions against your flows without creating executions. - The **Assets** page lets you manage reusable assets available to your flows and apps. - The **Tenant** page provides a system overview, KV Store, secrets and credentials lists, a list of all triggers, audit logs, concurrency limits, the Apps Catalog, and IAM. 
- The **Instance** page includes sections for Services, Versioned Plugins, tenant management, Worker Groups, Kill Switch, and Announcements.

--- # Blueprints in the Kestra UI – Find and Use Templates URL: https://kestra.io/docs/ui/blueprints > Explore Blueprints in the Kestra UI. Discover and use pre-built workflow templates to kickstart your orchestration projects quickly. Ready-to-use examples designed to kickstart your workflow.
Blueprints are a curated, organized, and searchable catalog of ready-to-use examples designed to help you kickstart your workflow. Each Blueprint combines code and documentation and can be assigned several tags for organization and discoverability. All Blueprints are validated and documented. You can customize and integrate them into your flows with a single click on the **Use** button. ![Blueprint](../../06.concepts/07.blueprints/blueprints.png) ## Custom blueprints :::badge{editions="EE,Cloud"} ::: You can also create custom blueprints shared within your organization. Custom blueprints promote reusability and consistency across an organization or team. :::alert{type="info"} Custom blueprints require the [Enterprise Edition](../../07.enterprise/index.mdx). ::: ![Custom Blueprints](./custom-blueprint.png) Check the [Blueprints documentation](../../06.concepts/07.blueprints/index.md) for more details. --- # Bookmarks in the Kestra UI – Star and Revisit Pages URL: https://kestra.io/docs/ui/bookmarks > Use Bookmarks in the Kestra UI. Star frequently used pages for quick access and organize your workflow navigation efficiently. Quickly save and access your favorite pages by starring them for instant retrieval. The bookmark feature allows you to star any page, instantly adding it to the Starred tab located in the left panel. The star is located near the top of the page next to the page name. You can rename or remove bookmarks directly from the left panel. ![bookmark](./bookmarks.png) --- # Dashboards in Kestra UI: Monitor Executions URL: https://kestra.io/docs/ui/dashboard > Monitor workflows with Kestra Dashboards. Visualize execution metrics, create custom charts, and track performance indicators in the UI. Get insights into your workflows with Dashboards. The first time you access the main **Dashboard**, you'll see the **Welcome Page** and you can click **Create my first flow** to launch a Guided Tour. 
Once you have executed a flow, you will see your flow executions in the dashboard.

## Dashboard page

The Dashboard page displays both the **default dashboard** and any **custom dashboards** you've created. To switch between dashboards, use the hamburger menu. If you have over 10 dashboards, type the dashboard name in the search bar to quickly find it. The same menu also lets you edit or delete existing dashboards.

From your dashboard, you can apply and save filters, refresh data, and set an automatic periodic refresh.

![Dashboard Main Page](./main_page.png)

Dashboards display the following data:

- Executions over time
- Execution status for today, yesterday, and the last 30 days
- Executions per namespace
- Execution errors per namespace
- List of failed Executions
- List of error logs
- A ratio of execution successes to total executions

## Custom dashboards

Dashboards let you define custom queries and charts to visualize data on your executions, logs, and metrics. Rather than relying only on the default dashboard on Kestra's home screen, you can create a custom dashboard with charts that answer specific questions and track key metrics. Each chart's configuration can be modified individually using the pencil icon in the dashboard view.

:::alert{type="info"}
See the [No-Code Dashboards guide](../../no-code/02.no-code-dashboards/index.md) for building dashboards without writing YAML.
:::

### Chart types

Dashboards support six chart types: **Bar**, **Pie**, **TimeSeries**, **Table**, **KPI**, and **Markdown**. Each data chart type is composed of `chartOptions` and `data`.

A chart's `chartOptions` property is where you customize display names and descriptions, and choose whether to add legends and tooltips to complement the visualization.
A chart's `data` property is where you specify which Kestra data to use as a column, how you want the data displayed (e.g., an aggregate count or an `ORDER BY`), and add any [filters](#querying-data) you might want applied to the chart (e.g., REGEX match, greater or less than, or not Null). Each chart's options are listed in the [Chart Plugin Documentation](/plugins/core/chart) where you can dive further into the properties of each type. #### Common chart properties All chart types share the following `chartOptions` properties: | Property | Required | Default | Description | | --- | --- | --- | --- | | `displayName` | Yes | — | The title displayed on the chart | | `description` | No | — | An optional subtitle or description | | `width` | No | `6` | Width of the chart on a 12-column grid (1–12) | #### Bar chart `type: io.kestra.plugin.core.dashboard.chart.Bar` Compares categorical data across groups. Requires exactly one aggregation column. Additional `chartOptions` properties: | Property | Required | Default | Description | | --- | --- | --- | --- | | `column` | Yes | — | The data column to use as the x-axis categories | | `legend.enabled` | No | `true` | Show or hide the legend | | `tooltip` | No | `ALL` | Tooltip display behavior: `NONE`, `ALL`, or `SINGLE` | ```yaml charts: - id: executions_per_namespace_bars type: io.kestra.plugin.core.dashboard.chart.Bar chartOptions: displayName: Executions per Namespace description: Execution count per namespace column: namespace legend: enabled: true data: type: io.kestra.plugin.core.dashboard.data.Executions columns: namespace: field: NAMESPACE state: field: STATE total: displayName: Executions agg: COUNT ``` #### Pie chart `type: io.kestra.plugin.core.dashboard.chart.Pie` Shows proportions and distributions. Requires exactly one aggregation column. 
Additional `chartOptions` properties: | Property | Required | Default | Description | | --- | --- | --- | --- | | `graphStyle` | No | `DONUT` | Chart style: `PIE` or `DONUT` | | `colorByColumn` | No | — | The column whose values determine segment colors | | `legend.enabled` | No | `true` | Show or hide the legend | | `tooltip` | No | `ALL` | Tooltip display behavior: `NONE`, `ALL`, or `SINGLE` | ```yaml charts: - id: executions_pie type: io.kestra.plugin.core.dashboard.chart.Pie chartOptions: displayName: Total Executions description: Total executions per state graphStyle: DONUT colorByColumn: state legend: enabled: true data: type: io.kestra.plugin.core.dashboard.data.Executions columns: state: field: STATE total: agg: COUNT ``` #### TimeSeries chart `type: io.kestra.plugin.core.dashboard.chart.TimeSeries` Tracks trends over time. Requires between one and two aggregation columns. Additional `chartOptions` properties: | Property | Required | Default | Description | | --- | --- | --- | --- | | `column` | Yes | — | The data column to use as the time (x) axis | | `colorByColumn` | No | — | The column whose values determine series colors | | `legend.enabled` | No | `true` | Show or hide the legend | | `tooltip` | No | `ALL` | Tooltip display behavior: `NONE`, `ALL`, or `SINGLE` | The `graphStyle` property can be set per column in `data.columns` to control how each series is rendered: `LINES`, `BARS`, or `POINTS`. It defaults to `LINES` when an aggregation is set. 
```yaml charts: - id: executions_timeseries type: io.kestra.plugin.core.dashboard.chart.TimeSeries chartOptions: displayName: Executions description: Executions last week column: date colorByColumn: state legend: enabled: true data: type: io.kestra.plugin.core.dashboard.data.Executions columns: date: field: START_DATE displayName: Date state: field: STATE total: displayName: Executions agg: COUNT graphStyle: BARS duration: displayName: Duration field: DURATION agg: SUM graphStyle: LINES ``` #### KPI chart `type: io.kestra.plugin.core.dashboard.chart.KPI` Displays a single key performance indicator value. Requires exactly one aggregation column. Use `ExecutionsKPI`, `FlowsKPI`, `LogsKPI`, or `MetricsKPI` as the data type for KPI charts. To display a ratio (e.g., success rate), use the `numerator` property to filter the subset of rows that count toward the numerator. All rows matching the chart's `where` clause form the denominator. Additional `chartOptions` properties: | Property | Required | Default | Description | | --- | --- | --- | --- | | `numberType` | No | `FLAT` | Display format: `FLAT` (raw count) or `PERCENTAGE` | ```yaml charts: - id: kpi_success_percentage type: io.kestra.plugin.core.dashboard.chart.KPI chartOptions: displayName: Success Ratio numberType: PERCENTAGE width: 3 data: type: io.kestra.plugin.core.dashboard.data.ExecutionsKPI columns: field: FLOW_ID agg: COUNT numerator: - field: STATE type: IN values: - SUCCESS where: - field: NAMESPACE type: EQUAL_TO value: "company.team" ``` #### Table `type: io.kestra.plugin.core.dashboard.chart.Table` Displays structured data in a sortable, paginated table. 
Additional `chartOptions` properties: | Property | Required | Default | Description | | --- | --- | --- | --- | | `header.enabled` | No | `true` | Show or hide the table header row | | `pagination.enabled` | No | `true` | Show or hide table pagination controls | Column-level properties unique to tables: | Property | Required | Default | Description | | --- | --- | --- | --- | | `columnAlignment` | No | `LEFT` | Text alignment within the column: `LEFT`, `RIGHT`, or `CENTER` | ```yaml charts: - id: table_metrics type: io.kestra.plugin.core.dashboard.chart.Table chartOptions: displayName: Sum of sales per namespace data: type: io.kestra.plugin.core.dashboard.data.Metrics columns: namespace: field: NAMESPACE value: field: VALUE agg: SUM columnAlignment: RIGHT where: - field: NAME type: EQUAL_TO value: sales_count orderBy: - column: value order: DESC ``` #### Markdown `type: io.kestra.plugin.core.dashboard.chart.Markdown` Adds explanatory text or context alongside data charts. No `data` property is required. The content of a Markdown chart is set via the `source` property. Two source types are available: **Text** — inline Markdown content: ```yaml charts: - id: markdown_insight type: io.kestra.plugin.core.dashboard.chart.Markdown chartOptions: displayName: Chart Insights description: How to interpret this chart source: type: Text content: | ## Execution Success Rate This chart displays the percentage of successful executions over time. - A **higher success rate** indicates stable and reliable workflows. - Sudden **drops** may signal issues in task execution or external dependencies. ``` **FlowDescription** — pulls the description from a specific flow: ```yaml charts: - id: markdown_flow_desc type: io.kestra.plugin.core.dashboard.chart.Markdown chartOptions: displayName: Flow Overview source: type: FlowDescription namespace: company.team flowId: my_flow ``` :::alert{type="info"} The `content` shorthand (used in earlier examples) sets plain text content directly. 
The `source` property gives you access to the `FlowDescription` type to pull dynamic content from a flow's description field. ::: ## Create a new custom dashboard as code Clicking on the `+ Create new dashboard` button opens a code editor where you can define the dashboard layout and data sources as code. The top-level dashboard properties are: | Property | Description | | --- | --- | | `title` | Dashboard title | | `description` | Optional description | | `timeWindow.default` | Default time range, as an ISO 8601 duration (e.g., `P7D`) | | `timeWindow.max` | Maximum selectable time range (e.g., `P365D`) | | `charts` | List of chart definitions | Below is an example of a dashboard definition that displays executions over time, flow execution success ratio, a table that uses metrics to display the sum of sales per namespace, a table that shows the log count by level per namespace, and a Markdown insights panel: :::collapse{title="Expand for an example dashboard definition"} ```yaml title: Getting Started description: First custom dashboard timeWindow: default: P7D max: P365D charts: - id: executions_timeseries type: io.kestra.plugin.core.dashboard.chart.TimeSeries chartOptions: displayName: Executions description: Executions last week legend: enabled: true column: date colorByColumn: state data: type: io.kestra.plugin.core.dashboard.data.Executions columns: date: field: START_DATE displayName: Date state: field: STATE total: displayName: Executions agg: COUNT graphStyle: BARS duration: displayName: Duration field: DURATION agg: SUM graphStyle: LINES - id: kpi_success_percentage type: io.kestra.plugin.core.dashboard.chart.KPI chartOptions: displayName: Success Ratio numberType: PERCENTAGE width: 3 data: type: io.kestra.plugin.core.dashboard.data.ExecutionsKPI columns: field: FLOW_ID agg: COUNT numerator: - field: STATE type: IN values: - SUCCESS where: - field: NAMESPACE type: EQUAL_TO value: "company.team" - id: table_metrics type: 
io.kestra.plugin.core.dashboard.chart.Table chartOptions: displayName: Sum of sales per namespace data: type: io.kestra.plugin.core.dashboard.data.Metrics columns: namespace: field: NAMESPACE value: field: VALUE agg: SUM where: - field: NAME type: EQUAL_TO value: sales_count - field: NAMESPACE type: IN values: - dev_graph - prod_graph orderBy: - column: value order: DESC - id: table_logs type: io.kestra.plugin.core.dashboard.chart.Table chartOptions: displayName: Log count by level for filtered namespace data: type: io.kestra.plugin.core.dashboard.data.Logs columns: level: field: LEVEL count: agg: COUNT where: - field: NAMESPACE type: IN values: - dev_graph - prod_graph - id: markdown type: io.kestra.plugin.core.dashboard.chart.Markdown chartOptions: displayName: Chart Insights description: How to interpret this chart source: type: Text content: | ## Execution Success Rate This chart displays the percentage of successful executions over time. - A **higher success rate** indicates stable and reliable workflows. - Sudden **drops** may signal issues in task execution or external dependencies. - Use this insight to identify trends and optimize performance. ``` ::: :::alert{type="info"} To see all available properties to configure a custom dashboard as code, see examples provided in the [Enterprise Edition Examples](https://github.com/kestra-io/enterprise-edition-examples) repository. ::: ## Exporting data Table data can be exported as a CSV file by hovering over the top-right corner and clicking the download icon. This enables dashboard users to build custom queries in Dashboards and to export data with one click without having to worry about pagination. ![Dashboard Table Export](./dashboard-table-export.png) ## Querying data The `data` property of a chart defines the type of data that is queried and displayed. The `type` determines which columns are available. 
### Data source types Dashboards can query data from these source `types`: | Type | Description | | --- | --- | | `io.kestra.plugin.core.dashboard.data.Executions` | Workflow execution data | | `io.kestra.plugin.core.dashboard.data.ExecutionsKPI` | Execution data for KPI charts (supports `numerator`) | | `io.kestra.plugin.core.dashboard.data.Flows` | Flow definition data | | `io.kestra.plugin.core.dashboard.data.FlowsKPI` | Flow data for KPI charts (supports `numerator`) | | `io.kestra.plugin.core.dashboard.data.Logs` | Log entries produced by your workflows | | `io.kestra.plugin.core.dashboard.data.LogsKPI` | Log data for KPI charts (supports `numerator`) | | `io.kestra.plugin.core.dashboard.data.Metrics` | Metrics emitted by your plugins | | `io.kestra.plugin.core.dashboard.data.MetricsKPI` | Metrics data for KPI charts (supports `numerator`) | | `io.kestra.plugin.core.dashboard.data.Triggers` | Trigger state and scheduling data | ### Available fields by data source After defining the data source, specify the columns to display in the chart. Each column is defined by its `field`. 
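For example, here is a minimal sketch of a `data` block that combines a raw field with an aggregation; the column keys `env` and `total` are arbitrary names chosen for illustration, and `labelKey` narrows the `LABELS` field to a single label key as described under Column properties:

```yaml
data:
  type: io.kestra.plugin.core.dashboard.data.Executions
  columns:
    env:
      field: LABELS
      labelKey: environment   # keep only the `environment` label key
      displayName: Environment
    total:
      displayName: Executions
      agg: COUNT
```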
The fields available depend on the data source type: **Executions / ExecutionsKPI:** | Field | Description | | --- | --- | | `ID` | Execution ID | | `NAMESPACE` | Namespace of the flow | | `FLOW_ID` | Flow identifier | | `FLOW_REVISION` | Flow revision number | | `STATE` | Execution state (e.g., `SUCCESS`, `FAILED`) | | `DURATION` | Execution duration | | `LABELS` | Key-value labels attached to the execution | | `START_DATE` | Execution start timestamp | | `END_DATE` | Execution end timestamp | | `TRIGGER_EXECUTION_ID` | ID of the execution that triggered this one | | `SCOPE` | Execution scope | **Flows / FlowsKPI:** | Field | Description | | --- | --- | | `ID` | Flow identifier | | `NAMESPACE` | Namespace of the flow | | `REVISION` | Flow revision number | **Logs / LogsKPI:** | Field | Description | | --- | --- | | `NAMESPACE` | Namespace of the flow | | `FLOW_ID` | Flow identifier | | `EXECUTION_ID` | Associated execution ID | | `TASK_ID` | Task that produced the log | | `DATE` | Log timestamp | | `TASK_RUN_ID` | Task run identifier | | `ATTEMPT_NUMBER` | Task attempt number | | `TRIGGER_ID` | Trigger identifier | | `LEVEL` | Log level (e.g., `INFO`, `WARN`, `ERROR`) | | `MESSAGE` | Log message text (cannot be aggregated) | **Metrics / MetricsKPI:** | Field | Description | | --- | --- | | `NAMESPACE` | Namespace of the flow | | `FLOW_ID` | Flow identifier | | `TASK_ID` | Task that emitted the metric | | `EXECUTION_ID` | Associated execution ID | | `TASK_RUN_ID` | Task run identifier | | `TYPE` | Metric type | | `NAME` | Metric name | | `VALUE` | Metric value | | `DATE` | Metric timestamp | **Triggers:** | Field | Description | | --- | --- | | `ID` | Trigger identifier | | `NAMESPACE` | Namespace of the flow | | `FLOW_ID` | Flow identifier | | `TRIGGER_ID` | Trigger identifier within the flow | | `EXECUTION_ID` | Last execution ID triggered | | `NEXT_EXECUTION_DATE` | Scheduled next execution date | | `WORKER_ID` | Worker handling the trigger | ### Column 
properties Each entry in `data.columns` supports the following properties: | Property | Description | | --- | --- | | `field` | The only required property; specifies which field from the data source to use | | `displayName` | Sets the label displayed in the chart | | `agg` | Aggregation function: `AVG`, `COUNT`, `MAX`, `MIN`, or `SUM` | | `graphStyle` | Series render style for TimeSeries charts: `LINES`, `BARS`, or `POINTS` (defaults to `LINES` when `agg` is set) | | `columnAlignment` | Column text alignment for Table charts: `LEFT`, `RIGHT`, or `CENTER` | | `labelKey` | When `field: LABELS`, filters to a specific [label](../../05.workflow-components/08.labels/index.md) key | ### Filtering data Use the `where` property to filter the result set before it is displayed. Filters can apply to any field in the data source. Multiple conditions in `where` are combined with `AND` by default. To use `OR` logic, set `type: OR` on a condition. Available filter types: - `CONTAINS` - `ENDS_WITH` - `EQUAL_TO` - `GREATER_THAN` - `GREATER_THAN_OR_EQUAL_TO` - `IN` - `IS_FALSE` - `IS_NOT_NULL` - `IS_NULL` - `IS_TRUE` - `LESS_THAN` - `LESS_THAN_OR_EQUAL_TO` - `NOT_EQUAL_TO` - `NOT_IN` - `OR` - `REGEX` - `STARTS_WITH` --- # Executions in the Kestra UI – Inspect and Manage Runs URL: https://kestra.io/docs/ui/executions > Inspect flow runs in the Kestra UI. Track execution status, view logs, analyze outputs, and manage tasks via Gantt and Topology views. Inspect and manage flow executions. On the **Executions** page, you see a list of all your flow executions. You can select multiple checkboxes to choose executions for bulk actions, such as Restart, Kill, Pause, or Force Run. Alternatively, you can click an execution ID or the magnifying glass icon to open an execution for further examination.
![Kestra User Interface Executions Page](./executions-overview.png) ## Overview An **Execution's Overview** page displays the details of a flow execution, organized into the following sections. For reference, below is an example flow and its **Execution Overview**. ```yaml id: conditionallyReturnOutputs namespace: company.team labels: - key: environment value: dev - key: owner value: data-team variables: description: This is a demo flow version: 1.0.0 inputs: - id: runTask type: BOOLEAN defaults: true tasks: - id: taskA runIf: "{{ inputs.runTask }}" type: io.kestra.plugin.core.debug.Return format: Hello World! - id: taskB type: io.kestra.plugin.core.debug.Return format: Fallback output outputs: - id: flowOutput type: STRING value: "{{ tasks.taskA.state != 'SKIPPED' ? outputs.taskA.value : outputs.taskB.value }}" triggers: - id: every_minute_schedule type: io.kestra.plugin.core.trigger.Schedule cron: "* * * * *" ``` ![Kestra User Interface Execution Page](./execution-results-overview.png) From the **Overview** tab, you can: - Set Labels: give a label to the execution for tracking or filtering. - Change State: change the execution state. - Force Run: force the execution to run. This may create duplicate task executions — use with caution. The **Previous and Next Execution** buttons navigate you through past and future (if there's a trigger) flow executions. - Execution **state** is displayed along with a timestamped state history from `CREATED` to `RUNNING` to `SUCCESS` (or any other possible state). - Flow [Variables](../../05.workflow-components/04.variables/index.md) and [Inputs](../../05.workflow-components/05.inputs/index.md) are clearly listed along with execution details including dates and the corresponding namespace and flow. - Flow outputs and trigger data are captured with expression rendering.
From the **Overview** page, you can also take actions such as [**Replay**](../../06.concepts/10.replay/index.md) or **Pause**, and view executions over time to compare previous runs. ## Filters From the main Executions page, you can filter the displayed executions on fields like namespace, flowId, labels, state, startDate, open text, and more. You can save applied filters and export the data all from the UI. The following video demonstrates the filters in action:
## Gantt The **Gantt** tab visualizes each task's duration. From this interface, you can replay a specific task, see task source code, change task status, or look at task metrics and outputs. ![Kestra User Interface Execution Gantt](./execution-gantt-view.png) The **Gantt** view displays all successful and failed tasks in the execution. For failed tasks, you can open the task and click the three dots to **"Fix with AI"**. This option reopens the flow editor with the [AI Copilot](../../ai-tools/ai-copilot/index.md) prompted to help resolve any issues with the task. ![Fix with AI](../../ai-tools/ai-copilot/fix-with-ai-gantt.png) ## Logs The **Logs** tab gives access to a task's logs. You can filter by log level, copy logs into your clipboard, or download logs as a file. Logs can be viewed per task in the **Default View** or temporally based on timestamp in the **Temporal View**. ![Kestra User Interface Execution Log](./execution-logs-view.png) For failed tasks, click the three dots to **"Fix with AI"**. This option reopens the flow editor with the [AI Copilot](../../ai-tools/ai-copilot/index.md) prompted to help resolve any issues with the task. ![Fix with AI](./fix-with-ai-logs.png) ## Topology Similar to the Editor view, you can see your execution's topology. **Topology** provides a graphical view to access specific task logs, replay certain tasks, or change task status. Tasks' state progression is shown and updated as the status changes. For example, green indicates a task has reached **SUCCESS** while red indicates **FAILED**. ![Kestra User Interface Execution Topology](./execution-topology-view.png) From a **FAILED** task, click the magnifying glass icon to open the logs and read the error message, investigate, and **"Fix with AI"** if you have [AI Copilot](../../ai-tools/ai-copilot/index.md) configured. ## Outputs The **Outputs** tab presents the execution's generated outputs. 
All tasks and their corresponding outputs are accessible from this page for examination and debugging. Outputs could be results or variables to pass to downstream tasks, or files to download or pass downstream as a URI for processing. The example below downloads an output file generated from a SQL query.
The **Debug Expression** button lets you evaluate [expressions](../../expressions/index.mdx) against task outputs to verify they match what you expect. Select a task first to enable it. ![Kestra User Interface Execution Debug Expression](./execution-debug-expression.png) ## Metrics The **Metrics** tab shows every metric exposed by tasks after execution. For example, a [BigQuery load task](/plugins/plugin-gcp/bigquery/io.kestra.plugin.gcp.bigquery.load) might show the number of input files, the rows inserted, and how long the operation took to complete. As another example, a flow using an AI plugin shows token usage as a metric for the task. ![Kestra User Interface Execution Metric](./execution-metrics-view.png) ## Dependencies
The **Dependencies** tab shows the dependency relationships between the selected execution and other flows. It also displays extra execution metadata such as state. ![Execution Dependencies](./executions-dependencies-1-0.png) --- # Flows in the Kestra UI – Browse, Edit, Execute URL: https://kestra.io/docs/ui/flows > Manage flows in the Kestra UI. Browse, edit, and execute workflows using the code editor, topology view, and version history tools. Manage your flows in one place. On the **Flows** page, you see a list of flows that you can edit and execute. You can also create a new flow in the top-right corner. Click a flow ID or the eye icon to open a flow. ![Kestra User Interface Flows Page](./04-Flows.png) A **Flow** page has multiple tabs that allow you to see the flow topology, view all flow executions, edit the flow, and view its revisions, logs, metrics, and dependencies. You can also edit namespace files in the Flow editor. ![Kestra User Interface Flow Page](./05-Flows-Flow.png) ## Filters From the main Flows page, you can filter the displayed flows on fields like namespace, scope, labels, and open text. The filters are key-based, with comma-separated OR-conditions and space-separated AND-conditions. The following video demonstrates the filters in action:
## Edit The Edit interface provides a rich view of your workflow, as well as Namespace Files. The editor allows you to add multiple panels: - Flow code - No-code - Topology - Documentation - Files - Blueprints Additionally, from the **Actions** menu, you can export your flow as a YAML file, delete your flow, or copy it.
### Flow code view The **Flow** code view allows you to edit your workflows with YAML. Autocomplete is available as you write. As new tasks are added, they automatically appear in the No-code and topology views. ![Flow Code](./flow-editor.png) ### No-code view The **No-code** view allows you to edit your workflows directly from the UI. As you modify your flow, YAML code is generated in real time in the flow code view, and you can switch between both views at any time. ![No-code](./no-code-editor.png) ### Topology view The **Topology** view allows you to visualize the structure of your flow. This is especially useful when you have complex flows with multiple branches of logic. From the bottom left corner of the Topology view, you can zoom in, zoom out, and export your flow topology as a `.png` file. ![Topology](./topology-editor.png) ### Documentation view The **Documentation** view displays Kestra's documentation directly inside the editor. As you move your cursor around the editor, the documentation panel updates to reflect the specific task type documentation. ![Docs](./docs-editor.png) :::alert{type="warning"} If you use the [Brave browser](https://brave.com/), you may need to disable Brave Shields to make the editor work as expected. To view task documentation, set the `Block cookies` option to `Disabled` in Shields settings: `brave://settings/shields`. ![Brave cookies](./brave.png) ::: ### Files view The **Files** view allows you to create, edit, and delete [Namespace Files](../../06.concepts/02.namespace-files/index.md). Multiple files can be opened at the same time, as well as displayed side by side using multiple panels. ![Files](./files-editor.png) ### Blueprints view The **Blueprints** view gives you example flows to copy directly into your flow. Blueprints are especially useful when working with a new plugin, since you can start from a working example.
![Blueprints Editor](./blueprints-editor.png) ### Namespace context (Enterprise) In the **Namespace Context** view, you can directly access your Variables, KV pairs, and Secrets managed at the namespace level. You can also render expressions that fall within those categories. ![Namespace Context](./namespace-context.png) ## Revisions You can view the history of your flow code changes under the **Revisions** tab. For more details, see [Revisions](../../06.concepts/03.revision/index.md). ![Revisions](./revisions.png) ## Dependencies
The **Dependencies** page shows the dependency relationships between the selected flow and other flows, and lets you navigate between them. ![Dependencies](./flow-dependencies-1-0.png) :::alert{type="info"} The **Dependencies View** on the **Namespaces** page shows all the flows in the namespace and how they each relate to one another, if at all, whereas the Flow Dependencies view is only for the selected flow. ::: ## JSON Schema usage for flow validation Kestra provides a JSON Schema to validate your flow definitions. This schema ensures that your flows are correctly structured and helps catch errors early in the development process. ### JSON Schema in VSCode To use the JSON Schema in Visual Studio Code (VSCode), follow these steps: 1. Install the [YAML extension](https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml) by Red Hat. 2. Open your VSCode settings (`Ctrl+,` or `Cmd+,`). 3. Search for `YAML: Schemas` and click on `Edit in settings.json`. 4. Add the following configuration to associate the Kestra JSON Schema with your flow files: ```json { "yaml.schemas": { "https://your-kestra-instance.com/api/v1/schemas/flow.json": "/*.yaml" } } ``` Replace `https://your-kestra-instance.com/api/v1/schemas/flow.json` with the actual URL of your Kestra JSON Schema. ### Example of using JSON Schema in flow editor Here is an example of how to use the JSON Schema in the flow editor: ```yaml id: example_flow namespace: example_namespace tasks: - id: example_task type: io.kestra.plugin.core.log.Log message: "Hello, World!" ``` When you open this flow in the editor, the JSON Schema validates the structure and provides autocompletion and error checking. ### Globally available location for JSON Schema The JSON Schema for Kestra flows is available at the following URL: ```plaintext https://your-kestra-instance.com/api/v1/main/schemas/flow.json ``` Replace `https://your-kestra-instance.com` with the actual URL of your Kestra instance.
--- # Logs in Kestra UI: Search and Filter Task Output URL: https://kestra.io/docs/ui/logs > Search and filter logs in the Kestra UI. Access real-time task logs, debug issues, and filter by level, namespace, or execution ID. Manage Logs generated by tasks. On the **Logs** page, you have access to all task logs. Investigate logs with ad-hoc or saved filters, search for known IDs, and refresh data manually or on an automatic schedule. On this page, you can filter by: - Namespace - Log level - Interval - Scope - Trigger ID - Flow ID You can filter the displayed logs with comma-separated OR-conditions and space-separated AND-conditions. The following demo shows the filters in action:
--- # Namespaces in Kestra UI: Manage Resources URL: https://kestra.io/docs/ui/namespaces > Overview of the Namespaces UI in Kestra. Manage flows, files, KV store, and dependencies specific to each Namespace in a central view. Manage all resources associated with a Namespace in one place. The **Namespaces** tab in the UI for Open Source users displays all Namespaces that contain flows in your Kestra instance.
## Interactive demo Explore the Namespace UI through this interactive demo:
## Overview The **Overview** tab is the default landing page of a Namespace. It displays dashboards and summaries of flow executions within that Namespace. ![Overview](./overview-namespaces.png) ## Flows The **Flows** tab lists all flows within the Namespace. It displays key information such as the flow ID, labels, last execution date and status, and execution statistics. Selecting the **details** button on a flow opens its detailed page. ![Flows](./flows-namespaces.png) ## Dependencies The **Dependencies** tab visualizes relationships between flows, showing which flows depend on one another (for example, through subflows or flow triggers). This view is similar to the **Dependencies** page in the Flow Editor but focuses on inter-flow relationships within a single Namespace — even if some flows are independent. ![Dependencies](./dependencies-namespaces.png) ## KV store The **KV Store** tab lets you manage key-value pairs associated with a Namespace. For more information, see the [KV Store concept guide](../../06.concepts/05.kv-store/index.md). ![KV Store](./kvstore-namespaces.png)
## Files The **Files** tab lets you create, edit, and manage Namespace Files used in your flows — from custom Python scripts to images. Learn more in [Namespace Files](../../06.concepts/02.namespace-files/index.md). ![Namespace Files](./namespace-files-tab.png) ## Additional enterprise pages In the [Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md), additional Namespace pages provide deeper insights and management capabilities. Learn more on the [Enterprise Namespace Management page](../../07.enterprise/02.governance/07.namespace-management/index.md). --- # Playground in Kestra UI: Build Flows Task by Task URL: https://kestra.io/docs/ui/playground > Experiment in the Kestra Playground. Build and test tasks iteratively in the UI to debug and refine workflows without full execution. Iteratively build and test flows task by task without running the entire workflow.
## Playground The **Playground mode** in Kestra allows you to build workflows iteratively, one task at a time. This feature is especially useful when building data processing flows, where you typically start with a task extracting data, and you need to inspect the output before knowing what kind of transformation might be required. Then, you can work on that transformation task without rerunning the extraction task. If you've ever worked with a [Jupyter](https://jupyter.org/) notebook, you might be familiar with this pattern: you run the first cell to extract data, then you run the second cell to transform that data, and you can rerun the second cell multiple times to test different transformations without having to rerun the first cell again. Kestra's Playground mode allows you to do the same within your flows. ## Use Playground mode To use Playground mode: 1. Enable the Playground mode. 2. Add a task to your flow and hit **Play** to run it. 3. Add a second task and hit **Play** to run it, reusing the output of the first task. 4. Modify the second task and hit **Play** again to rerun only the second task. 5. Add a third task and hit **Play** to run it, reusing the outputs of the first and second tasks. 6. Keep iterating by adding more tasks and running them individually, or click on **Run all tasks** or **Run all downstream tasks** options to run multiple tasks at once. Kestra tracks up to 10 recent playground runs, so you can go back to inspect the outputs of previously executed tasks. Older runs are purged automatically. Playground runs won't appear in the regular execution list to avoid confusion with production executions. Playground mode requires a DAG (Directed Acyclic Graph) structure, so you cannot run a task before its upstream tasks have been played. If you change flow-level `inputs`, `variables`, `pluginDefaults`, or `outputs` properties while in Playground mode, existing task runs are automatically reset and must be rerun. 
Kestra resets them to ensure that task outputs remain consistent with the flow-level properties. To see Playground in action, check out the demo below.
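The extract-then-transform pattern described above can be sketched as a minimal flow; the flow id, namespace, and download URL below are illustrative placeholders, not taken from this page:

```yaml
id: playground_demo          # illustrative id
namespace: company.team

tasks:
  # Play this task once to fetch the data
  - id: extract
    type: io.kestra.plugin.core.http.Download
    uri: https://example.com/data.csv   # placeholder URL

  # Iterate on this task alone; Playground reuses the stored
  # output of `extract` without re-downloading the file
  - id: transform
    type: io.kestra.plugin.core.log.Log
    message: "Transforming {{ outputs.extract.uri }}"
```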
--- # Settings in Kestra UI: Themes, Timezone & Editor URL: https://kestra.io/docs/ui/settings > Customize the Kestra UI. Configure themes, editor preferences, time zones, and default settings to personalize your user experience. Configure Settings for Kestra. **Settings** are accessible from the bottom left environment menu. All configuration options are per-user. ![Kestra User Interface Settings Page](./settings.png) ## Main configuration Options you can configure under **Main Configuration** include: - **Default Namespace**: e.g., `company.team` - by default, this is empty. Once set, this will be the default namespace when creating a new flow (otherwise `company.team` is used as a placeholder). Also, when navigating to the Flows or Executions pages, it will filter for this default namespace. - **Default Editor Type**: e.g., "YAML Editor" or "No Code Editor" - **Default Log Level**: e.g., `TRACE` - **Default Log Display**: Expand all, Collapse all, or Expand only failed tasks - **Execute the Flow**: In the same tab or in a new tab - **Default Execution Tab**: Sets which Execution tab you are directed to (e.g., Gantt, Logs, Outputs, etc.) after executing a flow. - **Default Flow Tab**: Sets which flow tab opens by default when you click a flow (e.g., Overview, Topology, Edit, etc.) ## Theme preferences Kestra supports both Light and Dark mode. You can also configure the Editor independently in Light or Dark mode. In addition, you can adjust the Editor font size and family. There's also the option to change the environment name and color to help you identify if you have multiple Kestra instances, for example a `dev` and `prod` environment.
Below is a detailed list of the Theme Preferences you can configure: - **Theme Mode**: Dark or Light - **Chart Color Scheme**: Classic (red-green) or Kestra (pink-purple) - **Editor Theme**: Dark or Light - **Editor Font Size**: any integer, e.g., `12` - **Editor Font Family**: one of the following: - Source Code Pro - Courier - Times New Roman - Book Antiqua - Times New Roman Arabic - SimSun - **Automatic Code Folding in the Editor**: a toggle, off by default - **Environment Name**: e.g., dev, staging, prod - **Environment Color**: select a color from the color picker ## Language and region - **Language**: English, German, Spanish, French, Hindi, Italian, Japanese, Korean, Polish, Portuguese, Russian, or Chinese - **Time Zone**: e.g., Europe/Berlin (`UTC+02:00`) - **Date Format**: choose one of the following formats: - `2024-09-30T12:44:34+02:00` - `2024-09-30 12:44:34` - `30/09/2024 12:44:34` - `Sep 30, 2024 12:44 PM` - `Mon, Sep 30, 2024 12:44 PM` - `September 30, 2024 12:44 PM` - `Monday, September 30, 2024 12:44 PM` :::alert{type="info"} This setting only affects the UI display. It does not affect [Schedule triggers](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md) or flow execution times, which run on UTC by default. ::: ## Export You can also export all of your flows as a `.zip` file. This allows you to back up your flows or migrate them to another instance of Kestra.
## Explore Kestra use cases – What’s possible with Kestra --- # Automate Manual Approval Processes in Kestra URL: https://kestra.io/docs/use-cases/approval-processes > Integrate Human-in-the-Loop approvals into critical steps of automated workflows Modern automation requires human oversight for critical decisions. Kestra enables integration of manual approval steps within workflows while maintaining audit trails and process consistency. ## Add human approval steps to workflows ## What is Human-in-the-Loop Automation? Human-in-the-loop (HITL) automation combines automated tasks with human decision points. Kestra implements this through: - **Pause/Resume** – Pause workflows for manual inspection before resuming - **Dynamic Inputs** – Collect user decisions during execution - **Approval Chains** – Route decisions to specific users or teams - **Audit Logs** – Track who approved/rejected each request and why. --- ## Why Use Kestra for Human-in-the-Loop Workflows? 1. **Flexible Integration** – Add approval steps to existing workflows in a few lines of YAML 2. **Enterprise Security** – Manage permissions via namespace-level RBAC 3. **Cross-Platform Notifications** – Send approval requests to Slack, Teams, or Email 4. **Input Validation** – Enforce structured responses (Numeric, Boolean, Dates, Dropdowns) 5. **Bulk Actions** – Bulk-resume multiple paused workflows when needed. 6. **Audit Trails** – Track approvals, rejections, and reasons for each decision. 
--- ## Example: Vacation Approval Workflow This workflow demonstrates a complete approval process with Slack notifications and audit logging: ```yaml id: vacation_approval namespace: hr.operations inputs: - id: employee type: STRING required: false - id: start_date type: DATE - id: end_date type: DATE tasks: - id: notify_manager type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_HR_WEBHOOK') }}" payload: | { "channel": "#vacation-approvals", "text": "Review request from {{ inputs.employee ?? labels.system.username }}\n*Dates*: {{ inputs.start_date }} → {{ inputs.end_date }}\nApprove: {{ appLink('appId') }}" } - id: await_decision type: io.kestra.plugin.core.flow.Pause onResume: - id: approved type: BOOLEAN description: Approve this request? - id: reason type: STRING description: Decision notes - id: update_hr_system type: io.kestra.plugin.core.http.Request uri: "{{ kv('HR_API_ENDPOINT') }}/approvals" method: POST contentType: multipart/form-data formData: employee: "{{ inputs.employee ?? labels.system.username }}" approvalStatus: "{{ outputs.await_decision.onResume.approved ? 'APPROVED' : 'REJECTED' }}" notes: "{{ outputs.await_decision.onResume.reason }}" resumedBy: "{{ outputs.await_decision.resumed.by }}" resumedOn: "{{ outputs.await_decision.resumed.on }}" resumedStatus: "{{ outputs.await_decision.resumed.to }}" # by default: SUCCESS - id: log_result type: io.kestra.plugin.core.log.Log message: | Decision: {{ outputs.await_decision.onResume }} ``` --- ## Kestra Features for Human-in-the-Loop Automation ### Structured Inputs for Human Decisions Add approval steps with structured inputs to any workflow: ```yaml - id: await_decision type: io.kestra.plugin.core.flow.Pause onResume: - id: approved type: BOOLEAN displayName: Approve this request? 
- id: reason type: STRING displayName: Decision notes - id: team type: SELECT displayName: Team to review values: - HR - Finance - IT ``` ### Bulk Actions Approve multiple paused workflows simultaneously: ![Bulk Resume](../../15.how-to-guides/pause-resume/pause_resume2.png) ### Audit Trails Audit Logs capture who approved or rejected each request, and the Pause task's outputs contain the user's decision: ```json { "approved": true, "reason": "Within policy limits" } ``` ### Conditional Branching Route next automated tasks based on human decisions: ```yaml - id: handle_rejection type: io.kestra.plugin.core.flow.If condition: "{{ outputs.await_decision.onResume.approved is false }}" then: - id: notify_employee type: io.kestra.plugin.mail.MailSend to: "{{ inputs.employee_email }}" subject: "Request Denied" htmlTextContent: "Reason: {{ outputs.await_decision.onResume.reason }}" ``` --- ## HumanTask: Assign Specific Users for Approval :::badge{version=">=1.1" editions="EE"} ::: For enterprise use cases where specific users or groups must handle approvals, use the `HumanTask` instead of the basic `Pause` task. This ensures only authorized users can resume paused executions. ### Key Benefits of HumanTask - **User-Specific Assignments** – Assign approval tasks to specific users by email - **Group-Based Permissions** – Route approvals to entire RBAC groups - **Access Control** – Prevent unauthorized users from resuming executions - **Auditability** – Track who approved what for audit and compliance. When an unauthorized user attempts to resume a `HumanTask`, they receive an "Access denied to resume this execution" error. 
### Basic HumanTask Example ```yaml id: vm_provisioning_approval namespace: infrastructure inputs: - id: vm_spec type: STRING defaults: "2 vCPU, 4GB RAM, Ubuntu 22.04" tasks: - id: validate_request type: io.kestra.plugin.core.log.Log message: "Validating VM request from {{ labels.system.username }}: {{ inputs.vm_spec }}" - id: it_admin_approval type: io.kestra.plugin.ee.flow.HumanTask assignment: users: - it-admin@company.com - infrastructure-lead@company.com - id: provision_vm type: io.kestra.plugin.core.log.Log message: "Approved by {{ outputs.it_admin_approval.resumed.by }}! Provisioning VM for {{ labels.system.username }}" ``` ### Group-Based Assignment You can assign approvals to entire RBAC groups for team-based Human-in-the-loop workflows: ```yaml - id: security_review type: io.kestra.plugin.ee.flow.HumanTask assignment: groups: - Security Team - DevOps Engineers - Infrastructure Admins ``` ### Combined User and Group Assignment When needed, you can also mix `users` and `groups` to allow both individual users and users from specific RBAC groups to approve the workflow: ```yaml - id: production_deployment_approval type: io.kestra.plugin.ee.flow.HumanTask assignment: users: - platform-lead@company.com - release-manager@company.com groups: - DevOps Team - Site Reliability Engineers ``` --- ## Best practices for long-running workflows Long approvals can take days or weeks. Kestra persists execution state (including `PAUSED` state) in the database, so a paused execution survives server restarts and stays `PAUSED` until you manually resume it via the UI or API. 
### Keep downstream logic in the same flow (simplest and most common pattern) The simplest pattern is to keep the entire downstream logic in the same flow after the Pause task: ```yaml id: pause_demo namespace: demo tasks: - id: initial_logic type: io.kestra.plugin.core.log.Log message: placeholder for tasks with initial logic before the pause - id: wait_for_manual_resume type: io.kestra.plugin.core.flow.Pause onResume: - id: status type: STRING - id: entire_downstream_logic type: io.kestra.plugin.core.log.Log message: can have multiple tasks defined here after the pause task ``` Use this when it’s easier to continue in the same execution after the Pause; for larger systems, consider calling a subflow containing the downstream logic for modularity: ```yaml id: pause_demo_with_subflow namespace: demo tasks: - id: initial_logic type: io.kestra.plugin.core.log.Log message: placeholder for tasks with initial logic before the pause - id: wait_for_manual_resume type: io.kestra.plugin.core.flow.Pause onResume: - id: status type: STRING - id: entire_downstream_logic type: io.kestra.plugin.core.flow.Subflow namespace: demo flowId: downstream_logic_flow ``` ### Pause + Manual resume + Flow trigger pattern 1) Flow that can stay paused as long as needed: ```yaml id: pause_demo namespace: demo tasks: - id: initial_logic type: io.kestra.plugin.core.log.Log message: placeholder for tasks with initial logic before the pause - id: wait_for_manual_resume type: io.kestra.plugin.core.flow.Pause onResume: - id: status type: STRING ``` 2) Resume the paused execution via API when ready: ```bash curl -X POST "http://localhost:28080/api/v1/demo/executions/23F2KgSYm3uCfHJ00DvNxY/resume" \ -H 'accept: application/json' \ -F 'status=OK' -H "Authorization: Bearer your_service_account_api_token" ``` 3) React to the resumed flow’s completion using a Flow trigger (fires on SUCCESS of `pause_demo`): ```yaml id: resume_demo namespace: demo tasks: - id: entire_downstream_logic type: 
io.kestra.plugin.core.log.Log message: can have multiple tasks defined here that should run after the first flow completes triggers: - id: flow type: io.kestra.plugin.core.trigger.Flow preconditions: id: flow1 flows: - flowId: pause_demo namespace: demo states: - SUCCESS ``` Why this is robust: - The `Pause` task can keep the execution in `PAUSED` for weeks; the state is persisted in the DB and survives server restarts. - You resume explicitly from the UI or via API when a decision is made. - Downstream automation is cleanly decoupled and reacts only after the first flow completes successfully. :::alert{type="info"} The Flow trigger cannot react to a `RESUMED` state as this state is not observable. While you could react to the `RESTARTED` state, that state applies not only to executions resumed after a pause but also to executions restarted after a failure, which may not be what you want. Design your Flow trigger to react directly to the `SUCCESS` state as shown above. ::: --- ### Understanding `behavior` vs. Resume/Kill from UI or API The `behavior` property on a `Pause` task applies **only when the pause duration (`pauseDuration`) elapses**. It determines whether the workflow should automatically `CANCEL` or `RESUME` once the timer expires. When you **manually resume or kill** a paused execution through the **UI** or **API**, Kestra does **not** apply the `behavior` property: * When you **resume**, the paused task ends, and the workflow continues to the next task in sequence. * When you **kill**, the entire execution transitions to the `KILLING` state, and all running or paused taskruns are stopped accordingly. Example: ```yaml - id: wait_five_minutes type: io.kestra.plugin.core.flow.Pause behavior: CANCEL pauseDuration: PT5M ``` * If the `pauseDuration` elapses, the task run ends in a `CANCELLED` state, and the execution stops. * If you **resume** manually before that time, the execution continues, ignoring the `behavior` property. 
* If you **kill** manually before that time, the execution moves to the `KILLING` state, ensuring all tasks are stopped. Since Kestra 0.24, there’s [no longer](https://github.com/kestra-io/kestra/pull/12547) the need to add an explicit `Kill` task after the `Pause` task to stop the execution. --- ## Getting Started with Human-in-the-Loop Automation 1. **Install Kestra** – Follow the [quick start guide](../../01.quickstart/index.md) or the full [installation instructions for production environments](../../02.installation/index.mdx). 2. **Write Your Workflows** – Configure your [flow](../../03.tutorial/index.mdx) in YAML. Each automated task can invoke an API, run scripts, or call any existing service. Then, add `Pause` tasks for manual approvals: ```yaml - id: approval_gate type: io.kestra.plugin.core.flow.Pause onResume: - id: signoff type: BOOLEAN required: true ``` 3. **Configure Notifications** – Use Slack, Teams, or Email plugins to notify users about pending approvals: ```yaml - id: alert type: io.kestra.plugin.teams.TeamsIncomingWebhook url: "{{ secret('TEAMS_WEBHOOK') }}" payload: | { "text": "The process {{ flow.id }} is pending approval {{ appLink() }}" } ``` 4. **Add Triggers** – Use scheduled or event-based [triggers](../../05.workflow-components/07.triggers/index.mdx) to launch workflows. 5. **Observe and Manage** – Use [Kestra’s UI](../../09.ui/index.mdx) to monitor states, logs, outputs, and metrics. Correct and replay failed workflow executions or roll back to a previous revision when needed. 
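For step 4 above, a simple cron schedule is often all that's needed to launch the approval workflow on a recurring basis. A minimal sketch (the trigger id and cron expression are illustrative):

```yaml
triggers:
  - id: weekday_mornings
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * 1-5" # weekdays at 9 AM
```

Event-based alternatives follow the same shape: add a `triggers` block to the flow, and Kestra starts a new execution each time the trigger fires.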
--- ## Next Steps - [Explore notification plugins](https://kestra.io/plugins) for Slack, Teams, Email and more - [Check How-to Guides](../../15.how-to-guides/pause-resume/index.md) for detailed examples of approval workflows - [Explore video tutorials](https://www.youtube.com/@kestra-io) on our YouTube channel - [Join Slack](https://kestra.io/slack) to ask questions, contribute code, or share feature requests - [Book a demo](https://kestra.io/demo) to discuss how Kestra can help automate your approval processes. --- # Orchestrate Data Pipelines in Kestra URL: https://kestra.io/docs/use-cases/data-pipelines > Schedule, backfill, and scale data pipelines declaratively with Kestra. Orchestrate ETL, ELT, and analytics workflows from ingestion to delivery. Data teams use orchestration platforms like Kestra to manage complex pipelines — ingest raw data, transform it, and deliver it to data warehouses, lakes, and user-facing applications. The orchestration engine ensures workflows run in the correct sequence, recover from failures, and scale dynamically. ## Orchestrate data pipelines with Kestra ## What is Data Orchestration? Data orchestration automates the execution of interconnected tasks (ETL, Analytics, AI and ML jobs) while governing their dependencies and business logic. It focuses on **how data moves** between systems, teams and processes. Kestra's data orchestration capabilities include: - **Flexible workflow triggers** — run data flows on schedule, external events (e.g., a new file in S3/SFTP), or API calls. - **Powerful orchestration engine** — control retries, parallel task runs, timeouts, SLAs, concurrency and error handling. - **Dynamic resource allocation** — provision containers on-demand (e.g., AWS Fargate, GCP Batch, Azure Batch, Kubernetes) for compute-heavy tasks. - **Visibility** — track logs, traces, metrics, inputs, outputs, and lineage across workflows and tasks. --- ## Why Use Kestra for Data Orchestration? 1. 
**Simple Declarative Syntax** – Define each data pipeline in a self-contained, portable YAML configuration that includes tasks, triggers, dependencies and infrastructure requirements. 2. **Extensible Integrations** – Connect to over 1200 services via [pre-built plugins](https://kestra.io/plugins). Thanks to plugins, you can avoid writing custom code for boilerplate tasks like file downloads, SQL queries, or REST API calls. 3. **Execution Control** – Set retries, timeouts, SLAs, and concurrency limits. 4. **Zero Code Changes** – Run existing Python/R/SQL scripts as-is (no rewrites needed); specify dependencies via YAML configuration. 5. **State Management** – Pass data of any size between tasks (_files, variables, query results_) or between workflows (_using KV Store_) thanks to [internal storage](../../08.architecture/data-components/index.md#internal-storage). 6. **Dynamic Scaling** – Scale custom code with [task runners](../../task-runners/index.mdx). Spin up containers on cloud services (AWS ECS Fargate, Google Batch, Kubernetes) dynamically at runtime — no need for dedicated always-on workers (_to scale on-premise deployments, you can use [worker groups](../../07.enterprise/04.scalability/worker-group/index.md)_). 7. **Observability** – Monitor flow execution states, durations, logs, inputs, outputs and resource usage in real time. --- ## Example: Data Engineering Pipeline The following flow triggers a data sync from Airbyte, Fivetran, and dbt Cloud. Then, it downloads a JSON dataset via REST API, filters specific columns using Python, and calculates KPIs with DuckDB. 
Kestra dynamically provisions a Python container for the task running custom code and terminates it once the task completes: ```yaml id: data_pipeline namespace: tutorial description: Process product data to calculate brand price averages inputs: - id: columns_to_keep type: ARRAY itemType: STRING defaults: - brand - price tasks: - id: airbyte_sync type: io.kestra.plugin.airbyte.connections.Sync url: http://localhost:8080 connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12cd wait: true - id: fivetran_sync type: io.kestra.plugin.fivetran.connectors.Sync apiKey: "{{ secret('FIVETRAN_API_KEY') }}" apiSecret: "{{ secret('FIVETRAN_API_SECRET') }}" connectorId: myConnectorId wait: true - id: dbt_job type: io.kestra.plugin.dbt.cloud.TriggerRun accountId: dbt_account token: "{{ secret('DBT_CLOUD_TOKEN') }}" jobId: abc12345 wait: true - id: extract type: io.kestra.plugin.core.http.Download uri: https://dummyjson.com/products # Filter columns in a disposable Python container - id: transform type: io.kestra.plugin.scripts.python.Script containerImage: python:3.11-slim taskRunner: # spins up container on-demand type: io.kestra.plugin.scripts.runner.docker.Docker inputFiles: data.json: "{{ outputs.extract.uri }}" # Input from previous task outputFiles: - "*.json" script: | import json filtered_data = [ {col: product.get(col, "N/A") for col in {{ inputs.columns_to_keep }}} for product in json.load(open("data.json"))["products"] ] json.dump(filtered_data, open("products.json", "w"), indent=4) # Analyze filtered data with DuckDB - id: query type: io.kestra.plugin.jdbc.duckdb.Query fetchType: STORE inputFiles: products.json: "{{ outputs.transform.outputFiles['products.json'] }}" sql: | INSTALL json; LOAD json; SELECT brand, ROUND(AVG(price), 2) AS avg_price FROM read_json_auto('{{ workingDir }}/products.json') GROUP BY brand ORDER BY avg_price DESC; ``` --- ## Getting Started with Data Orchestration 1. 
**Install Kestra** – Follow the [quick start guide](../../01.quickstart/index.md) or the full [installation instructions for production environments](../../02.installation/index.mdx). 2. **Write Your Workflows** – Configure your [flow](../../03.tutorial/index.mdx) in YAML, declaring inputs, tasks, and triggers. Tasks can be anything — scripts, queries, remote jobs or API calls. Add `retry`, `timeout`, `concurrency` or `taskRunner` settings to scale tasks dynamically and manage data orchestration logic. 3. **Add Triggers** – Execute flows manually, via schedules, API, flow or event [triggers](../../05.workflow-components/07.triggers/index.mdx) (e.g., S3 file uploads). 4. **Observe and Manage** – Use [Kestra’s UI](../../09.ui/index.mdx) to inspect workflow outputs, logs, execution states, and dependencies. --- ## Next Steps - [Explore plugins](https://kestra.io/plugins) for databases, data ingestion and transformation tools or custom scripts in any language. - [Explore blueprints](/blueprints) for common data workflows and data orchestration patterns. - [Explore How-to Guides](../../15.how-to-guides/index.mdx) for detailed examples on [using Kestra for ETL](../../15.how-to-guides/etl-pipelines/index.md), [ELT](../../15.how-to-guides/dbt/index.md), ML, and more. - [Explore Task Runners](../../07.enterprise/04.scalability/task-runners/index.md) for scaling custom scripts and containers. - [Explore video tutorials](https://www.youtube.com/@kestra-io) on our YouTube channel. - [Join Slack](https://kestra.io/slack) to share flow examples or ask questions. - [Book a demo](https://kestra.io/demo) to discuss how Kestra can help orchestrate your data workflows. --- # Orchestrate dbt Workflows in Kestra URL: https://kestra.io/docs/use-cases/dbt > Version-control, test, and deploy dbt models with Kestra and GitOps. Run dbt on demand or on a schedule with on-demand compute for reliable transformations. Data teams use dbt to transform data in warehouses. 
While dbt simplifies SQL transformations, managing dependencies, testing changes, and deploying models at scale remains challenging. Kestra solves this by integrating dbt with your data platform through version-controlled workflows. ## Orchestrate dbt projects with Kestra ## What is needed to orchestrate dbt workflows? Orchestration platforms like Kestra automate the execution of dbt models while managing dependencies, environments, and deployments. With Kestra, you can: - **Version control models** – Store dbt projects in Git and sync with Kestra's namespace files - **Test changes safely** – Run modified models in isolated containers before production - **Scale transformations** – Execute dbt builds on dynamically provisioned containers in the cloud using [task runners](../../07.enterprise/04.scalability/task-runners/index.md) (AWS/GCP/Azure Batch) - **Integrate with your data stack** – Chain dbt runs with ingestion tools, quality checks, and alerts. --- ## Why Use Kestra for dbt Orchestration? 1. **GitOps Workflows** – Sync dbt projects from Git, add and test new models, then push changes to Git from Kestra. 2. **Environment Management** – Run models in different targets (dev/stage/prod) from one self-contained flow. 3. **Dynamic Scaling** – Execute heavy dbt builds on serverless containers or Kubernetes clusters. 4. **Dependency Tracking** – Automatically parse `manifest.json` to visualize model relationships. 5. **Integrated Testing** – Add data quality checks between dbt models using Python or SQL. 6. **CI/CD Pipelines** – Deploy model changes to multiple Kestra namespaces or Git branches. 7. **Multi-Project Support** – Coordinate multiple dbt projects declaratively in one flow. --- ## Example: dbt Project Orchestration Below are common patterns to orchestrate dbt workflows using Kestra. ### Fetch dbt Project from Git at Runtime The example below runs `dbt build` for DuckDB in a Docker container. 
The dbt project is cloned from a Git repository at runtime to ensure the latest version is used. ```yaml id: dbt_duckdb namespace: company.team.dbt tasks: - id: dbt type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: clone_repository type: io.kestra.plugin.git.Clone url: https://github.com/kestra-io/dbt-example branch: main - id: dbt_build type: io.kestra.plugin.dbt.cli.DbtCLI taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker containerImage: ghcr.io/kestra-io/dbt-duckdb:latest commands: - dbt deps - dbt build profiles: | my_dbt_project: outputs: dev: type: duckdb path: ":memory:" fixed_retries: 1 threads: 16 timeout_seconds: 300 target: dev ``` ### Sync dbt Project from Git to Kestra's Namespace Files You can sync the dbt project from a Git branch to Kestra's namespace and iterate on the models from the integrated code editor in the Kestra UI. ```yaml id: dbt_build namespace: company.team.dbt tasks: - id: sync type: io.kestra.plugin.git.SyncNamespaceFiles url: https://github.com/kestra-io/dbt-example branch: master namespace: "{{ flow.namespace }}" gitDirectory: dbt dryRun: false - id: dbt_build type: io.kestra.plugin.dbt.cli.DbtCLI containerImage: ghcr.io/kestra-io/dbt-duckdb:latest namespaceFiles: enabled: true exclude: - profiles.yml taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker commands: - dbt build profiles: | my_dbt_project: outputs: prod: type: duckdb path: ":memory:" schema: main threads: 8 target: prod ``` You can use the above flow as an initial setup: 1. Add this flow within Kestra UI 2. Save it 3. Execute that flow 4. Click on the `Files` sidebar in the code editor to view the uploaded dbt files. ![dbt-code-editor](../../15.how-to-guides/dbt/dbt-code-editor.png) You can then set `disabled: true` within the first task after the first execution to avoid re-syncing the project. This allows you to iterate on the models without cloning the repository every time. 
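Assuming you follow that advice, the first task of the flow above would look like this after the initial execution (a sketch; only the `disabled` property changes):

```yaml
- id: sync
  type: io.kestra.plugin.git.SyncNamespaceFiles
  disabled: true # skip re-syncing; the namespace files are already in place
  url: https://github.com/kestra-io/dbt-example
  branch: master
  namespace: "{{ flow.namespace }}"
  gitDirectory: dbt
  dryRun: false
```

Remove `disabled: true` (or set it to `false`) whenever you want to pull a fresh copy of the project from Git again.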
With the Code Editor built into Kestra, you can easily manage dbt projects by cloning the dbt Git repository, and uploading it to your Kestra namespace. You can make changes to the dbt models directly from the Kestra UI, test them as part of an end-to-end workflow, and push the changes to the desired Git branch when you are ready. :::collapse{title="Run dbt CLI, iterate on models, and push changes to Git"} Create a flow that runs dbt CLI commands on top of the dbt project synced from Git to your Kestra namespace. Use the Code Editor to make changes to the dbt models and push them back to the Git repository. ```yaml id: dbt_build namespace: company.team.dbt inputs: - id: dbt_command type: SELECT allowCustomValue: true defaults: dbt build --project-dir dbt --profiles-dir dbt --no-partial-parse --target prod values: - dbt build --project-dir dbt --profiles-dir dbt --no-partial-parse --target prod - dbt build --project-dir dbt --profiles-dir dbt --no-partial-parse --target prod --select state:modified+ --defer --state ./target --target-path ./dev tasks: - id: dbt type: io.kestra.plugin.dbt.cli.DbtCLI namespaceFiles: enabled: true containerImage: ghcr.io/kestra-io/dbt-duckdb:latest projectDir: dbt commands: - "{{ inputs.dbt_command }}" loadManifest: key: manifest.json namespace: "{{ flow.namespace }}" storeManifest: key: manifest.json namespace: "{{ flow.namespace }}" taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker ``` The `namespaceFiles` property lets you run dbt commands on files uploaded to the namespace, so you can test dbt models without cloning the Git repository each time. Execute the flow using the default value for the `dbt_command` input. ### Edit dbt file You can now open the dbt files in the Code Editor and make changes as needed. 
For example, let's add a new model `my_third_dbt_model.sql`: ```sql select * from {{ ref('my_first_dbt_model') }} where id = 2 ``` ![dbt-code-editor](../../15.how-to-guides/dbt/dbt-code-editor-2.png) When you now run the flow using the second dropdown value for the `dbt_command` input, only the new model will be built. This allows you to test the changes quickly and iterate faster. ### Push changes to Git Once you are satisfied with the changes, you can push them to your desired branch of the same Git repository using the [PushNamespaceFiles](../../15.how-to-guides/pushnamespacefiles/index.md) task. ```yaml id: push_dbt_to_git namespace: company.datateam.dbt inputs: - id: commit_message type: STRING defaults: "Changes to dbt from Kestra" tasks: - id: commit_and_push type: io.kestra.plugin.git.PushNamespaceFiles namespace: "{{ flow.namespace }}" username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" url: https://github.com/git_username/scripts branch: dev gitDirectory: dbt commitMessage: "{{ inputs.commit_message }}" ``` Adjust the `url`, `branch`, and `gitDirectory` properties to match your dbt Git repository structure. If the branch does not exist, it will be created. If you want to test this step more incrementally, you can set the `dryRun` property to `true` to validate the changes before committing them to Git. ::: --- ## Kestra Features to Orchestrate dbt Workflows ### Git Integration Clone dbt projects from any Git provider: ```yaml - id: clone type: io.kestra.plugin.git.Clone url: https://github.com/kestra-io/dbt-example branch: main ``` ### Best-in-class log navigation across dbt models Kestra automatically parses the `manifest.json` file within the Execution Gantt chart to provide visibility into each dbt model's build status, duration, and logs. 
You can browse all logs in one place (without having to manually navigate to each dbt model) and you can easily jump to the next `INFO`/`WARN`/`ERROR` log thanks to the best-in-class log navigation feature. ![dbtLogs](./dbtLogs.png) ### Manifest Tracking Store dbt artifacts between runs in the integrated KV Store: ```yaml tasks: - id: dbt type: io.kestra.plugin.dbt.cli.DbtCLI namespaceFiles: enabled: true loadManifest: key: manifest.json namespace: "{{ flow.namespace }}" storeManifest: key: manifest.json namespace: "{{ flow.namespace }}" ``` ### Custom Quality Checks Add quality checks validating dbt models using various plugins such as [Soda](/plugins/plugin-soda): ```yaml - id: scan type: io.kestra.plugin.soda.Scan configuration: # ... checks: checks for orderDetail: - row_count > 0 - max(unitPrice): warn: when between 1 and 250 fail: when > 250 checks for territory: - row_count > 0 - failed rows: name: Failed rows query test fail condition: regionId = 4 requirements: - soda-core-bigquery ``` ### Multi-Project Coordination If needed, you can orchestrate multiple dbt projects from a single flow: ```yaml - id: core type: io.kestra.plugin.dbt.cli.DbtCLI projectDir: dbt-core - id: marts type: io.kestra.plugin.dbt.cli.DbtCLI projectDir: dbt-marts ``` ### Scale dbt Workflows in the Cloud Adding the following `pluginDefaults` to that flow (or your namespace) will scale the dbt task so that the (_computationally heavy_) dbt parsing process runs on AWS ECS Fargate, Google Batch, Azure Batch, or Kubernetes job by leveraging [Kestra's task runners](../../07.enterprise/04.scalability/task-runners/index.md): ```yaml pluginDefaults: - type: io.kestra.plugin.dbt.cli.DbtCLI values: taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch region: us-east-1 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" computeEnvironmentArn: "arn:aws:batch:us-east-1:123456789:compute-environment/kestra" jobQueueArn: 
"arn:aws:batch:us-east-1:123456789:job-queue/kestra" executionRoleArn: "arn:aws:iam::123456789:role/ecsTaskExecutionRole" taskRoleArn: "arn:aws:iam::123456789:role/ecsTaskRole" bucket: kestra-us ``` You can set plugin defaults at the flow, namespace, or global level to apply to all tasks of that type, ensuring that all dbt tasks run on AWS ECS Fargate in a given environment. --- ## Getting Started with dbt Orchestration 1. **Install Kestra** – Follow the [quick start guide](../../01.quickstart/index.md) or the full [installation instructions for production environments](../../02.installation/index.mdx). 2. **Write Your Workflows** – Configure your [flow](../../03.tutorial/index.mdx) in YAML, declaring inputs, tasks, and triggers. Use one of the patterns above to sync dbt projects from Git, run dbt CLI commands, and push changes back to Git. 3. **Configure Environments** — Set up dbt profiles for different targets based on your dbt project setup: ```yaml - id: dbt type: io.kestra.plugin.dbt.cli.DbtCLI containerImage: ghcr.io/kestra-io/dbt-duckdb:latest profiles: | my_dbt_project: outputs: prod: type: duckdb ``` 4. **Add Execution Triggers** — Schedule dbt runs or trigger them based on upstream data availability: ```yaml triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * 1-5" # Weekdays at 9 AM ``` 5. **Monitor Runs** — Track dbt models and their execution durations in Kestra's UI. --- ## Next Steps - [Explore dbt plugins](/plugins/plugin-dbt) - [Read how-to guide on dbt](../../15.how-to-guides/dbt/index.md) - [Explore video tutorials](https://www.youtube.com/@kestra-io) on our YouTube channel - [Join Slack](https://kestra.io/slack) to ask questions, contribute code, report bugs and share and feature requests. - [Book a demo](https://kestra.io/demo) to discuss how Kestra can help orchestrate your dbt workflows. 
--- # Automate Infrastructure Workflows in Kestra URL: https://kestra.io/docs/use-cases/infrastructure > Provision resources, manage builds, and scale infrastructure workflows with Kestra. Automate cloud provisioning, CI/CD, and Terraform from one platform. DevOps and engineering teams automate infrastructure to ensure consistency across environments, avoid manual errors, and scale resources on demand. With Kestra, you can orchestrate Docker builds, Terraform, OpenTofu, and Terragrunt deployments, Ansible playbooks, and cloud provisioning in a unified workflow — triggered by schedules, code changes, or external events. ## Automate infrastructure workflows with Kestra ## What is Infrastructure Automation? Infrastructure automation is the process of codifying and orchestrating infrastructure tasks to manage cloud resources, build container images, and deploy applications. Kestra handles dependencies, retries, and scaling so you can: - **Provision resources** dynamically (e.g., spin up cloud VM instances via Terraform, OpenTofu, or Terragrunt) - **Build and deploy** containers for critical applications - **Scale workloads** using task runners (AWS ECS Fargate, Azure Batch, Kubernetes) or dedicated worker groups - **Roll back** code changes to a previous revision if some infrastructure workflows fail. --- ## Why Use Kestra for Infrastructure Automation? 1. **Unified Platform** – Combine Docker, Terraform, OpenTofu, Terragrunt, Ansible, and cloud CLI tools in a declarative YAML flow. 2. **Dynamic Scaling** – Task runners provision containers on-demand (e.g., AWS Fargate) for heavy containerized workloads. 3. **State Management** – Securely pass secrets, variables, and outputs between tasks (e.g., Docker image tag → Terraform config). 4. **Failure Handling** – Retry failed Terraform plans or Docker builds with custom retry policies, and get alerts on failures. 5. **GitOps** – Sync infrastructure-as-code (IaC) workflows with Git for version control and auditability. 6. 
**Security** – Inject cloud credentials via secrets and manage access controls across teams. 7. **Multi-Cloud** – Avoid lock-in by orchestrating AWS, GCP, Azure, or on-premises infrastructure in the same flow. --- ## Example: Infrastructure Automation Flow This workflow builds a Docker image, runs a container, provisions cloud resources with Terraform, and logs the results. You can substitute the `TerraformCLI` task with [OpenTofuCLI](/plugins/plugin-opentofu/cli/io.kestra.plugin.opentofu.cli.opentofucli) for a drop-in open-source alternative, or [TerragruntCLI](/plugins/plugin-terragrunt/cli/io.kestra.plugin.terragrunt.cli.terragruntcli) when orchestrating multi-module configurations. ```yaml id: infrastructure_automation namespace: devops inputs: - id: docker_image type: STRING defaults: kestra/myimage:latest tasks: - id: build_image type: io.kestra.plugin.docker.Build dockerfile: | FROM python:3.11-alpine RUN pip install --no-cache-dir kestra tags: - "{{ inputs.docker_image }}" push: false # change this to true after adding credentials credentials: registry: https://index.docker.io/v1/ username: "{{ secret('DOCKERHUB_USERNAME') }}" password: "{{ secret('DOCKERHUB_PASSWORD') }}" - id: run_container type: io.kestra.plugin.docker.Run pullPolicy: NEVER # to use the local image we've just built containerImage: "{{ inputs.docker_image }}" commands: - pip - show - kestra - id: run_terraform type: io.kestra.plugin.terraform.cli.TerraformCLI beforeCommands: - terraform init commands: - terraform plan 2>&1 | tee plan_output.txt - terraform apply -auto-approve 2>&1 | tee apply_output.txt outputFiles: - "*.txt" inputFiles: main.tf: | terraform { required_providers { http = { source = "hashicorp/http" } local = { source = "hashicorp/local" } } } provider "http" {} provider "local" {} variable "pokemon_names" { type = list(string) default = ["pikachu", "psyduck", "charmander", "bulbasaur"] } data "http" "pokemon" { count = length(var.pokemon_names) url = 
"https://pokeapi.co/api/v2/pokemon/${var.pokemon_names[count.index]}" } locals { pokemon_details = [for i in range(length(var.pokemon_names)) : { name = jsondecode(data.http.pokemon[i].response_body)["name"] types = join(", ", [for type in jsondecode(data.http.pokemon[i].response_body)["types"] : type["type"]["name"]]) }] file_content = join("\n\n", [for detail in local.pokemon_details : "Name: ${detail.name}\nTypes: ${detail.types}"]) } resource "local_file" "pokemon_details_file" { filename = "${path.module}/pokemon.txt" content = local.file_content } output "file_path" { value = local_file.pokemon_details_file.filename } - id: log_pokemon type: io.kestra.plugin.core.log.Log message: "{{ read(outputs.run_terraform.outputFiles['pokemon.txt']) }}" ``` --- ## Getting Started with Infrastructure Automation 1. **Install Kestra** – Follow the [quick start guide](../../01.quickstart/index.md) or the full [installation instructions for production environments](../../02.installation/index.mdx). 2. **Write Your Workflows** – Configure your [flow](../../03.tutorial/index.mdx) in YAML, declaring inputs, tasks, and triggers. Tasks can be anything from Docker, Terraform and Ansible plugins, to tasks running CLI commands on AWS/GCP, custom scripts in any language, queries to databases, or API calls. Add `retry`, `timeout`, `concurrency` or `taskRunner` settings to scale tasks dynamically and manage the orchestration logic. 3. **Add Triggers** – Execute flows manually, on schedules (e.g., nightly redeploys), or via event [triggers](../../05.workflow-components/07.triggers/index.mdx) (e.g., GitHub webhooks, S3 file uploads). 4. **Observe and Manage** – Use [Kestra’s UI](../../09.ui/index.mdx) to track Terraform plan or Ansible playbook outputs, Docker build logs, resource usage, logs, execution states, and dependencies across systems. 
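The `retry`, `timeout`, `concurrency`, and `taskRunner` settings mentioned in step 2 can be sketched together in a single flow. The flow and task identifiers below are hypothetical placeholders; adjust the values to your environment:

```yaml
id: infra_settings_sketch        # hypothetical flow illustrating orchestration settings
namespace: devops

concurrency:
  limit: 2                       # run at most two executions of this flow at once

tasks:
  - id: provision
    type: io.kestra.plugin.scripts.shell.Commands
    timeout: PT30M               # fail the task if it runs longer than 30 minutes
    retry:
      type: constant
      interval: PT1M
      maxAttempts: 3
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - echo "provisioning infrastructure..."
```

The task runner can be swapped for a cloud-backed runner without changing the commands themselves.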
--- ## Next Steps - [Explore plugins](https://kestra.io/plugins) for Docker, Terraform, OpenTofu, Terragrunt, Ansible, Script tasks (Python, Go, Shell, PowerShell, Ruby and more), and cloud providers. - [Explore Task Runners](../../07.enterprise/04.scalability/task-runners/index.md) for scaling custom scripts and containers. - [Read how-to guides](../../15.how-to-guides/index.mdx) on how to integrate with Grafana, Prometheus, and other monitoring tools. - [Explore video tutorials](https://www.youtube.com/@kestra-io) on our YouTube channel. - [Join Slack](https://kestra.io/slack) to ask questions or contribute bug reports and feature requests. - [Book a demo](https://kestra.io/demo) to discuss how Kestra can help orchestrate your infrastructure workflows. --- # Orchestrate Microservices in Kestra URL: https://kestra.io/docs/use-cases/microservices > Run microservice tasks in response to events, recover from failures, and scale as needed Data and engineering teams often rely on microservices to keep their systems modular, fault-tolerant, and flexible. Each microservice handles a single task, making it easier to develop, scale, and maintain. The main challenge is orchestrating these independent components so they run in the correct order, automatically recover from failures, and scale as needed. ## Orchestrate microservices with Kestra ## What is Microservices Orchestration? Microservices orchestration is the automated coordination of services, often triggered by events such as a file upload to S3/SFTP, a message in Kafka/PubSub, or a new database row. A platform like Kestra defines the sequence of these services, manages their dependencies, and ensures reliable execution — whether triggered manually, via API calls, webhooks, schedules, or by completion of upstream workflows. 
Kestra can: - [Trigger](../../05.workflow-components/07.triggers/index.mdx) your microservices from any event, schedule, or flow dependency - Pass data of any size between services thanks to built-in [internal storage](../../08.architecture/data-components/index.md#internal-storage) - Dynamically provision task runner environments, so your services have enough compute resources - Retry failed services, keeping workflows robust and fault-tolerant - Send alerts or notifications on success or failure - Track logs, metrics, inputs, and outputs of each service execution - Roll back to earlier workflow [revisions](../../15.how-to-guides/rollback-and-revision-history/index.md) as needed. ## Why Use Kestra for Microservices Orchestration? 1. **Visibility** – View dependencies, see which service failed or succeeded, then restart or roll back as needed. 2. **Simplicity** – Declare dependencies in YAML or use Kestra’s UI with no-code/low-code options. 3. **Scalability** – Run microservices in parallel and scale compute resources based on workload. 4. **Resilience** – If one service fails, Kestra can retry just that part instead of re-running the entire workflow. 5. **Zero Code Changes** – Keep your existing code as-is and add minimal YAML on top to orchestrate it. 6. **Extensibility** – Add new triggers, tasks, runners, or notifications through Kestra’s plugin system. 7. **Security and Compliance** – Manage secrets, access, encryption, and audit logs within Kestra. 8. **Version Control** – Keep orchestration configurations in Git and revert to previous versions if needed. 9. **Multi-Tenancy** – Use separate tenants/namespaces for different teams or projects, each with its own variables and secrets. 10. **Open-Source Core** – Ask questions in our Slack community, report issues on GitHub, contribute to the codebase — all with no vendor lock-in. ## Example: Microservices Orchestration in Kestra Below is a minimal Kestra flow for an e-commerce order-processing workflow. 
It checks inventory, processes payment, confirms the order, arranges shipping, and updates delivery status. Each task waits for a successful response before triggering the next service, passing data and determining the next step based on the status code of the prior API call. ```yaml id: orderProcessing namespace: ecommerce description: E-commerce Order Processing Workflow inputs: - id: orderId type: STRING defaults: myorder tasks: - id: checkInventory type: io.kestra.plugin.core.http.Request description: Check inventory for the order items uri: https://kestra.io/api/mock - id: processPayment type: io.kestra.plugin.core.http.Request runIf: "{{ outputs.checkInventory.code == 201 }}" description: Process payment for the order uri: https://kestra.io/api/mock - id: orderConfirmation type: io.kestra.plugin.core.http.Request runIf: "{{ outputs.processPayment.code == 201 }}" description: Confirm the order and notify the customer uri: https://kestra.io/api/mock - id: arrangeShipping type: io.kestra.plugin.core.http.Request runIf: "{{ outputs.orderConfirmation.code == 201 }}" description: Arrange shipping for the order uri: https://kestra.io/api/mock - id: updateDeliveryStatus type: io.kestra.plugin.core.http.Request runIf: "{{ outputs.arrangeShipping.code == 201 }}" description: Update the delivery status of the order uri: https://kestra.io/api/mock pluginDefaults: - type: io.kestra.plugin.core.http.Request values: contentType: multipart/form-data method: POST formData: orderId: "{{inputs.orderId}}" ``` ## Getting Started with Microservice Orchestration in Kestra 1. **Install Kestra** – Follow the [quick start guide](../../01.quickstart/index.md) or the full [installation instructions for production environments](../../02.installation/index.mdx). 2. **Write Your Workflows** – Configure your [flow](../../03.tutorial/index.mdx) in YAML. Each task can invoke an API, run scripts, or call any existing service. 3. 
**Add Triggers** – Use scheduled or event-based [triggers](../../05.workflow-components/07.triggers/index.mdx) to start microservice workflows. 4. **Observe and Manage** – Use [Kestra’s UI](../../09.ui/index.mdx) to monitor states, logs, and metrics. Rerun failed workflow executions or roll back with one click. --- ## Next Steps - [Explore plugins](https://kestra.io/plugins) for databases, message brokers or custom scripts in any language. - [Explore blueprints](/blueprints) for common microservice orchestration patterns. - [Explore How-to Guides](../../15.how-to-guides/index.mdx) for detailed examples on using Kestra to orchestrate microservices written in Python, R, Node.js, Rust, Ruby, Go, Shell, PowerShell or any other language. - [Explore Task Runners](../../07.enterprise/04.scalability/task-runners/index.md) for scaling custom code and containerized services. - [Explore video tutorials](https://www.youtube.com/@kestra-io) on our YouTube channel. - [Join Slack](https://kestra.io/slack) to share flow examples or ask questions. - [Book a demo](https://kestra.io/demo) to discuss how Kestra can help orchestrate your microservices. --- # Orchestrate Python Workflows in Kestra URL: https://kestra.io/docs/use-cases/python-workflows > Automate, schedule, and scale Python workflows declaratively with Kestra. Run scripts in Docker containers with full dependency management and observability. Data teams and developers use Python for AI, ML, ETL, analytics, and a lot more. Kestra lets you schedule and orchestrate Python scripts at scale — whether they’re simple data transformations, API calls, or compute-heavy ML jobs — without rewriting code or managing infrastructure. ## Orchestrate Python workflows with Kestra ## What is Workflow Orchestration for Python? Workflow orchestration platforms like Kestra automate the execution and deployment of your Python code across environments, handling dependencies, error recovery, and resource scaling. 
With Kestra, you can: - **Schedule scripts** via cron, external events (e.g., new files in S3), or API calls. - **Manage dependencies** with `pip` and `uv`, using custom or pre-built container images. - **Pass data** between Python tasks and downstream steps (SQL queries, APIs, etc.). - **Scale dynamically** — run scripts in lightweight containers or cloud services like AWS ECS Fargate, Azure Batch, GCP Cloud Run, Modal or Kubernetes. --- ## Why Use Kestra for Python Scripts? 1. **Zero Code Changes** – Run existing Python scripts as-is (no decorators needed); specify dependencies via YAML configuration or no-code forms. 2. **Dependency Management** – Dynamically install latest packages at runtime with `pip`, use custom Docker images, or leverage [pre-built packages](https://github.com/orgs/kestra-io/packages). 3. **Dynamic Scaling** – [Task runners](../../07.enterprise/04.scalability/task-runners/index.md) provision resources on-demand (AWS ECS Fargate, Google Batch) for heavy workloads. 4. **Observability** – Track logs, outputs, and custom metrics (e.g., row counts, durations) in real time. 5. **Integration** – Combine Python with SQL, Spark, dbt, or microservices in a single flow. 6. **Failure Handling** – Retry failed scripts with configurable retry policies and get alerts on errors. 7. **React to Events** – Trigger Python scripts on file uploads from S3/SFTP, API calls, or custom events from Kafka, RabbitMQ, SQS, etc. 8. **Schedules and Backfills** – Run scripts on a schedule or backfill historical data with custom parameters. --- ## Example: Python Data Pipeline This flow runs a Python script to fetch data, processes it with Pandas, and logs results. 
Kestra dynamically provisions a container for the task and scales down once complete: ```yaml id: sales_analysis namespace: analytics description: Analyze daily sales data tasks: - id: extract type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - id: transform type: io.kestra.plugin.scripts.python.Script containerImage: ghcr.io/kestra-io/pydata:latest # Pre-built image with Pandas inputFiles: data.csv: "{{ outputs.extract.uri }}" script: | import pandas as pd from kestra import Kestra df = pd.read_csv("data.csv") total_sales = float(df["total"].sum()) product_quantity = df.groupby("product_id")["quantity"].sum().astype('int32') top_product_id = int(product_quantity.idxmax()) Kestra.outputs({ "total_sales": round(total_sales, 2), "top_product_id": top_product_id, "total_quantity_sold": int(product_quantity.max()) }) Kestra.counter("row_count", int(len(df))) Kestra.counter("unique_products", int(df['product_id'].nunique())) - id: notify type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "https://kestra.io/api/mock" payload: | { "text": "📊 *Daily Sales Report* • Total Sales: ${{ outputs.transform.vars.total_sales }} • Top Product ID: #{{ outputs.transform.vars.top_product_id }} • Units Sold of Top Product: {{ outputs.transform.vars.total_quantity_sold }}" } triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * *" # Run every day at 9 AM ``` Adding the following `pluginDefaults` to that flow (or your namespace) will scale the Python task to run on AWS ECS Fargate: ```yaml pluginDefaults: - type: io.kestra.plugin.scripts.python values: taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch region: us-east-1 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}" secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}" computeEnvironmentArn: "arn:aws:batch:us-east-1:123456789:compute-environment/kestra" jobQueueArn: "arn:aws:batch:us-east-1:123456789:job-queue/kestra" 
executionRoleArn: "arn:aws:iam::123456789:role/ecsTaskExecutionRole" taskRoleArn: "arn:aws:iam::123456789:role/ecsTaskRole" bucket: kestra-us ``` You can set plugin defaults at the flow, namespace, or global level to apply to all tasks of that type, ensuring that all Python tasks run on AWS ECS Fargate in a given environment. --- ## Kestra Features for Python Orchestration ### Package Dependency Management Install packages at runtime or use pre-built images: ```yaml - id: script type: io.kestra.plugin.scripts.python.Script beforeCommands: - pip install pandas requests script: | # Your code here ``` ### Outputs and Metrics Pass data between tasks using outputs and track metrics: ```python from kestra import Kestra Kestra.outputs({"key": "value"}) # Pass to downstream tasks Kestra.counter("rows_processed", 1000) # Track metrics ``` ### Dynamic Scaling Run heavy scripts on dynamically provisioned cloud infrastructure: ```yaml taskRunner: type: io.kestra.plugin.ee.aws.runner.Batch resources: cpu: 4 memory: 8192 ``` ### Error Handling Add configurable `retry` policies to automatically retry failed tasks: ```yaml retry: type: constant interval: PT1M maxAttempts: 3 ``` Alert on failures via email, Slack, and other [notification plugins](https://kestra.io/plugins): ```yaml errors: - id: send_alert type: io.kestra.plugin.slack.notifications.SlackExecution url: "{{secret('SLACK_WEBHOOK_URL')}}" executionId: "{{execution.id}}" ``` --- ## Getting Started Orchestrating Python Workflows 1. **Install Kestra** – Follow the [quick start guide](../../01.quickstart/index.md) or [production setup](../../02.installation/index.mdx). 2. **Write Your Flow** – Define Python tasks in YAML. Use `Script` for inline code or `Commands` for `.py` files: ```yaml - id: py type: io.kestra.plugin.scripts.python.Commands namespaceFiles: enabled: true commands: - python scripts/transform.py ``` 3. **Add Triggers** – run flows on schedule, via API or on events (e.g., new files in S3). 4. 
**Observe** – Monitor execution logs, outputs, and metrics in [Kestra’s UI](../../09.ui/index.mdx). --- ## Next Steps - [Explore Python plugins](/plugins/plugin-script-python) - [Manage package dependencies](../../15.how-to-guides/python-dependencies/index.md) with Docker or `pip`. - [Explore video tutorials](https://www.youtube.com/@kestra-io) on our YouTube channel. - [Join Slack](https://kestra.io/slack) to ask questions, contribute code or share feature requests. - [Book a demo](https://kestra.io/demo) to discuss how Kestra can help orchestrate your Python workflows. --- # Version Control & CI/CD in Kestra: GitOps and Pipelines URL: https://kestra.io/docs/version-control-cicd > Overview of Kestra's version control and CI/CD capabilities, enabling GitOps workflows and automated deployment pipelines. import ChildCard from "~/components/docs/ChildCard.astro" Version Control & CI/CD Pipelines ## Control versions and automate deployments with CI/CD --- # CI/CD Pipelines in Kestra: Validate and Deploy Flows URL: https://kestra.io/docs/version-control-cicd/cicd > Automate Kestra flow validation and deployment with CI/CD. Integrate with GitHub Actions, GitLab CI, and other pipelines to enforce quality before production. Automate the validation and deployment of your Kestra flows using CI/CD pipelines. ## Automate validation and deployment with CI/CD Continuous integration and delivery (CI/CD) pipelines enable teams to deploy updates automatically and consistently as soon as they are reviewed and merged into a version control system (VCS) like Git. This section covers multiple approaches to building a CI/CD pipeline for Kestra — from using the CLI and GitHub Actions to integrating with Terraform. :::alert{type="info"} When flows are deployed through CI/CD, add the [`system.readOnly`](../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. 
This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: --- ## Why use a CI/CD pipeline? A CI/CD process ensures **fast, reliable, and repeatable deployments**. It removes manual steps, reduces human error, and accelerates delivery from development to production environments. --- ## CI/CD for Kestra flows Kestra supports several approaches for automating flow validation and deployment. Choose the one that best fits your environment and tooling preferences. --- ### Kestra CLI The [Kestra CLI](./04.helpers/index.md) includes built-in commands for validating and deploying your flows. #### Validate and deploy a single flow ```bash ## Validate a single flow ./kestra flow validate flow_directory/myflow.yml --server http://localhost:8080 --api-token ## Deploy a single flow to a namespace (without deleting existing flows) ./kestra flow namespace update namespace_name flow_directory/myflow.yml --no-delete --server http://localhost:8080 --api-token ``` :::alert{type="info"} The `--api-token` flag is available in the [Enterprise Edition](../../07.enterprise/03.auth/api-tokens/index.md). In the open-source edition, use basic authentication with the `--user` flag: ```bash ./kestra flow namespace update namespace_name flow_directory/myflow.yml --no-delete --server http://localhost:8080 --user=USERNAME:PASSWORD ``` ::: #### Running CLI commands in Docker If Kestra runs inside a Docker container, you can access the CLI as follows: ```bash docker exec -it kestra-container-name /bin/bash ./kestra flow --help ``` #### Validate and deploy multiple flows To process all flows in a directory: ```bash ./kestra flow validate flows/ --server http://localhost:8080 --api-token ./kestra flow namespace update namespace_name flows/ --no-delete --server http://localhost:8080 --api-token ``` Use `--no-delete` to preserve existing flows. 
Omit it if your Git repository or local directory should serve as the **single source of truth** — Kestra will then delete any previously stored flows not present in the directory. #### CLI options The CLI provides options to tailor the validation and deployment process: - `--local`: Validates flows locally using the client. By default, validation occurs server-side via the Kestra API. - `--server`: Specifies the Kestra webserver/standalone server URL (default: `http://localhost:8080`). For a full list of available options, use: ```bash ./kestra flow validate -h ./kestra flow namespace update -h ``` --- ### Automate deployments within Kestra You can run CLI commands directly from a Kestra flow to manage your CI/CD pipeline within Kestra itself. ```yaml id: ci-cd namespace: company.team tasks: - id: github-ci-cd type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: clone-repository type: io.kestra.plugin.git.Clone url: https://github.com/anna-geller/kestra-ci-cd branch: main - id: validate-flows type: io.kestra.plugin.scripts.shell.Commands description: "Validate flows from Git before deploying them." taskRunner: type: io.kestra.plugin.core.runner.Process commands: - /app/kestra flow validate flows/ --server http://localhost:8080 --api-token "{{ secret('KESTRA_API_TOKEN') }}" - id: deploy-flows type: io.kestra.plugin.scripts.shell.Commands description: "Deploy flows to production namespaces." taskRunner: type: io.kestra.plugin.core.runner.Process commands: - /app/kestra flow namespace update prod flows/prod/ --server http://localhost:8080 --api-token "{{ secret('KESTRA_API_TOKEN') }}" - /app/kestra flow namespace update prod.marketing flows/prod.marketing/ --server http://localhost:8080 --api-token "{{ secret('KESTRA_API_TOKEN') }}" triggers: - id: github type: io.kestra.plugin.core.trigger.Webhook key: "yourSecretKey1234" ``` You can trigger this CI/CD flow manually via the UI or API — or automatically using a Git webhook. 
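Before wiring up a Git webhook, it can help to see how the endpoint address is assembled. The helper below is a hypothetical sketch (the host is a placeholder); it follows Kestra's webhook URL pattern `https://{host}/api/v1/{tenant}/executions/webhook/{namespace}/{flow_id}/{key}`, using the namespace, flow ID, and key from the example flow above:

```python
# Hypothetical helper: assemble the webhook endpoint for a Kestra flow.
# Pattern: https://{host}/api/v1/{tenant}/executions/webhook/{namespace}/{flow_id}/{key}
def webhook_url(host: str, tenant: str, namespace: str, flow_id: str, key: str) -> str:
    return f"https://{host}/api/v1/{tenant}/executions/webhook/{namespace}/{flow_id}/{key}"

# "kestra.example.com" is a placeholder host; the other values match the ci-cd flow above
url = webhook_url("kestra.example.com", "main", "company.team", "ci-cd", "yourSecretKey1234")
print(url)
```

A `POST` request to this URL (for example, from a GitHub push webhook) starts the flow, provided the key matches the `key` property of the flow's `Webhook` trigger.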
#### Configuring a GitHub webhook To trigger your Kestra CI/CD flow on each Git push: 1. Go to your GitHub repository → **Settings** → **Webhooks** 2. Select **Add webhook** 3. Set the *Payload URL* to your Kestra webhook endpoint: ```bash https://kestra_host_url/api/v1/main/executions/webhook/namespace/flow_id/webhook_key ``` 4. Choose the **Push event** to trigger your pipeline (or customize for pull requests, tags, etc.) ![github_webhook_2](./github_webhook_2.png) --- ### Deploy flows with GitHub Actions Kestra provides [official GitHub Actions](./01.github-action/index.md) to validate and deploy flows. 1. **Validate** flows and templates — [Validate Action](https://github.com/marketplace/actions/kestra-validate-action) 2. **Deploy** flows and templates — [Deploy Action](https://github.com/marketplace/actions/kestra-deploy-action) #### Example GitHub Actions workflow ```yaml name: Kestra CI/CD on: push: branches: - main jobs: prod: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - name: Validate flows uses: kestra-io/validate-action@master with: directory: ./flows/prod resource: flow server: ${{secrets.KESTRA_HOSTNAME}} user: ${{secrets.KESTRA_USER}} password: ${{secrets.KESTRA_PASSWORD}} - name: Deploy prod uses: kestra-io/deploy-action@develop with: namespace: prod directory: ./flows/prod resource: flow server: ${{secrets.KESTRA_HOSTNAME}} user: ${{secrets.KESTRA_USER}} password: ${{secrets.KESTRA_PASSWORD}} delete: false - name: Deploy prod-marketing uses: kestra-io/deploy-action@develop with: namespace: prod.marketing directory: ./flows/prod.marketing resource: flow server: ${{secrets.KESTRA_HOSTNAME}} user: ${{secrets.KESTRA_USER}} password: ${{secrets.KESTRA_PASSWORD}} delete: false ``` :::alert{type="info"} You can also authenticate using an [API token](../../07.enterprise/03.auth/api-tokens/index.md) instead of username and password: ```yaml with: server: ${{secrets.KESTRA_HOSTNAME}} apiToken: ${{secrets.KESTRA_API_TOKEN}} ``` ::: --- ### 
Deploy flows with GitLab CI/CD GitLab CI/CD uses a similar approach to GitHub Actions. See the [GitLab guide](./02.gitlab/index.md) for examples and configuration details. --- ### Deploy flows with Terraform Terraform provides the most flexible, **Infrastructure-as-Code** approach to managing Kestra deployments. It allows you to define, validate, and deploy flows alongside the rest of your cloud infrastructure. Here’s an example Terraform configuration for deploying flows stored in a `flows` directory: ```hcl terraform { required_providers { kestra = { source = "kestra-io/kestra" version = "~> 0.15.0" } } } provider "kestra" { url = "http://localhost:8080" # Kestra webserver/standalone server URL api_token = "" # Only available in the Enterprise Edition } resource "kestra_flow" "flows" { for_each = fileset(path.module, "flows/*.yml") flow_id = yamldecode(templatefile(each.value, {}))["id"] namespace = yamldecode(templatefile(each.value, {}))["namespace"] content = templatefile(each.value, {}) keep_original_source = true } ``` Then run the following commands: ```bash terraform init # Download the Kestra provider terraform validate # Validate both the configuration and your flows terraform apply -auto-approve # Deploy your flows automatically ``` --- ## Next steps Explore detailed documentation for each CI/CD option below to choose the best fit for your workflow and deployment process. --- # Azure DevOps for Kestra – YAML Pipelines Example URL: https://kestra.io/docs/version-control-cicd/cicd/05-azure-devops > Build CI/CD pipelines in Azure DevOps to automate the validation and deployment of Kestra flows using YAML configurations. How to use Azure DevOps to create a CI/CD pipeline for your Kestra flows. ## Automate Kestra deployments with Azure DevOps Azure DevOps allows you to automate the validation and deployment of your Kestra flows using YAML-based pipelines. Follow the steps below to configure a simple Terraform-based CI/CD setup. 
:::alert{type="info"} For flows managed through CI/CD, add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: ### Connect to your repository First, connect your pipeline to a code repository such as **GitHub**, **Azure Repos Git**, or **Bitbucket**. ![az-devops-image-repo](./az-devops-image-repo.png) ### Select your repository Choose the repository where your Kestra flows are stored. ### Configure your pipeline Start with a minimal pipeline template or an existing configuration. ![az-devops-image-config](./az-devops-image-config.png) ### Example pipeline Below is a complete example of a Terraform pipeline that validates and deploys Kestra resources. ```yaml trigger: branches: include: - main pool: name: test-pool stages: - stage: tfvalidate jobs: - job: deploy continueOnError: false steps: - task: TerraformInstaller@1 inputs: terraformVersion: 'latest' - task: TerraformTaskV4@4 inputs: provider: 'aws' command: 'init' backendServiceAWS: 'aws_s3' backendAWSBucketName: 'eu-north-1' backendAWSKey: 'kestra-tf' - task: TerraformTaskV4@4 inputs: provider: 'aws' command: 'validate' - task: TerraformTaskV4@4 inputs: provider: 'aws' command: 'apply' environmentServiceNameAWS: 'aws_s3' ``` ### How it works - The pipeline runs automatically whenever the **`main`** branch is updated (for example, after merging a pull request). - The **pool** defines the agent that runs your pipeline. For setup details, refer to the [Azure DevOps documentation](https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/pools-queues?view=azure-devops&tabs=yaml,browser). - The **Terraform extension** manages installation, validation, and deployment of Terraform resources. 
To install the Terraform extension, navigate to **Organization Settings → Extensions**, then browse the Marketplace to install it. ![image-terraform](./az-devops-image-terraform.png) ### Task breakdown This pipeline includes one installation step and three Terraform tasks: 1. **Install Terraform** — The `TerraformInstaller@1` task installs Terraform at runtime. 2. **Initialize Terraform** — The first `TerraformTaskV4@4` runs the `init` command. In this example, the backend uses an AWS S3 bucket, but you can use Azure RM, AWS, or GCP. 3. **Validate configuration** — The second task runs the `validate` command to ensure the configuration is correct. 4. **Apply changes** — The final task executes the `apply` command to deploy your Terraform-managed resources. ![image-green-pipeline](./az-devops-image-green-pipleine.png) --- For more details, refer to the [Kestra Terraform provider documentation](../../../13.terraform/index.mdx). --- # Bitbucket Pipes for Kestra: Build and Deploy Flows URL: https://kestra.io/docs/version-control-cicd/cicd/bitbucket-pipes > Use Bitbucket Pipes to streamline the build and deployment process of your Kestra flows from Bitbucket repositories. How to use Bitbucket Pipes to create a CI/CD pipeline for your Kestra flows. ## Automate Kestra deployments with Bitbucket Pipes With the Kestra Docker image and CLI, you can validate and deploy flows from Bitbucket repositories through [Bitbucket Pipes](https://support.atlassian.com/bitbucket-cloud/docs/configure-your-first-pipeline/). :::alert{type="info"} For flows managed via CI/CD, add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. 
This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: Here is a basic pipeline: ```yaml image: kestra/kestra pipelines: default: - step: name: 'Validate Kestra flows' deployment: staging script: - /bin/sh /app/kestra flow validate flows/ --server $SERVER --tenant $TENANT --user $KESTRA_USER:$KESTRA_PASSWORD - step: name: 'Deploy Kestra flows' deployment: production script: - echo $SERVER - echo $KESTRA_USER - echo $KESTRA_PASSWORD - /bin/sh /app/kestra flow namespace update dev flows/ --server=$SERVER --tenant=$TENANT --user=$KESTRA_USER:$KESTRA_PASSWORD ``` Variables such as `$SERVER`, `$KESTRA_USER`, `$KESTRA_PASSWORD`, and optionally `$TENANT` (for multi-tenant environments) are set in the Bitbucket variable configuration: ![Bitbucket Pipes Variable](./bitbucket_pipe_variable.png) :::alert{type="info"} If you're using Kestra Enterprise Edition, you can replace ``--user $KESTRA_USER:$KESTRA_PASSWORD`` with the `--api-token` option to authenticate with a service account API token. ::: This example uses the Kestra CLI to: 1. Validate flows contained in the `flows/` directory of the repository. 2. Deploy flows into the `dev` namespace of your Kestra instance. --- # GitHub Actions for Kestra – CI/CD Workflow Examples URL: https://kestra.io/docs/version-control-cicd/cicd/github-action > Automate Kestra flow validation and deployment directly from your GitHub repository using official Kestra GitHub Actions. Use GitHub Actions to automate the validation and deployment of your Kestra flows and namespace files. 
## Automate Kestra deployments with GitHub Actions Kestra provides three official [GitHub Actions](https://github.com/features/actions) enabling you to build robust CI/CD pipelines in your GitHub repository: - **Validate your flows** - **Deploy your flows** - **Deploy namespace files** To use these Actions, your Kestra instance must be reachable by the GitHub Actions runner—either publicly or via a self-hosted runner. If you need to validate flows offline — without connecting to a running Kestra instance — use the legacy marketplace action instead: [kestra-validate-action](https://github.com/marketplace/actions/kestra-validate-action). :::alert{type="info"} For flows managed through CI/CD, add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` :::
## Official Kestra Actions Kestra provides these three Actions for CI/CD pipelines: - [`kestra-io/github-actions/validate-flows`](https://github.com/kestra-io/github-actions/tree/main/validate-flows): Validate a folder of flows before deployment. - [`kestra-io/github-actions/deploy-flows`](https://github.com/kestra-io/github-actions/tree/main/deploy-flows): Deploy a folder of flows to your Kestra server. - [`kestra-io/github-actions/deploy-namespace-files`](https://github.com/kestra-io/github-actions/tree/main/deploy-namespace-files): Deploy namespace files to your Kestra server. --- ## Input reference ### Validate Flows Action inputs | Input | Required | Default | Description | |-------------|----------|----------|-------------| | `directory` | ❌ | `'./'` | Folder containing your flows (YAMLs). | | `server` | ✅ | — | URL of your Kestra server. | | `apiToken` | ❌ | — | API Token for authentication (Enterprise Edition only). | | `user` | ❌ | — | Basic auth username. | | `password` | ❌ | — | Basic auth password. | | `tenant` | ✅ | `"main"` | Tenant identifier (Enterprise Edition only, for multi-tenancy). | [(See action.yml)](https://github.com/kestra-io/github-actions/blob/main/validate-flows/action.yml) --- ### Deploy Flows Action inputs | Input | Required | Default | Description | |-------------|----------|----------|-------------| | `directory` | ❌ | `'./'` | Folder containing your flows (YAMLs). | | `namespace` | ❌ | — | Namespace to deploy flows to (optional). If omitted, each flow uses the namespace defined in its YAML. | | `override` | ❌ | `'false'`| If `true`, override existing flows. | | `server` | ✅ | — | URL of your Kestra server. | | `apiToken` | ❌ | — | API Token for authentication (EE only). | | `user` | ❌ | — | Basic auth username. | | `password` | ❌ | — | Basic auth password. | | `tenant` | ✅ | `"main"` | Tenant identifier (Enterprise Edition only, for multi-tenancy). 
| [(See action.yml)](https://github.com/kestra-io/github-actions/blob/main/deploy-flows/action.yml) --- ### Deploy Namespace Files Action inputs | Input | Required | Default | Description | |----------------|----------|----------|-------------| | `localPath` | ❌ | `'./'` | Path to your local file or directory for upload. | | `namespacePath`| ✅ | — | Remote namespace path to deploy files to (if uploading a file, must match a file path). | | `namespace` | ✅ | — | Namespace to deploy files to. | | `override` | ❌ | `'false'`| If `true`, override existing files. | | `server` | ✅ | — | URL of your Kestra server. | | `apiToken` | ❌ | — | API Token for authentication (EE only). | | `user` | ❌ | — | Basic auth username. | | `password` | ❌ | — | Basic auth password. | | `tenant` | ✅ | `"main"` | Tenant identifier (Enterprise Edition only, for multi-tenancy). | [(See action.yml)](https://github.com/kestra-io/github-actions/blob/main/deploy-namespace-files/action.yml) --- ## Example workflow A sample CI/CD workflow that validates and deploys flows with the new Kestra Actions: ```yaml name: Kestra CI/CD on: [push] jobs: validate: runs-on: ubuntu-latest steps: - name: Checkout repository content uses: actions/checkout@v4 - name: Validate flows uses: kestra-io/github-actions/validate-flows@main with: directory: ./kestra/flows server: ${{ secrets.KESTRA_HOSTNAME }} # Optional: uncomment for Enterprise Edition # apiToken: ${{ secrets.KESTRA_API_TOKEN }} deploy: runs-on: ubuntu-latest needs: validate steps: - name: Checkout repository content uses: actions/checkout@v4 - name: Deploy product flows uses: kestra-io/github-actions/deploy-flows@main with: directory: ./kestra/flows/product namespace: product server: ${{ secrets.KESTRA_HOSTNAME }} - name: Deploy engineering flows uses: kestra-io/github-actions/deploy-flows@main with: directory: ./kestra/flows/engineering namespace: engineering server: ${{ secrets.KESTRA_HOSTNAME }} # Example: Deploy namespace files upload_nsfiles: runs-on: 
ubuntu-latest steps: - name: Checkout repository content uses: actions/checkout@v4 - name: Upload config YAML to engineering namespace uses: kestra-io/github-actions/deploy-namespace-files@main with: localPath: ./config/eng.yaml namespace: engineering namespacePath: config/eng.yaml server: ${{ secrets.KESTRA_HOSTNAME }} ``` :::alert{type="info"} **Tips:** - Store your Kestra server credentials and tokens as GitHub Secrets for security. - All actions invoke the Kestra CLI via a managed binary—no manual CLI install needed. - You can authenticate using either an **API token** (Enterprise Edition) or basic auth credentials (username and password): ```yaml # Using API token (EE) with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} --- # Using basic auth with: server: ${{ secrets.KESTRA_HOSTNAME }} user: ${{ secrets.KESTRA_USERNAME }} password: ${{ secrets.KESTRA_PASSWORD }} ``` - When using Enterprise features, provide `apiToken` and/or set `tenant` as needed. ::: ### Deploy to multiple namespaces You can target multiple namespaces in one workflow either by: - Letting each flow keep its own `namespace` value (omit the `namespace` input). - Running the action multiple times with different `namespace` inputs. 
```yaml name: Kestra Deploy Across Namespaces on: [push, workflow_dispatch] jobs: deploy-default-namespaces: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-flows@main with: server: https://kafka-ee.preview.dev.kestra.io apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant directory: ./flows # Flows keep their own namespace values override: true deploy-to-other-namespace: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-flows@main with: server: https://kafka-ee.preview.dev.kestra.io apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant directory: ./flows namespace: company.team # Force all flows to this namespace override: true ``` The same pattern applies to namespace files: run `deploy-namespace-files` once per target namespace, or keep the authored paths by setting `namespacePath` to match `localPath` instead of remapping it. ### Examples by use case **Validate flows (single file or folder)** ```yaml jobs: validate-single-flow: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/validate-flows@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant directory: ./flows/my-log-flow.yml # Single file validate-folder: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/validate-flows@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant directory: ./flows # Folder path of files ``` **Deploy flows (use authored namespaces vs override)** ```yaml jobs: deploy-authored-namespaces: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-flows@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant directory: ./flows # Each flow keeps its namespace override: true deploy-to-specific-namespace: runs-on: ubuntu-latest 
steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-flows@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant directory: ./flows namespace: company.team # Force all flows to a specific namespace override: true ``` **Deploy namespace files (single file, folder, folder to custom path)** ```yaml jobs: upload-single-file: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-namespace-files@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant localPath: ./nsfiles/file1.txt namespacePath: single/file1.txt namespace: my-namespace override: true upload-folder: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-namespace-files@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant localPath: ./nsfiles # Upload entire folder namespacePath: ./nsfiles namespace: my-namespace override: true upload-folder-to-custom-path: runs-on: ubuntu-latest steps: - uses: actions/checkout@v5 - uses: kestra-io/github-actions/deploy-namespace-files@main with: server: ${{ secrets.KESTRA_HOSTNAME }} apiToken: ${{ secrets.KESTRA_API_TOKEN }} tenant: my-tenant localPath: ./nsfiles namespacePath: myFiles # Remap destination path namespace: my-namespace override: true ``` --- ## Additional resources - [How-to Guide: GitHub Actions CI/CD](../../../15.how-to-guides/github-actions/index.md) — More examples and advanced workflows. 
- [Kestra Validate Flows Action](https://github.com/kestra-io/github-actions/tree/main/validate-flows) - [Kestra Deploy Flows Action](https://github.com/kestra-io/github-actions/tree/main/deploy-flows) - [Kestra Deploy Namespace Files Action](https://github.com/kestra-io/github-actions/tree/main/deploy-namespace-files) --- # GitLab CI for Kestra: Automate Flow Validation URL: https://kestra.io/docs/version-control-cicd/cicd/gitlab > Configure GitLab CI pipelines to automatically validate and deploy Kestra flows and resources to your Kestra instance. Use GitLab CI to automate the validation and deployment of your Kestra flows. ## Automate Kestra deployments with GitLab CI [GitLab CI](https://docs.gitlab.com/ee/ci/) lets you define pipelines in a `.gitlab-ci.yml` file to automate tests, builds, and deployments. With Kestra, you can validate and deploy flows directly from your pipeline using the Kestra CLI. :::alert{type="info"} For flows managed through CI/CD, add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: :::alert{type="info"} Your GitLab runner must be able to reach your Kestra instance. - If your Kestra server is **publicly accessible**, a shared GitLab runner is sufficient. - If your Kestra server is **private**, use a **self-hosted runner** with network access to Kestra. ::: --- ## Example pipeline The example below defines two stages — `validate` and `deploy` — and runs the Kestra CLI in an official Kestra image. Validation runs first; if it succeeds, the deploy stage updates a target namespace with the flows from your repository. 
```yaml stages: - validate - deploy default: image: name: kestra/kestra:latest entrypoint: [""] variables: KESTRA_HOST: https://kestra.io/ validate: stage: validate # Validate flows server-side script: - /app/kestra flow validate ./kestra/flows --server ${KESTRA_HOST} --api-token $KESTRA_API_TOKEN deploy: stage: deploy script: - /app/kestra flow namespace update my_namespace ./kestra/flows/prod --server ${KESTRA_HOST} --api-token $KESTRA_API_TOKEN ``` :::alert{type="info"} **Authentication options:** - Use the `--api-token` flag (as shown). - Use basic auth with `--user=USERNAME:PASSWORD` if API tokens are not available. ::: --- ## Tips - Pin the Docker image (e.g., `kestra/kestra:1.0.x`) to avoid unexpected CLI changes. --- # CI/CD Helpers in Kestra: Local Dev and Read-Only Flows URL: https://kestra.io/docs/version-control-cicd/cicd/helpers > Simplify local flow development and validation with Kestra CI/CD helpers for expanding inclusions and managing read-only flows. Kestra provides a set of **helper functions** designed to make local flow development easier — especially when working with large or modular flows. ## Simplify local flow development with helpers :::alert{type="info"} If helpers feed into flows deployed via CI/CD, add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so those flows remain immutable in the UI. This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: :::alert{type="warning"} Helpers are only available during **local flow development**. Before deploying your flows to a Kestra server, you must expand them first. CI/CD pipelines automatically handle this expansion process. Helpers **cannot be used directly from the Kestra UI**. ::: --- ## Expanding flows before upload Use the `flow validate` command to validate and expand your flow locally. This command will output the expanded version of your flow, resolving any helper references. 
```bash ./kestra flow validate path-to-your-flow.yaml ``` --- ## `[[> file.txt]]`: Include external files When working on large flows, inlining long scripts or SQL statements can make maintenance difficult. The **include helper** lets you reference external files inside your flow YAML, keeping it clean and modular. ### Example #### Without helper ```yaml id: include namespace: company.team tasks: - id: t1 type: io.kestra.plugin.core.debug.Return format: | Lorem Ipsum is simply dummy text of the printing ..... 500 lines later ``` #### With helper ```yaml id: include namespace: company.team tasks: - id: t1 type: io.kestra.plugin.core.debug.Return format: "[[> lorem.txt]]" ``` Then, create a local file named `lorem.txt` containing your text. --- ### Supported path formats | Format | Description | |--------|--------------| | `[[> lorem.txt]]` | Relative path from the flow file (both in the same directory). | | `[[> /path/to/lorem.txt]]` | Absolute path. | | `[[> path/to/lorem.txt]]` | Relative path from the flow directory (e.g., `flow.yaml` in parent folder). | When including a file, ensure you use the correct YAML scalar style — a plain or quoted scalar for single-line values, or a block scalar (`|` literal or `>` folded) for multiline content. :::alert{type="warning"} Includes are **resolved recursively**, meaning included files can themselves contain additional includes. Be careful: if your included file needs to display `[[ ... ]]` literally, escape it as `\[[ ... ]]`. ::: --- ## Local flow validation To validate your flow locally (especially if it uses helpers), use the `--local` flag. This ensures validation runs with the same plugins as your local environment. ```bash ./kestra flow validate --local path-to-your-flow.yaml ``` :::alert{type="info"} Flows using helper functions **must** be validated locally since the expansion process cannot run on the Kestra webserver. 
::: --- ## Expand includes To explicitly expand your flow (resolving includes and helpers) without validation, use: ```bash ./kestra flow expand path-to-your-flow.yaml ``` This command outputs a version of your flow ready for upload to the Kestra server. --- --- # Kubernetes Operator in Kestra: GitOps for Flows URL: https://kestra.io/docs/version-control-cicd/cicd/kubernetes-operator > Manage Kestra resources declaratively using the Kestra Kubernetes Operator for GitOps-style flow and configuration management. How to use the Kestra Kubernetes Operator to provision and manage changes to Kestra resources, including flows, namespace files, and key-value store entries. :::alert{type="warning"} The Kestra Kubernetes Operator is no longer maintained. It will not receive further updates or bug fixes. For declarative flow management, consider using the [Terraform provider](../03.terraform/index.md) or the [Kestra CLI](../../../kestra-cli/index.mdx) as alternatives. ::: ## Manage Kestra with the Kubernetes Operator :::alert{type="info"} When you deploy flows through GitOps or CI/CD (including the operator), add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: This feature requires the [Enterprise Edition](../../../07.enterprise/index.mdx). A **Kubernetes operator** is an application-specific controller that extends the functionality of the Kubernetes API to create, configure, and manage instances of applications or their components on behalf of a Kubernetes user. It is a custom Kubernetes controller that uses custom resources (CR). To define and manage these components, operators use Custom Resource Definitions (CRDs). CRDs allow you to extend the Kubernetes API with new resource types that are specific to your application or service. 
The Kestra Kubernetes Operator manages Kestra flows, namespace files, and key-value store entries as Kubernetes resources. ## Installing the Kestra Kubernetes Operator We provide a Helm chart to install Kestra in Kubernetes; see the [installation guide](../../../02.installation/03.kubernetes/index.md). The [Kestra Operator](https://github.com/kestra-io/kestra/tree/develop/charts/kestra-operator) can be installed with the `kestra-operator` chart. To install the chart with the release name `my-kestra-operator` use: ```bash $ helm repo add kestra https://helm.kestra.io/ $ helm install my-kestra-operator kestra/kestra-operator --version 1.0.0 ``` This chart can also deploy the Kestra Kubernetes Operator in your cluster. :::alert{type="info"} The operator automatically creates and updates Kestra CRDs, so it requires Kubernetes RBAC (service account plus cluster-wide roles) that the Helm chart provisions for you. Contact us if you have concerns or run into issues applying it to your cluster. ::: Because the operator calls the Kestra API, you must provide credentials — either a [service account](../../../07.enterprise/03.auth/service-accounts/index.md) or an [API token](../../../07.enterprise/03.auth/api-tokens/index.md)—if authentication is enabled. To install the Kestra Kubernetes Operator inside your cluster, you need to configure the following properties in your Helm values: ```yaml operator: enabled: true apiKey: ``` If you prefer to use a service account, please configure the following properties instead: ```yaml operator: enabled: true basicAuth: ``` Then run `helm install` or `helm upgrade` to roll out the changes to your cluster. If everything goes well, you will see a `kestra-operator` pod running. 
```plaintext kubectl get po NAME READY STATUS RESTARTS AGE kestra-operator-7d7bdbd846-pzpl2 1/1 Running 0 158m kestra-postgresql-0 1/1 Running 1 (2d23h ago) 3d kestra-standalone-677474499f-4r5ft 1/1 Running 2 (5h10m ago) 2d23h ``` ### Managing multiple operators in one cluster Each operator instance manages a single Kestra instance. If you run multiple Kestra deployments in the same Kubernetes cluster, deploy one operator per Kestra instance and scope each operator to the namespaces that will contain that instance’s custom resources. Configure the namespace watch list via `quarkus.operator-sdk.namespaces` (Helm chart values) or the `QUARKUS_OPERATOR_SDK_NAMESPACES` environment variable. Example snippets: ```yaml quarkus: operator-sdk: namespaces: - kestra-dev - kestra-prod ``` ```yaml kestraOperator: env: - name: QUARKUS_OPERATOR_SDK_NAMESPACES value: "kestra-dev,kestra-prod" ``` Deploying separate operator releases with different namespace lists ensures each instance reconciles only its own `KestraFlow`, `KestraKeyValue`, and `KestraNamespaceFile` resources. ## Manage Kestra resources via the operator The Kestra Kubernetes operator watches for three resource types in all namespaces: - `KestraFlow`, shortname **flow**. To manage [flows](../../../05.workflow-components/01.flow/index.md). - `KestraKeyValue`, shortnames **keyvalue** or **kv**. To manage [K/V store](../../../06.concepts/05.kv-store/index.md) entries. - `KestraNamespaceFile`, shortnames **namespacefile** or **nsfile**. To manage [Namespace files](../../../06.concepts/02.namespace-files/index.md). 
### Managing Flow resources Here is an example flow resource that you can create in a `hello-world.yml` file: ```yaml apiVersion: model.kestra.io/v1alpha1 kind: KestraFlow metadata: name: hello-world spec: id: hello-world namespace: company.team # This is a Kestra namespace, not a Kubernetes namespace source: | id: hello-world namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World ``` :::alert{type="info"} Note: set the flow `id` and `namespace` both in the resource spec and inside the flow source so updates are applied correctly. ::: You can then use standard `kubectl` commands to create, update, list, and delete your flows: ```shell ## Create or update the flow kubectl apply -f hello-world.yml ## List all flows kubectl get flow ## Get the 'hello-world' flow kubectl get flow hello-world ## Delete the 'hello-world' flow kubectl delete flow hello-world ``` ### Managing K/V entry resources Here is an example key-value entry resource that you can create in a `kv-1.yml` file: ```yaml apiVersion: model.kestra.io/v1alpha1 kind: KestraKeyValue metadata: name: kv-1 spec: namespace: company.team # This is a Kestra namespace, not a Kubernetes namespace key: key1 value: value1 ``` Use the same `kubectl` workflow to create, update, list, and delete your entries: ```shell ## Create or update the k/v entry kubectl apply -f kv-1.yml ## List all entries kubectl get kv ## Get the 'kv-1' k/v entry kubectl get kv kv-1 ## Delete the 'kv-1' k/v entry kubectl delete kv kv-1 ``` ### Managing Namespace File resources Here is an example namespace file resource that you can create in an `nsfile-1.yml` file: ```yaml apiVersion: model.kestra.io/v1alpha1 kind: KestraNamespaceFile metadata: name: nsfile-1 spec: namespace: company.team # This is a Kestra namespace, not a Kubernetes namespace filename: nsfile-1.txt content: Hello World ``` You can then use the standard `kubectl` commands to create, update, list, and delete your namespace files: ```shell ## Create or update the 
namespace file kubectl apply -f nsfile-1.yml ## List all namespace files kubectl get nsfile ## Get the 'nsfile-1' namespace file kubectl get nsfile nsfile-1 ## Delete the 'nsfile-1' namespace file kubectl delete nsfile nsfile-1 ``` --- # Terraform for Kestra – Manage Resources as Code URL: https://kestra.io/docs/version-control-cicd/cicd/terraform > Provision and manage Kestra resources like flows and namespaces as code using the official Kestra Terraform Provider. Use Terraform to provision, manage, and automate changes to Kestra resources. ## Manage Kestra resources with Terraform The [official Kestra Terraform Provider](https://registry.terraform.io/providers/kestra-io/kestra/latest) lets you manage Kestra resources as code. You can define flows, templates, and namespaces declaratively — Terraform will handle creation, updates, and deletions automatically. :::alert{type="info"} For flows managed through CI/CD or infrastructure-as-code, add the [`system.readOnly`](../../../06.concepts/system-labels/index.md#systemreadonly) label set to `"true"` so the UI editor is disabled and production configurations stay immutable. This is especially recommended for critical production flows: ```yaml labels: system.readOnly: true ``` ::: --- ## Multitenancy The Kestra Terraform provider supports **multitenancy**, allowing you to manage resources across multiple tenants from a single configuration. When configuring the provider, include the `tenant_id` parameter to specify the tenant you want to target. ```hcl provider "kestra" { tenant_id = "kestra-tech" url = "http://your-kestra-url:8080" } ``` --- ## Example configuration Start by creating a Terraform configuration file — typically named `provider.tf` or `main.tf` — to set up the Kestra provider. 
### **`provider.tf`** ```hcl provider "kestra" { # Required: Kestra server URL url = "http://localhost:8080" # Required for multitenant environments tenant_id = "kestra-tech" # Optional: basic authentication username = "john" password = "my-password" # Optional: API token (Enterprise Edition) api_token = "my-api-token" # Optional: JWT authentication jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Iktlc3RyYS5pbyIsImlhdCI6MTUxNjIzOTAyMn0.hm2VKztDJP7CUsI69Th6Y5NLEQrXx7OErLXay55GD5U" } ``` ### **`version.tf`** Define the provider source and version to ensure compatibility. ```hcl terraform { required_providers { kestra = { source = "kestra-io/kestra" version = "~> 0.18.1" } } } ``` ### **`flows.tf`** Create a flow resource referencing a YAML file containing your flow definition. ```hcl resource "kestra_flow" "flow_example" { namespace = "company.team" flow_id = "myflow" content = file("kestra/flows/my-flow.yml") } ``` ### **`templates.tf`** Define a template resource. Use the `depends_on` attribute to ensure flows are deployed first. ```hcl resource "kestra_template" "template_example" { namespace = "company.team" template_id = "my-template" content = file("kestra/templates/my-template.yml") depends_on = [kestra_flow.flow_example] } ``` --- ## Next steps - Review the full [Kestra Terraform Provider documentation](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs). - Use Terraform commands like `terraform validate`, `plan`, and `apply` to manage your Kestra infrastructure. - Integrate Terraform into your [CI/CD pipeline](../index.md) for automated deployments. --- # Version Control with Git: Sync, Push, and Clone Flows URL: https://kestra.io/docs/version-control-cicd/git > Learn patterns for versioning Kestra flows and namespace files with Git, including Sync, Push, and Clone strategies. Learn how to pair Kestra with Git so you can version flows, namespace files, and related artifacts alongside your application code. 
## Version flows and namespace files with Git
Kestra supports version control with Git. You can use one or more repositories to store your [flows](../../05.workflow-components/01.flow/index.md), [namespace files](../../06.concepts/02.namespace-files/index.md), [apps](../../07.enterprise/04.scalability/apps/index.md), [tests](../../07.enterprise/02.governance/unit-tests/index.md), and [dashboards](../../09.ui/00.dashboard/index.md), tracking changes through Git history. There are multiple ways to combine Kestra with Git: - [SyncFlows](/plugins/plugin-git/io.kestra.plugin.git.syncflows) implements GitOps with Git as the single source of truth for flows. - [SyncNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.syncnamespacefiles) syncs namespace files the same way. - [PushFlows](/plugins/plugin-git/io.kestra.plugin.git.pushflows) commits and pushes flow edits from the UI to Git, useful when you rely on the built-in editor but still want version history. - [PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles) does the same for namespace files. - [Clone](https://kestra.io/plugins/plugin-git/io.kestra.plugin.git.clone) clones a repository directly into a flow so scripts are available at runtime. - [TenantSync](/plugins/plugin-git/io.kestra.plugin.git.tenantsync) synchronizes all namespaces in a tenant, including flows, files, apps, tests, and dashboards. - [NamespaceSync](/plugins/plugin-git/io.kestra.plugin.git.namespacesync) keeps a single namespace in sync with a Git repo. - A custom [CI/CD](../cicd/index.md) pipeline lets you manage deployments yourself (GitHub Actions, Terraform, etc.) while keeping Git authoritative. The image below shows how to choose the right pattern based on your needs: ![git](./git.png) The following sections cover each pattern and when to use it. ## Git SyncFlows and SyncNamespaceFiles The [Git SyncFlows](/plugins/plugin-git/io.kestra.plugin.git.syncflows) pattern implements GitOps with Git as the single source of truth. 
Store flows in Git, and run a _system flow_ that automatically syncs changes into Kestra. The [Git SyncNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.syncnamespacefiles) pattern mirrors this for namespace files. Here's how it works: - Store flows and namespace files in Git. - Schedule a _system flow_ that syncs changes from Git to Kestra. - Modify files in Git whenever you need to change a flow or namespace file. - The system flow syncs those changes, overwriting any conflicting UI edits with the Git version. This pattern suits teams that treat Git as the single source of truth and prefer not to edit flows or namespace files in the UI. No CI/CD pipeline is required, so it's ideal if you already follow GitOps practices or come from a Kubernetes background. Here is an example system flow that you can use to declaratively sync changes from Git to Kestra: ```yaml id: sync_from_git namespace: system tasks: - id: git type: io.kestra.plugin.git.SyncFlows url: https://github.com/kestra/scripts branch: main username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" targetNamespace: git includeChildNamespaces: true # optional; by default, it's set to false to allow explicit definition gitDirectory: your_git_dir triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/1 * * * *" # every minute ``` Commit this flow to Git or add it via the built-in editor; it won't be overwritten by reconciliation. 
You can also sync namespace files with the example below: ```yaml id: sync_from_git namespace: system tasks: - id: git type: io.kestra.plugin.git.SyncNamespaceFiles namespace: prod gitDirectory: _files # optional; set to _files by default url: https://github.com/kestra-io/flows branch: main username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" ``` You can also trigger this flow with a [GitHub webhook](../../05.workflow-components/07.triggers/03.webhook-trigger/index.md) whenever changes land in Git: ```yaml id: sync_from_git namespace: system tasks: - id: git type: io.kestra.plugin.git.SyncFlows url: https://github.com/kestra/scripts branch: main targetNamespace: git username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" triggers: - id: github_webhook type: io.kestra.plugin.core.trigger.Webhook key: "{{ secret('WEBHOOK_KEY') }}" ``` The webhook key authenticates requests and prevents unauthorized access. For the flow above, paste the following URL into your repository’s **Webhooks** settings: ```bash http://your_kestra_host:8080/api/v1/main/executions/webhook/system/sync_from_git/your_secret_key ``` ![github_webhook](./github_webhook.png) This URL follows the pattern: ```bash http://<kestra_host>/api/v1/<tenant>/executions/webhook/<namespace>/<flow_id>/<webhook_key> ``` ## CI/CD The CI/CD pattern still treats Git as the single source of truth but pushes code changes to Kestra whenever a pull request merges. Unlike the Sync pattern, you manage the automation (GitHub Actions, Terraform, etc.). See the [CI/CD](../cicd/index.md) docs for setup details. ## Git PushFlows and PushNamespaceFiles The [Git PushFlows](/plugins/plugin-git/io.kestra.plugin.git.pushflows) pattern lets you edit flows in the UI while pushing versions to Git. The [Git PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles) pattern offers the same workflow for namespace files. 
Example flow for pushing from Kestra to Git: ```yaml id: push_to_git namespace: system tasks: - id: commit_and_push type: io.kestra.plugin.git.PushFlows url: https://github.com/kestra-io/scripts sourceNamespace: dev targetNamespace: prod flows: "*" branch: kestra username: github_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" commitMessage: add flow changes triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 * * * *" # every hour ``` Example flow for pushing namespace files: ```yaml id: push_to_git namespace: system tasks: - id: commit_and_push type: io.kestra.plugin.git.PushNamespaceFiles namespace: dev files: "*" gitDirectory: _files url: https://github.com/kestra-io/scripts # required string username: git_username password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" branch: dev commitMessage: "add namespace files" triggers: - id: schedule_push_to_git type: io.kestra.plugin.core.trigger.Schedule cron: "*/15 * * * *" ``` Use this pattern to push to a feature branch and open a pull request for review. 
## Git Clone The [Git Clone](/plugins/plugin-git/io.kestra.plugin.git.clone) pattern clones a repository at runtime so you can orchestrate code managed elsewhere, for example: - dbt projects via the [dbt CLI task](/plugins/plugin-dbt/dbt-cli/io.kestra.plugin.dbt.cli.dbtcli) - Infrastructure deployments via [Terraform CLI](/plugins/plugin-terraform/cli/io.kestra.plugin.terraform.cli.terraformcli), [OpenTofu CLI](/plugins/plugin-opentofu/cli/io.kestra.plugin.opentofu.cli.opentofucli), [Terragrunt CLI](/plugins/plugin-terragrunt/cli/io.kestra.plugin.terragrunt.cli.terragruntcli), or [Ansible CLI](/plugins/plugin-ansible/cli/io.kestra.plugin.ansible.cli.ansiblecli) - Docker builds via the [Docker Build task](/plugins/plugin-docker/io.kestra.plugin.docker.build) ## Git TenantSync and NamespaceSync Both [Git TenantSync](/plugins/plugin-git/io.kestra.plugin.git.tenantsync) and [Git NamespaceSync](/plugins/plugin-git/io.kestra.plugin.git.namespacesync) give you full control over synchronizing Kestra objects with your Git repository. - **`TenantSync`** – synchronizes **all namespaces** in a tenant, including flows, files, apps, tests, dashboards, and custom blueprints. - Requires `kestraUrl` and `auth` so the task can call Kestra's API with tenant-wide RBAC. - Useful when you need to back up the entire tenant to Git and promote environments through pull requests. - **`NamespaceSync`** – synchronizes objects within a **single namespace** with your Git repository. - Requires the `namespace` property but not `kestraUrl` or `auth`; it relies on namespace-level RBAC and can be run by any user with sufficient permissions. - Ideal for teams that sync one namespace per repository, allowing owners to manage their own syncs. Both plugins support: - `sourceOfTruth` (`GIT` or `KESTRA`) to define the update strategy. - `whenMissingInSource` with options `DELETE`, `KEEP`, or `FAIL` to control how missing objects should be handled. 
- An **opinionated folder structure** for flows, apps, dashboards, tests, and files with one folder per namespace (see [Git directory structure](#git-directory-structure) below). - `protectedNamespaces` to ensure your Kestra objects from critical namespaces (such as `system`) are not accidentally deleted when `sourceOfTruth` is `GIT`. - Validation rules requiring explicit Git `branch` and optional `gitDirectory`. - Options like `dryRun` and `onInvalidSyntax` for safe rollouts and error handling. Example usage of the `TenantSync` task: ```yaml id: tenant_git_sync namespace: system tasks: - id: tenant type: io.kestra.plugin.git.TenantSync sourceOfTruth: KESTRA whenMissingInSource: DELETE url: https://github.com/org/repo branch: main protectedNamespaces: - system kestraUrl: http://localhost:8080 auth: username: admin@kestra.io password: "{{ secret('KESTRA_PASSWORD') }}" ``` Example usage of the `NamespaceSync` task: ```yaml id: namespace_git_sync namespace: system tasks: - id: namespace type: io.kestra.plugin.git.NamespaceSync namespace: company.team sourceOfTruth: GIT whenMissingInSource: KEEP url: https://github.com/org/repo branch: main protectedNamespaces: - system ``` ### Git directory structure Both `TenantSync` and `NamespaceSync` expect a specific folder structure inside your Git repository. The optional `gitDirectory` property sets a base folder within the repo; if omitted the repo root is used. Under that base, Kestra uses a fixed layout organized by namespace and resource type: | Resource type | Path in Git | | --- | --- | | Flows | `<namespace>/flows/<flow_id>.yaml` | | Namespace files | `<namespace>/files/<file_path>` | | Apps | `<namespace>/apps/<app_id>.yaml` | | Unit tests | `<namespace>/tests/<test_id>.yaml` | | Dashboards | `_global/dashboards/<dashboard_id>.yaml` | | Custom blueprints | `_global/blueprints/<blueprint_id>.yaml` | If you set `gitDirectory: monorepo`, the full path for a flow in the `company.team` namespace becomes `monorepo/company.team/flows/my-flow.yaml`. 
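Putting the layout together, a repository using `gitDirectory: monorepo` with a single `company.team` namespace might look like the sketch below (all file names are illustrative):

```
monorepo/
├── company.team/
│   ├── flows/
│   │   └── my-flow.yaml
│   ├── files/
│   │   └── scripts/run.sh
│   ├── apps/
│   │   └── my-app.yaml
│   └── tests/
│       └── my-flow-test.yaml
└── _global/
    ├── dashboards/
    │   └── ops-overview.yaml
    └── blueprints/
        └── starter.yaml
```

Namespace-scoped resources live under one folder per Kestra namespace, while tenant-wide dashboards and custom blueprints sit under `_global`.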
#### How resource identity works

**The filename stem is the resource ID.** For flows, apps, unit tests, and dashboards, the part of the filename before `.yaml` is used as the object's ID during sync — not the `id` field written inside the YAML. Custom blueprints are an exception: the sync reads the `id` field from the YAML content and falls back to the filename stem only when the field is absent. Namespace files use their full relative path under `<namespace>/files/` as the file path identity.

This has an important consequence: **the filename must match the `id` inside the YAML**. When Kestra pushes objects from the UI to Git (for example with `sourceOfTruth: KESTRA`), it generates filenames from the object's ID automatically. If you later rename a file in Git, the sync treats the old filename as a deleted object and the new filename as a new object. When it tries to create the new object, Kestra rejects it because a resource with that ID already exists in the instance — resulting in an error like:

```
Invalid entity: App already exists for id 'solutions_ai_search_annual_report'
```

To avoid this error, keep filenames in sync with the `id` field inside each YAML. If you need to rename a file, also update the `id` inside the YAML at the same time.

#### Handling mismatched filenames with `onInvalidSyntax`

If you encounter files whose names do not match the expected ID (for example, after a manual rename), you can control how the sync reacts using the `onInvalidSyntax` property:

| Value | Behavior |
| --- | --- |
| `FAIL` (default) | Throws an exception and stops the sync |
| `WARN` | Logs a warning and continues |
| `SKIP` | Logs an info message and continues |

Use `WARN` or `SKIP` as a short-term workaround while you correct the naming in Git:

```yaml
tasks:
  - id: sync
    type: io.kestra.plugin.git.TenantSync
    sourceOfTruth: GIT
    onInvalidSyntax: WARN
    # ... other properties
```

---

# Why Kestra: Simpler, More Powerful Orchestration

URL: https://kestra.io/docs/why-kestra

> Discover why teams choose Kestra for orchestration. A declarative, event-driven platform that scales from simple automations to complex enterprise pipelines.

## How We See the Orchestration and Automation Market

Most orchestration solutions fall into one of two extremes. On one side are code-centric frameworks like Apache Airflow, which can offer flexibility but come with a steep learning curve and large operational overhead. Before you can run your first workflow, you have to set up cumbersome infrastructure, learn complex frameworks, and deal with confusing deployment patterns.

On the other side, drag-and-drop automation tools like Zapier are easier to start with. They come with prebuilt integrations, there's no heavy infrastructure to manage, and they require no coding skills. But as soon as you need something custom (like running a Python script in a container), these tools hit a wall.

This creates a trade-off, forcing organizations to choose between flexible but complex developer tools and simple but inflexible drag-and-drop automation platforms. If you pick a code-first approach, you have to invest significant engineering resources to maintain the codebase, infrastructure, and deployment processes. If you pick a no-code tool, you outgrow it fast and start building shadow IT with workarounds. Mixing both can create chaos and confusion — each team builds their own silos, and soon nobody knows which workflows run where.

Kestra bridges this market gap. It combines the flexibility of code-based orchestration with a no-code interface that anyone can learn in minutes. This means your teams can start simple and scale up to complex distributed pipelines — all within a single, unified platform.

## Meet Kestra: A Simple Orchestration Platform for Everyone

Kestra combines full-code, low-code, and no-code in one place.
It’s simple enough for non-developers to start building basic workflows, yet powerful enough to handle massive data pipelines or distributed processes.

Kestra was built to provide a single source of truth for the entire business. You can schedule a small daily script or orchestrate a multi-step, event-driven pipeline. In both cases, you don’t need separate tools or complex infrastructure.

## What Is Kestra?

Kestra is an open-source orchestration platform that:

- Lets you define workflows declaratively in YAML
- Allows non-developers to automate tasks with a no-code interface
- Keeps everything versioned and governed, so it stays secure and auditable
- Extends easily for custom use cases through plugins and custom scripts

Kestra follows a “start simple and grow as needed” philosophy. You can schedule a basic workflow in a few minutes, then later add Python scripts, Docker containers, or complicated branching logic if the situation calls for it.

## How Kestra Solves Common Problems

### 1. Focus on Business Logic, Not Plumbing

With Kestra, flows are written in simple YAML. You can use one of over 1,200 built-in plugins or create tasks in any language — Python, Node.js, Go, Rust, SQL, or even a Bash script running in a container. If you want to change a schedule or add a new trigger, you just update the flow configuration directly from the embedded code editor in the UI. You don’t need to redeploy your entire application or fiddle with a complicated framework.

### 2. Simple by Default, Complex When Needed

Kestra comes with many built-in plugins. You can automate many tasks without writing code. But if you need to orchestrate something custom — like an ingestion script packaged in Docker or a heavy data transformation in Spark — you can add it to your flow with minimal effort. There’s no ceiling that blocks advanced use cases.

### 3. One-Stop Shop for Automation

Everything lives in one place:

- **Built-in Code Editor**: Write or edit YAML in the browser with syntax validation, autocompletion, a live-updating dependency view, and integrated docs.
- **No-Code Forms**: Non-developers can create or adjust flows without having to set up complex development environments.
- **Version Control**: All flows have a revision history. You can roll back or compare changes side by side.
- **Blueprints**: Start with prebuilt examples or create custom blueprints to speed up repetitive tasks.

You decide whether to work in the UI or in your favorite IDE (using our VS Code or JetBrains extensions). That flexibility means less context-switching for developers and fewer barriers for everyone else.

### 4. Prebuilt Plugins and Painless Dependency Management

Kestra avoids the package dependency issues common in Python-based orchestrators, where you would have to manually “pip install” all the integrations you want to use, leading to dependency conflicts across teams. In contrast, when you install Kestra, you get immediate access to a broad ecosystem of plugins available out of the box. For custom code, each script task runs in its own container or task runner environment, so each of them can have its own set of libraries, eliminating any dependency conflicts.

### 5. Separation of Orchestration and Business Logic

With Kestra, you don’t have to turn your code into a special “task function” or adopt a specific programming style. If you already have a custom script that does some data processing, you can call it from Kestra as-is. This separation allows teams to update their business logic without the orchestrator getting in the way.

### 6. Bridging Developer and Non-Developer Worlds

Kestra isn’t just for engineers — business stakeholders can schedule flows, validate data quality, and manage key tasks directly from the UI.
There’s no need for an IDE or developer support for simple changes; they can write and run code in the built-in editor. This makes automation accessible across teams and removes unnecessary bottlenecks.

## Unique Advantages

### API-First Design

Everything in Kestra is driven by an API. Flows, tasks, logs, permissions — anything you do in the UI can also be done with an HTTP call. This opens up all sorts of automation possibilities. You can:

- Integrate Kestra with your internal applications or CI/CD pipelines
- Manage all Kestra resources via Terraform
- Build custom UIs
- Dynamically generate new workflow executions from anywhere

### Language-Agnostic, YAML-Based Configuration

You can orchestrate tasks in Python, R, Node.js, Rust, Go, or any other language. You don’t have to rewrite your code in a specific DSL or add special decorators. Kestra’s YAML config is a lightweight layer that sits on top of your existing scripts or containers.

```yaml
id: python
namespace: company.team

tasks:
  - id: script_from_git
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/scripts
        branch: main
      - id: container
        type: io.kestra.plugin.scripts.python.Commands
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        containerImage: ghcr.io/kestra-io/pydata:latest
        commands:
          - python etl/global_power_plant.py
```

### Scalable and Cloud-Native

Kestra can handle a few daily jobs on a single server or scale to millions of events in a multi-tenant environment. It can be deployed to any major cloud provider and on-prem data centers using the standalone binary, Docker containers, or official Helm Charts for Kubernetes deployments. High availability is baked in when using the Enterprise Edition, so there’s no single point of failure.
Large organizations can isolate business units in separate tenants or namespaces with dedicated workers, as well as storage and secrets backends, for better governance, compliance, and reliability.

### Simple Onboarding

Starting your first workflow in Kestra takes minutes. There’s no long setup that involves building Docker images or editing config files in ten different places. Many users noted that tools like Airflow are tough to set up or that no-code tools aren’t flexible enough for real-world tasks. Kestra brings the accessibility of no-code together with the power of code-first tools, all without the usual setup hurdles.

### Rich Governance and Security

The Enterprise Edition extends Kestra with the enterprise-grade security, scalability, and governance features required by organizations managing complex workflows across multiple teams or environments:

- **Access & Identity**: SSO + RBAC with Audit Logs for full traceability, plus invitations/SCIM to manage users at scale.
- **Secrets & Policy**: Bring your own secrets manager or use built-in storage, apply read-only secrets, and enforce allowed-plugin lists.
- **Isolation & Control**: Multi-tenant architecture with worker isolation and dedicated worker groups; a kill switch for safe pauses and maintenance mode with in-product announcements for change comms.
- **Change Safety**: Assets packaging for artifact lineage tracking, versioned plugins to pin dependencies, and flow unit tests to catch regressions before deploy.
- **Operations Visibility**: Cluster monitoring from the Instance dashboard keeps runtime health transparent.

### Clear Visibility into Dependencies

Kestra helps you stay organized with namespaces, labels, subflows, flow triggers, and event-driven orchestration. You can decouple processes but still see how they connect. This clarity makes it easier to diagnose issues and understand how data flows through the business.
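As an illustration of the event-driven dependencies mentioned above, a downstream flow can subscribe to another flow's completion with the core Flow trigger. This is a sketch: the condition type names reflect recent Kestra releases and are worth verifying against the plugin docs, and `upstream` is a placeholder flow ID:

```yaml
id: downstream_report
namespace: company.team

tasks:
  - id: notify
    type: io.kestra.plugin.core.log.Log
    message: Upstream pipeline finished, refreshing the report

triggers:
  - id: on_upstream_success
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      # Fire only for executions of the (hypothetical) upstream flow
      - type: io.kestra.plugin.core.condition.ExecutionFlow
        namespace: company.team
        flowId: upstream
      # ...and only when that execution ended successfully
      - type: io.kestra.plugin.core.condition.ExecutionStatus
        in:
          - SUCCESS
```

Because the dependency is declared on the downstream side, the upstream flow needs no knowledge of its consumers, which keeps teams decoupled while the dependency view still shows the connection.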
### No Vendor Lock-In

Since Kestra is open-source and self-hosted, you retain full control over your environment and your data. Even if you use Kestra Enterprise or Kestra Cloud, you’re still running the same open-source core under the hood. You’re not tied to a proprietary system that might change or disappear. You can host Kestra anywhere, export all flows with one click, and even contribute new features back to the community.

## Comparing Kestra to Other Tools

**Python-Focused Orchestrators (Airflow, Prefect, Dagster)**: Great for Python shops, but they create a barrier for non-developers, and you need to rewrite your codebase to match their framework's DSL. Any change to your workflow requires redeploying code, leading to large operational overhead and slow feedback loops. You need dedicated engineering resources to manage complicated infrastructure and CI/CD processes. Kestra's lightweight YAML approach bypasses those issues, allowing you to make changes right from the UI while keeping everything version-controlled automatically.

**No-Code Solutions (Zapier, n8n)**: They’re useful for basic automations, but they fall short on complex processes (like containerized jobs, large data pipelines, or advanced orchestration logic). Kestra maintains the same simplicity but adds the power to scale.

**Workflow Engines for Microservices (Temporal, Camunda)**: These can excel in specialized use cases (transactional microservices, BPMN-based processes) but may be too heavy or too dev-centric for broad company-wide adoption. Kestra aims to support multiple personas and simpler day-to-day automation tasks while still allowing advanced patterns for complex workflows.

## Common Use Cases for Kestra

- **Data Pipelines and ETL/ELT**: Orchestrate batch and real-time data processing jobs, load data from multiple sources, run dbt transformations, and scale computation for custom scripts with task runners.
- **Process Automation**: Empower non-engineers to automate routine tasks and complex business-critical processes with human-in-the-loop manual approval, intuitive UI forms, and user-facing Apps.
- **Microservice Coordination**: Trigger workflows based on events, integrate with message brokers or REST APIs, monitor long-running processes, and call containerized jobs in any language.
- **Generative AI Workflows**: Orchestrate LLM-powered tasks and build custom AI agents. Trigger AI steps in response to any business event and use Kestra’s Pause & Resume functionality to let humans validate AI-generated outputs.
- **IT Automation**: Automate resource requests for infrastructure provisioning across AWS, GCP, or on-prem. Orchestrate build processes with plugins for Terraform, Ansible, or Docker and simplify DevOps processes from a single orchestrator.
- **Cross-Team Collaboration**: Let analytics, finance, marketing, and engineering automate work in the same platform, each at their own comfort level.
- **Custom Applications**: Use Kestra as a backend workflow engine for your internal tools, SaaS products, or customer-facing applications.

## Outcomes Kestra Delivers for Our Users

- **Shorter Time-to-Value**: You can build, test, and deploy new workflows in hours or days, not weeks or months.
- **Greater Operational Efficiency**: Non-technical users handle many tasks themselves, freeing engineers to tackle other projects.
- **Clarity and Structure**: Kestra provides visibility into dependencies across teams, data sources, and environments.
- **Single Pane of Glass**: Stop juggling multiple orchestration tools. With Kestra, you get a unified platform to automate everything from simple scheduled jobs to large-scale mission-critical data pipelines.
## Our Vision: Orchestrate Everything, Everywhere

We believe in a future where a single orchestration platform covers all use cases, from small scripts to complex enterprise processes — without forcing you into a single language or framework. Kestra is designed to be:

- **The simplest orchestration app for both developers and non-developers**.
- **Equally at home orchestrating data pipelines, business processes, or microservices**.
- **Flexible enough to integrate with any technology stack, any scale, anywhere**.

## Try Kestra and See it in Action

Kestra’s goal is to remove the barriers that keep orchestration locked away in dev-centric tools or limited no-code apps. Thanks to a language-agnostic, API-first design, Kestra creates a place where everyone can automate and scale mission-critical workflows without wrestling with complex frameworks or feeling boxed in by rigid no-code solutions.

If you’re tired of the old trade-offs — heavy code frameworks vs. limited no-code apps — Kestra stands ready to help. Get started by installing our forever-free open-source edition. You get to keep your favorite languages, empower non-technical teams, and orchestrate everything from small daily tasks to multi-stage, event-driven pipelines.

If you have questions or want to see how Kestra fits into your environment, [Book a Demo](https://kestra.io/demo) or [join our Slack community](http://kestra.io/slack). We’re happy to discuss your specific use case and help you succeed in your orchestration and automation journey.

---

# Workflow Components in Kestra: Complete Reference

URL: https://kestra.io/docs/workflow-components

> Comprehensive reference guide to Kestra workflow components. Explore flows, tasks, triggers, inputs, outputs, and more to build powerful orchestration logic.

import ChildCard from "~/components/docs/ChildCard.astro"

Get to know the main orchestration components of a Kestra workflow.

## Core Workflow Components

---

# afterExecution Tasks in Kestra – Post-Run Actions

URL: https://kestra.io/docs/workflow-components/afterexecution

> Trigger actions after flow completion with afterExecution tasks. Run logic based on final execution status (Success/Failed) for notifications or reporting.

Run tasks after a flow execution completes. `afterExecution` tasks run once a flow has finished, allowing you to act on the final execution status.
## `afterExecution` property

`afterExecution` is a block of tasks that run after the flow ends. You can use it to run conditional tasks based on the final state, such as **SUCCESS** or **FAILED**. This is especially useful for custom notifications and alerts. For example, you can combine `afterExecution` with the `runIf` property to send different Slack messages depending on the execution state.

```yaml
id: alerts_demo
namespace: company.team

tasks:
  - id: fail
    type: io.kestra.plugin.core.execution.Fail

afterExecution:
  - id: onSuccess
    runIf: "{{execution.state == 'SUCCESS'}}"
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: https://hooks.slack.com/services/xxxxx
    messageText: "{{flow.namespace}}.{{flow.id}} finished successfully!"
  - id: onFailure
    runIf: "{{execution.state == 'FAILED'}}"
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: https://hooks.slack.com/services/xxxxx
    messageText: "Oh no, {{flow.namespace}}.{{flow.id}} failed!!!"
```

## `afterExecution` vs `errors`

Both constructs are useful for notifications and follow-up actions, but they run at different moments.

- `errors` runs when a task or flow errors and is primarily for failure handling.
- `afterExecution` runs only after the execution reaches its final state.

For failure-specific handling, including local handlers inside flowable tasks, see the [`errors` documentation](../11.errors/index.md).

Choose `afterExecution` when you need to branch on the final status of the whole execution, for example to send one message for `SUCCESS`, another for `FAILED`, and a third for `WARNING`. Choose `errors` when you only care about failure handling or when you need local error handling inside a specific flowable task.

Pros of `afterExecution`:

- It works naturally with final states such as `SUCCESS`, `FAILED`, and `WARNING`.
- It keeps all post-run outcome logic in one place.
- It is well suited for final notifications, reporting, and auditing tasks.
Cons of `afterExecution`:

- It cannot be scoped locally to a flowable task the way `errors` can.
- Errors inside `afterExecution` do not change the final execution state.

Any errors in the `afterExecution` block will not change the state of the flow from `SUCCESS` to `FAILED`, and they will not trigger a flow that relies on `ExecutionStatus = FAILED`. You can force a state change by using a [Sequential flowable task](../01.tasks/00.flowable-tasks/index.md#sequential) with an `errors` block, as in the example below:

```yaml
afterExecution:
  - id: t2
    type: io.kestra.plugin.core.flow.Sequential
    tasks:
      - id: t2-t1
        type: io.kestra.plugin.core.flow.Sleep
        duration: "PT5S"
      - id: t2-t2
        type: io.kestra.plugin.core.execution.Fail
    errors:
      - id: sendAlert
        type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
        url: https://hooks.slack.com/services/xxxxx
        messageText: "Flow {{ flow.namespace }}.{{ flow.id }} with execution ID {{ execution.id }} failed."
```

## `afterExecution` vs `finally`

`afterExecution` and `finally` are both end-of-flow constructs, but they serve different purposes. The `afterExecution` property differs from the `finally` property because:

1. `finally` runs tasks at the end of the flow while the execution is still in a `RUNNING` state.
2. `afterExecution` runs tasks after the execution finishes in a terminal state like **SUCCESS** or **FAILED**.

Use `finally` for cleanup operations that should always run, regardless of the outcome. See the [`finally` documentation](../19.finally/index.md) for examples. When follow-up actions depend on the final state, use `afterExecution` to capture the result.
To demonstrate, take the following flow that uses both `finally` and `afterExecution`:

```yaml
id: state_demo
namespace: company.team

tasks:
  - id: run
    type: io.kestra.plugin.core.log.Log
    message: Execution {{ execution.state }} # Will show RUNNING
  - id: fail
    type: io.kestra.plugin.core.execution.Fail

finally:
  - id: finally
    type: io.kestra.plugin.core.log.Log
    message: Execution {{ execution.state }} # Will show RUNNING

afterExecution:
  - id: afterExecution
    type: io.kestra.plugin.core.log.Log
    message: Execution {{ execution.state }} # Will show FAILED
```

After running the example above, the `finally` task appears with a `RUNNING` state while the `afterExecution` task shows `FAILED`.

![after-execution-1](./after-execution-1.png)

:::alert{type="info"}
Best practice: Use `afterExecution` when you need to act on the final state of an execution. Use `finally` when you need to ensure cleanup happens regardless of state.
:::

---

# Checks in Kestra – Pre-Execution Validations

URL: https://kestra.io/docs/workflow-components/checks

> Implement Checks in Kestra for pre-execution validation. Guard your workflows by enforcing conditions on inputs before any task begins execution.

Add pre-execution validations that can block or fail an execution before any tasks run.

## Add checks to validate inputs before execution

`checks` are flow-level assertions evaluated when validating inputs and before creating a new execution. Each check defines a boolean `condition` and a `message` shown when the condition is false. You can choose how Kestra reacts (block, fail, or still create the execution) and how the message is styled in the UI.

Checks are useful to enforce business rules on inputs (e.g., allowed values, date windows, required flags) or to nudge users with warnings before they launch a run.

## Properties

Each item in `checks` supports the following properties:

- `condition` *(required)*: Pebble expression that must evaluate to a boolean.
  For example, you can design checks against Inputs, Key-Value pairs, or other workflow components accessible via [expressions](../../expressions/index.mdx).
- `message` *(required)*: Text displayed when the condition is false.
- `style` *(optional, default `INFO`)*: Visual style for the message. One of `ERROR`, `SUCCESS`, `WARNING`, `INFO`.
- `behavior` *(optional, default `BLOCK_EXECUTION`)*: How the flow should react when the condition is false. One of:
  - `BLOCK_EXECUTION`: Do not create the execution.
  - `FAIL_EXECUTION`: Create the execution immediately in a failed state.
  - `CREATE_EXECUTION`: Allow execution creation even if the check fails.

When you click **Execute**, a check with `style: ERROR` displays its `message` in the modal as soon as an input is set that doesn't satisfy the condition, as shown below:

![Failed Check](./checks-fail.png)

### Multiple checks

If several checks fail, the most restrictive behavior wins in this priority order: `BLOCK_EXECUTION` → `FAIL_EXECUTION` → `CREATE_EXECUTION`. This lets you mix hard stops with softer warnings in the same flow.

## Examples

### Simple guard

This flow blocks execution unless the `name` input is `Kestra`.

```yaml
id: simple_check
namespace: company.team

inputs:
  - id: name
    type: STRING

checks:
  - message: "Sorry, this flow can only be executed with 'Kestra'"
    condition: "{{ (inputs.name | upper) == 'KESTRA' }}"
    style: ERROR
    behavior: BLOCK_EXECUTION

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello World! 🚀
```

### Advanced guarded ingest

This flow pulls sample data from DummyJSON, blocks prod runs outside a time window, and warns (but allows) when using a non-approved source URL.
```yaml
id: guarded_ingest
namespace: company.team

inputs:
  - id: environment
    type: SELECT
    values: [dev, prod]
    defaults: dev
  - id: run_date
    type: DATETIME
    defaults: "{{ now() }}"
  - id: payload_url
    type: URI
    defaults: https://dummyjson.com/products?limit=5

checks:
  # Block risky prod runs outside the allowed window
  - message: "Prod runs are only allowed between 06:00 and 22:00 UTC"
    condition: "{{ inputs.environment != 'prod' or (inputs.run_date | date('HH') | number >= 6 and inputs.run_date | date('HH') | number < 22) }}"
    style: ERROR
    behavior: BLOCK_EXECUTION
  # Warn if the payload is not the approved source
  - message: "Non-approved source detected. Use https://dummyjson.com when possible."
    condition: "{{ inputs.payload_url | startsWith('https://dummyjson.com') }}"
    style: WARNING
    behavior: CREATE_EXECUTION

tasks:
  - id: fetch
    type: io.kestra.plugin.core.http.Download
    uri: "{{ inputs.payload_url }}"
  - id: log_run
    type: io.kestra.plugin.core.log.Log
    message: "Run {{ execution.id }} in {{ inputs.environment }} with file {{ outputs.fetch.uri }}"
```

## When to use checks

- Prevent invalid or risky executions based on user inputs.
- Prevent runs when resources are exhausted (e.g., too many VMs provisioned).
- Offer guardrails with warnings while still allowing runs to proceed.
- Enforce “only one path” scenarios by failing early instead of deep in the task sequence.

Checks run before tasks start, so they are a low-cost way to validate inputs and intentions upfront.

---

# Flow Concurrency in Kestra: Limit Parallel Runs

URL: https://kestra.io/docs/workflow-components/concurrency

> Manage workflow load with Concurrency Limits in Kestra. Control the number of parallel executions for a flow to protect resources and downstream systems.

Control how many executions of a flow can run at the same time. The flow-level `concurrency` property lets you limit how many executions of a flow can run concurrently by setting the `limit` key.
Think of concurrency as a global execution limit for that specific flow. The concurrency limit and behavior are then applied to all executions of that flow, regardless of whether those executions were started automatically via a trigger, webhook, or API call, or created manually from the UI.

:::alert{type="info"}
Concurrency limits executions of a flow, not the number of tasks a worker runs. Task processing is still governed by worker thread pools and task runners. Concurrency uses database locks to hold slots, so heavy contention (many executions fighting for the same lock) can increase database load and slow scheduling.
:::

Use concurrency when you need to protect downstream systems (rate limits, database load, external APIs) or enforce “only one execution at a time” semantics. Do **not** rely on concurrency to throttle Kestra worker usage; adjust worker threads, task runners, or queue sizing for that.
For example, if you set the concurrency `limit` to 2, only two executions of that flow will be allowed to run at the same time. If you try to trigger a third execution, it will be queued until one of the two running executions is completed.

### When to use concurrency

- Protect a shared target system (databases, SaaS APIs, warehouses) from overload.
- Enforce sequential processing for stateful workloads (one ETL load at a time).
- Keep a small, fixed number of parallel executions within an external rate limit.

### When **not** to use concurrency

- Throttling worker CPU/RAM usage — tune worker thread pools or task runners instead.
- Replacing task-level limits — use task runner settings (e.g., container resources) and retry/backoff for per-task control.
- Broad platform protection — use platform sizing and queue configuration rather than flow-level concurrency locks.

```yaml
id: concurrency_example
namespace: company.team

concurrency:
  limit: 2

tasks:
  - id: wait
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - sleep 10
```

In the UI, the third execution is queued while the first two finish successfully.

![concurrency](./concurrency.png)

## `behavior` property

You can customize the behavior when the concurrency limit is reached by choosing to queue, cancel, or fail the new execution. To do that, set the `behavior` Enum-type property to one of the following values:

- `QUEUE`
- `CANCEL`
- `FAIL`

For example, with `concurrency.limit` set to 2 and `CANCEL` or `FAIL` behavior, the third execution is immediately marked as `CANCELLED` or `FAILED` without running any tasks.

Below is a full flow example that uses the `concurrency` property to limit the number of concurrent executions to 2. The `wait` task sleeps for 10 seconds, so you can trigger multiple executions of that flow and see how the `concurrency` property behaves.

:::alert{type="warning"}
Each execution that waits for a concurrency slot holds a database lock.
Large backlogs (many queued executions) can increase lock contention and slow down scheduling. If you expect spikes, combine conservative limits with backoff at the source (e.g., trigger rates) and keep an eye on the Concurrency tab in the UI.
:::

```yaml
id: concurrency_limited_flow
namespace: company.team

concurrency:
  behavior: FAIL # QUEUE, CANCEL or FAIL
  limit: 2 # can be any integer >= 1

tasks:
  - id: wait
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - sleep 10
```

As you can see in the UI, the third execution failed as the first two executions were still running.

![concurrency_fail](./concurrency_fail.png)

:::alert{type="warning"}
When an execution starts from a [Trigger](../07.triggers/index.mdx), the trigger locks until it finishes, preventing multiple executions from that trigger from running concurrently. This means the `behavior` property will not come into effect; instead, no new executions will be started. Read more in the [Locked Triggers](../07.triggers/index.mdx#locked-triggers) section.
:::

## Tracking concurrency slots from the UI

The `Concurrency` tab on the `Flow` page lets you track and troubleshoot concurrency issues. It shows a progress bar with the number of active slots compared to the total slots available. Below that progress bar, you can see a table showing currently running and queued Executions, providing a clear overview of the flow's concurrency status.

![concurrency_page_1](./concurrency_page_1.png)

To see the concurrency behavior in action, you can configure a flow with a concurrency limit as follows:

```yaml
id: concurrent
namespace: company.team

concurrency:
  behavior: QUEUE
  limit: 5

tasks:
  - id: long_running_task
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - sleep 90
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
```

Next, trigger multiple Executions of that flow and watch the `Concurrency` tab showing the active slots and queued Executions.
![concurrency_page_2](./concurrency_page_2.png)

## Concurrent executions for Triggers

Any [Trigger](../../05.workflow-components/07.triggers/index.mdx) type supports concurrent executions, allowing multiple instances of the same workflow to run simultaneously. This enables more flexible and scalable workflow patterns.

For example, consider a workflow that takes 60 seconds to complete but is triggered every second. By default, with `allowConcurrent: false`, only one execution can run at a time. If a trigger fires while a previous execution is still running, the new execution will be skipped:

```yaml
id: sleep_concurrent
namespace: company.team

tasks:
  - id: sleep
    type: io.kestra.plugin.core.flow.Sleep
    duration: PT60S

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "* * * * * *"
    withSeconds: true
    allowConcurrent: false
```

In this example, even though the [Schedule trigger](../../05.workflow-components/07.triggers/01.schedule-trigger/index.md) fires every second, only one execution will run at a time. Setting `allowConcurrent: true` would allow multiple executions to run simultaneously.

## How to troubleshoot Concurrency issues

Imagine that you encounter a situation where the concurrency limit is reached, and some executions are stuck in the `QUEUED` state. Here are some steps to troubleshoot and resolve the issue.

### Check the Concurrency tab

The `Concurrency` tab on the `Flow` UI page described above allows you to see which executions are `RUNNING` and which are `QUEUED` (i.e., waiting or stuck). This page can help you troubleshoot which Executions are taking concurrency slots and which are waiting to be processed. In the future, this page will also let you run stuck executions while ignoring concurrency limits.

### Edit the Concurrency property

You can edit the `concurrency` property within the flow (or remove that property entirely to get rid of any limits) and `Save` the flow code.
The modified concurrency limit and behavior will be immediately taken into account for all Executions in progress because the Executor checks this for the latest flow revision rather than for the revision of the Execution. :::alert{type="warning"} Do **not** delete executions, as this makes the issue worse — deleted executions still occupy concurrency slots indefinitely. You can select stuck Executions and hit the `Kill` button to cancel them and free up the concurrency slots, but do not delete them. ::: --- # Descriptions in Kestra – Document Flows and Tasks URL: https://kestra.io/docs/workflow-components/descriptions > Document your Kestra workflows effectively. Add Markdown descriptions to flows, tasks, inputs, and triggers to improve maintainability and collaboration. You can document flows, inputs, outputs, tasks, and triggers by adding a `description` property. The `description` property is a string field that supports [Markdown](https://en.wikipedia.org/wiki/Markdown) syntax.
You can add a `description` property on: - Flows - Inputs - Outputs - Tasks - Triggers All Markdown descriptions are rendered directly in the UI. You can get as detailed as you'd like. Take the following example: ```yaml id: data-engineering-pipeline-demo namespace: tutorial description: | # Data Engineering Pipelines for Product Inventory This flow demonstrates a complete data engineering pipeline, from data extraction to transformation and analysis, using Kestra. ## Flow Overview: 1. **Extract**: Downloads product data from a public API. 2. **Transform**: Processes the raw data using a Python script, filtering for specified columns and creating a clean JSON output. 3. **Query**: Loads the transformed data into a DuckDB instance and performs an aggregation to calculate the average price per brand. ## Inputs: - `columns_to_keep`: An array of strings specifying which columns from the raw data should be retained during the transformation step. Defaults to `["brand", "price"]`. ## Tasks Breakdown: ### 1. `extract` (HTTP Download) - **Type**: `io.kestra.plugin.core.http.Download` - **Purpose**: Fetches product data from `https://dummyjson.com/products`. - **Output**: The raw JSON data is stored in Kestra's internal storage, and its URI is made available for subsequent tasks. ### 2. `transform` (Python Script) - **Type**: `io.kestra.plugin.scripts.python.Script` - **Purpose**: Transforms the extracted product data. - **Container Image**: `python:3.11-alpine` is used for a lightweight execution environment. - **Input**: Takes the URI of the downloaded `data.json` from the `extract` task. - **Logic**: - Reads the `data.json` file. - Filters the product list to keep only the columns specified in the `columns_to_keep` input. - Generates a new JSON file named `products.json` with the filtered data. - **Output**: The `products.json` file is uploaded to Kestra's internal storage. ### 3. 
`query` (DuckDB Queries) - **Type**: `io.kestra.plugin.jdbc.duckdb.Queries` - **Purpose**: Analyzes the transformed product data using DuckDB. - **Input**: Uses the `products.json` file generated by the `transform` task. - **SQL Queries**: - `INSTALL json; LOAD json;`: Installs and loads the JSON extension for DuckDB to handle JSON files. - `SELECT brand, round(avg(price), 2) as avg_price FROM read_json_auto('{{ workingDir }}/products.json') GROUP BY brand ORDER BY avg_price DESC;`: Reads the `products.json` file, calculates the average price for each brand, and orders the results by average price in descending order. - **Fetch Type**: `STORE` ensures that the results of the SQL query are stored as an output file in Kestra's internal storage, making them accessible for further use or inspection. ``` ![description](./description.png) Here is an example flow with descriptions in different components: ```yaml id: myflow namespace: company.team description: | This is the **Flow Description**. You can look at `input description`, `task description`, `output description` and `trigger description` as well in this example. 
labels: env: dev project: myproject inputs: - id: payload type: JSON description: JSON request payload to the API # Input description example defaults: | [{"name": "kestra", "rating": "best in class"}] tasks: - id: send_data type: io.kestra.plugin.core.http.Request description: Sends a POST API request to https://kestra.io/api/mock # Task description example uri: https://kestra.io/api/mock method: POST contentType: application/json body: "{{ inputs.payload }}" - id: print_status type: io.kestra.plugin.core.debug.Return description: Prints the API request date # Task description example format: hello on {{ outputs.send_data.headers.date | first }} outputs: - id: final type: STRING description: Final flow output derived from the task result value: "{{ outputs.print_status.value }}" triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule description: Triggers the flow every day at 09:00 AM # Trigger description example cron: "0 9 * * *" ``` --- # Disabled Flag in Kestra: Skip Flows and Triggers URL: https://kestra.io/docs/workflow-components/disabled > Disable flows or tasks in Kestra without deleting them. Use the disabled property to pause individual tasks or entire flows for maintenance. The `disabled` flag is a boolean property that lets you skip a flow, task, or trigger. This is useful for debugging or testing parts of a flow without removing existing logic. Instead of deleting parts of your YAML, you can add the `disabled` property.
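Conceptually, the flag is just a per-component switch that is checked before anything runs. A minimal Python sketch of that idea (an illustration only, not Kestra's actual executor logic; the task dicts are hypothetical):

```python
# Hypothetical sketch: a `disabled` flag skips a component without deleting it.
tasks = [
    {"id": "enabled_task"},                     # no flag: runs by default
    {"id": "paused_task", "disabled": True},    # skipped, but still defined
    {"id": "another_task", "disabled": False},  # explicitly enabled
]

# Only tasks whose `disabled` flag is absent or false get executed.
to_run = [t["id"] for t in tasks if not t.get("disabled", False)]
print(to_run)
```

The same switch applies at the flow, trigger, and task level, as the sections below show.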
## Disabled flow

When a flow is disabled, it will not be executed, even if a trigger is set. If you have an active trigger on a disabled flow, it is ignored automatically; you don't need to disable the trigger separately. Setting a flow to `disabled` effectively prevents any future executions of the flow until it is re-enabled.

Add the following flow, then attempt to run it and observe the scheduled executions:

```yaml
id: disabled_flow
namespace: company.team
disabled: true

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Kestra team wishes you a great day! 👋

triggers:
  - id: every_minute
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/1 * * * *"
```

You will see that you cannot run the flow and that the trigger is ignored; no executions are created.

![disabled_flag](./disabled_flag_1.png)

When executing a disabled flow from a subflow:

```yaml
id: parent_runs_disabled_flow
namespace: company.team

tasks:
  - id: disabled_subflow
    type: io.kestra.plugin.core.flow.Subflow
    flowId: disabled_flow
    namespace: company.team
```

When you execute the parent flow, it immediately fails with the error message: `Cannot execute a flow which is disabled`.

![disabled_flag_2](./disabled_flag_2.png)

Similarly, try running a disabled flow via an API call:

```bash
curl -X POST http://localhost:8080/api/v1/main/executions/trigger/company.team/parent_runs_disabled_flow
```

The API call itself is successful:

```bash
{"id":"5ScXvrnOkjfKIXqYylRYME","namespace":"company.team","flowId":"parent_runs_disabled_flow","flowRevision":1,"state":{"current":"CREATED","histories":[{"state":"CREATED","date":"2024-01-19T20:38:48.474047013Z"}],"duration":"PT0.011094958S","startDate":"2024-01-19T20:38:48.474047013Z"},"originalId":"5ScXvrnOkjfKIXqYylRYME"}
```

That execution is immediately marked as failed with the error message: `Cannot execute a flow which is disabled`.

## Disabled trigger

When using a Schedule trigger, it is often useful to disable it temporarily.
For example, you may want to disable a trigger while you are debugging a flow. You can do this by setting the `disabled` flag to `true` on the trigger:

```yaml
id: myflow
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: hello from a scheduled flow

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * *"
    disabled: true
```

You will see that no scheduled executions are created for this flow. Once you are done debugging, you can re-enable the trigger by setting the `disabled` flag to `false` or simply by removing the `disabled` flag:

```yaml
id: myflow
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: hello from a scheduled flow

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * *"
```

## Disabled task

Instead of disabling the entire flow or a trigger, you can also disable a single task. This is useful when you want to temporarily switch off a task without deleting it, e.g., when troubleshooting a failure. You can do this by setting the `disabled` flag to `true` on the task:

```yaml
id: myflow
namespace: company.team

tasks:
  - id: enabled
    type: io.kestra.plugin.core.log.Log
    message: this task will run

  - id: disabled
    type: io.kestra.plugin.core.debug.Return
    format: this task will be skipped
    disabled: true
```

You can see in the UI that disabled tasks are greyed out:

![disabled_flag_3](./disabled_flag_3.png)

---
# Workflow Errors in Kestra – Handling Strategies
URL: https://kestra.io/docs/workflow-components/errors
> Master error handling in Kestra. Explore strategies like global and local error handlers, allowing failures, and configuring alerts for robust workflows.

Kestra provides multiple ways to handle errors, helping you both identify issues and decide whether your flows should stop or continue running after an error.
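Before diving into the individual mechanisms, note that the overall model of flow-level error handling resembles an exception handler wrapped around the task sequence. A rough Python analogy (hypothetical function names; Kestra's Executor works differently internally):

```python
def run_flow(tasks, error_handlers):
    """Run tasks in order; on the first failure, run the error handlers."""
    for task in tasks:
        try:
            task()
        except Exception:
            for handler in error_handlers:  # executed sequentially, like `errors`
                handler()
            return "FAILED"
    return "SUCCESS"

log = []

def failing_task():
    raise RuntimeError("task failed")

state = run_flow(
    tasks=[lambda: log.append("t1"), failing_task, lambda: log.append("t3")],
    error_handlers=[lambda: log.append("alert_on_failure")],
)
print(state, log)  # t3 never runs; the handler does
```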
## `errors` Component

`errors` is a list of tasks set at the flow level that are executed when an error occurs. You can add multiple tasks, and they are executed sequentially. This is useful for sending alerts when errors occur.

The example below sends a flow-level failure alert via Slack using the [SlackIncomingWebhook](/plugins/plugin-slack/slack-notifications/io.kestra.plugin.slack.notifications.slackincomingwebhook) task defined using the `errors` property.

```yaml
id: errors
namespace: company.team
description: This will always fail

tasks:
  - id: failed_task
    type: io.kestra.plugin.core.execution.Fail

errors:
  - id: alert_on_failure
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    messageText: "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"
```

## `errors` vs `afterExecution`

Both `errors` and `afterExecution` can be used for post-run actions, but they solve different problems. Use `errors` when you want failure handling to happen as part of the execution lifecycle when a task or flow errors. Use `afterExecution` when you want to react to the final execution state once the run has already finished. For post-run actions based on the final execution state, see the [`afterExecution` documentation](../20.afterexecution/index.md).

| Use case | Prefer |
| --- | --- |
| Send an alert only when the flow fails | `errors` |
| Handle errors only inside one flowable task and its children | `errors` |
| Run different tasks for `SUCCESS`, `FAILED`, or `WARNING` | `afterExecution` |
| Run reports or notifications that depend on the final execution state | `afterExecution` |

Pros of `errors`:

- Failure-specific by design.
- Available at the flow level and locally inside flowable tasks.
- Well suited for remediation, cleanup, or alerts tied to a failure path.

Cons of `errors`:

- It is focused on error paths, not success paths.
- It is less convenient when you want one block that branches on multiple final states. Two kinds of error handlers can be defined: - **Global**: error handling for the entire flow, defined at the root level - **Local**: error handling for a Flowable Task and its children ## Global error handler This example shows a global error handler. The first task fails immediately, triggering the handler, which then logs the ID of the failed task using the `tasksWithState()` function. ```yaml id: errors namespace: company.team tasks: - id: failed type: io.kestra.plugin.core.execution.Fail errors: - id: error_handler type: io.kestra.plugin.core.log.Log message: I'm failing task '{{ tasksWithState('failed')[0]['taskId'] }}' # Because tasksWithState() returns an array, the first taskId to fail is retrieved. level: INFO ``` ## Local error handler This example demonstrates a local error handler that applies only to the children of `t2`. Errors from other tasks, like `t1`, are not handled here. This can be useful to restrict error handling for a specific part of the flow and perform specific tasks like resource cleanup. ```yaml id: errors namespace: company.team tasks: - id: parent-seq type: io.kestra.plugin.core.flow.Sequential tasks: - id: t1 type: io.kestra.plugin.core.debug.Return format: "{{task.id}} > {{taskrun.startDate}}" - id: t2 type: io.kestra.plugin.core.flow.Sequential tasks: - id: t2-t1 type: io.kestra.plugin.core.execution.Fail errors: - id: error-t1 type: io.kestra.plugin.core.debug.Return format: "Error Trigger ! {{task.id}}" ``` ## `allowFailure` and `allowWarning` Property
When you execute a flow and one of its tasks fails, downstream tasks are not executed. This may not always be desirable, especially for non-critical tasks. You can resolve this by adding the `allowFailure` property to the task, which allows downstream tasks to continue despite an error. In this case, the execution will finish in a `WARNING` state. ```yaml id: allow_failure namespace: company.team description: This flow will allow a failure of a task (imagine a flaky unit test) and will continue processing downstream tasks, but the execution will finish in a `WARNING` state. tasks: - id: first type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" - id: allow_failure type: io.kestra.plugin.core.execution.Fail allowFailure: true - id: last type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" ``` There is also the `allowWarning` property, which works similarly to `allowFailure`, but the execution finishes in a `SUCCESS` state even if warnings occur. ```yaml id: allow_warning namespace: company.team description: This flow will allow a warning of a task (imagine a notification task) and will continue processing downstream tasks, with the execution finishing in a `SUCCESS` state even if warnings occurred. tasks: - id: first type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" - id: allow_warning type: io.kestra.plugin.scripts.python.Script allowWarning: true beforeCommands: - pip install kestra script: | from kestra import Kestra logger = Kestra.logger() logger.warning("WARNING signals something unexpected.") ``` ## Best practices for error handling - Use **global handlers** for alerts and monitoring across the whole flow. - Use **local handlers** for targeted cleanup or retries. - Add `allowFailure` for **non-critical tasks** that shouldn’t block execution. - Use `allowWarning` when warnings should not mark the execution as failed. 
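To summarize the difference, here is a small Python sketch of how `allowFailure` changes the outcome of a run. It only illustrates the semantics described above (failing task tolerated, downstream tasks still run, final state downgraded to `WARNING`), not Kestra's scheduler:

```python
def run(tasks):
    """tasks: list of (callable, allow_failure) pairs executed in order."""
    state = "SUCCESS"
    for fn, allow_failure in tasks:
        try:
            fn()
        except Exception:
            if not allow_failure:
                return "FAILED"   # downstream tasks are not executed
            state = "WARNING"     # keep going, but flag the execution
    return state

def flaky():
    raise RuntimeError("imagine a flaky unit test")

result = run([
    (lambda: None, False),  # first
    (flaky, True),          # allowFailure: true
    (lambda: None, False),  # last still runs
])
print(result)
```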
--- # Executions in Kestra – Run and Monitor Flows URL: https://kestra.io/docs/workflow-components/execution > Manage Flow Executions in Kestra. Learn how to trigger, monitor, and troubleshoot workflow runs, understand states, and access execution metrics. Execute flows and view the results. An execution is a single run of a flow with a specific state.
## Task run

A task run is a single run of an individual task within an execution. Each task run has associated data such as:

- Execution ID
- State
- Start Date
- End Date

Read more about task runs on the [dedicated docs page](../01.tasks/02.taskruns/index.md).

## Attempts

Each task run can have one or more attempts. Most task runs have only one attempt, but you can configure [retries](../12.retries/index.md) for a task. If retries have been configured, a task failure will generate new attempts until the retry `maxAttempts` or `maxDuration` threshold is hit.

## Outputs

Each task can generate output data that other tasks in the current flow execution can use. These outputs can be variables or files that are stored inside Kestra's internal storage. Outputs are described on each task's documentation page and can be viewed in the **Outputs** tab of the **Execution** page. Read more on the [Outputs page](../../05.workflow-components/06.outputs/index.md).

## Metrics

Each task can expose metrics that help you understand task internals. Metrics may include file size, number of returned rows, or query duration. You can view the available metrics for a task type on its documentation page. Metrics can be seen in the **Metrics** tab of the **Executions** page.

Below is an example of a flow generating metrics:

```yaml
id: load_data_to_bigquery
namespace: company.team

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: load_bigquery
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Load data into BigQuery
    autodetect: true
    csvOptions:
      fieldDelimiter: ","
    destinationTable: kestra-dev.demo.orders
    format: CSV
    from: "{{ outputs.http_download.uri }}"
```

You can see the list of metrics generated by this task in the [BigQuery Load task documentation](/plugins/plugin-gcp/bigquery/io.kestra.plugin.gcp.bigquery.load#metrics).
After executing the flow, view the BigQuery Load task metrics in the **Metrics** tab.

![bigquery_metrics](./bigquery_metrics.png)

## State

An execution or a task run can be in a particular state. There are multiple possible states:

| State | Description |
| - | - |
| `CREATED` | The Execution or task run is waiting to be processed. This state usually means that the Execution is in a queue and is yet to be started. |
| `RUNNING` | The Execution or task run is currently being processed. |
| `PAUSED` | The Execution or task run has been paused. Used for manual validation or waiting for a specified duration before continuing the execution. |
| `SUCCESS` | The Execution or task run has been completed successfully. |
| `WARNING` | The Execution or task run exhibited unintended behavior, but the execution continued and was flagged with a warning. |
| `FAILED` | The Execution or task run exhibited unintended behavior that caused the execution to fail. |
| `KILLING` | A command was issued that asked for the Execution or task run to be killed. The system is in the process of killing the associated tasks. |
| `KILLED` | An Execution or task run was killed (upon request), and no more tasks will run. |
| `RESTARTED` | Transitional status equivalent to `CREATED` for a flow that was executed, failed, and then restarted. |
| `CANCELLED` | An Execution or task run has been aborted because it reached the defined [concurrency limit](../14.concurrency/index.md) or exceeded the [SLA](../18.sla/index.md), and the behavior was set to `CANCEL`. |
| `QUEUED` | An Execution or task run has been put on hold because it reached the defined concurrency limit, and the behavior was set to `QUEUE`.
| | `RETRYING` | The Execution or task run is currently being [retried](../12.retries/index.md). | | `RETRIED` | An Execution or task run exhibited unintended behavior, stopped, and created a new execution as defined by its [flow-level retry policy](../12.retries/index.md#flow-level-retries). The policy was set to the `CREATE_NEW_EXECUTION` behavior. | :::alert{type="info"} For a detailed overview of how each execution and task run transitions through states, see the [States](../17.states/index.md) page. ::: ## Execution expressions You can use the following execution expressions in your flow. | Parameter | Description | | - | - | | `{{ execution.id }}` | The execution ID, a generated unique ID for each execution | | `{{ execution.startDate }}` | The start date of the current execution, can be formatted with `{{ execution.startDate \| date("yyyy-MM-dd HH:mm:ss.SSSSSS") }}`. | | `{{ execution.originalId }}` | The original execution ID, this ID never changes, even in case of a replay and keeps the first execution ID. | ## Execute a flow from the UI You can trigger a flow manually from the Kestra UI by clicking the **Execute** button on the flow page. This is useful when you want to test a flow or run it on demand. ![execute_button](./execute_button.png) ## Use automatic triggers You can add a **Schedule trigger** to automatically launch a flow execution at a regular time interval. Alternatively, you can add a **Flow trigger** to automatically launch a flow execution when another flow execution is completed. This is helpful when you want to: - Implement a centralized namespace-level error handling strategy, e.g., to send a notification when any flow execution fails in a production namespace. Check the [Alerting & Monitoring](../../10.administrator-guide/03.monitoring/index.md) section for more details. 
- Decouple your flows by following an event-driven pattern where a flow is triggered by the completion of another flow (as opposed to the [subflow pattern]… where a parent flow explicitly calls child flows).

You can also use the **Webhook trigger** to automatically launch a flow execution when a given HTTP request is received. You can leverage the `{{ trigger.body }}` variable to access the request body and the `{{ trigger.headers }}` variable to access the request headers in your flow.

To launch a flow and send data to the flow's execution context from an external system using a webhook, you can send a POST request to the Kestra API using the following URL:

```bash
http://{host}:{port}/api/v1/main/executions/webhook/{namespace}/{flowId}/{webhookKey}
```

Below is an example:

```bash
http://localhost:8080/api/v1/main/executions/webhook/dev/hello-world/secretWebhookKey42
```

You can also pass inputs to the flow using the `inputs` query parameter.

## Execute a flow via an API call

You can trigger a flow execution by calling the [API](../../api-reference/index.mdx) directly. This is useful when you want to start a flow execution from another application or service.

Use the following flow as an example:

```yaml
id: hello_world
namespace: company.team

inputs:
  - id: greeting
    type: STRING
    defaults: hey

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "{{ inputs.greeting }}"

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: test1234
```

If Kestra runs locally, trigger a flow by calling the `/api/v1/main/executions/{namespace}/{flowId}` endpoint. This example uses `curl`, but you could also use a tool like [Postman](https://www.postman.com/) to test this:

```bash
curl -X POST \
http://localhost:8080/api/v1/main/executions/company.team/hello_world
```

The above command triggers an execution of the latest revision of the `hello_world` flow from the `company.team` namespace.
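If you call these endpoints from code, it can help to build the URLs in one place. A small hypothetical helper; only the two URL patterns come from this page:

```python
def execution_url(base, namespace, flow_id):
    """Endpoint for triggering the latest revision of a flow."""
    return f"{base}/api/v1/main/executions/{namespace}/{flow_id}"

def webhook_url(base, namespace, flow_id, key):
    """Endpoint for triggering a flow via its Webhook trigger key."""
    return f"{base}/api/v1/main/executions/webhook/{namespace}/{flow_id}/{key}"

base = "http://localhost:8080"
print(execution_url(base, "company.team", "hello_world"))
print(webhook_url(base, "company.team", "hello_world", "test1234"))
```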
### Execute a specific revision of a flow If you want to trigger an execution for a specific revision, you can use the `revision` query parameter: ```bash curl -X POST \ http://localhost:8080/api/v1/main/executions/company.team/hello_world?revision=2 ``` ### Execute a flow with inputs You can also trigger a flow execution with inputs by adding the `inputs` as form data (the `-F` flag in the `curl` command): ```bash curl -X POST \ http://localhost:8080/api/v1/main/executions/company.team/hello_world \ -F greeting="hey there" ``` You can pass inputs of different types, such as `STRING`, `INT`, `FLOAT`, `DATETIME`, `FILE`, `BOOLEAN`, and more. ```bash curl -v "http://localhost:8080/api/v1/main/executions/company.team/kestra-inputs" \ -H "Transfer-Encoding:chunked" \ -H "Content-Type:multipart/form-data" \ -F string="a string" \ -F optional="an optional string" \ -F int=1 \ -F float=1.255 \ -F boolean=true \ -F instant="2023-12-24T23:00:00.000Z" \ -F "files=@/tmp/128M.txt;filename=file" ``` ### Execute a flow with FILE-type inputs You can also pass files as an input. All files must be sent as multipart form data named `files` with a header `filename=your_kestra_input_name` indicating the name of the input. Let's look at an example to make this clearer. 
Suppose you have a flow that takes a JSON file as input and reads the file's content: ```yaml id: large_json_payload namespace: company.team inputs: - id: myCustomFileInput type: FILE tasks: - id: hello type: io.kestra.plugin.scripts.shell.Commands inputFiles: myfile.json: "{{ inputs.myCustomFileInput }}" taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cat myfile.json ``` Assuming you have a file `myfile.json` in the current working directory, you can invoke the flow using the following `curl` command: ```bash curl -X POST -F "files=@./myfile.json;filename=myCustomFileInput" 'http://localhost:8080/api/v1/main/executions/company.team/large_json_payload' ``` :::alert{type="info"} We recommend this pattern if you need to pass large payloads to a flow. Passing a large payload directly in the request body (e.g., as `JSON`-type input or as a raw JSON webhook body) is not recommended for privacy, performance, and maintainability reasons. Such large payloads would be stored directly in your Kestra's database backend, cluttering valuable storage space and leading to potential performance or privacy issues. However, if you pass it as a JSON file using a `FILE`-type input, it will be stored in internal storage (such as S3, GCS, Azure Blob), making it more performant and cost-effective to store and retrieve. ::: ### Execute a flow via an API call in Python You can also use the `requests` library in Python to make requests to the Kestra API. 
Here's an example:

```python
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

with open("/tmp/128M.txt", 'rb') as fh:
    url = "http://kestra:8080/api/v1/main/executions/company.team/hello_world"
    mp_encoder = MultipartEncoder(fields={
        "string": "a string",
        "int": "1",        # MultipartEncoder requires string (or bytes) values
        "float": "1.255",
        "datetime": "2025-04-20T13:00:00.000Z",
        "files": ("file", fh, "text/plain")
    })
    result = requests.post(
        url,
        data=mp_encoder,
        headers={"Content-Type": mp_encoder.content_type},
    )
```

### Get URL to follow the Execution progress

The executions endpoint also [returns a URL](https://github.com/kestra-io/kestra/issues/4256), allowing you to follow the execution progress from the UI. This is helpful for externally triggered, long-running executions that require users to monitor workflow progress. Below are the steps to use it:

1) First, create a flow:

```yaml
id: myflow
namespace: company.team

tasks:
  - id: long_running_task
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - sleep 90
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
```

2) Execute the flow via an API call:

```shell
curl -X POST http://localhost:8080/api/v1/main/executions/company.team/myflow
```

You will see output similar to the following:

```bash
{
  "id": "1ZiZQWCHj7bf9XLtgvAxyi",
  "namespace": "company.team",
  "flowId": "myflow",
  "flowRevision": 1,
  "state": {
    "current": "CREATED",
    "histories": [
      {
        "state": "CREATED",
        "date": "2024-09-24T13:35:32.983335847Z"
      }
    ],
    "duration": "PT0.017447417S",
    "startDate": "2024-09-24T13:35:32.983335847Z"
  },
  "originalId": "1ZiZQWCHj7bf9XLtgvAxyi",
  "deleted": false,
  "metadata": {
    "attemptNumber": 1,
    "originalCreatedDate": "2024-09-24T13:35:32.983420055Z"
  },
  "url": "http://localhost:8080/ui/executions/company.team/myflow/1ZiZQWCHj7bf9XLtgvAxyi"
}
```

You can click directly on that last URL to follow the execution progress from the UI, or you can return that URL from your application to the user who initiated the flow.
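Since the endpoint returns plain JSON, picking out the follow-up URL from your own application is straightforward. A sketch using an abbreviated version of the response shown above:

```python
import json

# Abbreviated response body from the executions endpoint (fields as above).
response_body = '''{
  "id": "1ZiZQWCHj7bf9XLtgvAxyi",
  "namespace": "company.team",
  "flowId": "myflow",
  "url": "http://localhost:8080/ui/executions/company.team/myflow/1ZiZQWCHj7bf9XLtgvAxyi"
}'''

execution = json.loads(response_body)
follow_url = execution["url"]  # hand this back to the user who started the run
print(follow_url)
```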
Keep in mind that you need to configure the URL of your Kestra instance within your [Runtime and Storage configuration](../../configuration/02.runtime-and-storage/index.md) file to have a full URL rather than just the suffix `/ui/executions/company.team/myflow/uuid`. Here is how you can do it: ```yaml kestra: url: http://localhost:8080 ``` ## Webhook vs. API call When sending a POST request to the `/api/v1/main/executions/{namespace}/{flowId}` endpoint, you can send data to the flow's execution context using `inputs`. If you want to send arbitrary metadata to the flow's execution context based on some event happening in your application, you can leverage a Webhook trigger. You can adjust the previous `hello_world` example to use the webhook trigger instead of an API call: ```yaml id: hello_world namespace: company.team inputs: - id: greeting type: STRING defaults: hey tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "{{ trigger.body ?? inputs.greeting }}" triggers: - id: webhook type: io.kestra.plugin.core.trigger.Webhook key: test1234 ``` You can now send a POST request to the `/api/v1/main/executions/webhook/{namespace}/{flowId}/{webhookKey}` endpoint to trigger an execution and pass any metadata to the flow using the request body. In this example, the webhook URL would be `http://localhost:8080/api/v1/main/executions/webhook/company.team/hello_world/test1234`. You can test the webhook trigger using a tool like Postman or cURL. Paste the webhook URL in the URL field and a [sample JSON payload](https://gist.github.com/anna-geller/df2532c0699e3ba4f572a88fbdf19a13) in the request body. Make sure to set: - the request method to POST - the request body type to raw JSON format Finally, click the **Send** button to trigger the flow execution. You should get a response with the execution ID and status code 200 OK. ![postman webhook](./postman.png) :::alert{type="info"} ⚡️ **When to use a webhook trigger vs. 
an API call?** To decide whether to use a webhook trigger or an API call to create an Execution, consider the following: - Use the **webhook trigger** when you want **to send arbitrary metadata** to the flow's execution context based on some event happening in your application. - Use the **webhook trigger** when you want to create new executions based on some **event** happening in an **external application**, such as a GitHub event (_e.g. a Pull Request is merged_) or a new record in a SaaS application, and you want to send the event metadata (header and body) to the flow to act on it. - Use an **API call** when you only need to pass **typed inputs** and do not need to send an arbitrary payload. ::: --- ## Execute a flow via kestractl You can trigger and inspect executions from the command line using [kestractl](../../kestra-cli/kestractl/index.md). ```bash # Run a flow and wait for completion kestractl executions run prod nightly-refresh --wait # Run a flow and get the result as JSON kestractl executions run prod nightly-refresh --wait --output json ``` --- ## Execute a flow from Python You can also execute a flow using the [kestra pip package](https://github.com/kestra-io/libs). This is useful when you want to trigger a flow execution from a Python application without crafting the HTTP request manually, as shown earlier. First, install the package: ```bash pip install kestra ``` Then, you can trigger a flow execution by calling the `execute()` method. 
Below is an example for the same `hello_world` flow in the namespace `company.team` as above: ```python from kestra import Flow flow = Flow() flow.execute('company.team', 'hello_world', {'greeting': 'hello from Python'}) ``` Now imagine that you have a flow that takes a FILE-type input and reads the file's content: ```yaml id: myflow namespace: company.team inputs: - id: myfile type: FILE tasks: - id: print_data type: io.kestra.plugin.core.log.Log message: "file's content {{ read(inputs.myfile) }}" ``` Assuming you have a file called `example.txt` in the same directory as your Python script, you can pass a file as an input to the flow using the following Python code: ```python import os from kestra import Flow os.environ["KESTRA_HOSTNAME"] = "http://host.docker.internal:8080" # Set this when executing inside Kestra flow = Flow() with open('example.txt', 'rb') as fh: flow.execute('company.team', 'myflow', {'files': ('myfile', fh, 'text/plain')}) ``` Keep in mind that `files` is a tuple with the following structure: `('input_id', file_object, 'content_type')`. ## Execute with ForEachItem The `ForEachItem` task allows you to iterate over a list of items and run a subflow for each item, or for each batch containing multiple items. Use this to process large lists in parallel, e.g., millions of records from a database table or an API payload. The `ForEachItem` task is a **Flowable** task, which means that it can be used to define the flow logic and control the execution of the flow. 
Syntax: ```yaml id: each_example namespace: company.team tasks: - id: each type: io.kestra.plugin.core.flow.ForEachItem items: "{{ inputs.file }}" # could be also an output variable {{ outputs.extract.uri }} inputs: file: "{{ taskrun.items }}" # batch items passed to the subflow batch: rows: 4 bytes: "1024" partitions: 2 namespace: company.team flowId: subflow revision: 1 # optional (default: latest) wait: true # wait for the subflow execution transmitFailed: true # fail the task run if the subflow execution fails labels: # optional labels to pass to the subflow to be executed key: value ``` :::collapse{title="Full Example"} Subflow: ```yaml id: subflow namespace: company.team inputs: - id: items type: FILE tasks: - id: for_each_item type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cat "{{ inputs.items }}" - id: read type: io.kestra.plugin.core.log.Log message: "{{ read(inputs.items) }}" ``` Below is a flow that uses the `ForEachItem` task to iterate over a list of items and run the `subflow` for a batch of 10 items at a time: ```yaml id: each_parent namespace: company.team tasks: - id: extract type: io.kestra.plugin.jdbc.duckdb.Query sql: | INSTALL httpfs; LOAD httpfs; SELECT * FROM read_csv_auto('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv', header=True); store: true - id: each type: io.kestra.plugin.core.flow.ForEachItem items: "{{ outputs.extract.uri }}" batch: rows: 10 namespace: company.team flowId: subflow wait: true transmitFailed: true inputs: items: "{{ taskrun.items }}" ``` ::: --- # Finally Tasks in Kestra – Always-Run Cleanup URL: https://kestra.io/docs/workflow-components/finally > Ensure cleanup with Finally tasks in Kestra. Execute specific tasks at the end of a flow regardless of success or failure, perfect for resource teardown. Define a block of tasks that always run at the end of a flow, regardless of task status. 
`finally` tasks are useful for cleanup operations that must run at the end of your flow, whether the execution ends in success or failure.
## `finally` component `finally` is a block of tasks that execute at the end of your workflow, regardless of the status of prior tasks. This ensures cleanup or teardown steps always occur, no matter how the flow ends. For example, you might use a `finally` block to turn off a cloud service when the flow finishes, regardless of the outcome. :::alert{type="info"} Note that `finally` tasks run while the execution is still `RUNNING`. If you need to trigger tasks after an execution finishes with a specific status (`SUCCESS` or `FAILED`), use the [`afterExecution` property](../20.afterexecution/index.md). ::: ## `finally` vs `errors` `finally` and `errors` can both run near the end of a flow, but they are meant for different jobs. - Use `finally` for cleanup and teardown that must happen every time. - Use `errors` for failure-specific handling such as alerts, remediation, or fallback actions. Unlike `errors`, `finally` is not tied to a failure path. It runs whether the flow succeeds or fails, and it runs while the execution is still in the `RUNNING` state. For failure-specific handling, including local handlers inside flowable tasks, see the [`errors` documentation](../11.errors/index.md). For post-run actions based on the final execution state, see the [`afterExecution` documentation](../20.afterexecution/index.md). ## `finally` example In the example below, one task is designed to fail, and an `errors` task logs a message to signal the failure. The `finally` task still runs after the other tasks finish. Here it logs another message, but in practice it could be used to shut down resources started for the flow. 
```yaml
id: finally_example
namespace: company.team

tasks:
  - id: fail
    type: io.kestra.plugin.core.execution.Fail
    errorMessage: Test downstream tasks

errors:
  - id: send_alert
    type: io.kestra.plugin.core.log.Log
    message: alert on failure

finally:
  - id: cleanup_task
    type: io.kestra.plugin.core.log.Log
    message: cleaning up resources
```

If you change the example so the first task succeeds, as shown below, the `finally` task still runs in the same way:

```yaml
id: finally_example
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "This flow executes successfully!"

errors:
  - id: send_alert
    type: io.kestra.plugin.core.log.Log
    message: alert on failure

finally:
  - id: cleanup_task
    type: io.kestra.plugin.core.log.Log
    message: cleaning up resources
```

As in the first example, the `finally` task runs at the end even though the `errors` task does not send an alert, ensuring cleanup still happens regardless of status.

Beyond simple cleanup, `finally` can manage external services. For example, you might spin up Redis, Elasticsearch, or Kafka to run queries or QA checks, and then ensure the service is stopped when the flow ends. The following example shows how to start a Redis Docker container, run some database operations, and then stop the container when the flow finishes.
```yaml id: dockerRedis namespace: company.team variables: host: host.docker.internal tasks: - id: start type: io.kestra.plugin.docker.Run containerImage: redis wait: false portBindings: - "6379:6379" - id: sleep type: io.kestra.plugin.core.flow.Sleep duration: PT1S description: Wait for the Redis container to start - id: set type: io.kestra.plugin.redis.string.Set url: "redis://:redis@{{vars.host}}:6379/0" key: "key_string_{{execution.id}}" value: "{{flow.id}}" serdeType: STRING - id: get type: io.kestra.plugin.redis.string.Get url: "redis://:redis@{{vars.host}}:6379/0" key: "key_string_{{execution.id}}" serdeType: STRING - id: assert type: io.kestra.plugin.core.execution.Assert errorMessage: "Invalid get data {{outputs.get}}" conditions: - "{{outputs.get.data == flow.id}}" - id: delete type: io.kestra.plugin.redis.string.Delete url: "redis://:redis@{{vars.host}}:6379/0" keys: - "key_string_{{execution.id}}" - id: getAfterDelete type: io.kestra.plugin.redis.string.Get url: "redis://:redis@{{vars.host}}:6379/0" key: "key_string_{{execution.id}}" serdeType: STRING - id: assertAfterDelete type: io.kestra.plugin.core.execution.Assert errorMessage: "Invalid get data {{outputs.getAfterDelete}}" conditions: - "{{(outputs.getAfterDelete contains 'data') == false}}" finally: - id: stop type: io.kestra.plugin.docker.Stop containerId: "{{outputs.start.taskRunner.containerId}}" ``` :::alert{type="info"} Best practice: Use `finally` for cleanup and resource teardown, not for critical business logic. Business logic dependent on execution outcomes should use `errors` or `afterExecution`. ::: --- # Flows in Kestra – Define Orchestration Units URL: https://kestra.io/docs/workflow-components/flow > Understand Kestra Flows, the fundamental units of orchestration. Learn to define tasks, inputs, outputs, and logic to automate your business processes. Flow is a container for tasks and their orchestration logic. A Flow is the fundamental unit of orchestration in Kestra. 
It defines a set of tasks, their execution order, inputs, outputs, and orchestration logic.
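As a minimal sketch (the flow `id` and `namespace` below are illustrative), the smallest valid flow contains just an identifier, a namespace, and one task:

```yaml
id: minimal_flow          # illustrative flow identifier
namespace: company.team   # illustrative namespace

tasks:
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from a minimal flow
```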
## Components of a flow

A flow organizes `tasks`, their `inputs` and `outputs`, error handling, and orchestration logic. It specifies **what** tasks run, **when** they run, and **how** they interact (sequentially, in parallel, or conditionally).

You can define a flow declaratively using a [YAML](https://en.wikipedia.org/wiki/YAML) file. Alternatively, you can build flows using the [No-Code Editor](../../no-code/01.no-code-flow-building/index.md) instead of writing your own YAML.

A flow must have:
- identifier (`id`)
- [`namespace`](../02.namespace/index.md)
- [list of `tasks`](../01.tasks/index.mdx)

Optionally, a flow can also have:
- [inputs](../05.inputs/index.md)
- [outputs](../06.outputs/index.md)
- [variables](../04.variables/index.md)
- [triggers](../07.triggers/index.mdx)
- [labels](../08.labels/index.md)
- [pluginDefaults](../09.plugin-defaults/index.md)
- [errors](../11.errors/index.md)
- [finally](../19.finally/index.md)
- [retries](../12.retries/index.md)
- [sla](../18.sla/index.md)
- [concurrency](../14.concurrency/index.md)
- [descriptions](../15.descriptions/index.md)
- [disabled](../16.disabled/index.md)
- [revision](../../06.concepts/03.revision/index.md)
- [checks](../07.checks/index.md)

## Flow sample

Below is a sample flow definition. It uses tasks available in Kestra core for testing purposes, such as the `Return` or `Log` tasks, and demonstrates how to use `labels`, `inputs`, `variables`, `pluginDefaults`, and various `descriptions`.

```yaml
id: hello-world
namespace: company.team

description: flow **documentation** in *Markdown*

labels:
  env: prod
  team: engineering

inputs:
  - id: my-value
    type: STRING
    defaults: "default value"
    description: This input has a default value.
variables:
  first: "1"
  second: "{{vars.first}} > 2"

tasks:
  - id: date
    type: io.kestra.plugin.core.debug.Return
    description: "Some task **documentation** in *Markdown*"
    format: "A log line content with a contextual date variable {{taskrun.startDate}}"

pluginDefaults:
  - type: io.kestra.plugin.core.log.Log
    values:
      level: ERROR
```

### Plugin defaults

Use `pluginDefaults` to avoid repeating common configurations across multiple tasks of the same type. This is a list of default task properties that will be applied to each task of a certain type inside your flow. Refer to the [Plugin Defaults documentation](../09.plugin-defaults/index.md) for more details.

### Variables

Flow-level variables define key/value pairs that tasks can access using `{{ vars.key }}`. Refer to the [flow variables documentation](../04.variables/index.md) for more details.

### List of tasks

The most important part of a flow is the list of tasks that will be run sequentially when the flow is executed.

## Disable a flow

By default, all flows are active and will execute whether or not a trigger has been set. You can [disable a flow](../16.disabled/index.md) to temporarily prevent it from running. This is useful for pausing scheduled executions, troubleshooting, or testing.

![Disable a Flow](./disable_flow.png)

## Task

A task is a single action in a flow. A task can have properties, use flow inputs and other tasks' outputs, perform an action, and produce an [output](#outputs).

There are two kinds of tasks in Kestra:
- Runnable Tasks – Perform actual work (API calls, database queries, computations). Executed by workers.
- Flowable Tasks – Control orchestration (branching, looping, parallelization). Executed by the executor, not suitable for heavy computation.

### Runnable Task

[Runnable Tasks](../01.tasks/01.runnable-tasks/index.md) handle computational work in the flow. For example, these include file system operations, API calls, database queries, etc.
These tasks can be compute-intensive and are handled by workers. By default, Kestra only includes a few Runnable Tasks. However, many of them are available as [plugins](/plugins), and if you use our default Docker image, plenty of them are already included.

### Flowable Task

[Flowable Tasks](../01.tasks/00.flowable-tasks/index.md) only handle flow logic (branching, grouping, parallel processing, etc.) and start new tasks. For example, the [Switch task](/plugins/core/flow/io.kestra.plugin.core.flow.switch) decides the next task to run based on some inputs.

A Flowable Task is handled by an executor and can be called very often. Because of that, these tasks cannot include intensive computations, unlike Runnable Tasks. Most of the common Flowable Tasks are available in the default Kestra installation.

## Labels

Labels are key-value pairs that you can add to flows. Labels are used to **organize** flows and can be used to **filter executions** of any given flow from the UI.

## Inputs

Inputs are strongly typed parameters provided at execution time. They can be required or optional, with default values and validation rules.

Inputs of type `FILE` are uploaded to Kestra's [internal storage](../../08.architecture/data-components/index.md#internal-storage) and made available for all tasks.

Flow inputs can be seen in the **Overview** tab of the **Execution** page.

## Outputs

Outputs are results produced by tasks or flows. Outputs can be reused in later tasks or downloaded if stored in internal storage.

Some outputs are of a special type and are stored in Kestra's internal storage. Kestra automatically makes these outputs available for all tasks.

You can view:
- task outputs in the **Outputs** tab of the **Execution** page
- flow outputs in the **Overview** tab of the **Execution** page

If an output is a file from the internal storage, it will be available to download. For more details on both task and flow outputs, see the [Outputs](../06.outputs/index.md) page.
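As a short sketch of how one task's output feeds another (using the core `Return` and `Log` tasks shown in the flow sample above; the flow and task ids are illustrative):

```yaml
id: outputs_example
namespace: company.team

tasks:
  # The Return task produces an output named `value`
  - id: produce
    type: io.kestra.plugin.core.debug.Return
    format: "{{ taskrun.startDate }}"

  # A downstream task reads it via the `outputs` expression
  - id: consume
    type: io.kestra.plugin.core.log.Log
    message: "Upstream task returned: {{ outputs.produce.value }}"
```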
## Revision

Every change to a flow creates a new revision. Kestra automatically manages revisions, similar to version control, and you can view them in the **Revisions** tab.

You can access old revisions inside the **Revisions** tab of the **Flows** page.

## Triggers

[Triggers](../07.triggers) are a way to start a flow from external events. For example, a trigger might initiate a flow at a scheduled time or based on external events (webhooks, file creation, message in a broker, etc.).

## Flow variable expressions

Flows have a number of variable expressions giving you information about them dynamically. A few examples include:

| Parameter | Description |
|-----------|-------------|
| `{{ flow.id }}` | The identifier of the flow. |
| `{{ flow.namespace }}` | The name of the flow namespace. |
| `{{ flow.tenantId }}` | The identifier of the tenant (EE only). |
| `{{ flow.revision }}` | The revision of the flow. |

## Listeners (deprecated)

Listeners are special tasks that can listen to the current flow and launch tasks *outside the flow*, i.e., tasks that are not part of the flow itself. The results of listeners do not change the execution status of the flow. Listeners are mainly used to send notifications or handle special behavior outside the primary flow.

:::alert{type="warning"}
These features are retained for backward compatibility and will be removed in future versions. Use alternative patterns (e.g., triggers, reusable tasks) instead.
:::

## Templates (deprecated)

Templates are lists of tasks that can be shared between flows. You can define a template and call it from other flows. Templates allow you to share a list of tasks and keep them updated without changing all flows that use them.

## FAQ

### Where does Kestra store flows?

Flows are stored in a serialized format directly **in the Kestra backend database**.
The easiest way to add new flows is to add them directly from the Kestra UI. You can also use [`kestractl flows deploy`](../../kestra-cli/kestractl/index.md) to push flows from the command line, or use the Git Sync pattern or CI/CD integration to add flows automatically after a pull request is merged to a given Git branch. To see how flows are represented in a file structure, you can leverage the `_flows` directory in the [Namespace Files](../../06.concepts/02.namespace-files/index.md) editor. ### How to load flows at server startup? To pre-load flows from a directory when Kestra starts (so they’re available immediately), use the `-f` or `--flow-path` flag on the server command: ```bash ./kestra server standalone --flow-path /path/to/flows ``` Point this to a directory of YAML flow definitions; Kestra will load them at startup and place them in the namespaces declared in each file. For more information about the Kestra server CLI, check the [Server CLI Reference](../../kestra-cli/kestra-server/index.md) section. ### Can I sync a local flows directory to be continuously loaded into Kestra? Yes. See [Synchronize Local Flows](../../15.how-to-guides/local-flow-sync/index.md) for syncing a local directory, or [Sync Flows from a Git Repository](../../15.how-to-guides/syncflows/index.md) for Git-based workflows. --- # Workflow Inputs in Kestra: Declare and Pass Parameters URL: https://kestra.io/docs/workflow-components/inputs > Make your Kestra flows dynamic with Inputs. Learn to declare typed inputs, validate values, and pass parameters at runtime for flexible workflow execution. Inputs are dynamic values passed to the flow at runtime.
A flow can be parameterized with inputs, allowing multiple executions of the same flow with different values. Flow inputs are stored in the execution context and accessed with `{{ inputs.parameter_name }}`. You can use inputs to make your tasks more dynamic. For instance, you can use an input to dynamically define the path of a file that needs to be processed within a flow. You can inspect input values in the **Overview** tab of the **Execution** page and set a custom `displayName` for each input to make the interface more readable. ## Declaring inputs You can declare as many inputs as necessary for any flow. Inputs can be **required** or **optional**. If an input is required, you must provide a value at runtime or set a `defaults` value; otherwise, the execution will not be created. All inputs are validated when the execution is created; invalid inputs prevent the execution from being created. :::alert{type="warning"} If an execution is **not created** due to invalid or missing inputs, it will not appear in the executions list. ::: Below is an example flow using several inputs: ```yaml id: inputs namespace: company.team inputs: - id: string type: STRING defaults: "Hello World!" displayName: "A string input" - id: optional type: STRING required: false displayName: "An optional string" - id: int type: INT defaults: 100 displayName: "An integer input" - id: list_of_int type: ARRAY itemType: INT defaults: [1, 2, 3] displayName: "A list of integers" - id: bool type: BOOL defaults: true displayName: "A boolean input displayed as a toggle." 
- id: float type: FLOAT defaults: 100.12 displayName: "A float input" - id: dropdown type: SELECT displayName: "A dropdown input" defaults: VALUE_1 values: - VALUE_1 - VALUE_2 - VALUE_3 - id: dropdown_multi type: MULTISELECT values: - VALUE_1 - VALUE_2 - VALUE_3 required: true - id: instant type: DATETIME defaults: "2013-08-09T14:19:00Z" displayName: "A datetime input" - id: date type: DATE defaults: "2013-10-25" displayName: "A date input" - id: time type: TIME displayName: "A time input" defaults: "14:19:00" - id: duration type: DURATION defaults: "PT5M6S" displayName: "A duration input" - id: file type: FILE displayName: "Upload a file" defaults: nsfile:///hello.txt allowedFileExtensions: [".md", ".txt"] - id: json type: JSON displayName: "A JSON input" defaults: | [{"name": "kestra", "rating": "best in class"}] - id: uri type: URI defaults: "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv" displayName: "A URI input" - id: secret type: SECRET displayName: "A secret input" - id: yaml type: YAML defaults: - user: john email: john@example.com - user: will email: will@example.com displayName: YAML - id: nested.string type: STRING defaults: "Hello World!" displayName: "A nested string input" ``` :::alert{type="info"} The `FILE` type supports defaults via the universal file protocol. Use `nsfile:///` for namespace files or `file:///` for local files. Note: `file:///` works only for explicitly allowed paths. Bind‑mount the host directory into the Kestra container and include that path under `kestra.local-files.allowed-paths` in your configuration (e.g., `/scripts`). Otherwise, access to the path will be denied for security reasons. ::: ## Input types Inputs in Kestra are strongly typed and validated before starting the flow execution. Here is the list of supported data types: - `STRING`: Any string. Values are passed without parsing; for additional validation, use a regex `validator`. 
- `INT`: Must be a valid integer value (i.e., without any decimals). - `FLOAT`: Must be a valid float value (i.e., with decimals). - `SELECT`: Must be a valid string value from a predefined list of values. You can either pass those values directly using the `values` property or use the `expression` property to fetch the values dynamically from a KV store. Additionally, if `allowCustomValue` is set to true, the user can provide a custom value that is not in the predefined list. :::alert{type="info"} **Note:** Due to [YAML allowing Scalar content](https://yaml.org/spec/1.1/current.html#id864510) to be presented in several formats, the boolean “true” might also be written as “yes” and “false” as “no”. To avoid errors using Yes/No in the `SELECT` input type, wrap them in quotation marks to preserve string format: "Yes", "No". ::: - `MULTISELECT`: Must be one or more valid string values from a predefined list of values. You can either pass those values directly using the `values` property or use the `expression` property to fetch the values dynamically from a KV store. Additionally, if `allowCustomValue` is set to true, the user can provide a custom value that is not in the predefined list. - `BOOLEAN`: Must be `true` or `false` passed as strings. - `DATETIME`: Must be a valid full [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date and time with the timezone expressed in UTC format; pass input of type DATETIME in a string format following the pattern `2042-04-02T04:20:42.000Z`. - `DATE`: Must be a valid full [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date without the timezone from a text string such as `2042-12-03`. - `TIME`: Must be a valid full [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) time without the timezone from a text string such as `10:15:30`. - `DURATION`: Must be a valid full [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration from a text string such as `PT5M6S`. 
- `FILE`: Either a file uploaded at execution time as `Content-Type: multipart/form-data` with `Content-Disposition: form-data; name="files"; filename="<input-id>"` (where `files` is the literal form field name and `<input-id>` is the id of the `FILE` input), or a default file referenced via the universal file protocol using `nsfile:///path/to/file` (namespace file) or `file:///path/to/file` (local file from an allowed path). `FILE` type inputs also have the `allowedFileExtensions` property to control which types of files can be uploaded.
- `JSON`: Must be a valid JSON string and will be converted to a typed form.
- `YAML`: Must be a valid YAML string.
- `URI`: Must be a valid URI and will be kept as a string.
- `SECRET`: Encrypted string stored in the database. It is decrypted at runtime and can be used in all tasks. The value of a `SECRET` input is masked in the UI and in the execution context. Note that you need to set the [encryption key](../../configuration/05.security-and-secrets/index.md) in your [Kestra configuration](../../configuration/index.mdx) before using it.
- `ARRAY`: Must be a valid JSON array or a YAML list. The `itemType` property is required to ensure validation of the type of the array items.

All `FILE` inputs are automatically uploaded to Kestra's [internal storage](../../08.architecture/data-components/index.md#internal-storage) and accessible to all tasks. After the upload, the input variable will contain a fully qualified URL of the form `kestra:///.../.../` that will be automatically managed by Kestra and can be used as-is within any task.

## Input properties

Below is the list of available properties for all inputs regardless of their types:

- `id`: The input parameter identifier — this property is important as it's used to reference the input variables in your flow, e.g., `{{ inputs.user }}` references the input parameter named `user`.
- `type`: The data type of the input parameter, as described in the previous section.
- `required`: Whether the input is required. If `true` and neither a default nor a runtime value is provided, the execution will not be created. - `defaults`: The default value that is used if no custom input value is provided at runtime; this value must be provided as a string and will be set to the desired data type specified using the `type` property. - `prefill`: Starts with an initial value that can be cleared or set to `null` when the input is not required. Like an editable default, it allows workflows to support optional inputs that start with a suggestion but can still be reset to `null` at runtime. - `dependsOn`: Makes the input dependent on other inputs that must be provided first. - `displayName`: Label shown in the UI instead of the `id`. - `description`: Markdown description for the input. - `expression`: Use a pebble expression as a value -- e.g., `expression: "{{ kv('SELECT_VALUES') }}"`. - `autoSelectFirst`: A boolean property to auto-select the first list value in the dropdown as a default value (only usable for `SELECT` and `MULTISELECT` input types). This way, you don't need to explicitly set any `defaults` for that property. ## Input validation Kestra validates the `type` of each input. In addition to the type validation, some input types can be configured with validation rules that are enforced at execution time. - `STRING`: A `validator` property allows the addition of a validation [regex](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html). Validator patterns are subject to a 10-second timeout. The timeout is configurable via [`kestra.regex.timeout`](../../configuration/05.security-and-secrets/index.md#regex-timeout). - `SECRET`: Supports the same `validator` regex property as `STRING`, with the same 10-second timeout applied before the value is encrypted. - `INT`: `min` and `max` define the allowed range. - `FLOAT`: `min` and `max` define the allowed range. 
- `DURATION`: `min` and `max` define the allowed range.
- `DATE`: `after` and `before` properties help you ensure that the input value is within the allowed date range.
- `TIME`: `after` and `before` properties help you ensure that the input value is within the allowed time range.
- `DATETIME`: `after` and `before` properties help you ensure that the input value is within the allowed date and time range.

### Example: use input validators in your flows

To ensure that your input value is within a certain `integer` value range, you can use the `min` and `max` properties. Similarly, to ensure that your string input matches a regex pattern, you can provide a custom regex `validator`. The following flow demonstrates how this can be accomplished:

```yaml
id: regex_input
namespace: company.team

inputs:
  - id: age
    type: INT
    prefill: 42
    required: false
    min: 18
    max: 64

  - id: user
    type: STRING
    prefill: student
    required: false
    validator: ^student(\d+)?$

  - id: float
    type: FLOAT
    defaults: 3.2
    min: 0.2
    max: 5.3

  - id: duration
    type: DURATION
    min: "PT5M6S"
    max: "PT12H58M46S"

  - id: date
    type: DATE
    defaults: "2024-04-12"
    after: "2024-04-10"
    before: "2024-04-15"

  - id: time
    type: TIME
    after: "11:01:01"
    before: "11:04:01"

  - id: datetime
    type: DATETIME
    defaults: "2024-04-13T14:17:00Z"
    after: "2024-04-10T14:19:00Z"
    before: "2024-04-15T14:19:00Z"

tasks:
  - id: validator
    type: io.kestra.plugin.core.log.Log
    message: User {{ inputs.user }}, age {{ inputs.age }}
```

The `age`, `float`, and `duration` inputs must each be within the valid range between their `min` and `max` values. Specifically for the `age` input, we specify that this input is by default set to 42, but it can be overwritten at runtime to a value between 18 and 64. If you attempt to execute the flow with the `age` input set to 17 or 65, the validation will fail and the execution won't start.
Similarly, the Regex expression `^student(\d+)?$` is used to validate that the input argument `user` of type STRING follows the following pattern: - `^student`: This part of the regex asserts that the string must begin with the lowercase string value `student`. - `\d`: This part of the regex matches any digit (0-9). - `+`: This part of the regex asserts that there is one or more of the preceding token (i.e., one or more digits are allowed after the value `student`). - `()?`: The parentheses group the digits together, and the question mark makes the entire group optional — this means that the digits after the word `student` are optional. - `$`: This part of the regex asserts the end of the string. This ensures that the string doesn't contain any additional characters after the optional digits. With this pattern: - "student" is a match. - "student123" is a match. - "studentabc" is not a match because "abc" isn't a sequence of digits. - "student123abc" is not a match because no characters are allowed after `student` and the optional digits. Lastly, the `date`, `time`, and `datetime` inputs must be within a valid range between `after` and `before`. In the `date` example, the date provided must be between 10th April 2024 and 15th April 2024. Anything outside of this range will fail and the execution won't start. Try running this flow with various inputs or adjust the regex pattern to see how the input validation works. ## Nested inputs Using a `.` in an input `id` creates a nested input. Here's an example that includes 2 nested inputs: ```yaml id: nested_inputs namespace: company.team inputs: - id: nested.string type: STRING required: false - id: nested.int type: INT tasks: - id: log_inputs type: io.kestra.plugin.core.log.Log message: "{{ inputs.nested.string }} and {{ inputs.nested.int }}" ``` You can access the first input value using `{{ inputs.nested.string }}`. 
This provides type validation for nested inputs without resorting to raw JSON (JSON inputs are passed as strings). ## Array inputs Array inputs are used to pass a list of values to a flow. The `itemType` property is required to ensure validation of the type of the array items. This is useful when you want the user triggering the workflow to provide multiple values of a specific type, for example, a list of integers, strings, booleans, datetimes, etc. You can provide the default values as a JSON array or as a YAML list — both are supported. ```yaml id: array_demo namespace: company.team inputs: - id: my_numbers_json_list type: ARRAY itemType: INT defaults: [1, 2, 3] - id: my_numbers_yaml_list type: ARRAY itemType: INT defaults: - 1 - 2 - 3 tasks: - id: print_status type: io.kestra.plugin.core.log.Log message: received inputs {{ inputs }} ``` Below is how the array inputs are rendered in the UI when you create an execution: ![array_inputs](./array-inputs.png) ## Use an input value in a flow Inputs are available via `{{ inputs.name }}` or `{{ inputs['name'] }}`. If an input `id` contains characters like `-`, use the bracket form: `{{ inputs['name-example'] }}`. For example, if you declare the following inputs: ```yaml inputs: - id: mystring type: STRING required: true - id: my-file type: FILE ``` You can use the value of the input `mystring` inside dynamic task properties with `{{ inputs.mystring }}` but `my-file` would have to use `{{ inputs['my-file'] }}` because of the hyphen (`-`). We can see a full example below where `inputFiles` property is set to `{{ inputs['my-file'] }}`: ```yaml id: input_files namespace: company.team description: This flow shows how to pass files between inputs and tasks in Shell scripts. 
inputs:
  - id: my-file
    type: FILE

tasks:
  - id: rename
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - mv file.tmp output.tmp
    inputFiles:
      file.tmp: "{{ inputs['my-file'] }}"
    outputFiles:
      - "*.tmp"
```

## Set input values at flow execution

When executing a flow with inputs, you must provide all required inputs (unless a default is defined) for the execution to be created. Let's consider the following example that defines multiple inputs:

```yaml
id: kestra_inputs
namespace: company.team

inputs:
  - id: string
    type: STRING
    defaults: hello

  - id: optional
    type: STRING
    required: false

  - id: int
    type: INT

  - id: float
    type: FLOAT

  - id: instant
    type: DATETIME

  - id: file
    type: FILE
```

Here, `string` and `optional` can be omitted because `string` has a default and `optional` is not required. All other inputs must be specified at runtime.

### Set inputs from the web UI

When creating an execution from the web UI, you must set the inputs in the UI form. Kestra's UI generates a dedicated form based on your `inputs` definition. For example, datetime input properties have a date picker.

The input form for the inputs above looks as follows:

![Flow inputs](./inputs.png)

Once the inputs are set, you can trigger an execution of the flow.

### Set inputs when executing the flow using the API

To create an execution with these inputs using the API, we can use the `curl` command to make an API request:

```bash
curl -v "http://localhost:8080/api/v1/main/executions/company.team/kestra_inputs" \
  -H "Transfer-Encoding:chunked" \
  -H "Content-Type:multipart/form-data" \
  -F string="a string" \
  -F optional="an optional string" \
  -F int=1 \
  -F float=1.255 \
  -F instant="2023-12-24T23:00:00.000Z" \
  -F "files=@/tmp/128M.txt;filename=file"
```

Send files as `multipart/form-data` under the `files` field with `filename="<input-id>"`, where `<input-id>` is the id of the `FILE` input.
### Set inputs when executing the flow in Python To create an execution with these inputs in Python, you can use the following script: ```python from kestra import Flow flow = Flow() with open('/tmp/example.txt', 'rb') as fh: flow.execute('company.team', 'kestra_inputs', {'string': 'a string', 'optional': 'an optional string', 'int': 1, 'float': str(1.255), 'instant': '2020-01-14T23:00:00.000Z', 'files': ('file', fh, 'text/plain')}) ``` :::alert{type="info"} Wrap floats with `str()`; otherwise, a bytes-like object error may occur when sending a file input. ::: You can also use the `requests` library in Python to make requests to the Kestra API. Here's an example to execute a flow with multiple inputs (note that `MultipartEncoder` requires all field values to be strings): ```python import requests from requests_toolbelt.multipart.encoder import MultipartEncoder with open("/tmp/128M.txt", 'rb') as fh: url = "http://kestra:8080/api/v1/main/executions/company.team/kestra_inputs" mp_encoder = MultipartEncoder(fields={ "string": "a string", "optional": "an optional string", "int": "1", "float": "1.255", "instant": "2020-01-14T23:00:00.000Z", "files": ("file", fh, "text/plain") }) result = requests.post( url, data=mp_encoder, headers={"Content-Type": mp_encoder.content_type}, ) ``` ### Set inputs when executing the flow in Java To create an execution with these inputs in Java (with [Apache Http Client 5](https://hc.apache.org/index.html)), you can use the following script: ```java import org.apache.hc.client5.http.classic.methods.HttpPost; import org.apache.hc.client5.http.entity.mime.FileBody; import org.apache.hc.client5.http.entity.mime.MultipartEntityBuilder; import org.apache.hc.client5.http.entity.mime.StringBody; import org.apache.hc.client5.http.impl.classic.CloseableHttpClient; import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse; import org.apache.hc.client5.http.impl.classic.HttpClientBuilder; import org.apache.hc.core5.http.ContentType; import org.apache.hc.core5.http.HttpEntity; import 
java.io.File; class Application { public static void main(String[] args) { HttpEntity multipartEntity = MultipartEntityBuilder.create() .addPart("string", new StringBody("test", ContentType.DEFAULT_TEXT)) .addPart("int", new StringBody("1", ContentType.DEFAULT_TEXT)) .addPart("files", new FileBody(new File("/tmp/test.csv"), ContentType.DEFAULT_TEXT, "file")) .build(); try (CloseableHttpClient httpclient = HttpClientBuilder.create().build()) { HttpPost request = new HttpPost("http://kestra:8080/api/v1/main/executions/company.team/kestra_inputs"); request.setEntity(multipartEntity); CloseableHttpResponse response = httpclient.execute(request); System.out.println("Response " + response); } catch (Exception e) { throw new RuntimeException(e); } } } ``` ## Difference between inputs and variables Variables are similar to constants. They behave like inputs during execution but cannot be overridden once the execution starts. Variables must be defined before execution, whereas inputs can be set at execution time. Variables are best suited for values that you don't want to change and are used in multiple places within the flow. For example, a URL you use for an API request that won't change is best as a variable, whereas an email address that changes every time you execute your flow is best as an input. ## Dynamic inputs
Inputs in Kestra are strongly typed. Currently, you cannot enforce strong types and simultaneously use dynamically rendered Pebble expressions. The example below demonstrates using an expression inside of an input. When you select execute, the expression is rendered. ```yaml id: test namespace: company.team inputs: - id: date type: DATETIME defaults: "{{ now() }}" tasks: - id: print_date type: io.kestra.plugin.core.log.Log message: "hello on {{ inputs.date }}" ``` ### Dynamic Inputs with HTTP function With the `http()` function, you can make `SELECT` and `MULTISELECT` inputs dynamic by fetching options from an external API. This proves valuable when your data used in dropdowns changes frequently or when you already have an API serving that data for existing applications. The example below demonstrates how to create a flow with two dynamic dropdowns: one for selecting a product category and another for selecting a product from that category. The first dropdown fetches the product categories from an external HTTP API. The second dropdown makes another HTTP call to dynamically retrieve products matching the selected category. 
```yaml id: dynamic_dropdowns namespace: company.team inputs: - id: category type: SELECT expression: "{{ http(uri = 'https://dummyjson.com/products/categories') | jq('.[].slug') }}" - id: product type: SELECT dependsOn: inputs: - category expression: "{{ http(uri = 'https://dummyjson.com/products/category/' + inputs.category) | jq('.products[].title') }}" tasks: - id: display_selection type: io.kestra.plugin.core.log.Log message: | You selected Category: {{ inputs.category }} And Product: {{ inputs.product }} ``` --- Dynamic inputs are useful for flows using authenticated API requests like the following: ```yaml id: approversFlow namespace: company.team inputs: - id: executionIdsToBeApproved type: MULTISELECT expression: >- {{ http( uri = 'http://localhost:8080/api/v1/internal/executions/search?state=PAUSED', method = 'GET', contentType = 'application/json', headers={ 'User-Agent': 'kestra', 'Connection': 'keep-alive', 'Authorization': 'Bearer ' ~ secret("bearerToken") } ) | jq('.results[] | "ExecutionId: \(.id), FlowId: \(.flowId), RequestedBy: \(.labels[] | select(.key == "system.username").value) InputParams: \( .inputs | to_entries | map("\(.key):\(.value)") | join(" ") )"') }} tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! 🚀 ``` :::alert{type="info"} When using `http()` inside an `expression` with secrets in headers (e.g., an authenticated API request), use named arguments and string concatenation ([Pebble Literals](https://pebbletemplates.io/wiki/guide/basic-usage/#literals)). The key to the syntax is the string concatenation operator `~`. ::: ## Conditional inputs for interactive workflows You can set up inputs that depend on other inputs, letting further inputs be conditionally displayed based on user choices. This is useful for use cases such as approval workflows or dynamic resource provisioning. ### How it works Create inputs that change based on other inputs using the `dependsOn` and `condition` properties. 
The example below shows different inputs appearing based on the selected resource type: ```yaml id: request_resources namespace: company.team inputs: - id: resource_type displayName: Resource type type: SELECT values: - Access permissions - SaaS application - Development tool - Cloud VM - id: access_permissions displayName: Access permissions type: SELECT expression: "{{ kv('access_permissions') }}" dependsOn: inputs: - resource_type condition: "{{ inputs.resource_type == 'Access permissions' }}" - id: saas_applications displayName: SaaS application type: MULTISELECT expression: "{{ kv('saas_applications') }}" dependsOn: inputs: - resource_type condition: "{{ inputs.resource_type == 'SaaS application' }}" - id: cloud_provider displayName: Cloud provider type: SELECT values: - AWS - GCP - Azure dependsOn: inputs: - resource_type condition: "{{ inputs.resource_type == 'Cloud VM' }}" - id: cloud_vms displayName: Cloud VM type: SELECT expression: "{{ kv('cloud_vms')[inputs.cloud_provider] }}" dependsOn: inputs: - resource_type - cloud_provider condition: "{{ inputs.resource_type == 'Cloud VM' }}" ``` In this example: - The `resource_type` input controls which additional inputs (such as `access_permissions`, `saas_applications`, and `cloud_vms`) appear. - `dependsOn` links inputs; `condition` defines when to display the related input. Before running the flow, set up the key-value pairs for each input. Expand the example below to add all key-value pairs with a helper flow. 
:::collapse{title="Flow adding key-value pairs"} ```yaml id: add_kv_pairs namespace: company.team tasks: - id: access_permissions type: io.kestra.plugin.core.kv.Set key: "{{ task.id }}" kvType: JSON value: | ["Admin", "Developer", "Editor", "Launcher", "Viewer"] - id: saas_applications type: io.kestra.plugin.core.kv.Set key: "{{ task.id }}" kvType: JSON value: | ["Slack", "Notion", "HubSpot", "GitHub", "Jira"] - id: development_tools type: io.kestra.plugin.core.kv.Set key: "{{ task.id }}" kvType: JSON value: | ["Cursor", "IntelliJ IDEA", "PyCharm Professional", "DataGrip"] - id: cloud_vms type: io.kestra.plugin.core.kv.Set key: "{{ task.id }}" kvType: JSON value: | { "AWS": ["t2.micro", "t2.small", "t2.medium", "t2.large"], "GCP": ["f1-micro", "g1-small", "n1-standard-1", "n1-standard-2"], "Azure": ["Standard_B1s", "Standard_B1ms", "Standard_B2s", "Standard_B2ms"] } - id: cloud_regions type: io.kestra.plugin.core.kv.Set key: "{{ task.id }}" kvType: JSON value: | { "AWS": ["us-east-1", "us-west-1", "us-west-2", "eu-west-1"], "GCP": ["us-central1", "us-east1", "us-west1", "europe-west1"], "Azure": ["eastus", "westus", "centralus", "northcentralus"] } ``` ::: You can also [add these key-value pairs](../../06.concepts/05.kv-store/index.md) via the API or the UI. ## Custom values in SELECT and MULTISELECT inputs If the predefined dropdown values do not fit a user’s needs, set `allowCustomValue` to `true` to allow custom entries. This lets you offer defaults while still accepting user-provided values. In the example below, `cloud_provider` lets users select a common provider (AWS, GCP, Azure) or enter a custom value (e.g., Oracle Cloud). 
```yaml id: custom_values namespace: company.team inputs: - id: cloud_provider displayName: Cloud provider type: SELECT allowCustomValue: true values: - AWS - GCP - Azure tasks: - id: print_status type: io.kestra.plugin.core.log.Log message: Selected cloud provider {{ inputs.cloud_provider }} ``` --- # Labels in Kestra – Tag Flows and Executions URL: https://kestra.io/docs/workflow-components/labels > Organize and filter Kestra flows and executions with Labels. Use key-value tags to group workflows by team, environment, project, or priority. Labels are key-value pairs in Kestra that let you organize [flows](../01.flow/index.md) and [executions](../03.execution/index.md) across multiple dimensions, without being restricted to a single hierarchy. You can organize flows and executions by project, priority, maintainer, or any other relevant criteria. Unlike fixed categories, labels support flexible filtering, grouping, and discovery. Labels can be associated with both the flow definition and individual execution instances. This allows you to distinguish between different executions of the same flow.
Labels help organize and filter flows and their executions based on your criteria. Adding a labels section to flows lets you sort and group executions, making them easier to discover and analyze. Here's a simple example of a flow with two labels defined: ```yaml id: process_invoice_flow namespace: company.team labels: team: finance priority: HIGH tasks: - id: hello type: io.kestra.plugin.core.log.Log message: hello from a flow with labels ``` Executing such a flow results in the execution inheriting both `team: finance` and `priority: HIGH` labels by default. However, you can also define additional labels at the time of execution launch. ## Benefits of labels Labels provide a simple and effective way to organize and filter flows and their executions. Key benefits include: - **Observability**: Track execution status, monitor errors, and rerun only a subset of executions. - **Filtering**: Quickly find executions, mark test runs, track ML experiments, or label runs by runtime inputs. - **Organization**: Manage workflows at scale by grouping executions by team, project, maintainer, or environment. You can also build custom dashboards using labels, for example: `http://localhost:8080/ui/executions?filters[labels][EQUALS][team]=finance`. ### Common scenarios To group flows related to the same project across [Kestra namespaces](../02.namespace/index.md), you can use a common flow label, such as `project: XYZ-123`. When running the `process_invoice_flow`, you can add execution labels (e.g., `currency`) to capture attributes of the processed invoice. This allows you to filter executions by specific values, like `currency: USD`. You can also label executions related to a pre-production run. For example, using a `purpose: pre-prod` label. This enables you to safely delete only those executions associated with the pre-production phase. 
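The project-grouping and pre-production scenarios above can be sketched as flow-level labels. This is an illustrative sketch; the flow name and label values (`XYZ-123`, `pre-prod`) are placeholders:

```yaml
id: invoice_processing
namespace: company.team

labels:
  project: XYZ-123   # groups flows of the same project across namespaces
  purpose: pre-prod  # marks executions that can later be safely bulk-deleted

tasks:
  - id: process
    type: io.kestra.plugin.core.log.Log
    message: processing in {{ flow.namespace }}
```

A per-run label such as `currency: USD` would instead be added when launching the execution, since its value changes with every run.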
In multi-team environments, labels help you separate executions by team, for example `support: EMEA` and `support: APAC`, when the same flow handles data from different regions. ## Execution labels propagated from flow labels When you execute a flow with labels, those labels are automatically applied to its executions. ![labels1](./labels1.png) ![labels2](./labels2.png) ## Set execution labels when executing a flow from the UI When executing flows manually from the UI, you can override and define new labels at the execution's start by expanding the **Advanced configuration** section.
You can set labels from the UI even after an execution completes. This helps with collaboration and troubleshooting. For example, you can add a label to a failed execution to indicate its status, such as whether it has been acknowledged, is being investigated, or has been resolved. To set labels from the UI, go to the **Overview** tab of an **Execution** and click on the "Set labels" button. You can add multiple labels at once. ![labels3](./labels3.png) You can even set labels for multiple executions at once from the UI. This feature is helpful for bulk operations, such as acknowledging multiple failed executions at once after an outage. ![labels4](./labels4.png) ## Set labels based on flow inputs and task outputs You have the ability to set execution labels from a dedicated [Labels task](/plugins/core/execution/io.kestra.plugin.core.execution.labels). This task provides a dynamic way to label your flows, helping with observability, debugging, and monitoring of failures. This task lets you set custom execution labels based on flow inputs, task outputs, or other dynamic workflow data. There are two ways to set labels in this task: 1. **Using a Map (Key-Value Pairs)**: ideal when the `key` is static and the `value` is dynamic. The key is the label name, and the value is a dynamic label value that might be derived from the flow inputs or task outputs. In the example below, the task `update_labels` overrides the default label `song` with the output of the `get` task, and adds a new label called `artist`. ```yaml id: labels_override namespace: company.team labels: song: never_gonna_give_you_up tasks: - id: get type: io.kestra.plugin.core.debug.Return format: never_gonna_stop - id: update_labels type: io.kestra.plugin.core.execution.Labels labels: song: "{{ outputs.get.value }}" artist: rick_astley # new label ``` 2. **Using a List of Key-Value Pairs**: particularly useful if both the `key` and the `value` are dynamic properties. 
```yaml id: labels namespace: company.team inputs: - id: user type: STRING defaults: Rick Astley - id: url type: STRING defaults: song_url tasks: - id: update_labels_with_map type: io.kestra.plugin.core.execution.Labels labels: customerId: "{{ inputs.user }}" - id: get type: io.kestra.plugin.core.debug.Return format: https://t.ly/Vemr0 - id: update_labels_with_list type: io.kestra.plugin.core.execution.Labels labels: - key: "{{ inputs.url }}" value: "{{ outputs.get.value }}" ``` ### Overriding flow labels at runtime You can set default labels at the flow level and override them at runtime. This approach is useful for overriding labels dynamically during execution, based on task results. The example below shows how to override the default label `song` with the output of the `get` task: ```yaml id: flow_with_labels namespace: company.team labels: song: never_gonna_give_you_up artist: rick-astley genre: pop tasks: - id: get type: io.kestra.plugin.core.debug.Return format: never_gonna_stop - id: update-list type: io.kestra.plugin.core.execution.Labels labels: song: "{{ outputs.get.value }}" ``` In this example, the default label `song` is overridden by the output of the `get` task. --- # Namespaces in Kestra – Organize and Secure Workflows URL: https://kestra.io/docs/workflow-components/namespace > Organize your Kestra workflows with Namespaces. Learn to group flows, manage access, and structure your orchestration environment hierarchically. Namespaces are logical groupings of flows and their components. Use namespaces to organize workflows and manage access to secrets, key-value pairs, plugin defaults, variables and more.
You can think of a namespace as a **folder for your flows**. Similar to folders on your file system, namespaces can be used to organize flows into logical categories, and like folders, they can be nested indefinitely. If you're looking to completely isolate environments with their own resources on the same Kestra instance, you should consider [Tenants](../../07.enterprise/02.governance/tenants/index.md), part of the [Enterprise Edition](../../07.enterprise/index.mdx). ## Hierarchical structure with nested namespaces Using the dot `.` symbol, you can add a hierarchical structure to your namespaces, which allows you to logically separate environments, projects, teams, and departments. This way, your product, engineering, marketing, finance, and data teams can all use the same Kestra instance, all while keeping their flows organized and separated. Various stakeholders can have their own child namespaces that belong to a parent namespace grouping them by environment, project, or team. ## Namespace name A namespace name is built from alphanumeric characters and underscores, optionally separated by `.`. The hierarchy depth for namespaces is unlimited. Here are some examples of namespaces: - `project_one` - `company.project_two` - `company.team.project_three` ## Using namespaces to organize flows and files When you create a flow, you can assign a namespace to it: ```yaml id: hello_world namespace: company.team tasks: - id: log_task type: io.kestra.plugin.core.log.Log message: hi from {{ flow.namespace }} ``` :::alert{type="warning"} **Note:** Once you've saved your flow, you won't be able to change its namespace. You'll need to make a new flow in order to change the namespace. ::: Below, the flow is assigned to the `company.team` namespace. 
This assignment of a namespace to a flow already provides a benefit of improved organization and filtering: ![Namespace Organization](./namespace_1.png) Additionally, you can organize your code at the namespace level using the embedded Code editor and [Namespace Files](../../06.concepts/02.namespace-files/index.md), with the option to [sync those files from Git](../../version-control-cicd/04.git/index.md): ![Namespace Flow and Files](./namespace_2.png) ## Namespace tab In the **Namespaces** tab, you can see all the namespaces associated with the different flows in Kestra.
You can open the details about any namespace by clicking on the name or details button to the right of that namespace. ![namespace_tab](./namespace_tab.png) When you select the details button for any namespace, the namespace overview page opens, which details the executions of flows in that namespace. ![namespace_overview](./namespace_overview.png) At the top of this page, you have different tabs: 1. **Overview:** the default landing page of the Namespace, containing dashboards and a summary of the executions of the different flows in this namespace. 2. **Executions:** View and manage all execution details. 3. **Flows:** View all flows in the namespace with execution details and statistics. Select the details button to navigate to a specific flow's page. 4. **Dependencies:** View flow dependencies through subflows or flow triggers. 5. **KV Store:** Manage key-value pairs for this namespace. See [KV Store](../../06.concepts/05.kv-store/index.md) for details. 6. **Files:** Manage, view, and modify all Namespace Files. The remaining tabs (Edit, Variables, Plugin Defaults, Secrets, and Audit Logs) are only available in Kestra EE. More details about them can be found in our [Enterprise Edition documentation](../../07.enterprise/index.mdx). --- # Workflow Outputs in Kestra: Share Data Between Tasks URL: https://kestra.io/docs/workflow-components/outputs > Leverage Outputs in Kestra to share data between tasks and flows. Learn to capture, store, and reuse execution results and artifacts in your workflows. Outputs let you pass data between tasks and flows.
A workflow execution can generate **outputs**. Outputs are stored in the flow’s execution context and can be accessed by all downstream tasks and flows. Each task defines its own output attributes — see the task’s documentation for details. You can retrieve outputs from other tasks within all [dynamic properties](../01.tasks/index.mdx#dynamic-vs-static-task-properties). :::alert{type="warning"} **Do not use Outputs to fetch sensitive data (such as passwords, secrets, or API tokens).** Fetching Secrets from an external Secrets Manager via a task poses a significant security risk. All data fetched via outputs is **stored in clear text in multiple places** (including the backend database, internal storage, logs, API requests). For secure handling of secrets, **exclusively** use [Secrets](../../06.concepts/04.secret/index.md). [Kestra EE](../../07.enterprise/02.governance/secrets/index.md) and [Kestra Cloud](/cloud) offer reliable secrets management including native integrations with various [Secrets Managers](../../07.enterprise/02.governance/secrets-manager/index.md). ::: ## Using outputs Below is an example of how to use the output of the `produce_output` task in the `use_output` task. We use the [Return](/plugins/core/debug/io.kestra.plugin.core.debug.return) task that has one output attribute named `value`. ```yaml id: task_outputs_example namespace: company.team tasks: - id: produce_output type: io.kestra.plugin.core.debug.Return format: my output {{ execution.id }} - id: use_output type: io.kestra.plugin.core.log.Log message: The previous task output is {{ outputs.produce_output.value }} ``` In this example, the first task produces an output from the `format` property. This output attribute is then used in the second task's `message` property. The expression `{{ outputs.produce_output.value }}` references the previous task's output attribute. :::alert{type="info"} In the example above, the **Return** task produces an output attribute `value`. 
Every task produces different output attributes. You can check each task's outputs documentation or use the **Outputs** tab of the **Executions** page to find out about specific task output attributes. ::: The **Outputs** tab shows the output of the `produce_output` task. There is no output for the `use_output` task, as it only logs a message. ![task_outputs_example](./task_outputs_example.png) In the next example, we can see a file is passed between an input and a task, where the task generates a new file as an output: ```yaml id: bash_with_files namespace: company.team description: This flow shows how to pass files between inputs and tasks in Shell scripts. inputs: - id: file type: FILE tasks: - id: rename type: io.kestra.plugin.scripts.shell.Commands commands: - mv file.tmp output.tmp inputFiles: file.tmp: "{{ inputs.file }}" outputFiles: - "*.tmp" ``` :::alert{type="info"} Since 0.14, Outputs are no longer rendered recursively. You can read more about this change and how to change this behavior in the [0.14 Migration guide](../../11.migration-guide/v0.14.0/recursive-rendering/index.md). ::: ## Internal storage Each task can store data in Kestra's internal storage. If an output is stored in internal storage, it contains a URI pointing to the file location. This output attribute can be used by other tasks to access the stored data. The following example stores the query results in internal storage, then accesses them in the `write_to_csv` task: ```yaml id: output_sample namespace: company.team tasks: - id: output_from_query type: io.kestra.plugin.gcp.bigquery.Query sql: | SELECT * FROM `bigquery-public-data.wikipedia.pageviews_2023` WHERE DATE(datehour) = current_date() ORDER BY datehour desc, views desc LIMIT 10 store: true - id: write_to_csv type: io.kestra.plugin.serdes.csv.IonToCsv from: "{{ outputs.output_from_query.uri }}" ``` ## Flow outputs A flow can also produce strongly typed outputs. You can add them using the `outputs` attribute in the flow definition. 
Here is an example of a flow that produces an output: ```yaml id: flow_outputs namespace: company.team tasks: - id: mytask type: io.kestra.plugin.core.debug.Return format: this is a task output used as a final flow output outputs: - id: final type: STRING value: "{{ outputs.mytask.value }}" ``` An Output can have one of the following types: `ARRAY`, `BOOLEAN`, `DATE`, `DATETIME`, `DURATION`, `EMAIL`, `ENUM`, `FILE`, `FLOAT`, `INT`, `JSON`, `MULTISELECT`, `SECRET`, `STRING`, `TIME`, `URI`, or `YAML`. Outputs are defined as a list of key-value pairs. The `id` is the name of the output attribute (must be unique within a flow), and the `value` is the value of the output. You can also add a `description` to the output. Flow outputs appear in the **Overview** tab of the **Executions** page. ![subflow_output](./subflow_output.png) ### Pass data between flows using flow outputs Here is how you can access the flow output in a parent flow: ```yaml id: parent_flow namespace: company.team tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow flowId: flow_outputs namespace: company.team wait: true - id: log_subflow_output type: io.kestra.plugin.core.log.Log message: "{{ outputs.subflow.outputs.final }}" ``` In the example above, the `subflow` task produces an output attribute `final`. This output attribute is then used in the `log_subflow_output` task. :::alert{type="info"} Note how the `outputs` are set twice within the `"{{outputs.subflow.outputs.final}}"`: 1. once to access outputs of the `subflow` task 2. once to access the outputs of the subflow itself — specifically, the `final` output ::: Here is what you will see in the **Outputs** tab of the **Executions** page in the parent flow: ![subflow_output_parent](./subflow_output_parent.png) ### Return outputs conditionally You can return different outputs based on conditions. For instance, if a given task is skipped, you may want to return a fallback value or return the output of another task. 
Here is an example of how you can achieve this: ```yaml id: conditionally_return_output namespace: company.team inputs: - id: run_task type: BOOLEAN defaults: true tasks: - id: main type: io.kestra.plugin.core.debug.Return format: Hello World! runIf: "{{ inputs.run_task }}" - id: fallback type: io.kestra.plugin.core.debug.Return format: fallback output outputs: - id: flow_output type: STRING value: "{{ tasks.main.state != 'SKIPPED' ? outputs.main.value : outputs.fallback.value }}" ``` Note how the Ternary Operator `{{ condition ? value_if_true : value_if_false }}` is used in the output expression `{{ tasks.main.state != 'SKIPPED' ? outputs.main.value : outputs.fallback.value }}` to return the output of the `main` task if it is not skipped; otherwise, it returns the output of the `fallback` task. ## Dynamic variables (Each tasks) ### Current taskrun value In dynamic flows (for example, with an **Each** loop), variables are passed to tasks dynamically. You can access the current taskrun value with `{{ taskrun.value }}` like this: ```yaml id: taskrun_value_example namespace: company.team tasks: - id: each type: io.kestra.plugin.core.flow.ForEach values: ["alpha", "beta", "gamma"] tasks: - id: inner type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.value }} > {{ taskrun.startDate }}" ``` The **Outputs** tab contains the output for each iteration of the inner task. ![taskrun_value_example](./taskrun_value_example.png) ### Loop over a list of JSON objects Within the loop, the `value` is always a JSON string, so `{{ taskrun.value }}` is the current element as a JSON string. To access its properties, wrap it in the `fromJson()` function, which returns an object whose properties you can access directly. 
```yaml id: loop_sequentially_over_list namespace: company.team tasks: - id: each type: io.kestra.plugin.core.flow.ForEach values: - {"key": "my-key", "value": "my-value"} - {"key": "my-complex", "value": {"sub": 1, "bool": true}} tasks: - id: inner type: io.kestra.plugin.core.debug.Return format: "{{ fromJson(taskrun.value).key }} > {{ fromJson(taskrun.value).value }}" ``` ### Specific outputs for dynamic tasks Dynamic tasks run multiple iterations of a set of sub-tasks. For example, **ForEach** produces other tasks dynamically depending on its `values` property. You can access each iteration's output of dynamic tasks by using the following syntax: ```yaml id: output_sample namespace: company.team tasks: - id: each type: io.kestra.plugin.core.flow.ForEach values: ["s1", "s2", "s3"] tasks: - id: sub type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.value }} > {{ taskrun.startDate }}" - id: use type: io.kestra.plugin.core.debug.Return format: "Previous task produced output: {{ outputs.sub.s1.value }}" ``` The `outputs.sub.s1.value` expression accesses the `value` output of the `sub` task for the `s1` iteration. ### Previous task lookup It is also possible to locate a specific dynamic task by its `value`: ```yaml id: dynamic_looping namespace: company.team tasks: - id: each type: io.kestra.plugin.core.flow.ForEach values: ["alpha", "beta", "gamma"] tasks: - id: inner type: io.kestra.plugin.core.debug.Return format: "{{ taskrun.value }}" - id: end type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ outputs.inner['alpha'].value }}" ``` It uses the format `outputs.TASKID[VALUE].ATTRIBUTE`. The special bracket `[]` in `[VALUE]` is called the subscript notation; it enables using special characters such as spaces or hyphens (`-`) in task identifiers or output attributes. 
### Lookup in sibling tasks Sometimes it is useful to access outputs from other tasks in the same task tree, known as sibling tasks. If the task tree is static, for example when using the [Sequential](/plugins/core/flow/io.kestra.plugin.core.flow.sequential) task, you can use the `{{ outputs.task_id.value }}` notation where `task_id` is the identifier of the sibling task, as you would outside of the task tree. For example: ```yaml id: sibling_tasks namespace: company.team tasks: - id: sequential type: io.kestra.plugin.core.flow.Sequential tasks: - id: first type: io.kestra.plugin.core.output.OutputValues values: data: "hello from task 1" - id: second type: io.kestra.plugin.core.output.OutputValues values: data: "{{ outputs.first.values.data }}" - id: log_siblings type: io.kestra.plugin.core.log.Log message: "{{ outputs.second.values.data }}" ``` If the task tree is dynamic, for example when using the [ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach) task, you need to use `{{ outputs.task_id[taskrun.value] }}` to access the current tree task. `taskrun.value` is a special variable that holds the current value of the ForEach task. For example: ```yaml id: loop_with_sibling_tasks namespace: company.team tasks: - id: foreach type: io.kestra.plugin.core.flow.ForEach values: ["alpha", "beta", "gamma"] tasks: - id: first type: io.kestra.plugin.core.output.OutputValues values: data: "First value: {{ taskrun.value }}" - id: second type: io.kestra.plugin.core.output.OutputValues values: data: "{{ outputs.first[taskrun.value].values.data }}" - id: log_output_from_foreach type: io.kestra.plugin.core.log.Log message: "{{ outputs.second['alpha'].values.data }}" ``` You can also use the `currentEachOutput` function to access the current tree task. See [Function Reference](../../expressions/04.functions/index.mdx) for more details. 
:::alert{type="warning"} Accessing sibling task outputs is impossible on [Parallel](/plugins/core/flow/io.kestra.plugin.core.flow.parallel) as it runs tasks in parallel. ::: For more examples and guidance on accessing sibling outputs inside `ForEach`, including how to read them both inside and outside the loop, see [Best Practices for ForEach and ForEachItem](../../14.best-practices/11.foreach-and-foreachitem/index.md#example-use-sibling-outputs-correctly-inside-foreach). ## Outputs preview Kestra provides a preview option for output files stored in internal storage. The following flow demonstrates this feature: ```yaml id: get_employees namespace: company.team tasks: - id: download type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/raw/main/ion/employees.ion ``` On flow execution, the file is downloaded into the Kestra internal storage. When you go to the Outputs tab for this execution, the `uri` attribute of the `download` task contains the file location on Kestra's internal storage and has a Download and a Preview button. ![preview_button](./preview_button.png) On clicking the Preview button, you can preview the contents of the file in a tabular format, making it extremely easy to check the contents of the file without downloading it. ![preview](./preview.png) ## Using debug expression You can evaluate the output further using the **Debug Expression** functionality in the **Outputs** tab. Consider the following flow: ```yaml id: json_values namespace: company.team tasks: - id: sample_json type: io.kestra.plugin.core.debug.Return format: '{"data": [1, 2, 3]}' ``` When you run this flow, the **Outputs** tab will contain the output for the `sample_json` task, as shown below: ![json_values](./json_values.png) You can select the task from the drop-down menu. 
Here, we select "sample_json" from the drop-down and click **Debug Expression**:

![json_values_render_expression](./json_values_render_expression.png)

You can now use Pebble expressions to evaluate and analyze the output data further.
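For example, with the flow above you could evaluate expressions like these in Debug Expression (`fromJson` and the `jq` filter are standard Kestra expression functions; the exact results depend on your execution):

```
{{ outputs.sample_json.value }}
{{ fromJson(outputs.sample_json.value).data }}
{{ fromJson(outputs.sample_json.value).data | jq('.[0]') | first }}
```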
:::alert{type="info"}
Note: This was previously called **Render expression**.
:::

## Encrypted outputs from script tasks

:::badge{version=">=0.23" editions="EE,Cloud"}
:::

For [script task Outputs](../../16.scripts/06.outputs-metrics/index.md) that have sensitive values, you can protect the information by using the `encryptedOutputs` syntax such as `::{"encryptedOutputs":{"encrypted":"my secret value"}}::`. In the following flow, the `encrypted` output is not shown in plain text in the Outputs UI.

```yaml
id: encrypted_output
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.shell.Script
    script: |
      echo '::{"outputs":{"plaintext":"plaintext_value"}}::'
      echo '::{"encryptedOutputs":{"encrypted":"my secret value"}}::'

  - id: print
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.hello['vars']['encrypted'] }}"
```

The `encrypted` output is displayed encoded:

![Encrypted Outputs](./encrypted-outputs.png)

---

# Plugin Defaults in Kestra – Set Task-Level Defaults

URL: https://kestra.io/docs/workflow-components/plugin-defaults

> Streamline Kestra flow configuration with Plugin Defaults. Set global or flow-level default values for task properties to reduce repetition and boilerplate.

Plugin defaults are default values applied to every task of a given type within one or more flows. They work like default function arguments, helping you avoid repetition when tasks or plugins frequently use the same values.
## Plugin Defaults on a flow-level You can define plugin defaults in the `pluginDefaults` section to avoid repeating properties across multiple tasks of the same type. For example: ```yaml id: api_python_sql namespace: company.team tasks: - id: api type: io.kestra.plugin.core.http.Request uri: https://dummyjson.com/products - id: hello type: io.kestra.plugin.scripts.python.Script script: | print("Hello World!") - id: python type: io.kestra.plugin.scripts.python.Script beforeCommands: - pip install polars outputFiles: - "products.csv" script: | import polars as pl data = {{outputs.api.body | jq('.products') | first}} df = pl.from_dicts(data) df.glimpse() df.select(["brand", "price"]).write_csv("products.csv") - id: sql_query type: io.kestra.plugin.jdbc.duckdb.Query inputFiles: in.csv: "{{ outputs.python.outputFiles['products.csv'] }}" sql: | SELECT brand, round(avg(price), 2) as avg_price FROM read_csv_auto('{{workingDir}}/in.csv', header=True) GROUP BY brand ORDER BY avg_price DESC; store: true pluginDefaults: - type: io.kestra.plugin.scripts.python.Script values: taskRunner: type: io.kestra.plugin.scripts.runner.docker.Docker pullPolicy: ALWAYS # set it to NEVER to use a local image containerImage: python:slim ``` In this example, Docker and Python configurations are defined once in `pluginDefaults`, instead of being repeated in every task. This approach helps to streamline the configuration process and reduce the chances of errors caused by inconsistent settings across different tasks. :::alert{type="info"} If you move required attributes into `pluginDefaults`, the UI code editor may show warnings about missing arguments, because defaults are only resolved at runtime. As long as `pluginDefaults` contains the relevant arguments, you can save the flow and ignore the warning displayed in the editor. 
![pluginDefaultsWarning](./warning.png)
:::

### `forced` attribute in `pluginDefaults`

Setting `forced: true` in `pluginDefaults` ensures that default values override any properties defined directly in the task. By default, the value of the `forced` attribute is `false`.

## Plugin defaults in a global configuration

Plugin defaults can also be defined globally in your Kestra configuration, applying the same values across all flows. Let's say that you want to centrally manage the default values for the `io.kestra.plugin.aws` plugin to reuse the same credentials and region across all your flows. You can add the following to your Kestra configuration:

```yaml
kestra:
  plugins:
    defaults:
      - type: io.kestra.plugin.aws
        values:
          accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
          secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
          region: "us-east-1"
```

If you want to set defaults only for a specific task, you can do that too:

```yaml
kestra:
  plugins:
    defaults:
      - type: io.kestra.plugin.aws.s3.Upload
        values:
          accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
          secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
          region: "us-east-1"
```

### Nested property values

For plugins with nested properties, define the values using the same nested YAML structure you would use in a flow. For example, to set resource limits for the Kubernetes task runner:

```yaml
kestra:
  plugins:
    defaults:
      - type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
        forced: true
        values:
          resources:
            limit:
              cpu: "1"
              memory: "128Mi"
```

This is equivalent to writing the same nested structure directly in a task. The `forced: true` attribute ensures these defaults override any values set at the task level.
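As a sketch of the `forced` attribute at the flow level (the flow and task below are illustrative), the forced default overrides the `containerImage` set directly on the task:

```yaml
id: forced_defaults
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.11 # overridden by the forced default below
    script: |
      print("Hello World!")

pluginDefaults:
  - type: io.kestra.plugin.scripts.python.Script
    forced: true
    values:
      containerImage: python:slim # wins over the task-level value
```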
## Plugin Defaults in the Enterprise Edition

:::alert{type="info"}
In the [Enterprise Edition](../../07.enterprise/index.mdx) or [Kestra Cloud](/cloud), plugin defaults can be configured directly in the UI under the **Plugin Defaults** tab of a Namespace.
:::

You can create them via the form or directly as YAML code for the Namespace:

![Plugin Default Form Creation](./plugin-default-creation.png)

Or click on **YAML** and, for example, paste the following:

```yaml
- type: io.kestra.plugin.aws.s3.Upload
  values:
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
    region: "us-east-1"
```

### Inherited Plugin Defaults

Plugin Defaults are inherited from a parent Namespace by its child Namespaces. In the example above, the image shows the Plugin Default was created in the `kestra.company` Namespace. Navigating to the **Plugin Defaults** tab of a child Namespace, for example `kestra.company.data`, shows the parent Namespace's Plugin Defaults. This avoids having to recreate Plugin Defaults across child Namespaces, while still allowing each child Namespace to maintain its own isolated defaults if needed.

![Plugin Default Inheritance](./inherited-plugin-defaults.png)
---

# Plugins in Kestra: Tasks, Triggers, Integrations

URL: https://kestra.io/docs/workflow-components/plugins

> Understand how Kestra plugins work, how to choose versions, and where to find or build the right integration.

Plugins power every task and trigger in Kestra. They wrap external systems, expose orchestration primitives, and let you extend the platform with custom code. Think of them as the "integrations" or "drivers" that let flows talk to databases, queues, SaaS APIs, file systems, and runtime environments.

## Plugin categories

Most flows mix several categories:

- **Tasks** perform work (HTTP, JDBC, Python, Spark, Script, etc.).
- **Triggers** start executions ([Schedule](../07.triggers/01.schedule-trigger/index.md), [Webhook](../07.triggers/03.webhook-trigger/index.md), [Kafka](../07.triggers/05.realtime-trigger/index.md), Pub/Sub).
- **Conditions** gate paths (`If`, `Switch`, expressions).

Browse all available plugins at [kestra.io/plugins](/plugins).

## Choosing versions (Enterprise)

Kestra can host multiple versions of the same plugin. You can:

- Pin a version on an individual task/trigger (`version: "0.21.0"`).
- Rely on the instance-wide `defaultVersion` (often `LATEST`) when you omit it.
- In Enterprise, install and pin versions centrally under **Instance → Versioned Plugins** (see [Versioned Plugins](../../07.enterprise/05.instance/versioned-plugins/index.md)).
Example of pinning a task to a specific version: ```yaml id: postgres_query namespace: company.team tasks: - id: fetch type: io.kestra.plugin.jdbc.postgresql.Query version: "1.0.0" url: jdbc:postgresql://127.0.0.1:56982/ username: "{{ secret('POSTGRES_USERNAME') }}" password: "{{ secret('POSTGRES_PASSWORD') }}" sql: select * from orders limit 1000 fetchType: STORE ``` ## Common configuration patterns Plugins often share the same properties; use them wisely to keep executions fast and safe: - **Result handling (`fetchType` / `storeType`)** chooses how outputs are returned: `FETCH_ONE`, `FETCH`, `NONE`, or `STORE`. `STORE` writes results to internal storage and returns a URI instead of inlining the payload. - **Pagination limits** (`fetchSize`, `limit`, `maxResults`) prevent oversized responses when you expect big result sets. - [**Secrets**](../../06.concepts/04.secret/index.md): keep connection strings, tokens, and usernames in secrets (`{{ secret('KEY') }}`) so they don’t leak into flow revisions or logs. ### Handling outputs: fetch vs. store A quick rule set to avoid bloated execution context: - Use fetch-style outputs (`fetch`, `fetchType`, `store=false`) only for small payloads you need inline for control flow (e.g., a few rows feeding `Switch` or `ForEach`). - For large datasets, switch to store-style (`store=true`, `storeType: STORE`): the data is written to internal storage, and only a URI is kept in the execution context, preventing repeated serialization on every task state change. - `value` and `uri` are mutually exclusive: `store=false` exposes `value`; `store=true` exposes `uri`. Accessing the wrong one raises an execution error. 
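The rule set above can be sketched with two hypothetical PostgreSQL queries (connection details are placeholders): the first inlines a small result for control flow, while the second stores a large result in internal storage and exposes only a URI:

```yaml
id: fetch_vs_store
namespace: company.team

tasks:
  # small result kept inline in the execution context for branching logic
  - id: small_lookup
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://127.0.0.1:5432/
    username: "{{ secret('POSTGRES_USERNAME') }}"
    password: "{{ secret('POSTGRES_PASSWORD') }}"
    sql: select status from orders limit 10
    fetchType: FETCH

  # large result written to internal storage; downstream tasks get a URI, not an inline value
  - id: large_export
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://127.0.0.1:5432/
    username: "{{ secret('POSTGRES_USERNAME') }}"
    password: "{{ secret('POSTGRES_PASSWORD') }}"
    sql: select * from orders
    fetchType: STORE
```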
| Setting | Use when you need | Stored in execution context | Pebble access | Good for |
|--------------------|-------------------|-----------------------------|--------------------------|-------------------|
| `fetchType: FETCH_ONE` | A single small record | The value itself | `{{ outputs.task.value }}` | Lookups, routing |
| `fetchType: FETCH` | A small list | The list values | `{{ outputs.task.value }}` | Branching logic |
| `fetchType: NONE` | No result needed | Nothing | n/a | Fire-and-forget |
| `storeType: STORE` or `store: true` | Large payloads/file-like results | Only a URI | `{{ outputs.task.uri }}` | Large exports, heavy queries |

:::alert{type="info"}
Handling large outputs? Prefer `STORE`/`storeType` and see [Managing output data volume](../../14.best-practices/0.flows/index.md#managing-output-data-volume).
:::

### Secrets in configuration properties

Some configuration properties such as "Database Password" are obvious secrets and should be protected, but consider using secrets for connection URLs, database names, user or service account names, and similar. Remember that any value used in flow code, even once, is stored in a [revision](../../06.concepts/03.revision/index.md). Check out the how-to guide for [Secrets in Open Source](../../15.how-to-guides/secrets/index.md), or [Secrets Manager](../../07.enterprise/02.governance/secrets-manager/index.md) in Enterprise Edition.

## Installing plugins

Installation paths vary by role:

- **UI (Enterprise)**: install/upgrade/pin versions under **Instance → Versioned Plugins**.
- **CLI/API**: automate installs; see [Selected Plugin Installation](../../15.how-to-guides/selected-plugin-installation/index.md).

## Building or requesting plugins

If you can’t find the integration you need, you can build or request it:

- Build: follow the [Plugin Developer Guide](../../plugin-developer-guide/index.md) to scaffold, test, and publish.
- Request: ask in the [Kestra Slack community](https://kestra.io/slack) or open an issue in the [Kestra repository](https://github.com/kestra-io/kestra/issues). --- # Task Retries in Kestra – Handle Transient Failures URL: https://kestra.io/docs/workflow-components/retries > Configure Retries in Kestra to handle transient failures. Learn about constant, exponential, and random retry strategies for tasks and flows. Retries handle transient failures in your workflows. They are defined at the task level and can be configured to retry a task a certain number of times or with a delay between attempts.
Retries let you automatically rerun failed tasks. Each retry creates a new task run attempt, based on the retry configuration defined in the flow.

### Example

This task makes up to 5 attempts, with a 15-minute interval between them:

```yaml
- id: retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  timeout: PT10M
  retry:
    type: constant
    maxAttempts: 5
    interval: PT15M
```

In this example, the flow retries 4 times every 0.25 seconds. It succeeds on the 5th attempt, using `{{ taskrun.attemptsCount }}` to track retries:

```yaml
id: retry
namespace: company.team

description: This flow retries 4 times and succeeds on the 5th attempt

tasks:
  - id: failed
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - 'if [ "{{taskrun.attemptsCount}}" -eq 4 ]; then exit 0; else exit 1; fi'
    retry:
      type: constant
      interval: PT0.25S
      maxAttempts: 5
      maxDuration: PT1M
      warningOnRetry: true

errors:
  - id: never_happen
    type: io.kestra.plugin.core.debug.Return
    format: Never happened {{task.id}}
```

### Timeout vs. Max Retry Duration

- **`timeout`**: Maximum duration for a single task attempt (initial or retry). If exceeded, the attempt fails.
- **`retry.maxDuration`**: Maximum total time allowed for the task, including all attempts and delays. Once exceeded, retries stop.

**Example**: With `timeout: 10m` and `maxDuration: 30m`:

- Each attempt can last up to 10 minutes.
- The overall retries stop after 30 minutes in total.

⚠️ Ensure `retry.interval` is smaller than `maxDuration`, or retries may not run.

### Retry options

| Name | Type | Description |
|------------------|------------|-------------|
| `type` | string | Retry strategy: `constant`, `exponential`, or `random`. |
| `maxAttempts` | integer | Maximum number of attempts (including the initial run) before stopping. |
| `maxDuration` | Duration | Maximum total time for the task, across all attempts.
| | `warningOnRetry` | Boolean | Marks execution as `WARNING` if retries occurred (default: false). | ### Duration format Durations use [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601#Durations) format (weeks, months, years not supported). Examples: | Value | Description | |----------|-------------| | PT0.25S | 250 ms | | PT2S | 2 seconds | | PT1M | 1 minute | | PT3.5H | 3 hours, 30 minutes | | P6DT4H | 6 days, 4 hours | ## Retry types ### `constant` Retries at fixed intervals. Example: with `interval: PT10M`, retries occur every 10 minutes. | Name | Type | Description | |------------|----------|-------------| | `interval` | Duration | Delay between attempts. | ### `exponential` Wait time increases after each retry (e.g., 30s, 1m, 2m, ...). | Name | Type | Description | |---------------|----------|-------------| | `interval` | Duration | Base interval between attempts. | | `maxInterval` | Duration | Maximum interval allowed. | | `delayFactor` | Double | Multiplier (default: 2). Example: interval 30s → 30s, 1m, 2m, 4m... | ### `random` Randomized delays within bounds. | Name | Type | Description | |---------------|----------|-------------| | `minInterval` | Duration | Minimum delay. | | `maxInterval` | Duration | Maximum delay. | ## Configuring retries globally You can configure retries globally for all tasks in Kestra: ```yaml kestra: plugins: configurations: - type: io.kestra values: retry: type: constant maxAttempts: 3 interval: PT30S ``` This applies a constant retry policy with up to 3 attempts every 30 seconds. ## Flow-level retries You can retry at the flow level, restarting either the entire execution or just failed tasks. Options: 1. `CREATE_NEW_EXECUTION`: Start a new execution. 2. `RETRY_FAILED_TASK`: Retry only the failed task. 
```yaml id: flow_level_retry namespace: company.team retry: maxAttempts: 3 behavior: CREATE_NEW_EXECUTION # or RETRY_FAILED_TASK type: constant interval: PT1S tasks: - id: fail_1 type: io.kestra.plugin.core.execution.Fail allowFailure: true - id: fail_2 type: io.kestra.plugin.core.execution.Fail ``` - With `CREATE_NEW_EXECUTION`, the **execution attempt** increases. - With `RETRY_FAILED_TASK`, only the task run attempt increases. :::alert{type="info"} Flow-level retries also restart Subflows as new executions. ::: ## Retry vs. Restart vs. Replay ### Automatic vs. manual - **Retry**: Automatic rerun of failed tasks within the same execution. - **Restart**: Manual rerun of failed tasks within the same execution. - **Replay**: Manual rerun from any point, creating a new execution. ![replay_restart.png](./replay_restart.png) ### Restart vs. Replay - **Restart**: Retries only failed tasks in the same execution. - **Replay**: Starts a new execution from a chosen task, with a new execution ID. Outputs of previous tasks are reused from cache if needed. Check out the [Replay documentation](../../06.concepts/10.replay/index.md). ![replay.png](./replay.png) Replays can start from successful or failed tasks but always create a new execution. Restarts keep the same execution ID. After a Replay, you can still track which Execution triggered this new run thanks to the `Original Execution` field: ![original_execution.png](./original_execution.png) ### Summary | Concept | Scope | Trigger | New execution? | |---------|------------------------|----------|----------------| | Retry | Task level | Automatic| No | | Restart | Flow level | Manual | No | | Replay | Flow or task level | Manual | Yes | --- # Workflow SLAs in Kestra – Assert Duration Targets URL: https://kestra.io/docs/workflow-components/sla > Enforce Service Level Agreements (SLAs) in Kestra. Monitor workflow duration and assertions, triggering alerts or actions when performance targets are missed. 
Assert that your workflows meet SLAs.
A Service Level Agreement (SLA) is a core property of a flow that defines a `behavior` to trigger if the flow runs too long or fails to meet the defined assertion. ## SLA types Currently, Kestra supports the following SLA types: 1. **MAX_DURATION** — the maximum allowed execution duration before the SLA is breached 2. **EXECUTION_ASSERTION** — an assertion defined by a Pebble expression that must be met during the execution. If the assertion doesn't hold true, the SLA is breached. ## How to use SLAs SLAs are defined using the `sla` property at the root of a flow, and they declare the desired state that must be met during executions of the flow. ### MAX_DURATION If a workflow execution exceeds the expected duration, an SLA can trigger corrective actions, such as cancelling the execution. The following SLA cancels an execution if it takes more than 8 hours: ```yaml id: sla_example namespace: company.team sla: - id: maxDuration type: MAX_DURATION duration: PT8H behavior: CANCEL labels: sla: miss reason: durationExceeded tasks: - id: punctual type: io.kestra.plugin.core.log.Log message: Workflow started, monitoring SLA compliance - id: sleepyhead type: io.kestra.plugin.core.flow.Sleep duration: PT9H - id: never_executed_task type: io.kestra.plugin.core.log.Log message: This task will never start because the SLA was breached ``` ### EXECUTION_ASSERTION An SLA can also be based on an assertion that must hold true during execution. If the assertion fails, the SLA is breached. The following SLA fails if the output of `mytask` is not equal to `expected output`: ```yaml id: sla_demo namespace: company.team sla: - id: assert_output type: EXECUTION_ASSERTION assert: "{{ outputs.mytask.value == 'expected output' }}" behavior: FAIL labels: sla: miss reason: outputMismatch tasks: - id: mytask type: io.kestra.plugin.core.debug.Return format: expected output ``` ## SLA behavior The `behavior` property of an SLA defines the action to take when the SLA is breached. 
The following behaviors are supported: 1. **CANCEL** — cancels the execution 2. **FAIL** — fails the execution 3. **NONE** — logs a message In addition, each breached SLA can set labels that can be used to filter executions or trigger follow-up actions. ## Alerts on SLA breaches For example, if you want to receive a Slack alert when an SLA is breached, you can use a Flow trigger to react to cancelled or failed executions labeled with `sla: miss`: ```yaml id: sla_miss_alert namespace: system tasks: - id: send_alert type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{secret('SLACK_WEBHOOK')}}" messageText: "SLA breached for flow `{{trigger.namespace}}.{{trigger.flowId}}` with ID `{{trigger.executionId}}`" triggers: - id: alert_on_failure type: io.kestra.plugin.core.trigger.Flow labels: sla: miss states: - FAILED - WARNING - CANCELLED ``` :::alert{type="info"} Best practice: Use labels with SLAs to track SLA breaches across environments, and pair them with alerting or monitoring flows for proactive response. ::: --- # Execution States in Kestra: Full Lifecycle Guide URL: https://kestra.io/docs/workflow-components/states > Understand the Kestra Execution Lifecycle. Reference guide to all execution and task run states, including Created, Running, Success, Failed, and more. States control the status of your workflow execution.
## Overview An execution is a single run of a flow in a specific state. Each state represents a point in the workflow where Kestra determines what happens next based on the control flow logic defined in the flow. You can read more about executions in the [workflow components documentation](../03.execution/index.md). ## Execution states Each Kestra execution can transition through several states during its lifecycle. The following diagram illustrates the possible states an execution can be in: ![execution_states](./execution_states.png) Here is a brief description of each state: 1. **CREATED**: The execution has been created but not yet started. This transient state means the execution is waiting to be processed. It usually transitions quickly to `RUNNING`, `CANCELLED`, or `QUEUED`. If you see executions stuck in this state, it may indicate a problem with the system. 2. **QUEUED**: The execution is waiting for a free slot to start running. This transient state is only used when the flow has [concurrency](../14.concurrency/index.md) limits, and all available slots are taken. 3. **RUNNING**: The execution is currently in progress. This transient state continues until all task runs are completed. 4. **SUCCESS**: The execution has completed successfully. This terminal state indicates that the execution has completed successfully, and all tasks have finished without errors (or were allowed to fail). 5. **WARNING**: This terminal state is used when the execution has completed successfully, but one or more tasks have emitted warnings. 6. **FAILED**: This state indicates that one or more tasks have failed and will not be retried. If there is an `errors` branch defined in the flow, the error `tasks` will be executed before permanently ending the execution, e.g., to send an alert about failure. Without additional orchestration, this state is usually considered terminal. 
However, when the flow has a [flow-level retry policy](../12.retries/index.md#flow-level-retries) set to the `RETRY_FAILED_TASK` behavior, the execution will transition to the `RETRYING` state. 7. **RETRYING**: This transient state indicates that the execution is currently [retrying](../12.retries/index.md) one or more failed task runs. After all retry attempts are exhausted, the execution will transition to the terminal `SUCCESS`, `WARNING`, or `FAILED` state. 8. **RETRIED**: This terminal state indicates that the execution has been retried according to the [flow-level retry policy](../12.retries/index.md#flow-level-retries) set to the `CREATE_NEW_EXECUTION` behavior. This means that the original execution (which failed and has been retried) is marked as `RETRIED`, and a new execution is created to run the flow again. 9. **PAUSED**: This transient state indicates that the execution is awaiting manual approval or has been paused for a fixed duration before continuing the execution. There are no `RESUMING` or `RESUMED` states. A paused execution transitions directly from `PAUSED` to `RUNNING` when resumed. 10. **RESTARTED**: This transient state is equivalent to the `CREATED` state but for a failed execution that has been restarted e.g., from the UI. These executions transition to `RUNNING` once the restart is processed. 11. **CANCELLED**: This terminal state indicates that the execution has been automatically cancelled by the system, usually because the `concurrency` limit was reached and the [concurrency](../14.concurrency/index.md) `behavior` was set to `CANCEL`, which cancels all executions that exceed the concurrency limit. 12. **KILLING**: This transient state indicates that the user has issued a command to kill the execution, e.g., via a task or by clicking on the `Kill` button in the UI. The system is terminating (killing) any task runs still in progress. As soon as all task runs are terminated, the execution will transition to the `KILLED` state. 13. 
**KILLED**: This terminal state indicates that the execution has been killed upon request by the user. No more tasks will be able to run, and the execution is considered terminated.

## What is the difference between the `CANCELLED` and `KILLED` states?

1. The `CANCELLED` state is used when the **system** automatically cancels an execution due to the `concurrency` limit being reached.
2. The `KILLING` state is used when the **user** manually kills an execution and the system is in the process of terminating the task runs associated with the execution.
3. The `KILLED` state is used when the execution has been killed upon request by the **user**.

## How are task run states different from execution states?

Task run states represent the status of a single task run within an execution.

![taskrun_states](./taskrun_states.png)

Each task run can be in one of the following states:

1. **CREATED**: The task run has been created but not yet started.
2. **SUBMITTED**: The task run has been submitted to a Worker but has not started running yet.
3. **RUNNING**: The task run is currently in progress.
4. **SUCCESS**: The task run has completed successfully.
5. **WARNING**: The task run has completed successfully but with warnings.
6. **FAILED**: The task run has failed.
7. **RETRYING**: The task run is currently being retried.
8. **RETRIED**: The task run has been retried.
9. **RESTARTED**: The task run is currently being restarted.
10. **KILLING**: The task run is in the process of being killed.
11. **KILLED**: The task run has been killed upon request by the user.

Note that there are no `QUEUED`, `CANCELLED`, or `PAUSED` states for task runs.

---

# Subflows in Kestra – Modularize and Reuse Flows

URL: https://kestra.io/docs/workflow-components/subflows

> Modularize your Kestra workflows with Subflows. Learn to call flows from other flows, pass inputs and outputs, and build reusable orchestration components.

Subflows let you build **modular** and **reusable** workflow components.
They work like function calls: executing a subflow creates a new flow run from within another flow.
## Why use a subflow?

Subflows allow you to build modular and reusable components that you can use across multiple flows. For example, you might define a subflow that handles error alerts by posting to Slack and email. With a subflow, you can reuse those two tasks together in every flow that needs to send error notifications, instead of copying the individual tasks into each flow.

:::alert{type="warning"}
Recursive flows are not supported. Kestra doesn’t allow a flow to call itself (directly or indirectly). Any cycle **(flowA→flowA)** makes the flow invalid. Recursive execution can create infinite loops and unbounded fan-out.

**Do instead:** Use **[ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach)** and **[branching flowable](../01.tasks/00.flowable-tasks/index.md)** tasks to iterate or split work without creating cycles (e.g., [LoopUntil](/plugins/core/flow/io.kestra.plugin.core.flow.loopuntil)).
:::

## How to declare a subflow

To call a flow from another flow, use the `io.kestra.plugin.core.flow.Subflow` task, and in that task, specify the `flowId` and `namespace` of the subflow that you want to execute. You can also specify custom `inputs`, similar to passing arguments to a function.

The optional properties `wait` and `transmitFailed` control the execution behavior. If `wait` is set to `false`, the parent flow continues without waiting for the subflow to finish; by default, it waits for completion. The `transmitFailed` property determines whether a failure in the subflow execution should cause the parent flow to fail.

:::alert{type="info"}
A Subflow task acts like a trigger to execute the child flow. While not managed like [Triggers](../07.triggers/index.mdx) in the UI, it is conceptually similar.
:::

## Practical example

A subflow can encapsulate critical business logic, making it reusable across flows and easier to test in isolation.
Here is a simple example of a subflow: ```yaml id: critical_service namespace: company.team tasks: - id: return_data type: io.kestra.plugin.jdbc.duckdb.Query sql: | INSTALL httpfs; LOAD httpfs; SELECT sum(total) as total, avg(quantity) as avg_quantity FROM read_csv_auto('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv', header=True); store: true outputs: - id: some_output type: STRING value: "{{ outputs.return_data.uri }}" ``` In this example, `return_data` outputs `uri` of the query output. That URI is a reference to the internal storage location of the stored file. This output can be used in the parent flow to perform further processing. ```yaml id: parent_service namespace: company.team tasks: - id: subflow_call type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: critical_service wait: true transmitFailed: true - id: log_subflow_output type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - cat "{{ outputs.subflow_call.outputs.some_output }}" ``` The `outputs` map task IDs to their results. Here, the parent flow accesses the `some_output` value from the `subflow_call` task. ## Subflow properties Below is a full list of all properties of the `io.kestra.plugin.core.flow.Subflow` task. You don’t need to memorize all properties — the task documentation always lists them. | Field | Description | |------------------------|-----------------------------------------------------------------------------| | `flowId` | The subflow's identifier | | `namespace` | The namespace where the subflow is located | | `inheritLabels` | Determines if the subflow inherits labels from the parent (default: false). | | `inputs` | Inputs passed to the subflow | | `labels` | Labels assigned to the subflow | | `outputs` (deprecated) | Allows passing outputs from the subflow execution to the parent flow. 
| | `revision` | The subflow revision to execute (defaults to the latest) | | `scheduleDate` | Schedule subflow execution on a specific date rather than immediately. | | `transmitFailed` | If true, parent flow fails on subflow failure (requires `wait` to be true). | | `wait` | If true, parent flow waits for subflow completion (default: true). | ## Passing data between parent and child flows Flows can emit outputs that can be accessed by the parent flow. Using the `io.kestra.plugin.core.flow.Subflow` task you can call any flow as a subflow and access its outputs in downstream tasks. For more details and examples, check the [Outputs page](../06.outputs/index.md#pass-data-between-flows-using-flow-outputs). ### Accessing Outputs from a subflow execution Outputs include the execution ID, extracted outputs, and the final state (if `wait` is true). Subflows improve maintainability of complex workflows. Use them to build modular, reusable components that can be shared across namespaces, projects, and teams. Here’s an example of a subflow with explicitly defined outputs. ```yaml id: flow_outputs namespace: company.team tasks: - id: mytask type: io.kestra.plugin.core.debug.Return format: this is a task output used as a final flow output outputs: - id: final type: STRING value: "{{ outputs.mytask.value }}" ``` We can access these outputs from a parent task as seen in the example below: ```yaml id: parent_flow namespace: company.team tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow flowId: flow_outputs namespace: company.team wait: true - id: log_subflow_output type: io.kestra.plugin.core.log.Log message: "{{ outputs.subflow.outputs.final }}" ``` For more details, see the [subflow outputs documentation](../../11.migration-guide/v0.15.0/subflow-outputs/index.md). ### Passing inputs to a subflow You can pass inputs to a Subflow task. The example below passes two inputs to a subflow.
Subflow: ```yaml id: subflow_example namespace: company.team inputs: - id: http_uri type: STRING tasks: - id: download type: io.kestra.plugin.core.http.Request uri: "{{ inputs.http_uri }}" - id: log type: io.kestra.plugin.core.log.Log message: "{{ outputs.download.body }}" outputs: - id: data type: STRING value: "{{ outputs.download.body }}" ``` Parent flow: ```yaml id: inputs_subflow namespace: company.team inputs: - id: url type: STRING tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow flowId: subflow_example namespace: company.team inputs: http_uri: "{{ inputs.url }}" wait: true - id: hello type: io.kestra.plugin.core.log.Log message: "{{ outputs.subflow.outputs.data }}" ``` In this example, the parent flow successfully passes an input to the subflow. #### Nested inputs In the example below, the flow extracts JSON data from a REST API and passes it to a subflow as a nested input: ```yaml id: extract_json namespace: company.team tasks: - id: api type: io.kestra.plugin.core.http.Request uri: https://dummyjson.com/users - id: read_json type: io.kestra.plugin.core.log.Log message: "{{ outputs.api.body }}" - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: subflow inputs: users.firstName: "{{ outputs.api.body | jq('.users') | first | first | jq('.firstName') | first }}" users.lastName: "{{ outputs.api.body | jq('.users') | first | first | jq('.lastName') | first }}" wait: true transmitFailed: true ``` To provide type validation to extracted JSON fields, you can use [nested inputs](../05.inputs/index.md#nested-inputs) in the subflow definition: ```yaml id: subflow namespace: company.team inputs: - id: users.firstName type: STRING defaults: Rick - id: users.lastName type: STRING defaults: Astley tasks: - id: process_user_data type: io.kestra.plugin.core.log.Log message: hello {{ inputs.users }} ``` You can then pass the entire `users` object, including nested fields, to any task in the subflow. 
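A single nested field can also be referenced by its dotted path in an expression. The sketch below (the flow and task ids are illustrative, and it assumes dot-notation access to nested input values) logs each field separately:

```yaml
id: nested_input_access
namespace: company.team

inputs:
  - id: users.firstName
    type: STRING
    defaults: Rick
  - id: users.lastName
    type: STRING
    defaults: Astley

tasks:
  # Reference one nested field at a time via its dotted path
  - id: greet
    type: io.kestra.plugin.core.log.Log
    message: "hello {{ inputs.users.firstName }} {{ inputs.users.lastName }}"
```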
--- # Task Cache in Kestra – Reuse Expensive Results URL: https://kestra.io/docs/workflow-components/task-cache > Optimize performance with Task Caching in Kestra. Cache outputs of expensive tasks to skip re-execution and speed up workflows when inputs haven't changed. Cache the status and outputs of computationally expensive operations. The `taskCache` property stores a task’s status and outputs in Kestra’s database. When the task runs again with identical inputs, Kestra skips it and reuses the cached outputs. You can enable caching on any task, but it is most effective for heavy operations such as large data extractions or long-running scripts. Using task caching can significantly speed up workflows and reduce resource consumption. :::alert{type="info"} Task caching is only supported for [Runnable Tasks](../01.tasks/01.runnable-tasks/index.md). ::: ## `taskCache` syntax The syntax of the `taskCache` property is as follows: ```yaml taskCache: enabled: true ttl: PT1H # Duration in ISO 8601 format, e.g., PT1H for 1 hour ``` The `ttl` (time-to-live) property defines how long cached outputs are kept before expiring. Use any ISO 8601 duration (e.g., `PT1H` for 1 hour, `PT24H` for 1 day, or `P7D` for 7 days). ## `taskCache` example In the example below, the flow caches the outputs of a computationally expensive task, extracting a large dataset from a production database. This flow downloads product data once per day, caches it for 24 hours, and reuses it in joins with frequently updated transaction data. ```yaml id: caching namespace: company.team tasks: - id: transactions type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/resolve/main/csv/cache_demo/transactions.csv - id: products type: io.kestra.plugin.core.http.Download uri: https://huggingface.co/datasets/kestra/datasets/resolve/main/csv/cache_demo/products.csv description: This task pulls the full product catalog once per day.
Because the catalog changes infrequently and contains over 200k rows, running it only once daily avoids unnecessary strain on a production DB, while ensuring downstream joins always use up-to-date reference data. taskCache: enabled: true ttl: PT24H - id: duckdb type: io.kestra.plugin.jdbc.duckdb.Query store: true inputFiles: products.csv: "{{ outputs.products.uri }}" transactions.csv: "{{ outputs.transactions.uri }}" sql: |- SELECT t.transaction_id, t.timestamp, t.quantity, t.sale_price, p.product_name, p.category, p.cost_price, p.supplier_id, (t.sale_price - p.cost_price) * t.quantity AS profit FROM read_csv_auto('transactions.csv') AS t JOIN read_csv_auto('products.csv') AS p USING (product_id); ``` This approach minimizes load on the production database while ensuring transactions are always processed against up-to-date product data. --- # Tasks in Kestra – Define Steps in a Flow URL: https://kestra.io/docs/workflow-components/tasks > Explore Tasks in Kestra, the building blocks of your flows. Differentiate between Runnable tasks for processing and Flowable tasks for orchestration logic. import ChildCard from "~/components/docs/ChildCard.astro" Tasks are the steps within a flow. They represent discrete actions, capable of processing inputs and variables and producing outputs for downstream consumption by end users and other tasks.
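As a minimal illustration of that shape (the flow id and message here are arbitrary), every task declares a unique `id`, a `type`, and properties specific to that type:

```yaml
id: hello_world
namespace: company.team

tasks:
  # `id` and `type` are required on every task;
  # `message` is a property specific to the Log task type
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from a task!
```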
## Flowable tasks Kestra orchestrates flows using [Flowable tasks](./00.flowable-tasks/index.md). These tasks do not perform heavy computation. Instead, they control orchestration behavior, enabling advanced workflow patterns. Example Flowable tasks include: - `io.kestra.plugin.core.flow.Parallel` - `io.kestra.plugin.core.flow.Switch` - `io.kestra.plugin.core.flow.ForEachItem` Read the full list on the [Flowable tasks page](./00.flowable-tasks/index.md). ## Runnable tasks Most data processing in Kestra is performed by [Runnable tasks](./01.runnable-tasks/index.md). Unlike Flowable tasks, Runnable tasks perform the actual work — such as file system operations, API calls, or database queries. These tasks can be compute-intensive and are executed by [workers](../../08.architecture/02.server-components/index.md#worker). Example runnable tasks include: - `io.kestra.plugin.scripts.python.Commands` - `io.kestra.plugin.core.http.Request` - `io.kestra.plugin.slack.notifications.SlackExecution` ## Core task properties All tasks share the following core properties: | Property | Description | | -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `id` | A unique identifier of the task | | `type` | A full Java class name that represents the type of the task | | `description` | Your custom [documentation](../../../05.workflow-components/15.descriptions/index.md) of what the task does | | `retry` | How often should the task be retried in case of a failure, and the [type of retry 
strategy](../../../05.workflow-components/12.retries/index.md) | | `timeout` | The [maximum time allowed](../../../05.workflow-components/13.timeout/index.md) for the task to complete expressed in [ISO 8601 Durations](https://en.wikipedia.org/wiki/ISO_8601#Durations) | | `runIf` | Skip a task if the provided condition evaluates to false | | `disabled` | A boolean flag indicating whether the task is [disabled or not](../../../05.workflow-components/16.disabled/index.md); if set to `true`, the task will be skipped during the execution | | `workerGroup` | The [group of workers](../../07.enterprise/04.scalability/worker-group/index.md) (EE-only) that are eligible to execute the task; you can specify a `workerGroup.key` and a `workerGroup.fallback` (the default is `WAIT`) | | `allowFailure` | A boolean flag that allows the execution to continue even if this task fails | | `allowWarning` | A boolean flag that allows marking a task run as Success despite warnings | | `logLevel` | Defines the log level persisted to the backend database. By default, all logs are stored. For example, restricting to `INFO` prevents `DEBUG` and `TRACE` logs from being saved. | | `logToFile` | A boolean that lets you store logs as a file in [internal storage](../../08.architecture/data-components/index.md#kestra-internal-storage). That file can be previewed and downloaded from the Logs and Gantt Execution tabs. When set to `true`, logs aren't saved in the database, which is useful for tasks that produce a large amount of logs that would otherwise take up too much space. The same property can be set on [triggers](../../05.workflow-components/07.triggers/index.mdx). | ## Dynamic vs. static task properties Task properties can be static or dynamic. Dynamic properties can be set using expressions.
To determine whether a property is static or dynamic (properties marked with a snowflake icon are non-dynamic), check the task’s documentation on the [plugin's homepage](/plugins) or in the UI by clicking on the documentation tab for the task. ![dynamic_properties](./dynamic-properties.png) Some properties are marked as **non-dynamic** because they are complex types (e.g., maps, lists of strings, lists of maps). These act as **placeholders** for other dynamic properties. For example, the [runTasks](/plugins/plugin-databricks/databricks-job/io.kestra.plugin.databricks.job.submitrun#runtasks) property of Databricks' `SubmitRun` is not dynamic because it is an array of [RunSubmitTaskSetting](/plugins/plugin-databricks/databricks-job/io.kestra.plugin.databricks.job.submitrun#runsubmittasksetting). Each `RunSubmitTaskSetting` contains its own properties, many of which are dynamic or placeholders for more complex types. Always drill down to the lowest level — most low-level properties are dynamic and can be templated using expressions. --- # Flowable Tasks in Kestra: Control Flow Logic URL: https://kestra.io/docs/workflow-components/tasks/flowable-tasks > Deep dive into Kestra Flowable Tasks. Learn to control execution flow with sequential, parallel, switch, if/else, loops, and error handling constructs. Control your orchestration logic. ## Control orchestration with flowable tasks Flowable tasks control orchestration logic — running tasks or subflows in parallel, creating loops, and handling conditional branching. They do not run heavy operations; those are handled by workers. Flowable tasks use [expressions](../../../expressions/index.mdx) from the execution context to determine which tasks run next. For example, you can use the outputs of a previous task in a `Switch` task to decide which task to run next. ### Sequential This task runs tasks sequentially and is typically used to group them. 
```yaml id: sequential namespace: company.team tasks: - id: sequential type: io.kestra.plugin.core.flow.Sequential tasks: - id: 1st type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" - id: 2nd type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.id }}" - id: last type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" ``` :::alert{type="info"} You can access the output of a sibling task using the syntax `{{ outputs.sibling.value }}`. ::: For more details on capabilities, check out the [Sequential Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.sequential). ### Parallel This task runs tasks in parallel, making it convenient to process many tasks simultaneously. ```yaml id: parallel namespace: company.team tasks: - id: parallel type: io.kestra.plugin.core.flow.Parallel tasks: - id: 1st type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" - id: 2nd type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.id }}" - id: last type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" ``` :::alert{type="warning"} You cannot access the output of a sibling task as tasks will be run in parallel. ::: For more task details, refer to the [Parallel Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.parallel). ### Switch This task conditionally runs tasks based on the value of a contextual variable. In the following example, an input is used to decide which task to run next. 
```yaml id: switch namespace: company.team inputs: - id: param type: BOOLEAN tasks: - id: decision type: io.kestra.plugin.core.flow.Switch value: "{{ inputs.param }}" cases: true: - id: is_true type: io.kestra.plugin.core.log.Log message: "This is true" false: - id: is_false type: io.kestra.plugin.core.log.Log message: "This is false" ``` For more plugin details, refer to the [Switch Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.switch). ### If This task runs a set of tasks only when a condition is met. The condition must evaluate to a boolean. Values such as `0`, `-0`, `null`, and `''` evaluate to `false`; all other values evaluate to `true`. The `else` branch is optional. In the following example, an input is used to decide which task to run next. ```yaml id: if_condition namespace: company.team inputs: - id: param type: BOOLEAN tasks: - id: if type: io.kestra.plugin.core.flow.If condition: "{{ inputs.param }}" then: - id: when_true type: io.kestra.plugin.core.log.Log message: "This is true" else: - id: when_false type: io.kestra.plugin.core.log.Log message: "This is false" ``` For more details, check out the [If Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.if). ### ForEach This task executes a group of tasks for each value in the list. In the following example, the variable is static, but it could also be generated from a previous task output, starting any number of subtasks.
```yaml id: foreach_example namespace: company.team tasks: - id: for_each type: io.kestra.plugin.core.flow.ForEach values: ["value 1", "value 2", "value 3"] tasks: - id: before_if type: io.kestra.plugin.core.debug.Return format: "Before if {{ taskrun.value }}" - id: if type: io.kestra.plugin.core.flow.If condition: '{{ taskrun.value == "value 2" }}' then: - id: after_if type: io.kestra.plugin.core.debug.Return format: "After if {{ parent.taskrun.value }}" ``` In this execution, you can access: - The loop's iteration index (starting at 0) using the syntax `{{ taskrun.iteration }}` - The output of a sibling task using the syntax `{{ outputs.sibling[taskrun.value].value }}` This example shows how to run tasks in parallel for each value in the list. All child tasks of the parallel task run in parallel. However, due to the `concurrencyLimit` property set to 2, only two parallel task groups run at any given time. ```yaml id: parallel_tasks_example namespace: company.team tasks: - id: for_each type: io.kestra.plugin.core.flow.ForEach values: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] concurrencyLimit: 2 tasks: - id: parallel type: io.kestra.plugin.core.flow.Parallel tasks: - id: log type: io.kestra.plugin.core.log.Log message: Processing {{ parent.taskrun.value }} - id: shell type: io.kestra.plugin.scripts.shell.Commands commands: - sleep {{ parent.taskrun.value }} ``` For more information on handling outputs generated from `ForEach`, check out the [dedicated loop how-to guide](../../../15.how-to-guides/loop/index.md) and the [Best Practices for ForEach and ForEachItem](../../../14.best-practices/11.foreach-and-foreachitem/index.md) guide, including how to access [sibling task outputs correctly](../../../14.best-practices/11.foreach-and-foreachitem/index.md#example-use-sibling-outputs-correctly-inside-foreach) inside the loop. For processing items, or forwarding processing to a subflow, [ForEachItem](#foreachitem) is better suited.
:::alert{type="info"} For more details, refer to the [ForEach Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreach). ::: ### ForEachItem This task iterates over a list of items and runs a subflow for each item, or for each batch of items. ```yaml - id: each type: io.kestra.plugin.core.flow.ForEachItem items: "{{ inputs.file }}" # could be also an output variable {{ outputs.extract.uri }} inputs: file: "{{ taskrun.items }}" # items of the batch batch: rows: 4 namespace: company.team flowId: subflow revision: 1 # optional (default: latest) wait: true # wait for the subflow execution transmitFailed: true # fail the task run if the subflow execution fails labels: # optional labels to pass to the subflow to be executed key: value ``` This executes the subflow `company.team.subflow` for each batch of items. To pass the batch of items to a subflow, you can use inputs. The example above uses an input of `FILE` type called `file` that takes the URI of an internal storage file containing the batch of items. The next example shows you how to access the outputs from each subflow executed. The ForEachItem automatically merges the URIs of the outputs from each subflow into a single file. The URI of this file is available through the `subflowOutputs` output. 
```yaml id: for_each_item namespace: company.team tasks: - id: generate type: io.kestra.plugin.scripts.shell.Script script: | for i in $(seq 1 10); do echo "$i" >> data; done outputFiles: - data - id: for_each_item type: io.kestra.plugin.core.flow.ForEachItem items: "{{ outputs.generate.outputFiles.data }}" batch: rows: 4 wait: true flowId: my_subflow namespace: company.team inputs: value: "{{ taskrun.items }}" - id: for_each_outputs type: io.kestra.plugin.core.log.Log message: "{{ outputs.for_each_item_merge.subflowOutputs }}" # Log the URI of the file containing the URIs of the outputs from each subflow ``` :::alert{type="info"} For more details, refer to the [ForEachItem Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreachitem). ::: #### `ForEach` vs `ForEachItem` Both `ForEach` and `ForEachItem` are similar, but there are specific use cases that suit one over the other: - `ForEach` generates a lot of [Task Runs](../02.taskruns/index.md) which can impact performance. - `ForEachItem` generates separate executions using [Subflows](../../10.subflows/index.md) for the group of tasks. This scales better for larger datasets. Read more about performance optimization in our [best practices guides](../../../14.best-practices/0.flows/index.md#tasks-in-the-same-execution).
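The `my_subflow` flow called by the `ForEachItem` examples above is not shown on this page. A minimal sketch of what it could look like, assuming it receives each batch through a `FILE` input named `value` (the task and output ids here are illustrative):

```yaml
id: my_subflow
namespace: company.team

inputs:
  # URI of the internal storage file containing one batch of items
  - id: value
    type: FILE

tasks:
  - id: process_batch
    type: io.kestra.plugin.core.log.Log
    message: "Processing batch file {{ inputs.value }}"

outputs:
  # Flow outputs like this one are what ForEachItem merges into `subflowOutputs`
  - id: processed
    type: STRING
    value: "{{ inputs.value }}"
```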
### LoopUntil `LoopUntil` runs a group of tasks repeatedly until a boolean condition evaluates to `true`. After each iteration, the task evaluates the `condition` expression; if it evaluates to `false`, the block is executed again after the configured interval. Typical use cases include polling an external API, waiting for a long-running job to transition to a terminal state, or checking for the presence of downstream resources. Key properties: - `condition` — expression evaluated after each iteration; has access to the child task outputs from the most recent run (e.g. `{{ outputs.checkStatus.code }}`). - `tasks` — the list of child tasks to run before re-evaluating the condition. - `checkFrequency` — optional guardrails that define `interval`, `maxIterations`, and/or `maxDuration` between repeats. (See the [LoopUntil migration note](../../../11.migration-guide/v0.23.0/loop-until-defaults/index.md) for default values.) Example: poll an API until it returns HTTP 200, checking every 30 seconds and stopping after 50 attempts if it never succeeds. ```yaml id: loop_until namespace: company.team tasks: - id: loop type: io.kestra.plugin.core.flow.LoopUntil condition: "{{ outputs.ping.code == 200 }}" checkFrequency: interval: PT30S maxIterations: 50 tasks: - id: ping type: io.kestra.plugin.core.http.Request method: GET uri: https://kestra.io/api/mock ``` For more details, refer to the [LoopUntil Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.loopuntil). ### AllowFailure This task allows child tasks to fail. If any child task fails: - The `AllowFailure` task is marked with status `WARNING`. - All child tasks inside `AllowFailure` stop immediately. - The execution continues for all other tasks. - At the end, the execution as a whole is marked as status `WARNING`. In the following example: - `allow_failure` will be labelled as `WARNING`. - `ko` will be labelled as `FAILED`. - `next` will not be run. - `end` will be run and labelled `SUCCESS`. 
```yaml id: each namespace: company.team tasks: - id: allow_failure type: io.kestra.plugin.core.flow.AllowFailure tasks: - id: ko type: io.kestra.plugin.core.execution.Fail - id: next type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" - id: end type: io.kestra.plugin.core.debug.Return format: "{{ task.id }} > {{ taskrun.startDate }}" ``` :::alert{type="info"} For more details, refer to the [AllowFailure Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.allowfailure). ::: ### Fail This task fails the flow; it can be used with or without conditions. Without conditions, it can be used, for example, to fail on some switch value. ```yaml id: fail_on_switch namespace: company.team inputs: - id: param type: STRING required: true tasks: - id: switch type: io.kestra.plugin.core.flow.Switch value: "{{ inputs.param }}" cases: case1: - id: case1 type: io.kestra.plugin.core.log.Log message: Case 1 case2: - id: case2 type: io.kestra.plugin.core.log.Log message: Case 2 notexist: - id: fail type: io.kestra.plugin.core.execution.Fail default: - id: default type: io.kestra.plugin.core.log.Log message: default ``` With conditions, it can be used, for example, to validate inputs. ```yaml id: fail_on_condition namespace: company.team inputs: - id: param type: STRING required: true tasks: - id: before type: io.kestra.plugin.core.log.Log message: "I'm before the fail on condition" - id: fail type: io.kestra.plugin.core.execution.Fail condition: "{{ inputs.param == 'fail' }}" - id: after type: io.kestra.plugin.core.log.Log message: "I'm after the fail on condition" ``` For more information, refer to the [Fail Task documentation](/plugins/core/execution/io.kestra.plugin.core.execution.fail). ### Subflow This task triggers another flow. This enables you to decouple the first flow from the second and monitor each flow individually. You can pass flow outputs as inputs to the triggered subflow (those must be declared in the subflow). 
```yaml id: subflow_example namespace: company.team inputs: - id: my_file type: FILE tasks: - id: subflow type: io.kestra.plugin.core.flow.Subflow namespace: company.team flowId: my_subflow inputs: file: "{{ inputs.my_file }}" ``` For more details, refer to the [Subflow Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.subflow). ### WorkingDirectory By default, Kestra launches each task in a new working directory, possibly on different workers if multiple ones exist. The example below runs all tasks nested under the `WorkingDirectory` task sequentially in the same directory, allowing downstream tasks to reuse output files from previous ones. To share a working directory, all tasks nested under the `WorkingDirectory` task are launched on the same worker. This task can be particularly useful for compute-intensive file system operations. ```yaml id: working_dir_flow namespace: company.team tasks: - id: working_dir type: io.kestra.plugin.core.flow.WorkingDirectory tasks: - id: first type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - 'echo "{{ taskrun.id }}" > {{ workingDir }}/stay.txt' - id: second type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - | echo '::{"outputs": {"stay":"'$(cat {{ workingDir }}/stay.txt)'"}}::' ``` This task can also cache files inside the working directory, for example, to cache script dependencies like the `node_modules` of a node `Script` task.
```yaml id: node_with_cache namespace: company.team tasks: - id: working_dir type: io.kestra.plugin.core.flow.WorkingDirectory cache: patterns: - node_modules/** ttl: PT1H tasks: - id: script type: io.kestra.plugin.scripts.node.Script beforeCommands: - npm install colors script: | const colors = require("colors"); console.log(colors.red("Hello")); ``` This task can also fetch files from [namespace files](../../../06.concepts/02.namespace-files/index.md) and make them available to all child tasks. ```yaml id: shell_with_namespace_files namespace: company.team tasks: - id: working_dir type: io.kestra.plugin.core.flow.WorkingDirectory namespaceFiles: enabled: true include: - dir1/*.* exclude: - dir2/*.* tasks: - id: shell type: io.kestra.plugin.scripts.shell.Commands commands: - cat dir1/file1.txt ``` :::alert{type="info"} [WorkingDirectory Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.workingdirectory) ::: ### Pause Kestra flows run until all tasks complete, but sometimes you need to: - Add a manual validation before continuing the execution - Wait for some duration before continuing the execution For this, you can use the Pause task. In the following example, the `validation` task pauses until it is manually resumed, while the `wait` task pauses for 5 minutes. ```yaml id: pause namespace: company.team tasks: - id: validation type: io.kestra.plugin.core.flow.Pause tasks: - id: ok type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - 'echo "started after manual validation"' - id: wait type: io.kestra.plugin.core.flow.Pause delay: PT5M tasks: - id: waited type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - 'echo "start after 5 minutes"' ``` :::alert{type="info"} A Pause task without delay waits indefinitely until the task state is changed to **Running**.
For this: go to the **Gantt** tab of the **Execution** page, click on the task, select **Change status** on the contextual menu, and select **Mark as RUNNING** on the form. This makes the task run until its end. For more details, refer to the [Pause Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.pause). ::: ### DAG This task allows defining dependencies between tasks by creating a directed acyclic graph (DAG). Instead of an explicit DAG structure, this task defines dependencies for each task using the `dependsOn` property. This way, you can set dependencies more implicitly for each task, and Kestra figures out the overall flow structure. ```yaml id: dag namespace: company.team tasks: - id: dag description: "my task" type: io.kestra.plugin.core.flow.Dag tasks: - task: id: task1 type: io.kestra.plugin.core.log.Log message: I'm the task 1 - task: id: task2 type: io.kestra.plugin.core.log.Log message: I'm the task 2 dependsOn: - task1 - task: id: task3 type: io.kestra.plugin.core.log.Log message: I'm the task 3 dependsOn: - task1 - task: id: task4 type: io.kestra.plugin.core.log.Log message: I'm the task 4 dependsOn: - task2 - task: id: task5 type: io.kestra.plugin.core.log.Log message: I'm the task 5 dependsOn: - task4 - task3 ``` For more details, refer to the [Dag Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.dag). ### Template (deprecated) Templates are lists of tasks that can be shared between flows. You can define a template and call it from other flows, allowing them to share a list of tasks and keep these tasks updated without changing your flow. The following example uses the Template task to use a template. ```yaml id: template namespace: company.team tasks: - id: template type: io.kestra.plugin.core.flow.Template namespace: company.team templateId: template ``` --- # Runnable Tasks in Kestra – Execute Workloads URL: https://kestra.io/docs/workflow-components/tasks/runnable-tasks > Learn about Runnable Tasks in Kestra. 
Execute compute-intensive workloads like scripts, API calls, and database queries using distributed workers. Data processing tasks handled by the workers. ## Execute work with runnable tasks Runnable tasks handle data processing, such as file system operations, API calls, and database queries. They can be compute-intensive and are executed by workers. Each task requires an identifier (`id`) and a type, defined by its Java Fully Qualified Class Name (FQCN). Tasks include properties specific to their type. Refer to each task’s documentation for a full list of available properties. Most tasks are runnable, except for [Flowable tasks](../00.flowable-tasks/index.md), which control orchestration logic. By default, Kestra includes only a few runnable tasks. Many more are available as [plugins](/plugins), and the default Docker image comes preloaded with several of them. ## Example The following example shows two runnable tasks: one that makes an HTTP request and another that logs its output. ```yaml id: runnable_http namespace: company.team tasks: - id: make_request type: io.kestra.plugin.core.http.Request uri: https://kestra.io/api/mock method: GET contentType: application/json - id: print_status type: io.kestra.plugin.core.log.Log message: "{{ outputs.make_request.body }}" ``` --- # Task Runs in Kestra – Track Task Execution URL: https://kestra.io/docs/workflow-components/tasks/taskruns > Understand Task Runs in Kestra. Track the execution of individual tasks, monitor their states, attempts, and outputs within your workflow executions. A task run is a single execution of an individual task within an [Execution](../../03.execution/index.md), where an execution represents a run of the entire flow. One execution can therefore contain multiple task runs. ## Understand task runs Each task run includes associated data such as: - Execution ID - State - Start Date - End Date ## Attempts A task run can include one or more attempts. 
Most have only a single attempt, but you can configure [retries](../../12.retries/index.md) if needed. When retries are enabled, a task failure triggers new attempts until the `maxAttempts` or `maxDuration` threshold is reached. ## States Similar to executions, task runs can exist in different states. | State | Description | | - |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `CREATED` | The task run is waiting to be processed, usually queued and not yet started. | | `SUBMITTED` | The task run has been submitted to a Worker but has not started running yet. | | `RUNNING` | The execution or task run is currently being processed. | | `SUCCESS` | The execution or task run has been completed successfully. | | `WARNING` | The task run had issues but continued, flagged with a warning. | | `FAILED` | The task run encountered errors that caused the execution to fail. | | `RETRYING` | The execution or task run is currently being [retried](../../12.retries/index.md). | | `RETRIED` | An execution or task run exhibited unintended behavior, stopped, and created a new execution as defined by its [flow-level retry policy](../../12.retries/index.md#flow-level-retries). The policy was set to the `CREATE_NEW_EXECUTION` behavior. | | `KILLING` | A kill command was issued and the system is terminating the task run. | | `KILLED` | An execution or task run was killed (upon request), and no more tasks will run. | :::alert{type="info"} For a detailed overview of how each task run transitions through different states, see the [States](../../17.states/index.md#how-are-task-run-states-different-from-execution-states) page. ::: ## Expression You can access information about the current task run using the `{{ taskrun }}` expression.
The following example outputs task run details using `{{ taskrun }}`: ```yaml id: taskrun namespace: company.team tasks: - id: return type: io.kestra.plugin.core.debug.Return format: "{{ taskrun }}" ``` The logs show the following: ```json { "id": "61TxwXQjkXfwTd4ANK6fhv", "startDate": "2024-11-13T14:38:38.355668Z", "attemptsCount": 0 } ``` ## Task run values Some [Flowable tasks](../00.flowable-tasks/index.md), such as [ForEach](../00.flowable-tasks/index.md) and [ForEachItem](../00.flowable-tasks/index.md#foreachitem), group tasks together. You can use `{{ taskrun.value }}` to access the value of a specific task run. In the example below, `foreach` iterates twice over the values `[1, 2]`: ```yaml id: loop namespace: company.team tasks: - id: foreach type: io.kestra.plugin.core.flow.ForEach values: [1, 2] tasks: - id: log type: io.kestra.plugin.core.log.Log message: - "{{ taskrun }}" - "{{ taskrun.value }}" - "{{ taskrun.id }}" - "{{ taskrun.startDate }}" - "{{ taskrun.attemptsCount }}" - "{{ taskrun.parentId }}" - "{{ taskrun.iteration }}" ``` This produces two separate log entries, one with `1` and the other with `2`. ### Parent task run values You can also use the `{{ parent.taskrun.value }}` expression to access a task run value from a parent task within nested flowable child tasks: ```yaml id: loop namespace: company.team tasks: - id: foreach type: io.kestra.plugin.core.flow.ForEach values: [1, 2] tasks: - id: log type: io.kestra.plugin.core.log.Log message: "{{ taskrun.value }}" - id: if type: io.kestra.plugin.core.flow.If condition: "{{ true }}" then: - id: log_parent type: io.kestra.plugin.core.log.Log message: "{{ parent.taskrun.value }}" ``` This iterates through the `log` and `if` tasks twice as there are two items in `values` property. The `log_parent` task logs the parent task run value as `1` and then `2`. ### Parent vs. 
parents in nested Flowable tasks With nested [Flowable tasks](../00.flowable-tasks/index.md), only the immediate parent is available through `taskrun.value`. To access a parent task higher up the tree, you can use the `parent` and the `parents` expressions. The following flow shows a more complex example with nested flowable parent tasks: ```yaml id: each_switch namespace: company.team tasks: - id: simple type: io.kestra.plugin.core.log.Log message: - "{{ task.id }}" - "{{ taskrun.startDate }}" - id: hierarchy_1 type: io.kestra.plugin.core.flow.ForEach values: ["caseA", "caseB"] tasks: - id: hierarchy_2 type: io.kestra.plugin.core.flow.Switch value: "{{ taskrun.value }}" cases: caseA: - id: hierarchy_2_a type: io.kestra.plugin.core.debug.Return format: "{{ task.id }}" caseB: - id: hierarchy_2_b_first type: io.kestra.plugin.core.debug.Return format: "{{ task.id }}" - id: hierarchy_2_b_second type: io.kestra.plugin.core.flow.ForEach values: ["case1", "case2"] tasks: - id: switch type: io.kestra.plugin.core.flow.Switch value: "{{ taskrun.value }}" cases: case1: - id: switch_1 type: io.kestra.plugin.core.log.Log message: - "{{ parents[0].taskrun.value }}" - "{{ parents[1].taskrun.value }}" case2: - id: switch_2 type: io.kestra.plugin.core.log.Log message: - "{{ parents[0].taskrun.value }}" - "{{ parents[1].taskrun.value }}" - id: simple_again type: io.kestra.plugin.core.log.Log message: - "{{ task.id }}" - "{{ taskrun.startDate }}" ``` The `parent` variable gives direct access to the first parent, while the `parents[INDEX]` gives you access to the parent higher up the tree. 
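To make the indexing concrete, below is a minimal hypothetical sketch with two directly nested `ForEach` loops (ids and values are placeholders). The comments describe the intended resolution per the expressions above; verify the exact behavior of `parents[INDEX]` against your Kestra version.

```yaml
id: nested_parents
namespace: company.team

tasks:
  - id: outer
    type: io.kestra.plugin.core.flow.ForEach
    values: ["a", "b"]
    tasks:
      - id: inner
        type: io.kestra.plugin.core.flow.ForEach
        values: [1, 2]
        tasks:
          - id: log
            type: io.kestra.plugin.core.log.Log
            message:
              - "{{ taskrun.value }}"            # value of the innermost loop (1 or 2)
              - "{{ parent.taskrun.value }}"     # value from the immediate parent loop
              - "{{ parents[0].taskrun.value }}" # parents[INDEX] walks further up the tree
```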
:::collapse{title="Task Run JSON Object Example"} ```json { "id": "5cBZ1JF8kim8fbFg13bumX", "executionId": "6s1egIkxu3gpzzILDnyxTn", "namespace": "io.kestra.tests", "flowId": "each-sequential-nested", "taskId": "1-1_return", "parentTaskRunId": "5ABxhOwhpd2X8DtwUPKERJ", "value": "s1", "attempts": [ { "metrics": [ { "name": "length", "tags": { "format": "{{task.id}} > {{taskrun.value}} ⬅ {{taskrun.startDate}}" }, "value": 45.0, "type": "counter" }, { "name": "duration", "tags": { "format": "{{task.id}} > {{taskrun.value}} ⬅ {{taskrun.startDate}}" }, "type": "timer", "value": "PT0.007213673S" } ], "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2025-05-04T12:02:54.121836Z" }, { "state": "RUNNING", "date": "2025-05-04T12:02:54.121841Z" }, { "state": "SUCCESS", "date": "2025-05-04T12:02:54.131892Z" } ], "duration": "PT0.010056S", "endDate": "2025-05-04T12:02:54.131892Z", "startDate": "2025-05-04T12:02:54.121836Z" } } ], "outputs": { "value": "1-1_return > s1 ⬅ 2025-05-04T12:02:53.938333Z" }, "state": { "current": "SUCCESS", "histories": [ { "state": "CREATED", "date": "2025-05-04T12:02:53.938333Z" }, { "state": "RUNNING", "date": "2025-05-04T12:02:54.116336Z" }, { "state": "SUCCESS", "date": "2025-05-04T12:02:54.144135Z" } ], "duration": "PT0.205802S", "endDate": "2025-05-04T12:02:54.144135Z", "startDate": "2025-05-04T12:02:53.938333Z" } } ``` ::: --- # Task Timeouts in Kestra – Limit Run Duration URL: https://kestra.io/docs/workflow-components/timeout > Control task duration with Timeouts in Kestra. Prevent hanging processes and manage costs by setting maximum execution times for your tasks. A timeout defines the maximum duration a [runnable task](../01.tasks/01.runnable-tasks/index.md) is allowed to run.
## What is a timeout

If a task run exceeds the specified duration, Kestra automatically stops it and marks it as failed. This is useful for tasks that may hang and run indefinitely.

Timeouts are often used as a cost-control mechanism in cloud-based workflows. Imagine a Snowflake query or an AWS Batch job that runs for hours, leading to unexpected costs. By setting a timeout, you can ensure that the task run will not exceed a certain duration.

## Format

Similar to [retries](../../05.workflow-components/12.retries/index.md), timeouts use the [ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations) format, but week, month, and year designators are not supported. Below are some examples:

| Duration | Description        |
|----------|--------------------|
| PT0.250S | 250 milliseconds   |
| PT2S     | 2 seconds          |
| PT1M     | 1 minute           |
| PT3.5H   | 3 and a half hours |
| P6DT4H   | 6 days and 4 hours |

## Example

In this example, the `costly_query` task sleeps for 10 seconds, but the timeout is set to 5 seconds, causing the task to fail.

```yaml
id: timeout
namespace: company.team

description: This flow will always fail because of a timeout.

tasks:
  - id: costly_query
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - sleep 10
    timeout: PT5S
```

## Flow-level timeout

There is no flow-level timeout. To cancel a workflow execution that exceeds a specific duration, use a `MAX_DURATION`-type [SLA](../18.sla/index.md).

---

# Triggers in Kestra: Schedule, Events, Webhooks
URL: https://kestra.io/docs/workflow-components/triggers
> Automate flow execution with Kestra Triggers. Explore scheduled, event-based, and webhook triggers to start workflows based on time or external events.

import ChildCard from "~/components/docs/ChildCard.astro"

A trigger is a mechanism that automatically starts the execution of a flow.
Triggers can be either scheduled or event-based, giving you flexibility in how you automate workflow execution.

## Trigger types

Kestra supports five core trigger types, covering both **scheduled** and **external** events:

- [Schedule trigger](./01.schedule-trigger/index.md) allows you to execute your flow on a regular cadence, e.g. using a CRON expression and custom scheduling conditions.
- [Flow trigger](./02.flow-trigger/index.md) allows you to execute your flow when another flow finishes its execution (based on a configurable list of states).
- [Webhook trigger](./03.webhook-trigger/index.md) allows you to execute your flow based on an HTTP request emitted by a webhook.
- [Polling trigger](./04.polling-trigger/index.md) allows you to execute your flow by polling external systems for the presence of data.
- [Realtime trigger](./05.realtime-trigger/index.md) allows you to execute your flow when events happen, with millisecond latency.

Many other triggers are available from the plugins, such as triggers based on file detection events, e.g. the [S3 trigger](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.trigger), or a new message arrival in a message queue, such as the [SQS](/plugins/plugin-aws/sqs/io.kestra.plugin.aws.sqs.realtimetrigger) or [Kafka trigger](/plugins/plugin-kafka/io.kestra.plugin.kafka.trigger).

### Trigger Common Properties

The following properties are common to all triggers:

| Field             | Description                                                                              |
|-------------------|------------------------------------------------------------------------------------------|
| `id`              | The trigger identifier; it must be unique inside a flow.                                 |
| `type`            | The Java FQCN of the trigger.                                                            |
| `description`     | The description of the trigger.                                                          |
| `disabled`        | Set it to `true` to disable execution of the trigger.                                    |
| `allowConcurrent` | Set it to `true` to allow multiple executions from this trigger to run at the same time. |
| `workerGroup.key` | To execute this trigger on a specific Worker Group (EE).                                 |

## Trigger variables

Triggers expose metadata through expressions. For example:

- `{{ trigger.date }}` returns the current date for the [Schedule trigger](./01.schedule-trigger/index.md)
- `{{ trigger.uri }}` returns the file or message for file detection or message arrival events
- `{{ trigger.rows }}` provides query results for triggers like the [PostgreSQL Query](/plugins/plugin-jdbc-postgres/io.kestra.plugin.jdbc.postgresql.trigger) trigger.

This example logs the date when the trigger executes the flow:

```yaml
id: variables
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "Hello World on {{ trigger.date }}! 🚀"

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@hourly"
    allowConcurrent: false
```

:::alert{type="warning"}
The **templated variables** above are only available when the execution is created **automatically** by the trigger. You'll get an error if you try to run a flow containing such variables **manually**.

**You don't need an extra task to consume** the file or message from the event. Kestra downloads it automatically to **internal storage** and makes it available in your flow via the `{{ trigger.uri }}` variable.

Check the documentation of a specific trigger and [Blueprints](/blueprints) with the **Trigger** tag for more details and examples.
:::

Each trigger ID is limited to a single active execution at a time. If a scheduled execution is still running, the next one will be queued instead of started immediately. For instance, if an execution from a flow with a `Schedule` trigger with ID `hourly` is still in a `Running` state, another one will not be started. However, you can still trigger the same flow manually (from the UI or API), and the scheduled executions will not be affected.
```yaml id: hourlyFlow namespace: company.team tasks: - id: important-task type: io.kestra.plugin.core.log.Log message: If this runs for longer than 1h, next Executions will be queued rather than being started immediately triggers: - id: hourly type: io.kestra.plugin.core.trigger.Schedule cron: "@hourly" ``` ## Conditions Conditions are criteria that determine when a trigger should create a new execution. Usually, they limit the scope of a trigger to a specific set of cases. For example, you can restrict a Flow trigger to a specific namespace prefix or execution status, and you can restrict a Schedule trigger to a specific time of the week or month. You can pass a list of conditions; in this case, all the conditions must match to enable the current action. Available conditions include: - [HasRetryAttempt](/plugins/core/condition/io.kestra.plugin.core.condition.hasretryattempt) - [MultipleCondition](/plugins/core) - [Not](/plugins/core/condition/io.kestra.plugin.core.condition.not) - [Or](/plugins/core/condition/io.kestra.plugin.core.condition.or) - [ExecutionFlow](/plugins/core/condition/io.kestra.plugin.core.condition.executionflow) - [ExecutionNamespace](/plugins/core/condition/io.kestra.plugin.core.condition.executionnamespace) - [ExecutionLabels](/plugins/core/condition/io.kestra.plugin.core.condition.executionlabels) - [ExecutionStatus](/plugins/core/condition/io.kestra.plugin.core.condition.executionstatus) - [ExecutionOutputs](/plugins/core/condition/io.kestra.plugin.core.condition.executionoutputs) - [Expression](/plugins/core/condition/io.kestra.plugin.core.condition.expression) You can also find datetime related conditions [on the Schedule trigger page](./01.schedule-trigger/index.md#schedule-conditions). ## Unlocking, enabling, and disabling triggers Triggers do not always need to be enabled. Disable a trigger whenever you want to pause a flow during the development phase or other instances. 
### Disabling a trigger in the source code If you want to temporarily disable a trigger, you could do so by setting the `disabled` property to `true`, as you can see in the example below: ```yaml id: hello_world namespace: company.team tasks: - id: sleep type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - sleep 30 triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/1 * * * *" disabled: true ``` However, this approach requires changing the source code. A simpler approach is to use the `Enabled` toggle in the UI. ### Disabling a trigger from the UI You can disable or re-enable a trigger from the UI. Here is how you can do it: 1. Go to the `Flows` page and click on the flow you want to disable the trigger for. 2. Go to the `Triggers` tab and click on the `Enabled` toggle next to the trigger you want to disable. You can re-enable it by clicking the toggle again. ![triggers_flow](./triggers_flow.png) If your trigger is locked due to an execution in progress, you can unlock it by clicking the `Unlock trigger` button. ![trigger_unlock](./trigger_unlock.png) The **Unlock trigger** functionality is useful for troubleshooting, e.g. if a process is stuck due to infrastructure issues. Keep in mind that manually unlocking triggers may result in multiple concurrent (potentially duplicated) executions — use it with caution. :::alert{type="info"} Only scheduled-based triggers (triggers handled by the Scheduler) will be visible in the UI. Triggers handled by the Executor and Webserver will not be displayed. This also applies when fetching triggers from the API. ::: ### Toggle, unlock, or delete triggers from the Administration page From **Administration → Triggers** you can bulk manage trigger state: - **Toggle** — enable or disable one or more triggers without editing the flow code. 
- **Unlock** — clear the “locked” state if a trigger is stuck waiting on a long-running execution (use carefully, as this may create duplicate executions). - **Delete trigger** — remove the trigger definition so it behaves as if newly created. This is useful when you need to reset trigger state or force a fresh evaluation window. ![triggers_administration](./triggers-administration.png) Deleting a trigger is different from deleting a backfill: removing a backfill only cancels pending catch-up runs, while deleting a trigger resets the trigger entity itself. Use **Delete backfill** to stop scheduled replays and **Delete trigger** to rebuild the trigger state. ![Delete a trigger](./delete-triggers.png) ## Troubleshooting a trigger from the UI If you misconfigured a trigger, and as a result, no Executions are created, take the following actions to troubleshoot. The example flow below illustrates this scenario. Note how the `sqs_trigger` trigger is misconfigured with invalid AWS credentials: ```yaml id: bad_trigger_example namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! triggers: - id: sqs_trigger type: io.kestra.plugin.aws.sqs.Trigger accessKeyId: "nonExistingKey" secretKeyId: "nonExistingSecret" region: "us-east-1" queueUrl: "https://sqs.us-east-1.amazonaws.com/123456789/testQueue" maxRecords: 10 ``` When you add that flow to Kestra, you'll see that no Executions are created. To troubleshoot this, you can go to the `Triggers` tab on the Flow's page and **expand the logs** of the trigger that is causing the issue. You'll see a detailed error message that will help you identify the problem: ![invalid_trigger_configuration](./invalid_trigger_configuration.png) ## The `stopAfter` property Kestra 0.15 introduced a generic `stopAfter` property which is a list of states that will disable the trigger after the flow execution has reached one of the states in the list. 
This property is most useful with `Schedule` triggers and polling-based triggers such as HTTP, JDBC, or File Detection. :::alert{type="info"} Note that we don't handle any automatic trigger reenabling logic. After a trigger has been disabled due to the `stopAfter` state condition, you can take some action based on it and manually reenable the trigger. ::: ### Pause the schedule trigger after a failed execution The `stopAfter` property can be used to pause a schedule trigger after a failed execution. Here is an example of how to use it: ```yaml id: business_critical_flow namespace: company.team tasks: - id: important_task type: io.kestra.plugin.core.log.Log message: if this fails, we want to stop the flow from running until we fix it triggers: - id: stopAfter type: io.kestra.plugin.core.trigger.Schedule cron: "0 9 * * *" stopAfter: - FAILED ``` The above flow will be triggered every day at `9:00` AM, but if it fails, the schedule will be paused so that you can manually reenable the trigger once the issue is fixed. This is useful for business-critical flows that should not continue running the next scheduled executions if a previous execution has failed. ### Disable the HTTP trigger after the first successful execution The example below shows how to use the `stopAfter` property with the HTTP trigger condition. The use case is to poll an API endpoint and send a Slack alert if the price is below $110. If the condition is met, the trigger will be disabled so that you don't get alerted every 30 seconds about the same condition. 
```yaml id: http namespace: company.team tasks: - id: slack type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_WEBHOOK') }}" messageText: "The price is now: {{ json(trigger.body).price }}" triggers: - id: http type: io.kestra.plugin.core.http.Trigger uri: https://fakestoreapi.com/products/1 responseCondition: "{{ json(response.body).price <= 110 }}" interval: PT30S stopAfter: - SUCCESS ``` Let's break down the above example: 1. The HTTP trigger will poll the API endpoint every 30 seconds to check if the price of a product is below $110. 2. If the condition is met, the Execution will be created 3. Within that execution, the `slack` task will send a Slack message to notify about the price change 4. After that execution finishes successfully, the `stopAfter` property condition is met — it will disable the trigger ensuring that you don't get alerted every 30 seconds about the same condition. ## Locked triggers [Flow](./02.flow-trigger/index.md), [Schedule](./01.schedule-trigger/index.md), and [Polling triggers](./04.polling-trigger/index.md) have locks to avoid concurrent trigger evaluation and concurrent execution of a flow for a trigger. To see a list of triggers and inspect their current status, go to the **Administration -> Triggers** section in the Kestra UI. From here, you can unlock a trigger if it is locked. Note that doing so raises a risk of concurrent trigger evaluation or flow execution for this trigger if you unlock it manually. ## Setting inputs inside of triggers You can easily pass inputs to triggers by using the `inputs` property and passing them as a key-value pair. In this example, the `user` input is set to "John Smith" inside of the `schedule` trigger: ```yaml id: myflow namespace: company.team inputs: - id: user type: STRING defaults: Rick Astley tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "Hello {{ inputs.user }}! 
🚀" triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/1 * * * *" inputs: user: John Smith ``` ## Trigger errors By default, if a trigger fails, no execution is created; this is by design to avoid excessive executions on the instance. To troubleshoot, you must [investigate the trigger logs](#troubleshooting-a-trigger-from-the-ui). If you'd prefer an execution to be created on trigger failure, set the `failOnTriggerError` property to `true` in the trigger configuration. This will cause the flow to fail and produce an execution with its own logs. For example, take the following flow with a misconfigured trigger: ```yaml id: bad_trigger_example namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: Hello World! triggers: - id: sqs_trigger type: io.kestra.plugin.aws.sqs.Trigger accessKeyId: "nonExistingKey" secretKeyId: "nonExistingSecret" region: "us-east-1" queueUrl: "https://sqs.us-east-1.amazonaws.com/123456789/testQueue" maxRecords: 10 failOnTriggerError: true ``` With this configuration, the flow will produce an execution containing logs that describe the trigger failure. This execution can be used for both troubleshooting and notification, in addition to the trigger logs. --- # Flow Trigger in Kestra – Chain Flow Executions URL: https://kestra.io/docs/workflow-components/triggers/flow-trigger > Chain workflows in Kestra using the Flow Trigger. Automate dependencies by triggering flows upon the completion, success, or failure of other flows. Trigger one flow based on the execution of another flow. A Flow trigger runs a flow after another flow completes, enabling event-driven workflows and dependencies across teams. ```yaml type: io.kestra.plugin.core.trigger.Flow ``` Kestra can automatically start a flow as soon as another flow ends. This allows you to create dependencies between flows, even when those flows are owned by different teams. 
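As a minimal hedged sketch of the pattern (flow ids and namespaces are placeholders, assuming an upstream flow named `upstream_flow`), a downstream flow can subscribe to the terminal state of an upstream flow and use the trigger metadata in its tasks:

```yaml
id: downstream
namespace: company.team

tasks:
  - id: react
    type: io.kestra.plugin.core.log.Log
    # the Flow trigger exposes metadata such as the upstream execution id
    message: "Triggered by upstream execution {{ trigger.executionId }}"

triggers:
  - id: on_upstream
    type: io.kestra.plugin.core.trigger.Flow
    preconditions:
      id: upstream
      flows:
        - namespace: company.team
          flowId: upstream_flow # hypothetical upstream flow
          states: [SUCCESS]
```

The `preconditions` syntax used here is covered in detail in the next section.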
Check the [Flow trigger](/plugins/core/trigger/io.kestra.plugin.core.trigger.flow) documentation for the list of all properties.

## Preconditions

A Flow trigger requires preconditions to filter which upstream executions can trigger the flow, often within a defined time window.

:::alert{type="info"}
[Pebble expressions](../../../expressions/index.mdx) cannot be used in Flow Trigger (pre)conditions. You must declaratively define any condition variables.
:::

### Filters

- `flows`: A list of preconditions to meet, in the form of upstream flows.

The example below shows a Flow trigger that runs when `flow_a` completes successfully.

```yaml
id: flow_b
namespace: kestra.sandbox

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "Hello World!"

triggers:
  - id: upstream_dependency
    type: io.kestra.plugin.core.trigger.Flow
    preconditions:
      id: flow_trigger
      flows:
        - namespace: kestra.sandbox
          flowId: flow_a
          states: [SUCCESS]
```

:::alert{type="info"}
It is [best practice](../../../14.best-practices/0.flows/index.md#flow-trigger-on-state-change) when using a flow trigger to use `preconditions.flows.states` rather than the `states` task property when defining state conditions for one specific flow.
:::

- `where`: filter executions based on fields like `FLOW_ID`, `NAMESPACE`, `STATE`, and `EXPRESSION`. For example, the following Flow trigger fires on executions of flows in `FAILED` or `WARNING` states in namespaces starting with "company":

```yaml
triggers:
  - id: alert_on_failure
    type: io.kestra.plugin.core.trigger.Flow
    states:
      - FAILED
      - WARNING
    preconditions:
      id: company_namespace
      where:
        - id: company
          filters:
            - field: NAMESPACE
              type: STARTS_WITH
              value: company
```

### Time Window & SLA

The `timeWindow` property lets you define how Kestra evaluates upstream flow executions over time. It supports several modes:

- `DURATION_WINDOW`: This is the default type.
It uses a start time (`windowAdvance`) and an end time (`window`) that move forward to the next interval whenever the evaluation time reaches the end time, based on the defined duration window. For example, with a 1-day window (`window: P1D`, the default), SLA conditions are evaluated over a 24-hour period starting at midnight each day. If you set `windowAdvance: PT6H`, the window will start at 6 AM each day. If you set `windowAdvance: PT6H` and you also override `window: PT6H`, the window will start at 6 AM and last for 6 hours; as a result, Kestra will check the SLA conditions during the following time periods: `06:00` to `12:00`, `12:00` to `18:00`, `18:00` to `00:00`, `00:00` to `06:00`, and so on.

- `SLIDING_WINDOW`: This option also evaluates SLA conditions over a fixed time window, but it always goes backward from the current time. For example, a sliding window of 1 hour (`window: PT1H`) will evaluate executions for the past hour (between now and one hour ago). It uses a default window of 1 day.

For example, the flow below checks every hour whether `flow_a` is in the SUCCESS state. If so, it triggers `flow_b`, passing the corresponding inputs (reading `flow_a` outputs).

```yaml
id: flow_b
namespace: kestra.sandbox

inputs:
  - id: value_from_a
    type: STRING

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "{{ inputs.value_from_a }}"

triggers:
  - id: upstream_dep
    type: io.kestra.plugin.core.trigger.Flow
    inputs:
      value_from_a: "{{ trigger.outputs.return_value }}"
    preconditions:
      id: test
      flows:
        - namespace: kestra.sandbox
          flowId: flow_a
          states: [SUCCESS]
      timeWindow:
        type: SLIDING_WINDOW
        window: PT1H
```

For reference, below is `flow_a`:

```yaml
id: flow_a
namespace: kestra.sandbox

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: "Hello World! 🚀"

outputs:
  - id: return_value
    type: STRING
    value: "Flow A ran successfully"
```

- `DAILY_TIME_DEADLINE`: This option enforces SLA conditions that must be met before a specific cutoff time each day. With the string property `deadline`, you can configure a daily cutoff for checking conditions. For example, `deadline: "09:00:00.00Z"` means that the defined SLA conditions should be met from midnight until 9 AM each day; otherwise, the flow will not be triggered. The trigger definition below only fires the flow if `flow_a` is in the SUCCESS state before 9:00 AM every day.

```yaml
triggers:
  - id: upstream_dep
    type: io.kestra.plugin.core.trigger.Flow
    preconditions:
      id: should_be_success_by_nine
      flows:
        - namespace: kestra.sandbox
          flowId: flow_a
          states: [SUCCESS]
      timeWindow:
        type: DAILY_TIME_DEADLINE
        deadline: "09:00:00.00Z"
```

- `DAILY_TIME_WINDOW`: This option enforces SLA conditions that must be met within a specific daily time range. For example, a window from `startTime: "06:00:00"` to `endTime: "09:00:00"` evaluates executions within that interval each day. This option is particularly useful for declaratively defining freshness conditions when building data pipelines. If you only need one successful execution within a given time range to guarantee that some data has been refreshed before proceeding with the next steps of your pipeline, this option can be more useful than a strict DAG-based approach: instead of every upstream failure blocking the entire pipeline, you can proceed as soon as the data is successfully refreshed at least once within the given time range.
```yaml triggers: - id: upstream_dep type: io.kestra.plugin.core.trigger.Flow inputs: value_from_a: "{{ trigger.outputs.return_value }}" preconditions: id: test flows: - namespace: kestra.sandbox flowId: flow_a states: [SUCCESS] timeWindow: type: DAILY_TIME_WINDOW startTime: "06:00:00" endTime: "12:00:00" ``` ## Example This example triggers the `silver_layer` flow once the `bronze_layer` flow finishes successfully by 9 AM. The deadline time string must include the timezone offset. This ensures that no new executions are triggered past the deadline. Here is the `silver_layer` flow: ```yaml id: silver_layer namespace: company.team tasks: - id: transform_data type: io.kestra.plugin.core.log.Log message: deduplication, cleaning, and minor aggregations triggers: - id: flow_trigger type: io.kestra.plugin.core.trigger.Flow preconditions: id: bronze_layer timeWindow: type: DAILY_TIME_DEADLINE deadline: "09:00:00+01:00" flows: - namespace: company.team flowId: bronze_layer states: [SUCCESS] ``` ## Example: Alerting This example creates a `System Flow` to send a Slack alert on any failure or warning state within the `company` namespace. This example uses the Slack webhook secret to notify the `#general` channel about the failed flow. ```yaml id: alert namespace: system tasks: - id: send_alert type: io.kestra.plugin.slack.notifications.SlackExecution url: "{{secret('SLACK_WEBHOOK')}}" # format: https://hooks.slack.com/services/xzy/xyz/xyz channel: "#general" executionId: "{{trigger.executionId}}" triggers: - id: alert_on_failure type: io.kestra.plugin.core.trigger.Flow states: - FAILED - WARNING preconditions: id: company_namespace where: - id: company filters: - field: NAMESPACE type: STARTS_WITH value: company ``` --- # Polling Trigger in Kestra – Check External Systems URL: https://kestra.io/docs/workflow-components/triggers/polling-trigger > Automate workflows based on external state with Polling Triggers. 
Monitor databases, FTPs, or queues and trigger Kestra flows when changes are detected.

Trigger flows automatically by polling external systems for new data or events.

Polling triggers repeatedly check an external system at a fixed interval. When new data or events are detected, they automatically start a new flow execution. Kestra provides polling triggers for a wide variety of systems, including databases, message queues, cloud storage, and FTP servers.

The polling frequency is controlled by the `interval` property. When triggered, the flow has access to the polling results through the `trigger` variable, making the retrieved data immediately available for downstream tasks.

## Example

For example, the following flow polls a PostgreSQL table every 5 minutes. When new rows are available, it deletes them (to prevent duplicate processing) and logs the retrieved values.

```yaml
id: jdbc-trigger
namespace: company.team

inputs:
  - id: db_url
    type: STRING

tasks:
  - id: update
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "{{ inputs.db_url }}"
    sql: DELETE FROM my_table

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.rows }}"

triggers:
  - id: watch
    type: io.kestra.plugin.jdbc.postgresql.Trigger
    url: myurl
    interval: "PT5M"
    sql: "SELECT * FROM my_table"
```

In [Enterprise Edition](../../../07.enterprise/01.overview/01.enterprise-edition/index.md), you can assign polling triggers to a specific [Worker Group](../../../07.enterprise/04.scalability/worker-group/index.md) using the `workerGroup.key` property. This allows you to control where the polling is executed.

## Enterprise example

In Enterprise Edition (Kestra 0.24+), the `Salesforce Trigger` enables flows to start automatically when new records are created in Salesforce. For example, the flow below sends a Slack notification whenever a new contact is added.
```yaml id: salesforce_contact_trigger namespace: company.sales tasks: - id: notify_sales_manager type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: "{{ secret('SLACK_WEBHOOK_URL') }}" messageText: "New contact created" triggers: - id: new_contact_trigger type: io.kestra.plugin.ee.salesforce.Trigger interval: "PT5M" connection: username: "{{ secret('SALESFORCE_USERNAME') }}" password: "{{ secret('SALESFORCE_PASSWORD') }}" authEndpoint: "{{ secret('SALESFORCE_AUTH_ENDPOINT') }}" query: "SELECT Id, FirstName, LastName, Email, Phone, Company, CreatedDate FROM Contact WHERE CreatedDate >= LAST_N_MINUTES:5" ``` --- # Realtime Trigger in Kestra – Millisecond Eventing URL: https://kestra.io/docs/workflow-components/triggers/realtime-trigger > Achieve low-latency automation with Kestra's Realtime Triggers. React instantly to events from Kafka, SQS, MQTT, and other streaming systems. Trigger workflows instantly as events occur, with millisecond latency. [Triggers](./index.md) in Kestra can listen to external events and start a workflow execution when the event occurs. Most triggers in Kestra **poll** external systems at regular intervals (e.g., every second) to detect new events. This is effective for batch-style data processing. However, business-critical workflows often demand immediate reactions — within milliseconds. **Realtime Triggers** address this need by listening directly for events and starting workflows as soon as they occur.
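For instance, below is a minimal sketch using the AWS SQS Realtime Trigger (credentials, region, and queue URL are placeholders, and the output attribute used here is an assumption since each plugin exposes its own trigger outputs; check the plugin reference for the exact attributes):

```yaml
id: realtime_example
namespace: company.team

tasks:
  - id: process_message
    type: io.kestra.plugin.core.log.Log
    # the trigger output attribute varies per plugin; `data` is assumed here
    message: "New event: {{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.aws.sqs.RealtimeTrigger
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: us-east-1
    queueUrl: https://sqs.us-east-1.amazonaws.com/123456789/myQueue
```

Each execution created by a Realtime Trigger processes exactly one event, as soon as it arrives.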
## What are Realtime Triggers Realtime Triggers continuously listen for events and launch a new workflow execution the moment an event occurs, such as: - a message is published to a [Kafka topic](/plugins/plugin-kafka/io.kestra.plugin.kafka.realtimetrigger) - a message is published to a [Pulsar topic](/plugins/plugin-pulsar/io.kestra.plugin.pulsar.realtimetrigger) - a message is published to an [AMQP queue](/plugins/plugin-amqp/io.kestra.plugin.amqp.realtimetrigger) - a message is published to an [MQTT topic](/plugins/plugin-mqtt/io.kestra.plugin.mqtt.realtimetrigger) - a message is published to an [AWS SQS queue](/plugins/plugin-aws/sqs/io.kestra.plugin.aws.sqs.realtimetrigger) - a message is published to [Google Pub/Sub](/plugins/plugin-gcp/pubsub/io.kestra.plugin.gcp.pubsub.realtimetrigger) - a message is published to [Azure Event Hubs](/plugins/plugin-azure/eventhubs/io.kestra.plugin.azure.eventhubs.realtimetrigger) - a message is published to a [NATS subject](/plugins/plugin-nats/nats-core/io.kestra.plugin.nats.core.realtimetrigger) - an item is added to a [Redis list](/plugins/plugin-redis) - a row is added, modified, or deleted in [Postgres](/plugins/plugin-debezium-postgres/io.kestra.plugin.debezium.postgres.realtimetrigger), [MySQL](/plugins/plugin-debezium-mysql/io.kestra.plugin.debezium.mysql.realtimetrigger), or [SQL Server](/plugins/plugin-debezium-sqlserver/io.kestra.plugin.debezium.sqlserver.realtimetrigger). ## How Realtime Triggers work Once a Realtime Trigger is added to a workflow, Kestra spins up a dedicated listener thread that remains active. As soon as a new event arrives, the listener immediately starts a workflow execution to process it. ## Use cases Realtime Triggers are ideal for orchestrating **business-critical operations** and **event-driven microservices**. 
Typical scenarios include: - Fraud or anomaly detection - Order and payment processing - Real-time predictions or recommendations - Stock price or market event reactions - Shipping and delivery updates - Any workflow requiring instant reaction to external events In addition, Realtime Triggers can be used for **data orchestration**, especially for **Change Data Capture** use cases. The [Debezium Postgres RealtimeTrigger](/plugins/plugin-debezium-postgres/io.kestra.plugin.debezium.postgres.realtimetrigger) plugin can listen to changes in a database table and start a workflow execution as soon as a new row is inserted, updated, or deleted. ## When to use Triggers vs. Realtime Triggers The table below compares Triggers with Realtime Triggers to help you choose the right trigger type for your use case: | Criteria | Trigger | Realtime Trigger | |----------------------|-----------------------------------------------------------------------|-------------------------------------------------------------------------------------------------| | **Implementation** | Micro-batch | Realtime | | **Event Processing** | Batch-process all events received until the poll interval has elapsed | Process each event immediately as it happens | | **Latency** | Second(s) or minute(s) | Millisecond(s) | | **Execution Model** | Each execution processes one or many events | Each execution processes exactly one event | | **Data Handling** | Store all received events in a file | Store each event in a raw format | | **Output format** | URI of a file in internal storage | Raw data of the event payload and related metadata | | **Application** | Data applications processing data in batch | Business-critical operations reacting to events in real time | | **Use cases** | Data orchestration for analytics and building data products | Process and microservice orchestration (real time updates, anomaly detection, order processing) | ## How to use Realtime Triggers To use Realtime Triggers, choose the 
`RealtimeTrigger` as the trigger type for your desired service. The following flow uses the `RealtimeTrigger` to [listen to new messages in an AWS SQS queue](https://youtu.be/bLzk4dKc95g): ```yaml id: sqs namespace: company.team tasks: - id: log type: io.kestra.plugin.core.log.Log message: "{{ trigger }}" triggers: - id: realtime_trigger type: io.kestra.plugin.aws.sqs.RealtimeTrigger region: eu-north-1 accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID')}}" secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}" queueUrl: https://sqs.eu-north-1.amazonaws.com/123456789/MyQueue ``` ## Worker failover for Realtime Triggers Each Realtime Trigger runs as a dedicated listener thread on one specific worker. If that worker stops, the listener stops with it. Kestra's [liveness mechanism](../../../10.administrator-guide/server-lifecycle/index.md) detects this and re-emits the trigger so another available worker can pick it up. The time before failover depends on how the worker stopped: - **Graceful shutdown** (e.g. `docker stop`, rolling deploy): the Executor waits for `kestra.server.terminationGracePeriod` (default `PT5M`) before reassigning the trigger. This prevents duplicate processing when the worker is expected to come back shortly, such as during a rolling deployment. - **Abrupt failure** (no heartbeat received): the Executor detects the missing heartbeat within `kestra.server.liveness.timeout` and reassigns the trigger without waiting for the grace period. To reduce the failover time after a graceful shutdown, lower the `terminationGracePeriod`: ```yaml kestra: server: terminationGracePeriod: PT1M # default is PT5M ``` ::alert{type="info"} Events are not lost during the failover window. They remain in the source system (Kafka topic, SQS queue, etc.) and will be consumed once the trigger listener is restarted on another worker. 
:: ## Comparison with real-time data processing engines Kestra's Realtime Triggers are not a replacement for real-time data processing engines such as Apache Flink, Apache Beam, or Google Dataflow. Those data processing engines excel at **stateful** streaming applications and complex SQL transformations over real-time data streams. Unlike streaming engines, Kestra’s Realtime Triggers are **stateless** — each event creates its own independent workflow execution. They are designed for orchestrating business workflows and microservices in response to events, not for continuous stateful stream processing. To continue with Realtime Triggers, check out their [How-to Guide](../../../15.how-to-guides/realtime-triggers/index.md). --- # Schedule Trigger in Kestra – Cron-Based Scheduling URL: https://kestra.io/docs/workflow-components/triggers/schedule-trigger > Schedule Kestra workflows with the Schedule Trigger. Learn to use cron expressions, backfills, and conditions to run flows at precise times. Schedule flows using cron expressions. The Schedule trigger generates new executions on a regular cadence based on a Cron expression or custom scheduling conditions. ```yaml type: io.kestra.plugin.core.trigger.Schedule ``` Kestra can trigger flows on a defined schedule. If you need to wait for another system to be ready and no event mechanism is available, you can configure one or more time-based schedules for your flow. Kestra can automatically handle [backfills](../../../06.concepts/08.backfill/index.md) to recover missed executions. Check the [Schedule task](/plugins/core/trigger/io.kestra.plugin.core.trigger.schedule) documentation for the list of the task properties and outputs. :::alert{type="warning"} To avoid unexpected differences, keep your Kestra server and database timezones aligned. If this isn’t possible, account for timezone implications such as Daylight Saving Time or regional variations. 
::: ## Cron extension Kestra supports the following cron shortcuts, which you can use instead of writing a full cron expression: - `@yearly` and `@annually` - runs yearly on 1st January at `00:00` - `@monthly` - runs monthly on the 1st at `00:00` - `@weekly` - runs weekly on Sunday at `00:00` - `@daily` and `@midnight` - runs at `00:00` every day - `@hourly` - runs every hour, on the hour ## Examples Schedule that runs every 15 minutes: ```yaml triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/15 * * * *" ``` Schedule that runs only on the first Monday of every month at 11 AM: ```yaml triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 11 * * 1" conditions: - type: io.kestra.plugin.core.condition.DayWeekInMonth date: "{{ trigger.date }}" dayOfWeek: "MONDAY" dayInMonth: "FIRST" ``` A schedule that runs daily at midnight US Eastern time: ```yaml triggers: - id: daily type: io.kestra.plugin.core.trigger.Schedule cron: "@daily" timezone: America/New_York ``` Schedule that runs on the last day of the month: The Schedule trigger also supports `L` in the day-of-month field to represent the last day of the month. For example: ```yaml triggers: - id: month_end type: io.kestra.plugin.core.trigger.Schedule cron: "0 12 L * *" ``` This runs at `12:00` on the last day of every month, including shorter months like February. :::alert{type="warning"} Schedules cannot **overlap**, meaning concurrent schedule executions are not allowed. If the previous scheduled execution has not finished when the next one is due to start, the scheduler waits until the previous one ends. The same applies during backfills. ::: :::alert{type="info"} By default, schedule executions depend on `trigger.date`. For example, this may be used when querying files or databases by date. However, this prevents manual execution since `trigger.date` is only available for scheduled runs. 
You can use this expression to make your **manual execution work**: `{{ trigger.date ?? execution.startDate | date("yyyy-MM-dd") }}`. It falls back to the execution start date when there is no schedule date, making it possible to start the flow manually. ::: ## Schedule conditions When a `cron` expression alone is not sufficient (e.g., only the first Monday of the month, only weekends), you can refine schedules using `conditions`. You **must** use the `{{ trigger.date }}` expression on the property `date` of the current schedule. This condition will be evaluated, and `{{ trigger.previous }}` and `{{ trigger.next }}` will reflect the date **with** the conditions applied. The core conditions that can be used are: - [DateTimeBetween](/plugins/core/condition/io.kestra.plugin.core.condition.datetimebetween) - [DayWeek](/plugins/core/condition/io.kestra.plugin.core.condition.dayweek) - [DayWeekInMonth](/plugins/core/condition/io.kestra.plugin.core.condition.dayweekinmonth) - [Not](/plugins/core/condition/io.kestra.plugin.core.condition.not) - [Or](/plugins/core/condition/io.kestra.plugin.core.condition.or) - [Weekend](/plugins/core/condition/io.kestra.plugin.core.condition.weekend) - [PublicHoliday](/plugins/core/condition/io.kestra.plugin.core.condition.publicholiday) - [TimeBetween](/plugins/core/condition/io.kestra.plugin.core.condition.timebetween) Here's an example using the `DayWeek` condition: ```yaml id: conditions namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: This will execute only on Thursday! triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "@hourly" conditions: - type: io.kestra.plugin.core.condition.DayWeek dayOfWeek: "THURSDAY" ``` ## Recover missed schedules ### Automatically By default, Kestra automatically recovers missed schedules. This means that if the Kestra server is down, the missed schedules will be executed as soon as the server is back up. 
However, this behavior is not always desirable, e.g. during a planned maintenance window. This behavior can be disabled by setting the `recoverMissedSchedules` configuration to `NONE`. Configure `recoverMissedSchedules` behavior in your global Kestra configuration to choose whether you want to recover missed schedules automatically or not: ```yaml kestra: plugins: configurations: - type: io.kestra.plugin.core.trigger.Schedule values: # available options: LAST | NONE | ALL -- default: ALL recoverMissedSchedules: NONE ``` The `recoverMissedSchedules` configuration can be set to `ALL`, `NONE` or `LAST`: - `ALL`: Kestra will recover all missed schedules. This is the **default** value. - `NONE`: Kestra will not recover any missed schedules. - `LAST`: Kestra will recover only the last missed schedule for each flow. Note that this is a global configuration that will apply to all flows, unless other behavior is explicitly defined within the flow definition like below: ```yaml triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/15 * * * *" recoverMissedSchedules: NONE ``` In this example, the `recoverMissedSchedules` is set to `NONE`, which means that Kestra will not recover any missed schedules for this specific flow regardless of the global configuration or default `recoverMissedSchedules` behavior. If you have a missed window of executions with `recoverMissedSchedules: NONE`, then use Backfill to replay the missed executions. ### Using Backfill Backfills are replays of missed schedule intervals between a defined start and end date. To backfill the missed executions, go to the `Triggers` tab on the flow's detail page and click on the `Backfill executions` button. ![backfill1](../../../06.concepts/08.backfill/backfill1.png) :::alert{type="info"} Note: Ensure the backfill date range spans every missed schedule so the trigger can replay each execution. 
::: For more information on Backfill, check out the [dedicated documentation](../../../06.concepts/08.backfill/index.md). #### Disabling the trigger If you are unsure how to proceed, you can temporarily disable the trigger by setting `disabled: true` in the YAML or toggling it in the UI. This is useful if you are figuring out what to do before the next schedule is due to run. For more information on Disabled, check out the [dedicated documentation](../../16.disabled/index.md). ## Setting inputs inside the schedule trigger You can pass inputs to the Schedule trigger using the `inputs` property, provided as key-value pairs. In this example, the `user` input is set to "John Smith" inside the `schedule` trigger: ```yaml id: myflow namespace: company.team inputs: - id: user type: STRING defaults: Rick Astley tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "Hello {{ inputs.user }}! 🚀" triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/1 * * * *" inputs: user: John Smith ``` ## Disable a schedule trigger after a specified execution state Schedule triggers have an optional property, `stopAfter`, that disables a trigger after a specified execution state has been reached: for example, `SUCCESS`, `FAILED`, `KILLED`, `SKIPPED`, etc. Refer to the [Schedule Trigger documentation](/plugins/core/trigger/io.kestra.plugin.core.trigger.schedule#properties_stopAfter-body) for more property details. For example, you may want to disable a trigger for a `FAILED` or `KILLED` flow to avoid repeated runs of a flow that is misconfigured and needs attention. The property is added to the trigger definition like below: ```yaml id: myflow namespace: company.team inputs: - id: user type: STRING defaults: Rick Astley tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "Hello {{ inputs.user }}! 
🚀" triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "*/1 * * * *" stopAfter: - FAILED - KILLED inputs: user: John Smith ``` ## Detect stuck Schedule Triggers Kestra has a plugin, [ScheduleMonitor](/plugins/plugin-kestra/kestra-triggers/io.kestra.plugin.kestra.triggers.schedulemonitor), for detecting stuck or misconfigured Schedule Triggers. It checks periodically and can run at the Tenant level, for a specific Namespace, or for a single Flow. For example, set this up as a [System Flow](../../../06.concepts/system-flows/index.md) and send an alert if any Schedule Triggers come back showing an issue: ```yaml id: detect_stuck_schedules namespace: system tasks: - id: send_alert runIf: "{{ trigger.data }}" type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook url: https://kestra.io/api/mock messageText: The following Schedule triggers seem unhealthy {{ trigger.data }} triggers: - id: stuck_schedules type: io.kestra.plugin.kestra.triggers.ScheduleMonitor auth: username: admin@kestra.io # pass your Kestra username as secret password: Admin1234 # pass your Kestra password as secret namespace: company.team flowId: daily_sync interval: PT1H # poll for stuck schedules every 1h ``` By default, the trigger checks all schedules in the current Tenant ([Multi-tenancy](../../../07.enterprise/02.governance/tenants/index.md) is an Enterprise feature) if no Namespace or Flow is specified. --- # Webhook Trigger in Kestra – Start Flows via HTTP URL: https://kestra.io/docs/workflow-components/triggers/webhook-trigger > Trigger Kestra flows via HTTP with the Webhook Trigger. Learn to start executions from external applications using secure webhook URLs and payloads. Trigger flows automatically in response to web-based events. A Webhook trigger generates a unique URL that lets external applications (such as GitHub, Amazon EventBridge, or any system that can send HTTP requests) automatically start new executions in Kestra. 
Each webhook URL requires a secret `key` to secure it. This prevents unauthorized access and ensures only trusted systems can trigger your flow. ```yaml type: io.kestra.plugin.core.trigger.Webhook ``` A Webhook trigger enables triggering a flow from a webhook URL. When you create the trigger, you must provide a `key`. This `key` is embedded in the webhook URL: `/api/v1/main/executions/webhook/{namespace}/{flowId}/{key}`. For security, use a randomly generated string rather than something easy to guess. Kestra accepts `GET`, `POST`, and `PUT` requests on the webhook URL. Both the request body and headers are automatically available as variables inside your flow. :::alert{type="info"} Starting in Kestra 0.24, [Basic Authentication is required](../../../11.migration-guide/v0.24.0/basic-authentication/index.md) for all instances. This change makes it so API requests require an `Authorization` header. Follow these [Basic Authentication Encoding Steps](../../../15.how-to-guides/synchronous-executions-api/index.md#basic-authentication) to configure requests correctly. ::: ## Example ```yaml id: trigger namespace: company.team tasks: - id: hello type: io.kestra.plugin.core.log.Log message: "Hello World! 🚀" triggers: - id: webhook type: io.kestra.plugin.core.trigger.Webhook key: 4wjtkzwVGBM9yKnjm3yv8r ``` After creating the trigger, include the key in the webhook URL to start the flow. For example: ```bash https://{kestra_domain}/api/v1/main/executions/webhook/{namespace}/{flowId}/4wjtkzwVGBM9yKnjm3yv8r ``` Make sure to replace `kestra_domain`, `namespace`, and `flowId`. You can also copy the formed Webhook URL from the **Triggers** tab.
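As noted above, the request body and headers are automatically exposed through the `trigger` variable. The sketch below (the flow id and message text are illustrative, not from the official docs) logs whatever payload the caller sends:

```yaml
id: webhook_payload
namespace: company.team

tasks:
  # Logs the request body sent to the webhook URL (e.g., a JSON payload)
  - id: log_request
    type: io.kestra.plugin.core.log.Log
    message: "Received body: {{ trigger.body }}"

triggers:
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: 4wjtkzwVGBM9yKnjm3yv8r
```

Sending a `POST` request with a JSON body to the webhook URL then makes that payload available to every task via `trigger.body`.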
## Webhook response By default, a webhook trigger answers with JSON. When you need the caller to wait for a custom response (e.g., validation handshakes that require `text/plain`), enable `wait` and set the `responseContentType` to `text/plain`. ```yaml triggers: - id: webhook type: io.kestra.plugin.core.trigger.Webhook key: your-secret-key wait: true returnOutputs: true responseContentType: text/plain # optional, defaults to application/json ``` Behavior: - `wait: true` keeps the HTTP connection open until the flow finishes or hits the trigger’s timeout. - `returnOutputs: true` returns the flow outputs as the HTTP response body (JSON by default). Override with `responseContentType` for plaintext or other formats. ## Webhook trigger testing If your flow uses trigger variables (such as `{{ trigger.body }}`), you can test it directly from the execution modal. Kestra generates a ready-to-use `cURL` command that lets you trigger the flow with a custom JSON payload. ![Webhook Trigger Test](./webhook-trigger-test.png) See the [Webhook trigger plugin documentation](/plugins/core/trigger/io.kestra.plugin.core.trigger.webhook) for a full list of properties and outputs. ### Return flow outputs in the webhook response To send task outputs back to the caller in the HTTP response, configure the Webhook trigger to wait for the execution and return outputs. The flow must expose at least one `outputs` entry. ```yaml id: webhook_return_outputs namespace: company.team tasks: - id: make_payload type: io.kestra.plugin.core.debug.Return format: "Hello {{ trigger.parameters.name[0] ?? 'world' }}!" outputs: - id: greeting type: STRING value: "{{ outputs.make_payload.value }}" triggers: - id: webhook type: io.kestra.plugin.core.trigger.Webhook key: 4wjtkzwVGBM9yKnjm3yv8r wait: true returnOutputs: true # optional: responseContentType: "text/plain" ``` - Call the webhook URL with a query parameter (for example `?name=Alice`). The execution runs synchronously because `wait: true` is set. 
- The HTTP response body contains the flow outputs (JSON by default). With the example above, the response includes `"greeting": "Hello Alice!"`. - Set `responseContentType: "text/plain"` when you want the response body to be plain text (ensure the flow returns a single string output, such as from the `Return` task). --- # Variables in Kestra – Reuse Values Across Flows URL: https://kestra.io/docs/workflow-components/variables > Master Variables in Kestra to reuse values across tasks and flows. Learn to configure, modify, and utilize dynamic variables with Pebble templating. Variables are key-value pairs that let you reuse values across tasks. You can also store variables at the namespace level to reuse them across multiple flows in that namespace.
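Namespace-level variables are, to my understanding, an Enterprise Edition feature and are referenced with the `namespace` prefix rather than `vars`. A sketch, assuming a variable named `environment` has already been defined on the `company.team` namespace (both names are illustrative):

```yaml
id: namespace_variable_demo
namespace: company.team

tasks:
  # Assumes `environment` was defined as a namespace-level variable in the UI
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "Running in {{ namespace.environment }}"
```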
## How to configure variables The example below shows how you can configure variables in your flow: ```yaml id: hello_world namespace: company.team variables: myvar: hello numeric_variable: 42 tasks: - id: log type: io.kestra.plugin.core.debug.Return format: "{{ vars.myvar }} world {{ vars.numeric_variable }}" ``` Use variables with the syntax `{{ vars.variable_name }}`. ## How variables are rendered You can use variables in any task property documented as **dynamic**. Dynamic variables are rendered by the Pebble templating engine, which processes expressions with filters and functions. More information on variable processing can be found under [Expressions](../../expressions/index.mdx). :::alert{type="info"} Variables are no longer rendered recursively. Learn more about this change — and how to adjust behavior — in the [migration guide](../../11.migration-guide/v0.14.0/recursive-rendering/index.md). ::: ## Dynamic variables
If a variable contains an expression, wrap it with `render` when using it in a task. For example, the variable below displays the current time only when wrapped with `render`; otherwise, the log prints the expression as a string: ```yaml id: dynamic_variable namespace: company.team variables: time: "{{ now() }}" tasks: - id: log type: io.kestra.plugin.core.log.Log message: "{{ render(vars.time) }}" ``` :::alert{type="info"} Wrap the variable expression with `render` every time you use it in a task. ::: ## Set or modify execution variables The `SetVariables` and `UnsetVariables` tasks can modify or delete variables within the execution context. For example, take the following flow: ```yaml id: variables_demo namespace: company.team variables: state: FAILED ansibleTicket: myticket nested: child: property unchanged: stay the same tasks: - id: request type: io.kestra.plugin.core.output.OutputValues values: ansibleTicket: new ticket value state: SUCCESS - id: updateVariables type: io.kestra.plugin.core.execution.SetVariables overwrite: true # true by default variables: state: "{{ outputs.request.values.state }}" ansibleTicket: "{{ outputs.request.values.ansibleTicket }}" nested: child: new value - id: confirmUpdate type: io.kestra.plugin.core.log.Log message: Hello "{{ vars }}" ``` Initially, `state` is `FAILED` and `ansibleTicket` is `myticket`. Within the flow, the `updateVariables` task uses `io.kestra.plugin.core.execution.SetVariables` to modify `state` to `SUCCESS` and `ansibleTicket` to `new ticket value` per the `request` task, as well as change one of the nested variables, `nested.child` to `new value` (`nested.unchanged` is unmodified so it'll remain the same). After the flow runs, `state`, `ansibleTicket`, and `nested.child` have their new values, and `nested.unchanged` remains unchanged. ![Set Variables](./set-variables.png) ## Delete or unset execution variables To unset variables, use `io.kestra.plugin.core.execution.UnsetVariables`. 
Building on the example above, add the following task: ```yaml - id: deleteVariables type: io.kestra.plugin.core.execution.UnsetVariables variables: - state - ansibleTicket - nested.child # remove only this key from the nested map ``` After executing the flow, the only remaining variable is `nested.unchanged` with the value `stay the same`. In the unset task, `state`, `ansibleTicket`, and `nested.child` were deleted. ![Unset Variables](./unset-variables.png) ## FAQ ### How do I escape a block in Pebble syntax to ensure that it won't be parsed? To ensure that a block of code won't be parsed by Pebble, you can use the `{% raw %}` and `{% endraw %}` [Pebble tags](../../expressions/02.syntax/index.mdx#raw). For example, the following returns the string `{{ myvar }}` instead of the value of `myvar`: ```yaml {% raw %}{{ myvar }}{% endraw %} ``` ### In which order are inputs and variables resolved? [Inputs](../05.inputs/index.md) are resolved first, before the execution starts. If a flow has an invalid input value, the execution will not be created. Therefore, you can use inputs within variables, but you cannot use variables or Pebble expressions within inputs in most contexts (check out [Dynamic Inputs](../05.inputs/index.md#dynamic-inputs) for the exceptions). [Expressions](../../expressions/index.mdx) are rendered recursively: if a variable references another variable, the inner one is resolved first. Triggers are handled similarly to inputs because they are known before the execution starts (they create the execution). This means you cannot use inputs (unless they have `defaults`) within triggers, but you can use trigger variables inside `variables`. 
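Because inputs are resolved before the execution starts, an input can safely feed a variable. A minimal sketch (identifiers are illustrative); since the variable contains an expression, wrap it with `render` when using it in a task:

```yaml
id: inputs_in_variables
namespace: company.team

inputs:
  - id: name
    type: STRING
    defaults: world

variables:
  # Resolved lazily: contains an expression, so it needs render() at use time
  greeting: "Hello {{ inputs.name }}"

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ render(vars.greeting) }}"
```

Without `render`, the log would print the literal string `Hello {{ inputs.name }}` instead of the resolved value.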
#### Examples This flow uses inputs, trigger, and execution variables which are resolved before variables: ```yaml id: upload_to_s3 namespace: company.team inputs: - id: bucket type: STRING defaults: declarative-data-orchestration tasks: - id: get_zip_file type: io.kestra.plugin.core.http.Download uri: https://wri-dataportal-prod.s3.amazonaws.com/manual/global_power_plant_database_v_1_3.zip - id: unzip type: io.kestra.plugin.compress.ArchiveDecompress algorithm: ZIP from: "{{outputs.get_zip_file.uri}}" - id: csv_upload type: io.kestra.plugin.aws.s3.Upload from: "{{ outputs.unzip.files['global_power_plant_database.csv'] }}" bucket: "{{ inputs.bucket }}" key: "powerplant/{{ trigger.date ?? execution.startDate | date('yyyy_MM_dd__HH_mm_ss') }}.csv" triggers: - id: hourly type: io.kestra.plugin.core.trigger.Schedule cron: "@hourly" ``` This flow starts a task conditionally based on whether the input is provided or not: ```yaml id: conditional_branching namespace: company.team inputs: - id: parameter type: STRING required: false tasks: - id: if type: io.kestra.plugin.core.flow.If condition: "{{inputs.parameter ?? false }}" then: - id: if_not_null type: io.kestra.plugin.core.log.Log message: Received input {{inputs.parameter}} else: - id: if_null type: io.kestra.plugin.core.log.Log message: No input provided ``` Below is an example that uses a trigger variable within a trigger itself (_that's allowed!_): ```yaml id: backfill_past_mondays namespace: company.team tasks: - id: log_trigger_or_execution_date type: io.kestra.plugin.core.log.Log message: "{{ trigger.date ?? 
execution.startDate }}" triggers: - id: first_monday_of_the_month type: io.kestra.plugin.core.trigger.Schedule timezone: Europe/Berlin backfill: start: 2023-11-11T00:00:00Z cron: "0 11 * * MON" # at 11:00 every Monday conditions: # only first Monday of the month - type: io.kestra.plugin.core.condition.DayWeekInMonth date: "{{ trigger.date }}" dayOfWeek: "MONDAY" dayInMonth: "FIRST" ``` ### Can I transform variables with Pebble expressions? Yes. Kestra uses [Pebble templates](https://pebbletemplates.io/) along with the execution context to render **dynamic properties**. Pebble expressions (with their filters, functions, and operators) let you transform [inputs](../05.inputs/index.md) and [variables](../04.variables/index.md). The example below illustrates how to use variables and Pebble expressions to transform string values in dynamic task properties: ```yaml id: variables_demo namespace: company.team variables: DATE_FORMAT: "yyyy-MM-dd" tasks: - id: seconds_of_day type: io.kestra.plugin.core.debug.Return format: '{{ 60 * 60 * 24 }}' - id: start_date type: io.kestra.plugin.core.debug.Return format: "{{ execution.startDate | date(vars.DATE_FORMAT) }}" - id: curr_date_unix type: io.kestra.plugin.core.debug.Return format: "{{ now() | date(vars.DATE_FORMAT) | timestamp() }}" - id: next_date type: io.kestra.plugin.core.debug.Return format: "{{ now() | dateAdd(1, 'DAYS') | date(vars.DATE_FORMAT) }}" - id: next_date_unix type: io.kestra.plugin.core.debug.Return format: "{{ now() | dateAdd(1, 'DAYS') | date(vars.DATE_FORMAT) | timestamp() }}" - id: pass_downstream type: io.kestra.plugin.scripts.shell.Commands taskRunner: type: io.kestra.plugin.core.runner.Process commands: - echo "{{ outputs.next_date_unix.value }}" ``` ### Can I use nested variables? Yes. Depending on the task, you may need to wrap the root variable with `json()` to access specific keys. 
Below is an example using a list of maps as a variable: ```yaml id: vars namespace: company.myteam variables: servers: - fqn: server01.mydomain.io user: root - fqn: server02.mydomain.io user: guest - fqn: server03.mydomain.io user: rick tasks: - id: parallel type: io.kestra.plugin.core.flow.ForEach concurrencyLimit: 0 values: "{{ vars.servers }}" tasks: - id: log type: io.kestra.plugin.core.log.Log message: - "{{ taskrun.value }}" # for each element, prints the full JSON object (e.g., {"fqn":"server01.mydomain.io","user":"root"}) - "{{ json(taskrun.value).fqn }}" # prints the value for that key (e.g., server01.mydomain.io) - "{{ json(taskrun.value).user }}" # prints the value for that key (e.g., root) ``` ---