Orchestrate dbt Workflows

Data teams use dbt to transform data in warehouses. While dbt simplifies SQL transformations, managing dependencies, testing changes, and deploying models at scale remains challenging. Kestra solves this by integrating dbt with your data platform through version-controlled workflows.

What is needed to orchestrate dbt workflows?

Orchestration platforms like Kestra automate the execution of dbt models while managing dependencies, environments, and deployments. With Kestra, you can:

  • Version control models – Store dbt projects in Git and sync with Kestra's namespace files
  • Test changes safely – Run modified models in isolated containers before production
  • Scale transformations – Execute dbt builds on dynamically provisioned containers in the cloud using task runners (AWS/GCP/Azure Batch)
  • Integrate with your data stack – Chain dbt runs with ingestion tools, quality checks, and alerts

Why Use Kestra for dbt Orchestration?

  1. GitOps Workflows – Sync dbt projects from Git, add and test new models, then push changes to Git from Kestra.
  2. Environment Management – Run models in different targets (dev/stage/prod) from one self-contained flow (see the sketch after this list).
  3. Dynamic Scaling – Execute heavy dbt builds on serverless containers or Kubernetes clusters.
  4. Dependency Tracking – Automatically parse manifest.json to visualize model relationships.
  5. Integrated Testing – Add data quality checks between dbt models using Python or SQL.
  6. CI/CD Pipelines – Deploy model changes to multiple Kestra namespaces or Git branches.
  7. Multi-Project Support – Coordinate multiple dbt projects declaratively in one flow.
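
For example, a single flow can expose the dbt target as an input and pass it to the dbt CLI. The snippet below is a minimal sketch; it assumes the dbt project is already available to the task (for instance as namespace files, a pattern shown later), and the input and target names are illustrative:

yaml
inputs:
  - id: target
    type: SELECT
    defaults: dev
    values:
      - dev
      - prod

tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    namespaceFiles:
      enabled: true
    commands:
      # run against the target selected at execution time
      - dbt build --target {{ inputs.target }}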

Example: dbt Project Orchestration

Below are common patterns to orchestrate dbt workflows using Kestra.

Fetch dbt Project from Git at Runtime

The example below shows a simple flow that runs dbt build for DuckDB in a Docker container. Note that the dbt project is cloned from a Git repository at runtime to ensure the latest version is used.

yaml
id: dbt_duckdb
namespace: company.team.dbt

tasks:
  - id: dbt
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/dbt-example
        branch: main

      - id: dbt_build
        type: io.kestra.plugin.dbt.cli.DbtCLI
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
        commands:
          - dbt deps
          - dbt build
        profiles: |
          my_dbt_project:
            outputs:
              dev:
                type: duckdb
                path: ":memory:"
                fixed_retries: 1
                threads: 16
                timeout_seconds: 300
            target: dev

Sync dbt Project from Git to Kestra's Namespace Files

You can sync the dbt project from a Git branch to Kestra's namespace and iterate on the models from the integrated code editor in the Kestra UI.

yaml
id: dbt_build
namespace: company.team.dbt

tasks:
  - id: sync
    type: io.kestra.plugin.git.SyncNamespaceFiles
    url: https://github.com/kestra-io/dbt-example
    branch: master
    namespace: "{{ flow.namespace }}"
    gitDirectory: dbt
    dryRun: false

  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    namespaceFiles:
      enabled: true
      exclude:
        - profiles.yml
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    commands:
      - dbt build
    profiles: |
      my_dbt_project:
        outputs:
          prod:
            type: duckdb
            path: ":memory:"
            schema: main
            threads: 8
        target: prod

You can use the above flow as an initial setup:

  1. Add this flow in the Kestra UI
  2. Save it
  3. Execute the flow
  4. Click the Files sidebar in the code editor to view the uploaded dbt files

(Screenshot: dbt project files in Kestra's integrated code editor)

You can then set disabled: true within the first task after the first execution to avoid re-syncing the project. This allows you to iterate on the models without cloning the repository every time.
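
For example, once the project has been uploaded, the sync task from the flow above can be turned off without removing it:

yaml
  - id: sync
    type: io.kestra.plugin.git.SyncNamespaceFiles
    url: https://github.com/kestra-io/dbt-example
    branch: master
    namespace: "{{ flow.namespace }}"
    gitDirectory: dbt
    disabled: true # skip this task on subsequent executions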

With the Code Editor built into Kestra, you can easily manage dbt projects by cloning the dbt Git repository, and uploading it to your Kestra namespace. You can make changes to the dbt models directly from the Kestra UI, test them as part of an end-to-end workflow, and push the changes to the desired Git branch when you are ready.

Run dbt CLI, iterate on models, and push changes to Git
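
A minimal sketch of that final push step, assuming the project was synced to namespace files as shown above and that a personal access token is stored as a Kestra secret (the secret name, branch, and commit message are illustrative):

yaml
  - id: push
    type: io.kestra.plugin.git.PushNamespaceFiles
    url: https://github.com/kestra-io/dbt-example
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    branch: feature/dbt-models
    namespace: "{{ flow.namespace }}"
    gitDirectory: dbt
    commitMessage: "Update dbt models from Kestra"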

Kestra Features to Orchestrate dbt Workflows

Git Integration

Clone dbt projects from any Git provider:

yaml
- id: clone
  type: io.kestra.plugin.git.Clone
  url: https://github.com/kestra-io/dbt-example
  branch: main

Best-in-class log navigation across dbt models

Kestra automatically parses the manifest.json file and surfaces each dbt model in the Execution Gantt chart, giving you visibility into every model's build status, duration, and logs. You can browse all logs in one place, without manually navigating to each dbt model, and jump straight to the next INFO, WARN, or ERROR entry thanks to the best-in-class log navigation feature.

(Screenshot: dbt model logs in Kestra's UI)

Manifest Tracking

Store dbt artifacts between runs in the integrated KV Store:

yaml
tasks:
  - id: dbt
    type: io.kestra.plugin.dbt.cli.DbtCLI
    commands:
      - dbt build
    namespaceFiles:
      enabled: true
    loadManifest:
      key: manifest.json
      namespace: "{{ flow.namespace }}"
    storeManifest:
      key: manifest.json
      namespace: "{{ flow.namespace }}"

Custom Quality Checks

Add quality checks that validate dbt models using plugins such as Soda:

yaml
  - id: scan
    type: io.kestra.plugin.soda.Scan
    configuration:
      # ...
    checks:
      checks for orderDetail:
        - row_count > 0
        - max(unitPrice):
            warn: when between 1 and 250
            fail: when > 250
      checks for territory:
        - row_count > 0
        - failed rows:
            name: Failed rows query test
            fail condition: regionId = 4
    requirements:
      - soda-core-bigquery

Multi-Project Coordination

If needed, you can orchestrate multiple dbt projects from a single flow:

yaml
- id: core
  type: io.kestra.plugin.dbt.cli.DbtCLI
  projectDir: dbt-core
  commands:
    - dbt build

- id: marts
  type: io.kestra.plugin.dbt.cli.DbtCLI
  projectDir: dbt-marts
  commands:
    - dbt build

Scale dbt Workflows in the Cloud

Adding the following pluginDefaults to your flow (or namespace) offloads computationally heavy dbt builds to AWS ECS Fargate, Google Cloud Batch, Azure Batch, or a Kubernetes job by leveraging Kestra's task runners:

yaml
pluginDefaults:
  - type: io.kestra.plugin.dbt.cli.DbtCLI
    values:
      taskRunner:
        type: io.kestra.plugin.ee.aws.runner.Batch
        region: us-east-1
        accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
        secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
        computeEnvironmentArn: "arn:aws:batch:us-east-1:123456789:compute-environment/kestra"
        jobQueueArn: "arn:aws:batch:us-east-1:123456789:job-queue/kestra"
        executionRoleArn: "arn:aws:iam::123456789:role/ecsTaskExecutionRole"
        taskRoleArn: "arn:aws:iam::123456789:role/ecsTaskRole"
        bucket: kestra-us

You can set plugin defaults at the flow, namespace, or global level to apply to all tasks of that type, ensuring that all dbt tasks run on AWS ECS Fargate in a given environment.


Getting Started with dbt Orchestration

  1. Install Kestra – Follow the quick start guide or the full installation instructions for production environments.
  2. Write Your Workflows – Configure your flow in YAML, declaring inputs, tasks, and triggers. Use one of the patterns above to sync dbt projects from Git, run dbt CLI commands, and push changes back to Git (a complete end-to-end sketch follows this list).
  3. Configure Environments – Set up dbt profiles for different targets based on your dbt project setup:
    yaml
    - id: dbt
      type: io.kestra.plugin.dbt.cli.DbtCLI
      containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
      profiles: |
        my_dbt_project:
          outputs:
            prod:
              type: duckdb
    
  4. Add Execution Triggers – Schedule dbt runs or trigger them based on upstream data availability:
    yaml
    triggers:
      - id: schedule
        type: io.kestra.plugin.core.trigger.Schedule
        cron: "0 9 * * 1-5"  # Weekdays at 9 AM
    
  5. Monitor Runs – Track dbt models and their execution durations in Kestra's UI.
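
Putting these pieces together, a minimal end-to-end flow might look like the sketch below. It reuses the example repository and in-memory DuckDB profile from the patterns above; adjust the repository URL, profile, and schedule to your own setup:

yaml
id: dbt_scheduled_build
namespace: company.team.dbt

tasks:
  - id: sync
    type: io.kestra.plugin.git.SyncNamespaceFiles
    url: https://github.com/kestra-io/dbt-example
    branch: main
    namespace: "{{ flow.namespace }}"
    gitDirectory: dbt

  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    namespaceFiles:
      enabled: true
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    commands:
      - dbt deps
      - dbt build
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * 1-5" # weekdays at 9 AM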

Next Steps
