Source

id: airbyte-cloud-dbt
namespace: company.team

tasks:
  - id: data_ingestion
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: salesforce
        type: io.kestra.plugin.airbyte.cloud.jobs.Sync
        connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12ab

      - id: google_analytics
        type: io.kestra.plugin.airbyte.cloud.jobs.Sync
        connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12cd

      - id: facebook_ads
        type: io.kestra.plugin.airbyte.cloud.jobs.Sync
        connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12ef

  - id: dbt
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/dbt-demo
        branch: main

      - id: dbt_build
        type: io.kestra.plugin.dbt.cli.Build
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        dbtPath: /usr/local/bin/dbt
        containerImage: ghcr.io/kestra-io/dbt-bigquery:latest
        inputFiles:
          .profile/profiles.yml: |
            jaffle_shop:
              outputs:
                dev:
                  type: bigquery
                  dataset: your_big_query_dataset_name
                  project: your_big_query_project
                  fixed_retries: 1
                  keyfile: sa.json
                  location: EU
                  method: service-account
                  priority: interactive
                  threads: 8
                  timeout_seconds: 300
              target: dev
          sa.json: "{{ secret('GCP_CREDS') }}"

pluginDefaults:
  - type: io.kestra.plugin.airbyte.cloud.jobs.Sync
    values:
      token: "{{ secret('AIRBYTE_CLOUD_API_TOKEN') }}"
About this blueprint
This blueprint orchestrates a modern ETL pipeline by combining parallel SaaS data ingestion with self-managed dbt Core transformations, using Airbyte Cloud for extraction and the dbt CLI for analytics modeling.
It performs the following actions:
- Runs multiple Airbyte Cloud syncs in parallel to ingest data from SaaS sources such as Salesforce, Google Analytics, and advertising platforms.
- Waits for all ingestion jobs to complete before starting transformations: tasks in a Kestra flow run sequentially, so the dbt task starts only after every sync in the Parallel group has finished.
- Clones a dbt project repository and executes dbt Core commands using a containerized runtime.
- Transforms raw ingested data into analytics-ready tables in a cloud data warehouse.
This pattern is designed for analytics engineering teams and data platform teams that want full control over dbt execution while still leveraging managed SaaS ingestion with Airbyte Cloud.
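Additional SaaS sources drop straight into the same Parallel group. As a sketch, a fourth sync would look like this (the task id and connection ID are placeholders); because pluginDefaults already injects the API token into every Sync task, no per-task credentials are needed:

      - id: hubspot
        type: io.kestra.plugin.airbyte.cloud.jobs.Sync
        # placeholder connection ID; copy the real one from your Airbyte Cloud workspace
        connectionId: 00000000-0000-0000-0000-000000000000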
Configuration:
- Add an Airbyte Cloud API token as a secret (AIRBYTE_CLOUD_API_TOKEN).
- Configure cloud warehouse credentials (for example, a GCP service account for BigQuery) as secrets.
- Update the Airbyte connectionId values to match your Airbyte Cloud workspace.
- Customize the dbt project repository, profiles, dataset, and execution settings to match your analytics environment.
- Optionally schedule this ETL pipeline (see the trigger sketch after this list) or run it on demand to support dashboards, reporting, or downstream data products.
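As a sketch of the scheduling option, a Schedule trigger added to the flow runs the whole pipeline on a cron cadence; the daily 6 a.m. schedule below is an arbitrary example:

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    # run every day at 06:00; pick a cadence that matches your reporting needs
    cron: "0 6 * * *"

The flow can also be executed on demand from the Kestra UI or API whenever dashboards need fresh data.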
By orchestrating ingestion and transformation in a single automation, this blueprint ensures fresh, consistent, and analytics-ready data for modern BI and analytics workloads.