What's the difference between BigQuery Scheduled Queries and Kestra?

Scheduled Queries run a SQL statement on a cron inside BigQuery. They cannot load from GCS, wait on an Airbyte sync, retry one step, or chain to dbt and Hightouch. Kestra runs above the warehouse: it triggers io.kestra.plugin.gcp.bigquery.Query and LoadFromGcs on real events, refreshes only changed partitions, and gives one execution history across every tool. Kestra works with BigQuery, not instead of it.

Can Kestra load BigQuery when a file lands in GCS?

Yes. The io.kestra.plugin.gcp.gcs.Trigger polls a bucket prefix and fires on a new or updated object. The bigquery.LoadFromGcs task then loads it, and a dbt build can run against the loaded table immediately after. The trigger can MOVE or DELETE processed objects so the same file does not fire twice.

Can Kestra stream Pub/Sub events into BigQuery?

Yes. The pubsub.RealtimeTrigger starts one execution per message for low-latency inserts, while the batch pubsub.Trigger groups messages for micro-batch loads. A bigquery.Query insert (or a load job for larger batches) writes the rows, and messages are acked on success.

Is Kestra self-hosted or managed?

Both. Kestra can run fully self-hosted on Docker or Kubernetes, as a managed cloud service, or air-gapped for regulated environments. Unlike Cloud Composer, Kestra is not tied to GCP, so it can orchestrate BigQuery alongside AWS, Snowflake, and on-prem systems from one control plane.

How does Kestra avoid reprocessing a whole BigQuery table?

Target only the changed range. io.kestra.plugin.gcp.bigquery.DeletePartitions clears the affected date partitions, then bigquery.LoadFromGcs reloads just those partitions. Cost and runtime scale with the delta, not the table size, and because the range is cleared before it is loaded, a rerun is safe to repeat.

Can Kestra control BigQuery query cost?

Yes. BigQuery bills by bytes scanned, so a flow runs a dry-run query first, branches on the estimate with an If task, and only runs the real job under a threshold. Pairing the dry-run guardrail with partition-aware loads keeps both the bytes scanned and the runtime tied to the actual delta.

Run BigQuery jobs on the event that matters.

BigQuery runs the SQL and the warehouse. Kestra decides when each job fires, what feeds it, and how much it scans. Load on a GCS arrival, refresh only the changed partitions, chain with ingestion and dbt, and trace every query in one execution history.

Blueprints for BigQuery orchestration.

BigQuery Scheduled Queries fire SQL on a cron, but they do not load from object storage, chain to other tools, retry one step, or reprocess only the partitions that changed. Kestra runs above BigQuery: it loads the moment a file lands in GCS, reacts to Pub/Sub in real time, chains BigQuery with Airbyte ingestion and dbt transforms, and refreshes only the changed date range so cost tracks the delta. A transient transform retry never re-runs the load.

Run dbt transformations on BigQuery from Git

id: dbt-bigquery
namespace: company.team

tasks:
  - id: git
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/dbt-example
        branch: main

      - id: dbt
        type: io.kestra.plugin.dbt.cli.DbtCLI
        inputFiles:
          sa.json: "{{ secret('GCP_CREDS') }}"
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        containerImage: ghcr.io/kestra-io/dbt-bigquery:latest
        profiles: |
          my_dbt_project:
            outputs:
              dev:
                type: bigquery
                dataset: your_big_query_dataset_name
                project: your_big_query_project
                keyfile: sa.json
                location: EU
                method: service-account
                priority: interactive
                threads: 16
                timeout_seconds: 300
                fixed_retries: 1
            target: dev
        commands:
          - dbt deps
          - dbt build

Refresh only the partitions that changed

id: bigquery-incremental-partitions
namespace: company.team
description: |
  Refresh only the BigQuery partitions that changed instead of rescanning the
  whole table. For each day in a small rolling window, the flow loads that
  day's GCS files into the matching partition with a partition decorator and
  WRITE_TRUNCATE, so each partition is atomically replaced and cost tracks the
  delta. Reruns are safe because each load truncates just its own partition.

inputs:
  - id: day_offsets
    type: JSON
    defaults: [ 0, -1, -2 ]
    description: Days to refresh, counting back from today (0 = today, -1 =
      yesterday). Covers your late-arriving-data window.

tasks:
  - id: refresh_window
    type: io.kestra.plugin.core.flow.ForEach
    description: Replace each day's partition from its own GCS path, one at a time.
    values: "{{ inputs.day_offsets }}"
    concurrencyLimit: 1
    tasks:
      - id: load_partition
        type: io.kestra.plugin.gcp.bigquery.LoadFromGcs
        description: Load one day's files into that day's partition, truncating just
          that partition.
        allowFailure: true
        from:
          - "gs://{{ secret('GCS_BUCKET') }}/events/dt={{ now() |
            dateAdd(taskrun.value, 'DAYS') | date('yyyy-MM-dd') }}/*.parquet"
        destinationTable: "{{ secret('GCP_PROJECT_ID') }}.analytics.events${{ now() |
          dateAdd(taskrun.value, 'DAYS') | date('yyyyMMdd') }}"
        format: PARQUET
        writeDisposition: WRITE_TRUNCATE

  - id: notify
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    description: Confirm the incremental partition refresh.
    url: "{{ secret('SLACK_WEBHOOK_URL') }}"
    payload: |
      {
        "text": "BigQuery partition refresh complete for {{ inputs.day_offsets | length }} day(s) (execution {{ execution.id }})."
      }

pluginDefaults:
  - type: io.kestra.plugin.gcp.bigquery
    values:
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    description: Refresh the recent partitions on a cadence. Adjust or disable as needed.
    cron: "0 5 * * *"
    disabled: true

errors:
  - id: alert_on_failure
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    description: Alert Slack if the partition refresh fails.
    url: "{{ secret('SLACK_WEBHOOK_URL') }}"
    payload: |
      {
        "text": "BigQuery partition refresh failed (execution {{ execution.id }})."
      }

Micro-batch Pub/Sub events into BigQuery

id: bigquery-pubsub-streaming-insert
namespace: company.team
description: |
  Land Pub/Sub messages into BigQuery in near real time with micro-batching. A
  Pub/Sub trigger polls the subscription on a short interval and starts the flow
  with the batch of messages; the flow converts them to newline-delimited JSON
  and loads them into BigQuery in one append. Micro-batching keeps you well
  under BigQuery's per-row DML limits while staying fresh.

tasks:
  - id: to_json
    type: io.kestra.plugin.serdes.json.IonToJson
    description: Convert the consumed messages (ION) to newline-delimited JSON for the load.
    from: "{{ trigger.uri }}"

  - id: load_to_bigquery
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Append the batch of messages to a raw landing table, autodetecting
      the schema.
    from: "{{ outputs.to_json.uri }}"
    destinationTable: "{{ secret('GCP_PROJECT_ID') }}.analytics.pubsub_events_raw"
    format: JSON
    autodetect: true
    createDisposition: CREATE_IF_NEEDED
    writeDisposition: WRITE_APPEND

pluginDefaults:
  - type: io.kestra.plugin.gcp.bigquery
    values:
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"

triggers:
  - id: on_messages
    type: io.kestra.plugin.gcp.pubsub.Trigger
    description: Poll the subscription and start the flow with each batch. Enable
      once your secrets and subscription are set.
    projectId: "{{ secret('GCP_PROJECT_ID') }}"
    topic: events
    subscription: kestra-events-sub
    serdeType: JSON
    interval: PT1M
    maxRecords: 1000
    disabled: true

errors:
  - id: alert_on_failure
    type: io.kestra.plugin.slack.notifications.SlackIncomingWebhook
    description: Alert Slack if the batch load fails.
    url: "{{ secret('SLACK_WEBHOOK_URL') }}"
    payload: |
      {
        "text": "BigQuery load from Pub/Sub failed (execution {{ execution.id }})."
      }

Browse all 64 BigQuery blueprints

Above BigQuery Scheduled Queries.

BigQuery runs the SQL, the storage, and the slots. Kestra runs the steps around the query: what triggers it, what feeds it, how much it scans, and where the cross-tool audit trail lives.

Event-driven loads, not query cron

Scheduled Queries fire on a fixed cadence whether the source is ready or not. Kestra triggers a BigQuery job the moment the upstream step confirms: io.kestra.plugin.gcp.gcs.Trigger on a new object, pubsub.RealtimeTrigger per message, or bigquery.Trigger when a sentinel query returns rows. The load runs on the event, not the next tick.

Load, query, and extract orchestrated

Kestra runs bigquery.LoadFromGcs to load objects, bigquery.Query to run SQL jobs, and bigquery.ExtractToGcs to export, as ordered steps with per-step retries. A failed query retries without re-loading the data, and outputs flow forward to whatever reads the table next.

Partition-aware incremental refreshes

Reprocessing a whole table to update one day is slow and expensive. Kestra targets only the changed range: bigquery.DeletePartitions clears the affected dates, then bigquery.LoadFromGcs reloads just those partitions. Cost and runtime track the delta, and clearing before loading makes a rerun safe to repeat.
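A minimal sketch of the clear-then-reload pattern described above, for a single day (yesterday). The analytics.events table, the dt= bucket layout, and the secret names are assumptions borrowed from the blueprints on this page, and the flow id is hypothetical. The sketch also assumes the from/to bounds of DeletePartitions are inclusive, which is worth confirming against your plugin version before relying on it.

```yaml
id: bigquery-clear-and-reload   # hypothetical flow id
namespace: company.team

tasks:
  # Step 1: drop yesterday's partition so the reload cannot duplicate rows.
  # Assumption: from/to are inclusive, so equal bounds target one day.
  - id: clear_partition
    type: io.kestra.plugin.gcp.bigquery.DeletePartitions
    dataset: analytics
    table: events
    partitionType: DAY
    from: "{{ now() | dateAdd(-1, 'DAYS') }}"
    to: "{{ now() | dateAdd(-1, 'DAYS') }}"

  # Step 2: append only yesterday's files. Rows land in the cleared
  # partition, assuming the table is partitioned on a date column
  # that is present in the Parquet files.
  - id: reload_partition
    type: io.kestra.plugin.gcp.bigquery.LoadFromGcs
    from:
      - "gs://{{ secret('GCS_BUCKET') }}/events/dt={{ now() | dateAdd(-1, 'DAYS') | date('yyyy-MM-dd') }}/*.parquet"
    destinationTable: "{{ secret('GCP_PROJECT_ID') }}.analytics.events"
    format: PARQUET
    writeDisposition: WRITE_APPEND

pluginDefaults:
  - type: io.kestra.plugin.gcp.bigquery
    values:
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
```

If the bounds turn out not to behave this way, the partition-decorator blueprint on this page reaches the same idempotent result with WRITE_TRUNCATE and no separate delete step.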

Full data stack around the warehouse

BigQuery has no view of the Airbyte sync that fed it or the dbt models downstream. Kestra runs ingestion, the load, the transform, and the Hightouch activation as steps in one flow with job IDs flowing forward and one shared execution ID.

Cost guardrails before a query runs

BigQuery bills by bytes scanned. A flow runs a dry-run query first, branches on the estimate with an io.kestra.plugin.core.flow.If task, and only runs the real job under a threshold. Pairing the dry-run guardrail with partition-aware loads keeps both bytes scanned and runtime tied to the actual delta.
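One way the guardrail described above might look in a flow. This sketch swaps in a Python script task that calls the google-cloud-bigquery client for the dry run, because that client's dry_run option and total_bytes_processed field are well documented; it is not necessarily how the dry-run step is meant to be built with the bigquery.Query task. The flow id, the sample SQL, the event_date column, and the 10 GiB ceiling are all hypothetical.

```yaml
id: bigquery-cost-guardrail   # hypothetical flow id
namespace: company.team

inputs:
  - id: sql
    type: STRING
    defaults: "SELECT COUNT(*) FROM analytics.events WHERE event_date = CURRENT_DATE()"
  - id: max_gib
    type: INT
    defaults: 10   # hypothetical ceiling, in GiB

tasks:
  # Dry run: BigQuery plans the query and reports bytes without running it.
  - id: estimate
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.11-slim
    beforeCommands:
      - pip install --quiet google-cloud-bigquery kestra
    inputFiles:
      sa.json: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
      query.sql: "{{ inputs.sql }}"
    env:
      GOOGLE_APPLICATION_CREDENTIALS: sa.json
    script: |
      from google.cloud import bigquery
      from kestra import Kestra

      max_gib = {{ inputs.max_gib }}
      config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
      with open("query.sql") as f:
          job = bigquery.Client().query(f.read(), job_config=config)
      gib = job.total_bytes_processed / 2**30
      # Expose the estimate and the decision to downstream tasks.
      Kestra.outputs({"gib": gib, "within_budget": gib <= max_gib})

  # Branch: run the real query only when the estimate is under the ceiling.
  - id: guard
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs.estimate.vars.within_budget }}"
    then:
      - id: run_query
        type: io.kestra.plugin.gcp.bigquery.Query
        sql: "{{ inputs.sql }}"
    else:
      - id: skip
        type: io.kestra.plugin.core.log.Log
        message: "Skipped: estimate of {{ outputs.estimate.vars.gib }} GiB exceeds {{ inputs.max_gib }} GiB."

pluginDefaults:
  - type: io.kestra.plugin.gcp.bigquery
    values:
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
```

Doing the comparison inside the script keeps the If condition a plain boolean, so the branch does not depend on how the expression engine compares large or fractional numbers.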

Self-service queries for analysts

Analysts should not need console access to run a parameterized export. Kestra Apps wraps a Query flow in a typed form: pick the date range, run the query, get the result file. Every run lands in execution history with the requesting user, so self-service is still audited.

How teams use BigQuery and Kestra

Patterns cloud data teams run in production today. Each one shows the flow end to end, with the real plugin classes in play.

Event-driven

Load BigQuery when a file lands in GCS

The GCS Trigger fires on a new object, Kestra runs LoadFromGcs into BigQuery, then triggers dbt against the loaded table and confirms on Slack. The load runs on arrival, not on a clock.

Trigger on arrival

gcs.Trigger fires on CREATE or UPDATE under a prefix.

Load then transform

dbt runs only after LoadFromGcs confirms the table is loaded.

Move processed files

The trigger can MOVE or DELETE objects so they do not re-fire.

Flow: gcs trigger (object lands) → bigquery load (LoadFromGcs) → dbt build (transform) → slack (on complete)

Streaming

Stream Pub/Sub events into BigQuery

The pubsub.RealtimeTrigger starts one execution per message. Kestra routes by payload, runs a bigquery.Query insert, and acks the message. Use the batch pubsub.Trigger for grouped micro-batch inserts instead.

One execution per message

RealtimeTrigger fans each Pub/Sub message into its own run.

Route by payload

A Switch sends each message type down its own path.

Batch mode available

Swap to pubsub.Trigger for grouped micro-batch inserts.

Flow: pubsub trigger (per message) → route (by type) → bigquery insert (write row) → notify (on event)
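A rough sketch of the per-message pattern above, under several assumptions: messages are JSON objects carrying a type field, the payload is exposed to the flow as trigger.data, and a landing table analytics.orders_raw with received_at and payload columns already exists. The flow id, the message type "order", and the table are hypothetical. The topic and subscription names are reused from the micro-batch blueprint on this page.

```yaml
id: pubsub-realtime-to-bigquery   # hypothetical flow id
namespace: company.team

tasks:
  # Route each message by its (assumed) "type" field.
  - id: route
    type: io.kestra.plugin.core.flow.Switch
    value: "{{ trigger.data.type ?? 'unknown' }}"
    cases:
      order:
        # The raw triple-quoted literal keeps quotes and backslashes in the
        # JSON intact. Interpolating payloads into SQL is acceptable only
        # for a sketch with trusted publishers.
        - id: insert_order
          type: io.kestra.plugin.gcp.bigquery.Query
          sql: |
            INSERT INTO analytics.orders_raw (received_at, payload)
            VALUES (CURRENT_TIMESTAMP(), r'''{{ trigger.data | toJson }}''')
    defaults:
      - id: log_unrouted
        type: io.kestra.plugin.core.log.Log
        message: "No route for message type {{ trigger.data.type ?? 'unknown' }}"

pluginDefaults:
  - type: io.kestra.plugin.gcp.bigquery
    values:
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"

triggers:
  # One execution per message.
  - id: on_message
    type: io.kestra.plugin.gcp.pubsub.RealtimeTrigger
    projectId: "{{ secret('GCP_PROJECT_ID') }}"
    topic: events
    subscription: kestra-events-sub
    serdeType: JSON
```

One DML insert per message suits low volumes. At higher throughput the batch pubsub.Trigger with a load job, as in the micro-batch blueprint, is likely the better fit.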

Full stack

Ingest, load, transform, activate around BigQuery

Airbyte syncs into GCS, Kestra loads BigQuery, builds dbt models, then fires the Hightouch activation. One flow, one execution ID, step-local retries across every service.

One execution ID across services

GCS object, BigQuery job ID, and dbt results in one view.

Step retries stay local

A dbt retry never re-runs the load or the Airbyte sync.

Outputs flow forward

The load's destination table feeds the dbt and activation steps.

Flow: airbyte sync (ingest to GCS) → bigquery load (LoadFromGcs) → dbt build (transform) → hightouch sync (activate)

Incremental

Refresh only the partitions that changed

A daily load should not rescan the whole table. Kestra runs DeletePartitions for the affected date range, then LoadFromGcs reloads just those partitions. Cost and runtime track the delta, not the table size.

Touch only the delta

DeletePartitions plus LoadFromGcs reprocess just the changed dates.

Cost tracks the change

Runtime and bytes scanned scale with the delta, not the table.

Idempotent reloads

Clearing then loading the range makes a rerun safe to repeat.

Flow: schedule (daily window) → delete partitions (clear range) → load partitions (reload range) → notify (on complete)

Big data

Extract to GCS and run a Dataproc Spark batch

Kestra exports a BigQuery table to GCS with ExtractToGcs, submits a Spark batch to Dataproc Serverless, then loads the result back into BigQuery. The Spark job is one step in the flow, not a standalone island.

Spark as a step

PySparkSubmit runs on Dataproc Serverless inside the flow.

Extract and reload

BigQuery to GCS to Spark to BigQuery, all ordered.

One history for the batch

The extract, Spark job, and reload share one execution ID.

Flow: bigquery extract (to GCS) → dataproc spark (PySpark batch) → bigquery load (result back) → notify (on complete)
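The round trip above could be sketched as three ordered tasks. Everything specific here is an assumption for illustration: the flow id, the region, the spark/in and spark/out prefixes, the events_enriched table, and a transform.py script that reads from the input prefix and writes Parquet to the output prefix. The sketch also assumes the batch name must be lowercase and unique per run, and that the extract task accepts PARQUET as a format.

```yaml
id: bigquery-dataproc-roundtrip   # hypothetical flow id
namespace: company.team

tasks:
  # 1. Export the source table to GCS.
  - id: extract
    type: io.kestra.plugin.gcp.bigquery.ExtractToGcs
    sourceTable: "{{ secret('GCP_PROJECT_ID') }}.analytics.events"
    destinationUris:
      - "gs://{{ secret('GCS_BUCKET') }}/spark/in/events-*.parquet"
    format: PARQUET

  # 2. Run the Spark batch on Dataproc Serverless.
  #    transform.py is a hypothetical script with the paths built in.
  - id: spark
    type: io.kestra.plugin.gcp.dataproc.batches.PySparkSubmit
    name: "events-batch-{{ execution.id | lower }}"
    region: europe-west1
    mainPythonFileUri: "gs://{{ secret('GCS_BUCKET') }}/jobs/transform.py"

  # 3. Load the Spark output back into BigQuery, replacing the old result.
  - id: load_result
    type: io.kestra.plugin.gcp.bigquery.LoadFromGcs
    from:
      - "gs://{{ secret('GCS_BUCKET') }}/spark/out/*.parquet"
    destinationTable: "{{ secret('GCP_PROJECT_ID') }}.analytics.events_enriched"
    format: PARQUET
    writeDisposition: WRITE_TRUNCATE

# The broader prefix applies the same credentials to the BigQuery
# and Dataproc tasks.
pluginDefaults:
  - type: io.kestra.plugin.gcp
    values:
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
```

Because the tasks run in order inside one flow, a failure in the Spark step stops the execution before the reload, so the result table is never replaced with partial output.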

Kestra is the unifying layer for our data and workflows. You can start small, but then there is no limit to the possibilities and scalability of such an open architecture.

Julien Henrion, Head of Data Engineering at Leroy Merlin France

+900% increase in data production

5000+ workflows created

Kestra vs the orchestration alternatives BigQuery teams evaluate

Capability	Kestra	Scheduled Queries	Cloud Composer	Dagster
Load from GCS, trigger on arrival	gcs.Trigger + LoadFromGcs	SQL only, no loads	Sensors, Python required	Custom Python code
Chain BigQuery with dbt and Airbyte	Native plugins	SQL only	Operators, Python DAGs	Ops, Python required
Partition-aware incremental loads	DeletePartitions + LoadFromGcs	Manual SQL DML	Custom operator logic	Custom Python code
Dry-run cost check before a query	Branch on scanned bytes	No	Custom operator	Custom code
Run fully self-hosted, off GCP	Docker, Kubernetes, air-gapped	GCP only	Managed GCP only	Self-host or cloud
Self-service queries for analysts	Kestra Apps	No	No native form layer	No native form layer
Declarative YAML, IaC-friendly	YAML + Terraform provider	SQL + console	Python DAGs	Python assets
