About this blueprint
CLI Ingest
This flow starts a batch job that syncs data from HackerNews to DuckDB with CloudQuery. The sync relies on CloudQuery source and destination plugins: the source plugin extracts data from HackerNews, and the destination plugin loads it into DuckDB.
This flow uses Kestra templating to dynamically configure the start time of the sync process to be 1 day before the scheduled date, allowing you to also backfill data for past intervals in small batches.
Alternatively, you can toggle the incremental flag to true to enable incremental sync. During incremental syncs, Kestra will store the CloudQuery cursor in the Kestra backend, and will use it to always start the sync process from the last cursor position.
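As a minimal sketch of that toggle, the fragment below shows the same task with incremental set to true. It reuses only the task id, type, and env entry from this blueprint. Whether to keep or drop the templated start_time once the cursor takes over is not stated here, so the configs are left out of the fragment and should be treated as unchanged.

```yaml
tasks:
  - id: hn_to_duckdb
    type: io.kestra.plugin.cloudquery.Sync
    env:
      CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
    # Changed from false: Kestra keeps the CloudQuery cursor in its backend
    # and resumes each sync from the last stored cursor position.
    incremental: true
    # configs: same source and destination entries as in the full flow
```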
To get started configuring CloudQuery sources and destinations, go to the CloudQuery Integrations page. There you can select your desired source and destination and see the YAML configuration, which you can copy and paste into your own file or into a Kestra flow. The page also provides more detailed instructions and references for the tables available in each plugin.
Note that you can generate an API key to use premium plugins. You can add the API key as an environment variable:
  - id: hn_to_duckdb
    type: io.kestra.plugin.cloudquery.Sync
    env:
      CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"

id: cloudquery-sync
namespace: company.team

tasks:
  - id: hn_to_duckdb
    type: io.kestra.plugin.cloudquery.Sync
    env:
      CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
    incremental: false
    configs:
      - kind: source
        spec:
          name: hackernews
          path: cloudquery/hackernews
          version: v3.0.13
          tables:
            - "*"
          destinations:
            - duckdb
          spec:
            item_concurrency: 100
            start_time: "{{ trigger.date ?? execution.startDate | dateAdd(-1, 'DAYS') }}"
      - kind: destination
        spec:
          name: duckdb
          path: cloudquery/duckdb
          version: v4.2.10
          write_mode: overwrite-delete-stale
          spec:
            connection_string: hn.db

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "@daily"
    timezone: US/Eastern