## Source

```yaml
id: hubspot-to-bigquery
namespace: company.team

tasks:
  - id: sync
    type: io.kestra.plugin.cloudquery.Sync
    inputFiles:
      sa.json: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
    env:
      GOOGLE_APPLICATION_CREDENTIALS: sa.json
      HUBSPOT_APP_TOKEN: "{{ secret('HUBSPOT_API_TOKEN') }}"
      CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
    configs:
      - kind: destination
        spec:
          name: bigquery
          path: cloudquery/bigquery
          registry: cloudquery
          version: v3.3.16
          write_mode: append
          spec:
            project_id: kestra-prd
            dataset_id: hubspot
      - kind: source
        spec:
          name: hubspot
          path: cloudquery/hubspot
          registry: cloudquery
          version: v3.0.18
          destinations:
            - bigquery
          tables:
            - "*"
          spec:
            max_requests_per_second: 5

triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 6 * * *"
```
## About this blueprint
This flow syncs data from HubSpot CRM to BigQuery on a schedule. The `Sync` task from the CloudQuery plugin uses the `hubspot` source and the `bigquery` destination.

Note how we use the `sa.json` credentials file to authenticate with GCP and the `HUBSPOT_APP_TOKEN` environment variable to authenticate with HubSpot CRM.
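
The `secret()` calls assume these values are already available to Kestra's secrets backend. As a minimal sketch, assuming the open-source setup where Kestra reads secrets from base64-encoded environment variables prefixed with `SECRET_`, you could expose them through Docker Compose (the values below are placeholders, not real credentials):

```yaml
# docker-compose.yml excerpt -- placeholder values only.
# {{ secret('GCP_SERVICE_ACCOUNT') }} resolves the base64-encoded
# SECRET_GCP_SERVICE_ACCOUNT environment variable, and likewise
# for the other two secrets used in this flow.
services:
  kestra:
    image: kestra/kestra:latest
    environment:
      SECRET_GCP_SERVICE_ACCOUNT: "<base64-encoded service account JSON>"
      SECRET_HUBSPOT_API_TOKEN: "<base64-encoded HubSpot private app token>"
      SECRET_CLOUDQUERY_API_KEY: "<base64-encoded CloudQuery API key>"
```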
To avoid rate-limiting issues, you can set the `max_requests_per_second` parameter in the `hubspot` source configuration. In this example, we set it to 5 requests per second.
The `schedule` trigger runs the flow every day at 6:00 AM.
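
Since this is a standard five-field cron expression, changing the cadence only means editing that one line. For example, a hypothetical variant that runs every six hours instead of once a day:

```yaml
triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 */6 * * *"  # hypothetical: at minute 0 of every sixth hour
```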
Additionally, you can generate a CloudQuery API key to use premium plugins and pass it to the task as an environment variable:
```yaml
- id: hn_to_duckdb
  type: io.kestra.plugin.cloudquery.Sync
  env:
    CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
```