Schedule a Python data ingestion job to extract data from an API and load it into DuckDB using dlt (data load tool) from dltHub

Source

```yaml
id: dlt-elt-chess-api-to-duckdb
namespace: company.team

tasks:
  - id: chess_api_to_duckdb
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    beforeCommands:
      - pip install "dlt[duckdb]"
    warningOnStdErr: false
    script: |
      import dlt
      import requests

      # Configure a pipeline that writes to a local DuckDB database
      pipeline = dlt.pipeline(
          pipeline_name='chess_pipeline',
          destination='duckdb',
          dataset_name='player_data'
      )

      # Fetch each player's public profile from the Chess.com API
      data = []
      for player in ['magnuscarlsen', 'rpragchess']:
          response = requests.get(
              f'https://api.chess.com/pub/player/{player}',
              timeout=30,
          )
          response.raise_for_status()
          data.append(response.json())

      # Extract, normalize, and load the data
      pipeline.run(data, table_name='player')

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    disabled: true
    cron: 0 9 * * *
```
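
The cron expression 0 9 * * * in the trigger has five fields: minute 0, hour 9, and a wildcard for day of month, month, and day of week, which means every day at 09:00. As a rough illustration of what that implies for run times, the sketch below computes the next matching time with the Python standard library. The helper name next_daily_run is hypothetical and this is not how Kestra's scheduler is implemented; it also ignores time zones.

```python
from datetime import datetime, timedelta


def next_daily_run(now: datetime, hour: int = 9, minute: int = 0) -> datetime:
    """Return the first time strictly after `now` that falls on hour:minute.

    Hypothetical helper mirroring a daily cron such as `0 9 * * *`;
    time zones are deliberately ignored in this sketch.
    """
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now): use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


print(next_daily_run(datetime(2024, 1, 1, 8, 30)))  # 2024-01-01 09:00:00
print(next_daily_run(datetime(2024, 1, 1, 9, 0)))   # 2024-01-02 09:00:00
```

A run requested at 08:30 lands on the same day, while one at exactly 09:00 rolls over to the next day, because the helper looks for the first slot strictly after the current time.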

About this blueprint

Trigger Ingest Data Schedule

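This flow uses dlt to load player profile data from the Chess.com API into a DuckDB destination. A Schedule trigger with the cron expression 0 9 * * * runs the flow daily at 9 AM. The trigger is defined with disabled: true, so it must be enabled before scheduled runs start.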
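
The script's "extract, normalize, and load" step hands raw JSON records to dlt, which turns them into table rows. One part of normalization is flattening nested objects into columns; dlt's documented convention joins nested keys with a double underscore. The sketch below illustrates that idea only. It is not dlt's implementation, the flatten helper and the sample record are hypothetical, and lists (which dlt handles separately) are left untouched.

```python
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = "__") -> dict[str, Any]:
    """Flatten nested dicts into a single level, joining keys with `sep`.

    Illustrative only: lists and other non-dict values are copied as-is.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical record; the field names are made up for the example.
sample = {"username": "example_player", "stats": {"wins": 10, "losses": 2}}
print(flatten(sample))
# {'username': 'example_player', 'stats__wins': 10, 'stats__losses': 2}
```

Each nested key becomes its own column, so the loaded table can be queried without JSON functions.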

