Blueprints

Ingest a JSON API payload into a DuckDB database using dlt

Source

yaml
id: dlt-json-to-duckdb
namespace: company.team

tasks:
  - id: dlt_pipeline
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:3.11-slim
    outputFiles:
      - dummy_products.duckdb
    beforeCommands:
      - pip install dlt[duckdb]
    warningOnStdErr: false
    script: |
      import dlt
      import requests

      response = requests.get('https://dummyjson.com/products')
      response.raise_for_status()
      data = response.json()['products']

      pipeline = dlt.pipeline(
          pipeline_name='dummyjson_products_pipeline',
          destination='duckdb',
          dataset_name='products'
      )

      pipeline.run(data, table_name='product')

About this blueprint

Ingest API

This flow demonstrates how to extract data from an API and load it into a DuckDB database using dlt. The entire workflow logic is contained in a single Python script that uses the dlt Python library to ingest the JSON file into a DuckDB database.

Script

Docker

More Related Blueprints

New to Kestra?

Use blueprints to kickstart your first workflows.

Get started with Kestra