Scrape an API in a Python task running in a Docker container and load the JSON document into a MongoDB collection

Source

yaml
id: json-from-api-to-mongodb
namespace: company.team

tasks:
  - id: generate_json
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/pydata:latest
    outputFiles:
      - output.json
    script: |
      import requests
      import json
      from kestra import Kestra

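      # Fail the task on a non-2xx response rather than storing an error payload
      response = requests.get("https://api.github.com", timeout=30)
      response.raise_for_status()
      data = response.json()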

      with open("output.json", "w") as output_file:
          json.dump(data, output_file)

      Kestra.outputs({'data': data, 'status': response.status_code})

  - id: load_to_mongodb
    type: io.kestra.plugin.mongodb.Load
    connection:
      uri: mongodb://host.docker.internal:27017/
    database: local
    collection: github
    from: "{{ outputs.generate_json.outputFiles['output.json'] }}"

About this blueprint

Ingest Python Outputs

This flow calls the GitHub API from a Python task running in a Docker container and writes the response to a JSON file. The JSON payload is then loaded into a MongoDB collection.
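The file hand-off between the two tasks can be tried locally before running the flow. The following is a minimal sketch, not part of the blueprint: the helper names `write_payload` and `read_payload` are hypothetical, and the sample dictionary is a stand-in for the real API response, so no network access or MongoDB instance is needed.

```python
import json
import tempfile
from pathlib import Path


def write_payload(data: dict, path: Path) -> int:
    """Write a payload to a JSON file, mirroring what the Python task does.

    Returns the size of the written file in bytes.
    """
    with open(path, "w") as output_file:
        json.dump(data, output_file)
    return path.stat().st_size


def read_payload(path: Path) -> dict:
    """Read the JSON file back, as a downstream consumer would."""
    with open(path) as input_file:
        return json.load(input_file)


if __name__ == "__main__":
    # Hypothetical stand-in for the API response; the real flow uses requests.get(...)
    sample = {"current_user_url": "https://api.github.com/user"}

    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "output.json"
        write_payload(sample, target)
        # The document read back should match what was written
        assert read_payload(target) == sample
```

Checking the round trip this way confirms that the payload is valid JSON and that the file name matches the entry under `outputFiles` before the load task is involved.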
