Source
yaml
id: python-csv-each-parallel
namespace: company.team
tasks:
- id: csv
type: io.kestra.plugin.core.flow.ForEach
concurrencyLimit: 0
values:
- https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv
- https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- https://huggingface.co/datasets/kestra/datasets/raw/main/csv/salaries.csv
tasks:
- id: pandas
type: io.kestra.plugin.scripts.python.Script
warningOnStdErr: false
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: ghcr.io/kestra-io/pydata:latest
script: |
import pandas as pd
df = pd.read_csv("{{ taskrun.value }}")
df.info()
About this blueprint
Parallel Python
This flow reads a list of CSV files and processes each file in parallel in isolated Python scripts using Pandas.
More Related Blueprints