Blueprints

Process files in parallel

Source

yaml
id: parallel-files
namespace: company.team

tasks:
  - id: bash
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    outputFiles:
      - out/**
    commands:
      - mkdir -p out
      - echo "Hello from 1" >> out/output1.txt
      - echo "Hello from 2" >> out/output2.txt
      - echo "Hello from 3" >> out/output3.txt
      - echo "Hello from 4" >> out/output4.txt

  - id: each
    type: io.kestra.plugin.core.flow.ForEach
    concurrencyLimit: 0
    values: "{{ outputs.bash.outputFiles | jq('.[]') }}"
    tasks:
      - id: path
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value }}"

      - id: contents
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - cat "{{ taskrun.value }}"

About this blueprint

Parallel Software Engineering Outputs

This example demonstrates how to process files in parallel. In the bash task, we generate multiple files, and store them in the internal storage. Next, in the each task, we process these files in parallel, where the tasks path and contents run for each of the 4 output files, resulting in 8 parallel task runs. Instead of the bash script, you may have a Python/R/Node.js script that generates such files.

Commands

Process

For Each

Return

More Related Blueprints

New to Kestra?

Use blueprints to kickstart your first workflows.

Get started with Kestra