Source
id: s3-map-over-objects
namespace: company.team

inputs:
  - id: bucket
    type: STRING
    defaults: declarative-data-orchestration

tasks:
  # List every object under the powerplant/ prefix of the input bucket.
  - id: list_objects
    type: io.kestra.plugin.aws.s3.List
    bucket: "{{ inputs.bucket }}"
    prefix: powerplant/
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
    region: "{{ secret('AWS_DEFAULT_REGION') }}"

  # Log the full list of objects returned by the previous task.
  - id: print_objects
    type: io.kestra.plugin.core.log.Log
    message: "Found objects {{ outputs.list_objects.objects }}"

  # Run the child tasks once per object; a concurrencyLimit of 0 removes the cap on parallel iterations.
  - id: map_over_s3_objects
    type: io.kestra.plugin.core.flow.ForEach
    concurrencyLimit: 0
    values: "{{ outputs.list_objects.objects }}"
    tasks:
      - id: filename
        type: io.kestra.plugin.core.log.Log
        message: "Filename {{ json(taskrun.value).key }} with size {{ json(taskrun.value).size }}"
About this blueprint
Tags: S3, Parallel, Variables, Inputs, Outputs
This flow lists the objects under a given prefix in an S3 bucket, then iterates over them in parallel and logs each object's key and size.
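If unbounded parallelism is too aggressive for a large bucket, the same ForEach task can be capped. The fragment below is a minimal sketch of that variant, not part of the original blueprint: it assumes that a positive `concurrencyLimit` bounds the number of iterations running at once (with 1 meaning sequential and 0 meaning no limit), and the value 4 is an arbitrary illustration.

```yaml
  - id: map_over_s3_objects
    type: io.kestra.plugin.core.flow.ForEach
    # Assumption: at most 4 iterations run at the same time.
    # The original flow uses 0, which runs all iterations in parallel.
    concurrencyLimit: 4
    values: "{{ outputs.list_objects.objects }}"
    tasks:
      - id: filename
        type: io.kestra.plugin.core.log.Log
        message: "Filename {{ json(taskrun.value).key }} with size {{ json(taskrun.value).size }}"
```

Only the `concurrencyLimit` line differs from the flow above; the child task still reads the current object from `taskrun.value`.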
This flow assumes that AWS credentials are stored as the secrets AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION.
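One way to provide these secrets, sketched here under assumptions, is through environment variables on the Kestra server. The fragment assumes an open-source Kestra instance run with Docker Compose, where a secret is read from an environment variable named `SECRET_` plus the secret key and holding a base64-encoded value. The service name `kestra` and the placeholder values are hypothetical; other editions or secret backends are configured differently.

```yaml
services:
  kestra:  # hypothetical service name in your docker-compose.yml
    environment:
      # Each value must be the base64 encoding of the real secret (placeholders shown).
      SECRET_AWS_ACCESS_KEY_ID: "<base64 of your access key ID>"
      SECRET_AWS_SECRET_ACCESS_KEY: "<base64 of your secret access key>"
      SECRET_AWS_DEFAULT_REGION: "<base64 of your region name>"
```

With variables like these in place, `{{ secret('AWS_ACCESS_KEY_ID') }}` in the flow would resolve to the decoded value at runtime.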