Detect New Files in S3 and Process Them in Python

Source

```yaml
id: s3-trigger-python
namespace: company.team

variables:
  bucket: s3-bucket
  region: eu-west-2

tasks:
  - id: process_data
    type: io.kestra.plugin.scripts.python.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: ghcr.io/kestra-io/kestrapy:latest
    namespaceFiles:
      enabled: true
    inputFiles:
      input.csv: "{{ read(trigger.objects[0].uri) }}"
    outputFiles:
      - data.csv
    commands:
      - python process_data.py

triggers:
  - id: watch
    type: io.kestra.plugin.aws.s3.Trigger
    interval: "PT1S"
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
    region: "{{ vars.region }}"
    bucket: "{{ vars.bucket }}"
    filter: FILES
    action: MOVE
    moveTo:
      key: archive/
    maxKeys: 1
```

About this blueprint

S3 Trigger Python

This flow is triggered whenever a new file is detected in the specified S3 bucket. It downloads the file into Kestra's internal storage and moves the S3 object to an archive folder (i.e., an S3 object prefix named archive). The Python code reads the file as an inputFile named input.csv and processes it to generate a new file named data.csv. It's recommended to set the accessKeyId and secretKeyId properties as secrets; this flow assumes the AWS credentials are stored as secrets named AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
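
The contents of process_data.py are not part of the blueprint; the script is expected to exist as a namespace file alongside the flow (namespaceFiles is enabled on the task). A minimal sketch, assuming the job is a simple CSV transformation — the header normalization below is purely illustrative:

```python
# process_data.py -- a namespace file referenced by the flow's commands.
# Kestra mounts the triggering S3 object as input.csv (via inputFiles)
# and captures data.csv (via outputFiles) into internal storage.
# The transformation here is a hypothetical placeholder: it normalizes
# the header row and copies the data rows through unchanged.
import csv

with open("input.csv", newline="") as src, open("data.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader, [])
    writer.writerow([col.strip().lower() for col in header])
    writer.writerows(reader)
```

Any script that reads input.csv and writes data.csv fits this flow; data.csv is what Kestra collects as the task's output file.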

