Download a zip file, unzip it, and upload all files to AWS S3 in parallel, using pluginDefaults to avoid boilerplate code

Source

yaml

id: s3-parallel-uploads
namespace: company.team

inputs:
  - id: bucket
    type: STRING
    defaults: declarative-data-orchestration

tasks:
  - id: get_zip_file
    type: io.kestra.plugin.core.http.Download
    uri: https://wri-dataportal-prod.s3.amazonaws.com/manual/global_power_plant_database_v_1_3.zip

  - id: unzip
    type: io.kestra.plugin.compress.ArchiveDecompress
    algorithm: ZIP
    from: "{{ outputs.get_zip_file.uri }}"

  - id: parallel_upload_to_s3
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: csv
        type: io.kestra.plugin.aws.s3.Upload
        from: "{{ outputs.unzip.files['global_power_plant_database.csv'] }}"
        key: powerplant/global_power_plant_database.csv

      - id: pdf
        type: io.kestra.plugin.aws.s3.Upload
        from: "{{ outputs.unzip.files['Estimating_Power_Plant_Generation_in_the_Global_Power_Plant_Database.pdf'] }}"
        key: powerplant/Estimating_Power_Plant_Generation_in_the_Global_Power_Plant_Database.pdf

      - id: txt
        type: io.kestra.plugin.aws.s3.Upload
        from: "{{ outputs.unzip.files['RELEASE_NOTES.txt'] }}"
        key: powerplant/RELEASE_NOTES.txt

pluginDefaults:
  - type: io.kestra.plugin.aws.s3.Upload
    values:
      accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
      secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
      region: "{{ secret('AWS_DEFAULT_REGION') }}"
      bucket: "{{ inputs.bucket }}"

About this blueprint

This flow downloads a zip file, unzips it, and uploads all files to S3 in parallel.
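The upload tasks in the flow name each file in the archive explicitly. As a rough sketch only, the same step could iterate over whatever the archive contains. This assumes that the ForEach task with concurrencyLimit set to 0 places no cap on concurrent iterations, and that the keys filter returns the file names from the files output of the unzip task. Neither is part of the original blueprint, and the task ids upload_each_file and upload are illustrative.

```yaml
  # Hypothetical replacement for parallel_upload_to_s3 (not part of the original flow).
  - id: upload_each_file
    type: io.kestra.plugin.core.flow.ForEach
    # Assumption: outputs.unzip.files maps each file name to its internal storage URI,
    # as suggested by the outputs.unzip.files['...'] lookups in the flow.
    values: "{{ outputs.unzip.files | keys }}"
    concurrencyLimit: 0  # assumption: 0 removes the limit on concurrent iterations
    tasks:
      - id: upload
        type: io.kestra.plugin.aws.s3.Upload
        from: "{{ outputs.unzip.files[taskrun.value] }}"
        key: "powerplant/{{ taskrun.value }}"
```

The explicit Parallel version is easier to read when the archive contents are known in advance; a loop like this may fit better when the file list can change.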

This flow assumes that AWS credentials are stored as the secrets AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION. They are set once under pluginDefaults, so the individual Upload tasks do not repeat them.
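For comparison, this is roughly what one of the upload tasks would look like without pluginDefaults. The four connection properties would have to be repeated in each of the three Upload tasks.

```yaml
      - id: csv
        type: io.kestra.plugin.aws.s3.Upload
        # These four properties are what pluginDefaults sets once for every Upload task.
        accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
        secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
        region: "{{ secret('AWS_DEFAULT_REGION') }}"
        bucket: "{{ inputs.bucket }}"
        # Only these two properties differ from one upload to the next.
        from: "{{ outputs.unzip.files['global_power_plant_database.csv'] }}"
        key: powerplant/global_power_plant_database.csv
```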

The flow does not create the S3 bucket. It assumes that the bucket named in the bucket input already exists in the region given by AWS_DEFAULT_REGION.
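If the bucket might not exist yet, one option is to create it in a task that runs before the uploads. The sketch below assumes that the AWS plugin's CreateBucket task accepts the same connection properties as Upload. The task id create_bucket is illustrative, and the flow as published does not include this step.

```yaml
  # Hypothetical extra task, placed before parallel_upload_to_s3.
  - id: create_bucket
    type: io.kestra.plugin.aws.s3.CreateBucket
    # The pluginDefaults in this flow target only the Upload type,
    # so the connection properties are repeated here.
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
    region: "{{ secret('AWS_DEFAULT_REGION') }}"
    bucket: "{{ inputs.bucket }}"
```

Creating a bucket that already exists may fail, so a task like this probably suits a one-time setup flow better than a flow that runs repeatedly.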
