Batch

```yaml
type: "io.kestra.plugin.ee.aws.runner.Batch"
```

Task runner that executes a task inside a job in AWS Batch.

This task runner only supports ECS Fargate or ECS EC2 as compute environment. For EKS, use the Kubernetes Task Runner.

Make sure to set the containerImage property because this runner runs the task in a container.

To access the task's working directory, use the {{ workingDir }} Pebble expression or the WORKING_DIR environment variable. This directory will contain all input files and namespace files (if enabled).

To generate output files you can either use the outputFiles task property and create a file with the same name in the task's working directory, or create any file in the output directory which can be accessed using the {{ outputDir }} Pebble expression or the OUTPUT_DIR environment variable.

To use inputFiles, outputFiles or namespaceFiles properties, make sure to set the bucket property. The bucket serves as an intermediary storage layer for the task runner. Input and namespace files will be uploaded to the cloud storage bucket before the task run starts. Similarly, the task runner will store outputFiles in this bucket during the task run. In the end, the task runner will make those files available for download and preview from the UI by sending them to internal storage.

The task runner will generate a folder in the configured bucket for each task run. You can access that folder using the {{ bucketPath }} Pebble expression or the BUCKET_PATH environment variable.
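Putting these properties together, a task can produce output files either way. The sketch below is illustrative only; the bucket name, image, and ARN are placeholders:

```yaml
tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    outputFiles:
      - result.txt
    taskRunner:
      type: io.kestra.plugin.ee.aws.runner.Batch
      region: eu-west-2
      bucket: my-bucket # placeholder; required when using input/output files
      computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/myEnvironment"
    commands:
      # Option 1: create the file declared in outputFiles in the working directory
      - echo "done" > {{ workingDir }}/result.txt
      # Option 2: write any file to the output directory; it is captured automatically
      - echo "also captured" > {{ outputDir }}/extra.txt
```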

Note that this task runner executes the task in the root directory. You need to use the {{ workingDir }} Pebble expression or the WORKING_DIR environment variable to access files in the task's working directory.

Note that if the Kestra Worker running this task is terminated, the batch job will still run until completion; after the Worker restarts, it will resume processing of the existing job unless resume is set to false.

This task runner will return with an exit code according to the following mapping:

  • SUCCEEDED: 0
  • FAILED: 1
  • RUNNING: 2
  • RUNNABLE: 3
  • PENDING: 4
  • STARTING: 5
  • SUBMITTED: 6
  • OTHER: -1

To avoid zombie containers in ECS, you can set the timeout property on the task; Kestra will then terminate the batch job if the task is not completed within the specified duration.
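For example, a job can be capped at ten minutes like this (a minimal sketch; the image and ARN are placeholders):

```yaml
tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    timeout: PT10M # Kestra terminates the batch job if it runs longer than 10 minutes
    taskRunner:
      type: io.kestra.plugin.ee.aws.runner.Batch
      region: eu-west-2
      computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/myEnvironment"
    commands:
      - echo "Hello World"
```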

Examples

Execute a Shell command in a container on ECS Fargate.

```yaml
id: run_container
namespace: company.team

variables:
  region: eu-west-2
  computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/kestraFargateEnvironment"

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.ee.aws.runner.Batch
      accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
      secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
      region: "{{ vars.region }}"
      computeEnvironmentArn: "{{ vars.computeEnvironmentArn }}"
    commands:
      - echo "Hello World"
```

Pass input files to the task, execute a Shell command, then retrieve the output files.

```yaml
id: container_with_input_files
namespace: company.team

inputs:
  - id: file
    type: FILE

variables:
  region: eu-west-2
  bucket: my-bucket # placeholder; replace with your S3 bucket
  computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/kestraFargateEnvironment"

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{ inputs.file }}"
    outputFiles:
      - out.txt
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.aws.runner.Batch
      accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
      secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
      region: "{{ vars.region }}"
      bucket: "{{ vars.bucket }}"
      computeEnvironmentArn: "{{ vars.computeEnvironmentArn }}"
    commands:
      - cp {{ workingDir }}/data.txt {{ workingDir }}/out.txt
```

Properties

computeEnvironmentArn

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

Compute environment in which to run the job.

delete

  • Type: boolean
  • Dynamic:
  • Required: ✔️
  • Default: true

Whether the job should be deleted upon completion.

region

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

AWS region with which the SDK should communicate.

resources

  • Type: Batch-Resources
  • Dynamic:
  • Required: ✔️
  • Default: {request={memory=2048, cpu=1}}

Custom resources for the ECS Fargate container.

See the AWS documentation for more details.

resume

  • Type: boolean
  • Dynamic:
  • Required: ✔️
  • Default: true

Whether to reconnect to the current job if it already exists.

accessKeyId

  • Type: string
  • Dynamic: ✔️
  • Required:

Access Key Id in order to connect to AWS.

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.

bucket

  • Type: string
  • Dynamic: ✔️
  • Required:

S3 bucket used to upload input files (inputFiles and namespaceFiles) and to download output files (outputFiles).

It's mandatory to provide a bucket if you want to use such properties.

completionCheckInterval

  • Type: string
  • Dynamic:
  • Required:
  • Default: PT5S (5 seconds)
  • Format: duration

Determines how often Kestra should poll the container for completion. By default, the task runner checks every 5 seconds whether the job is completed. You can set this to a lower value (e.g. PT0.1S = every 100 milliseconds) for quick jobs and to a higher threshold (e.g. PT1M = every minute) for long-running jobs. Setting this property to a higher value will reduce the number of API calls Kestra makes to the remote service; keep that in mind in case you see API rate limit errors.
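For example, a long-running job could be polled once per minute (a sketch; the ARN is a placeholder):

```yaml
taskRunner:
  type: io.kestra.plugin.ee.aws.runner.Batch
  region: eu-west-2
  computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/myEnvironment"
  completionCheckInterval: PT1M # poll the job status once per minute
```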

endpointOverride

  • Type: string
  • Dynamic: ✔️
  • Required:

The endpoint with which the SDK should communicate.

This property allows you to use a different S3 compatible storage backend.

executionRoleArn

  • Type: string
  • Dynamic: ✔️
  • Required:

Execution role for the AWS Batch job.

Mandatory if the compute environment is ECS Fargate. See the AWS documentation for more details.

jobQueueArn

  • Type: string
  • Dynamic: ✔️
  • Required:

Job queue to use to submit jobs (ARN). If not specified, the task runner will create a job queue — keep in mind that this can lead to a longer execution.

secretKeyId

  • Type: string
  • Dynamic: ✔️
  • Required:

Secret Key Id in order to connect to AWS.

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.

sessionToken

  • Type: string
  • Dynamic: ✔️
  • Required:

AWS session token, retrieved from an AWS token service, used for authenticating that this user has received temporary permissions to access a given resource.

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.

stsEndpointOverride

  • Type: string
  • Dynamic: ✔️
  • Required:

The AWS STS endpoint with which the SDK client should communicate.

stsRoleArn

  • Type: string
  • Dynamic: ✔️
  • Required:

AWS STS Role.

The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
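For example, authenticating by assuming a role could look like this (a sketch; both ARNs and the session name are placeholders):

```yaml
taskRunner:
  type: io.kestra.plugin.ee.aws.runner.Batch
  region: eu-west-2
  computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/myEnvironment"
  stsRoleArn: "arn:aws:iam::123456789012:role/kestra-batch-role" # placeholder role ARN
  stsRoleSessionName: kestra-session
  stsRoleSessionDuration: PT30M # role session lasts 30 minutes instead of the 15-minute default
```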

stsRoleExternalId

  • Type: string
  • Dynamic: ✔️
  • Required:

AWS STS External Id.

A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.

stsRoleSessionDuration

  • Type: string
  • Dynamic:
  • Required:
  • Default: PT15M (15 minutes)
  • Format: duration

AWS STS Session duration.

The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.

stsRoleSessionName

  • Type: string
  • Dynamic: ✔️
  • Required:

AWS STS Session name.

This property is only used when an stsRoleArn is defined.

taskRoleArn

  • Type: string
  • Dynamic: ✔️
  • Required:

Task role to use within the container.

Needed if you want to authenticate with AWS CLI within your container.

waitUntilCompletion

  • Type: string
  • Dynamic:
  • Required:
  • Default: PT1H (1 hour)
  • Format: duration

The maximum duration to wait for job completion, unless the task's timeout property is set, in which case timeout takes precedence over this property.

AWS Batch will automatically time out the job upon reaching that duration, and the task will be marked as failed.
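For example, to allow a job up to two hours before it is failed (a sketch; the ARN is a placeholder):

```yaml
taskRunner:
  type: io.kestra.plugin.ee.aws.runner.Batch
  region: eu-west-2
  computeEnvironmentArn: "arn:aws:batch:eu-west-2:123456789012:compute-environment/myEnvironment"
  waitUntilCompletion: PT2H # fail the task if the job is not done within 2 hours
```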

Definitions

io.kestra.plugin.ee.aws.runner.Batch-Resource

Properties

cpu
  • Type: string
  • Dynamic:
  • Required: ✔️
memory
  • Type: string
  • Dynamic:
  • Required: ✔️

io.kestra.plugin.ee.aws.runner.Batch-Resources

Properties

request
  • Type: Batch-Resource
