Batch

This plugin is exclusively available on the Cloud and Enterprise editions of Kestra.

```yaml
type: "io.kestra.plugin.ee.gcp.runner.Batch"
```

Task runner that executes a task inside a job in Google Cloud Batch.

This plugin is only available in the Enterprise Edition (EE).

This task runner is container-based, so the containerImage property must be set. You need the 'Batch Job Editor' and 'Logs Viewer' roles to use it.

To access the task's working directory, use the {{ workingDir }} Pebble expression or the WORKING_DIR environment variable. Input files and namespace files will be available in this directory.

To generate output files, you can either use the task's outputFiles property and create a file with the same name in the task's working directory, or create any file in the output directory, which can be accessed with the {{ outputDir }} Pebble expression or the OUTPUT_DIR environment variable.

To use the inputFiles, outputFiles, or namespaceFiles properties, make sure to set the bucket property. The bucket serves as an intermediary storage layer for the task runner. Input and namespace files are uploaded to the Cloud Storage bucket before the task run. Similarly, the task runner stores outputFiles in this bucket during the task run. At the end, the task runner makes those files available for download and preview from the UI by sending them to internal storage.

The task runner will generate a folder in the configured bucket for each task run. You can access that folder using the {{ bucketPath }} Pebble expression or the BUCKET_PATH environment variable.
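As a minimal sketch of the output directory and bucket folder described above, the flow below writes a file to the output directory and prints the folder generated for the task run. The flow id, the `ubuntu` image, and the `vars.*` variables are illustrative assumptions, not values required by the plugin.

```yaml
id: batch-output-dir
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      bucket: "{{ vars.bucket }}" # needed as soon as output files are involved
    commands:
      # Any file created in the output directory is collected as an output file
      - echo "Hello World" > {{ outputDir }}/hello.txt
      # Print the folder generated in the bucket for this task run
      - echo "Bucket folder for this run is {{ bucketPath }}"
```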

Warning: unlike other task runners, this task runner does not run the task in the working directory but in the root directory. You must use the {{ workingDir }} Pebble expression or the WORKING_DIR environment variable to access files.
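Because commands start in the root directory, file paths need an explicit prefix. The sketch below reads the same input file through both the environment variable and the Pebble expression. The flow id, the image, the inline file content, and the `vars.*` variables are assumptions made for illustration.

```yaml
id: batch-working-dir
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    inputFiles:
      data.txt: |
        sample content
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      bucket: "{{ vars.bucket }}" # needed because inputFiles is used
    commands:
      # A bare "cat data.txt" would look in the root directory and fail
      - cat "$WORKING_DIR/data.txt"
      # Equivalent access through the Pebble expression
      - cat {{ workingDir }}/data.txt
```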

Note that when the Kestra Worker running this task is terminated, the batch job will still run until completion. After restarting, the Worker will resume processing the existing job unless resume is set to false.
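If reconnecting to an existing job is not wanted, the resume property can be turned off. The following is a sketch only; the flow id, the image, the `sleep` command, and the `vars.*` variables are illustrative assumptions.

```yaml
id: batch-no-resume
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      # Do not reconnect to an already existing job after a Worker restart
      resume: false
    commands:
      - sleep 600
```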

Examples

Execute a Shell command.

```yaml
id: new-shell
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
    commands:
      - echo "Hello World"
```

Pass input files to the task, execute a Shell command, then retrieve output files.

```yaml
id: new-shell-with-file
namespace: company.team

inputs:
  - id: file
    type: FILE

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{inputs.file}}"
    outputFiles:
      - out.txt
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      bucket: "{{vars.bucket}}"
    commands:
      - cp {{workingDir}}/data.txt {{workingDir}}/out.txt
```

Properties

`delete`

Type: boolean
Dynamic: ❌
Required: ✔️
Default: true

Whether the job should be deleted upon completion.

`machineType`

Type: string
Dynamic: ✔️
Required: ✔️
Default: e2-medium

The GCP machine type.

See https://cloud.google.com/compute/docs/machine-types

`region`

Type: string
Dynamic: ✔️
Required: ✔️

The GCP region.

`resume`

Type: boolean
Dynamic: ❌
Required: ✔️
Default: true

Whether to reconnect to the current job if it already exists.

`bucket`

Type: string
Dynamic: ✔️
Required: ❌

Google Cloud Storage Bucket to use to upload (inputFiles and namespaceFiles) and download (outputFiles) files.

It's mandatory to provide a bucket if you want to use such properties.

`completionCheckInterval`

Type: string
Dynamic: ❌
Required: ❌
Default: 5.000000000
Format: duration

Determines how often Kestra should poll the container for completion. By default, the task runner checks every 5 seconds whether the job is completed. You can set this to a lower value (e.g. PT0.1S = every 100 milliseconds) for quick jobs and to a higher value (e.g. PT1M = every minute) for long-running jobs. Setting this property to a higher value reduces the number of API calls Kestra makes to the remote service; keep that in mind if you see API rate limit errors.
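For a long-running job, a slower polling interval could look like the sketch below. The flow id, the image, the command, and the `vars.*` variables are assumptions; only completionCheckInterval is the point of the example.

```yaml
id: batch-slow-polling
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      # Check for completion once per minute instead of every 5 seconds
      completionCheckInterval: PT1M
    commands:
      - sleep 1800
```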

`computeResource`

Type: Batch-ComputeResource
Dynamic: ❌
Required: ❌

Compute resource requirements.

ComputeResource defines the amount of resources required for each task. Make sure your tasks have enough compute resources to successfully run. If you also define the types of resources for a job to use with the InstancePolicyOrTemplate field, make sure both fields are compatible with each other.

`entryPoint`

Type: array
SubType: string
Dynamic: ✔️
Required: ❌

Container entrypoint to use.

`lifecyclePolicies`

Type: array
SubType: Batch-LifecyclePolicy
Dynamic: ❌
Required: ❌

Lifecycle management schema applied when any task in a task group fails.

Currently, only one lifecycle policy is supported. When the lifecycle policy condition is met, the action in the policy is executed. If the task execution result does not match the defined lifecycle policy, the default policy applies: if the exit code is 0, the task exits; if the task ends with a non-zero exit code, the task is retried according to max_retry_count.

`maxRetryCount`

Type: integer
Dynamic: ❌
Required: ❌
Minimum: >= 0
Maximum: <= 10

Maximum number of retries on failures.

The default is 0, which means never retry.
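The retry count and a lifecycle policy can be combined. The sketch below retries failures up to three times but fails immediately on one specific exit code. The exit code `2`, the flow id, the image, the `./run-job.sh` script, and the `vars.*` variables are hypothetical, and the exact retry behavior is assumed to follow the lifecycle policy description above.

```yaml
id: batch-retry-policy
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      # Retry failed tasks up to 3 times (allowed range is 0 to 10)
      maxRetryCount: 3
      # Only one lifecycle policy is supported
      lifecyclePolicies:
        - action: FAIL_TASK
          actionCondition:
            exitCodes:
              - 2 # hypothetical exit code treated as non-retryable
    commands:
      - ./run-job.sh # hypothetical script name
```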

`networkInterfaces`

Type: array
SubType: Batch-NetworkInterface
Dynamic: ❌
Required: ❌

Network interfaces.

`projectId`

Type: string
Dynamic: ✔️
Required: ❌

The GCP project ID.

`reservation`

Type: string
Dynamic: ✔️
Required: ❌

Compute reservation.

`scopes`

Type: array
SubType: string
Dynamic: ✔️
Required: ❌
Default: [https://www.googleapis.com/auth/cloud-platform]

The GCP scopes to be used.

`serviceAccount`

Type: string
Dynamic: ✔️
Required: ❌

The GCP service account key.
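Since the property is dynamic, the key does not have to be written inline. The sketch below reads it from a Kestra secret with the `secret()` Pebble function. The secret name `GCP_SERVICE_ACCOUNT_JSON` is hypothetical, as are the flow id, the image, and the `vars.*` variables.

```yaml
id: batch-service-account
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      # Hypothetical secret holding the service account key
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    commands:
      - echo "Hello World"
```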

`waitForLogInterval`

Type: string
Dynamic: ❌
Required: ❌
Default: 5.000000000
Format: duration

Additional time after the job ends to wait for late logs.

`waitUntilCompletion`

Type: string
Dynamic: ❌
Required: ❌
Default: 3600.000000000
Format: duration

The maximum duration to wait for the job to complete. If the task's timeout property is set, it takes precedence over this property.

Google Cloud Batch automatically times out the job once this duration is reached, and the task fails.
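For jobs expected to exceed the one-hour default, the limit can be raised. This is a sketch under assumed values: the two-hour duration, the flow id, the image, the command, and the `vars.*` variables are illustrative.

```yaml
id: batch-long-job
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      # Allow up to 2 hours instead of the 1 hour default;
      # a timeout set on the task itself would take precedence
      waitUntilCompletion: PT2H
    commands:
      - sleep 5400
```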

Outputs

Definitions

`io.kestra.plugin.ee.gcp.runner.Batch-LifecyclePolicyAction`

Properties

`exitCodes`

Type: array
SubType: integer
Dynamic: ❌
Required: ❌

Exit codes of a task execution.

If there is more than one exit code, the condition is met and the action is executed when the task exits with any of the exit codes in the list.

`io.kestra.plugin.ee.gcp.runner.Batch-LifecyclePolicy`

Properties

`action`

Type: string
Dynamic: ❌
Required: ❌
Possible Values:
- ACTION_UNSPECIFIED
- RETRY_TASK
- FAIL_TASK
- UNRECOGNIZED

Action on task failures based on different conditions.

`actionCondition`

Type: Batch-LifecyclePolicyAction
Dynamic: ❌
Required: ❌

Conditions for actions to deal with task failures.

`io.kestra.plugin.ee.gcp.runner.Batch-NetworkInterface`

Properties

`network`

Type: string
Dynamic: ✔️
Required: ✔️

Network identifier with the format projects/HOST_PROJECT_ID/global/networks/NETWORK.

`subnetwork`

Type: string
Dynamic: ✔️
Required: ❌

Subnetwork identifier in the format projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNET.
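The two identifier formats above can be put together as follows. The project, network, subnetwork, and region names in the paths are placeholders invented for illustration, as are the flow id, the image, and the `vars.*` variables.

```yaml
id: batch-custom-network
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      networkInterfaces:
        # Placeholder identifiers; replace with your own project, network, and subnet
        - network: projects/my-host-project/global/networks/my-network
          subnetwork: projects/my-host-project/regions/europe-west1/subnetworks/my-subnet
    commands:
      - echo "Hello World"
```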

`io.kestra.plugin.ee.gcp.runner.Batch-ComputeResource`

Properties

`bootDisk`

Type: string
Dynamic: ❌
Required: ❌

Extra boot disk size for each task.

`cpu`

Type: string
Dynamic: ❌
Required: ❌

The milliCPU count.

Defines the amount of CPU resources per task in milliCPU units. For example, 1000 corresponds to 1 vCPU per task. If undefined, the default value is 2000. If you also define the VM's machine type using the machineType property in the InstancePolicy field or inside the instanceTemplate in the InstancePolicyOrTemplate field, make sure the CPU resources for both fields are compatible with each other and with how many tasks you want to allow to run on the same VM at the same time.

For example, if you specify the n2-standard-2 machine type, which has 2 vCPUs, you can set the cpu to no more than 2000. Alternatively, you can run two tasks on the same VM if you set the cpu to 1000 or less.

`memory`

Type: string
Dynamic: ❌
Required: ❌

Memory in MiB.

Defines the amount of memory per task in MiB units. If undefined, the default value is 2048. If you also define the VM's machine type using the machineType property in the InstancePolicy field or inside the instanceTemplate in the InstancePolicyOrTemplate field, make sure the memory resources for both fields are compatible with each other and with how many tasks you want to allow to run on the same VM at the same time.

For example, if you specify the n2-standard-2 machine type, which has 8 GiB of memory, you can set the memory to no more than 8192.
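The n2-standard-2 figures above (2 vCPUs, 8 GiB of memory) translate into a configuration along these lines. The chosen values request half of the VM per task; the flow id, the image, and the `vars.*` variables are illustrative assumptions.

```yaml
id: batch-compute-resource
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu # assumption: any image that provides a shell
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      machineType: n2-standard-2
      computeResource:
        # 1000 milliCPU is 1 vCPU, half of the 2 vCPUs of n2-standard-2
        cpu: "1000"
        # 4096 MiB is half of the 8192 MiB available on n2-standard-2
        memory: "4096"
    commands:
      - echo "Hello World"
```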
