GcpBatchTaskRunner

yaml
type: "io.kestra.plugin.gcp.runner.GcpBatchTaskRunner"

Task runner that executes a task inside a job in Google Cloud Batch.

This task runner is container-based, so the containerImage property must be set. To use it, you need the 'Batch Job Editor' and 'Logs Viewer' roles.

To access the task's working directory, use the {{workingDir}} Pebble expression or the WORKING_DIR environment variable. Input files and namespace files are available in this directory.

To generate output files, you can either list them in the task's outputFiles property and create a file with the same name in the task's working directory, or create any file in the output directory, which is accessible through the {{outputDir}} Pebble expression or the OUTPUT_DIR environment variable.
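As a minimal sketch of the second option, the flow below writes a file straight into the output directory. It assumes the output directory expression is {{outputDir}}, that the vars.projectId, vars.region and vars.bucket variables are defined elsewhere in the flow, and that the ubuntu image is acceptable; the flow id and file name are illustrative only.

```yaml
id: gcp-batch-output-dir
namespace: myteam

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      # A bucket is needed so the output file can be sent back
      bucket: "{{vars.bucket}}"
    commands:
      # Any file created in the output directory is collected as an output file
      - echo "done" > {{outputDir}}/result.txt
```

With this approach there is no need to declare result.txt in outputFiles.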

To use the inputFiles, outputFiles or namespaceFiles properties, make sure to set the bucket property. The bucket serves as an intermediary storage layer for the task runner. Input and namespace files are uploaded to the Cloud Storage bucket before the task run. Similarly, the task runner stores output files in this bucket during the task run. At the end, the task runner sends those files to internal storage, which makes them available for download and preview from the UI.

To make it easier to track where all files are stored, the task runner generates a folder for each task run. You can access that folder using the {{bucketPath}} Pebble expression or the BUCKET_PATH environment variable.
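The following sketch shows one way the bucket might be combined with namespace files. It assumes that the script task exposes namespaceFiles with an enabled flag, that the variables under vars are defined elsewhere in the flow, and that the ubuntu image is suitable; the flow id is illustrative only.

```yaml
id: gcp-batch-namespace-files
namespace: myteam

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    namespaceFiles:
      enabled: true
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      # Without a bucket, namespace files cannot be staged for the job
      bucket: "{{vars.bucket}}"
    commands:
      # Namespace files are made available in the working directory
      - ls {{workingDir}}
      # Print the per-run folder created in the bucket
      - echo "Run folder is $BUCKET_PATH"
```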

Warning: unlike other task runners, this task runner does not run the task in the working directory but in the root directory. You must use the {{workingDir}} Pebble expression or the WORKING_DIR environment variable to access files.
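Because the task starts in the root directory, relative paths such as data.txt will not resolve. The fragment below is a sketch of two equivalent ways to reference an input file, assuming a file named data.txt was passed through inputFiles.

```yaml
    commands:
      # Resolved by Kestra before the job is submitted
      - cat {{workingDir}}/data.txt
      # Resolved by the shell inside the container at run time
      - cat "$WORKING_DIR/data.txt"
```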

Note that when the Kestra Worker running this task is terminated, the batch job will still run until completion.

Examples

Execute a Shell command.

yaml
id: new-shell
namespace: myteam

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
    commands:
    - echo "Hello World"

Pass input files to the task, execute a Shell command, then retrieve output files.

yaml
id: new-shell-with-file
namespace: myteam

inputs:
  - id: file
    type: FILE

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{inputs.file}}"
    outputFiles:
      - out.txt
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      bucket: "{{vars.bucket}}"
    commands:
    - cp {{workingDir}}/data.txt {{workingDir}}/out.txt

Properties

delete

  • Type: boolean
  • Dynamic:
  • Required: ✔️
  • Default: true

Whether the job should be deleted upon completion.

machineType

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️
  • Default: e2-medium

The GCP machine type.

See https://cloud.google.com/compute/docs/machine-types
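For illustration, a task runner block that overrides the default machine type might look like the fragment below. The e2-standard-4 value is only an example; pick a type that is available in the chosen region.

```yaml
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      # Overrides the e2-medium default
      machineType: e2-standard-4
```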

region

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The GCP region.

bucket

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

Google Cloud Storage bucket used to upload input files (inputFiles and namespaceFiles) and download output files (outputFiles).

A bucket is mandatory if you use any of these properties.

entryPoint

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required: ❌

Container entrypoint to use.

networkInterfaces

Network interfaces.

projectId

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

The GCP project ID.

reservation

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

Compute reservation.

scopes

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required: ❌
  • Default: [https://www.googleapis.com/auth/cloud-platform]

The GCP scopes to be used.

serviceAccount

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

The GCP service account key.
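Rather than pasting the key into the flow, it can be read from a secret. The fragment below is a sketch that assumes Kestra's secret() Pebble function is available in your deployment and that a secret named GCP_SERVICE_ACCOUNT_JSON (a hypothetical name) holds the key.

```yaml
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      # GCP_SERVICE_ACCOUNT_JSON is a hypothetical secret name
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
```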

waitUntilCompletion

  • Type: string
  • Dynamic:
  • Required: ❌
  • Default: 3600.000000000 (seconds, i.e. 1 hour)
  • Format: duration

The maximum duration to wait for the job to complete. Google Cloud Batch automatically times out the job once this duration is reached, and the task is marked as failed.
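As a sketch, the fragment below shortens the wait to 30 minutes, assuming the duration format accepts an ISO 8601 value such as PT30M.

```yaml
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      # ISO 8601 duration; the job is timed out after 30 minutes
      waitUntilCompletion: PT30M
```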

Definitions

io.kestra.plugin.gcp.runner.GcpBatchTaskRunner-NetworkInterface

Properties

network

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

Network identifier with the format projects/HOST_PROJECT_ID/global/networks/NETWORK.

subnetwork

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

Subnetwork identifier with the format projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNET.
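The fragment below sketches how these two properties might be set on the task runner. It assumes networkInterfaces takes a list of NetworkInterface entries; HOST_PROJECT_ID, NETWORK, REGION and SUBNET are placeholders to replace with your own values.

```yaml
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{vars.projectId}}"
      region: "{{vars.region}}"
      networkInterfaces:
        # network is required, subnetwork is optional
        - network: projects/HOST_PROJECT_ID/global/networks/NETWORK
          subnetwork: projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNET
```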