GcpBatchTaskRunner
This plugin is currently in beta. While it is considered safe for use, be aware that its API may change in backwards-incompatible ways in future releases, or the plugin may become unsupported.
type: "io.kestra.plugin.gcp.runner.GcpBatchTaskRunner"
Task runner that executes a task inside a job in Google Cloud Batch.
This task runner is container-based, so the `containerImage` property must be set.
You need the 'Batch Job Editor' and 'Logs Viewer' roles to be able to use it.
To access the task's working directory, use the `{{ workingDir }}` Pebble expression or the `WORKING_DIR` environment variable. Input files and namespace files will be available in this directory.
To generate output files, you can either use the `outputFiles` task property and create a file with the same name in the task's working directory, or create any file in the output directory, which can be accessed via the `{{ outputDir }}` Pebble expression or the `OUTPUT_DIR` environment variable.
To use the `inputFiles`, `outputFiles`, or `namespaceFiles` properties, make sure to set the `bucket` property. The bucket serves as an intermediary storage layer for the task runner: input and namespace files are uploaded to the Cloud Storage bucket before the task run, and the task runner stores `outputFiles` in this bucket during the run. Finally, the task runner makes those files available for download and preview from the UI by sending them to internal storage.
To make it easier to track where all files are stored, the task runner generates a dedicated folder for each task run. You can access that folder using the `{{ bucketPath }}` Pebble expression or the `BUCKET_PATH` environment variable.
Warning: unlike other task runners, this task runner does not run the task in the working directory but in the root directory. You must use the `{{ workingDir }}` Pebble expression or the `WORKING_DIR` environment variable to access files.
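For example, in a Shell task, both forms can be used interchangeably in commands (a sketch; the file name `data.txt` is hypothetical):

```yaml
commands:
  # Pebble expression, rendered by Kestra before execution
  - cat {{ workingDir }}/data.txt
  # environment variable, expanded by the shell inside the container
  - cat $WORKING_DIR/data.txt
```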
Note that when the Kestra Worker running this task is terminated, the batch job will still run until completion.
Examples
Execute a Shell command.
```yaml
id: new-shell
namespace: myteam

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
    commands:
      - echo "Hello World"
```
Pass input files to the task, execute a Shell command, then retrieve output files.
```yaml
id: new-shell-with-file
namespace: myteam

inputs:
  - id: file
    type: FILE

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{ inputs.file }}"
    outputFiles:
      - out.txt
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
      projectId: "{{ vars.projectId }}"
      region: "{{ vars.region }}"
      bucket: "{{ vars.bucket }}"
    commands:
      - cp {{ workingDir }}/data.txt {{ workingDir }}/out.txt
```
Properties
delete
- Type: boolean
- Dynamic: ❓
- Required: ✔️
- Default: true
Whether the job should be deleted upon completion.
machineType
- Type: string
- Dynamic: ✔️
- Required: ✔️
- Default: e2-medium
The GCP machine type.
region
- Type: string
- Dynamic: ✔️
- Required: ✔️
The GCP region.
bucket
- Type: string
- Dynamic: ✔️
- Required: ❌
Google Cloud Storage bucket used to upload input files (`inputFiles` and `namespaceFiles`) and download output files (`outputFiles`). Providing a bucket is mandatory if you want to use any of these properties.
entryPoint
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
Container entrypoint to use.
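As a sketch, the container entrypoint can be overridden like this (the values are illustrative, not defaults):

```yaml
taskRunner:
  type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
  region: "{{ vars.region }}"
  # override the container image's entrypoint (illustrative values)
  entryPoint:
    - /bin/sh
    - -c
```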
networkInterfaces
- Type: array
- SubType: GcpBatchTaskRunner-NetworkInterface
- Dynamic: ❌
- Required: ❌
Network interfaces.
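Following the identifier formats described under Definitions, a network interface can be configured like this (project, region, and network names are hypothetical):

```yaml
taskRunner:
  type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
  region: europe-west1
  networkInterfaces:
    - network: projects/my-host-project/global/networks/my-vpc
      subnetwork: projects/my-host-project/regions/europe-west1/subnetworks/my-subnet
```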
projectId
- Type: string
- Dynamic: ✔️
- Required: ❌
The GCP project ID.
reservation
- Type: string
- Dynamic: ✔️
- Required: ❌
Compute reservation.
scopes
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default: [https://www.googleapis.com/auth/cloud-platform]
The GCP scopes to be used.
serviceAccount
- Type: string
- Dynamic: ✔️
- Required: ❌
The GCP service account key.
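Rather than inlining the key, you would typically pull it from a Kestra secret (the secret name here is hypothetical):

```yaml
taskRunner:
  type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
  region: "{{ vars.region }}"
  # service account key JSON fetched from a secret (name is hypothetical)
  serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
```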
waitUntilCompletion
- Type: string
- Dynamic: ❓
- Required: ❌
- Default: 3600 seconds (1 hour)
- Format: duration

The maximum duration to wait for job completion. Google Cloud Batch automatically times out the job once this duration is reached, and the task will fail.
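Since this property is a duration, it can be set with an ISO-8601 duration string; for example, to allow up to two hours (an illustrative value):

```yaml
taskRunner:
  type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner
  region: "{{ vars.region }}"
  # ISO-8601 duration: wait up to 2 hours before Batch times the job out
  waitUntilCompletion: PT2H
```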
Definitions
io.kestra.plugin.gcp.runner.GcpBatchTaskRunner-NetworkInterface
Properties
network
- Type: string
- Dynamic: ✔️
- Required: ✔️
Network identifier, in the format `projects/HOST_PROJECT_ID/global/networks/NETWORK`.
subnetwork
- Type: string
- Dynamic: ✔️
- Required: ❌
Subnetwork identifier, in the format `projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNET`.