Google Batch Task Runner
Available on: Enterprise Edition and Cloud, version >= 0.18.0
Run tasks as containers on Google Cloud VMs.
How to use the Google Batch task runner
The Google Batch task runner deploys a container for each task on a specified Google Cloud Batch VM.
To launch tasks on Google Cloud Batch, you should understand three main concepts:
- Machine type — a required property that defines the compute machine type on which the task is deployed. If no reservation is specified, a new compute instance is created for each batch, which can add up to a minute of startup latency.
- Reservation — an optional property that lets you reserve virtual machines in advance, avoiding the delay of provisioning new instances for every task.
- Network interfaces — optional; if not specified, the runner uses the default network interface. All three are configured on the taskRunner block, as shown in the sketch below.
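As a rough sketch of how these properties fit together, they sit directly on the taskRunner block. The machineType and reservation property names below follow the concept names above and should be verified against the plugin documentation; the values are placeholders.

taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.Batch
  projectId: "{{ secret('GCP_PROJECT_ID') }}"
  region: europe-west9
  bucket: "{{ secret('GCS_BUCKET') }}"
  serviceAccount: "{{ secret('GOOGLE_SA') }}"
  # Property names below are assumed from the concepts above — check the plugin docs.
  machineType: e2-medium         # required: machine type used for the Batch job
  reservation: my-reservation    # optional: reuse pre-provisioned VMs to avoid startup latency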
How the Google Batch task runner works
To support inputFiles, namespaceFiles, and outputFiles, the Google Batch task runner performs the following actions:
- Mounts a volume from a GCS bucket.
- Uploads input files to the bucket before launching the container.
- Downloads output files from the bucket after the container finishes.
Because the container’s working directory is not known ahead of time, you must explicitly define the working and output directories. For example, use python {{ workingDir }}/main.py instead of python main.py.
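A minimal sketch of the relevant task properties, with placeholder file names:

commands:
  - python {{ workingDir }}/main.py    # not just `python main.py`
outputFiles:
  - "environment_info.json"            # collected from the working directory after the container exits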
Example flow
id: gcp_batch_runner
namespace: company.team

variables:
  region: europe-west9

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      region: "{{ vars.region }}"
      bucket: "{{ secret('GCS_BUCKET') }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json

        from kestra import Kestra

        print("Hello from GCP Batch and kestra!")


        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = '{{ workingDir }}/environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)


        if __name__ == '__main__':
            print_environment_info()
For a full list of available properties, see the Google Batch plugin documentation or explore the configuration in the built-in Code Editor in the Kestra UI.
Full setup guide: running Google Batch from scratch
Before you begin
You’ll need the following prerequisites:
- A Google Cloud account.
- A Kestra instance (version 0.16.0 or later) with Google credentials stored as secrets or set as environment variables.
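One way to provide those credentials — assuming a Docker Compose deployment in which Kestra reads secrets from SECRET_-prefixed, base64-encoded environment variables — is sketched below; if your instance uses a dedicated secrets backend, store the same values there instead.

services:
  kestra:
    environment:
      # Values must be base64-encoded; the secret names match those referenced in the flow below.
      SECRET_GCP_PROJECT_ID: <base64-encoded project ID>
      SECRET_GCS_BUCKET: <base64-encoded bucket name>
      SECRET_GOOGLE_SA: <base64-encoded service account JSON key>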
Google Cloud Console setup
Create a project
If you don’t already have one, create a new project in the Google Cloud Console.

Once created, ensure your new project is selected in the top navigation bar.

Enable the Batch API
Navigate to the APIs & Services section and search for Batch API. Enable it so Kestra can create and manage Batch jobs.

After enabling the API, you’ll be prompted to create credentials for integration.
Create a service account
Once the Batch API is active, create a service account to allow Kestra to access GCP resources.
Follow the prompt for Application data, which will generate a new service account.

Give the service account a descriptive name.

Assign the following roles:
- Batch Job Editor
- Logs Viewer
- Storage Object Admin

Next, create a key for this service account by going to Keys → Add Key, and choose JSON. This will generate credentials you can add to Kestra as a secret or directly into your flow configuration.

See the Google credentials guide for more details.
Grant this service account access to the Compute Engine default service account by navigating to IAM & Admin → Service Accounts → Permissions → Grant Access, then assigning the Service Account User role.

Create a storage bucket
Search for “Bucket” in the Cloud Console and create a new GCS bucket. You can keep the default configuration for now.

Create a flow
Below is a sample flow that runs a Python file (main.py) using the Google Batch Task Runner. The taskRunner section defines properties such as the project, region, and bucket.
containerImage: ghcr.io/kestra-io/kestrapy:latest
taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.Batch
  projectId: "{{ secret('GCP_PROJECT_ID') }}"
  region: "{{ vars.region }}"
  bucket: "{{ secret('GCS_BUCKET') }}"
  serviceAccount: "{{ secret('GOOGLE_SA') }}"
By default, the task runner uses the default network configuration of your Google Cloud project. If none exists, you can configure connectivity manually using the networkInterfaces property. See the Google Cloud Batch Task Runner documentation for details.
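If you do need to configure connectivity manually, the sketch below shows one plausible shape for that property — the network and subnetwork paths are placeholders, and the exact sub-fields should be confirmed in the plugin documentation.

taskRunner:
  type: io.kestra.plugin.ee.gcp.runner.Batch
  projectId: "{{ secret('GCP_PROJECT_ID') }}"
  region: "{{ vars.region }}"
  bucket: "{{ secret('GCS_BUCKET') }}"
  serviceAccount: "{{ secret('GOOGLE_SA') }}"
  # Assumed structure — verify field names against the plugin documentation.
  networkInterfaces:
    - network: projects/my-project/global/networks/my-vpc
      subnetwork: projects/my-project/regions/europe-west2/subnetworks/my-subnet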
Here’s the full flow configuration:
id: gcp_batch_runner
namespace: company.team

variables:
  region: europe-west2

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/kestrapy:latest
    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch
      projectId: "{{ secret('GCP_PROJECT_ID') }}"
      region: "{{ vars.region }}"
      bucket: "{{ secret('GCS_BUCKET') }}"
      serviceAccount: "{{ secret('GOOGLE_SA') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json

        from kestra import Kestra

        print("Hello from GCP Batch and kestra!")


        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = '{{ workingDir }}/environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)


        print_environment_info()
When you execute the flow, the logs will show the task runner being created:

You can also confirm job creation directly in the Google Cloud Console:

After the task completes, the runner automatically shuts down. You can review output artifacts in Kestra’s Outputs tab:

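Because the script calls Kestra.outputs() and the flow captures environment_info.json, a downstream task can consume both. The sketch below assumes the usual Kestra expressions for script outputs (outputs.<taskId>.vars.<key>) and captured files (outputs.<taskId>.outputFiles), as well as the core Log task type; adjust the names to your flow and Kestra version if they differ.

  - id: log_environment_info
    type: io.kestra.plugin.core.log.Log   # assumed core Log task type
    message: |
      Host: {{ outputs.scrape_environment_info.vars.host }}
      Python: {{ outputs.scrape_environment_info.vars.python_version }}
      Output file URI: {{ outputs.scrape_environment_info.outputFiles['environment_info.json'] }}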