Kubernetes Task Runner
Available on: Enterprise Edition and Kestra Cloud (>= 0.18.0)
Run tasks as Kubernetes pods.
Overview
This plugin is available only in the Enterprise Edition (EE) and Kestra Cloud. The task runner is container-based, so the `containerImage` property must be set. To access the task's working directory, use either the `{{ workingDir }}` Pebble expression or the `WORKING_DIR` environment variable. Input files and namespace files are available in this directory.
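For instance, a command can read a mounted input file through the working directory; here is a minimal sketch, assuming an input file mounted as `data.txt`:

commands:
  - cat "{{ workingDir }}/data.txt" # equivalent to: cat "$WORKING_DIR/data.txt"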
To generate output files, you can either:
- Use the `outputFiles` property of the task and create a file with the same name in the task's working directory, or
- Create any file in the output directory, accessible via the `{{ outputDir }}` Pebble expression or the `OUTPUT_DIR` environment variable.
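For example, here is a minimal sketch of a task that produces one file each way (the file names are illustrative):

tasks:
  - id: produce_files
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
    outputFiles:
      - "report.txt"
    commands:
      # Captured because it matches the outputFiles pattern in the working directory
      - echo "via outputFiles" > report.txt
      # Captured because it is written to the output directory
      - echo "via output dir" > "$OUTPUT_DIR/extra.txt"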
When the Kestra Worker running this task is terminated, the pod continues until completion. After restarting, the Worker resumes processing on the existing pod unless `resume` is set to `false`.
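To always start a fresh pod instead of reattaching, disable resuming on the task runner (sketch):

taskRunner:
  type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
  resume: false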
If your cluster is configured with RBAC, the service account running your pod must have the following authorizations:
- `pods`: get, create, delete, watch, list
- `pods/log`: get, watch
- `pods/exec`: get, watch
Here is an example role that grants these authorizations:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: task-runner
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "create", "delete", "watch", "list"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["get", "watch"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "watch"]
How to use the Kubernetes task runner
The Kubernetes task runner executes tasks in a specified Kubernetes cluster. It is useful for declaring resource limits and resource requests.
Here is an example of a workflow with a task running shell commands in a Kubernetes pod:
id: kubernetes_task_runner
namespace: company.team

description: |
  To get the kubeconfig file, run: `kubectl config view --minify --flatten`.
  Then, copy the values to the configuration below.

  Here is how Kubernetes task runner properties (on the left) map to the kubeconfig file's properties (on the right):
  - clientKeyData: client-key-data
  - clientCertData: client-certificate-data
  - caCertData: certificate-authority-data
  - masterUrl: server, e.g., https://docker-for-desktop:6443
  - oauthToken: token (if using OAuth, e.g., GKE/EKS)

inputs:
  - id: file
    type: FILE

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{ inputs.file }}"
    outputFiles:
      - "*.txt"
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      config:
        clientKeyData: client-key-data
        clientCertData: client-certificate-data
        caCertData: certificate-authority-data
        masterUrl: server, e.g., https://docker-for-desktop:6443
    commands:
      - echo "Hello from a Kubernetes task runner!"
      - cp data.txt out.txt
To deploy Kubernetes with Docker Desktop, see this guide. To install `kubectl`, see this guide.
File handling
If your script task has `inputFiles` or `namespaceFiles` configured, an init container uploads files into the main container. If your script task has `outputFiles` configured, a sidecar container downloads files from the main container. All containers use an in-memory `emptyDir` volume for file exchange.
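For reference, this is how containers in a pod can share an in-memory `emptyDir` volume. This is a generic Kubernetes sketch to illustrate the mechanism, not the exact pod spec the task runner generates:

apiVersion: v1
kind: Pod
metadata:
  name: file-exchange-demo # hypothetical name, for illustration only
spec:
  initContainers:
    - name: upload # plays the role of the init container uploading input files
      image: busybox
      command: ["sh", "-c", "echo hello > /files/data.txt"]
      volumeMounts:
        - name: exchange
          mountPath: /files
  containers:
    - name: main # the task's container reads and writes the same volume
      image: busybox
      command: ["sh", "-c", "cat /files/data.txt"]
      volumeMounts:
        - name: exchange
          mountPath: /files
  volumes:
    - name: exchange
      emptyDir:
        medium: Memory # in-memory; contents are lost when the pod terminates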
Failure scenarios
If a task is resubmitted (for example, due to a retry or a Worker crash), the new Worker will reattach to the existing (or completed) pod instead of starting a new one.
Specifying resource requests for Python scripts
Some Python scripts may require more resources than others. You can specify resource requests in the `resources` property of the task runner.
id: kubernetes_resources
namespace: company.team

tasks:
  - id: python_script
    type: io.kestra.plugin.scripts.python.Script
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: default
      pullPolicy: ALWAYS
      config:
        username: docker-desktop
        masterUrl: https://docker-for-desktop:6443
        caCertData: xxx
        clientCertData: xxx
        clientKeyData: xxx
      resources:
        request:
          cpu: "500m"
          memory: "128Mi"
    outputFiles:
      - "*.json"
    script: |
      import platform
      import sys
      import json

      from kestra import Kestra

      print("Hello from a Kubernetes runner!")

      host = platform.node()
      py_version = platform.python_version()
      platform_info = platform.platform()
      os_arch = f"{sys.platform}/{platform.machine()}"

      def print_environment_info():
          print(f"Host name: {host}")
          print(f"Python version: {py_version}")
          print(f"Platform: {platform_info}")
          print(f"OS/Arch: {os_arch}")

      env_info = {
          "host": host,
          "platform": platform_info,
          "os_arch": os_arch,
          "python_version": py_version,
      }
      Kestra.outputs(env_info)

      with open("environment_info.json", "w") as json_file:
          json.dump(env_info, json_file, indent=4)

      if __name__ == "__main__":
          print_environment_info()
For a full list of Kubernetes task runner properties, see the Kubernetes plugin documentation or explore them in the built-in Code Editor in the Kestra UI.
Using plugin defaults to avoid repetition
You can use pluginDefaults
to avoid repeating configuration across multiple tasks. For example, you can set the pullPolicy
to ALWAYS
for all tasks in a namespace:
id: k8s_taskrunner
namespace: company.team

tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: run_command
        type: io.kestra.plugin.scripts.python.Commands
        containerImage: ghcr.io/kestra-io/kestrapy:latest
        commands:
          - pip show kestra

      - id: run_python
        type: io.kestra.plugin.scripts.python.Script
        containerImage: ghcr.io/kestra-io/pydata:latest
        script: |
          import socket

          hostname = socket.gethostname()
          ip_address = socket.gethostbyname(hostname)
          print("Hello from AWS EKS and Kestra!")
          print(f"Host IP Address: {ip_address}")

pluginDefaults:
  - type: io.kestra.plugin.scripts.python
    forced: true
    values:
      taskRunner:
        type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
        namespace: default
        pullPolicy: ALWAYS
        config:
          username: docker-desktop
          masterUrl: https://docker-for-desktop:6443
          caCertData: |-
            placeholder
          clientCertData: |-
            placeholder
          clientKeyData: |-
            placeholder
Guides
Below are several guides to help you set up the Kubernetes task runner on different platforms.
Google Kubernetes Engine (GKE)
Before you begin
Before starting, ensure you have the following:
- A Google Cloud account.
- A Kestra instance (version 0.18.0 or later) with Google credentials stored as secrets or environment variables.
Set up Google Cloud
In Google Cloud, perform the following steps:
- Create and select a project.
- Create a GKE cluster.
- Enable the Kubernetes Engine API.
- Set up the `gcloud` CLI with `kubectl`.
- Create a service account.
To authenticate with Google Cloud, create a service account and add a JSON key to Kestra. Read more in our Google credentials guide. For GKE, ensure the `Kubernetes Engine default node service account` role is assigned to your service account.
Creating a flow
Here's an example flow using the Kubernetes task runner with GKE. To authenticate, use OAuth with a service account.
id: gke_task_runner
namespace: company.team

tasks:
  - id: metadata
    type: io.kestra.plugin.gcp.gke.ClusterMetadata
    clusterId: kestra-dev-gke
    clusterZone: "europe-west1"
    clusterProjectId: kestra-dev

  - id: auth
    type: io.kestra.plugin.gcp.auth.OauthAccessToken

  - id: pod
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    commands:
      - echo "Hello from a Kubernetes task runner!"
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: default
      config:
        caCertData: "{{ outputs.metadata.masterAuth.clusterCertificat }}"
        masterUrl: "https://{{ outputs.metadata.endpoint }}"
        oauthToken: "{{ outputs.auth.accessToken['tokenValue'] }}"
Use the `gcloud` CLI to get credentials such as `masterUrl` and `caCertData`:

gcloud container clusters get-credentials clustername --region myregion --project projectid
Update the following arguments with your own values:
- `clusterId`: the name of your cluster.
- `clusterZone`: the region of your cluster (for example, `europe-west2`).
- `clusterProjectId`: the ID of your Google Cloud project.
After running the command, access your config with `kubectl config view --minify --flatten` to replace `caCertData`, `masterUrl`, and `username`.
Amazon Elastic Kubernetes Service (EKS)
Here's an example flow using the Kubernetes task runner with AWS EKS. To authenticate, you need an OAuth token.
id: eks_task_runner
namespace: company.team

tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      config:
        caCertData: "{{ secret('certificate-authority-data') }}"
        masterUrl: https://xxx.xxx.region.eks.amazonaws.com
        username: arn:aws:eks:region:xxx:cluster/cluster_name
        oauthToken: xxx
    commands:
      - echo "Hello from a Kubernetes task runner!"