CustomJob
Start a custom job in Google Vertex AI.
For more details, check out the custom job documentation.
type: "io.kestra.plugin.gcp.vertexai.CustomJob"
Examples
id: gcp_vertexai_custom_job
namespace: company.team
tasks:
  - id: custom_job
    type: io.kestra.plugin.gcp.vertexai.CustomJob
    projectId: my-gcp-project
    region: europe-west1
    displayName: Start Custom Job
    spec:
      workerPoolSpecs:
        - containerSpec:
            imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
          machineSpec:
            machineType: n1-standard-4
          replicaCount: 1
Properties
displayName *Required string
The job display name.
region *Required string
The GCP region.
spec *Required Non-dynamic CustomJobSpec
The job specification.
delete boolean | string
Default: true
Delete the job at the end.
impersonatedServiceAccount string
The GCP service account to impersonate.
projectId string
The GCP project ID.
scopes array
Default: ["https://www.googleapis.com/auth/cloud-platform"]
The GCP scopes to be used.
serviceAccount string
The GCP service account.
wait boolean | string
Default: true
Wait for the job to finish.
This allows the task to capture the job status and logs.
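As a rough illustration of the optional properties above, the following sketch submits a job without waiting for it and keeps it in Vertex AI afterwards. The impersonated service account, project, and image names are placeholders assumed for this example; they do not come from the plugin.

```yaml
id: gcp_vertexai_custom_job_async
namespace: company.team

tasks:
  - id: custom_job
    type: io.kestra.plugin.gcp.vertexai.CustomJob
    projectId: my-gcp-project
    region: europe-west1
    displayName: Submit Custom Job
    # Hypothetical service account to impersonate (placeholder value).
    impersonatedServiceAccount: vertex-runner@my-gcp-project.iam.gserviceaccount.com
    scopes:
      - https://www.googleapis.com/auth/cloud-platform
    wait: false    # do not wait for the job to end, so status and logs are not captured
    delete: false  # keep the job instead of deleting it at the end
    spec:
      workerPoolSpecs:
        - containerSpec:
            imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
          machineSpec:
            machineType: n1-standard-4
          replicaCount: 1
```

Leaving wait and delete at their defaults (both true) is the simpler choice when a downstream task depends on the final job state.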
Outputs
createDate *Required string
Format: date-time
Time when the CustomJob was created.
endDate *Required string
Format: date-time
Time when the CustomJob ended.
name *Required string
Resource name of a CustomJob.
state *Required string
Possible values:
JOB_STATE_UNSPECIFIED
JOB_STATE_QUEUED
JOB_STATE_PENDING
JOB_STATE_RUNNING
JOB_STATE_SUCCEEDED
JOB_STATE_FAILED
JOB_STATE_CANCELLING
JOB_STATE_CANCELLED
JOB_STATE_PAUSED
JOB_STATE_EXPIRED
JOB_STATE_UPDATING
JOB_STATE_PARTIALLY_SUCCEEDED
UNRECOGNIZED
The detailed state of the CustomJob.
updateDate *Required string
Format: date-time
Time when the CustomJob was updated.
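The outputs above can be read by later tasks through Kestra's outputs expression. The sketch below is a minimal example under two assumptions: the job task keeps the id custom_job, and the core Log task (io.kestra.plugin.core.log.Log, which is not part of this plugin) is available in your Kestra version.

```yaml
id: gcp_vertexai_custom_job_outputs
namespace: company.team

tasks:
  - id: custom_job
    type: io.kestra.plugin.gcp.vertexai.CustomJob
    projectId: my-gcp-project
    region: europe-west1
    displayName: Start Custom Job
    spec:
      workerPoolSpecs:
        - containerSpec:
            imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
          machineSpec:
            machineType: n1-standard-4
          replicaCount: 1

  # Assumed follow-up task: logs the resource name, final state, and end time.
  - id: log_result
    type: io.kestra.plugin.core.log.Log
    message: "Job {{ outputs.custom_job.name }} ended in state {{ outputs.custom_job.state }} at {{ outputs.custom_job.endDate }}"
```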
Definitions
io.kestra.plugin.gcp.vertexai.models.ContainerSpec
imageUri *Required string
The URI of a container image in the Container Registry that is to be run on each worker replica.
Must be hosted on Google Container Registry, for example: gcr.io/{{ project }}/{{ dir }}/{{ image }}:{{ tag }}
args array
The arguments to be passed when starting the container.
commands array
The command to be invoked when the container is started.
It overrides the entrypoint instruction in Dockerfile when provided.
env object
Environment variables to be passed to the container.
Maximum limit is 100.
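A containerSpec that uses every field described above might look like the following fragment. The script name train.py, the argument, and the environment variable are hypothetical values chosen for illustration.

```yaml
containerSpec:
  imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
  # Overrides the image's Dockerfile entrypoint.
  commands:
    - python
    - train.py        # hypothetical script inside the image
  # Passed to the container when it starts.
  args:
    - --epochs=10     # hypothetical argument
  # Environment variables for the container (at most 100).
  env:
    LOG_LEVEL: INFO   # hypothetical variable
```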
io.kestra.plugin.gcp.vertexai.models.CustomJobSpec
workerPoolSpecs *Required array
baseOutputDirectory GcsDestination
The Cloud Storage location to store the output of this job.
enableWebAccess boolean | string
Whether you want Vertex AI to enable interactive shell access to training containers.
network string
The full name of the Compute Engine network to which the Job should be peered.
For example, projects/12345/global/networks/myVPC.
The format is projects/{project}/global/networks/{network}, where {project} is a project number, as in 12345, and {network} is a network name.
To specify this field, you must have already configured VPC Network Peering for Vertex AI.
If this field is left unspecified, the job is not peered with any network.
scheduling Scheduling
Scheduling options for a CustomJob.
serviceAccount string
Specifies the service account for workload run-as account.
Users submitting jobs must have act-as permission on this run-as account.
If unspecified, the [Vertex AI Custom Code Service Agent](https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents) for the CustomJob's project is used.
tensorboard string
The name of a Vertex AI Tensorboard resource to which this CustomJob will upload Tensorboard logs.
Format: projects/{project}/locations/{location}/tensorboards/{tensorboard}
io.kestra.plugin.gcp.vertexai.models.GcsDestination
outputUriPrefix *Required string
Google Cloud Storage URI of the output directory.
If the URI doesn't end with '/', a '/' is automatically appended. The directory is created if it doesn't exist.
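To show how the CustomJobSpec fields fit together, here is a sketch of a spec block that sets the optional fields described above. The bucket name, project number, network, service account, and Tensorboard ID are all placeholders assumed for this example.

```yaml
spec:
  workerPoolSpecs:
    - containerSpec:
        imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
      machineSpec:
        machineType: n1-standard-4
      replicaCount: 1
  # GcsDestination: a trailing '/' is appended automatically if missing.
  baseOutputDirectory:
    outputUriPrefix: gs://my-bucket/custom-job-output/
  # Requires VPC Network Peering to be configured for Vertex AI beforehand.
  network: projects/12345/global/networks/myVPC
  # Run-as account; the submitting user needs act-as permission on it.
  serviceAccount: training-runner@my-gcp-project.iam.gserviceaccount.com
  tensorboard: projects/12345/locations/europe-west1/tensorboards/67890
  enableWebAccess: false
```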
io.kestra.plugin.gcp.vertexai.models.WorkerPoolSpec
containerSpec *Required ContainerSpec
The custom container task.
machineSpec *Required MachineSpec
The specification of a single machine.
discSpec DiscSpec
The specification of the disk.
pythonPackageSpec PythonPackageSpec
The Python package specification.
replicaCount integer | string
The number of worker replicas to use for this worker pool.
io.kestra.plugin.gcp.vertexai.models.PythonPackageSpec
args *Required array
Command-line arguments to be passed to the Python task.
envs *Required object
Environment variables to be passed to the Python module.
Maximum limit is 100.
packageUris *Required array
The Google Cloud Storage location of the Python package files which are the training program and its dependent packages.
The maximum number of package URIs is 100.
io.kestra.plugin.gcp.vertexai.models.DiscSpec
bootDiskSizeGb integer | string
Default: 100
Size in GB of the boot disk.
bootDiskType string
Default: PD_SSD
Possible values:
PD_SSD
PD_STANDARD
Type of the boot disk.
io.kestra.plugin.gcp.vertexai.models.MachineSpec
machineType *Required string
The type of the machine.
acceleratorCount integer | string
The number of accelerators to attach to the machine.
acceleratorType string
Possible values:
ACCELERATOR_TYPE_UNSPECIFIED
NVIDIA_TESLA_K80
NVIDIA_TESLA_P100
NVIDIA_TESLA_V100
NVIDIA_TESLA_P4
NVIDIA_TESLA_T4
NVIDIA_TESLA_A100
NVIDIA_A100_80GB
NVIDIA_L4
NVIDIA_H100_80GB
NVIDIA_H100_MEGA_80GB
NVIDIA_H200_141GB
NVIDIA_B200
TPU_V2
TPU_V3
TPU_V4_POD
TPU_V5_LITEPOD
UNRECOGNIZED
The type of accelerator(s) that may be attached to the machine.
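As a sketch of how MachineSpec and DiscSpec combine inside one worker pool, the fragment below attaches a single GPU and enlarges the boot disk. The specific pairing of n1-standard-4 with NVIDIA_TESLA_T4 and the 200 GB disk size are example choices, and availability depends on the region.

```yaml
workerPoolSpecs:
  - containerSpec:
      imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
    machineSpec:
      machineType: n1-standard-4
      acceleratorType: NVIDIA_TESLA_T4   # one of the values listed above
      acceleratorCount: 1
    discSpec:
      bootDiskType: PD_SSD               # the default; PD_STANDARD is the alternative
      bootDiskSizeGb: 200                # example value; the default is 100
    replicaCount: 1
```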
io.kestra.plugin.gcp.vertexai.models.Scheduling
restartJobOnWorkerRestart *Required boolean | string
Restarts the entire CustomJob if a worker gets restarted.
This feature can be used by distributed training jobs that are not resilient to workers leaving and joining a job.
timeOut *Required string
Format: duration
The maximum job running time. The default is 7 days.
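A scheduling block could be written as in the sketch below. It assumes the duration is given as an ISO 8601 string (PT2H for two hours), which is the usual convention for duration properties in Kestra; check this against your version before relying on it.

```yaml
spec:
  workerPoolSpecs:
    - containerSpec:
        imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
      machineSpec:
        machineType: n1-standard-4
      replicaCount: 1
  scheduling:
    # Assumed ISO 8601 duration: stop the job after two hours instead of the 7-day default.
    timeOut: PT2H
    # Set to true for distributed jobs that cannot tolerate workers leaving and rejoining.
    restartJobOnWorkerRestart: false
```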