CreateJob
type: "io.kestra.plugin.databricks.job.CreateJob"
Create a Databricks job and run it. Set waitForCompletion to the desired maximum duration if you want the task to wait for the job completion (e.g., PT1H to wait up to one hour).
Examples
Create a Databricks job, run it, and wait up to five minutes for its completion
id: "create_job"
type: "io.kestra.plugin.databricks.job.CreateJob"
id: createJob
type: io.kestra.plugin.databricks.job.CreateJob
authentication:
  token: <your-token>
host: <your-host>
jobTasks:
  - existingClusterId: <your-cluster>
    taskKey: taskKey
    sparkPythonTask:
      pythonFile: /Shared/hello.py
      sparkPythonTaskSource: WORKSPACE
waitForCompletion: PT5M
Properties
jobTasks
- Type: array
- SubType: JobTaskSetting
- Dynamic: ❌
- Required: ✔️
- Min items: 1
The job tasks. If multiple tasks are defined, you must set dependsOn on each task (see the sketch just below).
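A minimal sketch of two dependent job tasks, assuming a Spark Python task feeding a notebook task; the cluster ID, file paths, and task keys are placeholders:
jobTasks:
  - taskKey: extract
    existingClusterId: <your-cluster>       # placeholder cluster ID
    sparkPythonTask:
      pythonFile: /Shared/extract.py        # placeholder workspace path
      sparkPythonTaskSource: WORKSPACE
  - taskKey: transform
    dependsOn:
      - extract                             # runs after the extract task completes
    existingClusterId: <your-cluster>
    notebookTask:
      notebookPath: /Shared/transform       # placeholder notebook path
      source: WORKSPACE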
accountId
- Type: string
- Dynamic: ✔️
- Required: ❌
Databricks account identifier
authentication
- Type: AuthenticationConfig
- Dynamic: ❌
- Required: ❌
Databricks authentication configuration
This property allows you to configure authentication to Databricks; different properties should be set depending on the type of authentication and the cloud provider. All configuration options can also be set using the standard Databricks environment variables. Check the Databricks authentication guide for more information.
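As a sketch, an Azure service principal setup could combine the azureClientId, azureClientSecret, and azureTenantId properties; all values below are placeholders, and the secret lookup assumes the credential is stored in Kestra's secret backend:
authentication:
  azureClientId: <application-id>                          # placeholder Azure application (client) ID
  azureClientSecret: "{{ secret('AZURE_CLIENT_SECRET') }}" # placeholder secret lookup
  azureTenantId: <tenant-id>                               # placeholder Azure tenant ID
host: <your-workspace-host>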
configFile
- Type: string
- Dynamic: ✔️
- Required: ❌
Databricks configuration file; use this if you don't want to configure each Databricks account property one by one.
host
- Type: string
- Dynamic: ✔️
- Required: ❌
Databricks host
jobName
- Type: string
- Dynamic: ✔️
- Required: ❌
The name of the job
waitForCompletion
- Type: string
- Dynamic: ❌
- Required: ❌
- Format: duration
If set, the task will wait for the job run completion for up to the waitForCompletion duration before timing out.
Outputs
jobId
- Type: integer
The job identifier
jobURI
- Type: string
The job URI on the Databricks console
runId
- Type: integer
The run identifier
runURI
- Type: string
The run URI on the Databricks console
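As a sketch, downstream tasks can reference these outputs through Kestra's expression syntax; assuming the task above has the id createJob and using Kestra's core Log task for illustration:
- id: logRun
  type: io.kestra.plugin.core.log.Log
  message: "Run {{ outputs.createJob.runId }} available at {{ outputs.createJob.runURI }}"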
Definitions
AuthenticationConfig
authType
- Type: string
- Dynamic: ❌
- Required: ❌
azureClientId
- Type: string
- Dynamic: ✔️
- Required: ❌
azureClientSecret
- Type: string
- Dynamic: ✔️
- Required: ❌
azureTenantId
- Type: string
- Dynamic: ✔️
- Required: ❌
clientId
- Type: string
- Dynamic: ✔️
- Required: ❌
clientSecret
- Type: string
- Dynamic: ✔️
- Required: ❌
googleCredentials
- Type: string
- Dynamic: ✔️
- Required: ❌
googleServiceAccount
- Type: string
- Dynamic: ✔️
- Required: ❌
password
- Type: string
- Dynamic: ✔️
- Required: ❌
token
- Type: string
- Dynamic: ✔️
- Required: ❌
username
- Type: string
- Dynamic: ✔️
- Required: ❌
SqlTaskSetting
parameters
- Type: object
- SubType: string
- Dynamic: ❌
- Required: ❌
queryId
- Type: string
- Dynamic: ✔️
- Required: ❌
warehouseId
- Type: string
- Dynamic: ✔️
- Required: ❌
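A minimal sketch of a SQL task inside jobTasks, assuming a saved Databricks SQL query; the query and warehouse IDs are placeholders:
jobTasks:
  - taskKey: run_query
    sqlTask:
      queryId: <your-query-id>          # placeholder: ID of a saved Databricks SQL query
      warehouseId: <your-warehouse-id>  # placeholder: SQL warehouse to run it on
      parameters:
        country: FR                     # example query parameter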
NotebookTaskSetting
baseParameters
- Type: object
- SubType: string
- Dynamic: ❌
- Required: ❌
notebookPath
- Type: string
- Dynamic: ✔️
- Required: ❌
source
- Type: string
- Dynamic: ❌
- Required: ❌
- Possible Values: GIT, WORKSPACE
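A minimal sketch of a notebook task inside jobTasks, assuming a notebook stored in the workspace; the cluster ID and notebook path are placeholders:
jobTasks:
  - taskKey: notebook
    existingClusterId: <your-cluster>     # placeholder cluster ID
    notebookTask:
      notebookPath: /Shared/my_notebook   # placeholder workspace path
      source: WORKSPACE
      baseParameters:
        env: dev                          # example notebook parameter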
SparkPythonTaskSetting
pythonFile
- Type: string
- Dynamic: ✔️
- Required: ✔️
sparkPythonTaskSource
- Type: string
- Dynamic: ❌
- Required: ✔️
- Possible Values: GIT, WORKSPACE
parameters
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
JobTaskSetting
dbtTask
- Type: DbtTaskSetting
- Dynamic: ❌
- Required: ❌
DBT task settings
dependsOn
- Type: array
- SubType: string
- Dynamic: ❌
- Required: ❌
Task dependencies; set this if multiple tasks are defined on the job.
description
- Type: string
- Dynamic: ✔️
- Required: ❌
Task description
existingClusterId
- Type: string
- Dynamic: ✔️
- Required: ❌
The identifier of the cluster
notebookTask
- Type: NotebookTaskSetting
- Dynamic: ❌
- Required: ❌
Notebook task settings
pipelineTask
- Type: PipelineTaskSetting
- Dynamic: ❌
- Required: ❌
Pipeline task settings
pythonWheelTask
- Type: PythonWheelTaskSetting
- Dynamic: ❌
- Required: ❌
Python Wheel task settings
sparkJarTask
- Type: SparkJarTaskSetting
- Dynamic: ❌
- Required: ❌
Spark JAR task settings
sparkPythonTask
- Type: SparkPythonTaskSetting
- Dynamic: ❌
- Required: ❌
Spark Python task settings
sparkSubmitTask
- Type: SparkSubmitTaskSetting
- Dynamic: ❌
- Required: ❌
Spark Submit task settings
sqlTask
- Type: SqlTaskSetting
- Dynamic: ❌
- Required: ❌
SQL task settings
taskKey
- Type: string
- Dynamic: ✔️
- Required: ❌
Task key
timeoutSeconds
- Type: integer
- Dynamic: ❌
- Required: ❌
Task timeout in seconds
PythonWheelTaskSetting
entryPoint
- Type: string
- Dynamic: ✔️
- Required: ❌
namedParameters
- Type: object
- SubType: string
- Dynamic: ❌
- Required: ❌
packageName
- Type: string
- Dynamic: ✔️
- Required: ❌
parameters
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
PipelineTaskSetting
fullRefresh
- Type: boolean
- Dynamic: ❌
- Required: ❌
pipelineId
- Type: string
- Dynamic: ✔️
- Required: ❌
DbtTaskSetting
catalog
- Type: string
- Dynamic: ✔️
- Required: ❌
commands
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
schema
- Type: string
- Dynamic: ✔️
- Required: ❌
warehouseId
- Type: string
- Dynamic: ✔️
- Required: ❌
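A minimal sketch of a dbt task inside jobTasks; the warehouse ID, catalog, and schema are placeholders, and the commands list assumes a standard dbt project:
jobTasks:
  - taskKey: dbt
    dbtTask:
      warehouseId: <your-warehouse-id>    # placeholder SQL warehouse ID
      catalog: main                       # example Unity Catalog name
      schema: analytics                   # example target schema
      commands:
        - dbt deps
        - dbt run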
SparkJarTaskSetting
jarUri
- Type: string
- Dynamic: ✔️
- Required: ❌
mainClassName
- Type: string
- Dynamic: ✔️
- Required: ❌
parameters
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
SparkSubmitTaskSetting
parameters
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌