CreateJob
type: "io.kestra.plugin.databricks.job.CreateJob"
Create a Databricks job and run it. Set waitForCompletion to the desired maximum duration if you want the task to wait for job completion (e.g., PT1H to wait up to one hour).
Examples
Create a Databricks job, run it, and wait for completion for five minutes.
```yaml
id: databricks_job_create
namespace: company.team

tasks:
  - id: create_job
    type: io.kestra.plugin.databricks.job.CreateJob
    authentication:
      token: <your-token>
    host: <your-host>
    jobTasks:
      - existingClusterId: <your-cluster>
        taskKey: taskKey
        sparkPythonTask:
          pythonFile: /Shared/hello.py
          sparkPythonTaskSource: WORKSPACE
    waitForCompletion: PT5M
```
Properties
jobTasks
- Type: array
- SubType: CreateJob-JobTaskSetting
- Dynamic: ❌
- Required: ✔️
- Min items:
1
The job tasks. If multiple tasks are defined, you must set dependsOn on each task, as shown in the sketch below.
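A minimal sketch of two chained job tasks, assuming illustrative task keys, file paths, and a cluster placeholder:

```yaml
jobTasks:
  - taskKey: extract
    existingClusterId: <your-cluster>
    sparkPythonTask:
      pythonFile: /Shared/extract.py
      sparkPythonTaskSource: WORKSPACE
  - taskKey: load
    dependsOn:
      - extract
    existingClusterId: <your-cluster>
    notebookTask:
      notebookPath: /Shared/load_notebook
      source: WORKSPACE
```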
accountId
- Type: string
- Dynamic: ✔️
- Required: ❌
Databricks account identifier.
authentication
- Type: AbstractTask-AuthenticationConfig
- Dynamic: ❌
- Required: ❌
Databricks authentication configuration.
This property lets you configure authentication to Databricks; different properties should be set depending on the authentication type and the cloud provider. All configuration options can also be set using the standard Databricks environment variables. Check the Databricks authentication guide for more information, and see the sketch below.
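A minimal sketch using a personal access token, assuming the credentials are stored as Kestra secrets named DATABRICKS_TOKEN and DATABRICKS_HOST (the secret names are placeholders):

```yaml
authentication:
  token: "{{ secret('DATABRICKS_TOKEN') }}"
host: "{{ secret('DATABRICKS_HOST') }}"
```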
configFile
- Type: string
- Dynamic: ✔️
- Required: ❌
Databricks configuration file; use this if you don't want to configure each Databricks account property one by one. See the sketch below.
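A minimal sketch pointing the task at a Databricks configuration file; the path is hypothetical:

```yaml
type: io.kestra.plugin.databricks.job.CreateJob
configFile: /path/to/.databrickscfg
```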
host
- Type: string
- Dynamic: ✔️
- Required: ❌
Databricks host.
jobName
- Type: string
- Dynamic: ✔️
- Required: ❌
The name of the job.
waitForCompletion
- Type: string
- Dynamic: ❌
- Required: ❌
- Format:
duration
If set, the task waits for the job run to complete for up to this duration before timing out.
Outputs
jobId
- Type: integer
- Required: ❌
The job identifier.
jobURI
- Type: string
- Required: ❌
- Format:
uri
The job URI on the Databricks console.
runId
- Type: integer
- Required: ❌
The run identifier.
runURI
- Type: string
- Required: ❌
- Format:
uri
The run URI on the Databricks console.
Definitions
io.kestra.plugin.databricks.job.task.SqlTaskSetting
Properties
parameters
- Type:
- string
- object
- SubType: string
- Dynamic: ✔️
- Required: ❌
Map of task parameters.
Can be a map of string/string or a variable that binds to a JSON object.
queryId
- Type: string
- Dynamic: ✔️
- Required: ❌
warehouseId
- Type: string
- Dynamic: ✔️
- Required: ❌
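A sketch of a jobTasks entry using sqlTask with the properties above; the warehouse and query identifiers are placeholders and the parameter values are illustrative:

```yaml
jobTasks:
  - taskKey: run_query
    sqlTask:
      warehouseId: <your-warehouse-id>
      queryId: <your-query-id>
      parameters:
        run_date: "2024-01-01"
```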
io.kestra.plugin.databricks.job.task.NotebookTaskSetting
Properties
baseParameters
- Type:
- string
- object
- SubType: string
- Dynamic: ✔️
- Required: ❌
Map of task base parameters.
Can be a map of string/string or a variable that binds to a JSON object.
notebookPath
- Type: string
- Dynamic: ✔️
- Required: ❌
source
- Type: string
- Dynamic: ❌
- Required: ❌
- Possible Values:
GIT
WORKSPACE
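A sketch of a jobTasks entry running a workspace notebook; the notebook path and base parameters are illustrative:

```yaml
jobTasks:
  - taskKey: run_notebook
    existingClusterId: <your-cluster>
    notebookTask:
      notebookPath: /Shared/my_notebook
      source: WORKSPACE
      baseParameters:
        env: production
```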
io.kestra.plugin.databricks.job.CreateJob-JobTaskSetting
Properties
dbtTask
- Type: DbtTaskSetting
- Dynamic: ❌
- Required: ❌
DBT task settings.
dependsOn
- Type: array
- SubType: string
- Dynamic: ❌
- Required: ❌
Task dependencies; set this if multiple tasks are defined on the job.
description
- Type: string
- Dynamic: ✔️
- Required: ❌
Task description.
existingClusterId
- Type: string
- Dynamic: ✔️
- Required: ❌
The identifier of the cluster.
libraries
- Type: array
- SubType: LibrarySetting
- Dynamic: ❌
- Required: ❌
Task libraries.
notebookTask
- Type: NotebookTaskSetting
- Dynamic: ❌
- Required: ❌
Notebook task settings.
pipelineTask
- Type: PipelineTaskSetting
- Dynamic: ❌
- Required: ❌
Pipeline task settings.
pythonWheelTask
- Type: PythonWheelTaskSetting
- Dynamic: ❌
- Required: ❌
Python Wheel task settings.
sparkJarTask
- Type: SparkJarTaskSetting
- Dynamic: ❌
- Required: ❌
Spark JAR task settings.
sparkPythonTask
- Type: SparkPythonTaskSetting
- Dynamic: ❌
- Required: ❌
Spark Python task settings.
sparkSubmitTask
- Type: SparkSubmitTaskSetting
- Dynamic: ❌
- Required: ❌
Spark Submit task settings.
sqlTask
- Type: SqlTaskSetting
- Dynamic: ❌
- Required: ❌
SQL task settings.
taskKey
- Type: string
- Dynamic: ✔️
- Required: ❌
Task key.
timeoutSeconds
- Type: integer
- Dynamic: ❌
- Required: ❌
Task timeout in seconds.
io.kestra.plugin.databricks.job.task.PythonWheelTaskSetting
Properties
entryPoint
- Type: string
- Dynamic: ✔️
- Required: ❌
namedParameters
- Type:
- string
- object
- SubType: string
- Dynamic: ✔️
- Required: ❌
Map of task named parameters.
Can be a map of string/string or a variable that binds to a JSON object.
packageName
- Type: string
- Dynamic: ✔️
- Required: ❌
parameters
- Type:
- string
- array
- Dynamic: ✔️
- Required: ❌
List of task parameters.
Can be a list of strings or a variable that binds to a JSON array of strings.
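A sketch of a jobTasks entry running a Python wheel; the package name, entry point, and parameters are assumptions:

```yaml
jobTasks:
  - taskKey: run_wheel
    existingClusterId: <your-cluster>
    pythonWheelTask:
      packageName: my_package
      entryPoint: main
      namedParameters:
        env: production
```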
io.kestra.plugin.databricks.job.task.LibrarySetting-CranSetting
Properties
_package
- Type: string
- Dynamic: ✔️
- Required: ❌
repo
- Type: string
- Dynamic: ✔️
- Required: ❌
io.kestra.plugin.databricks.job.task.SparkSubmitTaskSetting
Properties
parameters
- Type:
- string
- array
- Dynamic: ✔️
- Required: ❌
List of task parameters.
Can be a list of strings or a variable that binds to a JSON array of strings.
io.kestra.plugin.databricks.AbstractTask-AuthenticationConfig
Properties
authType
- Type: string
- Dynamic: ✔️
- Required: ❌
azureClientId
- Type: string
- Dynamic: ✔️
- Required: ❌
azureClientSecret
- Type: string
- Dynamic: ✔️
- Required: ❌
azureTenantId
- Type: string
- Dynamic: ✔️
- Required: ❌
clientId
- Type: string
- Dynamic: ✔️
- Required: ❌
clientSecret
- Type: string
- Dynamic: ✔️
- Required: ❌
googleCredentials
- Type: string
- Dynamic: ✔️
- Required: ❌
googleServiceAccount
- Type: string
- Dynamic: ✔️
- Required: ❌
password
- Type: string
- Dynamic: ✔️
- Required: ❌
token
- Type: string
- Dynamic: ✔️
- Required: ❌
username
- Type: string
- Dynamic: ✔️
- Required: ❌
io.kestra.plugin.databricks.job.task.SparkPythonTaskSetting
Properties
pythonFile
- Type: string
- Dynamic: ✔️
- Required: ✔️
sparkPythonTaskSource
- Type: string
- Dynamic: ❌
- Required: ✔️
- Possible Values:
GIT
WORKSPACE
parameters
- Type:
- string
- array
- Dynamic: ✔️
- Required: ❌
List of task parameters.
Can be a list of strings or a variable that binds to a JSON array of strings.
io.kestra.plugin.databricks.job.task.PipelineTaskSetting
Properties
fullRefresh
- Type: boolean
- Dynamic: ❌
- Required: ❌
pipelineId
- Type: string
- Dynamic: ✔️
- Required: ❌
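A sketch of a jobTasks entry triggering a pipeline; the pipeline identifier is a placeholder:

```yaml
jobTasks:
  - taskKey: run_pipeline
    pipelineTask:
      pipelineId: <your-pipeline-id>
      fullRefresh: true
```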
io.kestra.plugin.databricks.job.task.LibrarySetting
Properties
cran
- Type: LibrarySetting-CranSetting
- Dynamic: ❌
- Required: ❌
egg
- Type: string
- Dynamic: ✔️
- Required: ❌
jar
- Type: string
- Dynamic: ✔️
- Required: ❌
maven
- Type: LibrarySetting-MavenSetting
- Dynamic: ❌
- Required: ❌
pypi
- Type: LibrarySetting-PypiSetting
- Dynamic: ❌
- Required: ❌
whl
- Type: string
- Dynamic: ✔️
- Required: ❌
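A sketch of a jobTasks entry installing libraries on the cluster before the task runs; the package versions and coordinates are illustrative (note the documented property name for CRAN and PyPI packages is _package):

```yaml
jobTasks:
  - taskKey: taskKey
    existingClusterId: <your-cluster>
    libraries:
      - pypi:
          _package: requests==2.31.0
      - maven:
          coordinates: com.example:my-lib:1.0.0
    sparkPythonTask:
      pythonFile: /Shared/hello.py
      sparkPythonTaskSource: WORKSPACE
```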
io.kestra.plugin.databricks.job.task.DbtTaskSetting
Properties
catalog
- Type: string
- Dynamic: ✔️
- Required: ❌
commands
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
schema
- Type: string
- Dynamic: ✔️
- Required: ❌
warehouseId
- Type: string
- Dynamic: ✔️
- Required: ❌
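A sketch of a jobTasks entry running dbt commands against a SQL warehouse; the catalog, schema, and command list are assumptions:

```yaml
jobTasks:
  - taskKey: dbt_run
    existingClusterId: <your-cluster>
    dbtTask:
      warehouseId: <your-warehouse-id>
      catalog: main
      schema: analytics
      commands:
        - dbt deps
        - dbt run
```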
io.kestra.plugin.databricks.job.task.LibrarySetting-PypiSetting
Properties
_package
- Type: string
- Dynamic: ✔️
- Required: ❌
repo
- Type: string
- Dynamic: ✔️
- Required: ❌
io.kestra.plugin.databricks.job.task.SparkJarTaskSetting
Properties
jarUri
- Type: string
- Dynamic: ✔️
- Required: ❌
mainClassName
- Type: string
- Dynamic: ✔️
- Required: ❌
parameters
- Type:
- string
- array
- Dynamic: ✔️
- Required: ❌
List of task parameters.
Can be a list of strings or a variable that binds to a JSON array of strings.
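A sketch of a jobTasks entry running a Spark JAR main class; the class name, parameters, and JAR location are illustrative:

```yaml
jobTasks:
  - taskKey: run_jar
    existingClusterId: <your-cluster>
    sparkJarTask:
      mainClassName: com.example.Main
      parameters:
        - --date
        - "2024-01-01"
    libraries:
      - jar: dbfs:/FileStore/jars/my-job.jar
```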
io.kestra.plugin.databricks.job.task.LibrarySetting-MavenSetting
Properties
coordinates
- Type: string
- Dynamic: ✔️
- Required: ❌
exclusions
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
repo
- Type: string
- Dynamic: ✔️
- Required: ❌