Submit a Databricks run.

Optionally, set waitForCompletion to the maximum duration to wait for the run to complete.

yaml
type: "io.kestra.plugin.databricks.job.SubmitRun"

Submit a Databricks run and wait up to 5 minutes for its completion.

yaml
id: databricks_job_submit_run
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    runTasks:
      - existingClusterId: <your-cluster>
        taskKey: pysparkTask
        sparkPythonTask:
          pythonFile: /Shared/hello.py
          sparkPythonTaskSource: WORKSPACE
    waitForCompletion: PT5M
Properties

runTasks
Min items: 1

The run tasks. If multiple tasks are defined, you must set dependsOn on each task.

Definitions
dependsOn (array)
  SubType: string

Task dependencies. Set this if multiple tasks are defined on the run.

existingClusterId (string)
libraries (array)

Task libraries.

  cran
    _package (string)
    repo (string)
  egg (string)
  jar (string)
  maven
    coordinates (string)
    exclusions (array)
      SubType: string
    repo (string)
  pypi
    _package (string)
    repo (string)
  whl (string)
notebookTask

Notebook task settings.

  baseParameters (string or object)
    SubType: string

    Map of task base parameters.

  notebookPath (string)
  source (string)
    Possible values: GIT, WORKSPACE
pipelineTask

Pipeline task settings.

  fullRefresh (boolean or string)
  pipelineId (string)
pythonWheelTask

Python Wheel task settings.

  entryPoint (string)
  namedParameters (string or object)
    SubType: string

    Map of task named parameters.

    Can be a map of string/string or a variable that binds to a JSON object.

  packageName (string)
  parameters (string or array)
runJobTask

Run job task settings.

  jobId (string)
  jobParameters (object)
sparkJarTask

Spark JAR task settings.

  jarUri (string)
  mainClassName (string)
  parameters (string or array)
sparkPythonTask

Spark Python task settings.

  pythonFile (string, required)
  sparkPythonTaskSource (string, required)
    Possible values: GIT, WORKSPACE
  parameters (string or array)
sparkSubmitTask

Spark Submit task settings.

  parameters (string or array)

    List of task parameters.

    Can be a list of strings or a variable that binds to a JSON array of strings.

taskKey (string)
timeoutSeconds (integer)
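
The run task definitions above can be combined into a multi-task run. The following is a minimal sketch, not a verified configuration: the task keys (prepare, report), the notebook and script paths, the PyPI package, and the parameter values are all hypothetical. It also assumes that an empty dependsOn list is accepted for a task with no upstream dependency, which the description above does not confirm.

```yaml
id: databricks_multi_task_run
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    runTasks:
      # First task: a workspace notebook with one PyPI library (hypothetical names).
      - taskKey: prepare
        existingClusterId: <your-cluster>
        # Assumption: an empty list satisfies "set dependsOn on each task".
        dependsOn: []
        libraries:
          - pypi:
              _package: requests
        notebookTask:
          notebookPath: /Shared/prepare
          source: WORKSPACE
          baseParameters:
            env: dev
      # Second task: runs only after "prepare", referenced by its taskKey.
      - taskKey: report
        existingClusterId: <your-cluster>
        dependsOn:
          - prepare
        sparkPythonTask:
          pythonFile: /Shared/report.py
          sparkPythonTaskSource: WORKSPACE
          parameters:
            - "--env"
            - "dev"
        timeoutSeconds: 600
    waitForCompletion: PT10M
```

Each entry sets exactly one task type (notebookTask or sparkPythonTask here), and dependsOn refers to other entries by their taskKey.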

Databricks account identifier.

authentication

Databricks authentication configuration.

This property configures authentication to Databricks. Which properties to set depends on the authentication type and the cloud provider. All configuration options can also be set using the standard Databricks environment variables. See the Databricks authentication guide for more information.

Definitions
authType (string)
azureClientId (string)
azureClientSecret (string)
azureTenantId (string)
clientId (string)
clientSecret (string)
googleCredentials (string)
googleServiceAccount (string)
password (string)
token (string)
username (string)
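
As an illustration of setting different properties per authentication type, the sketch below swaps the token used in the earlier example for the three azure* properties listed above. It assumes these three properties are sufficient for an Azure service principal and that authType can be left unset; neither is confirmed here, so check the Databricks authentication guide. The secret names are hypothetical.

```yaml
id: databricks_azure_auth
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      # Hypothetical secret names; replace with the ones defined in your instance.
      azureTenantId: "{{ secret('AZURE_TENANT_ID') }}"
      azureClientId: "{{ secret('AZURE_CLIENT_ID') }}"
      azureClientSecret: "{{ secret('AZURE_CLIENT_SECRET') }}"
    runTasks:
      - existingClusterId: <your-cluster>
        taskKey: pysparkTask
        sparkPythonTask:
          pythonFile: /Shared/hello.py
          sparkPythonTaskSource: WORKSPACE
```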

Databricks configuration file. Use this if you don't want to configure each Databricks account property one by one.

host

Databricks host.

The name of the run.

waitForCompletion
Format: duration

If set, the task waits up to this duration for the run to complete.

The run identifier.

Format: uri

The run URI on the Databricks console.
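
The run identifier and run URI can be read by a downstream task. The sketch below is hedged on two points: the output names runId and runURI are assumptions, since the names are not shown in this text, and the io.kestra.plugin.core.log.Log task type assumes a recent Kestra version.

```yaml
id: databricks_submit_and_log
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    runTasks:
      - existingClusterId: <your-cluster>
        taskKey: pysparkTask
        sparkPythonTask:
          pythonFile: /Shared/hello.py
          sparkPythonTaskSource: WORKSPACE
    waitForCompletion: PT5M

  # Assumed output names: runId and runURI.
  - id: log_run
    type: io.kestra.plugin.core.log.Log
    message: "Databricks run {{ outputs.submit_run.runId }}: {{ outputs.submit_run.runURI }}"
```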