Execute a Spark or Python script on an existing Databricks cluster and wait for its completion
Source
```yaml
id: run-tasks-on-databricks
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    runTasks:
      - existingClusterId: abcdefgh12345678
        taskKey: pysparkTask
        sparkPythonTask:
          pythonFile: /Shared/hello.py
          sparkPythonTaskSource: WORKSPACE
    waitForCompletion: PT5M

  - id: log_status
    type: io.kestra.plugin.core.log.Log
    message: The job finished, all done!
```
About this blueprint
This flow runs a task on an existing Databricks cluster identified by its cluster ID. The Python script referenced in the pythonFile property is stored in the Databricks workspace (sparkPythonTaskSource: WORKSPACE).
The flow submits the script to the Databricks cluster and waits up to five minutes (the ISO-8601 duration PT5M set on waitForCompletion) for the task to finish before running the next task, which simply prints a log message to the console.
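If the script accepts command-line arguments or needs more time to finish, the same task can be adjusted. The sketch below is a variation on the blueprint, not part of it: it assumes the sparkPythonTask setting also accepts a parameters list (mirroring the spark_python_task.parameters field of the Databricks Jobs API) and raises the completion timeout to thirty minutes.

```yaml
id: run-tasks-on-databricks-with-args
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    runTasks:
      - existingClusterId: abcdefgh12345678
        taskKey: pysparkTask
        sparkPythonTask:
          pythonFile: /Shared/hello.py
          sparkPythonTaskSource: WORKSPACE
          # Assumed property: forwarded to the script as command-line
          # arguments, as in the Databricks Jobs API spark_python_task.parameters
          parameters:
            - "--run-date"
            - "{{ execution.startDate }}"
    # ISO-8601 duration: wait up to thirty minutes instead of five
    waitForCompletion: PT30M

  - id: log_status
    type: io.kestra.plugin.core.log.Log
    message: The job finished, all done!
```

As in the original flow, the Databricks host and token are read from Kestra secrets rather than hard-coded in the flow definition.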