Execute a Spark or Python script on an existing Databricks cluster and wait for its completion

Source (YAML)

id: run-tasks-on-databricks
namespace: company.team

tasks:
  - id: submit_run
    type: io.kestra.plugin.databricks.job.SubmitRun
    host: "{{ secret('DATABRICKS_HOST') }}"
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
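    runTasks:
      - existingClusterId: abcdefgh12345678  # placeholder; use your own cluster ID
        taskKey: pysparkTask
        sparkPythonTask:
          pythonFile: /Shared/hello.py  # script stored in the Databricks workspace
          sparkPythonTaskSource: WORKSPACE
    waitForCompletion: PT5M  # ISO 8601 duration: wait up to five minutes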

  - id: log_status
    type: io.kestra.plugin.core.log.Log
    message: The job finished, all done!

About this blueprint

This flow runs a Python script on an existing Databricks cluster, identified by the cluster ID set in existingClusterId. The script referenced by the pythonFile property is stored in the Databricks workspace, as indicated by sparkPythonTaskSource: WORKSPACE. Because waitForCompletion is set to PT5M, the flow waits up to five minutes for the run to complete before moving on to the next task, which here simply writes a log message.
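The blueprint does not show the script itself. As a rough sketch only, /Shared/hello.py could be as small as the following. The contents are hypothetical: the function name build_greeting and the printed text are assumptions made for illustration, and the script deliberately avoids Spark APIs so that it runs anywhere Python is available.

```python
"""Hypothetical contents of /Shared/hello.py (not taken from the blueprint)."""
import platform


def build_greeting(name: str) -> str:
    """Return a greeting; both this helper and `name` are illustrative."""
    return f"Hello from {name}!"


def main() -> None:
    # Print a greeting and the interpreter version so the run leaves
    # some visible output behind.
    print(build_greeting("Databricks"))
    print(f"Python {platform.python_version()}")


if __name__ == "__main__":
    main()
```

A script like this finishes within seconds, well inside the five-minute waitForCompletion window; a longer-running job would likely need a larger duration.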
