Run a task on an on-demand Databricks cluster

Source

yaml

id: on-demand-cluster-job
namespace: company.team

tasks:
  # Provision a single-worker cluster; later tasks reuse its clusterId output.
  - id: create_cluster
    type: io.kestra.plugin.databricks.cluster.CreateCluster
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    host: "{{ secret('DATABRICKS_HOST') }}"
    clusterName: kestra-demo
    nodeTypeId: n2-highmem-4
    numWorkers: 1
    sparkVersion: 13.0.x-scala2.12

  # Wrap the job so that a failure does not stop the flow before cleanup.
  - id: allow_failure
    type: io.kestra.plugin.core.flow.AllowFailure
    tasks:
      - id: run_job
        type: io.kestra.plugin.databricks.job.CreateJob
        authentication:
          token: "{{ secret('DATABRICKS_TOKEN') }}"
        host: "{{ secret('DATABRICKS_HOST') }}"
        jobTasks:
          - existingClusterId: "{{ outputs.create_cluster.clusterId }}"
            taskKey: yourArbitraryTaskKey
            sparkPythonTask:
              pythonFile: /Shared/hello.py
              sparkPythonTaskSource: WORKSPACE
        waitForCompletion: PT5M # wait up to five minutes for the job to finish

  # Clean up the cluster; this runs even if run_job failed.
  - id: delete_cluster
    type: io.kestra.plugin.databricks.cluster.DeleteCluster
    authentication:
      token: "{{ secret('DATABRICKS_TOKEN') }}"
    host: "{{ secret('DATABRICKS_HOST') }}"
    clusterId: "{{ outputs.create_cluster.clusterId }}"

About this blueprint

This flow creates a Databricks cluster, runs a job on that cluster, and then deletes the cluster.

The Python script referenced by the pythonFile property is stored in the Databricks workspace.
The flow runs the script on the newly created cluster and waits up to five minutes (as set by the waitForCompletion property) for the job to complete.
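
If the script needs longer than five minutes, only the duration has to change. The fragment below is a sketch that assumes waitForCompletion accepts any ISO 8601 duration, as the PT5M value in the flow suggests; PT15M is an illustrative value, not a recommendation.

```yaml
# Fragment of the run_job task only, not a complete flow.
# PT15M is an assumed example value: wait up to 15 minutes instead of 5.
waitForCompletion: PT15M
```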

Even if the job fails, the AllowFailure task lets the flow continue, so the Databricks cluster is still deleted at the end.
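
The cleanup behavior can be tried without a Databricks workspace. The flow below is a minimal sketch, not part of the blueprint: the flow id and the failing_step and cleanup task ids are hypothetical, a Fail task stands in for run_job, and a Log task stands in for delete_cluster.

```yaml
id: allow-failure-sketch # hypothetical flow id
namespace: company.team

tasks:
  - id: allow_failure
    type: io.kestra.plugin.core.flow.AllowFailure
    tasks:
      # Stand-in for run_job: this task always fails.
      - id: failing_step
        type: io.kestra.plugin.core.execution.Fail

  # Stand-in for delete_cluster: still runs after the failure above.
  - id: cleanup
    type: io.kestra.plugin.core.log.Log
    message: Cleanup still runs
```

When this sketch runs, the cleanup task should execute despite the failure, and the execution is expected to end in a WARNING state rather than FAILED.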

