Available on: >= 0.16.0

Run tasks as containers on Azure Batch VMs.

This runner will deploy the container for the task to a specified Azure Batch pool.

To launch the task on Azure Batch, there is only two main concepts you need to be aware of:

  1. Pool β€” mandatory, not created by the task. This is a pool composed of nodes where your task can run on.
  2. Job β€” created by the task runner; holds information about which image, commands, and resources to run on.

How does it work

In order to support inputFiles, namespaceFiles, and outputFiles, the AzureBatchTaskRunner currently relies on resource files and output files which transit through Azure Blob Storage.

As with ProcessTaskRunner and DockerTaskRunner, you can directly reference input, namespace and output files since you're in the same working directory: cat myFile.txt.

A full flow example

yaml
id: azure_batch_runner
namespace: dev

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.azure.runner.AzureBatchTaskRunner
      account: "{{secrets.account}}"
      accessKey: "{{secrets.accessKey}}"
      endpoint: "{{secrets.endpoint}}"
      poolId: "{{vars.poolId}}"
      blobStorage:
        containerName: "{{vars.containerName}}"
        connectionString: "{{secrets.connectionString}}"
    commands:
      - python main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from Azure Batch and kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = 'environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        if __name__ == '__main__':
          print_environment_info()

Was this page helpful?