Create
Create an Azure Batch job with tasks.
type: "io.kestra.plugin.azure.batch.job.Create"
```yaml
id: azure_batch_job_create
namespace: company.team

# Inputs referenced by the templated commands below (the STRING type is an assumption).
inputs:
  - id: first
    type: STRING
  - id: second
    type: STRING

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
      tasks:
        - id: env
          commands:
            - 'echo t1=$ENV_STRING'
          environments:
            ENV_STRING: "{{ inputs.first }}"
        - id: echo
          commands:
            - 'echo t2={{ inputs.second }} 1>&2'
        - id: for
          commands:
            - 'for i in $(seq 10); do echo t3=$i; done'
        - id: vars
          commands:
            - echo '::{"outputs":{"extract":"'$(cat files/in/in.txt)'"}}::'
          resourceFiles:
            - httpUrl: https://unittestkt.blob.core.windows.net/tasks/***?sv=***&se=***&sr=***&sp=***&sig=***
              filePath: files/in/in.txt
        - id: output
          commands:
            - 'mkdir -p outs/child/sub'
            - 'echo 1 > outs/1.txt'
            - 'echo 2 > outs/child/2.txt'
            - 'echo 3 > outs/child/sub/3.txt'
          outputFiles:
            - outs/1.txt
          outputDirs:
            - outs/child
```
Use a container to start the task; the pool must use a `microsoft-azure-batch` publisher.
```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
      tasks:
        - id: echo
          commands:
            - 'python --version'
          containerSettings:
            imageName: python
```
The Batch service endpoint.
The job to create.
The ID of the pool.
The frequency with which the task checks whether the job is completed.
The maximum total wait duration.
If null, there is no timeout and the task is delegated to Azure Batch.
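For example, a minimal sketch of the wait behavior (assuming the two properties above are named `completionCheckInterval` and `maxDuration`; durations are ISO-8601):

```yaml
- id: create
  type: io.kestra.plugin.azure.batch.job.Create
  endpoint: https://***.francecentral.batch.azure.com
  account: <batch-account>
  accessKey: <access-key>
  poolId: <pool-id>
  completionCheckInterval: PT5S  # assumed property name: poll the job status every 5 seconds
  maxDuration: PT1H              # assumed property name: stop waiting after 1 hour
  job:
    id: <job-name>
    tasks:
      - id: hello
        commands:
          - 'echo hello'
```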
The output files' URIs in Kestra's internal storage.
The values from the output of the commands.
The URL of the container within Azure Blob Storage to which to upload the file(s).
If not using a managed identity, the URL must include a Shared Access Signature (SAS) granting write permissions to the container.
The reference to the user assigned identity to use to access Azure Blob Storage specified by `containerUrl`.
The identity must have write access to the Azure Blob Storage container.
The destination blob or virtual directory within the Azure Storage container.
If `filePattern` refers to a specific file (i.e. contains no wildcards), then `path` is the name of the blob to which to upload that file. If `filePattern` contains one or more wildcards (and therefore may match multiple files), then `path` is the name of the blob virtual directory (which is prepended to each blob name) to which to upload the file(s). If omitted, file(s) are uploaded to the root of the container with a blob name matching their file name.
The reference to the user assigned identity to use to access the Azure Container Registry instead of username and password.
The password to log into the registry server.
The registry server URL.
If omitted, the default is "docker.io".
The user name to log into the registry server.
The conditions under which the Task output file or set of files should be uploaded.
The ARM resource ID of the user assigned identity.
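Pieced together, a single output file entry could look like the following sketch; the nesting (`destination.container`, `uploadOptions`) mirrors the Azure Batch REST model and is an assumption, as are the placeholder URL, resource ID, and the `uploadCondition` spelling:

```yaml
# One OutputFile entry (nesting inferred from the Azure Batch model)
- filePattern: 'outs/*.txt'
  destination:
    container:
      containerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-with-write>
      path: results  # blob virtual directory prepended to each blob name
      identityReference:
        resourceId: <user-assigned-identity-arm-resource-id>  # alternative to a SAS
  uploadOptions:
    uploadCondition: taskCompletion  # assumed value: upload regardless of exit code
```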
The storage container name in the auto storage Account.
The `autoStorageContainerName`, `storageContainerUrl` and `httpUrl` properties are mutually exclusive, and one of them must be specified.
The blob prefix to use when downloading blobs from the Azure Storage container.
Only the blobs whose names begin with the specified prefix will be downloaded. The property is valid only when `autoStorageContainerName` or `storageContainerUrl` is used. This prefix can be a partial file name or a subdirectory. If a prefix is not specified, all the files in the container will be downloaded.
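For example, a resource file entry drawing on the auto storage account might look like this sketch (placement inside `resourceFiles` follows the examples above; the names are placeholders):

```yaml
resourceFiles:
  - autoStorageContainerName: my-container  # mutually exclusive with storageContainerUrl and httpUrl
    blobPrefix: inputs/                     # only blobs under inputs/ are downloaded
    filePath: files/in                      # target directory below the Task working directory
```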
The file permission mode attribute in octal format.
This property applies only to files being downloaded to Linux Compute Nodes. It will be ignored if it is specified for a `resourceFile` which will be downloaded to a Windows Compute Node. If this property is not specified for a Linux Compute Node, then a default value of `0770` is applied to the file.
The location on the Compute Node to which to download the file(s), relative to the Task's working directory.
If the `httpUrl` property is specified, the `filePath` is required and describes the path which the file will be downloaded to, including the file name. Otherwise, if the `autoStorageContainerName` or `storageContainerUrl` property is specified, `filePath` is optional and is the directory to download the files to. In the case where `filePath` is used as a directory, any directory structure already associated with the input data will be retained in full and appended to the specified `filePath` directory. The specified relative path cannot break out of the Task's working directory (for example by using `..`).
The URL of the file to download.
The `autoStorageContainerName`, `storageContainerUrl` and `httpUrl` properties are mutually exclusive, and one of them must be specified. If the URL points to Azure Blob Storage, it must be readable from compute nodes. There are three ways to get such a URL for a blob in Azure storage: include a Shared Access Signature (SAS) granting read permissions on the blob, use a managed identity with read permission, or set the ACL for the blob or its container to allow public access.
The reference to the user assigned identity to use to access Azure Blob Storage specified by `storageContainerUrl` or `httpUrl`.
The URL of the blob container within Azure Blob Storage.
The `autoStorageContainerName`, `storageContainerUrl` and `httpUrl` properties are mutually exclusive, and one of them must be specified. This URL must be readable and listable from compute nodes. There are three ways to get such a URL for a container in Azure storage: include a Shared Access Signature (SAS) granting read and list permissions on the container, use a managed identity with read and list permissions, or set the ACL for the container to allow public access.
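A sketch of downloading from a whole container by URL, combining the fields described above (the SAS token is a placeholder):

```yaml
resourceFiles:
  - storageContainerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-with-read-list>
    blobPrefix: data/   # optional: restrict the download to blobs under data/
    filePath: files/in  # directory to download into; blob directory structure is retained
    fileMode: '0644'    # Linux nodes only; default is 0770
```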
The Image to use to create the container in which the Task will run.
This is the full Image reference, as would be specified to `docker pull`. If no tag is provided as part of the Image name, the tag `:latest` is used as a default.
Additional options to the container create command.
These additional options are supplied as arguments to the `docker create` command, in addition to those controlled by the Batch Service.
The private registry which contains the container image.
This setting can be omitted if it was already provided at Pool creation.
The location of the container Task working directory.
The default is `taskWorkingDirectory`. Possible values include: `taskWorkingDirectory`, `containerImageDefault`.
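Taken together, a private-registry variant of the container example might look like the following sketch; the registry field names (`registryServer`, `userName`, `password`) match the descriptions earlier on this page, but their exact spelling is an assumption:

```yaml
- id: echo
  commands:
    - 'python --version'
  containerSettings:
    imageName: myregistry.azurecr.io/python:3.12     # full reference; ":latest" is the default tag
    containerRunOptions: '--env PYTHONUNBUFFERED=1'  # passed through to docker create
    workingDirectory: taskWorkingDirectory           # or containerImageDefault
    registry:                                # can be omitted if already set on the Pool
      registryServer: myregistry.azurecr.io  # assumed field name
      userName: <username>                   # assumed field name
      password: <password>                   # assumed field name
```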
The command line of the Task.
For multi-instance Tasks, the command line is executed as the primary Task, after the primary Task and all subtasks have finished executing the coordination command line. The command line does not run under a shell, and therefore cannot take advantage of shell features such as environment variable expansion. If you want to take advantage of such features, you should invoke the shell in the command line, for example, using `cmd /c MyCommand` on Windows or `/bin/sh -c MyCommand` on Linux. If the command line refers to file paths, it should use a relative path (relative to the Task working directory), or use the Batch provided environment variable. The command will be passed as `/bin/sh -c "command"` by default.
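Since each command is wrapped in `/bin/sh -c` by default, shell features such as variable expansion, pipes, and redirection work directly; for example:

```yaml
- id: shell
  commands:
    # each entry runs as /bin/sh -c "<entry>" unless another interpreter is set
    - 'echo workdir=$AZ_BATCH_TASK_WORKING_DIR'  # Batch-provided environment variable
    - 'ls -la | head -n 5'
```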
A string that uniquely identifies the Task within the Job.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within a Job that differ only by case). If not provided, a random UUID will be generated.
Interpreter to be used.
The execution constraints that apply to this Task.
The settings for the container under which the Task runs.
If the Pool that will run this Task has `containerConfiguration` set, this must be set as well. If the Pool that will run this Task doesn't have `containerConfiguration` set, this must not be set. When this is specified, all directories recursively below the `AZ_BATCH_NODE_ROOT_DIR` (the root of Azure Batch directories on the node) are mapped into the container, all Task environment variables are mapped into the container, and the Task command line is executed in the container. Files produced in the container outside of `AZ_BATCH_NODE_ROOT_DIR` might not be reflected to the host disk, meaning that Batch file APIs will not be able to access those files.
A display name for the Task.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
A list of environment variable settings for the Task.
Interpreter args to be used.
List of output directories that will be uploaded to the internal storage.
Each entry is a key that generates a temporary directory.
In the command, you can use a special variable named `outputDirs.key`.
If you add a directory with `["myDir"]`, you can write `echo 1 >> {{ outputDirs.myDir }}/file1.txt` and `echo 2 >> {{ outputDirs.myDir }}/file2.txt`, and both files will be uploaded to the internal storage. Then, you can use them in other tasks with `{{ outputs.taskId.files['myDir/file1.txt'] }}`.
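As a job task, the mechanism above looks like this sketch:

```yaml
- id: dirs
  commands:
    - 'echo 1 >> {{ outputDirs.myDir }}/file1.txt'
    - 'echo 2 >> {{ outputDirs.myDir }}/file2.txt'
  outputDirs:
    - myDir
# in a later task: {{ outputs.<taskId>.files['myDir/file1.txt'] }}
```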
List of output files that will be uploaded to the internal storage.
Each entry is a key that generates a temporary file.
In the command, you can use a special variable named `outputFiles.key`.
If you add a file with `["first"]`, you can write `echo 1 >> {{ outputFiles.first }}` in this task, and reference the file in other tasks with `{{ outputs.taskId.outputFiles.first }}`.
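For example:

```yaml
- id: files
  commands:
    - 'echo 1 >> {{ outputFiles.first }}'
  outputFiles:
    - first
# in a later task: {{ outputs.<taskId>.outputFiles.first }}
```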
A list of files that the Batch service will download to the Compute Node before running the command line.
For multi-instance Tasks, the resource files will only be downloaded to the Compute Node on which the primary Task is executed. There is a maximum size for the list of resource files. When the max size is exceeded, the request will fail and the response error code will be RequestEntityTooLarge. If this occurs, the collection of ResourceFiles must be reduced in size. This can be achieved using .zip files, Application Packages, or Docker Containers.
The destination for the output file(s).
Additional options for the upload operation, including the conditions under which to perform the upload.
A pattern indicating which file(s) to upload.
Both relative and absolute paths are supported. Relative paths are relative to the Task working directory. The following wildcards are supported: `*` matches 0 or more characters (for example, pattern `abc*` would match `abc` or `abcdef`), `**` matches any directory, `?` matches any single character, `[abc]` matches one character in the brackets, and `[a-c]` matches one character in the range. Brackets can include a negation to match any character not specified (for example, `[!abc]` matches any character but `a`, `b`, or `c`). If a file name starts with `.` it is ignored by default but may be matched by specifying it explicitly (for example, `*.gif` will not match `.a.gif`, but `.*.gif` will). A simple example: `**\*.txt` matches any file that does not start in `.` and ends with `.txt` in the Task working directory or any subdirectory. If the filename contains a wildcard character it can be escaped using brackets (for example, `abc[*]` would match a file named `abc*`). Note that both `\` and `/` are treated as directory separators on Windows, but only `/` is on Linux. Environment variables (`%var%` on Windows or `$var` on Linux) are expanded prior to the pattern being applied.
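A few illustrative values (fragments only; see the destination sketch earlier for where `filePattern` attaches):

```yaml
filePattern: '**/*.txt'    # any .txt file (not starting with '.') in the working directory or below
# filePattern: 'logs/out?.log'  # out1.log, outA.log, ... directly under logs/
# filePattern: 'abc[*]'         # the literal file named abc*
```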
A string that uniquely identifies the Job within the Account.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within an Account that differ only by case).
The display name for the Job.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
Labels to attach to the created job.
The maximum elapsed time that the Task may run, measured from the time the Task starts.
If the Task does not complete within the time limit, the Batch service terminates it. If this is not specified, there is no time limit on how long the Task may run.
The minimum time to retain the Task directory on the Compute Node where it ran, from the time it completes execution.
After this time, the Batch service may delete the Task directory and all its contents. The default is 7 days, i.e. the Task directory will be retained for 7 days unless the Compute Node is removed or the Job is deleted.
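Assuming the constraints object mirrors the Azure Batch model, a sketch (the field names `maxWallClockTime` and `retentionTime` are assumptions matching the two descriptions above; ISO-8601 durations):

```yaml
- id: bounded
  commands:
    - 'sleep 60'
  constraints:
    maxWallClockTime: PT30M  # assumed field name: Batch terminates the task after 30 minutes
    retentionTime: P7D       # assumed field name: keep the task directory for 7 days
```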
A location in Azure Blob Storage to which the files are uploaded.