Create
Create an Azure Batch job with tasks.
type: "io.kestra.plugin.azure.batch.job.Create"
```yaml
id: azure_batch_job_create
namespace: company.team

# Inputs referenced by the templated commands below (the STRING type is an assumption).
inputs:
  - id: first
    type: STRING
  - id: second
    type: STRING

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
      tasks:
        - id: env
          commands:
            - 'echo t1=$ENV_STRING'
          environments:
            ENV_STRING: "{{ inputs.first }}"
        - id: echo
          commands:
            - 'echo t2={{ inputs.second }} 1>&2'
        - id: for
          commands:
            - 'for i in $(seq 10); do echo t3=$i; done'
        - id: vars
          commands:
            - echo '::{"outputs":{"extract":"'$(cat files/in/in.txt)'"}}::'
          resourceFiles:
            - httpUrl: https://unittestkt.blob.core.windows.net/tasks/***?sv=***&se=***&sr=***&sp=***&sig=***
              filePath: files/in/in.txt
        - id: output
          commands:
            - 'mkdir -p outs/child/sub'
            - 'echo 1 > outs/1.txt'
            - 'echo 2 > outs/child/2.txt'
            - 'echo 3 > outs/child/sub/3.txt'
          outputFiles:
            - outs/1.txt
          outputDirs:
            - outs/child
```
Use a container to start the task; the pool must use a `microsoft-azure-batch` publisher.
```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
      tasks:
        - id: echo
          commands:
            - 'python --version'
          containerSettings:
            imageName: python
```
The Batch service endpoint.
The job to create.
The ID of the pool.
The frequency with which the task checks whether the job is completed.
The maximum total wait duration.
If null, there is no timeout and the task is delegated to Azure Batch.
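For example, a minimal sketch of the wait behavior (assuming the two properties above are named `completionCheckInterval` and `maxDuration`; durations are ISO-8601):

```yaml
- id: create
  type: io.kestra.plugin.azure.batch.job.Create
  endpoint: https://***.francecentral.batch.azure.com
  account: <batch-account>
  accessKey: <access-key>
  poolId: <pool-id>
  completionCheckInterval: PT5S  # assumed property name: poll the job status every 5 seconds
  maxDuration: PT1H              # assumed property name: stop waiting after 1 hour
  job:
    id: <job-name>
    tasks:
      - id: hello
        commands:
          - 'echo hello'
```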
The output files' URIs in Kestra's internal storage.
The values from the output of the commands.
The URL of the container within Azure Blob Storage to which to upload the file(s).
If not using a managed identity, the URL must include a Shared Access Signature (SAS) granting write permissions to the container.
The reference to the user assigned identity to use to access Azure Blob Storage specified by `containerUrl`.
The identity must have write access to the Azure Blob Storage container.
The destination blob or virtual directory within the Azure Storage container.
If `filePattern` refers to a specific file (i.e. contains no wildcards), then `path` is the name of the blob to which to upload that file. If `filePattern` contains one or more wildcards (and therefore may match multiple files), then `path` is the name of the blob virtual directory (which is prepended to each blob name) to which to upload the file(s). If omitted, file(s) are uploaded to the root of the container with a blob name matching their file name.
The reference to the user assigned identity to use to access the Azure Container Registry instead of username and password.
The password to log into the registry server.
The registry server URL.
If omitted, the default is "docker.io".
The user name to log into the registry server.
The conditions under which the Task output file or set of files should be uploaded.
The ARM resource ID of the user assigned identity.
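Pieced together, a single output file entry could look like the following sketch; the nesting (`destination.container`, `uploadOptions`) mirrors the Azure Batch REST model and is an assumption, as are the placeholder URL, resource ID, and the `uploadCondition` spelling:

```yaml
# One OutputFile entry (nesting inferred from the Azure Batch model)
- filePattern: 'outs/*.txt'
  destination:
    container:
      containerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-with-write>
      path: results  # blob virtual directory prepended to each blob name
      identityReference:
        resourceId: <user-assigned-identity-arm-resource-id>  # alternative to a SAS
  uploadOptions:
    uploadCondition: taskCompletion  # assumed value: upload regardless of exit code
```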
The storage container name in the auto storage Account.
The `autoStorageContainerName`, `storageContainerUrl` and `httpUrl` properties are mutually exclusive, and one of them must be specified.
The blob prefix to use when downloading blobs from the Azure Storage container.
Only the blobs whose names begin with the specified prefix will be downloaded. The property is valid only when `autoStorageContainerName` or `storageContainerUrl` is used. This prefix can be a partial file name or a subdirectory. If a prefix is not specified, all the files in the container will be downloaded.
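For example, a resource file entry drawing on the auto storage account might look like this sketch (placement inside `resourceFiles` follows the examples above; the names are placeholders):

```yaml
resourceFiles:
  - autoStorageContainerName: my-container  # mutually exclusive with storageContainerUrl and httpUrl
    blobPrefix: inputs/                     # only blobs under inputs/ are downloaded
    filePath: files/in                      # target directory below the Task working directory
```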
The file permission mode attribute in octal format.
This property applies only to files being downloaded to Linux Compute Nodes. It will be ignored if it is specified for a `resourceFile` which will be downloaded to a Windows Compute Node. If this property is not specified for a Linux Compute Node, then a default value of `0770` is applied to the file.
The location on the Compute Node to which to download the file(s), relative to the Task's working directory.
If the `httpUrl` property is specified, the `filePath` is required and describes the path which the file will be downloaded to, including the file name. Otherwise, if the `autoStorageContainerName` or `storageContainerUrl` property is specified, `filePath` is optional and is the directory to download the files to. In the case where `filePath` is used as a directory, any directory structure already associated with the input data will be retained in full and appended to the specified `filePath` directory. The specified relative path cannot break out of the Task's working directory (for example by using `..`).
The URL of the file to download.
The `autoStorageContainerName`, `storageContainerUrl` and `httpUrl` properties are mutually exclusive, and one of them must be specified. If the URL points to Azure Blob Storage, it must be readable from compute nodes. There are three ways to get such a URL for a blob in Azure storage: include a Shared Access Signature (SAS) granting read permissions on the blob, use a managed identity with read permission, or set the ACL for the blob or its container to allow public access.
The reference to the user assigned identity to use to access Azure Blob Storage specified by `storageContainerUrl` or `httpUrl`.
The URL of the blob container within Azure Blob Storage.
The `autoStorageContainerName`, `storageContainerUrl` and `httpUrl` properties are mutually exclusive, and one of them must be specified. This URL must be readable and listable from compute nodes. There are three ways to get such a URL for a container in Azure storage: include a Shared Access Signature (SAS) granting read and list permissions on the container, use a managed identity with read and list permissions, or set the ACL for the container to allow public access.
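A sketch of downloading from a whole container by URL, combining the fields described above (the SAS token is a placeholder):

```yaml
resourceFiles:
  - storageContainerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-with-read-list>
    blobPrefix: data/   # optional: restrict the download to blobs under data/
    filePath: files/in  # directory to download into; blob directory structure is retained
    fileMode: '0644'    # Linux nodes only; default is 0770
```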
The Image to use to create the container in which the Task will run.
This is the full Image reference, as would be specified to `docker pull`. If no tag is provided as part of the Image name, the tag `:latest` is used as a default.
Additional options to the container create command.
These additional options are supplied as arguments to the `docker create` command, in addition to those controlled by the Batch Service.
The private registry which contains the container image.
This setting can be omitted if it was already provided at Pool creation.
The location of the container Task working directory.
The default is `taskWorkingDirectory`. Possible values include: `taskWorkingDirectory`, `containerImageDefault`.
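Taken together, a private-registry variant of the container example might look like the following sketch; the registry field names (`registryServer`, `userName`, `password`) match the descriptions earlier on this page, but their exact spelling is an assumption:

```yaml
- id: echo
  commands:
    - 'python --version'
  containerSettings:
    imageName: myregistry.azurecr.io/python:3.12     # full reference; ":latest" is the default tag
    containerRunOptions: '--env PYTHONUNBUFFERED=1'  # passed through to docker create
    workingDirectory: taskWorkingDirectory           # or containerImageDefault
    registry:                                # can be omitted if already set on the Pool
      registryServer: myregistry.azurecr.io  # assumed field name
      userName: <username>                   # assumed field name
      password: <password>                   # assumed field name
```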
The command line of the Task.
For multi-instance Tasks, the command line is executed as the primary Task, after the primary Task and all subtasks have finished executing the coordination command line. The command line does not run under a shell, and therefore cannot take advantage of shell features such as environment variable expansion. If you want to take advantage of such features, you should invoke the shell in the command line, for example, using `cmd /c MyCommand` on Windows or `/bin/sh -c MyCommand` on Linux. If the command line refers to file paths, it should use a relative path (relative to the Task working directory), or use the Batch provided environment variable. The command will be passed as `/bin/sh -c "command"` by default.
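Since each command is wrapped in `/bin/sh -c` by default, shell features such as variable expansion, pipes, and redirection work directly; for example:

```yaml
- id: shell
  commands:
    # each entry runs as /bin/sh -c "<entry>" unless another interpreter is set
    - 'echo workdir=$AZ_BATCH_TASK_WORKING_DIR'  # Batch-provided environment variable
    - 'ls -la | head -n 5'
```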
A string that uniquely identifies the Task within the Job.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within a Job that differ only by case). If not provided, a random UUID will be generated.
Interpreter to be used.
The execution constraints that apply to this Task.
The settings for the container under which the Task runs.
If the Pool that will run this Task has `containerConfiguration` set, this must be set as well. If the Pool that will run this Task doesn't have `containerConfiguration` set, this must not be set. When this is specified, all directories recursively below the `AZ_BATCH_NODE_ROOT_DIR` (the root of Azure Batch directories on the node) are mapped into the container, all Task environment variables are mapped into the container, and the Task command line is executed in the container. Files produced in the container outside of `AZ_BATCH_NODE_ROOT_DIR` might not be reflected to the host disk, meaning that Batch file APIs will not be able to access those files.
A display name for the Task.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
A list of environment variable settings for the Task.
Interpreter args to be used.
List of output directories that will be uploaded to the internal storage.
Each entry is a key that generates a temporary directory.
In the command, you can use a special variable named `outputDirs.key`.
If you add a directory with `["myDir"]`, you can write `echo 1 >> {{ outputDirs.myDir }}/file1.txt` and `echo 2 >> {{ outputDirs.myDir }}/file2.txt`, and both files will be uploaded to the internal storage. Then, you can use them in other tasks with `{{ outputs.taskId.files['myDir/file1.txt'] }}`.
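As a job task, the mechanism above looks like this sketch:

```yaml
- id: dirs
  commands:
    - 'echo 1 >> {{ outputDirs.myDir }}/file1.txt'
    - 'echo 2 >> {{ outputDirs.myDir }}/file2.txt'
  outputDirs:
    - myDir
# in a later task: {{ outputs.<taskId>.files['myDir/file1.txt'] }}
```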
List of output files that will be uploaded to the internal storage.
Each entry is a key that generates a temporary file.
In the command, you can use a special variable named `outputFiles.key`.
If you add a file with `["first"]`, you can write `echo 1 >> {{ outputFiles.first }}` in this task, and reference the file in other tasks with `{{ outputs.taskId.outputFiles.first }}`.
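For example:

```yaml
- id: files
  commands:
    - 'echo 1 >> {{ outputFiles.first }}'
  outputFiles:
    - first
# in a later task: {{ outputs.<taskId>.outputFiles.first }}
```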
A list of files that the Batch service will download to the Compute Node before running the command line.
For multi-instance Tasks, the resource files will only be downloaded to the Compute Node on which the primary Task is executed. There is a maximum size for the list of resource files. When the max size is exceeded, the request will fail and the response error code will be RequestEntityTooLarge. If this occurs, the collection of ResourceFiles must be reduced in size. This can be achieved using .zip files, Application Packages, or Docker Containers.
The destination for the output file(s).
Additional options for the upload operation, including the conditions under which to perform the upload.
A pattern indicating which file(s) to upload.
Both relative and absolute paths are supported. Relative paths are relative to the Task working directory. The following wildcards are supported: `*` matches 0 or more characters (for example, pattern `abc*` would match `abc` or `abcdef`), `**` matches any directory, `?` matches any single character, `[abc]` matches one character in the brackets, and `[a-c]` matches one character in the range. Brackets can include a negation to match any character not specified (for example, `[!abc]` matches any character but `a`, `b`, or `c`). If a file name starts with `.` it is ignored by default but may be matched by specifying it explicitly (for example, `*.gif` will not match `.a.gif`, but `.*.gif` will). A simple example: `**\*.txt` matches any file that does not start in `.` and ends with `.txt` in the Task working directory or any subdirectory. If the filename contains a wildcard character it can be escaped using brackets (for example, `abc[*]` would match a file named `abc*`). Note that both `\` and `/` are treated as directory separators on Windows, but only `/` is on Linux. Environment variables (`%var%` on Windows or `$var` on Linux) are expanded prior to the pattern being applied.
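A few illustrative values (fragments only; see the destination sketch earlier for where `filePattern` attaches):

```yaml
filePattern: '**/*.txt'    # any .txt file (not starting with '.') in the working directory or below
# filePattern: 'logs/out?.log'  # out1.log, outA.log, ... directly under logs/
# filePattern: 'abc[*]'         # the literal file named abc*
```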
A string that uniquely identifies the Job within the Account.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within an Account that differ only by case).
The display name for the Job.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
Labels to attach to the created job.
The maximum elapsed time that the Task may run, measured from the time the Task starts.
If the Task does not complete within the time limit, the Batch service terminates it. If this is not specified, there is no time limit on how long the Task may run.
The minimum time to retain the Task directory on the Compute Node where it ran, from the time it completes execution.
After this time, the Batch service may delete the Task directory and all its contents. The default is 7 days, i.e. the Task directory will be retained for 7 days unless the Compute Node is removed or the Job is deleted.
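Assuming the constraints object mirrors the Azure Batch model, a sketch (the field names `maxWallClockTime` and `retentionTime` are assumptions matching the two descriptions above; ISO-8601 durations):

```yaml
- id: bounded
  commands:
    - 'sleep 60'
  constraints:
    maxWallClockTime: PT30M  # assumed field name: Batch terminates the task after 30 minutes
    retentionTime: P7D       # assumed field name: keep the task directory for 7 days
```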
A location in Azure Blob Storage to which the files are uploaded.