# Errors and retries

# Error handling

Errors are special branches of your flow where you can define how to handle task failures.

Two kinds of error handlers can be defined:

  • Global: an error handler that applies to the whole flow. It must be declared at the root of the flow.
  • Local: an error handler attached to a Flowable Task. It handles errors raised by that task and its children.

# Global Error Handler

This example flow has a single Bash task that fails immediately.
The global error handler is then called, so the task named 2nd runs.

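id: errors
namespace: io.kestra.tests

tasks:
  # This task always fails, which triggers the global error handler.
  - id: failed
    type: io.kestra.core.tasks.scripts.Bash
    commands:
      - exit 1
# Global error handler: declared at the root of the flow.
errors:
  - id: 2nd
    type: io.kestra.core.tasks.debugs.Echo
    format: I'm failing {{task.id}}
    level: INFO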

# Local Error Handler

In this flow example, the error branch will be used only if a child of the task t2 has an error. If the task t1 failed, the error branch would not be used.

This is useful for restricting error handling to a specific part of the flow and running dedicated tasks there, such as resource cleanup.

id: errors
namespace: io.kestra.tests

tasks:
  - id: parent-seq
    type: io.kestra.core.tasks.flows.Sequential
    tasks:
      - id: t1
        type: io.kestra.core.tasks.debugs.Return
        format: "{{task.id}} > {{taskrun.startDate}}"
      - id: t2
        type: io.kestra.core.tasks.flows.Sequential
        tasks:
          - id: t2-t1
            type: io.kestra.core.tasks.scripts.Bash
            commands:
              - 'exit 1'
        # Local error handler: only covers t2 and its children.
        errors:
          - id: error-t1
            type: io.kestra.core.tasks.debugs.Return
            format: "Error Trigger ! {{task.id}}"

# Retries

Kestra provides a task retry feature. It lets you define, in the flow description, how any failed task run is retried.

A retry on a task run will create a new task attempt.

# Example

The following example defines a retry for the retry-sample task with a maximum of 5 attempts, one every 15 minutes:

- id: retry-sample
  type: io.kestra.core.tasks.debugs.Echo
  format: my output for task {{task.id}}
  timeout: PT10M
  retry:
    maxAttempt: 5
    type: constant
    interval: PT15M

# Retry options for all retry types

| name | type | description |
| --- | --- | --- |
| type | string | Retry behavior to apply. Can be one of constant, exponential, random. |
| maxAttempt | integer | Number of retries performed before the system stops retrying. |
| maxDuration | Duration | Maximum total duration during which the task is retried. Once it has elapsed, the task is no longer retried. |
| warningOnRetry | Boolean | Flag the execution as warning if any retry was done on this task. Default false. |
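
As a rough illustration, the fragment below combines these common options with a constant retry. The task id retry-with-limits and all values are made up for this sketch; only the property names come from the table above.

```yaml
# Hypothetical task: the id and the values are illustrative only.
- id: retry-with-limits
  type: io.kestra.core.tasks.debugs.Echo
  format: my output for task {{task.id}}
  retry:
    type: constant        # one of constant, exponential, random
    interval: PT1M        # wait 1 minute between retries
    maxAttempt: 3         # stop retrying once this limit is reached
    maxDuration: PT10M    # also stop retrying once 10 minutes have elapsed
    warningOnRetry: true  # flag the execution as warning if a retry happened
```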

# Duration

Some of the options above take a duration.
Durations are expressed as ISO 8601 durations. Here are some examples:

| name | description |
| --- | --- |
| PT0.250S | 250 milliseconds delay |
| PT2S | 2 seconds delay |
| PT1M | 1 minute delay |
| PT3.5H | 3 and a half hours delay |

# Retry types

# constant

This establishes constant retry times: if the interval property is set to 10 minutes, it retries every 10 minutes.

| name | type | description |
| --- | --- | --- |
| interval | Duration | Duration between each retry. |

# exponential

This establishes retry behavior that waits longer after each retry, e.g. 1s, 5s, 15s, ...

| name | type | description |
| --- | --- | --- |
| interval | Duration | Duration between each retry. |
| maxInterval | Duration | Maximum duration between each retry. |
| delayFactor | Double | Multiplier applied to the interval between retries. Default is 2. For example, with an interval of 30s and a delay factor of 2, retries happen at 30s, 1m30s, 3m30s, ... |
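
A minimal sketch of an exponential retry, assuming a hypothetical task id retry-exponential and made-up values; the property names are those listed in the table above. Going by the delayFactor example, the waits here would presumably be 30s, 1m, 2m, and so on, without exceeding maxInterval.

```yaml
# Hypothetical task: the id and the values are illustrative only.
- id: retry-exponential
  type: io.kestra.core.tasks.debugs.Echo
  format: my output for task {{task.id}}
  retry:
    type: exponential
    interval: PT30S      # wait before the first retry
    delayFactor: 2       # each following wait is multiplied by this factor
    maxInterval: PT10M   # upper bound on the wait between two retries
    maxAttempt: 5
```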

# random

This establishes retries with a random delay within minimum and maximum limits.

| name | type | description |
| --- | --- | --- |
| minInterval | Duration | Minimum duration between each retry. |
| maxInterval | Duration | Maximum duration between each retry. |
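
A random retry could be sketched as follows. The task id retry-random and the values are assumptions made for this example; the property names come from the table above.

```yaml
# Hypothetical task: the id and the values are illustrative only.
- id: retry-random
  type: io.kestra.core.tasks.debugs.Echo
  format: my output for task {{task.id}}
  retry:
    type: random
    minInterval: PT5S    # shortest possible wait between retries
    maxInterval: PT1M    # longest possible wait between retries
    maxAttempt: 5
```

With these values, each wait would be picked somewhere between 5 seconds and 1 minute.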