# Task Retries in Kestra – Handle Transient Failures

Retries handle transient failures in your workflows.

Retries are defined at the task level. You configure how many times a task is retried and how long to wait between attempts.

<div class="video-container">
  <iframe src="https://www.youtube.com/embed/_MS14PNxtjg?si=-dqo21ljXdAw-C17" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div>

Retries let you automatically rerun failed tasks. Each retry creates a new task run attempt, based on the retry configuration defined in the flow.

### Example

This task retries up to 5 times with a 15-minute interval between attempts. The `timeout` limits each individual attempt to 10 minutes:

```yaml
- id: retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  timeout: PT10M
  retry:
    type: constant
    maxAttempts: 5
    interval: PT15M
```

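In this example, the task fails four times, is retried every 0.25 seconds, and succeeds on the fifth attempt. The shell command reads `{{ taskrun.attemptsCount }}` to decide when to succeed, so the `errors` branch never runs. Because `warningOnRetry` is `true`, the execution is marked as `WARNING`: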

```yaml
id: retry
namespace: company.team
description: This flow retries 4 times and succeeds on the 5th attempt

tasks:
- id: failed
  type: io.kestra.plugin.scripts.shell.Commands
  taskRunner:
    type: io.kestra.plugin.core.runner.Process
  commands:
  - 'if [ "{{taskrun.attemptsCount}}" -eq 4 ]; then exit 0; else exit 1; fi'
  retry:
    type: constant
    interval: PT0.25S
    maxAttempts: 5
    maxDuration: PT1M
    warningOnRetry: true

errors:
  - id: never_happen
    type: io.kestra.plugin.core.debug.Return
    format: Never happened {{task.id}}
```

### Timeout vs. Max Retry Duration

- **`timeout`**: Maximum duration for a single task attempt (initial or retry). If exceeded, the attempt fails.
- **`retry.maxDuration`**: Maximum total time allowed for the task, including all attempts and delays. Once exceeded, retries stop.

**Example**: With `timeout: PT10M` and `retry.maxDuration: PT30M`:
- Each attempt can last up to 10 minutes.
- Retries stop once 30 minutes have elapsed in total, counting all attempts and the delays between them.

⚠️ Ensure `retry.interval` is smaller than `maxDuration`, or retries may not run.
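
As a minimal sketch of how the two settings combine, the task below reuses the `Log` task from the first example with `timeout: PT10M` and `retry.maxDuration: PT30M`. The task id and the 1-minute interval are illustrative assumptions, not recommended values:

```yaml
- id: timeout_vs_max_duration   # hypothetical task id
  type: io.kestra.plugin.core.log.Log
  message: attempt for task {{task.id}}
  timeout: PT10M          # each attempt (initial or retry) may run for at most 10 minutes
  retry:
    type: constant
    interval: PT1M        # illustrative value, kept well below maxDuration
    maxAttempts: 5
    maxDuration: PT30M    # no further retries once 30 minutes have elapsed in total
```

With these values, retries should end at whichever limit is reached first: `maxAttempts` or `maxDuration`.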

### Retry options

| Name             | Type       | Description |
|------------------|------------|-------------|
| `type`           | string     | Retry strategy: `constant`, `exponential`, or `random`. |
| `maxAttempts`    | integer    | Number of retry attempts before stopping. |
| `maxDuration`    | Duration   | Maximum total time for the task, across all attempts. |
| `warningOnRetry` | Boolean    | Marks execution as `WARNING` if retries occurred (default: false). |

### Duration format

Durations use [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601#Durations) format (weeks, months, years not supported). Examples:

| Value    | Description |
|----------|-------------|
| PT0.25S  | 250 ms |
| PT2S     | 2 seconds |
| PT1M     | 1 minute |
| PT3H30M  | 3 hours, 30 minutes |
| P6DT4H   | 6 days, 4 hours |

## Retry types

### `constant`

Retries at fixed intervals. Example: with `interval: PT10M`, retries occur every 10 minutes.

| Name       | Type     | Description |
|------------|----------|-------------|
| `interval` | Duration | Delay between attempts. |

### `exponential`

Wait time increases after each retry (e.g., 30s, 1m, 2m, ...).

| Name          | Type     | Description |
|---------------|----------|-------------|
| `interval`    | Duration | Base interval between attempts. |
| `maxInterval` | Duration | Maximum interval allowed. |
| `delayFactor` | Double   | Multiplier (default: 2). Example: interval 30s → 30s, 1m, 2m, 4m... |
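
The options in the table can be combined as in the following sketch. The task id and the specific values are assumptions chosen for illustration:

```yaml
- id: exponential_retry_sample   # hypothetical task id
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  retry:
    type: exponential
    interval: PT30S       # base delay before the first retry
    maxInterval: PT5M     # upper bound on the delay between attempts
    delayFactor: 2        # the default multiplier, written out for clarity
    maxAttempts: 6
```

Assuming each delay is the previous one multiplied by `delayFactor` and capped at `maxInterval`, the waits would grow as 30s, 1m, 2m, 4m and then stay at 5m.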

### `random`

Randomized delays within bounds.

| Name          | Type     | Description |
|---------------|----------|-------------|
| `minInterval` | Duration | Minimum delay. |
| `maxInterval` | Duration | Maximum delay. |
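
A minimal sketch of a `random` retry, using only the options listed above. The task id and the bounds are illustrative assumptions:

```yaml
- id: random_retry_sample   # hypothetical task id
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  retry:
    type: random
    minInterval: PT1S     # shortest possible delay between attempts
    maxInterval: PT10S    # longest possible delay between attempts
    maxAttempts: 3
```

Randomized delays can help when many tasks fail at the same moment, since they are less likely to all retry at once.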

## Configuring retries globally

You can configure retries globally for all tasks in Kestra:

```yaml
kestra:
  plugins:
    configurations:
      - type: io.kestra
        values:
          retry:
            type: constant
            maxAttempts: 3
            interval: PT30S
```

This applies a constant retry policy with up to 3 attempts every 30 seconds.

## Flow-level retries

You can retry at the flow level, restarting either the entire execution or just failed tasks. Options:

1. `CREATE_NEW_EXECUTION`: Start a new execution.
2. `RETRY_FAILED_TASK`: Retry only the failed task.

```yaml
id: flow_level_retry
namespace: company.team

retry:
  maxAttempts: 3
  behavior: CREATE_NEW_EXECUTION # or RETRY_FAILED_TASK
  type: constant
  interval: PT1S

tasks:
  - id: fail_1
    type: io.kestra.plugin.core.execution.Fail
    allowFailure: true

  - id: fail_2
    type: io.kestra.plugin.core.execution.Fail
```

- With `CREATE_NEW_EXECUTION`, the **execution attempt** increases.
- With `RETRY_FAILED_TASK`, only the **task run attempt** increases.

:::alert{type="info"}
Flow-level retries also restart Subflows as new executions.
:::
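
For comparison, here is a sketch of the same kind of flow using `RETRY_FAILED_TASK`. The flow id, task ids, and log message are hypothetical; the task types are the ones already used on this page:

```yaml
id: flow_level_retry_failed_task   # hypothetical flow id
namespace: company.team

retry:
  maxAttempts: 3
  behavior: RETRY_FAILED_TASK   # stay in the same execution and retry only the failed task
  type: constant
  interval: PT1S

tasks:
  - id: hello
    type: io.kestra.plugin.core.log.Log
    message: this task succeeds, so it is not expected to be retried

  - id: fail
    type: io.kestra.plugin.core.execution.Fail
```

Under this behavior, only the task run attempt of `fail` should increase, while the execution itself stays the same.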

## Retry vs. Restart vs. Replay

### Automatic vs. manual

- **Retry**: Automatic rerun of failed tasks within the same execution.
- **Restart**: Manual rerun of failed tasks within the same execution.
- **Replay**: Manual rerun from any point, creating a new execution.

![replay_restart.png](./replay_restart.png)

### Restart vs. Replay

- **Restart**: Reruns only failed tasks in the same execution.
- **Replay**: Starts a new execution from a chosen task, with a new execution ID. Outputs of previous tasks are reused from cache if needed. For details, see the [Replay documentation](../../06.concepts/10.replay/index.md).

![replay.png](./replay.png)

Replays can start from successful or failed tasks but always create a new execution. Restarts keep the same execution ID.

After a Replay, the `Original Execution` field shows which execution the new run was replayed from:

![original_execution.png](./original_execution.png)

### Summary

| Concept | Scope              | Trigger   | New execution? |
|---------|--------------------|-----------|----------------|
| Retry   | Task or flow level | Automatic | No, unless the flow-level `behavior` is `CREATE_NEW_EXECUTION` |
| Restart | Flow level         | Manual    | No |
| Replay  | Flow or task level | Manual    | Yes |