Retries handle transient failures in your workflows.

Retries are defined at the task level. You can configure how many times a task is retried and how long to wait between attempts.

What are retries

Retries let you automatically rerun failed tasks. Each retry creates a new task run attempt, based on the retry configuration defined in the flow.

Example

This task retries up to 5 times with a 15-minute interval between attempts:

```yaml
- id: retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  timeout: PT10M
  retry:
    type: constant
    maxAttempts: 5
    interval: PT15M
```

In this example, the task fails on its first four attempts, with retries 0.25 seconds apart, and succeeds on the fifth. The command checks {{ taskrun.attemptsCount }}, which reaches 4 on the fifth attempt:

```yaml
id: retry
namespace: company.team
description: This flow retries 4 times and succeeds on the 5th attempt

tasks:
  - id: failed
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - 'if [ "{{taskrun.attemptsCount}}" -eq 4 ]; then exit 0; else exit 1; fi'
    retry:
      type: constant
      interval: PT0.25S
      maxAttempts: 5
      maxDuration: PT1M
      warningOnRetry: true

errors:
  - id: never_happen
    type: io.kestra.plugin.core.debug.Return
    format: Never happened {{task.id}}
```

Timeout vs. Max Retry Duration

  • timeout: Maximum duration for a single task attempt (initial or retry). If exceeded, the attempt fails.
  • retry.maxDuration: Maximum total time allowed for the task, including all attempts and delays. Once exceeded, retries stop.

Example: With timeout: PT10M and retry.maxDuration: PT30M:

  • Each attempt (initial or retry) can last up to 10 minutes.
  • Retries stop once 30 minutes have elapsed in total, counting all attempts and the delays between them.

⚠️ Ensure retry.interval is smaller than maxDuration, or retries may not run.
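
As a minimal sketch of how the two limits combine, the fragment below reuses the Log task from the first example. The task id and the specific values are illustrative assumptions, not recommendations.

```yaml
# Illustrative values only; the task id `bounded_retry` is hypothetical.
- id: bounded_retry
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  timeout: PT10M          # each attempt (initial or retry) may run for at most 10 minutes
  retry:
    type: constant
    interval: PT1M        # wait between attempts, kept well below maxDuration
    maxAttempts: 5
    maxDuration: PT30M    # retries stop once 30 minutes have elapsed in total
```

With these values, retrying should end at whichever limit is reached first: maxAttempts or maxDuration.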

Retry options

| Name | Type | Description |
| --- | --- | --- |
| type | string | Retry strategy: constant, exponential, or random. |
| maxAttempts | integer | Number of retry attempts before stopping. |
| maxDuration | Duration | Maximum total time for the task, across all attempts. |
| warningOnRetry | Boolean | Marks execution as WARNING if retries occurred (default: false). |

Duration format

Durations use ISO 8601 format (weeks, months, years not supported). Examples:

| Value | Description |
| --- | --- |
| PT0.25S | 250 ms |
| PT2S | 2 seconds |
| PT1M | 1 minute |
| PT3.5H | 3 hours, 30 minutes |
| P6DT4H | 6 days, 4 hours |

Retry types

constant

Retries at fixed intervals. Example: with interval: PT10M, retries occur every 10 minutes.

| Name | Type | Description |
| --- | --- | --- |
| interval | Duration | Delay between attempts. |

exponential

Wait time increases after each retry (e.g., 30s, 1m, 2m, ...).

| Name | Type | Description |
| --- | --- | --- |
| interval | Duration | Base interval between attempts. |
| maxInterval | Duration | Maximum interval allowed. |
| delayFactor | Double | Multiplier (default: 2). Example: with interval PT30S, waits are 30s, 1m, 2m, 4m, ... |
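
A minimal sketch of an exponential policy, built only from the options listed above. The task id, the Log task, and the chosen values are assumptions for illustration.

```yaml
# Hypothetical task; only the retry block matters here.
- id: exponential_retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  retry:
    type: exponential
    interval: PT30S       # base delay before the first retry
    delayFactor: 2        # the documented default, written out for clarity
    maxInterval: PT5M     # upper bound on any single delay
    maxAttempts: 6
```

Following the sequence described above, the waits would grow as 30s, 1m, 2m, 4m; later waits should not exceed the 5-minute maxInterval.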

random

Randomized delays within bounds.

| Name | Type | Description |
| --- | --- | --- |
| minInterval | Duration | Minimum delay. |
| maxInterval | Duration | Maximum delay. |
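
A minimal sketch of a random policy using the two bounds above. The task id, the Log task, and the values are hypothetical and chosen only for illustration.

```yaml
# Hypothetical task; only the retry block matters here.
- id: random_retry_sample
  type: io.kestra.plugin.core.log.Log
  message: my output for task {{task.id}}
  retry:
    type: random
    minInterval: PT5S     # shortest possible wait between attempts
    maxInterval: PT30S    # longest possible wait between attempts
    maxAttempts: 4
```

Randomized waits are commonly chosen so that many tasks failing at the same moment do not all retry at the same instant.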

Configuring retries globally

You can configure retries globally for all tasks in Kestra:

```yaml
kestra:
  plugins:
    configurations:
      - type: io.kestra
        values:
          retry:
            type: constant
            maxAttempts: 3
            interval: PT30S
```

This applies a constant retry policy to all tasks: up to 3 attempts, 30 seconds apart.


Flow-level retries

You can retry at the flow level, restarting either the entire execution or just failed tasks. Options:

  1. CREATE_NEW_EXECUTION: Start a new execution.
  2. RETRY_FAILED_TASK: Retry only the failed task.

```yaml
id: flow_level_retry
namespace: company.team

retry:
  maxAttempts: 3
  behavior: CREATE_NEW_EXECUTION # or RETRY_FAILED_TASK
  type: constant
  interval: PT1S

tasks:
  - id: fail_1
    type: io.kestra.plugin.core.execution.Fail
    allowFailure: true

  - id: fail_2
    type: io.kestra.plugin.core.execution.Fail
```

  • With CREATE_NEW_EXECUTION, the execution attempt increases.
  • With RETRY_FAILED_TASK, only the task run attempt increases.

Retry vs. Restart vs. Replay

Automatic vs. manual

  • Retry: Automatic rerun of failed tasks within the same execution.
  • Restart: Manual rerun of failed tasks within the same execution.
  • Replay: Manual rerun from any point, creating a new execution.


Restart vs. Replay

  • Restart: Reruns only the failed tasks, in the same execution.
  • Replay: Starts a new execution from a chosen task, with a new execution ID. Outputs of previous tasks are reused from cache if needed.


Replays can start from successful or failed tasks but always create a new execution. Restarts keep the same execution ID.

After a Replay, the Original Execution field on the new execution shows which execution it was replayed from.

Summary

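| Concept | Scope | Trigger | New execution? |
| --- | --- | --- | --- |
| Retry | Task level (or flow level) | Automatic | No, unless a flow-level retry uses CREATE_NEW_EXECUTION |
| Restart | Flow level | Manual | No |
| Replay | Flow or task level | Manual | Yes |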
