Backfills are replays of missed schedule intervals between a defined start and end date.
Overview
Let's take the following flow as an example:
id: scheduled_flow
namespace: company.team
tasks:
  - id: label
    type: io.kestra.plugin.core.execution.Labels
    labels: # label to track scheduled date
      scheduledDate: "{{trigger.date ?? execution.startDate}}"
  - id: external_system_export
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - echo "processing data for {{trigger.date ?? execution.startDate}}"
      - sleep $((RANDOM % 5 + 1))
triggers:
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/30 * * * *"
This flow runs every 30 minutes. However, imagine that your source system had an outage for 5 hours. The flow would miss 10 executions. To replay these missed executions, you can use the backfill feature.
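To make the arithmetic concrete, the following minimal sketch lists the schedule times that fall inside an outage window. It is not part of Kestra: the helper name `missed_slots` and the outage start time are hypothetical, and the stepping logic assumes a `*/N * * * *` cron where N divides 60 evenly.

```python
from datetime import datetime, timedelta, timezone


def missed_slots(outage_start, outage_end, every_minutes=30):
    """Return schedule times in [outage_start, outage_end).

    Assumes a '*/N * * * *' cron where N divides 60, so slots are
    aligned to the top of the hour.
    """
    step = timedelta(minutes=every_minutes)
    # Start from the top of the hour and advance to the first slot
    # at or after the outage start.
    slot = outage_start.replace(minute=0, second=0, microsecond=0)
    while slot < outage_start:
        slot += step
    slots = []
    while slot < outage_end:
        slots.append(slot)
        slot += step
    return slots


# Hypothetical 5-hour outage starting at 06:00 UTC.
start = datetime(2025, 4, 29, 6, 0, tzinfo=timezone.utc)
end = start + timedelta(hours=5)
slots = missed_slots(start, end)
print(len(slots))  # 10
```

The first missed slot is 06:00 and the last is 10:30, which gives the 10 executions mentioned above.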
By default, all missed schedules are recovered automatically.
Backfill is useful when the trigger is configured differently, e.g., to not recover missed schedules or to recover only the most recent one. Read more in the dedicated documentation.
To backfill the missed executions, go to the Triggers tab on the Flow's detail page and click on the Backfill executions button.
You can then select the start and end date for the backfill. Additionally, you can set custom labels for the backfill executions to help you identify them in the future.
You can pause and resume the backfill process at any time. Clicking the Details button shows more information about that backfill process.
Trigger Backfill via an API call
Using cURL
You can trigger backfill executions with a cURL call as follows:
curl -X PUT http://localhost:8080/api/v1/release/triggers \
  -H "Authorization: Bearer $KESTRA_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "namespace": "dev",
    "flowId": "scheduled_flow",
    "triggerId": "schedule",
    "backfill": {
      "start": "2025-04-29T11:30:00Z",
      "end": null,
      "labels": [
        {
          "key": "reason",
          "value": "outage"
        }
      ]
    }
  }'
In the backfill attribute, you must provide the start time for the backfill; the end time is optional. You can pass inputs to the flow with inputs, and assign labels to the backfill executions by providing key-value pairs in the labels section. In the example, the label reason:outage makes it clear what caused the need to backfill.
The other attributes of this PUT call are flowId, namespace, and triggerId, which identify the flow and trigger to be backfilled.
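The request body described above can be assembled programmatically. The sketch below is illustrative only: the helper `build_backfill_request` is a hypothetical name, it accepts labels as a plain dictionary and converts them to the key/value list shown in the cURL example, and it assumes `inputs` is a simple mapping, which the example above does not confirm.

```python
import json


def build_backfill_request(namespace, flow_id, trigger_id, start,
                           end=None, inputs=None, labels=None):
    """Build the JSON body for a backfill request (hypothetical helper)."""
    backfill = {
        "start": start,
        "end": end,  # None serializes to JSON null
        # Convert {"reason": "outage"} into [{"key": ..., "value": ...}].
        "labels": [{"key": k, "value": v} for k, v in (labels or {}).items()],
    }
    if inputs is not None:
        # Assumption: inputs are passed as a mapping of input name to value.
        backfill["inputs"] = inputs
    return {
        "namespace": namespace,
        "flowId": flow_id,
        "triggerId": trigger_id,
        "backfill": backfill,
    }


# Same values as the cURL example.
body = build_backfill_request(
    "dev", "scheduled_flow", "schedule",
    start="2025-04-29T11:30:00Z",
    labels={"reason": "outage"},
)
print(json.dumps(body, indent=2))
```

Keeping the body construction in one function makes it easier to reuse the same payload shape across scripts that backfill different flows.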
Using a Service Account
Available on: >=0.15, Enterprise Edition, Cloud
For Enterprise and Cloud users, the same process can be carried out with Service Accounts, so no human user needs to be involved. In this case, you must specify the Tenant in both the request header (X-KESTRA-TENANT) and the request body (tenantId). The example below uses a Tenant named production.
curl -X PUT http://localhost:8080/api/v1/release/triggers \
  -H "Authorization: Bearer $KESTRA_API_TOKEN" \
  -H "X-Kestra-Tenant: production" \
  -H "Content-Type: application/json" \
  -d '{
    "namespace": "dev",
    "flowId": "scheduled_flow",
    "triggerId": "schedule",
    "tenantId": "production",
    "backfill": {
      "start": "2025-04-29T11:30:00Z",
      "end": null,
      "labels": [
        {
          "key": "reason",
          "value": "outage"
        }
      ]
    }
  }'
To use a Service Account, go to Administration -> IAM -> Service Accounts. From the Service Accounts tab, create a Service Account, generate an API Token, copy the token, and give the Service Account the appropriate access to backfill a flow. Use this API token in your cURL call instead of a user's token.
Using Python requests
You can trigger backfill executions using the Python requests library as follows:
import requests
import json

url = 'http://localhost:8080/api/v1/triggers'
headers = {
    'Content-Type': 'application/json'
}
data = {
    "backfill": {
        "start": "2025-06-03T06:30:00.000Z",
        "end": None,
        "inputs": None,
        "labels": [
            {
                "key": "reason",
                "value": "outage"
            }
        ]
    },
    "flowId": "scheduled_flow",
    "namespace": "company.team",
    "triggerId": "schedule"
}
response = requests.put(url, headers=headers, data=json.dumps(data))
print(response.status_code)
print(response.text)
This code triggers a backfill for the scheduled_flow flow in the company.team namespace, using the trigger whose ID is schedule. The number of executions created depends on the cron schedule defined in the schedule trigger and on the start and end times given in the backfill. When end is null, as in this case, the present time is used as the end time.