High Availability

Available on: Enterprise Edition

Kestra is designed to be highly available and fault-tolerant. This section describes how to configure Kestra for high availability.

Overview

First, let's define what we mean by high availability.

Highly available systems are designed to guarantee continuous operation, even in the event of a failure. This is achieved by eliminating single points of failure and ensuring that there are redundant systems and components.

In Kestra, high availability is achieved by running multiple instances of all components, such as the webserver hosting the API, the scheduler, the executor, the indexer and the workers. This ensures that if one instance fails, the system can continue to operate without interruption.

Note that high availability requires deploying Kestra with the Kafka and Elasticsearch architecture, which is itself designed to be highly available and fault-tolerant.

Scaling the components

The following components can be scaled horizontally (e.g. by allocating more replicas in your Helm chart values):

Webserver
Scheduler
Executor
Worker
Indexer

Additionally, the Elasticsearch and Kafka clusters can be scaled out as needed to handle large volumes of data.

Finally, the internal storage (e.g. S3) is highly available and fault-tolerant by design.

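Ensure that the host system is also configured for high availability and fault tolerance. For instance, you can lower the Linux kernel net.ipv4.tcp_retries2 parameter, which limits how many times the kernel retransmits unacknowledged data on an established TCP connection, so that broken connections are detected and dropped sooner.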
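To get a feel for what a given net.ipv4.tcp_retries2 value means in practice, the following sketch estimates how long the kernel keeps retransmitting before it gives up on a connection. It assumes the retransmission timeout starts at the Linux minimum of 200 ms, doubles after each attempt, and is capped at 120 s. Real timeouts depend on the measured round-trip time, so treat the numbers as rough estimates. The function name is illustrative and is not part of Kestra.

```python
def tcp_retries2_timeout(retries: int, rto_min: float = 0.2, rto_max: float = 120.0) -> float:
    """Estimate the seconds spent retransmitting before the connection is dropped.

    Assumes exponential backoff: the wait before each retransmission doubles,
    starting at rto_min and capped at rto_max (Linux defaults, in seconds).
    """
    if retries < 0:
        raise ValueError("retries must be non-negative")
    # One initial timeout plus one per retransmission, hence retries + 1 terms.
    return sum(min(rto_min * 2 ** attempt, rto_max) for attempt in range(retries + 1))


if __name__ == "__main__":
    for value in (15, 8):
        print(f"tcp_retries2={value}: ~{tcp_retries2_timeout(value):.1f} s")
    # tcp_retries2=15: ~924.6 s  (the kernel default, roughly 15 minutes)
    # tcp_retries2=8: ~102.2 s
```

Under these assumptions, the default value of 15 means a dead peer can go unnoticed for about 15 minutes, while a value of 8 brings that down to under two minutes. The value that suits your deployment depends on your network, so test any change before applying it in production.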

Load balancing

To ensure high availability, you should use a load balancer to distribute incoming requests across multiple instances of the webserver. If one instance fails, the load balancer routes requests to the remaining healthy instances, so the API stays reachable.
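The idea behind load balancing with failover can be illustrated with a minimal round-robin sketch. It is not how Kestra or any particular load balancer is implemented. In a real deployment you would rely on a dedicated load balancer rather than custom code. The class name, the instance URLs, and the health-check callback are all hypothetical.

```python
from typing import Callable, Iterable


class RoundRobinBalancer:
    """Cycle through webserver instances, skipping those reported unhealthy."""

    def __init__(self, instances: Iterable[str], is_healthy: Callable[[str], bool]):
        self._instances = list(instances)
        if not self._instances:
            raise ValueError("at least one instance is required")
        self._is_healthy = is_healthy  # hypothetical health probe supplied by the caller
        self._next = 0  # index of the instance to try first on the next pick

    def pick(self) -> str:
        """Return the next healthy instance, or raise if none is available."""
        count = len(self._instances)
        for offset in range(count):
            index = (self._next + offset) % count
            candidate = self._instances[index]
            if self._is_healthy(candidate):
                # Resume after the chosen instance so load stays evenly spread.
                self._next = (index + 1) % count
                return candidate
        raise RuntimeError("no healthy webserver instance available")
```

For example, with three hypothetical instances where the second one is down, successive calls to pick() alternate between the first and the third, so requests keep being served while the failed instance is out of rotation.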
