Trigger a flow on real-time message consumption from a MongoDB database via change data capture and create one execution per row.

If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.mongodb.Trigger instead.

yaml
type: "io.kestra.plugin.debezium.mongodb.realtimetrigger"

Sharded connection.

yaml
id: debezium_mongodb
namespace: company.team

tasks:
  - id: send_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.mongodb.RealtimeTrigger
    snapshotMode: INITIAL
    connectionString: "mongodb://mongo_user:{{secret('MONGO_PASSWORD')}}@mongos0.example.com:27017,mongos1.example.com:27017/"

Replica set connection.

yaml
id: debezium_mongodb
namespace: company.team

tasks:
  - id: send_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.mongodb.RealtimeTrigger
    snapshotMode: INITIAL
    connectionString: "mongodb://mongo_user:{{secret('MONGO_PASSWORD')}}@mongodb0.example.com:27017/?replicaSet=rs0"
Properties

Defines connection string to mongodb replica set or sharded

Hostname of the remote server.

Port of the remote server.

List of conditions in order to limit the flow trigger.

Definitions
type*Requiredobject
afterstring
Formatdate-time

The date to test must be after this one.

beforestring
Formatdate-time

The date to test must be before this one.

Must be a valid ISO 8601 datetime with the zone identifier (use 'Z' for the default zone identifier).

datestring
Default{{ trigger.date }}
dayOfWeek*Requiredstring
Possible Values
MONDAYTUESDAYWEDNESDAYTHURSDAYFRIDAYSATURDAYSUNDAY
type*Requiredobject
datestring
Default{{ trigger.date }}
dayInMonth*Requiredstring
Possible Values
FIRSTLASTSECONDTHIRDFOURTH

Are you looking for the first or the last day in the month?

dayOfWeek*Requiredstring
Possible Values
MONDAYTUESDAYWEDNESDAYTHURSDAYFRIDAYSATURDAYSUNDAY

The day of week.

type*Requiredobject
datestring
Default{{ trigger.date }}
flowId*Requiredstring
namespace*Requiredstring
type*Requiredobject
labels*Requiredarrayobject

List of labels to match in the execution.

type*Requiredobject
namespace*Requiredstring

String against which to match the execution namespace depending on the provided comparison.

type*Requiredobject
comparisonstring
Possible Values
EQUALSPREFIXSUFFIX

Comparison to use when checking if namespace matches. If not provided, it will use EQUALS by default.

prefixbooleanstring
Defaultfalse

Whether to look at the flow namespace by prefix. Shortcut for comparison: PREFIX.

Only used when comparison is not set

expression*Requiredbooleanstring
type*Requiredobject
type*Requiredobject
inarray
SubTypestring
Possible Values
CREATEDSUBMITTEDRUNNINGPAUSEDRESTARTEDKILLINGSUCCESSWARNINGFAILEDKILLEDCANCELLEDQUEUEDRETRYINGRETRIEDSKIPPEDBREAKPOINTRESUBMITTED
notInarray
SubTypestring
Possible Values
CREATEDSUBMITTEDRUNNINGPAUSEDRESTARTEDKILLINGSUCCESSWARNINGFAILEDKILLEDCANCELLEDQUEUEDRETRYINGRETRIEDSKIPPEDBREAKPOINTRESUBMITTED
expression*Requiredstring
type*Requiredobject
flowId*Requiredstring

The flow id.

namespace*Requiredstring

The namespace of the flow.

type*Requiredobject
namespace*Requiredstring

The namespace of the flow or the prefix if prefix is true.

type*Requiredobject
prefixboolean
Defaultfalse

If we must look at the flow namespace by prefix (checked using startsWith). The prefix is case sensitive.

type*Requiredobject
inarray
SubTypestring
Possible Values
CREATEDSUBMITTEDRUNNINGPAUSEDRESTARTEDKILLINGSUCCESSWARNINGFAILEDKILLEDCANCELLEDQUEUEDRETRYINGRETRIEDSKIPPEDBREAKPOINTRESUBMITTED

List of states that are authorized.

notInarray
SubTypestring
Possible Values
CREATEDSUBMITTEDRUNNINGPAUSEDRESTARTEDKILLINGSUCCESSWARNINGFAILEDKILLEDCANCELLEDQUEUEDRETRYINGRETRIEDSKIPPEDBREAKPOINTRESUBMITTED

List of states that aren't authorized.

conditions*Requiredobject

The list of preconditions to wait for

The key must be unique for a trigger because it will be used to store the previous evaluation result.

id*Requiredstring
Validation RegExp^[a-zA-Z0-9][a-zA-Z0-9_-]*
Min length1

A unique id for the condition

type*Requiredobject
resetOnSuccessboolean
Defaulttrue

Whether to reset the evaluation results of SLA conditions after a first successful evaluation within the given time period.

By default, after a successful evaluation of the set of SLA conditions, the evaluation result is reset, so, the same set of conditions needs to be successfully evaluated again in the same time period to trigger a new execution. This means that to create multiple executions, the same set of conditions needs to be evaluated to true multiple times. You can disable this by setting this property to false so that, within the same period, each time one of the conditions is satisfied again after a successful evaluation, it will trigger a new execution.

timeWindow
Default{ "type": "DURATION_WINDOW" }

Define the time period (or window) for evaluating preconditions.

You can set the type of sla to one of the following values:

  1. DURATION_WINDOW: this is the default type. It uses a start time (windowAdvance) and end time (window) that are moving forward to the next interval whenever the evaluation time reaches the end time, based on the defined duration window. For example, with a 1-day window (the default option: window: PT1D), the SLA conditions are always evaluated during 24h starting at midnight (i.e. at time 00: 00: 00) each day. If you set windowAdvance: PT6H, the window will start at 6 AM each day. If you set windowAdvance: PT6H and you also override the window property to PT6H, the window will start at 6 AM and last for 6 hours — as a result, Kestra will check the SLA conditions during the following time periods: 06: 00 to 12: 00, 12: 00 to 18: 00, 18: 00 to 00: 00, and 00: 00 to 06: 00, and so on.
  2. SLIDING_WINDOW: this option also evaluates SLA conditions over a fixed time window, but it always goes backward from the current time. For example, a sliding window of 1 hour (window: PT1H) will evaluate executions for the past hour (so between now and one hour before now). It uses a default window of 1 day.
  3. DAILY_TIME_DEADLINE: this option declares that some SLA conditions should be met "before a specific time in a day". With the string property deadline, you can configure a daily cutoff for checking conditions. For example, deadline: "09: 00: 00" means that the defined SLA conditions should be met from midnight until 9 AM each day; otherwise, the flow will not be triggered.
  4. DAILY_TIME_WINDOW: this option declares that some SLA conditions should be met "within a given time range in a day". For example, a window from startTime: "06: 00: 00" to endTime: "09: 00: 00" evaluates executions within that interval each day. This option is particularly useful for declarative definition of freshness conditions when building data pipelines. For example, if you only need one successful execution within a given time range to guarantee that some data has been successfully refreshed in order for you to proceed with the next steps of your pipeline, this option can be more useful than a strict DAG-based approach. Usually, each failure in your flow would block the entire pipeline, whereas with this option, you can proceed with the next steps of the pipeline as soon as the data is successfully refreshed at least once within the given time range.
deadlinestring
Formatpartial-time

SLA daily deadline

Use it only for DAILY_TIME_DEADLINE SLA.

endTimestring
Formatpartial-time

SLA daily end time

startTimestring
Formatpartial-time

SLA daily start time

Use it only for DAILY_TIME_WINDOW SLA.

typestring
DefaultDURATION_WINDOW
Possible Values
DAILY_TIME_DEADLINEDAILY_TIME_WINDOWDURATION_WINDOWSLIDING_WINDOW

The type of the SLA

The default SLA is a sliding window (DURATION_WINDOW) with a window of 24 hours.

windowstring
Formatduration

Use it only for DURATION_WINDOW or SLIDING_WINDOW SLA. See ISO_8601 Durations for more information of available duration value. The start of the window is always based on midnight except if you set windowAdvance parameter. Eg if you have a 10 minutes (PT10M) window, the first window will be 00: 00 to 00: 10 and a new window will be started each 10 minutes

windowAdvancestring
Formatduration

Use it only for DURATION_WINDOW SLA. Allow to specify the start time of the window Eg: you want a window of 6 hours (window=PT6H), by default the check will be done between: 00: 00 and 06: 00, 06: 00 and 12: 00, 12: 00 and 18: 00, and 18: 00 and 00: 00. If you want to check the window between 03: 00 and 09: 00, 09: 00 and 15: 00, 15: 00 and 21: 00, and 21: 00 and 3: 00, you will have to shift the window of 3 hours by settings windowAdvance: PT3H

windowDeprecatedstring
Formatduration

The duration of the window

Deprecated, use timeWindow.window instead.

windowAdvanceDeprecatedstring
Formatduration

The window advance duration

Deprecated, use timeWindow.windowAdvance instead.

conditions*Required
Min items1

The list of conditions to exclude.

If any condition is true, it will prevent the event's execution.

type*Requiredobject
conditions*Required
Min items1

The list of conditions to validate.

If any condition is true, it will allow the event's execution.

type*Requiredobject
type*Requiredobject
countrystring

ISO 3166-1 alpha-2 country code. If not set, it uses the country code from the default locale.

datestring
Default{{ trigger.date}}
subDivisionstring

ISO 3166-2 country subdivision (e.g., provinces and states) code.

It uses the Jollyday library for public holiday calendar that supports more than 70 countries.

type*Requiredobject
afterstring
Formattime

The time to test must be after this one.

beforestring
Formattime

The time to test must be before this one.

Must be a valid ISO 8601 time with offset.

datestring
Default{{ trigger.date }}

The time to test.

Can be any variable or any valid ISO 8601 time. By default, it will use the trigger date.

type*Requiredobject
datestring
Default{{ trigger.date }}

The date to test.

Can be any variable or any valid ISO 8601 datetime. By default, it will use the trigger date.

DefaultADD_FIELD
Possible Values
ADD_FIELDNULLDROP

Specify how to handle deleted rows.

Possible settings are:

  • ADD_FIELD: Add a deleted field as boolean.
  • NULL: Send a row with all values as null.
  • DROP: Don't send deleted row.
Defaultdeleted

The name of deleted field if deleted is ADD_FIELD.

The name of the MongoDB database collection excluded from which to stream the changes.

A list of regular expressions that match the collection namespaces (for example, .) of all collections to be excluded

An optional, comma-separated list of regular expressions that match the fully-qualified names of columns to include in change event record values.

Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property."

An optional, comma-separated list of regular expressions that match the names of databases for which you do not want to capture changes.

The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.

An optional, comma-separated list of regular expressions that match fully-qualified table identifiers for tables whose changes you do not want to capture.

The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.

DefaultINLINE
Possible Values
RAWINLINEWRAP

The format of the output.

Possible settings are:

  • RAW: Send raw data from debezium.
  • INLINE: Send a row like in the source with only data (remove after & before), all the columns will be present for each row.
  • WRAP: Send a row like INLINE but wrapped in a record field.
Defaulttrue

Ignore DDL statement.

Ignore CREATE, ALTER, DROP and TRUNCATE operations.

The name of the MongoDB database collection included from which to stream the changes.

A list of regular expressions that match the collection namespaces (for example, .) of all collections to be monitored

An optional, comma-separated list of regular expressions that match the fully-qualified names of columns to exclude from change event record values.

Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.

An optional, comma-separated list of regular expressions that match the names of the databases for which to capture changes.

The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.

An optional, comma-separated list of regular expressions that match fully-qualified table identifiers of tables whose changes you want to capture.

The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.

DefaultADD_FIELD
Possible Values
ADD_FIELDDROP

Specify how to handle key.

Possible settings are:

  • ADD_FIELD: Add key(s) merged with columns.
  • DROP: Drop keys.
DefaultADD_FIELD
Possible Values
ADD_FIELDDROP

Specify how to handle metadata.

Possible settings are:

  • ADD_FIELD: Add metadata in a column named metadata.
  • DROP: Drop metadata.
Defaultmetadata

The name of metadata field if metadata is ADD_FIELD.

DefaultON_STOP
Possible Values
ON_EACH_BATCHON_STOP

When to commit the offsets to the KV Store.

  • ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids any duplicated records being consumed but can be costly if many events are produced.
  • ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids any un-necessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated leading to duplicated records consumption.

Password on the remote server.

SubTypestring

Additional configuration properties.

Any additional configuration properties that is valid for the current driver.

DefaultINITIAL
Possible Values
INITIALINITIAL_ONLYNO_DATAWHEN_NEEDED

Specifies the criteria for running a snapshot when the connector starts.

Possible settings are:

  • INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
  • INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
  • NO_DATA: The connector captures the structure of all relevant tables, performing all the steps described in the default snapshot workflow, except that it does not create READ events to represent the data set at the point of the connector’s start-up.
  • WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary. That is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
DefaultTABLE
Possible Values
OFFDATABASETABLE

Split table on separate output uris.

Possible settings are:

  • TABLE: This will split all rows by tables on output with name database.table
  • DATABASE: This will split all rows by databases on output with name database.
  • OFF: This will NOT split all rows resulting in a single data output.
Defaultdebezium-state

The name of the Debezium state file stored in the KV Store for that namespace.

SubTypestring
Possible Values
CREATEDSUBMITTEDRUNNINGPAUSEDRESTARTEDKILLINGSUCCESSWARNINGFAILEDKILLEDCANCELLEDQUEUEDRETRYINGRETRIEDSKIPPEDBREAKPOINTRESUBMITTED

List of execution states after which a trigger should be stopped (a.k.a. disabled).

Username on the remote server.

Data.

Data extracted.

Stream.

Stream source