
AIAgent

Run an AI Agent
An AI agent is an autonomous system that uses a Large Language Model (LLM). Each run combines a system message and a prompt. The system message defines the agent's role and behavior, while the prompt carries the actual user input for that execution. Together, they guide the agent's response. The agent can also use tools, content retrievers, and memory to provide richer context during execution.
type: "io.kestra.plugin.ai.agent.AIAgent"

Examples
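As a minimal sketch before the richer examples that follow (the flow id and prompt are illustrative; only a provider with a `GEMINI_API_KEY` KV secret is assumed), a bare agent run needs little more than a prompt and a provider:

```yaml
id: minimal_agent
namespace: company.ai

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    # systemMessage, tools, memory, and contentRetrievers are all optional
    prompt: Say hello in one short sentence.
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-2.5-flash
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
```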
Summarize arbitrary text with controllable length and language.
```yaml
id: simple_summarizer_agent
namespace: company.ai

inputs:
  - id: summary_length
    displayName: Summary Length
    type: SELECT
    defaults: medium
    values:
      - short
      - medium
      - long

  - id: language
    displayName: Language ISO code
    type: SELECT
    defaults: en
    values:
      - en
      - fr
      - de
      - es
      - it
      - ru
      - ja

  - id: text
    type: STRING
    displayName: Text to summarize
    defaults: |
      Kestra is an open-source orchestration platform that:
      - Allows you to define workflows declaratively in YAML
      - Allows non-developers to automate tasks with a no-code interface
      - Keeps everything versioned and governed, so it stays secure and auditable
      - Extends easily for custom use cases through plugins and custom scripts.
      Kestra follows a "start simple and grow as needed" philosophy. You can schedule a basic workflow in a few minutes, then later add Python scripts, Docker containers, or complicated branching logic if the situation calls for it.

tasks:
  - id: multilingual_agent
    type: io.kestra.plugin.ai.agent.AIAgent
    systemMessage: |
      You are a precise technical assistant.
      Produce a {{ inputs.summary_length }} summary in {{ inputs.language }}.
      Keep it factual, remove fluff, and avoid marketing language.
      If the input is empty or non-text, return a one-sentence explanation.
      Output format:
      - 1-2 sentences for 'short'
      - 2-5 sentences for 'medium'
      - Up to 5 paragraphs for 'long'
    prompt: |
      Summarize the following content: {{ inputs.text }}

  - id: english_brevity
    type: io.kestra.plugin.ai.agent.AIAgent
    prompt: Generate exactly 1 sentence English summary of "{{ outputs.multilingual_agent.textOutput }}"

pluginDefaults:
  - type: io.kestra.plugin.ai.agent.AIAgent
    values:
      provider:
        type: io.kestra.plugin.ai.provider.GoogleGemini
        modelName: gemini-2.5-flash
        apiKey: "{{ kv('GEMINI_API_KEY') }}"
```
Interact with an MCP Server subprocess running in a Docker container
```yaml
id: agent_with_docker_mcp_server_tool
namespace: company.ai

inputs:
  - id: prompt
    type: STRING
    defaults: What is the current UTC time?

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    prompt: "{{ inputs.prompt }}"
    provider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: gpt-5-nano
    tools:
      - type: io.kestra.plugin.ai.tool.DockerMcpClient
        image: mcp/time
```
Run an AI agent with a memory
```yaml
id: agent_with_memory
namespace: company.ai

tasks:
  - id: first_agent
    type: io.kestra.plugin.ai.agent.AIAgent
    prompt: Hi, my name is John and I live in New York!

  - id: second_agent
    type: io.kestra.plugin.ai.agent.AIAgent
    prompt: What's my name and where do I live?

pluginDefaults:
  - type: io.kestra.plugin.ai.agent.AIAgent
    values:
      provider:
        type: io.kestra.plugin.ai.provider.OpenAI
        apiKey: "{{ kv('OPENAI_API_KEY') }}"
        modelName: gpt-5-mini
      memory:
        type: io.kestra.plugin.ai.memory.KestraKVStore
        memoryId: JOHN
        ttl: PT1M
        messages: 5
```
Run an AI agent leveraging Tavily Web Search as a content retriever. Note that in contrast to tools, content retrievers are always called to provide context to the prompt, and it's up to the LLM to decide whether to use that retrieved context or not.
```yaml
id: agent_with_content_retriever
namespace: company.ai

inputs:
  - id: prompt
    type: STRING
    defaults: What is the latest Kestra release and what new features does it include? Name at least 3 new features added exactly in this release.

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    prompt: "{{ inputs.prompt }}"
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-2.5-flash
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
    contentRetrievers:
      - type: io.kestra.plugin.ai.retriever.TavilyWebSearch
        apiKey: "{{ kv('TAVILY_API_KEY') }}"
```
Run an AI Agent returning a structured output specified in a JSON schema. Note that some providers and models don't support JSON Schema; in those cases, instruct the model to return strict JSON using an inline schema description in the prompt and validate the result downstream.
```yaml
id: agent_with_structured_output
namespace: company.ai

inputs:
  - id: customer_ticket
    type: STRING
    defaults: >-
      I can't log into my account. It says my password is wrong, and the reset link never arrives.

tasks:
  - id: support_agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.MistralAI
      apiKey: "{{ kv('MISTRAL_API_KEY') }}"
      modelName: open-mistral-7b
    systemMessage: |
      You are a classifier that returns ONLY valid JSON matching the schema.
      Do not add explanations or extra keys.
    configuration:
      responseFormat:
        type: JSON
        jsonSchema:
          type: object
          required: ["category", "priority"]
          properties:
            category:
              type: string
              enum: ["ACCOUNT", "BILLING", "TECHNICAL", "GENERAL"]
            priority:
              type: string
              enum: ["LOW", "MEDIUM", "HIGH"]
    prompt: |
      Classify the following customer message:
      {{ inputs.customer_ticket }}
```
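A follow-up task can read the structured result downstream. This is a sketch appended to the tasks list of the example above, assuming the parsed object is exposed under the task's `jsonOutput` output (as stated in the response-format documentation on this page); the task id is illustrative:

```yaml
  - id: log_classification
    type: io.kestra.plugin.core.log.Log
    message: |
      Category: {{ outputs.support_agent.jsonOutput.category }}
      Priority: {{ outputs.support_agent.jsonOutput.priority }}
```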
Perform market research with an AI Agent using a web search retriever and save the findings as a Markdown report.
The retriever gathers up-to-date information, the agent summarizes it, and the filesystem tool writes the result to the task working directory.
Mount to a container path (e.g., /tmp) so the generated report file is accessible and can be collected with outputFiles.
```yaml
id: market_research_agent
namespace: company.ai

inputs:
  - id: prompt
    type: STRING
    defaults: |
      Research the latest trends in workflow and data orchestration.
      Use web search to gather current, reliable information from multiple sources.
      Then create a well-structured Markdown report that includes an introduction,
      key trends with short explanations, and a conclusion.
      Save the final report as `report.md` in the `/tmp` directory.

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    prompt: "{{ inputs.prompt }}"
    systemMessage: |
      You are a research assistant that must always follow this process:
      1. Use the TavilyWebSearch content retriever to gather the most relevant and up-to-date information for the user prompt. Do not invent information.
      2. Summarize and structure the findings clearly in Markdown format. Use headings, bullet points, and links when appropriate.
      3. Save the final Markdown report as `report.md` in the `/tmp` directory by using the provided filesystem tool.
      Important rules:
      - Never output raw text in your response. The final result must always be written to `report.md`.
      - If no useful results are retrieved, write a short note in `report.md` explaining that no information was found.
      - Do not attempt to bypass or ignore the retriever or the filesystem tool.
    contentRetrievers:
      - type: io.kestra.plugin.ai.retriever.TavilyWebSearch
        apiKey: "{{ kv('TAVILY_API_KEY') }}"
        maxResults: 10
    tools:
      - type: io.kestra.plugin.ai.tool.DockerMcpClient
        image: mcp/filesystem
        command: ["/tmp"]
        binds: ["{{workingDir}}:/tmp"] # mount host_path:container_path to access the generated report
    outputFiles:
      - report.md
```
Analyze a numeric series with CodeExecution. The agent must call the code tool for all calculations, then explain the results in English.
```yaml
id: agent_with_code_execution_stats
namespace: company.ai

inputs:
  - id: series
    type: STRING
    defaults: |
      12, 15, 15, 18, 21, 99, 102, 102, 104

tasks:
  - id: stats_agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    systemMessage: |
      You are a data analyst.
      Always use the CodeExecution tool for computations.
      Then summarize clearly in English.
    prompt: |
      Here is a numeric series: {{ inputs.series }}
      1) Compute mean, median, min, max, and standard deviation.
      2) Detect outliers using a z-score greater than 2.
      3) Explain the distribution in 5-8 lines.
    tools:
      - type: io.kestra.plugin.ai.tool.CodeExecution
        apiKey: "{{ kv('RAPID_API_KEY') }}"
```
Generate release notes using Google Custom Web Search as a tool. Unlike content retrievers, tools are called only when the LLM decides it needs fresh context.
```yaml
id: agent_with_google_custom_search_release_notes
namespace: company.ai

inputs:
  - id: prompt
    type: STRING
    defaults: |
      Find the most recent Kestra release and summarize:
      - release date
      - 5 major new features
      - 3 important bug fixes
      Answer in English.

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    systemMessage: |
      You are a release-notes assistant.
      If you need up-to-date information, call GoogleCustomWebSearch.
      Summarize sources and avoid hallucinations.
    prompt: "{{ inputs.prompt }}"
    tools:
      - type: io.kestra.plugin.ai.tool.GoogleCustomWebSearch
        apiKey: "{{ kv('GOOGLE_SEARCH_API_KEY') }}"
        csi: "{{ kv('GOOGLE_SEARCH_CSI') }}"
```
Triage an incident and trigger the right Kestra flow using KestraFlow in implicit mode. The agent infers namespace/flowId from the prompt and executes the flow.
```yaml
id: incident_triage_orchestrator
namespace: company.ai

inputs:
  - id: incident
    type: STRING
    defaults: |
      The "billing-prod" SaaS data has been stale for 2 hours.
      We suspect an API extraction failure from an external provider.

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: gpt-5-mini
    systemMessage: |
      You are an incident triage agent.
      Decide which flow to run to mitigate the issue.
      Use the kestra_flow tool to trigger it with relevant inputs.
    prompt: |
      Incident:
      {{ inputs.incident }}
      You can run the following flows in the "prod.ops" namespace:
      - restart-billing-extract (inputs: service, reason)
      - run-billing-backfill (inputs: service, sinceHours)
      - notify-oncall (inputs: team, severity, message)
      Pick the best flow and execute it using the tool.
    tools:
      - type: io.kestra.plugin.ai.tool.KestraFlow
```
Route between multiple explicitly defined Kestra flows. Each flow becomes a separate tool, and the LLM selects which one to call.
```yaml
id: multi_flow_planner_agent
namespace: company.ai

inputs:
  - id: objective
    type: SELECT
    defaults: ingestion
    values:
      - ingestion
      - cleanup
      - alerting

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    prompt: |
      User objective: {{ inputs.objective }}
      Execute the most appropriate flow for this objective.
    tools:
      - type: io.kestra.plugin.ai.tool.KestraFlow
        namespace: prod.data
        flowId: ingest-daily-snapshots
        description: Daily ingestion of snapshots
      - type: io.kestra.plugin.ai.tool.KestraFlow
        namespace: prod.data
        flowId: purge-stale-partitions
        description: Cleanup of obsolete partitions
      - type: io.kestra.plugin.ai.tool.KestraFlow
        namespace: prod.ops
        flowId: send-severity-alert
        description: Send an on-call alert
```
Self-healing automation using KestraTask. The agent fills mandatory placeholders ("...") and then runs the task tool.
```yaml
id: agent_using_kestra_task_self_healing
namespace: company.ai

inputs:
  - id: error_message
    type: STRING
    defaults: "Disk usage >= 95% on node worker-3"

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    systemMessage: |
      You are a self-healing automation agent.
      When remediation is needed, call the KestraTask tool.
    prompt: |
      Detected issue: {{ inputs.error_message }}
      1) Propose a safe remediation action.
      2) Execute the corresponding task using the tool.
    tools:
      - type: io.kestra.plugin.ai.tool.KestraTask
        tasks:
          - id: cleanup
            type: io.kestra.plugin.scripts.shell.Commands
            commands:
              - "..." # Placeholder: the agent will decide real commands.
    timeout: PT10M
```
Find places using an MCP SSE client tool. The agent calls the MCP server to retrieve structured results, then ranks them.
```yaml
id: agent_with_sse_mcp_places
namespace: company.ai

inputs:
  - id: city
    type: STRING
    defaults: Lyon, France
  - id: cuisine
    type: STRING
    defaults: "bistronomic"

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    systemMessage: |
      You are a local guide.
      Use the MCP places tool to search restaurants.
      Return a short ranked list with brief reasons.
    prompt: |
      Find 3 {{ inputs.cuisine }} restaurants in {{ inputs.city }}.
      Criteria: rating > 4.5, quiet atmosphere, mid-range budget.
      Provide name, address, and two short reasons for each.
    tools:
      - type: io.kestra.plugin.ai.tool.SseMcpClient
        sseUrl: https://mcp.apify.com/?actors=compass/crawler-google-places
        timeout: PT3M
        headers:
          Authorization: "Bearer {{ kv('APIFY_API_TOKEN') }}"
```
Combine TavilyWebSearch and CodeExecution as tools. The agent searches for market data, then computes projections with the code tool.
```yaml
id: agent_research_and_validate_forecast
namespace: company.ai

inputs:
  - id: topic
    type: STRING
    defaults: "workflow and data orchestration market"
  - id: year
    type: INT
    defaults: 2028

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
    systemMessage: |
      You are a market research analyst.
      1) Use TavilyWebSearch to gather current market size and CAGR.
      2) Use CodeExecution to project the market size to the target year.
      3) Summarize in English with sources.
    prompt: |
      Topic: {{ inputs.topic }}
      1) Find credible sources for the current market size and CAGR.
      2) Project the market size for {{ inputs.year }} using the CAGR.
      3) Write a compact report (2 paragraphs) plus a list of sources.
    tools:
      - type: io.kestra.plugin.ai.tool.TavilyWebSearch
        apiKey: "{{ kv('TAVILY_API_KEY') }}"
      - type: io.kestra.plugin.ai.tool.CodeExecution
        apiKey: "{{ kv('RAPID_API_KEY') }}"
```
Properties
prompt (string, required)
Text prompt: the input prompt for the language model.
provider (required, non-dynamic)
The language model provider. One of:
- Amazon Bedrock Model Provider (embedding model type: COHERE (default) or TITAN)
- Anthropic AI Model Provider
- Azure OpenAI Model Provider
- DashScope (Qwen) Model Provider from Alibaba Cloud (default base URL: https://dashscope-intl.aliyuncs.com/api/v1)
- Deepseek Model Provider (default base URL: https://api.deepseek.com/v1)
- GitHub Models AI Model Provider
- Google Gemini Model Provider
- Google VertexAI Model Provider
- HuggingFace Model Provider (default base URL: https://router.huggingface.co/v1)
- LocalAI Model Provider
- Mistral AI Model Provider
- OciGenAI Model Provider
- Ollama Model Provider
- OpenAI Model Provider (default base URL: https://api.openai.com/v1)
- OpenRouter Model Provider
- Watsonx AI Model Provider
- WorkersAI Model Provider
- ZhiPu AI Model Provider (default base URL: https://open.bigmodel.cn/)

configuration (non-dynamic, io.kestra.plugin.ai.domain.ChatConfiguration, default: {})
Language model configuration. Its responseFormat (io.kestra.plugin.ai.domain.ChatConfiguration-ResponseFormat) type is TEXT (default) or JSON.

contentRetrievers
Content retrievers that supply additional context to the prompt. Available types:

- Embedding store content retriever for RAG (Retrieval Augmented Generation). Requires an embedding model provider (any provider from the list above) and an embedding store:
  - Chroma Embedding Store
  - Elasticsearch Embedding Store (connection: io.kestra.plugin.ai.embeddings.Elasticsearch-ElasticsearchConnection):
    - List of HTTP Elasticsearch servers (minimum: 1). Must be a URI like https://example.com:9200 with scheme and port.
    - Basic authorization configuration.
    - List of HTTP headers to be sent with every request. Each item is a key:value string, e.g., "Authorization: Token XYZ".
    - Path prefix for all HTTP requests. If set to /my/path, each client request becomes /my/path/ + endpoint. Useful when Elasticsearch is behind a proxy providing a base path; do not use otherwise.
    - Treat responses with deprecation warnings as failures.
    - Trust all SSL CA certificates. Use this if the server uses a self-signed SSL certificate.
  - In-memory embedding store that stores data as Kestra KV pairs (default KV pair name: {{flow.id}}-embedding-store)
  - MariaDB Embedding Store
  - Milvus Embedding Store
  - MongoDB Atlas Embedding Store
  - PGVector Embedding Store (use an IVFFlat index: default false)
  - Pinecone Embedding Store
  - Qdrant Embedding Store
  - Redis Embedding Store (default index name: embedding-index)
  - Tablestore Embedding Store (field schema: com.alicloud.openservices.tablestore.model.search.FieldSchema; analyzer values: SingleWord, MaxWord, MinWord, Split, Fuzzy; field types: LONG, DOUBLE, BOOLEAN, KEYWORD, TEXT, NESTED, GEO_POINT, DATE, VECTOR, FUZZY_KEYWORD, IP, JSON, UNKNOWN; index options: DOCS, FREQS, POSITIONS, OFFSETS; FLATTEN, NESTED)
  - Weaviate Embedding Store (consistency level: ONE, QUORUM (default), or ALL; default timeout: 30.0)
- Web search content retriever for Google Custom Search (default maxResults: 3)
- SQL Database content retriever using the LangChain4j experimental SqlDatabaseContentRetriever. ⚠ IMPORTANT: the database user should have READ-ONLY permissions. Requires a language model provider (any provider from the list above) and a ChatConfiguration (default: {}; responseFormat type TEXT (default) or JSON). Default maximum database connections: 2.
- WebSearch content retriever for Tavily Search (default maxResults: 3)

maxSequentialToolsInvocations (integer or string)
memory (non-dynamic)
Agent memory. Agent memory will store messages and add them as history to the LLM context. Available backends:

- In-memory Chat Memory that stores its data as Kestra KV pairs:
  - Drop: NEVER (default), BEFORE_TASKRUN, or AFTER_TASKRUN
  - Memory ID (default: {{ labels.system.correlationId }})
  - Maximum number of messages (default: 10)
  - Memory duration (default: PT1H)
- Chat Memory backed by PostgreSQL:
  - Database: the name of the PostgreSQL database
  - Host: the hostname of your PostgreSQL server
  - Password: the password to connect to PostgreSQL
  - Database user: the username to connect to PostgreSQL
  - Drop: NEVER (default), BEFORE_TASKRUN, or AFTER_TASKRUN
  - Memory ID (default: {{ labels.system.correlationId }})
  - Maximum number of messages (default: 10)
  - Port: the port of your PostgreSQL server (default: 5432)
  - Table name: the name of the table used to store chat memory (default: chat_memory)
  - Memory duration (default: PT1H)
- Chat Memory backed by Redis:
  - Host: the hostname of your Redis server (e.g., localhost or redis-server)
  - Drop: drop memory never, before, or after the agent's task run (NEVER (default), BEFORE_TASKRUN, or AFTER_TASKRUN). By default, the memory ID is the value of the system.correlationId label, meaning that the same memory will be used by all tasks of the flow and its subflows. If you want to remove the memory eagerly (before expiration), you can set drop: AFTER_TASKRUN to erase the memory after the task run, or drop: BEFORE_TASKRUN to drop it before the task run.
  - Memory ID: defaults to the value of the system.correlationId label, so a memory is valid for the entire flow execution including its subflows (default: {{ labels.system.correlationId }})
  - Maximum number of messages to keep in memory (default: 10). If memory is full, the oldest messages are removed in FIFO order; the last system message is always kept.
  - Port: the port of your Redis server (default: 6379)
  - Memory duration (default: PT1H)
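As an illustrative sketch of the Redis backend (the `type` name and property keys are assumptions inferred from the option names above and the `KestraKVStore` example earlier on this page; verify them against the plugin reference before use):

```yaml
memory:
  type: io.kestra.plugin.ai.memory.Redis  # assumed type name, by analogy with KestraKVStore
  host: redis-server                      # hostname of your Redis server
  port: 6379                              # default Redis port
  drop: AFTER_TASKRUN                     # erase the memory once the task run completes
  memoryId: "{{ labels.system.correlationId }}"
  messages: 10
  ttl: PT1H
```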
outputFiles (array)
The files from the local filesystem to send to Kestra's internal storage.
Must be a list of glob expressions relative to the current working directory, for example: my-dir/**, my-dir/*/**, or my-dir/my-file.txt.

systemMessage (string)
tools (non-dynamic)
Tools the agent can call. Available types include:

- Call a remote AI agent via the A2A protocol:
  - Server URL: the URL of the remote agent A2A server.
- Call an AI Agent as a tool:
  - Agent description: the description will be used to instruct the LLM what the tool is doing.
  - Provider: any language model provider from the list above.
  - Configuration (io.kestra.plugin.ai.domain.ChatConfiguration, default: {}).
io.kestra.plugin.ai.domain.ChatConfiguration-ResponseFormat

JSON Schema (used when type = JSON)
Provide a JSON Schema describing the expected structure of the response. In Kestra flows, define the schema in YAML (it is still a JSON Schema object). Example (YAML):

```yaml
responseFormat:
  type: JSON
  jsonSchema:
    type: object
    required: ["category", "priority"]
    properties:
      category:
        type: string
        enum: ["ACCOUNT", "BILLING", "TECHNICAL", "GENERAL"]
      priority:
        type: string
        enum: ["LOW", "MEDIUM", "HIGH"]
```

Note: Provider support for strict schema enforcement varies. If unsupported, guide the model about the expected output structure via the prompt and validate downstream.

Schema description (optional)
Natural-language description of the schema to help the model produce the right fields. Example: "Classify a customer ticket into category and priority."

Response format type (default: TEXT)
Specifies how the LLM should return output. Allowed values:
- TEXT (default): free-form natural language.
- JSON: structured output validated against a JSON Schema.
Content retrievers
Some content retrievers, like WebSearch, can also be used as tools. However, when configured as content retrievers, they will always be used, whereas tools are only invoked when the LLM decides to use them.
Embedding store content retriever for RAG (Retrieval Augmented Generation)
Embedding model provider
Provider used to generate embeddings for the query. Must support embedding generation.
Any language model provider from the list above can be used as the embedding model provider.

Embedding store
The embedding store to retrieve relevant content from. Available stores:
Chroma Embedding Store
The database base URL
Elasticsearch Embedding Store
The name of the index to store embeddings
In-memory embedding store that stores data as Kestra KV pairs
The name of the KV pair to use (default: {{flow.id}}-embedding-store)
MariaDB Embedding Store
Whether to create the table if it doesn't exist
Database URL of the MariaDB database (e.g., jdbc:mariadb://host:port/dbname)
Name of the column used as the unique ID in the database
Name of the table where embeddings will be stored
Metadata Column Definitions
List of SQL column definitions for metadata fields (e.g., 'text TEXT', 'source TEXT'). Required only when using COLUMN_PER_KEY storage mode.
Metadata Index Definitions
List of SQL index definitions for metadata columns (e.g., 'INDEX idx_text (text)'). Used only with COLUMN_PER_KEY storage mode.
Metadata Storage Mode
Determines how metadata is stored: - COLUMN_PER_KEY: Use individual columns for each metadata field (requires columnDefinitions and indexes). - COMBINED_JSON (default): Store metadata as a JSON object in a single column. If columnDefinitions and indexes are provided, COLUMN_PER_KEY must be used.
Milvus Embedding Store
Token
Milvus auth token. Required if authentication is enabled; omit for local deployments without auth.
Auto flush on delete
If true, flush after delete operations.
Auto flush on insert
If true, flush after insert operations. Setting it to false can improve throughput.
Collection name
Target collection. Created automatically if it does not exist. Default: "default".
Read/write consistency level. Common values include STRONG, BOUNDED, or EVENTUALLY (depends on client/version).
Database name
Logical database to use. If not provided, the default database is used.
Milvus host name (used when uri is not set). Default: "localhost".
ID field name
Field name for document IDs. Default depends on collection schema.
Index type
Vector index type (e.g., IVF_FLAT, IVF_SQ8, HNSW). Depends on Milvus deployment and dataset.
Field name for metadata. Default depends on collection schema.
Metric type
Similarity metric (e.g., L2, IP, COSINE). Should match the embedding provider’s expected metric.
Password
Milvus port (used when uri is not set). Typical: 19530 (gRPC) or 9091 (HTTP). Default: 19530.
Retrieve embeddings on search
If true, return stored embeddings along with matches. Default: false.
Text field name
Field name for original text. Default depends on collection schema.
URI
Connection URI. Use either uri OR host/port (not both).
Examples:
- gRPC (typical): "milvus://host:19530"
- HTTP: "http://host:9091"
Username
Required when authentication/TLS is enabled. See https://milvus.io/docs/authenticate.md
Vector field name
Field name for the embedding vector. Must match the index definition and embedding dimensionality.
MongoDB Atlas Embedding Store
The host
The scheme (e.g., mongodb+srv)
Create the index
The database
The metadata field names
The connection string options
The password
The username
PGVector Embedding Store
The database name
The database password
The table to store embeddings in
The database user
Whether to use an IVFFlat index (default: false)
An IVFFlat index divides vectors into lists, and then searches a subset of those lists closest to the query vector. It has faster build times and uses less memory than HNSW but has lower query performance (in terms of speed-recall tradeoff).
Pinecone Embedding Store
The cloud provider
The index
The cloud provider region
The namespace (default will be used if not provided)
Qdrant Embedding Store
The API key
The collection name
Redis Embedding Store
The database server host
The database server port
The index name (default: embedding-index)
Tablestore Embedding Store
Access Key ID
The access key ID used for authentication with the database.
Access Key Secret
The access key secret used for authentication with the database.
The base URL for the Tablestore database endpoint.
Instance Name
The name of the Tablestore database instance.
Metadata Schema List
Optional list of metadata field schemas for the collection.
Weaviate Embedding Store
Weaviate API key. Omit for local deployments without auth.
Host
Cluster host name without protocol, e.g., "abc123.weaviate.network".
Avoid duplicates
If true (default), a hash-based ID is derived from each text segment to prevent duplicates. If false, a random ID is used.
Consistency level
Write consistency: ONE, QUORUM (default), or ALL.
gRPC port
Port for gRPC if enabled (e.g., 50051).
Metadata field name
Field used to store metadata. Defaults to "_metadata" if not set.
Metadata keys
The list of metadata keys to store - if not provided, it will default to an empty list.
Object class
Weaviate class to store objects in (must start with an uppercase letter). Defaults to "Default" if not set.
Port
Optional port (e.g., 443 for https, 80 for http). Leave unset to use provider defaults.
Scheme
Cluster scheme: "https" (recommended) or "http".
Secure gRPC
Whether the gRPC connection is secured (TLS).
Use gRPC for batch inserts
If true, use gRPC for batch inserts. HTTP remains required for search operations.
Maximum number of results to return from the embedding store (default: 3)

Minimum similarity score (default: 0.0)
Only results with a similarity score ≥ minScore are returned. Range: 0.0 to 1.0 inclusive.
Web search content retriever for Google Custom Search

Maximum number of results (default: 3)
SQL Database content retriever using LangChain4j experimental SqlDatabaseContentRetriever. ⚠ IMPORTANT: the database user should have READ-ONLY permissions.
Database password
Language model provider
Amazon Bedrock Model Provider
AWS Access Key ID
AWS Secret Access Key
Amazon Bedrock Embedding Model Type: COHERE (default) or TITAN
Anthropic AI Model Provider
Maximum Tokens
Specifies the maximum number of tokens that the model is allowed to generate in its response.
Azure OpenAI Model Provider
API endpoint
The Azure OpenAI endpoint in the format: https://{resource}.openai.azure.com/
Client ID
Client secret
Tenant ID
DashScope (Qwen) Model Provider from Alibaba Cloud
Default base URL: https://dashscope-intl.aliyuncs.com/api/v1. If you use a model in the China (Beijing) region, replace the URL with https://dashscope.aliyuncs.com/api/v1; otherwise use the Singapore region URL https://dashscope-intl.aliyuncs.com/api/v1.
The default value is computed based on the system timezone.
Whether the model uses Internet search results for reference when generating text.
Repetition penalty for continuous sequences during model generation.
Increasing repetition_penalty reduces repetition in model generation;
1.0 means no penalty. Value range: (0, +inf).
Deepseek Model Provider (default base URL: https://api.deepseek.com/v1)

GitHub Models AI Model Provider
GitHub Token
Personal Access Token (PAT) used to access GitHub Models.
Google Gemini Model Provider
Google VertexAI Model Provider
Endpoint URL
Project location
Project ID
HuggingFace Model Provider (default base URL: https://router.huggingface.co/v1)

LocalAI Model Provider
Mistral AI Model Provider
OciGenAI Model Provider
OCID of OCI Compartment with the model
OCI Region to connect the client to
OCI SDK Authentication provider
Ollama Model Provider
Model endpoint
OpenAI Model Provider (default base URL: https://api.openai.com/v1)

OpenRouter Model Provider
Watsonx AI Model Provider
Project Id
WorkersAI Model Provider
Account Identifier
Unique identifier assigned to an account
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
ZhiPu AI Model Provider
Model name

API base URL
The base URL for ZhiPu API (defaults to https://open.bigmodel.cn/)
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
The maximum retry times to request
The maximum number of tokens returned by this request
With the stop parameter, the model will automatically stop generating text when it is about to contain the specified string or token_id
Database username
Language model configuration (default: {})
io.kestra.plugin.ai.domain.ChatConfiguration
Log LLM requests
If true, prompts and configuration sent to the LLM will be logged at INFO level.
Log LLM responses
If true, raw responses from the LLM will be logged at INFO level.
Maximum number of tokens the model can generate in the completion (response). This limits the length of the output.
Response format
Defines the expected output format. Default is plain text.
Some providers allow requesting JSON or schema-constrained outputs, but support varies and may be incompatible with tool use.
When using a JSON schema, the output will be returned under the key jsonOutput.
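To illustrate a schema-constrained response, here is a hypothetical configuration fragment. The `responseFormat`, `type`, and `jsonSchema` key names are assumptions inferred from the property descriptions above; confirm them against the ChatConfiguration reference. With a JSON schema, the parsed result appears under the `jsonOutput` output key.

```yaml
- id: structured_agent
  type: io.kestra.plugin.ai.agent.AIAgent
  configuration:
    responseFormat:
      type: JSON          # plain text is the default
      jsonSchema:
        type: object
        properties:
          title:
            type: string
          tags:
            type: array
            items:
              type: string
  prompt: Extract a title and tags from the following text: {{ inputs.text }}
```

Note that JSON or schema-constrained output support varies by provider and may be incompatible with tool use.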
Return Thinking
Controls whether to return the model's internal reasoning ('thinking') text, if available. When enabled, the reasoning content is extracted from the response and made available in the AiMessage object. This setting does not trigger the thinking process itself; it only affects whether the reasoning output is parsed and returned.
Seed
Optional random seed for reproducibility. Provide a positive integer (e.g., 42, 1234). Using the same seed with identical settings produces repeatable outputs.
Temperature
Controls randomness in generation. Typical range is 0.0–1.0. Lower values (e.g., 0.2) make outputs more focused and deterministic, while higher values (e.g., 0.7–1.0) increase creativity and variability.
Thinking Token Budget
Maximum number of tokens allocated for internal reasoning, such as generating intermediate thoughts or chain-of-thought sequences. A larger budget allows the model to perform more multi-step reasoning before producing the final output.
Enable Thinking
Enables internal reasoning ('thinking') in supported language models, allowing the model to perform intermediate reasoning steps before producing a final output. This is useful for complex tasks such as multi-step problem solving or decision making, but it may increase token usage and response time. Only applicable to compatible models.
Top-K
Limits sampling to the top K most likely tokens at each step. Typical values are between 20 and 100. Smaller values reduce randomness; larger values allow more diverse outputs.
Top-P (nucleus sampling)
Selects from the smallest set of tokens whose cumulative probability is ≤ topP. Typical values are 0.8–0.95. Lower values make the output more focused, higher values increase diversity.
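Putting the sampling parameters together, a minimal sketch of a model configuration might look like the fragment below. The camelCase key names (`temperature`, `topK`, `topP`, `seed`, `logRequests`, `logResponses`) are assumptions derived from the property titles above; check the ChatConfiguration reference for the exact names.

```yaml
configuration:
  temperature: 0.2    # low value: focused, near-deterministic output
  topK: 40            # sample from the 40 most likely tokens
  topP: 0.9           # nucleus sampling threshold
  seed: 42            # same seed + identical settings => repeatable outputs
  logRequests: true   # log prompts and configuration at INFO level
  logResponses: false
```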
Optional JDBC driver class name – automatically resolved if not provided.
JDBC connection URL to the target database
Default: 2
Maximum number of database connections in the pool
WebSearch content retriever for Tavily Search
API Key
Default: 3
Maximum number of results to return
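As a sketch, a Tavily web-search content retriever could be attached to the agent like this. The retriever type identifier and the `contentRetrievers` key are assumptions; verify them against the content retriever reference.

```yaml
contentRetrievers:
  # Illustrative type name; check the plugin reference
  - type: io.kestra.plugin.ai.retriever.TavilyWebSearch
    apiKey: "{{ secret('TAVILY_API_KEY') }}"
    maxResults: 3   # matches the default shown above
```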
Maximum sequential tools invocations
Default: tool
Agent name
Set it to a value other than the default if you want to use multiple agents as tools within the same task.
System message
The system message for the language model
Tools that the LLM may use to augment its response
Code execution tool using Judge0
RapidAPI key for Judge0
You can obtain it from the RapidAPI website.
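To show how the Judge0 code execution tool might be attached, here is a hypothetical fragment. The tool type identifier and `apiKey` key name are assumptions; confirm them against the tool reference.

```yaml
tools:
  # Illustrative type name; check the Judge0 tool documentation
  - type: io.kestra.plugin.ai.tool.CodeExecution
    apiKey: "{{ secret('RAPIDAPI_KEY') }}"   # RapidAPI key for Judge0
```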
Model Context Protocol (MCP) Docker client tool
Container image
API version
Volume binds
Docker certificate path
Docker configuration
Docker context
Docker host
Whether Docker should verify TLS certificates
Default: false
Whether to log events
Container registry email
Container registry password
Container registry URL
Container registry username
Google Custom Search web tool
API key
Custom search engine ID (cx)
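A Google Custom Search tool might be wired in as sketched below. The type identifier and the property name for the search engine ID (`cx`) are assumptions based on the descriptions above; check the tool reference for the exact keys.

```yaml
tools:
  # Illustrative type name and keys; verify against the plugin reference
  - type: io.kestra.plugin.ai.tool.GoogleCustomWebSearch
    apiKey: "{{ secret('GOOGLE_API_KEY') }}"
    csi: "{{ secret('GOOGLE_CSE_ID') }}"   # the custom search engine ID (cx)
```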
Call a Kestra flow as a tool
Description of the flow if not already provided inside the flow itself
Use this only when you define the description in the tool definition rather than in the flow itself. The LLM needs a tool description to decide whether to call the tool. If the flow has a description, the tool will use it; otherwise, the description property must be explicitly defined.
Flow ID of the flow that should be called
Default: false
Whether the flow should inherit labels from the execution that triggered it
By default, labels are not inherited. If you set this option to true, the flow execution will inherit all labels from the agent's execution.
Any labels passed by the LLM will override those defined here.
Input values that should be passed to flow's execution
Any inputs passed by the LLM will override those defined here.
Labels that should be added to the flow's execution
Any labels passed by the LLM will override those defined here.
Namespace of the flow that should be called
Revision of the flow that should be called
Format: date-time
Schedule the flow execution at a later date
If the LLM sets a scheduleDate, it will override the one defined here.
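Combining the properties above, a flow-as-tool definition could look like this sketch. The tool type identifier is an assumption, and the flow coordinates (`company.reports`, `generate_report`) are placeholders; the property names follow the titles listed above.

```yaml
tools:
  # Illustrative type name; check the KestraFlow tool documentation
  - type: io.kestra.plugin.ai.tool.KestraFlow
    namespace: company.reports        # placeholder namespace
    flowId: generate_report           # placeholder flow ID
    description: Generates a report for a given customer ID
    inputs:
      format: pdf                     # inputs passed by the LLM override these
    labels:
      triggeredBy: ai-agent           # labels passed by the LLM override these
    inheritLabels: true               # default is false
```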
Call a Kestra runnable task as a tool
List of Kestra runnable tasks
Model Context Protocol (MCP) SSE client tool
SSE URL of the MCP server
Useful, for example, for adding authentication tokens via the Authorization header.
Model Context Protocol (MCP) Stdio client tool
MCP client command, as a list of command parts
Environment variables
Default: false
Log events
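An MCP stdio client tool might be declared as sketched below, launching an MCP server as a subprocess. The tool type identifier is an assumption, and the server command is a placeholder; the `command`, `env`, and `logEvents` keys follow the property titles above.

```yaml
tools:
  # Illustrative type name; check the MCP stdio tool documentation
  - type: io.kestra.plugin.ai.tool.McpStdioClient
    command:                    # command as a list of parts
      - npx
      - "-y"
      - "@modelcontextprotocol/server-filesystem"   # placeholder MCP server
      - /tmp
    env:
      LOG_LEVEL: warn
    logEvents: true             # default is false
```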
Model Context Protocol (MCP) SSE client tool
URL of the MCP server
Custom headers
Useful, for example, for adding authentication tokens via the Authorization header.
Default: false
Log requests
Default: false
Log responses
Format: duration
Connection timeout duration
WebSearch tool for Tavily Search
Tavily API Key - you can obtain one from the Tavily website
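Finally, a Tavily web-search tool could be attached as sketched here; unlike the content retriever above, this lets the LLM decide when to search. The tool type identifier is an assumption; verify it against the tool reference.

```yaml
tools:
  # Illustrative type name; check the Tavily tool documentation
  - type: io.kestra.plugin.ai.tool.TavilyWebSearch
    apiKey: "{{ secret('TAVILY_API_KEY') }}"
```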
Outputs
finishReason (string)
Possible values: STOP, LENGTH, TOOL_EXECUTION, CONTENT_FILTER, OTHER
intermediateResponses (array)
Intermediate responses
io.kestra.plugin.ai.domain.AIOutput-AIResponse
Generated text completion
The result of the text completion
Finish reason
Possible values: STOP, LENGTH, TOOL_EXECUTION, CONTENT_FILTER, OTHER
Response identifier
io.kestra.plugin.ai.domain.TokenUsage
Tool execution requests
io.kestra.plugin.ai.domain.AIOutput-AIResponse-ToolExecutionRequest
Tool request arguments
Tool execution request identifier
Tool name
jsonOutput (object)
LLM output for JSON response format
The result of the LLM completion for response format of type JSON, null otherwise.
outputFiles (object)
URIs of the generated files in Kestra's internal storage
requestDuration (integer)
Request duration in milliseconds
sources (array)
Content sources used during RAG retrieval
io.kestra.plugin.ai.domain.AIOutput-ContentSource
Extracted text segment
A snippet of text relevant to the user's query, typically a sentence, paragraph, or other discrete unit of text.
Source metadata
Key-value pairs providing context about the origin of the content, such as URLs, document titles, or other relevant attributes.
textOutput (string)
LLM output for TEXT response format
The result of the LLM completion for response format of type TEXT (default), null otherwise.
thinking (string)
Model's Thinking Output
Contains the model's internal reasoning or 'thinking' text, if the model supports it and 'returnThinking' is enabled. This may include intermediate reasoning steps, such as chain-of-thought explanations. Null if thinking is not supported, not enabled, or not returned by the model.
tokenUsage
Token usage
io.kestra.plugin.ai.domain.TokenUsage
toolExecutions (array)
Tool executions
io.kestra.plugin.ai.domain.AIOutput-ToolExecution
Metrics
input.token.count (counter)
Large Language Model (LLM) input token count (unit: token)
output.token.count (counter)
Large Language Model (LLM) output token count (unit: token)
total.token.count (counter)
Large Language Model (LLM) total token count (unit: token)