EmbeddingStoreRetriever

Embedding store content retriever for RAG (Retrieval Augmented Generation)

Retrieves relevant content from an embedding store based on semantic similarity to the query.

yaml
type: "io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever"

Use RAG with an AIAgent through an embedding store content retriever. This example ingests documents into a KV embedding store, then uses an AI agent with the EmbeddingStoreRetriever to answer questions grounded in the ingested data.

yaml
id: agent_with_rag
namespace: company.ai

tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
      googleApiKey: "{{ kv('GEMINI_API_KEY') }}"
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    drop: true
    fromDocuments:
      - content: Paris is the capital of France with a population of over 2.1 million people
      - content: The Eiffel Tower is the most famous landmark in Paris at 330 meters tall

  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-2.0-flash
      googleApiKey: "{{ kv('GEMINI_API_KEY') }}"
    contentRetrievers:
      - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
        embeddings:
          type: io.kestra.plugin.ai.embeddings.KestraKVStore
        embeddingProvider:
          type: io.kestra.plugin.ai.provider.GoogleGemini
          modelName: gemini-embedding-exp-03-07
          googleApiKey: "{{ kv('GEMINI_API_KEY') }}"
        maxResults: 3
        minScore: 0.0
    prompt: What is the capital of France and how many people live there?

Use multiple embedding stores simultaneously. Because content retrievers are composable, a single task can retrieve from several embedding stores and from other sources, such as web search.

yaml
id: multi_store_rag
namespace: company.ai

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-2.0-flash
      googleApiKey: "{{ kv('GEMINI_API_KEY') }}"
    contentRetrievers:
      - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
        embeddings:
          type: io.kestra.plugin.ai.embeddings.Pinecone
          apiKey: "{{ kv('PINECONE_API_KEY') }}"
          index: technical-docs
        embeddingProvider:
          type: io.kestra.plugin.ai.provider.OpenAI
          apiKey: "{{ kv('OPENAI_API_KEY') }}"
          modelName: text-embedding-3-small
      - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
        embeddings:
          type: io.kestra.plugin.ai.embeddings.Qdrant
          host: localhost
          port: 6333
          collectionName: business-docs
        embeddingProvider:
          type: io.kestra.plugin.ai.provider.GoogleGemini
          modelName: gemini-embedding-exp-03-07
          googleApiKey: "{{ kv('GEMINI_API_KEY') }}"
      - type: io.kestra.plugin.ai.retriever.TavilyWebSearch
        tavilyApiKey: "{{ kv('TAVILY_API_KEY') }}"
    prompt: What are the latest trends in data orchestration?
Properties

embeddingProvider

Embedding model provider

Provider used to generate embeddings for the query. Must support embedding generation.

Definitions
accessKeyId*Requiredstring

AWS Access Key ID

modelName*Requiredstring
secretAccessKey*Requiredstring

AWS Secret Access Key

baseUrlstring
caPemstring
clientPemstring
modelTypestring
Default: COHERE
Possible Values
COHERE, TITAN

Amazon Bedrock Embedding Model Type

typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
caPemstring
clientPemstring
maxTokensintegerstring

Maximum Tokens

Specifies the maximum number of tokens that the model is allowed to generate in its response.

typeobject
endpoint*Requiredstring

API endpoint

The Azure OpenAI endpoint in the format: https://{resource}.openai.azure.com/

modelName*Requiredstring
apiKeystring
baseUrlstring
caPemstring
clientIdstring

Client ID

clientPemstring
clientSecretstring

Client secret

serviceVersionstring

API version

tenantIdstring

Tenant ID

typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
Default: https://dashscope-intl.aliyuncs.com/api/v1

If you use a model in the China (Beijing) region, replace the URL with https://dashscope.aliyuncs.com/api/v1; otherwise use the Singapore region URL, https://dashscope-intl.aliyuncs.com/api/v1.
The default value is computed based on the system timezone.
caPemstring
clientPemstring
enableSearchbooleanstring

Whether the model uses Internet search results as a reference when generating text

maxTokensintegerstring
repetitionPenaltynumberstring

Repetition in a continuous sequence during model generation

Increasing repetition_penalty reduces repetition in the generated text; 1.0 means no penalty. Value range: (0, +inf)
typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
Default: https://api.deepseek.com/v1
caPemstring
clientPemstring
typeobject
gitHubToken*Requiredstring

GitHub Token

Personal Access Token (PAT) used to access GitHub Models.

modelName*Requiredstring
baseUrlstring
caPemstring
clientPemstring
typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
caPemstring
clientPemstring
typeobject
endpoint*Requiredstring
location*Requiredstring

Project location

modelName*Requiredstring
project*Requiredstring

Project ID

baseUrlstring
caPemstring
clientPemstring
typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
Default: https://router.huggingface.co/v1
caPemstring
clientPemstring
typeobject
baseUrl*Requiredstring
modelName*Requiredstring
caPemstring
clientPemstring
typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
caPemstring
clientPemstring
typeobject
compartmentId*Requiredstring

OCID of OCI Compartment with the model

modelName*Requiredstring
region*Requiredstring

OCI Region to connect the client to

authProviderstring

OCI SDK Authentication provider

baseUrlstring
caPemstring
clientPemstring
typeobject
endpoint*Requiredstring

Model endpoint

modelName*Requiredstring
baseUrlstring
caPemstring
clientPemstring
typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
Default: https://api.openai.com/v1
caPemstring
clientPemstring
typeobject
apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring
caPemstring
clientPemstring
typeobject
apiKey*Requiredstring
modelName*Requiredstring
projectId*Requiredstring

Project Id

baseUrlstring
caPemstring
clientPemstring
typeobject
accountId*Requiredstring

Account Identifier

Unique identifier assigned to an account

apiKey*Requiredstring
modelName*Requiredstring
baseUrlstring

Base URL

Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).

caPemstring
clientPemstring
typeobject
apiKey*Requiredstring

API Key

modelName*Requiredstring

Model name

baseUrlstring
Default: https://open.bigmodel.cn/

API base URL

The base URL for ZhiPu API (defaults to https://open.bigmodel.cn/)

caPemstring

CA PEM certificate content

CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.

clientPemstring

Client PEM certificate content

PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.

maxRetriesintegerstring

The maximum number of retries for a request

maxTokenintegerstring

The maximum number of tokens returned by this request

stopsarray
SubTypestring

With the stop parameter, the model stops generating text when the output is about to contain one of the specified strings or token_ids

typeobject
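
The optional connection properties listed above (baseUrl, caPem, clientPem) can be combined on a single provider. A minimal sketch, assuming an OpenAI-compatible gateway: the gateway URL and the KV key names are hypothetical, and only the property names come from this reference.

```yaml
# Hypothetical retriever entry: the gateway URL and KV key names are assumptions.
contentRetrievers:
  - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    embeddingProvider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: text-embedding-3-small
      # Overrides the default https://api.openai.com/v1
      baseUrl: https://llm-gateway.example.com/v1
      # CA certificate as PEM text, used to verify the gateway's TLS certificate
      caPem: "{{ kv('GATEWAY_CA_PEM') }}"
      # Client certificate as PEM text, used to authenticate to the gateway
      clientPem: "{{ kv('GATEWAY_CLIENT_PEM') }}"
```

Storing the PEM content in the KV store, as sketched here, keeps certificates out of the flow source; any other way of supplying the text would work equally well.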

embeddings

Embedding store

The embedding store to retrieve relevant content from

Definitions
baseUrl*Requiredstring

The database base URL

collectionName*Requiredstring
typeobject
connection*Required
hosts*Requiredarray
SubTypestring
Min items1

List of HTTP Elasticsearch servers

Must be a URI like https://example.com:9200 with scheme and port

basicAuth

Basic authorization configuration

passwordstring

Basic authorization password

usernamestring

Basic authorization username

headersarray
SubTypestring

List of HTTP headers to be sent with every request

Each item is a key: value string, e.g., Authorization: Token XYZ

pathPrefixstring

Path prefix for all HTTP requests

If set to /my/path, each client request becomes /my/path/ + endpoint. Useful when Elasticsearch is behind a proxy providing a base path; do not use otherwise.

strictDeprecationModebooleanstring

Treat responses with deprecation warnings as failures

trustAllSslbooleanstring

Trust all SSL CA certificates

Use this if the server uses a self-signed SSL certificate

indexName*Requiredstring

The name of the index to store embeddings

typeobject
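
To show how the nested connection properties fit together, here is a sketch of a retriever backed by Elasticsearch. The store type name is assumed to follow the `io.kestra.plugin.ai.embeddings.*` pattern used by the other stores in the examples; the host, header, index name, and KV key names are placeholders.

```yaml
contentRetrievers:
  - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
    embeddings:
      # Assumed type name, by analogy with the Pinecone and Qdrant stores
      type: io.kestra.plugin.ai.embeddings.Elasticsearch
      connection:
        # At least one host is required; each needs a scheme and a port
        hosts:
          - https://example.com:9200
        basicAuth:
          username: "{{ kv('ES_USERNAME') }}"
          password: "{{ kv('ES_PASSWORD') }}"
        # Each header is a single "key: value" string
        headers:
          - "X-Request-Source: kestra"
      # Placeholder index name
      indexName: embeddings
    embeddingProvider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: text-embedding-3-small
```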
kvNamestring
Default: {{flow.id}}-embedding-store

The name of the KV pair to use

typeobject
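
Because the default KV name is derived from the flow ID, an ingestion task and a retriever that live in different flows would presumably resolve to different names unless kvName is set explicitly on both sides. A sketch under that assumption, using a hypothetical shared name:

```yaml
# Ingestion flow: write embeddings under an explicit KV name.
tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: text-embedding-3-small
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
      kvName: shared-docs-embedding-store   # hypothetical name
    fromDocuments:
      - content: Paris is the capital of France
---
# Retrieval flow (a separate flow): read from the same KV name.
contentRetrievers:
  - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
      kvName: shared-docs-embedding-store   # must match the ingestion side
    embeddingProvider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: text-embedding-3-small
```

The same embedding model is used on both sides so that query vectors and stored vectors are comparable.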
createTable*Requiredbooleanstring

Whether to create the table if it doesn't exist

databaseUrl*Requiredstring

Database URL of the MariaDB database (e.g., jdbc:mariadb://host:port/dbname)

fieldName*Requiredstring

Name of the column used as the unique ID in the database

password*Requiredstring
tableName*Requiredstring

Name of the table where embeddings will be stored

username*Requiredstring
columnDefinitionsarray
SubTypestring

Metadata Column Definitions

List of SQL column definitions for metadata fields (e.g., 'text TEXT', 'source TEXT'). Required only when using COLUMN_PER_KEY storage mode.

indexesarray
SubTypestring

Metadata Index Definitions

List of SQL index definitions for metadata columns (e.g., 'INDEX idx_text (text)'). Used only with COLUMN_PER_KEY storage mode.

metadataStorageModestring

Metadata Storage Mode

Determines how metadata is stored:

  • COLUMN_PER_KEY: Use individual columns for each metadata field (requires columnDefinitions and indexes).
  • COMBINED_JSON (default): Store metadata as a JSON object in a single column.

If columnDefinitions and indexes are provided, COLUMN_PER_KEY must be used.

typeobject
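
The interaction between metadataStorageMode, columnDefinitions, and indexes is easier to see in a concrete configuration. This is only a sketch: the store type name is assumed from the naming pattern of the other stores, and the JDBC URL, table, and column names are illustrative.

```yaml
embeddings:
  # Assumed type name, by analogy with the other embedding stores
  type: io.kestra.plugin.ai.embeddings.MariaDB
  databaseUrl: jdbc:mariadb://localhost:3306/vectors
  username: "{{ kv('MARIADB_USER') }}"
  password: "{{ kv('MARIADB_PASSWORD') }}"
  tableName: doc_embeddings
  fieldName: id
  createTable: true
  # One SQL column per metadata key instead of a single JSON column
  metadataStorageMode: COLUMN_PER_KEY
  columnDefinitions:
    - "source VARCHAR(255)"
    - "category VARCHAR(64)"
  # Index definitions refer to the columns declared above
  indexes:
    - "INDEX idx_source (source)"
```

Leaving out metadataStorageMode, columnDefinitions, and indexes falls back to the COMBINED_JSON default described above.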
token*Requiredstring

Token

Milvus auth token. Required if authentication is enabled; omit for local deployments without auth.

autoFlushOnDeletebooleanstring

Auto flush on delete

If true, flush after delete operations.

autoFlushOnInsertbooleanstring

Auto flush on insert

If true, flush after insert operations. Setting it to false can improve throughput.

collectionNamestring

Collection name

Target collection. Created automatically if it does not exist. Default: "default".

consistencyLevelstring

Read/write consistency level. Common values include STRONG, BOUNDED, or EVENTUALLY (depends on client/version).

databaseNamestring

Database name

Logical database to use. If not provided, the default database is used.

hoststring

Milvus host name (used when uri is not set). Default: "localhost".

idFieldNamestring

ID field name

Field name for document IDs. Default depends on collection schema.

indexTypestring

Index type

Vector index type (e.g., IVF_FLAT, IVF_SQ8, HNSW). Depends on Milvus deployment and dataset.

metadataFieldNamestring

Field name for metadata. Default depends on collection schema.

metricTypestring

Metric type

Similarity metric (e.g., L2, IP, COSINE). Should match the embedding provider’s expected metric.

passwordstring

Password

portintegerstring

Milvus port (used when uri is not set). Typical: 19530 (gRPC) or 9091 (HTTP). Default: 19530.

retrieveEmbeddingsOnSearchbooleanstring

Retrieve embeddings on search

If true, return stored embeddings along with matches. Default: false.

textFieldNamestring

Text field name

Field name for original text. Default depends on collection schema.

typeobject
uristring

URI

Connection URI. Use either uri OR host/port (not both). Examples:

  • gRPC (typical): "milvus://host:19530"
  • HTTP: "http://host:9091"
usernamestring

Username

Required when authentication/TLS is enabled. See https://milvus.io/docs/authenticate.md

vectorFieldNamestring

Vector field name

Field name for the embedding vector. Must match the index definition and embedding dimensionality.
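
As an illustration of the "uri OR host/port" rule, the sketch below connects through a URI only and leaves host and port unset. The store type name is assumed from the naming pattern of the other stores; the host, collection name, and KV key name are placeholders.

```yaml
embeddings:
  # Assumed type name, by analogy with the other embedding stores
  type: io.kestra.plugin.ai.embeddings.Milvus
  token: "{{ kv('MILVUS_TOKEN') }}"
  # Use uri alone; do not also set host/port
  uri: http://milvus.example.com:9091
  collectionName: technical_docs
  # Optional tuning; values taken from the examples listed above
  metricType: COSINE
  indexType: HNSW
  consistencyLevel: STRONG
```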

collectionName*Requiredstring
host*Requiredstring

The host

indexName*Requiredstring
scheme*Requiredstring

The scheme (e.g., mongodb+srv)

createIndexbooleanstring

Create the index

databasestring

The database

metadataFieldNamesarray
SubTypestring

The metadata field names

optionsobject

The connection string options

passwordstring

The password

typeobject
usernamestring

The username

database*Requiredstring

The database name

host*Requiredstring
password*Requiredstring

The database password

port*Requiredintegerstring
table*Requiredstring

The table to store embeddings in

user*Requiredstring

The database user

typeobject
useIndexbooleanstring
Default: false

Whether to use an IVFFlat index

An IVFFlat index divides vectors into lists, and then searches a subset of those lists closest to the query vector. It has faster build times and uses less memory than HNSW but has lower query performance (in terms of speed-recall tradeoff).
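
A sketch of a PostgreSQL-backed store with the IVFFlat index enabled. The store type name is an assumption based on the naming pattern of the other stores, and the connection values are placeholders.

```yaml
embeddings:
  # Assumed type name, by analogy with the other embedding stores
  type: io.kestra.plugin.ai.embeddings.PGVector
  host: localhost
  port: 5432
  database: vectors
  user: "{{ kv('PG_USER') }}"
  password: "{{ kv('PG_PASSWORD') }}"
  table: doc_embeddings
  # Off by default; trades some query performance for faster builds and less memory than HNSW
  useIndex: true
```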

apiKey*Requiredstring
cloud*Requiredstring

The cloud provider

index*Requiredstring

The index

region*Requiredstring

The cloud provider region

namespacestring

The namespace (default will be used if not provided)

typeobject
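
Since cloud and region are marked as required, a complete Pinecone store configuration would look roughly like the following. The cloud, region, and namespace values are illustrative guesses at the expected format, not values confirmed by this reference.

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.Pinecone
  apiKey: "{{ kv('PINECONE_API_KEY') }}"
  index: technical-docs
  # Illustrative values; check your Pinecone project for the actual ones
  cloud: AWS
  region: us-east-1
  # Optional; the default namespace is used when omitted
  namespace: docs-v1
```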
apiKey*Requiredstring

The API key

collectionName*Requiredstring

The collection name

host*Requiredstring
port*Requiredintegerstring
typeobject
host*Requiredstring

The database server host

port*Requiredintegerstring

The database server port

indexNamestring
Default: embedding-index

The index name

typeobject
accessKeyId*Requiredstring

Access Key ID

The access key ID used for authentication with the database.

accessKeySecret*Requiredstring

Access Key Secret

The access key secret used for authentication with the database.

endpoint*Requiredstring

Endpoint URL

The base URL for the Tablestore database endpoint.

instanceName*Requiredstring

Instance Name

The name of the Tablestore database instance.

metadataSchemaListarray

Metadata Schema List

Optional list of metadata field schemas for the collection.

analyzerstring
Possible Values
SingleWord, MaxWord, MinWord, Split, Fuzzy
analyzerParameter
dateFormatsarray
SubTypestring
enableHighlightingboolean
enableSortAndAggboolean
fieldNamestring
fieldTypestring
Possible Values
LONG, DOUBLE, BOOLEAN, KEYWORD, TEXT, NESTED, GEO_POINT, DATE, VECTOR, FUZZY_KEYWORD, IP, JSON, UNKNOWN
indexboolean
indexOptionsstring
Possible Values
DOCS, FREQS, POSITIONS, OFFSETS
isArrayboolean
jsonTypestring
Possible Values
FLATTEN, NESTED
sourceFieldNamesarray
SubTypestring
storeboolean
subFieldSchemasarray
analyzerstring
Possible Values
SingleWord, MaxWord, MinWord, Split, Fuzzy
analyzerParameter
dateFormatsarray
SubTypestring
enableHighlightingboolean
enableSortAndAggboolean
fieldNamestring
fieldTypestring
Possible Values
LONG, DOUBLE, BOOLEAN, KEYWORD, TEXT, NESTED, GEO_POINT, DATE, VECTOR, FUZZY_KEYWORD, IP, JSON, UNKNOWN
indexboolean
indexOptionsstring
Possible Values
DOCS, FREQS, POSITIONS, OFFSETS
isArrayboolean
jsonTypestring
Possible Values
FLATTEN, NESTED
sourceFieldNamesarray
SubTypestring
storeboolean
subFieldSchemasarray
vectorOptions
vectorOptions
dataTypestring
dimensioninteger
metricTypestring
Possible Values
EUCLIDEAN, COSINE, DOT_PRODUCT
typeobject
apiKey*Requiredstring

API key

Weaviate API key. Omit for local deployments without auth.

host*Requiredstring

Host

Cluster host name without protocol, e.g., "abc123.weaviate.network".

avoidDupsbooleanstring

Avoid duplicates

If true (default), a hash-based ID is derived from each text segment to prevent duplicates. If false, a random ID is used.

consistencyLevelstring
Possible Values
ONE, QUORUM, ALL

Consistency level

Write consistency: ONE, QUORUM (default), or ALL.

grpcPortintegerstring

gRPC port

Port for gRPC if enabled (e.g., 50051).

metadataFieldNamestring

Metadata field name

Field used to store metadata. Defaults to "_metadata" if not set.

metadataKeysarray
SubTypestring

Metadata keys

The list of metadata keys to store - if not provided, it will default to an empty list.

objectClassstring

Object class

Weaviate class to store objects in (must start with an uppercase letter). Defaults to "Default" if not set.

portintegerstring

Port

Optional port (e.g., 443 for https, 80 for http). Leave unset to use provider defaults.

schemestring

Scheme

Cluster scheme: "https" (recommended) or "http".

securedGrpcbooleanstring

Secure gRPC

Whether the gRPC connection is secured (TLS).

typeobject
useGrpcForInsertsbooleanstring

Use gRPC for batch inserts

If true, use gRPC for batch inserts. HTTP remains required for search operations.

maxResults
Default: 3

Maximum number of results to return from the embedding store

minScore
Default: 0.0

Minimum similarity score

Only results with a similarity score ≥ minScore are returned. Range: 0.0 to 1.0 inclusive.
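
The two retrieval limits work together: minScore filters out weak matches and maxResults caps how many of the remaining matches are returned. The values below are illustrative only; a suitable threshold depends on the embedding model and the data.

```yaml
contentRetrievers:
  - type: io.kestra.plugin.ai.retriever.EmbeddingStoreRetriever
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    embeddingProvider:
      type: io.kestra.plugin.ai.provider.OpenAI
      apiKey: "{{ kv('OPENAI_API_KEY') }}"
      modelName: text-embedding-3-small
    # Return at most five matches (default: 3)
    maxResults: 5
    # Keep only matches with a similarity score of at least 0.75 (default: 0.0)
    minScore: 0.75
```

With the default minScore of 0.0, every match passes the filter and only maxResults limits the output.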