MariaDB Embedding Store

```yaml
type: "io.kestra.plugin.ai.embeddings.MariaDB"
```

Ingest documents into a MariaDB embedding store

```yaml
id: document_ingestion
namespace: company.ai

tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
    embeddings:
      type: io.kestra.plugin.ai.embeddings.MariaDB
      username: "{{ kv('MARIADB_USERNAME') }}"
      password: "{{ kv('MARIADB_PASSWORD') }}"
      databaseUrl: "{{ kv('MARIADB_DATABASE_URL') }}"
      tableName: embeddings
      fieldName: id
    fromExternalURLs:
      - https://raw.githubusercontent.com/kestra-io/docs/refs/heads/main/content/blogs/release-0-24.md
```

Properties

Whether to create the table if it doesn't exist

Database URL of the MariaDB database (e.g., jdbc:mariadb://host:port/dbname)
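
As an illustration of the URL format, a literal value might look like the fragment below. The host `localhost` and database name `kestra` are placeholders chosen for this sketch, and 3306 is MariaDB's default port; in practice the value would usually come from a KV entry or secret, as in the example flow.

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.MariaDB
  # Placeholder host and database name; adjust to your environment.
  databaseUrl: "jdbc:mariadb://localhost:3306/kestra"
```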

Name of the column used as the unique ID in the database

The password used to connect to the MariaDB database

Name of the table where embeddings will be stored

The username used to connect to the MariaDB database

SubType string

Metadata Column Definitions

List of SQL column definitions for metadata fields (e.g., 'text TEXT', 'source TEXT'). Required only when using COLUMN_PER_KEY storage mode.

SubType string

Metadata Index Definitions

List of SQL index definitions for metadata columns (e.g., 'INDEX idx_text (text)'). Used only with COLUMN_PER_KEY storage mode.

Metadata Storage Mode

Determines how metadata is stored:
- COLUMN_PER_KEY: Use individual columns for each metadata field (requires columnDefinitions and indexes).
- COMBINED_JSON (default): Store metadata as a JSON object in a single column.

If columnDefinitions and indexes are provided, COLUMN_PER_KEY must be used.
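
To make the relationship between the three metadata properties concrete, here is a minimal sketch of the ingestion flow switched to COLUMN_PER_KEY. The keys `columnDefinitions` and `indexes` are the names used in the description above. The key `metadataStorageMode` is an assumption made for this sketch, so check the plugin schema for the exact name. The column and index definitions are copied from the examples in the property descriptions.

```yaml
id: document_ingestion_column_per_key
namespace: company.ai

tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
    embeddings:
      type: io.kestra.plugin.ai.embeddings.MariaDB
      username: "{{ kv('MARIADB_USERNAME') }}"
      password: "{{ kv('MARIADB_PASSWORD') }}"
      databaseUrl: "{{ kv('MARIADB_DATABASE_URL') }}"
      tableName: embeddings
      fieldName: id
      # Assumed property name for the storage mode; verify against the plugin schema.
      metadataStorageMode: COLUMN_PER_KEY
      # One SQL column definition per metadata field.
      columnDefinitions:
        - "text TEXT"
        - "source TEXT"
      # SQL index definitions for the metadata columns declared above.
      indexes:
        - "INDEX idx_text (text)"
    fromExternalURLs:
      - https://raw.githubusercontent.com/kestra-io/docs/refs/heads/main/content/blogs/release-0-24.md
```

With the default COMBINED_JSON mode, the last three properties under `embeddings` would simply be omitted and all metadata would land in a single JSON column.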