PGVector Embedding Store

yaml
type: "io.kestra.plugin.ai.embeddings.PGVector"

Ingest documents into a PGVector embedding store.

yaml
id: document-ingestion
namespace: company.team

tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
      apiKey: "{{ secret('GEMINI_API_KEY') }}"
    embeddings:
      type: io.kestra.plugin.ai.embeddings.PGVector
      host: localhost
      port: 5432
      user: "{{ secret('POSTGRES_USER') }}"
      password: "{{ secret('POSTGRES_PASSWORD') }}"
      database: postgres
      table: embeddings
    fromExternalURLs:
      - https://raw.githubusercontent.com/kestra-io/docs/refs/heads/main/content/blogs/release-0-22.md
Properties

The database name

The database server host

The database password

The database server port

The table to store embeddings in

The database user

Default false

Whether to use use an IVFFlat index

An IVFFlat index divides vectors into lists, and then searches a subset of those lists closest to the query vector. It has faster build times and uses less memory than HNSW but has lower query performance (in terms of speed-recall tradeoff).