MultimodalCompletion

Generate a multimodal completion with the Gemini client.

See the Gemini API documentation on multimodal input for more information.

```yaml
type: "io.kestra.plugin.gemini.MultimodalCompletion"
```

Examples

Multimodal completion using the Gemini Client

```yaml
id: gemini_multimodal_completion
namespace: company.team

inputs:
  - id: image
    type: FILE

tasks:
  - id: multimodal_completion
    type: io.kestra.plugin.gemini.MultimodalCompletion
    apiKey: "{{ secret('GEMINI_API_KEY') }}"
    model: "gemini-2.5-flash"
    contents:
      - content: Can you describe this image?
      - mimeType: image/jpeg
        content: "{{ inputs.image }}"
```

Generate, edit and analyze an image

```yaml
id: gemini_multimodal_generate_edit_analyze
namespace: company.team

inputs:
  - id: gen_prompt
    type: STRING
    defaults: "a giant library floating in the clouds with glowing bookshelves"
  - id: edit_prompt
    type: STRING
    defaults: "transform the background into a cyberpunk cityscape at night"

tasks:
  - id: generate
    type: io.kestra.plugin.gemini.MultimodalCompletion
    contents:
      - content: "{{ inputs.gen_prompt }}"

  - id: edit
    type: io.kestra.plugin.gemini.MultimodalCompletion
    contents:
      - content: "{{ inputs.edit_prompt }}"
      - mimeType: "{{ outputs.generate.images[0].mimeType }}"
        content: "{{ outputs.generate.images[0].uri }}"

  - id: analyze
    type: io.kestra.plugin.gemini.MultimodalCompletion
    contents:
      - content: "Describe the mood and style of this image."
      - mimeType: "{{ outputs.edit.images[0].mimeType }}"
        content: "{{ outputs.edit.images[0].uri }}"

pluginDefaults:
  - type: io.kestra.plugin.gemini.MultimodalCompletion
    values:
      apiKey: "{{ secret('GEMINI_API_KEY') }}"
      model: "gemini-2.5-flash-image-preview"
```

Properties

apiKey (string, required)

Gemini API Key

contents (array, required)

The chat content prompt for the model to respond to

Definitions

io.kestra.plugin.gemini.MultimodalCompletion-Content

content (string, required)

mimeType (string)

role (string)

Default: user

model (string, required)

Model

Specifies which generative model (e.g., 'gemini-1.5-flash', 'gemini-1.0-pro') to use for the completion.
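
The `role` property is easiest to see in a multi-turn prompt. The sketch below assumes the plugin accepts the Gemini API role name `model` for an earlier model reply, alongside the documented default `user`; this page only documents the default, so treat `model` as an assumption to verify. The flow id, task id, and prompt text are hypothetical.

```yaml
id: gemini_multimodal_roles
namespace: company.team

tasks:
  - id: follow_up
    type: io.kestra.plugin.gemini.MultimodalCompletion
    apiKey: "{{ secret('GEMINI_API_KEY') }}"
    model: "gemini-2.5-flash"
    contents:
      # role is omitted here, so it falls back to the documented default: user
      - content: "Suggest a name for a bakery."
      # Assumption: "model" marks a previous model reply, as in the Gemini API
      - role: model
        content: "How about 'Rise and Shine Bakery'?"
      # Explicit role, equivalent to leaving it out
      - role: user
        content: "Give me two more options in the same style."
```

Leaving `mimeType` unset, as here, matches the text-only items in the examples above.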

Outputs

blocked (boolean)

Default: false

Whether the response has been blocked for safety reasons

finishReason (string)

The reason the generation has finished

images (array)

Generated images stored in Kestra and exposed as URIs

When using image-generating/editing models like gemini-2.5-flash-image-preview, this field contains one or more Kestra storage URIs.

Definitions

io.kestra.plugin.gemini.MultimodalCompletion-GeneratedImage

mimeType (string)

IANA MIME type of the image, e.g. image/jpeg

uri (string)

Format: uri

Kestra storage URI of the image

safetyRatings (array)

The response safety ratings

Definitions

io.kestra.plugin.gemini.MultimodalCompletion-SafetyRating

blocked (boolean)

category (string)

probability (string)

text (string)

The generated response text
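
The outputs above can be consumed by downstream tasks through `outputs.<taskId>`, in the same way the examples read `images[0].uri`. The following is a minimal sketch, not a definitive pattern: it assumes a blocked response still completes the task with `blocked` set to true (rather than failing the task), and it uses Kestra's core `If` and `Log` tasks. The flow id and task ids are hypothetical.

```yaml
id: gemini_multimodal_outputs
namespace: company.team

inputs:
  - id: image
    type: FILE

tasks:
  - id: describe
    type: io.kestra.plugin.gemini.MultimodalCompletion
    apiKey: "{{ secret('GEMINI_API_KEY') }}"
    model: "gemini-2.5-flash"
    contents:
      - content: "Describe this image in one paragraph."
      - mimeType: image/jpeg
        content: "{{ inputs.image }}"

  # Branch on the blocked flag before using the generated text
  - id: check_blocked
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs.describe.blocked }}"
    then:
      # Assumption: safetyRatings renders as a readable list in the log message
      - id: log_blocked
        type: io.kestra.plugin.core.log.Log
        level: WARN
        message: "Response blocked. finishReason={{ outputs.describe.finishReason }}, safetyRatings={{ outputs.describe.safetyRatings }}"
    else:
      - id: log_text
        type: io.kestra.plugin.core.log.Log
        message: "{{ outputs.describe.text }}"
```

Checking `blocked` first avoids passing an empty or missing `text` value to later tasks.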

Metrics

candidate.token.count (counter)

The number of candidate tokens generated by the Gemini model.

prompt.token.count (counter)

The number of tokens used in the input prompt.

total.token.count (counter)

The total number of tokens processed by the Gemini model (prompt + generated).

1.1.1