Google Cloud

Integrate Google Cloud Platform services with Kestra data workflows.

Authentication

All tasks must authenticate to Google Cloud Platform. There are several ways to do this:

  • Set the task's serviceAccount property, which must contain the service account JSON content. If your cluster accesses only one GCP project, it can be handy to set this property globally by using plugin defaults.
  • Set the GOOGLE_APPLICATION_CREDENTIALS environment variable on the nodes running Kestra. It must point to an application credentials file. Warning: the variable must be the same on all worker nodes, and this approach can raise security concerns.
  • If neither is set, the default service account will be used.
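
As a minimal sketch of the first option, the flow below passes the service account JSON to a single task. The flow id, namespace, and the secret name GCP_SERVICE_ACCOUNT_JSON are hypothetical, and it assumes the JSON content is stored as a Kestra secret rather than written inline.

```yaml
id: gcp_auth_example        # hypothetical flow id
namespace: company.team     # hypothetical namespace

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    # Assumption: a secret named GCP_SERVICE_ACCOUNT_JSON holds the full
    # service account JSON content (not a path to a file).
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    sql: SELECT 1
```

Reading the key from a secret keeps the credentials out of the flow source; pasting the JSON directly into the property should also work but exposes it to anyone who can read the flow.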

You can also set authentication scopes. By default only one scope is used: https://www.googleapis.com/auth/cloud-platform.
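
The sketch below shows how additional scopes might be declared on a task. It assumes the property is named scopes and takes a list of scope URLs; the flow id, namespace, and the extra Google Drive scope are illustrative choices, not requirements.

```yaml
id: gcp_scopes_example      # hypothetical flow id
namespace: company.team     # hypothetical namespace

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    # Assumption: the scopes property replaces the default list, so the
    # default cloud-platform scope is repeated here alongside the extra one.
    scopes:
      - https://www.googleapis.com/auth/cloud-platform
      - https://www.googleapis.com/auth/drive
    sql: SELECT 1
```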

Common property

Each task lets you configure the GCP project identifier through the projectId property. If it is not set, the default project identifier is used (the one returned by ServiceOptions.getDefaultProjectId()). If your cluster accesses only one GCP project, it can be handy to set this property globally by using plugin defaults.
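
A rough sketch of setting these properties once for every GCP task in a flow follows. It assumes a Kestra version that supports the flow-level pluginDefaults key and type-prefix matching; the project identifier, flow id, namespace, and secret name are hypothetical.

```yaml
id: gcp_defaults_example    # hypothetical flow id
namespace: company.team     # hypothetical namespace

pluginDefaults:
  # Assumption: a package prefix applies the values to all tasks under it.
  - type: io.kestra.plugin.gcp
    values:
      projectId: my-gcp-project   # hypothetical project identifier
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    # No projectId or serviceAccount here: both come from pluginDefaults.
    sql: SELECT 1
```

A value set directly on a task should take precedence over the default, which allows an occasional task to target a different project.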

BigQuery

This sub-group of plugins contains tasks for accessing Google Cloud BigQuery. BigQuery is a completely serverless and cost-effective enterprise data warehouse.

Triggers

Tasks

Pub/Sub

This sub-group of plugins contains tasks for accessing Google Cloud Pub/Sub. Pub/Sub is an asynchronous and scalable messaging service that decouples services producing messages from services processing those messages.

Triggers

Tasks

Cloud Storage (GCS)

This sub-group of plugins contains tasks for accessing Google Cloud Storage (GCS). Cloud Storage is a managed service for storing unstructured data.

Triggers

Tasks

Dataproc Batches

This sub-group of plugins contains tasks for submitting batches on Google Cloud Dataproc. Dataproc is a managed Apache Spark and Apache Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning.

Tasks

Dataproc Clusters

This sub-group of plugins contains tasks to manipulate clusters on Google Cloud Dataproc. Dataproc is a managed Apache Spark and Apache Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning.

Tasks

CLI

Tasks

Firestore

This sub-group of plugins contains tasks for accessing Google Cloud Firestore. Firestore is a flexible, scalable NoSQL cloud database.

Tasks

Vertex AI

This sub-group of plugins contains tasks for accessing Google Cloud Vertex AI. Vertex AI lets you build, deploy, and scale machine learning (ML) models faster, with fully managed ML tools for any use case.

Tasks

Kubernetes Engine (GKE)

This sub-group of plugins contains tasks for accessing Google Kubernetes Engine (GKE). Kubernetes Engine is a scalable and fully automated Kubernetes service.

Tasks

Authentication

This sub-group of plugins contains tasks to manage authentication for Google Cloud.

Tasks