ExtractToGcs

yaml
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"

Extract data from a BigQuery table to Google Cloud Storage (GCS).

Examples

Extract a BigQuery table to a GCS bucket

yaml
id: "extract_to_gcs"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris: 
  - "gs://bucket_name/filename.csv"
sourceTable: "my_project.my_dataset.my_table"
format: CSV
fieldDelimiter: ';'
printHeader: true

Properties

compression

  • Type: string
  • Dynamic: ✔️
  • Required:

The compression value to use for exported files. If not set, exported files are not compressed.
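As a minimal sketch, the task below exports a compressed CSV. It assumes that `GZIP` is an accepted value, as it is for BigQuery extract jobs in general. The bucket and table names are placeholders.

```yaml
id: "extract_compressed"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.csv.gz"
sourceTable: "my_project.my_dataset.my_table"
format: CSV
# Assumption: GZIP is one of the compression values BigQuery accepts for CSV exports
compression: GZIP
```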

destinationUris

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:

The list of fully-qualified Google Cloud Storage URIs (e.g. gs://bucket/path) where the extracted table should be written.
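For illustration, a destination URI can contain a single `*` wildcard, which lets BigQuery split a large export across several files. This sketch relies on BigQuery's general wildcard behavior for extract jobs. The bucket and path are placeholders.

```yaml
id: "extract_sharded"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  # BigQuery replaces the wildcard with a zero-padded file sequence number
  - "gs://bucket_name/exports/my_table-*.csv"
sourceTable: "my_project.my_dataset.my_table"
format: CSV
```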

fieldDelimiter

  • Type: string
  • Dynamic: ✔️
  • Required:

The delimiter to use between fields in the exported data. By default "," is used.

format

  • Type: string
  • Dynamic: ✔️
  • Required:

The exported file format. If not set, the table is exported in CSV format.
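A hedged example of a non-CSV export follows. It assumes the property accepts the BigQuery destination format name `NEWLINE_DELIMITED_JSON`. The bucket and table names are placeholders.

```yaml
id: "extract_json"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.json"
sourceTable: "my_project.my_dataset.my_table"
# Assumption: value taken from BigQuery's destination format names
format: NEWLINE_DELIMITED_JSON
```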

jobTimeoutMs

  • Type: integer
  • Dynamic:
  • Required:

Optional. Job timeout in milliseconds. If this time limit is exceeded, BigQuery may attempt to terminate the job.
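Because the value is expressed in milliseconds, a ten-minute limit is written as 600000. The sketch below uses placeholder bucket and table names.

```yaml
id: "extract_with_timeout"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.csv"
sourceTable: "my_project.my_dataset.my_table"
# 10 minutes = 10 * 60 * 1000 ms
jobTimeoutMs: 600000
```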

labels

  • Type: object
  • SubType: string
  • Dynamic: ✔️
  • Required:

The labels associated with this job.

You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key.
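The following sketch shows labels that satisfy the constraints above: lowercase keys that start with a letter and short values. The keys `team` and `env` and their values are hypothetical.

```yaml
id: "extract_labeled"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.csv"
sourceTable: "my_project.my_dataset.my_table"
labels:
  # Hypothetical labels: each key is unique, lowercase, and starts with a letter
  team: "data-eng"
  env: "prod"
```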

location

  • Type: string
  • Dynamic: ✔️
  • Required:

The geographic location where the dataset should reside.

This property is experimental and may change or be removed.

See Dataset Location
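As an illustration, the location can be set to a BigQuery multi-region such as `EU`. The value here is an assumption about where the source dataset lives and should match your own dataset.

```yaml
id: "extract_eu"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.csv"
sourceTable: "my_project.my_dataset.my_table"
# Assumption: the source dataset resides in the EU multi-region
location: "EU"
```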

printHeader

  • Type: boolean
  • Dynamic:
  • Required:

Whether to print a header row in the results. By default, a header is printed.

projectId

  • Type: string
  • Dynamic: ✔️
  • Required:

The GCP project ID.

retryAuto

retryMessages

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [due to concurrent update, Retrying the job may solve the problem]

The messages that are valid for an automatic retry.

Each message is matched as a case-insensitive substring of the full error message.

retryReasons

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [rateLimitExceeded, jobBackendError, internalError, jobInternalError]

The reasons that are valid for an automatic retry.
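A sketch of overriding both retry filters follows. It assumes that setting either property replaces its default list rather than extending it, so the defaults you still want are repeated. The message `"quota exceeded"` is a hypothetical addition.

```yaml
id: "extract_custom_retry_filters"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.csv"
sourceTable: "my_project.my_dataset.my_table"
retryReasons:
  # Two of the documented default reasons
  - "rateLimitExceeded"
  - "internalError"
retryMessages:
  # Documented default, matched as a case-insensitive substring
  - "due to concurrent update"
  # Hypothetical extra message
  - "quota exceeded"
```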

scopes

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [https://www.googleapis.com/auth/cloud-platform]

The GCP scopes to use.

serviceAccount

  • Type: string
  • Dynamic: ✔️
  • Required:

The GCP service account key
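Since `serviceAccount` and `projectId` are dynamic, the key need not be hard-coded in the flow. The sketch below reads it from a flow input named `gcp_service_account`, which is hypothetical and would have to be declared in the flow.

```yaml
id: "extract_with_credentials"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
projectId: "my_project"
# Hypothetical input holding the service account key JSON
serviceAccount: "{{ inputs.gcp_service_account }}"
destinationUris:
  - "gs://bucket_name/filename.csv"
sourceTable: "my_project.my_dataset.my_table"
```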

sourceTable

  • Type: string
  • Dynamic: ✔️
  • Required:

The table to export.

useAvroLogicalTypes

  • Type: boolean
  • Dynamic:
  • Required:

Optional. Applies only when format is set to "AVRO".

If format is set to "AVRO", this flag indicates whether to extract applicable column types (such as TIMESTAMP) to their corresponding Avro logical types (timestamp-micros), instead of only using their raw types (avro-long).
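A minimal sketch of an Avro export with logical types enabled is shown below. The bucket and table names are placeholders.

```yaml
id: "extract_avro"
type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
destinationUris:
  - "gs://bucket_name/filename.avro"
sourceTable: "my_project.my_dataset.my_table"
format: AVRO
# TIMESTAMP columns are written as timestamp-micros instead of raw avro-long
useAvroLogicalTypes: true
```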

Outputs

destinationUris

  • Type: array
  • SubType: string

The URIs of the destination files.

fileCounts

  • Type: array
  • SubType: integer

Number of extracted files

jobId

  • Type: string

The job id

sourceTable

  • Type: string

The source table.
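The outputs can be read by later tasks through Kestra's `outputs` expression. In this sketch the second task uses `io.kestra.core.tasks.debugs.Echo` purely for illustration. Whether that task type is available depends on your Kestra version, so treat it as an assumption and substitute your own logging task.

```yaml
tasks:
  - id: "extract_to_gcs"
    type: "io.kestra.plugin.gcp.bigquery.ExtractToGcs"
    destinationUris:
      - "gs://bucket_name/filename.csv"
    sourceTable: "my_project.my_dataset.my_table"

  # Assumption: Echo stands in for whichever logging task your version provides
  - id: "log_extract"
    type: "io.kestra.core.tasks.debugs.Echo"
    format: "Job {{ outputs.extract_to_gcs.jobId }} exported {{ outputs.extract_to_gcs.sourceTable }}"
```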

Definitions

Constant

interval

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration

type

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: constant

maxAttempt

  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 1

maxDuration

  • Type: string
  • Dynamic:
  • Required:
  • Format: duration

warningOnRetry

  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false
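As a sketch, a constant retry policy could look like the following. It assumes this definition is what the `retryAuto` property accepts, which this page does not state, and that durations are written in ISO 8601 form.

```yaml
# Assumption: retryAuto takes one of the retry definitions listed here
retryAuto:
  type: constant
  # Wait 10 seconds between attempts
  interval: PT10S
  maxAttempt: 5
  # Stop retrying after 5 minutes overall
  maxDuration: PT5M
```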

Random

maxInterval

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration

minInterval

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration

type

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: random

maxAttempt

  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 1

maxDuration

  • Type: string
  • Dynamic:
  • Required:
  • Format: duration

warningOnRetry

  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false
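A random retry policy could be sketched as below. Again, attaching it to `retryAuto` is an assumption, and the durations are illustrative ISO 8601 values.

```yaml
# Assumption: retryAuto takes one of the retry definitions listed here
retryAuto:
  type: random
  # Each wait is picked between 1 and 30 seconds
  minInterval: PT1S
  maxInterval: PT30S
  maxAttempt: 5
```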

Exponential

interval

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration

maxInterval

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration

type

  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: exponential

delayFactor

  • Type: number
  • Dynamic:
  • Required:

maxAttempt

  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 1

maxDuration

  • Type: string
  • Dynamic:
  • Required:
  • Format: duration

warningOnRetry

  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false
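An exponential retry policy might be written as follows. The link to `retryAuto` is an assumption, and the description of `delayFactor` as the multiplier applied between attempts is a guess from its name, since this page does not describe it.

```yaml
# Assumption: retryAuto takes one of the retry definitions listed here
retryAuto:
  type: exponential
  # First wait is 1 second
  interval: PT1S
  # Waits are capped at 1 minute
  maxInterval: PT1M
  # Assumption: each wait is the previous one multiplied by this factor
  delayFactor: 2.0
  maxAttempt: 6
  # Mark the execution with a warning if a retry was needed
  warningOnRetry: true
```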