CreateDataset

yaml
type: "io.kestra.plugin.gcp.bigquery.CreateDataset"

Create a dataset, or update it if it already exists (see ifExists).

Examples

Create a dataset if it does not exist

yaml
id: gcp_bq_create_dataset
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    ifExists: "SKIP"

Properties

name

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The dataset's user-defined ID.

acl

The dataset's access control configuration.

defaultEncryptionConfiguration

The default encryption key for all tables in the dataset.

Once this property is set, all newly-created partitioned tables in the dataset will have their encryption key set to this value, unless the table creation request (or query) overrides the key.
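As a minimal sketch, a default encryption key could be set as shown below. It relies on the kmsKeyName property listed under com.google.cloud.bigquery.EncryptionConfiguration in the Definitions section; the flow ID and the Cloud KMS key path are hypothetical placeholders.

```yaml
id: gcp_bq_create_encrypted_dataset
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    defaultEncryptionConfiguration:
      # Hypothetical Cloud KMS key resource name; replace with your own key.
      kmsKeyName: "projects/my-project/locations/europe/keyRings/my-ring/cryptoKeys/my-key"
```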

defaultPartitionExpirationMs

  • Type: integer
  • Dynamic:
  • Required:

Optional. The default partition expiration time for all partitioned tables in the dataset, in milliseconds.

Once this property is set, all newly-created partitioned tables in the dataset will have an expirationMs property in the timePartitioning settings set to this value. Changing the value only affects new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value. Setting this property overrides the use of defaultTableExpirationMs for partitioned tables: only one of defaultTableExpirationMs and defaultPartitionExpirationMs will be used for any new partitioned table. If you provide an explicit timePartitioning.expirationMs when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property. The value may be null.
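For illustration, the following sketch sets a default partition expiration of 90 days. The value is an arbitrary example (90 × 24 × 60 × 60 × 1000 = 7776000000 milliseconds), and the flow ID is hypothetical.

```yaml
id: gcp_bq_dataset_partition_expiration
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    # 90 days in milliseconds; applies only to partitioned tables created afterwards.
    defaultPartitionExpirationMs: 7776000000
```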

defaultTableLifetime

  • Type: integer
  • Dynamic:
  • Required:

The default lifetime of all tables in the dataset, in milliseconds.

The minimum value is 3600000 milliseconds (one hour). Once this property is set, all newly-created tables in the dataset will have an expirationTime property set to the creation time plus the value in this property. Changing the value only affects new tables, not existing ones. When the expirationTime for a given table is reached, that table will be deleted automatically. If a table's expirationTime is modified or removed before the table expires, or if you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property. This property is experimental and might change or be removed.
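A short sketch of a dataset whose tables expire one day after creation by default. The value 86400000 (24 × 60 × 60 × 1000 milliseconds) is an arbitrary example above the documented one-hour minimum; the flow ID is hypothetical.

```yaml
id: gcp_bq_dataset_table_lifetime
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_scratch_dataset"
    # One day in milliseconds; must be at least 3600000 (one hour).
    defaultTableLifetime: 86400000
```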

description

  • Type: string
  • Dynamic: ✔️
  • Required:

The dataset description.

A user-friendly description for the dataset.

friendlyName

  • Type: string
  • Dynamic: ✔️
  • Required:

A user-friendly name for the dataset.

ifExists

  • Type: string
  • Dynamic:
  • Required:
  • Default: ERROR
  • Possible Values:
    • ERROR
    • UPDATE
    • SKIP

Policy to apply if a dataset already exists.
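The sketch below uses the UPDATE policy so that re-running the flow applies the new description to an existing dataset instead of stopping on the default ERROR policy. The exact set of fields updated is not stated on this page, so treat this as an assumption; the flow ID and description text are illustrative.

```yaml
id: gcp_bq_upsert_dataset
namespace: company.team

tasks:
  - id: create_or_update_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    description: "Curated reporting tables"
    # One of ERROR (default), UPDATE, or SKIP.
    ifExists: "UPDATE"
```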

impersonatedServiceAccount

  • Type: string
  • Dynamic: ✔️
  • Required:

The GCP service account to impersonate.

labels

  • Type: object
  • SubType: string
  • Dynamic: ✔️
  • Required:

The dataset's labels.
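Since labels is an object with string values, it can be written as a plain YAML map. The keys and values below are hypothetical examples.

```yaml
id: gcp_bq_labeled_dataset
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    labels:
      # Hypothetical label keys and values.
      env: "dev"
      team: "data-platform"
```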

location

  • Type: string
  • Dynamic: ✔️
  • Required:

The geographic location where the dataset should reside.

This property is experimental and might change or be removed. See Dataset Location.

projectId

  • Type: string
  • Dynamic: ✔️
  • Required:

The GCP project ID.

retryAuto

retryMessages

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [due to concurrent update, Retrying the job may solve the problem]

The messages which would trigger an automatic retry.

Each message is matched as a case-insensitive substring of the full error message.

retryReasons

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [rateLimitExceeded, jobBackendError, internalError, jobInternalError]

The reasons which would trigger an automatic retry.
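The following sketch overrides both retry lists. It assumes that a supplied list replaces the defaults rather than extending them, so the default entries are repeated; the added entries ("try again later" and backendError) are illustrative choices, not values taken from this page.

```yaml
id: gcp_bq_dataset_custom_retry_filters
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    retryMessages:
      - "due to concurrent update"
      - "Retrying the job may solve the problem"
      # Illustrative extra substring, matched case-insensitively.
      - "try again later"
    retryReasons:
      - "rateLimitExceeded"
      - "jobBackendError"
      - "internalError"
      - "jobInternalError"
      # Illustrative extra reason.
      - "backendError"
```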

scopes

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [https://www.googleapis.com/auth/cloud-platform]

The GCP scopes to be used.

serviceAccount

  • Type: string
  • Dynamic: ✔️
  • Required:

The GCP service account.
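A minimal sketch of passing credentials explicitly. It assumes serviceAccount accepts the service account key content and that the key is stored as a Kestra secret; the secret name GCP_SERVICE_ACCOUNT_JSON and the project ID are hypothetical.

```yaml
id: gcp_bq_dataset_with_credentials
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    # Hypothetical project ID.
    projectId: "my-gcp-project"
    # Hypothetical secret name; keeps the key out of the flow source.
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
```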

Outputs

dataset

  • Type: string
  • Required: ✔️

The dataset's user-defined ID.

description

  • Type: string
  • Required: ✔️

A user-friendly description for the dataset.

friendlyName

  • Type: string
  • Required: ✔️

A user-friendly name for the dataset.

location

  • Type: string
  • Required: ✔️

The geographic location where the dataset should reside.

This property is experimental and might change or be removed. See Dataset Location.

project

  • Type: string
  • Required: ✔️

The GCP project ID.
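Outputs can be read by later tasks through the outputs expression. The sketch below logs the created dataset's project, ID, and location; the Log task type io.kestra.plugin.core.log.Log is an assumption about the Kestra version in use, and the flow and task IDs are illustrative.

```yaml
id: gcp_bq_dataset_outputs
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    ifExists: "SKIP"

  - id: log_dataset
    # Assumed core Log task; adjust the type to match your Kestra version.
    type: io.kestra.plugin.core.log.Log
    message: "Dataset {{ outputs.create_dataset.project }}.{{ outputs.create_dataset.dataset }} is in {{ outputs.create_dataset.location }}"
```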

Definitions

io.kestra.core.models.tasks.retrys.Constant

Properties

interval
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration
type
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: constant
behavior
  • Type: string
  • Dynamic:
  • Required:
  • Default: RETRY_FAILED_TASK
  • Possible Values:
    • RETRY_FAILED_TASK
    • CREATE_NEW_EXECUTION
maxAttempt
  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 1
maxDuration
  • Type: string
  • Dynamic:
  • Required:
  • Format: duration
warningOnRetry
  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false

io.kestra.core.models.tasks.retrys.Random

Properties

maxInterval
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration
minInterval
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration
type
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: random
behavior
  • Type: string
  • Dynamic:
  • Required:
  • Default: RETRY_FAILED_TASK
  • Possible Values:
    • RETRY_FAILED_TASK
    • CREATE_NEW_EXECUTION
maxAttempt
  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 1
maxDuration
  • Type: string
  • Dynamic:
  • Required:
  • Format: duration
warningOnRetry
  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false

io.kestra.plugin.gcp.bigquery.models.Entity

Properties

value
  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The value for the entity.

For example, user email if the type is USER.

com.google.cloud.bigquery.EncryptionConfiguration

Properties

kmsKeyName
  • Type: string
  • Dynamic:
  • Required:

io.kestra.core.models.tasks.retrys.Exponential

Properties

interval
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration
maxInterval
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Format: duration
type
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: exponential
behavior
  • Type: string
  • Dynamic:
  • Required:
  • Default: RETRY_FAILED_TASK
  • Possible Values:
    • RETRY_FAILED_TASK
    • CREATE_NEW_EXECUTION
delayFactor
  • Type: number
  • Dynamic:
  • Required:
maxAttempt
  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 1
maxDuration
  • Type: string
  • Dynamic:
  • Required:
  • Format: duration
warningOnRetry
  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false
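This page gives no description for retryAuto, so the sketch below assumes it accepts one of the retry definitions listed here (Constant, Random, or Exponential). It uses the Exponential properties above, with durations in ISO 8601 format; the specific values are arbitrary.

```yaml
id: gcp_bq_dataset_auto_retry
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    # Assumption: retryAuto takes a retry definition such as Exponential.
    retryAuto:
      type: exponential
      interval: PT1S       # required: initial wait between attempts
      maxInterval: PT30S   # required: upper bound on the wait
      delayFactor: 2
      maxAttempt: 5
```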

io.kestra.plugin.gcp.bigquery.models.AccessControl

Properties

entity
  • Type: Entity
  • Dynamic: ✔️
  • Required: ✔️

The GCP entity.

role
  • Type: string
  • Dynamic: ✔️
  • Required: ✔️
  • Possible Values:
    • READER
    • WRITER
    • OWNER

The role to assign to the entity.
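As a sketch of how the AccessControl and Entity definitions might combine under the acl property: this assumes acl takes a list of AccessControl entries and that Entity has a type property (the Entity description mentions a USER type, but the property itself is not listed on this page). The email address is a placeholder.

```yaml
id: gcp_bq_dataset_with_acl
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    # Assumption: acl is a list of AccessControl entries.
    acl:
      - entity:
          type: USER                     # assumed property, not listed above
          value: "analyst@example.com"   # placeholder user email
        role: READER                     # one of READER, WRITER, OWNER
```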
