CreateDataset
Create a BigQuery dataset, or update it if it already exists.
type: "io.kestra.plugin.gcp.bigquery.CreateDataset"
Examples
Create a dataset if it does not exist
id: gcp_bq_create_dataset
namespace: company.team
tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    ifExists: "SKIP"
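Update a dataset if it already exists

The following flow is a sketch rather than an example from the plugin itself. It combines the `ifExists`, `description`, `labels` and `defaultTableLifetime` properties listed under Properties. The flow ID, description text and label are hypothetical.

```yaml
id: gcp_bq_update_dataset  # hypothetical flow ID
namespace: company.team
tasks:
  - id: update_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    description: "Staging tables for the analytics team"  # hypothetical text
    labels:
      team: "analytics"  # hypothetical label
    # 24 h * 60 min * 60 s * 1000 ms = 86400000 ms (one day);
    # the documented minimum is 3600000 ms (one hour)
    defaultTableLifetime: 86400000
    # UPDATE policy: update the dataset instead of failing when it already exists
    ifExists: "UPDATE"
```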
Properties
name *Required string
The dataset's user-defined ID.
defaultEncryptionConfiguration Non-dynamic EncryptionConfiguration
The default encryption key for all tables in the dataset.
Once this property is set, all newly-created partitioned tables in the dataset will have their encryption key set to this value, unless the table creation request (or query) overrides the key.
defaultPartitionExpirationMs integer string
Optional. The default partition expiration time for all partitioned tables in the dataset, in milliseconds.
Once this property is set, all newly-created partitioned tables in the dataset will have an expirationMs property in the timePartitioning settings set to this value. Changing the value only affects new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value. Setting this property overrides the use of defaultTableExpirationMs for partitioned tables: only one of defaultTableExpirationMs and defaultPartitionExpirationMs will be used for any new partitioned table. If you provide an explicit timePartitioning.expirationMs when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property. The value may be null.
defaultTableLifetime integer string
The default lifetime of all tables in the dataset, in milliseconds.
The minimum value is 3600000 milliseconds (one hour). Once this property is set, all newly-created tables in the dataset will have an expirationTime property set to the creation time plus the value in this property. Changing the value only affects new tables, not existing ones. When the expirationTime for a given table is reached, that table is deleted automatically. If a table's expirationTime is modified or removed before the table expires, or if you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property. This property is experimental and might change or be removed.
description string
The dataset description.
A user-friendly description for the dataset.
friendlyName string
A user-friendly name for the dataset.
ifExists string
Default: ERROR
Possible values: ERROR, UPDATE, SKIP
Policy to apply if a dataset already exists.
impersonatedServiceAccount string
The GCP service account to impersonate.
labels object
The dataset's labels.
location string
The geographic location where the dataset should reside.
This property is experimental and might change or be removed. See Dataset Location.
projectId string
The GCP project ID.
retryAuto Non-dynamic Constant Exponential Random
Automatic retry for retryable BigQuery exceptions.
Some exceptions, especially rate-limit errors, are not retried by default by the BigQuery client. To handle them, the task applies its own transparent retry, which is separate from the Kestra task retry. The default is an exponential retry starting at 5 seconds, limited to 15 minutes in total and ten attempts.
retryMessages array
Default: ["due to concurrent update","Retrying the job may solve the problem","Retrying may solve the problem"]
The messages which would trigger an automatic retry.
Each entry is tested as a case-insensitive substring of the full error message.
retryReasons array
Default: ["rateLimitExceeded","jobBackendError","backendError","internalError","jobInternalError"]
The reasons which would trigger an automatic retry.
scopes array
Default: ["https://www.googleapis.com/auth/cloud-platform"]
The GCP scopes to be used.
serviceAccount string
The GCP service account.
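As an illustration of the retry properties above, the sketch below overrides the automatic retry. The field names come from the Exponential definition under Definitions. The discriminator value `exponential`, the ISO 8601 duration strings and the `maxInterval` cap are assumptions, not values confirmed by this page.

```yaml
id: gcp_bq_create_dataset_retry  # hypothetical flow ID
namespace: company.team
tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    retryAuto:
      type: exponential   # assumed discriminator for the Exponential definition
      interval: PT5S      # first delay, matching the documented 5-second default
      maxInterval: PT1M   # hypothetical cap on the delay between attempts
      maxDuration: PT15M  # stop retrying after 15 minutes in total
      maxAttempts: 10
    # retry only on these reasons instead of the full default list
    retryReasons:
      - rateLimitExceeded
      - backendError
    # matched as case-insensitive substrings of the error message
    retryMessages:
      - "due to concurrent update"
```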
Outputs
dataset *Required string
The dataset's user-defined ID.
description *Required string
A user-friendly description for the dataset.
friendlyName *Required string
A user-friendly name for the dataset.
location *Required string
The geographic location where the dataset should reside.
This property is experimental and might change or be removed. See Dataset Location.
project *Required string
The GCP project ID.
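The outputs above can be read by later tasks in the same flow. The sketch below assumes Kestra's `{{ outputs.<taskId>.<name> }}` expression syntax and a core Log task of type `io.kestra.plugin.core.log.Log`; the Log task type may differ between Kestra versions.

```yaml
id: gcp_bq_dataset_outputs  # hypothetical flow ID
namespace: company.team
tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    ifExists: "SKIP"
  - id: log_dataset
    type: io.kestra.plugin.core.log.Log  # assumed task type
    # reads the dataset, location and project outputs of the task above
    message: "Dataset {{ outputs.create_dataset.dataset }} is in {{ outputs.create_dataset.location }} (project {{ outputs.create_dataset.project }})"
```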
Definitions
io.kestra.core.models.tasks.retrys.Constant
interval *Required string
Format: duration
type *Required object
behavior string
Default: RETRY_FAILED_TASK
Possible values: RETRY_FAILED_TASK, CREATE_NEW_EXECUTION
maxAttempts integer
Minimum: >= 1
maxDuration string
Format: duration
warningOnRetry boolean
Default: false
io.kestra.core.models.tasks.retrys.Random
maxInterval *Required string
Format: duration
minInterval *Required string
Format: duration
type *Required object
behavior string
Default: RETRY_FAILED_TASK
Possible values: RETRY_FAILED_TASK, CREATE_NEW_EXECUTION
maxAttempts integer
Minimum: >= 1
maxDuration string
Format: duration
warningOnRetry boolean
Default: false
io.kestra.plugin.gcp.bigquery.models.Entity
type *Required string
Possible values: DOMAIN, GROUP, USER, IAM_MEMBER
The type of the entity (USER, GROUP, DOMAIN or IAM_MEMBER).
value *Required string
The value for the entity.
For example, the user's email address if the type is USER.
com.google.cloud.bigquery.EncryptionConfiguration
kmsKeyName string
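A minimal sketch of how this definition might be used in the `defaultEncryptionConfiguration` task property. The key name is a placeholder that follows the Cloud KMS resource name format; the project, key ring and key names are hypothetical.

```yaml
id: gcp_bq_create_dataset_cmek  # hypothetical flow ID
namespace: company.team
tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    defaultEncryptionConfiguration:
      # placeholder Cloud KMS key; replace every path segment with your own values
      kmsKeyName: "projects/my-project/locations/europe/keyRings/my-ring/cryptoKeys/my-key"
```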
io.kestra.core.models.tasks.retrys.Exponential
interval *Required string
Format: duration
maxInterval *Required string
Format: duration
type *Required object
behavior string
Default: RETRY_FAILED_TASK
Possible values: RETRY_FAILED_TASK, CREATE_NEW_EXECUTION
delayFactor number
maxAttempts integer
Minimum: >= 1
maxDuration string
Format: duration
warningOnRetry boolean
Default: false
io.kestra.plugin.gcp.bigquery.models.AccessControl
entity *Required Entity
The GCP entity.
role *Required string
Possible values: READER, WRITER, OWNER
The role to assign to the entity.