
CreateDataset
Create a BigQuery dataset, or update it if it already exists.
type: "io.kestra.plugin.gcp.bigquery.CreateDataset"

Examples
Create a dataset if it does not exist.
```yaml
id: gcp_bq_create_dataset
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    ifExists: "SKIP"
```
Properties
name — string — Required
The dataset's user-defined ID.
acl — array — Non-dynamic
The dataset's access control configuration.
Each item is an io.kestra.plugin.gcp.bigquery.models.AccessControl:
entity — io.kestra.plugin.gcp.bigquery.models.Entity
The GCP entity.
type — Possible values: DOMAIN, GROUP, USER, IAM_MEMBER
The type of the entity (USER, GROUP, DOMAIN or IAM_MEMBER).
value
The value for the entity, for example the user email if the type is USER.
role — Possible values: READER, WRITER, OWNER
The role to assign to the entity.
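As a sketch, the access control configuration described above might look like this in a flow (the user email is a placeholder):

```yaml
id: gcp_bq_create_dataset_acl
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    acl:
      - entity:
          type: USER
          value: "analyst@example.com"  # placeholder email
        role: READER
```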
defaultEncryptionConfiguration — Non-dynamic
The default encryption key for all tables in the dataset.
Once this property is set, all newly-created partitioned tables in the dataset will have their encryption key set to this value, unless the table creation request (or query) overrides the key.
com.google.cloud.bigquery.EncryptionConfiguration
defaultPartitionExpirationMs — integer or string
The default partition expiration time for all partitioned tables in the dataset, in milliseconds.
Once this property is set, all newly-created partitioned tables in the dataset will have an expirationMs property in their timePartitioning settings set to this value. Changing the value only affects new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value. Setting this property overrides the use of defaultTableExpirationMs for partitioned tables: only one of defaultTableExpirationMs and defaultPartitionExpirationMs will be used for any new partitioned table. If you provide an explicit timePartitioning.expirationMs when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property. The value may be null.
defaultTableLifetime — integer or string
The default lifetime of all tables in the dataset, in milliseconds.
The minimum value is 3600000 milliseconds (one hour). Once this property is set, all newly-created tables in the dataset will have an expirationTime property set to the creation time plus the value in this property, and changing the value will only affect new tables, not existing ones. When the expirationTime for a given table is reached, that table will be deleted automatically. If a table's expirationTime is modified or removed before the table expires, or if you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property. This property is experimental and might be subject to change or removal.
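A minimal sketch combining both expiration properties; the values below are illustrative (3600000 ms is the documented one-hour minimum lifetime, and 604800000 ms is seven days):

```yaml
id: gcp_bq_dataset_expirations
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    # tables expire one hour after creation (the documented minimum)
    defaultTableLifetime: 3600000
    # partitions expire seven days after their partition time
    defaultPartitionExpirationMs: 604800000
```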
description — string
The dataset description.
friendlyName — string
A user-friendly name for the dataset.
ifExists — string — Default: ERROR — Possible values: ERROR, UPDATE, SKIP
Policy to apply if the dataset already exists.
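As a sketch of the UPDATE policy, an existing dataset's metadata might be refreshed like this (the description is illustrative):

```yaml
id: gcp_bq_update_dataset
namespace: company.team

tasks:
  - id: update_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    description: "Team analytics dataset"
    # update the dataset's metadata instead of failing when it exists
    ifExists: "UPDATE"
```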
impersonatedServiceAccount — string
The GCP service account to impersonate.
labels — object
The dataset's labels.
location — string
The geographic location where the dataset should reside.
projectId — string
The GCP project ID.
retryAuto — Non-dynamic
Automatic retry for retryable BigQuery exceptions.
Some exceptions (especially rate limiting) are not retried by default by the BigQuery client, so a transparent retry (not the Kestra one) is used by default to handle this case. The default is an exponential retry starting at 5 seconds, for a maximum of 15 minutes and ten attempts.
One of:
io.kestra.core.models.tasks.retrys.Constant
io.kestra.core.models.tasks.retrys.Exponential
io.kestra.core.models.tasks.retrys.Random
Each policy defines its retry interval(s) as durations, a behavior (RETRY_FAILED_TASK, the default, or CREATE_NEW_EXECUTION), a maximum number of attempts (>= 1), a maximum duration, and a warningOnRetry flag (default false).

retryMessages — array
Default: ["due to concurrent update", "Retrying the job may solve the problem", "Retrying may solve the problem"]
The messages that would trigger an automatic retry.
Each message is tested as a case-insensitive substring of the full error message.
retryReasons — array
Default: ["rateLimitExceeded", "jobBackendError", "backendError", "internalError", "jobInternalError"]
The reasons that would trigger an automatic retry.
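A sketch of overriding the automatic retry with a constant policy; the field names follow Kestra's retry configuration (io.kestra.core.models.tasks.retrys.Constant), and the interval and attempt count are illustrative:

```yaml
id: gcp_bq_create_dataset_retry
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    # replace the default exponential policy with a constant one
    retryAuto:
      type: constant
      interval: PT10S   # ISO-8601 duration: retry every 10 seconds
      maxAttempt: 5
```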
scopes — array
Default: ["https://www.googleapis.com/auth/cloud-platform"]
The GCP scopes to be used.
serviceAccount — string
The GCP service account.
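A sketch of the authentication-related properties together; the project ID, secret name, and impersonated account are placeholders:

```yaml
id: gcp_bq_create_dataset_auth
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    projectId: "my-gcp-project"  # placeholder project
    # service account key JSON, e.g. pulled from a Kestra secret (placeholder name)
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT') }}"
    # placeholder account to impersonate
    impersonatedServiceAccount: "etl-runner@my-gcp-project.iam.gserviceaccount.com"
```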
Outputs
dataset — string — Required
The dataset's user-defined ID.
description — string — Required
A user-friendly description for the dataset.
friendlyName — string — Required
A user-friendly name for the dataset.
location — string — Required
The geographic location where the dataset should reside.
This property is experimental and might be subject to change or removal. See Dataset Location.
project — string — Required
The GCP project ID.
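The outputs above can be referenced from a downstream task with a Pebble expression; a minimal sketch using a Log task (the task ids are illustrative):

```yaml
id: gcp_bq_create_and_log
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"

  - id: log_dataset
    type: io.kestra.plugin.core.log.Log
    # reference the previous task's outputs by task id
    message: "Created dataset {{ outputs.create_dataset.dataset }} in {{ outputs.create_dataset.location }}"
```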