CreateTable

Create a BigQuery table.

```yaml
type: "io.kestra.plugin.gcp.bigquery.CreateTable"
```

```yaml
id: gcp_bq_create_table
namespace: company.team

tasks:
  - id: create_table
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    projectId: my-project
    dataset: my-dataset
    table: my-table
    tableDefinition:
      type: TABLE
      schema:
        fields:
        - name: id
          type: INT64
        - name: name
          type: STRING
      standardTableDefinition:
        clustering:
        - id
        - name
    friendlyName: new_table
```

Properties

The dataset's user-defined ID.

The table's user-defined ID.

The user-friendly description for the table.

The encryption configuration.

Format duration

Sets the duration, from now, after which this table expires.

If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.
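A minimal sketch of setting a table time-to-live with this property, assuming it is named `expirationDuration` (matching the duration format noted above) and accepts an ISO-8601 duration:

```yaml
  - id: create_temp_table
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    projectId: my-project
    dataset: my-dataset
    table: my-temp-table
    # Assumed property name; ISO-8601 duration.
    # The table is deleted and its storage reclaimed 30 days after creation.
    expirationDuration: P30D
```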

The user-friendly name for the table.

The GCP service account to impersonate.

SubType string

Returns a map of labels applied to the table.

The geographic location where the dataset should reside.

This property is experimental and might be subject to change or removed.

See Dataset Location
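The labels and location properties described above can be combined on one task. A sketch, assuming they are named `labels` (a string-to-string map) and `location`:

```yaml
  - id: create_labeled_table
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    projectId: my-project
    dataset: my-dataset
    table: my-table
    # Free-form key/value labels applied to the table.
    labels:
      team: data-platform
      env: prod
    # Experimental: where the data should reside (see Dataset Location).
    location: EU
```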

The GCP project ID.

Returns true if a partition filter (that can be used for partition elimination) is required for queries over this table.

Automatic retry for retryable BigQuery exceptions.

Some exceptions (especially rate-limit errors) are not retried by default by the BigQuery client, so a transparent retry (separate from the Kestra retry mechanism) is applied by default to handle them. The defaults are an exponential backoff starting at 5 seconds, up to a maximum of 15 minutes and ten attempts.

SubType string
Default ["due to concurrent update","Retrying the job may solve the problem","Retrying may solve the problem"]

The messages which would trigger an automatic retry.

Message is tested as a substring of the full message, and is case insensitive.

SubType string
Default ["rateLimitExceeded","jobBackendError","backendError","internalError","jobInternalError"]

The reasons which would trigger an automatic retry.
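Both retry lists can be overridden on the task. A hedged sketch, assuming the properties are named `retryMessages` and `retryReasons` to match the defaults listed above:

```yaml
  - id: create_table_custom_retry
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    projectId: my-project
    dataset: my-dataset
    table: my-table
    # Matched as case-insensitive substrings of the error message.
    retryMessages:
      - "due to concurrent update"
      - "Retrying the job may solve the problem"
    # Matched against the BigQuery error reason code.
    retryReasons:
      - "rateLimitExceeded"
      - "backendError"
```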

SubType string
Default ["https://www.googleapis.com/auth/cloud-platform"]

The GCP scopes to be used.

The GCP service account.

The table definition.

Format date-time

The time when this table was created.

The dataset's ID.

The table definition.

The user-friendly description for the table.

The encryption configuration.

The hash of the table resource.

Format date-time

Returns the time when this table expires.

If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.

The user-friendly name for the table.

The service-generated id for the table.

SubType string

Returns a map of labels applied to the table.

Format date-time

The time when this table was last modified.

The size of this table in bytes.

The number of bytes considered "long-term storage" for reduced billing purposes.

The number of rows of data in this table.

The project's ID.

Returns true if a partition filter (that can be used for partition elimination) is required for queries over this table.

The table name.

Format duration
Default RETRY_FAILED_TASK
Possible Values
RETRY_FAILED_TASK, CREATE_NEW_EXECUTION
Minimum >= 1
Format duration
Default false

The external table definition if the type is EXTERNAL.

The materialized view definition if the type is MATERIALIZED_VIEW.

The table's schema.

The table definition if the type is TABLE.

Possible Values
TABLE, VIEW, MATERIALIZED_VIEW, EXTERNAL, MODEL

The table's type.

The view definition if the type is VIEW.

SubType string

Returns the clustering configuration for this table. If null, the table is not clustered.

Returns the range partitioning configuration for this table. If null, the table is not range-partitioned.

Returns information on the table's streaming buffer, if one exists. Returns null if no streaming buffer exists.

Returns the time partitioning configuration for this table. If null, the table is not time-partitioned.


The query whose result is persisted.

User-defined functions that can be used by the query. Returns null if not set.

SubType string

Whether automatic refresh is enabled for the materialized view when the base table is updated.

Format date-time

Date when this materialized view was last modified.

The query whose result is persisted.

Format duration

The maximum frequency at which this materialized view will be refreshed.

Format duration

The number of milliseconds for which to keep the storage for a partition. When expired, the storage for the partition is reclaimed. If null, the partition does not expire.

If not set, the table is partitioned by pseudo column '_PARTITIONTIME'; if set, the table is partitioned by this field.

If set to true, queries over this table require a partition filter (that can be used for partition elimination) to be specified.

Possible Values
DAY, HOUR, MONTH, YEAR

The time partitioning type.
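The time-partitioning options above can be sketched as part of the standard table definition. The `timePartitioning` block name and its keys (`type`, `field`, `expirationMs`, `requirePartitionFilter`) are assumptions, mirroring the `clustering` example at the top of this page:

```yaml
    tableDefinition:
      type: TABLE
      schema:
        fields:
          - name: event_time
            type: TIMESTAMP
      standardTableDefinition:
        # Assumed block and key names.
        timePartitioning:
          type: DAY
          # Partition by this column instead of the _PARTITIONTIME pseudo column.
          field: event_time
          # Reclaim a partition's storage after 90 days (value is in milliseconds).
          expirationMs: 7776000000
          # Queries over this table must specify a partition filter.
          requirePartitionFilter: true
```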

The field description.

Possible Values
NULLABLE, REQUIRED, REPEATED

The field mode.

By default, Field.Mode.NULLABLE is used.

The field name.

The policy tags for the field.

SubType

The list of sub-fields if the type is LegacySQLType.RECORD; null otherwise.

Possible Values
BOOL, INT64, FLOAT64, NUMERIC, BIGNUMERIC, STRING, BYTES, STRUCT, ARRAY, TIMESTAMP, DATE, TIME, DATETIME, GEOGRAPHY, JSON, INTERVAL, RANGE

The field type.
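A nested RECORD/STRUCT column combines the field properties above; the `subFields` key name is an assumption for the list of sub-fields described earlier:

```yaml
      schema:
        fields:
          - name: user
            type: STRUCT
            mode: NULLABLE
            # Assumed key name for the sub-fields of a RECORD/STRUCT column.
            subFields:
              - name: id
                type: INT64
                mode: REQUIRED
              - name: email
                type: STRING
```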

Possible Values
INLINE, FROM_URI

If the type is UserDefinedFunction.Type.INLINE, this is an inline code blob. If the type is UserDefinedFunction.Type.FROM_URI, it is a Google Cloud Storage URI (e.g. gs://bucket/path).

Whether automatic detection of schema and format options should be performed.

The compression type of the data source.

Possible Values
CSV, JSON, BIGTABLE, DATASTORE_BACKUP, AVRO, GOOGLE_SHEETS, PARQUET, ORC

The source format, and possibly some parsing options, of the external data.

Whether BigQuery should allow extra values that are not represented in the table schema.

If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result.

The maximum number of bad records that BigQuery can ignore when reading data.

If the number of bad records exceeds this value, an invalid error is returned in the job result.

SubType string

The fully-qualified URIs that point to your data in Google Cloud Storage.

Each URI can contain one '*' wildcard character, which must come after the bucket's name. Size limits related to load jobs apply to external data sources, plus an additional limit of 10 GB maximum size across all URIs.
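An external table over Cloud Storage ties the options above together. The `externalTableDefinition` block and its key names (`autodetect`, `format`, `sourceUris`) are assumptions for illustration:

```yaml
    tableDefinition:
      type: EXTERNAL
      # Assumed block and key names.
      externalTableDefinition:
        # Autodetect schema and format options.
        autodetect: true
        format: CSV
        sourceUris:
          # One '*' wildcard is allowed, and it must come after the bucket name.
          - gs://my-bucket/exports/2024/*.csv
```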