Load data from a local file to BigQuery
type: "io.kestra.plugin.gcp.bigquery.Load"
Examples

Load a CSV file from an input file:
id: gcp_bq_load
namespace: company.team

tasks:
  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ inputs.file }}"
    destinationTable: "my_project.my_dataset.my_table"
    format: CSV
    csvOptions:
      fieldDelimiter: ";"
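The task's from property reads "{{ inputs.file }}", so the flow needs a matching input declaration. A minimal sketch, added to the flow above and assuming a flow input named file of Kestra's FILE type:

inputs:
  - id: file     # uploaded at execution time and referenced as {{ inputs.file }}
    type: FILE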
Properties

Avro parsing options.

The clustering specification for the destination table.

Whether the job is allowed to create tables. Valid values: CREATE_IF_NEEDED, CREATE_NEVER.

CSV parsing options.

The table where to put query results. If not provided, a new table is created.
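As a sketch of controlling table creation on the Load task, assuming the option is exposed as createDisposition (an assumed property name mirroring BigQuery's create disposition; only the CREATE_IF_NEEDED and CREATE_NEVER values come from the reference above):

  # added to the Load task from the example above (assumed property name)
  createDisposition: CREATE_IF_NEEDED   # CREATE_NEVER fails when the table does not already exist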
The source format, and possibly some parsing options, of the external data. Valid values: CSV, JSON, AVRO, PARQUET, ORC.

The fully-qualified URIs that point to the source data.

The GCP service account to impersonate.

The geographic location where the dataset should reside. This property is experimental and might be subject to change or be removed. See Dataset Location.
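A sketch of pinning the job to a specific project and location; projectId and location are assumed property names matching the descriptions above (the service account and impersonation settings would be set the same way):

  # added to the Load task from the example above (assumed property names)
  projectId: my_project
  location: EU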
The GCP project ID.

The messages which would trigger an automatic retry. Each message is tested as a case-insensitive substring of the full error message. Default: ["due to concurrent update", "Retrying the job may solve the problem"].

The reasons which would trigger an automatic retry. Default: ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"].
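A sketch of overriding the automatic-retry matching, assuming the two lists are exposed as retryReasons and retryMessages (names assumed; the default values come from the reference above):

  # added to the Load task from the example above (assumed property names)
  retryReasons:
    - rateLimitExceeded
    - internalError
  retryMessages:
    - due to concurrent update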
The schema for the destination table. The schema can be omitted if the destination table already exists, or if you're loading data from a Google Cloud Datastore backup (i.e. the DATASTORE_BACKUP format option). For example:

  schema:
    fields:
      - name: colA
        type: STRING
      - name: colB
        type: NUMERIC

See the type values in StandardSQLTypeName.
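Putting the pieces together, a sketch of the schema block nested inside the Load task from the example above (the schema snippet itself comes from the reference; nesting it directly under the task is the assumption here):

- id: load
  type: io.kestra.plugin.gcp.bigquery.Load
  from: "{{ inputs.file }}"
  destinationTable: "my_project.my_dataset.my_table"
  format: CSV
  schema:
    fields:
      - name: colA
        type: STRING
      - name: colB
        type: NUMERIC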
Experimental. Options allowing the schema of the destination table to be updated as a side effect of the query job. Valid values: ALLOW_FIELD_ADDITION, ALLOW_FIELD_RELAXATION. Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND, and when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema.

The GCP scopes to be used. Default: ["https://www.googleapis.com/auth/cloud-platform"].

The GCP service account.

The time partitioning field for the destination table.

The time partitioning type specification for the destination table. Valid values: DAY, HOUR, MONTH, YEAR. Default: DAY.

The action that should occur if the destination table already exists. Valid values: WRITE_TRUNCATE, WRITE_APPEND, WRITE_EMPTY.
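A sketch combining the write-time options described above; writeDisposition, schemaUpdateOptions, and timePartitioningField are assumed property names that mirror the BigQuery options they describe:

  # added to the Load task from the example above (assumed property names)
  writeDisposition: WRITE_APPEND        # append instead of overwriting or failing
  schemaUpdateOptions:
    - ALLOW_FIELD_ADDITION              # allowed here because writeDisposition is WRITE_APPEND
  timePartitioningField: created_at     # DAY partitioning is the default type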
Outputs

Destination table.

The job ID.

Output rows count.
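Downstream tasks can reference these outputs with Kestra expressions. A sketch, assuming the row count and destination table are exposed under output keys named rows and destinationTable (the exact keys are assumptions; check the task's generated outputs):

- id: log_rows
  type: io.kestra.plugin.core.log.Log
  message: "Loaded {{ outputs.load.rows }} rows into {{ outputs.load.destinationTable }}"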
Definitions

Constant retry: a retry policy with a fixed interval between attempts. It takes an interval (a duration), a retry behavior (RETRY_FAILED_TASK, the default, or CREATE_NEW_EXECUTION), a maximum number of attempts (>= 1), a maximum total duration (a duration), the policy type constant, and a boolean option that defaults to false.

Random retry: a retry policy that waits a random duration between attempts, bounded by minimum and maximum intervals (durations). It shares the behavior (RETRY_FAILED_TASK, the default, or CREATE_NEW_EXECUTION), maximum attempts (>= 1), maximum duration, and default-false boolean fields; its policy type is random.
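A sketch of attaching a constant retry policy to the Load task, using Kestra's generic per-task retry block; the field names (interval, maxAttempt, maxDuration, warningOnRetry, behavior) are the usual Kestra retry fields and should be read as assumptions against the reference above:

  # added to the Load task from the example above
  retry:
    type: constant              # fixed delay between attempts
    interval: PT30S             # duration between attempts
    maxAttempt: 3               # must be >= 1
    maxDuration: PT10M          # give up once this total duration is exceeded
    warningOnRetry: false       # assumed name for the boolean that defaults to false
    behavior: RETRY_FAILED_TASK # or CREATE_NEW_EXECUTION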
CSV options (used by the csvOptions property):

The character encoding of the data. The supported values are UTF-8 or ISO-8859-1; the default value is UTF-8. BigQuery decodes the data after the raw, binary data has been split using the quote and field delimiter values.

The separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. BigQuery also supports the escape sequence "\t" to specify a tab separator. The default value is a comma (',').

The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double quote ('"'). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also allow quoted newlines.
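A sketch of these options inside the task's csvOptions block; fieldDelimiter appears in the example at the top, while quote and encoding are assumed option names matching the descriptions above:

  # inside the Load task from the example above
  csvOptions:
    fieldDelimiter: ";"
    quote: "\""            # assumed option name for the quote character
    encoding: "UTF-8"      # assumed option name; UTF-8 or ISO-8859-1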
Exponential retry: a retry policy whose delay between attempts grows over time, configured with an interval and a maximum interval (durations). Like the other policies, it takes a retry behavior (RETRY_FAILED_TASK, the default, or CREATE_NEW_EXECUTION), a maximum number of attempts (>= 1), a maximum total duration (a duration), the policy type exponential, and a boolean option that defaults to false.
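And a sketch of the exponential variant, again using Kestra's generic retry block with assumed field names (interval, maxInterval, maxAttempt, maxDuration):

  # added to the Load task from the example above
  retry:
    type: exponential
    interval: PT1S          # initial delay, increased on each attempt
    maxInterval: PT1M       # upper bound on the delay between attempts
    maxAttempt: 5           # must be >= 1
    maxDuration: PT15M      # give up once this total duration is exceeded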