ParquetWriter
type: "io.kestra.plugin.serdes.parquet.ParquetWriter"
Read a provided file containing Ion-serialized data and convert it to Parquet.
Properties
from
- Type: string
- Dynamic: ✔️
- Required: ✔️
Source file URI
schema
- Type: string
- Dynamic: ✔️
- Required: ✔️
The Avro schema associated with the data
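As an illustration, an Avro record schema is a JSON document. The sketch below builds one in Python and serializes it to a string of the kind that could be passed in schema. The record name and all field names (Order, id, amount, paid, created_at) are hypothetical and not taken from the plugin.

```python
import json

# Hypothetical Avro record schema; every name here is illustrative only.
schema = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "amount", "type": "double"},
        # A union with "null" makes the field nullable in Avro.
        {"name": "paid", "type": ["null", "boolean"], "default": None},
        # Avro logical type for a timestamp stored as a long.
        {
            "name": "created_at",
            "type": {"type": "long", "logicalType": "timestamp-millis"},
        },
    ],
}

# The schema property is a string, so serialize the structure to JSON.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```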
compressionCodec
- Type: string
- Dynamic: ❌
- Required: ❌
- Default:
GZIP
- Possible Values:
UNCOMPRESSED
SNAPPY
GZIP
ZSTD
The compression codec to use
dateFormat
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
yyyy-MM-dd[XXX]
Format to use when parsing date
datetimeFormat
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
yyyy-MM-dd'T'HH:mm[:ss][.SSSSSS][XXX]
Format to use when parsing datetime
decimalSeparator
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
.
Character to recognize as decimal point (e.g. use ‘,’ for European data).
Default value is '.'
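A minimal sketch of what a configurable decimal separator means for parsing, assuming the separator is simply normalized to '.' before conversion. The helper parse_decimal is hypothetical and is not the plugin's implementation; it does not handle thousands separators.

```python
from decimal import Decimal


def parse_decimal(text: str, decimal_separator: str = ".") -> Decimal:
    """Parse a numeric string whose decimal point is `decimal_separator`."""
    # Normalize the configured separator to '.', which Decimal understands.
    return Decimal(text.strip().replace(decimal_separator, "."))


print(parse_decimal("3,14", decimal_separator=","))
print(parse_decimal("2.5"))
```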
dictionaryPageSize
- Type: integer
- Dynamic: ❌
- Required: ❌
- Default:
1048576
Max dictionary page size
falseValues
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default:
[f, false, disabled, 0, off, no, ]
Values to consider as False
inferAllFields
- Type: boolean
- Dynamic: ❌
- Required: ❌
- Default:
false
Try to infer all fields
If true, we try to infer all fields with trueValues, falseValues & nullValues. If false, we infer bool & null only on fields declared in the schema as null and bool.
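The inference rule can be sketched as follows, using the documented defaults for trueValues, falseValues and nullValues. This is an illustration only: the function infer, the exact-match (case-sensitive) comparison, and the choice to test null before boolean are all assumptions, not the plugin's confirmed behavior. Avro names the boolean type "boolean", which is what the sketch checks for.

```python
TRUE_VALUES = ["t", "true", "enabled", "1", "on", "yes"]
FALSE_VALUES = ["f", "false", "disabled", "0", "off", "no", ""]
NULL_VALUES = [
    "", "#N/A", "#N/A N/A", "#NA", "-1.#IND", "-1.#QNAN", "-NaN",
    "1.#IND", "1.#QNAN", "NA", "n/a", "nan", "null",
]


def infer(value, declared_types, infer_all_fields=False):
    """Coerce `value` to None/True/False when inference applies.

    `declared_types` is the set of Avro type names declared for the field.
    """
    # With infer_all_fields, every field is checked; otherwise only
    # fields whose schema declares "null" or "boolean".
    check_null = infer_all_fields or "null" in declared_types
    check_bool = infer_all_fields or "boolean" in declared_types
    # Assumption: null is tested first, so "" becomes None when both apply.
    if check_null and value in NULL_VALUES:
        return None
    if check_bool and value in TRUE_VALUES:
        return True
    if check_bool and value in FALSE_VALUES:
        return False
    return value


print(infer("yes", {"string"}))                         # left as a string
print(infer("yes", {"string"}, infer_all_fields=True))  # inferred as boolean
```

Note that the default lists overlap: the empty string appears in both falseValues and nullValues, so the order of the checks matters for fields that are both nullable and boolean.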
nullValues
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default:
[, #N/A, #N/A N/A, #NA, -1.#IND, -1.#QNAN, -NaN, 1.#IND, 1.#QNAN, NA, n/a, nan, null]
Values to consider as null
pageSize
- Type: integer
- Dynamic: ❌
- Required: ❌
- Default:
1048576
Target page size
rowGroupSize
- Type: integer
- Dynamic: ❌
- Required: ❌
- Default:
134217728
Target row group size
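The size defaults are easier to read as powers of two. The arithmetic below assumes the values are expressed in bytes, which is the usual convention for Parquet writers; the variable names are illustrative.

```python
MIB = 1024 * 1024  # one mebibyte, in bytes

page_size = 1 * MIB             # default pageSize
dictionary_page_size = 1 * MIB  # default dictionaryPageSize
row_group_size = 128 * MIB      # default rowGroupSize

print(page_size, dictionary_page_size, row_group_size)
```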
strictSchema
- Type: boolean
- Dynamic: ❌
- Required: ❌
- Default:
false
Whether to consider a field present in the data but not declared in the schema as an error
Default value is false
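A minimal sketch of the strictSchema check. The helper check_record is hypothetical, and the non-strict behavior shown here (silently dropping undeclared fields) is an assumption rather than something the documentation confirms.

```python
def check_record(record: dict, schema_fields: list, strict_schema: bool = False) -> dict:
    """Return the record restricted to schema fields, or raise in strict mode."""
    # Fields present in the data but not declared in the schema.
    unknown = sorted(set(record) - set(schema_fields))
    if unknown and strict_schema:
        raise ValueError(f"Fields not declared in schema: {unknown}")
    # Assumption: in non-strict mode, undeclared fields are ignored.
    return {name: record[name] for name in schema_fields if name in record}


row = {"id": 1, "amount": 9.5, "comment": "extra"}
print(check_record(row, ["id", "amount"]))
```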
timeFormat
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
HH:mm[:ss][.SSSSSS][XXX]
Format to use when parsing time
timeZoneId
- Type: string
- Dynamic: ❌
- Required: ❌
- Default:
Etc/UTC
Time zone to use when no time zone can be parsed from the source.
If null, the time zone will be UTC
Default value is Etc/UTC
trueValues
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default:
[t, true, enabled, 1, on, yes]
Values to consider as True
version
- Type: string
- Dynamic: ❌
- Required: ❌
- Default:
V2
- Possible Values:
V1
V2
Parquet format version to write
Outputs
uri
- Type: string
URI of a temporary result file