ParquetWriter
type: "io.kestra.plugin.serdes.parquet.ParquetWriter"
Read a provided file containing Ion-serialized data and convert it to Parquet.
# Properties
# compressionCodec
- Type: CompressionCodec
- Dynamic: ❌
- Required: ❌
- Default:
GZIP
The compression codec to use
# dateFormat
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
yyyy-MM-dd[XXX]
Format to use when parsing dates
# datetimeFormat
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
yyyy-MM-dd'T'HH:mm[:ss][.SSSSSS][XXX]
Format to use when parsing datetimes
# decimalSeparator
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
.
Character to recognize as the decimal point (e.g. use ',' for European data).
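As a minimal sketch, a task reading European-style numbers could override the separator as follows. The task id `to_parquet` is hypothetical, and the required `from` and `schema` properties are omitted for brevity.

```yaml
- id: to_parquet            # hypothetical task id
  type: io.kestra.plugin.serdes.parquet.ParquetWriter
  # from and schema are required but omitted here
  decimalSeparator: ","     # recognize "," as the decimal point, e.g. in "1234,56"
```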
# dictionaryPageSize
- Type: integer
- Dynamic: ❌
- Required: ❌
- Default:
1048576
Max dictionary page size
# falseValues
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default:
[f, false, disabled, 0, off, no, ]
Values to consider as False
# from
- Type: string
- Dynamic: ✔️
- Required: ✔️
Source file URI
# inferAllFields
- Type: boolean
- Dynamic: ❌
- Required: ❌
- Default:
false
Try to infer all fields
If true, we try to infer all fields with trueValues, falseValues & nullValues. If false, we infer bool & null only on fields declared in the schema as null and bool.
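The interaction between inferAllFields and the value lists might look like the following sketch. The task id and the specific strings are illustrative assumptions, and `from` and `schema` are omitted.

```yaml
- id: to_parquet            # hypothetical task id
  type: io.kestra.plugin.serdes.parquet.ParquetWriter
  # from and schema are required but omitted here
  inferAllFields: true      # apply the lists below to every field
  trueValues: ["Y", "yes"]
  falseValues: ["N", "no"]
  nullValues: ["", "NULL", "n/a"]
```

With inferAllFields left at its default of false, the same lists would only be applied to fields that the schema declares as null or bool.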
# nullValues
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default:
[, #N/A, #N/A N/A, #NA, -1.#IND, -1.#QNAN, -NaN, 1.#IND, 1.#QNAN, NA, n/a, nan, null]
Values to consider as null
# pageSize
- Type: integer
- Dynamic: ❌
- Required: ❌
- Default:
1048576
Target page size
# rowGroupSize
- Type: integer
- Dynamic: ❌
- Required: ❌
- Default:
134217728
Target row group size
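The defaults above correspond to 1 MiB pages (1048576 = 1024 × 1024) and 128 MiB row groups (134217728 = 128 × 1024 × 1024), assuming the values are byte counts as in the underlying Parquet library. A sketch overriding them, with a hypothetical task id and purely illustrative numbers:

```yaml
- id: to_parquet              # hypothetical task id
  type: io.kestra.plugin.serdes.parquet.ParquetWriter
  # from and schema are required but omitted here
  compressionCodec: GZIP      # the documented default, stated explicitly
  pageSize: 2097152           # 2 * 1024 * 1024
  dictionaryPageSize: 1048576 # unchanged default
  rowGroupSize: 268435456     # 256 * 1024 * 1024
```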
# schema
- Type: string
- Dynamic: ✔️
- Required: ✔️
The Avro schema associated with the data
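For orientation, a complete task definition could look like the sketch below. The task ids (`to_parquet`, `previous_task`), the record name, and the field names are hypothetical. The schema uses standard Avro JSON syntax, with a nullable union and the `date` logical type.

```yaml
- id: to_parquet                            # hypothetical task id
  type: io.kestra.plugin.serdes.parquet.ParquetWriter
  # assumes an upstream task named previous_task that outputs an Ion file
  from: "{{ outputs.previous_task.uri }}"
  schema: |
    {
      "type": "record",
      "name": "Order",
      "fields": [
        {"name": "id", "type": "long"},
        {"name": "customer", "type": ["null", "string"]},
        {"name": "amount", "type": "double"},
        {"name": "ordered_on", "type": {"type": "int", "logicalType": "date"}}
      ]
    }
```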
# strictSchema
- Type: boolean
- Dynamic: ❌
- Required: ❌
- Default:
false
Whether to treat a field that is present in the data but not declared in the schema as an error
# timeFormat
- Type: string
- Dynamic: ✔️
- Required: ❌
- Default:
HH:mm[:ss][.SSSSSS][XXX]
Format to use when parsing times
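The default patterns look like Java `DateTimeFormatter` patterns, in which square brackets mark optional sections. That reading is an assumption, since the text above does not name the pattern syntax. Under it, a source using day-first dates could be handled with a sketch like this (task id hypothetical, `from` and `schema` omitted):

```yaml
- id: to_parquet                          # hypothetical task id
  type: io.kestra.plugin.serdes.parquet.ParquetWriter
  # from and schema are required but omitted here
  dateFormat: "dd/MM/yyyy"                # e.g. 31/12/2023
  datetimeFormat: "dd/MM/yyyy HH:mm[:ss]" # seconds optional, e.g. 31/12/2023 23:59
  timeFormat: "HH:mm[:ss]"                # e.g. 23:59 or 23:59:30
```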
# timeZoneId
- Type: string
- Dynamic: ❌
- Required: ❌
- Default:
Etc/UTC
Timezone to use when no timezone can be parsed from the source.
If null, the timezone will be UTC.
Default value is the system timezone.
# trueValues
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ❌
- Default:
[t, true, enabled, 1, on, yes]
Values to consider as True
# version
- Type: Version
- Dynamic: ❌
- Required: ❌
- Default:
V2
The Parquet format version to write
# Outputs
# uri
- Type: string
URI of a temporary result file
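A downstream task can reference this output through Kestra's templating. The sketch below assumes the writer task is named `to_parquet` (hypothetical) and uses the core Log task. The `io.kestra.plugin.core.log.Log` type name is an assumption and differs in older Kestra releases.

```yaml
- id: log_result                        # hypothetical task id
  type: io.kestra.plugin.core.log.Log   # assumption: type name varies across Kestra versions
  message: "Parquet file written to {{ outputs.to_parquet.uri }}"
```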