ParquetWriter

type: "io.kestra.plugin.serdes.parquet.ParquetWriter"

Read a provided file containing Ion-serialized data and convert it to Parquet.
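A minimal flow using this task might look like the sketch below. The flow `id`, the `namespace`, the `file` input, and the two-field schema are hypothetical placeholders; only `type`, `from`, and `schema` come from this page, and the input declaration syntax may differ between Kestra versions.

```yaml
id: parquet_writer_sketch          # hypothetical flow id
namespace: company.team            # hypothetical namespace

inputs:
  - id: file                       # hypothetical input holding the Ion file
    type: FILE

tasks:
  - id: to_parquet
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"      # required: source file URI
    schema: |                      # required: Avro schema describing each row
      {
        "type": "record",
        "name": "Row",
        "fields": [
          {"name": "id", "type": "long"},
          {"name": "name", "type": ["null", "string"]}
        ]
      }
```

All other properties are optional and fall back to the defaults listed below.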

# Properties

# compressionCodec

  • Type: string
  • Dynamic:
  • Required:
  • Default: GZIP
  • Possible Values:
    • UNCOMPRESSED
    • SNAPPY
    • GZIP
    • ZSTD

The compression codec to use

# dateFormat

  • Type: string
  • Dynamic: ✔️
  • Required:
  • Default: yyyy-MM-dd[XXX]

Format to use when parsing date

# datetimeFormat

  • Type: string
  • Dynamic: ✔️
  • Required:
  • Default: yyyy-MM-dd'T'HH:mm[:ss][.SSSSSS][XXX]

Format to use when parsing datetime

Default value is yyyy-MM-dd'T'HH:mm[:ss][.SSSSSS][XXX]

# decimalSeparator

  • Type: string
  • Dynamic: ✔️
  • Required:
  • Default: .

Character to recognize as decimal point (e.g. use ‘,’ for European data).

Default value is '.'
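As an illustration, the fragment below overrides the date, datetime, and decimal parsing options for data written in a day-first European style. It assumes the patterns follow Java `DateTimeFormatter` syntax, which the bracketed optional sections in the defaults suggest but this page does not state; the task id and both inputs are hypothetical.

```yaml
tasks:
  - id: to_parquet                           # hypothetical task id
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"                # hypothetical FILE input
    schema: "{{ inputs.avroSchema }}"        # hypothetical STRING input holding the Avro schema
    dateFormat: "dd/MM/yyyy"                 # would match values such as 31/12/2024
    datetimeFormat: "dd/MM/yyyy HH:mm[:ss]"  # seconds are optional, as in the default pattern
    decimalSeparator: ","                    # would match values such as 3,14
```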

# dictionaryPageSize

  • Type: integer
  • Dynamic:
  • Required:
  • Default: 1048576

Max dictionary page size

# falseValues

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [f, false, disabled, 0, off, no, ]

Values to consider as False

# from

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

Source file URI

# inferAllFields

  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false

Try to infer all fields

If true, we try to infer all fields using trueValues, falseValues & nullValues. If false, we infer bool & null only on fields declared in the schema as null and bool.

# nullValues

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [, #N/A, #N/A N/A, #NA, -1.#IND, -1.#QNAN, -NaN, 1.#IND, 1.#QNAN, NA, n/a, nan, null]

Values to consider as null

# pageSize

  • Type: integer
  • Dynamic:
  • Required:
  • Default: 1048576

Target page size

# rowGroupSize

  • Type: integer
  • Dynamic:
  • Required:
  • Default: 134217728

Target row group size
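The defaults above correspond to 1 MiB pages (1048576 = 1024 × 1024) and 128 MiB row groups (134217728 = 128 × 1024 × 1024), assuming the sizes are expressed in bytes as is usual for Parquet writers. A sketch that halves the row group size and switches the codec could look like this; the task id and inputs are hypothetical, and the chosen values are only examples, not recommendations.

```yaml
tasks:
  - id: to_parquet                      # hypothetical task id
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"           # hypothetical FILE input
    schema: "{{ inputs.avroSchema }}"   # hypothetical STRING input holding the Avro schema
    compressionCodec: ZSTD              # one of UNCOMPRESSED, SNAPPY, GZIP, ZSTD
    pageSize: 1048576                   # 1 MiB (the default)
    dictionaryPageSize: 1048576         # 1 MiB (the default)
    rowGroupSize: 67108864              # 64 MiB, half of the default
```

These four properties are not marked as dynamic on this page, so literal values are used rather than template expressions.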

# schema

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The Avro schema associated with the data

# strictSchema

  • Type: boolean
  • Dynamic:
  • Required:
  • Default: false

Whether to consider a field present in the data but not declared in the schema as an error

Default value is false
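To show how `schema` and `strictSchema` fit together, here is a sketch with an inline one-field schema and strict checking enabled. The task id, the input, and the schema itself are hypothetical; the behavior noted in the comments is only what the description above states.

```yaml
tasks:
  - id: to_parquet                    # hypothetical task id
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"         # hypothetical FILE input
    strictSchema: true                # a field in the data that is missing from the schema becomes an error
    schema: |
      {
        "type": "record",
        "name": "Row",
        "fields": [
          {"name": "id", "type": "long"}
        ]
      }
```

With the default `strictSchema: false`, a row carrying an extra field such as `comment` would not be treated as an error.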

# timeFormat

  • Type: string
  • Dynamic: ✔️
  • Required:
  • Default: HH:mm[:ss][.SSSSSS][XXX]

Format to use when parsing time

# timeZoneId

  • Type: string
  • Dynamic:
  • Required:
  • Default: Etc/UTC

Timezone to use when no timezone can be parsed on the source.

If null, the timezone will be UTC. Default value is the system timezone.
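A sketch of the time-related options follows, assuming `timeZoneId` accepts IANA zone identifiers in the same form as the default `Etc/UTC` and that `timeFormat` uses the same pattern syntax as its default. The task id and inputs are hypothetical.

```yaml
tasks:
  - id: to_parquet                      # hypothetical task id
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"           # hypothetical FILE input
    schema: "{{ inputs.avroSchema }}"   # hypothetical STRING input holding the Avro schema
    timeFormat: "HH:mm[:ss]"            # seconds optional, no fractional part or offset
    timeZoneId: "Europe/Paris"          # applied when the source value carries no timezone
```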

# trueValues

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:
  • Default: [t, true, enabled, 1, on, yes]

Values to consider as True
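The inference options can be combined as in the following sketch, which narrows the accepted boolean and null spellings and applies them to every field. The specific values, the task id, and the inputs are hypothetical. Every entry is quoted so that the YAML parser keeps it as a string instead of coercing it to a boolean or number.

```yaml
tasks:
  - id: to_parquet                      # hypothetical task id
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"           # hypothetical FILE input
    schema: "{{ inputs.avroSchema }}"   # hypothetical STRING input holding the Avro schema
    inferAllFields: true                # apply the lists below to all fields, not only bool/null ones
    trueValues: ["yes", "y", "1"]
    falseValues: ["no", "n", "0"]
    nullValues: ["", "NULL", "N/A"]
```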

# version

  • Type: string
  • Dynamic:
  • Required:
  • Default: V2
  • Possible Values:
    • V1
    • V2

The Parquet writer version to use

# Outputs

# uri

  • Type: string

URI of a temporary result file
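A downstream task can typically read this output through the `outputs` expression, as sketched below. The task ids and inputs are hypothetical, and the `Log` task type is an assumption about the core plugin whose fully qualified name has changed between Kestra versions.

```yaml
tasks:
  - id: to_parquet                        # hypothetical task id
    type: io.kestra.plugin.serdes.parquet.ParquetWriter
    from: "{{ inputs.file }}"             # hypothetical FILE input
    schema: "{{ inputs.avroSchema }}"     # hypothetical STRING input holding the Avro schema

  - id: log_uri                           # hypothetical task id
    type: io.kestra.plugin.core.log.Log   # assumed core Log task; adjust to your Kestra version
    message: "Parquet file written to {{ outputs.to_parquet.uri }}"
```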