FileTransform

yaml
type: "io.kestra.plugin.scripts.groovy.FileTransform"

Transform an Ion-format file from Kestra with a Groovy script.

This allows you to transform data previously loaded by Kestra as needed.

The task takes an Ion-format file from Kestra and iterates over it row by row. Each row populates a global variable named row; alter this variable to change what is saved to the output file. If you set row to null, the row is skipped. You can also create a variable named rows to return many rows for a single input row.

Examples

Transform row by row with a file from internal storage

yaml
id: "file_transform"
type: "io.kestra.plugin.scripts.groovy.FileTransform"
from: "{{ outputs['avro-to-gcs'] }}"
script: |
  logger.info('row: {}', row)

  if (row.get('name') == 'richard') {
    row = null
  } else {
    row.put('email', row.get('name') + '@kestra.io')
  }

Create multiple rows from one row

yaml
id: "file_transform"
type: "io.kestra.plugin.scripts.groovy.FileTransform"
from: "{{ outputs['avro-to-gcs'] }}"
script: |
  logger.info('row: {}', row)
  rows = [["action": "insert"], row]

Transform with a file from a JSON string

yaml
id: "file_transform"
type: "io.kestra.plugin.scripts.groovy.FileTransform"
from: '[{"name":"jane"}, {"name":"richard"}]'
script: |
  logger.info('row: {}', row)

  if (row.get('name') == 'richard') {
    row = null
  } else {
    row.put('email', row.get('name') + '@kestra.io')
  }

Properties

from

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

Source file containing the rows to transform

Can be an internal storage URI, a map, or a list.

concurrent

  • Type: integer
  • Dynamic: ❌
  • Required: ❌
  • Minimum: >= 2

Number of concurrent parallel transformations

Note that row order is not preserved when parallelism is used.
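
As a minimal sketch of this setting, the task below asks for four parallel transformations. The task id and the value 4 are illustrative assumptions, and the upstream output reference is reused from the examples on this page. Because row order is not guaranteed, the script only makes per-row changes that do not depend on neighbouring rows.

```yaml
id: "file_transform_parallel"   # hypothetical task id
type: "io.kestra.plugin.scripts.groovy.FileTransform"
from: "{{ outputs['avro-to-gcs'] }}"
concurrent: 4                    # illustrative value; must be >= 2, output order is not guaranteed
script: |
  // Per-row change only: nothing here relies on the order of rows
  row.put('email', row.get('name') + '@kestra.io')
```

If downstream processing needs the original order, leaving concurrent unset is the safer choice.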

script

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

A full script

Outputs

uri

  • Type: string

URI of a temporary result file

The file will be serialized as an Ion file.
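
A minimal sketch of how this output might be consumed, assuming both tasks sit in the same flow and that a later task can read the result through an outputs expression of the form outputs.<task id>.uri. Here a second FileTransform takes the Ion file written by the first as its from value. Both task ids are hypothetical.

```yaml
tasks:
  - id: "add_email"            # hypothetical task id
    type: "io.kestra.plugin.scripts.groovy.FileTransform"
    from: "{{ outputs['avro-to-gcs'] }}"
    script: |
      row.put('email', row.get('name') + '@kestra.io')

  - id: "drop_richard"         # hypothetical task id
    type: "io.kestra.plugin.scripts.groovy.FileTransform"
    from: "{{ outputs.add_email.uri }}"   # Ion file written by the previous task
    script: |
      if (row.get('name') == 'richard') {
        row = null
      }
```

Chaining works here because from accepts an internal storage URI and the output is already in Ion format.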