FileTransform

yaml
type: "io.kestra.plugin.scripts.jython.FileTransform"

Transform an ion format file from Kestra with a Jython (Python) script.

This allows you to transform the data, previously loaded by Kestra, as you need.

Takes an ion format file from Kestra and iterates over it row by row. Each row populates a global variable named row. Modify this variable in the script; its final value is written to the output file. If you set row to None (null), the row is skipped. To return multiple rows for a single input row, create a variable named rows.
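The row contract described above can be mimicked in plain Python. This is a minimal sketch, not Kestra's implementation: apply_script is a hypothetical helper, the input rows are plain dicts rather than an ion file, and it assumes that a rows variable, when set, takes precedence over row.

```python
def apply_script(records, script):
    """Run `script` once per record, mimicking the row/rows contract.

    Hypothetical helper for illustration only; not part of Kestra's API.
    """
    output = []
    code = compile(script, "<script>", "exec")
    for record in records:
        # Each input row is exposed to the script as the global `row`.
        scope = {"row": dict(record)}
        exec(code, scope)
        if "rows" in scope:
            # Assumption: a `rows` variable emits several rows for one input row.
            output.extend(scope["rows"])
        elif scope["row"] is not None:
            # A row set to None is skipped; anything else is written out.
            output.append(scope["row"])
    return output


script = """
if row['name'] == 'richard':
    row = None
elif row['name'] == 'jane':
    rows = [dict(row, copy=1), dict(row, copy=2)]
else:
    row['email'] = row['name'] + '@kestra.io'
"""

records = [{"name": "jane"}, {"name": "richard"}, {"name": "sam"}]
result = apply_script(records, script)
```

Here jane is duplicated through rows, richard is dropped because row is None, and sam gains an email column.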

Examples

Extract data from an API, add a column, and store it as a downloadable CSV file.

yaml
id: etl-api-to-csv
namespace: dev

tasks:
  - id: download
    type: io.kestra.plugin.fs.http.Download
    uri: https://gorest.co.in/public/v2/users

  - id: ionToJSON
    type: "io.kestra.plugin.serdes.json.JsonReader"
    from: "{{outputs.download.uri}}"
    newLine: false

  - id: writeJSON
    type: io.kestra.plugin.serdes.json.JsonWriter
    from: "{{outputs.ionToJSON.uri}}"

  - id: addColumn
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{outputs.writeJSON.uri}}"
    script: |
      from datetime import datetime
      logger.info('row: {}', row)
      row['inserted_at'] = datetime.utcnow()

  - id: csv
    type: io.kestra.plugin.serdes.csv.CsvWriter
    from: "{{outputs.addColumn.uri}}"

Transform a file from internal storage.

yaml
id: "file_transform"
type: "io.kestra.plugin.scripts.jython.FileTransform"
from: "{{ outputs['avro-to-gcs'] }}"
script: |
  logger.info('row: {}', row)

  if row['name'] == 'richard': 
    row = None
  else: 
    row['email'] = row['name'] + '@kestra.io'

Transform rows passed as a JSON string.

yaml
id: "file_transform"
type: "io.kestra.plugin.scripts.jython.FileTransform"
from: "[{\"name\":\"jane\"}, {\"name\":\"richard\"}]"
script: |
  logger.info('row: {}', row)

  if row['name'] == 'richard': 
    row = None
  else: 
    row['email'] = row['name'] + '@kestra.io'

Properties

from

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

Source file containing rows to transform.

Can be a Kestra internal storage URI, a map, or a list.

concurrent

  • Type: integer
  • Dynamic: ❌
  • Required: ❌
  • Minimum: 2

Number of transformations to run in parallel.

Note that row order is not preserved when transformations run in parallel.
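The ordering caveat can be illustrated in plain Python. This is a sketch under stated assumptions, not a description of how Kestra schedules work: transform and the _index field are hypothetical, and a thread pool stands in for the parallel transformations. It shows one way to carry the input position on each row so that order can be restored downstream.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def transform(row):
    # Stand-in for the per-row script; adds a derived column.
    return dict(row, email=row["name"] + "@kestra.io")


def transform_parallel(rows, concurrent=2):
    # Tag each row with its input position before fanning out.
    indexed = [dict(row, _index=i) for i, row in enumerate(rows)]
    with ThreadPoolExecutor(max_workers=concurrent) as pool:
        futures = [pool.submit(transform, row) for row in indexed]
        # Results are collected as workers finish, so order is not guaranteed.
        return [future.result() for future in as_completed(futures)]


rows = [{"name": name} for name in ("jane", "richard", "sam", "lee")]
unordered = transform_parallel(rows, concurrent=2)
# Sorting on the carried index restores the input order.
ordered = sorted(unordered, key=lambda row: row["_index"])
```

If order matters and no such index is available, leaving concurrent unset avoids the problem.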

script

  • Type: string
  • Dynamic: ✔️
  • Required: ❌

The full Jython (Python) script to run against each row.

Outputs

uri

  • Type: string
  • Dynamic: ❌
  • Required: ❌
  • Format: uri

URI of a temporary result file.

The file is serialized in ion format.