FileTransform FileTransform

yaml
type: "io.kestra.plugin.scripts.groovy.FileTransform"

Transform ion format file from Kestra with a groovy script.

This allows you to transform the data, previously loaded by Kestra, as you need.

Take a ion format file from Kestra and iterate row per row. Each row will populate a row global variable. You need to alter this variable that will be saved on output file. If you set the row to null, the row will be skipped. You can create a variable rows to return multiple rows for a single row.

Examples

Convert row by row of a file from Kestra's internal storage.

yaml
id: groovy_file_transform
namespace: company.team

inputs:
  - id: file
    type: FILE

tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.groovy.FileTransform
    from: "{{ inputs.file }}"
    script: |
      logger.info('row: {}', row)

      if (row.get('name') == 'richard') {
        row = null
      } else {
        row.put('email', row.get('name') + '@kestra.io')
      }

Create multiple rows from one row.

yaml
id: groovy_file_transform
namespace: company.team

inputs:
  - id: file
    type: FILE

tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.groovy.FileTransform
    from: "{{ inputs.file }}"
    script: |
      logger.info('row: {}', row)
      rows = [["action", "insert"], row]

Transform a JSON string to a file.

yaml
id: groovy_file_transform
namespace: company.team

inputs:
  - id: json
    type: JSON
    defaults: [{"name":"jane"}, {"name":"richard"}]

tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.groovy.FileTransform
    from: "{{ inputs.json }}"
    script: |
      logger.info('row: {}', row)

      if (row.get('name') == 'richard') {
        row = null
      } else {
        row.put('email', row.get('name') + '@kestra.io')
      }

JSON transformations using jackson library

yaml
id: json_transform_using_jackson
namespace: company.team

tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.groovy.FileTransform
    from: "[{"name":"John Doe", "age":99, "embedded":{"foo":"bar"}}]"
    script: |
      import com.fasterxml.jackson.*

      def mapper = new databind.ObjectMapper();
      def jsonStr = mapper.writeValueAsString(row);
      logger.info('input in json str: {}', jsonStr)

      def typeRef = new core.type.TypeReference<HashMap<String,Object>>() {};

      data = mapper.readValue(jsonStr, typeRef);

      logger.info('json object: {}', data);
      logger.info('embedded field: {}', data.embedded.foo)

Properties

from

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

Source file containing rows to transform.

Can be Kestra's internal storage URI, a map or a list.

concurrent

  • Type: integer
  • Dynamic:
  • Required:
  • Minimum: >= 2

Number of concurrent parallel transformations to execute.

Take care that the order is not respected if you use parallelism.

script

  • Type: string
  • Dynamic: ✔️
  • Required:

A full script.

Outputs

uri

  • Type: string
  • Required:
  • Format: uri

URI of a temporary result file.

The file will be serialized as ion file.

Was this page helpful?