FileTransform
type: "io.kestra.plugin.scripts.jython.FileTransform"
Transform an Ion-format file from Kestra with a Jython (Python) script.
This allows you to transform data previously loaded by Kestra as needed.
Takes an Ion-format file from Kestra and iterates over it row by row.
Each row populates a global variable named row. Modify this variable; its final value is what gets written to the output file.
If you set row to None (null), the row is skipped.
You can create a variable named rows to return multiple rows for a single input row.
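The per-row contract described above can be sketched in plain Python. This is only an illustration of the documented semantics, not Kestra's implementation: the helper name transform_rows is hypothetical, and giving rows precedence over row when both are set is an assumption.

```python
import copy

def transform_rows(rows_in, script):
    """Simulate the documented per-row contract (not Kestra's actual code)."""
    out = []
    for original in rows_in:
        # Each input row is exposed to the script as the global `row`.
        scope = {"row": copy.deepcopy(original)}
        exec(script, scope)
        if scope.get("rows") is not None:
            # `rows` was created: emit several output rows for this one input row.
            # (Assumption: `rows` takes precedence over `row`.)
            out.extend(scope["rows"])
        elif scope["row"] is not None:
            # Normal case: the possibly modified `row` is written out.
            out.append(scope["row"])
        # Otherwise `row` was set to None and the row is skipped.
    return out

SCRIPT = """
if row['name'] == 'richard':
    row = None
elif row['name'] == 'jane':
    rows = [dict(row, copy=i) for i in range(2)]
else:
    row['email'] = row['name'] + '@kestra.io'
"""

result = transform_rows(
    [{"name": "jane"}, {"name": "richard"}, {"name": "john"}],
    SCRIPT,
)
```

Here the row for jane fans out into two rows, the row for richard is dropped, and the row for john gains an email field.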
Examples
Extract data from an API, add a column and store it as a downloadable CSV file
id: etl-api-to-csv
namespace: dev
tasks:
  - id: download
    type: io.kestra.plugin.fs.http.Download
    uri: https://gorest.co.in/public/v2/users
  - id: ionToJSON
    type: "io.kestra.plugin.serdes.json.JsonReader"
    from: "{{outputs.download.uri}}"
    newLine: false
  - id: writeJSON
    type: io.kestra.plugin.serdes.json.JsonWriter
    from: "{{outputs.ionToJSON.uri}}"
  - id: addColumn
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{outputs.writeJSON.uri}}"
    script: |
      from datetime import datetime
      logger.info('row: {}', row)
      row['inserted_at'] = datetime.utcnow()
  - id: csv
    type: io.kestra.plugin.serdes.csv.CsvWriter
    from: "{{outputs.addColumn.uri}}"
Transform a file from internal storage
id: "file_transform"
type: "io.kestra.plugin.scripts.jython.FileTransform"
from: "{{ outputs['avro-to-gcs'] }}"
script: |
  logger.info('row: {}', row)
  if row['name'] == 'richard':
    row = None
  else:
    row['email'] = row['name'] + '@kestra.io'
Transform rows from a JSON string
id: "file_transform"
type: "io.kestra.plugin.scripts.jython.FileTransform"
from: '[{"name":"jane"}, {"name":"richard"}]'
script: |
  logger.info('row: {}', row)
  if row['name'] == 'richard':
    row = None
  else:
    row['email'] = row['name'] + '@kestra.io'
Properties
from
- Type: string
- Dynamic: ✔️
- Required: ✔️
Source file containing the rows to transform
Can be an internal storage URI, a map, or a list.
concurrent
- Type: integer
- Dynamic: ❌
- Required: ❌
- Minimum: >= 2
Number of transformations to run concurrently
Note that row order is not preserved when parallelism is used.
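Why concurrency can reorder rows is easiest to see in a small sketch. The example below uses Python threads purely as an analogy; how the task schedules work internally is not described here, and the transform function and delay field are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def transform(row):
    # Hypothetical per-row work; the sleep makes earlier rows finish later.
    time.sleep(row["delay"])
    return dict(row, done=True)

rows_in = [
    {"id": 1, "delay": 0.2},
    {"id": 2, "delay": 0.05},
    {"id": 3, "delay": 0.0},
]

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(transform, r) for r in rows_in]
    # Results are collected as workers finish, not in submission order.
    completed = [f.result() for f in as_completed(futures)]

completion_order = [r["id"] for r in completed]

# Every row is still present; only the order may differ from the input.
restored = sorted(completed, key=lambda r: r["id"])
```

If a downstream step depends on the original order, one option is to sort on a key the rows already carry, as the last line does.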
script
- Type: string
- Dynamic: ✔️
- Required: ❌
The full Jython script to run against each row
Outputs
uri
- Type: string
URI of a temporary result file
The file is serialized in Ion format.