FileTransform
type: "io.kestra.plugin.scripts.jython.FileTransform"
Transform an Ion format file from Kestra with a Jython script.
This lets you reshape data previously loaded by Kestra to fit your needs.
The task takes an Ion format file from Kestra and iterates over it row by row.
Each row populates a global variable named row. Modify this variable to change what is written to the output file.
If you set row to null (None in a Jython script), the row is skipped.
You can create a variable named rows to return multiple rows for a single input row.
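The row contract described above can be sketched in plain Python, outside Kestra. This is only an illustration of the documented behavior, not the plugin's implementation: `apply_transform` is a hypothetical helper, and the use of `exec` with a per-row namespace is an assumption made for the sketch.

```python
import copy


def apply_transform(input_rows, script):
    """Hypothetical helper: run `script` once per row and collect the output rows."""
    output = []
    for original in input_rows:
        # Each run gets a fresh namespace where `row` holds the current row.
        scope = {"row": copy.deepcopy(original)}
        exec(script, scope)
        if "rows" in scope:
            # A `rows` variable returns several rows for one input row.
            output.extend(scope["rows"])
        elif scope["row"] is not None:
            # The (possibly modified) row is kept.
            output.append(scope["row"])
        # Otherwise `row` was set to None and the row is skipped.
    return output


script = """
if row['name'] == 'richard':
    row = None
else:
    row['email'] = row['name'] + '@kestra.io'
"""

result = apply_transform([{"name": "john"}, {"name": "richard"}], script)
print(result)
```

Here the row for richard is dropped and the row for john gains an email field, mirroring the three cases: modify row, set it to None, or assign rows.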
Examples
Extract data from an API, add a column, and store it as a downloadable CSV file.
id: etl_api_to_csv
namespace: company.team
tasks:
  - id: download
    type: io.kestra.plugin.fs.http.Download
    uri: https://gorest.co.in/public/v2/users
  - id: ion_to_json
    type: io.kestra.plugin.serdes.json.JsonToIon
    from: "{{ outputs.download.uri }}"
    newLine: false
  - id: write_json
    type: io.kestra.plugin.serdes.json.IonToJson
    from: "{{ outputs.ion_to_json.uri }}"
  - id: add_column
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{ outputs.write_json.uri }}"
    script: |
      from datetime import datetime
      logger.info('row: {}', row)
      row['inserted_at'] = datetime.utcnow()
  - id: csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{ outputs.add_column.uri }}"
Transform with file from internal storage.
id: jython_file_transform
namespace: company.team
inputs:
  - id: file
    type: FILE
tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{ inputs.file }}"
    script: |
      logger.info('row: {}', row)
      if row['name'] == 'richard':
        row = None
      else:
        row['email'] = row['name'] + '@kestra.io'
Create multiple rows from one row.
id: jython_file_transform
namespace: company.team
inputs:
  - id: file
    type: FILE
tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{ inputs.file }}"
    script: |
      logger.info('row: {}', row)
      rows = [{"action": "insert"}, row]
Transform with file from JSON string.
id: jython_file_transform
namespace: company.team
inputs:
  - id: json
    type: JSON
    defaults: {"name": "john"}
tasks:
  - id: file_transform
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{ inputs.json }}"
    script: |
      logger.info('row: {}', row)
      if row['name'] == 'richard':
        row = None
      else:
        row['email'] = row['name'] + '@kestra.io'
Properties
from
- Type: string
- Dynamic: ✔️
- Required: ✔️
Source file containing rows to transform.
Can be Kestra's internal storage URI, a map or a list.
concurrent
- Type: integer
- Dynamic: ❌
- Required: ❌
- Minimum: >= 2
Number of transformations to run concurrently.
Note that the order of rows in the output is not guaranteed to match the input when parallelism is used.
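To see why ordering can change under parallelism, the following plain Python 3 sketch processes rows with a small thread pool and collects results as they finish. It is a generic illustration, not how the plugin runs transformations: `transform` and the `id` field are hypothetical, and the pool size of 2 simply mirrors the documented minimum.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def transform(row):
    # Hypothetical per-row transformation: add an email field.
    return dict(row, email=row["name"] + "@kestra.io")


rows = [{"id": i, "name": "user%d" % i} for i in range(5)]

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(transform, r) for r in rows]
    # Results are gathered in completion order, which may differ from input order.
    results = [f.result() for f in as_completed(futures)]

# If a later step needs the original order, sort on a key carried in each row.
ordered = sorted(results, key=lambda r: r["id"])
```

If downstream tasks depend on row order, carrying a sortable key in each row, as above, is one way to restore it afterwards.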
script
- Type: string
- Dynamic: ✔️
- Required: ❌
The full Jython script to run for each row.
Outputs
uri
- Type: string
- Required: ❌
- Format: uri
URI of a temporary result file.
The file is serialized in Ion format.