Source
id: dremio-sql-python
namespace: company.team
variables:
project_id: ead79cc0-9e93-4d50-b364-77639a56d4a6
tasks:
- id: query
type: io.kestra.plugin.jdbc.dremio.Query
disabled: false
url: jdbc:dremio:direct=sql.dremio.cloud:443;ssl=true;PROJECT_ID={{vars.project_id}};schema=postgres.public
username: $token
password: "{{ secret('DREMIO_TOKEN') }}"
sql: SELECT first_name, last_name, hire_date, salary FROM
postgres.public.employees LIMIT 100;
fetchType: STORE
- id: python
type: io.kestra.plugin.scripts.python.Script
warningOnStdErr: false
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: ghcr.io/kestra-io/polars:latest
script: >
import polars as pl import amazon.ion.simpleion as ion
from amazon.ion.simple_types import IonPyDecimal, IonPyNull
def convert_ion_types(value):
if isinstance(value, IonPyNull):
return None
elif isinstance(value, IonPyDecimal):
return float(value)
else:
return value
uri = "{{outputs.query.uri}}" with open(uri, 'rb') as f:
ion_content = f.read()
ion_data = ion.loads(ion_content, single_value=False) list_of_dicts =
[dict(record) for record in ion_data] list_of_dicts = [
{k: convert_ion_types(v) for k, v in record.items()} for record in list_of_dicts
]
polars_df = pl.DataFrame(list_of_dicts)
print(polars_df.glimpse())
About this blueprint
Python SQL Outputs
This flow queries data from a Dremio lakehouse and passes the query results to a downstream Python task that further transforms fetched data using Polars. The only required inputs are the project ID and a Dremio token.
The project ID can be found in your Dremio URL. For example, if your Dremio URL is https://app.dremio.cloud/sonar/ead79cc0-9e93-4d50-b364-77639a56d4a6
, then your project ID is the last string ead79cc0-9e93-4d50-b364-77639a56d4a6
.
To create a Dremio token, go to your Dremio account settings and then to the section "Personal Access Token". From here, you can create the token, copy it and store it as a kestra secret. For more detailed description of how to create your token, visit the following Dremio documentation. It's recommended to store the token as a Secret to avoid exposing it directly in your flow.
If you are new to Dremio, the easiest way to familiarize yourself with the platform is to create a Dremio Test Drive account, which provides you with a Dremio instance and sample datasets to query. You can reproduce this flow without any changes using the test drive account.
More Related Blueprints