TrilogyT (Transform)
TrilogyT (Transform)
Trilogy and ETL - A Modern Match
TrilogyT adds in additional features to Trilogy to enable it to serve as an ETL tool in a modern data stack.
Tips
Transform is under active development and is less mature/suitable for use than the other trilogy packages.
Installation
TrilogyT is available as a python package.
pip install pytrilogyt
Features
Optimization
Given a set of output queries, TrilogyT can materialize intermediate aggregations/joins for reuse. The long-term goal here is to enable consumption definitions to happen directly without thinking about performance, and then dynamically materialize the intermediate ETL graph. This will produce dynamic aggregates - such as those used by AtScale or PDTs in Looker - but with significantly higher accuracy and less effort due to the additional metadata the language provides.
Orchestration CodeGen
Dagster
Docs coming soon!
DBT
If we take the dimensional model query defined in the last demo, written in Trilogy, we can use TrilogyT to generate the required DBT model.
Input Trilogy
property passenger.id.split_cabin <- unnest(split(passenger.cabin, ' '));
persist cabin_info into dim_cabins from
select
passenger.id,
passenger.split_cabin;
Output SQL
-- Generated from Trilogy source: None
--Do not edit manually
{{ config(materialized='table') }}
WITH
faithful as (
SELECT
current_datetime() as `_Trilogyt__created_at`
),
rail as (
SELECT
local_raw_data.`passengerid` as `passenger_id`,
local_raw_data.`cabin` as `passenger_cabin`
FROM
raw_titanic as local_raw_data
),
stork as (
SELECT
split(rail.`passenger_cabin`, ' ') as `_anon_function_input_3690552380929428`,
rail.`passenger_id` as `passenger_id`
FROM
rail as rail
),
chimpanzee as (
SELECT
stork.`passenger_id` as `passenger_id`,
`passenger_split_cabin`
FROM
stork as stork
FULL JOIN unnest(stork.`_anon_function_input_3690552380929428`) as unnest_wrapper(`passenger_split_cabin`)
),
mandrill as (
SELECT
chimpanzee.`passenger_id` as `passenger_id`,
chimpanzee.`passenger_split_cabin` as `passenger_split_cabin`,
faithful.`_Trilogyt__created_at` as `_Trilogyt__created_at`
FROM
rail as rail
LEFT OUTER JOIN chimpanzee on rail.`passenger_id` = chimpanzee.`passenger_id`
FULL JOIN faithful on 1=1
)
SELECT
mandrill.`passenger_id`,
mandrill.`passenger_split_cabin`,
mandrill.`_Trilogyt__created_at`
FROM
mandrill
Refactor with Ease
DBT workflows derived from Trilogy should be treated as another compilation target for Trilogy; they can be rebuilt on-demand as your semantic model changes to ensure that your entire graph is always using the most up to date sources.