Trilogy And ETL
TrilogyT and ETL - A Modern Match
TrilogyT adds in additional features to Trilogy to enable it to serve as an ETL tool in a modern data stack.
If you're not familiar with Trilogy, it's worth reviewing the Trilogy demo first.
Installation
TrilogyT is available as a python package.
pip install pytrilogyt
DBT Model Generation
If we take the dimensional model query defined in the last demo, written in Trilogy, we can use TrilogyT to generate the required DBT model.
Input Trilogy
property passenger.id.split_cabin <- unnest(split(passenger.cabin, ' '));
persist cabin_info into dim_cabins from
select
passenger.id,
passenger.split_cabin;
Output SQL
-- Generated from Trilogy source: None
--Do not edit manually
{{ config(materialized='table') }}
WITH
faithful as (
SELECT
current_datetime() as `_Trilogyt__created_at`
),
rail as (
SELECT
local_raw_data.`passengerid` as `passenger_id`,
local_raw_data.`cabin` as `passenger_cabin`
FROM
raw_titanic as local_raw_data
),
stork as (
SELECT
split(rail.`passenger_cabin`, ' ') as `_anon_function_input_3690552380929428`,
rail.`passenger_id` as `passenger_id`
FROM
rail as rail
),
chimpanzee as (
SELECT
stork.`passenger_id` as `passenger_id`,
`passenger_split_cabin`
FROM
stork as stork
FULL JOIN unnest(stork.`_anon_function_input_3690552380929428`) as unnest_wrapper(`passenger_split_cabin`)
),
mandrill as (
SELECT
chimpanzee.`passenger_id` as `passenger_id`,
chimpanzee.`passenger_split_cabin` as `passenger_split_cabin`,
faithful.`_Trilogyt__created_at` as `_Trilogyt__created_at`
FROM
rail as rail
LEFT OUTER JOIN chimpanzee on rail.`passenger_id` = chimpanzee.`passenger_id`
FULL JOIN faithful on 1=1
)
SELECT
mandrill.`passenger_id`,
mandrill.`passenger_split_cabin`,
mandrill.`_Trilogyt__created_at`
FROM
mandrill
Refactor with Ease
DBT workflows derived from Trilogy should be treated as another compilation target for Trilogy; they can be rebuilt on-demand as your semantic model changes to ensure that your entire graph is always using the most up to date sources.
Automatic Optimization
If you generate a large graph of Trilogy at once, TrilogyT will intelligently consolidate CTEs and other common subexpressions into their own derived models to minimize your total computations. These models will be stored in a unique 'cache' directory.