Trilogy And ETL

TrilogyT and ETL - A Modern Match

TrilogyT adds in additional features to Trilogy to enable it to serve as an ETL tool in a modern data stack.

If you're not familiar with Trilogy, it's worth reviewing the Trilogy demo first.

Installation

TrilogyT is available as a python package.

pip install pytrilogyt

DBT Model Generation

If we take the dimensional model query defined in the last demo, written in Trilogy, we can use TrilogyT to generate the required DBT model.

Input Trilogy

property passenger.id.split_cabin <- unnest(split(passenger.cabin, ' '));
persist cabin_info into dim_cabins from 
select 
    passenger.id, 
    passenger.split_cabin;

Output SQL

-- Generated from Trilogy source: None
--Do not edit manually
{{ config(materialized='table') }}

WITH 
faithful as (
SELECT
    current_datetime() as `_Trilogyt__created_at`

),
rail as (
SELECT
    local_raw_data.`passengerid` as `passenger_id`,
    local_raw_data.`cabin` as `passenger_cabin`
FROM
    raw_titanic as local_raw_data
),
stork as (
SELECT
    split(rail.`passenger_cabin`, ' ') as `_anon_function_input_3690552380929428`,
    rail.`passenger_id` as `passenger_id`
FROM
    rail as rail
),
chimpanzee as (
SELECT
    stork.`passenger_id` as `passenger_id`,
    `passenger_split_cabin`
FROM
    stork as stork

FULL JOIN unnest(stork.`_anon_function_input_3690552380929428`) as unnest_wrapper(`passenger_split_cabin`)

),
mandrill as (
SELECT
    chimpanzee.`passenger_id` as `passenger_id`,
    chimpanzee.`passenger_split_cabin` as `passenger_split_cabin`,
    faithful.`_Trilogyt__created_at` as `_Trilogyt__created_at`
FROM
    rail as rail

LEFT OUTER JOIN chimpanzee on rail.`passenger_id` = chimpanzee.`passenger_id`

FULL JOIN faithful on 1=1

)
SELECT
    mandrill.`passenger_id`,
    mandrill.`passenger_split_cabin`,
    mandrill.`_Trilogyt__created_at`
FROM
    mandrill

Refactor with Ease

DBT workflows derived from Trilogy should be treated as another compilation target for Trilogy; they can be rebuilt on-demand as your semantic model changes to ensure that your entire graph is always using the most up to date sources.

Automatic Optimization

If you generate a large graph of Trilogy at once, TrilogyT will intelligently consolidate CTEs and other common subexpressions into their own derived models to minimize your total computations. These models will be stored in a unique 'cache' directory.