TrilogyT (Transform)

Trilogy and ETL - A Modern Match

TrilogyT adds in additional features to Trilogy to enable it to serve as an ETL tool in a modern data stack.

Tips

Transform is under active development and is less mature/suitable for use than the other trilogy packages.

Installation

TrilogyT is available as a python package.

pip install pytrilogyt

Features

Given a set of output queries, TrilogyT can materialize intermediate aggregations/joins for reuse. The long-term goal here is to enable consumption definitions to happen directly without thinking about performance, and then dynamically materialize the intermediate ETL graph. This will produce dynamic aggregates - such as those used by AtScale or PDTs in Looker - but with significantly higher accuracy and less effort due to the additional metadata the language provides.

Orchestration CodeGen

Dagster

Docs coming soon!

DBT

If we take the dimensional model query defined in the last demo, written in Trilogy, we can use TrilogyT to generate the required DBT model.

Input Trilogy

property passenger.id.split_cabin <- unnest(split(passenger.cabin, ' '));
persist cabin_info into dim_cabins from 
select 
    passenger.id, 
    passenger.split_cabin;

Output SQL

-- Generated from Trilogy source: None
--Do not edit manually
{{ config(materialized='table') }}

WITH 
faithful as (
SELECT
    current_datetime() as `_Trilogyt__created_at`

),
rail as (
SELECT
    local_raw_data.`passengerid` as `passenger_id`,
    local_raw_data.`cabin` as `passenger_cabin`
FROM
    raw_titanic as local_raw_data
),
stork as (
SELECT
    split(rail.`passenger_cabin`, ' ') as `_anon_function_input_3690552380929428`,
    rail.`passenger_id` as `passenger_id`
FROM
    rail as rail
),
chimpanzee as (
SELECT
    stork.`passenger_id` as `passenger_id`,
    `passenger_split_cabin`
FROM
    stork as stork

FULL JOIN unnest(stork.`_anon_function_input_3690552380929428`) as unnest_wrapper(`passenger_split_cabin`)

),
mandrill as (
SELECT
    chimpanzee.`passenger_id` as `passenger_id`,
    chimpanzee.`passenger_split_cabin` as `passenger_split_cabin`,
    faithful.`_Trilogyt__created_at` as `_Trilogyt__created_at`
FROM
    rail as rail

LEFT OUTER JOIN chimpanzee on rail.`passenger_id` = chimpanzee.`passenger_id`

FULL JOIN faithful on 1=1

)
SELECT
    mandrill.`passenger_id`,
    mandrill.`passenger_split_cabin`,
    mandrill.`_Trilogyt__created_at`
FROM
    mandrill

Refactor with Ease

DBT workflows derived from Trilogy should be treated as another compilation target for Trilogy; they can be rebuilt on-demand as your semantic model changes to ensure that your entire graph is always using the most up to date sources.