Thesis
Thesis
What Trilogy Aims For
Trilogy is an evolution of SQL—keeping what’s great, fixing what’s not, and modernizing it for the way analysts actually work.
It’s designed to:
- Simplify the hardest parts of SQL
- Preserve the familiarity and power of SQL
- Encourage best practices by design
- Add modern ergonomics: reuse, type-checking, and composability
You should be able to query with just SQL knowledge and model comfortably if you know SQL or Python.
To support the rapid, iterative workflows that SQL encourages, Trilogy includes a lightweight, built-in definition language—so you can define, evolve, and query your semantic layer inline, in the same session.
This language supports:
- Static typing
- Automatic join inference (1-1, 1-many, many-1)
- Grouping and filtering by grain
- Null and aggregation safety
What It Looks Like
Trilogy looks like SQL, but without the boilerplate.
WHERE
SELECT
#FROM - moved to semantic layer
#GROUP BY - resolved automatically
HAVING
ORDER_BY
Tips
SQL is declarative; Trilogy queries should be even more declarative.
Deeper Example
-- models can be imported and reused
import store_sales as store_sales;
-- where clause idiomatically comes first, but can be run later for familiarity
WHERE
store_sales.date.year=2001
and store_sales.date.month_of_year=1
and store_sales.item.current_price > 1.2 * avg(store_sales.item.current_price) by store_sales.item.category
-- select looks normal
SELECT
store_sales.customer.state,
count(store_sales.customer.id) as customer_count
-- having filters the output of select
HAVING
customer_count>10
-- order by is normal
ORDER BY
customer_count asc nulls first,
store_sales.customer.state asc nulls first
LIMIT 100;
Vs SQL
SELECT
a.ca_state state,
count(*) cnt
FROM customer_address a ,
INNER JOIN customer c on a.ca_address_sk = c.c_current_addr_sk
INNER JOIN store_sales s on c.c_customer_sk = s.ss_customer_sk
INNER JOIN date_dim d on s.ss_sold_date_sk = d.d_date_sk
INNER JOIN item i on s.ss_item_sk = i.i_item_sk
WHERE
AND d.d_month_seq =
(SELECT DISTINCT (d_month_seq)
FROM date_dim
WHERE d_year = 2001
AND d_moy = 1 )
AND i.i_current_price > 1.2 *
(SELECT avg(j.i_current_price)
FROM item j
WHERE j.i_category = i.i_category)
GROUP BY
a.ca_state
HAVING count(*) >= 10
ORDER BY
cnt NULLS FIRST,
a.ca_state NULLS FIRST
LIMIT 100;
What's Different?
- No raw tables or join conditions
- No nested subqueries
- No GROUP BY clutter
The complexity is abstracted into reusable models. Once defined, they simplify every query that uses them.
Why does this matter?
We're not alone in thinking SQL can be better. Many tools try:
Related Projects
Trilogy stands apart by staying close to SQL—keeping it familiar while giving it structure and safety.
Most of the value of a SQL query is in the transformations and selections. The joins and GROUP BY are where bugs and complexity creep in.
Moreover, traditional table definitions often lack crucial metadata: keys, grain, nullability. Trilogy moves this into a first-class semantic layer, making it possible to automate what used to be error-prone.
Invest a little more in modeling. Save a lot on every query after.
Tips
A well-modeled dataset pays dividends across every dashboard, report, and analysis. Trilogy makes this modeling practical and shareable.
Why don't we all just use SQL, again?
SQL is fantastic.
SQL has been the de-facto language for working with data for decades. Data professionals can use a common, declarative syntax to interact with anything from local file based databases to global distributed compute clusters.
But SQL solves the wrong problem for the modern data stack.
SQL is a declarative language for reading and manipulating data in tables in SQL databases. This is a perfect fit for an application interacting with a datastore.
But in data warehouses, a table is a leaky abstraction. Users don't care about tables; tables are a means to an end. They want the data, and the table is a detail. Tables will be replicated; aggregated; cached - and the user spends all their time on the container, not the product.
Tips
SQL is a language for getting data out of tables in a databaase. Trilogy is a language for getting/transforming data in a [warehouse/lake/mart], with all the evolution, deprecation, and change that implies. The tables will change, but your query doesn't need to.
Example: Putting the Answer First
Say you want:
Total revenue by product line and city in 2025, as a fraction of all city revenue
WHERE year = 2025
SELECT
product_line,
sum(revenue) as product_revenue,
sum(revenue) by city_name as all_city_revenue,
product_revenue / all_city_revenue as revenue_ratio,
city_name;
But this is what you have to write
WITH tmpCity AS (
SELECT
city_name,
sum(revenue) as total_city
FROM fact_revenue_latest
JOIN dim_city ON ...
WHERE dim_date.year = 2025
GROUP BY city_name
)
SELECT
product_line,
sum(revenue),
total_city,
sum(revenue) / total_city,
city_name
FROM ...
JOIN ...
GROUP BY product_line, total_city, city_name
Which one looks more like what you're actually trying to say?
Fixing SQL Sprawl
As data teams grow, SQL sprawl becomes inevitable: Brittle, copy-pasted queries with unclear logic—costly to maintain and reverse engineer.
Fortune 500 companies burn millions trying to extract lineage and intent from raw SQL. Trilogy Solves This By:
- Separating business logic (the "what") from implementation (the "how")
- Allowing static analysis and testing of business rules
- Supporting incremental modeling and query iteration in the same language
How does Trilogy work?
Trilogy starts with a small investment: defining your semantic layer. Once done, every query becomes simpler.
Unlike most semantic layers, Trilogy doesn't require a separate modeling step. You can define and extend models inline—supporting exploration and iteration.
You don’t need a separate file or workflow to get started.
You can model and query in one place.
Read more in the concepts and references section to learn how Trilogy works under the hood, and the nuances of query design and setup.
Usage
You can: