Trilogy NLP
Reliable SQL for Humans AND Machines
Trilogy-NLP aims to be a highly reliable platform for natural language interrogation of structured database contents. It does this by mapping natural language queries to a well-defined but expressive middle layer (Trilogy!), which then executes the SQL to retrieve results. We also aim to generate queries quickly without direct database access.
Principles:
- High precision
- No special syntax - agents should work with the same inputs humans would
- Performant - magic is more magic when it's fast
- One-shot - iterative cycles shouldn't be a requirement to get a good answer
WARNING
GenAI is a developing space and perfection is rarely guaranteed. We're short of where we want to be on some of the above principles - especially speed! Help us get there. But in the meantime, try prompts and have fun!
Installation
Trilogy NLP is available as a Python package:
pip install pytrilogy-nlp
Quickstart
from trilogy_public_models import get_executor
from trilogy_nlp import NLPEngine, Provider, CacheType
# we use this to run queries
# get a Trilogy executor preloaded with the tpc_ds schema in duckdb
# Executors run queries against a model using an engine
executor = get_executor("duckdb.tpc_ds")
# create an NLP engine
# we use this to generate queries against the model
engine = NLPEngine(
    provider=Provider.OPENAI,
    model="gpt-4o-mini",
    cache=CacheType.SQLLITE,
    cache_kwargs={"database_path": ".demo.db"},
)
# We can pass the executor to the engine
# to run a query directly
results = engine.run_query(
    "What was the store sales for the first 5 days of January 2000 for customers in CA?",
    executor=executor,
)
for row in results:
    print(row)
Details
Trilogy-NLP is a Trilogy integration that adds natural language processing to Trilogy, enabling users to write queries in natural language and have them translated to SQL via the Trilogy semantic model. The simplified abstraction of a Trilogy model is a natural fit for generative AI models: the model only has to interpret the user's question and map it to the higher-level semantic layer, which sidesteps many potential hallucination and SQL correctness problems.
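To make the idea of mapping to a semantic layer concrete, here is a minimal toy sketch. It does not use Trilogy or Trilogy-NLP: the `CONCEPTS` table, `QueryRequest`, `validate`, and `render_select` are all hypothetical names invented for illustration. It only shows why constraining a language model to pick from known concepts makes invented columns easy to detect before any SQL is produced.

```python
from dataclasses import dataclass

# Hypothetical toy semantic model: concept name -> SQL expression.
# These names are invented for illustration; they are not Trilogy's real model.
CONCEPTS = {
    "store_sales.date": "ss.sold_date",
    "store_sales.total": "SUM(ss.sales_price)",
    "customer.state": "c.state",
}


@dataclass
class QueryRequest:
    """Structured output a language model would be asked to produce."""

    select: list[str]
    filters: dict[str, str]


def validate(request: QueryRequest) -> list[str]:
    """Return every referenced concept name that the model does not define."""
    referenced = list(request.select) + list(request.filters)
    return [name for name in referenced if name not in CONCEPTS]


def render_select(request: QueryRequest) -> str:
    """Render only the SELECT clause, refusing requests with unknown concepts."""
    unknown = validate(request)
    if unknown:
        raise ValueError(f"unknown concepts: {unknown}")
    columns = [f'{CONCEPTS[name]} AS "{name}"' for name in request.select]
    return "SELECT " + ", ".join(columns)


request = QueryRequest(
    select=["customer.state", "store_sales.total"],
    filters={"customer.state": "CA"},
)
print(render_select(request))
```

A request that names a concept outside the table, such as a made-up `store_sales.profit_margin`, raises a `ValueError` instead of producing SQL against a column that does not exist. In this sketch the SQL expressions live only in the model definition, so the language model never writes SQL itself.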
Trilogy-NLP can be used in an interactive analytics context, as a helper for users writing queries, or as a backend for a chatbot or other natural language interface to a database.
Run through some examples and then try out freeform queries against the tpc-ds dataset below.
TIP
As with Trilogy, you can view the generated SQL query, and you can also generate Trilogy code directly. Trilogy-NLP is often a good place to start a query that can then be further refined.
Examples
California Sales
TIP
This backend is not resourced/optimized for performance - it may take a bit.