Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Show HN: Ktx – Open-source executable context layer for data agents (github.com/kaelio)
93 points by lucamrtl 13 days ago | hide | past | favorite | 37 comments
Hi HN, we’re open-sourcing ktx. It’s an executable context layer that makes agents reliable on your data stack.

We built it after going through the experience of building production-grade data agents for dozens of companies. If you’ve also tried building them, or simply tried using Claude Code or Codex on your data warehouse, you’ll know that accuracy is the #1 issue. Agents are great at generating valid SQL, but it’s not always correct SQL.

To cite a few examples of “agents gone wrong”:

- Stale column + hidden business rule: when preparing a board report, a finance analyst asks Claude Code for “ARR by customer segment”, it derives ARR from multiple tables (subscriptions, plans, accounts), then groups by accounts.industry. But CC doesn’t know that this industry column was deprecated a few months prior, or that past board reports excluded paused subscriptions from the ARR calculation

- Join fanout: a data analyst at a retailer uses their company’s internal agent to prep a product revenue deck for a QBR. The agent joins orders to order_items, then sums orders.total_amount_cents grouped by order_items.product_id. The SQL runs fine, but each order’s revenue is repeated once per line item, which most people will miss if most orders only have 1 item

- Missing attribution logic: a marketing analyst asks Codex “Which campaigns drove the most revenue?” Codex joins marketing_touches to users to orders and groups by utm_campaign. But since each order can have multiple touches before purchase, the same order can be credited to first touch, last touch, every touch, or every campaign the user clicked before buying. If the agent chooses the method that doesn’t match the team’s attribution logic, they’ll make suboptimal decisions

To solve this at first we gave the agent more context through skills + a wiki-style knowledge base. That gives it some useful extra context but still relies on it writing the SQL without incorrect assumptions.

The next solution we explored was implementing a classic semantic layer. That solves the executable part, but they’re such a pain to build and maintain since they were made for legacy BI tools. Plus as a standalone tool, they lack all the useful context from unstructured data sources like internal docs.

So we built ktx and split it into 2 parts:

1. Business context goes in Markdown wiki pages that are auto-ingested and auto-populated

2. Queryable definitions go into YAML files that define tables, row grain, joins, measures, dimensions, filters, and filter groups

That way, when an agent needs a metric, it asks ktx for a measure, dimensions, filters, and filter groups instead of writing the whole query itself. ktx’s planner chooses the join path, uses grain and relationship metadata, catches issues like join fanout and chasm joins, and compiles the warehouse SQL, while utilizing the extra unstructured knowledge it has access to.

ktx is Apache 2.0. It can ingest from most warehouses (BigQuery, Snowflake, Postgres & others), modeling tools (dbt, MetricFlow, LookML), BI tools (Looker, Metabase), doc tools like Notion, and corrections from user interactions.

Install manually:

npm install -g @kaelio/ktx

ktx setup

Or give this prompt to your agent:

Run npx skills add Kaelio/ktx --skill ktx and use ktx skill to install and configure ktx

We’d especially like feedback from people who’ve tried using Claude Code, Codex, or building custom agents on analytics warehouses. Where did they fail? And what did you try to make the answers more reliable?

 help



Sounds cool, I want to try this kind of things out, but do you have or planned to have a sandboxing environment, where the agent can try running the query in let say duckdb first to confirm its validity/result before sending it over to bq? Or use something like tablesample when developing the query to reduce cost? One more thing, how do you compare with nao ( https://github.com/getnao/nao ), it's something I've followed for a while and seem to answer similar issue as what ktx build

Thanks! ktx maintainer here, related to the queries sandbox question, we're attacking this problem from a different angle: instead of catching bad queries at runtime we try to make it impossible to generate a bad query. ktx allows agents to generate semantic queries (declarative json) and takes care of translating it to the correct SQL with the help of the join graph that it builds during the data source ingestion (dijkstra path finding + fan-out handling etc.). So the agent doesn't need to think "how" it needs to query the data, it just says "what" it wants and ktx builds queries. Also the join graph allows you combining raw tables and high level ktx metrics.

On nao: it's solving an adjacent but different problem, it's a full analytics app, whereas we believe that the general purpose agents (e.g. Claude Desktop) are getting better at the representation layer generating widgets and that's where people work anyway. What these agents need is a solid foundation - this is what we're focusing on. ktx isn't an agent or UI, it's an executable institutional knowledge that keeps any analyst (nao, Claude, Codex, your own agent) from guessing which table is canonical or which join is safe.


Just to complete the answer if that's where your question was going - we actually built a prepackaged project that users can use to experiment with ktx before running any kind of ingestion. In our docs we also have a link to a demo postgres, dbt, metabase, and notion that users can access to try out ktx

And better comparisons from the semantic layer space are tools like Wren, Cube, dbt MetricFlow


This is really interesting! Seems on a similar ballpark as graphenedata. How do they compare?

Thanks for the comment, good question - there's some overlap. Graphene does the full query -> viz loop, and the model has to be handwritten in their GSQL. It's kinda like a code-first BI tool. Whereas ktx is building and managing the context (interface between BI stack and agents), so doing the modeling part of the job and focusing on agent context rather than viz. Hope that's more or less clear

So, and I may be oversimplifying, you are creating awesome documents and references that I would have loved to have for my different jobs 5-10 years ago (or more).

It’s just that making such docs had next to no ROI 10 years ago. But today they are the difference between success and failure.

It’s fascinating - thank you

(and who writes the wiki / business rules ? Can they be reverse engineered from existing query stack? )

Sounds great - all the best

Edit: don’t take the above as criticism - just trying to fit new ideas into an old dog.


Thanks! yes, exactly. But what you described is just half of what ktx does, we call it "wiki" - self organizing collection of ingested markdown files.

The second (equally important) part of ktx is similarly self-organizing executable semantic layer.

It allows agents to send simple declarative "I want X sliced by Y" requests instead of imperative "X sliced by Y that has to be queried this way and joined the other way ..."

To answer your questions:

> who writes the wiki / business rules most is create after `ktx ingest` pulls raw data from your data stack, it can be edited/created manually, but it's rarely necessary. Also if you plug ktx to your agent it allows that agent to send extra memories for ingestion to keep the context up to date

> Can they be reverse engineered from existing query stack Yes, that's the case when we analyze historic SQL queries or scan Looker or Metabase dashboards for example


Just to clarify what semantic layer is: you can think of it as a set of functions

So you define the definitions of metrics like "monthly revenue" once, and that definition is runnable. If an agent asks for "monthly revenue by region", the semantic layer will compile that request into SQL and run it (the same way everytime so that calculation method is consistent)


Thank you … one can see how “semantic layer for whole enterprise” is a PowerPoint slide that can make McKinsey drool

Hopefully more agents and fewer PowerPoints and McKinsey folks in the future :)

How does this compare with Wren 2.0, OpenVikings etc

Thanks for the question! Just to complete - OpenVikings is context/memory infrastructure for agents. ktx is analytics context for data agents specifically. So they're in different categories

The way we think about the space: there's a "semantic layer" side (Cube, dbt MetricFlow, Wren's engine) that compiles correct SQL but is hand-authored. And a "company brain" side (OpenViking, Glean, wikis) that indexes prose but can't query data warehouses safely. ktx is built to be both halves at once - a YAML semantic layer and a searchable wiki of business definitions, cross-linked (each wiki page references the metrics it explains) and auto-maintained


Hey, ktx maintainer here! Great question, we actually have quite a bit in common with Wren, they're in the same category. The main difference is that with Wren you author the semantic model (write MDL) and it executes SQL through its own Rust engine (similar to Cube as well). ktx instead builds the context for you - ingesting definitions already in dbt, Looker, Metabase, and Notion, auto-detecting joins and fan/chasm traps, flagging contradictions. It also builds both : the SL and wiki that accumulates and maintains free-form tribal knowledge (that’s also interlinked with ktx SL)

This sounds like it might be exactly what I need!

Does `ktx setup` need Claude specifically?

> LLM - picks a Claude backend. The default uses your local Claude Code session, so no API key is required. You can also use an Anthropic API key or Vertex AI.

I'm currently on Copilot at work


Next questions...

I'm skim reading the docs but I just want to know a bit more about the architecture

Does it produce artefacts that I can commit to version control and share with my team? Is it a tool that everyone runs locally in the project? Or there's a component (the wiki?) that I should deploy as an internal service?


When you install ktx, you'll have to initialize the project directory and ktx initializes a git repo there to assure the version controls.

ktx project directory is self-contained. The main 2 parts inside are :

- wiki: a collection of .md files

- semantic-layer: a collection of .yaml files

typically all these files are created/edited automatically during the ingestion, but you can also edit them by hand or even sync with a remote git.

ktx treats these files as sources and builds internal indexes in a sqlite db, located in projectDir/.ktx

The main way to use ktx is to start an mcp server by calling `ktx mcp start` it'll start an http server and multiple people will be able to connect it to their agents.

Hope this helps, happy to answer any other questions!


Sounds like it produces yaml and markdown files that I could commit as project documentation/config?

yes, exactly

We currently support anthropic models for the setup (whether through claude pro/max plan) or through API. Adding support for openai API / codex should be pretty straightforward - would love to get you in the community slack to get more details on your copilot setup

Quick update - we just released support for codex as LLM backend :)

Here's the community https://ktx.sh/slack

How are you measuring the accuracy? Are you running this against any benchmarks?

I see this covers a file based approach, was there ever a consideration for a graph based approach?

For business context, how do you handle context that evolves over time?


Great questions! we're currently working on Spider 2 submission, hope to have first results soon. It's true that we took file-first approach and not a graph DB. Main reason is that the ktx wiki and semantic layer entities while being written in plain text files (md or yaml) still contain links to each other. This allows an agent to find the right entry point (with the help of lexical and semantic searches merged with RRF) and then traverse these links to collect enough context.

As for the business context evolution - that's exactly the reason we have ingestion reconciliation and git versioning. The idea is to give ingestion agent a way to deduplicate/consolidate knowledge during the ingestion and leave complex conflicts to humans to resolve


And development was done around link detection and text to sql benchmarks to measure/compare different approaches

yep, happy to share more details with you @modus-tollens if you're interested

just tried from claude code.

some observations: setup (without openai embeddings) was pretty easy. Nice! Not sure why it took ~19 minutes for a postgres database with 20sh tables. Also the data from pg_desc did not get into the wiki

I am going to try to embed it in our agentic flow and see how it fits. Very cool project though.


Hi, thanks for trying it! 19m for ~20 tables is slower that I'd expect, I assume you activated historic queries, right? As Luca said we're focusing on performance improvements right now. 99% of the latency is LLM calls, so we're optimizing them.

As for the descriptions: `pg_description` should be used already, we store raw pg descriptions in the semantic-layer/<connection>/_schema/*.yaml files under descriptions.db. there's also an AI generated description next to it. AI is the combination of the raw DB description + bunch of metadata we manage to collect about a table/column.

could you run `ktx sl` and see if the descriptions appear in the CLI ?

also feel free to join our community Slack - ktx.sh/slack We'll be happy to help with other questions


Thanks for the feedback - the length is typically because we’re also processing query history to infer acceptable joins, etc. But we have a lot of optimizations to implement, so we’re positive this will run faster in the coming weeks. Looking into the pg_desc issue now, thanks for reporting!

Thank you both. I will do more digging on pg_description as you suggested. Also looking forward to optimizations. I was actually working on the similar project on the semantic data layer for llm context, but I think your project already solves this issue. Will report back as I try to integrate ktx more in our agentic flows.

Nice! Feel free to join our slack community btw: ktx.sh/slack

we always like chatting with folks that have tried building similar projects in the past


Cool. Just joined.

cool idea. but for ex over time if users update db schemas, agents run old queries and got errors, is there a mechanism in your lib for agents to examine and self-correct? also is there an auto memory compression the moment the context becomes too big?

good point! yes, there's a feedback mcp tool that agents are instructed to use when they want to save/update ktx "memories". This is a way to keep the wiki+semantic layer up to date beyond the initial ingestion. Plus every time there's a new ingestion the reconciliation mechanism will gather potentially drifting context in one place.

We don't have a manual compression yet, it's part of our other product that ktx used to be a part of before it became open-source. But we decided to move that feature to OSS too, so expect it to come out in the next releases !


cool

Thanks!

Interesting approach. Context management for agents is an underexplored area. I've been looking at similar problems - the key challenge is keeping token usage low while giving the agent enough context to be useful. Tiered retrieval (facts first, full text only whenneeded) seems to work well in practice.

Agreed, in addition to this I'd say it's better to spend tokens once while ingesting the sources and storing the canonical reusable definition rather than having agents do the full exploration over and over again



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: