GlueSQL: SQL database engine as a library (github.com/gluesql)
117 points by fofoz on Oct 22, 2022 | 18 comments


I’ve been thinking about writing a code analysis and refactoring tool that would expose facts like AST node info, expression typings, etc. as queryable collections. To make this fast and interactive, you’d only compute the facts needed for a query and ideally maintain those incrementally as files change.

Rust used a Datalog library inside the borrow checker for this; I’d like to bring a similar model to more languages so that some kinds of refactors or lints written as queries could potentially be shared between languages - for example, a lint rule that prohibits optional arguments to functions.

I wonder if GlueSQL would work naturally as this kind of “OSQuery for source code”.
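
As a rough illustration of the "lint rule as a query" idea, here is a minimal sketch that swaps in Python's standard `ast` and `sqlite3` modules for GlueSQL. The table name `function_args` and the helper names are made up for this example. It stores one fact per function parameter, re-derives the facts for a single file when that file changes, and expresses the "no optional arguments" lint as a plain SQL query.

```python
import ast
import sqlite3


def create_schema(conn):
    # Hypothetical fact table: one row per function parameter.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS function_args ("
        "file TEXT, func TEXT, line INTEGER, arg TEXT, has_default INTEGER)"
    )


def load_function_facts(conn, filename, source):
    """Replace the facts for one file; call again whenever that file changes."""
    conn.execute("DELETE FROM function_args WHERE file = ?", (filename,))
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        positional = args.posonlyargs + args.args
        # Defaults align with the tail of the positional parameters.
        first_default = len(positional) - len(args.defaults)
        rows = [
            (filename, node.name, node.lineno, a.arg, int(i >= first_default))
            for i, a in enumerate(positional)
        ]
        # Keyword-only parameters carry None in kw_defaults when required.
        rows += [
            (filename, node.name, node.lineno, a.arg, int(d is not None))
            for a, d in zip(args.kwonlyargs, args.kw_defaults)
        ]
        conn.executemany("INSERT INTO function_args VALUES (?, ?, ?, ?, ?)", rows)


def functions_with_optional_args(conn):
    """The lint itself: every function that declares at least one default."""
    return conn.execute(
        "SELECT DISTINCT file, func, line FROM function_args "
        "WHERE has_default = 1 ORDER BY file, line"
    ).fetchall()
```

Because the query only touches the fact table, a front end for another language could fill the same table and reuse the lint unchanged. This sketch recomputes a whole file at a time, which is much coarser than the incremental maintenance described above.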


This idea sounds similar to Glean. I haven’t looked too much into it yet but tagged it for later.

https://glean.software/


CodeQL is exactly the query side of this - it builds a database of facts about your code, which you can query in its own language (which compiles to Datalog) in order to write lints, sometimes in a language-agnostic manner. It doesn't have anything for refactoring, however.


Another "database toolkit" project that I've recently learned about is Apache DataFusion, which is also written in Rust and uses the Arrow memory format:

https://github.com/apache/arrow-datafusion/blob/master/READM...


Ooo, this seems more explicitly supportive of my AST query idea


Also check out (if you haven’t already) the timely and differential dataflow crates, and what Readyset/Noria are doing with incremental compute. Lots of super cool stuff happening in the space.


Just another embedded SQL engine.

There are SQLite (OLTP), DuckDB (OLAP), and some engine-based projects like the already mentioned Apache Arrow (https://arrow.apache.org/) (OLAP). Apache Arrow has many language implementations; some do not include the query engine in their own repo (for example, the Rust implementation, which depends on DataFusion for more SQL-like analytics), but others do (for example, C++).

There is a comprehensive OLAP benchmark by ClickHouse that also includes several embedded engines: https://benchmark.clickhouse.com/

What's more interesting is that, in fact, we do not have an embedded HTAP engine. One of my database products already implements 3/4 of HTAP at the engine layer, but unfortunately it's still just free software, not an open source implementation.


>What's more interesting is that, in fact, we do not have an embedded HTAP engine.

I know it is not meant to be, but I have found DuckDB to be so fast at transactional queries that for many it would work well as an HTAP engine.


What's the selling point here?


I don't want to dump on someone's project, because we need new projects, and who knows what this will turn into, but...

I too am wondering "why would I use this and not SQLite?" or Firebird.

There may well be reasons, but I couldn't find them on the project front page. For all new projects, in situations like this, breaking into a very mature market, I would recommend that some sort of reason is posted on the front page.


"Because it's written in Rust" is a pretty good argument, but you won't be using it instead of SQLite because SQLite is incredibly functional and no new kid on the block can be functional fast enough to overcome that mind share for some time. In the end it will have to have a) close to parity with SQLite in functionality, and b) a better architecture. (b) seems unlikely, and (a) seems unlikely. Plus functionality is not enough -- a fantastic test suite is also needed, and here it's really hard to dethrone SQLite because its best test suite is proprietary. Perhaps provable correctness would be the thing.

> I don't want to dump on someone's project, ...

A lot of these projects are school projects, or personal projects. No need to dump on them. I think it's pretty cool that someone might tackle an RDBMS library, as long as they don't think it's a SQLite killer without understanding the tremendous need for funding that replacing SQLite would require.


>"Because it's written in Rust" is a pretty good argument

..no, it isn't?

That's like saying you would buy a specific car because the factory it was made in is better than other car factories.

It is an argument, yes. Not a pretty good argument unless there's something specific about this project that is better because of Rust.


> factory…

I agree with your point but the analogy is bad. I would definitely buy a car based on factory. Reliability, order time, parts availability, etc.


>based on factory. Reliability, order time, parts availability, etc.

Do you see the nuance here? The argument only makes sense when the details make sense.


Don't they though?


Same as for any other embedded database, though I'm afraid my risk tolerance isn't high enough to use one anywhere close to this new for anything I care about. I do think it would benefit significantly from embedded SQL syntax, though (maybe with a procedural macro?).


Note that persistent storage appears to rely on 'sled', and according to sled's GitHub README, sled's persistent storage format is going to change in the future, so you would need to manually migrate any databases.


Seeing the examples, I start to wonder if it would be possible to support real SQL syntax from within a Rust macro, one that maybe even has access to local variables like LINQ?



