> Could it be viable to have one or multiple kuzu databases per user? What’s the story like for backups with kuzu?
You can have multiple databases, but you can only connect to one at a time for now. We don't support backups yet, but we'd like to hear more about your specific use cases.
It would be great if you could join our Discord (https://kuzudb.com/chat) or contact us at contact@kuzudb.com, and we can chat more there.
KùzuDB[1] is an in-process graph database built from scratch, and it also came out of academia. We are from the Data Systems Group at the University of Waterloo. The project started in Sep 2020, and a small team is actively working on it now.
These two posts[2,3] explain where we are from and where we're going, if anyone is interested.
On scalability, we can scale to several hundred GB, and we routinely test on LDBC at up to 300 GB. Our goal is to support efficient querying over data at TB scale.
Right now, we only support CSV import. We are currently working on integrating Arrow, and aim to support more data formats through it. Hopefully that will let us support Parquet, JSON, etc.
Built-in graph algorithms are coming along, but step by step. We are focusing on shortest path queries for now.
As always, any suggestions and discussions on these are welcome.
Thanks for your comment. We don't want to only base our research on Kùzu; instead, we are focused on implementing Kùzu seriously and supporting actual users. So expect a few, but not many, papers in the coming years.
Also, I'm not sure what techniques you had in mind, but our position is that graph DBMSs should be built on relational principles and state-of-the-art analytical data management techniques (e.g., that's why Kùzu is a columnar system). On top of that, we have many new techniques (e.g., factorization, new join algorithms, new storage designs) that are all optimized for graph data with a lot of many-to-many connections between nodes/entities. These techniques are optimized for finding patterns over such data. We wrote about prototype implementations of these techniques in many previous research papers, and now we are focusing on implementing them very seriously in Kùzu.
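To give a rough feel for why factorization helps with many-to-many joins, here is a minimal Python sketch. It is an illustration only, not Kùzu's implementation; the function names (`two_hop_flat`, `two_hop_factorized`, `count_factorized`, `expand`) and the toy edge list are made up for this example. For a 2-hop pattern a -> b -> c, the flat result repeats b once per (a, c) pair, while the factorized form keeps the predecessors and successors of each b as an unexpanded Cartesian product.

```python
from collections import defaultdict
from itertools import product


def two_hop_flat(edges):
    """Flat result: one tuple (a, b, c) for every path a -> b -> c."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    return [(a, b, c) for a, b in edges for c in out.get(b, [])]


def two_hop_factorized(edges):
    """Factorized result: for each middle node b, keep (preds, b, succs)
    as an unexpanded Cartesian product instead of len(preds) * len(succs) tuples."""
    preds, succs = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
    # Only nodes with both an incoming and an outgoing edge can be a middle node.
    return [(preds[b], b, succs[b]) for b in preds if b in succs]


def count_factorized(groups):
    """Count paths without enumerating them: multiply the sizes of each product."""
    return sum(len(p) * len(s) for p, _, s in groups)


def expand(groups):
    """Expand the factorized form back into flat tuples when they are needed."""
    return [(a, b, c) for p, b, s in groups for a, c in product(p, s)]


if __name__ == "__main__":
    # Hypothetical toy graph: nodes 1 and 2 point to 3; 3 points to 4, 5, 6.
    edges = [(1, 3), (2, 3), (3, 4), (3, 5), (3, 6)]
    groups = two_hop_factorized(edges)
    print(len(two_hop_flat(edges)), "flat tuples vs", len(groups), "factorized group(s)")
    print("path count from factorized form:", count_factorized(groups))
```

With k predecessors and m successors per middle node, the flat form holds k * m tuples while the factorized form holds k + m values, which is where the savings on highly connected data come from.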
Hope this clarifies things a bit. You're welcome to share more of your opinions in more detail.
Thanks for the reply. I don't see any operators in the codebase for BFS, DFS, APSP, etc. Shouldn't your graph querying be built around these fundamental operators?
No, not at all! It is a big misunderstanding that implementing a high-level graph DBMS query language requires BFS/DFS-type "traversals", which is just another term for joins of node records with each other. Systems that adopt these "traversal" algorithms to do joins end up committing to a specific type of join (what an RDBMS would call an index nested loop join), and that's usually not very efficient (no matter what those systems might claim). Instead, it's better to accept that these are simply joins and use relational join operators that are optimized for many-to-many joins and cyclic joins (e.g., if you are searching for triangles). That's what we do in Kùzu. You can read about some of these join algorithms in our CIDR paper (https://cs.uwaterloo.ca/~ssalihog/papers/kuzu-tr.pdf) and some earlier papers (http://www.vldb.org/pvldb/vol12/p1692-mhedhbi.pdf). We have always explained what we do in terms of fast joins, and we strongly believe that's the right thing to do for evaluating graph patterns in a language like Cypher! That said, when we move to other computations, e.g., shortest paths, we will be using more specialized graph algorithms as operators as well.
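As a small illustration of the "patterns are joins" point, the Python sketch below finds directed triangles a -> b -> c -> a in two ways. It is a toy under stated assumptions, not the algorithm from the linked papers or Kùzu's code; all function names and the sample edge list are hypothetical. The first version extends one edge at a time through an adjacency index, like a traversal or index nested loop join. The second treats the closing step as a multiway join and intersects two adjacency sets at once.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Index an edge list as src -> set of dst."""
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
    return adj


def triangles_nested_loop(edges):
    """Traversal-style plan: follow a -> b, then b -> c, then probe for c -> a."""
    edges = list(edges)
    fwd = build_adjacency(edges)
    result = set()
    for a, b in edges:
        for c in fwd.get(b, ()):
            if a in fwd.get(c, ()):
                result.add((a, b, c))
    return result


def triangles_intersection(edges):
    """Multiway-join-style plan: for each edge a -> b, intersect the successors
    of b with the predecessors of a, so only values of c that close the cycle
    are ever produced."""
    edges = list(edges)
    fwd = build_adjacency(edges)
    bwd = build_adjacency((dst, src) for src, dst in edges)
    result = set()
    for a, b in edges:
        for c in fwd.get(b, set()) & bwd.get(a, set()):
            result.add((a, b, c))
    return result


if __name__ == "__main__":
    # Hypothetical toy graph with one directed cycle 1 -> 2 -> 3 -> 1.
    edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
    print(sorted(triangles_nested_loop(edges)))
    print(sorted(triangles_intersection(edges)))
```

Both plans return the same triangles (each one reported once per rotation). The difference is in intermediate work: the first enumerates every 2-path before checking the closing edge, while the second never materializes 2-paths that cannot close.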
Interesting. Part of my pessimism stems from seeing bad graph engines over the decades, but perhaps you are here to change exactly that. I will keep track of the latest developments in your git repo. I wish you the very best!
Thanks!
Yes, we really want to change that situation and think we have some good ideas. If you've used graph databases, we would love to hear about and learn from your use cases too. You can reach us at contact@kuzudb.com.
> would it be possible to use Kuzu to query data stored on sqlite?
Yes, we have a SQLite extension (https://docs.kuzudb.com/extensions/attach/rdbms/) that can read data from SQLite databases.