zX41ZdbW's comments

Though ClickHouse is not limited to a single machine or local data processing. It's a full-featured distributed database.

Another alternative is Exasol, which is many times (>10x) faster than ClickHouse and scales much better for complex analytics workloads that join data. There is a free edition for personal use, with no data limit, that can run on any number of cluster nodes.

If you just want to read and analyze single-table data, then ClickHouse or DuckDB are perfect.

Disclaimer: I work at Exasol


When a single server is not enough, you deploy ClickHouse on a cluster, up to thousands of machines, e.g., https://clickhouse.com/blog/how-clickhouse-powers-ahrefs-the...

This is good news.

I have been trying to add Exasol to ClickBench (https://github.com/ClickHouse/ClickBench/) since 2016, but it was not possible due to the limitations and the fact that it required a custom virtual machine image.

Now we should try it again...


This is bad news. It is not usable:

> 5.3 Licensee may not disclose any benchmarking or results of evaluating the Software without Exasol's prior written consent


This seems like a leftover from the old enterprise licenses. I will see if we can get that changed.

We'll be happy to be part of ClickBench. Reach out to me and we can work together to make it happen.


I would also like to have something like this, but for "vintage" links - something that looks like it was from the late 90s.

I use them in tests, just for fun: https://github.com/ClickHouse/ClickHouse/blob/master/tests/q...


There was a "shadyurl". The site itself seems to be long gone, but this'll give you some context: https://www.mikelacher.com/work/shady-url/

There's an example shadyurl link in here: https://news.ycombinator.com/item?id=14628529

Funnily enough the domains appear to have been bought up and are now genuinely shady.


GPU databases can run a small subset of production workloads in a narrow combination of conditions.

There are plenty of GPU databases out there: MapD/OmniSci/HeavyDB, AresDB, BlazingSQL, Kinetica, BrytlytDB, SQream, Alenka, ... Some of them are very niche, and the others are not even usable.


The query tab looks quite complex with all these content shards: https://hackerbook.dosaygo.com/?view=query

I have a much simpler database: https://play.clickhouse.com/play?user=play#U0VMRUNUIHRpbWUsI...


Does your database also run offline/locally in the browser? That seems to be the reason for the large number of shards.


You can run it locally, but it is a client-server architecture, which means that something has to run behind the browser.


The test does not look realistic: https://github.com/Cranot/grouped-simd-hashtable/blob/master...

Better to use a few distributions of keys from production-like datasets, e.g., from ClickBench. Most of them will be Zipfian and also have different temporal locality.
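As a rough illustration of why the key distribution matters, here is a minimal Python sketch that generates a Zipfian key stream next to a uniform one and compares how much of the traffic the hottest keys receive. The function names, the exponent `s=1.0`, and the key counts are assumptions made up for this example; it does not model temporal locality, and it is not a substitute for keys taken from a real dataset such as ClickBench.

```python
import random
from collections import Counter
from itertools import accumulate


def zipfian_keys(n_keys, n_samples, s=1.0, seed=0):
    """Draw n_samples keys from 0..n_keys-1 where the key of rank k
    (1-based) is picked with probability proportional to 1 / k**s."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** s) for rank in range(1, n_keys + 1)]
    cum_weights = list(accumulate(weights))
    return rng.choices(range(n_keys), cum_weights=cum_weights, k=n_samples)


def uniform_keys(n_keys, n_samples, seed=0):
    """Draw n_samples keys uniformly from 0..n_keys-1."""
    rng = random.Random(seed)
    return [rng.randrange(n_keys) for _ in range(n_samples)]


def top_share(keys, top=10):
    """Fraction of all lookups that hit the `top` most frequent keys."""
    counts = Counter(keys)
    return sum(c for _, c in counts.most_common(top)) / len(keys)


if __name__ == "__main__":
    zipf = zipfian_keys(10_000, 50_000)
    uni = uniform_keys(10_000, 50_000)
    print("zipfian top-10 share:", round(top_share(zipf), 3))
    print("uniform top-10 share:", round(top_share(uni), 3))
```

With a skewed stream, a handful of keys dominate the lookups, so cache behavior and probe lengths can look very different from a uniform test.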


Here is my list of good technology blogs: https://clickhouse.com/blog/tech-blogs


There are many more exposed MySQLs than MongoDBs:

https://www.shodan.io/search?query=mongodb
https://www.shodan.io/search?query=mysql
https://www.shodan.io/search?query=postgresql

But this must be proportional to the overall popularity.


ClickHouse can do it. Examples:

    https://play.clickhouse.com/

    clickhouse-client --host play.clickhouse.com --user play --secure

    ssh play.clickhouse.com


Yes, but CH is not SQL.


Yes, SQL is a query language and ClickHouse is a database that uses SQL as a query language, but I don't see why that's relevant.

