I don't understand the architecture section. The title is "layered architecture," but then it talks about Ports/Adapters, which would be hexagonal architecture?
I was about to leave a very witty "just be idempotent ;)" response, but then I considered the nonce. I'd be surprised if Google is quick to change this, so I guess be stateful on the receiving server: persist that you've already handled a given request, and if you get a duplicate, replay the response from the first one?
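A minimal sketch of that receiving-server side, with an in-memory dict standing in for whatever you'd actually persist to (the class and names are made up):

```python
# Hypothetical sketch: dedupe incoming requests by nonce and replay the
# first response. A real server would persist `seen` in a database so it
# survives restarts; a dict keeps the example self-contained.
class IdempotentReceiver:
    def __init__(self, handler):
        self.handler = handler  # the actual request handler
        self.seen = {}          # nonce -> response from the first delivery

    def handle(self, nonce, payload):
        if nonce in self.seen:          # duplicate delivery: replay, don't re-run
            return self.seen[nonce]
        response = self.handler(payload)
        self.seen[nonce] = response     # record before acknowledging
        return response
```

The key property: the handler runs at most once per nonce, no matter how many times the request is delivered.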
Well, but it is not a human. It's a talking computer, and trying to make it sound like it isn't a computer is disingenuous, creepy, and heavily misleading. The movie "Her" made a good point about that.
I actually think the movie "Her" showed us the opposite - people feel comfortable with their computer sounding like a human. Almost every character in the film finds it near natural.
At this point, I cannot take these kinds of safety press releases seriously anymore. None of those models pose any serious risk, and it seems like we're still pretty far away from models that WOULD pose a risk.
After having used Datadog for several years, going back to Grafana / Loki / Prometheus felt like regressing by two decades. As much as I appreciate free solutions, I feel like Grafana has really fallen behind when it comes to developer experience.
Grafana Cloud is better for querying logs, and probably a bit better for querying metrics. It is terrible at finding traces, or even loading them; Datadog is light-years ahead there. For alerting, I feel Datadog has better features, but it is overwhelming with all the different options.
Grafana is very quirky when searching for traces, and it has a huge learning curve.
Could you provide more details? Although I've never had the opportunity to use Datadog at any of my previous positions, I am quite familiar with Grafana and I'm generally pretty happy with it.
Why, exactly, do we need to put a memory cache such as Redis in front of Postgres?
Postgres has its own in-memory cache that it updates on reads and writes, right? What makes Postgres' cache so much worse than a dedicated Redis?
Postgres can develop problematic behavior if you have high-churn tables: tables with lots of deletes on them.
If you have many inserts and deletes on a table, it will accumulate dead tuples, and Postgres will eventually be forced to vacuum it. This doesn't block normal operation, but autovacuums on large tables can be resource-intensive, especially on the storage/IO side. At worst, this turns into resource contention, so you either end up with a never-ending autovacuum (because the vacuum can't keep up fast enough) or a severe performance impact on every query on the system (and since this is your Postgres-as-Redis, there is a good chance all of the hot paths rely on the cache and get slowed down significantly).
Both of these result in different kinds of fun: either your applications just stop working because Postgres is busy cleaning up, or you end up with horrible table bloat down the line, which will take hours and hours of application downtime to fix, because your drives are fast, but not that fast.
There are ways to work around this, naturally. You could have an expiration key with an index on it, do "select * from cache order by expiration_key desc limit 1", throw pg_partman at it to partition the table based on the expiration key, drop old values by dropping partitions, and such... but at some point you start wondering whether using a system meant for this kind of workload is easier.
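To make the expiration-column pattern concrete, here's a rough sketch in Python, using sqlite as a stand-in for Postgres (table and column names are invented; partition-dropping via pg_partman has no sqlite analogue, so expiry is a plain DELETE here):

```python
import sqlite3
import time

# Stand-in for Postgres, just to show the shape of the schema and queries.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT, expires_at REAL)")
db.execute("CREATE INDEX idx_cache_expires ON cache (expires_at)")  # for cheap expiry scans

def put(key, value, ttl):
    db.execute("INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
               (key, value, time.time() + ttl))

def get(key):
    # Expired rows are simply filtered out; physical cleanup happens separately.
    row = db.execute("SELECT value FROM cache WHERE key = ? AND expires_at > ?",
                     (key, time.time())).fetchone()
    return row[0] if row else None

def evict_expired():
    # In Postgres you might drop whole partitions instead; a DELETE shows the intent.
    db.execute("DELETE FROM cache WHERE expires_at <= ?", (time.time(),))
```

This is exactly the kind of bookkeeping (indexes, expiry sweeps, eventually partitioning) that a purpose-built cache does for you.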
The buffer pool in an RDBMS ends up intimately connected with the concurrency-control and durability protocols. There are also a variety of trade-offs in how to handle conflicts between transactions (steal vs. no-steal, force vs. no-force, etc.), and you need deadlock detection or prevention. That creates a necessary minimum of complexity and overhead.
By comparison, an in-memory KV cache is much more streamlined. It basically just needs to move bytes from a hash table to a network socket as fast as possible, with no transactional concerns.
The semantics matter as well. PostgreSQL has to assume all data needs to be retained. Memcached can always just throw something away. Redis persistence is best-effort with an explicit loss window. That has enormous practical implications for their internals.
So in practical terms, they're in different universes performance-wise. If your workload is semantically compatible with a KV cache, adding memcached to your infrastructure will probably result in savings overall.
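The "can always just throw something away" point is the whole design: eviction replaces memory management. A toy LRU sketch (not Redis's actual policy, which is sampled/approximate, but it shows why there's no WAL, no MVCC, and no vacuum in this world):

```python
from collections import OrderedDict

# Minimal LRU cache: any entry may be dropped at any time, so memory
# management is just "evict the least recently used key when full".
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used
```

Contrast this with a database, where "just drop the row" is never an option.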
Because Redis is almost infinitely scalable, while Postgres is not. You have a limited vertical-scaling budget for your database: the more things you put into it, the more budget you're spending on things that could be done elsewhere.
Sometimes it makes sense, when your workload is not going to hit the limits of your available hardware.
But generally you should be prepared to move everything you can out of the database, so the database doesn't spend any CPU on things that could be computed on another machine. Cache is one of those things: if you can avoid hitting the database by hitting another server instead, that's a great thing to do.
Of course, you should not optimize prematurely. Start simple, hit your database limits, then introduce a cache.
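When you do introduce it, the usual shape is cache-aside: the application, not the database, keeps the cache populated. A minimal sketch with made-up names:

```python
# Cache-aside sketch: try the cache first, fall back to the database on a
# miss, and populate the cache so the next read is cheap. `cache` is any
# dict-like store; `db_query` stands in for the real database call.
def make_reader(cache, db_query):
    def read(key):
        value = cache.get(key)
        if value is not None:
            return value            # hit: database untouched
        value = db_query(key)       # miss: one database round-trip
        cache[key] = value
        return value
    return read
```

The database is only consulted on a miss, which is the whole point of moving reads off it.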
How would the architecture in the OP mesh with master-slave postgres setups? If I write a cache item how can I be certain the freshest entry is read back from the read-only slave? Can/do I pay a performance penalty on writes waiting for it to be synchronized? Is it better, when it comes to caching, to ignore the slave and send all read/write cache related queries to the master?
All of these questions go away or are greatly simplified with redis.
They don't really go away, because if you need read-only replicas with PostgreSQL, there is a good chance that you will also need read-only replicas with Redis.
Similarly to Postgres, Redis replication is also asynchronous, which means that replicas can be out of sync for a brief period of time.
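One common way applications paper over that window, with either system, is to pin a client's reads to the primary for a short time after it writes ("read your own writes"). A sketch, where the router shape and the lag window are my assumptions, and two dicts stand in for the primary and replica connections:

```python
import time

# Sketch: after a write, route that client's reads to the primary for a
# short window so it sees its own writes despite async replication.
class ReplicaRouter:
    def __init__(self, primary, replica, lag_window=1.0):
        self.primary, self.replica = primary, replica
        self.lag_window = lag_window          # assumed worst-case lag, seconds
        self.last_write = {}                  # client id -> time of last write

    def write(self, client, key, value):
        self.primary[key] = value             # replication happens off-screen
        self.last_write[client] = time.time()

    def read(self, client, key):
        recent = time.time() - self.last_write.get(client, 0.0) < self.lag_window
        return (self.primary if recent else self.replica).get(key)
```

Clients that wrote recently read from the primary; everyone else reads from the (possibly stale) replica.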
I was unsure whether to comment this: you can mark Postgres replicas as synchronous replicas. Writes on the leader only commit once they have been fully replicated to all sync replicas; this way, Postgres can ensure consistency across several replicas.
This, however, can come with a lot of issues if you start using it to ensure consistency across many replicas: writes are only as fast as the slowest replica, and any hiccup on any replica can stall all writes.
What I wasn't sure about: IMO, in such a situation you should rather fix the application to deal with (briefly) stale information, and then you can throw either async Postgres replicas at it, or Redis replication, or something based on memcached.
IME -- and I've just replaced a Postgres-only unlogged cache table with Redis -- it's not about the storage or caching, but about the locking. Postgres needs to acquire (or at least check) a lock for reading as well as writing. Although some optimizations have been done for read-mostly workloads (search for Postgres fast-path locking), you'll still run into lock contention problems relatively quickly.
Machines used to have limited memory, so distributed caching could utilize many machines to form one overall cache. Nowadays machines have plenty of memory, numerous cores, and fast bandwidth, so the need for a large network of cache servers has waned.
Even though PG caches, it is still doing all the work to run the query. It's like asking why a 3D render takes so long to produce an image when the same image saved as a PNG opens so much faster.
The article talks about using unlogged tables, which double write speed by forgoing the durability and safety of the WAL. It doesn't mention query speed because that is completely unaffected by the change.
As far as I know, there is no way to tell Postgres to keep a particular index or table in memory, which is one reason to be wary of using one PG instance for many varied workloads. You might solve this by earmarking workload-specific replicas, though.
If you can keep your entire working set in memory, though, then it probably doesn't matter that much.
Redis is completely in memory, therefore all its data is in memory. Postgres, on the other hand, does have a cache of its own, but does not give you fine-grained control over what stays in it; that depends on data access patterns. E.g. I cannot force an entire table of my choosing to stay in cache.
If you have a second machine, why not just put a Postgres read replica on it? Letting the WAL deal with replica consistency is much simpler than making the client responsible for keeping an external cache in sync, and you get the benefit of keeping everything in Postgres.
I'm assuming you're targeting this mainly at enterprises and business use-cases such as callcenters, but are you planning to make this usable for personal use cases as well? For example, having a bot to bounce ideas off while coding. Pretty much "just" the TTS / STT layer to talk to my finetuned LLM in a natural manner while you handle interruptions and such.
I think the main issue right now for personal use would be cost (and I'm guessing STT / TTS are the most expensive parts..)
I really do not understand these memes about overengineered FactoryFactoryFactories. I have 10 YOE, did I just get lucky? I've worked at enterprise Java shops as well, but even there I'd call the software pragmatic. Are these overengineered monstrosities REALLY still a thing, or is it "just" people suffering in legacy projects? Even the juniors I worked with were following KISS and YAGNI.
Yeah, I think in the last 10 years things have definitely changed. One of the last Java projects I worked on was in 2013, and the lead was ex-Google; he deliberately pulled in the simplest Java libs to get the job done, and we didn't over-engineer anything. Contrast that with 1990-2010, the era of Struts and Enterprise JavaBeans; things were definitely different back then.
Yes, some people read Clean Code and think every file should have fewer than 20 lines. I recently inherited a React project that has all single-use utility functions, GraphQL queries, and component types extracted out into separate files. Having to edit five-plus files to change things in one component is a nightmare experience and slows down changes a lot.
I think this is intended for advertisers. I have seen a lot worse music, mostly for products aimed at kids, that likely cost thousands of dollars to write and record.
You aren't comparing the same thing. A custom-produced piece is entirely different from buying samples, and you still need musical talent to work with library music.
I agree that the quality isn't ideal, but I think this tool helps artists iterate much faster and cheaper. I wouldn't focus on the quality of the output, beyond the threshold which allows the artists to generate a reasonable idea of what they eventually want to make.
Think about all the hard work that traditionally goes into composing a single title. Artists will spend days, weeks, and sometimes months iterating on ideas: writing, composing, demoing, tracking and recording, mixing, etc. Think about all of the expensive software and hardware that goes into this process (instruments, microphones, studios, DAWs, VSTs, etc.). It's an expensive, difficult process: very manual, very sequential.
This could easily be used to speed up that iterative process. Just ask this software to generate 100 ideas for your next bridge, and iterate that way.
I find this very useful as someone who's just learning how to play the guitar. My knowledge of music theory is still limited, and it'll take me years to get to a place where I can express myself in a way I'd deem "satisfactory". I just visited this page and plugged in my lyrics, and it arranged them into a beautiful song for me. It did it just how I imagined it, and that's terrific. Now I can ask my guitar teacher if the chord progressions make any sense, and if so, then we can transcribe it. I don't know who else this would be useful for, but I could see myself paying for it depending on how they develop the tool.