> Not to mention that perfectly normalizing a database always incurs join overhead that limits horizontal scalability. In fact, denormalization is required to achieve scale (with a trade-off).
This is just not true, at least not in general. Inserting into a normalized design is usually faster: the indexes are smaller, there are fewer of them, and more rows fit per page.
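To make that concrete, here is a minimal sketch of what I mean (the table and column names are invented for illustration):

    -- Denormalized: every order row repeats the customer attributes and
    -- needs extra indexes to keep those repeated columns searchable.
    CREATE TABLE orders_denorm (
        order_id      bigint  GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        customer_name text    NOT NULL,
        customer_city text    NOT NULL,
        amount        numeric NOT NULL
    );
    CREATE INDEX ON orders_denorm (customer_name);
    CREATE INDEX ON orders_denorm (customer_city);

    -- Normalized: the order row is narrow (more rows per page), and the
    -- customer attributes are stored and indexed once, in one place.
    CREATE TABLE customers (
        customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        name        text NOT NULL,
        city        text NOT NULL
    );
    CREATE TABLE orders (
        order_id    bigint  GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        customer_id bigint  NOT NULL REFERENCES customers,
        amount      numeric NOT NULL
    );

Each insert into orders touches one narrow row and one small primary-key index, instead of a wide row plus three indexes.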
> That may be. What's not specified there is the immense, immense cost of driving a dev org on those terms
I'm happy that we agree on the solution and disagree only on whether it is worth the cost. As for the cost, I took it into consideration when I wrote the conclusion:
> FAANG-style companies are unlikely to adopt formal methods or relational rigor wholesale. But for their most critical systems, they should. It’s the only way to make failures like this impossible by design, rather than just less likely.
There is an actionable plan in the article. It is possible to run teams like this. It is an economic decision by upper management to run the risk of these outages rather than adopt this alternative.
Normalization cannot be done by machines, because it depends on expressing the predicate (and only the predicate) that corresponds to the business rule in question.
It requires apprehending the essence of the situation, something a machine cannot do.
Sure it can. Not in a vacuum, maybe, but with some guidance from the user as to dependency relations. Ideally you have enough data baked into your schema to infer those relations.
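As a rough sketch of what inferring those relations from existing data could look like (the addresses table and its columns are hypothetical), a candidate functional dependency zip_code -> city can be checked directly against the data:

    -- Zero rows back means the data is consistent with zip_code -> city,
    -- which is evidence (not proof) that city could be factored out into
    -- a table keyed by zip_code. A human still has to confirm the rule.
    SELECT zip_code
    FROM addresses
    GROUP BY zip_code
    HAVING count(DISTINCT city) > 1;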
Or they need to write the business rules down in some sort of expressive format that can be tested against the requirements... Say, an artificial language of some sort. One might call it "programming".
Lots of people got hung up on the example, which I thought would be helpful to the discussion, but it certainly should not replace the main point, which is:
Relations, attributes and tuples are logical. A PK is a combination of one or more attributes representing a name that uniquely identifies tuples and, thus, is logical too[2], while performance is determined exclusively at the physical level, by implementation.
So generating surrogate keys (SKs) for performance reasons (see, for example, "Natural versus Surrogate Keys: Performance and Usability" and "Performance of Surrogate Key vs Composite Keys") is logical-physical confusion (LPC)[3]. Performance can be considered in PK choice only when there is no logical reason for choosing one key over another.
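One way to picture the distinction in PostgreSQL terms (schema invented for illustration): the logical key is declared regardless, and a surrogate, if added at all for physical reasons, sits on top of it rather than replacing it:

    -- The natural (logical) key: a student enrolls in a course at most once.
    CREATE TABLE enrollments (
        student_id bigint NOT NULL,
        course_id  bigint NOT NULL,
        grade      text,
        PRIMARY KEY (student_id, course_id)
    );

    -- If a surrogate were ever introduced for implementation reasons, the
    -- natural key would still have to be enforced, e.g.:
    --   enrollment_id bigint GENERATED ALWAYS AS IDENTITY,
    --   UNIQUE (student_id, course_id)
    -- Dropping that UNIQUE constraint is where the logical design gets lost.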
Remote: Yes
Willing to relocate: Yes (US preferred)
Technologies: PostgreSQL (partitioning, performance, OLTP architecture), SQL, F#, C#, C, Java, Clojure, Common Lisp, Scheme, Emacs Lisp, Python, Ruby, AWS, Linux
Email: ebellani at gmail
I work on high-throughput PostgreSQL systems, especially when they’ve grown into a state where migrations, performance, or schema design have become limiting factors.
Recent work:
Re-architected two OLTP tables totaling roughly 3 TB (~2 TB and ~1 TB) receiving 200+ writes/sec. I focus on “rescue architecture” work: fixing dangerous schemas, stabilizing hot paths, removing app-level complexity, and making Postgres scale without rewriting the product.
Open to consulting or full-time roles where Postgres is core to the business and performance/architecture matters.
Résumé: https://www.linkedin.com/in/eduardo-bellani/ https://ebellani.github.io/