jebarker's comments | Hacker News

> Within an hour, Koishi Chan gave an alternate proof deriving the required bound {c(k^2) \geq 1/k} from the original Erdős-Szekeres theorem by a standard “blow-up” argument which we can give here in the Alice-Bob formulation.

Is this an example of the 4-minute-mile phenomenon, or did the AI proof provide key insights that Chan was able to use in their proof?


This is an interesting paper and I like this kind of mechanistic interpretability work - but I cannot figure out how the paper title "Transformers know more than they can tell" relates to the actual content. In this case what is it that they know and can't tell?

I believe it's a reference to the paper "Language Models (Mostly) Know What They Know".

There's definitely some link, but I'd need to give this paper a good read and refresh on the other to see how strong it is. But I think your final sentence strengthens my suspicion.

https://arxiv.org/abs/2207.05221


This is how I feel about basically all consumer software

I agree. Wolfram gets so much crap on this site, but I think he’s produced so much interesting science and tech, built up over decades, that it’s very admirable.

Anyone can be a CEO, just start a company.

So we don't need anyone to teach or clean toilets? We can all work our way up and be fabulously rich?

I'm not sure how you got that from my comment. CEO is a job title that is easy to get, that was my only point.

If everyone wanted that, sure. But many people don't (I sure don't), and many people that do will fail. Because "working hard" is relative.

And that's ignoring the inherent inequality of birthright.


I haven’t read the paper yet, but I’d imagine the issue is converting the natural language generated by the reasoner into a form where a formal verifier can be applied.


OpenAI has an excellent interactive course on Deep RL: https://spinningup.openai.com/en/latest/


> what evolution has given us is a learning architecture and learning algorithms that generalize well from extremely few samples.

This sounds magical though. My bet is that either the samples aren’t as few as they appear, because humans actually operate in a constrained world where they see the same patterns repeat very many times if you use the correct similarity measures, or the learning that the brain does during a human lifetime is really just fine-tuning on top of accumulated evolutionary learning encoded in the structure of the brain.


> This sounds magical though

Not really, this is just the way that evolution works - survival of the fittest (in the prevailing environment). Given that the world is never the same twice, generalization is a must-have. The second time you see the tiger charging out, you better have learnt your lesson from the first time, even if everything other than "it's a tiger charging out" is different, else it wouldn't be very useful!

You're really saying the same thing, except rather than call it generalization you are calling it being the same "if you use the correct similarity measures".

The thing is that we want to create AI with human-like perception and generalization of the world, etc, etc, but we're building AI in a different way than our brain was shaped. Our brain was shaped by evolution, honed for survival, but we're trying to design artificial brains (or not even - just language models!!) just by designing them to operate in a certain way, and/or to have certain capabilities.

The transformer was never designed to have brain-like properties, since the goal was just to build a better seq-2-seq architecture, intended for language modelling, optimized to be efficient on today's hardware (the #1 consideration).

If we want to build something with capabilities more like the human brain, then we need to start by analyzing exactly what those capabilities are (such as quick and accurate real-time generalization), and considering evolutionary pressures (which Ilya seems to be doing) can certainly help in that analysis.

Edit: Note how different, and massively more complex, the spatio-temporal real world of messy analog never-same-twice dynamics is to the 1-D symbolic/discrete world of text that "AI" is currently working on. Language modelling is effectively a toy problem in comparison. If we build something with brain-like ability to generalize/etc over real world perceptual data, then naturally it'd be able to handle discrete text and language which is a very tiny subset of the real world, but the opposite of course does not apply.


> Note how different, and massively more complex, the spatio-temporal real world of messy analog never-same-twice dynamics is to the 1-D symbolic/discrete world of text that "AI" is currently working on.

I agree that the real world perceived by a human is vastly more complex than a sequence of text tokens. But it’s not obvious to me that it’s actually less full of repeating patterns or that learning to recognize and interpolate those patterns (like an LLM does) is insufficient for impressive generalization. I think it’s too hard to reason about this stuff when the representations in LLMs and the brain are so high-dimensional.


I'm not sure how they can be compared, but of course the real world is highly predictable and repetitious (if you're looking at the right generalizations and abstractions), with brains being the proof of that. Brains are very costly, but their predictive benefit is big enough to more than offset the cost.

The difference between brains and LLMs though is that brains have evolved with generality as a major driver - you could consider it as part of the "loss function" of brain optimization. Brains that don't generalize quickly won't survive.

The loss function of an LLM is just next-token error, with no regard as to HOW that was achieved. The loss is the only thing shaping what the LLM learns, and there is nothing in it that rewards generalization. If the model is underparameterized (not that they really are), it seems to lead to superposed representations rather than forcing generalization.
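
Concretely, the per-position training signal is just cross-entropy against whichever token actually came next. A toy, framework-free sketch of that (the 4-token vocabulary and the logit values are made up purely for illustration):

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // Toy next-token loss: -log softmax(logits)[target].
    // Nothing in it rewards *how* the logits were produced, only how much
    // probability ended up on the token that actually came next.
    double next_token_loss(const std::vector<double>& logits, std::size_t target) {
        double max_logit = *std::max_element(logits.begin(), logits.end());
        double denom = 0.0;
        for (double l : logits) denom += std::exp(l - max_logit);   // softmax denominator
        return -((logits[target] - max_logit) - std::log(denom));   // cross-entropy at this position
    }

    int main() {
        std::vector<double> logits = {2.0, -1.0, 0.5, 0.0};  // model scores over a 4-token vocabulary
        std::printf("loss if the next token is #0: %.3f\n", next_token_loss(logits, 0));
        std::printf("loss if the next token is #2: %.3f\n", next_token_loss(logits, 2));
        return 0;
    }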

No doubt the way LLMs are trained could be changed to improve generalization, maybe together with architectural changes (put an autoencoder in there to encourage compressed representations?!), but trying to take a language model and tweak it into a brain seems the wrong approach, and there is a long list of architectural changes/enhancements that would be needed if that is the path.

With animal brains, it seems that generalization must have been selected for right from the simplest beginnings of a nervous system and sensory driven behavior, given that the real world demands that.


As someone who has only dabbled in C++ over the past 10 years or so, I feel like each new release has this messaging of “you have to think of it as a totally new language”. It makes C++ very unapproachable.


It isn’t each release but there are three distinct “generations” of C++ spanning several decades where the style of idiomatic code fundamentally changed to qualitatively improve expressiveness and safety. You have legacy, modern (starting with C++11), and then whatever C++20 is (postmodern?).
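
As a rough illustration of how idiomatic code shifted between those generations (a simplified sketch; the Widget type here is invented, and it doesn't touch C++20 additions like concepts, ranges or modules):

    #include <cstddef>
    #include <cstdio>
    #include <memory>
    #include <vector>

    struct Widget { int id; };

    // Legacy style: raw owning pointers, index loops, manual cleanup.
    void legacy() {
        std::vector<Widget*> widgets;
        widgets.push_back(new Widget{1});
        for (std::size_t i = 0; i < widgets.size(); ++i)
            std::printf("widget %d\n", widgets[i]->id);
        for (std::size_t i = 0; i < widgets.size(); ++i)
            delete widgets[i];   // easy to leak or double-free
    }

    // Modern style (C++11/14 onwards): explicit ownership, auto, range-for.
    void modern() {
        std::vector<std::unique_ptr<Widget>> widgets;
        widgets.push_back(std::make_unique<Widget>(Widget{2}));
        for (const auto& w : widgets)
            std::printf("widget %d\n", w->id);
    }   // memory released automatically when the vector goes out of scope

    int main() {
        legacy();
        modern();
        return 0;
    }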

This is happening to many older languages because modern software has more intrinsic complexity and requires more rigor than when those languages were first designed. The languages need to evolve to effectively address those needs or they risk being replaced by languages that do.

I’ve been writing roughly the same type of software for decades. What would have been considered state-of-the-art in the 1990s would be a trivial toy implementation today. The languages have to keep pace with the increasing expectations for software to make it easier to deliver reliably.


As someone who has been using C++ extensively for the last 25 years, I've found each release to feel like an incremental improvement. Yes, there are big chunks in each release that are harder to learn, but usually a team can introduce them at their own pace.

It's undeniable that C++ is a very large and complex language and that this makes it unapproachable, but I don't think the new releases make it significantly worse. If anything, I think some of the new stuff does ease the on-ramp a bit.


C++ can be written as the optimal industrial language it is: simple core concepts, year after year, with minimal adaptation.

The key thing to understand is that you are still using C with sugar on top. So you need to understand how the language concepts map to the hardware concepts. So it’s much more relevant to understand pointer arithmetic, the difference between stack and heap allocations and so on, rather than what the most recent language standard changes.
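
A minimal sketch of that kind of mapping (stack vs heap lifetime and raw pointer arithmetic; the variable names are just illustrative):

    #include <cstdio>

    int main() {
        int on_stack[4] = {10, 20, 30, 40};      // lives in this stack frame, freed automatically
        int* on_heap = new int[4]{1, 2, 3, 4};   // lives on the heap until explicitly deleted

        int* p = on_stack;                       // a pointer is just an address
        std::printf("%d %d\n", *p, *(p + 2));    // p + 2 advances by 2 * sizeof(int) bytes

        on_heap[3] = *(on_heap + 1);             // the same arithmetic works on heap memory
        std::printf("%d\n", on_heap[3]);

        delete[] on_heap;                        // the programmer owns the release
        return 0;
    }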

You can write the same type of C++ for decades. It’s not going to stop compiling. As long as it compiles on your language standard (C++17 is fine I think unless you miss something specific) you are off to the races. And you can write C++17 for the next two decades if you want.


How would you define “ahead”?


Able to make changes preserving correctness over time

Vibecoding reminds me sharply of the height of the Rails hype: products quickly rushed to market off the back of a slurry of gems and autoimports inserted into generated code, the original authors dipping, and teams of maintainers then screeching to a halt.

Here the bots will pigheadedly heap one 9,000-line PR onto another, shredding the code base to bits but making it look like a lot of work in the process.


Yes, preserving correctness seems like a good metric. My immediate reaction was to think that the parent comment was saying they’d like to see this comparison because AI will come out ahead. On this metric and based on current AI coding it’s hard to see that being the case or even possible to verify.

