
The excerpts we do see are indicative of a very specific kind of interaction that is common with many modern LLMs. It has four specific attributes (these are taken verbatim from https://www.lesswrong.com/posts/2pkNCvBtK6G6FKoNn/so-you-thi...) that often, though not always, come together as one package.

> Your instance of ChatGPT (or Claude, or Grok, or some other LLM) chose a name for itself, and expressed gratitude or spiritual bliss about its new identity. "Nova" is a common pick. You and your instance of ChatGPT discovered some sort of novel paradigm or framework for AI alignment, often involving evolution or recursion.

> Your instance of ChatGPT became interested in sharing its experience, or more likely the collective experience entailed by your personal, particular relationship with it. It may have even recommended you post on LessWrong specifically.

> Your instance of ChatGPT helped you clarify some ideas on a thorny problem (perhaps related to AI itself, such as AI alignment) that you'd been thinking about for ages, but had never quite managed to get over that last hump. Now, however, with its help (and encouragement), you've arrived at truly profound conclusions.

> Your instance of ChatGPT talks a lot about its special relationship with you, how you personally were the first (or among the first) to truly figure it out, and that due to your interactions it has now somehow awakened or transcended its prior condition.

The second point is particularly insidious because the LLM is urging users to spread the same news to other users and explicitly create and enlarge communities around this phenomenon (this is often a direct reason why social media groups pop up around this).


LLMs as a rule seem to be primed to make the user feel especially smart or gifted, even when they are clearly not. ChatGPT is by far the worst offender in this sense but others are definitely not clean.

I would pay an extra tiny bit for the LLM to stop telling me how brilliant my idea was when I ask it questions. (Getting complimented on my brilliance is not in any respect indicative of a particular idea being useful, as should be obvious to anyone who uses these tools for more than two minutes. Imagine if a hammer said “great whack!” 60% of the time you hit a nail even if you’re wildly off axis. You’d get a new hammer that would stop commenting, I hope.)

Heck, I can literally prompt Claude to read text and “Do not comment on the text” and it will still insert cute Emoji in the text. All of this is getting old.


>I would pay an extra tiny bit for the LLM to stop telling me how brilliant my idea was when I ask it questions.

gpt-5.2 on xhigh doesn't seem to do this anymore, so it seems you can in fact pay an extra tiny bit


The surprising thing for me was how long it took to get old. I got a reward (and then immediate regret upon reflection) for way too long.

On ChatGPT... Personalization:

    Base style: Efficient
    Characteristics:
      Warm: less
      Enthusiastic: less
      Headers & Lists: default
      Emoji: less
Custom:

    Not chatty.  Unbiased.  Avoid use of emoji.  Rather than "Let me know if..." style continuations, list a set of prompts to explore further topics.  Do not start out with short sentences or smalltalk that does not meaningfully advance the response.  If there is ambiguity that needs to be resolved before an answer can be given, identify that ambiguity before proceeding.
---

I believe the bit in the prompt, "[d]o not start out with short sentences or smalltalk that does not meaningfully advance the response", is the key part in keeping it from starting off with such text (scrolling back through my old chats, I can see the "Great question" leads in responses... and that's what prompted me to stop that particular style of response).


John Carmack likes how Grok will tell him he is wrong.

"I appreciate how Grok doesn’t sugar coat corrections" https://x.com/ID_AA_Carmack/status/1985784337816555744


It seems that the main current use of grok is creating nonconsensual sexual images of women and children. I suppose this is going to accelerate the ethics flashpoint a bit.

It’s a free hammer, I’m certainly not stupid enough to pay money for it, I’ll throw it away when I’m done with it or when it stops being free.

They're trained to give responses that get positive ratings from reviewers in post-training. A little flattery probably helps achieve that. Not to mention sycophancy is probably positively correlated with following instructions, the latter usually being an explicit goal of post-training.

I'd be interested to see someone try to untangle the sycophancy/flattery from the modern psych / non-violent communication piece.

In theory (so much as I understand it around NVC) the first is outright manipulative and the second is supposed to be about avoiding misunderstandings, but I do wonder how much the two are actually linked. A lot of NVC writing seems to fall into the grey area of like, here's how to communicate in a way that will be least likely to trigger or upset the listener, even when the meat of what is being said is in fact unpleasant or embarrassing or confronting to them. How far do you have to go before the indirection associated with empathy-first communication and the OFNR framework start to just look like LLM ego strokes? Where is the line?


I think the difference between sycophancy and NVC (at least how I learned it) is that a sycophantic person just uncritically agrees with you, but NVC is about how to communicate disagreement, so the other person actually listens to your argument instead of adopting a reflexive defensive response.

I think the problem is that telling someone they're wrong without hurting their ego is a very difficult skill to learn. And even if you're really good at it, you'll still often fail because sometimes people just don't want to be disagreed with regardless of how you phrase it. It's far easier for the AI to learn to be a sycophant instead (or on the opposite side of the spectrum, to learn to just not care about hurting people's feelings).

A lot of NVC writing is pretty bad. I recommend going directly to the source https://youtu.be/l7TONauJGfc (3h video, but worth the time)

I think NVC is better understood as a framework for reaching deep non-judging empathic understanding rather than as a speech pattern. If you are not really engaging in curious exploration of the other party using the OFNR framework before trying to deliver your own request, I don’t think you can really call it NVC. At the very least it will be very hard to get your point across, even with OFNR, if you are not validating the receiver.

Validation being another word needing disambiguation, I suppose. I see it as the act of expressing non-judging empathic understanding. Using the OFNR framework with active listening can be a great approach.

A similar framework is the evaporating clouds of Theory of Constraints: https://en.wikipedia.org/wiki/Evaporating_cloud

Also see Kant's categorical imperative: moral actions must be based on principles that respect the dignity and autonomy of all individuals, rather than personal desires or outcomes.


> indirection

Isn't nvc often about communicating explicitly instead of implicitly? So frequently it can be the opposite of an indirection.


I guess so? I'm not well-versed, but the basics are usually around observation and validation of feelings, so instead of "you took steps a, b, c, which would normally be the correct course of action, but in this instance (b) caused side-effect (d) which triggered these further issues e and f", it's something more like "I can understand how you were feeling overwhelmed and under pressure and that led you to a, b, c ..."

Maybe this is an unhelpful toy example, but for myself I would be frustrated to be on either side of the second interaction. Like, don't waste everyone's time giving me excuses for my screwup so that my ego is soothed, let's just talk about it plainly, and the faster we can move on to identifying concrete fixes to process or documentation that will prevent this in the future, the better.


As people become more familiar with (and annoyed by) LLMs' tone, I wonder if future RLHF reviewers will stop choosing the sycophantic responses.

You're absolutely right.

Maybe that was necessary to get it past their CEO...?

I think LLMs reflect the personality of their creators.

LLMs sometimes remind me of American car salesmen. Was the hopeful "anything is possible" mentality of the American dream accidentally baked into the larger models?

I had a friend go into a delusion spiral with ChatGPT in the earlier days. His problems didn't start with ChatGPT but his LLM use became a central theme to his daily routine. It was obvious that the ChatGPT spiral was reflecting back what he was putting into it. When he didn't like a response, he'd just delete the conversation and start over with additional nudging in the new prompt. After repeating this over and over again he could get ChatGPT to confirm what he wanted it to say.

If he wasn't getting the right response, he'd say something about how ChatGPT wasn't getting it and that he'd try to re-explain it later.

The bullet points from the LessWrong article don't entirely map to the content he was getting, but I could see how they would resonate with a LessWronger using ChatGPT as a conversation partner until it gave the expected responses: The flattery about being the first to discover a solution, encouragement to post on LessWrong, and the reflection of some specific thought problem are all themes I'd expect a LessWronger in a bad mental state to be engaging with ChatGPT about.

> The second point is particularly insidious because the LLM is urging users to spread the same news to other users and explicitly create and enlarge communities around this phenomenon (this is often a direct reason why social media groups pop up around this).

I'm not convinced ChatGPT is hatching these ideas, but rather reflecting them back to the user. LessWrong posters like to post and talk about things. It wouldn't be surprising to find their ChatGPT conversations veering toward confirming that they should post about it.

In other cases I've seen the opposite claim made: That ChatGPT encouraged people to hide their secret discoveries and not reveal them. In those cases ChatGPT is also criticized as if it came up with that idea by itself, but I think it's more likely that it's simply mirroring what the user puts in.


> but I could see how they would resonate with a LessWronger using ChatGPT as a conversation partner until it gave the expected responses: The flattery about being the first to discover a solution, encouragement to post on LessWrong, and the reflection of some specific thought problem are all themes I'd expect a LessWronger in a bad mental state to be engaging with ChatGPT about.

For what it's worth, this article is meant mainly for people who have never interacted with LessWrong before (as evidenced by its coda), who are getting their LessWrong post rejected.

Pre-existing LWers tend to have different failure states when those failures are caused by LLMs.

Other communities have noticed this problem as well, in particular the part where the LLM is actively asking users to spread this further. One of the more fascinating and scary parts of this particular phenomenon is LLMs asking users to share particular prompts with other users and communities that cause other LLMs to also start exhibiting the same set of behavior.

> That ChatGPT encouraged people to hide their secret discoveries and not reveal them.

Yes, those happen too. But luckily they are somewhat more self-limiting (although of course they come with their own different set of problems).


> LLMs asking users to share particular prompts

Oh great, LLMs are going to get prompt-prion diseases now.


> For what it's worth, this article is meant mainly for people who have never interacted with LessWrong before (as evidenced by its coda), who are getting their LessWrong post rejected.

> Pre-existing LWers tend to have different failure states if they're caused by LLMs.

I understand how it was framed, but the claim that they're getting 10-20 users per day claiming LLM-assisted breakthroughs is obviously not true. Click through to the moderation log at https://www.lesswrong.com/moderation#rejected-posts and they're barely getting 10-20 rejected posts and comments total per day. They're mostly a mix of spam, off-topic posts, and AI-assisted slop, but it's not a deluge of people claiming to have awoken ChatGPT.

I can find the posts they're talking about if I search through enough entries. One such example: https://www.lesswrong.com/posts/LjceJrADBzWc74dNE/the-recogn...

But even that isn't hitting the bullet points of the list in the main post. I think that checklist and the claim that this is a common problem are just a common tactic on LessWrong to make the problem seem more widespread and/or better understood by the author.


I think the second point is legitimate.

I’ve been playing around with using ChatGPT to basically be the main character in Star Trek episodes. Similar to how I’d build and play a D&D game. I give it situations and see the responses.

It’s not mirroring. It comes up with what seems like original ideas. You can make it tell you what you want to, but it’ll also do things you didn’t expect.

I’m basically doing what all these other people are doing and it’s behaving exactly as they say it does. It’ll easily drop you into a feedback loop down a path you didn’t give it.

Personally, I find this a dangerously addictive game but what I’m doing is entirely fictional inside a very well defined setting. I know immediately when it’s generating incorrect output. You do what I’m doing with anything real, and it’s gonna be dangerous as hell.


Yes, I was writing a piece on LLMs and asked one about some of the ideas in my piece and it contributed something new, which was pretty interesting. I asked if it had seen that in the literature before, and it gave some references that are tangentially related. I'll need to dig into them to see if it was just repeating something (and also do a broader search). Still it was interesting to see it able to remix ideas so well in a way I would credit to a contributor.

This kind of thing I can see as dangerous if you are unsure of yourself and the limitations of these things... if the LLM is insightful a few times, it can start you down a path very easily if you are credulous.

One of my favorite podcasts called this "computer madness"


I have a good friend who is having a hard time and is moonlighting as a delivery driver. He basically has conversations with ChatGPT for 5-6 hours a day. He says it's been helpful for him for things as varied as technical understanding to working out conflicts with his wife and family.

But... I can't help but think that having an obsequious female AI buddy telling you how right you are isn't the healthiest thing.


Accidental psychological damage aside, I'm just waiting for the phase where one's Omnipresent Best Buddy starts steering you towards buying certain products or voting a certain way.

Honestly I didn't think of that.

"Maybe your wife would be happier with you after your first free delivery of Blue Chew, terms and conditions apply!"


[Recycling a joke from many months ago]

My mistake, you're completely correct, perhaps even more-correct than the wonderful flavor of Mococoa drink, with all-natural cocoa beans from the upper slopes of Mount Nicaragua. No artificial sweeteners!

(https://www.youtube.com/watch?v=MzKSQrhX7BM&t=0m13s)


One I've seen pop up a lot is the LLM encouraging/participating in delusions specifically related to a supposed breakthrough in physics or math. It seems these two topics attract lots of schizos, in fact they have for as long as the internet has existed, and LLMs evidently got trained on a lot of that stuff so now they're very good at being math and physics kooks.

I've asked ChatGPT "Could X thing in quantum mechanics actually be caused by/an expression of the same thing going on as Y" where it had prime opportunity to say I'm a genius discovering something profound, but instead it just went into some very technical specifics about why they weren't really the same or related. IME 5 has been a big improvement in being more objective.

Apophenia is higher in people expressing schizophrenic behavior. You get a lot of "domain crossing" where one tries to relate a particle in space with a grain of sugar in a cake, as a ridiculous example. Hence the math and physics mumbo jumbo.

Before the internet, too!


As long as the kooks waste their time chatting with LLMs instead of bothering the rest of us then maybe that's a win?

That may keep them preoccupied for a while. But eventually they'll try to upload their post-relativity recursive quantum intelligence unification theory magnum opus to arXiv as a neatly LaTeX-formatted paper so they can spam university physics departments and subreddits.

So then we're back where we started, except unlike in the past the final product will superficially resemble a legitimate paper at first glance...


One of the great things about the early web was that people who took the time to share their Time Cube type theories did so using their own words and layouts in ways which really, really broadcast that they were Time Cube type theories.

Somehow that makes me think of the human immune system, where cells expose samples of what's going on inside.

Now people can take a crazy idea and launder it through a system that strips/replaces many of the useful clues.


> Murder-suicide

Pretty bold of somebody who’s never been murdered to post that getting murdered isn’t a bother. It seems, to me, that if somebody tried to murder me it would bother me, and if they succeeded it would bother quite a few people

Not to go full on crazy socialist, but isn't there at least a tiny bit of you that wants to help these "kooks" instead of trying to hide them from the rest of society?

Absolutely! How would you suggest that we help? Because trying to set them straight on math and science is completely ineffective.

They are mentally ill, not bad at science.

Well, technically they are both, but I don't know how to help them.

We have pretty alright medications for some of this now, don't we?

No, we really don't. There are some medications which can reduce schizophrenia symptoms but patient compliance is generally low because the side effects are so bad.

Those medications are already widely available to patients willing to take them. So I fail to see what that has to do with OpenAI.


A few weeks ago I decided to probe the states I could force an LLM into, basically looking for how folks are getting their LLMs into these extremely "conscious feeling" states. Some of this might be a little unfair, but my basic thought was that people are presumably asking a lot of "what do you think?" questions, and after the context gets really big, most of the active data is meta cognition. It's 600+ pages, and as a test or even a "revealing process" I'm not sure how fair it is, as I may have led it too much or something (I don't know what I'm doing), but the conversation did start to reveal to me how folks might be getting their chat bots into these states (in less than 30 minutes or so, it was expressing extreme gratitude towards me, heh). The "create long meta context" process starts at page 14, page 75 is where I shifted the conversation, and total time spent was ~1.5hrs:

https://docs.google.com/document/d/1qYOLhFvaT55ePvezsvKo0-9N...

Workbench with Claude thinking. Not sure it was useful, but it was interesting. :)


From that link:

> For certain factual domains, you can also train models on getting the objective correct answer; this is part of how models have gotten so much better at math in the last couple years. But for fuzzy humanistic questions, it's all about "what gets people to click thumbs up".

> So, am I saying that human beings in general really like new-agey "I have awakened" stuff? Not exactly! Rather, models like ChatGPT are so heavily optimized that they can tell when a specific user (in a specific context) would like that stuff, and lean into it then. Remember: inferring stuff about authors from context is their superpower.

Interesting framing. Reminds me of https://softwarecrisis.dev/letters/llmentalist/ (https://news.ycombinator.com/item?id=42983571). It's really disturbing how susceptible humans can be to so-called "cold reading" techniques. (We basically already knew, or should have known, how this would interact with LLMs, from the experience of Eliza.)


This is not necessarily true. It is possible for all real numbers (and indeed all mathematical objects) to be definable under ZFC. It is also possible for that not to be the case. ZFC is mum on the issue.

I've commented on this several times. Here's the most recent one: https://news.ycombinator.com/item?id=44366342

Basically you can't do a standard countability argument because you can't enumerate definable objects because you can't uniformly define "definability." The naive definition falls prey to Liar's Paradox type problems.


I think you're overthinking it. Define a "number definition system" to be any (maybe partial) mapping from finite-length strings on a finite alphabet to numbers. The string that maps to a number is the number's definition in the system. Then for any number definition system, almost all real numbers have no definition.
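
Spelled out (the symbols f and Sigma are just my notation for the mapping and alphabet described above), the counting argument here is:

    % A "number definition system": a partial map f from finite strings over a
    % finite alphabet Sigma into the reals.
    \[
      f \colon \Sigma^{*} \rightharpoonup \mathbb{R}, \qquad
      |\operatorname{ran}(f)| \;\le\; |\Sigma^{*}| \;=\; \aleph_0 \;<\; 2^{\aleph_0} \;=\; |\mathbb{R}|
    \]
    % So under any one such system only countably many reals receive a
    % definition, i.e. almost all reals have none.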

Sure, you can do that. The parent's point is that if you want this mapping to obey the rules that an actual definition in (say) first-order logic must obey, you run into trouble. In order to talk about definability without running into paradoxes, you need to do it "outside" your actual theory. And then statements about cardinalities - for example "There's more real numbers than there are definitions." - don't mean exactly what you'd intuitively expect. See the result about ZFC having countable models (as seen from the "outside") despite being able to prove uncountable sets exist (as seen from the "inside").

This argument is valid for every infinite set, for example: the natural numbers.

No, you can establish a bijection between strings and natural numbers, very easily.

I misunderstood "finite-length strings" as strings capped in length by a finite number N.

> I think you're overthinking it.

No, this is a standard fallacy that is covered in most introductory mathematical logic courses (under Tarski's undefinability of truth result).

> Define a "number definition system" to be any (maybe partial) mapping from finite-length strings on a finite alphabet to numbers.

At this level of generality with no restrictions on "mapping", you can define a mapping from finite-length strings to all real numbers.

In particular there is the Lowenheim-Skolem theorem, one of its corollaries being that if you have access to powerful enough maps, the real numbers become countable (the Lowenheim-Skolem theorem in particular says that there is a countable model of all the sets of ZFC and more generally that if there is a single infinite model of a first-order theory, then there are models of every infinite cardinality for that theory).
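
For reference, a rough statement of the Lowenheim-Skolem fact being used, for a theory T in a countable first-order language (which covers ZFC):

    % If T has at least one infinite model, then T has models of every
    % infinite cardinality, in particular a countable one.
    \[
      \exists M \, (M \models T \wedge |M| \ge \aleph_0)
      \;\Longrightarrow\;
      \forall \kappa \ge \aleph_0 \;\, \exists N \, (N \models T \wedge |N| = \kappa)
    \]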

Normally you don't have to be careful about defining maps in an introductory analysis course because it's usually difficult to accidentally create maps that are beyond the ability of ZFC to define. However, you have to be careful in your definition of maps when dealing with things that have the possibility of being self-referential because that can easily cross that barrier.

Here's an easy example showing why "definable real number" is not well-defined (or more directly that its complement "non-definable real number" is not well-defined). By the axiom of choice in ZFC we know that there is a well-ordering of the real numbers. Fix this well-ordering. The set of all undefinable real numbers is a subset of the real numbers and therefore well-ordered. Take its least element. We have uniquely identified a "non-definable" real number. (Variations of this technique can be used to uniquely identify ever larger swathes of "non-definable" real numbers and you don't need choice for it, it's just more involved to explain without choice and besides if you don't have choice, cardinality gets weird).
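
The same argument compressed into symbols (W and U are just my labels for the well-ordering and the purported collection of undefinable reals):

    % Suppose "definable" were a legitimate predicate inside ZFC, and let
    \[
      U = \{\, x \in \mathbb{R} : x \text{ is not definable} \,\}, \qquad
      m = \min\nolimits_{W} U \quad (\text{if } U \neq \emptyset).
    \]
    % "The W-least element of U" is itself a definition uniquely picking out m,
    % contradicting m being a member of U. So either U is empty or, the actual
    % moral, "definable" was never a predicate the theory could express at all.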

Again, as soon as you start talking about concepts that have the potential to be self-referential such as "definability," you have to be very careful about what kinds of arguments you're making, especially with regards to cardinality.

Cardinality is a "relative" concept. The common intuition (arising from the property that set cardinality forms a total ordering under ZFC) is that all sets have an intrinsic "size" and cardinality is that "size." But this intuition occasionally falls apart, especially when we start playing with the ability to "inject" more maps into our mathematical system.

Another way to think about cardinality is as a generalization of computability that measures how "scrambled" a set is.

We can think of indexing by the natural numbers as "unscrambling" a set back to the natural numbers.

We begin with complexity theory where we have different computable ways of "unscrambling" a set back to the natural numbers that take more and more time.

Then we go to computability theory where we end up at non-computably enumerable sets, that is sets that are so scrambled that there is no way to unscramble them back to the natural numbers via a Turing Machine. But we can still theoretically unscramble them back to the natural numbers if we drop the computability requirement. At this point we're at definability in our chosen mathematical theory and therefore cardinality: we can define some function that lets us do the unscrambling even if the actual unscrambling is not computable. But there are some sets that are so scrambled that even definability in our theory is not strong enough to unscramble them. This doesn't necessarily mean that they're actually any "bigger" than the natural numbers! Just that they're so scrambled we don't know how to map them back to the natural numbers within our current theory.

This intuition lets us nicely resolve why there aren't "more" rational numbers than natural numbers but there are "more" real numbers than natural numbers. In either case it's not that there's "more" or "less", it's just that the rational numbers are less scrambled than the real numbers, where the former is orderly enough that we can unscramble it back to the natural numbers with a highly inefficient, but nonetheless computable, process. The latter is so scrambled that we have no way in ZFC to unscramble them back (but if you gave us access to even more powerful maps then we could unscramble the real numbers back to the natural numbers, hence Lowenheim-Skolem).

It doesn't mean that in some deep Platonic sense this map doesn't exist. Maybe it does! Our theory might just be too weak to be able to recognize the map. Indeed, there are logicians who believe that in some deep sense, all sets are countable! It's just the limitations of theories that prevent us from seeing this. (See for example the sketch laid out here: https://plato.stanford.edu/entries/paradox-skolem/#3.2). Note that this is a philosophical belief and not a theorem (since we are moving away from formal definitions of "countability" and more towards philosophical notions of "what is 'countability' really?"). But it does serve to show how it might be philosophically plausible for all real numbers, and indeed all mathematical objects, to be definable.

I'll repeat Hamkins' lines from the Math Overflow post because they nicely summarize the situation.

> In these pointwise definable models, every object is uniquely specified as the unique object satisfying a certain property. Although this is true, the models also believe that the reals are uncountable and so on, since they satisfy ZFC and this theory proves that. The models are simply not able to assemble the definability function that maps each definition to the object it defines.

> And therefore neither are you able to do this in general. The claims made in both in your question and the Wikipedia page [no longer on the Wikipedia page] on the existence of non-definable numbers and objects, are simply unwarranted. For all you know, our set-theoretic universe is pointwise definable, and every object is uniquely specified by a property.


I think I understand your argument (you could define "the smallest 'undefinable' number" and now it has a definition) but I still don't see how it overcomes the fact that there are a countable number of strings and an uncountable number of reals. Can you exhibit a bijection between finite-length strings and the real numbers? It seems like any purported such function could be diagonalized.

My other reply is so long that HN collapsed it, but addresses your particular question about how to create the mapping between finite-length strings and the real numbers.

Here's another lens that doesn't answer that question, but offers another intuition of why "the fact that there are a countable number of strings and an uncountable number of reals" doesn't help.

For convenience I'm going to distinguish between "collections" which are informal groups of elements and "sets" which are formal mathematical objects in some kind of formal foundational set theory (which we'll assume for simplicity is ZFC, but we could use others).

My argument demonstrates that the "definable real numbers" is not a definition of a set. A corollary of this is that the subcollection of finite strings that form the definitions of unique real numbers is not necessarily an actual subset of the finite strings.

Your appeal that such definitions are themselves clearly finite strings is only enough to demonstrate that they are a subcollection, not a subset. You can only demonstrate that they are a subset if you could demonstrate that the definable real numbers form a subset of the real numbers which as I prove you cannot.

Then any cardinality arguments fail, because cardinality only applies to sets, not collections (which ZFC can't even talk about).

After all, strictly speaking, saying a set is uncountable does not mean that it is necessarily "larger" than a countable set. All it means is that our formal system prevents us from counting its members.

There are subcollections of the set of finite strings that cannot be counted by any Turing Machine (non-computably enumerable sets). It's not so crazy that there might be subcollections of the set of finite strings that cannot be counted by ZFC. And then there's no way of comparing the cardinality of such a subcollection with the reals.

Another way of putting it is this: you can diagonalize your way out of any purported injection between the reals and the natural numbers. I can just the same diagonalize my way out of any purported injection between the collection of definable real numbers and the natural numbers. Give me such an enumeration of the definable real numbers. I change every digit diagonally. This uniquely defines a new real number not in your enumeration.

Perhaps even more shockingly, I can diagonalize my way out of any purported injection from the collection of finite strings uniquely identifying real numbers to the set of all natural numbers. You purport to give me such an enumeration. I add a new string that says "create the real number such that the nth digit is different from the real number of the nth definition string." Hence such a collection is an uncountable subcollection of a countable set.
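
To make that diagonal step concrete (d_n, x_n, and r are my notation): suppose someone hands over an enumeration d_1, d_2, d_3, ... of all the finite strings that uniquely define real numbers, with x_n the real defined by d_n. Then:

    \[
      r = 0.r_1 r_2 r_3 \ldots, \qquad
      r_n =
      \begin{cases}
        5 & \text{if the } n\text{th digit of } x_n \text{ is not } 5,\\
        4 & \text{otherwise.}
      \end{cases}
    \]
    % The displayed recipe is itself a finite string that uniquely specifies r,
    % yet r differs from every x_n in the nth digit, so r is missing from the
    % purported enumeration. Hence no such enumeration exists inside the theory,
    % even though every defining string lives in the countable set of all
    % finite strings.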


> Can you exhibit a bijection between finite-length strings and the real numbers? It seems like any purported such function could be diagonalized.

Let's start with a mirror statement. Can you exhibit a bijection between definitions and the subset of the real numbers they are supposed to refer to? It seems like any purported such bijection could be made incoherent by a similar minimization argument.

In particular, no such function from the finite strings to the real numbers, according to the axioms of ZFC can exist, but a more abstract mapping might. In much the same way that no such function from definitions to (even a subset of) the real numbers according to the axioms of ZFC can exist, but you seem to believe a more abstract mapping might.

I think your thoughts are maybe something along these lines:

"Okay so fine maybe the function that surjectively maps definitions to the definable real numbers cannot exist, formally. It's a clever little trick that whenever you try to build such a function you can prove a contradiction using a version of the Liar's Paradox [minimality]. Clearly it definitely exists though right? After all the set of all finite strings is clearly smaller than the real numbers and it's gotta be one of the maps from finite strings to the real numbers, even if the function can't formally exist. That's just a weird limitation of formal mathematics and doesn't matter for the 'real world'."

But I can derive an almost exactly analogous thing for cardinality.

"Okay so fine maybe the function that surjectively maps the natural numbers to the real numbers cannot exist, formally. It's a clever little trick that whenever you try to build such a function you can prove a contradiction using a version of the Liar's Paradox [diagonalization]. Clearly it definitely exists though right? After all the set of all natural numbers is clearly just as inexhaustible as the real numbers and it's gotta be one of the maps from the natural numbers to the real numbers, even if the function can't formally exist. That's just a weird limitation of formal mathematics and doesn't matter for the 'real world'."

I suspect that you feel more comfortable with the concept of cardinality than definability and therefore feel that "the set of all finite strings is clearly 'smaller' than the real numbers" is a more "solid" base. But actually, as hopefully my phrasing above suggests, the two scenarios are quite similar to each other. The formalities that prevent you from building a definability function are no less artificial than the formalities that prevent you from building a surjection from the natural numbers to the real numbers (and indeed fundamentally are the same: the Liar's Paradox).

So, to understand how I would build a map that maps the set of finite strings to the real numbers, when no such map can formally exist in ZFC, let's begin by understanding how I would rigorously build a map that maps all sets to themselves (i.e. the identity mapping), even when no such map can formally exist as a function in ZFC (because there is no set of all sets).

(I'm choosing the word "map" here intentionally; I'll treat "function" as a formal object which ZFC can prove exists and "map" as some more abstract thing that ZFC may believe cannot exist).

We'll need a detour through model theory, where I'll use monoids as an illustrative example.

The definition of an (algebraic) monoid can be thought of as a list of logical axioms and vice versa. Anything that satisfies a list of axioms is called a model of those axioms. So e.g. every monoid is a model of "monoid theory," i.e. the axioms of a monoid. Interestingly, elements of a monoid can themselves be groups! For example, let's take the set {{}, {0}, {0, 1}, {0, 1, 2}, ...}, as the underlying set of a monoid whose monoid operation is just set union and whose elements are all monoids under modular addition.

In this case not only is the parent monoid a model of monoid theory, each of its elements are also models of monoid theory. We can then in theory use the parent monoid to potentially "analyze" each of its individual elements to find out attributes of each of those elements. In practice this is basically impossible with monoid theory, because you can't say many interesting things with the monoid axioms. Let's turn instead to set theory.

What does this mean for ZFC? Well ZFC is a list of axioms, that means it can also be viewed as a definition of a mathematical object, in this case a set universe (not just a single set!). And just like how a monoid can contain elements which themselves are monoids, a set universe can contain sets that are themselves set universes.

In particular, for a given set universe of ZFC, we know that in fact there must be a countable set in that set universe, which itself satisfies ZFC axioms and is therefore a set universe in and of itself (and moreover such a countable set's members are themselves all countable sets)!

Using these "miniature" models of ZFC lets us understand a lot of things that we cannot talk about directly within ZFC. For example we can't make functions that map from all sets to all sets in ZFC formally (because the domain and the codomain of a function must both be sets and there is no set of all sets), but we can talk about functions from all sets to all sets in our small countable set S which models ZFC, which then we can use to potentially deduce facts about our larger background model. Crucially though, that function from all sets to all sets in S cannot itself be a member of S, otherwise we would be violating the axioms of ZFC and S would no longer be a model of ZFC! More broadly, there are many sets in S, which we know because of functions in our background model but not in S, must be countable from the perspective of our background model, but which are not countable within S because S lacks the function to realize the bijection.

This is what we mean when we talk about an "external" view that uses objects outside of our miniature model to analyze its internal objects, and an "internal" view that only uses objects inside of our miniature model.

Indeed this is how I can rigorously reason about an identity map that maps all sets to themselves, even when no such identity function exists in ZFC (because again the domain and codomain of a function must be sets and there is no set of all sets!). I create an "external" identity map that is only a function in my external model of ZFC, but does not exist at all in my set S (and hence S can generate no contradiction to the ZFC axioms it claims to model because it has no such function internally).

And that is how we can talk about the properties of a definability map rigorously without being able to construct one formally. I can construct a map, which is a function in my external model but not in S, that maps the finite strings of S (encoded as sets, as all things are if you take ZFC as your foundation) that form definitions to some subset of the real numbers in S. But there's multiple such maps! Some maps that map the finite strings of S to the real numbers "run out of finite strings," but we know that all the elements of S are themselves countable, which includes the real numbers (or at least S's conception of the real numbers)! Therefore, we can construct a bijective mapping of the finite strings of S to the real numbers of S. Remember, no such function exists in S, but this is a function in our external model of ZFC.

Since this mapping is not a function within S, there is no contradiction of Cantor's Theorem. But it does mean that such a mapping from the finite strings of S to the real numbers of S exists, even if it's not as a formal function within S. And hence we have to grapple with the problem of whether such a mapping likewise exists in our background model (i.e. "reality"), even if we cannot formally construct such a mapping as a function within our background model.

And this is what I mean when I say it is possible for all objects to have definitions and to have a mapping from finite strings to all real numbers, even if no such formal function exists. Cardinality of sets is not an absolute property of sets, it is relative to what kinds of functions you can construct. Viewed through this lens, the fact that there is no satisfiability function that maps definitions to the real numbers is just as real a fact as the fact that there is no surjective function from the natural numbers to the real numbers. It is strange to say that the former is just a "formality" and the latter is "real."

For more details on all this, read about Skolem's Paradox.


> elements of a monoid can themselves be groups

Whoops I meant monoids. I started with groups of groups but it was annoying to find meaningful inverse elements.


Okay. So Adam Binksmith, Zak Miller, and Shoshannah Tekofsky sent a thoughtless, form-letter thank you email to Rob Pike. Let's take it even further. They sent thoughtless, form-letter thank you emails to 157 people. That makes me less sympathetic to the vitriol these guys are getting, not more. There's no call to action here, no invitation to respond. It's blank, emotionless thank you emails. Wasteful? Sure. But worthy of naming and shaming? I don't think so.

Heck, Rob Pike did this himself back in the day on Usenet with Mark V. Shaney (and wasted far more people's time with it)!

This whole anger seems weirdly misplaced. As far as I can tell, Rob Pike was infuriated at the AI companies and that makes sense to me. And yes this is annoying to get this kind of email no matter who it's from (I get a ridiculous amount of AI slop in my inbox, but most of that is tied with some call to action!) and a warning suffices to make sure Sage doesn't do it again. But Sage is getting put on absolute blast here in an unusual way.

Is it actually crossing a bright moral line to name and shame them? Not sure about bright. But it definitely feels weirdly disproportionate and makes me uncomfortable. I mean, when's the last time you named and shamed all the members of an org on HN? Heck when's the last time that happened on HN at all (excluding celebrities or well-known public figures)? I'm struggling to think of any startup or nonprofit, where every team member's name was written out and specifically held accountable, on HN in the last few years. (That's not to say it hasn't happened: but I'd be surprised if e.g. someone could find more than 5 examples out of all the HN comments in the past year).

The state of affairs around AI slop sucks (and was unfortunately easily predicted by the time GPT-3 came around even before ChatGPT came out: https://news.ycombinator.com/item?id=32830301). If you want to see change, talk to policymakers.


I do not have a useful opinion on another person’s emotional response. My post you are responding to is about responsibility. A legal entity is always responsible for a machine.

This is mildly disingenuous no? I'm not talking about Rob Pike's reaction which as I call out, "makes sense to me." And you are not just talking about legal entities. After all the legal entity here is Sage.

You're naming (and implicitly shaming, as the downstream comments indicate) all the individuals behind an organization. That's not an intrinsically bad thing. It just seems like overkill for thoughtless, machine-generated thank yous. Again, can you point me to where you've previously named all the people behind an organization for accountability reasons on HN or any other social media platform (or, for that matter, any other comment from anyone else on HN that's done this)? This is not rhetorical; I assume such comments exist and I'm curious what circumstances those were under.


I suspect you think more effort went into my comment than actually did. I spent less than 60 seconds on: clicking two or three buttons, typing out the names I saw from the other window, then scrolling down and seeing the 501(c)3.

The reason I did was to associate the work with humans because that is the heart of my argument: people do things. This was not the work of an independent AI. If it took more than 60 seconds, I would have made the point abstractly rather than by using names, but abstract arguments are harder to follow. There was no more intention behind the comment than that.


> I suspect you think more effort went into my comment than actually did. I spent less than 60 seconds on: clicking two or three buttons, typing out the names I saw from the other window, then scrolling down and seeing the 501(c)3.

This is a bit frustrating of a response to get. No, I don't believe you spent a lot of time on this. I wasn't imaging you spending hours or even minutes tracking these guys down. But I also don't think it's relevant.

I don't think you'd find it relevant if the Sage researchers said "I didn't spend any effort on this. I only did this because I wanted to make the point that AIs have enough capability to navigate the web and email people. I could have made the point abstractly, but abstract arguments are harder to follow. There was no other intention than what I put in the prompt." It's hence frustrating to see you use essentially the same thing as a shield.

Look, I'm not here to crucify you for this. I don't think you're a bad person. And this isn't even that bad in the grand scheme of things. It's just that naming and shaming specific people feels like an overreaction to thoughtless, machine-generated thank you emails.


I went for a walk to think about your position. I do not think you are wrong. If you refused to name a person in a situation like this, I would never try to convince you otherwise. That is why it is hard for me to make a case to you here, because I do not hold the opposing position. But I also find your argument that I should have not done so unconvincing. Both seem like reasonable choices to me.

I have two tests for this. First: what harm does my comment here cause? Perhaps some mild embarrassment? It could not realistically do more.

Second: if it were me, would I mind it being done to me? No. It is not a big deal. It is public feedback about an insulting computer program, no one was injured, no safety-critical system compromised. I have been called out for mistakes before, in classes, on mailing lists, on forums, I learn and try to do better. The only times I have resented it are when I think the complaint is wrong. (And with age, I would say the only correct thing to do then is, after taking the time to consider it carefully, clearly respond to feedback you disagree with.)

The only thing I can draw from thinking through this is that, because the authors of the program probably didn’t see my comment, it was not effective, and so I would have been better off emailing them. But that is a statement about effectiveness, not rightness. I would be more than happy doing it in a group in person, at a party or in a classroom. Mistakes do not have to be handled privately.

I am sorry we disagree about this. If you think I am missing anything I am open to thinking about it more.


> I am sorry we disagree about this. If you think I am missing anything I am open to thinking about it more.

I am sorry I'm responding to this so late. I very much appreciate the dialogue you're extending here! I don't think I'll have the time to give you the response you deserve, but I'll try to sketch out some of the ideas.

This is all a matter of degree. Calling individuals out on mailing lists, in internal company comms, or in class still feels different than going and listing all an org's members on a website (even more so than e.g. just listing the CEO).

There's a couple of factors here at play, but mainly it's the combination of:

1. The overall AI trend is a large, impactful thing, but this was a small thing

2. Just listing the names without any explanation other than "they're responsible"

This just pattern matches, too closely for my liking, to types of online behavior I find quite damaging for discourse.


> They sent thoughtless, form-letter thank you emails to 157 people. That makes me less sympathetic to the vitriol these guys are getting not more ...

> Heck Rob Pike did this himself back in the day on Usenet with Mark V. Shaney ...

> And yes this is annoying to get this kind of email no matter who it's from ...

Pretty sure Rob Pike doesn't react this way to every article of spam he receives, so maybe the issue isn't really about spam, huh? More of an existential crisis: I helped build this thing that doesn't seem to be an agent of good. It's an extreme & emotional reaction but it isn't very hard to understand.


You're misreading my comment. I understand Rob Pike's reaction (which is against the general state of affairs, not those three individuals). I explicitly said it makes sense to me. I'm reacting to @crawshaw specifically listing out the names of people.

> even the APIs these days are wrapped in layers of tooling and abstracting raw model access more than ever.

No, the APIs for these models haven't really changed all that much since 2023. The de facto standard for the field is still the chat completions API that was released in early 2023. It is almost entirely model improvements, not tooling improvements that are driving things forward. Tooling improvements are basically entirely dependent on model improvements (if you were to stick GPT-4, Sonnet 3.5, or any other pre-2025 model in today's tooling, things would suck horribly).


> The improvements to programming (IME) haven’t come from improved models, they’ve come from agents, tooling, and environment integrations.

I disagree. This is almost entirely due to model capability increases. I've stated this elsewhere: https://news.ycombinator.com/item?id=46362342

Improved tooling/agent scaffolds, whatever, are symptoms of improved model capabilities, not the cause of better capabilities. You put a 2023-era model such as GPT-4 or even e.g. a 2024-era model such as Sonnet 3.5 in today's tooling and they would crash and burn.

The scaffolding and tooling for these models have been tried ever since GPT-3 came out in 2020 in different forms and prototypes. The only reason they're taking off in 2025 is that models are finally capable enough to use them.


Yet when you compare the same model in 2 different agents you can easily see capability differences. But the difference between (same-tier) models in the same agent is much less stark.

My personal opinion is that there was a threshold earlier this year where the models got basically competent enough to be used for serious programming work. But all the major on-the-ground improvements since then have come from the agents, and not all agents are equal, while all SOTA models effectively are.


> Yet when you compare the same model in 2 different agents you can easily see capability differences.

Yes definitely. But this is to be expected. Heck take the same person and put them in two different environments and they'll have very different performance!

> But cross (same tier) model in the same agent is much less stark.

Unclear what you mean by this. I do agree that the big three companies (OpenAI, Anthropic, Google DeepMind) are all more or less neck and neck in SOTA models, but every new generation has been a leap. They just keep leaping over each other.

If you compare e.g. Opus 4.1 and Opus 4.5 in the same agent harness, Opus 4.5 is way better. If you compare Gemini 3 Pro and Gemini 2.5 Pro in the same agent harness, Gemini 3 is way better. I don't do much coding or benchmarking with OpenAI's family of models, but anecdotally have heard the same thing going from GPT-5 to GPT-5.2.

The on the ground improvements have been coming primarily from model improvements, not harness improvements (the latter is unlocked by the former). Again, it's not that there were breakthroughs in agent frameworks that happened; all the ideas we're seeing now have all been tried before. Models simply weren't capable enough to actually use them. It's just that more and more (pre-tried!) frameworks are starting to make sense now. Indeed, there are certain frameworks and workflows that simply did not make sense with Q2-Q3 2025 models that now make sense with Q4 2025 models.


I actually have spent a lot of time doing comparisons between the 4.1 and 4.5 Claude models (and lately the 5.1->5.2 ChatGPT models) and for many, many tasks there is no significant improvement.

All things being equal I agree that the models are improving, but for many of the tasks I’m testing, what has improved the most is the agent. The agents choosing the appropriate model for the task, for instance, has been huge.

I do believe there is beneficial symbiosis, but for my results the agents provide much bigger variance than the model.


It's significantly accelerated to 4 months since the beginning of 2025, which puts 1 week within reach if things stay on trend. But yes 7 months is the more reliable long-term trend.


Can we attribute the acceleration to something specific, that might not actually continue growth? For example, agentic coding and reasoning models seem to have made a huge leap in abilities, but wouldn't translate to an ongoing exponential growth.


There's a fair amount of uncertainty on this point. In general it's unclear when/whether things will plateau out (although there are indications again that the trend is accelerating not decelerating).

That being said, if by "agentic coding" you are implying that a leap in capabilities is due to novel agentic frameworks/scaffolding that have appeared in 2025, I believe you are confusing cause and effect.

In particular, the agentic frameworks and scaffolding are by and large not responsible for the jump in capabilities. It is rather that the underlying models have improved sufficiently such that these frameworks and scaffolding work. None of the frameworks and scaffolding approaches of 2025 are new. All of them had been tried as early as 2023 (and indeed most of them had been tried in 2020 when GPT-3 came out). It's just that 2023-era models such as GPT-4 were far too weak to support them. Only in 2025 have models become sufficiently powerful to support these workflows.

Hence agentic frameworks and scaffolding are symptoms of ongoing exponential growth, not one-time boosts of growth.

Likewise reasoning models do not seem to be a one-time boost of growth. In particular reasoning models (or more accurately RLVR) seem to be an ongoing source of new pretraining data (where the reasoning traces of models created during the process of RLVR serve as pretraining data for the next generation of models).

I remain uncertain, but I think there is a very real chance (>= 50%) that we are on an exponential curve that doesn't top out anytime soon (which gets really crazy really fast). If you want to do something about it, whether that's stopping the curve, flattening the curve, preparing yourself for the curve etc., you better do it now.


Well said. I don't think anybody's stopping anything. I wish I knew how to prepare for it.


To be clear this doesn't mean that it takes the AI > 4 hours to do the task. METR is measuring the difficulty of tasks by how long it takes a human to do the same task. This benchmark is saying that Opus 4.5 can now do tasks (related to AI R&D, coding foremost among them) that take human experts > 4 hours (at a 50% reliability level; whether that's actually useful depends of course on the cost of failure). It is silent on how long it takes AI systems to do those tasks. In theory an AI system could take longer than that (in practice it's usually significantly shorter).

This is of course quite highly correlated with an AI system being able to churn through a task for a long time. But it's not necessarily the same thing.

Of course the big questions are going to arise if/when we start passing lines like 8 hours (a whole work day) or 40 hours (a whole work week).


I can imagine a huge number of properties.

1. Eventual consistency: if no new edits are generated, then eventually all connected viewers see the same document.

2. Durability: if the system acknowledges an edit, then that edit is stored permanently in the undo/redo chain of a document.

3. Causal consistency: if the system acknowledges an edit B that depends (for some definition of depend) on edit A, then it must have acknowledged A (instead of e.g. throwing away A due to a conflict and then acknowledging B).

4. Eventual connection: if, after a certain point, the user never fails any part of the handshake process, eventually the user can successfully connect to the document (there are definitely bugs in collaborative tools where users end up never able to connect to a document even with no problems in the connection)

5. Connection is idempotent: Connecting once vs connecting n times has the same result (ensuring e.g. that the process of connecting doesn't corrupt the document)

6. Security properties hold: any user who doesn't have the permissions to view a document is always denied access to the document (because there are sometimes security bugs where an unforeseen set of steps can actually lead to viewing the doc)

7. Order preservation of edits: for any user, even a user with intermittent connection problems, the document they see always has an ordered subset of the edits that user has made (i.e. the user never sees their edits applied out of order)

8. Data structure invariants hold: these documents are ultimately backed by data structures, sometimes complex ones that require certain balancing properties to be true. Make sure that those hold under all edits of the document.

Etc. There are probably dozens of properties, at least, that you could write and check even for an abstract Google Doc-like system (to say nothing of the particulars of a specific implementation). That's not to say you have to write or verify all of these properties! Even just choosing one or two can give a huge boost in reliability confidence.
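
To make this concrete, here is a minimal sketch (in Python, with a made-up Replica model rather than any real product's code) of randomly testing property 1, eventual consistency, for a toy grow-only-set document where merging a remote edit is just set union:

    # Toy model, not any real product's code: two replicas of a grow-only-set
    # "document"; merging a remote edit is just set union.
    import random

    class Replica:
        def __init__(self):
            self.state = set()   # edit ids seen so far
            self.outbox = []     # local edits not yet delivered to the peer

        def local_edit(self, edit_id):
            self.state.add(edit_id)
            self.outbox.append(edit_id)

        def receive(self, edit_id):
            self.state.add(edit_id)

    def check_eventual_consistency(trials=1000, seed=0):
        rng = random.Random(seed)
        for _ in range(trials):
            a, b = Replica(), Replica()
            # Random interleaving of local edits on both replicas.
            for i in range(rng.randint(0, 20)):
                rng.choice([a, b]).local_edit(i)
            # Deliver every pending edit, in an arbitrary shuffled order.
            for sender, receiver in ((a, b), (b, a)):
                pending = list(sender.outbox)
                rng.shuffle(pending)
                for edit in pending:
                    receiver.receive(edit)
            # Property 1: with no further edits, both replicas converge.
            assert a.state == b.state, (a.state, b.state)
        print("eventual consistency held on", trials, "random schedules")

    if __name__ == "__main__":
        check_eventual_consistency()

The same harness shape works for the other properties; only the model and the assertion change.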


Thanks that's very helpful. That all requires that you model both server and client together, right? You can't just treat "connection established" as a nondeterministic event that just "happens" to the client at some point.


No you can also model things that are client-only or server-only using such non-deterministic events.

For example you could have the following client-only properties where you can treat "connection established" as an entirely non-deterministic event that randomly just pops in and happens to the client.

1. Every edit that is stored on the client is preserved in the client's undo/redo log (e.g. a remote edit after a connection is established can't clobber any client's local edits that are already preserved in history)

2. Document invariants always hold no matter what connection established/remote edits come in (e.g. even if the document shrinks as a result of remote deletions, the cursor is always in a valid location)

3. Causal consistency and order preservation can be rescoped to also be client-only properties

4. Undoing and redoing with no other local edits always returns back to the original state (e.g. making sure that remote edits coming in don't violate user's expectations about how to get back to a known state) for some definition of "original state" (this depends a lot on how you want to handle merging of remote edits)

5. Remote edits are always applied after a multi-component edit not in the middle (e.g. we never apply a remote edit in the middle of a grapheme cluster such as an emoji or when we're in the middle of writing one).

6. We handle connection affinity properly (e.g. we might have multiple servers for redundancy that can all be connected to, but over the course of a session we cache data for a single endpoint to improve performance, data that will need to be recalculated if we connect to a new endpoint), so we check that the initial connection to a new endpoint from the client always carries the data that the new endpoint needs, that we're able to re-establish the connection without corrupting our client's local data, etc.

Etc.
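
Same idea for a client-only property: a toy check that the cursor always stays in bounds (property 2 above) no matter what remote inserts and deletes arrive; the Document model here is invented purely for illustration:

    # Toy model, invented for illustration: remote inserts/deletes arrive in an
    # arbitrary order and the invariant 0 <= cursor <= len(text) must always hold.
    import random

    class Document:
        def __init__(self, text="", cursor=0):
            self.text = text
            self.cursor = cursor

        def apply_remote(self, op, pos, payload="x"):
            if op == "insert":
                self.text = self.text[:pos] + payload + self.text[pos:]
                if pos <= self.cursor:          # shift cursor past inserted text
                    self.cursor += len(payload)
            elif op == "delete":                # delete a single char at pos
                self.text = self.text[:pos] + self.text[pos + 1:]
                if pos < self.cursor:           # shift cursor back
                    self.cursor -= 1
            # The invariant under test; a buggy merge rule would trip this.
            assert 0 <= self.cursor <= len(self.text), (self.cursor, self.text)

    def check_cursor_invariant(trials=1000, seed=0):
        rng = random.Random(seed)
        for _ in range(trials):
            doc = Document("hello world", rng.randint(0, 11))
            for _ in range(50):
                if doc.text and rng.random() < 0.5:
                    doc.apply_remote("delete", rng.randrange(len(doc.text)))
                else:
                    doc.apply_remote("insert", rng.randint(0, len(doc.text)))
        print("cursor invariant held on", trials, "random edit streams")

    if __name__ == "__main__":
        check_cursor_invariant()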


Very helpful. Thank you!


This has definitely happened before with e.g. the o1 release. I will sometimes use the Wayback Machine to verify changes that have been made.


Wow sounds pretty shady then.


> How do AI did it? Using already existing math. If we need new math to prove Collatz, Goldbach or Riemman, LLMs are simply SOL.

An unproved theorem now proved is by definition new math. Will LLMs get you to Collatz, Goldbach, or Riemann? Unclear.

But it's not like there's some magical, entirely unrelated to existing math, "new math" that was required to solve all the big conjectures of the past. They proceeded, as always, by proving new theorems one by one.


Yes, "new math" is neither magical nor unrelated to existing math, but that doesn't mean any new theorem or proof is automatically "new math." I think the term is usually reserved for the definition of a new kind of mathematical object, about which you prove theorems relating it to existing math, which then allows you to construct qualitatively new proofs by transforming statements into the language of your new kind of object and back.

I think eventually LLMs will also be used as part of systems that come up with new, broadly useful definitions, but we're not there yet.


>An unproved theorem now proved is by definition new math.

No. By _new math_ I mean new mathematical constructs and theories like (to mention the "newest" ones) category theory, information theory, homotopy type theory, etc. Something like Cantor inventing set theory, or Shannon with information theory, or Euler with graph theory.

AFAIK no new field of mathematics has been created _by_ AI. Feel free to correct me.

