
That’s not accurate. The Turing test was always intended as a benchmark for general intelligence. Turing’s 1950 paper explicitly proposed it as a way to operationalize the question “Can machines think?” not as a parlor trick about conversation but as a proxy for indistinguishability in intellectual behavior. The whole point of the imitation game was to sidestep metaphysical arguments and reduce intelligence to functional equivalence. If a machine could consistently hold its own in unrestricted dialogue, it would demonstrate the breadth, adaptability, and contextual understanding that characterize general intelligence.

The term AGI may have come later, but the concept it represents traces directly back to Turing's framing. When early AI researchers talked about "strong AI" or "thinking machines," they were drawing on the same conceptual lineage. The introduction of the acronym doesn't rewrite that history; it just gave a modern label to an old idea. The Turing test was never meant to detect a "negative" but to give a concrete, falsifiable threshold for when positive claims of general intelligence might be justified.

As for Cleverbot, it never truly passed the test in any rigorous or statistically sound sense. Those 2011 headlines were based on short exchanges with untrained judges and no control group. Passing a genuine Turing test requires sustained coherence, reasoning across domains, and the ability to handle novel input gracefully. Cleverbot couldn't do any of that. It failed the spirit of the test even if it briefly fooled a few people by the letter of it.

By contrast, modern large language models can pass the Turing test with flying colors. They can maintain long, open-ended conversations, reason about complex subjects, translate, summarize, and solve problems across many domains. Most human judges would be unable to tell them apart from people in text conversation, not for a few sentences but for hours. Granted, one can often tell ChatGPT is an AI because of its long and overly descriptive replies, but that's a stylistic artifact, not a limitation of intelligence. The remarkable thing is that you can simply instruct it to imitate casual human conversation, and it will do so convincingly, adjusting tone, rhythm, and vocabulary on command. In other words, the test can be passed both intentionally and effortlessly. The Turing test was never obsolete; we finally built systems that can truly meet it.



I can definitely see the case for that. Ultimately, we're going to need vocabulary for all of the following:

* >=GPT-3.5-level intelligence

* AI that replaces an ordinary human for knowledge work

* AI that replaces an ordinary human for all work (given sufficiently capable hardware)

* AI that replaces any human for knowledge work

* AI that replaces any human for all work (given sufficiently capable hardware)

It doesn't really matter to me which of those we call "AGI" as long as we're consistent. One of them may be AGI, but all of them are important milestones.



