
In my DJing years I've learned that it's better to provide a hot signal and trim the volume down than to try to amplify it later, because you end up amplifying noise. Max out the mixer volume and put a compressor further down the chain, with a limiter after it to protect the speaker setup (it will sound awful when hit, but it won't damage your setup, and it will flag clueless bozos loud and clear). Don't try to raise the level after it leaves the origin.
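The chain I mean, as a toy sketch in Python (float samples in [-1, 1]; the threshold/ratio/ceiling numbers are made up, not any particular mixer's DSP):

    import numpy as np

    def compress(x, threshold=0.5, ratio=4.0):
        # Static compressor: above the threshold, the extra level is divided by `ratio`.
        y = x.copy()
        over = np.abs(y) > threshold
        y[over] = np.sign(y[over]) * (threshold + (np.abs(y[over]) - threshold) / ratio)
        return y

    def limit(x, ceiling=0.95):
        # Hard limiter: clips anything above the ceiling. Sounds awful when hit,
        # but it protects the speakers.
        return np.clip(x, -ceiling, ceiling)

    # Hot signal in, gain managed downstream -- instead of boosting a quiet,
    # noisy signal later and raising the noise floor along with it.
    signal = np.random.uniform(-1.0, 1.0, 48000)  # stand-in for 1s of audio at 48 kHz
    out = limit(compress(signal))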

It seems to me that adding noise to the process and trying to cut it out later is a self-defeating proposition. Or as Deming put it (paraphrasing): you can't QC quality into a bad process.

I can see how it seems better to "move fast and break things" but I will live and die by the opposite: "move slow and fix things". There's much, much more to life than maximizing short-term returns over a one-dimensional, naïve utilitarian take on value.


I really don't get the idea that LLMs somehow create value. They are burning value. We only get useful work out of them because they consume past work. They are wasteful and only useful in a very contrived context. They don't turn electricity and prompts into work, they turn electricity, prompts AND past work into lesser work.

How can anyone intellectually honest not see that? It's the same as burning fossil fuels: great and all, except we're just burning past biomass and dangerously skewing the atmosphere's contents in the process.


> How can anyone intellectually honest not see that?

The idea that they can only solve problems that they've seen before in their training data is one of those things that seems obviously true, but doesn't hold up once you consistently use them to solve new problems over time.

If you won't accept my anecdotal stories about this, consider the fact that both Gemini and OpenAI got gold medal level performance in two extremely well regarded academic competitions this year: the International Math Olympiad (IMO) and the International Collegiate Programming Contest (ICPC).

This is notable because both of those contests have brand new challenges created for them that have never been published before. They cannot be in the training data already!


> consider the fact that both Gemini and OpenAI got gold medal level performance

Yet ChatGPT 5 imagines API functions that are not there and cannot figure out basic solutions even when pointed to the original source code of libraries on GitHub.


Which is why you run it in a coding agent loop using something like Codex CLI - then it doesn't matter if it imagines a non-existent function because it will correct itself when it tries to run the code.

Can you expand on "cannot figure out basic solutions even when pointed to the original source code of libraries on GitHub"? I have it do that all the time and it works really well for me (at least with modern "reasoning" models like GPT-5 and Claude 4.)
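The loop is roughly this shape. A minimal sketch, where `llm` stands in for a hypothetical model call; this is not how Codex CLI is actually implemented, just the general idea:

    import os, subprocess, tempfile

    def agent_loop(llm, task, max_iters=5):
        # `llm` is a hypothetical callable: prompt in, Python source out.
        prompt = task
        for _ in range(max_iters):
            code = llm(prompt)
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(code)
                path = f.name
            result = subprocess.run(["python", path], capture_output=True, text=True)
            os.unlink(path)
            if result.returncode == 0:
                return code, result.stdout
            # A hallucinated function fails here (ImportError, AttributeError, ...);
            # the traceback goes into the next prompt so the model can correct itself.
            prompt = f"{task}\n\nYour last attempt failed with:\n{result.stderr}\nFix it."
        raise RuntimeError("no working solution within the iteration budget")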


As a human, I sometimes write code that does not compile first try. This does not mean that I am stupid, only that I can make mistakes. And the development process has guardrails against me making mistakes, namely, running the compiler.


Agreed

Infallibility is an unrealistic bar to measure LLMs against.


Yes. I don't see why these have to be mutually exclusive.


I feel they are mutually inclusive! I don’t think you can meaningfully create new things if you must always be 100% factually correct, because you might not know what correct is for the new thing.


> If you won't accept my anecdotal stories about this, consider the fact that both Gemini and OpenAI got gold medal level performance in two extremely well regarded academic competitions this year: the International Math Olympiad (IMO) and the International Collegiate Programming Contest (ICPC).

it's not a fair comparison

the competitions for humans are a display of ingenuity and intelligence because of the limited resources available to them

meanwhile for the "AI", all it demonstrates is that if you have a dozen billion-dollar data centres and a couple of hundred gigawatt-hours that you can dedicate to brute-forcing a solution, then you can maybe match the level of one 18-year-old, on a problem with a specific, well-known solution

(to be fair, a smart 18-year-old)

and short of Moore's law lasting another 30 years, you won't be getting this from the dogshit LLMs on shatgpt.com


Google already released the Gemini 2.5 Deep Think model they used in ICPC as part of their $250/month "Ultra" plan.

The trend with all of these models is for the price for the same capabilities to drop rapidly - GPT-3 three years ago was over 1,000x the price of much better models today.

I'm not yet ready to bet against that trend holding for a while longer.


> GPT-3 three years ago was over 1,000x the price of much better models today.

right, so only another 27 years of Moore's law continuing left

> I'm not yet ready to bet against that trend holding for a while longer.

I wouldn't expect an industry evangelist to say otherwise


I'm a pretty bad "industry evangelist" considering I won't shut up about how prompt injection hasn't had any meaningful improvements in the last three years and I doubt that a robust solution is coming any time soon.

I expect this industry might prefer an "evangelist" who hasn't written 126 posts about that: https://simonwillison.net/tags/prompt-injection/

(And another 221 posts about ethical concerns with how this stuff works: https://simonwillison.net/tags/ai-ethics/)


you would be a lot more credible if you were honest about being an evangelist


Credibility is genuinely one of the things I care most about. What can I do to be more honest here?

(Also what do you mean here by an "evangelist"? Do you mean someone who is an unpaid fan of some of the products, or are you implying a financial relationship?)


I know this is something you care about, and I'm not your parent commenter, but something I've often observed in conversations about technology on here, especially around AI, is that if you say good things about something, you are an "evangelist." It's really that straightforward, and it doesn't change even if you also say negative things sometimes.


In that case yeah, I'm an LLM "evangelist" (not so much other forms of generative AI - I play with image/video generation occasionally but I don't spend time telling people that they're genuinely worthwhile tools to learn). I'm also a Python evangelist, a SQLite evangelist, a vanilla JavaScript evangelist, etc etc etc.


yes, enough "concern" to provide plausible deniability


"they output strings that didn't exit before" is some hardcore, uncut cope


It's not about being honest. It's about Joe Bullshit from the Bullshit Department having it easier in his/her/their Bullshit Job. Because you see, Joe decided two decades ago to be an "office worker", to avoid the horrors of working honestly with your hands or mind in a real job, like electrician, plumber or surgeon. So his day consists of preparing powerpoints, putting together various Excel sheets, attending whatever bullshit meetings, etc. Chances are you've met a lot of Joe Bullshits in your career; you may have even reported to some of them.

Now imagine the exhilaration Joe feels when he touches these magic tools. Joe does not really care about his job or about his company. But suddenly Joe can reduce his pain and suffering in a boring-to-death job while keeping those sweet paychecks. Of course Joe doesn't believe his bosses only need him until the magic machine is properly trained, at which point he can be replaced and reduced to an Eloi, living off the UBI. Joe Bullshit is selfish. In the 1930s he blindly followed a maniacal dictator because the dictator gave him a sense of security (if you were in the majority population) and a job.

There are unfortunately a lot of Joe Bullshits in this world. Not all of them work with Excel. Some of them became self-made "developers" in the last 10 years. I don't mean the honest folks who were interested in technology but never had the means to go to a university. I mean all those ghouls who switched careers after they learnt there was money to be made in IT, and money was their main motivation. They don't really care about the meaning of it all, the beautiful abstractions your mind wanders through as you create entire universes in code. So they are happy to offload it too, because it's just another bullshit job for Joe Bullshit. And since Joe Bullshit is in the majority, you, my friend, with your noble thoughts, are unfortunately preaching to the wind.


Jeez. Brutal but true.


Your house is in flames? Don't worry, that energy will flow elsewhere.


