Hacker News | simonw's comments

I've seen one-shot used to mean two different things in LLMs:

1. Getting an LLM to do something based on a single example

2. Getting an LLM to achieve a goal from a single prompt with no follow-ups

I think both are equally valid.


One-shot as in ‘given one example’ is the ML term. One-shot as in ‘in a single prompt’ is the colloquial meaning. Both are useful, but it can be confusing when discussing LLMs in ML topics.
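To make the distinction concrete, here is a minimal sketch of the two meanings. The function names and the "Input:/Output:" layout are made up for illustration and are not tied to any particular model or API.

```python
def build_one_shot_prompt(example_input: str, example_output: str, query: str) -> str:
    """ML sense: the prompt carries exactly one worked example before the real query."""
    return (
        f"Input: {example_input}\n"
        f"Output: {example_output}\n\n"
        f"Input: {query}\n"
        "Output:"
    )


def build_single_prompt(task: str) -> str:
    """Colloquial sense: one prompt, no examples required, and no follow-up turns."""
    return task


if __name__ == "__main__":
    # One worked example, then the real question.
    print(build_one_shot_prompt("cheese shop", "CHEESE SHOP", "dead parrot"))
    # A single request that is expected to succeed without follow-ups.
    print(build_single_prompt("Write a CLI tool that uppercases its input."))
```

In the first case "one-shot" counts examples in the prompt; in the second it counts conversation turns.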

This made me smile given Python's love of Monty Python references - the cheese shop etc.

I appreciated them at the time I encountered them (mid-2000s), but they were definitely a bit cringe in their frequency and shamelessness. I wonder if younger people even know Monty Python anymore - by my time, I think people had mostly forgotten about Hitchhiker’s Guide to the Galaxy, even if 42 survived.

As a foreigner, I didn't know Monty Python when I started learning the language and reading the docs, and I didn't notice any of those references. I guess they came across as just noise.

The kids these days have factored 42 to 6,7 (said with some inflection and hand waving)

Did you come up with that? If so, bravo!

6-7? No, my kid says it about a thousand times a day. Then, for some unknown reason, they follow it with 41! WTF! I've shouted 42! many times and have tried to inform the child of the significant cultural and scientific importance of 42. Which, IIRC, factors to 2,3,7.

Dude it is not cringe. It is silly.

Pretending to be an all-serious grown-ups' language is cringe.


I agree, but don’t forget that the average programmer nowadays is a strait-laced corporate entity whose personality is Node.js stickers on a MacBook, like everybody else on their team.

They forget that Perl and co. were written by people that had one too many tabs of LSD in the 70s, sporting long hair and a ponytail.


I’m going to go out on a limb and guess that Larry Wall, a devout evangelical Christian and the child of a pastor, was not turning on, tuning in, or dropping out in the 1970s.

I definitely want documentation that a project expert has reviewed. I've found LLMs are fantastic at writing documentation about how something works, but they have a nasty tendency to take guesses at WHY - you'll get occasional sentences like "This improves the efficiency of the system".

I don't want invented rationales for changes, I want to know the actual reason a developer decided that the code should work that way.


Exactly. Often this information is not actually present in the code itself which is exactly why I would want documentation in the first place, given that I can always read the code myself if needed.

Yeah, I think it can. I'm reminded of the thing in the 80s when Compaq reverse engineered and reimplemented the IBM BIOS by having one team decompile it and write a spec which they handed to a separate team who built a new implementation based on the spec.

I expect that for games the more important piece will be the art assets - like how the Quake game engine was open source but you still needed to buy a copy of the game in order to use the textures.


All of the interesting LLMs can handle a full paper these days without any trouble at all. I don't think it's worth spending much time optimizing for that use-case any more - that was much more important two years ago when most models topped out at 4,000 or 8,000 tokens.

For anyone else who was initially confused by this, useful context is that Snowboard Kids 2 is an N64 game.

I also wasn't familiar with this terminology:

> You hand it a function; it tries to match it, and you move on.

In decompilation "matching" means you found a function block in the machine code, wrote some C, then confirmed that the C produces the exact same binary machine code once it is compiled.

The author's previous post explains this all in a bunch more detail: https://blog.chrislewis.au/using-coding-agents-to-decompile-...
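As a rough illustration of what a "matching" check amounts to, here is a minimal sketch that compares the bytes of the original function against the bytes produced by compiling a candidate C rewrite. The helper names and sample bytes are hypothetical; a real decompilation project would extract both byte strings with its own build tooling, which is not shown here.

```python
from typing import Optional


def first_mismatch(target: bytes, candidate: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the two are identical."""
    for offset, (a, b) in enumerate(zip(target, candidate)):
        if a != b:
            return offset
    if len(target) != len(candidate):
        # One is a strict prefix of the other; they diverge where the shorter one ends.
        return min(len(target), len(candidate))
    return None


def is_match(target: bytes, candidate: bytes) -> bool:
    """A function 'matches' only when the compiled output is byte-for-byte identical."""
    return first_mismatch(target, candidate) is None


if __name__ == "__main__":
    # Placeholder bytes standing in for machine code; not taken from any real game.
    original = bytes.fromhex("03e00008 00000000")
    attempt = bytes.fromhex("03e00008 00000001")
    print(is_match(original, original))
    print(first_mismatch(original, attempt))
```

Anything short of an exact byte match counts as a non-match, which is what makes the result easy for an agent to verify automatically.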


I'd like to see this given a bit more structure, honestly. What occurs to me is constraining the grammar for LLM inference to ensure valid C89 (or close to it, as much as can be checked without compilation), then perhaps experimentally switching to a permuter once/if a certain threshold is reached for accuracy of the decompiled function.

Eventually some or many of these attempts would, of course, fail, and require programmer intervention, but I suspect we might be surprised how far it could go.


helpful

It "launched" at the WIRED event last night. https://events.wired.com/big-interview-2025

The parent non-profit organization Hack Club isn't run by teenagers. https://hackclub.com/team/

[flagged]


There is a vouching system for comments that are flagged.

Click the date on the post, and if you have a button saying "vouch", click that.


I was surprised at how poorly GPT-5 did in comparison to Opus 4.1 and Gemini 2.5 on a pretty simple OCR task a few months ago - I should run that again against the latest models and see how they do. https://simonwillison.net/2025/Aug/29/the-perils-of-vibe-cod...

Agreed, GPT-5 and even 5.1 are noticeably bad at OCR. OCRArena backs this up: https://www.ocrarena.ai/leaderboard (I personally would rank 5.1 as even worse than it is there).

According to the calculator on the pricing page (it's inside a toggle at the bottom of the FAQs), GPT-5 is resizing images to have a minor dimension of at most 768: https://openai.com/api/pricing/ That's ~half the resolution I would normally use for OCR, so if that's happening even via the API then I guess it makes sense it performs so poorly.
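To get a feel for what a cap of 768 on the minor dimension means for a scanned page, here is a small sketch of the arithmetic. It assumes the aspect ratio is preserved, that smaller images are never upscaled, and that results are rounded to the nearest pixel; those details are assumptions for illustration, not a description of OpenAI's actual pipeline.

```python
from typing import Tuple


def resize_to_minor_dimension(width: int, height: int, max_minor: int = 768) -> Tuple[int, int]:
    """Scale (width, height) so the shorter side is at most max_minor, keeping aspect ratio."""
    minor = min(width, height)
    if minor <= max_minor:
        # Assumption: images already within the limit are left untouched.
        return width, height
    return round(width * max_minor / minor), round(height * max_minor / minor)


if __name__ == "__main__":
    # A 1536 x 2048 page scan loses half its resolution along each axis.
    print(resize_to_minor_dimension(1536, 2048))
```

Halving each dimension leaves a quarter of the original pixels, which can make small text much harder to read.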


And GPT-4 was pretty decent at OCR, so that's weird?

In case the article author sees this, the "HTML transcription" link is broken - it goes to https://aistudio-preprod.corp.google.com/prompts/1GUEWbLIlpX... which is a Google-employee-only URL.

Love how employee portals for many companies essentially never get updated design wise over the decades, lol. That page styling and the balls certainly take me back.

Literally decades: the login page looked like that when I joined Google in 2007.

Except for the updated Google logo.

I used to work for a company where the SSO screen had a nice corporate "happy people at the office" type of image. 25 MB. I was in Brazil on a crappy roaming 2G service and couldn't log in at all. I know most of the work happens on desktop but geee.....

Oh, speaking of mobile, I remember when I tried to use Jira mobile web to move a few tickets up in priority by dragging and dropping and ended up closing the sprint. That stuff was horrible.


Wow yeah. Flashbacks to when Gmail Invites were cool! Google too.

hey, it's Rohan (the author of the article) - appreciate you catching this, we just fixed this!

You should try using AI to check such things :)

I’m a little surprised how open the help links are… I guess that if you need help logging in you can’t be expected to, well, log in.

Same with "See prompt in Google AI Studio" which links to an unpublished prompt in AI Studio.
