Hacker News — veunes's comments

If just 16 million examples were enough to significantly boost model quality (as Anthropic claims), then data quality really does beat quantity.

Instead of vacuuming petabytes of trash from Common Crawl, you can just take a high-quality distillate from a SOTA model and get comparable results. Bad news for anyone betting solely on massive compute clusters and closed datasets.


He had the full source code of a working Linux driver that does exactly the same thing, just in a neighboring kernel dialect. The task was to translate, not invent. Sure, it's still impressive (given the difference in kernel APIs), but it's not the same as writing a driver from scratch using only a PDF datasheet. Now, when an AI takes an undocumented Chinese chip and writes a driver by sniffing the bus with a logic analyzer - then I'll call it "reasoning"

To be fair if you open up driver source code from the vendors themselves, it's often the same hell with magic numbers and lack of checks because "we know what the hardware will return". But you're right on the main point: AI writes C like a very confident junior who skipped memory safety lectures - it copies the style, but not the discipline. It works as long as you're on the "happy path", but debugging a kernel panic in code like that is going to be painful

I was personally surprised when the agent debugged kernel panics caused by its own code (many times by now). It just iterates from the stack traces and crash dumps. The nice part is that, when you do see that the code smells — you ask the agent to rework it, focusing on specific problems. This is just code, and you don't need to dance around, hoping that AI will spill some "magic" at you.

> The nice part is that, when you do see that the code smells — you ask the agent to rework it, focusing on specific problems.

I think that is the crux of the problem. How do you recognize a code smell if you didn't write the code and don't read it? I'm pretty confident even the SPDX header isn't correct.


Above it was said that in a code review, an expert would ask the author to "justify and rework". Clearly, people have always been capable of producing code that wasn't great, regardless of whether they read it or not.

I wouldn't call this "clean-room". The models were trained on all available open source, including that exact original Linux driver. Splitting sessions saves you from direct copy-paste in the current context window, but the weights themselves remember the internal code structure perfectly well. Lawyers still have to rack their brains over this, but for now, it looks more like license laundering through the neural net's latent space than true reverse engineering

Spot on about keeping that AGENTS.md and logging all decisions. Letting an agent code for a long stretch without pinning down the state is a surefire way to end up with a Frankenstein codebase. Forcing it to document why it ditched LinuxKPI and went native basically saved the project. It's kinda ironic that AI is making us enforce strict project documentation - the exact thing human devs never have time for.

Active account with real interactions = more normal. Which is a pretty telling product story in itself

The part I really agree with is the social impact

Once a group gets big enough...

Your mum's experience is probably what FB is best at: high-trust network, lots of original photos, lots of comments from real friends

The depressing part is that generative slop is a perfect match for the incentive structure: infinite supply, tuned to trigger comments, and you don't need real creators. If your product metric is time-on-site, this is what you get
