> But there is also evidence that AI is actually getting things done, right?
Is there? I haven't seen a single AI success story that rang true, that wasn't coming from someone with a massive financial interest in it being successful. A few people subjectively think AI is making them more productive, but there's no real objective validation of that; they're not producing super-impressive products, and there was that recent study that had senior developers thinking they were being more productive using AI when in fact it was the opposite.
You seem to be setting a high bar (AI success stories don't ring true), while taking the study as fact. This feels like a cognitive bias.
I believe you are talking about the study "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity". It is an interesting data point, but it's far from conclusive. It studied 16 developers working on large (1MLOC+) codebases, and current AI tooling struggles with large codebases (newer tools like Brokk are attempting to improve performance there). The authors acknowledge that participants dropped hard AI-disallowed issues, "reducing the average AI-disallowed difficulty". And some of the selected developers seem to have been inexperienced at AI use.
Smaller tools and codebases and unfamiliar code are sweet spots for the AI tools. When I need a tool to help me do my job, the AIs can take a couple-sentence description and turn it into working code, often on the first go. Monitoring plugins, automation tools, programs of a few thousand lines of code, writing tests: these are all things the AIs are REALLY good at. Also: asking questions about the code.
A few examples: Last night I had Claude Code implement a change to the helix editor so that if you go back into a file you previously edited, it takes you back to the spot you were at in your last editing session. I don't know the helix codebase or Rust at all. Claude was able to implement this in the background while I was working on another task and then watching TV in the evening. A few weeks ago I used Claude Code to fix 600+ linting errors in 20-year-old code, in an evening while watching TV; those would easily have taken a day to fix manually. A few months ago Claude built me an "rsync but for block devices" program; I did this one as a comparison of writing it manually vs vibe coding it with Claude, and Claude had significant advantages.
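(For the curious, the core idea behind that last one is roughly the following; this is just a hypothetical sketch of the concept, not the program Claude actually wrote: walk both devices in fixed-size blocks, hash each pair, and rewrite only the blocks that differ.)

    # Hypothetical sketch of the "rsync but for block devices" idea,
    # assuming same-sized source and destination devices.
    import hashlib

    BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB blocks

    def sync_block_devices(src_path, dst_path):
        with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
            offset = 0
            while True:
                src_block = src.read(BLOCK_SIZE)
                if not src_block:
                    break
                dst_block = dst.read(len(src_block))
                # Only write blocks that actually changed.
                if hashlib.sha256(src_block).digest() != hashlib.sha256(dst_block).digest():
                    dst.seek(offset)
                    dst.write(src_block)
                offset += len(src_block)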
But, I'm guessing these will fall into the "does not ring true" category, probably also "no real objective validation". But to me, personally, there is absolutely evidence that AI is actually getting things done.
> You seem to be setting a high bar (AI success stories don't ring true), while taking the study as fact. This feels like a cognitive bias.
I think it's interesting that you jump to that. I consider a study, even a small one, to be better evidence than subjective anecdotes; isn't that the normal position that one should take on any issue? I'm not taking that study as gospel, but I think it's grounds to be even more skeptical of anecdotal evaluations than normal.
> Some of the selected developers seem to have been inexperienced at AI use.
This seems to be a constant no-true-Scotsman argument from AI advocates. AI didn't work in a given setting? Clearly the people trying it were inexperienced, or the AI they were testing was an old one that doesn't reflect the current state of the art, or they didn't use this advocate's super-awesome prompt that solves all the problems. I never hear these objections before someone tries to evaluate AI, only after they've done it and got a bad result.
> But, I'm guessing these will fall into the "does not ring true" category, probably also "no real objective validation".
Well, yes. Duh. When the best available evidence shows little objective effectiveness from AI, and suggests that people who use AI are biased to think it's more effective than it actually is, I'm going to go with that, unless and until better evidence comes along.
> When the best available evidence shows little objective effectiveness from AI, and suggests that people who use AI are biased to think it's more effective than it actually is, I'm going to go with that, unless and until better evidence comes along.
Well, you're in luck: a ton of better evidence from much larger empirical studies has been available for a while now! Somehow those studies just didn't get the same amount of airtime around here. You can find a few of them linked here: https://news.ycombinator.com/item?id=45379452
But if you want to verify that's a representative sample, do a simple Google Scholar search and just read the abstracts of any random sample of the results.
> I consider a study, even a small one, to be better evidence than subjective anecdotes
The thing is, we're coming at this from very different places. The GenAI tooling is allowing me to do things that I otherwise wouldn't have time to do, which objectively to me is a clear win. So, I'm going to look at a study like that and pick it apart, because it doesn't match my objective observations. You are coming from a different angle.
> The GenAI tooling is allowing me to do things that I otherwise wouldn't have time to do, which objectively to me is a clear win. So, I'm going to look at a study like that and pick it apart, because it doesn't match my objective observations.
Ahh, we've reached the point in the discussion where you're arguing semantics...
"With a basis in observable facts". I am observing that I am getting things done with GenAI that I wouldn't be able to otherwise, due to lack of time.
While you were typing your message above, Claude was modifying a 100KLOC software project in a language I'm unfamiliar with to add a feature that'll make the software have one less rough edge for me. At the same time, I was doing a release of our software for work.
Feels pretty objective from my perspective. Yes, I realize from your perspective it is subjective.
If you're in the habit of using words to mean the opposite of what most people usually use them to mean, it's not surprising that you find yourself in semantic arguments often.
No, subjective experience is not reliable; its unreliability is the whole reason humanity invented the scientific method, to have a more dependable way of ascertaining truth.