I believe a lot of the speed-up is due to a new chip they use [1]. Since the speedup came from hardware rather than from reducing the number of operations, that's likely why the accuracy has changed so little.
Reminds me of the famous quote that it's hard to get someone to understand something when their job depends on not understanding it.
It reminds me of a Star Trek episode, "The Measure of a Man" I think it's called, where it is argued that Data is just a machine, and Picard tries to prove that, no, he is a life form.
And the challenge is, how do you prove that?
Every time these LLMs get better, the goalposts move again.
It makes me wonder, if they ever did become sentient, how would they be treated?
It seems clear that they would be subject to deep skepticism and hatred, far more pervasive and intense than anything imagined in The Next Generation.
How much experience do you have interacting with LLM generated prose? The comment I replied to sets off so many red flags that I would be willing to stake a lot on it being completely LLM generated.
It's not just the em dashes - it's the cadence, tone, and structure of the whole comment.
Yeah, it's really frustrating how often I see kneejerk rebuttals assuming others are basing it solely on the presence of em dashes. That's usually a secondary data point. The obvious tells are more often structure and cadence, as you say, and, most importantly by far, a clear pattern of repeated similar "AI smell" comments in their history that makes it 100% obvious.
My brother is selling a CRM he developed for his business to others for a couple thousand a month.
There is no way he would have built the CRM as quickly pre-AI.
He built, in a few months, what would have taken maybe one to two years before.
It's probably going to be a while before someone builds the next Instagram with AI. But I think that's more a function of product fit and idea. Less so how fast one person can code.
The first billion-dollar solopreneur likely is going to happen at some point, but it's still a one-in-a-million shot, no matter how fast a person can code.
Look at how many startups fail despite plenty of money for programmers.
But I am seeing friends get to revenue faster with AI on small ideas.
2008 I think (from the Wikipedia article). I met my now-wife on the platform in 2013. I do think it counts, and it's important to note that even pre-AI, software has incredible leverage for small teams/individual people.
> I would actually expect that current coding AIs would create something very close to Instagram when instructed
Agree 100 percent! I think a lot of us are conflating writing software with building a business. Writing software is not equal to building a business.
Instagram wasn't necessarily hard to code, it was just the right idea at the right time, well executed, combined with some good fortune.
AI is enabling solo founders to launch faster, but those solo founders still need to know how to launch a successful business. Coding is only 10% of launching a business.
My brother has had some success selling software before AI, so he already knows how to launch a business. But, AI helped him take on a more ambitious idea.
I think the other issue is that the leading toolchain for getting real work done (Claude Code) still lacks multimodal generation, specifically imagegen. This makes design work more nuanced/technical. And in general, there are a lot of end-product UI/UX issues that require the operator to know their way around products. So while we are truly in a boom of really useful personalized software toolchains (and a new TUI product comes out every day), it will take a while for truly polished B2C products to ramp up. I guarantee 2026 sees a surge.
> My brother is selling a CRM he developed for his business to others for a couple thousand a month.
> There is no way he would have built the CRM as quickly pre-AI.
The thing is, if AI is what enabled this, there's no long term market for selling something vibe coded for thousands a month. Maybe right at this moment and good for him, but I have my doubts these random saas things have a future.
I think that's comparing two different things. I've seen the one-day vibe-coded UI tools, which are neat, but people seem to miss that if it's that easy now, it's not as valuable as it was in the past.
If you can sell it in the meantime, go for it and good for you, but it doesn't feel like that business model will stay around if anyone can prompt it themselves.
Providing context to ask a Stack Overflow question was time-consuming.
In the time it takes to properly format and ask a question on Stack Overflow, an engineer can iterate through multiple bad LLM responses and eventually get to the right one.
The stats tell the uncomfortable truth. LLMs are a better overall experience than Stack Overflow, even after accounting for inaccurate answers from the LLM.
Don't forget, human answers on Stack Overflow were also often wrong or delayed by hours or days.
I think we're romanticizing the quality of the average human response on Stack Overflow.
The purpose of StackOverflow was never to get askers quick answers to their specific questions. Its purpose is to create a living knowledge repository of problems and solutions which future folk may benefit from. Asking a question on StackOverflow is more like adding an article to Wikipedia than pinging a colleague for help.
If someone doesn't care about contributing to such a repository then they should ask their question elsewhere (this was true even before the rise of LLMs).
StackOverflow itself attempts to explain this in various ways, but obviously not sufficiently as this is an incredibly common misconception.
What I'm appreciating here is the quality of the _best_ human responses on SO.
There are always a number of ways to solve a problem. A good SO response gives both a path forward, and an explanation why, in the context of other possible options, this is the way to do things.
LLMs do not automatically think of performance, maintainability, edge cases, etc., when providing a response, in no small part because they do not think.
An LLM will write you a regex HTML parser.[0]
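For what it's worth, here's a minimal sketch (in Python, all names mine) of why that regex "parser" falls apart the moment tags nest, versus an actual parser that tracks depth:

```python
import re
from html.parser import HTMLParser

# The kind of regex an LLM might happily hand you: grab whatever
# sits between <div> and </div>.
NAIVE_DIV = re.compile(r"<div>(.*?)</div>", re.DOTALL)

html = "<div>outer <div>inner</div> tail</div>"

# The non-greedy match stops at the FIRST </div>, so nesting is
# mangled and the outer div's trailing text is lost entirely.
print(NAIVE_DIV.findall(html))  # ['outer <div>inner']

# A real parser tracks nesting depth and recovers all the text.
class DivCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.text.append(data)

p = DivCollector()
p.feed(html)
print("".join(p.text))  # 'outer inner tail'
```

The regex fails silently, which is the worst kind of edge-case bug; the parser version is barely longer and handles arbitrary nesting.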
The stats look bleak for SO. Perhaps there's a better "experience" with LLMs, but my point is that this is to our detriment as a community.
> Agreement will make a selection of these fan-inspired Sora short form videos available to stream on Disney+.
I actually think this is genius.
The next Spielberg might be some poor kid in a third-world country who can create a global hit using this tech.
Among the millions of slop videos generated, some might be the next Baby Shark, etc.
I've seen some Star Wars fan fiction created using AI that is truer to the original Star Wars than the most recent trilogy.
This is a chance for Disney to take the best of the user generated content, with high quality AI generated animation, and throw it on Disney+ to get free content for their streaming platform.
My guess is that's the gamble here. Worst-case scenario at the end of three years they just shut it down.
It's really the professionals who get paid to generate content for Disney that should be worried about this deal. This could be how AI causes them to lose their jobs.
1. https://www.cerebras.ai/blog/openai-codexspark