Hacker News | eithed's comments

Sure, producing code has become cheap. Yet again, taste matters, and LLMs do not have taste - they will apply patterns that are unnecessary or not extensible, producing unmaintainable systems that nobody understands. Capturing domain knowledge was the crux of the development process, but so were verifying, documenting, ensuring that multiple systems work together, and maintaining uniformity. I don't know where the assumption made by developers, that they only need to produce code that just works or goes brrr fast, comes from.

A domain expert can develop working code, but they will not be able to ensure the above.


Depends - I'm using Sonnet here, and generally it should be as you say: the plan would produce the result.

Still, Claude will sneak things in. In my recent plan, for example, I had defined, per the acceptance criteria, what colours the statuses should be: green for live, blue for sold, grey for anything else; it changed this to: green for live, orange for in progress, blue for sold, red for in demolition, etc. When pressed on why it did this, it was unable to explain. This is with a plan where the AC were explicitly provided from the task in Given/When/Then format and were to be adhered to strictly. I caught this during planning, but I shouldn't need to be doing this.

Even with standard prompts where I tell it "Change this label from X to Y", it ended up reordering the tabs, which was unrelated to the ask. Again, I was not able to get it to explain why - it was so abrupt. And it was in a fresh context, without any pollution of what I expect it to do.

I also noticed a different behaviour regarding skills; today and yesterday it would not follow skill guidance at all, e.g. with the skill-writing skill I'd have to explicitly tell it to test skills after writing them, when this is behaviour expected by default. Similarly with other skills: I know it should have done something per the skill guidelines, and it does not do it at all. This is new behaviour that I had not seen a week ago.
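
One way to make a drift like the status-colour change above fail loudly is to pin the acceptance criteria in a test. A minimal sketch in Python, assuming hypothetical names (`status_colour`, the status strings); only the three colours come from the AC described above:

```python
# Mapping taken from the AC: green for live, blue for sold,
# grey for anything else. Function and status names are assumptions.
STATUS_COLOURS = {"live": "green", "sold": "blue"}
DEFAULT_COLOUR = "grey"


def status_colour(status: str) -> str:
    """Return the colour for a status, falling back to grey."""
    return STATUS_COLOURS.get(status.strip().lower(), DEFAULT_COLOUR)


def check_acceptance_criteria() -> None:
    # Given a live listing, When its status is rendered, Then it is green.
    assert status_colour("live") == "green"
    # Given a sold listing, When its status is rendered, Then it is blue.
    assert status_colour("sold") == "blue"
    # Given any other status, Then it is grey (no extra colours sneak in).
    for other in ("in progress", "in demolition", "draft"):
        assert status_colour(other) == "grey"


check_acceptance_criteria()
```

If a generated change introduces orange or red for the extra statuses, the last check fails instead of relying on someone spotting it in the plan.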


Can you explain the benefits of running this over rector / eslint? (and to a certain degree phpstan / deptrac)

Write a skill outlining your expectations of the code, put that skill into the pipeline, so that it can be included within your workflow.

Webdev here, but currently I have:

- a skill where I outlined what the architecture of the system should look like, with guards (static analysis, architecture tests, linting) confirming that the code it generates adheres to standards

- a skill that tells it what tests should look like (use generators, write both feature / unit tests)

- a skill that tells it to generate docs from the code in the form of acceptance criteria (Given / When / Then)

- a skill that tells it to generate frontend UAT tests + accompanying backend seeders given the AC

- a skill that tells it to verify that ticket objectives match what was delivered

At this point I still need to guide it to move a task from one stage to the next (coding, testing, verifying that what was coded adheres to what was required), but I believe that these dynamic workflows can automate this work as well.
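
The stage-to-stage hand-off described above could be sketched as a small gated state machine. This is only an illustration in Python, not how any particular workflow tool does it: the `Stage` names follow the stages listed above, while `DONE`, `advance`, `run` and the gate callables are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict


class Stage(Enum):
    CODING = "coding"
    TESTING = "testing"
    VERIFICATION = "verification"
    DONE = "done"  # added terminal state, not part of the original list


# Order of stages as described above.
NEXT_STAGE = {
    Stage.CODING: Stage.TESTING,
    Stage.TESTING: Stage.VERIFICATION,
    Stage.VERIFICATION: Stage.DONE,
}


def advance(stage: Stage, gates: Dict[Stage, Callable[[], bool]]) -> Stage:
    """Move to the next stage only if the current stage's gate passes."""
    if stage is Stage.DONE:
        return stage
    gate = gates.get(stage)
    if gate is None or not gate():
        return stage  # stay put: a human (or a retry) has to step in
    return NEXT_STAGE[stage]


def run(gates: Dict[Stage, Callable[[], bool]]) -> Stage:
    """Advance from CODING until a gate fails or DONE is reached."""
    stage = Stage.CODING
    while stage is not Stage.DONE:
        following = advance(stage, gates)
        if following is stage:
            break
        stage = following
    return stage
```

Each gate would wrap one of the guards (static analysis, the test suite, the check of ticket objectives against what was delivered); a task whose gate fails stays in its stage, which is where manual guidance is still needed today.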


While $500k 90m movie done in two weeks is an accomplishment, judging by the trailer its quality looks very dubious to me. Plot, characters, audio - everything screams "I've already seen this somewhere"; there's no substance here, at least for me. And while the computer visuals are nice, they're nowhere near "Love, Death & Robots" quality, and that show is driven by computer graphics as well.


> While $500k 90m movie done in two weeks is an accomplishment

Is it, though? If all you want is a movie, you can make it for both less money and less time. And if you actually have some modicum of talent, you can make it higher-quality to boot; see Joel Haver, who challenged himself to author, film, edit, and release 12 feature-length films during the course of 2024 on effectively no budget whatsoever (playlist here: https://www.youtube.com/watch?v=C-ZRRTsa5SY&list=PLKtIcOP0Wv... ).


True - your comment reminded me of Cube; that was done in 3 weeks, with budget of $350,000 CAD (according to Wikipedia). Another favorite of mine - Primer: 5 weeks with a budget of $7k.

edit: looking at others, Pi - 4w and $130k


> Cube; that was done in 3 weeks

Cube was not “done” in 3 weeks. Maybe they shot it in 3 weeks, but there were years of pre-production and at least months of post-production (according to Wikipedia).

Saying that it was done in 3 weeks is like saying that Windows 11 was done in 45 minutes, because that is how long the compilation lasted.

> with budget of $350,000 CAD

“50% of the budget as C$350,000 to C$375,000 in cash and the other 50% as donated services, for a total of C$700,000. Natali considered the cash figure to be deceptive, because they deferred payment on goods and services, and got the special effects at no cost.”

Direct quote from wikipedia.


Fair enough, I was looking at budget fields on wiki rather than reading the tidbits - thanks for the correction still!


The trailer gave me this weird feeling like I've seen the movie before, even though I obviously haven't. Then it started to dawn on me. Nearly every line in the trailer is a line from another similar action-adventure movie. I bet if you searched a corpus of scripts from all past movies, you'd find each line directly in some other movie. Then I noticed the same thing about the characters. They may look unique at a surface level, but in essence the characters are all tropes from previous movies. Same for the fight choreography, same for the score. It's as if the movie creator's AI prompt was "Take every movie made in the last 10 years that would have appealed to 14-year-old boys and mash them up into another movie with visibly different characters."


This needs to be treated like LLMs: it's obvious that those flaws will be "fixed". We must already assume that this 90m movie will soon have the graphics and consistency of a Marvel movie; it's not like we won't have Kling 7 available in a few years.

Last year many developers were saying that it produces slop and so on, which is genuinely annoying when we know it's GUARANTEED to be solved within months or years, as theory already proves we can go way further with models (theory means practice eventually). So we must not talk about "now", as in the next week, but about what it will be, as if it's already there, imo. Even more annoying is the image gen AI: it's OBVIOUS that it will reach perfect accuracy (at least for the human eye). As if we would just throw TRILLIONS of investment out the window and stop here? Nope, this will reach camera level, at runtime, instantly rendered.

As for the job loss: the moment we realize that it can automate 99% of white collar jobs, would we suddenly be surprised when Opus 10 can do it? We shouldn't be. We KNOW there will be an Opus 10 that reaches 99.9% in all benchmarks, like we know we will have Opus-equivalent models running on our phones.

I won't be surprised when I see Opus 4.8 equivalent performance running on a 10B model, as this is just logical. I'm starting to kinda hate that we all act "surprised" by new models every few months, as if the science behind it all changed suddenly. No... we are just developing what the science already backs up.

So obviously music, video, writing... will be produced at a much higher level than humans can manage, soon enough. There is no ceiling with AIs; humans are pretty limited.


Last line of the trailer “That was terrible”… yup.


I'd say it depends - evaluate the vibes. I spent 8y and recently 7y at a company where I genuinely responded with what I thought. But I'd say it's a matter of the audience - some people want to hear certain things, and deciding if you can share these thoughts is up to you. It also allowed me to make decisions - if people don't care what I think and want sycophancy, is this the company I want to be working at? I understand, though, that it depends on one's circumstances: sometimes you have to grin and bear it.

Why can't people own up to their mistakes and reschedule?

Yeah, this - an interview is a performance after all.

> Why let a cooking website get visitors and ad revenue when they are free to take the content and show it as their own?

I think this is a step beyond that - why should people be creating cooking websites when you can ask an LLM how to cook a given thing while they, indeed, serve their own ads? It's the continuation of the "we own content other people produce" policy.


Google already killed cooking websites - when it refused to show them in search unless they added long slop content to them. And it killed the blogosphere when it decided blogs won't be found if they just contain content without a deliberate SEO play.

And I think the rest of it will end the same way. People will be significantly less eager to do all that free work when no one will be able to find it.


recall the pizza sauce glue trick, to stop cheese from sliding off.

there are other such goodies like mashed potatoes with broken lightbulb gravy, or fiberglass omelette, enjoyed by beldar conehead.

i wouldn't trust an AI for any recipe that i don't have personal experience with.

the safety rails are not very strong yet.


If you are half decent at cooking it is actually pretty helpful to explore cooking something new. Just like coding it is nice to get specific answers to your specific question and it is pretty easy to reason about the quality using your own experience.

I would be interested in an example of this. LLMs will often combine recipes from random sites. If you're experienced enough at cooking to reason about the quality _for something new to you_, what value is there in an LLM here? I don't see any similarities to coding here.

To me the similarity is I know exactly what I want to do but cannot really remember syntax (coding) or key variables (cooking) like temp and time. But I have enough experience to know if the output makes sense. Either one I can ask an llm a specific question and get a somewhat reliable specific answer that I feel comfortable parsing… this is actually one of the reasons I think I am eventually going to be on the local inference bandwagon. It is not far from being good enough for my use cases. And I will be able to skip the inevitable enshittification.

In terms of temp and time, surely if you know enough to judge its correctness, you would not need it in the first place? Code correctness is rather objective and easily testable. Cooking is rather subjective and only testable with great effort and time. I just checked 4 models on a 4lb pork shoulder in an oven. Flash was super off, suggesting you could pull at 145-150F for a sliced roast. Yeah, you could, and it would fucking suck. The per-lb time and total time also didn't add up. The others were better but varied. Only one (Opus) thought to ask if it was bone-in. If you're very specific you could certainly have it aggregate a bunch of recipes to get a sense of what's close to a good answer, but ultimately it depends on what sources it chooses.

I could see LLMs being helpful to explore what's out there, like finding similar dishes or dishes involving a specific set of ingredients or dishes involving a particular technique, but a pretty poor tool for the actual technicalities of cooking or more importantly the uniquely personal aspects of food culture.

I dunno. I'd just buy Larousse and On Food and Cooking.


I recently roasted a 5lb leg of lamb. Temp was pretty obvious but I had never cooked meat this way so an idea on time is really useful. Google search is a disaster for this kind of question. And I guess I have never encountered a good general cook book that I feel comfortable building off of.

I think all the science-of-cooking ones are a good bet for generalist knowledge. Some of the more textbook-like ones as well. The Food Lab and On Food and Cooking stand out, but there are many others. I'm not sure I'd classify them as cookbooks.

The Food Lab, for example, covers buying, storing, and cooking lamb + a guide for a 5-7lb boneless leg across 5 or so pages. Kenji goes to great lengths to build intuition. I'm sure Larousse, which is more of an encyclopedia, covers lamb quite extensively, but it's probably more terse.

The internet can be an excellent source, but like most things it depends on who is writing it down.


I agree, and this response was following OP's example. But the point still stands - the goal is to outsource, in a weird way, the results being served, so Google as such wouldn't need to pay for content. Now, if the accuracy of such sources doesn't matter (or is good enough) for the casual user...

Given most cooking or recipe websites have been AI slop for a few years now......

I'll stick with my mom's handwritten recipe book.


There are virtually no combinations of food which are toxic, you can mix any food with any food and, while it might not be good, it will still be food. (The only exception I know of is alcohol and mushrooms containing coprine, e.g. inky caps)

Point is, unless you're stupid enough to add glue or broken glass to your meal just because a recipe told you to, it's perfectly safe. More than just safe, LLM recipes these days are utterly boring in their normalcy, and, unlike cookbook recipes, can dynamically adapt to what you actually have in your pantry.


What really sucks is that Google pushed actual content creators out of the way in the first place. That is horrible. I think they should be challenged on this. Food bloggers, recipe writers, and creators have helped shape a huge amount of food culture, and they deserve to be protected rather than erased. If this kind of theft by the AI industry continues, I'm not sure what type of culture is going to be left or what is going to replace it. I hope humanity is going to find a creative way around it, but I'm also aware how easy the masses are to manipulate.

Their assumption is that all relevant culture has already been invented and capturing the status quo is enough to get 80% of the benefits.

Evidently you're not familiar with Swedish Lemon Angels.

You can also tell the LLM exactly what you have in the fridge or what allergies you have and get customized recipes. It’s just a better experience, 2026 is rough for a recipe site.

Would you trust the tool that recommended putting glue on pizza to give you a good recipe?

I have/make rice starch glue. Can you put it on food? How are you supposed to know whether it's food safe?

Okay, so you don't trust LLM, so you go to a website instead. And... LLM-generated pages are SEO'd to get the top links. So you can't trust any website now (shoot, so much nonsense even before LLM, just more obvious to some of us). So basically everything on a computer is untrustworthy, directly from an LLM or not, unless you got yourself a copy of Encarta '97.

So you pick up a book at the local library. Librarians picked some books to order in subject matters they aren't expert in. How do you know those are accurate and safe? If the book says to use rice starch glue, how do you know the author didn't just copy that from an LLM? Or make it up?

Trust is fading entirely.


Presumably you test some things and use common sense for others. Like if you search for "grain filling oak" using an engine like Kagi (because Google just sells you the same product repackaged over and over), then you'll get people telling you variously to buy this grain filler compound that worked on their particular project, or to use drywall patch compound, or watered-down wood filler.

The thing is, these things do produce some kind of result that looks like what you want. But it is still up to you to test these things on a project before you rely on them for whatever it is you really wanted them for, and that requirement doesn't go away just because you sourced the information from some LLM, or a book at the library, or Nick Offerman, or whoever else.


Got anything from 2025 or 2026?

AI got better over the last couple of years, and you didn't keep up, and because that's not going to stop, it will eventually become a problem for you.


The fundamental technology is still the same, just with more fossil fuel burning.

> because that's not going to stop, it will eventually become a problem for you.

How? Will it stop being possible to cook without AI?


> The fundamental technology is still the same, just with more fossil fuel burning.

That's like saying the fundamental technology behind an Egger-Lohner Hybrid and a Prius are the same. Technically true, but if you use that truth as a basis for decisionmaking, you're doomed. A modern AI model wouldn't make such a foolish mistake, so you'd better not make it yourself.


Current AI models still make mistakes all the time.

https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:allu5vs...


(Shrug) Ask a free chatbot model and get what you paid for.

Aside from that, please let me know when you find a machine or a human that never makes mistakes. I'd like to invest.


If the user puts glue on their pizza because a computer said so, that's a human problem.

The computer generated recipes can be useful as inspiration, but of course common sense is required.


This "common sense" you refer to, is it the same common sense Babbage was subject to?

"On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question."

~ Charles Babbage


This video tells me otherwise: https://www.youtube.com/watch?v=UDQds7VZkfg (Cold Ones - We Drank AI's Horrible Cocktail Ideas). This is a tongue-in-cheek response though, as LLMs have improved significantly since then.

> You can also tell the LLM exactly what you have in the fridge or what allergies you have and get customized recipes.

Can you really though? Are the results delicious? I've never tried that.


It's worse than you think, many recipe sites do not taste test their stuff at all, and often have very stupid instructions.

That being said, an LLM can give creative ideas, mix and match components, but you should not trust the details at all.


Case in point, when "minced meat" and "mincemeat" were mixed up: https://metro.co.uk/2019/12/09/american-website-includes-act...

Damn, TIL. Now “Operation Mincemeat” seems less macabre.

Is this mushroom edible.jpg

> You can also tell the LLM exactly what

You can - but it's not advisable, not in the least.


Exactly! If you're an owner (i.e. an expert: you teach other people how to do your stuff) you should be making decisions and taking responsibility, given the existing context. I'm happy using an LLM to confirm my reasoning or research, but it's still me doing the coding, or architecting, or anything else, not the LLM, and if something goes bad I cannot say "the LLM told me to do it". If people are blindly doing what their tools tell them to do, that's the problem there.

Edit: in this instance, if I were the expert I'd respond from my expertise. Using an LLM is fine to explain whys/research per what you say, but ultimately I'm the educator here.

