
> I'm sure news outlets and popular social media accounts will use appropriate caution in reporting this, and nobody will misunderstand it.

You mean the dude who writes articles on TechCrunch and Ars Technica based on HN and Reddit thread titles because he doesn't understand what real journalism is? Sure, we can count on him :)


Yesterday someone on here was yapping about how AI is enough to replace senior software engineers, how they can just "vibe code their way" over a weekend into a full-fledged product, and how, somehow, the "gatekeeping" of software development was finally removed. I think of that person reading these answers and wonder whether they've changed their opinion now :)

Does this mean we're back in favor of using weird riddles to decide programming skills now? Do we owe Google an apology for the inverse binary tree incident?

I don't know what inverse binary tree incident you're referring to, but the fundamental premise here is that LLMs can't really think logically like humans do, and they are a long way from replacing humans in software, let alone senior software engineers.

Not riddles but "requirements" :)

What does this nonsensical question that some LLMs get wrong some of the time, and that some don't get wrong ever, have to do with anything? This isn't a "gotcha" even though you want it to be. It's just mildly amusing.

Because this fundamental premise demonstrates that LLMs can't really think logically like we do, and they are far from replacing actual humans, let alone senior software engineers.

No, those people refuse to let evidence get in the way.

Humans aren't immune to getting questions like this wrong either, so I don't think it changes much in terms of the ability of AI to replace jobs.

I've seen senior software engineers get tricked with the 'if YES spells yes, what does EYES spell?', or 'Say silk three times, what do cows drink?', or 'What do you put in a toaster?'.

Even if not a trick - lots of people get the 'bat and a ball cost £1.10 in total. The bat costs £1 more than the ball. How much does the ball cost?' question wrong, or '5 machines take 5 minutes to make 5 widgets. How long do 100 machines take to make 100 widgets?' etc. There are obviously more complex variants of all these that have even lower success rates for humans.

In addition, being PhD-level in maths as a human doesn't make you immune to the 'toaster/toast' question (assuming you haven't heard it before).

So if we assume humans are generally intelligent and can be a senior software engineer, getting this sort of question confidently wrong isn't incompatible with being a competent senior software engineer.


humans without credentials are bad at basic algebra in a word problem, ergo the large language model must be substantially equivalent to a human without a credential

thanks but no thanks

i am often glad my field of endeavour does not require special professional credentials but the advent of "vibe coding" and, just, generally, unethical behavior industry-wide, makes me wonder whether it wouldn't be better to have professional education and licensing


Let's not forget that Einstein almost got a (reasonably simple) trick question wrong:

https://fs.blog/einstein-wertheimer-car-problem/

And that many mathematicians got the Monty Hall problem wrong, despite it being intuitive to many kids.

And being at the top of your field (regardless of the PhD) does not make you immune to falling for YES / EYES.

> humans without credentials are bad at basic algebra in a word problem, ergo the large language model must be substantially equivalent to a human without a credential

I'm not saying this - I'm saying that the claim 'AIs get this question wrong, ergo they cannot be a senior software engineer' is wrong when senior software engineers will get analogous questions wrong. If you apply the same bar to software engineers, you get 'senior software engineers get this question wrong, so they can't be senior software engineers', which is obviously wrong.


I am not a fan of OpenAI but they are not exactly hiring a security researcher. They are hiring an aspiring builder who has built something the masses love. They can always provide him the structure and support he needs to make his products secure. It's not mutually exclusive (safety vs hiring him).

This isn't upvoted enough. This is more interesting than the OP's project! Thanks for sharing!

> it costs almost nothing to build an app, it costs almost nothing to clone an app.

I guess the author hasn't done real software development. The cost isn't just the code. It's the whole process - especially the architecture: which database to use for the use case, which framework and language to use, how the database should be structured, table naming standardization, best practices, security audits, and everything else.

Can AI do all that? Sure, but you must know to ask for all that in the first place. Look what happened to Clawd/Molt.

> It's because building an app went from a $50K project to a weekend with Claude.

Sure, why don't you deploy your vibe-coded app over the weekend and see if it falls apart after handling one request per second.

This article was written by AI btw


Vibe code to production, perhaps not, but vibe code for regular personal use already seems within the realm of possibility.

Unless there is inherent complexity in the problem (and assuming subscriptions don’t get pricey soon) I can see nontechnical people getting into designing their own apps.

It makes me think of 3D printing. A lot of people got into 3D modeling because of it. And a lot of people publish cute-bauble 3D models (analogous to vibe-coded AI wrappers?), but there is genuinely useful stuff that people not in the fabrication or 3D design industry create and share, some even making money off of it.

I just can't think of a way SaaS margins will stay as high as they are now.


I don't disagree with the premise, but I still can't think of a SaaS that I'm paying for that I can replace. And I have many subscriptions.

3D printing is something I think about. LLMs do their best work with text, and 3D printers consume gcode. I've had Sonnet spit out perfectly good single-layer test prints. Obviously it won't have the context window to hold much more gcode, BUT…

If there were a text-based file format for models, it could generate those and you could hand them to the slicer. I've never looked, but are STL files text or binary? Or those 3MF files?

If Gemini can generate a good-looking pelican-on-a-bicycle SVG, it can probably help design some fairly useful functional parts, given a good design language it was trained on.

And honestly if the slicer itself could be driven via CLI, you could in theory do the entire workflow right to the printer.

It makes me wonder if we are really going to see a push toward text-based file formats. Markdown is the lingua franca of output for LLMs. Same with JSON, CSV, etc. Things that are easy to "git diff" are also easy for LLMs…
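
For what it's worth, STL comes in both ASCII and binary flavors, and the ASCII form is plain text simple enough for a model to emit directly. A minimal sketch (file names made up) writing a one-triangle ASCII STL from Python:

    # write_tri.py - emit a minimal ASCII STL (the binary variant is more
    # common and more compact, but this one is pure text)
    stl = """solid tri
      facet normal 0 0 1
        outer loop
          vertex 0 0 0
          vertex 1 0 0
          vertex 0 1 0
        endloop
      endfacet
    endsolid tri
    """

    with open("tri.stl", "w") as f:
        f.write(stl)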


There is a text-based file format for models. It's called OpenSCAD. It's also much more information-dense than a mesh model file like STL - e.g. in OpenSCAD you describe the curve, while a mesh file like STL explicitly states every element of it.

It's just gimped to the point that you can basically only use it for hobbyist projects; anything reasonably professional-looking uses STEP-compatible files, and those are much more complex to emulate and get right. STEP is a bit different - it's more like a mesh in that it contains the final geometry, but as BRep, which is pretty close to machining grade, while OpenSCAD is more like what you're asking about: a textual recipe for generating curves that you pass to an engine that turns it into the actual geometry. It's just that OpenSCAD is so wholly insufficient for expressing what professional designs need that it never gets used in the professional world.
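
To make the "textual recipe" point concrete, here's a rough sketch (assuming OpenSCAD's standard CLI is installed; file names are made up): one line of recipe describes a hollow ring by its curves, and the engine turns it into explicit mesh geometry.

    # scad_ring.py - write an OpenSCAD recipe, let the CLI mesh it
    import subprocess

    # two curves describe the part; a mesh file would state every triangle
    scad = "difference() { cylinder(h=10, r=5, $fn=64); cylinder(h=10, r=3, $fn=64); }"

    with open("ring.scad", "w") as f:
        f.write(scad)

    # "openscad -o <out> <in>" is the usual export invocation
    subprocess.run(["openscad", "-o", "ring.stl", "ring.scad"], check=True)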


> see if it falls apart after handling one request per second

Most of the problems you talk about are problems if you intend your software to be used at scale.

If you're building an app for yourself to track your own food habits, why do the DB, framework, and best practices matter?

People used to do this in an Excel sheet.

Now they can ask Claude to make them a nice UI similar to MFP or whatever.

Data can be stored in a single JSON file.
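
As a sketch of how little machinery that needs (all names made up):

    # food_log.py - toy tracker; the whole "database" is one JSON file
    import json, datetime, pathlib

    LOG = pathlib.Path("food_log.json")

    def add_entry(food, calories):
        entries = json.loads(LOG.read_text()) if LOG.exists() else []
        entries.append({"date": datetime.date.today().isoformat(),
                        "food": food, "calories": calories})
        LOG.write_text(json.dumps(entries, indent=2))

    add_entry("oatmeal", 300)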

It's going to take years before they see actual performance issues.

And even if it becomes an issue, an AI agent can already provide a fix and a script to migrate the data.

My only concern really is about security.

But with a private VPS only reachable through Tailscale, they're ahead of 99% of the rest.


All your points are valid, and I myself use these types of apps internally (e.g. for handling invoices). But the second your app talks to the internet, you are more likely to shoot yourself in the foot. Look what happened to Clawdbot. Everyone who used it had their instances exposed to the internet.

AI can fix bugs, sure. But every time you ask it to fix the same problem, it will come up with a new solution - usually an unnecessarily complex one. Will we reach a point where the AI can be its own architect? Maybe. But I know for a fact that it's not what we have right now.

Right now, AI needs an architect to tell it how it should solve a problem. The real value of software is in the lived human experiences, not just the code. That's why we make certain decisions differently than an AI would.

Ask an AI to vibe code an invoice app. It will make a really lovely-looking UI - which, unfortunately, is what people judge an app by - but with a MongoDB backend, which is totally not the right solution for the problem. That's what I mean.
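
To illustrate the fit question (a sketch, not a prescription): invoices are inherently relational - line items belong to invoices, invoices belong to customers - which is exactly what foreign keys model for free:

    # invoice_schema.py - why invoices map naturally onto a relational DB
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite needs this opt-in
    con.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE invoices (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            issued_on TEXT NOT NULL
        );
        CREATE TABLE line_items (
            invoice_id INTEGER NOT NULL REFERENCES invoices(id),
            description TEXT NOT NULL,
            amount_cents INTEGER NOT NULL
        );
    """)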


> If you're building an app for yourself to track your own food habits, why do the DB, framework, and best practices matter?

They don't; it's just annoying as shit when things break at the worst time for lack of these "best practices" and you know that the only answer will be "do better". I'll give you an example. Years ago I migrated a lot of my app usage to self-hosted OSS apps for all the reasons one might list. I did like 80% of what I perceived as the "important best practices": set up ZFS with redundancy to handle drive failures, a UPS for power interruptions, WireGuard for secure access, Docker for application and dependency isolation, etc.

But there were always things I just thought "I should probably do that, but later. This is just for me"

It would be the end of the day, I'm tired and in bed wanting to just chill and watch something on my iPad, and what do you know, my Plex is down, again.

Why does it go down every few days? Now I need to go get a laptop, SSH into my server, docker logs. See a bunch of exceptions. I don't want to debug it today. Just restart it; OK, it works again. Go to bed, start watching.

20 minutes in... I think it's down again... WTF? Get the laptop again, google the error, something about a SQLite DB on an NFS share not being very stable. All my ZFS storage is only exposed as NFS and SMB shares to another machine... OK, just restart and hope it works, and I'll deal with it later.

Forget about it for a couple of days. I'm with a friend at her place and want to watch again, and fuck me, I never fixed the SQLite issue; never mind, let's just watch Netflix.

Over the weekend, I'm determined to get this fixed. Move the application folder off NFS onto the local machine's SSD. It doesn't have redundancy, but it's OK for now. I'll set up an rsync job to copy it to the NFS share in case the SSD fails. I just want to see if it'll be stable.

A few months pass, and it's been pretty stable until I have a power outage. The UPS was there, but the configuration to tell the OS to shut down had broken a while ago and I didn't notice. Files on ZFS are fine, but some on the local SSD got corrupted without my noticing, including the Plex database. The rsync job just copied the corrupted file over the "backup" file.

It's late at night again, and I just want to relax and watch something, and I discover this has happened. I could try to figure out how to recover it, but it's probably easier to just do a clean scan. It's gonna take hours. Let's just start it and go to sleep.

Later: let's just migrate everything to Jellyfin. Have auto-upgrade set up because I'm smart. Jellyfin 10.8 updates and unfavorites all the favorited music tracks. "You have backups, right?" "Well, yes I do. Let me make sure I have an evening cleared so I can set up another instance of Jellyfin, restore the old backups, export the favorites list, and import it into the new one"... oh, there is no way to do that? I guess I can export it to CSV and get a plugin to automate it for me. The plugin hasn't been updated for 10.8, but there is a pull request. OK, let's wait. Forget that I set up restic to delete backups older than 30 days. Fuck me. I have the CSV somewhere, I think. God, my `/tmp` is ephemeral and I hope I haven't rebooted since then. Phew, it's there. Fuck me still.

I have worked in managing services for most of my career. I know what I'm doing wrong. I need to set up monitoring, alerts, health checks, 3-2-1 backups (not just rsync to a ZFS pool) with actual backup software that tracks file versions, off-site redundancy, dashboards for anomaly detection, scheduled hardware upgrades, and checks for memtest, disk health, and UPS configuration. I know how three or four nines are achieved in the industry.


I think there are different markets, though; it's not just the enterprise market, is it? There's a huge market where security audits are not as important.

Personally, for my small business, I've replaced a £500 Zapier subscription and a £100 Todoist subscription, and I only haven't replaced the rest because I feel like there's not a huge rush. It's been six months and nothing has fallen apart yet.

You might not think small business is relevant, but it absolutely is.


Oversimplified: Rocket Internet (the Samwer brothers) generated billions cloning apps and services. Many other examples exist. Thinking of costs as "almost nothing" is misleading, but low-cost cloning of services and apps is a business model with a strong track record, and one that seems to have accelerated due to AI. Of course, competition within this business model is also accelerating, making profitability more complex, and ethics is always complicated in this space.

What happened to clawd/molt?

https://www.infosecurity-magazine.com/news/researchers-40000...

Actually, this number was recently updated to 135,000 exposed instances.


Thanks for your answer!

> This article was written by AI btw

Unless you had an AI write the article, you can't possibly know that. I'm sick of this being randomly thrown around: it's basically mentioned for every article posted. Sometimes the author chimes in to say that no, they wrote it themselves. Other times sure, the article was written by AI. I don't know, and you don't know either.


It's not that hard to find out. Copy-paste the text into any AI detector online. I pasted it into Grammarly and it says it's AI content with 99% accuracy.

The easiest way, however: any article that uses em dashes instead of regular hyphens is most likely AI. Normal bloggers, particularly in casual tech circles, don't use em dashes. When was the last time you ever used an em dash? Me? Never.


> Copy paste the text into any AI detector online

The use of the word "any" here only emphasizes that this advice is not very valuable.


I use them when appropriate, and have for long before LLMs were using them. I'm not going to stop now.

I stopped using em dashes - because of LLMs. And it's a bullshit heuristic anyway: everyone has heard about it, and it's easy to make an LLM output something other than em dashes.

Pray, how do those "AI detectors" work? I trust them even less than I trust AI: AI detectors use simple heuristics and take advantage of your gullibility.


Eh, sometimes you know.

I vibecoded an app for my business without needing any engineers, and it is currently in use by our customers.

I think it's great that everyone can now be a developer; the gatekeeping has now been removed, and we will see a creative explosion of apps that everyone can build.

The security and maintenance aspect of apps is just a claude skill away to be a solved problem.


> "The security and maintenance aspect of apps is just a claude skill away to be a solved problem."

To think that someone on Hacker News actually wrote this seriously in 2026, after a couple of decades of CVEs, security breaches, and data thefts being in the news every single week and after 50+ years of the industry experiencing how arduous software maintenance is. I doubt even Anthropic or OpenAI would be brave enough to say that.


I think you overestimate the ability of AI to write perfectly secure apps. Humans can't do it, and AI is trained on their work.

> I think you overestimate the ability of AI to write perfectly secure apps. Humans can't do it, and AI is trained on their work.

Ironically, AI tends to be better at securing code because, unlike the squishy human, it is much more capable of creating tons of tests and figuring out weaknesses.

Let alone the issue of lots of meatbags with different skill levels working on the same codebases.

I have barely seen any codebase that has been in production for a long time that did not have glaring issues.

But if you try to do a code audit, you're paying for somebody's time (assuming this is a pro), for a long time. Whereas an AI, with the correct hints on what to look for, can do insane levels of work, testing, etc...

Ironically, when you security-test a codebase using multiple different LLMs, you get a very interesting list of issues they can find. Many are probably in tons of production-level software.

But it's up to you, as the instructor of that LLM codebase, to actually tell it to do regular security audits.


> Ironically, AI tends to be better at securing code because, unlike the squishy human, it is much more capable of creating tons of tests and figuring out weaknesses.

Sentences like this make me think AI is honestly the best thing that ever happened to my imposter syndrome. AI is great for generating test cases, and that's it. If you leave it alone, it writes the most basic, useless tests (I mean, half of them might be useful when you refactor, but that's about it). It can't design reusable test components and has trouble with test doubles, which I would think is the easiest test case for AI. Even average devs like me write test doubles faster than AI, and I'm shit at writing tests.

AI is also extremely bad at understanding versioning, and will use a deprecated API for no reason, except to increase the attack surface.

AI is great for writing CLI scripts, boilerplate, and autocomplete. I use it for frontend because I'm shit at it (even though I have to clean its shit up afterwards), and to rewrite small pieces of functionality from libraries I want to avoid loading (which allowed us to remove legacy dependencies). It's good at writing prototypes (my main use nowadays), and a very good way to use it is to ask it for a plan to improve/factorize your code (it's _very_ bad at factorizing, but since it recognizes patterns, it is able to suggest interesting refactors; half the time it's wrong, so use the "plan" mode).

I'm on a network security and cybersecurity tooling team, and I guarantee you AI is shit at securing code (and at understanding networks).


Frankly, I feel like the people downvoting my comment are still using older LLMs. When Opus 4.5 entered the picture, there was a noticeable improvement (for me) in the way the LLM interacted with the codebase and in the issues it was able to find.

I ran Opus on some public source code, and let's just say the picture was less rosy for the whole "humans as security" idea.

I understand people have an aversion to LLMs, but it rubs me the wrong way to see the amount of downvotes on here just because people disagree with an opinion. It's starting to become like Reddit. As I stated before, it's still your task, as the person working with the LLM, to guide it on security practices. But as somebody now 30 years in the industry, the amount of absolute crap I have seen produced as code (and the security issues), frankly, makes LLMs look like security wizards.

Stupid example: I have yet to see an LLM not use placeholders to prevent SQL injection (despite being trained on a lot of bad code).

The amount of code I have seen where humans just injected variables directly into the SQL... Yeah, what a surprise that SQL database contents get stolen like it's nothing. When doing a security audit on some public code, one of the items the LLMs always found: yep, SQL-injectable code everywhere.
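
For anyone who hasn't seen the difference, the placeholder point in miniature (a minimal sqlite3 sketch):

    # placeholders.py - string-built SQL vs parameterized placeholders
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (name TEXT)")
    name = "x'); DROP TABLE users;--"

    # injectable: user input spliced straight into the statement text
    # con.execute(f"INSERT INTO users VALUES ('{name}')")

    # safe: the driver passes the value separately from the SQL text
    con.execute("INSERT INTO users VALUES (?)", (name,))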

A lot of practices are easy, but anybody can overlook something in their own codebase. This is where LLMs are so great: you audit with multiple LLMs and you will find points that are weak or where you forgot something, even if you write security-conscious code.

So yeah, I have no issue having discussions, but the ridiculous downvotes, which seem to come from people with no clue, are amazing. Going to take a break from here.


I must only work with geniuses (or rather, extremely competent seniors) who keep their codebases very clean, because that never happened to me. Even in my worst job, at a bank, with idiotic product devs who couldn't read a Java trace to save their lives, security was the only thing that mattered.

But like I said, this whole discussion on LLMs since Opus came out is _great_ for my ego. At first I thought I was using it wrong; then my company ran weekly meetings on "how to use AI" with devs who swore by it; now I'm confident I might be a bit above average after all.

Maybe it's different for tooling/network/security devs than for product devs, but I doubt our backends are _that_ complex.


> the gatekeeping has now been removed

Nobody gatekept anything. The software, the tools, the knowledge base (MIT, Coursera, etc.) were always there. It was a choice. Some of us chose it; the rest didn't, for whatever reason.


> gatekeeping has now been removed

Who was preventing you from learning how to do it yourself and then doing it?


Comments as short-sighted as this give me confidence in the future job security of people who actually know how to write software.

> the gatekeeping has now been removed

'Gatekeeping' being 'knowing'... nobody was stopping you from learning.

> The security and maintenance aspect of apps is just a claude skill away to be a solved problem.

Incredible joke. Got a good laugh from me.


How does Apple allow this? Here I thought the App Store was supposedly superior to the Android ecosystem, and that's why Apple justified the insane 30% tax on developers back then.

Google Play was also 30%?

Yeah, but Google always allowed you to bypass that by letting users install apps outside of their store, whereas Apple pitched it as a security concern, only to allow in whoever paid them a nice fat commission.

I thought Android allowed installing third-party apps without going through the store. Isn't this 90% of the pitch of Android to begin with?

100% accurate. The architect matters so much more than people think. The most common counterargument to this I've seen on Reddit comes from vibe coders (particularly in the v0 and Lovable subreddits) claiming they built an app that makes $x0,000 over a weekend, so who needs (senior) software engineers and the like? A few weeks later, there's almost always a listing for a technical co-founder or an experienced CTO on their careers page or LinkedIn :)))

If that's true, it sounds like the vibe coders are winning - they're creating products people want, and pull in technical folks as needed to scale.

But the argument is not about market validation; it's about software quality. Vibe coders love shitting on experienced software folks until their code starts falling apart the moment there is any real-world usage.

And about pulling in devs - you can actually go to indeed.com and filter listings for co-founders and CTOs. Usually equity-only, or barely any pay, since they're used to getting code for free. No real CTO/senior dev will touch anything like that.

For every vibe-coded product, there are a hundred more clones. It's just a red ocean.


Ars Technica has always been trash, even before LLMs, and is mostly an advertisement hub for the highest bidder.

I built a windmill with Claude. I created a skills.md and followed everything by the book. But now, I have to supply power to keep the windmill running. What am I doing wrong?

20KW? Wow. That's a lot of power. Is that figure per hour?

What do you mean by "per hour"?

A watt is a measure of power, that is, a rate: joules per second [energy/time].

> The watt (symbol: W) is the unit of power or radiant flux in the International System of Units (SI), equal to 1 joule per second or 1 kg⋅m²⋅s⁻³. It is used to quantify the rate of energy transfer.

https://en.wikipedia.org/wiki/Watt


If you run it for an hour, yes.

Ah yes, like those EV chargers that are rated at X kWh/hour.

You would hope that an EV charger reporting x kWh/hour considers the charge curve when charging for an hour; then it makes sense to report that instead of the peak kW rating. But the reality is that they just report the peak kW rating as the "kWh/hour" :-(

I asked because that's about what an average US household consumes per day. So, if that figure is per hour, that's equivalent to one household's worth of consumption per hour... which is a lot.

Others have clarified kW versus kWh, but to revisit the comparison to a household:

One household uses about 30 kWh per day.

20 kW * 24 = 480 kWh per day for the server.

So you're looking at one server (if parent's 20kW number is accurate - I see other sources saying even 25kW) consuming 16 households worth of energy.

For comparison, a hair dryer draws around 1.5 kW, which is just below the rating of most US home electrical circuits. So this is something like 13 hair dryers going full blast.


At least with GPT-5.3-Codex-Spark, I gather most of the AI inference isn't rendering cat videos but mostly doing useful work... so I don't feel too bad about 16 households' worth of energy.

To be fair, this is 16 households' worth of electrical energy. The average household uses about as much energy in the form of natural gas (or butane or fuel oil, depending on what they use) as it does electricity, and then roughly as much again as gasoline. So really more like 5 households' worth of total energy. And that's just direct energy use, not accounting for all the products, including food, consumed in the average household.

Which honestly doesn't sound that bad given how many users one server is able to serve.

Consumption of a house per day is measured in kilowatt-hours (an amount of energy, like litres of water), not kilowatts (a rate of flow, like one litre of water per second).

1 watt = 1 joule per second.


Thanks!

I think you are confusing KW (kilowatt) with KWH (kilowatt hour).

A KW is a unit of power while a KWH is a unit of energy. Power is a measure of energy transferred in an amount of time, which is why you rate an electronic device’s energy usage using power; it consumes energy over time.

In terms of paying for electricity, you care about the total energy consumed, which is why your electric bill is denominated in KWH, which is the amount of energy used if you use one kilowatt of power for one hour.
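
A quick worked version of the distinction, using the numbers from upthread:

    # power_vs_energy.py - power (kW) is a rate; energy (kWh) = power x time
    server_power_kw = 20                        # rate of consumption
    energy_kwh_per_day = server_power_kw * 24   # 480 kWh consumed per day

    household_kwh_per_day = 30
    print(energy_kwh_per_day / household_kwh_per_day)  # 16.0 households' worth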


You're right, I absolutely was mixing them both. Thanks for clarifying!

Acktshually "kW" and "kWh" to be precise

It’s 20kW for as long as you can afford the power bill

20 kWh per hour
