Hacker News | cinntaile's comments

Don't leave us hanging. What happened?

A CTO sent me a message that opened with:

“Here’s a friendly message that will perfectly convey what you want to say”.

A friend with two PhDs says she talks to ChatGPT for all sorts of advice and doesn't feel safe without it, "because, you know, I'm single and don't have a companion to spitball my ideas with". She let ChatGPT decide which route to take to a certain island, and she got stranded because the suggested ferry service didn't exist.

I have more examples. It’s a fucking mind virus.


How is the getting-stranded example different from asking on a travel forum how to get somewhere, and an active, well-intentioned user who isn't familiar with your destination gives you wrong instructions, and you get lost?

The key missing step is where the traveler exercises critical thinking and checks the advice they get. Some people seem to turn that off for LLMs.

It's because we spent the last 50 years training people that computers are algorithmic, cold, and don't make human mistakes. Your calculator can't tell you the meaning of life, but it will never get 2 + 2 wrong.

Well, now the calculator can tell you a meaning of life, but it'll get 2 + 2 wrong 10% of the time.


Because they aren't probabilistic parrots? If they get it wrong, there's usually an understandable reason behind it.

Cunningham's Law [0] [1] increases the likelihood that at least one other person will point out the error and correct it. Chances are you'll get more than one person posting.

LLMs don't do this. They give confident language output, not correct answers.

[0]: https://meta.wikimedia.org/wiki/Cunningham%27s_Law

[1]: https://xkcd.com/386/


Because the vast and overwhelming majority of the time, if you ask a question into the ether that nobody has a good answer to, most people will gloss over it and not bother answering, as attested by decades of relatable memes ( https://xkcd.com/979/ ). In contrast, the chatbot is trained to always attempt an answer, and is seemingly disincentivized by its training to just shrug and say "I don't know, good luck fam".

They stop thinking and they stop verifying output too.

It would have been nice if the article explained what an optimizer is in this case?

I will add "compiler" before "optimizer" and link to the toy optimizer series

Thanks! I had to go to the rest of your site to make sense of it so that seems like the right approach.
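For anyone else who landed here without context: an optimizer in this sense is a compiler pass that rewrites a program's intermediate representation into an equivalent but cheaper form. A minimal illustrative sketch (constant folding on a toy expression tree; all names here are made up, not from the linked series):

```python
# Toy sketch of a compiler optimizer pass: constant folding.
# It walks an expression tree and precomputes additions whose
# operands are already constants.
from dataclasses import dataclass


@dataclass
class Const:
    value: int


@dataclass
class Add:
    left: object
    right: object


def fold(node):
    """Recursively replace Add(Const, Const) with a precomputed Const."""
    if isinstance(node, Add):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(left.value + right.value)
        return Add(left, right)
    return node


expr = Add(Const(2), Add(Const(3), Const(4)))
print(fold(expr))  # Const(value=9)
```

Real optimizers chain many such rewrites (dead-code elimination, strength reduction, etc.), but each pass has this same shape: match a pattern in the IR, replace it with something equivalent and faster.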

Seems kind of pointless. If they wanted to, they could just set a retroactive date from which the law applies.

Those would face robust constitutional challenges.

What constitution?

That bot needs more practice though. It didn't even understand what it was replying to.

The average Japanese person doesn't know English.


Probably unintended but this is a great pun.


How is M$ insulting? It just looks like a leetspeak version of MS.


It is supposed to indicate that Microsoft cares only about money, which to me seems in the same league as "microslop", i.e. mildly insulting but really not rude enough to be worth censoring.


And other insults are just words as well. It's the intention, history, connotation etc. behind words that give them meaning. M$ is meant as an insult, hence it's insulting. https://en.wiktionary.org/wiki/M$


As I said, I was not aware of the insult.


You created it in minutes, I think the appropriate next step would be to ask another LLM to try to poke holes in it. It does not seem fair to ask security professionals to waste their time on this.


The title should include that it's for cars.


It doesn't run a similar prompt or the same prompt again and hope for the best. If it doesn't work, the agent debugs based on the errors received. Have you tried using a coding agent recently?


"rerun" is meant in a more abstract way here; "doesn't work" is meant in the way that the app itself is bad - and doesnt sell;

ofc i'm aware of modern agent loops; without them it wouldnt be possible to build apps with the click of a button in the first place


