I was wondering when something like this would happen. I got my first and only two content violation warnings in Claude Code last week when asking it about something ML related. It was a real head scratcher because I couldn’t figure out what about the requests could have violated anything.
Might be worth going back and taking a harder look at what I was asking it about if it somehow triggered a “forbidden knowledge” alert. Or maybe it was just a random bug.
We get so angry at LLMs because we can, without any social or even emotional repercussions for expressing these emotions. If the models actually acted like people in response, we wouldn’t do it. Some of the people I work with daily make similar mistakes, and I don’t find myself yelling at them.
I think this is simply part of the darker side of human nature: when we interact with entities who will take abuse, we tend to deal it out.
I dread people who get abusive with AI, because I know it's only fear that prevents them from being like that with me. Even if it is only the fear of hurting me, it's still terrible because every fear can pass.
It’s an interesting insight into human nature. It seems like this is quite widespread, judging by this thread anyway. It’s a reminder that we run on social input and on environmental factors, and our traits are only our own little slants on this mass behaviour. Sort of like the “civilisation is only one meal away from collapse” thing.
Though obviously some people, let’s say, react worse than others.
I think it’s best to try to treat LLMs well even when frustrated, or stressed, or tired, the same way we would with people, both because it might well matter to the LLMs even if they are very different from us mechanically, and because mistreating them trains us to act in negative ways.
I wouldn't mind all that much if somebody said bad words to AI strategically in order to successfully make AI behave better, same way I don't mind all that much if someone is making or watching a movie with cartoonish violence.
It still wouldn't be great because why would we make AI that behaves well only when you say bad words to it? But I wouldn't mind all that much.
I do mind if those words are emotionally motivated.
I think this is a genuinely difficult problem that happens to look exactly like what you’d need for extended surveillance. When I think about it seriously, I end up coming up with the idea of a whitelist enforced on device for local accounts used by children.
This would probably block most of the internet, and allow access only to sites that are validated as being safe. This would put a lot of pressure on sites and service providers to ensure safety, such as children-only walled gardens within their broader services.
We already have piecemeal attempts at something like this through on device private age restriction software, but it’s not organised at the state level, and I think it’s not effective enough as a result.
If legally enforced it could be made into a pretty effective system that would give adults freedom and anonymity and provide safety for children, while pushing the costs of child safety onto the platforms, which is where it belongs. If you want to cater to children, prove that you can make it on to the whitelist. Otherwise that’s an audience you’re just not able to access.
Those don’t really work because this isn’t a nation-state-level enforced system, and realistically the only state that can force such a thing is the US. If they worked, we wouldn’t be here having this discussion.
... that don't need the identity of the parents to work.
Nor do these devices require the identity of non-parents who will never enable the childproofing mode.
Nor does legislation invert the burden of proof and require the device's manufacturer to obtain and store identity documents just to use the devices, otherwise it must restrict all access to a small handful of "kid safe" actions.
These aren't "child safety" laws, they're "adult anonymity eradication" laws.
> the idea of a whitelist enforced on device for local accounts used by children
What’s wrong with making it the social media companies’ problem? If they sign up a child, they get fined. Everyone is then incentivized to come up with solutions. If some of those are shit, restrict them. If they’re not, great.
But constrained to those using the platforms. My issue with these broader measures is even if I don't use social media, I'm still caught up in the dragnet.
Yes. It goes off into the same on-device wilderness the lawmakers have wandered into. It also fails Mozilla’s objection list to the status quo proposal.
It’s probably something like DeepSeek’s native sparse attention with content-based granularity. They might not be publishing anything because it’s not such a strong value proposition and doing so would lead to commentary that would tank their investment opportunities.
There's ways and means. Pushing something out in the sub-30B range would gain them mindshare and they could keep bigger models to themselves. I can't see any indication of what size their model is though.
> why would SGD put the right things in the right bucket?
Think of it as a best fit curve and exceptions to that curve. The noise is essentially this set of exceptions that move points away from where they would otherwise fall on the curve.
Gradient descent wants to be able to make the smallest change that moves the most data points towards the curve. To do this it learns an arrangement where it can change, say, one parameter and have a bunch of points move at once. What does this correspond to? The big common patterns shared by many data points.
Most of the capacity gets soaked up modelling these sorts of common patterns, and after they have been learned the model starts adding exceptions that allow individual points to deviate from the curve.
Because they’re exceptions, they must not impact neighbouring points, or at least only ones within a very short distance from them. Otherwise they’re now driving the error higher by impacting more points than they should. So you end up with very narrow ranges of features that are able to trigger different sorts of noise.
How narrow they are is shaped by the training data: they’re exactly as narrow as needed not to raise the error, so assuming the total population has the same distribution, they don’t get hit. Much.
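To make the "curve plus exceptions" picture concrete, here is a toy sketch in plain Python. It is not a neural network and not anyone's actual training setup. Everything in it is an assumption made up for illustration: a single shared slope `w` stands in for the "big common pattern", one narrow Gaussian bump per training point stands in for an "exception", and the names `bump`, `predict`, `train` and `make_data` are hypothetical. The bump width is also fixed by hand here rather than learned, so it only illustrates the claim about narrowness and does not demonstrate it.

```python
import math
import random

def bump(x, center, width):
    """Narrow Gaussian bump: 1 at `center`, essentially 0 a few widths away."""
    return math.exp(-((x - center) / width) ** 2)

def predict(x, w, a, centers, width):
    """Shared trend w*x plus one narrow 'exception' bump per training point."""
    return w * x + sum(ai * bump(x, c, width) for ai, c in zip(a, centers))

def mse(xs, ys, w, a, width):
    """Mean squared error of the model on the points (xs, ys)."""
    return sum((predict(x, w, a, xs, width) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, steps, width=0.02, lr=0.03):
    """Full-batch gradient descent on 0.5 * sum of squared residuals.

    The step size is hand-picked to be stable for the data from make_data().
    """
    w, a = 0.0, [0.0] * len(xs)
    for _ in range(steps):
        res = [predict(x, w, a, xs, width) - y for x, y in zip(xs, ys)]
        # Every point contributes to the gradient of the shared slope ...
        grad_w = sum(r * x for r, x in zip(res, xs))
        # ... but each bump weight only "sees" the residual right under it.
        grad_a = [sum(r * bump(x, c, width) for r, x in zip(res, xs)) for c in xs]
        w -= lr * grad_w
        a = [ai - lr * g for ai, g in zip(a, grad_a)]
    return w, a

def make_data(seed=0, n=20, slope=2.0, noise=0.2):
    """Points on the line y = slope * x, each nudged off it by Gaussian noise."""
    rng = random.Random(seed)
    xs = [0.1 * k for k in range(1, n + 1)]
    ys = [slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

if __name__ == "__main__":
    xs, ys = make_data()
    for steps in (10, 500):
        w, a = train(xs, ys, steps)
        print(steps, "steps: w =", round(w, 3), "train MSE =", mse(xs, ys, w, a, 0.02))
```

In this toy, the slope settles within the first handful of steps because one parameter moves all twenty points at once, while the per-point bumps take hundreds of steps to soak up the leftover noise. Evaluating `predict` halfway between training points gives almost exactly `w * x`, since the bumps are too narrow to reach there.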
Well that would be silly. I would hope the diabetic would go to a nutritionist for their physical and medical problem. But a social problem is something that should probably be fixed with a social solution
There’s a lot of energy in this thread mistaking introversion and autism for an inability to relate to others. That’s not true; you just have a different perspective and will relate in a different way. Autism might be a proximal cause for anxiety, but anxiety is not a feature of autism and it can be overcome.
> HL's engine GoldSrc was originally a mod for Quake.
GoldSrc is based on Quake 1 code with Valve’s own modifications and a little Quake 2 added in, if I remember correctly. I wouldn’t call that a “mod”; they bought a commercial license for the engine and made a game with it.
You’re trying to use this to say that Valve are unoriginal? I really don’t think that’s a criticism you can lob at the Half-Life series.
I do not work in robotics, but I would also like to thank you for listening to your conscience and resigning. The world needs more people like you. I hope your venture goes well!
This saddens me as well, because that's the type of thing that happens every day where I live, but...
> I don’t understand why it is allowed to continue.
The answer is even sadder. It's even worse. And it is as follows: because there's not enough people who are taking action, and from those taking action there's not enough people in power to change something significantly. At least that's how I see it. And... I can't even blame those who don't take action, because many people feel completely powerless. They feel like "what can you do to stop this war/other thing if you're just a regular human?"
There's also a huge cost for taking action about this especially in the US. You can easily get thrown out of school, have your career destroyed or be deported.
This. There are entire groups dedicated to rooting out any sign of deviation from the pre-authorized storyline and verbiage. It is particularly striking given that the US considers itself a 'free speech' bastion.
This is mostly a US thing. Netanyahu and Putin are both wanted for alleged war crimes under International Criminal Court (ICC) arrest warrants. Although Trump threatened the ICC, this doesn't change that basic fact.
Already in 2002 the US passed the "American Service-Members' Protection Act", which allows the US to deploy its military to prevent U.S. or allied officials and military personnel from being prosecuted or detained by the ICC.
It passed via bipartisan vote well before the US launched the illegal invasion of Iraq, in which it committed various war crimes.
This goes beyond direct action by individuals. It’s completely obvious what’s happening, and it happens because the US political system has been captured.