The only thing I learned in the last year is that you can't really benchmark LLMs at all. Above a certain level it's just edge case against edge case, or script kiddies and multi-billion corps optimizing their fine-tunes against the test.
I’ve been using the Claude 3 API since the models were announced. I believe it’s generally available (though capacity constrained & rate limited at present).
I don't know, mate. If I got flamed all day, with a smidge of death threats sprinkled in, on one platform but not on others, I would probably leave too.
Are you asking because you think those examples don't exist, and if so, are you open to someone showing you they do?
If you're open to someone showing you they do, would you accept their judgement in that regard, or would you want to apply your own judgement of their judgement?
More of the first - I imagine some example(s) exist. Though without finding/seeing even one of them, it's hard to think there's more than a few isolated incidents.
I'm not too clear on what you were asking regarding the judgment part?
Look beyond your bubble. The child participating in 'School of the Air' in Australia likely appreciates having such a resource to develop their social skills. Sometimes you need a private space, especially if the community around you isn't on the same wavelength as you are, or if it's just your family and some farm hands.
So your rebuttal is that isolated examples of this technology being beneficial outweigh the mass societal cost of people becoming more isolated in general? And I simply do not see how the need for "private space" and "practising social skills" go hand in hand.
Yes, everyone needs their alone time. And everyone needs social contact (except maybe some exceptions, maybe...). Both can be accomplished WITHOUT technology, and much more easily.
Don't you think that if someone needs private space and isn't getting it, that's something that should be dealt with away from technology?
You mean the social skills or the AI part? Because the social skills part makes sense; the AI part doesn't. You need to stop looking in the mirror for every assumption you make. There is a whole world out there.
OP's argument is that it is bad for society as a whole. Coming back with an example for a very small subgroup doesn't refute OP's position. Your position that it can help with edge cases is valid. However, if it were rolled out on that basis despite OP's concern, there would be a large swath of damage for a small gain in a small population. That doesn't have much logic to it.
I bet it's just that they don't know where they would have to patch all the poorly hardcoded browser switch code they implemented in the last 10 years... so it was easier to write a watchdog.