I’ve recently created many Claude skills for repeatable tasks (architecture review, performance, magic strings, privacy, SOLID review, documentation review, etc.). The pattern is: when I’ve prompted it into the right state and it’s done what I want, I ask it to create a skill. I get Codex to check the skill. I could then run it independently in another window and feed back to adjust… but you get the idea.
And almost every time it screws up we create a test, and often for the whole class of problem. More recently it’s been far better behaved. Between Opus, skills, docs, generating Mermaid diagrams, and tests, it’s been a lot better. I’ve also cleaned up so much of the architecture that there’s only one way to do things. That keeps it more aligned and helps fight entropy. And these will work better as models improve. Having a match between code, documents, and tests means it’s not relying on just one source.
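To make the “test for the whole class of problem” idea concrete, here’s a rough, self-contained pytest sketch. The function and inputs are made up for illustration, not from my codebase: when the model mishandles one input, the new test parametrizes over the whole class of inputs rather than pinning just the single failing case.

```python
import pytest
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def to_posix_uri(path: str) -> str:
    # Stand-in for the real function the model got wrong on one input.
    return PurePosixPath(path).as_uri()


# Cover the whole class of awkward paths, not just the one that
# originally broke, so the entire class stays protected.
@pytest.mark.parametrize(
    "path",
    [
        "/tmp/plain.txt",
        "/tmp/with space.txt",
        "/tmp/unicode-é.txt",
        "/tmp/percent%sign.txt",
    ],
)
def test_uri_round_trips_the_original_path(path):
    uri = to_posix_uri(path)
    assert unquote(urlparse(uri).path) == path
```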
Prompts like this seem to work: “what’s the ideal way to do this? Don’t be pragmatic. Tokens are cheaper than me hunting bugs down years later”
This is smart as hell. I’ve long wondered how they’d combat ASICs without diluting their own benefits. This gives them a bit more time to figure out the moats, which is useful because Groq was going places. This juices Groq’s distribution, production, and ability to access a wider range of skills where necessary.
I expect China to want to compete with this. Simpler than full-blown Nvidia chips. Cue much cheaper and faster inference for all.
Not terribly niche. All config that isn’t environment-specific and is used in inner loops or at startup. It’s even got a test for serialised values, so it can be used to speed your case up:
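For anyone wondering what that pattern looks like, here’s a rough, made-up sketch (not the actual project; names are hypothetical): non-environment-specific config baked into code, with a test that pins its serialised value.

```python
import json
from dataclasses import asdict, dataclass


# Non-environment-specific config lives as plain code constants:
# cheap to read in inner loops and at startup, no file I/O or parsing.
@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 5
    backoff_seconds: float = 0.2


RETRY_POLICY = RetryPolicy()

# Pinned serialised form; changing the policy forces a deliberate,
# reviewable edit to this expected value.
EXPECTED_JSON = '{"max_attempts": 5, "backoff_seconds": 0.2}'


def test_retry_policy_serialised_value_is_pinned():
    assert json.dumps(asdict(RETRY_POLICY)) == EXPECTED_JSON
```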
Yep, same here. Fortunately they don't seem to use it for anything yet, which rather raises the question of why it's there in the first place. (It doesn't need to be stored in a user-visible way if the only purpose is as a poor/annoying "proof of humanity" check against sockpuppet accounts.)
Update: I just checked, out of curiosity – seems to be gone now?
There is enough to read between the lines. A Dilbert manager has bought into a barely competent app (that they think replaces humans) in a panic over costs. They’ll learn more lessons over the next year, if they survive. The author might be a bit emotionally charged right now and is writing between some lines of their own as they help migrate. Once they have some emotional distance they might write the fully clear article you’d like.
They probably used Claude because that way they don’t get blocked as fast; websites trust Claude more. And why not use the foreign tools against themselves, at presumably discounted rates (see AI losses), rather than burn your own GPUs and IPs?
Thousands of calls per second? That’s a lot of traffic. Hide it in Claude, which is already doing that kind of thing 24/7. Wait until someone uses all the models at the same time to hide the overall traffic patterns, with all the security implications that brings. Or has AIs driving botnets. Or steals a few hundred enterprise logins and hides the traffic that is presumably not being logged because of privacy and compliance.
I left a research org because it was basically a professional gatherer and spender of grants. People would make some proposal with a consortium, spend the money, write a paper or two, visit a conference in e.g. Japan to present it, rinse and repeat. Hated that. Nothing to do with real innovation.
Developers are hit too, although I don't expect that anyone will be replaced. I think AI is a productivity boost; it just takes less time to solve the small problems and to get reasonable advice for anything beyond them. Perhaps it reduces the headcount required to implement some features. But companies that expel their knowledge workers for some AI solution probably won't survive long. Those that understand the tooling advantage will get ahead, though.
I love AI image generation, but many certainly do not enjoy the results. I can see some people skimping on paying artists.
At first I thought translators would be hit hard by AI, but you probably still need them to be reasonably sure about correctness.
And it remains true that any creativity produced by AI is basically still just a function of the creativity of other people.