Do you know of evals with default Claude vs caveman Claude vs politician Claude solving the same tasks? Hypothesis is plausible, but I wouldn’t take it for granted
Cursor should be advertising multi-model adversarial reviews. I do this all the time, and let me tell you, the things that slip through the cracks when Opus or GPT write code, and that Gemini then catches, are downright scary, on the backend anyway.
Everybody acts so surprised, as if nobody (around here of all places!) read the sama tweet in which he was hiring the Head of Preparedness... in December.
Besides the fact that I'm not reading X, what does this arbitrary random tweet have to do with Anthropic, or with the YT talk about Opus's quality jump in finding exploits no one else had been able to find so far?
A theoretical random tweet and a clear demonstration are two different things.
Assuming the scenario happened, the first bombing runs would be over after 2h, and bombing would continue for the next 48h until the amphibious fast-response assault force finishes landing, by which time it’s safe to assume there isn’t much left to defend (though rubble makes a horrible war zone for the attacking side).
Cuba simply isn’t Iran. They’re a blockaded island with not much military experience. Iran is a huge mountainous country that has been preparing for war for the last 40 years, with first-hand experience of getting blown up from above and from the inside by US allies and surviving just fine.
By at least some. The Americans I know who have traveled to Cuba (policy changes; it was possible a few years ago at least) report that the people love Americans. Of course, what you see as a tourist isn't reality, but at least some of it is true.
They will succeed eventually since they have proof it’s possible and their plans span decades. I expect them to have working EUV in 10 years. Whether it’ll still be bleeding edge tech is a different question I dare not guess the answer to.