"GPT-5.4 Just Scored Higher Than Humans at Real Computer Work. Here's Why That Number Is Terrifying."
75%. That's GPT-5.4's score on OSWorld-Verified, a benchmark that tests whether AI can actually use a computer the way a human does. Human baseline: 72.8%. The machines didn't just match us. They passed us, by 2.2 percentage points. And the trajectory tells a very specific stor...