28 Real Tasks Reveal What AI Leaderboards Miss
Michael · makerpulse.hashnode.dev · Feb 25 · 11 min read
Originally published on MakerPulse.

4.61 versus 4.55. That's the gap between the top two models in our first AgentPulse benchmark run: GPT-5.2 and Gemini 3.1 Pro, separated by six hundredths of a point...