28 Real Tasks Reveal What AI Leaderboards Miss
makerpulse.hashnode.dev

Originally published on MakerPulse.

4.61 versus 4.55. That's the gap between the top two models in our first AgentPulse benchmark run: GPT-5.2 and Gemini 3.1 Pro, separated by six hundredths of a point...