Jason Agostoni in jason.agostoni.net · Jun 15 · 12 min read
Can Fable 5 Finish Off the Other Frontiers?
Can Anthropic's Fable 5 justify its staggering cost and live up to the massive hype to unseat the top specialized models? I ran Ship-Bench against the model to find out, stacking it up directly agains...

Jason Agostoni in jason.agostoni.net · Jun 1 · 13 min read
Can the Mid-Tier Models Stack Up Against the Bigger Siblings?
Can you really justify paying flagship prices when the mid-tier models may already be good enough? The original comparison started with Gemini 3 Flash vs. Claude Sonnet 4.6, then Gemini 3.5 Flash arri...

Jason Agostoni in jason.agostoni.net · May 21 · 5 min read
Antigravity CLI First Impressions: Fast, Rough, and Not Ready
Google has officially replaced Gemini CLI with the new Antigravity CLI and launched it alongside Gemini 3.5 Flash, which became the default model for the new CLI experience. That made the launch more...

Jason Agostoni in jason.agostoni.net · May 13 · 12 min read
Do Open Frontier Models Have A Chance Against Closed Models?
Which of the new open-ish frontier models has the best chance to stand up against closed-source models on both cost and quality? I ran Ship-Bench against Kimi K2.6, Qwen 3.6 Plus, and DeepSeek v4 Pro...

Jason Agostoni in jason.agostoni.net · Apr 27 · 11 min read
Can Gemma 4 Beat Gemini 3.1 Pro at Coding?
Is a $20/month Google AI Pro account worth it versus running Gemma 4 31B on OpenRouter pay-as-you-go? This Ship-Bench run was designed to answer that question across a realistic coding workflow rather...