The Benchmark Saturation Problem: Why AI Evaluation Needs a Systems Thinking Approach
Last week, Stanford released its 2025 AI Index report with a finding that should make anyone following AI scratch their head a little. In a number of areas, models are now saturating benchmarks faster than we can create new ones. According t...
distributedthoughts.org, 7 min read