Why 10 million tokens is the only memory benchmark that matters
Originally published at https://nicoloboschi.com/posts/20260402
TL;DR: Memory benchmarks died when context windows hit 1M tokens, because you could just dump everything into the prompt. BEAM tests at 10M tokens, where that trick fails. Hindsight scores 64.1% there, 58% ahead ...
nicoloboschi.hashnode.dev · 5 min read