Caching LLM Extractions Without Lying: Conformal Gates + a Reasoning Budget Allocator
1d ago · 11 min read · The extraction pipeline processed 2,400 documents overnight. Cost: $380. The next morning I diffed the inputs against the previous batch: 87% were near-duplicates that differed only by trivial whitespace changes. I’d burned $330 re-extracting answers I already had. No...