Caching LLM Extractions Without Lying: Conformal Gates + a Reasoning Budget Allocator
The extraction pipeline processed 2,400 documents overnight. Cost: $380. The next morning I diffed the inputs against the previous batch: 87% were near-duplicates with trivial whitespace changes. I’d burned roughly $330 re-extracting answers I already had.
No...
craftedbydaniel.hashnode.dev · 11 min read