© 2026 Hashnode
A 71% drop in inference spend, with one caveat
Last quarter our edge inference bill at Vercel hit $11,400 a month, mostly from a single feature: a writing-assist tool that runs a Llama 3.1 8B model behind a Cloudflare Workers AI endpoint. In April 20...

This morning's #1 story on Hacker News (827 points) is a side panel that runs an entire LLM on your laptop with one JavaScript call: await LanguageModel.create(). No server. No API key. No round-trip. The model — Google's 4GB Gemini Nano — is already...

Last month I did something I should have done years ago: I opened chrome://extensions, looked at the permissions list on my installed extensions, and nearly choked on my coffee. A tab manager extension had access to all my browsing data. A "simple" s...
