Matthew Wisdom in bitsofwisdom.hashnode.dev · 6d ago · 7 min read
State of Affairs: Why Your Code Needs Finite State Machines
I love state machines! If you've ever written a pile of nested if/else statements just to track "what mode is this thing in right now," you've already built a finite state machine. You just didn't cal...

DataKaz in datakazkn.hashnode.dev · Jun 21 · 4 min read
I Built an Apify Actor to Compare Luxury Handbag Resale Prices Across 4 Marketplaces
I built an Apify Actor that compares public resale listings for Hermes Birkin, Hermes Kelly, and Chanel Classic Flap bags across four marketplaces: Vestiaire Collective, Rebag, Fashionphile, 1stDibs. Ac...

Tony Wang in crawlora.hashnode.dev · Jun 18 · 4 min read
Dead vs. Blocked: What a 10-Million-Domain Scan Reveals About Web Failures (Only 14% Is Truly Dead)
Originally published on the Crawlora blog. Dataset: github.com/Crawlora-org/dead-web-index-data. When you hit a dead URL in production, do you know whether the domain is gone, or whether an anti-bot...

Алексей Спинов in spinov001.hashnode.dev · Jun 16 · 15 min read
The HTTP Code Your AI Agent Doesn't Handle Yet: 402
Your fetch agent knows two endings to a request. 200: parse it. 403: back off, rotate, or skip. That branch has been the whole game for years. There's a third ending now, and it's the one your code fa...

Tony Wang in crawlora.hashnode.dev · Jun 15 · 10 min read
Why Reddit Blocked Unauthenticated JSON in 2026 (and How to Still Get Reddit Data)
Key takeaways: On May 28, 2026, Reddit announced it is deprecating unauthenticated .json endpoints; within days, appending .json to a URL started returning 403, silently breaking most open-source Red...

Tony Wang in crawlora.hashnode.dev · Jun 14 · 11 min read
Best AI Web Scraping Tools in 2026: How to Choose
Key takeaways: "AI web scraping" means two different things: AI-native extractors that read an arbitrary page with an LLM, and structured data APIs that hand AI clean JSON for known sources. Pick by w...

Алексей Спинов in spinov001.hashnode.dev · Jun 13 · 13 min read
Your AI Agent Re-Reads Every Page It Already Saw. I Measured the 8x Context Tax
Turn 1 cost about 300 input tokens. Turn 20 cost 7,000. Same agent, same kind of page, 20 times more expensive for the last step than the first. Nothing was broken. The agent gave the right answer the...

Алексей Спинов in spinov001.hashnode.dev · Jun 12 · 18 min read
Your AI Agent Trusts a 200 OK. I Logged How Often the Page Was Garbage
Yesterday I handed an agent a web_fetch tool. It fetched a page, got back a 200 and a screenful of text, and confidently built a plan on it. The text was a Cloudflare "Just a moment..." screen. The ag...

Tony Wang in crawlora.hashnode.dev · Jun 11 · 18 min read
How Paywalls Actually Work: The Engineering Behind Them
A paywall is one of the more interesting engineering problems on the web, because the publisher has to satisfy two goals that pull in opposite directions. It needs Google to index the article so peopl...

Алексей Спинов in spinov001.hashnode.dev · Jun 11 · 12 min read
Give Your AI Agent a Web-Fetch Tool: a 60-Line MCP Server (Free, Self-Hosted)
Every MCP web-access tutorial I read this month pointed at a paid API. You don't need one. To let an AI agent read a public web page, sixty lines on the official MCP Python SDK give you a self-hosted...