The ASIC Unbundling
Inference accounts for two-thirds of all AI compute now, and nearly all of it runs on chips that were designed for a fundamentally different job. The industry spent a decade optimizing silicon for training runs, concentrated bursts of GPU-saturating ...
distributedthoughts.org · 6 min read