Speed, caching, and the 40x cost wall
3d ago · 4 min read
This is mid-thought, mid-evaluation, mid-engineering. Posting it because writing it out helps me think.
We have been running the RapidNative agent on Cerebras for a while now. The speed is unreal. GLM