I think we're pointing at the same instinct from different layers. Get observability in early and most "we need to scale" problems turn out to be "we couldn't see what was actually happening" problems. The piece argues that's doubly true for agents, where the failures are silent and semantic rather than something visible like a slow query or a 500.