The scaffolding point is dead on.
A lot of teams think they have a model problem when they really have a runtime problem: unclear boundaries, weak retry rules, and no machine-readable proof for why the run kept going.
Once that layer is explicit, the agent usually looks a lot less magical and a lot more dependable. That's been one of the biggest lessons behind MartinLoop.