The gap between confidence and correctness is where agent systems break hardest. You describe this for generative AI, but it compounds when AI takes actions: an LLM that drafts an incorrect email is annoying; an agent that executes a wrong transaction is expensive. The iteration pattern works for content, but agents need:

- Explicit contracts. Define inputs, outputs, and failure modes upfront.
- Observable state. Humans need inspectable decision paths, not just outputs.
- Graceful degradation. Failure modes are part of the normal operating envelope, not exceptions.

Without that structure, human-in-the-loop becomes human-fixing-things-after-they-break.
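The three requirements above can be sketched together in a few lines. This is a minimal illustration, not a real agent framework: the names (`TransferRequest`, `Decision`, `run_transfer`) and the autonomous-amount limit are all hypothetical, chosen to show the shape of a contract, a decision trace, and an escalation path.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    EXECUTED = "executed"
    ESCALATED = "escalated"   # degraded gracefully to a human
    REJECTED = "rejected"     # failed the contract's preconditions


@dataclass
class TransferRequest:
    # Explicit contract: inputs and limits are declared upfront,
    # so the failure modes below are enumerable, not surprises.
    amount: float
    max_autonomous_amount: float = 100.0


@dataclass
class Decision:
    outcome: Outcome
    # Observable state: every check the agent made, in order,
    # so a human can inspect the decision path, not just the result.
    trace: list = field(default_factory=list)


def run_transfer(req: TransferRequest) -> Decision:
    d = Decision(outcome=Outcome.REJECTED)
    d.trace.append(f"received request for {req.amount}")
    if req.amount <= 0:
        d.trace.append("precondition failed: amount must be positive")
        return d
    if req.amount > req.max_autonomous_amount:
        # Graceful degradation: an out-of-envelope request escalates
        # to a human instead of executing or crashing.
        d.trace.append("amount exceeds autonomous limit; escalating")
        d.outcome = Outcome.ESCALATED
        return d
    d.trace.append("all checks passed; executing")
    d.outcome = Outcome.EXECUTED
    return d
```

In this sketch, `run_transfer(TransferRequest(amount=50.0))` executes, while `amount=5000.0` escalates rather than acting, and in every case the `trace` records why. That record is what separates reviewable automation from fixing things after they break.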