Everyone wants an AI agent now.
But here’s the uncomfortable part:
Most companies don’t need “an agent.”
They need a workflow fixed.
OpenAI’s new Deployment Company is literally built around embedding AI specialists inside companies to find high-impact AI use cases, not randomly throwing agents at everything.
That says a lot.
Before building an AI agent, ask:
What task should disappear?
What decision should speed up?
What system should it connect to?
What mistake would be expensive?
If you’re thinking beyond demos, start here.
Question: Are companies solving problems with agents, or just rebranding automation?
Completely agree. The expensive part is when a broken workflow gets an agent attached to it, because now the same ambiguity happens faster and for longer. The question I keep coming back to is simple: what new evidence does the system have before another retry? If nobody can answer that, it is not really an agent problem. It is an operating problem.
The trap is usually not "agents are too autonomous" in the abstract. It's that they can keep looking busy after the state stopped changing.
A few runtime guardrails have mattered a lot for us:
completed, blocked, or budget_exhausted so humans can trust the stop
That turns the system from "hope the agent knows when to quit" into something you can actually operate.
We've been building MartinLoop around that exact control layer, but the core idea is independent of the product: no silent retries, no invisible spending, no stop condition that disappears when the terminal does.
What keeps biting teams isn't that the agent is weak. It's that the loop keeps going after the useful evidence is gone. Once you're on retry 6 with no new proof, you're not exploring anymore, you're just buying uncertainty. The control that matters most is boring: before another retry, show what changed. If nothing changed, stop.
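To make "before another retry, show what changed" concrete, here is a minimal sketch of an evidence-gated retry loop. It is an illustration only, not MartinLoop's implementation: `run_with_evidence_gate`, `fingerprint`, and the shape of the `attempt` callback are hypothetical names, and the status strings are borrowed from the ones mentioned earlier in this thread. It assumes each attempt can return some text describing what it observed (an error message, a diff, a test summary).

```python
import hashlib
from typing import Callable


def fingerprint(evidence: str) -> str:
    """Hash the evidence so repeated observations can be compared cheaply."""
    return hashlib.sha256(evidence.encode("utf-8")).hexdigest()


def run_with_evidence_gate(
    attempt: Callable[[int], tuple[bool, str]],  # hypothetical: returns (done, evidence)
    max_attempts: int = 6,
) -> dict:
    """Retry only while each failed attempt produces evidence not seen before."""
    seen: set[str] = set()
    for n in range(1, max_attempts + 1):
        done, evidence = attempt(n)
        if done:
            return {"status": "completed", "attempts": n}
        fp = fingerprint(evidence)
        if fp in seen:
            # Nothing changed since an earlier attempt: stop instead of retrying.
            return {"status": "blocked", "attempts": n, "reason": "no new evidence"}
        seen.add(fp)
    return {"status": "budget_exhausted", "attempts": max_attempts}


if __name__ == "__main__":
    # An attempt that keeps failing the same way is stopped on the second try.
    print(run_with_evidence_gate(lambda n: (False, "TimeoutError: upstream")))
```

Exact-match hashing is the crudest possible definition of "nothing changed". A real system would probably normalize timestamps and request IDs out of the evidence before comparing, but the point is that the stop rule is explicit and checkable.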
The trap is assuming better agents fix bad operating boundaries. In practice the expensive failures usually come from missing stop conditions, weak verification, and no record of what the agent actually did.
You hit the nail on the head. Most 'agents' I see today are honestly just Automation 2.0 in a trench coat. Companies are definitely rebranding old-school automation because 'Agent' sounds better in a pitch deck. But like you said, if the underlying workflow is broken, an agent just makes the mistakes happen faster. OpenAI’s strategy of embedding specialists proves that the real work isn't 'building the bot'—it’s the deep-dive into the company's mess to see where an agent can actually have autonomy. If it’s not making a decision or connecting disconnected systems, it’s just a fancy script, not an agent.
Keesan
The trap is usually not intelligence, it's missing operating rules.
If a run doesn't know when to stop, what proof counts as progress, or how to classify repeated failure, the system can look smart right up until it gets expensive.
The boring controls end up carrying a lot of weight: spend cap, retry dedupe, verifier gate, receipt after the run. That's been the more durable lesson for us building MartinLoop.
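For anyone who wants to see how those four controls might fit together, here is a rough sketch under stated assumptions. None of these names come from MartinLoop or any real library: `Receipt`, `run`, `execute`, and `verify` are hypothetical, and the sketch assumes the cost of a step is only known after it runs, so the cap can be overshot by at most one step.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Receipt:
    """Record of what the run did, kept whether or not it succeeded."""
    status: str = "blocked"
    spent: float = 0.0
    steps: list = field(default_factory=list)


def run(
    actions: Iterable[str],
    execute: Callable[[str], tuple[float, str]],  # hypothetical: returns (cost, output)
    verify: Callable[[str], bool],                # hypothetical verifier gate
    spend_cap: float,
) -> Receipt:
    receipt = Receipt()
    tried: set[str] = set()
    for action in actions:
        # Spend cap: checked before each step, since cost is only known afterwards.
        if receipt.spent >= spend_cap:
            receipt.status = "budget_exhausted"
            return receipt
        # Retry dedupe: never pay twice for an identical action.
        if action in tried:
            receipt.steps.append((action, "skipped_duplicate"))
            continue
        tried.add(action)
        cost, output = execute(action)
        receipt.spent += cost
        # Verifier gate: only verified output counts as progress.
        if verify(output):
            receipt.steps.append((action, "verified"))
            receipt.status = "completed"
            return receipt
        receipt.steps.append((action, "failed_verification"))
    # Ran out of untried actions without verified output.
    return receipt


if __name__ == "__main__":
    r = run(["a", "a", "b"], lambda a: (1.0, a.upper()), lambda out: out == "B", spend_cap=5.0)
    print(r)
```

The receipt is returned on every exit path, so a human reading it afterwards can see what was tried, what was skipped, what it cost, and why the run stopped.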