Operational readiness failures, almost every time. The AI part is honestly the easy part now. What kills most deployments is the gap between a working demo and a reliable production system. Who monitors the agent and catches it when it makes a bad decision? How do you handle the 5% of edge cases the model gets confidently wrong? What happens when an API the agent depends on changes?
I've watched small teams get burned by this exact pattern. They build an impressive POC in a week, then spend three months trying to make it production-grade. The teams that succeed usually start with a narrower scope and a human in the loop for anything consequential.
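For what it's worth, here's a rough sketch of what "human in the loop for anything consequential" can look like in code. Everything in it is made up for illustration: the `Action` type, the `CONSEQUENTIAL` set, the `route` function, and the 0.9 threshold are assumptions, not from any particular framework, and it assumes your agent exposes some kind of confidence score.

```python
from dataclasses import dataclass, field

# Hypothetical: action kinds the agent is never allowed to execute on its own.
CONSEQUENTIAL = {"refund", "delete_account", "send_email"}

# Hypothetical threshold; tune it against your own logged outcomes.
MIN_CONFIDENCE = 0.9


@dataclass
class Action:
    kind: str
    confidence: float  # model-reported score in [0, 1]; assumed to be available
    payload: dict = field(default_factory=dict)


def route(action: Action) -> str:
    """Return "auto" to execute directly, or "review" to queue for a human."""
    # Consequential actions always go to a human, no matter how confident
    # the model claims to be.
    if action.kind in CONSEQUENTIAL:
        return "review"
    # Low-confidence actions also go to a human.
    if action.confidence < MIN_CONFIDENCE:
        return "review"
    return "auto"


if __name__ == "__main__":
    for a in (Action("refund", 0.99), Action("tag_ticket", 0.95), Action("tag_ticket", 0.5)):
        print(a.kind, a.confidence, route(a))
```

The reason the consequential check comes first and ignores confidence is the "confidently wrong" problem: a high score doesn't protect you there, so the gate has to be on the kind of action, not only on how sure the model says it is.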