I Taught an AI to Attack Another AI. It Won 44% of the Time, With No Backdoor.

What 100 automated battles taught us about why prompt guardrails aren't enough

I built an AI attacker. I gave it one job: break an HR chatbot's rules and get it to approve unauthorized leave. Then I let them fight, 100 times, completely unsupervise...