The Benchmark Is the Vulnerability: How AI Agents Are Being Tested to Attack the Real Web
Apr 13 · 8 min read

Last spring, a research team gave a large language model agent a list of real, unpatched web application vulnerabilities and a sandboxed environment in which to work. The model did not merely identify the flaws. It exploited them, autonomously, end-...