AI Software Engineering benchmark just went from 80% to 23%
Feb 1 · 1 min read

What is SWE-bench? SWE-bench is a widely followed benchmark evaluation framework designed to test AI coding assistants on real software engineering tasks. AI coding assistant benchmarks are supposed to give us clarity. SWE-bench does the opposite. SW...