I benchmarked 4 AI detectors on 1,000 texts. The scariest metric wasn’t accuracy; it was false positives.
I’ve been looking into AI detectors lately, especially because more schools, publishers, and workplaces are treating their output as if it were objective evidence.
So I ran a benchmark on 1,000 English texts:
mattc95.hashnode.dev · 2 min read