Good comparison on the capability side. The security angle is the missing piece in most of these benchmarks: what syscalls does each agent make in a standard session? Gemini CLI, Claude Code, and Codex CLI all have different filesystem and network footprints, meaning which files they touch and which connections they open. That behavioral difference matters a lot if you're running these in a corporate dev environment.
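
For anyone who wants to measure this themselves, here is a rough sketch of how a syscall footprint could be tallied. It assumes you are on Linux and have already captured a trace with something like `strace -f -o trace.log <agent command>`. The function name `summarize_trace` and the two syscall groupings are my own illustrative choices, not an exhaustive or standard classification, and the parser only handles plain lines and pid-prefixed lines (no timestamp options).

```python
import re
from collections import Counter

# Hand-picked groupings (an assumption for illustration, not a complete list).
FILESYSTEM = {
    "open", "openat", "creat", "unlink", "unlinkat", "rename", "renameat",
    "renameat2", "mkdir", "rmdir", "chmod", "stat", "lstat", "newfstatat",
    "statx", "readlink",
}
NETWORK = {
    "socket", "connect", "bind", "listen", "accept", "accept4",
    "sendto", "recvfrom", "sendmsg", "recvmsg",
}

# Matches an optional "[pid N] " or "N " prefix, then the syscall name before "(".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def summarize_trace(lines):
    """Count syscalls per category from strace-style text lines."""
    summary = {"filesystem": Counter(), "network": Counter(), "other": Counter()}
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match is None:
            continue  # signal lines, exit markers, "<... resumed>" fragments
        name = match.group(1)
        if name in FILESYSTEM:
            summary["filesystem"][name] += 1
        elif name in NETWORK:
            summary["network"][name] += 1
        else:
            summary["other"][name] += 1
    return summary


if __name__ == "__main__":
    # "trace.log" is a hypothetical file name matching the strace command above.
    with open("trace.log") as handle:
        for category, counts in summarize_trace(handle).items():
            print(category, dict(counts))
```

Running the same scripted session under each agent and diffing the per-category counts would give a first-pass comparison. Raw counts say nothing about which paths or hosts were involved, so the syscall arguments still need a manual look.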