Abstract This study presents a comprehensive analysis of SWE-agent trajectories comparing Kimi K2 Instruct and Claude Sonnet 4 performance on software engineering tasks from the SWE-bench dataset. Through detailed examination of action category distr...

No responses yet.