LS
I guess one of the reasons for o1 performing better could be its better distribution of training data, especially for reasoning tasks, compared to DeepSeek (as these two are primarily reasoning models). These LLMs mostly approximate their training data distribution; since o1 has better (and more) data, I guess that's why it did well (though inherently none of them can reason the way we do).
Comment · Article · Jan 24, 2025 · Evaluating SotA LLM Models trying to solve a net-new LeetCode style puzzle
