I stopped reading LLM benchmark posts. Here's what I look at instead.
Half my feed this month is "Model X beat Model Y on benchmark Z." I used to read these. I don't anymore. After about three years of paying close attention to LLMs in my actual day job, I think benchma
matthewx999999.hashnode.dev · 3 min read