Biased test of GPT-4 era LLMs (300+ models, DeepSeek-R1 included)
Feb 1, 2025 · 53 min read · Intro From time to time I have been playing with various models that I can run locally (on a 16 GB VRAM GPU), checking their conversational and reasoning capabilities. I don't fully trust public benchmarks, as I've encountered multiple models with great scores on...