Gerard Sansai-cosmos.hashnode.devยทOct 26, 2024Why Reinforcement Learning via Chain-of-Thought Misses the Point: Misguided Optimisations-Driven AI ResearchWhile artificial intelligence continues to make headlines with impressive benchmark scores, a troubling practice has taken root in AI research. Imagine a teacher who, instead of helping students understand the subject matter, simply hands them copies...86 readsAIAdd a thoughtful commentNo comments yetBe the first to start the conversation.