Why Reinforcement Learning via Chain-of-Thought Misses the Point: Misguided Optimisations-Driven AI Research

Gerard Sans

ai-cosmos.hashnode.dev

Oct 26, 2024
