Understanding Reinforcement Learning and Distillation: A Practical C# Example
DeepSeek-R1 has sparked a lot of heated discussion lately. Instead of getting caught up in the debate, I choose to focus on the underlying technology. I wrote an article about it last week called DeepSeek-R1: A Primer, ...
tjgokken.com · 15 min read