Reinforcement Learning Guide
shalin Shah in guide-reinforcement-learning.hashnode.dev · Jul 12, 2025 · 2 min read
What is Reinforcement Learning? Reinforcement Learning is a trial-and-error learning process where an agent learns to make decisions by interacting with an environment.
Key Components
• Agent: The learner/decision-maker
• Environment: The world agent...

Self-Rewarding Language Models
shalin Shah in srlm.hashnode.dev · Jul 11, 2025 · 6 min read
Self-Rewarding Language Models introduces an iterative training method where a single language model learns to both generate and evaluate its own responses, leading to improvements in instruction following and self-assessment capabilities. This appro...

What is Reinforcement Learning?
shalin Shah in reinforcement-learning.hashnode.dev · Jun 30, 2025 · 5 min read
Reinforcement Learning is a trial-and-error learning process where an agent learns to make decisions by interacting with an environment. It is neither supervised nor unsupervised learning, but rather a third paradigm of learning. Key Components of RL...