07 - Reinforcement Learning: Q-Learning, DQN, and PPO
How agents learn from rewards: Markov Decision Processes, Q-learning, Deep Q-Networks (DQN), policy gradients, and Proximal Policy Optimization (PPO). Applications: games, robotics, trading. Implementation with Gymnasium (the maintained successor to OpenAI Gym). Demo: training age...
federicocalo.hashnode.dev · 1 min read
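
To make the Q-learning idea in the summary concrete, here is a minimal sketch of the tabular update rule, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. It is not the article's own demo: the environment is a hypothetical 1-D corridor written in plain Python (no Gymnasium dependency), and the function name `train_q_learning`, the corridor layout, and all hyperparameter values are assumptions chosen only for illustration.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    States are 0..n_states-1. Action 0 moves left (staying put at state 0),
    action 1 moves right. Entering the rightmost state yields reward 1 and
    ends the episode; every other transition yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # The terminal state has no future value, so bootstrap from 0 there.
            best_next = 0.0 if next_state == goal else max(q[next_state])

            # Temporal-difference update toward reward + discounted best next value.
            td_target = reward + gamma * best_next
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


q_table = train_q_learning()
# Greedy action for each non-terminal state (0 = left, 1 = right).
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
```

With these assumed settings the greedy policy picks action 1 (move right) in every non-terminal state. DQN follows the same update target but replaces the table `q` with a neural network, which is what makes it usable on larger state spaces.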