The Reward Signal you live by
I first heard of the contextual bandit algorithm a couple of years ago as an undergrad, but I never gave it much thought. Recently, I started working on reinforcement learning for the thrill of picking up my HRI research again and shaking off the dus...
samueladebayo.com · 2 min read