Introduction to Reinforcement Learning

Reinforcement Learning (RL) has been an active research area over the past two decades. It refers to the learning paradigm in which an intelligent agent learns an optimal policy for taking actions by interacting with its environment. In this post, I summarize some basics of RL and provide an overview of classical approaches to RL.

The main mathematical formulation of RL (though not the only one) is based on the Markov Decision Process (MDP). At time $t$, the agent observes the state $s_{t}$ and takes an action $a_{t}$ in the environment. The agent then observes the reward $r_{t}$ and the next state $s_{t + 1}$. Through repeated interactions of this kind, the agent gradually learns an optimal policy $π^{*}$ for selecting actions in different states.
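
To make the interaction loop concrete, here is a minimal sketch in Python. The `ChainEnv` environment, its `reset`/`step` interface, and the `run_episode` and `random_policy` helpers are all hypothetical names made up for illustration; they are not taken from any particular library. The sketch only shows how the tuples $(s_{t}, a_{t}, r_{t}, s_{t + 1})$ are produced, not how a policy is learned from them.

```python
import random


class ChainEnv:
    """Hypothetical toy MDP: states 0..n-1 on a line; reaching the last state pays reward 1."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        """Start a new episode at the leftmost state and return it."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action (0 = left, 1 = right) and return (next_state, reward, done)."""
        move = 1 if action == 1 else -1
        # Clip the move so the agent stays on the chain.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return the list of (s_t, a_t, r_t, s_{t+1}) transitions."""
    transitions = []
    s = env.reset()
    for _ in range(max_steps):
        a = policy(s)                  # the agent picks a_t given s_t
        s_next, r, done = env.step(a)  # the environment returns r_t and s_{t+1}
        transitions.append((s, a, r, s_next))
        s = s_next
        if done:
            break
    return transitions


def random_policy(state):
    """Placeholder policy that ignores the state and picks an action uniformly at random."""
    return random.choice([0, 1])


if __name__ == "__main__":
    episode = run_episode(ChainEnv(), random_policy)
    print(len(episode), "steps, total reward:", sum(r for _, _, r, _ in episode))
```

A learning algorithm would consume these transitions to improve the policy over time; on this toy chain, the optimal policy is simply to always move right.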

More details can be found in the RL slides.