Reinforcement Learning: Temporal Difference Learning
Learn the idea most central to reinforcement learning algorithms
Imagine you are traveling home from work and trying to predict how long the trip will take. As you leave work, you consider the time of day, traffic, weather, and so on, and you keep updating your prediction of when you will reach home. Each time you revise the estimate this way, without waiting until you actually arrive, you are using temporal difference.
Here you will learn how temporal difference learning, an idea identified as central and novel to reinforcement learning, is used to make predictions online.
Reinforcement learning algorithms are based on how organisms learn from experience to anticipate future rewards correctly. Temporal difference resembles the behavior of dopamine neurons, which encode the difference between the reward received and the reward that was expected.
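To make the "received versus expected" difference concrete, here is a minimal Python sketch. The function names, the step size of 0.5, and the reward values are assumptions chosen for illustration and do not come from any library. It is also a simplification: it tracks a single expectation and leaves out the estimate of the next state's value that full temporal difference learning adds to the target.

```python
# Illustrative sketch only: names, step size, and rewards are hypothetical.

def prediction_error(reward, expected_reward):
    """Difference between the reward received and the reward expected."""
    return reward - expected_reward


def update_expectation(expected_reward, reward, step_size=0.5):
    """Move the expectation a fraction of the way toward the observed reward."""
    return expected_reward + step_size * prediction_error(reward, expected_reward)


expectation = 0.0  # start out expecting no reward
for reward in [1.0, 1.0, 1.0, 1.0]:  # the same reward arrives four times
    error = prediction_error(reward, expectation)
    expectation = update_expectation(expectation, reward)
    print(f"error={error:.4f}  new expectation={expectation:.4f}")
```

The error shrinks on every pass (1.0, 0.5, 0.25, 0.125) as the expectation approaches the reward that keeps arriving. In other words, a fully anticipated reward produces almost no surprise.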
Good to Know:
Essential elements of Reinforcement Learning
Reinforcement Learning: Monte Carlo Method
Reinforcement Learning: On Policy and Off Policy
In reinforcement learning, the learner or decision maker, called the Agent, interacts continually with its Environment by performing actions sequentially, one at each discrete time step. Interaction of the Agent…