Reinforcement Learning: Temporal Difference Learning

Learn the idea most central to reinforcement learning algorithms

Renu Khandelwal
6 min read · Oct 3, 2022

Imagine you are traveling home from work and trying to predict how long the trip will take. As you leave work, you consider the time of day, traffic, and weather, and you keep revising your predicted arrival time as conditions change along the way. Each time you revise the prediction based on what you have just observed, rather than waiting until you actually arrive, you are using temporal difference learning.

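Here you will learn how temporal difference (TD) learning, often described as the one idea most central and novel to reinforcement learning, is used to make predictions online, that is, to update them at every step instead of waiting for the final outcome.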

Reinforcement learning algorithms are based on how organisms learn from experience to anticipate future rewards correctly. The temporal difference error resembles the activity of dopamine neurons, which appear to encode the difference between the reward actually received and the reward that was expected.
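To make the "received versus expected" idea concrete, here is a minimal sketch of the one-step TD update, usually called TD(0), applied to the commute from the opening. The state names, the minutes per leg, the step size `alpha`, and the helper names `td0_update` and `run_commutes` are all assumptions made up for illustration. The "reward" here is simply the minutes spent on each leg, so the learned values are predicted minutes remaining.

```python
def td0_update(values, state, reward, next_state, alpha=0.5, gamma=1.0):
    """Apply one TD(0) update in place and return the TD error.

    TD error = (reward + gamma * V(next_state)) - V(state),
    i.e. what was just observed plus the next prediction, minus the old prediction.
    """
    # A terminal transition (next_state is None) has no future value.
    next_value = 0.0 if next_state is None else values[next_state]
    td_error = reward + gamma * next_value - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical commute: (state, minutes spent on this leg, next state).
COMMUTE = [
    ("leave_work", 10.0, "highway"),
    ("highway", 20.0, "home_street"),
    ("home_street", 5.0, None),  # None marks arrival at home
]


def run_commutes(episodes=200, alpha=0.5):
    """Repeat the same commute and update the prediction after every leg."""
    values = {state: 0.0 for state, _, _ in COMMUTE}
    for _ in range(episodes):
        for state, minutes, next_state in COMMUTE:
            td0_update(values, state, minutes, next_state, alpha=alpha)
    return values


values = run_commutes()
print(values)
```

Because this toy commute is deterministic, the estimates settle at the true remaining times: about 35 minutes from work, 25 from the highway, and 5 from the home street. Each prediction is corrected as soon as the next leg is observed, without waiting for the trip to end.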

Good to Know:

Essential elements of Reinforcement Learning

Dynamic Programming

Generalized Policy Iteration

Reinforcement Learning: Monte Carlo Method

Reinforcement Learning: On Policy and Off Policy

Reinforcement learning is where the learner or the decision maker, called the Agent, interacts continually with its Environment by performing actions sequentially at each discrete time step. Interaction of the Agent…
