Essential Elements of Reinforcement Learning

An easy-to-understand explanation of the critical elements of reinforcement learning

Renu Khandelwal

Photo by Oscar Söderlund on Unsplash

If you are trying to learn a new skill like roller skating, you will probably fall several times before mastering it. Every time you manage to keep your balance, you are rewarded with a successful landing. Every time you lose your balance, you receive a negative reward: a fall. The environment where you learn to skate is uncertain, as it could be a walkway, a skating arena, or anywhere else. You explore the environment, and you exploit what you learned from every successful balance to accomplish the task of learning to skate.

Welcome to reinforcement learning.

You are the Agent: the learner and decision maker, interacting continuously with the environment. The environment is the surface on which you skate; it responds to each of your actions with a new state and a reward. The actions, in this case, are the steps you take, the state is the body posture you get into after taking an action, and the rewards from the environment are a successful landing or a fall.

Source: Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
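To make the Agent and environment loop concrete, here is a minimal Python sketch of the skating example. Everything in it is an assumption made for illustration: the class name `SkatingEnvironment`, the two states (`steady`, `recovering`), the two actions (`small_step`, `big_step`), and the chances of keeping balance are all made up, not taken from the book.

```python
import random


class SkatingEnvironment:
    """Toy environment: the skater's posture is 'steady' or 'recovering'."""

    # Hypothetical chance of keeping balance for each (state, action) pair.
    BALANCE_CHANCE = {
        ("steady", "small_step"): 0.9,
        ("steady", "big_step"): 0.6,
        ("recovering", "small_step"): 0.7,
        ("recovering", "big_step"): 0.2,
    }

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = "steady"

    def step(self, action):
        """Respond to an action with the next state and a reward."""
        stays_up = self.rng.random() < self.BALANCE_CHANCE[(self.state, action)]
        if stays_up:
            self.state, reward = "steady", 1       # successful landing
        else:
            self.state, reward = "recovering", -1  # a fall
        return self.state, reward


def run_episode(env, policy, num_steps=10):
    """Let the Agent (the policy) interact with the environment step by step."""
    state = env.state
    history = []
    for _ in range(num_steps):
        action = policy(state)                  # the Agent decides
        next_state, reward = env.step(action)   # the environment responds
        history.append((state, action, reward, next_state))
        state = next_state
    return history


def cautious_policy(state):
    """A fixed policy that always takes a small step, whatever the state."""
    return "small_step"


history = run_episode(SkatingEnvironment(seed=0), cautious_policy)
total_reward = sum(reward for _, _, reward, _ in history)
print("Total reward over 10 steps:", total_reward)
```

The policy here never changes, so nothing is learned yet; the sketch only shows who does what in the loop: the Agent picks an action, and the environment answers with a state and a reward.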

Reinforcement learning is learning how to map situations to actions so as to maximize the long-term reward. The learner is not told which actions to take but must explore the environment and discover, by trial and error, the actions that yield the most reward.

Trial and error and delayed rewards are the two distinguishing features of reinforcement learning.
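One common way to put a number on "long-term reward" is the discounted return: the sum of all future rewards, where a reward received k steps later is multiplied by gamma to the power k. The article does not introduce a discount factor, so `gamma` and the function name `discounted_return` below are assumptions used only to sketch the idea.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting a reward k steps ahead by gamma ** k."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Two falls first, then three successful landings (delayed rewards).
rewards = [-1, -1, 1, 1, 1]

patient = discounted_return(rewards, gamma=1.0)    # future counts fully
impatient = discounted_return(rewards, gamma=0.5)  # future counts less
print(patient, impatient)
```

With `gamma=1.0` the early falls are outweighed by the later landings, while with `gamma=0.5` the later landings are discounted so heavily that the return stays negative. An Agent that cares about the long term has to accept some early falls.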

There is always a trade-off between exploring the environment and exploiting the actions known to yield the maximum reward. The most straightforward choice for an Agent is to select, among the actions it has already tried, the one with the highest estimated reward, referred to as the greedy action. However, the Agent also has to explore the environment by trying other actions so that it can select better actions in the future.
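A simple and widely used way to balance the two is an epsilon-greedy rule: with a small probability `epsilon` the Agent explores by picking a random action, and otherwise it exploits by taking the greedy action. The article does not name this rule, so treat the following as one possible sketch. The three actions and their chances of success are hypothetical, and rewards follow the article's convention of +1 for a landing and -1 for a fall.

```python
import random


def select_action(estimates, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore: any action
    # Exploit: index of the highest estimated reward (first one on ties).
    return max(range(len(estimates)), key=lambda a: estimates[a])


def update_estimate(estimates, counts, action, reward):
    """Keep a running average of the rewards seen for this action."""
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]


def run_bandit(success_chance, epsilon=0.1, num_steps=1000, seed=0):
    """Repeatedly choose an action, observe a reward, and update estimates."""
    rng = random.Random(seed)
    estimates = [0.0] * len(success_chance)
    counts = [0] * len(success_chance)
    for _ in range(num_steps):
        action = select_action(estimates, epsilon, rng)
        # +1 for a successful landing, -1 for a fall.
        reward = 1 if rng.random() < success_chance[action] else -1
        update_estimate(estimates, counts, action, reward)
    return estimates, counts


# Hypothetical chance of a successful landing for three skating moves.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
print("Estimated rewards:", estimates)
print("Times each action was tried:", counts)
```

Setting `epsilon=0` gives a purely greedy Agent that may lock onto the first action that happens to look good, while a larger `epsilon` spends more steps exploring at the cost of taking known good actions less often.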

Reinforcement learning agents are goal-directed agents interacting with different aspects of an uncertain environment.

Renu Khandelwal

A Technology Enthusiast who constantly seeks out new challenges by exploring cutting-edge technologies to make the world a better place!