Deep Q Learning: A Deep Reinforcement Learning Algorithm

An easy-to-understand explanation of Deep Q-Learning with PyTorch code implementation

Renu Khandelwal
11 min read · Jan 12, 2023

Deep Reinforcement Learning

Deep Reinforcement Learning combines Reinforcement Learning algorithms with Artificial Neural Networks. It allows an agent to learn an optimal policy for sequential decision problems by maximizing the cumulative future reward.

Deep RL = RL Algorithm + Artificial Neural Network

Reinforcement Learning is a goal-oriented approach to learning in which an agent interacts with an environment and learns which action to take in each state to maximize the long-term reward.
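To make "long-term reward" concrete, here is a minimal sketch of how a cumulative reward can be computed from a sequence of per-step rewards. The discount factor `gamma` is an assumption that the text above has not introduced: it is a common way to weight near-term rewards more heavily than distant ones, and setting it to 1.0 gives a plain sum. The function name and the reward values are made up for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative reward where the reward at step t is weighted by gamma**t."""
    total = 0.0
    # Walk backwards so each step is: reward now + gamma * (return of the rest).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Illustrative rewards collected over three time steps.
rewards = [1.0, 0.0, 2.0]

print(discounted_return(rewards, gamma=1.0))  # plain sum: 3.0
print(discounted_return(rewards, gamma=0.5))  # 1.0 + 0.5*0.0 + 0.25*2.0 = 1.5
```

With a smaller `gamma`, the reward of 2.0 at the last step contributes less, so the agent is effectively encouraged to prefer rewards that arrive sooner.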

A policy is a mapping from states to actions that defines the behavior of an agent in an environment. An optimal policy is a policy that maximizes the agent's cumulative reward over time.
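In its simplest form, a policy as a mapping from states to actions can be written as a plain dictionary. The sketch below assumes a hypothetical three-cell corridor; the state names, action names, and the `act` helper are invented for illustration and do not come from any library.

```python
# Hypothetical corridor with three cells; the goal is assumed to be the right end.
policy = {
    "left_end": "right",
    "middle": "right",
    "right_end": "stay",
}


def act(policy, state):
    """Return the action the policy prescribes for the given state."""
    return policy[state]


print(act(policy, "middle"))  # right
```

The agent's behavior is fully determined by this table: whenever it finds itself in a state, it looks up the action and takes it.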

The goal of reinforcement learning is to find an optimal policy, that is, one that maps each state of the environment to the best action for the agent to take.

A policy can be represented using a lookup table, a linear function, or a neural network, depending on the size and complexity of the environment's state space and action space. In value-based methods such as Q-learning, the policy is derived by selecting the action with the highest estimated value in each state.
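The idea of picking the highest-valued action in each state can be sketched with a small lookup table of action values. The states, actions, and numbers below are made up for illustration; in Deep Q-Learning a neural network would produce these values instead of a table, but the selection step is assumed to work the same way.

```python
# Hypothetical action values: q_table[state][action] is the estimated value
# of taking that action in that state.
q_table = {
    "s0": {"left": 0.1, "right": 0.6},
    "s1": {"left": 0.8, "right": 0.3},
}


def greedy_action(q_table, state):
    """Pick the action with the highest estimated value in the given state."""
    action_values = q_table[state]
    return max(action_values, key=action_values.get)


# Deriving a policy: one greedy action per state.
greedy_policy = {state: greedy_action(q_table, state) for state in q_table}

print(greedy_policy)  # {'s0': 'right', 's1': 'left'}
```

A table like this only works when the states and actions can be enumerated. When they cannot, a function approximator such as a neural network takes the table's place.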

