
Unlock the secrets of DDPG in Reinforcement Learning

A simple step-by-step explanation of Deep Deterministic Policy Gradient (DDPG) in RL

9 min read, Mar 10, 2023

How would you train a robotic arm to grasp objects, or a robot to walk? Both are continuous control problems, with continuous states and continuous actions.

What is a continuous control problem in RL?

A continuous control problem in RL is one where the agent must choose actions from a continuous action space, which is much more complex than choosing from a discrete action space.

In a discrete action space, the agent selects from a limited number of actions. In a continuous control problem, the agent selects from a much larger, often infinite, set of actions, which makes finding the optimal action a complex problem.
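To make the difference concrete, here is a minimal sketch using only the Python standard library. The action names, the torque bounds, and the value estimates are all hypothetical and chosen only for illustration; they do not come from any particular environment.

```python
import random

# Hypothetical discrete action space: a finite set the agent can enumerate.
DISCRETE_ACTIONS = ["left", "right", "up", "down"]

# Hypothetical continuous action space: a torque bounded in [-2.0, 2.0].
TORQUE_LOW, TORQUE_HIGH = -2.0, 2.0


def best_discrete_action(q_values):
    """Pick the highest-valued action by checking every action once."""
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return DISCRETE_ACTIONS[best_index]


def sample_continuous_action(rng):
    """Draw one real-valued torque from the bounded interval."""
    return rng.uniform(TORQUE_LOW, TORQUE_HIGH)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up value estimates, one per discrete action.
discrete_choice = best_discrete_action([0.1, 0.7, 0.3, 0.2])
continuous_choice = sample_continuous_action(rng)

print(discrete_choice)    # right
print(continuous_choice)  # a float somewhere in [-2.0, 2.0]
```

With four actions, checking each one is trivial. With a real-valued torque there is no finite list to loop over, which is why algorithms for continuous control usually learn a function that outputs the action directly.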

Examples of continuous control problems are

Robotics,
Autonomous Driving, and
Finance.

What are the RL algorithms that solve continuous control problems?

The RL algorithms that solve continuous control problems are commonly based on policy gradients, where the agent learns a policy that maps states directly to actions. Popular examples are

Deep Deterministic Policy Gradient (DDPG),
Proximal Policy Optimization (PPO),
Trust Region Policy Optimization (TRPO), and
Soft Actor-Critic (SAC).

This article will explore DDPG.

Deep Deterministic Policy Gradient (DDPG) is a model-free, off-policy deep reinforcement learning algorithm. It is inspired by Deep Q-Network (DQN) and is based on the Actor-Critic approach using Policy Gradient.

Let’s understand each of the terms that make up DDPG.

What does the term deterministic policy mean in DDPG?

The term deterministic means that there is no randomness or variability in a system's output: the same input always produces the same output. This contrasts with a stochastic policy, which samples its action from a probability distribution.
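The contrast can be sketched in a few lines of Python. This is an illustration only, not DDPG itself: the function names, the `tanh` mapping, the `weight` parameter, and the Gaussian noise level `std` are all assumptions made for the example.

```python
import math
import random


def deterministic_policy(state, weight=0.5):
    """Map a state to exactly one action; no sampling is involved."""
    return math.tanh(weight * state)


def stochastic_policy(state, rng, weight=0.5, std=0.2):
    """Sample an action from a Gaussian centred on a state-dependent mean."""
    mean = math.tanh(weight * state)
    return rng.gauss(mean, std)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
state = 1.0

# Query each policy three times with the same state.
det_actions = [deterministic_policy(state) for _ in range(3)]
sto_actions = [stochastic_policy(state, rng) for _ in range(3)]

print(det_actions)  # three identical values
print(sto_actions)  # three different values scattered around the same mean
```

Calling the deterministic policy repeatedly with the same state returns the same action every time, while the stochastic policy returns a fresh sample on each call.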

The Deterministic policy means that…

Written by Renu Khandelwal
