An Introduction to Markov Decision Processes
A Markov Decision Process is memoryless: it predicts the next state from the current state alone, not from the states that came before it.
Google’s PageRank, developed by Sergey Brin and Larry Page, models web navigation as a Markov chain, the same building block that underlies a Markov Decision Process (MDP). This makes it one of the most widely used applications of Markov models.
What is an MDP?
A Markov Decision Process (MDP) is a mathematical framework for sequential decision-making and dynamic optimization in a discrete-time stochastic control process.
The Markov property is the memoryless property of a stochastic process: given the current state, the future is independent of the past. It is named after Andrei Markov.
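The memoryless property can be written compactly. The notation here is an assumption, not taken from the text above: $S_t$ denotes the state at time step $t$ and $A_t$ the action taken at that step.

```latex
% Markov property for a state sequence S_0, S_1, ..., S_t:
% conditioning on the full history gives the same distribution
% as conditioning on the current state alone.
P(S_{t+1} = s' \mid S_t = s_t, S_{t-1} = s_{t-1}, \dots, S_0 = s_0)
  = P(S_{t+1} = s' \mid S_t = s_t)

% In an MDP the transition also depends on the chosen action A_t:
P(S_{t+1} = s' \mid S_t = s_t, A_t = a_t, \dots, S_0 = s_0, A_0 = a_0)
  = P(S_{t+1} = s' \mid S_t = s_t, A_t = a_t)
```

In words, once the current state (and, in an MDP, the current action) is known, the earlier history adds no further information about the next state.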
Components of an MDP
The learner or decision maker, called the Agent, interacts continually with its Environment by performing one action at each discrete time step. Each action changes the Environment's state, and in response the Agent receives a numerical reward from the Environment.
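The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the two-state environment, its action names, the transition probabilities, and the rewards are all hypothetical values invented for the example, and the Agent simply picks actions at random.

```python
import random

# Hypothetical two-state environment, invented for illustration only.
# TRANSITIONS[state][action] -> list of (probability, next_state, reward)
TRANSITIONS = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "charge": [(0.9, "high", -1.0), (0.1, "low", -1.0)],
    },
    "high": {
        "wait": [(1.0, "high", 0.0)],
        "work": [(0.7, "high", 2.0), (0.3, "low", 2.0)],
    },
}


def step(state, action, rng):
    """Environment: sample the next state and reward for (state, action)."""
    outcomes = TRANSITIONS[state][action]
    weights = [prob for prob, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def random_policy(state, rng):
    """Agent: choose uniformly among the actions available in this state."""
    return rng.choice(sorted(TRANSITIONS[state]))


def run_episode(num_steps, seed=0, start_state="low"):
    """Run the Agent-Environment loop for a fixed number of time steps."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    state = start_state
    total_reward = 0.0
    history = []
    for t in range(num_steps):
        action = random_policy(state, rng)             # Agent acts
        next_state, reward = step(state, action, rng)  # Environment responds
        history.append((t, state, action, reward, next_state))
        total_reward += reward
        state = next_state                             # only the current state is carried forward
    return history, total_reward


if __name__ == "__main__":
    history, total = run_episode(num_steps=5, seed=1)
    for t, state, action, reward, next_state in history:
        print(f"t={t}: state={state}, action={action}, reward={reward}, next={next_state}")
    print("total reward:", total)
```

Note that `step` receives only the current state and action, never the history, which is how the sketch reflects the memoryless assumption. A learning Agent would replace `random_policy` with a rule that uses past rewards to prefer better actions.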