Markov Decision Process

Creator: Seonglae Cho
Created: 2023 Mar 5 6:47
Edited: 2024 Dec 11 16:06
Refs

MDP, Stochastic
Finite State Automata

  • Random process (stochastic process)
  • A collection of random variables indexed by time (or some other index set)
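The bullets above can be sketched in code: a hypothetical two-state weather chain (the states and probabilities are made up for illustration), sampled as a collection of random variables indexed by time.

```python
import random

# A minimal sketch of a Markov process: a hypothetical two-state
# weather chain where the next state depends only on the current one.
TRANSITIONS = {
    "sunny": [("sunny", 0.8), ("rainy", 0.2)],
    "rainy": [("sunny", 0.4), ("rainy", 0.6)],
}

def sample_chain(start, steps, rng=random):
    """Sample a trajectory: random variables X_0, X_1, ... indexed by time."""
    state, path = start, [start]
    for _ in range(steps):
        states, probs = zip(*TRANSITIONS[state])
        state = rng.choices(states, weights=probs)[0]
        path.append(state)
    return path
```

Each call returns one realization of the process; the Markov property is visible in that the sampling distribution depends only on the current `state`.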
 

1. Markov Process

 
New twist: the agent does not know R and T (unlike a traditional MDP) → (Step 1)
→ It must actually try out actions and states to learn them
  • A set of states s ∈ S
  • A set of actions A (per state)
  • A transition model T(s, a, s′) (probability)
  • A reward function R(s, a, s′)
Bellman equation: with an action-value function, we do not need to know the environment model
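A minimal sketch of that model-free idea, assuming a tabular Q-learning update (the learning rate, discount factor, and state/action names here are illustrative): the agent never queries T or R directly, only observed transitions (s, a, r, s′).

```python
from collections import defaultdict

# Model-free learning with an action-value function: tabular Q-learning.
# T(s,a,s') and R(s,a,s') are never used directly; the agent only sees
# sampled transitions (s, a, r, s').
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (assumed values)
Q = defaultdict(float)    # Q[(state, action)] -> estimated action value

def q_update(s, a, r, s_next, actions):
    """One Bellman backup from a single experienced transition."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
```

Repeatedly applying `q_update` to experienced transitions drives Q toward the Bellman optimality fixed point without ever building a model of the environment.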
 
 

2. Markov Reward Process (1. with reward)

An MDP (Markov Decision Process) is a tuple of states, actions, a state transition probability matrix (the probability of moving from one state to another under a given action), a reward function, and a discount factor.
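The tuple above can be sketched for the reward-only case, a Markov reward process: a hypothetical two-state chain (all numbers illustrative) whose values are found by iterating the Bellman equation V(s) = R(s) + γ Σ_{s′} P(s, s′) V(s′).

```python
# A Markov reward process: states, transition matrix P, reward function R,
# and discount factor gamma. The two-state example numbers are hypothetical.
STATES = ["s0", "s1"]
P = {"s0": {"s0": 0.5, "s1": 0.5},
     "s1": {"s0": 0.2, "s1": 0.8}}
R = {"s0": 1.0, "s1": 0.0}
GAMMA = 0.9

def evaluate(iters=1000):
    """Iterate the Bellman equation V(s) = R(s) + gamma * sum_s' P(s,s') V(s')."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: R[s] + GAMMA * sum(P[s][t] * V[t] for t in STATES)
             for s in STATES}
    return V
```

Because γ < 1 the iteration is a contraction, so it converges to the unique solution of the Bellman equation regardless of the starting values.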
 
 

3. Markov Decision Process (2. with action)

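With actions added, value iteration replaces the plain expectation with a max over actions at each state. A minimal sketch, with hypothetical states, actions, and numbers:

```python
# Value iteration for an MDP: like the MRP case but with a max over actions.
# T[s][a] maps next states to probabilities; rewards R[s][a] depend on the
# chosen action. All numbers are hypothetical.
GAMMA = 0.9
T = {"s0": {"stay": {"s0": 1.0},
            "go":   {"s0": 0.3, "s1": 0.7}},
     "s1": {"stay": {"s1": 1.0},
            "go":   {"s0": 0.6, "s1": 0.4}}}
R = {"s0": {"stay": 0.0, "go": 1.0},
     "s1": {"stay": 2.0, "go": 0.0}}

def value_iteration(iters=1000):
    """Iterate V(s) = max_a [ R(s,a) + gamma * sum_s' T(s,a,s') V(s') ]."""
    V = {s: 0.0 for s in T}
    for _ in range(iters):
        V = {s: max(R[s][a] + GAMMA * sum(p * V[t] for t, p in T[s][a].items())
                    for a in T[s])
             for s in T}
    return V
```

In this toy example the agent in s1 keeps choosing "stay" for reward 2 per step, so V(s1) converges to 2 / (1 − γ) = 20; a greedy policy can then be read off the converged values.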
 
 
 
 
 
 
