Tabular Ergodic MDP

Creator
Seonglae Cho
Created
2025 May 25 1:0
Editor
Edited
2025 May 26 1:48
Refs
An MDP with finite state and action sets, so that values can be stored in a table, in which every state-action pair is visited infinitely often as the process runs indefinitely.
Tabular means that no function approximation is used: the value for each state-action pair is explicitly stored and updated in a table (a matrix or array).
Being tabular means each Q-value is stored and updated in its own cell, and being ergodic provides sufficient coverage of state-action pairs for Off-policy learning. Similarly, Temporal Difference updates can be applied immediately after each step, without waiting for the episode to complete, because the State-value function already represents the expected return over all possible continuations.
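As a minimal sketch of these ideas, the snippet below runs tabular Q-learning on a toy 3-state, 2-action MDP. The dynamics in `step`, the constants `GAMMA` and `ALPHA`, and the uniform random behavior policy are all assumptions made for illustration, not part of the original note. The Q-table has one cell per state-action pair, each update happens immediately after a single step, and the `visits` counter shows that the random behavior policy keeps covering every pair.

```python
import random

# Hypothetical toy MDP, chosen only for illustration.
N_STATES, N_ACTIONS = 3, 2
GAMMA, ALPHA = 0.9, 0.1  # assumed discount factor and step size


def step(state, action, rng):
    """Assumed dynamics: action 0 stays put with probability 0.5,
    otherwise the agent moves to the next state on a ring.
    Reward is 1 when the agent lands in state 0, else 0."""
    if action == 0 and rng.random() < 0.5:
        next_state = state
    else:
        next_state = (state + 1) % N_STATES
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, reward


def q_learning(n_steps=20000, seed=0):
    rng = random.Random(seed)
    # Tabular: one separately stored cell per (state, action) pair.
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    visits = [[0] * N_ACTIONS for _ in range(N_STATES)]
    state = 0
    for _ in range(n_steps):
        # Off-policy: behave uniformly at random, learn about the greedy policy.
        action = rng.randrange(N_ACTIONS)
        next_state, reward = step(state, action, rng)
        # Temporal Difference update, applied right after this single step.
        td_target = reward + GAMMA * max(q[next_state])
        q[state][action] += ALPHA * (td_target - q[state][action])
        visits[state][action] += 1
        state = next_state  # continuing task: no episode boundary needed
    return q, visits


if __name__ == "__main__":
    q, visits = q_learning()
    for s in range(N_STATES):
        print(s, [round(v, 2) for v in q[s]], visits[s])
```

Because the behavior policy picks every action with positive probability and the toy chain can reach every state from every other state, no cell of the table is starved of updates, which is the coverage property the ergodicity assumption is meant to provide.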
