A policy, commonly written π(a|s), is a mapping from a state s to a probability distribution over actions.
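As a concrete illustration, here is a minimal sketch (assuming PyTorch, a discrete action space, and a hypothetical `SoftmaxPolicy` class introduced only for this example) of a parameterized policy that maps a state vector to a categorical distribution over actions:

```python
import torch
import torch.nn as nn

class SoftmaxPolicy(nn.Module):
    """Maps a state vector to a categorical distribution over discrete actions."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.distributions.Categorical:
        logits = self.net(state)        # unnormalized action preferences
        return torch.distributions.Categorical(logits=logits)

# Example: sample an action for a 4-dimensional state.
policy = SoftmaxPolicy(state_dim=4, n_actions=2)
dist = policy(torch.zeros(4))
action = dist.sample()                  # action drawn from pi(a | s)
```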
Policy-based methods learn the policy directly: they parameterize it (for example, with a neural network) and update those parameters, without maintaining a table of values. Value-based methods instead learn a value function and derive the policy from it, for example by acting greedily with respect to the estimated action values. Actor-critic methods combine the two: an actor updates the policy while a critic approximates the value function and guides the actor's updates.
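The sketch below illustrates this division of labor with a one-step actor-critic update. It is a minimal example assuming PyTorch, a discrete action space, and dummy dimensions and hyperparameters (`state_dim`, `n_actions`, `gamma`, the learning rates, and the helper `actor_critic_step` are all chosen here for illustration, not taken from the text):

```python
import torch
import torch.nn as nn

# Illustrative dimensions and hyperparameters.
state_dim, n_actions, gamma = 4, 2, 0.99

actor = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, n_actions))
critic = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def actor_critic_step(state, action, reward, next_state, done):
    """One-step actor-critic update driven by the TD error."""
    value = critic(state).squeeze(-1)
    with torch.no_grad():
        next_value = critic(next_state).squeeze(-1)
        target = reward + gamma * (1.0 - done) * next_value
    td_error = target - value

    # Critic: regress V(s) toward the bootstrapped target.
    critic_loss = td_error.pow(2)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: increase log pi(a|s) in proportion to the TD error,
    # used here as an advantage estimate.
    dist = torch.distributions.Categorical(logits=actor(state))
    actor_loss = -dist.log_prob(action) * td_error.detach()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

# Example transition with dummy values.
s = torch.zeros(state_dim)
a = torch.tensor(1)
actor_critic_step(s, a, reward=torch.tensor(1.0),
                  next_state=torch.ones(state_dim), done=torch.tensor(0.0))
```

The key design point is that the actor never needs explicit value targets of its own: the critic's TD error tells it whether the sampled action turned out better or worse than expected, and the policy parameters are nudged accordingly.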