Hierarchical RL

How to use skills

HRL typically targets long-horizon tasks. In practice it is very hard to get working.

The high-level policy makes high-level decisions and the low-level policy makes low-level decisions. HRL helps solve long-horizon, complex tasks through temporally extended exploration and simplified credit assignment. Many different HRL approaches have been proposed, yet there is no go-to method.
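The split between the two levels can be sketched on a toy problem. This is a minimal illustration, not a specific HRL algorithm: the 1D integer state, the hand-written policies, and the names `high_level_policy`, `low_level_policy`, `run_episode`, and `horizon` are all assumptions made for the example. The high-level policy picks a subgoal once every `horizon` steps, and the low-level policy picks a primitive action at every step.

```python
def high_level_policy(state, goal, horizon):
    """Hypothetical high-level policy: choose a subgoal at most `horizon` steps toward the goal."""
    step = max(-horizon, min(horizon, goal - state))
    return state + step


def low_level_policy(state, subgoal):
    """Hypothetical low-level policy: choose a primitive action in {-1, 0, +1}."""
    if subgoal > state:
        return 1
    if subgoal < state:
        return -1
    return 0


def run_episode(start, goal, horizon=5, max_steps=100):
    """Run the two-level loop; return (final_state, primitive_steps, high_level_decisions)."""
    state = start
    steps = 0
    high_level_decisions = 0
    while state != goal and steps < max_steps:
        # One temporally extended decision: a subgoal held for up to `horizon` steps.
        subgoal = high_level_policy(state, goal, horizon)
        high_level_decisions += 1
        for _ in range(horizon):
            if state == subgoal or steps >= max_steps:
                break
            state += low_level_policy(state, subgoal)
            steps += 1
    return state, steps, high_level_decisions


if __name__ == "__main__":
    print(run_episode(start=0, goal=23, horizon=5))
```

In this toy setting, reaching state 23 from state 0 takes 23 primitive steps but only 5 high-level decisions, which is the sense in which the high level reasons over a shorter horizon.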

It is hard to assign credit between the levels, i.e. to tell how much the high-level policy and the low-level policy each contributed to an outcome.

Updating a low-level policy requires updating all the high-level policies as well, which makes the learning dynamics complex.

Could increasing the number of hierarchy levels lead to better results?

High level

A skill dynamics model performs better than single-skill RL.

Transitions between skills have to be learned, since the skills are trained independently.

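The skill chaining problem appears because the state in which one skill ends can be a bad initial state for the next skill, and vice versa: a good initial state for the next skill may not be where the previous skill naturally ends. A transition policy is needed to bring the agent from the ending state of one skill to a good initial state for the next skill.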
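The chaining problem and the role of a transition policy can be illustrated with a toy sketch. Everything here is an assumption made for the example: the 1D state, the `Skill` class with an interval of good initial states, and the hand-written `transition_policy`, which stands in for a policy that would be learned in practice.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    """Hypothetical skill that succeeds only when started in [init_low, init_high]."""
    name: str
    init_low: float
    init_high: float
    end_state: float  # state in which the skill leaves the agent on success

    def can_start(self, state):
        return self.init_low <= state <= self.init_high

    def run(self, state):
        # Return the ending state on success, or None if started from a bad state.
        return self.end_state if self.can_start(state) else None


def transition_policy(state, next_skill, step=0.5, max_steps=100):
    """Hand-written stand-in for a learned transition policy:
    move toward the next skill's good initial states in small steps."""
    for _ in range(max_steps):
        if next_skill.can_start(state):
            break
        target = next_skill.init_low if state < next_skill.init_low else next_skill.init_high
        state += max(-step, min(step, target - state))
    return state


def chain(skills, state, use_transition):
    """Execute skills in sequence; return True if every skill succeeds."""
    for skill in skills:
        if use_transition:
            state = transition_policy(state, skill)
        state = skill.run(state)
        if state is None:
            return False
    return True


if __name__ == "__main__":
    skills = [Skill("reach", 0.0, 1.0, 3.0), Skill("grasp", 5.0, 6.0, 8.0)]
    print(chain(skills, 0.5, use_transition=False))
    print(chain(skills, 0.5, use_transition=True))
```

In this example the first skill ends at state 3.0, which lies outside the second skill's good initial states [5.0, 6.0], so naive chaining fails. With the transition policy inserted between the skills, the chain succeeds.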