Goal classifier

Creator
Seonglae Cho
Created
2024 May 17 3:20
Edited
2024 May 31 3:12

Classifier weakness

The idea is to add the states that the RL policy visits as negative examples for the classifier.
Use final states as success-state (positive) examples → train a binary classifier.
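The recipe above can be sketched as a small logistic-regression classifier. This is only an illustration under stated assumptions: the 2-D toy states, the `GoalClassifier` class, and the learning rate are hypothetical and do not come from the note. The classifier output is treated as a learned success probability.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class GoalClassifier:
    """Hypothetical logistic-regression goal classifier: p(success | state)."""

    def __init__(self, state_dim, lr=0.5):
        self.w = [0.0] * state_dim
        self.b = 0.0
        self.lr = lr

    def predict(self, state):
        z = self.b + sum(wi * si for wi, si in zip(self.w, state))
        return sigmoid(z)

    def update(self, states, labels):
        # One gradient step on the mean binary cross-entropy loss.
        n = len(states)
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for s, y in zip(states, labels):
            err = self.predict(s) - y
            for i, si in enumerate(s):
                grad_w[i] += err * si / n
            grad_b += err / n
        self.w = [wi - self.lr * g for wi, g in zip(self.w, grad_w)]
        self.b -= self.lr * grad_b


rng = random.Random(0)
# Assumed toy data: success (final) states cluster near (1, 1),
# states visited by the RL policy cluster near (0, 0).
positives = [[1 + rng.gauss(0, 0.1), 1 + rng.gauss(0, 0.1)] for _ in range(50)]
negatives = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(50)]

clf = GoalClassifier(state_dim=2)
for _ in range(500):
    clf.update(positives + negatives, [1.0] * 50 + [0.0] * 50)

reward_at_goal = clf.predict([1.0, 1.0])   # high success probability expected
reward_at_start = clf.predict([0.0, 0.0])  # low success probability expected
```

In an actual RL loop the negatives would be refreshed with newly visited states as the policy changes, so the classifier keeps being corrected on the states the policy currently reaches.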

learned classifier

Similar to a
GAN
  • the policy is the generator
  • the goal classifier is the discriminator
VQ-GAN
We want the goal classifier to match the goal-state distribution.
This goal differs slightly from
Behavior Cloning
(which matches the expert state-action distribution).
Typically, positives and negatives are sampled half and half so that the expected label is 0.5.
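The half-and-half sampling can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `sample_balanced_batch`, the toy states, and the replay-buffer size are assumptions made for the example.

```python
import random


def sample_balanced_batch(goal_states, visited_states, batch_size, rng):
    """Return (states, labels) with half positives and half negatives."""
    if batch_size % 2 != 0:
        raise ValueError("batch_size must be even for a 50/50 split")
    half = batch_size // 2
    # Sample with replacement so a small set of goal examples can still
    # fill half of the batch.
    pos = rng.choices(goal_states, k=half)
    neg = rng.choices(visited_states, k=half)
    states = pos + neg
    labels = [1.0] * half + [0.0] * half
    # Shuffle states and labels with the same permutation.
    order = list(range(batch_size))
    rng.shuffle(order)
    return [states[i] for i in order], [labels[i] for i in order]


rng = random.Random(0)
# Assumed toy data: a few goal examples and many policy-visited states.
goal_states = [[1.0, 1.0], [0.9, 1.1]]
visited_states = [[rng.random(), rng.random()] for _ in range(1000)]

states, labels = sample_balanced_batch(goal_states, visited_states, 32, rng)
mean_label = sum(labels) / len(labels)  # 0.5 by construction
```

Balancing per batch keeps the classifier from collapsing to "always negative" when policy-visited states vastly outnumber the goal examples.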
 

general reward classifier

with demonstration trajectories
 
 
 
 
