Advantage-Weighted Regression
Imitate mainly the good transitions: each transition is weighted by how good its action is (its advantage), so better actions contribute more to the imitation objective. One can show that the advantage-weighted objective approximates a KL-constrained objective.
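A minimal sketch of the weighting idea, assuming the commonly used exponentiated form in which each transition's log-likelihood is scaled by exp(A / beta). The names `awr_weights`, `awr_loss`, the temperature `beta`, and the clipping threshold `max_weight` are illustrative assumptions, not taken from these notes; the log-probabilities and advantages are assumed to be computed elsewhere.

```python
import math


def awr_weights(advantages, beta=1.0, max_weight=20.0):
    """Return clipped weights exp(A / beta) for a list of advantages.

    beta (assumed temperature) controls how sharply good actions are favored;
    max_weight (assumed clip) keeps a single transition from dominating.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    # Clip in log space so very large advantages cannot overflow math.exp.
    log_cap = math.log(max_weight)
    return [math.exp(min(a / beta, log_cap)) for a in advantages]


def awr_loss(log_probs, advantages, beta=1.0, max_weight=20.0):
    """Negative advantage-weighted log-likelihood, averaged over the batch.

    log_probs[i] is log pi(a_i | s_i) under the current policy;
    advantages[i] is an estimate of A(s_i, a_i).
    """
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have equal length")
    weights = awr_weights(advantages, beta, max_weight)
    weighted = [w * lp for w, lp in zip(weights, log_probs)]
    return -sum(weighted) / len(weighted)


if __name__ == "__main__":
    # Toy batch: the second action has a higher advantage, so it gets more weight.
    print(awr_weights([-1.0, 0.0, 1.0], beta=1.0))
    print(awr_loss([-0.5, -0.7, -0.2], [-1.0, 0.0, 1.0], beta=1.0))
```

With all advantages equal to zero every weight is 1 and the loss reduces to plain behavior cloning; a smaller `beta` concentrates the weight on the highest-advantage transitions.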