TD learning: “bootstrapped” estimation
Temporal difference target (TD target)
Supervised regression (TD error)
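
To show how these terms fit together, here is a minimal sketch of a single tabular TD(0) value update. It is one standard instance of TD learning, not necessarily the variant the source has in mind. The names `td0_update`, `V`, `alpha`, and `gamma`, and the toy states `"A"` and `"B"`, are assumptions made for illustration. The TD target bootstraps from the current estimate of the next state's value, and the TD error is the residual that a regression step toward that target would shrink.

```python
def td0_update(V, s, r, s_next, done, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update to the value table V in place.

    V      : dict mapping state -> current value estimate (assumed layout)
    s      : state the transition started from
    r      : reward observed on the transition
    s_next : state the transition ended in
    done   : True if s_next is terminal (no bootstrapping past it)
    alpha  : step size (hypothetical default)
    gamma  : discount factor (hypothetical default)
    """
    # Bootstrapped estimate: reuse our own current guess for the next state.
    bootstrap = 0.0 if done else V[s_next]

    # TD target: observed reward plus discounted bootstrapped value.
    td_target = r + gamma * bootstrap

    # TD error: gap between the target and the current prediction.
    td_error = td_target - V[s]

    # Regression-style step: move the prediction a fraction alpha toward the target.
    V[s] += alpha * td_error
    return td_target, td_error


# Toy usage with made-up numbers.
V = {"A": 0.0, "B": 1.0}
target, error = td0_update(V, "A", r=0.5, s_next="B", done=False)
```

The update treats the TD target as a fixed label, which is why it resembles supervised regression, even though the label itself depends on the current estimates.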