Implicit rewards are dense, per-token feedback signals inferred automatically from outcome rewards. Intrinsic rewards originate from an agent's internal drives, whereas implicit rewards emerge from the learning process itself.
Implicit reward
Creator: Seonglae Cho
Created: 2025 Apr 16 12:32
Editor: Seonglae Cho
Edited: 2025 Apr 16 12:37
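
To make the definition at the top of this note concrete, here is a minimal sketch of one common way per-token implicit rewards are computed. It assumes the log-ratio parameterization known from DPO-style implicit reward models, where a model trained only on outcome labels yields a token-level reward of beta * (log pi(y_t) - log pi_ref(y_t)). The note itself does not specify this formula, and the function names, the `beta` value, and the example log-probabilities are all hypothetical.

```python
from typing import List


def implicit_token_rewards(
    policy_logprobs: List[float],
    ref_logprobs: List[float],
    beta: float = 0.1,  # assumed scaling coefficient, not taken from the note
) -> List[float]:
    """Per-token implicit reward as a scaled log-probability ratio.

    Assumption: r_t = beta * (log pi(y_t | y_<t) - log pi_ref(y_t | y_<t)).
    """
    if len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("policy and reference must score the same tokens")
    return [beta * (p - r) for p, r in zip(policy_logprobs, ref_logprobs)]


def sequence_reward(token_rewards: List[float]) -> float:
    """Sum the dense token rewards into a single sequence-level score."""
    return sum(token_rewards)


# Hypothetical log-probabilities for a three-token response.
policy_lp = [-1.0, -0.5, -2.0]
ref_lp = [-1.5, -0.5, -1.0]

rewards = implicit_token_rewards(policy_lp, ref_lp, beta=0.1)
total = sequence_reward(rewards)
```

Under this assumption, a token the trained model prefers more than the reference gets a positive reward, an equally likely token gets zero, and a less likely token gets a negative reward. The sum over tokens recovers a sequence-level score, which is how a dense signal can be tied back to a sparse outcome reward.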
