Dense, per-token feedback signals automatically inferred from outcome rewards. Intrinsic rewards originate from internal drives, whereas implicit rewards emerge from the learning process.
Implicit reward
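This note does not say how the per-token signals are computed. One common formulation, assumed here for illustration, scores each token by a scaled log-likelihood ratio between a model trained only on outcome labels and a fixed reference model. The sketch below shows that arithmetic on precomputed log-probabilities; the function names and the `beta` default are hypothetical.

```python
from typing import List, Sequence


def implicit_token_rewards(
    model_logprobs: Sequence[float],
    ref_logprobs: Sequence[float],
    beta: float = 0.05,  # hypothetical scaling coefficient, not from the note
) -> List[float]:
    """Per-token implicit reward under the assumed log-ratio formulation.

    r_t = beta * (log p_model(y_t | y_<t) - log p_ref(y_t | y_<t))
    """
    if len(model_logprobs) != len(ref_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    return [beta * (lp - lr) for lp, lr in zip(model_logprobs, ref_logprobs)]


def sequence_reward(token_rewards: Sequence[float]) -> float:
    """Sum of token rewards; under this formulation it equals the
    sequence-level (outcome-style) score of the whole response."""
    return sum(token_rewards)


# Usage: two tokens, with log-probs from the trained model and the reference.
rewards = implicit_token_rewards([-1.0, -0.5], [-2.0, -0.5], beta=0.5)
total = sequence_reward(rewards)
```

Because the token rewards sum to the sequence-level score, a model fit against sparse outcome labels yields a dense signal without any per-step annotation, which is the sense in which the reward is "implicit".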
Created: 2025 Apr 16 12:32
Edited: 2025 Apr 16 12:37