Internal mechanistic evaluation, not final-output evaluation

Creator

Created

2024 Nov 18 20:50

Editor

Edited

2024 Nov 22 23:37

Refs

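Analyze the model's internal workings so that evaluation targets the process that produced an answer rather than only the final output. This helps prevent reward errors, such as rewarding a correct answer that was reached by a flawed process.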
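A minimal sketch of the contrast, under stated assumptions: the `Trace` record, the step labels, and both scoring functions are hypothetical stand-ins invented for illustration. A real mechanistic evaluation would inspect activations or circuits rather than labeled steps.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Trace:
    """Hypothetical record of one model run: internal steps plus the final answer."""
    steps: List[str]
    answer: int


def outcome_reward(trace: Trace, target: int) -> float:
    """Outcome-only check: looks at the final answer and nothing else."""
    return 1.0 if trace.answer == target else 0.0


def process_reward(trace: Trace, allowed_steps: Set[str]) -> float:
    """Process check: fraction of internal steps that belong to the allowed set."""
    if not trace.steps:
        return 0.0
    valid = sum(step in allowed_steps for step in trace.steps)
    return valid / len(trace.steps)


# Assumed set of legitimate internal steps (illustrative only).
allowed = {"parse", "add", "verify"}

# Two runs that reach the same answer by different internal routes.
honest = Trace(steps=["parse", "add", "verify"], answer=4)
shortcut = Trace(steps=["parse", "lookup_answer_key"], answer=4)

for name, trace in [("honest", honest), ("shortcut", shortcut)]:
    print(name, outcome_reward(trace, 4), process_reward(trace, allowed))
```

In this toy setup both runs receive the same outcome reward, while the process score separates them. That gap is the kind of reward error the note describes.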

///