AI researcher getting started in interpretability
Reading list
Chris Olah
Neel Nanda
Mechanistic interpretability: the study of reverse engineering neural networks, from their learned weights to human-interpretable algorithms. It is analogous to reverse engineering a compiled program binary back to source code.
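To make the definition concrete, here is a minimal toy sketch of going from weights to an algorithm. Everything in it is an assumption made for illustration: the two-neuron ReLU network, its hand-picked weights, and the names `network`, `hypothesis`, and `hypothesis_matches` are hypothetical and do not come from any trained model or library. Real networks are far larger and rarely this clean.

```python
# Toy illustration only. The weights are hand-picked (hypothetical),
# standing in for weights that a training run might have produced.

def relu(v):
    return max(0.0, v)

W_HIDDEN = [1.0, -1.0]  # hidden_i = relu(W_HIDDEN[i] * x)
W_OUT = [1.0, 1.0]      # y = sum(W_OUT[i] * hidden_i)

def network(x):
    """Forward pass of the tiny network on a scalar input."""
    hidden = [relu(w * x) for w in W_HIDDEN]
    return sum(w * h for w, h in zip(W_OUT, hidden))

def hypothesis(x):
    """Human-readable algorithm read off the weights.

    Neuron 0 passes x when x is positive, neuron 1 passes -x when x is
    negative, and the output adds the two, so the network computes |x|.
    """
    return abs(x)

def hypothesis_matches(inputs, tol=1e-9):
    """Check the proposed algorithm against the network's behavior."""
    return all(abs(network(x) - hypothesis(x)) < tol for x in inputs)

if __name__ == "__main__":
    print(hypothesis_matches([-2.0, -0.5, 0.0, 1.0, 4.0]))
```

The point of the sketch is the workflow rather than the network: inspect the weights, propose an algorithm a human can state, then test that the proposal matches the network's behavior.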