Interpretability: Degree to which a model can be understood in human terms

Interpretable AI Notion: Mechanistic interpretability; LLM Doc Transparency; NLE; CoT

Explainable AI Methods: SHAP; Defection Probe; LIME; BERTViz