Interpretable AI

Creator
Seonglae Cho
Created
2024 May 1 1:17
Edited
2025 Mar 4 15:26

Interpretability

Degree to which a model can be understood in human terms
Model inspection only provides information about the model itself, and the model might not accurately reflect the underlying data.
Interpretability paradigms offer distinct lenses for understanding neural networks. Behavioral interpretability analyzes input-output relations; attributional interpretability quantifies the influence of individual input features; concept-based interpretability identifies the high-level representations that govern behavior; mechanistic interpretability uncovers the causal mechanisms that lead from inputs to outputs.
https://arxiv.org/pdf/2404.14082
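
As a rough illustration of the first two lenses, the sketch below contrasts a behavioral probe with an attributional score on a toy linear model. Everything in it is an assumption made for illustration: the weights, the input, the zero baseline, and the function names `model`, `behavioral_probe`, and `occlusion_attribution` are hypothetical and are not taken from the linked paper. Occlusion is used here as one simple attribution technique among many.

```python
# Toy linear model; the weights, bias, and input are made up for illustration.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = 0.1


def model(x):
    """Return the scalar output of the toy linear model."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def behavioral_probe(x, index, delta):
    """Behavioral view: perturb one input and observe how the output moves."""
    perturbed = list(x)
    perturbed[index] += delta
    return model(perturbed) - model(x)


def occlusion_attribution(x, baseline=0.0):
    """Attributional view: score each feature by how much the output changes
    when that feature is replaced with a baseline value (occlusion)."""
    full_output = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # assumed baseline; other choices are possible
        scores.append(full_output - model(occluded))
    return scores


x = [1.0, 2.0, 4.0]
probe_effect = behavioral_probe(x, index=0, delta=1.0)
attributions = occlusion_attribution(x)
print("output change after perturbing feature 0:", probe_effect)
print("per-feature occlusion scores:", attributions)
```

The behavioral probe treats the model as a black box and only compares outputs, while the occlusion scores assign each input feature a share of one specific prediction. Concept-based and mechanistic methods need access to internal representations, which a toy model this small does not meaningfully have.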
Interpretable AI Notion
 
 
 
Explainable AI Methods
 
 
 
 

Dream

 
 

Recommendations