AI Hallucination Detection

Instead of focusing on short-form QA or external validation, this approach identifies hallucinations at the token level rather than the sentence level. By attaching linear or LoRA probes to the hidden states of models such as Llama, it predicts a hallucination probability for each generated token. The method substantially outperforms existing uncertainty-based baselines such as semantic entropy (0.71). However, detecting reasoning errors beyond entity hallucinations remains challenging.
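Below is a minimal sketch of the idea: pull hidden states from an intermediate layer of a Llama-style model and pass each token's hidden state through a linear probe that outputs a per-token hallucination probability. This is not the authors' implementation; the model checkpoint, probe layer, and Hugging Face transformers usage are illustrative assumptions, and the probe would still need to be trained on token-level hallucination labels before its scores mean anything.

```python
# Sketch: token-level linear probe on hidden states (assumptions noted inline).
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-8B"  # assumed checkpoint; any Llama-style model works
PROBE_LAYER = 16                         # assumed intermediate layer to probe

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, output_hidden_states=True, torch_dtype=torch.float16
)
model.eval()


class LinearTokenProbe(nn.Module):
    """Maps each token's hidden state to a hallucination probability."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) -> (batch, seq_len) in [0, 1]
        return torch.sigmoid(self.classifier(hidden_states)).squeeze(-1)


probe = LinearTokenProbe(model.config.hidden_size)  # untrained; shown for shape/flow only

text = "The Eiffel Tower was completed in 1889 in Paris."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    # hidden_states is a tuple: (embedding layer, layer 1, ..., layer N)
    layer_hidden = outputs.hidden_states[PROBE_LAYER].float()
    token_scores = probe(layer_hidden)  # per-token hallucination probabilities

for token, score in zip(
    tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]),
    token_scores[0].tolist(),
):
    print(f"{token:>12s}  {score:.3f}")
```

A linear head like this is cheap to train and run per token; a LoRA-style probe would instead insert low-rank adapters into the model and read the hallucination signal from their output, trading a little cost for more capacity.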