AI Hallucination Detection

Creator: Seonglae Cho
Created: 2025 Oct 5 0:08
Editor: Seonglae Cho
Edited: 2025 Oct 13 9:14

Instead of focusing on short QA or external validation, this approach detects hallucinations at the token level rather than at the sentence level. By attaching linear or LoRA probes to the hidden states of models such as Llama, it predicts a hallucination probability for each generated token, and it significantly outperforms existing uncertainty-based baselines such as semantic entropy (0.71). However, detecting reasoning errors beyond entity hallucinations remains challenging.
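As a rough illustration of the probing setup above, the sketch below attaches an untrained linear probe to one hidden layer of a Llama checkpoint and scores every token of an input. The model name, probe layer, and probe weights are illustrative assumptions (the note does not specify them); in practice the probe would be trained on token-level hallucination labels, and a LoRA probe would train low-rank adapters to produce the same per-token score.

```python
# Minimal sketch of token-level hallucination probing, assuming a Hugging Face
# Llama-family checkpoint. The linear probe here is untrained (random weights)
# and stands in for a probe fit on token-level hallucination labels.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-8B"  # assumption: any Llama-style model works
PROBE_LAYER = -4                        # assumption: a late intermediate layer

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# Linear probe: a logistic-regression head over one hidden layer.
probe = torch.nn.Linear(model.config.hidden_size, 1)


@torch.no_grad()
def token_hallucination_scores(text: str) -> list[tuple[str, float]]:
    """Return (token, predicted hallucination probability) pairs for `text`."""
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model(**inputs, output_hidden_states=True)
    # outputs.hidden_states is a tuple of (batch, seq_len, hidden_size), one per layer.
    hidden = outputs.hidden_states[PROBE_LAYER][0].float()
    probs = torch.sigmoid(probe(hidden)).squeeze(-1)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return list(zip(tokens, probs.tolist()))


if __name__ == "__main__":
    for token, p in token_hallucination_scores("The Eiffel Tower was completed in 1889."):
        print(f"{token:>15s}  {p:.3f}")
```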
