Factuality benchmarks
- reference-free factuality benchmark
- reference-based factuality benchmark
Hallucination benchmarks
There are several types of hallucination benchmarks.
https://arxiv.org/pdf/2410.22071
vectara/hallucination_evaluation_model · Hugging Face
https://huggingface.co/vectara/hallucination_evaluation_model
Leaderboard
https://huggingface.co/spaces/vectara/Hallucination-evaluation-leaderboard
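A leaderboard of this kind ranks models by how often their outputs are judged inconsistent with the source text. The sketch below shows one way such a rate could be computed from per-example consistency scores. The function names, the 0.5 threshold, and the convention that higher scores mean "more consistent" are assumptions made for illustration, not the leaderboard's documented procedure. The loader follows the usage pattern I understand the model card to describe (`trust_remote_code=True`, then `model.predict`); treat it as unverified.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source text, generated text)
ScoreFn = Callable[[Sequence[Pair]], Iterable[float]]


def hallucination_rate(
    pairs: Sequence[Pair],
    score_fn: ScoreFn,
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> float:
    """Return the fraction of pairs whose consistency score is below threshold."""
    scores: List[float] = [float(s) for s in score_fn(pairs)]
    if not scores:
        raise ValueError("score_fn returned no scores")
    flagged = sum(1 for s in scores if s < threshold)
    return flagged / len(scores)


def load_hhem_scorer() -> ScoreFn:
    """Hypothetical helper: wrap the Vectara model as a scoring function.

    Assumes the checkpoint exposes a custom `predict` method when loaded
    with trust_remote_code=True. Imported lazily so the rest of this
    sketch runs without transformers installed.
    """
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(
        "vectara/hallucination_evaluation_model", trust_remote_code=True
    )
    return lambda pairs: [float(s) for s in model.predict(list(pairs))]
```

Passing the scorer in as an argument keeps the rate computation independent of any particular judge model, so the same function works with a stub in tests or with a different classifier.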
STS (semantic textual similarity) models usable as judges
dleemiller/ModernCE-base-sts · Hugging Face
https://huggingface.co/dleemiller/ModernCE-base-sts
cross-encoder/stsb-roberta-large · Hugging Face
https://huggingface.co/cross-encoder/stsb-roberta-large
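As a rough illustration of using an STS cross-encoder as a judge, the sketch below scores each generated answer against a reference answer and accepts it when the similarity clears a threshold. The helper names and the 0.8 threshold are hypothetical choices for this example. The loader relies on `CrossEncoder` and its `predict` method from the `sentence-transformers` library and is imported lazily so the judging logic can be tried with any scoring function.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (reference answer, candidate answer)
ScoreFn = Callable[[Sequence[Pair]], Iterable[float]]


def judge_by_similarity(
    candidates: Sequence[str],
    references: Sequence[str],
    score_fn: ScoreFn,
    threshold: float = 0.8,  # assumed cut-off; tune on labeled data
) -> List[Tuple[float, bool]]:
    """Return (similarity, accepted) for each candidate/reference pair."""
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have equal length")
    pairs = list(zip(references, candidates))
    scores = [float(s) for s in score_fn(pairs)]
    return [(s, s >= threshold) for s in scores]


def load_sts_scorer(model_name: str = "cross-encoder/stsb-roberta-large") -> ScoreFn:
    """Hypothetical helper: load an STS cross-encoder as a scoring function."""
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return model.predict
```

Calling `judge_by_similarity(answers, gold, load_sts_scorer())` would download the model on first use. Swapping in `"dleemiller/ModernCE-base-sts"` as `model_name` should work the same way, assuming that checkpoint loads through `CrossEncoder`.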

Seonglae Cho