Hallucination Benchmark

Creator

Seonglae Cho

Created

2023 Nov 16 2:18

Editor

Seonglae Cho

Edited

2025 Aug 12 22:57

Refs

Factuality benchmarks

reference-free factuality benchmark

reference-based factuality benchmark

Hallucination Benchmarks

OpenAI SimpleQA

Phare benchmark

FACTS Grounding

several types

https://arxiv.org/pdf/2410.22071

vectara/hallucination_evaluation_model · Hugging Face

https://huggingface.co/vectara/hallucination_evaluation_model

Leaderboard

https://huggingface.co/spaces/vectara/Hallucination-evaluation-leaderboard

STS (semantic textual similarity) models to use as a judge

dleemiller/ModernCE-base-sts · Hugging Face

https://huggingface.co/dleemiller/ModernCE-base-sts

cross-encoder/stsb-roberta-large · Hugging Face

https://huggingface.co/cross-encoder/stsb-roberta-large
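
A minimal sketch of how an STS cross-encoder like the ones listed here might be used as a judge: score each (reference, answer) pair and flag answers whose similarity falls below a cutoff. It assumes the `sentence-transformers` package and its `CrossEncoder.predict` method. The helper names `load_sts_scorer` and `flag_low_similarity` and the 0.5 threshold are hypothetical, chosen only for illustration. Low similarity to a reference is only a proxy signal, not proof of hallucination.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (reference text, model answer)
ScoreFn = Callable[[Sequence[Pair]], Sequence[float]]


def load_sts_scorer(model_name: str = "cross-encoder/stsb-roberta-large") -> ScoreFn:
    """Return a function that maps (reference, answer) pairs to similarity scores."""
    # Imported lazily so the thresholding logic below can run without the library.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return lambda pairs: [float(s) for s in model.predict(list(pairs))]


def flag_low_similarity(
    pairs: Sequence[Pair],
    score_fn: ScoreFn,
    threshold: float = 0.5,  # assumption: tune this on labeled data
) -> List[bool]:
    """Return True for each pair whose similarity score is below the threshold."""
    scores = score_fn(pairs)
    return [score < threshold for score in scores]
```

Typical use would be `flag_low_similarity(pairs, load_sts_scorer())`, which downloads the model on first call. Passing the scorer in as an argument keeps the thresholding step easy to check with a stub and makes it simple to swap in another model, such as `dleemiller/ModernCE-base-sts`, assuming it loads through the same `CrossEncoder` interface.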
