LLM-based evaluation
google/FACTS-grounding-public · Datasets at Hugging Face
https://huggingface.co/datasets/google/FACTS-grounding-public
FACTS Grounding: A new benchmark for evaluating the factuality of large language models
Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in provided source material and avoid hallucinations.
https://deepmind.google/discover/blog/facts-grounding-a-new-benchmark-for-evaluating-the-factuality-of-large-language-models/

Seonglae Cho