FActScore: Fine-grained Atomic Evaluation of Factual Precision in...
Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of...
https://arxiv.org/abs/2305.14251

Seonglae Cho