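A reference-free, information-theoretic metric for measuring dataset diversity. It is defined as the exponential of the Shannon entropy of the eigenvalues of K/n, where K is the n × n similarity kernel matrix over the n samples: Vendi Score = exp(-Σ_i λ_i log λ_i), with λ_i the eigenvalues of K/n and 0 log 0 taken as 0. When K is positive semidefinite with ones on its diagonal, these eigenvalues are non-negative and sum to 1, so they form a probability distribution. The score then ranges from 1 (all samples identical) to n (all samples mutually dissimilar) and can be read as an effective number of distinct samples. The similarity function is user-chosen, for example a kernel over embeddings, so the notion of diversity can be tailored to the task. The main limitation is computational cost: a dense eigendecomposition of the n × n matrix scales as O(n^3).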
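The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `vendi_score` and `cosine_kernel` are hypothetical and not taken from any particular library, and the cosine kernel is only one assumed choice of similarity function. The sketch assumes K is symmetric positive semidefinite with ones on the diagonal.

```python
import numpy as np


def vendi_score(K):
    """Exponential of the Shannon entropy of the eigenvalues of K / n.

    Assumes K is a symmetric positive semidefinite (n, n) similarity
    matrix with ones on the diagonal, so the eigenvalues of K / n sum to 1.
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop zero (or numerically negative) eigenvalues: 0 * log(0) is taken as 0.
    eigvals = eigvals[eigvals > 1e-12]
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


def cosine_kernel(X):
    """Cosine-similarity kernel over the rows of X (assumed nonzero embeddings)."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T


# Two extreme cases for n = 4 samples.
identical = np.ones((4, 4))  # every pair has similarity 1
distinct = np.eye(4)         # every pair has similarity 0
print(vendi_score(identical))  # approximately 1
print(vendi_score(distinct))   # approximately 4
```

The two extremes illustrate the "effective number of samples" reading: four identical samples score about 1, and four mutually orthogonal samples score about 4. The call to `np.linalg.eigvalsh` is the step whose cost grows cubically with n.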
Vendi Score
Creator: Seonglae Cho
Created: 2025 Oct 9 23:18
Editor: Seonglae Cho
Edited: 2025 Oct 9 23:18
