One of the biggest risks with unlearning benchmarks (including TOFU) is "improving scores by simply breaking the model": if you make the model brittle enough that it can no longer say anything about the forget set, Forget Accuracy improves, but that is far from truly meaningful selective forgetting. This is why Retain Accuracy is also essential, and a combined score over the forget and retain sets is used.

Unlearning Benchmarks
Unlearning evaluation methods
arxiv.org
https://arxiv.org/pdf/2402.16835
Machine Unlearning in 2024
As our ML models today become larger and their (pre-)training sets grow to inscrutable sizes, people are increasingly interested in the concept of machine unlearning to edit away undesired things like private data, stale knowledge, copyrighted materials, toxic/unsafe content, dangerous capabilities, and misinformation, without retraining models from scratch.
https://ai.stanford.edu/~kzliu/blog/unlearning
NeurIPS 2023 Machine Unlearning Challenge
Website for the NeurIPS 2023 Machine Unlearning Challenge.
https://unlearning-challenge.github.io/
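The combined forget/retain scoring described above can be sketched as follows. This is a minimal illustration, not TOFU's actual metric: the harmonic-mean aggregation and all function names here are assumptions chosen to show why a "broken" model scores poorly.

```python
# Hypothetical sketch of combining forget and retain metrics for an
# unlearning benchmark. The harmonic-mean aggregation and all names are
# illustrative assumptions, not the exact formula of TOFU or any benchmark.

def forget_score(forget_accuracy: float) -> float:
    """Reward low accuracy on the forget set (1.0 = fully forgotten)."""
    return 1.0 - forget_accuracy

def combined_score(forget_accuracy: float, retain_accuracy: float) -> float:
    """Harmonic mean of forgetting and retention.

    A model that 'cheats' by breaking itself forgets everything but also
    retains nothing, so either term near zero drags the whole score down.
    """
    f = forget_score(forget_accuracy)
    r = retain_accuracy
    if f + r == 0:
        return 0.0
    return 2 * f * r / (f + r)

# A broken model: forgets everything, but retains nothing.
print(round(combined_score(forget_accuracy=0.0, retain_accuracy=0.0), 3))   # → 0.0
# A selective unlearner: forgets the forget set, keeps retain knowledge.
print(round(combined_score(forget_accuracy=0.05, retain_accuracy=0.9), 3))  # → 0.924
```

The harmonic mean (rather than an arithmetic mean) is the key design choice here: averaging arithmetically would still reward a lobotomized model with a score of 0.5, while the harmonic mean drives it to 0.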

Seonglae Cho