A technique that averages the weights of multiple independently fine-tuned models into a single model. Strengths the models share are preserved, while their individual errors tend to cancel out. It is particularly effective for embedding models, where global representation quality matters.
Model soups: averaging weights of multiple fine-tuned models...
The conventional recipe for maximizing model accuracy is to (1) train multiple models with various hyperparameters and (2) pick the individual model which performs best on a held-out validation...
https://arxiv.org/abs/2203.05482
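A minimal sketch of the "uniform soup" from the paper above: element-wise averaging of parameter dictionaries from models fine-tuned from the same initialization. Here plain numpy arrays stand in for real model weights, and `uniform_soup` is an illustrative name, not an API from any library.

```python
import numpy as np

def uniform_soup(state_dicts):
    """Average the weights of several fine-tuned models (a 'uniform soup').

    state_dicts: list of dicts mapping parameter names to numpy arrays.
    All models are assumed to share the same architecture and parameter
    names (e.g. fine-tuned from one pretrained checkpoint).
    """
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# Toy example: three "models", each with one 2x2 weight matrix.
models = [
    {"w": np.array([[1.0, 2.0], [3.0, 4.0]])},
    {"w": np.array([[3.0, 2.0], [1.0, 0.0]])},
    {"w": np.array([[2.0, 2.0], [2.0, 2.0]])},
]
soup = uniform_soup(models)
print(soup["w"])  # element-wise mean of the three matrices
```

The paper's stronger "greedy soup" variant adds models to the average one at a time, keeping each only if held-out validation accuracy improves; the averaging step itself is the same.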


Seonglae Cho