Model Souping
Model Souping was jointly proposed in July 2022 by a research team from the University of Washington, Google, and other universities and institutions. The research was published in the paper "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time", which was accepted at ICML 2022.
Model Souping refers to averaging the weights of multiple independently fine-tuned models to improve accuracy and robustness. The method simply averages the weights of models produced by a hyperparameter sweep, requiring no additional training and adding no computational cost at inference time. When fine-tuning large pre-trained models such as CLIP, ALIGN, and a ViT-G pre-trained on JFT, Model Souping significantly improves on the best single model found by the hyperparameter sweep on ImageNet; the resulting ViT-G model achieved 90.94% accuracy on ImageNet, a new state of the art at the time. The method also extends to various image classification and natural language processing tasks, improving both out-of-distribution generalization and zero-shot performance on new downstream tasks.
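The idea can be sketched in a few lines. Below is a minimal illustration in plain Python, not the authors' implementation: each model's weights are represented as a dict mapping parameter names to flat lists of floats, and `evaluate` stands in for a hypothetical callback that scores a set of weights on held-out validation data. `uniform_soup` averages all models; `greedy_soup` (the variant the paper found strongest) adds models one at a time, best validation score first, keeping each only if it does not hurt the soup's score.

```python
def uniform_soup(state_dicts):
    """Average corresponding parameters across all models ("uniform soup")."""
    return {
        name: [sum(vals) / len(state_dicts)
               for vals in zip(*(sd[name] for sd in state_dicts))]
        for name in state_dicts[0]
    }

def greedy_soup(state_dicts, evaluate):
    """Greedily grow the soup, keeping a model only if it helps.

    `state_dicts` must be sorted by held-out validation score, best first;
    `evaluate` is a hypothetical scoring function (higher is better).
    """
    soup = [state_dicts[0]]
    best_score = evaluate(uniform_soup(soup))
    for sd in state_dicts[1:]:
        candidate = soup + [sd]
        score = evaluate(uniform_soup(candidate))
        if score >= best_score:  # keep the model only if the soup improves
            soup, best_score = candidate, score
    return uniform_soup(soup)
```

Because the averaging happens once, offline, the final soup is a single model: serving it costs exactly as much as serving any one of the fine-tuned models.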