
Model Souping

Date

12 hours ago

Organization

Google
University of Washington

Paper URL

2203.05482

Model Souping was jointly proposed in July 2022 by a research team from the University of Washington, Google, and other institutions. The work was published in the paper "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time", which was accepted to ICML 2022.

Model Souping refers to averaging the weights of multiple independently fine-tuned models to improve accuracy and robustness. The approach simply averages the weights of the fine-tuned models produced by a hyperparameter sweep, so it requires no additional training and adds no computational cost at inference time. When applied to large pre-trained models such as CLIP, ALIGN, and a ViT-G pre-trained on JFT, Model Souping significantly improves on the best single model found by the hyperparameter sweep on ImageNet: the resulting ViT-G model reached 90.94% accuracy on ImageNet, a new state of the art at the time. The method also extends to various image classification and natural language processing tasks, improving not only out-of-distribution generalization but also zero-shot performance on new downstream tasks.
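The idea above can be sketched in a few lines. Below is a minimal illustration of the two recipes described in the paper, a "uniform soup" (parameter-wise mean of all fine-tuned models) and a "greedy soup" (models are added to the average only if they do not hurt held-out accuracy). For simplicity, the sketch assumes model weights are plain Python dicts of floats and that the caller supplies an `evaluate` function returning held-out accuracy; real implementations would operate on framework tensors (e.g. PyTorch state dicts) instead.

```python
def uniform_soup(state_dicts):
    # "Uniform soup": parameter-wise mean of all fine-tuned weights.
    # Assumes all dicts share the same keys (same architecture).
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n for k in state_dicts[0]}

def greedy_soup(state_dicts, evaluate):
    # "Greedy soup": try models in descending order of held-out score;
    # keep a model only if adding it to the running average does not
    # reduce the score. `evaluate` is a hypothetical callback mapping
    # a weight dict to held-out accuracy.
    ranked = sorted(state_dicts, key=evaluate, reverse=True)
    soup, best = [ranked[0]], evaluate(ranked[0])
    for sd in ranked[1:]:
        candidate = uniform_soup(soup + [sd])
        score = evaluate(candidate)
        if score >= best:
            soup, best = soup + [sd], score
    return uniform_soup(soup)
```

Because the averaging happens once, offline, the served model is a single set of weights: inference cost is identical to that of any one fine-tuned model, which is the key practical advantage over output-space ensembling.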
