Cut and Learn for Unsupervised Object Detection and Instance Segmentation
Xudong Wang, Rohit Girdhar, Stella X. Yu, Ishan Misra

Abstract
We propose Cut-and-LEaRn (CutLER), a simple approach for training unsupervised object detection and segmentation models. We leverage the property of self-supervised models to 'discover' objects without supervision and amplify it to train a state-of-the-art localization model without any human labels. CutLER first uses our proposed MaskCut approach to generate coarse masks for multiple objects in an image and then learns a detector on these masks using our robust loss function. We further improve the performance by self-training the model on its predictions. Compared to prior work, CutLER is simpler, compatible with different detection architectures, and detects multiple objects. CutLER is also a zero-shot unsupervised detector and improves detection performance AP50 by over 2.7 times on 11 benchmarks across domains like video frames, paintings, sketches, etc. With finetuning, CutLER serves as a low-shot detector surpassing MoCo-v2 by 7.3% APbox and 6.6% APmask on COCO when training with 5% labels.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| unsupervised-panoptic-segmentation-on-coco | CutLER+STEGO | PQ: 12.4, RQ: 15.2, SQ: 36.1 |
| unsupervised-zero-shot-instance-segmentation | CutLER | AP: 5.3, AP50: 8.6, AP75: 5.5, AR100: 9.3 |
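
The AP50 and AP75 columns follow the usual COCO-style convention: a predicted mask counts as a match only if its intersection-over-union (IoU) with a ground-truth mask reaches 0.5 or 0.75, respectively. The sketch below illustrates only that matching criterion on a toy example. The helper names `mask_iou` and `is_match` are hypothetical, masks are represented as sets of pixel coordinates for simplicity, and the full AP computation (ranking predictions by confidence and averaging precision over recall levels) is not shown.

```python
def mask_iou(pred, gt):
    """IoU of two masks, each given as a set of (row, col) pixel coordinates."""
    union = len(pred | gt)
    if union == 0:
        return 0.0  # two empty masks: treat as no overlap
    return len(pred & gt) / union


def is_match(pred, gt, iou_threshold):
    """True if the prediction overlaps the ground truth enough to count as a match."""
    return mask_iou(pred, gt) >= iou_threshold


# Toy example (illustrative data only): a 4x4 ground-truth square and a
# prediction of the same size shifted one pixel to the right.
gt = {(r, c) for r in range(2, 6) for c in range(2, 6)}
pred = {(r, c) for r in range(2, 6) for c in range(3, 7)}

iou = mask_iou(pred, gt)
match_at_50 = is_match(pred, gt, 0.5)   # criterion behind AP50
match_at_75 = is_match(pred, gt, 0.75)  # criterion behind AP75
print(iou, match_at_50, match_at_75)
```

In this toy case the one-pixel shift leaves 12 shared pixels out of a 20-pixel union, so the prediction is accepted at the 0.5 threshold but rejected at 0.75. This is why AP75 is generally the stricter number for coarse masks.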