Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation
Jialian Wu Liangchen Song Tiancai Wang Qian Zhang Junsong Yuan

Abstract
Despite the previous success of object analysis, detecting and segmenting a large number of object categories with a long-tailed data distribution remains a challenging and less-investigated problem. For a large-vocabulary classifier, the chance of obtaining noisy logits is much higher, which can easily lead to wrong recognition. In this paper, we exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes, and construct a classification tree that is responsible for parsing an object instance into a fine-grained category via its parent class. In the classification tree, because the number of parent class nodes is significantly smaller, their logits are less noisy and can be used to suppress the wrong or noisy logits present in the fine-grained class nodes. As the way to construct the parent classes is not unique, we further build multiple trees to form a classification forest in which each tree contributes its vote to the fine-grained classification. To alleviate the imbalanced learning caused by the long-tail phenomenon, we propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution. Our method, termed Forest R-CNN, can serve as a plug-and-play module that can be applied to most object recognition models for recognizing more than 1000 categories. Extensive experiments are performed on the large-vocabulary dataset LVIS. Compared with the Mask R-CNN baseline, Forest R-CNN significantly boosts performance, with 11.5% and 3.9% AP improvements on the rare categories and overall categories, respectively. Moreover, we achieve state-of-the-art results on the LVIS dataset. Code is available at https://github.com/JialianW/Forest_RCNN.
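To make the classification-forest idea in the abstract more concrete, here is a minimal Python sketch of how less noisy parent-class scores could suppress a noisy fine-grained prediction, with several trees voting. It is not the authors' implementation. The multiplicative reweighting rule, the averaging of tree votes, the function names `tree_scores` and `forest_scores`, and the toy classes and logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tree_scores(fine_logits, parent_logits, parent_of):
    """Reweight each fine-grained probability by its parent's probability.

    parent_of[c] is the index of the parent class of fine-grained class c.
    The product rule used here is an assumption, not the paper's exact rule.
    """
    fine_p = softmax(fine_logits)
    parent_p = softmax(parent_logits)
    scores = [fine_p[c] * parent_p[parent_of[c]] for c in range(len(fine_p))]
    total = sum(scores)
    return [s / total for s in scores]


def forest_scores(fine_logits, trees):
    """Average the votes of several trees (assumed voting scheme).

    Each tree is a pair (parent_logits, parent_of) describing one way of
    grouping the fine-grained classes into parent classes.
    """
    votes = [tree_scores(fine_logits, pl, po) for pl, po in trees]
    return [sum(v[c] for v in votes) / len(votes)
            for c in range(len(fine_logits))]


# Toy example: fine-grained classes are 0=cat, 1=dog, 2=car, 3=truck.
# The fine-grained logits are noisy and slightly prefer "car".
fine = [2.0, 0.5, 2.2, 0.1]

# Tree A groups classes into {animal, vehicle}; tree B into
# {feline, canine, vehicle}. Both parent classifiers favour the animal side.
trees = [
    ([3.0, 0.0], [0, 0, 1, 1]),
    ([2.5, 0.5, 0.0], [0, 1, 2, 2]),
]

final = forest_scores(fine, trees)
print(max(range(len(fine)), key=lambda c: fine[c]))    # 2 (car, before suppression)
print(max(range(len(final)), key=lambda c: final[c]))  # 0 (cat, after forest voting)
```

In this toy case the raw fine-grained prediction is "car", but because both trees assign most of their parent-level probability to the animal branch, the combined score selects "cat" instead.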
Benchmarks
| Benchmark | Method | AP | APc | APf | APr |
|---|---|---|---|---|---|
| few-shot-object-detection-on-lvis-v1-0-val | Forest R-CNN | 23.2 | 22.7 | 27.7 | 14.2 |