Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

Abstract
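We present a conceptually simple, flexible, and general framework for object instance segmentation. The approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch that predicts an object mask in parallel with the existing branch for bounding-box recognition. Mask R-CNN is simple to train, adds only a small computational overhead to Faster R-CNN, and runs at 5 frames per second (5 fps). Mask R-CNN also generalizes easily to other tasks; for example, human pose estimation can be performed within the same framework. We report top results in all three tracks of the COCO challenge: instance segmentation, bounding-box object detection, and person keypoint detection. Without any additional tricks, Mask R-CNN outperforms all existing single-model entries on every task, including the winners of the COCO 2016 challenge. We hope this simple and effective approach will serve as a solid baseline for instance-level recognition and help ease future research in the area. The code is open source and available at: https://github.com/facebookresearch/Detectron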
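To make the "parallel branch" idea in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes each region of interest has already been reduced to a flat feature vector, and it uses randomly initialized linear layers as stand-ins for the real heads. The names `ToyMaskRCNNHeads` and `linear`, and all dimensions, are hypothetical and chosen only for illustration.

```python
import random


def linear(in_dim, out_dim, rng):
    """Hypothetical toy layer: returns a function computing W @ x (no bias)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def apply(x):
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return apply


class ToyMaskRCNNHeads:
    """Three heads that read the same per-region feature independently.

    Assumption: real heads are small convolutional or fully connected
    networks; plain linear maps are used here only to show the structure.
    """

    def __init__(self, feat_dim, num_classes, mask_size, seed=0):
        rng = random.Random(seed)
        # Existing Faster R-CNN style branch: classification + box regression.
        self.cls_head = linear(feat_dim, num_classes, rng)
        self.box_head = linear(feat_dim, 4, rng)
        # Added branch: a mask_size x mask_size grid of mask logits.
        self.mask_head = linear(feat_dim, mask_size * mask_size, rng)

    def __call__(self, roi_feature):
        # The mask branch does not consume the box branch's output;
        # all three outputs are computed from the same input feature.
        return {
            "class_scores": self.cls_head(roi_feature),
            "box_deltas": self.box_head(roi_feature),
            "mask_logits": self.mask_head(roi_feature),
        }


heads = ToyMaskRCNNHeads(feat_dim=8, num_classes=3, mask_size=4)
outputs = heads([0.5] * 8)
print({name: len(values) for name, values in outputs.items()})
```

The point of the sketch is structural: the mask output is an extra head alongside the box outputs rather than a separate pipeline stage.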
Code Repositories
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| instance-segmentation-on-bdd100k-val | Mask R-CNN | AP: 20.5 |
| instance-segmentation-on-coco | Mask R-CNN (ResNeXt-101-FPN) | AP50: 60.0 AP75: 39.4 APL: 53.5 APM: 39.9 APS: 16.9 mask AP: 37.1 |
| instance-segmentation-on-isaid | Mask-RCNN+ | Average Precision: 37.18 |
| instance-segmentation-on-isaid | Mask-RCNN | Average Precision: 36.50 |
| keypoint-detection-on-coco-1 | Mask R-CNN | Test AP: 63.1 Validation AP: 69.2 |
| keypoint-detection-on-coco-test-challenge | Mask R-CNN* | AP: 68.9 AP50: 89.2 AP75: 75.2 APL: 82.6 AR: 75.4 AR50: 93.2 AR75: 81.2 ARL: 76.8 ARM: 70.2 |
| keypoint-detection-on-coco-test-dev | Mask R-CNN | AP50: 87.3 AP75: 68.7 APL: 71.4 APM: 57.8 |
| multi-human-parsing-on-mhp-v10 | Mask R-CNN | AP 0.5: 52.68% |
| multi-human-parsing-on-mhp-v20 | Mask R-CNN | AP 0.5: 14.9 |
| multi-person-pose-estimation-on-crowdpose | Mask R-CNN | AP Easy: 69.4 AP Hard: 45.8 AP Medium: 57.9 mAP @0.5:0.95: 57.2 |
| multi-person-pose-estimation-on-ochuman | Mask R-CNN | AP50: 33.2 AP75: 24.5 Validation AP: 20.2 |
| multi-tissue-nucleus-segmentation-on-kumar | Mask R-CNN (e) | Dice: 0.760 Hausdorff Distance (mm): 50.9 |
| nuclear-segmentation-on-cell17 | Mask R-CNN | Dice: 0.707 F1-score: 0.8004 Hausdorff: 12.6723 |
| object-detection-on-coco | Mask R-CNN (ResNeXt-101-FPN) | AP50: 62.3 AP75: 43.4 APL: 51.2 APM: 43.2 APS: 22.1 Hardware Burden: 9G box mAP: 39.8 |
| object-detection-on-coco | Mask R-CNN (ResNet-101-FPN) | AP50: 60.3 AP75: 41.7 APL: 50.2 APM: 41.1 APS: 20.1 Hardware Burden: 9G box mAP: 38.2 |
| object-detection-on-coco-minival | Mask R-CNN (ResNeXt-101-FPN) | AP50: 59.5 AP75: 38.9 box AP: 36.7 |
| object-detection-on-coco-minival | Mask R-CNN (ResNet-50-FPN) | box AP: 37.7 |
| object-detection-on-coco-minival | Mask R-CNN (ResNet-101-FPN) | box AP: 40.0 |
| object-detection-on-coco-o | Mask R-CNN (ResNet-50) | Average mAP: 17.1 |
| object-detection-on-coco-o | Mask R-CNN (ResNet-50) | Effective Robustness: -0.11 |
| object-detection-on-isaid | Mask-RCNN | Average Precision: 36.50 |
| object-detection-on-isaid | Mask-RCNN+ | Average Precision: 37.18 |
| object-localization-on-grit | Mask R-CNN | Localization (ablation): 44.7 Localization (test): 45.1 |
| panoptic-segmentation-on-cityscapes-val | Mask R-CNN+COCO | PQth: 54.0 |
| pose-estimation-on-coco-test-dev | Mask-RCNN | AP: 63.1 AP50: 87.3 AP75: 68.7 APL: 71.4 |
| real-time-object-detection-on-coco-1 | Mask R-CNN X-152-32x8d | box AP: 45.2 |
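
For the COCO rows in the table, AP50 and AP75 follow the COCO convention of counting a prediction as a match when its intersection over union (IoU) with the ground truth is at least 0.50 or 0.75, respectively. The sketch below illustrates only that thresholding step on two small binary masks; it is not a full AP computation, and the helper names `mask_iou` and `matches_at` are hypothetical.

```python
def mask_iou(pred, truth):
    """IoU of two equally sized binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks have no defined overlap; treat that case as 0.0.
    return inter / union if union else 0.0


def matches_at(pred, truth, threshold):
    """True if the prediction counts as a match at this IoU threshold."""
    return mask_iou(pred, truth) >= threshold


# Ground truth covers 6 pixels; the prediction recovers 4 of them.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]

print(round(mask_iou(pred, truth), 3))   # 4 / 6 -> 0.667
print(matches_at(pred, truth, 0.50))     # True: counts toward AP50
print(matches_at(pred, truth, 0.75))     # False: does not count toward AP75
```

The same prediction can therefore contribute to AP50 but not to AP75, which is why AP75 values in the table are consistently lower than the corresponding AP50 values.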