Zhao Hengshuang, Shi Jianping, Qi Xiaojuan, Wang Xiaogang, Jia Jiaya

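Abstract
Scene parsing is challenging in unrestricted settings with an open vocabulary and diverse scenes. This paper exploits global context information through context aggregation over different regions, using a pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). The proposed global prior representation is effective at producing good-quality results on the scene parsing task, while PSPNet provides a superior framework for pixel-level prediction tasks. The approach achieves state-of-the-art performance on several public datasets and came first in the ImageNet scene parsing challenge 2016, the PASCAL VOC 2012 benchmark, and the Cityscapes benchmark. A single PSPNet model sets a new record of 85.4% mIoU (mean intersection over union) on PASCAL VOC 2012 and a new record of 80.2% accuracy on Cityscapes.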
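The pyramid pooling idea in the abstract can be illustrated with a minimal pure-Python sketch. It makes several assumptions that the abstract does not state: the bin sizes `(1, 2, 3, 6)` follow the pyramid levels commonly cited for PSPNet, nearest-neighbour upsampling stands in for the interpolation a real network would use, the per-level 1×1 convolutions are omitted, and the input is a single-channel feature map stored as a list of lists. The function names `adaptive_avg_pool`, `upsample_nearest`, and `pyramid_pool` are hypothetical.

```python
import math


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D map (list of rows) down to a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Row window for output cell i: floor of the start, ceil of the end.
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Resize a bins x bins grid back to h x w by nearest-neighbour lookup."""
    bins = len(pooled)
    return [
        [pooled[(r * bins) // h][(c * bins) // w] for c in range(w)]
        for r in range(h)
    ]


def pyramid_pool(fmap, bin_sizes=(1, 2, 3, 6)):
    """Return the input map plus one pooled-and-upsampled map per pyramid level.

    The returned list plays the role of a channel-wise concatenation:
    channel 0 is the original map, and the rest carry context at coarser scales.
    """
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(fmap, bins), h, w))
    return channels


if __name__ == "__main__":
    feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
    stacked = pyramid_pool(feature_map)
    print(len(stacked))      # number of channels after concatenation
    print(stacked[1][0][0])  # global (1x1) average broadcast to every pixel
```

In this sketch the 1×1 level gives every pixel the same global average, while finer levels keep progressively more spatial detail. Stacking them with the original map lets each pixel see both local and scene-level context.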
Code Repositories
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| dichotomous-image-segmentation-on-dis-te1 | PSPNet | E-measure: 0.791; HCE: 267; MAE: 0.089; S-Measure: 0.725; max F-Measure: 0.645; weighted F-measure: 0.557 |
| dichotomous-image-segmentation-on-dis-te2 | PSPNet | E-measure: 0.828; HCE: 586; MAE: 0.092; S-Measure: 0.763; max F-Measure: 0.724; weighted F-measure: 0.636 |
| dichotomous-image-segmentation-on-dis-te3 | PSPNet | E-measure: 0.843; HCE: 1111; MAE: 0.092; S-Measure: 0.774; max F-Measure: 0.747; weighted F-measure: 0.657 |
| dichotomous-image-segmentation-on-dis-te4 | PSPNet | E-measure: 0.815; HCE: 3806; MAE: 0.107; S-Measure: 0.758; max F-Measure: 0.725; weighted F-measure: 0.630 |
| dichotomous-image-segmentation-on-dis-vd | PSPNet | E-measure: 0.802; HCE: 1588; MAE: 0.102; S-Measure: 0.744; max F-Measure: 0.691; weighted F-measure: 0.603 |
| lesion-segmentation-on-anatomical-tracings-of-1 | PSPNet | Dice: 0.3571; IoU: 0.254; Precision: 0.4769; Recall: 0.3335 |
| real-time-semantic-segmentation-on-camvid | PSPNet | Frame (fps): 5.4; Time (ms): 185.0 |
| real-time-semantic-segmentation-on-nyu-depth-1 | PSPNet101 | Speed(ms/f): 72; mIoU: 43.2 |
| real-time-semantic-segmentation-on-nyu-depth-1 | PSPNet50 | Speed(ms/f): 47; mIoU: 41.8 |
| real-time-semantic-segmentation-on-nyu-depth-1 | PSPNet18 | Speed(ms/f): 19; mIoU: 35.9 |
| semantic-segmentation-on-ade20k | PSPNet (ResNet-101) | Validation mIoU: 43.29 |
| semantic-segmentation-on-ade20k | PSPNet (ResNet-152) | Validation mIoU: 43.51 |
| semantic-segmentation-on-ade20k | PSPNet | Test Score: 55.38; Validation mIoU: 44.94 |
| semantic-segmentation-on-ade20k-val | PSPNet (ResNet-101) | mIoU: 43.29% |
| semantic-segmentation-on-ade20k-val | PSPNet (ResNet-152) | mIoU: 43.51% |
| semantic-segmentation-on-bdd100k-val | PSPNet | mIoU: 62.3 |
| semantic-segmentation-on-cityscapes | PSPNet | Mean IoU (class): 78.4% |
| semantic-segmentation-on-cityscapes | PSPNet++ | Mean IoU (class): 80.2% |
| semantic-segmentation-on-cityscapes-val | PSPNet (Dilated-ResNet-101) | mIoU: 79.7 |
| semantic-segmentation-on-dada-seg | PSPNet (ResNet-101) | mIoU: 20.1 |
| semantic-segmentation-on-densepass | PSPNet (ResNet-50) | mIoU: 29.5% |
| semantic-segmentation-on-pascal-context | PSPNet (ResNet-101) | mIoU: 47.8 |
| semantic-segmentation-on-pascal-voc-2012 | PSPNet | Mean IoU: 85.4% |
| semantic-segmentation-on-pascal-voc-2012 | PSPNet (ResNet-101) | Mean IoU: 82.6% |
| semantic-segmentation-on-potsdam | PSPNet | mIoU: 82.98 |
| semantic-segmentation-on-scannetv2 | PSPNet | Mean IoU: 47.5% |
| semantic-segmentation-on-selma | PSPNet | mIoU: 68.4 |
| semantic-segmentation-on-trans10k | PSPNet | GFLOPs: 187.03; mIoU: 68.23% |
| semantic-segmentation-on-urbanlf | PSPNet | mIoU (Real): 76.34; mIoU (Syn): 75.78 |
| semantic-segmentation-on-us3d | PSNet | mIoU: 73.12 |
| semantic-segmentation-on-vaihingen | PSPNet | mIoU: 76.79 |
| thermal-image-segmentation-on-mfn-dataset | PSPNet | mIoU: 46.1 |
| video-semantic-segmentation-on-camvid | PSPNet-50 | Mean IoU: 76 |
| video-semantic-segmentation-on-cityscapes-val | PSPNet-101 [20] | mIoU: 79.7 |
| video-semantic-segmentation-on-cityscapes-val | PSPNet-50 [20] | mIoU: 78.1 |
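Most rows in the table above report mIoU (mean intersection over union). As a rough illustration of what that number measures, the sketch below computes it for flat lists of per-pixel class labels. It is a simplified convention, not the evaluation code of any benchmark listed here: classes that appear in neither the prediction nor the ground truth are skipped, and scores are computed on a single sample rather than accumulated over a whole dataset. The function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flat label lists."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        # Pixels where both prediction and ground truth are this class.
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        # Pixels where either prediction or ground truth is this class.
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    predicted = [0, 0, 1, 1, 2, 2]
    ground_truth = [0, 1, 1, 1, 2, 0]
    # Per-class IoU here is 1/3, 2/3 and 1/2.
    print(mean_iou(predicted, ground_truth, num_classes=3))
```

Because every class contributes equally to the average, a model that does well on large, frequent classes but poorly on rare ones can still score a low mIoU, which is one reason the metric is popular for segmentation leaderboards.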