Linjie Deng, Yanxiang Gong, Yi Lin, Jingwen Shuai, Xiaoguang Tu, Yuefei Zhang, Zheng Ma, Mei Xie

Abstract
Previous approaches to scene text detection usually rely on manually defined sliding windows. This work presents an intuitive two-stage region-based method that detects multi-oriented text without any prior knowledge of the textual shape. In the first stage, we estimate the possible locations of text instances by detecting and linking corners instead of shifting a set of default anchors. The resulting quadrilateral proposals are geometry-adaptive, which allows our method to cope with various text aspect ratios and orientations. In the second stage, we design a new pooling layer named Dual-RoI Pooling, which embeds data augmentation inside the region-wise subnetwork for more robust classification and regression over these proposals. Experimental results on public benchmarks confirm that the proposed method achieves performance comparable to state-of-the-art methods. The code is publicly available at https://github.com/xhzdeng/crpn
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| scene-text-detection-on-coco-text | Corner-based Region Proposals | F-Measure: 59.1 Precision: 55.5 Recall: 63.3 |
| scene-text-detection-on-icdar-2013 | Corner-based Region Proposals | F-Measure: 87.6 Precision: 91.9 Recall: 83.9 |
| scene-text-detection-on-icdar-2015 | Corner-based Region Proposals | F-Measure: 84.5 Precision: 88.7 Recall: 80.7 |
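
The table does not say how F-Measure is derived. Assuming it is the usual harmonic mean of precision and recall, a minimal sketch for recomputing it from the listed values might look like the following. The function name `f_measure` and the dictionary `reported` are illustrative only and are not taken from the authors' repository.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition).

    Both inputs must be on the same scale, e.g. percentages.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the benchmark table above.
reported = {
    "coco-text": (55.5, 63.3),
    "icdar-2013": (91.9, 83.9),
    "icdar-2015": (88.7, 80.7),
}

# Recompute the F-measure for each benchmark, rounded to one decimal place.
scores = {name: round(f_measure(p, r), 1) for name, (p, r) in reported.items()}

for name, score in scores.items():
    print(f"{name}: F-measure = {score}")
```

Under this assumption the COCO-Text and ICDAR 2015 rows agree with the listed F-Measure to one decimal place, while the ICDAR 2013 row comes out about 0.1 higher than the listed 87.6. Each benchmark has its own evaluation protocol, so the harmonic mean of the rounded figures is only an approximation.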