

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng · Heng Fan · Libo Zhang

Arch 4A-E Poster #111
Wed 19 Jun 5 p.m. PDT — 6:30 p.m. PDT


Segmentation-based scene text detection algorithms operate at pixel-level accuracy, can detect scene text of arbitrary shape, and have received widespread attention. However, they face two limitations. First, because scene text is complex and diverse, convolution with a fixed kernel size is limited in its ability to extract the visual features of scene text. Second, most existing segmentation-based algorithms segment only the center of the text, losing information such as text edges and directions, which limits detection accuracy. Some improved algorithms use iterative correction or introduce multiple additional sources of information to improve detection accuracy, but at the expense of efficiency. To address these issues, this paper proposes a simple and effective scene text detection method, Kernel Adaptive Convolution, which is built around a Kernel Adaptive Convolution Module and detects scene text by predicting a distance map. Specifically, we first design an extensible kernel adaptive convolution module (KACM) that adaptively extracts visual features from multiple convolutions with different kernel sizes. Second, our method predicts the text distance map under the supervision of a priori information (including a direction map and a foreground segmentation map) and completes text detection from the predicted distance map. Experiments on four publicly available datasets demonstrate the effectiveness of our algorithm: on Total-Text and TD500 it outperforms the state-of-the-art algorithms in both accuracy and efficiency, and on ArT and CTW1500 it improves efficiency while remaining competitive in accuracy.
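
To illustrate why a distance map can carry more information than a plain foreground mask, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not define how the distance map is computed or post-processed, so the city-block distance to the nearest background pixel, the fixed threshold, and the function names `distance_map` and `text_instances` are all assumptions made for illustration. In the paper the map is predicted by a network; here it is computed directly from a toy mask.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_map(mask):
    """City-block distance from each pixel to the nearest background (0) pixel.

    Assumption for this sketch: the mask contains at least one background pixel.
    """
    h, w = len(mask), len(mask[0])
    dist = [[0 if mask[y][x] == 0 else None for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if mask[y][x] == 0)
    if not queue:
        raise ValueError("mask needs at least one background pixel")
    # Multi-source breadth-first search outward from all background pixels.
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBORS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def text_instances(dist, threshold):
    """Group pixels with distance >= threshold into 4-connected components."""
    h, w = len(dist), len(dist[0])
    seen = [[False] * w for _ in range(h)]
    instances = []
    for y in range(h):
        for x in range(w):
            if dist[y][x] < threshold or seen[y][x]:
                continue
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy, dx in NEIGHBORS:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and dist[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            instances.append(component)
    return instances


# Toy foreground mask: two text blobs joined by a one-pixel bridge at row 2, column 4.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
dist = distance_map(mask)
merged = text_instances(dist, threshold=1)     # plain foreground mask
separated = text_instances(dist, threshold=2)  # interior pixels only
print(len(merged), len(separated))
```

In this toy case the foreground mask alone merges the two blobs into one component, while thresholding the distance map at 2 keeps only interior pixels and separates them. A real pipeline would still need to grow each interior region back to the full text boundary, a step this sketch omits.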
