Summary: | Text detection in natural scenes is a current research hotspot. The Efficient and Accurate Scene Text (EAST) detector model has fast detection speed and good performance but is ineffective in detecting long text regions owing to its small receptive field. In this study, we built upon the EAST model by improving the bounding box’s shrinking algorithm to make the model more accurate in predicting short edges of text regions; altering the loss function from balanced cross-entropy to Focal loss; improving the model’s learning ability on hard, positive examples; and adding a feature enhancement module (FEM) to increase the receptive field of the EAST model and enhance its detection ability for long text regions. The improved EAST model achieved better detection results on both the ICDAR2015 dataset and the Street Sign Text Detection (SSTD) dataset proposed in this paper. The precision and F1 scores of the model also demonstrated advantages over other models on the ICDAR2015 dataset. A comparison of the text detection effects between the improved EAST model and the EAST model showed that the proposed FEM was more effective in increasing the EAST detector’s receptive field, which indicates that it can improve the detection of long text regions.
|