TasselNetv2: in-field counting of wheat spikes with context-augmented local regression networks

Bibliographic Details
Main Authors: Haipeng Xiong, Zhiguo Cao, Hao Lu, Simon Madec, Liang Liu, Chunhua Shen
Format: Article
Language: English
Published: BMC 2019-12-01
Series: Plant Methods
Online Access: https://doi.org/10.1186/s13007-019-0537-2
Description
Summary: Abstract
Background: Grain yield of wheat is closely associated with the population of wheat spikes, i.e., the spike number m⁻². To obtain this index reliably and efficiently, wheat spikes must be counted accurately and automatically. Computer vision technologies have shown great potential to automate this task effectively and at low cost. Counting wheat spikes is a typical visual counting problem, which has been studied extensively in computer vision under the name of object counting. TasselNet, one of the state-of-the-art counting approaches, is a local regression model based on a convolutional neural network, and it currently holds the best reported result for counting maize tassels. However, when TasselNet is applied to wheat spikes, it cannot predict accurate counts when spikes are only partially present.
Results: In this paper, we make the important observation that the counting performance of local regression networks can be significantly improved by adding visual context to the local patches. Such context can be treated as part of the receptive field without increasing the model capacity. We thus propose a simple yet effective contextual extension of TasselNet, called TasselNetv2. When TasselNetv2 is implemented in a fully convolutional form, both training and inference can be greatly sped up by reducing redundant computations. In addition, we collected and labeled a large-scale wheat spikes counting (WSC) dataset with 1764 high-resolution images and 675,322 manually annotated instances. Extensive experiments show that TasselNetv2 not only achieves state-of-the-art performance on the WSC dataset (91.01% counting accuracy) but is also more than an order of magnitude faster than TasselNet (13.82 fps on 912 × 1216 images). The generality of TasselNetv2 is further demonstrated by advancing the state of the art on both the Maize Tassels Counting and ShanghaiTech Crowd Counting datasets.
Conclusions: This paper describes TasselNetv2 for counting wheat spikes, which simultaneously addresses two important use cases in plant counting: improving counting accuracy without increasing model capacity, and improving efficiency without sacrificing accuracy. It is a promising candidate for deployment in real-time systems with high-throughput demands. In particular, TasselNetv2 can achieve sufficiently accurate results when trained from scratch with small networks, and adopting larger pre-trained networks can further boost accuracy. In practice, one can trade off accuracy against efficiency according to the application scenario. Code and models are made available at: https://tinyurl.com/TasselNetv2.
ISSN: 1746-4811
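
Illustrative note: the summary describes local regression (predicting a count for each local patch) and the idea of giving each patch extra visual context. The following is a minimal sketch of that idea only, not the authors' implementation. The patch size, the context width, the use of non-overlapping patches, and the helper names `local_counts` and `context_window` are all assumptions made for illustration; the abstract does not state them.

```python
# Hypothetical sizes chosen for illustration; the abstract does not give them.
PATCH = 4    # side length of the local region whose count is regressed
CONTEXT = 2  # extra pixels of visual context added on each side of a patch


def local_counts(dots, height, width, patch=PATCH):
    """Count annotated points (y, x) inside each non-overlapping patch.

    These per-patch counts play the role of local regression targets.
    Non-overlapping patches are a simplification assumed for this sketch.
    """
    rows = (height + patch - 1) // patch
    cols = (width + patch - 1) // patch
    counts = [[0] * cols for _ in range(rows)]
    for y, x in dots:
        counts[y // patch][x // patch] += 1
    return counts


def context_window(row, col, height, width, patch=PATCH, context=CONTEXT):
    """Return (top, left, bottom, right) of the region seen for one patch.

    The window covers the patch plus `context` pixels on every side,
    clipped to the image, while the target stays the count in the patch.
    """
    top = max(row * patch - context, 0)
    left = max(col * patch - context, 0)
    bottom = min((row + 1) * patch + context, height)
    right = min((col + 1) * patch + context, width)
    return top, left, bottom, right


# Toy 8 x 8 image with five dot annotations given as (y, x).
dots = [(0, 0), (1, 3), (3, 4), (5, 5), (7, 0)]
counts = local_counts(dots, 8, 8)
total = sum(sum(row) for row in counts)  # image-level count = sum of local counts
```

In this toy setting the image-level count is recovered by summing the local counts, and each patch is viewed through a window larger than the area it is responsible for, which is the sense in which context enlarges the receptive field without changing the regression target.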