Summary: | Accurate phenotype prediction of quantitative traits is paramount to enhanced plant research and breeding. Here, we report the accurate prediction of cotton fiber length, a typical quantitative trait, using 474 cotton (Gossypium ssp.) fiber length (GFL) genes and nine prediction models. When the SNPs/InDels contained in 226 of the GFL genes or the expressions of all 474 GFL genes was used for fiber length prediction, a prediction accuracy of r = 0.83 was obtained, approaching the maximally possible prediction accuracy of a quantitative trait. This has improved by 116%, the prediction accuracies of the fiber length thus far achieved for genomic selection using genome-wide random DNA markers. Moreover, analysis of the GFL genes identified 125 of the GFL genes that are key to accurate prediction of fiber length, with which a prediction accuracy similar to that of all 474 GFL genes was obtained. The fiber lengths of the plants predicted with expressions of the 125 key GFL genes were significantly correlated with those predicted with the SNPs/InDels of the above 226 SNP/InDel-containing GFL genes (r = 0.892, P = 0.000). The prediction accuracies of fiber length using both genic datasets were highly consistent across environments or generations. Finally, we found that a training population consisting of 100–120 plants was sufficient to train a model for accurate prediction of a quantitative trait using the genes controlling the trait. Therefore, the genes controlling a quantitative trait are capable of accurately predicting its phenotype, thereby dramatically improving the ability, accuracy, and efficiency of phenotype prediction and promoting gene-based breeding in cotton and other species.
|