Summary: | Constructing an accurate predictive model for clinical decision-making from a relatively small number of tumor samples with high-dimensional microarray data remains a very challenging problem. The validity of such models has been seriously questioned because they have failed clinical validation on independent samples. Besides statistical issues such as selection bias, some studies have suggested that a probable cause is improper sample selection, that is, the selection of samples that do not resemble the genomic space defined by the training population. Assuming that predictions are more reliable for interpolation than for extrapolation, we set out to investigate the impact of the applicability domain (AD) on model performance in microarray-based genomic research by evaluating and comparing model performance for samples with different degrees of extrapolation. We found that the applicability domain issue may not exist in microarray-based genomic research for clinical applications. Therefore, it is not practicable to improve model validity on the basis of the applicability domain.
|