Summary: | MicroRNAs (miRNAs) may serve as diagnostic and predictive biomarkers for cancer. The aim of this study was to identify novel cancer biomarkers from miRNA datasets, in addition to those already known. Three published miRNA cancer datasets (liver, breast, and brain) were evaluated, and the performance of the entire feature set was compared to the performance of individual feature filters, an ensemble of those filters, and a support vector machine (SVM) wrapper. In addition to confirming many known biomarkers, the main contribution of this study is that seven miRNAs have been newly identified by our ensemble methodology as potentially important biomarkers for hepatocellular carcinoma or breast cancer, pending wet-lab confirmation. These biomarkers were identified from miRNA expression datasets by combining multiple feature selection techniques (i.e., creating an ensemble) or by the SVM wrapper, and the resulting feature subsets were then used for classification by several different learners. In general, selecting only the highest-ranking features (miRNAs) improved upon the results obtained with all the miRNAs, and the ensemble and SVM-wrapper approaches outperformed the individual feature selection methods. Finally, an algorithm was developed to determine the number of top-ranked features to include when creating feature subsets. This algorithm weighs the performance improvement gained by adding features against the cost of adding them. By Alex Kotlarchyk. Thesis (Ph.D.)--Florida Atlantic University, 2011. Includes bibliography. Electronic reproduction. Boca Raton, Fla., 2011. Mode of access: World Wide Web.
|