Summary: | Doctoral dissertation === Feng Chia University === Graduate Institute of Information Engineering === 95 === Evolutionary algorithms (EAs) are powerful optimization tools and have been widely used in bioinformatics. For a complex prediction problem that involves a large number of tuning parameters, the prediction accuracy is dominated by the optimization performance of the evolutionary algorithm used. In this dissertation, we use several efficient evolutionary computation approaches to solve the following prediction problems: design of fuzzy rule-based classifiers, flexible protein-ligand docking, and protein structural class prediction.
First, we propose an evolutionary approach to designing accurate classifiers with a compact fuzzy rule base using a scatter partition of the feature space, in which all the elements of the fuzzy classifier design problem are encoded as parameters of a single complex optimization problem. An intelligent genetic algorithm (IGA) is used to effectively solve this design problem, which has many tuning parameters. The merits of the proposed method are threefold: 1) it has a high search ability and efficiently finds fuzzy rule-based systems with high fitness values, 2) the obtained fuzzy rules have high interpretability, and 3) the obtained compact classifiers have high classification accuracy on unseen test patterns. Performance comparisons and statistical analysis of experimental results using ten-fold cross-validation on 11 well-known data sets with numerical attribute values show that the IGA-based method, without heuristics, is efficient in designing accurate and compact fuzzy classifiers. In addition, an application of the fuzzy classifier to a prediction problem in gene expression analysis is introduced.
Second, flexible protein-ligand docking can be formulated as a parameter optimization problem whose objective is to find the translation, orientation, and conformation of a ligand relative to the active site of a target protein that give the lowest energy. For highly flexible ligands with many rotatable bonds, the flexible docking problem becomes more difficult because the conformation space is extremely large. We propose a novel optimization algorithm, Swarm Optimization for flexible DOCKing (SODOCK), based on particle swarm optimization (PSO), for solving flexible protein-ligand docking problems. Computer simulation results show that SODOCK obtains more accurate results than several state-of-the-art docking methods.
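To illustrate the kind of search that PSO performs, the following is a minimal sketch of a generic global-best particle swarm optimizer. It is not SODOCK itself. The function names, the parameter values (inertia, c1, c2), and the quadratic `toy_energy` function are assumptions made purely for illustration; a real docking application would replace `toy_energy` with an energy function over the ligand's translation, orientation, and torsion variables.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares a global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_energy(x):
    # Hypothetical stand-in for a docking energy: a quadratic bowl
    # whose minimum value 0 is at (1.0, -2.0, 0.5).
    target = (1.0, -2.0, 0.5)
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


best_x, best_e = pso_minimize(toy_energy, [(-5.0, 5.0)] * 3)
```

In this sketch each particle is one candidate parameter vector, which in a docking setting would correspond to one candidate ligand pose.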
Finally, we propose an evolutionary feature selection approach based on an inheritable intelligent genetic algorithm for the prediction of protein structural class. Adding physicochemical properties to protein features can improve the prediction accuracy of a suitable classifier. However, selecting useful features from hundreds of physicochemical properties is very difficult. The proposed evolutionary feature selection method can obtain high-quality feature subsets from amino acid composition and the physicochemical properties in AAindex. The experimental results show that the obtained feature subsets improve the prediction accuracies of the naive Bayes classifier, support vector machine (SVM), and logistic regression, compared with the same classifiers using amino acid composition features alone. The average prediction accuracy of these classifiers with the obtained feature subsets is also superior to that obtained with an existing 66-dimensional feature set designed by experts.
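As a small illustration of the feature representation mentioned above, the sketch below computes the 20-dimensional amino acid composition of a sequence and applies a binary mask of the kind a genetic algorithm might evolve to select a feature subset. The function names, the example sequence, and the mask are hypothetical; AAindex-derived features and the inheritable intelligent genetic algorithm itself are not reproduced here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for ch in sequence.upper():
        if ch in counts:  # ignore non-standard symbols such as 'X'
            counts[ch] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def apply_mask(features, mask):
    """Keep only the features whose mask bit is set (one GA chromosome = one mask)."""
    return [f for f, keep in zip(features, mask) if keep]


comp = amino_acid_composition("AACD")   # hypothetical toy sequence
mask = [1, 0, 1] + [0] * 17             # hypothetical mask keeping features A and D
subset = apply_mask(comp, mask)
```

A feature selection search would score each candidate mask by the accuracy of a classifier trained on the masked features.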
|