Summary: | This thesis is motivated by a cooperation with Uppsala Monitoring Centre, a WHO collaborating centre for international drug monitoring. The research question is how to produce a good summary of a drug's indication list. The thesis proposes three tree-based models (a regression tree, Random Forests, and XGBoost) to predict the drug indication summary from the drug's user statistics and pharmaceutical information. It also compares the prediction performance of these tree-based models with that of two baseline models: basic linear regression and support vector regression (SVR). The analysis shows that SVR with an RBF kernel and the post-pruned regression tree are the best models for answering the research question.
|