Summary: | Master's === National Taiwan University === Graduate Institute of Information Management === 105 === In an age when anyone can easily access a wide variety of information, online reviews have become an important source of information that strongly influences people's decisions. Knowing reviewers' profiles is helpful to both customers and online retailers in many ways. However, most online review websites do not disclose reviewers' personal information because of privacy concerns, so the only available clue is the content of the reviews themselves. A research field called ‘user profiling’ focuses on extracting user-profile attributes from text corpora by training classifiers on labeled datasets. Nevertheless, gold-standard datasets are hard to obtain because ground truth is lacking. As a result, many researchers have relied on experts to label their datasets, but manual annotation is time-consuming and laborious.
In this paper, we propose a semi-supervised approach that aims to obtain labeled datasets without manual annotation. We conduct experiments to evaluate the performance of our approach, compare it with the ideal performance, and describe our observations. We hope that our method can eventually be applied to user profiling, helping researchers save time on collecting gold-standard datasets so that they can focus on feature extraction and classifier building.
|