Summary: | Master's === Ling Tung University === Master's Program, Department of Information Technology === 105 === This study applies a genetic algorithm to Chinese semantic analysis. Semantic analysis is harder for Chinese than for English because the grammatical structures differ: where English can often describe a situation with a single word, Chinese may combine several words to say the same thing. In general, word segmentation is easy and precise when a dictionary with a large vocabulary is available. Constructing such a massive dictionary is prohibitively expensive, however, which makes it impractical for individual analysts. This study therefore uses an automatically generated dictionary, built without manual effort, together with a genetic algorithm for Chinese word segmentation, and applies the approach to the analysis of travel articles.
The data consist of more than 400 posts about Taichung, extracted from the Tai-traveling board of the social media website PTT. A genetic algorithm performs the Chinese word segmentation, and TF-IDF (Term Frequency–Inverse Document Frequency) extracts the keywords of each post. The results show high precision in Chinese word segmentation and yield a travel ranking for the Taichung area of Taiwan, derived from the TF-IDF keywords of the travel articles.
|
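
The TF-IDF keyword step mentioned in the summary can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the posts have already been segmented into tokens, it uses the plain weighting tf × log(N / df) (the summary does not say which TF-IDF variant was used), and the function names `tf_idf` and `top_keywords` as well as the toy tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term of every document.

    docs: list of token lists (each post already segmented into words).
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(doc_scores, k=3):
    """Return the k highest-scoring terms of one document."""
    return sorted(doc_scores, key=doc_scores.get, reverse=True)[:k]


# Toy, made-up posts (not data from the study).
posts = [
    ["taichung", "night_market", "night_market"],
    ["taichung", "park"],
    ["taichung", "museum"],
]
scores = tf_idf(posts)
print(top_keywords(scores[0], k=1))
```

A term that occurs in every post, such as `taichung` here, gets a score of zero because log(N / df) vanishes, so the keywords that remain are the ones that distinguish one post from the others.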