Biological Terms Recognition:Using Hidden Markov Models
Master's === National Taiwan University === Graduate Institute of Biomedical Engineering === 95 === As biomedical science advances, text mining in the biomedical domain is becoming increasingly important. Since there are many irregularities and ambiguous contexts in biomedical literature, such as various compound words, synonyms, and acronyms, and even the laws of naming are...
Main Authors: | Chih-Wei Chen, 陳志偉 |
---|---|
Other Authors: | Jau-Min Wong |
Format: | Others |
Language: | zh-TW |
Published: | 2007 |
Online Access: | http://ndltd.ncl.edu.tw/handle/41575682044354041294 |
id |
ndltd-TW-095NTU05530027 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-095NTU05530027 2015-10-13T13:55:54Z http://ndltd.ncl.edu.tw/handle/41575682044354041294 Biological Terms Recognition:Using Hidden Markov Models 生醫詞彙辨識:利用隱藏式馬可夫模型 Chih-Wei Chen 陳志偉 Master's National Taiwan University Graduate Institute of Biomedical Engineering 95 As biomedical science advances, text mining in the biomedical domain is becoming increasingly important. Biomedical literature contains many irregularities and ambiguous contexts, such as compound words, synonyms, and acronyms, and even naming conventions are not applied consistently, so correctly identifying biological terms in text is a fundamental requirement for information extraction. In this paper we propose a biological term extractor based on Hidden Markov Models. The task is accomplished in four steps. First, the tokens in the training data are clustered according to five features. Second, a Hidden Markov Model is trained on the clustered tokens. Third, the user's input is normalized and its tokens are clustered. Finally, the biological terms are annotated using the machine learning algorithm. Jau-Min Wong 翁昭旼 2007 degree thesis ; thesis 42 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Biomedical Engineering === 95 === As biomedical science advances, text mining in the biomedical domain is becoming increasingly important. Biomedical literature contains many irregularities and ambiguous contexts, such as compound words, synonyms, and acronyms, and even naming conventions are not applied consistently, so correctly identifying biological terms in text is a fundamental requirement for information extraction.
In this paper we propose a biological term extractor based on Hidden Markov Models. The task is accomplished in four steps. First, the tokens in the training data are clustered according to five features. Second, a Hidden Markov Model is trained on the clustered tokens. Third, the user's input is normalized and its tokens are clustered. Finally, the biological terms are annotated using the machine learning algorithm.
|
author2 |
Jau-Min Wong |
author_facet |
Jau-Min Wong Chih-Wei Chen 陳志偉 |
author |
Chih-Wei Chen 陳志偉 |
title |
Biological Terms Recognition:Using Hidden Markov Models |
publishDate |
2007 |
url |
http://ndltd.ncl.edu.tw/handle/41575682044354041294 |
_version_ |
1717745274016759808 |
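The four steps named in the description (cluster training tokens, train a Hidden Markov Model, normalize and cluster the input, annotate) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not say which five features are used, how the model is estimated, or how decoding is done. The five surface features in `cluster`, the B/I/O tag set, the add-one smoothing, the Viterbi decoder, and the toy training sentences below are all assumptions chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

GREEK = {"alpha", "beta", "gamma", "delta", "kappa"}  # hypothetical word list

def cluster(token):
    """Step 1 (assumed): map a token to a class built from five surface features."""
    low = token.lower()
    return (
        token[:1].isupper(),                # starts with a capital letter
        any(c.isdigit() for c in token),    # contains a digit
        "-" in token,                       # contains a hyphen
        low in GREEK,                       # is a spelled-out Greek letter
        low.endswith(("ase", "in", "gene")) # has a biology-like suffix
    )

def train(sentences):
    """Step 2 (assumed): count transitions and emissions from tagged sentences."""
    trans, emit = defaultdict(Counter), defaultdict(Counter)
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-state
        for token, tag in sent:
            trans[prev][tag] += 1
            emit[tag][cluster(token)] += 1
            prev = tag
    tags = sorted(emit)
    n_sym = len({s for c in emit.values() for s in c}) + 1  # +1 for unseen classes
    return trans, emit, tags, n_sym

def normalize(text):
    """Step 3 (assumed): split raw input into word tokens, keeping inner hyphens."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)

def logp(counter, key, n_outcomes):
    """Add-one smoothed log probability of key under the given counts."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(tokens, model):
    """Step 4 (assumed): most likely tag sequence for the clustered tokens."""
    trans, emit, tags, n_sym = model
    if not tokens:
        return []
    obs = [cluster(t) for t in tokens]
    best = {t: logp(trans["<s>"], t, len(tags)) + logp(emit[t], obs[0], n_sym)
            for t in tags}
    back = []
    for o in obs[1:]:
        step, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[p] + logp(trans[p], t, len(tags)))
            step[t] = (best[prev] + logp(trans[prev], t, len(tags))
                       + logp(emit[t], o, n_sym))
            ptr[t] = prev
        best = step
        back.append(ptr)
    tag = max(tags, key=best.get)
    path = [tag]
    for ptr in reversed(back):  # follow back-pointers to recover the path
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]

# Toy training data (invented): B = begins a term, I = inside a term, O = outside.
TRAIN = [
    [("The", "O"), ("IL-2", "B"), ("gene", "I"), ("is", "O"), ("expressed", "O")],
    [("NF-kappa", "B"), ("B", "I"), ("binds", "O"), ("DNA", "B")],
    [("Protein", "O"), ("kinase", "B"), ("C", "I"), ("is", "O"), ("active", "O")],
]

model = train(TRAIN)
tokens = normalize("The IL-2 gene activates protein kinase C.")
labels = viterbi(tokens, model)
```

Here `labels` holds one tag per token in `tokens`. Clustering tokens into a small set of feature classes before training keeps the emission table compact, so unseen words can still be scored through the class they fall into.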