The Feasibility Study of Building a Routine Nursing Records Corpus, Lexicon and Its Application in the Speech Recognition

Bibliographic Details
Main Authors: Pin-Jen Huang, 黃品甄
Other Authors: Polun Chang
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/91874768499608773576
Description
Summary: Master's thesis === National Yang-Ming University === Institute of Biomedical Informatics === 97 (ROC academic year) === Speech recognition technology has improved over the last two decades, and some systems are used frequently in the health industry of English-speaking countries. However, multilingual speech recognition is both a necessity and a challenge for Mandarin-based nursing information systems. To lay the foundation for a nursing record entry interface with speech recognition, this study aims to build a nursing record corpus and lexicon in Mandarin. We selected electronic nursing records from a medical center in Taiwan as text training data. The records were written between July 2007 and March 2008 and cover 7 ICU wards and 5 general wards. We used the word segmentation and unknown-word extraction system from Academia Sinica to extract a nursing record lexicon, trained language models with and without the new lexicon, and calculated the perplexity of each model as the evaluation. The resulting nursing record lexicon contains 974 words. Adding it yields a relative perplexity reduction of 15.541% compared with the model trained without the nursing record lexicon.
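
The evaluation described in the summary can be illustrated with a minimal sketch. The abstract does not state the type of language model or the raw perplexity values, so everything here is an assumption made for illustration: the add-one smoothed unigram model, the function names (train_unigram, perplexity, relative_reduction), and the toy numbers are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram model over `vocab` plus an "<unk>" slot.

    Assumption: a unigram model stands in for whatever language model
    the thesis actually trained.
    """
    counts = Counter(t if t in vocab else "<unk>" for t in tokens)
    total = len(tokens) + len(vocab) + 1  # +1 for the "<unk>" slot
    return {w: (counts[w] + 1) / total for w in list(vocab) + ["<unk>"]}


def perplexity(model, tokens):
    """Perplexity = exp of the average negative log-probability per token."""
    log_sum = sum(math.log(model.get(t, model["<unk>"])) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def relative_reduction(pp_base, pp_new):
    """Relative perplexity reduction in percent, from baseline to new model."""
    return (pp_base - pp_new) / pp_base * 100.0
```

With hypothetical perplexities of 200.0 for a baseline model and 150.0 for a model using the added lexicon, relative_reduction(200.0, 150.0) gives 25.0. The 15.541% reported in the summary is the same kind of ratio, computed from the thesis's own perplexity values, which this record does not list.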