Summary: | Master's === National Cheng Kung University === Institute of Information and Electronic Engineering === 83 === In this thesis, we discuss the problems and propose some
solutions for a Mandarin dictation system based on Chinese word
units. With real-time processing and robustness in mind, we
propose two kinds of rules to solve the sentence-segmentation
and word-generation problems. The first kind is a statistical
rule based on collocations. We find that the collocation
between words is not random. We resolve the ambiguity of
sentence segmentation with a Markov statistical model
(bigram). In this process, we also discuss partial parsing and
rule nesting. We find these rules practicable, although the
partitioning of the lexicon needs further consideration.
The second kind is a word-generating rule. Some words cannot
be included in the lexicon, so we generate them with
state-transition rules. We establish these rules according to
the characteristics of Mandarin. For real-time processing, a
first-best algorithm is employed. This method has been applied
successfully to an address system.
|
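The bigram-based disambiguation mentioned in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a plain character string as input (a dictation system would start from recognized syllables), a tiny hypothetical lexicon with invented counts, add-one smoothing, and a dynamic-programming search over (position, last word) states. All names (`LEXICON`, `UNIGRAM`, `BIGRAM`, `log_bigram`, `segment`) are illustrative.

```python
import math

# Hypothetical toy data, invented for illustration; not taken from the thesis.
LEXICON = {"研究", "研究生", "生命", "生", "命", "起源"}
UNIGRAM = {"<s>": 10, "研究": 8, "研究生": 5, "生命": 7, "生": 3, "命": 2, "起源": 6}
BIGRAM = {("<s>", "研究"): 5, ("<s>", "研究生"): 3,
          ("研究", "生命"): 4, ("生命", "起源"): 6}
START = "<s>"
MAX_LEN = max(len(w) for w in LEXICON)


def log_bigram(prev, word):
    """Add-one smoothed log P(word | prev) from the toy counts."""
    count = BIGRAM.get((prev, word), 0)
    return math.log((count + 1) / (UNIGRAM.get(prev, 0) + len(LEXICON)))


def segment(sentence):
    """Return the most probable lexicon segmentation, or None if none exists."""
    n = len(sentence)
    # best[(end, last_word)] = (log-probability, previous state)
    best = {(0, START): (0.0, None)}
    for i in range(n):
        # Snapshot the states ending at position i before extending them.
        for state in [s for s in best if s[0] == i]:
            score = best[state][0]
            for j in range(i + 1, min(n, i + MAX_LEN) + 1):
                word = sentence[i:j]
                if word not in LEXICON:
                    continue
                cand = score + log_bigram(state[1], word)
                key = (j, word)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, state)
    finals = [s for s in best if s[0] == n]
    if not finals:
        return None
    # Follow back-pointers from the best final state to recover the words.
    state = max(finals, key=lambda s: best[s][0])
    words = []
    while best[state][1] is not None:
        words.append(state[1])
        state = best[state][1]
    return words[::-1]
```

With these toy counts, `segment("研究生命起源")` prefers 研究 / 生命 / 起源 over 研究生 / 命 / 起源, because the collocations 研究→生命 and 生命→起源 have been observed while 研究生→命 has not.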