Summary: | 本論文使用非監督式新細胞認知機(Unsupervised neocognitron)神經網路來便是印刷體中文字。
關於非監督式新細胞認知機,本論文提出兩項修改。第一項,Us1子層的結點不進行學習,而是直接套用人為方式所指定的12個區域特徵,而Us1之後的S子層仍然使用非監督式學習的方式決定其所要偵測的區域特徵。第二項修改則是,在學習過中設定一個上限值來限制代表節點(representative)產生的個數。如此設計的目的是為了避免模板(cell-planes)分配不均的問題。在本研究,採用這兩項修改的新細胞認知機稱為模式一,而使用第二項修改的新細胞認知機稱為模式二。
本論文裡的所有實驗分為兩部分。在第一部分有四個實驗,這些實驗都使用相同的訓練範例與測試範例。訓練範例有兩組,第一組包含“川”,“三”,“大”,“人”,“台”等五個中文字。而第二組包含“零”,“壹”,“貳”,“參”,“肆”等中文字。訓練範例都市採用細明體,而測試範例則是採用其他九種不同字體。第一個實驗的主要目的是測試模式一的績效。實驗結果顯示,模式一很容易學習成功而且辨識率可以接受。另外三個實驗的目的是想要了解某些參數值與系統績效的關係。這些參數包含S-欄的大小(the size of S-column),模板樹(the number of cell-planes),以及節點的接收場大小(the size of cells’ receptive field)。這三個實驗所使用的網路系統是模式一。
第二部分有二個實驗,主要的目的是比較模式一與模式二的系統績效。在第一個實驗,所使用的訓練範例與第一部分實驗相同。實驗結果顯示模式一比較容易成功地學習,而且系統有不錯的表現。第二個實驗,使用17個中文字做為訓練範例。這17個字包括“零”,“壹”,“貳”,“參”,“肆”,“伍”,“陸”,“柒”,“捌”,“玖”,“拾”,“佰”,“仟”,“萬”,“億”,“圓”,“角”。實驗結果顯示,模式一仍然是一個不錯的系統。 === In this study, we are investigating the feasibility of applying the unsupervised neocognitron to the recognition of printed Chinese characters.
Two propositions for the unsupervised neocognitron are mentioned. The first on proposes that the input connections of the first layer are manually given, and all subsequent layers are trained unsupervised. The second one concerns the selection of representatives. During the process of learning, the number of cell-planes that send representatives for each training pattern has an upper bound. The unsupervised neocognitron with implementing these two propositions is named as Model 1, and the unsupervised neocognitron with implementing only the second proposition is named as Model 2.
Experiment in this study are grouped into two parts, called Part I and Part II. In Part I, four experiments are conducted. For each experiment, two sets of training patterns will be conducted respectively. The first one, called the simple training set, consists of five printed Chinese characters“川”,“三”,“大”,“人”, and “台” with size of 25*25 in MingLight font. The second one, called the complex training set, contains another five printed Chinese characters“零”,“壹”,“貳”,“參”, and “肆” in the some font and size. After training, these characters of other nine different fonts are presented to test the generalization of the network.
The objective of the first experiment of Part I is to investigate the performance of Model 1. Simulation results shot that Model 1 demonstrates a good ability to achieve a successful learning. In other three experiments, the effect of choosing different value for some parameters in investigated. The parameters include the size of S-column, the number of cell-planes, and the receptive field of cells.
In Part II, a comparison of the performance of Model 1 and Model 2 is made. In the first experiment, Model 1 and Model 2 are trained to recognize the simple and complex training sets described above. Experimental results show that Model 1 shows higher ability to achieve a successful learning, and performance of Model 1 is acceptable. In the second experiment, 17 training patterns are presented during the learning process. These training patterns include “零”,“壹”,“貳”,“參”,“肆”,“伍”,“陸”,“柒”,“捌”,“玖”,“拾”,“佰”,“仟”,“萬”,“億”,“圓”,, and “角”. From the simulation results, Model 1 is a promising approach for the recognition of printed Chinese characters.
|