An Advanced Study of Voice Diagnosis: Based on Laryngeal Cancer Voice Detection


Bibliographic Details
Main Authors: Zong-Ying Chuang, 莊宗穎
Other Authors: Shih-Hau Fang
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/9nbc47
Description
Summary: Master === Yuan Ze University === Department of Electrical Engineering (Group A) === 107 === In today's society, advances in science and technology make it possible to acquire increasingly complex physiological signals through various instruments, such as ECG signals, respiration, and voice. The voice detection studied in this thesis combines artificial-intelligence methods to detect whether the vocal folds are damaged, avoiding the discomfort caused by traditional endoscopic examination. Laryngeal cancer is the most serious of the pathological conditions considered, so this thesis builds on a previous architecture and applies model adaptation to achieve the best results.

To address the problem of limited data, this thesis uses two methods: transfer learning and feature augmentation. Transfer learning reduces the cost of data collection. Its advantage is most apparent for small or local hospitals, which can take a model pre-trained on data from a large hospital and re-train it on a small amount of local data, preserving the learned disease characteristics while adding local voice features to improve detection accuracy.

This thesis uses the openSMILE feature-extraction toolkit to obtain additional feature data. openSMILE is widely used in the field of acoustics; because it is modular, any required feature-extraction module can be added. Many teams also used it in the FEMH Voice Data Challenge 2018 competition.

The speech databases used in this thesis include the Far Eastern Memorial Hospital pathological voice database (FEMH), the FEMH Voice Data Challenge 2018 dataset, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database, and the Saarbrücken Voice Database (SVD). The FEMH database was established by the otolaryngology (ENT) department of Far Eastern Memorial Hospital; a total of 492 voice samples were used.
The FEMH database is the main database used in this thesis, and proof of concept is carried out with the FEMH Voice Data Challenge 2018 dataset. The MEEI and SVD databases serve as the basis for the pre-trained model in transfer learning.