The feasibility study of automatic categorization and quality evaluation for healthcare websites
Master's === National Yang-Ming University === Institute of Health Informatics and Decision Making === 96 === It is easy to search for healthcare information on the internet, but collecting useful and correct healthcare information is difficult. Evaluating the quality of healthcare websites is therefore an important and challenging task. Because of the large amoun...
Main Authors: | Min-Ling Lai, 賴敏玲 |
---|---|
Other Authors: | Polun Chang, 張博論 |
Format: | Others |
Language: | zh-TW |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/45435442128573769804 |
id |
ndltd-TW-096YM005483001 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096YM005483001 2015-10-13T13:51:29Z http://ndltd.ncl.edu.tw/handle/45435442128573769804 The feasibility study of automatic categorization and quality evaluation for healthcare websites 健康資訊網站分類及品質檢測自動化之可行性研究 Min-Ling Lai 賴敏玲 Master's National Yang-Ming University Institute of Health Informatics and Decision Making 96 Polun Chang 張博論 2008 thesis 75 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Yang-Ming University === Institute of Health Informatics and Decision Making === 96 === It is easy to search for healthcare information on the internet, but collecting useful and correct healthcare information is difficult, so evaluating the quality of healthcare websites is an important and challenging task. Because the number of webpages is large and growing extremely fast, an automatic tool for evaluating health information on the web is not only more efficient and cheaper than manual evaluation but also necessary. This thesis presents a method for finding healthcare websites and automatically assessing the quality of their health information against the official evaluation standards of the Taiwan government.
The system was developed in Visual Basic 6.0 with a SQL 2000 database. Text categorization separates health information from non-healthcare information, and information extraction is used to assess the quality of that information. The system integrates the word segmentation system of Academia Sinica in Taiwan with SVMlight, an SVM classifier implemented by Thorsten Joachims. The evaluation standards are those issued by the health department of the Taiwan government; one representative standard was selected for this study, namely the author information of health education material. The system includes (1) a crawler that collects webpages from the web, (2) the Academia Sinica word segmentation system, which segments the text of the webpages into words, (3) SVMlight, which categorizes the webpages, and (4) a mechanism that detects author information in the health information.
The initial corpus for text categorization contains 520 homepages. Text categorization performed well: only one page in the 52-page test set was categorized incorrectly. The corpus for author-information detection contains 2,114 webpages; on its test set, precision is 71.76% and recall is 83.56%.
The work and the system can be improved in the following directions: (1) expand the scope to all websites, not just government websites; (2) study the feasibility of automatically evaluating the other quality standards; (3) classify health information into finer categories; (4) enlarge the name, organization, and title databases used to recognize author information.
|
author2 |
Polun Chang |
author_facet |
Polun Chang Min-Ling Lai 賴敏玲 |
author |
Min-Ling Lai 賴敏玲 |
author_sort |
Min-Ling Lai |
title |
The feasibility study of automatic categorization and quality evaluation for healthcare websites |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/45435442128573769804 |
_version_ |
1717743866628538368 |
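The abstract in the description field names two steps that a short sketch can make concrete: feeding segmented webpages to SVMlight, and reading the reported test figures. The Python below is only an illustration under stated assumptions and is not the thesis's Visual Basic 6.0 implementation. The function names, the +1/-1 label convention, the raw term-count features, and the sample tokens are all hypothetical; only the SVMlight line layout (`<target> <feature>:<value> ...` with ascending feature ids) and the figures 52, 71.76%, and 83.56% come from SVMlight's documented input format and the abstract. The F1 value is derived here by arithmetic and is not reported in the record.

```python
from collections import Counter


def to_svmlight_line(label, tokens, vocab):
    """Encode one segmented page as a line in SVMlight's input format.

    label  : +1 for a healthcare page, -1 otherwise (assumed convention)
    tokens : words returned by a word segmenter (hypothetical input)
    vocab  : dict mapping word -> feature id; it grows as new words appear
    """
    counts = Counter()
    for tok in tokens:
        # Feature ids start at 1 and are assigned in order of first appearance.
        fid = vocab.setdefault(tok, len(vocab) + 1)
        counts[fid] += 1
    # SVMlight expects feature ids in ascending order within a line.
    feats = " ".join(f"{fid}:{n}" for fid, n in sorted(counts.items()))
    return f"{label:+d} {feats}".rstrip()


def accuracy(n_correct, n_total):
    """Fraction of test pages that were categorized correctly."""
    return n_correct / n_total


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    vocab = {}
    # Made-up tokens standing in for segmenter output.
    print(to_svmlight_line(+1, ["健康", "醫院", "健康"], vocab))
    # One error in the 52-page test set, as stated in the abstract.
    print(f"accuracy = {accuracy(51, 52):.2%}")
    # Precision and recall as stated in the abstract; F1 is derived.
    print(f"F1 = {f1_score(0.7176, 0.8356):.4f}")
```

Raw term counts are used only to keep the sketch short; the record does not say which feature weighting the thesis used.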