Self-Supervised Chinese Ontology Learning from Online Encyclopedias
Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a Chinese ontology built with self-supervised learning, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for o...
Main Authors: | Fanghuai Hu, Zhiqing Shao, Tong Ruan |
---|---|
Format: | Article |
Language: | English |
Published: |
Hindawi Limited
2014-01-01
|
Series: | The Scientific World Journal |
Online Access: | http://dx.doi.org/10.1155/2014/848631 |
id |
doaj-e4c8dcabd71a47818c3cd8b1fd3b1673 |
---|---|
record_format |
Article |
spelling |
doaj-e4c8dcabd71a47818c3cd8b1fd3b1673
2020-11-24T21:52:46Z
eng
Hindawi Limited
The Scientific World Journal
2356-6140
1537-744X
2014-01-01
2014
10.1155/2014/848631
848631
Self-Supervised Chinese Ontology Learning from Online Encyclopedias
Fanghuai Hu 0
Zhiqing Shao 1
Tong Ruan 2
Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a Chinese ontology built with self-supervised learning, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. To avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine-learning-based methods. First, we show, statistically and experimentally, that the self-supervised machine learning method is practicable for Chinese relation extraction (at least for synonymy and hyponymy), and we train some self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and a comparison of SSCO with other well-known ontologies and knowledge bases indicates high coverage; the experimental results also indicate that the self-supervised models clearly enrich SSCO.
http://dx.doi.org/10.1155/2014/848631 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Fanghuai Hu, Zhiqing Shao, Tong Ruan |
title |
Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
publisher |
Hindawi Limited |
series |
The Scientific World Journal |
issn |
2356-6140, 1537-744X |
publishDate |
2014-01-01 |
description |
Constructing an ontology manually is a time-consuming, error-prone,
and tedious task. We present SSCO, a Chinese ontology built with
self-supervised learning, which contains about 255 thousand concepts,
5 million entities, and 40 million facts. We explore the three largest online
Chinese encyclopedias for ontology learning and describe how to
transfer the structured knowledge in encyclopedias, including article titles,
category labels, redirection pages, taxonomy systems, and InfoBox
modules, into ontological form. To avoid the errors in encyclopedias
and enrich the learnt ontology, we also apply some
machine-learning-based methods. First, we show, statistically and
experimentally, that the self-supervised machine learning method is
practicable for Chinese relation extraction (at least for synonymy and
hyponymy), and we train some self-supervised models (SVMs and CRFs)
for synonymy extraction, concept-subconcept relation extraction, and
concept-instance relation extraction; the advantage of our methods is
that all training examples are automatically generated from the
structural information of encyclopedias and a few general heuristic
rules. Finally, we evaluate SSCO in two aspects, scale and precision;
manual evaluation results show that the ontology has excellent
precision, and a comparison of SSCO with other well-known ontologies
and knowledge bases indicates high coverage; the experimental results
also indicate that the self-supervised models clearly enrich SSCO. |
url |
http://dx.doi.org/10.1155/2014/848631 |