Question Generation from Knowledge Base Using Deep Learning Model
Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === With the advancement of data-driven approaches, the lack of corpora has become a major obstacle in natural language processing research. Compared with English corpora, publicly available Mandarin corpora are even scarcer. This thesis proposes to solve this...
Main Authors: Lin, Yi-Hsiu 林怡秀
Other Authors: Sun, Chuen-Tsai 孫春在
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/mkj9vx
id: ndltd-TW-106NCTU5394105
record_format: oai_dc
spelling: ndltd-TW-106NCTU5394105 2019-11-28T05:22:15Z http://ndltd.ncl.edu.tw/handle/mkj9vx Question Generation from Knowledge Base Using Deep Learning Model 以深度學習方法進行知識庫問題生成 Lin, Yi-Hsiu 林怡秀 Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === With the advancement of data-driven approaches, the lack of corpora has become a major obstacle in natural language processing research. Compared with English corpora, publicly available Mandarin corpora are even scarcer. This thesis proposes to solve this problem by using an existing question answering dataset and a knowledge base to create a new Mandarin question answering dataset. In this study, we first collect knowledge-base data from CN-DBpedia and question answering data from WebQA and a web crawler, propose a method to combine them into fact-question pairs as training data, and then use a sequence-to-sequence model to generate questions from the knowledge base. The generated questions are then paired with entities in the knowledge base as answers to create a new Mandarin question answering dataset. In our experiments, we develop a template-based question generation baseline and compare both systems through human evaluation. Our model achieves acceptable performance compared to the template-based baseline. Sun, Chuen-Tsai 孫春在 2018 thesis 60 zh-TW
collection: NDLTD
language: zh-TW
format: Others
sources: NDLTD
description: Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === With the advancement of data-driven approaches, the lack of corpora has become a major obstacle in natural language processing research. Compared with English corpora, publicly available Mandarin corpora are even scarcer. This thesis proposes to solve this problem by using an existing question answering dataset and a knowledge base to create a new Mandarin question answering dataset.
In this study, we first collect knowledge-base data from CN-DBpedia and question answering data from WebQA and a web crawler, propose a method to combine them into fact-question pairs as training data, and then use a sequence-to-sequence model to generate questions from the knowledge base. The generated questions are then paired with entities in the knowledge base as answers to create a new Mandarin question answering dataset. In our experiments, we develop a template-based question generation baseline and compare both systems through human evaluation. Our model achieves acceptable performance compared to the template-based baseline.
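The record contains no code, so the following is a minimal sketch of the pipeline the abstract describes. All names are illustrative assumptions, not the author's implementation: `make_training_pairs`, `Seq2SeqQG`, the `|||` separator, and the alignment heuristic (match a question to a triple when its gold answer equals the triple's object and its text mentions the subject) are invented for this sketch.

```python
import torch
import torch.nn as nn

def make_training_pairs(triples, qa_pairs):
    """Hypothetical alignment heuristic: pair a (subject, predicate, object)
    triple with a question whose gold answer equals the triple's object and
    whose text mentions the subject entity."""
    pairs = []
    for subj, pred, obj in triples:
        for question, answer in qa_pairs:
            if answer == obj and subj in question:
                # Linearize the triple as the source sequence.
                pairs.append((f"{subj} ||| {pred} ||| {obj}", question))
    return pairs

class Seq2SeqQG(nn.Module):
    """Minimal GRU encoder-decoder: reads a linearized triple, emits a question."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.emb(src_ids))       # encode the fact
        dec, _ = self.decoder(self.emb(tgt_ids), h)  # teacher forcing
        return self.out(dec)                         # vocabulary logits
```

Training would minimize token-level cross-entropy between these logits and the question tokens shifted by one position; at inference time, a question would be decoded greedily or with beam search from a start token and paired with the triple's entity as the answer.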
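For the template-based baseline the abstract compares against, a plausible minimal form maps each KB predicate to a fixed Mandarin question pattern. The predicates and templates below are invented for illustration; the thesis's actual templates are not given in this record.

```python
# Hypothetical predicate-to-template table (not the thesis's actual templates).
TEMPLATES = {
    "出生地": "{subj}的出生地是哪裡？",  # "Where is {subj}'s birthplace?"
    "作者":   "{obj}的作者是誰？",      # "Who is the author of {obj}?"
}

def template_question(subj, pred, obj):
    """Return a templated question for a triple, or None if no template fits."""
    pattern = TEMPLATES.get(pred)
    return pattern.format(subj=subj, obj=obj) if pattern else None
```

Human judges would then rate questions produced by both the template baseline and the sequence-to-sequence model, matching the human evaluation the abstract reports.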
author2: Sun, Chuen-Tsai
author_facet: Sun, Chuen-Tsai Lin, Yi-Hsiu 林怡秀
author: Lin, Yi-Hsiu 林怡秀
spellingShingle: Lin, Yi-Hsiu 林怡秀 Question Generation from Knowledge Base Using Deep Learning Model
author_sort: Lin, Yi-Hsiu
title: Question Generation from Knowledge Base Using Deep Learning Model
title_short: Question Generation from Knowledge Base Using Deep Learning Model
title_full: Question Generation from Knowledge Base Using Deep Learning Model
title_fullStr: Question Generation from Knowledge Base Using Deep Learning Model
title_full_unstemmed: Question Generation from Knowledge Base Using Deep Learning Model
title_sort: question generation from knowledge base using deep learning model
publishDate: 2018
url: http://ndltd.ncl.edu.tw/handle/mkj9vx
work_keys_str_mv: AT linyihsiu questiongenerationfromknowledgebaseusingdeeplearningmodel AT línyíxiù questiongenerationfromknowledgebaseusingdeeplearningmodel AT linyihsiu yǐshēndùxuéxífāngfǎjìnxíngzhīshíkùwèntíshēngchéng AT línyíxiù yǐshēndùxuéxífāngfǎjìnxíngzhīshíkùwèntíshēngchéng
_version_: 1719297800536064000