Multilingual Machine Reading Comprehension based on BERT Model


Bibliographic Details
Main Authors: WU, CHENG-XUAN, 吳承軒
Other Authors: WANG, JENQ-HAUR
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/aua9d5
id ndltd-TW-107TIT00392066
record_format oai_dc
spelling ndltd-TW-107TIT003920662019-11-09T05:23:36Z http://ndltd.ncl.edu.tw/handle/aua9d5 Multilingual Machine Reading Comprehension based on BERT Model 基於BERT模型之多國語言機器閱讀理解研究 WU, CHENG-XUAN 吳承軒 Master's === National Taipei University of Technology === Department of Computer Science and Information Engineering === 107 WANG, JENQ-HAUR 王正豪 2019 thesis 43 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taipei University of Technology === Department of Computer Science and Information Engineering === 107 === In recent years, the amount of information on the Internet has grown so much that people rely on it every day. Access to this information, however, is limited by information retrieval techniques: although the Internet provides many different types of information resources, the results returned to a user may be neither relevant nor helpful. With the development of neural networks, many research fields have made progress. In particular, Question Answering and Machine Comprehension have become increasingly popular topics in Natural Language Processing, owing to the importance of information retrieval and chatbots in the past few years. In this thesis, we use Google's pre-trained BERT model to compute word embeddings. Because BERT is pre-trained on a massive corpus, with 15% of the tokens masked during training, the resulting embeddings generalize better to unseen inputs and capture richer semantics. Our model forms a semantic sentence feature from the word embeddings of individual characters and words, uses cosine similarity to score each answer option against the sentence, and selects the option with the highest cosine similarity as the machine's inferred answer. We run experiments on the TOEFL-QA dataset and the Grand Challenge dataset, comparing against a Bi-directional Gated Recurrent Unit method and a strong alignment IR baseline, and obtain 34.87% and 57.5% accuracy, respectively. The results suggest that our model is multilingual to some extent, even when grammar differs across languages.
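The answer-selection step the abstract describes (embed the passage and each option with BERT, then pick the option with the highest cosine similarity) can be sketched as follows. This is a minimal illustration, not the thesis's actual code: the Hugging Face transformers library, the bert-base-multilingual-cased checkpoint, and mean pooling over token embeddings are all assumptions, since the record does not specify them.

```python
# Hedged sketch of the described pipeline: BERT word embeddings,
# mean-pooled into a sentence vector, cosine similarity per option,
# argmax as the predicted answer. Library, checkpoint, and pooling
# choice are assumptions; the thesis record does not specify them.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def sentence_vector(text: str) -> torch.Tensor:
    """Mean-pool BERT's last hidden states into one sentence vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # shape (768,)

def pick_answer(passage: str, options: list[str]) -> int:
    """Return the index of the option most similar to the passage,
    where similarity is cos(a, b) = a.b / (|a| |b|)."""
    p = sentence_vector(passage)
    scores = [
        torch.nn.functional.cosine_similarity(p, sentence_vector(o), dim=0).item()
        for o in options
    ]
    return max(range(len(options)), key=scores.__getitem__)
```

Because options are ranked purely by cosine similarity in a shared embedding space, the same code runs unchanged on Chinese or English inputs when a multilingual checkpoint is used, which is the multilingual property the abstract claims.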
author2 WANG, JENQ-HAUR
author_facet WANG, JENQ-HAUR
WU, CHENG-XUAN
吳承軒
author WU, CHENG-XUAN
吳承軒
spellingShingle WU, CHENG-XUAN
吳承軒
Multilingual Machine Reading Comprehension based on BERT Model
author_sort WU, CHENG-XUAN
title Multilingual Machine Reading Comprehension based on BERT Model
title_short Multilingual Machine Reading Comprehension based on BERT Model
title_full Multilingual Machine Reading Comprehension based on BERT Model
title_fullStr Multilingual Machine Reading Comprehension based on BERT Model
title_full_unstemmed Multilingual Machine Reading Comprehension based on BERT Model
title_sort multilingual machine reading comprehension based on bert model
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/aua9d5
work_keys_str_mv AT wuchengxuan multilingualmachinereadingcomprehensionbasedonbertmodel
AT wúchéngxuān multilingualmachinereadingcomprehensionbasedonbertmodel
AT wuchengxuan jīyúbertmóxíngzhīduōguóyǔyánjīqìyuèdúlǐjiěyánjiū
AT wúchéngxuān jīyúbertmóxíngzhīduōguóyǔyánjīqìyuèdúlǐjiěyánjiū
_version_ 1719288959618514944