Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure

碩士 === 國立中央大學 === 資訊工程學系 === 107 === With the development of Natural Language Processing (NLP), automatic question-answering systems such as Watson, Siri, and Alexa have become one of the most important NLP applications. Nowadays, enterprises try to build automatic customer service chatbots to save human re...


Bibliographic Details
Main Authors: Hsiang-En Cherng, 程祥恩
Other Authors: 張嘉惠
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/53e785
id ndltd-TW-107NCU05392035
record_format oai_dc
spelling ndltd-TW-107NCU053920352019-10-22T05:28:09Z http://ndltd.ncl.edu.tw/handle/53e785 Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure 應用記憶增強機制階層式深度學習模型於短文對話之對話品質與事件偵測任務 Hsiang-En Cherng 程祥恩 碩士 國立中央大學 資訊工程學系 107 With the development of Natural Language Processing (NLP), automatic question-answering systems such as Watson, Siri, and Alexa have become one of the most important NLP applications. Nowadays, enterprises try to build automatic customer service chatbots to save human resources and provide 24-hour customer service. However, the evaluation of chatbots currently relies heavily on human annotation, which costs plenty of time. Thus, Short Text Conversation 3 (STC-3) at NTCIR-14 initiated new subtasks called Dialogue Quality (DQ) and Nugget Detection (ND), which aim to automatically evaluate dialogues generated by chatbots. In this paper, we consider the DQ and ND subtasks of STC-3 using deep learning methods. The DQ subtask aims to judge the quality of a whole dialogue using three measures: Task Accomplishment (A-score), Dialogue Effectiveness (E-score), and Customer Satisfaction of the dialogue (S-score). The ND subtask, on the other hand, is to classify whether an utterance in a dialogue contains a nugget, which is similar to the dialogue act (DA) labeling problem. We applied a general model with an utterance layer, a context layer, and a memory layer to learn dialogue representations for both the DQ and ND subtasks, and used gating and attention mechanisms at multiple layers, including the utterance layer and the context layer. We also tried BERT and a multi-stack CNN for sentence representation. The results show that BERT produced better utterance representations than the multi-stack CNN for both the DQ and ND subtasks and outperformed other participants' models and the baseline models proposed by the organizer on the Ubuntu customer helpdesk dialogue corpus. 張嘉惠 2019 學位論文 ; thesis 55 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description 碩士 === 國立中央大學 === 資訊工程學系 === 107 === With the development of Natural Language Processing (NLP), automatic question-answering systems such as Watson, Siri, and Alexa have become one of the most important NLP applications. Nowadays, enterprises try to build automatic customer service chatbots to save human resources and provide 24-hour customer service. However, the evaluation of chatbots currently relies heavily on human annotation, which costs plenty of time. Thus, Short Text Conversation 3 (STC-3) at NTCIR-14 initiated new subtasks called Dialogue Quality (DQ) and Nugget Detection (ND), which aim to automatically evaluate dialogues generated by chatbots. In this paper, we consider the DQ and ND subtasks of STC-3 using deep learning methods. The DQ subtask aims to judge the quality of a whole dialogue using three measures: Task Accomplishment (A-score), Dialogue Effectiveness (E-score), and Customer Satisfaction of the dialogue (S-score). The ND subtask, on the other hand, is to classify whether an utterance in a dialogue contains a nugget, which is similar to the dialogue act (DA) labeling problem. We applied a general model with an utterance layer, a context layer, and a memory layer to learn dialogue representations for both the DQ and ND subtasks, and used gating and attention mechanisms at multiple layers, including the utterance layer and the context layer. We also tried BERT and a multi-stack CNN for sentence representation. The results show that BERT produced better utterance representations than the multi-stack CNN for both the DQ and ND subtasks and outperformed other participants' models and the baseline models proposed by the organizer on the Ubuntu customer helpdesk dialogue corpus.
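The abstract describes a hierarchical design: an utterance layer pools word-level representations into one vector per utterance, and a context layer applies attention over those utterance vectors to obtain a dialogue-level representation. The sketch below illustrates only that two-level structure; it is not the thesis's actual model. Mean pooling stands in for the multi-stack CNN/BERT utterance encoder, a single dot-product attention query stands in for the learned gating and attention mechanisms, and all names (`encode_utterance`, `encode_context`, `DIM`) are illustrative.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode_utterance(word_vecs):
    # Utterance layer: pool word vectors into one utterance vector.
    # (A stand-in for the thesis's multi-stack CNN / BERT encoder.)
    n, dim = len(word_vecs), len(word_vecs[0])
    return [sum(v[d] for v in word_vecs) / n for d in range(dim)]

def encode_context(utt_vecs, query):
    # Context layer: attention over utterance vectors yields a
    # dialogue-level vector (a stand-in for the learned context layer).
    weights = softmax([dot(u, query) for u in utt_vecs])
    dim = len(utt_vecs[0])
    return [sum(w * u[d] for w, u in zip(weights, utt_vecs))
            for d in range(dim)]

random.seed(0)
DIM = 8
# A toy dialogue of three utterances with 5, 7, and 4 words
dialogue = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(n)]
            for n in (5, 7, 4)]
utts = [encode_utterance(u) for u in dialogue]
query = [random.gauss(0, 1) for _ in range(DIM)]
dialogue_vec = encode_context(utts, query)
print(len(dialogue_vec))  # 8
```

In the actual system, the dialogue-level vector would feed the DQ regression heads (A-, E-, S-scores), while the per-utterance vectors, refined by the context and memory layers, would feed the per-utterance ND classifier.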
author2 張嘉惠
author_facet 張嘉惠
Hsiang-En Cherng
程祥恩
author Hsiang-En Cherng
程祥恩
spellingShingle Hsiang-En Cherng
程祥恩
Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure
author_sort Hsiang-En Cherng
title Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure
title_short Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure
title_full Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure
title_fullStr Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure
title_full_unstemmed Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure
title_sort dialogue quality and nugget detection for short text conversation based on hierarchical multi-stack model with memory enhance structure
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/53e785
work_keys_str_mv AT hsiangencherng dialoguequalityandnuggetdetectionforshorttextconversationbasedonhierarchicalmultistackmodelwithmemoryenhancestructure
AT chéngxiángēn dialoguequalityandnuggetdetectionforshorttextconversationbasedonhierarchicalmultistackmodelwithmemoryenhancestructure
AT hsiangencherng yīngyòngjìyìzēngqiángjīzhìjiēcéngshìshēndùxuéxímóxíngyúduǎnwénduìhuàzhīduìhuàpǐnzhìyǔshìjiànzhēncèrènwù
AT chéngxiángēn yīngyòngjìyìzēngqiángjīzhìjiēcéngshìshēndùxuéxímóxíngyúduǎnwénduìhuàzhīduìhuàpǐnzhìyǔshìjiànzhēncèrènwù
_version_ 1719273869267697664