Sentence Attention-based Continuous Dialog State Tracking and Reinforcement Learning for Interview Coaching


Bibliographic Details
Main Authors: Chu-Kwang Chen, 陳垂康
Other Authors: Chung-Hsien Wu
Format: Others
Language: en_US
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/whcrug
Description
Summary: Master's thesis === National Cheng Kung University === Department of Computer Science and Information Engineering === 105 === Admission interviews are among the most frequently used methods of student selection. Even though people recognize the importance of such interviews, very few practice their interview skills effectively by seeking professional help. Many students therefore lack interview experience and are likely to be nervous during an interview. There are many ways to improve students' interview skills, one of which is to hire a professional interview coach. This is the most direct way to practice interview skills, but it is also rather expensive. The main purpose of this thesis is thus to develop a dialog manager for an interview coaching system. In a dialog system, both Dialog State Tracking (DST) and Dialog Policy are important tasks. Traditional approaches define semantic slots manually for dialog state representation and tracking; this thesis instead adopts the topic profiles of the sentences as the representation of a dialog state. When the input sequence consists of several sentences, the summary vector is likely to contain noisy information from many irrelevant feature vectors. This thesis therefore applies a sentence attention mechanism that combines a Convolutional Neural Tensor Network (CNTN) with topic profiles for dialog state tracking. An LSTM-based autoencoder is used as the dialog state tracker to model the transition and accumulation of dialog states. Finally, by applying Reinforcement Learning (RL) with the designed reward functions, the agent learns its behavior from interactions with the environment to make action decisions. This study collected 260 interview dialogs containing 3,016 dialog turns, and a five-fold cross-validation scheme was employed for evaluation. The results show that the proposed method outperformed the semantic slot-based baseline in terms of the number of normal actions taken, the number of follow-up actions taken, and the accumulated reward obtained by the Dialog Policy on the collected corpus.
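The abstract does not spell out the CNTN features or how the topic profile is constructed, so the following is only a minimal NumPy sketch of sentence-attention pooling under stated assumptions: each sentence is already encoded as a feature vector, and the topic profile acts as the attention query. The function name `attend_sentences`, the dot-product scoring, and all dimensions are illustrative, not the thesis's implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend_sentences(sentence_vecs, topic_profile):
    """Sentence-attention pooling: weight each sentence feature vector by its
    relevance to the topic profile, then sum into one summary vector, so that
    irrelevant sentences contribute less noise to the summary.

    sentence_vecs: (n_sentences, dim) array of sentence features
    topic_profile: (dim,) query vector representing the current topic
    """
    scores = sentence_vecs @ topic_profile   # relevance score per sentence
    weights = softmax(scores)                # attention distribution over sentences
    summary = weights @ sentence_vecs        # weighted sum -> (dim,) summary vector
    return summary, weights

# Toy usage: three sentences with 4-dim features and a topic-profile query.
rng = np.random.default_rng(0)
sents = rng.normal(size=(3, 4))
topic = rng.normal(size=4)
summary, w = attend_sentences(sents, topic)
print(w, summary)
```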
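For the tracker itself, here is a minimal PyTorch sketch in the spirit of the LSTM-based autoencoder described above: the encoder's final hidden state is treated as the accumulated dialog state across turns, and the decoder attempts to reconstruct the turn sequence. The class name, layer sizes, and single-layer design are assumptions, not the thesis architecture.

```python
import torch
import torch.nn as nn

class DialogStateAutoencoder(nn.Module):
    """Sketch of an LSTM autoencoder for dialog state tracking: the encoder
    consumes per-turn feature vectors and its final hidden state serves as the
    accumulated dialog state; the decoder reconstructs the input sequence.
    Dimensions are illustrative only."""
    def __init__(self, dim=64, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(dim, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, dim)

    def forward(self, turns):              # turns: (batch, n_turns, dim)
        _, (h, _) = self.encoder(turns)    # h: (1, batch, hidden)
        # Repeat the final hidden state as the decoder input at every turn.
        rep = h.transpose(0, 1).repeat(1, turns.size(1), 1)
        dec, _ = self.decoder(rep)
        return self.out(dec), h.squeeze(0)  # reconstruction, dialog state

model = DialogStateAutoencoder()
recon, state = model(torch.randn(2, 5, 64))
print(recon.shape, state.shape)  # (2, 5, 64) and (2, 32)
```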
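On the policy side, the abstract mentions normal and follow-up actions learned via RL but does not specify the algorithm, state space, or reward functions. As one hedged illustration, the sketch below uses generic tabular Q-learning to choose between two hypothetical question actions; the action names, reward value, and integer state ids are invented for the example.

```python
# Hypothetical action inventory; the thesis's actual action set is not given here.
ACTIONS = ["normal_question", "follow_up_question"]

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One temporal-difference update toward reward + gamma * max_a Q(s', a)."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

# Toy usage over integer dialog-state ids with an assumed reward of 1.0.
Q = {s: {a: 0.0 for a in ACTIONS} for s in range(3)}
q_update(Q, state=0, action="follow_up_question", reward=1.0, next_state=1)
print(Q[0])
```

Accumulating such per-turn rewards over a dialog is one way to obtain the "accumulated reward" statistic the abstract uses to compare the proposed method against the semantic slot-based baseline.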