
Graduate Student: 詹立馨 (Li-Sin Zhan)
Thesis Title: Measuring the Progress of Customer Service Dialogues without Human Annotation (Chinese title: 協助客服判斷無標籤真實對話進度)
Advisor: 鮑興國 (Hsing-Kuo Pao)
Oral Defense Committee: 鄧惟中 (Wei-Chung Teng), 項天瑞 (Tien-Ruey Hsiang)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Academic Year of Graduation: 108 (2019-2020)
Language: English
Number of Pages: 64
Keywords: Natural language processing, Machine learning, Feature extraction, Text analysis, Customer service
Access Counts: Views: 247, Downloads: 4
With the rapid development of technology, consumer demand for electronics such as mobile phones keeps growing, and purchase decisions are no longer based only on price and quality but also on the quality of after-sales service. Many companies therefore spend heavily on hiring customer service agents, and they hope to reduce this cost by using artificial intelligence (AI) to assist the human agents. Although raw customer service dialogues are easy to obtain in the era of information explosion, most of them carry no human annotations that could serve as training labels for an AI model. To analyse dialogues without human annotation, we use the order of the utterances within a dialogue to generate labels, so that the proposed method can still measure the progress of a dialogue and thereby assist the human agent.
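
The abstract does not spell out how labels are derived from the dialogue order. The short Python sketch below shows one plausible reading of that idea, assigning each utterance a progress value from its position in the conversation; the function name, the toy dialogue, and the (i + 1)/n scheme are illustrative assumptions, not the thesis's actual label construction.

# Minimal sketch (assumption, not the thesis's code): generating progress
# labels from the order of the utterances, with no human annotation.

def progress_labels(dialogue):
    """Return a label in (0, 1] for each utterance, based only on its position."""
    n = len(dialogue)
    return [(i + 1) / n for i in range(n)]

# Hypothetical dialogue: alternating customer / agent turns.
dialogue = [
    "my laptop will not boot",
    "have you tried resetting the BIOS?",
    "yes, still nothing",
    "please send the device in for repair",
    "thank you, I will",
]

print(progress_labels(dialogue))  # [0.2, 0.4, 0.6, 0.8, 1.0]

Under such a position-based scheme, turns that do not actually move the conversation forward would receive misleading labels; this is the "issue data" problem the next paragraph refers to.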

In this thesis, the proposed method uses unsupervised models to extract the topics of each dialogue and vector representations of the dialogues as features, and it explicitly handles the issue data whose labels, generated from the dialogue order, are wrong. As a result, the method can measure the progress of dialogues without human annotation, and its predictions are comparable to those of models trained on dialogues with human annotation.
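
As a similarly hedged sketch of the unsupervised feature extraction described above, the snippet below derives an LDA topic distribution and a Doc2Vec vector for each utterance; gensim, the toy utterances, and all parameter values are assumptions for illustration, not the tooling or settings used in the thesis.

# Minimal sketch (assumption): unsupervised features for each utterance, which
# could then feed a sequence model (e.g. an LSTM) trained on the
# position-based progress labels sketched earlier.
from gensim.corpora import Dictionary
from gensim.models import LdaModel
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Hypothetical tokenised utterances pooled from many dialogues.
utterances = [
    ["my", "laptop", "will", "not", "boot"],
    ["have", "you", "tried", "resetting", "the", "bios"],
    ["please", "send", "the", "device", "in", "for", "repair"],
    ["where", "is", "my", "order"],
    ["it", "ships", "tomorrow"],
]

# Topic features: per-utterance topic distribution from an LDA model.
dictionary = Dictionary(utterances)
bows = [dictionary.doc2bow(u) for u in utterances]
lda = LdaModel(corpus=bows, id2word=dictionary, num_topics=3,
               passes=10, random_state=0)
topic_features = [
    [p for _, p in lda.get_document_topics(b, minimum_probability=0.0)]
    for b in bows
]

# Vector features: a fixed-length embedding of each utterance via Doc2Vec.
tagged = [TaggedDocument(words=u, tags=[i]) for i, u in enumerate(utterances)]
d2v = Doc2Vec(tagged, vector_size=16, min_count=1, epochs=40)
vector_features = [d2v.infer_vector(u) for u in utterances]

print(len(topic_features[0]), len(vector_features[0]))  # 3 16

How the thesis actually builds its features (including the conditional entropy and word-count features listed in Section 4.5.3), filters the issue data, and trains the LSTM is described in Chapters 3 and 4; the sketch above is only meant to make the abstract's wording concrete.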

Table of Contents:
Recommendation Letter
Approval Letter
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
List of Algorithms
1 Introduction
  1.1 Motivation
  1.2 Thesis Outline
2 Related Work
3 Methodology
  3.1 Natural language processing
    3.1.1 Embedding
    3.1.2 Latent Dirichlet Allocation
  3.2 Long short-term memory
  3.3 Proposed methodology
    3.3.1 Task 1: Measuring progress of dialogues
    3.3.2 Task 2: Predicting whether the dialogue is successful or not
4 Experiment and result
  4.1 Evaluation
    4.1.1 Perplexity
    4.1.2 Accuracy
    4.1.3 MAE
  4.2 Dataset
    4.2.1 ASUS corporation dataset
    4.2.2 MultiWOZ dataset
  4.3 Data processing
    4.3.1 Basic data preprocessing
    4.3.2 Data preprocessing for training the LDA model
    4.3.3 Manually labelled data
  4.4 Experiment setting
    4.4.1 Task 1: Measuring progress of dialogues
    4.4.2 Task 2: Predicting whether the dialogue is successful or not
  4.5 Feature engineering
    4.5.1 Transferring sentences to vectors
    4.5.2 Extracting the topics of dialogues
    4.5.3 Conditional entropy and number of words
  4.6 Task 1: Measuring progress of dialogues
    4.6.1 Comparing the different features
    4.6.2 Comparing the different models
    4.6.3 Comparing the different datasets
  4.7 Task 2: Predicting whether the dialogue is successful or not
  4.8 The result after dealing with issue data
  4.9 Extending the classification to the regression problem
    4.9.1 Comparing the different methodologies
5 Conclusions
References
Appendix A: Dialogue Examples
  A.1 ASUS corporation dialogues
  A.2 MultiWOZ dialogues

