| Author: | 王宗仁 (Tsung-Jen Wang) |
|---|---|
| Thesis Title: | 基於意圖具可分解性與具關聯性之自然語言理解 (Natural Language Understanding Based on Decomposable and Associative Intent) |
| Advisor: | 陳正綱 (Cheng-Kang Chen) |
| Committee: | 呂永和 (Yung-Ho Leu), 林伯慎 (Bor-Shen Lin) |
| Degree: | 碩士 (Master) |
| Department: | 管理學院資訊管理系 (Department of Information Management, School of Management) |
| Thesis Publication Year: | 2019 |
| Graduation Academic Year: | 107 |
| Language: | Chinese |
| Pages: | 65 |
| Keywords (in Chinese): | 口語對話系統、自然語言理解、槽填充、意圖識別、深度學習 |
| Keywords (in English): | Spoken Dialogue System, Natural Language Understanding, Slot Filling, Intent Detection, Deep Learning |
| Reference times: | Clicks: 442, Downloads: 12 |
This thesis studies the natural language understanding problem in spoken dialogue systems, which can be divided into two tasks: slot filling and intent detection. In the overall understanding process, intent detection captures the gist of the whole utterance (e.g., booking a train ticket, querying a flight), while slot filling extracts the details within it (e.g., departure and arrival locations, times). The two tasks are highly correlated, yet prior research has treated intent detection as an independent task; this study therefore aims to increase the usefulness and information content of the intent. Here, intent is regarded as "decomposable" and "associative": an intent can be decomposed into sub-intents, which makes intent classification more precise, and the associations among sub-intents can be identified and supplied to the slot filling task to improve its accuracy.

Under this concept, this study builds a solution model with neural networks. Based on the sequence-to-sequence model and the attention mechanism, slot filling and intent detection are strongly coupled so that they must jointly identify more precise sub-intents and thereby reach a globally optimal solution. Experimental results show that the proposed model outperforms models from the prior literature in accuracy on the ATIS and Snips datasets. The model also better tolerates coarse intent classification: even after the intent set is reduced, classification accuracy is maintained.
The purpose of this paper is to discuss the natural language understanding problem in spoken dialogue systems. The problem can be divided into slot filling and intent detection: intent detection captures the focus of the entire utterance, while slot filling extracts the details from it. The two tasks are therefore highly correlated with each other; however, in past research intent detection has been treated as an independent task. This study aims to increase the usefulness and information content of the intent. Intent is considered to be "decomposable" and "associative," so we can decompose an intent into sub-intents and find the relations among them. This makes intent classification more precise, and by providing these relations to the slot filling task, we can also improve its accuracy.
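To make the two tasks concrete, the snippet below annotates one hypothetical flight-query utterance in the style of the ATIS corpus: intent detection assigns a single label to the whole sentence, while slot filling assigns one BIO tag per token. The intent and slot names here are illustrative assumptions, not labels taken from the thesis.

```python
# Hypothetical ATIS-style example (labels are assumptions for illustration).
utterance = "show flights from boston to taipei at 8 am".split()

# Intent detection: one label for the whole utterance.
intent = "atis_flight"

# Slot filling: one BIO tag per token (B- begins a slot, I- continues it).
slots = [
    "O", "O", "O", "B-fromloc.city_name",
    "O", "B-toloc.city_name", "O",
    "B-depart_time.time", "I-depart_time.time",
]
assert len(slots) == len(utterance)  # exactly one tag per word

def extract_slots(tokens, tags):
    """Group B-/I- spans back into slot-name -> slot-value pairs."""
    spans, current = {}, None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            spans[current] = [tok]
        elif tag.startswith("I-") and current == tag[2:]:
            spans[current].append(tok)
        else:
            current = None
    return {name: " ".join(words) for name, words in spans.items()}

print(extract_slots(utterance, slots))
# {'fromloc.city_name': 'boston', 'toloc.city_name': 'taipei',
#  'depart_time.time': '8 am'}
```

In a joint model of the kind this thesis proposes, both outputs are predicted from a shared encoding of the utterance rather than by two unrelated classifiers.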
Under this concept, this study uses a neural network to establish a solution model. Based on the sequence-to-sequence model and the attention mechanism, we build a strong coupling between slot filling and intent detection, making it necessary for them to find the sub-intents together to achieve global optimization. The experiments show that our proposed model improves accuracy on the ATIS and Snips datasets, and that it can tolerate coarse intent classification: even when the intent set is reduced, accuracy is maintained.
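The coupling above rests on attention: at each decoding step the model weighs all encoder positions and summarizes them into a context vector that can feed both the per-token slot classifier and the utterance-level intent classifier. The NumPy sketch below shows one dot-product attention step under assumed shapes; it is a minimal illustration of the mechanism, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical encoder hidden states for a 5-token utterance, hidden size 8.
T, d = 5, 8
encoder_states = rng.normal(size=(T, d))

# One decoder step: a query vector scores every encoder position
# (scaled dot-product attention; Bahdanau-style models use an additive score).
query = rng.normal(size=(d,))
scores = encoder_states @ query / np.sqrt(d)   # shape (T,)
weights = softmax(scores)                      # attention distribution over tokens
context = weights @ encoder_states             # shape (d,) weighted summary

# The weights form a proper distribution over the input positions.
assert np.isclose(weights.sum(), 1.0)
```

In a joint architecture, this context vector (together with the decoder state) would drive both output heads, which is what lets sub-intent information flow into the slot filling predictions.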