
Graduate Student: Chun-Kuan Lee (李俊寬)
Thesis Title: Deep CNN Stereo Camera Based Dynamic Face Emotion Recognition to Fulfill Human-Robot Interaction Tasks
Advisor: Chih-Lyang Hwang (黃志良)
Committee Members: Ching-Long Shih (施慶隆), Wen-Shyong Yu (游文雄), Chi-Yi Tsai (蔡奇謚)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Academic Year of Graduation: 108 (2019-2020)
Language: Chinese
Number of Pages: 44
Keywords: convolutional neural network, facial expression recognition, person detection, omni-directional mobile robot, visual search and tracking, adaptive finite-time hierarchical saturation control
Abstract:
    In this thesis, the Single-Shot Detection (SSD) method is first used to detect the relevant person, and a multilayer perceptron network (MLPN) then estimates the distance and angle between the person and the omni-directional mobile robot (ODSR). A Deep Convolutional Neural Network (DCNN), trained with the Stochastic Gradient Descent (SGD) learning rule and the "Adam" optimizer, extracts features, and a softmax layer classifies six facial expressions (i.e., anger, disgust, fear, happiness, surprise, and sadness). Four facial expression databases (NTUST-IRL, Cohn-Kanade, JAFFE, and KDEF) are merged to improve the generalization and robustness of the developed method. The SSD is paired with a dual-lens stereo camera with a resolution of 2560×720 for person detection. Using the distance and angle estimated by the MLPN, the ODSR is commanded to move to a point 3 meters in front of the person; a Haar-Cascade feature descriptor then locates the face, and the cropped face region of appropriate size is fed into the trained DCNN for dynamic facial expression recognition. Finally, the song corresponding to the recognized expression is played to complete the human-robot interaction task.
    The pose control required to search for the person and recognize the expression is achieved with an image-based adaptive finite-time hierarchical saturation control (IB-AFTHSC). Finally, a series of experiments, including users outside the databases and different facial expressions, verifies the effectiveness and robustness of the proposed method.
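    As a concrete illustration of the recognition stage summarized above, the following is a minimal sketch, assuming OpenCV's bundled Haar-cascade frontal-face model and a previously trained Keras DCNN. The model file name, the 48×48 grayscale input size, the class index order, the song mapping, and the still-image path are illustrative placeholders, not values taken from the thesis.

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Six expression classes used in the thesis, in an assumed index order.
CLASSES = ["anger", "disgust", "fear", "happiness", "surprise", "sadness"]
# Hypothetical mapping from recognized expression to a song file.
SONGS = {c: f"songs/{c}.mp3" for c in CLASSES}

# OpenCV ships this Haar-cascade file; the DCNN file name is a placeholder.
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
dcnn = load_model("expression_dcnn.h5")  # trained separately (see Chapter 4)

def recognize_expression(frame_bgr):
    """Detect the largest face in a frame and classify its expression."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    # Keep the largest detection, then crop and normalize it for the DCNN.
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)).astype("float32") / 255.0
    probs = dcnn.predict(face.reshape(1, 48, 48, 1), verbose=0)[0]  # softmax output
    return CLASSES[int(np.argmax(probs))]

# Example: use the left half of a 2560x720 side-by-side stereo frame.
stereo = cv2.imread("stereo_frame.png")      # placeholder image path
left = stereo[:, : stereo.shape[1] // 2]     # left image of the stereo pair
label = recognize_expression(left)
if label is not None:
    print("Expression:", label, "-> play", SONGS[label])
```

    In the actual system the frame would come from the dual-lens camera stream after the ODSR has approached the person, and the classification result would trigger song playback; here a still image stands in for that stream.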

    Table of Contents:
    Abstract (Chinese); Abstract; Table of Contents; List of Figures; List of Tables
    Chapter 1: Introduction and Literature Review (1.1 Introduction; 1.2 Literature Review)
    Chapter 2: System Construction and Task Statement (2.1 System Construction; 2.2 Task Statement: 2.2.1 Person Detection, 2.2.2 Person Localization, 2.2.3 Face Approach, 2.2.4 Facial Expression Recognition)
    Chapter 3: 3D Localization of the Person (3.1 Introduction to Neural Networks; 3.2 Multilayer Perceptron: 3.2.1 Hidden Neurons, 3.2.2 Multilayer Perceptron Architecture, 3.2.4 Training Phase and Results)
    Chapter 4: Dynamic Facial Expression Recognition (4.1 Deep Learning; 4.2 Deep Convolutional Neural Network; 4.3 Training Data; 4.4 Stereo Vision Camera)
    Chapter 5: Experimental Results and Discussion (5.1 Test Database Results; 5.2 Video-Based Recognition Results; 5.3 Experimental Analysis)
    Chapter 6: Conclusion and Future Suggestions
    References
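    Chapter 3 of the thesis trains a multilayer perceptron (MLPN) to regress the person's distance and angle relative to the ODSR. The sketch below shows one plausible way to set up such a regressor in Keras; the choice of input features (here the SSD bounding-box geometry plus stereo disparity), the hidden-layer sizes, and the training data are assumptions for illustration only, not the configuration reported in the thesis.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Assumed input features per sample: (x, y, w, h) of the person's bounding
# box in the left image plus its horizontal disparity to the right image.
N_FEATURES = 5

# Small MLP regressor with two outputs: distance (m) and angle (deg).
mlpn = Sequential([
    Dense(32, activation="relu", input_shape=(N_FEATURES,)),
    Dense(32, activation="relu"),
    Dense(2, activation="linear"),
])
mlpn.compile(optimizer="adam", loss="mse")

# Placeholder training data; in the thesis these would be measured
# bounding-box features paired with ground-truth distance/angle labels.
X = np.random.rand(256, N_FEATURES).astype("float32")
y = np.random.rand(256, 2).astype("float32")
mlpn.fit(X, y, epochs=10, batch_size=32, verbose=0)

distance, angle = mlpn.predict(X[:1], verbose=0)[0]
print(f"estimated distance = {distance:.2f} m, angle = {angle:.2f} deg")
```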


    Full-Text Release Date: 2025/07/22 (campus network)
    Full-Text Release Date: 2025/07/22 (off-campus network)
    Full-Text Release Date: 2025/07/22 (National Central Library: Taiwan NDLTD system)