Graduate student: Yu-Chen Teng (鄧羽辰)
Thesis title: Human-Robot Collaboration Using Sequential-Recurrent-Convolution-Network-Based Dynamic Face Emotion and Wireless Speech Command Recognitions
Original title: 應用具有序列迴歸卷積網路為基準的動態人臉表情及無線語音命令辨識的人機協同之研究
Advisor: Chih-Lyang Hwang (黃志良)
Oral examination committee: Chih-Lyang Hwang (黃志良)
Jing-Ming Guo
Ching-Long Shih
Chi-Yi Tsai
Degree: Master's
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2021
Academic year of graduation: 109
Language: Chinese
Number of pages: 56
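Chinese keywords (translated): convolutional neural network, long short-term memory model, face detection, dynamic face emotion, wireless speech command recognition, omnidirectional service robot, visual searching and tracking
English keywords: CNN, LSTM, human and face detection, dynamic face emotion, wireless speech command recognition, omnidirectional service robot, visual searching and tracking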
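  • The sequential recurrent convolution network (SRCN) model proposed in this thesis consists of two parts: a convolutional neural network (CNN) and a long short-term memory (LSTM) model. The CNN extracts expression-image features from dynamic face emotion, or frequency feature vectors from the Mel spectrograms of wireless speech commands. The input expression images or spectrograms are passed through the convolutional layers to obtain feature vectors, which are then fed in order into a sequence of LSTM models to classify dynamic facial expressions and wireless speech commands. In short, an SRCN-DFER model for dynamic face emotion recognition and an SRCN-WSCR model for wireless speech command recognition are proposed to handle human-robot collaboration (HRC) tasks. The proposed methods not only solve the dynamic face emotion and speech command recognition problems effectively, but also prevent overfitting while achieving an excellent recognition rate. Finally, the HRC experimental task comprises human and face detection, trajectory tracking control, dynamic face emotion and speech command recognition, and music playback, and the experiments verify the effectiveness, feasibility, and robustness of the proposed methods.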

    The proposed sequential recurrent convolution network (SRCN) comprises two parts: one convolutional neural network (CNN) and a sequence of long short-term memory (LSTM) models. The CNN produces the feature vector corresponding to a dynamic face emotion or a wireless speech command. Subsequently, a sequence of LSTM models with shared weights processes the sequence of feature vectors that a pre-trained CNN provides from a sequence of input sub-images of the face or spectrograms of the speech command. Simply put, one SRCN for dynamic face emotion recognition (SRCN-DFER) and another SRCN for wireless speech command recognition (SRCN-WSCR) are developed to deal with the human-robot collaboration (HRC) task. The proposed approaches not only effectively tackle the recognition of the dynamic mappings of face emotion and speech command but also prevent overfitting while achieving an excellent recognition rate. Finally, an HRC task including human and face detection, trajectory tracking control, face emotion and speech command recognition, and music playback is presented to validate the effectiveness, feasibility, and robustness of the proposed method.
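
    The data flow described in the abstract (a CNN turns each frame or spectrogram slice into a feature vector, and an LSTM with shared weights consumes those vectors in order before a final classification) can be illustrated with a minimal, standard-library-only sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the function names (`conv_features`, `lstm_step`, `srcn_forward`), the layer sizes, the three output classes, and the random, untrained weights that stand in for the pre-trained CNN and trained LSTM.

```python
import math
import random


def conv_features(frame, kernels):
    """Valid 2-D convolution + ReLU + global average pooling: one feature per kernel."""
    h, w = len(frame), len(frame[0])
    feats = []
    for k in kernels:
        kh, kw = len(k), len(k[0])
        total, count = 0.0, 0
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                s = sum(k[a][b] * frame[i + a][j + b]
                        for a in range(kh) for b in range(kw))
                total += max(0.0, s)  # ReLU
                count += 1
        feats.append(total / count)
    return feats


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W):
    """One LSTM step. W maps a gate name to (weight rows over [x, h], biases)."""
    z = x + h  # concatenate the input features and the previous hidden state

    def gate(name, act):
        rows, bias = W[name]
        return [act(sum(wv * zv for wv, zv in zip(row, z)) + b)
                for row, b in zip(rows, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def srcn_forward(frames, kernels, W, W_out):
    """CNN features per frame -> LSTM over the sequence -> class probabilities."""
    hidden = len(W["i"][1])
    h, c = [0.0] * hidden, [0.0] * hidden
    for frame in frames:
        x = conv_features(frame, kernels)
        h, c = lstm_step(x, h, c, W)  # the same W at every step: shared weights
    logits = [sum(wv * hv for wv, hv in zip(row, h)) for row in W_out]
    return softmax(logits)


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes and random weights, chosen only to make the sketch runnable.
rng = random.Random(0)
n_kernels, hidden, n_classes = 4, 3, 3
kernels = [rand_matrix(3, 3, rng) for _ in range(n_kernels)]
W = {name: (rand_matrix(hidden, n_kernels + hidden, rng), [0.0] * hidden)
     for name in "ifog"}
W_out = rand_matrix(n_classes, hidden, rng)
frames = [rand_matrix(6, 6, rng) for _ in range(5)]  # five 6x6 stand-in "frames"

probs = srcn_forward(frames, kernels, W, W_out)
print(probs)
```

    Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how one feature vector per input frame is passed through a single set of LSTM weights before the last hidden state is classified.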

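    Abstract (Chinese), p. i
    Abstract (English), p. ii
    Table of Contents, p. iii
    List of Figures, p. iv
    List of Tables, p. v
    Chapter 1 Introduction and Literature Review, p. 1
    1.1 Introduction, p. 1
    1.2 Literature Review, p. 2
    Chapter 2 System Construction and Task Statement, p. 4
    2.1 System Construction, p. 4
    2.2 Task Statement, p. 8
    Chapter 3 Dynamic Facial Expression Recognition, p. 12
    3.1 Convolutional Neural Network, p. 12
    3.2 Long Short-Term Memory (LSTM) Model, p. 15
    3.3 SRCN-DFER, p. 16
    3.4 Facial Expression Training, Testing, and Database, p. 17
    Chapter 4 Wireless Speech Command Recognition, p. 24
    4.1 Speech Command Preprocessing, p. 24
    4.2 SRCN-WSCR, p. 29
    Chapter 5 Experimental Results and Discussion, p. 32
    5.1 Video-Based Dynamic Expression Recognition, p. 32
    5.2 Speech Command Recognition Results, p. 35
    5.3 Human-Robot Collaboration Task, p. 38
    Chapter 6 Conclusions and Future Research, p. 43
    References, p. 44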

