Graduate student: Zhih-Shun Shih (施至舜)
Thesis title: Autonomous Surveillance for an Indoor Security Robot (室內保全機器人的自主監控)
Advisor: Min-Fan Lee (李敏凡)
Oral defense committee: Min-Fan Lee (李敏凡)
Ming-Jong Tsai
Tzu-Chen Tang
Degree: Master's
Department: College of Engineering - Graduate Institute of Automation and Control
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 82
Keywords (Chinese): 移動機器人, 人臉辨識, 人工智慧, 目標偵測, 同時定位與地圖構建
Keywords (English): Mobile robots, Face recognition, Artificial intelligence, Object detection, Simultaneous localization and mapping
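    Autonomous surveillance by a security robot is often severely limited by unstable factors such as illumination changes, viewpoint changes, occlusion, or changes in the target's appearance. This thesis proposes a deep-learning-based autonomous robotic system to perform visual perception and control tasks. Visual perception aims to identify all objects moving in the scene and uses a face recognition algorithm to determine whether a target is a specific person: if the target is allowed to enter the scene, the robot tracks the face and keeps the person centered in view; if the target is a stranger, an alert is sent to a mobile phone. The control system includes motion control and navigation. The robot is developed and operated in the ROS environment and uses RP-LiDAR with the Hector SLAM algorithm for path planning, navigation, and obstacle avoidance, while simultaneously performing localization and mapping. The empirical validation includes evaluation metrics for the perception algorithms, such as model accuracy, recall, the P-R curve, and the ROC curve; mapping accuracy is evaluated by the root mean square of the error. The experimental results show that VggNet achieves an average accuracy of 0.95 under 4 different illumination conditions, and that for mapping in a real scene the root mean square of the mapping error is 0.222 meters.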

    Conventional surveillance for a security robot suffers from severe limitations, such as perceptual aliasing (e.g., different places/objects can appear identical), occlusion (e.g., place/object appearance changes between visits), illumination changes, and significant viewpoint changes. This thesis proposes an autonomous robotic system based on a CNN (Convolutional Neural Network) to perform visual perception and control tasks. The visual perception system aims to identify all objects moving in the scene and to verify whether the target is an authorized person. It includes a motion detection module, a tracking module, and a face detection and recognition module. The control system includes motion control and navigation (path planning and obstacle avoidance). The empirical validation covers evaluation metrics such as model speed, accuracy, precision, recall, the ROC (Receiver Operating Characteristic) curve, the P-R (Precision-Recall) curve, and the F1-score for AlexNet, VggNet, and GoogLeNet, as well as the RMSE (Root Mean Square Error) of the mapping error. The experimental results show that the average accuracy of VggNet under 4 different illumination changes is 0.95, and that it performs best among the three CNN architectures under all unstable factors. For map building in a real scene, the RMSE of the mapping error is 0.222 meters.
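    The metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = authorized person, 0 = stranger) and paired lists of estimated and reference distances in meters. The function names `classification_metrics` and `rmse` and the sample values are hypothetical and are not taken from the thesis.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels (illustrative only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def rmse(estimated, reference):
    """Root mean square error between two equal-length sequences."""
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical sample data, not results from the thesis.
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
map_error = rmse([1.0, 2.0], [1.0, 4.0])
```

    Under this sketch, a mapping RMSE would be obtained by passing distances measured on the built map as `estimated` and ground-truth distances as `reference`; how the thesis selects those measurements is not stated in this record.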

    Acknowledgements
    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Methods
        2.1 Proposed Architecture
        2.2 Face Detection Algorithm
        2.3 CNN Architecture
        2.4 Two-wheeled Mobile Robot
        2.5 Navigation and Hector SLAM Algorithm
    Chapter 3 Experimental Results
        3.1 Face Detection and CNN Model Training
        3.2 Model Testing under unstable environment
        3.3 Model Testing under unstable target
        3.4 Hector SLAM and Navigation of ROS Turtlebot
        3.5 Summary and Discussion
    Chapter 4 Conclusion and Future Work
    References
    Appendix

