
Graduate student: Chia-Hao Li (李家豪)
Thesis title: Deep Learning Applications in Object Recognition and Robotic Grasping (深度學習於目標物辨識與機械臂抓取物體之應用)
Advisor: Ching-Long Shih (施慶隆)
Committee members: Hsiu-Ming Wu (吳修明), Chih-Lyang Hwang (黃志良), Wen-Yo Lee (李文猶)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2020
Graduation academic year: 108 (ROC calendar)
Language: Chinese
Number of pages: 88
Keywords (Chinese): 相機校正與配準、機器學習、深度學習、YOLOv3、目標物辨識及抓取
Keywords (English): camera calibration and registration, machine learning, deep learning, YOLOv3, object identification and grasping
Abstract (Chinese):
    This thesis applies machine learning and deep learning to the recognition of superstore recyclables. A Realsense 3D camera is used: its color camera detects the target object, and its depth information is used to compute the relative distance between the object and the robot arm. Image feedback then controls the arm to perform visual alignment, object grasping, and placement. The main contributions are: (1) registering the Realsense color image with the depth image to obtain both color and 3D spatial information for the object in the image; (2) detecting and classifying objects with traditional image processing combined with machine learning; and (3) recognizing objects with the YOLOv3 deep neural network and a proposed improved version of YOLOv3. The recognition results of the three methods are then compared. In online recognition experiments on superstore recyclables, the proposed improved YOLOv3 achieves a precision of 98.77%, a recall of 94.18%, and an F1 score of 96.42%, at a running speed of 3.2 fps. Overall, the proposed improved YOLOv3 performs better than both the machine learning approach and the original YOLOv3.
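
    The reported F1 score is consistent with the quoted precision and recall, since F1 is their harmonic mean; the short Python check below is only an illustrative sketch (not code from the thesis) that reproduces the 96.42% figure from the two reported values.

        # Consistency check of the reported detection metrics (not thesis code):
        # F1 is the harmonic mean of precision and recall.
        precision = 0.9877   # reported precision of the improved YOLOv3
        recall = 0.9418      # reported recall

        f1 = 2 * precision * recall / (precision + recall)
        print(f"F1 = {f1:.4f}")   # prints F1 = 0.9642, matching the reported 96.42%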


Abstract (English):
    The objective of this study is to apply machine learning and deep learning to the recognition of superstore recyclables. The Realsense color camera is used to detect targets, and the depth camera is used to calculate the relative distance between the target and the manipulator. The manipulator is then commanded to perform visual alignment, object grasping, and placement. The specific topics of this thesis are: (1) obtaining color and 3D spatial information for objects in the image by registering the color frame with the depth frame; (2) detecting, identifying, and localizing objects using traditional image processing combined with machine learning; and (3) recognizing and localizing objects using YOLOv3 and a proposed improved version of YOLOv3. Finally, the detection and identification results of the three methods are compared. Experimental results on the superstore recyclables dataset show that the proposed improved YOLOv3 achieves 98.77% precision, 94.18% recall, and a 96.42% F1 score at a speed of 3.2 fps (frames per second). In conclusion, the improved version of YOLOv3 performs better than both the machine learning approach and the original YOLOv3.
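
    For context on item (1), the thesis registers the color and depth frames with its own intrinsic/extrinsic and planar projection methods (Sections 2.4.1 and 2.4.2). The snippet below is only a minimal sketch using the stock pyrealsense2 alignment helper and an assumed 640x480 stream configuration; it is not the calibration procedure described in the thesis, and the queried pixel is an assumption standing in for a detected object's centroid.

        import numpy as np
        import pyrealsense2 as rs

        # Illustrative sketch only: align the SR300 depth stream to the color stream
        # with librealsense's built-in helper, then read the depth (in meters) at an
        # assumed pixel of interest (e.g. a detected object's centroid).
        pipeline = rs.pipeline()
        config = rs.config()
        config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
        config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
        pipeline.start(config)

        align = rs.align(rs.stream.color)   # map depth pixels into the color frame

        try:
            frames = pipeline.wait_for_frames()
            aligned = align.process(frames)
            depth_frame = aligned.get_depth_frame()
            color_frame = aligned.get_color_frame()
            if depth_frame and color_frame:
                color = np.asanyarray(color_frame.get_data())
                u, v = color.shape[1] // 2, color.shape[0] // 2   # assumed target pixel
                distance_m = depth_frame.get_distance(u, v)       # range to the target
                print(f"Depth at pixel ({u}, {v}): {distance_m:.3f} m")
        finally:
            pipeline.stop()

    Knowing an object's pixel location in the color image together with the aligned depth value at that pixel is what allows the relative distance between the object and the manipulator to be computed, as described in the abstract.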

    Table of Contents
    Abstract (Chinese); Abstract (English); Acknowledgements; Table of Contents; List of Figures; List of Tables
    Chapter 1  Introduction
      1.1 Research Motivation
      1.2 Literature Review
      1.3 Thesis Outline
    Chapter 2  Realsense Color and Depth Image Registration
      2.1 Introduction to Realsense
        2.1.1 Realsense SR300 Hardware Specifications
        2.1.2 Realsense Software Development Environment
      2.2 Inpainting of Invalid Points in the Realsense Depth Image
      2.3 Camera Intrinsic and Extrinsic Parameters
      2.4 Color and Depth Image Registration
        2.4.1 Intrinsic/Extrinsic Parameter Transformation Method
        2.4.2 Planar Projection Transformation Method
    Chapter 3  Machine Learning for Multi-Object Recognition
      3.1 Target Region Detection
        3.1.1 Foreground and Background Separation
        3.1.2 Target Segmentation
      3.2 Bag-of-Words Model
      3.3 Feature Extraction
        3.3.1 Scale-Space Construction
        3.3.2 Extremum Detection and Precise Localization
        3.3.3 Edge-Point Removal
        3.3.4 Keypoint Orientation Assignment
        3.3.5 Keypoint Descriptor Generation
      3.4 Database Construction
        3.4.1 Building the Visual Vocabulary
        3.4.2 Building Image Histogram Vectors
      3.5 Support Vector Machine Classification
        3.5.1 Linearly Separable SVM
        3.5.2 Nonlinear SVM
    Chapter 4  YOLO for Multi-Object Recognition and Its Improvement
      4.1 Introduction to YOLOv3
        4.1.1 YOLOv3 Principle and Architecture
        4.1.2 Multi-Scale Feature Fusion
        4.1.3 Anchor Boxes
        4.1.4 Bounding Box, Class, and Confidence Prediction
        4.1.5 Loss Function
      4.2 YOLOv3 Improvements
        4.2.1 Darknet53 Feature Network Improvement
        4.2.2 Anchor Box Improvement
        4.2.3 Bounding Box Regression Improvement
      4.3 Model Training
        4.3.1 Dataset Preprocessing
        4.3.2 Training Process
        4.3.3 Model Evaluation Metrics
    Chapter 5  Experimental Results and Discussion
      5.1 Realsense Color and Depth Image Registration
      5.2 Object Recognition Experiments
      5.3 Object Alignment
      5.4 Robot Arm Grasping Experiments
    Chapter 6  Conclusions and Suggestions
      6.1 Conclusions
      6.2 Suggestions
    References


    Full-text release date: 2025/07/06 (campus network)
    Full-text release date: 2030/07/06 (off-campus network)
    Full-text release date: 2030/07/06 (National Central Library: Taiwan NDLTD system)