
Graduate student: Chia-Hao Li (李家豪)
Thesis title: Deep Learning Applications in Object Recognition and Robotic Grasping (深度學習於目標物辨識與機械臂抓取物體之應用)
Advisor: Ching-Long Shih (施慶隆)
Committee members: Hsiu-Ming Wu (吳修明), Chih-Lyang Hwang (黃志良), Wen-Yo Lee (李文猶)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2020
Graduation academic year: 108 (ROC calendar)
Language: Chinese
Number of pages: 88
Keywords (Chinese): 相機校正與配準、機器學習、深度學習、YOLOv3、目標物辨識及抓取
Keywords (English): camera calibration and registration, machine learning, deep learning, YOLOv3, object identification and grasping
Abstract (Chinese):
    This thesis applies machine learning and deep learning to the recognition of superstore recyclables. A Realsense 3D camera is used: its color camera detects the target object, and its depth information is used to compute the relative distance between the object and the robot arm. Image feedback then controls the arm to perform visual alignment, object grasping, and placement. The main contributions are: (1) registering the Realsense color image with the depth image to obtain both color and 3D spatial information for the object in the image; (2) detecting and classifying objects with traditional image processing combined with machine learning; and (3) recognizing objects with the YOLOv3 deep neural network and a proposed improved version of YOLOv3. The recognition results of the three methods are then compared. In online recognition experiments on superstore recyclables, the proposed improved YOLOv3 achieves a precision of 98.77%, a recall of 94.18%, and an F1 score of 96.42%, at a running speed of 3.2 fps. Overall, the proposed improved YOLOv3 performs better than both the machine learning approach and the original YOLOv3.
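
    The reported F1 score is consistent with the quoted precision and recall, since F1 is their harmonic mean; the short Python check below is only an illustrative sketch (not code from the thesis) that reproduces the 96.42% figure from the two reported values.

        # Consistency check of the reported detection metrics (not thesis code):
        # F1 is the harmonic mean of precision and recall.
        precision = 0.9877   # reported precision of the improved YOLOv3
        recall = 0.9418      # reported recall

        f1 = 2 * precision * recall / (precision + recall)
        print(f"F1 = {f1:.4f}")   # prints F1 = 0.9642, matching the reported 96.42%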


Abstract (English):
    The objective of this study is to apply machine learning and deep learning to the recognition of superstore recyclables. The Realsense color camera is used to detect targets, and the depth camera is used to calculate the relative distance between the target and the manipulator. The manipulator is then commanded to perform visual alignment, object grasping, and placement. The specific topics of this thesis are: (1) obtaining color and 3D spatial information for objects in the image by registering the color frame with the depth frame; (2) detecting, identifying, and localizing objects using traditional image processing combined with machine learning; and (3) recognizing and localizing objects using YOLOv3 and a proposed improved version of YOLOv3. Finally, the detection and identification results of the three methods are compared. Experimental results on the superstore recyclables dataset show that the proposed improved YOLOv3 achieves 98.77% precision, 94.18% recall, and a 96.42% F1 score at a speed of 3.2 fps (frames per second). In conclusion, the improved version of YOLOv3 performs better than both the machine learning approach and the original YOLOv3.
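
    For context on item (1), the thesis registers the color and depth frames with its own intrinsic/extrinsic and planar projection methods (Sections 2.4.1 and 2.4.2). The snippet below is only a minimal sketch using the stock pyrealsense2 alignment helper and an assumed 640x480 stream configuration; it is not the calibration procedure described in the thesis, and the queried pixel is an assumption standing in for a detected object's centroid.

        import numpy as np
        import pyrealsense2 as rs

        # Illustrative sketch only: align the SR300 depth stream to the color stream
        # with librealsense's built-in helper, then read the depth (in meters) at an
        # assumed pixel of interest (e.g. a detected object's centroid).
        pipeline = rs.pipeline()
        config = rs.config()
        config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
        config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
        pipeline.start(config)

        align = rs.align(rs.stream.color)   # map depth pixels into the color frame

        try:
            frames = pipeline.wait_for_frames()
            aligned = align.process(frames)
            depth_frame = aligned.get_depth_frame()
            color_frame = aligned.get_color_frame()
            if depth_frame and color_frame:
                color = np.asanyarray(color_frame.get_data())
                u, v = color.shape[1] // 2, color.shape[0] // 2   # assumed target pixel
                distance_m = depth_frame.get_distance(u, v)       # range to the target
                print(f"Depth at pixel ({u}, {v}): {distance_m:.3f} m")
        finally:
            pipeline.stop()

    Knowing an object's pixel location in the color image together with the aligned depth value at that pixel is what allows the relative distance between the object and the manipulator to be computed, as described in the abstract.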

    Table of Contents
    Abstract (Chinese); Abstract (English); Acknowledgements; Table of Contents; List of Figures; List of Tables
    Chapter 1  Introduction
      1.1 Research Motivation
      1.2 Literature Review
      1.3 Thesis Outline
    Chapter 2  Realsense Color and Depth Image Registration
      2.1 Introduction to Realsense
        2.1.1 Realsense SR300 Hardware Specifications
        2.1.2 Realsense Software Development Environment
      2.2 Inpainting of Invalid Points in the Realsense Depth Image
      2.3 Camera Intrinsic and Extrinsic Parameters
      2.4 Color and Depth Image Registration
        2.4.1 Intrinsic/Extrinsic Parameter Transformation Method
        2.4.2 Planar Projection Transformation Method
    Chapter 3  Machine Learning for Multi-Object Recognition
      3.1 Target Region Detection
        3.1.1 Foreground and Background Separation
        3.1.2 Target Segmentation
      3.2 Bag-of-Words Model
      3.3 Feature Extraction
        3.3.1 Scale-Space Construction
        3.3.2 Extremum Detection and Precise Localization
        3.3.3 Edge-Point Removal
        3.3.4 Keypoint Orientation Assignment
        3.3.5 Keypoint Descriptor Generation
      3.4 Database Construction
        3.4.1 Building the Visual Vocabulary
        3.4.2 Building Image Histogram Vectors
      3.5 Support Vector Machine Classification
        3.5.1 Linearly Separable SVM
        3.5.2 Nonlinear SVM
    Chapter 4  YOLO for Multi-Object Recognition and Its Improvement
      4.1 Introduction to YOLOv3
        4.1.1 YOLOv3 Principle and Architecture
        4.1.2 Multi-Scale Feature Fusion
        4.1.3 Anchor Boxes
        4.1.4 Bounding Box, Class, and Confidence Prediction
        4.1.5 Loss Function
      4.2 YOLOv3 Improvements
        4.2.1 Darknet53 Feature Network Improvement
        4.2.2 Anchor Box Improvement
        4.2.3 Bounding Box Regression Improvement
      4.3 Model Training
        4.3.1 Dataset Preprocessing
        4.3.2 Training Process
        4.3.3 Model Evaluation Metrics
    Chapter 5  Experimental Results and Discussion
      5.1 Realsense Color and Depth Image Registration
      5.2 Object Recognition Experiments
      5.3 Object Alignment
      5.4 Robot Arm Grasping Experiments
    Chapter 6  Conclusions and Suggestions
      6.1 Conclusions
      6.2 Suggestions
    References


    Full-text release date: 2025/07/06 (campus network)
    Full-text release date: 2030/07/06 (off-campus network)
    Full-text release date: 2030/07/06 (National Central Library: Taiwan NDLTD system)