Author: | 李家豪 Chia-Hao Li |
---|---|
Thesis title: | 深度學習於目標物辨識與機械臂抓取物體之應用 Deep learning applications in object recognition and robotic grasping |
Advisor: | 施慶隆 Ching-Long Shih |
Committee members: | 吳修明 Hsiu-Ming Wu, 黃志良 Chih-Lyang Hwang, 李文猶 Wen-Yo Lee |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2020 |
Graduation academic year: | 108 (ROC calendar, i.e. the 2019-2020 academic year) |
Language: | Chinese |
Pages: | 88 |
Keywords: | camera calibration and registration, machine learning, deep learning, YOLOv3, object identification and grasping |
This thesis applies machine learning and deep learning to the recognition of recyclable items collected at convenience stores. The color camera of a RealSense 3D camera detects the target objects, and the depth camera supplies the information used to compute the relative distance between each target and the robot manipulator. Image feedback then controls the manipulator to perform visual alignment, grasping, and placement of the object. The work covers three topics: (1) registering the RealSense color image with the depth image to obtain both the color information and the 3D spatial information of each target in the image; (2) detecting, localizing, and classifying targets using traditional image processing combined with machine learning; and (3) recognizing and localizing targets using the YOLOv3 deep neural network and a proposed improved version of YOLOv3. Finally, the recognition results of these three methods are compared. In experiments on online recognition data for convenience-store recyclables, the proposed improved YOLOv3 achieves 98.77% precision, 94.18% recall, and a 96.42% F1 score at a speed of 3.2 fps (frames per second). Overall, the proposed improved YOLOv3 outperforms both the machine-learning method and the original YOLOv3.
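As a quick consistency check on the reported figures, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two values given in the abstract. The sketch below does only that arithmetic, plus the per-frame time implied by 3.2 fps; the function name `f1_score` is our own label and is not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the improved YOLOv3.
precision = 0.9877
recall = 0.9418
fps = 3.2

f1 = f1_score(precision, recall)
frame_time_ms = 1000 / fps  # average processing time per frame

print(f"F1 = {f1:.2%}")
print(f"time per frame = {frame_time_ms:.1f} ms")
```

The recomputed F1 rounds to the 96.42% stated in the abstract, and 3.2 fps corresponds to roughly 312.5 ms per frame.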
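To illustrate how a registered color and depth image pair can yield 3D spatial information, the sketch below back-projects one pixel to a 3D point with the standard pinhole camera model. It is not the thesis's implementation: it assumes the depth image is already aligned to the color image and that lens distortion is negligible, and the intrinsics `FX`, `FY`, `CX`, `CY` and the helper `deproject_pixel` are hypothetical values and names chosen for illustration.

```python
import math


def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to (x, y, z)
    in the camera frame, using an ideal pinhole model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 color stream (assumed values).
FX, FY = 600.0, 600.0   # focal lengths in pixels
CX, CY = 320.0, 240.0   # principal point in pixels

# Hypothetical detection: bounding-box centre and its aligned depth.
point = deproject_pixel(u=380, v=240, depth_m=0.5,
                        fx=FX, fy=FY, cx=CX, cy=CY)

# Straight-line distance from the camera origin to the target.
distance = math.sqrt(sum(c * c for c in point))

print(point)
print(f"{distance:.4f} m")
```

The resulting point is expressed relative to the camera. Obtaining the position relative to the manipulator would additionally require the camera-to-robot transform, which is not specified here.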