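Application of Deep Learning for Industrial Object Recognition and Manipulator Automatic Gripping｜National Taiwan University of Science and Technology Theses and Dissertations System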


Author:	Chien-Jung Yen (嚴健榮)
Thesis title:	Application of Deep Learning for Industrial Object Recognition and Manipulator Automatic Gripping (利用深度學習之工業零件辨識搭配機械手臂自動夾取)
Advisor:	Shiuh-Jer Huang (黃緒哲)
Oral defense committee:	Chen-yang Lan (藍振揚), Jui-Jen Chou (周瑞仁)
Degree:	Master
Department:	College of Engineering - Department of Mechanical Engineering
Year of publication:	2018
Academic year of graduation:	106 (ROC calendar)
Language:	Chinese
Number of pages:	96
Keywords (Chinese):	深度學習、卷積類神經網路、YOLOv2、工研院7A6型機械手臂
Keywords (English):	Deep learning, Convolutional neural network, YOLOv2, 7A6 series manipulator

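This research applies deep learning to industrial part recognition and integrates it with the 7A6 series manipulator from the Industrial Technology Research Institute (ITRI) to perform object-gripping tasks. A stereo vision recognition system is built on a desktop computer equipped with a graphics processing unit (GPU). A depth camera (Intel RealSense SR300) captures image information, from which the system identifies the class of each object and its three-dimensional coordinates. These are sent to the motion control system of the 7A6 manipulator so that the manipulator can grip specific industrial parts in the environment.
The vision system runs on the computer and its GPU and uses the depth camera SDK together with the OpenCV and TensorFlow libraries for image acquisition, depth computation, 3D coordinate conversion, image contour search, and convolutional neural network model training. For the convolutional neural network architecture, this work adopts the YOLOv2 method to determine the class of the target object and predict its center point, and uses a contour search method to find the object's angle; together these define the target point for the manipulator. The camera coordinates are converted to manipulator coordinates through a coordinate transformation with translation and are sent to the 7A6 manipulator motion control system over TCP/IP network communication. The manipulator then completes the gripping of the object.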
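The coordinate steps described above (pixel and depth to camera coordinates, then camera coordinates to manipulator coordinates) can be illustrated with a minimal sketch. It assumes an ideal pinhole camera model without lens distortion and a rigid transform given as a 3x3 rotation plus a translation; the function names, intrinsic values, and offsets are hypothetical and are not taken from the thesis or from the RealSense SDK.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def deproject_pixel(u: float, v: float, depth_m: float,
                    fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Map a pixel (u, v) with depth in metres to camera-frame XYZ.

    Assumes an ideal pinhole model (no lens distortion); fx, fy, cx, cy
    are placeholder intrinsics, not calibrated SR300 values.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def camera_to_robot(p_cam: Vec3,
                    rotation: Sequence[Sequence[float]],
                    translation: Vec3) -> Vec3:
    """Apply p_robot = R * p_cam + t with a 3x3 rotation R and offset t."""
    x, y, z = (
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (x, y, z)


# Hypothetical setup: camera axes aligned with the robot base frame,
# so only a translation is needed.
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

if __name__ == "__main__":
    p_cam = deproject_pixel(380, 240, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
    print(camera_to_robot(p_cam, IDENTITY, (0.30, -0.10, 0.20)))
```

With an identity rotation the transform reduces to a pure translation, which matches the "transformation with translation" wording of the abstract; a non-identity rotation covers a camera mounted at an angle to the robot base.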

In this research, a deep learning method is used in the vision system to recognize industrial objects and is integrated with a 7A6 series manipulator for an automatic gripping task. A PC with a graphics processing unit (GPU) is used to construct the 3D vision recognition system. A depth camera (Intel RealSense SR300) captures images for object recognition and coordinate derivation. The manipulator can then grasp a specific object in the environment based on the received image information.
The vision system consists of a depth camera, a computer, and deep learning and image processing software libraries. The Intel RealSense SR300 SDK is used together with the OpenCV and TensorFlow libraries to capture images, calculate depth information, perform 3D coordinate transformation, find contours, and train a model based on a convolutional neural network (CNN). The YOLOv2 scheme is used as the CNN structure for object classification and center point prediction. An image processing strategy is used to find the object contour and calculate the orientation angle. Then, the 3D coordinate transformation matrix between the image system and the robotic system is established to convert the coordinates. The manipulator receives the object coordinates and orientation angle through TCP/IP communication. Finally, the manipulator automatically grasps the object.
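As a rough illustration of how an orientation angle can be derived from an object contour, the sketch below estimates the principal axis of a set of contour points from their second-order central moments. This is a swapped-in, dependency-free technique chosen for illustration; the abstract does not state which formula the thesis applies to the contour, and the function name is hypothetical.

```python
import math
from typing import Iterable, Tuple


def principal_axis_angle_deg(points: Iterable[Tuple[float, float]]) -> float:
    """Return the principal-axis angle of 2D points, in degrees.

    Uses theta = 0.5 * atan2(2 * mu11, mu20 - mu02), where mu_pq are the
    second-order central moments of the point set. The result lies in
    (-90, 90]. In image coordinates the y axis points down, so the sign
    convention is flipped relative to a y-up plot.
    """
    pts = list(points)
    if len(pts) < 2:
        raise ValueError("at least two contour points are required")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in pts)
    mu02 = sum((y - mean_y) ** 2 for _, y in pts)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return math.degrees(0.5 * math.atan2(2.0 * mu11, mu20 - mu02))


if __name__ == "__main__":
    # Corners of an axis-aligned 4x1 rectangle and points on a diagonal line.
    print(principal_axis_angle_deg([(0, 0), (4, 0), (4, 1), (0, 1)]))
    print(principal_axis_angle_deg([(0, 0), (1, 1), (2, 2), (3, 3)]))
```

An angle of this kind, together with the object's 3D position, is the sort of payload the abstract describes sending to the manipulator controller over TCP/IP.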

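Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1    Introduction
1    Literature Review
2    Research Motivation and Objectives
3    Thesis Organization
Chapter 2    System Architecture
1    System Overview
2    Vision System
2.1    PRIME B250M-A Desktop Computer
2.2    Image Acquisition System
3    Manipulator Motion Control System
3.1    PC-Based Controller
3.2    7A6 Series Manipulator
Chapter 3    Convolutional Neural Network System
1    Convolutional and Pooling Layer Architecture and Operations
2    Neural Network Architecture
2.1    Neuron Computation Model
2.2    Activation Function
3    Training Behavior
3.1    Supervised and Unsupervised Learning
3.2    Loss Function
3.3    Gradient Descent
3.4    Optimized Gradient Descent Methods
3.5    Regularization
Chapter 4    Image Recognition System
1    YOLO and YOLOv2 Methods
1.1    Convolutional and Pooling Layer Architecture
1.2    Activation Function and Loss Function
1.3    Method for Predicting Object Coordinates and Classes
1.4    Non-Maximum Suppression
1.5    Optimization Methods
1.6    Fine-Tuning
2    Image Processing and Coordinate Calculation and Transformation
2.1    Method for Extracting Contours and Calculating the Rotation Angle
2.2    3D Coordinate Calculation
2.3    Coordinate Transformation
3    Image Data Acquisition and Model Training Architecture
3.1    Types of Training Data
3.2    Model Training Architecture
3.3    Introduction to the Manipulator Experiments
Chapter 5    Experimental Results and Discussion
1    Object Accuracy and Confidence Experiment
2    Camera Image Error Analysis
3    3D Coordinate Accuracy Evaluation for Stationary Objects
4    Overall Experimental Results
Chapter 6    Conclusions and Future Work
1    Conclusions
2    Future Work
References
Appendix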


                                

