| Author: | 林妤璟 (Yu-Ching Lin) |
|---|---|
| Thesis title: | 仿效人類取物策略於機器手臂之深度圖像夾取姿態與位置辨識 (Emulating Human Grasp Strategies for Identifying Manipulator Grasp Pose and Position with RGB-D Images) |
| Advisor: | 郭重顯 (Chung-Hsien Kuo) |
| Committee members: | 黃漢邦 (Han-Pang Huang), 劉益宏 (Yi-Hung Liu), 劉孟昆 (Meng-Kun Liu) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 (ROC calendar) |
| Language: | Chinese |
| Pages: | 95 |
| Keywords (Chinese): | 深度神經網路, 夾取姿態, 三維點雲 |
| Keywords (English): | Deep neural network, grasping posture, three-dimensional point cloud |
| Views / downloads: | 262 / 0 |
This thesis proposes a system that emulates human grasping strategies to identify manipulator grasp pose and position from RGB-D images. The system comprises two parts: object classification and grasp pose generation. For object classification, an automatic sample-training system designed in this thesis applies object feature extraction and a synthesis algorithm to generate object samples and their annotation files autonomously. These samples are fed to a YOLOv3 neural network that classifies objects as cups, bottles, or mugs and returns the object's color image. For grasp pose generation, this color image is fed to a YOLOv3 neural network that identifies the grasp position on the object, and depth information is combined to reconstruct the object's three-dimensional point cloud. The system then matches a grasp orientation angle to the characteristics of the object category: for symmetric objects such as cups and bottles, the orientation angle is taken from human grasping experiments that recorded grasp angles at different positions; for asymmetric objects such as mugs, the normal vector of the handle serves as the orientation angle. A calibration procedure then solves the correspondence between the camera's point-cloud coordinates and world coordinates, yielding the object's grasp position in the world frame. Finally, the system generates the grasp orientation angle and the object's grasp point and displays them on the user interface.

To verify the accuracy of the grasp points output by the system, grasp-position localization experiments were conducted on six objects in three categories: mugs, handle-less cups, and bottles. At five different positions, the localization accuracy of each object's grasp point and the reconstructed diameters of the symmetric objects were analyzed. The results show that the system generates accurate grasp poses at every tested position.
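The abstract mentions combining depth information to reconstruct the object's three-dimensional point cloud, but the procedure itself is not reproduced here. As a minimal sketch only, the standard pinhole back-projection could look like the following; the intrinsics `fx`, `fy`, `cx`, `cy` and the millimetre depth scale are hypothetical values, not taken from the thesis:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project a depth image (H x W, raw sensor units) into an
    N x 3 point cloud in the camera frame via the pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel grids
    z = depth.astype(np.float64) * depth_scale       # raw units -> metres
    x = (u - cx) * z / fx                            # pinhole back-projection
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]                        # drop invalid (zero) pixels

# toy example: a 2 x 2 depth image with every pixel 1000 mm deep
depth = np.full((2, 2), 1000, dtype=np.uint16)
cloud = depth_to_point_cloud(depth, fx=600.0, fy=600.0, cx=1.0, cy=1.0)
print(cloud.shape)  # (4, 3): four valid points, all at z = 1.0 m
```

In practice the resulting cloud would be cropped to the detected bounding box before estimating the handle normal or object diameter.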
This study proposes a pose- and position-recognition system that emulates human grasping strategies to identify manipulator grasp pose and position from RGB-D images. The system consists of object classification and grasp pose generation. Object classification relies on an automatic sample-training system: an object feature-extraction and synthesis algorithm generates object samples and their label files automatically. These samples train a YOLOv3 neural network that classifies objects as cups, bottles, or mugs and returns the object's color image. For grasp pose generation, this color image is fed to a YOLOv3 neural network that identifies the object's grasp position, and depth information is combined to reconstruct the object's three-dimensional point cloud. The system matches a grasp orientation angle according to the characteristics of the object category: for symmetric objects such as cups and bottles, the orientation angle is derived from human grasping experiments that measured grasp angles at various positions; for asymmetric objects such as mugs, the normal vector of the handle is used as the orientation angle. A calibration system solves the correspondence between point-cloud coordinates and world coordinates to obtain the object's grasp position in the world frame. Finally, the system generates the grasp orientation angle and object grasp point and displays them on the user interface.

To validate the precision of the system output, this study performed grasp-position localization experiments on six objects in three categories: mugs, cups, and bottles. The precision of the object grasp points and the reconstructed diameters of the symmetric objects were analyzed. The results show that the system generates precise grasp poses at different positions.
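The calibration step that relates camera point-cloud coordinates to world coordinates is only summarized in the abstract (the bibliography points to Perspective-n-Point [18]). As an illustration under stated assumptions, not the thesis's actual procedure, one common way to estimate such a correspondence when matched 3-D points are available in both frames is a rigid Kabsch/SVD fit; all point values below are synthetic:

```python
import numpy as np

def camera_to_world_transform(cam_pts, world_pts):
    """Estimate a rigid transform (R, t) such that world ~= R @ cam + t,
    from paired 3-D points, using the Kabsch (SVD) method."""
    cam = np.asarray(cam_pts, dtype=float)
    wld = np.asarray(world_pts, dtype=float)
    cc, wc = cam.mean(axis=0), wld.mean(axis=0)      # centroids
    H = (cam - cc).T @ (wld - wc)                    # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                         # repair a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = wc - R @ cc
    return R, t

# synthetic check: a 90-degree rotation about z plus a translation
Rz = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
t_true = np.array([0.1, 0.2, 0.3])
cam = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
world = cam @ Rz.T + t_true                          # simulated world points

R, t = camera_to_world_transform(cam, world)
grasp_cam = np.array([0.5, 0.5, 0.5])                # a grasp point in camera frame
grasp_world = R @ grasp_cam + t                      # -> [-0.4, 0.7, 0.8]
```

With the transform recovered, any grasp point detected in the point cloud can be mapped into world coordinates for the manipulator.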
[1] P. Huang, C. Shen and H. Hsiao, “RGBD salient object detection using spatially coherent deep learning framework,” 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), pp. 1-5, 2018.
[2] X. Yin, Y. Sasaki, W. Wang and K. Shimizu, “3D Object Detection Method Based on YOLO and K-Means for Image and Point Clouds,” arXiv preprint arXiv:2005.02132, 2020.
[3] X. Chen, K. Kundu, Y. Zhu, H. Ma, S. Fidler and R. Urtasun, “3D object proposals using stereo imagery for accurate object class detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 5, pp. 1259-1272, 2017.
[4] X. Liu, B. Dai and H. He, “Real-time object segmentation for visual object detection in dynamic scenes,” 2011 International Conference of Soft Computing and Pattern Recognition (SoCPaR), pp. 423-428, 2011.
[5] H. Law, Y. Teng, O. Russakovsky and J. Deng, “CornerNet-Lite: Efficient keypoint based object detection,” arXiv preprint arXiv:1904.08900, 2019.
[6] S. Mane and S. Mangale, “Moving Object Detection and Tracking Using Convolutional Neural Networks,” 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS), pp. 1809-1813, 2018.
[7] J.U. Kim and Y.M. Ro, “Attentive Layer Separation for Object Classification and Object Localization in Object Detection,” 2019 IEEE International Conference on Image Processing (ICIP), pp. 3995-3999, 2019.
[8] C. Wang, D. Xu, Y. Zhu, R. Martín-Martín, C. Lu, F.F. Li and S. Savarese, “Densefusion: 6d object pose estimation by iterative dense fusion,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3338-3347, 2019.
[9] R.Q. Charles, H. Su, M. Kaichun and L.J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 77-85, 2017.
[10] J. Redmon and A. Angelova, “Real-time grasp detection using convolutional neural networks,” 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 1316-1322, 2015.
[11] A. Mousavian, C. Eppner and D. Fox, “6-Dof GraspNet: Variational grasp generation for object manipulation,” Proceedings of the IEEE International Conference on Computer Vision, pp. 2901-2910, 2019.
[12] J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. A. Ojea and K. Goldberg, “Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics,” arXiv preprint arXiv:1703.09312, 2017.
[13] J. Tremblay, T. To, B. Sundaralingam, Y. Xiang, D. Fox and S. Birchfield, “Deep object pose estimation for semantic robotic grasping of household objects,” arXiv preprint arXiv:1809.10790, 2018.
[14] Y. Huang, S. Huang, H. Chen, Y. Chen, C. Liu and T.S. Li, “A 3D vision-based object grasping posture learning system for home service robots,” 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2690-2695, 2017.
[15] Y. Lin, L. Zeng, Z. Dong and X. Fu, “A Vision-Guided Robotic Grasping Method for Stacking Scenes Based on Deep Learning,” 2019 IEEE 3rd Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), pp. 91-96, 2019.
[16] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, “You only look once: Unified, real-time object detection,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, 2016.
[17] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
[18] Wikipedia, “Perspective-n-Point.” Retrieved from https://en.wikipedia.org/wiki/Perspective-n-Point (May 1, 2020).