
Graduate Student: Ping-Hsuan Wu (吳品萱)
Thesis Title: Autonomous Grasping System for Household Objects with Robot Arm Based on 2D Image Processing (基於2D影像處理之家用物件之機械手臂全自主夾取系統)
Advisor: Chyi-Yeu Lin (林其禹)
Committee Members: Chyi-Yeu Lin (林其禹), Yuan-Chiu Lin (林遠球), Po-Ting Lin (林柏廷), Yu-Hsun Chen (陳羽薰)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of Publication: 2020
Graduation Academic Year: 108
Language: Chinese
Number of Pages: 104
Chinese Keywords: 全自主物件夾取、2D影像處理、深度學習、自然特徵、ArUco標記
English Keywords: Autonomous object grasping, 2D image processing, Deep learning, Natural feature, ArUco marker
Abstract:
    Although fully autonomous object grasping with robot arms is already well established in industry, most industrial tasks either manipulate a fixed object at a fixed position through prior teaching, or use 3D vision to perform random bin picking (RBP) of identically shaped objects piled arbitrarily in a bin. With the growing demand for service robots in recent years, the grasping targets have shifted to everyday household objects of varying shapes and sizes, many of which have a designated, unchangeable grasping position. Grasping can therefore no longer rely on teaching methods or plain RBP. Introducing a dedicated vision system gives the robot arm the ability to "see" and "aim", allowing it to adapt its grasping strategy to the geometric features of each object.
    Much prior research uses a depth camera to generate a 3D point cloud of the object and matches it against 3D object models in a dataset, but this approach is computationally expensive and time-consuming. This study instead applies 2D image processing to household objects to quickly determine a specified object and its pose, and then performs fully autonomous grasping with a robot arm.
    The system first applies transfer learning (deep learning) to locate the target object in the scene and outputs the pixel coordinates of the four corners of the object's bounding box. These pixel values are used to remove the background around the target object. Depending on the object's geometry, the system identifies feature points from either natural features or artificial markers (ArUco markers), and then estimates the camera pose with a Perspective-n-Point (PnP) algorithm. A transformation matrix is computed from the current estimated camera pose and a predefined template of the final feature-point pose, which moves the robot arm to the final position to grasp the target object.
    The study concludes with an afternoon-tea service demonstration using a six-axis robot arm, including grasping and placing cups and saucers, pouring tea, and placing pineapple cakes on a saucer. The experiments show that the 2D image processing techniques in this study can achieve fully autonomous grasping of household objects.
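    The marker-based branch of this pipeline can be illustrated with a short sketch using OpenCV's ArUco module and solvePnP. This is a minimal illustration under assumed values, not the thesis implementation: the intrinsic matrix, zero distortion, the DICT_4X4_50 dictionary, the 40 mm marker size, and the file name scene.png are all placeholders, and the pre-4.7 cv2.aruco function-style API is assumed.

    import cv2
    import numpy as np

    # Assumed camera intrinsics and (zero) distortion; real values would
    # come from a prior camera calibration.
    camera_matrix = np.array([[900.0,   0.0, 640.0],
                              [  0.0, 900.0, 360.0],
                              [  0.0,   0.0,   1.0]])
    dist_coeffs = np.zeros(5)

    # Assumed 40 mm square marker; its four corners in the marker's own
    # frame (z = 0 plane), ordered like cv2.aruco output: TL, TR, BR, BL.
    half = 0.04 / 2.0
    object_points = np.array([[-half,  half, 0.0],
                              [ half,  half, 0.0],
                              [ half, -half, 0.0],
                              [-half, -half, 0.0]], dtype=np.float32)

    # In the full pipeline the image would first be cropped to the bounding
    # box reported by the deep-learning detector to remove the background.
    image = cv2.imread("scene.png")
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(image, dictionary)

    if ids is not None:
        # Solve the Perspective-n-Point problem for the first detected
        # marker: recover its rotation (Rodrigues vector) and translation
        # relative to the camera.
        ok, rvec, tvec = cv2.solvePnP(object_points,
                                      corners[0].reshape(4, 2),
                                      camera_matrix, dist_coeffs)
        if ok:
            print("rotation vector:", rvec.ravel())
            print("translation (m):", tvec.ravel())

    Objects handled through natural features would follow the same PnP step, with the 2D-3D correspondences coming from feature matching rather than marker corners.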
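    Once the current camera pose is known, the move to the grasping pose follows from the relative transform between the current pose and the pre-recorded template pose. Below is a small sketch of that computation with homogeneous 4x4 matrices; the frame conventions (both poses taken as marker-in-camera transforms from solvePnP) and all numeric values are assumptions for illustration.

    import cv2
    import numpy as np

    def to_homogeneous(rvec, tvec):
        # Build a 4x4 marker-to-camera transform from solvePnP's Rodrigues
        # rotation vector and translation vector.
        R, _ = cv2.Rodrigues(np.asarray(rvec, dtype=float).reshape(3, 1))
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = np.ravel(tvec)
        return T

    # Placeholder poses: the marker as seen from the current camera
    # position, and as seen in the pre-recorded template (grasping) pose.
    T_cur = to_homogeneous([0.10, -0.20, 0.05], [0.05, 0.02, 0.40])
    T_tpl = to_homogeneous([0.00,  0.00, 0.00], [0.00, 0.00, 0.15])

    # Both views observe the same marker, so the pose of the template
    # camera frame expressed in the current camera frame is
    # T_cur @ inv(T_tpl). Commanding the arm with this relative transform
    # brings the camera, and the gripper rigidly attached to it, to the
    # final grasping pose.
    T_move = T_cur @ np.linalg.inv(T_tpl)
    print(np.round(T_move, 3))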

Table of Contents:
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1: Introduction
    Chapter 2: Theoretical Foundations
    Chapter 3: Fully Autonomous Robot-Arm Grasping System for Household Objects
    Chapter 4: Experimental Equipment and Setup
    Chapter 5: Experimental Results
    Chapter 6: Conclusions and Future Work
    References


    Full-Text Release Date: 2025/01/16 (campus network)
    Full-Text Release Date: 2025/01/16 (off-campus network)
    Full-Text Release Date: 2025/01/16 (National Central Library: Taiwan Thesis and Dissertation System)