
Graduate Student: Yi-Ting Wei (衛奕廷)
Thesis Title: Development of a Random Bin Picking System Based on Depth Cameras and Deep Learning (基於深度相機與深度學習的隨機拾取系統開發)
Advisor: Ching-An Lin (林清安)
Committee Members: 陳盈君, 何羽健
Degree: Master
Department: Department of Mechanical Engineering, College of Engineering
Year of Publication: 2024
Graduation Academic Year: 113 (ROC calendar)
Language: Chinese
Number of Pages: 127
Keywords: Deep learning, Depth camera, Random bin picking, Robotic arm, 3D CAD model

Abstract:

    Random bin picking technology integrates robotic arms and machine vision to identify, localize, and grasp stacked parts, a capability whose demand grows more urgent as labor shortages worsen. However, when part geometries are complex or the stacked scene is cluttered, traditional image-processing techniques struggle to recognize and localize parts accurately, limiting the efficiency gains of automation. To address this challenge, this thesis combines depth cameras with deep learning, using 2D image segmentation together with 3D depth data processing to build a random bin picking system that reliably recognizes and localizes parts in complex stacked scenes.
    The system's core feature is the use of a depth camera as the primary sensing device, which simultaneously captures both a color image and the corresponding depth information of the scene. Compared with structured light scanners, depth cameras are cheaper and faster, producing 3D depth point data for the parts within a short time. This alleviates the heavy computation and long processing times of working purely with point clouds, improving system efficiency by approximately 20%. Validated through practical case studies, the system achieves a success rate of 90% and continues to recognize part types and complete pick-and-place tasks even as the number of parts increases.
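    As a hedged illustration of this acquisition step (not code from the thesis), the sketch below grabs one aligned color/depth frame pair with Intel's pyrealsense2 package; the 640x480 resolution, 30 fps setting, and use of a RealSense-class camera are assumptions.

```python
import numpy as np
import pyrealsense2 as rs

# Configure color and depth streams (resolution/frame rate are illustrative).
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
profile = pipeline.start(config)

# Align the depth frame to the color frame so their pixels correspond 1:1.
align = rs.align(rs.stream.color)
try:
    frames = align.process(pipeline.wait_for_frames())
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()

    color = np.asanyarray(color_frame.get_data())        # H x W x 3, BGR
    depth_raw = np.asanyarray(depth_frame.get_data())    # H x W, uint16

    # Convert raw depth units to metres using the device's depth scale.
    depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
    depth_m = depth_raw.astype(np.float32) * depth_scale
finally:
    pipeline.stop()
```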
    Starting from 3D CAD models, this thesis uses a physics engine to simulate many randomly stacked part scenes and thereby automatically generates images of the stacked parts, comprising color images and the corresponding depth information; annotation data for the parts are produced with image-processing techniques to form the training dataset for the deep learning model. The model is then trained to accurately recognize part types and locations in complex stacked scenes. Based on the recognition results and the depth information, the system plans the grasping sequence, reconstructs the scene, and performs grasp analysis, and finally a robotic arm picks and places the parts.
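    A minimal sketch of this kind of synthetic-data generation, assuming the PyBullet engine named in the table of contents; the part URDF file, camera placement, and object counts are illustrative placeholders rather than the thesis's actual settings.

```python
import random
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                                   # headless simulation
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")

# Drop several copies of a part from random poses ("part.urdf" is a placeholder).
for _ in range(8):
    pos = [random.uniform(-0.05, 0.05), random.uniform(-0.05, 0.05), 0.3]
    orn = p.getQuaternionFromEuler([random.uniform(0.0, 3.14) for _ in range(3)])
    p.loadURDF("part.urdf", pos, orn)

for _ in range(500):                                  # let the pile settle
    p.stepSimulation()

# Render color, depth, and segmentation buffers from an overhead virtual camera.
view = p.computeViewMatrix(cameraEyePosition=[0, 0, 0.6],
                           cameraTargetPosition=[0, 0, 0],
                           cameraUpVector=[0, 1, 0])
proj = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.01, farVal=2.0)
width, height, rgb, depth, seg = p.getCameraImage(640, 640, view, proj)
# `seg` labels every pixel with the body id of the part it belongs to,
# which is what makes automatic per-part mask annotation possible.
p.disconnect()
```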
    Keywords: Deep learning, Depth camera, Random bin picking, Robotic arm, 3D CAD model
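
    The abstract (and Chapter 4 in the table of contents below) estimates each part's actual pose by matching a point cloud converted from its segmented depth data against a point cloud sampled from its 3D CAD model. As a hedged sketch of that idea, the snippet below refines a pose with Open3D's point-to-point ICP; the file names, voxel size, and identity initial guess are assumptions, and the thesis's own matching pipeline may differ (e.g., it may use a learned registration model).

```python
import numpy as np
import open3d as o3d

# Placeholder inputs: a scan built from one part's segmented depth pixels,
# and a reference cloud sampled from the part's 3D CAD model.
scan = o3d.io.read_point_cloud("part_scan.ply")
model = o3d.io.read_point_cloud("part_model.ply")

scan = scan.voxel_down_sample(voxel_size=0.002)
model = model.voxel_down_sample(voxel_size=0.002)

# Point-to-point ICP refines an initial guess (identity here) into the
# rigid transform that places the CAD model onto the measured scan.
result = o3d.pipelines.registration.registration_icp(
    model, scan, 0.01, np.eye(4),
    o3d.pipelines.registration.TransformationEstimationPointToPoint())

pose = result.transformation        # 4x4 homogeneous transform (the part's pose)
print("fitness:", result.fitness, "inlier RMSE:", result.inlier_rmse)
```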

Table of Contents:
Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
Table of Contents IV
List of Figures VIII
List of Tables XIII
Chapter 1  Introduction 1
  1.1  Research Motivation and Objectives 1
  1.2  Research Methods 2
  1.3  Literature Review 4
    1.3.1  Random Bin Picking Technology 4
    1.3.2  Applications of Machine Vision Combined with Deep Learning 12
    1.3.3  Research Issues and Solutions 21
  1.4  Thesis Organization 22
Chapter 2  Automated Generation of a Stacked-Part Dataset for Deep Learning Training 24
  2.1  Randomly Stacking Parts in a Virtual Environment 24
    2.1.1  The PyBullet Physics Engine 25
    2.1.2  Building the PyBullet Virtual Environment 27
    2.1.3  Data Preparation for Stacked Parts 28
    2.1.4  Randomly Stacking 3D CAD Parts 30
  2.2  Generating Image Data of Randomly Stacked Parts 32
    2.2.1  Obtaining the Transformation Matrix of Each Part 34
    2.2.2  Capturing Images of Randomly Stacked Parts in PyBullet 37
  2.3  Building the Deep Learning Dataset of Stacked Parts 40
    2.3.1  Creating Class Labels for Stacked Parts 41
    2.3.2  Obtaining Contour Point Sets of Stacked Parts 42
    2.3.3  Data Preprocessing for Deep Learning on Stacked Parts 46
Chapter 3  Deep Learning for Image Segmentation of Stacked Parts 49
  3.1  The Mask R-CNN Image Segmentation Model 49
  3.2  Training Process and Results of the Segmentation Model 52
  3.3  Validation Results of the Segmentation Model 58
Chapter 4  Grasp Analysis Using Part Poses from Point Cloud Registration 65
  4.1  Converting Part Depth Data into Point Clouds 68
  4.2  Obtaining Actual Part Poses via Point Cloud Registration 72
  4.3  Scene Reconstruction from Registered Part Poses 75
  4.4  Grasp Analysis of Parts 77
    4.4.1  Generating Grasp Information for Parts 77
    4.4.2  Determining the Picking Order of Parts 79
    4.4.3  Analyzing Gripper Path Interference 82
Chapter 5  Experimental Validation 86
  5.1  Software Development Tools 86
  5.2  Experimental Hardware 89
  5.3  System Workflow 93
  5.4  Case Studies 99
    5.4.1  Obtaining Grasp Point Sets for Parts 99
    5.4.2  Segmenting Stacked-Part Images with the Deep Learning Model 101
    5.4.3  Selecting Parts to Grasp Using Segmented Depth Information 102
    5.4.4  Point Cloud Registration with the Deep Learning Model 103
    5.4.5  Scene Reconstruction from the Registration Results 104
    5.4.6  Transforming Grasp Point Sets and Picking with the Robotic Arm 105
  5.5  Results and Discussion 111
    5.5.1  Factors Affecting Grasp Success Rate 112
    5.5.2  System Performance Comparison 114
Chapter 6  Conclusions and Future Research Directions 119
  6.1  Conclusions 119
  6.2  Future Research Directions 121
References 123

