| Field | Value |
|---|---|
| Author | 衛奕廷 (Yi-Ting Wei) |
| Title | 基於深度相機與深度學習的隨機拾取系統開發 (Development of a Random Bin Picking System Based on Depth Cameras and Deep Learning) |
| Advisor | 林清安 (Ching-An Lin) |
| Committee members | 陳盈君, 何羽健 |
| Degree | Master |
| Department | Department of Mechanical Engineering, College of Engineering |
| Publication year | 2024 |
| Graduation academic year | 113 |
| Language | Chinese |
| Pages | 127 |
| Keywords | Deep learning, Depth camera, Random bin picking, Robotic arm, 3D CAD model |
Random bin picking combines a robotic arm with machine vision to recognize, localize, and pick stacked parts, and demand for the technology has become increasingly urgent as labor shortages worsen. However, when parts have complex geometries or the stacked scene is cluttered, traditional image-processing-based techniques struggle to recognize and localize the parts accurately, which limits gains in automation efficiency. To address this challenge, this thesis combines a depth camera with deep learning and, by pairing 2D image segmentation with 3D depth data processing, proposes a random bin picking system that effectively handles the recognition and localization of parts in complex stacked scenes.
A distinguishing feature of this work is the use of a depth camera as the primary sensing device, which captures the scene's color image and depth information simultaneously. Compared with a structured light scanner, a depth camera is cheaper and faster, producing 3D depth point data of the parts within a short time; this alleviates the heavy computation and long processing times of purely point-cloud-based approaches and raises system efficiency by roughly 20%. Validated on practical cases, the system reaches a success rate of 90% and can still recognize part types and complete picking even as the number of parts increases.
Starting from 3D CAD models, this thesis uses a physics engine to simulate randomly stacked scenes of various parts and automatically generate images of the stacked parts. These data cover color images and the corresponding depth information, and annotation data for the parts are produced with image processing techniques to serve as the training dataset for the deep learning model. The model is then trained so that it can accurately recognize part types and locations in complex stacked scenes. Based on the recognition results and the depth information, the system plans the picking order, reconstructs the scene, and analyzes the grasps; finally, a robotic arm picks and places the parts.
Keywords: Deep learning, Depth camera, Random bin picking, Robotic arm, 3D CAD model
Random bin picking technology integrates robotic arms and machine vision to identify, localize, and grasp stacked parts, addressing the growing need for automation amid labor shortages. Traditional image-processing methods struggle with complex part geometries and cluttered scenes, limiting efficiency. To overcome this, this thesis combines depth cameras with deep learning, proposing a random bin picking system that pairs 2D image segmentation with 3D depth data processing to improve part recognition and localization in challenging settings.
This system's core feature is the use of a depth camera as the primary scanning device, capable of simultaneously capturing both RGB images and depth information of the scene. Compared to structured light scanners, depth cameras offer lower cost and faster speed, enabling the generation of 3D depth point data within a short time. This approach mitigates the high computational cost and long processing times associated with pure point cloud processing, improving system efficiency by approximately 20%. Validated through practical case studies, the system achieves a success rate of 90%, maintaining its ability to identify part types and execute pick-and-place tasks even as the number of parts increases.
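As a rough illustration of how a single depth camera can deliver the aligned color and depth data the system relies on, the sketch below captures one RGB-D frame with Intel's pyrealsense2 SDK. The specific camera, stream resolutions, and the use of pyrealsense2 itself are assumptions for illustration; the abstract only states that a depth camera captures color and depth simultaneously.

```python
# Minimal sketch: capture one aligned RGB-D frame with an Intel RealSense-class camera.
# Assumption: a RealSense depth camera and the pyrealsense2 SDK; the abstract only
# says "depth camera", so treat this as illustrative, not the thesis's exact setup.
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)   # depth stream
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)  # color stream
profile = pipeline.start(config)

# Align depth pixels to the color image so every RGB pixel has a depth value.
align = rs.align(rs.stream.color)
try:
    frames = pipeline.wait_for_frames()
    aligned = align.process(frames)
    depth_frame = aligned.get_depth_frame()
    color_frame = aligned.get_color_frame()

    color = np.asanyarray(color_frame.get_data())    # H x W x 3, uint8
    depth = np.asanyarray(depth_frame.get_data())    # H x W, uint16 (device depth units)
    depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
    depth_m = depth.astype(np.float32) * depth_scale  # depth in meters
    print(color.shape, depth_m.shape)
finally:
    pipeline.stop()
```

The aligned depth image assigns a metric depth value to each color pixel, which is the per-pixel 3D information that 2D segmentation results can be projected onto.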
This thesis starts from 3D CAD models and uses a physics engine to simulate various randomly stacked scenes of parts, automatically generating images of the stacked parts. These data include color images and corresponding depth information, and annotation data for the parts are generated through image processing techniques to serve as the training dataset for the deep learning model. The model is then trained to accurately identify part types and positions in complex stacking scenarios. The recognition results and depth information are used to plan the picking sequence, reconstruct the scene, and analyze the grasps. Finally, a robotic arm picks and places the parts.
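To make the training-data generation step concrete, here is a minimal sketch of simulating a random pile of parts with a physics engine and rendering color, depth, and per-object segmentation images of the settled scene. PyBullet and the mesh file name `part.obj` are assumptions; the abstract names neither a specific engine nor a file format.

```python
# Minimal sketch: drop CAD-derived meshes into a pile with a physics engine and
# render color / depth / segmentation images as raw material for a training set.
# Assumptions: PyBullet as the physics engine and "part.obj" as a hypothetical mesh
# exported from the 3D CAD model; the abstract does not name a specific engine.
import random
import numpy as np
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                                   # headless physics + rendering
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                              # ground plane as the bin bottom

# Spawn several copies of the part mesh at random poses above the ground.
col = p.createCollisionShape(p.GEOM_MESH, fileName="part.obj")
vis = p.createVisualShape(p.GEOM_MESH, fileName="part.obj")
for _ in range(10):
    pos = [random.uniform(-0.05, 0.05), random.uniform(-0.05, 0.05), random.uniform(0.2, 0.5)]
    orn = p.getQuaternionFromEuler([random.uniform(0, 3.14) for _ in range(3)])
    p.createMultiBody(baseMass=0.1, baseCollisionShapeIndex=col,
                      baseVisualShapeIndex=vis, basePosition=pos, baseOrientation=orn)

for _ in range(500):                                  # let the parts fall and settle
    p.stepSimulation()

# Render the settled pile from a virtual top-down camera.
view = p.computeViewMatrix(cameraEyePosition=[0, 0, 0.6],
                           cameraTargetPosition=[0, 0, 0],
                           cameraUpVector=[0, 1, 0])
proj = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.01, farVal=2.0)
w, h, rgb, depth, seg = p.getCameraImage(640, 640, view, proj)

rgb = np.reshape(rgb, (h, w, 4))[:, :, :3]            # color image
depth = np.reshape(depth, (h, w))                     # normalized depth buffer
seg = np.reshape(seg, (h, w))                         # per-pixel body id
p.disconnect()
```

The segmentation buffer maps each pixel to a body id, so instance masks (and from them polygon-style annotations) can be derived automatically rather than labeled by hand, in the spirit of the automated annotation described above.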
Keywords: Deep learning, Depth camera, Random bin picking, Robotic arm, 3D CAD model