
Graduate Student: 丁凱庭 (Kai-Ting Ting)
Thesis Title: 以3D點資料之深度學習搭配機械手臂進行自動化零件分類 (Manipulator-based Automatic Parts Classification through Deep Learning of 3D Data Points)
Advisor: 林清安 (Ching-An Lin)
Committee Members: 徐繼聖 (Gee-Sern Hsu), 謝文賓 (Win-Bin Shieh)
Degree: Master
Department: Department of Mechanical Engineering, College of Engineering
Publication Year: 2020
Graduation Academic Year: 109 (ROC calendar)
Language: Chinese
Number of Pages: 157
Chinese Keywords: 點雲、深度學習、3D影像辨識、機械手臂、隨機取放
English Keywords: Point cloud, Deep learning, 3D parts identification, Manipulator, Random-bin-picking
Record Statistics: 266 views; 0 downloads
This study investigates the feasibility of applying 3D deep learning to part identification and, in combination with a robotic manipulator, develops a random-bin-picking part sorting system that picks and places parts to achieve automated part classification. For part picking, a 3D structured light scanner captures a point cloud of the part pile, and down-sampling reduces the point-cloud density while preserving geometric features in order to speed up computation. A clustering method then segments the point cloud into individual parts, the part to pick is selected according to its center of gravity, and its suction point is computed.
The 3D deep learning model PointNet is applied to part identification. To evaluate the system's ability to discriminate parts of different geometries, two cases are considered: water-pipe fittings (3 types) and toy parts (6 types). Point clouds of the parts are captured with the 3D structured light scanner under different lighting conditions and random placements to build a diverse training dataset. With 240 point clouds prepared per part type and 200 epochs of training, the accuracy reaches 96.296% for the water-pipe parts and 96.759% for the toy parts. To examine how the amount of data affects accuracy and training time, the dataset is also reduced to 120 point clouds per part type; the results show that although training time decreases, accuracy drops as well.
Actual runs with the manipulator show that the average suction success rate for the different parts ranges from 82% to 92%. Applied to both part groups, the deep learning identification system achieves a recognition success rate above 97%, and the time required to identify a single part does not differ noticeably from that for multiple parts. Compared with a matching-based identification system, the deep learning system is capable of real-time judgment. When the parts are not piled, a single scan is sufficient to analyze and identify all parts; when the parts are piled, the scan must be repeated after every pick. Consequently, in the piled case most of the cycle time is spent on scanning, 3D point cloud processing, and identification.


This study evaluates the feasibility of applying deep learning to 3D point cloud data to identify parts of various shapes. A manipulator is also adopted to develop a random-bin-picking part sorting system. The 3D point cloud data were acquired with a 3D structured light scanner and down-sampled while preserving the geometric features of the target parts. A clustering method was then employed to segment the point cloud into individual parts, and the picking facet at which the suction nozzle grips each part was determined from the object's center of gravity.
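The thesis implements this preprocessing with the Point Cloud Library (PCL) in C++; purely as an illustration, the following Python sketch uses the open3d package as a stand-in to show the same sequence of voxel down-sampling, clustering, and centroid-based part selection. The file name, voxel size, and clustering thresholds are assumptions, not values from the thesis.

# Illustrative sketch of the point-cloud preprocessing pipeline (assumed
# parameters; the thesis itself uses PCL in C++ rather than Open3D).
import numpy as np
import open3d as o3d

# 1. Load the scanned point cloud of the part pile (file name is hypothetical).
pcd = o3d.io.read_point_cloud("scan.ply")

# 2. Down-sample with a voxel grid to reduce density while keeping geometry.
pcd_down = pcd.voxel_down_sample(voxel_size=2.0)  # units depend on the scanner

# 3. Segment the pile into clusters, ideally one cluster per part.
labels = np.array(pcd_down.cluster_dbscan(eps=4.0, min_points=30))
points = np.asarray(pcd_down.points)

# 4. Choose the cluster whose centroid sits highest, a simple stand-in for the
#    thesis's center-of-gravity-based selection rule.
best_label, best_centroid = None, None
for lab in range(labels.max() + 1):
    cluster = points[labels == lab]
    centroid = cluster.mean(axis=0)
    if best_centroid is None or centroid[2] > best_centroid[2]:
        best_label, best_centroid = lab, centroid

# 5. Use the cluster point closest to its centroid as a candidate suction point.
cluster = points[labels == best_label]
suction_point = cluster[np.argmin(np.linalg.norm(cluster - best_centroid, axis=1))]
print("candidate suction point:", suction_point)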
The PointNet deep learning model for 3D point clouds was applied to the identification of the various parts. To evaluate the effect of different geometric features on the accuracy of the deep learning system, datasets for a water-pipe case (3 types) and an education-toy case (6 types) were scanned with the 3D structured light scanner under various light sources and placements. After 200 epochs of training with 240 samples per part type, the accuracies of the water-pipe case and the education-toy case are 96.296% and 96.759%, respectively. The number of samples per part type was then lowered to 120 to examine the relationship between dataset size, accuracy, and training time. The results show that reducing the number of samples shortens the training time but also degrades the accuracy.
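The thesis trains PointNet in PyTorch, building on the pointnet.pytorch implementation [33]. The minimal sketch below shows a PointNet-style classifier (shared per-point MLP, symmetric max pooling, fully connected head) without the input/feature transform networks, together with a single training step on random data; the layer sizes, point count, batch size, and learning rate are illustrative assumptions rather than the thesis's settings.

# Minimal PointNet-style classifier sketch (no T-Net transforms); the data
# shapes and hyperparameters are assumptions, not the thesis's exact settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyPointNet(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        # Shared per-point MLP implemented with 1-D convolutions.
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        # Classification head applied to the global (max-pooled) feature.
        self.head = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):            # x: (batch, 3, num_points)
        x = self.mlp(x)              # (batch, 1024, num_points)
        x = torch.max(x, dim=2)[0]   # symmetric max pooling -> (batch, 1024)
        return self.head(x)          # class logits

# One illustrative training step; random tensors stand in for scanned clouds.
model = TinyPointNet(num_classes=6)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
points = torch.randn(8, 3, 1024)          # batch of 8 clouds, 1024 points each
labels = torch.randint(0, 6, (8,))
loss = F.cross_entropy(model(points), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()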
The average picking success rates of the water-pipe case and the education-toy case achieved by the manipulator range from 82% to 92%. The results indicate that the picking success rate depends on the size of the suction nozzle and the choice of picking facet. Regardless of the parts' geometry, the classification success rate of the deep learning system is higher than 97%, and the classification time remains nearly constant as the number of objects varies. In comparison with the traditional feature-matching method, deep learning on 3D points is capable of real-time identification. If the parts do not overlap, the system analyzes and identifies all parts from a single scan. In contrast, the identification time for piled parts increases considerably, because the scanning, 3D point processing, and identification steps must be repeated for each part from the top of the pile to the bottom.
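As a rough sketch of the two operating flows just described, the Python outline below contrasts the single-scan flow for non-piled parts with the rescan-after-each-pick flow for piled parts. The functions scan, classify, and pick_and_place are hypothetical stubs standing in for the scanner plus clustering, the PointNet classifier, and the manipulator motions; they are not part of the thesis's code.

# Sketch of the two sorting flows (single scan for non-piled parts versus
# rescanning after every pick for piled parts). All hardware and model calls
# are hypothetical stubs.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Part:
    points: list                 # segmented point cloud of one part
    centroid: Tuple[float, float, float]  # (x, y, z) center of gravity

def scan() -> List[Part]:
    """Stub: scan the bin, segment the cloud, return one Part per cluster."""
    return []  # a real system would run the scanner and clustering here

def classify(part: Part) -> int:
    """Stub: run PointNet inference and return the predicted class index."""
    return 0

def pick_and_place(part: Part, bin_id: int) -> None:
    """Stub: command the manipulator to suck the part and drop it in bin_id."""
    print(f"placed part at {part.centroid} into bin {bin_id}")

def sort_parts(piled: bool) -> None:
    if not piled:
        # Non-piled case: a single scan is enough to identify every part.
        for part in scan():
            pick_and_place(part, bin_id=classify(part))
    else:
        # Piled case: rescan after each pick, because removing the top part
        # exposes new geometry underneath.
        while parts := scan():
            top = max(parts, key=lambda p: p.centroid[2])  # pick highest first
            pick_and_place(top, bin_id=classify(top))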

Abstract (Chinese) I
Abstract III
Acknowledgments V
Table of Contents VI
List of Figures X
List of Tables XVII
Chapter 1 Introduction 1
1.1 Research Motivation and Objectives 1
1.2 Research Methods 3
1.3 Literature Survey 3
1.4 Thesis Organization 13
Chapter 2 Literature Review 15
2.1 History of Artificial Intelligence 15
2.2 Machine Learning 16
2.2.1 Supervised Learning 17
2.2.2 Unsupervised Learning 18
2.2.3 Reinforcement Learning 18
2.3 Deep Learning 19
2.3.1 Artificial Neural Networks 20
2.3.2 Convolutional Neural Networks 23
2.4 Deep Learning Models for 3D Images 27
2.4.1 The 3D ShapeNets Model 27
2.4.2 The VoxNet Model 29
2.4.3 The PointNet Model 30
Chapter 3 3D Point Cloud Data Processing 34
3.1 Coordinate Transformation between Manipulator and Scanner 34
3.2 Reducing Point Cloud Density by Down-Sampling 35
3.3 Searching Point Data with a K-D Tree 38
3.3.1 K-D Tree Construction 38
3.3.2 Searching for a Specific Point with a K-D Tree 42
3.3.3 Searching for the Neighbors of a Point with a K-D Tree 44
3.4 Selecting a Part Suitable for Suction 48
3.4.1 Segmentation by Clustering 48
3.4.2 Choosing the Part to Pick by Its Center of Gravity 53
3.5 Searching for a Suitable Suction Point 54
3.5.1 Planar Search 55
3.5.2 Random Search for the Suction Position 56
Chapter 4 Deep-Learning-Based 3D Image Recognition 59
4.1 Preparation of 3D Point Cloud Data 60
4.2 Training on 3D Point Cloud Data 67
4.3 Training Results 80
4.4 Effect of Dataset Size on Accuracy and Training Time 86
4.5 Part Identification with the Trained Model 88
Chapter 5 System Development and Case Verification 92
5.1 System Workflow 92
5.2 Experimental Equipment 94
5.2.1 3D Structured Light Scanner 94
5.2.2 EPSON Manipulator 95
5.2.3 Suction Nozzle 96
5.3 System Environment and Software Development Tools 97
5.3.1 System Environment 97
5.3.2 HP Pro S3/David SDKs 98
5.3.3 Point Cloud Library (PCL) 98
5.3.4 PyTorch 99
5.3.5 EPSON Robot API 99
5.4 Case Verification 99
5.4.1 Overview of Part Pile States 100
5.4.1.1 Piled Parts 101
5.4.1.2 Non-Piled Parts 105
5.4.2 Processing 3D Point Cloud Data 109
5.4.2.1 Coordinate Transformation between Manipulator and Scanner 109
5.4.2.2 Reducing Point Cloud Density 109
5.4.2.3 Selecting a Part Suitable for Suction 110
5.4.2.4 Searching for the Part's Suction Point 112
5.4.3 Identifying Part Types with 3D Deep Learning 114
5.4.4 Picking and Sorting Parts 116
5.4.4.1 Piled State of the Three Water-Pipe Parts 116
5.4.4.2 Non-Piled State of the Six Toy Parts 117
5.4.5 Discussion of Results 119
5.4.5.1 Factors Affecting the Suction Success Rate 119
5.4.5.2 Comparison of Identification between the Deep Learning System and the Matching System 123
5.4.5.3 Effect of Piling on the Deep Learning System 127
Chapter 6 Conclusions and Future Work 129
6.1 Conclusions 129
6.2 Future Research Directions 130
References 132

    [1] Rusu, R.B., Blodow, N. and Beetz, M. (2009), “Fast Point Feature Histograms (FPFH) for 3D registration,” IEEE International Conference on Robotics and Automation, May 12-17, 2009, Kobe, Japan, pp. 3212-3217.
    [2] Besl, P.J. and McKay, N.D. (1992), “A method for registration of 3-D shapes,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2, pp. 239-256.
    [3] 劉彥峰 (2018),「以機械手臂輔助零件隨機拾取與表面瑕疵檢測之系統開發與應用」(Development and application of a manipulator-assisted system for random part picking and surface defect inspection), Master's thesis, Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan.
    [4] Bogdan, R.R. and Cousins, S. (2011), “3D is here: Point Cloud Library (PCL),” IEEE International Conference on Robotics and Automation, May 9-13, 2011, Shanghai, China, pp. 1-4.
    [5] Bentley, J.L. (1975), “Multidimensional binary search trees used for associative searching,” Communications of the ACM, Vol. 18, No. 9, pp. 509-517.
    [6] Rusinkiewicz, S. and Levoy, M. (2001), “Efficient variants of the ICP algorithm,” Proceedings Third International Conference on 3-D Digital Imaging and Modeling, May 28-June 1, 2001, Quebec, Canada, pp. 145-152.
    [7] He, Y., Liang, B., Yang, J., Li, S. and He, J. (2017), “An iterative closest points algorithm for registration of 3D laser scanner point clouds with geometric features,” Sensors, Vol. 17, No. 8, Article 1862.
    [8] Fischler, M.A. and Bolles, R.C. (1981), “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, Vol. 24, No. 11, pp. 381-395.
    [9] Redmon, J., Divvala, S., Girshick, R. and Farhadi, A. (2016), “You Only Look Once: Unified, real-time object detection,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA, pp. 779-788.
    [10] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C-Y. and Berg, A.C. (2016), “SSD: Single Shot MultiBox Detector,” European conference on computer vision, Oct. 11-14, 2016, Amsterdam, The Netherlands, pp. 21-37.
    [11] Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X. and Xiao, J. (2015), “3D ShapeNets: A deep representation for volumetric shapes,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA, pp. 1912-1920.
    [12] Maturana, D. and Scherer, S. (2015), “VoxNet: A 3d convolutional neural network for real-time object recognition,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Sept. 28-Oct. 2, 2015, Hamburg, Germany, pp. 922-928.
    [13] Qi, C.R., Su, H., Mo, K. and Guibas, L.J. (2017), “PointNet: Deep learning on point sets for 3D classification and segmentation,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA, pp. 652-660.
    [14] Pulli, K., Abi-Rached, H., Duchamp, T., Shapiro, L.G. and Stuetzle, W. (1998), “Acquisition and visualization of colored 3D objects,” Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170), IEEE, Aug. 20, 1998, Brisbane, Queensland, Australia, Vol. 1, pp. 11-15.
    [15] Alexandrea, P. (2017), “3D scanning through structured light projection,” 3Dnatives, Retrieved from https://www.3dnatives.com/en/structured-light-projection-3d-scanning/
    [16] Liu, M-Y., Tuzel, O., Veeraraghavan, A., Taguchi, Y., Marks, T.K. and Chellappa, R. (2012), “Fast object localization and pose estimation in heavy clutter for robotic bin picking,” International Journal of Robotics Research, Vol. 31, No. 8, pp. 951-973.
    [17] Bellandi, P., Docchio, F. and Sansoni, G. (2013), “Roboscan: A combined 2D and 3D vision system for improved speed and flexibility in pick-and-place operation,” The International Journal of Advanced Manufacturing Technology, Vol. 69, No. 5-8, pp. 1873-1886.
    [18] Kumar, R., Lal, S., Kumar, S. and Chand, P. (2014), “Object detection and recognition for a pick and place robot,” IEEE Asia-Pacific World Congress on Computer Science and Engineering, pp. 1-7.
    [19] Zeng, A., Yu, K.T., Song, S., Suo, D., Walker, E., Rodriguez, A. and Xiao, J. (2017), “Multi-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge,” IEEE International Conference on Robotics and Automation (ICRA), May 29-June 3, 2017, IEEE, Singapore, Singapore, pp. 1383-1386.
    [20] Simonyan, K. and Zisserman, A. (2015), “Very deep convolutional networks for large-scale image recognition,” International Conference on Learning Representations, May 7-9, 2015, San Diego, CA, USA.
    [21] Russell, S.J. and Norvig, P. (1995), Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, NJ, USA, pp. 3-8.
    [22] 陳昇瑋 and 溫怡玲 (2019), 人工智慧在台灣 (Artificial Intelligence in Taiwan), 天下雜誌 (CommonWealth Magazine), Taipei, Taiwan.
    [23] LeCun, Y., Bengio, Y. and Hinton, G. (2015), “Deep learning,” Nature, Vol. 521, No. 7553, pp. 436-444.
    [24] Copeland, M. (2016), “What’s the difference between artificial intelligence, machine learning and deep learning?,” NVIDIA Blog, Retrieved from https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
    [25] McCulloch, W.S. and Pitts, W. (1943), “A logical calculus of the ideas immanent in nervous activity,” The bulletin of mathematical biophysics, Vol. 5, No. 4, pp. 115-133.
    [26] Hebb, D.O. (1949), The organization of behavior: A neuropsychological theory, John Wiley & Sons, Inc., New York, NY, USA.
    [27] Rosenblatt, F. (1958), “The perceptron: A probabilistic model for information storage and organization in the brain,” Psychological Review, Vol. 65, No. 6, pp. 386-408.
    [28] Krizhevsky, A., Sutskever, I. and Hinton, G. E. (2012), “Imagenet classification with deep convolutional neural networks,” Advances in neural information processing systems, pp. 1097-1105.
    [29] Ioffe, S. and Szegedy, C. (2015), “Batch normalization: accelerating deep network training by reducing internal covariate shift,” ICML'15: Proceedings of the 32nd International Conference on International Conference on Machine Learning, July 7-9, 2015, Lille, France, Vol. 37, pp. 448-456.
    [30] He, K., Zhang, X., Ren, S. and Sun, J. (2016), “Deep residual learning for image recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA, pp. 770-778.
    [31] Johnson, A. (1997), Spin-images: A representation for 3-D surface matching, Ph.D. dissertation, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA.
    [32] Chen, T., Dai, B., Liu, D. and Song, J. (2014), “Performance of global descriptors for velodyne-based urban object recognition,” 2014 IEEE Intelligent Vehicles Symposium Proceedings, June 8-11, 2014, Dearborn, MI, USA, pp. 667-673.
    [33] Xia, F., “PointNet.pytorch,” Retrieved from https://github.com/fxia22/pointnet.pytorch
    [34] HP 3D Structured Light Scanner Pro S3, Retrieved from https://www8.hp.com/us/en/campaign/3Dscanner/overview.html
    [35] EPSON Prosix S5-A701S (S5), Retrieved from https://neon.epson-europe.com/robots/products/product.php?id=10884&content=547
    [36] RGK Automation Co., Ltd, Retrieved from http://www.rgk-fa.com/
    [37] Point Cloud Library (PCL), Retrieved from https://pointclouds.org/

Full text release date: 2025/11/11 (campus network)
Full text release date: 2025/11/11 (off-campus network)
Full text release date: 2025/11/11 (National Central Library: Taiwan NDLTD system)