
Graduate Student: Shih-Hsuan Huang (黃識軒)
Thesis Title: A Method of Establishing a Point Cloud Dataset Using CAD Models for 3D Object Classification (以CAD模型建構點雲資料集以辨識三維物件之方法)
Advisor: Wei-Chen Lee (李維楨)
Committee Members: Tzung-Han Lin (林宗翰), Yi-Hung Liu (劉益宏)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of Publication: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: Chinese
Number of Pages: 70
Chinese Keywords: Machine Vision, Image Processing, 3D Point Cloud, Deep Learning, Object Recognition, 3DmFV-Net
English Keywords: Computer Vision, Image Processing, 3D Point Cloud, Deep Learning, Object Classification, 3DmFV-Net

Abstract (translated from the Chinese original):
In recent years, deep learning on 2D color images has achieved satisfactory results in object detection, and large annotated public color-image datasets cover many everyday objects. However, in certain situations, for example when the foreground and background have similar colors, object detection and image segmentation perform poorly. This problem can be solved with 3D point-cloud data, but public point-cloud datasets are far less common than their 2D counterparts, and when the target categories to be recognized are not in any public dataset, manually capturing and labeling data consumes considerable time and manpower. The purpose of this study is to use synthetic data to rapidly generate a training dataset for a deep learning model. The synthetic data are multi-view point clouds obtained by transforming 3D CAD models so that they approximate captured point clouds. Six items commonly placed inside a five-axis machine tool were used as the recognition targets. The experimental results show that after a 3D CAD model is converted into a point cloud, the mean distance between its points and their matched points in the captured data is 5.83 mm. This distance is smaller than the 8 mm voxel size used for voxel downsampling, and the 3DmFV patterns of the CAD and captured point clouds are similar, indicating that the converted point clouds are sufficiently close to the captured ones. In addition, the mean distance is related to the classification accuracy of the deep learning model: the smaller the mean distance between the training and test data, the higher the classification accuracy on the test data. A 3DmFV-Net model trained on the multi-view dataset built from the 3D CAD models achieved an accuracy of 91.67% when classifying point clouds actually captured inside the five-axis machine tool.
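The key step summarized above is turning a 3D CAD model into point clouds that resemble multi-view depth-camera captures, downsampled with 8 mm voxels. The thesis does not reproduce its code, so the following is only a minimal sketch of one way to do this, assuming the Open3D library (the thesis states only that Python was used); the file name, point count, number of views, and camera placement are all illustrative.

import numpy as np
import open3d as o3d

# Load a CAD model exported as a mesh (units assumed to be millimeters)
# and sample it into a dense point cloud.
mesh = o3d.io.read_triangle_mesh("part.stl")          # hypothetical file
pcd = mesh.sample_points_uniformly(number_of_points=50000)

# Place virtual cameras on a circle around the part to mimic
# multi-view depth-camera captures.
center = pcd.get_center()
diameter = np.linalg.norm(pcd.get_max_bound() - pcd.get_min_bound())
for i, angle in enumerate(np.linspace(0, 2 * np.pi, 8, endpoint=False)):
    cam = center + diameter * np.array([np.cos(angle), np.sin(angle), 0.5])
    # Keep only the points visible from this viewpoint, as a single
    # depth-camera shot would.
    _, visible_idx = pcd.hidden_point_removal(cam, radius=diameter * 100)
    view = pcd.select_by_index(visible_idx)
    # 8 mm voxel downsampling, the voxel size quoted in the abstract.
    view = view.voxel_down_sample(voxel_size=8.0)
    o3d.io.write_point_cloud("view_%d.pcd" % i, view)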


Abstract (English):
For the past few years, deep learning on 2D color images has achieved excellent results in object classification, and many kinds of common items are covered by publicly available datasets. However, in some situations, such as when an object and its background have similar colors, it is difficult to detect the object using color images. This problem can be solved with 3D point-cloud data, but point-cloud datasets are far less plentiful than color-image datasets. Moreover, when the items to be classified are not in the available datasets, obtaining and labeling the data is time-consuming and expensive. The purpose of this research is to train a deep learning model with synthetic data and test the trained model on point clouds of real objects captured by a depth camera. The training dataset consists of multi-view synthetic point clouds generated from 3D CAD models; the training data are generated by simulating the conditions under which point clouds are captured. Six items commonly used inside a machine tool serve as the classification targets. The results show that the mean distance between the CAD point cloud and the captured point cloud is 5.83 mm, which is smaller than the voxel size used for voxel downsampling, and that the 3DmFV patterns of the two point clouds look similar. This indicates that the transformed CAD point clouds closely approximate the captured point clouds. The results also show that the mean distance between the CAD point cloud and the captured point cloud influences the classification accuracy: the smaller the mean distance, the higher the accuracy. The 3DmFV-Net model is trained on the multi-view dataset built from the 3D CAD models; used to classify point clouds of real objects captured inside the machine tool, it achieves an accuracy of 91.67%.
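Sections 5.1.1 through 5.1.4 of the outline below obtain the 5.83 mm figure by aligning the two clouds, pairing nearest neighbors, discarding wrong pairs, and averaging. A minimal sketch of that metric follows, again assuming Open3D and clouds that have already been aligned (the thesis aligns them manually); the file names and the 20 mm rejection cutoff are assumed values, not the thesis's.

import numpy as np
import open3d as o3d

cad = o3d.io.read_point_cloud("cad_view.pcd")         # hypothetical files,
captured = o3d.io.read_point_cloud("captured.pcd")    # already aligned

# Nearest-neighbor pairing: distance from each captured point to its
# closest CAD point.
d = np.asarray(captured.compute_point_cloud_distance(cad))

# Discard obviously wrong pairs before averaging (the thesis also removes
# incorrect pairs; this cutoff is illustrative).
d = d[d < 20.0]
print("mean pair distance: %.2f mm" % d.mean())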

Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Motivation
  1.2 Literature Review
  1.3 Research Objectives
Chapter 2 Related Principles
  2.1 Depth-Image Difference Segmentation
  2.2 Morphology
  2.3 Conversion between Depth Images and Point Clouds
  2.4 Convolutional Neural Networks
  2.5 Deep Learning on Point Clouds
  2.6 Finite Element Method
  2.7 Delaunay Triangulation
  2.8 Connected-Component Labeling
Chapter 3 Experimental Procedure and Equipment
  3.1 Experimental Procedure and Framework
  3.2 Deep Learning Model and Data Configuration
  3.3 Experimental Equipment
  3.4 Software
    3.4.1 SolidWorks
    3.4.2 Python
    3.4.3 MATLAB
Chapter 4 Research Method
  4.1 Capturing Real-Object Point Clouds
    4.1.1 Capturing Depth Images
    4.1.2 Depth-Image Preprocessing
    4.1.3 Mask Extraction
    4.1.4 Generating Real-Object Point Clouds
    4.1.5 Point-Cloud Post-Processing
  4.2 Building the Multi-View Point-Cloud Training Dataset
    4.2.1 Drawing 3D CAD Models of the Items
    4.2.2 Converting 3D CAD Models into Point Clouds
    4.2.3 Multi-View Segmentation of CAD Point Clouds
    4.2.4 Point-Cloud Post-Processing
Chapter 5 Results and Discussion
  5.1 Evaluating the Similarity between Captured and CAD Point Clouds
    5.1.1 Manually Aligning the Point Clouds
    5.1.2 Nearest-Neighbor Pairing between the Two Clouds
    5.1.3 Removing Incorrectly Paired Points
    5.1.4 Computing the Mean Distance between the Two Clouds
  5.2 Evaluating the Deep Learning Model
  5.3 Relationship between Mean Distance and Classification Accuracy
  5.4 Analysis of Misclassifications
Chapter 6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
References
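Section 2.3 of the outline above covers converting between depth images and point clouds, and Section 4.1.4 applies that conversion to generate the real-object point clouds. A minimal NumPy sketch of the standard pinhole-camera back-projection is given below; the intrinsics fx, fy, cx, cy are placeholders, not the parameters of the depth camera used in the thesis.

import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    # Back-project a depth image (values in mm, 0 = no measurement)
    # into an N x 3 point cloud in the camera frame.
    v, u = np.indices(depth.shape)          # pixel row (v) and column (u)
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]               # drop pixels with no depth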

