| Student | 黃識軒 (Shih-Hsuan Huang) |
|---|---|
| Thesis Title | 以CAD模型建構點雲資料集以辨識三維物件之方法 (A Method of Establishing a Point Cloud Dataset Using CAD Models for 3D Object Classification) |
| Advisor | 李維楨 (Wei-Chen Lee) |
| Committee Members | 林宗翰 (Tzung-Han Lin), 劉益宏 (Yi-Hung Liu) |
| Degree | Master |
| Department | College of Engineering, Department of Mechanical Engineering |
| Year of Publication | 2021 |
| Academic Year | 109 |
| Language | Chinese |
| Pages | 70 |
| Keywords | Computer Vision, Image Processing, 3D Point Cloud, Deep Learning, Object Classification, 3DmFV-Net |
In recent years, deep learning on 2D color images has achieved satisfactory results in object detection, and large labeled public color-image datasets cover many everyday objects. However, in certain situations, such as when the foreground and background have similar colors, object detection and image segmentation perform poorly. This problem can be solved with 3D point-cloud data, but public 3D point-cloud datasets are far less common than their 2D counterparts, and when the target categories to be classified are not in a public dataset, manually capturing and labeling data costs considerable time and human resources. The purpose of this research is to quickly generate a training dataset from synthetic data to train a deep-learning model. The synthetic data are multi-view point clouds obtained by transforming 3D CAD models so that they approximate captured point clouds. Six items commonly placed inside a five-axis machine tool were used as classification targets. The experimental results show that after a 3D CAD model is converted to a point cloud, the mean distance between its points and their matched points in the captured data is 5.83 mm. This distance is smaller than the 8 mm voxel size used for voxel downsampling, and the 3DmFV patterns of the CAD point cloud and the captured point cloud are similar, indicating that the converted point clouds are sufficiently close to the captured ones. In addition, the mean distance correlates with the classification accuracy of the deep-learning model: the smaller the mean distance between the training and test data, the higher the classification accuracy on the test data. A 3DmFV-Net model trained on the multi-view dataset built from 3D CAD models achieved 91.67% accuracy in classifying point clouds actually captured inside the five-axis machine tool.
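The pipeline sketched in the abstract (CAD model → sampled point cloud → voxel downsampling → point-pair mean distance) can be illustrated in NumPy. This is a minimal sketch, not the thesis's implementation: the function names are my own, the mesh sampling is plain area-weighted uniform sampling rather than depth-camera simulation, and the nearest-neighbour search is brute force. The 8 mm voxel size matches the value reported in the abstract.

```python
import numpy as np

def sample_mesh(vertices, faces, n_points, rng):
    """Area-weighted uniform sampling of points on a triangle mesh,
    approximating the conversion of a CAD model to a point cloud."""
    tri = vertices[faces]                                   # (F, 3, 3)
    areas = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    idx = rng.choice(len(faces), size=n_points, p=areas / areas.sum())
    u, v = rng.random((2, n_points))
    flip = u + v > 1                                        # fold into the triangle
    u[flip], v[flip] = 1 - u[flip], 1 - v[flip]
    t = tri[idx]
    return (t[:, 0] + u[:, None] * (t[:, 1] - t[:, 0])
                    + v[:, None] * (t[:, 2] - t[:, 0]))

def voxel_downsample(points, voxel=8.0):
    """Replace all points in each occupied voxel by their centroid
    (the thesis uses an 8 mm voxel size)."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = inv.reshape(-1)
    counts = np.bincount(inv).astype(float)
    return np.stack([np.bincount(inv, weights=points[:, d]) / counts
                     for d in range(3)], axis=1)

def mean_pair_distance(a, b):
    """Mean nearest-neighbour distance from cloud a to cloud b (brute force);
    the abstract reports 5.83 mm between CAD and captured clouds."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).mean()
```

A smaller `mean_pair_distance` between the synthetic training clouds and the captured test clouds is the quantity the abstract links to higher classification accuracy.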
For the past few years, deep learning on 2D color images has achieved excellent results in object classification, and many kinds of common items are included in various public datasets. However, in some situations, such as when an object and its background have similar colors, it is difficult to detect the object using color images. This problem can be solved with 3D point-cloud data, but point-cloud datasets are far less numerous than color-image datasets. Moreover, when the items to be classified are not in an existing dataset, obtaining and labeling the data is time-consuming and expensive. The purpose of this research is to train a deep-learning model with synthetic data and test the trained model on real objects' point clouds captured by a depth camera. The training dataset consists of multi-view synthetic point clouds generated from 3D CAD models; the generation method simulates the conditions under which point clouds are captured. Six items commonly used in a machine tool were chosen for object classification. The results show that the mean distance between a CAD point cloud and the corresponding captured point cloud is 5.83 mm, smaller than the voxel size used for voxel downsampling, and that the 3DmFV patterns of the two point clouds look similar. This means the transformed CAD point clouds are close to the captured point clouds. The results also show that the mean distance between the CAD and captured point clouds influences the classification accuracy: the smaller the mean distance, the higher the accuracy. A 3DmFV-Net model trained on the multi-view dataset of 3D CAD models achieved an accuracy of 91.67% when classifying real objects' point clouds captured in the machine tool.
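The 3DmFV representation compared in the abstract places a uniform grid of Gaussians over the normalized point cloud and describes the cloud by Fisher-vector statistics of the points under that mixture. The sketch below is drastically simplified and uses hypothetical names (`gaussian_grid`, `soft_assignment_grid`): the real 3DmFV of Ben-Shabat et al. stacks normalized Fisher-vector derivatives with respect to the GMM parameters, aggregated by max, sum, and min, while this sketch keeps only the zeroth-order term (summed soft assignments) to show how a point cloud becomes a grid-shaped input suitable for a 3D CNN such as 3DmFV-Net.

```python
import numpy as np

def gaussian_grid(m=8):
    """Uniform isotropic GMM on an m x m x m grid over [-1, 1]^3;
    3DmFV uses equal weights and sigma = 1/m for each Gaussian."""
    c = (np.arange(m) + 0.5) / m * 2.0 - 1.0
    mu = np.stack(np.meshgrid(c, c, c, indexing="ij"), axis=-1).reshape(-1, 3)
    return mu, 1.0 / m

def soft_assignment_grid(points, m=8):
    """Simplified 3DmFV-style statistic: for each Gaussian, sum the posterior
    probabilities (soft assignments) of all points. Points are assumed to be
    normalized into the unit cube, as in the 3DmFV pipeline."""
    mu, sigma = gaussian_grid(m)
    d2 = ((points[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)  # (N, m^3)
    lik = np.exp(-0.5 * d2 / sigma ** 2)
    post = lik / lik.sum(axis=1, keepdims=True)   # soft assignment per point
    return post.sum(axis=0).reshape(m, m, m)      # coarse occupancy evidence
```

Because the output is a fixed m × m × m grid regardless of how many points the cloud contains, two clouds that are geometrically close (such as a transformed CAD cloud and a captured cloud) produce visually similar grids, which is the comparison the abstract describes.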