Student: Pin-Kuang Chen (陳品匡)
Title: Using a depth camera to synthesize object point clouds and generate object detection datasets (以深度攝影機影像合成物品點雲模型並生成物件辨識資料集)
Advisor: Wei-Chen Lee (李維楨)
Committee Members: Tzung-Han Lin (林宗翰), Gee-Sern Hsu (徐繼聖)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of Publication: 2020
Graduation Academic Year: 108 (2019-2020)
Language: Chinese
Pages: 74
Keywords: Computer Vision, Image Processing, Image Synthesis, 3D Point Cloud, ICP, Deep Learning, Object Detection, Faster R-CNN

In recent years, developing object detection models with deep learning has become a mature and popular technology, and large annotated public image datasets cover many objects found in daily life. However, when the target we want to detect cannot be found in public datasets, manually photographing and annotating the target in images is a step that consumes considerable time and resources before the detection model can be trained.
The objective of this research is to augment a small set of real training images with synthetic images and to annotate the synthetic images automatically. A depth camera combined with a rotating platform captures multi-view RGB point clouds of an object; color-supported ICP registers and merges the multi-view point clouds into a single object point cloud, which is then projected onto the image plane in poses at many different angles to produce a diverse set of synthetic object images for the training set. Because the target's position in the scene is known at the moment the synthetic image is composited with the scene, annotation is fully automatic, eliminating the labor cost of manual labeling.
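
As an illustration of the registration step, below is a minimal sketch built on Open3D's colored ICP (registration_colored_icp, after Park et al.), used here as a stand-in for the color-supported ICP variant described above. The file names, voxel size, and the assumption of twelve turntable stops are illustrative, not the author's actual settings.

import copy
import numpy as np
import open3d as o3d

def preprocess(pcd, voxel):
    # Downsample for speed; colored ICP also needs normals on the clouds.
    down = pcd.voxel_down_sample(voxel)
    down.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2.0, max_nn=30))
    return down

def pairwise_colored_icp(source, target, voxel):
    # Jointly minimize geometric and photometric error between two views.
    result = o3d.pipelines.registration.registration_colored_icp(
        source, target, voxel * 1.5, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationForColoredICP(),
        o3d.pipelines.registration.ICPConvergenceCriteria(
            relative_fitness=1e-6, relative_rmse=1e-6, max_iteration=50))
    return result.transformation  # 4x4 transform mapping source into target

# Hypothetical input: one colored point cloud saved per turntable stop.
views = [preprocess(o3d.io.read_point_cloud(f"view_{i:02d}.ply"), voxel=0.005)
         for i in range(12)]

# Chain the pairwise transforms so every view lands in the first view's frame.
merged = views[0]
pose = np.eye(4)
for src, tgt in zip(views[1:], views[:-1]):
    pose = pose @ pairwise_colored_icp(src, tgt, voxel=0.005)
    aligned = copy.deepcopy(src)
    aligned.transform(pose)
    merged += aligned

The compositing and automatic labeling can be shown just as briefly: once the object's mask is known, pasting the object into a background scene yields the bounding-box label for free. A NumPy sketch under assumed array shapes, not the thesis's exact code:

import numpy as np

def paste_with_bbox(background, obj_rgb, obj_mask, top, left):
    # background: HxWx3 uint8; obj_rgb: hxwx3 uint8; obj_mask: hxw bool.
    out = background.copy()
    h, w = obj_mask.shape
    region = out[top:top + h, left:left + w]
    region[obj_mask] = obj_rgb[obj_mask]   # overwrite foreground pixels only
    ys, xs = np.nonzero(obj_mask)          # tight bounding box from the mask
    bbox = (left + xs.min(), top + ys.min(), left + xs.max(), top + ys.max())
    return out, bbox                       # image plus (x1, y1, x2, y2) label
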
This research used four kinds of commodity products as detection targets, all placed in the same recognition scene. The experimental results show that mixing 300 synthetic images with 100 real images yields a mean mAP of 0.47 over IoU thresholds [0.5:0.1:0.9], whereas 400 real images yield a mean mAP of 0.29 under the same IoU conditions; adding synthetic images with multiple poses therefore does improve the average precision of the detection model. However, when we keep a base of 100 real images and continually add synthetic images to the training set, the mean mAP over the same IoU thresholds drops to 0.1529 once 700 synthetic images are added, worse than the model trained on 400 real images. Lower-quality images can still be effective for improving model precision, but when a large amount of unrealistic synthetic data enters the dataset, the model over-learns incorrect information, which ultimately hurts detection performance.
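
For reference, [0.5:0.1:0.9] denotes averaging the AP over the five IoU thresholds 0.5, 0.6, 0.7, 0.8, and 0.9. The sketch below spells out the IoU test itself; ap_at is a hypothetical stand-in for a full average-precision implementation such as the COCO evaluator.

import numpy as np

def box_iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2); returns intersection over union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def mean_ap(detections, ground_truth, ap_at):
    # Average the per-threshold AP over the thresholds used in the experiments.
    thresholds = [0.5, 0.6, 0.7, 0.8, 0.9]
    return float(np.mean([ap_at(detections, ground_truth, t) for t in thresholds]))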



Table of Contents:
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Motivation
  1.2 Literature Review
  1.3 Research Objectives
Chapter 2 Background Principles
  2.1 Background Segmentation of Color Images
    2.1.1 Single-Color and Difference-Based Background Removal
    2.1.2 Otsu's Thresholding Method
  2.2 Morphological Operations
  2.3 The L*a*b* Color Space
  2.4 Depth Images and Point Cloud Conversion
  2.5 Standard ICP (Iterative Closest Point)
  2.6 Convolutional Neural Networks (CNN)
Chapter 3 Multi-View Point Cloud Registration
  3.1 Experimental Procedure and Architecture
  3.2 Experimental Equipment
  3.3 Transformation Between Camera and World Coordinate Systems
  3.4 Foreground Extraction from Color Images
  3.5 Radius Outlier Removal for Point Clouds
  3.6 Point Cloud Normal Estimation
  3.7 Registration and Alignment of Object Point Clouds
    3.7.1 Point Cloud Registration with Color-Supported ICP
    3.7.2 Global Optimization of Point Clouds
Chapter 4 Image Synthesis and the Deep Learning Model
  4.1 Converting Point Clouds to Planar Color Images
  4.2 Mask Extraction from Point-Cloud-Synthesized Images
  4.3 Image Compositing and Automatic Bounding-Box Annotation
  4.4 Faster R-CNN Object Detection Model and Data Configuration
Chapter 5 Experimental Results and Discussion
  5.1 Results of Point Cloud Reconstruction and Image Synthesis
  5.2 Detection Model Evaluation
  5.3 Experiment 1: Training Results with Real and Synthetic Datasets
  5.4 Experiment 2: Training Results of Augmenting the Real Dataset with Synthetic Data
  5.5 Experiment 3: Comparing Training on Point-Cloud-Synthesized Images Versus Plain Color-Image Collages
Chapter 6 Conclusions and Future Work
References

Full Text Release Date: 2025/08/24 (campus network)
Full Text Release Date: full text not authorized for release (off-campus network)
Full Text Release Date: full text not authorized for release (National Central Library: Taiwan NDLTD system)