
Graduate student: 張仁佑 (Jen-You Zhang)
Thesis title: 基於背景顏色可變之收集樣本影像與建立物件辨識資料集之方法
A Method of Collecting the Sample Images with the Color-Changeable Background and Establishing a Dataset for Object Recognition
Advisor: 李維楨 (Wei-Chen Lee)
Oral defense committee: 孫沛立 (Pei-li Sun), 洪詩涵 (Shih-Han Hung)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 98
Keywords (Chinese): 電腦視覺, 影像處理, 深度學習, 物件辨識, Faster R-CNN, 影像合成, 機械手臂
Keywords (English): Computer Vision, Image Processing, Deep Learning, Object Recognition, Faster R-CNN, Image Synthesis, Robotic Arm


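Table of contents:
Abstract (Chinese)
Abstract (English)
Acknowledgments
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Motivation
  1.2 Literature Review
  1.3 Research Objectives
Chapter 2 Background Principles
  2.1 Threshold Segmentation of Color Images
    2.1.1 YCbCr Color Space
  2.2 Image Matting and Image Compositing
    2.2.1 Image Matting
    2.2.2 Alpha Compositing
  2.3 Four Common Types of Image Recognition Tasks
  2.4 Convolutional Neural Network (CNN)
  2.5 Transfer Learning
  2.6 Faster R-CNN Object Detection Model
Chapter 3 Experimental Scenario and Workflow
  3.1 Assumed Experimental Scenario
  3.2 Experimental Workflow
Chapter 4 Collecting Product Images with a Changeable Background
  4.1 Automatically Acquiring Object Images from Different Viewpoints with a Robotic Arm
  4.2 Experimental Hardware Setup
  4.3 Image Collection Procedure
    4.3.1 Preparation: Obtaining the Color Image Segmentation Thresholds
    4.3.2 Preparation: Obtaining the Ratio of Physical Distance to Image Pixel Distance
    4.3.3 Locating the Foreground and Computing the Robotic Arm Movement Coordinates
Chapter 5 Extracting Product Foregrounds and Building the Product Image Database
  5.1 Image Processing Pipeline for Extracting Product Foregrounds
  5.2 Drawbacks of Capturing Object Images with a Changeable Background
  5.3 Product Image Database
  5.4 Collecting the Real Image Set
    5.4.1 Manual Image Annotation
Chapter 6 Generating Training Images by Image Synthesis with Automatic Annotation
  6.1 Definition of Product Placement Combinations for Image Synthesis
    6.1.1 Definition of the Movable Range for Inserting Object Images into Background Images
    6.1.2 Scenario 1: Two Objects in Contact with Parallel Long Axes
    6.1.3 Scenario 2: Two Objects in Contact with Perpendicular Long Axes
    6.1.4 Scenario 3: Two Objects Separated
  6.2 Compositing Product Combination Images onto Backgrounds with Automatic Annotation
    6.2.1 Image Synthesis Procedure
    6.2.2 Automatic Image Annotation (Object Detection Annotation Type)
    6.2.3 Automatic Image Annotation (Instance Segmentation Annotation Type)
  6.3 Building the Training Dataset by Image Synthesis
Chapter 7 Experimental Results and Discussion
  7.1 Faster R-CNN Training Parameter Settings
  7.2 Evaluation Metrics for Object Detection Models
  7.3 Experiment 1: Training the Faster R-CNN Model with Synthetic and Real Images
    7.3.1 Experiment 1: Allocation of Training and Test Data
    7.3.2 Experiment 1: Evaluation of Faster R-CNN Training Results
  7.4 Experiment 2: Evaluation of Models Trained on Sample Images from Single and Multiple Viewpoints
    7.4.1 Experiment 2: Allocation of Training and Test Data
    7.4.2 Experiment 2: Evaluation of Faster R-CNN Training Results
Chapter 8 Conclusions and Future Work
  8.1 Conclusions
  8.2 Future Work
References
Appendix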

[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the 25th International Conference on Neural Information Processing Systems, vol. 1, Lake Tahoe, Nevada, 2012.
[2] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes (VOC) Challenge," Int. J. Comput. Vision, vol. 88, no. 2, pp. 303-338, 2010.
[3] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 20-25 June 2009, pp. 248-255.
[4] T.-Y. Lin et al., "Microsoft COCO: Common Objects in Context."
[5] S. Mahapatra. Why Deep Learning over Traditional Machine Learning? Retrieved from https://towardsdatascience.com/why-deep-learning-is-needed-over-traditional-machine-learning-1b6a99177063 (June 5, 2019).
[6] B. Sapp, A. Saxena, and A. Y. Ng, A Fast Data Collection and Augmentation Procedure for Object Recognition. 2008, pp. 1402-1408.
[7] H. Jo, Y. Na, and J. Song, "Data augmentation using synthesized images for object detection," in 2017 17th International Conference on Control, Automation and Systems (ICCAS), 18-21 Oct. 2017, pp. 1035-1038.
[8] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2017.
[9] D. Dwibedi, I. Misra, and M. Hebert, "Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection," in 2017 IEEE International Conference on Computer Vision (ICCV), 22-29 Oct. 2017, pp. 1310-1319.
[10] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 7-12 June 2015, pp. 3431-3440.
[11] P. Pérez, M. Gangnet, and A. Blake, "Poisson image editing," ACM Trans. Graph., vol. 22, no. 3, pp. 313-318, 2003.
[12] T. Kiyokawa, K. Tomochika, J. Takamatsu, and T. Ogasawara, "Fully Automated Annotation With Noise-Masked Visual Markers for Deep-Learning-Based Object Detection," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1972-1977, 2019.
[13] R. Dalal and T. Moh, "Fine-Grained Object Detection Using Transfer Learning and Data Augmentation," in 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 28-31 Aug. 2018, pp. 893-896.
[14] W. Liu et al., "SSD: Single Shot MultiBox Detector."
[15] C. Szegedy et al., "Going deeper with convolutions," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 7-12 June 2015, pp. 1-9.
[16] C. A. Poynton, A Technical Introduction to Digital Video. Wiley, 1996.
[17] A. R. Smith and J. F. Blinn, "Blue screen matting," in Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, 1996.
[18] J. Sun, Y. Li, S. B. Kang, and H.-Y. Shum, "Flash matting," ACM Trans. Graph., vol. 25, no. 3, pp. 772-778, 2006.
[19] MathWorks. Convolutional Neural Network: 3 Things You Need to Know. Retrieved from https://www.mathworks.com/solutions/deep-learning/convolutional-neural-network.html?s_tid=srchtitle (June 10, 2019).
[20] S. J. Pan and Q. Yang, "A Survey on Transfer Learning," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345-1359, 2010.
[21] R. Girshick, "Fast R-CNN," in 2015 IEEE International Conference on Computer Vision (ICCV), 7-13 Dec. 2015, pp. 1440-1448.
[22] A. Dutta and A. Zisserman, "The VIA Annotation Software for Images, Audio and Video," arXiv preprint arXiv:1904.10699, 2019.

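Full-text release date: 2024/08/15 (campus network)
Full-text release date: full text not authorized for public release (off-campus network)
Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)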