
Graduate Student: Wei-Yu Chao (趙偉佑)
Thesis Title: An Automatic Object Classification Method for Cluttered Product Images Based on Deep Learning
Advisor: Chin-Shyurng Fahn (范欽雄)
Committee Members: Bin-Shyan Jong (鍾斌賢), Yen-Lin Chen (陳彥霖), Yi-Ling Chen (陳怡伶)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108 (ROC calendar, 2019-2020)
Language: Chinese
Number of Pages: 50
Keywords: Multi-object classification, Deep learning, Cluttered product recognition, Self-checkout, Automatic checkout, Autoencoder

Most current checkout systems rely on barcodes, RFID tags, or QR codes attached to products to distinguish them. In this study, products are recognized from images of a checkout counter with a built-in single camera, using deep learning modules to learn product features. By replacing existing checkout methods with this computer vision technique, a variety of products can be detected without any physical labels. In automatic checkout applications, the main challenges come from the wide variety of products and their constant renewal, which make it difficult to collect real checkout images for training; in addition, customers place products arbitrarily at checkout, causing individual products to overlap and occlude one another.
To address these problems, our method consists of two parts. The first part uses salient object detection to extract product images from single-product exemplar images, synthesizes checkout images from the extracted products, and then refines the synthesized checkout images with our autoencoder. The second part uses the results of the first part to train our deep learning model, which adopts Cascade R-CNN as the base architecture and incorporates a feature pyramid network to strengthen the recognition of objects of different sizes, thereby performing multi-object classification.
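To make the first part concrete, the following is a minimal, hypothetical Python sketch of the cluttered-image synthesis step, under the assumption that the product cut-outs have already been segmented and stored as RGBA files; all file names, sizes, and placement rules are illustrative, and the autoencoder refinement stage is omitted.

    # Hypothetical sketch of the synthesis step only, assuming the product
    # cut-outs were already segmented (e.g., by a salient object detector) and
    # saved as RGBA images whose alpha channel carries the product mask.
    # Directory layout, canvas size, and placement rules are illustrative.
    import random
    from pathlib import Path
    from PIL import Image

    def synthesize_checkout_image(cutout_dir, background_path, n_products=5,
                                  canvas_size=(1024, 1024), seed=None):
        """Paste randomly chosen, randomly placed product cut-outs onto a
        checkout-table background to form one synthetic cluttered image."""
        rng = random.Random(seed)
        canvas = Image.open(background_path).convert("RGB").resize(canvas_size)
        labels = []  # (category name, loose bounding box) pairs for training

        cutouts = sorted(Path(cutout_dir).glob("*.png"))
        for _ in range(n_products):
            path = rng.choice(cutouts)
            product = Image.open(path).convert("RGBA")

            # Random scale and in-plane rotation to imitate arbitrary placement.
            scale = rng.uniform(0.5, 1.0)
            product = product.resize((max(1, int(product.width * scale)),
                                      max(1, int(product.height * scale))))
            product = product.rotate(rng.uniform(0, 360), expand=True)
            # Keep each cut-out no larger than half the canvas so it always fits.
            product.thumbnail((canvas_size[0] // 2, canvas_size[1] // 2))

            # Random position; overlaps are intentional, since occlusion between
            # products is exactly what the detector has to learn to handle.
            x = rng.randint(0, canvas_size[0] - product.width)
            y = rng.randint(0, canvas_size[1] - product.height)
            canvas.paste(product, (x, y), mask=product)  # alpha used as mask
            labels.append((path.stem, (x, y, x + product.width, y + product.height)))

        return canvas, labels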
In this study, we conduct experiments on the Retail Product Checkout (RPC) dataset. Compared with the 56.68% classification accuracy obtained by the baseline method provided with RPC, our method achieves a classification accuracy of 86.39%. Furthermore, when products with new flavors are released, we can use our method to generate cluttered product images to fine-tune our model, so that the average classification rate for these new products reaches 98.8%. Such classification performance shows that our proposed method is feasible for self-checkout applications.


Most current checkout systems rely on barcodes, RFID tags, or QR codes attached to products for recognition. In this study, products are identified from an image captured by a single camera mounted on top of a checkout table, with deep learning modules learning the visual characteristics of the products. By replacing existing checkout methods with this computer vision technique, a variety of products can be detected without any physical labels. In the application of automatic checkout, the main challenge comes from the wide variety of products and their frequent updates, which make it difficult to collect actual checkout images for training. In addition, customers place products arbitrarily during checkout, which causes products to overlap and occlude one another.
In order to solve the above problems, our proposed method consists of two phases. The first phase uses salient object detection to segment the product from a single-product exemplar image; a cluttered product image is then synthesized by pasting the segmented product images together and is further enhanced by our autoencoder. The second phase uses the outcome of the first phase to train our deep learning model, which adopts Cascade R-CNN as the basic architecture; in addition, a feature pyramid network is employed to strengthen the recognition of objects of different sizes, so as to perform multi-object classification.
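For reference, one possible way to realize the second phase is sketched below. The thesis does not state which detection toolbox it builds on, so this sketch assumes MMDetection (v2.x) and its off-the-shelf Cascade R-CNN with ResNet-50 FPN configuration; the config/checkpoint paths and the image file name are placeholders.

    # Hypothetical phase-2 inference sketch, assuming MMDetection's reference
    # Cascade R-CNN + ResNet-50 FPN configuration. Paths are placeholders, and
    # the thesis may have used a different toolbox, backbone, or settings.
    from mmdet.apis import init_detector, inference_detector

    config_file = 'configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py'
    checkpoint_file = 'checkpoints/cascade_rcnn_checkout.pth'  # fine-tuned on synthesized images

    model = init_detector(config_file, checkpoint_file, device='cuda:0')

    # For a box-only detector, the result is one array of (x1, y1, x2, y2, score)
    # rows per product category; thresholding the scores gives the shopping list.
    result = inference_detector(model, 'checkout_image.jpg')
    shopping_list = [
        (model.CLASSES[class_id], box[:4])
        for class_id, boxes in enumerate(result)
        for box in boxes
        if box[4] >= 0.5
    ]
    print(shopping_list)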
In this study, we conduct experiments on the Retail Product Checkout (RPC) dataset. Compared with the 56.68% classification accuracy obtained by the baseline method provided with RPC, our proposed method achieves a classification accuracy of 86.39%. Additionally, when products with new flavors are released, we can use our method to generate cluttered product images for fine-tuning our model, and the average classification rate for these new products reaches 98.80%. Such performance shows that our method is feasible in the application of self-checkout.
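For context on how such figures can be computed, the short sketch below implements an RPC-style checkout accuracy, under the assumption that a test image counts as correct only when the predicted product categories and their counts match the ground-truth shopping list exactly; the data structures and sample values are invented for illustration.

    # A small sketch of an RPC-style checkout accuracy, assuming an image is
    # counted as correct only when the predicted product categories and their
    # counts match the ground-truth shopping list exactly.
    from collections import Counter

    def checkout_accuracy(predictions, ground_truths):
        """Both arguments: one list of product-category IDs per test image."""
        assert len(predictions) == len(ground_truths)
        correct = sum(Counter(pred) == Counter(gt)
                      for pred, gt in zip(predictions, ground_truths))
        return correct / len(ground_truths)

    # Two of the three images below have their full product list predicted
    # correctly, so the accuracy is 2/3.
    preds = [["cola", "chips"], ["soap"],         ["cola", "cola"]]
    truth = [["chips", "cola"], ["soap", "soap"], ["cola", "cola"]]
    print(f"checkout accuracy: {checkout_accuracy(preds, truth):.2%}")  # 66.67%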

Chinese Abstract i
Abstract ii
Acknowledgments iii
Contents iv
List of Figures vi
List of Tables viii
Chapter 1 Introduction 1
  1.1 Overview 1
  1.2 Motivation 2
  1.3 System Description 3
  1.4 Organization of Thesis 4
Chapter 2 Related Work 5
  2.1 Other Product Recognition 5
  2.2 Object Detection 6
Chapter 3 Image Preprocessing 9
  3.1 Product Segmentation 9
  3.2 Cluttered Product Image Synthesizing 11
    3.2.1 Cluttered product image generating from product partial images 12
    3.2.2 Shadow-working treatment 13
  3.3 Autoencoder 14
Chapter 4 Product Detection Methodology 19
  4.1 Image Augmentation 19
  4.2 Product Detection 23
  4.3 Overlapping Detection Removal 26
Chapter 5 Experimental Results and Discussions 28
  5.1 Experimental Setup 28
    5.1.1 Dataset Introduction 28
    5.1.2 Development Environment Setup 30
  5.2 Evaluation Protocol 30
  5.3 Experimental Results 32
Chapter 6 Conclusions and Future Work 38
  6.1 Conclusions 38
  6.2 Future Work 39
References 40

[1] X. S. Wei et al., “RPC: A large-scale retail product checkout dataset,” arXiv preprint arXiv:1901.07249, 2019.
[2] J. Y. Zhu et al., “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, pp. 2223-2232, 2017.
[3] S. Ren et al., “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Proceedings of the Advances in Neural Information Processing Systems, Montreal, Quebec, Canada, pp. 91-99, 2015.
[4] C. Li et al., “Data priming network for automatic check-out,” in Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, pp. 2152-2160, 2019.
[5] B. F. Wu et al., “An intelligent self-checkout system for smart retail,” in Proceedings of the International Conference on System Science and Engineering, Nantou, Taiwan, pp. 1-4, 2016.
[6] B. T. Wu et al., “Image recognition approach for expediting Chinese cafeteria checkout process,” in Proceedings of the IEEE 2nd Global Conference on Life Sciences and Technologies, Kyoto, Japan, pp. 35-38, 2020.
[7] R. Girshick et al., “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Ohio, pp. 580-587, 2014.
[8] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, pp. 1440-1448, 2015.
[9] Q. Hou et al., “Deeply supervised salient object detection with short connections,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Hawaii, pp. 3203-3212, 2017.
[10] S. Xie and Z. Tu, “Holistically-nested edge detection,” in Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, pp. 1395-1403, 2015.
[11] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[12] K. He et al., “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Nevada, pp. 770-778, 2016.
[13] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 234-241, 2015. doi: 10.1007/978-3-319-24574-4_28
[14] Z. Cai and N. Vasconcelos, “Cascade R-CNN: Delving into high quality object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, pp. 6154-6162, 2018.
[15] T. Y. Lin et al., “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Hawaii, pp. 2117-2125, 2017.

Full text release date: 2025/08/11 (campus network)
Full text release date: 2030/08/11 (off-campus network)
Full text release date: 2030/08/11 (National Central Library: Taiwan NDLTD system)