
Author: Yan-Wen Zhou (周彥文)
Title: An Enhanced Object Proposal Method Based on Saliency Map Extraction for Natural Images (一個基於自然影像顯著圖偵測用以增強物件提取之方法)
Advisor: Chin-Shyurng Fahn (范欽雄)
Committee: Jun-Wei Hsieh (謝君偉), Shen-Jyh Wang (王聖智), Yi-Leh Wu (吳怡樂), Chin-Shyurng Fahn (范欽雄)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication year: 2018
Graduation academic year: 106 (2017-2018)
Language: English
Pages: 61
Keywords: object detection, convolutional neural networks, salient region detection, image color analysis, deep learning

In recent years, the importance of object detection and recognition in images has grown steadily. Real-time visual data increasingly fills our lives, coming from satellites, surveillance cameras, portable recording devices, live streaming, and more, and this abundance raises many problems and technical issues awaiting resolution. With the rapid evolution of science and technology, people are using various techniques to let machines simulate human neural networks and learn to recognize images.
In this thesis, we take a set of color images as input and build a saliency detection system to find the salient regions in each image. We first perform a preliminary segmentation of the image and extract features from each region; the random forest method from machine learning is then trained on and used to classify the feature vector of each region, yielding an initial saliency map. Next, from this initial map we derive the definite foreground and definite background regions and use them to compute the optimal linear combination of color coefficients that separates foreground from background. Combining this with the spatial information of the image yields a refined saliency map. The foreground segmented by this saliency map is still somewhat incomplete, so we apply a foreground recovery method and morphological operations to make the extracted objects look more complete. Finally, we feed the color images containing only foreground objects into the YOLO architecture for training to obtain the final results of this thesis.
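To make the post-processing steps above concrete, the following is a minimal Python/OpenCV sketch of saliency-map binarization, morphological closing, and hole filling. It is an illustrative approximation only, not the thesis's implementation; the Otsu threshold, the 7x7 elliptical kernel, and the corner-seeded flood fill are all assumed parameters.

```python
# Illustrative sketch of the post-processing stage: binarize a saliency map,
# close small gaps with morphology, fill interior holes, and keep only the
# foreground pixels. Thresholds and kernel sizes are assumptions, not the
# thesis's actual settings.
import cv2
import numpy as np

def extract_foreground(image: np.ndarray, saliency: np.ndarray) -> np.ndarray:
    """image: H x W x 3 BGR; saliency: H x W float map in [0, 1]."""
    # Binarize the saliency map with Otsu's method (cf. [34]).
    gray = (saliency * 255).astype(np.uint8)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Morphological closing bridges small gaps in the foreground region.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # Fill interior holes: flood-fill the background from a corner (assuming
    # the corner is background), then merge the inverted fill with the mask.
    fill = mask.copy()
    h, w = mask.shape
    ff_mask = np.zeros((h + 2, w + 2), np.uint8)
    cv2.floodFill(fill, ff_mask, (0, 0), 255)
    mask = mask | cv2.bitwise_not(fill)

    # Black out everything except the recovered foreground objects.
    return cv2.bitwise_and(image, image, mask=mask)
```

In practice the kernel size and the flood-fill seed would have to match the image resolution and the assumption that the image border belongs to the background.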
We compare our foreground extraction results with those of the interactive GrabCut method [5] (manually assisted foreground segmentation), averaged over 600 images: our method reaches an accuracy of about 90% and GrabCut about 93%. The difference is small, but our method requires no manual interaction, which gives it a slight advantage.
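The thesis does not spell out the exact accuracy formula behind these percentages; one plausible reading is per-pixel agreement between the extracted foreground mask and a ground-truth mask, averaged over the 600 test images, as in this assumed sketch:

```python
# An assumed pixel-agreement metric for scoring extracted foregrounds against
# ground truth; treat this as one plausible reading of the reported numbers,
# not the thesis's stated evaluation protocol.
import numpy as np

def pixel_accuracy(pred: np.ndarray, truth: np.ndarray) -> float:
    """Fraction of pixels where the two binary masks agree."""
    return float(np.mean(pred.astype(bool) == truth.astype(bool)))

def mean_accuracy(pairs) -> float:
    """Average accuracy over (pred, truth) mask pairs, e.g. 600 test images."""
    return sum(pixel_accuracy(p, t) for p, t in pairs) / len(pairs)
```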

Table of Contents
Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Overview
  1.2 Motivation
  1.3 System Description
  1.4 Thesis Organization
Chapter 2 Related Work
  2.1 R-CNN
  2.2 Fast R-CNN
  2.3 Faster R-CNN
Chapter 3 Object Extraction and Detection
  3.1 Saliency Map
  3.2 Saliency Cut
  3.3 Restore Incomplete Foreground
  3.4 Hole Filling Method
  3.5 You Only Look Once Object Detection
Chapter 4 Experimental Results and Discussions
  4.1 Experimental Setup
  4.2 The Results of Saliency Cut Images
  4.3 The Results of Foreground Object Detection
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References

[1] J. Kim, D. Han, Y. W. Tai, and J. Kim, “Salient region detection via high-dimensional color transform,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, pp. 883-890, Jun 2014.
[2] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, pp. 580-587, Jun 2014.
[3] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, pp. 1440-1448, Dec 2015.
[4] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, Jun 2017.
[5] C. Rother, V. Kolmogorov, and A. Blake, “GrabCut: Interactive foreground extraction using iterated graph cuts,” ACM Transactions on Graphics, vol. 23, no. 3, pp. 309-314, Aug 2004.
[6] D. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, Nov 2004.
[7] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, pp. 886-893, Jun 2005.
[8] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The PASCAL Visual Object Classes (VOC) Challenge,” International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, Jun 2010.
[9] B. Alexe, T. Deselaers, and V. Ferrari, “Measuring the objectness of image windows,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2189-2202, Nov 2012.
[10] J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders, “Selective search for object recognition,” International Journal of Computer Vision, vol. 104, no. 2, pp. 154-171, Sep 2013.
[11] I. Endres and D. Hoiem, “Category independent object proposals,” in Proceedings of the European Conference on Computer Vision, Springer, Berlin, Heidelberg, pp. 575-588, Sep 2010.
[12] J. Carreira and C. Sminchisescu, “CPMC: Automatic object segmentation using constrained parametric min-cuts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, pp. 1312-1328, Jul 2012.
[13] P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, “Multiscale combinatorial grouping,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, pp. 328-335, Jun 2014.
[14] D. Ciresan, A. Giusti, L. Gambardella, and J. Schmidhuber, “Mitosis detection in breast cancer histology images with deep neural networks,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, Berlin, Heidelberg, pp. 411-418, Sep 2013.
[15] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient graph-based image segmentation,” International Journal of Computer Vision, vol. 59, no. 2, pp. 167-181, Sep 2004.
[16] A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, vol. 25, pp. 1097-1105, Dec 2012.
[17] Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, and L. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541-551, Dec 1989.
[18] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904-1916, Sep 2015.
[19] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, “OverFeat: Integrated recognition, localization and detection using convolutional networks,” in Proceedings of the International Conference on Learning Representations, Banff, Canada, Apr 2014.
[20] Y. Zhu, R. Urtasun, R. Salakhutdinov, and S. Fidler, “segDeepM: Exploiting segmentation and context in deep neural networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, pp. 4703-4711, Jun 2015.
[21] C. L. Zitnick and P. Dollar, “Edge boxes: Locating object proposals from edges,” in Proceedings of the European Conference on Computer Vision, Springer, Cham, pp. 391-405, Sep 2014.
[22] Y. Y. Boykov and M. P. Jolly, “Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images,” in Proceedings of the Eighth IEEE International Conference on Computer Vision, Vancouver, BC, Canada, pp. 105-112, Jul 2001.
[23] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, and S. Li, “Salient object detection: A discriminative regional feature integration approach,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, pp. 2083-2090, Jun 2013.
[24] F. Perazzi, P. Krahenbuhl, Y. Pritch, and A. Hornung, “Saliency filters: Contrast based filtering for salient region detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, pp. 733-740, Jun 2012.
[25] X. Shen and Y. Wu, “A unified approach to salient object detection via low rank matrix recovery,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, pp. 853-860, Jun 2012.
[26] C. Yang, L. Zhang, H. Lu, X. Ruan, and M.-H. Yang, “Saliency detection via graph-based manifold ranking,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, pp. 3166-3173, Jun 2013.
[27] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk, “SLIC superpixels compared to state-of-the-art superpixel methods,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2274-2282, Nov 2012.
[28] T. Judd, K. Ehinger, F. Durand, and A. Torralba, “Learning to predict where humans look,” in Proceedings of the IEEE International Conference on Computer Vision, Kyoto, Japan, pp. 2106-2113, Oct 2009.
[29] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, “Frequency-tuned salient region detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, pp. 1597-1604, Jun 2009.
[30] M. M. Cheng, N. J. Mitra, X. Huang, P. H. S. Torr, and S. M. Hu, “Global contrast based salient region detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 569-582, Mar 2015.
[31] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, “Object detection with discriminatively trained part-based models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1627-1645, Sep 2010.
[32] B. Su, S. Lu, and C. L. Tan, “Blurred image region detection and classification,” in Proceedings of the 19th ACM International Conference on Multimedia, Scottsdale, AZ, pp. 1397-1400, Dec 2011.
[33] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5-32, Oct 2001.
[34] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, Jan 1979.
[35] P. Kaiser and R. M. Boynton, Human Color Vision, 2nd ed., Optical Society of America, Washington, DC, 1996.
[36] J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, pp. 7263-7271, Jul 2017.
[37] R. Rothe, M. Guillaumin, and L. Van Gool, “Non-maximum suppression for object detection by passing messages between windows,” in Proceedings of the Asian Conference on Computer Vision, Singapore, pp. 290-306, Nov 2014.
[38] L. Duan, C. Wu, J. Miao, L. Qing, and Y. Fu, “Visual saliency detection by spatially weighted dissimilarity,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, pp. 473-480, Jun 2011.
[39] S. Goferman, L. Zelnik-Manor, and A. Tal, “Context-aware saliency detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 10, pp. 1915-1926, Oct 2012.
[40] E. Rahtu, J. Kannala, M. Salo, and J. Heikkila, “Segmenting salient objects from images and videos,” in Proceedings of the European Conference on Computer Vision, Springer, Berlin, Heidelberg, pp. 366-379, Sep 2010.

Full-text release date: 2023/07/25 (campus network)
Full-text release date: 2028/07/25 (off-campus network)
Full-text release date: 2023/07/25 (National Central Library: Taiwan NDLTD system)