
Graduate Student: 楊裕堯 (Yu-Yao Yang)
Thesis Title: 不同深度卷積神經網路應用於不良圖像分類之研究
A Study of Different Deep Convolutional Neural Networks on Objectionable Images Classification
Advisor: 吳怡樂 (Yi-Leh Wu)
Oral Defense Committee: 唐政元 (Cheng-Yuan Tang), 閻立剛 (Li-Kang Yen), 陳建中 (Chien-Chung Chen)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Publication Year: 2018
Graduation Academic Year: 106
Language: Chinese
Number of Pages: 60
Keywords (Chinese): Caffe, objectionable image recognition, deep convolutional neural network, deep learning
Keywords (English): Caffe, Objectionable Image, Convolutional Neural Network, Deep Learning
    In the past, people usually identified objectionable images by extracting skin features or contextual keywords and combining multiple filters. Deep learning can raise the level of recognition, but it requires a large training set and a lengthy training time. With advances in hardware and new algorithms, these problems have gradually been alleviated, and deep convolutional neural networks now clearly outperform the earlier approaches to image classification. In this thesis, we construct a modified deep convolutional neural network to learn and classify objectionable images; compared with the original network, it achieves better classification results. In our experimental environment, we used a total of 150,000 images, split 4:1 between training and testing. We obtained a classification accuracy of 97.2% and a loss rate of 7.8%, with a training cost of only 44 minutes. Compared with the earlier approaches, the proposed method maintains a comparable level of accuracy and loss rate while reducing the time spent on training by more than 10%.


    In the past, people often used skin feature extraction or contextual keywords, combining multiple filters, to identify objectionable images. Although deep learning can improve recognition, it requires a relatively large amount of training data and lengthy training time. With the development of hardware and new algorithms, these problems have gradually been alleviated, and deep convolutional neural networks now clearly outperform earlier approaches to image classification. In this thesis, we construct modified deep convolutional neural networks to learn and classify objectionable images; compared with the original network, the proposed networks achieve superior classification results. In our experimental environment, with a total of 150,000 images split between training and testing at a 4:1 ratio, we obtain a classification accuracy of 97.2% with a loss rate of 7.8%, at a training cost of only 44 minutes. Compared with the previous methods, the proposed methods maintain a comparable level of accuracy and loss rate while reducing the time spent on training by more than 10%.
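
    The experimental setup described in the abstract, 150,000 labeled images divided 4:1 between training and testing before training in Caffe, can be illustrated with a short script. The following is a minimal Python sketch, not the author's actual pipeline: it assumes one subfolder per class under a hypothetical dataset root and writes the plain "path label" list files that Caffe's ImageData layer or the convert_imageset tool accepts; all file and directory names here are illustrative assumptions.

import os
import random

def write_split(image_root, train_list="train.txt", test_list="test.txt",
                test_ratio=0.2, seed=42):
    """Collect (path, label) pairs from per-class subfolders and split them 4:1."""
    samples = []
    for label, cls in enumerate(sorted(os.listdir(image_root))):
        cls_dir = os.path.join(image_root, cls)
        for name in os.listdir(cls_dir):
            samples.append((os.path.join(cls_dir, name), label))

    random.seed(seed)                        # fixed seed for a reproducible split
    random.shuffle(samples)
    n_test = int(len(samples) * test_ratio)  # e.g. 30,000 of 150,000 at a 4:1 ratio
    test, train = samples[:n_test], samples[n_test:]

    for list_path, rows in ((train_list, train), (test_list, test)):
        with open(list_path, "w") as f:
            for img_path, label in rows:
                f.write(f"{img_path} {label}\n")

if __name__ == "__main__":
    write_split("dataset/")  # hypothetical dataset root with one subfolder per class

    With list files of this kind, the reported 97.2% accuracy would be measured on the held-out 20% test split.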

    Thesis Abstract (Chinese)
    Abstract
    Contents
    LIST OF FIGURES
    LIST OF TABLES
    Chapter 1. Introduction
    Chapter 2. Deep Learning and CNN
        2.1 Deep Learning
        2.2 CNN
    Chapter 3. Caffe and Deep Learning Model
        3.1 Caffe
        3.2 Deep Learning Model
    Chapter 4. Experiment
        4.1 The ImageNet ILSVRC 2012 dataset and the people-related dataset
        4.2 Testing several major architectures of deep learning
        4.3 Using the objectionable images of the people-related dataset and the ImageNet ILSVRC 2012 dataset on the Modified Net model and the Alex Net model
        4.4 Using the objectionable and non-objectionable images of the people-related dataset on the Modified Net model and the Reference Net model
        4.5 Using all three datasets on the Modified Net model and the Reference Net model
    Chapter 5. Conclusions and Future Works
    References

