
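Graduate student: 朱釗宏
Kevin - Alfianto Jangtjik
Thesis title: 基於多尺度加權池化深度學習的畫家分類演算法
Painter Classification via Deep Learning with Multi-scale Weighted Pooling
Advisor: 花凱龍
Kai-Lung Hua
Committee members: 葉梅珍
Mei-Chen Yeh
楊傳凱
Chuan-Kai Yang
陳永耀
Yong-Yao Chen
翁明昉
Ming-Fang Weng
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2017
Graduation academic year: 105 (ROC calendar)
Language: English
Number of pages: 47
Keywords (Chinese): 圖像分類, 深度學習, 多尺度金字塔, 馬爾科夫隨機場 (image classification, deep learning, multi-scale pyramid, Markov random field)
Keywords (English): image classification, deep learning, multi-scale pyramid, Markov random field, entropy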
    We propose a new approach for categorizing digital images of paintings by artist. Determining the authorship of a painting is challenging because different artists often depict the same subjects, and the paintings of a single artist may not share a unique style. The proposed approach is built upon convolutional neural networks (CNNs), a class of biologically inspired vision models that have recently demonstrated near-human performance on several visual recognition tasks. However, training a CNN model requires large-scale training data with a fixed input image size (e.g., 227 x 227). In this thesis, we construct a multi-layer pyramid from each image, which provides 21 times more features than using a single layer (i.e., the original image) alone. We train a CNN model for each layer, consider the relationships between neighboring patches in the layers that have fine sub-regions, and propose a new weighted fusion scheme that aggregates the three layers adaptively. To evaluate the proposed methods, we collect two painting image datasets: the first is categorized into 13 artists and the second into 23 artists. The experimental results show that the proposed methods significantly surpass the baseline method in terms of precision, recall, and F-score, demonstrating their effectiveness.
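    The pyramid and fusion ideas in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the record does not confirm: that the three layers split the image into 1 x 1, 2 x 2, and 4 x 4 grids (which gives 1 + 4 + 16 = 21 patches, matching the factor of 21), and that the fusion weight of a layer grows as the entropy of its class distribution shrinks (suggested only by the keyword "entropy"). The names `pyramid_boxes`, `entropy`, and `fuse` are hypothetical, and the per-layer distributions stand in for CNN outputs that have already been aggregated over the patches of each layer.

```python
import math


def pyramid_boxes(width, height, levels=3):
    """Return, for each level l, the (left, top, right, bottom) boxes of a 2^l x 2^l grid.

    Assumption: the grid layout is illustrative, not taken from the thesis.
    """
    pyramid = []
    for level in range(levels):
        n = 2 ** level  # the image is split into n x n patches at this level
        boxes = []
        for row in range(n):
            for col in range(n):
                left = col * width // n
                right = (col + 1) * width // n
                top = row * height // n
                bottom = (row + 1) * height // n
                boxes.append((left, top, right, bottom))
        pyramid.append(boxes)
    return pyramid


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def fuse(layer_probs):
    """Fuse per-layer class distributions with confidence-based weights.

    Assumption: a layer's raw weight is (max entropy - its entropy), so a
    confident (low-entropy) layer contributes more. This is a stand-in for
    the weighted fusion scheme, not the thesis's actual formula.
    """
    n_classes = len(layer_probs[0])
    max_entropy = math.log(n_classes)
    # Clamp at zero to absorb floating-point error for uniform distributions.
    raw = [max(0.0, max_entropy - entropy(p)) for p in layer_probs]
    total = sum(raw)
    if total <= 1e-12:
        # Every layer is (numerically) uniform: fall back to equal weights.
        weights = [1.0 / len(layer_probs)] * len(layer_probs)
    else:
        weights = [r / total for r in raw]
    fused = [
        sum(w * p[c] for w, p in zip(weights, layer_probs))
        for c in range(n_classes)
    ]
    return fused, weights


# Usage: patch boxes for a 908 x 908 image, then fusion of three made-up
# per-layer distributions over three artists.
pyramid = pyramid_boxes(908, 908)
patch_counts = [len(boxes) for boxes in pyramid]
fused, weights = fuse([[0.9, 0.05, 0.05], [1 / 3, 1 / 3, 1 / 3], [0.6, 0.3, 0.1]])
predicted_artist = max(range(len(fused)), key=lambda c: fused[c])
```

    In a real pipeline each box would be cropped, resized to the fixed CNN input size, and classified by the network trained for its layer; the sketch only covers the patch geometry and the final weighting step.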



    Abstract
    Table of contents
    List of Tables
    List of Figures
    1 Introduction
        1.1 Background and Objective
        1.2 Outline
    2 Related Works
    3 Approach
        3.1 Training of Multi-scale Networks
        3.2 Weight Fusion for Artist-based Classification
        3.3 Adaptive Fusion for Artist-based Classification
        3.4 Refining Scheme using Markov Random Field
            3.4.1 Refining Formula using Markov Random Field
            3.4.2 Combining Markov Random Field Result
    4 Experiments
        4.1 Dataset
        4.2 Baseline
        4.3 Multi-scale Pyramid with Hard Decision
        4.4 Multi-scale Pyramid with Soft Decision
        4.5 Multi-scale Pyramid with Soft Decision and Adaptive Pooling
        4.6 MRF Refining Scheme
        4.7 Methods Description using Progressive Calculation
            4.7.1 Multi-scale Pyramid with Hard Decision Description
            4.7.2 Multi-scale Pyramid with Soft Decision Description
            4.7.3 Multi-scale Pyramid with Soft Decision and Adaptive Pooling Description
            4.7.4 MRF Refining Scheme Description
        4.8 Results and Discussion
    5 Conclusion
        5.1 Future Work
    References


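    Full-text release date: 2022/01/24 (campus network)
    Full-text release date: 2027/01/24 (off-campus network)
    Full-text release date: 2027/01/24 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)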