Graduate student: | 朱釗宏 Kevin Alfianto Jangtjik |
---|---|
Thesis title: | 基於多尺度加權池化深度學習的畫家分類演算法 Painter Classification via Deep Learning with Multi-scale Weighted Pooling |
Advisor: | 花凱龍 Kai-Lung Hua |
Oral defense committee: | 葉梅珍 Mei-Chen Yeh, 楊傳凱 Chuan-Kai Yang, 陳永耀 Yong-Yao Chen, 翁明昉 Ming-Fang Weng |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Publication year: | 2017 |
Academic year of graduation: | 105 |
Language: | English |
Pages: | 47 |
Keywords (Chinese): | 圖像分類, 深度學習, 多尺度金字塔, 馬爾科夫隨機場, 熵 |
Keywords (English): | Image classification, deep learning, multi-scale pyramid, Markov Random Field, entropy |
We propose a new approach for categorizing digital images of paintings by artist. Determining the authorship of a painting is challenging because different artists often depict common subjects, and the paintings of a single artist may not share a unique style. The proposed approach builds on convolutional neural networks (CNNs), a class of biologically inspired vision models that has recently demonstrated near-human performance on several visual recognition tasks. However, training a CNN model requires large-scale training data with a fixed input image size (e.g., 227 x 227). In this thesis, we construct a multi-layer pyramid from each image, which provides 21 times more features than using a single layer (i.e., the original image) alone. We train a CNN model for each layer, consider the relationships among neighboring patches in the layers that have fine sub-regions, and propose a new weighted fusion scheme that aggregates the three layers adaptively. To evaluate the proposed methods, we collect two painting image datasets: the first is categorized into 13 artists and the second into 23 artists. The experimental results show that the proposed methods significantly surpass the baseline method in terms of precision, recall, and F-score.
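The abstract does not spell out the pyramid layout or the fusion rule, so the following is only a minimal sketch under stated assumptions. It assumes a three-layer pyramid whose layers split the image into 1 x 1, 2 x 2, and 4 x 4 grids, a layout chosen here because it yields 1 + 4 + 16 = 21 patches, which is consistent with the "21 times" figure. It also assumes that the adaptive weights are derived from the entropy of each layer's class probabilities, which is suggested by the keyword list but not confirmed by the abstract. The function names `pyramid_boxes` and `entropy_weighted_fusion` are hypothetical, and the Markov Random Field step over neighboring patches is not shown.

```python
import math


def pyramid_boxes(width, height, levels=3):
    """Return (level, left, top, right, bottom) patch boxes for a grid pyramid.

    Assumption: level k splits the image into a 2**k x 2**k grid.
    """
    boxes = []
    for level in range(levels):
        n = 2 ** level
        for row in range(n):
            for col in range(n):
                left = col * width // n
                right = (col + 1) * width // n
                top = row * height // n
                bottom = (row + 1) * height // n
                boxes.append((level, left, top, right, bottom))
    return boxes


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_fusion(layer_probs):
    """Fuse per-layer class probabilities (needs at least two classes).

    Assumption: a layer with lower entropy (a more confident prediction)
    receives a larger weight. This is an illustrative rule, not
    necessarily the scheme used in the thesis.
    """
    num_classes = len(layer_probs[0])
    max_entropy = math.log(num_classes)
    # Confidence = 1 - normalized entropy; the epsilon avoids all-zero weights.
    confidences = [1.0 - entropy(p) / max_entropy + 1e-9 for p in layer_probs]
    total = sum(confidences)
    weights = [c / total for c in confidences]
    fused = [
        sum(w * p[c] for w, p in zip(weights, layer_probs))
        for c in range(num_classes)
    ]
    return fused, weights


# Usage: patch boxes for a 454 x 454 image, then fusion of three layer outputs.
boxes = pyramid_boxes(454, 454)
layer_probs = [
    [0.90, 0.05, 0.05],      # confident layer
    [1 / 3, 1 / 3, 1 / 3],   # uninformative layer
    [0.60, 0.30, 0.10],      # moderately confident layer
]
fused, weights = entropy_weighted_fusion(layer_probs)
predicted_artist = max(range(len(fused)), key=fused.__getitem__)
```

In this sketch each patch would be resized to the fixed CNN input size before classification, and the uninformative layer contributes almost nothing to the fused prediction because its normalized entropy is close to 1.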