
Author: Yung-Yuan Tseng (曾永源)
Thesis Title: Eye-Fixation-Based Multi-Patch Aggregation Image Aesthetics Score Assessment System (目光注視為基礎的多區塊整合美感評分系統)
Advisor: Tien-Ruey Hsiang (項天瑞)
Committee Members: Hsing-Kuo Pao (鮑興國), Kai-Lung Hua (花凱龍)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Publication Year: 2020
Graduation Academic Year: 108 (2019-2020)
Language: English
Pages: 41
Keywords: Deep Learning, Image Aesthetic Assessment, Eye Fixation, Saliency Detection
Chinese Abstract (translated):

    With the growth of social networking sites, how to define a good photograph has become an interesting question. Everyone has their own sense of aesthetics, and evaluations of the same photograph differ accordingly. Building a fair scoring system helps address this problem and provides a more objective way to validate aesthetic quality. With the development of deep learning, the large-scale aesthetics dataset AVA provides the public's ratings of images collected from social networking sites. From this dataset, people's preferences and judgment criteria for image aesthetics can be analyzed, and an aesthetic assessment system that matches popular taste can be built. Beyond the machine's perspective, we further propose a human-like eye-fixation method, so that when a machine analyzes aesthetics from data it also learns to analyze them from the human viewing perspective; this can serve as a reference when machine learning later acquires abstract features from a human point of view. Our system is based on human eye fixation: it selects the regions that most attract the human gaze as local patches and uses the entire image as the overall layout, both of which are analyzed by the aesthetic model. Trained with multi-patch aggregation, the system improves the linear correlation coefficient, MSE, and other measures, and also improves the accuracy of aesthetic score prediction.


Abstract:

    As social media sites continue to advance, the aesthetic judgment of a high-quality photograph is an intriguing subject. Because individuals have their own aesthetics, evaluations of the same photographs may differ. Establishing an unbiased rating system can help solve this issue and provide a more objective perspective to verify the aesthetics of an image. Using the development of deep learning, the large-scale Aesthetic Visual Analysis (AVA) dataset provides the public rating results of images on social media sites. Using this dataset, viewer preferences and judgment criteria regarding the aesthetics of an image can be analyzed, and an aesthetic assessment system can be further established. In addition to a machine perspective, we further propose a human-like eye fixation method that enables machines to further learn from a human perspective when analyzing aesthetics from data, which can be used as a reference for future machine learning of abstract features from a human perspective. Based on the eye fixation of humans, our system selects the areas that attract the most attention of viewers as patches and uses the overall image as the overall image layout, which are then analyzed by the aesthetic model. A multi-patch aggregated training method is then applied to improve the performance of the linear correlation coefficient and mean square error and increase the accuracy of the aesthetic score prediction.
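
    The pipeline the abstract describes reduces to two steps: use a predicted eye-fixation (saliency) map to crop the most-attended regions as local patches, then score those patches together with the resized whole image and aggregate the scores. The following PyTorch sketch illustrates that flow under stated assumptions: the names (select_patches, MultiPatchAggregator), the four-patch budget, the greedy overlap-suppression rule, and the toy backbone are all hypothetical stand-ins for illustration, not the thesis's actual models.

# Minimal sketch of a fixation-guided multi-patch aesthetic scorer.
# All names and hyper-parameters below are illustrative assumptions,
# not the thesis's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def select_patches(image, fixation_map, num_patches=4, patch_size=224, stride=32):
    """Crop the num_patches windows with the highest mean fixation score.

    image:        float tensor of shape (3, H, W)
    fixation_map: float tensor of shape (H, W), e.g. from a saliency network
    """
    # Mean fixation over every candidate window, computed via average pooling.
    density = F.avg_pool2d(fixation_map[None, None], patch_size, stride=stride)
    _, _, rows, cols = density.shape
    half = patch_size // (2 * stride)  # suppression radius in grid cells
    patches = []
    for _ in range(num_patches):
        idx = int(density.flatten().argmax())
        r, c = divmod(idx, cols)
        y, x = r * stride, c * stride
        patches.append(image[:, y:y + patch_size, x:x + patch_size])
        # Suppress the neighborhood of this pick so later patches overlap less.
        density[0, 0, max(0, r - half):r + half + 1,
                      max(0, c - half):c + half + 1] = float("-inf")
    return torch.stack(patches)  # (num_patches, 3, patch_size, patch_size)


class MultiPatchAggregator(nn.Module):
    """Scores each fixation patch plus the whole image and averages the scores."""

    def __init__(self, feat_dim=64):
        super().__init__()
        # Stand-in for a real CNN backbone such as MobileNet or Inception.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(feat_dim, 1)

    def forward(self, patches, whole_image):
        # Treat the resized whole image as one more view carrying the layout.
        views = torch.cat([patches, whole_image[None]], dim=0)
        scores = self.head(self.backbone(views))  # one score per view
        return scores.mean()  # aggregated aesthetic score


# Example: one 512x512 image with a random stand-in fixation map.
image = torch.rand(3, 512, 512)
fixation_map = torch.rand(512, 512)
patches = select_patches(image, fixation_map)
whole = F.interpolate(image[None], size=224, mode="bilinear")[0]
score = MultiPatchAggregator()(patches, whole)

    A training loop would, per the metrics named in the abstract, regress the aggregated score against the mean AVA rating with a mean-squared-error loss and report the linear (Pearson) correlation coefficient between predicted and ground-truth scores on held-out images.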

Table of Contents:

    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
      2.1 Aesthetic Assessment
      2.2 Human-Like Visual Attention Prediction
    3 Proposed Method
      3.1 Eye-Fixation Model
        3.1.1 Eye-Fixation Model
        3.1.2 Patch Selection
      3.2 Aesthetic Assessment
    4 Experiment
      4.1 Experiment Settings
        4.1.1 Dataset
        4.1.2 Training
      4.2 Experiment Results and Analysis
        4.2.1 Experiment Results
        4.2.2 Experiment Analysis
    5 Conclusion
    References


    Full-Text Release Date: 2025/08/27 (campus network)
    Full-Text Release Date: not authorized for public access (off-campus network)
    Full-Text Release Date: not authorized for public access (National Central Library: Taiwan NDLTD system)