Graduate student: | 許哲源 Tse-Yuan-Hsu |
---|---|
Thesis title: | 一個基於類神經網路與群眾外包的照片美學評分系統 A Photo Aesthetic Scoring System Based on Artificial Neural Networks and Crowdsourcing |
Advisor: | 范欽雄 Chin-Shyurng Fahn |
Oral defense committee: | 戴碧如 Bi-Ru Dai, 王榮華 Jung-Hua Wang, 林啟芳 Chi-Fang Lin |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2017 |
Graduation academic year: | 105 (ROC calendar) |
Language: | English |
Pages: | 38 |
Keywords (Chinese): | 類神經網路、美學評分、群眾外包 |
Keywords (English): | Artificial neural network, Aesthetic scoring, Crowdsourcing |
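In recent years, with the rapid development of artificial intelligence, scientists have tried to combine many fields with it. Aesthetics, however, is a subjective discipline, so judging aesthetics with artificial intelligence is challenging. Moreover, aesthetics is not binary: a photo cannot simply be labeled good or bad, so quantifying aesthetics is a meaningful and difficult problem.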
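This thesis proposes an aesthetic scoring system for four types of photos: people, animals, architecture, and nature. We collected photos with public ratings from photo.net and extracted eight kinds of features from each photo, including sharpness, saturation, and color. After feature extraction, we trained an artificial neural network on the photo features and the public ratings. Once the aesthetic scoring model was trained, we collected 400 unrated photos from Flickr and obtained a score for each of them through crowdsourcing. Finally, we had the aesthetic scoring model score these 400 photos.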
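As a rough illustration of the feature extraction step, the sketch below computes four simple global features from an RGB pixel grid: brightness, saturation, contrast, and sharpness. The abstract does not give the thesis's exact feature definitions, so everything here is an assumption: the function name `extract_features`, the use of mean HSV value and saturation, RMS contrast on BT.601 luminance, and a mean absolute Laplacian as a sharpness proxy.

```python
import colorsys


def extract_features(pixels):
    """Return simple global features for an image.

    pixels: list of rows, each row a list of (r, g, b) tuples in 0..255.
    All definitions below are illustrative assumptions, not the thesis's.
    """
    height, width = len(pixels), len(pixels[0])
    n = height * width
    sat_sum = val_sum = 0.0
    luma = []  # per-pixel luminance grid, reused for contrast and sharpness
    for row in pixels:
        luma_row = []
        for r, g, b in row:
            _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            sat_sum += s
            val_sum += v
            # ITU-R BT.601 luma weights, scaled to 0..1
            luma_row.append((0.299 * r + 0.587 * g + 0.114 * b) / 255.0)
        luma.append(luma_row)

    mean_luma = sum(sum(row) for row in luma) / n
    # RMS contrast: standard deviation of luminance
    contrast = (sum((y - mean_luma) ** 2 for row in luma for y in row) / n) ** 0.5

    # Sharpness proxy: mean absolute 4-neighbour Laplacian over interior pixels
    lap_sum, count = 0.0, 0
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            lap = (luma[i - 1][j] + luma[i + 1][j]
                   + luma[i][j - 1] + luma[i][j + 1] - 4.0 * luma[i][j])
            lap_sum += abs(lap)
            count += 1
    sharpness = lap_sum / count if count else 0.0

    return {
        "brightness": val_sum / n,
        "saturation": sat_sum / n,
        "contrast": contrast,
        "sharpness": sharpness,
    }
```

A flat gray image yields zero saturation, contrast, and sharpness under these definitions, while a black-and-white checkerboard yields high contrast and sharpness, which matches the intuition behind the features.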
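The experiments measure the error between the system's aesthetic scores and the public ratings, using the Residual sum of squares error (Res). The results show a Res of 0.258 for the people category, 0.296 for animals, 0.310 for architecture, and 0.283 for natural scenes, indicating that the system's scores are very close to the public ratings.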
With the rapid development of artificial intelligence, scientists are trying to combine it with many fields, such as medicine, agriculture, and genetic engineering. Aesthetics, however, is subjective, so judging it with artificial intelligence is challenging. Nor can aesthetics be judged in a binary way, as simply good or bad, which makes quantifying it a difficult task.
In this thesis, we propose an aesthetic scoring system for four kinds of photos: people, buildings, animals, and nature. We collected photos with public rating scores from Photo.net and extracted features from them, including color, contrast, brightness, horizontal ratio, saturation, achromatic ratio, sharpness, harmony, and saliency. We then trained an artificial neural network on these features and the public scores. With the resulting aesthetic scoring model, we collected 400 images without any public rating score from Flickr and gathered ratings for them through crowdsourcing, so that every one of the 400 images has a corresponding score. Finally, we predicted the aesthetic scores of those images with our model.
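The training step described above, mapping per-photo feature vectors to public scores, can be sketched as a small regression network. This is a minimal illustration under stated assumptions, not the thesis's model: the abstract does not specify the architecture, optimizer, or score scale, so the single tanh hidden layer, plain stochastic gradient descent, the function name `train_mlp`, and the toy data are all hypothetical.

```python
import math
import random


def train_mlp(X, y, hidden=4, epochs=2000, lr=0.1, seed=0):
    """Fit a one-hidden-layer tanh network to (features, score) pairs.

    Uses plain per-sample SGD on squared error (an assumption; the
    thesis's actual optimizer and layer sizes are not stated here).
    Returns a predict(x) function.
    """
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(W1[k], x)) + b1[k])
             for k in range(hidden)]
        return h, sum(w * hk for w, hk in zip(W2, h)) + b2

    for _ in range(epochs):
        for x, target in zip(X, y):
            h, pred = forward(x)
            err = pred - target  # gradient of 0.5 * err**2 w.r.t. pred
            # Hidden-layer gradients, computed before W2 is updated
            grad_h = [err * W2[k] * (1.0 - h[k] ** 2) for k in range(hidden)]
            for k in range(hidden):
                W2[k] -= lr * err * h[k]
                for i in range(n_in):
                    W1[k][i] -= lr * grad_h[k] * x[i]
                b1[k] -= lr * grad_h[k]
            b2 -= lr * err

    def predict(x):
        return forward(x)[1]

    return predict


# Hypothetical toy data: two features per photo, scores scaled to [0, 1].
features = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
scores = [0.2, 0.5, 0.6, 0.9]
predict = train_mlp(features, scores)
print([round(predict(x), 2) for x in features])
```

In practice the inputs would be the extracted photo features and the targets the Photo.net ratings; a held-out set such as the 400 crowdsourced Flickr images would then be scored with `predict`.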
In the experiments, we use the Residual sum of squares error (Res) to measure the difference between our aesthetic scoring results and the public rating scores. The Res is 0.258 for the people dataset, 0.296 for the animal dataset, 0.301 for the building dataset, and 0.283 for the nature dataset. These results indicate that our system's predictions are close to public opinion.
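For reference, a residual sum of squares compares predicted and observed scores by summing their squared differences. The sketch below uses that standard definition; the abstract does not state the score scale or whether the reported Res values are normalized by the number of images, so the function name and the sample scores are hypothetical.

```python
def residual_sum_of_squares(predicted, observed):
    """Sum of squared differences between predicted and observed scores."""
    if len(predicted) != len(observed):
        raise ValueError("score lists must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Hypothetical scores for three photos, not data from the thesis.
model_scores = [0.6, 0.4, 0.8]
crowd_scores = [0.5, 0.5, 0.7]
print(residual_sum_of_squares(model_scores, crowd_scores))
```

A value of zero means the model reproduces the crowd scores exactly; larger values mean larger disagreement.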