Student: Jilyan Bianca Sy Dy
Thesis Title: MCGAN: Mask Controlled Generative Adversarial Network for Image Retargeting
Advisor: Kai-Lung Hua (花凱龍)
Committee: Kai-Lung Hua (花凱龍), Neng-Hao Yu (余能豪), Yan-Fu Kuo (郭彥甫), Yung-Yao Chen (陳永耀)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Publication Year: 2021
Academic Year: 109
Language: English
Pages: 35
Keywords: Image Retargeting, Conditional GAN, Controllable GAN
Generative Adversarial Networks (GANs) can be trained to learn the internal distribution of an image. Once learned, the model can generate new images of varying aspect ratios while maintaining the image's internal distribution and completeness, even when particular objects are added or removed. However, because of their design, existing models cannot understand an image's semantics and are incapable of distinguishing different objects. This lack of semantic understanding tends to produce unnatural objects (e.g., a person with two heads). Since the model's design is not equipped to learn semantics, we choose to address this problem through user intervention. Our method allows the user to generate an image at a desired aspect ratio while masking the objects they want to preserve. Masking an object prevents it from being distorted, and also lets the user remove, relocate, or replicate the object from the input image.
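The mask-based control described in the abstract can be illustrated with a toy sketch. This is not the thesis's implementation: the GAN generator is approximated here by naive column resampling of the background, and the user's binary mask is used to paste the selected object back undistorted at a proportional position in the retargeted output. The function name and its behavior are illustrative assumptions only.

```python
import numpy as np

def retarget_with_mask(image, mask, new_width):
    """Toy stand-in for mask-controlled retargeting: resample the
    background to `new_width` (a real generator would synthesize it),
    then paste the masked object back without distortion."""
    w = image.shape[1]
    # Resample columns to the target width (stand-in for GAN synthesis).
    cols = np.linspace(0, w - 1, new_width).round().astype(int)
    out = image[:, cols].copy()
    # Locate the masked object in the source image.
    _, xs = np.nonzero(mask)
    if xs.size:
        x0, x1 = xs.min(), xs.max() + 1
        obj_w = x1 - x0
        # Place the object at a proportional horizontal position,
        # clamped so it stays inside the output.
        new_x0 = max(0, min(int(x0 * new_width / w), new_width - obj_w))
        obj = mask[:, x0:x1].astype(bool)
        # Copy the object's pixels through unchanged (preserved region).
        out[:, new_x0:new_x0 + obj_w][obj] = image[:, x0:x1][obj]
    return out
```

Relocating or removing the object follows the same pattern: instead of the proportional position, the user supplies the target offset (or skips the paste step entirely).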