
Author: Jilyan Bianca Sy Dy
Thesis title: MCGAN: Mask Controlled Generative Adversarial Network for Image Retargeting
Advisor: Kai-Lung Hua (花凱龍)
Oral defense committee: Kai-Lung Hua (花凱龍), Neng-Hao Yu (余能豪), Yan-Fu Kuo (郭彥甫), Yung-Yao Chen (陳永耀)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2021
Graduating academic year: 109
Language: English
Pages: 35
Keywords: Image Retargeting, Conditional GAN, Controllable GAN

A Generative Adversarial Network (GAN) can be trained to learn the internal distribution of a single image. Once trained, it can generate new images of varying aspect ratios while maintaining the image's internal distribution and completeness, even when particular objects are added or removed. However, due to their design, existing models cannot understand an image's semantics and are incapable of distinguishing different objects. This lack of semantic understanding tends to lead to the generation of unnatural objects (e.g., a person with two heads). Since the model's design is not equipped for learning semantics, we choose to address this problem through user intervention. Our method allows the user to generate an image at a desired aspect ratio while masking objects they want to preserve. Masking an object prevents it from being distorted, and also allows the user to remove, relocate, or replicate that object from the input image.


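To make the mask-based control concrete, here is a minimal sketch of the kind of overlay compositing the abstract describes. It is an illustration only, not the thesis's actual overlay function (Section 3.3): the function name `overlay`, its arguments, and the alpha-compositing formula are assumptions. It shows the core idea that pixels under the user's mask are copied verbatim from the input object, so the generator cannot distort them, while the chosen paste location gives relocation, a second call gives replication, and skipping the call gives removal.

```python
import numpy as np

def overlay(generated: np.ndarray, obj: np.ndarray, mask: np.ndarray,
            top: int, left: int) -> np.ndarray:
    """Composite a user-masked object onto a generated (retargeted) canvas.

    generated: (H, W, 3) generator output at the target aspect ratio.
    obj:       (h, w, 3) object crop taken from the input image.
    mask:      (h, w) binary mask, 1 where the object must be preserved.
    top, left: user-chosen location of the object on the canvas.
    """
    out = generated.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]
    m = mask[..., None].astype(generated.dtype)  # broadcast over RGB
    # Masked pixels come verbatim from the object crop; the rest keep
    # the generator's synthesis, so the object cannot be distorted.
    out[top:top + h, left:left + w] = m * obj + (1.0 - m) * region
    return out

# Example: retarget a 256x256 image to 256x384 (wider) while pinning a
# 64x64 object at a user-chosen spot. Random arrays stand in for the
# GAN output and the object crop.
rng = np.random.default_rng(0)
generated = rng.random((256, 384, 3))             # placeholder GAN output
obj = rng.random((64, 64, 3))                     # object crop from input
mask = np.zeros((64, 64)); mask[8:56, 8:56] = 1   # user-drawn binary mask
result = overlay(generated, obj, mask, top=96, left=160)
print(result.shape)  # (256, 384, 3)
```

In the thesis's multi-scale architecture (Sections 3.2 and 4.2.2) this compositing is presumably applied at each generator scale; the single-scale version above is only meant to convey why masked objects survive retargeting intact.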

Recommendation Letter . . . . . i
Approval Letter . . . . . ii
Abstract . . . . . iii
Acknowledgements . . . . . iv
Contents . . . . . v
List of Figures . . . . . vii
List of Tables . . . . . xiii
1 Introduction . . . . . 1
2 Related Literature . . . . . 5
3 Method . . . . . 8
  3.1 Overview . . . . . 8
  3.2 Multi-scale Architecture . . . . . 8
  3.3 Overlay Function . . . . . 12
  3.4 Training . . . . . 13
    3.4.1 Adversarial Loss . . . . . 14
    3.4.2 Reconstruction Loss . . . . . 14
    3.4.3 De-association Loss . . . . . 15
4 Results . . . . . 18
  4.1 Implementation Details . . . . . 18
  4.2 Experiments . . . . . 18
    4.2.1 Ablation . . . . . 19
    4.2.2 Overlay Function Scales . . . . . 20
    4.2.3 Reconstruction Loss . . . . . 22
    4.2.4 User-Defined Mask . . . . . 22
    4.2.5 Setting Object Location . . . . . 24
    4.2.6 Object Replication . . . . . 25
    4.2.7 Object Removal . . . . . 26
    4.2.8 Comparison . . . . . 28
    4.2.9 User Study . . . . . 29
5 Conclusions . . . . . 33
References . . . . . 34


Full text release date: 2026/01/31 (campus network)
Full text release date: 2026/01/31 (off-campus network)
Full text release date: 2026/01/31 (National Central Library: Taiwan NDLTD system)