
Graduate Student: John Jethro Virtusio
Thesis Title: Enabling Control over Strokes and Pattern Density of Style Transfer using Covariance Matrix
Advisor: Kai-Lung Hua (花凱龍)
Committee Members: Conrado Ruiz, Jing-Ming Guo (郭景明), Kuo-Liang Chung (鍾國亮), Yu-Chi Lai (賴祐吉)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 47
Keywords: Style Transfer, Neural Network, Deep Learning
Abstract: Despite the remarkable results and numerous advancements in neural style transfer, enabling artistic freedom through control over perceptual factors such as pattern density and stroke strength remains a challenging problem. A recent work on fast stylization networks offers some degree of control over pattern density by changing the resolution of the inputs; however, that solution requires a dedicated network architecture that can only accommodate a predefined set of resolutions. In this work, we propose a much simpler solution by addressing a fundamental limitation of neural style transfer models that use the Gram matrix as their style representation. More specifically, we replace the Gram matrix with a covariance matrix in order to better capture negative spatial correlations. We show that this simple modification allows the model to handle a wider range of input resolutions. We also show that selectively manipulating the covariance matrix allows us to control stroke strength independently of pattern density. Our method compares favorably against several state-of-the-art neural style transfer models. Moreover, since our approach focuses on manipulating and improving the Gram matrix, it does not depend on any particular network architecture. This means that all advancements in neural style transfer that use the Gram matrix as their style representation can directly benefit from our findings.
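To make the distinction the abstract draws more concrete, the following is a minimal NumPy sketch of how a Gram matrix and a covariance matrix can be computed from a single convolutional feature map: the only difference is the per-channel mean subtraction, which is what lets negative spatial correlations appear in the covariance. The (C, H, W) layout, the normalization, and the scale_diagonal helper are illustrative assumptions made for this sketch, not the implementation described in the thesis.

import numpy as np


def gram_matrix(feats):
    """Gram matrix of a convolutional feature map.

    feats: array of shape (C, H, W), i.e. C feature channels from one CNN layer.
    Returns a (C, C) matrix of channel-to-channel inner products.
    """
    c, h, w = feats.shape
    f = feats.reshape(c, h * w)            # flatten the spatial dimensions
    return (f @ f.T) / (h * w)             # normalized inner products


def covariance_matrix(feats):
    """Covariance matrix of a convolutional feature map.

    Identical to the Gram matrix except that each channel is centered by its
    spatial mean first, so channels that fire in complementary regions yield
    negative entries instead of being drowned out by large positive means.
    """
    c, h, w = feats.shape
    f = feats.reshape(c, h * w)
    f = f - f.mean(axis=1, keepdims=True)  # per-channel mean subtraction
    return (f @ f.T) / (h * w)


def scale_diagonal(cov, alpha):
    """Hypothetical example of a selective manipulation: rescale the
    per-channel variances (the diagonal) while leaving the cross-channel
    correlations untouched. The thesis's actual masking scheme may differ."""
    diag = np.diag(np.diag(cov))
    return (cov - diag) + alpha * diag


if __name__ == "__main__":
    # Simulated post-ReLU activations: non-negative, like real VGG features.
    feats = np.maximum(np.random.randn(64, 32, 32), 0.0)
    print("Gram min entry:      ", gram_matrix(feats).min())        # >= 0 by construction
    print("Covariance min entry:", covariance_matrix(feats).min())  # negative correlations survive

For non-negative (post-ReLU) activations every Gram entry is a sum of non-negative products, so negatively correlated channel pairs cannot be distinguished from uncorrelated ones; after centering, those pairs show up as negative covariance entries, which is the property the abstract refers to.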



Contents
  Abstract
  Acknowledgements
  Contents
  List of Figures
  List of Tables
  1 Introduction
  2 Review of Related Literature
    2.0.1 Style Transfer
    2.0.2 Controlling Perceptual Factors
    2.0.3 Failure Cases of Neural Style Transfer
    2.0.4 Style Feature Scaling
  3 Method
    3.0.1 Overview
    3.0.2 Extracting the Content of an Image
    3.0.3 Extracting the Style of an Image
    3.0.4 Covariance Matrix vs Gram Matrix
    3.0.5 Reducing Pattern Noise
    3.0.6 Masking the Covariance to Control Stroke Strength
    3.0.7 Controlling Pattern Density
  4 Experimental Results
    4.0.1 Implementation Details
    4.0.2 Ablation Study
    4.0.3 Experiments with Stroke Strength
    4.0.4 Experiments with Pattern Density
    4.0.5 Comparison to Other Models
  5 User Study
  6 Conclusions
  7 Supplemental Results
  References

