Graduate Student: 邱嘉興 Chia-Hsing Chiu
Thesis Title: 使用者參與之基於微分子空間的高維度隱空間探索 (Human-in-the-Loop Differential Subspace Search in High-Dimensional Latent Space)
Advisor: 賴祐吉 Yu-Chi Lai
Committee Members: 賴祐吉 Yu-Chi Lai, 莊永裕 Yung-Yu Chuang, 鄭文皇 Wen-Huang Cheng, 花凱龍 Kai-Lung Hua, 林士勛 Shih-Syun Lin
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2020
Graduation Academic Year: 108
Language: Chinese
Pages: 66
Keywords (Chinese): 使用者參與最佳化, 降維, 生成模型
Keywords (English): Human-in-the-loop optimization, Dimensionality reduction, Generative models
Abstract (Chinese, translated): Generative models based on deep neural networks typically have latent spaces of several hundred dimensions or more, so it is usually difficult for users to explore such a high-dimensional space by adjusting the generated data. To address this problem, this thesis proposes the differential subspace, an exploration approach that is not restricted to a particular data type or application. The idea is to provide a low-dimensional subspace with sufficient data variation for the user to search in. The subspace is constructed via local differential analysis of the generative model: we first apply singular value decomposition (SVD) to the Jacobian matrix of the generative model, and then construct a subspace whose basis consists of several stochastically selected singular vectors, where the selection uses importance sampling based on the magnitudes of the corresponding singular values to ensure ergodicity. The user then interactively finds an optimal candidate in this subspace, after which the above steps are restarted at the selected location; this process is repeated until the user is satisfied with the search result. Experimental results based on numerical simulation show that the proposed method optimizes synthetic black-box objective functions better than alternative methods, and a user study demonstrates that the method effectively helps users explore the high-dimensional latent spaces of complex generative models.
Abstract (English): Generative models based on deep neural networks often have a high-dimensional latent space, sometimes with a few hundred dimensions or more. Directly exploring such a high-dimensional space via user manipulation can thus be intractably hard. We propose differential subspace search to allow efficient iterative user exploration in such a space, without relying on domain- or data-specific assumptions. The key is to let the user perform search in a low-dimensional subspace, such that a small change in the subspace provides enough change in the resulting data. We construct such subspaces based on the local differential analysis of the generative model. Specifically, we first apply singular value decomposition to the Jacobian of the generative model, and form a subspace spanned by a few singular vectors stochastically selected based on their corresponding singular values using importance sampling, to maintain ergodicity. Then, the user finds a candidate in this subspace, which is in turn updated at the new candidate location. This process is repeated until no further improvement can be made. Numerical simulations show that our method can better optimize synthetic black-box objective functions than the alternatives. Furthermore, we conducted a user study using complex generative models, and the results show that our method enables more efficient exploration of high-dimensional latent spaces than the alternatives.
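The subspace construction described in the abstract (SVD of the generator's Jacobian, followed by importance sampling of singular vectors) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the thesis implementation: the black-box `generator`, the finite-difference Jacobian approximation, and all function names are assumptions made for this example.

```python
import numpy as np

def differential_subspace(generator, z, subspace_dim=2, eps=1e-4, rng=None):
    """Construct a low-dimensional search subspace around latent code z.

    generator: black-box map from a latent vector (d,) to a data vector (m,).
    Returns a (d, subspace_dim) matrix whose columns are importance-sampled
    right singular vectors of the generator's Jacobian at z.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = z.shape[0]
    # Finite-difference approximation of the Jacobian J (m x d).
    base = generator(z)
    J = np.stack([(generator(z + eps * np.eye(d)[i]) - base) / eps
                  for i in range(d)], axis=1)
    # Right singular vectors (rows of Vt) are latent-space directions,
    # ordered by how strongly they change the generated data.
    _, s, Vt = np.linalg.svd(J, full_matrices=False)
    # Importance sampling proportional to singular values; every direction
    # keeps a nonzero selection probability, which preserves ergodicity.
    p = s / s.sum()
    idx = rng.choice(len(s), size=subspace_dim, replace=False, p=p)
    return Vt[idx].T  # columns form the subspace basis

def search_step(z, user_pick, basis):
    """Map the user's low-dimensional pick back into the latent space."""
    return z + basis @ user_pick
```

In the full loop, the user would pick a point in the 2-D subspace, `search_step` would map it back to a new latent code, and `differential_subspace` would be re-run at that code until the user is satisfied.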