
Graduate Student: Yu-Chieh Chen (陳昱捷)
Thesis Title: 3D Object Completion via Class-Conditional Generative Adversarial Network (基於分類條件式生成對抗網路之三維物件修補技術)
Advisor: Kai-Lung Hua (花凱龍)
Committee Members: Shu-Yuan Chen (陳淑媛), Chuan-Kai Yang (楊傳凱), Jiann-Jong Chen (陳建中), Shanq-Jang Ruan (阮聖彰)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of Publication: 2019
Graduation Academic Year: 107
Language: Chinese
Number of Pages: 33
Chinese Keywords: Object completion, object reconstruction, generative adversarial network, object classification
Foreign Keywords: Object completion, Object reconstruction, Generative adversarial network, Object classification
    This thesis completes damaged 3D objects by combining a variational autoencoder with a triple generative adversarial network. To preserve the structural features of the generated objects, we adopt an encoder network to learn the relation between the latent vector space and the space of real objects. The triple generative adversarial network is an adversarial network composed of a generator, a discriminator, and a classifier. The generator is responsible for generating 3D objects, the classifier performs classification based on the conditional distribution between objects and labels, and the discriminator focuses on identifying whether each object-label pair is real. In practice, when labels are limited or expensive to obtain, we use semi-supervised learning. To predict the partially missing labels, we pre-train a classifier on incomplete objects; this classifier predicts labels from a small number of known labels and their corresponding incomplete objects. When training the triple generative adversarial network, we concatenate the input incomplete object with its predicted label. For the generative network, training over many classes of 3D objects and labels is comparatively difficult. After training, our method can repair incomplete objects and simultaneously classify objects with unknown labels. Experimental results confirm that, compared with other methods, our approach produces 3D objects with high similarity to the ground truth and performs well on 3D object classification.
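    The label-prediction and conditioning step described above can be summarized in a short sketch. This is a minimal illustration under assumed names and tensor shapes (the function condition_on_predicted_label, a 5D voxel tensor, and a classifier that returns logits are all hypothetical); it is not the thesis code, only the idea of concatenating an incomplete voxel grid with the label predicted by a pre-trained classifier before passing it to the encoder/generator.

```python
import torch
import torch.nn.functional as F

def condition_on_predicted_label(partial_voxels, classifier, num_classes):
    """partial_voxels: (B, 1, D, H, W) occupancy grid with missing regions."""
    # Use the pre-trained classifier to predict a class for each incomplete object.
    with torch.no_grad():
        logits = classifier(partial_voxels)            # (B, num_classes)
        labels = logits.argmax(dim=1)                  # predicted class indices
    # Broadcast the one-hot label over the voxel grid and concatenate it as
    # extra input channels, so the generator sees both shape and class.
    one_hot = F.one_hot(labels, num_classes).float()   # (B, num_classes)
    b, _, d, h, w = partial_voxels.shape
    label_volume = one_hot.view(b, num_classes, 1, 1, 1).expand(-1, -1, d, h, w)
    return torch.cat([partial_voxels, label_volume], dim=1), labels
```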


    Many robotic tasks require accurate shape models in order to properly grasp or interact with objects. However, sensors often produce incomplete 3D models due to factors such as occlusion or sensor noise. To address this problem, we propose a semi-supervised method that can recover the complete shape of a broken or incomplete 3D object model. We formulate a hybrid of a 3D variational autoencoder (VAE) and a generative adversarial network (GAN) to recover the complete voxelized 3D object. Furthermore, we incorporate a separate classifier into the GAN framework, making it a three-player game instead of two; this helps stabilize GAN training and guides the shape-completion process to follow the object class labels. Our experiments show that our model produces 3D object reconstructions with high similarity to the ground truth and outperforms several baselines in both quantitative and qualitative evaluations.
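    To make the three-player setup concrete, the following training-step skeleton illustrates how a generator, discriminator, and classifier can be updated against each other. It is a sketch under stated assumptions: the encoder, generator, discriminator, and classifier modules, their call signatures, the sigmoid/one-hot outputs, and the loss weighting are hypothetical placeholders, not the implementation reported in the thesis.

```python
import torch
import torch.nn.functional as F

def train_step(encoder, generator, discriminator, classifier,
               partial, complete, labels,
               opt_g, opt_d, opt_c, kl_weight=1e-3):
    """One step of the three-player game (assumptions: encoder(partial) -> (mu, logvar);
    generator(z, labels) and discriminator(voxels, labels) have sigmoid outputs in [0, 1];
    opt_g updates both encoder and generator parameters)."""
    # Encoder + generator: VAE-style reconstruction of the complete shape.
    mu, logvar = encoder(partial)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
    fake = generator(z, labels)                               # completed voxel grid

    # Discriminator: real (object, label) pairs vs. generated pairs.
    d_real = discriminator(complete, labels)
    d_fake = discriminator(fake.detach(), labels)
    loss_d = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Classifier: predict the class label from the completed shape.
    loss_c = F.cross_entropy(classifier(fake.detach()), labels)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    # Generator/encoder: fool the discriminator, satisfy the classifier, and
    # match the ground-truth shape while regularizing the latent space.
    adv = F.binary_cross_entropy(discriminator(fake, labels), torch.ones_like(d_real))
    cls = F.cross_entropy(classifier(fake), labels)
    recon = F.binary_cross_entropy(fake, complete)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss_g = recon + adv + cls + kl_weight * kl
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_g.item(), loss_d.item(), loss_c.item()
```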

    Abstract (Chinese)
    Abstract
    Acknowledgement
    Contents
    List of Figures
    List of Tables
    1 Introduction
    1.1 Motivation and Goals
    1.2 Thesis contribution
    1.3 Thesis organization
    2 Related Work
    2.1 3D Shape Completion
    2.2 Generative models
    3 Method
    3.1 Background Concepts
    3.2 Problem Formulation
    3.3 Encoder and Generator
    3.4 Classifier
    3.5 Discriminator
    3.6 Dealing with Missing Labels
    3.7 Network Architecture
    4 Experimental Design
    4.1 Datasets
    4.2 Implementation Details
    5 Experimental Result
    6 Conclusion
    References
    Authorization Letter


    Full-text release date: 2024/06/18 (campus network)
    Full text not authorized for public access (off-campus network)
    Full text not authorized for public access (National Central Library: Taiwan NDLTD system)