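Single-View 3D Model Reconstruction with Enhanced Details Using Deep Convolutional Neural Network in Function Space | National Taiwan University of Science and Technology Theses and Dissertations System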

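Graduate student:	余知寰 Chih-Huan Yu
Thesis title:	學習在函數空間中增強單視圖重建3D模型細節之研究 Single-View 3D Model Reconstruction with Enhanced Details Using Deep Convolutional Neural Network in Function Space
Advisor:	吳怡樂 Yi-Leh Wu
Oral defense committee:	陳建中 Jiann-Jone Chen, 唐政元 Cheng-Yuan Tang, 閻立剛 Li-Kang Yen
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication:	2020
Academic year of graduation:	108 (ROC calendar, 2019-2020)
Language:	English
Number of pages:	40
Keywords (Chinese):	Occupancy Networks, single-view 3D reconstruction
Keywords (English):	Occupancy Networks, 3D Model Reconstruction, CNN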

Learning-based methods for 3D reconstruction have recently become increasingly popular. Unlike traditional multi-view reconstruction algorithms, reconstructing a 3D model from only a single view is more challenging. The 3D representations in common use fall roughly into three types: point cloud-based, voxel-based, and mesh-based. In this thesis, we improve Occupancy Networks [1], a recent deep convolutional neural network architecture. We replace the network's encoder with EfficientNet and change the decoder to a DenseNet-style architecture. We run experiments on three categories of the ShapeNet [6] dataset: airplanes, cars, and chairs. The data are split into training, validation, and test sets. Experiments show that the proposed network outperforms the original Occupancy Networks in both visual and quantitative comparisons.
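To make the idea of reconstruction "in function space" concrete, here is a minimal sketch of an occupancy decoder with DenseNet-style connectivity: it maps a 3D point and an image feature vector to an occupancy probability, and each hidden layer receives the concatenation of all earlier outputs. This is an illustration under stated assumptions, not the architecture from the thesis: the class name `DenseOccupancyDecoder`, the layer sizes, the random untrained weights, and the grid query in `occupancy_grid` are all hypothetical, and the image feature would in practice come from an encoder such as EfficientNet.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply a fully connected layer to the vector x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


class DenseOccupancyDecoder:
    """Toy decoder: (3D point, image feature) -> occupancy probability."""

    def __init__(self, feature_dim, growth=8, num_layers=3, seed=0):
        rng = random.Random(seed)
        width = 3 + feature_dim  # input = point coordinates + image feature
        self.hidden = []
        for _ in range(num_layers):
            self.hidden.append(make_layer(width, growth, rng))
            width += growth  # dense connectivity: later layers see all earlier outputs
        self.head = make_layer(width, 1, rng)

    def __call__(self, point, feature):
        state = list(point) + list(feature)
        for layer in self.hidden:
            # Concatenate the new activations instead of overwriting the state.
            state = state + relu(linear(layer, state))
        logit = linear(self.head, state)[0]
        return 1.0 / (1.0 + math.exp(-logit))  # sigmoid, value in (0, 1)


def occupancy_grid(decoder, feature, resolution=4, threshold=0.5):
    """Query the decoder on a regular grid in [-0.5, 0.5]^3 and threshold it."""
    step = 1.0 / (resolution - 1)
    coords = [-0.5 + i * step for i in range(resolution)]
    return [[[decoder((x, y, z), feature) > threshold for z in coords]
             for y in coords] for x in coords]
```

Because the shape is represented as a function of continuous coordinates, it can be queried at any resolution. The fixed grid above is a simplified stand-in for that step; a mesh would normally be extracted from such occupancy values with an isosurface method such as marching cubes [9].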

Abstract (in Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work and Review
2.1 Image feature extraction
2.2 Different CNN network models
2.3 Introduction to Occupancy Networks
Chapter 3 Proposed Method
3.1 Structure of the encoder
3.2 Structure of the decoder
3.3 Generate
Chapter 4 Experiments
4.1 Datasets and Metrics
4.2 Comparison of ONet and Encoder replaced by EfficientNet
4.3 Comparison of ONet and Decoder replaced by DenseNet
4.4 Compare ONet with our proposed network
4.5 Reconstruction failures
Chapter 5 Conclusions and Future Work
References

[1] Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., & Geiger, A. “Occupancy networks: Learning 3D reconstruction in function space.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4460-4470), 2019.
[2] He, K., Zhang, X., Ren, S., & Sun, J. “Deep residual learning for image recognition.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778), 2016.
[3] Tan, M., & Le, Q. V. “EfficientNet: Rethinking model scaling for convolutional neural networks.” arXiv preprint arXiv:1905.11946, 2019.
[4] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. “Densely connected convolutional networks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4700-4708), 2017.
[5] De Vries, H., Strub, F., Mary, J., Larochelle, H., Pietquin, O., & Courville, A. C. “Modulating early visual processing by language.” In Advances in Neural Information Processing Systems (pp. 6594-6604), 2017.
[6] Chang, A. X., Funkhouser, T. A., Guibas, L. J., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Yi, L., & Yu, F. “ShapeNet: An information-rich 3D model repository.” arXiv preprint arXiv:1512.03012, 2015.
[7] Choy, C. B., Xu, D., Gwak, J., Chen, K., & Savarese, S. “3D-R2N2: A unified approach for single and multi-view 3D object reconstruction.” In European Conference on Computer Vision (pp. 628-644), Springer, Cham, 2016.
[8] Gerstner, T., & Pajarola, R. “Topology preserving and controlled topology simplifying multiresolution isosurface extraction.” (pp. 259-266), IEEE, 2000.
[9] Lorensen, W. E., & Cline, H. E. “Marching cubes: A high resolution 3D surface construction algorithm.” ACM SIGGRAPH Computer Graphics, 21(4), 163-169, 1987.
[10] Krizhevsky, A., Sutskever, I., & Hinton, G. E. “ImageNet classification with deep convolutional neural networks.” In Advances in Neural Information Processing Systems (pp. 1097-1105), 2012.
[11] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. “MobileNetV2: Inverted residuals and linear bottlenecks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520), 2018.
[12] Wen, C., Zhang, Y., Li, Z., & Fu, Y. “Pixel2Mesh++: Multi-view 3D mesh generation via deformation.” In Proceedings of the IEEE International Conference on Computer Vision (pp. 1042-1051), 2019.
[13] Yang, G., Cui, Y., Belongie, S., & Hariharan, B. “Learning single-view 3D reconstruction with limited pose supervision.” In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 86-101), 2018.
[14] Achlioptas, P., Diamanti, O., Mitliagkas, I., & Guibas, L. “Learning representations and generative models for 3D point clouds.” In International Conference on Machine Learning (pp. 40-49), 2018.
[15] Kanazawa, A., Tulsiani, S., Efros, A. A., & Malik, J. “Learning category-specific mesh reconstruction from image collections.” In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 371-386), 2018.
[16] Kato, H., & Harada, T. “Learning view priors for single-view 3D reconstruction.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 9778-9787), 2019.
[17] Liao, Y., Donne, S., & Geiger, A. “Deep marching cubes: Learning explicit surface representations.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2916-2925), 2018.
[18] Rezende, D. J., Eslami, S. A., Mohamed, S., Battaglia, P., Jaderberg, M., & Heess, N. “Unsupervised learning of 3D structure from images.” In Advances in Neural Information Processing Systems (pp. 4996-5004), 2016.
[19] Fan, H., Su, H., & Guibas, L. J. “A point set generation network for 3D object reconstruction from a single image.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 605-613), 2017.
[20] Ranjan, A., Bolkart, T., Sanyal, S., & Black, M. J. “Generating 3D faces using convolutional mesh autoencoders.” In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 704-720), 2018.
[21] Niu, C., Li, J., & Xu, K. “Im2Struct: Recovering 3D shape structure from a single RGB image.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4521-4529), 2018.
