
Graduate student: Shang-Yan Sun (孫上晏)
Thesis title: Procedural 3D Fish Model Generation with Trainable Shapes and Textures from Single Image (單張圖片訓練之程序化3D魚類模型生成系統)
Advisor: Wen-Kai Tai (戴文凱)
Committee members: 賴祐吉, 王學武
Degree: Master
Department: College of Electrical Engineering and Computer Science — Department of Computer Science and Information Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: English
Pages: 70
Keywords (Chinese): 魚類模型, 生成式模型, 程序化生成
Keywords (English): fish models, generative model, procedural generation
    In the film, animation, and game industries, creating diverse and detailed 3D fish models is essential for building immersive underwater scenes. However, manually creating richly detailed 3D fish models with traditional 3D modeling tools is time-consuming and laborious, and it demands extensive expertise. Existing approaches, such as anyFish 2.0 and DreamFusion, suffer from complex manual operation and inconsistency across generated images. This thesis proposes a Fish Model Generator that produces 3D fish models parametrically to address these limitations.
    Our system integrates a 2D diffusion model to generate a lateral-view image of the fish and uses this image to guide the training of the Fish Model Generator's parameters, enabling 3D model generation from a text prompt and greatly lowering the learning threshold of 3D modeling. To train the generator's parameters from a 2D image, we ensure that the Fish Model Generator is differentiable, adopt a differentiable renderer (DIB-R) as the rendering tool, and optimize the parameters with the Adam optimizer. By combining the Stable Diffusion model with the procedural Fish Model Generator, we reduce the manual work in the entire generation pipeline to segmenting and labeling the fish body and fin regions.
    Experimental results demonstrate that the system generates detailed and diverse 3D fish models, surpassing previous methods in training speed and visual quality. By reducing artists' workload and learning threshold, our method contributes to the depiction of underwater worlds in films, animation, and virtual environments.


    Creating diverse and realistic 3D fish models is essential for building immersive underwater environments in movies, animations, and game scenes. However, manually creating detailed, animated 3D fish models with traditional 3D modeling software is time-consuming. Existing approaches, such as anyFish 2.0 and DreamFusion, have limitations: the former requires substantial manual involvement, and the latter lacks coherence across generated images. In this thesis, we propose the Fish Model Generator, an interactive tool for procedurally generating 3D fish models, to address these limitations.

    Our proposed system achieves text-to-3D generation by integrating a 2D diffusion model that generates a lateral-view image of the fish, which serves as supervision for optimizing the parameters of the Fish Model Generator. To make the parameters trainable, the generator is fully differentiable and is integrated with the Differentiable Interpolation-Based Renderer (DIB-R); automatic differentiation combined with the Adam optimizer enables efficient parameter optimization. Through the integration of the Stable Diffusion model and the procedural Fish Model Generator, our system minimizes manual labor, requiring only a simple image segmentation step.
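The optimization loop described above — differentiably rendering the parametric model and fitting its parameters to a guidance image with Adam — can be sketched in PyTorch. This is a minimal illustration, not the thesis implementation: the hypothetical `toy_render` function stands in for the DIB-R mesh renderer, and a constant gray image stands in for the Stable Diffusion output.

```python
import torch

def toy_render(params: torch.Tensor) -> torch.Tensor:
    """Hypothetical differentiable 'renderer': broadcasts each of the
    four shape parameters into a 4x4 grayscale image. In the real
    system this is DIB-R rasterizing the procedural fish mesh."""
    return params.repeat_interleave(4).reshape(4, 4)

# Guidance image (in the thesis: a Stable Diffusion lateral-view image).
target = torch.full((4, 4), 0.75)

# Procedural shape parameters to be optimized.
params = torch.zeros(4, requires_grad=True)
optimizer = torch.optim.Adam([params], lr=0.1)

for _ in range(300):
    optimizer.zero_grad()
    rendered = toy_render(params)
    # Image-space loss between the rendered model and the guidance image.
    loss = torch.nn.functional.mse_loss(rendered, target)
    loss.backward()   # autograd differentiates through the renderer
    optimizer.step()
```

Because every stage from parameters to pixels is differentiable, the image-space loss alone drives the shape parameters toward the guidance image; no 3D supervision is needed, which is the key property the thesis relies on.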

    Experimental results demonstrate that our system can generate detailed and diverse 3D fish models, surpassing previous methods in optimization speed and visual quality. By reducing artists' workload and lowering their learning threshold, our approach improves the representation of underwater worlds in movies, animations, and virtual environments.

    Recommendation Letter
    Approval Letter
    Abstract in Chinese
    Abstract in English
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Background and Motivations
      1.2 Contributions
      1.3 Organization of This Thesis
    2 Related Work
      2.1 Procedural Modeling
      2.2 Text-to-Image Diffusion Model
      2.3 3D Model Synthesis
    3 Proposed Method
      3.1 Overview
      3.2 External Anatomy Features of Fish
      3.3 2D Image Generation
        3.3.1 Stable Diffusion
        3.3.2 Image Segmentation
      3.4 3D Model Generation
        3.4.1 Fish Model Generator
        3.4.2 Optimization
    4 Results and Discussion
      4.1 Experiment Environment
      4.2 Generation Results
      4.3 Discussion
    5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References
    Letter of Authority

    [1] L. U. M. Consortium, "External fish anatomy." Retrieved June 10, 2023, from https://lumcon.edu/wp-content/uploads/2020/05/Fish-Features_Final.pdf, 2020.
    [2] S. Ingley, M. R. Asl, C. Wu, R. Cui, M. Gadelhak, W. Li, J. Zhang, J. Simpson, C. Hash, T. Butkowski, T. Veen, J. Johnson, W. Yan, and G. Rosenthal, "anyFish 2.0: An open-source software platform to generate and share animated fish models to study behavior," SoftwareX, vol. 3, pp. 13–21, 2015.
    [3] B. Poole, A. Jain, J. T. Barron, and B. Mildenhall, "DreamFusion: Text-to-3D using 2D diffusion," in The Eleventh International Conference on Learning Representations, 2023.
    [4] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, "NeRF: Representing scenes as neural radiance fields for view synthesis," Communications of the ACM, vol. 65, no. 1, pp. 99–106, 2021.
    [5] R. Li, X. Ding, J. Yu, T.-Y. Gao, W.-T. Zheng, R. Wang, and H. Bao, "Procedural generation and real-time rendering of a marine ecosystem," Journal of Zhejiang University SCIENCE C, vol. 15, no. 7, pp. 514–524, 2014.
    [6] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10684–10695, 2022.
    [7] F. Han, S. Ye, M. He, M. Chai, and J. Liao, "Exemplar-based 3D portrait stylization," IEEE Transactions on Visualization and Computer Graphics, vol. 29, no. 2, pp. 1371–1383, 2021.
    [8] C. Nash, Y. Ganin, S. M. A. Eslami, and P. W. Battaglia, "PolyGen: An autoregressive generative model of 3D meshes," in Proceedings of the 37th International Conference on Machine Learning, 2020.
    [9] A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, J. Xiao, L. Yi, and F. Yu, "ShapeNet: An Information-Rich 3D Model Repository," Tech. Rep. arXiv:1512.03012 [cs.GR], Stanford University — Princeton University — Toyota Technological Institute at Chicago, 2015.
    [10] G. Metzer, E. Richardson, O. Patashnik, R. Giryes, and D. Cohen-Or, "Latent-NeRF for shape-guided generation of 3D shapes and textures," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12663–12673, 2023.
    [11] K. M. Jatavallabhula, E. Smith, J.-F. Lafleche, C. F. Tsang, A. Rozantsev, W. Chen, T. Xiang, R. Lebaredian, and S. Fidler, "Kaolin: A PyTorch library for accelerating 3D deep learning research." https://github.com/NVIDIAGameWorks/kaolin, 2019.
    [12] K. Crane, "Keenan's 3D model repository." Retrieved June 25, 2023, from https://www.cs.cmu.edu/~kmcrane/Projects/ModelRepository/, 2022.
