
Author: 林俊明 (Jun-Ming Lin)
Title: 基於多模態生成對抗網路的深度影像超解析度方法 (Depth Map Upsampling via Multimodal Generative Adversarial Network)
Advisor: 花凱龍 (Kai-Lung Hua)
Committee: 花凱龍 (Kai-Lung Hua), 郭景明 (Jing-Ming Guo), 陸敬互 (Ching-Hu Lu), 鍾國亮 (Kuo-Liang Chung)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Publication Year: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: English
Pages: 42
Chinese Keywords: 深度影像超解析度, 範圍感測器, 生成對抗網路
English Keywords: Depth map super-resolution, Range sensor, Generative adversarial network
Abstract: The task of depth map super-resolution (SR) is to predict a high-resolution (HR) depth image from a low-resolution (LR) depth image. In practice, the predicted HR depth image often suffers from a loss of sharpness at depth boundaries. In this thesis, we propose an architecture based on the generative adversarial network (GAN) to solve this problem. The proposed network learns the relationships in spatial, range, and local gradient information between LR and HR training images, which consist of depth maps and their corresponding gray-scale scene images. Because a depth map and its corresponding gray-scale scene image share similar boundaries, learning the relationship between the LR and HR gray-scale scene images further improves depth map SR performance. With the mapping function learned by our proposed GAN architecture, we can generate SR depth images with correct high- and low-frequency information from an arbitrary LR test depth image. Finally, quantitative and qualitative experimental results show that our method achieves state-of-the-art performance on depth map super-resolution.
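The abstract outlines the core mechanism: a generator upsamples an LR depth map while the corresponding gray-scale scene image, whose edges co-occur with depth boundaries, guides the reconstruction, and a discriminator penalizes implausible HR depth. Below is a minimal PyTorch sketch of that idea only, not the thesis's actual network: the branch widths, the 4x scale factor, the residual formulation, the PatchGAN-style discriminator, the L1 reconstruction term, the Adam learning rates, and the weight `lambda_adv` are all illustrative assumptions.

```python
# Minimal sketch (hypothetical configuration) of a multimodal GAN for depth SR:
# the generator fuses an upsampled LR depth map with an HR gray-scale guidance
# image; the discriminator judges local realism of depth patches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    def __init__(self, scale=4):  # 4x upsampling is an assumed scale factor
        super().__init__()
        self.scale = scale
        # Two input branches: bicubically upsampled LR depth, HR gray-scale guide.
        self.depth_branch = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.gray_branch = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        # Fusion trunk maps concatenated features to a residual depth image.
        self.fuse = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, lr_depth, hr_gray):
        # Upsample depth to the target resolution, then predict a residual so
        # the network focuses on missing high-frequency (boundary) detail.
        up = F.interpolate(lr_depth, scale_factor=self.scale, mode="bicubic",
                           align_corners=False)
        feat = torch.cat([self.depth_branch(up), self.gray_branch(hr_gray)], dim=1)
        return up + self.fuse(feat)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # PatchGAN-style critic: outputs per-patch real/fake logits.
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, depth):
        return self.net(depth)

def train_step(G, D, opt_g, opt_d, lr_depth, hr_gray, hr_depth, lambda_adv=1e-3):
    bce = nn.BCEWithLogitsLoss()
    # Discriminator update: real HR depth vs. generated depth (detached).
    fake = G(lr_depth, hr_gray).detach()
    d_real, d_fake = D(hr_depth), D(fake)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator update: L1 reconstruction plus a small adversarial term.
    sr = G(lr_depth, hr_gray)
    d_sr = D(sr)
    loss_g = F.l1_loss(sr, hr_depth) + lambda_adv * bce(d_sr, torch.ones_like(d_sr))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

# Smoke test with random tensors (B=2, 32x32 LR depth, 128x128 HR targets).
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
lr_depth = torch.rand(2, 1, 32, 32)
hr_gray, hr_depth = torch.rand(2, 1, 128, 128), torch.rand(2, 1, 128, 128)
print(train_step(G, D, opt_g, opt_d, lr_depth, hr_gray, hr_depth))
```

In this sketch the gray-scale image enters only as HR guidance and the adversarial term is kept small relative to the L1 term; the thesis's multimodal design and loss weighting may differ.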

Table of Contents:
Abstract (Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Works
3 Method
3.1 Minibatch of Depth Maps and Corresponding Scene Images
3.2 Multimodal Generative Adversarial Network Architecture
3.3 Loss Function
4 Experimental Design
4.1 Dataset
4.2 Training Details
5 Experimental Result
6 Conclusions
References

