
Graduate Student: Yi-Sin Jiang (江宜欣)
Thesis Title: Depth Completion using Deep Residual Networks with Sparse Convolutions (以殘差網路結合稀疏卷積網路進行深度補全)
Advisor: Yie-Tarng Chen (陳郁堂)
Committee Members: Yie-Tarng Chen (陳郁堂), Wen-Hsien Fang (方文賢), Hsing-Lung Chen (陳省隆), Ming-Bo Lin (林銘波), Chang-Hong Lin (林昌鴻)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Graduation Academic Year: 107
Language: Chinese
Pages: 36
Chinese Keywords: depth completion, deep learning, image processing
English Keywords: Depth Completion, Sparse-to-Dense
Abstract (Chinese): In recent years, RGB images and Lidar have been used in the perception systems of self-driving cars, since Lidar provides precise, long-range measurements. However, current Lidar sensors deliver only sparse measurements, especially for distant objects. To sense depth more precisely, this thesis studies the depth completion problem: estimating a dense depth map from sparse measurements. Depth completion is challenging because of the irregular patterns in the sparse depth input and the need to combine Lidar and image information. Previous depth completion methods suffer from an "edge blur" problem in the estimated depth map. To address this issue, we investigate a deep neural network architecture that combines sparse convolutions with residual networks. To further improve performance, we adapt a two-branch guidance framework: one branch takes RGB-D as input, the other takes Lidar only, and the two branches are fused to estimate the final completed depth map. To evaluate the proposed method, we conduct experiments on the KITTI dataset. Compared with other methods, the proposed approach achieves faster inference while maintaining competitive accuracy in terms of root mean squared error and requiring less memory.
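The sparse-convolution building block described above can be made concrete with a short sketch. The following PyTorch code is a minimal illustration, assuming the sparsity-invariant convolution of Uhrig et al. (2017) as the base operation and a simple two-layer residual wrapper; the class names and block layout are illustrative assumptions and may differ from the thesis's Residual1/Residual2/Bottleneck variants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseConv(nn.Module):
    """Sparsity-invariant convolution: convolve only valid pixels and
    normalize by the number of valid pixels in each kernel window
    (in the style of Uhrig et al., "Sparsity Invariant CNNs", 2017)."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=pad, bias=False)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        # Fixed all-ones kernel that counts valid pixels per window.
        self.register_buffer(
            "ones", torch.ones(1, 1, kernel_size, kernel_size))
        self.pad = pad
        self.pool = nn.MaxPool2d(kernel_size, stride=1, padding=pad)

    def forward(self, x, mask):
        # mask: (N, 1, H, W) binary validity map for the sparse depth.
        x = self.conv(x * mask)
        count = F.conv2d(mask, self.ones, padding=self.pad)
        x = x / (count + 1e-8) + self.bias.view(1, -1, 1, 1)
        # An output pixel is valid if any input pixel in its window was.
        return x, self.pool(mask)

class ResidualSparseBlock(nn.Module):
    """Hypothetical residual wrapper around two sparse convolutions,
    illustrating how sparse convolutions and identity shortcuts can
    be combined."""

    def __init__(self, channels):
        super().__init__()
        self.sconv1 = SparseConv(channels, channels)
        self.sconv2 = SparseConv(channels, channels)

    def forward(self, x, mask):
        out, m = self.sconv1(F.relu(x), mask)
        out, m = self.sconv2(F.relu(out), m)
        return x + out, m  # identity shortcut keeps gradients flowing
```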


Abstract (English): In recent years, RGB images and Lidar have been used in the perception systems of self-driving cars, since Lidar provides precise, long-range measurements. However, current Lidar sensors only provide sparse measurements, especially for far-away objects. To provide precise environment sensing, this thesis investigates the depth completion problem: estimating a dense depth map from sparse measurements. Depth completion is a challenging problem due to the irregular patterns in the sparse depth input and the need to combine Lidar and image information. Previous depth completion approaches suffer from an "edge blur" problem in the estimated depth map. To address this issue, we investigate a deep neural network architecture that combines sparse convolutions with residual networks. To further boost performance, we adapt a two-branch guidance framework: one branch uses RGB-D as input, the other uses Lidar only, and the two branches are then fused to estimate the final completed depth map. To assess the performance of the proposed approach, we conduct extensive experiments on the KITTI benchmark dataset. Compared with the baseline, the proposed approach achieves faster inference while maintaining competitive accuracy in terms of root mean squared error and requiring less memory.
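As a concrete illustration of the two-branch guidance framework and the evaluation metric, here is a minimal PyTorch sketch. The confidence-weighted fusion rule and the `TwoBranchFusion` and `rmse` names are assumptions for illustration, not the thesis's exact design; any encoder-decoder can stand in for the two backbones, as long as each outputs two channels (depth and confidence).

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """Hypothetical two-branch guidance network: each branch predicts a
    depth map and a per-pixel confidence; the final depth is a
    confidence-weighted blend of the two predictions."""

    def __init__(self, rgbd_branch: nn.Module, lidar_branch: nn.Module):
        super().__init__()
        # Each backbone must map its input to 2 channels: depth + confidence.
        self.rgbd_branch = rgbd_branch    # sees RGB + sparse depth (4 ch)
        self.lidar_branch = lidar_branch  # sees sparse depth only (1 ch)

    def forward(self, rgb, sparse_depth):
        rgbd = torch.cat([rgb, sparse_depth], dim=1)
        d1, c1 = self.rgbd_branch(rgbd).chunk(2, dim=1)
        d2, c2 = self.lidar_branch(sparse_depth).chunk(2, dim=1)
        # Per-pixel softmax over the two confidences decides how much
        # each branch contributes at every location.
        w = torch.softmax(torch.cat([c1, c2], dim=1), dim=1)
        return w[:, :1] * d1 + w[:, 1:] * d2

def rmse(pred, gt):
    """Root mean squared error over pixels with ground truth (gt > 0),
    the primary metric on the KITTI depth completion benchmark."""
    valid = gt > 0
    return torch.sqrt(((pred[valid] - gt[valid]) ** 2).mean())
```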

Table of Contents:
Abstract
Acknowledgment
Table of Contents
List of Figures
List of Tables
1 Introduction
  1.1 Depth Completion
  1.2 Motivations
  1.3 Contributions
  1.4 Summary of The Proposed Approach
  1.5 Thesis Outline
2 Related Work
  2.1 Sparse Convolution
  2.2 Residual Networks
  2.3 Superpixel Segmentation
  2.4 Fusion
3 The Proposed Depth Completion Method
  3.1 Overall Architecture
  3.2 Global Sparse Convolutional Neural Network
  3.3 Local Sparse Convolutional Neural Network
  3.4 Sparse Convolutions
    3.4.1 Residual1-Sparse Convolutions
    3.4.2 Residual2-Sparse Convolutions
    3.4.3 Bottleneck-Sparse Convolutions
  3.5 Cooperation with Superpixel Segmentation
4 Experiment
  4.1 Dataset and Metrics
  4.2 Evaluation Results
  4.3 Ablation Studies
    4.3.1 Sparse Convolution
    4.3.2 Loss Function
  4.4 Discussion
5 Conclusion
  5.1 Conclusion
References


Full Text Release Date: 2021/08/27 (campus network)
Full Text Release Date: 2024/08/27 (off-campus network)
Full Text Release Date: 2024/08/27 (National Central Library: Taiwan NDLTD system)