
Author: Toan-Khoa Nguyen (Nguyen Toan Khoa)
Title: Normal Surface and Road Obstacle Detection by Discrepancy Network Based on RGB-D Data
Advisors: Shun-Feng Su (蘇順豐), Chung-Hsien Kuo (郭重顯)
Committee members: Ping-Lang Yen (顏炳郎), Yi-Hung Liu (劉益宏), Han-Pang Huang (黃漢邦)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2022
Graduating academic year: 110 (ROC calendar)
Language: English
Number of pages: 69
Keywords: Mobile robots, Self-supervised learning, Automated labeling, Semantic segmentation, Road obstacle detection
    Obstacle avoidance plays an essential role in the reliable perception of intelligent autonomous mobile robots, enabling them to identify anomalies that may block their working area. Thanks to the notable advances in deep learning, mobile robots can now navigate autonomously based on what they learn during the training phase. However, data preparation and processing for deep learning-based methods are generally time-consuming and labor-intensive because large amounts of properly labeled data are required. To overcome this drawback, this thesis presents self-supervised approaches built around a designed discrepancy network with a channel-wise attention mechanism for automatically labeling normal surfaces (i.e., drivable areas) and road obstacles for segmentation. For each pair of RGB and depth images, the proposed framework automatically generates a label map of the drivable areas and road obstacles by locating the dissimilarities between the input RGB image and a resynthesized image, and then improves obstacle localization by integrating the depth information through approaches ranging from traditional image processing to fully deep learning-based techniques. In addition, to validate robustness, the RGB-D datasets were trained with multiple off-the-shelf RGB-D semantic segmentation neural networks using the self-generated ground-truth labels produced by the developed automatic labeling methods, and the predicted labels were compared. Extensive experiments show that the proposed systems achieve high performance in both indoor and outdoor scenarios and can segment drivable areas and road obstacles in real time on mobile robots.
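In essence, the automatic labeling idea described above is: resynthesize the expected road appearance, compare it against the observed RGB image, and fuse the resulting photometric discrepancy with a depth-based anomaly cue. A minimal NumPy sketch of that idea follows; the function name, fusion weights, and threshold are illustrative stand-ins, not the thesis's actual DifferNet or its learned fusion:

```python
import numpy as np

def pseudo_label(rgb, resynth, depth_anom, tau=0.25):
    """Illustrative pseudo-label generation (not the thesis's exact pipeline).

    rgb, resynth : float arrays in [0, 1], shape (H, W, 3)
    depth_anom   : float array in [0, 1], shape (H, W), a per-pixel depth
                   anomaly score (e.g. deviation from a fitted ground plane)
    Returns a label map: 0 = drivable surface, 1 = obstacle/anomaly.
    """
    # Photometric discrepancy: obstacles that the resynthesis module cannot
    # reproduce show up as large RGB differences.
    rgb_disc = np.abs(rgb - resynth).mean(axis=-1)      # (H, W)

    # Fuse the two cues; a simple average stands in for the learned
    # discrepancy network described in the thesis.
    score = 0.5 * rgb_disc + 0.5 * depth_anom

    return (score > tau).astype(np.uint8)

# Toy example: a flat grey "road" with one bright patch the resynthesis misses.
H, W = 8, 8
rgb = np.full((H, W, 3), 0.5)
rgb[2:4, 2:4] = 1.0                  # obstacle pixels
resynth = np.full((H, W, 3), 0.5)    # resynthesis reproduces only the road
depth_anom = np.zeros((H, W))
depth_anom[2:4, 2:4] = 1.0           # depth cue also flags the obstacle

label = pseudo_label(rgb, resynth, depth_anom)
assert label[3, 3] == 1 and label[0, 0] == 0
```

The label map produced this way can then serve as self-generated ground truth for training an off-the-shelf RGB-D segmentation network, as the abstract describes.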



    ACKNOWLEDGMENTS i
    LIST OF TABLES iii
    LIST OF FIGURES iv
    NOMENCLATURE vi
    CHAPTER 1 INTRODUCTION 1
      1.1 Background 1
      1.2 Objective 2
      1.3 Thesis organization 3
    CHAPTER 2 LITERATURE REVIEW 4
      2.1 Drivable Area Detection 4
      2.2 Obstacles Detection 5
    CHAPTER 3 SELF-SUPERVISED AUTOMATIC LABELING BASED ON TRADITIONAL IMAGE PROCESSING USING RGB-D DATA 8
      3.1 Self-supervised segmentation approach for RGB-D Data 8
      3.2 Self-supervised automatic labeling system based on traditional image processing 10
        3.2.1 Depth Anomaly Calculator 11
        3.2.2 RGB Anomaly Calculator 14
        3.2.3 Post Processor 15
    CHAPTER 4 EFFECTIVE FREE-DRIVING REGION DETECTION FOR MOBILE ROBOTS BY UNCERTAINTY ESTIMATION USING RGB-D DATA 16
      4.1 Self-supervised segmentation approach for RGB-D Data 16
      4.2 Automatic Generating Segmentation Label Framework 18
        4.2.1 Autoencoder 19
        4.2.2 RGB Anomaly Calculator 20
        4.2.3 DissimNet 21
        4.2.4 Depth Anomaly Calculator 23
        4.2.5 Post Processor 24
    CHAPTER 5 DIFFERNET: AN EFFICIENT DISCREPANCY NETWORK FOR DETECTING ROAD ANOMALIES USING RGB-D DATA 25
      5.1 Automatic Labeling System Principle Based on Deep Learning 25
      5.2 Automatic Labeling System Framework Using RGB-D Data 26
        5.2.1 Semantic Segmentation Module 28
        5.2.2 Synthesized Module 29
        5.2.3 DifferNet 30
        5.2.4 Post Processor 35
    CHAPTER 6 EXPERIMENTS AND DISCUSSION 36
      6.1 Self-supervised Automatic Labeling Based on Traditional Image Processing Using RGB-D Data 36
        6.1.1 Evaluation on GMRPD Dataset 36
        6.1.2 Discussion 40
      6.2 Effective Free-driving Region Detection for Mobile Robots by Uncertainty Estimation Using RGB-D Data 40
        6.2.1 Evaluation on GMRPD Dataset 40
        6.2.2 Evaluation on Our Anomaly Dataset 46
        6.2.3 Discussion 48
      6.3 DifferNet: An Efficient Discrepancy Network for Detecting Road Anomalies Using RGB-D Data 49
        6.3.1 DifferNet training procedure and implementation details 49
        6.3.2 DifferNet experimental set-up and results 50
        6.3.3 Evaluation of Automatic Labeling System Framework 52
        6.3.4 Discussion 61
    CHAPTER 7 CONCLUSION AND FUTURE WORKS 62
      7.1 Conclusion 62
      7.2 Future works 63
    REFERENCE 64


    Full text release date: 2032/07/22 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan NDLTD system)