
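Author: 謝政倫
Jeng-Lun Shieh
Thesis title: 一種應用於自動駕駛系統之針對一階物件偵測架構之持續學習策略
A Continual Learning Strategy for One-Stage Object Detection Frameworks in Autonomous Driving Systems
Advisor: 阮聖彰
Shanq-Jang Ruan
Oral defense committee: 方文賢
Wen-Hsien Fang
呂政修
Jenq-Shiou Leu
陳郁堂
Yie-Tarng Chen
阮聖彰
Shanq-Jang Ruan
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2020
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 65
Keywords (Chinese): 深度學習 (deep learning), 持續學習 (continual learning), 一階物件偵測 (one-stage object detection), 自動駕駛系統 (autonomous driving systems)
Keywords (English): Deep Learning, Continual Learning, one-stage object detection, autonomous driving vehicles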

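Multi-class object detection is an essential part of autonomous driving systems. As demand for autonomous driving grows, the number of targets to be detected rises sharply, so the training data and detection classes often exceed what the original architecture was designed for. Continual learning makes it possible to add detection classes while preserving accuracy on the original ones. This thesis therefore applies an experience-replay-based continual learning method to the one-stage object detector YOLOv3, so that the model learns new classes while accumulating and storing information about previously trained classes, avoiding catastrophic forgetting. To improve the localization accuracy of one-stage object detection, two adjustments are proposed: (1) a dynamically adjusted loss function that keeps the detector from conflicting with unlabeled data during training, and (2) an architectural modification to YOLOv3 that makes continual learning more efficient. The approach is validated on the PASCAL VOC 2007 dataset. Compared with a conventionally trained model, mean average precision (mAP) drops by only 4.9% in the 10+10 class setting and by only 3.5% in the 19+1 class setting, the lowest among existing techniques.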
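The abstract does not give the form of the dynamically adjusted loss, so the following is only an illustrative sketch of one common way to avoid conflicts with unlabeled data: masking the classification loss so that classes not annotated for a given sample contribute nothing. The function names `bce` and `masked_class_loss`, the three-class setup, and all probabilities are assumptions made for this example, not the thesis's actual formulation.

```python
import math


def bce(p, y):
    """Binary cross-entropy for one predicted probability p and target y."""
    eps = 1e-7
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def masked_class_loss(probs, targets, labeled_classes):
    """Mean BCE over only the class indices annotated for this sample.

    Assumption: classes outside `labeled_classes` have no ground truth in
    this sample, so treating them as negatives would be a labeling conflict.
    """
    terms = [bce(probs[c], targets[c]) for c in labeled_classes]
    return sum(terms) / len(terms)


# Hypothetical sample from a new-class task: class 0 is an old class that
# is visible in the image but carries no annotation in the new data.
probs = [0.9, 0.1, 0.8]   # model confidences per class
targets = [0, 0, 1]       # class 0 is "0" only because it is unlabeled

unmasked = masked_class_loss(probs, targets, labeled_classes=[0, 1, 2])
masked = masked_class_loss(probs, targets, labeled_classes=[1, 2])
```

In this toy case the unmasked loss penalizes the confident old-class prediction, while the masked loss ignores it, which is the kind of conflict the abstract describes.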


Object detection is a core component of autonomous driving systems (ADS), which typically rely on a machine learning model that detects a fixed range of classes. As ADS deployment widens globally, the variety of objects to be detected may grow beyond that designated range. Continual learning for object detection lets a model adapt robustly to detect additional classes on the fly. This thesis proposes a continual learning method for one-stage object detection that learns new object classes while keeping a cumulative memory of classes from prior learning rounds, thereby avoiding catastrophic forgetting. To increase the accuracy of object localization, the thesis also proposes a dynamic loss function and an adjusted architecture that make training more efficient. On PASCAL VOC 2007, the proposed method incurs only a 4.9% mAP drop relative to training on all classes jointly in the 10+10 incremental scheme, and a 3.5% mAP drop in the 19+1 scheme, the lowest among prior methods.
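The "cumulative memory of classes from prior learning rounds" can be pictured as a small exemplar store that is replayed alongside new-class data at each incremental step. The sketch below is a minimal illustration under stated assumptions: the `ReplayMemory` class, the per-class budget, random exemplar selection, and the image identifiers are all hypothetical, and the thesis may select and store exemplars differently.

```python
import random
from collections import defaultdict


class ReplayMemory:
    """Fixed-budget exemplar store keyed by class name (hypothetical design)."""

    def __init__(self, per_class_budget, seed=0):
        self.per_class_budget = per_class_budget
        self.store = defaultdict(list)  # class name -> list of sample ids
        self.rng = random.Random(seed)

    def add_task(self, samples_by_class):
        """Keep at most `per_class_budget` randomly chosen samples per class."""
        for cls, samples in samples_by_class.items():
            k = min(self.per_class_budget, len(samples))
            self.store[cls] = self.rng.sample(list(samples), k)

    def replay_set(self):
        """All stored exemplars from every class seen so far."""
        return [s for samples in self.store.values() for s in samples]


def build_training_set(new_samples, memory):
    """Data for one incremental step: new-class data plus replayed exemplars."""
    return list(new_samples) + memory.replay_set()


memory = ReplayMemory(per_class_budget=2)
# Step 1: register the base classes (two shown for brevity).
memory.add_task({"car": ["img_001", "img_002", "img_003"], "person": ["img_004"]})
# Step 2: train on the new class together with replayed old-class exemplars,
# then add the new class to memory for later steps.
new_task = {"bus": ["img_010", "img_011", "img_012"]}
train_set = build_training_set(new_task["bus"], memory)
memory.add_task(new_task)
```

The per-class budget bounds memory growth as classes accumulate, which is the usual trade-off between storage cost and forgetting in replay-based methods.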

Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Works
Chapter 3 Proposed Method
3.1 YOLO Architecture
3.2 Task Distribution and Data Augmentation
3.3 Memory Replay on Continual Learning
3.4 Branched Layer
Chapter 4 Experimental Results
4.1 Performance Evaluation Parameters
4.2 Addition of Classes Incrementally
4.3 Solving Data Imbalance Problem with Branched Layer
4.4 Visualization and Effect of Different Memory Size
4.5 Performance Evaluation on ITRI-DriveNet60 Dataset
Chapter 5 Conclusions
References
Appendix 1 – Example of the VOC dataset (1)
Appendix 2 – Example of the VOC dataset (2)
Appendix 3 – Example of the ITRI dataset (1)
Appendix 4 – Example of the ITRI dataset (2)


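Full-text release date: 2025/12/23 (campus network)
Full-text release date: 2025/12/23 (off-campus network)
Full-text release date: 2025/12/23 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)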