Author: 石孟台 (Meng-Tai Shih)
Title: Fine-tuning Neural Networks for Self-Driving Cars using Re-adapter (使用重新適配器微調神經網路應用於自動駕駛汽車)
Advisor: 陳郁堂 (Yie-Tarng Chen)
Oral Examination Committee: 方文賢 (Wen-Hsien Fang), 林銘波 (Ming-Bo Lin), 呂政修 (Jenq-Shiou Leu), 賴坤財 (Kuen-Tsair Lay)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Academic Year: 111 (ROC calendar)
Language: English
Pages: 30
Keywords: Autonomous vehicles, Adapter, Simulator, Multi-task training, Balanced regression, Openpilot, CARLA
Abstract: Openpilot, ranked the leading advanced driver assistance system (ADAS) by Consumer Reports in 2020, relies on the Supercombo neural network to generate a comprehensive output vector from front-view camera images, encompassing ego-car trajectory planning and lead-car path prediction. Although the complete training code and datasets for Supercombo are not publicly available, this thesis presents a way to enhance Supercombo using an adapter architecture inspired by parameter-efficient transfer learning in large language models. By fine-tuning Supercombo on diverse driving scenarios extracted from the CARLA datasets, we extend its capabilities to cover both path prediction and lead-car prediction. We further address the challenges of multi-task learning and imbalanced regression by designing dedicated loss functions. Extensive experiments on the CARLA datasets show that the proposed fine-tuning approach improves both path prediction and lead-car prediction over the original Supercombo model.
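The parameter-efficient fine-tuning idea the abstract refers to can be illustrated with a standard bottleneck adapter: a small down-projection, a nonlinearity, an up-projection, and a residual connection, inserted into a frozen backbone so that only the small projection matrices are trained. The sketch below is a minimal NumPy illustration of that generic mechanism, not the thesis's actual Re-adapter module; all names and dimensions are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class BottleneckAdapter:
    """Generic bottleneck adapter.

    Projects a frozen layer's activation down to a small hidden
    size, applies a nonlinearity, projects back up, and adds a
    residual connection. Only W_down and W_up would be trained;
    the backbone weights stay frozen.
    """

    def __init__(self, dim, bottleneck, rng):
        # Near-zero initialization, so at the start of fine-tuning
        # the adapter behaves almost like the identity function and
        # does not disturb the pretrained backbone's behavior.
        self.W_down = rng.normal(0.0, 1e-3, (dim, bottleneck))
        self.W_up = rng.normal(0.0, 1e-3, (bottleneck, dim))

    def __call__(self, h):
        # Residual connection: output = input + small learned update.
        return h + relu(h @ self.W_down) @ self.W_up

rng = np.random.default_rng(0)
adapter = BottleneckAdapter(dim=512, bottleneck=32, rng=rng)
h = rng.normal(size=(1, 512))   # stand-in for a frozen-layer activation
out = adapter(h)                # close to h at initialization
print(np.allclose(out, h, atol=1e-2))
```

With a bottleneck of 32 on a 512-dimensional feature, the adapter adds only 2 × 512 × 32 ≈ 33k trainable parameters per insertion point, which is the point of the approach: the pretrained model is adapted to new targets (here, path and lead-car prediction) without retraining or even accessing its original training pipeline.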