
Graduate Student: Meng-Tai Shih (石孟台)
Thesis Title: Fine-tuning Neural Networks for Self-Driving Cars using Re-adapter (使用重新適配器微調神經網路應用於自動駕駛汽車)
Advisor: Yie-Tarng Chen (陳郁堂)
Committee Members: Wen-Hsien Fang (方文賢), Ming-Bo Lin (林銘波), Jenq-Shiou Leu (呂政修), Kuen-Tsair Lay (賴坤財)
Degree: Master
Department: Department of Electronic and Computer Engineering
Year of Publication: 2023
Graduation Academic Year: 111 (AY 2022-2023)
Language: English
Number of Pages: 30
Keywords: Autonomous vehicles, Adapter, Simulator, Multi-task training, Balanced regression, Openpilot, CARLA
Statistics: Views: 190; Downloads: 0

Abstract:
    Openpilot, recognized by Consumer Reports in 2020 as the leading advanced driver-assistance system (ADAS), relies on the Supercombo neural network to generate a comprehensive output vector from front-view camera images, encompassing ego-car trajectory planning and lead-car path prediction. Although the complete training code and datasets for Supercombo are not publicly available, this thesis presents a solution for enhancing Supercombo through an adapter architecture, inspired by parameter-efficient transfer learning in large language models. By fine-tuning Supercombo on diverse driving scenarios extracted from the CARLA datasets, we extend its capabilities to cover both path prediction and lead-car prediction. Furthermore, we address the challenges of multi-task training and imbalanced regression by designing dedicated loss functions to optimize performance. Extensive experiments on the CARLA datasets validate that the proposed fine-tuning approach achieves better path and lead-car prediction than the original Supercombo model.
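
    To make the approach described in the abstract more concrete, the sketch below shows, in PyTorch, (i) a generic bottleneck adapter added residually on top of a frozen backbone, with two regression heads for path and lead-car prediction, and (ii) an uncertainty-weighted combination of the two task losses, a standard recipe for balancing multi-task regression. Every class name, feature dimension, and output layout here is an illustrative assumption; this is not the thesis's actual Supercombo adapter, its loss design, or Openpilot code.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Generic bottleneck adapter: down-project, non-linearity, up-project,
    added residually to the frozen features. (Illustrative only.)"""

    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck, dim)
        # Zero-init the up-projection so training starts from the
        # pre-trained model's original behaviour.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class AdaptedModel(nn.Module):
    """Frozen backbone + trainable adapter + two task heads
    (path prediction and lead-car prediction)."""

    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False               # pre-trained weights stay fixed
        self.adapter = BottleneckAdapter(feat_dim)
        self.path_head = nn.Linear(feat_dim, 33 * 3)  # hypothetical: 33 path points (x, y, z)
        self.lead_head = nn.Linear(feat_dim, 4)       # hypothetical: distance, speed, accel, prob

    def forward(self, x: torch.Tensor):
        feats = self.adapter(self.backbone(x))
        return self.path_head(feats), self.lead_head(feats)


class UncertaintyWeightedLoss(nn.Module):
    """One standard way to balance multiple regression tasks: a learned
    log-variance per task weights each loss term. Shown only as an example
    of multi-task loss weighting, not the loss used in the thesis."""

    def __init__(self, num_tasks: int = 2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = torch.zeros((), dtype=torch.float32)
        for i, loss in enumerate(task_losses):
            total = total + torch.exp(-self.log_vars[i]) * loss + self.log_vars[i]
        return total


if __name__ == "__main__":
    # Toy backbone standing in for a frozen feature extractor; Supercombo's
    # real architecture and input format are not reproduced here.
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 128, 512), nn.ReLU())
    model = AdaptedModel(backbone, feat_dim=512)
    criterion = UncertaintyWeightedLoss(num_tasks=2)

    imgs = torch.randn(2, 3, 64, 128)
    path_pred, lead_pred = model(imgs)
    path_loss = nn.functional.mse_loss(path_pred, torch.randn_like(path_pred))
    lead_loss = nn.functional.mse_loss(lead_pred, torch.randn_like(lead_pred))
    loss = criterion([path_loss, lead_loss])
    loss.backward()  # gradients reach only the adapter, the heads, and log_vars
    print(path_pred.shape, lead_pred.shape, float(loss))
```

    Zero-initializing the adapter's up-projection makes the adapted network reproduce the frozen model exactly at the start of fine-tuning, which is the usual way to avoid disturbing pre-trained behaviour before the new heads have learned anything.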

Table of Contents:
    Abstract (Chinese) ... ii
    Abstract ... iv
    Acknowledgment ... v
    Table of Contents ... vi
    List of Figures ... viii
    List of Tables ... x
    List of Acronyms ... xi
    1 Introduction ... 1
    2 Related Work ... 4
      2.1 Parameter-efficient Transfer Learning ... 4
      2.2 Structural Re-parameterization ... 5
    3 Proposed Method ... 6
      3.1 Preliminary ... 6
      3.2 Data Collection from CARLA LeaderBoard ... 7
      3.3 Training Data Pre-processing ... 9
      3.4 Adapter Structure ... 10
      3.5 Supercombo with Adapter ... 11
      3.6 Loss Function and Multi-task Training ... 12
    4 Experiment ... 16
      4.1 Performance metrics ... 16
        4.1.1 Trajectory ... 16
        4.1.2 Lead ... 16
      4.2 Experimental Results ... 17
        4.2.1 Trajectory ... 17
        4.2.2 Lead ... 22
    5 Conclusion ... 28
      5.1 Conclusion ... 28
      5.2 Future Works ... 28
    References ... 29


Full Text Release Date: 2025/08/21 (campus network)
Full Text Release Date: 2025/08/21 (off-campus network)
Full Text Release Date: 2025/08/21 (National Central Library: Taiwan NDLTD system)