Graduate student: 王剡家 (Yan-Jia Wang)
Thesis title: Adapting Spatial and Temporal Modeling for Traffic Accident Anticipation
Chinese title: 應用具有空間和時間自注意力的自適應器於車禍預測
Advisor: 方文賢 (Wen-Hsien Fang)
Oral defense committee: 方文賢 (Wen-Hsien Fang)
Yie-Tarng Chen
Kuen-Tsair Lay
Jenq-Shiou Leu
Chien-Ching Chiu
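Degree: Master's
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: English
Number of pages: 40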
Chinese keywords: 異常偵測, 車禍預測, 有效微調
English keywords: Anomaly detection, Accident anticipation, Efficient tuning
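Traffic accident anticipation is an important research area that aims to predict potential traffic accidents in advance, so as to avoid serious harm and reduce the number of accidents. In this thesis, we adopt the Adapting Image Models (AIM) framework, improve its internal architecture, and combine it with several methods that improve predictive performance for traffic accident anticipation. First, we deepen the adapter structure and use fully connected and one-dimensional convolution layers to extract global and local features, improving the model's understanding of spatial information. Next, we introduce attention mechanisms in the spatial and temporal dimensions separately. In the spatial dimension, we use cross attention to learn the positional relationships between large and small objects and to accurately locate regions where an accident may occur. In the temporal dimension, we use weighted temporal attention to learn the correlations between adjacent frames and to predict in advance the time at which an accident may occur. By integrating these improvements into the Vision Transformer (ViT), we conduct experiments on two datasets. The results show that our architecture achieves a significant performance improvement in traffic accident anticipation, delivers excellent accuracy and predictive capability, and performs well on objects and scenes of different scales. These contributions are expected to provide a valuable reference for research and practice in traffic safety and to promote broader applications.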

Traffic accident anticipation aims to predict potential accidents before they occur, so that severe harm can be prevented and the number of traffic incidents reduced. In this study, we build on the Adapting Image Models (AIM) framework and introduce several methods that improve the model's predictive performance for traffic accident anticipation. First, we deepen the adapter's structure, extracting global and local features with fully connected (FC) and one-dimensional convolution (Conv1D) layers to improve spatial understanding. Second, we incorporate attention mechanisms within the adapter in both the spatial and temporal dimensions. In the spatial dimension, cross attention learns the positional relationships between large and small objects, which helps localize accident-prone areas. In the temporal dimension, weighted temporal attention learns the correlations between adjacent frames, which allows the time of an accident to be anticipated in advance. We integrate these enhancements into the Vision Transformer (ViT) and conduct extensive experiments on two datasets (DAD and CCD), which show a significant performance improvement in traffic accident anticipation. The model achieves strong accuracy and anticipation capability and handles objects and scenes of diverse scales well. These contributions are expected to serve as a useful reference for traffic safety research and practice and to support broader applications in the field.
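The adapter idea summarized in the abstract can be illustrated with a small NumPy sketch. It is not the thesis's implementation: it assumes a bottleneck adapter of the AIM kind (FC down, nonlinearity, FC up, residual connection) and adds a depthwise Conv1D over the token axis as one possible way to mix local information. The names `adapter`, `conv1d_tokens`, `w_down`, `w_up`, and `kernel`, as well as the placement of the convolution, are hypothetical.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def conv1d_tokens(x, kernel):
    """Depthwise 1D convolution along the token axis with zero 'same' padding.

    x: (tokens, channels); kernel: (k, channels) with k odd.
    Assumption: this stands in for the Conv1D branch named in the abstract.
    """
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for i in range(k):
        # Each kernel tap weights a shifted copy of the token sequence.
        out += xp[i:i + x.shape[0]] * kernel[i]
    return out

def adapter(x, w_down, w_up, kernel):
    """Bottleneck adapter with an added local branch (illustrative only).

    x: (tokens, dim); w_down: (dim, r); w_up: (r, dim); kernel: (k, r).
    """
    h = x @ w_down                     # FC down-projection: mixes all channels (global)
    h = h + conv1d_tokens(h, kernel)   # Conv1D: mixes neighboring tokens (local)
    h = gelu(h)
    return x + h @ w_up                # FC up-projection plus residual connection

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))            # 5 tokens, 8 channels
w_down = rng.normal(size=(8, 2))
w_up = np.zeros((2, 8))                # zero-initialized up-projection
kernel = rng.normal(size=(3, 2))
y = adapter(x, w_down, w_up, kernel)
```

With `w_up` initialized to zeros the adapter starts as an identity mapping, so the frozen backbone's features pass through unchanged at the start of tuning. This is a common choice for adapters and is assumed here rather than taken from the thesis.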

Table of contents

Abstract (in Chinese)
Abstract
Acknowledgment
Table of contents
List of Figures
List of Tables
List of Acronyms
1 Introduction
2 Related Work
  2.1 Traffic Accident Anticipation
  2.2 Attention Mechanism
  2.3 Efficient Tuning
  2.4 Summary
3 Proposed Method
  3.1 Proposed Architecture
    3.1.1 Essence of ViT
    3.1.2 Adapters
  3.2 Enhanced Spatial Adaptation
  3.3 Enhanced Temporal Adaptation
  3.4 Enhanced Joint Adaptation
  3.5 Summary
4 Experimental Results and Discussions
  4.1 Accident Dataset
    4.1.1 Driver Anomaly Detection Dataset (DAD)
    4.1.2 Car Crash Dataset (CCD)
  4.2 Experimental Setup
    4.2.1 Implement details
    4.2.2 Evaluation Metrics
  4.3 Ablation Studies
    4.3.1 Analysis of Conv1d Up & Down Module
    4.3.2 Analysis of Cross Attention
    4.3.3 Analysis of Weighted Temporal Attention
  4.4 Visualization Results
    4.4.1 Successful Cases
    4.4.2 Failure Cases and Error Analysis
  4.5 Comparison with the State-of-the-Art Works
  4.6 Summary
5 Conclusion and Future Works
  5.1 Conclusion
  5.2 Future Works
References

