| Student | Chao-Wei Chang |
|---|---|
| Thesis Title | Accident Video Detection from YouTube Videos for Self-driving Cars |
| Advisor | Yie-Tarng Chen |
| Committee Members | Yie-Tarng Chen, Wen-Hsien Fang, Ming-Bo Lin, Jenq-Shiou Leu, Hsing-Lung Chen |
| Degree | Master |
| Department | Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication | 2019 |
| Graduation Academic Year | 107 |
| Language | English |
| Pages | 42 |
| Keywords | car accident detection, accident data collection |
As the perception technology for autonomous driving cars has received increasing attention in recent years, more and more researchers have investigated this issue. Specifically, avoiding car accidents in road scenes is one of the critical issues for autonomous driving cars. To avoid car accidents, autonomous driving cars should have the capability to detect car accidents in advance. To this end, in this thesis we propose a novel three-stage classification architecture for dash-cam videos. First, CNN-LSTM networks are used to detect out-of-control vehicles. Next, taking advantage of state-of-the-art object detection and object tracking schemes, we leverage the Intersection-over-Union (IoU) ratio of the bounding boxes of two vehicles to detect car accidents in front-view images. Finally, using the inverse perspective transformation, we confirm a car accident with an occupancy map in the bird's-eye view. The major contribution of this thesis is a simple but effective car accident detection system based on neural networks, which can determine in which frame a car accident occurs, point out which cars cause an accident, and record their driving trajectories as well.
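The abstract does not spell out how the IoU ratio in the second stage is computed. As a generic illustration only (the function name and the `(x1, y1, x2, y2)` box convention are assumptions, not taken from the thesis), the standard IoU of two axis-aligned bounding boxes can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2).

    Returns a value in [0, 1]: 0 for disjoint boxes, 1 for identical boxes.
    A high IoU between the boxes of two tracked vehicles suggests an overlap
    (e.g., a possible collision) in the front-view image.
    """
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

In practice the thesis applies this ratio to detector/tracker outputs per frame; a threshold on the IoU of two vehicle boxes then flags candidate accident frames for confirmation in the bird's-eye-view stage.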