
Author: Ferdyan Dannes Krisandika
Thesis Title: Vehicle Trajectory Parsing from YouTube Accident Videos for Self-driving Cars
Advisors: Yie-Tarng Chen, Wen-Hsien Fang
Committee Members: Jian-Qing Qiu, Xian-Sheng Hong, Kuen-Tsair Lay, Yie-Tarng Chen, Wen-Hsien Fang
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2020
Graduation Academic Year: 108
Language: English
Pages: 60
Keywords: Object detection, Object tracking, Ego-motion, Optical flow, Relative scale
Autonomous driving systems (ADS) must be trained on a dataset, but current public datasets only provide videos of normal vehicle behaviour. To resolve this dilemma, a variety of accident trajectory data, derived from a variety of traffic accident videos, is needed so that an ADS can be properly trained to be aware of its surrounding environment. In light of this, we propose a novel method that combines object detection, object tracking, depth estimation, lane segmentation, and 3D geometry to generate accident trajectories. Combining the lane segmentation of the drivable area in front of the cars, the estimated depth, the detected cars, and an image perspective transform allows us to convert the front-view images into bird's-eye-view images and re-map the tracked object positions. Then, using optical flow and ego-motion prediction, we re-map the tracked vehicles from image coordinates into real-world coordinates to obtain accurate positions. Simulations show that the proposed method can accurately generate the accident trajectories of the crashed vehicle.


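The coordinate re-mapping step described in the abstract can be illustrated with a minimal sketch: back-project a tracked vehicle's pixel position through an estimated depth value and the camera intrinsics, then apply the frame's ego-motion (camera pose) to place it in real-world coordinates. The intrinsics, pixel position, depth value, and pose below are illustrative assumptions, not values taken from the thesis.

```python
# Minimal sketch of re-mapping a tracked vehicle from image coordinates
# to world coordinates using depth and ego-motion (illustrative only).

import numpy as np

def backproject_to_camera(u, v, depth, K):
    """Back-project pixel (u, v) with metric depth into camera coordinates."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def camera_to_world(p_cam, R_wc, t_wc):
    """Transform a camera-frame point into world coordinates given the
    ego pose (rotation R_wc, translation t_wc) of the current frame."""
    return R_wc @ p_cam + t_wc

# Example: a tracked vehicle whose bounding-box bottom centre lies at
# pixel (640, 400) with an estimated depth of 12.5 m, assuming a
# hypothetical pinhole camera and an identity ego pose for frame 0.
K = np.array([[718.0,   0.0, 640.0],
              [  0.0, 718.0, 360.0],
              [  0.0,   0.0,   1.0]])
p_cam = backproject_to_camera(640, 400, 12.5, K)
p_world = camera_to_world(p_cam, np.eye(3), np.zeros(3))
print(p_world)  # vehicle position in world coordinates for this frame
```

Repeating this per frame with the estimated ego pose of each frame yields the vehicle's trajectory in a fixed world frame, which is the form of output the abstract describes.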

Abstract
Acknowledgment
Table of Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
  2.1 Object Detection
  2.2 Object Tracker
  2.3 Image Segmentation
  2.4 Depth Estimation
  2.5 Trajectory Generation
  2.6 Monocular 3D Object Detection
  2.7 Path Planning
  2.8 Summary
3 Proposed Method
  3.1 Overall Pipeline
  3.2 Trajectory Generation
    3.2.1 Ego-motion and Depth Estimation
    3.2.2 Distance Estimation
    3.2.3 Trajectory Generation
  3.3 Object Speed and Orientation Estimation
    3.3.1 Orientation Estimation
    3.3.2 Speed Estimation
  3.4 Path Planning
  3.5 Summary
4 Experimental Results
  4.1 Dataset
  4.2 Distance Estimation
    4.2.1 Lane Line Detection
    4.2.2 Region of Interest Selection
    4.2.3 Vehicle Distance Estimation
  4.3 Trajectory Generation
    4.3.1 Ego-car Speed Estimation
    4.3.2 Accident Trajectory Generation
  4.4 Speed and Orientation Estimation
  4.5 Path Planning
  4.6 Summary
5 Conclusions and Future Works
  5.1 Conclusions
  5.2 Future Works
References
Appendix A: Dataset
Biography


Full-text release date: 2025/08/20 (campus network)
Full-text release date: 2025/08/20 (off-campus network)
Full-text release date: 2025/08/20 (National Central Library: Taiwan Theses and Dissertations System)