Joint Object Detection and Object Tracking for Self-Driving Cars using Deep Learning


Graduate student:	Yueh-Tung Wu (吳岳桐)
Thesis title:	Joint Object Detection and Object Tracking for Self-Driving Cars using Deep Learning
Advisor:	Yie-Tarng Chen (陳郁堂)
Oral defense committee:	Wen-Hsien Fang (方文賢), Hsing-Lung Chen (陳省隆), Jenq-Shiou Leu (呂政修), Chen-Mie Wu (吳乾彌), Ming-Bo Lin (林銘波)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication:	2018
Academic year of graduation:	106
Language:	English
Number of pages:	47
Keywords:	Online multi-object tracking, CNN-based tracker, CNN-based detector, cosine similarity, Hungarian algorithm

This thesis presents a simple but effective approach to real-time multiple object tracking based on convolutional neural networks (CNNs). The approach integrates four components: the CNN-based YOLOv3 object detector, the CNN-based Re3 tracker, a cosine similarity network, and the Hungarian algorithm. The YOLOv3 detector automatically and efficiently detects new targets to initialize tracking. In the Re3 tracker, CNN features capture the appearance of each tracked-state target while LSTMs capture its motion, and the tracker's prediction is then fused with the adjacent detection results. To recover lost-state targets, the cosine similarity network computes pairwise similarities between lost-state targets and object detections, and the Hungarian algorithm matches them. Finally, a neighboring-target suppression mechanism avoids redundant tracking of the same object. The proposed method is a generic scheme: even without fine-tuning, this simple framework achieves good performance on the MOT Challenge and KITTI benchmarks.
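The recovery step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes each lost-state target and each detection is already represented by an appearance feature vector (in the thesis these would come from the cosine similarity network). The function names `cosine_similarity`, `hungarian`, and `reassociate`, and the `min_similarity` threshold of 0.5, are hypothetical choices made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list of (row, column) pairs, one pair per row.
    Standard O(n^2 * m) Hungarian algorithm with row/column potentials.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently assigned to column j (0 = none)
    way = [0] * (m + 1)   # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignments along the augmenting path back to column 0.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def reassociate(lost_features, detection_features, min_similarity=0.5):
    """Match lost-state targets to detections.

    Returns {lost_index: detection_index}. Pairs whose similarity falls
    below min_similarity (an assumed threshold) are rejected.
    """
    if not lost_features or not detection_features:
        return {}
    sim = [[cosine_similarity(l, d) for d in detection_features]
           for l in lost_features]
    cost = [[1.0 - s for s in row] for row in sim]  # high similarity = low cost
    if len(lost_features) <= len(detection_features):
        pairs = hungarian(cost)
    else:
        # More lost targets than detections: solve the transposed problem.
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(r, c) for c, r in hungarian(transposed)]
    return {r: c for r, c in pairs if sim[r][c] >= min_similarity}


if __name__ == "__main__":
    lost = [[1.0, 0.0], [0.0, 1.0]]
    detections = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    print(sorted(reassociate(lost, detections).items()))
```

Using `1 - similarity` as the cost turns the maximum-similarity matching into the minimum-cost assignment that the Hungarian algorithm solves. Thresholding after the assignment lets a lost target stay lost when no detection resembles it closely enough.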

Chinese Abstract
Abstract
Acknowledgment
Table of contents
List of Figures
List of Tables
Introduction
Related Work
Proposed Method
  1 Overall Methodology
  2 Object detector
  3 Object tracker
  4 Object association
  5 Object detector and object tracker fusion
  6 System Flow
    6.1 Life of an object
    6.2 System architecture
Experimental Result and Analysis
  1 Datasets
  2 Evaluation on Testing Set
    2.1 MOT Challenge
    2.2 KITTI Benchmark
  3 Failure Case
Conclusion
References

                                
