
Author: 李冠霖 / Kuan-Lin Lee
Thesis Title: A Study of Cars, Pedestrians, and Cyclists Prediction in Low Visibility Scenes with Augmented Image Sets
Advisor: 吳怡樂 / Yi-Leh Wu
Committee Members: 陳建中 / Jiann-Jone Chen; 唐政元 / Cheng-Yuan Tang; 閻立剛 / Li-Gang Yan
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Graduation Academic Year: 106
Language: English
Pages: 41
Keywords: Caffe, SSD, Car Classification, Deep Learning, KITTI, Photoshop, Human Classification



Recently, deep learning has become one of the most popular approaches to object detection, and the most important ingredient in this approach is the dataset used during training. However, looking over the accessible datasets, we find that most were recorded in good weather (e.g., on sunny days) in simple environments, and most available testing images and videos were recorded under the same conditions. When we test a model trained on such bright, simple data in bad weather (e.g., rainy, cloudy, stormy, or foggy days) or in dim places (e.g., tunnels, at night, at dusk, or in the shade of other objects), performance degrades sharply. To solve this problem, we use Photoshop to modify the original dataset, making it more varied and larger in number of images so that it better matches real-world conditions. Our experiments show that this improves accuracy from 98.6% to 99.3% on the daytime testing dataset and from 22% to 50% on the nighttime testing dataset. Chapter 5 gives a more detailed account of the experiments.
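The augmentation described above is performed manually in Photoshop; conceptually it amounts to lowering brightness and compressing contrast. The following is a minimal, hypothetical sketch of that transform on a flat list of grayscale pixel values (the function name and the factor values are illustrative, not taken from the thesis):

```python
def simulate_night(pixels, brightness=0.35, contrast=0.8):
    """Darken pixels and flatten their contrast to mimic a night scene.

    A scripted stand-in for the manual Photoshop adjustments described
    in the abstract; parameter values here are illustrative only.
    """
    # Step 1: darken every pixel by a multiplicative brightness factor.
    dark = [p * brightness for p in pixels]
    # Step 2: compress contrast by pulling each pixel toward the mean.
    mean = sum(dark) / len(dark)
    adjusted = [mean + contrast * (p - mean) for p in dark]
    # Clamp back into the valid 0-255 range and round to integers.
    return [min(255, max(0, round(p))) for p in adjusted]
```

Because the transform is monotonic, the relative ordering of pixel intensities is preserved, so existing bounding-box annotations on the original daytime images remain valid for the darkened copies.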

論文摘要 (Chinese Abstract)
Abstract
Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Related Work and Dataset
  2.1 VGGNet
  2.2 CNN & SSD
  2.3 TP, FP, TN, FN and ROC Curve
  2.4 Adjusting the Images by Brightness and Contrast
Chapter 3. Proposal: Rendering and Lighting
  3.1 Rendering
  3.2 Lighting
Chapter 4. Experiments
  4.1 Environment and Dataset
  4.2 Training Each Dataset with SSD
  4.3 Testing Each Model with the KITTI Dataset
  4.4 Testing the SYSU Nighttime Vehicle Dataset with Each Model
  4.5 Training Time and Loss
Chapter 5. Conclusions and Future Work
References
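Section 2.3 of the contents covers TP, FP, TN, FN and the ROC curve. As a quick illustration (not code from the thesis, and with made-up counts), these four counts combine into the accuracy figure quoted in the abstract and into the two ROC axes as follows:

```python
# Standard confusion-matrix metrics; counts in the examples are
# placeholders, not the thesis's actual results.
def accuracy(tp, fp, tn, fn):
    # Fraction of all predictions that are correct.
    return (tp + tn) / (tp + fp + tn + fn)

def true_positive_rate(tp, fn):
    # Recall / sensitivity: the y-axis of a ROC curve.
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    # Fall-out: the x-axis of a ROC curve.
    return fp / (fp + tn)
```

Sweeping the detector's confidence threshold and plotting (false positive rate, true positive rate) at each setting traces out the ROC curve discussed in Section 2.3.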

[1] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., & Reed, S. “SSD: Single Shot MultiBox Detector,” In ECCV, 2016.
[2] KITTI dataset, http://www.cvlibs.net/datasets/kitti/, accessed 2018/06/02.
[3] Sensitivity and specificity, Wikipedia, https://en.wikipedia.org/wiki/Sensitivity_and_specificity, accessed 2018/06/02.
[4] Davis, J., & Goadrich, M. “The Relationship Between Precision-Recall and ROC Curves,” In Proceedings of the 23rd International Conference on Machine Learning (ICML '06), 2006.
[5] Krizhevsky, A., Sutskever, I., & Hinton, G. E. “ImageNet Classification with Deep Convolutional Neural Networks,” In Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[6] Hinton, G. E., Osindero, S., & Teh, Y. “A fast learning algorithm for deep belief nets,” Neural Computation, 2006.
[7] Simonyan, K., & Zisserman, A. “Very deep convolutional networks for large-scale image recognition,” Computer Vision and Pattern Recognition (cs.CV), 2014.
[8] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[9] Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. “You only look once: Unified, real-time object detection,” In CVPR, arXiv:1506.02640v5, 2016.
[10] Redmon, J., & Farhadi, A. “YOLO9000: Better, faster, stronger,” In CVPR, arXiv:1612.08242v1, 2017.
[11] Ren, S., He, K., Girshick, R., & Sun, J. “Faster R-CNN: Towards real-time object detection with region proposal networks,” In NIPS, 2015.
[12] Kuang, H., Chen, L., Gu, F., Chen, J., Chan, L., & Yan, H. “Combining region-of-interest extraction and image enhancement for nighttime vehicle detection,” IEEE Intelligent Systems, vol. 31, pp. 57-65, 2016.
[13] Thompson, W., Shirley, P., & Ferwerda, J. “A spatial post-processing algorithm for images of night scenes,” Journal of Graphics Tools, 2002.
[14] Haro, G., Bertalmío, M., & Caselles, V. “Visual acuity in day for night,” International Journal of Computer Vision, vol. 69, no. 1, pp. 109-117, 2006.
[15] Wanat, R., & Mantiuk, R. “Simulating and compensating changes in appearance between day and night vision,” ACM Transactions on Graphics (TOG), vol. 33, no. 4, July 2014.
[16] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. “Going deeper with convolutions,” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[17] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., & Zheng, X. “TensorFlow: Large-scale machine learning on heterogeneous distributed systems,” arXiv:1603.04467, 2016.
[18] WHO, Number of road traffic deaths, http://www.who.int/news-room/fact-sheets/detail/road-traffic-injuries, accessed 2018/06/02.
[19] Precision and recall, Wikipedia, https://en.wikipedia.org/wiki/Precision_and_recall, accessed 2018/06/02.
[20] Li, R., Tan, R., & Cheong, L. “Robust Optical Flow Estimation in Rainy Scenes,” In CVPR, 2017.
[21] Su, H., Qi, C., Li, Y., & Guibas, L. “Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views,” In ICCV, 2015.
[22] Large Scale Visual Recognition Challenge 2014 (ILSVRC2014), http://www.image-net.org/challenges/LSVRC/2014/results, accessed 2018/06/06.
[23] CS231n: Convolutional Neural Networks for Visual Recognition, http://cs231n.github.io/convolutional-networks/, accessed 2018/06/06.
