
Student: Muhammad Dany Alfikri
Thesis Title: Real-Time Pedestrian Detection on IoT Edge Devices: A Light Deep Learning Approach
Advisor: Shin-Ming Cheng (鄭欣明)
Committee Members: Wang Chih-Yu (王志宇), Rafael Kaliski (柯拉飛)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication Year: 2023
Academic Year of Graduation: 111 (ROC calendar)
Language: English
Number of Pages: 64
Keywords (English): lightweight deep learning model
Access counts: Views: 264, Downloads: 0
With advances in computing power, deep learning has acquired capabilities that approximate human reasoning and is widely used to process and analyze data and images. Among these applications, computer vision has drawn particular attention, and pedestrian detection plays an important role in traffic safety and intelligent transportation systems. Through deep learning, machines can recognize and detect pedestrians in images or video, which can then be applied to improve pedestrian safety and traffic management. For example, in an intelligent transportation system, a deep learning model can analyze video captured by cameras, identify the direction of vehicle and pedestrian movement, detect pedestrians in time, and issue warnings. This not only helps drivers and the system react to pedestrians promptly but also improves overall traffic safety.

However, pedestrian detection must overcome the problem of data transmission latency. Current systems typically rely on a central management server for analysis and processing, and transmitting camera footage to a central server for pedestrian detection introduces delay. To shorten waiting times, edge servers are usually deployed close to the data source. Even so, the limited computing power and resources of edge servers make it difficult to handle complex, computation-intensive tasks, especially large deep learning models.

To address this problem, we developed a lightweight deep learning model designed to run on resource-constrained edge devices. This lightweight model has fewer parameters and lower computational requirements: we adopted an improved YOLOv3 model and transmit detection events to the edge server over the MQTT protocol, realizing a real-time pedestrian detection system. Simulation results show that the model can perform pedestrian detection in real time, with an inference time as fast as 412 milliseconds and an accuracy of 78%, a significant improvement over the baseline models.

In summary, with the application of deep learning, particularly for pedestrian detection, lightweight models and edge computing promise more effective traffic safety and intelligent transportation systems.
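As a rough, hypothetical illustration of the timing claim above, the sketch below loads a generic compact YOLO detector and measures per-frame inference time on an edge device. It is not the author's implementation: the yolov5n checkpoint from PyTorch Hub stands in for the improved YOLOv3 model, and the test frame name is an assumption.

```python
# Hypothetical sketch: timing a compact detector on an edge device.
# Stand-ins (not from the thesis): the yolov5n checkpoint from PyTorch Hub
# replaces the author's improved YOLOv3, and "crosswalk.jpg" is an assumed
# local test frame from the roadside camera.
import time

import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5n", pretrained=True)
model.eval()

frame = "crosswalk.jpg"  # hypothetical camera frame on disk

_ = model(frame)  # warm-up run (first-call overhead, weight loading)

start = time.perf_counter()
results = model(frame)
elapsed_ms = (time.perf_counter() - start) * 1000.0

# Keep only "person" detections as a proxy for pedestrians (COCO class name).
detections = results.pandas().xyxy[0]
pedestrians = detections[detections["name"] == "person"]
print(f"inference: {elapsed_ms:.0f} ms, pedestrians: {len(pedestrians)}")
```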


Artificial intelligence, specifically deep learning, has become an integral part of our everyday lives, thanks to advancements in computing power. Computer vision, a subset of deep learning, aims to provide machines with human-like visual understanding. Pedestrian detection, an important application in computer vision, is crucial for intelligent transportation systems that use video and image data to identify pedestrians and issue warnings at intersections. Currently, centralized processing units are employed to analyze camera feeds and generate alerts for nearby vehicles. However, real-time applications face challenges such as latency, limited data transfer speeds, and the risk of data loss with centralized processing. To address these issues, edge servers placed near network access points are suggested, enabling the transfer of computing and storage resources from the central unit to reduce response times. Nonetheless, edge servers have limited processing power due to their compact size. To overcome this limitation, lightweight deep learning techniques are utilized, compressing deep neural network models for execution on edge devices with limited resources while maintaining performance. The study explores the implementation of a lightweight deep learning model on Internet of Things edge devices. An optimized Yolo-based deep learning model is deployed for real-time pedestrian detection, with detection events transmitted to the edge server using the MQTT protocol. The simulation results demonstrate that the model can achieve real-time pedestrian detection, with a fast inference speed of 412 milliseconds and an accuracy of 78%, representing significant improvements over baseline models.
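The event path described in the abstract (edge device → MQTT → edge server) could look roughly like the following sketch. It is not the author's implementation: the broker hostname, topic name, and JSON payload fields are all assumptions, and the paho-mqtt publish.single helper is used only to keep the example short.

```python
# Minimal sketch of the detection-event path described in the abstract.
# Assumptions (not from the thesis): the paho-mqtt client library, a
# hypothetical topic "its/pedestrian/events", a hypothetical JSON payload,
# and a hypothetical broker hostname for the edge server.
import json
import time

import paho.mqtt.publish as publish

BROKER_HOST = "edge-server.local"   # hypothetical edge-server broker address
TOPIC = "its/pedestrian/events"     # hypothetical topic name


def publish_detection(num_pedestrians: int, confidence: float, inference_ms: float) -> None:
    """Publish one detection event with QoS 1 (at-least-once delivery)."""
    payload = json.dumps({
        "camera_id": "edge-camera-01",
        "timestamp": time.time(),
        "pedestrians": num_pedestrians,
        "confidence": confidence,
        "inference_ms": inference_ms,
    })
    # One connect-publish-disconnect cycle per event keeps the sketch short;
    # a long-lived mqtt.Client with loop_start() would avoid reconnecting.
    publish.single(TOPIC, payload, qos=1, hostname=BROKER_HOST, port=1883)


# Example: report one detected pedestrian whose inference took ~412 ms.
publish_detection(num_pedestrians=1, confidence=0.78, inference_ms=412.0)
```

In practice a persistent MQTT client connection would be preferable on the edge device; the one-shot helper is used here purely for brevity.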

Contents
Recommendation Letter
Approval Letter
Abstract in Chinese
Abstract in English
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
1.1 Background
1.2 Study Objectives
1.3 Study Scope and Limitations
1.4 Research Contributions
1.5 Research Organization
2 Literature Review
2.1 Pedestrian Detection
2.1.1 Handcrafted Features Based Pedestrian Detection
2.1.2 Deep Learning Based Pedestrian Detection
2.1.3 Artificial Intelligence of Things
2.1.4 Internet of Things Devices as Edge Server
3 System Model
3.1 Proposed Methods
3.2 Deep Learning Model
3.2.1 Model Backbone
3.2.2 Model Neck and Head
3.2.3 Dataset
3.3 MQTT Protocol for Internet of Things
3.4 Simulation Setup
3.4.1 Pedestrian Crossing Scenario
3.4.2 Deep Learning Model
3.4.3 MQTT Broker Setup
3.4.4 Deployment Diagram
4 Simulation and Results
4.1 Simulation
4.1.1 Simulation Results
5 Conclusion
References
Letter of Authority


Full text available from 2028/08/24 (campus network)
Full text available from 2028/08/24 (off-campus network)
Full text available from 2028/08/24 (National Central Library: Taiwan Doctoral and Master's Theses system)