Graduate student: | 王聖鈞 Sheng-Chun Wang |
---|---|
Thesis title: | 以Haar特徵分類及CNN迴歸分析之深度學習用於微笑表情偵測研究 A Study on Deep Learning of Smile Detection using Haar-like Features and CNN-Based Regression Analysis |
Advisors: | 胡國瑞 Kuo-Jui Hu, 孫沛立 Pei-Li Sun |
Oral defense committee: | 胡國瑞 Kuo-Jui Hu, 孫沛立 Pei-Li Sun, 徐明景 Ming-Ching Shyu |
Degree: | Master |
Department: | College of Applied Sciences - Graduate Institute of Color and Illumination Technology |
Year of publication: | 2023 |
Academic year of graduation: | 111 |
Language: | Chinese |
Pages: | 91 |
Chinese keywords: | Haar特徵、AdaBoost學習演算、CNN卷積神經網路、TensorFlow Keras演算、深度學習、微笑偵測 |
English keywords: | Haar-like features, AdaBoost learning algorithm, convolutional neural network, TensorFlow Keras, deep learning, smile detection |
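Smiling is one of the important expressions of human psychology, and smile detection begins with face recognition. This study uses both a machine-learning model trained on Haar-like feature classification and a deep-learning convolutional neural network (CNN) architecture to detect and recognize faces and smiles. It examines how parameter changes and condition settings in each model affect smile-detection accuracy, in order to determine the conditions under which a model achieves the best real-time facial smile detection. As more resources are invested in artificial intelligence (AI) and the technology evolves, detection speed and accuracy are expected to improve further.
The thesis consists of two parts. The first part uses the AdaBoost iterative learning algorithm: a new weak classifier is added in each round of sample training, and the trained weak classifiers are then combined into a strong classifier. This strong classifier uses programmed Haar-like features as the key visual features for face detection and discards most of the unimportant background regions of the image to reduce image-processing time. The image is then converted to the HSV color space so that skin color can serve as an auxiliary cue, and different detection parameters, image-height filtering ratios, and smile attribute classes are set to improve face and smile detection. The resulting detection precision reaches 82.3%. In the second part, after face detection, a deep-learning convolutional neural network is combined with polynomial regression analysis in the TensorFlow Keras API to fit the smile curvature and estimate the degree of smiling. Smile images are labeled and preprocessed, used for training, and the trained model is then applied to facial smile detection. As the number of training images increases, accuracy reaches 97.7%. The results therefore show that the newly revised CNN combined with the TensorFlow Keras deep-learning system recognizes facial smiles better than the AdaBoost machine-learning architecture, with precision more than 15% higher. The research can be applied to medical footprint tracking, sentiment analysis, national defense and the military, interpersonal interaction in the service industry, and smart manufacturing, and has considerable practical value for the future.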
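The boosting step described above (one weak classifier added per training round, then a weighted combination into a strong classifier) can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a single scalar feature per sample instead of Haar-like features, uses threshold "stumps" as the weak classifiers, and the toy data `XS`/`YS` and all function names are hypothetical.

```python
import math

# Hypothetical toy data: one scalar feature per sample, labels in {+1, -1}.
# No single threshold separates these labels, so boosting is needed.
XS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def stump_predict(x, threshold, polarity):
    """Weak classifier: vote `polarity` if x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Add one weak classifier per round; return [(alpha, threshold, polarity)]."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest, renormalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def strong_predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted sum of weak votes."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1
```

On this toy data a single stump necessarily misclassifies some samples, while a few boosted rounds combine into a classifier that fits all ten. A cascade face detector applies the same idea with Haar-like feature responses in place of the scalar feature.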
Smiling is an important indicator of a person's mental state, and smile detection starts with face recognition. In this study, a machine-learning model trained on Haar-like feature classification and a deep-learning convolutional neural network (CNN) framework were used to detect and recognize faces and smiles. The effects of different parameters and conditions on smile-detection accuracy were investigated to identify the settings that give the best real-time facial smile detection.
The thesis is divided into two parts. The first part uses the AdaBoost iterative learning algorithm, which adds a new weak classifier in every round of sample training and then combines the trained weak classifiers into a strong classifier. This strong classifier uses modified Haar-like features as the main visual features for face detection. Unimportant background blocks of the image are removed, which speeds up image processing and computation. The image is converted to the HSV color space to help distinguish skin color. The effects of different detection parameters, conditions, image size ratios, and smile grades on face and smile detection accuracy were studied. In the first part, smile-detection accuracy reached up to 82.3%. In the second part, a deep-learning CNN combined with the TensorFlow Keras API was used to estimate the degree of a facial smile through polynomial regression of the smile curvature. Facial images were labeled and used for training, and the trained model was applied to smile detection; as the number of training images increased, the overall smile-recognition accuracy rose to 97.7%. The results show that the CNN-based model is 15% more accurate in smile recognition than the Haar-based model. This research can be applied in areas such as medical footprint tracking, sentiment analysis, national defense and military affairs, human emotional responses, and smart manufacturing, and the technology is expected to have practical value in the future.
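To make the idea of measuring a smile through polynomial regression of its curvature concrete, here is a minimal sketch. It is an assumption-laden stand-in rather than the thesis's method: the thesis performs the regression with a CNN and the TensorFlow Keras API, whereas this sketch fits a quadratic by ordinary least squares to hypothetical mouth-contour points. It assumes coordinates with y pointing upward (image coordinates usually point downward, which would flip the sign), and the function names are illustrative.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c).

    Solves the 3x3 normal equations by Gauss-Jordan elimination.
    Needs at least three points with distinct x values.
    """
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # sums of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # sums of y*x^k
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    for col in range(3):
        # Partial pivoting: move the row with the largest pivot into place.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def smile_curvature(points):
    """Signed curvature of the fitted parabola at its vertex, which is 2*a.

    With y pointing up, a positive value means the mouth corners rise
    above the center (smile-like); a negative value means they fall.
    """
    a, _, _ = fit_quadratic(points)
    return 2.0 * a


# Hypothetical mouth-contour points (x, y), y axis pointing up.
smiling = [(x, 0.5 * x * x - 1.0) for x in (-2, -1, 0, 1, 2)]
frowning = [(x, -0.25 * x * x) for x in (-2, -1, 0, 1, 2)]
```

Comparing `smile_curvature(smiling)` with `smile_curvature(frowning)` gives values of opposite sign, and the magnitude can serve as a rough "degree of smile" score under the stated assumptions.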