
Graduate Student: 陳銘翔 (Ming-Xiang Chen)
Thesis Title: 以深度學習和時頻訊號進行心率量測
Deep Learning on Time-Frequency Representation for Heart Rate Estimation
Advisor: 徐繼聖 (Gee-Sern Jison Hsu)
Committee Members: 林惠勇 (Huei-Yung Lin), 孫民 (Min Sun), 鍾聖倫 (Sheng-Luen Chung), 王鈺強 (Yu-Chiang Frank Wang)
Degree: Master
Department: College of Engineering, Department of Mechanical Engineering
Year of Publication: 2017
Graduation Academic Year: 105
Language: Chinese
Number of Pages: 80
Chinese Keywords: 心率 (heart rate), 深度學習 (deep learning), 時頻 (time-frequency)
English Keywords: heart rate, deep learning, time-frequency
  • We propose a method that applies a deep learning architecture to real-time heart rate measurement. The method consists of four steps. Step 1: detect the face in front of the camera and its landmarks with the CLNF algorithm. Step 2: from the facial regions framed by the landmarks, extract the mean color information and perform signal extraction with three different preprocessing schemes. Step 3: apply the Short-Time Fourier Transform to convert the extracted 1-D signals into 2-D time-frequency representations. Step 4: feed the time-frequency representations into a VGG network trained on a heart rate database to predict the heart rate. This thesis is among the first studies to apply deep learning networks to real-time heart rate measurement, and the proposed method is competitive with state-of-the-art non-contact, video-based heart rate measurement approaches.
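Step 2 above can be sketched in a few lines. This is a minimal illustration, not the thesis code: the ROI coordinates, the green-channel choice, and the synthetic drifting signal are assumptions for demonstration; only the second-order difference is one of the three preprocessing schemes actually named in the thesis.

```python
import numpy as np

def roi_color_means(frames, roi):
    """Average the green channel inside a facial ROI for each frame.
    `frames` is a sequence of H x W x 3 arrays; `roi` is (top, bottom, left, right)."""
    t, b, l, r = roi
    return np.array([f[t:b, l:r, 1].mean() for f in frames])

def second_order_difference(x):
    """Second-order difference preprocessing: removes linear trends
    (e.g. slow illumination drift) while keeping the periodic pulse."""
    return x[2:] - 2 * x[1:-1] + x[:-2]

# Toy example: a 1.2 Hz pulse riding on a slow linear drift, rendered
# as flat "frames" so the ROI mean recovers the signal exactly.
fs = 30.0
t = np.arange(0, 10, 1.0 / fs)
raw = 0.5 * t + np.sin(2 * np.pi * 1.2 * t)   # drift + pulse
frames = [np.full((40, 40, 3), v) for v in raw]
sig = roi_color_means(frames, (5, 35, 5, 35))
detrended = second_order_difference(sig)       # drift term cancels exactly
```

The second difference annihilates any linear component of the signal, which is why it is useful before spectral analysis of the pulse.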


    We propose a deep learning approach for measuring heart rates with an RGB camera. Our approach consists of four steps. In Step 1, we detect the face in front of the camera along with its landmarks using Conditional Local Neural Fields (CLNF). In Step 2, we process the sample means of the colors on different facial regions with three preprocessing schemes. In Step 3, the Short-Time Fourier Transform (STFT) is employed to convert the three 1-D processed color signals into 2-D Time-Frequency Representations (TFRs). Lastly, in Step 4, a VGG net trained on the TFRs extracted from a training set is exploited to estimate the heart rate of the face. Our approach is among the first to apply deep learning networks to heart rate estimation, and its performance is comparable to other state-of-the-art approaches in experiments on benchmark databases.
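Step 3, the STFT conversion from a 1-D signal to a 2-D TFR, can be sketched as follows. This is a minimal sketch, not the thesis implementation: the window length, hop size, Hann window, and the synthetic 72-bpm pulse are illustrative assumptions.

```python
import numpy as np

def stft_tfr(signal, fs, win_len=256, hop=16):
    """Turn a 1-D pulse signal into a 2-D magnitude time-frequency
    representation (TFR) with a windowed FFT."""
    window = np.hanning(win_len)
    frames = [np.abs(np.fft.rfft(signal[s:s + win_len] * window))
              for s in range(0, len(signal) - win_len + 1, hop)]
    tfr = np.array(frames).T                    # rows: frequency bins, cols: time
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return tfr, freqs

# Synthetic 72-bpm pulse (1.2 Hz) sampled at a typical 30-fps camera rate.
rng = np.random.default_rng(1)
fs = 30.0
t = np.arange(0, 20, 1.0 / fs)
sig = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)
tfr, freqs = stft_tfr(sig, fs)
peak_hz = freqs[tfr.mean(axis=1).argmax()]      # dominant frequency in Hz
```

In the thesis pipeline the resulting TFR image is what the VGG network consumes; here the dominant frequency bin simply confirms the pulse rate is recoverable from the representation (1.2 Hz ≈ 72 bpm, up to the FFT bin resolution of fs/win_len).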

    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Background and Motivation
        1.1.1 Optical Heart Rate Measurement
        1.1.2 Time-Frequency Representation
        1.1.3 Deep Learning
      1.2 Method Overview
      1.3 Contributions
      1.4 Thesis Organization
    Chapter 2 Literature Review
      2.1 Video-Based Heart Rate Measurement
        2.1.1 Color-Based Methods
        2.1.2 Motion-Based Methods
      2.2 Time-Frequency Analysis
        2.2.1 Fourier Transform
        2.2.2 Short-Time Fourier Transform
        2.2.3 Wavelet Transform
      2.3 Deep Learning
        2.3.1 LeNet-5
        2.3.2 AlexNet
        2.3.3 VGG
    Chapter 3 Proposed Method
      3.1 Overview of the Heart Rate Measurement Method
      3.2 Facial Landmark Detection and Tracking
        3.2.1 OpenFace Facial Landmark Detector
        3.2.2 Facial ROI Selection
      3.3 Temporal Signal Processing
        3.3.1 Second-Order Difference Method
        3.3.2 Detrended Fluctuation Analysis
        3.3.3 Chrominance-Based Method
      3.4 Deep Learning and Network Architecture
      3.5 Heart Rate Sample Training and Testing
    Chapter 4 System Implementation
      4.1 Video Sampling Design
      4.2 Facial Landmark Detection and Tracking
      4.3 Signal Processing and Analysis
    Chapter 5 Experimental Setup and Analysis
      5.1 Heart Rate Databases
        5.1.1 Self-Collected Database
        5.1.2 PURE Database
        5.1.3 MAHNOB-HCI Database
      5.2 Experimental Design
      5.3 Experimental Results and Analysis
        5.3.1 Skin Tone and Interference Frequency Analysis
        5.3.2 Majority Voting across Different Methods
        5.3.3 9-Fold Cross-Validation
        5.3.4 Effects of Different Facial Regions
        5.3.5 Comparison of Experimental Results
    Chapter 6 Conclusions and Future Work
      6.1 Conclusions
      6.2 Future Work
    Chapter 7 References

    [1] Baltrusaitis, Tadas, Peter Robinson, and Louis-Philippe Morency. "Constrained local neural fields for robust facial landmark detection in the wild." Proceedings of the IEEE International Conference on Computer Vision Workshops. 2013.
    [2] Oppenheim, Alan V., and Ronald W. Schafer. Discrete-time signal processing. Pearson Higher Education, 2010.
    [3] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
    [4] W. Verkruysse, L. O. Svaasand, and J. S. Nelson, "Remote plethysmographic imaging using ambient light," Optics Express, vol. 16, no. 26, pp. 21434–21445, 2008.
    [5] Chekmenev, Sergey Y., et al. "Multiresolution approach for noncontact measurements of arterial pulse using thermal imaging." Augmented vision perception in infrared. Springer London, 2009. 87-112.
    [6] Poh, Ming-Zher, Daniel J. McDuff, and Rosalind W. Picard. "Non-contact, automated cardiac pulse measurements using video imaging and blind source separation." Optics Express 18.10 (2010): 10762-10774.
    [7] M.-Z. Poh, D. J. McDuff, and R. W. Picard, "Advancements in noncontact, multiparameter physiological measurements using a webcam," IEEE Transactions on Biomedical Engineering, vol. 58, no. 1, pp. 7–11, 2011.
    [8] Soleymani, Mohammad, et al. "A multimodal database for affect recognition and implicit tagging." IEEE Transactions on Affective Computing 3.1 (2012): 42-55.
    [9] G. De Haan and V. Jeanne. Robust pulse rate from chrominance-based rPPG. IEEE Transaction on Biomedical Engineering, 60(10):2878–2886, 2013.
    [10] Monkaresi, Hamed, Rafael A. Calvo, and Hong Yan. "A machine learning approach to improve contactless heart rate monitoring using a webcam." Biomedical and Health Informatics, IEEE Journal of 18.4 (2014): 1153-1160.
    [11] Yu-Lun Liu, "Color and Motion Analysis of Facial Component for Pulse Detection," NTUST, 2014.
    [12] Li, Xiaobai, et al. "Remote heart rate measurement from face videos under realistic situations." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
    [13] Tulyakov, Sergey, et al. "Self-adaptive matrix completion for heart rate estimation from face videos under realistic conditions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
    [14] G. Balakrishnan, F. Durand, and J. Guttag, "Detecting pulse from head motions in video," in Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pp. 3430–3437, IEEE, 2013.
    [15] L. Shan and M. Yu, "Video-based heart rate measurement using head motion tracking and ICA," in Image and Signal Processing (CISP), 2013 6th International Congress on, 2013.
    [16] Lucas, Bruce D., and Takeo Kanade. "An iterative image registration technique with an application to stereo vision." IJCAI. Vol. 81. No. 1. 1981.
    [17] Brox, Thomas, et al. "High accuracy optical flow estimation based on a theory for warping." European conference on computer vision. Springer Berlin Heidelberg, 2004.
    [18] Pérez, Javier Sánchez, Enric Meinhardt-Llopis, and Gabriele Facciolo. "TV-L1 optical flow estimation." Image Processing On Line 2013 (2013): 137-150.
    [19] M. Uřičář, V. Franc, and V. Hlaváč, "Detector of facial landmarks learned by the structured output SVM," VISAPP, pp. 547–556, 2012.
    [20] Cristinacce, David, and Timothy F. Cootes. "Feature Detection and Tracking with Constrained Local Models." BMVC. Vol. 1. No. 2. 2006.
    [21] Saragih, Jason M., Simon Lucey, and Jeffrey F. Cohn. "Deformable model fitting by regularized landmark mean-shift." International Journal of Computer Vision 91.2 (2011): 200-215.
    [22] Baltrusaitis, Tadas, Peter Robinson, and Louis-Philippe Morency. "OpenFace: an open source facial behavior analysis toolkit." 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2016.
    [23] Wu, Hao-Yu, et al. "Eulerian video magnification for revealing subtle changes in the world." (2012).
    [24] LeCun, Yann, Bernhard Boser, et al. "Handwritten digit recognition with a back-propagation network." Advances in Neural Information Processing Systems. 1990.
    [25] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
    [26] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
    [27] Russakovsky, Olga, et al. "Imagenet large scale visual recognition challenge." International Journal of Computer Vision 115.3 (2015): 211-252.
    [28] Parkhi, Omkar M., Andrea Vedaldi, and Andrew Zisserman. "Deep face recognition." British Machine Vision Conference. Vol. 1. No. 3. 2015.
    [29] Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012).
    [30] Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "Facenet: A unified embedding for face recognition and clustering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
    [31] Taigman, Yaniv, et al. "Deepface: Closing the gap to human-level performance in face verification." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
    [32] Jia, Yangqing, et al. "Caffe: Convolutional architecture for fast feature embedding." Proceedings of the 22nd ACM international conference on Multimedia. ACM, 2014.
    [33] X. Zhu and D. Ramanan, "Face detection, pose estimation, and landmark localization in the wild," in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 2879–2886, IEEE, 2012.
    [34] B. D. Lucas, T. Kanade, et al., "An iterative image registration technique with an application to stereo vision," in IJCAI, vol. 81, pp. 674–679, 1981.
    [35] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert, "High accuracy optical flow estimation based on a theory for warping," in Computer Vision - ECCV 2004, pp. 25–36, Springer, 2004.
    [36] J. S. Pérez, E. Meinhardt-Llopis, and G. Facciolo, "TV-L1 optical flow estimation," Image Processing On Line, vol. 2013, pp. 137–150, 2013.
    [37] B. K. Horn and B. G. Schunck, "Determining optical flow," in 1981 Technical Symposium East, pp. 319–331, International Society for Optics and Photonics, 1981.
    [38] J.-Y. Bouguet, "Pyramidal implementation of the affine Lucas-Kanade feature tracker: description of the algorithm," Intel Corporation, vol. 5, 2001.
    [39] M. J. Black and P. Anandan, "The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields," Computer Vision and Image Understanding, vol. 63, no. 1, pp. 75–104, 1996.
    [40] A. L. Baggish and M. J. Wood, "Athlete's heart and cardiovascular care of the athlete: scientific and clinical update," Circulation, vol. 123, no. 23, pp. 2723–2735, 2011.
    [41] J.-F. Cardoso, "High-order contrasts for independent component analysis," Neural Computation, vol. 11, no. 1, pp. 157–192, 1999.
    [42] Stricker, Ronny, Sebastian Muller, and H-M. Gross. "Non-contact video-based pulse rate measurement on a mobile service robot." Robot and Human Interactive Communication, 2014 RO-MAN: The 23rd IEEE International Symposium on. IEEE, 2014.
    [43] Chen, You-Ran. "Deep Learning for Facial Attribute Identification." (2016).
