Graduate student: | 莊才賢 CAI-XIAN ZHUANG |
---|---|
Thesis title: | 基於影像之血壓預測於3D-CNN二階段式學習 Video-Based Blood Pressure Estimation Using Two-Step Learning in 3D-CNN |
Advisor: | 蘇順豐 Shun-Feng Su |
Oral defense committee: | 陸敬互 Ching-Hu Lu, 王文俊 Wen-June Wang, 姚立德 Leehter Yao, 莊鎮嘉 Chen-Chia Chuang |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2021 |
Academic year of graduation: | 109 |
Language: | English |
Number of pages: | 62 |
Keywords (Chinese): | 無侵入計算, 3D卷積網路, 血壓, 深度學習, 二階段式學習, 基於影片影像 |
Keywords (English): | Noninvasive Measurement, 3D-CNN, Blood Pressure, Deep Learning, Two-Step Learning, Video-Based |
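To make contactless blood pressure measurement more widely used, the model proposed in this thesis estimates blood pressure with an inexpensive, common webcam and needs only facial video to estimate systolic and diastolic blood pressure, which makes it both convenient and cheap. In addition, the data used in this thesis cover a wider blood pressure range than those in most prior studies, so the model can estimate a wider range of blood pressure.
Unlike previous work that predicts blood pressure from extracted signals, this thesis uses 3D convolution to extract spatial and temporal features and modifies the kernel size of the original input layer to give the model a better receptive field over the data. It also proposes two-step learning: a classification model first divides samples into stage 2 hypertension (SBP >= 140 or DBP >= 90) and non-stage 2 hypertension, and a regression model for each category then estimates the systolic and diastolic values, giving more accurate estimates within each range. When training the first-step classification model, we add stage 2 hypertension data to every batch, forcing the model to pay more attention to this group, so that the first-step classification accuracy reaches 94.16%. Similarly, to reduce large errors on extreme values, we change the exponent of the loss function used to train the second-step models so that the regression models place more weight on learning the features of extreme values. We also change the video input by sampling every third frame of the original video, which speeds up training and lets the model better learn the differences between frames. To make the model converge faster, we adjust the video augmentation: augmentation is performed during training, and its shift step is enlarged so that the augmented data vary more. With these methods, on data with unconstrained environments, ages, and genders and with high, normal, and low blood pressure, this thesis achieves MAEs of 7.57 mmHg and 5.95 mmHg for systolic and diastolic blood pressure and RMSEs of 10.24 mmHg and 8.65 mmHg. The errors also meet the AAMI standard for blood pressure monitors.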
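The frame sampling and the modified loss exponent described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names are hypothetical, the loss is assumed to take the form of a mean of absolute errors raised to a power, and the exponent value used in the thesis is not stated in the abstract.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_every_nth(frames: Sequence[T], step: int = 3) -> List[T]:
    """Keep one frame out of every `step` frames.

    step=3 follows the abstract's "every third frame"; the function name
    and signature are hypothetical.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(frames[::step])


def power_loss(y_true: Sequence[float], y_pred: Sequence[float],
               power: float = 2.0) -> float:
    """Mean of |error| ** power (assumed form of the loss).

    power=2 gives the usual mean squared error. A larger power weights
    large errors more heavily; the value used in the thesis is not given
    in the abstract, so any choice here is an assumption.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) ** power for t, p in zip(y_true, y_pred))
    return total / len(y_true)
```

For example, a 10 mmHg error on one of two samples contributes 50.0 to the loss at power 2 but 5000.0 at power 4, which shows how raising the exponent shifts the training signal toward the samples with the largest errors.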
To make contactless blood pressure monitoring more accessible, the proposed model uses an ordinary webcam to estimate blood pressure. The only input required is facial video, from which the model estimates systolic and diastolic blood pressure. Moreover, the data used in this study cover a broader range of blood pressure than those in most previous studies, so the model can estimate a broader range. This study uses a 3D convolutional neural network to extract features of spatial and temporal change rather than using signal-based methods. In addition, a two-step learning method is proposed to improve model performance. In the first step, a classification model classifies blood pressure into stage 2 hypertension (SBP >= 140 or DBP >= 90) and others. Then, a separate regression model is trained for each of the two classes. When training the classification model, stage 2 hypertension data are added to each batch to force the model to pay more attention to them. With this approach, the classification accuracy reaches 94.12%. The power of the training loss function is also changed in the second step to make the model perform better. In addition, the video input is subsampled from the original video frames to increase the difference between frames, so that the model converges faster and learns better. During training, video augmentation uses a larger shift step to make the augmented data more varied. With these methods, the MAEs of systolic and diastolic blood pressure on data covering uncontrolled environments, different ages, and a general range of blood pressure are 7.57 mmHg and 5.95 mmHg, and the RMSEs are 10.24 mmHg and 8.65 mmHg. The errors for blood pressure estimation also meet the AAMI standard.
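The two-step procedure and the reported error metrics can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the classifier and the two regressors are passed in as plain callables (hypothetical stand-ins for the trained 3D-CNN models), and all function names are invented for the example. Only the stage 2 threshold (SBP >= 140 or DBP >= 90) comes from the abstract.

```python
import math
from typing import Any, Callable, Sequence, Tuple

BloodPressure = Tuple[float, float]  # (SBP, DBP) in mmHg


def is_stage2_hypertension(sbp: float, dbp: float) -> bool:
    """Label rule from the abstract: SBP >= 140 or DBP >= 90."""
    return sbp >= 140 or dbp >= 90


def two_step_estimate(
    clip: Any,
    classify: Callable[[Any], bool],
    regress_stage2: Callable[[Any], BloodPressure],
    regress_other: Callable[[Any], BloodPressure],
) -> BloodPressure:
    """Step 1: classify the clip; step 2: use the regressor for that class.

    The three callables are hypothetical placeholders for trained models.
    """
    if classify(clip):
        return regress_stage2(clip)
    return regress_other(clip)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    mean_sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)
```

Routing each clip to a range-specific regressor means each regression model only has to fit a narrower span of blood pressure values, which is the motivation the abstract gives for the two-step design.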