
Graduate student: 莊才賢 (Cai-Xian Zhuang)
Thesis title: 基於影像之血壓預測於3D-CNN二階段式學習 (Video-Based Blood Pressure Estimation Using Two-Step Learning in 3D-CNN)
Advisor: 蘇順豐 (Shun-Feng Su)
Oral defense committee: 陸敬互 (Ching-Hu Lu), 王文俊 (Wen-June Wang), 姚立德 (Leehter Yao), 莊鎮嘉 (Chen-Chia Chuang)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 62
Keywords: Noninvasive Measurement, 3D-CNN, Blood Pressure, Deep Learning, Two-Step Learning, Video-Based

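To make contactless blood pressure measurement more widely used, the model proposed in this thesis estimates blood pressure with an inexpensive, commonly available webcam. Measurement requires only facial video to estimate systolic and diastolic blood pressure, which makes the approach both convenient and cheap. In addition, the data used in this thesis cover a wider blood pressure range than those of most other papers, so a wider range of blood pressure can be estimated.
Unlike earlier papers that predict blood pressure from extracted signals, this thesis uses 3D convolution to extract features of spatial and temporal change, and modifies the kernel size of the original input layer to give the model a better receptive field over the data. It also proposes two-step learning: a classification model first divides blood pressure into stage 2 hypertension (systolic pressure >= 140 or diastolic pressure >= 90) and non-stage 2 hypertension, and a regression model for each category then determines the systolic and diastolic values, giving more accurate estimates within each range. When training the first-step classification model, we add stage 2 hypertension data to every batch, forcing the model to pay more attention to this group; the first-step classification accuracy reaches 94.16%. Likewise, to reduce large errors at extreme values, we change the exponent of the loss function used to train the second-step models, so that the regression models give more weight to learning the features of extreme values. We also change the video input: sampling the original video every three frames speeds up training and lets the model better learn the differences between frames. To make the model converge faster, we adjust the video augmentation as well, performing augmentation during training and enlarging the shift step so that the augmented data vary more. With these methods, on data not restricted to a particular environment, age, sex, or high, normal, or low blood pressure, this thesis achieves MAEs of only 7.57 mmHg and 5.95 mmHg for systolic and diastolic blood pressure, with RMSEs of 10.24 mmHg and 8.65 mmHg. The errors also meet the standard that AAMI sets for blood pressure monitors.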
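
The two-step scheme and the frame sampling described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's code: the function names, the callables standing in for the trained 3D-CNN models, and the choice to start sampling at the first frame are all assumptions. Only the threshold rule (SBP >= 140 or DBP >= 90) and the every-three-frames sampling come from the abstract.

```python
from typing import Callable, Sequence, Tuple

BloodPressure = Tuple[float, float]  # (systolic, diastolic) in mmHg


def is_stage2_hypertension(sbp: float, dbp: float) -> bool:
    """Label rule from the abstract: SBP >= 140 or DBP >= 90."""
    return sbp >= 140 or dbp >= 90


def sample_every_third_frame(frames: Sequence, step: int = 3) -> list:
    """Keep one frame out of every `step` frames.

    Starting at index 0 is an assumption; the abstract only says the
    video is sampled every three frames.
    """
    return list(frames[::step])


def two_step_estimate(
    clip,
    classifier: Callable[[object], bool],
    regressor_high: Callable[[object], BloodPressure],
    regressor_other: Callable[[object], BloodPressure],
) -> BloodPressure:
    """Step 1: classify the clip. Step 2: run the matching regressor.

    The three callables are hypothetical stand-ins for trained models.
    """
    regressor = regressor_high if classifier(clip) else regressor_other
    return regressor(clip)
```

With stub models in place of the trained networks, `two_step_estimate(clip, classifier, regressor_high, regressor_other)` returns the output of whichever regressor the classifier selects.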


To make contactless blood pressure monitoring more accessible, the proposed model estimates blood pressure with an ordinary webcam. The only input required is facial images, from which the model estimates systolic and diastolic blood pressure. The data used in this study also cover a broader range of blood pressure than most previous studies, so the model can estimate a broader range. Rather than relying on signal-based methods, this study uses a 3D convolutional neural network to extract features of spatial and temporal change. A two-step learning method is proposed to improve model performance. In the first step, a classification model separates blood pressure into stage 2 hypertension (SBP >= 140 or DBP >= 90) and all other cases. Then a separate regression model is trained for each of the two classes. When training the classification model, stage 2 hypertension data are added to each batch to force the model to pay more attention to them; with this change, the classification accuracy reaches 94.12%. In the second step, the power of the training loss function is also changed to make the model perform better. In addition, the video input is subsampled from the original video frames to increase the difference between frames, so that the model converges faster and learns better. During training, video augmentation uses a larger shift step so that the augmented data are more varied. With these methods, on data covering unconstrained environments, different ages, and a general range of blood pressure, the MAEs for systolic and diastolic blood pressure are 7.57 mmHg and 5.95 mmHg, and the RMSEs are 10.24 mmHg and 8.65 mmHg. These errors also meet the AAMI standard for blood pressure estimation.
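
The error metrics reported above, and the idea of changing the power of the loss, can be illustrated with a short sketch. It is written under assumptions: the abstract does not give the exponent or the exact form of the loss, so `power_loss` and its `power` parameter are hypothetical, shown only to make clear how a larger exponent weights large errors more heavily.

```python
import math
from typing import Sequence


def mae(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the inputs (mmHg here)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root mean squared error, in the same unit as the inputs."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def power_loss(pred: Sequence[float], target: Sequence[float], power: float = 2.0) -> float:
    """Mean of |error| ** power.

    Hypothetical loss form: power=1 gives MAE and power=2 gives MSE.
    A larger power makes large errors dominate the loss.
    """
    return sum(abs(p - t) ** power for p, t in zip(pred, target)) / len(pred)
```

For predictions `[120, 130, 150]` against targets `[118, 135, 140]`, the absolute errors are 2, 5, and 10 mmHg. The largest error contributes about 59% of the total at power 1 and about 78% at power 2, which is the sense in which a higher power emphasizes extreme values.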

Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Baseline Model
  1.4 Contribution
  1.5 Thesis Organization
Chapter 2 Related Work
  2.1 Literature Review
  2.2 Baseline Model Architecture
    2.2.1 3D convolutional kernels
    2.2.2 Model Architecture
  2.3 Data Augmentation
Chapter 3 Methodology
  3.1 System Overview
  3.2 Sampling
  3.3 Model
    3.3.1 Previous Attempt
    3.3.2 Two-Step Training
  3.4 Data Augmentation
    3.4.1 Peak Shift
    3.4.2 Modified Shift
  3.5 Training Optimization
    3.5.1 Batch Component Adjustment
    3.5.2 Modified Training Loss Function
Chapter 4 Experiments
  4.1 Dataset
    4.1.1 Experiment Setup
    4.1.2 Benchmark Dataset
  4.2 Evaluation Metric
    4.2.1 Cross Entropy
    4.2.2 RMSE
  4.3 Training and Testing Process
    4.3.1 Training Phase
    4.3.2 Testing Phase
  4.4 Implement Detail
    4.4.1 Hardware Environment
    4.4.2 Training Details and Hyper Parameters Setting
  4.5 Experiment Results
    4.5.1 Sampling
    4.5.2 Kernel Size Modification of the First layer
    4.5.3 Data Augmentation
    4.5.4 Two-Step Learning Method
    4.5.5 Comparison
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
Reference


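Full-text release date: 2026/09/08 (campus network)
Full-text release date: 2031/09/08 (off-campus network)
Full-text release date: 2036/09/08 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)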