
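Graduate student: 凃美綺
Mei-Ci Tu
Thesis title: SVM、RF 與 MLP 應用於腦中風預測之比較研究
Comparing with the Application of SVM, RF, and MLP in Stroke Prediction
Advisor: 黃世禎
Sun-Jen Huang
Oral defense committee: 劉俞志
Yu-Chih Liu
魏小蘭
Hsiao-Lan Wei
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2019
Academic year of graduation: 107
Language: Chinese
Pages: 58
Chinese keywords: 腦中風 (stroke), 支援向量機 (support vector machine), 隨機森林 (random forest), 多層感知器 (multilayer perceptron), 模型訓練與驗證流程 (model training and validation process)
English keywords: Stroke, SVM, Random Forest, MLP, Model Training and Verification process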
  • Stroke is the second leading cause of death worldwide. Beyond the physical harm to the patient, the care and treatment that follow place a heavy burden on families and society and seriously reduce the patient's later quality of life. Stroke can be prevented effectively only by paying attention to, and managing, stroke-related risk factors in everyday life. Machine learning and deep learning are now widely applied in medicine. In stroke research, however, most studies rely on imaging data from medical examinations and build image-based models. Such data are hard to obtain, and using the resulting model for prediction still requires the patient to visit a hospital for an examination. These models are therefore inconvenient in practice and consume additional medical resources. The examinations are an economic burden on the patient, increase the incidence of acute decline in renal function, and add to the national healthcare burden.
    This study uses health telephone survey data from the Behavioral Risk Factor Surveillance System (BRFSS) of the U.S. Centers for Disease Control and Prevention. Stroke-related risk factors compiled from the literature serve as the basis for variable selection. Stroke prediction models are built with a support vector machine (SVM), a random forest (RF), and a multilayer perceptron (MLP), together with a model training and validation process and a model evaluation process. The models are compared on each evaluation metric, and the best model is selected for each one: when accuracy matters most, the MLP performs best; when sensitivity matters most, the SVM performs best; when specificity matters most, the RF performs best. The modeling and evaluation approach proposed in this study can serve as a reference for future research, and the resulting models can be deployed in an information system.
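    The per-metric model selection described in the abstract can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions of accuracy, sensitivity, and specificity, with stroke as the positive class. The class `ConfusionCounts`, the helper `best_model_per_metric`, and all counts below are hypothetical and chosen only for illustration; they are not taken from the thesis and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with stroke as the positive class."""
    tp: int  # stroke cases predicted as stroke
    fn: int  # stroke cases predicted as no stroke
    tn: int  # non-stroke cases predicted as no stroke
    fp: int  # non-stroke cases predicted as stroke

    def accuracy(self) -> float:
        # Share of all cases that are classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    def sensitivity(self) -> float:
        # True positive rate: share of actual stroke cases that are caught.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # True negative rate: share of non-stroke cases correctly cleared.
        return self.tn / (self.tn + self.fp)


def best_model_per_metric(results):
    """Return {metric name: name of the model with the highest value}."""
    metrics = ("accuracy", "sensitivity", "specificity")
    return {
        m: max(results, key=lambda name: getattr(results[name], m)())
        for m in metrics
    }


# Made-up counts for illustration only; NOT results from the thesis.
hypothetical = {
    "SVM": ConfusionCounts(tp=80, fn=20, tn=600, fp=300),
    "RF": ConfusionCounts(tp=40, fn=60, tn=850, fp=50),
    "MLP": ConfusionCounts(tp=60, fn=40, tn=840, fp=60),
}

best = best_model_per_metric(hypothetical)
print(best)
```

    With these made-up counts, a different model wins on each metric, which is why the choice of metric has to follow the intended use: a screening setting that must not miss stroke cases would weight sensitivity, while one that must avoid false alarms would weight specificity.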

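    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Motivation
      1.3 Research Objectives
      1.4 Research Process
    Chapter 2. Literature Review
      2.1 Stroke
      2.2 Medical Research Related to Machine Learning and Deep Learning
      2.3 Machine Learning
        2.3.1 Support Vector Machine
        2.3.2 Random Forest
      2.4 Deep Learning
        2.4.1 Multilayer Perceptron
    Chapter 3. Research Method
      3.1 Research Steps
      3.2 Data Source
      3.3 Research Variables
      3.4 Data Preprocessing
      3.5 Model Training Architecture
      3.6 Model Evaluation Metrics
    Chapter 4. Analysis of Results
      4.1 Training and Validation Results of Each Model
        4.1.1 Results of the Evaluation Process
      4.2 Discussion and Analysis of Results
        4.2.1 Examination of Model Evaluation Metrics
        4.2.2 Findings
    Chapter 5. Conclusions and Recommendations
      5.1 Conclusions
      5.2 Discussion
        5.2.1 Comparison with Other Stroke Prediction Model Studies
      5.3 Research Contributions
        5.3.1 Academic Contributions
        5.3.2 Practical Contributions
      5.4 Research Limitations
      5.5 Future Recommendations
    References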

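    Full-text release date: 2022/07/29 (campus network)
    Full-text release date: 2024/07/29 (off-campus network)
    Full-text release date: 2024/07/29 (National Central Library: Taiwan theses and dissertations system)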