
Author: Palmy Rawinda Meliala
Thesis title: Comparison of Machine Learning Algorithms for Survivability and Expenditure of Lung Cancer Patients
Advisor: Kung-Jeng Wang (王孔政)
Committee members: Ming Huang Jiang (蔣明晃), Ming Xiu Luo (羅明琇)
Degree: Master
Department: College of Management - Department of Industrial Management
Publication year: 2021
Academic year of graduation: 109
Language: English
Number of pages: 89
Keywords: Cox proportional hazard, decision tree, trend analysis, Kaplan-Meier estimation, lung cancer, machine learning, medical expenditure, survivability

Lung cancer survival in Taiwan has remained low over the past three decades, with a five-year survival rate of about 15 percent. Prognosis in the early stages, before the disease becomes more severe, is therefore critical. Moreover, lung cancer treatment is costly: it ranks fourth among medical expenses in Taiwan. Predicting life expectancy together with the associated cost is needed to improve the survival rate and to balance the treatment cost for lung cancer patients. Life expectancy and cost information can help hospitals and doctors give lung cancer patients the most effective treatment. In this research, we compare machine learning algorithms, using ten-fold cross-validation and feature selection, for predicting the survivability and medical expenditure of lung cancer patients in three periods ("stages") defined by year of death: stage I (2000-2003), stage II (2004-2010), and all-stage (2000-2010). The Cox proportional hazards method is used to determine the best chronic-disease predictor variables. Trend analysis is carried out with Kaplan-Meier estimation for survival and with linear regression to identify which prior-utilization variables have a direct impact on medical expenditure. The results show that the decision tree performs best at predicting lung cancer patients' survivability and expenditure: with or without feature selection, its accuracy, recall, and F1-score are above 90% in every stage.
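As a rough illustration of the Kaplan-Meier estimation mentioned in the abstract, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the toy data are assumptions made for illustration only; they are not taken from the thesis, and this record does not describe how the thesis implemented the estimator.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time per patient (hypothetical unit, e.g. months)
    events:    1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # everyone (dead or censored) leaving at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the share of the risk set
            # that survives past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Patients censored at t are counted as at risk at t, then removed.
        at_risk -= exits[t]
    return curve


# Toy data invented for illustration (not from the thesis).
toy_durations = [2, 3, 3, 5, 8, 8, 12]
toy_events = [1, 1, 0, 1, 0, 1, 0]

for time, prob in kaplan_meier(toy_durations, toy_events):
    print(f"t={time}: S(t)={prob:.3f}")
```

The sketch follows the usual convention that deaths at a given time are processed before censorings at the same time. An established survival-analysis library would normally be preferable in practice, since it also provides confidence intervals and plotting.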

Abstract (Chinese)
Abstract
Acknowledgement
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Literature review
  2.1 Risk Adjustment Factors
  2.2 Risk Factor of Lung Cancer
  2.3 Kaplan Meier Estimation
  2.4 Cox Proportional Hazard Model
  2.5 Machine Learning Models
    2.5.1 Support Vector Machine
    2.5.2 Random Forest
    2.5.3 K-Nearest Neighbors Algorithm
    2.5.4 Naïve Bayes Network
    2.5.5 Decision Tree
    2.5.6 Logistic Regression
    2.5.7 Recurrent Neural Network
  2.6 Performance Measures
Chapter 3. Method
  3.1 Data Source
  3.2 Predicting Factors
  3.3 Stage Definition
  3.4 Experimental Method
Chapter 4. Experiment results and discussions
  4.1 Stage 1 (2000-2003)
  4.2 Stage II (2004-2010)
  4.3 All Stage (2000-2010)
  4.4 Trend Analysis
    4.4.1 Trend Analysis of Survivability
    4.4.2 Treatment and Medical Cost
    4.4.3 Trend Analysis of Medical Cost
  4.5 Discussion
Chapter 5. Conclusions and Future research
References
Appendix A. Resulting in Cox Proportional Hazard
Appendix B. Medical Expenditure Based in Treatment
Appendix C. Medical Expenditure Based in Death of Year


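Full-text release date: 2024/06/30 (on-campus network)
Full-text release date: 2024/06/30 (off-campus network)
Full-text release date: 2024/06/30 (National Central Library: Taiwan theses and dissertations system)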