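Investigate the Decision Rules of Examinees' Re-coming From a Cerebrovascular Health Checkup Dataset | National Taiwan University of Science and Technology Electronic Theses and Dissertations System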

Graduate student:	艾坦琉 Talitha - Octaviani
Thesis title:	Investigate the Decision Rules of Examinees' Re-coming From a Cerebrovascular Health Checkup Dataset (運用腦部健診資料探討民眾回診之決策法則)
Advisor:	歐陽超 Chao Ou-Yang
Oral defense committee:	郭人介 Ren-Jieh Kuo, 汪漢澄 Han-Cheng Wang
Degree:	Master
Department:	School of Management - Department of Industrial Management
Year of publication:	2015
Academic year of graduation:	103
Language:	English
Pages:	123
Keywords (Chinese, translated):	returning examinees, cerebrovascular health checkup, decision tree, particle swarm algorithm, K-means clustering
Keywords (English):	Examinees’ re-coming, Cerebrovascular Health Examination, Decision Tree, Particle Swarm Optimization, K-means clustering

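Because many people worry about being ill without knowing it, general health checkups should become a routine habit. As checkups become more common, hospitals accumulate a growing volume of checkup data; if data mining techniques are applied correctly to this large body of data, they can effectively help hospitals improve the quality of care. In general health checkups, an open question is whether an examinee will return. This study takes checkup data and first performs feature extraction with particle swarm optimization, then groups the examinees with K-means clustering; finally, the rules derived from a decision tree can help hospitals follow up with, and market to, examinees who are likely to return.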
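The clustering step named in this abstract can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. The two features (age and systolic blood pressure), the six data points, and the function name `kmeans` are hypothetical and serve only as an illustration; they are not taken from the thesis, whose own code is in Matlab (Appendix A).

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (labels, centroids). Initial centroids are k data points
    drawn at random, which is one common choice among several.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Hypothetical examinees as (age, systolic blood pressure) pairs.
examinees = [(35, 118), (38, 121), (41, 117), (67, 152), (70, 149), (72, 158)]
labels, centroids = kmeans(examinees, k=2)
print(labels)
print(centroids)
```

On real checkup data the features would normally be scaled first, since K-means is sensitive to the units of each attribute.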

Many people are sick without knowing it, so regular health checkups should be a standard part of every person's health routine. In this context, a preventive medical checkup aims to prevent cerebrovascular disease, and it can be understood as one of the conditions that lead examinees to come back, besides treatment for the disease. The diagnosis established for an examinee often determines the recommendation made to that examinee. Health examination data provide the information needed to predict accurately whether an examinee will come back, together with a diagnosis of the examinee's condition. An examinee whose condition matches the criteria of a rule is predicted to come back; if the rule also has an ABD score, the examinee is indicated as vulnerable to cerebrovascular disease and has a higher chance of returning to the hospital, either for another checkup to prevent the disease or for treatment of it. Knowledge discovery has the potential to extract useful information from a dataset. The decision tree method is a data mining (DM) approach that generates a tree which can be converted into comprehensive rules. In generating the rules, the problem is that the more attributes are used, the more complex the rules become. In this research, a metaheuristic approach, Particle Swarm Optimization (PSO), is applied to maximize the efficiency of the generated rules: it performs feature selection in order to minimize the error of the tree.
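The idea of using PSO to pick a feature subset that minimizes classification error can be sketched as a binary PSO wrapper in plain Python. This is an illustration under stated assumptions, not the thesis's Matlab implementation: to stay self-contained, the fitness function is the leave-one-out error of a 1-nearest-neighbour classifier, used here as a stand-in for the cross-validated error of a CART tree. The function names, parameter values (`w`, `c1`, `c2`, swarm size), and the synthetic dataset are all hypothetical.

```python
import math
import random


def loo_error(X, y, mask):
    """Leave-one-out error of a 1-NN classifier on the features where mask == 1.

    Stand-in fitness (assumption): the thesis minimizes decision tree error.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # an empty feature subset is treated as the worst case
    wrong = 0
    for i in range(len(X)):
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in cols),
        )
        wrong += y[nearest] != y[i]
    return wrong / len(X)


def bpso_select(X, y, n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the features."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_err = [loo_error(X, y, p) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = pbest[g][:], pbest_err[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(d):
                # Velocity update: inertia + pull toward personal and global best.
                vel[i][j] = (
                    w * vel[i][j]
                    + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                    + c2 * rng.random() * (gbest[j] - pos[i][j])
                )
                vel[i][j] = max(-4.0, min(4.0, vel[i][j]))  # clamp velocity
                # Sigmoid of the velocity gives the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            err = loo_error(X, y, pos[i])
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
    return gbest, gbest_err


# Synthetic data: feature 0 tracks the label, features 1-3 are pure noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [[lab + data_rng.gauss(0, 0.2)] + [data_rng.gauss(0, 1) for _ in range(3)] for lab in y]

mask, err = bpso_select(X, y)
print("selected feature mask:", mask, "error:", err)
```

Swapping `loo_error` for a function that trains a decision tree on the masked columns and returns its validation error would bring the sketch closer to the PSO plus decision tree combination the abstract describes.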

Master’s Thesis Recommendation Form
Qualification Form by Master’s Degree Examinations Committee
Abstract (Chinese)
ABSTRACT
ACKNOWLEDGEMENTS
CONTENTS
LISTS OF TABLES
LISTS OF FIGURES
CHAPTER 1	INTRODUCTION
1	Research Background
2	Research Objectives
3	Scope and Constraints
3.1	Scopes
3.2	Constraints
4	Research framework
CHAPTER 2	LITERATURE REVIEW
1	Cerebrovascular Disease
2	Cerebrovascular Parameter
3	State of The Art
4	Data Mining and Knowledge Discovery
4.1	Knowledge Representation
4.1.1	Prediction Rules
4.2	Clustering (K-means Clustering)
4.2.1	Determining the Number of Cluster (Elbow Method)
4.3	Data Mining
4.3.1	Classification
4.3.2	Classifier Performance Evaluation (k-fold cross-validation)
4.3.3	Classification Model Validation (Error Rate)
4.3.4	Decision Tree Induction (CART)
4.3.5	Rule-Extraction from a Decision Tree
5	Metaheuristic
5.1	Particle swarm optimization algorithm (PSO)
5.2	Binary Particle Swarm Optimization (BPSO)
5.3	Particle Swarm Optimization for Feature Selection
CHAPTER 3	METHODOLOGY
1	Data collection
2	Data Pre-processing
2.1	Feature Reduction
2.2	Remove the Outliers
2.3	Data Categorization
2.4	Clustering Data (K-means clustering)
3	Data Processing
3.1	Particle Swarm Optimization
3.2	Decision Tree Algorithm (CART)
CHAPTER 4	IMPLEMENTATION
1	Data Analysis
2	Data Preprocessing
3	Data Processing
3.1	Discovering the Decision Model for Examinees’ Re-coming
4	Data Post-processing
4.1	Discovering the Decision Rules for Examinees’ Re-coming
4.2	Defining Cluster and Rules Characteristic and ABD Score for each rule
5	Discussion and Analysis
5.1	PSODT Analysis
5.2	Decision Rules Result
5.3	Predict The Examinee’s Re-coming
CHAPTER 5	CONCLUSION AND FUTURE RESEARCH
1	Conclusion
2	Contributions
3	Future Research
REFERENCES
APPENDIX A. Matlab Source Code
APPENDIX B. Best Particles Result
APPENDIX C. Lowest Error Decision Tree

                                

