
Author: 宗則綱
Ze-gang Zong
Thesis title: 結合基因、Apriori演算法建立健檢資料屬性關聯規則之研究─以頸動脈病變資料為例
Applying genetic and Apriori algorithms to discover the association rules based on a Carotid health exam dataset
Advisor: 歐陽超
Chao Ou-Yang
Oral examination committee: 郭人介
Ren-jieh Kuo
汪漢澄
Han-cheng Wang
Degree: Master
Department: College of Management - Department of Industrial Management
Year of publication: 2014
Academic year of graduation: 102 (ROC calendar)
Language: Chinese
Number of pages: 73
Keywords (Chinese): 頸動脈病變, 屬性篩選, 基因演算法, 關聯規則演算法, K-Means
Keywords (English): Carotid disease, Feature selection, Genetic algorithm, K-Means, Association rules
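      Cerebrovascular disease has long been one of the leading causes of death in the country. In recent years, despite advances in medical technology, changes in lifestyle and diet have caused the number of patients to rise rather than fall, and the disease has climbed year by year in the cause-of-death rankings. The result is not only higher medical spending; the costs of ongoing care and the costs to society are larger still. Only early detection and prevention can reduce the burden that cerebrovascular disease places on society and the nation.
      A relatively effective way to detect cerebrovascular disease is an imaging examination of the neck, but this test is not part of a general health examination. It takes additional time and carries an extra fee, which lowers people's willingness to be tested, so the chance for timely treatment and prevention is missed.
      This study therefore cooperated with a medical center in northern Taiwan and applied data mining to the health examination data it provided, combining a genetic algorithm, K-Means clustering, and the Apriori association rule algorithm, in order to identify the general health examination items most strongly associated with carotid disease. The aim is to let patients learn from routine examination items whether they are potential cerebrovascular disease patients and to help physicians assess a patient's condition, so that, on the principle that prevention is better than cure, the incidence of stroke and the large medical and social costs that follow from it can be reduced.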


      Cerebrovascular disease has long been one of the leading causes of death. In recent years, despite advances in medical technology, changes in people's living and eating habits have caused the number of patients with cerebrovascular disease to increase rather than decrease, and its rank among causes of death has risen steadily. Only early detection and prevention can reduce the burden that cerebrovascular disease places on a country.
      The best way to detect cerebrovascular disease is a cerebrovascular ultrasound examination, but this is not included in the general health examination. As a result, most people skip it and miss the opportunity for treatment.
      This research obtained a health examination database through cooperation with a medical center in Taipei. We use data mining techniques to identify factors in the health examination data, and relationships among them, that are associated with carotid disease.
      Finally, we hope that these factors from the health examination database can be used to identify potential patients with carotid disease.
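
      The record names Apriori as the association rule step of the thesis. The sketch below is not the author's implementation; it is a minimal, pure-Python illustration of how Apriori finds frequent itemsets and derives rules from discretized attributes. The attribute names (`bp`, `ldl`, `carotid`), the five toy records, and both thresholds are hypothetical, and the genetic algorithm and K-Means stages that precede Apriori in the thesis are left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level-1 candidates: every single item that appears in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent,
                                  support, confidence))
    return found


# Hypothetical toy records: one set of discretized attributes per person.
records = [
    {"bp=high", "ldl=high", "carotid=abnormal"},
    {"bp=high", "ldl=high", "carotid=abnormal"},
    {"bp=high", "ldl=normal", "carotid=abnormal"},
    {"bp=normal", "ldl=high", "carotid=normal"},
    {"bp=normal", "ldl=normal", "carotid=normal"},
]

frequent = apriori(records, min_support=0.4)
found_rules = association_rules(frequent, min_confidence=0.9)

for antecedent, consequent, support, confidence in found_rules:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={support:.2f} confidence={confidence:.2f}")
```

      In a setting like the one the abstract describes, the rules of interest would be those whose consequent is the carotid outcome, so a caller would typically filter `found_rules` on the consequent before interpreting them.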

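    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Objectives
      1.3 Research Issues
      1.4 Significance
    Chapter 2. Literature Review
      2.1 Cerebrovascular Disease
        2.1.1 Symptoms and Classification of Cerebrovascular Disease
        2.1.2 Factors Influencing Cerebrovascular Disease
      2.2 Data Mining
      2.3 Genetic Algorithm
      2.4 K-Means Clustering
      2.5 Apriori Association Rule Algorithm
    Chapter 3. Research Method
      3.1 Research Framework and Process
      3.2 Data Preprocessing
      3.3 Attribute Selection and Clustering
        3.3.1 Genetic Algorithm
        3.3.2 K-Means Clustering
      3.4 Apriori Association Rule Algorithm
      3.5 Result Analysis and Evaluation
    Chapter 4. Case Study and Experimental Results
      4.1 Introduction to the Case Data
      4.2 Data Preprocessing
        4.2.1 Data Extraction
        4.2.2 Data Transformation
      4.3 Parameter Settings
      4.4 Experimental Results
        4.4.1 Rule Extraction and Interpretation
        4.4.2 Comparative Analysis and Results
        4.4.3 Convergence of K-Means Measured by SSE
    Chapter 5. Conclusions and Recommendations
      5.1 Conclusions
      5.2 Research Limitations and Future Recommendations
    References
    Appendix: Association Rule Results of the Algorithms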


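    Full text release date: 2019/06/20 (campus network)
    Full text release date: full text not authorized for release (off-campus network)
    Full text release date: full text not authorized for release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)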