
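Author: 黃正哲
Jheng-Jhe Huang
Thesis title: 運用不同資料分群法於一般腦部健檢之關聯探勘─以頸動脈病變資料為例
Applying Different Clustering Algorithms to Discover Association Rules of the Brain Health Examination Database─A Case Study on Carotid Disease
Advisor: 歐陽超
Chao Ou-Yang
Committee members: 郭人介
Ren-Jieh Kuo
汪漢澄
Han-Cheng Wang
Degree: Master
Department: College of Management - Department of Industrial Management
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: Chinese
Number of pages: 88
Keywords (Chinese, translated): brain health examination, carotid disease, self-organizing map network, PSKO, DBSCAN, discretization of continuous attributes, association rule algorithm
Keywords (English): SOM, Association rules algorithm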
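  • In recent years, cerebrovascular disease has consistently ranked high among the ten leading causes of death in Taiwan. Despite advances in medical technology, changes in lifestyle and dietary habits mean that the number of people with chronic diseases keeps rising. This requires enormous medical expenditure and is also one of the main causes of the growing number of people with disabilities.
    Although Taiwan's national health insurance system is sound and encourages regular health examinations, assessing atherosclerosis still requires carotid ultrasound or magnetic resonance imaging, and carotid imaging examinations in Taiwan currently incur an additional out-of-pocket fee. This lowers people's willingness to undergo further examination, so opportunities for timely treatment and prevention are missed.
      This study cooperates with a medical center in northern Taiwan to obtain brain health examination data, and applies data mining methods to cluster the data, discretize it, and search for association rules. The goal is to find, among general health examination items, the attribute value intervals that are highly associated with carotid disease. The association rules found have a lift greater than 1.17 (indicating positive correlation) and a confidence above 60% in every case. These rules are used to build a carotid threshold early-warning system that helps physicians use general health examination data to judge whether an examinee belongs to the high-risk group for carotid disease, so that those at high risk can receive prevention and treatment as early as possible, reducing medical expenditure and improving quality of life.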
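The lift and confidence thresholds quoted in the abstract (lift above 1.17, confidence above 60%) can be illustrated with a minimal sketch. The item names (`sbp=high`, `carotid=lesion`) and the ten toy records are hypothetical and are not taken from the thesis data; only the two thresholds come from the abstract.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    if n == 0:
        return 0.0, 0.0, 0.0
    n_a = sum(1 for t in transactions if a <= t)          # records containing the antecedent
    n_c = sum(1 for t in transactions if c <= t)          # records containing the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)   # records containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0               # P(consequent | antecedent)
    lift = confidence / (n_c / n) if n_c else 0.0         # confidence / P(consequent)
    return support, confidence, lift


# Hypothetical, already-discretized examination records (not from the thesis).
records = (
    [{"sbp=high", "carotid=lesion"}] * 3
    + [{"sbp=high", "carotid=normal"}] * 1
    + [{"sbp=normal", "carotid=lesion"}] * 2
    + [{"sbp=normal", "carotid=normal"}] * 4
)

support, confidence, lift = rule_metrics(records, {"sbp=high"}, {"carotid=lesion"})

# Thresholds as stated in the abstract: lift > 1.17 and confidence > 60%.
keep_rule = lift > 1.17 and confidence > 0.60
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f} keep={keep_rule}")
```

In this toy data, 3 of the 4 records with `sbp=high` also contain `carotid=lesion`, so the confidence is 0.75. The lesion appears in 5 of 10 records overall, so the lift is 0.75 / 0.5 = 1.5, and the rule passes both thresholds.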


    Cerebrovascular diseases, or strokes, have long been regarded as a major threat to health in Taiwan and worldwide. Detecting stroke through brain imaging examinations such as Magnetic Resonance Imaging (MRI) is costly. Data mining has been widely used to analyze many types of data, including medical data. In this research, data mining is used to find association rules for stroke. The purpose of this study is to generate high-confidence association rules from patients' cerebrovascular examination results for use as a reference.
    This research uses a cerebrovascular health examination dataset from a local medical center in Taiwan. Three clustering approaches are applied first: Particle Swarm K-means Optimization (PSKO), Self-Organizing Map Neural Network (SOM), and Density-based Spatial Clustering of Applications with Noise (DBSCAN). In the next stage, a K-means approach is used to discretize the continuous features in the dataset. Finally, the Apriori method is applied to identify association rules in each discretized cluster. Based on the rules found, physicians can pay closer attention to the highlighted features. This research focuses on association rule mining based on discretization, both to identify the important features behind cerebrovascular disease and to examine the common patterns of each cluster.
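The discretization stage described above can be sketched for a single continuous attribute. This is a minimal one-dimensional K-means under stated assumptions, not the thesis implementation: the attribute (systolic blood pressure), the sample values, the choice of k = 3, and the deterministic initialization are all hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster scalar values into k groups and return the sorted cluster centers."""
    xs = sorted(values)
    n = len(xs)
    # Deterministic start: centers taken at evenly spaced positions of the sorted data.
    centers = [xs[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def cut_points(centers):
    """Interval boundaries are the midpoints between adjacent centers."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def discretize(x, cuts):
    """Map a continuous value to the index of its interval (0 .. len(cuts))."""
    return sum(x > c for c in cuts)


# Hypothetical systolic blood pressure readings (mmHg), not thesis data.
sbp = [110, 112, 115, 130, 132, 135, 160, 162, 165]
cuts = cut_points(kmeans_1d(sbp, k=3))
bins = [discretize(x, cuts) for x in (118, 131, 170)]
print(cuts, bins)
```

Each interval index can then be turned into an item such as `sbp=bin2`, which is the form of input an Apriori-style rule search expects. In practice, a library implementation of K-means and Apriori would probably replace these hand-written helpers.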

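    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Objectives
      1.3 Research Issues
      1.4 Significance
    Chapter 2. Literature Review
      2.1 Cerebrovascular Disease
        2.1.1 Symptoms and Classification of Cerebrovascular Disease
        2.1.2 Influencing Factors of Cerebrovascular Disease
      2.2 Data Mining
      2.3 Cluster Analysis
        2.3.1 Self-Organizing Map Neural Network (SOM)
        2.3.2 PSKO (Particle Swarm K-means Optimization)
        2.3.3 DBSCAN (Density-based Spatial Clustering of Applications with Noise)
      2.4 Discretization of Continuous Attributes
        2.4.1 K-means Discretization
      2.5 Apriori Association Rule Algorithm
      2.6 Particle Swarm Optimization (PSO)
    Chapter 3. Research Procedure and Methods
      3.1 Data Preprocessing
      3.2 Self-Organizing Map Neural Network (SOM)
      3.3 PSKO (Particle Swarm K-means Optimization)
      3.4 DBSCAN (Density-based Spatial Clustering of Applications with Noise)
      3.5 K-means Discretization
      3.6 Apriori Association Rule Algorithm
      3.7 Result Analysis and Discussion
    Chapter 4. Case Study and Experimental Results
      4.1 Case Data and Processing
        4.1.1 Introduction to the Case Data
        4.1.2 Case Data Extraction
      4.2 Experimental Results and Analysis
        4.2.1 Parameter Settings
        4.2.2 Analysis of Clustering Results
        4.2.3 Discretization Results
        4.2.4 Evaluation and Analysis of Association Rules
      4.3 Building the Carotid Disease Threshold Early-Warning System
        4.3.1 Web System Interface and Operation
        4.3.2 Mobile App System Interface and Operation
    Chapter 5. Conclusions and Suggestions
      5.1 Conclusions
      5.2 Research Limitations and Future Suggestions
        5.2.1 Limitations of the Data Source
        5.2.2 Data Processing Issues
        5.2.3 Choice of Research Methods
        5.2.4 Future Development of the Mobile App Early-Warning System
    References
    Appendix A
    Appendix B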

