研究生: |
張敏勤 Min-cin Jhang |
---|---|
論文名稱: |
應用一個新的混合式基因演算法於分群問題 Using a New Hybrid Genetic Algorithm to the Clustering Problem |
指導教授: |
楊鍵樵
Chen-Chau Yang |
口試委員: |
朱雨其
Yu-ci Jhu 呂永和 Yung-Ho Leu |
學位類別: |
碩士 Master |
系所名稱: |
電資學院 - 電子工程系 Department of Electronic and Computer Engineering |
論文出版年: | 2005 |
畢業學年度: | 94 |
語文別: | 中文 |
論文頁數: | 64 |
中文關鍵詞: | 基因演算法 、分群演算法 、群聚分析 、資料探勘 |
外文關鍵詞: | Genetic Algorithm, Clustering Algorithm, Data Mining, Cluster Analysis |
相關次數: | 點閱:326 下載:0 |
分享至: |
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
在資料探勘領域裡的群聚分析,對於在資料集中去發掘資料樣本分佈是非常有用的。為了對資料集分群,群聚演算法通常使用一以距離矩陣為基礎的相似度量測,而達到在相同群聚裡的資料點比不同群聚裡的資料點更為相似。在本論文中,我們提出一新型的分群演算法係繼承自Hybrid K-medoid Algorithm (H.K.A),H.K.A由基因演算法與傳統分群演算法的合併所構成,以解決分群問題。新型的分群演算法針對H.K.A中的局部搜尋法 (Local Search Heuristic) 與突變機制做部分修改後,比H.K.A更加快了演化的動作,且對於數值型態之資料集可做較有效率的群聚分析,同時具有不錯的分群解。
由實驗結果所示,本研究所提之分群演算法相較於H.K.A,在十二組的資料集測試下皆展現本演算法可以在更少的演化代數、更少的執行時間下找到較好的分群解,證明了本演算法在群集分析問題處理的優勢之處。
Clustering in data mining is very useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based similarity measure in order to partition the database such that data points in the same partition are more similar than points in different partitions. In this paper, we proposed a new clustering algorithm which inherited Hybrid K-medoid Algorithm (H.K.A). H.K.A consisted of the combination of genetic algorithm and traditional clustering algorithm such that it was able to solving the clustering problem. New clustering algorithm was the improvement of H.K.A, with some modification in Local Search Heuristic and Mutation Operator. New clustering algorithm run faster than H.K.A in evolutionary processes, and it could also execute more efficiently for numerical data set in cluster analysis with the better clustering results.
After two algorithms experimented on twelve data set, the experimental results showed that our proposed algorithm could find the better clustering results with much less generations and time cost. Thus, these revealed the advantage of our proposed algorithm in resolving clustering problem.
[1] MITSUO GEN, RUNWEI CHENG, “ GENETIC ALGORITHMS AND ENGINEERING DESIGN ”, A Wiley-Interscience Publication, JOHN WILEY & SONS,INC.1997.
[2] Weiguo Sheng, Xiaohui Liu, “ A Hybrid Algorithm for K-medoid Clustering of Large Data Sets ” Evolutionary Computation, 2004. IEEE CEC2004. Congress on , Volume: 1, 19-23 Pages:77 - 82 Vol.1. June 2004.
[3] V. Estivill-Castro and A.T. Murray, “ Spatial Clustering for Data Mining with Genetic Algorithms ”, Proceedings of the International ICSC Symposium on Engineering of Intelligent Systems 1998.
[4] C.B. Lucasius, A.D. Dane and G. Kateman, “ On k-medoid clustering of large data sets with the aid of a genetic algorithm ”, background, feasibility and comparison. Analytical Chimica Acta, 282 647-669, 1993.
[5] K. Krishna and M. Narasimha Murty, “ Genetic K-means Algorithm ” IEEE Transactions on System, Man and Cybernetics-Part B: Cybernetics, vol. 29, no. 3, pp. 433-439, JUNE 1999.
[6] Cheng-Fa Tsai, Han-Chang Wu, and Chun-Wei Tsai, “ A New Data Clustering Approach for Data Mining in Large Databases ”, IEEE Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN’02) 2002.
[7] Patrick C.H. Ma, Keith C.C. Chan, “ Discovering Clusters in Gene Expression Data using Evolutionary Approach ”, IEEE Proceedings of the 15th International Conference on Tools with Artificial Intelligence (ICTAI’03) 2003.
[8] Burhaneddin SANDIKCI, “ Genetic Algorithms ”, In partial fulfillment of the requirements for the course IE 572 Spring 2000. ( Bilkent University, Department of Industrial Engineering)
[9] Hsinghua Chou, G. Premkumar, and Chao-Hsien Chu, “ Genetic Algorithms for Communications Network Design—An Empirical Study of the Factors that Influence Performance ”, IEEE Transactions on Evolutionary Computation, VOL. 5, NO. 3, JUNE 2001.
[10] T. Bäck, D. Fogel, and Z. Michalewicz, Eds., “ Handbook of Evolutionary Computation ”, Oxford: Oxford Univ. Press, 1996
[11] Ondrej Pribyl, “ Clustering of Activity Patterns Using Genetic Algorithms ”, 201 Transportation Research Bldg, University Park, PA 16802 (Pennsylvania Transportation Institute & Pennsylvania State University)
[12] Cheng-Fa Tsai, Zhi-Cheng Chen, and Chun-Wei Tsai, “ MSGKA: An Efficient Clustering Algorithm for Large Databases ”, IEEE SMC WA1D1 2002.
[13] Yi Lu, Shiyong Lu, Farshad Fotouhi, Youping Deng, and Susan J. Brown, “ FGKA: A Fast Genetic K-means Clustering Algorithm ”, ACM Symposium on Applied Computing 2004.
[14] Yong-Guo Liu, Ke-Fei Chen, Xue-Ming Li, “ A Hybrid Genetic Based Clustering Algorithm ”, IEEE Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, 26-29 August 2004.
[15] Maulik U., Bandyopadhyay S., “ Genetic algorithm-based clustering technique ”, Pattern Recognition, Vol. 33, pp.1455-1465, 2000.
[16] Margaret H. Dunham, “ DATA MINING ”, Prentice Hall, 2002.
[17] Tou J. T., Gonzalez R. C., “ PATTERN RECOGNITION PRINCIPLES ”, Addison-Wesley, Reading, pp.94-97,1974.
[18] Herrera. F. and J. L. Verdegay, “ GENETIC ALGORITHMS AND SOFT COMPUTING ”, Springer-Verlag, Heidelberg, pp. 8, 1996.
[19] Periklis, A., “ Data Clustering Techniques ”, March 2002. URL: http://www.cs.toronto.edu/~periklis/pubs/depth.pdf
[20] 謝邦昌, “ 資料採礦入門及應用—從統計技術看資料採礦 ”,資商訊息顧問股份有限公司, 2001.
[21] 蘇木春,張孝德,“ 機器學習:類神經網路、模糊系統以及基因演算法則 ”,二版,全華科技圖書股份有限公司,2000.
[22] 彭怡青,“ 以統計測量為基礎之交易資料分群 ”,國立台灣大學碩士論文,2001.
[23] 葉承銓, “ 應用適應性基因演算法於資料分群的問題 ”,私立樹德科技大學碩士論文,2002.