應用一個新的混合式基因演算法於分群問題｜國立臺灣科技大學博碩士論文系統

簡易檢索 / 詳目顯示

回結果列表

研究生：	張敏勤 Min-cin Jhang
論文名稱：	應用一個新的混合式基因演算法於分群問題 Using a New Hybrid Genetic Algorithm to the Clustering Problem
指導教授：	楊鍵樵 Chen-Chau Yang
口試委員:	朱雨其 Yu-ci Jhu 呂永和 Yung-Ho Leu
學位類別：	碩士 Master
系所名稱：	電資學院 - 電子工程系 Department of Electronic and Computer Engineering
論文出版年：	2005
畢業學年度：	94
語文別：	中文
論文頁數：	64
中文關鍵詞：	基因演算法、分群演算法、群聚分析、資料探勘
外文關鍵詞：	Genetic Algorithm, Clustering Algorithm, Data Mining, Cluster Analysis
相關次數：	點閱：326 下載：0
分享至:	分享至facebook 分享至twitter

查詢本校圖書館目錄查詢臺灣博碩士論文知識加值系統勘誤回報

在資料探勘領域裡的群聚分析，對於在資料集中去發掘資料樣本分佈是非常有用的。為了對資料集分群，群聚演算法通常使用一以距離矩陣為基礎的相似度量測，而達到在相同群聚裡的資料點比不同群聚裡的資料點更為相似。在本論文中，我們提出一新型的分群演算法係繼承自Hybrid K-medoid Algorithm (H.K.A)，H.K.A由基因演算法與傳統分群演算法的合併所構成，以解決分群問題。新型的分群演算法針對H.K.A中的局部搜尋法 (Local Search Heuristic) 與突變機制做部分修改後，比H.K.A更加快了演化的動作，且對於數值型態之資料集可做較有效率的群聚分析，同時具有不錯的分群解。
由實驗結果所示，本研究所提之分群演算法相較於H.K.A，在十二組的資料集測試下皆展現本演算法可以在更少的演化代數、更少的執行時間下找到較好的分群解，證明了本演算法在群集分析問題處理的優勢之處。

Clustering in data mining is very useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based similarity measure in order to partition the database such that data points in the same partition are more similar than points in different partitions. In this paper, we proposed a new clustering algorithm which inherited Hybrid K-medoid Algorithm (H.K.A). H.K.A consisted of the combination of genetic algorithm and traditional clustering algorithm such that it was able to solving the clustering problem. New clustering algorithm was the improvement of H.K.A, with some modification in Local Search Heuristic and Mutation Operator. New clustering algorithm run faster than H.K.A in evolutionary processes, and it could also execute more efficiently for numerical data set in cluster analysis with the better clustering results.
After two algorithms experimented on twelve data set, the experimental results showed that our proposed algorithm could find the better clustering results with much less generations and time cost. Thus, these revealed the advantage of our proposed algorithm in resolving clustering problem.

第一章 緒論…………………………………………………………1
1 研究動機………………………………………………………….2
2 研究目的………………………………………………………….2
3 論文架構………………………………………………………….2
第二章 相關文獻探討…………………………………………...4
1 分群技術………………………………………………………….4
1.1 階層式分群法………………………………………………..5
1.2 分割式分群法………………………………………………..6
2 基因演算法……………………………………………………….8
2.1 編碼機制……………………………………………………10
2.2 適應度函數…………………………………………………10
2.3 父母選擇……………………………………………………10
2.4 交配…………………………………………………………11
2.5 突變…………………………………………………………12
2.6 淘汰…………………………………………………………13
2.7 演化中止條件………………………………………………14
3 應用在分群問題上的混合式基因演算法……………………...15
4 Hybrid K-medoid Algorithm (H.K.A)……………………………18
第三章 新型混合式基因演算法原理……………………..27
1 研究架構與演算法……………………………………………...27
2 演算法之細部…………………………………………………...30
2.1 局部訓練……………………………………………………30
2.2 編碼機制……………………………………………………36
2.3 適應度函數…………………………………………………36
2.4 父母選擇……………………………………………………37
2.5 交配…………………………………………………………38
2.6 突變…………………………………………………………38
2.7 淘汰…………………………………………………………41
2.8 中止條件……………………………………………………41
第四章 實驗結果與分析………………………………………42
1 資料集簡介……………………………………………………...42
2 參數設定………………………………………………………...47
3 演算法評估標準………………………………………………...49
4 實驗結果與分析………………………………………………...50
第五章 結論與未來研究工作………………………………..60
1 結論……………………………………………………………...60
2 未來研究工作…………………………………………………...60
參考文獻………………………………………………………………..62

                                

[1] MITSUO GEN, RUNWEI CHENG, “ GENETIC ALGORITHMS AND ENGINEERING DESIGN ”, A Wiley-Interscience Publication, JOHN WILEY & SONS,INC.1997.

[2] Weiguo Sheng, Xiaohui Liu, “ A Hybrid Algorithm for K-medoid Clustering of Large Data Sets ” Evolutionary Computation, 2004. IEEE CEC2004. Congress on , Volume: 1, 19-23 Pages:77 - 82 Vol.1. June 2004.

[3] V. Estivill-Castro and A.T. Murray, “ Spatial Clustering for Data Mining with Genetic Algorithms ”, Proceedings of the International ICSC Symposium on Engineering of Intelligent Systems 1998.

[4] C.B. Lucasius, A.D. Dane and G. Kateman, “ On k-medoid clustering of large data sets with the aid of a genetic algorithm ”, background, feasibility and comparison. Analytical Chimica Acta, 282 647-669, 1993.

[5] K. Krishna and M. Narasimha Murty, “ Genetic K-means Algorithm ” IEEE Transactions on System, Man and Cybernetics-Part B: Cybernetics, vol. 29, no. 3, pp. 433-439, JUNE 1999.

[6] Cheng-Fa Tsai, Han-Chang Wu, and Chun-Wei Tsai, “ A New Data Clustering Approach for Data Mining in Large Databases ”, IEEE Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN’02) 2002.

[7] Patrick C.H. Ma, Keith C.C. Chan, “ Discovering Clusters in Gene Expression Data using Evolutionary Approach ”, IEEE Proceedings of the 15th International Conference on Tools with Artificial Intelligence (ICTAI’03) 2003.

[8] Burhaneddin SANDIKCI, “ Genetic Algorithms ”, In partial fulfillment of the requirements for the course IE 572 Spring 2000. ( Bilkent University, Department of Industrial Engineering)

[9] Hsinghua Chou, G. Premkumar, and Chao-Hsien Chu, “ Genetic Algorithms for Communications Network Design—An Empirical Study of the Factors that Influence Performance ”, IEEE Transactions on Evolutionary Computation, VOL. 5, NO. 3, JUNE 2001.

[10] T. Bäck, D. Fogel, and Z. Michalewicz, Eds., “ Handbook of Evolutionary Computation ”, Oxford: Oxford Univ. Press, 1996

[11] Ondrej Pribyl, “ Clustering of Activity Patterns Using Genetic Algorithms ”, 201 Transportation Research Bldg, University Park, PA 16802 (Pennsylvania Transportation Institute & Pennsylvania State University)

[12] Cheng-Fa Tsai, Zhi-Cheng Chen, and Chun-Wei Tsai, “ MSGKA: An Efficient Clustering Algorithm for Large Databases ”, IEEE SMC WA1D1 2002.

[13] Yi Lu, Shiyong Lu, Farshad Fotouhi, Youping Deng, and Susan J. Brown, “ FGKA: A Fast Genetic K-means Clustering Algorithm ”, ACM Symposium on Applied Computing 2004.

[14] Yong-Guo Liu, Ke-Fei Chen, Xue-Ming Li, “ A Hybrid Genetic Based Clustering Algorithm ”, IEEE Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, 26-29 August 2004.

[15] Maulik U., Bandyopadhyay S., “ Genetic algorithm-based clustering technique ”, Pattern Recognition, Vol. 33, pp.1455-1465, 2000.

[16] Margaret H. Dunham, “ DATA MINING ”, Prentice Hall, 2002.

[17] Tou J. T., Gonzalez R. C., “ PATTERN RECOGNITION PRINCIPLES ”, Addison-Wesley, Reading, pp.94-97,1974.

[18] Herrera. F. and J. L. Verdegay, “ GENETIC ALGORITHMS AND SOFT COMPUTING ”, Springer-Verlag, Heidelberg, pp. 8, 1996.

[19] Periklis, A., “ Data Clustering Techniques ”, March 2002. URL: http://www.cs.toronto.edu/~periklis/pubs/depth.pdf

[20] 謝邦昌, “ 資料採礦入門及應用—從統計技術看資料採礦 ”,資商訊息顧問股份有限公司, 2001.

[21] 蘇木春,張孝德,“ 機器學習：類神經網路、模糊系統以及基因演算法則 ”,二版,全華科技圖書股份有限公司,2000.

[22] 彭怡青,“ 以統計測量為基礎之交易資料分群 ”,國立台灣大學碩士論文,2001.

[23] 葉承銓, “ 應用適應性基因演算法於資料分群的問題 ”,私立樹德科技大學碩士論文,2002.

全文公開日期 2006/12/05 (校內網路)
全文公開日期本全文未授權公開 (校外網路)
全文公開日期本全文未授權公開 (國家圖書館：臺灣博碩士論文系統)

簡易檢索 / 詳目顯示

相關論文