簡易檢索 / 詳目顯示

研究生: Achmad Yasid
Achmad - Yasid
論文名稱: 應用差分進化演算法與k平均數法於兩階段自動分群分析之研究
A Two-stage Automatic Cluster Analysis Using Differential Evolution Algorithm and k-means Algorithm
指導教授: 郭人介
Ren-Jieh Kuo
口試委員: Chao-Lung Yang
Chao-Lung Yang
Chieh-Yuan Tsai
Chieh-Yuan Tsai
學位類別: 碩士
Master
系所名稱: 管理學院 - 工業管理系
Department of Industrial Management
論文出版年: 2013
畢業學年度: 101
語文別: 英文
論文頁數: 54
中文關鍵詞: Automatic clusteringCluster defragmented algorithmDifferential evolutionk-means algorithm
外文關鍵詞: Automatic clustering, Cluster defragmented algorithm, Differential evolution, k-means algorithm
相關次數: 點閱:260下載:10
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • One of the most challenging problems in data clustering is to determine the number of clusters. This study intends to propose a two-stage automatic clustering based differential evolution (DE) algorithm. Basically, integration of the cluster defragmented algorithm (CDA) and improved automatic clustering based differential evolution (ACDE) and k-means algorithm (CDA-ACDE-k-means) is proposed. In the first stage, CDA algorithm is used to determine k initial cluster number, while ACDE algorithm and k-means algorithm are performed for cluster analysis in the second stage. It requires no prior knowledge about number of clusters. The k-means algorithm is employed to tune cluster centroids in order to improve the performance of DE algorithm. To validate the performance of the proposed algorithm, four well-known data sets, Iris, Wine, Glass and Vowel, are employed. The computational results indicate that the proposed CDA-ACDE-k-means algorithm is superior to classical DE algorithm (CDE) and automatic clustering based improved differential evolution and k-means (ACDE-k-means), also other state of the art clustering algorithms based on genetic algorithm (GA) and particle swarm optimization (PSO) algorithm.


    One of the most challenging problems in data clustering is to determine the number of clusters. This study intends to propose a two-stage automatic clustering based differential evolution (DE) algorithm. Basically, integration of the cluster defragmented algorithm (CDA) and improved automatic clustering based differential evolution (ACDE) and k-means algorithm (CDA-ACDE-k-means) is proposed. In the first stage, CDA algorithm is used to determine k initial cluster number, while ACDE algorithm and k-means algorithm are performed for cluster analysis in the second stage. It requires no prior knowledge about number of clusters. The k-means algorithm is employed to tune cluster centroids in order to improve the performance of DE algorithm. To validate the performance of the proposed algorithm, four well-known data sets, Iris, Wine, Glass and Vowel, are employed. The computational results indicate that the proposed CDA-ACDE-k-means algorithm is superior to classical DE algorithm (CDE) and automatic clustering based improved differential evolution and k-means (ACDE-k-means), also other state of the art clustering algorithms based on genetic algorithm (GA) and particle swarm optimization (PSO) algorithm.

    CONTENTS ABSTRACTi ACKNOWLEDGEMENTSii CONTENTSiii LIST OF FIGURESv LIST OF TABLESvi Chapter 1INTRODUCTION1 1.1Research Background1 1.2Research Objectives3 1.3Research Scope and Constraints3 1.4Research Framework3 Chapter 2LITERATURE STUDY5 2.1Hierarchical and Partitional Clustering5 2.1.1Hierarchical Clustering5 2.1.2Partitional Clustering7 2.2Crisp Clustering and Fuzzy Clustering7 2.2.1Crisp Clustering7 2.2.2Fuzzy Clustering8 2.3k-means Algorithm8 2.4Similarity Measure10 2.5Clustering Validity Index11 2.5.1VI Measure12 2.6Non-Automatic and Automatic Clustering13 2.6.1Non-automatic Clustering13 2.6.2Automatic Clustering14 Chapter 3RESEARCH METHODOLOGY16 3.1Cluster Defragmented Algorithm16 3.2Automatic Clustering with Differential Evolution18 3.2.1Differential Evolution18 3.2.2Automatic Clustering Based Improved Differential Evolution algorithm20 3.3Developing Automatic Clustering Combining DE algorithm and k-means algorithm23 3.3.1Initialization of DE Population23 3.3.2Chromosome Representation25 3.3.3Fitness Function26 3.3.4Avoiding Erroneous Chromosomes27 3.3.5Hybridization DE algorithm with k-means algorithm28 3.3.6CDA-ACDE-k-means Pseodocode28 Chapter 4COMPUTATIONAL RESULTS31 4.1Experimental Setup31 4.2Computational Result32 4.3Algorithm Convergence36 4.4Statistical Test38 Chapter 5CONCLUSIONS AND FURTHER RESEARCH40 5.1Conclusions40 5.2Contributions40 5.3Future Research41 REFERENCES42 Appendix A COMPUTATIONAL RESULTS OF CLUSTERING46 1.Result of iris data set46 2.Result of wine data set48 3.Result of glass data set50 4.Result of vowel data set52 5.T-test results for proposed algorithm54

    REFERENCES
    1.A.K. Jain, M.N.M., P. J. Flynn, Data clustering: A review. Acm Computing Surveys, 1999. 31(3): p. 264-323.
    2.P.N. Tan, M.S., V. Kumar, Introduction to data mining. Boston:Pearson Education , Inc., 2006.
    3.H. Frigui, R.K., A robust competitive clustering algorithm with application in computer vision. Pattern Analysis and Machine Intelligence, 1999. 21: p. 450-465.
    4.K.Y. Yeung, F.C., Murua A, Raftery A.E, Ruzzo WL, Model based clustering and data transformation for gene expression data. 2001. 17: p. 977-987.
    5.D. Cai, X.H., Han J, Document clustering using locality preserving indexing. 2005. 17: p. 1624-1637.
    6.R.J. Kuo, K.A., Budiarto Subroto, Application of particle swarm optimization and perceptual map to tourist market segmentation. Expert System with App., 2012. 39: p. 8726-8735.
    7.S. Bandyopadhyay, U.M., Genetic clustering for automatic evolution of cluster and application to image classification. Pattern Recognition, 2002. 35(1197-1208).
    8.S. Paterlini, T.K., Differential evolution and particle swarm optimization in partitional clustering. Computational Statistics & Data Analysis, 2006. 50: p. 1220-1247.
    9.S. Das, A.A., A. Konar, Automatic clustering using improved differential evolution algorithm. IEEE Trans. Syst. Man. Cybernetics Part A, 2008. 38(1)(218-237).
    10.L.Y. Tseng, Y.S.B., A genetic approach to the automatic clustering problem. Pattern Recognition, 2001. 34: p. 415-424.
    11.G. Garai, B.B.C., A novel genetic algorithm for automattic clustering. Pattern Recognition Letters, 2004. 25: p. 173-187.
    12.K. Price, R.S., J. Lampinen, Differential Evolution: A practical approach to global optimization Springer-Verlag, Berlin, 2005.
    13.R. Storn, K.P., Differential Evolution – A Simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 1997. 11(341-359).
    14.Z. Cai, W.G., Charles XL, Harry Z, A clustering-based differential evolution for global optimization. Applied Soft Computing, 2011. 11: p. 1363-1379.
    15.W. Kwedlo, A clustering method combining differential evolution with the k-means algorithm. Pattern Recognition Letters, 2011. 32: p. 1613-1621.
    16.K.C. Gowda, G.K., Agglomerative clustering using the cencept of mutual nearest neighbourhood. Pattern Recognition, 1977. 10: p. 105-112.
    17.M.C. Ramos, Divisive and hierarchical clustering techniques to analyse variability of rainfall distribution patterns in a mediterranean. Atmospheric Research, 2001. 57(2): p. 123-138.
    18.N.C. Jain, A.I., L.R Goel, Monte Carlo Comparison of Six Hierarchical Clustering Methods on Random Data. Pattern Recognition, 1986. 19(1): p. 95-99.
    19.E. Shaffer, R.D., A.K. Jain, Single-link characteristics of a mode-seeking clustering algorithm. Pattern Recognition, 1979. 11(1): p. 65-70.
    20.S.P. Smith, R.D., Stability of hierarchical clustering. Pattern Recognition, 1980. 12(3): p. 177-187.
    21.T. Kurita, An efficient agglomerative clustering algorithm using a heap. Pattern Recognition, 1991. 24: p. 205-209.
    22.M.K. Pakhira, S.B., Ujwal Maulik, Validity Index fro crisp and fuzzy clusters. Pattern Recognition, 2004. 37: p. 487-501.
    23.A.K. Jain, Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 2009. 31: p. 651-666.
    24.P. Drineas, A.F., R. Kannan, S. Vempala, V. Vinay, Clustering large graphs via the singular value decomposition. Machine Learning, 2004. 56: p. 9-33.
    25.J. MacQueen, Some methods for classification and analysis of multivariate observations. Proc. Fifth Berkeley Symp. Math. Stat. Probability, 1967: p. 281-297.
    26.C.H. Chou, M.C.S., E. Lai, A new cluster validity measure and its application to image compression. Pattern Anal Applic, 2004. 7(205-220).
    27.R. Nayak, T.T., A progressive clustering algorithm to group the XML data by structural and semantic similarity. Pattern Recognition and Artificial Intelligence, 2007. 21.
    28.S. Rinzivillo, D.P., Mirco N., Fosca G., Natalia A., Gennady A., Visually driven analysis of movement data by progressive clustering. Information Visualization, 2008. 7: p. 225-239.
    29.N. Boujemaa. On competitive unsupervised clustering. in Pattern Recognition, 2000. Proceedings. 15th International Conference on. 2000.
    30.R.J. Kuo, Y.J.S., Z.Y. Chen, F.C. Tien, Integration of particle swarm optimization and genetic algorithm for dynamic clustering. Information Sciences 2012. 39: p. 124-140.
    31.X.L. Xie, B.G., A validity measure for fuzzy clustering. IEEE Trans Pattern Anal Machine Intell, 1991. 13: p. 841-847.
    32.N.R. Pal, J.C. Bezdex, Numerical taxonomy with fuzzy sets. J. Math Biology, 1974. 1: p. 57-71.
    33.D.L. Davies, B.D.W., A cluster separation measure. IEEE Trans Pattern Anal Machine Intell, 1979. 1: p. 224-227.
    34.V.J. Rayward-Smith, Metaheuristics for clustering in KDD. IEEE Congress Evolutionary Computation, 2005. 3: p. 2380-2387.
    35.E.R. Hruschka, R.J.G.B.C., A survey of evolutionary algorithms for clustering. IEEE Transaction on Systems, Man, and Cybernetics, 2009. 39: p. 133-155.
    36.M. Omran, P.E., Particle swarm optimization method for image clustering. International Journal of Pattern Recognition and Artificial Intelligence, 2005. 19: p. 297-321.
    37.S.Z. Selim, K.A.S., A Simulated annealing algorithm for the clustering problem. Pattern Recognition, 1991. 24: p. 1003-1008.
    38.R.J. Kuo, L.M.H., C.H. Hu, Cluster analysis in industrial market segmentation through artificial neural network. Computer & Industrial Engineering, 2002. 42: p. 391-399.
    39.A.K. Qin, V.L.H., P.N. Suganthan, Differential evolution algorithm with strategy adaptation for global numerical optimization. IEEE Transaction on Evolutionary Computation, 2009. 13: p. 398-417.
    40.R.J. Kuo, J.L.L., C. Tu, Integration of ART2 neural network and genetic K-means algorithm for analyzing web browsing path in electronic commerce. Decision Support System, 2005. 40: p. 355-374.
    41.W.P. Lee, S.-W.C., Automatic clustering with differential evolution using cluster number oscillation method. Intelligent Systems and Applications (ISA), 2nd International Workshop on, 2010.
    42.G. Onwubolu, D.D., Scheduling flow shops using differential evolution algorithm. European Journal of Operation Research, 2006. 171: p. 674-692.
    43.D.G Mayer, B.P.K., AA. Archer, Differential evolution - an easy and efficient evolutionary algorithm for model optimization. Agricultural System, 2005. 83: p. 315-328.
    44.W. Sheng, S.S., L. Zhang, X. Liu, A weighted sum validity function for clustering with a hybrid niching genetic algorithm. IEEE Transaction on Systems, Man, and Cybernetics, 2005. 35(6): p. 1156-1167.
    45.N.R Pal, J.C.B., On cluster validity for the fuzzy c-means model. IEEE Transaction on Fuzzy Systems, 1995. 3(3): p. 370-379.
    46.K. Bache, M.L., UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA:University of California, School of Information and Computer Science. 2013.

    QR CODE