
Graduate student: 林峻宇 (JUN-YU LIN)
Thesis title: An Application of Sine Cosine Algorithm-based Fuzzy Possibilistic c-ordered Means Algorithm to Customer Segmentation (應用正弦餘弦演算法為基礎之模糊可能性c排序均值演算法於顧客區隔)
Advisor: 郭人介 (Ren-Jieh Kuo)
Oral examination committee: 歐陽超 (Chao Ou-Yang), 王孔政 (Kung-Jeng Wang)
Degree: Master
Department: School of Management, Department of Industrial Management
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 108
Keywords (Chinese): 分群分析, 啟發式演算法, 正餘弦演算法, 異常值, 模糊c均值演算法, 可能性模糊c均值演算法, 模糊c排序均值演算法
Keywords (English): Clustering analysis, Metaheuristics, Sine Cosine Algorithm, Outliers, Fuzzy c-means algorithm, Possibilistic fuzzy c-means algorithm, Fuzzy c-ordered means algorithm

Advances in information technology have made data collection much easier. Clustering is an important technique for exploring the structure of data and is used in many fields, such as customer segmentation, image recognition, and social science. In real-world applications, however, datasets often contain noise and outliers that degrade clustering performance. In addition, clustering results are sensitive to the initial centroids and to the algorithm parameters. To reduce the influence of outliers on clustering results, this study combines the advantages of the possibilistic fuzzy c-means (PFCM) algorithm and the fuzzy c-ordered means (FCOM) algorithm to propose a fuzzy possibilistic c-ordered means (FPCOM) algorithm. To address the problem of determining the parameters and initial centroids, this study combines a sine cosine algorithm (SCA) with FPCOM to improve the clustering results. The proposed algorithm is named SCA-FPCOM. Ten benchmark datasets collected from the UCI Machine Learning Repository were used to validate the proposed algorithm in terms of two measures, the adjusted Rand index (ARI) and the silhouette coefficient. According to the experimental results, the SCA-FPCOM algorithm obtains better results than the other algorithms compared. The proposed algorithm was also applied to a real-world problem, customer segmentation for a shopping mall, and the results are promising.
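The idea of using a sine cosine algorithm to search for good centroids can be illustrated with a minimal sketch. It uses the standard SCA position update from Mirjalili (2016), but everything else is an assumption made for illustration: the FPCOM objective is not given in this record, so a plain sum-of-squared-distances cost stands in for it, and the names sse_cost and sca_minimize, the box bounds, and all parameter defaults are hypothetical rather than taken from the thesis.

```python
import math
import random


def sse_cost(flat_centroids, points, k):
    """Sum of squared distances from each point to its nearest centroid.

    Stand-in objective only (assumption): the FPCOM objective is not
    given in this record.
    """
    dim = len(points[0])
    centroids = [flat_centroids[c * dim:(c + 1) * dim] for c in range(k)]
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def sca_minimize(cost, dim, lower, upper, n_agents=20, n_iters=100, a=2.0, seed=0):
    """Sine cosine algorithm for minimizing `cost` inside the box [lower, upper]^dim."""
    rng = random.Random(seed)
    agents = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=cost)[:]
    best_cost = cost(best)
    history = [best_cost]
    for t in range(n_iters):
        r1 = a - t * a / n_iters  # step amplitude, decreasing linearly from a
        for agent in agents:
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                # Switch between the sine and cosine branch with probability 0.5.
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                step = r1 * wave * abs(r3 * best[j] - agent[j])
                # Clamp to the search box (a common choice, assumed here).
                agent[j] = min(upper, max(lower, agent[j] + step))
        # Update the destination point once all agents have moved.
        for agent in agents:
            c = cost(agent)
            if c < best_cost:
                best, best_cost = agent[:], c
        history.append(best_cost)
    return best, best_cost, history


# Toy data (made up for illustration): two well-separated groups in 2-D.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
k = 2
best, best_cost, history = sca_minimize(
    lambda x: sse_cost(x, points, k), dim=k * 2, lower=0.0, upper=6.0
)
```

Each agent encodes all k centroids as one flat vector, so the search optimizes the centroid positions jointly. The best cost recorded in history can never increase, because the destination point is replaced only when a strictly better candidate is found.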

CONTENTS
摘要 (Chinese abstract)
ABSTRACT
ACKNOWLEDGEMENT
CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Research Background
1.2 Research Objectives
1.3 Research Scope and Constraints
1.4 Research Framework
CHAPTER 2 LITERATURE REVIEW
2.1 Cluster Analysis
2.2 Fuzzy c-means Algorithm
2.3 Fuzzy c-ordered Means Algorithm
2.4 Possibilistic Fuzzy c-means Algorithm
2.5 Meta-heuristic Approaches for Clustering
2.5.1 Genetic algorithm for clustering
2.5.2 Particle swarm optimization algorithm for clustering
2.5.3 Sine cosine algorithm for clustering
CHAPTER 3 METHODOLOGY
3.1 Methodology Framework
3.2 Data Preprocessing
3.3 Fuzzy Possibilistic c-ordered Means Algorithm
3.4 Sine Cosine Algorithm-based FPCOM Algorithm
3.4.1 Parameter setting for SCA-FPCOM
3.4.2 SCA-FPCOM
CHAPTER 4 EXPERIMENTAL RESULTS
4.1 Datasets
4.2 Performance Measurement
4.3 Parameters Setting
4.4 Computational Results
4.4.1 ARI
4.4.2 Silhouette Coefficient
4.4.3 Computation time
CHAPTER 5 CASE STUDY
5.1 Dataset
5.2 Dataset Preprocessing
5.3 Analysis
CHAPTER 6 CONCLUSIONS AND FUTURE RESEARCH
6.1 Conclusions
6.2 Contributions
6.3 Future Research
REFERENCES
APPENDIX


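Full-text release date: 2024/06/24 (campus network)
Full text not authorized for public release (off-campus network)
Full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)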