
Graduate student: Muhammad Naufal Alfareza
Thesis title: Metaheuristic-based Density Peak Possibilistic Fuzzy c-Means Algorithms for Customer Segmentation
Advisor: Ren-Jieh Kuo (郭人介)
Oral defense committee: Shi-Woei Lin (林希偉), Kung-Jeng Wang (王孔政)
Degree: Master
Department: College of Management - Department of Industrial Management
Year of publication: 2021
Academic year of graduation: 109
Language: English
Number of pages: 140
Keywords: Customer segmentation, Density peak clustering, Genetic algorithm, Metaheuristic, Possibilistic fuzzy c-means algorithm
    To meet diverse customer needs, market segmentation is an important issue for a company. Clustering analysis is one of the most common techniques for segmenting customers. However, clustering analysis is challenging because no single clustering technique can solve all problems and provide good results for all types of datasets. This study proposes a novel clustering algorithm, DP-GA-PFCM, that combines density peak clustering and a possibilistic fuzzy c-means algorithm with a genetic algorithm. Evaluated on eleven benchmark datasets, the DP-GA-PFCM algorithm overcomes the sensitivity to initial cluster centers, finds the optimal combination of parameters and cluster centers, and provides better and more stable results in terms of accuracy, adjusted Rand index (ARI), and normalized mutual information (NMI) than other clustering algorithms. Furthermore, the proposed algorithm is applied to a case study, a retail business dataset containing recency, frequency, and monetary (RFM) variables. Six clusters were formed, each with its own characteristics. It is strongly suggested to implement a different strategy for each cluster to retain or attract customers.
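    The abstract credits density peak clustering with removing the sensitivity to initial cluster centers. As a rough illustration of that idea only, the sketch below picks initial centers as the points that combine high local density with a large distance to any denser point. It is not the thesis's DP-GA-PFCM implementation: the genetic algorithm and possibilistic fuzzy c-means steps are omitted, the Gaussian density kernel is one common choice assumed here, and the names `density_peak_centers` and `dc` (cutoff distance) are hypothetical.

```python
import numpy as np

def density_peak_centers(X, n_centers, dc):
    """Select initial cluster centers by a density peak criterion.

    Assumption: Gaussian-kernel local density with cutoff distance dc.
    """
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density rho; subtract 1.0 to drop each point's own contribution.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    # Visit points from densest to sparsest (stable sort breaks ties by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.empty(len(X))
    # The densest point has no denser neighbour: use its largest distance.
    delta[order[0]] = dist[order[0]].max()
    for k in range(1, len(X)):
        i = order[k]
        # Distance to the nearest point that ranks denser than point i.
        delta[i] = dist[i, order[:k]].min()
    # Centers score high on both density and separation.
    gamma = rho * delta
    idx = np.argsort(-gamma, kind="stable")[:n_centers]
    return X[idx], idx
```

    Centers chosen this way could then seed a fuzzy clustering step in place of random initialization, which is the kind of sensitivity the abstract refers to. The result still depends on the choice of `dc`, so this sketch does not remove parameter tuning.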

    摘要 (Chinese abstract)
    ABSTRACT
    ACKNOWLEDGMENT
    CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    Chapter 1 INTRODUCTION
        1.1 Research Background
        1.2 Research Objectives
        1.3 Research Limitations
        1.4 Thesis Organization
    Chapter 2 LITERATURE SURVEY
        2.1 Density Peak Fuzzy C-Means Algorithm
        2.2 Possibilistic Fuzzy c-Means Algorithm
        2.3 Genetic Algorithm
        2.4 Relevant Previous Studies
    Chapter 3 METHODOLOGY
        3.1 Methodology Framework
        3.2 Data Preprocessing
        3.3 DP-GA-PFCM Algorithm
        3.4 Performance Measurement
    Chapter 4 EXPERIMENTAL RESULTS
        4.1 Datasets
        4.2 Parameters Setup
        4.3 Experimental Results
            4.3.1 First-Stage Experimental Result
            4.3.2 Second-Stage Experimental Result
        4.4 Statistical Test
            4.4.1 First-Stage Statistical Test
            4.4.2 Second-Stage Statistical Test
        4.5 Computational Time
    Chapter 5 CASE STUDY
        5.1 Data Description
        5.2 Clustering Result
        5.3 Analysis and Discussion
    Chapter 6 CONCLUSIONS AND FUTURE RESEARCH
        6.1 Conclusions
        6.2 Contributions
        6.3 Future Research
    REFERENCES
    APPENDIX A. Particle Swarm Optimization Algorithm
    APPENDIX B. Sine Cosine Algorithm
    APPENDIX C. Forensic-Based Investigation Algorithm
    APPENDIX D. Experimental Results
    APPENDIX E. Normality Test Results
    APPENDIX F. First-Stage Kruskal-Wallis and Dunn-Bonferroni Post-hoc Test
    APPENDIX G. Second-Stage Kruskal-Wallis and Dunn-Bonferroni Post-hoc Test
    APPENDIX H. Convergence History

