
Graduate Student: Cian-Ying Wu (吳芊瑩)
Thesis Title: An Ensemble Method with a Hybrid of Genetic Algorithm and K-prototypes Algorithm for Mixed Data Classification (具混合基因演算法與K-prototypes演算法之集成方法於混合型資料分類之研究)
Advisor: Ren-Jieh Kuo (郭人介)
Committee Members: Shih-Che Lo (羅士哲), Chia-Yu Hsu (許嘉裕)
Degree: Master
Department: College of Management - Department of Industrial Management
Publication Year: 2023
Academic Year: 111
Language: English
Pages: 113
Keywords: Mixed data, Clustering-based classification, Genetic algorithm, Mixed mutation, Bagging
Views: 290; Downloads: 0

    Mixed data, which includes both categorical and numeric attributes, poses a challenge for traditional classifiers, which are typically designed to handle only one data type. Despite this challenge, the K-prototypes clustering algorithm has shown promise in effectively clustering mixed data. However, the K-prototypes algorithm has its own limitations, which have led to the development of clustering algorithms based on the genetic algorithm (GA) to overcome these issues.
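For context, the K-prototypes dissimilarity that underlies this line of work combines a squared Euclidean distance over numeric attributes with a weighted simple-matching count over categorical attributes. The sketch below is illustrative only; the function and parameter names (including the `gamma` trade-off weight) are assumptions, not taken from the thesis:

```python
import numpy as np

def k_prototypes_distance(x_num, x_cat, c_num, c_cat, gamma=0.5):
    # Numeric part: squared Euclidean distance to the prototype's numeric centre.
    numeric = float(np.sum((np.asarray(x_num, float) - np.asarray(c_num, float)) ** 2))
    # Categorical part: simple-matching count of attribute mismatches.
    categorical = sum(1 for a, b in zip(x_cat, c_cat) if a != b)
    # gamma balances the influence of the two attribute types.
    return numeric + gamma * categorical

def assign(X_num, X_cat, centers_num, centers_cat, gamma=0.5):
    # Assign each record to its nearest prototype under the mixed dissimilarity.
    return [int(np.argmin([k_prototypes_distance(xn, xc, cn, cc, gamma)
                           for cn, cc in zip(centers_num, centers_cat)]))
            for xn, xc in zip(X_num, X_cat)]
```

A larger `gamma` makes categorical agreement dominate the assignment; tuning this balance is one of the known difficulties that weight-optimizing approaches aim to address.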

    Therefore, this study proposes a novel clustering-based classification algorithm that combines the benefits of both classification and clustering for mixed data. The proposed algorithm uses a GA to optimize the cluster centroids and attribute weights, while bagging, an ensemble learning method, constructs multiple classifiers to enhance classification performance. Additionally, a mixed mutation mechanism combining Gaussian, Cauchy, Lévy, and single-point mutations is introduced to search for the optimal solution.
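A mixed mutation mechanism of this kind can be sketched as randomly selecting one of the four operators and applying it to a real-coded chromosome (for example, the flattened centroids and weights). This is a hedged illustration, not the thesis's exact operator definitions; the Lévy step uses Mantegna's algorithm as one common approximation, and the `scale` parameter is assumed:

```python
import numpy as np
from math import gamma as G, pi, sin

rng = np.random.default_rng(0)

def levy_step(size, beta=1.5):
    # Mantegna's algorithm: Levy-distributed step lengths with index beta.
    sigma = (G(1 + beta) * sin(pi * beta / 2)
             / (G((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def mixed_mutation(chromosome, scale=0.1):
    # Pick one of four mutation operators at random and perturb the chromosome.
    x = np.asarray(chromosome, dtype=float).copy()
    op = rng.choice(["gaussian", "cauchy", "levy", "single_point"])
    if op == "gaussian":
        x += scale * rng.normal(0.0, 1.0, x.size)   # small local steps
    elif op == "cauchy":
        x += scale * rng.standard_cauchy(x.size)    # heavy-tailed escapes
    elif op == "levy":
        x += scale * levy_step(x.size)              # occasional long jumps
    else:
        i = rng.integers(x.size)                    # single-point: one gene only
        x[i] += scale * rng.normal()
    return x
```

Mixing operators with different tail behavior lets the search alternate between fine local refinement (Gaussian) and long exploratory jumps (Cauchy, Lévy), which is the usual motivation for such hybrids.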

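Bagging itself is generic: train several base learners on bootstrap resamples of the training data and combine their predictions by majority vote. A minimal sketch with a caller-supplied `train_fn` (a hypothetical interface for illustration, not the thesis's):

```python
import numpy as np

rng = np.random.default_rng(1)

def bagging_predict(train_fn, X, y, X_test, n_estimators=5):
    # Train n_estimators base learners on bootstrap samples and majority-vote.
    X, y = np.asarray(X), np.asarray(y)
    votes = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), len(X))  # bootstrap sample with replacement
        model = train_fn(X[idx], y[idx])       # train_fn returns a predict callable
        votes.append(model(np.asarray(X_test)))
    votes = np.asarray(votes)                  # shape: (n_estimators, n_test)
    # Majority vote over the ensemble for each test record.
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

Because each learner sees a different resample, the ensemble's variance is lower than any single learner's, which is the standard argument for bagging improving classification performance.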
    The algorithm’s effectiveness is evaluated on three UCI benchmark datasets against seven benchmark classifiers, using five performance indicators for a comprehensive assessment. Both the experimental and case study results show that the proposed algorithm achieves superior classification performance compared with the benchmark classifiers. A case study is also conducted to explore the managerial implications and insights of the proposed algorithm in real-world applications.
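The five indicators listed in the table of contents (accuracy, precision, recall, F1 score, and Cohen's kappa) can all be derived from a confusion matrix. A self-contained sketch for a binary task (the function name is illustrative):

```python
import numpy as np

def five_indicators(y_true, y_pred):
    # Confusion-matrix counts for a binary (0/1) task.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    n = tp + tn + fp + fn
    acc = (tp + tn) / n
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (acc - p_e) / (1 - p_e) if p_e != 1 else 1.0
    return {"accuracy": acc, "precision": prec, "recall": rec,
            "f1": f1, "kappa": kappa}
```

Reporting kappa alongside accuracy is useful on imbalanced datasets, since it discounts the agreement a trivial majority-class classifier would achieve by chance.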

    Chinese Abstract I
    ABSTRACT II
    Acknowledgements III
    CONTENTS IV
    LIST OF FIGURES VI
    LIST OF TABLES VII
    LIST OF APPENDICES VIII
    CHAPTER 1 INTRODUCTION 1
      1.1 Background and Motivation 1
      1.2 Research Objectives 3
      1.3 Research Scope and Constraints 3
      1.4 Thesis Organization 4
    CHAPTER 2 LITERATURE REVIEWS 5
      2.1 Mixed Data 5
      2.2 Mixed Data Classification 6
        2.2.1 Enhancing existing classifiers 7
        2.2.2 Integrating multiple classifiers 7
        2.2.3 Distance-based classifiers 9
        2.2.4 Clustering-based classifiers 9
      2.3 Mixed Data Clustering 10
        2.3.1 Partitional clustering algorithms 11
        2.3.2 Hierarchical clustering algorithms 12
        2.3.3 Model-based clustering algorithms 14
        2.3.4 Neural network-based clustering algorithms 15
        2.3.5 Other clustering algorithms 17
      2.4 Genetic Algorithm 19
      2.5 Mutation Mechanism 22
      2.6 Ensemble Learning Methods 23
        2.6.1 Boosting 23
        2.6.2 Bagging 24
    CHAPTER 3 METHODOLOGY 26
      3.1 Methodology Framework 26
      3.2 Notations 29
      3.3 Algorithm Design 33
      3.4 Pseudocode 41
    CHAPTER 4 EXPERIMENTAL RESULTS 44
      4.1 Dataset Description 44
      4.2 Parameter Setting 45
      4.3 Performance Evaluation 47
        4.3.1 Sampling ratio 48
        4.3.2 Accuracy 49
        4.3.3 Precision 52
        4.3.4 Recall 54
        4.3.5 F1 score 56
        4.3.6 Cohen’s Kappa 58
        4.3.7 Complexity analysis 60
      4.4 Statistical Testing 62
        4.4.1 Shapiro-Wilk normality test 62
        4.4.2 Friedman test 63
        4.4.3 Pairwise comparison 63
    CHAPTER 5 CASE STUDY 68
      5.1 Dataset Description 68
      5.2 Case Study Results 70
        5.2.1 Fitness values 70
        5.2.2 Performance indicators 71
        5.2.3 Computational time 72
      5.3 Statistical Testing 74
      5.4 Discussion 77
    CHAPTER 6 CONCLUSIONS 78
      6.1 Conclusions 78
      6.2 Contributions 79
      6.3 Suggestions for Future Research 79
    REFERENCES 81
    APPENDIX A. RELATED LITERATURE OF THE DATASETS 97
    APPENDIX B. SHAPIRO-WILK NORMALITY TEST RESULTS 99
    APPENDIX C. PAIRWISE COMPARISON RESULTS 101


    Full text available from 2028/06/23 (campus network)
    Full text available from 2028/06/23 (off-campus network)
    Full text available from 2028/06/23 (National Central Library: Taiwan NDLTD system)