
Graduate student: 涂穎定 (Ying-Ting Tu)
Thesis title: 應用逐步維度縮減機制於非凌駕式排序基因演算法之連續分群分類架構
Non-Dominated Sorting Genetic Algorithm II – Sequential Clustering Classification with Stepwise Dimensionality Reduction Mechanism
Advisor: 楊朝龍 (Chao-Lung Yang)
Oral defense committee: 歐陽超 (Chao Ou-Yang), 花凱龍 (Kai-Lung Hua)
Degree: Master
Department: School of Management - Department of Industrial Management
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 58
Keywords: NSGAII-SCC, dimensionality reduction, principal component analysis
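    This research aims to use principal component analysis (PCA), a dimensionality reduction method, to improve the quality of the results produced by the Non-Dominated Sorting Genetic Algorithm II – Sequential Clustering Classification (NSGAII-SCC) data analysis framework. NSGAII-SCC divides the collected data, according to its attributes, into data that represents performance (Q data) and information related to performance (X data), and applies sequential clustering and classification to the Q data and the X data respectively. The framework aims to select appropriate attributes for the sequential clustering and classification process while maintaining both the compactness of the clustering and the accuracy of the classification. Previous research on NSGAII-SCC focused mainly on selecting the attribute combination that gives the best sequential clustering and classification results. This research instead adds a dimensionality reduction mechanism to improve the result quality of the original NSGAII-SCC framework, and explores the scenarios in which dimensionality reduction can be applied within it. The experimental results show that the proposed method not only significantly improves the quality of the analysis results but also keeps the computation speed no worse than that of the original NSGAII-SCC.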
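    The split into Q data and X data, followed by clustering and then classification, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the toy values, the one-dimensional two-means clustering, and the nearest-centroid classifier are all assumptions chosen to keep the example dependency-free, and the function names are hypothetical. The PCA step and the genetic search over attributes are omitted.

```python
from statistics import mean


def two_means_1d(values, iterations=20):
    """Cluster 1-D values into two groups (a simple stand-in for k-means)."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two centers.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = mean(members)
    return labels


def fit_nearest_centroid(rows, labels):
    """Compute one centroid of the X features per cluster label."""
    centroids = {}
    for k in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == k]
        centroids[k] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the label whose centroid is closest to the given row."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda k: sq_dist(centroids[k]))


# Hypothetical toy data: one performance value (Q) and two related
# features (X) per record.
q_data = [0.9, 1.1, 1.0, 4.8, 5.2, 5.0]
x_data = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3],
          [3.1, 2.0], [2.9, 2.2], [3.0, 1.9]]

# Step 1: cluster the Q data to obtain labels.
labels = two_means_1d(q_data)

# Step 2: use those labels as targets for a classifier on the X data.
centroids = fit_nearest_centroid(x_data, labels)
predicted = [predict(centroids, row) for row in x_data]

# Accuracy is measured on the training records here, for brevity only.
accuracy = sum(p == t for p, t in zip(predicted, labels)) / len(labels)
print(labels, accuracy)
```

    In the framework described above, a search procedure would repeat these two steps for many candidate attribute subsets and score each one on clustering compactness and classification accuracy.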


    This research aims to improve the performance of the Non-Dominated Sorting Genetic Algorithm II – Sequential Clustering Classification (NSGAII-SCC) framework by integrating a dimensionality reduction technique, Principal Component Analysis (PCA). NSGAII-SCC separates a dataset into two subsets. It first applies a clustering method to the subset containing performance-related features in order to generate labels. These labels are then used as prediction targets for a classification model that classifies the data in the second subset. The sequential clustering and classification (SCC) mechanism is integrated with a feature selection mechanism based on a genetic algorithm (GA). Through NSGA-II, the compactness of the clustering and the prediction accuracy of the classification are optimized simultaneously. Whereas most past studies sought the best combination of clustering and classification methods, this research aims to improve NSGAII-SCC by integrating a dimensionality reduction technique. The experimental results show that the proposed method outperforms the original NSGAII-SCC in terms of solution quality.
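    The simultaneous optimization of compactness and accuracy rests on Pareto dominance, which the following Python sketch illustrates. It is a simplified illustration rather than the thesis code: the candidate scores are invented, both objectives are assumed to be maximized (a compactness measure that is minimized would need its sign flipped), and only the first non-dominated front is extracted. The full NSGA-II also ranks later fronts and uses crowding distance, which are not shown.

```python
def dominates(a, b):
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s)
                       for other in solutions if other is not s)]


# Hypothetical (compactness, accuracy) scores for five candidate
# feature subsets; the numbers are invented for illustration.
candidates = [
    (0.62, 0.81),
    (0.70, 0.78),
    (0.55, 0.90),
    (0.60, 0.80),  # dominated by (0.62, 0.81)
    (0.70, 0.75),  # dominated by (0.70, 0.78)
]

front = non_dominated_front(candidates)
print(front)
```

    The three surviving candidates trade compactness against accuracy, so none can be called better than the others without an extra preference. This is why a multi-objective method returns a set of solutions rather than a single one.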

    ABSTRACT (Chinese)
    ABSTRACT
    ACKNOWLEDGEMENTS
    CONTENTS
    FIGURE LIST
    TABLE LIST
    CHAPTER 1 INTRODUCTION
      1.1 Research background
      1.2 Research objectives
      1.3 Research organization
    CHAPTER 2 LITERATURE REVIEW
      2.1 Combination of clustering and classification
      2.2 Multi-objective evolutionary algorithms
      2.3 Data analysis framework of sequential clustering and classification
      2.4 Dimensionality reduction techniques
    CHAPTER 3 METHODOLOGY
      3.1 Research framework
      3.2 Stage 1: Identification and separation of data
      3.3 Stage 2: PCA transformation and NSGAII-SCC
        3.3.1 Dataset X transformation by PCA
        3.3.2 Chromosome representation
        3.3.3 Sequential clustering and classification
        3.3.4 Fitness function for multi-objective optimization
        3.3.5 Elitism non-dominated sorting
        3.3.6 Stepwise dimensionality reduction mechanism
      3.4 Stage 3: Result comparison
    CHAPTER 4 EXPERIMENTS AND RESULTS
      4.1 Introduction to datasets
      4.2 Experiment results
        4.2.1 Experiment 1
        4.2.2 Experiment 2
        4.2.3 Experiment 3
    CHAPTER 5 CONCLUSION
    REFERENCE
    APPENDIX


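    Full-text release date: 2024/04/09 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)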