
Graduate student: Muhammad Rakhmat Setiawan
Thesis title: Data Analysis Framework of Sequential Clustering and Classification Using Deep Learning Technique and Multi-objective Sine-Cosine Algorithm
Advisor: Ren-Jieh Kuo (郭人介)
Oral defense committee: Chao Ou-Yang (歐陽超), Kung-Jeng Wang (王孔政)
Degree: Master
Department: College of Management - Department of Industrial Management
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 86
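Keywords (Chinese): Deep learning, Multi-objective optimization, Classification, Feature selection, Multi-objective sine-cosine algorithm
Keywords (English): Deep clustering, Multi-objective optimization, Classification, Feature selection, Multi-objective sine-cosine algorithm (MOSCA)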
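  • For decades, data clustering and classification have been two important data mining methods applied across many fields. Although the two methods can be applied separately, they are often used together for data exploration or data analysis, especially when data labels have not been defined. When classification labels are unavailable or ambiguous, the preliminary data analysis framework presented in this thesis can address the problem.

    Therefore, this thesis proposes a new data analysis framework that applies a metaheuristic called the multi-objective sine-cosine algorithm and uses deep clustering to reveal the value in the data. The study addresses the deep clustering task by combining an autoencoder with the K-means algorithm, which provides better performance than previous algorithms. For the classification problem, the study applies three algorithms: support vector machine (SVM), decision tree, and back-propagation neural network. The resulting method combines the multi-objective sine-cosine algorithm, deep clustering, and classification algorithms to explore the overall data structure before a prediction model is run, and is named DeepCluster MOSCA-SCC (deep clustering and multi-objective sine-cosine algorithm for sequential clustering and classification). Its performance is evaluated against previously proposed algorithms such as NSGAII-SCC and the regular MOSCA-SCC. Based on twelve comparison results, the experiments show that, using the reciprocal of accuracy as the basis for comparison and with three clusters, DeepCluster MOSCA combined with a decision tree is more accurate than the other algorithms.

    Keywords: Deep learning, Multi-objective optimization, Classification, Feature selection, Multi-objective sine-cosine algorithm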


    Data clustering and classification are two significant data mining techniques that have been applied in many areas for decades. Although they can be applied individually, they are often used together for data exploration or data analytics, particularly when data labels have not been defined. The proposed framework can serve as a preliminary data analysis tool when the labels for classification are unavailable or the objective of the data analysis is only loosely defined.

    Thus, this research proposes an innovative data analytics framework that employs a new meta-heuristic, the multi-objective sine cosine algorithm (MOSCA), and uses a deep clustering technique to reveal the data pattern. The deep clustering task is handled by an autoencoder combined with the K-means algorithm. For the classification problem, three classification algorithms are implemented: support vector machine (SVM), decision tree, and back-propagation neural network. The resulting framework, named DeepCluster MOSCA-SCC, combines MOSCA, deep clustering, and classification algorithms, and is used to explore the data structure before a prediction model is implemented. Performance is evaluated by comparing the proposed framework with existing frameworks such as NSGAII-SCC and the regular MOSCA-SCC. Based on twelve comparison results, the experiments indicate that DeepCluster MOSCA-SCC with decision tree performs better than the other algorithms in terms of 1/accuracy when the number of clusters is equal to three.
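The sequential clustering-then-classification idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: plain K-means on raw features stands in for the autoencoder-based deep clustering, a 1-nearest-neighbour rule stands in for the SVM, decision tree, and back-propagation neural network classifiers, and no MOSCA search is performed. All function names (`kmeans`, `inverse_accuracy`) and the toy data are assumptions made for illustration. The sketch only shows how cluster assignments can serve as labels for a classifier and how a 1/accuracy score could then be computed.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain K-means (stand-in for deep clustering); returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p, centroids) for p in points]


def inverse_accuracy(points, labels, train_frac=0.7, seed=0):
    """Train a 1-NN classifier on the cluster labels and return 1/accuracy on a holdout."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]
    correct = 0
    for i in test:
        j = min(train, key=lambda t: sq_dist(points[t], points[i]))
        correct += labels[j] == labels[i]
    acc = correct / len(test)
    return 1.0 / acc if acc > 0 else float("inf")


# Toy data: three Gaussian blobs (hypothetical, for illustration only).
rng = random.Random(1)
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
          for cx, cy in centers for _ in range(20)]

labels = kmeans(points, k=3)               # step 1: clustering produces labels
score = inverse_accuracy(points, labels)   # step 2: classification scored as 1/accuracy
print(score)
```

A lower score is better, with 1.0 corresponding to a classifier that reproduces every held-out cluster label.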

    Keywords: Deep clustering, Multi-objective optimization, Classification, Feature selection, Multi-objective sine-cosine algorithm (MOSCA).
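For readers unfamiliar with the sine-cosine search that MOSCA builds on, the following is a minimal sketch of the basic single-objective sine cosine update. It is an illustration only: the multi-objective handling used in the thesis is not shown, and the function name `sca_minimize`, its parameters, and the sphere test function are assumptions introduced here. Each coordinate moves toward or around the best solution found so far, using a sine or cosine term whose amplitude shrinks linearly over the iterations.

```python
import math
import random


def sca_minimize(objective, dim, bounds, n_agents=20, n_iters=100, a=2.0, seed=0):
    """Single-objective sine cosine search (sketch); returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_val = objective(best)
    for t in range(n_iters):
        r1 = a - t * a / n_iters  # amplitude decreases linearly from a toward 0
        for x in agents:
            for d in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)  # direction of the move
                r3 = rng.uniform(0.0, 2.0)            # random weight on the destination
                r4 = rng.random()                     # switch between sine and cosine
                step = abs(r3 * best[d] - x[d])
                if r4 < 0.5:
                    x[d] += r1 * math.sin(r2) * step
                else:
                    x[d] += r1 * math.cos(r2) * step
                x[d] = min(max(x[d], lo), hi)  # clamp to the search bounds
            val = objective(x)
            if val < best_val:
                best, best_val = x[:], val
    return best, best_val


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val = sca_minimize(sphere, dim=3, bounds=(-10.0, 10.0), seed=0)
print(best, best_val)
```

A multi-objective variant would replace the single best solution with a set of non-dominated solutions; that part is deliberately left out of this sketch.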

    CHINESE ABSTRACT
    ABSTRACT
    ACKNOWLEDGEMENT
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    LIST OF APPENDICES
    CHAPTER I - INTRODUCTION
    1.1 Background and Motivation
    1.2 Research Objectives
    1.3 Research Scope and Assumption
    1.4 Thesis Organization
    CHAPTER II - LITERATURE REVIEW
    2.1 K-Means Algorithm
    2.2 Autoencoder
    2.3 Multi-objective Sine-Cosine Algorithm (MOSCA)
    2.4 Sequential Clustering and Classification
    CHAPTER III - METHODOLOGY
    3.1 Research Framework
    3.2 Data Collection
    3.3 Solution Representation
    3.4 DeepCluster MOSCA-Sequential Clustering and Classification
    3.5 Performance Evaluation
    CHAPTER IV - EXPERIMENTAL RESULTS
    4.1 Datasets
    4.2 Parameter Setting
    4.3 The Best Combination of Deep Clustering and Classification
    4.4 Computational Time
    4.5 Time Complexity
    4.6 Statistical Hypothesis
    4.7 Application of DeepCluster MOSCA-SCC with Decision Tree in Data Analysis
    CHAPTER V - CONCLUSIONS
    5.1 Conclusions
    5.2 Research Limitations
    5.3 Contributions
    5.4 Suggestions for Future Research
    REFERENCES
    APPENDIX


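    Full text release date: 2025/06/18 (campus network)
    Full text release date: 2025/06/18 (off-campus network)
    Full text release date: 2025/06/18 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)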