簡易檢索 / 詳目顯示

研究生: 彭奕樺
Yi-Hua - Peng
論文名稱: 限制條件對值譜分群之影響
The Impact of Constraints on Spectral Clustering
指導教授: 陳正綱
Cheng-Kang Chen
楊朝龍
Chao-Lung Yang
口試委員: 曹譽鐘
Yu-Chung Tsao
學位類別: 碩士
Master
系所名稱: 管理學院 - 資訊管理系
Department of Information Management
論文出版年: 2017
畢業學年度: 105
語文別: 英文
論文頁數: 50
中文關鍵詞: 值譜分群法限制分群必連限制最近鄰居法
外文關鍵詞: Spectral clustering, Constrained clustering, Must-Link constraints, k-nearest neighborhood
相關次數: 點閱:337下載:3
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 值譜分群演算法乃是將數據中的每個資料點視為一個圖(Graph)的頂點V,並將資料點(即頂點)之間的相似度量化為頂點之間的連接邊E。如此可得到一個基於相似度的無方向之加權圖G(V,E)。於是,資料分群問題就利用此一加權圖轉化為圖的劃分問題,並利用圖論劃分準則使劃分後的子圖(Sub-graph)內部相似度最大,相對的,子圖之間的相似度最小,而達成分群的目的。本研究基於先前的研究基礎,探討限制式分群(Constraints Clustering)中之必連(Must-Link,ML)限制條件對值譜分群結果的影響。由於值譜分群核心結構-距離矩陣(distance matrix)可透過相似度矩陣及最近鄰居法(k-NN)產生鄰近矩陣(affinity matrix)。本研究設定兩種必連限制:(1)必連限制在鄰近矩陣內、(2)必連限制在鄰近矩陣外,以探討此兩種限制對分群的變化。本研究實驗使用三個公開的UCI資料,分別針對所提出之演算法進行實驗。實驗分群結果利用評比指標 Adjusted Rand Index (ARI)來比較有限制與沒有限制時之分群結果。實驗結果發現必連限制在鄰近矩陣內對分群結果之ARI會產生較大的變異。此外,必連限制在鄰近矩陣外會使得值譜分群結果較沒有必連限制時產生較大的差異,而且ARI在特定之限制數量後會進行收斂。


    Spectral clustering algorithm considers data points as vertexes in the undirected graph where the vertex set is V=[v_1,v_2,…, v_n ] and each edge between two vertexes v_i and v_j have a non-negative weight value (w_ij) by the similarity function. Thus, spectral clustering is based on the graph partition theory as it treats the data clustering problem as a graph partitioning problems. The objective of graph partition is to cluster data points to maximize the similarity of inner sub-graph of data points as well as to minimize the similarity between sub-graphs. This research aims to investigate the impact of clustering results when Must-Link constraints are applied under spectral clustering. Based on process of spectral clustering, the affinity matrix is generated from the similarity matrix where k-nearest neighborhood method is used to determine the closer data points. Two kind of ML constraints can be applied either in or out of affinity matrix: (1) ML constraints on affinity matrix of k-NN, and (2) ML constraints on affinity matrix of non k-NN. A series of experiments by using three UCI datasets were conducted to compare the clustering results under these two conditions. Adjusted Rand Index (ARI) was used to compare the clustering results with or without the mentioned constrains. The experimental results show the variation of ARI is larger when ML constraints on affinity matrix are applied. Furthermore, the dramatic drop of ARI appears if ML constraints on affinity matrix of non k-NN are applied. However, the descending of ARI will converged after certain number of ML constraints.

    摘要 i ABSTRACT ii CHAPTER 1 INTRODUCTION 1 1.1. Research background 1 1.2. Research objectives 3 1.3. Structure of this Thesis 4 CHAPTER 2 LITERATURE REVIEW 5 2.1. Spectral clustering algorithm 5 2.2. Application of spectral clustering 8 2.3. Constrained spectral clustering 12 2.4. Adjusted Rand Index 15 CHAPTER 3 RESEARCH METHODOLOGY 17 3.1. Detailed information about spectral clustering 17 3.1.1. k-nearest neighbors 17 3.2. Integrate the instance-level constraint on the spectra clustering 20 3.2.1. Pre-processing of instance-level Constraints 21 3.2.2. Constraints on matrix L 23 CHAPTER 4 EXPERIMENTAL RESULTS 28 4.1. UCI datasets 28 4.2 Results 28 4.2.1 Soybean Dataset 29 4.2.2 Zoo Dataset 31 4.2.3 Wine Dataset 32 CHAPTER 5 DISCUSSION AND CONCLUSION 34 REFERENCE 35

    Alush, A., Friedman, A., & Goldberger, J. (2016). Pairwise clustering based on the mutual-information criterion. Neurocomputing, 182, 284-293. doi:http://dx.doi.org/10.1016/j.neucom.2015.12.025
    Alzate, C., & Suykens, J. A. K. (2012). Hierarchical kernel spectral clustering. Neural Networks, 35, 21-30. doi:http://dx.doi.org/10.1016/j.neunet.2012.06.007
    Cai, D., & Chen, X. (2015). Large Scale Spectral Clustering Via Landmark-Based Sparse Representation. IEEE Transactions on Cybernetics, 45(8), 1669-1680. doi:10.1109/TCYB.2014.2358564
    Capocci, A., Servedio, V. D. P., Caldarelli, G., & Colaiori, F. (2005). Detecting communities in large networks. Physica a-Statistical Mechanics and Its Applications, 352(2-4), 669-676. doi:10.1016/j.physa.2004.12.050
    Chen, C., Zhang, L., Bu, J., Wang, C., & Chen, W. (2010). Constrained Laplacian Eigenmap for dimensionality reduction. Neurocomputing, 73(4–6), 951-958. doi:http://dx.doi.org/10.1016/j.neucom.2009.08.021
    Chen, N., Chen, A., Zhou, L., & Lu, L. (2001). A graph-based clustering algorithm in large transaction databases. Intelligent Data Analysis, 327–338. Retrieved from http://content.iospress.com/articles/intelligent-data-analysis/ida00059
    Chen, W. Y., Song, Y., Bai, H., Lin, C. J., & Chang, E. Y. (2011). Parallel Spectral Clustering in Distributed Systems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3), 568-586. doi:10.1109/TPAMI.2010.88
    Cheng, J., Qiao, M., Bian, W., & Tao, D. (2011). 3D human posture segmentation by spectral clustering with surface normal constraint. Signal Processing, 91(9), 2204-2212. doi:http://dx.doi.org/10.1016/j.sigpro.2011.04.003
    Chifu, A.-G., Hristea, F., Mothe, J., & Popescu, M. (2015). Word sense discrimination in information retrieval: A spectral clustering-based approach. Information Processing & Management, 51(2), 16-31. doi:http://dx.doi.org/10.1016/j.ipm.2014.10.007
    Chrysouli, C., & Tefas, A. (2015). Spectral clustering and semi-supervised learning using evolving similarity graphs. Applied Soft Computing, 34, 625-637. doi:http://dx.doi.org/10.1016/j.asoc.2015.05.026
    Chung, F. R. K. (1999). Spectral Graph Theory SIGACT News, 30(2), 14-16. doi:10.1145/568547.568553
    Ding, C., He, X. F., & Simon, H. D. (2005). Nonnegative Lagrangian relaxation of K-means and spectral clustering. In J. Gama, R. Camacho, P. Brazdil, A. Jorge, & L. Torgo (Eds.), Machine Learning: Ecml 2005, Proceedings (Vol. 3720, pp. 530-538).
    Ding, C. H. Q., Xiaofeng, H., Hongyuan, Z., Ming, G., & Simon, H. D. (2001, 2001). A min-max cut algorithm for graph partitioning and data clustering. Paper presented at the Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on.
    Ding, L., Wall, P., & Terzija, V. (2014). Constrained spectral clustering based controlled islanding. International Journal of Electrical Power & Energy Systems, 63, 687-694. doi:http://dx.doi.org/10.1016/j.ijepes.2014.06.016
    Ding, S., Qi, B., Jia, H., Zhu, H., & Zhang, L. (2013). Research of semi-supervised spectral clustering based on constraints expansion. Neural Computing & Applications, 22, S405-S410. doi:10.1007/s00521-012-0911-8
    Frias-Martinez, V., & Frias-Martinez, E. (2014). Spectral clustering for sensing urban land use using Twitter activity. Engineering Applications of Artificial Intelligence, 35, 237-245. doi:http://dx.doi.org/10.1016/j.engappai.2014.06.019
    Hagen, L., & Kahng, A. (1991, 11-14 Nov. 1991). Fast spectral methods for ratio cut partitioning and clustering. Paper presented at the Computer-Aided Design, 1991. ICCAD-91. Digest of Technical Papers., 1991 IEEE International Conference on.
    Hamad, D., & Biela, P. (2008, 7-11 April 2008). Introduction to spectral clustering. Paper presented at the Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International Conference on.
    Hong, X., Wang, J., & Qi, G. (2014). Comparison of spectral clustering, K-clustering and hierarchical clustering on e-nose datasets: Application to the recognition of material freshness, adulteration levels and pretreatment approaches for tomato juices. Chemometrics and Intelligent Laboratory Systems, 133, 17-24. doi:http://dx.doi.org/10.1016/j.chemolab.2014.01.017
    Huang, S., Wang, H., Li, D., Yang, Y., & Li, T. (2015). Spectral co-clustering ensemble. Knowledge-Based Systems, 84, 46-55. doi:http://dx.doi.org/10.1016/j.knosys.2015.03.027
    İnkaya, T. (2015). A parameter-free similarity graph for spectral clustering. Expert Systems with Applications, 42(24), 9489-9498. doi:http://dx.doi.org/10.1016/j.eswa.2015.07.074
    Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: a review. ACM Comput. Surv., 31(3), 264-323. doi:10.1145/331499.331504
    Ji, X., & Xu, W. (2006). Document clustering with prior knowledge. Paper presented at the Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, Seattle, Washington, USA. http://delivery.acm.org/10.1145/1150000/1148241/p405-ji.pdf?ip=140.118.201.244&id=1148241&acc=ACTIVE%20SERVICE&key=AF37130DAFA4998B%2E67FA4CC52BCCA472%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&CFID=588129369&CFTOKEN=27581885&__acm__=1456975385_d9786a0fd241be7ecdbfccdd2ba89f20
    Jia, H., Ding, S., Xu, X., & Nie, R. (2014). The latest research progress on spectral clustering. Neural Comput. Appl., 24(7-8), 1477-1486. doi:10.1007/s00521-013-1439-2
    Jiang, H., Ren, Z., Xuan, J., & Wu, X. (2013). Extracting elite pairwise constraints for clustering. Neurocomputing, 99(0), 124-133. doi:http://dx.doi.org/10.1016/j.neucom.2012.06.013
    Jiao, L. C., Shang, F., Wang, F., & Liu, Y. (2012). Fast semi-supervised clustering with enhanced spectral embedding. Pattern Recognition, 45(12), 4358-4369. doi:http://dx.doi.org/10.1016/j.patcog.2012.05.007
    Khoreva, A., Galasso, F., Hein, M., & Schiele, B. (2014). Learning Must-Link Constraints for Video Segmentation Based on Spectral Clustering. In X. Jiang, J. Hornegger, & R. Koch (Eds.), Pattern Recognition, Gcpr 2014 (Vol. 8753, pp. 701-712).
    Klein, D., Kamvar, S. D., & Manning, C. D. (2002). From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering. Paper presented at the Proceedings of the Nineteenth International Conference on Machine Learning.
    Krishnan, D., Fattal, R., & Szeliski, R. (2013). Efficient preconditioning of laplacian matrices for computer graphics. ACM Trans. Graph., 32(4), 1-15. doi:10.1145/2461912.2461992
    Liu, D., Wang, J., & Wang, H. (2015). Short-term wind speed forecasting based on spectral clustering and optimised echo state networks. Renewable Energy, 78, 599-608. doi:http://dx.doi.org/10.1016/j.renene.2015.01.022
    Luxburg, U. (2007). A tutorial on spectral clustering. Statistics and Computing, 17(4), 395-416. doi:10.1007/s11222-007-9033-z
    Mcsherry, F. (2004). Spectral methods for data analysis. University of Washington.
    Nascimento, M. C. V., & de Carvalho, A. C. P. L. F. (2011). Spectral methods for graph clustering – A survey. European Journal of Operational Research, 211(2), 221-231. doi:http://dx.doi.org/10.1016/j.ejor.2010.08.012
    Ng, A., Jordan, M., & Weiss, Y. (2001). On Spectral Clustering: Analysis and an Algorithm. Advances in Neural Information Processing, 14(2), 849-856. doi:citeulike-article-id:13599430
    Nie, F., Wang, X., & Huang, H. (2014). Clustering and projected clustering with adaptive neighbors. Paper presented at the Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, New York, New York, USA. http://delivery.acm.org/10.1145/2630000/2623726/p977-nie.pdf?ip=140.118.201.244&id=2623726&acc=ACTIVE%20SERVICE&key=AF37130DAFA4998B%2E67FA4CC52BCCA472%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&CFID=623257003&CFTOKEN=59017680&__acm__=1464706643_2aa0aa10c99d1d837fc1d08cbeffcee9
    Ozertem, U., Erdogmus, D., & Jenssen, R. (2008). Mean shift spectral clustering. Pattern Recognition,

    無法下載圖示
    全文公開日期 本全文未授權公開 (校外網路)
    全文公開日期 本全文未授權公開 (國家圖書館:臺灣博碩士論文系統)
    QR CODE