Graduate Student: 李易儒 Yi-Ru Li
Thesis Title: 應用區間數值的物理意義於分群及預測模型建構 (Symbolic Data Processing with Physical Meanings of Interval Values for Clustering and Forecasting)
Advisor: 蘇順豐 Shun-Feng Su
Committee Members: 郭重顯 Chung-Hsien Kuo, 莊鎮嘉 Chen-Chia Chuang, 鄭錦聰 Jin-Tsong Jeng
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2015
Academic Year of Graduation: 103
Thesis Language: English
Pages: 74
Keywords: Symbolic interval-valued data; Fuzzy c-means (FCM); Radial basis function networks (RBFN); Physical meanings of interval values
Abstract (translated from the Chinese): This thesis analyzes and discusses the clustering and forecasting of interval-valued data. On the surface, interval-valued data differ from single-point data only in the number of values per datum (two values instead of one). However, interval-valued data carry more than their surface numbers: the interval values also have physical meanings. This thesis therefore proposes a modified fuzzy c-means (FCM) algorithm that takes the physical meanings of interval-valued data into account. The range is treated as a measure of the reliability of the interval's centre, and the objective function is multiplied by an exponential function inversely proportional to the range, so that data with larger ranges (i.e., less reliable data) have less influence on the clustering result. The subsequent experiments show that this modified clustering method, which considers both the numerical values and their physical meanings, improves the clustering results. Furthermore, the centroids of the resulting clusters can serve as local feature points for the forecasting model, and the physical meanings of interval-valued data can likewise be exploited in a radial basis function network forecasting model to improve its performance.
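The reliability-weighting idea described above can be sketched as follows. This is a minimal illustration, assuming the weight enters the clustering objective as an exponential factor `exp(-gamma * range)` multiplying each sample's distance term; the thesis's exact weighting and update rules may differ, and the names `weighted_fcm` and `gamma` are hypothetical.

```python
import numpy as np

def weighted_fcm(centers_x, ranges_x, n_clusters=2, m=2.0, gamma=1.0,
                 max_iter=100, tol=1e-5, seed=0):
    """Fuzzy c-means on interval midpoints, down-weighting wide intervals.

    Hypothetical weighting: w_k = exp(-gamma * range_k), so samples with
    large ranges (less reliable midpoints) influence the centroids less.
    """
    X = np.asarray(centers_x, dtype=float).reshape(len(centers_x), -1)
    w = np.exp(-gamma * np.asarray(ranges_x, dtype=float))  # reliability weights
    rng = np.random.default_rng(seed)
    U = rng.random((n_clusters, len(X)))
    U /= U.sum(axis=0)                                      # memberships sum to 1
    for _ in range(max_iter):
        um = (U ** m) * w                                   # weighted fuzzy memberships
        C = um @ X / um.sum(axis=1, keepdims=True)          # weighted centroid update
        d = np.linalg.norm(X[None, :, :] - C[:, None, :], axis=2) + 1e-12
        U_new = d ** (-2.0 / (m - 1.0))                     # standard FCM membership rule
        U_new /= U_new.sum(axis=0)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return C, U
```

Only the centroid update is weighted here; data with a large range still receive memberships, they just pull the centroids less.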
Abstract (English): In this thesis, clustering methods and forecasting models for interval-valued data are studied and discussed. From a purely numerical viewpoint, interval-valued data differ from single-point data only in the number of attributes per datum. However, those attributes (lower/upper bounds, or centre/range) also carry physical meanings. In this study, an adaptive fuzzy c-means clustering method is proposed that incorporates the physical meanings of interval values. The range is regarded as the confidence of the centre value, and an exponential function inversely proportional to the range is introduced into the objective function to capture this idea, so that data with less confidence have less influence on the training process. Our experiments show that using both the numerical values and their physical meanings improves the clustering performance, and that this improvement also benefits the subsequent data analysis. A forecasting model is then constructed as a radial basis function network (RBFN) whose centres are the cluster centroids obtained from the proposed adaptive fuzzy c-means method. By applying the same idea of including the physical meaning of interval values in the training process, the resulting model yields better predictions, as demonstrated in our experiments.
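The second step, feeding cluster centroids into an RBFN, can be illustrated with a generic sketch: a network with fixed Gaussian centres (here playing the role of the FCM centroids) and output weights solved by least squares. This is not the thesis's interval-weighted training procedure; the function name `rbfn_fit_predict` and the common basis width `sigma` are assumptions for illustration.

```python
import numpy as np

def rbfn_fit_predict(X_train, y_train, centers, sigma, X_test):
    """RBFN with fixed centres (e.g., cluster centroids) and least-squares weights.

    Generic sketch: Gaussian basis functions of assumed common width 'sigma',
    plus a bias term; output weights fitted by linear least squares.
    """
    def design(X):
        # Squared distances from every sample to every centre.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        Phi = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian activations
        return np.hstack([Phi, np.ones((len(X), 1))])   # append bias column
    W, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)
    return design(X_test) @ W
```

Because the centres are fixed, training reduces to one linear solve, which is the usual appeal of RBFN forecasting once a clustering stage has supplied good local feature points.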