
Graduate Student: Yu-Chen Cheng (鄭宇辰)
Thesis Title: Using Autoencoder to Facilitate Information Retention for Data Dimension Reduction
Advisor: Jenq-Shiou Leu (呂政修)
Committee Members: Wen-Hsien Fang (方文賢)
Wei-Kuan Shih (石維寬)
Hsing-Lung Chen (陳省隆)
Yie-Tarng Chen (陳郁堂)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of Publication: 2017
Academic Year of Graduation: 105
Language: Chinese
Number of Pages: 48
Chinese Keywords: Autoencoder (自動編碼器), Neural network (類神經網路), Dimensionality reduction algorithm (降維演算法)
Foreign Keywords: Autoencoder, Neural network, Dimensionality reduction
    Dimensionality reduction is a common preprocessing technique, frequently used in machine learning and data mining. Different dimensionality reduction algorithms have their own characteristics and suit different kinds of problems. For example, principal component analysis (PCA) is a widely used dimensionality reduction algorithm, and linear discriminant analysis (LDA) as well as various nonlinear methods can also effectively reduce the dimensionality of data. Dimensionality reduction can further serve as a preprocessing step for a classifier: training the classifier on lower-dimensional data speeds up learning and can improve classification accuracy, because an effective dimensionality reduction algorithm removes noise from the data and strengthens the informativeness of each remaining dimension. Open datasets that have not been preprocessed by domain experts usually contain many features irrelevant to the class labels, which makes them difficult to process directly. In other words, applying a dimensionality reduction algorithm first can denoise the data to some extent and make its characteristics clearer. However, dimensionality reduction inevitably loses some information, and the farther the target dimension is from the original dimension, the more pronounced this loss of retained information becomes. To address this problem, we use the encoder part of an autoencoder as the dimensionality reduction method, combine it with closely related approaches, and compare the result against other dimensionality reduction methods. The autoencoder is a highly flexible structure; we use it as a preprocessing step for a Support Vector Machine (SVM) and judge how well information is retained from the resulting classification performance.
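The pattern described above (dimensionality reduction as a preprocessing step for an SVM, judged by classification accuracy) can be sketched as follows. This is a minimal illustration assuming scikit-learn; the dataset, target dimension, and hyperparameters are illustrative choices, not the thesis's actual experimental setup.

```python
# Sketch: dimensionality reduction (PCA) as preprocessing for an SVM classifier.
# The retained information is judged indirectly via test-set accuracy.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)              # 64-dimensional digit images
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Reduce 64 dimensions to 16 with PCA, then classify with an RBF-kernel SVM.
model = make_pipeline(PCA(n_components=16), SVC(kernel="rbf"))
model.fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
print(f"test accuracy with 16 PCA components: {acc:.3f}")
```

Swapping the `PCA` step for any other dimensionality reduction mapping (for example, the encoder of a trained autoencoder) keeps the evaluation protocol identical, which is what makes the comparison fair.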


    Abstract – With the development of the Internet, large amounts of diverse data appear rapidly. The number of features also increases as data-collection technology matures. Interpreting such data is usually not easy, because only a minority of people have background knowledge of the data and its features. Dimensionality reduction (DR) has therefore become a familiar method to reduce the number of features while keeping the critical information. The benefits of DR are that it can sweep away useless noise and strengthen the overall characteristics of the data. However, some loss of information during dimensionality reduction is unavoidable. When the target dimension is far lower than the original dimension, the loss is usually too high to be tolerable. To address this problem, we use the encoder structure from an autoencoder and compare it with several common dimensionality reduction methods. The autoencoder is an unconstrained and flexible structure. We use the simplest autoencoder structure as the preprocessing step for a Support Vector Machine (SVM) and examine the resulting classification performance.
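To make the "simplest autoencoder" concrete, the sketch below trains a one-hidden-layer linear autoencoder in plain NumPy (an assumption for illustration; the thesis itself uses Keras) and takes the encoder output as the reduced-dimension representation. The data, sizes, and learning rate are arbitrary placeholders.

```python
# Sketch: a minimal linear autoencoder trained by gradient descent on
# reconstruction MSE; the encoder weights define the DR mapping.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                  # 200 samples, 20 features
d_in, d_hid = X.shape[1], 5                     # reduce 20 dims to 5

W_enc = rng.normal(scale=0.1, size=(d_in, d_hid))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(d_hid, d_in))   # decoder weights
lr = 0.05

def recon_mse():
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

mse_before = recon_mse()
for _ in range(1000):                           # plain full-batch gradient descent
    H = X @ W_enc                               # encoder output (code layer)
    err = H @ W_dec - X                         # reconstruction error
    grad_dec = H.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
mse_after = recon_mse()

Z = X @ W_enc                                   # reduced representation, shape (200, 5)
```

After training, `Z` plays the same role as the output of PCA: a lower-dimensional representation that can be fed to a downstream classifier such as an SVM.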

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures and Tables
    Chapter 1: Introduction
    Chapter 2: Background and Related Work
      2.1 Neural Networks
      2.2 Unsupervised Linear Dimensionality Reduction: Principal Component Analysis (PCA)
      2.3 PCA and Autoencoders
      2.4 Autoencoders
      2.5 Support Vector Machine (SVM)
      2.6 Related Work
    Chapter 3: System Architecture
      3.1 Main Architecture
        3.1.1 PCA + Autoencoder Architecture (PCA_AE)
        3.1.2 Autoencoder + PCA Architecture (AE_PCA)
    Chapter 4: Experiments and Evaluation
      4.1 Environment Setup
      4.2 Datasets
      4.3 Comparing Classifiers Combined with the Two Architectures
      4.4 Results: MNIST
      4.5 Results: Reuters
        4.5.1 Discussion: MNIST/Reuters
      4.6 Results: CIFAR-10
        4.6.1 Discussion: CIFAR-10
      4.7 Discussion
    Chapter 5: Conclusions and Future Work
    References

