
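Author: I-Ting Tsai (蔡宜庭)
Thesis title: Algorithm and Circuit Architecture Design for Tensor Decomposition based on FPGA (基於現場可程式化邏輯閘陣列之張量分解演算法與電路架構設計)
Advisor: Chung-An Shen (沈中安)
Oral defense committee: 林昌鴻, 黃琴雅, 吳晉賢, Chung-An Shen (沈中安)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science (電資學院 - 電子工程系)
Year of publication: 2023
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 42
Keywords (translated from Chinese): tensor decomposition, higher-order orthogonal iteration, hardware architecture
Keywords (English): Tensor decomposition, Higher-order orthogonal iteration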

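With the arrival of the big data era, the volume of data to be processed keeps growing, and the analysis and processing of multi-dimensional data has become a research direction in many fields. A tensor is a multi-dimensional data structure widely used in signal processing, machine learning, communication systems, and other areas. The most studied tensor operation is tensor decomposition, which has broad applications in fields such as data analysis, signal processing, and image processing. Tensor decomposition involves a large number of operations; when processing large data sets, for example, conventional software implementations can suffer from low computational efficiency. Designing high-performance tensor decomposition circuits has therefore become an important research topic. This thesis proposes a circuit architecture and an improved algorithm for tensor decomposition. It first designs a circuit architecture based on the parallelized higher-order orthogonal iteration algorithm in order to raise hardware utilization and system performance. It then proposes an improved algorithm, together with a corresponding circuit design, to further improve the performance of the tensor decomposition system. In the architecture based on the parallelized algorithm, circuit components are shared to raise hardware utilization, and the order of the computation steps is changed to raise computational efficiency. On the algorithm side, the proposed algorithm can update the matrices simultaneously and reduces the number of iterations. On the architecture side, this thesis proposes a tensor decomposition architecture with shared hardware circuits, which raises hardware utilization and reduces circuit area. The circuits are designed and evaluated on an FPGA platform and show improved hardware performance compared with designs in the prior literature.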


With the advent of the big data era, the amount of data to be processed keeps growing, and the analysis and processing of multi-dimensional data has become a research direction in many fields. A tensor is a multi-dimensional data structure, and the most widely studied tensor operation is tensor decomposition. Tensor decomposition has broad applications in fields such as data analysis, signal processing, and image processing. Because it involves a large number of operations, designing efficient tensor decomposition circuits is important. This thesis proposes a circuit architecture and an optimized algorithm for tensor decomposition. First, a circuit is designed around a parallelized higher-order orthogonal iteration (HOOI) algorithm to improve hardware utilization. Second, the performance of tensor decomposition is improved further by proposing an optimized algorithm and designing a circuit for it. In the architecture based on the parallelized algorithm, circuit components are shared to improve hardware utilization, and the order of the operations is changed so that latency stays short and operating efficiency is maintained. In the optimized algorithm, the matrices can be updated simultaneously and the number of iterations is reduced. For this algorithm, a tensor decomposition architecture with shared hardware circuits is proposed, which improves hardware utilization and reduces circuit area. The designs are implemented and evaluated on an FPGA platform and achieve better hardware performance than previous designs in the literature.
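For readers unfamiliar with higher-order orthogonal iteration, the following is a minimal NumPy sketch of the textbook form of the algorithm. It is an illustration only, not the thesis's method: it updates the factor matrices one mode at a time and uses a full SVD, whereas the abstract describes an improved algorithm that updates the matrices simultaneously on dedicated hardware. The function names (`unfold`, `ttm`, `hooi`, `reconstruct`) and the column ordering used in `unfold` are assumptions made for this example.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def ttm(tensor, matrix, mode):
    """Tensor-times-matrix product along `mode`; `matrix` has shape (J, I_mode)."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))  # contracted axis is replaced by J at axis 0
    return np.moveaxis(out, 0, mode)                    # move J back to position `mode`

def hooi(tensor, ranks, n_iter=10):
    """Textbook HOOI (sequential updates); returns (core, factors)."""
    # Initialise each factor with the leading left singular vectors (HOSVD).
    factors = []
    for n, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, n), full_matrices=False)
        factors.append(u[:, :r])
    for _ in range(n_iter):
        for n in range(tensor.ndim):
            # Project the tensor onto every factor except the one being updated.
            y = tensor
            for m in range(tensor.ndim):
                if m != n:
                    y = ttm(y, factors[m].T, m)
            u, _, _ = np.linalg.svd(unfold(y, n), full_matrices=False)
            factors[n] = u[:, :ranks[n]]
    # The core tensor is the projection onto all factors.
    core = tensor
    for m in range(tensor.ndim):
        core = ttm(core, factors[m].T, m)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the approximated tensor from the core and the factor matrices."""
    out = core
    for m, u in enumerate(factors):
        out = ttm(out, u, m)
    return out
```

A call such as `core, factors = hooi(x, (2, 3, 2))` returns a core of shape `(2, 3, 2)` and one factor matrix with orthonormal columns per mode; `reconstruct(core, factors)` gives the low-rank approximation of `x`.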

Abstract in Chinese
Abstract in English
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work and Background
  2.1 Tensor and Tensor Operation
    2.1.1 Tensor
    2.1.2 Tensor-times-Matrix
  2.2 An Overview of Tensor Decomposition
    2.2.1 Higher-Order Singular Value Decomposition
    2.2.2 Higher-Order Orthogonal Iteration
  2.3 Related Work
3 The Proposed Tensor Decomposition Operation Flow and Architecture
  3.1 Analysis of Architecture for Related Work
  3.2 The Proposed Architecture
    3.2.1 Overall of Architecture
    3.2.2 Multiply Array Unit
    3.2.3 QR decomposition
    3.2.4 CORDIC Array Unit
  3.3 The Proposed Operation Flow
4 The Proposed Algorithm and Operation Flow
  4.1 The Proposed Algorithm
  4.2 Simulation Results and Analysis
  4.3 The Proposed Architecture and Operation Flow
    4.3.1 The Proposed Architecture
    4.3.2 The Operation Flow of the Proposed Architecture
5 Experimental Results and Comparisons
  5.1 Implementation Results
  5.2 Comparisons with Related Literature
6 Conclusions
References


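Full-text release date: 2029/01/23 (campus network)
Full-text release date: 2029/01/23 (off-campus network)
Full-text release date: 2029/01/23 (National Central Library: Taiwan thesis and dissertation system)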