
Graduate student: 鄭仁豪 (Jen-Hao Cheng)
Thesis title: 自適應低延遲之獨立成分分析演算法及積體電路架構設計
The Algorithm and VLSI Architecture of an Adaptive and Low-Latency ICA Processor
Advisor: 沈中安 (Chung-An Shen)
Oral defense committee: 阮聖彰 (sjruan@mail.ntust.edu.tw)
林淵翔 (linyh@mail.ntust.edu.tw)
黃琴雅 (chinya@mail.ntust.edu.tw)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2022
Academic year of graduation: 110
Language: English
Number of pages: 57
Keywords (Chinese): 獨立成分分析, 低延遲, 低複雜度, 自適應, 梯度下降優化
Keywords (English): Independent Component Analysis, Low Latency, Low Complexity, Adaptive, Optimization for Gradient Descent

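Blind source separation is widely used in applications that need to separate mixed signals, and independent component analysis (ICA) is the most effective of these algorithms. ICA exploits the statistical independence and non-Gaussianity of the signals to separate mixed independent sources effectively. However, ICA is computationally very demanding, which leads to very high processing latency and integrated-circuit complexity. This thesis proposes an adaptive, low-latency ICA algorithm and processor architecture. Specifically, it proposes an ICA algorithm based on momentum gradient descent optimization and adaptive step-size control, which greatly reduces the total number of iterations at the cost of only a slight loss in separation performance. In addition, to compute several different distribution functions with low latency and low complexity, the thesis proposes an early distribution estimation technique and applies it in the processor architecture design. The processing flow and hardware utilization are specifically designed so that processing speed is maximized with the fewest hardware components. The proposed ICA processor is designed and implemented with both a digital integrated-circuit design flow and an FPGA platform development flow. The experimental results show that the proposed ICA processor achieves a processing speed of 3.81k matrices per second with a complexity of about 256.76k logic gates. Compared with designs in prior work, the proposed ICA processor achieves the highest processing speed and the highest resource utilization efficiency.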


This thesis presents the algorithm and VLSI architecture of an adaptive independent component analysis (ICA) processor that achieves high throughput and low complexity. A novel ICA algorithm is proposed that combines momentum gradient descent optimization with adaptive step-size control, so that the number of iterations is significantly reduced while the performance degradation remains minimal. Furthermore, to compute multiple distribution functions with low latency and low complexity, a novel early-distribution-estimation scheme is employed in the ICA processor. The processing flow and hardware utilization are specifically designed so that processing speed is maximized with a minimum number of hardware components. The proposed ICA processor is designed and implemented with an ASIC flow as well as on an FPGA. Evaluations based on post-layout estimation show that the proposed processor achieves a throughput of 3.81k matrices per second with a complexity of 256.76k logic gates. Compared with prior works, the proposed ICA processor achieves the lowest processing latency as well as the best hardware efficiency.
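To make the idea of momentum gradient descent with adaptive step-size control more concrete, the following is a minimal sketch on a toy quadratic cost. It is not the thesis's algorithm: this record does not give the actual update rules, so the sketch substitutes classical heavy-ball momentum with a simple "bold driver" step-size rule (grow the step after a successful update, shrink it and reset the momentum after a failed one). The function name momentum_descent, the toy cost and grad functions, and all parameter values are assumptions chosen for illustration only.

```python
def cost(w):
    # Toy ill-conditioned quadratic (an assumption, not an ICA cost function).
    return 0.5 * (w[0] ** 2 + 25.0 * w[1] ** 2)


def grad(w):
    # Gradient of the toy cost above.
    return [w[0], 25.0 * w[1]]


def momentum_descent(grad_fn, cost_fn, w0, step=0.1, beta=0.9,
                     grow=1.05, shrink=0.5, tol=1e-6, max_iter=10_000):
    """Heavy-ball momentum with a bold-driver step-size rule (illustrative)."""
    w = list(w0)
    v = [0.0] * len(w)              # velocity (momentum) term
    prev = cost_fn(w)
    for it in range(1, max_iter + 1):
        g = grad_fn(w)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            return w, it            # gradient is small enough: stop
        # Trial update: v <- beta * v - step * g, then w <- w + v
        v_new = [beta * vi - step * gi for vi, gi in zip(v, g)]
        w_new = [wi + vi for wi, vi in zip(w, v_new)]
        c = cost_fn(w_new)
        if c <= prev:
            # Cost did not increase: accept the update and enlarge the step.
            w, v, prev = w_new, v_new, c
            step *= grow
        else:
            # Cost increased: reject the update, shrink the step, drop momentum.
            step *= shrink
            v = [0.0] * len(w)
    return w, max_iter


w_final, iterations = momentum_descent(grad, cost, [4.0, -3.0])
```

The momentum term lets successive updates accumulate along consistent descent directions, while the step-size rule removes the need to hand-tune a fixed learning rate. Both effects aim at fewer iterations, which is the benefit the abstract attributes to the proposed algorithm.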

Contents
Abstract in Chinese
Abstract in English
Contents
List of Figures
List of Tables
List of Algorithms
1 Introduction
2 Background and Related work
  2.1 Preprocessing Stage of Independent Component Analysis
  2.2 Methodology of Iterative Estimation Stage
  2.3 Related Work
3 Proposed Algorithm
  3.1 Analysis For Related Work
  3.2 The Proposed Momentum-based CICA-EBM Algorithm
  3.3 The Proposed Distribution Estimator
4 Proposed Architecture
  4.1 Initial stage
  4.2 Transform stage
  4.3 Update stage
5 Experiment Results
  5.1 Experiment setup and Software simulation
  5.2 Implementation Results
  5.3 Comparison with Prior Designs
6 Conclusions
References

