Hardware Implementation of Dividerless Normalized LMS Adaptive Filters for System Identifications


Graduate student:	黃鈺哲 Yu-Zhe Huang
Thesis title:	Hardware Implementation of Dividerless Normalized LMS Adaptive Filters for System Identifications
Advisor:	姚嘉瑜 Chia-Yu Yao
Oral defense committee:	彭盛裕 Sheng-Yu Peng, 陳筱青 Hsiao-Chin Chen
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2023
Academic year of graduation:	111 (ROC calendar)
Language:	English
Number of pages:	109
Keywords:	Adaptive Filter, System Identification, FIR Filter, IIR Filter, NLMS Algorithm

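This thesis presents the design and implementation of two approximate NLMS adaptive filter architectures that require no scalar division. Binary simplification and Canonical Signed-Digit (CSD) simplification are used to approximate the conventional division with multiplication. In terms of hardware optimization, avoiding the feedback structure of a divider substantially increases data throughput; at the same throughput as the conventional division architecture, area and power consumption are substantially reduced instead. The approximate NLMS adaptive filter is implemented as VLSI hardware with an FIR architecture and is then applied to a hardware IIR adaptive filter.
The proposed approximate NLMS adaptive filters are implemented and verified on the DE2-115 board, based on the Altera Cyclone IV EP4CE115F29C7N, and in the TSMC 90 nm CMOS cell library. With the FIR architecture, the FPGA implementation reaches an operating frequency of 91 MHz. The binary simplification uses 11773 logic elements and 2555 registers and consumes 271.67 mW; the CSD simplification uses 11897 logic elements and 2571 registers and consumes 271.80 mW. At the same operating frequency as the conventional division architecture, throughput rises to 236%. When synthesized with the TSMC 90 nm CMOS cell library, the operating frequency reaches 270 MHz with a throughput of 24.55 Msamples/s; the gate counts are 74 KGE and 75 KGE and the power consumption is 11.99 mW and 12.30 mW, respectively, with the same 236% throughput figure.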

This thesis presents the design and implementation of two approximate NLMS adaptive filter architectures that do not require division operations. Binary simplification and canonical signed-digit (CSD) simplification are used to replace the conventional division with multiplication. In hardware, the proposed architectures achieve a higher throughput rate at the same clock frequency as the conventional division architecture, or a lower clock rate and power consumption at a similar throughput rate. The approximate NLMS algorithm is implemented in hardware with an FIR architecture and is further applied to adaptive IIR filter hardware.
The proposed approximate NLMS adaptive filters have been implemented and verified on the Altera Cyclone IV EP4CE115F29C7N-based DE2-115 FPGA board and in the TSMC 90 nm CMOS cell library. With the FIR architecture, the DE2-115 implementation operates at a clock rate of 91 MHz. The binary simplification uses 11773 logic elements and 2555 registers and consumes 271.67 mW. The CSD simplification uses 11897 logic elements and 2571 registers and consumes 271.80 mW. At the same clock rate, the throughput rate increases to 236% of that of the conventional NLMS-FIR architecture with 16-bit SRT division. Synthesized with the TSMC 90 nm CMOS cell library, the design reaches a clock rate of 270 MHz and a throughput rate of 24.55 Msamples/s. The gate counts are 74 KGE and 75 KGE, and the power consumption is 11.99 mW and 12.30 mW, for the binary and CSD simplifications respectively.
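The idea of replacing the NLMS division with a multiplication can be illustrated with a small floating-point sketch. This record does not spell out the thesis's binary or CSD simplification, so the code below is only a stand-in under stated assumptions: the reciprocal of the input power is rounded to a single power of two (a shift in hardware), and the names `pow2_reciprocal` and `nlms_identify`, the step size `mu`, the regularization `eps`, and the example system `h` are all hypothetical choices made for illustration. Fixed-point word lengths and rounding are not modeled.

```python
import math
import random


def pow2_reciprocal(p):
    """Approximate 1/p (p > 0) by one power of two.

    Illustrative assumption, not the thesis's circuit: p is rounded down
    to its leading binary digit, so the result lies in [1/p, 2/p).
    """
    # math.frexp(p) returns (m, e) with p = m * 2**e and 0.5 <= m < 1,
    # hence 2**(e-1) <= p < 2**e.
    _, e = math.frexp(p)
    return 2.0 ** (1 - e)


def nlms_identify(x, d, taps, mu=0.5, eps=1e-6, reciprocal=lambda p: 1.0 / p):
    """Identify an FIR system from input x and desired output d with NLMS.

    `reciprocal` supplies 1/(eps + ||x_n||^2); passing pow2_reciprocal
    removes the true division from the weight update.
    """
    w = [0.0] * taps      # adaptive filter weights
    buf = [0.0] * taps    # tapped delay line, buf[k] = x[n-k]
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        err = dn - y                                 # estimation error
        power = eps + sum(b * b for b in buf)        # regularized input power
        g = mu * err * reciprocal(power)             # normalized step
        w = [wi + g * bi for wi, bi in zip(w, buf)]
    return w


# Noise-free system identification of a hypothetical 4-tap FIR system.
random.seed(0)
h = [0.5, -0.3, 0.2, 0.1]
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]

w_exact = nlms_identify(x, d, len(h))
w_approx = nlms_identify(x, d, len(h), reciprocal=pow2_reciprocal)
print("exact division :", w_exact)
print("power-of-two   :", w_approx)
```

Because the power-of-two estimate overshoots the true reciprocal by less than a factor of two, the effective step size stays below `2 * mu`; with `mu = 0.5` it remains inside the usual NLMS stability range of (0, 2), which is why the sketch picks that value.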

Table of Contents
Abstract (in Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Organization of the Thesis
Chapter 2 Theory
2.1 System Identification with Normalized LMS Algorithm
2.1.1 FIR Structure
2.1.2 IIR Structure
2.2 Newton-Raphson Algorithm
2.3 Radix-2 SRT Algorithm
Chapter 3 Proposed Methods
3.1 Binary Form Simplification
3.2 CSD Form Simplification
3.3 Partial SRT Simplification
Chapter 4 Hardware Design
4.1 NLMS Adaptive FIR Filter
4.2 NLMS Adaptive IIR Filter
4.3 Wallace Adder Tree
4.4 Radix-8 Booth Multiplier
4.4.1 Radix-8 Booth Encoder
4.4.2 Partial Product Addition
4.5 Scaling Factor Generator
4.6 CSD Convert Circuit
4.7 Multiplication of a 2’s Complement Number and a CSD Number
4.8 Conversion of Division to Multiplication
4.9 SRT Divider
Chapter 5 Experiment Results
5.1 Design Flow
5.2 NLMS Adaptive FIR Filter
5.3 NLMS Adaptive IIR Filter
5.4 FPGA Verification
5.5 IP Implementation
Chapter 6 Conclusion
6.1 Conclusion
6.2 Future Works
References


                                


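Full-text release date: 2025/02/07 (campus network)
Full-text release date: 2025/02/07 (off-campus network)
Full-text release date: 2025/02/07 (National Central Library: Taiwan theses and dissertations system)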
