
Graduate student: 陳昱仁
Yu-Ren Chen
Thesis title: 具動態指令排程之RISC-V 128位元極長指令核心設計
Design of a RISC-V-based 128-bit VLIW Core with Dynamic Instruction Scheduling
Advisor: 林昌鴻
Chang-Hong Lin
Oral defense committee: 林淵翔
Yuan-Hsiang Lin
沈中安
Chung-An Shen
劉一宇
Yi-Yu Liu
Degree: 碩士
Master
Department: 電資學院 - 電子工程系
Department of Electronic and Computer Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: English
Number of pages: 56
Keywords (Chinese): VLIW, 動態指令排程, RISC-V
Keywords (English): VLIW, dynamic instruction scheduling, RISC-V

由於極長指令處理器經常遇到二進制程式不相容及軟體維護的問題,自編譯器支援、指令集架構至管線化硬體實作。本論文設計了一個支援RISC-V RV32I及RV32M指令集的128位元極長指令處理器。為了原生支援RISC-V編譯器工具組,我們基於經典的RISC五階段管線化設計,整合了數個運用於超純量處理器的設計技巧,例如引入額外的排程階段以分析各指令間的相依性,並組合成無危障的指令封包。而後以Verilog硬體描述語言將此概念實現並合成至Kintex-7 XC7K160TFBG484-1 FPGA上,得到IPC為0.953、頻率為70.921 MHz之極長指令處理器。共硬體資源使用率上為16877(16.64%)個查找表及5866(2.89%)個暫存器切片。與單指令執行的核心為基準相比,此設計以0.865倍的頻率換得1.395倍的IPC。


Very Long Instruction Word (VLIW) architectures suffer from binary incompatibility and software maintenance problems that span compiler support, the instruction set architecture (ISA), and physical implementation details such as pipelining. To address these problems, this thesis proposes a RISC-V-based 128-bit VLIW core that supports the RISC-V RV32I and RV32M instruction sets. To natively support the RISC-V compiler toolchain, we start from the classic five-stage RISC pipeline and integrate several techniques used in typical superscalar designs. For example, we introduce an additional scheduler stage that analyzes instruction dependencies and dispatches instructions into hazard-free instruction packets. The design is implemented in Verilog HDL and synthesized for a Kintex-7 XC7K160TFBG484-1 FPGA, yielding an IPC of 0.953 at a frequency of 70.921 MHz while using 16877 LUTs (16.64%) and 5866 slice registers (2.89%). Compared with the single-issue baseline core, the design runs at 0.865 times the frequency but achieves 1.395 times the IPC, which works out to roughly 1.21 times the instruction throughput (IPC × frequency).
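The scheduler stage described in the abstract can be illustrated with a small software model. The sketch below is not the thesis's hardware design: it assumes four 32-bit slots per 128-bit packet, greedy in-order packing, and checks only register read-after-write and write-after-write dependencies, ignoring branches, memory ordering, and functional-unit limits. The names `Instr`, `conflicts`, `schedule`, and `SLOTS_PER_PACKET` are hypothetical.

```python
from typing import List, NamedTuple, Optional

# Assumption: a 128-bit packet holds four 32-bit RV32 instructions.
SLOTS_PER_PACKET = 4


class Instr(NamedTuple):
    """Hypothetical decoded instruction, reduced to its register numbers."""
    text: str
    rd: Optional[int]   # destination register, None if nothing is written
    rs: tuple = ()      # source registers


def conflicts(packet: List[Instr], instr: Instr) -> bool:
    """True if instr has a RAW or WAW dependency on an instruction in packet."""
    for earlier in packet:
        if earlier.rd in (None, 0):   # x0 is hard-wired to zero in RISC-V
            continue
        if earlier.rd in instr.rs:    # read-after-write
            return True
        if earlier.rd == instr.rd:    # write-after-write
            return True
    return False


def schedule(program: List[Instr]) -> List[List[Instr]]:
    """Greedy in-order packing: close the packet when it is full or conflicts."""
    packets: List[List[Instr]] = []
    current: List[Instr] = []
    for instr in program:
        if len(current) == SLOTS_PER_PACKET or conflicts(current, instr):
            packets.append(current)
            current = []
        current.append(instr)
    if current:
        packets.append(current)
    return packets


program = [
    Instr("addi x1, x0, 5", rd=1, rs=(0,)),
    Instr("addi x2, x0, 7", rd=2, rs=(0,)),
    Instr("add  x3, x1, x2", rd=3, rs=(1, 2)),   # depends on x1 and x2
    Instr("sub  x4, x2, x1", rd=4, rs=(2, 1)),
    Instr("mul  x5, x3, x4", rd=5, rs=(3, 4)),   # depends on x3 and x4
]
packets = schedule(program)
for index, packet in enumerate(packets):
    print(index, [instr.text for instr in packet])
```

In this toy program the five instructions are split into three packets of two, two, and one instructions, because each dependent instruction forces a new packet. A real scheduler stage would also have to account for control flow and structural constraints, which this model leaves out.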

摘要 (Abstract in Chinese)
ABSTRACT
致謝 (Acknowledgements)
LIST OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
CHAPTER 1 INTRODUCTIONS
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Organization
CHAPTER 2 RELATED WORKS
  2.1 VLIW Architectures
  2.2 Hybrid VLIW Architectures
CHAPTER 3 RISC-V ISA SPECIFICATION
  3.1 Introduction
  3.2 Specifications
    3.2.1 RV32I [25]
    3.2.2 RV32M [25]
CHAPTER 4 PROPOSED ARCHITECTURE
  4.1 Instruction Format
  4.2 Microarchitecture
    4.2.1 Instruction Fetch Stage (IF)
    4.2.2 Schedule Stage (SCH)
    4.2.3 Instruction Decode Stage (ID)
    4.2.4 Execute Stage (EX)
      4.2.4.1 ALU
      4.2.4.2 Multiplication
      4.2.4.3 Division
    4.2.5 Memory Stage (MEM)
    4.2.6 Writeback Stage (WB)
    4.2.7 Hazard Unit
      4.2.7.1 Forwarding Unit
      4.2.7.2 Pipeline Control Unit
CHAPTER 5 SIMULATION AND SYNTHESIS RESULTS
  5.1 Experimental Environment
  5.2 Functional Verification
  5.3 IPC Performance
  5.4 Synthesis Results
  5.5 Comparisons
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS
  6.1 Conclusions
  6.2 Future Works
REFERENCES

[1] K. W. Rudd and M. J. Flynn, "Instruction-Level Parallel Processors - Dynamic and Static Scheduling Tradeoffs," in IEEE International Symposium on Parallel Algorithms Architecture Synthesis, 1997, pp. 74-81, doi: 10.1109/AISPAS.1997.581630.
[2] B. R. Rau, "Dynamically Scheduled VLIW Processors," in IEEE/ACM International Symposium on Microarchitecture, 1993, pp. 80-92.
[3] D. A. Patterson and J. L. Hennessy, Computer Organization and Design RISC-V Edition: The Hardware Software Interface. Morgan Kaufmann Publishers Inc., 2017.
[4] J. A. Fisher, "Very Long Instruction Word Architectures and the ELI-512," in ACM/IEEE International Symposium on Computer Architecture, Association for Computing Machinery, 1983, pp. 140-150, doi: 10.1145/800046.801649.
[5] V. Kathail, M. Schlansker, and B. R. Rau, "HPL-PD Architecture Specification: Version 1.1, Technical Report HPL-93-80 (R.1)," HP Laboratories, 2000.
[6] W. W. S. Chu, R. G. Dimond, S. Perrott, S. P. Seng, and W. Luk, "Customisable EPIC Processor: Architecture and Tools," in Design, Automation & Test in Europe Conference & Exhibition, 2004, vol. 3, pp. 236-241, doi: 10.1109/DATE.2004.1269236.
[7] Trimaran. "Trimaran: A Compiler and Simulator for Research on Embedded and EPIC Architectures." https://trimaran.org/docs/trimaran4_manual.pdf (accessed 2022/11/27).
[8] S. Wong, T. van As, and G. Brown, "ρ-VEX: A Reconfigurable and Extensible Softcore VLIW Processor," in International Conference on Field Programmable Technology, 2008, pp. 369-372, doi: 10.1109/FPT.2008.4762420.
[9] J. A. Fisher, P. Faraboschi, and C. Young, Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools. Elsevier, 2005.
[10] A. K. Jones, R. Hoare, D. Kusic, J. Fazekas, and J. Foster, "An FPGA-based VLIW Processor with Custom Hardware Execution," in ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Association for Computing Machinery, 2005, pp. 107-117, doi: 10.1145/1046192.1046207.
[11] Intel. "Nios® II Processor Reference Guide." https://www.intel.com/content/www/us/en/docs/programmable/683836/ (accessed 2022/11/30).
[12] T. M. Conte and S. W. Sathaye, "Dynamic Rescheduling: A Technique for Object Code Compatibility in VLIW Architectures," in IEEE/ACM International Symposium on Microarchitecture, 1995, pp. 208-218, doi: 10.1109/MICRO.1995.476828.
[13] S. Okamoto and M. Sowa, "Hybrid Processor Based on VLIW and PN-Superscalar," in International Conference on Parallel and Distributed Processing Techniques and Applications, 1996, pp. 623-632.
[14] A. F. de Souza and P. Rounce, "Dynamically Scheduling VLIW Instructions," Journal of Parallel and Distributed Computing, vol. 60, no. 12, pp. 1480-1511, 2000, doi: 10.1006/jpdc.2000.1661.
[15] S. Jee and K. Palaniappan, "Dynamically Scheduling VLIW Instructions with Dependency Information," in Workshop on Interaction between Compilers and Computer Architectures, 2002, pp. 15-23, doi: 10.1109/INTERA.2002.995839.
[16] S. L. Chu, G. S. Li, and R. Q. Liu, "DynaPack: A Dynamic Scheduling Hardware Mechanism for a VLIW Processor," Applied Mathematics & Information Sciences, vol. 6, pp. 983-991, 2012.
[17] MIPS. "MIPS® Architecture for Programmers Volume II-A: The MIPS32® Instruction Set Manual." https://www.mips.com/products/architectures/mips32-2/ (accessed 2022/12/4).
[18] S. Rokicki, E. Rohou, and S. Derrien, "Hybrid-DBT: Hardware/Software Dynamic Binary Translation Targeting VLIW," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 10, pp. 1872-1885, 2019, doi: 10.1109/TCAD.2018.2864288.
[19] N. M. Qui, C. H. Lin, and P. Chen, "Design and Implementation of a 256-Bit RISC-V-Based Dynamically Scheduled Very Long Instruction Word on FPGA," IEEE Access, vol. 8, pp. 172996-173007, 2020, doi: 10.1109/ACCESS.2020.3024851.
[20] D. A. Patterson and A. Waterman, The RISC-V Reader: An Open Architecture Atlas. Strawberry Canyon, 2017.
[21] L. Calicchia, V. Ciotoli, G. C. Cardarilli, L. di Nunzio, R. Fazzolari, A. Nannarelli, and M. Re, "Digital Signal Processing Accelerator for RISC-V," in IEEE International Conference on Electronics, Circuits, and Systems, 2019, pp. 703-706, doi: 10.1109/ICECS46596.2019.8964670.
[22] A. Garofalo, M. Rusci, F. Conti, D. Rossi, and L. Benini, "PULP-NN: A Computing Library for Quantized Neural Network inference at the edge on RISC-V Based Parallel Ultra Low Power Clusters," in IEEE International Conference on Electronics, Circuits and Systems, 2019, pp. 33-36, doi: 10.1109/ICECS46596.2019.8965067.
[23] S. Di Girolamo, A. Kurth, A. Calotoiu, T. Benz, T. Schneider, J. Beránek, L. Benini, and T. Hoefler, "A RISC-V in-network accelerator for flexible high-performance low-power packet processing," in ACM/IEEE International Symposium on Computer Architecture, 2021, pp. 958-971, doi: 10.1109/ISCA52012.2021.00079.
[24] D. A. Santos, L. M. Luza, C. A. Zeferino, L. Dilillo, and D. R. Melo, "A Low-Cost Fault-Tolerant RISC-V Processor for Space Systems," in International Conference on Design and Technology of Integrated Systems in Nanoscale Era, 2020, pp. 1-5, doi: 10.1109/DTIS48698.2020.9081185.
[25] The RISC-V International. "The RISC-V Instruction Set Manual Volume I: Unprivileged ISA." https://riscv.org/technical/specifications/ (accessed 2022/12/4).
[26] C. R. Baugh and B. A. Wooley, "A Two's Complement Parallel Array Multiplication Algorithm," IEEE Transactions on Computers, vol. C-22, no. 12, pp. 1045-1047, 1973, doi: 10.1109/T-C.1973.223648.
[27] Xilinx. "Xilinx Vivado." https://www.xilinx.com/products/design-tools/vivado.html (accessed 2022/12/1).
[28] RISC-V Software. "riscv-tests." https://github.com/riscv/riscv-tests (accessed 2022/10/28).
[29] R. P. Weicker, "Dhrystone: A Synthetic Systems Programming Benchmark," Communications of the ACM, vol. 27, no. 10, pp. 1013-1030, 1984, doi: 10.1145/358274.358283.
[30] Western Digital. "SweRV-ISS." https://github.com/chipsalliance/SweRV-ISS (accessed 2022/11/21).
