Graduate student: Bai-Kui Yan (顏百葵)
Thesis title: Area-Efficient Compressor/Decompressor for Floating-Point Feature Maps in Convolutional Neural Network Accelerators
Original Chinese title: 應用於卷積神經網路加速器之高面積效率浮點數特徵圖壓縮器
Advisor: Shanq-Jang Ruan (阮聖彰)
Oral defense committee: Chung-An Shen (沈中安), Pei-Jun Lee, Ming-Bo Lin
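Degree: Master's
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: English
Number of pages: 73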
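Keywords (translated from Chinese): data compression, convolutional neural network, convolutional neural network accelerator, floating-point number, area efficiency
Keywords (English): Compression, Convolutional neural networks (CNNs), CNN accelerators, Floating-point (FP), Area-efficient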
Convolutional neural networks (CNNs) have been deployed in many artificial intelligence applications, such as object detection, image classification, and natural language processing. Because CNNs need massive computing resources, many computing architectures have been proposed to improve computational throughput and energy efficiency. However, these architectures require heavy data movement between the chip and off-chip memory, which leads to high energy consumption in the off-chip memory; feature map (fmap) compression has therefore been studied as a way to reduce this data movement. As a result, the design of fmap compression has become one of the main research topics for improving the off-chip memory energy efficiency of CNN accelerators. In this work, we propose a floating-point (FP) fmap compressor for CNN hardware accelerators. In addition to compressing zeros, we also compress the nonzero values in the fmap based on the FP format. Compared with the state of the art on the ILSVRC 2012 dataset, the compression algorithm achieves a lower area overhead and a similar compression ratio.
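
The zero-compression idea mentioned in the abstract can be illustrated with a generic zero run-length encoding (Zero-RLE). The sketch below is not taken from the thesis: the function names, the (zeros_before, value) pair layout, and the run_bits parameter (standing in for the storage length of the consecutive-zero count) are assumptions made for illustration only, and the thesis's FP-format-based compression of nonzero values is not modeled.

```python
from typing import List, Tuple


def zero_rle_encode(values: List[float], run_bits: int = 4) -> List[Tuple[int, float]]:
    """Encode values as (zeros_before, value) pairs.

    Hypothetical layout: the zero count is stored in run_bits bits, so a
    single pair can record at most 2**run_bits - 1 preceding zeros.
    """
    max_run = (1 << run_bits) - 1
    pairs: List[Tuple[int, float]] = []
    run = 0
    for v in values:
        if v == 0.0 and run < max_run:
            run += 1  # extend the current run of zeros
        else:
            # Either a nonzero value, or a zero that overflows the counter;
            # in both cases emit the pending run together with this value.
            pairs.append((run, v))
            run = 0
    if run:
        # Trailing zeros: the last zero is stored as the value itself.
        pairs.append((run - 1, 0.0))
    return pairs


def zero_rle_decode(pairs: List[Tuple[int, float]]) -> List[float]:
    """Invert zero_rle_encode by expanding each run of zeros."""
    out: List[float] = []
    for run, v in pairs:
        out.extend([0.0] * run)
        out.append(v)
    return out


if __name__ == "__main__":
    fmap = [0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0]
    encoded = zero_rle_encode(fmap, run_bits=2)
    assert zero_rle_decode(encoded) == fmap
```

A wider run_bits field lets one pair cover longer runs of zeros but costs more bits per pair, which is the kind of trade-off a parameter study of the zero-count storage length would examine.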

RECOMMENDATION FORM
COMMITTEE FORM
ABSTRACT (CHINESE)
ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 Introduction
  1.1 Background of CNN Accelerator
  1.2 Challenges of Previous Works
  1.3 Contribution of This Thesis
  1.4 Organization
CHAPTER 2 Background
  2.1 The CNN Algorithm
  2.2 Compression Algorithms
    2.2.1 The Zero-RLE
    2.2.2 The Delta Encoding
CHAPTER 3 Related Work
  3.1 Quantization
  3.2 Fmap Compression
CHAPTER 4 Proposed Method
  4.1 Compression Algorithm
  4.2 Hardware Architecture
    4.2.1 Compressor
    4.2.2 Decompressor
CHAPTER 5 Experimental Results
  5.1 Environment/Dataset Setup
  5.2 Selection of Parameters
    5.2.1 The Consecutive Zeros Storage Length of Zero-RLE
    5.2.2 The Difference Storage Length of Delta Encoder
  5.3 Comparison with Previous Works
CHAPTER 6 Conclusions
REFERENCES
APPENDIX 1
APPENDIX 2
APPENDIX 3
APPENDIX 4
APPENDIX 5
APPENDIX 6

