Graduate student: 吳昆輝 (Kun-Hui Wu)
Thesis title: 嵌入式裝置應用機器學習CPU 動態頻率管理
Machine Learning for CPU Dynamic Frequency Management Technique on Embedded Devices
Advisor: 楊振雄 (Chen-Hsiung Yang)
Oral examination committee: 陳金聖 (Chin-Sheng Chen)
吳常熙 (Chang-Shi Wu)
郭永麟 (Yong-Lin Kuo)
楊振雄 (Chen-Hsiung Yang)
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar; 2017-2018)
Language: English
Number of pages: 88
Keywords (Chinese): 機器學習, 線性回歸, 電源管理, 多核心系統
Keywords (English): Machine Learning, Power Management, Multi-Core Systems, Linear Regression
    There are two reasons to believe that using machine learning to control embedded devices
    efficiently will become increasingly important in the near future. First, the development of
    Intel processors over the past five years shows that the manufacturing process is approaching
    its physical limits, which means that the energy-efficiency gains previously obtained from
    process improvements have nearly reached their ceiling. Second, energy researchers have recently
    projected that the ICT industry will account for 20% of global electricity consumption by 2020.
    If a 5% power saving can be achieved across that industry, it is equivalent to saving 1% of
    global electricity consumption (20% × 5% = 1%).

    Drawing on information published by Intel and ARM, we use the processors' power-management
    features to collect data on CPU behavior under the different SPLASH2 benchmarks. From these data
    we designed a distributed architecture suited to data centers, together with a power-saving
    algorithm. The architecture is implemented in a scripting language, so it is portable across
    processors and across operating systems.

    We use linear regression as the main machine learning algorithm; it predicts the appropriate
    moments to save power. To verify the algorithm, we ran SPLASH2, the benchmark suite developed at
    Stanford University. Across the SPLASH2 scenarios, the proposed algorithm and architecture
    achieve a power saving of 5.34%.
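
    The abstract does not say which features or thresholds the thesis's regression uses, so the
    following is only a minimal sketch of the general idea: fit a line to recent per-second CPU
    usage samples by gradient descent and use the extrapolated value to decide whether the next
    interval looks like a good moment to save power. The function names, the 20% idle threshold,
    the learning rate, and the choice of time as the only input are all assumptions made for
    illustration, not details taken from the thesis.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float],
             lr: float = 0.1, epochs: int = 5000) -> Tuple[float, float]:
    """Fit y = w*x + b by batch gradient descent on the mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_next_usage(samples: Sequence[float]) -> float:
    """Extrapolate the next CPU usage value (percent) from recent samples.

    Assumption: samples are taken at equal intervals, oldest first.
    Time is rescaled to [0, 1] so the fixed learning rate stays stable
    regardless of how many samples are in the window.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = [i / (n - 1) for i in range(n)]
    w, b = fit_line(xs, samples)
    predicted = w * (n / (n - 1)) + b  # one step past the last sample
    return min(100.0, max(0.0, predicted))  # clamp to a valid percentage


def should_save_power(samples: Sequence[float],
                      idle_threshold: float = 20.0) -> bool:
    """Hypothetical policy: save power when predicted usage is below the threshold."""
    return predict_next_usage(samples) < idle_threshold
```

    With a steadily falling load such as `[50, 40, 30, 20, 10]`, `should_save_power` returns
    `True`; with a rising load such as `[10, 30, 50, 70, 90]`, it returns `False`. A real
    controller would feed the decision into the platform's frequency or idle-state interface,
    which this sketch leaves out.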

    CONTENTS
    Abstract (Chinese)
    Abstract
    Acknowledgements
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Overview
      1.2 Motivations and Objectives
      1.3 Literature Review
    Chapter 2 Preliminaries
      2.1 DVFS, DPM for control Multiprocessors
      2.2 Script language
      2.3 Reinforcement Learning
      2.4 Splash2
    Chapter 3 Linear regression
      3.1 Introduction for Linear regression
      3.2 Linear regression for Run-time Management
      3.3 Formal analysis
    Chapter 4 Distributed architecture
      4.2 Control Unit
      4.3 Algorithm Unit
    Chapter 5 Experiment
      5.1 Environment Settings
      5.2 Comparison of learning Approaches
      5.3 Results and Discussion
    Chapter 6 Conclusions and Future Works
    Reference

    List of Figures
    Figure 1-1. Intel Moore's Law.
    Figure 1-2. Network Connection Diagram.
    Figure 1-3. Components in Linux System Architecture.
    Figure 2-1. Intel x86 Power State.
    Figure 2-2. Comparison of Intel x86 C-state.
    Figure 2-3. ACPI Architecture.
    Figure 2-4. ACPI Service Provided from Experimental PC.
    Figure 2-5. ACPI Fan Service Provided from Experimental PC.
    Figure 2-6. S-state on Peripheral of Experimental PC.
    Figure 2-7. P-state of Experimental PC.
    Figure 2-8. CPU Frequency of Experimental PC.
    Figure 2-9. ACPI Code Dump.
    Figure 2-10. ACPI Code Recompiling.
    Figure 2-11. Dynamic Voltage Changing.
    Figure 2-12. Dynamic Voltage Changing Example.
    Figure 2-13. Fetching CPU Loading Example.
    Figure 2-14. Interpreter Flow.
    Figure 2-15. Operating System of Data Center.
    Figure 2-16. Comparison of Python Performance Differences.
    Figure 2-17. Reinforcement Learning Flowchart.
    Figure 2-18. Q-Learning Flowchart.
    Figure 3-1. Rental Prices Distribution in Taipei.
    Figure 3-2. Linear Regression Learning Flow.
    Figure 3-3. Linear Regression 3D Model.
    Figure 3-4. Gradient Descent of Linear Regression.
    Figure 3-5. Walking Path of Gradient Descent.
    Figure 3-6. Function after Fitted.
    Figure 3-7. Linear Regression Learning for CPU Flow.
    Figure 3-8. CPU Usage Distribution.
    Figure 3-9. 3D Model of CPU.
    Figure 3-10. CPU Usage Walking Path of Gradient Descent.
    Figure 3-11. Function after Fitted for CPU Usage.
    Figure 3-12. CPU Idle after Time Out Control.
    Figure 3-13. CPU Idle Control with Natural Logarithm.
    Figure 3-14. CPU Idle Control for Slower Program.
    Figure 3-15. Level Control of CPU Frequency.
    Figure 4-1. Network Architecture in Data Center.
    Figure 4-2. Linux SNMP Example.
    Figure 4-3. Avahi in Linux Example.
    Figure 4-4. Sensor Unit Architecture.
    Figure 4-5. Sensor Unit Flow.
    Figure 4-6. Sequence Diagram for A.Login; B.Send; C.Logout.
    Figure 4-7. Sensor Unit Scenario Sequence Diagram.
    Figure 4-8. Control Unit Architecture.
    Figure 4-9. Control Unit Flow.
    Figure 4-10. Sequence Diagram for A.Login; B.Register; C.Logout; D.Set.
    Figure 4-11. Control Unit Scenario Sequence Diagram.
    Figure 4-12. Algorithm Unit Architecture.
    Figure 4-13. Algorithm Unit Flow.
    Figure 4-15. Algorithm Unit Scenario Sequence Diagram.
    Figure 4-16. All Unit Scenario Flow Diagram.
    Figure 5-1. Smart Plug HS100.
    Figure 5-2. Cortex A7 Development Board.
    Figure 5-3. Hardware Wiring Configuration.
    Figure 5-4. Performance Penalty.
    Figure 5-5. Power Saving.

    List of Tables
    Table 1-1. Intel's CPU Development and Process in Recent Four Years.
    Table 2-1. Experimental PC Specifications.
    Table 2-2. C Language Differences.
    Table 3-1. Rental Prices in Taipei.
    Table 3-2. CPU Usage Per Second.
    Table 4-1. MIB Tree Example.
    Table 4-2. Sensor Unit Method.
    Table 4-3. Control Unit Method.
    Table 4-4. Control Unit Method.
    Table 5-1. Configuration Parameters of ARM Cortex A7.
    Table 5-2. SPLASH Benchmark of ARM Cortex A7.
    Table 5-3. Power Mode of CPU.
    Table 5-4. Comparisons of SPLASH2.

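    Full-text release date: 2023/08/01 (campus network)
    Full-text release: not authorized for public release (off-campus network)
    Full-text release: not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)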