
Graduate student: 陳聰杰 (Tsung-Chieh Chen)
Thesis title: High Energy-Efficiency ARM-Based Multi-Cores & Multi-CPUs Server Board System Design (高能源效率 ARM-Based 多核心多處理器伺服器系統設計)
Advisor: 徐勝均 (Sheng-Dong Xu)
Oral defense committee: 蘇順豐 (Shun-Feng Su), 吳晉賢 (Chin-Hsien Wu)
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control
Year of publication: 2013
Academic year of graduation: 102 (ROC calendar)
Language: Chinese
Number of pages: 146
Keywords: Cluster systems, Circuit board design, ARM-based embedded systems, Embedded system design
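  • The main objective of this research is to design and implement an ARM-based embedded server board with multiple cores and multiple CPUs (ARM-Based Multi-Cores & Multi-CPUs Server Board, M2SB), in order to raise the overall performance of ARM cluster systems and realize a server system design with high energy efficiency.
    In the embedded systems market, the low power consumption of the ARM architecture has gradually extended its range of applications to server architectures. The literature review shows that many studies have focused on performance analysis of the ARM architecture and have compared their results with performance-optimized x86 systems. However, the existing experimental data are not sufficient to show that ARM processors also offer advantages in server systems. In response to enterprise needs, this research therefore adopts a server-grade ARM quad-core processor for the related research and verification. The design provides two types of transfer interface, PCI Express and Gigabit Ethernet, which allow the ARM server system to communicate with other systems. Finally, two experiments, one with a parallel computing benchmark and one with server workloads, show that the M2SB has the advantage of low energy consumption and verify the advantages of ARM processors in server system applications.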


    The objective of this research is to design and implement an ARM-Based Multi-Cores & Multi-CPUs Server Board (M2SB). The design is intended both to improve the overall performance of ARM cluster systems and to achieve a server system implementation with high energy efficiency.
    In the embedded systems market, the low power consumption of the ARM architecture has extended ARM applications to server architectures. A literature survey indicates that many existing studies analyze ARM-based performance and compare it with optimized x86 systems. However, the earlier experimental data do not establish whether ARM-based server systems achieve better performance. Therefore, this study adopts server-grade ARM quad-core processors for enterprise-class applications. Various types of cluster systems can be built with the M2SB through either the PCI Express interface or the Gigabit Ethernet interface. Finally, experiments with parallel workloads and server workloads demonstrate that the M2SB offers low power consumption and verify that, in server systems, ARM processors can achieve better performance than x86 processors.
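    The abstract compares systems by energy efficiency but does not state the metric used. The following is a minimal sketch, assuming the common definitions of energy-to-solution (average power times runtime) and work per joule. The `Run` class, its field names, and the `more_efficient` helper are hypothetical illustrations and are not taken from the thesis; any numbers passed in are placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark run on one system (hypothetical structure)."""
    name: str
    avg_power_w: float   # average wall power during the run, in watts
    runtime_s: float     # time-to-solution, in seconds
    work_units: float    # completed work, e.g. operations or requests

    def energy_j(self) -> float:
        # Energy-to-solution: average power multiplied by runtime.
        return self.avg_power_w * self.runtime_s

    def work_per_joule(self) -> float:
        # Energy efficiency: useful work delivered per joule consumed.
        return self.work_units / self.energy_j()


def more_efficient(a: Run, b: Run) -> Run:
    # Return the run that delivers more work per joule (ties go to a).
    return a if a.work_per_joule() >= b.work_per_joule() else b
```

    Under these definitions a slower system can still be the more efficient one: what matters is the product of power and runtime for the same amount of work, not runtime alone.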

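    Acknowledgments
    Chinese Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Foreword
      1.2 Research Motivation
      1.3 Research Method
      1.4 Research Equipment
        1.4.1 Hardware Development Equipment
        1.4.2 Software Development Equipment
        1.4.3 Hardware Design Equipment
        1.4.4 System Verification Equipment
      1.5 Research Process
      1.6 Research Structure
    Chapter 2 Technology and Theory Review
      2.1 Standard Interface Technology
        2.1.1 PCI Express Interface
        2.1.2 Non-Transparent Bridge
      2.2 Integrated Circuit Hardware Technology
        2.2.1 Processor Characteristics
        2.2.2 Bridge Characteristics
        2.2.3 Hardware Technology
      2.3 PCB Hardware Specifications
        2.3.1 PCB Material
        2.3.2 PCB Dimensions
        2.3.3 PCB Layer Count
        2.3.4 PCB Drilling
        2.3.5 PCB Transmission Lines
        2.3.6 PCB Impedance Control
      2.4 Review of Related Literature
        2.4.1 ARM Server Design
        2.4.2 Performance and Energy Efficiency of ARM Systems
      2.5 Computing Paradigms and Programming Strategies
        2.5.1 Homogeneous Computing Systems
        2.5.2 Heterogeneous Computing Systems
    Chapter 3 Server System Design
      3.1 Hardware Circuit Design of the Development Platform
        3.1.1 Hardware Functions
        3.1.2 Defining the Hardware Specifications
        3.1.3 Designing the M2SB Schematic
        3.1.4 Designing the Processor Schematic
        3.1.5 Designing the PCI Express Switch Schematic
        3.1.6 Designing the Power System Schematic
      3.2 PCB Design of the Development Platform
        3.2.1 Defining the Mechanical Dimension Specifications
        3.2.2 Defining the PCB Specifications
        3.2.3 Writing the PCB Routing Guidelines
        3.2.4 Designing the PCB Placement
        3.2.5 Reviewing the PCB Layout
        3.2.6 Reviewing the PCB Gerber File
        3.2.7 Fabricating the PCBA
      3.3 Solutions to Problems Encountered During Design
        3.3.1 Solution to the RGMII Problem
        3.3.2 Solution to the Power Module Problem
        3.3.3 Solution to the DDR3 SDRAM Problem
        3.3.4 Solution to the PCI Express Link Problem
    Chapter 4 Functional Verification of the Experimental Platform
      4.1 Hardware Functional Verification
        4.1.1 Power Output Test
        4.1.2 Power Sequencing Test
        4.1.3 Flashing the U-boot Firmware
        4.1.4 Flashing the PCI Express Switch EEPROM Firmware
      4.2 Software Functional Verification
        4.2.1 Verifying the ARM CPU ID
        4.2.2 Verifying the Power Good / ON Status
        4.2.3 Verifying the PCI Express NT Sample
        4.2.4 Verifying the PCI Express NT DMA
    Chapter 5 Experimental Results and Analysis
      5.1 Data Transfer Experiments and Analysis
        5.1.1 Gigabit Ethernet Transfer Rates
        5.1.2 PCI Express DMA Transfer Rates
      5.2 Energy Efficiency Evaluation
        5.2.1 Experimental Setup
        5.2.2 Scalability of the M2SB System
        5.2.3 Parallel Workloads
        5.2.4 Server Workloads
    Chapter 6 Conclusions and Future Work
      6.1 Conclusions
      6.2 Future Work
    References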


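    Full-text release date: 2018/12/25 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)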