
Graduate Student: Chi-wen Chang (張齊文)
Thesis Title: Performance Evaluation of the EXTOLL Interface Using High Performance Linpack
Advisor: Chang-Hong Lin (林昌鴻)
Committee Members: Wei-Mei Chen (陳維美), Chin-Hsien Wu (吳晉賢), Mon-Chau Shie (許孟超)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2013
Graduation Academic Year: 101
Language: English
Number of Pages: 63
Chinese Keywords: high-performance computing cluster (高效能運算叢集)
Foreign-Language Keywords: HyperTransport, EXTOLL, HPL, High Performance Linpack

In recent years, high-performance computing has become an indispensable resource for engineering and scientific computation. In this field, clusters are the solution most commonly used to build high-performance supercomputers. In the Top 500 [1] ranking, a growing number of supercomputers are clusters assembled from individual multi-core processor systems joined by an interconnection network. When computing power is the main concern, more cluster nodes should, in theory, deliver better performance. As the number of nodes grows, however, the communication loss grows as well, so adding nodes does not bring a proportional gain in performance.
This thesis focuses on a new architecture in which a high-speed network card called EXTOLL is attached through the HyperTransport bus, and uses High Performance Linpack (HPL) [2] to evaluate the performance of the whole cluster. In a traditional computer architecture, data travels along the path Processor – Northbridge – Southbridge – Network Interface Card before being sent over the transmission medium to a remote node. Connecting the high-speed network card directly to the processor through the HyperTransport bus can greatly reduce transmission latency and thus achieve high-speed transfers. The device is still in development and its price/performance ratio has not yet reached a level the market can accept, but by evaluating its performance in this study we assess its potential for future development.
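To make the scaling argument above concrete, here is a minimal, purely illustrative model (the symbols W and c(p), and the assumption that the computational work parallelizes perfectly with c(1) = 0, are introduced here for exposition and are not quantities taken from the thesis). Let W be the execution time on one node and c(p) the communication overhead incurred when p nodes cooperate:

\[
T(p) = \frac{W}{p} + c(p), \qquad
S(p) = \frac{T(1)}{T(p)} = \frac{p}{1 + p\,c(p)/W}, \qquad
E(p) = \frac{S(p)}{p} = \frac{1}{1 + p\,c(p)/W}.
\]

Because c(p) grows with the node count (more messages and more contention on the interconnect), the efficiency E(p) falls below 1, which is exactly why doubling the number of nodes does not double the measured performance.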


In recent years, high-performance computing has become an indispensable resource for engineering and scientific applications. Clusters are the solution most commonly used to construct high-performance supercomputers. In the Top 500 [1] ranking, more and more supercomputers are built as clusters to perform high-performance computing. These clusters consist of individual multi-core processor systems connected to one another by interconnection networks. To obtain more computing power, the number of nodes in a cluster can be increased; theoretically, the more nodes a cluster has, the more computing power it offers. As the number of nodes increases, however, the communication loss increases as well, so adding more nodes does not yield a proportional growth in performance.
In this thesis, we propose a new architecture in which a high-speed network device called EXTOLL connects directly to the processor through the HyperTransport bus. The High Performance Linpack (HPL) [2] benchmark is used to evaluate the overall cluster performance. In a traditional computer architecture, the data propagation path is Processor – Northbridge – Southbridge – Network Interface Card, after which data is transmitted to remote nodes over the network medium. Connecting the network card directly to the processor through the HyperTransport bus greatly reduces transmission latency and thus enables high-speed transmission. The EXTOLL card, however, is still in the development stage, and its price/performance ratio is not yet acceptable to the market; this study evaluates its performance to show the development potential of the EXTOLL card in the future.
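As a small illustration of how an HPL result such as the ones reported in this thesis is usually expressed, the C sketch below converts a run's problem size N and wall-clock time into GFLOP/s using the standard operation count 2/3*N^3 + 2*N^2 that HPL assumes for LU factorization with partial pivoting; the problem size, run time, and peak value in the code are placeholder assumptions, not measurements from this work.

/* Sketch with placeholder numbers: converting an HPL run into GFLOP/s.
 * HPL credits a run with 2/3*N^3 + 2*N^2 floating-point operations
 * (LU factorization plus the triangular solve it times). */
#include <stdio.h>

int main(void)
{
    double n      = 40000.0;   /* assumed HPL problem size N                    */
    double t_wall = 900.0;     /* assumed wall-clock time of the run, seconds   */
    double rpeak  = 150.0;     /* assumed theoretical peak of the cluster, GF/s */

    double flops = (2.0 / 3.0) * n * n * n + 2.0 * n * n;
    double rmax  = flops / t_wall / 1e9;   /* achieved performance, GFLOP/s */

    printf("Rmax = %.2f GFLOP/s (%.1f%% of assumed Rpeak)\n",
           rmax, 100.0 * rmax / rpeak);
    return 0;
}

The ratio Rmax/Rpeak is the efficiency figure usually quoted alongside Top 500 entries, with Rpeak derived from the core count, clock frequency, and floating-point operations per cycle of the machines under test.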

Chinese Abstract
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Outline of the Thesis
2 Related Work
2.1 Cluster Performance Analysis
2.2 HyperTransport
2.2.1 HyperTransport Packet Format
2.2.2 HyperTransport Device Configurations
2.2.3 Direct Peripheral-to-CPU Interconnect: HTX
2.3 EXTOLL
2.4 High Performance Linpack - HPL
2.5 Message Passing Interface - MPI
3 Cluster Setup
3.1 Hardware Settings
3.1.1 System Platform Settings
3.1.2 EXTOLL Card Settings
3.2 Software Settings
3.2.1 OS Installation
3.2.2 Network Settings
3.2.3 NFS Settings
3.2.4 SSH Settings
3.2.5 EXTOLL Software Settings
3.3 Benchmark Settings
4 Research Methods
4.1 HPL Parameters Tuning
4.1.1 Block Size: NB
4.1.2 Process Grids: PxQ
4.1.3 Matrix Size: N
4.2 Results
5 Conclusions and Future Work
References
Appendix A - EXTOLL Script File
Appendix B - HPL Configuration File

[1] TOP500 – The TOP500 Supercomputer Sites. http://www.top500.org
[2] A. Petitet, R. C. Whaley, J. Dongarra, and A. Cleary, “HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers”, http://www.netlib.org/benchmark/hpl/.
[3] S. Gaissaryan, A. Avetisyan, O. Samovarov, and D. Grushin, “Comparative Analysis of High-Performance Clusters' Communication Environments Using HPL Test”, in Proceedings of the Seventh International Conference on High Performance Computing and Grid in Asia Pacific Region, IEEE, 2004.
[4] EXTOLL. http://www.extoll.de/
[5] The InfiniBand Trade Association. http://www.infinibandta.org/index.php
[6] Myrinet. http://www.myricom.com/scs/myrinet/overview/
[7] D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken, “LogP: Towards a Realistic Model of Parallel Computation”, in Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 262-273, 1993.
[8] R. P. Martin, A. M. Vahdat, D. E. Culler, and T. E. Anderson, “Effects of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture”, ACM SIGARCH Computer Architecture News, vol. 25, no. 2, ACM, 1997.
[9] HyperTransport Consortium. http://www.hypertransport.org/
[10] HyperTransport Consortium, “HyperTransport I/O Technology Overview: An Optimized, Low-latency Board-level Architecture”, HyperTransport Consortium White Paper (June 2004).
[11] H. Litz, H. Froning, M. Nuessle, and U. Bruning, “A HyperTransport Network Interface Controller for Ultra-low Latency Message Transfers”, HyperTransport Consortium White Paper (2008).
[12] EXTOLL Technology Overview. http://www.extoll.de/images/pdf/extoll_technology_overview.pdf
[13] Chen Shao-Hu, Zhang Yun-Quan, Zhang Xian-Yi, and Cheng Hao, “Performance Testing and Analysis of BLAS Libraries on Multi-Core CPUs”, Journal of Software, vol. 21, pp. 214-223, 2010.
[14] W. Gropp, E. Lusk, and T. Sterling, Beowulf Cluster Computing with Linux. MIT Press, 2003.
[15] W. Gropp, E. Lusk, and A. Skjellum, Using MPI: Portable Parallel Programming with the Message Passing Interface, Vol. 1, MIT Press, 1999.
[16] M. Sindi, “Evaluating MPI Implementations Using HPL on an Infiniband Nehalem Linux Cluster”, in Proceedings of the Seventh International Conference on Information Technology: New Generations (ITNG), IEEE, 2010.
[17] AIC Inc. Octans Specification. http://www.aicipc.com.tw/ProductDetail.aspx?ref=Octans
[18] Open MPI. http://www.open-mpi.org/
[19] Zhang Wenli, Fan Jianping, and Chen Mingyu, “Efficient determination of block size NB for parallel Linpack test”, Parallel and Distributed Computing and Systems. ACTA Press, 2004.
[20] D. Dunlop, S. Varrette, and P. Bouvry, “On the use of a genetic algorithm in high performance computer benchmark tuning”, in Proceedings of the International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS 2008), IEEE, 2008.
[21] Tau Leng, R. Ali, Jenwei Hsieh, V. Mashayekhi, and R. Rooholamini, “Performance impact of process mapping on small-scale SMP clusters - a case study using High Performance Linpack”, 2002.
[22] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 4th ed., Morgan Kaufmann, 2009.
[23] J. J. Dongarra, P. Luszczek, and A. Petitet, “The LINPACK Benchmark: Past, Present and Future”, Concurrency and Computation: Practice and Experience, vol. 15, no. 9, pp. 803-820, 2003.
[24] M. Sindi, “HowTo - High Performance Linpack (HPL)”, 2009.

Full-text release date: 2018/07/16 (campus network)
Full-text release date: not authorized for public release (off-campus network)
Full-text release date: not authorized for public release (National Central Library: Taiwan theses and dissertations system)