Computational & Technology Resources
an online resource for computational,
engineering & technology publications
Civil-Comp Proceedings
ISSN 1759-3433
CCP: 90
PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING FOR ENGINEERING
Edited by:
Paper 30

Tuning a Cluster System for High Performance Computing in Engineering

J. Magiera, G. Graniczkowski and P. Kapusta

Department of Civil Engineering, Cracow University of Technology, Cracow, Poland

Full Bibliographic Reference for this paper
J. Magiera, G. Graniczkowski, P. Kapusta, "Tuning a Cluster System for High Performance Computing in Engineering", in , (Editors), "Proceedings of the First International Conference on Parallel, Distributed and Grid Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 30, 2009. doi:10.4203/ccp.90.30
Keywords: Beowulf clusters, parallel computing, high performance computing, cluster performance tuning, TCP/IP stack tuning, Linux kernel tuning, channel bonding.

Summary
Beowulf-type clusters are very popular tools for high performance computing, but it frequently happens, especially in smaller installations, that building a system from off-the-shelf (OTS) components means that no optimisation or tuning is performed at all. The availability of ready-to-install software packages such as OSCAR, which take care of the whole cluster setup process, and of automated setup managers for Linux may lead to the false conclusion that setting up and configuring a cluster system for given hardware is an automatic process. Successfully running standard tests, such as computing the number pi with an MPI-based parallel algorithm (a standard example seemingly shipped with every MPI library) or the HPL/NPB test suites, deepens this attitude further and is treated as proof that the installation and configuration phases were performed well and that the system is ready for use. Unfortunately, bearing in mind that the underlying OTS philosophy makes every cluster system different, a positive outcome of these standard tests should be treated only as an indication that the installation phase is over, and as an invitation to the next phase: tuning and profiling the system for the tasks it is intended to serve.
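To make the scenario concrete, the lines below sketch how such a "smoke test" is typically run; the file and host names are illustrative assumptions (the cpi.c example ships with MPICH-style distributions), not commands taken from the paper.

    # Build and run the classic "compute pi" MPI example on 16 processes.
    # A correct pi result only shows that MPI works across the nodes listed
    # in ./machines -- it says nothing about how well the cluster is tuned.
    mpicc cpi.c -o cpi
    mpirun -np 16 -machinefile ./machines ./cpi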

This paper presents the results of an effort aimed at tuning a relatively small Beowulf-type cluster system used for several years as a workgroup HPC facility at the Institute for Computational Civil Engineering at Cracow University of Technology, Cracow, Poland. Several directions of the tuning process were considered: tuning and recompilation of the Linux kernel (for example, native support for Intel's Hyper-Threading (HT) technology), tuning of the TCP/IP protocol stack parameters, activating channel bonding (link aggregation) on the two Gigabit Ethernet interconnects available in the cluster nodes, taking advantage of newer TCP/IP features such as the TCP Offload Engine (TOE) and Selective Acknowledgements (SACK), and tuning the Cisco Catalyst 4503 switch. The impact of all of these modifications was tested extensively with the netperf and iperf Linux network performance testing tools, as well as with the HPL and NPB benchmark suites.
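For readers unfamiliar with these mechanisms, the sketch below shows the kind of configuration they involve on a 2.6-era Linux node; all parameter values, interface names, and host names are illustrative assumptions, not the settings reported in the paper.

    # --- TCP/IP stack tuning via sysctl (values are illustrative) ---
    sysctl -w net.ipv4.tcp_sack=1                     # enable Selective Acknowledgements
    sysctl -w net.core.rmem_max=4194304               # raise the maximum socket receive buffer
    sysctl -w net.core.wmem_max=4194304               # raise the maximum socket send buffer
    sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"  # min/default/max TCP receive buffers
    sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"  # min/default/max TCP send buffers

    # --- Channel bonding of the two Gigabit Ethernet NICs (eth0, eth1) ---
    modprobe bonding mode=balance-rr miimon=100       # round-robin bonding, 100 ms link monitoring
    ifconfig bond0 10.0.0.1 netmask 255.255.255.0 up  # bring up the aggregated interface
    ifenslave bond0 eth0 eth1                         # enslave both physical NICs to bond0

    # --- Measuring the effect with netperf (netserver runs on node2) ---
    netperf -H node2 -t TCP_STREAM -l 30              # 30-second bulk TCP throughput test

With balance-rr, frames are distributed round-robin over both links, which can nearly double point-to-point bulk throughput at the cost of possible packet reordering; which bonding mode pays off for a given workload is exactly the kind of question such network benchmarks are meant to answer.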

In general, the research performed showed that modern HPC clusters built of commodity components do require attention and a significant amount of work to understand their inner workings and to find an optimal set of settings, parameters, software configurations and technologies. It was demonstrated that, when the "standard" performance recorded with default installations (operating system, kernel, hardware settings, network configuration and parameter settings, etc.) is compared with the performance after careful tuning and optimisation, a speed-up of ten times or more may be attained. It was also found that the settings are not universal: such tuning work should be performed for almost every application type run on the cluster system, especially if the application is to be run systematically.
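A before/after comparison of this kind is typically driven by the same benchmark suites; a minimal sketch of the workflow, with process counts and file names as illustrative assumptions, might look as follows.

    # Baseline HPL run before tuning (assumes xhpl is built against the
    # cluster's MPI library and that HPL.dat in the working directory is
    # sized to the nodes' memory)
    mpirun -np 16 -machinefile ./machines ./xhpl > hpl_default.out

    # ...recompile the kernel, apply the sysctl/bonding changes, retune
    # the switch, then repeat the identical run for comparison...
    mpirun -np 16 -machinefile ./machines ./xhpl > hpl_tuned.out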
