Tuesday, December 6, 2011
Program: University of Delaware's New High Performance Computing Cluster
Speaker: Dr. Daniel J. Grim, Chief Technology Officer,
Information Technologies, University of Delaware
The Information Technologies (IT) group at the University of Delaware recently
undertook to create a "Community Cluster" modelled after the
experience at the Rosen Center for Advanced Computing at Purdue University.
While inviting researchers to participate, the group also worked on
developing a configuration for the cluster. Although the initial design
specified 10 Gigabit Ethernet for the cluster interconnect (presuming that
to be the less costly technology), it was ultimately determined that QDR
InfiniBand could be used at a comparable or lower cost. The
cluster is one of the first to take advantage of the newest AMD processor,
code-named Interlagos, which is built on AMD's new "Bulldozer"
architecture. In addition, the group purchased a Lustre file system with
approximately 200 terabytes of storage. The cost of this initial cluster was
subsidized by IT, with "investors" paying just $3,000 per node for a
system with a total value of $1.2 million. The final configuration
included 200 nodes and over 5,000 processor cores.
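The funding figures above can be checked with a little arithmetic. The sketch below is only illustrative: it assumes that all 200 nodes were bought at the $3,000 investor price and that the $1.2 million figure covers the whole system, neither of which the abstract states outright. The variable names are made up for this example.

```python
# Figures quoted in the abstract.
NODES = 200
INVESTOR_PRICE_PER_NODE = 3_000   # dollars paid by each "investor" per node
TOTAL_SYSTEM_VALUE = 1_200_000    # dollars, total value of the system

# Assumption: every node was purchased at the investor price.
investor_total = NODES * INVESTOR_PRICE_PER_NODE

# Whatever the investors did not cover is treated here as the IT subsidy.
it_subsidy = TOTAL_SYSTEM_VALUE - investor_total
investor_share = investor_total / TOTAL_SYSTEM_VALUE

print(f"Investor contribution: ${investor_total:,}")
print(f"Implied IT subsidy:    ${it_subsidy:,}")
print(f"Investor share:        {investor_share:.0%}")
```

Under those assumptions, investors would cover about half of the system's value, with IT covering the rest.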
Community Cluster Development at UD
The inspiration for the “Community” cluster idea came from the Rosen Center for Advanced Computing at Purdue University, which started with a single server cluster and has since built up to about five clusters, with more planned. The concept is to provide a growth path toward a supercomputer facility as resources become available and requirements grow. UD will have its first cluster of this type available for use in January 2012. The current effort is best summarized by a poster, which can be found at:
Dr. Grim outlined three areas of effort.
First, funding the computer equipment. Subscribers were initially sought for a 100-node cluster. The subscription effort was so successful that the final result was a 200-node cluster.
Second, identifying, purchasing, and installing the computer equipment. The plan was to leave room for several more clusters in the future. Dell, HP, and Penguin bid on the system, with Penguin winning on cost performance.
Third, meeting the power requirements, which called for a substantial upgrade to over 2 megawatts (480 V at 3,000 amps). About half of this power was needed to run present and future computer systems, and half to cool them.
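The power figure can be sanity-checked with a short calculation. The sketch below assumes a balanced three-phase service with 480 V measured line to line, which the summary does not state but which is the usual arrangement for a feed of this size. It also treats "about half" as an exact even split. Under those assumptions the apparent power is S = sqrt(3) x V x I, or roughly 2.5 MVA, consistent with "over 2 megawatts". The function and variable names are made up for this example.

```python
import math

# Figures quoted in the summary.
LINE_VOLTAGE_V = 480.0    # assumed to be the line-to-line voltage
LINE_CURRENT_A = 3000.0   # amps per phase


def three_phase_apparent_power_va(line_voltage_v: float, line_current_a: float) -> float:
    """Apparent power of a balanced three-phase feed: S = sqrt(3) * V_LL * I_L."""
    return math.sqrt(3) * line_voltage_v * line_current_a


total_va = three_phase_apparent_power_va(LINE_VOLTAGE_V, LINE_CURRENT_A)

# Assumption: "about half" is taken as an exact 50/50 split.
computing_va = total_va / 2
cooling_va = total_va - computing_va

# Ratio of total facility power to computing power, ignoring other overheads.
facility_to_computing_ratio = total_va / computing_va

print(f"Total capacity:  {total_va / 1e6:.2f} MVA")
print(f"Computing share: {computing_va / 1e6:.2f} MVA")
print(f"Cooling share:   {cooling_va / 1e6:.2f} MVA")
print(f"Facility-to-computing ratio: {facility_to_computing_ratio:.1f}")
```

A single-phase reading of the same numbers (480 V x 3,000 A) would give only 1.44 MVA, which is why the three-phase assumption is the one that matches the quoted figure.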