
The basic configuration of the cluster is detailed below. This information was requested for inclusion in proposals and the like. I do not regularly update this information, so email me if you need this page to be updated.

Meij, Henk 2007/12/03 13:39

Citations

Follow this link for Publications & References.

Hardware

A schematic overview of the cluster's network design is presented at this Link. The details are as follows:

  • 1 PowerEdge 2950 server: The head node, the access point for our users. It also runs the LSF/HPC scheduler, which allocates and executes jobs on the compute nodes on behalf of the users. It is only accessible from the internet via VPN.
  • 1 PowerEdge 2950 server: The ionode, connected to our NetApp file storage device via dual Fibre Channel connections. The ionode provides all compute nodes and the head node with 4 TB of file system storage via NFS. In addition, it provides them with a shared 1 TB scratch file system, also via NFS.
  • 36 PowerEdge 1950 servers: The compute nodes. Each compute node contains two physical processors (at 2.3 GHz), and each processor contains 4 cores. The compute nodes have different memory footprints, listed below. All compute nodes contain two network interface cards and two local disks: the first disk holds the operating system, and the second provides a limited amount of local scratch space.
  • 1 low-end gigabit ethernet switch: Used by the scheduler software to obtain information from each node and to dispatch jobs. It is located on a private network (192.168.x.x). All compute nodes, the ionode and the head node are connected to this switch.
  • 1 high-end gigabit ethernet switch: Serves exclusively for NFS traffic. It is also located on a private network (10.3.1.x). All compute nodes and the head node are connected to it via their second network interface card.
  • 1 Infiniband high-speed switch: A high-speed switch specifically designed for parallel jobs. 16 of the compute nodes are connected to this switch. Its performance is roughly 4-6x that of the gigabit ethernet switches.
  • 2 disk storage arrays: These arrays (MD1000 arrays) contain fast disks (15,000 RPM) that give 4 compute nodes dedicated, local, fast scratch file systems, particularly for users of the software Gaussian.
  • 2 UPS devices: Provide the head node and ionode with power in case of a supply interruption, allowing a clean shutdown.
  • Cables, cables, cables, cables, cables, cables … blue, black, red, orange, …

Statistics

  • Early bird access period started: April 15th, 2007.
  • Production status started: October 15th, 2007.
  • 16 compute nodes, each with 8 GB of memory, are connected to the Infiniband switch. They comprise the 'imw' queue (an example job submission follows this list).
  • 08 compute nodes, each with 4 GB of memory. They comprise the 'elw' queue.
  • 04 compute nodes, each with 8 GB of memory. They comprise the 'emw' queue.
  • 04 compute nodes, each with 16 GB of memory. They comprise the 'ehw' queue.
  • 04 compute nodes, each with 16 GB of memory and local fast disks. They comprise the 'ehwfd' queue.
  • 320 GB of total memory across all compute nodes, up from 192 GB when the cluster was delivered.
  • 4 TB of file system storage for home directories.
  • 1 TB of file system storage for shared scratch access by the compute nodes.
  • 42 user accounts, of which 17 belong to faculty (excluding members of dept ITS).
  • 5 external (guest) user accounts.
  • Job throughput rates observed: 0.8 to 65 jobs/hour.
  • Job run times observed: from less than an hour to over 3 months.
  • Total number of jobs processed since power-up: 120,000+.
  • Typical job processing load: about 80-150 jobs across all queues; during summer we run at capacity.
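
To show how work reaches the queues listed above, here is a minimal sketch of a job submission through the LSF/HPC scheduler running on the head node. It assumes only that LSF's standard bsub command is on the PATH; the queue name and core count come from the list above, while the job name, output file and program being run are hypothetical placeholders, not names used on this cluster.

    import subprocess

    # Minimal sketch: submit a parallel job to the 'imw' (Infiniband) queue
    # through LSF's bsub command. Assumes bsub is available on the head node's
    # PATH; "example_job", "example.out" and "./my_program" are hypothetical
    # placeholders.
    def submit_job(queue="imw", cores=8, command="./my_program"):
        bsub_cmd = [
            "bsub",
            "-q", queue,          # target queue: imw, elw, emw, ehw or ehwfd
            "-n", str(cores),     # number of cores requested
            "-J", "example_job",  # job name (placeholder)
            "-o", "example.out",  # file receiving the job's output (placeholder)
            command,
        ]
        # bsub reports the assigned job ID on standard output,
        # e.g. "Job <1234> is submitted to queue <imw>."
        result = subprocess.run(bsub_cmd, capture_output=True, text=True, check=True)
        return result.stdout.strip()

    if __name__ == "__main__":
        print(submit_job())

In practice the same submission is simply typed at the shell prompt on the head node (bsub -q imw -n 8 ./my_program); the Python wrapper is used here only to keep the example self-contained.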

Cluster Usage Graphs

