Back

[CLACReps] High Performance Cluster @ Wesleyan

General answers to questions posed by the CLACReps.

This wiki contains much more detailed information scattered about, and I'll point to some relevant pages. Click on the Back link above to go to the main page. Our cluster resides on our internal VLAN, so non-Wesleyan users can only access it via Active Directory guest accounts and VPN.

You can view our cluster activities …

HPC specs?

What type of cluster is it? Our cluster is a Dell cluster that arrived completely racked. Dell engineers performed the final configuration on-site, installing Platform/OCS. This is a ROCKS-based cluster.

The cluster comprises 36 compute nodes. Each node is basically a Dell PowerEdge 1950 with dual quad-core Xeon processors (Xeon 5355, 2x4 MB cache, 2.66 GHz, 1333 MHz FSB). 32 of the servers have 4 GB of 667 MHz memory (4x1 GB dual-ranked DIMMs) and 4 servers have 16 GB of 667 MHz memory (8x2 GB dual-ranked DIMMs). That makes for a total of 36*8 = 288 cores.

There is one head node, a PowerEdge 2950 with 2 GB of memory, which also runs the scheduler Platform/Lava (to be upgraded soon to Platform/LSF 6.2). In addition we have one IO node, identical to the head node, which is connected via dual 4 Gbps fiber cards (in failover mode) to our NetApp storage device. 5 TB of file system is made available; see below.

Compute nodes run Red Hat Enterprise Linux WS 4 while the head node runs Red Hat Enterprise Linux AS 4. Both are x86_64 installations running a 2.6.9 kernel.

Queue Policies?

We currently operate under a policy of “no limitations”, which we can do since we are not yet experiencing saturation of resources. Jobs seldom go into a pending state for lack of resources. However, it has been our experience that 4 GB of memory per node (for a total of 8 cores) is not enough. Since 8 jobs may be scheduled on these nodes, memory is in high demand.
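
For example, a user can ask the scheduler to reserve memory for a job so that the 4 GB nodes are not overcommitted. This is a minimal sketch using the standard LSF/Lava resource requirement syntax; the 512 MB figure and the script name my_job.sh are illustrative only.

# reserve roughly 512 MB for the job slot (4 GB / 8 cores) on a light weight node
bsub -q 16-lwnodes -o output.%J -R "rusage[mem=512]" ./my_job.sh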

Some queues reflect the internal network of our cluster. One gigE Dell switch carries the administrative software subnet (192.168.1.xxx) over each node's first NIC. A higher grade gigE switch (Cisco 7000) provides gigE connectivity amongst all the nodes on what we call our “NFS” subnet (10.3.1.xxx); each node's second NIC is dedicated to NFS traffic to the IO node. A third, Infiniband switch connects 16 of the nodes together. Hence we have the light weight node queue “16-lwnodes” (gigE-enabled nodes) and the Infiniband light weight node queue “16-ilwnodes” (gigE- and Infiniband-enabled nodes).
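
A parallel job that benefits from the low-latency interconnect would be directed at the Infiniband-enabled nodes. The sketch below only shows queue selection and slot count; the actual MPI launch depends on the MPI installation, so run_mpi.sh is a hypothetical wrapper script.

# request 16 job slots on the Infiniband-enabled nodes
bsub -q 16-ilwnodes -n 16 -o output.%J ./run_mpi.sh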

Another queue, “04-hwnodes”, comprises the 4 servers with the large memory footprint (16 GB each). Each of these nodes is also connected to two Dell MD1000 storage arrays, with dedicated access to seven 15,000 RPM disks (striped, mirrored, raid 0) for fast scratch space.

Besides those queues we have a Matlab queue, which limits the number of jobs to the licensed number of workers; a sketch of the corresponding queue configuration follows the bqueues output below. This Matlab installation uses the Distributed Computing Engine.

Four of the compute nodes allow our users ssh access. These nodes also comprise the “debug” queues. One queue, the “idle” queue, allocates jobs to any host not considered busy, regardless of the resources available.

[root@swallowtail web]# bqueues
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP 
debug            70  Open:Active       -    -    -    -     0     0     0     0
idebug           70  Open:Active       -    -    -    -     0     0     0     0
16-lwnodes       50  Open:Active       -    -    -    -    40     0    40     0
16-ilwnodes      50  Open:Active       -    -    -    -    13     0    13     0
04-hwnodes       50  Open:Active       -    -    -    -     0     0     0     0
matlab           50  Open:Active       8    8    -    8     1     0     1     0
molscat          50  Open:Active       -    -    -    2     0     0     0     0
gaussian         50  Open:Active       -    -    -    8     4     0     4     0
nat-test         50  Open:Active       -    -    -    -     0     0     0     0
idle             10  Open:Active       -    -    -    -    60     0    60     0
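
The MAX, JL/U and JL/H columns for the matlab queue come from per-queue limits in the scheduler configuration. The following is a rough sketch of what such a stanza looks like in lsb.queues, assuming Lava uses the same stanza format as LSF; the parameter values simply mirror the limits shown above and the description text is illustrative.

Begin Queue
QUEUE_NAME   = matlab
PRIORITY     = 50
# total (MAX), per-user (JL/U) and per-host (JL/H) job slot limits, as shown by bqueues
QJOB_LIMIT   = 8
UJOB_LIMIT   = 8
HJOB_LIMIT   = 8
DESCRIPTION  = Matlab DCE jobs, capped at the licensed number of workers
End Queue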

Software?

The list of software installed can be found here: User Guide and Manuals

This access point is considered “required reading” for new users. We have roughly 50 accounts currently but seldom see more than a dozen active login sessions.

File systems, Quotas?

We currently do not enforce any disk quotas.

Home directories are spread over a dozen or so LUNs, each 1 TB in size (“thin provisioned”). The total disk space available for home directories is 4 TB, so the LUNs are overcommitted. This implies that a single user could use up to 1 TB as long as physical space is still available.

/sanscratch is a 1 TB LUN that is shared by all the nodes for large scratch space. There is also a local /localscratch file system on each node, provided by a single 80 GB hard disk, except on the heavy weight nodes where it offers roughly 230 GB of fast disk space.

When a job is submitted for execution, the scheduler provides unique working directories in both /sanscratch and /localscratch, and cleans them up afterwards. Users are encouraged to use these areas and not perform extensive reads and writes in their home directories.
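
A typical job script therefore stages its data into the per-job scratch directory and copies the results back at the end. This is only a sketch: it assumes the per-job directories are named after the LSF/Lava job ID ($LSB_JOBID), and the program and file names are placeholders.

#!/bin/bash
#BSUB -q 16-lwnodes
#BSUB -o output.%J

# hypothetical per-job scratch directory created by the scheduler
SCRATCH=/sanscratch/$LSB_JOBID

# stage input, run in scratch, copy results back to the home directory
cp ~/project/input.dat $SCRATCH
cd $SCRATCH
./my_program input.dat > results.out
cp results.out ~/project/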

Home directories are backed up twice daily using the NetApp snapshot capabilities. Tivoli runs incremental backups each night to our tape storage device.
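
If the snapshots are exposed in the usual NetApp fashion, users can recover recent versions of their own files from the hidden .snapshot directory in their home directory; the snapshot and file names below are purely illustrative.

# list the available snapshots of the home directory
ls ~/.snapshot
# restore a file from one of them
cp ~/.snapshot/nightly.0/thesis.tex ~/thesis.tex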

...?


Back