This shows you the differences between two versions of the page.
cluster:93 [2010/12/31 23:13] hmeij |
cluster:93 [2011/01/09 20:56] hmeij |
||
---|---|---|---|
Line 4: | Line 4: | ||
====== Greentail ====== | ====== Greentail ====== | ||
- | ===== Greentail ===== | + | Time to introduce our new high performance cluster '' |
+ | |||
+ | In order to accommodate the new cluster, we have reduced the Blue Sky Studios cluster from 3 racks in production to a single rack. That rack contains nothing but 24 GB memory nodes, offering just over 1.1 TB of memory across 46 nodes. | ||
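As a quick sanity check of the memory figure above, the arithmetic can be reproduced in the shell. This is only a sketch using the two numbers quoted in the text (46 nodes, 24 GB each); the variable names are made up for illustration.

```shell
# Numbers taken from the paragraph above; variable names are illustrative.
nodes=46
gb_per_node=24

# Total memory across the remaining rack, in GB.
total_gb=$((nodes * gb_per_node))
echo "total memory: ${total_gb} GB"   # 1104 GB, i.e. just over 1.1 TB
```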
+ | |||
+ | There are no changes to the Dell cluster (petaltail/ | ||
+ | |||
+ | If we like the HP management tools, in the future we may ingest cluster petaltail/ | ||
+ | |||
+ | As always, suggestions welcome. | ||
+ | |||
+ | ===== Design ===== | ||
+ | |||
+ | The purchase of the HP hardware followed a fierce bidding round in which certain design aspects had to be met. | ||
+ | |||
+ | * We continually run out of disk space for our home directories. | ||
+ | * We wanted more nodes, in fewer queues, with a decent memory footprint. | ||
+ | * All nodes should be on an Infiniband switch. | ||
+ | * A single queue is preferred. | ||
+ | * Data (NFS) was to be served up via a secondary gigabit ethernet switch, so that it would not compete with administrative traffic. | ||
+ | * (With the HP solution we will actually route data (NFS) traffic over the infiniband switch using OFED/MPI, a practice called [[http:// | ||
+ | * Linux (Redhat or CentOS) as operating system. | ||
+ | * Flexible on scheduler (options: Lava, LSF, Sun Grid Engine) | ||
+ | * The disk array, switches and login node should be backed by some form of UPS (not the compute nodes) | ||
+ | * (We have actually moved those to our enterprise data center UPS, which is backed by the building generator) | ||
+ | |||
+ | ===== Performance ===== | ||
+ | |||
+ | During the scheduled power outage of December 28th, 2010, some benchmarks were performed on the old and new clusters. | ||
+ | |||
+ | In short using linpack (More about [[http:// | ||
+ | |||
+ | * greentail' | ||
+ | * petaltail/ | ||
+ | * petaltail/ | ||
+ | * so the total is 570 gigaflops, but you'd never want to run across both switches simultaneously | ||
+ | * sharptail' | ||
+ | * never got quite all the nodes working together, not sure why | ||
+ | |||
+ | |||
+ | ===== Home Dirs ===== | ||
+ | |||
+ | The home directory disk space (5 TB) on the clusters is served up via NFS from one of our data center NetApp storage servers (named filer3). | ||
+ | |||
+ | In order to do this, your old home directory content will be copied weekly from filer3 to greentail' | ||
+ | |||
+ | To avoid a conflict between home dirs I strongly suggest you create a directory to store the files you will be creating on greentail, for example / | ||
+ | |||
+ | At some point in the future, greentail' | ||
+ | |||
+ | Greentail's new home dirs will provide 10 TB of disk space. | ||
+ | |||
+ | Because of the size of the new home dirs, we will also not be able to provide backup via TSM (Tivoli). | ||
+ | |||
+ | |||
+ | ===== Passwords ===== | ||
+ | |||
+ | The password, shadow and group files of host petaltail were used to populate greentail' | ||
+ | |||
+ | If you change your password, do it on all four hosts (petaltail, swallowtail, | ||
+ | |||
+ | I know, a pain. | ||
+ | |||
+ | |||
+ | ===== SSH Keys ===== | ||
+ | |||
+ | Within the directory **/ | ||
+ | |||
+ | You can also log in to host greentail directly ('' | ||
+ | |||
+ | To set up your ssh keys: | ||
+ | |||
+ | * log into a host, then issue the command '' | ||
+ | * supply an empty passphrase (just hit return) | ||
+ | * then copy the contents of / | ||
+ | * you can have multiple public ssh key entries in this file | ||
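The steps above could look roughly like the following. This is a sketch under assumptions, not necessarily the exact commands meant on this page: it assumes an RSA key, the default OpenSSH file names (`id_rsa`, `id_rsa.pub`, `authorized_keys`), and a made-up `KEY_DIR` variable that defaults to a scratch directory so the example is safe to try.

```shell
# Assumptions (not from this page): RSA key type and default OpenSSH file names.
# KEY_DIR is a hypothetical variable; it defaults to a scratch directory.
# Set KEY_DIR="$HOME/.ssh" to apply this to your real account.
KEY_DIR="${KEY_DIR:-$(mktemp -d)}"
mkdir -p "$KEY_DIR"
chmod 700 "$KEY_DIR"

# Generate a key pair with an empty passphrase (-N ""), unless one already exists.
if [ ! -f "$KEY_DIR/id_rsa" ]; then
    ssh-keygen -q -t rsa -N "" -f "$KEY_DIR/id_rsa"
fi

# Append the public key to authorized_keys, skipping it if it is already listed.
# The file may hold several public key entries, one per line.
touch "$KEY_DIR/authorized_keys"
if ! grep -qxF "$(cat "$KEY_DIR/id_rsa.pub")" "$KEY_DIR/authorized_keys"; then
    cat "$KEY_DIR/id_rsa.pub" >> "$KEY_DIR/authorized_keys"
fi
chmod 600 "$KEY_DIR/authorized_keys"
```

The permission settings matter: OpenSSH may ignore an `authorized_keys` file that is writable by others.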
+ | |||
+ | Note: the software stack on host petaltail/ | ||
+ | |||
+ | To test if your keys are set up right, simply ssh around the hosts petaltail, swallowtail and greentail. | ||
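One way to "ssh around" without typing each login by hand is a small loop. The helper name `check_hosts` is made up for this sketch; the host names are the three listed above. `BatchMode=yes` tells ssh to fail rather than prompt for a password, so an "ok" suggests the key-based login worked.

```shell
# Hypothetical helper: try a non-interactive login on each host given as an argument.
check_hosts() {
    failed=0
    for host in "$@"; do
        # -n keeps ssh from reading stdin; BatchMode=yes disables password prompts,
        # so success here means key-based authentication was accepted.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

check_hosts petaltail swallowtail greentail || echo "at least one host failed"
```

A host you have never connected to before may report FAILED in batch mode until you have logged in once interactively and accepted its host key.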
+ | |||
+ | ===== Rsnapshot ===== | ||
+ | |||
+ | ===== ... ===== | ||
+ | |||
+ | |||
+ | |||