**[[cluster:0|Back]]**
  
This outdated page has been replaced by [[cluster:126|Brief Guide to HPCC]].
 --- //[[hmeij@wesleyan.edu|Meij, Henk]] 2014/02/21 10:23//
  
Updated
 --- //[[hmeij@wesleyan.edu|Meij, Henk]] 2013/09/10 14:42//
  
  * see the [[cluster:108|Queue Update]] page
  
==== New Configuration ====
  
The Academic High Performance Compute Cluster is comprised of two login nodes (greentail and swallowtail, both Dell PowerEdge 2050s).  The old login node petaltail (Dell PowerEdge 2950) can be used for testing code (it does not matter if it crashes; its primary duty is backup to a physical tape library).
Several types of compute node "clusters" are available via the Lava scheduler:
  
  * 32 nodes with dual quad core (Xeon 5620, 2.4 GHz) sockets in HP blades (SL2x170z G6) with memory footprints of 12 GB each, all on infiniband (QDR) interconnects.  288 job slots. Total memory footprint of these nodes is 384 GB. This cluster has been measured at 1.5 teraflops (using Linpack). Known as the HP cluster, or the n-nodes (n1-n32).
  * 30 nodes with dual quad core (Xeon 5345, 2.3 GHz) sockets in Dell PowerEdge 1950 rack servers with memory footprints ranging from 8 GB to 16 GB.  256 job slots. Total memory footprint of these nodes is 340 GB. Only 16 nodes are on infiniband (SDR) interconnects; the rest are on gigabit ethernet switches. This cluster has been measured at 665 gigaflops (using Linpack). Known as the Dell cluster, or the c-nodes (c00-c36; some have failed).
  * 25 nodes with dual single core AMD Opteron Model 250 (2.4 GHz) sockets with a memory footprint of 24 GB each.  50 job slots. Total memory footprint of the cluster is 600 GB. This cluster has an estimated capacity of 250-350 gigaflops. (It can grow to 45 nodes, 90 job slots, and 1.1 TB of memory.) Known as the Blue Sky Studios cluster, or the b-nodes (b20-b45).
  * 5 nodes with dual eight core Intel Xeon E5-2660 sockets (2.2 GHz) in ASUS/Supermicro rack servers with a memory footprint of 256 GB each (1.28 TB total). Hyperthreading is turned on, doubling the core count to 32 per node (120 job slots for regular HPC work). These nodes also contain 4 GPUs each (20 in total) with 20 reserved CPU cores (job slots). Known as the Microway, or GPU-HPC, cluster (n33-n37).
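
As a quick cross-check of the list above, the per-group job slots and memory footprints can be tallied. A minimal sketch in Python; the figures are copied from the bullets, with the Microway group counted as 120 regular plus 20 GPU-reserved slots:

<code python>
# Tally of the node groups listed above; figures are taken from this page.
groups = {
    "HP (n1-n32)":        {"slots": 288,      "mem_gb": 384},
    "Dell (c00-c36)":     {"slots": 256,      "mem_gb": 340},
    "Blue Sky (b20-b45)": {"slots": 50,       "mem_gb": 600},
    "Microway (n33-n37)": {"slots": 120 + 20, "mem_gb": 1280},  # 120 HPC + 20 GPU-reserved
}

total_slots = sum(g["slots"] for g in groups.values())
total_mem   = sum(g["mem_gb"] for g in groups.values())
print(total_slots)                    # 734 job slots, matching the total quoted below
print(round(total_mem / 1024.0, 1))   # roughly 2.5 TB of aggregate memory
</code>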
  
All queues are available for job submission via the login nodes greentail and swallowtail; both nodes service all queues. Our total job slot count is now 734, of which 400 are on infiniband switches for parallel computational jobs.
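
Jobs are submitted from the login nodes with the usual Lava/LSF command line tools (''bsub'', ''bqueues'', ''bjobs''). Below is a minimal sketch of driving a submission from Python; the queue name ''hp12'', the resource requests, and ''my_program'' are placeholders only, so check ''bqueues'' for the queues you actually have access to:

<code python>
# Minimal sketch: submit a batch job to the Lava scheduler.
# Queue name, slot count, and program are placeholders; adjust to your needs.
import subprocess

script = """#!/bin/bash
#BSUB -q hp12              # target queue (example only)
#BSUB -n 8                 # number of job slots
#BSUB -J my_parallel_job   # job name
#BSUB -o out.%J            # output file; %J expands to the job id
./my_program
"""

# bsub reads the #BSUB directives from the script supplied on stdin
result = subprocess.run(["bsub"], input=script, capture_output=True, text=True)
print(result.stdout.strip())   # e.g. "Job <1234> is submitted to queue <hp12>."
</code>

Job status can then be followed with ''bjobs'' and queue load with ''bqueues''.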
  
Home directory file systems are provided (via NFS or IPoIB) by the login node "sharptail" (to come) from a direct attached disk array (48 TB). In total, 10 TB of /home disk space is accessible to the users, plus 5 TB of scratch space at /sanscratch.  In addition, all nodes provide a small /localscratch disk area (about 50 GB) on each node's local internal disk for cases where file locking is needed. Backup services are provided via disk-to-disk snapshot copies on the same array. In addition, the entire disk array on sharptail is rsync'ed to the 48 TB disk array on greentail (yet to be deployed as of 09 Sep 2013).
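
For I/O-heavy work it is usually better to stage data into /sanscratch and copy results back to /home as the last step of the job. A minimal sketch, assuming the scheduler exports the standard LSF/Lava ''LSB_JOBID'' variable and provides a per-job directory under /sanscratch that is cleaned up when the job ends; file and program names are placeholders:

<code python>
# Minimal sketch: stage input into the per-job scratch directory, run there,
# and copy results home before the scratch directory is removed.
import os
import shutil
import subprocess

jobid   = os.environ["LSB_JOBID"]               # set by the scheduler inside a job
scratch = os.path.join("/sanscratch", jobid)
home    = os.path.expanduser("~/myproject")

shutil.copy(os.path.join(home, "input.dat"), scratch)                    # stage input
subprocess.run([os.path.join(home, "my_program"), "input.dat"],
               cwd=scratch, check=True)                                  # run in scratch
shutil.copy(os.path.join(scratch, "output.dat"), home)                   # save results
</code>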
  
The 25-node Blue Sky Studios cluster listed above also runs our Hadoop cluster.  The namenode and login node is whitetail, which also hosts the Hadoop scheduler. It is based on the Cloudera CDH3u6 repository.
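
Hadoop work goes through whitetail rather than the Lava login nodes. A minimal sketch of interacting with HDFS from Python; it assumes the CDH3 ''hadoop'' client is on the PATH on whitetail and that your HDFS home directory /user/<username> already exists:

<code python>
# Minimal sketch: copy a local file into HDFS and list the result.
import getpass
import subprocess

user = getpass.getuser()
subprocess.run(["hadoop", "fs", "-put", "data.txt", "/user/" + user + "/data.txt"], check=True)
subprocess.run(["hadoop", "fs", "-ls", "/user/" + user], check=True)
</code>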
  
\\
**[[cluster:0|Back]]**