cluster:126 [2020/01/30 08:06] hmeij07 [Other Stuff]
All queues are available for job submissions via all login nodes. All nodes are on Infiniband switches for parallel computational jobs (this excludes the tinymem, mw128 and amber128 queues). Our total job slot count is roughly 1,712 against a physical core count of 1,192. Our total compute capacity is about 38 teraflops cpu side and 25 teraflops gpu side (double precision floating point). Our total memory footprint is about 144 GB gpu side and 7,408 GB cpu side.

The home directory file system is provided (via NFS or IPoIB) by the node ''sharptail'' (our file server) from a direct-attached disk array. In total, 10 TB of /home disk space is accessible to users. Node ''greentail'' makes 33 TB of scratch space available at /sanscratch via NFS. In addition, all nodes provide local scratch space at /localscratch (the tinymem queue excluded). The Openlava scheduler automatically creates directories in both scratch areas for each job (named after the job's JOBPID). Backup services for /home are provided via disk-to-disk, point-in-time snapshots from node ''sharptail'' to node ''cottontail'' disk arrays (daily, weekly and monthly snapshots are mounted read-only on ''cottontail'' for self-serve content retrieval). Some chemists have their home directories on node ''ringtail'', which provides 33 TB via /home33. Some psychology users also have their own storage of 110 TB via /mindstore. In addition, no-quota, no-backup user directories can be requested in /homeextra1 (7 TB) or /homeextra2 (5 TB).

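A minimal job-script sketch of how the per-job scratch directories are typically used follows. The queue name and the staging file names are hypothetical; ''LSB_JOBID'' is the job-id environment variable that Openlava (an LSF fork) exports at run time, i.e. the JOBPID the scheduler names the directories after:

```shell
#!/bin/bash
# Hypothetical Openlava job script sketch; the queue name "hp12" is an assumption.
#BSUB -q hp12
#BSUB -n 1

# Openlava exports LSB_JOBID at run time; fall back to a dummy id so the
# sketch can also be exercised outside the scheduler.
JOBID=${LSB_JOBID:-000000}

# The scheduler has already created these per-job directories.
SANSCRATCH=/sanscratch/$JOBID
LOCALSCRATCH=/localscratch/$JOBID

echo "shared scratch: $SANSCRATCH"
echo "local scratch:  $LOCALSCRATCH"
# Typical pattern: stage input into $SANSCRATCH, run the computation there,
# and copy results back to /home before the job ends (scratch is cleaned up).
```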
Two Rstore storage servers each provide about 104 TB of usable backup space, which is not mounted on the compute nodes. Each Rstore server's content is replicated to a dedicated passive standby server of the same size, located in the same data center but in a different rack. As of Spring 2019 we have added two new Rstore servers of 220 TB each, fully backed up with replication.

Home directory policy and Rstore storage options: [[cluster:136|HomeDir and Storage Options]]

Checkpointing is supported in all queues; how it works is described on the [[cluster:190|DMTCP]] page.

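A hedged sketch of typical DMTCP usage follows; ''./my_app'' and the coordinator port are assumptions, and the restart script name is DMTCP's default output. The commands are guarded so the sketch is a no-op on machines where DMTCP is not installed:

```shell
# Hedged DMTCP sketch; ./my_app and port 7779 are assumptions.
if command -v dmtcp_launch >/dev/null 2>&1; then
    DMTCP_AVAILABLE=yes
    dmtcp_coordinator --daemon --port 7779 || true        # start a checkpoint coordinator
    dmtcp_launch --coord-port 7779 ./my_app || true       # run the application under DMTCP
    dmtcp_command --coord-port 7779 --checkpoint || true  # write checkpoint images
    # ...after a crash or requeue, restart from the images:
    # ./dmtcp_restart_script.sh
else
    DMTCP_AVAILABLE=no
    echo "dmtcp not installed; skipping"
fi
```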
For a list of installed software consult the [[cluster:73|Software List]] page.