cluster:37 [DokuWiki]



Differences


cluster:37 [2007/05/15 13:13]
cluster:37 [2007/05/15 13:13] (current)
Line 1: Line 1:
 +\\
 +**[[cluster:0|Home]]**
 +
 +===== Cluster Steering Committee 05/09/2007 =====
 +
 +Present: James Taft, Jolee West, Henk Meij, Francis Starr, David Beveridge, Eric Aaron, Tsampikos Kottos, George Petersson
 +
 +==== ToDos ====
 +
 +  * fix PE2950s (dell issued)
 +
 +  * <del>dm-multipath failover (Fibre Channel)</del> done! 05/14/07
 +
 +  * filesystem fsck tests (affirm 1 TB LUN size)
 +
 +  * <del>rebuild a node</del> done that! hosed the ionode 05/08/07
 +
 +
 +
 +
 +
 +==== Next Steps ====
 +
 +  * create accounts (all members of cluster_advisory_group); by the end of the week of 05/18/07
 +
 +  * open for test & development (no home dir backups, few snapshots); the approach seemed reasonable, especially since we can snapshot more frequently at first, while the filesystem is relatively unused.
 +
 +  * install software ... prioritization led to: amber, delphi, imsl.  others suggested were an open-source fortran 90 compiler (does it exist? appears so: [[http://www.g95.org|G95]]), xmgrace and ddd.
 +
 +| group to prioritize => | portland compilers, Matlab, charm, amber, namd, gromacs, gaussian + linda, R, Stata |
 +
 +  * adjust queues, currently have (in order of descending priority); it was decided to bring up a few general-purpose queues with few or no restrictions.
 +
 +| name | description (lw=light weight, hw=heavy weight, i=infiniband)|
 +| => specialty queues ||
 +| priority | urgent jobs, limited by users allowed, 8 hrs of cpu time, max cores=8, any lw node|
 +| checkpoint | jobs will be checkpointed, max queued jobs=16, max jobs/user=2, cpu time=10 hrs, lw nodes. jobs are rerunnable|
 +| icheckpoint | same as above but ilw nodes|
 +| debug | hosts login1 & login2 (ethernet), max queued jobs=16, max jobs/user=8, cpu time=1 hr. scheduled with relatively high priority. lw nodes|
 +| idebug | same as above but ilw nodes only (hosts ilogin & ilogin2)|
 +| => production queues ||
 +| 16-lwnodes | for normal jobs, lw nodes, max cores=32|
 +| 16-ilwnodes | for normal jobs, ilw nodes, max cores=128|
 +| 04-hwnodes | for large memory jobs, hw nodes, max cores=32, fast /localscratch|
 +| => default queue ||
 +| idle | jobs run on any lightly loaded host, max cores=8, max jobs/user=1, cpu time=10 hrs|
 +
 +==== Other Items ====
 +
 +  * get a quote for an LSF upgrade for all nodes
 +
 +  * initiate TSM tape backups while the filesystem is rather small by adding some tapes to empty slots.
 +
 +
 +\\
 +**[[cluster:0|Home]]**
  
cluster/37.txt · Last modified: 2007/05/15 13:13 (external edit)