cluster:37
| — | cluster:37 [2007/05/15 17:13] (current) – created - external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | \\ | ||
| + | **[[cluster: | ||
| + | |||
| + | ===== Cluster Steering Committee 05/09/2007 ===== | ||
| + | |||
| + | Present: James Taft, Jolee West, Henk Meij, Francis Starr, David Beveridge, Eric Aaron, Tsampikos Kottos, George Petersson | ||
| + | |||
| + | ==== ToDos ==== | ||
| + | |||
| + | * fix PE2950s (dell issued) | ||
| + | |||
| + | * < | ||
| + | |||
| + | * filesystem fsck tests (affirm 1T LUN size) | ||
| + | |||
| + | * < | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | ==== Next Steps ==== | ||
| + | |||
| + | * create accounts (all members of cluster_advisory_group); | ||
| + | |||
| + | * open for test & development (no home dir backups, few snapshots); the approach seemed reasonable, especially since we can snapshot more frequently at first, while the filesystem is relatively unused. | ||
| + | |||
| + | * install software ... prioritization led to: amber, delphi, imsl. Other suggestions included an open-source fortran 90 compiler (does it exist? appears so [[http:// | ||
| + | |||
| + | | group to prioritize => | portland compilers, | ||
| + | |||
| + | * adjust queues; the current queues are listed below in descending order of priority. It was decided to bring up a few general-purpose queues with few or no restrictions. | ||
| + | |||
| + | | name | description (lw=light weight, hw=heavy weight, i=infiniband)| | ||
| + | | => specialty queues || | ||
| + | | priority | urgent jobs, limited by users allowed, 8 hrs of cpu time, max cores=8, any lw node| | ||
| + | | checkpoint | jobs will be checkpointed, | ||
| + | | icheckpoint | same as above but ilw nodes| | ||
| + | | debug | hosts login1 & login2 (ethernet), max queued jobs=16, max jobs/ | ||
| + | | idebug | same as above but ilw nodes only (hosts ilogin & ilogin2)| | ||
| + | | => production queues || | ||
| + | | 16-lwnodes | for normal jobs, lw nodes, max cores=32| | ||
| + | | 16-ilwnodes | for normal jobs, ilw nodes, max cores=128| | ||
| + | | 04-hwnodes | for large memory jobs, hw nodes, max cores=32, fast / | ||
| + | | => default queue || | ||
| + | | idle | jobs run on any lightly loaded host, max cores=8, max jobs/ | ||
| + | |||
| + | ==== Other Items ==== | ||
| + | |||
| + | * get a quote for an LSF upgrade for all nodes | ||
| + | |||
| + | * initiate TSM tape backups, by adding some tapes to empty slots, while the filesystem is still rather small. | ||
| + | |||
| + | |||
| + | \\ | ||
| + | **[[cluster: | ||
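
As a rough illustration of the queue list above, the sketch below records the queues in the same descending priority order and filters them by core count. It is not LSF configuration: the names `Queue`, `QUEUES` and `queues_admitting` are hypothetical, and a core limit is recorded only where the minutes state one (rows without a stated limit are left as `None`).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Queue:
    name: str
    group: str                 # "specialty", "production" or "default"
    max_cores: Optional[int]   # None where the minutes state no core limit


# Same order as the minutes: descending priority.
QUEUES = [
    Queue("priority", "specialty", 8),
    Queue("checkpoint", "specialty", None),
    Queue("icheckpoint", "specialty", None),
    Queue("debug", "specialty", None),
    Queue("idebug", "specialty", None),
    Queue("16-lwnodes", "production", 32),
    Queue("16-ilwnodes", "production", 128),
    Queue("04-hwnodes", "production", 32),
    Queue("idle", "default", 8),
]


def queues_admitting(cores: int, group: Optional[str] = None) -> List[str]:
    """Names of queues whose stated core limit admits the job, in priority order.

    Queues with no stated limit are skipped, since the minutes do not say
    what they allow.
    """
    return [
        q.name
        for q in QUEUES
        if q.max_cores is not None
        and cores <= q.max_cores
        and (group is None or q.group == group)
    ]


if __name__ == "__main__":
    print(queues_admitting(64))
    print(queues_admitting(8, group="production"))
```

Going by the stated limits alone, a 64-core job fits only `16-ilwnodes`. Other restrictions listed in the minutes (allowed users, cpu time, job counts) are not modelled here.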
