cluster:107 [2012/12/19 19:18] hmeij [ConfCall & Quote: MW]
cluster:107 [2013/01/16 14:58] hmeij [ConfCall & Quote: MW]
  * buy a single rack and test locally, start small (will future racks be compatible?

==== Yale Qs ====

Tasked with getting GPU HPC going at Wesleyan and trying to gain insights into the project. If you acquired a GPU HPC ...

  * What was the most important design element of the cluster?
  * What factor(s) settled the CPU-to-GPU ratio?
  * Was single or double precision peak performance more important, or neither?
  * What was the software suite in mind (commercial,
  * How did you reach out/educate users on the aspects of GPU computing?
  * What was the impact on the users? (recoding, recompiling)
  * Was the expected computational speed-up realized?
  * Were the PGI Accelerator compilers leveraged? If so, what were the results?
  * Do users compile with nvcc?
  * Does the scheduler have a resource for idle GPUs so they can be reserved?
  * How are the GPUs exposed/
  * Do you allow multiple serial jobs to access the same GPU? Or one parallel job multiple GPUs?
  * Can parallel jobs access multiple GPUs across nodes?
  * Any experiences with pmemd.cuda.MPI (part of Amber)?
  * What MPI flavor is used most in regards to GPU computing?
  * Do you leverage the CPU HPC of the GPU HPC? For example, if there are 16 GPUs and 64 CPU cores on a cluster, do you allow 48 standard jobs on the idle cores? (assuming the max of 16 serial GPU jobs)
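
The scheduler-resource question above can be made concrete with a small sketch: a load-indicator script (for example an LSF elim, or a cron-fed static resource) could report how many GPUs on a node are idle. On a live node the busy list would come from ''nvidia-smi''; here the counting runs on a captured sample so the logic is self-contained, and the UUIDs and the GPU total are made-up placeholders.

```shell
#!/bin/sh
# Sketch: report idle GPUs for a scheduler resource (hypothetical setup).
# On a live node the busy-GPU list would come from:
#   nvidia-smi --query-compute-apps=gpu_uuid --format=csv,noheader
# The sample below stands in for that output (placeholder UUIDs).
sample='GPU-1111
GPU-1111
GPU-2222'

total=4                                             # placeholder: GPUs in the node
busy=$(printf '%s\n' "$sample" | sort -u | wc -l)   # distinct GPUs running work
idle=$((total - busy))
echo "idle_gpus=$idle"
```

A scheduler can then treat ''idle_gpus'' as a consumable resource and hold jobs until a GPU is free.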

Notes 04/01/2012 ConfCall

  * Applications drive the CPU-to-GPU ratio and most will be 1-to-1, certainly not larger than 1-to-3
  * Users did not share GPUs but could obtain more than one, always on the same node
  * Experimental setup with 36 GB/node, dual 8-core chips
  * Nothing larger than that memory-wise, as CPU and GPU HPC work environments were not mixed
  * No raw code development
  * Speed-ups were hard to quantify
  * The PGI Accelerator compilers were used because they are needed with any Fortran code (Note!)
  * Double precision was most important in scientific applications
  * MPI flavor was OpenMPI; others (including MVAPICH) showed no advantages
  * Book: Programming Massively Parallel Processors: A Hands-on Approach, Second Edition, by David B. Kirk and Wen-mei W. Hwu (Dec 28, 2012)
    * Has examples of how to expose GPUs across nodes
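
The note that users get whole GPUs (never shared, always on the same node) is typically enforced by restricting device visibility per job. A minimal sketch, assuming the scheduler hands the job GPU indices 0 and 1 (placeholder values):

```shell
#!/bin/sh
# Sketch: give a job exclusive use of two GPUs on one node.
# The index list is a placeholder; a scheduler would normally supply it.
CUDA_VISIBLE_DEVICES=0,1
export CUDA_VISIBLE_DEVICES
# Inside the job, CUDA renumbers the visible devices as 0 and 1,
# so the application cannot touch the node's other GPUs.
echo "job restricted to GPUs: $CUDA_VISIBLE_DEVICES"
# ./gpu_app            # hypothetical application binary
```
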
==== ConfCall & Quote: AC ====
^ Topic ^ Description ^
| General | 2 CPUs (16 cores), 3 GPUs (7,500 cuda cores), 32 gb ram/node |
| Head Node | None |
| Nodes | 1x4U Rackmountable Chassis, 2xXeon E5-2660 2.20 GHz 20MB Cache 8 cores (16cores/
  * maybe all Lapack libraries too
  * Make the head node a compute node (in/for the future and beef it up too, 256 GB ram?)
  * Leave the 6x2TB disk space (for backup)
    * 2U, 8 drives, up to 6x4TB = 24TB
  * Add an entry level Infiniband/
    * for parallel file locking
  * Spare parts
    * or get 5 years total warranty
  * Testing notes
    * Amber, LAMMPS, NAMD
    * cuda v4&5
    * install/
    * use gnu ... with openmpi
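
For the Amber item in the testing notes, pmemd.cuda.MPI is usually launched with one MPI rank per GPU. A hedged sketch that only assembles and prints the command line (the input/output file names are Amber's conventional defaults; the GPU count is a placeholder to match whatever the job reserved):

```shell
#!/bin/sh
# Sketch: build an mpirun line for pmemd.cuda.MPI, one rank per GPU.
NGPUS=2   # placeholder; set to the number of GPUs reserved for the job
CMD="mpirun -np $NGPUS pmemd.cuda.MPI -O -i mdin -o mdout -p prmtop -c inpcrd"
echo "$CMD"   # inspect before running on a GPU node
```
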
\\
**[[cluster: