\\ **[[cluster:0|Back]]**

==== 2018 GPU Expansion ====

Important notes about the GeForce GTX1080Ti ...

**From the Nvidia web site:** //Warranted Product is intended for consumer end user purposes only, and is not intended for datacenter use and/or GPU cluster commercial deployments (“Enterprise Use”). Any use of Warranted Product for Enterprise Use shall void this warranty.//

**From the Exxact web site:** //Exxact AMBER Certified AMBER MD Workstations and Clusters, in addition to being numerically validated using custom GPU validation suites, also come with an optimized version of AMBER 16 that has been developed in a collaboration between lead AMBER developer Ross Walker (SDSC), NVIDIA and Exxact.//

So it is a grey area whether GTX1080Ti use in data center research is under warranty or not. Most quotes come with language stating that all warranty issues will be handled between the card owner and the card issuer. If we go with consumer grade GTX cards we should add some spares ourselves (dev budget?) at $700 apiece. The Tesla P100 is enterprise grade at $6,000 apiece.

Quotes then ...

  * A1-C1 carry GTX1080Ti consumer grade gpus; C2-D2 carry Tesla P100 enterprise grade gpus
  * A1: most nodes, but not max cpu cores or gpus
  * A3: max cpu cores, max gpus, n78 "like"
  * D1: max teraflops, cheapest teraflops, not n78 "like"

^ Vendor ^ A ^^^ B ^^ C ^^ D ^^ Notes ^
^ Quote ^ #1 ^ #2 ^ #3 ^ #1 ^ #2 ^ #1 ^ #2 ^ #1 ^ #2 ^ ^
| Nodes | 11 | 9 | 8 | 9 | 7 | 8 | 5 | 6 | 6 | 5-22 U |
| Cpus | 22 | 18 | 16 | 18 | 14 | 16 | 10 | 12 | 12 | |
| Cores | 220 | 180 | 224 | 216 | 168 | 192 | 120 | 168 | 144 | physical |
| Gpus | 22 | 36 | 32 | 18 | 28 | 32 | 10 | 12 | 12 | |
| Cores | 79 | 129 | 115 | 66 | 100 | 115 | 36 | 43 | 43 | k physical |
| Teraflops | 7.7+7.8 | 6.3+13 | 18+11 | 18+6.4 | 14+10 | 16+11 | 10+47 | 12+56 | 12+56 | cpu+gpu (dpfp) |
| $/TFlop | 6,335 | 5,073 | 3,510 | 4,299 | 3,946 | 3,822 | 1,795 | 1,422 | 1,638 | hpc 38+25 |
^ Per Node ^^^^^^^^^^^
| Chassis | 2U | 2U | 2U | 2U | 2U | 1U | 1U | 1U | 1U | depth of rails? |
| CPU | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | "skylake" |
| | 4114 | 4114 | 5120 | 6126 | 6126 | 6126 | 6126 | 5120 | 6126 | model |
| | 10+10 | 10+10 | 14+14 | 12+12 | 12+12 | 12+12 | 12+12 | 14+14 | 12+12 | physical+logical cores |
| | 2.2 | 2.2 | 2.2 | 2.6 | 2.6 | 2.6 | 2.6 | 2.2 | 2.6 | GHz, max 3-3.7 |
| | 85 | 85 | 105 | 125 | 125 | 125 | 125 | 105 | 125 | Watts |
| | 13.75 | 13.75 | 19.25 | 19.25 | 19.25 | 19.25 | 19.25 | 19.25 | 19.25 | L3 cache MB |
| | 16 | 16 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | dp flops/cycle |
| DDR4 | 192 | 192 | 192 | 192 | 192 | 192 | 192 | 192 | 192 | GB memory |
| | 2666 | 2666 | 2666 | 2666 | 2666 | 2666 | 2666 | 2666 | 2666 | MHz |
| Drives | 480 | 480 | 480 | 2x240 | 2x240 | 1024 | 1024 | 480 | 480 | GB storage |
| | 2.5"s | 2.5"s | 2.5"s | 3.5"s | 3.5"s | 2.5"h | 2.5"h | 2.5"s | 2.5"s | SSD/HDD |
| | | | | 960 | 960 | | | | | GB scratch |
| GPU | 2 | 4 | 4 | 2 | 4 | 4 | 2 | 2 | 2 | |
| | GTX | GTX | GTX | GTX | GTX | GTX | Tesla | Tesla | Tesla | warranty note |
| | 1080 | 1080 | 1080 | 1080 | 1080 | 1080 | P100 | P100 | P100 | model |
| | 11 | 11 | 11 | 11 | 11 | //8//! | 12 | 12 | //16//! | GB memory |
| | 1.6 | 1.6 | 1.6 | 1.6 | 1.6 | 1.6 | 1.9 | 1.9 | 1.9 | GHz, max 1.9 |
| | 250 | 250 | 250 | 250 | 250 | 250 | 250 | 250 | 250 | Watts |
| Image | 1 | 1 | 1 | ? | ? | ? | ? | 1 | 1 | MPI flavors |
| CentOS7 | y | y | y | ? | ? | y | y | y | y | + all software |
| Nics | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | gigabit ethernet |
| Warranty | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | excl gtx cards |
| n78? | n | y | y | n | y | y | n | n | n | matches? |
| | -1.8 | -3.1 | +1.8 | +4.9 | -5.8 | +4.7 | +2.3 | -3.3 | +11.4 | Δ |
| Remember to add self spare gtx cards! ||||||| No spares ||| |

GFLOPS = #chassis * #nodes/chassis * #sockets/node * #cores/socket * GHz/core * FLOPs/cycle

Note that using a processor's clock rate in GHz yields theoretical performance in GFLOPS. Divide GFLOPS by 1,000 to get TeraFLOPS (TFLOPS).
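The formula above can be checked against the "Teraflops" row of the quote table. A minimal sketch, using the per-node figures from the table (node counts, sockets, cores/socket, clock, and dp flops/cycle); the function name ''cpu_tflops'' is just for illustration:

```python
# Theoretical double-precision CPU performance, per the formula above:
# GFLOPS = nodes * sockets/node * cores/socket * GHz * FLOPs/cycle
def cpu_tflops(nodes, sockets, cores_per_socket, ghz, flops_per_cycle):
    gflops = nodes * sockets * cores_per_socket * ghz * flops_per_cycle
    return gflops / 1000.0  # divide by 1,000 for TFLOPS

# Quote A1: 11 nodes, dual 10-core Xeon 4114 @ 2.2 GHz, 16 dp flops/cycle
print(round(cpu_tflops(11, 2, 10, 2.2, 16), 1))  # 7.7, matches the table
# Quote D2: 6 nodes, dual 12-core Xeon 6126 @ 2.6 GHz, 32 dp flops/cycle
print(round(cpu_tflops(6, 2, 12, 2.6, 32), 1))   # 12.0, matches the table
```

Note how the 6126's AVX-512 units (32 dp flops/cycle vs. 16 on the 4114) let a 6-node D2 quote match the cpu throughput of an 11-node A1 quote.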
[[http://en.community.dell.com/techcenter/high-performance-computing/w/wiki/2329]]

Find dp flops/cycle here: https://www.aspsys.com/solutions/hpc-processors/intel-xeon-skylake/

==== Request ====

In early fall 2017 we purchased a 1U server from Exxact with four GTX1080Ti gpus, 128 gb of memory, and dual 8-core E5-2620 v4 cpus. These gpus have been performing well for us, with speed-ups (compared to our K20s) of: amber 5x, gromacs 2x, lammps 11x, and fsl's bedpostx a whopping 118x. Our "sweet spot" cpu:gpu ratios for the types of jobs we run are:

  * amber 1:1,
  * gromacs 10:1,
  * lammps 2-4:1, and
  * namd 13:1.

Most of our jobs will only use one gpu at a time. We also have a cpu-only job queue of dying HP blade servers. This expansion would run gpu jobs but also cpu-only jobs. Hence we'd like to have more servers, which would allow a good mix of cpu-only and cpu/gpu jobs.

  * Budget? $?k. When? Near the end of Q3.

Each server containing ...

  * two (amber certified) gpus and two cpus (at least 10 cores each)
  * 128 gb of memory
  * centos7 with modules
  * latest nvidia driver and mpi (open as to which flavor; mpich for amber I think, openmpi for all others)
  * latest amber (we will provide proof of purchase), gromacs, lammps, namd
  * at least two gigabit ethernet ports, addressed starting at
    * node n79: nic1 192.168.102.89, nic2 10.10.102.89, ipmi 192.168.103.89, netmask for all 255.255.0.0
  * image/configure at least one server (or do all; I can image using a warewulf golden image)
  * leave the build environment and logs in place (I will upgrade our K20s following this setup)
  * install software in /usr/local
  * 3 year warranty, NBD

We will supply:

  * a standard 42U rack (rails at 30", up to 37" usable) with a 7k BTU AC unit (an experiment)
  * 2x vertical PDUs (24A) supplying 2x30 C13 outlets, 208V
  * openlava scheduler rpms
  * two ethernet switches

Open to suggestions, modifications, and substitutions. We'd prefer to go with the GTX1080Ti gpus, which are still listed as certified for Amber 18.

\\ **[[cluster:0|Back]]**
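The node addressing above follows a simple pattern: nic1 on 192.168.102.x, nic2 on 10.10.102.x, ipmi on 192.168.103.x, same last octet on all three. A sketch generating the plan for subsequent nodes, assuming the last octet keeps tracking the node number (n79 → .89, n80 → .90, ...); that continuation is inferred from the single n79 example, not confirmed:

```python
# Generate the per-node address plan, assuming the last octet stays at
# node number + 10 (n79 -> .89); this offset is an inference, not a spec.
def node_addresses(node):
    octet = node + 10  # n79 maps to .89
    return {
        "node": f"n{node}",
        "nic1": f"192.168.102.{octet}",
        "nic2": f"10.10.102.{octet}",
        "ipmi": f"192.168.103.{octet}",
        "netmask": "255.255.0.0",
    }

for n in range(79, 82):
    a = node_addresses(n)
    print(a["node"], a["nic1"], a["nic2"], a["ipmi"])
```

With the /16 netmask, nic1 and ipmi land on the same 192.168.0.0/16 network while nic2 sits on the separate 10.10.0.0/16 data network.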