cluster:181

Differences

This shows you the differences between two versions of the page.

cluster:181 [2019/07/31 13:21]
hmeij07 [2019 GPU Models]
cluster:181 [2019/08/02 14:53]
hmeij07 [2019 GPU Models]
Line 4: Line 4:
 ===== 2019 GPU Models =====
  
-We do not do AI (yet).  The pattern is mostly one job per GPU for exclusive access.  So no NVlink requirements, CPI connections sufficient.  The application list is Amber, Gromacs, Lammps and some python biosequencing packages. Our current per GPU memory footprint is 8 GB which seems sufficient.
+We do not do AI (yet).  The GPU usage pattern is mostly one job per GPU for exclusive access, so there are no NVLink requirements; PCI connections are sufficient.  The application list is Amber, Gromacs, Lammps and some Python biosequencing packages. Our current per-GPU memory footprint is 8 GB, which seems sufficient.
  
 ^          Quadro  ^^^^^  Tesla  ^^  Turing  ^    ^
Line 10: Line 10:
 |  Cores  |  4352  |  4608  |  2304  |  4608  |  4608  |  3584  |  5120  |  2560  |parallel cuda|
 | Memory  |  11  |  24  |  8  |  24  |  46  |  12  |  32  |  16  |GB ddr6|
-|  Watts  |  250  |  280  |  250  |  295  |  295  |  250  |  250  |  70!  |    |
+|  Watts  |  250  |  280  |  250  |  295  |  295  |  250  |  250  |  70 !  |    |
 |  Tflops  |  -  |  0.5  |  -  |  0.5  |  -  |  4.7  |  7  |  -  |double fp64|
 |  Tflops  |  13.5  |  16  |  7  |  16  |  16  |  9.3  |  14  |  8.1  |single fp32|
-|  Avg Bench  |  197%  |  215%  |  120%  |  207%  |  219%  |  120%  |  150%  |  ??  |user bench reporting|
-|  Price  |  $1,199  |  $2,499  |  $900  |    $4,000  |  $5,500  |  $4,250  |  $9,538  |  ??  |list price|
-|  $/fp32  |  $89  |  $156  |  $129  |  $250  |  $344  |  $457  |  $681  |  ??     |
+|  Avg Bench  |  197%  |  215%  |  120%  |  207%  |  219%  |  120%  |  150%  |  ??  |user bench|
+|  Price  |  $1,199  |  $2,499  |  $900  |    $4,000  |  $5,500  |  $4,250  |  $9,538  |  $2,200  |list price|
+|  $/fp32  |  $89  |  $156  |  $129  |  $250  |  $344  |  $457  |  $681  |  $272     |
 |  Notes  |  small scale  |  medium scale  |  small scale |  medium scale  |  large scale  |  versatile but EOL |  most advanced  |  supercharge  |    |
-|  FP64?  |  -  |  some  |  -  |  some  |  -  |  yes  |  yes  |  -  |double fp64|
+|  FP64?  |  -  |  some  |  -  |  some  |  -  |  yes  |  yes  |  -  |double precision|
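
The $/fp32 row appears to be the list price divided by the single precision Tflops, rounded to whole dollars. A minimal sketch that reproduces the row under that assumption; the function name ''cost_per_tflop'' is made up for illustration, and columns are identified by position only because the model names are not part of this diff.

```python
# List prices (USD) and single precision (fp32) Tflops, copied from the
# Price and Tflops rows of the table, in column order. Model names are
# not shown in this diff, so columns are identified by position only.
prices = [1199, 2499, 900, 4000, 5500, 4250, 9538, 2200]
fp32_tflops = [13.5, 16, 7, 16, 16, 9.3, 14, 8.1]


def cost_per_tflop(price, tflops):
    """Return list price divided by peak Tflops, rounded to whole dollars.

    Hypothetical helper: the rounding rule is an assumption that happens
    to match the numbers in the table.
    """
    return round(price / tflops)


dollars_per_fp32 = [cost_per_tflop(p, t) for p, t in zip(prices, fp32_tflops)]
print(dollars_per_fp32)  # [89, 156, 129, 250, 344, 457, 681, 272]
```

The last entry is the newly filled in column: $2,200 / 8.1 Tflops comes to about $272.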
  
 A lot of information comes from this web site [[https://blog.exxactcorp.com/whats-the-best-gpu-for-deep-learning-rtx-2080-ti-vs-titan-rtx-vs-rtx-8000-vs-rtx-6000/|Best GPU for deep learning]]
Line 27: Line 27:
 This is a handy tool [[https://www.nvidia.com/en-us/data-center/tesla/tesla-qualified-servers-catalog/|GPU Server Catalog]]
  
-Learn more about the T4
+Learn more about the T4: it can run in mixed mode (fp32/fp16) and deliver 65 Tflops. Other modes are INT8 at 130 Tops and INT4 at 260 Tops. At 65 Tflops mixed precision the cost dives to $34/tflop. Amazing. And at 70 watts the power draw is amazing too.
  
   * [[https://www.nvidia.com/en-us/data-center/tesla-t4/|T4]]
   * [[https://www.nvidia.com/en-us/data-center/products/enterprise-server/|External Link]]
+  * [[https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664|Fp32, Fp16, INT8, INT4, Mixed Mode]]
+      * very interesting peak performance FP32 gpu chart (RTX TITAN and RTX 6000 on top)
+    * [[https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#framework|Training Guide for Mixed Precision]]
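
The T4 cost figures can be checked with a little arithmetic. A rough sketch, assuming the $2,200 list price, the 70 watt rating and the peak throughput numbers quoted on this page; the variable names are made up for illustration, and peak numbers say nothing about what Amber or Gromacs would actually achieve.

```python
# T4 figures as quoted on this page: list price (USD), power draw (watts)
# and peak throughput per mode (Tflops for floating point, Tops for INT).
T4_PRICE = 2200
T4_WATTS = 70
t4_peak = {
    "fp32": 8.1,
    "mixed fp32/fp16": 65,
    "INT8": 130,
    "INT4": 260,
}

# Dollars per unit of peak throughput, rounded to whole dollars.
t4_cost = {mode: round(T4_PRICE / peak) for mode, peak in t4_peak.items()}

# Peak throughput per watt, rounded to two decimals.
t4_per_watt = {mode: round(peak / T4_WATTS, 2) for mode, peak in t4_peak.items()}

for mode in t4_peak:
    print(f"{mode:16s} ${t4_cost[mode]:>4d} per T(fl)op  {t4_per_watt[mode]:.2f} per watt")
```

In mixed mode this gives the $34 per Tflop mentioned above, against $272 per Tflop at plain fp32.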
+
+Keep track of this: does Amber run on the T4? The Amber web site states "Turing (SM_75) based cards require CUDA 9.2 or later." but does not list the T4 (too new?).
+
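
One way to keep track of the CUDA requirement is a small lookup. A minimal sketch, assuming the T4 reports compute capability 7.5 (SM_75) as NVIDIA lists it; only the SM_75 entry comes from the quoted Amber note, and the names ''MIN_CUDA_FOR_SM'', ''parse_version'' and ''cuda_is_sufficient'' are hypothetical. It checks the toolkit version only, not whether Amber itself supports the card.

```python
# Minimum CUDA toolkit version per GPU architecture. Only the SM_75 entry
# is taken from the quoted Amber note; anything else must be added by hand.
MIN_CUDA_FOR_SM = {"SM_75": (9, 2)}


def parse_version(text):
    """Turn a version string such as '9.2' or '10.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cuda_is_sufficient(sm, cuda_version):
    """Return True/False for known architectures, None for unknown ones."""
    required = MIN_CUDA_FOR_SM.get(sm)
    if required is None:
        return None  # no requirement recorded for this architecture
    return parse_version(cuda_version) >= required


# Assumption: the T4 is a Turing card with compute capability 7.5.
print(cuda_is_sufficient("SM_75", "9.0"))   # False
print(cuda_is_sufficient("SM_75", "9.2"))   # True
print(cuda_is_sufficient("SM_75", "10.1"))  # True
```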
 \\
 **[[cluster:0|Back]]**
cluster/181.txt · Last modified: 2019/08/13 12:15 by hmeij07