This is a handy tool [[https://www.nvidia.com/en-us/data-center/tesla/tesla-qualified-servers-catalog/|GPU Server Catalog]]
  
Learn more about the T4 ... the T4 can run in mixed mode (fp32/fp16) and can deliver 65 Tflops. Other modes are INT8 at 130 Tops and INT4 at 260 Tops. At 65 Tflops mixed precision, the cost dives to $34/Tflop. Amazing, and the 70 W power draw is amazing too.
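As a quick sanity check, here is a sketch of the $/Tflop arithmetic; the card price used is simply what $34/Tflop at 65 Tflops implies (about $2,210), not an official quote.

<code python>
# Back-of-the-envelope $/T(fl)op for the T4 in the modes quoted above.
# Card price is inferred from the figures in this note
# (65 Tflops * $34/Tflop ~= $2,210), NOT an official list price.
card_price = 65 * 34  # USD, implied

modes = {
    "fp32/fp16 mixed": 65,   # Tflops
    "INT8": 130,             # Tops
    "INT4": 260,             # Tops
}

for mode, tops in modes.items():
    print(f"{mode:>16}: {tops:3d} -> ${card_price / tops:.0f} per T(fl)op")
</code>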
  
  * [[https://www.nvidia.com/en-us/data-center/tesla-t4/|T4]]
  * [[https://www.nvidia.com/en-us/data-center/products/enterprise-server/|NVIDIA Enterprise Servers]]
  * [[https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664|FP32, FP16, INT8, INT4, Mixed Mode]]
    * very interesting peak-performance FP32 GPU chart (TITAN RTX and RTX 6000 on top)
  * [[https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#framework|Training Guide for Mixed Precision]] (minimal enablement sketch below)

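To make "mixed precision" concrete, here is a minimal sketch of enabling it in PyTorch via NVIDIA Apex, one of the frameworks the training guide above covers; the model, data, and hyperparameters are placeholders.

<code python>
# Minimal mixed-precision training sketch using NVIDIA Apex (PyTorch).
# Layer sizes, lr, and batch are stand-ins for illustration only.
import torch
from apex import amp

model = torch.nn.Linear(1024, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# opt_level "O1" patches whitelisted ops to run in fp16 where it is safe.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

x = torch.randn(32, 1024, device="cuda")
target = torch.randint(0, 10, (32,), device="cuda")

loss = torch.nn.functional.cross_entropy(model(x), target)
# Loss scaling keeps small fp16 gradients from underflowing to zero.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
</code>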
From a Lammps developer: "Computing forces in all single precision is a significant approximation and mostly works ok in homogeneous systems, where there is a lot of error cancellation. Using half precision in any form for force computations is not advisable."

From the Gromacs web site: "GROMACS simulations are normally run in “mixed” floating-point precision, which is suited for the use of single precision in FFTW. The default FFTW package is normally in double precision."

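A small illustration (not from either package) of why fp16 force accumulation is risky: with roughly 3 decimal digits of precision, summing many small contributions loses accuracy fast compared to fp32/fp64.

<code python>
# Compare accumulating 10,000 force-like contributions in fp16/fp32/fp64.
# The numbers are synthetic; the point is the relative error.
import numpy as np

rng = np.random.default_rng(0)
contribs = rng.normal(0.0, 1.0, 10000)  # per-pair contributions

exact = contribs.astype(np.float64).sum()
fp32 = np.float32(0.0)
fp16 = np.float16(0.0)
for c in contribs:
    fp32 += np.float32(c)
    fp16 += np.float16(c)

for name, val in [("fp32", fp32), ("fp16", fp16)]:
    print(f"{name} sum: {float(val):+.4f}  rel. err vs fp64: "
          f"{abs(float(val) - exact) / abs(exact):.1e}")
</code>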
**Keep track of these**

  - Does Amber run on the T4? The web site lists "Turing (SM_75) based cards require CUDA 9.2 or later." but does not list the T4 (too new?). A quick way to check a node's CUDA version and compute capability is sketched below.
  - Gaussian g16c01 AVX-enabled linux binaries, no linda: "Platforms marked with † include GPU support for NVIDIA K40, K80, //P100, and V100// boards with 12 GB of memory or higher. A version of NVIDIA drivers compatible with CUDA 8.0 or higher." We run CUDA 9.2, so ok, but is OS platform 6.10 or 7.6 required? We're at 6.5 (n38-n45) or 7.5.10 (n33-n37, n78).
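A minimal check for item 1 on a given node, assuming PyTorch with CUDA support is installed (any CUDA-aware tool would do):

<code python>
# Report the CUDA runtime version and the GPU's compute capability.
# Turing cards such as the T4 report (7, 5), i.e. SM_75.
import torch

print("CUDA runtime:", torch.version.cuda)
print("Compute capability:", torch.cuda.get_device_capability(0))
</code>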
  
  
\\
**[[cluster:0|Back]]**