====== 2019 GPU Models ======
A lot of information comes from this web site: [[https://blog.exxactcorp.com/whats-the-best-gpu-for-deep-learning-rtx-2080-ti-vs-titan-rtx-vs-rtx-8000-vs-rtx-6000/|Best GPU for deep learning]]

Bench statistics (the Nvidia GTX 1070 is about the 100% baseline) come from this web site: [[https://gpu.userbenchmark.com/Faq/What-is-the-effective-GPU-speed-index/82|External Link]]

Most GPU models come in multiple memory configurations; the most common footprints are shown.
This is a handy tool: [[https://www.nvidia.com/en-us/data-center/tesla/tesla-qualified-servers-catalog/|GPU Server Catalog]]

Learn more about the T4 ... the T4 can run in mixed precision mode (FP32/FP16) and deliver 65 Tflops. Other modes are INT8 at 130 Tops and INT4 at 260 Tops. At 65 Tflops mixed precision the cost dives to about $34/Tflop. Amazing. And the wattage is amazing too. See the next page for the FP64/FP32 mixed precision mode quandary: [[cluster:182|P100 vs RTX 6000 & T4]]
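To show where that $34/Tflop number comes from, here is a minimal sketch of the arithmetic. The card price is an assumption (roughly what a T4 sold for in 2019, not a quote); the Tflops figures are the ones quoted above and on NVIDIA's T4 page.

<code python>
# Back-of-the-envelope $/Tflop sketch -- the price is an ASSUMPTION, not a quote.
t4_price_usd     = 2200.0   # assumed rough 2019 street price
t4_mixed_tflops  = 65.0     # FP32/FP16 mixed precision (NVIDIA T4 spec)
t4_fp32_tflops   = 8.1      # single precision (NVIDIA T4 spec)

print("mixed precision: $%.0f/Tflop" % (t4_price_usd / t4_mixed_tflops))  # ~ $34/Tflop
print("FP32 only:       $%.0f/Tflop" % (t4_price_usd / t4_fp32_tflops))   # ~ $272/Tflop
</code>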
  * [[https://www.nvidia.com/en-us/data-center/tesla-t4/|T4]]
  * [[https://www.nvidia.com/en-us/data-center/products/enterprise-server/|External Link]]
  * [[https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664|FP32, FP16, INT8, INT4, Mixed Mode]]
  * very interesting peak performance FP32 GPU chart (RTX TITAN and RTX 6000 on top)
  * [[https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#framework|Training Guide for Mixed Precision]] -- a minimal mixed precision sketch follows below this list
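The training guide above is framework specific; purely as an illustration (the model, sizes and data below are made up, and a reasonably recent PyTorch is assumed), this is roughly what mixed precision training (FP16 compute with FP32 master weights and loss scaling) looks like with PyTorch's automatic mixed precision.

<code python>
# Hedged sketch only: a made-up model and data showing the autocast / loss-scaling
# pattern from NVIDIA's mixed precision guide, via PyTorch's torch.cuda.amp.
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda"                                  # assumes a Volta/Turing-class card (V100, T4, RTX)
model = nn.Linear(1024, 10).to(device)           # hypothetical tiny model
opt = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()             # scales the loss so FP16 gradients do not underflow

for step in range(10):                           # hypothetical training loop
    x = torch.randn(64, 1024, device=device)
    y = torch.randint(0, 10, (64,), device=device)
    opt.zero_grad()
    with torch.cuda.amp.autocast():              # eligible ops run in FP16, the rest stay FP32
        loss = F.cross_entropy(model(x), y)
    scaler.scale(loss).backward()                # backward pass on the scaled loss
    scaler.step(opt)                             # unscales gradients, skips the step on inf/nan
    scaler.update()
</code>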
From a Lammps developer: "Computing forces in all single precision is a significant approximation and mostly works ok in homogeneous system, where there is a lot of error cancellation. Using half precision in any form for force computations is not advisable."

From the Gromacs web site: "GROMACS simulations are normally run in “mixed” floating-point precision, which is suited for the use of single precision in FFTW. The default FFTW package is normally in double precision."
**Keep track of these**
  - Does Amber run on the T4? The web site lists "Turing (SM_75) based cards require CUDA 9.2 or later." but does not list the T4 (too new?).
  - Gaussian g16c01 AVX-enabled Linux binaries, no Linda: "Platforms marked with † include GPU support for NVIDIA K40, K80, //P100, and V100// boards with 12 GB of memory or higher. A version of NVIDIA drivers compatible with CUDA 8.0 or higher." We run CUDA 9.2, so that is ok, but is OS platform 6.10 or 7.6 required? We are at 6.5 (n38-n45) or 7.5.10 (n33-n37, n78). A small node-check sketch follows below.
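For the second item, a quick per-node check of the driver and OS release helps answer the CUDA/OS questions. This is only a sketch, assuming ''nvidia-smi'' is on the PATH and a RHEL/CentOS-style ''/etc/redhat-release''; run it on the nodes in question (n33-n45, n78).

<code python>
# Sketch of a per-node sanity check for the Gaussian GPU requirements above.
# ASSUMES nvidia-smi is on the PATH and a RHEL/CentOS style /etc/redhat-release.
import subprocess

def nvidia_driver_version():
    # nvidia-smi can report just the driver version in CSV form
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"])
    return out.decode().strip().splitlines()[0]

def os_release():
    with open("/etc/redhat-release") as f:
        return f.read().strip()

print("Driver :", nvidia_driver_version())   # must support CUDA 8.0 or later
print("OS     :", os_release())              # Gaussian lists 6.10 / 7.6
</code>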
\\
**[[cluster:0|Back]]**