\\
**[[cluster:0|Back]]**

==== Turing/Volta/Pascal ====

  * https://graphicscardhub.com/turing-vs-volta-v-pascal/
  
==== AWS deploys T4 ====
  * https://www.hpcwire.com/2019/09/20/aws-makes-t4-gpu-instances-broadly-available-for-inferencing-graphics/
  
Look at this: the smallest of these Elastic Compute Cloud (EC2) instances is **g4dn.xlarge**, yielding access to 4 vCPUs, 16 GiB memory and 1x T4 GPU. The largest is **g4dn.16xlarge**, yielding access to 64 vCPUs, 256 GiB memory and 1x T4 GPU. The smallest is priced at $0.526/hr, and running that card 24/7 for a year costs $4,607.76 ... meaning ... option #7 below with 26 GPUs would cost you a whopping $119,802. Annually! That's the low tide water mark.
  
The high tide water mark? The largest instance is priced at $4.352/hr and would cost you nearly one million dollars per year if you matched option #7.
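
The arithmetic behind those tide marks is just hourly rate x hours per year x instance count; a minimal Python sketch using only the on-demand rates quoted above (no storage, data transfer, or reserved/spot discounts):

<code python>
# Annual cost sketch for the g4dn on-demand rates quoted above.
# Does not model storage, data transfer, or reserved/spot pricing.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours

def annual_cost(hourly_rate, instances=1):
    """Cost of running `instances` instances 24/7 for one year."""
    return hourly_rate * HOURS_PER_YEAR * instances

print(f"g4dn.xlarge,    1 instance : ${annual_cost(0.526):,.2f}")      # ~$4,607.76
print(f"g4dn.xlarge,   26 instances: ${annual_cost(0.526, 26):,.2f}")  # ~$119,801.76 (low tide)
print(f"g4dn.16xlarge, 26 instances: ${annual_cost(4.352, 26):,.2f}")  # ~$991,211.52 (high tide)
</code>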
  
==== 2019 GPU Expansion ====

More focus...

  * Vendor A:
    * Option 1: 48 gpus, 12 nodes, 24U, each: two 4214 12-core cpus (silver), 96 gb ram, 1tb SSD, four NVIDIA RTX 2080 SUPER 8GB GPU,  centos7 yes, cuda yes, 3 yr, 2x gbe nics, 17.2w 31.5d 3.46h" (fits)

With the Deep Learning Ready docker containers ... [[cluster:187|NGC Docker Containers]]

The SUPER model quote above is what we selected\\
 --- //[[hmeij@wesleyan.edu|Henk]] 2020/01/03 08:22//
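
For reference, pulling and launching one of those NGC images on a node looks roughly like the sketch below. It assumes docker 19.03+ with the NVIDIA container toolkit installed; the TensorFlow image tag is only an example, check the NGC catalog for current tags.

<code python>
# Minimal sketch: pull and launch an NGC container via the docker CLI.
# Assumes docker 19.03+ and the NVIDIA container toolkit on the node.
# The image tag below is an example only -- pick a current one from the
# NGC catalog (link further down this page).
import subprocess

IMAGE = "nvcr.io/nvidia/tensorflow:19.12-tf1-py3"  # example tag (assumption)

subprocess.run(["docker", "pull", IMAGE], check=True)

# Interactive shell inside the container with all GPUs visible.
subprocess.run([
    "docker", "run", "--rm", "-it",
    "--gpus", "all",        # requires the NVIDIA container toolkit
    "-v", "/home:/home",    # expose home directories (site choice, assumption)
    IMAGE, "bash",
], check=True)
</code>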


Focus on the RTX2080 models... (a quick GPUs-per-rack-unit sketch follows the vendor list below)

  * Vendor A:
    * Option 1: 48 gpus, 12 nodes, 24U, each: two 4116 12-core cpus (silver), 96 gb ram, 1tb SSD, four rtx2080 gpus (8gb),  centos7 yes, cuda yes, 3 yr, nics?, wxdxh"?
    * Option 2: 40 gpus, 10 nodes, 20U, each: two 4116 12-core cpus (silver), 96 gb ram, 1tb SSD, four rtx2080ti gpus (11gb),  centos7 yes, cuda yes, 3 yr, nics?, wxdxh"?
    * A1+A2 installed, configured and tested with the NGC Docker containers Deep Learning Software Stack: NVIDIA DIGITS, TensorFlow, Caffe, NVIDIA CUDA, PyTorch, RapidsAI, Portainer ... the NGC Catalog can be found at https://ngc.nvidia.com/catalog/all?orderBy=modifiedDESC&query=&quickFilter=all&filters=

  * Vendor B:
    * Option 1: 36 gpus, 9 nodes, 18U, each: two 4214 12-core cpus (silver), 96 gb ram, 2x960gb SATA, four rtx2080tifsta gpus (11gb),  centos7 no, cuda no, 3 yr, 2xgbe nics, wxdxh"?

  * Vendor C:
    * Option 1: 40 gpus, 10 nodes, 40U, each: two 4214 12-core cpus (silver), 96 gb ram, 240 gb SSD, four rtx2080ti gpus (11gb),  centos7 yes, cuda yes, 3 yr, 2xgbe nics, 18.2x26.5x7"
    * Option 2: 48 gpus, 12 nodes, 48U, each: two 4214 12-core cpus (silver), 96 gb ram, 240 gb SSD, four rtx2080s gpus (8gb),  centos7 yes, cuda yes, 3 yr, 2xgbe nics, 18.2x26.5x7"

  * Vendor D:
    * Option 1: 48 gpus, 12 nodes, 12U, each: two 4214 12-core cpus (silver), 64 gb ram, 2x480gb SATA, four rtx2080s gpus (8gb),  centos7 yes, cuda yes, 3 yr, 2xgbe nics, 17.2x35.2x1.7"
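
A quick way to line these quotes up is GPUs per node and GPUs per rack unit, computed straight from the numbers listed above:

<code python>
# GPU density per quote: (gpus, nodes, rack units) as listed above.
quotes = {
    "A1 rtx2080":   (48, 12, 24),
    "A2 rtx2080ti": (40, 10, 20),
    "B1 rtx2080ti": (36,  9, 18),
    "C1 rtx2080ti": (40, 10, 40),
    "C2 rtx2080s":  (48, 12, 48),
    "D1 rtx2080s":  (48, 12, 12),
}

for name, (gpus, nodes, units) in quotes.items():
    print(f"{name:14s} {gpus // nodes} gpu/node  {gpus / units:.1f} gpu/U")
</code>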
  
Ok, we will try this year. Here are some informational pages.
|  Gpus  |  48  |  16  |  36  |  28  |  20  |  34  |  26  |  16  |  28  |  60  | total|
|  Cores  |  209  |  74  |  157  |  72  |  92  |  75  |  67  |  74  |  72  |  138  | cuda K|
|  Cores  |  26  |  9  |  20  |  8.9  |  11.5  |  10  |  8  |  9  |  9  |  17  | tensor K|
|  Tflops  |  21  |  13  |  16  |  7  |  10  |  7.5  |  6.5  |  13  |  7  |  13  | gpu dpfp|
|  Tflops  |  682  |  261  |  511  |  227  |  326  |  241  |  211  |  261  |  227  |  426  | gpu spfp|