**[[cluster:0|Back]]**

==== High Core Count - Low Memory Footprint ====
  
I polled some folks with the problem described below to find a solution. Then ...

We're on the cusp of a new era!
  
Other solutions besides the one described below:

  * Amax 4U/288 cores [[http://www.amax.com/hpc/product.asp?value=High%20Density%20/%20Performance]]
  * Microway 2U/144 cores [[http://www.microway.com/products/hpc-clusters/high-performance-computing-with-intel-xeon-hpc-clusters/]]

==== Ideas ====
  
"We can definitely quote rackmounted Atom servers in fairly dense configurations. One example of what we could quote would be: within each 3U enclosure, 12x sleds, each with TWO C2750 Atom systems on it. So per 3U box: 24x C2750 Atom systems, each can have 2x 2.5" HDD, up to 64GB memory, 2x 10/100/1000 NIC, VGA port"

That's a 4-core chip (as quoted), so 96 cores/3U. That could soon double with an 8-core chip.

  * Intel calls this design "microservers": from tower, to rack, to blade, to microservers.
  * [[http://newsroom.intel.com/community/intel_newsroom/blog/2013/09/04/intel-unveils-new-technologies-for-efficient-cloud-datacenters|Intel Unveils New Technologies for Efficient Cloud Datacenters]]
  
So I went looking at my favorite vendor's hardware platform and found:

{{:cluster:microbade.jpg?200|}}

[[http://www.supermicro.com/products/MicroBlade/|MicroBlade!]] 896 cores in 6U. Ok then.

  * 28 blades, 112 nodes (4 nodes per blade), each node with
    * 1x Atom C2750 8-core 2.4 GHz chip
    * up to 32 GB RAM (4 GB per core, way above what's needed)
    * 1x 2.5" disk
  * Virtual Media Over LAN (virtual USB floppy/CD and drive redirection)
  * Do these PXE boot? How do we get an OS onto the drives?

  * Other thoughts
    * With that many nodes, /home would probably not be mounted
    * So users would probably have to stage job data in /localscratch/JOBPID
    * ... via scp from a target host
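The staging idea above can be sketched as a shell snippet. The stage host, the paths, and the ''JOBPID'' stand-in are assumptions for illustration, not a real site setup; the placeholder echo/tr lines stand in for the scp transfers shown in the comments.

```shell
#!/bin/bash
# Sketch of staging a job through node-local scratch when /home is not
# mounted. All paths and the stage host are hypothetical; a real job
# would use /localscratch/$JOBPID and scp to/from the target host.
JOBPID=$$                                   # stand-in for the scheduler's job id
SCRATCH=${TMPDIR:-/tmp}/localscratch-demo/$JOBPID
mkdir -p "$SCRATCH" && cd "$SCRATCH"

# Stage in -- on a compute node this would be:
#   scp user@stagehost:~/myjob/input.dat .
echo "sample data" > input.dat

# Run the job against the locally staged data.
tr 'a-z' 'A-Z' < input.dat > output.dat

# Stage out -- on a compute node this would be:
#   scp output.dat user@stagehost:~/myjob/
cat output.dat

# Clean up the node-local scratch area.
cd / && rm -rf "$SCRATCH"
```

The point of the sketch is the lifecycle: create scratch keyed on the job id, pull data in, compute locally, push results out, clean up.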
==== Slurm ====

And then we need something that can handle tens of thousands of jobs if we acquire such a dense-core platform.

Enter [[https://computing.llnl.gov/linux/slurm/|Slurm]], which according to its web site "can sustain a throughput rate of over 120,000 jobs per hour".

Now we're talking.

Notes on Slurm are at [[cluster:134|High Core Count - Low Memory Footprint]]
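
A minimal batch script for such single-core, small-memory jobs might look like the sketch below. The ''#SBATCH'' directives are real Slurm options, but the resource values and scratch location are illustrative assumptions, not a tested site configuration.

```shell
#!/bin/bash
# Hypothetical single-core Slurm job with a small memory footprint.
# Resource values and scratch location are illustrative assumptions.
#SBATCH --job-name=tinyjob
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=01:00:00

# Fall back to the shell PID so the body also runs outside Slurm.
JOBID=${SLURM_JOB_ID:-$$}
SCRATCH=${TMPDIR:-/tmp}/scratch-$JOBID   # would be /localscratch/$JOBID on a node
mkdir -p "$SCRATCH"
echo "job $JOBID done" > "$SCRATCH/result.txt"
cat "$SCRATCH/result.txt"
```

Thousands of such jobs could then be submitted with ''sbatch'', or batched as a job array with ''sbatch --array''.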
  
==== Problem ====