HPCC Expansion Summer 2015
We need to address the problem of tens of thousands of small serial jobs swarming across our larger servers. In doing so, these jobs tie up large chunks of memory they do not use and interfere with the scheduling of large parallel jobs (small serial jobs satisfy job prerequisites easily).
The idea is to assess what we could buy in terms of high core-density hardware (maximum CPU cores per U of rack space) with small memory footprints (defined as 1 GB per physical core or less). Nodes can have tiny local disks for the OS and local scratch (say 16-120 GB). /home may not be mounted on these systems, so input and output files need to be managed by the jobs themselves and copied back and forth using scp. The scheduler will be SLURM. The OS will be the latest CentOS 6.x release.
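Because /home may not be mounted on these nodes, each job would have to stage its own files. The following is a minimal sketch of how a per-job SLURM batch script could be generated, assuming passwordless SSH keys are in place for scp. The file server name, directories, program name and the /tmp scratch location are hypothetical placeholders, not values from this cluster.

```python
def render_job_script(name, file_server, remote_dir, input_file, output_file, command):
    """Return the text of a SLURM batch script that stages files with scp.

    All arguments are illustrative; nothing here is taken from the real cluster.
    """
    # Per-job scratch directory on the node's local disk (location is an assumption).
    scratch = f"/tmp/{name}.$SLURM_JOB_ID"
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={name}",
        "#SBATCH --ntasks=1",            # one small serial task
        "#SBATCH --mem-per-cpu=1024",    # megabytes, i.e. 1 GB per core
        "set -e",                        # stop at the first failing step
        f"mkdir -p {scratch} && cd {scratch}",
        f"scp {file_server}:{remote_dir}/{input_file} .",   # stage in
        command,                                            # run the serial job
        f"scp {output_file} {file_server}:{remote_dir}/",   # stage out
        f"cd / && rm -rf {scratch}",                        # clean local scratch
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_job_script(
        name="demo",
        file_server="fileserver",        # hypothetical host exporting /home
        remote_dir="/home/user/run1",    # hypothetical directory
        input_file="in.dat",
        output_file="out.dat",
        command="./prog in.dat > out.dat",
    ))
```

The generated text can be written to a file and submitted with sbatch. Note that with `set -e` a failed run leaves the scratch directory behind, which may or may not be desirable on nodes with very small local disks.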
Some testing results can be found here:
The "expansion:" lines below estimate the number of nodes as nr_nodes = int(expansion_budget/node_cost).
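The node-count estimate can be sketched as follows. The budget and per-node cost used in the example are made-up figures for illustration only (the actual quotes are not listed on this page), and `over_budget_percent` is a hypothetical helper for the cases where rounding up to one more unit is being considered.

```python
def nodes_within_budget(expansion_budget, node_cost):
    """Largest whole number of nodes that fits in the budget."""
    if node_cost <= 0:
        raise ValueError("node_cost must be positive")
    return int(expansion_budget // node_cost)


def over_budget_percent(expansion_budget, node_cost, nr_nodes):
    """How far (in percent of the budget) a given node count exceeds the budget.

    A negative result means the purchase stays under budget.
    """
    total = nr_nodes * node_cost
    return 100.0 * (total - expansion_budget) / expansion_budget


if __name__ == "__main__":
    budget = 50_000     # hypothetical budget
    cost = 4_200        # hypothetical cost per node
    n = nodes_within_budget(budget, cost)
    print(n, "nodes fit;", n + 1, "nodes would be",
          round(over_budget_percent(budget, cost, n + 1), 1), "% over budget")
```

For enclosure-based options the same calculation applies with the enclosure as the purchasable unit, multiplied by the number of nodes per enclosure afterwards.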
ExxactCorp
Node option A: Quantum IXR110-512N E5-2600 v2 family
1U Server, Intel Dual socket R (LGA 2011)
Dual port Gigabit Ethernet
350W High efficiency 1x
Intel® Xeon® processor E5-2620v2, 6C, 2.10 GHz 15M 2x (total 12 cores)
8GB 240-Pin DDR3 1866 MHz ECC/Registered Server Memory 2x (total 16 GB RAM)
120GB 2.5" SATA III Internal Solid State Drive (SSD), OS Drive and Scratch Drive 2x
CentOS 6 Installation
3-Year Warranty on Parts and Labor with Perpetual Email and Telephone Support
12 cores/U, 1.3 GB RAM/core
expansion: 26 nodes, 26U, 312 cores
Advanced Clustering
Node option: Pinnacle 1FX3601
2U Server Enclosure, 4 nodes per enclosure 4x, each node with:
Dual port Gigabit Ethernet
500W High efficiency 1x
Intel® Xeon® processor E5-2630v3, 8C, 2.40 GHz 20M 2x (total 16 cores)
4GB DDR4 2133 MHz ECC/Registered Server Memory 8x (total 32 GB RAM)
128GB SATA Solid State Drive (SSD) 1x
CentOS 6 Installation
3-Year Warranty on Parts and Labor with Perpetual Email and Telephone Support
64 cores/2U, 2.0 GB RAM/core
expansion: 16 nodes, 8U, 256 cores
Microway
Node option: Twin NumberSmasher4x
2U Server Enclosure, 4 nodes per enclosure 3x, each node with:
Dual port Gigabit Ethernet
1620W High efficiency 1x
Intel® Xeon® processor E5-2650v3, 10C, 2.30 GHz 25M 2x (total 20 cores)
4GB DDR4 2133 MHz ECC/Registered Server Memory 8x (total 32 GB RAM)
16GB SATADom (SSD) 1x ←– important, and this is new
CentOS 6.x, MCMS (provisioning), OpenMPI, SLURM (scheduler) Installation ←– rapid deployment
3-Year Warranty on Parts and Labor with Perpetual Email and Telephone Support
80 cores/2U, 1.6 GB RAM/core
expansion: 12 nodes, 6U, 240 cores (3% over budget)
AMAX
Customer shall be responsible for shipping charges (this is normal) and shall have its own insurance to cover risk during transit (this is definitely not!).
Node option A: ServMax E5-2600 v2 family
1U Server, Intel Dual socket (LGA 2011)
Dual port Gigabit Ethernet?
95W High efficiency, redundant power supplies
Intel® Xeon® processor E5-2660v2, 10C, 2.2 GHz 2x (total 20 cores)
4GB DDR3 1866 MHz E/R Memory 8x (total 32 GB RAM)
500GB 3.5" SATA Hard Drive 1x
CentOS 6 Installation
1-Year Warranty on Parts and Labor ←– grh
20 cores/U, 1.6 GB RAM/core
expansion: 10 nodes, 10U, 200 cores
Node option B: ServMax E5-2600 v3 family
1U Server, Intel Dual socket (LGA 2011)
Dual port Gigabit Ethernet?
120W High efficiency, redundant power supplies
Intel® Xeon® processor E5-2600v3, 14C, 2.0 GHz 2x (total 28 cores)
4GB DDR4 2133 MHz E/R Memory 8x (total 32 GB RAM)
500GB 3.5" SATA Hard Drive 1x
CentOS 6 Installation
1-Year Warranty on Parts and Labor ←– grh
28 cores/U, 1.1 GB RAM/core
expansion: 8 nodes, 8U, 224 cores
Node option C: ServMax E5-2600 v2 family
2U Server Enclosure, 4 nodes per enclosure 4x, totals per enclosure:
Dual port Gigabit Ethernet?
95W High efficiency, redundant power supplies
Intel® Xeon® processor E5-2660v2, 10C, 2.2 GHz 8x (total 80 cores)
4GB DDR3 1866 MHz E/R Memory 32x (total 128 GB RAM)
500GB SATA Hard Drive 4x
CentOS 6 Installation
1-Year Warranty on Parts and Labor ←– grh
80 cores/2U, 1.6 GB RAM/core
expansion: 2 enclosures (8 nodes), 4U, 160 cores
expansion: 3 enclosures (12 nodes), 6U, 240 cores ←– 10% over budget
Node option D: ServMax E5-2600 v3 family
2U Server Enclosure, 4 nodes per enclosure 4x, totals per enclosure:
Dual port Gigabit Ethernet?
120W High efficiency, redundant power supplies
Intel® Xeon® processor E5-2600v3, 14C, 2.0 GHz 8x (total 112 cores)
4GB DDR4 2133 MHz E/R Memory 32x (total 128 GB RAM)
500GB SATA Hard Drive 4x
CentOS 6 Installation
1-Year Warranty on Parts and Labor ←– grh
112 cores/2U, 1.1 GB RAM/core
expansion: 2 enclosures (8 nodes), 4U, 224 cores
CDW
Node option: Enterprise MicroBlade (up to 28 MicroBlades)
6U Enclosure:
One port Gigabit Ethernet, internal 1×2.5 Gbps Ethernet module
1600W High efficiency 200-240V
12 MicroBlades, each with 1 independent node, each with:
Intel Haswell-EP R3 E5-2650, 10C, 2.30 GHz 2x (total 20 cores/blade)
8GB DDR3 1600 MHz ECC/Unbuffered 4x (total 32 GB RAM/blade)
32GB SATA DOM 1x ←– new, important
No software ←– lots of work
MicroBlade Chassis Management Module (virtual media? - usb/cdrom) ←– without KVM ???
?-Year Warranty on Parts and Labor ←– grhhh
240 cores/6U, 1.6 GB RAM/core
expansion: 12 nodes, 6U, 240 cores
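To compare the options against the stated goal (maximum cores per U, about 1 GB of RAM per core or less), the per-chassis figures quoted above can be put side by side. This is only a sketch: the `Option` class and `rank_by_density` function are hypothetical helpers, the core, RAM and rack-unit numbers are copied from the quotes on this page (per node for 1U servers, per enclosure otherwise), and price is deliberately left out because it is not listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    cores: int        # physical cores per chassis (node or enclosure)
    ram_gb: int       # total RAM per chassis, in GB
    rack_units: int   # chassis height in U

    @property
    def cores_per_u(self):
        return self.cores / self.rack_units

    @property
    def gb_per_core(self):
        return self.ram_gb / self.cores


# Figures taken from the vendor lists on this page.
OPTIONS = [
    Option("ExxactCorp A", 12, 16, 1),
    Option("Advanced Clustering", 64, 128, 2),
    Option("Microway", 80, 128, 2),
    Option("AMAX A", 20, 32, 1),
    Option("AMAX B", 28, 32, 1),
    Option("AMAX C", 80, 128, 2),
    Option("AMAX D", 112, 128, 2),
    Option("CDW MicroBlade", 240, 384, 6),
]


def rank_by_density(options):
    """Sort by cores per U (highest first), then by RAM per core (lowest first)."""
    return sorted(options, key=lambda o: (-o.cores_per_u, o.gb_per_core))


if __name__ == "__main__":
    for o in rank_by_density(OPTIONS):
        print(f"{o.name:22s} {o.cores_per_u:5.1f} cores/U  {o.gb_per_core:4.2f} GB/core")
```

By these numbers none of the quoted configurations is at or below 1 GB per core; the 14-core AMAX options come closest at roughly 1.1 GB per core.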