
High Core Count - Low Memory Footprint

I polled some folks with the problem described below to find a solution. Then …

http://www.nytimes.com/2014/08/08/science/new-computer-chip-is-designed-to-work-like-the-brain.html

We're on the cusp of a new era!

Other solutions than the one described below

Ideas

One idea I received was to look at the Intel Atom line of chips. From Andrew: “We can definitely quote rackmounted Atom servers in fairly dense configurations. One example of what we could quote would be: within each 3U enclosure, 12x sleds, each with TWO C2750 Atom systems on it. So per 3U box: 24x C2750 Atom systems, each can have 2x 2.5” HDD, up to 64GB memory, 2x 10/100/1000 NIC, VGA port.”

That's a 4-core chip (as quoted), so 24 systems x 4 cores = 96 cores per 3U. That could double soon with an 8-core chip.
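The core count above is simple multiplication, and a small helper makes it easy to redo when a quote changes. This is only a sketch using the figures from Andrew's quote (12 sleds, 2 systems per sled, 3U); the function names are made up for illustration.

```python
def cores_per_enclosure(sleds, systems_per_sled, cores_per_system):
    """Total cores in one enclosure."""
    return sleds * systems_per_sled * cores_per_system


def cores_per_rack_unit(total_cores, rack_units):
    """Density expressed as cores per U of rack space."""
    return total_cores / rack_units


# Figures from the quote: 12 sleds x 2 systems, 4 cores each, in 3U.
quoted = cores_per_enclosure(sleds=12, systems_per_sled=2, cores_per_system=4)
# Same enclosure if each system carried an 8-core chip instead.
doubled = cores_per_enclosure(sleds=12, systems_per_sled=2, cores_per_system=8)

print(quoted, cores_per_rack_unit(quoted, 3))    # 96 32.0
print(doubled, cores_per_rack_unit(doubled, 3))  # 192 64.0
```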

So I went looking at my favorite vendor's hardware platform and found:

MicroBlade! 896 cores in 6U. Ok then.

  • 28 blades, 112 nodes, 4 nodes per blade, each node with
    • 1x Atom C2750 8-core 2.4 GHz chip
    • up to 32 GB RAM (4 GB per core, way above what's needed)
    • 1x 2.5” disk
  • Virtual Media Over LAN (Virtual USB Floppy / CD and Drive Redirection)
  • Do these PXE boot? How do we get an OS onto the drives?
  • Other thoughts
    • With that many nodes, /home would probably not be mounted
    • So users would have to stage job data in /localscratch/JOBPID probably
    • … via scp from a target host
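The staging idea in the last bullets could look roughly like the sketch below. It assumes a per-job directory under /localscratch named after the job ID and a plain `scp -r` pull; the host name `stagehost`, the source path, and both function names are hypothetical placeholders, not anything we have set up.

```python
import os
import subprocess


def scratch_dir(job_id, scratch_root="/localscratch"):
    """Per-job staging directory, e.g. /localscratch/12345."""
    return os.path.join(scratch_root, str(job_id))


def stage_in(job_id, remote_source, scratch_root="/localscratch", dry_run=False):
    """Create the job's scratch directory and copy input data into it via scp.

    remote_source is an scp-style location such as "stagehost:/data/input"
    (hypothetical). Returns the scp command so it can be logged or inspected.
    """
    target = scratch_dir(job_id, scratch_root)
    os.makedirs(target, exist_ok=True)
    cmd = ["scp", "-r", "-q", remote_source, target]
    if not dry_run:
        # check=True raises CalledProcessError if the copy fails.
        subprocess.run(cmd, check=True)
    return cmd
```

A job wrapper would call `stage_in(job_id, "stagehost:/data/input")` before the real work starts and copy results back the same way afterwards. With `dry_run=True` the directory is created but nothing is copied, which is handy for checking paths.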

Slurm

And then we need something that can handle tens of thousands of jobs if we acquire such a dense core platform.

Enter Slurm, which according to their web site, “can sustain a throughput rate of over 120,000 jobs per hour”.

Now we're talking.
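To get a feel for what that rate means for us, here is a rough sketch: one helper turns the quoted 120,000 jobs per hour into wall time for a given job count, and another writes a Slurm job array script for many small single-core tasks. The job name, the `./cpu_job` command, and the 256 MB default are assumptions for illustration; the `#SBATCH` options used (`--array`, `--ntasks`, `--cpus-per-task`, `--mem-per-cpu`, `--output`) are standard sbatch options.

```python
QUOTED_JOBS_PER_HOUR = 120_000  # throughput figure quoted on the Slurm web site


def hours_at_quoted_rate(n_jobs, jobs_per_hour=QUOTED_JOBS_PER_HOUR):
    """Scheduler time needed to push n_jobs through at the quoted rate."""
    return n_jobs / jobs_per_hour


def render_array_script(n_tasks, mem_mb=256, command="./cpu_job"):
    """Return the text of an sbatch script running n_tasks single-core tasks."""
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=tiny-cpu",
        f"#SBATCH --array=0-{n_tasks - 1}",
        "#SBATCH --ntasks=1",
        "#SBATCH --cpus-per-task=1",
        f"#SBATCH --mem-per-cpu={mem_mb}M",
        "#SBATCH --output=tiny-cpu_%A_%a.out",
        "",
        # Each array task receives its own index in SLURM_ARRAY_TASK_ID.
        f'{command} "$SLURM_ARRAY_TASK_ID"',
    ]
    return "\n".join(lines) + "\n"


print(hours_at_quoted_rate(60_000))  # 0.5
print(render_array_script(1000))
```

The generated text would be saved to a file and submitted with `sbatch`. Note that a site's Slurm configuration can cap how large a single array may be, so tens of thousands of tasks may need to be split across several arrays.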

Notes on Slurm are on the page linked as “High Core Count - Low Memory Footprint”.

Problem

I've been asked to investigate the following: Our HPCC has slowly grown towards bigger servers with more memory and more cores per node. In the future we will have several faculty who will run very CPU-intensive jobs, tens of thousands of them, but with small memory footprints (I'm told on the order of 64-128 MB, so I'm assuming 192-256 MB per single core/node). So I need either a huge number of cores with little memory or lots of tiny, low-core-count blade servers.

I'm going to investigate LXC Linux containers (https://linuxcontainers.org/) but I'm wary of performance when generating tiny VMs that will mostly be CPU bound. Any ideas on hardware/software solutions would be greatly appreciated. There is no budget yet so I'm unsure how large this project will be. Thanks,
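The memory figures in the problem statement can be checked against a candidate node with a few lines of arithmetic. The sketch below assumes the upper estimate of 256 MB per job and, as an example node, the 8-core, 32 GB MicroBlade node listed earlier on this page; the function names are made up for illustration.

```python
def jobs_per_node(node_ram_mb, cores, mem_per_job_mb):
    """Concurrent single-core jobs a node can hold: the lesser of cores and RAM slots."""
    by_memory = node_ram_mb // mem_per_job_mb
    return min(cores, by_memory)


def ram_utilization(node_ram_mb, running_jobs, mem_per_job_mb):
    """Fraction of node RAM in use when running_jobs jobs are active."""
    return running_jobs * mem_per_job_mb / node_ram_mb


node_ram_mb = 32 * 1024   # 32 GB node
cores = 8                 # one 8-core chip per node
mem_per_job_mb = 256      # upper end of the per-core assumption

running = jobs_per_node(node_ram_mb, cores, mem_per_job_mb)
util = ram_utilization(node_ram_mb, running, mem_per_job_mb)
print(running, f"{util:.2%}")  # 8 6.25%
```

On such a node the core count is the limit, not memory: eight jobs fill every core while using only a small fraction of the RAM, which is why core density matters more than memory for this workload.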



cluster/133.txt · Last modified: 2015/03/18 18:26 by hmeij