cluster:149 [2016/08/26 11:33] hmeij07
cluster:149 [2016/10/27 14:50] hmeij07 [Loose Ends]
==== The Storage Problem ====
In a commodity HPC setup deploying plain NFS, bottlenecks can develop.
This group came knocking at my door stating I really needed their product. Before I agreed to meet them, I asked for a budget quote (which I have not received yet), anticipating a monetary value in our budget stratosphere.
This solution is a hybrid model of Panasas: an all-flash Blade shelf with internal 40G Ethernet connectivity. It holds up to 15 blades with a minimal population of 7. In Panasas terms, each blade is both a Director and a Storage blade. All blades are 100% flash, meaning no slow disks. Blades hold 8T SSDs, so 56T raw with 7 blades; their web site includes deduplication performance estimates (!), so make your own guess at what's usable.
There are 8x 40G Ethernet ports on the back. I have no idea if they can step down on the shelf itself or if another 10G Ethernet switch is needed. I also do not know if that number decreases in a half-populated shelf.
The left red SAS ports are for shelf connects, and 2 of 4 UTA SFP+ ports are for Cluster Interconnect communication. That leaves 2 of 4 SFP+ ports that we can step down from 40G to 10G via two X6558-R6 cables connecting SFP+ to SFP+ compatible ports.

Suggestion: ports e0a/e0b, the green RJ45 ports to the right, connect to our core switches (public and private) to move content from and to the FAS2554 (to the research labs, for example). Then we do it again for the second controller.
So we'd have, let's call the whole thing "hpcfiler", the first controller configured like this:

  * 192.168.102.200/…
  * 10.10.102.200/…
  * 129.133.22.200/255.255.255.128 on remote management for hpcfiler01-eth2
  * 129.133.52.200/255.255.252.0 on e0a for hpcfiler01-eth3.wesleyan.edu (wesleyan public)
  * 10.10.52.200/255.255.0.0 on e0b for hpcfiler01-eth4.wesleyan.local (this is a different 10.10 subnet, wesleyan private)
Then configure the //second// controller.

Q: Can we bond hpcfiler01-eth3.wesleyan.edu and hpcfiler02-eth3.wesleyan.edu together to their core switches? (Same question for the eth4 wesleyan.local interfaces.)

A: No, you cannot bond across controllers.

Awaiting word from the engineers on whether I got all this right.
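If the engineers confirm the layout, the compute nodes would simply NFS-mount the filer over the private network. A minimal ///etc/fstab// sketch; the export paths (///vol/home//, ///vol/data//) are hypothetical until the controllers are actually configured:

```
# /etc/fstab on a compute node -- export names are assumptions
hpcfiler01-eth4.wesleyan.local:/vol/home  /home  nfs  rw,hard,intr,rsize=65536,wsize=65536  0 0
hpcfiler02-eth4.wesleyan.local:/vol/data  /data  nfs  rw,hard,intr,rsize=65536,wsize=65536  0 0
```

Each controller serves its own exports, which is also why bonding across controllers (the question above) does not apply here.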
===== Supermicro =====
Then the final solution is to continue doing what we're doing by adding Supermicro integrated storage servers. The idea is to separate snapshotting from file serving.

A 2U Supermicro with a single E5-1620v4 3.5 GHz chip (32G memory) and 2x80G disks (raid 1) can support 12x4T disks (raid 6), making roughly 40T usable. We'd connect this over Infiniband to file server sharptail and make it a snapshot host, taking over that function from sharptail. We could then merge /home and /snapshots for a larger /home (25T) on sharptail.
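As a sketch of what the snapshot host could run nightly: a classic rsync hard-link rotation. The paths and retention count are assumptions, not the actual sharptail layout.

```shell
#!/bin/bash
# Rotating hard-link snapshots -- the function the Supermicro host
# would take over from sharptail. Paths/retention are hypothetical.
SRC="/home/"                 # or sharptail:/home/ pulled over IPoIB
DEST="/snapshots/home"       # on the 40T raid 6 set
KEEP=7                       # keep one week of dailies

# rotate: drop the oldest, shift the rest up by one
rm -rf "$DEST/snap.$((KEEP-1))"
for i in $(seq $((KEEP-2)) -1 0); do
  [ -d "$DEST/snap.$i" ] && mv "$DEST/snap.$i" "$DEST/snap.$((i+1))"
done

# snap.0 hard-links files unchanged since snap.1, so only deltas
# consume space (on the first run the --link-dest warning is harmless)
mkdir -p "$DEST"
rsync -a --delete --link-dest="$DEST/snap.1" "$SRC" "$DEST/snap.0/"
```

Unchanged files share inodes across the snap.N directories, so seven dailies cost little more than one full copy plus the churn.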

It's worth noting that 5 of these integrated storage servers fit the price tag of a single Netapp FAS2554 (the 51T version). So you could buy 5 and split /home into home1 through home5: 200T, and everybody can get as much disk space as needed. Distribute your heavy users across the 5. Mount everything up via IPoIB and snapshot round robin, as in, server home2 snapshots home1, etc.
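The round robin pairing (each homeN snapshots its predecessor, wrapping around at the ends) can be generated mechanically; a small sketch:

```shell
#!/bin/bash
# Print the round-robin snapshot pairing for home1..home5:
# home2 snapshots home1, ..., and home1 wraps around to home5.
n=5
for i in $(seq 1 $n); do
  prev=$(( (i + n - 2) % n + 1 ))   # predecessor with wrap-around
  echo "home$i snapshots home$prev"
done
```

Adding a sixth server later only means bumping n; no server ever snapshots itself.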

Elegant, simple, and you can start small and scale up. We have room for 2 on the QDR Mellanox switch (and 2 to 5 on the DDR Voltaire switch). Buying another 18 port QDR Mellanox switch adds $7K. IPoIB would be desired if we stay with Supermicro.
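IPoIB on these hosts is just an interface definition. A minimal RHEL-style sketch, assuming ifcfg-based networking; the address is hypothetical:

```
# /etc/sysconfig/network-scripts/ifcfg-ib0 -- address is an assumption
DEVICE=ib0
TYPE=InfiniBand
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.10.102.50
NETMASK=255.255.0.0
CONNECTED_MODE=yes
```

Connected mode allows the larger MTU IPoIB can use, which matters for NFS-sized transfers.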

What's even more desirable is to start our own parallel file system with BeeGFS.

**Short term plan**

  * Grab the 32x2T flexstorage hard drives and insert them into cottontail's disk arrays
  * Makes for a 60T raw raid 6 storage place (2 hot spares)
  * Move the sharptail /snapshots to it (removes that traffic from the file server)
  * Dedicate greentail's disk arrays to /sanscratch
  * Remove /…
  * Extend /sanscratch from 27T to 37T
  * Dedicate sharptail's disk arrays to /home
  * Keep the old 5T /sanscratch as backup, idle
  * Remove the 15T /snapshots
  * Extend /home from 10T to 25T
  * Keep the 7T /archives until those users graduate, then move to Rstore
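The extend steps depend on how the volumes are built. If they sit on LVM with XFS (an assumption; the device and volume group names here are hypothetical), growing /sanscratch by 10T would look roughly like:

```
# add the new raid set to the volume group, then grow LV + filesystem
pvcreate /dev/sdX                            # new raid device (hypothetical)
vgextend vg_scratch /dev/sdX
lvextend -L +10T /dev/vg_scratch/sanscratch
xfs_growfs /sanscratch                       # XFS grows online, while mounted
```

The /home extension on sharptail would follow the same pattern once /snapshots is removed.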

**Long term plan**

  * Start a BeeGFS storage cluster
  * cottontail as MS (management server)
  * sharptail as AdMon (monitoring server) and proof of concept storage OSS
  * pilot storage on the idle old 5T /sanscratch
  * also a folder on cottontail:/…
  * n38-n45 (8) as MDS (metadata servers, 15K local disk, no raid)
  * Buy 2x 2U Supermicro for OSS (object storage servers for a total of 80T usable, raid 6, $14K)
  * Serve up the BeeGFS file system using IPoIB
  * Move /home to it
  * Backup to older disk arrays
  * Expand as necessary
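Mapping those roles onto hosts uses the beegfs-setup-* helpers shipped with the BeeGFS packages; a rough sketch under the node assignments above (the storage paths and numeric ids are assumptions):

```
# on cottontail (MS)
/opt/beegfs/sbin/beegfs-setup-mgmtd -p /data/beegfs/mgmtd

# on each of n38-n45 (MDS; -s is a unique numeric node id per server)
/opt/beegfs/sbin/beegfs-setup-meta -p /data/beegfs/meta -s 1 -m cottontail

# on each Supermicro OSS (-i is a unique storage target id)
/opt/beegfs/sbin/beegfs-setup-storage -p /data/beegfs/storage -s 1 -i 101 -m cottontail

# on clients, point beegfs-client at the management server
/opt/beegfs/sbin/beegfs-setup-client -m cottontail
```

With connInterfaces restricted to ib0, all of this traffic would ride the IPoIB fabric as planned.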

===== Loose Ends =====
In all the above, we still need a $500 HP refurbished Proliant for:

  * backup Openlava scheduler for cottontail.wesleyan.edu
  * backup replacement for greentail.wesleyan.edu (in case it fails)
Bought.
--- //
Warewulf golden image it as if it is greentail.