**[[cluster:
===== beeGFS =====

A document ...

Basically during the Summer of 2016 I investigated whether the HPCC could afford enterprise-level storage. I wanted 99.999% uptime, snapshots, high availability, and other goodies such as parallel NFS. NetApp came the closest but, eh, at $42K lots of other options show up. The story is detailed at [[cluster:149|The Storage Problem]].

This page is best read from the bottom up.

==== cluster idea ====

  * Storage servers: buy 2 now (4k+4k), then a 3rd in July (4k)?

  * move test users over on 2 nodes, test, only change is $HOME

  * Home cluster
    * cottontail (mngt+admingui)
    * 2-3 new units storage (+snapshots/...)
    * cottontail2 meta + n38-n45 meta, all mirrored

==== beegfs-admin-gui ====

  * ''...'' (see the sketch below)

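Assuming this refers to the Java-based admon GUI, a minimal sketch of starting it (the jar name and location are assumptions; use the GUI jar that matches the installed beegfs-admon version, and the beegfs-admon service needs to be running on the management host):

<code>
# hypothetical example: run the BeeGFS admon GUI jar downloaded from beegfs.com
java -jar beegfs-admon-gui.jar
</code>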
==== Mirror Data ====

** Before **

<code>
[root@cottontail2 ~]# beegfs-df

METADATA SERVERS:
TargetID ...
======== ...
      48 ...
      49 ...
     250 ...

STORAGE TARGETS:
TargetID ...
======== ...
     ...
     ...
</code>
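A quick way to double-check the targets and their state before and after mirroring is beegfs-ctl's target listing; a sketch, assuming a BeeGFS release where the ''--state'' output is available:

<code>
# list storage targets and the nodes they map to
beegfs-ctl --listtargets --nodetype=storage --longnodes

# reachability/consistency state per target (useful once buddy mirroring is active)
beegfs-ctl --listtargets --nodetype=storage --state
</code>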
** Before **

<code>

# define buddygroup - these are storage target IDs
[root@n7 ~]# beegfs-ctl --addmirrorgroup --primary=13601 --secondary=21701 --groupid=101
Mirror buddy group successfully set: groupID 101 -> target IDs 13601, 21701

[root@n7 ~]# beegfs-ctl --listmirrorgroups
     ...
     ...
     101 ...

# enable mirroring for data by directory - numTargets needs to be set to max nr of storage servers?
[root@n7 ~]# beegfs-ctl --setpattern --buddymirror ...
New chunksize: 524288
New number ...
Path: /...
Mount: /...

# copy some contents in (~hmeij is 10G)
[root@n7 ~]# rsync -vac --bwlimit ... /home/hmeij /...

</code>
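For reference, a full ''--setpattern'' call would look something like the sketch below; the directory name and the ''--numtargets'' value are assumptions, taken from the chunksize and desired-targets numbers that ''getentryinfo'' reports in the Tuning section further down.

<code>
# hypothetical example: enable buddy mirroring on one directory,
# 512k chunks and 4 desired storage targets
beegfs-ctl --setpattern --buddymirror --chunksize=512k --numtargets=4 /mnt/beegfs/hmeij-mirror
</code>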

** After **

<code>
...
</code>

==== Quota ====

  * [[http://...]]
  * setup XFS ...
  * enable beegfs quota on all clients
  * enforce quota
  * set quotas using a text file (a sketch follows below)
  * seems straightforward
  * do BEFORE populating XFS file systems

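As a minimal sketch of the steps above (the user name, limits and file path are made up, and it assumes quota enforcement has been switched on in the BeeGFS configs plus quota-enabled XFS mounts on the storage servers):

<code>
# set a per-user quota (hypothetical user and limits)
beegfs-ctl --setquota --uid hmeij --sizelimit=500G --inodelimit=1M

# or bulk-load limits for many users from a text file
beegfs-ctl --setquota --uid --file=/root/quota-limits.txt

# check usage against the limits
beegfs-ctl --getquota --uid hmeij
</code>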
==== Mirror Meta ====

Definitely want the meta content mirrored; that way I can use the n38-n45 nodes with local 15K disk, plus maybe cottontail2 (RAID 1 with hot and cold spares).

Content mirroring will require more disk space. Perhaps snapshots to another node are more useful; that also solves the backup issue.


<code>

# enable meta mirroring, directory based
[root@n7 ~]# beegfs-ctl --mirrormd /...
Mount: '/...
Operation succeeded.

# put some new content in
[root@n7 ~]# rsync -vac /home/hmeij/... /...

# lookup meta tag
[root@n7 ~]# beegfs-ctl --getentryinfo /...
Path: /hmeij-mirror/iozone-tests/current.tar
Mount: /...
EntryID: 3-581392E1-31

# find ...
[root@sharptail ~]# ssh n38 find /...
/...
  ^^^^^^
# and find ...
[root@sharptail ~]# ssh n39 find /...
/...

# seems to work
</code>
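Rather than hunting for the entry ID on each meta server by hand, ''beegfs-ctl'' can also print more placement detail directly; a sketch (the ''--verbose'' output varies by BeeGFS release):

<code>
# hypothetical example: show metadata owner plus chunk/dentry paths for one file
beegfs-ctl --getentryinfo --verbose /mnt/beegfs/hmeij-mirror/iozone-tests/current.tar
</code>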
Writing some initial content to both storage and meta servers: vanilla, out-of-the-box BeeGFS seems to balance the writes equally across both. Here are some stats.
==== /... ====

  * Source content 110G in XFS with ~100,000 files in ~2,000 dirs
    * /home/hmeij (mix of files, nothing large) plus
    * /...

  * File content spread across 2 storage servers
    * petaltail:/...
    * swallowtail:/...
    * 56G used in beegfs-storage per storage server
    * ~92,400 files per storage server
    * ~1,400 dirs per storage server

  * Meta content spread across 2 meta servers
    * 338MB per beegfs-meta server, so 0.006% space-wise for 2 servers
    * ~105,000 files per metadata server
    * ~35,000 dirs, spread almost evenly across "...

  * Clients (n7 and n8) see the full tree in /mnt/beegfs
    * 110G in /mnt/beegfs
    * ~100,000 files
    * ~2,000 dirs

Looks like:

  * NOTE: "failed to mount /mnt/beegfs" is the result of out-of-space storage servers.

<code>

# file content

[root@swallowtail ~]# ls -lR /...
/...:
total 672
-rw-rw-rw- 1 root root 289442 Jun 26 2015 D8-57E42E89-30
-rw-rw-rw- 1 root root 3854 Jun 26 2015 D9-57E42E89-30
-rw-rw-rw- 1 root root 16966 Jun 26 2015 DA-57E42E89-30
-rw-rw-rw- 1 root root 65779 Jun 26 2015 DB-57E42E89-30
-rw-rw-rw- 1 root root 20562 Jun 26 2015 DF-57E42E89-30
-rw-rw-rw- 1 root root 259271 Jun 26 2015 E0-57E42E89-30
-rw-rw-rw- 1 root root 372 Jun 26 2015 E1-57E42E89-30

[root@petaltail ~]# ls -lR /...
/...:
total 144
-rw-rw-rw- 1 root root 40 Jun 26 2015 DC-57E42E89-30
-rw-rw-rw- 1 root root 40948 Jun 26 2015 DD-57E42E89-30
-rw-rw-rw- 1 root root 100077 Jun 26 2015 DE-57E42E89-30

# meta content

[root@sharptail ~]# ssh n38 find /...
/...
/...

[root@sharptail ~]# ssh n39 find /...
(none, no mirror)

</code>
==== Tuning ====

  * global interfaces file, ib0 -> ... (see the sketch below)
    * connInterfacesFile = /...
    * set in /...

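A sketch of the interfaces-file idea above (the file location is the stock /etc/beegfs path and the interface names are assumptions for this cluster): one interface per line, in order of preference, then point each beegfs service config at it and restart the services.

<code>
# /etc/beegfs/connInterfacesFile -- assumed interface names, order = preference
ib0
eth0
</code>

The matching ''connInterfacesFile = /etc/beegfs/connInterfacesFile'' line then goes into beegfs-meta.conf, beegfs-storage.conf and beegfs-client.conf.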
  * backup beeGFS EA metadata, see faq
    * attempt a restore
    * or just snapshot ...

  * storage server tuning (see the sketch after the stripe/chunk output below)
    * set on cottontail on sdb, both values were 128 (seems to help -- late summer 2016)
      * echo 4096 > /...
      * echo 4096 > /...
    * set on cottontail, was 90112 + /...
      * echo 262144 > /...
    * do same on greentail? (done late fall 2016)
      * all original values same as cottontail (all files)
      * set on c1d1 thru c1d6
    * do same on sharptail?
      * no such values for sdb1
      * can only find min_free_kbytes, ...

  * stripe and chunk size

<code>

[root@n7 ~]# beegfs-ctl --getentryinfo /...
Path:
Mount: /mnt/beegfs
EntryID: root
Metadata node: n38 [ID: 48]
Stripe pattern details:
+ Type: RAID0
+ Chunksize: 512K
+ Number of storage targets: desired: 4

</code>
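The echo targets in the storage-server tuning bullets above are presumably the usual block-device and VM knobs from the BeeGFS tuning guide; a hedged sketch (the device name sdb and the values come from the notes above, the exact sysfs/procfs paths should be double-checked):

<code>
# per-device request queue depth and read-ahead (both were 128 by default)
echo 4096 > /sys/block/sdb/queue/nr_requests
echo 4096 > /sys/block/sdb/queue/read_ahead_kb

# keep more memory free for bursts of incoming data (was 90112)
echo 262144 > /proc/sys/vm/min_free_kbytes
</code>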
  * The cache type can be set in the client config file (/...
    * buffered is default, few 100k per file

  * tuneNumWorkers in all /... (a sketch follows below)
    * for meta, storage and clients ...

  * metadata server tuning
    * read in more detail

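For the tuneNumWorkers bullet above, a sketch of where the knob lives (the value 64 is only an example, not a recommendation, and defaults differ per service):

<code>
# worker thread counts sit in the per-service configs under /etc/beegfs
grep tuneNumWorkers /etc/beegfs/beegfs-meta.conf /etc/beegfs/beegfs-storage.conf /etc/beegfs/beegfs-client.conf

# example: raise the storage server worker count, then restart that service
sed -i 's/^tuneNumWorkers.*/tuneNumWorkers = 64/' /etc/beegfs/beegfs-storage.conf
service beegfs-storage restart
</code>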
==== Installation ====

  * made easy [[http://...]]
  * rpms pulled from repository via petaltail in ''...'' (a sketch follows below)
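What the rpm install boils down to, as a sketch (the repo file URL/release is an assumption, adjust to the BeeGFS version actually used; which packages land on which host follows the cluster layout above):

<code>
# fetch the BeeGFS yum repo file on each node (assumed URL/release)
wget -O /etc/yum.repos.d/beegfs.repo http://www.beegfs.com/release/beegfs_6/dists/beegfs-rhel6.repo

yum install beegfs-mgmtd                               # management host (cottontail)
yum install beegfs-meta                                # metadata servers (n38, n39, ...)
yum install beegfs-storage                             # storage servers (petaltail, swallowtail)
yum install beegfs-client beegfs-helperd beegfs-utils  # clients (n7, n8, ...)
</code>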

<code>

[root@cottontail ~]# ssh n7 beegfs-net

mgmt_nodes
=============
cottontail [ID: 1]
   ...

meta_nodes
=============
n38 [ID: 48]
   ...
n39 [ID: 49]
   ...

storage_nodes
=============
swallowtail [ID: 136]
   ...
petaltail [ID: 217]
   ...

</code>
\\
**[[cluster: