This shows you the differences between two versions of the page.
cluster:151 [2016/10/20 14:31] hmeij07 |
cluster:151 [2016/10/28 19:30] hmeij07 [Mirror Meta] |
||
---|---|---|---|
Line 2: | Line 2: | ||
**[[cluster: | **[[cluster: | ||
- | ===== SGI Altix 3000 ===== | + | ===== beeGFS |
- | The HPCC community has been offered an SGI Altix 3000 (purchased in 2006), basically a half rack on wheels (20U or so). The Altix has 4 IA-64 processors (family Itanium 2) which aren't particularly fast (1.3 GHz), but the Altix has 96 GBytes of memory and so is useful | + | A document |
- | * Details about IA-64 [[https:// | + | During the summer of 2016 I investigated whether the HPCC could afford enterprise-level storage. I wanted 99.999% uptime, snapshots, high availability and other goodies such as parallel NFS. NetApp came the closest but, eh, at $42K lots of other options show up. The story is detailed at [[cluster:149|The Storage Problem]] |
- | * Details about Altix [[https:// | + | |
- | It is running redhat AS2.1 (which definitely ages it) so basic Linux. The node has been configured to fit our environment and will provide | + | This page is best read from the bottom up. |
- | * /home from file server sharptail, over ethernet | + | ==== cluster idea ==== |
- | * /sanscratch from scratch server greentail, over ethernet | + | |
- | * Openlava 2.2 stand alone installation | + | |
- | * icc/ifort version 8.1 on local disk | + | |
- | * Gaussian version ?.?? on local disk | + | |
- | In order to use the local compilers you must " | + | idea: buy 2 now 4k+4k then 3rd in july 4k? |
+ | move test users over on 2 nodes, test, only change is $HOME | ||
+ | ctt (mngt+admingui), |
+ | make ctt2 master meta node? how? | ||
+ | |||
+ | ==== Mirror Meta ==== | ||
+ | |||
+ | Definitely want Meta content mirrored; that way you can use the n38-n45 nodes with local 15K disk, plus maybe cottontail2 (raid 1 with hot and cold spare). | ||
+ | |||
+ | Content mirroring will require more disk space. Perhaps snapshotting to another node is more useful, also solves backup issue. | ||
+ | - | ||
< | < | ||
- | . | + | |
- | . /opt/intel_fc_80.8.1-024/bin/ifortvars.sh | + | # enable |
+ | [root@n7 ~]# beegfs-ctl --mirrormd | ||
+ | Mount: '/mnt/beegfs'; | ||
+ | Operation succeeded. | ||
+ | |||
+ | # put some new content in | ||
+ | [root@n7 ~]# rsync -vac / | ||
+ | |||
+ | # lookup meta tag | ||
+ | [root@n7 ~]# beegfs-ctl --getentryinfo / | ||
+ | Path: /hmeij-mirror/iozone-tests/ | ||
+ | Mount: / | ||
+ | EntryID: 3-581392E1-31 | ||
+ | |||
+ | # find | ||
+ | [root@sharptail ~]# ssh n38 find /data/beegfs_meta -name 3-581392E1-31 | ||
+ | / | ||
+ | |||
+ | # and find | ||
+ | [root@sharptail ~]# ssh n39 find / | ||
+ | / | ||
+ | |||
+ | # seems to work | ||
</ | </ | ||
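The manual lookup above (get the EntryID, then look for it on each meta node) could be wrapped up roughly like this. A minimal sketch, not a tested tool: the function names are made up for illustration, and ''META_DIR'' assumes every meta node keeps its metadata under /data/beegfs_meta, which the page only shows for n38.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of beeGFS.

# Assumption: same metadata directory on every meta node
# (the source only shows this path on n38).
META_DIR="${META_DIR:-/data/beegfs_meta}"

# Read `beegfs-ctl --getentryinfo` output on stdin, print the EntryID value.
entry_id_from_info() {
    awk '$1 == "EntryID:" { print $2; exit }'
}

# Usage: check_meta_mirror ENTRYID NODE...
# Prints "<node>: found" or "<node>: missing" per node.
# Returns 0 only if the entry was found on every node.
check_meta_mirror() {
    entry_id="$1"; shift
    missing=0
    for node in "$@"; do
        # stdin is redirected so ssh cannot swallow the caller's input
        if ssh "$node" find "$META_DIR" -name "$entry_id" </dev/null | grep -q .; then
            echo "$node: found"
        else
            echo "$node: missing"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run where beegfs-ctl and passwordless ssh are available):
#   id=$(beegfs-ctl --getentryinfo /mnt/beegfs/some/dir | entry_id_from_info)
#   check_meta_mirror "$id" n38 n39
```

The path in the example comment is a placeholder; with mirroring on, both nodes should report found.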
+ | ==== / | ||
- | You can also find MKL libraries at '' | + | * Source content 110G in XFS with ~100,000 files in ~2,000 dirs |
+ | * /home/hmeij (mix of files, nothing large) plus | ||
+ | * /home/ | ||
+ | |||
+ | * File content spread across 2 storage servers | ||
+ | * petaltail:/ | ||
+ | * swallowtail:/ | ||
+ | * 56G used in beegfs-storage per storage server | ||
+ | * ~92,400 files per storage server | ||
+ | * ~1,400 dirs per storage server | ||
- | All Openlava commands work the same way as elsewhere in our HPCC environment. In order to use the SGI Altix you must SSH to the head/ | + | * Meta content spread across 2 meta servers (n37 and n38) |
+ | * 338MB per beegfs-meta server, so about 0.6% space wise for 2 servers (2 x 338MB against 110G of content) | ||
+ | * ~105,000 files per metadata server | ||
+ | * ~35,000 dirs almost spread evenly across | ||
+ | |||
+ | * Client | ||
+ | * 110G in / | ||
+ | * ~100,000 files | ||
+ | * ~2,000 dirs | ||
+ | |||
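As a rough check on the metadata overhead in the list above: 2 x 338MB against 110G of content is a fraction of about 0.006, so roughly 0.6%. A small sketch of that arithmetic follows; the function name is made up, and it assumes 1 GB = 1024 MB since the page does not say which unit the sizes were reported in.

```shell
#!/bin/sh
# Metadata overhead as a percentage of the content size.
# Arguments: MB per meta server, number of meta servers, content size in GB.
# Assumption: 1 GB is treated as 1024 MB.
meta_overhead_pct() {
    awk -v mb="$1" -v n="$2" -v gb="$3" \
        'BEGIN { printf "%.2f\n", 100 * mb * n / (gb * 1024) }'
}

# Numbers from the list above: 338MB per server, 2 servers, 110G of content.
meta_overhead_pct 338 2 110   # prints 0.60
```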
+ | Looks like: | ||
< | < | ||
- | [root@hmeij ~]# ssh hmeij@cottontail | + | # file content |
- | hmeij@cottontail' | + | |
- | Last login: Thu Oct 20 09:38:40 2016 from 129.133.22.42 | + | |
- | [hmeij@cottontail ~]$ | + | |
- | # then | + | [root@swallowtail ~]# ls -lR / |
+ | / | ||
+ | total 672 | ||
+ | -rw-rw-rw- 1 root root 289442 Jun 26 2015 D8-57E42E89-30 | ||
+ | -rw-rw-rw- 1 root root 3854 Jun 26 2015 D9-57E42E89-30 | ||
+ | -rw-rw-rw- 1 root root 16966 Jun 26 2015 DA-57E42E89-30 | ||
+ | -rw-rw-rw- 1 root root 65779 Jun 26 2015 DB-57E42E89-30 | ||
+ | -rw-rw-rw- 1 root root 20562 Jun 26 2015 DF-57E42E89-30 | ||
+ | -rw-rw-rw- 1 root root 259271 Jun 26 2015 E0-57E42E89-30 | ||
+ | -rw-rw-rw- 1 root root 372 Jun 26 2015 E1-57E42E89-30 | ||
- | [hmeij@cottontail | + | [root@petaltail |
- | [hmeij@enzo hmeij]$ bqueues | + | / |
- | QUEUE_NAME | + | total 144 |
- | sgi96 50 Open:Active | + | -rw-rw-rw- 1 root root |
+ | -rw-rw-rw- | ||
+ | -rw-rw-rw- 1 root root 100077 Jun 26 2015 DE-57E42E89-30 | ||
- | /code> | + | # meta content |
+ | [root@sharptail ~]# ssh n38 find / | ||
+ | / | ||
+ | / | ||
+ | [root@sharptail ~]# ssh n39 find / | ||
+ | (none, no mirror) | ||
+ | |||
+ | </ | ||
+ | ==== Tuning ==== | ||
+ | |||
+ | * global interfaces files ib0-> | ||
+ | * priority order, seems useful | ||
+ | * set in a file somewhere | ||
+ | |||
+ | * backup beeGFS EA metadata, see faq | ||
+ | * attempt a restore | ||
+ | * or just snapshot | ||
+ | |||
+ | * storage server tuning | ||
+ | * set on cottontail on sdb, both values were 128 (seems to help -- late summer 2016) | ||
+ | * echo 4096 > / | ||
+ | * echo 4096 > / | ||
+ | * set on cottontail, was 90112 + / | ||
+ | * echo 262144 > / | ||
+ | * do same on greentail? (done late fall 2016) | ||
+ | * all original values same as cottontail (all files) | ||
+ | * set on c1d1 thru c1d6 | ||
+ | * do same on sharptail? | ||
+ | * no such values for sdb1 | ||
+ | * can only find min_free_kbytes, | ||
+ | * stripe and chunk size | ||
+ | |||
+ | < | ||
+ | |||
+ | [root@n7 ~]# beegfs-ctl --getentryinfo / | ||
+ | Path: | ||
+ | Mount: /mnt/beegfs | ||
+ | EntryID: root | ||
+ | Metadata node: n38 [ID: 48] | ||
+ | Stripe pattern details: | ||
+ | + Type: RAID0 | ||
+ | + Chunksize: 512K | ||
+ | + Number of storage targets: desired: 4 | ||
+ | |||
+ | </ | ||
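To get a feel for what the stripe pattern above means: with RAID0, a 512K chunksize and 4 desired targets, one full stripe covers 4 x 512K = 2M of a file. The sketch below maps a byte offset to its chunk number and target slot. It assumes plain round-robin chunk placement, and the slot is only a position within the file's own target list, not a beeGFS target ID. The function name is made up.

```shell
#!/bin/sh
# Values taken from the --getentryinfo output above.
CHUNK_BYTES=$((512 * 1024))   # Chunksize: 512K
NUM_TARGETS=4                 # Number of storage targets: desired: 4

# Usage: stripe_slot BYTE_OFFSET
# Assumption: chunks are placed round-robin across the file's targets.
stripe_slot() {
    offset="$1"
    chunk=$((offset / CHUNK_BYTES))   # which 512K chunk the offset falls in
    slot=$((chunk % NUM_TARGETS))     # position in the file's target list
    echo "chunk=$chunk slot=$slot"
}

stripe_slot 0                      # prints chunk=0 slot=0
stripe_slot $((3 * 1024 * 1024))   # prints chunk=6 slot=2
```

With only two storage servers in this test setup, several slots would land on the same server, so the mapping says nothing about how evenly the load is spread.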
+ | * The cache type can be set in the client config file (/ | ||
+ | * buffered is default, few 100k per file | ||
+ | |||
+ | * tuneNumWorkers in all / | ||
+ | * for meta, storage and clients ... | ||
+ | |||
+ | * metadata server tuning | ||
+ | * read in more detail | ||
+ | |||
+ | ==== Installation ==== | ||
+ | |||
+ | * made easy [[http:// | ||
+ | |||
+ | < | ||
+ | |||
+ | [root@cottontail ~]# ssh n7 beegfs-net | ||
+ | |||
+ | mgmt_nodes | ||
+ | ============= | ||
+ | cottontail [ID: 1] | ||
+ | | ||
+ | |||
+ | meta_nodes | ||
+ | ============= | ||
+ | n38 [ID: 48] | ||
+ | | ||
+ | n39 [ID: 49] | ||
+ | | ||
+ | |||
+ | storage_nodes | ||
+ | ============= | ||
+ | swallowtail [ID: 136] | ||
+ | | ||
+ | petaltail [ID: 217] | ||
+ | | ||
+ | |||
+ | </ | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: |