This shows you the differences between two versions of the page.

cluster:151 [2016/11/10 20:35]
hmeij07 [Resync Meta]
cluster:151 [2016/12/06 20:14]
hmeij07 [beeGFS cluster idea]
Line 6: Line 6:
 A document for me to recall and make notes of what I read in the manual pages and what needs testing.
-Basically during the Summer of 2016 I investigated if the HPCC could afford enterprise level storage. I wanted 99.999% uptime, snapshots, high availability and other goodies such as parallel NFS. Netapp came the closest but, eh, still at $42K lots of other options show up. The story is detailed here at [[cluster:149|The Storage Problem]]
+Basically during the Summer of 2016 I investigated if the HPCC could afford enterprise level storage. I wanted 99.999% uptime, snapshots, high availability and other goodies such as parallel NFS. Netapp came the closest but, eh, still at $42K lots of other options show up. That story is detailed at [[cluster:149|The Storage Problem]]
 This page is best read from the bottom up.
-==== cluster idea ====
+NOTE:
-  * Storage servers: buy now 4k+4k then 3rd in July 4k?
+I'm reluctantly giving up on beegfs, especially v6.1; it is simply flaky. In the admon gui I can see storage nodes, 4 storage objects, 4 meta servers with clients installed on all meta servers. /mnt/beegfs is there and content can be created. Then I mirror the storage nodes and all is fine. Then I mirror the meta servers: the mirrors set up and enabling mirrormd reports success. Then the whole environment hangs on /mnt/beegfs. My sense is that helperd is not communicating well in a private network environment with no DNS and does not consult /etc/hosts. But I have nothing to back that up with, so I can't fix it.
-  * move test users over on 2 nodes, test, only change is $HOME
+Back to adding more XFS into my cluster. I'll wait a few more versions.
 + --- //[[|Henk]] 2016/12/06 15:10// 
 +==== beeGFS cluster idea ====
-  * Home cluster
+  * Storage servers:
-    * cottontail (mngt+admingui)
+    * buy 2 with each 12x2TB slow disk, Raid 6, 20T usable (clustered, parallel file system)
-    * 2-new units storage (+snapshots/meta backup)
+      * create 6TB volumes on each, quota at 2TB via XFS, users/server
-    * cottontail2 meta n38-n45 meta, all mirrored
+      * only $HOME changes to ''/mnt/beegfs/home[1|2]'' (migrates ~4.5TB away from /home or ~50%)
 +      * create 2 buddymirrors; each with primary on one, secondary on the other server (high availability) 
 +    * on UPS 
 +    * on Infiniband 
 +  * Client servers: 
 +    * all compute/login nodes become beegfs clients 
 +  * Meta servers: 
 +    * cottontail2 (root meta, on Infiniband) plus n38-n45 nodes (on Infiniband) 
 +    * all mirrored (total=9) 
 +    * cottontail2 on UPS  
 +  * Management and Monitor servers 
 +    * cottontail (on UPS, on Infiniband) 
 +  * Backups ( via rsync daemons [[cluster:150|Rsync Daemon/Rsnapshot]]) 
 +    * sharptail:/home --> cottontail 
 +    * serverA:/mnt/beegfs/home1 --> serverB (8TB max) 
 +    * serverB:/mnt/beegfs/home2 --> serverA (8TB max) 
 +  * Costs (includes 3 year NBD warranty) 
 +    * Microway $12,500 
 +    * CDW $14,700
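
A quick sanity check on the 20T usable figure in the list above, as a minimal shell sketch. It assumes plain RAID 6 (two disks' worth of parity per array) and ignores hot spares and filesystem overhead. The function name ''raid6_usable_tb'' is made up for illustration.

```shell
#!/bin/bash
# RAID 6 reserves two disks' worth of capacity for parity,
# so usable space is (disks - 2) * disk size.
raid6_usable_tb() {
    local disks=$1 size_tb=$2
    echo $(( (disks - 2) * size_tb ))
}

# 12 x 2TB per storage server, as in the list above
per_server=$(raid6_usable_tb 12 2)
echo "usable per server: ${per_server}T"
```

With buddymirrors across the two servers every chunk is stored twice, so the unique data the pair can hold is about one server's worth, not the sum of both.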
 ==== beegfs-admin-gui ====
Line 25: Line 50:
   * ''cottontail:/usr/local/bin/beegfs-admin-gui''
-==== Resync Data ====
+==== upgrade ====
 +  * [[|External Link]] 
 +  * New feature - High Availability for Metadata Servers (self-healing, transparent failover) 
 +A bit complicated.  
 +  * Repo base URL baseurl= via http shows only 6.1-el6 
 +    * [   ] beegfs-mgmtd-6.1-el6.x86_64.rpm          2016-11-16 16:27  660K  
 +  * '' yum --disablerepo "*" --enablerepo beegfs repolist'' shows 
 +    * beegfs-mgmtd.x86_64                            2015.03.r22-el6            beegfs 
 +  * ''yum install --disablerepo "*" --enablerepo beegfs --downloadonly --downloaddir=/sanscratch/tmp/beegfs/beegfs_6/ *x86_64* -y'' 
 +   * [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not Found" <-- wrong package version 
 +So the wget/rpm approach it is (list all beegfs packages present on a particular node in one rpm command, else you will get a dependency failure!)
 +# get them all 
 +# client and meta node 
 +rpm -Uvh ./beegfs-common-6.1-el6.noarch.rpm ./beegfs-utils-6.1-el6.x86_64.rpm ./beegfs-opentk-lib-6.1-el6.x86_64.rpm ./beegfs-helperd-6.1-el6.x86_64.rpm ./beegfs-client-6.1-el6.noarch.rpm ./beegfs-meta-6.1-el6.x86_64.rpm 
 +# updated? 
 +[root@cottontail2 beegfs_6]# beegfs-ctl | head -2 
 +BeeGFS Command-Line Control Tool ( 
 +Version: 6.1 
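
To avoid forgetting a package on a given node, the file list for the ''rpm -Uvh'' line can be derived from what is already installed. A minimal sketch, assuming the downloaded files follow the ''name-version.arch.rpm'' pattern seen above. The helper name ''rpm_files'' is hypothetical, and the demo feeds it sample lines instead of querying a real node.

```shell
#!/bin/bash
# Turn "name arch" lines (one package per line) into rpm file names
# for the given version string, e.g. 6.1-el6.
rpm_files() {
    local ver=$1 name arch
    while read -r name arch; do
        if [ -n "$name" ]; then
            echo "./${name}-${ver}.${arch}.rpm"
        fi
    done
}

# Demo with two sample lines. On a real node the input would come from:
#   rpm -qa --qf '%{NAME} %{ARCH}\n' 'beegfs*'
printf 'beegfs-common noarch\nbeegfs-meta x86_64\n' | rpm_files 6.1-el6
```

The printed names can then be handed to a single ''rpm -Uvh'' call from the download directory, so all dependencies are resolved in one transaction.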
 +==== Resync Data #2 ==== 
 +If you have 2 buddymirrors and 2 storage servers, each with 2 storage objects, beegfs will write to all primary storage targets even if numtargets is set to 1 ... it will use all storage objects, so it is best to set numtargets equal to the number of primary storage objects. And then of course the content flows from primary to secondary for high availability.
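
One way to pick that numtargets value is to count the buddy groups (one primary each) in the ''beegfs-ctl --listmirrorgroups'' output shown further down this page. A rough sketch, assuming the three-column layout of that output. The helper name ''count_primaries'' is made up, and the sample text is pasted in rather than read from a live system.

```shell
#!/bin/bash
# Count data rows (three columns, first one numeric) in
# `beegfs-ctl --listmirrorgroups` style output read from stdin.
count_primaries() {
    awk 'NF == 3 && $1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

sample='     BuddyGroupID   PrimaryTargetID SecondaryTargetID
     ============   =============== =================
                1             21701             13601
                2             13602             21702'

n=$(printf '%s\n' "$sample" | count_primaries)

# Only print the command; run it by hand after checking the number.
echo "beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home1 --chunksize=512k --numtargets=$n"
```

On a live system the sample would be replaced by piping ''beegfs-ctl --listmirrorgroups'' into ''count_primaries''.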
 +How does one add a server? 
 +# define storage objects, 2 per server 
 +[root@petaltail ~]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv1/beegfs_storage -s 217 -i 21701 -m cottontail 
 +[root@petaltail ~]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv2/beegfs_storage -s 217 -i 21702 -m cottontail 
 +[root@swallowtail data]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv1/beegfs_storage -s 136 -i 13601 -m cottontail  
 +[root@swallowtail data]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv2/beegfs_storage -s 136 -i 13602 -m cottontail 
 +[root@cottontail2 ~]# beegfs-df 
 +TargetID        Pool        Total         Free    %      ITotal       IFree    % 
 +========        ====        =====         ====    =      ======       =====    = 
 +     250         low     122.3GiB     116.6GiB  95%        7.8M        7.6M  98% 
 +TargetID        Pool        Total         Free    %      ITotal       IFree    % 
 +========        ====        =====         ====    =      ======       =====    = 
 +   13601         low     291.4GiB     164.6GiB  56%       18.5M       18.5M 100% 
 +   13602         low     291.4GiB     164.6GiB  56%       18.5M       18.5M 100% 
 +   21701         low     291.2GiB     130.5GiB  45%       18.5M       16.2M  87% 
 +   21702         low     291.2GiB     130.5GiB  45%       18.5M       16.2M  87% 
 +# define mirrorgroups
 +[root@cottontail2 ~]# beegfs-ctl --addmirrorgroup [--nodetype=storage] --primary=21701 --secondary=13601 --groupid=1 
 +[root@cottontail2 ~]# beegfs-ctl --addmirrorgroup [--nodetype=storage] --primary=13602 --secondary=21702 --groupid=2 
 +[root@cottontail2 ~]# beegfs-ctl --listmirrorgroups 
 +     BuddyGroupID   PrimaryTargetID SecondaryTargetID 
 +     ============   =============== ================= 
 +                1             21701             13601 
 +                2             13602             21702 
 +# define buddygroups, numtargets=1 
 +[root@cottontail2 ~]# beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home1 --chunksize=512k --numtargets=1 
 +New chunksize: 524288 
 +New number of storage targets: 1 
 +Path: /home1 
 +Mount: /mnt/beegfs 
 +[root@cottontail2 ~]# beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home2 --chunksize=512k --numtargets=1 
 +New chunksize: 524288 
 +New number of storage targets: 1 
 +Path: /home2 
 +Mount: /mnt/beegfs 
 +# drop /home/hmeij in /mnt/beegfs/home1/hmeij 
 +[root@petaltail mysql_bak_ptt]# find /data/lv1/beegfs_storage/ -type f | wc -l 
 +[root@petaltail mysql_bak_ptt]# find /data/lv2/beegfs_storage/ -type f | wc -l 
 +[root@swallowtail data]# find /data/lv1/beegfs_storage/ -type f | wc -l 
 +[root@swallowtail data]# find /data/lv2/beegfs_storage/ -type f | wc -l 
 +# with numtargets=1 beegfs still writes to all primary targets found in all buddygroups
 +# rebuild test servers from scratch with numtargets=2
 +# drop hmeij/ into home1/ and obtain slightly more files (a couple of hundred), not double the amount
 +# /home/hmeij has 7808 files in it, which get split over the primaries, but would numtargets=2 yield 15,616 files?
 +# drop another copy in home2/ and file counts double to circa 7808
 +[root@cottontail2 ~]# beegfs-ctl --getentryinfo  /mnt/beegfs/home1 
 +Path: /home1 
 +Mount: /mnt/beegfs 
 +EntryID: 0-583C50A1-FA 
 +Metadata node: cottontail2 [ID: 250] 
 +Stripe pattern details: 
 ++ Type: Buddy Mirror 
 ++ Chunksize: 512K 
 ++ Number of storage targets: desired: 2 
 +[root@cottontail2 ~]# beegfs-ctl --getentryinfo  /mnt/beegfs/home2 
 +Path: /home2 
 +Mount: /mnt/beegfs 
 +EntryID: 1-583C50A1-FA 
 +Metadata node: cottontail2 [ID: 250] 
 +Stripe pattern details: 
 ++ Type: Buddy Mirror 
 ++ Chunksize: 512K 
 ++ Number of storage targets: desired: 2 
 +Source: /home/hmeij 7808 files in 10G 
 +TargetID        Pool        Total         Free    %      ITotal       IFree    % 
 +========        ====        =====         ====    =      ======       =====    = 
 +   13601         low     291.4GiB      63.1GiB  22%       18.5M       18.5M 100% 
 +   13602         low     291.4GiB      63.1GiB  22%       18.5M       18.5M 100% 
 +   21701         low     291.2GiB     134.6GiB  46%       18.5M       16.2M  87% 
 +   21702         low     291.2GiB     134.6GiB  46%       18.5M       16.2M  87% 
 +[root@cottontail2 ~]# rsync -ac --bwlimit=2500 /home/hmeij /mnt/beegfs/home1/ 
 +[root@cottontail2 ~]# rsync -ac --bwlimit=2500 /home/hmeij /mnt/beegfs/home2/ 
 +TargetID        Pool        Total         Free    %      ITotal       IFree    % 
 +========        ====        =====         ====    =      ======       =====    = 
 +   13601         low     291.4GiB      43.5GiB  15%       18.5M       18.5M 100% 
 +   13602         low     291.4GiB      43.5GiB  15%       18.5M       18.5M 100% 
 +   21701         low     291.2GiB     114.9GiB  39%       18.5M       16.1M  87% 
 +   21702         low     291.2GiB     114.9GiB  39%       18.5M       16.1M  87% 
 +# first rsync drops roughly 5G on each of the two primaries, which then gets copied to the secondaries
 +# second rsync does the same, so both storage servers lose roughly 20G
 +# now shut a storage server down and the whole filesystem can still be accessed (HA) 
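
The roughly 20G per server can be checked with simple arithmetic: 10G of source data, copied into 2 directories, stored twice each (primary plus secondary), spread over 2 servers. A minimal sketch, assuming the data lands evenly on both servers. The function name ''mirrored_use_per_server'' is made up for illustration.

```shell
#!/bin/bash
# Space consumed per storage server, in GB, when every copy of the
# source is buddymirrored (stored on a primary and on a secondary).
mirrored_use_per_server() {
    local src_gb=$1 copies=$2 servers=$3
    echo $(( src_gb * copies * 2 / servers ))
}

# 10G source, rsynced into home1 and home2, 2 storage servers
mirrored_use_per_server 10 2 2
```

That agrees with the two beegfs-df listings above, where free space per server drops by about 19.6 to 19.7GiB between the runs.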
 +==== Resync Data #1 ====
 [[|StorageSynchronization Link]]
Line 36: Line 205:
   * started a full --resyncstorage --mirrorgroupid=101 --timestamp=0
   * got --getentryinfo EntryID for a file in my /mnt/beegfs/home/path/to/file and did the same for the directory the file was located in
-  * did a cat /mnt.beegfs/home/path/to/file on a client (just fine)
+  * did a cat /mnt/beegfs/home/path/to/file on a client (just fine)
   * brought primary storage down
   * redid the cat above (it hangs for a couple of minutes, then displays the file content)
Line 217: Line 386:
 Content mirroring will require more disk space. Perhaps snapshots to another node are more useful; that also solves the backup issue.
-
+V6 does buddymirror meta mirroring [[|External Link]]
 <code> <code>
-# enable meta mirroring, directory based
+2015.03 enable meta mirroring, directory based
 # change to 11/04/2016: used --createdir to make this home.
 [root@n7 ~]# beegfs-ctl --mirrormd /mnt/beegfs/home
Line 226: Line 395:
 Mount: '/mnt/beegfs'; Path: '/hmeij-mirror'
 Operation succeeded.
 +# V6.1 does it at root level, not from a path
 +beegfs-ctl --addmirrorgroup --nodetype=meta --primary=38 --secondary=39 --groupid=1
 +beegfs-ctl --addmirrorgroup --nodetype=meta --primary=250 --secondary=37 --groupid=2
 +beegfs-ctl --mirrormd
 # put some new content in
Line 316: Line 490:
     * set in /etc/beegfs-[storage|client|meta|admon|mgmtd].conf and restart services
-  * backup beeGFS EA metadata, see faq
+  * backup/restore/mirror
-    * attempt a restore
+    * see more towards the top of this page
-    * or just snapshot
   * storage server tuning
Line 360: Line 533:
   * made easy [[|External Link]]
   * rpms pulled from repository via petaltail in ''greentail:/sanscratch/tmp/beegfs''
 +    * ''yum --disablerepo "*" --enablerepo beegfs list available''
 +    * use ''yumdownloader''
 <code> <code>
cluster/151.txt · Last modified: 2016/12/06 20:14 by hmeij07