beeGFS

A document for me to recall what I read in the manual pages and make notes of what needs testing.

Basically, during the summer of 2016 I investigated whether the HPCC could afford enterprise-level storage. I wanted 99.999% uptime, snapshots, high availability and other goodies such as parallel NFS. NetApp came the closest but, eh, at $42K lots of other options show up. That story is detailed at The Storage Problem

This page is best read from the bottom up.

NOTE:

I'm reluctantly giving up on beegfs, especially v6.1; it is simply flaky. In the admon GUI I can see 2 storage nodes, 4 storage objects, and 4 meta servers with clients installed on all meta nodes. /mnt/beegfs is there and content can be created. Then I mirror the storage nodes; all is fine. Then I mirror the meta servers: the mirror groups get set up and enabling mirrormd reports success. Then the whole environment hangs on /mnt/beegfs. My sense is that helperd is not communicating well in a private network environment with no DNS and does not consult /etc/hosts. But I have nothing to back that up with, so I can't fix it.

Back to adding more XFS into my cluster, I'll wait a few more versions. — Henk 2016/12/06 15:10

beeGFS cluster idea

beegfs-admin-gui

upgrade

A bit complicated.

So the wget/rpm approach (list all packages present on a particular node, else you will get a dependency failure!)

# get them all
wget http://www.beegfs.com/release/beegfs_6/dists/rhel6/x86_64/beegfs-mgmtd-6.1-el6.x86_64.rpm

# client and meta node
rpm -Uvh ./beegfs-common-6.1-el6.noarch.rpm ./beegfs-utils-6.1-el6.x86_64.rpm ./beegfs-opentk-lib-6.1-el6.x86_64.rpm ./beegfs-helperd-6.1-el6.x86_64.rpm ./beegfs-client-6.1-el6.noarch.rpm ./beegfs-meta-6.1-el6.x86_64.rpm
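
To see which beegfs packages are actually present on a node before building the rpm -Uvh list (so the dependency failure mentioned above doesn't bite), a plain rpm query does the trick:

# list installed beegfs packages on this node
rpm -qa | grep -i beegfs | sort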

# updated?
[root@cottontail2 beegfs_6]# beegfs-ctl | head -2
BeeGFS Command-Line Control Tool (http://www.beegfs.com)
Version: 6.1

#Sheeesh

Resync Data #2

If you have 2 buddy mirror groups and 2 storage servers, each with 2 storage objects, beegfs will write to all primary storage targets even if numtargets is set to 1 … it will use all storage objects, so it is best to set numtargets equal to the number of primary storage objects. And then of course the content flows from primary to secondary for high availability.
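
For example (a sketch reusing the beegfs-ctl flags used further down, not a new recipe): with 2 buddy mirror groups, i.e. 2 primary storage targets, the directory pattern would get numtargets=2:

# match numtargets to the number of primary storage targets (2 buddy groups here)
beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home1 --chunksize=512k --numtargets=2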

How does one add a server?

# define storage objects, 2 per server
[root@petaltail ~]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv1/beegfs_storage -s 217 -i 21701 -m cottontail
[root@petaltail ~]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv2/beegfs_storage -s 217 -i 21702 -m cottontail
[root@swallowtail data]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv1/beegfs_storage -s 136 -i 13601 -m cottontail 
[root@swallowtail data]# /opt/beegfs/sbin/beegfs-setup-storage -p /data/lv2/beegfs_storage -s 136 -i 13602 -m cottontail
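
After beegfs-setup-storage writes the config, the storage daemon still needs a (re)start on each server so the new targets register with the management node; a sketch assuming the stock el6 init scripts shipped with the beegfs-storage package:

# on petaltail and swallowtail
service beegfs-storage restart

# then check that the new targets show up
beegfs-ctl --listtargets --nodetype=storage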


[root@cottontail2 ~]# beegfs-df
METADATA SERVERS:
TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
     250         low     122.3GiB     116.6GiB  95%        7.8M        7.6M  98%

STORAGE TARGETS:
TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
   13601         low     291.4GiB     164.6GiB  56%       18.5M       18.5M 100%
   13602         low     291.4GiB     164.6GiB  56%       18.5M       18.5M 100%
   21701         low     291.2GiB     130.5GiB  45%       18.5M       16.2M  87%
   21702         low     291.2GiB     130.5GiB  45%       18.5M       16.2M  87%

# define mirrorgroups
[root@cottontail2 ~]# beegfs-ctl --addmirrorgroup [--nodetype=storage] --primary=21701 --secondary=13601 --groupid=1
[root@cottontail2 ~]# beegfs-ctl --addmirrorgroup [--nodetype=storage] --primary=13602 --secondary=21702 --groupid=2

[root@cottontail2 ~]# beegfs-ctl --listmirrorgroups
     BuddyGroupID   PrimaryTargetID SecondaryTargetID
     ============   =============== =================
                1             21701             13601
                2             13602             21702

# set buddymirror pattern on the directories, numtargets=1
[root@cottontail2 ~]# beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home1 --chunksize=512k --numtargets=1
New chunksize: 524288
New number of storage targets: 1
Path: /home1
Mount: /mnt/beegfs

[root@cottontail2 ~]# beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home2 --chunksize=512k --numtargets=1
New chunksize: 524288
New number of storage targets: 1
Path: /home2
Mount: /mnt/beegfs

# drop /home/hmeij in /mnt/beegfs/home1/hmeij
[root@petaltail mysql_bak_ptt]# find /data/lv1/beegfs_storage/ -type f | wc -l
3623
[root@petaltail mysql_bak_ptt]# find /data/lv2/beegfs_storage/ -type f | wc -l
3678
[root@swallowtail data]# find /data/lv1/beegfs_storage/ -type f | wc -l
3623
[root@swallowtail data]# find /data/lv2/beegfs_storage/ -type f | wc -l
3678

# with numtargets=1 beegfs still writes to all primary targets found in all buddygroups

# rebuild test servers from scratch with numtargets=2
# drop hmeij/ into home1/ and get slightly more files (a couple of hundred), not double the amount
# /home/hmeij has 7808 files in it, which get split over the primaries, but wouldn't numtargets=2 yield 15,616 files?
# drop another copy in home2/ and the file counts double, to circa 7808
[root@cottontail2 ~]# beegfs-ctl --getentryinfo  /mnt/beegfs/home1
Path: /home1
Mount: /mnt/beegfs
EntryID: 0-583C50A1-FA
Metadata node: cottontail2 [ID: 250]
Stripe pattern details:
+ Type: Buddy Mirror
+ Chunksize: 512K
+ Number of storage targets: desired: 2
[root@cottontail2 ~]# beegfs-ctl --getentryinfo  /mnt/beegfs/home2
Path: /home2
Mount: /mnt/beegfs
EntryID: 1-583C50A1-FA
Metadata node: cottontail2 [ID: 250]
Stripe pattern details:
+ Type: Buddy Mirror
+ Chunksize: 512K
+ Number of storage targets: desired: 2

Source: /home/hmeij 7808 files in 10G

TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
   13601         low     291.4GiB      63.1GiB  22%       18.5M       18.5M 100%
   13602         low     291.4GiB      63.1GiB  22%       18.5M       18.5M 100%
   21701         low     291.2GiB     134.6GiB  46%       18.5M       16.2M  87%
   21702         low     291.2GiB     134.6GiB  46%       18.5M       16.2M  87%
[root@cottontail2 ~]# rsync -ac --bwlimit=2500 /home/hmeij /mnt/beegfs/home1/  &
[root@cottontail2 ~]# rsync -ac --bwlimit=2500 /home/hmeij /mnt/beegfs/home2/  &
TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
   13601         low     291.4GiB      43.5GiB  15%       18.5M       18.5M 100%
   13602         low     291.4GiB      43.5GiB  15%       18.5M       18.5M 100%
   21701         low     291.2GiB     114.9GiB  39%       18.5M       16.1M  87%
   21702         low     291.2GiB     114.9GiB  39%       18.5M       16.1M  87%

# the first rsync drops roughly 5G in both primaries, which then gets copied to the secondaries
# the second rsync does the same, so both storage servers lose roughly 20G
# now shut a storage server down and the whole filesystem can still be accessed (HA)

Resync Data #1

StorageSynchronization Link

If the primary storage target of a buddy group is unreachable, it will get marked as offline and a failover to the secondary target will be issued. In this case, the former secondary target will become the new primary target.

Testing out fail over and deletion of data on secondary then a full resync process:
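
The commands involved look roughly like this; a sketch, not the actual transcript, assuming the v6 beegfs-ctl modes --listtargets, --startresync and --resyncstats:

# watch reachability and consistency state while the primary is down
beegfs-ctl --listtargets --nodetype=storage --state

# once the old primary is back (now secondary), kick off and monitor the resync
# (check beegfs-ctl --startresync --help for the flags to force a *full* resync)
beegfs-ctl --startresync --nodetype=storage --targetid=13601
beegfs-ctl --resyncstats --nodetype=storage --targetid=13601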

Nice that it works.

So you can fully mirror storage content. You'll still need rsnapshots to recover lost data or do point-in-time restores.

Mirror Data

When not all storage servers are up, client mounts will fail. This is just an optional “sanity check” which the client performs when it is mounted. Disable this check by setting “sysMountSanityCheckMS=0” in beegfs-client.conf. When the sanity check is disabled, the client mount will succeed even if no servers are running.
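
For reference, that is a one-line change in the client config (standard /etc/beegfs location assumed):

# /etc/beegfs/beegfs-client.conf
sysMountSanityCheckMS = 0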

In order to be able to take a storage server offline without any impact, all content needs to be mirrored.

Before

[root@cottontail2 ~]# beegfs-df
METADATA SERVERS:
TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
      48         low      29.5GiB      23.3GiB  79%        1.9M        1.5M  82%
      49         low      29.5GiB      23.1GiB  78%        1.9M        1.5M  82%
     250         low     122.3GiB     116.7GiB  95%        7.8M        7.6M  98%

STORAGE TARGETS:
TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
   13601         low     291.4GiB      50.6GiB  17%       18.5M       18.4M 100%
   21701         low     291.2GiB      61.8GiB  21%       18.5M       15.8M  85%

Before

# define buddygroup - these are storage target IDs
[root@n7 ~]# beegfs-ctl --addmirrorgroup --primary=13601 --secondary=21701 --groupid=101
Mirror buddy group successfully set: groupID 101 -> target IDs 13601, 21701

[root@n7 ~]# beegfs-ctl --listmirrorgroups
     BuddyGroupID   PrimaryTargetID SecondaryTargetID
     ============   =============== =================
              101             13601             21701
              
# enable mirroring for data by directory - does numtargets need to be set to the max nr of storage servers?
# changed on 11/02/2016:
[root@n7 ~]# beegfs-ctl --setpattern --buddymirror /mnt/beegfs/home --chunksize=512k 
[root@n7 ~]# beegfs-ctl --setpattern --buddymirror /mnt/beegfs/hmeij-mirror-data --chunksize=512k --numtargets=2
New chunksize: 524288
New number of storage targets: 2
Path: /hmeij-mirror-data
Mount: /mnt/beegfs

# copy some contents in (~hmeij is 10G)
[root@n7 ~]# rsync -vac --bwlimit=2500 /home/hmeij /mnt/beegfs/hmeij-mirror-data/

After

[root@n7 ~]# beegfs-df

METADATA SERVERS: (almost no changes...)
STORAGE TARGETS: (each target less circa 10G)
TargetID        Pool        Total         Free    %      ITotal       IFree    %
========        ====        =====         ====    =      ======       =====    =
   13601         low     291.4GiB      40.7GiB  14%       18.5M       18.4M  99%
   21701         low     291.2GiB      51.9GiB  18%       18.5M       15.8M  85%

# let's find an object
[root@n7 ~]# beegfs-ctl --getentryinfo /mnt/beegfs/hmeij-mirror-data/hmeij/xen/bvm1.img
Path: /hmeij-mirror-data/hmeij/xen/bvm1.img
Mount: /mnt/beegfs
EntryID: 178-581797C8-30
Metadata node: n38 [ID: 48]
Stripe pattern details:
+ Type: Buddy Mirror
+ Chunksize: 512K
+ Number of storage targets: desired: 2; actual: 1
+ Storage mirror buddy groups:
  + 101

# original
[root@n7 ~]# ls -lh /mnt/beegfs/hmeij-mirror-data/hmeij/xen/bvm1.img
-rwxr-xr-x 1 hmeij its 4.9G 2014-04-07 13:39 /mnt/beegfs/hmeij-mirror-data/hmeij/xen/bvm1.img

# copy on primary
[root@petaltail chroots]# ls -lh /var/chroots/data/beegfs_storage/buddymir/u2018/5817/9/60-58179513-30/178-581797C8-30
-rw-rw-rw- 1 root root 4.9G Apr  7  2014 /var/chroots/data/beegfs_storage/buddymir/u2018/5817/9/60-58179513-30/178-581797C8-30
                                                                          ^^^^^^^^

# copy on secondary
[root@swallowtail ~]# find /data/beegfs_storage -name 178-581797C8-30
/data/beegfs_storage/buddymir/u2018/5817/9/60-58179513-30/178-581797C8-30
[root@swallowtail ~]# ls -lh /data/beegfs_storage/buddymir/u2018/5817/9/60-58179513-30/178-581797C8-30
-rw-rw-rw- 1 root root 4.9G Apr  7  2014 /data/beegfs_storage/buddymir/u2018/5817/9/60-58179513-30/178-581797C8-30
                                                              ^^^^^^^^

# seems to work, notice the ''buddymir'' directory on primary/secondary

Here is an important note, from the community list:

Another note: I changed the paths for mirrormd and buddymirror to /mnt/beegfs/home and now I see connectivity data for meta node cottontail2, which was previously missing because I was working at the sub-directory level.

[root@cottontail2 ~]# beegfs-net
meta_nodes
=============
cottontail2 [ID: 250]
   Connections: RDMA: 1 (10.11.103.250:8005);

[root@cottontail2 ~]# beegfs-ctl --listnodes --nodetype=meta --details
cottontail2 [ID: 250]
   Ports: UDP: 8005; TCP: 8005
   Interfaces: ib1(RDMA) ib1(TCP) eth1(TCP) eth0(TCP)
               ^^^

Quota

Meta Backup/Restore

External Link

# latest tar
rpm -Uvh /sanscratch/tmp/beegfs/tar-1.23-15.el6_8.x86_64.rpm

# backup
cd /data; tar czvf /sanscratch/tmp/beegfs/meta-backup/n38-meta.tar.gz beegfs_meta/ --xattrs

# restore
cd /data;  tar xvf /sanscratch/tmp/beegfs/meta-backup/n38-meta.tar.gz --xattrs

# test
cd /data; diff -r beegfs_meta beegfs_meta.orig
# no results
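
Note to self: it is presumably safest to quiesce the metadata daemon around the tar; a sketch assuming the el6 init scripts, not verified:

# on the meta node (n38)
service beegfs-meta stop
# run the backup (or restore) tar shown above
service beegfs-meta start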

Resync Meta

External Link

Mirror Meta

Metadata mirroring can currently not be disabled after it has been enabled for a certain directory

Definitely want Meta content mirrored, that way I can use the n38-n45 nodes with local 15K disk, plus maybe cottontail2 (raid 1 with hot and cold spare).

Content mirroring will require more disk space. Perhaps snapshots to another node are more useful; that also solves the backup issue.

V6 does buddymirror meta mirroring External Link

# 2015.03: enable meta mirroring, directory based
# changed on 11/04/2016: used --createdir to make this home.
[root@n7 ~]# beegfs-ctl --mirrormd /mnt/beegfs/home
[root@n7 ~]# beegfs-ctl --mirrormd /mnt/beegfs/hmeij-mirror
Mount: '/mnt/beegfs'; Path: '/hmeij-mirror'
Operation succeeded.

# V6.1 does it at root level, not from a path
beegfs-ctl --addmirrorgroup --nodetype=meta --primary=38 --secondary=39 --groupid=1 
beegfs-ctl --addmirrorgroup --nodetype=meta --primary=250 --secondary=37 --groupid=2 
beegfs-ctl --mirrormd

# put some new content in 
[root@n7 ~]# rsync -vac /home/hmeij/iozone-tests /mnt/beegfs/hmeij-mirror/

# lookup meta tag
[root@n7 ~]# beegfs-ctl --getentryinfo /mnt/beegfs/hmeij-mirror/iozone-tests/current.tar
Path: /hmeij-mirror/iozone-tests/current.tar
Mount: /mnt/beegfs
EntryID: 3-581392E1-31

# find
[root@sharptail ~]# ssh n38 find /data/beegfs_meta -name 3-581392E1-31
/data/beegfs_meta/mirror/49.dentries/54/6C/0-581392F0-30/#fSiDs#/3-581392E1-31
                  ^^^^^^ ^^
# and find
[root@sharptail ~]# ssh n39 find /data/beegfs_meta -name 3-581392E1-31
/data/beegfs_meta/dentries/54/6C/0-581392F0-30/#fSiDs#/3-581392E1-31

# seems to work

Writing some initial content to both storage and meta servers; vanilla out of the box beegfs seems to balance the writes across both equally. Here are some stats.

/mnt/beegfs/

Looks like:

# file content

[root@swallowtail ~]# ls -lR /data/beegfs_storage/chunks/u0/57E4/2/169-57E42E75-31
/data/beegfs_storage/chunks/u0/57E4/2/169-57E42E75-31:
total 672
-rw-rw-rw- 1 root root 289442 Jun 26  2015 D8-57E42E89-30
-rw-rw-rw- 1 root root   3854 Jun 26  2015 D9-57E42E89-30
-rw-rw-rw- 1 root root  16966 Jun 26  2015 DA-57E42E89-30
-rw-rw-rw- 1 root root  65779 Jun 26  2015 DB-57E42E89-30
-rw-rw-rw- 1 root root  20562 Jun 26  2015 DF-57E42E89-30
-rw-rw-rw- 1 root root 259271 Jun 26  2015 E0-57E42E89-30
-rw-rw-rw- 1 root root    372 Jun 26  2015 E1-57E42E89-30

[root@petaltail ~]# ls -lR /var/chroots/data/beegfs_storage/chunks/u0/57E4/2/169-57E42E75-31
/var/chroots/data/beegfs_storage/chunks/u0/57E4/2/169-57E42E75-31:
total 144
-rw-rw-rw- 1 root root     40 Jun 26  2015 DC-57E42E89-30
-rw-rw-rw- 1 root root  40948 Jun 26  2015 DD-57E42E89-30
-rw-rw-rw- 1 root root 100077 Jun 26  2015 DE-57E42E89-30

# meta content

[root@sharptail ~]# ssh n38 find /data/beegfs_meta -name 169-57E42E75-31
/data/beegfs_meta/inodes/6A/7E/169-57E42E75-31
/data/beegfs_meta/dentries/6A/7E/169-57E42E75-31

[root@sharptail ~]# ssh n39 find /data/beegfs_meta -name 169-57E42E75-31
(none, no mirror)

Tuning

[root@n7 ~]# beegfs-ctl --getentryinfo /mnt/beegfs/
Path:
Mount: /mnt/beegfs
EntryID: root
Metadata node: n38 [ID: 48]
Stripe pattern details:
+ Type: RAID0
+ Chunksize: 512K
+ Number of storage targets: desired: 4
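
Those defaults (RAID0, 512K chunks, 4 desired targets) can be changed per directory with --setpattern; a sketch reusing the flags from the mirroring tests above, with a hypothetical directory and example values only:

# example values: 1M chunks, stripe across 2 targets (directory name is hypothetical)
beegfs-ctl --setpattern --chunksize=1m --numtargets=2 /mnt/beegfs/scratch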

Installation

[root@cottontail ~]# ssh n7 beegfs-net

mgmt_nodes
=============
cottontail [ID: 1]
   Connections: TCP: 1 (10.11.103.253:8008);

meta_nodes
=============
n38 [ID: 48]
   Connections: TCP: 1 (10.11.103.48:8005);
n39 [ID: 49]
   Connections: TCP: 1 (10.11.103.49:8005);

storage_nodes
=============
swallowtail [ID: 136]
   Connections: TCP: 1 (192.168.1.136:8003 [fallback route]);
petaltail [ID: 217]
   Connections: TCP: 1 (192.168.1.217:8003 [fallback route]);
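
A quick sanity check after installation is beegfs-check-servers (part of the beegfs-utils package), which verifies the mgmtd, meta and storage daemons are reachable from a client:

# verify all registered services are reachable
beegfs-check-servers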

