TrueNAS/ZFS

Notes. Mainly for me but might be useful/of interest to users.

Message:

Our current file server is sharptail.wesleyan.edu, which serves out home directories (/home, 10T). A new file server, hpcstore.wesleyan.edu, will be deployed to take over this function (/zfshomes, 190T). This notice is to inform you that your home directory has been cut over.

There are no changes for you. When you log into cottontail or cottontail2 you end up in your new home directory. $HOME and ~username work as usual. The only difference is that your old home was at /home/username and now it is at /zfshomes/username.

If you wish to load/unload large content from your new home directory, please log into hpcstore.wesleyan.edu directly (via ssh/sftp) or, preferably, use rsync with a bandwidth throttle no larger than "--bwlimit=5000".

Details at
https://dokuwiki.wesleyan.edu/doku.php?id=cluster:194
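
A minimal throttled rsync sketch from a personal Linux or Mac machine (hypothetical source directory; replace username with your HPCC username):

# throttled copy into the new home directory; add --dry-run to test first
rsync -av --bwlimit=5000 ~/bigdata/ username@hpcstore.wesleyan.edu:/mnt/tank/zfshomes/username/bigdata/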

Summary

# from outside via VPN
$ ssh hpc21@hpcstore.wesleyan.edu

hpc21@hpcstore.wesleyan.edu's password:
FreeBSD 11.2-STABLE (TrueNAS.amd64) 
(banner snip ...)
Welcome to TrueNAS

# note we ended up on node "B"
[hpc21@hpcstore2 ~]$ pwd
/mnt/tank/zfshomes/hpc21
[hpc21@hpcstore2 ~]$ echo $HOME
/mnt/tank/zfshomes/hpc21

# quota check
[hpc21@hpcstore2 ~]$  zfs userspace tank/zfshomes | egrep -i "quota|$USER"
TYPE        NAME      USED  QUOTA
POSIX User  hpc21     282K   500G


# from inside HPCC with ssh keys properly set up
[hpc21@cottontail ~]$ ssh hpcstore
Last login: Mon Mar 23 10:58:27 2020 from 129.133.52.222

[hpc21@cottontail ~]$ echo $HOME
/zfshomes/hpc21

[hpc21@hpcstore2 ~]$ df -h .
Filesystem       Size    Used   Avail Capacity  Mounted on
tank/zfshomes    177T    414G    177T     0%    /mnt/tank/zfshomes

# throttled dry-run rsync from a Windows machine to hpcstore
[hmeij@ThisPC]$ rsync -vac --dry-run --whole-file --bwlimit=4096  \
c:\Users\hmeij\ hpcstore:/mnt/tank/zfshomes/hmeij/
sending incremental file list
...

Not any more. There is a serious conflict between NFS and SMB ACLs if both protocols are enabled on the same dataset, so nobody has a Samba share. If you want to drag and drop, use something like CyberDuck and make an sftp connection.
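
A command-line alternative sketch for occasional transfers (interactive sftp session; the file name is hypothetical):

# sftp works much like CyberDuck does under the hood
sftp username@hpcstore.wesleyan.edu
sftp> put results.tar.gz
sftp> exit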

Henk 2020/05/28 11:10

# windows command line
C:\Users\hmeij07>net use W: \\hpcstore.wesleyan.edu\hmeij /user:hmeij
Enter the password for 'hmeij' to connect to 'hpcstore.wesleyan.edu':
The command completed successfully.

# or ThisPC > Map Network Drive
\\hpcstore.wesleyan.edu\username
# user is hpcc username, password is hpcc password

Consoles

HA

High Availability. Two controllers hpcstore1 (also known as A) and hpcstore2 (also known as B).

The virtual IP hpcstore.wesleyan.edu floats back and forth seamlessly (tested; some protocols will lose connectivity). In a split-brain situation (no response, both controllers think they are active), disconnect one controller from power, then reboot. Then reconnect and wait a few minutes for the HA icon to turn green when the controller comes back online.

Network interfaces IGB0 and IGB1 (/zfshomes via NFS) and lagg0 (vlan52) are marked Critical for Failover.

You can always disable failover, for example to fix the power feed of the switches serving 192.168.0.0/16 or 10.10.0.0/16.

To disable failover: go to WebUI > System > Failover, check the Disable Failover box, then click Save (leave the default controller setting as is).

This allows you to make your network change without failing over. Sync to Peer is probably not necessary since you are on the active controller. Once you are finished, sync with Failover enabled (no standby reboot).

SSH

SSH is allowed for large content transfers using scp or sftp, or just for checking things out.
TODO: rsync?

Home directories are located in /mnt/tank/zfshomes. When users get cut over, their location is updated in the /etc/passwd file and $HOME becomes /zfshomes/username, so we can keep track of who has been moved. That is followed by an rsync process that runs from the TrueNAS/ZFS appliance and syncs to sharptail:/home.
TODO: write script.
TODO: add disks to old sharptail:/home, enlarge and merge LVMs.
TODO: backup target

# create user (no new primary group; set primary + auxiliary groups, full name)
# set shell, set permissions, some random passwd (date +%N plus symbols)
# then move all dot files into ~/._nas, scp ~/.ssh over
# copy content over from sharptail, @hpcstore...
rsync -ac --bwlimit=4096 --whole-file --stats sharptail:/home/hmeij/  /mnt/tank/zfshomes/hmeij/ 
# SSH keys in place so should be passwordless, test
ssh username@hpcstore.wesleyan.edu

# go to $HOME
cd /mnt/tank/zfshomes/username

# this will be mounted HPC wide at
/zfshomes/username
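
A rough sketch of the per-user cutover steps above as a script (hypothetical; run on the appliance as root, username hard-coded for illustration, quota value as used for existing accounts):

#!/bin/sh
# per-user cutover sketch -- review before running
u=username
# stash old csh dot files out of the way (ssh keys are copied separately)
mkdir -p /mnt/tank/zfshomes/$u/._nas
# pull content from the old home on sharptail, throttled
rsync -ac --bwlimit=4096 --whole-file --stats \
  sharptail:/home/$u/ /mnt/tank/zfshomes/$u/
# set the per-user quota on the dataset
zfs set userquota@$u=500g tank/zfshomes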

certs

ZFS

# for users
zfs allow everyone userquota,userused tank/zfshomes

# as user
zfs userspace  tank/zfshomes
zfs groupspace tank/zfshomes

# utterly bizarre: in v12 these commands change

root@hpcstore2[~]# su - hmeij07
hpcstore2%
hpcstore2% zfs get userused@hmeij07 tank/zfshomes
NAME           PROPERTY          VALUE             SOURCE
tank/zfshomes  userused@hmeij07  718K              local
hpcstore2% zfs get userquota@hmeij07 tank/zfshomes
NAME           PROPERTY           VALUE              SOURCE
tank/zfshomes  userquota@hmeij07  500G               local
hpcstore2% zfs get userspace@hmeij07 tank/zfshomes
bad property list: invalid property 'userspace@hmeij07'


# example zfs userspace output for user hpc100
TYPE        NAME     USED  QUOTA
POSIX User  hpc100  14.9G   100G
POSIX User  root       1K   none

# set quota
zfs set userquota@hpc100=100g  tank/zfshomes
zfs set groupquota@hpc100=100g tank/zfshomes

# get userused
zfs get userused@hpc100 tank/zfshomes

# list snapshots
zfs list -t snapshot

# output
NAME                                            USED  AVAIL  REFER  MOUNTPOINT
freenas-boot/ROOT/default@2019-12-17-22:04:34  2.10M      -   827M  -
tank/zfshomes@auto-20200309.1348-1y             210K      -   558K  -
tank/zfshomes@auto-20200310.1348-1y             219K      -  14.8G  -
tank/zfshomes@auto-20200311.1348-1y             165K      -  14.9G  -

# health
zpool status -v tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:02 with 0 errors on Sun Feb  2 03:00:04 2020
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/104a748f-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/10d0c16e-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/115414b8-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/11dd105d-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/12636cff-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/12e6d913-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/13676269-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/13ee7fb2-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/14706a76-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1504c334-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1592a623-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/16250571-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/16b4a392-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/173e4974-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/17cb4efb-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1861c750-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/18ef1edd-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/197d9fc9-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1a09eebb-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1a99e25d-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1b2dd0b5-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1bbaa252-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/1c60422c-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1cedf16e-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1d807f27-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1e0d0a20-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1e9dec87-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1f603e96-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/1ff8b82e-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/2087c210-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/21128be3-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/21ab0c6c-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
            gptid/2241e3e2-211a-11ea-bbd5-b496915e40c8  ONLINE       0     0     0
        logs
          gptid/238a4161-211a-11ea-bbd5-b496915e40c8    ONLINE       0     0     0
        cache
          gptid/23426c62-211a-11ea-bbd5-b496915e40c8    ONLINE       0     0     0
        spares
          gptid/22f36c47-211a-11ea-bbd5-b496915e40c8    AVAIL

errors: No known data errors

SMB

SMB/CIFS (Samba) shares are also created once the homedir is up. NOT!

#         v that plus is the problem
drwxr-xr-x+ 147 root  wheel  147 Apr 27 08:17 /mnt/tank/zfshomes

# either use ACL editor to strip off in v13.1-U2 or

setfacl -bn /mnt/tank/zfshomes/

followed by for example

find /mnt/tank/zfshomes/hmeij/ -type d -exec setfacl -bn {} \;

# also unsupported via shell
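
A quick read-only check that the ACL entries were actually stripped (paths as above):

# the trailing + on the mode string should be gone now
ls -ld /mnt/tank/zfshomes/
getfacl /mnt/tank/zfshomes/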

Note: at user creation a random password is set. Please ask to have it reset in order to access SMB shares. (There should be some self-serve password reset functionality with email confirmation but I cannot find it for now.) Any passwords changed outside of the database will not persist across reboots.

# windows, map network drive
\\hpcstore.wesleyan.edu\username

# credentials, one or all of these may work
WORKGROUP\username
localhost\username
username

Change the $HOME location in /etc/passwd and propagate.
Note: remove access to the old $HOME … chown root:root + chmod o-rwx
END OF USER ACCOUNT SETUP
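
A hedged sketch of those last two steps on the old server (usermod assumed available on sharptail; username is a placeholder):

# point the account at the new home directory
usermod -d /zfshomes/username username
# lock down the old home so stale references fail
chown root:root /home/username
chmod o-rwx /home/username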


NFS

root@hpcstore1[~]# cat /etc/exports

/mnt/tank/zfshomes -maproot="root":"wheel" -network 192.168.0.0/16
/mnt/tank/zfshomes -maproot="root":"wheel" -network 10.10.0.0/16

/mnt/tank/zfshomes-auto-20200310.1348-1y-clone -ro \
  -maproot="root":"wheel" -network 192.168.0.0/16
/mnt/tank/zfshomes-auto-20200310.1348-1y-clone -ro \  
  -maproot="root":"wheel" -network 10.10.0.0/16

Rollback

Rollback is a potentially dangerous operation.

Instead, restore via snapshots. See the Guide.

Snapshots

Snapshots are made easier in new releases … traverse to the hidden directory /zfshomes/.zfs/snapshot and find the snapshot day desired. Content is read-only. Once you cd into a snapshot, an autofs mount is performed.

129.133.52.245:/mnt/tank/zfshomes/.zfs/snapshot/auto-20221126.0200-1y  
251T   77T  175T  31%   /zfshomes/.zfs/snapshot/auto-20221126.0200-1y
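
For example, restoring a single file from a snapshot (hypothetical file name; snapshot name as above):

# snapshot content is read-only; copy what you need back into the live tree
cd /zfshomes/.zfs/snapshot/auto-20221126.0200-1y/username
cp -p precious_file.dat /zfshomes/username/
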
# mountpoints (maproot=root:wheel)
drwxr-xr-x 2 root root 4096 Mar 10 14:08 /mnt/clone0310
drwxr-xr-x 2 root root 4096 Mar  6 14:01 /zfshomes

# /etc/fstab examples (either private network)
#192.168.102.245:/mnt/tank/zfshomes    \
/zfshomes      nfs rw,tcp,soft,intr,bg,vers=3
#10.10.102.245:/mnt/tank/zfshomes      \
/zfshomes      nfs rw,tcp,soft,intr,bg,vers=3

#192.168.102.245:/mnt/tank/zfshomes-auto-20200310.1348-1y-clone  \
/mnt/clone0310      nfs ro,tcp,soft,intr,bg,vers=3
 10.10.102.245:/mnt/tank/zfshomes-auto-20200310.1348-1y-clone    \
/mnt/clone0310      nfs ro,tcp,soft,intr,bg,vers=3

Update 11

See Update 12 for manual update to v12 with Anthony on 03.09.2021

Change the Train to 11.3, then you will apply the update first in the WebUI to the passive controller.

After it reboots, you will fail over to it by rebooting the active controller (the one currently serving the WebUI).

This fails over to the updated 11.3-U2.1 controller (brief interruption).

From there, you would go to System > Update and do the same for the NEW passive controller.

After that, initiate failover back to the primary via the dashboard (brief interruption).

Enable HA, click icon

Apply Pending Updates: upgrades both controllers. Files are downloaded to the active controller and then transferred to the standby controller. The upgrade process starts concurrently on both TrueNAS controllers.

The server responds while HA is disabled. You are instructed to Initiate Failover; do so, it only takes about 5 seconds. Then Continue with pending upgrade … wait 5 minutes or so and watch console activity. THEN log out and log back in once the passive standby is on the new update.

Update takes 15 mins in total.

11.3-U5

HDD

Two types, hard to find in stock.



8T SAS
da0: <HGST HUS728T8TAL4201 B460> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number VAKM5GTL
da0: 1200.000MB/s transfers
da0: Command Queueing enabled
da0: 7630885MB (1953506646 4096 byte sectors)
exxactcorp
https://www.exxactcorp.com/search?q=HUS728T8TAL4201

800G SSD
da2: <WDC WUSTR6480ASS201 B925> Fixed Direct Access SPC-5 SCSI device
da2: Serial Number V6V1XGDA
da2: Command Queueing enabled
da2: 763097MB (1562824368 512 byte sectors)
exxactcorp
https://www.exxactcorp.com/search?q=WUSTR6480ASS201

Logs

From support:

That information is logged via syslog for the opposite controller. For example, to find the information I did here, I looked in the syslog output on the controller that was passive at the time these alerts occurred.

You can look that information up yourself by opening an SSH session to the passive controller, navigating to the /root/syslog directory and examining the files. The “controller_{a,b}” file shows the output for today. Extract the controller_a.0.bz2 file and read the output of the resulting controller_a.0 file to see the output for yesterday. controller_a.1 would contain the output for the day before yesterday, and so on.
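
For example, a sketch on the passive controller (file names follow the pattern described above):

cd /root/syslog
less controller_a              # today's output
bunzip2 -k controller_a.0.bz2  # keep the compressed copy
less controller_a.0            # yesterday's output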

Split Brain

When you end up in an error failover state, try a console shutdown first. If that does not work, cut power to the controllers. Power down the disk array, wait 10 minutes, power up, wait 10 minutes. Slide one controller out an inch or so. Power up the other controller, which will become the active controller. Wait 10 minutes, log in, look around. Slide the other controller back in and restore redundant power to both controllers. Wait until HA is enabled. This is how you get out of a split-brain situation.

fndebug

Manual debug file creation, then ftp to ftp.ixsystems.com

freenas-debug -A
tar czvf fndebug-wesleyan-20201123.tar.gz /var/tmp/fndebug

# next look at bottom of fndebug/SMART/dump.txt
/dev/da10 HGST:7200:HUS728T8TAL4201:VAKM187L C:30 dR:2 dW:2503 dL:55 uR:0 uW:0 SMART Status:OK **!!!**
/dev/da9 HGST:7200:HUS728T8TAL4201:VAKL26ML C:30 dR:3 dW:0 dL:0 uR:0 uW:39 SMART Status:OK **!!!**
# these drives have not failed yet but have write errors, offline/replace, see below

# next look at output of zpool status -x in fndebug/ZFS/dump.txt
# and the error code
# https://illumos.org/msg/ZFS-8000-9P

        NAME                                              STATE     READ WRITE CKSUM
        tank                                              DEGRADED     0     0     0
...
          raidz2-1                                        DEGRADED     0     0     0
            gptid/16250571-211a-11ea-bbd5-b496915e40c8    DEGRADED     0     0 1.09K  too many errors
# look for checksums that have failed like this disk in vdev raidz2-1

# clean up the spare that resilvered (INUSE status)
# then run a clear on the pool. Then we'll try to get another debug.

zpool detach tank gptid/173e4974-211a-11ea-bbd5-b496915e40c8
zpool clear tank

# that brought all drives back online and the vdevs show ONLINE
# then via the GUI added the available drive back as a spare

Replace a failed drive

1) Go into the Storage > Pools page. Click the Gear icon next to the pool and press the "Status" option.
2) Find da4 and press the three-dot options button next to it, then press "Offline".
3) Go to the System > View Enclosure page, select da4 and press "Identify" to light up the drive on the rack.
4) Physically swap the drive on the rack with its replacement.
5) Go back to the Storage > Pool > Status page, bring up the options for the removed drive, 
5a) Select member disk from dropdown, and press "Replace". Success popup, click Close.
The replacement drive may or may not have been given the name "da4".
6) Wait for the drive to finish resilvering before proceeding to replace da3.
6a) Click spinning icon to view progress. Pool status "healthy" while resilvering.
Return the drives in the original box; a return label is provided.

Pool Unhealthy but not Degraded status

No failed disks, no spare deployed, but the pool is unhealthy. The dump.txt files for SMART and ZFS show nothing remarkable, but in the console log we observe that disk da11 has problems. RMA issued. Third replacement disk in a year.

Mar 21 04:03:57 hpcstore2 (da11:mpr0:0:21:0): READ(10). CDB: 28 00 1b b0 80 13 00 00 02 00 
Mar 21 04:03:57 hpcstore2 (da11:mpr0:0:21:0): CAM status: SCSI Status Error
Mar 21 04:03:57 hpcstore2 (da11:mpr0:0:21:0): SCSI status: Check Condition
Mar 21 04:03:57 hpcstore2 (da11:mpr0:0:21:0): SCSI sense: ABORTED COMMAND asc:44,0 (Internal target failure)
Mar 21 04:03:57 hpcstore2 (da11:mpr0:0:21:0): Descriptor 0x80: f7 72
Mar 21 04:03:57 hpcstore2 (da11:mpr0:0:21:0): Error 5, Unretryable error

1) Storage > Pools. Click gear icon next to the pool and press the "Status" option.
2) Find da11 and press the three-dot options button next to it, then press "Offline".
3) System > View Enclosure, find&select da11, press "Identify".
4) Physically swap the drive on the rack with its replacement.
5) Storage > Pool > Status page, bring up three-dot options for the removed drive, 
5a) Select member disk from drop down, and press "Replace". Success popup, click Close.
6) Wait till resilver finishes.

Console hangs

12.7

As for the issue of the “Please Wait” box spinning forever (when creating a new user via the GUI), I would try refreshing the WebUI service and seeing if that fixes the issue. You can do this by running the following commands via an SSH session to the VIP.

service middlewared stop
service middlewared start

“try” is not very convincing … I open another tab, check the user, close the previous tab … also the .[a-z]* csh hidden files are not created - no bother, we use bash

Update 12

System > Update > Select (new train 12.0-STABLE)

Open a console on both controllers (not nested ssh sessions): ssh directly to hpcstore1/2.

Check zpool status and ifconfig ntb0 (the internal heartbeat IPs are 169.254.10.1 and 169.254.10.2).
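
Roughly, on each controller (a sketch of those pre-flight checks):

zpool status tank    # pool healthy before updating
ifconfig ntb0        # internal heartbeat, 169.254.10.1 / 169.254.10.2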

Then download updates on the passive controller; check the version and hactl first, run the check, then update via the CLI:

freenas-update -v -T TrueNAS-12.0-STABLE check

freenas-update -v -T TrueNAS-12.0-STABLE update

…10%…20%…30%…40%…50%…60%…70%…80%…90%…100%

reboot passive

from active ping passive heartbeat IP, when up

check version passive

check boot env beadm list (shows N now, R reboot for 12.0)

on passive tail -f /var/log/messages

now force fail over via GUI (interruptive for 60 seconds)

Anthony did a reboot on active instead, watch log for personality swap

then update the new passive

freenas-update -v -T TrueNAS-12.0-STABLE check

freenas-update -v -T TrueNAS-12.0-STABLE update

then check version, reboot new passive, check version, become new standby

Result: personality switch active vs standby, took 35 mins

In two months: ZFS feature updates patch, not interruptive, do around 04/09/2021
Upgrade done — Henk 2021/06/07 07:40

Storage > Pool > “wheel” > Upgrade Pool

12.0-U4.1

12.0-U5.1

Not created/set, see below
While the underlying issues have been fixed, this setting continues to be disabled by default for additional performance investigation. To manually reactivate persistent L2ARC, log in to the TrueNAS Web Interface, go to System > Tunables, and add a new tunable with these values:

Type = sysctl
Variable = vfs.zfs.l2arc.rebuild_enabled
Value = 1
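
To confirm the current value from a shell afterwards (a read-only check; the tunable name is as listed above):

sysctl vfs.zfs.l2arc.rebuild_enabled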

From support: in an HA environment, this tunable actually delays failover while it ensures the L2ARC is rebuilt. It is the tunable for “persistent L2ARC”: it preloads your L2ARC with what you had before, but slows down imports and failovers. Not super useful if you don't reboot or fail over often.

12.0-U6

12.0-U6.1

12.0-U7

12.0-U8

12.0-U8.1

Update 13

System > Update > Select (new train 13.0-STABLE)

# in shell

 freenas-update -v -T TrueNAS-13.0-STABLE check

 freenas-update -v -T TrueNAS-13.0-STABLE update

…10%…20%…30%…40%…50%…60%…70%…80%…90%…100% 

beadm list
# (Active N = 12.0-U8.1 and R = 13.0-U3.1)

once both have finished, reboot passive, web gui log back in

once passive back up, reboot active

web gui log back into new active, wait for HA to be enabled

debug plus screenshots for snapshot visibility, which is visible (working in 13.0-U3.1) but the database setting is still invisible

took less than 35 mins

bstop 0
bresume 0
# manual, one at a time
scontrol hold joblist 
# one at a time
# for i in `squeue | grep '   R   ' | awk '{print $1}'`; do echo $i; done
# then grep '   S   '
scontrol suspend joblist 
scontrol resume  joblist 
scontrol release joblist
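
A sketch expanding the commented loop above to suspend and later resume all running Slurm jobs (assumes the squeue column spacing shown in the comment):

# suspend every running job, one at a time
for i in `squeue | grep '   R   ' | awk '{print $1}'`; do scontrol suspend $i; done
# afterwards, resume every suspended job
for i in `squeue | grep '   S   ' | awk '{print $1}'`; do scontrol resume $i; done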

13.0-U4 04/03/2023

13.0-U5.1 07/07/2023

13.0-U5.3 08/25/2023

Next support ticket: Ask if you ever need to reboot the disk shelves? Full power off?

Hi, I'm archiving content from my TrueNAS appliance to another platform, then deleting the files migrated. I'm observing directories like this: 7.5 million files in 990 GB, or 15 million files in 7 TB. Should I be concerned that the disk shelves have never been cold rebooted? Like XFS replaying the log journal for a clean mount? My HA nodes reboot on upgrade but I realize the disk arrays keep running, always. Please advise, thanks.

Tier 1 Support: The 2 ES24 shelves do not need to be rebooted as they just house the drives themselves and provide power to them. There shouldn't be any concern with these being on at all times.

Rsync stats (after decompressing)

sod1/
Number of files: 18,691,764
Total transferred file size: 13,072,322,138,140 bytes
arnt_rosetta/
Number of files: 8,825,674
Total transferred file size: 1,798,675,349,128 bytes

13.0-U6.1 12/12/2023

