  
The target dataset on the receiving system is automatically created in read-only mode to protect the data. To mount or browse the data on the receiving system, create a clone of the snapshot and use the clone. We set IGNORE so it should be read/write on M40. Enable SSH on **target**.
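The clone step can be done from the target's shell; a minimal sketch, assuming a pool named ''tank'' and a dataset ''zfshomes'' (the pool, dataset, and snapshot names here are examples, not the actual layout):

```shell
# List replicated snapshots on the target (pool/dataset names are examples)
zfs list -t snapshot -r tank/zfshomes

# Clone one snapshot into a browsable dataset; clones are writable
zfs clone tank/zfshomes@auto-2024-10-16_00-00 tank/zfshomes-clone

# Inspect the data, then destroy the clone when finished
ls /mnt/tank/zfshomes-clone
zfs destroy tank/zfshomes-clone
```

The clone costs no extra space up front since it shares blocks with the snapshot.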

On **source** System > SSH Keypairs

  * name replication
  * generate key pair
  * save
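The GUI's "generate key pair" step is equivalent to generating an ordinary SSH keypair; a minimal sketch from a shell (the file name ''replication'' is just an example):

```shell
# Generate an RSA keypair like the GUI "generate key pair" step
# (file name "replication" is only an example)
ssh-keygen -t rsa -b 2048 -N "" -f ./replication

# Public half goes to the target's authorized_keys;
# the private half is what the GUI stores in the keypair entry
cat ./replication.pub
```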
  
On **source** System > SSH Connections > Add

  * name replication
  * host IP or FQDN of target (select Manual)
  * username root
  * discover remote ssh key
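The "discover remote ssh key" step fetches the target's host key; the same thing can be checked from a shell with ''ssh-keyscan'', and the connection tested with the generated key (the IP and key path below are examples):

```shell
# Fetch and inspect the target's SSH host key (IP is an example)
ssh-keyscan -t rsa 129.133.52.245

# Verify the keypair actually authenticates as root on the target
ssh -i ./replication root@129.133.52.245 true && echo "connection ok"
```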
  
On **source** Tasks > Replication Tasks
  
You could kick this off with Run Now in the Edit menu of the task.

**Session with Marc**

  * replication and snapshots must be disabled on target
  * on the target host, snapshots can be the same as on source
  * but it is possible to set, say, 2 weeks via Custom
    * slowly throttle this back: 180 to 120 to 60 to ?
  * use IP rather than hostname!! in URL https://129.133.52.245

**Session with Barak**

  * postpone update till spring break
  * create "zfshomes-c1toc1 Key", paste in the private key
  * create "zfshomes-c1toc1" ssh connection (use IP above, check discover status)
  * switch the current replication task to use the new ssh connection
  * works! when running, it picks up the last 8 missed snapshots
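Whether the task really caught up can be confirmed by listing the newest snapshots on the target over the new connection; a sketch, with example host and dataset names:

```shell
# Show the 8 most recent snapshots on the target dataset (names are examples)
ssh root@129.133.52.245 \
  zfs list -t snapshot -o name,creation -s creation -r tank/zfshomes | tail -8
```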
 +
  
  
  
</code>

==== failover and replication ====

Testing failover and assessing that replication continues (x20ha PUSH to m40ha; make sure both controllers have the authorized_keys for hpcstore1 - add hpcstore2).

  * Initiated failover from m40ha controller 2, an error window pops up
  * "Node can not be reached. Node CARPS states do not agree"

Yet my web browser shows hpcm40eth0c2, and a minute later hpcm40eth0c1 shows up and HA is enabled.

Replication of snapshots continues OK after failover, which was the point of the test.

  * Initiated failover again, now back to controller 1
  * Controller 2 shows up a minute later (it reboots)
  * No error window this time
  * Time is wrong on controller 2 ...
  * Load IPMI, go to Configuration > Date and Time, enable NTP, refresh
    * that fixes the time
    * but the button switches back to "disabled"

Check replication again. Do this one more time before production.


12.x docs\\
Failover is not allowed if both TrueNAS controllers have the same CARP state. A critical Alert (page 303) is generated and the HA icon shows HA Unavailable.

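A quick way to sanity-check a controller's clock after a failover, from the TrueNAS shell (ntpd ships with TrueNAS CORE; this is a sketch for spot-checking, not the IPMI fix itself):

```shell
# Compare the OS clock against NTP peers after a failover
date
ntpq -p   # the offset column should be small once NTP is enabled
```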
  
==== CAs & Certs ====
    * copy in public
  * don't click http -> https so you don't get locked out
  * when the cert expires on you, just access https://
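When the UI certificate's expiry date is in question, openssl can read it straight off the https port; a sketch using the IP noted above:

```shell
# Print the expiry date of the certificate served by the web UI
echo | openssl s_client -connect 129.133.52.245:443 2>/dev/null \
  | openssl x509 -noout -enddate
```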

==== Update 13 ====

13.0-U6.3 11/22/2024

  * apply pending update
  * 10 mins, standby on new update
  * initiate failover on standby; 1 min
  * look for the HA icon in the top bar, moving back and forth
  * Pending update > Continue (3 mins in) to finish the upgrade
  * wait for HA to be enabled (about 10 mins in)
  * check versions

13.0-U6.3 11/22/2024

  * 10 mins download and standby reboot
  * 1 min failover
  * 8 1/2 min standby reboot
  * check HA and versions

13.0-U6.4 01/22/2025

  * no problems

13.0-U6.7 03/10/2025

  * no problems
  * standby reboot took a little longer, about 12 mins.
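After each pass, the running release can be confirmed from each controller's shell; a sketch (''midclt'' is the TrueNAS middleware client, present on the appliance):

```shell
# Release string of the running controller
cat /etc/version

# Fuller picture via the middleware (version, uptime, hostname)
midclt call system.info
```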
  
\\
**[[cluster:0|Back]]**
cluster/226.1729092476.txt.gz · Last modified: 2024/10/16 15:27 by hmeij07