cluster:233

Differences

This shows you the differences between two versions of the page.

cluster:233 [2026/06/18 19:58] – created hmeij07
cluster:233 [2026/06/18 20:06] (current) hmeij07
Line 1: Line 1:
 +\\
 +**[[cluster:0|Back]]**
 +
 +===== M40c2 to X20c2 =====
 +
 +Excellent support from Carlo at TrueNAS regarding an aborted replication process. Controller 2 from M40 to X20 TrueNAS appliances.
 +
 However, the earlier process output indicates that the cancelled replication left behind a zfs send / ssh zfs recv pipeline. This is why tank/zfshomes@auto-20260113.0200-1y is still reporting as busy even though it is not mounted. Please keep the replication task disabled while we clean this up.
  
 First, confirm whether those send/receive processes are still present:
 +<code>
 ps auxww | egrep 'zfs send|zfs recv|129.133.52.245|auto-20260113|zettarepl' | grep -v egrep
 +</code>
 If they are still present, please terminate them gracefully first using the current PIDs shown in the output:
 +<code>
 +# there were many more processes...
 kill -TERM <PID1> <PID2> <PID3>
 +</code>
 Wait around 30 seconds, then check again:
 +<code>
 ps auxww | egrep 'zfs send|zfs recv|129.133.52.245|auto-20260113|zettarepl' | grep -v egrep
 +</code>
 Only if the processes do not exit after kill -TERM, use kill -KILL:
 +<code>
 kill -KILL <PID1> <PID2> <PID3>
 +</code>
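The TERM, wait, re-check, KILL sequence above can be wrapped in a small helper. This is only a sketch: the function name `term_then_kill` and its timeout argument are hypothetical, not part of TrueNAS or of the support instructions. It polls with `kill -0`, which sends no signal and only tests whether a PID still exists.

```shell
#!/bin/sh
# term_then_kill TIMEOUT PID...
# Hypothetical helper: send TERM, wait up to TIMEOUT seconds for every
# PID to exit, then send KILL only to the ones that are still alive.
term_then_kill() {
    timeout=$1
    shift
    # Polite request first; ignore errors for PIDs that are already gone.
    kill -TERM "$@" 2>/dev/null || true
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        alive=0
        for pid in "$@"; do
            # kill -0 delivers no signal; it only checks that the PID exists.
            if kill -0 "$pid" 2>/dev/null; then
                alive=1
            fi
        done
        if [ "$alive" -eq 0 ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Last resort for whatever survived the grace period.
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
    return 0
}
```

With the 30 second grace period suggested above, a call would look like `term_then_kill 30 <PID1> <PID2> <PID3>`, using the current PIDs from ps.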
  
 Please use the current PIDs shown by ps, and separate the PIDs with spaces, not commas. Also, please avoid killing the main middlewared/zettarepl Python service unless specifically advised.
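As a rough illustration of that rule, the filter below reads `ps auxww` output on stdin and prints a space-separated PID list for zfs send / zfs recv command lines only, skipping any line that mentions middlewared or zettarepl. The function name `stale_zfs_pids` is hypothetical, and the sketch assumes the PID is the second column, as it is in `ps aux` style output.

```shell
#!/bin/sh
# stale_zfs_pids: hypothetical filter, reads `ps auxww` output on stdin.
# Prints the PIDs (column 2) of zfs send / zfs recv command lines,
# separated by spaces, and never selects the middleware services.
stale_zfs_pids() {
    awk '
        /middlewared|zettarepl/   { next }   # leave the services alone
        /zfs (send|recv|receive)/ { printf "%s%s", sep, $2; sep = " " }
        END { if (sep != "") print "" }
    '
}
```

Running `ps auxww | stale_zfs_pids` only prints candidates. The list should still be compared by eye against the ps output before any of those PIDs is passed to kill.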
  
 Once the stale send/receive processes are gone, retry deleting the busy snapshot:
 +<code>
 zfs destroy -v tank/zfshomes@auto-20260113.0200-1y
 +</code>
  
 If it still reports as busy, please check for holds and dependent references:
  
 +<code>
 +# there were none...
 zfs holds -r tank/zfshomes@auto-20260113.0200-1y
 zfs get clones tank/zfshomes@auto-20260113.0200-1y
 fstat | grep auto-20260113
 +</code>
 +
 Please send us the output before releasing any holds, so we can confirm whether the hold is safe to remove.
  
Line 29: Line 48:
 Tasks > Periodic Snapshot Tasks > edit the tank/zfshomes task > Snapshot Lifetime > 30 DAYS > Save
  
-The naming schema currently ends in -1y, but that is only part of the snapshot name. The actual local retention is controlled by the Snapshot Lifetime field. You may optionally update the naming suffix later to -30d for clarity, but the important setting is the Snapshot Lifetime.
 +For starting the zfshomes-c2toc2 replication over from scratch, please first confirm that the destination dataset tank/zfshomes on the target system contains no data or snapshots that need to be preserved.
  
-For starting the zfshomes-c2toc2 replication over from scratch, please first confirm that the destination dataset tank/zfshomes on the target system contains no data or snapshots that need to be preserved. Starting from scratch or overwriting the destination is destructive.
 +Starting from scratch or overwriting the destination is destructive.
  
 Because the previous replication was cancelled mid-transfer and used resumable receive, please also check the destination system for a receive resume token:
 +<code> 
 +# there were tokens....
 zfs get receive_resume_token tank/zfshomes
-If a token is present, clear the interrupted receive state on the destination system with:
 +</code>
  
 +If a token is present, clear the interrupted receive state on the destination system with:
 +<code>
 zfs receive -A tank/zfshomes
 +</code>
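The check-then-clear step can be scripted as a minimal sketch. It assumes the token is read in scripted mode with `zfs get -H -o value`, which prints `-` when the property is unset. The helper name `has_resume_token` is hypothetical.

```shell
#!/bin/sh
# has_resume_token VALUE
# Hypothetical helper: succeeds when VALUE, as printed by
#   zfs get -H -o value receive_resume_token <dataset>
# is an actual token rather than the "-" placeholder for an unset property.
has_resume_token() {
    case "$1" in
        ''|-) return 1 ;;   # empty or "-": nothing to clear
        *)    return 0 ;;   # anything else is treated as a token
    esac
}

# Sketch of use on the destination system (not executed here):
#   token=$(zfs get -H -o value receive_resume_token tank/zfshomes)
#   if has_resume_token "$token"; then
#       zfs receive -A tank/zfshomes
#   fi
```

Keeping the decision in a small function means the destructive `zfs receive -A` runs only when a token was actually reported.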
 +
 After that, if the destination content is disposable, you can either clear the existing destination dataset/snapshots manually or enable “Replication from scratch” in the zfshomes-c2toc2 replication task before running it again. Please only proceed with that option after confirming the target content can be overwritten.
  
 Recommended order:
  
-Keep the replication task disabled.
-Clear the stale zfs send / ssh zfs recv processes.
-Delete the busy local snapshot.
-Change the Periodic Snapshot Task lifetime from 4 months to 30 days.
-Check and clear the destination receive resume token if present.
-Confirm the destination dataset can be overwritten.
-Re-enable and rerun the replication task from scratch.
 +  - Keep the replication task disabled.
 +  - Clear the stale zfs send / ssh zfs recv processes.
 +  - Delete the busy local snapshot.
 +  - Change the Periodic Snapshot Task lifetime from 4 months to 30 days.
 +  - Check and clear the destination receive resume token if present.
 +  - Confirm the destination dataset can be overwritten.
 +  - Re-enable and rerun the replication task from scratch.
 + 
 +\\ 
 +**[[cluster:0|Back]]** 
 + 
cluster/233.1781812687.txt.gz · Last modified: by hmeij07