Clarify what replicator does when there is no overlapping snapshot on the sending/receiving system
Old snapshots are being purged from the receiving system when a replication task syncs, even with "Delete stale snapshots on remote system" unchecked.
#3 Updated by Kris Moore almost 4 years ago
- Priority changed from No priority to Nice to have
- Target version set to 11.2-BETA1
So we did some digging on this one. It's acted this way for many years now. The issue is that when you don't have matching snapshot history on the remote side, you can't just send new snapshot data to the remote box without removing the old snapshots first. We will investigate and see if we can do something to "warn" the user before this happens.
#9 Updated by Bonnie Follweiler over 3 years ago
- Private changed from No to Yes
I can confirm that this is still happening in FreeNAS-11.0-U4 (54848d13b).
I can give access to the replication systems if you need them.
They are Sending System 10.231.1.76 root/abcd1234 Receiving System 10.20.20.157 root/abcd1234
The replication task has been running since 9/25 but the snapshots are being deleted, in both systems, in spite of the "Delete stale snapshots on remote system:" not being checked
#10 Updated by Sean Fagan over 3 years ago
- Status changed from 46 to Closed: Behaves correctly
- Assignee changed from Sean Fagan to Bonnie Follweiler
I've got no idea why it's doing that.
Remote deletion is accomplished by using the '-p' option on the zfs send, and from looking at the logs it's not being sent. Ah:
debug.log:Sep 27 09:00:13 bonniemini /autorepl.py: [tools.autorepl:521] Deleting 37 snapshot(s) in pull side because not a single matching snapshot was found
The comment for that is:
# No matching snapshot(s) exist. If there is any snapshots on the
# target side, destroy all existing snapshots so we can proceed.
This appears to behave correctly. Albeit confusingly.
This may need to be documented -- you can't send a full dataset if there are already snapshots on the target side, and if there's no matching snapshot on both sides, it has to send a full stream rather than an incremental one.
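To make the decision described above concrete, here is a minimal sketch in Python. It is not the actual autorepl.py code: the function name `plan_replication`, its return shape, and the snapshot names are hypothetical, and snapshot lists are assumed to be ordered oldest to newest.

```python
def plan_replication(local_snaps, remote_snaps):
    """Decide how to replicate, given snapshot names on each side.

    Hypothetical helper, not part of FreeNAS. Both lists are assumed
    to be ordered oldest -> newest.
    Returns (mode, base_snapshot, remote_snapshots_to_destroy).
    """
    remote_set = set(remote_snaps)
    # Look for the newest snapshot that exists on both sides.
    for name in reversed(local_snaps):
        if name in remote_set:
            # A common base exists: send an incremental, destroy nothing.
            return ("incremental", name, [])
    # No common snapshot: only a full send is possible, and every
    # existing snapshot on the target has to be destroyed first.
    return ("full", None, list(remote_snaps))


# Sender and receiver share "snap-b": incremental from that base.
print(plan_replication(["snap-a", "snap-b", "snap-c"], ["snap-a", "snap-b"]))

# Nothing in common: full send, and the receiver's snapshots are destroyed.
print(plan_replication(["snap-today"], ["snap-yday-1", "snap-yday-2"]))
```

Note that the second case destroys the receiver's snapshots regardless of any "delete stale snapshots" setting, which matches the log line quoted above.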
#11 Updated by Bonnie Follweiler over 3 years ago
- Status changed from Closed: Behaves correctly to 15
- Assignee changed from Bonnie Follweiler to Kris Moore
So Sean explained to me why this happened. There was no overlapping snapshot on the sending/receiving system. I had set the snapshot task to run every 15 minutes with a lifetime of 4 hours (Monday through Friday). In the Replication task I didn't check the "Delete stale snapshots on remote system:" box, thinking that the receiving computer would be my archive.
It has been running for three days, but the receiving computer only has today's snapshots. This is because, when the first replication task fired off at 9:00 am, there were no matching snapshots on the two systems. The sending system had today's and the receiving system had yesterday's. When the replicator didn't find an overlapping snapshot, it wiped out the snapshots on the receiving system and sent over the snapshots the sending system had for today.
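The timeline above can be reproduced with a small simulation. This is only a sketch: the 15-minute interval and 4-hour lifetime come from this comment, but the 09:00 to 18:00 snapshot window, the dates, and all function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=15)  # snapshot frequency (from the comment)
LIFETIME = timedelta(hours=4)     # snapshot lifetime on the sender (from the comment)


def creation_times(day, begin_hour=9, end_hour=18):
    """Snapshot creation times for one day; the hours are an assumed window."""
    t = day.replace(hour=begin_hour, minute=0)
    end = day.replace(hour=end_hour, minute=0)
    times = []
    while t <= end:
        times.append(t)
        t += INTERVAL
    return times


def alive_on_sender(created, now):
    """Snapshots already taken and not yet expired on the sender at `now`."""
    return [t for t in created if t <= now and now - t < LIFETIME]


day1 = datetime(2017, 9, 25)      # arbitrary example date
day2 = day1 + timedelta(days=1)
first_run = day2.replace(hour=9)  # first replication of the second day

# Receiver: assumed to hold everything replicated on day 1, never pruned.
receiver = creation_times(day1)
# Sender: day-1 snapshots expired overnight, only the 09:00 one remains.
sender = alive_on_sender(creation_times(day1) + creation_times(day2), first_run)

common = set(sender) & set(receiver)
print(len(receiver), len(sender), len(common))
```

Under these assumptions the sender holds one snapshot, the receiver holds 37, and the two sets share nothing, so the first replication of the day has no base for an incremental send. Any gap between snapshot windows longer than the lifetime produces the same result.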
Here are my questions:
1) Should this be documented as a warning in the docs? Although I'm sure it's an "edge case", if I did it, I'm sure someone else out there has set it up this way, or will, expecting the receiving system to retain all the snapshots.
2) Alternatively, can we have a discussion about whether the "Delete stale snapshots on remote system:" checkbox functions in the way most users would understand, or whether it should be reworded/removed?
#12 Updated by Kris Moore over 3 years ago
- Assignee changed from Kris Moore to Dru Lavigne
- QA Status Test Passes FreeNAS added
- QA Status deleted (Test Fails FreeNAS)
I agree with Sean here, this is "just how it works", but perhaps we can doc it better? Dru, do you think the docs are clear enough in this instance, or should we insert some additional verbiage to point users at? I'm wondering if, at the top of the Replication section of the docs, we could have a breakout "Notes" section which just details how ZFS replication works in practice with regard to incrementals, and how the remote needs to have the original snapshots to do incrementals from.
#16 Updated by Dru Lavigne over 3 years ago
- Subject changed from Old snapshots are deleted from the recieving system even with the "delete stale snapshots on remote system" unchecked to Clarify what replicator does when there is no overlapping snapshot on the sending/receiving system
- Private changed from Yes to No