Bug #23989

Clarify what replicator does when there is no overlapping snapshot on the sending/receiving system

Added by Bonnie Follweiler over 1 year ago. Updated 12 months ago.

Status: Resolved
Priority: Nice to have
Assignee: Warren Block
Category: Middleware
Target version:
Seen in:
Sprint:
Severity: New
Backlog Priority:
Reason for Closing:
Reason for Blocked:
Needs QA: No
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

Old snapshots are purged from the receiving system when a replication task syncs, even with "Delete stale snapshots on remote system" unchecked.

Screen Shot 2017-05-16 at 10.39.12 AM.png (333 KB), receiving system, Bonnie Follweiler, 05/16/2017 08:05 AM
Screen Shot 2017-05-16 at 10.39.00 AM.png (427 KB), receiving system snapshots, Bonnie Follweiler, 05/16/2017 08:05 AM
Screen Shot 2017-05-16 at 10.39.22 AM.png (315 KB), receiving system, Bonnie Follweiler, 05/16/2017 08:05 AM
Screen Shot 2017-05-16 at 10.42.14 AM.png (332 KB), sending system, Bonnie Follweiler, 05/16/2017 08:07 AM
Screen Shot 2017-05-16 at 10.42.36 AM.png (349 KB), sending system, Bonnie Follweiler, 05/16/2017 08:07 AM
Screen Shot 2017-05-16 at 10.43.07 AM.png (937 KB), sending system, Bonnie Follweiler, 05/16/2017 08:07 AM
Screen Shot 2017-05-16 at 11.09.01 AM.png (342 KB), receiving system, Bonnie Follweiler, 05/16/2017 08:09 AM

Related issues

Related to FreeNAS - Bug #26269: Replication target snapshots are all silently deleted if none are related to incoming snapshots (Not Started)

Associated revisions

Revision 2c944edc (diff)
Added by Warren Block 12 months ago

Add a warning about non-overlapping snapshot deletion

Ticket: #23989

History

#1 Updated by Bonnie Follweiler over 1 year ago

Confirmed when upgrading to FreeNAS-11.0-RC2 (869046407) as well.

#2 Updated by Vaibhav Chauhan over 1 year ago

Can this be considered a FreeNAS-11.0-RC2 blocker?

#3 Updated by Kris Moore over 1 year ago

  • Priority changed from No priority to Nice to have
  • Target version set to 11.2-BETA1

So we did some digging on this one. It has acted this way for many years. The issue is that when there is no matching snapshot history on the remote side, new snapshot data can't be sent to the remote box without first removing the old snapshots there. We will investigate and see if we can do something to "warn" the user before this happens.

#4 Updated by William Grzybowski over 1 year ago

  • Status changed from Unscreened to Screened

#5 Updated by William Grzybowski about 1 year ago

  • Status changed from Screened to Unscreened
  • Assignee changed from William Grzybowski to Sean Fagan

Another replication ticket that I missed in the last sweep.

#6 Updated by Sean Fagan about 1 year ago

  • Status changed from Unscreened to 15

Is there a recursive snapshot setup on the destination system?

#7 Updated by Dru Lavigne about 1 year ago

  • Status changed from 15 to 46

Sean: should this also be documented around (like #23674) or are there plans to correct this behavior for 11.2?

#8 Updated by Bonnie Follweiler about 1 year ago

  • QA Status Test Fails FreeNAS added
  • QA Status deleted (Not Tested)

Sean, there are no snapshot tasks set up on the destination system.

#9 Updated by Bonnie Follweiler about 1 year ago

  • Private changed from No to Yes

I can confirm that this is still happening in FreeNAS-11.0-U4 (54848d13b).
I can give access to the replication systems if you need them.
They are: Sending System 10.231.1.76 root/abcd1234, Receiving System 10.20.20.157 root/abcd1234.
The replication task has been running since 9/25, but the snapshots are being deleted on both systems even though "Delete stale snapshots on remote system:" is not checked.

#10 Updated by Sean Fagan about 1 year ago

  • Status changed from 46 to Closed: Behaves correctly
  • Assignee changed from Sean Fagan to Bonnie Follweiler

I've got no idea why it's doing that.

Remote deletion is accomplished by using the '-p' option on the zfs send, and from looking at the logs it's not being sent. Ah:

debug.log:Sep 27 09:00:13 bonniemini /autorepl.py: [tools.autorepl:521] Deleting 37 snapshot(s) in pull side because not a single matching snapshot was found

The comment for that is:
            # No matching snapshot(s) exist.  If there is any snapshots on the
            # target side, destroy all existing snapshots so we can proceed.

This appears to behave correctly, albeit confusingly.

This may need to be documented -- you can't send a full dataset if there are snapshots on the target, and if there's no matching snapshot on both sides, it has to send a full stream rather than an incremental one.
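
The decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in autorepl.py: the names Plan and plan_replication are hypothetical, and snapshots are matched by name alone to keep the example short.

```python
from typing import List, NamedTuple, Optional


class Plan(NamedTuple):
    mode: str                     # "incremental" or "full"
    base: Optional[str]           # shared snapshot to send from, if any
    destroy_on_target: List[str]  # target snapshots removed before a full send


def plan_replication(source: List[str], target: List[str]) -> Plan:
    """Both lists hold snapshot names ordered oldest to newest."""
    on_target = set(target)
    # Newest snapshot that exists on both sides, if there is one.
    common = next((s for s in reversed(source) if s in on_target), None)
    if common is not None:
        # A shared snapshot exists, so an incremental send can start from it.
        return Plan("incremental", common, [])
    # No shared snapshot: only a full send is possible, and every
    # snapshot already on the target is destroyed first.
    return Plan("full", None, list(target))
```

With source ["d", "e"] and target ["a", "b"], the plan is a full send that destroys both target snapshots, regardless of any "delete stale snapshots" setting, because nothing on the target can serve as an incremental base.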

Wheeeeeeeeeeee.

#11 Updated by Bonnie Follweiler about 1 year ago

  • Status changed from Closed: Behaves correctly to 15
  • Assignee changed from Bonnie Follweiler to Kris Moore

So Sean explained to me why this happened. There was no overlapping snapshot on the sending/receiving system. I had set the snapshot to every 15 minutes with a lifetime of 4 hours (Monday thru Friday). In the Replication task I didn't check the "Delete stale snapshots on remote system:" thinking that the receiving computer would be my archive.

It has been running for three days but the receiving computer only has today's snapshots. This is because, when the first replication task fired off at 9:00 am, there were no matching snapshots on the two systems. The sending system had today's and the receiving system had yesterday's. When the replicator didn't find an overlapping snapshot, it wiped out the snapshots on the receiving system and sent over the snapshots it had for today.
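
The timeline above can be checked with a little date arithmetic. The sketch below assumes snapshots are taken from 09:00 to 18:00; the ticket only gives the 15-minute interval, the 4-hour lifetime, and the 9:00 am replication. The function names and the chosen dates are made up for illustration.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=15)  # snapshot period from the ticket
LIFETIME = timedelta(hours=4)     # snapshot lifetime from the ticket


def snapshot_times(days, begin_hour, end_hour):
    """All snapshot times for the given days (the daily window is an assumption)."""
    times = []
    for day in days:
        t = day.replace(hour=begin_hour, minute=0)
        end = day.replace(hour=end_hour, minute=0)
        while t <= end:
            times.append(t)
            t += INTERVAL
    return times


def alive_at(times, now):
    """Snapshots already taken at `now` that have not yet expired on the sender."""
    return [t for t in times if t <= now and now - t < LIFETIME]


yesterday = datetime(2017, 9, 26)
today = datetime(2017, 9, 27)
times = snapshot_times([yesterday, today], 9, 18)

# The receiver keeps everything it was sent yesterday.
receiver = {t for t in times if t.date() == yesterday.date()}
# The sender at 09:00 today only holds snapshots younger than 4 hours.
sender = set(alive_at(times, today.replace(hour=9)))

overlap = sender & receiver
print(len(sender), len(overlap))
```

Under these assumptions the overlap is empty: every snapshot from yesterday expired on the sender overnight, so the first replication of the day has nothing in common with the receiver.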

Here are my questions: 1) Should this be documented as a warning in the docs? Although I'm sure it's an "edge case", if I did it, I'm sure someone else out there has set it up this way, or will, expecting the receiving system to retain all the snapshots. Or 2) Can we have a discussion about whether the "Delete stale snapshots on remote system:" checkbox functions the way most users would understand, or whether it should be reworded or removed?

#12 Updated by Kris Moore about 1 year ago

  • Assignee changed from Kris Moore to Dru Lavigne
  • QA Status Test Passes FreeNAS added
  • QA Status deleted (Test Fails FreeNAS)

I agree with Sean here, this is "just how it works", but perhaps we can doc it better? Dru, do you think the docs are clear enough in this instance, or should we insert some additional verbiage to point users at? I'm wondering if, at the top of the Replication section of the docs, we could have a breakout "Notes" section that details how ZFS replication works in practice with regard to incrementals, and that the remote needs to have the original snapshots to do incrementals from.

#13 Updated by Dru Lavigne about 1 year ago

  • Status changed from 15 to Unscreened
  • Assignee changed from Dru Lavigne to Warren Block

This is similar to #23674 which Warren clarified earlier today. Passing to him to see if any more doc gems can be added from this thread.

#14 Updated by Dru Lavigne about 1 year ago

  • File deleted (debug-freenas-20170516080116.tgz)

#15 Updated by Dru Lavigne about 1 year ago

  • File deleted (debug-freenas-20170516080323.tgz)

#16 Updated by Dru Lavigne about 1 year ago

  • Subject changed from Old snapshots are deleted from the recieving system even with the "delete stale snapshots on remote system" unchecked to Clarify what replicator does when there is no overlapping snapshot on the sending/receiving system
  • Private changed from Yes to No

#17 Updated by Dru Lavigne about 1 year ago

  • Target version changed from 11.2-BETA1 to 11.1-BETA1

#18 Updated by Dru Lavigne about 1 year ago

  • Target version changed from 11.1-BETA1 to 11.1

#19 Updated by Warren Block 12 months ago

  • Related to Bug #26269: Replication target snapshots are all silently deleted if none are related to incoming snapshots added

#20 Updated by Warren Block 12 months ago

  • Status changed from Unscreened to Resolved
  • Target version changed from 11.1 to 11.1-BETA1

#21 Updated by Bonnie Follweiler 12 months ago

  • Needs QA changed from Yes to No
  • QA Status deleted (Test Passes FreeNAS)

#22 Updated by Bonnie Follweiler 12 months ago

  • QA Status Test Passes FreeNAS added
