Bug #17826
Errors reported during replication
Description
I have 18 LUNs which are snapshotted and replicated to another machine. Previously I used 9.10.1 and everything worked fine. Yesterday I updated to U1; replication still works, but it reports critical errors until all LUNs have been replicated. For example, when replication starts at 14:00 I receive 18 error mails, and then fewer and fewer error mails as the remaining LUNs are replicated.
History
#1
Updated by Jan Brońka over 4 years ago
- File debug-PROD-1S-20160928144816.txz added
#2
Updated by Bonnie Follweiler over 4 years ago
- Assignee set to William Grzybowski
#3
Updated by William Grzybowski over 4 years ago
- Status changed from Unscreened to Screened
Can you give an example of the emails you have been receiving?
#4
Updated by William Grzybowski over 4 years ago
- Status changed from Screened to 15
#5
Updated by Jan Brońka over 4 years ago
Sure,
Replication storage1/lun4 -> 1.100.0.2:backup3 failed: Failed: storage1/lun4 (auto-20160928.1400-4d->auto-20160928.1800-4d)
Replication storage1/lun10 -> 1.100.0.2:backup3 failed: Failed: storage1/lun10 (auto-20160928.1000-4d->auto-20160928.1400-4d)
Replication storage1/lun18 -> 1.100.0.2:backup1 failed: Failed: storage1/lun18 (auto-20160928.1400-4d->auto-20160928.1800-4d)
Replication storage1/lun3 -> 1.100.0.2:backup3 failed: Failed: storage1/lun3 (auto-20160928.1000-4d->auto-20160928.1400-4d)
Replication storage1/lun17 -> 1.100.0.2:backup1 failed: Failed: storage1/lun17 (auto-20160928.1400-4d->auto-20160928.1800-4d)
Replication storage1/lun2 -> 1.100.0.2:backup3 failed: Failed: storage1/lun2 (auto-20160928.1000-4d->auto-20160928.1400-4d)
Replication storage1/lun14 -> 1.100.0.2:backup3 failed: Failed: storage1/lun14 (auto-20160928.1000-4d->auto-20160928.1400-4d)
and
Hello,
The replication failed for the local ZFS storage1/lun5 while attempting to
apply incremental send of snapshot auto-20160928.1400-4d -> auto-20160928.1800-4d to 1.100.0.2
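The one-line alerts quoted above follow a regular pattern, so they can be tallied per dataset and per target when many mails arrive at once. The following is only a minimal sketch, assuming the alert text has exactly the shape shown in this comment; the function names `parse_alerts` and `group_by_target` are hypothetical helpers and not part of FreeNAS.

```python
import re
from collections import defaultdict

# Matches alert lines of the shape quoted in this ticket, e.g.
# "Replication storage1/lun4 -> 1.100.0.2:backup3 failed: Failed: storage1/lun4 (auto-A->auto-B)"
# Assumption: the real mails always use this exact layout.
ALERT_RE = re.compile(
    r"^Replication (?P<dataset>\S+) -> (?P<host>[^:\s]+):(?P<target>\S+) failed: "
    r"Failed: \S+ \((?P<old>[^)]+?)->(?P<new>[^)]+)\)$"
)


def parse_alerts(text):
    """Return one dict per alert line; lines that do not match are skipped."""
    alerts = []
    for line in text.splitlines():
        match = ALERT_RE.match(line.strip())
        if match:
            alerts.append(match.groupdict())
    return alerts


def group_by_target(alerts):
    """Map each remote target (e.g. 'backup3') to the datasets that failed."""
    grouped = defaultdict(list)
    for alert in alerts:
        grouped[alert["target"]].append(alert["dataset"])
    return dict(grouped)


# Two of the lines quoted in this comment, used as sample input.
SAMPLE = """\
Replication storage1/lun4 -> 1.100.0.2:backup3 failed: Failed: storage1/lun4 (auto-20160928.1400-4d->auto-20160928.1800-4d)
Replication storage1/lun18 -> 1.100.0.2:backup1 failed: Failed: storage1/lun18 (auto-20160928.1400-4d->auto-20160928.1800-4d)
"""

if __name__ == "__main__":
    for target, datasets in group_by_target(parse_alerts(SAMPLE)).items():
        print(target, datasets)
```

Grouping by target makes it easier to see whether the failures cluster on one backup pool or are spread across all of them.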
#6
Updated by William Grzybowski over 4 years ago
- Status changed from 15 to Investigation
#7
Updated by William Grzybowski over 4 years ago
- Priority changed from No priority to Important
- Target version set to 9.10.1-U2
- Seen in changed from Unspecified to 9.10.1-U1
#8
Updated by William Grzybowski over 4 years ago
So this happens everyday?
#9
Updated by Jan Brońka over 4 years ago
- Priority changed from Important to No priority
- Target version deleted (9.10.1-U2)
- Seen in changed from 9.10.1-U1 to Unspecified
This happens every replication period. In my case replication runs from 6:00, every 4 hours, so the history repeats each time: 6:00, 10:00, 14:00, 18:00.
It started behaving like this when I updated from 9.10.1 to U1.
#10
Updated by Vaibhav Chauhan over 4 years ago
- Priority changed from No priority to Important
- Target version set to 9.10.1-U2
- Seen in changed from Unspecified to 9.10.1-U1
#11
Updated by Josh Paetzel over 4 years ago
- Priority changed from Important to Critical
#12
Updated by William Grzybowski over 4 years ago
- File deleted (debug-PROD-1S-20160928144816.txz)
#13
Updated by William Grzybowski over 4 years ago
- Is duplicate of Bug #17836: Replication Tasks Broken after update from 9.10.1 to 9.10.1-U1 added
#14
Updated by William Grzybowski over 4 years ago
- Status changed from Investigation to Needs Developer Review
- Priority changed from Critical to Blocks Until Resolved
- Private changed from Yes to No
#15
Updated by Jan Brońka over 4 years ago
- File snip_20160929090050.png added
More info...
It seems the replication process is significantly degraded. Look at the attached picture: it shows the state I have had since yesterday. Replication starts, proceeds to about 20%, and then slows down drastically. In practice, in my case replication does not work on 9.10.1-U1.
I also tried manually sending a snapshot to the backup system (everything passed fine, with good performance), but the UI did not even update the state (this works fine on 9.10.1).
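For reference, a manual incremental send of this kind is usually a `zfs send -i` piped over SSH into `zfs receive`. The sketch below only builds the command string and does not run it. The remote dataset name `backup3/lun4` and the `root` user are assumptions for illustration, and `incremental_send_command` is a hypothetical helper, not something shipped with FreeNAS.

```python
import shlex


def incremental_send_command(dataset, old_snap, new_snap,
                             remote_host, remote_dataset, remote_user="root"):
    """Build (but do not execute) a shell pipeline for an incremental send."""
    # Local side: stream the difference between the two snapshots.
    send = ["zfs", "send", "-i",
            f"{dataset}@{old_snap}", f"{dataset}@{new_snap}"]
    # Remote side: apply the stream to the backup dataset.
    recv = ["zfs", "receive", remote_dataset]

    send_part = " ".join(shlex.quote(arg) for arg in send)
    recv_part = " ".join(shlex.quote(arg) for arg in recv)
    remote = shlex.quote(f"{remote_user}@{remote_host}")
    # The remote command is quoted as a single argument to ssh.
    return f"{send_part} | ssh {remote} {shlex.quote(recv_part)}"


if __name__ == "__main__":
    # Snapshot names and host are taken from the alert mails in this ticket;
    # "backup3/lun4" is an assumed remote dataset layout.
    print(incremental_send_command(
        "storage1/lun4",
        "auto-20160928.1400-4d",
        "auto-20160928.1800-4d",
        "1.100.0.2",
        "backup3/lun4",
    ))
```

Printing the command first makes it possible to check the snapshot pair and target before running anything by hand.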
#16
Updated by William Grzybowski over 4 years ago
Jan Brońka wrote:
More info...
It seems the replication process is significantly degraded. Look at the attached picture: it shows the state I have had since yesterday. Replication starts, proceeds to about 20%, and then slows down drastically. In practice, in my case replication does not work on 9.10.1-U1.
I also tried manually sending a snapshot to the backup system (everything passed fine, with good performance), but the UI did not even update the state (this works fine on 9.10.1).
If you check this ticket's status you will see that it has already been solved and is just waiting for the next release.
Please use the 9.10.1 BE or wait for the next update (it should happen next Monday).
Thanks
#17
Updated by Jan Brońka over 4 years ago
Hi,
In status I still see "Need Review" and "% Done" = 0
So I conclude it is not solved yet.
But, nice to hear this.
#18
Updated by Vaibhav Chauhan over 4 years ago
- Assignee changed from William Grzybowski to Chris Torek
- Priority changed from Blocks Until Resolved to Critical
Chris, can you please review the changes?
#19
Updated by Chris Torek over 4 years ago
- Status changed from Needs Developer Review to Reviewed
- Assignee changed from Chris Torek to Vaibhav Chauhan
I looked at the changes when they were initially committed, and checked them again for the review; they still look good :-)
Chris
#20
Updated by Vaibhav Chauhan over 4 years ago
- Status changed from Reviewed to Ready For Release
#21
Updated by William Grzybowski over 4 years ago
- Is duplicate of Bug #17905: Replication Task failed. Clone of remote backup dataset snapshot was created: backup-pool/.../myPC-zvol/%recv where myPC-zvol is being replicated and contains the snapshot in question added
#22
Updated by William Grzybowski over 4 years ago
- Is duplicate of Bug #17938: Slow Replication in 9.10.1-U1 added
#23
Updated by Josh Paetzel over 4 years ago
- Has duplicate Bug #17953: error mail added
#24
Updated by William Grzybowski over 4 years ago
- Is duplicate of Bug #17981: Replication spontaneously failed added
#25
Updated by William Grzybowski over 4 years ago
- Is duplicate of Bug #17995: Replication failing after upgrade added
#26
Updated by Vaibhav Chauhan over 4 years ago
- Status changed from Ready For Release to Resolved
#27
Updated by Broc Seib over 4 years ago
Does anyone have a link to the changes that resolved this issue? I had somewhat similar replication problems after updating to U1, but I didn't want to open a new issue until I had ruled out this fix.
In my case, I ended up removing pipewatcher from autorepl.py to get all my replications working again. Is this at all related or should I post a new issue?
#28
Updated by William Grzybowski over 4 years ago
- Is duplicate of Bug #18282: [Regression] Error replicating incorrect dataset name added