Bug #5968

Replication fails

Added by Mihai Preda about 6 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Nice to have
Assignee: Jordan Hubbard
Category: Middleware
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

Hello there,

I have two AMD machines, both with a raidz1 configuration. The main machine has a 6-core CPU, 16GB of RAM, and 3x3TB WD Reds; the second machine is used only as a replication target and has a dual-core CPU and 8GB of RAM.
My replication fails continuously, and has done so for several months, spanning multiple FreeNAS releases.
Every time I click "Initialize remote side for once. (May cause data loss on remote side!):" it starts replicating and stops when it reaches the snapshot labeled with the date 20140307.0957-6m for several datasets.
The error message I get is: Replication backups -> 192.168.1.38 failed: cannot receive incremental stream: most recent snapshot of replica/newfs does not match incremental source.
The failing dataset is almost always a different one each time I reset the replication task and start fresh.

History

#1 Updated by Josh Paetzel about 6 years ago

  • Category set to 59
  • Status changed from Unscreened to Screened
  • Assignee set to Josh Paetzel
  • Target version set to 49

Mihai,

Can you attach the output of:

    zfs list -t snapshot

from the source machine as well as the destination. Also:

    zfs list

from both source and destination,
and the output of:

    sqlite3 /data/freenas-v1.db "select * from storage_replication"

from just the sending machine.
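For anyone comparing the two snapshot listings by hand: the "does not match incremental source" error generally means the newest snapshot on the destination is not the snapshot the incremental stream starts from. Below is a minimal Python sketch of that comparison. It assumes the listings are in creation order (for example from `zfs list -t snapshot -s creation`). The source dataset name `tank/newfs`, the `auto-` snapshot prefix, and the sample listings are hypothetical; only `replica/newfs` and the `20140307.0957-6m` label come from this ticket.

```python
from typing import Dict, List, Optional


def parse_snapshots(listing: str) -> Dict[str, List[str]]:
    """Map each dataset to its snapshot names, in the order they are listed."""
    snapshots: Dict[str, List[str]] = {}
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (no '@' in the first column).
        if not fields or "@" not in fields[0]:
            continue
        dataset, name = fields[0].split("@", 1)
        snapshots.setdefault(dataset, []).append(name)
    return snapshots


def latest_common(source: List[str], destination: List[str]) -> Optional[str]:
    """Return the newest source snapshot that also exists on the destination."""
    on_destination = set(destination)
    for name in reversed(source):
        if name in on_destination:
            return name
    return None


# Hypothetical sample listings, shaped like `zfs list -t snapshot` output.
SOURCE_LISTING = """\
NAME                                 USED  AVAIL  REFER  MOUNTPOINT
tank/newfs@auto-20140306.0957-6m      12K      -   1.2G  -
tank/newfs@auto-20140307.0957-6m      16K      -   1.2G  -
tank/newfs@auto-20140308.0957-6m        0      -   1.3G  -
"""
DEST_LISTING = """\
NAME                                    USED  AVAIL  REFER  MOUNTPOINT
replica/newfs@auto-20140306.0957-6m      12K      -   1.2G  -
replica/newfs@auto-20140307.0957-6m        0      -   1.2G  -
"""

src = parse_snapshots(SOURCE_LISTING)["tank/newfs"]
dst = parse_snapshots(DEST_LISTING)["replica/newfs"]
base = latest_common(src, dst)

print("latest common snapshot:", base)
# An incremental receive can only continue if the destination's newest
# snapshot is that common base.
print("destination head matches base:", bool(dst) and dst[-1] == base)
```

If the destination's newest snapshot is not the common base, that would be consistent with the error reported here, though other causes are possible.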

#2 Updated by Mihai Preda about 6 years ago

  • File zfslistDest.txt added
  • File zltsDest.txt added

Hello Josh and thank you for your quick reply.
It seems that my source machine hangs on the zfs commands; it is stuck sending one of the ZFS snapshots, probably because my replication only runs from 17:00 to 08:00. The reason for that window is that I'm running 2 CIFS shares off the same machine, and it is almost impossible to access the CIFS files if the replication runs 24/7. Looks

I was able to run the commands on the destination machine, though; please see the attached files.
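A side note on the 17:00 to 08:00 window mentioned above: it crosses midnight, so a plain "begin <= now < end" test does not describe it. The Python sketch below shows one way to reason about such a window. It is only an illustration; the function name `in_window` is hypothetical and this is not a claim about how FreeNAS itself evaluates the schedule.

```python
from datetime import time


def in_window(now: time, begin: time, end: time) -> bool:
    """Return True if `now` lies in [begin, end), allowing the window to wrap past midnight."""
    if begin <= end:
        # Ordinary same-day window, e.g. 09:00 to 17:00.
        return begin <= now < end
    # Wrapping window, e.g. 17:00 to 08:00: evening part OR early-morning part.
    return now >= begin or now < end


begin, end = time(17, 0), time(8, 0)
for probe in (time(23, 0), time(3, 30), time(12, 0)):
    print(probe, in_window(probe, begin, end))
```

A send that is still in progress when the window closes would be interrupted under this model, which may be relevant to the stuck transfer described above.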

#3 Updated by Mihai Preda about 6 years ago

  • File sqllitecmd.txt added
  • File zfslistDest.txt added
  • File zfslistSrc.txt added
  • File zlsSrc.txt added
  • File zltsDest.txt added

Hello Josh,

I bit my tongue and rebooted the sending machine, which was stuck for whatever reason; I could not even log in via its web interface. Anyhow, I was able to run the commands you asked for, and I've also run them on the destination machine so that we have fresh data from both machines.

#4 Updated by Jordan Hubbard about 6 years ago

  • Status changed from Screened to Investigation

#5 Updated by Mihai Preda about 6 years ago

I was able to fix this issue by changing the Replication Stream Compression from LZ4 to PIGZ.

#6 Updated by Mihai Preda about 6 years ago

Hello,
It seems my replication fails again with a new error: "cannot receive incremental stream: most recent snapshot of remote/dataset does not match incremental source pigz: write error code 32 pigz: abort: write error on <stdout>"
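On the "write error code 32" part of that message: assuming pigz is reporting an operating-system errno value (an assumption, not confirmed in this ticket), the code can be looked up with Python's standard `errno` module. This short sketch only does the lookup.

```python
import errno
import os

code = 32  # the value printed in the pigz message above

# Translate the numeric code into its symbolic name and description.
name = errno.errorcode.get(code, "unknown")
description = os.strerror(code)

print(f"errno {code}: {name} ({description})")
```

On FreeBSD and Linux, errno 32 is EPIPE (broken pipe). That would mean pigz lost its output pipe because the receiving end had already exited, so the pigz message would be a consequence of the failed receive rather than its cause.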

#7 Updated by Jordan Hubbard about 6 years ago

BRB: Do you have some snapshots scheduled on the receiving side, perhaps? Not seeing other users having this problem.
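One way to check for the situation asked about above is to look for snapshots that exist on the receiving side but not on the sender, since those could only have been created locally on the destination. The Python sketch below does that with plain lists of snapshot names. The sample names, including `manual-20140309`, are hypothetical.

```python
from typing import List


def only_on_destination(source: List[str], destination: List[str]) -> List[str]:
    """Return destination snapshot names that the source does not have, in listing order."""
    on_source = set(source)
    return [name for name in destination if name not in on_source]


# Hypothetical snapshot names for one dataset on each side.
source_snaps = ["auto-20140306.0957-6m", "auto-20140307.0957-6m"]
dest_snaps = [
    "auto-20140306.0957-6m",
    "auto-20140307.0957-6m",
    "manual-20140309",  # taken on the receiving side only
]

extras = only_on_destination(source_snaps, dest_snaps)
print("snapshots only on the destination:", extras)
```

A non-empty result suggests something on the receiving side is creating snapshots. If one of them is newer than the last replicated snapshot, the destination's most recent snapshot would no longer match the incremental source.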

#8 Updated by Dimitar Boyn over 5 years ago

I have the same problem.
FreeNAS-9.3-STABLE-201502271818

This happens after more than 20 hours of transfer.
It was happening with compression enabled, so I turned compression off, but I am still getting the broken pipe:

node0015 (storage/freenas) freenas_alert.03d042ea4117e2b181bbbcd6bb6f5179: Replication pool.15186186791288019284/dataset1 -> node0016:pool.764172506333928681 failed: Write failed

#9 Updated by Mihai Preda over 5 years ago

Hello Dimitar,

This issue has been fixed on my end since the update to 9.3. I used to experience it in 9.2 and earlier versions of the OS, but not since 9.3.

#10 Updated by Josh Paetzel over 5 years ago

  • Status changed from Investigation to Unscreened
  • Assignee changed from Josh Paetzel to Jordan Hubbard

#11 Updated by Jordan Hubbard over 5 years ago

  • Status changed from Unscreened to Investigation
  • Target version deleted (49)

#12 Updated by Jordan Hubbard over 4 years ago

  • Status changed from Investigation to Resolved

#13 Updated by Dru Lavigne almost 3 years ago

  • Target version set to Master - FreeNAS Nightlies

#14 Updated by Dru Lavigne over 2 years ago

  • File deleted (zfslistDest.txt)

#15 Updated by Dru Lavigne over 2 years ago

  • File deleted (zltsDest.txt)

#16 Updated by Dru Lavigne over 2 years ago

  • File deleted (sqllitecmd.txt)

#17 Updated by Dru Lavigne over 2 years ago

  • File deleted (zfslistDest.txt)

#18 Updated by Dru Lavigne over 2 years ago

  • File deleted (zfslistSrc.txt)

#19 Updated by Dru Lavigne over 2 years ago

  • File deleted (zlsSrc.txt)

#20 Updated by Dru Lavigne over 2 years ago

  • File deleted (zltsDest.txt)
