Bug #23871
Don't create additional dataset at replication destination
Description
I stumbled over the following message in my logs today:
May 9 14:32:07 generator collectd[3367]: statvfs(/mnt/hotswap/Backup/SSDjails-backup/SSD-jails) failed: No such file or directory
It repeats every 10 seconds.
This happened after I created a local replication task from my SSD pool to my HDD pool.
- Created a periodic recursive snapshot task for my SSD-jails pool (the root of the pool). This pool is also used as the system dataset pool.
- Created a new dataset on my HDD pool called hotswap/Backup/SSD-jails
- Created a replication task that replicates this snapshot to the dataset just created (on the same system via localhost).
Volume/Dataset: SSD-jails
Remote ZFS Volume/Dataset: hotswap/Backup/SSD-jails
What I expected: the SSD-jails pool would be replicated directly into hotswap/Backup/SSD-jails, with all child datasets inside it and the hierarchy preserved.
After replicating I got all my data from the SSD-jails root dataset and all child jail datasets inside hotswap/Backup/SSD-jails (plus some .warden and .meta stuff), but also another dataset, SSD-jails, that should not be there!
This dataset is read-only and not mounted, which results in the statvfs error message.
To be clearer, here is the output of zfs list for source and target:
[root@generator] /mnt/SSD-jails# zfs list -r SSD-jails
NAME                                                         USED  AVAIL  REFER  MOUNTPOINT
SSD-jails                                                   71.7G  28.3G  21.3M  /mnt/SSD-jails
SSD-jails/.system                                           4.98G  28.3G   410M  legacy
SSD-jails/.system/configs-903a2a7d45924e86a448cccd86aa67c2  61.1M  28.3G  60.0M  legacy
SSD-jails/.system/cores                                     11.4M  28.3G  2.46M  legacy
SSD-jails/.system/rrd-903a2a7d45924e86a448cccd86aa67c2      2.85G  28.3G  82.4M  legacy
SSD-jails/.system/samba4                                    44.6M  28.3G  11.4M  legacy
SSD-jails/.system/syslog-903a2a7d45924e86a448cccd86aa67c2    374M  28.3G  38.7M  legacy
SSD-jails/.warden-template-pluginjail                        472M  28.3G   471M  /mnt/SSD-jails/.warden-template-pluginjail
SSD-jails/.warden-template-pluginjail--x64                   449M  28.3G   449M  /mnt/SSD-jails/.warden-template-pluginjail--x64
SSD-jails/.warden-template-pluginjail--x64-20150716121537    452M  28.3G   452M  /mnt/SSD-jails/.warden-template-pluginjail--x64-20150716121537
SSD-jails/.warden-template-pluginjail-9.3-x64                452M  28.3G   452M  /mnt/SSD-jails/.warden-template-pluginjail-9.3-x64
SSD-jails/.warden-template-standard                         1.66G  28.3G  1.66G  /mnt/SSD-jails/.warden-template-standard
SSD-jails/.warden-template-standard--x64                    1.58G  28.3G  1.58G  /mnt/SSD-jails/.warden-template-standard--x64
SSD-jails/.warden-template-standard--x64-20150716121538     1.62G  28.3G  1.62G  /mnt/SSD-jails/.warden-template-standard--x64-20150716121538
SSD-jails/.warden-template-standard-9.3-x64                 1.62G  28.3G  1.62G  /mnt/SSD-jails/.warden-template-standard-9.3-x64
SSD-jails/MariaDB                                           3.06G  28.3G  3.54G  /mnt/SSD-jails/MariaDB
SSD-jails/bubbleupnp                                        4.89G  25.2G  4.76G  /mnt/SSD-jails/bubbleupnp
SSD-jails/config                                            1002M  9.06G   959M  /mnt/SSD-jails/config
SSD-jails/config/test                                         88K  28.3G    88K  /mnt/SSD-jails/config/test
SSD-jails/dnsmasq_1                                         2.14G  7.94G  2.06G  /mnt/SSD-jails/dnsmasq_1
SSD-jails/mailfilter                                        9.36G  28.3G  7.37G  /mnt/SSD-jails/mailfilter
SSD-jails/minimserver                                       3.88G  6.32G  3.68G  /mnt/SSD-jails/minimserver
SSD-jails/nginx                                             1.64G  28.3G  2.12G  /mnt/SSD-jails/nginx
SSD-jails/owncloud                                          2.08G  28.3G  2.54G  /mnt/SSD-jails/owncloud
SSD-jails/owncloud9                                         2.11G  28.3G  2.56G  /mnt/SSD-jails/owncloud9
SSD-jails/owncloud_nginx_mariaDB                            6.88G  5.27G  4.73G  /mnt/SSD-jails/owncloud_nginx_mariaDB
SSD-jails/plex                                              5.15G  25.7G  4.28G  /mnt/SSD-jails/plex
SSD-jails/reservation                                       5.00G  33.3G   100K  /mnt/SSD-jails/reservation
SSD-jails/reverseproxy                                      1.71G  28.3G  2.18G  /mnt/SSD-jails/reverseproxy
SSD-jails/tools                                             5.46G  5.85G  4.15G  /mnt/SSD-jails/tools
SSD-jails/transmission_1                                     605M  9.36G   651M  /mnt/SSD-jails/transmission_1
SSD-jails/transmission_2                                    1.75G  8.51G  1.49G  /mnt/SSD-jails/transmission_2
[root@generator] /mnt/SSD-jails# zfs list -r hotswap/Backup/SSD-jails
NAME                                                                       USED  AVAIL  REFER  MOUNTPOINT
hotswap/Backup/SSD-jails                                                   113G  10.0T  21.1M  /mnt/hotswap/Backup/SSD-jails
hotswap/Backup/SSD-jails/.warden-template-pluginjail                       607M  10.0T   605M  /mnt/hotswap/Backup/SSD-jails/.warden-template-pluginjail
hotswap/Backup/SSD-jails/.warden-template-pluginjail--x64                  582M  10.0T   580M  /mnt/hotswap/Backup/SSD-jails/.warden-template-pluginjail--x64
hotswap/Backup/SSD-jails/.warden-template-pluginjail--x64-20150716121537   582M  10.0T   580M  /mnt/hotswap/Backup/SSD-jails/.warden-template-pluginjail--x64-20150716121537
hotswap/Backup/SSD-jails/.warden-template-pluginjail-9.3-x64               582M  10.0T   580M  /mnt/hotswap/Backup/SSD-jails/.warden-template-pluginjail-9.3-x64
hotswap/Backup/SSD-jails/.warden-template-standard                        2.79G  10.0T  2.79G  /mnt/hotswap/Backup/SSD-jails/.warden-template-standard
hotswap/Backup/SSD-jails/.warden-template-standard--x64                   2.75G  10.0T  2.75G  /mnt/hotswap/Backup/SSD-jails/.warden-template-standard--x64
hotswap/Backup/SSD-jails/.warden-template-standard--x64-20150716121538    2.74G  10.0T  2.74G  /mnt/hotswap/Backup/SSD-jails/.warden-template-standard--x64-20150716121538
hotswap/Backup/SSD-jails/.warden-template-standard-9.3-x64                2.74G  10.0T  2.74G  /mnt/hotswap/Backup/SSD-jails/.warden-template-standard-9.3-x64
hotswap/Backup/SSD-jails/MariaDB                                          5.75G  10.0T  4.97G  /mnt/hotswap/Backup/SSD-jails/MariaDB
*hotswap/Backup/SSD-jails/SSD-jails                                         188K  10.0T   188K  /mnt/hotswap/Backup/SSD-jails/SSD-jails*
hotswap/Backup/SSD-jails/bubbleupnp                                       8.37G  10.0T  6.60G  /mnt/hotswap/Backup/SSD-jails/bubbleupnp
hotswap/Backup/SSD-jails/config                                           1011M  10.0T   959M  /mnt/hotswap/Backup/SSD-jails/config
hotswap/Backup/SSD-jails/config/test                                       870K  10.0T   188K  /mnt/hotswap/Backup/SSD-jails/config/test
hotswap/Backup/SSD-jails/dnsmasq_1                                        5.14G  10.0T  3.45G  /mnt/hotswap/Backup/SSD-jails/dnsmasq_1
hotswap/Backup/SSD-jails/mailfilter                                       17.8G  10.0T  12.5G  /mnt/hotswap/Backup/SSD-jails/mailfilter
hotswap/Backup/SSD-jails/minimserver                                      7.31G  10.0T  5.45G  /mnt/hotswap/Backup/SSD-jails/minimserver
hotswap/Backup/SSD-jails/nginx                                            4.25G  10.0T  3.45G  /mnt/hotswap/Backup/SSD-jails/nginx
hotswap/Backup/SSD-jails/owncloud                                         4.77G  10.0T  3.95G  /mnt/hotswap/Backup/SSD-jails/owncloud
hotswap/Backup/SSD-jails/owncloud9                                        4.83G  10.0T  4.00G  /mnt/hotswap/Backup/SSD-jails/owncloud9
hotswap/Backup/SSD-jails/owncloud_nginx_mariaDB                           11.7G  10.0T  7.00G  /mnt/hotswap/Backup/SSD-jails/owncloud_nginx_mariaDB
hotswap/Backup/SSD-jails/plex                                             9.48G  10.0T  6.25G  /mnt/hotswap/Backup/SSD-jails/plex
hotswap/Backup/SSD-jails/reservation                                      1.92M  10.0T   196K  /mnt/hotswap/Backup/SSD-jails/reservation
hotswap/Backup/SSD-jails/reverseproxy                                     4.34G  10.0T  3.55G  /mnt/hotswap/Backup/SSD-jails/reverseproxy
hotswap/Backup/SSD-jails/tools                                            9.59G  10.0T  6.10G  /mnt/hotswap/Backup/SSD-jails/tools
hotswap/Backup/SSD-jails/transmission_1                                   1.41G  10.0T   850M  /mnt/hotswap/Backup/SSD-jails/transmission_1
hotswap/Backup/SSD-jails/transmission_2                                   3.69G  10.0T  2.46G  /mnt/hotswap/Backup/SSD-jails/transmission_2
If I query the properties of this extra SSD-jails dataset:
[root@generator] /mnt/SSD-jails# zfs get all hotswap/Backup/SSD-jails/SSD-jails
NAME                                PROPERTY                 VALUE                                     SOURCE
hotswap/Backup/SSD-jails/SSD-jails  type                     filesystem                                -
hotswap/Backup/SSD-jails/SSD-jails  creation                 Tue May 9 16:43 2017                      -
hotswap/Backup/SSD-jails/SSD-jails  used                     188K                                      -
hotswap/Backup/SSD-jails/SSD-jails  available                10.0T                                     -
hotswap/Backup/SSD-jails/SSD-jails  referenced               188K                                      -
hotswap/Backup/SSD-jails/SSD-jails  compressratio            1.00x                                     -
*hotswap/Backup/SSD-jails/SSD-jails  mounted                  yes                                       -*
hotswap/Backup/SSD-jails/SSD-jails  quota                    none                                      default
hotswap/Backup/SSD-jails/SSD-jails  reservation              none                                      default
hotswap/Backup/SSD-jails/SSD-jails  recordsize               128K                                      default
hotswap/Backup/SSD-jails/SSD-jails  mountpoint               /mnt/hotswap/Backup/SSD-jails/SSD-jails   default
hotswap/Backup/SSD-jails/SSD-jails  sharenfs                 off                                       default
hotswap/Backup/SSD-jails/SSD-jails  checksum                 on                                        default
hotswap/Backup/SSD-jails/SSD-jails  compression              lz4                                       inherited from hotswap
hotswap/Backup/SSD-jails/SSD-jails  atime                    on                                        default
hotswap/Backup/SSD-jails/SSD-jails  devices                  on                                        default
hotswap/Backup/SSD-jails/SSD-jails  exec                     on                                        default
hotswap/Backup/SSD-jails/SSD-jails  setuid                   on                                        default
*hotswap/Backup/SSD-jails/SSD-jails  readonly                 on                                        local*
hotswap/Backup/SSD-jails/SSD-jails  jailed                   off                                       default
hotswap/Backup/SSD-jails/SSD-jails  snapdir                  hidden                                    default
hotswap/Backup/SSD-jails/SSD-jails  aclmode                  passthrough                               inherited from hotswap
hotswap/Backup/SSD-jails/SSD-jails  aclinherit               passthrough                               inherited from hotswap
hotswap/Backup/SSD-jails/SSD-jails  canmount                 on                                        default
hotswap/Backup/SSD-jails/SSD-jails  xattr                    off                                       temporary
hotswap/Backup/SSD-jails/SSD-jails  copies                   1                                         default
hotswap/Backup/SSD-jails/SSD-jails  version                  5                                         -
hotswap/Backup/SSD-jails/SSD-jails  utf8only                 off                                       -
hotswap/Backup/SSD-jails/SSD-jails  normalization            none                                      -
hotswap/Backup/SSD-jails/SSD-jails  casesensitivity          sensitive                                 -
hotswap/Backup/SSD-jails/SSD-jails  vscan                    off                                       default
hotswap/Backup/SSD-jails/SSD-jails  nbmand                   off                                       default
hotswap/Backup/SSD-jails/SSD-jails  sharesmb                 off                                       default
hotswap/Backup/SSD-jails/SSD-jails  refquota                 none                                      default
hotswap/Backup/SSD-jails/SSD-jails  refreservation           none                                      default
hotswap/Backup/SSD-jails/SSD-jails  primarycache             all                                       default
hotswap/Backup/SSD-jails/SSD-jails  secondarycache           all                                       default
hotswap/Backup/SSD-jails/SSD-jails  usedbysnapshots          0                                         -
hotswap/Backup/SSD-jails/SSD-jails  usedbydataset            188K                                      -
hotswap/Backup/SSD-jails/SSD-jails  usedbychildren           0                                         -
hotswap/Backup/SSD-jails/SSD-jails  usedbyrefreservation     0                                         -
hotswap/Backup/SSD-jails/SSD-jails  logbias                  latency                                   default
hotswap/Backup/SSD-jails/SSD-jails  dedup                    off                                       inherited from hotswap
hotswap/Backup/SSD-jails/SSD-jails  mlslabel                                                           -
hotswap/Backup/SSD-jails/SSD-jails  sync                     standard                                  default
hotswap/Backup/SSD-jails/SSD-jails  refcompressratio         1.00x                                     -
hotswap/Backup/SSD-jails/SSD-jails  written                  188K                                      -
hotswap/Backup/SSD-jails/SSD-jails  logicalused              36.5K                                     -
hotswap/Backup/SSD-jails/SSD-jails  logicalreferenced        36.5K                                     -
hotswap/Backup/SSD-jails/SSD-jails  volmode                  default                                   default
hotswap/Backup/SSD-jails/SSD-jails  filesystem_limit         none                                      default
hotswap/Backup/SSD-jails/SSD-jails  snapshot_limit           none                                      default
hotswap/Backup/SSD-jails/SSD-jails  filesystem_count         none                                      default
hotswap/Backup/SSD-jails/SSD-jails  snapshot_count           none                                      default
hotswap/Backup/SSD-jails/SSD-jails  redundant_metadata       all                                       default
hotswap/Backup/SSD-jails/SSD-jails  org.freenas:description                                            inherited from hotswap/Backup/SSD-jails
You can see above that the dataset is read-only and (to my surprise) apparently mounted?!
Surprising, because it is not present at the given mountpoint:
[root@generator] /mnt/SSD-jails# ls -lsha /mnt/hotswap/Backup/SSD-jails/SSD-jails
ls: /mnt/hotswap/Backup/SSD-jails/SSD-jails: No such file or directory
I can reproduce this issue on my machine: I deleted all snapshots, recreated the snapshot task, recreated the replication destination dataset, and recreated the replication task, and the behaviour is consistent.
Hardware is:
ASRockRack E3C224D4I-14S
Intel(R) Core(TM) i3-4330 CPU @ 3.50GHz
32 GB ECC RAM
pool hotswap is an 8x4TB HDD pool
pool SSD-jails is 1x120GB SSD
If you need anything else, please just let me know.
I created this ticket because I was advised to do so at https://bugs.freenas.org/issues/15355.
History
#1
Updated by Lorenz Pressler almost 4 years ago
- File debug-generator-20170509182822.txz added
#2
Updated by William Grzybowski almost 4 years ago
- Status changed from Unscreened to Screened
- Target version set to 11.1
#3
Updated by Lorenz Pressler almost 4 years ago
#4
Updated by Lorenz Pressler almost 4 years ago
- ChangeLog Entry updated (diff)
#5
Updated by William Grzybowski almost 4 years ago
- Priority changed from No priority to Important
#6
Updated by William Grzybowski almost 4 years ago
- Assignee changed from William Grzybowski to Sean Fagan
#7
Updated by Sean Fagan almost 4 years ago
- Status changed from Screened to Closed: Behaves correctly
So this behaves correctly, although it is admittedly a bit strange.
Each pool has a top-level dataset of the same name, due to how zfs was implemented. (As an example: there are per-dataset and per-pool properties; compression is a per-dataset one, but you want to be able to have it set in the root so you can inherit it. Thus the root dataset's existence.)
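As an illustrative aside, the inheritance mechanism can be seen with a few commands (the pool and disk names below are made up, not from this system):

zpool create tank da0            # creates the pool and a root dataset also named "tank"
zfs set compression=lz4 tank     # a dataset property, stored on the root dataset
zfs create tank/jails            # children report compression as "inherited from tank"
zfs get -r -o name,value,source compression tank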
The root dataset is mounted, but then the rest of the datasets are mounted on top of it, so it's no longer visible. You can see that it's mounted by looking at the output of "mount -v".
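For instance, something like this should show the covered mount (representative output, not captured from the reporter's system):

mount -v | grep 'SSD-jails/SSD-jails'
hotswap/Backup/SSD-jails/SSD-jails on /mnt/hotswap/Backup/SSD-jails/SSD-jails (zfs, local, read-only)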
I just checked my own systems at home, and it's there as well -- but none of the files is missing.
#8
Updated by Lorenz Pressler almost 4 years ago
Thank you, Sean, for looking into it and for the insight that this dataset is supposed to be there. But still: the log gets spammed with those statvfs errors every 10 seconds. It would be great if that could be suppressed somehow when the root of a pool is replicated. Should I open another ticket for that? Also, I would like to remove my debug log and make this ticket public so I can link it in the forums, since there have been a few complaints about this behavior where no one had a clue that it is "normal". Would that be okay?
#9
Updated by Dru Lavigne almost 4 years ago
- File deleted (debug-generator-20170509182822.txz)
#10
Updated by Dru Lavigne almost 4 years ago
- Related to Bug #15355: Possible repeat of bug 12455 - statvfs error for replicated snapshot of jails directory added
#11
Updated by Dru Lavigne almost 4 years ago
- Status changed from Closed: Behaves correctly to Unscreened
- Assignee changed from Sean Fagan to William Grzybowski
- Private changed from Yes to No
William: the statvfs bug has raised its ugly head again. Can you take a crack at reproducing?
#12
Updated by William Grzybowski almost 4 years ago
- Assignee changed from William Grzybowski to Release Council
To my knowledge, Sean is the new point of contact for Replication. Please update the assignee table.
#13
Updated by Dru Lavigne almost 4 years ago
- Assignee changed from Release Council to Sean Fagan
#14
Updated by Sean Fagan almost 4 years ago
- Category changed from Middleware to OS
- Assignee changed from Sean Fagan to Release Council
That would be a change made to collectd, which is what is reporting the error. (It's reading the mounted filesystem list, and can't access one of them.)
#15
Updated by Dru Lavigne almost 4 years ago
- Assignee changed from Release Council to Sean Fagan
This is a replication bug. Please sync up with William to discuss.
#16
Updated by Sean Fagan almost 4 years ago
- Assignee changed from Sean Fagan to Release Council
It is not a replication bug -- there is no code in the replication path that can be changed to change this behaviour. (I suppose ZFS could be changed, but that would still make it a ZFS issue, not a replication bug.)
Or, collectd could be changed to not complain about a mounted filesystem not being accessible.
Or this could be put back to Closed: Behaves Correctly.
#17
Updated by William Grzybowski almost 4 years ago
How is this not a replication bug? Everything is fine until replication messes around with mountpoints via ZFS send/recv. What am I misunderstanding here?
Do you think it is okay to have a message nagging you every 10 seconds about a mountpoint that should be there but isn't?
Why are filesystems mounted on top of each other (in the wrong order)?
What makes you think it is a collectd bug if collectd is unaware of replication? It is simply trying to do its job, which is to look at a mountpoint that should be there but isn't.
#18
Updated by Sean Fagan almost 4 years ago
- You are misunderstanding how replications work. It's counter-intuitive and annoying.
- I don't, but the simplest and easiest way to fix that is to have collectd stop complaining about filesystems that it can't access.
- They are not mounted in the wrong order; the root dataset is normally not mounted, but when you tell a zfs replication to mount the datasets it receives, then it mounts them -- and the root dataset is thus mounted. And then the subsequent datasets are mounted.
- It's a collectd bug because there are lots of situations where this can happen, and there's not a lot of point to complaining about it. But I did not actually say (until now) that it's a collectd bug; I simply said that's what's complaining about it, and that it is the easiest, most doable thing to change.
- Note that it only happens when you replicate an entire pool; it won't happen if you replicate a particular dataset (even recursively).
- Alternatively, we could tell the replication to not mount the filesystems on the receive side. Then that would be another complaint people have: that they cannot access their data on the target system. This would mean using the '-u' option to the zfs recv command. Again, when I weigh the two options here, the one least likely to cause problems is still changing collectd to not complain about a filesystem it can't access.
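For reference, here is roughly what the receive side would look like with that flag (hypothetical dataset and host names; a sketch, not the exact command line autorepl.py builds):

zfs send sourcepool/dataset@auto-snap | ssh remotehost zfs recv -u -dF targetpool/backup
# -u: leave the received filesystems unmounted
# -d: strip the source pool name and graft the remaining path under targetpool/backup
# -F: force a rollback of the target to the most recent snapshot before receiving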
#19
Updated by William Grzybowski almost 4 years ago
1. I understand how replications work. I have worked with it for years.
2. collectd is doing its job, removing a feature of it is an ugly workaround.
3. The user can't access the mount point, either, right? Something does not look right.
4. That doesn't make it a bug. It's a real complaint; collectd is warning the user as it should: "Hey, I went to look for data in that path you told me about, but it's not there!"
5. Doesn't make it less of a bug
6. Agreed, we do not want it to not mount on the receiving side.
Which brings us back to: it is a replication issue. It might not be a bug in the replication code itself (autorepl.py), but it is part of the replication ecosystem, and thus a replication bug.
#20
Updated by Sean Fagan almost 4 years ago
You can choose to either mount none of the datasets, or all of them.
Not mounting them already results in complaints.
My position is not going to change on this.
#21
Updated by Sean Fagan almost 4 years ago
Adding '-u' is causing unexpected (to me) behaviour. Will update after I've had some actual snapshots taken and replicated.
#22
Updated by Sean Fagan almost 4 years ago
- Status changed from Unscreened to Fix In Progress
- Assignee changed from Release Council to Sean Fagan
Okay. The replication worked. I also realized that the replication code manually recurses over datasets, something my new code doesn't do (but maybe it should?).
This will need some significant QA testing to ensure that replications continue to work as expected.
#23
Updated by Sean Fagan almost 4 years ago
- Status changed from Fix In Progress to 19
Okay, checked into branch FIX-23871 and merged into master. Needs testing.
#24
Updated by William Grzybowski almost 4 years ago
Doesn't -u make it not mount the filesystem on the receiving side? I thought we agreed that this wasn't the best thing to do. Or am I misunderstanding this again?
#25
Updated by Sean Fagan almost 4 years ago
As I said: "Adding '-u' is causing unexpected (to me) behaviour."
You are the one insisting on a potentially dangerous resolution to a problem that would be most easily solved by not logging a pointless error. I don't like that I can't reconcile the documented behaviour of the flag with the observed results of the flag.
#26
Updated by William Grzybowski almost 4 years ago
Sean Fagan wrote:
As I said: "Adding '-u' is causing unexpected (to me) behaviour."
You are the one insisting on a potentially dangerous resolution to a problem that would be most easily solved by not logging a pointless error. I don't like that I can't reconcile the documented behaviour of the flag with the observed results of the flag.
You said that, and after that you committed the code :).
I am not insisting on anything potentially dangerous. It is a real problem that needs real attention.
And yes, I do not agree with silencing a warning about something real. The data is not visible to the user on the receiving side; that's the real issue.
#27
Updated by Sean Fagan almost 4 years ago
THERE IS NO DATA ON THE RECEIVING SIDE.
The root dataset exists solely to hold dataset properties (as opposed to pool properties). When you set a dataset property on ${POOL}, it's actually setting it on ${POOL}/${POOL}. Despite that, there is no data in it. It is an empty filesystem, even if you have data in the root of the pool.
#28
Updated by William Grzybowski almost 4 years ago
Sean Fagan wrote:
THERE IS NO DATA ON THE RECEIVING SIDE.
The root dataset exists solely to hold dataset properties (as opposed to pool properties). When you set a dataset property on ${POOL}, it's actually setting it on ${POOL}/${POOL}. Despite that, there is no data in it. It is an empty filesystem, even if you have data in the root of the pool.
May 9 14:32:07 generator collectd[3367]: statvfs(/mnt/hotswap/Backup/SSDjails-backup/SSD-jails) failed: No such file or directory
There is no data in /mnt/hotswap/Backup/SSDjails-backup/SSD-jails?
#29
Updated by Sean Fagan almost 4 years ago
Correct.
#30
Updated by William Grzybowski almost 4 years ago
collectd7892: statvfs(/mnt/ZFS_Backup_2/SSD_Pool_SS/jails/owncloud1) failed: No such file or directory
What about this? There is no data in /mnt/ZFS_Backup_2/SSD_Pool_SS/jails/owncloud1 either?
#31
Updated by Sean Fagan almost 4 years ago
If it's the same thing, then yes. But the logs are gone now so I can't tell what is going on there.
I'm waiting to see your patch to ZFS to handle this situation.
#32
Updated by William Grzybowski almost 4 years ago
My patch? I thought you handled Replication and are a kernel developer. Why are you asking me for a patch?
Am I not allowed to point out problems?
#33
Updated by Sean Fagan almost 4 years ago
On my systems:
Target:
root@mininas:/mnt # fgrep statvfs /var/log/messages | tail -2
Jul 14 12:18:54 mininas collectd[2947]: statvfs(/mnt/Storage/NAS/NAS) failed: No such file or directory
Jul 14 12:18:54 mininas collectd[2947]: statvfs(/mnt/Storage/NAS/Test2) failed: No such file or directory
root@mininas:/mnt # zfs list -r Storage/NAS | grep Test2
Storage/NAS/Test2  128K  3.63T  128K  /mnt/Storage/NAS/Test2
Source:
root@nas:/mnt/NAS # zfs list -r NAS | grep Test2
root@nas:/mnt/NAS #
So:
a) The -u doesn't seem to have fixed it, so oh well; I'm verifying that that is the case and then I'll revert it.
b) The dataset it's complaining about was deleted on the source side, but the replication did not delete it on the remote side. I believe that's because the current replicator iterates over each dataset instead of using 'zfs send -R'.
#34
Updated by Sean Fagan almost 4 years ago
William Grzybowski wrote:
My patch? I thought you handled Replication and are a kernel developer. Why are you asking me for a patch?
Am I not allowed to point out problems?
When you're making them up and deciding that simple work-arounds for non-existent problems aren't acceptable, no.
#35
Updated by William Grzybowski almost 4 years ago
Glad to know you think a bug reported by multiple people is non-existent.
Feel free to masquerade the error as you please. I am simply stating my opinion.
I won't interfere anymore. Bye.
#36
Updated by Kris Moore almost 4 years ago
Guys - First of all, this has gotten a bit out of control. Take 5 everybody.
Next, the issue at hand: do we fix replication to mount/unmount things so they match what was on the sending side, or do we hide the warning from collectd? I'd say perhaps we close this, but set it as a feature goal for replication: add the ability to look at which datasets have the mount property enabled/disabled and make sure each is mounted/unmounted properly on the remote.
Just papering over the issue with collectd seems icky :/
#37
Updated by Sean Fagan almost 4 years ago
The mounting is done automatically by zfs. You cannot change properties on a snapshot (which is what is replicated).
collectd is complaining about something that is not really an error: a filesystem listed by the kernel is not accessible. This could be due to the filesystem being unmounted between the time it called getfsstat() and when it calls statvfs(), or something has mounted over it, or it could be due to running in a jail or chrooted environment, or (not in *bsd with zfs, but it could happen with nfs mounts) it could be due to permissions. Regardless, it's not a useful error message in a large percentage of the cases it would happen in.
Note:
- The ability to mount on top of things is inherent in unix. By design and definition, things under a mount are hidden.
- The kernel does not hide hidden mounts, nor does it adjust the mount table in a chroot.
- This behaviour has always been there.
- Replication can be done by using a recursive send, or iterating over the datasets; there are advantages to both methods (the iteration method can better handle the case where children datasets don't have the same set of snapshots that the parents do, but it means that deleted datasets aren't deleted). The current code uses iteration, which means that the only way a deleted dataset will end up being deleted on the target side is by "ssh ${user}@${host} zfs destroy -r ${remote_fs}/${dataset}", which frankly is a lot scarier than seeing log messages every 10 minutes.
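To sketch the two strategies (hypothetical pool, host, and snapshot names; the actual autorepl.py command lines differ):

# Recursive send: one stream; with recv -F, datasets deleted on the source
# are also destroyed on the target.
zfs send -R sourcepool@auto-snap | ssh remotehost zfs recv -dF targetpool/backup

# Iteration (what the current code effectively does): one stream per dataset;
# tolerates children with different snapshot sets, but never deletes on the target.
zfs list -H -o name -r sourcepool | while read ds; do
    zfs send "${ds}@auto-snap" | ssh remotehost zfs recv -dF targetpool/backup
done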
#38
Updated by Sean Fagan almost 4 years ago
I'm going to apologise to William and take back most of what I said above: I was looking at the wrong information when I concluded that ${POOL}/${POOL} was part of zfs. It's us, and it goes back to 2016. The commit was 059424b57d7bb5cf02a01ae3dca7139d9e75141f, and I am trying to understand why Josh made it. (I've currently got it changed on my home system, and it seems to be working properly, but I've got to assume I'm missing something.)
#39
Updated by Sean Fagan almost 4 years ago
- Status changed from 19 to Investigation
#40
Updated by Sean Fagan almost 4 years ago
- Status changed from Investigation to 15
- Assignee changed from Sean Fagan to Kris Moore
Okay. So that code is to ensure that we create the intermediate datasets on the remote end. If we're backing up NAS/SEF -> Storage/NAS, we need to have Storage/NAS/SEF created. If we're doing the entire pool NAS -> Storage/NAS, we only need to create Storage/NAS.
Only that code doesn't seem to be necessary.
if "/" not in localfs:
localfs_tmp = "%s/%s" % (localfs, localfs)
else:
localfs_tmp = localfs
for direc in (remotefs.partition("/")[2] + "/" + localfs_tmp.partition("/")[2]).split("/"):
# If this test fails there is no need to create datasets on the remote side
# eg: tank -> tank replication
if not direc:
continue
if '/' in remotefs or '/' in localfs:
ds = os.path.join(ds, direc)
ds_full = '%s/%s' % (remotefs.split('/')[0], ds)
if ds_full in remote_zfslist:
continue
log.debug("ds = %s, remotefs = %s" % (ds, remotefs))
sshproc = pipeopen('%s %s %s' % (sshcmd, rzfscmd, ds_full), quiet=True)
output, error = sshproc.communicate()
error = error.strip('\n').strip('\r').replace('WARNING: ENABLED NONE CIPHER', '')
# Debugging code
if sshproc.returncode:
log.debug("Unable to create remote dataset %s: %s" % (
remotefs,
error
))
I think it's setting localfs_tmp to ${POOL}/${POOL} so that the if in the loop triggers.
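In other words, for the case in this ticket (a bare pool as the source), the loop effectively issues the following (assuming rzfscmd expands to the remote 'zfs create -o readonly=on' described later in this ticket):

# localfs = "SSD-jails", remotefs = "hotswap/Backup/SSD-jails"
# localfs_tmp = "SSD-jails/SSD-jails", so the walked components are: Backup, SSD-jails, SSD-jails
ssh user@host zfs create -o readonly=on hotswap/Backup                      # skipped, already in remote_zfslist
ssh user@host zfs create -o readonly=on hotswap/Backup/SSD-jails            # skipped, already in remote_zfslist
ssh user@host zfs create -o readonly=on hotswap/Backup/SSD-jails/SSD-jails  # the extra dataset from this ticket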
However, this gets into institutional and environmental memories I don't have: is it allowed to set a target for replication to just be a pool name? Or does it always need to have a target dataset?
#41
Updated by Sean Fagan almost 4 years ago
I've created a new FIX-23871 branch, but not merged it to master.
The code works (as far as my testing has revealed so far), but it requires that the target be a dataset. That is, remotefs="Storage/NAS" localfs="NAS" works, but remotefs="Storage" localfs="NAS/SEF" will create Storage/SEF, instead of Storage/NAS/SEF. I think this is correct behaviour, but it may be that we need to be able to replicate to a pool. In which case I need to know, if I set localfs=NAS and remotefs=Storage, what should the result be? Storage/NAS, or just everything on Storage? What if I set localfs=NAS/SEF and remotefs=Storage -- should it still end up as Storage/NAS/SEF, or Storage/SEF?
#42
Updated by Sean Fagan almost 4 years ago
Let me amend that: no, NAS/SEF -> Storage needs to end up as Storage/NAS/SEF, because of zfs. So the dataset needs to be created in that case. So the question is whether a pool name alone is allowed for the replication target.
#43
Updated by Kris Moore almost 4 years ago
- Status changed from 15 to Needs Developer Review
- Assignee changed from Kris Moore to John Hixson
#44
Updated by T F over 3 years ago
I'm seeing similar messages on a new FreeNAS 11.0-U1 system that is a replication target for another system at the same OS level. Previous systems running FreeNAS-9.10.2-U1 & FreeNAS-9.10.2-U3 with a similar configuration (the main difference being that the 9.10.2 systems are not doing recursive replication) do not report this.
#45
Updated by T F over 3 years ago
Some additional details from poking around in /var/log/messages:
- It repeats every 10 seconds
- It currently logs 3 messages each interval: 1 for the top-level dataset (/mnt/tank/Replication_Target01/top-level-dataset) & 2 for child datasets (which also happen to be children of another child dataset within the top-level dataset, i.e. /mnt/tank/Replication_Target01/child-dataset1/child-dataset2 and /mnt/Mirror01/Replication_Target01/child-dataset1/child-dataset3)
- There are several other child datasets, with no child datasets under them, for which the message is not logged
#46
Updated by Sean Fagan over 3 years ago
Yes, the issue was diagnosed up above.
#47
Updated by T F over 3 years ago
Ah, whoops, I read most of the history but missed that last piece.
#48
Updated by Sean Fagan over 3 years ago
- Has duplicate Bug #21694: The root volume on the sending computer is on the same level as it's subdatasets on the Replication computer's volumes added
#49
Updated by Dru Lavigne over 3 years ago
- Status changed from Needs Developer Review to 46
John: please review this code or pass to someone else for review.
#50
Updated by John Hixson over 3 years ago
Dru Lavigne wrote:
John: please review this code or pass to someone else for review.
I was told I could take time on this one. It's targeted for 11.1. If you want to assign it to someone else, feel free to do so.
#51
Updated by Dru Lavigne over 3 years ago
- Status changed from 46 to Needs Developer Review
#52
Updated by John Hixson over 3 years ago
I've reviewed the ticket history. I'll review the code tomorrow.
#53
Updated by John Hixson over 3 years ago
- Status changed from Needs Developer Review to Reviewed by Developer
- Assignee changed from John Hixson to Release Council
Looks good to me. I'm not clear on why we originally made the poolname/poolname change either.
#54
Updated by Dru Lavigne over 3 years ago
- Status changed from Reviewed by Developer to Ready For Release
- Assignee changed from Release Council to Sean Fagan
- Target version changed from 11.1 to 11.1-BETA1
#55
Updated by Dru Lavigne over 3 years ago
- Subject changed from Replication creates additional dataset (readonly) at destination to Don't create additional dataset at replication destination
#56
Updated by Dru Lavigne over 3 years ago
- Status changed from Ready For Release to Resolved
#57
Updated by Bonnie Follweiler over 3 years ago
- Needs QA changed from Yes to No
- QA Status Test Passes FreeNAS added
- QA Status deleted (Not Tested)
#59
Updated by Bonnie Follweiler over 3 years ago
- Status changed from Resolved to Unscreened
This is not resolved in FreeNAS-11.1-INTERNAL2.
Andrew Walker noted that he was able to reproduce the issue on the QA mini. Datasets tank/nonexistent and tank/nonexistent/smb_share were created.
tank/nonexistent/smb_share is not mountable.
#60
Updated by Dru Lavigne over 3 years ago
- Status changed from Unscreened to 46
- Target version changed from 11.1-BETA1 to 11.1
Sean: please work with Andrew and Bonnie to resolve this.
#61
Updated by Sean Fagan over 3 years ago
- Status changed from 46 to 15
- Assignee changed from Sean Fagan to Bonnie Follweiler
- Target version changed from 11.1 to 11.1-BETA1
Insufficient information here.
What was created and by whom?
#62
Updated by Sean Fagan over 3 years ago
- Target version changed from 11.1-BETA1 to 11.1
#63
Updated by Sean Fagan over 3 years ago
Let me summarize, again, what's going on, and why this is not a bug, but a design decision we chose a long time ago:
- When replicating from source:tank/ReplSource to destination:tank/ReplTarget, the destination has to exist.
- To ensure it exists, the replication code does a "zfs create -o readonly=on tank/ReplTarget" on the destination system.
- It then replicates, using essentially "zfs send tank/ReplSource | zfs recv -dF tank/ReplTarget"
- zfs then replicates the dataset ReplSource into the dataset tank/ReplTarget, which means that it is creating tank/ReplTarget/ReplSource
- Because tank/ReplTarget was created by the replication code as read-only, it is not possible for zfs to create the mountpoint "tank/ReplTarget/ReplSource", and as a result the dataset tank/ReplTarget/ReplSource cannot be mounted in tank/ReplTarget. It is there, however.
- After rebooting (or exporting and re-importing) on the destination, zfs automatically creates tank/ReplTarget/ReplSource and mounts the dataset there.
- There's some magic I'm not fully comprehending yet: when the next replication runs, it resets things back to the second case above, making it look like the dataset has gone when it hasn't.
The solution is: create the target dataset when setting up replication.
Now, we can change the code to create the intervening datasets without readonly. But then this causes problems later on, because a modified dataset getting incremental snapshots won't work. (And I am hazy on remembering the details here, because we touched that code last when Josh Paetzel was here, and we came to the current code after a fair bit of discussion and testing.)
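A minimal local repro sketch of the sequence above (run against a scratch pool named tank; the names are from my example, illustrative only):

zfs create tank/ReplSource && zfs snapshot tank/ReplSource@snap1
zfs create -o readonly=on tank/ReplTarget
zfs send tank/ReplSource@snap1 | zfs recv -dF tank/ReplTarget
# recv creates tank/ReplTarget/ReplSource, but cannot create its mountpoint
# directory inside the read-only tank/ReplTarget, so the dataset exists unmounted:
zfs list -o name,mounted -r tank/ReplTarget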
#64
Updated by Andrew Walker over 3 years ago
Sean Fagan wrote:
Let me summarize, again, what's going on, and why this is not a bug, but a design decision we chose a long time ago
Sean, the reason why this behavior is a problem:
- TrueNAS customer planned to export replication target via NFS
- TrueNAS customer created a replication task and fumbled the case of one of the path components to the target dataset (one path component was upper case instead of lower case).
- This caused the replication code to create the missing path components which resulted in the replicated dataset being unmountable.
- The above situation was not reported in the GUI and was not discovered until the customer could not find the dataset in the TrueNAS UI. The fix was also not immediately apparent.
I think it would be good to have a GUI warning when the target path does not exist on the destination, or some other form of safety belt.
#65
Updated by Sean Fagan over 3 years ago
They can fix the problem by making the intervening dataset(s) writable, by doing "zfs set readonly=off ${dataset}". (For the example I gave, on the destination system, "zfs set readonly=off tank/ReplTarget")
The GUI can try to warn about the path -- that's at a different level than autorepl.py, so that'd be a new feature to add. But the replication code just blindly tries to create it, because that most easily handles the situation where the target got changed, destroyed, or otherwise has to have its intermediate components created. (And if the components already exist, it is a no-op -- it doesn't change the readonly attribute.)
Mostly I think this is documentation at this point.
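Concretely, the recovery on the destination is just (using my example names from above):

zfs set readonly=off tank/ReplTarget
zfs mount -a    # the replicated children can now create their mountpoint directories and mount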
#66
Updated by Sean Fagan over 3 years ago
I should also make this point: when just using replication for backup (either locally or offsite), the fact that the data is not automatically visible doesn't matter. You can even replicate it onward, while it's not mountable, to another location or back to the source.
When trying to use replication for visible redundancy (which is what I do at home, for example), then the additional step of creating the dataset, or of changing it to read-write, is necessary. I thought we'd agreed this needed to be documented better; does it still need to be called out more clearly? (Or in help in the UI?)
#67
Updated by Andrew Walker over 3 years ago
Sean Fagan wrote:
I thought we'd agreed this needed to be documented better; does it still need to be called out more clearly? (Or in help in the UI?)
Sean, thanks for taking the time to explain this succinctly. I agree with you regarding the autorepl.py behavior now that I understand what is going on. I will file a feature request for extra seatbelts and let conversation about this go on from there. :) FYI, I did resolve the issue on the customer system by changing the 'readonly' property of the path to the replication target and then mounting the datasets.
The current docs state the following:
The target dataset on the receiving system is automatically created in read-only mode to protect the data. To mount or browse the data on the receiving system, create a clone of the snapshot and use the clone. Clones are created in read/write mode, making it possible to browse or mount them. See Snapshots for more information on creating clones.
I think part of the reason the customer was confused was because some of his replicated datasets were visible and others were not visible. The current docs do not address this aspect of replication behavior (specifically, whether it will be visible / browsable depends on the ZFS readonly property of the path to the dataset). Perhaps they need to be a bit more nuanced. It might be a good idea to add to the tooltip as well.
#68
Updated by Sean Fagan over 3 years ago
- Assignee deleted (Bonnie Follweiler)
Okay. RC, do we want to close this, or turn it into a doc ticket?
#69
Updated by Kris Moore over 3 years ago
- Assignee set to Dru Lavigne
Dru, want to do any docs updates here? Otherwise we can close this out again.
#70
Updated by Warren Block over 3 years ago
We can add to the existing note, but people continue to encounter this, which means it is not expected behavior. I suspect we need to either add an option to the GUI, or have the replication code set the ZFS property, or maybe both.
#71
Updated by Dru Lavigne over 3 years ago
- Assignee changed from Dru Lavigne to Kris Moore
We've already added a note in the docs. Will this be redesigned for the new replicator in 11.2?
#72
Updated by Dru Lavigne over 3 years ago
- Target version changed from 11.1 to 11.1-BETA1
#73
Updated by Kris Moore over 3 years ago
- Status changed from 15 to Resolved
- Assignee changed from Kris Moore to Sean Fagan
I'm moving this back to Resolved. I'm a bit reluctant to muck about much more with the old replication engine. This is something we should pass along to @bartosz as a feature request for the new one.