Bug #17264

Add dedicated user example to replication section of Guide

Added by Cyber Jock over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Important
Assignee: Warren Block
Category: Documentation
Target version:
Seen in:
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: No
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

Josh P has hit this bug and is aware of it. In short, you cannot set up a dedicated user for replication without problems. Josh P investigated this already and believes the issue is a lack of permissions for unmounting a dataset before remounting it with the new snapshot.

I have not tested this on 9.10.1, but there is no expectation that this bug is fixed.

Turn (TrueNAS customer) is waiting for this to be fixed under ticket EUU-816-96163.

2017-04-11_17h31_29.jpg (1.05 MB) Cyber Jock, 04/11/2017 05:33 PM
bad column shown.png (597 KB) Joshua Sirrine, 10/09/2017 05:50 PM
failed replication.png (777 KB) Joshua Sirrine, 10/09/2017 05:50 PM

Associated revisions

Revision 73a2a758 (diff)
Added by Warren Block over 3 years ago

Add an initial dedicated user replication example, thanks to Bonnie. Ticket: #17264

Revision 2d7bf214 (diff)
Added by Warren Block over 3 years ago

Update, clarify, and simplify dedicated user replication example. Ticket: #17264

History

#1 Updated by Josh Paetzel over 4 years ago

I've repeatedly said the issue with Turn is one unit was running 9.2 and the other was on 9.3 so I set this up manually and missed a user prop. I do expect it works correctly on 9.3 -> 9.3

#2 Updated by William Grzybowski over 4 years ago

  • Status changed from Unscreened to Screened

So nothing to do here?

#3 Updated by William Grzybowski over 4 years ago

  • Status changed from Screened to 15

#4 Updated by Josh Paetzel over 4 years ago

It probably needs to get tested to ensure it still works.

#5 Updated by Cyber Jock over 4 years ago

I will test this. Please stand by.

#6 Updated by Josh Paetzel over 4 years ago

This just plain doesn't work.

#7 Updated by Josh Paetzel over 4 years ago

zfs userprops for the dedicated user are not set on the receiving side.

#8 Updated by Vaibhav Chauhan over 4 years ago

BRB: cyber please report your findings to us.

#9 Updated by William Grzybowski over 4 years ago

  • Status changed from 15 to Screened
  • Target version set to TrueNAS-9.10.2

Vaibhav Chauhan wrote:

BRB: cyber please report your findings to us.

Josh already said this doesn't work, so no feedback needed.

#10 Updated by Vaibhav Chauhan over 4 years ago

  • Target version changed from TrueNAS-9.10.2 to TrueNAS-9.10.2-U1

Will not be making 9.10.2; this is not the end customer's top priority.
-N.B

#11 Updated by Kris Moore about 4 years ago

  • Target version changed from TrueNAS-9.10.2-U1 to TrueNAS-9.10.2-U2

#12 Updated by Cyber Jock about 4 years ago

Can we please not push this back any further? The customer that first identified this issue has been waiting for a fix since July 2016.

Thank you.

#13 Updated by William Grzybowski about 4 years ago

Cyber Jock wrote:

Can we please not push this back any further? The customer that first identified this issue has been waiting for a fix since July 2016.

Thank you.

Why is this so important?
This field was added back in 2012 and never hooked up to anything.
I am considering removing it altogether.

#14 Updated by Cyber Jock about 4 years ago

The best answer I can give is that some of our customers cannot use things like root users for things like replication due to security requirements for the company. The support ticket I have right now is a company that used the feature, and on upgrade it stopped working. That's when Josh P got involved and said that it was broken and to file a bug ticket on it.

#15 Updated by William Grzybowski about 4 years ago

Cyber Jock wrote:

The best answer I can give is that some of our customers cannot use things like root users for things like replication due to security requirements for the company. The support ticket I have right now is a company that used the feature, and on upgrade it stopped working. That's when Josh P got involved and said that it was broken and to file a bug ticket on it.

What issues did you find when testing this feature? All it ever did was set the login used by SSH.

#16 Updated by Cyber Jock about 4 years ago

I didn't test the feature. Josh P troubleshot the problem with the customer in support ticket EUU-816-98183 and the only information I have is what I put in the ticket.

" Josh P investigated this already and believes the issue is a lack of permissions for unmounting a dataset before remounting it with the new snapshot."

Sorry. I couldn't figure out the problem originally and so I can't tell you how he came to that conclusion.

#17 Updated by William Grzybowski about 4 years ago

Cyber Jock wrote:

I didn't test the feature. Josh P troubleshot the problem with the customer in support ticket EUU-816-98183 and the only information I have is what I put in the ticket.

" Josh P investigated this already and believes the issue is a lack of permissions for unmounting a dataset before remounting it with the new snapshot."

Sorry. I couldn't figure out the problem originally and so I can't tell you how he came to that conclusion.

Well, does the user belong to wheel? I can't imagine any other way a user would be able to do that.

#18 Updated by Cyber Jock about 4 years ago

I don't believe so, since the intent was for the 'dedicated user' to NOT be a member of the group wheel (they'd have only the minimum permissions required to handle replication and not all of the permissions that come with the wheel group). Am I confused or mistaken about the intent of the "dedicated user"? If so, that would also mean that Josh P was confused about the intent, and that the customer was confused too (and used it without wheel permissions, from what I understand).

My information on this issue was that this worked, but on an upgrade it stopped working due to the bug.

#19 Updated by William Grzybowski about 4 years ago

Cyber Jock wrote:

My information on this issue was that this worked, but on an upgrade it stopped working due to the bug.

I don't believe in that assessment.

The user either needs to be in wheel or a "zfs allow" needs to be run on the remote side.

#20 Updated by William Grzybowski about 4 years ago

  • Status changed from Screened to Closed: User Config Issue

I have verified this feature works as intended.

As said previously, you need a proper "zfs allow" on the receiving side.

#21 Updated by William Grzybowski about 4 years ago

  • Target version changed from TrueNAS-9.10.2-U2 to na

#22 Updated by Cyber Jock about 4 years ago

William,

Do you know what the minimum zfs allow properties are for this to work? Nobody seems to know this...

#23 Updated by William Grzybowski about 4 years ago

You'll need create, destroy, mount and receive permissions.

#24 Updated by William Grzybowski about 4 years ago

  • Status changed from Closed: User Config Issue to Unscreened
  • Assignee changed from William Grzybowski to Dru Lavigne
  • Target version changed from na to TrueNAS-9.10.3

Dru,

Can the docs team document whatever is needed to make the Dedicated User feature work for replication, please?

The selected user needs to either be in the wheel group or have been manually enabled on the receiving side using "zfs allow".

Thanks!

#25 Updated by Dru Lavigne about 4 years ago

  • Project changed from TrueNAS to FreeNAS
  • Category changed from 50 to Documentation
  • Assignee changed from Dru Lavigne to Warren Block
  • Target version changed from TrueNAS-9.10.3 to 9.10.3
  • Seen in changed from TrueNAS 9.10 to 9.10.2-U2

#26 Updated by Warren Block about 4 years ago

  • Assignee changed from Warren Block to William Grzybowski

William:
Please give us a working example of this along with exactly what is required, and then we can document it.

#27 Updated by William Grzybowski about 4 years ago

  • Assignee changed from William Grzybowski to Warren Block

If replication is being made to "tank/foo" with dedicated user "bar", the user will have to run this on the remote side:

zfs allow -ldu bar create,destroy,mount,receive tank/foo
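
The delegation above can be applied and then checked with a short script. This is only a sketch, assuming it is run as root on the receiving system; the `delegate_replication` function and the `DRY_RUN` variable are hypothetical helpers and not part of FreeNAS. With `DRY_RUN=1` (the default here) it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: print (or run) the delegation from the example above.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

delegate_replication() {
    user="$1"
    dataset="$2"
    perms="create,destroy,mount,receive"
    if [ "$DRY_RUN" = "1" ]; then
        echo "zfs allow -ldu $user $perms $dataset"
        echo "zfs allow $dataset"
    else
        # Grant the permissions locally and to descendants (-l -d) for the user (-u).
        zfs allow -ldu "$user" "$perms" "$dataset"
        # Display the delegations now in effect on the dataset.
        zfs allow "$dataset"
    fi
}

delegate_replication bar tank/foo
```

Running `zfs allow tank/foo` with no permission list prints the current delegations, which is a quick way to confirm that "bar" appears with the expected permissions.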

#28 Updated by Warren Block about 4 years ago

  • Status changed from Unscreened to Screened
  • Target version changed from 9.10.3 to 9.10.3-U1

#29 Updated by Cyber Jock about 4 years ago

Okay, I just created this between 2 VMs and this isn't working. Screenshot is attached.

All that I have done that would be "out of the ordinary if we used the root user" was to take the key from the "view public key" from the source machine and provide it to the 'test' user on the destination. Is that not correct? If not, what am I supposed to do for the key exchange?

All I'm getting is:

Apr 11 17:33:01 truenas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs list -t snapshot -H -o name
Apr 11 17:33:02 truenas autorepl.py: [tools.autorepl:233] Autosnap replication started
Apr 11 17:33:02 truenas autorepl.py: [tools.autorepl:234] temp log file: /tmp/repl-7886
Apr 11 17:33:02 truenas autorepl.py: [tools.autorepl:304] Checking dataset tank/test
Apr 11 17:33:02 truenas autorepl.py: [common.pipesubr:66] Popen()ing: /sbin/zfs list -H -t snapshot -p -o name,creation -r "tank/test"
Apr 11 17:33:02 truenas autorepl.py: [common.pipesubr:66] Popen()ing: /usr/local/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -l test -p 22 192.168.2.87 "zfs list -H -t snapshot -p -o name,creation -r 'tank/test'"
Apr 11 17:33:02 truenas autorepl.py: [tools.autorepl:138] Sending zfs snapshot: /sbin/zfs send -V -p | /usr/local/bin/lz4c | /bin/dd obs=1m 2> /dev/null | /bin/dd obs=1m 2> /dev/null | /usr/local/bin/pipewatcher $$ | /usr/local/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -l test -p 22 192.168.2.87 "/usr/bin/env lz4c -d | /sbin/zfs receive -F -d 'tank' && echo Succeeded"
Apr 11 17:33:02 truenas autorepl.py: [tools.autorepl:157] Replication result: cannot unmount '/mnt/tank/test': Operation not permitted
Apr 11 17:33:02 truenas autorepl.py: [tools.autorepl:638] Autosnap replication finished

This is the exact problem the customer is having, which is why this bug ticket was created in the first place.

#30 Updated by Cyber Jock about 4 years ago

ping. Anyone have answers?

#31 Updated by Kris Moore about 4 years ago

This seems like it should work, except on the remote it's being placed into "tank":

/sbin/zfs receive -F -d 'tank'

What happens if you run that same 'zfs allow' operation on 'tank' on the remote and re-try?

#32 Updated by Cyber Jock about 4 years ago

So I ran the command "zfs allow -ldu test create,destroy,mount,receive tank" and I get the same errors.

I have this running in 2 VMs, and I can power them on at-will. They are dedicated for this test, so if someone wants to actually try various things I can accommodate.

Thanks.

#33 Updated by Cyber Jock almost 4 years ago

Anyone?

#34 Updated by Kris Moore almost 4 years ago

Josh - I spoke with Chiu about this ticket last week. He was supposed to be getting in touch with you and getting something worked out where an engineer or other tech could get involved. Have you heard from him yet?

#35 Updated by Cyber Jock almost 4 years ago

No, I never heard from Chiu on this topic. I'll get in touch with him. Thanks.

#36 Updated by Cyber Jock almost 4 years ago

@Kris,

Chiu says you have this one mixed up with bug #21749. This ticket doesn't require John's intervention. Chiu said this ticket doesn't ring any bells with him.

@William,

Is it possible we can schedule a remote session for you to show me what needs to be done for this to work? If so, do you want me to message you in Slack and we'll work out a date/time?

Thanks.

#37 Updated by Kris Moore almost 4 years ago

Josh - you are correct, my bad for confusing the two.

#38 Updated by Dru Lavigne almost 4 years ago

Please include Warren on the remote session so that he has the information to add the correct procedure to the Guide.

#39 Updated by Kris Moore almost 4 years ago

  • Assignee changed from Warren Block to Joe Maloney

Sending this over to the QA dept for testing / confirmation of what knobs are needed for this to work.

#41 Updated by Joe Maloney almost 4 years ago

  • Assignee changed from Joe Maloney to Warren Block

We have already verified that this works here in related ticket #23084. Under a minute of Google searching for "FreeNAS dedicated user replication" turned up the guide linked in that ticket, about the 6th link down. Additionally, we found that "zfs allow -ldu bonnie tank/replicationtarget" is the only permission needed. Bonnie and I have verified that this works. I responded to the related ticket 9 days ago.

#42 Updated by Kris Moore almost 4 years ago

  • Target version changed from 9.10.3-U1 to 11.0-U1

#43 Updated by Joe Maloney almost 4 years ago

Also just wanted to note once the proper zfs property is set I suspect previously replicated snapshots would need to be wiped from the replication target so it can start fresh.

#44 Updated by Cyber Jock almost 4 years ago

So I got this working, kind of. I created the user and such per http://mikebellerue.blogspot.com/2016/06/freenas-replication-with-dedicated-user.html (Thanks Joe Maloney).

Then I ran the previously listed command above: "zfs allow -ldu repluser create,destroy,mount,receive tank2/replicatethis"

I had to manually set the readonly attribute (this is totally expected) and replication began working.

In my case, I have a snapshot being made every 5 minutes, so 4 of every 5 minutes the WebGUI shows "Up to Date" for the task. The other minute I get an error, but the snapshot sends successfully. The error in the WebGUI is "Failed: tank/replicatethis (auto-20170501.1628-2w->auto-20170501.1633-2w)". As I said the snapshot sends successfully. The /var/log/debug.log on the source TrueNAS shows something like this:

May 1 16:28:03 truenas autorepl.py: [common.pipesubr:66] Popen()ing: /sbin/zfs list -H -t snapshot -p -o name,creation -r "tank/replicatethis"
May 1 16:28:03 truenas autorepl.py: [common.pipesubr:66] Popen()ing: /usr/local/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -l repluser -p 22 192.168.2.87 "zfs list -H -t snapshot -p -o name,creation -r 'tank2/replicatethis'"
May 1 16:28:03 truenas autorepl.py: [tools.autorepl:138] Sending zfs snapshot: /sbin/zfs send -V -p -i | /usr/local/bin/lz4c | /bin/dd obs=1m 2> /dev/null | /bin/dd obs=1m 2> /dev/null | /usr/local/bin/pipewatcher $$ | /usr/local/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -l repluser -p 22 192.168.2.87 "/usr/bin/env lz4c -d | /sbin/zfs receive -F -d 'tank2' && echo Succeeded"
May 1 16:28:07 truenas manage.py: [middleware.notifier:235] Popen()ing: /bin/ps -a -x -w -w -o pid,command | /usr/bin/grep '^ *13112'
May 1 16:28:08 truenas autorepl.py: [tools.autorepl:157] Replication result: cannot receive org.freenas:description property on tank2/replicatethis: permission denied
cannot mount 'tank2/replicatethis': Insufficient privileges
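
The result lines in a log like the one above can be pulled out with a small filter. This is only a sketch: the `replication_results` function is a hypothetical helper, the default path is the /var/log/debug.log mentioned in this comment, and the match string is the "Replication result:" text seen in the excerpt. Continuation lines, such as the "cannot mount" line above, are not captured.

```shell
#!/bin/sh
# Hypothetical helper: print the "Replication result:" lines from a log file.
# Defaults to /var/log/debug.log; prints nothing if the file is not readable.
replication_results() {
    logfile="${1:-/var/log/debug.log}"
    [ -r "$logfile" ] || return 0
    grep 'Replication result:' "$logfile"
}

replication_results
```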

So what privilege am I missing? This is going to create errors in replication that will cause emails to be generated, which obviously is going to be problematic for customers.

Anyone know what privilege I'm missing?

On the destination TrueNAS I see this in the zpool history:

2017-05-01.14:46:53 zfs create -o casesensitivity=sensitive tank2/replicatethis
2017-05-01.14:46:58 zfs set org.freenas:description= tank2/replicatethis <-- notice this entry
2017-05-01.14:49:29 zfs allow -ldu repluser create,destroy,mount,receive tank2/replicatethis

That entry is from our middleware and was not created by me or the replicator.

Can anyone enlighten me on this so I can get this customer's replication fixed?

Thanks.

#45 Updated by Bonnie Follweiler almost 4 years ago

I don't know if this helps but this is what was in my notes when I was setting it up:

"sudo zfs allow bonnie create,destroy,diff,mount,readonly,receive,release,send,userprop tank_b/backup
William said to add zfs allow -ldu bonnie
(bonnie is the user in case you couldn't guess lol)"

All that might have been "overkill"

#47 Updated by Cyber Jock almost 4 years ago

Well, the purpose of the "dedicated user" feature is to provide the bare minimum permissions necessary to get the job done. So does someone somewhere actually have that list? That list should be in our documentation, and I can't seem to get this thing to work no matter how much time and effort I put into this, aside from simply granting everything. Of course, granting everything defeats the whole point of the "dedicated user" feature to begin with.

#48 Updated by Kris Moore almost 4 years ago

There's information all over the internet on how to do this. Joe in QA already tested and confirmed this works on our systems. Warren: Can we just get those ZFS allow flags thrown into the docs somewhere? We have the required list mentioned in this ticket, and it's even on our own forums:

https://forums.freenas.org/index.php?threads/zfs-replication-without-using-root-user.21731/

#49 Updated by Cyber Jock almost 4 years ago

@Kris,

Thanks for that link. I'll give that one a try. Joe previously gave me another link which has different instructions that didn't work.

Will report back on if this works or not.

#50 Updated by Cyber Jock almost 4 years ago

So I just tried using the August 19th 2014 post. I did everything on the PULL side, but the PUSH side seems at least somewhat incomplete. When I did exactly what the forum post said, I got a "The user johndoe is not valid" for the "Dedicated User" field as you never are told to create it on the PUSH side.

So I created it on the PUSH side and it still didn't work.

I even manually ssh'd into the PUSH side to accept the host key, and it still didn't work.

I see the autorepl.py script running per /var/log/debug.log and yet the replication task page sits at "Remote system not responding." when I can ssh into it manually.

#51 Updated by Vaibhav Chauhan almost 4 years ago

  • Target version changed from 11.0-U1 to 11.0-U2

#52 Updated by Lance Fogle almost 4 years ago

After some research in the FreeBSD ZFS documentation, it appears that most of the problems people will have with using a non-root user will involve the user properties (because they can vary so widely). If you don't use quota, then the receiving user doesn't need the "quota" flag on the allow command. If you do use quota, then the receiving user will require it, etc. The only way to have one set of instructions as far as properties go would be to use "userprop", which covers all possible properties that need to be set on the receiving end to match what they are on the sending end. If you have the requirement to truly set minimum permissions, however, this could be replaced by individual property flags in the allow command (quota, readonly, compression, atime, etc).

That being said, the following should be sufficient based on what I am seeing in the FreeBSD ZFS doc:

zfs allow -ldu username create,destroy,mount,receive,release,userprop volume/dataset
vfs.usermount set to 1
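
The two settings above could be sketched as a script. This is untested and only illustrative: `build_allow` and `usermount_status` are hypothetical helpers, not part of FreeNAS. `build_allow` only prints the command it assembles, so either "userprop" or individual property flags can be tried out before anything is run on the receiving system.

```shell
#!/bin/sh
# Hypothetical helpers for the two settings listed above; not part of FreeNAS.

# Print a "zfs allow" command made of the base permissions from this comment
# plus any extra permissions given as arguments (userprop, quota, ...).
build_allow() {
    user="$1"
    dataset="$2"
    shift 2
    perms="create,destroy,mount,receive,release"
    for p in "$@"; do
        perms="$perms,$p"
    done
    echo "zfs allow -ldu $user $perms $dataset"
}

# Report whether a vfs.usermount value permits mounts by non-root users.
usermount_status() {
    if [ "$1" = "1" ]; then
        echo "vfs.usermount=1: non-root mounts enabled"
    else
        echo "vfs.usermount=$1: run 'sysctl vfs.usermount=1' as root"
    fi
}

# One command using userprop, one using individual property flags instead.
build_allow username volume/dataset userprop
build_allow username volume/dataset quota readonly compression

# Read the live value on FreeBSD; fall back to "unknown" elsewhere.
current="$(sysctl -n vfs.usermount 2>/dev/null || echo unknown)"
usermount_status "$current"
```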

I will do some testing with this and report back as soon as I can on if anything additional is really required.

#53 Updated by Vaibhav Chauhan almost 4 years ago

  • Target version changed from 11.0-U2 to 11.0-U3

#54 Updated by Warren Block over 3 years ago

  • % Done changed from 0 to 80

#55 Updated by Bonnie Follweiler over 3 years ago

I followed the instructions in the PDF that Warren provided and successfully set up a dedicated user replication task that works in FreeNAS 11.0-U2.

#56 Updated by Bonnie Follweiler over 3 years ago

  • QA Status Test Passes added
  • QA Status deleted (Not Tested)

#57 Updated by Warren Block over 3 years ago

  • Status changed from Screened to Ready For Release
  • % Done changed from 80 to 100

Committed version works.

#58 Updated by Dru Lavigne over 3 years ago

  • Subject changed from "Dedicated user" feature of replication not working properly. to Add dedicated user example to replication section of Guide

#59 Updated by Dru Lavigne over 3 years ago

  • Status changed from Ready For Release to Resolved
  • Needs QA changed from Yes to No

#60 Updated by Nick Bettencourt over 3 years ago

U3 doesn't exist. Is this for U4?

#61 Updated by Dru Lavigne over 3 years ago

FreeNAS U3 was released on Sept 5. The text referenced in this ticket appears in http://doc.freenas.org/11/freenas.html and will appear in the U4 TrueNAS Guide (which contains U3 plus U4).

#62 Updated by Joshua Sirrine over 3 years ago

  • Status changed from Resolved to Investigation
  • Assignee changed from Warren Block to Kris Moore

Taking this back to Investigation and assigning to Kris to decide who to send this to. I recreated this on 2 VMs and I had the following problems:

1. The replication of the snapshot seems to be successful, but the column under replication tasks for "Last snapshot sent to remote side" sits at "Not ran since boot" even though it has been run (heck, it was created after the bootup). I was never able to get it to update the column with the actual last snapshot sent. (A screenshot is attached showing the source, the destination, and the WebGUI for the source showing the improper WebGUI status.)
2. Replication works fine, but when a snapshot is sent, it finishes with an error. The WebGUI error generates an email that will be sent to the administrator, which clearly makes this not exactly how it should be. The error clears a minute later, the next time the task runs, but at that point the email is already sent.

If a dev wants to look at this, I can bootup the VMs with little or no notice to demonstrate. Both VMs are on 11.0-U2.2.

Please note that I discussed this with Warren Block, who asked Bonnie about this, and Bonnie said she didn't see this problem.

I go on vacation starting Friday, Oct 13th, returning on the 23rd.

#63 Updated by Joshua Sirrine over 3 years ago

Forgot to attach the screenshots.

#64 Updated by Dru Lavigne over 3 years ago

  • Status changed from Investigation to Resolved
  • Assignee changed from Kris Moore to Warren Block

Joshua: please create a new bug with your findings. This bug was for a doc bug which was resolved.
