Bug #14750

FreeNAS fails to snapshot specific VM on VMware datastore

Added by Henning Kessler over 5 years ago. Updated about 4 years ago.

Status:
Resolved
Priority:
Important
Assignee:
William Grzybowski
Category:
Middleware
Target version:
Severity:
New
Reason for Closing:
Reason for Blocked:
Needs QA:
Yes
Needs Doc:
Yes
Needs Merging:
Yes
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:

HP Proliant Gen8 Microserver
16GB RAM
4x 4TB WD Red in striped mirrors

ChangeLog Required:
No

Description

I have an issue with VMware snapshots: a single VM, hosted on an iSCSI zvol on my FreeNAS, constantly produces errors.
As I have a periodic snapshot task configured for this dataset, I am constantly receiving emails with the following content:

"The following VM failed to snapshot mypool/iscsi@auto-20160414.1800-2w:
[iscsi] ELK/ELK.vmx"

I can take snapshots of the VM on the ESXi system without any errors.
When I try to take a manual snapshot from my FreeNAS system, I can see on the ESXi host that the snapshot succeeds for one VM but fails for the other, with the following output:

Traceback:
File "/usr/local/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
  105.                     response = middleware_method(request, callback, callback_args, callback_kwargs)
File "/usr/local/www/freenasUI/../freenasUI/freeadmin/middleware.py" in process_view
  156.         return login_required(view_func)(request, *view_args, **view_kwargs)
File "/usr/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
  22.                 return view_func(request, *args, **kwargs)
File "/usr/local/www/freenasUI/../freenasUI/storage/views.py" in manualsnap
  715.             form.commit(fs)
File "/usr/local/www/freenasUI/../freenasUI/storage/forms.py" in commit
  1865.                     vm1.create_snapshot(vmsnapname, description=vmsnapdescription, memory=False)
File "/usr/local/lib/python2.7/site-packages/pysphere/vi_virtual_machine.py" in create_snapshot
  889.                                       FaultTypes.TASK_ERROR)

Exception Type: VIException at /storage/snapshot/create/mypool/iscsi/
Exception Value: [Task Error]: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.
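
For context on the failure: pysphere's create_snapshot() appears to quiesce the guest by default, and the traceback above shows the call passing only memory=False, so the in-guest quiesce runs and times out. Below is a minimal sketch of a possible workaround (this is not the FreeNAS code): retry the snapshot without quiescing if the quiesced attempt fails. The take_snapshot callable is an injected stand-in for a pysphere-style create_snapshot, so the retry logic can be shown without a live vCenter connection.

```python
def snapshot_with_quiesce_fallback(take_snapshot, name, description=""):
    """Try a quiesced snapshot first; fall back to non-quiesced on failure.

    `take_snapshot` stands in for a pysphere-style create_snapshot call
    (name, description=..., memory=..., quiesce=...).
    """
    try:
        # First attempt: ask VMware Tools in the guest to quiesce the
        # filesystem (this is the step that fails on the ELK VM here).
        return take_snapshot(name, description=description,
                             memory=False, quiesce=True)
    except Exception:
        # Quiesce failed or timed out in the guest; retry as a
        # crash-consistent snapshot instead.
        return take_snapshot(name, description=description,
                             memory=False, quiesce=False)
```

Note that a non-quiesced snapshot is only crash-consistent, so this trades filesystem consistency inside the guest for reliability of the snapshot task.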

I posted this on the FreeNAS forums, and a user (dlavigne) mentioned bug [[https://bugs.freenas.org/issues/6873]]; maybe it is related to my problem?

Here are some more specs:

FreeNAS
FreeNAS-9.10-STABLE-201604111739
16GB RAM
HP Proliant Gen8 Microserver
4x 4TB WD Red in striped mirrors

ESXi
ESXi 6.0 U2 (Build 3620759)
Macmini6,2
16GB RAM

Failing VM:
FreeBSD (64Bit) 10.3
VM-Version 8
1GB RAM
VMWare Tools 10246

Working VM:
OS X 10.9
VM-Version 8
2GB RAM
VMWare Tools 9344


Related issues

Is duplicate of FreeNAS - Bug #20168: [regression] Problem with Snapshot and VMware (Closed: Duplicate, 2017-01-11)

History

#1 Updated by William Grzybowski over 5 years ago

  • Assignee changed from William Grzybowski to Josh Paetzel

Any ideas, Josh?

#2 Updated by Josh Paetzel over 5 years ago

  • Status changed from Unscreened to Investigation
  • Priority changed from No priority to Important

I'm full of ideas, however none of them relate to this ticket. Let me do a bit of research.

#3 Updated by Kris Moore almost 5 years ago

Still something to investigate or should we close this now?

#4 Updated by Henning Kessler almost 5 years ago

Kris Moore wrote:

Still something to investigate or should we close this now?

Hi Kris
Unfortunately the issue still persists. I just stopped making periodic snapshots, as the constant flow of warning emails was annoying. Fortunately this ESXi installation is not a production system :-).

#5 Updated by Kris Moore almost 5 years ago

  • Target version set to 9.10.2

#6 Updated by Josh Paetzel over 4 years ago

  • Target version changed from 9.10.2 to 9.10.2-U1

#7 Updated by Kris Moore over 4 years ago

  • Target version changed from 9.10.2-U1 to 9.10.2-U2

#8 Updated by Josh Paetzel over 4 years ago

  • Status changed from Investigation to Unscreened
  • Assignee changed from Josh Paetzel to Kris Moore

#9 Updated by Kris Moore over 4 years ago

  • Assignee changed from Kris Moore to William Grzybowski
  • Target version changed from 9.10.2-U2 to 9.10.2-U3

#10 Updated by William Grzybowski over 4 years ago

  • Status changed from Unscreened to Screened

#11 Updated by William Grzybowski over 4 years ago

  • Status changed from Screened to 15

This sounds like an error coming from VMware. Do you see anything in the logs on the VMware side?

#12 Updated by William Grzybowski over 4 years ago

  • Is duplicate of Bug #20168: [regression] Problem with Snapshot and VMware added

#13 Updated by Henning Kessler over 4 years ago

William Grzybowski wrote:

This sounds like an error coming from VMware. Do you see anything in the logs on the VMware side?

Hello William,

I am not really sure what to search for, but this is what I see in the logs:

host.log:

2017-02-01T08:25:05.797Z info hostd[6D1C1B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 116449 : Warning message on ELK on esxi.hkint.de in ha-datacenter: Timed out while quiescing the virtual machin
-->
2017-02-01T08:25:05.980Z info hostd[6CA40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx] Answered question 32031753
2017-02-01T08:25:06.017Z info hostd[6D8C2B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx] Send config update invoked
2017-02-01T08:25:06.038Z info hostd[6D1C1B70] [Originator@6876 sub=Snmpsvc] VmConfigListener: vm state change received, queueing reload request
2017-02-01T08:25:06.844Z info hostd[6D8C2B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx] Send config update invoked
2017-02-01T08:25:06.877Z info hostd[6D481B70] [Originator@6876 sub=Snmpsvc] VmConfigListener: vm state change received, queueing reload request
2017-02-01T08:25:06.877Z info hostd[6D481B70] [Originator@6876 sub=Snmpsvc] QueueReloadRequest: reload already scheduled, discarding event
2017-02-01T08:25:06.883Z info hostd[6D8C2B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx] Send config update invoked
2017-02-01T08:25:06.913Z info hostd[6D481B70] [Originator@6876 sub=Snmpsvc] VmConfigListener: vm state change received, queueing reload request
2017-02-01T08:25:06.913Z warning hostd[6D8C2B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx] Failed operation
2017-02-01T08:25:06.913Z info hostd[6D481B70] [Originator@6876 sub=Snmpsvc] QueueReloadRequest: reload already scheduled, discarding event
2017-02-01T08:25:06.913Z info hostd[6D8C2B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx] State Transition (VM_STATE_CREATE_SNAPSHOT -> VM_STATE_ON)

Virtual machine log file:

2017-02-01T08:11:07.270Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox-dnd timed out.
2017-02-01T08:25:05.796Z| vmx| I125: Msg_Post: Warning
2017-02-01T08:25:05.796Z| vmx| I125: [msg.snapshot.quiesce.timeout] Timed out while quiescing the virtual machine.
2017-02-01T08:25:05.796Z| vmx| I125: ----------------------------------------
2017-02-01T08:25:05.797Z| vmx| I125: Vigor_MessageRevoke: message 'msg.snapshot.quiesce.timeout' (seq 32031753) is revoked
2017-02-01T08:25:05.797Z| vmx| I125: ToolsBackup: changing quiesce state: ERROR_WAIT -> DONE
2017-02-01T08:25:05.922Z| vmx| I125: SNAPSHOT: SnapshotDeleteWork '/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK.vmx' : 890
2017-02-01T08:25:05.923Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : open successful (21) size = 16826368, hd = 0. Type 8
2017-02-01T08:25:05.923Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : closed.
2017-02-01T08:25:05.925Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-flat.vmdk" : open successful (21) size = 22549626880, hd = 0. Type 3
2017-02-01T08:25:05.925Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-flat.vmdk" : closed.
2017-02-01T08:25:05.925Z| vmx| A100: ConfigDB: Setting displayName = "ELK" 
2017-02-01T08:25:05.954Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : open successful (1114133) size = 0, hd = 0. Type 8
2017-02-01T08:25:05.954Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : closed.
2017-02-01T08:25:05.954Z| vmx| A100: ConfigDB: Setting displayName = "ELK" 
2017-02-01T08:25:05.968Z| vmx| I125: SnapshotVMXTakeSnapshotComplete: Done with snapshot 'bd7ca1c9-7278-4107-b164-0e515b3c7ef0': 890
2017-02-01T08:25:05.968Z| vmx| I125: SnapshotVMXTakeSnapshotComplete: Snapshot 890 failed: Failed to quiesce the virtual machine (31).
2017-02-01T08:25:05.968Z| vmx| I125: SnapshotVMXTakeSnapshotComplete: Cleaning up incomplete snapshot 890.
2017-02-01T08:25:05.968Z| vmx| I125: SnapshotVMXTakeSnapshotComplete: Starting snapshot consolidation for cleaning up incomplete snapshots.
2017-02-01T08:25:05.971Z| vmx| I125: SnapshotVMX_Consolidate: Starting online snapshot consolidate operation.
2017-02-01T08:25:05.971Z| vmx| I125: Vix: [2027882 mainDispatch.c:4292]: VMAutomation_ReportPowerOpFinished: statevar=3, newAppState=1880, success=1 additionalError=0
2017-02-01T08:25:05.971Z| vmx| I125: Turning on snapshot info cache. VM=ELK.vmx.
2017-02-01T08:25:05.974Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : open successful (1114133) size = 0, hd = 0. Type 8
2017-02-01T08:25:05.974Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : closed.
2017-02-01T08:25:05.974Z| vmx| I125: SNAPSHOT: Turning on snapshot disk cache.
2017-02-01T08:25:05.974Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : open successful (21) size = 16826368, hd = 0. Type 8
2017-02-01T08:25:05.975Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-000001-delta.vmdk" : closed.
2017-02-01T08:25:05.975Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-flat.vmdk" : open successful (21) size = 22549626880, hd = 0. Type 3
2017-02-01T08:25:05.975Z| vmx| I125: DISKLIB-VMFS  : "/vmfs/volumes/55004ac9-d0e36d4a-e177-001018000000/ELK/ELK-flat.vmdk" : closed.
2017-02-01T08:25:05.975Z| vmx| I125: WORKER: Creating new group with numThreads=1 (4)
2017-02-01T08:25:05.975Z| vmx| I125: WORKER: Creating new group with numThreads=1 (4)

#14 Updated by Kris Moore over 4 years ago

  • Target version changed from 9.10.2-U3 to 9.10.3

#15 Updated by William Grzybowski over 4 years ago

  • Status changed from 15 to Screened

I have no idea what these errors mean on the VMware side; I'll have to research it.

#16 Updated by William Grzybowski over 4 years ago

  • Status changed from Screened to Resolved

I think this is resolved by switching the Python library from pysphere to pyvmomi, which disables quiesce. 05bb91c0dec
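
For reference, a hedged sketch of the parameter change (an illustration, not commit 05bb91c0dec): pyvmomi exposes vim.VirtualMachine.CreateSnapshot_Task(), which takes explicit memory and quiesce flags, so the middleware can request a crash-consistent snapshot and skip the in-guest quiesce step that was timing out. The helper below only builds the arguments, so it can be shown without a live connection; the usage comment assumes a connected vim.VirtualMachine object named `vm`.

```python
def snapshot_task_args(name, description=""):
    """Arguments for a crash-consistent pyvmomi snapshot task (sketch)."""
    return {
        "name": name,
        "description": description,
        "memory": False,   # do not dump guest RAM into the snapshot
        "quiesce": False,  # skip the VMware Tools quiesce that failed here
    }

# Usage (assuming `vm` is a connected vim.VirtualMachine):
#   task = vm.CreateSnapshot_Task(**snapshot_task_args("auto-20170201"))
```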

#17 Updated by Kris Moore over 4 years ago

  • Target version changed from 9.10.3 to 11.0

#19 Updated by Vaibhav Chauhan about 4 years ago

  • Target version changed from 11.0 to 11.0-RC
