Bug #34330

VM crashes and then fails to start due to insufficient memory.

Added by nood lehead about 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: No priority
Assignee: Brandon Schneider
Category: Middleware
Target version:
Severity: Med High
Reason for Closing: Duplicate Issue
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

On latest snapshot.
VM crashes and then fails to start due to insufficient memory. 32 GB RAM + 14 TB RAIDZ2.

cat /var/log/middlewared.log

[2018/05/31 07:14:29] (WARNING) middlewared._loop_monitor_thread():1103 - Task seems blocked:  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/vm.py", line 253, in run
    self.logger.debug('{}: {}'.format(self.vm['name'], line.decode()))
  File "/usr/local/lib/python3.6/logging/__init__.py", line 1294, in debug
    self._log(DEBUG, msg, args, **kwargs)
  File "/usr/local/lib/python3.6/logging/__init__.py", line 1442, in _log
    self.handle(record)
  File "/usr/local/lib/python3.6/logging/__init__.py", line 1452, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.6/logging/__init__.py", line 1514, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.6/logging/__init__.py", line 863, in handle
    self.emit(record)
  File "/usr/local/lib/python3.6/logging/handlers.py", line 73, in emit
    logging.FileHandler.emit(self, record)
  File "/usr/local/lib/python3.6/logging/__init__.py", line 1070, in emit
    StreamHandler.emit(self, record)
  File "/usr/local/lib/python3.6/logging/__init__.py", line 996, in emit
    self.flush()
  File "/usr/local/lib/python3.6/logging/__init__.py", line 976, in flush
    self.stream.flush()

[2018/05/31 07:15:32] (DEBUG) VMService.run():253 - sonion: Unhandled ps2 keyboard keysym 0x1008ff13

[2018/05/31 07:15:32] (DEBUG) VMService.run():253 - sonion: Unhandled ps2 keyboard keysym 0x1008ff13

[2018/05/31 07:15:32] (DEBUG) VMService.run():253 - sonion: Unhandled ps2 keyboard keysym 0x1008ff13

[2018/05/31 07:21:42] (INFO) VMService.run():276 - ===> Error VM: sonion ID: 9 BHYVE_CODE: -10
[2018/05/31 07:21:42] (ERROR) VMService.running():398 - ===> VMM sonion is running without bhyve process.
[2018/05/31 07:21:42] (DEBUG) VMService.__teardown_guest_vmemory():298 - ===> Give back guest memory to ARC.: 8589934592
[2018/05/31 07:21:42] (WARNING) VMService.destroy_vm():281 - ===> Destroying VM: sonion ID: 9 BHYVE_CODE: -10
[2018/05/31 07:21:42] (DEBUG) VMService.kill_bhyve_web():365 - ==> Killing WEBVNC: 4240
[2018/05/31 07:24:16] (WARNING) VMService.__init_guest_vmemory():830 - ===> Cannot guarantee memory for guest id: 9
[2018/05/31 07:24:24] (WARNING) VMService.__init_guest_vmemory():830 - ===> Cannot guarantee memory for guest id: 9
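
For anyone triaging a log like the one above, a minimal sketch of pulling the fields out of each entry may help. The regular expression is inferred only from the lines shown in this ticket, and `parse_line` and `bytes_to_gib` are hypothetical helper names, not part of middlewared. It also shows that the 8589934592 bytes given back to ARC is exactly 8 GiB.

```python
import re

# Pattern inferred from entries such as:
# [2018/05/31 07:21:42] (DEBUG) VMService.run():253 - message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] "
    r"\((?P<level>[A-Z]+)\) "
    r"(?P<source>[\w.]+)\(\):(?P<lineno>\d+) - "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of a log entry as a dict, or None for non-entry lines."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def bytes_to_gib(nbytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return nbytes / 2**30


sample = (
    "[2018/05/31 07:21:42] (DEBUG) VMService.__teardown_guest_vmemory():298 - "
    "===> Give back guest memory to ARC.: 8589934592"
)
fields = parse_line(sample)
# The byte count is the last space-separated token of the message.
released = int(fields["message"].rsplit(" ", 1)[-1])
print(fields["level"], fields["source"], bytes_to_gib(released))
# prints: DEBUG VMService.__teardown_guest_vmemory 8.0
```

Traceback continuation lines (for example `    self.flush()`) do not match the pattern and return None, so they can be skipped or attached to the previous entry.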


Related issues

Is duplicate of FreeNAS - Bug #26434: Add descriptive error to API when VM fails to start (Done)

History

#1 Updated by Dru Lavigne about 1 year ago

  • Reason for Blocked set to Need additional information from Author

How much memory did you give the VM in this screen: http://doc.freenas.org/11/_images/vms-add.png ?

Are you running any other VMs or jails? If so, how many?

#2 Updated by nood lehead about 1 year ago

No jails or other VMs, fresh reboot, I gave it 8 GB of memory. I also have vfs.zfs.arc_free_target and vm.v_free_target set to 65536.
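
To put the 65536 figure in context, here is a rough conversion. It assumes both sysctls are counted in memory pages and that the page size is 4 KiB (the usual value on amd64; `sysctl -n hw.pagesize` reports the real one). `pages_to_mib` is a hypothetical helper name.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB (2**20 bytes)."""
    return pages * page_size / 2**20


free_target_pages = 65536  # value reported in this comment
print(pages_to_mib(free_target_pages))  # prints: 256.0
```

Under those assumptions the targets correspond to 256 MiB of free memory, which is small next to the 8 GB requested for the guest.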

#3 Updated by Dru Lavigne about 1 year ago

  • Private changed from No to Yes

Thanks!

Please attach a debug (System -> Advanced -> Save Debug) to this ticket to assist the dev in determining the cause.

#4 Updated by nood lehead about 1 year ago

  • File debug-nas-20180531105114.tgz added

#5 Updated by Dru Lavigne about 1 year ago

  • Category changed from Middleware to OS
  • Assignee changed from Release Council to Marcelo Araujo
  • Seen in changed from 11.2-RC2 to Master - FreeNAS Nightlies
  • Reason for Blocked deleted (Need additional information from Author)

#6 Updated by nood lehead about 1 year ago

  • File debug-nas-20180601001030.tgz added

Just tried again with a newer snapshot, same problem.
It always fails when I run sudo apt dist-upgrade on the VM for the first time; it runs fine the second time.
So the crash might be an issue with the VM OS: Security Onion, based on Ubuntu 14.04.
But the VM not being able to start again is the bigger problem.

#7 Updated by nood lehead about 1 year ago

vm crashed again, no action by me, same errors.

#9 Updated by Marcelo Araujo about 1 year ago

  • Status changed from Unscreened to Screened
  • Target version changed from Backlog to 12.0

When the VM crashes, the middlewared vm plugin currently doesn't give the memory back to the ZFS ARC, which is what causes the "Cannot guarantee memory for guest" message.
I will take a look to see what can be done regarding the ZFS ARC memory when the VM crashes.

Best,

#10 Updated by Marcelo Araujo about 1 year ago

  • Category changed from OS to Middleware
  • Status changed from Screened to Unscreened
  • Assignee changed from Marcelo Araujo to William Grzybowski
  • Target version changed from 12.0 to N/A

Forwarding to the firmware team; my last comment describes what happens inside the vm plugin.

#11 Updated by William Grzybowski about 1 year ago

  • Assignee changed from William Grzybowski to Brandon Schneider
  • Target version changed from N/A to Backlog

#12 Updated by Brandon Schneider about 1 year ago

  • Status changed from Unscreened to Not Started

#13 Updated by Brandon Schneider about 1 year ago

  • Status changed from Not Started to In Progress

Should also be solved by PR https://github.com/freenas/freenas/pull/1458/files

DESC: If any error happens, the guest memory is not torn down.
RISK: Low
ACCEPTANCE: Destroy zvol and start a vm, recreate zvol and start vm, message shouldn't appear.
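
The failure mode in the DESC line can be illustrated with a toy model. This is only a sketch of the general pattern (release the reservation in a `finally` block), not the code from the PR; `MemoryPool`, `GuestCrashed`, `run_guest`, and `crashing_guest` are hypothetical names, and the 12 GiB pool size is made up for the example.

```python
class MemoryPool:
    """Toy stand-in for host memory that can be promised to guests."""

    def __init__(self, total_bytes):
        self.free_bytes = total_bytes

    def reserve(self, nbytes):
        if nbytes > self.free_bytes:
            raise MemoryError("Cannot guarantee memory for guest")
        self.free_bytes -= nbytes

    def release(self, nbytes):
        self.free_bytes += nbytes


class GuestCrashed(Exception):
    """Raised by the toy guest to simulate a crash."""


def run_guest(pool, nbytes, guest):
    """Reserve memory, run the guest, and always give the memory back."""
    pool.reserve(nbytes)
    try:
        guest()
    finally:
        # Runs on normal exit and on any exception,
        # so a crash cannot leak the reservation.
        pool.release(nbytes)


def crashing_guest():
    raise GuestCrashed("guest exited with an error")


GIB = 2**30
pool = MemoryPool(12 * GIB)

try:
    run_guest(pool, 8 * GIB, crashing_guest)
except GuestCrashed:
    pass

# The reservation was returned despite the crash, so a second start succeeds.
run_guest(pool, 8 * GIB, lambda: None)
print(pool.free_bytes == 12 * GIB)  # prints: True
```

Without the `finally` block, the first crash would leave 8 GiB reserved, and the second `reserve` call would raise the "Cannot guarantee memory for guest" error even though no guest is running.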

#14 Updated by Brandon Schneider about 1 year ago

  • Related to Bug #26434: Add descriptive error to API when VM fails to start added

#15 Updated by Dru Lavigne about 1 year ago

  • Related to deleted (Bug #26434: Add descriptive error to API when VM fails to start)

#16 Updated by Dru Lavigne about 1 year ago

  • Is duplicate of Bug #26434: Add descriptive error to API when VM fails to start added

#17 Updated by Dru Lavigne about 1 year ago

  • File deleted (debug-nas-20180531105114.tgz)

#18 Updated by Dru Lavigne about 1 year ago

  • File deleted (debug-nas-20180601001030.tgz)

#19 Updated by Dru Lavigne about 1 year ago

  • Status changed from In Progress to Closed
  • Target version changed from Backlog to N/A
  • Private changed from Yes to No
  • Reason for Closing set to Duplicate Issue
