Bug #15928

Stopping VirtualBox jail doesn't properly shut down the jail

Added by Brandon Tolbird about 3 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Nice to have
Assignee: Vaibhav Chauhan
Category: Middleware
Target version:
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

As mentioned in the title, the VirtualBox jail isn't shut down properly before it is stopped. This results in aborted VMs and, as I've more recently discovered, a kernel panic when the jail is stopped while VMs are still running.

I have included a fix in the form of a jail-pre-stop file (that goes in .(jailname).meta) that issues the rc.shutdown command to the relevant jail ID. Also included is a textdump archive that shows the cause of the kernel panic is VBoxHeadless.
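The attached jail-pre-stop file is not reproduced in this ticket. As a rough illustration of the approach only, the Python sketch below builds and runs a jexec call that executes /etc/rc.shutdown inside a jail identified by its ID. The function names are hypothetical, and invoking the script through /bin/sh is an assumption, not something confirmed by the attachment.

```python
import subprocess
from typing import List


def build_pre_stop_command(jid: int) -> List[str]:
    """Return the argv that runs the jail's rc.shutdown through jexec.

    Hypothetical helper: the real jail-pre-stop attachment may differ.
    """
    if jid <= 0:
        raise ValueError("jail ID must be a positive integer")
    # Assumption: go through /bin/sh so the script runs even when
    # /etc/rc.shutdown itself lacks the execute bit.
    return ["jexec", str(jid), "/bin/sh", "/etc/rc.shutdown"]


def run_pre_stop(jid: int, timeout: float = 120.0) -> int:
    """Run the shutdown command on the host and return its exit status."""
    result = subprocess.run(build_pre_stop_command(jid), timeout=timeout, check=False)
    return result.returncode
```

Calling run_pre_stop(jid) before the jail is torn down would give services inside it, including any running VMs, a chance to stop cleanly. The 120-second timeout is an arbitrary placeholder.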

Associated revisions

Revision d022355d (diff)
Added by John Hixson about 3 years ago

Nuke virtualbox template with prejudice

Ticket: #15928

Revision ca3b5de7 (diff)
Added by John Hixson about 3 years ago

Nuke virtualbox template with prejudice

Ticket: #15928
(cherry picked from commit d022355d9fb8b06bcba53b21c58531894195d420)

Revision 356c9e97 (diff)
Added by John Hixson about 3 years ago

Nuke virtualbox template with prejudice

Ticket: #15928
(cherry picked from commit d022355d9fb8b06bcba53b21c58531894195d420)

History

#1 Updated by Jordan Hubbard about 3 years ago

  • Assignee changed from Brandon Tolbird to Kris Moore

#2 Updated by Brandon Tolbird about 3 years ago

Actually, it seems that the correct fix was to run 'chmod u+x /etc/rc.shutdown' in the jail.

I had noticed while applying the first fix that /etc/rc.shutdown wasn't marked as executable, so I added execute permissions. After (successfully) shutting down FreeNAS and watching it on the console, I saw the "stopping jail with: /etc/rc.shutdown" messages. That made me think that maybe it was trying to call rc.shutdown but couldn't because of permissions. I commented out the jexec line in the pre-stop file and then cycled the virtualbox jail on and off a couple of times, and still no kernel panic. I checked the jail and VM: the jail is being shut down via rc.shutdown and the VM is getting its state saved in the process.
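For anyone who wants to script the permission check, here is a minimal Python sketch of the equivalent of 'chmod u+x'. The function name is hypothetical, and the path passed in would need to point at rc.shutdown under the jail's root on the host, which is not given in this ticket.

```python
import os
import stat


def ensure_owner_executable(path: str) -> bool:
    """Add the owner execute bit to path if it is missing.

    Returns True when the mode was changed, False when it was already set.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & stat.S_IXUSR:
        return False
    # Same effect as 'chmod u+x': keep the existing bits, add owner execute.
    os.chmod(path, mode | stat.S_IXUSR)
    return True
```

The return value makes it easy to log whether the script was actually missing the bit before the jail was stopped.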

#3 Updated by Kris Moore about 3 years ago

  • Assignee changed from Kris Moore to John Hixson

#4 Updated by John Hixson about 3 years ago

  • Status changed from Unscreened to Screened
  • Target version set to 261

#5 Updated by John Hixson about 3 years ago

  • Status changed from Screened to 15

Brandon Tolbird wrote:

Actually, it seems that the correct fix was to run 'chmod u+x /etc/rc.shutdown' in the jail.

I had noticed while applying the first fix that /etc/rc.shutdown wasn't marked as executable, so I added execute permissions. After (successfully) shutting down FreeNAS and watching it on the console, I saw the "stopping jail with: /etc/rc.shutdown" messages. That made me think that maybe it was trying to call rc.shutdown but couldn't because of permissions. I commented out the jexec line in the pre-stop file and then cycled the virtualbox jail on and off a couple of times, and still no kernel panic. I checked the jail and VM: the jail is being shut down via rc.shutdown and the VM is getting its state saved in the process.

So does this fix your problem?

#6 Updated by Brandon Tolbird about 3 years ago

Apparently not. I tried testing it again just now, by stopping the jail twice with the webGUI, and both times a kernel panic occurred. I checked the latest textdump files to confirm it was VBoxHeadless still and yes it was once again causing the panics. I went back to jail-pre-stop and uncommented the jexec command. Tried stopping the jail again and so far, seems stable. Don't know why it would stop working; the permissions for rc.shutdown in the jail haven't changed.

#7 Updated by Vaibhav Chauhan about 3 years ago

John, do we want to punt this fix to 9.10.2, or would you rather work on getting it into 9.10.1?

#8 Updated by John Hixson about 3 years ago

Vaibhav Chauhan wrote:

John, do we want to punt this fix to 9.10.2, or would you rather work on getting it into 9.10.1?

I don't have a fix for this.

#9 Updated by John Hixson about 3 years ago

  • Status changed from 15 to Investigation

#10 Updated by John Hixson about 3 years ago

  • Target version changed from 261 to 9.10.2

#11 Updated by John Hixson about 3 years ago

I'm not clear on what to do with this ticket. VirtualBox in a jail seems very problematic. The original maintainer for this template hasn't had anything to do with this for a while and I don't know that anyone else wants to maintain it. I am inclined to close this out with a "use at your own risk" warning. If that is not sufficient, I'm willing to yank the template out of FreeNAS altogether. Anyone else have any comments on this?

#12 Updated by Jordan Hubbard about 3 years ago

I'd shoot the virtualbox template and EOL it. iohyve now lives in 9.10 for ad-hoc virtual machines.

#13 Updated by Kris Moore about 3 years ago

Yeah, nuke it from orbit.

#14 Updated by John Hixson about 3 years ago

  • Status changed from Investigation to Needs Developer Review
  • Assignee changed from John Hixson to Vaibhav Chauhan

Nuked.

#15 Updated by Vaibhav Chauhan almost 3 years ago

William, it looks like a Django migration. Can you please take a look? If the change looks good, can we target this for FreeNAS-9.10.1-U1?

#16 Updated by Vaibhav Chauhan almost 3 years ago

  • Assignee changed from Vaibhav Chauhan to William Grzybowski

#17 Updated by William Grzybowski almost 3 years ago

  • Status changed from Needs Developer Review to Reviewed
  • Assignee changed from William Grzybowski to John Hixson

#18 Updated by John Hixson almost 3 years ago

  • Assignee changed from John Hixson to Vaibhav Chauhan

#19 Updated by Vaibhav Chauhan over 2 years ago

  • Status changed from Reviewed to Ready For Release

#20 Updated by Vaibhav Chauhan over 2 years ago

  • Priority changed from No priority to Nice to have

#21 Updated by I. M. Stochastic over 2 years ago

As a reasonably Linux- and programming-savvy user who was just blindsided by a minor version upgrade suddenly causing all of my carefully prepared VMs not to boot, let me just say that I think the decision to remove functionality like this in a minor release was a whopping bad idea. I finally discovered this thread:

https://forums.freenas.org/index.php?threads/vms-wont-start-after-update-to-9-10-1-u1.46344/

Taking actions like this persuades people not to take critical updates, because they never know what feature will magically go away. This entire story is a symptom of poor release planning. I understand that the feature was a kluge, was user contributed, and was no longer being maintained. This leads one to question why kluge features being maintained by users were even present in 'stable' release branches, and were not labeled 'use at your own risk' in the first place.

I had no idea that iohyve existed until today. Now that I do, I agree that it seems to be structurally the better way to address the problem of how to run VMs in the longer term. Hopefully, some UI becomes available on a stable branch sometime soon that makes this not a pain. Lucky for me my desired guest OS types are supported.

I'm sure that there are lots of great people doing hard work trying to make FreeNAS viable, and I respect those people. I'm afraid without a good framework and good planning the result is simply not viable. As someone mentioned in the forum I linked to above, what really worries me here is not the sudden lack of a VM feature... it's the fact that the VM feature was there and then went away on a minor bugfix release on a 'stable' branch. I'll have to re-think my storage needs going forward. Shame, because I LOVE ZFS, and many of the other features of FreeNAS.

#22 Updated by Jordan Hubbard over 2 years ago

I understand the frustration being expressed above, but I think this also misses some of the nuances in terms of how "VM support has evolved" (and some of the painful choices that needed to be made along the way) with 9.x. Here's a very brief recap:

1. FreeNAS is created with the goal of being a storage appliance and, with this as its primary mission, does fine for a number of years, quietly building a fan base among people who just want a storage appliance. No VMs, no containers, just storage. Users, however, keep asking if there might not be some way, any way, to just run a VM here or a VM there, nothing too fancy. This sits on the wish list for some time, until one day someone manages to figure out how to run Virtualbox inside a jail. Users then do that on an ad-hoc basis for a while, since the jail implementation is pretty flexible, but there's one problem: in order to run Virtualbox in a jail, you also have to modify the host because it needs kernel modules to be loaded, and even worse, those kernel modules are rev-locked to the jailed version of Virtualbox. It really horks up the abstraction boundary that jails are traditionally supposed to provide (no modifications to host necessary) and people keep modifying their FreeNAS bits, only to see the modifications go away with every upgrade (that's on purpose), and it's a really cranky state of affairs for a long time.

2. Bowing to increasing user pressure, the FreeNAS team eventually agrees to bundle the virtualbox kernel modules because that's a lot better than users constantly attacking the host configuration with various blunt instruments, but there's still the rev-lock problem. Not every time, but many times, whenever virtualbox updates the jailed version it goes out of sync with the kernel modules on the host and things go pear-shaped. Clearly, this is a really fragile technology and not one with long-term prospects for FreeNAS; it's just a holding action at best, and Virtualbox performance for VMs is also nothing to write home about.

3. FreeNAS is upgraded to FreeBSD 10.3 and, in the process, finally gets support for bhyve. Yay! There is not, however, any "command and control" code for bhyve - it's basically just a matter of running bhyve manually from the shell, with its 547 different command line arguments, and if users want any kind of "persistence" for their VM configurations then they're out of luck. Make a script or something, is the thinking. Also clearly far from optimal but at least the binary compatibility issues are gone. Around this time, a little tool called "iohyve" also pops up and, although it seems to be updating itself frequently, it at least takes care of the command-and-control bits and configuring all of the ZFS dataset stuff around running bhyve VMs. Given that FreeNAS 10 is also on the distant horizon and has a very comprehensive solution in the wings for doing VM management (with a middleware and CLI and GUI, yada yada yada), iohyve is included but marked "experimental" given that it's a "better than doing it by hand" solution but also has no long-term prospects, largely because everything is still very much in flux with 9 and 10 running in parallel and nobody is willing to commit to the iohyve way of doing things as a "de facto standard" (which is understandable because it's not, it's just one tool among many different options, and it's doing a lot of stuff).

4. Someone forks FreeNAS 9 and adds their own VM management GUI and requisite glue code. There are many subjective comparisons to be made between it and iohyve but at least there is a GUI piece now. Yay! Now there are 3 ways of doing this in 9, however. Boo! Given the presence of a GUI and more "fully fleshed out" VM implementation, the 9 team decides to follow this fork and make just one way going forward. Now virtualbox and iohyve are even MORE deprecated, especially once 9.10 nightlies start coming out with the VM feature "released", albeit not on the STABLE branch.

Now we are also on the verge of shipping 10 and, of course, it has evolved its own fairly comprehensive VM and Docker story, driven in large part by the last 2 years of ad-hoc evolution described above and a clear need to have a more formal and well-defined strategy with respect to VMs.

Does that mean there has also been some turmoil for VM users who have tried one or all of the above over the years? Yeah, totally. That's kind of the natural, albeit unfortunate, side-effect of running a series of science experiments until it becomes clear what a reasonable path forward might look like and where the appropriate coding resources to throw at the problem can be found. We've probably put a few hundred man hours, very conservatively, into the VM middleware in FreeNAS 10 and made sure it had a comprehensive CLI and GUI, complete with support for cloning, snapshots, device management, and so on. We also put a bunch of work into 9pfs so that we would have a high speed (>1GB/sec) filesystem interconnect between VM and host.

All of that was a direct result of having done the science experiments and seeing the shortcomings with every previous approach, and I'm really not sure how we could have done it all substantially differently other than perhaps to have just said "NO!" to all of the early requests and never offered even the interim solutions. That would have been much easier to support, and certainly less rocky during the transitions, but it also wouldn't have taught anyone anything, so I'm not sure we would have done it differently even if we knew then what we know now (OK, maybe one thing differently - we'd have skipped virtualbox completely :) )

#23 Updated by Dru Lavigne over 1 year ago

  • File deleted (jail-pre-stop)

#24 Updated by Dru Lavigne over 1 year ago

  • File deleted (textdump.tar.last.gz)

#25 Updated by Dru Lavigne over 1 year ago

  • Status changed from Ready For Release to Resolved
