Project

General

Profile

Bug #54573

Write boot loading progress to console

Added by Alessandro Segala almost 2 years ago. Updated almost 2 years ago.

Status:
Done
Priority:
No priority
Assignee:
William Grzybowski
Category:
Middleware
Seen in:
Severity:
Medium
Reason for Closing:
Reason for Blocked:
Needs QA:
No
Needs Doc:
No
Needs Merging:
No
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:
ChangeLog Required:
No

Description

I have updated my NAS which was running 11.1 to 11.2-rc3. After the update, the VM fails to boot, and the serial console shows this error:

vmx_init: processor does not support VMX operation

I understand that this is because VT-x extensions aren't found. This is expected, since my box is running inside a VM. Enabling nested virtualization in ESXi (6.7) isn't an option because I am using PCI passthrough for a HBA.

I think FreeNAS should disable virtualization support if hardware virtualization isn't available, rather than failing to boot right away


Related issues

Has duplicate FreeNAS - Bug #53460: bootup hangs with "mounting root" Closed
Has duplicate FreeNAS - Bug #52687: Upgrade to 11.2 beta shows error message on console spew module_register_init: MOD_LOAD (vmm, 0xffffffff82a2c470, 0 - HangsClosed
Copied to FreeNAS - Bug #56526: Write boot loading progress to consoleDone

Associated revisions

Revision ee33b67c (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision a16e1043 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 84e39ce6 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 7d2a4608 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 28cc1dc4 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 5dcd25b4 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision e5fe5ba3 (diff)
Added by William Grzybowski almost 2 years ago

fix(upgrade): run upgrade scripts as first thing in boot This needs to be the first thing otherwise going further in the boot process many things may fail, mainly related to database changes. Ticket: #54573

Revision 2306e162 (diff)
Added by William Grzybowski almost 2 years ago

fix(middlewared): write to console only once Ticket: #54573

Revision 49b0d7f2 (diff)
Added by William Grzybowski almost 2 years ago

fix(middlewared): write to console only once Ticket: #54573

Revision 7d244545 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 666af0eb (diff)
Added by William Grzybowski almost 2 years ago

fix(upgrade): run upgrade scripts as first thing in boot This needs to be the first thing otherwise going further in the boot process many things may fail, mainly related to database changes. Ticket: #54573

Revision 966fb97f (diff)
Added by William Grzybowski almost 2 years ago

fix(middlewared): write to console only once Ticket: #54573

Revision 13b9c083 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 6fdb30f5 (diff)
Added by William Grzybowski almost 2 years ago

fix(upgrade): run upgrade scripts as first thing in boot This needs to be the first thing otherwise going further in the boot process many things may fail, mainly related to database changes. Ticket: #54573

Revision d0c4a6e3 (diff)
Added by William Grzybowski almost 2 years ago

fix(middlewared): write to console only once Ticket: #54573

Revision 869642e9 (diff)
Added by William Grzybowski almost 2 years ago

feat(middlewared): write loading progress to console on boot Ticket: #54573

Revision 791b6ab3 (diff)
Added by William Grzybowski almost 2 years ago

fix(upgrade): run upgrade scripts as first thing in boot This needs to be the first thing otherwise going further in the boot process many things may fail, mainly related to database changes. Ticket: #54573

Revision 41c833b6 (diff)
Added by William Grzybowski almost 2 years ago

fix(middlewared): write to console only once Ticket: #54573

History

#1 Updated by Dru Lavigne almost 2 years ago

  • Assignee changed from Release Council to Alexander Motin
  • Private changed from No to Yes
  • Reason for Blocked set to Need additional information from Author

Alessandro: can you verify the version? We cannot reproduce this on 11.2-RC1.

#2 Updated by Alessandro Segala almost 2 years ago

  • Seen in changed from 11.1-RC3 to 11.2-RC1

#3 Updated by Alessandro Segala almost 2 years ago

Sorry, it was 11.2-RC1 (I wrote the wrong version above).

Have you tried updating from 11.1-U6 to 11.2-RC1?

PS: I have no jails nor VMs running on my FreeNAS host (on 11.1-U6)

#4 Updated by Alexander Motin almost 2 years ago

I don't think that message really represents a problem, but just a last message printed before something else happened. Could you please show more messages (screenshot)? Could you try to enable verbose boot messages in boot menu? Could you try to press Ctrl+T to see whether shell tell what is running there (hoping it is user-space)?

#5 Updated by Alexander Motin almost 2 years ago

  • Status changed from Unscreened to Blocked

#6 Updated by Alessandro Segala almost 2 years ago

39447

Here it is, attached. (I tried attaching a serial port to dump the boot logs, but the logs were empty).

Could this be an issue that happened because it's an upgraded host?

#7 Updated by Alessandro Segala almost 2 years ago

There's no console available. CTRL+T doesn't bring up any working console.

#8 Updated by Alexander Motin almost 2 years ago

According to the last message there vmm module loading has already failed, so as I thought issue is likely somewhere else. Have you tried to press Ctrl+T to see what is going on? Does the system respond on anything at all? Ctrl+C? Ctrl+Alt+Del?

Depending whether you had serial console enabled in FreeNAS settings, there can be more detailed console messages in different places (graphical or serial console).

#9 Updated by Alexander Motin almost 2 years ago

Alessandro Segala wrote:

CTRL+T doesn't bring up any working console.

CTRL+T supposed to print one more more lines reporting currently running process and in some cases its status. It would be priceless.

#10 Updated by Alessandro Segala almost 2 years ago

CTRL+T shows the system uptime, but it's not a working console. I tried typing basic commands like "ps" and "whoami" and nothing happened.

Before I run more tests (every time I reboot the NAS, I need to shut down a bunch of other systems first), can you please let me know exactly what I should be looking for?
1. I can attach a serial console to the VM, but when I tried it, the log file was empty. How can I enable FreeNAS to write boot logs to the serial console?
2. Like I said, system responds to CTRL+T, but it only shows 1 line with the current uptime, and nothing else.

Are you able to try reproducing by updating from 11.1 to 11.2?

#11 Updated by Alexander Motin almost 2 years ago

Alessandro Segala wrote:

2. Like I said, system responds to CTRL+T, but it only shows 1 line with the current uptime, and nothing else.

Please, show me that line(s). There should be name of the running process and state it is in. It may tell us what is going on there.

Are you able to try reproducing by updating from 11.1 to 11.2?

Our QA department did it many times without problems. If we could reproduce it -- we would already fix it.

#12 Updated by Alessandro Segala almost 2 years ago

39456

Here's the result of CTRL+T.

After pressing CTRL+T, I pressed CTRL+C, and saw a bunch of Python errors but couldn't catch them (they disappeared too fast). I tried enabling the serial console for FreeNAS, but it didn't work and the serial log file is still empty.

#13 Updated by Alexander Motin almost 2 years ago

  • Category changed from OS to Middleware
  • Status changed from Blocked to Unscreened
  • Assignee changed from Alexander Motin to William Grzybowski
  • Reason for Blocked deleted (Need additional information from Author)

William, could you please take a look on this? It seems like some python is waiting for something, and I have 3 different tickets about boot "hanging" after root mount. While in other cases I could blame bad USB sticks, here is VM, which hopefully should not be that bad.

Alessandro, how long have you tried to wait at that state? No further progress after several minutes?

#14 Updated by Alessandro Segala almost 2 years ago

Yes, the OS disk for FreeNAS is on a virtual ESXi disk, and it's backed by SSDs.

I tried waiting a couple of minutes or so.

After pressing CTRL+C in the test before, the VM rebooted.

#15 Updated by Alessandro Segala almost 2 years ago

I have an update here: it magically started working again.

What has changed:

1. Even though I had no jails running, I deleted the "iocage" dataset that for some reason was there.
2. I had attached another VHD (ESXi Virtual Hard Disk) before. Since it was not used, I detached that ZFS volume and removed it from the FreeNAS configuration.

Not sure what changed, but it's working now. To recap, these were the disks, and what has changed:

1. The root OS disk is a VHD served by ESXi
2. There is a host-bus adapter that connects 4 physical disks directly to ESXi: 3 HDDs, in a single zraid1 pool, and 1 SSD, in a zpool with no mirroring.
3. There used to be another VHD for storing data. I removed this.

Hope this helps you debugging the issue, assuming it's connected to storage.

#16 Updated by William Grzybowski almost 2 years ago

Can you attach Save Debug (System - Advanced)? We may able to get something useful for when you were experiencing the problem.

#17 Updated by Alessandro Segala almost 2 years ago

  • File dump.txt added

Attached the fndebug/system/dump.txt file from the dump, which includes some Python error messages too. If you want other files, please let me know which ones and how I can send them to you privately, as I don't want to attach them to a public bug report

#18 Updated by William Grzybowski almost 2 years ago

Alessandro Segala wrote:

Attached the fndebug/system/dump.txt file from the dump, which includes some Python error messages too. If you want other files, please let me know which ones and how I can send them to you privately, as I don't want to attach them to a public bug report

This bug report is marked as private, nobody else can see it. I dont know what I am looking for, so its hard to know what to ask : )

#19 Updated by Alessandro Segala almost 2 years ago

  • File ixdiagnose.zip added

Sounds good. It's attached now

#20 Updated by William Grzybowski almost 2 years ago

  • File deleted (ixdiagnose.zip)

#21 Updated by William Grzybowski almost 2 years ago

Thanks, I have deleted it now.

2018/11/06 22:09:39 - Did you still had the problem during that time? Or was it after? I am fearing the var log may have got cleared after.

#22 Updated by Bug Clerk almost 2 years ago

  • Status changed from Unscreened to In Progress

#23 Updated by Alessandro Segala almost 2 years ago

I can't remember exactly what time I "fixed" it, but I haven't had issues after that. Ironically, the first successful bootup was the first one where the ESXi serial console was working too, but those logs would not show any issue :(

#24 Updated by Alessandro Segala almost 2 years ago

Another thing that I noticed: when it was not working, the bootloader was GRUB. Once it started working, the bootloader changed to the FreeBSD one. Not sure if related.

#25 Updated by William Grzybowski almost 2 years ago

Could it be that it was working but nothing was being shown in the screen?

#26 Updated by Alessandro Segala almost 2 years ago

Definitely not, the web console was not responding either.

#27 Updated by William Grzybowski almost 2 years ago

Since we cannot reproduce it and there isnt any log to help fix the issue I am not sure what we can do here.

Only thing I can think of is to repurpose this ticket to implement a feature to show progress of middlewared process starting.

Are you OK with that?

#28 Updated by Alessandro Segala almost 2 years ago

I understand if you need to close the ticket. Sorry I couldn't provide more information. Have you tried adding other virtual hard disks using ESXi? And is it expected that the bootloader doesn't switch automatically when updating? (And not sure what caused it to switch eventually, maybe I deleted some old boot environments)

Personally, I'd say that a progress bar wouldn't help too much, as unless there are issues I don't follow the boot process (the server is headless).

#29 Updated by Alexander Motin almost 2 years ago

Alessandro Segala wrote:

And is it expected that the bootloader doesn't switch automatically when updating?

After Web update boot loader is updated on first boot. After that there should be no GRUB any more.

#30 Updated by William Grzybowski almost 2 years ago

We use ESXi, bare metal, KVM and bhyve on a daily basis and several different upgrade scenarios, we have never faced such issue before.
bootloader is supposed to be switched automatically but I dont immediately see how that would have caused any issues.

Also whatever was stuck in the boot process should have eventually moved on since there is a time out (how long did you wait?)

#31 Updated by Alessandro Segala almost 2 years ago

After Web update boot loader is updated on first boot. After that there should be no GRUB any more.

So THAT didn't happen to me. I still had GRUB until I removed some of the old boot environments ("Default", "InitialInstall" (can't remember the exact name) and some old 11.1-x environments). Maybe that's the root of my issues?

Also whatever was stuck in the boot process should have eventually moved on since there is a time out (how long did you wait?)

Can't remember exactly, but something like 2-5 minutes.

#32 Updated by Alexander Motin almost 2 years ago

Alessandro Segala wrote:

So THAT didn't happen to me. I still had GRUB until I removed some of the old boot environments ("Default", "InitialInstall" (can't remember the exact name) and some old 11.1-x environments). Maybe that's the root of my issues?

I guess boot just haven't got to stage where boot loader is changed. But any way it should not matter. New FreeNAS may not update GRUB configuration to add new boot environments or loader tunables, but once kernel booted, it should not matter.

#33 Updated by Jeff Ervin almost 2 years ago

39582
39588

Asked for by William. In ESXi created a VM with FreeNAS 11.1-U6, upgraded to 11.2 RC1. Resulted in a hang at 4:00 minutes, and a middleware failed message and a traceback once it was done hanging.

#34 Updated by Dru Lavigne almost 2 years ago

  • Target version changed from Backlog to 11.2-RC2

#35 Updated by William Grzybowski almost 2 years ago

  • Severity changed from New to Medium

#36 Updated by Alexander Motin almost 2 years ago

  • Has duplicate Bug #53460: bootup hangs with "mounting root" added

#37 Updated by Alexander Motin almost 2 years ago

  • Has duplicate Bug #52687: Upgrade to 11.2 beta shows error message on console spew module_register_init: MOD_LOAD (vmm, 0xffffffff82a2c470, 0 - Hangs added

#38 Updated by Bug Clerk almost 2 years ago

  • Status changed from In Progress to Ready for Testing

#39 Updated by Bug Clerk almost 2 years ago

  • Target version changed from 11.2-RC2 to 11.3

#40 Updated by Bug Clerk almost 2 years ago

  • Copied to Bug #56526: Write boot loading progress to console added

#41 Updated by Dru Lavigne almost 2 years ago

  • File deleted (dump.txt)

#42 Updated by Dru Lavigne almost 2 years ago

  • Subject changed from FreeNAs 11.2 does not boot on ESXi to Write boot loading progress to console
  • Needs Doc changed from Yes to No
  • Needs Merging changed from Yes to No

#43 Updated by Dru Lavigne almost 2 years ago

  • Private changed from Yes to No

#44 Updated by Dru Lavigne almost 2 years ago

  • Status changed from Ready for Testing to Done
  • Target version changed from 11.3 to Master - FreeNAS Nightlies
  • Needs QA changed from Yes to No

#46 Updated by Joerg Dorchain almost 2 years ago

William Grzybowski wrote:

Since we cannot reproduce it and there isnt any log to help fix the issue I am not sure what we can do here.

Only thing I can think of is to repurpose this ticket to implement a feature to show progress of middlewared process starting.

I had the similar issue when upgrading from previous stable:
First boot hang at the MOD_LOAD (vmm, ... line for >5 minutes, then it all of a sudden continued.
Subsequent boots a fine.

Also available in: Atom PDF