Bug #27674

Fix race condition in aac(4) driver

Added by Michael ANDRE over 1 year ago. Updated about 1 year ago.

Status:
Done
Priority:
Important
Assignee:
Alexander Motin
Category:
OS
Target version:
Seen in:
Severity:
New
Reason for Closing:
Reason for Blocked:
Needs QA:
No
Needs Doc:
No
Needs Merging:
No
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:
ChangeLog Required:
No

Description

Hello,
I have tried many times to update my FreeNAS from 11.0-U4 to 11.1, but I get this error every time...
The only way to recover my system is to select an old version in GRUB and delete the 11.1 entry.
Any idea on how to recover this?
Thanks in advance for your support


Related issues

Has duplicate FreeNAS - Bug #27804: 11.1-RELEASE failed to work with aac | Closed: Duplicate | 2018-01-15 | 2018-02-02

Associated revisions

Revision 898fe034 (diff)
Added by Alexander Motin over 1 year ago

MFC r323317 (by scottl): Move the intrhook release to later in the function

so that GEOM knows to wait longer for possible root devices to come online.
This fixes a race that seems to be triggered by EARLY_AP_STARTUP.

Submitted by:

(cherry picked from commit 90e70e83b4584f500f60a0d1eaed01882e194326)

Ticket: #27674

History

#1 Updated by Michael ANDRE over 1 year ago

  • File debug-freenas-20180108073334.txz added

#3 Updated by Sean Fagan over 1 year ago

  • Status changed from Unscreened to 15
  • Assignee changed from Release Council to Sean Fagan

This looks like a GUI update that went wrong. You should be able to fix it by:

First, boot to the previous installation, which looks like "default". Then, from the command line, do: "beadm activate default ; beadm destroy -F 11.1-RELEASE ; freenas-update check && freenas-update --reboot update"

That should get you there.
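
The recovery sequence above can be sketched as a small script. This is only an illustration, assuming the boot environment names from this ticket ("default" and "11.1-RELEASE"). The DRY_RUN switch and the run and recover helpers are hypothetical names added here, not part of beadm or freenas-update. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the recovery steps quoted in this ticket.
# DRY_RUN, GOOD_BE, BAD_BE, run and recover are illustrative names (assumptions).
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands only, anything else = execute them
GOOD_BE="${GOOD_BE:-default}"    # boot environment that still boots
BAD_BE="${BAD_BE:-11.1-RELEASE}" # boot environment left by the failed update

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover() {
    # Make the known-good boot environment the default again.
    run beadm activate "$GOOD_BE" || return 1
    # Remove the broken boot environment.
    run beadm destroy -F "$BAD_BE" || return 1
    # Re-run the update only if the check step succeeds.
    run freenas-update check && run freenas-update --reboot update
}

recover
```

Unlike the one-line form, which chains the first two commands with ";", this sketch stops at the first beadm failure. Run it with DRY_RUN=0 only after confirming the boot environment names with "beadm list".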

#4 Updated by Michael ANDRE over 1 year ago

Sean Fagan wrote:

This looks like a GUI update that went wrong. You should be able to fix it by:

First, boot to the previous installation, which looks like "default". Then, from the command line, do: "beadm activate default ; beadm destroy -F 11.1-RELEASE ; freenas-update check && freenas-update --reboot update"

That should get you there.

Thanks for your quick answer.
I had tried that before, suspecting the web GUI was at fault in this upgrade, with no result.
I retried it once more and got the same error.
Is there anything else to try or test?

#5 Updated by Sean Fagan over 1 year ago

Did you run the commands I said? If so, what was the output?

#6 Updated by Michael ANDRE over 1 year ago

Sean Fagan wrote:

Did you run the commands I said? If so, what was the output?

I ran them directly on my server, and the output looked good.
I ran them again over SSH to be able to show you the output:

root@freenas:~ # beadm activate default
GRUB configuration updated successfully
Activated successfully
root@freenas:~ # beadm destroy -F 11.1-RELEASE
GRUB configuration updated successfully
Destroyed successfully
root@freenas:~ # freenas-update check && freenas-update --reboot update
Status: Downloading: FreeNASUI Progress:100 Size: 24132 Rate: 2478699 B/s
Total Progress: [########################################] 100.00%
Upgrade package base-os 11.0-U4-4ee20c34fd84cd863cc7642519a68e5f->11.1-RELEASE-8815a498ef028e9ba97ca6e1e9e75c74
Upgrade package docs 11.0-U4-4ee20c34fd84cd863cc7642519a68e5f->11.1-RELEASE-8815a498ef028e9ba97ca6e1e9e75c74
Upgrade package freebsd-pkgdb 11.0-U4-4ee20c34fd84cd863cc7642519a68e5f->11.1-RELEASE-8815a498ef028e9ba97ca6e1e9e75c74
Upgrade package freenas-pkg-tools 11.0-U4-4ee20c34fd84cd863cc7642519a68e5f->11.1-RELEASE-8815a498ef028e9ba97ca6e1e9e75c74
Upgrade package FreeNASUI 11.0-U4-4ee20c34fd84cd863cc7642519a68e5f->11.1-RELEASE-8815a498ef028e9ba97ca6e1e9e75c74
Reboot is (conditionally) required
Sequence 4ee20c34fd84cd863cc7642519a68e5f -> 8815a498ef028e9ba97ca6e1e9e75c74
Status: Installing FreeNASUI
Total Progress: [########################################] 100.00%
Shutdown NOW!
shutdown: [pid 22824]
  • FINAL System shutdown message from ***
    System going down IMMEDIATELY
    System shutdown time has arrived

#7 Updated by Sean Fagan over 1 year ago

After that, when you reboot, it doesn't boot?

Okay, can you boot to a previous version again, and attach the file /boot/grub/grub.cfg please?

#8 Updated by Michael ANDRE over 1 year ago

Sean Fagan wrote:

After that, when you reboot, it doesn't boot?

Okay, can you boot to a previous version again, and attach the file /boot/grub/grub.cfg please?

In attachment :)

#9 Updated by Sean Fagan over 1 year ago

  • Assignee changed from Sean Fagan to Andrew Walker

I can't see any reason for it to not work. There are some differences between the two entries, but nothing jarring.

The error during boot is the kernel saying it cannot find the boot filesystem, but if it can find freenas-boot/ROOT/default I can't see any reason it couldn't find freenas-boot/ROOT/11.1-RELEASE.

Oh. Hm. Your boot pool is device aacd0p2; I wonder if 11.1 lost the driver for aac, which could explain it. Alexander, do you know?

#10 Updated by Sean Fagan over 1 year ago

Hm, still seems to be there -- at least, aac_ioctl is in the kernel symbol table.
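
A check along these lines can be sketched as follows. This is a minimal illustration, not the exact command used in this ticket: has_symbol is a hypothetical helper, the sample line is made-up data in nm output format, and the kernel path /boot/kernel/kernel is an assumption about the installed system.

```shell
#!/bin/sh
# has_symbol (hypothetical helper): reads nm-style output on stdin and
# succeeds if the last field of any line is exactly the given symbol name.
has_symbol() {
    awk -v sym="$1" '$NF == sym { found = 1 } END { exit !found }'
}

# Demonstration with a canned nm-style line (sample data, not from this system).
printf '%s\n' 'ffffffff80301230 t aac_ioctl' | has_symbol aac_ioctl \
    && echo "aac_ioctl present"

# On a real system, assuming the kernel image lives at /boot/kernel/kernel:
#   nm /boot/kernel/kernel | has_symbol aac_ioctl && echo "aac_ioctl present"
```

A symbol being present only shows that the driver code is compiled in. It does not show that the controller attaches or that its disks appear in time for the root mount.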

#11 Updated by Dru Lavigne over 1 year ago

  • Assignee changed from Andrew Walker to Alexander Motin
  • Seen in changed from Unspecified to 11.1

#12 Updated by Dru Lavigne over 1 year ago

  • Status changed from 15 to Unscreened
  • Target version set to 11.1-U2

#13 Updated by Alexander Motin over 1 year ago

  • Category changed from 1 to 129
  • Status changed from Unscreened to 15
  • Priority changed from No priority to Nice to have

I agree with Sean that the problem is likely related to using Adaptec RAID for both boot and data in this system. We do not officially support such configurations with hardware RAID controllers, and so do not have hardware to reproduce this, so our ability to help may be limited. Judging by the pass7 device seen in the screenshot, I guess the RAID card itself is detected and at least partially functional, so the problem is not just a missing driver. More likely there is something wrong with how the logical volumes are reported, or with the timing of when that happens, so that the system can't find the boot volume there.

You'd have to find some way to get more information out of the system. If the keyboard is already working at that point, you may try pressing Scroll Lock and scrolling the history back up to see whether any aacdX disks were reported, as they were on 11.0. You may also type "?" there to see the list of available disk devices, where you should see a set of aacdX disks and their partitions.

An alternative would be to install another copy of FreeNAS 11.1 on some other media, like a USB stick, and generate debug information from it, so we could compare it to the one from 11.0.

#14 Updated by Dru Lavigne over 1 year ago

  • Target version changed from 11.1-U2 to 11.3

#15 Updated by Alexander Motin over 1 year ago

  • Subject changed from Update From 11.0-U4 to 11.1 Failure to aac(4) driver failure on 11.1

#16 Updated by Alexander Motin over 1 year ago

  • Has duplicate Bug #27804: 11.1-RELEASE failed to work with aac added

#17 Updated by Michael ANDRE over 1 year ago

I am trying to install a fresh 11.1 to test my hardware directly on this version.
I will report the result as soon as possible.
Regards

#18 Updated by Michael ANDRE over 1 year ago

Back with some news:
I tried installing a fresh 11.1 on new drives in the same configuration, with only 2 drives in a mirror on the Adaptec card.
Everything works fine, even after multiple reboots.
But when I add 2 more drives (a mirror with some data), the root pool doesn't mount, just like when I was trying to upgrade.
In attachment you will find the result of the "?" command at the mountroot prompt and the only error I found scrolling up through the boot messages.
This error is present in both cases: when working (only the freenas-boot mirror present) and when not working (freenas-boot + ZFS RAID volume).
Is there something to try, or should I try another hardware configuration?

#19 Updated by Dru Lavigne over 1 year ago

  • Status changed from 15 to Investigation

#20 Updated by Alexander Motin over 1 year ago

Unfortunately, a couple of screenshots don't give enough input. Could you install FreeBSD on some USB stick, boot from it, and grab a debug as I proposed?

#21 Updated by Michael ANDRE over 1 year ago

I agree with that. However, I am new to this kind of investigation... Could you please detail the steps I have to follow? Like how to retrieve the debug information, etc...

#22 Updated by Alexander Motin over 1 year ago

Michael ANDRE wrote:

Like how to retrieve the debug information, etc...

System -> Advanced -> Save Debug

#23 Updated by Alexander Motin over 1 year ago

  • Status changed from Investigation to Fix In Progress
  • Priority changed from Nice to have to Important
  • Target version changed from 11.3 to 11.1-U2
  • Needs QA changed from Yes to No

I think I understand the problem, and I think I found a solution for it in FreeBSD head, which for some reason was not merged to stable/11. I am merging that patch now; it should be available in nightly builds shortly, and in the 11.1-U2 update when it comes out.

#24 Updated by Alexander Motin over 1 year ago

  • Status changed from Fix In Progress to Ready For Release

The supposed fix is committed to nightly and merged to 11.1-stable. Please try updating to tomorrow's nightly to confirm that it works now, since we have no hardware to test it on.

#25 Updated by Dru Lavigne over 1 year ago

  • File deleted (debug-freenas-20180108073334.txz)

#26 Updated by Dru Lavigne over 1 year ago

  • Subject changed from aac(4) driver failure on 11.1 to Fix race condition in aac(4) driver
  • Private changed from Yes to No
  • Needs Doc changed from Yes to No
  • Needs Merging changed from Yes to No

#27 Updated by Michael ANDRE over 1 year ago

  • File debug-freenas-with storage.tgz added
  • File debug-freenas-without storage.tgz added

Alexander Motin wrote:

The supposed fix is committed to nightly and merged to 11.1-stable. Please try updating to tomorrow's nightly to confirm that it works now, since we have no hardware to test it on.

I will try it tomorrow, thank you.
I have attached the debug files in case they are still useful...

#28 Updated by Michael ANDRE over 1 year ago

I have installed 11-MASTER-201801200415.
Everything seems to work now with my hardware configuration.
Could you confirm that the fix will be included in 11.1-U2? As going from nightlies to stable isn't supported, I ran the test on a mirror of disks dedicated to testing.
Thanks for your work and your answer.
Regards

#29 Updated by Michael ANDRE over 1 year ago

  • File deleted (debug-freenas-without storage.tgz)

#30 Updated by Michael ANDRE over 1 year ago

  • File deleted (debug-freenas-with storage.tgz)

#31 Updated by Alexander Motin over 1 year ago

Michael ANDRE wrote:

I have installed 11-MASTER-201801200415.
Everything seems to work now with my hardware configuration.

Thank you!

Could you confirm that the fix will be included in 11.1-U2?

Yes. As I said above, the fix is merged to 11.1-stable, so it will be part of 11.1-U2 when it is released.

#32 Updated by Dru Lavigne about 1 year ago

  • Status changed from Ready For Release to Done

#33 Updated by Ben Gadd about 1 year ago

  • Due date set to 02/12/2018

Due date updated to reflect the code freeze for 11.1-U2.

#34 Updated by Ben Gadd about 1 year ago

  • Severity set to New