Feature #38978

Add support for NVMe device replacement

Added by Alexander Motin about 1 year ago. Updated 6 months ago.

Status: Ready for Testing
Priority: No priority
Assignee: Alexander Motin
Category: Hardware
Target version:
Estimated time:
Severity: Med High
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: No
Needs Merging: No
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:

Description

NVMe needs support for both hot-plug and hot-unplug.

For hot-plug there are two potential issues: a) PCI has to report that something happened, and b) enough resources must already be reserved for the new device to allocate from (which may be difficult).

For unplug, a clean teardown is obviously needed, and one of the problems is that the device no longer responds to accesses, since it is no longer there.

Associated revisions

Revision 2c2d6d99
Added by Alexander Motin 6 months ago

MFC r342557, r342559: Reimplement nvd(4) detach handling.

Previous code typically crashed in case of NVMe device unplug or even clean
detach while some I/Os are still in flight. To fix this the new code calls
disk_gone() and waits for confirmation of all references gone before calling
disk_destroy(), freeing other resources and allowing controller detach.

While there, fix disk lists locking and reimplement unit numbers assignment.

Ticket: #38978

(cherry picked from commit bfb1323a075dc3535d422570b54cca62c5a54ffb)

History

#1 Updated by Alexander Motin about 1 year ago

  • Description updated
  • Status changed from Unscreened to Screened

#2 Updated by Alexander Motin 6 months ago

  • Status changed from Screened to In Progress
  • Target version changed from Backlog to 11.2-U3
  • Parent task deleted (#31596)

I've made a few fixes there, including r343447, which is already in the 11.3-stable branch. Unfortunately, the resource allocation problem on plug-in is complicated and may still require a reboot, but a number of the other problems should be handled now.

#3 Updated by Alexander Motin 6 months ago

Just for the record, FreeBSD head has finally enabled PCI BAR reallocation (https://svnweb.freebsd.org/changeset/base/344022), which, if it works well, may be a step towards PCI resource reservation.

#4 Updated by Alexander Motin 6 months ago

  • Status changed from In Progress to Ready for Testing

I've merged to 11.2-stable a change that should allow hot NVMe device replacement, even under load, at least when the device is disabled with `devctl disable nvmeX` before removal.

QE: A minimal test, doable on any NVMe hardware, is to run `devctl disable nvmeX`/`devctl enable nvmeX` under load (make sure not to upset ZFS by removing a critical or only vdev). A maximal test would include real hot NVMe device replacement on the M50 platform (with an explicit `devctl disable` first). I haven't tested that since these changes, but there is a chance it works now, since the resources freed by the removed device should be enough for the inserted one.
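
The minimal test could be scripted roughly as follows. This is only a sketch: the function name `cycle_nvme`, the `DEVCTL` override, the cycle count, and the pause length are assumptions made for illustration, not part of the change itself. Only `devctl disable` and `devctl enable` come from the procedure above, and the load generator has to be started separately.

```shell
#!/bin/sh
# Sketch: repeatedly disable and re-enable an NVMe controller with devctl(8)
# while I/O is running against its namespaces.
# Assumptions: controller name, cycle count and pause are placeholders.

# Command used to drive the device; set DEVCTL="echo devctl" for a dry run.
DEVCTL="${DEVCTL:-devctl}"

# cycle_nvme CONTROLLER COUNT PAUSE
# Disable and re-enable CONTROLLER COUNT times, sleeping PAUSE seconds after
# each step so that in-flight I/O gets a chance to exercise the detach path.
# Stops at the first devctl failure and returns non-zero.
cycle_nvme() {
    ctrl=$1
    count=$2
    pause=$3
    i=1
    while [ "$i" -le "$count" ]; do
        $DEVCTL disable "$ctrl" || return 1
        sleep "$pause"
        $DEVCTL enable "$ctrl" || return 1
        sleep "$pause"
        i=$((i + 1))
    done
}
```

For example, with some read load running against a disk behind the controller, `cycle_nvme nvme1 10 5` would do ten disable/enable cycles with five-second pauses (`nvme1` being a hypothetical non-critical device). The load is expected to see I/O errors each time the disk goes away; the thing to watch for is a panic or a hung detach.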
