Bug #67575

mlx4_en driver kernel panic on shutdown/reset

Added by Craig Sacco almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: No priority
Assignee: Alexander Motin
Category: OS
Target version:
Severity: New
Reason for Closing: Third Party to Resolve
Reason for Blocked:
Needs QA: No
Needs Doc: No
Needs Merging: No
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

I have a Mellanox MT25418 IB card running in Ethernet mode, and whenever I restart or shut down my FreeNAS instance the following kernel fault pops up:

---------------------------------------------------
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 10
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d4d9b4
stack pointer = 0x20:0xfffffe01bd9be520
frame pointer = 0x20:0xfffffe01bd9be540
code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 1 (init)
[ thread pid 1 tid 100002 ]
stopped at mlx4_en_put_qp+0x74: movq (%rcx),%rcx
db:0:kdb.enter.default> write cn_mute 1
cn_mute 0 = 0x1
---------------------------------------------------

The OS then resets (even when a shutdown is requested).

This bug is reproducible nearly 100% of the time, so I can provide more information if required.
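
In case it helps with triage, the key fields of the trap output above can be pulled out mechanically. This is only a minimal sketch, assuming the console text has been captured into a string; `parse_trap` is a hypothetical helper name, and treating a fault address inside the first 4 KB page as a likely NULL pointer dereference is a heuristic assumption, not a diagnosis of the driver.

```python
import re

# Lines copied from the trap output in this report.
TRAP_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 10
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d4d9b4
current process = 1 (init)
stopped at mlx4_en_put_qp+0x74: movq (%rcx),%rcx
"""


def parse_trap(text):
    """Extract a few triage fields from a trap dump (hypothetical helper)."""
    fields = {}

    m = re.search(r"^Fatal trap (\d+): (.+)$", text, re.M)
    if m:
        fields["trap"] = int(m.group(1))
        fields["reason"] = m.group(2).strip()

    m = re.search(r"^fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text, re.M)
    if m:
        fields["fault_address"] = int(m.group(1), 16)

    # "stopped at <symbol>+<offset>: <instruction>"
    m = re.search(r"^stopped at\s+(\w+)\+(0x[0-9a-fA-F]+):\s*(.+)$", text, re.M)
    if m:
        fields["symbol"] = m.group(1)
        fields["offset"] = int(m.group(2), 16)
        fields["instruction"] = m.group(3).strip()

    # Heuristic (assumption): a fault inside the first page usually means
    # a NULL pointer, or a small offset from one, was dereferenced.
    addr = fields.get("fault_address")
    fields["likely_null_deref"] = addr is not None and 0 <= addr < 4096
    return fields


summary = parse_trap(TRAP_TEXT)
print(summary)
```

For this dump the sketch reports the faulting symbol as `mlx4_en_put_qp` at offset 0x74 with a fault address of 0, which is consistent with the `movq (%rcx),%rcx` instruction reading through a register that held zero.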

The following tunables are set:

  • mlx4en_load = YES (loader)
  • sys.device.mlx4_core0.mlx4_port1 = eth (sysctl)
  • sys.device.mlx4_core0.mlx4_port2 = eth (sysctl)

IP is set up on both IB ports with the MTU set to 9000.

Here is my hardware/software configuration:

===================================================
lspci -v
--------------------------------------------------

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Complex
Subsystem: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Complex
Flags: bus master, 66MHz, medium devsel, latency 0

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Richland [Radeon HD 8470D] (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device d000
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at c0000000 (32-bit, prefetchable)
I/O ports at f000
Memory at feb00000 (32-bit, non-prefetchable)
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Trinity HDMI Audio Controller (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device d000
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at c0000000 (32-bit, prefetchable)
I/O ports at f000
Memory at feb00000 (32-bit, non-prefetchable)
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010

00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Port (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 18
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: None
Memory behind bridge: fea00000-feafffff [size=1M]
Prefetchable memory behind bridge: 00000000d0000000-00000000d07fffff [size=8M]
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Port (Slot+), MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [b0] Subsystem: Advanced Micro Devices, Inc. [AMD] Trinity A-series APU
Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010

00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Port (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 16
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: 0000e000-0000efff [size=4K]
Memory behind bridge: fe900000-fe9fffff [size=1M]
Prefetchable memory behind bridge: 00000000d0800000-00000000d08fffff [size=1M]
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Port (Slot+), MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [b0] Subsystem: Advanced Micro Devices, Inc. [AMD] Trinity A-series APU
Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010

00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 09) (prog-if 30 [XHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Flags: bus master, fast devsel, latency 0, IRQ 18
Memory at feb46000 (64-bit, non-prefetchable)
Capabilities: [50] Power Management version 3
Capabilities: [70] MSI: Enable+ Count=1/8 Maskable- 64bit+
Capabilities: [90] MSI-X: Enable- Count=8 Masked-
Capabilities: [a0] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [100] Latency Tolerance Reporting

00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 09) (prog-if 30 [XHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Flags: bus master, fast devsel, latency 0, IRQ 18
Memory at feb46000 (64-bit, non-prefetchable)
Capabilities: [50] Power Management version 3
Capabilities: [70] MSI: Enable+ Count=1/8 Maskable- 64bit+
Capabilities: [90] MSI-X: Enable- Count=8 Masked-
Capabilities: [a0] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [100] Latency Tolerance Reporting

00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 40) (prog-if 01 [AHCI 1.0])
Subsystem: Gigabyte Technology Co., Ltd Device b002
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 19
I/O ports at f140
I/O ports at f130
I/O ports at f120
I/O ports at f110
I/O ports at f100
Memory at feb4d000 (32-bit, non-prefetchable)
Capabilities: [50] MSI: Enable+ Count=8/8 Maskable- 64bit+
Capabilities: [70] SATA HBA v1.0

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
Memory at feb4c000 (32-bit, non-prefetchable)

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
Memory at feb4c000 (32-bit, non-prefetchable)

00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
Memory at feb4a000 (32-bit, non-prefetchable)

00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
Memory at feb4a000 (32-bit, non-prefetchable)

00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 16)
Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
Flags: 66MHz, medium devsel

00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 16)
Subsystem: Advanced Micro Devices, Inc. [AMD] Device 780b
Flags: 66MHz, medium devsel

00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH PCI Bridge (rev 16)
Subsystem: Advanced Micro Devices, Inc. [AMD] Device 780b
Flags: 66MHz, medium devsel

00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 16)
Subsystem: Advanced Micro Devices, Inc. [AMD] Device 780b
Flags: 66MHz, medium devsel

00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 0
Flags: fast devsel

00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 1
Flags: fast devsel

00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 2
Flags: fast devsel

00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 3
Flags: fast devsel

00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 4
Flags: fast devsel

00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 5
Flags: fast devsel

01:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
Subsystem: Mellanox Technologies Device 0007
Flags: bus master, fast devsel, latency 0, IRQ 18
Memory at fea00000 (64-bit, non-prefetchable)
Memory at d0000000 (64-bit, prefetchable)
Capabilities: [40] Power Management version 3
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [100] Alternative Routing-ID Interpretation (ARI)

02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
Subsystem: Gigabyte Technology Co., Ltd Onboard Ethernet
Flags: bus master, fast devsel, latency 0, IRQ 16
I/O ports at e000
Memory at fe900000 (64-bit, non-prefetchable)
Memory at d0800000 (64-bit, prefetchable)
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 01
Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Virtual Channel
Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
Capabilities: [170] Latency Tolerance Reporting

===================================================
pciconf -lvc
--------------------------------------------------

hostb0@pci0:0:0:0: class=0x060000 card=0x14101022 chip=0x14101022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Root Complex'
class = bridge
subclass = HOST-PCI
vgapci0@pci0:0:1:0: class=0x030000 card=0xd0001458 chip=0x99961002 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
device = 'Richland [Radeon HD 8470D]'
class = display
subclass = VGA
cap 0150 = powerspec 3 supports D0 D1 D2 D3 current D0
cap 1058 = PCI-Express 2 root endpoint max data 128(128) RO NS
cap 05[a0] = MSI supports 1 message, 64 bit
ecap 000b100 = Vendor 1 ID 1
none0@pci0:0:1:1: class=0x040300 card=0xa0021458 chip=0x99021002 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
device = 'Trinity HDMI Audio Controller'
class = multimedia
subclass = HDA
cap 0150 = powerspec 3 supports D0 D1 D2 D3 current D0
cap 1058 = PCI-Express 2 root endpoint max data 128(128) RO NS
cap 05[a0] = MSI supports 1 message, 64 bit
ecap 000b100 = Vendor 1 ID 1
pcib1@pci0:0:2:0: class=0x060400 card=0x12341022 chip=0x14121022 rev=0x00 hdr=0x01
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Root Port'
class = bridge
subclass = PCI-PCI
cap 0150 = powerspec 3 supports D0 D3 current D0
cap 1058 = PCI-Express 2 root port max data 256(256) NS
link x8(x16) speed 2.5(5.0) ASPM disabled(L0s/L1)
slot 0 power limit 0 mW
cap 05[a0] = MSI supports 1 message, 64 bit
cap 0d[b0] = PCI Bridge card=0x12341022
cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000
ecap 000b100 = Vendor 1 ID 1
pcib2@pci0:0:4:0: class=0x060400 card=0x12341022 chip=0x14141022 rev=0x00 hdr=0x01
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Root Port'
class = bridge
subclass = PCI-PCI
cap 0150 = powerspec 3 supports D0 D3 current D0
cap 1058 = PCI-Express 2 root port max data 128(256) NS
link x1(x1) speed 2.5(5.0) ASPM disabled(L0s/L1)
slot 0 power limit 0 mW
cap 05[a0] = MSI supports 1 message, 64 bit
cap 0d[b0] = PCI Bridge card=0x12341022
cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000
ecap 000b100 = Vendor 1 ID 1
xhci0@pci0:0:16:0: class=0x0c0330 card=0x50041458 chip=0x78141022 rev=0x09 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB XHCI Controller'
class = serial bus
subclass = USB
cap 0150 = powerspec 3 supports D0 D3 current D0
cap 0570 = MSI supports 8 messages, 64 bit enabled with 1 message
cap 1190 = MSI-X supports 8 messages
Table in map 0x10[0x1000], PBA in map 0x10[0x1080]
cap 10[a0] = PCI-Express 2 root endpoint max data 128(128) NS
ecap 0018100 = LTR 1
xhci1@pci0:0:16:1: class=0x0c0330 card=0x50041458 chip=0x78141022 rev=0x09 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB XHCI Controller'
class = serial bus
subclass = USB
cap 0150 = powerspec 3 supports D0 D3 current D0
cap 0570 = MSI supports 8 messages, 64 bit enabled with 1 message
cap 1190 = MSI-X supports 8 messages
Table in map 0x10[0x1000], PBA in map 0x10[0x1080]
cap 10[a0] = PCI-Express 2 root endpoint max data 128(128) NS
ahci0@pci0:0:17:0: class=0x010601 card=0xb0021458 chip=0x78011022 rev=0x40 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH SATA Controller [AHCI mode]'
class = mass storage
subclass = SATA
cap 0550 = MSI supports 8 messages, 64 bit enabled with 8 messages
cap 1270 = SATA Index-Data Pair
ohci0@pci0:0:18:0: class=0x0c0310 card=0x50041458 chip=0x78071022 rev=0x11 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB OHCI Controller'
class = serial bus
subclass = USB
ehci0@pci0:0:18:2: class=0x0c0320 card=0x50041458 chip=0x78081022 rev=0x11 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB EHCI Controller'
class = serial bus
subclass = USB
cap 01[c0] = powerspec 2 supports D0 D1 D2 D3 current D0
cap 0a[e4] = EHCI Debug Port at offset 0xe0 in map 0x14
ohci1@pci0:0:19:0: class=0x0c0310 card=0x50041458 chip=0x78071022 rev=0x11 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB OHCI Controller'
class = serial bus
subclass = USB
ehci1@pci0:0:19:2: class=0x0c0320 card=0x50041458 chip=0x78081022 rev=0x11 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB EHCI Controller'
class = serial bus
subclass = USB
cap 01[c0] = powerspec 2 supports D0 D1 D2 D3 current D0
cap 0a[e4] = EHCI Debug Port at offset 0xe0 in map 0x14
none1@pci0:0:20:0: class=0x0c0500 card=0x780b1022 chip=0x780b1022 rev=0x16 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH SMBus Controller'
class = serial bus
subclass = SMBus
isab0@pci0:0:20:3: class=0x060100 card=0x780e1022 chip=0x780e1022 rev=0x11 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH LPC Bridge'
class = bridge
subclass = PCI-ISA
pcib3@pci0:0:20:4: class=0x060401 card=0x00000000 chip=0x780f1022 rev=0x40 hdr=0x01
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH PCI Bridge'
class = bridge
subclass = PCI-PCI
ohci2@pci0:0:20:5: class=0x0c0310 card=0x50041458 chip=0x78091022 rev=0x11 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH USB OHCI Controller'
class = serial bus
subclass = USB
hostb1@pci0:0:24:0: class=0x060000 card=0x00000000 chip=0x14001022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Function 0'
class = bridge
subclass = HOST-PCI
hostb2@pci0:0:24:1: class=0x060000 card=0x00000000 chip=0x14011022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Function 1'
class = bridge
subclass = HOST-PCI
hostb3@pci0:0:24:2: class=0x060000 card=0x00000000 chip=0x14021022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Function 2'
class = bridge
subclass = HOST-PCI
hostb4@pci0:0:24:3: class=0x060000 card=0x00000000 chip=0x14031022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Function 3'
class = bridge
subclass = HOST-PCI
cap 0f[f0] = unknown
hostb5@pci0:0:24:4: class=0x060000 card=0x00000000 chip=0x14041022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Function 4'
class = bridge
subclass = HOST-PCI
hostb6@pci0:0:24:5: class=0x060000 card=0x00000000 chip=0x14051022 rev=0x00 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Family 15h (Models 10h-1fh) Processor Function 5'
class = bridge
subclass = HOST-PCI
mlx4_core0@pci0:1:0:0: class=0x0c0600 card=0x000715b3 chip=0x634a15b3 rev=0xa0 hdr=0x00
vendor = 'Mellanox Technologies'
device = 'MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE]'
class = serial bus
cap 0140 = powerspec 3 supports D0 D3 current D0
cap 0348 = VPD
cap 11[9c] = MSI-X supports 128 messages, enabled
Table in map 0x10[0x7c000], PBA in map 0x10[0x7d000]
cap 1060 = PCI-Express 2 endpoint max data 256(256) FLR
link x8(x8) speed 2.5(2.5) ASPM disabled(L0s)
ecap 000e100 = ARI 1
re0@pci0:2:0:0: class=0x020000 card=0xe0001458 chip=0x816810ec rev=0x0c hdr=0x00
vendor = 'Realtek Semiconductor Co., Ltd.'
device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
class = network
subclass = ethernet
cap 0140 = powerspec 3 supports D0 D1 D2 D3 current D0
cap 0550 = MSI supports 1 message, 64 bit
cap 1070 = PCI-Express 2 endpoint MSI 1 max data 128(128)
link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
cap 11[b0] = MSI-X supports 4 messages, enabled
Table in map 0x20[0x0], PBA in map 0x20[0x800]
cap 03[d0] = VPD
ecap 0001100 = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 0002140 = VC 1 max VC0
ecap 0003160 = Serial 1 01000000684ce000
ecap 0018170 = LTR 1

===================================================
cpuid
---------------------------------------------------
eax in eax ebx ecx edx
00000000 0000000d 68747541 444d4163 69746e65
00000001 00610f31 00020800 3e98320b 178bfbff
00000002 00000000 00000000 00000000 00000000
00000003 00000000 00000000 00000000 00000000
00000004 00000000 00000000 00000000 00000000
00000005 00000040 00000040 00000003 00000000
00000006 00000000 00000000 00000001 00000000
00000007 00000000 00000008 00000000 00000000
00000008 00000000 00000000 00000000 00000000
00000009 00000000 00000000 00000000 00000000
0000000a 00000000 00000000 00000000 00000000
0000000b 00000000 00000000 00000000 00000000
0000000c 00000000 00000000 00000000 00000000
0000000d 00000007 000003c0 000003c0 40000000
80000000 8000001e 68747541 444d4163 69746e65
80000001 00610f31 20000000 01ebbfff 2fd3fbff
80000002 20444d41 372d3441 20303033 20555041
80000003 68746977 64615220 206e6f65 47204448
80000004 68706172 20736369 20202020 00202020
80000005 ff40ff18 ff40ff30 10040140 40020140
80000006 64006400 64004200 04008140 00000000
80000007 00000000 00000000 00000000 000007d9
80000008 00003030 00000000 00004001 00000000
80000009 00000000 00000000 00000000 00000000
8000000a 00000001 00010000 00000000 00001cff
8000000b 00000000 00000000 00000000 00000000
8000000c 00000000 00000000 00000000 00000000
8000000d 00000000 00000000 00000000 00000000
8000000e 00000000 00000000 00000000 00000000
8000000f 00000000 00000000 00000000 00000000
80000010 00000000 00000000 00000000 00000000
80000011 00000000 00000000 00000000 00000000
80000012 00000000 00000000 00000000 00000000
80000013 00000000 00000000 00000000 00000000
80000014 00000000 00000000 00000000 00000000
80000015 00000000 00000000 00000000 00000000
80000016 00000000 00000000 00000000 00000000
80000017 00000000 00000000 00000000 00000000
80000018 00000000 00000000 00000000 00000000
80000019 f040f018 64006400 00000000 00000000
8000001a 00000003 00000000 00000000 00000000
8000001b 000000ff 00000000 00000000 00000000
8000001c 00000001 80032013 00010200 8000000f
8000001d 00000000 00000000 00000000 00000000
8000001e 00000010 00000100 00000000 00000000

Vendor ID: "AuthenticAMD"; CPUID level 13

AMD-specific functions
Version 00610f31:
Family: 15 Model: 3 []

Standard feature flags 178bfbff:
FPU Floating Point Unit
VME Virtual 8086 Mode Enhancements
DE Debugging Extensions
PSE Page Size Extensions
TSC Time Stamp Counter
MSR Model Specific Registers
PAE Physical Address Extension
MCE Machine Check Exception
CX8 COMPXCHG8B Instruction
APIC On-chip Advanced Programmable Interrupt Controller present and enabled
SEP Fast System Call
MTRR Memory Type Range Registers
PGE PTE Global Flag
MCA Machine Check Architecture
CMOV Conditional Move and Compare Instructions
PAT Page Attribute Table
PSE36 36-bit Page Size Extension
CLFSH CLFLUSH instruction
MMX MMX instruction set
FXSR Fast FP/MMX Streaming SIMD Extensions save/restore
SSE SSE extensions
SSE2 SSE2 extensions
HTT Hyper-Threading Technology
Generation: 15 Model: 3
Extended feature flags 2fd3fbff:
FPU Floating Point Unit
VME Virtual 8086 Mode Enhancements
DE Debugging Extensions
PSE Page Size Extensions
TSC Time Stamp Counter
MSR Model Specific Registers
PAE Physical Address Extension
MCE Machine Check Exception
CX8 COMPXCHG8B Instruction
APIC On-chip Advanced Programmable Interrupt Controller present and enabled
SEP Fast System Call
MTRR Memory Type Range Registers
PGE PTE Global Flag
MCA Machine Check Architecture
CMOV Conditional Move and Compare Instructions
PAT Page Attribute Table
PSE36 36-bit Page Size Extension
NX No-execute page protection
MmxExt MMX instruction extensions
MMX MMX instructions
FXSR Fast FP/MMX Streaming SIMD Extensions save/restore
FFXSR FXSAVE and FXRSTOR instruction optimizations
Pge1GB 1GB Page Support
RDTSCP RDTSCP instruction
LM 64 bit long mode

Extended Miscellaneous feature flags 01ebbfff:
LhfSaf LAHF and SAHF instructions in 64-bit mode
CmpLeg Core Multi-Processing mode
SVM Secure Virtual Machine
XAPSPC Extended APIC Register Space
AltMC8 LOCK MOV CR0 means MOV CR8
ABM Advanced Bit Manipulation
SSE4A EXTRQ, INSERTQ, MOVNTSS, and MOVNTSD support
MASSE Misaligned SSE mode
3DNPFC PREFETCH and PREFETCHW support
OSVW OS Visible Workaround support
10 Reserved
11 Reserved
SKINIT SKINIT, STGI, and DEV support
WDT Watchdog Timer support
14 Reserved
16 Reserved
17 Reserved
18 Reserved
20 Reserved
22 Reserved
23 Reserved
24 Reserved
25 Reserved

Processor name string: AMD A4-7300 APU with Radeon HD Graphics
L1 Cache Information:
2/4-MB Pages:
Data TLB: associativity 255-way #entries 64
Instruction TLB: associativity 255-way #entries 24
4-KB Pages:
Data TLB: associativity 255-way #entries 64
Instruction TLB: associativity 255-way #entries 48
L1 Data cache:
size 16 KB associativity 4-way lines per tag 1 line size 64
L1 Instruction cache:
size 64 KB associativity 2-way lines per tag 1 line size 64

L2 Cache Information:
2/4-MB Pages:
Data TLB: associativity 4-way #entries 0
Instruction TLB: associativity 4-way #entries 0
4-KB Pages:
Data TLB: associativity 4-way #entries 0
Instruction TLB: associativity 2-way #entries 0
size 4 KB associativity L2 off lines per tag 129 line size 64

Advanced Power Management Feature Flags
Has temperature sensing diode
Maximum linear address: 48; maximum phys address 48
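
For anyone cross-checking the raw register table in the cpuid output above, the vendor string and the family/model numbers can be decoded by hand. The following is a minimal sketch, assuming the usual CPUID layout (vendor ID in EBX, EDX, ECX order for leaf 0; base and extended family/model fields in leaf 1 EAX). The function names are hypothetical, and the combination rule used is the one that applies when the base family field is 0xF.

```python
import struct

# Register values copied from the cpuid table in this report.
LEAF0 = {"ebx": 0x68747541, "ecx": 0x444D4163, "edx": 0x69746E65}
LEAF1_EAX = 0x00610F31


def vendor_string(ebx, ecx, edx):
    """Vendor ID is the bytes of EBX, EDX, ECX, each little-endian."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def family_model(eax):
    """Return (family, model, stepping) from CPUID leaf 1 EAX."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    if base_family == 0xF:
        # Extended fields only contribute when the base family is 0xF.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family, model = base_family, base_model
    return family, model, stepping


vendor = vendor_string(LEAF0["ebx"], LEAF0["ecx"], LEAF0["edx"])
family, model, stepping = family_model(LEAF1_EAX)
print(vendor, hex(family), hex(model), stepping)
```

Under these assumptions the combined values come out as family 0x15 and model 0x13, which agrees with the "Family 15h (Models 10h-1fh)" strings in the PCI listings. The "Family: 15 Model: 3" line printed by the tool appears to show only the base fields.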

===================================================
dmidecode
---------------------------------------------------
# dmidecode 3.1
Scanning /dev/mem for entry point.
SMBIOS 2.7 present.
52 structures occupying 2422 bytes.
Table at 0x000EBEE0.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
Vendor: American Megatrends Inc.
Version: F8
Release Date: 04/09/2015
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 8192 kB
Characteristics:
PCI is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
BIOS ROM is socketed
EDD is supported
5.25"/1.2 MB floppy services are supported (int 13h)
3.5"/720 kB floppy services are supported (int 13h)
3.5"/2.88 MB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
8042 keyboard services are supported (int 9h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
ACPI is supported
USB legacy is supported
BIOS boot specification is supported
Targeted content distribution is supported
UEFI is supported
BIOS Revision: 4.6

Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: To be filled by O.E.M.
Version: To be filled by O.E.M.
Serial Number: To be filled by O.E.M.
UUID: 038D0240-045C-0542-5606-740700080009
Wake-up Type: Power Switch
SKU Number: To be filled by O.E.M.
Family: To be filled by O.E.M.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: F2A88XM-D3H
Version: x.x
Serial Number: To be filled by O.E.M.
Asset Tag: To be filled by O.E.M.
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: To be filled by O.E.M.
Chassis Handle: 0x0003
Type: Motherboard
Contained Object Handles: 0

Handle 0x0003, DMI type 3, 22 bytes
Chassis Information
Manufacturer: Gigabyte Technology Co., Ltd.
Type: Desktop
Lock: Not Present
Version: To Be Filled By O.E.M.
Serial Number: To Be Filled By O.E.M.
Asset Tag: To Be Filled By O.E.M.
Boot-up State: Safe
Power Supply State: Safe
Thermal State: Safe
Security Status: None
OEM Information: 0x00000000
Height: Unspecified
Number Of Power Cords: 1
Contained Elements: 0
SKU Number: To be filled by O.E.M.

Handle 0x0004, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J1A1
Internal Connector Type: None
External Reference Designator: PS2Mouse
External Connector Type: PS/2
Port Type: Mouse Port

Handle 0x0005, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J1A1
Internal Connector Type: None
External Reference Designator: Keyboard
External Connector Type: PS/2
Port Type: Keyboard Port

Handle 0x0006, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2A1
Internal Connector Type: None
External Reference Designator: TV Out
External Connector Type: Mini Centronics Type-14
Port Type: Other

Handle 0x0007, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2A2A
Internal Connector Type: None
External Reference Designator: COM A
External Connector Type: DB-9 male
Port Type: Serial Port 16550A Compatible

Handle 0x0008, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2A2B
Internal Connector Type: None
External Reference Designator: Video
External Connector Type: DB-15 female
Port Type: Video Port

Handle 0x0009, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J3A1
Internal Connector Type: None
External Reference Designator: USB1
External Connector Type: Access Bus (USB)
Port Type: USB

Handle 0x000A, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J3A1
Internal Connector Type: None
External Reference Designator: USB2
External Connector Type: Access Bus (USB)
Port Type: USB

Handle 0x000B, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J3A1
Internal Connector Type: None
External Reference Designator: USB3
External Connector Type: Access Bus (USB)
Port Type: USB

Handle 0x000C, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9A1 - TPM HDR
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x000D, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9C1 - PCIE DOCKING CONN
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x000E, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2B3 - CPU FAN
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x000F, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J6C2 - EXT HDMI
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0010, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J3C1 - GMCH FAN
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0011, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J1D1 - ITP
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0012, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9E2 - MDC INTPSR
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0013, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9E4 - MDC INTPSR
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0014, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9E3 - LPC HOT DOCKING
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0015, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9E1 - SCAN MATRIX
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0016, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J9G1 - LPC SIDE BAND
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0017, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J8F1 - UNIFIED
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0018, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J6F1 - LVDS
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x0019, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2F1 - LAI FAN
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x001A, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2G1 - GFX VID
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x001B, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J1G6 - AC JACK
Internal Connector Type: Other
External Reference Designator: Not Specified
External Connector Type: None
Port Type: Other

Handle 0x001C, DMI type 9, 17 bytes
System Slot Information
Designation: J6B2
Type: x16 PCI Express
Current Usage: In Use
Length: Long
ID: 0
Characteristics:
3.3 V is provided
Opening is shared
PME signal is supported
Bus Address: 0000:00:02.0

Handle 0x001D, DMI type 9, 17 bytes
System Slot Information
Designation: J6B1
Type: x1 PCI Express
Current Usage: In Use
Length: Short
ID: 1
Characteristics:
3.3 V is provided
Opening is shared
PME signal is supported
Bus Address: 0000:00:1c.0

Handle 0x001E, DMI type 9, 17 bytes
System Slot Information
Designation: J6D1
Type: x8 PCI Express
Current Usage: In Use
Length: Short
ID: 2
Characteristics:
3.3 V is provided
Opening is shared
PME signal is supported
Bus Address: 0000:00:01.0

Handle 0x001F, DMI type 9, 17 bytes
System Slot Information
Designation: J7B1
Type: x16 PCI Express
Current Usage: In Use
Length: Short
ID: 3
Characteristics:
3.3 V is provided
Opening is shared
PME signal is supported
Bus Address: 0000:00:03.0

Handle 0x0020, DMI type 10, 6 bytes
On Board Device Information
Type: Video
Status: Enabled
Description: To Be Filled By O.E.M.

Handle 0x0021, DMI type 11, 5 bytes
OEM Strings
String 1: To Be Filled By O.E.M.
String 2: To Be Filled By O.E.M.
String 3: To Be Filled By O.E.M.
String 4: To Be Filled By O.E.M.
String 5: To Be Filled By O.E.M.

Handle 0x0022, DMI type 12, 5 bytes
System Configuration Options
Option 1: To Be Filled By O.E.M.

Handle 0x0023, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: None
Maximum Capacity: 16 GB
Error Information Handle: Not Provided
Number Of Devices: 4

Handle 0x0024, DMI type 19, 31 bytes
Memory Array Mapped Address
Starting Address: 0x00000000000
Ending Address: 0x001FFFFFFFF
Range Size: 8 GB
Physical Array Handle: 0x0023
Partition Width: 255

Handle 0x0025, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0023
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor: DIMM
Set: None
Locator: Node0_Dimm0
Bank Locator: Node0_Bank0
Type: DDR3
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 1333 MT/s
Manufacturer: Kingston
Serial Number: 771EF60C
Asset Tag: Dimm0_AssetTag
Part Number: 9905471-001.A
Rank: 2
Configured Clock Speed: 1333 MT/s

Handle 0x0026, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x00000000000
Ending Address: 0x0007FFFFFFF
Range Size: 2 GB
Physical Device Handle: 0x0025
Memory Array Mapped Address Handle: 0x0024
Partition Row Position: 1

Handle 0x0027, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0023
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor: DIMM
Set: None
Locator: Node0_Dimm1
Bank Locator: Node0_Bank0
Type: DDR3
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 1333 MT/s
Manufacturer: Corsair
Serial Number: 00000000
Asset Tag: Dimm1_AssetTag
Part Number: CMV4GX3M2A133
Rank: 1
Configured Clock Speed: 1333 MT/s

Handle 0x0028, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x00080000000
Ending Address: 0x000FFFFFFFF
Range Size: 2 GB
Physical Device Handle: 0x0027
Memory Array Mapped Address Handle: 0x0024
Partition Row Position: 1

Handle 0x0029, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0023
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor: DIMM
Set: None
Locator: Node0_Dimm2
Bank Locator: Node0_Bank0
Type: DDR3
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 1333 MT/s
Manufacturer: Kingston
Serial Number: 781EF40C
Asset Tag: Dimm2_AssetTag
Part Number: 9905471-001.A
Rank: 2
Configured Clock Speed: 1333 MT/s

Handle 0x002A, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x00100000000
Ending Address: 0x0017FFFFFFF
Range Size: 2 GB
Physical Device Handle: 0x0029
Memory Array Mapped Address Handle: 0x0024
Partition Row Position: 1

Handle 0x002B, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0023
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor: DIMM
Set: None
Locator: Node0_Dimm3
Bank Locator: Node0_Bank0
Type: DDR3
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 1333 MT/s
Manufacturer: Corsair
Serial Number: 00000000
Asset Tag: Dimm3_AssetTag
Part Number: CMV4GX3M2A133
Rank: 1
Configured Clock Speed: 1333 MT/s

Handle 0x002C, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x00180000000
Ending Address: 0x001FFFFFFFF
Range Size: 2 GB
Physical Device Handle: 0x002B
Memory Array Mapped Address Handle: 0x0024
Partition Row Position: 1

Handle 0x002D, DMI type 32, 20 bytes
System Boot Information
Status: No errors detected

Handle 0x002E, DMI type 41, 11 bytes
Onboard Device
Reference Designation: Onboard LAN
Type: Ethernet
Status: Enabled
Type Instance: 1
Bus Address: 0000:00:19.0

Handle 0x002F, DMI type 7, 19 bytes
Cache Information
Socket Designation: L1 CACHE
Configuration: Enabled, Not Socketed, Level 1
Operational Mode: Write Back
Location: Internal
Installed Size: 96 kB
Maximum Size: 96 kB
Supported SRAM Types:
Pipeline Burst
Installed SRAM Type: Pipeline Burst
Speed: 1 ns
Error Correction Type: Multi-bit ECC
System Type: Unified
Associativity: 2-way Set-associative

Handle 0x0030, DMI type 7, 19 bytes
Cache Information
Socket Designation: L2 CACHE
Configuration: Enabled, Not Socketed, Level 2
Operational Mode: Write Back
Location: Internal
Installed Size: 1024 kB
Maximum Size: 1024 kB
Supported SRAM Types:
Pipeline Burst
Installed SRAM Type: Pipeline Burst
Speed: 1 ns
Error Correction Type: Multi-bit ECC
System Type: Unified
Associativity: 16-way Set-associative

Handle 0x0039, DMI type 4, 42 bytes
Processor Information
Socket Designation: P0
Type: Central Processor
Family: A-Series
Manufacturer: AuthenticAMD
ID: 31 0F 61 00 FF FB 8B 17
Signature: Family 21, Model 19, Stepping 1
Flags:
FPU (Floating-point unit on-chip)
VME (Virtual mode extension)
DE (Debugging extension)
PSE (Page size extension)
TSC (Time stamp counter)
MSR (Model specific registers)
PAE (Physical address extension)
MCE (Machine check exception)
CX8 (CMPXCHG8 instruction supported)
APIC (On-chip APIC hardware supported)
SEP (Fast system call)
MTRR (Memory type range registers)
PGE (Page global enable)
MCA (Machine check architecture)
CMOV (Conditional move instruction supported)
PAT (Page attribute table)
PSE-36 (36-bit page size extension)
CLFSH (CLFLUSH instruction supported)
MMX (MMX technology supported)
FXSR (FXSAVE and FXSTOR instructions supported)
SSE (Streaming SIMD extensions)
SSE2 (Streaming SIMD extensions 2)
HTT (Multi-threading)
Version: AMD A4-7300 APU with Radeon HD Graphics
Voltage: 1.3 V
External Clock: 100 MHz
Max Speed: 3800 MHz
Current Speed: 3800 MHz
Status: Populated, Enabled
Upgrade: Socket FM2
L1 Cache Handle: 0x002F
L2 Cache Handle: 0x0030
L3 Cache Handle: Not Provided
Serial Number: Not Specified
Asset Tag: Not Specified
Part Number: Not Specified
Core Count: 2
Core Enabled: 2
Thread Count: 2
Characteristics:
64-bit capable

Handle 0x003A, DMI type 13, 22 bytes
BIOS Language Information
Language Description Format: Long
Installable Languages: 7
en|US|iso8859-1
de|DE|iso8859-1
ru|RU|iso8859-5
ko|KR|unicode
ja|JP|unicode
zh|CS|unicode
zh|CT|unicode
Currently Installed Language: en|US|iso8859-1

Handle 0x003E, DMI type 127, 4 bytes
End Of Table

History

#1 Updated by Dru Lavigne almost 3 years ago

  • Private changed from No to Yes
  • Reason for Blocked set to Need additional information from Author

Craig: please attach a debug (System -> Advanced -> Save debug) to this ticket.

#2 Updated by Craig Sacco almost 3 years ago

  • File mlx4_fault.mp4 added

That might be difficult, since the fault occurs just as the system is about to reboot, after the swap devices have been torn down, so a kernel textdump is not persisted.

Could you advise on how to save a textdump when the kernel is in this state? I tried to set up a USB key as a dump device (using "dumpon /dev/da1") prior to rebooting, but no dump was persisted.

If it helps, I've taken a video of FreeNAS shutting down and exhibiting the fault.

#3 Updated by Craig Sacco almost 3 years ago

  • File mlx4_serial_console_output.txt added

So I decided to take another stab at getting more information about this issue, managed to find a 25-year-old serial port bracket and a null modem cable, and enabled the serial port console through the GUI.

Dropped into a terminal and broke into KDB (using sysctl debug.kdb.enter=1) and applied the following commands:

script kdb.enter.trap=watchdog 38; capture on; bt; show allpcpu; ps; alltrace; reset
continue

Attached are all kernel/console messages captured through the serial port, as well as the backtrace and full trace of all tasks in the kernel.

#4 Updated by Dru Lavigne almost 3 years ago

  • Assignee changed from Release Council to Alexander Motin
  • Seen in changed from 11.2-U2 to 11.2-RELEASE-U1
  • Reason for Blocked deleted (Need additional information from Author)

#5 Updated by Alexander Motin almost 3 years ago

  • Status changed from Unscreened to Blocked
  • Reason for Blocked set to Waiting for feedback
  • Needs QA changed from Yes to No
  • Needs Doc changed from Yes to No
  • Needs Merging changed from Yes to No

The panic happened inside the Mellanox driver:

mlx4_core0: mlx4_shutdown was called
<3>mlx4_en: mlxen0: Invalid steering mode.
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 10
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80d4d9b4
stack pointer           = 0x28:0xfffffe02b81be520
frame pointer           = 0x28:0xfffffe02b81be540
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (init)
[ thread pid 1 tid 100002 ]
Stopped at      mlx4_en_put_qp+0x74:    movq    (%rcx),%rcx
db:0:kdb.enter.trap> bt
Tracing pid 1 tid 100002 td 0xfffff80004944620
mlx4_en_put_qp() at mlx4_en_put_qp+0x74/frame 0xfffffe02b81be540
mlx4_en_stop_port() at mlx4_en_stop_port+0x3d2/frame 0xfffffe02b81be5a0
mlx4_en_destroy_netdev() at mlx4_en_destroy_netdev+0x1e6/frame 0xfffffe02b81be5d0
mlx4_en_remove() at mlx4_en_remove+0xcd/frame 0xfffffe02b81be600
mlx4_remove_device() at mlx4_remove_device+0xb1/frame 0xfffffe02b81be640
mlx4_unregister_device() at mlx4_unregister_device+0x98/frame 0xfffffe02b81be670
mlx4_unload_one() at mlx4_unload_one+0x85/frame 0xfffffe02b81be6b0
mlx4_shutdown() at mlx4_shutdown+0x83/frame 0xfffffe02b81be6e0
linux_pci_shutdown() at linux_pci_shutdown+0x39/frame 0xfffffe02b81be700
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be720
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be740
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be760
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be780
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be7a0
acpi_shutdown() at acpi_shutdown+0xd/frame 0xfffffe02b81be7d0
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be7f0
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be810
root_bus_module_handler() at root_bus_module_handler+0x11e/frame 0xfffffe02b81be840
module_shutdown() at module_shutdown+0x6f/frame 0xfffffe02b81be860
kern_reboot() at kern_reboot+0x48a/frame 0xfffffe02b81be8b0
sys_reboot() at sys_reboot+0x447/frame 0xfffffe02b81be900
amd64_syscall() at amd64_syscall+0xa38/frame 0xfffffe02b81bea30
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe02b81bea30

I see that the driver got some updates recently in the FreeBSD stable/11 branch, so you may first try tomorrow's FreeNAS 11.3-nightly build, where I've just merged all the latest changes. If that helps, we may try to identify the specific change(s) to merge back to 11.2.

If that doesn't help, I would recommend reporting it upstream to FreeBSD, since Mellanox has its own FreeBSD committers, whereas we haven't modified the Mellanox driver and don't even have the hardware to test it.

#6 Updated by Craig Sacco almost 3 years ago

  • File mlx4_serial_console_output_11.3.txt added

Hi Alexander,

I downloaded and installed FreeNAS-11.3-MASTER-201901070910-33d7e3c.iso onto a fresh USB key and set up the conditions to make the IB adapter work in Ethernet mode. Even though the driver had been updated (from 3.4.1 to 3.5.0), the kernel still panics when mlx4_shutdown() is called:

mlx4_core0: mlx4_shutdown was called
<3>mlx4_en: mlxen0: Invalid steering mode.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 10
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80d69654
stack pointer           = 0x28:0xfffffe02b81be520
frame pointer           = 0x28:0xfffffe02b81be540
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (init)
[ thread pid 1 tid 100002 ]
Stopped at      mlx4_en_put_qp+0x74:    movq    (%rcx),%rcx
db:0:kdb.enter.trap>  bt
Tracing pid 1 tid 100002 td 0xfffff80004944620
mlx4_en_put_qp() at mlx4_en_put_qp+0x74/frame 0xfffffe02b81be540
mlx4_en_stop_port() at mlx4_en_stop_port+0x3d2/frame 0xfffffe02b81be5a0
mlx4_en_destroy_netdev() at mlx4_en_destroy_netdev+0x1e6/frame 0xfffffe02b81be5d0
mlx4_en_remove() at mlx4_en_remove+0xcd/frame 0xfffffe02b81be600
mlx4_remove_device() at mlx4_remove_device+0xb1/frame 0xfffffe02b81be640
mlx4_unregister_device() at mlx4_unregister_device+0x98/frame 0xfffffe02b81be670
mlx4_unload_one() at mlx4_unload_one+0x85/frame 0xfffffe02b81be6b0
mlx4_shutdown() at mlx4_shutdown+0x83/frame 0xfffffe02b81be6e0
linux_pci_shutdown() at linux_pci_shutdown+0x39/frame 0xfffffe02b81be700
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be720
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be740
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be760
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be780
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be7a0
acpi_shutdown() at acpi_shutdown+0xd/frame 0xfffffe02b81be7d0
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be7f0
bus_generic_shutdown() at bus_generic_shutdown+0x5a/frame 0xfffffe02b81be810
root_bus_module_handler() at root_bus_module_handler+0x11e/frame 0xfffffe02b81be840
module_shutdown() at module_shutdown+0x6f/frame 0xfffffe02b81be860
kern_reboot() at kern_reboot+0x48a/frame 0xfffffe02b81be8b0
sys_reboot() at sys_reboot+0x447/frame 0xfffffe02b81be900
amd64_syscall() at amd64_syscall+0xa38/frame 0xfffffe02b81bea30
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe02b81bea30
--- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40f14a, rsp = 0x7fffffffe778, rbp = 0x7fffffffe860 ---

As before, I've attached the complete kernel and KDB output from a serial port (overriding kdb.enter.trap in KDB as above).
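
The raw instruction pointer differs between the 11.2 and 11.3 reports, but the symbolic location (mlx4_en_put_qp+0x74) is the same in both. One way to confirm that two captured traces follow the same call path is to compare them by function and offset rather than by address. The following is only a minimal sketch, assuming the traces are saved as text in the backtrace format shown above; the names `FRAME_RE`, `parse_frames`, `TRACE_11_2` and `TRACE_11_3` are hypothetical, and only the first three frames of each trace are included.

```python
import re

# Matches backtrace lines of the form shown in this ticket, e.g.:
#   mlx4_en_put_qp() at mlx4_en_put_qp+0x74/frame 0xfffffe02b81be540
FRAME_RE = re.compile(
    r"^(\w+)\(\) at (\w+)\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)$"
)


def parse_frames(trace_text):
    """Return a list of (function, offset) tuples, innermost frame first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            # Keep the symbol and offset; drop the frame address, which
            # is not needed to compare call paths.
            frames.append((match.group(2), int(match.group(3), 16)))
    return frames


# First three frames of each trace, copied from the reports in this ticket.
TRACE_11_2 = """\
mlx4_en_put_qp() at mlx4_en_put_qp+0x74/frame 0xfffffe02b81be540
mlx4_en_stop_port() at mlx4_en_stop_port+0x3d2/frame 0xfffffe02b81be5a0
mlx4_en_destroy_netdev() at mlx4_en_destroy_netdev+0x1e6/frame 0xfffffe02b81be5d0
"""

TRACE_11_3 = """\
mlx4_en_put_qp() at mlx4_en_put_qp+0x74/frame 0xfffffe02b81be540
mlx4_en_stop_port() at mlx4_en_stop_port+0x3d2/frame 0xfffffe02b81be5a0
mlx4_en_destroy_netdev() at mlx4_en_destroy_netdev+0x1e6/frame 0xfffffe02b81be5d0
"""

frames_11_2 = parse_frames(TRACE_11_2)
frames_11_3 = parse_frames(TRACE_11_3)

# True when both traces list the same functions at the same offsets.
same_path = frames_11_2 == frames_11_3
print("Same call path:", same_path)
for name, offset in frames_11_2:
    print(f"  {name}+{offset:#x}")
```

Matching frames only show that both builds fault at the same point in the shutdown path. They say nothing about the root cause inside the driver.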

#7 Updated by Alexander Motin almost 3 years ago

  • Status changed from Blocked to Closed
  • Target version changed from Backlog to N/A
  • Reason for Closing set to Third Party to Resolve
  • Reason for Blocked deleted (Waiting for feedback)

OK, so if the latest driver version does not help, I don't think we have the resources to work on this. Please report it upstream to FreeBSD and/or Mellanox.

#8 Updated by Dru Lavigne almost 3 years ago

  • File deleted (mlx4_fault.mp4)

#9 Updated by Dru Lavigne almost 3 years ago

  • File deleted (mlx4_serial_console_output.txt)

#10 Updated by Dru Lavigne almost 3 years ago

  • File deleted (mlx4_serial_console_output_11.3.txt)

#11 Updated by Dru Lavigne almost 3 years ago

  • Private changed from Yes to No
