Bug #5919

Emulex driver version 10.0.664.0 causes Freenas to hang, update driver to 10.0.747.0

Added by Francois Herbert about 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: No priority
Assignee: Josh Paetzel
Category: -
Target version:
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

I have 2 x HP NC552SFP 10Gb Emulex cards (http://www.emulex.com/products/ethernet-networking-storage-connectivity/ethernet-networking-adapters/hp-branded/nc552sfp/specifications/).
My FreeNAS box runs fine for a while but then loses all network connectivity. The following appears in /var/log/messages:

Aug 27 11:24:00 FREENAS kernel: oce0: UE: PMEM bit set
Aug 27 11:24:00 FREENAS kernel: oce0: UE: TXPB bit set
Aug 27 11:52:01 FREENAS autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs snapshot -o freenas:state=NEW Root/dataseta_dev@auto-20140827.1152-1w
Aug 27 11:52:02 FREENAS autorepl.py: [common.pipesubr:58] Popen()ing: /usr/bin/ssh -c arcfour256,arcfour128,blowfish-cbc,aes128-ctr,aes192-ctr,aes256-ctr -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 remotefreenas "zfs list -Hr -o name -t snapshot -d 1 Root/dataseta_dev | tail -n 1 | cut -d@ -f2" 
Aug 27 11:52:22 FREENAS autorepl.py: [tools.autorepl:284] Can not locate Root/dataseta_dev@auto-20140827.1052-1w on remote system, starting from there
Aug 27 11:52:22 FREENAS autorepl.py: [common.pipesubr:72] Executing: /sbin/zfs set freenas:state=NEW Root/dataseta_dev@auto-20140827.1052-1w
Aug 27 11:52:22 FREENAS autorepl.py: [common.pipesubr:58] Popen()ing: /usr/bin/ssh -c arcfour256,arcfour128,blowfish-cbc,aes128-ctr,aes192-ctr,aes256-ctr -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 remotefreenas "zfs list -Hr -o name -t snapshot -d 1 Root/dataseta_dev | tail -n 1 | cut -d@ -f2" 
Aug 27 11:52:23 FREENAS autorepl.py: [tools.autorepl:380] Replication of Root/dataseta_dev@auto-20140827.1052-1w failed with ssh: Could not resolve hostname remotefreenas: hostname nor servname provided, or not known
Aug 27 11:53:02 FREENAS autorepl.py: [common.pipesubr:58] Popen()ing: /usr/bin/ssh -c arcfour256,arcfour128,blowfish-cbc,aes128-ctr,aes192-ctr,aes256-ctr -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 remotefreenas "zfs list -Hr -o name -t snapshot -d 1 Root/dataseta_dev | tail -n 1 | cut -d@ -f2" 
Aug 27 11:53:02 FREENAS autorepl.py: [tools.autorepl:380] Replication of Root/dataseta_dev@auto-20140827.1052-1w failed with ssh: Could not resolve hostname remotefreenas: hostname nor servname provided, or not known

So it looks like the machine continued to 'run', or at least kept trying to replicate the ZFS snapshots, but it could not resolve hostnames because it had no network connectivity.
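For anyone wanting to check whether their own box is hitting the same errors, here is a minimal sketch that counts the oce "UE: ... bit set" kernel messages per interface. The function name `oce_ue_summary` is made up for illustration, and the pattern is based only on the two log lines quoted above, so other message variants may not match.

```sh
#!/bin/sh
# Summarise oce "UE: <NAME> bit set" kernel messages per interface.
# Usage: oce_ue_summary [logfile]   (defaults to /var/log/messages)
# Assumption: the message format matches the two lines quoted in this report.
oce_ue_summary() {
    logfile="${1:-/var/log/messages}"
    # Keep only lines such as: "kernel: oce0: UE: PMEM bit set"
    grep -E 'kernel: oce[0-9]+: UE: [A-Z0-9_]+ bit set' "$logfile" |
        # Reduce each line to "<interface> <bit name>"
        sed -E 's/.*kernel: (oce[0-9]+): UE: ([A-Z0-9_]+) bit set.*/\1 \2/' |
        # Count occurrences and print "<interface> <bit name> <count>"
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

Running `oce_ue_summary /var/log/messages` prints one line per interface and bit name with its count. No output means no matching messages were found in that file.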

I believe this bug may be related to version 10.0.664.0 of the oce driver, as noted in this bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391

Since this is a show stopper it would be great if the oce driver could be updated to 10.0.747.0 (http://www.emulex.com/downloads/emulex/drivers/freebsd/freebsd-91/drivers/).

I've checked the release notes for 9.2.1.6 and 9.2.1.7 but can't see anything relating to the oce driver being updated.

From dmesg:

oce0: <Emulex CNA NIC function:10.0.664.0> mem 0xf7ef0000-0xf7ef3fff,0xf7ec0000-0xf7edffff,0xf7ea0000-0xf7ebffff irq 26 at device 0.0 on pci7
oce1: <Emulex CNA NIC function:10.0.664.0> mem 0xf7e90000-0xf7e93fff,0xf7e60000-0xf7e7ffff,0xf7e40000-0xf7e5ffff irq 28 at device 0.1 on pci7
oce2: <Emulex CNA NIC function:10.0.664.0> mem 0xf7df0000-0xf7df3fff,0xf7dc0000-0xf7ddffff,0xf7da0000-0xf7dbffff irq 40 at device 0.0 on pci4
oce3: <Emulex CNA NIC function:10.0.664.0> mem 0xf7d90000-0xf7d93fff,0xf7d60000-0xf7d7ffff,0xf7d40000-0xf7d5ffff irq 44 at device 0.1 on pci4

Emulex card firmware version
sysctl dev.oce.0.firmware_version
dev.oce.0.firmware_version: 4.9.311.20
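To see quickly which driver version each oce interface is running, the dmesg probe lines can be reduced to "interface version" pairs. This is only a sketch: `oce_driver_versions` is a hypothetical helper name, and the pattern assumes the probe line looks like the ones quoted above (it tolerates stray slashes before the version number).

```sh
#!/bin/sh
# Print "<interface> <driver version>" for each oce probe line read on stdin.
# Assumption: probe lines look like
#   oce0: <Emulex CNA NIC function:10.0.664.0> mem ...
oce_driver_versions() {
    # -n with the trailing "p" prints only the lines where the substitution matched
    sed -nE 's#^(oce[0-9]+): <Emulex CNA NIC function:/*([0-9]+(\.[0-9]+)+).*#\1 \2#p'
}
```

Typical use would be `dmesg | oce_driver_versions`. Any interface reporting 10.0.664.0 is running the version suspected in this report.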


Related issues

Copied to FreeNAS - Bug #6028: Emulex driver version 10.0.664.0 causes Freenas to hang, update driver to 10.0.747.0 (Resolved, 2014-08-26)

History

#1 Updated by Francois Herbert about 6 years ago

Rebooting the machine is the only way I can get the network/server working properly again.

#2 Updated by Josh Paetzel about 6 years ago

  • Category set to 18
  • Status changed from Unscreened to Screened
  • Assignee set to Josh Paetzel
  • Priority changed from Important to No priority
  • Target version set to 49

This driver isn't in FreeBSD yet. I'll reach out to Emulex and find out why.

#3 Updated by Josh Paetzel about 6 years ago

  • Status changed from Screened to Fix In Progress

Emulex contacts have gone radio silent. Running a make universe now to get these into FreeBSD head. Will evaluate from there.

#4 Updated by Francois Herbert about 6 years ago

I've got a patch file if it helps, tested on my system ok.

#5 Updated by Francois Herbert about 6 years ago

To get it to compile cleanly on FreeBSD 9.2 I had to patch oce_if.c and ifdef out the call to drbr_stats_update() (line 1283) as this function is only defined if FreeBSD version is 10 or later.

#6 Updated by Josh Paetzel about 6 years ago

Finally heard back from contacts at Emulex. They have patches to upstream into FreeBSD, so we'll wait for that to happen.

#7 Updated by Mark Brookfield about 6 years ago

I'm experiencing the same issue.

I'm a massive fan of FreeNAS but not as au fait with it as I am with Linux. How would I go about applying the patch?

Many thanks

#8 Updated by Francois Herbert about 6 years ago

Hi Mark

I've had to build FreeNAS from source to apply this patch. I created a FreeBSD 9.2 VM in VirtualBox on my machine for this purpose.
Follow instructions at https://github.com/freenas/freenas

We currently run the 9.2.1 branch so change into the freenas git directory (freenasgitdir) and checkout the 9.2.1-BRANCH branch. (git checkout 9.2.1-BRANCH)

After running 'make git-external' and 'make checkout', apply the patch. You can safely ignore any whitespace errors when applying patches.

cd freenasgitdir/FreeBSD/src
apply oce driver patch:
git am --signoff < 0001-Patched-OCE-driver-to-version-10.0.747.0-for-FreeBSD.patch

Then run 'make release'
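The steps above can be put together roughly like this. It is a sketch rather than a tested build recipe: the `run` and `build_patched_freenas` names are made up, it assumes 'make release' is run from the top of the FreeNAS checkout, and it passes the patch file to git am as an argument instead of redirecting it on stdin. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```sh
#!/bin/sh
# Print a command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_patched_freenas <freenasgitdir> <absolute path to patch file>
build_patched_freenas() {
    srcdir="$1"   # the FreeNAS git checkout ("freenasgitdir" above)
    patch="$2"    # the oce driver patch; use an absolute path

    run cd "$srcdir" || return 1
    run git checkout 9.2.1-BRANCH || return 1
    run make git-external || return 1
    run make checkout || return 1
    # The patch is applied inside the FreeBSD source tree fetched by the steps above
    run cd "$srcdir/FreeBSD/src" || return 1
    run git am --signoff "$patch" || return 1
    # Assumption: the release build is started from the top of the checkout
    run cd "$srcdir" || return 1
    run make release
}
```

For example, `build_patched_freenas /path/to/freenasgitdir /path/to/0001-Patched-OCE-driver-to-version-10.0.747.0-for-FreeBSD.patch` prints the sequence so it can be reviewed before running it for real.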

I have a build which you can download if you want with the updated OCE driver compiled in - but it also has the fusionio (iomemory-vsl.ko) kernel module loading. Up to you if you want to run this or not. I'm pretty sure I'm allowed to share it with you (unless someone can advise me otherwise?)
Let me know if you want to try it and I'll put it up for download - I only have a 64bit version built.

Cheers
Francois

#9 Updated by Josh Paetzel about 6 years ago

  • Description updated (diff)
  • Target version changed from 49 to 9.2.1.8-RELEASE

I think it makes sense to roll this into 9.2.1.8. Waiting for Emulex to integrate code they've already put on their website is silly.

#10 Updated by Francois Herbert about 6 years ago

Excellent - thanks Josh, it was a show stopper for us - it made our FreeNAS boxes completely unreliable (through no fault of FreeNAS). As I mentioned previously, I did have to patch the drivers downloaded from the Emulex web site as they wouldn't compile cleanly on FreeBSD 9.2.

#11 Updated by Josh Paetzel about 6 years ago

  • Status changed from Fix In Progress to Resolved

#12 Updated by Josh Paetzel about 6 years ago

  • Copied to Bug #6028: Emulex driver version 10.0.664.0 causes Freenas to hang, update driver to 10.0.747.0 added
