Bug #26695

Use 4k jumbo cluster pool for all jumbo frames until the allocator problem is fixed for the 9k jumbo cluster pool

Added by Caleb St. John over 1 year ago. Updated 11 months ago.

Status:
Done
Priority:
Critical
Assignee:
Ryan Moeller
Category:
OS
Target version:
11.1-U6
Seen in:
TrueNAS - 11.0-U4
Severity:
Medium
Needs QA:
No
Needs Doc:
No
Needs Merging:
No
Needs Automation:
No
Support Suite Ticket:
n/a
ChangeLog Required:
No

Description

Using a frame size of 9000 under "high" load can cause severe performance degradation.

netstat -m shows a large number of jumbo cluster allocation requests being denied:
...
0/308411/80 requests for jumbo clusters denied (4k/9k/16k)

This is a memory fragmentation problem, specific to jumbo mbuf clusters larger than 4k.
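
One way to watch for this condition is to poll the denial counters (a sketch; the grep pattern matches the netstat output line quoted above):

# Poll the counters; a steadily climbing middle number (9k) points to this
# fragmentation issue.
while true; do
    netstat -m | grep "requests for jumbo clusters denied"
    sleep 1
done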


Related issues

Related to FreeNAS - Bug #32694: System Reset (Closed, 2018-04-26)
Has duplicate FreeNAS - Bug #29070: iSCSI connection dropping (Closed)
Has duplicate FreeNAS - Bug #35140: Slow NFS Read (Closed)
Has duplicate FreeNAS - Bug #35656: Storage unresponsive and load average high (Closed)

Associated revisions

Revision 3e0853db (diff)
Added by Ryan Moeller 12 months ago

e1000: Don't use 9k jumbo clusters

The 9k jumbo cluster pool has fragmentation issues.
Use the 4k jumbo cluster pool for all jumbo frames until the allocator
problem is fixed.

Backported from 12-CURRENT.

Ticket: #26695

Revision 2074eddc (diff)
Added by Ryan Moeller 12 months ago

e1000: Don't use 9k jumbo clusters

The 9k jumbo cluster pool has fragmentation issues.
Use the 4k jumbo cluster pool for all jumbo frames until the allocator
problem is fixed.

Backported from 12-CURRENT.

Ticket: #26695

Revision bebecc37 (diff)
Added by Dru Lavigne 11 months ago

Mention support for 9k jumbo clusters.
Ticket: #26695

History

#1 Updated by Caleb St. John over 1 year ago

  • Project changed from TrueNAS to FreeNAS
  • Category changed from 162 to 129
  • Support Suite Ticket deleted (AEQ-766-36045)
  • Migration Needed deleted (No)
  • Support Department Priority deleted (0)

#2 Updated by Alexander Motin over 1 year ago

  • Status changed from Unscreened to Screened

It may need investigation with different NICs. I am quite suspicious that the Intel 1G NIC drivers may have problems handling denied jumbo cluster allocations. On the other hand, the Chelsio NIC drivers seem to have some fallbacks for that case, which should make those allocation denials less painful.

#3 Updated by Dru Lavigne over 1 year ago

  • Target version set to 11.2-BETA1

#4 Updated by Kris Moore over 1 year ago

  • Target version changed from 11.2-BETA1 to 11.3

#5 Updated by Caleb St. John over 1 year ago

Intel has officially released an errata document specifically about jumbo cluster problems. On page 10, they describe an issue with long bursts of jumbo streams at a 9k frame size. The only recommendation is to use an MTU of <= 8.5k.

https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf

The Chelsio cards still exhibit the same problems but I haven't been able to find any official documentation from them.
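
For reference, dropping the MTU below Intel's 8.5k threshold would look something like this (em0 is a placeholder for the affected interface; on FreeNAS the change should be made through the UI so it persists):

# One-off test from the shell; lasts only until reboot:
ifconfig em0 mtu 8192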

#6 Updated by Kris Moore over 1 year ago

  • Status changed from Screened to Not Started

#7 Updated by Alexander Motin over 1 year ago

  • Severity set to Medium

#8 Updated by Florian Bruckner over 1 year ago

FWIW - I believe I have been hit by this issue.

We have been using an MTU of 9000 for years in FreeNAS 9. After upgrading to FreeNAS 11 recently, we started to experience sluggish performance on a dedicated network link serving NFS to a XenServer host. After some digging around (first blaming network cables, switches, the lagg interface we had configured, and so on), we finally found we could resolve the performance issues by rebooting FreeNAS. After a couple of days, the issue would return, getting worse over time.

We could see ping times degrade (to > 1000ms) and correlate poor performance of our virtual disks with the times pings went up (usually during load induced by our VMs).

After I found this ticket, I also could see a growing number of "requests for jumbo clusters denied" during the times when ping times went up.

Switching to an MTU of 1500 immediately resolved these issues - though it remains to be seen whether it stays this way or we'll get the problem after some time again.

Long story short: we didn't see this with FreeNAS 9.10 for years, and the issue appeared a few days after the upgrade to 11, so I believe this to be a regression introduced by the upgrade (nothing else in our setup has changed other than the version).

#10 Updated by Ben Gadd about 1 year ago

  • Target version changed from 11.3 to Backlog

#11 Updated by Alexander Motin about 1 year ago

  • Has duplicate Bug #29070: iSCSI connection dropping added

#15 Updated by Alexander Motin 12 months ago

  • Has duplicate Bug #35656: Storage unresponsive and load average high added

#16 Updated by David Beitey 12 months ago

I've got what appears to be the same issue, but my test case isn't what you would call "high load".

I started using FreeNAS with 11.0, effectively just after it came out (i.e., June 2017), set the MTU to 9000, and left it like that. I'm now on FreeNAS-11.1-U5 and everything was working fine up until just recently. My motherboard is a Super Micro X11SAE-M and I'm currently using the Intel I219-LM Gigabit connection (em0) on the board.

In the last few months, I've been seeing my NAS's network connections 'freeze' when downloading content to the NAS. At first I thought it was specific applications, scripts, or jails, but now I can cause the issue with a simple wget command for a 100MB test file. When I issue the wget command, like Florian has described, ping times go way up (35100ms or more) to the point that ping requests time out, SSH sessions hang, and all networking is interrupted until I can mash Control-C and kill the wget command. During the network 'freeze', the UPS software in FreeNAS sends "COMMBAD" error reports, I'm assuming because it also uses the network stack.

After finding this ticket, I went looking at netstat -m and found that the only thing that appears amiss is the jumbo clusters line:

0/314627/0 requests for jumbo clusters denied (4k/9k/16k)

Changing the MTU from 9000 to 1500 via the FreeNAS UI brings an instant improvement: the ping times drop at that exact second and network activity starts working correctly. Where I couldn't manage to download the simple 100MB file via wget before, now there's no issue. Pings stay around 1ms, SSH keeps working, and the file downloads completely, reaching close to the maximum of my internet speed.

I've never used FreeNAS 9.10 like Florian has, but I've only started seeing this issue recently, which makes me believe it's a more recent regression within FreeNAS 11.x than from 9.10.

For note, this might be unrelated, but my dmesg output has also started logging occasional lines like these:

Limiting closed port RST response from 312 to 200 packets/sec
Limiting closed port RST response from 587 to 200 packets/sec
sonewconn: pcb 0xfffff801083823a1: Listen queue overflow: 8 already in queue awaiting acceptance (2 occurrences)
Limiting closed port RST response from 241 to 200 packets/sec

Other pages online suggest that at least the sonewconn line might be application-related, but the output seems to go hand-in-hand with the freezes.

Is there anything to try aside from using an MTU of <8.5k as mentioned above? In the meantime, I'm going to try using the second Gigabit port with its different Intel I210 chipset to see if it makes any difference.
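
A minimal reproduction of what David describes, sketched with placeholder addresses and URL:

# From another host: watch latency to the NAS.
ping 192.168.0.10

# On the NAS: pull a large test file to trigger the freeze.
wget -O /dev/null http://speedtest.example.com/100MB.bin

# On the NAS: watch the denied counter climb while the pings degrade.
netstat -m | grep "requests for jumbo clusters denied"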

#17 Updated by Alexander Motin 12 months ago

  • Assignee changed from Alexander Motin to Ryan Moeller

#18 Updated by Ryan Moeller 12 months ago

The current recommendation depends on the driver (a combined example is sketched after this list):
em, igb: Set the MTU to less than 4k to avoid 9k jumbo cluster fragmentation.
ixgbe, ixl: These drivers don't use 9k jumbo clusters, so fragmentation shouldn't be an issue.
cxgbe: Set hw.cxgbe.largest_rx_cluster=4096 in loader.conf to avoid 9k jumbo cluster fragmentation.
cxgb: Set the MTU to 1500 to avoid 9k jumbo cluster fragmentation.
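
Here is what those workarounds might look like in practice; interface names are placeholders and the MTU values are illustrative:

# em/igb: keep the MTU under 4k (via the FreeNAS UI, or for a quick test):
ifconfig em0 mtu 4000

# cxgbe: cap the receive cluster size in /boot/loader.conf:
hw.cxgbe.largest_rx_cluster="4096"

# cxgb: fall back to the standard MTU:
ifconfig cxgb0 mtu 1500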

I have prompted a discussion on the freebsd-net mailing list about how to address the issue in the stable branches. We may end up changing all the drivers that use 9k jumbo clusters so they instead use 4k jumbo clusters for devices that support scatter/gather I/O (modern cards support this). There has already been work in this direction on 12-CURRENT, but I'd be surprised if any of that can get MFC'd due to the major restructuring in -CURRENT for iflib.

I have a gist of my notes on the details of the problem here for those interested:
https://gist.github.com/freqlabs/eba9b755f17a223260246becfbb150a1

#19 Updated by Ryan Moeller 12 months ago

  • Status changed from Not Started to In Progress

Working on a patch for em and igb upstream:
https://reviews.freebsd.org/D16534

I'm looking at how to do a similar workaround in cxgb now, but it appears it may not be quite so simple.
Unfortunately I don't have hardware to test hw.cxgb.use_16k_clusters=1 in loader.conf as a workaround.
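
For anyone with cxgb hardware who can verify it, that untested tunable would go in /boot/loader.conf as:

# Untested workaround: have cxgb use 16k clusters instead of 9k.
hw.cxgb.use_16k_clusters="1"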

#20 Updated by Ryan Moeller 12 months ago

  • Status changed from In Progress to Ready for Testing
  • Target version changed from Backlog to 11.2-BETA3
  • Needs Merging changed from Yes to No

The patch for e1000 was accepted upstream and I've committed the change to our freenas/11-stable os branch:
https://github.com/freenas/os/commit/3e0853db1a2a497d24c65ab14b0314f1fe2a0543

Doc Notes
cxgbe: Set hw.cxgbe.largest_rx_cluster=4096 in loader.conf when using jumbo frames to avoid this issue.
cxgb: No workaround at this time; don't use jumbo frames.

Test Notes
When an interface is configured for 9k jumbo frames (mtu 9000), `netstat -m` should show no 9k jumbo clusters in use.
The following drivers should behave correctly (a verification sketch follows this list):
  • em
  • igb
  • ixgbe
  • ixl
  • cxgbe (when the loader tunable is set as described above)
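
A quick verification sketch, assuming the netstat output format quoted earlier in this ticket (em0 is a placeholder):

# Configure jumbo frames on the interface under test:
ifconfig em0 mtu 9000
# Generate some traffic, then check the 9k pool; the "in use" count should stay at 0:
netstat -m | grep "9k jumbo clusters"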

Other Notes
I started to test the re (Realtek) driver but ran into a different issue long before 9k jumbo cluster fragmentation became a problem.
This ticket in the FreeBSD bug tracker seems relevant, for those interested:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208205
I think it is well-established already that Realtek devices should not be used with FreeNAS, so I did not keep looking into it.

#21 Updated by Dru Lavigne 12 months ago

  • Subject changed from requests for jumbo clusters denied to Use 4k jumbo cluster pool for all jumbo frames until the allocator problem is fixed for the 9k jumbo cluster pool

#22 Updated by Dru Lavigne 12 months ago

  • Status changed from Ready for Testing to In Progress
  • Target version changed from 11.2-BETA3 to 11.1-U6
  • Needs Merging changed from No to Yes

#24 Updated by Ryan Moeller 12 months ago

  • Status changed from In Progress to Ready for Testing
  • Needs Merging changed from Yes to No

#28 Updated by Bonnie Follweiler 11 months ago

  • Status changed from Ready for Testing to Passed Testing
  • Needs QA changed from Yes to No

Passed Testing in FN 11.1-U6 Internal5

#29 Updated by Dru Lavigne 11 months ago

  • Status changed from Passed Testing to Done
  • Needs Doc changed from Yes to No

#30 Updated by Frank Riley 11 months ago

The mlx4 driver also has this issue. I am working around it by setting the MTU to 4000. Note that 4096 won't work as there appears to be some overhead that causes it to still use 9k clusters.
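
As a sketch of that workaround (mlxen0 is an assumed name for an mlx4 interface):

# 4000 rather than 4096: headroom for the per-packet overhead Frank mentions
# keeps received frames within 4k clusters.
ifconfig mlxen0 mtu 4000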

#31 Updated by Alexander Motin 11 months ago

Frank Riley wrote:

The mlx4 driver also has this issue. I am working around it by setting the MTU to 4000. Note that 4096 won't work as there appears to be some overhead that causes it to still use 9k clusters.

I can add that this was recently fixed in the newer mlx5 driver; the fix will land in FN 11.3 at the latest, unless it is merged explicitly sooner.
