Project

General

Profile

Bug #70140

Fix spurious warning about bridge capabilities

Added by Todd Martin almost 3 years ago. Updated over 2 years ago.

Status:
Done
Priority:
No priority
Assignee:
Ryan Moeller
Category:
OS
Target version:
Seen in:
Severity:
Low Medium
Reason for Closing:
Reason for Blocked:
Needs QA:
No
Needs Doc:
No
Needs Merging:
No
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:
ChangeLog Required:
No

Description

When a bridge device attempts to negotiate with Intel em NIC, the attempt to disable TCP Offload Engine (TOE) is not completely successful on em based NICs. The failure causes a down/up state on the em device, disconnecting everything external.

Steps to reproduce:
1.create a bridge device that is connected to an em driver based Intel NIC.
2.create a vnet jail to utilize the bridge device
3.first vnet jail to start and last vnet jail to stop causes down/up state of em NIC.

log entry reads:
freenas bridge0: can't disable some capabilities on em0: 0x1

Adding "-rxcsum" to the options for em based nics will forcibly disable the TOE feature that fails proper negotiation, and bypasses the issue.

Associated revisions

Revision 76ef21d5 (diff)
Added by Ryan Moeller almost 3 years ago

[bridge] Fix spurrious warning about capabilities Mask off the bits we don't care about when checking that capabilities of the member interface have been disabled as intended. Ticket: #70140

Revision 6c4a60ed (diff)
Added by Ryan Moeller almost 3 years ago

[bridge] Fix spurrious warning about capabilities Mask off the bits we don't care about when checking that capabilities of the member interface have been disabled as intended. Ticket: #70140

History

#1 Updated by Dru Lavigne almost 3 years ago

  • Assignee changed from Release Council to Alexander Motin

#2 Updated by Alexander Motin almost 3 years ago

  • Subject changed from TOE not successfully disabled on em driver when bridge device is utilized to rxcsum not successfully disabled on em driver when bridge device is utilized
  • Assignee changed from Alexander Motin to Ryan Moeller
  • Severity changed from New to Low Medium

rxcsum is not a TOE, it is only a checksum offload. TOE is too much for simple em NIC. Interface flap there is probably normal for em NICs, at least that is what I have also seen, and it is not related to the error. Not sure it is fixable other then by what you did.

Ryan, could you look on the em driver for why it may not want to disable rxcsum. It is not the first time I see this error.

Todd, it may help if you shown `ifconfig -m` outputs before and after the bridge creation to understand what capabilities it actually tries to disable. Models of your NICs may also be useful.

#3 Updated by Ryan Moeller almost 3 years ago

  • Status changed from Unscreened to Screened

em toggles both TX and RX CSUM capabilities together, so changing one implies the other. bridge is trying to disable RX and leave TX enabled. I will look closer into this conflict and find a way to resolve it.

Note: In FreeBSD 12+ this in handled in iflib instead of em

#4 Updated by Alexander Motin almost 3 years ago

Yes, I know that those two flags are tied on em, but as I understand this error mean that bridge tried to disable them both, but somehow can't. Bridge itself should not actually care about rxcsum, only about txcsum -- too much checksumming is not as bad as not enough.

#5 Updated by Ryan Moeller almost 3 years ago

Ah I got that backwards, you're right. The bridge is trying to disable TXCSUM and leave RXCSUM enabled, but both get disabled. When it sees the enabled capabilities don't match what it was trying to set, the message it prints is misleading. It didn't fail to disable anything, something extra got disabled that it wasn't expecting. So really the only problem is the message.

The link state cycles because the driver reinitializes the device whenever capabilities are changed. That is a different issue. It can be avoided by ensuring all interfaces on the bridge have their capabilities pre-set to what the bridge expects, so that no change is necessary.

#6 Updated by Ryan Moeller almost 3 years ago

  • Status changed from Screened to In Progress

#7 Updated by Alexander Motin almost 3 years ago

The message is not printed when capability is missing, only when capability is reported when should be disabled.

#8 Updated by Ryan Moeller almost 3 years ago

What's clear is that bridge shouldn't care about the state of RX, so I have added a mask to make it only check on the bits that it is trying to affect. I tested to confirm that this resolves the erroneous message.

The test procedure I used was:
1. verify RX and TX checksums are enabled on em0
2. create a bridge with em0 as a member
3. verify RX and TX checksums are enabled on em0
4. add a tap member to the bridge
5. verify RX and TX checksums are disabled on em0
6. remove the tap member from the bridge
7. observe the warning message before the patch, observe no warning message after the patch

bridge is enabling TXCSUM to restore the greatest common denominator of capabilities when the tap is removed. In doing so, RXCSUM is also enabled. bridge erroneously checks that RXCSUM stays disabled. Masking the irrelevant bits out ensures only the relevant bits are checked.

#9 Updated by Ryan Moeller almost 3 years ago

  • Target version changed from Backlog to 11.2-U3

#10 Updated by Ryan Moeller almost 3 years ago

#11 Updated by Ryan Moeller almost 3 years ago

  • Status changed from In Progress to Ready for Testing
  • Needs Merging changed from Yes to No

#13 Updated by Dru Lavigne almost 3 years ago

  • Subject changed from rxcsum not successfully disabled on em driver when bridge device is utilized to Fix spurious warning about bridge capabilities
  • Needs Doc changed from Yes to No

#14 Updated by Ryan Moeller almost 3 years ago

  • Status changed from Ready for Testing to Passed Testing

#16 Updated by Dru Lavigne over 2 years ago

  • Status changed from Passed Testing to Ready for Testing

#18 Updated by Ryan Moeller over 2 years ago

  • Status changed from Ready for Testing to Passed Testing

#20 Updated by Dru Lavigne over 2 years ago

  • Status changed from Passed Testing to Done
  • Needs QA changed from Yes to No

Also available in: Atom PDF