Bug #25420

Reenable Chelsio TCP offload now that bug in T580-SO-CR 4x10G is fixed

Added by Andrew Nguyen over 1 year ago. Updated 10 months ago.

Status: Done
Priority: Blocks Until Resolved
Assignee: Alexander Motin
Category: OS
Target version:
Seen in: TrueNAS - 11.0-U2
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: No
Needs Doc: No
Needs Merging: No
Needs Automation: No
Support Suite Ticket:
Hardware Configuration:
ChangeLog Required: No

Description

On the latest TrueNAS 11 production releng, the Chelsio T580-SO-CR card is not detected by the OS.
Tried an older version (9.3) and the card was detected just fine.

When attempting to change trains to 9.10 from 11, this message appears:
The following Inconsistencies were found in your Current Install:
List of Checksum Mismatches:
/usr/local/share/vpds/t580_lp_so_spider_variable_2133_vpd.bin
/usr/local/share/vpds/t580_lp_so_variable_vpd.bin

Associated revisions

Revision 7e6f2874 (diff)
Added by Alexander Motin 12 months ago

Remove Chelsio TCP offload disable.

The issue with T580-SO-CR in 4x10G mode is reported to be fixed by a firmware update.

Ticket: #25420

History

#1 Updated by Kris Moore over 1 year ago

  • Assignee changed from Release Council to Alexander Motin
  • Target version set to 11.0-U3

#2 Updated by Kris Moore over 1 year ago

  • Priority changed from No priority to Blocks Until Resolved

#3 Updated by Dru Lavigne over 1 year ago

  • Status changed from Untriaged to Unscreened

#5 Updated by Alexander Motin over 1 year ago

  • Status changed from Unscreened to 15

It seems to be a return of the bug fixed earlier in 9.10.2. Unfortunately, I still can't reproduce it in my lab since I don't have the "SO" version of the card, which seems to be the only one affected, due to its limited RAM size. Can you give me access to such a system?

You may also try updating to the nightly train to see whether the new firmware bundled there helps.

#8 Updated by Alexander Motin over 1 year ago

  • Status changed from 15 to Investigation

The system provided by Andrew does reproduce it. As before, it affects only cards in 4x10G mode.

#9 Updated by Steve Wong over 1 year ago

Alexander, so native 40GbE mode is not an issue; correct?

Support -- do you need product management to generate a technical notice?

#10 Updated by Alexander Motin over 1 year ago

Steve Wong wrote:

Alexander, so native 40GbE mode is not an issue; correct?

Yes, I switched the card there and back, and 2x40G mode does work, while 4x10G doesn't.

#14 Updated by Alexander Motin over 1 year ago

There is another, easier workaround -- disable TCP offload (which we are not using now anyway):
hw.cxgbe.toecaps_allowed=0
hw.cxgbe.rdmacaps_allowed=0
hw.cxgbe.iscsicaps_allowed=0
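
As an illustration only, the check below is a minimal sketch of how one might confirm that all three workaround tunables are set to 0 in loader.conf-style text. The tunable names come from the comment above; the function names and the simplified parsing rules (name=value lines, optional double quotes, `#` comments) are assumptions made for this example, not part of FreeNAS or TrueNAS.

```python
# Tunable names are taken from the workaround above.
# Everything else here (function names, parsing rules) is hypothetical.
WORKAROUND_TUNABLES = (
    "hw.cxgbe.toecaps_allowed",
    "hw.cxgbe.rdmacaps_allowed",
    "hw.cxgbe.iscsicaps_allowed",
)


def parse_tunables(text):
    """Parse name=value lines (loader.conf style) into a dict."""
    tunables = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = value.strip().strip('"')
    return tunables


def missing_workaround(text):
    """Return the workaround tunables that are absent or not set to 0."""
    found = parse_tunables(text)
    return [name for name in WORKAROUND_TUNABLES if found.get(name) != "0"]
```

An empty result from `missing_workaround` would mean the text disables TOE, RDMA, and iSCSI offload capabilities as described in the workaround.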

#15 Updated by Alexander Motin over 1 year ago

  • Status changed from Fix In Progress to 19

I've pushed the workaround into nightly for both FreeNAS and TrueNAS.

#16 Updated by Alexander Motin over 1 year ago

  • Status changed from 19 to Ready For Release

Merged the patches to 11.0-stable and 11.0-u2.1-stable branches.

#17 Updated by Dru Lavigne over 1 year ago

  • Target version changed from 11.0-U3 to TrueNAS-11.0-U2.1

#21 Updated by Joe Maloney over 1 year ago

I started testing this morning with the system Andrew provided.

To begin, I removed the following tunables, which had been set manually from the webui:

hw.cxgbe.toecaps_allowed=0
hw.cxgbe.rdmacaps_allowed=0
hw.cxgbe.iscsicaps_allowed=0
  1. Proceeded to reboot to verify tunables were not loaded.
  2. Verified that cxl devices were missing as expected.
  3. I upgraded to u2.1 internal using update-int.
  4. Upon reboot, the autotuner kicked in, set the tunables, and rebooted again.
  5. Verified all cxl devices were present.
  6. Loader tunables are not present in webui but are obviously loaded.

Assuming this is expected, I see no issues. Do we expect the additional reboot to have any impact on failover for HA? The system I tested on was not HA. Before marking the test as passed, I would like someone to review my feedback.
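
The ticket does not say how the cxl devices were verified in steps 2 and 5. As a hedged sketch, assuming the check is done against the space-separated interface list that `ifconfig -l` prints, the presence test could look like the following. The function names and the default of four ports (for a card in 4x10G mode) are assumptions made for this example.

```python
import re

# cxl<N> is the interface name pattern for the ports discussed in this ticket.
CXL_NAME = re.compile(r"^cxl\d+$")


def cxl_interfaces(ifconfig_l_output):
    """Return cxl interface names from a space-separated interface list."""
    return [name for name in ifconfig_l_output.split() if CXL_NAME.match(name)]


def check_ports(ifconfig_l_output, expected_ports=4):
    """True when at least expected_ports cxl interfaces are present.

    expected_ports=4 is an assumption matching 4x10G mode.
    """
    return len(cxl_interfaces(ifconfig_l_output)) >= expected_ports
```

With a hypothetical interface list such as `igb0 igb1 lo0`, `check_ports` would report the missing-devices state from step 2; a list containing `cxl0` through `cxl3` would correspond to step 5.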

#22 Updated by Alexander Motin over 1 year ago

The feedback looks reasonable. The additional reboot should not be a big deal for HA, though I don't like it much either, since it takes additional time.

#23 Updated by Joe Maloney over 1 year ago

  • Status changed from 47 to Ready For Release
  • Needs QA changed from Yes to No
  • QA Status Test Passes added
  • QA Status deleted (Not Tested)

#26 Updated by Vaibhav Chauhan over 1 year ago

  • Status changed from Ready For Release to Resolved

#28 Updated by Dru Lavigne about 1 year ago

  • File deleted (debug-20170803150216.tar)

#29 Updated by Alexander Motin 12 months ago

  • Status changed from Resolved to Done
  • Target version changed from TrueNAS-11.0-U2.1 to TrueNAS 11.2
  • Needs QA changed from No to Yes
  • Needs Doc changed from Yes to No
  • Needs Merging changed from Yes to No

I've re-enabled TCP offload for 11.2, since the merged firmware update is reported to properly fix the original issue. But it needs to be verified.

#30 Updated by Dru Lavigne 12 months ago

  • Project changed from TrueNAS to FreeNAS
  • Subject changed from T580-SO-CR Not working with latest TrueNAS to Reenable Chelsio TCP offload now that bug in T580-SO-CR 4x10G is fixed
  • Category changed from OS to OS
  • Target version changed from TrueNAS 11.2 to 11.2-BETA1
  • Migration Needed deleted (No)
  • Hide from ChangeLog deleted (No)
  • Support Department Priority deleted (0)

#31 Updated by Dru Lavigne 12 months ago

  • Status changed from Done to Ready for Testing

#32 Updated by Nick Wolff 10 months ago

  • Status changed from Ready for Testing to Passed Testing
  • Severity set to New

Tested on FreeNAS running internal7 with the card set to 4x10G mode.

#33 Updated by Dru Lavigne 10 months ago

  • Status changed from Passed Testing to Done
  • Needs QA changed from Yes to No
