Fatal trap 12 when using iSCSI zvol on FreeNAS 9.10-U6 or FreeNAS 11.0-U2
SuperMicro X10SRL-F, E5-1620v3, 128GB ECC RAM, 92811-8i, 9341-8i (latest firmware), Intel X520 10GbE, 2 x Intel 320 SSD mirror boot device
SuperMicro X8SIL, X3430, 16GB ECC RAM, 9211-8i (latest firmware), Intel X520 10GbE, 64GB Lexar USB boot device
I've run into an issue that I can reproduce 100% of the time on two completely different hardware platforms when connecting to an iSCSI zvol from a Windows Server 2016 machine. The server is running the latest Data Protection Manager software and the target is set up as its storage.
The attached screenshots were captured at various points during the crash on both systems. The older system with only 16GB of RAM crashes in about 45 seconds at transfer rates of ~400-450MB/sec, while the newer system takes about 5-6 minutes at rates of ~550-600MB/sec on my 10Gb network. The newer system has been running FreeNAS for over a year and I've moved 70+TB of data to it at 770MB/sec on average without a single issue or crash.
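As a rough back-of-the-envelope figure, the amount of data written before each crash can be estimated from the rates and times above. This is only a sketch: it assumes the quoted rates are decimal MB/sec and stay roughly constant until the crash, and the function name is just for illustration.

```python
def data_before_crash_gb(rate_mb_per_s, seconds):
    """Approximate data transferred (in GB) at a steady rate for a given time."""
    return rate_mb_per_s * seconds / 1000


# Older system (16GB RAM): ~400-450MB/sec for about 45 seconds
old_system = (data_before_crash_gb(400, 45), data_before_crash_gb(450, 45))

# Newer system (128GB RAM): ~550-600MB/sec for about 5-6 minutes
new_system = (data_before_crash_gb(550, 5 * 60), data_before_crash_gb(600, 6 * 60))

print("older system: roughly %.0f-%.0f GB before the crash" % old_system)
print("newer system: roughly %.0f-%.0f GB before the crash" % new_system)
```

So the older system goes down after very roughly 18-20GB and the newer one after roughly 165-216GB. These are estimates only, not measured values.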
The disks I'm using are 5 x Toshiba 3TB Enterprise drives in a RAIDZ1, which have been checked with preclear, short/long SMART tests, and HD Sentinel with absolutely no signs of errors. I also ran a full 24-hour memtest86+ on both systems to rule out my RAM.
Three months ago I had the same drives in a RAIDZ1 used as a zvol target for a Windows Server 2012 R2 Data Protection Manager server, which ran for over a year, but I have recently upgraded to 2016. I no longer have a 2012 R2 DPM server to test with at the moment, though. I'm not sure if that has anything to do with this problem, but I would rather mention it than not.
#6 Updated by Alexander Motin almost 3 years ago
- Status changed from Unscreened to Closed: Insufficient Info
I'm sorry, but your debug data does not include any core dumps, and the screenshots only point to some possible memory corruption (which may be caused by software) without giving any clue about the source. I can only recommend that you try updating to FreeNAS 11, which could possibly fix it or may give more information. Also, if the problem is reliably reproducible, you may try temporarily enabling the debugging kernel, which may crash your system closer to the source of the problem and give more clues.