Bug #37786
Remove double free which caused bhyve to SIGBUS
Description
My bhyve VMs have kept crashing with signal 10 since the beta update. It doesn't matter whether I'm running just the 512MB DNS one or all three; they keep crashing. This only started happening after I switched to the beta.
History
#1
Updated by Greg Fitzgerald almost 3 years ago
- File debug-freenas-20180712002308.txz added
- Private changed from No to Yes
#2
Updated by Marcelo Araujo almost 3 years ago
- Status changed from Unscreened to In Progress
- Assignee changed from Release Council to Marcelo Araujo
#3
Updated by Marcelo Araujo almost 3 years ago
- Severity changed from New to High
- Needs Doc changed from Yes to No
- Needs Merging changed from Yes to No
#4
Updated by Marcelo Araujo almost 3 years ago
I have access to Greg's box and am analyzing the reason for this crash.
#5
Updated by Marcelo Araujo almost 3 years ago
Marcelo Araujo wrote:
I have access to Greg's box and am analyzing the reason for this crash.
Hi Greg,
I spent all day looking into your machine and could see the VM crash with SIGBUS. While investigating this issue, I noticed you have lots of errors like this one:
sonewconn: pcb 0xfffff8013195f570: Listen queue overflow: 151 already in queue awaiting acceptance (4 occurrences)
A total of:
root@freenas:~ # dmesg | grep sonewconn | wc -l
3419
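Each sonewconn line ends with an "(N occurrences)" counter, so counting lines with wc -l may understate the number of overflow events. The following is a minimal sketch, assuming the message format matches the sample line quoted above; count_overflows is a hypothetical helper name, not something shipped with FreeNAS.

```shell
# Read kernel messages on stdin and report how many sonewconn overflow
# lines were seen, plus the sum of their "(N occurrences)" counters.
# A line without a counter is counted as a single event.
count_overflows() {
    awk '/sonewconn/ && /Listen queue overflow/ {
        n = 1
        if (match($0, /\([0-9]+ occurrences?\)/)) {
            # Skip the opening parenthesis; "+ 0" keeps the leading number.
            n = substr($0, RSTART + 1) + 0
        }
        lines++
        events += n
    }
    END { printf "%d lines, %d events\n", lines, events }'
}

# Assumed usage on the affected host:
#   dmesg | count_overflows
```

If the events total is far above the line count, the queue is overflowing more often than the raw line count suggests.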
It may be caused by your Broadcom NIC; however, I'm still not 100% sure about that.
A few hours ago I set this sysctl: sysctl kern.ipc.soacceptqueue=4096
The default value is 128, which is pretty low for heavy network traffic. I raised it to 4096 and restarted some services, such as mdnsd and netatalk, as well as your 3 VMs.
I would suggest setting this sysctl under System->Tunables and rebooting your FreeNAS.
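For reference, the runtime change described above can be sketched as a small helper that sets the value and reads it back. This is only an illustration: set_and_verify is a hypothetical name, and a value set this way does not survive a reboot, which is why the Tunables entry is still needed.

```shell
# Set a sysctl at runtime and confirm the kernel reports the new value.
# Returns 0 only if the value read back equals the value requested.
set_and_verify() {
    name=$1
    want=$2
    sysctl "${name}=${want}" >/dev/null || return 1
    got=$(sysctl -n "$name") || return 1
    [ "$got" = "$want" ]
}

# Assumed usage, as root on the FreeNAS host:
#   set_and_verify kern.ipc.soacceptqueue 4096 && echo "applied"
```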
Launch your 3 VMs again and let me know if that solves the VM crashes.
I have been running your 3 VMs for over 3 hours without a crash.
Best,
#6
Updated by Marcelo Araujo almost 3 years ago
- Reason for Blocked set to Waiting for feedback
- Needs Doc changed from No to Yes
#7
Updated by Greg Fitzgerald almost 3 years ago
Thank you for spending the time debugging this. I set the sysctl value and rebooted. I'll let you know if they crash again.
#8
Updated by Greg Fitzgerald almost 3 years ago
- File debug.tgz added
I was up all night working. I woke up at 2:30PM EST and my VMs had crashed again. I attached the debug.tgz.
#9
Updated by Marcelo Araujo almost 3 years ago
Greg Fitzgerald wrote:
I was up all night working. I woke up at 2:30PM EST and my VMs had crashed again. I attached the debug.tgz.
Hello Greg,
I connected to your machine, and kern.ipc.soacceptqueue is still 128.
root@freenas:~ # sysctl kern.ipc.soacceptqueue
kern.ipc.soacceptqueue: 128
Did you roll it back?
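A check like the one above is easy to repeat after every reboot. A minimal sketch follows, assuming only that sysctl -n prints the bare value; check_tunable is a hypothetical helper name.

```shell
# Compare the live value of a sysctl with the value it is expected to have.
# Prints one status line; returns 0 on match, 1 on mismatch, 2 if unreadable.
check_tunable() {
    name=$1
    want=$2
    if ! got=$(sysctl -n "$name" 2>/dev/null); then
        echo "$name: could not be read"
        return 2
    fi
    if [ "$got" = "$want" ]; then
        echo "$name: ok ($got)"
    else
        echo "$name: expected $want, found $got"
        return 1
    fi
}

# Assumed usage after a reboot:
#   check_tunable kern.ipc.soacceptqueue 4096
```

A mismatch right after boot suggests the tunable was never applied, rather than applied and later reverted.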
#10
Updated by Dru Lavigne over 2 years ago
- Target version changed from Backlog to 11.2-BETA2
#12
Updated by Greg Fitzgerald over 2 years ago
Yes, when I set up the sysctl in the GUI, I had its type set to loader instead of sysctl. When I booted back up, I failed to verify that it was set correctly. I have since fixed it, but the VMs are still crashing regularly.
The fix has increased my network throughput on NFS shares by a lot, though. I'm not sure of the exact numbers, but it more than doubled. I wonder if this option could be set by autotune when a Realtek card is detected in the system?
#13
Updated by Marcelo Araujo over 2 years ago
#14
Updated by Dru Lavigne over 2 years ago
- File deleted (debug-freenas-20180712002308.txz)
#15
Updated by Dru Lavigne over 2 years ago
- File deleted (debug.tgz)
#16
Updated by Dru Lavigne over 2 years ago
- Subject changed from Bhyve VM's Crashing to Remove double free which caused iocage to SIGBUS
- Private changed from Yes to No
- Needs Doc changed from Yes to No
- Needs Merging changed from No to Yes
#17
Updated by Dru Lavigne over 2 years ago
- Status changed from In Progress to Ready for Testing
- Needs Merging changed from Yes to No
#18
Updated by Dru Lavigne over 2 years ago
- Reason for Blocked deleted (Waiting for feedback)
#19
Updated by Marcelo Araujo over 2 years ago
- Subject changed from Remove double free which caused iocage to SIGBUS to Remove double free which caused bhyve to SIGBUS
#20
Updated by Dru Lavigne over 2 years ago
- Related to Bug #34747: Bhyve process exits added
#21
Updated by Joe Maloney over 2 years ago
- Status changed from Ready for Testing to Passed Testing
#22
Updated by Dru Lavigne over 2 years ago
- Status changed from Passed Testing to Done
- Needs QA changed from Yes to No