11.2 Beta2 - system lock up when deleting files
Noted in the FreeNAS forum at https://forums.freenas.org/index.php?threads/11-2-beta2-system-lock-up-when-deleting-files.68993/
HP Microserver G8 N54L, 8GB, ZFS RaidZ1 4x 4TB WD Red. Running FreeNAS since V8. Home NAS, on 24/7; it was updated from v8 to 9.x sometime in 2016.
Upgraded from 9.10 to 11.1 Beta 1 on 21 July.
Upgraded to 11.2 Beta1 on 29th July - introduction of new UI.
Upgraded to 11.2 Beta2 on 4th Aug.
v11 has introduced massive delays in boot times: over 5 minutes from power on to prompt.
I have been cleaning up and reorganising data: moving, copying and migrating data off to external systems and disks, then deleting that data from the NAS.
This has been working well until 11.2 Beta 2.
Removing files and folders, either via SMB/CIFS or directly on the NAS from the command line (rm / rm -R), is causing a complete system lock up: the web interface renders page not found, ssh disconnects, ping is unavailable and on the direct console the host is completely unresponsive. A complete power cycle is required.
The files in question are media (TV/movies). I can remove one at a time, and sometimes a few, or a small folder. If, however, part of a season or a full season is removed, it bombs.
Pool is healthy. 70% full.
Scrubs are clean.
Temps are good.
Smartd is happy, and not reporting any errors.
Pool has been updated to latest version. Issue presented before and after pool update.
Is there a way to identify whether the data is problematic or whether Beta 2 has a gremlin?
Would booting back into the 11.1/11.2b1 boot environment work? (Chris M has ruled this out in the forum.)
Please advise info required for diagnosis.
#3 Updated by ludovic simpson about 2 years ago
- File procstat1.txt added
- File procstat2.txt added
- File procstat3.txt added
- File procstat4.txt added
- File procstat5.txt added
Sean Fagan wrote:
We'd need a system debug at least. The output of "procstat -kk -a" while it's hung, if possible.
Sean, I've replicated the issue and attached some files for your viewing pleasure.
I had two ssh terminals active: one (user) to initiate the deletion, and the other (root) to run procstat as instructed above.
procstat1.txt - initial run on idle system.
procstat2.txt, procstat3.txt and procstat4.txt were run during 3 separate successful deletions (rm -R of a folder with about 24 episode files within).
procstat5.txt was running when the system froze, approx. midway through the 4th deletion run.
It required a power cycle to restore, and looking at the 4th target folder, there are about 7 files remaining of 24.
Hopefully there's something useful in there.
Let me know what else you require.
PS: as Disk Didler advised me, it's a G7, not a G8 Microserver. However, the model is correct as N54L.
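For anyone repeating the capture described above, here is a minimal sketch of a loop that snapshots the requested `procstat -kk -a` output to numbered files, so the last file written before a freeze survives. The function name `capture_loop`, the variables `OUT_DIR`, `INTERVAL` and `COUNT`, and the default output directory are all hypothetical and not part of FreeNAS; only the `procstat -kk -a` command comes from this ticket. It assumes `OUT_DIR` sits on storage that survives a power cycle.

```shell
#!/bin/sh
# Hypothetical helper, not an official FreeNAS tool.
# Repeatedly saves the output of a capture command to numbered files.

# Command requested in this ticket; override via the environment if needed.
CAPTURE_CMD="${CAPTURE_CMD:-procstat -kk -a}"
# Assumed output location; pick storage that survives a power cycle.
OUT_DIR="${OUT_DIR:-./procstat-captures}"
# Seconds between snapshots.
INTERVAL="${INTERVAL:-2}"
# Number of snapshots to take; 0 means run until interrupted.
COUNT="${COUNT:-0}"

capture_loop() {
    mkdir -p "$OUT_DIR" || return 1
    i=0
    while [ "$COUNT" -eq 0 ] || [ "$i" -lt "$COUNT" ]; do
        i=$((i + 1))
        # Timestamp plus counter keeps file names unique within one second.
        out="$OUT_DIR/procstat-$(date +%Y%m%d-%H%M%S)-$i.txt"
        # CAPTURE_CMD is intentionally unquoted so its arguments are split.
        $CAPTURE_CMD > "$out" 2>&1
        # Ask the OS to flush buffered writes before the next snapshot.
        sync
        sleep "$INTERVAL"
    done
}
```

Run `capture_loop` as root in one terminal, then start the `rm -R` in the other. If the hang also blocks local writes, the same loop could be run from another machine by setting `CAPTURE_CMD` to an ssh invocation of procstat, so the files land on the client instead.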
#9 Updated by ludovic simpson about 2 years ago
As per Chris' advice, despite upgrading my zpool, I was able to boot back into 11.2 Beta 1 and successfully delete files again.
Full details in forum post: https://forums.freenas.org/index.php?threads/11-2-beta2-system-lock-up-when-deleting-files.68993/#post-474077
#11 Updated by ludovic simpson about 2 years ago
"Dobryj den" Alexander,
(as detailed in the forum post linked above)
I completed several successful deletions after reverting the boot environment back to Beta 1.
..Since then NO further actions have been performed.
...awaiting your feedback and instruction.
#12 Updated by Alexander Motin about 2 years ago
- Status changed from Unscreened to Screened
- Target version changed from Backlog to 11.2-BETA3
I have a gut feeling that it is somehow related to #27514, I am just not sure how. But as far as I can see, the #27514 patch was committed just before BETA1, not after it. There are only a few ZFS commits between BETA1 and BETA2, and I don't think they are related. Are you sure it is the revert to BETA1 that helps you, not a reboot just before the delete or some other factor(s)?
As I understand it, while you are deleting files the system is alive enough for you to run at least `procstat`. What happens after that? Does the system just immediately stop reacting to anything, including the local console? Can you try booting the debug kernel of BETA2 and see whether that causes a panic or other debug output instead of a hang?
#16 Updated by Alexander Motin about 2 years ago
- Status changed from Screened to Closed
- Target version changed from 11.2-RC1 to N/A
- Reason for Closing set to Duplicate Issue
- Needs QA changed from Yes to No
- Needs Doc changed from Yes to No
- Needs Merging changed from Yes to No
Closing this as a duplicate of #41910, which was the first to provide the required information. I've reproduced the issue and am working on the patch.