Correctly report ZFS dataset quota overflows
When a ZFS dataset is below its quota at the time files are opened for writing, but reaches the quota while clients are still actively writing, something in ZFS eventually slows to a crawl.
This is trivially reproducible on TrueNAS 11.0-U4:
1. zfs create cargo/quota-test
2. zfs set compression=off cargo/quota-test
3. zfs set refquota=50m cargo/quota-test
5. dd if=/dev/zero of=file.out bs=1m
If you perform these steps, you will notice that the "dd" command keeps issuing writes to the dataset and never stops, even though the quota has been reached. I ran procstat -kk -a | fgrep dd and attached the output to this ticket.
NOTE: I have tried this on stock FreeBSD 11.0, 11.1, and CURRENT, and all produce the same results.
#1 Updated by Alexander Motin almost 4 years ago
- Subject changed from zfs refquota deadlock to ZFS dataset quota overflows not reported correctly
- Status changed from Unscreened to Fix In Progress
- Priority changed from Important to Critical
- Target version set to TrueNAS 11.1-U1
Investigation of the problem brought me to the FreeBSD-specific change r298105 by avg@ on 2016-04-16. If a quota overflow is detected during a write, the write fails, but the error status can be lost, so the call falsely reports partial completion. As a result, the written data flows nowhere, indefinitely, as fast as the CPU can handle the loop.
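To illustrate why a lost error status turns into an endless loop, here is a toy model in plain shell. It does not touch ZFS or reproduce the actual kernel code path; the names QUOTA, write_block_fixed, write_block_buggy and run_writer are hypothetical and exist only for this sketch. It assumes a "dd"-like writer that keeps writing until a write reports failure.

```shell
#!/bin/sh
# Toy model only: no ZFS involved, all names are hypothetical.
QUOTA=50    # quota, counted in blocks
used=0      # blocks actually stored so far

# Expected behaviour: report failure once the quota is reached.
write_block_fixed() {
    if [ "$used" -ge "$QUOTA" ]; then
        return 1            # stands in for the quota error
    fi
    used=$((used + 1))
    return 0
}

# Modelled bug: the quota check fires and nothing is stored,
# but the error status is dropped and the caller sees success.
write_block_buggy() {
    if [ "$used" -lt "$QUOTA" ]; then
        used=$((used + 1))
    fi
    return 0
}

# dd-like loop: keep writing until a write fails.
# The cap ($2) is a safety limit so the buggy case terminates here.
run_writer() {
    writer=$1
    cap=$2
    used=0
    calls=0
    while [ "$calls" -lt "$cap" ]; do
        calls=$((calls + 1))
        "$writer" || break
    done
    # Print "<write calls issued> <blocks stored>".
    echo "$calls $used"
}
```

With these definitions, run_writer write_block_fixed 1000 prints "51 50": the 51st write fails and the loop stops. run_writer write_block_buggy 1000 prints "1000 50": only 50 blocks are ever stored, yet the writer runs until the safety cap, which is the "flowing to nowhere" behaviour described above.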