Critical alerts don't clear in web GUI
My HDD temps occasionally exceed 40 deg C, and when they do, I get an email notification of a critical alert. I then take action to lower the temp (for example, by opening the door to the closet where the server is kept), and it goes down nicely. The problem is that the critical alert never disappears from the web GUI. When I log in I get the red warning light. Even after I clear the checkboxes next to the individual warnings, the warnings stay there, even weeks after the issue has been resolved.
I see two possible fixes:
1. At whatever interval FreeNAS re-checks SMART data, if the temperature is now below the threshold value, clear out the alert without any further user intervention.
2. Require acknowledgement in the web GUI of this former critical alert, but once acknowledged, remove the alert.
My preference would be option 1: the red warning is meant to indicate that the server is currently in a critical state, and if the drives have cooled, the server is no longer in a critical state. The second option has the advantage of requiring the user to actively acknowledge the problem, but the disadvantage of requiring user intervention to resolve a non-issue. If the second option is chosen, it would be helpful if these alerts noted the date/time when they occurred.
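To make the two options concrete, here is a minimal Python sketch of the re-check step. It is not FreeNAS code: the `Alert` class, the `recheck` function, the `THRESHOLD_C` constant, and the disk names are all hypothetical, and it assumes that a disk with no current reading (for example, one that has been removed) no longer has an active condition.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional

THRESHOLD_C = 40  # hypothetical critical threshold, taken from the report above


@dataclass
class Alert:
    disk: str
    temp_c: int
    raised_at: datetime                    # date/time the alert occurred
    cleared_at: Optional[datetime] = None  # set once the temp is back to normal
    acknowledged: bool = False


def recheck(alerts: Dict[str, Alert], temps: Dict[str, int], now: datetime,
            require_ack: bool = False) -> Dict[str, Alert]:
    """Return the alerts that should still be shown after a new set of readings.

    require_ack=False is option 1: an alert vanishes as soon as the disk cools.
    require_ack=True is option 2: a cooled alert stays until it is acknowledged.
    """
    updated: Dict[str, Alert] = {}
    for disk in set(alerts) | set(temps):
        temp = temps.get(disk)
        alert = alerts.get(disk)
        # Assumption: no reading means the condition is no longer present.
        hot = temp is not None and temp > THRESHOLD_C
        if hot:
            if alert is None or alert.cleared_at is not None:
                alert = Alert(disk, temp, raised_at=now)  # new occurrence
            else:
                alert.temp_c = temp  # still hot: refresh the reading
            updated[disk] = alert
        elif alert is not None and require_ack and not alert.acknowledged:
            if alert.cleared_at is None:
                alert.cleared_at = now
            updated[disk] = alert
        # Otherwise the alert is dropped without user intervention.
    return updated
```

The point of the sketch is that alerts are recomputed from the current readings on every check rather than accumulated as one-time events, which is what lets option 1 clear them automatically.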
#9 Updated by William Grzybowski almost 3 years ago
I don't know if there is anything we can do here, or at least anything that is worth the effort.
These alerts are more like one-time events. As far as I can tell, there is no easy way to check whether every single event triggered by SMART has been addressed or not.
What we could do, however, is change how events are processed and parse smartctl output periodically.
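As a rough illustration of what parsing smartctl periodically might involve, the Python sketch below reads the raw value of SMART attribute 194 (Temperature_Celsius) from the attribute table printed by `smartctl -A`. The function names are hypothetical, this is not how FreeNAS does it, and it assumes an ATA drive that reports attribute 194 in the usual ten-column table; other drives may report temperature differently.

```python
import subprocess
from typing import Optional


def parse_temperature(smartctl_output: str) -> Optional[int]:
    """Return the raw value of attribute 194 in deg C, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0] == "194" and fields[9].isdigit():
            return int(fields[9])
    return None


def read_temperature(device: str) -> Optional[int]:
    """Run `smartctl -A` on a device (needs smartmontools and privileges)."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_temperature(result.stdout)
```

A periodic job could call `read_temperature` for each disk and compare the result with the threshold, so that the alert state reflects the current reading instead of a past event.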
It sure would have been nice if this had been fixed, because it is still an issue. It is not specific to drive temp: if anything causes an alert, the only way to make the alert go away is to reboot the system, even if whatever initially triggered the alert has been corrected. For example, a bad hard drive is replaced in a system with hot-swap drive bays. The alert that the now-removed drive has a fault is still shown even after the replacement drive has finished resilvering.