Add NVDIMM alerts to TrueNAS
#10 Updated by Alexander Motin over 2 years ago
QE: I think the most realistic test would be to disconnect the BBU from the NVDIMM and boot the system to check for the error. On my bench this makes boot take a few more minutes, but after that the error is reported by the NVDIMM and passed through by the driver.
#12 Updated by Alexander Motin over 2 years ago
According to Micron, we should not treat the "Critical Health Info: 0x4<PERSISTENCY_RESTORED>" status bit as a problem. I thought we could use this bit as an indicator of transient problems, but according to Micron it is also set during normal operation. We are already checking the other registers they told us about, so there is nothing more to add here; we just need to add this exception.
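
A minimal sketch of what such an exception could look like, not the actual TrueNAS alert code. It assumes the health status arrives as a text line in the form quoted above, with flag names comma-separated inside the angle brackets. The function name, the `IGNORED_FLAGS` set, and the parsing regex are all hypothetical; only the `PERSISTENCY_RESTORED` name and the line format come from this ticket.

```python
import re

# Flags that should not raise an alert. Only PERSISTENCY_RESTORED is taken
# from this ticket; keeping it in a set is an illustrative choice.
IGNORED_FLAGS = {"PERSISTENCY_RESTORED"}

# Assumed line format: "Critical Health Info: 0x4<NAME1,NAME2>"
_LINE_RE = re.compile(r"Critical Health Info:\s*(0x[0-9a-fA-F]+)(?:<([^>]*)>)?")


def critical_health_problems(line):
    """Return the flags from a Critical Health Info line that warrant an alert."""
    m = _LINE_RE.search(line)
    if m is None:
        return []  # no Critical Health Info on this line
    value = int(m.group(1), 16)
    names = [n for n in (m.group(2) or "").split(",") if n]
    if value and not names:
        # Non-zero value without decoded names: report the raw value.
        return [hex(value)]
    # Drop the flags we were told to ignore; anything left is a problem.
    return [n for n in names if n not in IGNORED_FLAGS]
```

With this sketch, a line carrying only PERSISTENCY_RESTORED yields an empty list (no alert), while any other decoded flag is still reported.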