Bug #7441

High CPU utilization every minute due to alert.py

Added by Rob Foehl over 5 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Nice to have
Assignee: William Grzybowski
Category: Middleware
Target version:
Seen in:
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

I've noticed a marked increase in CPU utilization since upgrading to 9.3. So far I've identified the most routine culprit as the once-per-minute runs of /usr/local/www/freenasUI/tools/alert.py, which spends considerable time (~2.5 seconds on average) loading every module in the Django tree before doing comparatively little else and exiting.

Commit 5980390, which introduced this "load everything" behavior, apparently predates 9.3 by quite a while, so it's not obvious why this wasn't noticeable in 9.2.
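
One way to check where the startup time goes is to time individual imports. The snippet below is a minimal sketch, not something taken from alert.py: the helper name `time_import` is hypothetical, and `json` is only a stand-in for whichever modules the script actually pulls in.

```python
import importlib
import time


def time_import(module_name):
    """Return the seconds spent importing module_name in this process.

    A module that is already loaded comes back from the import cache,
    so only the first import in a fresh interpreter shows the real cost.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "json" is a placeholder; substitute the modules under suspicion.
    for name in ["json"]:
        print("%-20s %.3f s" % (name, time_import(name)))
```

Running this in a fresh interpreter for each module of interest gives a rough per-module breakdown of the startup cost.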

Associated revisions

Revision ca14997d (diff)
Added by William Grzybowski over 5 years ago

Turn alert into a daemon
Bootstrapping django machinery every minute simply for alerts seems to be too CPU intensive for the taste of some users. Turn alert utility into a very, very, very simple daemon running every 60 seconds.
Ticket: #7441

Revision 798b4eb7 (diff)
Added by William Grzybowski over 5 years ago

Change crontab to run alert every 30 minutes
This only ensures the daemon is running.
Ticket: #7441

Revision b297c53d (diff)
Added by William Grzybowski over 5 years ago

Add a new rc.d script to start alert on boot
Ticket: #7441

Revision c85caa67 (diff)
Added by William Grzybowski over 5 years ago

Switch some alert modules to do not run every minute
Ticket: #7441

Revision 7ee1b257 (diff)
Added by William Grzybowski over 5 years ago

Turn alert into a daemon
Bootstrapping django machinery every minute simply for alerts seems to be too CPU intensive for the taste of some users. Turn alert utility into a very, very, very simple daemon running every 60 seconds.
Ticket: #7441
(cherry picked from commit ca14997da530e49320a048ec452805aed4d24574)

Revision 2f1e5993 (diff)
Added by William Grzybowski over 5 years ago

Change crontab to run alert every 30 minutes
This only ensures the daemon is running.
Ticket: #7441
(cherry picked from commit 798b4eb7964afc84d13ac2ec4c43f41c0e693cd7)

Revision 1b6b662a (diff)
Added by William Grzybowski over 5 years ago

Add a new rc.d script to start alert on boot
Ticket: #7441
(cherry picked from commit b297c53dcb44908150cf5df3a36005f94ff738b4)

Revision aedfec89 (diff)
Added by William Grzybowski over 5 years ago

Switch some alert modules to do not run every minute
Ticket: #7441
(cherry picked from commit c85caa67ea92a3199cb7f4b4eb0b18b3d223a7cf)
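
The idea behind the "Turn alert into a daemon" revisions is to pay the startup cost once per process and then loop. It can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from those commits: `bootstrap`, `run_checks` and `serve` are hypothetical names, and subtracting the elapsed time from the sleep is a choice made here, not something the commit message specifies.

```python
import time


def bootstrap():
    """Stand-in for the expensive one-time setup (e.g. loading Django)."""
    return {"started": time.time()}


def run_checks(state):
    """Stand-in for one pass over the alert checks."""
    return []


def serve(interval=60.0, max_runs=None, sleep=time.sleep):
    """Bootstrap once, then run the checks every `interval` seconds.

    max_runs and sleep are parameters only so the loop can be exercised
    without actually waiting; a real daemon would loop forever.
    """
    state = bootstrap()  # paid once per process, not once per minute
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.monotonic()
        run_checks(state)
        runs += 1
        # Sleep for whatever remains of the interval after the checks ran.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval - elapsed))
    return runs
```

The trade-off is memory for CPU: the process keeps everything it loaded resident between runs instead of reloading it each minute.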

History

#1 Updated by Jordan Hubbard over 5 years ago

  • Category set to 53
  • Assignee set to William Grzybowski
  • Target version set to Unspecified

#2 Updated by William Grzybowski over 5 years ago

  • Status changed from Unscreened to Screened

I don't see anything actionable here. That's the price we pay for not having a middleware daemon: Django loading is expensive, but we have to do it.

We could very well implement an alert daemon, but that's quite a bit of work, and I'm not sure it's worth it given that this will be resolved in FreeNAS 10.

Comments, Jordan?

#3 Updated by Jordan Hubbard over 5 years ago

Well, I guess the question that came up for ALL of us on IRC, looking at /var/log/debug.log, is why this is done once a minute:

Jan 10 09:53:03 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs list -t snapshot -H
Jan 10 09:53:03 freenas alert.py: [middleware.notifier:226] Popen()ing: /sbin/zpool status -x freenas-boot
Jan 10 09:53:03 freenas alert.py: [middleware.notifier:226] Popen()ing: zpool list -H -o health tank
Jan 10 09:53:03 freenas alert.py: [middleware.notifier:226] Popen()ing: /sbin/zpool status -x tank
Jan 10 09:53:03 freenas alert.py: [middleware.notifier:226] Popen()ing: zpool list -H -o health tank
Jan 10 09:53:03 freenas alert.py: [middleware.notifier:226] Popen()ing: zpool get -H -o value version tank

What's it doing? Why does it need to do that every single minute, even if you have no snapshots configured to go off once a minute or, for that matter, any other task you can think of that requires 1 minute granularity? That's what had us all puzzled.

#4 Updated by William Grzybowski over 5 years ago

These are the alerts that check the health and version of the pool. One minute is the default schedule for every alert. That can be changed very easily, but it is far removed from the reported issue, which is the high CPU usage. Django needs to load every time the alert system runs, and that is the expensive part.
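
A per-alert schedule of the kind mentioned here could look roughly like the sketch below. Everything in it is hypothetical: the `AlertCheck` class, the check names and the 60-minute interval are illustrative assumptions, not the actual FreeNAS alert module API. Only the 1-minute default comes from the comment above.

```python
class AlertCheck:
    """A named check with its own interval in minutes (hypothetical)."""

    def __init__(self, name, func, interval=1):
        self.name = name
        self.func = func
        self.interval = interval  # 1 mirrors the default described above
        self.last_run = None      # minute counter of the most recent run

    def due(self, minute):
        """Return True if the check has never run or its interval elapsed."""
        return self.last_run is None or minute - self.last_run >= self.interval


def run_due(checks, minute):
    """Run every check that is due at `minute`; return the names that ran."""
    ran = []
    for check in checks:
        if check.due(minute):
            check.func()
            check.last_run = minute
            ran.append(check.name)
    return ran
```

With a layout like this, a check that rarely changes (such as a pool version check) can be given a longer interval while health checks keep the 1-minute default.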

#5 Updated by William Grzybowski over 5 years ago

  • Status changed from Screened to Fix In Progress

#6 Updated by William Grzybowski over 5 years ago

Please give the next nightly a try to validate the changes so I can merge these to stable.

#7 Updated by Rob Foehl over 5 years ago

I've updated to FreeNAS-9.3-Nightlies-201501160603 and confirmed the alertd process is running, will let it go overnight and follow up tomorrow.
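
Checking that a daemon such as alertd is still alive, which is also what the 30-minute crontab entry in the revisions is described as ensuring, can be sketched with a pidfile check. This is an assumption about one possible mechanism, not a description of how the FreeNAS crontab entry or rc.d script does it. The function names are hypothetical and the code is POSIX-only.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True


def daemon_running(pidfile):
    """Return True if `pidfile` exists and names a live process."""
    try:
        with open(pidfile) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False     # missing, unreadable or malformed pidfile
    return pid_alive(pid)
```

A periodic job would call `daemon_running()` with the daemon's pidfile path and start the daemon only when it returns False.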

#8 Updated by Rob Foehl over 5 years ago

So far, so good -- the alertd process' CPU usage is within reasonable bounds after 20 hours.

It's still not entirely clear to me why this requires loading all of Django, though... alertd is now the second largest process on the box in terms of memory footprint. Per comment #2, is this entire mechanism expected to change in FreeNAS 10?

#9 Updated by William Grzybowski over 5 years ago

Because it uses Django's database ORM.

Yes, there will be a complete rewrite in 10.

#10 Updated by William Grzybowski over 5 years ago

  • Status changed from Fix In Progress to Ready For Release

#11 Updated by Jordan Hubbard over 5 years ago

  • Status changed from Ready For Release to Resolved

#12 Updated by Kris Moore about 4 years ago

  • Target version changed from Unspecified to N/A
