Bug #27478

autorepl.py is killing CPU

Added by Karol Stilger over 2 years ago. Updated over 2 years ago.

Status: Closed: Duplicate
Priority: Important
Assignee: Sean Fagan
Category: OS
Target version:
Seen in:
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

The hardware is fairly low-end (both NASes have the same configuration: ASUS C60M1-I A50, 2x8GB RAM, 6x3TB WD RED, Intel i210, FreeNAS on a USB thumb drive), and on this low-spec CPU I noticed that something is not optimal.

Every minute (even though snapshots are set to every 4h) a set of replication scripts such as autorepl.py runs and causes significant CPU load, visible as python3.6, which consumes most of the CPU on this low-end hardware. (This problem was not seen on version 9, which ran happily for years.) In general the snapshot deltas are relatively small and replicating a snapshot takes a minute or two, so the observed load is not generated by that; there is no other CPU-intensive setup, no active jails, etc.
Does autorepl.py need to run every minute even if snapshots are set to every 4h?

    # grep "started" /var/log/debug.log | tail -5
    Dec 18 13:16:39 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started
    Dec 18 13:17:38 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started
    Dec 18 13:18:39 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started
    Dec 18 13:19:38 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started
    Dec 18 13:20:39 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started
In /etc/crontab I see:
    * * * * * root /usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1
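
A minimal sketch, in Python, of how the one-minute cadence can be confirmed from the log excerpt above. It assumes the syslog-style timestamp format shown in the report, English month abbreviations, and the year 2017 (the log lines carry no year); the helper name `start_intervals` is hypothetical and not part of FreeNAS.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of a "replication started" log line.
LOG_LINE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s.*Autosnap replication started"
)

def start_intervals(lines, year=2017):
    """Return the gaps, in seconds, between consecutive start messages.

    The year is an assumption: syslog timestamps do not include one.
    """
    stamps = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            stamps.append(
                datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

# The five lines quoted in the report.
sample = [
    "Dec 18 13:16:39 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started",
    "Dec 18 13:17:38 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started",
    "Dec 18 13:18:39 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started",
    "Dec 18 13:19:38 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started",
    "Dec 18 13:20:39 hostname /autorepl.py: [tools.autorepl:221] Autosnap replication started",
]
intervals = start_intervals(sample)
print(intervals)
```

For the quoted lines every gap is within a second of 60, which matches the every-minute crontab entry rather than the 4h snapshot schedule.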
autosnap.py (25.5 KB) autosnap.py Sean Fagan, 01/03/2018 10:33 PM

Related issues

Is duplicate of FreeNAS - Bug #25757: High cpu/disk usage since upgrade | Closed | 2017-09-03

History

#1 Updated by Karol Stilger over 2 years ago

  • File debug-azazello-20171228203628.txz added
  • Private changed from No to Yes

#2 Updated by Karol Stilger over 2 years ago

  • Seen in changed from Unspecified to TrueNAS 11.1-U1

#3 Updated by Karol Stilger over 2 years ago

  • Seen in changed from TrueNAS 11.1-U1 to 11.1

#4 Updated by Kris Moore over 2 years ago

  • Assignee changed from Release Council to Sean Fagan
  • Priority changed from No priority to Important

Over for investigation

#5 Updated by Sean Fagan over 2 years ago

  • Status changed from Unscreened to Investigation

This is a duplicate, but I'm still vacationy. I need to make sure the fix that runs autorepl.py only when there's a snapshot change (made or deleted) gets pushed, which will help a bit. It won't stop the CPU load when the replication script runs, but I don't think there's anything really to be done about that as long as it uses ssh to deal with the remote system.
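
As an illustration only, the idea of running replication only when a snapshot was made or deleted could be sketched as below. This is not the actual FreeNAS fix: the function names are hypothetical, and the snapshot sets are passed in directly rather than read from the system.

```python
def snapshot_delta(previous, current):
    """Return (created, deleted) snapshot names between two listings."""
    previous, current = set(previous), set(current)
    return current - previous, previous - current

def replicate_if_changed(previous, current, replicate):
    """Call replicate() only if a snapshot was made or deleted.

    Returns True when replication was triggered, False otherwise.
    """
    created, deleted = snapshot_delta(previous, current)
    if not created and not deleted:
        return False  # nothing changed, so skip the expensive run
    replicate()
    return True

# Hypothetical snapshot names, used only to exercise the logic.
calls = []
before = {"tank/data@auto-20171218.1200-2w"}
after_snapshot = before | {"tank/data@auto-20171218.1600-2w"}

ran_idle = replicate_if_changed(before, before, lambda: calls.append("run"))
ran_changed = replicate_if_changed(before, after_snapshot, lambda: calls.append("run"))
print(ran_idle, ran_changed, calls)
```

With a check like this, the per-minute invocation would still happen, but the replication step itself would be skipped on the minutes where the snapshot list is unchanged.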

#6 Updated by Karol Stilger over 2 years ago

Sean Fagan wrote:

This is a duplicate, but I'm still vacationy. I need to make sure the fix that runs autorepl.py only when there's a snapshot change (made or deleted) gets pushed, which will help a bit. It won't stop the CPU load when the replication script runs, but I don't think there's anything really to be done about that as long as it uses ssh to deal with the remote system.

Hi Sean, thank you; I hope this will help.
The load I consider unnecessary occurs at times when neither a snapshot nor a replication is in progress. For now I have worked around it by editing the crontab so that autorepl.py runs only at snapshot/replication times. CPU load during replication is OK, as there is real work behind it, and I know that SSH is CPU-hungry :)
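
The edited crontab entry is not shown in this ticket, so the following is only a sketch of what such a workaround might look like. It builds a system-crontab line that fires every N hours instead of every minute, assuming the command from the original /etc/crontab entry, a 4h snapshot schedule, and snapshots taken on the hour; `every_n_hours_entry` is a hypothetical helper.

```python
def every_n_hours_entry(hours, command, user="root", minute=0):
    """Build a system-crontab line that runs `command` every `hours` hours.

    Only intervals that divide 24 evenly are accepted, so the runs stay
    evenly spaced across midnight.
    """
    if not 1 <= hours <= 23 or 24 % hours != 0:
        raise ValueError("hours must divide 24 evenly")
    hour_field = "*" if hours == 1 else f"*/{hours}"
    return f"{minute} {hour_field} * * * {user} {command}"

# Command copied from the crontab line in the report; whether this is the
# entry that was actually edited is an assumption.
COMMAND = (
    "/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py"
    " > /dev/null 2>&1"
)
entry = every_n_hours_entry(4, COMMAND)
print(entry)
```

Running the script less often also delays anything else it does on each pass, so this trades responsiveness for lower idle load.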

#7 Updated by Karol Stilger over 2 years ago

  • File deleted (debug-azazello-20171228203628.txz)

#8 Updated by Sean Fagan over 2 years ago

  • Related to Bug #25757: High cpu/disk usage since upgrade added

#9 Updated by Sean Fagan over 2 years ago

  • Related to deleted (Bug #25757: High cpu/disk usage since upgrade)

#10 Updated by Sean Fagan over 2 years ago

  • Is duplicate of Bug #25757: High cpu/disk usage since upgrade added

#11 Updated by Sean Fagan over 2 years ago

  • Status changed from Investigation to Closed: Duplicate

Ah ha there it is: #25757

There's an autosnap.py attached to that that you can try.

#12 Updated by Dru Lavigne over 2 years ago

  • Target version set to N/A
  • Private changed from Yes to No

#13 Updated by Karol Stilger over 2 years ago

Sean Fagan wrote:

Ah ha there it is: #25757

There's an autosnap.py attached to that that you can try.

I would like to test it, but unfortunately I'm not able to access #25757 to download updated autosnap.py.

#14 Updated by Sean Fagan over 2 years ago

Easy enough to fix. :)

#15 Updated by Karol Stilger over 2 years ago

Sean Fagan wrote:

Easy enough to fix. :)

Thanks, it looks like it fixes autorepl.py being started for no reason. However, autosnap.py activity is still visible as a CPU peak every minute for about 30 seconds (yes, in my case, after the autorepl.py fix, python3.6 is at the top for about 30 seconds), even outside snapshot times :)

#16 Updated by Sean Fagan over 2 years ago

Yes, it is going to run and look for expired snapshots, even if it doesn't have any to create. The amount of CPU time involved will depend on the number of snapshots on the systems.
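
A rough sketch of why the cost scales with the snapshot count: an expiry scan has to examine every snapshot name on each pass. The naming scheme assumed here (`@auto-YYYYMMDD.HHMM-<N><unit>` with units h, d, w) and the function name are assumptions for illustration, not taken from autosnap.py.

```python
import re
from datetime import datetime, timedelta

# Assumed naming scheme: dataset@auto-YYYYMMDD.HHMM-<count><unit>
SNAP_NAME = re.compile(r"@auto-(\d{8}\.\d{4})-(\d+)([hdw])$")
UNIT = {
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "w": timedelta(weeks=1),
}

def expired_snapshots(names, now):
    """Return the snapshot names whose retention period has passed.

    Every name is inspected once, so the work grows linearly with the
    number of snapshots even when none of them has expired.
    """
    expired = []
    for name in names:
        match = SNAP_NAME.search(name)
        if not match:
            continue  # not an automatic snapshot under the assumed scheme
        created = datetime.strptime(match.group(1), "%Y%m%d.%H%M")
        lifetime = int(match.group(2)) * UNIT[match.group(3)]
        if created + lifetime <= now:
            expired.append(name)
    return expired

# Hypothetical snapshot names and a fixed "now" for a repeatable result.
names = [
    "tank/data@auto-20171204.1200-2w",
    "tank/data@auto-20171218.1200-2w",
    "tank/data@manual-backup",
]
now = datetime(2017, 12, 18, 13, 20)
expired = expired_snapshots(names, now)
print(expired)
```

On a real system the list of names would also have to be fetched first, which adds its own cost on every run.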
