Bug #53415

Restart smartd in background as it can take some time on systems with many disks

Added by Ryan McKenzie 9 months ago. Updated 6 months ago.

Status: Done
Priority: No priority
Assignee: William Grzybowski
Category: Middleware
Target version:
Seen in: TrueNAS - TrueNAS 11.1-U5.1
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: No
Needs Doc: No
Needs Merging: No
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

On the M40 and M50 in our lab, when two fully populated ES60 enclosures are attached and the head is fully populated (144 drives), the volume manager times out on pool creation and pool extension (adding L2ARC). The desired operation still completes in the background, but the web UI middleware call times out.

I have seen this on M series with 11.1u5 and 11.2 stable. Most recently it occurred on the M50 running 11.1u5; the stack trace is below and screenshots are attached:

Environment:

Software Version: TrueNAS-11.1-U5.1 (6ccf719c5)
Request Method: POST
Request URL: http://tn11.lab.ixsystems.com/storage/volumemanager/

Traceback:
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py" in inner
42. response = get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _legacy_get_response
249. response = self._get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
178. response = middleware_method(request, callback, callback_args, callback_kwargs)
File "./freenasUI/freeadmin/middleware.py" in process_view
162. return login_required(view_func)(request, *view_args, **view_kwargs)
File "/usr/local/lib/python3.6/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
23. return view_func(request, *args, **kwargs)
File "./freenasUI/storage/views.py" in volumemanager
148. if form.is_valid() and form.save():
File "./freenasUI/storage/forms.py" in save
334. notifier().restart("smartd")
File "./freenasUI/failover/notifier.py" in restart
72. return super(FailoverNotifier, self).restart(what, timeout=timeout, onetime=onetime)
File "./freenasUI/middleware/notifier.py" in restart
223. return c.call('service.restart', what, {'onetime': onetime}, **kwargs)
File "./freenasUI/middleware/notifier.py" in restart
223. return c.call('service.restart', what, {'onetime': onetime}, **kwargs)
File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py" in call
429. raise CallTimeout("Call timeout")

Exception Type: CallTimeout at /storage/volumemanager/
Exception Value: Call timeout
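
The failure mode in the traceback above (the caller gives up waiting while the work itself still finishes) can be illustrated with a minimal Python sketch. This is not the middlewared client code: `slow_restart`, the sleep duration, and the timeout value are all hypothetical stand-ins chosen for illustration.

```python
import concurrent.futures
import time


def demo(timeout=0.05):
    """Wait briefly on a slow task; report whether the wait timed out."""
    done = []

    def slow_restart():
        # Hypothetical stand-in for a restart that probes many disks.
        time.sleep(0.3)
        done.append("smartd")

    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_restart)
        try:
            future.result(timeout=timeout)
            timed_out = False
        except concurrent.futures.TimeoutError:
            # The caller stops waiting, but the worker keeps running.
            timed_out = True
    # Leaving the with-block waits for the worker to finish.
    return timed_out, done


timed_out, done = demo()
```

In this sketch the wait times out, yet the list is still populated once the worker finishes, which mirrors the report that the pool operation completes even though the UI shows a timeout.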

image.png (25.8 KB), Ryan McKenzie, 10/24/2018 05:13 AM
image (1).png (31.4 KB), Ryan McKenzie, 10/24/2018 05:13 AM

Associated revisions

Revision ad4f2e47 (diff)
Added by William Grzybowski 9 months ago

fix(gui): restart smartd on background

Ticket: #53415
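
The general idea named in the commit subject, starting the restart without making the request wait for it, might look roughly like the sketch below. It is an assumption-laden illustration and not the content of revision ad4f2e47: `slow_restart`, `restart_in_background`, and `save_form` are hypothetical names, and the sleep stands in for the time smartd needs on a system with many disks.

```python
import threading
import time


def slow_restart(service, log):
    """Hypothetical blocking restart; the sleep simulates a slow service."""
    time.sleep(0.5)
    log.append(service)


def restart_in_background(service, log):
    """Start the restart on a worker thread and return immediately."""
    worker = threading.Thread(
        target=slow_restart, args=(service, log), daemon=True
    )
    worker.start()
    return worker


def save_form(log):
    """Hypothetical form handler that does not wait for the restart."""
    started = time.monotonic()
    worker = restart_in_background("smartd", log)
    elapsed = time.monotonic() - started
    return worker, elapsed


log = []
worker, elapsed = save_form(log)
worker.join()  # only so the example can show that the restart completed
```

The trade-off of this design is that the handler can no longer report a restart failure directly to the user; any error has to be surfaced some other way, such as a log entry or alert.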

History

#1 Updated by Ryan McKenzie 9 months ago

"On the M40 and M50 in our lab..."

Ryan McKenzie wrote:

On the M40 and M40 in our lab, when two fully populated ES60 enclosures and the head is fully populated (144 drives)

#2 Updated by Dru Lavigne 9 months ago

  • Category changed from GUI to Middleware
  • Assignee changed from Release Council to William Grzybowski

#4 Updated by Dru Lavigne 9 months ago

  • Seen in changed from TrueNAS 11.1-U5 to TrueNAS 11.1-U5.1

#5 Updated by Bug Clerk 9 months ago

  • Status changed from Unscreened to In Progress

#6 Updated by Bug Clerk 9 months ago

  • Status changed from In Progress to Ready for Testing

11.1-stable PR: https://github.com/freenas/freenas/pull/1968
This should not happen on 11.2+, which does the scanning in parallel.
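
As a rough illustration of why parallel scanning helps on large systems, the sketch below probes a list of disks with a thread pool. It does not reflect how 11.2 actually implements scanning: `probe_disk`, the worker count, and the per-disk delay are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def probe_disk(name):
    """Hypothetical per-disk probe; the sleep stands in for device I/O."""
    time.sleep(0.05)
    return name, "ok"


def scan_parallel(disks, workers=16):
    """Probe disks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(probe_disk, disks))


disks = [f"da{i}" for i in range(32)]
started = time.monotonic()
results = scan_parallel(disks)
elapsed = time.monotonic() - started
```

With 16 workers, the 32 simulated probes run in about two batches rather than 32 sequential waits, so the total time grows much more slowly as the disk count rises.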

#7 Updated by Bug Clerk 9 months ago

  • Target version changed from Backlog to TrueNAS 11.1-U6.2

#8 Updated by Dru Lavigne 9 months ago

  • Project changed from TrueNAS to FreeNAS
  • Subject changed from Volume Manager Middleware Timeout on Large Systems to Restart smartd in background as it can take some time on systems with many disks
  • Category changed from Middleware to Middleware
  • Needs Doc changed from Yes to No
  • Needs Merging changed from Yes to No
  • Migration Needed deleted (No)
  • Hide from ChangeLog deleted (No)
  • Support Department Priority deleted (0)

#9 Updated by Dru Lavigne 8 months ago

  • Target version changed from TrueNAS 11.1-U6.2 to 11.1-U7

#11 Updated by Ryan McKenzie 6 months ago

Testing in progress. Getting enough drives on one system had to be worked into the existing performance test sequences. Sorry for the delay.

#12 Updated by Ryan McKenzie 6 months ago

  • Status changed from Ready for Testing to Passed Testing

Tested on M40-HA running TrueNAS 11.1 u7 INTERNAL 5

Steps as per Caleb's instructions:

1) Created SMART test task with all 142 SAS drives
2) Immediately edited the task and removed da0
3) Deleted the task soon afterwards

No middleware timeouts or UI tracebacks observed.

#13 Updated by Bonnie Follweiler 6 months ago

  • Needs QA changed from Yes to No

#14 Updated by Dru Lavigne 6 months ago

  • Status changed from Passed Testing to Done
