Project

General

Profile

Bug #43415

Fix traceback found during load testing

Added by Waqar Ahmed about 2 years ago. Updated about 2 years ago.

Status:
Done
Priority:
No priority
Assignee:
William Grzybowski
Category:
Middleware
Target version:
Severity:
Medium
Reason for Closing:
Reason for Blocked:
Needs QA:
No
Needs Doc:
No
Needs Merging:
No
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:
ChangeLog Required:
No

Description

While investigating another ticket, I went into this traceback which caused jobs to fail. Probably not only jobs are affected. Reproduction scenario is that I created around 20 jails on FreeNAS 28-Aug-2018 Nightlies and as the jails were created via client calls in python, the were created in a loop. Following is the relevant traceback

[2018/08/29 09:52:19] (WARNING) application.call_method():182 - Exception while calling core.job_update(*[772, {'progress': {'percent': 20, 'description': 'Initial validation complete', 'extra': None}}])
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 155, in call_method
    result = await self.middleware.call_method(self, message)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1023, in call_method
    return await self._call(message['method'], serviceobj, methodobj, params, app=app, io_thread=False)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 977, in _call
    return await run_method(methodobj, *args)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 915, in run_in_thread
    return await self.run_in_executor(self.__threadpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 912, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/schema.py", line 668, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/service.py", line 322, in job_update
    extra=progress.get('extra'),
  File "/usr/local/lib/python3.6/site-packages/middlewared/job.py", line 289, in set_progress
    self.middleware.send_event('core.get_jobs', 'CHANGED', id=self.id, fields=self.__encode__())
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1066, in send_event
    for sessionid, wsclient in self.__wsclients.items():
RuntimeError: dictionary changed size during iteration

Risk
Low risk, since its a bug fix to a race condition

Acceptance Criteria
This is a race condition, testing scenario is complex, thus I will say QA testing is not required, breakage in this code path would see side effects across the board related to long running tasks in UI.

Associated revisions

Revision cb58dde6 (diff)
Added by William Grzybowski about 2 years ago

fix(middlewared): race condition on send_event send_event may traceback if its running between websocket open/close connections. Ticket: #43415

Revision a7bd00ef (diff)
Added by William Grzybowski about 2 years ago

fix(middlewared): race condition on send_event send_event may traceback if its running between websocket open/close connections. Ticket: #43415

History

#1 Updated by William Grzybowski about 2 years ago

  • Description updated (diff)
  • Status changed from Unscreened to In Progress
  • Severity changed from New to Medium
  • Needs QA changed from Yes to No

#2 Updated by Dru Lavigne about 2 years ago

  • Target version changed from Backlog to 11.2-BETA3

#3 Updated by Michael Reynolds about 2 years ago

  • File 43415-debug.tgz added
  • Private changed from No to Yes

I ran into this on a mini running the 08-28-18 nightly
Used the built-in auto-updater
Updates looked like they downloaded and then go an error dialog with the runtime message "dictionary changed size during iteration"

The traceback I found was very similar
[2018/08/29 13:53:21] (WARNING) application.call_method():182 - Exception while calling core.job_update(*[1100, {'progress': {'percent': 90.0, 'description': 'Installing freenas-pkg-tools (4/5)', 'extra': None}}])
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 155, in call_method
result = await self.middleware.call_method(self, message)
File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
return self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1023, in call_method
return await self._call(message['method'], serviceobj, methodobj, params, app=app, io_thread=False)
File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
return self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 977, in _call
return await run_method(methodobj, *args)
File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
return self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 915, in run_in_thread
return await self.run_in_executor(self.__threadpool, method, *args, **kwargs)
File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
return self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 912, in run_in_executor
return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/site-packages/middlewared/schema.py", line 668, in nf
return f(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/middlewared/service.py", line 322, in job_update
extra=progress.get('extra'),
File "/usr/local/lib/python3.6/site-packages/middlewared/job.py", line 289, in set_progress
self.middleware.send_event('core.get_jobs', 'CHANGED', id=self.id, fields=self.__encode__())
File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1066, in send_event
for sessionid, wsclient in self.__wsclients.items():
RuntimeError: dictionary changed size during iteration

Attached a debug generated right after the error was dismissed.

#4 Updated by Bug Clerk about 2 years ago

  • Status changed from In Progress to Ready for Testing

#5 Updated by William Grzybowski about 2 years ago

  • Status changed from Ready for Testing to Done

#6 Updated by Dru Lavigne about 2 years ago

  • File deleted (43415-debug.tgz)

#7 Updated by Dru Lavigne about 2 years ago

  • Private changed from Yes to No

#8 Updated by Dru Lavigne about 2 years ago

  • Subject changed from Traceback during load testing to Fix traceback found during load testing
  • Needs Doc changed from Yes to No
  • Needs Merging changed from Yes to No

Also available in: Atom PDF