Project

General

Profile

Bug #43119

Fix potential race condition when creating jails

Added by Erin Clark about 2 years ago. Updated about 2 years ago.

Status:
Done
Priority:
No priority
Assignee:
Waqar Ahmed
Category:
Middleware
Target version:
Severity:
Medium
Reason for Closing:
Reason for Blocked:
Needs QA:
No
Needs Doc:
No
Needs Merging:
No
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:
ChangeLog Required:
No

Description

Part of this original ticket was a UI thing which I am fixing in the original ticket, however there seem to be some middleware issues with jail create and jail.query so I am copying this for the middleware team so I can fix the UI part in its own ticket. The behavior I am seeing is that the jail.create will succeed and then go to the table where jail.query happens and returns [], which is what makes the jails look like they disappeared.

Creating a jail using the Basic Wizard can cause all jails to disappear.
Navigating away and back they re-appear.

I initially thought using a hyphen in the Jail Name was the issue.
I was able to cause the disappearing jail issue to happen without a hyphen in the name as well.

I am unable to create a new jail using the Advanced Jail Wizard. (With or without a hyphen.)

Clicking Save gives the Please Wait spinner, the Jail isn't created and "list index out of range" shows up in the top left corner.

See Jail_list-index-out-of-range_2018-08-24.mkv


Related issues

Related to FreeNAS - Bug #43328: Have jail wizard wait until SUCCESS before leaving pageDone
Has duplicate FreeNAS - Bug #43245: After creating a jail, jail list shows "No data to display"Closed
Copied from FreeNAS - Bug #42899: Fix "list index out of range" error when using Advanced Jail Creation in the New UIDone

History

#1 Updated by Erin Clark about 2 years ago

  • Copied from Bug #42899: Fix "list index out of range" error when using Advanced Jail Creation in the New UI added

#2 Updated by Erin Clark about 2 years ago

  • Assignee changed from Erin Clark to William Grzybowski

A little investigation shows me this, I create a jail with the following properties:

props: (63) […]
​​
0: "dhcp=on" 
​​1: "vnet=on" 
​​2: "bpf=yes" 
​​3: "defaultrouter6=none" 
​​4: "notes=none" 
​​5: "boot=off" 
​​6: "devfs_ruleset=4" 
​​7: "exec_start=/bin/sh /etc/rc" 
​​8: "exec_stop=/bin/sh /etc/rc.shutdown" 
​​9: "exec_prestart=/usr/bin/true" 
​​10: "exec_poststart=/usr/bin/true" 
​​11: "exec_prestop=/usr/bin/true" 
​​12: "exec_poststop=/usr/bin/true" 
​​13: "exec_clean=1" 
​​14: "exec_timeout=60" 
​​15: "stop_timeout=30" 
​​16: "exec_jail_user=root" 
​​17: "exec_system_jail_user=0" 
​​18: "exec_system_user=root" 
​​19: "mount_devfs=1" 
​​20: "mount_fdescfs=1" 
​​21: "enforce_statfs=2" 
​​22: "children_max=0" 
​​23: "login_flags=-f root" 
​​24: "securelevel=2" 
​​25: "sysvmsg=new" 
​​26: "sysvsem=new" 
​​27: "sysvshm=new" 
​​28: "allow_set_hostname=1" 
​​29: "allow_sysvipc=0" 
​​30: "allow_raw_sockets=0" 
​​31: "allow_chflags=0" 
​​32: "allow_mount=0" 
​​33: "allow_mount_devfs=0" 
​​34: "allow_mount_nullfs=0" 
​​35: "allow_mount_procfs=0" 
​​36: "allow_mount_tmpfs=0" 
​​37: "allow_mount_zfs=0" 
​​38: "allow_quotas=0" 
​​39: "allow_socket_af=0" 
​​40: "interfaces=vnet0:bridge0" 
​​41: "host_domainname=none" 
​​42: "exec_fib=0" 
​​43: "ip4_saddrsel=1" 
​​44: "ip4=new" 
​​45: "ip6_saddrsel=1" 
​​46: "ip6=new" 
​​47: "resolver=/etc/resolv.conf" 
​​48: "mac_prefix=02ff60" 
​​49: "vnet0_mac=none" 
​​50: "vnet1_mac=none" 
​​51: "vnet2_mac=none" 
​​52: "vnet3_mac=none" 
​​53: "owner=root" 
​​54: "priority=99" 
​​55: "hostid=B441725B-A806-E111-892C-505054503030" 
​​56: "comment=none" 
​​57: "depends=none" 
​​58: "mount_procfs=0" 
​​59: "mount_linprocfs=0" 
​​60: "host_time=yes" 
​​61: "jail_zfs=off" 
​​62: "jail_zfs_mountpoint=none" 
​​
release: "11.2-RELEASE" 
​uuid: "jailisoisissefsef" 

I get the following traceback in the return from the middleware even though it seems to succeed:

"[Errno 2] No such file or directory: '/mnt/iocage/jails/jailisoisissefsef/root/etc/hosts'", exception: "concurrent.futures.process._RemoteTraceback: \n\"\"\"\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/concurrent/futures/process.py\", line 175, in _process_worker\n    r = call_item.fn(*call_item.args, **call_item.kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/worker.py\", line 122, in main_worker\n    res = loop.run_until_complete(coro)\n  File \"/usr/local/lib/python3.6/asyncio/base_events.py\", line 468, in run_until_complete\n    return future.result()\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/worker.py\", line 82, in _run\n    return await self._call(f'{service_name}.{method}', serviceobj, methodobj, params=args, job=job)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/worker.py\", line 75, in _call\n    return methodobj(*params)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/worker.py\", line 75, in _call\n    return methodobj(*params)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/plugins/jail.py\", line 172, in create_job\n    empty=empty\n  File \"/usr/local/lib/python3.6/site-packages/iocage_lib/iocage.py\", line 611, in create\n    clone=clone,\n  File \"/usr/local/lib/python3.6/site-packages/iocage_lib/ioc_create.py\", line 88, in create_jail\n    return self._create_jail(jail_uuid, location)\n  File \"/usr/local/lib/python3.6/site-packages/iocage_lib/ioc_create.py\", line 398, in _create_jail\n    with open(f\"{location}/root/etc/hosts\", \"r\") as _etc_hosts:\nFileNotFoundError: [Errno 2] No such file or directory: '/mnt/iocage/jails/jailisoisissefsef/root/etc/hosts'\n\"\"\"\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/job.py\", line 332, in run\n    await self.future\n  File \"/usr/local/lib/python3.6/asyncio/coroutines.py\", line 129, in throw\n    return self.gen.throw(type, value, traceback)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/job.py\", line 356, in __run_body\n    rv = await self.middleware._call_worker(self.serviceobj, self.method_name, *self.args, job={'id': self.id})\n  File \"/usr/local/lib/python3.6/asyncio/coroutines.py\", line 129, in throw\n    return self.gen.throw(type, value, traceback)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/main.py\", line 987, in _call_worker\n    job,\n  File \"/usr/local/lib/python3.6/asyncio/coroutines.py\", line 129, in throw\n    return self.gen.throw(type, value, traceback)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/main.py\", line 918, in run_in_proc\n    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)\n  File \"/usr/local/lib/python3.6/asyncio/coroutines.py\", line 129, in throw\n    return self.gen.throw(type, value, traceback)\n  File \"/usr/local/lib/python3.6/site-packages/middlewared/main.py\", line 912, in run_in_executor\n    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))\nFileNotFoundError: [Errno 2] No such file or directory: '/mnt/iocage/jails/jailisoisissefsef/root/etc/hosts'\n", state: "FAILED" 

In middlewared.log I see this:

[2018/08/27 22:16:11] (WARNING) application.call_method():182 - Exception while calling cache.get(*['iocage_remote_releases'])
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 155, in call_method
    result = await self.middleware.call_method(self, message)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1023, in call_method
    return await self._call(message['method'], serviceobj, methodobj, params, app=app, io_thread=False)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 977, in _call
    return await run_method(methodobj, *args)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 915, in run_in_thread
    return await self.run_in_executor(self.__threadpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 912, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/schema.py", line 668, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/cache.py", line 33, in get
    if self.__cache[key].timeout > 0:
KeyError: 'iocage_remote_releases'
[2018/08/27 22:16:11] (DEBUG) middlewared.logger.CrashReporting.report():102 - Sending a crash report...
[2018/08/27 22:16:11] (DEBUG) raven.base.Client.send_remote():716 - Sending message of length 4550 to https://sentry.ixsystems.com/api/2/store/
[2018/08/27 22:16:30] (WARNING) middlewared.devd_listen():93 - Failed to parse devd message: !system=ZFS subsystem=ZFS type=misc.fs.zfs.history_event  pool_name=poolio pool_guid=1560407986226402443 history_hostname=freenas.ad01.tn.ixsystems.com history_dsname=poolio/iocage/jails/fawefwe/root history_internal_str=origin=poolio/iocage/releases/11.2-RELEASE/root@fawefwe (338) history_internal_name=clone history_dsid=349 history_txg=695299 history_time=1535408190

[2018/08/27 22:57:43] (WARNING) middlewared.devd_listen():93 - Failed to parse devd message: !system=ZFS subsystem=ZFS type=misc.fs.zfs.history_event  pool_name=poolio pool_guid=1560407986226402443 history_hostname=freenas.ad01.tn.ixsystems.com history_dsname=poolio/iocage/jails/laskfjesaf/root history_internal_str=origin=poolio/iocage/releases/11.2-RELEASE/root@laskfjesaf (374) history_internal_name=clone history_dsid=385 history_txg=695797 history_time=1535410663

[2018/08/27 23:19:56] (WARNING) middlewared.devd_listen():93 - Failed to parse devd message: !system=ZFS subsystem=ZFS type=misc.fs.zfs.history_event  pool_name=poolio pool_guid=1560407986226402443 history_hostname=freenas.ad01.tn.ixsystems.com history_dsname=poolio/iocage/jails/gulag/root history_internal_str=origin=poolio/iocage/releases/11.2-RELEASE/root@gulag (408) history_internal_name=clone history_dsid=419 history_txg=696087 history_time=1535411996

[2018/08/27 23:34:25] (WARNING) middlewared.devd_listen():93 - Failed to parse devd message: !system=ZFS subsystem=ZFS type=misc.fs.zfs.history_event  pool_name=poolio pool_guid=1560407986226402443 history_hostname=freenas.ad01.tn.ixsystems.com history_dsname=poolio/iocage/jails/jailoisois/root history_internal_str=origin=poolio/iocage/releases/11.2-RELEASE/root@jailoisois (447) history_internal_name=clone history_dsid=458 history_txg=696275 history_time=1535412865

#3 Updated by Dru Lavigne about 2 years ago

  • Category changed from GUI (new) to Middleware

#4 Updated by Waqar Ahmed about 2 years ago

  • Status changed from Unscreened to In Progress
  • Assignee changed from William Grzybowski to Waqar Ahmed

After discussion with Erin, two issues have been identified in this ticket
1) List index out of range error
2) Exception on jail.create

Issue no 1 was resolved and has been fixed. Issue no 2 could not be reproduced on my machine. I tried the same payload as mentioned in the ticket

root@freenas[~]# midclt call jail.create '{"uuid": "hello", "release": "11.2-RELEASE", "props":["dhcp=on", "vnet=on", "bpf=yes", "defaultrouter6=none", "notes=none", "boot=off", "devfs_ruleset=4", "exec_start=/bin/sh /etc/rc", "exec_stop=/bin/sh /etc/rc.shutdown", "exec_prestart=/usr/bin/true", "exec_poststart=/usr/bin/true", "exec_prestop=/usr/bin/true", "exec_poststop=/usr/bin/true", "exec_clean=1", "exec_timeout=60", "stop_timeout=30", "exec_jail_user=root", "exec_system_jail_user=0", "exec_system_user=root", "mount_devfs=1", "mount_fdescfs=1", "enforce_statfs=2", "children_max=0", "login_flags=-f root", "securelevel=2", "sysvmsg=new", "sysvsem=new", "sysvshm=new", "allow_set_hostname=1", "allow_sysvipc=0", "allow_raw_sockets=0", "allow_chflags=0", "allow_mount=0", "allow_mount_devfs=0", "allow_mount_nullfs=0", "allow_mount_procfs=0", "allow_mount_tmpfs=0", "allow_mount_zfs=0", "allow_quotas=0", "allow_socket_af=0", "interfaces=vnet0:bridge0", "host_domainname=none", "exec_fib=0", "ip4_saddrsel=1", "ip4=new", "ip6_saddrsel=1", "ip6=new", "resolver=/etc/resolv.conf", "mac_prefix=02ff60", "vnet0_mac=none", "vnet1_mac=none", "vnet2_mac=none", "vnet3_mac=none", "owner=root", "priority=99", "hostid=B441725B-A806-E111-892C-505054503030", "comment=none", "depends=none", "mount_procfs=0", "mount_linprocfs=0", "host_time=yes", "jail_zfs=off", "jail_zfs_mountpoint=none"]}'

And then querying for the create job gives me

root@freenas[~]# midclt call core.get_jobs '[["id", "=", 38]]'
[{"id": 38, "method": "jail.create_job", "arguments": [{"uuid": "hello", "release": "11.2-RELEASE", "props": ["dhcp=on", "vnet=on", "bpf=yes", "defaultrouter6=none", "notes=none", "boot=off", "devfs_ruleset=4", "exec_start=/bin/sh /etc/rc", "exec_stop=/bin/sh /etc/rc.shutdown", "exec_prestart=/usr/bin/true", "exec_poststart=/usr/bin/true", "exec_prestop=/usr/bin/true", "exec_poststop=/usr/bin/true", "exec_clean=1", "exec_timeout=60", "stop_timeout=30", "exec_jail_user=root", "exec_system_jail_user=0", "exec_system_user=root", "mount_devfs=1", "mount_fdescfs=1", "enforce_statfs=2", "children_max=0", "login_flags=-f root", "securelevel=2", "sysvmsg=new", "sysvsem=new", "sysvshm=new", "allow_set_hostname=1", "allow_sysvipc=0", "allow_raw_sockets=0", "allow_chflags=0", "allow_mount=0", "allow_mount_devfs=0", "allow_mount_nullfs=0", "allow_mount_procfs=0", "allow_mount_tmpfs=0", "allow_mount_zfs=0", "allow_quotas=0", "allow_socket_af=0", "interfaces=vnet0:bridge0", "host_domainname=none", "exec_fib=0", "ip4_saddrsel=1", "ip4=new", "ip6_saddrsel=1", "ip6=new", "resolver=/etc/resolv.conf", "mac_prefix=02ff60", "vnet0_mac=none", "vnet1_mac=none", "vnet2_mac=none", "vnet3_mac=none", "owner=root", "priority=99", "hostid=B441725B-A806-E111-892C-505054503030", "comment=none", "depends=none", "mount_procfs=0", "mount_linprocfs=0", "host_time=yes", "jail_zfs=off", "jail_zfs_mountpoint=none"], "basejail": false, "empty": false, "short": false}], "logs_path": null, "logs_excerpt": null, "progress": {"percent": 20, "description": "Initial validation complete", "extra": null}, "result": true, "error": null, "exception": null, "state": "SUCCESS", "time_started": {"$date": 1535395565970}, "time_finished": {"$date": 1535395566480}}]

Now I am going to try and see if more information can be found for this issue on Erin's box. Will update the ticket accordingly

#5 Updated by Michael Reynolds about 2 years ago

  • File 43119-debug.tgz added

#7 Updated by Michael Reynolds about 2 years ago

  • Subject changed from Jails disappear after creating a jail to Jails disappear after creating a jail in the New UI

I was able to reproduce this bug using the Advanced Jail Creation Wizard
Previously I was only able to reproduce using the Basic Jail Creation Wizard
I didn't enable DHCP and unset BPF

Left all other defaults

Clicked Save and after the jail is created no jails are listed.
After leaving the Jails section and returning, the jails are listed again.

Tried to reproduce with the same steps did not trigger the bug...

#8 Updated by Erin Clark about 2 years ago

  • Has duplicate Bug #43245: After creating a jail, jail list shows "No data to display" added

#9 Updated by Waqar Ahmed about 2 years ago

  • Subject changed from Jails disappear after creating a jail in the New UI to Iocage not able to read hosts file on jail creation

This ticket has been divided in two parts.
1) New UI should wait for jail creation to complete
2) A scenario where iocage is unable to create/read /etc/hosts file and throws an exception

For part 1, a ticket has been made for new UI and it will be handled there - ticket no: #43328
This ticket is going to track part 2 now. For now we don't have a valid reproduction case of the issue. The issue is probably being caused by some race condition. All options will be looked at and the ticket will be updated accordingly as to what can be done to address the issue

#10 Updated by Waqar Ahmed about 2 years ago

Traceback from Erin's machine

```
Error: concurrent.futures.process._RemoteTraceback: 
""" 
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/worker.py", line 122, in main_worker
    res = loop.run_until_complete(coro)
  File "/usr/local/lib/python3.6/asyncio/base_events.py", line 468, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.6/site-packages/middlewared/worker.py", line 82, in _run
    return await self._call(f'{service_name}.{method}', serviceobj, methodobj, params=args, job=job)
  File "/usr/local/lib/python3.6/site-packages/middlewared/worker.py", line 75, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.6/site-packages/middlewared/worker.py", line 75, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/jail.py", line 172, in create_job
    empty=empty
  File "/usr/local/lib/python3.6/site-packages/iocage_lib/iocage.py", line 611, in create
    clone=clone,
  File "/usr/local/lib/python3.6/site-packages/iocage_lib/ioc_create.py", line 88, in create_jail
    return self._create_jail(jail_uuid, location)
  File "/usr/local/lib/python3.6/site-packages/iocage_lib/ioc_create.py", line 398, in _create_jail
    with open(f"{location}/root/etc/hosts", "r") as _etc_hosts:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/iocage/jails/jailyfoo/root/etc/hosts'
""" 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/job.py", line 332, in run
    await self.future
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/job.py", line 356, in __run_body
    rv = await self.middleware._call_worker(self.serviceobj, self.method_name, *self.args, job={'id': self.id})
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 987, in _call_worker
    job,
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 918, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 912, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/iocage/jails/jailyfoo/root/etc/hosts'

#11 Updated by Dru Lavigne about 2 years ago

  • Related to Bug #43328: Have jail wizard wait until SUCCESS before leaving page added

#12 Updated by Waqar Ahmed about 2 years ago

  • Status changed from In Progress to Ready for Testing

PR - https://github.com/freenas/iocage/pull/31

The original issue couldn't be reproduced however code has been updated to handle this case if such an event occurs again, so for testing this ticket it should be ensured that jail creation is working as expected.

#13 Updated by Dru Lavigne about 2 years ago

  • File deleted (debug.tgz)

#14 Updated by Dru Lavigne about 2 years ago

  • File deleted (43119-debug.tgz)

#15 Updated by Dru Lavigne about 2 years ago

  • Subject changed from Iocage not able to read hosts file on jail creation to Fix potential race condition when creating jails
  • Private changed from Yes to No
  • Needs Merging changed from Yes to No

#16 Updated by Bonnie Follweiler about 2 years ago

  • Status changed from Ready for Testing to Passed Testing
  • Needs QA changed from Yes to No

Test Passed in FreeNAS-11.2-MASTER-201809250704
Successfully created a jail using the basic jail wizard

#17 Updated by Dru Lavigne about 2 years ago

  • Status changed from Passed Testing to Done

Also available in: Atom PDF