Bug #74695

Active HA controller shutdown/reboot handling

Added by Alexander Motin 6 months ago. Updated 4 months ago.

Status: Closed
Priority: No priority
Assignee: William Grzybowski
Category: Hardware
Target version:
Seen in:
Severity: Low Medium
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:
ChangeLog Required: No

Description

As a follow-up to #32043, shutdown/reboot of the active HA node should first trigger as graceful a failover as possible, and only then shut down or reboot. I may be wrong, but I suspect this is not the case now, so we may wait indefinitely for arbitrary services to stop while still-connected clients potentially suffer. Let's discuss it.


Related issues

Related to FreeNAS - Bug #32043: Fix stuck process on TrueNAS shutdown (Closed)

History

#1 Updated by Alexander Motin 6 months ago

  • Related to Bug #32043: Fix stuck process on TrueNAS shutdown added

#2 Updated by William Grzybowski 6 months ago

What does graceful failover mean?

#3 Updated by Alexander Motin 6 months ago

I mean that on shutdown/reboot of the active node we should do the same as we do on losing an interface marked as critical for failover: demoting CARP, closing firewalls, stopping services and exporting data pools. It should not be just a regular OS shutdown, where the passive side notices something only after the active one is finally gone and only then starts an election. Please correct me if we already have something like that.
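
The ordering described above can be sketched in sh as follows. This is only an illustration, not the existing TrueNAS implementation: the four `*_placeholder` command names, the `run_step`, `graceful_failover` and `shutdown_active_node` helpers, and the `DRY_RUN` variable are all hypothetical. Whether shutdown should still proceed after a failed failover step is an assumption made here, not something settled in this ticket.

```sh
#!/bin/sh
# Sketch only: every *_placeholder name below is hypothetical and stands in
# for whatever really demotes CARP, closes firewalls, and so on.
# With DRY_RUN=1 (the default) steps are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    # $1 = human-readable step name, remaining arguments = command to run
    name="$1"; shift
    echo "step: $name"
    if [ "$DRY_RUN" = "1" ]; then
        echo "  would run: $*"
        return 0
    fi
    "$@"
}

graceful_failover() {
    # Order taken from the comment above; stop at the first failing step.
    run_step "demote carp"     demote_carp_placeholder     || return 1
    run_step "close firewalls" close_firewalls_placeholder || return 1
    run_step "stop services"   stop_services_placeholder   || return 1
    run_step "export pools"    export_pools_placeholder    || return 1
}

shutdown_active_node() {
    # $1 = "shutdown" or "reboot"; the OS action itself is only reported here.
    if graceful_failover; then
        echo "failover complete, proceeding with $1"
    else
        # Assumption: a failed failover should not block the shutdown itself.
        echo "failover step failed, proceeding with $1 anyway"
    fi
}
```

With the default `DRY_RUN=1`, calling `shutdown_active_node reboot` only prints the four steps in order and then the final message, so the sequence can be reviewed without touching the system.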

#4 Updated by William Grzybowski 6 months ago

Alexander Motin wrote:

I mean that on shutdown/reboot of the active node we should do the same as we do on losing an interface marked as critical for failover: demoting CARP, closing firewalls, stopping services and exporting data pools. It should not be just a regular OS shutdown, where the passive side notices something only after the active one is finally gone and only then starts an election. Please correct me if we already have something like that.

We have a patched rc.shutdown which uses the carp script:

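# Only on systems where /data/license exists.
if [ -f /data/license ]; then
        # Send SIGKILL to any process whose command line matches "fenced".
        /bin/pkill -9 -f fenced
        # Bring every network interface down, one ifconfig call per interface.
        /sbin/ifconfig -l | /usr/bin/xargs -n 1 -J % /sbin/ifconfig % down
        # Run the CARP hook only if /tmp/failover.json exists.
        if [ -f /tmp/failover.json ]; then
                sleep 1
                /usr/local/bin/python /usr/local/libexec/truenas/carp-state-change-hook.py carp0 shutdown
                sleep 4
        fi
fi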
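
The description raises the concern that shutdown may wait indefinitely for services. One way to bound such a wait is sketched below in plain sh. `run_with_timeout` is a hypothetical helper, not part of rc.shutdown. The exit code 124 on timeout follows the convention of the `timeout` utility and is an assumption chosen for this example.

```sh
#!/bin/sh
# Sketch only: run_with_timeout is a hypothetical helper, not TrueNAS code.
# Runs a command in the background and kills it if it exceeds the limit.
run_with_timeout() {
    # $1 = limit in seconds, remaining arguments = command to run
    limit="$1"; shift
    "$@" &
    pid=$!
    elapsed=0
    # Poll once per second while the child is still alive.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            kill -9 "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
            return 124   # assumed convention: 124 means "timed out"
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Child finished on its own; hand back its exit status.
    wait "$pid"
}
```

For example, `run_with_timeout 10 some_step` would give a hypothetical `some_step` command at most about ten seconds before it is killed, so a stuck step could not hold up the rest of the shutdown.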

#5 Updated by Jaron Parsons 4 months ago

  • Status changed from Unscreened to Closed
