Bug #33453

Fix unnecessary AD restarts caused by enabling service monitor

Added by Andrew Walker 12 months ago. Updated 8 months ago.

Status: Done
Priority: No priority
Assignee: Andrew Walker
Category: OS
Target version:
Seen in:
Severity: High
Reason for Closing:
Reason for Blocked:
Needs QA: No
Needs Doc: No
Needs Merging: No
Needs Automation: No
Support Suite Ticket: KCW-603-27897
Hardware Configuration:
ChangeLog Required: No

Description

This issue affects multiple customers. The AD monitoring feature can stop or restart Active Directory unnecessarily. This is usually caused by DNS misconfiguration or by domain controllers being temporarily taken down for maintenance. Not every problem with AD monitoring is obviously connected to DNS, however. I have one ticket in which "service ix-kinit status" is failing. At this point it is unclear whether that failure is caused by a change in the customer's environment or by a bug in AD monitoring.

Regardless, this leads to service outages that might not have happened if monitoring were disabled.

I think we need to either:
(1) Improve the tests we're performing for "connected" and "started". For example:
--(a) Use DNS SRV records to identify a list of domain controllers for the domain, then try to connect to all of them and only fail if they all fail.
--(b) Review the tests that we're using to determine whether AD is "started". Do we really need to perform "service ix-kinit status" and "service ix-activedirectory status"? Perhaps "wbinfo -p" and "wbinfo -t" are sufficient in this case. Is a 1 second sleep between tests too short?

or

(2) Validate the state of the domain prior to enabling monitoring. For instance:
--(a) Don't allow it to turn on if DNS is obviously misconfigured.

or

(3) Introduce an easy way to temporarily disable AD monitoring without restarting the AD service, so that administrators can take steps to ensure they don't experience an outage while performing maintenance on DCs.
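
Option (1)(a) above could be sketched roughly as follows. This is a minimal illustration, not the code from the PRs linked in this ticket: `tcp_probe` and `domain_reachable` are hypothetical names, and the list of controllers is assumed to have been obtained already from a DNS SRV lookup (for AD, typically `_ldap._tcp.dc._msdcs.<domain>`), which is not shown here.

```python
import socket
from typing import Callable, Iterable, Tuple

# (hostname, port) pair; assumed to come from an SRV lookup done elsewhere.
Controller = Tuple[str, int]


def tcp_probe(controller: Controller, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the controller succeeds."""
    host, port = controller
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def domain_reachable(
    controllers: Iterable[Controller],
    probe: Callable[[Controller], bool] = tcp_probe,
) -> bool:
    """Report failure only if every known domain controller fails the probe."""
    # any() stops at the first controller that answers.
    return any(probe(dc) for dc in controllers)
```

With this shape, a single domain controller going down for maintenance does not count as a failure as long as another one answers. An empty controller list is treated as unreachable, which may itself point to a DNS misconfiguration.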


Related issues

Related to FreeNAS - Bug #35059: Active Directory binding continuously resets when monitoring is enabled (Closed)
Related to FreeNAS - Bug #30678: Active Directory Service Failing II (Closed)
Related to FreeNAS - Bug #34813: AD Dropout (Closed)
Related to FreeNAS - Bug #32094: SMB Service restarts (Closed)
Related to FreeNAS - Bug #36823: Fix unneeded stops and restarts in AD monitoring (Done)
Related to FreeNAS - Bug #41164: Domain Controller Not Found for domain (Closed)
Has duplicate FreeNAS - Bug #28561: Reconnecting Active Directory kills Samba connections (Closed)

History

#1 Updated by Dru Lavigne 11 months ago

  • Category changed from OS to Middleware
  • Assignee changed from Release Council to William Grzybowski

Passing to William first to see if a middleware piece is needed.

#2 Updated by William Grzybowski 11 months ago

  • Category changed from Middleware to OS
  • Assignee changed from William Grzybowski to Timur Bakeyev

#3 Updated by Dru Lavigne 11 months ago

  • Status changed from Unscreened to In Progress
  • Assignee changed from Timur Bakeyev to Andrew Walker
  • Target version changed from Backlog to 11.2-BETA1

#4 Updated by Andrew Walker 11 months ago

  • Severity changed from New to Med High

#5 Updated by Dru Lavigne 11 months ago

  • Target version changed from 11.2-BETA1 to 11.2-RC2

#6 Updated by Caleb St. John 11 months ago

  • Support Suite Ticket changed from n/a to KCW-603-27897

#10 Updated by Dru Lavigne 11 months ago

  • Private changed from No to Yes

Note: a new PR will be forthcoming, as John, Timur, and Andrew have discussed a better way to handle this.

#11 Updated by Sam Fourman 10 months ago

  • Severity changed from Med High to High

#13 Updated by Dru Lavigne 10 months ago

  • Related to Bug #35059: Active Directory binding continuously resets when monitoring is enabled added

#14 Updated by Dru Lavigne 10 months ago

  • Related to Bug #32814: Active Directory fails is first DC/DNS goes down added

#15 Updated by Dru Lavigne 10 months ago

  • Related to Bug #30678: Active Directory Service Failing II added

#17 Updated by Dru Lavigne 10 months ago

  • Target version changed from 11.2-RC2 to 11.1-U6

#18 Updated by Andrew Walker 10 months ago

There are still some code changes that I need to make per John's request. I'll work on this today and ping for another review.

#19 Updated by Dru Lavigne 10 months ago

#20 Updated by Dru Lavigne 10 months ago

  • Related to Bug #32094: SMB Service restarts added

#21 Updated by Dru Lavigne 10 months ago

  • Has duplicate Bug #28561: Reconnecting Active Directory kills Samba connections added

#22 Updated by Andrew Walker 9 months ago

PR against stable: https://github.com/freenas/freenas/pull/1564 (initial commit)
PR against stable: https://github.com/freenas/freenas/pull/1624 (fixes for issues found during testing)

PR against master: https://github.com/freenas/freenas/pull/1340 (initial commit)
PR against master: https://github.com/freenas/freenas/pull/1614 (fixes for issues found during testing)

#23 Updated by John Hixson 9 months ago

Merged

#24 Updated by John Hixson 9 months ago

  • Status changed from In Progress to Ready for Testing

#26 Updated by Dru Lavigne 9 months ago

  • Private changed from Yes to No
  • Needs Merging changed from Yes to No

#27 Updated by Dru Lavigne 9 months ago

  • Subject changed from AD monitoring can restart AD service unnecessarily to Fix unnecessary AD restarts caused by enabling service monitor

#28 Updated by Dru Lavigne 9 months ago

  • Needs Doc changed from Yes to No

#29 Updated by Dru Lavigne 9 months ago

  • Related to Bug #36823: Fix unneeded stops and restarts in AD monitoring added

#30 Updated by Timur Bakeyev 9 months ago

  • Related to deleted (Bug #32814: Active Directory fails is first DC/DNS goes down)

#31 Updated by Dru Lavigne 8 months ago

  • Related to Bug #41164: Domain Controller Not Found for domain added

#32 Updated by Bonnie Follweiler 8 months ago

  • Status changed from Ready for Testing to Passed Testing
  • Needs QA changed from Yes to No

#34 Updated by Dru Lavigne 8 months ago

  • Status changed from Passed Testing to Done
