Get DC/GC SRV records from the default site if DNS is broken in local site
Assuming two Active Directory sites: the Domain Controller is in Site A, FreeNAS is in Site B.
Assuming a working DNS infrastructure with a properly set resolver/forwarder on FreeNAS.
Upon joining a domain using settings
1) The script properly resolves the site in which FreeNAS currently resides. -> SiteB
2) The script tries to contact a DC in its current site. -> no DC available
3) The join process fails
Upon joining a domain using settings,
1) The site is "hardcoded" to SiteA, where an active DC resides.
2) The script properly resolves services in the hardcoded SiteA.
3) The join process succeeds
The drawback of this variant is lost resiliency: when SiteA is not reachable, the join fails even though other DCs may be available in other sites.
Expected behaviour -> Standard Windows Client behaviour
1) The script fails to contact a DC in its local site
2) The script performs a fallback lookup to find non site-specific services.
This will also fix the drawback mentioned when "hardcoding" the site.
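The expected fallback can be sketched as follows. This is a minimal illustration, not the FreeNAS script: `srv_names`, `locate_dcs`, and the injected `resolve` and `is_reachable` callables are hypothetical names introduced here. The SRV names follow the standard Active Directory DC locator layout, with the site-specific record tried first and the domain-wide record tried second.

```python
from typing import Callable, List, Optional

# Hypothetical resolver type: takes an SRV query name and returns the
# target host names (an empty list if no records exist).
SrvResolver = Callable[[str], List[str]]


def srv_names(domain: str, site: Optional[str],
              service: str = "_ldap._tcp") -> List[str]:
    """Return the SRV names to try: site-specific first, then domain-wide."""
    names = []
    if site:
        names.append(f"{service}.{site}._sites.dc._msdcs.{domain}")
    # Non site-specific record, used as the fallback.
    names.append(f"{service}.dc._msdcs.{domain}")
    return names


def locate_dcs(domain: str, site: Optional[str],
               resolve: SrvResolver,
               is_reachable: Callable[[str], bool]) -> List[str]:
    """Return reachable DCs, falling back when the local site yields none."""
    for name in srv_names(domain, site):
        hosts = [h for h in resolve(name) if is_reachable(h)]
        if hosts:
            return hosts
    return []


# Example with a fake resolver: SiteB has no records, so the
# domain-wide lookup is used instead.
records = {"_ldap._tcp.dc._msdcs.example.com": ["dc1.example.com"]}
found = locate_dcs("example.com", "SiteB",
                   lambda name: records.get(name, []),
                   lambda host: True)
```

Passing the resolver in as a function keeps the sketch independent of any particular DNS library; a real implementation would query DNS and probe the returned hosts.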
The situation is now different in 11.1-U6. The proper procedure in the circumstances laid out above is to hardcode Site A.
This site is used for operations that rebuild the directory service cache, and we try to use it to regenerate the smb4.conf file. However, if the site is unavailable, we now leave the existing smb4.conf file in place and Samba running. This means that services should not be interrupted while the winbind connection manager transitions to the next available DC.
The key manual configuration step (apart from hardcoding the site) is to add multiple servers, as a space-delimited list, in the advanced settings for your Kerberos realm. The list should be in order of preference.
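As an illustration of what such a list amounts to, the sketch below turns a space-delimited server field into a krb5.conf-style `[realms]` stanza with one `kdc` line per server, keeping the order given. How FreeNAS actually stores or renders this setting is an assumption here; `render_realm` and the host names are hypothetical.

```python
def render_realm(realm: str, kdc_field: str) -> str:
    """Render a krb5.conf-style [realms] stanza from a space-delimited list.

    The order of the servers in kdc_field is preserved, so the first
    entry is the most preferred one.
    """
    kdcs = kdc_field.split()  # split on any run of whitespace
    lines = ["[realms]", f"    {realm.upper()} = {{"]
    lines += [f"        kdc = {host}" for host in kdcs]
    lines.append("    }")
    return "\n".join(lines)


# Hypothetical hosts, most preferred first.
stanza = render_realm("example.com", "dc1.example.com dc2.example.com")
print(stanza)
```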
Eventually, the issue in the ticket will need to be resolved by reworking how some of our ancillary directory service code works, but the reported issue can now be worked around.