[olug] Fails Over, but does Not Fail Back

Joel B joel at kansaslinuxfest.org
Tue Apr 27 08:05:42 CDT 2021


Hi Rob,
I've seen this both ways and it seems to be dependent upon the equipment.
Examples:
The firewall units we have at my work run in a cluster (active-passive).
They do not "fail back". The vendor explains that one of the data points
used in deciding which unit is "active" is device uptime (longer uptime
weights a device toward being the "active" unit). I just reboot the
now-active unit to restore the original order. (This often happens
during a maintenance window, so it's a quick check and rebooting is no
problem.)
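
To illustrate why uptime-weighted election does not fail back, here is a
minimal sketch in Python. It assumes uptime is the only deciding factor
(the vendor only says it is one of several data points), and the unit
names and the elect_active helper are made up for the example; this is
not the vendor's actual algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    name: str
    healthy: bool
    uptime_s: int  # seconds since last boot


def elect_active(units: List[Unit]) -> Optional[Unit]:
    """Hypothetical rule: among healthy units, the longest uptime wins."""
    healthy = [u for u in units if u.healthy]
    return max(healthy, key=lambda u: u.uptime_s) if healthy else None


fw_a = Unit("fw-a", healthy=True, uptime_s=90 * 86400)  # original active
fw_b = Unit("fw-b", healthy=True, uptime_s=60 * 86400)  # original passive
cluster = [fw_a, fw_b]

history = [elect_active(cluster).name]      # normal operation

fw_a.healthy = False                        # fw-a fails, fw-b takes over
history.append(elect_active(cluster).name)

fw_a.healthy, fw_a.uptime_s = True, 300     # fw-a returns with a reset uptime
history.append(elect_active(cluster).name)  # fw-b still has more uptime

fw_b.uptime_s = 0                           # fw-b rebooted by hand
history.append(elect_active(cluster).name)  # fw-a now has more uptime

print(history)  # ['fw-a', 'fw-b', 'fw-b', 'fw-a']
```

The recovered unit always comes back with less uptime than the unit that
took over, so under this assumed rule the roles only swap back after the
manual reboot.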

In our networking switches (a different vendor than the firewall units) we
use MSTP (Spanning Tree). The links and switches have priority settings
that do not depend on device uptime, so if a "spanning-tree event"
occurs (link failure, switch failure, etc.), things return to the
desired setup based on those priorities once they recover. No extra
intervention required.
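
For contrast, a rough sketch of priority-based election, again in Python.
It only models root bridge selection (lowest priority value wins, with
the MAC address as tie-breaker) and ignores MSTP regions, instances, and
port costs. The switch names, priorities, and MAC addresses are invented
for the example.

```python
from typing import Dict, Set, Tuple

BridgeId = Tuple[int, str]  # (configured priority, MAC address)

# Hypothetical configuration: lower priority value is preferred.
CONFIGURED: Dict[str, BridgeId] = {
    "core-1": (4096, "00:00:00:00:00:0a"),
    "core-2": (8192, "00:00:00:00:00:0b"),
    "access-1": (32768, "00:00:00:00:00:0c"),
}


def elect_root(up: Set[str]) -> str:
    """Among switches that are up, the lowest (priority, MAC) wins."""
    return min(up, key=lambda name: CONFIGURED[name])


everyone = set(CONFIGURED)
roots = [
    elect_root(everyone),               # normal operation
    elect_root(everyone - {"core-1"}),  # core-1 is down
    elect_root(everyone),               # core-1 has recovered
]
print(roots)  # ['core-1', 'core-2', 'core-1']
```

Because the result depends only on configured values and on which
switches are currently up, the topology settles back to the original
root as soon as it returns, with no manual step.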

So I see it happen both ways.
-Joel


On 4/27/2021 3:45 AM, Rob Townley wrote:
> tl;dr: Systems that reliably fail over to a redundant system, but absolutely
> refuse to revert back to the primary system.
>
> Looking for general guidelines on systems (primarily networking) to
> troubleshoot the fail back to primary pathway.
>
> The failover happens reliably. The problem is actually reverting back, aka
> “failing back”, to the primary path when the primary comes back up.
>
> Have experienced this failure to fail back too many times across a variety
> of equipment and systems.  Looking for general guidelines.  What do noobs
> usually miss?
>
> Also, is it a common problem or just me?
> _______________________________________________
> OLUG mailing list
> OLUG at olug.org
> https://www.olug.org/mailman/listinfo/olug


