[olug] Fails Over, but does Not Fail Back
joel at kansaslinuxfest.org
Tue Apr 27 08:05:42 CDT 2021
I've seen this both ways and it seems to be dependent upon the equipment.
The firewall units we have at my work run in a cluster (active-passive).
They do not "fail back"; the vendor explains that one of the data points
used in deciding which unit is "active" is the uptime of the device (longer
uptime makes a device more likely to be chosen as the "active" unit). I just
reboot the now-active unit to restore the original order. (Often this
happens during a maintenance window, so it's a quick check and no
In our networking switches (a different vendor than the firewall units) we
use MSTP (Multiple Spanning Tree Protocol). The links and switches have
priority settings that do not depend on device uptime, so if a "spanning-tree
event" occurs (link, switch, etc. failure), when things recover they restore
to the desired setup based on those priorities. No extra intervention
So I see it happen both ways.
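
To make the difference between the two behaviours concrete, here is a toy
sketch in Python. It is not any vendor's actual algorithm: the Unit class,
the field names, and the rule "lower priority number wins" are assumptions
made up for illustration, and it treats uptime as the only election input,
whereas the firewall vendor described it as just one of several data points.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    """Hypothetical cluster member; all fields are illustrative only."""
    name: str
    priority: int       # assumed convention: lower number = preferred
    uptime_s: int       # seconds since last boot
    healthy: bool = True


def elect_by_uptime(units: List[Unit]) -> Optional[str]:
    """Uptime-weighted election: the healthy unit with the longest uptime wins."""
    candidates = [u for u in units if u.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda u: u.uptime_s).name


def elect_by_priority(units: List[Unit]) -> Optional[str]:
    """Priority-based election: the healthy unit with the best priority wins."""
    candidates = [u for u in units if u.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda u: u.priority).name


def recovered_pair() -> List[Unit]:
    """State shortly after the primary has failed, rebooted, and come back."""
    return [
        Unit("primary", priority=1, uptime_s=300),        # just rebooted
        Unit("secondary", priority=2, uptime_s=900_000),  # up the whole time
    ]


if __name__ == "__main__":
    units = recovered_pair()
    print("uptime-weighted:", elect_by_uptime(units))
    print("priority-based: ", elect_by_priority(units))

    # Rebooting the now-active unit resets its uptime, which is the
    # manual workaround described for the firewall cluster.
    units[1].uptime_s = 0
    print("uptime-weighted after rebooting secondary:", elect_by_uptime(units))
```

Under these assumptions the uptime-weighted election keeps the secondary
active after the primary recovers, the priority-based election hands the
role back to the primary on its own, and rebooting the secondary is what
tips the uptime-weighted election back to the primary.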
On 4/27/2021 3:45 AM, Rob Townley wrote:
> tl;dr: Systems that reliably fail over to a redundant system, but absolutely
> refuse to revert back to the primary system.
> Looking for general guidelines on systems (primarily networking) to
> troubleshoot the fail back to primary pathway.
> The failover happens reliably. The problem comes when the primary is back
> up: actually reverting, aka “failing back”, to the primary path.
> Have experienced this failure to fail back too many times across a variety
> of equipment and systems. Looking for general guidelines. What do noobs
> usually miss?
> Also, is it a common problem or just me?