How To Resolve Two Active Servers (Split-Brain)

Summary

This Knowledgebase article provides information about the symptoms, causes, and resolutions of two active servers.

More Information

Two active servers should never occur by design; when the condition is detected, it should be resolved immediately. When two active servers are live on the same network, Neverfail refers to the condition as Split-Brain syndrome.

Symptoms

Split-Brain syndrome can be identified by the following symptoms:

  1. Two servers in a cluster are running and in an active state. This should be visible on the Taskbar icon as P / A (Primary and active), S / A (Secondary and active) or T / A (Tertiary and active).
  2. In a cluster running Neverfail Engine, an IP address conflict may be detected on the Principal (Public) IP address.
  3. A name conflict may be detected in a cluster running Neverfail Engine. In a typical WAN environment, the Primary and Secondary (in a Pair) or Primary, Secondary, and Tertiary servers (in a Trio) connect to the network using different IP addresses and no IP conflict occurs. However, if the servers are running with the same name then a name conflict may result. This will only happen if the servers are visible to each other across the WAN.
  4. Clients (for example, Outlook) cannot connect to the server running Neverfail Engine.
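
The first symptom can be expressed as a simple check. The following Python is a minimal sketch, assuming you have noted the Taskbar icon label of each server by hand (for example "P / A" or "S / -"); it does not call any Neverfail API, and the function names are hypothetical.

```python
import re

# Letters shown on the Taskbar icon, as listed in the symptoms above.
IDENTITIES = {"P": "Primary", "S": "Secondary", "T": "Tertiary"}

# Matches labels such as "P / A" (active) or "S / -" (passive).
_LABEL = re.compile(r"\s*([PST])\s*/\s*([A-])\s*")


def parse_status(label):
    """Turn a label such as 'P / A' or 'S / -' into (identity, is_active)."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized status label: {label!r}")
    return IDENTITIES[match.group(1)], match.group(2) == "A"


def active_servers(labels):
    """Return the identity of every server whose label reports active."""
    return [identity for identity, active in map(parse_status, labels) if active]


def is_split_brain(labels):
    """True when more than one server in the cluster reports active."""
    return len(active_servers(labels)) > 1
```

For example, is_split_brain(["P / A", "S / A"]) is true, while a healthy Trio such as ["P / A", "S / -", "T / -"] is not flagged.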

Cause

Two active servers (split-brain syndrome) can have several causes. The most common are:

  • Loss of the Neverfail Channel connection (most common in a WAN environment).
  • The active server being too busy to respond to Heartbeats.
  • Misconfiguration of the Neverfail Engine software.

It is important to determine the cause of the split-brain syndrome and resolve it to prevent a recurrence.

Resolution

Important Note: Once split-brain syndrome has occurred, the server with the most up-to-date data must be identified and then made Active. If the wrong server is made Active after this point, it can result in data loss. Care should be taken to reinstate the correct Active server.

The following can help identify the server with the most up-to-date data:

  1. Check the date and time of files on both servers. The most up-to-date server should be made the active server.
  2. From a client PC on a LAN, run nbtstat -A <Public_IP>, where <Public_IP> is the Principal (Public) IP address of your server. This can help identify the MAC address of the server currently visible to client machines.
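
The nbtstat check can be scripted. The following Python is a rough sketch, assuming an English-language Windows client where nbtstat prints a line of the form "MAC Address = XX-XX-XX-XX-XX-XX"; the known_macs mapping and the function names are hypothetical, and you would fill in the MAC addresses of your own servers' public network adapters.

```python
import re
import subprocess

# nbtstat -A ends its report with a line such as "MAC Address = 00-1A-2B-3C-4D-5E".
_MAC = re.compile(r"MAC Address\s*=\s*((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})")


def run_nbtstat(public_ip):
    """Run 'nbtstat -A <ip>' on a Windows client and return its text output."""
    result = subprocess.run(
        ["nbtstat", "-A", public_ip],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def parse_mac(nbtstat_output):
    """Extract the MAC address from nbtstat output, or None if absent."""
    match = _MAC.search(nbtstat_output)
    return match.group(1).upper() if match else None


def identify_server(nbtstat_output, known_macs):
    """Map the reported MAC to a server name using a user-supplied table."""
    mac = parse_mac(nbtstat_output)
    if mac is None:
        return None
    normalized = {m.upper(): name for m, name in known_macs.items()}
    return normalized.get(mac)
```

A typical call would be identify_server(run_nbtstat("<Public_IP>"), known_macs). A result of None means either that no MAC address line was found or that the address is not in your table.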

Note: If the two active servers have both been servicing clients, perhaps at different WAN locations, one and only one server can be made active. Both servers will contain recent data, which cannot be merged using Neverfail Engine. One server must be made active and one server made passive in order to restart replication. Once replication is restarted, ALL data on the passive server will be overwritten by the data on the active server. It may be possible to extract the up-to-date data manually from the passive server prior to restarting replication. Please consult your Protected Application vendor regarding tools that may be used for this purpose. For further information, please contact Neverfail Support.

How to resolve two active servers (split-brain syndrome):

  1. Identify the server with the most up-to-date data.
  2. Shut down Neverfail Engine on all servers (if it is running).
  3. On the server(s) you would like to make passive, right-click the Taskbar icon, and select Configure Server wizard.
  4. Select the Machine tab.
    • Set the Active server to point to the identity of the server that should be the active server in the cluster.
    • For example, if the desired active server (identified at step 1) is the Primary server, the Active Server radio button should be set to Primary regardless of whether the Physical Hardware Identity is Primary, Secondary, or Tertiary.
    Note: Do not change the Identity of the server (Primary, Secondary, or Tertiary).
  5. Click Finish to accept the changes. Exit the wizard and reboot this server.
  6. Start Neverfail Engine and check that the Taskbar icon now reflects the changes by showing P / - (Primary and passive), S / - (Secondary and passive), or T / - (Tertiary and passive).
  7. On the active server, right-click the Taskbar icon and select Configure Server wizard.
  8. Select the Machine tab.
    • Set the Active server to point to the identity of the local server.
    Note: Do not change the Identity of the server (Primary, Secondary, or Tertiary).
  9. Click Finish to accept the changes. Exit the wizard and reboot this server.

    Note: As the Active server restarts, it will connect to the passive server and start replication. Once this happens, data on the passive server will be overwritten by the data on the active server. Please see above for further information on how to check which server contains the most up-to-date data.
  10. Start Neverfail Engine (if required) and check that the Taskbar icon now reflects the changes by showing P / A (Primary and active), S / A (Secondary and active), or T / A (Tertiary and active).
  11. Log into the Neverfail Advanced Management Client.
  12. Check that the servers have connected and replication has started.
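
The Taskbar icon checks in steps 6 and 10 can be written down as an expected-state table. The following Python is a minimal sketch, assuming the icon labels are read by hand from each server; the function names are hypothetical and nothing here talks to Neverfail Engine itself.

```python
# Letter shown on the Taskbar icon for each server identity.
LETTERS = {"Primary": "P", "Secondary": "S", "Tertiary": "T"}


def expected_labels(active_identity, identities=("Primary", "Secondary")):
    """Labels each server should show once the procedure is complete."""
    if active_identity not in identities:
        raise ValueError(f"{active_identity!r} is not in this cluster")
    return {
        identity: f"{LETTERS[identity]} / {'A' if identity == active_identity else '-'}"
        for identity in identities
    }


def mismatches(observed, active_identity):
    """Return {identity: (observed, expected)} for servers in the wrong state."""
    expected = expected_labels(active_identity, tuple(observed))
    return {
        identity: (label, expected[identity])
        for identity, label in observed.items()
        # Collapse repeated spaces before comparing with the expected label.
        if " ".join(label.split()) != expected[identity]
    }
```

An empty result from mismatches means exactly one server, the one chosen in step 1, shows as active. Any entry in the result names a server whose Configure Server wizard settings should be rechecked.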

Applies To

All Versions

Related Information

Knowledgebase article #984 : Resolve Two Passive Servers (Pair) or Three Passive Servers (Trio)

KBID-516

