Continuity Engine Troubleshooting - Two Active or All Passive Servers

This session introduces you to resolving unexpected occurrences where two servers are active or all the servers are passive. Neverfail Continuity Engine is designed to operate with one server active, while the other server or servers are passive. 
Such occurrences prevent Continuity Engine from properly protecting your applications and you must correct them immediately upon detection.

Learning objectives

At the completion of this session you should be able to:
  1. Identify the symptoms of two active servers.
  2. Identify the causes of two active servers.
  3. Recall the process for correcting two active servers.
  4. Identify the symptoms of all passive servers.
  5. Identify the causes of all passive servers.
  6. Recall the process for correcting all passive servers.

Two Active Servers

Overview

Neverfail Continuity Engine's architecture uses an active/passive server cluster to provide optimum protection for your applications. To function properly, the server cluster must always operate with exactly one active server, plus one passive server in the case of a pair or two passive servers in the case of a trio. The occurrence of two active servers is not by design, and you should address such a situation immediately.
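
The rule described above (exactly one active server in a pair or trio) can be expressed as a small check. This is a minimal sketch for illustration only: the function name, the role strings, and the result labels are assumptions made for this example and are not part of the Continuity Engine software or its interfaces.

```python
from collections import Counter


def classify_cluster(roles):
    """Classify a pair or trio from a mapping of server name -> role.

    `roles` is a hypothetical input, for example roles that an administrator
    has read off each server by hand. Each role must be "active" or "passive".
    """
    if len(roles) not in (2, 3):
        raise ValueError("expected a pair (2 servers) or a trio (3 servers)")

    counts = Counter(roles.values())
    unknown = set(counts) - {"active", "passive"}
    if unknown:
        raise ValueError(f"unknown role(s): {sorted(unknown)}")

    active = counts["active"]
    if active == 1:
        return "healthy"      # one active, the rest passive
    if active == 0:
        return "all-passive"  # no server is servicing clients
    return "split-brain"      # two or more active servers


print(classify_cluster({"primary": "active", "secondary": "active"}))
```

Anything other than "healthy" corresponds to one of the two conditions covered in this session and calls for manual correction.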

Symptoms

Situations may occur where you encounter two servers running in the active mode. The occurrence of two active servers is referred to as Split-Brain Syndrome. An organization experiencing Split-Brain Syndrome can have two servers servicing the clients, thereby causing each server to update its application data independently from the other. Since these data differences may not subsequently be merged, the situation can result in data loss during synchronization. 

Causes

Split-Brain Syndrome can result from:
  1. Loss of the Neverfail Channel connection (most common in a WAN environment).
  2. The active server being too busy to respond to heartbeats.
  3. Misconfiguration of the Continuity Engine software. 

Resolutions

To resolve Split-Brain Syndrome, identify the server with the most up-to-date data. This server should remain as the active server. Do not assume that the primary server should always be the active one. You must identify the one that has the most up-to-date data.
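
One way to support that decision is to compare when each server's application data was last written. The sketch below assumes you have gathered those timestamps yourself (for example from file or database modification times); the function name and input format are hypothetical, and a timestamp is only one signal, so the final choice remains a manual judgment.

```python
from datetime import datetime


def most_recent_writers(last_writes):
    """Return the server name(s) with the newest last-write timestamp.

    `last_writes` maps a server name to a datetime collected by the
    administrator. A list is returned so that a tie is visible rather than
    silently resolved.
    """
    if not last_writes:
        raise ValueError("no servers supplied")
    newest = max(last_writes.values())
    return sorted(name for name, ts in last_writes.items() if ts == newest)


candidates = most_recent_writers({
    "primary": datetime(2024, 1, 1, 10, 0),
    "secondary": datetime(2024, 1, 1, 10, 5),
})
print(candidates)
```

If more than one name comes back, the timestamps alone cannot settle the question and the data itself needs closer inspection.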

Shut down Neverfail Continuity Engine on all servers and use the Configure Server Wizard to reset their roles to active or passive appropriately. Reboot the servers and allow the server cluster to re-synchronize. More information about this process is located in the Neverfail knowledge base.

All Passive Servers

Overview

The occurrence of two passive servers (pair) or three passive servers (trio) prevents Neverfail Continuity Engine from providing continuous application services and application protection.

Symptoms

The first indication that Continuity Engine may be experiencing two passive servers (pair) or three passive servers (trio) is when users are unable to connect to protected applications. This situation can prove critical to your business and you should address it immediately. If you have already configured your alerts, you will also receive notification that replication is not functioning properly.

Causes

Whereas Split-Brain Syndrome usually results from faulty communication between the servers, all servers can become passive after an unclean (ungraceful) shutdown of Neverfail Continuity Engine.
This can result from:
  1. A power failure on multiple servers.
  2. Restarting the active server without first shutting down Continuity Engine cleanly.
  3. Misconfiguring the Continuity Engine software.
Leaving all servers passive after such an event is by design, to protect data integrity.

Resolutions

You must manually configure which server you want to be active, choosing the primary or secondary server, or possibly the tertiary server if you have a trio. To do this, you must perform the following steps:
  1. Check the integrity of both your data and your hardware.
  2. Try to determine the cause or causes of the failure and repair them if possible. 
  3. Decide which server you would like to make active and which server (or servers) passive. As before, do not assume that the primary server should always be the active one. You must check which server has the most up-to-date data.
  4. Shut down Continuity Engine on all servers.
  5. Use the Configure Server Wizard to reset the server roles to active or passive appropriately.
  6. Reboot all servers.
  7. Allow the server cluster to re-synchronize.
More information about this process is located in the Neverfail knowledge base.
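
Because the order of the steps above matters (for example, roles are reset only after Continuity Engine has been shut down on all servers), it can help to track progress with an ordered checklist. The following is a minimal sketch of such a tracker; the class and its methods are hypothetical, and it does not interact with Continuity Engine in any way.

```python
RECOVERY_STEPS = [
    "Check the integrity of data and hardware",
    "Determine and repair the cause of the failure",
    "Decide which server becomes active",
    "Shut down Continuity Engine on all servers",
    "Reset server roles with the Configure Server Wizard",
    "Reboot all servers",
    "Allow the server cluster to re-synchronize",
]


class RecoveryChecklist:
    """Track the recovery steps and refuse to skip ahead."""

    def __init__(self, steps=RECOVERY_STEPS):
        self.steps = list(steps)
        self.completed = 0  # number of steps finished so far

    def next_step(self):
        """Return the next pending step, or None when all are done."""
        if self.completed < len(self.steps):
            return self.steps[self.completed]
        return None

    def complete(self, step):
        """Mark `step` as done; it must be the next pending step."""
        expected = self.next_step()
        if expected is None:
            raise RuntimeError("all steps are already complete")
        if step != expected:
            raise ValueError(f"out of order: expected {expected!r}, got {step!r}")
        self.completed += 1

    def done(self):
        return self.completed == len(self.steps)


checklist = RecoveryChecklist()
checklist.complete("Check the integrity of data and hardware")
print(checklist.next_step())
```

Attempting to complete "Reboot all servers" before the earlier steps raises an error, which mirrors the intent of performing the procedure strictly in sequence.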

Wrap Up

This article discussed the unexpected occurrences of two active servers, and of two passive servers in the case of a pair or three passive servers in the case of a trio. Remember the following key points:
  1. The symptoms of two active servers or split-brain syndrome. 
  2. The cause of split-brain syndrome is usually communication failure or software misconfiguration. 
  3. The symptoms of two passive servers (pair) or three passive servers (trio).
  4. Two or three passive servers are normally the result of an unclean shutdown or software misconfiguration. 
  5. The process for correcting two passive servers (pair) or three passive servers (trio).
