This article discusses unexpected channel drops. Under normal operations, Neverfail Continuity Engine maintains continuous communications between servers using the Neverfail Channel. When communications between servers fail, the condition is referred to as a channel drop.
At the completion of this session, you should be able to:
- Identify the symptoms of a channel drop.
- Identify the causes of a channel drop.
- Recall how to correct a channel drop.
The Neverfail Channel can operate on a private network segment, separate from the public network, or on the same network segment as the public network. Continuity Engine isolates replication traffic from public traffic and allows for continuous communications between servers, which is essential for Continuity Engine to function properly.
Should the Neverfail Channel lose communications between servers for a period that exceeds the configured Failover timeout, a passive server will attempt to assume the active role. This is by design, as Continuity Engine assumes that the current active server has failed.
However, if the active server is in fact still operational and only the Channel connection itself has failed, the passive server may believe that the active server is no longer in operation and wrongly attempt to assume an active role.
This condition, known as False Failover, can cause Split-Brain Syndrome, where a passive server becomes active in an attempt to provide services to users while the original active server is still functioning.
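The timeout behavior described above can be illustrated with a small simulation. This is only a sketch of the general heartbeat-timeout idea, not Continuity Engine's actual implementation: the names `PassiveMonitor` and `simulate`, the 60-second timeout, and the 10-second heartbeat interval are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PassiveMonitor:
    """Hypothetical passive-side monitor; not a Neverfail API."""
    failover_timeout: float      # seconds of silence tolerated (assumed value)
    last_heartbeat: float = 0.0  # time the last heartbeat was received

    def record_heartbeat(self, now: float) -> None:
        self.last_heartbeat = now

    def should_fail_over(self, now: float) -> bool:
        # The passive server only sees silence; it cannot tell whether
        # the active server died or only the channel failed.
        return (now - self.last_heartbeat) > self.failover_timeout


def simulate(active_alive: bool, channel_up: bool,
             failover_timeout: float = 60.0,
             duration: float = 120.0,
             interval: float = 10.0) -> dict:
    """Step through simulated time and report what the passive server does."""
    monitor = PassiveMonitor(failover_timeout)
    passive_promoted = False
    t = 0.0
    while t <= duration:
        # A heartbeat arrives only if the active server is running
        # AND the channel can carry it.
        if active_alive and channel_up:
            monitor.record_heartbeat(t)
        if monitor.should_fail_over(t):
            passive_promoted = True
            break
        t += interval
    # Split-brain: the passive was promoted while the active is still running.
    return {
        "passive_promoted": passive_promoted,
        "split_brain": passive_promoted and active_alive,
    }
```

Calling `simulate(active_alive=False, channel_up=True)` models a genuine failure of the active server, where promotion is the desired outcome. Calling `simulate(active_alive=True, channel_up=False)` models a channel drop: from the passive server's point of view the two cases look identical, which is why the second one ends in a split-brain condition.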
Channel drops are generally the result of communications problems. These can be caused by:
- Faulty crossover cables, if the servers are connected directly, or a faulty switch, if one is used.
- A faulty network card.
- Improper configuration of Neverfail Continuity Engine.
- Firewalls that incorrectly identify the channel replication traffic as an attack and subsequently block traffic on ports used by Continuity Engine.
- The active server being too busy to respond to the Heartbeat message from a passive server.
To correct the Channel drop, you should first identify the cause. Check each of the following for faults:
- Check that patch cables are not faulty. Use different patch cables to perform this test.
- Check that there are no other applications using Neverfail Continuity Engine ports, and that firewalls have not been manually or automatically configured to block traffic on those ports.
- Ensure that the latest network card drivers are installed on channel NICs.
- Make sure the channels are configured with the correct IP addressing.
- Make sure that the channel IP addresses are set correctly within the Neverfail Configure Server Wizard.
- Confirm that routing is working correctly. Use the Windows diagnostic tools pathping and tracert to identify excessive packet loss or inefficient routing. These tools can also be used in LAN environments to confirm channel routes.
- In a WAN environment, check for sudden bursts in network traffic from other applications. If Quality of Service (QoS) is operating on the network, check that Neverfail Channel communications are not restricted.
- If the channels connect across a VPN link in a WAN, the VPN may have dropped or timed out, or automatic reconnection may have been prevented. Therefore, verify that a valid VPN connection exists.
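Two of the checks above, the IP addressing and the port availability, can be partly scripted. The following is a minimal sketch using only the Python standard library. The helper names `same_subnet` and `port_reachable` are hypothetical, and the addresses, prefix length, and port number must be taken from your own configuration, since none are specified here. The subnet check assumes a directly connected or single-segment channel; routed WAN channels may legitimately use different subnets.

```python
import ipaddress
import socket


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both channel addresses fall in the same subnet.

    Intended for directly connected or single-segment channels, where
    both ends must share a subnet to communicate without routing.
    """
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    A False result means the connection was refused, filtered, or timed
    out; it does not identify which device blocked the traffic.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `same_subnet("10.0.0.1", "10.0.0.2", 24)` returns `True`, while `same_subnet("10.0.0.1", "10.0.1.2", 24)` returns `False` and would suggest a mistyped channel address. Run `port_reachable` from one server against the other server's channel address and the configured channel port; a failure narrows the problem to cabling, addressing, or a firewall rather than to Continuity Engine itself.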
More information about channel drops is located in the Neverfail Knowledge Base.
This article discussed details about unexpected channel drops. Remember these key points:
- The symptoms of channel drops:
  - False Failovers.
  - Split-Brain Syndrome.
- The causes of channel drops:
  - Faulty patch cables or switches.
  - Faulty NICs.
  - Improper configuration.
  - Firewalls blocking channel traffic.
  - Excessive loads on the active server.
- How to troubleshoot and resolve a channel drop:
  - Replacing patch cables and switches.
  - Checking Neverfail port use.
  - Ensuring NIC drivers are current.
  - Verifying the IP addressing.
  - Reviewing routing configuration.
  - Ensuring adequate bandwidth.