The challenges of patching passive nodes in Engine's true clone-based architecture
Neverfail Continuity Engine employs a clone-based architecture to create exact copies of production servers and incrementally synchronize changes between the active and passive clones.
The advantages of Engine’s clone-based architecture are seamless directory mapping (no configuration needed, as the same locations are assumed), no post-failover processing (the server and application configurations are identical), and no system state restoration (no reboot after failover).
A clone-based architecture also comes with challenges. The following issues may be encountered when managing clones:
- All cluster nodes have the same identity. Most third-party tools (patch management, anti-virus, configuration and network management, or backup software) keep track of the changes applied to the production server. Because most of these tools identify servers by a combination of IP address, FQDN, and NETBIOS name, having the same identity on all cluster nodes can cause problems when such software identifies and audits the servers.
For example: SCCM interrogates the primary server and pushes patches down to it. If the secondary node then switches from Passive to Active, at the next interrogation SCCM’s configuration database will show that all patches have been deployed to that identity. In reality, the new Active node (the secondary) does not have the required patches. This forces Engine users to push updates down to the secondary node, which can mean rebooting the server right after a continuity event, when rebooting is far from optimal.
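The identity collision above can be pictured as follows. This is a hypothetical sketch of an inventory tool keyed on network identity; the function names, fields, and patch IDs are illustrative, not SCCM's actual schema:

```python
# Hypothetical sketch: an inventory tool that keys its database on the
# (IP, FQDN, NETBIOS) tuple conflates two clone nodes that share an identity.

def identity_key(ip, fqdn, netbios):
    """Identify a server the way many management tools do: by the
    combination of IP address, FQDN, and NETBIOS name."""
    return (ip, fqdn.lower(), netbios.upper())

inventory = {}  # identity_key -> set of patch IDs believed deployed

def record_patches(ip, fqdn, netbios, patches):
    inventory.setdefault(identity_key(ip, fqdn, netbios), set()).update(patches)

# The primary node is active and gets patched; the tool records this.
record_patches("10.0.0.5", "app01.corp.local", "APP01", {"KB5001", "KB5002"})

# After a switchover the secondary answers on the same identity, so the
# tool believes it is already fully patched -- even though it is not.
key = identity_key("10.0.0.5", "app01.corp.local", "APP01")
print(sorted(inventory[key]))
```

Because both clones resolve to the same key, nothing in the tool's database distinguishes the unpatched node from the patched one.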
- Lack of a consistent procedure. Because different software requires different upgrade procedures, this can be a source of confusion for the user, and an upgrade procedure that is not properly completed can cause a failed continuity event.
Neverfail Continuity Engine 8.5 implements the following solutions for handling the challenges of true clone-based architecture:
Cozen Passive Node Management
Cozen is an English term meaning to trick. With Cozen Passive Node Management, Engine tricks third-party instrumentation into seeing these true clones as separate identities.
The Management options of the Server Configuration wizard allow each node in the cluster to be configured with a unique host name (FQDN) and NETBIOS name, enabling operational control over the Windows servers while in the passive state. The original host name and NETBIOS name are restored automatically when a cluster node becomes active.
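The behavior can be pictured as a simple mapping from node role to effective identity. The sketch below is illustrative only; the node names and the mapping logic are hypothetical, not Engine's implementation:

```python
# Illustrative sketch of Cozen-style identity presentation. The names and
# structure are hypothetical assumptions, not Engine's actual code.

PRODUCTION_NAME = "app01.corp.local"      # identity presented by the active node
MANAGED_NAMES = {                         # unique per-node management identities
    "primary": "app01-node1.corp.local",
    "secondary": "app01-node2.corp.local",
}

def effective_hostname(node, is_active):
    """The active node presents the production identity; a passive node
    presents its own unique management identity so third-party tools
    can address and audit it individually."""
    return PRODUCTION_NAME if is_active else MANAGED_NAMES[node]

print(effective_hostname("secondary", is_active=False))  # unique passive identity
print(effective_hostname("secondary", is_active=True))   # production identity
```

With this scheme, a patch management tool sees two distinct identities and tracks each node's patch state separately, which is exactly what the shared-identity example earlier could not do.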
Reclone Secondary or Tertiary Server
This feature allows the user to clone a passive cluster node manually on demand, or to set up an automatic procedure that clones the specified node on a schedule. The feature preserves the existing passive node configuration, which includes:
- Public IP addresses
- Static Routes
- Channel IP addresses
- Plugin configuration
This allows the cluster to be rebuilt after application and security patches have been deployed.
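Conceptually, such a reclone copies the patched active node's state while carrying the listed settings over from the old passive node. The sketch below is a hypothetical illustration of that idea; the dictionary structure and key names are assumptions, not Engine's data model:

```python
# Hypothetical sketch: reclone a passive node from the active node while
# preserving the passive node's own network and plugin configuration.
from copy import deepcopy

def reclone(active_node, passive_node):
    # The four configuration items the feature preserves (per the list above).
    preserved_keys = ("public_ips", "static_routes", "channel_ips", "plugins")
    preserved = {k: deepcopy(passive_node[k]) for k in preserved_keys}
    new_passive = deepcopy(active_node)   # exact clone of the patched active node
    new_passive.update(preserved)         # restore the passive node's own settings
    return new_passive

active = {"os_patches": ["KB5001"], "public_ips": ["10.0.0.5"],
          "static_routes": [], "channel_ips": ["192.168.1.1"], "plugins": ["SQL"]}
passive = {"os_patches": [], "public_ips": ["10.0.0.6"],
           "static_routes": ["10.1.0.0/16 via 10.0.0.1"],
           "channel_ips": ["192.168.1.2"], "plugins": ["SQL"]}

node = reclone(active, passive)
print(node["os_patches"], node["public_ips"])
```

The result is a passive node that carries the active node's patches but keeps its own addresses, routes, and plugin configuration, so the cluster comes back consistent without manual reconfiguration.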
What are the use cases for these features, and when should each option be chosen?
The Neverfail Patch Management Options are designed to help the user in the following cases:
Use Case 1
Operational control over the passive nodes is required via third-party instrumentation and management tools. These can include patch management tools, configuration and network management tools, anti-virus software, backup tools, etc.
The Engine’s Passive Node Management option allows each passive node in the cluster to have a unique host name and NETBIOS name that can be used by third party tools.
Use Case 2
Pushing application patches that do not require the applications to be running on the passive nodes.
With Passive Node Management, application patches can be pushed to passive servers using third-party tools such as SCCM, Windows Server Update Services (WSUS), or Ivanti Patch.
Use Case 3
A complex or custom application requires manual deployment of software components, where software updates cannot be simply pushed down.
The Reclone Secondary or Tertiary Servers feature allows the user to clone the passive server once the complex or custom application is deployed. The cloning can be done on demand or on a schedule.
Use Case 4
A large dataset needs to be updated, and replicating the entire dataset takes too long. This makes production server recloning the ideal solution.
The Reclone Secondary or Tertiary Servers feature allows the user to clone the passive server with its entire dataset to obtain the other required nodes.
Use Case 5
After a major application update, an automatic cluster regeneration is desired, in order to maintain consistency of cluster nodes.
The Reclone Secondary or Tertiary Servers feature allows the user to schedule an automatic full cluster reclone, after each major upgrade. The original nodes can be automatically deleted after the reclone.
Use Case 6
Application and security patching is scheduled for a specific day in the month.
The Reclone Secondary or Tertiary Servers feature’s Scheduled Recloning can be used to plan a full or partial cluster reclone after every patching day.