How to Configure Application Start/Stop/Monitoring Processes in Neverfail Heartbeat V5.2[.n] and Earlier

Summary

This Knowledgebase article describes the procedure for creating and changing Start, Stop, and Monitoring Scripts.


More Information

The Roles of the Start, Stop, and Monitor Scripts

The start script is executed on the active server during the Heartbeat Start, Switch, Failover, and AutoSwitch operations. The start script typically starts the application by initiating all the application’s configured executables and services, and then reports if they have successfully started.

The stop script is executed on the active server during the Heartbeat Stop, Shutdown, Switch, and AutoSwitch operations. The stop script typically stops the application by closing down the configured executables and stopping the services, and then reports if they have stopped properly.

The monitor script is run on the active server at configured time intervals, while Heartbeat is started. The monitor script typically queries the application’s configured executables and services, can attempt to restart them, and can report if the application is not running properly.

Procedure

How to create a Start Script

A start script will typically: start the protected application’s services, start any other executables, then report if everything started properly.
The nfnet utility is used to start services. The net start and net stop commands are commonly used to start and stop services; however, some services do not respond correctly to them. For this reason, the scripts generated by the Heartbeat installer use the nfnet utility instead. It behaves like net start and net stop, but has additional options that allow a timeout to be specified for services that do not respond properly.

A typical start script for a protected application that has a number of component services might look like the following:

nfnet start "Service1" /R || goto Failed
nfnet start "Service2" /R || goto Failed
nfnet start "Service3" /R || goto Failed
nfnet start "Service4" /R || goto Failed
nfnet start "Service5" /R || goto Failed
:All Started
echo NFCMD doApplicationExecuting Exch2k
goto Done
:Failed
Echo Failed to Start some Services
:Done

Service1 through Service5 are the service names as displayed in the Windows Service Control Manager. The /R switch causes additional status messages to be reported back to Heartbeat while the script executes.

If all services start successfully, the 'echo NFCMD doApplicationExecuting Exch2k' command informs Heartbeat that the application Exch2k has started properly.

If any service fails to start, the script jumps to the Failed label and does not echo the NFCMD doApplicationExecuting command. When the start script finishes without reporting ApplicationExecuting, the application screen reports that the script failed to start the application properly.

The nfnet utility will start services in the correct order if they have explicitly declared dependencies on one another. Any services that have implied dependencies on one another must be started in the correct order by the script.
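
One way to work out a safe start order for services with implied dependencies is a topological sort. The following Python sketch is illustrative only: the dependency map is hypothetical, and it uses Python's standard graphlib module, not anything provided by Heartbeat. It prints nfnet start lines in an order that starts every prerequisite first.

```python
from graphlib import TopologicalSorter

# Hypothetical implied dependencies: each service maps to the services
# that must already be running before it is started.
IMPLIED_DEPENDENCIES = {
    "Service1": [],
    "Service2": ["Service1"],
    "Service3": ["Service1"],
    "Service4": ["Service2", "Service3"],
    "Service5": ["Service4"],
}


def start_order(dependencies):
    """Return the services in an order that starts prerequisites first."""
    return list(TopologicalSorter(dependencies).static_order())


def start_lines(dependencies):
    """Render one nfnet start line per service, in dependency order."""
    return [f'nfnet start "{name}" /R || goto Failed'
            for name in start_order(dependencies)]


if __name__ == "__main__":
    print("\n".join(start_lines(IMPLIED_DEPENDENCIES)))
```

If the map contains a circular dependency, static_order() raises graphlib.CycleError, which is a useful early warning that no valid start order exists.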

In the above example, the application name in the script is Exch2k, and it must match both the case and spelling of the application name as it appears on the application screen.


A script that starts an executable could simply name the executable, its pathname, and any start up switches that it requires as follows:

Echo off
Set workdir=d:\Program Files\Microsoft Office\Office10\
"%workdir%Excel.exe"

The above script sets the variable workdir to the path to the folder in which the executable resides. It then uses this path to launch the executable excel.exe. However, an executable started by a script in this way cannot be controlled by Heartbeat. In particular, there is no easy way to stop the executable. For this reason, Heartbeat defines the doStartExecutable command that can be used by a script to request Heartbeat to launch the executable on behalf of the script. Heartbeat provides other scripts for controlling executables launched in this way.
The following script executes in exactly the same way as the previous script except that the launch of the Excel executable is provided by the Neverfail Heartbeat Application Manager.

Echo off
Set workdir=d:\\Program Files\\Microsoft Office\\Office10\\
echo NFCMD doStartExecutable Office excel "%workdir%Excel.exe"

The script uses the “echo NFCMD doStartExecutable …” command to ask Heartbeat to launch the executable. Note the use of double backslashes in the path.

The parameters for the command are:

'Office', the logical name that is used to identify the application in the application screen. This is termed the application name. The name is case sensitive – the parameter must match the case and spelling of the application name on the application screen.

'excel', a logical name for the process being launched. Heartbeat associates this name with the process identifier (PID) of the executable, and the name can be used to identify the process in the doStopExecutable command.

"%workdir%Excel.exe", the executable to launch, including its path.

If the application needs more than one executable to run, more executables can be launched by the script in the same manner. For example, we will add a new executable to the Office protected application:

Echo off
Set workdir=d:\\Program Files\\Microsoft Office\\Office10\\
echo NFCMD doStartExecutable Office excel "%workdir%Excel.exe"
echo NFCMD doStartExecutable Office word "%workdir%winword.exe"

When using any NFCMD function, note that function names and parameters are case sensitive. For example, writing Office in the script finds only an application configured as Office in the Management Client GUI; it does not find office or any other case variant.
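
Because a case mismatch is easy to miss by eye, a small check can be run over a script before it is deployed. The Python sketch below is a hypothetical helper, not a Neverfail tool. It assumes, as in the examples in this article, that the application name is the first parameter after the NFCMD function name, and it flags names that differ from the configured name only by letter case.

```python
import re

# Matches lines such as: echo NFCMD doStartExecutable Office excel "..."
# "echo" is matched case-insensitively because the examples write it both
# ways; NFCMD and everything after it is matched exactly.
NFCMD_LINE = re.compile(r"^\s*(?i:echo)\s+NFCMD\s+(\S+)\s+(\S+)")


def find_case_mismatches(script_text, configured_name):
    """Return (line_number, name) pairs whose application name differs
    from configured_name only by letter case."""
    mismatches = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = NFCMD_LINE.match(line)
        if not match:
            continue
        name = match.group(2)
        if name != configured_name and name.lower() == configured_name.lower():
            mismatches.append((number, name))
    return mismatches


# Sample script with two deliberate case errors in the application name.
SCRIPT = '''Echo off
echo NFCMD doStartExecutable Office excel "%workdir%Excel.exe"
echo NFCMD doStartExecutable office word "%workdir%winword.exe"
echo NFCMD doApplicationExecuting OFFICE
'''

if __name__ == "__main__":
    for number, name in find_case_mismatches(SCRIPT, "Office"):
        print(f"line {number}: '{name}' does not match 'Office'")
```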

The application screen shows a global application status for each protected application. The status is one of three values: Executing, Stopped, or Failed. It reflects the status reported by the start and stop scripts. The script can report that the application is executing by adding the following line to the bottom of the script:

echo NFCMD doApplicationExecuting Office

Where, again, Office must match the logical name for the application as displayed in the application screen.

Our start script now looks like the following:

Echo off
set workdir=d:\\Program Files\\Microsoft Office\\Office10\\
echo NFCMD doStartExecutable Office excel "%workdir%Excel.exe"
echo NFCMD doStartExecutable Office word "%workdir%winword.exe"
echo NFCMD doApplicationExecuting Office

How to create a Stop script

Scripts that stop services are similar to those that start services; they use the nfnet stop command rather than the net stop command.
A typical stop script for the earlier “Exch2k” example would be as follows:

nfnet stop "Service5" /R || goto Failed
nfnet stop "Service4" /R || goto Failed
nfnet stop "Service3" /R || goto Failed
nfnet stop "Service2" /R || goto Failed
nfnet stop "Service1" /R || goto Failed
:All Stopped
echo NFCMD doApplicationStopped Exch2k
goto Done
:Failed
Echo Failed to Stop some Services
:Done

Service1 through Service5 are again the service names as displayed in the Windows Service Control Manager. The /R switch causes additional status messages to be reported back to Heartbeat while the script executes.

If all services stop successfully, the 'echo NFCMD doApplicationStopped Exch2k' command informs Heartbeat that the application Exch2k has stopped properly.

If any service fails to stop, the script jumps to the Failed label and does not echo the NFCMD doApplicationStopped command. When the stop script finishes without reporting ApplicationStopped, the application screen reports that the script failed to stop the application properly. Note that the order of services in the stop script is the reverse of that in the start script.
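
Keeping the start and stop scripts in step is easier if both are generated from one list of services. The Python sketch below is a hypothetical helper, not part of Heartbeat. It takes the services in start order and emits either script, reversing the order for stop. The layout follows the examples above, minus the unused label before the echo NFCMD line.

```python
def render_script(application, services, action):
    """Render a start or stop script for the given services.

    services is listed in start order; a stop script walks it in reverse.
    """
    if action == "start":
        ordered, report = list(services), "doApplicationExecuting"
    elif action == "stop":
        ordered, report = list(reversed(services)), "doApplicationStopped"
    else:
        raise ValueError("action must be 'start' or 'stop'")

    lines = [f'nfnet {action} "{name}" /R || goto Failed' for name in ordered]
    lines += [
        f"echo NFCMD {report} {application}",
        "goto Done",
        ":Failed",
        f"Echo Failed to {action.capitalize()} some Services",
        ":Done",
    ]
    # Batch files conventionally use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    services = ["Service1", "Service2", "Service3", "Service4", "Service5"]
    print(render_script("Exch2k", services, "stop"))
```

The application name passed in must still match the name on the application screen exactly, including case.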

Any executables started using the “echo NFCMD doStartExecutable …” command can be stopped using either of two further NFCMD functions:

“echo NFCMD doStopExecutable…” or
“echo NFCMD doStopExecutables …”

The doStopExecutable command is used to stop any single executable protected by an application.
If the start script launches more than one executable for a protected application, they can all be stopped at once using the doStopExecutables command.
For example, taking the above example of an Office application running the Excel and WinWord executables, the following script would stop the Excel part of the application while allowing WinWord to continue:

Echo off
echo NFCMD doStopExecutable Office excel

The following script would stop both executables:

Echo off
echo NFCMD doStopExecutables Office

Working with the second example: Once all the associated executables of a protected application have successfully stopped, we need to flag to the Neverfail Heartbeat application manager that the global status of the application is stopped. This is achieved by adding the following line to our script:

echo NFCMD doApplicationStopped Office

The script should now look as follows:

Echo off
echo NFCMD doStopExecutables Office
echo NFCMD doApplicationStopped Office

How to create a Monitor Script

The application screen allows a third script to be defined for each protected application in addition to the start and stop scripts: the monitor script. This script is executed repeatedly at configured intervals while the application is started. It can be used to execute any command, but is often used to monitor the state of the services and executables launched by the start script. Neverfail service monitoring also monitors the protected application and is typically configured automatically at installation time.

Heartbeat defines a number of utility commands that monitor scripts can use. A monitor script can use the doMonitorExecutable command or doMonitorExecutables command to request Heartbeat to check that application executables started using doStartExecutable are still running. Using these commands, a script can request Heartbeat to check the state of an individual executable or all associated executables for a protected application.

Continuing the previous examples, the following script checks that the excel part of the application is still executing, without monitoring the word part:

Echo off
echo NFCMD doMonitorExecutable Office excel

This would monitor both word and excel for the Office application:

Echo off
echo NFCMD doMonitorExecutables Office

These functions check the list of processes on the system looking for the PID of the executable(s), and if any monitored executable’s PID is not present then the global status of the application is set to failed and an auto-switchover is triggered.
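
The check described above amounts to comparing the recorded PIDs against the PIDs currently running. The Python sketch below is only a conceptual model with hypothetical names and PID values; it is not Heartbeat's implementation and does not query a real process list.

```python
def missing_executables(recorded_pids, running_pids):
    """Return the logical names whose recorded PID is no longer running."""
    running = set(running_pids)
    return sorted(name for name, pid in recorded_pids.items()
                  if pid not in running)


def application_status(recorded_pids, running_pids):
    """Return 'Failed' if any monitored executable is gone, else 'Executing'."""
    if missing_executables(recorded_pids, running_pids):
        return "Failed"
    return "Executing"


if __name__ == "__main__":
    # Hypothetical PIDs recorded when the executables were launched.
    recorded = {"excel": 4120, "word": 4388}
    # Hypothetical snapshot of running PIDs: 4388 (word) is absent.
    print(application_status(recorded, [4, 812, 4120]))
```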

Scripts can also monitor services in a similar way by using the doMonitorService function, although the Neverfail Service Monitor described previously is more commonly used.
The doMonitorService function checks whether the service is present and running. If it has stopped, the global application status is set to failed and a switchover is triggered.
Returning to the earlier Exch2k example, which has five services to monitor, a monitor script for the application might look like the following:

Echo off
echo NFCMD doMonitorService Exch2k "Service1"
echo NFCMD doMonitorService Exch2k "Service2"
echo NFCMD doMonitorService Exch2k "Service3"
echo NFCMD doMonitorService Exch2k "Service4"
echo NFCMD doMonitorService Exch2k "Service5"

If any of the five services should stop for any reason then the monitor script would set the global application status to failed and trigger a switchover.
This approach does not, however, allow the service to be restarted locally before a switchover is triggered. To restart a service locally, the script needs to be written slightly differently.

Let’s look at a simpler example. The following start script is used to launch an application named MSindex; it starts the Windows Indexing Service:

nfnet start "cisvc.exe" /R || goto Failed
:Started
echo NFCMD doApplicationExecuting MSindex
goto Done
:Failed
Echo Could not start indexing service
:Done


The following utility can be used to see if the service is running:

NfmonService cisvc.exe

The NfmonService.exe utility checks whether the service is running and returns an errorlevel code.

Set svc=cisvc.exe
NfmonService %svc%
If not ErrorLevel 4 goto Fail
Goto End
:Fail
Echo Service has Failed %Errorlevel%
:End

The above script checks whether the service is in error. If an error is returned, the script echoes a message that the service has failed, together with the errorlevel code. If not, it jumps directly to the End label and exits.
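
When reading the 'If not ErrorLevel 4' line, it helps to remember how batch files evaluate this test: 'IF ERRORLEVEL n' is true when the last exit code is greater than or equal to n, not only when it equals n. The Python sketch below models just that rule. The exit codes that NfmonService actually returns are not listed in this article, so the loop simply walks a range of hypothetical values.

```python
def if_errorlevel(exit_code, threshold):
    """Mimic the batch test 'IF ERRORLEVEL threshold'.

    The test is true when the last exit code is greater than or equal to
    the threshold, not only when it is equal.
    """
    return exit_code >= threshold


def branch_taken(exit_code):
    """Model 'If not ErrorLevel 4 goto Fail' followed by 'Goto End'."""
    return "End" if if_errorlevel(exit_code, 4) else "Fail"


if __name__ == "__main__":
    # Hypothetical exit codes; shows which label each one would reach.
    for code in range(7):
        print(code, branch_taken(code))
```

Under this rule, the script above jumps to Fail for any exit code below 4 and falls through to End for 4 or higher.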

Let’s now replace the message with some code that will restart the failed service.

Set svc=cisvc.exe
NfmonService %svc%
If not ErrorLevel 4 goto Fail
Goto End
:Fail
Echo Restarting Indexing Service
nfnet start %svc%
:End

Finally, add a command that marks the application as failed, and hence triggers a switchover, if the service does not restart.

Set svc=cisvc.exe
NfmonService %svc%
If not ErrorLevel 4 goto Fail
Goto End
:Fail
Echo Restarting Indexing Service
nfnet start %svc%
Echo NFCMD doMonitorService MSindex %svc%
:End

The above script now checks to see if the service has failed. If it has failed, it tries to restart the service locally. If it fails to restart, a switchover will be triggered.
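
The three possible outcomes of this monitor script (healthy, restarted locally, or failed) can be traced with a small model. The Python sketch below is illustrative only: is_running and restart are hypothetical stand-ins for NfmonService and nfnet start, and FakeService is a test double, not a real service.

```python
def monitor_once(is_running, restart):
    """Run one monitoring pass.

    is_running and restart are hypothetical callables standing in for
    NfmonService and 'nfnet start'. Returns 'ok', 'restarted' or 'failed'.
    """
    if is_running():
        return "ok"
    restart()
    # Mirrors the final doMonitorService call: if the service is still
    # down after the local restart, the application is marked failed.
    return "restarted" if is_running() else "failed"


class FakeService:
    """Test double: a service that may or may not come back on restart."""

    def __init__(self, running, restart_works):
        self.running = running
        self.restart_works = restart_works

    def is_running(self):
        return self.running

    def restart(self):
        self.running = self.restart_works


if __name__ == "__main__":
    stubborn = FakeService(running=False, restart_works=False)
    print(monitor_once(stubborn.is_running, stubborn.restart))
```

Only the 'failed' outcome corresponds to a switchover; the other two leave the application on the active server.
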
In addition to checking services or executables, the script may have to check other properties of the protected application to make sure it is functioning properly. If the script performs additional checks, it may need to trigger a switchover when it detects a problem with the application. This can be done using the doApplicationFailed command:

REM Perform tests here
REM if tests ok goto TestOk
:TestFailed
Echo NFCMD doApplicationFailed Exch2k
:TestOk

The doApplicationFailed command tells Neverfail Heartbeat to mark the protected application as failed, and requests a switchover.

Note: Any script that requests a switchover should be used with caution. The protected application will be stopped on the currently Active Server and restarted on the other server, and replication will not be restarted. The user must check the state of the protected applications and data before restarting replication.


Applies To

Neverfail Heartbeat Version 5.2[.n] and Earlier


Related Information

None

KBID-995
