VMware vCenter Server Heartbeat - Out of Disk Behavior Explained

Summary

This Knowledgebase article provides information about vCenter Server Heartbeat Out of Disk behavior.


More Information

Active Server CommsMgr Logs

The queue on the active server stores intercepted data before it is sent across the channel to the passive server. A build-up in this queue indicates communication problems with the passive server, or insufficient bandwidth for the data being replicated. Queue statistics are displayed in the vCenter Server Heartbeat Console on the System -> Status & Control tab.

These updates are stored in memory or on disk in the default location (c:\VMware\VMware vCenter Server Heartbeat\r2\log). The maximum size on disk, MaxDiskUsage, is configurable and defaults to 1GB. Both the log location and MaxDiskUsage can be configured via the Configure Server wizard when vCenter Server Heartbeat has been stopped.

Channel Disconnect

The queue will be written out to disk if the active server is replicating and:

  1. The passive server was never connected.
  2. The channel suddenly disconnects and the configured number of heartbeats is very large.

In either case, the following could happen:

  • MaxDiskUsage is reached and an NFChannelExceededMaxDiskUsageException alert is logged.

"Exception in CommsMgr [L9] Exceeded the maximum disk usage(NFChannelExceededMaxDiskUsageException)"

  • If the available space on the drive is less than MaxDiskUsage, an NFChannelIOException will be logged.

"Exception in CommsMgr [M4] Cannot open log file 2004-09-07-203.log(NFChannelCannotOpenIOException) because there is not enough space on the disk (IOException)"

In both these situations, vCenter Server Heartbeat will:

  1. Cease to log updates to the data.
  2. Discard all existing logs.
  3. Generate an NFChannelLostMessageEvent when the channel reconnects.
  4. Initiate a Full System Check to get the system back in sync.
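
The sequence above can be pictured as a small state machine. The following Python sketch is a toy model for illustration only, not product code. The class ActiveQueueModel, its fields, and the "FullSystemCheck" label are hypothetical names; only the exception and event names are taken from this article. It also simplifies the out-of-space case to "the next update does not fit on the drive".

```python
from dataclasses import dataclass, field


@dataclass
class ActiveQueueModel:
    """Toy model of the active-server queue while the channel is down."""
    max_disk_usage: int            # bytes; stands in for MaxDiskUsage
    free_disk: int                 # bytes free on the log drive
    queued: list = field(default_factory=list)   # sizes of logged updates
    logging_enabled: bool = True
    lost_messages: bool = False
    events: list = field(default_factory=list)   # alerts raised, in order

    def used(self) -> int:
        return sum(self.queued)

    def enqueue(self, size: int) -> None:
        """Log one intercepted update of `size` bytes to disk."""
        if not self.logging_enabled:
            # Step 1 already happened: updates are no longer logged.
            self.lost_messages = True
            return
        if size > self.free_disk:
            # Drive ran out of space before MaxDiskUsage was reached.
            self._overflow("NFChannelIOException")
        elif self.used() + size > self.max_disk_usage:
            self._overflow("NFChannelExceededMaxDiskUsageException")
        else:
            self.queued.append(size)
            self.free_disk -= size

    def _overflow(self, alert: str) -> None:
        self.events.append(alert)
        self.logging_enabled = False      # step 1: cease logging updates
        self.free_disk += self.used()     # step 2: discard existing logs
        self.queued.clear()
        self.lost_messages = True

    def reconnect(self) -> None:
        """Channel comes back: report the loss and resynchronize."""
        if self.lost_messages:
            self.events.append("NFChannelLostMessageEvent")  # step 3
            self.events.append("FullSystemCheck")            # step 4
        self.logging_enabled = True
        self.lost_messages = False
```

In this model, the queue never recovers by itself after an overflow: only a reconnect followed by a full check brings the two servers back in sync, which matches the four steps listed above.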

Passive Server CommsMgr Logs

When data is received on the passive server, it is stored in the passive server (safe) queue until Apply is ready to update the protected file. Depending upon system load, the queue is held either in memory or on disk. Under normal operating conditions, this queue should remain small. The passive server (safe) queue statistics are displayed in the vCenter Server Heartbeat Console on the System -> Status & Control tab.

A build-up in the queue may indicate a problem applying updates to the protected files. Common causes are:

  • Hardware / software problems with the disk subsystem.
  • Under-specified equipment, for example disk drives on the passive server that are far slower than the active server disks.
  • Applications running on the passive server blocking updates.

When this queue gets very large, the protected application will begin to slow down to avoid overloading the passive server with updates. If the configured limit is reached, vCenter Server Heartbeat will raise an NFChannelExceededMaxDiskUsageException on the passive server. vCenter Server Heartbeat should then be shut down, choosing the option to leave the application running on the active server. The application's performance will return to normal.
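
When reviewing collected logs from either server, the CommsMgr messages quoted in this article can be picked apart with a short script. The Python sketch below assumes the message layout matches the two samples quoted earlier ("Exception in CommsMgr [code] text(ExceptionName)"); the function name parse_commsmgr_message is hypothetical, and real log files may wrap these messages in additional fields.

```python
import re

# Layout inferred from the sample messages in this article (an assumption).
HEADER = re.compile(r"Exception in CommsMgr \[(?P<code>\w+)\]\s*(?P<text>.*)")
EXC_NAME = re.compile(r"\((\w+Exception)\)")


def parse_commsmgr_message(message):
    """Return the bracketed code and any exception names, or None."""
    match = HEADER.search(message)
    if match is None:
        return None
    return {
        "code": match.group("code"),
        "exceptions": EXC_NAME.findall(match.group("text")),
    }


sample = ("Exception in CommsMgr [M4] Cannot open log file "
          "2004-09-07-203.log(NFChannelCannotOpenIOException) because "
          "there is not enough space on the disk (IOException)")
print(parse_commsmgr_message(sample))
```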

Remedial Action

The passive server's hardware should be investigated for problems.

  1. Start by checking the Windows Application and System logs for Errors and Warnings regarding impending hardware failure, or other problems.
  2. Check Device Manager for driver problems or malfunctioning RAID controllers.
  3. Alternatively, run system diagnostic checks that are supplied with the hardware.

Passive Server Lacks Protected Disk Space

This occurs when the active server has more disk space than the passive server. Protected data cannot be written to the passive server, so updates will fail. Apply will raise a Disk Full or Quota Exceeded exception in the log saying that it cannot create files. The system will attempt to stop.

"Error","Disk Full Or Quota Exceeded","[N27]Failed to write information for the file: D:\protected\some file.txt to the disk. Either the disk is full or the quota (for the SYSTEM account) has been exceeded."

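The Disk Full Or Quota Exceeded entry quoted above looks like a quoted, comma-separated record with three fields (severity, category, message). Assuming that layout holds for other entries, which this article does not confirm, a minimal Python sketch for extracting the error code and the affected file could look like this. The function name parse_apply_entry and the returned keys are hypothetical.

```python
import csv
import re

# Patterns inferred from the single sample entry in this article.
CODE_RE = re.compile(r"^\[(?P<code>\w+)\]")
PATH_RE = re.compile(r"for the file: (?P<path>.+?) to the disk\.")


def parse_apply_entry(line):
    """Split one quoted CSV log entry; return None if it does not fit."""
    fields = next(csv.reader([line.strip()]), [])
    if len(fields) != 3:
        return None
    severity, category, message = fields
    code = CODE_RE.match(message)
    path = PATH_RE.search(message)
    return {
        "severity": severity,
        "category": category,
        "code": code.group("code") if code else None,
        "path": path.group("path") if path else None,
    }


# Raw string so the backslashes in the Windows path are kept literally.
entry = (r'"Error","Disk Full Or Quota Exceeded","[N27]Failed to write '
         r'information for the file: D:\protected\some file.txt to the disk. '
         r'Either the disk is full or the quota (for the SYSTEM account) '
         r'has been exceeded."')
print(parse_apply_entry(entry))
```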
Safeguards

Employ the following to prevent Out of Disk Behavior:

  1. Ensure that the available disk space on a drive exceeds MaxDiskUsage. The default configuration requires 1GB of disk space on the vCenter Server Heartbeat installation drive.
  2. The passive server disk space should equal that of the active server.
  3. Consider hosting the VMware vCenter Server Heartbeat log directory on its own disk, or one that does not host:
    1. Protected Application data.
    2. The Windows System folder.
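
The first two safeguards can be checked with a few lines of script. The Python sketch below uses the standard library call shutil.disk_usage; the helper names are hypothetical, "1GB" is assumed to mean 2**30 bytes, and "equal" disk space is interpreted as the passive server having at least as much capacity as the active server.

```python
import shutil

ONE_GB = 1024 ** 3  # assumption: the article's "1GB" read as 2**30 bytes


def free_space_exceeds(free_bytes, max_disk_usage=ONE_GB):
    """Safeguard 1: free space on the drive must exceed MaxDiskUsage."""
    return free_bytes > max_disk_usage


def passive_matches_active(active_total, passive_total):
    """Safeguard 2: passive capacity should not be below the active's."""
    return passive_total >= active_total


def check_log_drive(path, max_disk_usage=ONE_GB):
    """Apply safeguard 1 to the drive that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return free_space_exceeds(usage.free, max_disk_usage)
```

On a server, check_log_drive would be called with the log directory, for example the default location given earlier in this article, and with the configured MaxDiskUsage value if it differs from the default.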

Applies To

All Versions


Related Information

None

KBID-1698
