Channel: VMware Communities : Discussion List - Site Recovery Manager

hbrsrv daemon on VRMS appliance fails to start - solved


I am posting this problem and its resolution in the hope that it will help others. A few months after installation, the HBR component of SRM partly failed. Our setup has one VRMS and two VRS appliances, and a few dozen replicated VMs. One third of the replications got stuck, and it turned out that all of them were handled by the VRMS appliance at the destination site.

 

The source ESXi hosts logged many refused connections to the VRMS appliance. Because of this I suspected a firewall misconfiguration, but tcpdump showed that the TCP requests arrived at the appliance and were refused by it. It finally turned out that the hbrsrv daemon was not running on the VRMS, so TCP ports 31031 and 44046 were closed.
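A minimal sketch of how the closed ports could be confirmed on the appliance itself, assuming `netstat` and `awk` are available there. The helper name `is_listening` is made up for this illustration and is not part of the appliance; it only parses `netstat -tln` output read from stdin.

```shell
# is_listening PORT: read `netstat -tln` output on stdin and succeed
# if some TCP socket is in LISTEN state on PORT.
# (Hypothetical helper for illustration only.)
is_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            # Field 4 is the local address, e.g. 0.0.0.0:31031 or :::31031;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Possible usage on the appliance:
#   for p in 31031 44046; do
#       if netstat -tln | is_listening "$p"; then
#           echo "port $p: listening"
#       else
#           echo "port $p: closed"
#       fi
#   done
```

If both ports show as closed while the firewall is known to be fine, the daemon that should own them is the next thing to check.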

 

I tried to start hbrsrv manually on the appliance with the "/etc/init.d/hbrsrv restart" command, without success. However, the appliance logged the following lines to /var/log/messages:

 

Sep 17 13:34:42 xxxxx watchdog-hbrsrv: [8818] Begin 'cgexec -g memory:/hbrsrv /usr/bin/hbrsrv --daemon --pidfile /var/run/vmware/hbrsrv.pid --vmodlport 8123 --lwdport 31031,44046', min-uptime = 60, max-quick-failures = 5, max-total-failures = 1000000, bg_pid_file = '/var/run/vmware/hbrsrv.pid'

Sep 17 13:34:42 xxxxx watchdog-hbrsrv: [8818] Executing 'cgexec -g memory:/hbrsrv /usr/bin/hbrsrv --daemon --pidfile /var/run/vmware/hbrsrv.pid --vmodlport 8123 --lwdport 31031,44046'

Sep 17 13:34:42 xxxxx su: FAILED SU (to hbrsrv) root on none

 

The last line was the key to the solution. I checked the hbrsrv account:

 

xxxxx:/var/log # chage -l hbrsrv

Minimum:        1

Maximum:        90

Warning:        7

Inactive:       -1

Last Change:            Jun 05, 2013

Password Expires:       Sep 03, 2013

Password Inactive:      Never

Account Expires:        Never

 

Yes, the password of the account had expired (the account itself was set to never expire). I changed it to never expire with "chage hbrsrv", rebooted the VRMS appliance, and all the stuck replications started working again. Just to be on the safe side I checked the same account on the VRS appliances, but on both of them it was already set to never expire.
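The same check can be scripted so that an expiry date is noticed before the daemon stops. This is a rough sketch, assuming `chage -l` prints a "Password Expires" line as in the output above. The helper name `password_expiry` is invented for this example. The non-interactive `chage -M -1` shown in the comments is an alternative to the interactive "chage hbrsrv" used above; according to the shadow-utils man page a MAX_DAYS of -1 removes password validity checking, but whether the appliance's chage behaves the same way is an assumption.

```shell
# password_expiry: read `chage -l ACCOUNT` output on stdin and print the
# value of the "Password Expires" field, e.g. "Sep 03, 2013" or "Never".
# (Hypothetical helper for illustration only.)
password_expiry() {
    awk -F':' '
        # Match the field name case-insensitively, allowing trailing padding.
        tolower($1) ~ /^password expires[ \t]*$/ {
            sub(/^[ \t]+/, "", $2)   # trim leading whitespace from the value
            sub(/[ \t]+$/, "", $2)   # trim trailing whitespace
            print $2
        }
    '
}

# Possible usage on the appliance, as root:
#   chage -l hbrsrv | password_expiry    # show the current expiry date
#   chage -M -1 hbrsrv                   # assumption: disable password aging
#   chage -l hbrsrv | password_expiry    # should now report that it never expires
```

Running the check against every service account on the VRMS and VRS appliances would show whether any other daemon is heading for the same failure.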

 

Has anybody here had a similar experience? Our VRMS is version 5.1.1.0 Build 1079383.

