We have two physical data centers, each with its own vCenter installation and an EMC VNX storage array. We are running vSphere 5.1 and are starting to work with SRM.
The SRM install at each site has been in place for a few months, with the appropriate SRAs for the EMC array and the MirrorView replication. SRM sees everything at both sites.
Today we did our first "test" failover, with a protection group we set up on its own LUN with a couple of test VMs. It worked, but there were some odd errors while the test plan was running.
The recovery site has 7 hosts in the cluster. They're all set up to see the same set of datastores/LUNs in the storage array. During the "test" run, after SRM created the writable snapshot on the storage array, we saw all the hosts rescanning the HBAs and refreshing the storage, but on three of the hosts there was a "Mount VMFS Volume" task with the error "The operation is not allowed in the current state". However, after seeing that error, when I go to Configuration/Storage on the host, it DOES show the datastore mounted properly. One of the test VMs was on one of the hosts with the mount error, but the VM did come up OK.
After I did the cleanup, I ran the test again. This time three of the hosts still logged the error, but one of them was a different host than before.
Looking in the VMware logs on the SRM server, everything seems OK until a block of text where it appears to be verifying that the datastore is mounted on all the hosts:
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-11'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is unmounted on host 'host-145'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-148'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-166'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is unmounted on host 'host-296'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-321'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is unmounted on host 'host-39'
After this, it attempts to mount the VMFS volume on the three hosts (145, 296, and 39) that show "unmounted". It then shows the "failed to mount", and "the operation is not allowed in the current state" errors.
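For anyone who wants to compare the affected hosts between test runs, here is a minimal Python sketch that pulls the mounted/unmounted hosts out of log lines like the ones above. It assumes the lines look exactly like the excerpt I pasted; the function name `hosts_by_state` and the `SAMPLE` string are just for illustration and are not part of SRM.

```python
import re

# Matches lines of the form quoted above (format assumed from the pasted excerpt).
LINE_RE = re.compile(
    r"VMFS volume '(?P<uuid>[^']+)' is (?P<state>mounted|unmounted) "
    r"on host '(?P<host>[^']+)'"
)

def hosts_by_state(log_text):
    """Group host IDs by the mount state reported for them in the log text."""
    result = {"mounted": [], "unmounted": []}
    for match in LINE_RE.finditer(log_text):
        result[match.group("state")].append(match.group("host"))
    return result

# The excerpt from the first test run, copied from the SRM log.
SAMPLE = """\
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-11'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is unmounted on host 'host-145'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-148'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-166'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is unmounted on host 'host-296'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is mounted on host 'host-321'
VMFS volume '524f0da8-51f0e8bb-2277-0017a47708e0' is unmounted on host 'host-39'
"""

states = hosts_by_state(SAMPLE)
print(states["unmounted"])  # ['host-145', 'host-296', 'host-39']
```

Running it against the same block from each test run and comparing the "unmounted" lists makes it easy to see whether the set of affected hosts changes from run to run.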
Since the datastores are getting mounted, I'm wondering if it's a timing issue: SRM may be checking the mount status too quickly, while the datastore mount is still in progress on those hosts. It sees them as unmounted and tries to mount them, but by then they've already mounted automatically.
Any suggestions would be appreciated.