Monday, March 23, 2009

HA VMs and the Failover Clustering wizard

Here is a gotcha that just came out of the Hyper-V forums.

I have been assisting a person with a problem he was having with his VMs having to fail over as a group. We walked through the most common cause: the VMs share a single storage LUN. But the issue persisted. So we went backward.

In the end it was all about the process of making a VM Highly Available in the Failover Clustering manager.

Now, the process has been well documented and blogged about - so I am not going to show screen shots or outline the process. But I will mention the gotcha.

Here is a rule of thumb that we crusty clustering folks don't think about anymore. We just do it, and because of that I didn't even think about the root of the issue that this individual ran into.

Here is the rule of thumb: One run of the Failover Cluster "add resource" wizard equals one Highly Available resource.

You might think, well, yeah. And for those of us who began with Clustering under NT4, this was a requirement. However, with the ease of use of the new wizard in Failover Cluster Manager, I just don't consciously think about it anymore.

Why is this an issue?

Ah, here is the scenario.

I open the wizard to make a number of VMs Highly Available. To save some time, I add multiple VMs and complete the wizard. I think, great, I am done.

Now, on the backside, Failover Clustering has actually just grouped all of these VM resources together as a single Highly Available entity that is composed of many workloads.

This is where Failover Clustering takes over, and Hyper-V is just an engine.

Failover Clustering deals in 'workloads.'

In the workload world that can be a website, plus a COM service, plus a LUN, plus a database server - all distinct entities, but all dependent upon each other. In Failover Clustering you would set these up as a single workload, and they would come online in a specific order.
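To make the "specific order" idea concrete, here is a rough sketch in Python. It is not how Failover Clustering is implemented and it does not call any cluster API. The resource names and the dependency chain (database needs the LUN, COM service needs the database, website needs the COM service) are assumptions made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependencies inside ONE workload: resource -> what it needs first.
# These edges are an assumption for illustration, not taken from a real cluster.
depends_on = {
    "database": {"lun"},
    "com_service": {"database"},
    "website": {"com_service"},
}

# A topological sort gives an order in which every resource comes online
# only after the resources it depends on.
online_order = list(TopologicalSorter(depends_on).static_order())
print(online_order)
```

The whole dictionary is one workload: the resources are distinct, but they are brought online together and in dependency order.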

When we talk VMs - Failover Clustering only has a rule that says, 'Oh, a VM is composed of a configuration file, plus a VHD, plus the volume the VHD resides on' and it is the logic in the wizard that takes care of setting all three of these items as a single Highly Available workload.
(Go ahead; take a look at the details of a Highly Available VM).

The point is that each VM must be set up individually. One invocation of the wizard = one VM.
If you add multiple VMs at the same time then Failover Clustering considers them multiple components of the same workload and will keep them together and fail them over between hosts as a single unit.
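The behavior can be pictured with a small toy model in Python. This is only a sketch of the rule of thumb, not the Failover Clustering API: the class names, the `run_wizard` and `fail_over` methods, and the VM and host names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Workload:
    """Toy stand-in for one highly available group (not a real cluster object)."""
    vms: list
    owner: str  # host currently running every VM in this group

@dataclass
class ToyCluster:
    workloads: list = field(default_factory=list)

    def run_wizard(self, vms, owner):
        # One run of the wizard yields exactly one workload,
        # no matter how many VMs were selected in that run.
        workload = Workload(vms=list(vms), owner=owner)
        self.workloads.append(workload)
        return workload

    def _find(self, vm):
        for workload in self.workloads:
            if vm in workload.vms:
                return workload
        raise KeyError(vm)

    def fail_over(self, vm, target):
        # Failover acts on the whole workload that contains the VM,
        # so every VM in that workload moves to the target host.
        self._find(vm).owner = target

    def host_of(self, vm):
        return self._find(vm).owner

# Both VMs added in a single wizard run: they move together.
grouped = ToyCluster()
grouped.run_wizard(["vm1", "vm2"], owner="hostA")
grouped.fail_over("vm1", "hostB")
print(grouped.host_of("vm2"))

# One wizard run per VM: each VM fails over on its own.
separate = ToyCluster()
separate.run_wizard(["vm1"], owner="hostA")
separate.run_wizard(["vm2"], owner="hostA")
separate.fail_over("vm1", "hostB")
print(separate.host_of("vm2"))
```

In the first case vm2 ends up on hostB even though only vm1 was failed over; in the second case vm2 stays on hostA.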

This can lead to all kinds of confusion. Such as: why can't I have each VM on its own host? Why can't I fail them over individually?

Another cause of this behavior is the LUN, when multiple VMs share a single LUN, but that is not the issue here.

When might this be something that you want to do?

You would want to do this if you had VM entities that were related or required each other or required an internal virtual network to communicate.

One example is an IIS server that must sit behind a separate VM firewall. In this case, make them fail together, as a single workload.

In the end, I hope this helps broaden the understanding of how Hyper-V extends and depends on other Windows services to provide features.
