Thursday, June 11, 2009

Virtual Machine storage considerations

Storage.

Storage is your issue.

Storage is all about design and deployment.

Passthrough disks were first used for SQL servers, file servers, and Exchange servers: workloads that all require large storage volumes with high disk IO.

Using passthrough dedicates a physical storage resource to a VM.  Before you can do that, you have to carve up the physical resource.

The negative is that you lose flexibility in HA, failover, etc.  Not that it cannot be done with proper planning, but it isn't just plug, click, and go.  It does take planning, equipment, and design.

I know that lots of folks are producing incredibly large VHDs and using them as storage for VMs.  What does this give you?  A single VHD that you can back up and restore at the host level.

Otherwise, all of your backup happens at the machine level, with a traditional backup agent inside the VM backing up the volume.
In my mind, it is all about how you design it and want to recover it.

After working through a Disaster Recovery exercise for a particular application, I frequently found myself re-architecting the deployment so I could get not only good running performance, but also a fast, easy-to-execute recovery of the system.

Our most limiting factor was frequently the time to recover the system from the backups (disk or tape).
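
To put rough numbers on that, here is a minimal sketch of a restore-time estimate.  The function name, the sizes, and the 50 MB/s throughput are all hypothetical placeholders, not measurements from any real backup system.  It only shows how restore time scales with the amount of data you have to pull back.

```python
# Rough restore-time estimate. All figures are hypothetical placeholders,
# not measurements from any real backup system.

def restore_hours(data_gb, throughput_mb_per_s):
    """Hours needed to stream data_gb back at a sustained throughput."""
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 3600

# One large 500 GB VHD versus a 40 GB OS VHD, both at an assumed 50 MB/s.
big_vhd = restore_hours(500, 50)
os_vhd = restore_hours(40, 50)
print(f"500 GB VHD: {big_vhd:.1f} h, 40 GB OS VHD: {os_vhd:.1f} h")
```

Plug in your own sizes and the throughput you actually measured in a DR exercise; the point is to know the number before you need it.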

Again, it is all about design.

The most humbling DR exercise is recovering the backup system itself.  It is frequently overlooked, but that is a different story.

As far as tweaking goes: no.  Don't tweak storage, design smart.
Split the spindles, spread the load.  Is putting two disk intensive servers on the same RAID 5 array better or worse?  Could that big array be split in two so one VM does not limit the other?

This is the big thing with storage and VMs.

One consideration is volume (gigabytes / terabytes).
The second consideration is performance.  Unlike RAM and processor, the hardware IO bus is not carved into virtual channels.  It frequently becomes THE limiting resource, especially when you have multiple disk intensive VMs fighting for that same primary resource.  In this case it is not a pool, it is a host resource.  It is finite.  It takes planning.

VM A will limit VM B (and vice versa) when they fight for the same read / write heads on the same disk array.
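
A toy model makes the point.  This is a sketch under a loud assumption: that an overloaded array hands out its IOPS in proportion to what each VM asks for.  Real arrays are messier than that, and the function name and every number here are made up for illustration.

```python
# Toy contention model. Assumes an overloaded array serves each VM in
# proportion to its demand; real arrays do not behave this neatly.

def share_array(array_iops, demands):
    """Return the IOPS each VM actually gets from one finite array."""
    total = sum(demands.values())
    if total <= array_iops:
        # Enough headroom: every VM gets what it asked for.
        return dict(demands)
    # Oversubscribed: scale everyone down proportionally (rounded down).
    return {vm: demand * array_iops // total for vm, demand in demands.items()}

# Both VMs on one hypothetical 1000 IOPS array.
shared = share_array(1000, {"VM A": 900, "VM B": 300})
# VM B alone on a smaller hypothetical 400 IOPS array.
isolated_b = share_array(400, {"VM B": 300})

print(shared)
print(isolated_b)
```

In the shared case VM B wants only 300 IOPS and still comes up short, purely because VM A is hammering the same array.  On its own smaller array it gets everything it asks for.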

This is where you must think about the VMs that you place, where you put their OS VHD, where you put their data, how you lay out that storage, and how you present it.

This is where the SAN argument really wins: the throughput, the carving of storage, and the sheer number of spindles and heads really shine.

If you are resource limited and can't afford the SAN, then think about the workloads that you are placing and how you divide the physical resources.  Give each disk intensive VM enough to do its job, but isolate them from each other.

Another strategy is multiple hosts.  Each host gets one disk intensive VM, and all of its other VMs have low disk IO.  This way they have less IO effect upon each other.
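
That placement rule can be sketched in a few lines.  This is only an illustration: the `place_vms` helper and the VM names are hypothetical, and it does nothing more than put at most one disk intensive VM on each host and then deal the light VMs out round-robin.

```python
# Placement sketch: at most one disk intensive VM per host, light VMs
# spread round-robin. Helper name and VM names are hypothetical.

def place_vms(vms, host_count):
    """vms is a list of (name, is_disk_intensive). Returns one list per host."""
    hosts = [[] for _ in range(host_count)]
    heavy = [name for name, intensive in vms if intensive]
    light = [name for name, intensive in vms if not intensive]
    if len(heavy) > host_count:
        raise ValueError("more disk intensive VMs than hosts")
    # One heavy VM per host, so they never share a host's disks.
    for host, name in zip(hosts, heavy):
        host.append(name)
    # Light VMs fill in around them.
    for i, name in enumerate(light):
        hosts[i % host_count].append(name)
    return hosts

placement = place_vms(
    [("sql", True), ("exchange", True), ("web", False),
     ("dns", False), ("print", False)],
    host_count=2,
)
print(placement)
```

If you have more disk intensive VMs than hosts, the sketch refuses to place them, which is the cue to go back to splitting arrays or making the SAN argument.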

Be creative.
