Getting your storage repositories right is one of the more important aspects of planning and deploying Oracle VM. Throwing everything into one or two different storage repositories is not very robust nor is it conducive to backups, routine maintenance, high availability or disaster recovery. Conversely, allocating a storage repository for each individual Oracle VM guest is not scalable or practical.

You need to put some thought into the way you design and deploy Oracle VM storage repositories to gain maximum flexibility, scalability and recoverability. So, here are a few things to think about...

SAN vs. NAS

For the purpose of this blog, we will use the generic terms of SAN to mean block level storage using iSCSI or Fibre Channel and NAS to mean file level storage such as NFS. I have a deep background in Fibre Channel networks and created and maintained a large automated server deployment scheme built around Fibre Channel SAN boot in my previous role here at Oracle as a production data center system admin. I’ve been working with SAN since 1996 when the only option was direct connect, with arbitrated loop just being introduced into the data center and native Fibre a future goal.

I mention this only because, despite that background, my preferred storage protocol for Oracle VM repositories is NFS. This is a personal preference: everyone is familiar with NFS, it is easy to implement and troubleshoot, it scales nicely, performs at 7GB/s to 9GB/s and provides a number of flexible options for Oracle VM. The two options I value most with NAS are that NAS storage repositories can span multiple server pools within the same Oracle VM Manager, and that the page83 ID is not an issue for DR (a SCSI disk page83 ID is sort of analogous to a network MAC address). Being able to span multiple server pools solves a lot of problems and provides great flexibility.

However, don’t be too hasty in making a decision to go with NAS. SAN offers better performance, the ability to present LUNs as physical disks to guest operating systems, and OCFS2, which allows Oracle VM to take advantage of hot cloning for Oracle VM guests. I just think the current limitations of OCFS2 storage repositories, coupled with the complexity of troubleshooting SAN, make this a less attractive option.
More on this subject in a later blog...

Designing a Scalable Deployment Architecture

Now let’s discuss the most important aspect of storage repositories: deploying them in such a way that makes maintenance, backups, restores and disaster recovery easier and more flexible. The importance of designing a robust repository scheme is vastly underestimated and leads to a frustrating user experience.

Creating a single big repository, or many small repositories that each contain a single virtual machine, is a very big mistake for a number of reasons. Spend some time thinking about the best approach for deploying storage repositories that will fit your requirements for years to come.

Keep virtual machines with the same backup requirements together

The illustration below shows one of many approaches to designing storage repositories – use it only as a starting point for creating something that works for your needs.

This scheme combines entire business systems into individual repositories; a single business system would have multiple Oracle VM guests that are all related to each other. You might have several VMs running databases, some running middleware, others running application servers, web portals, etc. all in the same repository – they make a discrete unit.

Notice that they are further divided by production, UAT, development and data center management. Each of these repositories contains Oracle VM guests that can be backed up, restored, put through routine maintenance or switched over to another data center without impacting any of the business systems in the other repositories.

Keep in mind that these repositories would also belong to different server pools; perhaps the production HR and OPS are in one pool, with production CRM and SCM in a separate pool while the development CRM and SCM belong to yet another pool. However you decide to deploy repositories into pools and what belongs in each repository is completely up to you.

There are many other scenarios that can be accommodated using a similar mindset: divide your Oracle VM guests into repositories with similar maintenance, backup and restore requirements.
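As a rough planning aid, the grouping idea can be sketched in a few lines of Python. This is only an illustration: the guest names, the environment labels and the repository naming pattern are all made up for the example, and nothing here talks to Oracle VM Manager.

```python
from collections import defaultdict

# Hypothetical inventory: (guest name, environment, business system).
# None of these names come from a real deployment.
GUESTS = [
    ("hr-db01", "prod", "HR"),
    ("hr-app01", "prod", "HR"),
    ("crm-db01", "prod", "CRM"),
    ("crm-web01", "dev", "CRM"),
    ("scm-app01", "uat", "SCM"),
]


def plan_repositories(guests):
    """Assign every guest to one repository per (environment, business system)."""
    plan = defaultdict(list)
    for name, environment, system in guests:
        # The naming pattern is an assumption made for this sketch.
        repo = f"{environment}_{system.lower()}_repo"
        plan[repo].append(name)
    return dict(plan)


if __name__ == "__main__":
    for repo, members in sorted(plan_repositories(GUESTS).items()):
        print(f"{repo}: {', '.join(members)}")
```

Grouping on the pair of environment and business system keeps each repository a discrete unit for backup and DR purposes; you could just as easily group on a backup schedule or a recovery tier if that fits your needs better.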

Keep Oracle VM Guest resources together

Some people will be tempted to spread virtual disks across multiple storage repositories for “performance” reasons. Resist this very bad idea with as much energy as you can muster. The “performance gain” from striping virtual disks across multiple repositories is dubious at best; some people will even consider creating and allocating guest resources across multiple storage repositories using higher performance volumes for some virtual disks and lower performance for others.

There are three very important reasons that you should avoid spreading Oracle VM guest files across multiple repositories:

The first reason you don’t want to do this is to avoid creating a very complex and high maintenance scheme when deploying Oracle VM guests. No matter how much you document it, only the person who designed such a scheme will ever remember to follow it when cloning or creating new virtual machines. This usually leads to a significant amount of wasted time correcting the location of misallocated virtual disks once the faux pas has been discovered, usually at the end of a long and frustrating experience troubleshooting why a virtual machine is not behaving correctly.

The second reason this is a bad idea is that you have to ensure that all repositories are presented to all Oracle VM servers in the pool without fail. I cannot tell you how often a missed repository prevents Oracle VM guests from migrating to other servers in the pool. It is easy to miss if you have a lot of storage repositories, and it can be quite frustrating to troubleshoot why a virtual machine failed to start on another server, only to find that one virtual disk out of many resides on a repository someone forgot to present to that server. It is very hard for your average admin to even figure out which storage repositories the various files reside in, so piecing together exactly which repository is missing becomes a long and arduous task.

The third reason you don’t want to spread files related to a specific virtual machine across multiple repositories is that somewhere down the road, you are going to need to retire or migrate one or more of the repositories. It will become a very frustrating exercise when you can’t release a repository because it still contains a virtual disk belonging to an Oracle VM guest that resides in another repository and is not part of the retirement/migration project. This leads to another long, complex project to consolidate the virtual disks belonging to virtual machines into a single location before you can proceed with the original project of retiring or migrating a repository.
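If you keep even a simple inventory of where virtual disks live and which repositories each server can see, the second and third problems above are easy to catch early. Here is a minimal sketch in Python, assuming you maintain that inventory yourself; the dictionaries, guest names, server names and repository names are all hypothetical, and the script does not query Oracle VM in any way.

```python
# Hypothetical inventory, e.g. maintained by hand or exported from your own records.
# Guest -> {virtual disk file: repository holding it}
VDISK_LOCATIONS = {
    "crm-db01": {"system.img": "prod_crm_repo", "data.img": "prod_crm_repo"},
    "hr-db01": {"system.img": "prod_hr_repo", "data.img": "fast_tier_repo"},
}
# Server in the pool -> repositories presented to it
PRESENTED = {
    "ovs01": {"prod_crm_repo", "prod_hr_repo", "fast_tier_repo"},
    "ovs02": {"prod_crm_repo", "prod_hr_repo"},
}


def guests_spanning_repositories(vdisk_locations):
    """Return guests whose virtual disks live in more than one repository."""
    spanning = {}
    for guest, disks in vdisk_locations.items():
        repos = set(disks.values())
        if len(repos) > 1:
            spanning[guest] = sorted(repos)
    return spanning


def missing_presentations(presented):
    """Return, per server, repositories seen elsewhere in the pool but not by it."""
    all_repos = set().union(*presented.values())
    return {
        server: sorted(all_repos - repos)
        for server, repos in presented.items()
        if all_repos - repos
    }


if __name__ == "__main__":
    print("Guests spread across repositories:", guests_spanning_repositories(VDISK_LOCATIONS))
    print("Repositories missing per server:", missing_presentations(PRESENTED))
```

With the sample data, the first check flags hr-db01 because its data disk sits in a different repository from its system disk, and the second flags ovs02 because it cannot see the repository that data disk lives on. That is exactly the combination that stops a guest from starting on another server.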

Conclusion

There are many factors that go into designing a robust, scalable and low maintenance deployment architecture for your storage repositories. The bottom line is that you need to do a little planning that takes into consideration how virtual machines are related to each other, including their routine maintenance, backup and disaster recovery requirements.

I will endeavor to write more about this topic in the near future. In the meantime, please read the latest Oracle VM concepts guide for more information and ideas.