There has been some speculation about VMware announcing a new hypervisor core capability at the upcoming VMworld. Stu Miniman at Wikibon noticed it first, and I worked with Stephen Foskett on this article about how a hypervisor for networking could be used to deliver network services (as opposed to servers or applications) from within a virtual machine.
The Real Impact
However, I was mulling over what the purpose of this would be, and what would drive VMware to invest what are obviously significant resources to develop a new hypervisor kernel that supports real-time guest operating systems. What is the business motivation?
I have a hypothesis. And it’s related to vMotion.
The Problem with vMotion
Today we have a situation where a given guest OS can be readily migrated between servers, but the data flow over the network must follow a fixed path. That is, you can move the server, but the network equipment doesn't move. The keystone problem is that DNS name resolution does not allow for sub-second IP address changes; the OS vendors are highly resistant to developing new TCP/IP implementations and, equally, customers are resistant to change.
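To see why DNS can't move a server's address in real time, here is a minimal back-of-the-envelope sketch. The TTL and cache-depth figures are illustrative assumptions, not measurements:

```python
# Illustrative worst case: after an IP address changes, every caching layer
# between the client and the authoritative server may keep serving the stale
# record for up to one full TTL. Figures below are assumptions, not data.
ttl_seconds = 300   # many recursive resolvers won't honour TTLs much lower
cache_layers = 2    # e.g. the OS stub resolver plus a corporate resolver

worst_case_stale = ttl_seconds * cache_layers
print(f"Worst-case window of stale resolution: {worst_case_stale} seconds")
```

Even under these fairly generous assumptions, convergence is measured in minutes, not in the sub-second timescale a vMotion event would need.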
Therefore, if you follow the VMware methodology, the network needs to become dynamic as well. Today, the paths that Ethernet frames take from network appliance to server are fixed. So, let's take a reasonably simple, service-oriented data centre design with a firewall, a load balancer and the servers. In the following diagram, you can see the typical packet flow: traffic from, say, the Internet comes through the front end and down through the stack to the servers.
Now, consider what happens to your data center if you extend the Layer2 domain between data centers.
But the real problem is how the client application accesses that server. Consider a flow that comes in from the Internet and attempts to connect to the server that has moved to an alternate data centre:
At this point you have a lot of bandwidth and latency problems. If the latency between those data centres is more than, say, 100 milliseconds (who cares how many kilometres that is, it's the propagation delay that matters), then EVERY USER SESSION is going to become significantly delayed (meaning calls to the help desk complaining about slow performance).
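The arithmetic behind that claim is simple. A rough sketch, where the distance and the number of round trips per request are assumptions for illustration:

```python
# Light in fibre travels at roughly 2/3 the speed of light in a vacuum.
SPEED_IN_FIBRE_KM_PER_S = 200_000

distance_km = 5_000  # assumed inter-data-centre distance, for illustration

one_way_ms = distance_km / SPEED_IN_FIBRE_KM_PER_S * 1000   # 25.0 ms
rtt_ms = 2 * one_way_ms                                      # 50.0 ms

# Even a trivial request costs at least the TCP handshake (1 RTT)
# plus the request/response itself (1 RTT).
added_delay_ms = 2 * rtt_ms
print(f"Extra delay per request: {added_delay_ms} ms")       # 100.0 ms
```

And that is the physics-only floor; queuing, serialisation and any chattiness in the application protocol multiply those round trips further.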
Where the VMware Real Time Hypervisor comes in
So let’s make the following assumptions:
- That VMware is ready to announce and ship a hypervisor that offers some form of CPU scheduling that is acceptable to network appliance manufacturers such as Bluecoat, F5, Citrix, Vyatta etc. (yes, Vyatta is a big winner here, and their CEO even hinted at this in a recent blog post).
- That customers actually believe it will work.
- That VMware doesn’t overprice the licensing: I think this is a way for VMware to charge for the hypervisor again, which is what they want. Giving away VMware for free really bites their CEO.
Now let’s project what happens if the routers, load balancers and firewalls are all virtual appliances on a VMware infrastructure, and what could happen if they were capable of using vMotion to migrate at the same time as the servers.
So, with a bit of service orchestration, if a service had separate instances of the network appliances that handle its security, load balancing and routing, and those appliances were virtualised using VMware, then the whole service, network path included, could move as a unit.
The EtherealMind View
A lot of guesswork here, but vMotion is suitable within a data centre; it is not suitable for use between data centres because of the reliance on Layer 2, in the absence of a better name resolution process or better TCP/IP protocols (and I agree completely with Ivan Pepelnjak on his points).
For VMware to continue to grow as a cloud technology, they have to solve or help to solve the networking problem. This might be a useful technique for certain workloads for private clouds (as hinted by Vyatta CEO here).
The missing link is the orchestration. That is, you would have to define rules for the group failover or migration of network devices as the servers migrate within the internal compute space. Therefore you would need a unique set of network devices for every application. That’s a lot of network appliances, and a lot of licenses.
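A sketch of what such an orchestration rule might look like, assuming one dedicated set of appliances per application. All names and the API here are hypothetical, purely to illustrate the grouping idea:

```python
from dataclasses import dataclass

# Hypothetical sketch: a rule that ties an application's network appliances
# to its servers so they fail over or migrate as one group.
@dataclass
class MigrationGroup:
    service: str
    servers: list      # the application's server VMs
    appliances: list   # firewall, load balancer, router VMs for this service

    def migration_order(self):
        # Move the network path first, then the servers, so the traffic
        # has somewhere to land when the servers arrive at the new site.
        return self.appliances + self.servers

web_shop = MigrationGroup(
    service="web-shop",
    servers=["web01", "web02"],
    appliances=["fw-webshop", "lb-webshop", "rtr-webshop"],
)
print(web_shop.migration_order())
```

Note that the group is per-application: each service drags its own firewall, load balancer and router along with it, which is exactly why the appliance and license count explodes.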
Still, it could work for certain limited use cases. I wonder if this is what VMware has in mind?
IOSHints Live San Jose – Meet Greg Ferro and Ivan Pepelnjak
NIL Data Communications is organizing the first ever IOSHints Live event with technology blogger and Cisco Press book author Ivan Pepelnjak (CCIE #1354) in San Jose, California on September 15th 2010. IOSHints Live is your chance to meet Ivan, discuss emerging technologies and review typical network designs using them. Greg Ferro (CCIE #6920) from the Packet Pushers Podcast and EtherealMind blog will also be joining Ivan to discuss wider issues, including servers and storage networks and their integration with modern network designs.
The morning session will cover data center design and migration from the traditional data center toward private and public cloud solutions. The afternoon session will focus on resilient and highly available VPN solutions needed to connect remote sites with the redesigned data center.
The sessions will focus on the design aspects of real-life issues relevant to the session’s participants. It’s highly recommended that you submit a network design you’d like to discuss, or challenges you’re facing in your network, at least a week before the event; this will ensure that its key components will be discussed during the session.