Why the VMware vSwitch doesn’t scale, and isn’t a network switch

So we all know that the vSwitch is a software connection between guest VMs in a single VMware server. And it seems natural, even intuitive, that communication between two servers connected to the same vSwitch must be good.

Well, yes, but mostly no. Here is my logic.

Let’s take this scenario of four VMs in a single server chassis:

[Figure: four VMs connected to a vSwitch in a single chassis]

and there are traffic flows between servers that are on the same vSwitch / VLAN, like so:

[Figure: traffic flows between VMs on the same vSwitch / VLAN]

Now this seems like a Good Thing™ and all is well. Traffic flows are localised in the chassis and the packets are all sorted. You’d think there is nothing to worry about: no need to tell the network team, just leave it at that, and so on. The server will burn a bit of CPU and memory to handle the packet forwarding, but that’s easy.
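To put a rough number on that “bit of CPU”, here’s a back-of-envelope sketch in Python. Every constant in it (average frame size, cycles per packet, core clock) is an illustrative assumption, not a measured figure:

```python
# Back-of-envelope: CPU cores burned by software packet forwarding.
# All constants below are illustrative assumptions, not measurements;
# real cost varies widely with hypervisor version and NIC offloads.

def switching_cpu_cores(throughput_gbps: float,
                        avg_frame_bytes: int = 800,     # assumed average frame size
                        cycles_per_packet: int = 2500,  # assumed software-switching cost
                        core_ghz: float = 2.5) -> float:
    """Rough number of CPU cores consumed forwarding at a given rate."""
    packets_per_sec = (throughput_gbps * 1e9 / 8) / avg_frame_bytes
    return packets_per_sec * cycles_per_packet / (core_ghz * 1e9)

for gbps in (1, 5, 10):
    print(f"{gbps:>2} Gbps of vSwitch traffic ~ {switching_cpu_cores(gbps):.2f} cores")
```

At low rates that’s noise; at 10 Gbps it’s a core and a half that your guests don’t get.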

Let’s move on a bit as your VM farm grows into something resembling a cloud, and you start migrating your VMs from chassis to chassis as you need more resources. Maybe a bit more memory for VM-B, or more CPU cycles on VM-D. Then your system looks like this:
[Figure: the four VMs now spread across several chassis after migration]

This is still a very simple deployment, with just a single pair of Ethernet switches, where every chassis connects to the same edge or top-of-rack switches. So let’s consider a tougher east-west switching challenge (more on this in a post on bisectional bandwidth).
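Before the tougher scenario, the basic arithmetic of east-west capacity is worth sketching. A minimal Python example of the oversubscription ratio of a top-of-rack switch, where the port counts and speeds are assumptions for illustration:

```python
# Oversubscription: how much server-facing (east-west) bandwidth a
# switch can accept versus how much uplink bandwidth it can deliver.
# Port counts and speeds below are illustrative assumptions.

def oversubscription(server_ports: int, server_gbps: float,
                     uplink_ports: int, uplink_gbps: float) -> float:
    return (server_ports * server_gbps) / (uplink_ports * uplink_gbps)

# e.g. 40 x 1 Gbps server ports fed into 2 x 10 Gbps uplinks
print(f"{oversubscription(40, 1, 2, 10):.1f}:1")  # 2.0:1
```

With that ratio in mind, here is the tougher scenario: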

[Figure: traffic flows between VMs in different chassis crossing the physical network]

In this case, the traffic no longer uses the vSwitch, and the network must therefore be able to cope with the projected traffic load between VM-A & VM-B and between VM-C & VM-D.
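A minimal sketch of that accounting, with made-up flow rates and placements, shows how traffic that was invisible to the network suddenly appears on the wire:

```python
# Which VM-to-VM flows stay inside a vSwitch and which must cross the
# physical network, given a VM-to-chassis placement. The flow rates
# and placements below are made up for illustration.

flows_gbps = {("VM-A", "VM-B"): 2.0, ("VM-C", "VM-D"): 3.0}

def wire_load(placement: dict) -> float:
    """Total rate of flows whose endpoints sit on different chassis."""
    return sum(rate for (a, b), rate in flows_gbps.items()
               if placement[a] != placement[b])

before = dict.fromkeys(("VM-A", "VM-B", "VM-C", "VM-D"), "chassis-1")
after = {"VM-A": "chassis-1", "VM-B": "chassis-2",
         "VM-C": "chassis-3", "VM-D": "chassis-4"}

print(wire_load(before))  # 0.0 Gbps: everything stays on the vSwitch
print(wire_load(after))   # 5.0 Gbps the physical network must now carry
```

Nothing about the applications changed; only placement did, and 5 Gbps of load materialised on uplinks that may never have been sized for it.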

Design Failure for Scaling

In design terms, this is a scaling failure. The single-chassis use case looks good, it’s simple, and it works. It doesn’t require any hardware or any engagement with the network team. But any implementation that depends on the vSwitch within a chassis is not a scalable design.

The use case for the next phase of growth is a failure, which means that any design relying on the vSwitch as its primary networking technology is a design failure. Ergo, the VMware vSwitch is not a networking technology in the proper sense.

The value of virtualisation is that if any of those guest VMs needs more memory or CPU in the future, and its current chassis cannot deliver it, the VM can be moved, turning the chassis into a pool of resources. This is, of course, the supposed value of the ‘Cloud’. If you want a cloud, then you have to design for it from the very beginning.

I’ve not mentioned storage here, but it is equally important, because the storage networking issues are identical. If you want to understand why FCoE / iSCSI matter so much, this is the same use case.

The EtherealMind view

In my view, the correct perspective is to think of the vSwitch as a shared network adapter, not a switch. It has almost no networking features: no STP, limited QoS, no SPAN or RSPAN, no NetFlow / sFlow, no filtering, no VACLs, and so on. The features the vSwitch does have, such as VLAN trunking, link bonding and frame forwarding, are features of any network adapter as well as of an Ethernet switch.

Don’t fall into the trap of thinking that the vSwitch is a high-performance connection or even a feature-complete technology, because the day you move those two servers apart, that connection cannot be sustained and you will need to redesign the network. And you should have done that the first time. A vSwitch is not an Ethernet switch and doesn’t replace the network; what’s worse, it leads to poor design choices if you are not paying close attention to the overall design.

With that said, there are developments happening in networking that may change this, so stay tuned and I’ll see if I can write some more about it.

  • http://lonesysadmin.net/ Bob Plankers

    You and I have talked about this at length and, despite what it might seem to an outsider when we “talk” about it, I agree completely with this. A proper design for a vSphere environment absolutely has to account for east-west sorts of things, like vMotion. The vSwitch, and its slightly larger big brother the vNetwork Distributed Switch, have very basic feature sets. Thing is, many people don’t need any more than that, and the ones that do can go to vSphere Enterprise Plus and the Cisco Nexus 1000V virtual switch. Of course, this will probably all change when the next major release of vSphere ships: hopefully there will be more features in the base switches, and more options for pluggable virtual switches. For me, I’d just be ecstatic if they did something with NetFlow, since network monitoring is one piece I’m sorely lacking, for troubleshooting reasons.

    At any rate, given all this and the prevailing attitude of “virtualization doesn’t need anybody else’s help”, it isn’t surprising that so many organizations have performance problems in their virtual environments.

  • http://www.twitter.com/hardikmodi Hardik Modi

    My biggest beef has been the lack of packet counters that are readily viewable. In that respect, it more resembles a desktop 4-port switch than a rack-mount one. Further, it being a software construct, you’re always going to worry about packet loss. I’ve hooked it up with an Ixia test port to see how much traffic a packet sniffer application running in a VM, feeding off a vSwitch, could terminate, and it’s clear that there’s some degree of packet loss in ESXi 4.0, even at moderate rates, which isn’t observed with the sniffer application running on a bare-metal OS. The average application communicating via TCP can probably tolerate this, but one wouldn’t expect it of a switch, which is expected to be lossless in all but the most strenuous circumstances. (A crude way to approximate this test without an Ixia is sketched at the end of this comment.)

    VMDirectPath is a way to work around this, but then you don’t have a switch at all and lose other goodies like vMotion.

    I don’t have corresponding reference points for the 1000V, but software-based switching in the VMkernel will always be problematic.
    Wouldn’t you guys expect on-host hardware-accelerated solutions to make a comeback, rather than improvements to software-based implementations?
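    Coming back to the loss measurement above: a crude approximation, without an Ixia, is to fire sequence-counted UDP datagrams between two VMs and count what arrives. A minimal sketch, with a placeholder port and count and no attempt at calibrated rates; note the host network stack can drop packets too, so this shows gross loss only, not where it happened:

    ```python
    # Crude UDP loss test: blast datagrams at a receiver, count arrivals.
    # Port and count are placeholder assumptions.
    import socket, struct, sys

    PORT, COUNT = 9999, 100_000

    def sender(dest_ip):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for seq in range(COUNT):
            s.sendto(struct.pack("!I", seq), (dest_ip, PORT))
        s.sendto(b"DONE", (dest_ip, PORT))  # sentinel; if lost, Ctrl-C the receiver

    def receiver():
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("", PORT))
        seen = 0
        while True:
            data, _ = s.recvfrom(64)
            if data == b"DONE":
                break
            seen += 1
        print(f"received {seen}/{COUNT}, lost {COUNT - seen}")

    # usage: "python losstest.py recv" in one VM, "python losstest.py <ip>" in the other
    receiver() if sys.argv[1] == "recv" else sender(sys.argv[1])
    ```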

    • Chris Fricke

      The other problem is money, assuming the suggestion is to use the Nexus 1000V, because even in my small 24-socket environment VMware wants upwards of 20k in licensing for the privilege of then buying a software Nexus switch (which in our small environment won’t help much anyway). This is because I’ve been an Enterprise customer for ages, and, well, now it’s Enterprise Plus if you want ALL the cool goodies. Wha?

      I imagine larger vSphere Enterprise customers are debating the same thing… or maybe not. Maybe those companies are happy writing big checks for the incremental “extra” Enterprise goodness. I really don’t know for sure, but it has been a sore spot.

      I say, no thanks. I’ll keep my inefficiencies for now.

      Obviously not the point of your post, but a reality us lowly admins have to constantly battle.

      Great topic!

  • http://bradhedlund.com Brad Hedlund

    If what you want is VM-to-VM traffic that always flows through the access-layer switch, whether you have one server or 160 servers, while still having all the vMotion capabilities… that’s possible today in Cisco UCS. It’s called Hardware VN-Link.

    Just say’n

    Cheers,
    Brad

    • http://etherealmind.com Greg Ferro

      Well, yes.

      My point was that the vSwitch is a limited-scalability technology and doesn’t provide good design outcomes if you aren’t clear about the long-term requirements.

  • http://blog.sflow.com/ Peter Phaal

    The Open vSwitch provides QoS, NetFlow, sFlow, RSPAN, ERSPAN, etc. In addition, support for the OpenFlow protocol allows networking to be virtualized to handle multi-tenancy, security, etc. The Open vSwitch is a standard component of the Xen Cloud Platform (XCP) and is included in the latest XenServer release.

    http://openvswitch.org/

    The Open vSwitch is open source (Apache 2.0 license). It would be great if VMware would include Open vSwitch as a standard component so customers could avoid the cost of deploying the Nexus 1000V in order to get networking features that should be part of the base product.
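    As an illustration of how little configuration that takes, enabling sFlow on an Open vSwitch bridge is a single ovs-vsctl transaction. A minimal sketch driving it from Python; the bridge name, agent interface and collector address are assumptions to adjust for your environment:

    ```python
    # Enable sFlow on an Open vSwitch bridge by driving ovs-vsctl.
    # Requires Open vSwitch installed and root privileges.
    import subprocess

    BRIDGE = "br0"                # assumed bridge name
    COLLECTOR = "10.0.0.50:6343"  # assumed sFlow collector host:port

    subprocess.run([
        "ovs-vsctl",
        "--", "--id=@s", "create", "sFlow",
        "agent=eth0",                 # interface whose IP identifies the agent
        f'target="{COLLECTOR}"',      # ovs-vsctl expects the quotes literally
        "header=128", "sampling=64", "polling=10",
        "--", "set", "bridge", BRIDGE, "sflow=@s",
    ], check=True)
    ```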
