Scaling Virtual Appliances with Embrane

Embrane is coming out of stealth mode today, and I received a presentation from them at Network Field Day back in October. Like most startups, their business is focussed on a specific problem and has a specific use case. To believe in Embrane’s problem you have to believe that all Network Services will reside in virtual machines (where Network Services are technologies like Load Balancing, Application Acceleration, Intrusion Prevention Systems and Firewalls that operate at Layer 4 of the OSI model and above).

Some History / Existing State of Play

To date, the solution to making these technologies scale to perform at high speed has been to insert these devices into the VM Direct Path in the network protocol stack of VMware. This is complex, requires business relationships with VMware as well as a large team of rather expert software developers, and makes you dependent on the whims of VMware. Examples are Cisco’s Nexus 1000v product family of switching, firewalls and WAN acceleration, and Juniper’s vGW Virtual Gateway firewall product. Like this:

Embrane 1

The assumption for these types of products is that you don’t need a lot of horsepower since the processing load exists in the software of just one physical server. This leads to designs with severe scaling limitations, such as:

  1. Each virtual machine has ONE virtual appliance logically attached to it, and the appliance moves with the VM in the vSphere cluster, OR
  2. The rules or configuration from an instance migrate from one hypervisor to the other as the VM moves within the vSphere cluster.

Both of these designs assume that the CPU/Memory footprint of the virtual appliance will not impact the hypervisor. In practice, the virtual appliance consumes CPU, memory and bus bandwidth when inspecting and processing packets/frames from the physical network, and you are gambling that these appliances will not starve the virtualised operating system of resources. Since the resources are consumed on the chassis where the VM is located, you can’t just move the system around; you have to have a server big enough to handle the load.

And if you move a VM between physical servers, the processing load has to move as well. Not so great.

This type of design is also less useful for multi-system solutions. For example, a cluster of web servers in hypervisors still needs a single load balancer to service all systems, and a virtual load balancer has very few resources for high performance forwarding.

What happens when the software load balancer isn’t big enough to cope? After all, there is no hardware acceleration in a typical server – just some cheap Intel merchant silicon glued to a board.

What Embrane Does

Embrane uses the concept of IP Flows to scale virtual appliances. An IP Flow is the stateful conversation of IP packets between a specific source and destination – not just one IP packet but the whole two way, full duplex, stateful session of TCP or UDP packets that form the Layer 5 session flow. Embrane scales out by managing IP flows and then directing them to other appliances, in effect creating what I would call two tier load balancing.
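To make the idea of a flow concrete, here is a minimal Python sketch of a direction-independent flow identifier. It assumes a flow is identified by protocol plus the two IP/port endpoints; the names FlowKey and flow_key are made up for illustration and say nothing about how Heleos represents flows internally.

```python
from typing import NamedTuple, Tuple


class FlowKey(NamedTuple):
    """Direction-independent identifier for one TCP or UDP conversation."""
    protocol: str
    endpoint_a: Tuple[str, int]
    endpoint_b: Tuple[str, int]


def flow_key(protocol: str, src_ip: str, src_port: int,
             dst_ip: str, dst_port: int) -> FlowKey:
    # Sort the two endpoints so that a packet and its reply produce the same key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return FlowKey(protocol, low, high)


# Hypothetical addresses: a client request and the server's reply.
request = flow_key("tcp", "10.0.0.5", 49152, "192.0.2.10", 443)
reply = flow_key("tcp", "192.0.2.10", 443, "10.0.0.5", 49152)
print(request == reply)  # True: both directions belong to the same flow
```

Because both directions map to one key, anything that steers traffic by this key keeps the whole two way conversation on the same appliance.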

Consider what happens in a typical load balancing setup. The Load Balancer takes the state of IP flows and then distributes them to servers for the VIP:

Embrane 4

If you need more performance than a single load balancer can provide, most people will buy bigger load balancers. But some people will load balance the load balancers – like this:

Embrane 5

Embrane takes this idea, manages IP Flows with its Heleos platform running as a Flow Manager, and ensures that TCP and UDP conversations are steered to the correct host. Typically we call this a sticky session. I think what Embrane is doing is FLOW BALANCING as a form of load balancing.

Embrane 6
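The two tier idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Embrane’s implementation: the class names, the round-robin choice for new flows and the in-memory flow tables are all invented here to show sticky steering at both tiers.

```python
from itertools import cycle


class LoadBalancer:
    """Tier 2: spreads new flows across its servers and keeps each flow on one server."""

    def __init__(self, name, servers):
        self.name = name
        self._next_server = cycle(servers)  # assumption: round-robin for new flows
        self._flow_to_server = {}

    def server_for(self, flow):
        if flow not in self._flow_to_server:
            self._flow_to_server[flow] = next(self._next_server)
        return self._flow_to_server[flow]


class FlowManager:
    """Tier 1: spreads new flows across load balancers and keeps each flow on one of them."""

    def __init__(self, balancers):
        self._next_balancer = cycle(balancers)
        self._flow_to_balancer = {}

    def steer(self, flow):
        if flow not in self._flow_to_balancer:
            self._flow_to_balancer[flow] = next(self._next_balancer)
        balancer = self._flow_to_balancer[flow]
        return balancer.name, balancer.server_for(flow)


# Hypothetical appliance and server names.
manager = FlowManager([
    LoadBalancer("lb-1", ["web-1", "web-2"]),
    LoadBalancer("lb-2", ["web-3", "web-4"]),
])
flow_a = ("tcp", "10.0.0.5", 49152, "192.0.2.10", 443)
flow_b = ("tcp", "10.0.0.6", 49153, "192.0.2.10", 443)
print(manager.steer(flow_a))  # ('lb-1', 'web-1'): first packet picks a destination
print(manager.steer(flow_b))  # ('lb-2', 'web-3'): a new flow goes to the next load balancer
print(manager.steer(flow_a))  # ('lb-1', 'web-1'): later packets follow the first choice
```

Note that this naive version keeps one table entry per flow at each tier, which is exactly the kind of state that becomes a concern at scale.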

Platforming

Who will buy this? Customers who need to scale up to a lot of appliances to compensate for the poor performance of Intel hardware. Let’s say you want to terminate 5000 SSL encrypted sessions in a VMware cluster (not an uncommon amount of SSL today since FireSheep arrived) – this would require something like ten to fifteen VMs (I’m guessing) somewhere in the infrastructure to deliver this capability. But how do you direct the IP traffic flows to the VM that is handling your SSL session? How do you maintain that state if all you have is software VMs?

What Embrane is delivering is a platform on which to develop appliances that need to use this scale up technology. I call it “Flow Balancing”: functionality on which other companies can deliver services and appliances.

Embrane is trying to make a platform play out of this (Ed: because platforms lock in customers and are VC-friendly? Can a platform work?). I don’t really think they will have much success with this but hey, they got funded. In the short term, they say, they are delivering load balancing and firewall appliances as a proof of concept of their products and hope that other companies will join them with application acceleration, more firewalls, load balancers, etc.

Scaling

The weakness in this design is that the flow manager might need to hold a lot of state. Embrane claims that they have developed a stateless method for flow management – I’m guessing some sort of hash mechanism that resolves to a unique destination. On this basis I can accept that the Flow Manager is not a weak point, since it would not need a database of destinations that has to scale. I’d still want proof of that though.
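To illustrate what such a hash mechanism could look like, here is a minimal Python sketch using rendezvous (highest random weight) hashing. This is a stand-in technique chosen for illustration, not Embrane’s disclosed method; the function name pick_appliance, the appliance names and the flow key strings are all hypothetical, and the flow key is assumed to be the same for both directions of a conversation.

```python
import hashlib


def pick_appliance(flow_key: str, appliances: list) -> str:
    """Return the appliance with the highest hash score for this flow."""
    def score(appliance: str) -> int:
        digest = hashlib.sha256(f"{appliance}|{flow_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    # No per-flow table: the destination is recomputed from the key every time.
    return max(appliances, key=score)


appliances = ["fw-1", "fw-2", "fw-3"]
flows = [f"tcp|10.0.0.{i}:40000|192.0.2.10:443" for i in range(1, 201)]

before = {flow: pick_appliance(flow, appliances) for flow in flows}
after = {flow: pick_appliance(flow, appliances + ["fw-4"]) for flow in flows}

# Flows only ever move to the new appliance; the rest stay where they were.
moved = [flow for flow in flows if before[flow] != after[flow]]
print(len(moved), "of", len(flows), "flows moved")
print(all(after[flow] == "fw-4" for flow in moved))  # True
```

The steering function only needs the list of live appliances, so its memory use does not grow with the number of flows. The trade-off is that adding an appliance still moves some existing flows to it, so the sessions on those flows would need to be handled somehow.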

OpenFlow/SDN

Embrane is using Flow Management but they are not an OpenFlow/SDN business. They might add OpenFlow APIs in the future but that would prevent them from posturing as a platform – not a good business model.

The EtherealMind View

I guess that if you believe that software appliances will be used for all networking functions at Layer 4 and above then Embrane has a solution for super scale Cloud Providers. Given that programmatic control is more important for some cloud providers for operational deployment of servers and applications, this might be very attractive for them – they can use virtualization software and cheap white box servers at the hardware level and lots of expensive software developers to make the orchestration go. Big Cloudy blah blah.

For most enterprise people, this announcement has little relevance. It is much more cost effective to buy a load balancing appliance from F5 or Cisco and use existing knowledge & skills than it would be to integrate Embrane into your VMware vSphere system. Maybe when the management aspects mature this might change.

For today, it’s an interesting development that shows a new way of deploying network services in large data centres and solves a major scaling problem for some vendors. The fundamental concepts that Embrane uses are not new but they are implemented in a new way.

Postscript

As always, Ivan Pepelnjak has a great take in his blog post on the same topic: EMBRANE HELEOS: SCALE-OUT DISTRIBUTED VIRTUAL APPLIANCE

Disclosure

Gestalt IT Tech Field Day – I have attended several Gestalt IT Tech Field Day events. These events are sponsored by vendors, and Tech Field Day has paid for my travel and accommodation to be able to attend the event. This includes meals and some entertainment. Vendors often give various pieces of promo material such as pens, mugs or t-shirts, which I typically refuse because they are useless, poor quality or ugly. There is no requirement, express or implied, to promote any vendor or product. I remain independent and able to appreciate, document or criticise as I see fit according to my personal views.

My full blog ethics statement is here

About Greg Ferro

Greg Ferro is a Network Engineer/Architect, mostly focussed on Data Centre, Security Infrastructure, and recently Virtualization. He has over 20 years in IT, working as a freelance consultant for a wide range of employers including Finance, Service Providers and Online Companies. He is CCIE#6920 and has a few ideas about the world, but not enough to really count.

He is a host on the Packet Pushers Podcast, blogger at EtherealMind.com and on Twitter @etherealmind and Google Plus

You can contact Greg via the site contact page.
