Controller Based Networks for Data Centres

Ivan Pepelnjak has another great post, this one on MLAG and Hot Potato Switching, where he explains how switches and routers can make poor decisions when forwarding or routing traffic, highlighting a fundamental problem with the current generation of switching technologies.

How did we let this happen? That this is even possible reflects badly on the networking industry, and why do customers accept such poor outcomes?

Control and Data Planes

On consideration, I think the problem comes down to the fact that each network device has a control plane that analyses the state of the network around it at either Layer 2 or Layer 3. At Layer 2, a switch running Spanning Tree, TRILL or even SPB needs to form a coherent understanding of the network around it to build an Ethernet forwarding table.

For Layer 3, a router uses OSPF (or another routing protocol) to build a table of routes and, from that, an IP forwarding table for routing packets across the device; for Layer 2, some form of Spanning Tree creates loop-free pathways through the network. Consider the following loose representation of devices communicating with neighbours to determine the overall configuration of the network.


So that’s the purpose of a Control Plane – to discover and comprehend the network around it. Inherently, it knows very little of the overall architecture and, because of this, can make bad decisions about the forwarding of data.

Now, the Data Plane is a generic term for the forwarding process, usually implemented in silicon. For a switch operating at Layer 2, this means frames are switched across the backplane to the correct destination port; for Layer 3, packets are routed from ingress to egress.

You can represent these interactions between the Data, Control and Management planes something like the following diagram: controller-lan-1.jpg
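As a loose illustration of that split, here is a sketch only (the three-switch topology and function names are invented for this example, not any vendor's code): the control plane is the slow-path computation that builds the forwarding table, and the data plane is nothing more than a lookup against it.

```python
# A sketch only: an invented three-switch topology, not any vendor's code.
import heapq

# Hypothetical link-state database: each switch's view of link costs.
TOPOLOGY = {
    "sw1": {"sw2": 1, "sw3": 4},
    "sw2": {"sw1": 1, "sw3": 1},
    "sw3": {"sw1": 4, "sw2": 1},
}

def control_plane_compute(local):
    """Slow path: Dijkstra over the topology database -> next-hop table."""
    dist, nexthop = {local: 0}, {}
    pq = [(0, local, None)]
    while pq:
        cost, node, first_hop = heapq.heappop(pq)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a better path was already found
        if first_hop is not None:
            nexthop.setdefault(node, first_hop)
        for nbr, weight in TOPOLOGY[node].items():
            if cost + weight < dist.get(nbr, float("inf")):
                dist[nbr] = cost + weight
                heapq.heappush(pq, (cost + weight, nbr, first_hop or nbr))
    return nexthop

def data_plane_forward(fib, dst):
    """Fast path: a single table lookup, no protocol awareness at all."""
    return fib.get(dst, "drop")

fib = control_plane_compute("sw1")     # runs on the CPU, occasionally
print(data_plane_forward(fib, "sw3"))  # runs in silicon, per frame -> "sw2"
```

Note that sw1 reaches sw3 via sw2 (cost 2) rather than the direct link (cost 4) only because its database happens to be complete and correct; the whole point of the argument above is that each device computes this independently from an imperfect local view.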

The underlying limits of MLAG, TRILL, SPB and all Data Centre Networking

So the Control Plane has no vision of the network beyond what the protocols provide to it. STP uses BPDUs, OSPF (or your favourite L3 routing protocol) uses neighbour discovery, and both build databases. EVERY control plane has its own operating system and processing.

This means the network contains dozens of control planes. Each of them is a point of failure with an imperfect and incomplete view of the network, and the functional whole of the network is seriously limited in what it can do. The very purpose of the new TRILL standard is to bring these incoherent and independent systems into a unified and coherent network by providing a communication mechanism.

MLAG, VSS, vPC, IRF, Virtual Chassis blah blah

Each of these technologies attempts to bind two control planes into a single unit. Cisco VSS uses a single control plane on the Supervisor to control two switch fabrics. In Cisco vPC, two control planes manage two data planes (switch fabrics), using protocols and overrides to produce a unified, coherent result when forwarding frames. And so on with HP IRF, Juniper Virtual Chassis and all the others.

Controller Based Networking

Why not take this idea to its logical conclusion: have no control plane in the local switch at all, and instead a remote ‘controller’ that handles all forwarding decisions. Instead of attempting to build a coherent LAN forwarding plan from many individual systems, have a central controller that ‘knows’ the entire network and loads each forwarding plane with optimal data.

This concept already works for Wireless LAN Controllers (WLC), where each wireless access point is a dumb unit with limited capabilities. The WLC acts as the brain for the entire wireless network: each AP reports new clients to it, and the WLC holds a full and complete view of the wireless network.

Limits ?

It seems to me that a controller based network is limited by:

  1. the performance of the controller
  2. the bandwidth from each device to the controller
  3. the reliability of the network path to the controller
  4. the size of the network, which must be bounded and not overly dynamic.

For a data centre, all of these conditions are easily met.

The network then looks something like this: controller-lan-2.jpg


I could try to summarise some of the features:

  • The controller performs all the learning and forwarding calculation, and loads the results into the dumb switches.
  • The controller is informed as servers join and leave the network, using an 802.1X-type protocol.
  • Virtualised servers would signal to the controller as they dynamically migrate around the data centre.
  • The controller software knows about all paths and all endpoints in the network, and can comprehend multiple paths.
  • It knows about bidirectional paths and can resolve issues such as asymmetric forwarding.
  • The controller has a comprehensive view of the entire data centre and knows the end-to-end connectivity of the network.
  • A single point of control allows extensive customisation of the forwarding paths through the network, according to rules and conditions configured on the central configuration server.
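To make those features concrete, here is a toy sketch of the idea (the class, method names like `server_joined`, and the two-leaf topology are all invented for illustration; this is not OpenFlow or any shipping product): the controller holds the whole topology and endpoint list, computes next hops centrally, and emits a complete forwarding table for each dumb switch.

```python
# Illustrative sketch only: invented names and API, not a real product.
from collections import deque

class Controller:
    """Central brain: full topology view, builds tables for dumb switches."""
    def __init__(self):
        self.links = {}      # switch -> set of neighbouring switches
        self.endpoints = {}  # server/VM name -> switch it is attached to

    def add_link(self, a, b):
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def server_joined(self, server, switch):
        # e.g. signalled by an 802.1X-style join or a VM migration event
        self.endpoints[server] = switch

    def compute_tables(self):
        """BFS next-hops from every switch, then map endpoints onto them."""
        tables = {}
        for src in self.links:
            nexthop, seen = {}, {src}
            queue = deque((nbr, nbr) for nbr in self.links[src])
            while queue:
                node, first = queue.popleft()
                if node in seen:
                    continue
                seen.add(node)
                nexthop[node] = first
                queue.extend((nbr, first) for nbr in self.links[node])
            # Per-endpoint entry: deliver locally, or send towards its switch.
            tables[src] = {
                srv: ("local" if sw == src else nexthop[sw])
                for srv, sw in self.endpoints.items()
            }
        return tables  # in a real system these would be pushed to each switch

c = Controller()
c.add_link("leaf1", "spine"); c.add_link("leaf2", "spine")
c.server_joined("vm-a", "leaf1"); c.server_joined("vm-b", "leaf2")
tables = c.compute_tables()
print(tables["leaf1"]["vm-b"])  # -> "spine": leaf1 forwards via the spine
```

The point of the sketch is that no switch ever runs a discovery protocol: when vm-b migrates, one call to the controller recomputes every table from the same authoritative view.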

Real ?

There are two working instances of controller based networking that I can think of:

  • wireless LANs
  • Nexus 1000V

That last one should make you go hmmmmmm. While I’m not entirely clear on the Nexus 1000V architecture, it has some of the elements of controller based networking in place. I don’t doubt that Cisco is at least conducting research into this topic (as likely externally to the company as internally).

Wireless LAN networks are often deployed with Wireless LAN Controllers (WLC). The WLC acts as the intelligence for the entire wireless network, and each access point is a cheaper, dumb device that simply forwards data to the WLC. All configuration and monitoring is delivered via the WLC. Because the WLC can see the spectrum performance over the entire estate, it is able to manage the wireless network more effectively and dynamically adapt it to changes in performance. And that’s exactly what we need for our future data centre LANs: a switching backbone that adapts to traffic loads and changes paths around failures or temporary overload conditions caused by, say, a VM migration, backup traffic, or heavy storage traffic during a SQL server upgrade.

In case you think this is a pipe dream, the technology is already on its way in the current market. You can find existing products that allow open source software on the switch/router from companies that support OpenFlow switching. OpenFlow is an open standard that separates the control plane from the switch and lets an external controller program the forwarding tables (if I’m reading the documentation correctly), which enables development of the controller concept. I understand that three companies are already manufacturing products for this space:

    • HP Networking has specific ProCurve products that work with OpenFlow and are described on the OpenFlow website.
    • IW Networks makes products.
    • Broadcom already manufactures OEM / white box switches and is believed to supply Google, Facebook and other large scale companies with low-function, cheap Ethernet switching today.
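In the OpenFlow model specifically, the division of labour looks roughly like the following sketch (heavily simplified and hypothetical: real OpenFlow matches on many more header fields and speaks a binary wire protocol, not Python calls). The switch only evaluates flow entries the controller has installed; a table miss is punted to the controller, which learns and pushes a new entry down.

```python
# Hypothetical sketch of the OpenFlow division of labour, not the real API.

class FlowSwitch:
    """Dumb switch: just a flow table plus a punt path to the controller."""
    def __init__(self, controller):
        self.flows = []            # list of (match_dict, action) entries
        self.controller = controller

    def install(self, match, action):
        self.flows.append((match, action))

    def handle(self, packet):
        for match, action in self.flows:
            if all(packet.get(k) == v for k, v in match.items()):
                return action      # fast path: local flow-table hit
        # Table miss: punt the packet to the controller ("packet-in").
        return self.controller.packet_in(self, packet)

class SimpleController:
    """Toy controller: learns where MACs live, installs exact-match flows."""
    def __init__(self):
        self.mac_to_port = {}

    def packet_in(self, switch, packet):
        self.mac_to_port[packet["src_mac"]] = packet["in_port"]
        out = self.mac_to_port.get(packet["dst_mac"])
        if out is None:
            return "flood"         # destination unknown, no flow installed
        switch.install({"dst_mac": packet["dst_mac"]}, f"output:{out}")
        return f"output:{out}"

ctrl = SimpleController()
sw = FlowSwitch(ctrl)
print(sw.handle({"in_port": 1, "src_mac": "aa", "dst_mac": "bb"}))  # flood
print(sw.handle({"in_port": 2, "src_mac": "bb", "dst_mac": "aa"}))  # output:1
print(sw.handle({"in_port": 2, "src_mac": "bb", "dst_mac": "aa"}))  # table hit
```

After the second packet, the third identical packet never reaches the controller at all, which is exactly the economics being described: expensive intelligence in one place, cheap table lookups everywhere else.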


FoRCES is the Forwarding and Control Element Separation protocol, extensively described in IETF RFCs. RFC 3746 was published in April 2004, and more recently RFC 5810, published in March 2010, defines the protocol for communication between controller and device. MIBs and other elements have also been published in related RFCs, including the results of initial testing.

See these for reference:

Khosravi, H. and T. Anderson, “Requirements for Separation of IP Control and Forwarding”, RFC 3654, November 2003.

Yang, L., Dantu, R., Anderson, T., and R. Gopal, “Forwarding and Control Element Separation (ForCES) Framework”, RFC 3746, April 2004.

The EtherealMind View

I’m thinking that TRILL and DCB are coming along and we will use them for the next couple of years. For that matter, TRILL and DCB are not incompatible with a controller LAN taxonomy. They will solve today’s problems, but they won’t solve the problems of truly massive cloud infrastructures, which need MUCH better protocols for network coherence across overly large Layer 2 infrastructures (and the overly large fault domains thus created). The controller based approach also creates network management and automation capabilities that far exceed those of BEEP, NETCONF and YANG across hundreds or thousands of devices. A single point of control makes programming a data centre network much more likely to succeed.

Given what I understand about VMware vCloud and its network overlays, I think I can see data centre networking moving towards controller based networks. If you’ve been following James Hamilton (Amazon), who has been discussing better data centre networking, then I think this method fits the requirements he outlines.

Note that this technology will not apply to campus LANs or other Ethernet networks, which will continue with the current technology sets. My reading of the long term vision is that controller networking needs a highly rigid or crystalline structure, for rule based compliance and verifiable design practice, and this doesn’t work so well in a campus network where there is a lower level of control.

This doesn’t change the way we work today, but I’m often trying to understand what networking will do in the future. And I don’t mean in six months’ time; I’m looking at five and ten year horizons. Since the future always builds on the past, and repeats what we have already done, I think this method makes some sense. It will be interesting to see how it turns out. One thing is for sure: the current Ethernet backbones of the data centre are not robust or fast enough for what we need, and something new has to happen. Get ready for an exciting ride in data networking over the next five years.

  • StuckInActive

OSPF and ISIS already provide a comprehensive topology view and the protocols are engineered to mitigate incomplete or imperfect information. Interesting fact: the Juniper Virtual Chassis tech uses ISIS to establish a forwarding path in the “stack”.

    I think, perhaps, that you’re erroneously conflating the wireless controller technologies with something like OpenFlow. Some of the WLAN controllers actually build tunnels that encapsulate the .11 frames in ethernet and send them back to the controller. So in fact, there’s no forwarding plane on the AP at all. Others will allow the APs to place frames to local 3rd parties on the wire directly (e.g. H-REAP local switching). But, the AP has to do all of the MAC learning work itself. Those that operate that way get their configuration from the controller sure, but their forwarding planes are still programmed the same way as a standalone AP. I believe that even the mesh APs will run their own independent control planes for finding their way to the root AP(s).

    • Greg Ferro

Indeed, IS-IS is also used by TRILL and SPB for the same purpose, but this doesn’t overcome the problem of path selection under certain conditions.

I am deliberately conflating WLC technology as a working metaphor. While what you say is true, that is the implementation chosen by Cisco but not by other companies (as I understand it), because it’s cheaper to develop that software. It’s not a requirement.

My feeling is that the data centre offers better opportunities for a coherent control plane, due to the crystalline design possibilities and higher levels of investment, coupled with the performance requirements of the data centre.

  • Michal

    Hello Greg, interesting. In that case it would probably operate in an ‘H-REAP’-like local switching mode.
    It would also inherit the known complexity of multi-controller architectures for various services.

    BTW, I also see some correlations between the old LWAPP data encap format and the future OTV data encap (a Layer 2 Ethernet frame encapsulated in UDP inside of IPv4 or IPv6):
    Ref. OTV-01: , page 14
    Ref: LWAPP:
    Future OTV data frame:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version|  IHL  |Type of Service|          Total Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Identification        |Flags|      Fragment Offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Time to Live  | Protocol = 17 |        Header Checksum        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           Source-site OTV Edge Device IP Address              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Destination-site OTV Edge Device (or multicast) Address   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      Source Port (Random)     |        Dest Port (8472)       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           UDP Length          |        UDP Checksum = 0       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|R|R|R|I|R|R|R|                  Overlay ID                   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |              Instance ID                      |    Reserved   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |            Frame in Ethernet or 802.1Q Format                 |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  • Glenn Flint

    Cabletron was using OSPF at Layer 2 in 1995 (SecureFast) to replace STP and was very successful. Glad to see that a good idea never goes away?


  • Pingback: Show 35 – Breaking the Three Layer Model – Packet Pushers

  • Pingback: Show 38 – Comparing Data Centre Fabrics from Juniper, Brocade and Cisco – Packet Pushers

  • Pingback: Show 39 – Unplugged on Tech Field Day Wireless – Packet Pushers

  • Pingback: Show 38 – Comparing Data Centre Fabrics From Juniper, Brocade and Cisco – Gestalt IT

  • Pingback: Show 39 – Unplugged on Tech Field Day Wireless – Gestalt IT

  • Jimmy

    Gosh, a FEP. Networking really has come full circle.

  • Pingback: Network Dictionary – Delaminated — EtherealMind

  • Pingback: ◎ Introduction to How Overlay Networking and Tunnel Fabrics Work — EtherealMind