Juniper QFabric is a new approach to Ethernet Switch Fabrics. When it was announced last year, it was clear that the underlying physical design is a completely different approach to building Switch Fabrics. Here I’m taking a research-based approach to understand how Juniper QFabric is different from all other approaches to the problem, and also taking a look at some of the challenges ahead.
If you aren’t familiar with Ethernet Fabrics then this post examines The Definition of a Switch Fabric and introduces the concept of a crossbar switch fabric. You may want to understand the concept of lossless forwarding through a silicon fabric which I’ve written about here: Switch Fabrics: Input and Output Queues and Buffers for a Switch Fabric and also Switch Fabrics: Fabric Arbitration and Buffers.
Standard Chassis Layout
A typical chassis has a physical layout that looks something like this:
Each line card connects to a central backplane. The backplane consists of individual channels to each blade and down to the crossbar switch fabric on the supervisor in the chassis. Each line card then connects to a single input & output of the fabric via a connection on the backplane.
The net effect is something like this:
This means that the size of a single lossless fabric is determined by the size of the chassis and by the size of a single silicon chip. Crossbar fabrics are complicated, hot and expensive, which limits the maximum I/O ports on a single chip. For more I/O, the cost of the switching chip increases exponentially as the silicon die increases in size.
The fundamental computing solution to this problem is to use a multistage switching architecture that wires the outputs of one switch chip to inputs of another which allows a relatively small silicon chip to be scaled up to a much larger solution.
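To put rough numbers on that scaling, here is a minimal sketch of the arithmetic for the simplest multistage arrangement: a non-blocking two-tier (folded Clos) fabric built from identical chips. The function name and the chip port counts are my own illustrative assumptions, not figures from any vendor.

```python
def folded_clos(radix: int) -> dict:
    """Size a non-blocking two-tier (folded Clos) fabric built from one chip type.

    `radix` is the number of ports on each switch chip (a hypothetical input,
    not a figure from any vendor data sheet).
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be an even number of at least 2")
    down_per_leaf = radix // 2   # half of every leaf chip faces the edge
    spines = radix // 2          # the other half: one uplink to each spine chip
    leaves = radix               # every spine port attaches one leaf chip
    return {
        "leaves": leaves,
        "spines": spines,
        "edge_ports": leaves * down_per_leaf,
    }


for radix in (16, 32, 64):
    print(radix, folded_clos(radix))
```

Under these assumptions the edge port count grows with the square of the chip size (radix squared, divided by two), so doubling the ports on a modest chip quadruples the size of the fabric you can build from it.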
For those people who are native Cisco-speakers, this is exactly how the Nexus 7000 is architected. The middle stage is the Fabric Modules (often known as FABs), and each line card has its own fabric. Here is a slide from a Cisco presentation deck showing the Fab-1 connections to Nexus line cards.
All of this happens inside a single switch. This has been the limit of scaling switch technology and even this is a recent innovation. The connection from the “Line Card” to the silicon is done using a high-speed backplane but scaling beyond a single switch is done using Ethernet interfaces. Hence
QFabric Scales Further
In my view, Juniper QFabric uses what I call the “Exploded Chassis”. It takes the concept of “Line Cards” from a Chassis and places those functions into a rackable switch; these are termed QF/Nodes.
The QF/Interconnect is the Silicon Fabric for the “chassis”. There are two broad types of line card:
- Interface cards that have 10G and 40G Ethernet on board that connect to QF/Nodes
- Silicon Switch cards that have the fabric chip and high-speed internal connection to other fabric chips so as to form a Multistage Clos Fabric.
The “Fabric Interface cards” have the necessary Ethernet chipsets to connect to the QF/Nodes. Because the 40G & 100G Ethernet standards are still progressing, it’s good value to have these as replaceable assets as the next generation of SERDES units moves to 25Gb/s lanes.
In the same way, upgradeable Fabric Cards allow for new chips to be added as silicon gets smaller and faster and as development/testing is completed. Cisco has updated their Fabric modules from 96Gbps to 550Gbps per slot and it seems reasonable that Juniper will do the same.
Exploded Chassis Design
Instead of having all the necessary elements in a single chassis, Juniper have created a “chassis” that spans the entire switched network. Importantly, the backplane can scale out much wider than just a single pair of chassis in traditional designs that use MLAG or Borg type designs. The width depends on the number of uplinks from the QF/Nodes to the QF/Interconnect, according to the capacity of the multistage Clos fabric architecture (there are practical limits). Currently it’s 4 x 40G per QFX3500, but in future versions I’d expect up to 16 x 40G to be possible. Of course, the use of 100G allows for a different dynamic that uses fewer inputs.
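To see what the uplink count means for a single node, here is a rough sketch of the oversubscription arithmetic. The 4 x 40G and 16 x 40G uplink figures come from the paragraph above; the 48 x 10G edge port count is my assumption for illustration, and the function name is hypothetical.

```python
def oversubscription(edge_ports: int, edge_gbps: int,
                     uplinks: int, uplink_gbps: int) -> float:
    """Ratio of edge-facing bandwidth to fabric-facing bandwidth on one node.

    A value of 1.0 or less means the uplinks can carry everything the edge
    ports can offer; 3.0 means the edge can offer three times the uplink capacity.
    """
    edge_bandwidth = edge_ports * edge_gbps
    uplink_bandwidth = uplinks * uplink_gbps
    return edge_bandwidth / uplink_bandwidth


# Assumed node: 48 x 10G edge ports (illustrative, not taken from the article).
print(oversubscription(48, 10, 4, 40))    # today's 4 x 40G uplinks
print(oversubscription(48, 10, 16, 40))   # the speculated 16 x 40G uplinks
```

With those assumed edge ports, four 40G uplinks give 3:1 oversubscription, while sixteen would leave the node with more fabric bandwidth than edge bandwidth.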
It should look something like this.
The Control Plane
In order to scale out the Forwarding Plane into multiple units, the Control & Management Plane is held in separate elements. It seems impractical to have the routing and switching protocol software running on the QF/Interconnect: its function is to perform high-speed, low-latency forwarding.
And that’s the purpose of the QF/Director. The solution requires two, which operate as Active/Standby engines. All BGP/OSPF/IS-IS/STP/TRILL etc. is handled by these components. In a real sense, this is a Controller-based network because the QF/Director will update the forwarding tables for the QF/Interconnect and QF/Nodes based on the protocols deployed, not unlike a switch supervisor or SDN controller.
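The controller idea can be sketched in a few lines. This is a toy model under my own assumptions: the class names, the MAC-to-port table layout and the push-everything-on-change behaviour are all hypothetical, and say nothing about how Juniper actually implements the QF/Director.

```python
from dataclasses import dataclass, field


@dataclass
class ForwardingElement:
    """Stand-in for a QF/Node or QF/Interconnect: it only holds a table."""
    name: str
    table: dict = field(default_factory=dict)  # dst MAC -> (egress node, port)

    def install(self, entries: dict) -> None:
        # Replace the local table with a copy of what the controller sent.
        self.table = dict(entries)


@dataclass
class Director:
    """Stand-in for the control plane: owns the state, pushes it to elements."""
    elements: list = field(default_factory=list)
    routes: dict = field(default_factory=dict)

    def learn(self, mac: str, node: str, port: int) -> None:
        # In a real system this would come from protocols; here it is manual.
        self.routes[mac] = (node, port)
        self.push()

    def push(self) -> None:
        for element in self.elements:
            element.install(self.routes)


node_a = ForwardingElement("node-a")
node_b = ForwardingElement("node-b")
director = Director(elements=[node_a, node_b])
director.learn("00:11:22:33:44:55", "node-b", 7)
print(node_a.table)
```

The point of the sketch is the division of labour: the forwarding elements never compute anything themselves, they just hold whatever table the controller last installed.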
The QF/Director uses an out-of-band network to build reliable connections to all elements in the “virtual chassis”. This network is Juniper EX-Series switches with some specific design requirements that you need to fulfil.
Now we can see that the entire QFabric infrastructure looks like a traditional chassis, except that it is made up of many individual elements.
The backplane between the QF/Node and QF/Interconnect is proprietary. Some have attempted to paint this as a negative feature. However, within any switch chassis, all the connectivity, connectors and protocols are proprietary. For QFabric to function, the internal Ethernet connections must be proprietary to carry signalling data from the QF/Node to the QF/Interconnect. That is, the forwarding decision is performed at the edge of the network in the QF/Node and the destination is tagged onto the Ethernet frame (and much other data, I’m sure) so that the QF/Interconnect can switch the frame to the output port at high speed. So, sure, it’s a proprietary backbone; it has to be.
Cisco, Brocade and Dell etc will do the same thing internally to their own switches. It’s just not quite so obvious that inside a switch it’s a closed network :).
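The “decide at the edge, tag, then switch on the tag” idea described above can be illustrated like this. It is a sketch under my own assumptions: the tag fields, function names and table layout are hypothetical and are not the real QFabric frame format, which is not public.

```python
from typing import NamedTuple


class FabricFrame(NamedTuple):
    """An Ethernet payload plus a hypothetical internal fabric tag."""
    dest_node: str   # which edge node owns the output port
    dest_port: int   # output port on that node
    payload: bytes   # the original frame, carried untouched


def ingress_node_forward(table: dict, dst_mac: str, payload: bytes) -> FabricFrame:
    # The full lookup happens once, at the ingress edge node.
    dest_node, dest_port = table[dst_mac]
    return FabricFrame(dest_node, dest_port, payload)


def interconnect_switch(frame: FabricFrame, links: dict) -> str:
    # The interconnect never looks at MAC addresses, only at the tag.
    return links[frame.dest_node]


table = {"00:11:22:33:44:55": ("node-b", 7)}
links = {"node-a": "fabric-port-1", "node-b": "fabric-port-2"}

frame = ingress_node_forward(table, "00:11:22:33:44:55", b"hello")
print(interconnect_switch(frame, links), frame.dest_port)
```

Because the middle of the fabric only reads a small fixed tag, it can stay simple and fast, and that tag is exactly the kind of thing that has to be private to the vendor.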
Hard, Really Hard
I’ve had discussions with vendors who point out that making this technology work is an achievement and that it’s sheer genius that Juniper got it working at all. It seems some companies would never have attempted such a product. Rumours are rife that the first internal chipset for QFabric didn’t work out and that the current product is using a merchant silicon chip from Broadcom.
I’m not much bothered by this. What matters is the architectural principle: the “Exploded Chassis” allows scale-out Ethernet networks of literally thousands of ports in a single distributed switch, which offers a lot of advantages, operationally and architecturally. Of course, if you can only think of Ethernet switching in terms of “Two Core Switches, A rooted spanning tree, and Two More Layers with a sesame seed bun” then you may not perceive the value in this design.
You’ve probably worked out that this is a big solution. I’m told that, for the price, the practical starting point for a QFabric design is about 500 10GbE ports. That’s a lot of 10GbE in today’s terms, where most data centres need maybe forty or so to support a handful of blade servers.
But for Internet Exchanges, ISPs, Service Providers and Cloud Hosting, there is a lot of value in this product. They are customers where big is better, and where easy configuration and management is also a requirement.
The EtherealMind View
The first thing I’ve noticed is that Cisco perceives this product as a threat. It’s possible that once a customer buys into QFabric then Cisco is locked out of the account because it’s a platform, not a device. I’m hearing feedback that Cisco has prepared and is delivering an extensive “attack the weaknesses” campaign via their competitive marketing practice. This is unusual, as they will usually “play to their strengths”, and it signals that Cisco has a weakness here. In my view, Cisco has few strengths against QFabric for those customers where QFabric fits. For most Enterprise and Corporate customers, Cisco is a smaller solution and can be purchased in ITIL-sized chunks that suit project management processes just fine. Cisco doesn’t have much to worry about in the short and medium term.
I am concerned about the product reliability and technical excellence. There are a lot of moving parts in this system, and it will require a top-notch development process, disciplined testing, and quality assurance to bring it to market. Juniper has a good pedigree but the proof is in the execution. It’s still early days.
I also find myself returning to the question of Juniper’s commitment to the Enterprise. Somehow, I can’t shake the feeling that Juniper really only wants to work with Service Providers, Carriers etc. and that they continue to focus product and marketing in areas that aren’t relevant to corporate network engineers. The sense that QFabric is only for SP markets is something that I cannot shake.
On the bright side, Juniper has delivered a true innovation here with a whole new approach to building Ethernet Fabrics. Other companies are tackling the “Exploded Chassis” concept, such as Gnodal, and I suspect there will be more.
Finally, the centralised control plane is well placed for a secure virtualised hosting platform. Because the network control is centralised, it doesn’t need MPLS, or QinQ, or any other technical hack, overlay, tunnel or tag technology to provide secure separation. It also works well for a Software Defined Network overlay, either via OpenFlow or NETCONF or SLAX.
Let’s see more innovation in networking like QFabric. We need it.
I have visited Juniper as part of the Tech Field Day, which is a sponsored event. My accommodation and some entertainment were paid for as part of the event. I have also hosted the OpenFlow Symposium where Juniper was a sponsor. There is no commitment to write or discuss any topics as part of these events.
The opinions expressed in this article are my own. I made them up based on the information available, and I hope they are correct. If you have comments please leave them below and I’ll do my best to respond when I can.
Other Posts in A Series On The Same Topic
- What's Happening Inside an Ethernet Switch ? ( Or Network Switches for Virtualization People ) (11th January 2013)
- Tech Notes: Juniper QFabric - A Perspective on Scaling Up (14th February 2012)
- Switch Fabrics: Input and Output Queues and Buffers for a Switch Fabric (6th September 2011)
- Switch Fabrics: Fabric Arbitration and Buffers (22nd August 2011)
- What is an Ethernet Fabric ? (21st July 2011)
- What is the Definition of a Switch Fabric ? (30th June 2011)
- Juniper QFabric - My Speculations (1st June 2011)