Switch Fabrics: Input and Output Queues and Buffers for a Switch Fabric

In the previous article, we saw that Switch Fabrics need to have buffers on the input. However, we didn’t clearly explain why an output buffer is needed. We’ve reached the point in the architecture where the switch fabric looks as follows:

Now that we have established that a Switch Fabric has buffers, I want to flip back to something practical. Let’s use an example of four servers connecting to a single switch and sharing an Ethernet-connected Storage Array. The array might be using iSCSI or FCoE to deliver SCSI applications to the Servers. The nature of storage traffic is such that it has transient spikes that must maintain low latency, and frames should not be dropped, or should be dropped only at very low rates (where low is usually regarded as about 1 in 10^8 frames).

Let’s consider just one direction: server to storage array. The servers are connected to inputs 1-4 on the fabric, and the storage array is connected to output 4 of the fabric.

[Figure: switch fabric pt3 1]

Let’s assume that, at a given instant in time, three of the servers attempt to access the array on output 4. A switch fabric cannot forward all of those frames to one output at the same time, so two of the inputs must be blocked or, better still, queued while they wait for the output to clear. [^1]

[Figure: switch fabric pt3 2]

Therefore at least two of the incoming frames are held in a buffer that is the Input Queue. The frame from Input 3 is forwarded across the fabric.
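The contention described above can be sketched in a few lines of Python. This is only an illustration, not how any real switch ASIC arbitrates: the function `arbitrate`, the frame names, and the "highest-numbered input wins" rule are all assumptions chosen to mirror the example, where Input 3 is forwarded and the other two frames wait in their Input Queues.

```python
from collections import deque


def arbitrate(requests, input_queues):
    """Grant one frame per output port; queue the losers at their inputs.

    requests: dict mapping input port -> (frame, output port) for this cycle.
    input_queues: dict mapping input port -> deque of (frame, output port).
    Returns a dict mapping output port -> (input port, frame) forwarded.
    """
    forwarded = {}
    # Hypothetical policy: the highest-numbered input wins, purely so the
    # result matches the example in the text.
    for in_port in sorted(requests, reverse=True):
        frame, out_port = requests[in_port]
        if out_port not in forwarded:
            forwarded[out_port] = (in_port, frame)
        else:
            # Output is already taken this cycle: hold the frame at the input.
            input_queues[in_port].append((frame, out_port))
    return forwarded


input_queues = {port: deque() for port in (1, 2, 3, 4)}
# Three servers all target the storage array on output 4.
requests = {1: ("frame-A", 4), 2: ("frame-B", 4), 3: ("frame-C", 4)}

forwarded = arbitrate(requests, input_queues)
queue_depths = {port: len(q) for port, q in input_queues.items()}
print(forwarded)     # {4: (3, 'frame-C')}
print(queue_depths)  # {1: 1, 2: 1, 3: 0, 4: 0}
```

A real fabric would use a fairer scheduler (round-robin, for example) so that no input is starved, but the outcome for a single cycle is the same: one frame crosses the fabric and the rest are buffered.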

[Figure: switch fabric pt3 3]

What happens if the Storage Array is not able to receive more frames? This is quite a common occurrence, since many Storage Arrays have relatively low input/output performance. The data transfer speed of disk drives is low, and array performance is derived from caching, drive multiplexing, and mechanical adaptations; these functions don’t need a lot of CPU power to drive data flows. The recent addition of deduplication, compression, and other data munging features may add further in-band latency. Whatever the reason, at some point the Array will not be able to receive inbound frames due to some transient condition.

In this case, the fabric is not causing the blockage. Therefore, the Switch Fabric should forward the frame into the output buffer while the fabric is clear, on the assumption that the frame will soon be forwarded to the Array.
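As a rough sketch of that idea, the output side can be modelled as a small bounded buffer that the fabric writes into and that drains only when the attached device is ready. The class `OutputPort`, its `capacity`, and the `receiver_ready` flag are hypothetical names for illustration; they do not correspond to any vendor's implementation.

```python
from collections import deque


class OutputPort:
    """Toy model of an output buffer sitting between the fabric and a device."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of buffered frames
        self.buffer = deque()
        self.receiver_ready = True    # False while the array cannot accept frames

    def accept(self, frame):
        """Called by the fabric. Returns False if the buffer is already full."""
        if len(self.buffer) >= self.capacity:
            return False
        self.buffer.append(frame)
        return True

    def drain(self):
        """Send one buffered frame if the attached device can take it."""
        if self.receiver_ready and self.buffer:
            return self.buffer.popleft()
        return None


port = OutputPort(capacity=2)
port.receiver_ready = False  # the array is busy for a moment

# The fabric keeps forwarding while it is clear; the third frame finds the
# buffer full and would have to wait at the input instead.
results = [port.accept(f) for f in ("frame-C", "frame-B", "frame-A")]
stalled = port.drain()       # nothing leaves while the array is busy

port.receiver_ready = True   # the transient condition clears
sent = port.drain()          # frames leave in arrival order

print(results, stalled, sent)  # [True, True, False] None frame-C
```

The point of the sketch is that the fabric itself stays free while the array is busy, up to the limit of the output buffer.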

[Figure: switch fabric pt3 4]

Queueing presents its own challenges. If there are too many queues then:

* overall frame latency increases
* queues require CPU cycles to handle the buffered frames, leading to bigger CPUs (more expense, power etc.)
* buffers require specialist high-speed memory to handle frames at the same rate as the fabric. This isn’t DRAM; it’s expensive, power-hungry, on-chip silicon.
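To put rough numbers on the first and last points, the following sketch estimates how long a frame waits behind others in a queue and how much buffer memory a queue of a given depth needs. The helper names and the example values (1518-byte frames, a 10Gb link, a depth of 64 frames) are assumptions for illustration only, and the calculation ignores preamble, inter-frame gap, and any per-frame overhead in the buffer memory.

```python
def queueing_delay_us(frames_ahead, frame_bytes, link_bps):
    """Microseconds a frame waits while the frames ahead of it drain."""
    bits_ahead = frames_ahead * frame_bytes * 8
    return bits_ahead / link_bps * 1e6


def buffer_bytes(queue_depth, frame_bytes):
    """Memory needed to hold queue_depth full-size frames."""
    return queue_depth * frame_bytes


# Two full-size frames queued ahead on a 10Gb output.
delay = queueing_delay_us(frames_ahead=2, frame_bytes=1518, link_bps=10e9)
# One queue that can hold 64 full-size frames.
memory = buffer_bytes(queue_depth=64, frame_bytes=1518)

print(f"{delay:.2f} us, {memory} bytes")  # 2.43 us, 97152 bytes
```

Multiply the memory figure by the number of ports and queues per port and it becomes clear why buffer sizing is a cost decision.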

Finally, the question is whether queueing should occur on the input or the output. Are there any impacts that are specific to a Switch Fabric when considering queueing?

[Figure: switch fabric pt3 5]

We will cover that in the next blog post.

[^1]: The time duration is not an instant; for a full-size frame it ranges from roughly a microsecond to over a hundred microseconds, depending on whether the interface speed is 100Mb, 1Gb, or 10Gb. This is because a frame of data has length, and it takes time for the frame to start and finish. Therefore the queue needs to buffer the frame until the output is cleared, which takes some finite amount of time.
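The time a frame occupies the wire can be worked out directly from its length and the link speed. The sketch below assumes a maximum-size 1518-byte Ethernet frame and ignores the preamble and inter-frame gap; `serialization_delay_us` is a hypothetical helper name.

```python
def serialization_delay_us(frame_bytes, link_bps):
    """Microseconds needed to clock a frame onto a link of the given speed."""
    return frame_bytes * 8 / link_bps * 1e6


speeds = (("100Mb", 100e6), ("1Gb", 1e9), ("10Gb", 10e9))
delays = {name: serialization_delay_us(1518, bps) for name, bps in speeds}

for name, delay in delays.items():
    print(f"{name}: {delay:.2f} us")
# 100Mb: 121.44 us
# 1Gb: 12.14 us
# 10Gb: 1.21 us
```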

Other posts in the series

  1. What's Happening Inside an Ethernet Switch ? ( Or Network Switches for Virtualization People )
  2. Tech Notes: Juniper QFabric - A Perspective on Scaling Up
  3. Switch Fabrics: Input and Output Queues and Buffers for a Switch Fabric (This post)
  4. Switch Fabrics: Fabric Arbitration and Buffers
  5. What is an Ethernet Fabric ?
  6. What is the Definition of a Switch Fabric ?
  7. Juniper QFabric - My Speculations
About Greg Ferro

Greg Ferro is a Network Engineer/Architect, mostly focussed on Data Centre, Security Infrastructure, and recently Virtualization. He has over 20 years in IT, working as a freelance consultant for a wide range of employers including Finance, Service Providers and Online Companies. He is CCIE#6920 and has a few ideas about the world, but not enough to really count.

He is a host on the Packet Pushers Podcast, blogger at EtherealMind.com and on Twitter @etherealmind and Google Plus

  • http://blog.ioshints.info Ivan Pepelnjak

    … and this is exactly why we have “Virtual Output Queues” on the ingress linecards in some switch architectures to prevent ingress head-of-line blocking.

    Great post
    Ivan

    • http://etherealmind.com Etherealmind

      And that (VOQs) is the topic of the next post in the series. 

  • http://www.m00nie.com m00nie

    Read the whole series of posts and just to say thanks
    Interesting to think about the fabrics in a little more detail than I usually do. 
    Looking forward to the VOQ post. 

    m00nie