In the previous article, we looked at the fact that Switch Fabrics need to have buffers on the input. However, we didn’t clearly explain why an output buffer is needed. We’ve reached the point in the architecture where the switch fabric looks as follows:
Now that we have established that a Switch Fabric has buffers, I want to flip back to something practical. Let’s use an example of four servers connecting to a single switch and sharing an Ethernet-connected Storage Array. The array might be using iSCSI or FCoE to deliver SCSI to the Servers. Storage traffic has transient spikes, must maintain low latency, and should not have frames dropped, or should drop them only at very low rates (where low is usually regarded as about 1 in 10^8 frames).
Let’s consider just one direction: server to storage array. The servers are connected to inputs 1-4 on the fabric, and the storage array is connected to output 4 of the fabric.
Let’s assume that, at a given instant in time, three of the servers attempt to access the array on output 4. A switch fabric cannot forward all three frames to the same output at the same time, so two of the inputs must be blocked or, better still, queued while they wait for the output to clear. [^1]
Therefore at least two of the incoming frames are held in a buffer, the Input Queue. The frame from Input 3 is forwarded across the fabric.
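The contention in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real fabric arbiter is built: the `arbitrate` function, the frame labels, and the "highest-numbered input wins" policy are all hypothetical, chosen so that Input 3 wins as in the example.

```python
from collections import deque


def arbitrate(input_queues):
    """Run one fabric cycle in which each output accepts at most one frame.

    input_queues maps an input port number to a deque of
    (frame_id, output_port) tuples. Returns the frames forwarded
    this cycle as (input_port, frame_id, output_port) tuples.
    """
    granted_outputs = set()
    forwarded = []
    # Assumed policy: the highest-numbered input is served first.
    for port in sorted(input_queues, reverse=True):
        queue = input_queues[port]
        if not queue:
            continue
        frame_id, out = queue[0]
        if out in granted_outputs:
            continue  # output already taken; frame waits in the input queue
        granted_outputs.add(out)
        queue.popleft()
        forwarded.append((port, frame_id, out))
    return forwarded


# Three servers (inputs 1-3) each send one frame to the array on output 4.
input_queues = {
    1: deque([("A", 4)]),
    2: deque([("B", 4)]),
    3: deque([("C", 4)]),
    4: deque(),
}

first_cycle = arbitrate(input_queues)
still_queued = sum(len(q) for q in input_queues.values())
second_cycle = arbitrate(input_queues)
third_cycle = arbitrate(input_queues)
```

After the first cycle only the frame from Input 3 has crossed the fabric and two frames remain in input queues; they drain one per cycle as the output clears.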
What happens if the Storage Array is not able to receive more frames? This is quite a common occurrence, since many Storage Arrays have relatively low input/output performance. The data transfer speed of disk drives is low, and array performance is derived from caching, drive multiplexing and mechanical adaptations; these functions don’t need a lot of CPU power to drive data flows. The recent addition of deduplication, compression, and other data munging features may create further in-band latency. Whatever the reason, at some point the Array will not be able to receive inbound frames due to some transient condition.
In this case, the fabric is not causing the blockage. Therefore, while the fabric itself is clear, the Switch Fabric should forward the frame into the output buffer on the assumption that the frame will soon be forwarded to the Array.
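A rough model of that output buffer is sketched below. The `OutputPort` class, its `capacity` of two frames, and the `receiver_ready` flag are hypothetical names and values invented for illustration; real hardware signals "not ready" through mechanisms that are not modelled here.

```python
from collections import deque


class OutputPort:
    """Output buffer in front of a device that may briefly stop receiving."""

    def __init__(self, capacity):
        self.capacity = capacity    # frames the buffer can hold (assumed value)
        self.buffer = deque()
        self.receiver_ready = True  # False while the array cannot accept frames

    def accept_from_fabric(self, frame):
        """Take a frame from the fabric; return False if the buffer is full."""
        if len(self.buffer) >= self.capacity:
            return False
        self.buffer.append(frame)
        return True

    def transmit(self):
        """Send the oldest buffered frame if the attached device is ready."""
        if self.receiver_ready and self.buffer:
            return self.buffer.popleft()
        return None


port = OutputPort(capacity=2)
port.receiver_ready = False  # the array hits a transient stall

# The fabric keeps forwarding while the array is stalled.
accepted = [port.accept_from_fabric(f) for f in ("A", "B", "C")]
sent_while_stalled = port.transmit()

port.receiver_ready = True   # the array recovers
sent_after_recovery = port.transmit()
```

The first two frames leave the fabric and wait in the output buffer; the third is refused once the buffer is full, which is the point where the pressure moves back to the input side.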
Queueing presents its own challenges. If there are too many queues then:
* overall frame latency increases
* queues require CPU cycles to handle the buffered frames, leading to bigger CPUs (more expense, power etc.)
* buffers require specialist high-speed memory to handle frames at the same rate as the fabric. This isn’t DRAM; it’s expensive, high-power, on-chip silicon.
Finally, the question is whether queueing should occur on the input or the output. Are there any impacts that are specific to a Switch Fabric when considering queueing?
We will cover that in the next blog post.
[^1]: The time duration is not an instant. For a full-size frame it ranges from roughly a microsecond to over a hundred microseconds, depending on whether the interface speed is 100Mb, 1Gb, or 10Gb. This is because a frame of data has length, so it takes time for the frame to start and finish. Therefore the queue needs to buffer the frame until the output is cleared, which takes some finite amount of time.
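The time a frame occupies the output can be estimated as frame length in bits divided by link speed. The sketch below assumes a 1518-byte frame (the maximum standard Ethernet frame) and ignores preamble and inter-frame gap; the function name is made up for this example.

```python
def serialization_delay_us(frame_bytes, link_bps):
    """Time to clock one frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_bps * 1e6


FRAME_BYTES = 1518  # assumed frame size, excluding preamble and inter-frame gap

LINK_SPEEDS = {"100Mb": 100e6, "1Gb": 1e9, "10Gb": 10e9}

delays = {
    name: serialization_delay_us(FRAME_BYTES, bps)
    for name, bps in LINK_SPEEDS.items()
}

for name, delay in delays.items():
    print(f"{name}: {delay:.2f} us")
```

Under these assumptions the result is about 121 microseconds at 100Mb, 12 microseconds at 1Gb, and 1.2 microseconds at 10Gb, which is how long a competing frame must sit in a queue before the output is free.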
Other Posts in This Series
- What's Happening Inside an Ethernet Switch ? ( Or Network Switches for Virtualization People ) (11th January 2013)
- Tech Notes: Juniper QFabric - A Perspective on Scaling Up (14th February 2012)
- Switch Fabrics: Input and Output Queues and Buffers for a Switch Fabric (6th September 2011)
- Switch Fabrics: Fabric Arbitration and Buffers (22nd August 2011)
- What is an Ethernet Fabric ? (21st July 2011)
- What is the Definition of a Switch Fabric ? (30th June 2011)
- Juniper QFabric - My Speculations (1st June 2011)
… and this is exactly why we have “Virtual Output Queues” on the ingress linecards in some switch architectures to prevent ingress head-of-line blocking.
Great post 😉
Ivan
And that (VOQs) is the topic of the next post in the series.
Read the whole series of posts and just to say thanks 🙂
Interesting to think about the fabrics in a little more detail than I usually do.
Looking forward to the VOQ post.
m00nie
Hello,
This was a good article. I am trying to find more information on VOQ – what are the different components that it requires to work like scheduler, queue management (entry/deletion), etc.
You mentioned here that VOQ will be your next post, but I could not find it on your page.
Could you please point me to it if you have posted it. Or else, can you point me to some reference material that explains it well.
Appreciate your help.
Thank you.
Regards,
Shamanth.