Futures Review on 40 and 100 Gigabit Ethernet

I’ve been doing some reading on the 40 Gigabit and 100 Gigabit Ethernet standards. There are a number of interesting facts that are going to change the way we work in our data centres. Most interesting is that I think our fibre cabling plant will change dramatically from current practices.

Introduction

The currently approved standards for 40GbE and 100GbE actually multiplex 10GbE channels to make a single “physical” connection over multimode fibre. I believe this is a limitation of the current silicon and lasers that perform the SERDES functions: the current technology is not able to signal at higher clock rates, and thus 40GbE and 100GbE are actually “channel bonded” solutions. Since the bonding is performed at the physical layer, it is invisible to the user.
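As a rough sketch of what “channel bonding at the physical layer” means, the transmitter stripes a single stream across several lanes and the receiver reassembles it, so nothing above the PHY ever sees the individual lanes. This is only a conceptual model, not the actual IEEE 802.3ba multi-lane distribution (which stripes 64b/66b blocks and adds per-lane alignment markers):

```python
# Conceptual model of lane striping: round-robin distribution of blocks
# across virtual lanes and reassembly at the far end. The real 802.3ba PCS
# works on 64b/66b blocks with alignment markers; this just shows why the
# bonding is invisible to the layers above.

def stripe(blocks, num_lanes=4):
    """Distribute blocks round-robin across num_lanes virtual lanes."""
    lanes = [[] for _ in range(num_lanes)]
    for i, block in enumerate(blocks):
        lanes[i % num_lanes].append(block)
    return lanes

def reassemble(lanes):
    """Interleave the lanes back into the original block order."""
    blocks = []
    for i in range(max(len(lane) for lane in lanes)):
        for lane in lanes:
            if i < len(lane):
                blocks.append(lane[i])
    return blocks

data = [f"block{i}" for i in range(10)]
assert reassemble(stripe(data)) == data   # the MAC never sees the lanes
```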

What is SERDES ?

SERDES refers to the Serialiser / Deserialiser chips that perform the physical signal generation. From Wikipedia:

A Serializer/Deserializer (SerDes pronounced sir-dees) is a pair of functional blocks commonly used in high speed communications to compensate for limited input/output. These blocks convert data between serial data and parallel interfaces in each direction.

 

The basic SerDes function is made up of two functional blocks: the Parallel In Serial Out (PISO) block (aka Parallel-to-Serial converter) and the Serial In Parallel Out (SIPO) block (aka Serial-to-Parallel converter). There are 4 different SerDes architectures: (1) Parallel clock SerDes, (2) Embedded clock SerDes, (3) 8b/10b SerDes, (4) Bit interleaved SerDes.
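To make the PISO/SIPO description above concrete, here is a toy Python sketch of the two blocks. It is only a conceptual model; a real SerDes is clocked hardware with line coding (e.g. 8b/10b) and clock recovery, none of which is modelled here:

```python
# Toy model of the two SerDes blocks described above: PISO flattens
# parallel words into a serial bit stream, SIPO regroups the bits.

def piso(words, width=8):
    """Parallel In, Serial Out: flatten parallel words into a bit stream (MSB first)."""
    bits = []
    for word in words:
        for i in range(width - 1, -1, -1):
            bits.append((word >> i) & 1)
    return bits

def sipo(bits, width=8):
    """Serial In, Parallel Out: regroup the bit stream into parallel words."""
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for bit in bits[start:start + width]:
            word = (word << 1) | bit
        words.append(word)
    return words

data = [0xDE, 0xAD, 0xBE, 0xEF]
assert sipo(piso(data)) == data   # round trip recovers the original words
```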

Future versions are being developed that will use 25Gbps channels to reduce the cable count and improve die sizing.

40 Gigabit Ethernet summary

            40GBASE-CR4    40GBASE-SR4    40GBASE-FR    40GBASE-LR4
Signalling  4 x 10Gbps     4 x 10Gbps     1 x 40Gbps    4 x 10Gbps
Cable       Twinax         MPO MMF        Duplex SMF    Duplex SMF
Cable Spec  -              OM3/OM4        -             -
Connector   QSFP w/ CX4    QSFP           CFP           CFP / QSFP
When        Now            Now            2011/12       2011/12

100 Gigabit Ethernet summary

            100GBASE-CR10   100GBASE-SR10   100GBASE-LR4   100GBASE-ER4
Signalling  10 x 10Gbps     10 x 10Gbps     4 x 25Gbps     4 x 25Gbps
Cable       Twinax          MPO MMF         Duplex SMF     Duplex SMF
Cable Spec  -               ?               ?              ?
Connector   CXP             CXP or CFP      CFP            CFP
When        Now             Now             Now            2011/2012
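As a quick sanity check on the tables, each variant’s aggregate rate is simply the number of lanes multiplied by the lane rate. A small sketch, using only figures from the tables above:

```python
# Lane counts and lane rates taken from the tables above.
variants = {
    "40GBASE-SR4":   (4, 10),   # 4 lanes x 10Gbps over parallel MMF
    "40GBASE-LR4":   (4, 10),   # 4 wavelengths x 10Gbps over duplex SMF
    "100GBASE-SR10": (10, 10),  # 10 lanes x 10Gbps over parallel MMF
    "100GBASE-LR4":  (4, 25),   # 4 wavelengths x 25Gbps over duplex SMF
}

for name, (lanes, rate) in variants.items():
    print(f"{name}: {lanes} x {rate}Gbps = {lanes * rate}Gbps")
```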

How many cables ?

  • One interesting outcome is that 40GbE on multimode fibre will need eight fibre cores (four transmit, four receive). For 100GbE on multimode you will need twenty fibre cores.
  • The connector that has been standardised for 40GbE is the QSFP. [image: 40-100-gigabit-qsfp-1.jpg]
  • The connector that has (mostly) been standardised for 100GbE is the CXP. [image: 40-100-gigabit-qsfp-2.jpg]
  • There is no UTP copper standard, and according to comments, never will be.
  • The 40GBASE-CR4 uses four Twinax cables (I believe)
  • There will be a Twinax version, 100GBASE-CR10, that uses 10 x 10Gbps signalling. (Q. Why? A. For server edge connections, because there will be someone who thinks that a 100GbE-connected server will go faster.)
  • The CFP (C form-factor pluggable) transceiver features twelve transmit and twelve receive 10Gb/s lanes to support one 100GbE port, or up to three 40GbE ports. Its larger size is suitable for the needs of single-mode optics and can easily serve multimode optics or copper as well.
  • The CXP transceiver form factor also provides twelve lanes in each direction but is much smaller than the CFP and serves the needs of multimode optics and copper.
  • The QSFP (quad small-form-factor pluggable) is similar in size to the CXP and provides four transmit and four receive lanes to support 40GbE applications for multimode fibre and copper today, and may serve single-mode in the future. Another future role for the QSFP may be to serve 100GbE when lane rates increase to 25Gb/s. (The lane counts for these modules are sketched in code after this list.)
  • The current generation of modules is large because of heat dissipation issues due to high power consumption. For example, the CFP is rated for up to 24 watts of power dissipation but also needs a range of high density electrical connectors to connect to the baseboard. I take this to mean big, hot and heavy.
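To make the module comparison easier to scan, here is a small sketch of the lane counts per form factor, using only the figures given in the list above:

```python
# Electrical lane counts per transceiver form factor, as described above.
modules = {
    "CFP":  {"lanes": 12, "supports": "1 x 100GbE, or up to 3 x 40GbE"},
    "CXP":  {"lanes": 12, "supports": "1 x 100GbE (10 x 10Gbps)"},
    "QSFP": {"lanes": 4,  "supports": "1 x 40GbE (4 x 10Gbps)"},
}

for name, info in modules.items():
    print(f"{name}: {info['lanes']} lanes each direction -> {info['supports']}")
```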

Cabling Notes

Presenting these fibres to the connector will require specialised cabling. I don’t have too much information here, but this image shows the MPO plug for 40GbE multimode.

Photo from a Cisco presentation

  • In a 10GbE network, OM3 fibre can span up to 300m, while OM4 supports even longer channels.
  • In a 40GbE or 100GbE environment, OM3 can be used up to 100m and OM4 up to 150m, according to the IEEE 802.3ba standard.
  • For applications approaching 150m, the cable should be terminated with low-loss connectors.
  • The MPO connector uses 12 fibre cores.
  • For 40GbE this means that four cores are unused.
  • For 100GbE using 10Gbps SERDES on multimode, you need 2 x 12-core MPO connectors or 1 x 24-core MPO, leaving two cores unused in each 12-core connector (see the sketch below).
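The MPO arithmetic above works out as follows; a small sketch assuming one transmit and one receive core per 10Gbps lane and 12-core MPO connectors:

```python
MPO_CORES = 12   # cores per MPO connector (24-core variants also exist)

def cores_needed(lanes):
    """One transmit and one receive core per lane."""
    return lanes * 2

def mpo_connectors(lanes, cores_per_connector=MPO_CORES):
    cores = cores_needed(lanes)
    connectors = -(-cores // cores_per_connector)   # ceiling division
    unused = connectors * cores_per_connector - cores
    return connectors, unused

print(mpo_connectors(4))    # 40GbE:  (1, 4) -> one 12-core MPO, four cores unused
print(mpo_connectors(10))   # 100GbE: (2, 4) -> two 12-core MPOs, four cores unused in total
```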

[image: 40-100-gigabit-qsfp-4.jpg]

Reference: 40 GbE: What, Why & Its Market Potential – Ethernet Alliance

Can I use my existing cabling ?

  • From the research I’ve done, I don’t think that you will use your existing cabling infrastructure because you won’t have enough of it.
  • I think it’s more likely that you will purchase manufactured cables up to 150 metres long, with the idea of using them as patch leads to run from location to location within the data centre. There are a number of what I call “click/clack” cabling systems that deliver a modular cabling solution.
  • Therefore any investment in your static fibre plant for the data centre is probably wasted. The practice of running fibre to the top of rack and manually fusion-splicing pigtails is probably over. I predict that we will purchase all fibre optic cables as manufactured items.
  • This will help to guarantee the performance and the reuse of the cables as the data centres change over time (thus supporting green initiatives).

Comments

  • Brian Dooley

    Good stuff here Greg! Thanks, I didn’t realize the pipe would be multiplexed. I guess it makes sense when you pause to consider it though.

    • http://etherealmind.com Greg Ferro

      Yeah, it’s weird eh. It’s not until you research it that you realise it’s a dreadful kludge. It will work just fine though.

  • Paulie

    So with all these physical layer changes, what makes it still Ethernet? Ether MAC addresses and tiny frame sizes? That amount of cabling ends up being point to point in the end, especially when switch hardware (except apparently for Arista) can’t even keep up with 10GE…

    It’s ok for Ethernet to EOL on the really fast pipes, I’m sure there is some mainframe/mini framing format which could be resurrected and expanded to multi-megabyte frames for the 21st century. :-)

    • http://etherealmind.com Greg Ferro

      Heresy. Ethernet is the all conquering protocol. We shall not hear criticism of the best protocol.

      Seriously, lowest common denominator. Ethernet. Don’t like it, but that’s what we got.

  • http://www.arkf.net/blog Adam K-F

    G’day Greg,
    Great article. 40 and 100Gbit are still early days. Whilst early adopters will be looking at 24 core fibers, I suspect as new efficiencies and technology breakthroughs come down the pipe, we’ll see a consolidation and native wire speeds start to show up (rather than the 24 core monster we’re seeing here!).

    Watch this space and don’t rip out your existing infrastructures just yet.

  • Krunal

    hi Greg,

    It’s a really great blog post.

    I am just not sure why we implemented a 40 gig standard after 10 gig. Why not go directly from 10 gig to 100 gig, just like we jumped from 10 meg to 100 meg, 100 meg to 1 gig, and 1 gig to 10 gig? Why 10 gig to 40 gig? What group of people decided on or visualized the 40 gig standard in the data center?
    Krunal

    • http://etherealmind.com Greg Ferro

      I think it’s a price issue. The 40GbE components are a lot cheaper and offer immediate solutions to bandwidth problems in certain industry verticals.

      Also note that most existing Ethernet switches can’t support 100Gb forwarding rates. For example, each module on the C6500 is 40Gbps, which makes a C6509 a 7-port 40GbE switch… So 40GbE will be an intermediate point until the Ethernet switches get faster.

      greg

  • Pingback: Technology Short Take #9 - blog.scottlowe.org - The weblog of an IT pro specializing in virtualization, storage, and servers

  • Pingback: Backplane Ethernet – the GBaseK Standard — My Etherealmind

  • Pingback: Show 37 – Even More IPv6 Ready Than Last Week

  • http://subnetwork.greenmojo.org Jonathan Davis

    Great information here. I knew that 40GB and 100GB were multiplexed, but I always thought “surely it’s not like I’m imagining.” Apparently it is.

    I’m really curious as to how many people outside of the ISP space actually have a need for 40 and 100GB connections. Do you know of anyone who is bursting 80% of a 10GB connection?

    Don’t get me wrong, I understand that Google, Amazon, etc. are sure to be on-board, but outside of those types of companies, who has that kind of rack density/budget?

    • http://etherealmind.com Greg Ferro

      It’s my understanding that the driver for 40GbE and 100GbE is

      1) The IXPs and backbone providers today need faster interconnection on existing cabling.
      2) Certain financial institutions are looking for better latency from faster link speeds.
      3) Very large cloud providers such as Amazon, Google, Facebook etc. need higher switch interconnection performance.
      4) Certain niche storage systems could benefit.

      But for normal people like us, it’s not really a driver.

      • http://www.facebook.com/profile.php?id=100000054363300 Jimmy Shake

        High-cap video is also a driver. ISP/triple play providers such as the one I work for move large amounts of video. Uncompressed video off the receivers is as large as 250Mbps per stream; take a 300 channel lineup and make it redundant = lots of bandwidth.
        Take that as multicast, then encode it to 10 to 20Mbps HD streams and unicast it to your subscribers. All of these, plus normal voice and Internet services and cellular backhaul applications, ride the same backbones; 10G Ethernet and OC-192s are becoming obsolete.
        Many try to throw DWDM at it and hope it sticks, but lambdas are expensive, and if you can do 4 x 100G links vs 40 x 10G links, which would you prefer?

        • http://www.facebook.com/profile.php?id=100000054363300 Jimmy Shake

          Downside is still cost: Brocade’s 100G optics for their MLX platform are 180,000 list price and about 76,000 discounted.

  • Rob

    My ignorance… why wouldn’t 100Gb NICs on a server speed up network transfers faster than 40Gb, 10Gb or 1Gb?

    • Ken

      It has to do with server motherboard technology and the ability of the PCIe bus (and CPUs) to actually make use of the available bandwidth at 100GE.  40GE is attainable for servers in the near term, so they (parties interested in Data Center technology) wanted a higher speed technology that wouldn’t take 10 years to implement on servers.

    • http://etherealmind.com Etherealmind

      In the short term of say five to ten years, 100Gb will be too expensive for a single server. And, as Ken says, servers simply are not fast enough to generate that sort of traffic. The limitation is in the PCI bus, although later versions are getting a bit faster. 

      My current guess is that we won’t see 100Gb buses in servers until Optical Backplanes (such as http://etherealmind.com/hp-optical-backplanes/) arrive on server motherboards. That’s a long way into the future.

      Second, it’s much cheaper for companies to bond two or four 10Gb channels, or even two or four 40Gb channels in the future. And all servers are cheap these days – no one is much interested in a 100GBASE NIC that costs north of a hundred thousand dollars today.

  • Pingback: Show 37 – Even More IPv6 Ready Than Last Week – Gestalt IT

  • Pingback: Fibre Connectors — My Etherealmind

  • http://twitter.com/DeepStorageNet Howard Marks

    Greg,

    I think you’re right that the days of terminating or splicing fibre in the data center should be over. Today several vendors (Panduit, Tyco/AMP, Leviton) make modular patching systems that use MPO for rack-to-rack distribution. The lower cost of labor and faster installation more than make up for what additional cost there may be.

     – Howard