Network Fabric: TRILL for Server and Network People. Welcome RBridges


TRILL is a key network technology for enabling Cloud Computing: it allows better migration of VMs, better utilisation of the network switching fabric, and much improved stability of the Data Centre Server Fabric.

TRILL (Transparent Interconnection of Lots of Links)

TRILL is a proposed technology that is intended for use in the data centre. It is a fundamental network technology enabler for Cloud Computing because it offers:

  • increased switching density
  • better path utilisation
  • underlying support for highly mobile servers in a virtualised server environment
  • faster convergence, at power on and in response to failures.

The support for highly mobile virtual servers within VLANs is a big feature.

From the IETF Charter for TRILL


The design should have the following properties:


  • Minimal or no configuration required
  • Load-splitting among multiple paths
  • Routing loop mitigation (possibly through a TTL field)
  • Support of multiple points of attachment
  • Support for broadcast and multicast
  • No significant service delay after attachment
  • No less secure than existing bridged solutions
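The TTL item in that list is easy to picture with a toy forwarding loop. This is only a sketch under my own assumptions, not TRILL's actual header or forwarding logic: the three-bridge loop, the `forward` function and the starting hop count of 8 are all hypothetical.

```python
# Toy illustration: a frame caught in a forwarding loop between three
# bridges. Without a hop count it circulates until the simulation gives
# up; with one it is discarded after a bounded number of hops.

# Hypothetical misconfigured topology: A -> B -> C -> A.
NEXT_HOP = {"A": "B", "B": "C", "C": "A"}

def forward(start, hop_count=None, give_up_after=1000):
    """Return (hops_taken, outcome) for a frame injected at `start`."""
    node, hops = start, 0
    while hops < give_up_after:
        if hop_count is not None:
            if hop_count == 0:
                return hops, "dropped: hop count exhausted"
            hop_count -= 1          # each bridge decrements before forwarding
        node = NEXT_HOP[node]
        hops += 1
    return hops, "still looping"

print(forward("A"))                 # no hop count: loops until we stop simulating
print(forward("A", hop_count=8))    # hop count: discarded after 8 hops
```

Classic Ethernet frames carry no such field, which is why a loop in a bridged network is so damaging and why Spanning Tree has to prevent loops outright.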

It’s also very interesting that TRILL is an Ethernet standard being progressed through the IETF (and not the IEEE), and is led by Radia Perlman, the creator of Spanning Tree.

Why have TRILL at all?

In current Layer 2 networks we use Spanning Tree (in all its forms) to ensure that there is only a single path across the network. At the same time, we deliberately design one or more redundant paths for failover in the event that the primary path fails. These redundant paths are blocked, so many paths in the network are not used to carry frames. Thus:

  • Bandwidth is potentially underutilised, especially when more than two paths exist.
  • the shortest (and thus fastest) path through the Layer 2 network is often not practically possible, so most paths are inefficient
  • because it’s not proprietary
  • because 802.1aq isn’t your cup of tea.

Current State of Spanning Tree

There are several forms of Spanning Tree used today:

  • Common Spanning Tree (CST)
  • Per VLAN Spanning Tree (PVST/PVST+)
  • Rapid Spanning Tree
  • Multiple Instance Spanning Tree

All types of spanning tree have one fundamental idea: a single tree, representing a single path through the switching network, is established for a given VLAN and all alternate paths are blocked. Each of the above offers improvements in speed of convergence using various features. However, even in the best cases, it takes seconds for paths to converge on the final solution.
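To make the "single tree, everything else blocked" idea concrete, here is a rough sketch. It is not the real Spanning Tree algorithm (no BPDUs, bridge IDs or port costs); it just grows a breadth-first tree from a chosen root over a small topology I made up, then lists the links left out of the tree.

```python
from collections import deque

# Hypothetical topology: two core switches and two access switches,
# fully meshed for redundancy.
LINKS = [
    ("core1", "core2"),
    ("core1", "acc1"), ("core1", "acc2"),
    ("core2", "acc1"), ("core2", "acc2"),
    ("acc1", "acc2"),
]

def neighbours(links):
    """Build an adjacency list from a list of undirected links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj

def spanning_tree(links, root):
    """Breadth-first tree from the root; returns the set of forwarding links."""
    adj = neighbours(links)
    seen, queue, tree = {root}, deque([root]), set()
    while queue:
        node = queue.popleft()
        for peer in adj[node]:
            if peer not in seen:
                seen.add(peer)
                tree.add(frozenset((node, peer)))
                queue.append(peer)
    return tree

tree = spanning_tree(LINKS, "core1")
blocked = [link for link in LINKS if frozenset(link) not in tree]
print("forwarding links:", len(tree))
print("blocked links:", blocked)
```

In this made-up topology half of the links carry no frames at all, which is the underutilisation described above.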

Suboptimal path

Consider the following server/switch connections where two servers are connected to a switch (with one or more ports if using LACP). You can see that there are several possible paths between the two servers.

trill-1.jpg

Typically, a switch backbone for a given VLAN will be configured to force the tree to be at the core of the network. As a result the path between the servers will look something like this (if the top left switch is configured as the primary root switch):

trill-2.jpg

Now this might seem OK until you realise that the entire data centre network looks more like this:

trill-3.jpg

Ok, so this isn’t absolutely perfect, but you should get the idea. There are very good operational reasons to build a data centre this way. Very large organisations with substantial resources will do it smarter and better using PVST+ and port/path costs to optimise and try to balance some load. In real life, this is the standard Enterprise switching backbone for data centres.

That link between the core switches needs to be big, and many of the uplinks are also large. This has been a big driver for 10Gb Ethernet in the early stages.

But when you think about it, what we would really like to achieve is the following shortest path through the network:

trill-4.jpg

As you can see, this means that the core is now more lightly loaded and that, scaling this up, the core is less likely to be bottlenecked around Core Switch 1.

A common scenario for this is when using top-of-rack Ethernet switches and two servers are in adjacent racks, yet the Ethernet frames travel via the core, adding a small amount of latency.
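The adjacent-rack case can be sketched in a few lines. Everything here is an assumption for illustration: two top-of-rack switches (`tor1`, `tor2`) with a direct link between them, both uplinked to two cores, and a spanning tree rooted at `core1` that has blocked the direct link. The code simply compares the fewest-hop path over the tree links with the fewest-hop path over all links.

```python
from collections import deque

def shortest_path(links, src, dst):
    """Fewest-hop path from src to dst over the given links (breadth-first)."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:     # walk back to the source
                path.append(node)
                node = prev[node]
            return path[::-1]
        for peer in adj.get(node, []):
            if peer not in prev:
                prev[peer] = node
                queue.append(peer)
    return None                          # no path exists

# Hypothetical physical topology.
ALL_LINKS = [
    ("core1", "core2"),
    ("core1", "tor1"), ("core1", "tor2"),
    ("core2", "tor1"), ("core2", "tor2"),
    ("tor1", "tor2"),
]
# Links left forwarding by a spanning tree rooted at core1.
TREE_LINKS = [("core1", "core2"), ("core1", "tor1"), ("core1", "tor2")]

via_tree = shortest_path(TREE_LINKS, "tor1", "tor2")
direct = shortest_path(ALL_LINKS, "tor1", "tor2")
print("spanning tree path:", via_tree)   # goes up to the core and back down
print("shortest path:     ", direct)     # stays between the two racks
```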

Note also that a failure event at Core 1 will have less overall impact on the entire network.

VM migration / mobility

Now consider what happens when all of these servers are virtual servers and then you use a vserver mobility tool to move ServerA across the backbone when using spanning tree:

trill-optimal-path-5.jpg

The problem here is that the path is still shared by all the other servers. The load across the backbone may drastically alter the available bandwidth.

trill-optimal-path-4.jpg

With the shortest path in use instead, there is less chance of impact since the frames are now taking an alternate path. Your backbone is much more likely to handle the changes and surges in Internet traffic, and more generally you would have better overall usage of the available infrastructure.

Of course, you can’t be certain that this path has available bandwidth but there are other standards being developed to help solve this problem.

Multiple Paths at Layer 2

The other improvement that TRILL makes available is multiple paths at layer 2. Consider the following model:

trill-multiple-paths-1.jpg

This is a major breakthrough compared to Spanning Tree and will drastically improve the available bandwidth because many links can be used to the destination.

It will, however, make packet capture in the network much more difficult, since predicting the path through the Layer 2 network is quite complicated.
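Here is a rough sketch of what multipathing means in practice, and why the path becomes harder to predict. It is not taken from the TRILL drafts: the two-edge, four-core fabric is made up, and hashing on the source and destination MAC pair is just one plausible way to pin a flow to a path (real implementations choose their own hash inputs).

```python
import zlib
from collections import deque

def equal_cost_paths(links, src, dst):
    """All fewest-hop paths from src to dst, assuming every link costs the same."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    # Breadth-first search gives the hop distance of every switch from src.
    dist, queue = {src: 0}, deque([src])
    while queue:
        node = queue.popleft()
        for peer in adj[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    if dst not in dist:
        return []

    def walk_back(node):
        # Step only to neighbours that are exactly one hop closer to src.
        if node == src:
            return [[src]]
        return [path + [node]
                for peer in adj[node] if dist[peer] == dist[node] - 1
                for path in walk_back(peer)]

    return sorted(walk_back(dst))

def pick_path(flow, paths):
    """Pin a flow to one path with a stable hash so its frames stay in order."""
    return paths[zlib.crc32(repr(flow).encode()) % len(paths)]

# Hypothetical fabric: two edge switches, each linked to four core switches.
LINKS = [(edge, f"core{n}") for edge in ("edge1", "edge2") for n in range(1, 5)]
paths = equal_cost_paths(LINKS, "edge1", "edge2")
flow = ("00:11:22:33:44:55", "66:77:88:99:aa:bb")   # (source MAC, destination MAC)
print(len(paths), "equal-cost paths; this flow uses", pick_path(flow, paths))
```

All four paths can carry traffic at once, but to know which core switch a given conversation crosses you now need to know the hash, which is exactly the packet capture headache.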

Terminology: Frames and Packets

The OSI model defines an abstraction of communication protocols that uses seven layers, where Layer 3 is the Network Layer and Layer 2 is the Data Link Layer. Typically, Network Engineers use the term Packets to refer to Layer 3 data encapsulations, and Frames to refer to Layer 2 encapsulations. In the past, FDDI, Token Ring and Ethernet all carried Frames, and AppleTalk, IPX and IP all carried Packets. For many people today, only Ethernet and IP exist and it is easy to confuse the terms.
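If the distinction feels abstract, pulling one apart in code helps. This sketch hand-builds an Ethernet II frame carrying a bare IPv4 header and then splits it: the first 14 bytes are the Layer 2 frame header, and what follows is the Layer 3 packet. The MAC and IP addresses are invented, and the IP checksum is left at zero because nothing here validates it.

```python
import struct

# Hand-built example: 14-byte Ethernet II header + 20-byte IPv4 header.
FRAME = (
    bytes.fromhex("001122334455")     # destination MAC
    + bytes.fromhex("66778899aabb")   # source MAC
    + bytes.fromhex("0800")           # EtherType 0x0800 = IPv4
    + bytes.fromhex("45000014" "00000000" "40010000"  # ver/len, id, TTL/proto
                    "0a000001" "0a000002")            # 10.0.0.1 -> 10.0.0.2
)

def mac(raw):
    return ":".join(f"{octet:02x}" for octet in raw)

def ip(raw):
    return ".".join(str(octet) for octet in raw)

def parse(frame):
    """Split a frame into its Layer 2 header fields and the Layer 3 packet."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    packet = frame[14:]               # everything after the frame header
    src_ip, dst_ip = struct.unpack("!4s4s", packet[12:20])
    return {
        "frame": {"dst": mac(dst_mac), "src": mac(src_mac), "ethertype": ethertype},
        "packet": {"src": ip(src_ip), "dst": ip(dst_ip), "ttl": packet[8]},
    }

parsed = parse(FRAME)
print("Layer 2 frame: ", parsed["frame"])
print("Layer 3 packet:", parsed["packet"])
```

A classic bridge only ever looks at the `frame` part; a router looks at the `packet` part.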

TRILL – a fast overview on how it works

TRILL uses the concept of a Routing Bridge, known as an RBridge, running the IS-IS routing protocol. (Older engineers will recall Bridging Routers or BRouters; I find it amusing that this is an obvious return to bridging and a step away from routing.) IS-IS is a link state routing protocol that is mostly used by Global Service Providers to carry IP routes within their networks (with BGP used between Service Providers), but it is actually protocol independent. IS-IS does not use IP to establish neighbour relationships; it uses OSI protocols, including CLNS and its PDUs, to perform the neighbour and protocol exchanges. This means that IS-IS works for IPv4 and IPv6 and can equally be used for other protocols, as is proposed by TRILL.

TRILL uses IS-IS to carry routing information about MAC Addresses devices connected to VLANs and to build a shortest path tree for each MAC address in the VLAN.
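A link state protocol boils down to every node holding the same map and running a shortest-path calculation over it. The sketch below is plain Dijkstra over a small link state database, not anything lifted from the TRILL or IS-IS specifications; the RBridge names (`RB1` to `RB4`) and link costs are made up, and I am assuming the database describes links between RBridges.

```python
import heapq

# Hypothetical link state database: each RBridge lists its neighbours and
# the cost of the link to each one.
LSDB = {
    "RB1": {"RB2": 1, "RB3": 1},
    "RB2": {"RB1": 1, "RB4": 1},
    "RB3": {"RB1": 1, "RB4": 3},
    "RB4": {"RB2": 1, "RB3": 3},
}

def spf(lsdb, root):
    """Dijkstra: returns {node: (cost, parent)} for the tree rooted at root."""
    best = {root: (0, None)}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node][0]:
            continue                      # stale heap entry, already improved
        for peer, link_cost in lsdb[node].items():
            new_cost = cost + link_cost
            if peer not in best or new_cost < best[peer][0]:
                best[peer] = (new_cost, node)
                heapq.heappush(heap, (new_cost, peer))
    return best

tree = spf(LSDB, "RB1")
for node in sorted(tree):
    cost, parent = tree[node]
    print(f"{node}: cost {cost} via {parent}")
```

Because every node runs the same calculation from its own position, each one ends up with its own shortest path tree, rather than the whole network sharing one tree as with Spanning Tree.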

TRILL intends to maintain compatibility with existing Spanning Tree implementations, so the two can co-exist and networks can be progressively migrated.

Thoughts and Personal Opinions

One of the most fascinating things about TRILL is that it is conceptually identical to IEEE 802.1aq. Why would someone create a competing standard? Because the IEEE is slow, ponderous and bad company. Many times we have seen the IEEE take far too long to develop standards and fail to resolve conflicts between competing interests.

Radia Perlman, the creator of Spanning Tree, currently works for Sun and appears to be the lead author of TRILL. Other authors include Dinesh G. Dutt from Cisco and Silvano Gai from Nuova (now Cisco Nexus products). These people are at the core of the data centre fabric development and their employers Sun and Cisco are promoting their Cloud Computing credentials.

Radia Perlman has also criticised the IEEE in her outstanding and seminal textbook on bridging – Interconnections: Bridges, Routers, Switches, and Internetworking Protocols – Radia Perlman – Addison-Wesley 2000. (Do yourself a favour and buy this book, it’s brilliant!)

I can’t get access to the current drafts of the IEEE since they are not open to the public. However, the most recent draft, 1.5, was released on 2008/12/18 and the work appears to have been stumbling along for more than two years. Reviewing the available material suggests that the IEEE is not reaching any agreement at this stage.

Conclusion

I hope I have done this topic justice. I have only the IETF document to read and some of my own experiences. If anyone has more information or can point out my mistakes, please leave a note in the comments and I will respond, or you can contact me using my contact page http://etherealmind.com/contact. I look forward to hearing back.

References

Problem & Applicability Statement

http://www.ietf.org/interet-drafts/draft-ietf-trill-prob-06.txt

About Greg Ferro

Greg Ferro is a Network Engineer/Architect, mostly focussed on Data Centre, Security Infrastructure, and recently Virtualization. He has over 20 years in IT, working as a freelance consultant for a wide range of employers including Finance, Service Providers and Online Companies. He is CCIE#6920 and has a few ideas about the world, but not enough to really count.

He is a host on the Packet Pushers Podcast, blogger at EtherealMind.com and on Twitter @etherealmind and Google Plus

  • Dave Allan

    TRILL is not that conceptually aligned with 802.1aq other than that the service model is similar. 802.1aq was focused on re-use of Ethernet, specifically leveraging 802.1ag (OAM), 802.1ah (adaptation+large scale virtualization) and 802.1Qay (population of the FDB by management or control plane). TRILL was about creating a new ubiquitous specialized L3 specifically for Ethernet: different constraints, different results.

    The critique of IEEE is IMO ill-informed. IETF (for example) resolves conflicts by publishing multiple RFCs and letting the industry decide, which simply punts their problems onto all of us. IEEE at least has the concept of “distinct identity” for a given project. It may take a little longer to get a standard but any industry confusion is capped there….

    BTW 802.1aq is progressing nicely, has numerous pre-standard deployments, and should be baked as a standard in 1H2010…

    • http://etherealmind.com Greg Ferro

      I guess I have to disagree. I have looked at the IETF web site and I cannot get access to any information about the progress of the meetings, or read the latest documents. For something that is supposed to be a ‘standard’ it’s not very transparent. Given the time/date stamps it would seem that things are not going smoothly, but I can’t get any details to confirm or deny that since it’s all done in secret.

      I also remain skeptical about the IEEE competence. Ethernet is a success in spite of IEEE procedures and processes when I consider the multiple Wireless LAN cockups. Not to mention VLAN tagging…. oh I could go on.

      It’s a shame that it won’t be ready until 2H2010, it was originally promised in 2008, maybe early 2009.

      Until the IEEE does it better, I will remain critical. Show me results and transparency.

      • Dave Allan

        FYI, 2010 is because 802.1ah as well as 802.1ad is now in scope. That aspect only commenced early in 2007.

        I do not believe the drafts are publicly available, but most of the input to them is, check out:

        http://www.ieee802.org/1/wp-content/uploads/public/docs2009/

        If you want to go backwards in time, the directories are also there for previous years…enjoy!

        IMO there has been steady progress since 802.1ah was in scope, as it was a more natural fit for what SPB was trying to achieve….

        cheers
        D

      • http://www.pothole.com/~dee3 Donald Eastlake 3rd

        Hi,

        I don’t understand why you are having difficulties getting access to what is going on in the TRILL WG in the IETF. The charter, which I admit is a little out of date, is here:
        http://www.ietf.org/dyn/wg/charter/trill-charter.html

        That charter page has links to the one RFC the TRILL WG has had published and to the base protocol specification draft, currently at:
        http://www.ietf.org/id/draft-ietf-trill-rbridge-protocol-13.txt

        As drafts are updated in the IETF, the direct text link, as above, to the old version goes away and a new link with the version number incremented is created, but here is a link to an html-ized version of draft -13 which should continue to work even after draft -14 comes out:
        http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-13

        As for what is happening in the TRILL WG itself, minutes of all its meetings, like those for every IETF working group, are published in the proceedings of the IETF meetings. The proceedings are all linked from here:
        http://www.ietf.org/meeting/proceedings.html
        TRILL did not meet at the 72nd or 75th IETF meetings and I think there is at least one earlier one it skipped but if you go look at, for example, the proceedings for the 73rd and 74th IETF meetings, it is not that hard to find the TRILL meeting minutes. In fact, if you can’t make it to an IETF meeting, most of the WG meetings are publicly broadcast by streaming audio and you can call in to ask questions and there is usually an IRC channel with someone posting summaries of what is happening in real time and relaying questions anyone on the IRC channel has.
        In fact, I find it very hard to conceive of any practical way in which IETF working groups could be more open. (I haven’t even mentioned the IETF fundamental that all the working group mailing lists are open and anyone can subscribe.)

        Thanks,
        Donald

        • http://etherealmind.com Greg Ferro

          Donald

          I claimed that the IEEE is not open. You can’t get any material on what they are doing, or how they do it, or what the current progress is. Which is quite annoying for ‘public’ standards.

          I was easily able to access and evaluate material from the IETF. I agree, you could not be any more open.

          Greg

          • http://www.pothole.com/~dee3 Donald Eastlake 3rd

            OK. Thanks. It is true that the IEEE process is much more secretive than the IETF process. However, if you go back to your post, you will see that you typoed “IEEE” and “IETF” in the first line of your comment so I was confused.

            Donald

  • http://www.pothole.com/~dee3 Donald Eastlake 3rd

    The referenced Internet Draft (draft-ietf-trill-prob-06.txt) has now been published as RFC 5556.
    See http://www.ietf.org/rfc/rfc5556.txt

    The TRILL base protocol specification is now quite mature and the latest version is available at
    http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-14

    Thanks,
    Donald

  • peter ashwood-smith

    There is a lot of good information on 802.1aq on wikipedia under “IEEE 802.1aq”.
    We’ve been trying to keep it up to date with what’s happening at the IEEE.

    In particular the l2 multipathing work has progressed rather well and now allows 16 symmetric algorithms with additional opaque mechanisms for the addition of many more. What is very interesting about this work is that these are head end chosen so you get a lot of control over how traffic will be placed.

  • http://blog.INE.com Petr Lapukhov

    Unfortunately, TRILL solves only a few of the problems inherent to Ethernet. Replacing bridging with routing does have some obvious benefits, but from the global point of view it does not look all that rational. Packet switched networks faced and resolved the same issues (optimum traffic engineering) a long time ago. As opposed to making Ethernet “routable on its own”, it would make sense to reuse existing packet networks and technologies (e.g. MPLS traffic engineering) for optimized transportation (similar to what OTV tries to achieve).

    Next, the problems of MAC address table growth and automatic learning have not been completely addressed. With TRILL, edge device MAC address tables will grow linearly as the number of endpoints grows. The root cause of the problem is that MAC addresses serve as pure endpoint IDs and not as location identifiers, and consequently cannot be aggregated. A solution would be to reuse the IP layer for routable locator IDs while keeping MAC addresses for Endpoint IDs. To control EID address table explosion, some sort of distributed hash table for EID to RLOC mapping storage could be used (similar to what the SEATTLE protocol does).

    To me TRILL seems like good work but not a good solution – it doesn’t try to effectively reuse existing technologies, nor does it solve all Ethernet scalability problems.

    • http://www.pothole.com/~dee3 Donald Eastlake 3rd

      Hi Petr,

      TRILL does *not* claim to scale larger than classic bridged LANs, at least not to any significant extent. As you try to scale Layer 2, you have fundamental problems of growth in the broadcast domain and growth in the number of non-hierarchical MAC addresses. TRILL supports VLANs to help on the broadcast domain front and requires only “edge” RBridges to learn end-station MAC addresses, but there are other solutions with those advantages.

      The real advantages of TRILL are support for multipathing of both unicast and multi-destination traffic (you can easily support hundreds of equal cost paths for unicast with TRILL), the safety of a TTL, easy incremental deployment into an existing bridged LAN composed of commodity classic 802.1Q bridges, an options feature, optional support for end stations doing the encapsulation to an egress RBridge located through a directory service, etc., etc.

      Thanks,
      Donald

      • http://blog.INE.com Petr Lapukhov

        Hi Donald,

        Thanks for your response! For scalability, I would say that TRILL does help quite a bit, as it reduces the impact of unicast/broadcast flooding. Of course, MAC address scalability problem is not completely resolved, though it is mainly pushed to the “Edge” devices.

        As for multipathing, it looks like this feature could have been implemented by simply tunneling Ethernet over any existing packet switched technology e.g. VPLS with FAT or entropy labels or just OTV. TRILL applies tunneling concepts as well, but does that within the scope of Ethernet headers. I understand that OTV and VPLS could be mainly viewed as DCI solutions, but that does not prevent them from being used for intra-site connectivity as well.

        I believe one main “real-world” argument of TRILL proponents is cheaper cost per port for Ethernet compared to the use of VPLS edge devices. However, using layer 3 ethernet switches with added functionality of Ethernet tunneling could keep costs low as well, and it adds approximately the same complexity in the control plane as using TRILL.

        Now for multipathing again TRILL re-uses the well-known shortest-path routing concepts. The problem is that, unless specifically engineered topologies are in use, obtaining a rich set of ECMP paths could be challenging. Optimal link utilization by means of link weight tuning for a link-state protocol is an NP-complete task, which does not always result in optimum utilization for every topology. Though, I should mention that it does achieve satisfactory performance in many practical cases.

        However, for true optimum bandwidth utilization it may happen that an explicit traffic engineering solution is needed. This significantly boosts complexity as it requires serious changes in control and data plane. VPLS-based solutions could achieve that more easily, of course.

        So these are my main issues with TRILL. To summarize:

        1) Why reinvent routing for ethernet when we can tunnel ethernet over routed networks.
        2) Shortest-path routing could not always yield optimum bandwidth utilization (though it’s better than STP) so additional traffic engineering could be needed.

        I do understand that the trump card of TRILL is plug-and-play behavior. Using any sort of tunneling techniques (e.g. VPLS or OTV) adds extra operational overhead, but it could be kept to a minimum using auto-discovery techniques. Plus, administrative overhead adds better control and security over simple plug-and-play behavior.

        Thanks,

        Petr

        • http://blog.ioshints.info Ivan Pepelnjak

          TRILL does NOT reduce the impact of unicast/multicast flooding. No technology that claims to be fully compatible with transparent bridging can do that.

          • http://blog.INE.com Petr Lapukhov

            The main improvement that TRILL brings to the process of unicast flooding is removal of STP. One of the main issues with STP reconvergence was MAC address table flushing process that resulted in unicast flood bursting. Topology-aware learning eliminates that problem, though it does not eliminate unicast flooding per se, of course.

            As for multicast flooding, it simply becomes better optimized, as separate distribution trees could be pre-calculated and used, compared to STP instances in classic Ethernet.

  • http://blog.ioshints.info Ivan Pepelnjak

    Differences between TRILL and 802.1aq unicast forwarding:

    http://blog.ioshints.info/2010/08/trill-and-8021aq-are-like-apples-and.html

    • Peter Ashwood-Smith

      Just an update to my previous comment. Last month 802.1aq was put through Interoperability tests. The slides used to present this at both the IETF and IEEE are here:

      http://www.ieee802.org/1/wp-content/uploads/public/docs2010/aq-ashwood-interop1-1110-v02.pdf

      Basically real switches were tested in a topology of 37 devices, 5 real, 32 emulated. Hardware datapaths, equal cost, and OA&M were shown.

      We will be doing much more Interop testing in the new Year so stay tuned and we will keep the Wikipedia entry for IEEE 802.1aq up to date as we go.

      • http://etherealmind.com Greg Ferro

        Peter

        I’m not really interested in SPB as I currently believe that TRILL is a better solution. I’m not sure why the IEEE is even bothering to continue with the standard. Maybe I’m missing something on understanding what the purpose of the SPB is.

        Happy to discuss, or even podcast about it if you have the time.

        greg

        • peter ashwood-smith

          Greg your statement is very broad without much detail for me to go on. My general philosophy is that I dislike ALL routing protocols and simply look at what I need and don’t need and try to decide that way. Our view (DC in particular) looks a bit like this:

          1) We don’t see any requirement for support of broadcast interfaces for NNI or shared NNI/UNI. We see the DC L2 fabric having lots of point to point links and probably 2-3 levels deep.

          2) The UNI’s will need active / active support to finer flow than vlan so that negotiation protocols should probably be separate rather than bundled in. Look at things like MCLAG and VC+ etc. We don’t see the link state protocol running at the top of rack, it has trivial routing requirements and rarely has any horizontal connectivity. If you need to run the link state in the top of rack your DC topology is going to get too large and you’ll slow it down unnecessarily.

          3) We require OA&M now and don’t really want to add a third undefined OA&M mechanism and/or respin hardware yet again to support it.

          4) We require more than 4K vlans now for large numbers of tenants and that problem was solved ages ago by .1ah and we don’t want to respin hardware yet again to support it.

          5) We require multi topology for tenant isolation. Tenants are isolated physically in a DC and routing isolation is also a requirement. The IS-IS MT mechanism which SPB incorporates transparently is ideally suited to this.

          6) We believe that IP management can also run at the same time in the SPB IS-IS instance so that you only need one link state protocol.

          7) For cases where flatter L2 is not required IP can co-exist with the SPB IS-IS as just another NLPID.

          Hope that helps a bit.

  • http://twitter.com/netdad Michael Kantowski

    “TRILL uses IS-IS to carry routing information about MAC Addresses devices”

    I think it’s more accurate to say that TRILL uses IS-IS to carry routing information for RBridge reachability – not MAC addresses.  In essence, the information that IS-IS is passing is information on how to reach each RBridge in the network, not how to reach the end station MAC addresses.  This is an important distinction, because it shows that the IS-IS routing table does not grow as your end station count increases – it grows as your RBridge count increases – making the solution all the more scalable.
