Friday, March 12, 2010

Network Fabric:TRILL for Server and Network People. Welcome RBridges

March 30, 2009 by Greg Ferro · 9 Comments 


TRILL is a key network technology for enabling Cloud Computing: it allows better migration of VMs, better utilisation of the network switching fabric, and much improved stability of the Data Centre Server Fabric.

TRILL (Transparent Interconnection of Lots of Links)

TRILL is a proposed technology that is intended for use in the data centre. It is a fundamental network technology enabler for Cloud Computing, offering:

  • increased switching density
  • better path utilisation
  • underlying support for highly mobile servers in a virtualised server environment
  • faster convergence, at power on and in response to failures.

The support for highly mobile virtual servers within VLANs is a big feature.

From the IETF Charter for TRILL


The design should have the fol­low­ing properties:


  • Minimal or no con­fig­ur­a­tion required
  • Load-​​splitting among mul­tiple paths
  • Routing loop mit­ig­a­tion (pos­sibly through a TTL field)
  • Support of mul­tiple points of attachment
  • Support for broad­cast and multicast
  • No sig­ni­fic­ant ser­vice delay after attachment
  • No less secure than exist­ing bridged solutions

It’s also very interesting that TRILL is an Ethernet standard being progressed through the IETF (and not the IEEE), and that the work is led by Radia Perlman, the creator of Spanning Tree.

Why have TRILL at all?

In current Layer 2 networks we use Spanning Tree (in all its forms) to ensure that there is only a single path across the network. At the same time, we deliberately design one or more redundant paths for failover in the event that the primary path fails. These redundant paths are blocked, so many paths in the network are not used to carry frames. Thus:

  • Bandwidth is poten­tially under­u­til­ised, espe­cially when more than two paths exist.
  • the shortest (and thus fastest) path through the Layer 2 network is often not available in practice, so most paths are inefficient
  • because it’s not proprietary
  • because 802.1aq isn’t your cup of tea.

Current State of Spanning Tree

There are sev­eral forms of Spanning Tree used today:

  • Common Spanning Tree (CST)
  • Per VLAN Spanning Tree (PVST/​PVST+)
  • Rapid Spanning Tree
  • Multiple Instance Spanning Tree

All types of spanning tree share one fundamental idea: a single tree, representing a single path through the switching network, is established for a given VLAN and all alternate paths are blocked. Each of the above offers improvements in speed of convergence using various features. However, even in the best cases, it takes seconds for paths to converge on the final solution.

Suboptimal path

Consider the following server/switch connections, where two servers are each connected to a switch (with one or more ports if using LACP). You can see that there are several possible paths between the two servers.

trill-1.jpg

Typically, a switch backbone for a given VLAN will be configured to force the root of the tree to be at the core of the network. As a result, the path between the servers will look something like this (if the top left switch is configured as the primary root switch):

trill-2.jpg

Now this might seem OK until you real­ise that the entire data centre net­work looks more like this:

trill-3.jpg

OK, so this isn’t absolutely perfect, but you should get the idea. There are very good operational reasons to build a data centre this way. Very large organisations with substantial resources will do it smarter and better, using PVST+ and port/path costs to optimise and try to balance some load. In real life, this is the standard Enterprise switching backbone for data centres.

That link between the core switches needs to be big, and many of the uplinks are also large. This has been a big driver for 10Gb Ethernet in the early stages.

But when you think about it, what we would really like to achieve is the fol­low­ing shortest path through the network:

trill-4.jpg

As you can see, this means that the core is now more lightly loaded and that, scaling this up, the core is less likely to be bottlenecked around Core Switch 1.

A com­mon scen­ario for this is when using top-​​of-​​rack Ethernet switches and two serv­ers are in adja­cent racks, yet the Ethernet frames travel via the core, adding a small amount of latency.

Note also that a failure event at Core 1 will have less overall impact on the entire network.
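The difference between the spanning tree path and the shortest path can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the topology (two core switches, three top-of-rack switches and a cross-link between two adjacent racks) and all switch names are hypothetical, every link has equal cost, and a breadth-first search from the root stands in for the real Spanning Tree election.

```python
from collections import deque

# Hypothetical topology: two core switches, three top-of-rack switches,
# and one cross-link between the adjacent racks tor_a and tor_b.
LINKS = [
    ("core1", "core2"),
    ("tor_a", "core1"), ("tor_a", "core2"),
    ("tor_b", "core1"), ("tor_b", "core2"),
    ("tor_c", "core1"), ("tor_c", "core2"),
    ("tor_a", "tor_b"),
]

def adjacency(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def bfs_parents(adj, root):
    """Breadth-first search from root; returns each switch's parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adj[node]):  # sorted() keeps the result deterministic
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent

def spanning_tree_links(links, root):
    """Links kept in forwarding state; every other link is blocked."""
    parent = bfs_parents(adjacency(links), root)
    return [(child, par) for child, par in parent.items() if par is not None]

def shortest_path(links, src, dst):
    """Fewest-hop path from src to dst using only the given links."""
    parent = bfs_parents(adjacency(links), dst)
    path, node = [], src
    while node is not None:  # walk parent pointers from src back to dst
        path.append(node)
        node = parent[node]
    return path

tree = spanning_tree_links(LINKS, root="core1")
blocked = len(LINKS) - len(tree)
via_tree = shortest_path(tree, "tor_a", "tor_b")
direct = shortest_path(LINKS, "tor_a", "tor_b")

print(f"{blocked} of {len(LINKS)} links are blocked")
print("spanning tree path:", " -> ".join(via_tree))
print("shortest path:     ", " -> ".join(direct))
```

In this made-up topology half of the links carry no frames, and traffic between the adjacent racks detours through the root switch even though a direct link exists.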

VM migra­tion /​ mobil­ity

Now con­sider what hap­pens when all of these serv­ers are vir­tual serv­ers and then you use a vserver mobil­ity tool to move ServerA across the back­bone when using span­ning tree:

trill-optimal-path-5.jpg

The problem here is that the path is still shared by all the other servers. The extra load across the backbone may drastically alter the available bandwidth.

trill-optimal-path-4.jpg

In this case, there is less chance of impact since the frames are now tak­ing an altern­ate path. Your back­bone is much more likely to handle the changes and surges in Internet traffic, and more gen­er­ally you would have bet­ter over­all usage of the avail­able infrastructure.

Of course, you can’t be cer­tain that this path has avail­able band­width but there are other stand­ards being developed to help solve this problem.

Multiple Paths at Layer 2

The other improvement that TRILL makes available is multiple paths at Layer 2. Consider the following model:

trill-multiple-paths-1.jpg

This is a major break­through com­pared to Spanning Tree and will drastic­ally improve the avail­able band­width because many links can be used to the destination.

It will, however, make packet capture in the network much more difficult, since predicting the path through the Layer 2 network becomes quite complicated.
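To make the multipath idea concrete, here is a minimal Python sketch that finds every equal-cost path between two edge switches and then pins each flow to one of them by hashing its MAC address pair. The topology and names are hypothetical, and per-flow hashing is a common load-splitting technique chosen for illustration, not something taken from the TRILL drafts.

```python
import zlib
from collections import deque

# Hypothetical fabric: two edge switches joined through three middle switches.
LINKS = [
    ("edge1", "mid1"), ("edge1", "mid2"), ("edge1", "mid3"),
    ("edge2", "mid1"), ("edge2", "mid2"), ("edge2", "mid3"),
]

def adjacency(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def equal_cost_paths(adj, src, dst):
    """Return every shortest path from src to dst, counting each link as cost 1."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    if dst not in dist:
        return []

    def walk_back(node):
        # Only step to neighbours that are exactly one hop closer to src.
        if node == src:
            return [[src]]
        return [path + [node]
                for nbr in sorted(adj[node]) if dist[nbr] == dist[node] - 1
                for path in walk_back(nbr)]

    return walk_back(dst)

def pick_path(paths, src_mac, dst_mac):
    """Hash the flow so that all frames of one conversation use the same path."""
    flow = f"{src_mac}>{dst_mac}".encode()
    return paths[zlib.crc32(flow) % len(paths)]

paths = equal_cost_paths(adjacency(LINKS), "edge1", "edge2")
print(len(paths), "equal-cost paths")
for flow in [("02:00:00:00:00:0a", "02:00:00:00:00:0b"),
             ("02:00:00:00:00:0c", "02:00:00:00:00:0d")]:
    print(flow, "->", pick_path(paths, *flow))
```

Hashing per flow rather than per frame keeps the frames of one conversation in order, but it also shows why packet capture gets harder: which path a given flow takes depends on the hash, not on anything visible in the topology diagram.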

Terminology: Frames and Packets

The OSI model defines an abstraction of communication protocols that uses seven layers, where Layer 3 is the Network layer and Layer 2 is the Data Link layer. Typically, Network Engineers use the term Packets for Layer 3 data encapsulations and Frames for Layer 2 encapsulations. FDDI, Token Ring and Ethernet all use Frames, while AppleTalk, IPX and IP all use Packets. For many people today, only Ethernet and IP exist and it is easy to confuse the terms.
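The relationship between the two terms is easier to see in bytes. The following Python sketch builds a bare IPv4 packet (Layer 3) and wraps it in an Ethernet frame (Layer 2). The addresses are made up, the IP checksum and the Ethernet FCS are deliberately left out, and the function names are my own for illustration.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800  # EtherType value that marks an IPv4 payload

def build_ipv4_packet(src_ip, dst_ip, ttl=64):
    """Layer 3: a bare 20-byte IPv4 header with no payload."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,   # version 4, header length of 5 x 32-bit words
        0,      # type of service
        20,     # total length in bytes
        0, 0,   # identification, flags/fragment offset
        ttl,
        17,     # protocol number (17 = UDP), arbitrary here
        0,      # header checksum, not computed in this sketch
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )

def build_ethernet_frame(dst_mac, src_mac, packet):
    """Layer 2: put an Ethernet header in front of the packet (no FCS)."""
    return dst_mac + src_mac + struct.pack("!H", ETHERTYPE_IPV4) + packet

def parse_ethernet_frame(frame):
    """Split a frame into its header fields and the payload it carries."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst_mac, src_mac, ethertype, frame[14:]

packet = build_ipv4_packet("10.0.0.1", "10.0.0.2")
frame = build_ethernet_frame(bytes.fromhex("02000000000b"),
                             bytes.fromhex("02000000000a"), packet)

dst_mac, src_mac, ethertype, payload = parse_ethernet_frame(frame)
print("frame:  dst", dst_mac.hex(":"), "src", src_mac.hex(":"), "type", hex(ethertype))
print("packet: src", socket.inet_ntoa(payload[12:16]),
      "dst", socket.inet_ntoa(payload[16:20]), "ttl", payload[8])
```

A switch forwards on the first 14 bytes (the frame header) and never needs to look at the packet inside; a router strips the frame and forwards on the packet header.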

TRILL — a fast over­view on how it works

TRILL uses the concept of a Routing Bridge, known as an RBridge, running the IS-IS routing protocol. (Older engineers will recall Bridging Routers or BRouters; I find it amusing that this is an obvious return to bridging and a step away from routing.) IS-IS is a link state routing protocol that is mostly used by Global Service Providers to carry IP routes within their networks (with BGP used between Service Providers), but it is actually protocol independent. IS-IS does not use IP to establish neighbour relationships; it uses OSI protocols, including CLNS and its PDUs, to perform the neighbour and protocol exchanges. This means that IS-IS works for IPv4 and IPv6 and can equally be used for other protocols, as is proposed by TRILL.

TRILL uses IS-IS to carry routing information about the MAC addresses of devices connected to VLANs and to build a shortest path tree for each MAC address in the VLAN.
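The shortest path calculation that a link state protocol such as IS-IS performs can be sketched with Dijkstra's algorithm. This is a simplified illustration, not the TRILL specification: the four RBridges, their names and the link costs are hypothetical, and the tree is computed between RBridges only, leaving out how MAC addresses are mapped onto it.

```python
import heapq

# Hypothetical link state database: every RBridge knows every link and its cost.
LSDB = {
    "rb1": {"rb2": 1, "rb3": 4},
    "rb2": {"rb1": 1, "rb3": 1, "rb4": 5},
    "rb3": {"rb1": 4, "rb2": 1, "rb4": 1},
    "rb4": {"rb2": 5, "rb3": 1},
}

def spf(lsdb, source):
    """Dijkstra: return {destination: (cost, first_hop)} as seen from source."""
    best = {}
    heap = [(0, source, None)]  # (cost so far, node, first hop used to reach it)
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in best:
            continue  # already settled at an equal or lower cost
        best[node] = (cost, first_hop)
        for nbr, link_cost in lsdb[node].items():
            if nbr not in best:
                # Leaving the source, the neighbour itself is the first hop.
                hop = nbr if node == source else first_hop
                heapq.heappush(heap, (cost + link_cost, nbr, hop))
    return best

table = spf(LSDB, "rb1")
for dest in sorted(table):
    cost, first_hop = table[dest]
    print(f"{dest}: cost {cost}, forward via {first_hop}")
```

Because every RBridge runs the same calculation over the same database, each one arrives at a consistent view of the shortest paths without any link having to be blocked.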

TRILL intends to maintain compatibility with existing Spanning Tree implementations, so that the two can co-exist and a network can be progressively migrated.

Thoughts and Personal Opinions

One of the most fascinating things about TRILL is that it is conceptually identical to IEEE 802.1aq. Why would someone create a competing standard? Because the IEEE is slow, ponderous and bad company. Many times we have seen the IEEE take far too long to develop standards and fail to resolve conflicts between competing interests.

Radia Perlman, the creator of Spanning Tree, currently works for Sun and appears to be the lead author of TRILL. Other authors include Dinesh G. Dutt from Cisco and Silvano Gai from Nuova (now Cisco Nexus products). These people are at the core of data centre fabric development, and their employers Sun and Cisco are promoting their Cloud Computing credentials.

Radia Perlman has also criticised the IEEE in her outstanding and seminal textbook on bridging, Interconnections: Bridges, Routers, Switches, and Internetworking Protocols (Radia Perlman, Addison-Wesley, 2000). (Do yourself a favour and buy this book, it’s brilliant!)

I can’t get access to the current IEEE drafts since they are not open to the public. However, the most recent draft, 1.5, was released on 2008/12/18, and the work appears to have been stumbling along for more than two years. Reviewing the available material suggests that the IEEE is not reaching any agreement at this stage.

Conclusion

I hope I have done this topic justice. I have only the IETF document to read and some of my own experiences. If anyone has more information or can point out my mistakes, please leave a note in the comments and I will respond, or you can contact me using my contact page http://etherealmind.com/contact. I look forward to hearing back.

References

Problem & Applicability Statement

http://www.ietf.org/internet-drafts/draft-ietf-trill-prob-06.txt

Comments

9 Responses to “Network Fabric:TRILL for Server and Network People. Welcome RBridges”
  1. Dave Allan says:

    TRILL is not that conceptually aligned with 802.1aq other than that the service model is similar. 802.1aq was focused on re-use of Ethernet, specifically leveraging 802.1ag (OAM), 802.1ah (adaptation + large scale virtualization) and 802.1Qay (population of the FDB by management or control plane). TRILL was about creating a new ubiquitous specialized L3 specifically for Ethernet: different constraints, different results.

    The cri­tique of IEEE is IMO ill-​​informed. IETF (for example) resolves con­flicts by pub­lish­ing mul­tiple RFCs and let­ting the industry decide, which simply punts their prob­lems onto all of us. IEEE at least has the concept of “dis­tinct iden­tity” for a given pro­ject. It may take a little longer to get a stand­ard but any industry con­fu­sion is capped there.…

    BTW 802.1aq is pro­gress­ing nicely, has numer­ous pre-​​standard deploy­ments, and should be baked as a stand­ard in 1H2010…

    • Greg Ferro says:

      I guess I have to disagree. I have looked at the IEEE web site and I cannot get access to any information about the progress of the meetings, or read the latest documents. For something that is supposed to be a ‘standard’ it’s not very transparent. Given the time/date stamps it would seem that things are not going smoothly, but I can’t get any details to confirm or deny that since it’s all done in secret.

      I also remain skep­tical about the IEEE com­pet­ence. Ethernet is a suc­cess in spite of IEEE pro­ced­ures and pro­cesses when I con­sider the mul­tiple Wireless LAN cockups. Not to men­tion VLAN tag­ging.… oh I could go on.

      It’s a shame that it won’t be ready until 2H2010, it was ori­gin­ally prom­ised in 2008, maybe early 2009.

      Until the IEEE does it bet­ter, I will remain crit­ical. Show me res­ults and transparency.

  2. The Referenced Internet Draft (draft-ietf-trill-prob-06.txt) has now been published as RFC 5556.
    See http://www.ietf.org/rfc/rfc5556.txt

    The TRILL base protocol specification is now quite mature and the latest version is available at
    http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-14

    Thanks,
    Donald

  3. peter ashwood-smith says:

    There is a lot of good inform­a­tion on 802.1aq on wiki­pe­dia under “IEEE 802.1aq”.
    We’ve been try­ing to keep it up to date with whats hap­pen­ing at the IEEE.

    In par­tic­u­lar the l2 mul­tipath­ing work has pro­gressed rather well and now allows 16 sym­met­ric algorithms with addi­tional opaque mech­an­isms for the addi­tion of many more. What is very inter­est­ing about this work is that these are head end chosen so you get a lot of con­trol over how traffic will be placed.
