Sunday, March 14, 2010

iSCSI Network Designs: Part 2 — Simple Scaling

April 30, 2008 by Greg Ferro

We looked at a simple iSCSI network solution in iSCSI Network Designs Part 1, where we kept the storage traffic separated from our data traffic for operational reasons. But we are limited to 1Gb/s and by the port density of the switch.

What options do we have for increasing the number of servers in our system and the available bandwidth?

Increase the bandwidth

Use Link Aggregation Control Protocol (LACP) or EtherChannel to increase the bandwidth to between 2 and 6 gigabits per second (each member link still runs at 1 gigabit per second).

iscsi-part1-diag2a.png

By bonding multiple gigabit Ethernet ports together we can increase the bandwidth. There are some preconditions:

1) server NICs and drivers must support LACP
2) switches must support LACP
3) switches must support enough LACP bundles (cheaper switches may only support a few LACP bundles per switch)
4) all links in a bundle must terminate on the same switch (or switch stack or chassis)

Thus you need to be careful about your choice of a stackable or a chassis based switch. The Cisco C6500 and C4500 support a maximum of 64 port channels per chassis, so the number of servers is limited per switch.

Duplicate functional block

Let's assume that we want to double the previous design. How would that look? The easiest thing is to duplicate the functional block as shown, but this stops the storage arrays being shared between all servers.

iscsi-part1-diag1.png

Considerations for this approach:

  • creates clearly defined modules of functionality
  • stops overloading of bandwidth or switching
  • simple approach has some operational benefits, e.g. easy to understand, easy to troubleshoot
  • loss of flexibility and scale, i.e. no any-to-any access from servers to storage. Consider what happens if you want to use VMotion to move a server between the clusters on the left and the right.

Interconnect the ‘iSCSI’ switches

If we want all storage resources to be available to all servers in the estate then we need to interconnect the iSCSI switches, and the diagram would look something like this:

iscsi-pt2-bbone1.png

There is nothing too amazing here: more or less a standard switching backbone that most of us would use every day. However, for iSCSI to be mission critical we need to ensure that failover will occur very rapidly, and that performance can scale. The question is whether these connections should be multiple 1 Gigabit or 10 Gigabit.

Network Solution Capacity: hardware selection

Let's assume that you are using Cisco C6500 switches. How many servers can you support?

Each chassis supports a maximum of 64 port channels. Assuming that each server is dual homed, the design shown here can handle a maximum of 120 high intensity servers with redundant 2Gb/s connections. Leaving some port channels for network connectivity, each switch could host 60 servers using 120 Ethernet ports, so three 48 port Ethernet blades would be required.
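The port channel arithmetic above can be sketched as follows. This is only an illustration, not a sizing tool: the 64 port channel limit and the two links per server come from the text, while the reserve of four port channels for network connectivity is an assumed figure chosen to arrive at the 60 servers quoted, and the function name is hypothetical.

```python
def servers_per_switch(max_port_channels, reserved_for_network, links_per_server):
    """Return (servers, ethernet_ports) for one switch.

    Each bundled server consumes one port channel, so the port channel
    limit, not the physical port count, caps the number of servers.
    """
    servers = max_port_channels - reserved_for_network
    ports = servers * links_per_server
    return servers, ports


# 64 port channels per chassis is from the text; reserving 4 is an assumption.
servers, ports = servers_per_switch(64, 4, 2)
print(servers, ports)  # 60 120
```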

Assuming that at least 10 of these servers need 4Gb/s storage connections, an additional 48 port Ethernet blade is added.

There will also be low intensity servers that do not require multigigabit storage. Let's assume that there are 100 low intensity servers. Two 48 port Ethernet blades would be required.

So at this point, a C6509 could be a good choice, with six WS-X6748-GE-TX (48 x 10/100/1000) blades, two WS-SUP720-3B supervisors, plus a WS-X6708-10G-3C (8 x 10Gb) for the backbone connections. The chassis is fully loaded.

So:

  • 10 servers at 4Gb/s
  • 60 servers at 2Gb/s
  • 100 servers at 1Gb/s
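As a rough check of the hardware count, the server mix in this list can be turned into ports, blades and chassis slots. This is a sketch under stated assumptions: the server mix, the 48 port blades, the two supervisors and the single 10 Gigabit blade come from the text, the nine slots are those of a C6509, and pooling the ports of all tiers before rounding up to whole blades is my own simplification. The names used are hypothetical.

```python
import math

PORTS_PER_BLADE = 48   # 48 port 10/100/1000 blade, from the text
CHASSIS_SLOTS = 9      # a C6509 has nine slots

# (number of servers, 1 Gb/s links per server), taken from the summary list
server_tiers = [(10, 4), (60, 2), (100, 1)]


def chassis_plan(tiers, supervisors=2, backbone_blades=1):
    """Return (ports, access_blades, slots_used) for one chassis."""
    ports = sum(count * links for count, links in tiers)
    # Assumption: ports are pooled across tiers before rounding up.
    access_blades = math.ceil(ports / PORTS_PER_BLADE)
    slots_used = access_blades + supervisors + backbone_blades
    return ports, access_blades, slots_used


ports, blades, slots = chassis_plan(server_tiers)
print(ports, blades, slots)    # 260 6 9
print(slots <= CHASSIS_SLOTS)  # True
```

Pooling matters here: the 100 single link servers need 100 ports, which is more than two 48 port blades provide on their own, so the six blade total only works if spare ports on the other blades are used.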

Technical Notes

The use of EtherChannel / LACP is not always a plug and play process. You should take the time to check that both your switch and the server are actually load balancing. Check out Scott Lowe's blog for an excellent post on using EtherChannel / LACP on VMware ESX, and be sure to read the comments.
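One reason load balancing needs checking is that a bundle normally picks the member link by hashing address fields of each frame, so a single server to storage conversation stays on one link however many links are bundled. The sketch below illustrates the effect only. The XOR-and-modulo hash is a simplified stand-in and not any vendor's actual algorithm, and the addresses and function name are hypothetical.

```python
from collections import Counter
from ipaddress import ip_address


def pick_link(src_ip, dst_ip, n_links):
    """Pick a member link from a source/destination IP pair.

    Simplified stand-in for a switch's hash: XOR the two addresses
    and take the result modulo the number of links in the bundle.
    """
    return (int(ip_address(src_ip)) ^ int(ip_address(dst_ip))) % n_links


target = "10.0.0.10"  # hypothetical iSCSI target address

# One server talking to one target: every frame lands on the same link.
one_pair = Counter(pick_link("10.0.0.21", target, 2) for _ in range(1000))
print(one_pair)

# Eight servers talking to the same target: the flows spread over both links.
many_pairs = Counter(
    pick_link(f"10.0.0.{host}", target, 2) for host in range(21, 29)
)
print(many_pairs)
```

With this toy hash a 2Gb/s bundle gives a single server and target pair no more than 1Gb/s, which is why it is worth measuring real traffic on each member link rather than trusting the bundle's nominal speed.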

Conclusion

So far, I have outlined a design for an entirely separate Ethernet switched network that exclusively handles the iSCSI block storage traffic and basically meets my requirements for number of servers. The next step is to evaluate the peak capacity of the system and whether the backbone needs more or less than 10Gb connections.
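A first pass at that evaluation could compare the total server facing capacity of one chassis with its backbone capacity. This is a worst case sketch, not a measurement: it assumes every server drives all of its links at line rate at the same moment, that all of that traffic crosses the backbone, and that all eight 10 Gigabit ports are used as backbone links. The server mix is the one from the capacity section and the function name is hypothetical.

```python
def oversubscription(tiers, backbone_links, backbone_gbps):
    """Return (edge_gbps, backbone_gbps_total, ratio) for one chassis."""
    edge = sum(count * gbps for count, gbps in tiers)
    backbone = backbone_links * backbone_gbps
    return edge, backbone, edge / backbone


# (number of servers, Gb/s per server) from the capacity section
tiers = [(10, 4), (60, 2), (100, 1)]
edge, backbone, ratio = oversubscription(tiers, 8, 10)
print(edge, backbone, ratio)  # 260 80 3.25
```

Real traffic will sit well below this ceiling, since much of it may stay inside one chassis, so the ratio is only a starting point for deciding how many backbone links are needed.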

Other outstanding items would be:

  • evaluate options for improving the convergence times in the event of a switch or link failure
  • check that the load balancing (EtherChannel / LACP) actually works as expected
  • decide what plans I could make to converge the data and the storage networks.

Definition

LACP
Link Aggregation Control Protocol (LACP) is part of an IEEE specification (802.3ad) that allows you to bundle several physical ports together to form a single logical channel. LACP allows a switch to negotiate an automatic bundle by sending LACP packets to the peer. It performs a similar function as Port Aggregation Protocol (PAgP) with Cisco EtherChannel. (from Cisco Web Site)
