What options do we have for increasing the number of servers in our system and the available bandwidth?
Increase the bandwidth
Use Link Aggregation Control Protocol (LACP) or EtherChannel to increase the bandwidth to between 2 and 6 gigabits per second (although any single flow is still limited to the 1 gigabit per second speed of an individual link).
By bonding multiple gigabit Ethernet ports together we can increase the bandwidth. There are some pre-conditions:
1) server NICs and drivers must support LACP
2) switches must support LACP
3) switches must support enough LACP bundles (cheaper switches may only support a few LACP bundles per switch)
4) all bundles must terminate on the same switch (or switch stack or chassis).
Thus you need to be careful about your choice of a stackable or a chassis-based switch. The Cisco C6500 and C4500 support a maximum of 64 port channels per chassis, so the number of servers per switch is limited.
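To make the single-flow limitation concrete, here is a short Python sketch of the kind of per-flow hashing a switch uses to pick a member link. The hash itself is illustrative only; real switches hash on MAC, IP, or port fields depending on configuration:

```python
# Sketch of the per-flow hashing behind EtherChannel / LACP load balancing.
# The XOR-of-addresses hash is a toy stand-in, not any vendor's algorithm.
from collections import Counter

def select_link(src_ip: str, dst_ip: str, num_links: int) -> int:
    """Map a flow to one physical link in the bundle (simplified hash)."""
    def ip_bits(ip: str) -> int:
        a, b, c, d = (int(x) for x in ip.split("."))
        return (a << 24) | (b << 16) | (c << 8) | d
    return (ip_bits(src_ip) ^ ip_bits(dst_ip)) % num_links

links = 4  # a 4 x 1 Gb bundle: 4 Gb/s aggregate

# Every packet of a given src/dst pair hashes to the same link, so a
# single flow tops out at 1 Gb/s no matter how many links are bundled.
flow = ("10.0.0.5", "10.0.1.9")
assert select_link(*flow, links) == select_link(*flow, links)

# Spreading many server/array conversations across the links is what
# actually delivers the multi-gigabit aggregate.
flows = [(f"10.0.0.{i}", "10.0.1.9") for i in range(1, 33)]
print(Counter(select_link(s, d, links) for s, d in flows))
```

With many distinct endpoint pairs the toy hash spreads flows evenly; with only a few pairs (or an unlucky hash input) some links sit idle, which is why the load-balancing check later in this post matters.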
Duplicate functional block
Let's assume that we want to double the previous design; how would that look? The easiest option is to duplicate the functional block as shown, but this prevents the storage arrays from being shared between all servers.
Considerations for this approach:
- creates clearly defined modules of functionality
- prevents overloading of the bandwidth or switching
- the simple approach has operational benefits, e.g. easy to understand, easy to troubleshoot
- loss of flexibility and scale, i.e. no any-to-any access from servers to storage. Consider whether you want to use VMotion to move a virtual machine between the clusters on the left and the right.
Interconnect the ‘iSCSI’ switches
If we want all storage resources to be available to all servers in the estate then we would need to interconnect the iSCSI switches and the diagram would look something like this:
There is nothing too amazing here; more or less a standard switching backbone that most of us would use every day. However, for iSCSI to be mission critical we need to ensure that failover will occur very rapidly, and that performance can scale. The question is whether these interconnects should be multiple 1 gigabit links or 10 gigabit links.
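One way to frame that question is with a rough traffic model. Every figure in this sketch (per-server iSCSI load, the share of traffic that crosses the interconnect) is an assumption for illustration, not a measurement from this design:

```python
# Rough interconnect sizing sketch. All traffic figures below are
# assumptions chosen for illustration.
servers_per_side = 60    # servers whose storage may sit behind the interconnect
avg_storage_gbps = 0.25  # assumed sustained iSCSI load per server, in Gb/s
cross_fraction = 0.5     # assumed share of traffic crossing to the other side

cross_traffic = servers_per_side * avg_storage_gbps * cross_fraction
print(f"cross-switch iSCSI load: {cross_traffic:.1f} Gb/s")  # 7.5 Gb/s

# An 8 x 1 Gb EtherChannel caps any single flow at 1 Gb/s and relies on the
# hash spreading flows evenly; a 10 Gb link carries the same aggregate load
# while still allowing one flow to burst past a gigabit.
for label, capacity_gbps in [("8 x 1 Gb bundle", 8), ("1 x 10 Gb link", 10)]:
    print(f"{label}: {cross_traffic / capacity_gbps:.0%} utilised")
```

Swap in your own measured per-server figures; the point is that the bundle and the single 10 Gb link can look similar on aggregate utilisation while behaving very differently for individual flows.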
Network Solution Capacity – hardware selection
Let's assume that you are using Cisco C6500 switches; how many servers can you support?
Each chassis supports a maximum of 64 port channels. Assuming that each server is dual homed, the design shown here can handle a maximum of 120 high intensity servers with redundant 2 Gb/s connections. Leaving some port channels for network connectivity, each switch could host 60 servers, which would use 120 Ethernet ports, so three 48 port Ethernet blades would be required.
Assuming that at least 10 of these servers need 4 Gb/s storage connections, an additional 48 port Ethernet blade is added.
There will also be low intensity servers that do not require multigigabit storage. Let's assume that there are 100 low intensity servers; two more 48 port Ethernet blades would be required.
So at this point, a C6509 could be a good choice with six WS-X6748 (48 x 10/100/1000) blades, two WS-SUP720-3B supervisors, plus a WS-X6708-10G-3C (8 x 10 Gb Ethernet) for the backbone connections. The chassis is fully loaded.
- 10 servers at 4 Gb/s
- 60 servers at 2 Gb/s
- 100 servers at 1 Gb/s
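The port and blade arithmetic above can be checked with a short script. The server counts come from the list above; the only hardware figure used is the 48 gigabit ports per WS-X6748 blade:

```python
# Port-count check for the fully loaded C6509 described above.
import math

BLADE_PORTS = 48  # WS-X6748: 48 x 10/100/1000 ports per blade

servers = {
    "high intensity, 4 Gb/s": (10, 4),   # (server count, gigabit ports each)
    "high intensity, 2 Gb/s": (60, 2),
    "low intensity, 1 Gb/s":  (100, 1),
}

total_ports = sum(count * ports for count, ports in servers.values())
blades = math.ceil(total_ports / BLADE_PORTS)

print(f"gigabit ports needed: {total_ports}")  # 260
print(f"WS-X6748 blades:      {blades}")       # 6
```

Six blades matches the fully loaded chassis above: in a nine-slot C6509 that leaves exactly the slots needed for the two supervisors and the 10 Gb backbone blade.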
The use of EtherChannel / LACP is not always a plug-and-play process. You should take the time to check that both the switch and the server are actually load balancing. Check out Scott Lowe's blog for an excellent post on using EtherChannel / LACP with VMware ESX; in particular, read the comments.
So far, I have outlined a design for an entirely separate Ethernet switched network that exclusively handles the iSCSI block storage traffic and meets my requirements for the number of servers. The next step is to evaluate the peak capacity of the system and whether the backbone needs more or less than 10 Gb/s connections.
Other outstanding items would be:
- evaluate options for improving convergence times in the event of a switch or link failure
- check that the load balancing (EtherChannel / LACP) actually works as expected
- plan how the data and storage networks could eventually be converged
- Link Aggregation Control Protocol (LACP) is part of an IEEE specification (802.3ad) that allows you to bundle several physical ports together to form a single logical channel. LACP allows a switch to negotiate an automatic bundle by sending LACP packets to the peer. It performs a similar function as Port Aggregation Protocol (PAgP) with Cisco EtherChannel. (from Cisco Web Site)