iSCSI Network Designs: Part 5 – iSCSI Multipathing, Host Bus Adapters, High Availability and Redundancy

I feel that it is important to understand how the adapters integrate with the switching infrastructure so that I can ensure that the network delivers.

To summarise the iSCSI initiator on the server – I like this picture because it speaks to the ISO seven layer model (which is embedded in my design thinking) (1):

iscsi-offload-1.png

This nicely presents how the performance can be achieved. In particular, you can see the importance of the software drivers, since they determine which features are available.

Redundancy

To be able to design the network for iSCSI redundancy we need to understand how iSCSI implements its HA features. There are three ways to achieve this:

  1. Link Aggregation – LACP
  2. iSCSI Multipathing – Active / Standby
  3. iSCSI Multipathing – Active / Active

The following diagram shows the relationships between the different solutions using an approximation of the ISO model.

iscsi-ha-2.png

Link Aggregation – Etherchannel / LACP

This is the simplest of the choices, but it requires the most configuration and setup. You need the following:

  • two network cards in the servers
  • the two network cards must connect to the same Cisco switch (2)
  • network adapters and drivers must have support for LACP
  • at least one switch that supports LACP (sometimes known as Etherchannel)
  • you must be able to configure both the server drivers and switch configuration

Storage data only

I recommend that the only IP traffic on this interface is iSCSI. This will ensure that you do not impact storage performance by overrunning the interface with data. I hope to post a future article showing how to configure QoS in the network if you must mix IP storage and IP data traffic on the same network adapter.

Network Adapters

Not all adapters support LACP, and some vendors sell it as a licensed upgrade. It is a hardware feature and software drivers will not make LACP available. Check that your network adapters support LACP.

LACP load balancing

There are certain circumstances when the LACP load-balancing algorithm does not work optimally. Typically the driver makes a hash of the source and destination IP addresses and sends all data for that pair down one port of the LACP bundle. Normally there are enough conversations, that is, enough unique source/destination pairs, to balance the load evenly between the two interfaces. But if all the data from your server to storage flows between a single source/destination pair then, by default, only one gigabit Ethernet port will be used. A key server might need more than one gigabit of bandwidth.

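You can configure both the switch LACP and the server LACP (since each can only load balance its own outbound frames) to hash on source IP / port and destination IP / port, which makes the full LACP bandwidth available to a single iSCSI initiator / target pair. Each individual TCP connection still stays on one link, so the pair must open more than one connection to use more than one port.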
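To make the hashing behaviour concrete, here is a minimal sketch in Python. The XOR-and-modulo hash, the function names and the addresses are all assumptions for illustration; real switches and NIC drivers use vendor-specific hash functions. It only shows why an IP-only hash pins one initiator / target pair to a single link, while adding TCP ports lets separate connections land on different links.

```python
import ipaddress

def pick_link_ip_only(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Illustrative hash on source and destination IP only."""
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d) % n_links

def pick_link_ip_and_port(src_ip: str, src_port: int,
                          dst_ip: str, dst_port: int, n_links: int) -> int:
    """Illustrative hash that also mixes in the TCP ports."""
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d ^ src_port ^ dst_port) % n_links

# Hypothetical initiator and target; 3260 is the standard iSCSI port.
INITIATOR, TARGET, ISCSI_PORT = "10.0.0.10", "10.0.0.20", 3260

# Two TCP connections from the same initiator to the same target,
# differing only in their (hypothetical) source ports.
source_ports = [49152, 49153]

ip_only_links = {pick_link_ip_only(INITIATOR, TARGET, 2) for _ in source_ports}
ip_port_links = {
    pick_link_ip_and_port(INITIATOR, p, TARGET, ISCSI_PORT, 2)
    for p in source_ports
}

print("IP-only hash uses links:", sorted(ip_only_links))
print("IP+port hash uses links:", sorted(ip_port_links))
```

With the IP-only hash both connections map to the same link. With ports included, these two particular connections map to different links. In practice the spread depends on the source ports the initiator happens to pick.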

iSCSI Multipathing Active / Standby

This is defined in the iSCSI RFC as the method for achieving high availability. The iSCSI initiator will initiate at least two TCP connections to the iSCSI target. Data will flow down the primary connection until a failure is detected and data will then be diverted to the second connection.

Since the connection is fully established, failover should be fast.
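The failover logic described above can be sketched in a few lines of Python. This is only a model of the decision, not an iSCSI initiator: the `Path` and `ActiveStandby` names and the `healthy` flag are hypothetical, and a real initiator would detect failure through TCP or iSCSI timeouts rather than a flag.

```python
from dataclasses import dataclass

@dataclass
class Path:
    """One already-established connection to the target (hypothetical model)."""
    name: str
    healthy: bool = True

class ActiveStandby:
    """Always prefer the primary path; use the standby only when it is down."""

    def __init__(self, primary: Path, standby: Path):
        self.paths = [primary, standby]  # order encodes preference

    def select(self) -> Path:
        for path in self.paths:
            if path.healthy:
                return path
        raise RuntimeError("no healthy path to the target")

primary = Path("nic0 -> switch A")
standby = Path("nic1 -> switch B")
mp = ActiveStandby(primary, standby)

print(mp.select().name)   # primary while it is healthy
primary.healthy = False   # simulate a link or switch failure
print(mp.select().name)   # traffic moves to the standby
```

Because both paths already exist, failover in this model is just a different choice on the next I/O, which is why it should be fast.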

Note that you can still use LACP to improve bandwidth, e.g. if you need two gigabits of storage bandwidth, you would need four gigabit ports in two bundles of two to achieve this. This is a reasonable option for large VMware ESX servers which might host thirty or forty guest servers using the iSCSI storage, where you want maximum availability.

Dell has an example of how to configure this.

iSCSI Multipathing Active / Active

This solution is identical to iSCSI Active / Standby; however, the iSCSI initiator and target include additional driver software that allows data to be transferred over either connection.

As before, LACP can be used to improve bandwidth by creating multi-gigabit connections.
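As a rough illustration of the difference from Active / Standby, the sketch below spreads I/O across every healthy path in turn. Round-robin is an assumed policy chosen for simplicity; actual multipathing drivers offer their own policies, and the `Path` and `ActiveActive` names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Path:
    """One established connection to the target (hypothetical model)."""
    name: str
    healthy: bool = True

class ActiveActive:
    """Rotate I/O across all healthy paths (simple round-robin)."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._counter = 0

    def select(self) -> Path:
        healthy = [p for p in self.paths if p.healthy]
        if not healthy:
            raise RuntimeError("no healthy path to the target")
        path = healthy[self._counter % len(healthy)]
        self._counter += 1
        return path

a, b = Path("path A"), Path("path B")
mp = ActiveActive([a, b])

print([mp.select().name for _ in range(4)])  # alternates between A and B
a.healthy = False                            # simulate losing path A
print([mp.select().name for _ in range(2)])  # all I/O now uses path B
```

Both paths carry data in normal operation, and a failure simply shrinks the set of paths being rotated through.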

Conclusion

We have looked at the three options for redundant iSCSI connections. Note that each of these solutions is possible with native network adapters, TOE or HBA (as covered in iSCSI Part 3 – Server Side – iSCSI Host Bus Adapters and IP Performance), so that you can choose which performance level suits the requirements for the server.

From this information, it would seem that iSCSI Multipathing Active/Standby is the most effective option for most people since it is simple to configure and operate. iSCSI Multipathing Active/Active would be a good choice for a server / storage combination where you can spend time testing the drivers and functionality. I am not sure, but I think that both the storage and the server would need to support this feature (can anyone confirm? Leave a comment below if you know).

You could achieve redundancy using LACP on the server and storage (and also a performance improvement), but you must be able to co-ordinate the switch configuration and the server configuration, and modify the LACP load balancing type.

You could do both, of course. The following diagram shows Active / Standby iSCSI Multipathing combined with LACP to create a two gigabits per second connection. Note that your server architecture, iSCSI driver and/or HBA would need to be able to generate this volume of traffic for it to be useful (and of course, the storage unit would also need to have the capability to move this volume of data).

iscsi-redunadncy-01.jpg

With this type of implementation the physical connections should look like this:

iscsi-mpath-lacp-physical.jpg

This would require a total of four ethernet ports per server, with NICs that support LACP, and two switches (LACP support). This would ensure redundant paths and bandwidth.
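The port arithmetic for this design can be checked with a small helper. This is a back-of-the-envelope sketch: the function name and parameters are hypothetical, the result is an upper bound that ignores protocol overhead and hash imbalance, and the Active / Active figure is my own extrapolation rather than something measured in this series.

```python
def usable_gbps(bundles: int, ports_per_bundle: int,
                port_gbps: float, active_active: bool) -> float:
    """Upper bound on storage bandwidth for multipathing over LACP bundles.

    Active / Standby: only one bundle carries data at a time.
    Active / Active: assumed here to use every bundle at once.
    """
    bundle_gbps = ports_per_bundle * port_gbps
    carrying = bundles if active_active else 1
    return carrying * bundle_gbps

# Four gigabit ports in two bundles of two, Active / Standby:
print(usable_gbps(2, 2, 1.0, active_active=False))  # 2.0

# The same ports if both bundles carried data:
print(usable_gbps(2, 2, 1.0, active_active=True))   # 4.0
```

In the Active / Standby case the figure stays at two gigabits after a bundle fails, since the standby bundle has the same capacity as the primary.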

What's Next

Now that we have finished the considerations for the Server connections to the network, we can return to designing the network and considering bandwidth, ethernet performance and QoS requirements.

Footnotes

(1) iSCSI Performance Options – NetApp Whitepaper June 2005.

(2) Nortel has Multi Link Trunking which allows for LACP to span multiple switches. I do not recommend using Nortel switches to access this feature as my experience of Nortel is less than excellent. I understand that Nortel owns the patent and does not make it available to other companies.

  • http://pierky.wordpress.com Pierky

    Very nice series, Greg! A great and widespreading look inside the iSCSI world!

  • Duncan Mossop

    Excellent article and interesting reading.

    You might be interested to know that HP’s E3500+ series switches support Distributed Trunking otherwise referred to as DT-LACP.

    I’ve just recently implemented it in combination with a NetAPP iSCSI based storage system and 2 x E3500 series switches and it appears to work quite well although I’m not completely convinced yet.

    Catalyst & StackWise+ anyday of the week + EtherChannel.

  • Malheiros

    “You can configure the Switch LACP AND the Server LACP (since each can
    only load balance the outbound frames) to use SRC IP / Port and DST IP /
    Port to make the full LACP bandwidth available to a single iSCSI
    initiator / target pair.”

    Greg, could you elaborate a bit more about this? You mean we can achieve more than one physical link's capacity for a single connection?

    Thanks for the excellent article!!!

  • Papaiso

    Excellent article Greg. 
    I have a few questions:
    you say: “the two network cards must connect to the same Cisco switch”
    Is it possible to use other brands rather than Cisco?
    If using stackable switches (with 802.3ad support) is it possible to connect each of the 2 NIC to a different switch? 
    I am working on a project with a very limited budget (reason why I am using iSCSI rather than FC), can someone please suggest a few budget switches that support LACP and stacking?