In the space of one week I find myself here at Etherealmind.com, having been reminded of one of the key things I wanted to do after becoming free of the massive corporate machine: freely sharing ideas and knowledge with the wider community. The fire was reignited after listening to the Packet Pushers podcast and realising that I had stalled on that goal. So thanks to Greg for letting me share my knowledge here.
With that intro I suppose I had better share some knowledge then!
Like many engineers, I inherited a network. Overnight I was responsible for every decision from that point onward but, strangely enough, I also appeared to be responsible for every decision that was made before I had arrived!
One such decision before my time was to start rolling out GLBP to all branch offices, each of which had 2 WAN routers connected to an MPLS VPN. I started to get it in the neck that on sites with GLBP deployed the traffic was not load balanced and I needed to fix it!
Before I could fix it I needed to understand what the problem could be, so I set out to understand GLBP.
GLBP – Gateway Load Balancing Protocol
GLBP provides a mechanism to deliver a single default gateway (IP address) to a subnet while sharing that IP address across multiple routers, several of which are active at once. If a router goes off-line or its WAN circuit goes down, the other router services the traffic. Because multiple routers are active, we also use the outgoing WAN connection associated with each of them (this is why it is called load balancing). In practice it is more like load sharing than balancing.
The Load Sharing Algorithm:
Round-robin is the default load sharing algorithm: PC1 will use Router A, PC2 will use Router B, PC3 will use Router A and so on. This means that if PC1 sends twice as much traffic as PC2, Router A will be twice as busy as Router B.
This, in a nutshell, was the problem: PC1 was sending lots of data down the wire and skewing the figures. So how do I force the other PCs to use Router B? I could not find a manageable way to do it with GLBP, so I went back to look at faithful HSRP.
Note: There are other algorithms available for GLBP, but round-robin is recommended unless there is a specific requirement, e.g. NAT requires host-dependent.
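To make the skew concrete, here is a minimal Python sketch of the round-robin behaviour described above. It is a simplified model, not GLBP itself: the function names and the traffic figures are made up for illustration, and it assumes each PC is pinned to whichever router it was handed when it first asked for the gateway.

```python
from itertools import cycle


def assign_round_robin(hosts, routers):
    """Hand out routers to hosts in rotation, in the order the hosts ask."""
    rotation = cycle(routers)
    return {host: next(rotation) for host in hosts}


def load_per_router(assignment, traffic_mbps):
    """Sum the offered traffic of every host pinned to each router."""
    load = {}
    for host, router in assignment.items():
        load[router] = load.get(router, 0) + traffic_mbps[host]
    return load


# Hypothetical traffic figures in Mbps: PC1 sends twice as much as PC2.
traffic_mbps = {"PC1": 20, "PC2": 10}
assignment = assign_round_robin(["PC1", "PC2"], ["RouterA", "RouterB"])

print(assignment)                                 # {'PC1': 'RouterA', 'PC2': 'RouterB'}
print(load_per_router(assignment, traffic_mbps))  # {'RouterA': 20, 'RouterB': 10}
```

The hosts are shared out evenly, but the traffic is not, because the rotation knows nothing about how much each PC sends.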
HSRP – Hot Standby Router Protocol
Historically, gateway resilience was achieved with HSRP. As the name suggests, only one router per subnet is hot (active) at any one time, so all traffic entering the router via the default gateway uses that router's connected outgoing WAN interface. If the hot router goes off-line or its WAN circuit goes down, the standby becomes active.
It is possible to have two different default gateways active on a single subnet, but controlling the default gateway through DHCP scopes then has to be coordinated between the DHCP administrator and the client administrator, and in my experience it is not always easy to get them to work together. It also takes direct control of the load balancing away from the network administrator. Putting that aside, here is a quick note on how it can be done.
Note: There is a way to have DHCP issue different scope parameters to PCs on the same subnet. On Microsoft clients you can use the ipconfig /setclassid command, and then have the Microsoft DHCP server issue different parameters to different class IDs; in this case, different default gateways.
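The failover behaviour can be sketched in a few lines of Python. This is a toy model under stated assumptions rather than the protocol: the Router class and the function names are hypothetical, the 105 priority is just an example value, and the model assumes that preemption is enabled and that the WAN circuit is tracked so that losing it lowers the router's priority by 10.

```python
from dataclasses import dataclass

TRACK_DECREMENT = 10  # assumed penalty applied while the tracked WAN circuit is down


@dataclass
class Router:
    name: str
    priority: int = 100   # HSRP default priority
    online: bool = True
    wan_up: bool = True


def effective_priority(router):
    """Configured priority, reduced while the tracked WAN circuit is down."""
    return router.priority - (0 if router.wan_up else TRACK_DECREMENT)


def active_router(routers):
    """Name of the on-line router with the highest effective priority, or None."""
    candidates = [r for r in routers if r.online]
    if not candidates:
        return None
    return max(candidates, key=effective_priority).name


router_a = Router("RouterA", priority=105)
router_b = Router("RouterB")

print(active_router([router_a, router_b]))  # RouterA
router_a.wan_up = False                     # WAN circuit on Router A fails
print(active_router([router_a, router_b]))  # RouterB
```

Only one router is ever returned, which is the point: with HSRP all outbound traffic for the subnet leaves through a single WAN circuit at a time.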
To fix this problem I needed to be in control so I used the following method:
I worked around the DHCP scope issue by using multiple subnets per site. In this particular case the sites had previously run HSRP and had at least 2 data VLANs per site (/25 subnets), so I ended up getting rid of GLBP and going back to good old HSRP. Still today we monitor capacity, and in sites where there is a significant skew we simply move some PCs to the more lightly loaded VLAN. That is how we keep the management teams happy.
One key point about GLBP and HSRP that cannot be stressed enough is that they are only relevant to outbound traffic in this scenario.
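Choosing which PC to move can be sketched as a small Python helper. This is an illustration only, not the tooling we use: the function name, the VLAN numbers and the per-PC traffic figures are all hypothetical, and it assumes each VLAN's traffic leaves through a different router's WAN circuit.

```python
def pick_host_to_move(heavy, light):
    """Pick the PC on the heavy VLAN whose move leaves the two VLANs closest in load.

    Returns None when no single move would narrow the gap.
    """
    gap = sum(heavy.values()) - sum(light.values())
    # Moving a PC with load x changes the gap from `gap` to `gap - 2 * x`.
    best = min(heavy, key=lambda host: abs(gap - 2 * heavy[host]))
    return best if abs(gap - 2 * heavy[best]) < abs(gap) else None


# Hypothetical per-PC traffic in Mbps.
vlan10 = {"PC1": 40, "PC2": 5, "PC3": 15}  # 60 Mbps total, leaves via Router A (assumed)
vlan20 = {"PC4": 10, "PC5": 10}            # 20 Mbps total, leaves via Router B (assumed)

print(pick_host_to_move(vlan10, vlan20))   # PC3
```

Moving PC3 leaves the VLANs at 45 and 35 Mbps. Moving the heavy hitter PC1 would simply flip the skew to the other circuit.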