Cisco ACE – Enterprise Load Balancing on a Stick using Source NAT – Part 2

Following on from the previous post, where we covered the design requirements, this post looks at the practical configuration.

Load Balancing with Source NAT

The solution to the problem, in my opinion, is to use Source NAT on the load balancer so that all packets appear to originate from the load balancer itself.

[Diagram: Load Balancing on a Stick with Source NAT – packet flow]

  • Client sends an IP packet to the Load Balancer VIP.
  • LB consults the server farm for that VIP, selects a server, and rewrites the destination address to that server.
  • LB determines the outbound interface towards the server and takes an address from the NAT pool on that interface.
  • LB rewrites the IP header and dispatches the packet.
  • Server receives the packet, processes the request and passes the data to the application.
  • Server sends the reply packet back to the LB.
  • LB checks its state table to determine which client was the original source.
  • LB rewrites the packet and forwards it out the interface to the client.

Thus every packet received on the VIP has its source and destination addresses rewritten. The server will route the reply back to the load balancer, which will restore the original addressing and forward the reply back to the client.
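
To make the rewriting concrete, here is the address flow using the VIP, real server and NAT pool addresses from the configuration below (the client address 192.0.2.50 and the port numbers are illustrative):

Client request:    192.0.2.50:51548  ->  169.254.1.1:80    (VIP)
LB rewrites to:    169.254.3.1:1025  ->  169.254.0.100:80  (NAT pool -> real server)
Server reply:      169.254.0.100:80  ->  169.254.3.1:1025
LB rewrites to:    169.254.1.1:80    ->  192.0.2.50:51548  (back to the client)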

There are several advantages in big networks:

  • the LB can be placed anywhere in the existing network (subject to bandwidth concerns)
  • the servers can be anywhere (subject to bandwidth concerns)
  • no change to server configuration
  • limited change to routing (if any)
  • the LB is only part of the failure domain for load-balanced client traffic – plausible deniability for any server administration problems (very important, this one)

The Configuration

In this example we look at how to configure this on Cisco ACE.

switch/ACE-TEST# sh run
Generating configuration....
!
! Set up logging to the console (not enabled by default)
logging enable
logging timestamp
logging monitor 7
!
!ACLs for the interface - traffic is not permitted by default.
access-list vip-traffic remark allow any any traffic to the vip
access-list vip-traffic line 1 extended permit icmp any any
access-list vip-traffic line 2 extended permit ip any any
!
! Set up the server probes. PING for server up/down,
!TCP 80 for application up/down
!
probe icmp PING
description simple ping monitor
interval 10
passdetect interval 60
probe tcp TCP80
interval 10
passdetect interval 10
passdetect count 2
receive 1
open 5
!
!Define the real servers. Attach the ICMP probe so that "show rserver" reflects actual OS status.
!
rserver host 169.254.0.100
ip address 169.254.0.100
probe PING
inservice
rserver host 169.254.0.200
ip address 169.254.0.200
probe PING
inservice
!Create the server farm with the two servers. Use a TCP80 probe so that "show serverfarm TESTFARM"
! shows the application status (not the server status)
!
serverfarm host TESTFARM
probe TCP80
rserver 169.254.0.100
inservice
rserver 169.254.0.200
inservice
!
!Create a sticky database - your mileage may vary according to application.
!
sticky ip-netmask 255.255.255.255 address source stuck
timeout 60
replicate sticky
serverfarm TESTFARM
!
!Create the class maps for service policy
!Allow the interface to be reachable by ICMP
!Allow the VIP to be reachable over HTTP on TCP80
!
class-map type management match-any REMOTE_ACCESS
description Remote access traffic match.
3 match protocol icmp any
!
class-map match-all VIP_CLASS
2 match virtual-address 169.254.1.1 tcp eq www
!
class-map match-all VIP_CLASS_2
2 match virtual-address 169.254.1.2 tcp eq www
!
! Apply the policy map from the class-map
policy-map type management first-match REMOTE_MGMT
class REMOTE_ACCESS
permit
!
!Define the load-balancing policy with sticky. Round robin default.
!
policy-map type loadbalance first-match POLICY_MAP
class class-default
sticky-serverfarm stuck
!
! The VIP Load balancing policy, with NAT pool defined.
policy-map multi-match VIP_FARM
class VIP_CLASS
loadbalance vip inservice
loadbalance policy POLICY_MAP
loadbalance vip icmp-reply active
nat dynamic 100 vlan 100
!
!
interface vlan 100
ip address 169.254.0.70 255.255.0.0
access-group input vip-traffic
access-group output vip-traffic
!
!Define NAT Pool for the VIP.
!
nat-pool 100 169.254.3.1 169.254.3.1 netmask 255.255.255.255 pat
!
!Apply the service policy for administration traffic and the load balancing policy.
service-policy input REMOTE_MGMT
service-policy input VIP_FARM
no shutdown
!
switch/ACE-TEST#

Show xlate

You can see the state of the translations:

switch/ACE-TEST# sh xlate
TCP PAT from vlan100:169.254.0.69/51548 to vlan100:169.254.3.2/1025
switch/ACE-TEST#

Probe Configuration

I think that many people do not take the time to consider the use of probes. In fact, clever use of probes can dramatically change the way that your load balancer behaves in your network.

My general suggestion is to use two types of probe: an ICMP probe for verifying OS status, and an application probe such as an HTTP check on TCP80, or even a simple TCP connection check on port 80. To make this useful, attach the ICMP probe to the real server, and the HTTP probe to the server farm. When you run "show rserver" you will see that the servers are "up" and responding to ping, but when you run "show serverfarm" you are seeing the status of the web application on those servers.

That's much more useful.
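
A condensed sketch of that split, using the probe and server farm names from the configuration above:

! ICMP probe on the real server - OS status, visible in "show rserver"
rserver host 169.254.0.100
probe PING
inservice
!
! TCP probe on the farm - application status, visible in "show serverfarm TESTFARM"
serverfarm host TESTFARM
probe TCP80
rserver 169.254.0.100
inservice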

Probe Timers

Think carefully about probe timers. For example, a 15 second probe interval with a three-poll dead interval means that detection can take between 45 and 59 seconds, because the failure could occur immediately after the previous successful probe.

For the Cisco ACE it’s also important to consider TCP timeouts. The default timeout is 30 seconds (and the probe default is sixty seconds). When you change the probe timers to less than thirty seconds you should change the TCP timeout as well. Realistically, if a server cannot respond to a TCP SYN in less than 5 seconds then it has serious problems.
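
A sketch of those timers on an ACE TCP probe ("faildetect" sets the consecutive-failure count before the probe is marked failed, "open" is the TCP connect timeout; the values shown are illustrative):

! probe every 15 seconds; three consecutive failures marks the probe down
! give up on the TCP SYN after 5 seconds
probe tcp TCP80
interval 15
faildetect 3
open 5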

A final consideration is the probe interval when a server has failed. It’s important to take a failed server out of service, but don’t try to bring it back into service too quickly: operating systems and applications can take minutes to start and stabilise. Therefore, configure your probes to require at least three successful probes, about a minute apart, before returning a server to service, to ensure the OS is ready for new connections. If you have high traffic volumes you should also research the “Slow Start” feature. This behaviour is configured using “passdetect”, meaning: detect the server passing the test before returning it to service.
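
Using the "passdetect" commands already shown in the configuration, that return-to-service policy looks something like this (the values are illustrative):

! probe a failed server every 60 seconds; require three successes before
! returning it to service
probe tcp TCP80
passdetect interval 60
passdetect count 3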

Other Posts in A Series On The Same Topic

  1. Cisco ACE - Enterprise Load Balancing on a Stick using Source NAT - Part 3 (14th February 2011)
  2. Cisco ACE - Enterprise Load Balancing on a Stick using Source NAT - Part 2 (9th February 2011)
  3. Cisco ACE - Enterprise Load Balancing on a Stick using Source NAT - Part 1 (8th February 2011)
Comments

  • Rob

    I’d say the biggest advantage is that only load balanced traffic needs to traverse the ACE. Considering the ACE appliance’s terrible throughput this is crucial.

    disadvantages – the server will only see the NAT addresses for client connections unless you use policy routing, which I’ve yet to master or see nice examples of

    • Santino Rizzo

      For HTTP, the ACS can rewrite the HTTP header with the real client IP address for identification purposes.

      • Santino Rizzo

        Sorry, ACS on the brain. Meant ACE.

      • http://etherealmind.com Greg Ferro

        Might make that part 4 I think.

  • kAos

    My concern with source NAT is user tracking. You may want to balance traffic to services other than HTTP – there is no header to insert the original source IP into.

    Also, how is it going to work with IPv6, with its lack of NAT support? Source AND destination?

    • http://etherealmind.com Greg Ferro

      The focus here is the Enterprise where the source IP is mostly irrelevant. However, I’ve already talked about how to solve the general case where source IP matters and that is to use the linear design.

      Second, most applications have a method to track users’ IP addresses that doesn’t rely on the source address of the packet. Even Apache can be configured to log the X-Forwarded-For value from the HTTP header.
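
      (For illustration, a minimal Apache snippet along those lines – the format name "proxied" is just a label:)

      # httpd.conf - log the X-Forwarded-For value instead of the peer address
      LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" proxied
      CustomLog logs/access_log proxied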

      As always “It Depends”, if you need source address, then solve that problem.

    • Charles

      Interesting you are bringing up IPv6. The ACE-20 Hardware can’t even handle IPv6 and the earliest projections for IPv6 support on the ACE-30 are EOY. And that is 9-10 months away and could be pushed further. But hopefully (to answer your question) there will be some translational services built into the software by then to handle this type of request.

      • http://etherealmind.com Greg Ferro

        As I said (further down the page) I wouldn’t recommend using ACE for anything. The late delivery of features and upgrades is endemic to the product over the last three / four years. I figure that the hardware architecture must be painful and difficult to use and / or the software development for that team is truly broken.

        Answer: Budget to replace them with something else. I cannot see buying ACE-30 would fix what appears to be a deep rooted problem.

  • Tristan Rhodes

    Thanks for sharing this info. I have been working with ACE load-balancing a lot recently, so it is on my mind. I am a huge fan of the one-arm mode for its flexibility, simple design, and efficient use of load-balancer resources. (Cisco probably wants you to go inline so you buy bigger hardware)

    Here are some critical issues:

    1) Be sure to insert HTTP headers so your server guys can see the real user’s IP in their logs. Otherwise all they see are the IP addresses in your NAT pool.

    header insert request X-Forwarded-For: header-value “%is”
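
    (For context, on the ACE that command sits inside an HTTP modify action-list, which is then attached to the load-balance policy. A sketch, where the action-list name INSERT-XFF is arbitrary:)

    action-list type modify http INSERT-XFF
    header insert request X-Forwarded-For header-value "%is"
    !
    policy-map type loadbalance first-match POLICY_MAP
    class class-default
    sticky-serverfarm stuck
    action INSERT-XFF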

    2) How do you use NAT pools? Should each virtual server have its own pool, or should multiple services share a pool? Or do you make your NAT pool a single IP using PAT that matches your virtual IPs? I am not sure what the best practice is in this area.

    3) Train your server engineers on how the load-balancer works. Give them a diagram and explain how the load-balancer detects if a server is up/down (probes) and how long it takes for the service to change state. Give them limited access to ANM (free web-based mgmt tool) so they can suspend and activate servers gracefully (otherwise users will be sent to the downed server for many seconds).

    Cheers,

    Tristan

  • Stefan Herbst

    Hi Greg,
    I have done a number of hardware load balancer installs in the enterprise. You have summed it up nicely. We do mostly the HLB-on-a-stick option with proxy IP to give us flexibility in where the ‘real’ servers and clients are, or where the HLB is relative to them. We have seen cases in the past where client source IP matters, but it is rarely a hard ‘must have’. The most recent I have seen is Exchange 2010: by default, users’ MAPI traffic is throttled into the Exchange DAG cluster, and Exchange applies this logic using the client source IP.
    The posts and podcasts are awesome, keep them coming!
    Stefan

    • http://etherealmind.com Greg Ferro

      Only Microsoft could think of something that stupid. Good tip though.

      You could use the ACE in virtual contexts to provide a “NAT on a Stick” for most services, and then a linear context for other stuff.

      Thanks.

      greg

  • Charles

    Greg,
    I’ve been running the ACE (as well as NetScaler and F5 and CSS and….) since the early days. Your writeup works very well in the Enterprise realm you are discussing. Can you share any thoughts on deploying the ACE in a DMZ’d environment? Say with an FWSM, application flows and multiple levels of DMZs? Any insight would be appreciated. Part 3?

    • http://etherealmind.com Greg Ferro

      Although I continue to deploy ACE load balancers I wouldn’t recommend them. I’ve had so many problems, bugs, weird faults, mysterious reboots and general poor quality experience that I do not recommend using them for anything. I’ve had too many late nights and crisis meetings to care about the product any more. Although better than Microsoft NLB, I’d use ANYTHING else but ACE. Same goes for FWSM. I love the service modules concept, but the execution is truly poor.

      I’ve also stated this previously in the comments on this post – http://etherealmind.com/cisco-application-control-engine-ace-introduction-and-comparison-with-f5/

      • Charles

        I understand and agree completely. Sometimes business and budget get in the way of doing things right. But I appreciate your candor.

        • http://etherealmind.com Greg Ferro

          Thanks for your comment.

  • Bryan

    I have to agree wholeheartedly on the ACE and FWSM assessments. The ACE more or less works, but IPv6 is a big concern and the product roadmap is positively dismal. The ACE30 is essentially 5 appliances duct-taped together… The FWSM is just garbage, ACL space is severely limited, performance is far lower than advertised, and so on.

    The model we’ve gone with is to use the ACE inline in the DMZ and one-armed in the internal network. We’ve found one-armed mode especially useful for load-balancing across our data centers when we need to maintain stickiness. GSLB in front of ACE pairs in each data center, with the GSLB returning one side as primary. The serverfarms then have servers from both data centers.

    We slap a big subnet on the frontside of the ACE and allocate a pool of 10 IP’s from that for NAT, then pull VIPs from the rest of the range. If/when we grow beyond that we route an additional subnet toward the ACE.

    It really hasn’t come up much at all from application teams wanting source IP. As mentioned, they generally log user ID’s and can go back to the source that way.

    We have just encountered a PCI requirement mandating source IP information – the ACE appears to be able to supply translation logging we can dump to satisfy requirements (in the docs, but not verified). Though we run enough connections through that we may have to set up a separate context just for the PCI-related services to minimize the load.

    • Charles

      So everyone is a bit tough on the FWSM for some reason. We implemented both the ACE and FWSM in a bullrush several years ago. The result? FWSM has been pretty much rock solid. ACE? Sucks. I’ll agree with everyone there. But the FWSM has been pretty reliable for us. No issues except for one self-inflicted problem when I tried to stack contexts to solve a design problem. I’d love to hear more about why the FWSM is an issue for some folks.

      PS: Heard from Cisco that IPv6 support is on the roadmap for the ACE-30 towards the end of this year. Awfully long way out, all things considered….

      • Bryan

        Ok, yes I will grant you the FWSM has been solid. We really have had no problems with crashes (unlike the ACE). What we’ve run into have been resource and design constraints:

        – ACL space is severely limited. We run multiple contexts and can’t get more than 5 before we run out of capacity.
        – There are some fun gotchas lurking with shared interfaces and multiple contexts (single MAC). Yes, you can design around it…
        – The backplane interface is essentially a 6-port etherchannel, so a single flow is limited to 1Gig max (not that you can achieve that).
        – The inspect engines seriously whack performance. I think the main processor is a 1Ghz P3 so anything that hits CPU is hurting…
        – IPv6? Supposedly supported, but in software only so forget it.
        – Where’s packet tracer? Whining now, I know… :-)

        I would recommend an ASA far and away over the FWSM to anyone looking to buy (just talking Cisco). It’s really sad that they are still selling these new when they are well on the downward slope of the product life cycle.

        • http://etherealmind.com Greg Ferro

          Bryan

          Yeah, the FWSM is a difficult beast. It does work for simple stuff though and has some advantages in certain types of networks because it installs into the chassis. The pace of development of firewalls is faster than Cisco can cut silicon, unsurprisingly because the FWSM relies heavily on the backplane and custom chips.

          I don’t see a future for service modules because Cisco’s stated belief is that they will continue developing their own silicon (because it is their value add), but our experience is that they cannot deliver new silicon fast enough to keep up with the market. Hence the ASA is revving so quickly and delivering new functions.

          greg


  • Brett Mason

    Greg,

    As always, a good article. However, I’m generally of the opposite opinion, in that I think having the Load Balancer ‘in-line’ is a better design.

    Specifically your comment “Given that Load Balancers (especially F5 Load Balancers) are expensive, this is not a good design” I don’t agree with.

    Yes, they are generally expensive, but so are firewalls and proxies if they need to cater for significant amounts of traffic, and LBs are closer to proxies or firewalls in their low-level functionality than routers. So saying one design is bad because it may cost more than another is, I think, incorrect; in my experience a little more spend up front to reduce complexity tends to produce a better design.

    Also regarding your justification with the comment “Consider what happens when you want to backup your servers”. I’d be more inclined to say that a network that does backups over the same path as normal traffic is a bad design regardless of the LB placement, but I guess that also largely comes down to cost.

    I have worked on a few large-scale online networks that deployed many layers of load balancing and utilised both designs, ‘in-line’ and ‘on-a-stick’, and in my experience the ‘in-line’ benefits outweigh the ‘on-a-stick’. However I’d caution against stating one design is better than the other, as I think both have their place.

    As for your 2nd reason regarding “servers must be physically in the right location…” I’m unclear what your rationale is. I agree you typically want your servers as close to the LB as possible to reduce hops, but I cannot see any reason why a router or firewall or layer 3 switch could not be in the path between the LB and the server farm. Obviously the routing would need to be adjusted to ensure traffic is routed correctly, but that is true regardless of the LB placement.

    I wholeheartedly agree with your last reason, however, in that the LB will be blamed for any issue with the server, application, lunch menu, and weather. The only time in my experience the LB has actually been found at fault was in an ‘on-a-stick’ design: since all connections were being NAT’d from a single IP, the available source ports were being exhausted. This can obviously be fixed by adding additional IPs to the NAT pool, but it was an issue specific to that design.

    Anyway, love the article and regularly enjoy your insights!

    • http://etherealmind.com Greg Ferro

      Hey Brett

      I guess like any design the answer is “it depends”, so I’m taking a fairly broad position that in-line is harder to achieve in corporate networks. That’s not always true, and you make good points about using inline.

      So I tend to the view that for critical solutions, go with an inline design, and for lesser requirements use stick-type designs.

      greg