Citrix Branch Repeater: WCCP or not to WCCP, that is the question

Whether ’tis nobler in the network to suffer un-accelerated traffic during an outage, or to take arms in the form of Policy Based Routing (PBR).

When you decide to deploy Citrix Branch Repeaters (CBR), you have to deploy them at both ends of the WAN to accelerate and compress traffic between those endpoints. It therefore seems sensible to build some resilience into the design, at a minimum to protect the hub in a hub-and-spoke topology.

Deployment Models

There are 3 deployment models that I would consider. There are actually a few more available, such as proxy redirect, but they are not relevant to where I want to go in this post.

  1. Inline mode: the appliance sits on the wire and accelerates traffic flowing between its Ethernet ports.
  2. WCCP mode: we use WCCP to pull traffic towards the device. We can provide an active/standby solution.
  3. Virtual Inline mode: the router sends the traffic to the WAN appliance (using PBR), and the appliance accelerates it and sends it back to its default gateway. We can provide an active/standby solution.

You should also be aware that the CBR needs to accelerate a conversation from the start and cannot kick in halfway through. Therefore, the longer the CBR has been offline, the more conversations will be missed, and these cannot be accelerated until they are restarted.
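To put a rough number on that, here is a minimal Python sketch. It assumes new conversations arrive at a steady rate, which is a simplification, and the rate of 5 per second is purely hypothetical. The two outage durations are the failover times from the tests described later in this post.

```python
def missed_conversations(new_conversations_per_second: float,
                         outage_seconds: float) -> float:
    """Estimate how many conversations start while nothing is accelerating.

    Each of these stays un-accelerated until it is restarted.
    """
    if new_conversations_per_second < 0 or outage_seconds < 0:
        raise ValueError("rate and outage duration must be non-negative")
    return new_conversations_per_second * outage_seconds


# Hypothetical figure, not measured: 5 new conversations per second.
RATE = 5.0

# Outage durations taken from the failover tests described in this post.
for label, outage in (("WCCP", 90.0), ("Virtual inline (PBR)", 4.0)):
    missed = missed_conversations(RATE, outage)
    print(f"{label}: ~{missed:.0f} conversations missed")
```

With those assumed figures, a 90-second outage misses about 450 conversations and a 4-second outage about 20. The absolute numbers depend entirely on your own traffic, but the ratio between the two does not.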

Inline Mode

There are a few problems with this when thinking about resilience.

  1. As stated above, CBR works in pairs at either end of the WAN and requires symmetric routing, so you would have to ensure that data passes through Hub 1 to Spoke 1 and back from Spoke 1 to Hub 1. Then you have to figure out how to do Hub 1 to Spoke 2 and Spoke 2 to Hub 1, or perhaps Hub 2 to Spoke 2 and Spoke 2 to Hub 2. Yes, it gets messy!
  2. If the system fails open and leaves traffic un-accelerated and un-compressed for any length of time, you will take a performance hit on the WAN, most likely at the spoke site, as this is typically where the bandwidth will have been tuned down to match the CBR accelerated traffic profile.
  3. To replace the system you have to disconnect an inline connection, and arranging downtime for that can always be problematic.
  4. I cannot see any simple way of providing resiliency if an appliance fails, even though it fails open (with no acceleration).

 

WCCP mode

This seems sensible at the outset: we use WCCP to forward traffic to the CBR, based on the WCCP requests from the CBR to the router, and this can be done with hardware switching (depending on the device), so it is fast.

Here is what you need to know about the CBR setup when you deploy a pair of CBRs in HA mode for WCCP:

  1. They run as Active/Standby.
  2. There is no stateful fail-over.
  3. Packets, once accelerated, are returned to the WCCP router that sent them to the CBR.
  4. At fail-over, the Standby that is now becoming Active needs to negotiate WCCP with the router once VRRP has failed over.

In my testing on a 3750-X, I have seen this consistently take 90 seconds before WCCP is established and traffic is being accelerated again, and this was with the WCCP settings hardcoded on the CBR.

 

Virtual Inline Mode

This works very similarly to WCCP, except that rather than using WCCP to direct the traffic, we use PBR to direct traffic to the CBR. The thing here is that if a router can do WCCP in hardware, it is very likely to be able to do PBR in hardware too, so from a performance perspective I cannot see the advantage of WCCP. Allegedly the configuration on the router is simpler with WCCP, but here’s what you need to know.

  1. They run as Active/Standby.
  2. There is no stateful fail-over.
  3. Packets, once accelerated, are sent to the CBR default gateway address. (This is the key difference from WCCP.)
  4. PBR can send to the VRRP address of the pair, so failover only takes as long as VRRP takes to switch over.

In my testing on a 3750-X, I have seen this take less than 4 seconds before VRRP on the CBR has failed over and traffic is being accelerated again.
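That figure is in line with what standard VRRP timers would predict. The sketch below uses the VRRPv2 formula from RFC 3768, where a backup declares the master down after three advertisement intervals plus a priority-based skew time. Note the assumption: I have not confirmed which timers or priority the CBR actually uses, so the RFC defaults (1-second advertisements, priority 100) are used here only for illustration.

```python
def vrrp_master_down_interval(advert_interval_s: float = 1.0,
                              priority: int = 100) -> float:
    """Seconds a VRRPv2 backup waits before declaring the master down.

    RFC 3768: Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time,
    where Skew_Time = (256 - Priority) / 256.
    """
    if not 1 <= priority <= 254:
        raise ValueError("backup priority must be between 1 and 254")
    if advert_interval_s <= 0:
        raise ValueError("advertisement interval must be positive")
    skew_time = (256 - priority) / 256
    return 3 * advert_interval_s + skew_time


# RFC defaults, assumed here rather than read from a CBR configuration.
print(f"{vrrp_master_down_interval():.2f} s")
```

With the defaults this comes to roughly 3.6 seconds, so a sub-4-second recovery suggests that the VRRP switch-over accounts for nearly all of the outage in virtual inline mode.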

 

Conclusion

WCCP was developed by Cisco for redirecting and load-balancing web traffic across an array of web proxy servers, and in version 2 it has been expanded to work with other protocols. In this case, where load balancing is not an option due to the active/standby nature of the deployment scenario, I cannot see a strong need for WCCP. In addition, there is the fact that after a fail-over it will take over a minute before the pair is capable of accelerating traffic again.

The advantage that WCCP has over PBR is that it will send the accelerated packets back to the originating router. Therefore, if you have 2 WAN connections it can easily use both, whereas the PBR solution is always going to prefer a single router.
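The return-path difference can be sketched in a few lines of Python. This is only a toy model of the behaviour described in this post, not of how the CBR is implemented; the router names R1 and R2 and the function name are made up for the example.

```python
from enum import Enum


class Mode(Enum):
    WCCP = "wccp"
    VIRTUAL_INLINE = "virtual inline (PBR)"


def return_next_hop(mode: Mode, redirecting_router: str,
                    default_gateway: str) -> str:
    """Where the appliance sends a packet once it has been accelerated."""
    if mode is Mode.WCCP:
        # WCCP: back to whichever router redirected the packet.
        return redirecting_router
    # Virtual inline: always to the appliance's default gateway.
    return default_gateway


# Hypothetical site with two WAN routers; the appliance's gateway is R1.
for router in ("R1", "R2"):
    for mode in Mode:
        hop = return_next_hop(mode, router, default_gateway="R1")
        print(f"{mode.value}: redirected by {router}, returned to {hop}")
```

In this model, traffic redirected by R2 goes back to R2 under WCCP but lands on R1 under virtual inline, which is why the PBR design ends up favouring a single router.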

 

For me, PBR seems like a more sensible choice for deploying CBR with resilience, unless there is a need to balance the traffic load across multiple egress points.

 

 

 

  • http://twitter.com/verbosemode Jochen

    Nice article, John. We are deploying Riverbed Steelheads at work inline and had some of the issues you mentioned above.

    But one advantage of having the appliance inline is that it sees all packets. I have used Steelhead’s tcpdump functionality a lot in the past to troubleshoot fun things like broken path MTU discovery. This comes in handy when you have no extra devices at a remote site for capturing packets.

    • John McManus

      Thanks,
      The Riverbeds have a great reputation; I would like to get a chance to play with them myself.

  • Kantcho Manahov

    We deployed the Citrix Branch Repeaters inline at the remote site, but with 2 separate VLANs on the switch: one for the router and bridge port 1, and one for the users and bridge port 2.
    If the CBR goes down, it fails to wire. The replacement is transparent: assign the user ports to the VLAN of the router interface and you can swap the device without downtime, then activate it the next day once it has been replaced.

  • Geniesis

    We deployed inline without HA and found that on numerous occasions the CBR would fail to fail to wire. In other words, it would not detect that the system had failed and needed to fail to wire, causing the site to go offline until a person could reboot it or just turn it off. This is a problem when your remote office is 1,500 km away and your only help is somebody at the remote site who doesn’t know much IT.

    We actually identified the issue: in certain situations a process would fail in the CBR, and the heartbeat process would fail to pick up this failure and think the system was still functional. In fact, you could ping the CBR and log into it, but traffic would not pass through it.

    Given that PBR requires relatively complex IP SLA configuration to get failover, especially when combined with HSRP on a dual-WAN setup, we opted for WCCP, with which we can guarantee that if the CBR fails, users will not lose access to the WAN. Not to mention we don’t have to fork out more money for an HA CBR.