I’ve been reviewing a network with some CheckPoint firewalls that have been unstable, and while this isn’t surprising (in my experience, CheckPoint firewalls are often unstable for one reason or another), this time I’ve been faced with CheckPoint Clustering. A few years back I tried to make this work but gave up when CheckPoint couldn’t make it work either. A few years later, I find someone brave enough to attempt it. This time it’s different: I’m the one who has to justify why it’s a bad idea.
Here we go.
What is CheckPoint Clustering ?
The premise behind CheckPoint clustering is that having two firewalls in active/standby is a bad idea. This rings true for CheckPoint because the units are so expensive that you can’t afford to keep buying new ones, so why waste half of your money on a second firewall that does nothing? Therefore, owning two units and using them in Active/Active mode is perceived as a way of saving money. To make it worse, this very idea is considered so ‘kleva’ that CheckPoint engineers are commonly known to suggest the practice as a ‘competitive advantage’.
There is one useful feature: the fact that you can cluster up to four units into a ‘single cluster’. However, the operational impact of this is very poor. It is not possible to determine which firewall is handling a given flow, which makes troubleshooting very hard or impossible. Anyone who thinks that the Tracker tool can be used for troubleshooting needs a good spanking – it’s a good logging tool but not a troubleshooting tool.
How does CheckPoint clustering work ?
In fact, CheckPoint doesn’t do the clustering itself; the Nokia IPSO software does, although the manual seems to make no reference to this. You might want to refer to the ClusterXL Admin Guide, a much improved (since 2007, when I last couldn’t get the manuals because of a paywall) but mostly unhelpful piece of documentation.
It’s worth noting that CheckPoint is actually a piece of software that runs on many platforms. In the past CheckPoint was used on Solaris, Windows and BayRS routers. Today it runs on Nokia IPSO, SPLAT (custom Linux distro) and Crossbeam. As a result, the CheckPoint software isn’t tightly coupled to the networking features of the underlying platform. Perhaps this explains why the manuals miss out on the networking aspects of firewall functions.
Normal Firewall Operation
So let’s set a baseline around normal firewall operation.
In normal operation a firewall works this way:
- client sends a packet towards the firewall
- the router will ARP for the firewall’s MAC address,
- the active firewall responds with a MAC address that is shared between the firewalls (and transfers between the active and standby unit on failover).
- The firewall will receive the packet and forward it to the internal network. The reverse flow is identical.
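The steps above can be sketched as a tiny simulation (purely illustrative – the names and logic are mine, not CheckPoint’s): an active/standby pair shares one virtual MAC, and on failover the standby takes over that MAC so the router’s ARP cache entry stays valid.

```python
# Minimal sketch (hypothetical, illustrative only): an active/standby pair
# sharing one virtual MAC address. On failover the standby takes over the
# MAC, so the router's existing ARP entry keeps working.

VIRTUAL_MAC = "00:11:22:33:44:55"   # shared MAC, moves with the active role

class Firewall:
    def __init__(self, name, active=False):
        self.name = name
        self.active = active

    def answer_arp(self, target_ip):
        # Only the active unit answers ARP, and always with the shared MAC.
        return VIRTUAL_MAC if self.active else None

def failover(old_active, standby):
    old_active.active = False
    standby.active = True            # standby now owns the virtual MAC

fw1, fw2 = Firewall("fw1", active=True), Firewall("fw2")
assert fw1.answer_arp("192.0.2.1") == VIRTUAL_MAC
assert fw2.answer_arp("192.0.2.1") is None

failover(fw1, fw2)
assert fw2.answer_arp("192.0.2.1") == VIRTUAL_MAC
```

The key point is that the upstream router never needs to notice the failover – the same unicast MAC simply moves to the surviving unit.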
All this is standards compliant, expected and operationally easy to maintain and troubleshoot.
Checkpoint Clustering Operation
Obviously, to provide clustering something unusual has to happen, because either or both firewalls need to receive each and every packet that needs to be forwarded. The purpose of clustering is to enable two or more (up to four ??) firewalls to pass flows in a fully load balanced/shared way. Why would you do this ? My view is that CheckPoint / Nokia firewalls are very expensive relative to Cisco/Juniper equivalents, so customers want to make the most of the “investment”. A shortcut like this looks attractive as a way to double the throughput of the system.
From the Manual
From the manual:
ClusterXL uses unique physical IP and MAC addresses for the cluster members and virtual IP addresses to represent the cluster itself. Virtual IP addresses do not belong to an actual machine interface (except in High Availability Legacy mode, explained later). ClusterXL provides an infrastructure that ensures that data is not lost due to a failure, by ensuring that each cluster member is aware of connections passing through the other members. Passing information about connections and other Security Gateway states between the cluster members is known as State Synchronisation.
IP and MAC addresses
No, really, if you don’t understand these you should not be reading this. Return to school, do not collect $200 etc.
This is easy enough. Each flow that traverses the firewall creates an entry in a state database on the firewall, and this state database must (or should, or maybe – depending on your choices) be replicated to the other firewalls so that if a failure event occurs, the other unit knows what traffic flows you were forwarding and can keep on going.
State Synchronisation means that for every flow on one firewall, its state data is replicated to the other firewalls. It’s most useful for long-held data flows such as SQL, and not so much for HTTP (YMMV).
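A sketch of the idea (this is my illustration of state replication in general, not CheckPoint’s actual data structures or wire format):

```python
# Sketch of the idea behind State Synchronisation (not the real CCP wire
# format or CheckPoint's internals): every flow entry created on one
# member is replicated to the others, so a surviving member can keep
# forwarding established flows after a failure.

class ClusterMember:
    def __init__(self, name):
        self.name = name
        self.state_table = {}    # flow 5-tuple -> connection state
        self.peers = []

    def new_flow(self, flow, state="ESTABLISHED"):
        self.state_table[flow] = state
        for peer in self.peers:  # replicate to every other member
            peer.state_table[flow] = state

fw1, fw2 = ClusterMember("fw1"), ClusterMember("fw2")
fw1.peers, fw2.peers = [fw2], [fw1]

flow = ("10.0.0.5", 49152, "192.0.2.80", 443, "tcp")
fw1.new_flow(flow)
# fw2 can now take over this flow if fw1 fails
assert fw2.state_table[flow] == "ESTABLISHED"
```

This also hints at why long-held flows benefit most: a flow that lives for hours gets replicated once and protected for its whole lifetime, while a short HTTP request may be finished before the replication was ever worth doing.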
Cluster Control Protocol
There is no standard protocol for synchronising such devices so CheckPoint created something with an imaginative name:
The Cluster Control Protocol (CCP) is the glue that links together the machines in the Check Point Gateway Cluster. CCP traffic is distinct from ordinary network traffic and can be viewed using any network sniffer. CCP runs on UDP port 8116, and has the following roles: – It allows cluster members to report their own states and learn about the states of other members by sending keep-alive packets (this only applies to ClusterXL clusters). – State Synchronisation.
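The real CCP wire format is proprietary; the following is a hypothetical stand-in just to illustrate the shape of the thing – small UDP/8116 keepalives in which each member advertises its ID and state. The field layout and state codes here are invented.

```python
import struct

# Hypothetical stand-in for a CCP keepalive (the real format is
# proprietary and undocumented): members send small UDP/8116 packets
# advertising their member ID and state to the rest of the cluster.

CCP_PORT = 8116
STATES = {0: "DOWN", 1: "STANDBY", 2: "ACTIVE"}

def encode_keepalive(member_id, state_code):
    # invented layout: 2-byte member id, 2-byte state code, network order
    return struct.pack("!HH", member_id, state_code)

def decode_keepalive(payload):
    member_id, state_code = struct.unpack("!HH", payload)
    return member_id, STATES[state_code]

pkt = encode_keepalive(1, 2)
assert decode_keepalive(pkt) == (1, "ACTIVE")
```

The one accurate detail, per the manual, is the port: filtering on UDP 8116 in any sniffer will show you the CCP chatter.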
Great. Basics are done.
ClusterXL has four working modes:
- Load Sharing Multicast Mode
- Load Sharing Unicast Mode
- New High Availability Mode
- High Availability Legacy Mode
Ok, so there are four possible high availability modes. Two of them are actually “clustering” and two are NOT – they are “High Availability” active/standby modes, so we will ignore those.
Checkpoint/Nokia Multicast Clustering
Anything with the word ‘Multicast’ in it automatically suggests trouble, and you would be right. Except that CheckPoint does naughty multicast: it’s not IP Multicast, it’s Ethernet Multicast. Let’s walk it through:
For CheckPoint/Nokia the packet flow works something like this:
- client sends packet
- router will ARP for the next hop MAC address, all firewalls will respond with a Multicast MAC address.
- Router sends Ethernet frame with a Multicast MAC address which the switch must treat as a broadcast to all devices in the VLAN
- The Cluster protocol will notify one of the firewalls to forward the flow, and it will reach the server.
Let’s consider the reverse direction:
- Server sends an ARP request.
- Firewalls respond to the ARP with the Multicast MAC address.
- Server sends Ethernet frame with a Multicast MAC address which the switch must treat as a broadcast to all devices in the VLAN
- The Cluster protocol will notify one of the firewalls to forward the flow back towards the router,
and off to the client it goes.
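The oddity in both directions is that a unicast cluster IP is answered with a *multicast* MAC. For illustration, here is the standard IP-multicast-to-Ethernet mapping (RFC 1112: `01:00:5e` plus the low 23 bits of the address). Whether ClusterXL derives its MAC exactly this way is an assumption on my part; the point is only that the resulting MAC has the multicast bit set, which is what forces the switch to flood it.

```python
# Standard IP-multicast-to-Ethernet mapping per RFC 1112, shown purely to
# illustrate what a multicast MAC looks like. ClusterXL's actual MAC
# derivation may differ - this is an assumption, not CheckPoint doctrine.

def multicast_mac(ip):
    o = [int(x) for x in ip.split(".")]
    # 01:00:5e prefix + low 23 bits of the IP address
    return "01:00:5e:%02x:%02x:%02x" % (o[1] & 0x7F, o[2], o[3])

def is_multicast(mac):
    # a MAC is multicast if the least-significant bit of the first octet is 1
    return int(mac.split(":")[0], 16) & 1 == 1

mac = multicast_mac("239.1.2.3")
assert mac == "01:00:5e:01:02:03"
assert is_multicast(mac)                    # this is why the switch floods it
assert not is_multicast("00:11:22:33:44:55")
```

Any frame whose destination MAC passes that `is_multicast` check and has no snooping state behind it gets treated like a broadcast by the switch – which is the root of everything that follows.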
Multicast Ethernet, Undirected Broadcasts and Denial of Service
CheckPoint is now using Ethernet multicast without any corresponding IP Multicast. By default, most Ethernet switches ship with IGMP snooping enabled, and because nothing ever sends an IGMP join for this group, once the IGMP Query time expires (about three minutes) the port will start to block the frames and thus break the clustering functionality.
Checkpoint recommends three options to ‘fix’ this:
- disable IGMP on the switches
- configure static MAC address mappings for the multicast mac address on all ports
- install an IGMP agent on the firewall
Disable IGMP on the switches
This is the primary recommendation from CheckPoint engineers and from the manual. To be fair, it’s possibly the best of three bad options, although it’s still likely to cause significant problems.
When you disable IGMP snooping on your ethernet switches, you are effectively allowing all multicast frames to be flooded. That is, a multicast frame becomes a broadcast frame, and every frame must be handled by every device in the VLAN: the software protocol driver of each device must process the frame before discarding it, thus creating performance problems (bus interrupts, buffer memory, CPU, software cycles, etc etc).
This is more commonly known as a Denial of Service Attack.
Consider this scenario:
Let’s assume that you have 100Mbps of inbound traffic on a fairly typical, dual router, dual firewall cluster type of setup like the following diagram. In this case, with IGMP snooping disabled, 100Mbps of traffic will be sent to the firewalls, the standby router and all other devices on the public facing LAN.
In this scenario, each VPN concentrator is connected to a VLAN with public IPv4 addresses. Since this is the only VLAN with public addresses, you can’t put them anywhere else.
The VPN concentrators will need to handle 100Mbps of broadcast traffic, in addition to the VPN traffic. Most likely, this will cause intermittent outages and service problems on those devices as the CPU struggles to read and discard that volume of traffic. In the worst case, the VPN concentrator may interpret it as a broadcast flood and even shut down.
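Some back-of-the-envelope arithmetic shows why this hurts (the average frame size is an assumption; real traffic mixes vary widely):

```python
# Back-of-the-envelope cost of the flood: at 100 Mbps, every host on the
# VLAN must receive, interrupt on, and discard this many frames per second.
# The 500-byte average frame size is an assumed figure for illustration.

link_bps = 100_000_000           # 100 Mbps of flooded traffic
avg_frame_bytes = 500            # assumption; real mixes vary widely
frames_per_sec = link_bps // (avg_frame_bytes * 8)
print(frames_per_sec)            # 25000 frames/sec discarded by EVERY host
```

Twenty-five thousand interrupts per second of pure garbage, on every NIC in the VLAN, is the kind of load that makes a VPN concentrator’s management plane fall over.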
Let’s consider the return path for traffic (because all flows have a return path). In this case, let’s have a VLAN directly connected to the CheckPoint / Nokia firewall and some servers connected to that VLAN. Typically, this would be an email server, a web server, maybe a proxy or some other gateway. Most likely it would be several servers on that VLAN.
The server will get a Multicast MAC address when it ARPs for the firewall’s IP address (most likely its default gateway) and will dispatch frames according to the normal process. However, EVERY OTHER server will receive every frame as a broadcast.
This will cause serious CPU impacts, and possibly stability problems. You can fix this by placing a L3 device on the inside of the firewalls, limiting the impact of the broadcasts to the L2 VLAN directly connected to the firewalls, of course. But this limits your design choices, and isn’t helpful in an existing environment.
Configure static MAC address mappings for the multicast mac address on all ports
It’s worth noting that some cheaper Ethernet switches are unable to handle large volumes of multicast or broadcast frames in silicon. They may use the onboard CPU for frame replication, which can become overloaded at just a few megabits. (Less common today, but still applicable on some products.)
Let’s take a look at configuring static MAC addresses in your switches. That is, you create manual MAC address entries for each port that has a CheckPoint device connected. This seems like a good solution since it stops the broadcasting outlined previously and tightly controls the packet flow.
However, the firewall team and network team must be fully aware of this for it to be operationally effective. Consider what happens a year later, when someone upgrades the switches, replaces a faulty module, or performs some other minor task ? It requires close supervision to keep the static entries maintained over time.
This will work for some companies, but for larger companies it’s only a matter of time until an outage is caused, and therefore it’s not a good design choice. For smaller companies, where just a couple of people manage the firewalls AND the network, the static MAC approach can work.
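To make the maintenance burden concrete, here is a sketch of generating such entries. The IOS-style command syntax and the MAC/VLAN/port values are illustrative assumptions (check your own switch’s documentation); the point is that every switch carrying the cluster VLANs needs a hand-maintained line like this pinning the multicast MAC to the firewall-facing ports.

```python
# Sketch: generating a static multicast MAC entry for a switch config.
# The IOS-style syntax and all values below are illustrative assumptions,
# not taken from any specific deployment.

cluster_mac = "0100.5e01.0203"   # hypothetical cluster multicast MAC
vlan = 10
fw_ports = ["GigabitEthernet1/0/1", "GigabitEthernet1/0/2"]

line = "mac address-table static %s vlan %d interface %s" % (
    cluster_mac, vlan, " ".join(fw_ports))
print(line)
```

Every hardware swap, module replacement, or port move silently invalidates one of these lines, which is exactly the failure mode described above.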
Install an IGMP agent on the firewall
This document on ClusterXL IGMP Membership, dated February 14, 2006 (!), explains how to add IGMP support to CheckPoint. However, I’m told by CheckPoint that this is not supported / not recommended (it’s hard to get a straight answer). It requires a number of CLI entries to work, plus specific settings in the Module configuration.
In short, this option is operationally a disaster. You may struggle to get upgrades completed properly, and the module configuration on Linux/Windows needs changing in the Config/Registry for the IGMP settings to survive a reboot.
Using Unicast Mode instead
So the second option is to use Unicast instead of Multicast. In this case, the ClusterXL software selects a PIVOT firewall to act as the Master unit, and it will either process the packets itself, or redirect them to another member of the cluster.
The diagram shows this mode of operation using Unicast redirect. Although Unicast redirect doesn’t have the same problems as the Multicast solution discussed above, it does have a problem. The pivot firewall must reserve resources to be able to redirect all flows and ensure that it has enough CPU capacity to send sync data to all other firewalls in the cluster. The pivot firewall therefore handles much less traffic than other members of the cluster.
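The pivot idea can be sketched as a weighted flow-hashing scheme. The weights below are invented for illustration (ClusterXL’s real assignment algorithm is not documented here); they just show why the pivot is given a smaller share of flows, since it also pays the cost of receiving and redirecting everything.

```python
import zlib

# Sketch of the unicast (pivot) idea: the pivot receives every frame and
# either keeps the flow or redirects it to another member. The weights
# are invented for illustration, not real ClusterXL values.

members = ["pivot", "fw2", "fw3"]
weights = {"pivot": 1, "fw2": 3, "fw3": 3}   # assumed ratios

# build a weighted bucket list, then hash each flow into it
buckets = [m for m in members for _ in range(weights[m])]

def owner(flow):
    h = zlib.crc32(repr(flow).encode())
    return buckets[h % len(buckets)]

flows = [("10.0.0.%d" % i, 40000 + i, "192.0.2.80", 443) for i in range(1000)]
share = {m: sum(1 for f in flows if owner(f) == m) for m in members}
print(share)   # pivot keeps roughly 1/7 of flows, redirecting the rest
```

Note the structural consequence: the pivot’s forwarding capacity is deliberately sacrificed, so an N-member cluster never delivers N times the throughput of one box.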
One big challenge of clustering firewalls, is that capturing packets, and troubleshooting becomes relatively more difficult. I’ve had some odd problems using Wireshark and surmise that the volume of broadcast packets was overrunning the workstation network adapter I was using to capture packets. Sadly, I wasn’t able to try with another machine to verify this theory.
Firewall Deployment with Layer 2 Data Centre Interconnect
Apparently, there are a number of people who think that this clustering idea is perfect for data centres that are L2 interconnected. By now, most of you will have realised the problems that clustering will cause when an ethernet segment is extended between two data centres, but let’s make a diagram of it anyway. Two data centres, geographically separated, but with a L2 connection between them. It doesn’t much matter how (OTV, VPLS, dark fibre, WDM – all the same for this purpose).
On the interconnect, a multicast CheckPoint ClusterXL, with an active/active firewall configuration, is going to trombone traffic between sites according to some random algorithm.
In addition, the firewall synchronisation traffic must also be given high priority and if the sync data doesn’t occur quickly enough the firewalls seem to fail quite badly (again, apocryphal evidence here, not able to test in a live network).
Finally, Multicast/Broadcast must flow across the L2 link on the inside and outside interfaces for every VLAN.
Thus, for 100 Megabits of traffic across the firewall, 200 Megabits of broadcast traffic is generated, plus the Sync traffic (determined by the firewall rulebase, but usually a lot of traffic).
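The arithmetic, spelled out (the State Sync figure is an assumed number for illustration – in practice it depends on the rulebase and connection rate):

```python
# Rough arithmetic for the stretched-VLAN case: flooded multicast crosses
# the interconnect on both the inside and outside VLANs, plus sync.

firewall_traffic_mbps = 100
vlans_flooded = 2                    # inside + outside VLANs on the DCI
flooded_mbps = firewall_traffic_mbps * vlans_flooded
sync_mbps = 20                       # assumed State Sync figure; rule-dependent
total_dci_mbps = flooded_mbps + sync_mbps
print(flooded_mbps, total_dci_mbps)  # 200 220
```

So a modest 100 Mbps of firewall throughput consumes well over twice that on the interconnect before a single byte of legitimate server-to-server replication traffic is carried.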
This isn’t a good idea(tm) since you need that bandwidth for server-server communication as well. Should be obvious, I think.
The EtherealMind View
Experience suggests that Nokia/CheckPoint clustering works, but only at relatively low volumes – say up to 10 Mbps, at a rough, uneducated guess based on the customer’s existing clusters that do work reliably. However, as load increases on the firewall, the multicast/broadcast technique appears to cause serious service problems for devices on the same VLAN as the firewalls themselves. Since the static MAC option requires the firewall team and network team to operate closely together, it isn’t practical for very large support teams because of the level of specialisation that occurs in those teams. I have deep reservations about larger volumes of clustered traffic and have seen a number of inexplicable problems when clustering is enabled.
I would have liked to do some more testing, but project timescales are tight, and there is no way to have a lab of CheckPoint firewalls because of licensing and/or cost.
On the basis of this research, and recent experiences with service difficulties, I can’t recommend CheckPoint Nokia clustering because it appears to be a technology with more drawbacks than capabilities.