The key to high availability design is to consider the failure of each element in a system, and then to analyse how those elements interact, looking for failures that spread from one element to others. The size of the failure domain is known as the “blast radius” (as in how far the damage reaches when “something blows up”).
For example, a firewall failure in a typical corporate DMZ design will cause a total loss of Internet services.
The concept of “blast radius” is used to describe this effect. When something blows up, how far does the damage spread throughout the infrastructure or system?
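One way to make the idea concrete is to treat the system as a dependency graph and count what stops working when one element fails. The sketch below is illustrative only: the topology and names (`DEPENDS_ON`, `blast_radius`, `intranet_wiki` and so on) are hypothetical, and it assumes every dependency is a hard one, with no redundancy modelled.

```python
from collections import deque

# Hypothetical topology: each key depends on every element in its list.
DEPENDS_ON = {
    "web": ["firewall"],
    "mail": ["firewall"],
    "vpn": ["firewall"],
    "firewall": ["isp_router"],
    "intranet_wiki": ["core_switch"],
    "isp_router": [],
    "core_switch": [],
}


def blast_radius(failed, depends_on):
    """Return every element that directly or indirectly depends on `failed`."""
    # Invert the graph so we can look up who depends on a given element.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk outwards from the failed element.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


print(sorted(blast_radius("firewall", DEPENDS_ON)))
```

In this toy model the firewall takes out every Internet-facing service, while the internal wiki is untouched because it sits in a different failure domain.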
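In networking, the term is commonly used in relation to Layer 2 Networking or VLANs. The Ethernet protocol is highly dependent on broadcast, unknown unicast and multicast (BUM) frames, which switches flood to every port in the VLAN, for functions such as address resolution. However, these frames are readily susceptible to infinite looping in Ethernet switches, because an Ethernet frame carries no hop count or time-to-live field that would stop it circulating. At the same time, loops in the physical topology are required for path resilience.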
It is only a matter of time until a looping event occurs in any Ethernet network, so the key design factor is to limit the impact. The bigger the failure domain, the greater the impact on the campus network or data centre.
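A rough way to compare designs is to ask what fraction of hosts share a failure domain with the loop. The sketch below is a simplification under stated assumptions: the VLAN names and host counts are made up, `loop_impact` is a hypothetical helper, and the model assumes the loop stays contained in a single VLAN. In practice a storm can also saturate shared trunks or switch CPUs and hurt other VLANs, so treat the result as a lower bound.

```python
def loop_impact(vlan_hosts, looping_vlan):
    """Fraction of all hosts that sit in the VLAN where the loop occurs.

    Assumes the loop only affects hosts in `looping_vlan` (a simplification).
    """
    total_hosts = sum(vlan_hosts.values())
    return vlan_hosts[looping_vlan] / total_hosts


# Hypothetical designs with the same 1200 hosts.
flat = {"vlan10": 1200}
segmented = {"vlan10": 300, "vlan20": 300, "vlan30": 300, "vlan40": 300}

print(loop_impact(flat, "vlan10"))       # 1.0: every host is in the blast radius
print(loop_impact(segmented, "vlan10"))  # 0.25: only one segment is affected
```

The same looping event has a quarter of the reach in the segmented design, which is the argument for keeping Layer 2 domains small.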
A small explosive can have a very large blast radius.