A reader sent in this question via the contact page, and since the answer may not be obvious to everyone I thought it was worth writing up. The question:
Technology advances in ASIC hardware have resulted in substantial improvements in switching performances of routers and switches. However, the routing processes are still dependent on CPU speeds. What are the existing limitations in router/switch models which prevent route computations from being performed in hardware?
Right. Not an easy question to answer and there are multiple technology layers working together here.
Remember that forwarding and routing are two quite different network functions. Forwarding refers to the process of moving a frame or packet from the input interface to the output interface. Routing refers to the process of determining which output interface to use, which is the job of the routing protocols. In networking, forwarding needs to be very fast, but routing protocols must be reliable and have a bazillion features that very few people use.
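To make the split concrete, here is a minimal Python sketch. It is only an illustration: the prefixes, interface names, metrics and the `build_fib` / `forward` helpers are all invented for this example, and real routers do the lookup in hardware rather than with a linear scan. The routing side picks a best path per prefix and runs rarely; the forwarding side does a longest-prefix match for every single packet.

```python
import ipaddress

# Hypothetical routing table as a routing protocol might produce it:
# prefix -> list of (output interface, metric). All values are invented.
rib = {
    "10.0.0.0/8": [("eth1", 20), ("eth2", 10)],
    "10.1.0.0/16": [("eth3", 5)],
    "0.0.0.0/0": [("eth0", 1)],
}

def build_fib(rib):
    """Routing (control plane): choose the lowest-metric path per prefix."""
    fib = {}
    for prefix, candidates in rib.items():
        best_interface, _metric = min(candidates, key=lambda c: c[1])
        fib[ipaddress.ip_network(prefix)] = best_interface
    return fib

def forward(fib, destination):
    """Forwarding (data plane): longest-prefix match, done per packet."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in fib if address in net]
    if not matches:
        return None  # no route: the packet would be dropped
    return fib[max(matches, key=lambda net: net.prefixlen)]

fib = build_fib(rib)
print(forward(fib, "10.1.2.3"))   # eth3 (the /16 is the longest match)
print(forward(fib, "10.9.9.9"))   # eth2 (the /8, lowest metric)
print(forward(fib, "192.0.2.1"))  # eth0 (default route)
```

Note that `build_fib` runs once per routing change, while `forward` runs once per packet. That difference in call rate is why only the second one needs silicon.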
Once you design, test and then build a chip, you have lost a lot of flexibility because it’s made of REAL stuff. Most features cannot be changed, updated or modified. It’s fixed in place: all the bad bits are stuck once the chip rolls off the production line.
There are several corollary impacts, of which the biggest is testing. Manufacturing a chip costs many millions, or many tens of millions, of dollars just for the first production run, and with this level of capital investment a company cannot afford a mistake or an oversight. Testing a chip requires exhaustive preparation and a specialist team with specialist software to test not only the application but also the physical properties of the chip. Are the traces laid out correctly? Will the deposition layers work correctly? What about thermal packaging and heat dissipation? Is there current leakage between substrates? What about power consumption? Is everything clocking correctly?
Any mistake might require a hardware recall, or a delay while the chip is redesigned, both of which are expensive. Scheduled slots in chip fabs could be lost, and the financial losses from rebooking and cancellation hurt.
Many problems can be worked around in software, but the workaround is often complex and expensive. Compare this with software, where a handful of programmers drink some Mountain Dew and whack out some code for an off-the-shelf Intel i960 or MIPS CPU, do a standard testing run in a simulator with a software testing suite, and then release it for the customer to use. If they find any problems, they simply update the code and ship a new version to the customer.
When using hardware silicon for processing frames and packets, certain aspects of the system become finite. For example, Ternary Content Addressable Memory (TCAM) has a finite amount of space to hold address entries. Compare this to a software approach, where you can often increase the memory available to a given requirement as you need it (up to the maximum amount of DRAM in the system, of course). This is dynamic memory allocation.
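A rough way to picture the difference, again as a Python sketch rather than anything a real device does: `FixedTable` is a hypothetical stand-in for a hardware table whose size was decided when the chip was built, and the capacity of 4 is deliberately tiny. The plain dictionary stands in for a software table that simply asks for more memory.

```python
class FixedTable:
    """Stand-in for a hardware table: capacity is fixed at build time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}

    def add(self, prefix, interface):
        # A new entry is refused once every slot is in use.
        if prefix not in self.entries and len(self.entries) >= self.capacity:
            return False
        self.entries[prefix] = interface
        return True

hardware_table = FixedTable(capacity=4)  # hypothetical, tiny on purpose
software_table = {}                      # grows until memory runs out

rejected = 0
for i in range(10):
    prefix = f"10.{i}.0.0/16"
    software_table[prefix] = "eth0"
    if not hardware_table.add(prefix, "eth0"):
        rejected += 1

print(len(software_table), len(hardware_table.entries), rejected)  # 10 4 6
```

The software table takes all ten routes; the fixed table takes four and refuses the rest, and no software update can add more slots.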
Over time, new features are added to routing protocols. For example, MPLS is continuously being updated to become more like Frame Relay via MPLS-TE extensions. If you implement routing protocols in silicon, the customer would be required to purchase new hardware to get those features, which is not something that customers like to do regularly.
Why bother?
Perhaps the least considered part of the equation is “Why do you want to do that?”. When compared to other functions, the CPU and memory requirements for route computation are not intensive or even time sensitive. Consider these factors:
- Most routing protocols will black hole traffic during reconvergence and computation. They are designed this way.
- Everyone seems to accept this as normal and doesn’t want anything better.
- As long as black holes last less than 150 ms, no one notices anyway.
- Route recomputation on merchant silicon such as the Intel i960 can be achieved in that time frame.
- Existing CPU/memory architectures are cheap to buy and cheap to program, and the skills can readily be found.
- Most of the time spent during a reconvergence goes to sending and receiving neighbour updates over the network.
- You can’t change the speed of light.
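To give a feel for how small the computation itself is, here is a sketch of a shortest-path calculation in Python. I am assuming a link-state protocol here (OSPF and IS-IS use Dijkstra's algorithm for this step); the four-router topology, the link costs and the `shortest_paths` helper are made up for illustration, and a real implementation also tracks next hops, areas and much more.

```python
import heapq

# Hypothetical four-router topology: node -> {neighbour: link cost}.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

def shortest_paths(graph, source):
    """Dijkstra's algorithm: lowest total cost from source to every node."""
    cost = {source: 0}
    queue = [(0, source)]
    while queue:
        current_cost, node = heapq.heappop(queue)
        if current_cost > cost.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, link_cost in graph[node].items():
            new_cost = current_cost + link_cost
            if new_cost < cost.get(neighbour, float("inf")):
                cost[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return cost

print(shortest_paths(topology, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

The whole calculation is a few dozen lines running over a graph that fits comfortably in memory, which is the kind of work a general-purpose CPU handles well. The waiting is mostly for the updates to arrive from the neighbours, not for this loop to finish.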
The EtherealMind View
In summary: the cost of implementing routing protocols in silicon is high. The vendor would lose the flexibility to add new features and would have higher software and hardware development costs. At the same time, there is nothing to be gained by using custom silicon, since you would not increase the performance of the routing protocols themselves. I suspect that I’ve missed other reasons, but I’m sure people will add more in the comments. I look forward to hearing them.