10th February 2012

Why Didn’t Nortel Do Better? Cisco Wasn’t Always the Top Dog.

In response to Omar Sultan at Cisco on ‘Why you want this switch?’. In my view, Cisco IOS was buggy and slow, and the hardware was a poor design, but Nortel got usability and technical support very wrong. Customers chose Cisco anyway because the Cisco TAC made the problems not seem so bad.

There were a number of things about the Nortel BCN that were ahead of their time, and a number of reasons why I think the market didn’t take to the product overall. In the late nineties and early noughties I worked for a company that sold equipment from many vendors, and I remember how the engineering team steadily moved away from Nortel / Bay / Wellfleet to Cisco.

Multiprocessor OS with Modular Kernel.

When you bought a Nortel BCN you got a backplane and limited processing power. Every blade that you installed then added CPU processing to the chassis. The operating system distributed itself across all the CPUs, so the system grew in performance as you added interface / line cards. Each line card carried a number of CPUs to do its own processing. BayRS was also modular. You could actually take pieces out of the image that you did not use. Cisco has only implemented kernel modules in the last few years.

This meant that I spent only as much capital as I needed to get what I needed. The Cisco approach (by comparison) means that I buy a monster routing / processing / supervisor engine and then hope that I will need all that performance. Thus all the capital is required up front. I always felt that this was a good model for Cisco (and not so good for the customer) since Cisco gets all the revenue early in the lifecycle. This also has the effect of encouraging customers to buy more than they need (Aside: shout out to all those people with Cisco-standard Power over Ethernet – just wanted to say: I told you so).

Reliability

The flip side of the SMP approach was Cisco IOS, which was always bound by the central CPU. Because it was a monolithic OS, you could never increase performance by adding CPUs later. The search for better performance led to new CPUs, new hardware architectures and crufty software hacks (such as fast switching). It spawned a vast number of software releases and a lot of bugs, until Cisco improved their testing in 2003/2004. BayRS was much more reliable than IOS. And I mean a lot.

I remember that it was common practice to upgrade IOS as soon as some problem was encountered, and the upgrade would often fix the problem. (Note: this has not been the case in the last few years, since Cisco fixed the testing).

Long lifecycle.

The BCN was a viable product for nearly fifteen years. This was also true of the ASN router, which could be stacked to provide 50K pps to 200K pps in the final configuration. At that time a similar class of Cisco product had a lifecycle of less than five years before you had to forklift it out. Almost forgot, the Nortel modules were compatible through all that time. (Unlike Cisco modules, which were different for every chassis at the time).

Site Manager

Lovingly described by engineers as Site Damager or Site Mangler. This GUI console used SNMP to configure all areas of the system in a nice graphical interface. On the one hand, the GUI made configuration better and you could more clearly understand what your choices were. Compare with Cisco IOS, where the defaults were often invisible.

On the other hand, Site Manager had to match your BayRS version because of the SNMP dependence. In some cases it had to match exactly. This meant installing many versions of Site Manager on your laptop. Sadly, MS Windows was not up to the task (overlapping DLLs, drivers etc) and it was an engineer’s nightmare to turn up at a Nortel site at 4 o’clock in the morning with the wrong Site Manager on his laptop.

They also had a baffling numbering scheme for both BayRS and Site Manager – something like SM V5.1 was for BayRS V9.3 and so on. There was no easy way of knowing which was the right version. Cue more angst and stress.

Oh yes, how could I forget that Site Manager crashed a lot on Windows. Which was a problem when you were making a critical change on a router and it snarfed. Unhappy customer. Unhappy engineer. It was better on Unix though (but we didn’t have Unix laptops in those days).

Awful Diagnostics

It’s true that BayRS had truly atrocious diagnostic and debugging tools. I never got over that.

Nortel Support

Nortel Tech Support was good in the early days but ran down to laughable. When compared to the Cisco TAC, you just would not choose Nortel.

US centric

Nortel was always focussed on the US market. Getting Nortel to focus on customers outside the US was painful. Stock would often be diverted to the US without warning. Since I was always ‘rest of world’, it was just another negative.

It’s all about the Operation

You will notice I have not noted any negatives about the Nortel hardware. That is because, pound for pound, Nortel always stomped all over Cisco for performance, capability and cost. Where Nortel lost out was the lack of focus on usability and operational excellence but mostly the tech support. Cisco had a big edge with features, I think because the monolithic OS was easier to develop for.

It was easy to recommend Cisco when you had just had a tough night unwinding a Nortel network using a second rate interface and toolset after Site Mangler corrupted the configuration. You don’t forget the bad times easily. Using the Cisco IOS CLI was better than Site Manager. So when the customer asked the engineer what he recommended, the word was Cisco. When the engineer progressed to pre-sales, he said the same thing.

Cost doesn’t always matter

One of the lessons I learned is that cost doesn’t always matter. Cisco had a more expensive product that was less reliable and slower than the competitor’s. Cisco overcame these problems with better technical support, both in terms of diagnostics on the box and in terms of the TAC.

Someday, it might be interesting to consider whether the quality and capability of the modern TAC is a business response to the fact that their product needed a lot of support. The world class capability might have been a necessity to survive.

Conclusion

None of this is really relevant to today. Cisco has picked up their testing and product development so that IOS is not usually buggy or flawed. I stopped working on Nortel kit a few years ago so I can’t make comments about whether they have improved. But if Nortel was doing a good job then I think companies like Extreme and Foundry would not exist.

I recently worked on Alteon gear, and was surprised at how little the product had developed. Alteon was the pre-eminent market leader, ahead of Arrowpoint, when they were bought by Nortel and Cisco respectively. Cisco has developed load balancing extensively but the Alteon seems unchanged from five or more years ago.

Who knows if Nortel can come back, but it is wise to remember the lessons from yesteryear before considering Nortel again.

This post is copyright of Thropos Ltd ©2008-2011 at Etherealmind.com - contact | email: greg.ferro@packetpushers.net - twitter: @etherealmind | All rights reserved
About Greg Ferro

Greg Ferro is a Network Engineer/Architect, mostly focussed on Data Centre, Security Infrastructure, and recently Virtualization. He has over 20 years in IT, working as a freelance consultant for a wide range of employers including Finance, Service Providers and Online Companies. He is CCIE#6920 and has a few ideas about the world, but not enough to really count.

He is a host on the Packet Pushers Podcast, blogger at EtherealMind.com and on Twitter @etherealmind and Google Plus

  • http://www.cisco.com/go/dcswitching Omar Sultan

    Greg:

    I also worked for a network integrator in the early 90s and sold the heck out of SynOptics and Wellfleet and I had similar experiences to yours. We were one of the largest channels for both Cisco and Wellfleet so we had some insight into both companies. I think it came down to the fact that, at the time, Wellfleet was a hardware company and Cisco was a software company and they both designed to their strengths. Wellfleet had the most elegant hardware platform, by far, but the software was always painful. Even when they did something innovative like the distributed processing, they had problems keeping the routing processes in sync. As you note, at the time, Cisco gear tended to get control-plane bound.

    I will say that IOS did not start out that way. It is a good case study in how being responsive to customers is a double edged sword. I remember I had a customer having problems with LAT translation and I got on the phone with the developer (yes, you used to be able to do that) and we talked through the problem and he FTP-ed me a patched version of IOS. So, we had a version of IOS that existed in exactly two places in the universe: my customer’s network and the developer’s workstation. Now, this is incredibly cool and incredibly scary, and by v9.2 or so the wheels started to come off the wagon. The IOS folks have done a good job of turning things around since then.

    Omar

  • http://etherealmind.com Greg Ferro

    The thing that I find interesting is that Cisco has managed to end run around the IOS single processor limitation for so long. I suspect that customers are beginning to realise this because of the number of people who are commenting on the number of different types of IOS.

    Back in the days of IOS 9 & 10 this was commonplace, but I had niggling problems with IOS that were fixed by upgrading right up to 12.2. I clearly remember the Networkers session when Cisco announced that they were going to perform structured testing on IOS in 2002/2003, and the relief that IOS was probably going to work more often.

    I am forced to believe that the Testing Division at Cisco is up to the job of validating each software platform, but it must represent an enormous risk to the business. There is a possibility that Juniper or some other startup could make ground if there is a stumble here.