2 September 2010

Blessay: Autonegotiation on Ethernet – It Works, It Should Be Mandatory!

I found this quote recently:

“The switch is configured to autodetect the speed and duplex settings on an interface. However, there are several things that can cause the autonegotiation process to fail, resulting in either speed or duplex mismatches (and performance issues). The rule of thumb for key infrastructure is to manually hard-code the speed and duplex on each interface so there is no chance for error.”


THIS IS VERY VERY VERY WRONG

Let me say it again: This is mythinformation.

Cisco(1), Sun(2), Dell(3) and all major manufacturers RECOMMEND auto-negotiation.

How did the myth get started?

Once upon a time (or at least, around 1996/7), there was no settled Ethernet autonegotiation standard that vendors implemented consistently, but everyone agreed this would be a good idea. At that time, Intel and 3Com were the biggest manufacturers of network adapters, and Cisco / 3Com / Nortel were the main switch vendors. So 3Com implemented their version and 3Com adapters worked on 3Com switches. Intel came up with their version and Intel adapters worked with Cisco switches. Nortel switches didn’t really work with anyone consistently. And the OEM equipment didn’t do much at all; it was hard setting or nothing.

This was a massive problem. In fact, I remember entire corporate networks had their network adapters swapped out of every computer. No-one really knew which adapters and switches could interoperate, and inter-switch connections were a real concern.

So we all just concluded that hard setting Ethernet speed and duplex was the best option.

Then the debatable parts of the autonegotiation specification that came in with 802.3u were cleaned up in the 1998 release of 802.3. New equipment from all the vendors that came onto the market after that time was compliant, and things got better. But we still had a lot of that older equipment around, and those network adapters were quite expensive back then.

So we all became used to hard setting Ethernet speed and duplex as the best option. But all of that problematic hardware should be gone by now, and we can consider changing the way we work.

Why doesn’t everyone know?

I suspect that this is one of the basic topics that no one ever researches. Everyone just “knows” that you always hard set duplex and speed. These days it is pretty much a mantra, especially among telco operations and some server teams.

Because we all “know”, it seems nobody has taken the time to fix it.

And because many people configure this by default, it forces everyone else to do the same.

What’s the problem?

One issue is that many people are also hard setting Gigabit Ethernet, and this is causing major problems. Gigabit Ethernet must have auto-negotiation ENABLED to allow negotiation of the master / slave PHY relationship for clocking at the physical layer. Without negotiation the line clock will not establish correctly and physical layer problems can result.

From Sun’s Best Practices on Ethernet Auto-negotiation (2):

Disabling autonegotiation can result in physical link issues going undetected. The Fast Link Pulse process does some testing for the physical link properties as well as negotiation on several Ethernet properties.

  • Unable to detect bad cables
  • Unable to detect link failures
  • Unable to check link partners capabilities
  • Unable to move systems from one port to another or to another switch or router
  • Unable to determine performance issues on higher layer applications
  • Unable to implement Pause Frames (Flow Control)(4)

From the IEEE standard:

All 1000BASE-T PHYs shall provide support for Auto-Negotiation
(Clause 28) and shall be capable of operating as MASTER or SLAVE.
Auto-Negotiation is performed as part of the initial set-up of the link, and
allows the PHYs at each end to advertise their capabilities (speed, PHY
type, half or full duplex) and to automatically select the operating mode
for communication on the link. Auto-negotiation signaling is used for the
following two primary purposes for 1000BASE-T:

a) To negotiate that the PHY is capable of supporting 1000BASE-T half
duplex or full duplex transmission.

b) To determine the MASTER-SLAVE relationship between the PHYs at
each end of the link. 1000BASE-T MASTER PHY is clocked from a local source.
The SLAVE PHY uses loop timing where the clock is recovered from the received data stream.
My emphasis
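
To make the negotiation step in the standard a little more concrete, here is a minimal Python sketch of how two ends might settle on an operating mode. It is an illustration only, not how any vendor's PHY is implemented: the function names (negotiate, parallel_detect) and the mode strings are made up for the example, the priority table is a simplified subset of the one in the standard, and the MASTER/SLAVE resolution is left out entirely. The second function models the case where the far end has negotiation switched off: the local port can work out the speed from the signal on the wire but has to assume half duplex, which is where the classic duplex mismatch comes from.

```python
# Simplified model of Ethernet autonegotiation mode selection.
# Modes are listed from most to least preferred. This is a subset of the
# priority order in the standard; rarely seen PHY types are omitted.
PRIORITY = [
    "1000BASE-T full",
    "1000BASE-T half",
    "100BASE-TX full",
    "100BASE-TX half",
    "10BASE-T full",
    "10BASE-T half",
]


def negotiate(local_adverts, partner_adverts):
    """Return the best mode that both ends advertise, or None if none is shared."""
    common = set(local_adverts) & set(partner_adverts)
    for mode in PRIORITY:
        if mode in common:
            return mode
    return None


def parallel_detect(partner_forced_mode):
    """Model an autonegotiating port facing a partner with negotiation disabled.

    The local port senses the partner's speed from the line signal but learns
    nothing about duplex, so it falls back to half duplex at that speed.
    """
    speed = partner_forced_mode.rsplit(" ", 1)[0]
    if speed == "1000BASE-T":
        # 1000BASE-T relies on negotiation, so there is no fallback here.
        return None
    return f"{speed} half"


if __name__ == "__main__":
    # Both ends advertise everything.
    print(negotiate(PRIORITY, PRIORITY))       # 1000BASE-T full
    # Partner advertises 10/100 only.
    print(negotiate(PRIORITY, PRIORITY[2:]))   # 100BASE-TX full
    # Partner is hard set to 100/full, local port is on auto.
    print(parallel_detect("100BASE-TX full"))  # 100BASE-TX half
```

The last call is the situation to watch for: one side hard set to 100/full, the other left on auto, and the auto side ends up at 100/half.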

Future Considerations

There are new standards for 10 Gigabit, but the new Converged Enhanced Ethernet (CEE) (Cisco has developed a superset called Data Centre Ethernet(c)) will require each side of a GigE connection to negotiate pause behaviour, queues and a number of other critical features.

If you haven’t already, you should consider changing your standards to use auto-negotiation everywhere because it is a prerequisite in the new technology.

I make an especial plea to Service Providers to change their standard practice to using Autonegotiation on Ethernet. We have to move with the times.

Operational Benefits

But most of all, set up auto-negotiation so I don’t have to think about configuring ports for each server. I am going nuts here with my staff having to manually configure every fricking port whenever a server moves around. It is a ridiculous waste of resources.

Plus it makes you more successful, since hard setting means that more mistakes will be made.

Comments invited, I would love to discuss this more. Tell all your friends!

Reference

(1) Cisco recommends leaving auto-negotiation on for those devices compliant with 802.3u. This would be every device since 1999 or so.

(2) – Sun Ethernet Autonegotiation Best Practices

Wikipedia The debatable portions of the autonegotiation specifications were eliminated by the 1998 release of 802.3. This was later followed by the release of IEEE 802.3ab in 1999. The new standard specified that gigabit Ethernet over copper wiring requires autonegotiation. Currently, all network equipment manufacturers—including Cisco[1]—recommend to use autonegotiation on all access ports. On rare occasions where autonegotiation may fail [2], it may still be necessary to force settings.

Cisco When configuring an interface speed and duplex mode, note these guidelines:

• If both ends of the line support autonegotiation, we highly recommend the default setting of auto negotiation.

(3) Dell – Gigabit Ethernet Auto-Negotiation

(4) Peter Lapukhov of Internetwork Expert posted a good article about why 802.3x pause frames should be disabled here


About Greg Ferro
Greg is a Network and Security Architect / Designer / Engineer working freelance in the UK who has worked for resellers, dotcoms, large corporates and service providers across a variety of products and vendors. He prefers to work for end users, believes in the life cycle, total cost of ownership and that near enough is often good enough. He likes talking about himself in the third person to feel "royal", even when hosting the Packet Pushers Podcast on Data Networking. More about Greg at http://etherealmind.com/who-am-i/ and you can follow him on Twitter.

Comments

  1. I think age is the big thing with autonegotiate. I’ve seen NICs on servers (usually IBM) with manufacture dates of 2002 fail miserably at autonegotiating (10/half to 100/full makes for a wonderful medium). With the economy dying a horrible death, I’m picturing a lot of old servers getting put back into service or being used for another few years to save a dollar.

    As you said, a 1G NIC needs auto, so time will heal all wounds.

  2. cciepursuit says:

    Great article. I did not know how GigE uses auto-negotiation. Cool information.

    There are 2 reasons that my company still requires hard setting speed and duplex on server ports (user ports are all auto/auto):

    1) $$$. Switches with GigE ports cost more, therefore our port charge for 1000Mbps is higher than the cost for 100Mbps. Nearly all servers now have GigE capable NICs. If we didn’t hard set our ports then the vast majority of our servers would auto-negotiate to GigE. I personally don’t care, but the bean counters do care.

    2) If the auto-negotiation process fails, it will set the duplex to half. This problem is usually caused when one side is hard set and the other is using auto-negotiation. In our case, our billing requirements usually result in this issue. We hard set our port to 100/Full and the server uses auto/auto. Everything looks good until we get the frantic call from the server owner cursing out the network. A quick ‘show interface gx/x counter error’ will usually show collisions. Then we begin the awkward and painful dance of “I know that you don’t have your server NIC set to 100/Full but we’re going to argue about this for an hour until I set my side to auto/auto to prove you wrong.”

    Because of issue 1, we run into a lot of issue 2. It is the bane of my existence that I don’t have access to the NIC settings on servers. It would make a good 20% of my troubleshooting go 90% quicker. I could then quickly check for the big three server problems: incorrect speed/duplex settings, incorrect subnet mask, and incorrect gateway.

    Anyhoo…great article. I agree that auto-negotiation should be used if possible.

    • Greg Ferro says:

      What are you going to do when the new Gigabit becomes the standard? CEE and DCE are coming, and will probably filter down from the Cisco Nexus to Catalyst 6500, 4500 etc. and you will need autoneg then. I can’t see the neg part being optional.

    • ian says:

      “If we didn’t hard set our ports then the vast majority of our servers would auto-negotiate to GigE”

      Depending on the equipment you’re using, you can set the maximum speed that’s being advertised in the autonegotiation process. In most, if not all, recent Cisco switches you can do this using the following command:

      someswitch(config-if)#speed auto ?
      speed list separated by spaces (up to 8 values total)

      Setting ‘speed auto 10 100’ will allow the switchport to negotiate to both 10 and 100 Mbit, but not 1 Gbit.

  3. ahenning says:

    I have to agree with the author. I’ve had issues where switch trunk links (3500-2900) were set on both sides to 100/full. The show interface output displays 100/full with no errors/collisions, but a throughput test gives exactly 10/half. Make both auto/auto and throughput goes up to 100/full.

  4. nevot says:

    Hi,
    In eleven years administering networks I’ve seen autonegotiation fail again and again. I won’t run my network on auto-negotiation under any circumstances.

    • Greg Ferro says:

      You might have poor cabling then. If you don’t have quality cabling then autoneg can be a problem. And if you have poor cabling, then autoneg is the least of your problems.

      • Anonymous says:

        I disagree. All our installations are CAT-6 verified and certified. Try to connect a Nokia Firewall to a cisco 2950 with autoneg and you will soon run into trouble. Problems also detected with Avaya Gateways. If you work also with embedded equipment (such as building alarms, air conditioning controls, etc) connected to your network, you will see also problems.

        My criteria is: access ports for users, autoneg. Access ports for servers and other equipments, forced 100-full. In gigabit networks I run autoneg.

        • Greg Ferro says:

          No, the problem is the Nokia firewall. They are infamous for being stupendously awful at doing this. A notable exception to the rule.

          You are actually following my recommendation. Use autoneg where you can, and hard set where you must. Embedded equipment typically uses really old and cheap chipsets so I would expect them to be a problem.

          I would bet from your description that more than 90% of your ports are auto?

        • Jay Moran says:

          Exactly. Haven’t finished reading the comments. But after a couple of years of analysis we determined the failures of 802.3u were due to high bit rate transfers. If the port was extremely loaded with 80-90Mbps of traffic for a sustained period of time, one side (host or switch) would decide that they should fall back to 100-Half instead of their originally negotiated 100-Full. Our data center cabling is all Cat-5e or Cat-6 and each run is certified as well.

          This was with various flavors of Foundry switches from the late 90s to really just a couple of years ago. We’ve maintained the same standard in the data center for any Fast Ethernet attached host. If Gigabit though it has always been autoneg since after the first few months more than a decade ago when we assisted Foundry and Cisco both in fixing their autoneg code since it often failed for no good reason.

          And yes, for employee/office crap, we’ve always relied on 802.3u and never really had widespread problems.

  5. shef says:

    I saw problems with a new Avaya IP PBX box and a new 2960.

  6. nevot says:

    Nokia Firewall, Avaya PBX, toshiba tecra s3… in fact, I only trust in autosensing when same manufacturer equipment is in both ends (and not much trusting because, as you said, cabling is not always as we expect).

    yes 90% of the network is autosensing, because 90% of the network is users. But the point is not how much % is in autosensing. Not quantity, but quality. Just as I pointed: for infrastructure, forced. For end users, autosensing. Having problems on one end point affects only one user. Having problems on one server port or one trunk affects lots of users.

    Worst case: I’ve seen problems connecting a PC to a Juniper firewall, just because they both have auto-mdix interfaces. Yeah, I know, it’s not part of this standard, but there was no link between these devices, with a perfect cable. It took disconnecting and connecting several times until one of them worked out the correct mdix. Just disabling auto-mdix on the PC solved the problem. My poor conception of this being ‘auto’ is based on many problems found again and again. This pain will surely be less and less with time.

    One note: Have you reviewed LLDP? One of its characteristics is exchanging information between switches and end equipment (such as Avaya IP phones) about, among other things, the speed and duplex configured. LLDP fits media selection not only on the phy layer but the network layer. I think LLDP is also a must in future for all network equipment. Why did anybody designing LLDP (or CDP, or NDP or any of the variants that converged on LLDP) think about the necessity of advertising media capabilities available and used on an upper protocol if autoneg is so wonderful? Surely (I think) because they have seen people running into trouble again and again.

  7. nullrouter says:

    Good article on autonegotiation

    I absolutely agree with using the inherent autonegotiation on 1000baseT links.

    However, with Cisco optic links, I’ve had to rely on the ‘speed nonegotiate’, which disables the link negotiation, on gig optic ports to get them to come up. I also seem to recall Cisco publishing a bug on optic ports without this command being enabled.

    I work in a carrier IP/MPLS/MAN environment, where we manage connections between our Cisco switches and client equipment of various types. We hard code anything less than 1000baseT, as most clients are running Fast Ethernet ports on their equipment still. I’ve seen autonegotiation fail, and devices come up in half duplex after a power hit at a client site, causing us to get unnecessary packet loss/throughput faults.

    Personally from experience, I would recommend only running autonegotiation on anything less than 1000BaseT, where you have administrative control on both ends of the link.

  8. Dan says:

    Hi

    I don’t agree with this article, mainly because my experience, even recently on HP desktops connected into either 3560 or 3750 switches, has been painful when using auto-negotiate on either Fast or Gig Ethernet. Basically either duplex or speed is nearly always incorrectly set. So now I don’t even bother relying on auto-negotiate, I just manually set the speed and duplex.

    • Jim says:

      The acid test – get the same problems booting BSDs/Solaris/Linux kernels?

      -if yes, suspect hardware or physical layer
      -if no, contact Microsoft/HP/whoever for your support

      We’ve never had an auto-neg fail under a non-Microsoft OS on desktop or workstation machines. Even the so-called “WHQL” drivers give regular trouble with auto-neg, especially to Cisco kit.
      (my angle – as a software-to-OS-to-hardware compatibility clearing/proving house, we link up and tear down workstations at the rate of 600+ a day, for around two week cycles, 52 weeks a year, across 3 continents)

  9. Josh Horton says:

    Greg,

    I recently ran into an issue where this article was a HUGE help. Thanks.

    I had a vendor’s router connected to a customer’s switch. Both were hard coded 100/full but for some reason, the connection was horrible. The strange thing is that only the vendor’s router displayed any errors. Anyways…

    After setting both sides to auto the problem was solved. Had I not read this article, I would have stared at the problem for a month or so saying, “Well, both sides are hard-coded… I don’t know what to tell you. Your router must be junk…” :)

    I guess old habits die hard. Thanks again!

    Josh

  10. zakki a says:

    Greg,
    Great article. In terms of the standard, yes, auto negotiation is the first thing we have to try. But auto negotiation failure is not a myth. It is true.

  11. Scott says:

    I work at a major vendor of server application software that pretty much requires GigE throughputs, and while I agree with you in theory, the practical is that it doesn’t work that way in the real world. There are two hardware vendors that make 90% or better of server NICs and only a handful of vendors who make 90% or better of the switches out there. The combinations of several of these simply do not work when set to full Auto on both sides. If there was better enforcement of the IEEE standards, we could all play in your ideal world. But the truth is that many of these vendors in question need serious help. I don’t know how they can get away with charging server NIC prices for NICs that have the same bugs (or worse) as simple desktop NICs.

  12. Dmitri says:

    Flow control on GigE ports on 3550 sucks. It starts sending pause frames *way* before a link starts approaching full capacity. Not sure about other equipment, but the said experience with 3550 did not make me happy (and aware).

  13. Dennis says:

    Why do I want to hard set EVERY 10/100 port in the network?

    1) Last week, Cisco 2950 switch and Cisco 2800 router both powered up after an outage. Cisco did not successfully autonegotiate with Cisco switch. Worked well enough that no one noticed til trading started over that link.

    2) Panasonic Toughbooks, circa 2004/5, and Cisco 4500 10/100 port. Don’t remember the ethernet chipset vendor for the Panasonics offhand. Would not ever, under any circumstances, negotiate properly. We had over 100 of those laptops in our company. PITA!

    As you say, Gig is a different animal, but hard setting speed/duplex is a good practice, learned by many of us in the school of hard knocks.

    m00tpoint

  14. Tim says:

    If only the world was that perfect Greg

    I’ve had LOADS of problems with autoneg.
    I’ve NEVER had a problem where both ends are properly nailed.

    No-brainer IMHO, why take the chance? Especially if you don’t have access to both ends to check that it auto’d correctly _this_ time. (Just ‘cos it got it right last time doesn’t mean that it will again)

    I concede Gig is very different and shouldn’t be tarred with the same brush.

    Tim

    • Greg Ferro says:

      I have had all sorts of problems, but ultimately, auto-neg is the best default condition. I move to hard-config when there is a problem. That is my recommendation to everyone.

  15. Pushkar Bhatkoti says:

    good research. The point is if any vendor doesn’t follow the standard it’s not your fault. Your concept still works e.g. leave auto-neg on.

    well written.

    -Pushkar Bhatkoti

  16. joe says:

    I come from the telco world and school of hard knocks. Yes, if the pairing of equipment requires auto, or if we are talking Gigabit ether, then you must auto.

    Dennis hits the nail on the head: you have an unplanned reboot, then the sucker goes to some crappy setting, and no one notices until the business day hits. The throughput sucks, and the issue can be transient in nature. Once you figure out auto bit you in the ass one more time, management says “Damn it, turn off auto”. And as management usually has a few more gray hairs, they still remember the bad old days.

    So in the real world, non-gigabit will be explicitly set if you want to keep your job.

    A red herring: some folks will say you can’t manage all the settings without provisioning errors. That is a gap in your server management process, not a byproduct of explicit settings. Are your configuration files in /etc set to “auto”? I don’t think so.

  17. pete bateman says:

    I would suggest that your experience is all quite recent. In the early days of full duplex Ethernet and Fast Ethernet, interoperability between various manufacturers’ implementations of the standard was pretty bad. I.e. a 3Com NIC to 3Com switch might be fine. Plug a Nortel router into your switch and it would all go south. I have personally seen this many times. Us old hands got bitten by this so often that we always recommend nailing the port config down on interlinks and servers, i.e. if a desktop comes up wrong it’s a minor issue, if your server suddenly decides it’s going to be half-duplex you are in deep whatsits. I have seen this happen. More recently you are right, it generally just works, but don’t tell everyone that it’s infallible, ’cos it ain’t.

    • Greg Ferro says:

      By Crom, you missed the point completely. The first paragraph clearly says it was a problem (when I was a younger engineer) then I say it isn’t a problem any more.

      The most likely cause of the duplex mismatch today is faulty cable. Not a hardware problem with Nortel, 3Com or anyone else. And I explain the negotiation process so that you can understand why.

      Of course, you are smarter than Cisco, Sun, Dell and many other people because you don’t believe or even mention the evidence presented here.

      I don’t have a very high opinion of your company and you are confirming that very concept. Please pay attention and read the article before you criticise.

      • James says:

        I don’t think that reply required the sarcastic response you gave just because someone disagreed with your opinion. I have come across too many interfaces stuck in a half-100/full-10 state that I always hard code the speed (GigE is an exception). Cisco to Cisco is generally fine, but problems happen with interop between vendors.

        You are living in a theoretical world where you think things work the way they should. This doesn’t happen in the real world and I don’t want to be called up at 3am when I am getting packet loss across a link – something I could have avoided by hard coding. Before you ask, I install Cat-6 as standard which has all been verified before being used!

        James

  18. Calin says:

    It’s nice to see people getting involved and discussing pros and cons of doing it one way or another. I have read your thoughts and I agree that network engineers should push autonegotiation in the network, where it’s possible, but let’s not forget that in many cases “paper talk” and how things behave in real environments are different.
    I always receive instructions from the server team when adding a new device, to configure the port on xxx speed and full duplex. In some cases I wanted to test what’s going on and I left the ports on auto and they did just fine, but sometimes it’s not working at all. It’s true also that I don’t know how new or old the device at the other end of the cable is.
    I agree 99% with this article, just the following phrase gives me a dark view of a network: “If you haven’t already, you should consider changing your standards to use auto-negotiation everywhere because it is a prerequisite in the new technology.” I imagine a new network engineer who goes in and issues an interface range command on all switches and applies auto negotiation without being aware of the problems that might appear. He will be “crucified” for causing an interruption to services due to auto negotiation failure.
    After all, when nothing is working, you can also blame the networking :)

  19. realworld says:

    My experience for the last 10 years says to statically set it, even for gig. There’s also the fact that with it statically set, no time is wasted on negotiation.

  20. Nick H says:

    I’m willing to call autonegotiate a myth, not because there aren’t devices out there that won’t autonegotiate, but because it rarely seems to be the cause of problems anymore.

    I worked for a large educational resnet back in the day (as early as 2002), and we never saw autonegotiate problems with the 4000+ diverse devices we dealt with each year.

    It takes two hands and both feet to count all the times I’ve had a vendor want to pin down a port for troubleshooting purposes, but can’t recall a single time autonegotiate was the cause of a problem.

