Gnodal – A New Type of Fabric and Silicon – Impressive

I went on a personal “Tech Field Day” this evening and visited Gnodal to talk about their Ethernet switches.

Let me try and summarise my takeaways in bullet points.

1. When you connect their switches back-to-back, they automatically recognise this and “fabric enable” the connection. This fabric connection means you can build a Fat-Tree, a multistage Clos, or whatever fabric style suits your network requirements.

1a. Gnodal switches are really, really fast. The interface MACs are embedded in the core chip, so there are very few components in the box. This leads to a practical pricing model.

1b. They are shipping three models. The GS7200 has 72 x 10GbE ports (that’s seventy-two 10GbE ports in a TWO RU device).
The GS4008 has 40 x 10GbE and 8 x 40GbE ports.
The GS0018 has 18 x 40GbE ports and can be used as an Ethernet switch or as a fabric backbone.

2. This fabric connection uses up to 16 x 40GbE ports on the switches for multipathing into the fabric.

2a. The internal fabric is lossless, with less than 120ns of delay.

3. The multipathing algorithm scales linearly past 32 uplinks.

4. They have a “frame fairness” concept that ensures all conversations are handled fairly at the Ethernet layer, thus removing congestion.

5. They have their own, internally developed ASIC that makes all of the current merchant silicon look like technology from the 1990s.

6. They don’t buffer. Because the fabric is non-blocking, and frame fairness removes congestion points, the switch is much simpler, costs less, and uses less than 150W per 1RU. I didn’t believe it until they took the lid off.

7. They do funky multicast handling for low-latency forwarding, i.e. less than 250ns for an arbitrarily large switch fabric with a multipath, fabric-enabled core (as I understand it). A rough sketch of this latency arithmetic follows below.
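
To make the latency claims a little more concrete, here is a rough back-of-the-envelope sketch in Python. The arithmetic is mine, not Gnodal’s, and the 120ns edge and 65ns per-hop figures come from my notes of the conversation (see the comments below), so treat them as indicative only.

    # Rough one-way latency estimate for a frame crossing a Gnodal fabric.
    # Figures are from my notes (120ns at the Ethernet edge, ~65ns per
    # additional fabric hop) and are indicative only, not vendor benchmarks.

    EDGE_NS = 120     # Ethernet edge to fabric port
    PER_HOP_NS = 65   # each fabric hop thereafter

    def fabric_latency_ns(hops):
        return EDGE_NS + hops * PER_HOP_NS

    for hops in (1, 2, 3):
        print(hops, "fabric hop(s): ~", fabric_latency_ns(hops), "ns")

    # Two fabric hops comes out at ~250ns, which looks like where the
    # “less than 250ns” figure in point 7 comes from.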

The EtherealMind View

These guys have built their own chips using design methods that are completely their own. By attacking the fabric problem from a different angle (not unlike, but still different to, Juniper’s QFabric) they have created something unique. And, quite possibly because they are not in Silicon Valley, they have not followed the same path as the other vendors.

Yes, I’m impressed. If you are a cloud provider, an Internet exchange, or have to build a large data centre fabric, it would be worth taking the time to learn and understand how Gnodal does their stuff and compare it to your current vendors. It’s a new approach that changed my perception of how switching works (and reinforces the QFabric approach that Juniper has taken).

I’m hoping they will come on the podcast in the future to talk in more detail.

You should check it out.

  • Wes Felter

    This doesn’t sound that much better than Fulcrum/Intel or Broadcom. Other merchant silicon has 64-72 ports, integrated MACs, <200W, lossless, etc. And Clos is so last year; I am totally over it. :-)

    • http://etherealmind.com Etherealmind

      Mmm, 72 x 10GbE? Am I missing something?

      • Wes Felter

        Maybe you’re missing http://www.fulcrummicro.com/alta_micro.htm

      • Ryan Malayter

        You may want to correct your post, as Gnodal’s site claims 72x10G in ONE RU.

        Still, that’s only a 12.5% density increase over the many Broadcom Trident+ solutions already shipping. And with TRILL hardware support (though software TRILL is still “promised” in a future software release by most vendors), you can have a plug-and-play layer-2 fabric.

        What could be revolutionary is the claimed latency, which seems impossibly low. And the lossless flow control (is that a proprietary scheme)? Wouldn’t NIC support be required for that?

        It is impossible to have “no congestion” when you have two or more ports fanning in to one, so that sounds like marketing BS.

        It is also impossible to have a non-blocking network without equal ingress and egress bandwidth on each switch, and even then very smart and fast adaptive routing is required. Clos networks with n=m are *rearrangeably* non-blocking only, and cannot behave like a crossbar switch.

        Gnodal needs to put some actual technical detail and benchmarks out there on that site, because right now it screams “snake oil”. Are these things actually shipping?
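
        For reference, the standard conditions for a three-stage Clos network with n inputs per ingress switch and m middle-stage switches are easy to state. This is textbook switching theory (nothing Gnodal-specific), sketched here in Python:

            # Textbook blocking classification for a 3-stage Clos network.
            # n = inputs per ingress switch, m = number of middle-stage switches.
            def clos_blocking_class(n, m):
                if m >= 2 * n - 1:
                    return "strictly non-blocking"
                if m >= n:
                    return "rearrangeably non-blocking"
                return "blocking"

            print(clos_blocking_class(n=8, m=8))    # rearrangeably non-blocking (n == m)
            print(clos_blocking_class(n=8, m=15))   # strictly non-blocking (m >= 2n - 1)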

        • http://etherealmind.com Etherealmind

          From my notes (and my understanding from a brief chat), the ingress latency for a frame (i.e. Ethernet edge to fabric port) is claimed to be 120ns, and then 65ns per fabric hop thereafter.

          Flow control is achieved using an in-band signalling mechanism that monitors end-to-end congestion in the fabric and dynamically adapts the flow path around any congestion in the core while maintaining IP packet ordering. This “frame fairness” (my term) is used to manage all Ethernet frames.

          The “no congestion” claim is based on how you build your network. The core silicon has 18 x 40GbE ports. Each 40GbE port can be subdivided into 4 x 10GbE for edge connections to servers using a simple copper splitter.
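
          As a rough port-budget sketch of that fan-out (my arithmetic, not a Gnodal configuration guide, and the edge/fabric split is illustrative only):

              # Rough port budget for one 18 x 40GbE Gnodal chip, assuming each
              # edge-facing 40GbE port is split into 4 x 10GbE with a copper splitter.
              CHIP_40G_PORTS = 18

              def edge_10g_ports(fabric_uplinks):
                  edge_40g = CHIP_40G_PORTS - fabric_uplinks
                  return edge_40g * 4   # each 40GbE port splits into 4 x 10GbE

              for uplinks in (0, 8, 9):
                  print(uplinks, "uplinks ->", edge_10g_ports(uplinks), "x 10GbE edge ports")

              # 0 uplinks -> 72 x 10GbE (the GS7200 port count), 8 -> 40 x 10GbE
              # (the GS4008 mix), 9 -> 36 x 10GbE with a 1:1 edge/fabric split.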

          Because the silicon die is small, you can build a number of these chips into a single board (or a couple of boards) and use their fabric capability internally.

          Yes the products are shipping today. I saw a number of test beds that are conducting proof of concept for large customers, and we discussed the specific requirements of each one.

          The core team at the company previously built networking interconnects for Quadrics supercomputers, and this silicon is derived from those experiences.

          What I’m finding here is that they have produced their own silicon design, manufactured it, and brought it to market. It’s still an early-stage business, but they appear to have sound technology and a viable platform. The price, I believe, is much less than the major vendors, but more expensive than the Broadcom Trident white-box switches. Let’s face it, the Trident has plenty of problems and design limits, but Gnodal seems to have an edge at this time.

          I’m hoping they will come on the podcast in the near future and talk in more detail about some of their features. Let’s see what happens.

          • Ryan Malayter

            Glad to hear they have some real credible people behind it, and I did manage to find some datasheets on their site using a real browser instead of my phone.

            A few technical “here’s how we do it” white papers would be very helpful to prospective customers. Enterprise buyers generally don’t want to do the slow dance with a salesperson just to qualify that a solution *might* meet their needs and is worth adding to the RFP list. Maybe the HPC market is different.

            It all sounds very impressive, especially the adaptive routing in the face of fabric congestion. Most of the academic literature I’ve read involves adaptive routing algorithms “tuned” to a particular topology, or adaptive routing algorithms that require global knowledge of the flow state in the whole network. If it’s truly topology-agnostic and can do non-minimal routing, as seems to be implied… wow.

            Isn’t the traditional measure of network device latency something like “Δt from last bit into the physical ingress port to last bit out of the egress port”? Not sure how that compares with the latency measurements Gnodal quotes…

  • Jon Hudson

    120ns? Please explain. Even InfiniBand can’t do that. Just the PHYs should be 40-100ns each.

    But on everything else either way, NICE job Gnodal!

    • Jon Beecroft (Gnodal)

      Our cut-through fabric latency is actually 66ns + cable delays. We have designed all our own PCS, MACs and core logic IP. About 20ns of this latency comes from the TX and RX SerDes IP. We have an additional store-and-forward minimum latency of 150ns at the Ethernet edge.

      • Mark Berly

        Looking at your latest data sheet it states:

        “Port-to-port Latency, (40+8) port network: 150ns (RFC 1242 for Store and Forward)”

        I am guessing you are measuring this with good old RFC 1242 testing, which in store-and-forward mode is a LIFO measurement, not a FIFO measurement as it is with a cut-through switch. In short, this means the actual latency is much higher. And if you are measuring in cut-through mode, then why does your data sheet mention store and forward? Or are you measuring your cut-through mode with LIFO, which would completely distort the already distorted RFC 1242…

        • Jon Beecroft

          You are, of course, right about RFC 1242. I did not write it and do not agree with it either, but we are store-and-forward on the edge and cut-through in our fabric, and that is the recognized definition.

          That said, we can send a frame FIFO in 200ns for 10GbE and less than 180ns for 40GbE. I might be wrong, but I don’t think anyone else can do that.

          But the latency looks great for higher port-count networks. For example, with a true multi-path fat-tree network using dynamic adaptive routing and no reduction in cross-sectional bandwidth:
          720 x 10GbE ports = 332ns FIFO, small frames (3 hops)
          6480 x 10GbE ports = 464ns (5 hops)

  • Mark Berly

    How is your multicast support? 

  • Ryan Malayter

    Are there any shipping switches based on this Fulcrum silicon with 72x10G ports?