Posts

Showing posts from July, 2023

Cisco Joins 51.2T Switch-Chip Crowd

Four chip vendors may not sound like a crowd, but in the leading-edge data-center switch segment, that number is likely unsustainable over the long term. The problem is the small number of customers for these devices, that is, the hyperscalers. Despite stiff competition, however, Cisco continues to invest in its Silicon One product line, which now includes 5nm switch chips. The new top-end device is the Silicon One G200, a 51.2Tbps chip built around an internally developed 112Gbps serdes. As a refresher, the company announced production of its 7nm 25.6Tbps G100 chip last October along with design wins in new Cisco platforms. As the figure below shows, Cisco makes some bold claims regarding the G200. The most startling is the statement that the G200 is twice as power efficient as its predecessor. In other words, the G200 dissipates the same power as the G100 at twice the throughput. The company's new serdes design must improve power efficiency, as the move from 7nm to 5nm should account f

Ultra Ethernet Promises New RDMA Protocol

This week saw the formal launch of the Ultra Ethernet Consortium (UEC), which aims to reinvent Ethernet fabrics for massive-scale AI and HPC deployments. An impressive list of founding members backs this ambitious effort: hyperscalers Meta and Microsoft; chip vendors AMD, Broadcom, and Intel; OEMs Arista, Atos, and HPE; and Cisco, which straddles the chip and OEM camps. Absent this backing, we could easily write off this consortium as doomed to failure. Our skepticism is rooted not in the obvious need the UEC looks to serve but rather in the challenges of standardizing and implementing a full-stack approach. The effort plans to replace existing transport protocols as well as user-space APIs. Specifically, the Ultra Ethernet Transport (UET) protocol will be a new RDMA protocol to replace RoCE, and new APIs will replace the Verbs API from the InfiniBand heritage. UET will provide an alternative to RoCEv2 and Amazon’s SRD, both of which are deployed in hyperscale data centers. (Source: Ul

Spectrum-X: It's Bigger Than Software

There's been a lot of confusion around Spectrum-X, some of which NVIDIA seems to have created intentionally. The company's branding is part of the issue, as it seems to conflate Spectrum-X with the Spectrum line of Ethernet switch chips. In fact, Spectrum-X is simply a software license that enables new features across a set of existing hardware products. Describing Spectrum-X as merely a set of software, however, devalues what NVIDIA has actually delivered. Working on top of the company's end-to-end Ethernet hardware, the software creates the first merchant congestion-managed Ethernet fabric. Minimizing tail latency is critical to AI-training workloads, as detailed in our recent white paper. We use the merchant qualifier because some hyperscalers have developed their own congestion-management schemes that work with standard Ethernet-switch hardware. One example is Amazon, which developed the scalable reliable datagram (SRD) protocol for use with its internally developed Ni