A senior Google software engineer called for new techniques to clear congestion from growing data center networks in a keynote at the annual Hot Interconnects event here. The techniques could form a new application ripe for a silicon accelerator.
Nandita Dukkipati described traffic-shaping software that is slashing latencies while trimming CPU overhead for the search giant. One chip maker said it has already baked similar techniques into one of its Ethernet adapters at the request of Google’s rival Microsoft.
Today, many data centers manage traffic by creating queues to improve efficiency, but that approach is hitting a wall. Google described techniques using time-based isolation to prevent competing jobs from colliding.
“We should invest more in isolation techniques across the board in NICs, switches and hypervisors. We pay attention to efficiency, but not enough to isolation — we think of queues but let’s think of time,” Dukkipati told an audience of networking chip and systems engineers.
Queues eat up CPU time computing complex algorithms that use hefty data structures and require significant garbage collection. In addition, they are heavy users of memory and require synchronization across processes that can add as much as a second to latencies.
“Today’s servers can hold hundreds of virtual machines, generating 25,000 flows to isolate. The numbers of VMs and queues are growing, and it’s not sustainable,” she said.
As an alternative to today’s traffic shapers, Dukkipati described two techniques Google aims to merge. Carousel is a new Google program that manages traffic at a single server. Timely is an older technique it uses to reduce latencies across its data centers.
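The shift from queues to time is easiest to see in code. The sketch below illustrates the general idea behind Carousel-style pacing as it has been described publicly: instead of keeping a queue per flow, each packet is stamped with a release time computed from its flow’s rate and dropped into a single timing wheel that one consumer drains in time order. The class names, slot count and granularity here are illustrative assumptions, not Google’s implementation.

```python
import collections


class TimingWheel:
    """A single time-indexed structure standing in for many per-flow queues.

    Packets are tagged with a release timestamp and placed in the slot covering
    that time; one consumer walks the wheel and transmits whatever has come due.
    (Illustrative sketch only; slot count and granularity are assumptions.)
    """

    def __init__(self, slot_granularity_us=50, num_slots=2048):
        self.granularity = slot_granularity_us          # width of one slot, in microseconds
        self.slots = [collections.deque() for _ in range(num_slots)]
        self.num_slots = num_slots
        self.current_slot = 0
        self.now_us = 0

    def schedule(self, packet, release_time_us):
        # Map the packet's release time to a slot; anything beyond the wheel's
        # horizon is clamped to the farthest reachable slot.
        delay_slots = max(0, int((release_time_us - self.now_us) / self.granularity))
        delay_slots = min(delay_slots, self.num_slots - 1)
        idx = (self.current_slot + delay_slots) % self.num_slots
        self.slots[idx].append(packet)

    def advance(self, new_now_us):
        """Step the wheel forward, releasing every packet whose time has come."""
        released = []
        while self.now_us < new_now_us:
            while self.slots[self.current_slot]:
                released.append(self.slots[self.current_slot].popleft())
            self.now_us += self.granularity
            self.current_slot = (self.current_slot + 1) % self.num_slots
        return released


class PacedFlow:
    """Per-flow pacing state: just a rate and the next allowed send time."""

    def __init__(self, rate_bytes_per_us):
        self.rate = rate_bytes_per_us
        self.next_send_us = 0

    def release_time(self, now_us, packet_len_bytes):
        # The timestamp is where the previous packet's transmission slot ends.
        start = max(now_us, self.next_send_us)
        self.next_send_us = start + packet_len_bytes / self.rate
        return start


if __name__ == "__main__":
    wheel = TimingWheel()
    flow = PacedFlow(rate_bytes_per_us=1.25)            # roughly a 10 Gb/s pace
    for seq in range(4):
        ts = flow.release_time(wheel.now_us, packet_len_bytes=1500)
        wheel.schedule(f"pkt-{seq}", ts)
    # In a real pacer, advance() would run on a timer tick; here one jump
    # releases all four packets, spaced ~1,200 us apart in wheel time.
    print(wheel.advance(new_now_us=5000))
```

Because the only per-flow state is a rate and a timestamp, the memory and CPU cost scales with the number of wheel slots rather than the number of flows, which is the scaling property Dukkipati argued per-VM queues lack.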
Carousel improved network performance by 8.2 percent over existing queuing-based traffic managers in tests run on thousands of YouTube servers. It points the way to “more exciting stuff to come in new policies for network pacing and bandwidth allocation,” she said.
The Timely approach cut latencies by more than an order of magnitude compared to DCTCP, one of the most commonly used congestion-control algorithms in today’s data centers, she reported.
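Timely’s core idea, as Google has described it in published work, is to treat the slope of measured round-trip times as the congestion signal: speed up while RTTs are flat or falling, back off in proportion to how quickly they rise. The snippet below is a simplified sketch of that rate-update rule; the thresholds, gains and class name are assumptions for illustration, not the production parameters.

```python
class TimelyLikeRateControl:
    """Simplified RTT-gradient rate control in the spirit of TIMELY.

    The sender adjusts its rate from the slope of recent RTT samples rather
    than from drops or ECN marks. All constants are illustrative assumptions.
    """

    def __init__(self, rate_mbps=1000.0, min_rtt_us=20.0,
                 t_low_us=50.0, t_high_us=500.0,
                 alpha=0.875, beta=0.8, add_step_mbps=10.0):
        self.rate = rate_mbps
        self.min_rtt = min_rtt_us
        self.t_low, self.t_high = t_low_us, t_high_us
        self.alpha, self.beta, self.step = alpha, beta, add_step_mbps
        self.prev_rtt = None
        self.rtt_diff = 0.0                      # smoothed RTT difference

    def on_rtt_sample(self, rtt_us):
        if self.prev_rtt is None:
            self.prev_rtt = rtt_us
            return self.rate

        # Smooth the RTT difference and normalize it by the baseline RTT.
        new_diff = rtt_us - self.prev_rtt
        self.prev_rtt = rtt_us
        self.rtt_diff = (1 - self.alpha) * self.rtt_diff + self.alpha * new_diff
        gradient = self.rtt_diff / self.min_rtt

        if rtt_us < self.t_low:                  # far from congestion: probe upward
            self.rate += self.step
        elif rtt_us > self.t_high:               # hard brake on very long RTTs
            self.rate *= 1 - self.beta * (1 - self.t_high / rtt_us)
        elif gradient <= 0:                      # queues draining: additive increase
            self.rate += self.step
        else:                                    # queues building: back off with the gradient
            self.rate *= 1 - self.beta * min(gradient, 1.0)
        return self.rate


if __name__ == "__main__":
    cc = TimelyLikeRateControl()
    for rtt in [60, 65, 80, 120, 110, 90, 70, 60]:   # synthetic RTT samples (microseconds)
        print(f"rtt={rtt:>4} us -> rate={cc.on_rtt_sample(rtt):8.1f} Mb/s")
```

Reacting to the RTT gradient rather than to loss or ECN marks is what lets this style of control respond to congestion before queues fill, which is where the latency advantage over DCTCP comes from.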
Dukkipati called for engineers to apply the techniques, noting they could be implemented in hardware or software that is distributed or centralized.
“We feel like we are just blazing a trail with the software right now. It’s an interesting time where we are trying more things out in hardware,” she said.
A Mellanox engineer said she “described exactly what we put in our ConnectX-3 Pro” Ethernet adapter at the recommendation of engineers running Microsoft’s Azure service.