Session

Rethinking Zero Copy Networking with MAIO

Speakers

Alex Markuze

Label

Moonshot

Session Type

Talk

Contents

Description

Berkeley Sockets (a.k.a. BSD or POSIX sockets) are ubiquitous in network communication. They have been the de facto standard API for network I/O since they were introduced almost four decades ago.
With the advent of high-speed Ethernet, the performance overhead of BSD Sockets became evident. Attempts to avoid these overheads have spurred a trend of kernel bypass techniques, e.g., DPDK, AF_XDP, and Netmap. By bypassing the kernel, these methods attempt to avoid the performance penalties associated with BSD Sockets, i.e., memory copies, system calls, and a slow network stack. However, with great performance comes the great responsibility of re-creating the same network infrastructure that already exists inside the kernel. Kernel developers attempt to close the performance gap by adding new capabilities, most notably XDP, MSG_ZEROCOPY, and tcp_mmap, but none of the proposed solutions is a panacea.
In this work, we propose a new paradigm for userspace networking, aiming to shrink the performance gap between BSD Sockets and kernel bypass techniques, allowing application developers to keep the standard BSD sockets API, network stack (e.g., TCP) and network tools without compromising on performance. 
We introduce MAIO, a dedicated memory allocator for networking that inherently facilitates zero-copy I/O operations. We modify the kernel memory management system to implement dynamic memory segregation, introducing I/O-only pages that are shared between the user, the kernel, and the device. These pages are used only for I/O and can never be used by the kernel for any other purpose. This scheme facilitates zero-copy I/O while isolating kernel memory from the user. Additionally, we leverage existing hardware capabilities (e.g., NVIDIA QPs, Intel ADQ) to facilitate isolation between processes.
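The idea of a segregated pool of I/O-only pages can be illustrated with a small userspace toy model. This is only a sketch and not the MAIO API: the names `IoPagePool` and `transmit` are hypothetical, an anonymous `mmap` region stands in for memory shared between the user, the kernel, and the device, and the "send path" simply returns a view of the buffer to show that no copy is made.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE


class IoPagePool:
    """Toy model of a pool of I/O-only pages (hypothetical, not the MAIO API)."""

    def __init__(self, num_pages):
        # One anonymous mapping stands in for the shared I/O region.
        self._region = mmap.mmap(-1, num_pages * PAGE_SIZE)
        self._view = memoryview(self._region)
        self._free = list(range(num_pages))

    def alloc(self):
        """Return (page index, writable view of that page) without copying."""
        index = self._free.pop()
        start = index * PAGE_SIZE
        return index, self._view[start:start + PAGE_SIZE]

    def free(self, index):
        """Return a page to the pool; it stays reserved for I/O use."""
        self._free.append(index)

    def owns(self, buf):
        """True only if buf is a view backed by this pool's region."""
        return isinstance(buf, memoryview) and buf.obj is self._region


def transmit(pool, buf, length):
    """Stand-in for a send path: accepts only pool-backed buffers.

    Returns a view of the payload rather than a copy, so the consumer
    reads the very bytes the application wrote.
    """
    if not pool.owns(buf):
        raise ValueError("buffer is not an I/O page")
    return buf[:length]


pool = IoPagePool(4)
index, page = pool.alloc()
page[:5] = b"hello"
sent = transmit(pool, page, 5)
```

In this model, `sent` aliases the page the application filled in, and a buffer allocated outside the pool (for example a plain `bytearray`) is rejected, which mirrors the segregation between I/O pages and all other memory described above.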
MAIO is the first design to provide zero-copy networking while still taking advantage of the robust kernel network stack without compromising the system’s security. It is currently used to facilitate efficient networking for our company's next-generation SD-WAN gateways.