Session

AMP To Reduce Network Jitter

Speakers

Satish Kumar
Fam Zheng

Label

Moonshot

Session Type

Talk

Contents

Description

In this paper we share our analysis of the jitter that the kernel network stack causes for datacenter applications which follow a microservice architecture by design and are distributed in nature. These applications usually follow an event-driven, non-blocking design pattern in which connections are multiplexed in a thread that implements the event loop using epoll and performs the read system calls. The assumption that the thread which calls read will also consume the data does not always hold, because the business logic usually runs on a different thread. Following that assumption, datacenter servers are generically configured in a scaled-out manner, with NIC (Network Interface Card) queues mapped to all, or as many as possible, of the CPUs. This asymmetry in userspace, combined with symmetry in kernel space, leads to unnecessary context switches and inefficient use of caches.
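The design pattern described above can be illustrated with a minimal sketch. It is not taken from any of the applications studied; the function names, the queue-based hand-off and the upper-casing "business logic" are all assumptions made for illustration. One thread runs the event loop (Python's selectors module uses epoll on Linux) and performs the reads, while a separate thread consumes the data.

```python
import queue
import selectors
import socket
import threading


def run_event_loop(conns, work_queue):
    """I/O thread: multiplex connections and perform the read calls."""
    sel = selectors.DefaultSelector()  # epoll-backed on Linux
    for conn in conns:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    # Run until every registered connection has been closed by its peer.
    while sel.get_map():
        for key, _ in sel.select(timeout=0.1):
            data = key.fileobj.recv(4096)
            if data:
                # Hand off: the thread that reads is not the one that consumes.
                work_queue.put(data)
            else:
                sel.unregister(key.fileobj)
                key.fileobj.close()
    sel.close()


def run_worker(work_queue, results):
    """Business-logic thread (hypothetical): consumes what the I/O thread read."""
    while True:
        data = work_queue.get()
        if data is None:  # sentinel: no more work
            break
        results.append(data.upper())


def demo():
    """Send one message through the two-thread pipeline and return the results."""
    client, server = socket.socketpair()
    work_queue, results = queue.Queue(), []
    io_thread = threading.Thread(target=run_event_loop, args=([server], work_queue))
    worker = threading.Thread(target=run_worker, args=(work_queue, results))
    io_thread.start()
    worker.start()
    client.sendall(b"ping")
    client.close()          # EOF lets the event loop terminate
    io_thread.join()
    work_queue.put(None)    # stop the worker once all reads are queued
    worker.join()
    return results
```

In this sketch the data crosses threads on every request, which is the situation in which mapping NIC queues to the reader's CPU does not guarantee that the consumer finds the data in a nearby cache.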

We will also share our findings from applying an asymmetric multi-processing (AMP) strategy to datacenter workloads. Our AMP strategy reserves CPUs for the kernel network stack and isolates applications to the remaining CPUs. The strategy considers statistics such as SoftIRQ load, overall CPU utilisation, packet throughput and latency to dynamically create the set of isolated CPUs. Such separation is known to provide some performance advantages, but our findings suggest that for the application design pattern described above the performance gains are significant. We identified that the AMP strategy leads to better cache efficiency and interference-free execution of NAPI (New API) and the application. We will present these details using the Redis cluster and the Netpoll RPC framework as case studies, where throughput per CPU utilisation improves by 25% and 12% respectively, along with a 10% reduction in 99.9th percentile latency. We will also discuss different design options for implementing the AMP strategy and the role of io_uring in further completing the pipeline model.
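As a rough illustration of the CPU split, the sketch below partitions a CPU list into a reserved network set and an application set. The sizing heuristic (total SoftIRQ load divided by a per-CPU utilisation ceiling), the `target_util` parameter and both function names are assumptions for illustration only; the actual strategy also weighs overall utilisation, packet throughput and latency, which are not modelled here.

```python
import math
import os


def partition_cpus(cpus, softirq_load, target_util=0.5):
    """Split cpus into (network_cpus, app_cpus).

    softirq_load maps CPU id -> fraction of time spent in SoftIRQ (0.0 to 1.0).
    target_util is a hypothetical utilisation ceiling for each reserved CPU.
    """
    cpus = sorted(cpus)
    if len(cpus) < 2:
        raise ValueError("need at least two CPUs to partition")
    total = sum(softirq_load.get(c, 0.0) for c in cpus)
    # Reserve enough CPUs to absorb the observed SoftIRQ load, at least one.
    wanted = max(1, math.ceil(total / target_util))
    # Always leave at least one CPU for the application.
    n_net = min(wanted, len(cpus) - 1)
    return set(cpus[:n_net]), set(cpus[n_net:])


def isolate_application(app_cpus, pid=0):
    """Restrict a process (0 = the caller) to app_cpus. Linux only."""
    os.sched_setaffinity(pid, app_cpus)
```

For example, `partition_cpus(range(8), {c: 0.15 for c in range(8)})` reserves CPUs 0 to 2 and leaves CPUs 3 to 7 for the application. Steering NIC interrupts to the reserved set is a separate, privileged step (typically done through the per-IRQ affinity files under /proc/irq) and is not shown.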