Av. Reina Mercedes s/n, 41012 (Seville)
There are plenty of blogs and documentation detailing how to fine-tune the Linux networking stack for improved performance (throughput, latency, scalability, etc.). Some of them contradict each other, and most are not up to date with the latest changes available in Linux today, nor with the latest CPU families.
We have taken a step back: starting from what is known to be the best-in-class setup for throughput (and similarly for latency), we analysed the performance of the Linux networking stack with the Linux profiling tool perf.
Our objectives were to identify bottlenecks (e.g. locks, memory copies, interrupt handlers, cache misses, TLB misses) and, where possible, to improve the existing code based on our findings.