Session

Diagnosing Page Pool Leaks

Instructors

Dragos Tatulea

Label

Nuts and Bolts

Session Type

Tutorial

Description

Page pools have become the standard way of doing memory management on the RX data path. Implementing page pool support in a NIC driver is not rocket science. Once such support is added and the obvious problems are dealt with, the driver author may still face a long tail of issues related to page pool leaks. While these leaks are not critical, they are frustrating to deal with.

This talk provides practical guidelines for diagnosing page pool leaks and narrowing down their sources. Drawing from real-world examples [1][2], it demonstrates debugging techniques and heuristics for differentiating true leaks from false positives. Although techniques for handling such issues have been discussed on the Netdev mailing list [3][4], clear examples have been lacking; this talk aims to fill that gap.

The session will benefit driver authors who need to debug or understand such issues. It may also interest savvy system administrators who want to understand and assess the significance of these leaks.

References:

[1] f232de7cdb4b (“net/mlx5e: SHAMPO, Fix page leak”)

[2] aaab619ccd07 (“net/mlx5e: XDP, Fix XDP_REDIRECT mpwqe page fragment leaks on shutdown”)

[3] Add netlink-based introspection for page pools: https://lore.kernel.org/netdev/ZWfuyc13oEkp583C@makrotopia.org/T/#m5746652f9f93b71f120a072b206732a6fcc89f5e

[4] https://lore.kernel.org/netdev/20240814075603.05f8b0f5@kernel.org/