Merge branch 'page_pool-stats'
author David S. Miller <davem@davemloft.net>
Thu, 3 Mar 2022 09:55:28 +0000 (09:55 +0000)
committer David S. Miller <davem@davemloft.net>
Thu, 3 Mar 2022 09:55:28 +0000 (09:55 +0000)
commit a8ff736d31396cdd913660c34ff77b549aa853b3
tree a9679043361f8ad4fc325267136a919cb3f2aa66
parent 42f0c1934c7cb3e94c2fe8f5771245fa5631d0e7
parent cc10e84b2ec3aea2cb82129bdf4185d4c05a486d

Joe Damato says:

====================
page_pool: Add stats counters

Greetings:

Welcome to v9.

This revision adds a commit which updates the page_pool documentation to
describe the stats API, structures, and fields.

Additionally, this revision contains a minor cosmetic change suggested by
Saeed in page_pool_recycle_in_ring in commit 2: "page_pool: Add recycle
stats", which removes an unnecessary #ifdef.

There are no functional changes in this revision.

Benchmark output from the v7 cover [1] is pasted below, as it is still
relevant since no functional changes have been made in this revision:

Benchmarks have been re-run. As always, results between runs are highly
variable; you'll find results showing that stats disabled are both faster
and slower than stats enabled in back to back benchmark runs.

Raw benchmark output with stats off [2] and stats on [3] is available for
examination.

Test system:
- 2x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
- 2 NUMA zones, with 18 cores per zone and 2 threads per core

bench_page_pool_simple results, loops=200000000
test name                      stats enabled      stats disabled
                               cycles   nanosec   cycles   nanosec

for_loop                            0     0.335        0     0.336
atomic_inc                         14     6.106       13     6.022
lock                               30    13.365       32    13.968

no-softirq-page_pool01             75    32.884       74    32.308
no-softirq-page_pool02             79    34.696       74    32.302
no-softirq-page_pool03            110    48.005      105    46.073

tasklet_page_pool01_fast_path      14     6.156       14     6.211
tasklet_page_pool02_ptr_ring       41    18.028       39    17.391
tasklet_page_pool03_slow          107    46.646      105    46.123

bench_page_pool_cross_cpu results, loops=20000000 returning_cpus=4:
test name                      stats enabled      stats disabled
                               cycles   nanosec   cycles   nanosec

page_pool_cross_cpu CPU(0)       3973  1731.596     4015  1750.015
page_pool_cross_cpu CPU(1)       3976  1733.217     4022  1752.864
page_pool_cross_cpu CPU(2)       3973  1731.615     4016  1750.433
page_pool_cross_cpu CPU(3)       3976  1733.218     4021  1752.806
page_pool_cross_cpu CPU(4)        994   433.305     1005   438.217

page_pool_cross_cpu average      3378         -     3415         -

bench_page_pool_cross_cpu results, loops=20000000 returning_cpus=8:
test name                      stats enabled      stats disabled
                               cycles   nanosec   cycles   nanosec

page_pool_cross_cpu CPU(0)       6969  3037.488     6909  3011.463
page_pool_cross_cpu CPU(1)       6974  3039.469     6913  3012.961
page_pool_cross_cpu CPU(2)       6969  3037.575     6910  3011.585
page_pool_cross_cpu CPU(3)       6974  3039.415     6913  3012.961
page_pool_cross_cpu CPU(4)       6969  3037.288     6909  3011.368
page_pool_cross_cpu CPU(5)       6972  3038.732     6913  3012.920
page_pool_cross_cpu CPU(6)       6969  3037.350     6909  3011.386
page_pool_cross_cpu CPU(7)       6973  3039.356     6913  3012.921
page_pool_cross_cpu CPU(8)        871   379.934      864   376.620

page_pool_cross_cpu average      6293         -     6239         -
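
The "average" rows in the cross-CPU results appear to be the truncating
integer mean of the per-CPU cycle counts listed above them. That is an
inference from the numbers, not something the benchmark output states. A
small sketch that reproduces the returning_cpus=4 averages (the helper
name mean_cycles and the array names are illustrative only):

```c
#include <stddef.h>
#include <stdint.h>

/* Truncating integer mean of per-CPU cycle counts; 0 for an empty set. */
static uint64_t mean_cycles(const uint64_t *cycles, size_t n)
{
	uint64_t sum = 0;

	if (n == 0)
		return 0;

	for (size_t i = 0; i < n; i++)
		sum += cycles[i];

	return sum / n;
}

/* returning_cpus=4 cycle counts, CPU(0)..CPU(4), copied from the results. */
static const uint64_t enabled_4[]  = { 3973, 3976, 3973, 3976, 994 };
static const uint64_t disabled_4[] = { 4015, 4022, 4016, 4021, 1005 };
```

With these inputs, mean_cycles() gives 3378 for stats enabled and 3415
for stats disabled, matching the reported averages.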

Thanks.

[1]: https://lore.kernel.org/all/1645810914-35485-1-git-send-email-jdamato@fastly.com/
[2]: https://gist.githubusercontent.com/jdamato-fsly/d7c34b9fa7be1ce132a266b0f2b92aea/raw/327dcd71d11ece10238fbf19e0472afbcbf22fd4/v7_stats_disabled
[3]: https://gist.githubusercontent.com/jdamato-fsly/d7c34b9fa7be1ce132a266b0f2b92aea/raw/327dcd71d11ece10238fbf19e0472afbcbf22fd4/v7_stats_enabled

v8 -> v9:
- Add documentation about the page_pool_get_stats API, stats
  structures, and fields to Documentation/networking/page_pool.rst.
- Remove unnecessary #ifdef in page_pool_recycle_in_ring.

v7 -> v8:
- Rename mlx5 ethtool stats so that users have a better idea of
  their meaning.

v6 -> v7:
- stats split out into two structs: a single struct per page pool
  for allocation path stats and a per-cpu pointer for recycle
  path stats.
- page_pool_get_stats updated to use a wrapper struct to gather
  allocation and recycle stats with a single argument.
- placement of structs adjusted
- mlx5 driver modified to use page_pool_get_stats API
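
The layout described in this list (one allocation-stats struct per pool,
per-cpu recycle stats, and a wrapper struct filled through a single
argument by a function returning bool) can be modelled in userspace as
below. This is only a sketch: every mock_* name, the reduced set of
counters, and the fixed CPU count are assumptions made for illustration,
and a plain array stands in for the kernel's per-cpu pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MOCK_NR_CPUS 4 /* hypothetical CPU count for this model */

static_assert(MOCK_NR_CPUS > 0, "model needs at least one CPU");

/* Allocation-path counters: one copy per pool (illustrative subset). */
struct mock_alloc_stats {
	uint64_t fast;
	uint64_t slow;
};

/* Recycle-path counters: one copy per CPU (illustrative subset). */
struct mock_recycle_stats {
	uint64_t cached;
	uint64_t ring;
};

/* Wrapper so a caller passes a single argument for both groups. */
struct mock_pool_stats {
	struct mock_alloc_stats alloc_stats;
	struct mock_recycle_stats recycle_stats;
};

/* Stand-in for the pool; the array models the per-cpu recycle stats. */
struct mock_pool {
	struct mock_alloc_stats alloc_stats;
	struct mock_recycle_stats recycle_stats[MOCK_NR_CPUS];
};

/*
 * Copy the allocation counters and add every CPU's recycle counters
 * into *stats. Returns false when either pointer is NULL. Recycle
 * counters are accumulated, so the caller zeroes *stats beforehand.
 */
static bool mock_pool_get_stats(const struct mock_pool *pool,
				struct mock_pool_stats *stats)
{
	if (!pool || !stats)
		return false;

	memcpy(&stats->alloc_stats, &pool->alloc_stats,
	       sizeof(pool->alloc_stats));

	for (int cpu = 0; cpu < MOCK_NR_CPUS; cpu++) {
		stats->recycle_stats.cached += pool->recycle_stats[cpu].cached;
		stats->recycle_stats.ring += pool->recycle_stats[cpu].ring;
	}

	return true;
}
```

Because the recycle counters are summed into the caller's buffer, the
same wrapper could be passed for several pools in turn to obtain
totals, which is how a driver with one pool per queue might batch its
stats in this model.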

v5 -> v6:
- Per cpu page_pool_stats struct pointer is now marked as
  ____cacheline_aligned_in_smp. Placement of the field in the
  struct is unchanged; it is the last field.

v4 -> v5:
- Fixed the description of the kernel option in Kconfig.
- Squashed commits 1-10 from v4 into a single commit for easier
  review.
- Changed the comment style of the comment for
  the this_cpu_inc_alloc_stat macro.
- Changed the return type of page_pool_get_stats from struct
  page_pool_stat * to bool.

v3 -> v4:
- Restructured stats to be per-cpu per-pool.
- Global stats and proc file were removed.
- Exposed an API (page_pool_get_stats) for batching the pool stats.

v2 -> v3:
- patch 8/10 ("Add stat tracking cache refill") fixed placement of
  counter increment.
- patch 10/10 ("net-procfs: Show page pool stats in proc") updated:
  - fix unused label warning from kernel test robot,
  - fixed page_pool_seq_show to only display the refill stat
    once,
  - added a remove_proc_entry for page_pool_stat to
    dev_proc_net_exit.

v1 -> v2:
- A new kernel config option has been added, which defaults to N,
  preventing this code from being compiled in by default
- The stats structure has been converted to a per-cpu structure
- The stats are now exported via proc (/proc/net/page_pool_stat)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>