
arm64: Implement optimised checksum routine
author Robin Murphy <robin.murphy@arm.com>
Wed, 15 Jan 2020 16:42:39 +0000 (16:42 +0000)
committer Will Deacon <will@kernel.org>
Thu, 16 Jan 2020 15:23:29 +0000 (15:23 +0000)
commit 5777eaed566a1d63e344d3dd8f2b5e33be20643e
tree 9bf4f13c0209f26135e66073760b297ba0cac0d9
parent 46cf053efec6a3a5f343fead837777efe8252a46
arm64: Implement optimised checksum routine

Apparently there exist certain workloads which rely heavily on software
checksumming, for which the generic do_csum() implementation becomes a
significant bottleneck. Therefore let's give arm64 its own optimised
version - for ease of maintenance this foregoes assembly or intrinsics,
and is thus not actually arm64-specific, but does rely heavily on C
idioms that translate well to the A64 ISA and the typical load/store
capabilities of most ARMv8 CPU cores.

The resulting increase in checksum throughput scales nicely with buffer
size, tending towards 4x for a small in-order core (Cortex-A53), and up
to 6x or more for an aggressive big core (Ampere eMAG).
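
For readers unfamiliar with the technique, the following is a minimal
sketch of the kind of portable C idiom described above. It is not the
code added in arch/arm64/lib/csum.c. The names sketch_csum(),
add_carry64() and fold64() are hypothetical, and the sketch assumes the
buffer starts at an even offset within the checksummed data. It
accumulates 64-bit words with an end-around carry, which a compiler can
typically lower to an add-with-carry sequence, and folds the result to
16 bits at the end.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ones' complement add: wrap the carry-out back into bit 0. */
static uint64_t add_carry64(uint64_t sum, uint64_t word)
{
	sum += word;
	return sum + (sum < word);
}

/* Fold a 64-bit ones' complement sum down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
	sum = (sum >> 32) + (sum & 0xffffffffu);
	sum = (sum >> 32) + (sum & 0xffffffffu);
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (sum >> 16) + (sum & 0xffff);
	return (uint16_t)sum;
}

/*
 * Hypothetical illustration only: returns the 16-bit ones' complement
 * sum of the buffer, with the result bytes in memory (native) order.
 */
uint16_t sketch_csum(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t sum = 0, word;

	/* Bulk: one 64-bit load per iteration; memcpy tolerates any alignment. */
	while (len >= sizeof(word)) {
		memcpy(&word, p, sizeof(word));
		sum = add_carry64(sum, word);
		p += sizeof(word);
		len -= sizeof(word);
	}

	/* Tail: zero-pad the remaining 1..7 bytes into a final word. */
	if (len) {
		word = 0;
		memcpy(&word, p, len);
		sum = add_carry64(sum, word);
	}

	return fold64(sum);
}
```

Because the ones' complement sum is independent of byte order, summing
wide native-endian words and folding afterwards yields the same bytes
as summing 16-bit words one at a time. That property is what makes wide
loads usable here. A real implementation would additionally need to
handle buffers starting at odd addresses and unroll the bulk loop.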

Reported-by: Lingyan Huang <huanglingyan2@huawei.com>
Tested-by: Lingyan Huang <huanglingyan2@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
arch/arm64/include/asm/checksum.h
arch/arm64/lib/Makefile
arch/arm64/lib/csum.c [new file with mode: 0644]