git.osdn.net Git - android-x86/external-llvm-project.git/commit

author	wlei <wlei@fb.com>
	Mon, 11 Jan 2021 20:47:22 +0000 (12:47 -0800)
committer	Tom Stellard <tstellar@redhat.com>
	Sat, 20 Feb 2021 05:21:11 +0000 (21:21 -0800)
commit	e562ff08f634d814c1cd1e65e3428ca5308d3022
tree	03bccc898dd83f2b56f00e0f742ebb964cdc780f
parent	6209b0756d5df805f6279d3dadc8d2ba8648c3eb

[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation

For CS profile generation, call stack unwinding is time-consuming because each LBR entry needs linear time to generate its context (hashing, compression, string concatenation). This change speeds up unwinding by grouping all the call frames within one LBR sample into a trie and aggregating the results (sample counters) on it, deferring context compression and string generation until the end of unwinding.

Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that a raw sample can simply be recorded on the leaf node; the path from root to leaf represents its calling context. At the end, it traverses the trie and generates the contexts on the fly.
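
The idea can be illustrated with a minimal, standalone sketch. This is not the code from the patch: apart from `StackLeaf`, every name here (`FrameNode`, `UnwindState`, `pushFrame`, `popFrame`, `recordSample`, `collectContexts`) is hypothetical, frames are keyed by a bare address, and the ` @ ` separator in the context string is chosen only for illustration. It shows samples being counted on the current leaf in constant time, with context strings built once per trie node during a final traversal.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical trie node: one node per call frame, keyed by frame address.
struct FrameNode {
  uint64_t Address = 0;
  FrameNode *Parent = nullptr;
  uint64_t SampleCount = 0; // samples aggregated on this calling context
  std::map<uint64_t, std::unique_ptr<FrameNode>> Children;

  FrameNode *getOrCreateChild(uint64_t Addr) {
    std::unique_ptr<FrameNode> &Slot = Children[Addr];
    if (!Slot) {
      Slot = std::make_unique<FrameNode>();
      Slot->Address = Addr;
      Slot->Parent = this;
    }
    return Slot.get();
  }
};

// Hypothetical unwinder state: StackLeaf always points at the top frame.
struct UnwindState {
  FrameNode Root;
  FrameNode *StackLeaf = &Root;

  UnwindState() = default;
  UnwindState(const UnwindState &) = delete; // StackLeaf points into Root
  UnwindState &operator=(const UnwindState &) = delete;

  // A call pushes a frame: move the leaf down to the matching child.
  void pushFrame(uint64_t Addr) {
    StackLeaf = StackLeaf->getOrCreateChild(Addr);
  }
  // A return pops a frame: move the leaf back up to its parent.
  void popFrame() {
    assert(StackLeaf->Parent && "cannot pop the root frame");
    StackLeaf = StackLeaf->Parent;
  }
  // Recording a sample is O(1): no hashing or string building here.
  void recordSample(uint64_t Count = 1) { StackLeaf->SampleCount += Count; }
};

// Deferred step: walk the trie once and build a context string only for
// nodes that actually carry samples.
void collectContexts(const FrameNode &Node, std::vector<uint64_t> &Path,
                     std::map<std::string, uint64_t> &Out) {
  if (Node.SampleCount > 0) {
    std::string Context;
    for (size_t I = 0; I < Path.size(); ++I) {
      if (I != 0)
        Context += " @ ";
      Context += std::to_string(Path[I]);
    }
    Out[Context] += Node.SampleCount;
  }
  for (const auto &Entry : Node.Children) {
    Path.push_back(Entry.first);
    collectContexts(*Entry.second, Path, Out);
    Path.pop_back();
  }
}
```

In this sketch, repeated samples under the same call path hit the same trie node, so the string for that path is generated once at the end rather than once per sample.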

Results:
Our internal branch shows about a 5x speed-up on some large workloads in the SPEC06 benchmark.

Differential Revision: https://reviews.llvm.org/D94110

llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/PerfReader.h
llvm/tools/llvm-profgen/ProfiledBinary.cpp
llvm/tools/llvm-profgen/ProfiledBinary.h