OSDN Git Service
Mauro Rossi [Sat, 1 Sep 2018 15:24:18 +0000 (17:24 +0200)]
android: InstCombine: add support for InstCombineTable.inc TableGen rules
Reference:
commit
7f7cea5306
("InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded")
Mauro Rossi [Sat, 1 Sep 2018 16:27:47 +0000 (18:27 +0200)]
android: AMDGPU: Separate R600 and GCN TableGen files
Reference:
cba2181e77 ("AMDGPU: Separate R600 and GCN TableGen files")
Mauro Rossi [Sat, 1 Sep 2018 14:24:14 +0000 (16:24 +0200)]
android: [IR] Split Intrinsics.inc into enums and implementations
Reference:
af7c445 ("[IR] Split Intrinsics.inc into enums and implementations")
Mauro Rossi [Sat, 1 Sep 2018 14:10:24 +0000 (16:10 +0200)]
android: Rename {Attributes,Intrinsics}.gen to {Attributes,Intrinsics}.inc
Reference:
ea0775ec4a ("Rename Attributes.gen, Intrinsics.gen to Attributes.inc, Intrinsics.inc")
Mauro Rossi [Sat, 1 Sep 2018 11:25:45 +0000 (13:25 +0200)]
android: [ARM] Unify handling of M-Class system registers
Reference:
683224ecbd ("[ARM] Unify handling of M-Class system registers")
Mauro Rossi [Sat, 1 Sep 2018 02:03:27 +0000 (04:03 +0200)]
android: [mips][rtdyld] Move MIPS relocation resolution to a subclass
Reference:
2762dbad70
"[mips][rtdyld] Move MIPS relocation resolution to a subclass and implment N32 relocations"
Mauro Rossi [Sat, 1 Sep 2018 01:49:03 +0000 (03:49 +0200)]
android: [PM] Create a separate library for high-level pass management code.
Reference:
2349136630 ("[PM] Create a separate library for high-level pass management code.")
Mauro Rossi [Sat, 1 Sep 2018 01:24:26 +0000 (03:24 +0200)]
android: [MSF] add support for libLLVMDebugInfoMSF
Reference:
5e117855c3 ([msf] Resubmit "Rename Msf -> MSF".)
Mauro Rossi [Sat, 1 Sep 2018 01:22:37 +0000 (03:22 +0200)]
android: NFC: Rename (PDB) RawSession to NativeSession
Reference:
11e1d83ce5 ("NFC: Rename (PDB) RawSession to NativeSession")
Mauro Rossi [Sat, 1 Sep 2018 00:36:45 +0000 (02:36 +0200)]
android: Move Object format code to lib/BinaryFormat.
Reference:
19ca2b0f9d ("Move Object format code to lib/BinaryFormat.")
Mauro Rossi [Fri, 31 Aug 2018 23:48:08 +0000 (01:48 +0200)]
android: [X86][GlobalISel] Initial implementation , select G_ADD gpr, gpr
Reference:
8ae3570fa804
("[X86][GlobalISel] Initial implementation , select G_ADD gpr, gpr")
Mauro Rossi [Fri, 31 Aug 2018 23:45:32 +0000 (01:45 +0200)]
android: [X86][AVX512] Adding new LLVM TableGen for EVEX2VEX tables
Reference:
b59d8041db [X86][AVX512] Adding new LLVM TableGen backend which
generates the EVEX2VEX compressing tables.
Mauro Rossi [Fri, 31 Aug 2018 23:23:04 +0000 (01:23 +0200)]
android: [X86][GlobalISel] Add general-purpose Register Bank
Reference:
f4df3d44a5 ("[X86][GlobalISel] Add general-purpose Register Bank")
Mauro Rossi [Fri, 31 Aug 2018 23:11:05 +0000 (01:11 +0200)]
android: [ARM] GlobalISel: Use TableGen instruction selector
Reference:
33dd8eaf5d ("[ARM] GlobalISel: Use TableGen instruction selector")
Mauro Rossi [Fri, 31 Aug 2018 22:51:14 +0000 (00:51 +0200)]
android: [ARM] Tablegen-erate current Register Bank Information.
Reference:
e0fd89875d
("[globalisel][arm] Tablegen-erate current Register Bank Information.")
Mauro Rossi [Fri, 31 Aug 2018 22:47:29 +0000 (00:47 +0200)]
android: AMDGPU: Support using tablegened MC pseudo expansions
Reference:
5991ecc3e6 ("AMDGPU: Support using tablegened MC pseudo expansions")
Mauro Rossi [Fri, 31 Aug 2018 20:55:52 +0000 (22:55 +0200)]
android: IR: Rename the tablegen'd Attributes file to .gen
Reference:
e255157a48 ("IR: Rename the tablegen'd Attributes file to .gen")
Mauro Rossi [Sat, 25 Aug 2018 21:45:24 +0000 (23:45 +0200)]
android: Aarch64,AMDGPU: enable GlobalISel
Reference commits:
8097fcb40b ("[GlobalISel] Add basic Selector-emitter tblgen backend.")
ca24065b98 ("[globalisel] Tablegen-erate current Register Bank Information")
945c85d877 ("AMDGPU/GlobalISel: Add support for simple shaders")
Mauro Rossi [Sat, 25 Aug 2018 18:03:29 +0000 (20:03 +0200)]
android: add llvm/Config/abi-breaking.h header for host and device
Fixes the following building error:
external/llvm/include/llvm/Support/Error.h:21:10:
fatal error: 'llvm/Config/abi-breaking.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Mauro Rossi [Sat, 25 Aug 2018 17:59:34 +0000 (19:59 +0200)]
android: add llvm/Config/llvm-platform-config.h header
Fixes the following building error:
In file included from external/llvm70bp/lib/Support/LockFileManager.cpp:10:
In file included from external/llvm/include/llvm/Support/LockFileManager.h:12:
In file included from external/llvm/include/llvm/ADT/Optional.h:20:
In file included from external/llvm/include/llvm/Support/AlignOf.h:17:
In file included from external/llvm/include/llvm/Support/Compiler.h:18:
external/llvm/device/include/llvm/Config/llvm-config.h:98:10:
fatal error: 'llvm/Config/llvm-platform-config.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Mauro Rossi [Sat, 25 Aug 2018 14:50:30 +0000 (16:50 +0200)]
android: Moving libFuzzer from LLVM to compiler-rt.
Reference:
ec925a2578 ("Moving libFuzzer from LLVM to compiler-rt.")
Mauro Rossi [Sat, 25 Aug 2018 17:58:03 +0000 (19:58 +0200)]
android: Move lib/LibDriver -> lib/ToolDrivers/llvm-lib. NFCI.
Reference:
a844915ce0 ("Move lib/LibDriver -> lib/ToolDrivers/llvm-lib. NFCI.")
Mauro Rossi [Sat, 25 Aug 2018 11:23:34 +0000 (13:23 +0200)]
DO NOT MERGE: util: build only TableGen
Necessary to avoid collisions with external/llvm utils
Mauro Rossi [Sat, 25 Aug 2018 10:59:12 +0000 (12:59 +0200)]
DO NOT MERGE: android: avoid re-building tools
Necessary to avoid Android Build System targets conflicts with external/llvm
Mauro Rossi [Sat, 25 Aug 2018 10:50:12 +0000 (12:50 +0200)]
android: update version in {config,llvm-config}.h headers
Version information is set to 7.0.0
Mauro Rossi [Sat, 25 Aug 2018 10:44:44 +0000 (12:44 +0200)]
android: add soong building rules
Android.bp and building rules including Android specific .go files
are imported from oreo-x86 llvm 3.9 branch to the release_70 branch
cd ~/oreo-x86_kernel/external
git clone https://github.com/llvm-mirror/llvm llvm70
cd ~/oreo-x86_kernel/external/llvm70
git reset --hard
git fetch llvm-mirror release_70
git checkout FETCH_HEAD
git clean -dxf
cd ~/oreo-x86_kernel/external/llvm
find . -name 'Android.bp' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'Android.bp' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'llvm.go' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'llvm.go' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'tblgen.go' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'tblgen.go' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'android_test.sh' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'android_test.sh' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'AsmParsers.def' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'AsmParsers.def' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'AsmPrinters.def' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'AsmPrinters.def' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'Disassemblers.def' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'Disassemblers.def' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'Targets.def' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'Targets.def' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'config.h' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'config.h' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
find . -name 'llvm-config.h' -exec cp -v --parents {} ~/oreo-x86_kernel/external/llvm70 \;
find . -name 'llvm-config.h' -exec git -C /home/utente/oreo-x86_kernel/external/llvm70 add {} \;
cd ~/oreo-x86_kernel/external/llvm70
Tim Northover [Mon, 1 Feb 2021 12:43:33 +0000 (12:43 +0000)]
GlobalISel: check type size before getZExtValue()ing it.
Otherwise getZExtValue() asserts.
(cherry picked from commit
c2b322fc19e829162ed4c7dcd04d9e9b2cd4e66c)
Luboš Luňák [Tue, 6 Apr 2021 16:38:18 +0000 (18:38 +0200)]
remove -fpch-codegen and -fpch-debuginfo from Clang 12.0 release notes
These were new in 11.0. The commit adding the options landed after
11.x branch had already been branched off from master, and only
then backported to 11.x, so the release notes change stayed for 12.0.
Lang Hames [Sun, 28 Mar 2021 23:30:47 +0000 (16:30 -0700)]
[ORC][C-bindings] Fix some ORC C bindings function names and signatures.
LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not
have matching signatures between the C-API header and binding implementations.
Fixes http://llvm.org/PR49745.
Patch by Mats Larsen. Thanks Mats!
Reviewed by: lhames
Differential Revision: https://reviews.llvm.org/D99478
(cherry picked from commit
666df2e2cbe9fc252d3b2d6cbb214c2c2f6afc65)
LemonBoy [Wed, 17 Mar 2021 15:59:55 +0000 (16:59 +0100)]
[LoopVectorize] Refine hasIrregularType predicate
The `hasIrregularType` predicate checks whether an array of N values of type Ty is "bitcast-compatible" with a <N x Ty> vector.
The previous check returned invalid results in some cases where there's some padding between the array elements: eg. a 4-element array of u7 values is considered as compatible with <4 x u7>, even though the vector is only loading/storing 28 bits instead of 32.
The problem causes LLVM to generate incorrect code for some targets: for AArch64 the vector loads/stores are lowered in terms of ubfx/bfi, effectively losing the top (N * padding bits).
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D97465
(cherry picked from commit
4f024938e4c932feba4d28573ec4522106f8d879)
ShihPo Hung [Tue, 30 Mar 2021 21:30:15 +0000 (14:30 -0700)]
[RISCV][MC] Fix nf encoding for vector ld/st whole register
The three bit nf is one less than the number of NFIELDS,
so we manually decrement 1 for VS1/2/4/8R & VL1/2/4/8R.
Differential revision: https://reviews.llvm.org/D98185
(cherry picked from commit rG5cdb2e98608bf57c216ee7067e8a12d070c9e2bd)
Sanjay Patel [Sat, 13 Mar 2021 13:26:27 +0000 (08:26 -0500)]
[InstCombine] avoid creating an extra instruction in zext fold and possible inf-loop
The structure of this fold is suspect vs. most of instcombine
because it creates instructions and tries to delete them
immediately after.
If we don't have the operand types for the icmps, then we are
not behaving as assumed. And as shown in PR49475, we can inf-loop.
(cherry picked from commit
4224a36957420744756d6a6450eb6502a1bfadc3)
Sanjay Patel [Fri, 12 Mar 2021 19:15:27 +0000 (14:15 -0500)]
[InstCombine] add test for zext-of-icmps; NFC
PR49475 shows an infinite loop outcome, but this
tries to show the root cause with a minimal test.
(cherry picked from commit
579b8fc2e97c489308f97b01d13d894c03c0a16c)
Nikita Popov [Sun, 14 Mar 2021 15:47:41 +0000 (16:47 +0100)]
[X86][FastISel] Fix with.overflow eflags clobber (PR49587)
If the successor block has a phi node, then additional moves may
be inserted into predecessors, which may clobber eflags. Don't try
to fold the with.overflow result into the branch in that case.
This is done by explicitly checking for any phis in successor
blocks, not sure if there's some more principled way to address
this. Other fused compare and branch patterns avoid the issue by
emitting the comparison when handling the branch, so that no
instructions may be inserted in between. In this case, the
with.overflow call is emitted separately (and I don't think this
is avoidable, as it will generally have at least two users).
Fixes https://bugs.llvm.org/show_bug.cgi?id=49587.
Differential Revision: https://reviews.llvm.org/D98600
(cherry picked from commit
7669455df49e6fc8ae7d9f4bd4ee95bb20e7eb6e)
Nikita Popov [Sun, 14 Mar 2021 15:39:03 +0000 (16:39 +0100)]
[X86] Add test for PR49587 (NFC)
Shows a miscompile with FastISel.
(cherry picked from commit
0d814ca0f02733d6581bf209fadbebf3035380e0)
Nikita Popov [Sun, 7 Mar 2021 16:27:22 +0000 (17:27 +0100)]
[FastISel] Don't trivially kill extractvalues (PR49467)
All extractvalues of the same value at the same index will map to
the same register, so even if one specific extractvalue only has
one use, we should not mark it as a trivial kill, as there may be
more extractvalues later.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49467.
Differential Revision: https://reviews.llvm.org/D98145
(cherry picked from commit
55ae279ba7a5905f39ce3ae79eac7834a4a134cc)
Joseph Huber [Wed, 10 Mar 2021 18:25:33 +0000 (13:25 -0500)]
[OpenMP] Restore backwards compatibility for libomptarget
Summary:
The changes introduced in D87946 changed the API for libomptarget
functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x
but was not given a backward-compatible interface. This change will
require people using Clang 13.x or 12.x to recompile their offloading
programs.
Reviewed By: jdoerfert cchen
Differential Revision: https://reviews.llvm.org/D98358
(cherry picked from commit
807466ef28125cf7268c860b09d5563c9c93602a)
Nikita Popov [Wed, 10 Mar 2021 13:37:09 +0000 (14:37 +0100)]
[PowerPC] Fix infinite loop in peephole CR optimization (PR49509)
If we encounter a degenerate select node where both operands are
the same, then we can continue negating the condition while swapping
operands, resulting in an infinite loop. Avoid this by bailing out
if both operands are the same.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49509.
Differential Revision: https://reviews.llvm.org/D98340
(cherry picked from commit
2489cbaa8057c736475fd88990f4f6dbf022873d)
Shilei Tian [Thu, 18 Mar 2021 22:25:21 +0000 (18:25 -0400)]
[OpenMP] Fixed a crash in hidden helper thread
It is reported that after enabling hidden helper thread, the program
can hit the assertion `new_gtid < __kmp_threads_capacity` sometimes. The root
cause is explained as follows. Let's say the default `__kmp_threads_capacity` is
`N`. If hidden helper thread is enabled, `__kmp_threads_capacity` will be offset
to `N+8` by default. If the number of threads we need exceeds `N+8`, e.g. via
`num_threads` clause, we need to expand `__kmp_threads`. In
`__kmp_expand_threads`, the expansion starts from `__kmp_threads_capacity`, and
repeatedly doubling it until the new capacity meets the requirement. Let's
assume the new requirement is `Y`. If `Y` happens to meet the constraint
`(N+8)*2^X=Y` where `X` is the number of iterations, the new capacity is not
enough because we have 8 slots for hidden helper threads.
Here is an example.
```
#include <vector>
int main(int argc, char *argv[]) {
constexpr const size_t N = 1344;
std::vector<int> data(N);
#pragma omp parallel for
for (unsigned i = 0; i < N; ++i) {
data[i] = i;
}
#pragma omp parallel for num_threads(N)
for (unsigned i = 0; i < N; ++i) {
data[i] += i;
}
return 0;
}
```
My CPU is 20C40T, then `__kmp_threads_capacity` is 160. After offset,
`__kmp_threads_capacity` becomes 168. `1344 = (160+8)*2^3`, then the assertions
hit.
Reviewed By: protze.joachim
Differential Revision: https://reviews.llvm.org/D98838
(cherry picked from commit
2df65f87c1ea81008768e14522e5d9277234ba70)
Sanjay Patel [Fri, 12 Mar 2021 12:56:54 +0000 (07:56 -0500)]
[SimplifyCFG] avoid sinking insts within an infinite-loop
The test is reduced from a C source example in:
https://llvm.org/PR49541
It's possible that the test could be reduced further or
the predicate generalized further, but it seems to require
a few ingredients (including the "late" SimplifyCFG options
on the RUN line) to fall into the infinite-loop trap.
(cherry picked from commit
bd197ed0a57a82187ed3c6265ca811d412acfaef)
Alexandre Ganea [Wed, 24 Mar 2021 16:28:00 +0000 (12:28 -0400)]
[Support] Fix 'keeping' temporary files on Windows 7
As reported here: https://bugs.llvm.org/show_bug.cgi?id=48378#c0
and here: https://github.com/rust-lang/rust/issues/81051
since
79657e2339b58bc01fe1b85a448bb073d57d90bb, some programs such as llvm-ar
don't work properly on Windows 7.
The issue is shown in the snippet by Oleksandr Prodan:
https://pastebin.com/v51m3uBU
In essence, once the 'DeleteFile' flag has been set on FILE_DISPOSITION_INFO,
the file path can't be queried anymore with GetFinalPathNameByHandleW. This
however works on Windows 10, GetFinalPathNameByHandleW would return sucessfully.
To workaround the issue, we simply reset the 'DeleteFile' flag before even
checking if we're dealing with a network file.
Tested with `llvm-ar r empty.a a.obj` ran on a network mount. At the moment, we
cannot specifically add a test coverage for this, since it requres mounting a
network drive.
(cherry picked from commit
64ab2b6825c5aeae6e4afa7ef0829b89a6828102)
Maxim Kuvyrkov [Fri, 19 Mar 2021 13:37:19 +0000 (13:37 +0000)]
[WoA][MSVC] Use default linker setting in MSVC-compatible driver [take 2]
At the moment "link.exe" is hard-coded as default linker in MSVC.cpp,
so there's no way to use LLD as default linker for MSVC driver.
This patch adds checking of CLANG_DEFAULT_LINKER to MSVC.cpp and
updates unit-tests that expect link.exe linker to explicitly select it
via -fuse-ld=link, so that buildbots and other builds that set
-DCLANG_DEFAULT_LINKER=foobar don't fail these tests.
This is a squash of
- https://reviews.llvm.org/D98493 (MSVC.cpp change) and
- https://reviews.llvm.org/D98862 (unit-tests change)
Fixes https://bugs.llvm.org/show_bug.cgi?id=49624
Reviewed By: maxim-kuvyrkov
Differential Revision: https://reviews.llvm.org/D98935
(cherry-picked from commit
2049fe58903b68f66872a18e608f40e5233b55fb)
Maxim Kuvyrkov [Thu, 18 Mar 2021 16:08:58 +0000 (16:08 +0000)]
[aarch64][WOA64][docs] Release note for WoA-hosted LLVM 12 binary
Reviewed By: peterwaller-arm
Differential Revision: https://reviews.llvm.org/D98415
Anastasia Stulova [Tue, 16 Mar 2021 12:07:15 +0000 (12:07 +0000)]
[OpenCL][Docs] Release notes
Differential Revision: https://reviews.llvm.org/D98076
Amilendra Kodithuwakku [Fri, 12 Mar 2021 20:03:23 +0000 (20:03 +0000)]
[release][docs] List all cores Arm has added support for in LLVM 12.
Add new-line before sub-list for proper rendering.
Differential Revision: https://reviews.llvm.org/D98277
Amilendra Kodithuwakku [Fri, 12 Mar 2021 19:19:29 +0000 (19:19 +0000)]
[release][docs] List all cores Arm has added support for in LLVM 12.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D98277
Raul Tambre [Sat, 6 Mar 2021 09:45:57 +0000 (11:45 +0200)]
[CMake][compiler-rt] Use copying instead of symlinking for LSE builtins on non-Unix-likes
As reported in D93278 post-review symlinking requires privilege escalation on Windows.
Copying is functionally same, so fallback to it for systems that aren't Unix-like.
This is similar to the solution in AddLLVM.cmake.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D98111
(cherry picked from commit
ba860963b156db3b653c67ef044df877f3cea9cc)
Craig Topper [Fri, 5 Mar 2021 06:30:38 +0000 (22:30 -0800)]
[TargetLowering] Use HandleSDNodes to prevent nodes from being deleted by recursive calls in getNegatedExpression.
For binary or ternary ops we call getNegatedExpression multiple
times and then compare costs. While we're doing this we need to
hold a node from the first call across the second call, but its
not yet attached to the DAG. Its possible the second call creates
an identical node and then decides it didn't need it so will try
to delete it if it has no uses. This can cause a reference to the
node we're holding further up the call stack to become invalidated.
To prevent this, we can use a HandleSDNode to artifically give
the node a use without connecting it to the DAG.
I've used a std::list of HandleSDNodes so we can create handles
only when we have a node to hold. HandleSDNode does not have
default constructor and cannot be copied or moved.
Fixes PR49393.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D97914
(cherry picked from commit
74e6030bcbcc8e628f9a99a424342a0c656456f9)
Juneyoung Lee [Tue, 9 Feb 2021 05:06:17 +0000 (14:06 +0900)]
[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask
This patch fixes pr48832 by correctly generating the mask when a poison value is involved.
Consider this CFG (which is a part of the input):
```
for.body: ; preds = %for.cond
br i1 true, label %cond.false, label %land.rhs
land.rhs: ; preds = %for.body
br i1 poison, label %cond.end, label %cond.false
cond.false: ; preds = %for.body, %land.rhs
br label %cond.end
cond.end: ; preds = %land.rhs, %cond.false
%cond = phi i32 [ 0, %cond.false ], [ 1, %land.rhs ]
```
The path for.body -> land.rhs -> cond.end should be taken when 'select i1 false, i1 poison, i1 false' holds (which means it's never taken); but VPRecipeBuilder::createEdgeMask was emitting 'and i1 false, poison' instead.
The former one successfully blocks poison propagation whereas the latter one doesn't, making the condition poison and thus causing the miscompilation.
SimplifyCFG has a similar bug (which didn't expose a real-world bug yet), and a patch for this is also ongoing (see https://reviews.llvm.org/D95026).
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D95217
(cherry picked from commit
ed253ef77248d91a15b3a1aa36c0b74bed8ec8af)
Nathan James [Wed, 3 Mar 2021 16:01:12 +0000 (16:01 +0000)]
[clang-tidy] Deprecate readability-deleted-default check
... For removal in next release cycle.
The clang warning that does the same thing is enabled by default and typically emits better diagnostics making this check surplus to requirements.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97491
(cherry picked from commit
19aefd2d5dc3a8d3b8e81219973828170b7fcd2c)
LemonBoy [Fri, 5 Mar 2021 15:01:45 +0000 (16:01 +0100)]
[AArch64] Legalize horizontal fmax/fmin reductions on f16 vectors
Expand the horizontal reduction during the instruction selection phase, but only if the target doesn't support the full fp16 instruction set.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49401
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D97840
(cherry picked from commit
8725b24c6d4abaa97425e704652a13dacb35fe3f)
Shilei Tian [Wed, 24 Feb 2021 17:37:22 +0000 (12:37 -0500)]
[OpenMP] Fixed a crash when offloading to x86_64 with target nowait
PR#49334 reports a crash when offloading to x86_64 with `target nowait`,
which is caused by referencing a nullptr. The root cause of the issue is, when
pushing a hidden helper task in `__kmp_push_task`, it also maps the gtid to its
shadow gtid, which is wrong.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97329
(cherry picked from commit
e5da63d5a9ede1fb6d8aa18cfd44533ead128738)
Nikita Popov [Mon, 1 Mar 2021 20:37:26 +0000 (21:37 +0100)]
[GlobalISel] Bail on G_PHI narrowing of odd types (PR48188)
The current narrowing code for G_PHI can only handle the case
where the size is a multiple of the narrow size. If this is not
the case, fall back to SDAG instead of asserting.
Original patch by shepmaster.
Differential Revision: https://reviews.llvm.org/D92446
(cherry picked from commit
c35761db0f078f74550ef56bfc0745c162d76967)
Peyton, Jonathan L [Tue, 2 Mar 2021 13:44:15 +0000 (07:44 -0600)]
[OpenMP] Fix clang-cl build error regarding TSX intrinsics
Fix for https://bugs.llvm.org/show_bug.cgi?id=49339
The CMake check for the RTM intrinsics needs the -mrtm flag to be set
during the test. This way clang-cl correctly detects it has the
_xbegin() intrinsic. Otherwise, the CMake check fails.
Differential Revision: https://reviews.llvm.org/D97413
(cherry picked from commit
e83380fccc2cc9842bdcfd268efddf6fce90544d)
Kirstóf Umann [Fri, 5 Feb 2021 18:57:09 +0000 (19:57 +0100)]
[analyzer] Add 12.0.0 release notes
Differential Revision: https://reviews.llvm.org/D96163
Craig Topper [Sun, 28 Feb 2021 19:23:46 +0000 (11:23 -0800)]
[DAGCombiner][X86] Don't peek through ANDs on the shift amount in matchRotateSub when called from MatchFunnelPosNeg.
Peeking through AND is only valid if the input to both shifts is
the same. If the inputs are different, then the original pattern
ORs the two values when the masked shift amount is 0. This is ok
if the values are the same since the OR would be a NOP which is
why its ok for rotate.
Fixes PR49365 and reverts PR34641
Differential Revision: https://reviews.llvm.org/D97637
(cherry picked from commit
5de09ef02e24d234d9fc0cd1c6dfe18a1bb784b0)
Richard Smith [Mon, 1 Mar 2021 20:17:10 +0000 (12:17 -0800)]
Revert "[c++20] Mark class type NTTPs as done and start defining the feature test macro."
Some of the parts of this work were reverted; stop defining the feature
test macro for now.
This reverts commit
b4c63ef6dd90dba9af26a111c9a78b121c5284b1.
(cherry picked from commit
564f5b0734bd5d265a0046e5ca9d08ae5bc303eb)
Sanjay Patel [Sat, 27 Feb 2021 14:09:03 +0000 (09:09 -0500)]
[SimplifyCFG] avoid illegal phi with both poison and undef
In the example based on:
https://llvm.org/PR49218
...we are crashing because poison is a subclass of undef, so we merge blocks and create:
PHI node has multiple entries for the same basic block with different incoming values!
%k3 = phi i64 [ poison, %entry ], [ %k3, %g ], [ undef, %entry ]
If both poison and undef values are incoming, we soften the poison values to undef.
Differential Revision: https://reviews.llvm.org/D97495
(cherry picked from commit
356cdabd3a9e0ff919ea2c1a35c8706ecb915297)
Sanjay Patel [Sun, 28 Feb 2021 15:17:10 +0000 (10:17 -0500)]
[InstCombine] avoid infinite loop in demanded bits for select
https://llvm.org/PR49205
(cherry picked from commit
9502061bcc86982641772f45b7e7a0eb7437f054)
Shilei Tian [Tue, 23 Feb 2021 18:20:13 +0000 (13:20 -0500)]
[OpenMP][NVPTX] Fixed a compilation error in deviceRTLs caused by unsupported feature in release verion of LLVM
`ptx71` is not supported in release version of LLVM yet. As a result,
the support of CUDA 11.2 and CUDA 11.1 caused a compilation error as mentioned
in D97004. Since the support in D97004 is just a WA for releease, and we'll not
use it in the near future, using `ptx70` for CUDA 11 is feasible.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97195
(cherry picked from commit
f6c2984a090e78947f75e096d43b476bf2ae73eb)
Pavel Iliin [Thu, 25 Feb 2021 22:51:09 +0000 (22:51 +0000)]
[AArch64][Docs] Release notes 12.x on outline atomics
Description for AArch64 -moutline-atomics, -mno-outline-atomics
options added to release notes.
Differential Revision: https://reviews.llvm.org/D97510
Fangrui Song [Wed, 24 Feb 2021 17:19:23 +0000 (09:19 -0800)]
ReleaseNotes: add lld/ELF notes
Differential Revision: https://reviews.llvm.org/D97113
Louis Dionne [Thu, 28 Jan 2021 15:46:22 +0000 (10:46 -0500)]
[libc++] Fix extern-templates.sh.cpp test on Linux
(cherry picked from commit
bf5941afcda3ac6570ba25165758869287491e0d)
Louis Dionne [Wed, 27 Jan 2021 18:08:24 +0000 (13:08 -0500)]
[libc++] Fix extern template test failing on Windows
See https://reviews.llvm.org/D94718#
2521489 for details.
(cherry picked from commit
90407b16b1d3e38f1360b6a24ceab801ab9cefc1)
Nikita Popov [Tue, 23 Feb 2021 23:57:13 +0000 (15:57 -0800)]
Andy Kaylor [Thu, 4 Feb 2021 02:16:04 +0000 (18:16 -0800)]
Add auto-upgrade support for annotation intrinsics
The llvm.ptr.annotation and llvm.var.annotation intrinsics were changed
since the 11.0 release to add an additional parameter. This patch
auto-upgrades IR containing the four-parameter versions of these
intrinsics, adding a null pointer as the fifth argument.
Differential Revision: https://reviews.llvm.org/D95993
(cherry picked from commit
9a827906cb95e7c3ae94627558da67b47ffde249)
Tom Stellard [Tue, 23 Feb 2021 19:12:50 +0000 (11:12 -0800)]
[12.0.0][llvm-symbolizer][test] Fix test broken after cherry-pick
See bug https://bugs.llvm.org/show_bug.cgi?id=49227. The cherry-pick
0d4f8a3f364f introduced a test failure, as the test included use of a feature that was only recently added to lit and isn't in the release branch. This patch fixes up the test to manage without this lit change.
Reviewed By: tstellar, MaskRay
Differential Revision: https://reviews.llvm.org/D97272
Kadir Cetinkaya [Thu, 18 Feb 2021 12:48:43 +0000 (13:48 +0100)]
[clang][CodeComplete] Ensure there are no crashes when completing with ParenListExprs as LHS
Differential Revision: https://reviews.llvm.org/D96950
Kadir Cetinkaya [Wed, 3 Feb 2021 11:45:46 +0000 (12:45 +0100)]
[clang][CodeComplete] Fix crash on ParenListExprs
Fixes https://github.com/clangd/clangd/issues/676.
Differential Revision: https://reviews.llvm.org/D95935
Tom Stellard [Tue, 23 Feb 2021 01:35:09 +0000 (17:35 -0800)]
Revert "[llvm-cov] reset executation count to 0 after wrapped segment"
This reverts commit
e3df9471750935876bd2bf7da93ccf0eacca8592.
This commit caused regressions in coverage generation for both Rust and
Swift. We're reverting this in the release/12.x branch until we have
a proper fix in trunk.
http://llvm.org/PR49297
Nikita Popov [Sun, 21 Feb 2021 15:24:18 +0000 (16:24 +0100)]
[JumpThreading] Clone noalias.scope.decl when threading blocks
When cloning instructions during jump threading, also clone and
adapt any declared scopes. This is primarily important when
threading loop exits, because we'll end up with two dominating
scope declarations in that case (at least after additional loop
rotation). This addresses a loose thread from
https://reviews.llvm.org/rG2556b413a7b8#975012.
Differential Revision: https://reviews.llvm.org/D97154
(cherry picked from commit
5e7e499b912d2c9ebaa91b5783ca123dbedeabcc)
Tom Stellard [Tue, 23 Feb 2021 00:27:19 +0000 (16:27 -0800)]
clang-tidy: Disable cppcoreguidlines-prefer-member-initializer check
Fixes https://llvm.org/PR49318
Sam McCall [Mon, 22 Feb 2021 21:05:26 +0000 (22:05 +0100)]
[clangd] Release notes for 12.x
Kadir Cetinkaya [Tue, 16 Feb 2021 19:57:00 +0000 (20:57 +0100)]
Kadir Cetinkaya [Mon, 15 Feb 2021 08:00:49 +0000 (09:00 +0100)]
[clangd] Treat paths case-insensitively depending on the platform
Path{Match,Exclude} and MountPoint were checking paths case-sensitively
on all platforms, as with other features, this was causing problems on
windows. Since users can have capital drive letters on config files, but
editors might lower-case them.
This patch addresses that issue by:
- Creating regexes with case-insensitive matching on those platforms.
- Introducing a new pathIsAncestor helper, which performs checks in a
case-correct manner where needed.
Differential Revision: https://reviews.llvm.org/D96690
(cherry picked from commit
ecea7218fb9b994b26471e9877851cdb51a5f1d4)
Sam McCall [Sun, 31 Jan 2021 12:53:22 +0000 (13:53 +0100)]
[clangd] Rename: merge index/AST refs path-insensitively where needed
If you have c:\foo open, and C:\foo indexed (case difference) then these
need to be considered the same file. Otherwise we emit edits to both,
and editors do... something that isn't pretty.
Maybe more centralized normalization is called for, but it's not trivial
to do this while also being case-preserving. see
https://github.com/clangd/clangd/issues/108
Fixes https://github.com/clangd/clangd/issues/665
Differential Revision: https://reviews.llvm.org/D95759
(cherry picked from commit
b63cd4db915c08e0cb4cf668a18de24b67f2c44c)
Conrad Poelman [Tue, 2 Feb 2021 04:59:38 +0000 (05:59 +0100)]
clang-extra: fix incorrect use of std::lock_guard by adding variable name (identified by MSVC [[nodiscard]] error)
`std::lock_guard` is an RAII class that needs a variable name whose scope determines the guard's lifetime. This particular usage lacked a variable name, meaning the guard could be destroyed before the line that it was indented to protect.
This line was identified by building clang with the latest MSVC preview release, which declares the std::lock_guard constructor to be `[[nodiscard]]` to draw attention to such issues.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D95725
(cherry picked from commit
0b70c86e2007d3f32968f0a7d9efe8eab3bf0f0a)
Brad Smith [Sun, 21 Feb 2021 01:43:16 +0000 (20:43 -0500)]
[clang][Driver][OpenBSD] libcxx also requires pthread
(cherry picked from commit
b42d57a100c5df6ace68f686f5adaabeafe8a0f6)
Sanjay Patel [Fri, 19 Feb 2021 14:06:05 +0000 (09:06 -0500)]
[Analysis][LoopVectorize] do not form reductions of pointers
This is a fix for https://llvm.org/PR49215 either before/after
we make a verifier enhancement for vector reductions with D96904.
I'm not sure what the current thinking is for pointer math/logic
in IR. We allow icmp on pointer values. Therefore, we match min/max
patterns, so without this patch, the vectorizer could form a vector
reduction from that sequence.
But the LangRef definitions for min/max and vector reduction
intrinsics do not allow pointer types:
https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic
So we would crash/assert at some point - either in IR verification,
in the cost model, or in codegen. If we do want to allow this kind
of transform, we will need to update the LangRef and all of those
parts of the compiler.
Differential Revision: https://reviews.llvm.org/D97047
(cherry picked from commit
5b250a27ec7822aa0a32abb696cb16c2cc60149c)
Johannes Doerfert [Sun, 14 Feb 2021 18:25:56 +0000 (12:25 -0600)]
Avoid use of stack allocations in asynchronous calls
NOTE: This is an adaption of the original patch to be applicable to the
LLVM 12 release branch. Logic is the same though.
As reported by Guilherme Valarini [0], we used to pass stack allocations
to calls that can nowadays be asynchronous. This is arguably a problem
and it will inevitably result in UB. To remedy the situation we allocate
the locations as part of the AsyncInfoTy object. The lifetime of that
object matches what we need for now. If the synchronization is not tied
to the AsyncInfoTy object anymore we might need to have a different
buffer construct in global space.
This should be back-ported to LLVM 12 but needs slight modifications as
it is based on refactoring patches we do not need to backport.
[0] https://lists.llvm.org/pipermail/openmp-dev/2021-February/003867.html
Differential Revision: https://reviews.llvm.org/D96667
Fangrui Song [Thu, 4 Feb 2021 17:07:44 +0000 (09:07 -0800)]
[llvm-objdump] --source: drop the warning when there is no debug info
Warnings have been added for three cases (PR41905): (1) missing debug info, (2)
the source file cannot be found, (3) the debug info points at a line beyond the
end of the file.
(1) is probably less useful. This was brought up once on
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141264.html and two
internal users mentioned it to me that it was annoying. (I personally
find the warning confusing, too.)
Users specify --source to get additional information if sources happen to be
available. If sources are not available, it should be obvious as the output
will have no interleaved source lines. The warning can be especially annoying
when using llvm-objdump -S on a bunch of files.
This patch drops the warning when there is no debug info.
(If LLVMSymbolizer::symbolizeCode returns an `Error`, there will still be
an error. There is currently no test for an `Error` return value.
The only code path is probably a broken symbol table, but we probably already emit a warning
in that case)
`source-interleave-prefix.test` has an inappropriate "malformed" test - the test simply has no
.debug_* because new llc does not produce debug info when the filename is empty (invalid).
I have tried tampering the header of .debug_info/.debug_line but llvm-symbolizer does not warn.
This patch does not intend to add the missing test coverage.
Differential Revision: https://reviews.llvm.org/D88715
(cherry picked from commit
eecbb1c77655d38c06e47cf32e2dcc72da45c517)
William S. Moses [Wed, 17 Feb 2021 18:54:42 +0000 (13:54 -0500)]
[SROA] Amend failing test from D95826
(cherry picked from commit
892d2822b62ebcaa7aa0b006b5ea4f26593c1618)
Florian Hahn [Sat, 20 Feb 2021 05:48:46 +0000 (21:48 -0800)]
[clang] Add -ffinite-loops & -fno-finite-loops options.
This cherry-picks the following patches on the release branch:
6280bb4cd80e [clang] Remove redundant condition (NFC).
51bf4c0e6d4c [clang] Add -ffinite-loops & -fno-finite-loops options.
fb4d8fe80701 [clang] Update mustprogress tests
This patch adds 2 new options to control when Clang adds `mustprogress`:
1. -ffinite-loops: assume all loops are finite; mustprogress is added
to all loops, regardless of the selected language standard.
2. -fno-finite-loops: assume no loop is finite; mustprogress is not
added to any loop or function. We could add mustprogress to
functions without loops, but we would have to detect that in Clang,
which is probably not worth it.
Differential Revision: https://reviews.llvm.org/D96850
wlei [Wed, 10 Feb 2021 18:04:39 +0000 (10:04 -0800)]
[CSSPGO][llvm-profgen] Filter out the instructions without location info for symbolizer
It appears some instructions doesn't have the debug location info and the symbolizer will return an empty call stack for them which will cause some crash later in profile unwinding. Actually we do not record the sample info for them, so this change just filter out those instruction.
As those instruction would appears at the begin and end of the instruction list, without them we need to add the boundary check for IP `advance` and `backward`.
Also for pseudo probe based profile, we actually don't need the symbolized location info, so here just change to use an empty stack for it. This could save half of the binary loading time.
Differential Revision: https://reviews.llvm.org/D96434
wlei [Wed, 10 Feb 2021 00:41:44 +0000 (16:41 -0800)]
[CSSPGO][llvm-profgen] Renovate perfscript check and command line input validation
This include some changes related with PerfReader's the input check and command line change:
1) It appears there might be thousands of leading MMAP-Event line in the perfscript for large workload. For this case, the 4k threshold is not eligible to determine it's a hybrid sample. This change renovated the `isHybridPerfScript` by going through the script without threshold limitation checking whether there is a non-empty call stack immediately followed by a LBR sample. It will stop once it find a valid one.
2) Added several input validations for the command line switches in PerfReader.
3) Changed the command line `show-disassembly` to `show-disassembly-only`, it will print to stdout and exit early which leave an empty output profile.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D96387
wlei [Wed, 3 Feb 2021 22:13:06 +0000 (14:13 -0800)]
[CSSPGO][llvm-profgen] Add brackets for context id to support extended binary format
To align with https://reviews.llvm.org/D95547, we need to add brackets for context id before initializing the `SampleContext`.
Also added test cases for extended binary format from llvm-profgen side.
Differential Revision: https://reviews.llvm.org/D95929
Hongtao Yu [Thu, 11 Feb 2021 22:51:47 +0000 (14:51 -0800)]
Remove test code that cause MSAN failure.
Summary:
The negative test (with the feature being added disabled) caused MSAN failure and that's the added feature is supposed to fix. Therefore the negative test code is being removed.
Hongtao Yu [Tue, 9 Feb 2021 17:17:20 +0000 (09:17 -0800)]
[CSSPGO] Process functions in a top-down order on a dynamic call graph.
Functions are currently processed by the sample profiler loader in a top-down order defined by the static call graph. The order is being adjusted to be a top-down order based on the input context-sensitive profile. One benefit is that the processing order of caller and callee in one SCC would follow the context order in the profile to favor more inlining. Another benefit is that the processing order of caller and callee through an indirect call (which is not on the static call graph) can be honored which in turn allows for more inlining.
The profile top-down order for SCC is also extended to support non-CS profiles.
Two switches `-mllvm -use-profile-indirect-call-edges` and `-mllvm -use-profile-top-down-order` are being introduced.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D95988
Hongtao Yu [Wed, 10 Feb 2021 22:40:47 +0000 (14:40 -0800)]
[CSSPGO] Restrict pseudo probe tests to x86_64 only.
Hongtao Yu [Mon, 8 Feb 2021 06:49:20 +0000 (22:49 -0800)]
[CSSPGO] Unblock optimizations with pseudo probe instrumentation.
The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking are intentional (such as blocking code merge) for better counts quality while the others are accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include:
1. IR InstCombine, sinking load operation to shorten lifetimes.
2. MIR LiveRangeShrink, similar to #1
3. MIR TwoAddressInstructionPass, i.e, opeq transform
4. MIR function argument copy elision
5. IR stack protection. (though not perf-critical but nice to have).
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D95982
Wenlei He [Wed, 3 Feb 2021 21:27:35 +0000 (13:27 -0800)]
[CSSPGO] Use merged base profile for hot threshold calculation
Context-sensitive profile effectively split a function profile into many copies each representing the CFG profile of a particular calling context. That makes the count distribution looks more flat as we now have more function profiles each with lower counts, which in turn leads to lower hot thresholds. Now we tells threshold computation to merge context profile first before calculating percentile based cutoffs to compensate for seemingly flat context profile. This can be controlled by swtich `sample-profile-contextless-threshold`.
Earlier measurement showed ~0.4% perf boost with this tuning on spec2k6 for CSSPGO (with pseudo-probe and new inliner).
Differential Revision: https://reviews.llvm.org/D95980
wlei [Thu, 21 Jan 2021 01:54:03 +0000 (17:54 -0800)]
[CSSPGO][llvm-profgen] Fix bug with parsing hybrid sample trace line
when we skip the call stack starting with an external address, we should also skip the bottom LBR entry, otherwise it will cause a truncated context issue.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D95480
wlei [Thu, 21 Jan 2021 17:36:32 +0000 (09:36 -0800)]
[CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size
This change allows merging and trimming cold context profile in llvm-profgen to solve profile size bloat problem. Currently when the profile's total sample is below threshold(supported by a switch), it will be considered cold and merged into a base context-less profile, which will at least keep the profile quality as good as the baseline(non-cs).
For example, two input profiles:
[main @ foo @ bar]:60
[main @ bar]:50
Under threshold = 100, the two profiles will be merge into one with the base context, get result:
[bar]:110
Added two switches:
`--csprof-cold-thres=<value>`: Specified the total samples threshold for a context profile to be considered cold, with 100 being the default. Any cold context profiles will be merged into context-less base profile by default.
`--csprof-keep-cold`: Force profile generation to keep cold context profiles instead of dropping them. By default, any cold context will not be written to output profile.
Results:
Though not yet evaluating it with the latest CSSPGO, our internal branch shows neutral on performance but significantly reduce the profile size. Detailed evaluation on llvm-profgen with CSSPGO will come later.
Differential Revision: https://reviews.llvm.org/D94111
wlei [Mon, 11 Jan 2021 20:47:22 +0000 (12:47 -0800)]
[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation
For CS profile generation, the process of call stack unwinding is time-consuming since for each LBR entry we need linear time to generate the context( hash, compression, string concatenation). This change speeds up this by grouping all the call frame within one LBR sample into a trie and aggregating the result(sample counter) on it, deferring the context compression and string generation to the end of unwinding.
Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates(pop or push a trie node) it dynamically during virtual unwinding so that the raw sample can just be recoded on the leaf node, the path(root to leaf) will represent its calling context. In the end, it traverses the trie and generates the context on the fly.
Results:
Our internal branch shows about 5X speed-up on some large workloads in SPEC06 benchmark.
Differential Revision: https://reviews.llvm.org/D94110
wlei [Fri, 29 Jan 2021 23:00:08 +0000 (15:00 -0800)]
[CSSPGO][llvm-profgen] Compress recursive cycles in calling context
This change compresses the context string by removing cycles due to recursive function for CS profile generation. Removing recursion cycles is a way to normalize the calling context which will be better for the sample aggregation and also make the context promoting deterministic.
Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicated them through multiple round of iteration.
For example:
Considering a input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For first iteration,, it removed all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For second iteration, it removed all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
So in the end, we get compressed output:
[“a”, “b”, “c”, “d”]
Compression will be called in two place: one for sample's context key right after unwinding, one is for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the size of duplicated frames, default -1 means no size limit.
Added unit tests and regression test for this.
Differential Revision: https://reviews.llvm.org/D93556
wlei [Mon, 11 Jan 2021 17:08:39 +0000 (09:08 -0800)]
[CSSPGO][llvm-profgen] Pseudo probe based CS profile generation
This change implements profile generation infra for pseudo probe in llvm-profgen. During virtual unwinding, the raw profile is extracted into range counter and branch counter and aggregated to sample counter map indexed by the call stack context. This change introduces the last step and produces the eventual profile. Specifically, the body of function sample is recorded by going through each probe among the range and callsite target sample is recorded by extracting the callsite probe from branch's source.
Please refer https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**Implementation**
- Extended `PseudoProbeProfileGenerator` for pseudo probe based profile generation.
- `populateBodySamplesWithProbes` reading range counter is responsible for recording function body samples and inferring caller's body samples.
- `populateBoundarySamplesWithProbes` reading branch counter is responsible for recording call site target samples.
- Each sample is recorded with its calling context(named `ContextId`). Remind that the probe based context key doesn't include the leaf frame probe info, so the `ContextId` string is created from two part: one from the probe stack strings' concatenation and other one from the leaf frame probe.
- Added regression test
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92998
Yang Fan [Wed, 3 Feb 2021 03:04:58 +0000 (11:04 +0800)]
[CSSPGO] Fix MSVC initializing truncation warning (NFC)
MSVC warning:
```
\llvm-project\llvm\include\llvm\Transforms\IPO\SampleProfileProbe.h(65): warning C4305: 'initializing': truncation from 'double' to 'const float'
```
William S. Moses [Mon, 1 Feb 2021 23:16:17 +0000 (18:16 -0500)]
[SROA] Propagate correct TBAA/TBAA Struct offsets
SROA does not correctly account for offsets in TBAA/TBAA struct metadata.
This patch creates functionality for generating new MD with the corresponding
offset and updates SROA to use this functionality.
Differential Revision: https://reviews.llvm.org/D95826
(cherry picked from commit
40862b1a7486a969ff044cd240aad24f4183cc10)
Georgii Rymar [Thu, 28 Jan 2021 13:35:18 +0000 (16:35 +0300)]
[llvm-symbolizer] - Fix the crash in GNU output style with --no-inlines and missing input file.
Fixes https://bugs.llvm.org/show_bug.cgi?id=48882.
If the input file does not exist (or has a reading error), the
following code will crash if there are two or more input addresses.
```
auto ResOrErr = Symbolizer.symbolizeInlinedCode(
ModuleName, {Offset, object::SectionedAddress::UndefSection});
Printer << (error(ResOrErr) ? DILineInfo() : ResOrErr.get().getFrame(0));
```
For the first address, `symbolizeInlinedCode` returns an error.
For the second address, `symbolizeInlinedCode` returns an empty result
(not an error) and `.getFrame(0)` will crash.
Differential revision: https://reviews.llvm.org/D95609
(cherry picked from commit
d22140687500f90830fe416d9c1e317f7c4535d5)
Simonas Kazlauskas [Tue, 16 Feb 2021 21:35:32 +0000 (13:35 -0800)]
[llvm-dwp] Join dwo paths correctly when DWOPath is absolute
When the `DWOPath` is absolute, we want to use `DWOPath` as is, without prepending any other
components to the path. The `sys::path::append` does not join, but rather unconditionally appends
the paths, so something like `sys::path::append("/tmp", "/tmp/banana")` will result in
`/tmp/tmp/banana` rather than the desired `/tmp/banana`.
This then causes `llvm-dwp` to fail in a following situation:
```
$ clang -gsplit-dwarf /tmp/banana/test.c -c -o /tmp/outdir/foo.o
$ clang outdir/foo.o -o outdir/hm
$ llvm-dwarfdump outdir/hm | grep -C2 foo.dwo
DW_AT_comp_dir ("/tmp")
DW_AT_GNU_pubnames (true)
DW_AT_GNU_dwo_name ("/tmp/outdir/foo.dwo")
DW_AT_GNU_dwo_id (0xde4d396f3bf0e257)
DW_AT_low_pc (0x0000000000401100)
$ strace -o trace llvm-dwp -e outdir/hm -o outdir/hm.dwp
error: No such file or directory
$ cat trace | grep foo.dwo
openat(AT_FDCWD, "/tmp/tmp/outdir/foo.dwo", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
```
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D96678
(cherry picked from commit
6ffcb2937c96bd0d7a55b984b5eb8f381b68e322)