OSDN Git Service
Tom Stellard [Fri, 22 Jun 2018 03:04:35 +0000 (03:04 +0000)]
AMDGPU/GlobalISel: Default to using TableGen'd instruction selector
Summary:
We can select all instructions that are marked as legal in a full piglit run,
so now is a good time to make the TableGen'd instruction selector default
for all opcodes. This is NFC for a full piglit run, which is why there are
no tests.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48198
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335319 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Fri, 22 Jun 2018 02:54:57 +0000 (02:54 +0000)]
AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D48196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335318 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 22 Jun 2018 02:43:41 +0000 (02:43 +0000)]
[LegacyPM] Fix PR37888 by teaching the legacy loop pass manager how to
clear out deleted loops from the current queue beyond just the current
loop.
This is important because SimpleLoopUnswitch will now enqueue the same
loop to be re-processed. When it does this with the legacy PM, we don't
have a way of canceling the rest of the pipeline and so we can end up
deleting the loop before we reprocess it. =/
This change also makes it easy to support deleting other loops in the
queue to process, although I don't have any use cases for that.
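As a rough illustration of the queue handling described above, here is a minimal Python sketch. The LoopQueue class and its method names are hypothetical and are not the legacy pass manager's API; the point is only that deleting a loop removes every pending occurrence of it, not just the one currently being processed.

```python
from collections import deque


class LoopQueue:
    """Toy stand-in for a loop pass manager's work queue (hypothetical)."""

    def __init__(self, loops):
        self.queue = deque(loops)
        self.current = None

    def pop_next(self):
        self.current = self.queue.popleft()
        return self.current

    def requeue(self, loop):
        # A pass such as an unswitcher asks for a loop to be processed again.
        self.queue.append(loop)

    def mark_deleted(self, loop):
        # Drop every pending occurrence of the loop, not just the current
        # one, so a later iteration never touches a dead loop.
        self.queue = deque(l for l in self.queue if l != loop)
        if self.current == loop:
            self.current = None


q = LoopQueue(["outer", "inner"])
q.pop_next()             # start processing "outer"
q.requeue("outer")       # the same loop is enqueued for reprocessing
q.mark_deleted("outer")  # a later pass in the pipeline deletes it
remaining = list(q.queue)
```

With only the current loop cleared, the re-enqueued "outer" entry would still be in the queue and would be visited after deletion.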
Differential Revision: https://reviews.llvm.org/D48470
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335317 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Fri, 22 Jun 2018 02:34:29 +0000 (02:34 +0000)]
AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48195
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335316 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Fri, 22 Jun 2018 00:44:29 +0000 (00:44 +0000)]
AMDGPU/GlobalISel: Implement select() for COPY
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D46151
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335315 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 22 Jun 2018 00:32:26 +0000 (00:32 +0000)]
Fix test failures after r335306 due to the pipeline changing.
This wasn't obvious for the author to fix because this is the first
pipeline use of the magic utility to get function analyses within
a module pass in the legacy pass manager. Turns out that has a bug which
prevents dumping the structure of the pipeline and shows up as an
unnamed pass.
I've just left a FIXME for that as it doesn't seem likely worth fixing
and certainly shouldn't hold up getting the bots green.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335314 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 23:56:59 +0000 (23:56 +0000)]
[InstCombine] fix shuffle-of-binops bug
With non-commutative binops, we could be using the same
variable value as operand 0 in 1 binop and operand 1 in
the other, so we have to check for that possibility and
bail out.
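The check can be pictured with a small Python sketch. The tuple encoding of a binop and the helper name are assumptions made for illustration, not InstCombine code: a fold to a single binop is only plausible when the shared variable sits in the same operand slot of both binops.

```python
def common_variable_position(b0, b1):
    """Each binop is (lhs, rhs); variables are strings, constants are ints.

    Return 0 or 1 if both binops use the same variable in the same operand
    slot, else None (the caller should bail out).
    """
    for pos in (0, 1):
        if isinstance(b0[pos], str) and b0[pos] == b1[pos]:
            return pos
    return None


# "sub x, 1" and "sub x, 2": x is operand 0 in both, so a fold may proceed.
same_slot = common_variable_position(("x", 1), ("x", 2))

# "sub x, 1" and "sub 2, x": x changes sides, so merging the constants into
# one "sub x, <1, 2>" would compute x - 2 where 2 - x was intended.
mixed_slot = common_variable_position(("x", 1), (2, "x"))

x = 10
intended = [x - 1, 2 - x]
wrongly_folded = [x - 1, x - 2]
```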
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335312 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 23:53:01 +0000 (23:53 +0000)]
[InstCombine] add test for shuffle-of-binops; NFC
This shows a miscompile that was missed in rL335283.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335311 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Thu, 21 Jun 2018 23:38:20 +0000 (23:38 +0000)]
AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEF
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D46150
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335307 91177308-0d34-0410-b5e6-96231b3b80d8
Michael J. Spencer [Thu, 21 Jun 2018 23:31:10 +0000 (23:31 +0000)]
[Instrumentation] Add Call Graph Profile pass
This patch adds support for generating a call graph profile from Branch Frequency Info.
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
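The edge collection described above can be sketched in a few lines of Python. The input layout (a dict of per-block counts and callees) is invented for illustration, and summing the counts of repeated caller/callee pairs is an assumption rather than something the message states.

```python
from collections import defaultdict


def build_cg_profile(functions):
    """functions maps a function name to a list of (block_count, callees)."""
    edges = defaultdict(int)
    for caller, blocks in functions.items():
        for count, callees in blocks:
            for callee in callees:
                # Each call contributes the profile count of its block.
                edges[(caller, callee)] += count
    return dict(edges)


# Block counts chosen to reproduce the three edges in the metadata example.
module = {
    "a": [(32, ["b"])],
    "freq": [(11, ["a"]), (20, ["b"])],
    "b": [(52, [])],
}
profile = build_cg_profile(module)
```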
Differential Revision: https://reviews.llvm.org/D48105
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335306 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Thu, 21 Jun 2018 23:06:33 +0000 (23:06 +0000)]
[X86] Fix 32-bit mingw comdat names, only add one underscore
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335304 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Thu, 21 Jun 2018 22:34:29 +0000 (22:34 +0000)]
[gdb] Update llvm::Optional
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48461
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335303 91177308-0d34-0410-b5e6-96231b3b80d8
Scott Linder [Thu, 21 Jun 2018 22:30:09 +0000 (22:30 +0000)]
[AMDGPU] Fix lit failures introduced in r335281
The tests do not support big-endian hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335302 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 22:25:42 +0000 (22:25 +0000)]
[IR] fix typo in comment; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335301 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Thu, 21 Jun 2018 22:19:05 +0000 (22:19 +0000)]
Revert r335297 "[X86] Implement more of x86-64 large and medium PIC code models"
MCJIT can't handle R_X86_64_GOT64 yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335300 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Thu, 21 Jun 2018 21:57:44 +0000 (21:57 +0000)]
[X86] Commit some comments that weren't in the medium code model patch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335298 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Thu, 21 Jun 2018 21:55:08 +0000 (21:55 +0000)]
[X86] Implement more of x86-64 large and medium PIC code models
Summary:
The large code model allows code and data segments to exceed 2GB, which
means that some symbol references may require a displacement that cannot
be encoded as a displacement from RIP. The large PIC model even relaxes
the assumption that the GOT itself is within 2GB of all code. Therefore,
we need a special code sequence to materialize it:
.LtmpN:
leaq .LtmpN(%rip), %rbx
movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch
addq %rax, %rbx # GOT base reg
From that, non-local references go through the GOT base register instead
of being PC-relative loads. Local references typically use GOTOFF
symbols, like this:
movq extern_gv@GOT(%rbx), %rax
movq local_gv@GOTOFF(%rbx), %rax
All calls end up being indirect:
movabsq $local_fn@GOTOFF, %rax
addq %rbx, %rax
callq *%rax
The medium code model retains the assumption that the code segment is
less than 2GB, so calls are once again direct, and the RIP-relative
loads can be used to access the GOT. Materializing the GOT is easy:
leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg
DSO local data accesses will use it:
movq local_gv@GOTOFF(%rbx), %rax
Non-local data accesses will use RIP-relative addressing, which means we
may not always need to materialize the GOT base:
movq extern_gv@GOTPCREL(%rip), %rax
Direct calls are basically the same as they are in the small code model:
They use direct, PC-relative addressing, and the PLT is used for calls
to non-local functions.
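The data-access rules above can be summarized as a small decision function. This Python sketch only restates the operand forms quoted in this message; the function names are hypothetical, and the register %rbx is simply the GOT base register used in the examples.

```python
def data_operand(model, sym, dso_local):
    """Return the memory operand used to load `sym` under a PIC code model."""
    if model == "large":
        # Nothing is assumed to be within 2GB of RIP, so every access goes
        # through the GOT base register.
        return f"{sym}@GOTOFF(%rbx)" if dso_local else f"{sym}@GOT(%rbx)"
    if model == "medium":
        # The GOT is reachable RIP-relative, so non-local data skips %rbx.
        return f"{sym}@GOTOFF(%rbx)" if dso_local else f"{sym}@GOTPCREL(%rip)"
    raise ValueError(f"unhandled code model: {model}")


def needs_got_base(model, accesses):
    """True if any (sym, dso_local) data access needs the GOT base register."""
    return any("(%rbx)" in data_operand(model, s, local) for s, local in accesses)


large_extern = data_operand("large", "extern_gv", dso_local=False)
medium_extern = data_operand("medium", "extern_gv", dso_local=False)
medium_local = data_operand("medium", "local_gv", dso_local=True)
```

A medium-model function that only touches non-local data never needs the GOT base, which is the "may not always need to materialize" case mentioned above.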
This patch adds reasonably comprehensive testing of LEA, but there are
lots of interesting folding opportunities that are unimplemented.
Reviewers: chandlerc, echristo
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D47211
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335297 91177308-0d34-0410-b5e6-96231b3b80d8
Matthew Voss [Thu, 21 Jun 2018 21:43:20 +0000 (21:43 +0000)]
[GVN] Avoid casting a vector of size less than 8 bits to i8
Summary:
A reprise of D25849.
This crash was found through fuzzing some time ago and was documented in PR28879.
No check for load size has been added due to the following tests:
- Transforms/GVN/invariant.group.ll
- Transforms/GVN/pr10820.ll
These tests expect load sizes that are not a multiple of eight.
Thanks to @davide for the original patch.
Reviewers: nlopes, davide, RKSimon, reames, efriedma
Reviewed By: efriedma
Subscribers: davide, llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D48330
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335294 91177308-0d34-0410-b5e6-96231b3b80d8
Jonas Devlieghere [Thu, 21 Jun 2018 21:37:53 +0000 (21:37 +0000)]
[dsymutil] Force mmap'ing of binaries
After the recent refactoring that introduced parallel handling of
different objects, the binary holder became unique per object file. This
defeats its optimization of caching archives, leading to an archive
being opened for every binary it contains. This is obviously unfortunate
and will need to be refactored soon.
Luckily in practice, the impact of this is limited as most files are
mmap'ed instead of memcopy'd. There's a caveat however: when the memory
buffer requires a null terminator and it's a multiple of the page size,
we allocate instead of mmap'ing. If this happens for a static archive,
we end up with N copies of it in memory, where N is the number of
objects in the archive, leading to exorbitant memory usage. This patch provides
a stopgap solution: it ensures that all the files it loads are mmap'ed in
memory by removing the requirement for a terminating null byte.
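The caveat can be made concrete with a deliberately simplified Python model. The function names are hypothetical and the real buffer-loading logic has more conditions than this; the sketch covers only the page-size case described above.

```python
def can_mmap(file_size, page_size, requires_null_terminator):
    """Simplified model of the mmap-versus-allocate decision."""
    # A file that exactly fills its last page has no zero byte after the
    # data, so a required terminator forces a copy into allocated memory.
    if requires_null_terminator and file_size % page_size == 0:
        return False
    return True


def copied_bytes(archive_size, members, page_size, requires_null_terminator):
    """Bytes copied into memory when the archive is opened once per member."""
    if can_mmap(archive_size, page_size, requires_null_terminator):
        return 0
    return archive_size * members


with_terminator = copied_bytes(8192, 3, 4096, requires_null_terminator=True)
without_terminator = copied_bytes(8192, 3, 4096, requires_null_terminator=False)
```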
Differential revision: https://reviews.llvm.org/D48397
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335293 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Shen [Thu, 21 Jun 2018 21:29:54 +0000 (21:29 +0000)]
[SCEV] Re-apply r335197 (with Polly fixes).
Summary:
This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338).
I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not an informed one; it just updates the patterns to match the output.
All LLVM files are already reviewed in D48338.
Reviewers: jdoerfert, bollu, efriedma
Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia
Differential Revision: https://reviews.llvm.org/D48453
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335292 91177308-0d34-0410-b5e6-96231b3b80d8
Konstantin Zhuravlyov [Thu, 21 Jun 2018 20:28:19 +0000 (20:28 +0000)]
AMDGPU: Remove ability to reserve VGPRs for debugger
Differential Revision: https://reviews.llvm.org/D48234
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335288 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Thu, 21 Jun 2018 20:27:38 +0000 (20:27 +0000)]
[mingw] Fix GCC ABI compatibility for comdat things
Summary:
GCC and the binutils COFF linker do comdats differently from MSVC.
If we want to be ABI compatible, we have to do what they do, which is to
emit unique section names like ".text$_Z3foov" instead of short section
names like ".text". Otherwise, the binutils linker gets confused and
reports multiple definition errors when two object files from GCC and
Clang containing the same inline function are linked together.
The best description of the issue is probably at
https://github.com/Alexpux/MINGW-packages/issues/1677, we don't seem to
have a good one in our tracker.
I fixed up the .pdata and .xdata sections needed everywhere other than
32-bit x86. GCC doesn't use associative comdats for those, it appears to
rely on the section name.
Reviewers: smeenai, compnerd, mstorsjo, martell, mati865
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D48402
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335286 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 20:15:09 +0000 (20:15 +0000)]
[InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806)
This is the simplest case from PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806
If we have a common variable operand used in a pair of binops with vector constants
that are vector selected together, then we can constant shuffle the constant vectors
to eliminate the shuffle instruction.
This has some tricky parts that are hopefully addressed in the tests and their
respective comments:
1. If the shuffle mask contains an undef element, then that lane of the result is
undef:
http://llvm.org/docs/LangRef.html#shufflevector-instruction
Therefore, we can replace the constant in that lane with an undef value except
for div/rem. With div/rem, an undef in the divisor would cause the whole op to
be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'.
2. Intersect the wrapping and FMF of the original binops for the new binop. There
should be no extra poison or fast-math potential in the new binop that wasn't
possible in the original code.
3. Disregard other uses. Given that we're eliminating uses (shortening the
dependency chain), I think that's always the right IR canonicalization. But
I purposely chose the udiv test to demonstrate the scenario where both
intermediate values have other uses because that seems likely worse for
codegen with an expensive math op. This seems like a very rare possibility to
me, so I don't think it requires a backend patch first.
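A minimal Python sketch of the constant shuffle, including the undef handling from point 1. Vectors are plain lists, None stands in for undef, and the helper name is hypothetical; this models the idea rather than the InstCombine implementation.

```python
UNDEF = None


def fold_select_constants(c0, c1, mask, is_div_rem):
    """Shuffle two constant vectors with a shufflevector-style mask.

    mask[i] indexes into c0 + c1; an UNDEF mask element marks an undef lane.
    """
    concat = list(c0) + list(c1)
    out = []
    for m in mask:
        if m is UNDEF:
            # An undef divisor would make the whole op undef, so use 1 there.
            out.append(1 if is_div_rem else UNDEF)
        else:
            out.append(concat[m])
    return out


x = [10, 20, 30, 40]
c0, c1 = [1, 2, 3, 4], [5, 6, 7, 8]
mask = [0, 5, 2, 7]  # lanes 0 and 2 from the first binop, 1 and 3 from the second

# Original form: two adds followed by a select-style shuffle.
lhs = [a + b for a, b in zip(x, c0)]
rhs = [a + b for a, b in zip(x, c1)]
shuffled = [(lhs + rhs)[m] for m in mask]

# Folded form: one add with the pre-shuffled constant.
folded = [a + b for a, b in zip(x, fold_select_constants(c0, c1, mask, False))]
```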
Differential Revision: https://reviews.llvm.org/D48401
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335283 91177308-0d34-0410-b5e6-96231b3b80d8
Scott Linder [Thu, 21 Jun 2018 19:38:56 +0000 (19:38 +0000)]
[AMDGPU] Update assembler for HSA Code Object v3
Update AMDGPU assembler syntax behind the code-object-v3 feature:
* Replace/rename most AMDGPU assembler directives/symbols and document them.
* Provide more diagnostics (e.g. values out of range, missing values, repeated
values).
* Provide path for backwards compatibility, even with underlying descriptor
changes.
Differential Revision: https://reviews.llvm.org/D47736
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335281 91177308-0d34-0410-b5e6-96231b3b80d8
Francis Visoiu Mistrih [Thu, 21 Jun 2018 19:18:36 +0000 (19:18 +0000)]
Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions."
This reverts commit r335206.
As discussed here: https://reviews.llvm.org/rL333740, a fix will come
tomorrow. In the meanwhile, revert this to fix some bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335272 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Dardis [Thu, 21 Jun 2018 18:52:32 +0000 (18:52 +0000)]
[mips] Modify comment to test new email address (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335269 91177308-0d34-0410-b5e6-96231b3b80d8
Scott Linder [Thu, 21 Jun 2018 18:48:48 +0000 (18:48 +0000)]
[AMDGPU] Fix bug with tracking processed blocks in SIInsertWaitcnts
BlockWaitcntProcessedSet was not being cleared between calls, so it was
producing incorrect counts in cases where MBB addresses happened to coincide
across multiple calls.
Differential Revision: https://reviews.llvm.org/D48391
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335268 91177308-0d34-0410-b5e6-96231b3b80d8
Konstantin Zhuravlyov [Thu, 21 Jun 2018 18:36:04 +0000 (18:36 +0000)]
AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z
and everything that comes with it from implementation
and v3 header files.
Leave definition in v2 header files for backwards
compatibility.
Differential Revision: https://reviews.llvm.org/D48191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335267 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 18:07:38 +0000 (18:07 +0000)]
[InstCombine] add tests for shuffled cmps; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335266 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Davis [Thu, 21 Jun 2018 17:59:52 +0000 (17:59 +0000)]
[DebugInfo] Ignore DBG_VALUE instructions in PostRA Machine Sink
Summary:
The logic for handling the sinking of COPY instructions was generating
different code when building with debug flags.
The original code did not take into consideration debug instructions. This
resulted in the registers in the DBG_VALUE instructions being treated as used,
and prevented the COPY from being sunk. This patch avoids analyzing debug
instructions when trying to sink COPY instructions.
This patch also creates a routine from the code in MachineSinking::SinkInstruction to
perform the logic of sinking an instruction along with its debug instructions.
This functionality is used in multiple places, including the code for sinking COPY instrs.
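The effect of skipping debug instructions can be shown with a toy model in Python. The Instr record and both helpers are hypothetical and far simpler than the machine-sinking code; they only illustrate why a DBG_VALUE must not count as a register use.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    opcode: str
    uses: tuple = ()
    is_debug: bool = False


def regs_used(instrs, skip_debug=True):
    """Collect registers read by the given instructions."""
    used = set()
    for mi in instrs:
        if skip_debug and mi.is_debug:
            continue  # a DBG_VALUE must not influence code generation
        used.update(mi.uses)
    return used


def can_sink_copy(dest_reg, following, skip_debug=True):
    """A COPY may be sunk past instructions that do not read its result."""
    return dest_reg not in regs_used(following, skip_debug)


following = [Instr("DBG_VALUE", ("r1",), is_debug=True), Instr("ADD", ("r2",))]
sinks_when_ignoring_debug = can_sink_copy("r1", following)
sinks_when_counting_debug = can_sink_copy("r1", following, skip_debug=False)
```

Counting the DBG_VALUE as a use blocks the sink only in debug builds, which is the code difference this message describes.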
Reviewers: junbuml, javed.absar, MatzeB, bjope
Reviewed By: bjope
Subscribers: aprantl, probinson, thegameg, jonpa, bjope, vsk, kristof.beyls, JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D45637
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335264 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 17:51:44 +0000 (17:51 +0000)]
[InstCombine] use constant pattern matchers with icmp+sext
The previous code worked with vectors, but it failed when the
vector constants contained undef elements.
The matchers handle those cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335262 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 17:37:14 +0000 (17:37 +0000)]
[InstCombine] add vector icmp tests with undefs; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335261 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 17:06:36 +0000 (17:06 +0000)]
[InstCombine] simplify binops before trying other folds
This is outwardly NFC from what I can tell, but it should be more efficient
to simplify first (despite the name, SimplifyAssociativeOrCommutative does
not actually simplify as InstSimplify does - it creates/morphs instructions).
This should make it easier to refactor duplicated code that runs for all binops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335258 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 16:54:32 +0000 (16:54 +0000)]
[LoopVectorize] regenerate full checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335257 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Thu, 21 Jun 2018 16:54:18 +0000 (16:54 +0000)]
[X86] Update fast-isel tests for clang r335253.
The new IR fixes a mismatch in the final extractelement for the i32 intrinsics. Previously we extracted a 64-bit element even though we only wanted 32 bits.
SimplifyDemandedElts isn't able to make FP elements undef now and the shuffle mask I used prevents the use of horizontal add we had before. Not sure we should have been using horizontal add anyway. It's implemented on Intel with two port 5 shuffles and an add. So we have one less shuffle now, but an additional instruction to decode.
Differential Revision: https://reviews.llvm.org/D48347
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335256 91177308-0d34-0410-b5e6-96231b3b80d8
Paul Robinson [Thu, 21 Jun 2018 16:42:03 +0000 (16:42 +0000)]
[DWARF] Warn on and ignore ".file 0" for DWARF v4 and earlier.
This had been messing with the directory table for prior versions, and
also could induce a crash when generating asm output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335254 91177308-0d34-0410-b5e6-96231b3b80d8
Sirish Pande [Thu, 21 Jun 2018 16:05:24 +0000 (16:05 +0000)]
Revert "[AArch64] Coalesce Copy Zero during instruction selection"
This reverts commit d8f57105010cc7e78026e511d5def873fc91e0e7.
Original Commit:
Author: Haicheng Wu <haicheng@codeaurora.org>
Date: Sun Feb 18 13:51:33 2018 +0000
[AArch64] Coalesce Copy Zero during instruction selection
Add special case for copy of zero to avoid a double copy.
Differential Revision: https://reviews.llvm.org/D36104
The author's intention is to remove a BB that has one mov instruction. In
order to do that, d8f571050 pessimizes MachineSinking by introducing a
copy, such that the mov instruction is NOT moved to the BB. Optimization
downstream gets rid of the BB with only mov instruction. This works well
if we have only one fall through branch as there is only one "extra"
mov instruction.
If we have multiple fall throughs, we will have a lot of redundant movs.
In such a case, it's better to have this BB which has one mov instruction.
This is causing degradation in jpeg, fft and other codebases. I believe
if we want to remove a BB with only one branch instruction, we should not
pessimize Machine Sinking at all, and find some other solution.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335251 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Thu, 21 Jun 2018 16:02:05 +0000 (16:02 +0000)]
DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1"
Allow folding for "and/or" binops with a non-constant operand if the
arguments of the select are 0/-1 values.
Normally this pattern with the "and" opcode does not reach the DAG combiner
because InstCombine has already simplified it. However, AMDGPU produces it
during lowering, where InstCombine has no chance to optimize it out.
The same pattern with the "or" opcode, in turn, can reach the DAG.
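The identities behind this combine are easy to check exhaustively over the condition. The Python sketch below models 32-bit values as unsigned integers (an assumption made for illustration) and verifies both operand orders of the select; it does not claim which of these forms the combine itself handles.

```python
ALL_ONES = 0xFFFFFFFF  # -1 as an unsigned 32-bit value


def select(c, t, f):
    return t if c else f


def identities_hold(x):
    """Check the and/or-of-select identities for one 32-bit value x."""
    for c in (False, True):
        if (select(c, ALL_ONES, 0) & x) != select(c, x, 0):
            return False
        if (select(c, ALL_ONES, 0) | x) != select(c, ALL_ONES, x):
            return False
        # Same folds with the select arms swapped.
        if (select(c, 0, ALL_ONES) & x) != select(c, 0, x):
            return False
        if (select(c, 0, ALL_ONES) | x) != select(c, x, ALL_ONES):
            return False
    return True


all_ok = all(identities_hold(x) for x in (0, 1, 0x1234ABCD, ALL_ONES))
```

In each case the binop disappears and the non-constant operand moves into one arm of the select.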
Differential Revision: https://reviews.llvm.org/D48301
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335250 91177308-0d34-0410-b5e6-96231b3b80d8
David Green [Thu, 21 Jun 2018 15:48:29 +0000 (15:48 +0000)]
[ARM] Enable useAA() for the in-order Cortex-R52
This option allows codegen (such as DAGCombine or MI scheduling) to use alias
analysis information, which can help with the codegen on in-order CPUs,
especially machine scheduling. Here I have done things the same way as AArch64,
adding a subtarget feature to enable this for specific cores, and enabled it for
the R52 where we have a schedule to make use of it.
Differential Revision: https://reviews.llvm.org/D48074
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335249 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Jun 2018 14:59:35 +0000 (14:59 +0000)]
[InstCombine] make div/rem vector constant utility function; NFCI
This was originally in D48401 and will be used there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335242 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Thu, 21 Jun 2018 14:53:06 +0000 (14:53 +0000)]
[NFC][ARM] ldrd/strd negative tests
Add negative tests for load and stores of alignment 2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335241 91177308-0d34-0410-b5e6-96231b3b80d8
Clement Courbet [Thu, 21 Jun 2018 14:49:04 +0000 (14:49 +0000)]
[llvm-exegesis][NFC] Simplify BenchmarkRunner.
Get rid of createExecutableFunction().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335240 91177308-0d34-0410-b5e6-96231b3b80d8
Sameer AbuAsal [Thu, 21 Jun 2018 14:37:09 +0000 (14:37 +0000)]
[RISCV] Tail calls don't need to save return address
Summary:
When expanding the PseudoTail in expandFunctionCall() we were using X6
to save the return address. Since this is a tail call, the return
address is not needed, so this patch replaces X6 with X0 and the address is discarded.
This matches the behaviour listed in the ISA V2.2 document page 110.
tail offset -----> jalr x0, x6, offset
GCC exhibits the same behavior.
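The difference between linking into X6 and X0 can be seen in a tiny register-file model. This Python sketch follows the usual JALR semantics (target is rs1 plus offset with the low bit cleared, rd receives pc + 4) on an assumed 32-bit machine; the Regs class is hypothetical.

```python
class Regs:
    """Minimal RV32 integer register file: x0 always reads as zero."""

    def __init__(self):
        self._r = [0] * 32

    def read(self, i):
        return self._r[i]

    def write(self, i, value):
        if i != 0:  # writes to x0 are discarded
            self._r[i] = value & 0xFFFFFFFF


def jalr(regs, pc, rd, rs1, offset):
    """Return the jump target and write the link address to rd."""
    target = (regs.read(rs1) + offset) & ~1 & 0xFFFFFFFF
    regs.write(rd, pc + 4)
    return target


new = Regs()
new.write(6, 0x1000)
new_target = jalr(new, pc=0x200, rd=0, rs1=6, offset=8)  # jalr x0, x6, 8

old = Regs()
old.write(6, 0x1000)
old_target = jalr(old, pc=0x200, rd=6, rs1=6, offset=8)  # jalr x6, x6, 8
```

Both forms jump to the same place; only the old one leaves an unused return address behind in X6.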
Reviewers: apazos, asb, mgrang
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01
Differential Revision: https://reviews.llvm.org/D48343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335239 91177308-0d34-0410-b5e6-96231b3b80d8
Mikhail Dvoretckii [Thu, 21 Jun 2018 14:16:45 +0000 (14:16 +0000)]
[x86] Lower some trunc + shuffle patterns to vpmov[q|d][b|w]
This should help in lowering the following four intrinsics:
_mm256_cvtepi32_epi8
_mm256_cvtepi64_epi16
_mm256_cvtepi64_epi8
_mm512_cvtepi64_epi8
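For readers unfamiliar with these intrinsics, each one narrows every lane by truncation, keeping only the low bits. A minimal Python model of that per-lane operation follows; the helper name is hypothetical, lanes are treated as unsigned bit patterns, and the placement of the result within the destination register is not modeled.

```python
def trunc_lanes(values, src_bits, dst_bits):
    """Truncate each src_bits-wide lane to its low dst_bits bits."""
    if dst_bits >= src_bits:
        raise ValueError("destination lanes must be narrower than source lanes")
    src_mask = (1 << src_bits) - 1
    dst_mask = (1 << dst_bits) - 1
    return [(v & src_mask) & dst_mask for v in values]


# Lane-wise view of a 32-bit to 8-bit truncation such as _mm256_cvtepi32_epi8.
epi32_to_epi8 = trunc_lanes([0x12345678, 0xFFFFFFFF, 0x100, 0x7F], 32, 8)

# Lane-wise view of a 64-bit to 16-bit truncation such as _mm256_cvtepi64_epi16.
epi64_to_epi16 = trunc_lanes([0x1122334455667788, 0x10000], 64, 16)
```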
Differential Revision: https://reviews.llvm.org/D46957
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335238 91177308-0d34-0410-b5e6-96231b3b80d8
Clement Courbet [Thu, 21 Jun 2018 14:11:09 +0000 (14:11 +0000)]
[llvm-exegesis][NFC] Simplify LLVMState.
Summary: Pretty much everything we need is in llvm::TargetMachine.
Reviewers: gchatelet
Subscribers: llvm-commits, tschuett
Differential Revision: https://reviews.llvm.org/D48428
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335237 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Thu, 21 Jun 2018 13:38:43 +0000 (13:38 +0000)]
[CodeGen] Avoid handling DBG_VALUE in LiveRegUnits::stepBackward
Patch by Jesper Antonsson.
Differential Revision: https://reviews.llvm.org/D48420
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335233 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:37:55 +0000 (13:37 +0000)]
AMDGPU: Remove redundant MIMG instruction variants
Summary:
For sample and gather ops, we can accurately determine the set of
vaddr-size instruction variants that are required. This reduces
the size of instruction tables by ~5%.
The number of machine instruction opcodes is reduced from 10002
to 9476.
Change-Id: Ie7fc65d3657b762c7816017fe70b2e9bec644a8a
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D48168
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335232 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:37:45 +0000 (13:37 +0000)]
AMDGPU: Remove old-style image intrinsics
Summary:
This also removes the need for atomic pseudo instructions, since
we select the correct encoding directly in SITargetLowering::lowerImage
for dimension-aware image intrinsics.
Mesa has used dimension-aware image intrinsics since commit a9a7993441.
Change-Id: I7473d20009476a4ed6d919cae4e6dca9ff42e77a
Reviewers: arsenm, rampitec, mareko, tpr, b-sumner
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48167
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335231 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:37:31 +0000 (13:37 +0000)]
InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded
Summary:
Use the expanded features of the TableGen generic tables to avoid manually
adding the combinatorially exploded set of intrinsics. The
getAMDGPUImageDimIntrinsic lookup function is early-out,
i.e. non-AMDGPU intrinsics will never look at the underlying table.
Use a generic approach for getting the new intrinsic overload to keep the
code simple, and make the image dmask handling more generic:
- handle non-sampler image loads
- handle the case where the set of demanded elements is not a prefix
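The non-prefix case can be sketched as follows in Python. The sketch assumes that the i-th set bit of the dmask produces the i-th element of the loaded vector; that convention and the helper name are assumptions made for illustration, not taken from this message.

```python
def shrink_dmask(dmask, demanded):
    """Keep only the dmask channels whose loaded element is demanded.

    dmask: 4-bit channel mask; demanded: one bool per loaded element.
    """
    new_dmask = 0
    lane = 0
    for bit in range(4):
        if dmask & (1 << bit):
            if lane < len(demanded) and demanded[lane]:
                new_dmask |= 1 << bit
            lane += 1
    return new_dmask


# All four channels loaded, only elements 0 and 2 used: not a prefix.
non_prefix = shrink_dmask(0b1111, [True, False, True, False])

# Channels 1 and 3 loaded, only the second loaded element used.
sparse = shrink_dmask(0b1010, [False, True])
```

A prefix-only approach would have to keep three channels in the first case; tracking individual elements keeps two.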
There is some overlap between this code and an optimization that happens
in the backend during code generation. They currently complement each other:
- only the codegen optimization can generate vec3 loads
- only the InstCombine optimization can handle D16
The InstCombine optimization also likely covers more cases since the
codegen optimization is fairly ad-hoc. Ideally, we'll remove the optimization
in codegen once the infrastructure for vec3 is in place (which will probably
take a long time).
Modify the test cases to use dimension-aware intrinsics. This makes it
easier to see that the test coverage for the new intrinsics is equivalent,
and the old style intrinsics will be removed in a follow-up commit anyway.
Change-Id: I4b91ea661413d13004956fe4ef7d13d41b8ce3ad
Reviewers: arsenm, rampitec, majnemer
Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48165
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335230 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:37:19 +0000 (13:37 +0000)]
AMDGPU: Convert test cases to the dimension-aware intrinsics
Summary:
Also explicitly port over some tests in llvm.amdgcn.image.* that were
missing. Some tests are removed because they no longer apply (i.e.
explicitly testing building an address vector via insertelement).
This is in preparation for the eventual removal of the old-style
intrinsics.
Some additional notes:
- constant-address-space-32bit.ll: change some GCN-NEXT to GCN because
the instruction schedule was subtly altered
- insert_vector_elt.ll: the old test didn't actually test anything,
because %tmp1 was not used; remove the load, because it doesn't work
(Because of the amdgpu_ps calling convention? In any case, it's
orthogonal to what the test claims to be testing.)
Change-Id: Idfa99b6512ad139e755e82b8b89548ab08f0afcf
Reviewers: arsenm, rampitec
Subscribers: MatzeB, qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D48018
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335229 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:57 +0000 (13:36 +0000)]
AMDGPU: Select MIMG instructions manually in SITargetLowering
Summary:
Having TableGen patterns for image intrinsics is hitting limitations:
for D16 we already have to manually pre-lower the packing of data
values, and we will have to do the same for A16 eventually.
Since there is already some custom C++ code anyway, it is arguably easier
to just do everything in C++, now that we can use the beefed-up generic
tables backend of TableGen to provide all the required metadata and map
intrinsics to corresponding opcodes. With this approach, all image
intrinsic lowering happens in SITargetLowering::lowerImage. That code is
dense due to all the cases that it handles, but it should still be easier
to follow than what we had before, by virtue of it all being done in a
single location, and by virtue of not relying on the TableGen pattern
magic that very few people really understand.
This means that we will have MachineSDNodes with MIMG instructions
during DAG combining, but that seems alright: previously we had
intrinsic nodes instead, but those are similarly opaque to the generic
CodeGen infrastructure, and the final pattern matching just did a 1:1
translation to machine instructions anyway. If anything, the fact that
we now merge the address words into a vector before DAG combine should
be an advantage.
Change-Id: I417f26bd88f54ce9781c1668acc01f3f99774de6
Reviewers: arsenm, rampitec, rtaylor, tstellar
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48017
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335228 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:44 +0000 (13:36 +0000)]
AMDGPU: Refactor MIMG instruction TableGen using generic tables
Summary:
This allows us to access rich information about MIMG opcodes from C++ code.
Simplifying the mapping between equivalent opcodes of different data size
becomes quite natural.
This also flattens the MIMG-related class and multiclass hierarchy a little,
and collapses together some of the scaffolding for sample and gather4 opcodes.
Change-Id: I1a2549fdc1e881ff100e5393d2d87e73729a0ccd
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48016
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335227
91177308-0d34-0410-b5e6-
96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:33 +0000 (13:36 +0000)]
AMDGPU: Use generic tables instead of SearchableTable
Summary:
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48014
Change-Id: Ibb43f90d955275571aff17d0c3ecfb5e5b299641
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335226
91177308-0d34-0410-b5e6-
96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:22 +0000 (13:36 +0000)]
TableGen/SearchableTables: Support more generic enums and tables
Summary:
This is essentially a rewrite of the backend which introduces TableGen
base classes GenericEnum, GenericTable, and SearchIndex. They allow
generating custom enums and tables with lookup functions using
separately defined records as the underlying database.
Also added as part of this change:
- Lookup functions may use indices composed of multiple fields.
- Instruction fields are supported similar to Intrinsic fields.
- When the lookup key has contiguous numeric values, the lookup
function will directly index into the table instead of using a binary
search.
The existing SearchableTable functionality is internally mapped to the
new primitives.
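The lookup strategy described above can be sketched in a few lines. This is only an illustrative model, not the code the TableGen backend emits: `build_lookup`, the `direct` attribute, and the row layout are hypothetical names chosen for the example.

```python
from bisect import bisect_left


def build_lookup(rows, key):
    """Return a lookup function over `rows`, indexed by `key(row)`.

    Hypothetical model: contiguous integer keys get a direct index,
    anything else (sparse or multi-field keys) gets a binary search.
    """
    rows = sorted(rows, key=key)
    keys = [key(r) for r in rows]
    contiguous = (
        bool(keys)
        and all(isinstance(k, int) for k in keys)
        and keys == list(range(keys[0], keys[0] + len(keys)))
    )

    if contiguous:
        first = keys[0]

        def lookup(k):
            # Direct indexing: no search needed.
            idx = k - first
            return rows[idx] if 0 <= idx < len(rows) else None
    else:

        def lookup(k):
            # Binary search over the sorted keys (tuples work as composite keys).
            idx = bisect_left(keys, k)
            return rows[idx] if idx < len(keys) and keys[idx] == k else None

    lookup.direct = contiguous
    return lookup
```

A key made of several fields can be modelled as a tuple, which falls back to the binary search path.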
Change-Id: I444f3490fa1dbfb262d7286a1660a2c4308e9932
Reviewers: arsenm, tra, t.p.northover
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D48013
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335225
91177308-0d34-0410-b5e6-
96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:13 +0000 (13:36 +0000)]
AMDGPU: Pass AMDGPUSampleVariant to MIMG_{Sampler,Gather}(_WQM)
Summary:
This will allow us to provide rich metadata about the instructions
in tables that are accessible by custom C++ code.
Change-Id: Id9305a26304ab6a6cceb6c65c8cd49141cc0101d
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D48011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335224
91177308-0d34-0410-b5e6-
96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:08 +0000 (13:36 +0000)]
AMDGPU: Add implicit def of SCC to kill and indirect pseudos
Summary:
Kill instructions sometimes do use SCC in unusual circumstances, when
v_cmpx cannot be used due to the operands that are involved.
Additionally, even if SCC was never defined by the expansion, kill pseudos
could previously occur between an s_cmp and an s_cbranch_scc, which breaks
the SCC liveness tracking when the pseudo is expanded to split the basic
block. While it would be possible to explicitly mark the SCC as live-in for
the successor basic block, it's simpler to just mark the pseudo as using SCC,
so that such a sequence is never emitted by instruction selection in the
first place.
A similar issue affects indirect source/dest pseudos in principle, although
I haven't been able to come up with a test case where it actually matters
(this affects instruction selection, so a MIR test can't be used).
Fixes: dEQP-GLES3.functional.shaders.discard.dynamic_loop_always
Change-Id: Ica8d82ecff1a763b892a1112cf1b06c948863a4f
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D47761
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335223
91177308-0d34-0410-b5e6-
96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:36:01 +0000 (13:36 +0000)]
AMDGPU: Turn D16 for MIMG instructions into a regular operand
Summary:
This allows us to reduce the number of different machine instruction
opcodes, which reduces the table sizes and helps flatten the TableGen
multiclass hierarchies.
We can do this because for each hardware MIMG opcode, we have a full set
of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata
and vaddr registers. Instead of having separate D16 machine instructions,
a packed D16 instruction loading e.g. 4 components can simply use the
same V2 opcode variant that non-D16 instructions use.
We still require a TSFlag for D16 buffer instructions, because the
D16-ness of buffer instructions is part of the opcode. Renaming the flag
should help avoid future confusion.
The one non-obvious code change is that for gather4 instructions, the
disassembler can no longer automatically decide whether to use a V2 or
a V4 variant. The existing logic which chooses the correct variant for
other MIMG instructions is extended to cover gather4 as well.
As a bonus, some of the assembler error messages are now more helpful
(e.g., complaining about a wrong data size instead of a non-existing
instruction).
While we're at it, delete a whole bunch of dead legacy TableGen code.
Change-Id: I89b02c2841c06f95e662541433e597f5d4553978
Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor
Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D47434
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335222
91177308-0d34-0410-b5e6-
96231b3b80d8
Nicolai Haehnle [Thu, 21 Jun 2018 13:35:44 +0000 (13:35 +0000)]
TableGen: Allow foreach in multiclass to depend on template args
Summary:
This also allows inner foreach loops to have a list that depends on
the iteration variable of an outer foreach loop. The test cases show
some very simple examples of how this can be used.
This was perhaps the last remaining major non-orthogonality in the
TableGen frontend.
Change-Id: I79b92d41a5c0e7c03cc8af4000c5e1bda5ef464d
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47431
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335221
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrea Di Biagio [Thu, 21 Jun 2018 12:14:49 +0000 (12:14 +0000)]
[llvm-mca] Update comment in code, and remove some stale comments. NFC
Also, rename fields `TotalMappings` and `NumUsedMappings` in struct
RegisterMappingTracker into `NumPhysRegs` and `NumUsedPhysRegs`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335219
91177308-0d34-0410-b5e6-
96231b3b80d8
David Green [Thu, 21 Jun 2018 11:53:16 +0000 (11:53 +0000)]
[DA] Enable -da-delinearize by default
This enables da-delinearize in Dependence Analysis for delinearizing array
accesses into multiple dimensions. This can help to increase the power of
Dependence analysis on multi-dimensional arrays and prevent having to fall
back to the slower and less accurate MIV tests. It adds static checks on the
bounds of the arrays to ensure that one dimension doesn't overflow into
another, and brings our code in line with our tests.
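To illustrate why the bounds checks matter, here is a small sketch of what can go wrong when a subscript overflows into the next dimension. It is a toy model with concrete integers, not the symbolic analysis the pass performs; `linearize` and `subscripts_in_bounds` are hypothetical helpers.

```python
def linearize(subscripts, inner_dims):
    """Flatten multi-dimensional subscripts into one linear index.

    `inner_dims` holds the sizes of every dimension except the outermost.
    """
    index = subscripts[0]
    for sub, dim in zip(subscripts[1:], inner_dims):
        index = index * dim + sub
    return index


def subscripts_in_bounds(inner_ranges, inner_dims):
    """Check that each inner subscript range (lo, hi), inclusive, fits its dimension.

    Only when this holds do distinct subscript tuples name distinct elements.
    """
    return all(0 <= lo and hi < dim for (lo, hi), dim in zip(inner_ranges, inner_dims))
```

With an inner dimension of size 4, the subscripts `(0, 4)` and `(1, 0)` flatten to the same element, so treating the dimensions independently would miss that dependence unless the inner subscript is known to stay below 4.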
Differential Revision: https://reviews.llvm.org/D45872
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335217
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 21 Jun 2018 11:37:13 +0000 (11:37 +0000)]
[X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882)
These were being over cautious for costs for one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335216
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 21 Jun 2018 11:16:10 +0000 (11:16 +0000)]
[SLPVectorizer][X86] Add horizontal add/sub tests
Shows PR37882 perf regression
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335215
91177308-0d34-0410-b5e6-
96231b3b80d8
Mikael Holmen [Thu, 21 Jun 2018 10:03:34 +0000 (10:03 +0000)]
[DebugInfo] Make sure all DBG_VALUEs' reguse operands have IsDebug property
Summary:
In some cases, these operands lacked the IsDebug property, which is meant to signal that
they should not affect codegen. This patch adds a check for this property in the
MachineVerifier and adds it where it was missing.
This includes refactorings to use MachineInstrBuilder construction functions instead of
manually setting up the intrinsic everywhere.
Patch by: JesperAntonsson
Reviewers: aprantl, rnk, echristo, javed.absar
Reviewed By: aprantl
Subscribers: qcolombet, sdardis, nemanjai, JDevlieghere, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D48319
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335214
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Atanasyan [Thu, 21 Jun 2018 09:59:44 +0000 (09:59 +0000)]
CODE_OWNERS: Take ownership of the MIPS backend
As agreed with Simon Dardis, I'm taking over as a code owner
for the MIPS backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335213
91177308-0d34-0410-b5e6-
96231b3b80d8
David Green [Thu, 21 Jun 2018 08:30:07 +0000 (08:30 +0000)]
[DAGCombine] Fix alignment for offset loads/stores
The alignment parameter to getExtLoad is treated as a base alignment,
not the alignment of the load (base + offset). When we infer a better
alignment for a Ptr we need to ensure that it applies to the base to
prevent the alignment on the load from being wrong.
This fixes a bug where the alignment could then be used to incorrectly
prove noalias between a load and a store, leading to a miscompile.
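The distinction between base alignment and load alignment can be shown with a little arithmetic. This is a sketch under the assumption that the effective alignment of `base + offset` is the largest power of two dividing both the base alignment and the offset; `min_align` is a hypothetical helper written for this example.

```python
def min_align(base_align, offset):
    """Largest power of two dividing both `base_align` and `offset`.

    Models the alignment that can be assumed for the address base + offset
    when only the alignment of base is known.
    """
    combined = base_align | offset
    return combined & -combined


# Illustrative numbers only: a base that is 4-aligned but not 8-aligned.
base = 4
ptr = base + 4  # happens to be 8-aligned

# Mistaking the alignment of ptr (8) for the base alignment claims that
# base + 8 is 8-aligned, which is false for this base.
claimed = min_align(8, 8)
actual_address = base + 8
```

Here `claimed` is 8 while `actual_address` is 12, which is only 4-aligned: the kind of overstatement that could feed an incorrect noalias proof.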
Differential Revision: https://reviews.llvm.org/D48029
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335210
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Thu, 21 Jun 2018 07:15:19 +0000 (07:15 +0000)]
Remove FIXME comment about WIP.
This is the only line other than the function signature remaining
of the original patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335208
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Thu, 21 Jun 2018 07:15:14 +0000 (07:15 +0000)]
Add some explanatory text to the associated symbol support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335207
91177308-0d34-0410-b5e6-
96231b3b80d8
Florian Hahn [Thu, 21 Jun 2018 07:15:08 +0000 (07:15 +0000)]
Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
r335150 should resolve the issues with the clang-with-thin-lto-ubuntu
and clang-with-lto-ubuntu builders.
Original message:
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.
As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.
Reviewers: davide, mssimpso, dberlin, efriedma
Reviewed By: davide, dberlin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335206
91177308-0d34-0410-b5e6-
96231b3b80d8
Mikael Holmen [Thu, 21 Jun 2018 07:02:46 +0000 (07:02 +0000)]
[DebugInfo] Keep DBG_VALUE undef in LiveDebugVariables
Summary:
Fixes PR36579.
For cases where we had e.g.
DBG_VALUE 42
[...]
DBG_VALUE undef
LiveDebugVariables would discard all undef DBG_VALUEs and then it would
look like the variable had the value 42 throughout the rest of the
function, which is incorrect.
With this patch we don't remove all undef DBG_VALUEs in LiveDebugVariables
so they will be kept after register allocation just like other DBG_VALUEs
which will yield more correct debug information.
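The effect of discarding the undef records can be modelled with a toy location list. This is a simplified sketch, not how LiveDebugVariables represents its data: positions are plain integers and `value_at` is a hypothetical helper.

```python
UNDEF = None  # stands in for "DBG_VALUE undef"


def value_at(dbg_values, position):
    """Value a debugger would report at `position`.

    `dbg_values` is a list of (position, value) pairs in program order;
    the most recent record at or before `position` wins.
    """
    current = UNDEF
    for pos, val in dbg_values:
        if pos > position:
            break
        current = val
    return current


records = [(0, 42), (10, UNDEF)]

# Dropping the undef record lets the stale value 42 leak past position 10.
without_undef = [(pos, val) for pos, val in records if val is not UNDEF]
```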
Reviewers: aprantl
Reviewed By: aprantl
Subscribers: bjope, Ka-Ka, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48277
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335205
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 21 Jun 2018 06:17:16 +0000 (06:17 +0000)]
[X86] Go through some tests that still reference old intrinsics that have been autoupgraded and replace them with the upgraded IR.
This is mostly the stack folding tests and is by no means a thorough audit of tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335204
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 21 Jun 2018 06:14:03 +0000 (06:14 +0000)]
[PM/LoopUnswitch] Add partial non-trivial unswitching for invariant
conditions feeding a chain of `and`s or `or`s for a branch.
Much like with full non-trivial unswitching, we rely on the pass manager
to handle iterating until all of the profitable unswitches have been
done. This is to allow other more profitable unswitches to fire on any
of the cloned, simpler versions of the loop if viable.
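As a source-level illustration of the idea, consider a branch whose condition is a chain of `and`s with two loop-invariant inputs. The sketch below is a hand-written analogy in Python, not output of the pass; the function names are made up.

```python
def original(xs, inv_a, inv_b):
    out = []
    for x in xs:
        # inv_a and inv_b never change inside the loop.
        if inv_a and inv_b and x > 0:
            out.append(x)
        else:
            out.append(0)
    return out


def unswitched(xs, inv_a, inv_b):
    out = []
    if inv_a and inv_b:
        # Clone where the invariant part is known true; only `x > 0` remains.
        for x in xs:
            out.append(x if x > 0 else 0)
    else:
        # Clone where the whole condition is known false.
        for x in xs:
            out.append(0)
    return out
```

The loop is duplicated, which is what makes this non-trivial: each clone is simpler than the original and can be optimized further on its own.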
Threading the partial unswitching through the non-trivial unswitching
logic motivated some minor refactorings. If those are too disruptive to
make it reasonable to review this patch, I can separate them out, but
it'll be somewhat time-consuming so I wanted to send it for initial
review as-is. Feel free to tell me whether it warrants pulling apart.
I've tried to re-use (and factor out) logic from the partial trivial
unswitching, but not as much could be shared as I had hoped. Still, this
wasn't as bad as I naively expected.
Some basic testing is added, but I probably need more. Suggestions for
things you'd like to see tested more than welcome. One thing I'd like to
do is add some testing that when we schedule this with loop-instsimplify
it effectively cleans up the cruft created.
Last but not least, this uncovered a bug that has been in loop cloning
the entire time for non-trivial unswitching. Specifically, we didn't
correctly add the outer-most cloned loop to the list of cloned loops.
This meant that LCSSA wouldn't be updated for it hypothetically, and
more significantly that we would never visit it in the loop pass
manager. I noticed this while checking loop-instsimplify by hand. I'll
try to separate this bugfix out into its own patch with a more focused
test. But it is just one line, so shouldn't significantly confuse the
review here.
After this patch, the only missing "feature" in this unswitch I'm aware
of is non-trivial unswitching of switches. I'll try implementing *full*
non-trivial unswitching of switches (which is at least a sound thing to
implement), but *partial* non-trivial unswitching of switches is
something I don't see any sound and principled way to implement. I also
have no interesting test cases for the latter, so I'm not really
worried. The rest of the things that need to be ported are bug-fixes and
more narrow / targeted support for specific issues.
Differential Revision: https://reviews.llvm.org/D47522
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335203
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 21 Jun 2018 05:42:05 +0000 (05:42 +0000)]
[RISC-V] Fix a test case to not include label names as those aren't
stable in non-asserts builds. This fixes a test failure in release
config.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335202
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Zolotukhin [Thu, 21 Jun 2018 05:14:00 +0000 (05:14 +0000)]
ProvenanceAnalysis: Store WeakTrackingVH instead of Value* in UnderlyingValue Cache.
Summary:
Since the value stored in the cache might be deleted or replaced with
something else, we need to use tracking ValueHandlers instead of plain
Value pointers. It was discovered in one of internal builds, and
unfortunately there is no small reproducer for the issue.
The cache was introduced in rL327328.
Reviewers: ahatanak, pete
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48407
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335201
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 21 Jun 2018 05:00:56 +0000 (05:00 +0000)]
[X86] Remove masking from 512-bit floating max/min intrinsics. Use select instruction instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335199
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Shen [Thu, 21 Jun 2018 02:15:32 +0000 (02:15 +0000)]
Revert "[SCEV] Improve zext(A /u B) and zext(A % B)"
This reverts commit r335197, as some bots are not happy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335198
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Shen [Thu, 21 Jun 2018 01:49:07 +0000 (01:49 +0000)]
[SCEV] Improve zext(A /u B) and zext(A % B)
Summary:
Try to match udiv and urem patterns, and sink zext down to the leaves.
I'm not entirely sure why some unrelated tests change, but the added <nsw>s seem right.
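Sinking the zext to the leaves is only sound because unsigned division and remainder cannot wrap in the narrow type. The brute-force check below is an illustrative sketch (the helper names are made up, and 8 bits is an arbitrary width); addition is included as a contrast where the same rewrite would be wrong.

```python
import operator


def narrow(op, a, b, bits):
    """Result of `op` computed in a `bits`-wide unsigned type (wraps on overflow)."""
    return op(a, b) & ((1 << bits) - 1)


def zext_distributes(op, bits):
    """True if zext(op(a, b)) == op(zext(a), zext(b)) for all narrow a and b != 0.

    Python ints are unbounded, so op(a, b) models the wide computation on
    zero-extended operands.
    """
    return all(
        narrow(op, a, b, bits) == op(a, b)
        for a in range(1 << bits)
        for b in range(1, 1 << bits)
    )
```

For `operator.floordiv` and `operator.mod` the check passes, while `operator.add` fails because `255 + 1` wraps to 0 in 8 bits.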
Reviewers: sanjoy
Subscribers: jlebar, hiraditya, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D48338
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335197
91177308-0d34-0410-b5e6-
96231b3b80d8
Wolfgang Pieb [Wed, 20 Jun 2018 22:56:37 +0000 (22:56 +0000)]
[DWARF] Improved error reporting for range lists.
Errors found processing the DW_AT_ranges attribute are propagated by lower level
routines and reported by their callers.
Reviewer: JDevlieghere
Differential Revision: https://reviews.llvm.org/D48344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335188
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Dardis [Wed, 20 Jun 2018 22:40:12 +0000 (22:40 +0000)]
[mips] Add microMIPS specific addressing patterns.
These are identical but use microMIPS instructions instead of MIPS instructions.
Also, flatten the 'let AdditionalPredicates = [InMicroMips]' by using the
ISA_MICROMIPS adjective. Add tests for constant materialization.
Reviewers: atanasyan, abeserminji, smaksimovic
Differential Revision: https://reviews.llvm.org/D48275
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335185
91177308-0d34-0410-b5e6-
96231b3b80d8
Alina Sbirlea [Wed, 20 Jun 2018 22:01:04 +0000 (22:01 +0000)]
Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
Summary:
Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor
Prior to the patch:
1. MergeBasicBlockIntoOnlyPred
Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)
2. MergeBlockIntoPredecessor
Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instructions from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken
After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instructions from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken
Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:
1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
Updated in this patch. No challenges.
2. lib/CodeGen/CodeGenPrepare.cpp
Updated in this patch.
i. eliminateFallThrough is straightforward, but I added a temporary array to avoid iterator invalidation.
ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
Some interesting aspects:
- Since Pred is not deleted (BB is), the entry block does not need updating.
- The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
- isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
- eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
- Adding some test owners as subscribers for the interesting tests modified:
test/CodeGen/X86/avx-cmp.ll
test/CodeGen/AMDGPU/nested-loop-conditions.ll
test/CodeGen/AMDGPU/si-annotate-cf.ll
test/CodeGen/X86/hoist-spill.ll
test/CodeGen/X86/2006-11-17-IllegalMove.ll
3. lib/Transforms/Scalar/JumpThreading.cpp
Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.
Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar
Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits
Differential Revision: https://reviews.llvm.org/D48202
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335183
91177308-0d34-0410-b5e6-
96231b3b80d8
Bruno Cardoso Lopes [Wed, 20 Jun 2018 21:43:49 +0000 (21:43 +0000)]
Fix WasmEHFuncInfo.h to include what it uses
This fixes clang+llvm build with Modules and local submodule visibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335181
91177308-0d34-0410-b5e6-
96231b3b80d8
Alina Sbirlea [Wed, 20 Jun 2018 21:30:29 +0000 (21:30 +0000)]
[MemorySSA] Add convenience APIs in updater to avoid needing MSSA.
Summary:
Ideally passes should not need to pass MSSA around and do all updates through the updater.
Add convenience APIs to help with that.
Reviewers: george.burgess.iv
Subscribers: sanjoy, jlebar, llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D48334
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335179
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Dardis [Wed, 20 Jun 2018 21:25:50 +0000 (21:25 +0000)]
Remove myself from the release testers list. (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335178
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Wed, 20 Jun 2018 21:12:59 +0000 (21:12 +0000)]
[Dominators] Simplify child lists and make them deterministic
This fixes an extremely subtle non-determinism that can only be
triggered by an unfortunate alignment of passes. In my case:
- JumpThreading does large dominator tree updates
- CorrelatedValuePropagation preserves domtree now
- LICM codegen depends on the order of children on domtree nodes
The last part is non-deterministic if the update was stored in a set.
But it turns out that the set is completely unnecessary, updates are
deduplicated at an earlier stage so we can just use a vector, which is
both more efficient and doesn't destroy the input ordering.
I didn't manage to get the 240 MB IR file reduced enough, triggering
this bug requires a lot of jump threading, so landing this without a
test case.
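The ordering problem can be modelled without the 240 MB input. In the sketch below, a pointer-keyed set is imitated by ordering children by a made-up "address", which differs from run to run; the vector simply keeps the order in which updates arrived. All names are hypothetical.

```python
def children_via_set(updates, address_of):
    """Model of a pointer-keyed set: iteration order follows addresses."""
    return sorted(set(updates), key=address_of)


def children_via_vector(updates):
    """Model of a vector: iteration order is the (already deduplicated) input order."""
    return list(updates)


updates = ["bb.loop", "bb.exit", "bb.latch"]

# Two runs of the same compilation may place the same blocks at different addresses.
run1 = {"bb.loop": 0x1000, "bb.exit": 0x2000, "bb.latch": 0x3000}
run2 = {"bb.loop": 0x3000, "bb.exit": 0x1000, "bb.latch": 0x2000}
```

Any later pass that walks the children in order sees the same sequence on every run with the vector, but not with the address-ordered set.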
Differential Revision: https://reviews.llvm.org/D48392
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335176
91177308-0d34-0410-b5e6-
96231b3b80d8
Alina Sbirlea [Wed, 20 Jun 2018 21:06:13 +0000 (21:06 +0000)]
[MemorySSA] Verify Phi incoming blocks are block predecessors.
Summary: Make the MemorySSA verifier also check that all Phi incoming blocks are block predecessors.
Reviewers: george.burgess.iv
Subscribers: sanjoy, jlebar, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D48333
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335174
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 20 Jun 2018 21:05:02 +0000 (21:05 +0000)]
[X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel
I don't believe there is any real reason to have separate X86 specific opcodes for vector compares. Setcc has the same behavior; it just uses a different encoding for the condition code.
I had to change the CondCodeAction for SETLT and SETLE to prevent some transforms from changing SETGT lowering.
Differential Revision: https://reviews.llvm.org/D43608
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335173
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Wed, 20 Jun 2018 20:54:52 +0000 (20:54 +0000)]
[SLPVectorizer] Provide InstructionsState down the BoUpSLP vectorization call tree
As described in D48359, this patch pushes InstructionsState down the BoUpSLP call hierarchy instead of the corresponding raw OpValue. This makes it easier to track the alternate opcode etc. and avoids us having to call getAltOpcode which makes it difficult to support more than one alternate opcode.
Differential Revision: https://reviews.llvm.org/D48382
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335170
91177308-0d34-0410-b5e6-
96231b3b80d8
Stanislav Mekhanoshin [Wed, 20 Jun 2018 20:24:20 +0000 (20:24 +0000)]
Allow binop C1, (select cc, CF, CT) -> select folding
Previously this folding was done only if the select was the first operand.
However, for non-commutative operations the constant may come before the
select.
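A small model of the fold shows why operand order has to be preserved when the constant is on the left. This is an illustrative sketch on plain integers, not DAG combiner code; `select` and `fold_const_lhs` are hypothetical names.

```python
import operator


def select(cc, if_true, if_false):
    return if_true if cc else if_false


def fold_const_lhs(op, c1, cc, if_true, if_false):
    """binop C1, (select cc, T, F)  ->  select cc, (binop C1, T), (binop C1, F).

    The constant stays on the left-hand side in both arms, which matters
    for non-commutative operations such as subtraction.
    """
    return select(cc, op(c1, if_true), op(c1, if_false))
```

For `operator.sub` with `c1 = 10`, folding with the operands swapped would compute `T - 10` instead of `10 - T`, so the two cases cannot share one implementation blindly.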
Differential Revision: https://reviews.llvm.org/D48223
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335167
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 20 Jun 2018 20:16:45 +0000 (20:16 +0000)]
[InstCombine] fix typo in test comment; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335165
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Dardis [Wed, 20 Jun 2018 19:59:58 +0000 (19:59 +0000)]
[mips] Correct predicates for loads, bit manipulation instructions and some pseudos
Additionally, correct the definition of the rdhwr instruction.
Reviewers: atanasyan, abeserminji, smaksimovic
Differential Revision: https://reviews.llvm.org/D48216
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335162
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 20 Jun 2018 19:45:48 +0000 (19:45 +0000)]
AMDGPU: Fix scalar_to_vector for v4i16/v4f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335161
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 20 Jun 2018 19:45:40 +0000 (19:45 +0000)]
AMDGPU: Fix missing C++ mode comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335160
91177308-0d34-0410-b5e6-
96231b3b80d8
Krzysztof Parzyszek [Wed, 20 Jun 2018 19:22:27 +0000 (19:22 +0000)]
[Hexagon] Replace .ll test for expanding post-ra pseudos with .mir
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335158
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 20 Jun 2018 19:02:17 +0000 (19:02 +0000)]
[IR] add/use isIntDivRem convenience function
There are more existing potential users of this,
but I've limited this patch to the first couple
that I found to minimize typo risk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335157
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 20 Jun 2018 18:57:07 +0000 (18:57 +0000)]
[PM/LoopUnswitch] Support partial trivial unswitching.
The idea of partial unswitching is to take a *part* of a branch's
condition that is loop invariant and unswitch just that part. This
primarily makes sense with i1 conditions of branches as opposed to
switches. When dealing with i1 conditions, we can easily extract loop
invariant inputs to a branch and unswitch them to test them entirely
outside the loop.
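A source-level analogy for the trivial case: the branch exits the loop, and one input of its `or` condition is loop invariant. The sketch below is hand-written Python for illustration, not what the pass produces, and the function names are invented.

```python
def original(xs, stop):
    total = 0
    for x in xs:
        # `stop` is loop invariant; `x < 0` is not.
        if stop or x < 0:
            break
        total += x
    return total


def partially_unswitched(xs, stop):
    total = 0
    # The invariant part is tested once, outside the loop. No clone of the
    # loop is needed because taking the branch simply leaves the loop.
    if stop:
        return total
    for x in xs:
        if x < 0:
            break
        total += x
    return total
```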
As part of this, we now create much more significant cruft in the loop
body, so this relies on adding cleanup passes to the loop pipeline and
revisiting unswitched loops to do that cleanup before continuing to
process them.
This already appears to be more powerful at unswitching than the old
loop unswitch pass, and so I'd appreciate pretty careful review in case
I'm just missing some correctness checks. The `LIV-loop-condition` test
case is not unswitched by the old unswitch pass, but is with this pass.
Thanks to Sanjoy and Fedor for the review!
Differential Revision: https://reviews.llvm.org/D46706
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335156
91177308-0d34-0410-b5e6-
96231b3b80d8
Alex Bradbury [Wed, 20 Jun 2018 18:42:25 +0000 (18:42 +0000)]
[RISCV] Accept fmv.s.x and fmv.x.s as mnemonic aliases for fmv.w.x and fmv.x.w
These instructions were renamed in version 2.2 of the user-level ISA spec, but
the old name should also be accepted by standard tools.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335154
91177308-0d34-0410-b5e6-
96231b3b80d8
Jessica Paquette [Wed, 20 Jun 2018 18:41:11 +0000 (18:41 +0000)]
[MachineOutliner] Add debug info test for the outliner
The outliner emits debug info. Add a test that outlines a function
and uses llvm-dwarfdump to check the emitted DWARF for correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335153
91177308-0d34-0410-b5e6-
96231b3b80d8
Vedant Kumar [Wed, 20 Jun 2018 18:40:14 +0000 (18:40 +0000)]
[Local] Generalize insertReplacementDbgValues, NFC
This utility should operate on Values, not Instructions. While I'm here,
I've also made it possible to skip emitting replacement dbg.values for
certain debug users (by having RewriteExpr return nullptr).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335152
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 20 Jun 2018 17:48:43 +0000 (17:48 +0000)]
[InstCombine] add vector select of binops tests (PR37806)
These represent the most basic requested transform - a matching
operand and 2 constant operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335151
91177308-0d34-0410-b5e6-
96231b3b80d8
Florian Hahn [Wed, 20 Jun 2018 17:42:01 +0000 (17:42 +0000)]
[PredicateInfo] Order instructions in different BBs by DFSNumIn.
Using OrderedInstructions::dominates as comparator for instructions in
BBs without dominance relation can cause a non-deterministic order
between such instructions. That in turn can cause us to materialize
copies in a non-deterministic order. While this does not affect
correctness, it causes some minor non-determinism in the final generated
code, because values have slightly different labels.
Without this patch, running -print-predicateinfo on a reasonably large
module produces slightly different output on each run.
This patch uses the dominator tree's DFSInNum to order instructions from
different BBs, which should enforce a deterministic ordering and
guarantee that dominated instructions come after the instructions that
dominate them.
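The ordering can be sketched on a toy dominator tree. This is a simplified model: blocks are strings, instructions are `(block, index)` pairs, and `dfs_in_numbers` and `order_key` are hypothetical helpers rather than the PredicateInfo implementation.

```python
def dfs_in_numbers(dom_tree, root):
    """Assign each block the number at which a pre-order DFS first visits it."""
    numbers = {}
    stack = [root]
    while stack:
        block = stack.pop()
        numbers[block] = len(numbers)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(dom_tree.get(block, [])))
    return numbers


def order_key(numbers):
    """Total order: by DFS-in number of the block, then by position inside it."""
    return lambda inst: (numbers[inst[0]], inst[1])


dom_tree = {"entry": ["a", "b"], "a": ["a1"]}
numbers = dfs_in_numbers(dom_tree, "entry")
insts = [("b", 0), ("a1", 1), ("entry", 2), ("a", 0)]
ordered = sorted(insts, key=order_key(numbers))
```

Blocks `a1` and `b` have no dominance relation, yet the key still gives them a fixed relative order, and every block sorts after the blocks that dominate it.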
Reviewers: dberlin, efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D48230
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335150
91177308-0d34-0410-b5e6-
96231b3b80d8
Paul Robinson [Wed, 20 Jun 2018 17:08:46 +0000 (17:08 +0000)]
[DWARF] Don't keep a ref to possibly stack allocated data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335146
91177308-0d34-0410-b5e6-
96231b3b80d8