OSDN Git Service
Matthias Braun [Fri, 27 Jan 2017 18:53:05 +0000 (18:53 +0000)]
ScheduleDAGInstrs: Cleanup toggleKillFlag(); NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293323
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 27 Jan 2017 18:53:00 +0000 (18:53 +0000)]
ScheduleDAGInstrs: Cleanup; NFC
Comment, doxygen and a bit of whitespace cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293322
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Fri, 27 Jan 2017 18:41:14 +0000 (18:41 +0000)]
AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D29068
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293321
91177308-0d34-0410-b5e6-
96231b3b80d8
Konstantin Zhuravlyov [Fri, 27 Jan 2017 18:32:40 +0000 (18:32 +0000)]
[AMDGPU] Grab MCSubtargetInfo from TargetMachine instead of constructing it
Differential Revision: https://reviews.llvm.org/D29224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293318
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Ray [Fri, 27 Jan 2017 18:02:53 +0000 (18:02 +0000)]
[X86] Adding FFREEP instruction.
Summary: Small change to get the FFREEP instruction to decode properly.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29193
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293314
91177308-0d34-0410-b5e6-
96231b3b80d8
Anna Thomas [Fri, 27 Jan 2017 17:57:05 +0000 (17:57 +0000)]
NFC: Add debug tracing for more cases where loop unrolling fails.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293313
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 27 Jan 2017 17:42:26 +0000 (17:42 +0000)]
AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293310
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthew Simpson [Fri, 27 Jan 2017 17:33:16 +0000 (17:33 +0000)]
[ARM/AArch64] Relocate and update InterleavedAccessPass tests (NFC)
The interleaved access pass is an IR-to-IR transformation that runs before code
generation. It matches interleaved memory operations to target-specific
intrinsics (that are later lowered to load and store multiple instructions on
ARM/AArch64). We place tests for similar passes (e.g., GlobalMergePass) under
test/Transforms. This patch moves the InterleavedAccessPass tests out of
test/CodeGen and into target-specific directories under
test/Transforms/InterleavedAccess.
Although the pass is an IR pass, many of the existing tests were llc tests
rather than opt tests. For example, the tests would check for ldN/stN instructions
generated by llc rather than the intrinsic calls the pass actually inserts.
Thus, this patch updates all tests to be opt tests that check for the inserted
intrinsics. We already have separate CodeGen tests that ensure we lower the
interleaved access intrinsics to their corresponding ldN/stN instructions. In
addition to migrating the tests to opt, this patch also performs some minor
clean-up (to ensure consistent naming, etc.).
Differential Revision: https://reviews.llvm.org/D29184
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293309
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 27 Jan 2017 17:30:39 +0000 (17:30 +0000)]
NVPTX: Make NVPTXInferAddressSpaces preserve CFG
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293308
91177308-0d34-0410-b5e6-
96231b3b80d8
Jun Bum Lim [Fri, 27 Jan 2017 17:16:37 +0000 (17:16 +0000)]
[CodeGenPrep] No negative cost in the ExtLd promotion
Summary: This change prevents the signed cost value from becoming negative, because the value is passed as an unsigned argument.
Reviewers: mcrosier, jmolloy, qcolombet, javed.absar
Reviewed By: mcrosier, qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28871
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293307
91177308-0d34-0410-b5e6-
96231b3b80d8
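To illustrate the CodeGenPrep change above (r293307): a negative signed cost that is handed to an unsigned parameter wraps around to a very large value. The following is a minimal Python sketch, not the LLVM code; the 32-bit width, the helper names, and the clamp-to-zero strategy are all assumptions made for illustration.

```python
UNSIGNED_BITS = 32  # assumed width of the unsigned parameter


def as_unsigned(value, bits=UNSIGNED_BITS):
    """Model a C++ signed-to-unsigned conversion (reduction modulo 2**bits)."""
    return value & ((1 << bits) - 1)


def non_negative_cost(cost):
    """Hypothetical guard: never let the signed cost drop below zero."""
    return max(cost, 0)


# A cost of -1 wraps to the largest 32-bit value when treated as unsigned.
wrapped = as_unsigned(-1)
# Clamping first keeps the value meaningful after the conversion.
guarded = as_unsigned(non_negative_cost(-1))
```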
Stanislav Mekhanoshin [Fri, 27 Jan 2017 16:38:10 +0000 (16:38 +0000)]
[AMDGPU] Turn AMDGPUUnifyMetadata back into module pass
With the adjustPassManager interface that is now possible to use
custom early module passes.
Differential Revision: https://reviews.llvm.org/D29189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293300
91177308-0d34-0410-b5e6-
96231b3b80d8
Mehdi Amini [Fri, 27 Jan 2017 16:12:22 +0000 (16:12 +0000)]
Fix BasicAA incorrect assumption on GEP
This is fixing pr31761: BasicAA is deducing NoAlias
on the result of the GEP if the base pointer is itself NoAlias.
This is possible only if the NoAlias on the base pointer is
deduced with a non-sized query: this should guarantee that
the pointers belong to different memory allocations
and that the GEP can't legally jump from one to another.
Differential Revision: https://reviews.llvm.org/D29216
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293293
91177308-0d34-0410-b5e6-
96231b3b80d8
Ivan Krasin [Fri, 27 Jan 2017 15:54:49 +0000 (15:54 +0000)]
Avoid using unspecified ordering in MetadataLoader::MetadataLoaderImpl::parseOneMetadata.
Summary:
MetadataLoader::MetadataLoaderImpl::parseOneMetadata uses
the following construct in a number of places:
```
MetadataList.assignValue(<...>, NextMetadataNo++);
```
There, NextMetadataNo gets incremented, and since the order
of arguments evaluation is not specified, that can happen
before or after other arguments are evaluated.
In a few cases the other arguments indirectly use NextMetadataNo.
For instance:
```
MetadataList.assignValue(
GET_OR_DISTINCT(DIModule,
(Context, getMDOrNull(Record[1]),
getMDString(Record[2]), getMDString(Record[3]),
getMDString(Record[4]), getMDString(Record[5]))),
NextMetadataNo++);
```
getMDOrNull calls getMD that uses NextMetadataNo:
```
MetadataList.getMetadataFwdRef(NextMetadataNo);
```
Therefore, the order of evaluation becomes important. That caused
a very subtle LLD crash that only happens if compiled with GCC or
if LLD is built with LTO. When LLD is compiled with Clang in
regular linking mode, everything worked as intended.
This change moves the increment of NextMetadataNo out of
the argument list to guarantee the correct order of evaluation.
For the record, this has taken 3 days to track to the origin. It all
started with a ThinLTO bot in Chrome not being able to link a target
if debug info is enabled.
Reviewers: pcc, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D29204
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293291
91177308-0d34-0410-b5e6-
96231b3b80d8
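The evaluation-order hazard described in r293291 above can be modelled as follows. Python always evaluates arguments left to right, so this sketch spells out the two orders a C++ compiler may choose as separate functions. The class and function names are hypothetical stand-ins, not the MetadataLoader API.

```python
class Loader:
    """Toy loader with a counter that plays the role of NextMetadataNo."""

    def __init__(self):
        self.next_no = 0

    def get_fwd_ref(self):
        # Stand-in for getMD(): the result depends on the current counter.
        return self.next_no


def assign_increment_first(loader):
    """One legal order: the `NextMetadataNo++` argument is evaluated first."""
    slot = loader.next_no
    loader.next_no += 1
    ref = loader.get_fwd_ref()  # already sees the incremented counter
    return ref, slot


def assign_increment_last(loader):
    """The other legal order, and what hoisting the increment guarantees."""
    ref = loader.get_fwd_ref()  # sees the counter before the increment
    slot = loader.next_no
    loader.next_no += 1
    return ref, slot


first = assign_increment_first(Loader())  # (1, 0)
last = assign_increment_last(Loader())    # (0, 0)
```

The two functions disagree on the forward reference, which is the kind of compiler-dependent difference the commit message describes.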
Simon Dardis [Fri, 27 Jan 2017 11:36:52 +0000 (11:36 +0000)]
[mips] Recommit: "N64 static relocation model support"
This patch makes one change to GOT handling and two changes to N64's
relocation model handling. Furthermore, the jumptable encodings have
been corrected for static N64.
Big GOT handling is now done via a new SDNode MipsGotHi - this node is
unconditionally lowered to an lui instruction.
The first change to N64's relocation handling is the lifting of the
restriction that N64 always uses PIC. Now it is possible to target static
environments.
The second change adds support for 64 bit symbols and enables them by
default. Previously N64 had patterns for sym32 mode only. In this mode all
symbols are assumed to have 32 bit addresses. sym32 mode support
is selectable with attribute 'sym32'. A follow on patch for clang will
add the necessary frontend parameter.
This partially resolves PR/23485.
Thanks to Brooks Davis for reporting the issue!
This version corrects a "Conditional jump or move depends on uninitialised
value(s)" error detected by valgrind present in the original commit.
Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris
Differential Revision: https://reviews.llvm.org/D23652
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293279
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Bataev [Fri, 27 Jan 2017 10:54:04 +0000 (10:54 +0000)]
[SLP] Refactoring of horizontal reduction analysis, NFC.
Some checks in SLP horizontal reduction analysis function are performed
several times, though it is enough to perform these checks only once
during an initial attempt at adding a candidate for the reduction
instruction/reduced value.
Differential Revision: https://reviews.llvm.org/D29175
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293274
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 27 Jan 2017 10:27:32 +0000 (10:27 +0000)]
[LICM] When we are recomputing the alias sets for a subloop, we cannot
skip sub-subloops.
The logic to skip subloops dated from when this code was shared with the
cached case. Once it was factored out to only run in the case of
recomputed subloops it became a dangerous bug. If a subsubloop contained
an interfering instruction it would be silently skipped from the alias
sets for LICM.
With the old pass manager this was extremely hard to trigger as it would
require failing to visit these subloops with the LICM pass but then
visiting the outer loop somehow. I've not yet contrived any test case
that actually manages to trigger this.
But with the new pass manager we don't do the cross-loop caching hack
that the old PM does and so we recompute alias set information from
first principles. While this seems much cleaner and simpler it exposed
this bug and would subtly miscompile code due to failing to correctly
model the aliasing constraints of deeply nested loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293273
91177308-0d34-0410-b5e6-
96231b3b80d8
Jonas Paulsson [Fri, 27 Jan 2017 07:46:26 +0000 (07:46 +0000)]
[DAGTypeLegalizer] Handle SIGN/ZERO_EXTEND in WidenVecRes_Convert().
In case of a SIGN/ZERO_EXTEND of an incomplete vector type (using only a
partial number of available vector elements), WidenVecRes_Convert() used to
resort to scalarization.
This patch adds handling of the (common) case where an input vector of the
same width as the widened result vector can be found, by converting the node to
SIGN/ZERO_EXTEND_VECTOR_INREG.
Review: Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293268
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Fri, 27 Jan 2017 06:39:09 +0000 (06:39 +0000)]
[opt-viewer] Introduce global context
This is necessary since globals (max_hotness, caller_loc) need to be
explicitly passed to the subprocesses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293266
91177308-0d34-0410-b5e6-
96231b3b80d8
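Why the context has to be passed explicitly (r293266 above) can be sketched as follows: a worker process does not reliably see module-level state that was changed in the parent, so the values are bundled and handed to each task. This is a Python 3 sketch under assumptions; `relative_hotness`, `render_all`, and the dict layout are hypothetical and not taken from opt-viewer.

```python
from functools import partial
from multiprocessing import Pool


def relative_hotness(hotness, context):
    """Scale a hotness value against the maximum carried in the context."""
    max_hotness = context["max_hotness"]
    if max_hotness == 0:
        return 0.0
    return 100.0 * hotness / max_hotness


def render_all(hotness_values, context, jobs=2):
    """Map over the values in worker processes, binding the context per task."""
    worker = partial(relative_hotness, context=context)
    with Pool(jobs) as pool:
        return pool.map(worker, hotness_values)
```

Calling `render_all([5, 10], {"max_hotness": 10})` from a script's `if __name__ == "__main__":` block would distribute the work across two processes.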
Adam Nemet [Fri, 27 Jan 2017 06:39:08 +0000 (06:39 +0000)]
[opt-viewer] Remove message from the key
This is causing problems because the rendering of the text will depend on
varying global state to show relative hotness or a link in the inlining
context.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293265
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Fri, 27 Jan 2017 06:39:06 +0000 (06:39 +0000)]
[opt-viewer] Unique across the different jobs as well
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293264
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Fri, 27 Jan 2017 06:39:02 +0000 (06:39 +0000)]
[opt-viewer] Make sorting for the index page deterministic
Break the tie between entries with identical hotness deterministically.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293263
91177308-0d34-0410-b5e6-
96231b3b80d8
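A deterministic tie-break of the kind r293263 describes could look like the sketch below. The choice of file, line, and name as secondary sort keys is an assumption for illustration; the commit message does not say which fields opt-viewer uses.

```python
def sort_for_index(remarks):
    """Sort hottest first; break ties on (file, line, name) so order is stable."""
    return sorted(
        remarks,
        key=lambda r: (-r["hotness"], r["file"], r["line"], r["name"]),
    )


a = {"hotness": 10, "file": "a.c", "line": 3, "name": "Inlined"}
b = {"hotness": 10, "file": "b.c", "line": 1, "name": "Inlined"}
c = {"hotness": 50, "file": "z.c", "line": 9, "name": "Vectorized"}

# The result does not depend on the order in which entries arrive.
order_one = sort_for_index([a, b, c])
order_two = sort_for_index([c, b, a])
```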
Adam Nemet [Fri, 27 Jan 2017 06:39:01 +0000 (06:39 +0000)]
[opt-viewer] Include the function in the remark key
Avoid uniquing remarks with a different inlining context (Function).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293262
91177308-0d34-0410-b5e6-
96231b3b80d8
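The effect of adding the function to the key (r293262 above) can be sketched as follows. The field names and the exact key layout are hypothetical. The sketch also leaves the message out of the key, in line with r293265 listed earlier in this log.

```python
def remark_key(remark):
    """Identity of a remark: location plus the enclosing function."""
    return (
        remark["pass"],
        remark["name"],
        remark["file"],
        remark["line"],
        remark["function"],
    )


def unique_remarks(remarks):
    """Keep the first remark seen for each key, preserving input order."""
    seen = {}
    for remark in remarks:
        seen.setdefault(remark_key(remark), remark)
    return list(seen.values())


base = {"pass": "inline", "name": "Inlined", "file": "a.c", "line": 7,
        "function": "caller1", "message": "foo inlined into caller1"}
other_context = dict(base, function="caller2")
same_with_new_text = dict(base, message="rendered differently")

kept = unique_remarks([base, other_context, same_with_new_text])
```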
Adam Nemet [Fri, 27 Jan 2017 06:38:31 +0000 (06:38 +0000)]
[opt-viewer] Put critical items in parallel
Summary:
Put opt-viewer critical items in parallel
Patch by Brian Cain!
Requires features from Python 2.7
**Performance**
Below are performance results across various configurations. These were taken on an i5-5200U (dual core + HT). They were taken with a small subset of the YAML output of building Python 3.6.0b3 with LTO+PGO. 60 YAML files.
"multiprocessing" is the current submission contents. "baseline" is as of
544f14c6b2a07a94168df31833dba9dc35fd8289 (I think this is aka r287505).
"ImportError" vs "class<...CLoader>" below are just confirming the expected configuration (with/without CLoader).
The below was measured on AMD A8-5500B (4 cores) with 224 input YAML files, showing a ~1.75x speed increase over the baseline with libYAML. I suspect it would scale well on high-end servers.
```
**************************************** MULTIPROCESSING ****************************************
PyYAML:
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: cannot import name CLoader
Python 2.7.10
489.42user 5.53system 2:38.03elapsed 313%CPU (0avgtext+0avgdata 400308maxresident)k
0inputs+31392outputs (0major+473540minor)pagefaults 0swaps
PyYAML+libYAML:
<class 'yaml.cyaml.CLoader'>
Python 2.7.10
78.69user 5.45system 0:32.63elapsed 257%CPU (0avgtext+0avgdata 398560maxresident)k
0inputs+31392outputs (0major+542022minor)pagefaults 0swaps
PyPy/PyYAML:
Traceback (most recent call last):
File "<builtin>/app_main.py", line 75, in run_toplevel
File "<builtin>/app_main.py", line 601, in run_it
File "<string>", line 1, in <module>
ImportError: cannot import name 'CLoader'
Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
[PyPy 2.6.0 with GCC 4.9.3]
154.27user 8.12system 0:53.83elapsed 301%CPU (0avgtext+0avgdata 627960maxresident)k
808inputs+30376outputs (0major+727994minor)pagefaults 0swaps
**************************************** BASELINE ****************************************
PyYAML:
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: cannot import name CLoader
Python 2.7.10
358.08user 4.05system 6:08.37elapsed 98%CPU (0avgtext+0avgdata 315004maxresident)k
0inputs+31392outputs (0major+85252minor)pagefaults 0swaps
PyYAML+libYAML:
<class 'yaml.cyaml.CLoader'>
Python 2.7.10
50.32user 3.30system 0:56.59elapsed 94%CPU (0avgtext+0avgdata 307296maxresident)k
0inputs+31392outputs (0major+79335minor)pagefaults 0swaps
PyPy/PyYAML:
Traceback (most recent call last):
File "<builtin>/app_main.py", line 75, in run_toplevel
File "<builtin>/app_main.py", line 601, in run_it
File "<string>", line 1, in <module>
ImportError: cannot import name 'CLoader'
Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
[PyPy 2.6.0 with GCC 4.9.3]
72.94user 5.18system 1:23.41elapsed 93%CPU (0avgtext+0avgdata 455312maxresident)k
0inputs+30392outputs (0major+110280minor)pagefaults 0swaps
```
Reviewers: fhahn, anemet
Reviewed By: anemet
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D26967
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293261
91177308-0d34-0410-b5e6-
96231b3b80d8
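The elapsed times in the listing above can be turned into speedup ratios with a few lines of arithmetic. The sketch below copies the six elapsed values from that listing; the helper names are illustrative. Note that the ~1.75x figure quoted in the summary refers to the separate AMD A8-5500B measurement, whose raw numbers are not part of the listing.

```python
def elapsed_seconds(text):
    """Convert a time(1)-style 'M:SS.ss' elapsed value to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# (baseline elapsed, multiprocessing elapsed), copied from the listing.
ELAPSED = {
    "PyYAML": ("6:08.37", "2:38.03"),
    "PyYAML+libYAML": ("0:56.59", "0:32.63"),
    "PyPy/PyYAML": ("1:23.41", "0:53.83"),
}


def speedups(table=ELAPSED):
    """Return baseline / multiprocessing wall-clock ratio per configuration."""
    return {
        name: elapsed_seconds(base) / elapsed_seconds(multi)
        for name, (base, multi) in table.items()
    }


ratios = speedups()
```

On these numbers the ratios come out to roughly 2.33x (PyYAML), 1.73x (PyYAML+libYAML), and 1.55x (PyPy/PyYAML).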
Richard Trieu [Fri, 27 Jan 2017 06:06:05 +0000 (06:06 +0000)]
Fix unused variable warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293260
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Fri, 27 Jan 2017 03:41:53 +0000 (03:41 +0000)]
ARM: fix vectorized division on WoA
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks. However, it is
designed to only handle non-vectorized division. ARM has custom
lowering for vectorized division as that can avoid loading registers
with the values and invoking a division routine for each one, preferring
to lower using NEON instructions. Fall back to the custom lowering for
the NEON instructions if we encounter a vectorized division.
Resolves PR31778!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293259
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Berlin [Fri, 27 Jan 2017 02:37:11 +0000 (02:37 +0000)]
NewGVN: Add basic dead and redundant store elimination
Summary:
This adds basic dead and redundant store elimination to
NewGVN. Unlike our current DSE, it will happily do cross-block DSE if
it meets our requirements.
We get a bunch of DSE's simple.ll cases, and some stuff it doesn't.
Unlike DSE, however, we only try to eliminate stores of the same value
to the same memory location, not just general stores to the same
memory location.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29149
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293258
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Fri, 27 Jan 2017 02:11:10 +0000 (02:11 +0000)]
NVPTXCodeGen: Add IPO to libdeps, since r293189.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293256
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Shen [Fri, 27 Jan 2017 02:11:07 +0000 (02:11 +0000)]
[APFloat] Reduce some dispatch boilerplates. NFC.
Summary: This is an attempt to reduce the verbose manual dispatching code in APFloat. This doesn't handle multiple dispatch on a single discriminator (e.g. APFloat::add(const APFloat&)), nor does it handle multiple dispatch on multiple discriminators (e.g. APFloat::convert()).
Reviewers: hfinkel, echristo, jlebar
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D29161
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293255
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Lebar [Fri, 27 Jan 2017 02:04:07 +0000 (02:04 +0000)]
[NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293253
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Lebar [Fri, 27 Jan 2017 01:49:39 +0000 (01:49 +0000)]
[NVPTX] Fix use-after-stack-free bug in InstCombineCalls.
Introduced in r293244.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293251
91177308-0d34-0410-b5e6-
96231b3b80d8
Xin Tong [Fri, 27 Jan 2017 01:42:20 +0000 (01:42 +0000)]
Constant fold switch inst when looking for trivial conditions to unswitch on.
Summary: Constant fold switch inst when looking for trivial conditions to unswitch on.
Reviewers: sanjoy, chenli, hfinkel, efriedma
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D29037
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293250
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 27 Jan 2017 01:32:26 +0000 (01:32 +0000)]
[PM] Port LoopLoadElimination to the new pass manager and wire it into
the main pipeline.
This is a very straightforward port. Nothing weird or surprising.
This brings the number of missing passes from the new PM's pipeline down
to three.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293249
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 27 Jan 2017 01:30:46 +0000 (01:30 +0000)]
[ARM][LegalizerInfo] Specify the type of the opcode.
This is to fix the win7 bot that does not seem to be very
good at inferring the type when it gets used in an initializer list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293248
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 27 Jan 2017 01:13:30 +0000 (01:13 +0000)]
[AArch64][LegalizerInfo] Specify the type of the opcode.
This is an attempt to fix the win7 bot that does not seem to be very
good at inferring the type when it gets used in an initializer list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293246
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 27 Jan 2017 01:13:25 +0000 (01:13 +0000)]
Revert "[AArch64][LegalizerInfo] Specify the type of the initialization list."
This reverts commit r293238.
Even with that the win7 bot is still failing:
http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/3862
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293245
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Lebar [Fri, 27 Jan 2017 00:58:58 +0000 (00:58 +0000)]
[NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.
Summary:
There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.
For example, if flush denormals to zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. On the other hand, if ftz is
disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a
non-ftz PTX instruction. In this case, we can, however, simplify the
non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.
These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.
Reviewers: tra
Subscribers: hfinkel, majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D28794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293244
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Lebar [Fri, 27 Jan 2017 00:58:34 +0000 (00:58 +0000)]
[ValueTracking] Add comment that CannotBeOrderedLessThanZero does the wrong thing for powi.
Summary:
CannotBeOrderedLessThanZero(powi(x, exp)) returns true if
CannotBeOrderedLessThanZero(x). But powi(-0, exp) is negative if exp is
odd, so we actually want to return SignBitMustBeZero(x).
Except that also isn't right, because we want to return true if x is
NaN, even if x has a negative sign bit.
What we really need in order to fix this is a consistent approach in
this function to handling the sign bit of NaNs. Without this it's very
difficult to say what the correct behavior here is.
Reviewers: hfinkel, efriedma, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28927
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293243
91177308-0d34-0410-b5e6-
96231b3b80d8
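The powi corner case from r293243 above can be checked numerically. In this sketch `powi` is a hypothetical stand-in implemented by repeated multiplication for non-negative exponents, not the LLVM intrinsic. It shows that an odd power of -0.0 carries a set sign bit even though the value does not compare less than zero.

```python
import math


def powi(x, exp):
    """Stand-in for powi with exp >= 0: multiply x into the result exp times."""
    result = 1.0
    for _ in range(exp):
        result *= x
    return result


def sign_bit_set(x):
    """True when the IEEE sign bit of x is set, including for -0.0."""
    return math.copysign(1.0, x) < 0.0


odd = powi(-0.0, 3)   # -0.0: sign bit set
even = powi(-0.0, 2)  # 0.0: sign bit clear
```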
Justin Lebar [Fri, 27 Jan 2017 00:58:03 +0000 (00:58 +0000)]
[LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.
Summary:
Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute
sqrt(x), check if x is negative, and select NaN if it is:
%cmp = fcmp olt double %a, -0.000000e+00
%sqrt = call double @llvm.sqrt.f64(double %a)
%ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt
This is technically UB as the LangRef is written today if %a is ever less than
-0. But emitting code that's compliant with the current definition of sqrt
would require a branch, which would then prevent us from matching this idiom in
SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative
inputs), because SelectionDAG looks at one BB at a time.
Nothing in LLVM takes advantage of this undefined behavior, as far as we can
tell, and the fact that llvm.sqrt has UB dates from its initial addition to the
LangRef.
Reviewers: arsenm, mehdi_amini, hfinkel
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D28797
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293242
91177308-0d34-0410-b5e6-
96231b3b80d8
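The speculate-and-select idiom from r293242 above can be modelled in a few lines. This is only a sketch: `speculated_sqrt` is a hypothetical model of a sqrt whose result for negative inputs is arbitrary (here an arbitrary placeholder value stands in for undef), and the select is written as a Python conditional expression.

```python
import math

NAN = float("nan")


def speculated_sqrt(x):
    """Model: well-defined for x >= 0 (and NaN); arbitrary value otherwise."""
    if x < 0.0:
        return -1.0  # arbitrary placeholder standing in for 'undef'
    return math.sqrt(x)


def sqrt_or_nan(x):
    """Compute sqrt unconditionally, then select NaN when x is negative."""
    speculated = speculated_sqrt(x)  # evaluated before the comparison is used
    return NAN if x < -0.0 else speculated
```

Because the arbitrary value is never selected when x is negative, the idiom is harmless as long as the speculated call merely yields an unspecified value rather than undefined behavior.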
Chandler Carruth [Fri, 27 Jan 2017 00:50:21 +0000 (00:50 +0000)]
[PM] Flesh out almost all of the late loop passes.
With this the per-module pass pipeline is *extremely* close to the
legacy PM. The missing pieces are:
- PruneEH (or some equivalent)
- ArgumentPromotion
- LoopLoadElimination
- LoopUnswitch
I'm going to work through those in essentially that order but this seems
like a worthwhile incremental step toward the end state.
One difference in what I have here from the legacy PM is that I've
consolidated some of the per-function passes at the very end of the
pipeline into the main optimization function pipeline. The intervening
passes are *really* uninteresting and so this seems very unlikely to have
any effect other than minor improvement to locality.
Note that there are still some failures in the test suite, but the
compiler doesn't crash or assert.
Differential Revision: https://reviews.llvm.org/D29114
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293241
91177308-0d34-0410-b5e6-
96231b3b80d8
Kostya Serebryany [Fri, 27 Jan 2017 00:39:12 +0000 (00:39 +0000)]
[libFuzzer] simplify the value profiling callback further: don't use (idx MOD prime) on the hot path where it is useless anyway
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293239
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 27 Jan 2017 00:39:03 +0000 (00:39 +0000)]
[AArch64][LegalizerInfo] Specify the type of the initialization list.
This is an attempt to fix the win7 bot that does not seem to be very
good at inferring the type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293238
91177308-0d34-0410-b5e6-
96231b3b80d8
Kostya Serebryany [Fri, 27 Jan 2017 00:20:55 +0000 (00:20 +0000)]
[libFuzzer] make sure (again) that __builtin_popcountl is compiled into popcnt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293237
91177308-0d34-0410-b5e6-
96231b3b80d8
Kostya Serebryany [Fri, 27 Jan 2017 00:09:59 +0000 (00:09 +0000)]
[libFuzzer] simplify the value profile code and disable asan/msan on it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293236
91177308-0d34-0410-b5e6-
96231b3b80d8
Adrian McCarthy [Fri, 27 Jan 2017 00:01:55 +0000 (00:01 +0000)]
NFC: Rename PDB_ReaderType::Raw to Native for consistency with the NativeSession rename.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293235
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Thu, 26 Jan 2017 23:53:31 +0000 (23:53 +0000)]
Switch the default for building GlobalISel.
Now, GlobalISel will be built by default. To turn that off, one has to
use -DLLVM_BUILD_GLOBAL_ISEL=OFF on the cmake command line.
<rdar://problem/30004433>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293232
91177308-0d34-0410-b5e6-
96231b3b80d8
Yichao Yu [Thu, 26 Jan 2017 23:50:18 +0000 (23:50 +0000)]
CMake is funky on detecting Intel 17 as GCC compatible.
Summary: This adds a fallback in case the Intel compiler is not detected correctly.
Reviewers: chapuni
Reviewed By: chapuni
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D27610
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293230
91177308-0d34-0410-b5e6-
96231b3b80d8
Eugene Zelenko [Thu, 26 Jan 2017 23:40:06 +0000 (23:40 +0000)]
[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293229
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Thu, 26 Jan 2017 23:39:14 +0000 (23:39 +0000)]
GlobalISel: support debug intrinsics.
The translation scheme is mostly cribbed from FastISel, and it's not entirely
convincing semantically. But it does seem to work in the common cases and allow
variables to be printed so it can't be all wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293228
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Thu, 26 Jan 2017 23:38:11 +0000 (23:38 +0000)]
Revert a couple of InstCombine/Guard checkins
This change reverts:
r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
r293058: "[InstCombine] Canonicalize guards for AND condition"
They miscompile cases like:
```
declare void @llvm.experimental.guard(i1, ...)
define void @test_guard_not_or(i1 %A, i1 %B) {
%C = or i1 %A, %B
%D = xor i1 %C, true
call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
ret void
}
```
because they do not transfer the `i32 20, i32 30` parameters to newly
created guard instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293227
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Kaylor [Thu, 26 Jan 2017 23:27:59 +0000 (23:27 +0000)]
Add intrinsics for constrained floating point operations
This commit introduces a set of experimental intrinsics intended to prevent
optimizations that make assumptions about the rounding mode and floating point
exception behavior. These intrinsics will later be extended to specify
flush-to-zero behavior. More work is also required to model instruction
dependencies in machine code and to generate these instructions from clang
(when required by pragmas and/or command line options that are not currently
supported).
Differential Revision: https://reviews.llvm.org/D27028
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293226
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 26 Jan 2017 23:21:17 +0000 (23:21 +0000)]
[PM] Enable the main loop pass pipelines with everything but
loop-unswitch in the main pipelines for the new PM.
All of these now work, and Clang built using this pipeline can build the
test suite and SPEC without hitting any asserts or ASan failures.
There are still some bugs hiding though -- 7 tests regress with the new
PM. I'm going to be investigating these, but it seems worthwhile to at
least get the pipelines in place so that others can play with them, and
they aren't completely broken.
Differential Revision: https://reviews.llvm.org/D29113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293225
91177308-0d34-0410-b5e6-
96231b3b80d8
Davide Italiano [Thu, 26 Jan 2017 23:12:53 +0000 (23:12 +0000)]
[obj2yaml] Produce correct output for invalid relocations.
R_X86_64_NONE can be emitted without a symbol associated (well,
in theory it should never be emitted in an ABI-compliant relocatable
object). So, if there's no symbol associated to a reloc, emit one
with an empty name, instead of crashing.
Ack'ed by Michael Spencer offline.
PR: 31768
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293224
91177308-0d34-0410-b5e6-
96231b3b80d8
Krzysztof Parzyszek [Thu, 26 Jan 2017 23:03:22 +0000 (23:03 +0000)]
[Hexagon] Require IPO library in Hexagon build
This should unbreak the Hexagon build bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293221
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Berlin [Thu, 26 Jan 2017 22:21:48 +0000 (22:21 +0000)]
NewGVN: Fix bug exposed by PR31761
Summary:
This does not actually fix the testcase in PR31761 (discussion is
ongoing on the testcase), but does fix a bug it exposes, where stores
were not properly clobbering loads.
We accomplish this by unifying the memory equivalence infrastructure
back into the normal congruence infrastructure, and then properly
destroying congruence classes when memory state leaders disappear.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29195
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293216
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Thu, 26 Jan 2017 22:08:10 +0000 (22:08 +0000)]
[InstCombine] fold (X >>u C) << C --> X & (-1 << C)
We already have this fold when the lshr has one use, but it doesn't need that
restriction. We may be able to remove some code from foldShiftedShift().
Also, move the similar:
(X << C) >>u C --> X & (-1 >>u C)
...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().
That whole function seems questionable since it is called by commonShiftTransforms(),
but there's really not much in common if we're checking the shift opcodes for every
fold.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293215
91177308-0d34-0410-b5e6-
96231b3b80d8
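Both shift folds mentioned in r293215 above can be verified exhaustively on a small bit width. The sketch below models 8-bit unsigned values in Python; the width and helper names are assumptions chosen to keep the check cheap, and the check says nothing about how InstCombine implements the fold.

```python
BITS = 8
MASK = (1 << BITS) - 1  # all-ones value, i.e. -1 at this width


def lshr(x, c):
    """Logical shift right on a BITS-wide value."""
    return (x & MASK) >> c


def shl(x, c):
    """Shift left, discarding bits that leave the BITS-wide value."""
    return (x << c) & MASK


def fold_lshr_then_shl(x, c):
    """(X >>u C) << C  -->  X & (-1 << C)"""
    return x & shl(MASK, c)


def fold_shl_then_lshr(x, c):
    """(X << C) >>u C  -->  X & (-1 >>u C)"""
    return x & lshr(MASK, c)


def folds_hold():
    """Compare the original and folded forms for every value and shift."""
    for x in range(1 << BITS):
        for c in range(BITS):
            if shl(lshr(x, c), c) != fold_lshr_then_shl(x, c):
                return False
            if lshr(shl(x, c), c) != fold_shl_then_lshr(x, c):
                return False
    return True
```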
Ahmed Bougacha [Thu, 26 Jan 2017 22:07:37 +0000 (22:07 +0000)]
[GlobalISel] Remove duplicate function using variadic templates. NFC.
I think the initial version of r293172 was trying:
std::forward<Args...>(args)...
which doesn't compile. This seems like the correct way:
std::forward<Args>(args)...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293214
91177308-0d34-0410-b5e6-
96231b3b80d8
Krzysztof Parzyszek [Thu, 26 Jan 2017 21:41:10 +0000 (21:41 +0000)]
[Hexagon] Add Hexagon-specific loop idiom recognition pass
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293213
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Berlin [Thu, 26 Jan 2017 21:39:49 +0000 (21:39 +0000)]
NewGVN: Add algorithm overview
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293212
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Thu, 26 Jan 2017 20:52:27 +0000 (20:52 +0000)]
[InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293208
91177308-0d34-0410-b5e6-
96231b3b80d8
Zvi Rackover [Thu, 26 Jan 2017 20:29:15 +0000 (20:29 +0000)]
[Doc][LangRef] Fix typo-ish error in description of Masked Gather
Summary: Fix the example of equivalent expansion for when mask is all ones.
Reviewers: delena
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29179
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293206
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Thu, 26 Jan 2017 20:10:55 +0000 (20:10 +0000)]
[InstCombine] add tests for shift-shift folds; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293205
91177308-0d34-0410-b5e6-
96231b3b80d8
Balaram Makam [Thu, 26 Jan 2017 20:10:41 +0000 (20:10 +0000)]
[AArch64] Refine Kryo Machine Model
Summary: Refine floating point SQRT and DIV with accurate latency information.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D29191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293204
91177308-0d34-0410-b5e6-
96231b3b80d8
Kyle Butt [Thu, 26 Jan 2017 20:02:47 +0000 (20:02 +0000)]
[IfConversion] Use reverse_iterator to simplify. NFC
This simplifies skipping debug instructions and shrinking ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293202
91177308-0d34-0410-b5e6-
96231b3b80d8
Sean Fertile [Thu, 26 Jan 2017 18:59:15 +0000 (18:59 +0000)]
[PPC] cleanup of mayLoad/mayStore flags and memory operands.
1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store
instructions.
2) Updated the flags on a number of intrinsics indicating that they write
memory.
3) Added SDNPMemOperand flags for some target dependent SDNodes so that they
propagate their memory operand
Review: https://reviews.llvm.org/D28818
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293200
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Berlin [Thu, 26 Jan 2017 18:49:03 +0000 (18:49 +0000)]
NewGVN: Fix output of pr31578 testcase now that we mark unreachable blocks as unreachable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293198
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Berlin [Thu, 26 Jan 2017 18:30:29 +0000 (18:30 +0000)]
NewGVN: Make unreachable blocks be marked with unreachable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293196
91177308-0d34-0410-b5e6-
96231b3b80d8
Stanislav Mekhanoshin [Thu, 26 Jan 2017 16:49:08 +0000 (16:49 +0000)]
Replace addEarlyAsPossiblePasses callback with adjustPassManager
This change introduces adjustPassManager target callback giving a
target an opportunity to tweak PassManagerBuilder before pass
managers are populated.
This generalizes and replaces addEarlyAsPossiblePasses target
callback. In particular that can be used to add custom passes to
extension points other than EP_EarlyAsPossible.
Differential Revision: https://reviews.llvm.org/D28336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293189
91177308-0d34-0410-b5e6-
96231b3b80d8
Nirav Dave [Thu, 26 Jan 2017 16:46:13 +0000 (16:46 +0000)]
Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r293184 which is failing in LTO builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293188
91177308-0d34-0410-b5e6-
96231b3b80d8
Serge Rogatch [Thu, 26 Jan 2017 16:17:03 +0000 (16:17 +0000)]
[XRay][Arm32] Reduce the portion of the stub and implement more staging for tail calls - in LLVM
Summary:
This patch provides more staging for tail calls in XRay Arm32. When the logging part of XRay is ready for tail calls, its support in the core part of XRay Arm32 may be as easy as changing the number passed to the handler from 1 to 2.
Coupled patch:
- https://reviews.llvm.org/D28674
Reviewers: dberris, rengolin
Reviewed By: dberris
Subscribers: llvm-commits, iid_iunknown, aemerson, rengolin, dberris
Differential Revision: https://reviews.llvm.org/D28673
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293185
91177308-0d34-0410-b5e6-
96231b3b80d8
Nirav Dave [Thu, 26 Jan 2017 16:02:24 +0000 (16:02 +0000)]
In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This is cleaner
as it separates non-interfering loads/stores from the
store-merging logic.
When merging stores, search up the chain through a single load, and
find all possible stores by looking down through a load and a
TokenFactor to all stores visited.
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly construct
wider loads, but then promote them to float operations which appear
to require more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyfromReg
nodes as these are captured by data dependence
6. Forward loads-store values through tokenfactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)
8. Store merging for the ARM target is restricted to 32-bit as
in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293184
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Thu, 26 Jan 2017 15:02:31 +0000 (15:02 +0000)]
Use shouldAssumeDSOLocal in classifyGlobalReference.
And teach shouldAssumeDSOLocal that ppc has no copy relocations.
The resulting code handles a few more cases than before. For example, it
knows that a weak symbol can be resolved to another .o file, but it
will still be in the main executable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293180
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 26 Jan 2017 14:31:12 +0000 (14:31 +0000)]
[X86][SSE] Add support for combining ANDNP byte masks with target shuffles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293178
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniil Fukalov [Thu, 26 Jan 2017 13:33:17 +0000 (13:33 +0000)]
[SCEV] Introduce add operation inlining limit
Inlining in getAddExpr() can cause abnormal computational time in some cases.
New parameter -scev-addops-inline-threshold is introduced with default value 500.
Reviewers: sanjoy
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28812
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293176
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 26 Jan 2017 13:06:02 +0000 (13:06 +0000)]
[X86][SSE] Pull out target shuffle resolve code into helper. NFCI.
Pulled out code that removed unused inputs from a target shuffle mask into a helper function to allow it to be reused in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293175
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 26 Jan 2017 12:10:43 +0000 (12:10 +0000)]
Remove a '#if 0' that wasn't intended for commit in r293173.
The '#if 0' contained the code I had intended to use but clang
rejects it (possibly incorrectly).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293174
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 26 Jan 2017 11:23:49 +0000 (11:23 +0000)]
Attempt to fix windows buildbots after r293172.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293173
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 26 Jan 2017 11:10:14 +0000 (11:10 +0000)]
[globalisel] Re-factor ISel matchers into a hierarchy. NFC
Summary:
This should make it possible to easily add everything needed to import all
the existing SelectionDAG rules. It should also serve the likely
kinds of GlobalISel rules (some of which are not currently representable
in SelectionDAG) once we've nailed down the tablegen definition for that.
The hierarchy is as follows:
  MatcherRule - A matching rule. Currently used to emit C++ ISel code but will
   |            also be used to emit test cases and tablegen definitions in the
   |            near future.
   |- Instruction(s) - Represents the instruction to be matched.
      |- Instruction Predicate(s) - Test the opcode, arithmetic flags, etc. of an
      |                             instruction.
      \- Operand(s) - Represents a particular operand of the instruction. In the
         |            future, there may be subclasses to test the same predicates
         |            on multiple operands (including for variadic instructions).
         \ Operand Predicate(s) - Test the type, register bank, etc. of an operand.
                                  This is where the ComplexPattern equivalent
                                  will be represented. It's also where
                                  nested-instruction matching will live as a
                                  predicate that follows the DefUse chain to the
                                  Def and tests a MatcherRule from that position.
Support for multiple instruction matchers in a rule has been retained from
the existing code but has been adjusted to assert when it is used.
Previously it would silently drop all but the first instruction matcher.
The tablegen-erated file is not functionally changed but has more
parentheses and no longer attempts to format the if-statements since
keeping track of the indentation is tricky in the presence of the matcher
hierarchy. It would be nice to have CMake's tablegen() run the output
through clang-format (when available) so we don't have to complicate
TableGen with pretty-printing.
It's also worth mentioning that this hierarchy will also be able to emit
TableGen definitions and test cases in the near future. This is the reason
for favouring explicit emit*() calls rather than the << operator.
Reviewers: aditya_nandakumar, rovka, t.p.northover, qcolombet, ab
Reviewed By: ab
Subscribers: igorb, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D28942
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293172
91177308-0d34-0410-b5e6-
96231b3b80d8
Valery Pykhtin [Thu, 26 Jan 2017 10:51:47 +0000 (10:51 +0000)]
[AMDGPU] Fix typo in GCNSchedStrategy
Differential revision: https://reviews.llvm.org/D28980
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293171
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Dardis [Thu, 26 Jan 2017 10:46:07 +0000 (10:46 +0000)]
Revert "[mips] N64 static relocation model support"
This reverts commit r293164. There are multiple tests failing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293170
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 26 Jan 2017 10:41:09 +0000 (10:41 +0000)]
[LV] Fix an issue where forming LCSSA in the place that we did would
change the set of uniform instructions in the loop causing an assert
failure.
The problem is that the legalization checking also builds data
structures mapping various facts about the loop body. The immediate
cause was the set of uniform instructions. If these then change when
LCSSA is formed, the data structures would already have been built and
become stale. The included test case triggered an assert in loop
vectorize that was reduced out of the new PM's pipeline.
The solution is to form LCSSA early enough that no information is cached
across the changes made. The only really obvious position is outside of
the main logic to vectorize the loop. This also has the advantage of
removing one case where forming LCSSA could mutate the loop but we
wouldn't track that as a "Changed" state.
If it is significantly advantageous to do some legalization checking
prior to this, we can do a more careful positioning but it seemed best
to just back off to a safe position first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293168
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Dardis [Thu, 26 Jan 2017 10:19:02 +0000 (10:19 +0000)]
[mips] N64 static relocation model support
This patch makes one change to GOT handling and two changes to N64's
relocation model handling. Furthermore, the jumptable encodings have
been corrected for static N64.
Big GOT handling is now done via a new SDNode MipsGotHi - this node is
unconditionally lowered to an lui instruction.
The first change to N64's relocation handling is the lifting of the
restriction that N64 always uses PIC. Now it is possible to target static
environments.
The second change adds support for 64 bit symbols and enables them by
default. Previously N64 had patterns for sym32 mode only. In this mode all
symbols are assumed to have 32 bit addresses. sym32 mode support
is selectable with attribute 'sym32'. A follow on patch for clang will
add the necessary frontend parameter.
This partially resolves PR/23485.
Thanks to Brooks Davis for reporting the issue!
Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris
Differential Revision: https://reviews.llvm.org/D23652
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293164
91177308-0d34-0410-b5e6-
96231b3b80d8
Diana Picus [Thu, 26 Jan 2017 09:20:47 +0000 (09:20 +0000)]
[ARM] GlobalISel: Load i1, i8 and i16 args from stack
Add support for loading i1, i8 and i16 arguments from the stack, with or without
the ABI extension flags.
When the ABI extension flags are present, we load a 4-byte value, otherwise we
preserve the size of the load and let the instruction selector replace it with a
LDRB/LDRH. This generates the same thing as DAGISel.
Differential Revision: https://reviews.llvm.org/D27803
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293163
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Bataev [Thu, 26 Jan 2017 09:18:41 +0000 (09:18 +0000)]
[SLP] Add one more reduction operation for extra argument test to make
it vectorizable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293162
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 26 Jan 2017 08:31:54 +0000 (08:31 +0000)]
[PM] Use PoisoningVH correctly when merely deleting entries in a map
with it.
This code was dereferencing the PoisoningVH which isn't allowed once it
is poisoned. But the code itself really doesn't need to access the
pointer, it is just doing the safe stuff of clearing out data structures
keyed on the pointer value.
Change the code to use iterators to erase directly from a DenseMap. This
is also substantially more efficient as it avoids lots of hashing and
lookups to do the erasure. DenseMap supports iterating behind the
iteration which is fairly easy to implement.
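The erase-by-iterator pattern can be sketched with std::unordered_map as a stand-in. This is an assumption made for illustration: DenseMap's erase interface is not the same as the standard container's, and eraseByIterator is a hypothetical helper, not the code from this change.

```cpp
#include <cassert>
#include <unordered_map>

// Removes every entry whose mapped value satisfies ShouldErase and returns
// how many entries were removed. Entries are erased by position, so the loop
// never performs a lookup by key and never inspects what a key refers to.
template <typename MapT, typename PredT>
unsigned eraseByIterator(MapT &Map, PredT ShouldErase) {
  unsigned NumErased = 0;
  for (auto I = Map.begin(); I != Map.end();) {
    if (ShouldErase(I->second)) {
      I = Map.erase(I); // returns the iterator following the erased entry
      ++NumErased;
    } else {
      ++I;
    }
  }
  return NumErased;
}

// Example use: drop the entries flagged as stale, keep the rest.
inline void eraseExample() {
  std::unordered_map<int, bool> Stale = {{1, true}, {2, false}, {3, true}};
  unsigned N = eraseByIterator(Stale, [](bool IsStale) { return IsStale; });
  assert(N == 2 && Stale.size() == 1 && Stale.count(2) == 1);
}
```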
Sadly, I don't have a test case here. I'm not even close and I don't
know that I ever will be. The issue is that several of the tricky
aspects of fixing this only show up when you cause the stack's
SmallVector to be in *EXACTLY* the right location. I only ever got
a reproduction for those with Clang, and only with *exactly* the right
command line flags. Any adjustment, even to seemingly unrelated flags,
would make partial and half-way solutions magically start to "work". In
good news, all of this was caught with the LLVM test suite. Also, there
is no *specific* code here that is untested, just that the old pattern
of code won't immediately fail on any test case I've managed to
contrive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293160
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Thu, 26 Jan 2017 08:31:14 +0000 (08:31 +0000)]
Chapter3/KaleidoscopeJIT.h: Fix a warning. [-Wunused-lambda-capture]
"this", aka class members, is not referred in the body.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293159
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 26 Jan 2017 08:04:27 +0000 (08:04 +0000)]
[TargetTransformInfo] Add override keywords to suppress -Winconsistent-missing-override.
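For context, Clang's -Winconsistent-missing-override fires when a class marks some overriding member functions with override but leaves others unmarked. A minimal sketch with hypothetical class names (not the TargetTransformInfo classes):

```cpp
#include <cassert>

// Hypothetical base interface, used only for illustration.
struct CostModelBase {
  virtual ~CostModelBase() = default;
  virtual int getCost() const { return 1; }
  virtual int getLatency() const { return 1; }
};

struct TargetCostModel : CostModelBase {
  int getCost() const override { return 4; }
  // If override were missing here while getCost() above has it, Clang would
  // report -Winconsistent-missing-override for this declaration.
  int getLatency() const override { return 2; }
};

// Example use: both calls dispatch to the derived class.
inline void overrideExample() {
  TargetCostModel Impl;
  const CostModelBase &Base = Impl;
  assert(Base.getCost() == 4 && Base.getLatency() == 2);
}
```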
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293158
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 26 Jan 2017 07:17:58 +0000 (07:17 +0000)]
[AVX-512] Move the combine that runs combineBitcastForMaskedOp to the last DAG combine phase where I had originally meant to put it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293157
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 26 Jan 2017 07:17:53 +0000 (07:17 +0000)]
[X86] When bitcasting INSERT_SUBVECTOR/EXTRACT_SUBVECTOR to match masked operations, use the correct type for the immediate operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293156
91177308-0d34-0410-b5e6-
96231b3b80d8
Jonas Paulsson [Thu, 26 Jan 2017 07:03:25 +0000 (07:03 +0000)]
[TargetTransformInfo] Refactor and improve getScalarizationOverhead()
Refactoring to remove duplications of this method.
New method getOperandsScalarizationOverhead() that looks at the present unique
operands and adds extract costs for them. Old behaviour was to just add extract
costs for one operand of the type always, which still happens in
getArithmeticInstrCost() if no operands are provided by the caller.
This is a good start of improving on this, but there are more places
that can be improved by using getOperandsScalarizationOverhead().
Review: Hal Finkel
https://reviews.llvm.org/D29017
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293155
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Bataev [Thu, 26 Jan 2017 06:19:52 +0000 (06:19 +0000)]
[SLP] Fixed test for extra arguments in horizontal reductions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293153
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 26 Jan 2017 05:38:46 +0000 (05:38 +0000)]
[DAGCombiner] Fold extract_subvector of undef to undef. Fold away inserting undef subvectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293152
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 26 Jan 2017 05:17:13 +0000 (05:17 +0000)]
[X86] Add demanded elts support for the inputs to pclmul intrinsic
This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors.
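A minimal sketch of how the immediate maps to demanded elements. It assumes each input is viewed as two 64-bit elements and that a set selector bit picks the upper one; the PclmulDemanded type and pclmulDemandedElts function are hypothetical names, not the code from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: one bitmask per input, where bit i set means
// 64-bit element i of that input is read.
struct PclmulDemanded {
  unsigned Op0;
  unsigned Op1;
};

inline PclmulDemanded pclmulDemandedElts(uint8_t Imm) {
  unsigned Elt0 = (Imm & 0x01) ? 1 : 0; // bit 0 selects the element of input 0
  unsigned Elt1 = (Imm & 0x10) ? 1 : 0; // bit 4 selects the element of input 1
  return PclmulDemanded{1u << Elt0, 1u << Elt1};
}

// Example use: immediate 0x10 reads the low element of input 0 and the high
// element of input 1.
inline void pclmulExample() {
  PclmulDemanded D = pclmulDemandedElts(0x10);
  assert(D.Op0 == 0x1 && D.Op1 == 0x2);
}
```

Whichever element is not selected is never read, so anything that only feeds that element can be simplified away.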
Differential Revision: https://reviews.llvm.org/D28979
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293151
91177308-0d34-0410-b5e6-
96231b3b80d8
Taewook Oh [Thu, 26 Jan 2017 04:34:25 +0000 (04:34 +0000)]
Revert test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293150
91177308-0d34-0410-b5e6-
96231b3b80d8
Taewook Oh [Thu, 26 Jan 2017 04:32:40 +0000 (04:32 +0000)]
test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293148
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Thu, 26 Jan 2017 04:03:18 +0000 (04:03 +0000)]
[OptDiag] Predicates to check the same type of IR and MIR opt remarks
It will be used from clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293145
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Thu, 26 Jan 2017 02:15:08 +0000 (02:15 +0000)]
gold-plugin: Fix test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293137
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 26 Jan 2017 02:13:50 +0000 (02:13 +0000)]
[PM] Simplify the new PM interface to the loop unroller and expose two
factory functions for the two modes the loop unroller is actually used
in in-tree: simplified full-unrolling and the entire thing including
partial unrolling.
I've also wired these up to nice names so you can express both of these
being in a pipeline easily. This is a precursor to actually enabling
these parts of the O2 pipeline.
Differential Revision: https://reviews.llvm.org/D28897
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293136
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 26 Jan 2017 02:07:20 +0000 (02:07 +0000)]
[Loops] Restructure the LoopInfo verify function so that it more
directly walks the current loop structure verifying that a matching
structure can be found in a freshly computed version.
Also pull things out of containers when necessary once an issue is found
and print them directly.
This makes it substantially easier to debug verification failures as
the process stops at the exact point in the loop nest where they diverge
and has in easily accessed local variables (or printed to stderr
already) the loops and other information needed to analyze the failure.
Differential Revision: https://reviews.llvm.org/D29142
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293133
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Thu, 26 Jan 2017 02:07:05 +0000 (02:07 +0000)]
gold-plugin: Simplify naming of object files created with save-temps or obj-path.
Now we never append a number to the file name for task ID 0.
Differential Revision: https://reviews.llvm.org/D29160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293132
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Thu, 26 Jan 2017 02:03:58 +0000 (02:03 +0000)]
Fix -Wunused-function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293131
91177308-0d34-0410-b5e6-
96231b3b80d8