OSDN Git Service
David Tenty [Fri, 7 Jun 2019 15:45:25 +0000 (15:45 +0000)]
Build with _XOPEN_SOURCE defined on AIX
Summary:
It is useful to build with _XOPEN_SOURCE defined on AIX, enabling X/Open
and POSIX compatibility mode, to work around stray macros and other
bugs in the headers provided by the system and build compiler.
This patch adds the config to cmake to build with _XOPEN_SOURCE defined
on AIX with a few exceptions. Google Test internals require access to
platform specific thread info constructs on AIX so in that case we build
with _ALL_SOURCE defined instead. Libclang also uses a header which needs
_ALL_SOURCE on AIX so we leave that as is as well.
We also add building on AIX with the large file API and doing CMake
header checks with X/OPEN definitions so the results are consistent with
the environment that will be present in the build.
Reviewers: hubert.reinterpretcast, xingxue, andusy
Reviewed By: hubert.reinterpretcast
Subscribers: mgorny, jsji, cfe-commits, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D62533
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362808
91177308-0d34-0410-b5e6-
96231b3b80d8
Jinsong Ji [Fri, 7 Jun 2019 14:54:47 +0000 (14:54 +0000)]
[MachineScheduler] checkResourceLimit boundary condition update
When we call checkResourceLimit in bumpCycle or bumpNode and we
know the resource count has just reached the limit (the equations
are equal), we should return true to mark that we are resource
limited for the next schedule. Otherwise we might continue to schedule
in favor of latency for one more schedule and create a schedule that
actually overbooks the resource.
When we call checkResourceLimit to estimate the resource limit before
scheduling, we don't need to return true even if the equations are
equal, as that shouldn't limit the schedule.
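The boundary condition described above can be sketched as follows. This is a minimal illustration in Python, not the LLVM implementation: the function name, the `after_sched_node` parameter, and the reduction of "the equations" to two plain integers are assumptions made for the example.

```python
def check_resource_limit(resource_count: int, latency_bound: int,
                         after_sched_node: bool) -> bool:
    """Return True if the zone should be treated as resource limited.

    Hypothetical model: the zone is resource limited once the resource
    count exceeds the latency bound. The two kinds of caller differ only
    in how they treat the case where both sides are exactly equal.
    """
    if after_sched_node:
        # Called after scheduling a node (bumpCycle/bumpNode): the limit has
        # just been reached, so the next pick is already resource limited.
        return resource_count >= latency_bound
    # Called to estimate the limit before scheduling: equality alone
    # should not restrict the schedule.
    return resource_count > latency_bound
```

With equal inputs, the first kind of call reports a limit and the second does not, which is the one-schedule difference the message describes.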
Differential Revision: https://reviews.llvm.org/D62345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362805
91177308-0d34-0410-b5e6-
96231b3b80d8
Stefan Stipanovic [Fri, 7 Jun 2019 14:18:02 +0000 (14:18 +0000)]
test-commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362802
91177308-0d34-0410-b5e6-
96231b3b80d8
David Bolvansky [Fri, 7 Jun 2019 14:05:42 +0000 (14:05 +0000)]
[NFC] Added tests for D63004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362801
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 7 Jun 2019 13:33:34 +0000 (13:33 +0000)]
TailDuplicator: Remove no-op analyzeBranch call
This could fail, which looked concerning. However nothing was actually
using the results of this. I assume this was intended to use the
anti-feature of analyzeBranch of removing instructions, but wasn't
actually calling it with AllowModify = true.
Fixes bug 42162.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362800
91177308-0d34-0410-b5e6-
96231b3b80d8
Joerg Sonnenberger [Fri, 7 Jun 2019 13:28:52 +0000 (13:28 +0000)]
[NFC] Don't export helpers of ConstantFoldCall
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362799
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 7 Jun 2019 13:24:34 +0000 (13:24 +0000)]
llvm-lib: Disallow mixing object files with different machine types
lib.exe doesn't allow creating .lib files with object files that have
differing machine types. Update llvm-lib to match.
The motivation is to make it possible to infer the machine type of a
.lib file in lld, so that it can warn when e.g. a 32-bit .lib file is
passed to a 64-bit link (PR38965).
Fixes PR38782.
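A rough sketch of the kind of check this implies, written in Python for brevity. The function name, the `(member_name, machine_type)` pair layout, and the machine-type strings are all hypothetical; the actual llvm-lib code and its diagnostics may look quite different.

```python
def infer_archive_machine(member_machines):
    """Return the single machine type shared by all members.

    `member_machines` is a list of (member_name, machine_type) pairs; the
    layout and the type strings are illustrative only. Raises ValueError
    on the first member whose type differs from the earlier ones.
    """
    archive_machine = None
    for name, machine in member_machines:
        if archive_machine is None:
            # The first member fixes the machine type of the library.
            archive_machine = machine
        elif machine != archive_machine:
            raise ValueError(
                f"{name}: machine type {machine} conflicts with "
                f"library machine type {archive_machine}")
    return archive_machine
```

Once mixing is rejected at creation time, a consumer can read the machine type from any one member, which is what makes the inference mentioned above possible.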
Differential Revision: https://reviews.llvm.org/D62913
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362798
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 7 Jun 2019 13:17:46 +0000 (13:17 +0000)]
[x86] narrow extract subvector of vector select
This is a potentially large perf win for AVX1 targets because of the way we
auto-vectorize to 256-bit but then expect the backend to legalize/optimize
for the half-implemented AVX1 ISA.
On the motivating example from PR37428 (even though this patch doesn't solve
the vector shift issue):
https://bugs.llvm.org/show_bug.cgi?id=37428
...there's a 16% speedup when compiling with "-mavx" (perf tested on Haswell)
because we eliminate the remaining 256-bit vblendv ops.
I added comments on a couple of tests that require further work. If we have
256-bit logic ops separating the vselect and extract, we should probably narrow
everything to 128-bit, but that requires a larger pattern match.
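The identity behind the narrowing can be illustrated with plain Python lists standing in for vectors. This is only a model of the lane-wise semantics, with helper names invented for the example; it says nothing about how the DAG combine itself is written.

```python
def vselect(mask, a, b):
    """Lane-wise select: take a[i] where mask[i] is true, else b[i]."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def extract_subvector(v, start, length):
    """Return `length` consecutive lanes of v beginning at lane `start`."""
    return v[start:start + length]


def extract_of_select(mask, a, b, start, length):
    # Wide form: select on all lanes, then keep only a slice.
    return extract_subvector(vselect(mask, a, b), start, length)


def select_of_extracts(mask, a, b, start, length):
    # Narrow form: slice every operand first, then select on fewer lanes.
    return vselect(extract_subvector(mask, start, length),
                   extract_subvector(a, start, length),
                   extract_subvector(b, start, length))
```

Because the select is lane-wise, both forms give the same lanes, so the select can run at half width when only half of the result is used.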
Differential Revision: https://reviews.llvm.org/D62969
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362797
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 7 Jun 2019 13:09:40 +0000 (13:09 +0000)]
gn build: Merge r362766
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362796
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 7 Jun 2019 13:08:17 +0000 (13:08 +0000)]
gn build: Merge r362774
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362795
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 7 Jun 2019 13:07:00 +0000 (13:07 +0000)]
gn build: Run `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format`
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362794
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Tatham [Fri, 7 Jun 2019 12:42:54 +0000 (12:42 +0000)]
[ARM] Fix bugs introduced by the fp64/d32 rework.
Change D60691 caused some knock-on failures that weren't caught by the
existing tests. Firstly, selecting a CPU that should have had a
restricted FPU (e.g. `-mcpu=cortex-m4`, which should have 16 d-regs
and no double precision) could give the unrestricted version, because
`ARM::getFPUFeatures` returned a list of features including subtracted
ones (here `-fp64`,`-d32`), but `ARMTargetInfo::initFeatureMap` threw
away all the ones that didn't start with `+`. Secondly, the
preprocessor macros didn't reliably match the actual compilation
settings: for example, `-mfpu=softvfp` could still set `__ARM_FP` as
if hardware FP was available, because the list of features on the cc1
command line would include things like `+vfp4`,`-vfp4d16` and clang
didn't realise that one of those cancelled out the other.
I've fixed both of these issues by rewriting `ARM::getFPUFeatures` so
that it returns a list that enables every FP-related feature
compatible with the selected FPU and disables every feature not
compatible, which is more verbose but means clang doesn't have to
understand the dependency relationships between the backend features.
Meanwhile, `ARMTargetInfo::handleTargetFeatures` is testing for all
the various forms of the FP feature names, so that it won't miss cases
where it should have set `HW_FP` to feed into feature test macros.
That in turn caused an ordering problem when handling `-mcpu=foo+bar`
together with `-mfpu=something_that_turns_off_bar`. To fix that, I've
arranged that the `+bar` suffixes on the end of `-mcpu` and `-march`
cause feature names to be put into a separate vector which is
concatenated after the output of `getFPUFeatures`.
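The ordering idea can be modelled in a few lines of Python. Everything here is a simplified assumption: the feature universe is cut down to three names taken from the text, the FPU names `wide-fpu` and `narrow-fpu` are made up, and the helper names do not correspond to real clang functions.

```python
# Illustrative feature universe and FPU table; the real lists live in LLVM
# and are much longer. The FPU names "wide-fpu"/"narrow-fpu" are made up.
ALL_FP_FEATURES = ["fp64", "d32", "vfp4"]
FPU_TABLE = {
    "wide-fpu": {"fp64", "d32", "vfp4"},
    "narrow-fpu": {"vfp4"},
}


def get_fpu_features(fpu):
    """Emit '+f' for every compatible feature and '-f' for every other one."""
    enabled = FPU_TABLE[fpu]
    return [("+" if f in enabled else "-") + f for f in ALL_FP_FEATURES]


def resolve(features):
    """Apply the list left to right; a later entry overrides an earlier one."""
    state = {}
    for entry in features:
        state[entry[1:]] = entry[0] == "+"
    return {name for name, on in state.items() if on}


def final_features(fpu, cpu_suffix_features):
    # Suffix features from -mcpu/-march go in a separate list that is
    # concatenated after the FPU output, so they are applied last.
    return resolve(get_fpu_features(fpu) + cpu_suffix_features)
```

In this model an explicit `+fp64` suffix survives a restrictive FPU choice because it is applied after the FPU-derived list, while the FPU-derived list on its own states every feature explicitly.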
Another side effect of all this is to fix a bug where `clang -target
armv8-eabi` by itself would fail to set `__ARM_FEATURE_FMA`, even
though `armv8` (aka Arm v8-A) implies FP-Armv8 which has FMA. That was
because `HW_FP` was being set to a value including only the `FPARMV8`
bit, but that feature test macro was testing only the `VFP4FPU` bit.
Now `HW_FP` ends up with all the bits set, so it gives the right
answer.
Changes to tests included in this patch:
* `arm-target-features.c`: I had to change basically all the expected
results. (The Cortex-M4 test in there should function as a
regression test for the accidental double-precision bug.)
* `arm-mfpu.c`, `armv8.1m.main.c`: switched to using `CHECK-DAG`
everywhere so that those tests are no longer sensitive to the order
of cc1 feature options on the command line.
* `arm-acle-6.5.c`: updated to expect the right answer to that
FMA test.
* `Preprocessor/arm-target-features.c`: added a regression test for
the `-mfpu=softvfp` issue.
Reviewers: SjoerdMeijer, dmgreen, ostannard, samparker, JamesNagurne
Reviewed By: ostannard
Subscribers: srhines, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D62998
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362791
91177308-0d34-0410-b5e6-
96231b3b80d8
Sam Elliott [Fri, 7 Jun 2019 12:20:14 +0000 (12:20 +0000)]
[RISCV] Support Bit-Preserving FP in F/D Extensions
Summary:
This allows some integer bitwise operations to instead be performed by
hardware fp instructions. This is correct because the RISC-V spec
requires the F and D extensions to use the IEEE-754 standard
representation, and fp register loads and stores to be bit-preserving.
This is tested against the soft-float ABI, but with hardware float
extensions enabled, so that the tests ensure the optimisation also
fires in this case.
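Why an integer bitwise operation can stand in for a floating-point one is easy to check on IEEE-754 bit patterns. The Python sketch below uses single precision and two operations chosen for illustration (sign-bit XOR and sign-bit clearing); which exact patterns the patch matches is not stated here, and the helper names are invented.

```python
import struct

SIGN_BIT = 0x80000000  # sign bit of an IEEE-754 binary32 value


def float_to_bits(x: float) -> int:
    """Reinterpret a binary32 value as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret an unsigned 32-bit integer as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def neg_via_xor(x: float) -> float:
    # Flipping only the sign bit is the same as floating-point negation.
    return bits_to_float(float_to_bits(x) ^ SIGN_BIT)


def abs_via_and(x: float) -> float:
    # Clearing only the sign bit is the same as floating-point abs.
    return bits_to_float(float_to_bits(x) & (SIGN_BIT - 1))
```

The equivalence relies on the value keeping its exact bit pattern when it moves between integer and fp registers, which is the bit-preserving guarantee the summary cites.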
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62900
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362790
91177308-0d34-0410-b5e6-
96231b3b80d8
Valery Pykhtin [Fri, 7 Jun 2019 12:16:46 +0000 (12:16 +0000)]
[AMDGPU] Constrain the AMDGPU inliner on maximum number of basic blocks in a caller function (compile time performance)
Differential revision: https://reviews.llvm.org/D62917
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362789
91177308-0d34-0410-b5e6-
96231b3b80d8
Dmitri Gribenko [Fri, 7 Jun 2019 09:28:19 +0000 (09:28 +0000)]
Work around a circular dependency between IR and MC introduced in r362735
I replaced the circular library dependency with a forward declaration,
but it is only a workaround, not a real fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362782
91177308-0d34-0410-b5e6-
96231b3b80d8
Cullen Rhodes [Fri, 7 Jun 2019 08:46:56 +0000 (08:46 +0000)]
[AArch64][AsmParser] error on unexpected SVE predicate type suffix
Summary:
This patch fixes a bug in the assembler that permitted a type suffix on
predicate registers when not expected. For instance, the following was
previously valid:
faddv h0, p0.q, z1.h
This bug was present in all SVE instructions containing predicates with
no type suffix and no predication form qualifier, i.e. /z or /m. The
latter instructions are already caught with an appropriate error message
by the assembler, e.g.:
.text
<stdin>:1:13: error: not expecting size suffix
cmpne p1.s, p0.b/z, z2.s, 0
^
A similar issue for SVE vector registers was fixed in:
https://reviews.llvm.org/D59636
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62942
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362780
91177308-0d34-0410-b5e6-
96231b3b80d8
Cullen Rhodes [Fri, 7 Jun 2019 08:37:00 +0000 (08:37 +0000)]
[AArch64][AsmParser] Provide better diagnostics for SVE predicates
Patch by Sander de Smalen (sdesmalen)
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62941
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362779
91177308-0d34-0410-b5e6-
96231b3b80d8
George Rimar [Fri, 7 Jun 2019 08:34:18 +0000 (08:34 +0000)]
[llvm-objcopy] - Emit error and don't crash if program header reaches past end of file.
This is https://bugs.llvm.org/show_bug.cgi?id=42122.
If an object file has a size less than a program header's file [offset + size]
(i.e. if we have overflow), llvm-objcopy crashes instead of reporting an
error.
The patch fixes this issue.
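The check amounts to a bounds test on each program header. A minimal Python sketch, assuming the three values are already parsed; the function name and the error text are hypothetical and not taken from llvm-objcopy:

```python
def check_segment_in_file(file_size: int, p_offset: int, p_filesz: int) -> None:
    """Raise ValueError if [p_offset, p_offset + p_filesz) leaves the file.

    The parameter names follow the ELF program header fields; the error
    message is invented for this example.
    """
    # Python integers do not wrap, so the sum itself cannot overflow here.
    end = p_offset + p_filesz
    if end > file_size:
        raise ValueError(
            f"program header with offset 0x{p_offset:x} and file size "
            f"0x{p_filesz:x} goes past the end of the file")
```

In a language with fixed-width integers the addition would also need an overflow guard before the comparison.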
Differential revision: https://reviews.llvm.org/D62898
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362778
91177308-0d34-0410-b5e6-
96231b3b80d8
George Rimar [Fri, 7 Jun 2019 08:31:36 +0000 (08:31 +0000)]
[yaml2elf] - Refactoring followup for D62809
This is a refactoring follow-up for D62809
"Change how we handle implicit sections.".
It allows the code to be simplified.
Differential revision: https://reviews.llvm.org/D62912
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362777
91177308-0d34-0410-b5e6-
96231b3b80d8
Pengfei Wang [Fri, 7 Jun 2019 08:31:35 +0000 (08:31 +0000)]
[X86] -march=cooperlake (llvm)
Support Intel -march=cooperlake in LLVM
Patch by Shengchen Kan (skan)
Differential Revision: https://reviews.llvm.org/D62836
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362776
91177308-0d34-0410-b5e6-
96231b3b80d8
Sam Parker [Fri, 7 Jun 2019 08:04:18 +0000 (08:04 +0000)]
Fix for lld buildbot
Removed unused (in non-debug builds) variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362775
91177308-0d34-0410-b5e6-
96231b3b80d8
Sam Parker [Fri, 7 Jun 2019 07:35:30 +0000 (07:35 +0000)]
[CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
Takes, as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
Takes the maximum number of elements processed in an iteration of
the loop body and subtracts this from the total count. Returns
false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
Takes the number of elements remaining to be processed as well as
the maximum number of elements processed in an iteration of the loop
body. Returns the updated number of elements remaining.
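A toy Python model may help to picture how the three intrinsics cooperate. It is a sketch of the semantics as summarised above, not of the pass: the class, the `run_loop` driver, and the choice of "greater than zero" as the continue condition are assumptions of this example.

```python
class HardwareLoopModel:
    """Toy model of the counter-based intrinsics; not the LLVM implementation."""

    def __init__(self):
        self.remaining = 0

    def set_loop_iterations(self, count):
        # Models @llvm.set_loop_iterations: record the total count.
        self.remaining = count

    def loop_decrement(self, step):
        # Models @llvm.loop_decrement: subtract, then report whether the
        # loop should continue (False means exit).
        self.remaining -= step
        return self.remaining > 0


def loop_decrement_reg(remaining, step):
    # Models @llvm.loop_decrement_reg: the counter is passed in and the
    # updated value is returned instead of being held in hidden state.
    return remaining - step


def run_loop(trip_count):
    """Count body executions of a bottom-tested loop (trip_count >= 1)."""
    hw = HardwareLoopModel()
    hw.set_loop_iterations(trip_count)
    executed = 0
    while True:
        executed += 1                 # loop body
        if not hw.loop_decrement(1):
            break                     # counter exhausted: leave the loop
    return executed
```

The model tests the counter at the bottom of the loop, so it assumes the body runs at least once.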
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362774
91177308-0d34-0410-b5e6-
96231b3b80d8
Dylan McKay [Fri, 7 Jun 2019 06:55:00 +0000 (06:55 +0000)]
[AVR] Expand 16-bit rotations during the legalization stage
In r356860, the legalization logic for BSWAP was modified to use ISD::ROTL,
rather than the old ISD::{SHL, SRL, OR} nodes.
This works fine on AVR for 8-bit rotations, but 16-bit rotations are
currently unimplemented - they always trigger an assertion error in the
AVRExpandPseudoInsts pass ("RORW unimplemented").
This patch instructs the legalizer to expand 16-bit rotations into
the SHL, SRL, OR pattern it used previously.
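The expansion is the standard shift-and-or form of a rotate. A small Python sketch on 16-bit values, with function names chosen for the example, shows the pattern and why a byte swap is a rotation by 8:

```python
MASK16 = 0xFFFF


def rotl16(x: int, n: int) -> int:
    """Rotate a 16-bit value left by n using only SHL, SRL and OR."""
    x &= MASK16
    n %= 16
    if n == 0:
        return x
    # Bits shifted out on the left re-enter on the right.
    return ((x << n) | (x >> (16 - n))) & MASK16


def bswap16(x: int) -> int:
    # Swapping the two bytes of a 16-bit value is a rotation by 8 bits.
    return rotl16(x, 8)
```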
This fixes the 'issue-cannot-select-bswap.ll' test. Interestingly, this
test failure seems flaky - it passes successfully on the avr-build-01
buildbot, but fails locally on my Arch Linux install.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362773
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Pozulp [Fri, 7 Jun 2019 06:28:43 +0000 (06:28 +0000)]
[NFC] Delete trailing whitespace character.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362772
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Pozulp [Fri, 7 Jun 2019 06:23:54 +0000 (06:23 +0000)]
[llvm-objdump] Print source when subsequent lines in the translation unit come from the same line in two different headers.
Reviewers: grimar, rupprecht, jhenderson
Reviewed By: grimar, jhenderson
Subscribers: llvm-commits, jhenderson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62461
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362771
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Pozulp [Fri, 7 Jun 2019 05:11:13 +0000 (05:11 +0000)]
[llvm-objdump] Add warning if --disassemble-functions specifies an unknown symbol
Summary: Fixes Bug 41904 https://bugs.llvm.org/show_bug.cgi?id=41904
Reviewers: jhenderson, rupprecht, grimar, MaskRay
Reviewed By: jhenderson, rupprecht, MaskRay
Subscribers: dexonsmith, rupprecht, kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62275
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362768
91177308-0d34-0410-b5e6-
96231b3b80d8
Fangrui Song [Fri, 7 Jun 2019 03:47:22 +0000 (03:47 +0000)]
[MC][ELF] Don't create relocations with section symbols for STB_LOCAL ifunc
We should keep the symbol type (STT_GNU_IFUNC) for a local ifunc because
it may result in an IRELATIVE reloc that the dynamic loader will use to
resolve the address at startup time.
There is another problem that is not fixed by this patch: a PC relative
relocation should also create a relocation with the ifunc symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362767
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Pozulp [Fri, 7 Jun 2019 03:23:00 +0000 (03:23 +0000)]
[ADT] Enable set_difference() to be used on StringSet
Subscribers: mgorny, mgrang, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62992
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362766
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Pozulp [Fri, 7 Jun 2019 01:55:59 +0000 (01:55 +0000)]
[NFC] Test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362763
91177308-0d34-0410-b5e6-
96231b3b80d8
Fangrui Song [Fri, 7 Jun 2019 01:48:26 +0000 (01:48 +0000)]
[LV] Fix -Wunused-function after r362736
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362762
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 7 Jun 2019 00:14:55 +0000 (00:14 +0000)]
AMDGPU: Don't count mask branch pseudo towards skip threshold
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362761
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 7 Jun 2019 00:14:45 +0000 (00:14 +0000)]
AMDGPU: Insert skips for blocks with FLAT
This already forced a skip for VMEM, so it should also be done for
flat. I'm somewhat skeptical about the benefit of this though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362760
91177308-0d34-0410-b5e6-
96231b3b80d8
Nemanja Ivanovic [Thu, 6 Jun 2019 23:49:01 +0000 (23:49 +0000)]
[PowerPC] Exploit the vector min/max instructions
Use the PPC vector min/max instructions for computing the corresponding
operation as these should be faster than the compare/select sequences
we currently emit.
Differential revision: https://reviews.llvm.org/D47332
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362759
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 6 Jun 2019 22:51:51 +0000 (22:51 +0000)]
AMDGPU: Insert skip branches over return blocks
SIInsertSkips really doesn't understand the control flow, and makes
very stupid assumptions about the block layout. This was able to get
away with not skipping return blocks, since usually after
structurization there is only one placed at the end of the
function. Tail duplication can break this assumption.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362754
91177308-0d34-0410-b5e6-
96231b3b80d8
David Tenty [Thu, 6 Jun 2019 22:07:14 +0000 (22:07 +0000)]
[NFC] Test commit, whitespace change
As per the Developer Policy, upon obtaining commit access.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362753
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 21:49:59 +0000 (21:49 +0000)]
[NFC][CodeGen] Add unary fneg tests to X86/fma4-intrinsics-x86.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362752
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Lapshin [Thu, 6 Jun 2019 21:19:39 +0000 (21:19 +0000)]
[DebugInfo] Incorrect debug info record generated for loop counter.
An incorrect debug variable range was calculated during the "COMPUTING LIVE DEBUG VARIABLES" stage.
The range for the debug variable "i" is computed according to the current state of the instructions
inside the basic block, but the register allocator creates new instructions that were not taken
into account when the live debug variables were computed. As a result, the DBG_VALUE instruction for
the "i" variable was placed after these newly inserted instructions. This is incorrect:
the debug value for the loop counter should be inserted before any loop instruction.
Differential Revision: https://reviews.llvm.org/D62650
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362750
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexander Timofeev [Thu, 6 Jun 2019 21:13:02 +0000 (21:13 +0000)]
[AMDGPU] Partial revert for the
ba447bae7448435c9986eece0811da1423972fdd
"Divergence driven ISel. Assign register class for cross block values
according to the divergence."
That commit exposed a design flaw leading to several issues that
need to be solved first.
This change reverts the AMDGPU-specific changes and leaves the common
part unaffected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362749
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 21:12:22 +0000 (21:12 +0000)]
[NFC][CodeGen] Add unary fneg tests to X86/fma-intrinsics-x86.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362748
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 6 Jun 2019 21:00:04 +0000 (21:00 +0000)]
[X86] Make a bunch of merge masked binops commutable for load folding.
This primarily affects add/fadd/mul/fmul/and/or/xor/pmuludq/pmuldq/max/min/fmaxc/fminc/pmaddwd/pavg.
We already commuted the unmasked and zero masked versions.
I've added 512-bit stack folding tests for most of the instructions
affected. I've tested needing commuting and not commuting across
unmasked, merged masked, and zero masked. The 128/256 bit instructions
should behave similarly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362746
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Thu, 6 Jun 2019 20:14:06 +0000 (20:14 +0000)]
[InstSimplify] add tests for fcmp with known-never-nan operands; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362742
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 20:11:30 +0000 (20:11 +0000)]
[NFC][CodeGen] Add unary fneg tests to X86/fma-scalar-combine.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362741
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 6 Jun 2019 19:21:23 +0000 (19:21 +0000)]
[CFLGraph] Add support for unary fneg instruction.
Differential Revision: https://reviews.llvm.org/D62791
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362737
91177308-0d34-0410-b5e6-
96231b3b80d8
Renato Golin [Thu, 6 Jun 2019 19:15:52 +0000 (19:15 +0000)]
[LV] Wrap LV illegality reporting in a function. NFC.
A function for loop vectorization illegality reporting has been
introduced:
void LoopVectorizationLegality::reportVectorizationFailure(
const StringRef DebugMsg, const StringRef OREMsg,
const StringRef ORETag, Instruction * const I) const;
The function prints a debug message when the debug for the compilation
unit is enabled as well as invokes the optimization report emitter to
generate a message with a specified tag. The function doesn't cover any
complicated logic when a custom lambda should be passed to the emitter,
only generating a message with a tag is supported.
The function always prints the instruction `I` after the debug message
whenever the instruction is specified, otherwise the debug message
ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.'
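The formatting rule in the last paragraph can be sketched in Python. Only the prefix and the trailing dot come from the example message quoted above; the function name and the single space before the instruction are assumptions of this sketch.

```python
def format_failure(debug_msg, instruction=None):
    """Build the debug line for a vectorization failure.

    Hypothetical helper: when an instruction is given it is printed after
    the message, otherwise the message is closed with a dot.
    """
    prefix = "LV: Not vectorizing: "
    if instruction is not None:
        # Assumed separator: a single space before the instruction text.
        return f"{prefix}{debug_msg} {instruction}"
    return f"{prefix}{debug_msg}."
```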
Patch by Pavel Samolysov <samolisov@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362736
91177308-0d34-0410-b5e6-
96231b3b80d8
Jason Liu [Thu, 6 Jun 2019 19:13:36 +0000 (19:13 +0000)]
[AIX] Implement function descriptor on SDAG
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
* A function descriptor (Name)
* A function entry point (.Name)
The descriptor structure on AIX is the same as those in the ELF V1 ABI:
* The address of the entry point of the function.
* The TOC base address for the function.
* The environment pointer.
The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".
Which symbol gets referenced depends on the context:
* Taking the address of the function references the descriptor symbol.
* Calling the function references the entry point symbol.
(2) As for the implementation on AIX, for a direct function call target we
create a proper MCSymbol SDNode (e.g. ".foo") while constructing the SDAG to
replace the original TargetGlobalAddress SDNode. Further down the path, we can
take advantage of this MCSymbol.
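The naming rule from part (1) can be written down as a short Python sketch. The function names and the `"call"`/`"address"` context strings are invented for the illustration; only the rule itself (plain name for the descriptor, "." prefix for the entry point) comes from the summary.

```python
def descriptor_symbol(name: str) -> str:
    # The descriptor uses the source-level function name unchanged.
    return name


def entry_point_symbol(name: str) -> str:
    # The entry point is the same name with "." prepended.
    return "." + name


def symbol_for(name: str, context: str) -> str:
    """Pick the symbol referenced for a use of function `name`.

    `context` is "call" for a direct call and "address" for taking the
    function's address (hypothetical labels for this sketch).
    """
    if context == "call":
        return entry_point_symbol(name)
    if context == "address":
        return descriptor_symbol(name)
    raise ValueError(f"unknown context: {context}")
```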
Patch by: Xiangling_L
Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara
Differential Revision: https://reviews.llvm.org/D62532
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362735
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 19:02:46 +0000 (19:02 +0000)]
[NFC][CodeGen] Add unary fneg tests to X86/fma4-fneg-combine.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362733
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 6 Jun 2019 19:02:18 +0000 (19:02 +0000)]
[InlineCost] Add support for unary fneg.
This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.
Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now.
Differential Revision: https://reviews.llvm.org/D62699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362732
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 18:41:18 +0000 (18:41 +0000)]
[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362730
91177308-0d34-0410-b5e6-
96231b3b80d8
Philip Reames [Thu, 6 Jun 2019 18:02:36 +0000 (18:02 +0000)]
[LoopPred] Fix a bug in unconditional latch bailout introduced in r362284
This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362727
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 6 Jun 2019 17:04:13 +0000 (17:04 +0000)]
[DAGCombine] MergeConsecutiveStores - improve non-temporal load\store handling (PR42123)
This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores:
1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another.
2 - The merged load\store node must retain the non-temporal flag.
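The two rules can be summarised in a small Python sketch. Representing each candidate as a single boolean is a simplification made for this example, and the function name is hypothetical; the real code inspects memory operands on DAG nodes.

```python
def merged_non_temporal_flag(candidates):
    """Return the non-temporal flag for a merged node, or None if unmergeable.

    `candidates` is a list of booleans, one per load/store considered for
    merging (a simplification of the real memory operands).
    """
    if not candidates:
        return None
    first = candidates[0]
    # Rule 1: every candidate must carry the same non-temporal flag.
    if any(flag != first for flag in candidates):
        return None
    # Rule 2: the merged node keeps that shared flag.
    return first
```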
Differential Revision: https://reviews.llvm.org/D62910
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362723
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 16:55:51 +0000 (16:55 +0000)]
[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns_wide.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362720
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Thu, 6 Jun 2019 16:55:05 +0000 (16:55 +0000)]
gn build: Merge r362685
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362719
91177308-0d34-0410-b5e6-
96231b3b80d8
Dmitri Gribenko [Thu, 6 Jun 2019 16:47:06 +0000 (16:47 +0000)]
Remove unused PPC.h includes under llvm/lib/Target/PowerPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362718
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 6 Jun 2019 16:39:04 +0000 (16:39 +0000)]
[X86] Make masked floating point equality/ordered compares commutable for load folding purposes.
Same as what is supported for the unmasked form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362717
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 16:13:23 +0000 (16:13 +0000)]
[NFC][CodeGen] Add unary fneg tests to fmul-combines.ll fnabs.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362715
91177308-0d34-0410-b5e6-
96231b3b80d8
Fangrui Song [Thu, 6 Jun 2019 15:31:45 +0000 (15:31 +0000)]
[PowerPC] Add R_PPC_IRELATIVE
This will be used by lld's powerpc port.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362713
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 15:29:11 +0000 (15:29 +0000)]
[NFC][CodeGen] Add unary fneg tests to fp-fast.ll fp-fold.ll fp-in-intregs.ll fp-stack-compare-cmov.ll fp-stack-compare.ll fsxor-alignment.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362712
91177308-0d34-0410-b5e6-
96231b3b80d8
Whitney Tsang [Thu, 6 Jun 2019 15:12:49 +0000 (15:12 +0000)]
[DA] Add an option to control delinearization validity checks
Summary: Dependence Analysis performs static checks to confirm validity
of delinearization. These checks often fail for 64-bit targets due to
type conversions and integer wrapping that prevent simplification of the
SCEV expressions. These checks would also fail if the lower bounds of
the loops are unknown at compile time.
For example:
void foo(int n, int m, int a[][m]) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      a[i][j] = a[i+1][j-2];
    }
}
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-pass-remarks=.* -debug-pass=Arguments
-da-permissive-validity-checks=false k3.ll -analyze -da
will produce the following by default:
da analyze - anti [* *|<]!
but will produce the following expected dependence vector if the
validity checks are disabled:
da analyze - consistent anti [1 -2]!
This revision will introduce a debug option that will leave the validity
checks in place by default, but allow them to be turned off. New tests
are added for cases where it cannot be proven at compile-time that the
individual subscripts stay in-bound with respect to a particular
dimension of an array. These tests enable the option to provide user
guarantee that the subscripts do not over/under-flow into other
dimensions, thereby producing more accurate dependence vectors.
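The expected vector [1 -2] for the `foo` example can be reproduced by brute force. The Python sketch below is not how Dependence Analysis works; it simply enumerates iterations under the same assumption the option encodes, namely that each subscript stays within its own dimension, so `(row, column)` pairs are compared directly. The function name is invented.

```python
def anti_dependence_distances(n, m):
    """Collect distance vectors for a[i][j] = a[i+1][j-2] over an n x m nest.

    Assumes subscripts never spill into another dimension, so an element
    is identified by its (row, column) pair.
    """
    writes = {}   # element -> iteration (i, j) that writes it
    reads = {}    # element -> iterations (i, j) that read it
    for i in range(n):
        for j in range(m):
            writes[(i, j)] = (i, j)
            reads.setdefault((i + 1, j - 2), []).append((i, j))
    distances = set()
    for element, w in writes.items():
        for r in reads.get(element, []):
            if r < w:  # the read executes first: an anti dependence
                distances.add((w[0] - r[0], w[1] - r[1]))
    return distances
```

For a nest that is large enough to contain a matching read and write, the only distance found is (1, -2), which agrees with the dependence vector shown above.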
For prior discussion on this topic, leading to this change, please see
the following thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-May/132372.html
Reviewers: Meinersbur, jdoerfert, kbarton, dmgreen, fhahn
Reviewed By: Meinersbur, jdoerfert, dmgreen
Subscribers: fhahn, hiraditya, javed.absar, llvm-commits, Whitney,
etiotto
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D62610
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362711
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Thu, 6 Jun 2019 14:52:16 +0000 (14:52 +0000)]
[NFC][CodeGen] Remove duplicate test in fp-fast.ll
@test10 is the same as @test11.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362710
91177308-0d34-0410-b5e6-
96231b3b80d8
Ilya Biryukov [Thu, 6 Jun 2019 14:51:55 +0000 (14:51 +0000)]
gn build: Add new tidy checks to gn files
The checks were added in r362673 and r362672.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362709
91177308-0d34-0410-b5e6-
96231b3b80d8
Jason Liu [Thu, 6 Jun 2019 14:36:43 +0000 (14:36 +0000)]
[AIX] Implement call lowering with parameters that can be passed in GPRs
Summary:
This patch implements SDAG call lowering on AIX for functions
which only have parameters that could fit into GPRs.
Reviewers: hubert.reinterpretcast, syzaara
Differential Revision: https://reviews.llvm.org/D62823
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362708
91177308-0d34-0410-b5e6-
96231b3b80d8
Thomas Preud'homme [Thu, 6 Jun 2019 13:21:06 +0000 (13:21 +0000)]
FileCheck [6/12]: Introduce numeric variable definition
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch introduces support for defining
a numeric variable in a CHECK directive.
This commit introduces support for defining a numeric variable from a
literal value in the input text. Numeric expressions can then use the
variable provided it is on a later line.
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk
Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362705
91177308-0d34-0410-b5e6-
96231b3b80d8
Owen Reynolds [Thu, 6 Jun 2019 13:19:50 +0000 (13:19 +0000)]
[llvm-ar] Create thin archives with MRI scripts
This patch implements the "CREATE_THIN" MRI script command, allowing thin archives to be created via MRI scripts.
Differential Revision: https://reviews.llvm.org/D62919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362704
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Thu, 6 Jun 2019 13:18:20 +0000 (13:18 +0000)]
[InstCombine] add tests for loads of bitcasted vector pointer; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362703
91177308-0d34-0410-b5e6-
96231b3b80d8
Adhemerval Zanella [Thu, 6 Jun 2019 12:38:11 +0000 (12:38 +0000)]
[AArch64] Handle ISD::LRINT and ISD::LLRINT for float16
This patch is a follow up for D62018 to add lrint/llrint
support for float16.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62863
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362700
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Thu, 6 Jun 2019 12:35:46 +0000 (12:35 +0000)]
Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362699
91177308-0d34-0410-b5e6-
96231b3b80d8
Adhemerval Zanella [Thu, 6 Jun 2019 11:53:26 +0000 (11:53 +0000)]
[AArch64] Handle ISD::LROUND and ISD::LLROUND for float16
This patch is a follow up for D61391 to add lround/llround
support for float16.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62861
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362698
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 6 Jun 2019 11:15:36 +0000 (11:15 +0000)]
[X86][SSE] Add nonuniform constant vector test for PR42105
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362697
91177308-0d34-0410-b5e6-
96231b3b80d8
Dmitri Gribenko [Thu, 6 Jun 2019 10:37:06 +0000 (10:37 +0000)]
Include what you use in LanaiAsmParser.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362696
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 6 Jun 2019 10:21:18 +0000 (10:21 +0000)]
[DAGCombine] Cleanup isNegatibleForFree/GetNegatedExpression. NFCI.
Prep work for PR42105 - clang-format, use auto for cast and merge nested if()s
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362695
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 6 Jun 2019 10:15:26 +0000 (10:15 +0000)]
Fix whitespace indentation. NFCI.
Tabs are not our friends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362694
91177308-0d34-0410-b5e6-
96231b3b80d8
Luis Marques [Thu, 6 Jun 2019 10:12:28 +0000 (10:12 +0000)]
[RISCV] Disable test/Analysis/CostModel/RISCV tests if RISCV backend not built
Adds missing lit.local.cfg. Fixes rL362691.
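A lit.local.cfg of this kind is usually a two-line Python fragment that marks the directory unsupported when the target is not built. The sketch below wraps that idiom in a function with a stub `config` object so that it runs outside lit; `apply_local_cfg` and `make_config` are hypothetical helpers, and the exact contents of the file added by this commit are not shown in this log.

```python
from types import SimpleNamespace

def apply_local_cfg(config, target="RISCV"):
    # Same shape as the common lit.local.cfg idiom: skip the directory
    # when the backend was not built.
    if target not in config.root.targets:
        config.unsupported = True

def make_config(built_targets):
    # Stub standing in for the config object that lit normally injects.
    root = SimpleNamespace(targets=set(built_targets))
    return SimpleNamespace(root=root, unsupported=False)
```

With `make_config(["X86"])` the directory ends up marked unsupported; with `make_config(["X86", "RISCV"])` it does not.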
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362693
91177308-0d34-0410-b5e6-
96231b3b80d8
Petar Avramovic [Thu, 6 Jun 2019 10:00:41 +0000 (10:00 +0000)]
[MIPS GlobalISel] Select sqrt
Select G_FSQRT for MIPS32.
Differential Revision: https://reviews.llvm.org/D62905
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362692
91177308-0d34-0410-b5e6-
96231b3b80d8
Luis Marques [Thu, 6 Jun 2019 09:47:53 +0000 (09:47 +0000)]
[RISCV] Add CostModel GEP tests
Differential Revision: https://reviews.llvm.org/D61185
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362691
91177308-0d34-0410-b5e6-
96231b3b80d8
Petar Avramovic [Thu, 6 Jun 2019 09:22:37 +0000 (09:22 +0000)]
[MIPS GlobalISel] Select fabs
Select G_FABS for MIPS32.
Differential Revision: https://reviews.llvm.org/D62903
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362690
91177308-0d34-0410-b5e6-
96231b3b80d8
Petar Avramovic [Thu, 6 Jun 2019 09:16:58 +0000 (09:16 +0000)]
[MIPS GlobalISel] Select fpext and fptrunc
Select G_FPEXT and G_FPTRUNC for MIPS32.
Differential Revision: https://reviews.llvm.org/D62902
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362689
91177308-0d34-0410-b5e6-
96231b3b80d8
Petar Avramovic [Thu, 6 Jun 2019 09:02:24 +0000 (09:02 +0000)]
[MIPS GlobalISel] Select floor and ceil
Select G_FFLOOR and G_FCEIL for MIPS32.
Differential Revision: https://reviews.llvm.org/D62901
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362688
91177308-0d34-0410-b5e6-
96231b3b80d8
Sam Parker [Thu, 6 Jun 2019 08:56:26 +0000 (08:56 +0000)]
[SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no-(un)signed-wrap flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.
Differential Revision: https://reviews.llvm.org/D61934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362687
91177308-0d34-0410-b5e6-
96231b3b80d8
Dylan McKay [Thu, 6 Jun 2019 08:06:50 +0000 (08:06 +0000)]
[AVR] Fix the 'load.ll' test after r362351
In that commit, the 'load.ll' test was modified, but still failed.
This commit updates the test so that it now passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362684
91177308-0d34-0410-b5e6-
96231b3b80d8
Amara Emerson [Thu, 6 Jun 2019 07:58:37 +0000 (07:58 +0000)]
[AArch64][GlobalISel] Add manual selection support for G_ZEXTLOADs to s64.
We already get support for G_ZEXTLOAD to s32 from the importer, but it can't
deal with the SUBREG_TO_REG in the pattern. Tweaking the existing manual
selection code for G_LOAD to handle an additional SUBREG_TO_REG when dealing
with G_ZEXTLOAD isn't much work.
Also add tests to check the imported pattern selections to s32 work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362681
91177308-0d34-0410-b5e6-
96231b3b80d8
Amara Emerson [Thu, 6 Jun 2019 07:33:47 +0000 (07:33 +0000)]
[AArch64][GlobalISel] Add the new changes to fix PR42129 that were supposed to go into r362666.
The changes weren't staged, so it ended up just re-committing the unmodified reverted change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362677
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 6 Jun 2019 05:41:27 +0000 (05:41 +0000)]
[X86] Don't turn avx masked.load with constant mask into masked.load+vselect when passthru value is all zeroes.
The transform is intended to enable the use of an immediate blend or
a more optimal instruction. But if the passthru is zero we don't
need any additional instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362675
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Thu, 6 Jun 2019 05:41:22 +0000 (05:41 +0000)]
[X86] Add test case for masked load with constant mask and all zeros passthru.
avx/avx2 masked loads only support all zeros for passthru in hardware.
So we have to emit a blend for all other values. We have an optimization
that tries to optimize this blend if the mask is constant. But we
don't need to perform this optimization if the passthru value is zero
which doesn't need the blend at all.
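The reasoning above can be modelled on plain lists. This is a sketch only: `hw_masked_load`, `blend`, and `masked_load` are hypothetical names, and the returned count is an illustrative "number of operations", not real instruction selection output.

```python
def hw_masked_load(memory, mask):
    # Models the hardware behaviour described above: disabled lanes read as zero.
    return [value if bit else 0 for value, bit in zip(memory, mask)]

def blend(mask, on_true, on_false):
    # Per-lane select between two vectors.
    return [t if bit else f for bit, t, f in zip(mask, on_true, on_false)]

def masked_load(memory, mask, passthru):
    """Return (result, operations used) for a masked load with a passthru value."""
    loaded = hw_masked_load(memory, mask)
    if all(lane == 0 for lane in passthru):
        # Zero passthru already matches what the hardware produces: no blend.
        return loaded, 1
    return blend(mask, loaded, passthru), 2
```

With mask `[True, False, True, False]` over `[1, 2, 3, 4]`, a zero passthru yields `[1, 0, 3, 0]` in one operation, while a passthru of `[9, 9, 9, 9]` needs the extra blend.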
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362674
91177308-0d34-0410-b5e6-
96231b3b80d8
Amara Emerson [Wed, 5 Jun 2019 23:46:16 +0000 (23:46 +0000)]
Revert "Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp""
When looking through copies, make sure to not try to find the vreg def of a physreg.
Normally getVRegDef will return nullptr in this case, but if there happen to be
multiple defs then it will assert.
This fixes PR42129.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362666
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 5 Jun 2019 22:37:50 +0000 (22:37 +0000)]
AMDGPU: Don't fix emergency stack slot at offset 0
This forced the caller to be aware of this, which is an ugly ABI
feature.
Partially reverts r295877. The original reasons for doing this are
mostly fixed. Alloca is now in a non-0 address space, so it should be
OK to have 0 as a valid pointer. Since we treat the absolute address
as the pointer value, this part only really needed to apply to
kernels.
Since r357093, we avoid the need to increment/decrement the offset
register in more cases, and since r354816 the scavenger can fail
without spilling, so it's less critical that we try to avoid an offset
that fits in the MUBUF offset.
Restrict to callable functions for now to split this into 2 steps to
limit the number of test updates and in case anything breaks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362665
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Wed, 5 Jun 2019 22:37:05 +0000 (22:37 +0000)]
[MSAN] Add unary FNeg visitor to the MemorySanitizer
Differential Revision: https://reviews.llvm.org/D62909
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362664
91177308-0d34-0410-b5e6-
96231b3b80d8
Ulrich Weigand [Wed, 5 Jun 2019 22:33:10 +0000 (22:33 +0000)]
Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point
intrinsics are currently never passed to the target back-end, which makes
it impossible to handle them correctly (e.g. mark instructions as depending
on a floating-point status and control register, or mark instructions as
possibly trapping).
This patch allows the target to use setOperationAction to switch the action
on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code
will stop converting the STRICT nodes to regular floating-point nodes, but
instead pass the STRICT nodes to the target using normal SelectionDAG
matching rules.
To avoid having the back-end duplicate all the floating-point instruction
patterns to handle both strict and non-strict variants, we make the MI
codegen explicitly aware of the floating-point exceptions by introducing
two new concepts:
- A new MCID flag "mayRaiseFPException" that the target should set on any
instruction that possibly can raise FP exception according to the
architecture definition.
- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI
instruction resulting from expansion of any constrained FP intrinsic.
Any MI instruction that is *both* marked as mayRaiseFPException *and*
FPExcept then needs to be considered as raising exceptions by MI-level
codegen (e.g. scheduling).
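The combination rule for the two flags can be written down as a small model. This is a sketch with hypothetical class and field names; it only mirrors the "both flags must be set" condition stated above, not the actual MI data structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InstrDesc:
    # Static, per-opcode property (models the MCID flag).
    may_raise_fp_exception: bool

@dataclass(frozen=True)
class MachineInstr:
    desc: InstrDesc
    # Per-instruction flag (models the MI flag set for constrained FP intrinsics).
    fp_except: bool

def raises_fp_exception(mi: MachineInstr) -> bool:
    # Only an instruction carrying BOTH flags is treated as raising exceptions.
    return mi.desc.may_raise_fp_exception and mi.fp_except
```

An FP add selected from a regular (non-strict) node has the opcode-level flag but not the per-instruction one, so a scheduler using this predicate stays free to reorder it.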
Setting those two new flags is straightforward. The mayRaiseFPException
flag is simply set via TableGen by marking all relevant instruction
patterns in the .td files.
The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes
in the SelectionDAG, and gets inherited in the MachineSDNode nodes created
from it during instruction selection. The flag is then transferred to an
MIFlag when creating the MI from the MachineSDNode. This is handled just
like fast-math flags like no-nans are handled today.
This patch includes both common code changes required to implement the
new features, and the SystemZ implementation.
Reviewed By: andrew.w.kaylor
Differential Revision: https://reviews.llvm.org/D55506
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362663
91177308-0d34-0410-b5e6-
96231b3b80d8
Petr Hosek [Wed, 5 Jun 2019 22:27:31 +0000 (22:27 +0000)]
Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp"
This reverts commit r362435 as this triggers ICE, see PR42129 for details.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362662
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 5 Jun 2019 22:20:47 +0000 (22:20 +0000)]
AMDGPU: Invert frame index offset interpretation
Since the beginning, the offset of a frame index has been consistently
interpreted backwards. It was treating it as an offset from the
scratch wave offset register as a frame register. The correct
interpretation is the offset from the SP on entry to the function,
before the prolog. Frame index elimination then should select either
SP or another register as an FP.
Treat the scratch wave offset on kernel entry as the pre-incremented
SP. Rely more heavily on the standard hasFP and frame pointer
elimination logic, and clean up the private reservation code. This
saves a copy in most callee functions.
The kernel prolog emission code is still kind of a mess relying on
checking the uses of physical registers, which I would prefer to
eliminate.
Currently selection directly emits MUBUF instructions, which require
using a reference to some register. Use the register chosen for SP,
and then ignore this later. This should probably be cleaned up to use
pseudos that don't refer to any specific base register until frame
index elimination.
Add a workaround for shaders using large numbers of SGPRs. I'm not
sure these cases were ever working correctly, since as far as I can
tell the logic for figuring out which SGPR is the scratch wave offset
doesn't match up with the shader input initialization in the shader
programming guide.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362661
91177308-0d34-0410-b5e6-
96231b3b80d8
Joseph Tremoulet [Wed, 5 Jun 2019 21:30:10 +0000 (21:30 +0000)]
[EarlyCSE] Add tests for negated min/max/abs [NFC]
Summary:
I'm planning to update the hashing logic to recognize their equivalence
in a subsequent change (D62644).
Reviewers: spatel
Reviewed By: spatel
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62918
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362657
91177308-0d34-0410-b5e6-
96231b3b80d8
Mircea Trofin [Wed, 5 Jun 2019 21:28:13 +0000 (21:28 +0000)]
[CallSite removal] Refactoring llvm::InlineFunction APIs
Summary:
This change only unifies the previous API pair accepting
CallInst and InvokeInst, thus making it easier to refactor
inliner pass code to CallBase. The implementation of the unified
API still relies on the CallSite implementation.
Reviewers: eraman, chandlerc, jdoerfert
Reviewed By: jdoerfert
Subscribers: jdoerfert, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62283
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362656
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 5 Jun 2019 21:26:52 +0000 (21:26 +0000)]
[InstCombine] simplify code for bitcast of insertelement; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362655
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 5 Jun 2019 21:15:52 +0000 (21:15 +0000)]
NewGVN: Handle addrspacecast
The AllConstant check needs to be moved out of the if/else if chain to
avoid a test regression. The "there is no SimplifyZExt" comment
puzzles me, since there is SimplifyCastInst. Additionally, the
Simplify* calls seem to not see the operand as constant, so this needs
to be tried if the simplify failed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362653
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 5 Jun 2019 21:00:31 +0000 (21:00 +0000)]
[X86] Fix mistake that marked VADDSSrrb_Int/VADDSDrrb_Int/VMULSSrrb_Int/VMULSDrrb_Int as commutable.
One of the sources controls the pass through value for the upper bits
of the result so we can't really commute it.
In practice this problem isn't a functional issue, because we would
only try to commute this instruction in order to fold a load, and we
can't do embedded rounding and fold a load at the same time. The load
fold would therefore never succeed, so I don't think we would ever
commute, or at least keep the version after commuting.
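Why the scalar form is not commutable can be seen in a small model. This is a sketch: `add_ss` is a hypothetical stand-in for a scalar add whose upper lanes pass through from the first source, and it ignores rounding modes entirely.

```python
def add_ss(a, b):
    # Lane 0 is the sum; the upper lanes are passed through from the FIRST source.
    return [a[0] + b[0]] + list(a[1:])

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

original = add_ss(a, b)   # upper lanes come from a
commuted = add_ss(b, a)   # upper lanes come from b
```

The low lane agrees in both orders, but the passed-through upper lanes differ, so swapping the operands changes the result.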
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362647
91177308-0d34-0410-b5e6-
96231b3b80d8
Whitney Tsang [Wed, 5 Jun 2019 20:42:47 +0000 (20:42 +0000)]
[LOOPINFO] Extend Loop object to add utilities to get the loop bounds,
step, and loop induction variable.
Summary: This PR extends the loop object with more utilities to get loop
bounds, step, and loop induction variable. There already exist passes
which try to obtain the loop induction variable in their own pass, e.g.
loop interchange. It would be useful to have a common area to get this
information.
/// Example:
/// for (int i = lb; i < ub; i+=step)
///   <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
///   guardcmp = (lb < ub)
///   if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
///   i1 = phi[{lb, preheader}, {i2, latch}]
///   <loop body>
///   i2 = i1 + step
/// latch:
///   cmp = (i2 < ub)
///   if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
///   getInitialIVValue      --> lb
///   getStepInst            --> i2 = i1 + step
///   getStepValue           --> step
///   getFinalIVValue        --> ub
///   getCanonicalPredicate  --> '<'
///   getDirection           --> Increasing
/// getInductionVariable          --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical                   --> false
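The queries in the example can be mimicked with concrete integers. This is a sketch, not the LLVM API: the `LoopBounds` class below is a hypothetical Python model, it assumes known constant bounds (the example above uses symbolic `lb`, `ub`, and `step`), and it assumes "canonical" means starting at zero with a step of one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoopBounds:
    initial: int      # lb in the example
    step: int         # step in the example
    final: int        # ub in the example
    predicate: str    # canonical exit predicate, e.g. '<'

    def direction(self) -> str:
        if self.step > 0:
            return "Increasing"
        if self.step < 0:
            return "Decreasing"
        return "Unknown"

    def is_canonical(self) -> bool:
        # Assumption: canonical = induction variable starts at 0, increments by 1.
        return self.initial == 0 and self.step == 1

    def iv_values(self) -> list:
        # Values taken by the induction variable; only the increasing '<' form is modelled.
        if self.predicate != "<" or self.step <= 0:
            raise ValueError("only increasing '<' loops are modelled in this sketch")
        return list(range(self.initial, self.final, self.step))
```

For `LoopBounds(2, 3, 11, "<")` the direction is increasing, the loop is not canonical, and the induction variable takes the values 2, 5, and 8.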
Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara,
fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya,
llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D60565
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362644
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Wed, 5 Jun 2019 20:38:17 +0000 (20:38 +0000)]
InstCombine: correctly change byval type attribute alongside call args.
When the byval attribute has a type, it must match the pointee type of
any parameter; but InstCombine was not updating the attribute when
folding casts of various kinds away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362643
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Wed, 5 Jun 2019 20:37:47 +0000 (20:37 +0000)]
IR: make getParamByValType Just Work. NFC.
Most parts of LLVM don't care whether the byval type is derived from an
explicit Attribute or from the parameter's pointee type, so it makes
sense for the main access function to just return the right value.
The very few users who do care (only BitcodeReader so far) can find out
how it's specified by accessing the Attribute directly.
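The "just return the right value" behaviour amounts to a fallback lookup. This is a sketch with hypothetical names (`Param`, `param_byval_type`); types are plain strings here, and it does not reproduce the real accessor.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Param:
    pointee_type: str                      # type the pointer parameter points to
    byval_attr_type: Optional[str] = None  # explicit type on the attribute, if any

def param_byval_type(param: Param) -> str:
    # Prefer the explicit attribute type; otherwise derive it from the pointee type.
    if param.byval_attr_type is not None:
        return param.byval_attr_type
    return param.pointee_type
```

Callers that need to know which of the two sources was used would inspect `byval_attr_type` directly, matching the note about BitcodeReader above.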
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362642
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 5 Jun 2019 20:32:32 +0000 (20:32 +0000)]
AMDGPU: Remove amdgpu-max-work-group-size attribute
This has been deprecated for a long time, and mesa recently switched
to amdgpu-flat-work-group-size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362641
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 5 Jun 2019 20:32:25 +0000 (20:32 +0000)]
AMDGPU: Fix using 2 different enums for same operand flags
These enums are really for the same namespace of flags set on
arbitrary MachineOperands, so merge them to avoid value collisions.
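The collision hazard can be shown with a small model. This is a sketch: all enum and member names below are hypothetical and do not correspond to the AMDGPU flag names.

```python
from enum import IntEnum, unique

# Two separate enums that both end up stored in the same operand flag field.
class FlagsA(IntEnum):
    NONE = 0
    KIND_X = 1

class FlagsB(IntEnum):
    NONE = 0
    KIND_Y = 1   # same raw value as FlagsA.KIND_X

# Merged enum: one namespace, and @unique rejects duplicate values at class creation.
@unique
class OperandFlags(IntEnum):
    NONE = 0
    KIND_X = 1
    KIND_Y = 2

def decode(raw: int) -> str:
    # With a single enum, a raw flag value maps back to exactly one name.
    return OperandFlags(raw).name
```

With the split enums, a stored value of 1 could mean either `KIND_X` or `KIND_Y`; after merging, `decode(1)` and `decode(2)` are unambiguous.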
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362640
91177308-0d34-0410-b5e6-
96231b3b80d8
Dan Gohman [Wed, 5 Jun 2019 20:01:01 +0000 (20:01 +0000)]
[WebAssembly] Limit PIC support to the Emscripten target
The current PIC support only works with Emscripten, so
disable it for other targets.
This is the PIC portion of https://reviews.llvm.org/D62542.
Reviewed By: dschuff, sbc100
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362638
91177308-0d34-0410-b5e6-
96231b3b80d8