OSDN Git Service

android-x86/external-llvm.git
5 years ago[SelectionDAG] Support promotion of PREFETCH operands
Alex Bradbury [Fri, 30 Nov 2018 10:06:31 +0000 (10:06 +0000)]
[SelectionDAG] Support promotion of PREFETCH operands

For targets where i32 is not a legal type (e.g. 64-bit RISC-V),
LegalizeIntegerTypes must promote the operands of ISD::PREFETCH.

Differential Revision: https://reviews.llvm.org/D53281

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347980 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LoopSimplifyCFG] Update MemorySSA in terminator folding. PR39783
Max Kazantsev [Fri, 30 Nov 2018 10:06:23 +0000 (10:06 +0000)]
[LoopSimplifyCFG] Update MemorySSA in terminator folding. PR39783

Terminator folding transform lacks MemorySSA update for memory Phis,
while they exist within MemorySSA analysis. They need exactly the same
type of updates as regular Phis. Failing to update them properly ends up
with inconsistent MemorySSA and manifests in various assertion failures.

This patch adds Memory Phi updates to this transform.

Thanks to @jonpa for finding this!

Differential Revision: https://reviews.llvm.org/D55050
Reviewed By: asbirlea

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347979 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SelectionDAG] Support promotion of FRAMEADDR/RETURNADDR operands
Alex Bradbury [Fri, 30 Nov 2018 10:02:06 +0000 (10:02 +0000)]
[SelectionDAG] Support promotion of FRAMEADDR/RETURNADDR operands

For targets where i32 is not a legal type (e.g. 64-bit RISC-V),
LegalizeIntegerTypes must promote the operand.

Differential Revision: https://reviews.llvm.org/D53279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347978 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TargetLowering][RISCV] Introduce isSExtCheaperThanZExt hook and implement for RISC-V
Alex Bradbury [Fri, 30 Nov 2018 09:56:54 +0000 (09:56 +0000)]
[TargetLowering][RISCV] Introduce isSExtCheaperThanZExt hook and implement for RISC-V

DAGTypeLegalizer::PromoteSetCCOperands currently prefers to zero-extend
operands when it is able to do so. For some targets this is more expensive
than a sign-extension, which is also a valid choice. Introduce the
isSExtCheaperThanZExt hook and use it in the new SExtOrZExtPromotedInteger
helper. On RISC-V, we prefer sign-extension for FromTy == MVT::i32 and ToTy ==
MVT::i64, as it can be performed using a single instruction.
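
Either extension is a valid choice because an i32 comparison gives the same answer once both operands are widened the same way. A minimal Python sketch of that property follows; the helper names `sext32` and `zext32` are illustrative, not LLVM APIs, and 64-bit registers are modelled as unsigned integers.

```python
def zext32(x):
    """Zero-extend a 32-bit value to 64 bits."""
    return x & 0xFFFFFFFF

def sext32(x):
    """Sign-extend a 32-bit value to 64 bits (as an unsigned 64-bit pattern)."""
    x &= 0xFFFFFFFF
    return (x | 0xFFFFFFFF00000000) if x & 0x80000000 else x

samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF]

# An unsigned i32 comparison gives the same answer whether both operands
# are zero-extended or both are sign-extended before a 64-bit unsigned compare.
sext_matches_zext = all(
    (zext32(a) < zext32(b)) == (sext32(a) < sext32(b))
    for a in samples for b in samples
)
```

The check only covers unsigned less-than on a handful of boundary values; it is meant to illustrate the idea rather than to prove it.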

Differential Revision: https://reviews.llvm.org/D52978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347977 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC] Simplify and reduce tests for PR39783
Max Kazantsev [Fri, 30 Nov 2018 09:51:25 +0000 (09:51 +0000)]
[NFC] Simplify and reduce tests for PR39783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347976 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[RISCV] Introduce codegen patterns for instructions introduced in RV64I
Alex Bradbury [Fri, 30 Nov 2018 09:38:44 +0000 (09:38 +0000)]
[RISCV] Introduce codegen patterns for instructions introduced in RV64I

As discussed in the RFC
<http://lists.llvm.org/pipermail/llvm-dev/2018-October/126690.html>, 64-bit
RISC-V has i64 as the only legal integer type. This patch introduces patterns
to support codegen of the new instructions introduced in RV64I: addiw, addw,
subw, sllw, slliw, srlw, srliw, sraw, sraiw, ld, sd.

Custom selection code is needed for srliw as SimplifyDemandedBits will remove
lower bits from the mask, meaning the obvious pattern won't work:

def : Pat<(sext_inreg (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt), i32),
          (SRLIW GPR:$rs1, uimm5:$shamt)>;

This is sufficient to compile and execute all of the GCC torture suite for
RV64I other than those files using frameaddr or returnaddr intrinsics
(LegalizeDAG doesn't know how to promote the operands - a future patch
addresses this).
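
To see why the mask in the srliw pattern is fragile, note that the low `shamt` bits of the AND result are shifted out and therefore not demanded, so a mask with those bits cleared computes the same value. The Python check below is a sketch of the arithmetic only; `srl_masked` is an illustrative helper, not LLVM code.

```python
def srl_masked(rs1, mask, shamt):
    """Compute (rs1 & mask) >> shamt on a 64-bit register value."""
    return ((rs1 & 0xFFFFFFFFFFFFFFFF) & mask) >> shamt

rs1 = 0x123456789ABCDEF0
shamt = 5
full_mask = 0xFFFFFFFF
# Clearing the low `shamt` bits of the mask does not change the result,
# because those bits are discarded by the shift anyway.
reduced_mask = full_mask & ~((1 << shamt) - 1)

same_result = srl_masked(rs1, full_mask, shamt) == srl_masked(rs1, reduced_mask, shamt)
```

Once the constant has been rewritten this way, a pattern that matches the literal 0xffffffff no longer applies, which is consistent with the need for custom selection.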

When promoting i32 sltu/sltiu operands, it would be more efficient to use
sign-extension rather than zero-extension for RV64. A future patch adds a hook
to allow this.

Differential Revision: https://reviews.llvm.org/D52977

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347973 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[docs][AtomicExpandPass] Document the alternate lowering strategy for part-word atomi...
Alex Bradbury [Fri, 30 Nov 2018 09:23:24 +0000 (09:23 +0000)]
[docs][AtomicExpandPass] Document the alternate lowering strategy for part-word atomicrmw/cmpxchg

D47882, D48130 and D48131 introduce a new lowering strategy for part-word
atomicrmw/cmpxchg and use it to lower these operations for the RISC-V target.
Rather than having AtomicExpandPass produce the LL/SC loop at the IR level, it
instead calculates the necessary mask values and inserts a target-specific
intrinsic, which is lowered at a much later stage (after register allocation).
This ensures that architecture-specific restrictions for forward-progress in
LL/SC loops can be guaranteed.
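
The mask calculation can be illustrated with a small Python sketch. It assumes a little-endian target with aligned 32-bit words; the function names are hypothetical and only model the arithmetic, not the actual AtomicExpandPass code or the target-specific intrinsic.

```python
def part_word_params(addr, size_bytes):
    """Return (aligned_addr, shift, mask) for a 1- or 2-byte access at addr."""
    aligned_addr = addr & ~0x3              # address of the containing 32-bit word
    shift = (addr & 0x3) * 8                # bit offset within the word (little-endian)
    mask = ((1 << (size_bytes * 8)) - 1) << shift
    return aligned_addr, shift, mask

def masked_update(old_word, new_value, shift, mask):
    """Replace only the masked bits of old_word with new_value."""
    return (old_word & ~mask & 0xFFFFFFFF) | ((new_value << shift) & mask)

aligned, shift, mask = part_word_params(0x1002, 1)
updated = masked_update(0xAABBCCDD, 0x11, shift, mask)
```

In this example the byte at address 0x1002 occupies bits 16 to 23 of the word at 0x1000, so only that byte changes.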

This patch documents this new AtomicExpandPass functionality. See the previous
llvm-dev RFC for more info
<http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>.

Differential Revision: https://reviews.llvm.org/D52234

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347971 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Emit PACKUS directly from the v16i8 LowerMULH code instead of using a shuffle.
Craig Topper [Fri, 30 Nov 2018 08:32:05 +0000 (08:32 +0000)]
[X86] Emit PACKUS directly from the v16i8 LowerMULH code instead of using a shuffle.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347967 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Change the pre-sse4.1 code in the v16i8 MULHU lowering to be what we get after...
Craig Topper [Fri, 30 Nov 2018 08:32:01 +0000 (08:32 +0000)]
[X86] Change the pre-sse4.1 code in the v16i8 MULHU lowering to be what we get after DAG combine cleans it up.

Previously we emitted a punpcklbw/punpckhbw to move the byte elements into the upper half of 16-bit elements, then shifted right by 8 to zero the upper bits. After DAG combine we end up with punpcklbw/punpckhbw into the lower bits with zeros in the upper bits and no shifts. So just emit that directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347966 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Don't expand sdiv when optimising for minsize
Sjoerd Meijer [Fri, 30 Nov 2018 08:14:28 +0000 (08:14 +0000)]
[ARM] Don't expand sdiv when optimising for minsize

Don't expand SDIV with an immediate that is a power of 2 if we optimise for
minimum code size. For example:

sdiv i32 %1, 4

gets expanded to a sequence of 3 instructions, but this is suboptimal for
minimum code size so instead we just generate a MOV and a SDIV if integer
division is supported.
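
For reference, the usual branch-free expansion of a signed division by a power of two adds a bias derived from the sign bit before shifting. The Python sketch below models that arithmetic on 32-bit values; it illustrates the general technique and is not taken from the ARM backend.

```python
def sdiv_pow2(x, k):
    """Signed 32-bit x divided by 2**k (1 <= k <= 31), rounding toward zero."""
    sign = -1 if x < 0 else 0                   # models an arithmetic shift right by 31
    bias = (sign & 0xFFFFFFFF) >> (32 - k)      # 2**k - 1 for negative x, else 0
    return (x + bias) >> k                      # arithmetic shift right by k

def trunc_div(x, d):
    """Reference: C-style division that truncates toward zero."""
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q

results = [sdiv_pow2(x, 2) for x in (-8, -7, -1, 0, 7, 8)]
```

The bias is what makes negative inputs round toward zero; a plain arithmetic shift would round them toward negative infinity.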

Differential Revision: https://reviews.llvm.org/D54546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347965 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CodeGen] Fix bugs in BranchFolderPass when debug labels are generated.
Hsiangkai Wang [Fri, 30 Nov 2018 08:07:29 +0000 (08:07 +0000)]
[CodeGen] Fix bugs in BranchFolderPass when debug labels are generated.

Skip DBG_VALUE and DBG_LABEL in branch folding algorithms.

The bug is reported in
https://bugs.chromium.org/p/chromium/issues/detail?id=898160.

Differential Revision: https://reviews.llvm.org/D54199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347964 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC] Refine doxygen format.
Hsiangkai Wang [Fri, 30 Nov 2018 08:07:24 +0000 (08:07 +0000)]
[NFC] Refine doxygen format.

Differential Revision: https://reviews.llvm.org/D54568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347963 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SystemZ::TTI] i8/i16 operands extension costs revisited
Jonas Paulsson [Fri, 30 Nov 2018 07:09:34 +0000 (07:09 +0000)]
[SystemZ::TTI] i8/i16 operands extension costs revisited

Three minor changes to these extra costs:

* For ICmp instructions, instead of adding 2 all the time for extending each
  operand, this is only done if that operand is neither a load nor an
  immediate.

* The operand extension costs for divides are removed, because we now use a high
  cost already for the divide (20).

* The extra costs for lshr/ashr are removed, as this did not seem useful.

Review: Ulrich Weigand
https://reviews.llvm.org/D55053

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347961 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Fix a couple types in SimplifyDemandedVectorEltsForTargetNode. NFCI
Craig Topper [Fri, 30 Nov 2018 06:23:55 +0000 (06:23 +0000)]
[X86] Fix a couple types in SimplifyDemandedVectorEltsForTargetNode. NFCI

We had an EVT variable capturing the result of getSimpleValueType, which returns an MVT. Another place was using EVT where MVT would do, and an 'int' should have been 'unsigned'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347959 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-objcopy] Move elf-specific tests into subfolder
Alexander Shaposhnikov [Fri, 30 Nov 2018 05:43:39 +0000 (05:43 +0000)]
[llvm-objcopy] Move elf-specific tests into subfolder

In this diff the elf-specific tests are moved into the subfolder llvm-objcopy/ELF
(the change was discussed in the comments on https://reviews.llvm.org/D54674).
A separate code review wasn't sent for this change
since Phabricator is failing to create such a large diff.

Test plan:
make check-all
make check-llvm-tools
make check-llvm-tools-llvm-objcopy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347958 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoFix build warnings introduced in rL347938
Mircea Trofin [Fri, 30 Nov 2018 01:53:17 +0000 (01:53 +0000)]
Fix build warnings introduced in rL347938

Summary:
Suppressed warnings in release builds due to variable used
only in assert statement.

Subscribers: llvm-commits, eraman, mgorny

Differential Revision: https://reviews.llvm.org/D55100

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347939 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert "Revert r347596 "Support for inserting profile-directed cache prefetches""
Mircea Trofin [Fri, 30 Nov 2018 01:01:52 +0000 (01:01 +0000)]
Revert "Revert r347596 "Support for inserting profile-directed cache prefetches""

Summary:
This reverts commit d8517b96dfbd42e6a8db33c50d1fa1e58e63fbb9.

Fix: correct the use of DenseMap.

Reviewers: davidxl, hans, wmi

Reviewed By: wmi

Subscribers: mgorny, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D55088

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347938 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CMake] build correctly if build path contains whitespace
Shoaib Meenai [Fri, 30 Nov 2018 00:30:53 +0000 (00:30 +0000)]
[CMake] build correctly if build path contains whitespace

The add_llvm_symbol_exports function in AddLLVM.cmake creates command
line link flags with paths containing CMAKE_CURRENT_BINARY_DIR, but that
will break if CMAKE_CURRENT_BINARY_DIR contains whitespace. This patch
adds quotes to those paths.
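
The failure mode is ordinary word splitting. The Python sketch below uses `shlex.split` as a stand-in for how a shell-like command line is tokenized; the directory and file names are made up for illustration and are not the exact strings AddLLVM.cmake emits.

```python
import shlex

binary_dir = "/tmp/my build"   # hypothetical build directory containing a space
flag = "-Wl,-exported_symbols_list,"

# Without quotes the path is split at the space into two arguments.
unquoted = shlex.split(flag + binary_dir + "/lib.exports")
# With quotes around the path the flag stays a single argument.
quoted = shlex.split(flag + '"' + binary_dir + '/lib.exports"')
```

Here `unquoted` has two elements and `quoted` has one, which mirrors why quoting the paths keeps the link flag intact.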

Fixes PR39843.

Patch by John Garvin.

Differential Revision: https://reviews.llvm.org/D55081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347937 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SCEV] Guard movement of insertion point for loop-invariants
Warren Ristow [Fri, 30 Nov 2018 00:02:54 +0000 (00:02 +0000)]
[SCEV] Guard movement of insertion point for loop-invariants

r320789 suppressed moving the insertion point of SCEV expressions with
div/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.
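
The hazard can be shown outside of LLVM with a small Python analogy, in which a division by a loop-invariant denominator sits behind a zero check. The functions are illustrative and do not model SCEV expansion itself.

```python
def guarded_loop(n, d, iterations):
    """Original form: the division only runs when the guard allows it."""
    total = 0
    for _ in range(iterations):
        if d != 0:
            total += n // d
    return total

def hoisted_loop(n, d, iterations):
    """Unsafe form: the loop-invariant division is hoisted above the guard."""
    q = n // d          # executes even when d == 0
    total = 0
    for _ in range(iterations):
        if d != 0:
            total += q
    return total

safe = guarded_loop(10, 0, 3)
try:
    hoisted_loop(10, 0, 3)
    hoisted_failed = False
except ZeroDivisionError:
    hoisted_failed = True
```

Both functions agree whenever the denominator is non-zero; they differ only in the case the guard was protecting against.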

This fixes PR30806.

Differential Revision: https://reviews.llvm.org/D54713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347934 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[gn build] merge r346978 and r347741.
Nico Weber [Thu, 29 Nov 2018 23:03:17 +0000 (23:03 +0000)]
[gn build] merge r346978 and r347741.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347929 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[gn build] Set +x bit on .py files in llvm/utils/gn/build.
Nico Weber [Thu, 29 Nov 2018 22:56:40 +0000 (22:56 +0000)]
[gn build] Set +x bit on .py files in llvm/utils/gn/build.

Also add a shebang line to write_cmake_config.py.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347928 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[gn build] Add template for running llvm-tblgen and use it to add build file for...
Nico Weber [Thu, 29 Nov 2018 22:53:21 +0000 (22:53 +0000)]
[gn build] Add template for running llvm-tblgen and use it to add build file for llvm/lib/IR.

Also adds a boring build file for llvm/lib/BinaryFormat (needed by llvm/lib/IR).

lib/IR marks Attributes and IntrinsicsEnum as public_deps (because IR's public
headers include the generated .inc files), so projects depending on lib/IR will
implicitly depend on them being generated. As a consequence, most targets won't
have to explicitly list a dependency on these tablegen steps (contrast with
intrinsics_gen in the cmake build).

This doesn't yet have the optimization where tablegen's output is only updated
if it's changed.

Differential Revision: https://reviews.llvm.org/D55028#inline-486755

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347927 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[gn build] Add a script checking if sources in BUILD.gn and CMakeLists.txt files...
Nico Weber [Thu, 29 Nov 2018 22:25:31 +0000 (22:25 +0000)]
[gn build] Add a script checking if sources in BUILD.gn and CMakeLists.txt files match.

Also fix a missing file in lib/Support/BUILD.gn found by the script.

The script is very stupid and assumes that CMakeLists.txt files follow the standard
LLVM CMakeLists.txt formatting with one cpp source file per line. Despite its
simplicity, it works well in practice.

It would be nice if it also checked deps and maybe automatically applied its
suggestions.

Differential Revision: https://reviews.llvm.org/D54930

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347925 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[WebAssembly] Expand unavailable integer operations for vectors
Thomas Lively [Thu, 29 Nov 2018 22:01:01 +0000 (22:01 +0000)]
[WebAssembly] Expand unavailable integer operations for vectors

Summary:
Expands for vector types all of the integer operations that are
expanded for scalars because they are not supported at all by
WebAssembly.

This CL has no tests because such tests would really be testing the
target-independent expansion, but I'm happy to add tests if reviewers
think it would be helpful.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D55010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347923 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoProduce an error on non-encodable offsets for darwin ARM scattered relocations.
Jonas Devlieghere [Thu, 29 Nov 2018 21:58:23 +0000 (21:58 +0000)]
Produce an error on non-encodable offsets for darwin ARM scattered relocations.

Scattered ARM relocations for Mach-O only have 24 bits available to
encode the offset. This is not checked but just truncated and can result
in corrupt binaries after linking because the relocations are applied to
the wrong offset. This patch will check and error out in those
situations instead of emitting a wrong relocation.
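
The difference between truncating and checking can be sketched in a few lines of Python. The function names and the error text are hypothetical; only the 24-bit limit comes from the description above.

```python
MAX_SCATTERED_OFFSET = (1 << 24) - 1   # 24 bits available for the offset

def truncate_offset(offset):
    """Old behaviour as described: silently keep only the low 24 bits."""
    return offset & MAX_SCATTERED_OFFSET

def checked_offset(offset):
    """New behaviour as described: refuse offsets that do not fit."""
    if offset > MAX_SCATTERED_OFFSET:
        raise ValueError(f"offset {offset:#x} does not fit in 24 bits")
    return offset

too_big = 0x1000010
wrong = truncate_offset(too_big)   # low 24 bits only, so the offset is wrong
```

With truncation, an offset of 0x1000010 is silently encoded as 0x10, so the relocation would be applied at the wrong place.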

Patch by: Sander Bogaert (dzn)

Differential revision: https://reviews.llvm.org/D54776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347922 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoComment tweak requested in code review. NFC
Paul Robinson [Thu, 29 Nov 2018 21:13:51 +0000 (21:13 +0000)]
Comment tweak requested in code review. NFC

I forgot to do this before committing D54755.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347918 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[DAGCombiner] narrow truncated binops
Sanjay Patel [Thu, 29 Nov 2018 20:58:26 +0000 (20:58 +0000)]
[DAGCombiner] narrow truncated binops

The motivating case for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=32023
and the corresponding rot16.ll regression tests.

Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc
sequences that don't get folded in IR.
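
The transform relies on the fact that, for wrapping integer operations, truncating the result is the same as operating on truncated operands. A quick Python check of that identity follows; the list of operators is chosen for illustration, and the patch itself may handle a different set.

```python
import operator

def trunc(x, bits):
    """Keep only the low `bits` bits of x."""
    return x & ((1 << bits) - 1)

BINOPS = [operator.add, operator.sub, operator.mul,
          operator.and_, operator.or_, operator.xor]

x, y = 0x1234ABCD, 0x0F0F00FF
# trunc(binop(x, y)) equals binop on already-truncated operands (wrapped to
# the narrow width), which is what makes narrowing the operation valid.
narrowing_holds = all(
    trunc(op(x, y), 8) == trunc(op(trunc(x, 8), trunc(y, 8)), 8)
    for op in BINOPS
)
```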

As the TODO comments suggest, there will be regressions if we extend this (for x86,
we mostly seem to be missing LEA opportunities, but there are likely vector folds
missing too). I think those should be considered existing bugs because this is the
same transform that we do as an IR canonicalization in instcombine. We just need
more tests to make those visible independent of this patch.

Differential Revision: https://reviews.llvm.org/D54640

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347917 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[obj2yaml] [COFF] Write RVA instead of VA for sections, fix roundtripping executables
Martin Storsjo [Thu, 29 Nov 2018 20:53:57 +0000 (20:53 +0000)]
[obj2yaml] [COFF] Write RVA instead of VA for sections, fix roundtripping executables

yaml2obj writes the yaml value as is to the output file.

Differential Revision: https://reviews.llvm.org/D54965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347916 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[RISCV] Implement codegen for cmpxchg on RV32IA
Alex Bradbury [Thu, 29 Nov 2018 20:43:42 +0000 (20:43 +0000)]
[RISCV] Implement codegen for cmpxchg on RV32IA

Utilise a similar ('late') lowering strategy to D47882. The changes to
AtomicExpandPass allow this strategy to be utilised by other targets which
implement shouldExpandAtomicCmpXchgInIR.

All cmpxchg are lowered as 'strong' currently and failure ordering is ignored.
This is conservative but correct.

Differential Revision: https://reviews.llvm.org/D48131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347914 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdding .vscode to svn:ignore
Leonard Mosescu [Thu, 29 Nov 2018 20:41:10 +0000 (20:41 +0000)]
Adding .vscode to svn:ignore

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347913 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Change the pre-type legalization DAG combine added in r347898 into a custom...
Craig Topper [Thu, 29 Nov 2018 20:18:58 +0000 (20:18 +0000)]
[X86] Change the pre-type legalization DAG combine added in r347898 into a custom type legalization operation instead.

This seems to produce the same results on the tests we have.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347912 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r347871 "Fix: Add support for TFE/LWE in image intrinsic"
David Stuttard [Thu, 29 Nov 2018 20:14:17 +0000 (20:14 +0000)]
Revert r347871 "Fix: Add support for TFE/LWE in image intrinsic"

Also revert fix r347876

One of the buildbots was reporting a failure in some relevant tests that I can't
repro or explain at present, so reverting until I can isolate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347911 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoIntroduce MaxUsesToExplore argument to capture tracking
Artur Pilipenko [Thu, 29 Nov 2018 20:08:12 +0000 (20:08 +0000)]
Introduce MaxUsesToExplore argument to capture tracking

Currently CaptureTracker gives up if it encounters a value with more than 20
uses. The motivation for this cap is to keep it relatively cheap for
the BasicAliasAnalysis use case, where the results can't be cached. However, other
clients of CaptureTracker might be OK with a higher cost. This patch introduces an
argument for PointerMayBeCaptured functions to specify the max number of uses to
explore. The motivation for this change is a downstream user of CaptureTracker,
but I believe upstream clients of CaptureTracker might also benefit from a more
fine-grained cap.
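
The effect of the cap can be modelled with a short Python sketch. This is a simplified model with hypothetical names and a made-up notion of a "capturing use"; it does not reproduce the C++ signature of the PointerMayBeCaptured functions.

```python
DEFAULT_MAX_USES_TO_EXPLORE = 20   # cap mentioned in the commit message

def pointer_may_be_captured(uses, is_capturing_use,
                            max_uses_to_explore=DEFAULT_MAX_USES_TO_EXPLORE):
    """Conservatively answer True once more than the cap of uses is seen."""
    for count, use in enumerate(uses, start=1):
        if count > max_uses_to_explore:
            return True                 # give up: assume captured
        if is_capturing_use(use):
            return True
    return False

uses = ["load"] * 25                    # 25 harmless uses

def never_captures(use):
    return use == "store_ptr"

default_answer = pointer_may_be_captured(uses, never_captures)
raised_cap_answer = pointer_may_be_captured(uses, never_captures,
                                            max_uses_to_explore=100)
```

With the default cap the walk gives up and reports a possible capture; a client that can afford the extra cost gets the precise answer by raising the cap.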

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D55042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347910 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[MachineScheduler] Order FI-based memops based on stack direction
Francis Visoiu Mistrih [Thu, 29 Nov 2018 20:03:19 +0000 (20:03 +0000)]
[MachineScheduler] Order FI-based memops based on stack direction

It makes more sense to order FI-based memops in descending order when
the stack goes down. This allows offsets to stay "consecutive" and allows
easier pattern matching.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347906 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SelectionDAG][AArch64][X86] Move legalization of vector MULHS/MULHU from LegalizeDAG...
Craig Topper [Thu, 29 Nov 2018 19:36:17 +0000 (19:36 +0000)]
[SelectionDAG][AArch64][X86] Move legalization of vector MULHS/MULHU from LegalizeDAG to LegalizeVectorOps

I believe we should be legalizing these with the rest of vector binary operations. If any custom lowering is required for these nodes, this will give the DAG combine between LegalizeVectorOps and LegalizeDAG a chance to run on the custom code before constant build_vectors are lowered in LegalizeDAG.

I've moved MULHU/MULHS handling in AArch64 from Lowering to isel. Moving the lowering earlier caused build_vector+extract_subvector simplifications to kick in which made the generated code worse.

Differential Revision: https://reviews.llvm.org/D54276

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347902 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add a DAG combine pre type legalization to widen division by constant splat...
Craig Topper [Thu, 29 Nov 2018 19:13:38 +0000 (19:13 +0000)]
[X86] Add a DAG combine pre type legalization to widen division by constant splat on narrow vectors to avoid scalarization

This is another patch for -x86-experimental-vector-widening. This pre-widens narrow division by constants so that we can get past the legal type check in the generic DAG combiner. Otherwise we end up scalarizing.

I've restricted this to splats for now because it was easy to just call DAG.getConstant. Not sure what we should do for non-splats. Increase the element size? Widen the constant vector by padding with 1?

Differential Revision: https://reviews.llvm.org/D54919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347898 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstSimplify] fold select with implied condition
Sanjay Patel [Thu, 29 Nov 2018 18:44:39 +0000 (18:44 +0000)]
[InstSimplify] fold select with implied condition

This is an almost direct move of the functionality from InstCombine to
InstSimplify. There's no reason not to do this in InstSimplify because
we never create a new value with this transform.

(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)

I've changed 1 of the conditions for the fold (1 of the blocks for the
branch must be the block we started with) into an assert because I'm not
sure how that could ever be false.

We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.

The 3-way compare changes show that InstCombine has some kind of
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347896 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TableGen] Examine entire subreg compositions to detect ambiguity
Krzysztof Parzyszek [Thu, 29 Nov 2018 18:20:08 +0000 (18:20 +0000)]
[TableGen] Examine entire subreg compositions to detect ambiguity

When tablegen detects that there exist two subregister compositions that
result in the same value for some register, it will emit a warning. This
kind of an overlap in compositions should only happen when it is caused
by a user-defined composition. It can happen, however, that the user-
defined composition is not identically equal to another one, but it does
produce the same value for one or more registers. In such cases suppress
the warning.
This patch is to silence the warning when building the System Z backend
after D50725.

Differential Revision: https://reviews.llvm.org/D50977

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347894 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[GlobalISel] LegalizationArtifactCombiner: Combine aext([asz]ext x) -> [asz]ext x
Volkan Keles [Thu, 29 Nov 2018 18:19:24 +0000 (18:19 +0000)]
[GlobalISel] LegalizationArtifactCombiner: Combine aext([asz]ext x) -> [asz]ext x

Summary:
Replace `aext([asz]ext x)` with `aext/sext/zext x` in order to
reduce the number of instructions generated to clean up some
legalization artifacts.

Reviewers: aditya_nandakumar, dsanders, aemerson, bogner

Reviewed By: aemerson

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D54174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347893 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-objcopy] Delete redundant !Config.xx.empty() when followed by positive is_conta...
Fangrui Song [Thu, 29 Nov 2018 17:32:51 +0000 (17:32 +0000)]
[llvm-objcopy] Delete redundant !Config.xx.empty() when followed by positive is_contained() check

Summary: The original intention of !Config.xx.empty() was probably to emphasize the thing that is currently considered, but I feel the simplified form is actually easier to understand and it is also consistent with the call sites in other llvm components.
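
The redundancy is easy to confirm: a membership test on an empty container is already false, so the extra emptiness check never changes the result. A Python stand-in for the two forms follows; `is_contained` here only mimics llvm::is_contained, and the section names are arbitrary.

```python
def is_contained(container, value):
    """Python stand-in for llvm::is_contained."""
    return value in container

def with_empty_check(names, name):
    """Form before the change: emptiness check followed by membership test."""
    return len(names) != 0 and is_contained(names, name)

def simplified(names, name):
    """Form after the change: membership test only."""
    return is_contained(names, name)

cases = [([], ".text"), ([".text"], ".text"), ([".data"], ".text")]
equivalent = all(with_empty_check(n, s) == simplified(n, s) for n, s in cases)
```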

Reviewers: alexshap, rupprecht, jakehehrlich, jhenderson, espindola

Reviewed By: alexshap, rupprecht

Subscribers: emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D55040

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347891 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAvoid redundant reference to isPodLike in SmallVect/Optional implementation
Serge Guelton [Thu, 29 Nov 2018 17:21:54 +0000 (17:21 +0000)]
Avoid redundant reference to isPodLike in SmallVect/Optional implementation

NFC, preparatory work for isPodLike cleaning.

Differential Revision: https://reviews.llvm.org/D55005

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347890 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LICM] Reapply r347776 "Make LICM able to hoist phis" with fix
John Brawn [Thu, 29 Nov 2018 17:10:00 +0000 (17:10 +0000)]
[LICM] Reapply r347776 "Make LICM able to hoist phis" with fix

This commit caused a large compile-time slowdown in some cases when NDEBUG is
off due to the dominator tree verification it added. Fix this by only doing
dominator tree and loop info verification when something has been hoisted.

Differential Revision: https://reviews.llvm.org/D52827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347889 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ThinLTO] Import local variables from the same module as caller
Teresa Johnson [Thu, 29 Nov 2018 17:02:42 +0000 (17:02 +0000)]
[ThinLTO] Import local variables from the same module as caller

Summary:
We can sometimes end up with multiple copies of a local variable that
have the same GUID in the index. This happens when there are local
variables with the same name that are in different source files having the
same name/path at compile time (but compiled into different bitcode objects).

In this case make sure we import the copy in the caller's module.
This enables importing both of the variables having the same GUID
(but which will have different promoted names since the module paths,
and therefore the module hashes, will be distinct).

Importing the wrong copy is particularly problematic for read only
variables, since we must import them as a local copy whenever
referenced. Otherwise we get undefs at link time.
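
As a rough illustration of the selection rule, the Python sketch below models an index in which one GUID maps to several summaries and picks the one from the caller's module. The data layout, field names, and the fallback of returning `None` are assumptions made for this example and do not reflect the real ThinLTO index structures.

```python
def select_local_copy(index, guid, caller_module):
    """Pick the summary for `guid` that lives in the caller's module, if any."""
    for summary in index.get(guid, []):
        if summary["module"] == caller_module:
            return summary
    return None   # assumption: no matching copy means nothing is selected

# Two local variables with the same name and source path, compiled into
# different bitcode objects, end up under the same GUID.
index = {
    0xABCD: [
        {"module": "a/foo.bc", "name": "counter"},
        {"module": "b/foo.bc", "name": "counter"},
    ]
}
chosen = select_local_copy(index, 0xABCD, "b/foo.bc")
```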

Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed
for testing the distributed index case via clang, which will be sent as
a separate clang-side patch shortly. We were previously not doing the
dead code/read only computation before computing imports when testing
distributed index generation (as is done when testing importing and
other ThinLTO mechanisms alone).

Reviewers: evgeny777

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D55047

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347886 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogit-llvm: Fix incremental population of svn tree.
James Y Knight [Thu, 29 Nov 2018 16:46:34 +0000 (16:46 +0000)]
git-llvm: Fix incremental population of svn tree.

"svn update --depth=..." is, annoyingly, not a specification of the
desired depth, but rather a _limit_ added on top of the "sticky" depth
in the working-directory. However, if the directory doesn't exist yet,
then it sets the sticky depth of the new directory entries.

Unfortunately, the svn command-line has no way of expanding the depth
of a directory from "empty" to "files", without also removing any
already-expanded subdirectories. The way you're supposed to increase
the depth of an existing directory is via --set-depth, but
--set-depth=files will also remove any subdirs which were already
requested.

This change avoids getting into the state of ever needing to increase
the depth of an existing directory from "empty" to "files" in the
first place, by:

1. Use svn update --depth=files, not --depth=immediates.

The latter has the effect of checking out the subdirectories and
marking them as depth=empty. The former excludes sub-directories from
the list of entries, which avoids the problem.

2. Explicitly populate missing parent directories.

Using --parents seemed nice and easy, but it marks the parent dirs as
depth=empty. Instead, check out parents explicitly if they're missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347883 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SimplifyCFG] auto-generate complete checks; NFC
Sanjay Patel [Thu, 29 Nov 2018 16:28:37 +0000 (16:28 +0000)]
[SimplifyCFG] auto-generate complete checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347882 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] auto-generate complete checks; NFC
Sanjay Patel [Thu, 29 Nov 2018 16:26:03 +0000 (16:26 +0000)]
[InstCombine] auto-generate complete checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347881 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[AMDGPU] Add and update scalar instructions
Graham Sellers [Thu, 29 Nov 2018 16:05:38 +0000 (16:05 +0000)]
[AMDGPU] Add and update scalar instructions

This patch adds support for S_ANDN2, S_ORN2 32-bit and 64-bit instructions and adds splits to move them to the vector unit (for which there is no equivalent instruction). It modifies the way that the more complex scalar instructions are lowered to vector instructions by first breaking them down to sequences of simpler scalar instructions which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply inversion to one input rather than the output of the XOR as the result is equivalent and may allow leaving the NOT instruction on the scalar unit.
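
The bitwise identities behind these changes can be checked with a short Python model on 32-bit values. The functions below only mirror the logical operations (AND-NOT, OR-NOT, and the two XNOR forms); they are not generated from the AMDGPU backend.

```python
MASK32 = 0xFFFFFFFF

def not32(a):
    """Bitwise NOT restricted to 32 bits."""
    return ~a & MASK32

def s_andn2(a, b):
    """a AND (NOT b)."""
    return a & not32(b)

def s_orn2(a, b):
    """a OR (NOT b)."""
    return a | not32(b)

def xnor_invert_output(a, b):
    """XNOR written as NOT applied to the XOR result."""
    return not32(a ^ b)

def xnor_invert_input(a, b):
    """XNOR written as XOR with one inverted input."""
    return not32(a) ^ b

a, b = 0xDEADBEEF, 0x0F0F1234
xnor_forms_agree = xnor_invert_output(a, b) == xnor_invert_input(a, b)
```

Because the two XNOR forms agree, the inversion can be applied to whichever side is more convenient, which is the freedom the updated pattern uses.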

New tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests now hit the new instructions (and have been modified accordingly).

Differential: https://reviews.llvm.org/D54714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347877 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoFix: Add support for TFE/LWE in image intrinsic
David Stuttard [Thu, 29 Nov 2018 15:56:36 +0000 (15:56 +0000)]
Fix: Add support for TFE/LWE in image intrinsic

My change svn-id: 347871 caused a buildbot failure due to an unused
variable def (used in an assert).

Change-Id: Ia882d18bb6fa79b4d7bbfda422b9ea5d23eab336

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347876 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r347823 "[TextAPI] Switch back to a custom Platform enum."
Hans Wennborg [Thu, 29 Nov 2018 15:47:24 +0000 (15:47 +0000)]
Revert r347823 "[TextAPI] Switch back to a custom Platform enum."

It broke the Windows buildbots, e.g.
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/21829/steps/test/logs/stdio

This also reverts the follow-ups: r347824, r347827, and r347836.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347874 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CallSiteSplitting] Report edge deletion to DomTreeUpdater
Joseph Tremoulet [Thu, 29 Nov 2018 15:27:04 +0000 (15:27 +0000)]
[CallSiteSplitting] Report edge deletion to DomTreeUpdater

Summary:
When splitting musttail calls, the split blocks' original terminators
get removed; inform the DTU when this happens.

Also add a testcase that fails an assertion in the DTU without this fix.

Reviewers: fhahn, junbuml

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347872 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdd support for TFE/LWE in image intrinsics
David Stuttard [Thu, 29 Nov 2018 15:21:13 +0000 (15:21 +0000)]
Add support for TFE/LWE in image intrinsics

TFE and LWE support requires extra result registers that are written in the
event of a failure in order to detect that failure case.
The specific use-case that initiated these changes is sparse texture support.

This means that if image intrinsics are used with either option turned on, the
programmer must ensure that the return type can contain all of the expected
results. This can result in redundant registers since the vector size must be a
power-of-2.

This change takes roughly 6 parts:
1. Modify the instruction defs in tablegen to add new instruction variants that
can accommodate the extra return values.
2. Updates to lowerImage in SIISelLowering.cpp to accommodate setting TFE or LWE
(where the bulk of the work for these instruction types is now done)
3. Extra verification code to catch cases where intrinsics have been used but
insufficient return registers are used.
4. Modification to the adjustWritemask optimisation to account for TFE/LWE being
enabled (requires extra registers to be maintained for error return value).
5. An extra pass to zero initialize the error value return - this is because if
the error does not occur, the register is not written and thus must be zeroed
before use. Also added a new (on by default) option to ensure ALL return values
are zero-initialized that is required for sparse texture support.
6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO
for this to re-enable and handle correctly).

There's an additional fix now to avoid a dmask=0

For an image intrinsic with tfe where all result channels except tfe
were unused, I was getting an image instruction with dmask=0 and only a
single vgpr result for tfe. That is incorrect because the hardware
assumes there is at least one vgpr result, plus the one for tfe.

Fixed by forcing dmask to 1, which gives the desired two vgpr result
with tfe in the second one.
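
The dmask fix can be illustrated with a small sketch. It assumes one 32-bit vgpr per enabled dmask channel plus one extra vgpr for tfe; the helper names are hypothetical and do not come from SIISelLowering.cpp.

```cpp
#include <cstdint>

// Counts the channels enabled in a 4-bit dmask.
inline unsigned countChannels(uint32_t DMask) {
  unsigned N = 0;
  for (uint32_t M = DMask & 0xFu; M != 0; M >>= 1)
    N += M & 1u;
  return N;
}

// Hypothetical helper: with tfe enabled, never leave dmask at 0.
inline uint32_t adjustDMask(uint32_t DMask, bool TFE) {
  return (TFE && (DMask & 0xFu) == 0) ? 1u : (DMask & 0xFu);
}

// Result vgprs under the stated assumption: one per enabled channel,
// plus one for tfe.
inline unsigned numResultVGPRs(uint32_t DMask, bool TFE) {
  return countChannels(DMask) + (TFE ? 1u : 0u);
}
```

With tfe on and no channels used, the adjusted dmask is 1, so the instruction has two results and the tfe value lands in the second one.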

The TFE or LWE result is returned from the intrinsics using an aggregate
type. Look in the test code provided to see how this works, but in essence IR
code to invoke the intrinsic looks as follows:

%v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15,
                                      i32 %s, <8 x i32> %rsrc, i32 1, i32 0)
%v.vec = extractvalue {<4 x float>, i32} %v, 0
%v.err = extractvalue {<4 x float>, i32} %v, 1

Differential revision: https://reviews.llvm.org/D48826

Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347871 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CVP] tidy processCmp(); NFC
Sanjay Patel [Thu, 29 Nov 2018 14:41:21 +0000 (14:41 +0000)]
[CVP] tidy processCmp(); NFC

1. The variables were confusing: 'C' typically refers to a constant, but here it was the Cmp.
2. Formatting violations.
3. Simplify code to return true/false constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347868 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert "[LICM] Enable control flow hoisting by default" and "[LICM] Reapply r347190...
Martin Storsjo [Thu, 29 Nov 2018 14:39:39 +0000 (14:39 +0000)]
Revert "[LICM] Enable control flow hoisting by default" and "[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix"

This reverts commits r347776 and r347778.

The first one, r347776, caused significant compile time regressions
for certain input files, see PR39836 for details.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347867 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CVP] auto-generate complete test checks; NFC
Sanjay Patel [Thu, 29 Nov 2018 14:28:47 +0000 (14:28 +0000)]
[CVP] auto-generate complete test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347866 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r347596 "Support for inserting profile-directed cache prefetches"
Hans Wennborg [Thu, 29 Nov 2018 13:58:02 +0000 (13:58 +0000)]
Revert r347596 "Support for inserting profile-directed cache prefetches"

It causes asserts building BoringSSL. See https://crbug.com/91009#c3 for
repro.

This also reverts the follow-ups:
Revert r347724 "Do not insert prefetches with unsupported memory operands."
Revert r347606 "[X86] Add dependency from X86 to ProfileData after rL347596"
Revert r347607 "Add new passes to X86 pipeline tests"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347864 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[GlobalISel] Fix insertion of stack-protector epilogue
Petr Pavlu [Thu, 29 Nov 2018 13:22:53 +0000 (13:22 +0000)]
[GlobalISel] Fix insertion of stack-protector epilogue

* Tell the StackProtector pass to generate the epilogue instrumentation
  when GlobalISel is enabled because GISel currently does not implement
  the same deferred epilogue insertion as SelectionDAG.
* Update StackProtector::InsertStackProtectors() to find a stack guard
  slot by searching for the llvm.stackprotector intrinsic when the
  prologue was not created by StackProtector itself but the pass still
  needs to generate the epilogue instrumentation. This fixes a problem
  when the pass would abort because the stack guard AllocInst pointer
  was null when generating the epilogue -- test
  CodeGen/AArch64/GlobalISel/arm64-irtranslator-stackprotect.ll.

Differential Revision: https://reviews.llvm.org/D54518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347862 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[GlobalISel] Make EnableGlobalISel always set when GISel is enabled
Petr Pavlu [Thu, 29 Nov 2018 12:56:32 +0000 (12:56 +0000)]
[GlobalISel] Make EnableGlobalISel always set when GISel is enabled

Change meaning of TargetOptions::EnableGlobalISel. The flag was
previously set only when a target switched on GlobalISel but it is now
always set when the GlobalISel pipeline is enabled. This makes the flag
consistent with TargetOptions::EnableFastISel and allows its use in
other parts of the compiler to determine when GlobalISel is enabled.

The EnableGlobalISel flag had previously only one use in
TargetPassConfig::isGlobalISelAbortEnabled(). The method used its value
to determine if GlobalISel was enabled by a target and returned false in
such a case. To preserve the current behaviour, a new flag
TargetOptions::GlobalISelAbort is introduced to separately record the
abort behaviour.

Differential Revision: https://reviews.llvm.org/D54518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347861 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-rc] Support EXSTYLE statement.
Martin Storsjo [Thu, 29 Nov 2018 12:17:39 +0000 (12:17 +0000)]
[llvm-rc] Support EXSTYLE statement.

Patch by Jacek Caban!

Differential Revision: https://reviews.llvm.org/D55020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347858 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-mca][MC] Add the ability to declare which processor resources model load/store...
Andrea Di Biagio [Thu, 29 Nov 2018 12:15:56 +0000 (12:15 +0000)]
[llvm-mca][MC] Add the ability to declare which processor resources model load/store queues (PR36666).

This patch adds the ability to specify via tablegen which processor resources
are load/store queue resources.

A new tablegen class named MemoryQueue can be optionally used to mark resources
that model load/store queues.  Information about the load/store queue is
collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to
initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and
`StoreQueueID`.  Those two fields are identifiers for buffered resources used to
describe the load queue and the store queue.
Field `BufferSize` is interpreted as the number of entries in the queue, while
the number of units is a throughput indicator (i.e. number of available pickers
for loads/stores).

At construction time, LSUnit in llvm-mca checks for the presence of extra
processor information (i.e. MCExtraProcessorInfo) in the scheduling model.  If
that information is available, and fields LoadQueueID and StoreQueueID are set
to a value different than zero (i.e. the invalid processor resource index), then
LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value
declared by the two processor resources.

With this patch, we more accurately track dynamic dispatch stalls caused by the
lack of LS tokens (i.e. load/store queue full). This is also shown by the
differences in two BdVer2 tests. Stalls that were previously classified as
generic SCHEDULER FULL stalls are now correctly classified as either "load
queue full" or "store queue full".

About the differences in the -scheduler-stats view: those differences are
expected, because entries in the load/store queue are not released at
instruction issue stage. Instead, those are released at instruction executed
stage.  This is the main reason why, for the modified tests, the load/store
queues get full before PdEx is full.

Differential Revision: https://reviews.llvm.org/D54957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347857 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo
Nicolai Haehnle [Thu, 29 Nov 2018 11:06:26 +0000 (11:06 +0000)]
AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo

Summary:
MachineLoopInfo cannot be relied on for correctness, because it cannot
properly recognize loops in irreducible control flow which can be
introduced by late machine basic block optimization passes. See the new
test case for the reduced form of an example that occurred in practice.

Use a simple fixpoint iteration instead.
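
A minimal sketch of such a fixpoint iteration, assuming a worklist over block indices and a max-merge of a capped value. The Block type, the Generated field and the cap are hypothetical stand-ins for the real per-block state, not the data structures used in the pass.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

struct Block {
  std::vector<std::size_t> Succs; // indices of successor blocks
  unsigned Generated = 0;         // hypothetical per-block contribution
};

// Propagates a saturating value along CFG edges until no block's incoming
// state changes. The merge is max and the value is capped, so the loop
// terminates on any graph, reducible or not.
inline std::vector<unsigned> solve(const std::vector<Block> &CFG,
                                   unsigned Cap) {
  std::vector<unsigned> In(CFG.size(), 0);
  std::deque<std::size_t> Worklist;
  for (std::size_t I = 0; I < CFG.size(); ++I)
    Worklist.push_back(I);

  while (!Worklist.empty()) {
    std::size_t B = Worklist.front();
    Worklist.pop_front();
    unsigned Out = std::min(Cap, In[B] + CFG[B].Generated);
    for (std::size_t S : CFG[B].Succs) {
      if (Out > In[S]) { // state changed: revisit the successor
        In[S] = Out;
        Worklist.push_back(S);
      }
    }
  }
  return In;
}
```

No loop structure is consulted at all, which is why irreducible control flow needs no special handling in this style of algorithm.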

In order to facilitate this change, refactor WaitcntBrackets so that it
only tracks pending events and registers, rather than also maintaining
state that is relevant for the high-level algorithm. Various accessor
methods can be removed or made private as a consequence.

Affects (in radv):
- dEQP-VK.glsl.loops.special.{for,while}_uniform_iterations.select_iteration_count_{fragment,vertex}

Fixes: r345719 ("AMDGPU: Rewrite SILowerI1Copies to always stay on SALU")

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347853 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points
Nicolai Haehnle [Thu, 29 Nov 2018 11:06:21 +0000 (11:06 +0000)]
AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points

Summary:
There is one obsolete reference to using -1 as an indication of "unknown",
but this isn't actually used anywhere.

Using unsigned makes robust wrapping checks easier.
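
As an illustration of why unsigned helps, here is a sketch of a wrap-tolerant ordering check. The function name and the half-range convention are assumptions made for this example and are not taken from the patch.

```cpp
#include <cstdint>

// Returns true if time point A is strictly earlier than B, assuming the two
// are less than 2^31 ticks apart. Unsigned subtraction wraps modulo 2^32,
// which is well defined, so the check keeps working after the counter
// overflows.
inline bool isEarlier(uint32_t A, uint32_t B) {
  return A != B && static_cast<uint32_t>(B - A) < 0x80000000u;
}
```

The same check on signed values would depend on signed overflow, which is undefined behaviour in C++.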

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, llvm-commits, tpr, t-tye, hakzsam

Differential Revision: https://reviews.llvm.org/D54230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347852 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning
Nicolai Haehnle [Thu, 29 Nov 2018 11:06:18 +0000 (11:06 +0000)]
AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347851 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/InsertWaitcnts: Simplify pending events tracking
Nicolai Haehnle [Thu, 29 Nov 2018 11:06:14 +0000 (11:06 +0000)]
AMDGPU/InsertWaitcnts: Simplify pending events tracking

Summary:
Instead of storing the "score" (last time point) of the various relevant
events, only store whether an event is pending or not.

This is sufficient, because whenever only one event of a count type is
pending, its last time point is naturally the upper bound of all time
points of this count type, and when multiple event types are pending,
the count type has gone out of order and an s_waitcnt to 0 is required
to clear any pending event type (and will then clear all pending event
types for that count type).
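
The reasoning above can be sketched as a bitmask per count type. The event names and the CounterState type are hypothetical; the sketch only shows how "pending or not" is enough to detect the out-of-order case.

```cpp
#include <cstdint>

// Hypothetical event kinds that share one counter.
enum WaitEvent : unsigned { EventA = 0, EventB = 1, EventC = 2 };

struct CounterState {
  uint32_t PendingMask = 0; // one bit per event kind that is still pending
  uint32_t UpperBound = 0;  // latest time point of any pending event

  void recordEvent(WaitEvent E, uint32_t TimePoint) {
    PendingMask |= 1u << E;
    UpperBound = TimePoint;
  }

  bool hasPending() const { return PendingMask != 0; }

  // More than one bit set means several event kinds are in flight, so the
  // counter can no longer be relied on to complete in order.
  bool isOutOfOrder() const { return (PendingMask & (PendingMask - 1)) != 0; }

  // A wait to zero clears every pending event kind for this counter.
  void waitToZero() { PendingMask = 0; }
};
```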

This also removes the special handling of GDS_GPR_LOCK and EXP_GPR_LOCK.
I do not understand what this special handling ever attempted to achieve.
It has existed ever since the original port from an internal code base,
so my best guess is that it solved a problem related to EXEC handling in
that internal code base.

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347850 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types
Nicolai Haehnle [Thu, 29 Nov 2018 11:06:11 +0000 (11:06 +0000)]
AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types

Summary:
It hides the type casting ugliness, and I happened to have to add a new
such loop (in a later patch).

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347849 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/InsertWaitcnts: Untangle some semi-global state
Nicolai Haehnle [Thu, 29 Nov 2018 11:06:06 +0000 (11:06 +0000)]
AMDGPU/InsertWaitcnts: Untangle some semi-global state

Summary:
Reduce the statefulness of the algorithm in two ways:

1. More clearly split generateWaitcntInstBefore into two phases: the
   first one which determines the required wait, if any, without changing
   the ScoreBrackets, and the second one which actually inserts the wait
   and updates the brackets.

2. Communicate pre-existing s_waitcnt instructions using an argument to
   generateWaitcntInstBefore instead of through the ScoreBrackets.

To simplify these changes, a Waitcnt structure is introduced which carries
the counts of an s_waitcnt instruction in decoded form.
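
A decoded wait count could look roughly like the sketch below. The field names, the use of ~0u as "no wait needed" and the combined() helper are assumptions for illustration and may not match the structure added by the patch.

```cpp
#include <algorithm>

// Hypothetical decoded form of an s_waitcnt; ~0u marks a counter that
// needs no wait.
struct Waitcnt {
  unsigned VmCnt = ~0u;
  unsigned ExpCnt = ~0u;
  unsigned LgkmCnt = ~0u;

  bool hasWait() const {
    return VmCnt != ~0u || ExpCnt != ~0u || LgkmCnt != ~0u;
  }

  // Merges two requirements; the stricter (smaller) count wins per counter.
  Waitcnt combined(const Waitcnt &Other) const {
    Waitcnt R;
    R.VmCnt = std::min(VmCnt, Other.VmCnt);
    R.ExpCnt = std::min(ExpCnt, Other.ExpCnt);
    R.LgkmCnt = std::min(LgkmCnt, Other.LgkmCnt);
    return R;
  }
};
```

Keeping the counts in a plain value like this lets the required wait be computed first and applied to the brackets in a separate step.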

There are some functional changes:

1. The FIXME for the VCCZ bug workaround was implemented: we only wait for
   SMEM instructions as required instead of waiting on all counters.

2. We now properly track pre-existing waitcnt's in all cases, which leads
   to less conservative waitcnts being emitted in some cases.

     s_load_dword ...
     s_waitcnt lgkmcnt(0)    <-- pre-existing wait count
     ds_read_b32 v0, ...
     ds_read_b32 v1, ...
     s_waitcnt lgkmcnt(0)    <-- this is too conservative
     use(v0)
     more code
     use(v1)

   This increases code size a bit, but the reduced latency should still be a
   win in basically all cases. The worst code size regressions in my shader-db
   are:

 WORST REGRESSIONS - Code Size
 Before After     Delta Percentage
   1724  1736        12    0.70 %   shaders/private/f1-2015/1334.shader_test [0]
   2276  2284         8    0.35 %   shaders/private/f1-2015/1306.shader_test [0]
   4632  4640         8    0.17 %   shaders/private/ue4_elemental/62.shader_test [0]
   2376  2384         8    0.34 %   shaders/private/f1-2015/1308.shader_test [0]
   3284  3292         8    0.24 %   shaders/private/talos_principle/1955.shader_test [0]

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54226

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347848 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CODE_OWNERS] Add myself as code owner for MinGW
Martin Storsjo [Thu, 29 Nov 2018 10:58:15 +0000 (10:58 +0000)]
[CODE_OWNERS] Add myself as code owner for MinGW

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347847 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC] Add two XFAIL tests from PR39783
Max Kazantsev [Thu, 29 Nov 2018 09:38:22 +0000 (09:38 +0000)]
[NFC] Add two XFAIL tests from PR39783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347845 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoDisable TermFolding in LoopSimplifyCFG until PR39783 is fixed
Max Kazantsev [Thu, 29 Nov 2018 09:00:19 +0000 (09:00 +0000)]
Disable TermFolding in LoopSimplifyCFG until PR39783 is fixed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347844 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LoopStrengthReduce] ComplexityLimit as an option
Sam Parker [Thu, 29 Nov 2018 08:34:22 +0000 (08:34 +0000)]
[LoopStrengthReduce] ComplexityLimit as an option

Convert ComplexityLimit into a command line value.

Differential Revision: https://reviews.llvm.org/D54899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347843 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Inliner] Modify the merging of min-legal-vector-width attribute to better handle...
Craig Topper [Thu, 29 Nov 2018 07:27:38 +0000 (07:27 +0000)]
[Inliner] Modify the merging of min-legal-vector-width attribute to better handle when the caller or callee don't have the attribute.

Lack of an attribute means that the function hasn't been checked for what vector width it requires. So if the caller or the callee doesn't have the attribute we should make sure the combined function after inlining does not have the attribute.

If the caller already doesn't have the attribute we can just avoid adding it. Otherwise if the callee doesn't have the attribute just remove the caller's attribute.
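
The merge rule can be sketched with std::optional standing in for "attribute present or absent". Taking the larger width when both sides carry the attribute is an assumption of this sketch; the function name is hypothetical and this is not the Inliner code itself.

```cpp
#include <algorithm>
#include <optional>

// std::nullopt models a function without "min-legal-vector-width".
using VectorWidthAttr = std::optional<unsigned>;

// Returns the attribute the caller should carry after inlining the callee.
inline VectorWidthAttr mergeMinLegalVectorWidth(VectorWidthAttr Caller,
                                                VectorWidthAttr Callee) {
  if (!Caller)
    return std::nullopt; // caller was never checked: do not add it
  if (!Callee)
    return std::nullopt; // callee was never checked: drop the caller's
  return std::max(*Caller, *Callee); // assumed: the wider requirement wins
}
```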

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347841 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Inliner] Add test for merging of min-legal-vector-width function attribute.
Craig Topper [Thu, 29 Nov 2018 07:02:47 +0000 (07:02 +0000)]
[Inliner] Add test for merging of min-legal-vector-width function attribute.

This should have been added in r337844, but apparently I failed to 'git add' the file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347840 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CGP] Improve compile time for complex addressing mode
Serguei Katkov [Thu, 29 Nov 2018 06:45:18 +0000 (06:45 +0000)]
[CGP] Improve compile time for complex addressing mode

This is a fix for PR39625 that improves compile time
by reducing the number of intermediate Phi nodes created.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54932

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347839 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert "[TextAPI] Fix a memory leak in the TBD reader."
Juergen Ributzka [Thu, 29 Nov 2018 06:32:49 +0000 (06:32 +0000)]
Revert "[TextAPI] Fix a memory leak in the TBD reader."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347838 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TextAPI] Fix a memory leak in the TBD reader.
Juergen Ributzka [Thu, 29 Nov 2018 06:16:33 +0000 (06:16 +0000)]
[TextAPI] Fix a memory leak in the TBD reader.

This fixes an issue where we were leaking the YAML document if there was a
parsing error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347837 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TextAPI] Switch back to a custom Platform enum.
Juergen Ributzka [Thu, 29 Nov 2018 05:56:03 +0000 (05:56 +0000)]
[TextAPI] Switch back to a custom Platform enum.

Moving to PlatformType from BinaryFormat had some UB fallout when handling
unknown platforms or malformed input files.

This should fix the sanitizer bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347836 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Correct comment. NFC
Craig Topper [Thu, 29 Nov 2018 05:56:03 +0000 (05:56 +0000)]
[X86] Correct comment. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347835 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdd Hurd target to LLVMSupport (1/2)
Kristina Brooks [Thu, 29 Nov 2018 03:23:01 +0000 (03:23 +0000)]
Add Hurd target to LLVMSupport (1/2)

Add the required target triples to LLVMSupport to support Hurd
in LLVM (formally `pc-hurd-gnu`).

Patch by sthibaul (Samuel Thibault)

Differential Revision: https://reviews.llvm.org/D54378

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347832 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making the...
Li Jia He [Thu, 29 Nov 2018 03:04:39 +0000 (03:04 +0000)]
[PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making the instruction selection

Summary:
 A signed comparison of i1 values produces the opposite result to an unsigned one if the condition code
 includes less-than or greater-than. This is so because 1 is the most negative signed i1 number and the
 most positive unsigned i1 number. The CR-logical operations used for such comparisons are non-commutative
 so for signed comparisons vs. unsigned ones, the input operands just need to be swapped.
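
The claim can be verified exhaustively, since an i1 has only two values. The sketch below models i1 as bool with hypothetical helper names; it checks the less-than case, and the other ordered condition codes follow the same pattern.

```cpp
// Interprets a 1-bit value as signed: bit 1 means -1, bit 0 means 0.
inline int asSignedI1(bool Bit) { return Bit ? -1 : 0; }

// Interprets a 1-bit value as unsigned: bit 1 means 1, bit 0 means 0.
inline unsigned asUnsignedI1(bool Bit) { return Bit ? 1u : 0u; }

// Signed less-than on i1 values.
inline bool sltI1(bool A, bool B) { return asSignedI1(A) < asSignedI1(B); }

// Unsigned less-than on i1 values.
inline bool ultI1(bool A, bool B) {
  return asUnsignedI1(A) < asUnsignedI1(B);
}

// Returns true if, for all four input pairs, signed less-than gives the same
// answer as unsigned less-than with the operands swapped.
inline bool sltMatchesSwappedUlt() {
  const bool Bits[] = {false, true};
  for (bool A : Bits)
    for (bool B : Bits)
      if (sltI1(A, B) != ultI1(B, A))
        return false;
  return true;
}
```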

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D54825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347831 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[PowerPC] [NFC] Add test cases to the ISD::BR_CC node in the instruction selection
Li Jia He [Thu, 29 Nov 2018 02:51:03 +0000 (02:51 +0000)]
[PowerPC] [NFC] Add test cases to the ISD::BR_CC node in the instruction selection
Add the following test case for the ISD::BR_CC node in the instruction selection
define i64 @testi64slt(i64 %c1, i64 %c2, i64 %c3, i64 %c4, i64 %a1, i64 %a2) #0 {
entry:
  %cmp1 = icmp eq i64 %c3, %c4
  %cmp3tmp = icmp eq i64 %c1, %c2
  %cmp3 = icmp slt i1 %cmp3tmp, %cmp1
  br i1 %cmp3, label %iftrue, label %iffalse
iftrue:
  ret i64 %a1
iffalse:
  ret i64 %a2
}
The data type i64 can be replaced by i32, float, or double.

The condition codes can be replaced by: SETEQ, SETNE, SETLT, SETLE, SETGT, SETGE, SETULT, SETULE, SETUGT, and SETUGE

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D54824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347828 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TextAPI] TBD Reader/Writer (bot fixes: take 2)
Juergen Ributzka [Thu, 29 Nov 2018 02:28:58 +0000 (02:28 +0000)]
[TextAPI] TBD Reader/Writer (bot fixes: take 2)

Replace the tuple with a struct to work around an explicit constructor bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347827 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoNFC. Use unsigned type for uses counter in CaptureTracking
Artur Pilipenko [Thu, 29 Nov 2018 02:15:35 +0000 (02:15 +0000)]
NFC. Use unsigned type for uses counter in CaptureTracking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347826 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TextAPI] TBD Reader/Writer (bot fixes)
Juergen Ributzka [Thu, 29 Nov 2018 01:55:57 +0000 (01:55 +0000)]
[TextAPI] TBD Reader/Writer (bot fixes)

Trying to see if switching from a vector to an array will appease the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347824 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TextAPI] TBD Reader/Writer
Juergen Ributzka [Thu, 29 Nov 2018 01:20:46 +0000 (01:20 +0000)]
[TextAPI] TBD Reader/Writer

Add basic infrastructure for reading and writing TBD files (version 1 - 3).

The TextAPI library is not used by anything yet (besides the unit tests). Tool
support will be added in a separate commit.

The TBD format is currently documented in the implementation file (TextStub.cpp).

https://reviews.llvm.org/D53945

Update: This contains changes to fix issues discovered by the bots:
 - add parentheses to silence warnings.
 - rename variables
 - use PlatformType from BinaryFormat

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347823 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[x86] try select simplification for target-specific nodes
Sanjay Patel [Wed, 28 Nov 2018 22:51:04 +0000 (22:51 +0000)]
[x86] try select simplification for target-specific nodes

This failed to select (which might be a separate bug) in
X86ISelDAGToDAG because we try to create a select node
that can be simplified away after rL347227.

This change avoids the problem by simplifying the SHRUNKBLEND
node sooner. In the test case, we manage to realize that the
true/false values of the select (SHRUNKBLEND) are the same thing,
so it simplifies away completely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347818 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert "[TextAPI] TBD Reader/Writer"
Juergen Ributzka [Wed, 28 Nov 2018 21:38:28 +0000 (21:38 +0000)]
Revert "[TextAPI] TBD Reader/Writer"

Reverting to unbreak bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347809 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TextAPI] TBD Reader/Writer
Juergen Ributzka [Wed, 28 Nov 2018 21:27:00 +0000 (21:27 +0000)]
[TextAPI] TBD Reader/Writer

Add basic infrastructure for reading and writing TBD files (version 1 - 3).

The TextAPI library is not used by anything yet (besides the unit tests). Tool
support will be added in a separate commit.

The TBD format is currently documented in the implementation file (TextStub.cpp).

https://reviews.llvm.org/D53945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347808 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[DebugInfo] IR/Bitcode changes for DISubprogram flags.
Paul Robinson [Wed, 28 Nov 2018 21:14:32 +0000 (21:14 +0000)]
[DebugInfo] IR/Bitcode changes for DISubprogram flags.

Packing the flags into one bitcode word will save effort in
adding new flags in the future.
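
A sketch of the packing idea, with hypothetical flag names and bit positions that are not the actual DISubprogram encoding. Adding a flag then means adding one enumerator rather than one more record field.

```cpp
#include <cstdint>

// Hypothetical properties; the real flag set and bit positions may differ.
struct SubprogramProps {
  bool IsLocal;
  bool IsDefinition;
  bool IsOptimized;
};

enum : uint32_t {
  FlagLocal = 1u << 0,
  FlagDefinition = 1u << 1,
  FlagOptimized = 1u << 2,
};

// Packs the boolean properties into a single word.
inline uint32_t packFlags(const SubprogramProps &P) {
  return (P.IsLocal ? FlagLocal : 0u) |
         (P.IsDefinition ? FlagDefinition : 0u) |
         (P.IsOptimized ? FlagOptimized : 0u);
}

// Recovers the boolean properties from the packed word.
inline SubprogramProps unpackFlags(uint32_t Word) {
  return SubprogramProps{(Word & FlagLocal) != 0,
                         (Word & FlagDefinition) != 0,
                         (Word & FlagOptimized) != 0};
}
```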

Differential Revision: https://reviews.llvm.org/D54755

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347806 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoReapply "[llvm-mca] Return the total number of cycles from method Pipeline::run()."
Andrea Di Biagio [Wed, 28 Nov 2018 19:31:19 +0000 (19:31 +0000)]
Reapply "[llvm-mca] Return the total number of cycles from method Pipeline::run()."

This reapplies r347767 (originally reviewed at: https://reviews.llvm.org/D55000)
with a fix for the missing std::move of the Error returned by the call to
Pipeline::runCycle().

Below is the original commit message from r347767.

If a user only cares about the overall latency, then the best/quickest way is to
change method Pipeline::run() so that it returns the total number of cycles to
the caller.

When the simulation pipeline is run, the number of cycles (or an error) is
returned from method Pipeline::run().
The advantage is that no hardware event listener is needed for computing that
latency. So, the whole process should be faster (and simpler - at least for that
particular use case).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347795 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Make X86TTIImpl::getCastInstrCost properly handle the case where AVX512 is...
Craig Topper [Wed, 28 Nov 2018 18:11:42 +0000 (18:11 +0000)]
[X86] Make X86TTIImpl::getCastInstrCost properly handle the case where AVX512 is enabled, but 512-bit vectors aren't legal.

Unlike most cost model functions this code makes a lot of table lookups without using the results from getTypeLegalizationCost. This means 512-bit vectors can be looked up even when the type isn't legal.

This patch adds a check around the two tables that contain 512-bit types to make sure that neither of the types would be split by type legalization, which would mean that 512-bit types are illegal. I wanted to write this in a somewhat generic way that uses type legalization query hooks. But if preferred, I can switch to just using is512BitVector and the subtarget feature.

Differential Revision: https://reviews.llvm.org/D54984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347786 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add some cost model entries for sext/zext for avx512bw
Craig Topper [Wed, 28 Nov 2018 18:11:39 +0000 (18:11 +0000)]
[X86] Add some cost model entries for sext/zext for avx512bw

This fixes some of the scalarization costs reported for sext/zext using avx512bw. It does not fix all of the scalarization costs being reported, just the worst.

I've restricted this only to combinations of types that are legal with avx512bw like v32i1/v64i1/v32i16/v64i8 and conversions between vXi1 and vXi8/vXi16 with legal vXi8/vXi16 result types.

Differential Revision: https://reviews.llvm.org/D54979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347785 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add a combine for back to back VSRAI instructions
Craig Topper [Wed, 28 Nov 2018 18:03:38 +0000 (18:03 +0000)]
[X86] Add a combine for back to back VSRAI instructions

Expansion of SIGN_EXTEND_INREG can create a VSRAI instruction. If there is already a VSRAI after it, we should combine them into a larger VSRAI.
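
The arithmetic behind the combine can be sketched on a single 32-bit lane. The helper names are hypothetical, and clamping the combined amount to 31 is an assumption based on the fact that an arithmetic shift by 31 or more leaves only copies of the sign bit.

```cpp
#include <cstdint>

// Arithmetic shift right of a 32-bit lane, written without relying on the
// behaviour of >> for negative signed values. Amounts above 31 saturate.
inline int32_t sra32(int32_t V, unsigned Amt) {
  if (Amt > 31)
    Amt = 31;
  uint32_t Shifted = static_cast<uint32_t>(V) >> Amt;
  if (V < 0 && Amt != 0)
    Shifted |= ~(0xFFFFFFFFu >> Amt); // fill the vacated bits with the sign
  return static_cast<int32_t>(Shifted);
}

// Shift amount of the single shift that replaces two back to back shifts.
inline unsigned combinedSraAmount(unsigned A, unsigned B) {
  unsigned Sum = A + B;
  return Sum > 31 ? 31 : Sum;
}
```

Shifting by A and then by B gives the same lane value as one shift by combinedSraAmount(A, B).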

Differential Revision: https://reviews.llvm.org/D54959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347784 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[DebugInfo] Give inlinable calls DILocs (PR39807)
Jeremy Morse [Wed, 28 Nov 2018 17:58:45 +0000 (17:58 +0000)]
[DebugInfo] Give inlinable calls DILocs (PR39807)

In PR39807 we incorrectly handle circumstances where calls are common'd
from conditional blocks into the parent BB. Calls that can be inlined
must always have DebugLocs, however we strip them during commoning, which
the IR verifier asserts on.

Fix this by using applyMergedLocation: it will perform the same DebugLoc
stripping of conditional Locs, but will also generate an unknown location
DebugLoc that satisfies the requirement for inlinable calls to always have
locations.

Some of the prior logic for selecting a DebugLoc is now likely redundant;
I'll generate a follow-up to remove it (involves editing more regression
tests).

Differential Revision: https://reviews.llvm.org/D54997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347782 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LICM] Enable control flow hoisting by default
John Brawn [Wed, 28 Nov 2018 17:23:03 +0000 (17:23 +0000)]
[LICM] Enable control flow hoisting by default

Differential Revision: https://reviews.llvm.org/D54949

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347778 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix
John Brawn [Wed, 28 Nov 2018 17:21:49 +0000 (17:21 +0000)]
[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix

This commit caused failures because it failed to correctly handle cases where
we hoist a phi, then hoist a use of that phi, then have to rehoist that use. We
need to make sure that we rehoist the use to _after_ the hoisted phi, which we
do by always rehoisting to the immediate dominator instead of just rehoisting
everything to the original preheader.

An option is also added to control whether control flow is hoisted, which is
off in this commit but will be turned on in a subsequent commit.

Differential Revision: https://reviews.llvm.org/D52827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347776 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert [llvm-mca] Return the total number of cycles from method Pipeline::run().
Andrea Di Biagio [Wed, 28 Nov 2018 16:39:48 +0000 (16:39 +0000)]
Revert [llvm-mca] Return the total number of cycles from method Pipeline::run().

This reverts commit r347767.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347775 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[RISCV] Support .option push and .option pop
Alex Bradbury [Wed, 28 Nov 2018 16:39:14 +0000 (16:39 +0000)]
[RISCV] Support .option push and .option pop

This adds support in the RISCVAsmParser for storing Subtarget feature bits on a stack, so that they can be pushed and popped to enable or disable multiple features at once.
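A minimal sketch of the push/pop idea, assuming a plain bitset stands in for the subtarget feature bits. The OptionStack class and its method names are hypothetical and do not reflect the actual RISCVAsmParser interface.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a set of subtarget feature bits.
using FeatureBits = std::bitset<64>;

class OptionStack {
  FeatureBits Current;            // features currently in effect
  std::vector<FeatureBits> Saved; // snapshots taken by push()

public:
  explicit OptionStack(FeatureBits Initial) : Current(Initial) {}

  // Models ".option push": remember the current feature set.
  void push() { Saved.push_back(Current); }

  // Models ".option pop": restore the most recent snapshot.
  // Returns false when there is no matching push, which a parser
  // would presumably report as an error.
  bool pop() {
    if (Saved.empty())
      return false;
    Current = Saved.back();
    Saved.pop_back();
    return true;
  }

  // Enable or disable a single feature (Bit must be below 64).
  void set(unsigned Bit, bool On) { Current.set(Bit, On); }
  bool has(unsigned Bit) const { return Current.test(Bit); }
};
```

Any number of features changed between a push and the matching pop is undone by that single pop, which is what makes the pair convenient for toggling several features at once.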

Differential Revision: https://reviews.llvm.org/D46424
Patch by Lewis Revill.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347774 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] Combine saturating add/sub with constant operands
Nikita Popov [Wed, 28 Nov 2018 16:37:15 +0000 (16:37 +0000)]
[InstCombine] Combine saturating add/sub with constant operands

Combine
  sat(sat(X + C1) + C2) -> sat(X + (C1+C2))
and
  sat(sat(X - C1) - C2) -> sat(X - (C1+C2))
if the signs of C1 and C2 match.

In the unsigned case we can compute C1+C2 with saturating arithmetic,
and InstSimplify will reduce this just to the saturation value. For
the signed case, we cannot perform the simplification if the result
of the addition overflows.
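The constant folding described above can be sketched on 8-bit values. This is an illustration only, not the InstCombine code; the helper names are hypothetical, and 8 bits are used purely to keep the numbers small.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add on 8 bits: clamps at 255.
inline uint8_t uaddSat(uint8_t a, uint8_t b) {
  unsigned sum = unsigned(a) + unsigned(b);
  return sum > 255u ? uint8_t(255) : uint8_t(sum);
}

// Signed saturating add on 8 bits: clamps to [-128, 127].
inline int8_t saddSat(int8_t a, int8_t b) {
  int sum = int(a) + int(b);
  if (sum > 127)
    return 127;
  if (sum < -128)
    return -128;
  return int8_t(sum);
}

// Unsigned case: the combined constant is itself computed with
// saturating arithmetic, so the fold is always possible.
inline uint8_t foldUnsignedConstants(uint8_t c1, uint8_t c2) {
  return uaddSat(c1, c2);
}

// Signed case: fold only if the signs match and c1 + c2 fits in 8 bits.
// Returns false (leaving out untouched) when the fold must be skipped.
inline bool foldSignedConstants(int8_t c1, int8_t c2, int8_t &out) {
  if ((c1 < 0) != (c2 < 0))
    return false;
  int sum = int(c1) + int(c2);
  if (sum > 127 || sum < -128)
    return false;
  out = int8_t(sum);
  return true;
}
```

The signed restriction matters: with X = -128 and C1 = C2 = 100, the two-step form yields 72, whereas adding a saturated constant of 127 in one step would yield -1.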

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347773 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] Canonicalize ssub.sat to sadd.sat
Nikita Popov [Wed, 28 Nov 2018 16:37:09 +0000 (16:37 +0000)]
[InstCombine] Canonicalize ssub.sat to sadd.sat

Canonicalize ssub.sat(X, C) to sadd.sat(X, -C) if C is constant and
not the signed minimum. This will help further optimizations to apply.
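A small 8-bit sketch of why the signed minimum has to be excluded. The helper names are hypothetical and this is not the InstCombine implementation; it only checks the arithmetic identity that subtracting C with saturation equals adding -C with saturation whenever -C is representable.

```cpp
#include <cassert>
#include <cstdint>

// Clamp a wide intermediate result to the 8-bit signed range.
inline int8_t clampToI8(int v) {
  if (v > 127)
    return 127;
  if (v < -128)
    return -128;
  return int8_t(v);
}

inline int8_t saddSat(int8_t a, int8_t b) { return clampToI8(int(a) + int(b)); }
inline int8_t ssubSat(int8_t a, int8_t b) { return clampToI8(int(a) - int(b)); }

// Hypothetical helper: negate the constant for the rewritten operation.
// -(-128) is not representable in 8 bits, so that input is rejected.
inline int8_t negateForSAddSat(int8_t c) {
  assert(c != INT8_MIN && "signed minimum cannot be negated");
  return int8_t(-c);
}
```

With C equal to the signed minimum the rewrite would be wrong: ssubSat(0, -128) saturates to 127, while an 8-bit negation of -128 wraps back to -128 and saddSat(0, -128) gives -128.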

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347772 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ValueTracking] Determine always-overflow condition for unsigned sub
Nikita Popov [Wed, 28 Nov 2018 16:37:04 +0000 (16:37 +0000)]
[ValueTracking] Determine always-overflow condition for unsigned sub

Always-overflow was already determined for unsigned addition, but
not subtraction. This patch establishes parity.

This allows us to perform some additional simplifications for
signed saturating subtractions.
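The three-way classification can be sketched with explicit value ranges. This is an assumption made for illustration: the URange type and both functions are hypothetical, and the real analysis in ValueTracking derives its facts differently.

```cpp
#include <cassert>
#include <cstdint>

enum class Overflow { Always, May, Never };

// Hypothetical inclusive range of possible values for an operand.
struct URange {
  uint32_t Min, Max;
};

// L - R wraps exactly when L < R for the concrete values involved.
inline Overflow unsignedSubOverflow(URange L, URange R) {
  assert(L.Min <= L.Max && R.Min <= R.Max);
  if (L.Min >= R.Max)
    return Overflow::Never;  // every L is at least as large as every R
  if (L.Max < R.Min)
    return Overflow::Always; // every L is smaller than every R
  return Overflow::May;
}

// The addition counterpart, for comparison.
inline Overflow unsignedAddOverflow(URange L, URange R) {
  assert(L.Min <= L.Max && R.Min <= R.Max);
  uint64_t Lo = uint64_t(L.Min) + R.Min;
  uint64_t Hi = uint64_t(L.Max) + R.Max;
  if (Hi <= UINT32_MAX)
    return Overflow::Never;
  if (Lo > UINT32_MAX)
    return Overflow::Always;
  return Overflow::May;
}
```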

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347771 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] Use known overflow information for saturating add/sub
Nikita Popov [Wed, 28 Nov 2018 16:36:59 +0000 (16:36 +0000)]
[InstCombine] Use known overflow information for saturating add/sub

If ValueTracking can determine that the add/sub can never overflow,
replace it with the corresponding nuw/nsw add/sub.

Additionally, for the unsigned case, if ValueTracking determines
that the add/sub always overflows, replace the result with the
saturation value.
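A rough sketch of that decision for the unsigned case, on 32-bit values. The enum and function names are hypothetical and only mirror the prose above; they are not LLVM's API.

```cpp
#include <cassert>
#include <cstdint>

enum class Overflow { Always, May, Never };
enum class Action { PlainOp, SaturationConstant, KeepSaturating };

// What to do with an unsigned saturating add/sub given the overflow fact.
inline Action simplifyUnsignedSat(Overflow O) {
  switch (O) {
  case Overflow::Never:
    return Action::PlainOp;            // an ordinary add/sub gives the same result
  case Overflow::Always:
    return Action::SaturationConstant; // the result is always the clamp value
  case Overflow::May:
    break;
  }
  return Action::KeepSaturating;
}

// Clamp value: all ones for an add, zero for a sub.
inline uint32_t saturationValue(bool IsAdd) { return IsAdd ? UINT32_MAX : 0u; }

// Reference semantics used to check the two rewrites.
inline uint32_t uaddSat(uint32_t a, uint32_t b) {
  uint32_t s = a + b; // wraps on overflow
  return s < a ? UINT32_MAX : s;
}
inline uint32_t usubSat(uint32_t a, uint32_t b) { return a < b ? 0u : a - b; }
```

When the operation cannot overflow, uaddSat(5, 7) is simply 5 + 7; when it must overflow, as in usubSat(3, 9), the result is the saturation value regardless of the exact operands.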

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347770 91177308-0d34-0410-b5e6-96231b3b80d8