git.osdn.net Git - android-x86/external-llvm.git/log

[X86] Use getTypeAction in most places that were checking ExperimentalVectorWideningLegalization.

This will allow more flexibility in what types we legalize via widening or not. This should help with a couple lines in D41062.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324980 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove duplicate CHECK-LABEL line the update script didn't delete when I converted the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324979 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Document the shortcomings of DwarfExpression::addMachineReg()."

This reverts commit r324972. This commit broke a bot, so perhaps it is
testable after all?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324977 91177308-0d34-0410-b5e6-96231b3b80d8

[Utils] Salvage debug info of DCE'ed mul/sdiv/srem instructions

Here are the number of additional debug values salvaged in a stage2
build of clang:

63 SALVAGE: MUL
1250 SALVAGE: SDIV

(No values were salvaged from `srem` instructions in this experiment,
but it's a simple case to handle so we might as well.)
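
To illustrate the idea behind this salvaging: when a dead `mul` or `sdiv` is deleted, a debug value that referred to its result can instead be described as an expression over the surviving operand. The sketch below is a toy stack evaluator, not LLVM's implementation. `OpKind`, `Op`, `evalSalvaged`, `MulBy3` and `DivBy4` are hypothetical names, and it assumes the deleted instruction had a constant right-hand operand.

```cpp
#include <cstdint>

// Hypothetical mini expression language, loosely modelled on stack-based
// debug expressions. These names are illustrative only.
enum class OpKind { PushConst, Mul, Div };

struct Op {
  OpKind Kind;
  std::int64_t Arg; // only meaningful for PushConst
};

// Evaluates Ops on a small stack whose first entry is the surviving value X.
// Assumes well-formed input that never needs more than 8 stack slots.
constexpr std::int64_t evalSalvaged(std::int64_t X, const Op *Ops, int N) {
  std::int64_t Stack[8] = {X};
  int Top = 0;
  for (int I = 0; I < N; ++I) {
    if (Ops[I].Kind == OpKind::PushConst) {
      Stack[++Top] = Ops[I].Arg;
      continue;
    }
    std::int64_t RHS = Stack[Top--];
    std::int64_t LHS = Stack[Top];
    Stack[Top] = Ops[I].Kind == OpKind::Mul ? LHS * RHS : LHS / RHS;
  }
  return Stack[Top];
}

// "y = mul x, 3" was deleted: describe y in terms of x.
constexpr Op MulBy3[] = {{OpKind::PushConst, 3}, {OpKind::Mul, 0}};
// "z = sdiv x, 4" was deleted: describe z in terms of x.
constexpr Op DivBy4[] = {{OpKind::PushConst, 4}, {OpKind::Div, 0}};

static_assert(evalSalvaged(7, MulBy3, 2) == 21, "mul is recoverable from x");
static_assert(evalSalvaged(-9, DivBy4, 2) == -2, "sdiv truncates toward zero");
```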

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324976 91177308-0d34-0410-b5e6-96231b3b80d8

[Utils] Salvage debug info of DCE'ed shl/lshr/ashr instructions

Here are the number of additional debug values salvaged in a stage2
build of clang:

  1912 SALVAGE: ASHR
   405 SALVAGE: LSHR
   249 SALVAGE: SHL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324975 91177308-0d34-0410-b5e6-96231b3b80d8

[Utils] Salvage the debug info of DCE'ed 'sub' instructions

This salvages 14 debug values in a stage2 build of clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324974 91177308-0d34-0410-b5e6-96231b3b80d8

[Utils] Salvage the debug info of DCE'ed 'xor' instructions

This salvages 259 debug values in a stage2 build of clang.

Differential Revision: https://reviews.llvm.org/D43207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324973 91177308-0d34-0410-b5e6-96231b3b80d8

Document the shortcomings of DwarfExpression::addMachineReg().

Also make a drive-by-fix of a bug in the subregister scan code that
only triggers with an incomplete or otherwise very irregular machine
description.

rdar://problem/37404493

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324972 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: IRTranslate llvm.fmuladd.* intrinsic

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, bogner

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D43090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324971 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] allow exp/log simplifications with only 'reassoc' FMF

These intrinsic folds were added with D41381, but only allowed with isFast().
That's more than necessary because FMF has 'reassoc' to apply to these
kinds of folds after D39304, and that's all we need in these cases.

Differential Revision: https://reviews.llvm.org/D43160

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324967 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Update ADT/TripleTest.cpp now that default file format has changed

Differential Revision: https://reviews.llvm.org/D43212

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324966 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Auto generate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324964 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] change tests to 'fast' to reflect current folds

The diff to use 'reassoc' is part of D43160; it should not have
been made with rL324961. Reverting that part here, so we'll
see the intended diff with the code change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324963 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Always recalculate postdominators when update yields different roots

Summary:
This patch makes postdominators always recalculate the tree when an update changes the tree roots.
As @dmgreen noticed in D41298 (https://reviews.llvm.org/D41298), the previous implementation was not conservative enough and it was possible to end up with a PostDomTree that was different than a freshly computed one.
The patch also compares postdominators with a freshly computed tree at the end of full verification to make sure we don't hit similar issues in the future.

This should (ideally) also be backported to 6.0 before the release, although I don't have any reports of this causing an observable error. It should be safe to do it even if it's late in the release, as the change only makes the current behavior more conservative.

Reviewers: dmgreen, dberlin, davide, brzycki, grosser

Reviewed By: brzycki, grosser

Subscribers: llvm-commits, dmgreen

Differential Revision: https://reviews.llvm.org/D43140

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324962 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] consolidate tests for log-exp inverse folds

Some tests didn't add much value because we already show stronger
constraints for the folds in other tests, so the weaker versions
were deleted.

Moved the remaining tests into 1 file because the folds are
very similar and handled from 1 place in the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324961 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Simplify MemTransferInst's source and dest alignments separately

Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
InstCombine pass to cease using the deprecated MemoryIntrinsic::getAlignment() method, and
instead we use the separate getSourceAlignment and getDestAlignment APIs to simplify
the source and destination alignment attributes separately.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: majnemer, bollu, efriedma

Reviewed By: efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D42871

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324960 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[LSR] Avoid UB overflow when examining reuse opportunities"

This reverts commit r324943.

Breaking bots, reverting for Gerolf.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324958 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] MC: Remove redundant struct types

Differential Revision: https://reviews.llvm.org/D43210

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324957 91177308-0d34-0410-b5e6-96231b3b80d8

[SafeStack] Use updated CreateMemCpy API to set more accurate source and destination alignments.

Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
creation of memcpys in the SafeStack pass to set the alignment of the destination object to
its stack alignment while separately setting the source byval arguments alignment to its
alignment.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. (rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: eugenis, bollu

Reviewed By: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324955 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Reverse the operand order of the autoupgrade of the kunpack builtins.

The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.
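
As a sketch of the operand order described here, the function below models the 8-bit to 16-bit mask concatenation on plain integers. `kunpackBW` is a hypothetical helper name, not the builtin itself.

```cpp
#include <cstdint>

// Concatenates the low bytes of two masks. The second operand (B) ends up in
// the low bits of the result, the first operand (A) in the high bits.
constexpr std::uint16_t kunpackBW(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint16_t>(((A & 0xFF) << 8) | (B & 0xFF));
}

static_assert(kunpackBW(0x12, 0x34) == 0x1234,
              "second operand lands in the low byte");
```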

Fixes PR36360.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324953 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] MC: Remove redundant `private` specifiers

This is in line with the other MCSection and MCSymbol subclasses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324950 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add select test to show there's no single right answer (PR28968); NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324947 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify switch statement (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324945 91177308-0d34-0410-b5e6-96231b3b80d8

[LSR] Avoid UB overflow when examining reuse opportunities

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324943 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix casting MCSymbol to MCSymbolWasm on ELF

Summary:
wasm32-unknown-unknown-elf has MCSymbols that are not MCSymbolWasms, so
we need a non-asserting cast here.

Reviewers: dschuff, sunfish

Subscribers: jfb, sbc100, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D43205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324942 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] make binops with undef operands consistent with IR

This started by noticing that scalar and vector types were producing different results with div ops in PR36305:
https://bugs.llvm.org/show_bug.cgi?id=36305

...but the problem is bigger. I couldn't keep it straight without a table, so I'm attaching that as a PDF to
the review. The x86 tests in undef-ops.ll correspond to that table.

Green means that instsimplify and the DAG agree on the result for all types.
Red means the DAG was returning undef when IR was not.
Yellow means the DAG was returning a non-undef result when IR returned undef.

This patch assumes that we're currently doing the right thing in IR.

Note: I couldn't find any problems with lowering vector constants as the code comments were warning,
but those comments were written long ago in rL36413 .

Differential Revision: https://reviews.llvm.org/D43141

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324941 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Simplify X86DAGToDAGISel::matchBEXTRFromAnd by creating an X86ISD::BEXTR node and calling Select. Add isel patterns to recognize this node.

This removes a bunch of special case code for selecting the immediate and folding loads.
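
For context, a software model of the bit-field extract that BEXTR performs. This is a sketch that assumes the usual control encoding (start position in bits 7:0, length in bits 15:8) and that the matched source pattern has the shape `(x >> shift) & mask` with a low-bit mask. `bextr32` is a hypothetical name.

```cpp
#include <cstdint>

// Extracts Len bits of Src starting at bit Start, where
// Control = Start | (Len << 8).
constexpr std::uint32_t bextr32(std::uint32_t Src, std::uint32_t Control) {
  std::uint32_t Start = Control & 0xFFu;
  std::uint32_t Len = (Control >> 8) & 0xFFu;
  if (Start >= 32)
    return 0;
  std::uint32_t Shifted = Src >> Start;
  return Len >= 32 ? Shifted : (Shifted & ((1u << Len) - 1u));
}

// (x >> 8) & 0xFF expressed as a single extract with start 8, length 8.
static_assert(bextr32(0xABCD1234u, 8u | (8u << 8)) ==
                  ((0xABCD1234u >> 8) & 0xFFu),
              "shift-and-mask matches the extract");
```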

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324939 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove unused multiclass argument. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324938 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalMerge] Allow merging of dllexported variables

If merging them, the dllexport attribute needs to be brought along
to the new GlobalAlias.

Differential Revision: https://reviews.llvm.org/D43192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324937 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the syntax highlighting of strings in dwarfdump.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324936 91177308-0d34-0410-b5e6-96231b3b80d8

Factor out common condition into an easier to understand helper function (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324935 91177308-0d34-0410-b5e6-96231b3b80d8

Move the debuginfo-dce-or test into debuginfo-variables.ll, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324933 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[ThinLTO] Add GraphTraits for FunctionSummaries"

It caused assertion failure
Assertion failed: (!DD.IsLambda && !MergeDD.IsLambda && "faked up lambda definition?"), function MergeDefinitionData, file /Users/buildslave/jenkins/workspace/clang-stage1-configure-RA/llvm/tools/clang/lib/Serialization/ASTReaderDecl.cpp, line 1675.

on the second stage build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324932 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Follow on to rL324854 (Added tests)" as part of r324854 revert.

r324854 caused broken build on the second stage build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324931 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Unify ChecksumKind and Checksum value in DIFile

Rather than encode the absence of a checksum with a Kind variant, instead put
both the kind and value in a struct and wrap it in an Optional.
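
A minimal sketch of this design in standard C++17, using `std::optional` in place of LLVM's `Optional`. The type and member names (`ChecksumKind`, `Checksum`, `FileInfo`, `hasChecksum`) and the placeholder checksum string are illustrative assumptions, not the actual DIFile API.

```cpp
#include <optional>
#include <string_view>

// No "None" enumerator: absence is expressed by the optional, not the kind.
enum class ChecksumKind { MD5, SHA1 };

struct Checksum {
  ChecksumKind Kind;
  std::string_view Value;
};

struct FileInfo {
  std::string_view Name;
  std::optional<Checksum> Sum; // empty means the file has no checksum
};

constexpr bool hasChecksum(const FileInfo &F) { return F.Sum.has_value(); }

constexpr FileInfo WithSum{"a.c", Checksum{ChecksumKind::MD5, "0123abcd"}};
constexpr FileInfo WithoutSum{"b.c", std::nullopt};

static_assert(hasChecksum(WithSum), "kind and value travel together");
static_assert(!hasChecksum(WithoutSum), "absence needs no special kind");
```

With this shape a checksum kind can never be paired with a missing value, or the other way round.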

Differential Revision: http://reviews.llvm.org/D43043

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324928 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] X / (X * Y) --> 1.0 / Y

This is similar to the instsimplify fold added with D42385
( rL323716 )
...but this can't be in instsimplify because we're creating/morphing
a different instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324927 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for missing fdiv fold; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324926 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] regenerate checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324924 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] various clean-ups for div transforms; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324922 91177308-0d34-0410-b5e6-96231b3b80d8

[LICM] update BlockColors after splitting predecessors

Update BlockColors after splitting predecessors. Do not allow splitting
EHPad for sinking when the BlockColors is not empty, so we can
simply assign predecessor's color to the new block.

Fixes PR36184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324916 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic - llvm portion

https://reviews.llvm.org/D42993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324912 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add missing scheduling class tag for i64 absolute address moves

Expand existing SchedRW to encompass these like it did for the other memory offset movs - added comments to closing braces to keep track of def scopes.

We only tagged it with the itinerary class, so completeness checks were erroneously passed (PR35639).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324910 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Improve v8.1-A code-gen for atomic load-and

Armv8.1-A added an atomic load-clear instruction (which performs bitwise
and with the complement of its operand), but not a load-and
instruction. Our current code-generation for atomic load-and always
inserts an MVN instruction to invert its argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-and
operation into an xor with -1 and a load-clear, allowing the normal DAG
optimisations to work on it.
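
The identity behind this lowering can be sketched on plain integers. Atomicity is not modelled here, and `RMW`, `loadClear` and `loadAnd` are hypothetical names for illustration.

```cpp
#include <cstdint>

// Result of a read-modify-write: the value returned and the value stored.
struct RMW {
  std::uint32_t Old;
  std::uint32_t New;
};

// load-clear: memory becomes Mem AND NOT Operand.
constexpr RMW loadClear(std::uint32_t Mem, std::uint32_t Operand) {
  return {Mem, Mem & ~Operand};
}

// load-and rewritten as "xor with -1, then load-clear".
constexpr RMW loadAnd(std::uint32_t Mem, std::uint32_t V) {
  return loadClear(Mem, V ^ 0xFFFFFFFFu);
}

static_assert(loadAnd(0xF0F0u, 0x00FFu).New == (0xF0F0u & 0x00FFu),
              "same stored value as a direct AND");
```

Because the inversion is now an ordinary xor, it can be folded away when `V` is a constant: `loadAnd(Mem, 0x00FF)` becomes a load-clear of the constant `0xFFFFFF00`.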

To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't
see any easy way to do this with an AArch64-specific ISD node, because
the code-generation for atomic operations assumes the SDNodes are of
type AtomicSDNode.

I've left the old tablegen patterns in because they are still needed for
global isel.

Differential revision: https://reviews.llvm.org/D42478

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324908 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add missing scheduling class tag for KMOVB/KMOVW/KMOVD/KMOVQ moves/loads/stores.

We only tagged it with the itinerary class, so completeness checks were erroneously passed (PR35639).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324905 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Refactor identification of SIMD immediates

Get rid of icky goto loops and make the code easier to maintain (NFC).

Differential revision: https://reviews.llvm.org/D42723

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324903 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add missing scheduling class tag for VMOVQ/VMOVHLPS/VMOVLHPS/VMOVHPD/VMOVHPS/VMOVLPD/VMOVLPS

Tag AVX512 variants to match SSE/AVX originals.

We only tagged it with the itinerary class, so completeness checks were erroneously passed (PR35639).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324901 91177308-0d34-0410-b5e6-96231b3b80d8

Re-commit r324489: [DebugInfo] Improvements to representation of enumeration types (PR36168)

Differential Revision: https://reviews.llvm.org/D42734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324899 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Tag CET-IBT instruction scheduler classes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324898 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][MMX] Add missing scheduling class tag for EMMS/FEMMS

We only tagged it with the itinerary class, so completeness checks were erroneously passed (PR35639).

AMD targets can perform these a lot quicker than WriteMicrocoded so will need an override in the models.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324897 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix comment of class InstrStage

Patch by Wei-Ren Chen.

Differential Revision: https://reviews.llvm.org/D42905

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324894 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Take user instructions cost into consideration in insertelement vectorization.

Summary:
For better vectorization results we should take into consideration the
cost of the user insertelement instructions when we try to
vectorize sequences that build the whole vector. I.e. if we have the
following scalar code:
```
<Scalar code>
insertelement <ScalarCode>, ...
```
we should consider the cost of the last `insertelement` instructions as
the cost of the scalar code.

Reviewers: RKSimon, spatel, hfinkel, mkuper

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D42657

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324893 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Improve v8.1-A code-gen for atomic load-subtract

Armv8.1-A added an atomic load-add instruction, but not a load-subtract
instruction. Our current code-generation for atomic load-subtract always
inserts a NEG instruction to negate its argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-subtract
operation into a subtract and a load-add, allowing the normal DAG
optimisations to work on it.
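
The same idea on plain integers, as a sketch: subtracting `V` is adding its two's-complement negation. Atomicity is not modelled, and `RMW`, `loadAdd` and `loadSub` are hypothetical names.

```cpp
#include <cstdint>

// Result of a read-modify-write: the value returned and the value stored.
struct RMW {
  std::uint32_t Old;
  std::uint32_t New;
};

// load-add: memory becomes Mem + V (wrapping).
constexpr RMW loadAdd(std::uint32_t Mem, std::uint32_t V) {
  return {Mem, Mem + V};
}

// load-subtract rewritten as "negate (0 - V), then load-add".
constexpr RMW loadSub(std::uint32_t Mem, std::uint32_t V) {
  return loadAdd(Mem, 0u - V);
}

static_assert(loadSub(10u, 3u).New == 7u, "same as a direct subtract");
static_assert(loadSub(0u, 1u).New == 0xFFFFFFFFu, "wraps like a subtract");
```

When `V` is a constant, the `0 - V` subtraction folds to a constant, so no separate negate instruction is needed.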

I've left the old tablegen patterns in because they are still needed for
global isel.

Some of the tests in this patch are copied from D35375 by Chad Rosier (which
was abandoned).

Differential revision: https://reviews.llvm.org/D42477

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324892 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] various clean-ups for commonIDivTransforms; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324891 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit: reformat comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324889 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r324835 "[X86] Reduce Store Forward Block issues in HW"

It asserts building Chromium; see PR36346.

(This also reverts the follow-up r324836.)

> If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
> A "store forward block" occurs when a store cannot be forwarded to the load. The most typical case of a store forward block on the Intel Core microarchitecture is that a small store cannot be forwarded to a large load.
> The estimated penalty for a store forward block is ~13 cycles.
>
> This pass tries to recognize and handle cases where a "store forward block" is created by the compiler when lowering memcpy calls to a sequence
> of a load and a store.
>
> The pass currently only handles cases where memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies.
> Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324887 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix 'l' constraint handling for types smaller than 32 bits

When the 'l' constraint is used correctly, llvm now generates valid
code; otherwise it shows an error message. Previously these cases
triggered an assertion.

This commit is the same as r324869 with the test's file name fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324885 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Revert rL324869

This commit adds inlineasm-cnstrnt-bad-l.ll which is clashing
with inlineasm-cnstrnt-bad-L.ll on case insensitive file systems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324882 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopInterchange] Simplify splitInnerLoopHeader logic (NFC).

We can use SplitBlock for both cases, which makes the code slightly
simpler and updates both LoopInfo and the dominator tree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324881 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Add a -trap-unreachable option for debugging

Add a common -trap-unreachable option, similar to the target
specific hexagon equivalent, which has been replaced. This
turns unreachable instructions into traps, which is useful for
debugging.

Differential Revision: https://reviews.llvm.org/D42965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324880 91177308-0d34-0410-b5e6-96231b3b80d8

[gtest] Support raw_ostream printing functions more comprehensively.

Summary:
These are functions like operator<<(raw_ostream&, Foo).

Previously these were only supported for messages. In the assertion
EXPECT_EQ(A, B) << C;
the local modifications would explicitly try to use raw_ostream printing for C.
However A and B would look for a std::ostream printing function, and often fall
back to gtest's default "168 byte object <00 01 FE 42 ...>".

This patch pulls out the raw_ostream support into a new header under `custom/`.

I changed the mechanism: instead of a convertible stream, we wrap the printed
value in a proxy object to allow it to be sent to a std::ostream.
I think the new way is clearer.

I also changed the policy: we prefer raw_ostream printers over std::ostream
ones. This is because the fallback printers are defined using std::ostream,
while all the raw_ostream printers should be "good".

Reviewers: ilya-biryukov, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43091

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324876 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix 'l' constraint handling for types smaller than 32 bits

When the 'l' constraint is used correctly, llvm now generates valid
code; otherwise it shows an error message. Previously these cases
triggered an assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324869 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Issue error message when data region is not terminated

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324868 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix typos

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324867 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Make getPostIncExpr guaranteed to return AddRec

The current implementation of `getPostIncExpr` invokes `getAddExpr` for two recurrences
and expects that it always returns a recurrence. But this is not guaranteed to happen if we
have reached max recursion depth or refused to make SCEV simplification for other reasons.

This patch changes its implementation so that now it always returns SCEVAddRec without
relying on `getAddExpr`.
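
One way to build the post-increment recurrence directly, sketched on a toy model: for a chain of recurrences {c0,+,c1,+,...}, each operand is replaced by the sum of itself and the next operand, and the last operand is kept. The `AddRec` struct and the helpers below are hypothetical and only illustrate the arithmetic. They are not the SCEV classes.

```cpp
#include <cstdint>

constexpr int MaxOps = 4;

// Toy chain of recurrences {Ops[0],+,Ops[1],+,...}.
struct AddRec {
  std::int64_t Ops[MaxOps];
  int NumOps;
};

// Value at iteration N: sum of Ops[I] * C(N, I).
constexpr std::int64_t evaluateAt(const AddRec &R, std::int64_t N) {
  std::int64_t Result = 0;
  std::int64_t Binom = 1; // C(N, 0)
  for (int I = 0; I < R.NumOps; ++I) {
    Result += R.Ops[I] * Binom;
    Binom = Binom * (N - I) / (I + 1); // C(N, I + 1)
  }
  return Result;
}

// Post-increment form, built operand by operand so the result is always a
// recurrence with the same number of operands.
constexpr AddRec postInc(const AddRec &R) {
  AddRec Out = R;
  for (int I = 0; I + 1 < R.NumOps; ++I)
    Out.Ops[I] = R.Ops[I] + R.Ops[I + 1];
  return Out;
}

// Checks that postInc(R) at iteration N equals R at iteration N + 1.
constexpr bool matchesNextIteration(const AddRec &R, int MaxN) {
  for (int N = 0; N <= MaxN; ++N)
    if (evaluateAt(postInc(R), N) != evaluateAt(R, N + 1))
      return false;
  return true;
}

constexpr AddRec Example{{1, 2, 3, 0}, 3}; // {1,+,2,+,3}

static_assert(matchesNextIteration(Example, 10), "post-inc is a shifted rec");
```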

Differential Revision: https://reviews.llvm.org/D42953

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324866 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't look for TEST instruction shrinking opportunities when the root node is a X86ISD::SUB.

I don't believe we ever create an X86ISD::SUB with a 0 constant which is what the TEST handling needs. The ternary operator at the end of this code shows up as only going one way in the llvm-cov report from the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324865 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove check for X86ISD::AND with no flag users from the TEST instruction immediate shrinking code.

We turn X86ISD::AND with no flag users back to ISD::AND in PreprocessISelDAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324864 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Change some compare patterns to use loadi8/loadi16/loadi32/loadi64 helper fragments.

This enables CMP8mi to fold zextloadi8i1 which in all tests allows us to avoid creating a TEST8rr that peephole can't fold.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324863 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Autogenerate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324862 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add KADD X86ISD opcode instead of reusing ISD::ADD.

ISD::ADD implies individual vector element addition with no carries between elements. But for a vXi1 type that would be the same as XOR. And we already turn ISD::ADD into ISD::XOR for all vXi1 types during lowering. So the ISD::ADD pattern would never be able to match anyway.

KADD is different: it adds the elements but also propagates a carry between them. This is just a way of doing an add in a k-register without bitcasting to the scalar domain. There's still no way to match the pattern, but at least it's not obviously wrong.
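
The difference between the two kinds of add can be sketched on an 8-lane mask stored in a byte. `laneWiseAdd` and `carryAdd` are hypothetical helpers that model the semantics described above, not the instructions themselves.

```cpp
#include <cstdint>

// Element-wise add of eight 1-bit lanes: each lane wraps modulo 2 and no
// carry crosses into the neighbouring lane.
constexpr std::uint8_t laneWiseAdd(std::uint8_t A, std::uint8_t B) {
  std::uint8_t R = 0;
  for (int L = 0; L < 8; ++L) {
    unsigned Sum = ((A >> L) & 1u) + ((B >> L) & 1u); // 0, 1 or 2
    R = static_cast<std::uint8_t>(R | ((Sum & 1u) << L)); // truncate to 1 bit
  }
  return R;
}

// Carry-propagating add: the mask is treated as one 8-bit integer.
constexpr std::uint8_t carryAdd(std::uint8_t A, std::uint8_t B) {
  return static_cast<std::uint8_t>(A + B);
}

static_assert(laneWiseAdd(0x01, 0x01) == (0x01 ^ 0x01), "lane add is XOR");
static_assert(carryAdd(0x01, 0x01) == 0x02, "carry moves into lane 1");
```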

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324861 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Allow zextload/extload i1->i8 to be folded into instructions during isel

Previously we just emitted this as a MOV8rm which would likely get folded during the peephole pass anyway. This just makes it explicit earlier.

The gpr-to-mask.ll test changed because the kaddb instruction has no memory form.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324860 91177308-0d34-0410-b5e6-96231b3b80d8

Follow on to rL324854 (Added tests)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324859 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove MASK_BINOP intrinsic type. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324858 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove dead code from getMaskNode that looked for a i64 mask with a maskVT that wasn't v64i1. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324857 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove LowerBoolVSETCC_AVX512, we get this with a target independent DAG combine now. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324856 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Add GraphTraits for FunctionSummaries

Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324854 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Allow variable names to be as long as the codeview format supports

Instead of reserving 0xF00 bytes for the fixed length portion of the CodeView
symbol name, calculate the actual length of the fixed length portion.

Differential Revision: https://reviews.llvm.org/D42125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324850 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Update some required-vector-width.ll test cases to not pass 512-bit vectors in arguments or return.

The ABI for these would require 512-bit support so we don't want to test that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324845 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Use SplitBinaryOpsAndApply to recognise PSUBUS patterns before they're split on AVX1

This needs to be generalised further to support AVX512BW cases but I want to add non-uniform constants first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324844 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] X / (X * Y) -> 1 / Y if the multiplication does not overflow

The related cases for (X * Y) / X were handled in rL124487.

https://rise4fun.com/Alive/6k9

The division in these tests is subsequently eliminated by existing instcombines
for 1/X.
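
A small compile-time check of the integer identity, as a sketch. It covers only a narrow range of values where `X * Y` cannot overflow and skips zero operands, since those would divide by zero on the left-hand side. `foldHolds` is a hypothetical helper name.

```cpp
// Returns true if X / (X * Y) == 1 / Y for all nonzero X, Y in [Lo, Hi].
// The caller must pick a range where X * Y does not overflow int.
constexpr bool foldHolds(int Lo, int Hi) {
  for (int X = Lo; X <= Hi; ++X) {
    for (int Y = Lo; Y <= Hi; ++Y) {
      if (X == 0 || Y == 0)
        continue; // X * Y == 0 would make the original division undefined
      if (X / (X * Y) != 1 / Y)
        return false;
    }
  }
  return true;
}

static_assert(foldHolds(-12, 12), "signed division fold holds on this range");
```

For `|Y| >= 2` both sides are 0, and for `Y == 1` or `Y == -1` both sides equal `Y`, which is why the remaining `1 / Y` is easy for later folds to eliminate.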

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324843 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use min/max for vector ult/ugt compares if it avoids a sign flip.

Summary:
Currently we only use min/max to help with ule/uge compares because it removes an invert of the result that would otherwise be needed. But we can also use it for ult/ugt compares if it will prevent the need for a sign bit flip needed to use pcmpgt at the cost of requiring an invert after the compare.
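
The equivalences involved can be sketched on scalar 8-bit values. This is a model of the arithmetic only, with hypothetical helper names. The real code operates on vector nodes.

```cpp
#include <cstdint>

constexpr std::uint8_t umin8(std::uint8_t A, std::uint8_t B) {
  return A < B ? A : B;
}
constexpr std::uint8_t umax8(std::uint8_t A, std::uint8_t B) {
  return A < B ? B : A;
}

// Reinterprets the low 8 bits of V as a signed value.
constexpr int asSigned8(unsigned V) {
  return V >= 128 ? static_cast<int>(V) - 256 : static_cast<int>(V);
}

// ule via min: equality compare, no invert needed.
constexpr bool uleViaMin(std::uint8_t A, std::uint8_t B) {
  return umin8(A, B) == A;
}
// ult via max: equality compare followed by an invert.
constexpr bool ultViaMax(std::uint8_t A, std::uint8_t B) {
  return !(umax8(A, B) == A);
}
// ult via sign-bit flip: signed compare after XOR with 0x80 on both sides.
constexpr bool ultViaSignFlip(std::uint8_t A, std::uint8_t B) {
  return asSigned8(A ^ 0x80u) < asSigned8(B ^ 0x80u);
}

// Compares all three forms against the built-in operators on sampled values.
constexpr bool allAgree() {
  for (unsigned A = 0; A < 256; A += 5) {
    for (unsigned B = 0; B < 256; B += 7) {
      std::uint8_t X = static_cast<std::uint8_t>(A);
      std::uint8_t Y = static_cast<std::uint8_t>(B);
      if (uleViaMin(X, Y) != (A <= B) || ultViaMax(X, Y) != (A < B) ||
          ultViaSignFlip(X, Y) != (A < B))
        return false;
    }
  }
  return true;
}

static_assert(allAgree(), "min/max and sign-flip forms agree");
```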

I also refactored the code so that the max/min code is self contained and does its own return instead of setting up a flag to manipulate the rest of the function's behavior.

Most of the test cases look ok with this. I did notice that we added instructions when one of the operands being sign flipped is a constant vector that we were able to constant fold the flip into.

I also noticed that sometimes the SSE min/max clobbers a register that is needed after the compare. This resulted in an extra move being inserted before the min/max to preserve the register. We could try to detect this and switch from min to max and change the compare operands to use the operand that gets reused in the compare.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324842 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Moved SplitBinaryOpsAndApply earlier so more methods can use it. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324841 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for div-mul folds; NFC

The related cases for (X * Y) / X were handled in rL124487.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324840 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] try to create -1 constant operand for math ops via demanded bits

This reverses instcombine's demanded bits' transform which always tries to clear bits in constants.

As noted in PR35792 and shown in the test diffs:
https://bugs.llvm.org/show_bug.cgi?id=35792
...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity.

I did investigate changing instcombine's behavior, but it would be more work to change
canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions,
so we'd have to figure out how to not regress those cases.
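
The reason a constant can be widened to -1 is that the low k bits of an add, sub, or mul depend only on the low k bits of the operands. The sketch below illustrates that idea under stated assumptions: `can_use_all_ones` and `verify` are hypothetical helpers, and the real codegen heuristic may apply further restrictions that are not modeled here.

```python
def high_undemanded_mask(demanded, bits):
    """Bits strictly above the highest demanded bit."""
    top = demanded.bit_length()
    return ((1 << bits) - 1) & ~((1 << top) - 1)


def can_use_all_ones(const, demanded, bits=32):
    """True if `const` becomes -1 once the undemanded high bits are set."""
    all_ones = (1 << bits) - 1
    const &= all_ones
    if const == all_ones:  # already -1, nothing to change
        return False
    return (const | high_undemanded_mask(demanded, bits)) == all_ones


def verify(const, demanded, bits=8):
    """Brute-force: the swap to -1 never changes a demanded result bit."""
    all_ones = (1 << bits) - 1
    ops = (lambda x, c: x + c, lambda x, c: x - c, lambda x, c: x * c)
    return all(
        ((op(x, const) ^ op(x, all_ones)) & demanded) == 0
        for op in ops
        for x in range(1 << bits)
    )
```

For example, with only the low four bits demanded, the 8-bit constant 0x0F can be replaced by 0xFF, while 0x07 cannot.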

Differential Revision: https://reviews.llvm.org/D42986

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324839 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add PR33747 test case

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324838 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Enable SMIN/SMAX/UMIN/UMAX custom lowering for all legal types

This allows us to recognise more saturation patterns and also simplify some MINMAX codegen that was failing to combine CMPGE comparisons to a legal CMPGT.

Differential Revision: https://reviews.llvm.org/D43014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324837 91177308-0d34-0410-b5e6-96231b3b80d8

fix test/CodeGen/X86/fixup-sfb.ll test failure after commit https://reviews.llvm.org/rL324835

Change-Id: I2526c2f342654e85ce054237de03ae9db9ab4994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324836 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Reduce Store Forward Block issues in HW

If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
A "store forward block" occurs when a store cannot be forwarded to the load. The most typical case of a store forward block on the Intel Core microarchitecture is that a small store cannot be forwarded to a large load.
The estimated penalty for a store forward block is ~13 cycles.

This pass tries to recognize and handle cases where a "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store.

The pass currently only handles cases where memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies.
Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.
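
To make the idea of "breaking the memcpy into smaller copies" concrete, here is a simplified model, not the pass itself. It assumes a single earlier narrow store at a known byte range inside the copied region and splits the copy so that no piece straddles that range. The function name, the available widths, and the greedy splitting strategy are all assumptions for illustration.

```python
def split_copy(size, st_off, st_size, widths=(8, 4, 2, 1)):
    """Split a `size`-byte copy into (offset, width) pieces.

    Pieces never straddle the boundaries of the earlier store that covers
    bytes [st_off, st_off + st_size), so each load is either fully inside
    the stored bytes or does not touch them at all.
    """
    # Cut points: both ends of the copy and both ends of the narrow store.
    cuts = sorted(c for c in {0, st_off, st_off + st_size, size} if 0 <= c <= size)
    pieces = []
    for start, end in zip(cuts, cuts[1:]):
        off = start
        while off < end:
            # Greedily take the widest allowed piece that still fits.
            width = next(w for w in widths if w <= end - off)
            pieces.append((off, width))
            off += width
    return pieces
```

With a 16-byte copy and a 4-byte store at offset 4, `split_copy(16, 4, 4)` yields a 4-byte piece, a 4-byte piece that matches the store exactly, and an 8-byte piece for the remainder.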

Change-Id: I620b6dc91583ad9a1444591e3ddc00dd25d81748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324835 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't make 512-bit vectors legal when preferred vector width is 256 bits and 512 bits aren't required

This patch adds a new function attribute "required-vector-width" that can be set by the frontend to indicate the maximum vector width present in the original source code. The idea is that this would be set based on ABI requirements, intrinsics or explicit vector types being used, maybe simd pragmas, etc. The backend will then use this information to determine if it's safe to make 512-bit vectors illegal when the preference is for 256-bit vectors.

For code that has no vectors in it originally and only gets vectors through the loop and slp vectorizers, this allows us to generate code largely similar to our AVX2-only output while still enabling AVX512 features like mask registers and gather/scatter. The loop vectorizer doesn't always obey TTI and will create oversized vectors with the expectation that the backend will legalize them. In order to avoid changing the vectorizer and potentially harming our AVX2 codegen, this patch tries to make the legalizer behavior similar.

This is restricted to CPUs that support AVX512F and AVX512VL so that we have good fallback options to use 128 and 256-bit vectors and still get masking.
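
The decision described above can be summarized as a small predicate. This is a sketch of the stated rule only: the function and parameter names are hypothetical, and the exact thresholds the backend compares against are an assumption.

```python
def allow_512_bit_vectors(has_avx512f, has_vlx, prefer_width, required_width):
    """Model of when 512-bit vector types would stay legal.

    prefer_width:   the preferred vector width in bits
    required_width: value of the "required-vector-width" attribute in bits
    """
    if not has_avx512f:
        return False  # no 512-bit registers at all
    if not has_vlx:
        return True  # restriction only applies when AVX512VL is available
    # With VL available, drop 512-bit legality unless preferred or required.
    return prefer_width >= 512 or required_width >= 512
```

Under this model, a function compiled for an AVX512F+VL target with a 256-bit preference and no explicit 512-bit usage would be legalized with 128-bit and 256-bit vectors only.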

I've qualified every place I could find in X86ISelLowering.cpp and added test cases for many of them with 2 different values for the attribute to see the codegen differences.

We still need to do frontend work for the attribute and teach the inliner how to merge it, etc. But this gets the codegen layer ready for it.

Differential Revision: https://reviews.llvm.org/D42724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324834 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove setOperationAction lines for promoting vXi1 SINT_TO_FP/UINT_TO_FP.

We promote these via a DAG combine now before lowering gets the chance.

Also remove the v2i1 custom handling since it will no longer be triggered.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324833 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Remove TargetLowering::getConstTrueVal. Use SelectionDAG::getBoolConstant in the one place it was used.

SelectionDAG::getBoolConstant was recently introduced. At the time I didn't know getConstTrueVal existed, but I think getBoolConstant is better as it will use the source VT to make sure it can properly detect floating point if it is configured differently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324832 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some redundant qualifications from the setOperationAction blocks. NFC

These were added as part of the refactoring for prefer vector width. At the time I thought the hasAVX512 here would be replaced with "allow 512 bit vectors" so that it would read "allow 512 bit vectors OR VLX". But now the plan is to only give the option of disabling 512 bit vectors when VLX is enabled, so we don't need this qualification at all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324831 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add SMIN/SMAX combine test

As discussed on D43014, we need the ability to flip SMIN/SMAX to (legal) UMIN/UMAX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324829 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Change signatures of avx512 packed fp compare intrinsics to return a vXi1 mask type to be closer to an fcmp.

Summary:
This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit 'and' in the IR.

This makes the intrinsic look much more similar to an fcmp instruction that we wish we could use for these but can't. We already use icmp instructions for integer compares.

Previously the lowering step of isel would turn the intrinsic into an X86-specific ISD node and emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.

This should make any issues with gpr<->mask sequences the same between integer and fp, meaning we only have to fix them once.

Reviewers: spatel, delena, RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324827 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add UMIN/UMAX combine test

As discussed on D43014, we need the ability to flip UMIN/UMAX to (legal) SMIN/SMAX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324826 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add constant vector support for ~(C >> Y) --> ~C >> Y

Includes adding m_NonNegative constant pattern matcher
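
One reading of this fold that is consistent with a non-negative constant matcher is that the shift kind flips: inverting a logical shift of a non-negative constant gives an arithmetic shift of the inverted constant, and the reverse for a negative constant. That reading is an assumption, and the sketch below only checks it exhaustively on 8-bit scalars with hypothetical helper names.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def lshr(v, s):
    """Logical shift right: zero-fill from the top."""
    return (v & MASK) >> s


def ashr(v, s):
    """Arithmetic shift right: sign-fill from the top."""
    signed = v - (1 << BITS) if v & SIGN else v
    return (signed >> s) & MASK


def bit_not(v):
    return ~v & MASK


def check_not_of_shift():
    for c in range(1 << BITS):
        for y in range(BITS):
            if c & SIGN:
                # Negative C: ~(C >>s Y) == ~C >>u Y
                if bit_not(ashr(c, y)) != lshr(bit_not(c), y):
                    return False
            else:
                # Non-negative C: ~(C >>u Y) == ~C >>s Y
                if bit_not(lshr(c, y)) != ashr(bit_not(c), y):
                    return False
    return True
```

The identity holds because the bits shifted in at the top are copies of the sign bit in one form and its inverse in the other.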

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324825 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Increase PMULLD costs to better match hardware

Until Skylake, most hardware could only issue a PMULLD op every other cycle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324823 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Custom legalize (v2i32 (setcc (v2f32))) so that we don't end up with a (v4i1 (setcc (v4f32)))

Under VLX, getSetCCResultType returns v2i1/v4i1 for v2f32/v4f32 so default type legalization will end up changing the setcc result type back to vXi1 if it had been extended. The resulting extend gets messed up further by type legalization and is difficult to recombine back to (v4i32 (setcc (v4f32))) after legalization.

I went ahead and enabled this for SSE2 and later since it's always the result we want and this helps type legalization get there in fewer steps.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324822 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Extend inputs with elements smaller than i32 to sint_to_fp/uint_to_fp before type legalization.

This prevents extends of masks being introduced during lowering, where it becomes difficult to combine them out.

There are a few oddities in here.

We sometimes concatenate two k-registers produced by two compares, sign_extend the combined pair, then extract two halves. This worked better previously because the sign_extend wasn't created until after the fp_to_sint was split which led to a split sign_extend being created.

We probably also need to custom type legalize (v2i32 (sext v2i1)) via widening.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324820 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some check-prefixes from avx512-cvt.ll to prepare for an upcoming patch.

The update script sometimes has trouble when there are check-prefixes representing every possible combination of feature flags. I have a patch where the update script was generating something that didn't pass lit.

This patch just removes some check-prefixes and expands out some of the checks to work around this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324819 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] preserve test intent by removing undef

D43141 proposes to correct undef folding in the DAG,
and this test would not survive that change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324817 91177308-0d34-0410-b5e6-96231b3b80d8