git.osdn.net Git - android-x86/external-llvm.git/log

[Power9] Add tests for passing float128 in VSX reg for non-homogenous aggregates

Add missing testcase for rL336310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336313 91177308-0d34-0410-b5e6-96231b3b80d8

[demangler] Avoid alignment warning

The alignment specified by a constant for the field
`BumpPointerAllocator::InitialBuffer` exceeded the alignment
guaranteed by `malloc` and `new` on Windows. This change sets
the alignment to that of `long double`, which is defined
by the target platform.
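
A minimal sketch of the idea, not the demangler's actual code: `BumpBuffer`, its 4096-byte size, and `isLongDoubleAligned` are hypothetical names chosen for illustration. The buffer is aligned like `long double` instead of with a hard-coded constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a bump allocator with an inline initial buffer.
struct BumpBuffer {
  static constexpr std::size_t Size = 4096;
  // Align like `long double`, a fundamental type, so the requirement stays
  // within what the platform's allocation functions guarantee.
  alignas(long double) char Buffer[Size];
  std::size_t Used = 0;

  // Hand out N bytes, rounded up so the next block stays aligned.
  void *allocate(std::size_t N) {
    constexpr std::size_t A = alignof(long double);
    N = (N + A - 1) / A * A;
    assert(Used + N <= Size && "sketch only: no fallback slab");
    void *P = Buffer + Used;
    Used += N;
    return P;
  }
};

// True if P satisfies the alignment of `long double`.
static bool isLongDoubleAligned(const void *P) {
  return reinterpret_cast<std::uintptr_t>(P) % alignof(long double) == 0;
}
```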

It fixes https://bugs.llvm.org/show_bug.cgi?id=37944.

Differential Revision: https://reviews.llvm.org/D48889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336311 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg

Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.

This is done via custom lowering a bitcast of a build_pair(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336310 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Legalize and emit code for quad-precision convert from single-precision

Legalize and emit code for converting a single-precision floating-point value
to quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336307 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Implement float128 parameter passing and return values

This patch enables parameter passing and return by value for float128 types.
Support for passing aggregates/unions that contain float128 members will be
submitted in subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336306 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some isel patterns for X86ISD::SELECTS that specifically looked for the v1i1 mask to have come from a scalar_to_vector from GR8.

We have patterns for SELECTS that stop at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336305 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg input.

Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)).

This patch fixes that so we can also combine things like (fmsub (fneg)).
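
As a scalar illustration only (this is not the DAG combine itself), the four forms can be modelled with `std::fma`. The helper names below are hypothetical and follow the X86ISD opcode naming. Negating one input of any form yields another form, which is what makes the fold possible.

```cpp
#include <cassert>
#include <cmath>

// Scalar models of the four fused forms.
static double fmadd(double A, double B, double C)  { return std::fma(A, B, C); }   //  (A*B) + C
static double fmsub(double A, double B, double C)  { return std::fma(A, B, -C); }  //  (A*B) - C
static double fnmadd(double A, double B, double C) { return std::fma(-A, B, C); }  // -(A*B) + C
static double fnmsub(double A, double B, double C) { return std::fma(-A, B, -C); } // -(A*B) - C

// For non-NaN inputs, an fneg on an operand can be absorbed by switching form.
static void checkFnegFolds(double A, double B, double C) {
  assert(fmsub(-A, B, C) == fnmsub(A, B, C));  // (fmsub (fneg A), B, C)
  assert(fnmadd(-A, B, C) == fmadd(A, B, C));  // (fnmadd (fneg A), B, C)
  assert(fnmsub(A, B, -C) == fnmadd(A, B, C)); // (fnmsub A, B, (fneg C))
}
```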

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336304 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang.

There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes.

More removals to come, but I wanted to stop and fix the regression that showed up in this first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336303 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Legalize and emit code for round & convert quad-precision values

Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336299 91177308-0d34-0410-b5e6-96231b3b80d8

Silence an MSVC C4189 warning about a local variable being initialized but not used; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336298 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Warn when crc, ginv, virt flags are used with too old revision

CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.
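
The rule above could be sketched as follows. `AseRequest`, `checkAseRevisions`, and the warning strings are hypothetical and are not taken from the MIPS backend.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of what the user asked for.
struct AseRequest {
  unsigned Revision; // MIPS architecture revision, e.g. 2, 5 or 6
  bool CRC;
  bool GINV;
  bool Virt;
};

// Collect one warning per ASE whose minimum revision is not met.
static std::vector<std::string> checkAseRevisions(const AseRequest &R) {
  std::vector<std::string> Warnings;
  if (R.CRC && R.Revision < 6)
    Warnings.push_back("crc requires revision 6");
  if (R.GINV && R.Revision < 6)
    Warnings.push_back("ginv requires revision 6");
  if (R.Virt && R.Revision < 5)
    Warnings.push_back("virt requires revision 5");
  return Warnings;
}
```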

Differential Revision: https://reviews.llvm.org/D48843

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336296 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler

  We want to run the Machine Scheduler instead of the List Scheduler after RA.
  Checked with a performance run on a Power 9 machine with SPEC 2006; while
  some benchmarks improved and others degraded, the geomean improved slightly
  with the Machine Scheduler.

  Differential Revision: https://reviews.llvm.org/D45265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336295 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Add DomTreeUpdater constructor from DT* and PDT*

Summary:
Previously, if a function accepts an optional DT pointer,
```
void Foo (.., DominatorTree * DT = nullptr) {
  ...
  if(DT)
    DomTreeUpdater(*DT, ...).insertEdge(A, B);
  if(DT){
    DomTreeUpdater DTU(*DT, ...);
    ... // Construct the update vector and applyUpdates
  }
  ...
  if(DT){
    DomTreeUpdater DTU(*DT, ...);
    ... // Construct the update vector and applyUpdates
  }
}
```
After this patch, it can be simplified as
```
void Foo (.., DominatorTree * DT = nullptr) {
  DomTreeUpdater DTU(DT, ...);
  ...
  DTU.insertEdge(A, B);
  if(DT){
    ... // Construct the update vector and applyUpdates
  }
  ...
  if(DT){
    ... // Construct the update vector and applyUpdates
  }
}
```
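
The pattern can be reproduced outside LLVM with a toy model. `ToyTree`, `ToyUpdater`, and `transform` below are hypothetical stand-ins, not the real `DominatorTree` or `DomTreeUpdater` API. The point is only that an updater built from a possibly-null pointer can turn updates into no-ops.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dominator tree: it just records edges.
struct ToyTree {
  std::vector<std::pair<int, int>> Edges;
};

// Updater constructed from a pointer that may be null.
class ToyUpdater {
  ToyTree *DT;

public:
  explicit ToyUpdater(ToyTree *DT) : DT(DT) {}
  bool hasTree() const { return DT != nullptr; }
  void insertEdge(int From, int To) {
    if (!DT)
      return; // nothing to keep up to date
    DT->Edges.emplace_back(From, To);
  }
};

// Caller no longer needs `if (DT)` around every single update.
static void transform(ToyTree *DT = nullptr) {
  ToyUpdater DTU(DT);
  DTU.insertEdge(0, 1);
}
```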
Patch by Chijun Sima <simachijun@gmail.com>.

Reviewers: kuhar, brzycki, dmgreen

Reviewed By: kuhar

Author: NutshellySima

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336294 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow narrowing of min/max/abs

We have bailout hacks based on min/max in various places in instcombine
that shouldn't be necessary. The affected test was added for:
D48930
...which is a consequence of the improvement in:
D48584 (https://reviews.llvm.org/rL336172)

I'm assuming the visitTrunc bailout in this patch was added specifically
to avoid a change from SimplifyDemandedBits, so I'm just moving that
below the EvaluateInDifferentType optimization. A narrow min/max is still
a min/max.
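
To illustrate the last sentence with plain integers (this is not InstCombine code, and the function names are made up), a signed max computed on sign-extended 8-bit values and then truncated agrees with the max computed directly in 8 bits:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Wide form: sign-extend to 32 bits, take the max, truncate back to 8 bits.
static std::int8_t wideSMax(std::int8_t A, std::int8_t B) {
  std::int32_t WA = A, WB = B;
  return static_cast<std::int8_t>(std::max(WA, WB));
}

// Narrow form: the same max computed directly in 8 bits.
static std::int8_t narrowSMax(std::int8_t A, std::int8_t B) {
  return std::max(A, B);
}

// Exhaustively compare both forms over all 8-bit input pairs.
static bool narrowMatchesWide() {
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B) {
      auto NA = static_cast<std::int8_t>(A);
      auto NB = static_cast<std::int8_t>(B);
      if (wideSMax(NA, NB) != narrowSMax(NA, NB))
        return false;
    }
  return true;
}
```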

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336293 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][BtVer2][MCA][NFC] Add CMPEQ dependency-breaking one-idioms tests

Summary: As per `Agner's Microarchitecture doc
(21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions)`,
these, like zero-idioms, are dependency-breaking,
although they produce ones and still consume resources.

FIXME: as discussed in D48877, llvm-mca handling is broken for these.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: gbedwell, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D48876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336292 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some irregular whitespace/indentation. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336291 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add value names to test; NFC

That makes it easier to mix and match lines into other tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336289 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] [Assembler] Support negative immediates: cover few missing cases

Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380; however, a few instruction options were missing.

This change adds negative immediates support and respective tests
for the following:

ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
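
A rough sketch of the underlying idea, assuming the usual trick of switching to the complementary instruction: ADD/SUB swap with the immediate negated, and AND/BIC swap with it inverted. `flipImmediate` is a hypothetical helper; it ignores suffixes, flag-setting variants, and whether the resulting immediate is encodable.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Returns the complementary (mnemonic, immediate) pair, or the input
// unchanged when no complement is known. Immediates are 32-bit patterns.
static std::pair<std::string, std::uint32_t>
flipImmediate(const std::string &Mnemonic, std::uint32_t Imm) {
  if (Mnemonic == "add")
    return {"sub", 0u - Imm}; // x + imm == x - (-imm)
  if (Mnemonic == "sub")
    return {"add", 0u - Imm};
  if (Mnemonic == "and")
    return {"bic", ~Imm}; // x & imm == x & ~(~imm)
  if (Mnemonic == "bic")
    return {"and", ~Imm};
  return {Mnemonic, Imm};
}
```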

Differential Revision: https://reviews.llvm.org/D48649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336286 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Fix typo in getOutliningCandidateInfo function name

getOutlininingCandidateInfo -> getOutliningCandidateInfo

Differential Revision: https://reviews.llvm.org/D48867

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336285 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add --file-headers (-f) option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336284 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add v16i16 shl x,c -> pmullw test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336277 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Update ThinLTO cache file atimes when on Windows

ThinLTO cache file access times are used for expiration based pruning
and since Vista, file access times are not updated by Windows by
default:

https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance

This means on Windows, cache files are currently being pruned from
creation time. This change manually updates cache files that are
accessed by ThinLTO, when on Windows.

Patch by Owen Reynolds.

Differential Revision: https://reviews.llvm.org/D47266

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336276 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction.

This patch adds both a vector and an immediate form, e.g.

- Vector form:

    subr z0.h, p0/m, z0.h, z1.h

  subtract active elements of z0 from z1, and store the result in z0.

- Immediate form:

    subr z0.h, z0.h, #255

  subtract elements of z0 from the immediate (#255), and store the result in z0.
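
The two forms can be modelled lane by lane. This is only a sketch: real SVE vectors are scalable, and the fixed-size arrays and function names here are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Vector form with merging predication: active lanes get Zm - Zdn,
// inactive lanes keep their old value.
template <std::size_t N>
static void subrVector(std::array<std::uint16_t, N> &Zdn,
                       const std::array<bool, N> &Pg,
                       const std::array<std::uint16_t, N> &Zm) {
  for (std::size_t I = 0; I < N; ++I)
    if (Pg[I])
      Zdn[I] = static_cast<std::uint16_t>(Zm[I] - Zdn[I]);
}

// Immediate form (unpredicated): every lane gets Imm - Zdn.
template <std::size_t N>
static void subrImmediate(std::array<std::uint16_t, N> &Zdn,
                          std::uint16_t Imm) {
  for (std::size_t I = 0; I < N; ++I)
    Zdn[I] = static_cast<std::uint16_t>(Imm - Zdn[I]);
}
```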

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336274 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add SSE2 target to some shift tests

Show the difference in behaviour cf SSE41 (no PMULLD, PBLENDW etc.)

Raised by D48936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336271 91177308-0d34-0410-b5e6-96231b3b80d8

NFC - Various typo fixes in tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336268 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for instructions to set/read FFR.

Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
    rdffr   p0.b
- RDFFR (predicated)
    rdffr   p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
    rdffrs  p0.b, p0/z

Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
    setffr
- WRFFR  (unpredicated)
    wrffr   p0.b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336267 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Remove dead comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336266 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for FP conversion instructions.

The variants added are:

- fcvt   (FP convert precision)
- scvtf  (signed int -> FP)
- ucvtf  (unsigned int -> FP)
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))

For example:
  fcvt   z0.h, p0/m, z0.s  (single- to half-precision FP)
  scvtf  z0.h, p0/m, z0.s  (32-bit int to half-precision FP)
  ucvtf  z0.h, p0/m, z0.s  (32-bit unsigned int to half-precision FP)
  fcvtzs z0.s, p0/m, z0.h  (half-precision FP to 32-bit int)
  fcvtzu z0.s, p0/m, z0.h  (half-precision FP to 32-bit unsigned int)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336265 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Add test that shows that InstCombine can do better

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336258 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo][LoopVectorize] Preserve DL in generated phi instruction

When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.

Differential Revision: https://reviews.llvm.org/D48769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336256 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo][InstCombine] Preserve DI after combining zext

When zext is EvaluatedInDifferentType, InstCombine
drops the dbg.value intrinsic. This patch tries to
preserve said DI, by inserting the zext's old DI in the
resulting instruction. (Only for integer type for now)

Differential Revision: https://reviews.llvm.org/D48331

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336254 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED)

We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.

Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336250 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add reduced crash test case for r336113 - [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values

The patch was reverted at r336189 due to crashes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336247 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for SVE condition code aliases

SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.

In short:

  SVE alias =>  AArch64 name
  --------------------------
  NONE      => EQ
  ANY       => NE
  NLAST     => HS
  LAST      => LO
  FIRST     => MI
  NFRST     => PL
  PMORE     => HI
  PLAST     => LS
  TCONT     => GE
  TSTOP     => LT
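
The table maps directly onto a lookup. `resolveConditionCode` is a hypothetical helper, not the assembler's implementation; unknown names are returned unchanged.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Translate an SVE condition code alias to its AArch64 name.
static std::string resolveConditionCode(const std::string &Name) {
  static const std::unordered_map<std::string, std::string> Aliases = {
      {"NONE", "EQ"},  {"ANY", "NE"},   {"NLAST", "HS"}, {"LAST", "LO"},
      {"FIRST", "MI"}, {"NFRST", "PL"}, {"PMORE", "HI"}, {"PLAST", "LS"},
      {"TCONT", "GE"}, {"TSTOP", "LT"}};
  auto It = Aliases.find(Name);
  return It == Aliases.end() ? Name : It->second;
}
```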

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336245 91177308-0d34-0410-b5e6-96231b3b80d8

[ImplicitNullChecks] Check for rewrite of register used in 'test' instruction

The following code pattern:

       mov %rax, %rcx
       test %rax, %rax
       %rax = ....
       je  throw_npe
       mov (%rcx), %r9
       mov (%rax), %r10

gets transformed into the following incorrect code after implicit null check pass:

       mov %rax, %rcx
       %rax = ....
       faulting_load_op("movl (%rax), %r10", throw_npe)
       mov (%rcx), %r9

For the implicit null check pass, if the register that is checked for null (i.e., the register used in the 'test' instruction) is written to before the conditional jump, we should avoid doing the optimization.
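
The guard amounts to scanning the instructions between the 'test' and the branch for a definition of the tested register. The toy model below uses hypothetical types and names and is not the pass's real data structures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical instruction: its text and the register it writes (empty if none).
struct ToyInst {
  std::string Text;
  std::string DefinedReg;
};

// The fold is only safe if nothing between the 'test' and the conditional
// jump redefines the register that was tested for null.
static bool canFoldNullCheck(const std::vector<ToyInst> &BetweenTestAndBranch,
                             const std::string &TestedReg) {
  for (const ToyInst &I : BetweenTestAndBranch)
    if (I.DefinedReg == TestedReg)
      return false;
  return true;
}
```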

Patch by Surya Kumari Jangala!

Differential Revision: https://reviews.llvm.org/D48627
Reviewed By: skatkov

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336241 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Remove SaveOr which is no longer used

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336237 91177308-0d34-0410-b5e6-96231b3b80d8

[lanai] Handle atomic load of i8 like regular load.

Loads and stores narrower than 64 bits are already atomic; this adds support for one special case of that. This needs to be expanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336236 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AsmParser] Fix inconsistent declaration parameter name in r336218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336232 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Expand v2f16 INSERT_VECTOR_ELT

Vectorization can create them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336227 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove repeated 'the' from multiple comments that have been copy and pasted. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336226 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add tests for low/high bit clearing with different attributes.

D48768 may turn some of these into shifts.

Reviewers: spatel

Reviewed By: spatel

Subscribers: spatel, RKSimon, llvm-commits, craig.topper

Differential Revision: https://reviews.llvm.org/D48767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336224 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix inconsistent declaration parameter name in r336195

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336223 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Make function parameter names in declarations match those of definitions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336222 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for shuffle+binop with constant op1; NFC

This adds coverage for a planned enhancement for ConstantExpr::getBinOpIdentity() noted in D48830.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336220 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AsmParser] Rework the in/out (%dx) hack one more time.

This patch adds a new token type specifically for (%dx). We will now always create this token when we parse (%dx). After all operands have been parsed, if the mnemonic is in/out we'll morph this token to a regular register token. Otherwise we keep it as the special DX token which won't match any instructions.

This removes the need for passing Mnemonic through the parsing functions. It also seems closer to gas where, when it's used on the wrong instruction, it just gets diagnosed as an invalid operand rather than a bad memory address.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336218 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AsmParser] Don't consider %eip as a valid register outside of 32-bit mode.

This might make the error message added in r335668 unneeded, but I'm not sure yet.

The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336217 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in lib/Support/Path.cpp to test commit access

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336216 91177308-0d34-0410-b5e6-96231b3b80d8

[Constants] add identity constants for fadd/fmul

As the test diffs show, the current users of getBinOpIdentity()
are InstCombine and Reassociate. SLP vectorizer is a candidate
for using this functionality too (D28907).

The InstCombine shuffle improvements are part of the planned
enhancements noted in D48830.

InstCombine actually has several other uses of getBinOpIdentity()
via SimplifyUsingDistributiveLaws(), but we don't call that for
any FP ops. Fixing that might be another part of removing the
custom reassociation in InstCombine that is only done for fadd+fmul.
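
As a reminder of why FP identity constants need care, the sketch below assumes IEEE-754 arithmetic in the default rounding mode. There, -0.0 is an identity for addition for every non-NaN input, whereas +0.0 is not because it changes the sign of -0.0, and 1.0 is an identity for multiplication. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Equal as values and also equal in sign, so +0.0 and -0.0 differ.
static bool sameValueAndSign(double A, double B) {
  return A == B && std::signbit(A) == std::signbit(B);
}

// True if adding C leaves X unchanged, including the sign of zero.
static bool isFAddIdentityFor(double C, double X) {
  return sameValueAndSign(X + C, X);
}

// True if multiplying by C leaves X unchanged, including the sign of zero.
static bool isFMulIdentityFor(double C, double X) {
  return sameValueAndSign(X * C, X);
}
```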

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336215 91177308-0d34-0410-b5e6-96231b3b80d8

[Reassociate] add tests for binop with identity constant; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336214 91177308-0d34-0410-b5e6-96231b3b80d8

[Reassociate] regenerate checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336211 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for FP Complex ADD/MLA.

The variants added in this patch are:

- Predicated Complex floating point ADD with rotate, e.g.

   fcadd   z0.h, p0/m, z0.h, z1.h, #90

- Predicated Complex floating point MLA with rotate, e.g.

   fcmla   z0.h, p0/m, z1.h, z2.h, #180

- Unpredicated Complex floating point MLA with rotate (indexed operand), e.g.

   fcmla   z0.h, z1.h, z2.h[0], #180

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336210 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to unselectable stores.

r336120 resulted in falling back to SelectionDAG more often due to the G_STORE
MMOs not matching the vreg size. This fixes that by explicitly any-extending the
value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336209 91177308-0d34-0410-b5e6-96231b3b80d8

[Reassociate] add test for missing FP constant analysis; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336208 91177308-0d34-0410-b5e6-96231b3b80d8

Rename lazy initialization functions to reflect behavior (NFC)

Suggested in review for D48698.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336207 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for FMUL (indexed)

Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:

  fmul z0.s, z1.s, z2.s[0]

which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.

This patch adds restricted register classes for SVE vectors:
  ZPR_3b (only z0..z7 are allowed)  - for indexed vector of 16/32-bit elements.
  ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336205 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for predicated unary operations.

The patch includes support for the following instructions:

       ABS z0.h, p0/m, z0.h
       NEG z0.h, p0/m, z0.h

  (S|U)XTB z0.h, p0/m, z0.h
  (S|U)XTB z0.s, p0/m, z0.s
  (S|U)XTB z0.d, p0/m, z0.d

  (S|U)XTH z0.s, p0/m, z0.s
  (S|U)XTH z0.d, p0/m, z0.d

  (S|U)XTW z0.d, p0/m, z0.d

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336204 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegen

Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code
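
For reference, one common divide-free expansion of signed division by a power of two is sketched below for 32-bit scalars. The sequence the DAG combiner actually emits may differ, and both function names are hypothetical. Division by the minimum signed value is handled as division by 2^31 followed by a negation.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^K (1 <= K <= 31), rounding toward zero.
static std::int32_t sdivPow2(std::int32_t X, unsigned K) {
  assert(K >= 1 && K <= 31);
  // Bias is 2^K - 1 for negative X and 0 otherwise.
  std::uint32_t Sign = X < 0 ? 0xFFFFFFFFu : 0u;
  std::uint32_t Bias = Sign >> (32 - K);
  // Add in 64 bits so the biased value cannot overflow.
  std::int64_t Sum = static_cast<std::int64_t>(X) + Bias;
  // Assumes >> on a negative value is an arithmetic shift
  // (guaranteed since C++20, and the usual behaviour before that).
  return static_cast<std::int32_t>(Sum >> K);
}

// Division by INT32_MIN, i.e. by -(2^31): divide by 2^31, then negate.
static std::int32_t sdivByMinSigned(std::int32_t X) {
  return -sdivPow2(X, 31);
}
```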

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336198 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fold shuffle-with-binop and common value

This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments
in this patch to complete this transform.

It's possible that a binop feeding a select-shuffle has been eliminated
by earlier transforms (or the code was just written like this in the 1st
place), so we'll fail to match the patterns that have 2 binops from:
D48401,
D48678,
D48662,
D48485.

In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass
through the original values of the source operand).

I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups.
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).
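
A small model of the transform for `add`, with four-lane arrays standing in for IR vectors. All names are hypothetical and this is not the InstCombine implementation. Lanes that the shuffle takes straight from the source operand get the identity constant (0 for add), so a single binop reproduces the shuffle result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;
using Mask4 = std::array<bool, 4>;

// Lane-wise add; unsigned lanes make wraparound well defined.
static Vec4 add(const Vec4 &X, const Vec4 &C) {
  Vec4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = X[I] + C[I];
  return R;
}

// Select-shuffle: lane I comes from A where TakeA[I] is set, else from B.
static Vec4 selectShuffle(const Mask4 &TakeA, const Vec4 &A, const Vec4 &B) {
  Vec4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = TakeA[I] ? A[I] : B[I];
  return R;
}

// Keep C in the lanes that use the binop, identity (0) in the ghost lanes.
static Vec4 fillWithAddIdentity(const Mask4 &TakeA, const Vec4 &C) {
  return selectShuffle(TakeA, C, Vec4{{0, 0, 0, 0}});
}
```

With these helpers, `selectShuffle(M, add(X, C), X)` and `add(X, fillWithAddIdentity(M, C))` produce the same vector.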

Differential Revision: https://reviews.llvm.org/D48830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM][NFC] Refactor sequential access for DSP

With a view to supporting parallel operations that store their results
to memory, refactor the consecutive access helper so that it can also
support store instructions.

Differential Revision: https://reviews.llvm.org/D48872

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336195 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Strip trailing whitespace. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336194 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Armv8.4-A: system registers

This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activity monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.

Differential Revision: https://reviews.llvm.org/D48871

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336193 91177308-0d34-0410-b5e6-96231b3b80d8

build_llvm_package.bat: Re-try the build steps

The build on Windows has been extra flaky recently; retrying helps.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336192 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Corrections for salvageDebugInfo

Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).

Consider this example
  %vla = alloca i32, i64 2
  call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())

Instcombine will turn it into
  %vla1 = alloca [2 x i32]
  %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla1, i64 0, i64 0
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())

If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
  %vla1 = alloca [2 x i32]
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())

The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))

I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid unnecessarily turning a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).

Reviewers: aprantl, vsk

Reviewed By: aprantl, vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336191 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values"

This reverts commit r336113. It causes crashes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336189 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Adjust AArch64 unit test

The signature of setRegToConstant changed in r336171, so adjust the AArch64
unit test in a similar way to how the X86 unit test was changed in that commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336188 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Add an AArch64 target

The target does just enough to be able to run llvm-exegesis in latency mode for
at least some opcodes.

Differential Revision: https://reviews.llvm.org/D48780

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336187 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for saturating ADD/SUB instructions.

The variants added are:
    signed Saturating ADD/SUB (immediate)  e.g. sqadd z0.h, z0.h, #42
  unsigned Saturating ADD/SUB (immediate)  e.g. uqadd z0.h, z0.h, #42
    signed Saturating ADD/SUB (vectors)    e.g. sqadd z0.h, z0.h, z1.h
  unsigned Saturating ADD/SUB (vectors)    e.g. uqadd z0.h, z0.h, z1.h
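
Per lane, saturating addition clamps to the range of the element type instead of wrapping. A scalar sketch for 16-bit lanes follows; the function names simply mirror the mnemonics and are otherwise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Signed saturating add: clamp the exact sum to [-32768, 32767].
static std::int16_t sqadd16(std::int16_t A, std::int16_t B) {
  std::int32_t Sum = static_cast<std::int32_t>(A) + B;
  if (Sum > INT16_MAX)
    return INT16_MAX;
  if (Sum < INT16_MIN)
    return INT16_MIN;
  return static_cast<std::int16_t>(Sum);
}

// Unsigned saturating add: clamp the exact sum to [0, 65535].
static std::uint16_t uqadd16(std::uint16_t A, std::uint16_t B) {
  std::uint32_t Sum = static_cast<std::uint32_t>(A) + B;
  return Sum > UINT16_MAX ? UINT16_MAX : static_cast<std::uint16_t>(Sum);
}
```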

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336186 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Lower arguments using stack

Lower more than 4 arguments using stack. This patch targets MIPS32.
It supports only functions with arguments of type i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D47934

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336185 91177308-0d34-0410-b5e6-96231b3b80d8

[PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when
unswitching loops.

Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:

1) We must invalidate any info SCEV has cached before unswitching as we
   may change (or destroy) the loop structure by the act of unswitching,
   and make it hard to recover everything we want to invalidate within
   SCEV.

2) We need to invalidate all of the loops whose CFGs are mutated by the
   unswitching. Notably, this isn't the *entire* loop nest, this is
   every loop contained by the outermost loop reached by an exit block
   relevant to the unswitch.

And we need to do this even when doing trivial unswitching.

I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.

However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.

Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.

Differential Revision: https://reviews.llvm.org/D47624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336183 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for vector element FP compare.

Contains the following variants:

- Compare with (elements from) other vector
  instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo.
  aliases: fcmle, fcmlt.

  e.g. fcmle   p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h

- Compare absolute values with (absolute values from) other vector.
  instructions: facge, facgt.
  aliases: facle, faclt.

  e.g. facle   p0.h, p0/z, z0.h, z1.h => facge   p0.h, p0/z, z1.h, z0.h

- Compare vector elements with #0.0
  instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne.

  e.g. fcmle   p0.h, p0/z, z0.h, #0.0
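
The aliases work because swapping the operands of a comparison reverses it, including for NaN inputs, where both forms are false. A scalar sketch with function names that mirror the mnemonics:

```cpp
#include <cassert>
#include <cmath>

// "Real" instructions, modelled on scalars.
static bool fcmge(double A, double B) { return A >= B; }
static bool fcmgt(double A, double B) { return A > B; }
static bool facge(double A, double B) { return std::fabs(A) >= std::fabs(B); }

// Aliases: the same comparison with the operands swapped.
static bool fcmle(double A, double B) { return fcmge(B, A); }
static bool fcmlt(double A, double B) { return fcmgt(B, A); }
static bool facle(double A, double B) { return facge(B, A); }
```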

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336182 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Disable the single callback optimization on Windows.

It appears that the function pointer we use there isn't reliably 4-byte
aligned. I have no idea why or how we could correct this, so for now we
just regress the Windows performance some.

Someone with access to Windows could try working on a fix. At the very
least we could use a double indirection rather than a table, but maybe
there is some way to fully restore this optimization. I don't want to
play too much with this when I don't have access to the platform and
this at least should restore the last bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336178 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Fix PR37395.

DbgLabelInst has no address as its operands.

Differential Revision: https://reviews.llvm.org/D46738

Patch by Hsiangkai Wang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336176 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] This sanity check in the test only works with certain versions
of libstdc++, not just certain versions of GCC. The original macros
broke when using Clang + libstdc++4.9 sadly.

Sadly, testing for versions of libstdc++ has been extremely problematic
in the past, so I'm just narrowing this down to Windows and when using
libc++ as that seems at least very unlikely to keep build bots broken.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336174 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

This patch changes the order of transforms in InstCombineCompares to avoid
performing range-based transforms, which produce complex bit arithmetic,
before simpler things (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336172 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] ExegesisX86Target::setRegToConstant() should depend on the subtarget features.

Summary: This fixes PR38008.

Reviewers: gchatelet, RKSimon

Subscribers: tschuett, craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D48820

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336171 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Try to work around a crash in MSVC.

Putting `sizeof(T) <= 16` into the parameter of a `std::conditional`
causes every version of MSVC I've tried to crash:

https://godbolt.org/g/eqVULL

Really frustrating, but an extra layer of indirection through an
instantiated type gives a working way to access this computed constant.
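
A minimal sketch of the kind of indirection described above. The names
`FitsInSixteenBytes`, `ParamT` and `Big` are hypothetical and are not taken
from the commit; the point is only that the `sizeof` comparison lives inside
an instantiated helper type instead of appearing directly in the
`std::conditional` argument list.

```cpp
#include <type_traits>

// Hypothetical helper: the size check is computed inside an instantiated
// type rather than written inline as a std::conditional argument.
template <typename T> struct FitsInSixteenBytes {
  static constexpr bool value = sizeof(T) <= 16;
};

// Select pass-by-value for small types and pass-by-reference otherwise.
template <typename T>
using ParamT =
    typename std::conditional<FitsInSixteenBytes<T>::value, T, T &>::type;

// Example type that is clearly larger than 16 bytes.
struct Big {
  char bytes[64];
};

static_assert(std::is_same<ParamT<int>, int>::value,
              "small types are passed by value");
static_assert(std::is_same<ParamT<Big>, Big &>::value,
              "large types are passed by reference");
```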

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336170 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add avx512vl command line to break-false-dep.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336169 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Switch another place to `llvm::is_trivially_move_constructible`.

I missed this the first time around, sorry.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336166 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "[Dominators] Add the DomTreeUpdater class"

Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].

This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.

—Prior to the patch—

   - Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
   - DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
   - Functions receiving DT/DDT currently need to branch a lot.
   - Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
   - People need to construct an additional DeferredDominance class to use functions only receiving DDT.

—After the patch—

Patch by Chijun Sima <simachijun@gmail.com>.

Reviewers: kuhar, brzycki, dmgreen, grosser, davide

Reviewed By: kuhar, brzycki

Author: NutshellySima

Subscribers: vsk, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336163 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r336159, r336157. Some bots failed on qualified std::max_align_t, and others on unqualified max_align_t.

I'll take another stab at this tomorrow. Any ideas for fixing this would be appreciated!

http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/23071/steps/build_Lld/logs/stdio
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/11185/steps/build-stage1-compiler/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336162 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Fix llvm::unique_function when building with GCC 4.9 by
introducing llvm::trivially_{copy,move}_constructible type traits.

This uses a completely portable implementation of these traits provided
by Richard Smith. You can see it on compiler explorer in all its glory:

https://godbolt.org/g/QEDZjW

I have transcribed it, clang-formatted it, added some comments, and made
the tests fit into a unittest file.

I have also switched llvm::unique_function over to use these new, much
more portable traits. =D
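
For illustration, one known way to detect trivial copy construction without
compiler intrinsics is to wrap the type in a union with defaulted special
members: the union's copy constructor is implicitly deleted unless the
member's copy constructor is trivial. The sketch below uses hypothetical
names, handles object types only (not references), and is not claimed to be
the exact code from the godbolt link above.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper union. Its defaulted copy constructor is defined as
// deleted whenever T's copy constructor (or destructor) is non-trivial.
template <typename T> union CopyTrivialityHelper {
  T t;
  CopyTrivialityHelper() = default;
  CopyTrivialityHelper(const CopyTrivialityHelper &) = default;
  ~CopyTrivialityHelper() = default;
};

// The trait asks whether the helper union is still copy constructible.
template <typename T>
struct IsTriviallyCopyConstructible
    : std::is_copy_constructible<CopyTrivialityHelper<T>> {};

// Example type with a user-provided, hence non-trivial, copy constructor.
struct Tracked {
  Tracked() = default;
  Tracked(const Tracked &) {}
};

static_assert(IsTriviallyCopyConstructible<int>::value,
              "int is trivially copy constructible");
static_assert(!IsTriviallyCopyConstructible<Tracked>::value,
              "a user-provided copy constructor is not trivial");
```

The same shape with a defaulted move constructor gives the move variant.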

Hopefully this will fix the build bot breakage from my prior commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336161 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Fix printing of aliases for distributed backend indexes

Summary:
When we import an alias (which will import a copy of the aliasee), but
aren't going to import the aliasee directly, the distributed backend
index will not contain the aliasee summary. Handle this in the summary
assembly printer by printing "null" as the aliasee.

Reviewers: davidxl, dexonsmith

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits

Differential Revision: https://reviews.llvm.org/D48699

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336160 91177308-0d34-0410-b5e6-96231b3b80d8

Some buildbots were choking on std::max_align_t, try using the global alias.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336159 91177308-0d34-0410-b5e6-96231b3b80d8

[demangler] Fix a MSVC alignment warning.

This should fix llvm.org/PR37944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336157 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add llvm::unique_function which is like std::function but
supporting move-only closures.

Most of the core optimizations for std::function are here plus
a potentially novel one that detects trivially movable and destroyable
functors and implements those with fewer indirections.

This is especially useful as we start trying to add concurrency
primitives as those often end up with move-only types (futures,
promises, etc) and wanting them to work through lambdas.

As further work, we could add better support for things like const-qualified
operator()s to support more algorithms, and r-value ref qualified operator()s
to model call-once. None of that is here though.

We can also provide our own llvm::function that has some of the optimizations
used in this class, but with copy semantics instead of move semantics.

This is motivated by increasing usage of things like executors and the task
queue where it is useful to embed move-only types like a std::promise within
a type erased function. That isn't possible without this version of a type
erased function.
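
A rough sketch of the idea, not the llvm::unique_function implementation:
`MoveOnlyFunction` and `makeAdder` are hypothetical names, and this version
uses a plain heap-allocated virtual wrapper with none of the optimizations
mentioned above. It only shows a type-erased callable that can own a
move-only closure (C++14 is assumed for the init-capture).

```cpp
#include <cassert>
#include <memory>
#include <utility>

template <typename Signature> class MoveOnlyFunction; // hypothetical name

template <typename R, typename... Args> class MoveOnlyFunction<R(Args...)> {
  // Type-erased interface to the stored callable.
  struct CallableBase {
    virtual ~CallableBase() = default;
    virtual R call(Args... args) = 0;
  };

  // Concrete holder; it only ever moves the callable, never copies it.
  template <typename F> struct CallableImpl final : CallableBase {
    F fn;
    explicit CallableImpl(F f) : fn(std::move(f)) {}
    R call(Args... args) override { return fn(std::forward<Args>(args)...); }
  };

  std::unique_ptr<CallableBase> impl;

public:
  MoveOnlyFunction() = default;
  template <typename F>
  MoveOnlyFunction(F f) : impl(new CallableImpl<F>(std::move(f))) {}

  // Movable but not copyable, so move-only closures are acceptable.
  MoveOnlyFunction(MoveOnlyFunction &&) = default;
  MoveOnlyFunction &operator=(MoveOnlyFunction &&) = default;

  explicit operator bool() const { return static_cast<bool>(impl); }

  R operator()(Args... args) {
    assert(impl && "calling an empty MoveOnlyFunction");
    return impl->call(std::forward<Args>(args)...);
  }
};

// The lambda captures a std::unique_ptr by move, which makes it move-only;
// std::function would reject it because it requires copyable callables.
inline MoveOnlyFunction<int(int)> makeAdder(int base) {
  std::unique_ptr<int> owned(new int(base));
  return [p = std::move(owned)](int x) { return *p + x; };
}
```

Calling `makeAdder(40)` and invoking the result with `2` adds the two values;
moving the wrapper transfers ownership of the closure and leaves the source
empty.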

Differential Revision: https://reviews.llvm.org/D48349

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336156 91177308-0d34-0410-b5e6-96231b3b80d8

Remove absolute path in test

My test change in r336148 accidentally included an absolute path, clean
that up to fix bot failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336151 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Verify modules when running LLLazyJIT in LLI, and deal with fallout.

The verifier identified several modules that were broken due to incorrect
linkage on declarations. To fix this, CompileOnDemandLayer2::extractFunction
has been updated to change decls to external linkage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336150 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Fix printing of module paths for distributed backend indexes

Summary:
In the individual index files emitted for distributed ThinLTO backends,
the module path ids are not contiguous. Assign slots to module paths in
order to handle this better and also to get contiguous numbering in the
summary assembly.

Reviewers: davidxl, dexonsmith

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, steven_wu

Differential Revision: https://reviews.llvm.org/D48698

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336148 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Support for atomic stores

Summary: Add support for atomic store instructions.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336145 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m.

Reviewers: efriedma, rogfer01, javed.absar

Reviewed By: efriedma, rogfer01

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D48846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336144 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Clear the content of map VariantDescriptors in InstrBuilder before we start analyzing a new CodeBlock. NFCI.

Different CodeBlocks don't overlap. The same MCInst cannot appear in more than
one code block because all blocks are instantiated before the simulation is run.

We should always clear the content of map VariantDescriptors before every
simulation, since VariantDescriptors cannot possibly store useful information
for the next blocks. It is also "safer" to clear its content because `MCInst*`
is used as the key type for map VariantDescriptors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336142 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428).

Summary:
Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change
SCEV is able to prove that the loop doesn't wrap-self (due to zext i16
to i64), disabling the entire loop versioning pass. Removed the zext and
just use i64.

Reviewers: sanjoy

Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D48409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336140 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix fast-isel optimization of branch conditions.

LLVM doesn't guarantee anything about the high bits of a register holding
an i1 value at the IR level, so don't translate LLVM IR i1 values directly
into WebAssembly conditional branch operands. WebAssembly's conditional
branches do demand all 32 bits be valid.
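
To illustrate the problem (this is not the fast-isel code, and the commit
message does not say how the fix is implemented), the sketch below models a
32-bit register whose low bit holds an i1 value while the high bits are
unspecified. `brIfTaken` and `maskToI1` are hypothetical helpers; masking is
just one way to make all 32 bits valid before branching.

```cpp
#include <cstdint>

// WebAssembly's br_if takes the branch when its i32 operand is non-zero,
// so all 32 bits take part in the test.
constexpr bool brIfTaken(std::uint32_t operand) { return operand != 0; }

// Hypothetical helper: keep only bit 0, the bit that carries the i1 value.
constexpr std::uint32_t maskToI1(std::uint32_t reg) { return reg & 1u; }

// An i1 "false" (low bit clear) sitting in a register with garbage above it.
constexpr std::uint32_t FalseWithGarbage = 0xABCD0000u;

// Used directly, the garbage bits make the branch fire incorrectly.
static_assert(brIfTaken(FalseWithGarbage), "unmasked value looks true");
// After masking, the branch follows the actual i1 value.
static_assert(!brIfTaken(maskToI1(FalseWithGarbage)), "masked value is false");
```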

Fixes PR38019.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336138 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add phony registers for high halves of regs with low halves

Add registers still missing after r328016 (D43353):
- for bits 15-8 of SI, DI, BP, SP (*H), and R8-R15 (*BH),
- for bits 31-16 of R8-R15 (*WH).

Thanks to Craig Topper for pointing it out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336134 91177308-0d34-0410-b5e6-96231b3b80d8

Replace "Replacable" with "Replaceable". [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336133 91177308-0d34-0410-b5e6-96231b3b80d8

Replace unused output filenames with /dev/null in tests

Similar to rLLD336129

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336131 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Recognize min/max pattern using instructions producing same values.

Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.

         %1 = extractelement <2 x i32> %a, i32 0
         %2 = extractelement <2 x i32> %a, i32 1
         %cond = icmp sgt i32 %1, %2
         %3 = extractelement <2 x i32> %a, i32 0
         %4 = extractelement <2 x i32> %a, i32 1
         %select = select i1 %cond, i32 %3, i32 %4

Author: FarhanaAleen

Reviewed By: ABataev, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D47608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reverse canonicalization of add --> or to allow more shuffle folding

This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving
to where other passes could use that functionality.
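
The arithmetic behind the condition can be checked in isolation. The helpers
below are hypothetical and operate on plain integers rather than on LLVM
values: when two operands share no set bits, addition never produces a carry,
so it yields the same result as a bitwise or.

```cpp
#include <cstdint>

// Hypothetical helper: true when no bit position is set in both operands.
constexpr bool noCommonBits(std::uint32_t x, std::uint32_t c) {
  return (x & c) == 0;
}

// True when 'or' and 'add' produce the same 32-bit result.
constexpr bool orEqualsAdd(std::uint32_t x, std::uint32_t c) {
  return (x | c) == (x + c);
}

// Disjoint bits: the two operations agree.
static_assert(noCommonBits(0xF0u, 0x0Fu) && orEqualsAdd(0xF0u, 0x0Fu),
              "disjoint operands");
// Overlapping bits: the add carries, so the results differ.
static_assert(!noCommonBits(3u, 1u) && !orEqualsAdd(3u, 1u),
              "overlapping operands");
```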

Differential Revision: https://reviews.llvm.org/D48662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Error on a .zerofill directive in a non-virtual section

On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.

In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero or .space is required.

This patch replaces the assert with an error.

Differential Revision: https://reviews.llvm.org/D48517

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336127 91177308-0d34-0410-b5e6-96231b3b80d8

nm: Add -no-weak flag for hiding weak symbols

Summary:
This adds a new -no-weak flag to nm to hide weak symbols in its output.
This also adds a -W alias for this which is analogous to -U.

Patch by Keith Smiley

Reviewers: kastiglione, enderby, compnerd

Reviewed By: kastiglione

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336126 91177308-0d34-0410-b5e6-96231b3b80d8