git.osdn.net Git - android-x86/external-llvm.git/log

[IR] Fix inconsistent declaration parameter name

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336459 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove patterns for MOVLPD/MOVLPS nodes with integer types.

Lowering shouldn't generate these. If we need to use them for integer types, it should use a bitcast.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336458 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add more FMA3 memory folding patterns. Remove patterns that are no longer needed.

We've removed the legacy FMA3 intrinsics and are now using llvm.fma and extractelement/insertelement. So we don't need patterns for the nodes that could only be created by the old intrinscis. Those ISD opcodes still exist because we haven't dropped the AVX512 intrinsics yet, but those should go to EVEX instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336457 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Add HardwareUnit and Context classes.

This patch moves the construction of the default backend from llvm-mca.cpp and
into mca::Context. The Context class is responsible for holding ownership of
the simulated hardware components. These components are subclasses of
HardwareUnit. Right now the HardwareUnit is pretty bare-bones, but eventually
we might want to add some common functionality across all hardware components,
such as isReady() or something similar.

I have a feeling this patch will probably need some updates, but it's a start.
One thing I am not particularly fond of is the rather large interface for
createDefaultPipeline. That convenience routine takes a rather large set of
inputs from the llvm-mca driver, where many of those inputs are generated via
command line options.

One item I think we might want to change is the separating of ownership of
hardware components (owned by the context) and the pipeline (which owns
Stages). In short, a Pipeline owns Stages, a Context (currently) owns hardware.
The Pipeline's Stages make use of the components, and thus there is a lifetime
dependency generated. The components must outlive the pipeline. We could solve
this by having the Context also own the Pipeline, and not return a
unique_ptr<Pipeline>. Now that I think about it, I like that idea more.

Differential Revision: https://reviews.llvm.org/D48691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336456 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add support for static libraries

This diff adds support for handling static libraries
to llvm-objcopy and llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D48413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336455 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add more tests for potentially poisonous shifts; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336454 91177308-0d34-0410-b5e6-96231b3b80d8

Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336453 91177308-0d34-0410-b5e6-96231b3b80d8

[Debugify] Allow unsigned values narrower than their variables

Suppress the diagnostic for mis-sized dbg.values when a value operand is
narrower than the unsigned variable it describes. Assume that a debugger
would implicitly zero-extend these values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336452 91177308-0d34-0410-b5e6-96231b3b80d8

[Local] replaceAllDbgUsesWith: Update debug values before RAUW

The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.

This improves upon the existing insertReplacementDbgValues API by:

- Updating debug intrinsics in-place, while preventing use-before-def of
  the replacement value.
- Falling back to salvageDebugInfo when a replacement can't be made.
- Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into
  common utility code.

Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:

- The no-op conversion. Applies when the values have the same width, or
  have bit-for-bit compatible pointer representations.
- Truncation. Applies when the new value is wider than the old one.
- Zero/sign extension. Applies when the new value is narrower than the
  old one.

Testing:

- check-llvm, check-clang, a stage2 `-g -O3` build of clang,
  regression/unit testing.
- This resolves a number of mis-sized dbg.value diagnostics from
  Debugify.

Differential Revision: https://reviews.llvm.org/D48676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336451 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add more tests with poison and undef; NFC

As discussed in D48987 and D48893, there are many different
ways to go wrong depending on the binop (and as shown here
we already do go wrong in some cases).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336450 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix UBSan error caused by r335942

Summary: Fixes PR38071.

Reviewers: arsenm, dstenb

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336448 91177308-0d34-0410-b5e6-96231b3b80d8

[Constants] extend getBinOpIdentity(); NFC

The enhanced version will be used in D48893 and related patches
and an almost identical (fadd is different) version is proposed
in D28907, so adding this as a preliminary step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336444 91177308-0d34-0410-b5e6-96231b3b80d8

[Constant] add undef element query for vector constants; NFC

This is likely to be used in D48987 and similar patches,
so adding it as an NFC preliminary step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336442 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ParallelDSP: added statistics, NFC.

Added statistics for the number of SMLAD instructions created, and
als renamed the pass name to -arm-parallel-dsp.

Differential Revision: https://reviews.llvm.org/D48971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336441 91177308-0d34-0410-b5e6-96231b3b80d8

Commit rL336426 cause buildbot failures

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/

This removes the comments of the function label causing this error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336440 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopSink] Make the enforcement of determinism deterministic.

LoopBlockNumber is a DenseMap<BasicBlock*, int>, comparing the result of
find() will compare a pair<BasicBlock*, int>. That's of course depending
on pointer ordering which varies from run to run. Reverse iteration
doesn't find this because we're copying to a vector first.

This bug has been there since 2016 but only recently showed up on clang
selfhost with FDO and ThinLTO, which is also why I didn't manage to get
a reasonable test case for this. Add an assert that would've caught
this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336439 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] A write latency cannot be a negative value. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336437 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Armv8.4-A: TLB support

This adds:
- outer shareable TLB Maintenance instructions, and
- TLB range maintenance instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336434 91177308-0d34-0410-b5e6-96231b3b80d8

[dsymutil] Emit label at the begin of a CU

When emitting a CU, store the MCSymbol pointing to the beginning of the
CU. We'll need this information later when emitting the .debug_names
section (DWARF5 accelerator table).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336433 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit: [AArch64] Armv8.4-A: Flag manipulation instructions

Now with the asm operand definition included.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336432 91177308-0d34-0410-b5e6-96231b3b80d8

Added missing semicolon

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336428 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] https://reviews.llvm.org/D48278

D48278

Allow to reduce redundant shift masks.
For example:
x1 = x & 0xAB00
x2 = (x >> 8) & 0xAB

can be reduced to:
x1 = x & 0xAB00
x2 = x1 >> 8
It only allows folding when the masks and shift values are constants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336426 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [AArch64] Armv8.4-A: Flag manipulation instructions

It's causing build errors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336422 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Armv8.4-A: Flag manipulation instructions

These instructions are added to AArch64 only.

Differential Revision: https://reviews.llvm.org/D48926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336421 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] improve the instruction issue logic implemented by the Scheduler.

This patch modifies the Scheduler heuristic used to select the next instruction
to issue to the pipelines.

The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca
wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for
that test should have been ~2.00.
It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be
predicted by a Scheduler that only prioritizes instructions based on their
"age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s,
for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00.

Instructions in the ReadyQueue are now ranked based on two factors:
- The "age" of an instruction.
- The number of unique users of writes associated with an instruction.

The new logic still prioritizes older instructions over younger instructions to
minimize the pressure on the reorder buffer. However, the number of users of an
instruction now also affects the overall rank. This potentially increases the
ability of the Scheduler to extract instruction level parallelism. This patch
fixes the problem with the wrong IPC reported for test add-sequence.s and test
dependent-pmuld-paddd.s.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336420 91177308-0d34-0410-b5e6-96231b3b80d8

CallGraphSCCPass: iterate over all functions.

Previously we only iterated over functions reachable from the set of
external functions in the module. But since some of the passes under
this (notably the always-inliner and coroutine lowerer) are required for
correctness, they need to run over everything.

This just adds an extra layer of iteration over the CallGraph to keep
track of which functions we've already visited and get the next batch of
SCCs.

Should fix PR38029.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336419 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][ARM] Armv8.4-A: Trace synchronization barrier instruction

This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction.

Differential Revision: https://reviews.llvm.org/D48918

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336418 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead.

The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector.

There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336416 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Make support types more easily printable.

Summary:
Error's new operator<< is the first way to print an error without consuming it.

formatv() can now print objects with an operator<< that works with raw_ostream.

Reviewers: bkramer

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336412 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply: "objdump: Support newer ObjC image info flags"

Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.

`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.

`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.

Reviewers: enderby, compnerd

Reviewed By: compnerd

Subscribers: keith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336411 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336410 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or unmasked 512-bit intrinsics with rounding mode.

This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking.

This matches how clang uses the intrinsics these days.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336409 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Cleanup some of the avx512 masked fma tests to prepare for removing and autoupgrading.

-Split cases that call 2 intrinsics in the same case.
-Remove testing mask3 and maskz intrinsics with an all ones mask. These won't be interesting after the upgrade.
-Restore test cases for some intrinsics that are marked for deletion, but haven't been deleted yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336408 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbutil] Dump more info about globals.

We add an option to dump the entire global / public symbol record
stream. Previously we would dump globals or publics, but not both.
And when we did dump them, we would always dump them in the order
they were referenced by the corresponding hash streams, not in
the order they were serialized in. This patch adds a lower level
mode that just dumps the whole stream in serialization order.

Additionally, when dumping global-extras, we now dump the hash
bitmap as well as the record offset instead of dumping all zeros
for the offsets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336407 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add __float128 library call for frem

Power 9 does not have a hardware instruction for frem but we can call fmodf128.

Differential Revision: https://reviews.llvm.org/D48552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336406 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Sort globals symbols by name in GSI hash buckets.

It seems like the debugger first computes a symbol's bucket,
and then does a binary search of entries in the bucket using the
symbol's name in order to find it. If the bucket entries are not
in sorted order, this obviously won't work. After this patch a
couple of simple test cases show that we generate an exactly
identical GSI hash stream, which is very nice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336405 91177308-0d34-0410-b5e6-96231b3b80d8

[x86]Add a test case to show missed vfnmadd generation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336404 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "objdump: Support newer ObjC image info flags"

This reverts commit 8c4cc472e7a67bd3b2b20cc4cf32d31af29bc7e9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336402 91177308-0d34-0410-b5e6-96231b3b80d8

[OpenEmbedded] Add OpenEmbedded vendor

Summary: The lib paths are not correctly picked up for OpenEmbedded sysroots
(like arm-oe-linux-gnueabi). I fix this in a follow-up clang patch. But in
order to add the correct libs I need to detect if the vendor is oe. For this
reason, it is first necessary to teach llvm to detect oe vendor, which is what
this patch does.

Reviewers: chandlerc, compnerd, rengolin, javed.absar

Reviewed By: compnerd

Subscribers: kristof.beyls, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48861

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336401 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Disassembler] Fix LOCK prefix disassembler support

Summary:
If LOCK prefix is not the first prefix in an instruction, LLVM
disassembler silently drops the prefix.

The fix is to select a proper instruction with a builtin LOCK prefix if
one exists.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49001

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336400 91177308-0d34-0410-b5e6-96231b3b80d8

objdump: Support newer ObjC image info flags

Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.

`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.

`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.

Reviewers: enderby, compnerd

Reviewed By: compnerd

Subscribers: keith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336399 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading.""

There were a couple of issues reported (PR38047, PR37929) - I'll reland
the patch when I figure out and fix the rootcause.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336393 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add missing _S opcodes of atomic stores to InstPrinter

Summary: This was missing in D48839 (rL336145).

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336390 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Add BitReader/BitWriter to target_link_libraries

Summary:
CompileOnDemandLayer.cpp uses function in these libraries, and builds
with `-DSHARED_LIB=ON` fail without this.

Reviewers: lhames

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336389 91177308-0d34-0410-b5e6-96231b3b80d8

This is a recommit of r336322, previously reverted in r336324 due to
a deficiency in TableGen that has been addressed in r336334.

[AArch64][SVE] Asm: Support for predicated FP rounding instructions.

This patch also adds instructions for predicated FP square-root and
reciprocal exponent.

The added instructions are:
- FRINTI  Round to integral value (current FPCR rounding mode)
- FRINTX  Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA  Round to integral value (to nearest, with ties away from zero)
- FRINTN  Round to integral value (to nearest, with ties to even)
- FRINTZ  Round to integral value (toward zero)
- FRINTM  Round to integral value (toward minus Infinity)
- FRINTP  Round to integral value (toward plus Infinity)
- FSQRT   Floating-point square root
- FRECPX  Floating-point reciprocal exponent

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336387 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] In CompileOnDemandLayer2, clone modules on to different contexts by
writing them to a buffer and re-loading them.

Also introduces a multithreaded variant of SimpleCompiler
(MultiThreadedSimpleCompiler) for compiling IR concurrently on multiple
threads.

These changes are required to JIT IR on multiple threads correctly.

No test case yet. I will be looking at how to modify LLI / LLJIT to test
multithreaded JIT support soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336385 91177308-0d34-0410-b5e6-96231b3b80d8

Testing commit permision

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336384 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to 'llvm.fma'. Add upgrade tests for all.

Still need to remove the AVX512 masked versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336383 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add SHUF128 to target shuffle decoding.

Differential Revision: https://reviews.llvm.org/D48954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336376 91177308-0d34-0410-b5e6-96231b3b80d8

Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN

Better NaN handling for AMDGCN fmed3.

All operands are checked for NaN now. The checks
were moved before the canonicalization to provide
a better mapping from fclamp. Changed the behaviour
of fmed3(x,y,NaN) to return max(x,y) instead of
min(x,y) in light of this. Updated tests as a result
and added some new cases to cover the fix.

Patch by Alan Baker

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336375 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't use spir_kernel in a test

Also use verify-machineinstrs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336374 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Implement custom kernel arg lowering

Avoid using allocateKernArg / AssignFn. We do not want any
of the type splitting properties of normal calling convention
lowering.

For now at least this exists alongside the IR argument lowering
pass. This is necessary to handle struct padding correctly while
some arguments are still skipped by the IR argument lowering
pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336373 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add UDIV/UREM by pow2 costs

Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336371 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Removed archive-headers-disas test

This test is failing because of the disas part.
For the moment, I will juste remove it. I will add it again tomorrow
with a proper fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336370 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Fix RegisterFile debug prints. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336367 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix timezone dependant tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336363 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add lib calls for float128 operations with no equivalent PPC instructions

Map the following instructions to the proper float128 lib calls:
pow[i], exp[2], log[2|10], sin, cos, fmin, fmax

Differential Revision: https://reviews.llvm.org/D48544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336361 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add srem x, (1 << c) combine tests

Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336360 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add --archive-headers (-a) option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336357 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Add uop computation for more X87 instruction classes.

Summary:
This allows measuring comparisons (UCOM_FpIr32,UCOM_Fpr32,...),
conditional moves (CMOVBE_Fp32,...)

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336352 91177308-0d34-0410-b5e6-96231b3b80d8

Fix comment typo. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336351 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64, PowerPC, x86] add tests for signbit bit hacks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336348 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI.

This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism.

Differential Revision: https://reviews.llvm.org/D48945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336344 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC]clang-format

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336343 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Add VALU to V_INTERP Instructions

Wait states are not properly being inserted after buffer_store for v_interp instructions.

Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can
check and insert the appropriate wait states when needed.

Differential Revision: https://reviews.llvm.org/D48772

Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336339 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Switch to indirect even the trivial case through an object pointer
that has required alignment. This avoids issues that keep coming up with
function pointers being less aligned.

I'm pretty annoyed that we can't take advantage of function alignment
even on platforms where they *are* aligned, but build modes and other
things make taking advantage of it somewhere between hard and
impossible. The best case scenario would still embed various build modes
into the ABI causing really hard to debug issues if you compiled one
object file differently from another. =/

This should at least bring the bots back that were having trouble with
this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336337 91177308-0d34-0410-b5e6-96231b3b80d8

Partially revert r336268 in address-offsets.ll

Summary: There the typos are intentional, explicitly introduced to disable these cases in r280285.

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D48962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336336 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Increase the number of supported decoder fix-ups.

The vast number of added instructions for SVE causes TableGen to fail with an assertion:

Assertion `Delta < 65536U && "disassembler decoding table too large!"'

This patch increases the number of supported decoder fix-ups.

Reviewers: dmgreen, stoklund, petpav01

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D48937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336334 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add extra v16i16 shl x,c -> pmullw test

We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336333 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix -Wimplicit-fallthrough warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336331 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix atomic operations at O0, v3

Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the stores can cause the
atomic sequence to fail.

This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.

This version addresses issues with the initial implementation and covers
all atomic operations.

This resolves PR/32020.

Thanks to James Cowgill for reporting the issue!

Patch By: Simon Dardis

Differential Revision: https://reviews.llvm.org/D31287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336328 91177308-0d34-0410-b5e6-96231b3b80d8

[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses

Resolves:
Unsupported ARM Neon intrinsics in Target-specific DAG combine
function for VLDDUP
https://bugs.llvm.org/show_bug.cgi?id=38031

Related diff: D48439

Differential Revision: https://reviews.llvm.org/D48920

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336325 91177308-0d34-0410-b5e6-96231b3b80d8

Reverting r336322 for now, as it causes an assert failure
in TableGen, for which there is already a patch in Phabricator
(D48937) that needs to be committed first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336324 91177308-0d34-0410-b5e6-96231b3b80d8

Partial revert of "NFC - Various typo fixes in tests"

This partially reverts r336268 since it causes buildbot failures.

Added FIXME at the places where the CHECKs are misspelled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336323 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for predicated FP rounding instructions.

This patch also adds instructions for predicated FP square-root and
reciprocal exponent.

The added instructions are:
- FRINTI  Round to integral value (current FPCR rounding mode)
- FRINTX  Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA  Round to integral value (to nearest, with ties away from zero)
- FRINTN  Round to integral value (to nearest, with ties to even)
- FRINTZ  Round to integral value (toward zero)
- FRINTM  Round to integral value (toward minus Infinity)
- FRINTP  Round to integral value (toward plus Infinity)
- FSQRT   Floating-point square root
- FRECPX  Floating-point reciprocal exponent

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336322 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ParallelDSP: only support i16 loads for now

We were miscompiling i8 loads, so reject them as unsupported narrow operations
for now.

Differential Revision: https://reviews.llvm.org/D48944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336319 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABD

This patch implements the following varieties:

- Unpredicated signed max,   e.g. smax z0.h, z1.h, #-128
- Unpredicated signed min,   e.g. smin z0.h, z1.h, #-128

- Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255
- Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255

- Predicated signed max,     e.g. smax z0.h, p0/m, z0.h, z1.h
- Predicated signed min,     e.g. smin z0.h, p0/m, z0.h, z1.h
- Predicated signed abd,     e.g. sabd z0.h, p0/m, z0.h, z1.h

- Predicated unsigned max,   e.g. umax z0.h, p0/m, z0.h, z1.h
- Predicated unsigned min,   e.g. umin z0.h, p0/m, z0.h, z1.h
- Predicated unsigned abd,   e.g. uabd z0.h, p0/m, z0.h, z1.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336317 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Optimize codgen for conversions of int to float128

Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336316 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart independent FMA and extractelement/insertelement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336315 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9][NFC] add back-end tests for passing homogeneous fp128 aggregates by value

Tests to verify that we are passing fp128 via VSX registers as per ABI.
These are related to clang commit rL336308.

Differential Revision: https://reviews.llvm.org/D48310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336314 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add tests for passing float128 in VSX reg for non-homogenous aggregates

Add missing testcase for rL336310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336313 91177308-0d34-0410-b5e6-96231b3b80d8

[demangler] Avoid alignment warning

The alignment specified by a constant for the field
`BumpPointerAllocator::InitialBuffer` exceeded the alignment
guaranteed by `malloc` and `new` on Windows. This change set
the alignment value to that of `long double`, which is defined
by the used platform.

It fixes https://bugs.llvm.org/show_bug.cgi?id=37944.

Differential Revision: https://reviews.llvm.org/D48889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336311 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg

Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.

This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336310 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9]Legalize and emit code for quad-precision convert from single-precision

Legalize and emit code for quad-precision floating point operation conversion of
single-precision value to quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336307 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Implement float128 parameter passing and return values

This patch enable parameter passing and return by value for float128 types.
Passing aggregate/union which contain float128 members will be submitted in
subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336306 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some isel patterns for X86ISD::SELECTS that specifically looked for the v1i1 mask to have come from a scalar_to_vector from GR8.

We have patterns for SELECTS that top at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336305 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg input.

Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)).

This patch fixes that so we can also combine things like (fmsub (fneg)).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336304 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang.

There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes.

More removals to come, but I wanted to stop and fix the regression that showed up in this first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336303 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9]Legalize and emit code for round & convert quad-precision values

Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336299 91177308-0d34-0410-b5e6-96231b3b80d8

Silence an MSVC C4189 warning about a local variable being initialized but not used; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336298 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Warn when crc, ginv, virt flags are used with too old revision

CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.

Differential Revision: https://reviews.llvm.org/D48843

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336296 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler

  We want to run the Machine Scheduler instead of the List Scheduler after RA.
  Checked with a performance run on a Power 9 machine with SPEC 2006 and while
  some benchmarks improved and others degraded the geomean was slightly improved
  with the Machine Scheduler.

  Differential Revision: https://reviews.llvm.org/D45265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336295 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Add DomTreeUpdater constructor from DT* and PDT*

Summary:
Previously, if a function accepts an optional DT pointer,
```
void Foo (.., DominatorTree * DT = nullptr) {
  ...
  if(DT)
    DomTreeUpdater(*DT, ...).insertEdge(A, B);
  if(DT){
    DomTreeUpdater DTU(*DT, ...);
    ... // Construct the update vector and applyUpdates
  }
  ...
  if(DT){
    DomTreeUpdater DTU(*DT, ...);
    ... // Construct the update vector and applyUpdates
  }
}
```
After this patch, it can be simplified as
```
void Foo (.., DominatorTree * DT = nullptr) {
  DomTreeUpdater DTU(DT, ...);
  ...
  DTU.insertEdge(A, B);
  if(DT){
    ... // Construct the update vector and applyUpdates
  }
  ...
  if(DT){
    ... // Construct the update vector and applyUpdates
  }
}
```
Patch by Chijun Sima <simachijun@gmail.com>.

Reviewers: kuhar, brzycki, dmgreen

Reviewed By: kuhar

Author: NutshellySima

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336294 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow narrowing of min/max/abs

We have bailout hacks based on min/max in various places in instcombine
that shouldn't be necessary. The affected test was added for:
D48930
...which is a consequence of the improvement in:
D48584 (https://reviews.llvm.org/rL336172)

I'm assuming the visitTrunc bailout in this patch was added specifically
to avoid a change from SimplifyDemandedBits, so I'm just moving that
below the EvaluateInDifferentType optimization. A narrow min/max is still
a min/max.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336293 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][BtVer2][MCA][NFC] Add CMPEQ dependency-breaking one-idioms tests

Summary: As per `Agner's Microarchitecture doc
(21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions)`,
these, like zero-idioms, are dependency-breaking,
although they produce ones and still consume resources.

FIXME: as discussed in D48877, llvm-mca handling is broken for these.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: gbedwell, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D48876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336292 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some irregular whitespace/indentation. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336291 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add value names to test; NFC

That makes it easier to mix and match lines into other tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336289 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] [Assembler] Support negative immediates: cover few missing cases

Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.

This change adds negative immediates support and respective tests
for the following:

ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W

Differential Revision: https://reviews.llvm.org/D48649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336286 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Fix typo in getOutliningCandidateInfo function name

getOutlininingCandidateInfo -> getOutliningCandidateInfo

Differential Revision: https://reviews.llvm.org/D48867

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336285 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add --file-headers (-f) option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336284 91177308-0d34-0410-b5e6-96231b3b80d8