git.osdn.net Git - android-x86/external-llvm.git/log

[SelectionDAG] Split float and integer isKnownNeverZero tests

Splits off isKnownNeverZeroFloat to handle +/- 0 float cases.

This will make it easier to be more aggressive with the integer isKnownNeverZero tests (similar to ValueTracking), use computeKnownBits etc.

Differential Revision: https://reviews.llvm.org/D48969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336492 91177308-0d34-0410-b5e6-96231b3b80d8

Use const APInt& to avoid extra copy. NFCI.

As discussed on D48825.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336491 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add EXTRACT_SUBVECTOR to SimplifyDemandedVectorElts

As discussed on PR37989, this patch adds EXTRACT_SUBVECTOR handling to TargetLowering::SimplifyDemandedVectorElts and calls it from DAGCombiner::visitEXTRACT_SUBVECTOR.

Differential Revision: https://reviews.llvm.org/D48825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336490 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add SREM/UREM general and constant costs (PR38056)

We penalize general SDIV/UDIV costs but don't do the same for SREM/UREM.

This patch makes general vector SREM/UREM 20x as costly as scalar, the same approach we take for SDIV/UDIV. The patch also extends the existing SDIV/UDIV constant costs to SREM/UREM; at the moment this means the additional cost of a MUL+SUB (see D48975).
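
The extra MUL+SUB follows from the usual identity x % c == x - (x / c) * c. A minimal sketch of that arithmetic, with hypothetical helper names; it illustrates the identity only and is not the actual X86 lowering:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: signed remainder built from a division result plus
// one multiply and one subtract.
int32_t sremViaMulSub(int32_t X, int32_t C) {
  assert(C != 0 && "division by zero");
  assert(!(X == INT32_MIN && C == -1) && "quotient would overflow");
  int32_t Q = X / C; // SDIV (for a constant C this is itself expanded)
  return X - Q * C;  // MUL + SUB
}

// Same shape for the unsigned case.
uint32_t uremViaMulSub(uint32_t X, uint32_t C) {
  assert(C != 0 && "division by zero");
  uint32_t Q = X / C; // UDIV
  return X - Q * C;   // MUL + SUB
}
```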

Differential Revision: https://reviews.llvm.org/D48980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336486 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336485 91177308-0d34-0410-b5e6-96231b3b80d8

NFC - Typo fixes in X86 flags-copy-lowering.mir test

Differential Revision: https://reviews.llvm.org/D48934

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336484 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Add missing liveness tracking info in MIR test.

This should bring the bots back to green state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336482 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Assert that Liveness tracking is accurate (NFC)

The checking is done deeper inside MachineBasicBlock, but this will
hopefully help to find issues when porting the machine outliner to a
target where Liveness tracking is broken (like ARM).

Differential Revision: https://reviews.llvm.org/D49023

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336481 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Clear errno before calling the function in RetryAfterSignal.

For certain APIs, the return value of the function does not distinguish
between failure (which populates errno) and other non-error conditions
(which do not set errno).

For example, `fgets` returns `NULL` both when an error has occurred and
upon EOF. If `errno` is already `EINTR` for whatever reason, then
```
RetryAfterSignal(nullptr, fgets, ...);
```
on a stream that has reached EOF would loop forever.

Fix this by setting `errno` to `0` before each attempt in
`RetryAfterSignal`.
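
A simplified sketch of the retry loop with the reset in place. The names `retryAfterSignalSketch`, `fakeReadAtEof` and `CallCount` are hypothetical and only serve the illustration; this is not the code in Support:

```cpp
#include <cerrno>

// Sketch of a retry-on-EINTR wrapper. errno is cleared before every attempt,
// so a stale EINTR from some earlier call cannot keep the loop alive.
template <typename FailT, typename Fun, typename... Args>
auto retryAfterSignalSketch(const FailT &Fail, const Fun &F,
                            const Args &... As) -> decltype(F(As...)) {
  decltype(F(As...)) Res;
  do {
    errno = 0;
    Res = F(As...);
  } while (Res == Fail && errno == EINTR);
  return Res;
}

// Hypothetical stand-in for fgets at EOF: it returns the failure value but
// does not touch errno.
int CallCount = 0;
const char *fakeReadAtEof() {
  ++CallCount;
  return nullptr;
}
```

Without the `errno = 0;` line, calling the wrapper on `fakeReadAtEof` while `errno` already holds `EINTR` would never leave the loop.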

Patch by Ricky Zhou!

Differential Revision: https://reviews.llvm.org/D48755

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336479 91177308-0d34-0410-b5e6-96231b3b80d8

[PM/LoopUnswitch] Fix PR37889, producing the correct loop nest structure
after trivial unswitching.

This PR illustrates that a fundamental analysis update was not performed
with the new loop unswitch. This update is also somewhat fundamental to
the core idea of the new loop unswitch -- we actually *update* the CFG
based on the unswitching. In order to do that, we need to update the
loop nest in addition to the domtree.

For some reason, when writing trivial unswitching, I thought that the
loop nest structure cannot be changed by the transformation. But the PR
helps illustrate that it clearly can. I've expanded this to a number of
different test cases that try to cover the different cases of this. When
we unswitch, we move an exit edge of a loop out of the loop. If this
exit edge changes which loop reached by an exit is the innermost loop,
it changes the parent of the loop. Essentially, this transformation may
hoist the inner loop up the nest. I've added the simple logic to handle
this reliably in the trivial unswitching case. This just requires
updating LoopInfo and rebuilding LCSSA on the impacted loops. In the
trivial case, we don't even need to handle dedicated exits because we're
only hoisting the one loop and we just split its preheader.

I've also ported all of these tests to non-trivial unswitching and
verified that the logic already there correctly handles the loop nest
updates necessary.

Differential Revision: https://reviews.llvm.org/D48851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336477 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Merge INTR_TYPE_3OP_RM with INTR_TYPE_3OP. Remove unused INTR_TYPE_1OP_RM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336476 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)."

This reverts commit r336140. Our tests show that an LSR assert fails with it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336473 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] memicmp only exists on Windows, use StringRef::compare_lower instead

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336469 91177308-0d34-0410-b5e6-96231b3b80d8

Fix DIExpression::ExprOperand::appendToVector

appendToVector used the wrong overload of SmallVector::append, resulting
in it appending the same element to a vector `getSize()` times. This did
not cause a problem when initially committed because appendToVector was
only used to append 1-element operands.

This changes appendToVector to use the correct overload of append().
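
The same kind of overload mix-up can be reproduced with std::vector::insert, used here as a stand-in for SmallVector::append. The helper names are hypothetical; this is a sketch of the bug class, not the DIExpression code:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Buggy shape: the (count, value) overload repeats the first element N times.
void appendRepeated(std::vector<uint64_t> &Out, const uint64_t *Op, size_t N) {
  Out.insert(Out.end(), N, *Op);
}

// Intended shape: the iterator-range overload copies N distinct elements.
void appendRange(std::vector<uint64_t> &Out, const uint64_t *Op, size_t N) {
  Out.insert(Out.end(), Op, Op + N);
}
```

With N == 1 both helpers produce the same vector, which matches the note above that the problem stayed hidden while only 1-element operands were appended.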

Testing: ./unittests/IR/IRTests --gtest_filter='*DIExpressionTest*'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336466 91177308-0d34-0410-b5e6-96231b3b80d8

Remove a redundant null-check in DIExpression::prepend, NFC

Code outside of an `if (Expr)` block dereferenced `Expr`, so the null
check was redundant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336465 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] One more fix for hashing GSI records.

The reference implementation uses a case-insensitive string
comparison for strings of equal length.  This will cause the
string "tEo" to compare less than "VUo".  However we were using
a case sensitive comparison, which would generate the opposite
outcome.  Switch to a case insensitive comparison.  Also, when
one of the strings contains non-ascii characters, fallback to
a straight memcmp.
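
A rough sketch of the comparison described above, restricted to the equal-length case. The function name and helpers are hypothetical and this is not the code in the PDB library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

static bool isAscii(const std::string &S) {
  for (unsigned char C : S)
    if (C >= 0x80)
      return false;
  return true;
}

static char toLowerAscii(char C) {
  return (C >= 'A' && C <= 'Z') ? static_cast<char>(C - 'A' + 'a') : C;
}

// Returns <0, 0 or >0 like memcmp. Sketch only: equal-length names.
int gsiNameCompareSketch(const std::string &L, const std::string &R) {
  assert(L.size() == R.size() && "sketch only covers equal-length names");
  // Any non-ASCII byte: fall back to a straight memcmp.
  if (!isAscii(L) || !isAscii(R))
    return std::memcmp(L.data(), R.data(), L.size());
  // Otherwise compare case-insensitively, byte by byte.
  for (size_t I = 0; I < L.size(); ++I) {
    char A = toLowerAscii(L[I]), B = toLowerAscii(R[I]);
    if (A != B)
      return A < B ? -1 : 1;
  }
  return 0;
}
```

Here "tEo" orders before "VUo" because 't' sorts before 'v', whereas a case-sensitive comparison orders it after, since 't' sorts after 'V'.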

The only way to really test this is with a DIA test.  Before this
patch, the test will fail (but succeed if link.exe is used instead
of lld-link).  After the patch, it succeeds even with lld-link.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336464 91177308-0d34-0410-b5e6-96231b3b80d8

Use Type::isIntOrPtrTy where possible, NFC

It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() ||
T.isPointerTy()`.

I used Python's re.sub with this regex to update users:

r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336462 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Fix inconsistent declaration parameter name

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336459 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove patterns for MOVLPD/MOVLPS nodes with integer types.

Lowering shouldn't generate these. If we need to use them for integer types, it should use a bitcast.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336458 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add more FMA3 memory folding patterns. Remove patterns that are no longer needed.

We've removed the legacy FMA3 intrinsics and are now using llvm.fma and extractelement/insertelement. So we don't need patterns for the nodes that could only be created by the old intrinsics. Those ISD opcodes still exist because we haven't dropped the AVX512 intrinsics yet, but those should go to EVEX instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336457 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Add HardwareUnit and Context classes.

This patch moves the construction of the default backend from llvm-mca.cpp and
into mca::Context. The Context class is responsible for holding ownership of
the simulated hardware components. These components are subclasses of
HardwareUnit. Right now the HardwareUnit is pretty bare-bones, but eventually
we might want to add some common functionality across all hardware components,
such as isReady() or something similar.

I have a feeling this patch will probably need some updates, but it's a start.
One thing I am not particularly fond of is the rather large interface for
createDefaultPipeline. That convenience routine takes a rather large set of
inputs from the llvm-mca driver, where many of those inputs are generated via
command line options.

One item I think we might want to change is the separating of ownership of
hardware components (owned by the context) and the pipeline (which owns
Stages). In short, a Pipeline owns Stages, a Context (currently) owns hardware.
The Pipeline's Stages make use of the components, and thus there is a lifetime
dependency generated. The components must outlive the pipeline. We could solve
this by having the Context also own the Pipeline, and not return a
unique_ptr<Pipeline>. Now that I think about it, I like that idea more.

Differential Revision: https://reviews.llvm.org/D48691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336456 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add support for static libraries

This diff adds support for handling static libraries
to llvm-objcopy and llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D48413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336455 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add more tests for potentially poisonous shifts; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336454 91177308-0d34-0410-b5e6-96231b3b80d8

Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336453 91177308-0d34-0410-b5e6-96231b3b80d8

[Debugify] Allow unsigned values narrower than their variables

Suppress the diagnostic for mis-sized dbg.values when a value operand is
narrower than the unsigned variable it describes. Assume that a debugger
would implicitly zero-extend these values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336452 91177308-0d34-0410-b5e6-96231b3b80d8

[Local] replaceAllDbgUsesWith: Update debug values before RAUW

The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.

This improves upon the existing insertReplacementDbgValues API by:

- Updating debug intrinsics in-place, while preventing use-before-def of
  the replacement value.
- Falling back to salvageDebugInfo when a replacement can't be made.
- Moving the responsibility for rewriting llvm.dbg.* DIExpressions into
  common utility code.

Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:

- The no-op conversion. Applies when the values have the same width, or
  have bit-for-bit compatible pointer representations.
- Truncation. Applies when the new value is wider than the old one.
- Zero/sign extension. Applies when the new value is narrower than the
  old one.

Testing:

- check-llvm, check-clang, a stage2 `-g -O3` build of clang,
  regression/unit testing.
- This resolves a number of mis-sized dbg.value diagnostics from
  Debugify.

Differential Revision: https://reviews.llvm.org/D48676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336451 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add more tests with poison and undef; NFC

As discussed in D48987 and D48893, there are many different
ways to go wrong depending on the binop (and as shown here
we already do go wrong in some cases).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336450 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix UBSan error caused by r335942

Summary: Fixes PR38071.

Reviewers: arsenm, dstenb

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336448 91177308-0d34-0410-b5e6-96231b3b80d8

[Constants] extend getBinOpIdentity(); NFC

The enhanced version will be used in D48893 and related patches
and an almost identical (fadd is different) version is proposed
in D28907, so adding this as a preliminary step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336444 91177308-0d34-0410-b5e6-96231b3b80d8

[Constant] add undef element query for vector constants; NFC

This is likely to be used in D48987 and similar patches,
so adding it as an NFC preliminary step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336442 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ParallelDSP: added statistics, NFC.

Added statistics for the number of SMLAD instructions created, and
also renamed the pass name to -arm-parallel-dsp.

Differential Revision: https://reviews.llvm.org/D48971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336441 91177308-0d34-0410-b5e6-96231b3b80d8

Commit rL336426 caused buildbot failures

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/

This removes the comments of the function label causing this error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336440 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopSink] Make the enforcement of determinism deterministic.

LoopBlockNumber is a DenseMap<BasicBlock*, int>, comparing the result of
find() will compare a pair<BasicBlock*, int>. That's of course depending
on pointer ordering which varies from run to run. Reverse iteration
doesn't find this because we're copying to a vector first.

This bug has been there since 2016 but only recently showed up on clang
selfhost with FDO and ThinLTO, which is also why I didn't manage to get
a reasonable test case for this. Add an assert that would've caught
this.
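
The pitfall can be sketched with std::map standing in for DenseMap and an empty `Block` struct standing in for BasicBlock (both are assumptions made for this illustration, as are the helper names):

```cpp
#include <map>
#include <utility>

struct Block {}; // Hypothetical stand-in for BasicBlock.
using BlockNumbers = std::map<const Block *, int>;

// Buggy shape: compares whole (pointer, number) entries, so the pointer
// value decides the order. Both keys are expected to be in the map.
bool lessByEntry(const BlockNumbers &M, const Block *A, const Block *B) {
  return *M.find(A) < *M.find(B);
}

// Fixed shape: compares only the stable block numbers.
bool lessByNumber(const BlockNumbers &M, const Block *A, const Block *B) {
  return M.find(A)->second < M.find(B)->second;
}
```

When two blocks happen to be laid out in the opposite order to their numbers, the two predicates disagree, and only the second one gives the same answer from run to run.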

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336439 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] A write latency cannot be a negative value. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336437 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Armv8.4-A: TLB support

This adds:
- outer shareable TLB Maintenance instructions, and
- TLB range maintenance instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336434 91177308-0d34-0410-b5e6-96231b3b80d8

[dsymutil] Emit label at the beginning of a CU

When emitting a CU, store the MCSymbol pointing to the beginning of the
CU. We'll need this information later when emitting the .debug_names
section (DWARF5 accelerator table).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336433 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit: [AArch64] Armv8.4-A: Flag manipulation instructions

Now with the asm operand definition included.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336432 91177308-0d34-0410-b5e6-96231b3b80d8

Added missing semicolon

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336428 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] https://reviews.llvm.org/D48278

D48278

Allow to reduce redundant shift masks.
For example:
x1 = x & 0xAB00
x2 = (x >> 8) & 0xAB

can be reduced to:
x1 = x & 0xAB00
x2 = x1 >> 8
It only allows folding when the masks and shift values are constants.
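
The equivalence in the example can be checked with a small standalone sketch. The function names are hypothetical and a 32-bit unsigned x is assumed:

```cpp
#include <cstdint>

// Original form: x2 = (x >> 8) & 0xAB
uint32_t shiftThenMask(uint32_t X) {
  return (X >> 8) & 0xABu;
}

// Reduced form: x2 = x1 >> 8, reusing x1 = x & 0xAB00
uint32_t maskThenShift(uint32_t X) {
  uint32_t X1 = X & 0xAB00u;
  return X1 >> 8;
}
```

The two agree because 0xAB00 is exactly 0xAB shifted left by 8, so masking before or after the shift keeps the same bits.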

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336426 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [AArch64] Armv8.4-A: Flag manipulation instructions

It's causing build errors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336422 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Armv8.4-A: Flag manipulation instructions

These instructions are added to AArch64 only.

Differential Revision: https://reviews.llvm.org/D48926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336421 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] improve the instruction issue logic implemented by the Scheduler.

This patch modifies the Scheduler heuristic used to select the next instruction
to issue to the pipelines.

The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca
wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for
that test should have been ~2.00.
It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be
predicted by a Scheduler that only prioritizes instructions based on their
"age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s,
for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00.

Instructions in the ReadyQueue are now ranked based on two factors:
- The "age" of an instruction.
- The number of unique users of writes associated with an instruction.

The new logic still prioritizes older instructions over younger instructions to
minimize the pressure on the reorder buffer. However, the number of users of an
instruction now also affects the overall rank. This potentially increases the
ability of the Scheduler to extract instruction level parallelism. This patch
fixes the problem with the wrong IPC reported for test add-sequence.s and test
dependent-pmuld-paddd.s.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336420 91177308-0d34-0410-b5e6-96231b3b80d8

CallGraphSCCPass: iterate over all functions.

Previously we only iterated over functions reachable from the set of
external functions in the module. But since some of the passes under
this (notably the always-inliner and coroutine lowerer) are required for
correctness, they need to run over everything.

This just adds an extra layer of iteration over the CallGraph to keep
track of which functions we've already visited and get the next batch of
SCCs.
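
The extra layer of iteration can be pictured with a plain post-order walk over an adjacency list. This is a simplified sketch with hypothetical names; it uses a DFS rather than real SCC iteration and is not the CallGraphSCCPass code:

```cpp
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<size_t>>; // caller -> callees

static void visitFrom(const Graph &G, size_t N, std::vector<bool> &Seen,
                      std::vector<size_t> &Order) {
  if (Seen[N])
    return;
  Seen[N] = true;
  for (size_t Callee : G[N])
    visitFrom(G, Callee, Seen, Order);
  Order.push_back(N); // Post-order: callees come before their callers.
}

std::vector<size_t> visitAll(const Graph &G,
                             const std::vector<size_t> &ExternalRoots) {
  std::vector<bool> Seen(G.size(), false);
  std::vector<size_t> Order;
  // First batch: everything reachable from the external functions.
  for (size_t R : ExternalRoots)
    visitFrom(G, R, Seen, Order);
  // Extra layer: start again from every function not visited yet.
  for (size_t N = 0; N < G.size(); ++N)
    visitFrom(G, N, Seen, Order);
  return Order;
}
```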

Should fix PR38029.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336419 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][ARM] Armv8.4-A: Trace synchronization barrier instruction

This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction.

Differential Revision: https://reviews.llvm.org/D48918

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336418 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead.

The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector.

There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336416 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Make support types more easily printable.

Summary:
Error's new operator<< is the first way to print an error without consuming it.

formatv() can now print objects with an operator<< that works with raw_ostream.

Reviewers: bkramer

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336412 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply: "objdump: Support newer ObjC image info flags"

Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.

`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.

`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.

Reviewers: enderby, compnerd

Reviewed By: compnerd

Subscribers: keith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336411 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336410 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or unmasked 512-bit intrinsics with rounding mode.

This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking.

This matches how clang uses the intrinsics these days.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336409 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Cleanup some of the avx512 masked fma tests to prepare for removing and autoupgrading.

- Split cases that call 2 intrinsics in the same case.
- Remove testing mask3 and maskz intrinsics with an all ones mask. These won't be interesting after the upgrade.
- Restore test cases for some intrinsics that are marked for deletion, but haven't been deleted yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336408 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbutil] Dump more info about globals.

We add an option to dump the entire global / public symbol record
stream. Previously we would dump globals or publics, but not both.
And when we did dump them, we would always dump them in the order
they were referenced by the corresponding hash streams, not in
the order they were serialized in. This patch adds a lower level
mode that just dumps the whole stream in serialization order.

Additionally, when dumping global-extras, we now dump the hash
bitmap as well as the record offset instead of dumping all zeros
for the offsets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336407 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add __float128 library call for frem

Power 9 does not have a hardware instruction for frem but we can call fmodf128.

Differential Revision: https://reviews.llvm.org/D48552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336406 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Sort globals symbols by name in GSI hash buckets.

It seems like the debugger first computes a symbol's bucket,
and then does a binary search of entries in the bucket using the
symbol's name in order to find it. If the bucket entries are not
in sorted order, this obviously won't work. After this patch a
couple of simple test cases show that we generate an exactly
identical GSI hash stream, which is very nice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336405 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Add a test case to show missed vfnmadd generation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336404 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "objdump: Support newer ObjC image info flags"

This reverts commit 8c4cc472e7a67bd3b2b20cc4cf32d31af29bc7e9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336402 91177308-0d34-0410-b5e6-96231b3b80d8

[OpenEmbedded] Add OpenEmbedded vendor

Summary: The lib paths are not correctly picked up for OpenEmbedded sysroots
(like arm-oe-linux-gnueabi). I fix this in a follow-up clang patch. But in
order to add the correct libs I need to detect if the vendor is oe. For this
reason, it is first necessary to teach llvm to detect oe vendor, which is what
this patch does.

Reviewers: chandlerc, compnerd, rengolin, javed.absar

Reviewed By: compnerd

Subscribers: kristof.beyls, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48861

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336401 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Disassembler] Fix LOCK prefix disassembler support

Summary:
If LOCK prefix is not the first prefix in an instruction, LLVM
disassembler silently drops the prefix.

The fix is to select a proper instruction with a builtin LOCK prefix if
one exists.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49001

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336400 91177308-0d34-0410-b5e6-96231b3b80d8

objdump: Support newer ObjC image info flags

Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.

`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.

`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.

Reviewers: enderby, compnerd

Reviewed By: compnerd

Subscribers: keith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336399 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading.""

There were a couple of issues reported (PR38047, PR37929) - I'll reland
the patch when I figure out and fix the root cause.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336393 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add missing _S opcodes of atomic stores to InstPrinter

Summary: This was missing in D48839 (rL336145).

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336390 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Add BitReader/BitWriter to target_link_libraries

Summary:
CompileOnDemandLayer.cpp uses functions in these libraries, and builds
with `-DSHARED_LIB=ON` fail without this.

Reviewers: lhames

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336389 91177308-0d34-0410-b5e6-96231b3b80d8

This is a recommit of r336322, previously reverted in r336324 due to
a deficiency in TableGen that has been addressed in r336334.

[AArch64][SVE] Asm: Support for predicated FP rounding instructions.

This patch also adds instructions for predicated FP square-root and
reciprocal exponent.

The added instructions are:
- FRINTI  Round to integral value (current FPCR rounding mode)
- FRINTX  Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA  Round to integral value (to nearest, with ties away from zero)
- FRINTN  Round to integral value (to nearest, with ties to even)
- FRINTZ  Round to integral value (toward zero)
- FRINTM  Round to integral value (toward minus Infinity)
- FRINTP  Round to integral value (toward plus Infinity)
- FSQRT   Floating-point square root
- FRECPX  Floating-point reciprocal exponent

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336387 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] In CompileOnDemandLayer2, clone modules on to different contexts by
writing them to a buffer and re-loading them.

Also introduces a multithreaded variant of SimpleCompiler
(MultiThreadedSimpleCompiler) for compiling IR concurrently on multiple
threads.

These changes are required to JIT IR on multiple threads correctly.

No test case yet. I will be looking at how to modify LLI / LLJIT to test
multithreaded JIT support soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336385 91177308-0d34-0410-b5e6-96231b3b80d8

Testing commit permission

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336384 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to 'llvm.fma'. Add upgrade tests for all.

Still need to remove the AVX512 masked versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336383 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add SHUF128 to target shuffle decoding.

Differential Revision: https://reviews.llvm.org/D48954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336376 91177308-0d34-0410-b5e6-96231b3b80d8

Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN

Better NaN handling for AMDGCN fmed3.

All operands are checked for NaN now. The checks
were moved before the canonicalization to provide
a better mapping from fclamp. Changed the behaviour
of fmed3(x,y,NaN) to return max(x,y) instead of
min(x,y) in light of this. Updated tests as a result
and added some new cases to cover the fix.
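
A hedged sketch of the folding rule on plain floats. The function name is hypothetical, and applying the same max-of-the-other-two rule to a NaN in any operand position is an assumption of this sketch; the message above only spells out the fmed3(x,y,NaN) case:

```cpp
#include <algorithm>
#include <cmath>

float fmed3FoldSketch(float A, float B, float C) {
  // Assumption: a NaN in any position folds to the max of the other two.
  if (std::isnan(A))
    return std::fmax(B, C);
  if (std::isnan(B))
    return std::fmax(A, C);
  if (std::isnan(C))
    return std::fmax(A, B); // fmed3(x, y, NaN) -> max(x, y)
  // Median of three ordinary values.
  return std::max(std::min(A, B), std::min(std::max(A, B), C));
}
```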

Patch by Alan Baker

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336375 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't use spir_kernel in a test

Also use verify-machineinstrs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336374 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Implement custom kernel arg lowering

Avoid using allocateKernArg / AssignFn. We do not want any
of the type splitting properties of normal calling convention
lowering.

For now at least this exists alongside the IR argument lowering
pass. This is necessary to handle struct padding correctly while
some arguments are still skipped by the IR argument lowering
pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336373 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add UDIV/UREM by pow2 costs

Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336371 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Removed archive-headers-disas test

This test is failing because of the disas part.
For the moment, I will just remove it. I will add it again tomorrow
with a proper fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336370 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Fix RegisterFile debug prints. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336367 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix timezone dependent tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336363 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add lib calls for float128 operations with no equivalent PPC instructions

Map the following instructions to the proper float128 lib calls:
pow[i], exp[2], log[2|10], sin, cos, fmin, fmax

Differential Revision: https://reviews.llvm.org/D48544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336361 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add srem x, (1 << c) combine tests

Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336360 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add --archive-headers (-a) option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336357 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Add uop computation for more X87 instruction classes.

Summary:
This allows measuring comparisons (UCOM_FpIr32,UCOM_Fpr32,...),
conditional moves (CMOVBE_Fp32,...)

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336352 91177308-0d34-0410-b5e6-96231b3b80d8

Fix comment typo. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336351 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64, PowerPC, x86] add tests for signbit bit hacks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336348 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI.

This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism.

Differential Revision: https://reviews.llvm.org/D48945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336344 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] clang-format

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336343 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Add VALU to V_INTERP Instructions

Wait states are not properly being inserted after buffer_store for v_interp instructions.

Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can
check and insert the appropriate wait states when needed.

Differential Revision: https://reviews.llvm.org/D48772

Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336339 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Switch to indirect even the trivial case through an object pointer
that has required alignment. This avoids issues that keep coming up with
function pointers being less aligned.

I'm pretty annoyed that we can't take advantage of function alignment
even on platforms where they *are* aligned, but build modes and other
things make taking advantage of it somewhere between hard and
impossible. The best case scenario would still embed various build modes
into the ABI causing really hard to debug issues if you compiled one
object file differently from another. =/

This should at least bring the bots back that were having trouble with
this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336337 91177308-0d34-0410-b5e6-96231b3b80d8

Partially revert r336268 in address-offsets.ll

Summary: The typos there are intentional; they were explicitly introduced to disable these cases in r280285.

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D48962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336336 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Increase the number of supported decoder fix-ups.

The vast number of added instructions for SVE causes TableGen to fail with an assertion:

Assertion `Delta < 65536U && "disassembler decoding table too large!"'

This patch increases the number of supported decoder fix-ups.
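
The assertion suggests that a skip distance in the decoding table has to fit in a 16-bit field. As a rough illustration of why widening the field lifts the limit, the sketch below stores a distance as a fixed-width little-endian field in a byte table. The helper names and the choice of a 3-byte field are assumptions made for this example, not taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True if Delta can be stored in a field of WidthBytes bytes (1..4).
inline bool fits_in_field(std::uint32_t Delta, unsigned WidthBytes) {
  return WidthBytes >= 4 || Delta < (std::uint32_t{1} << (8 * WidthBytes));
}

// Write Delta at Table[At..At+WidthBytes) in little-endian order.
// Bits that do not fit in the field are silently dropped, which is the
// failure mode a size assertion guards against.
inline void write_skip(std::vector<std::uint8_t>& Table, std::size_t At,
                       std::uint32_t Delta, unsigned WidthBytes) {
  for (unsigned I = 0; I < WidthBytes; ++I)
    Table[At + I] = static_cast<std::uint8_t>(Delta >> (8 * I));
}

// Read back a little-endian field of WidthBytes bytes.
inline std::uint32_t read_skip(const std::vector<std::uint8_t>& Table,
                               std::size_t At, unsigned WidthBytes) {
  std::uint32_t Delta = 0;
  for (unsigned I = 0; I < WidthBytes; ++I)
    Delta |= static_cast<std::uint32_t>(Table[At + I]) << (8 * I);
  return Delta;
}
```

With a 2-byte field a distance of 65536 or more cannot be represented; with a 3-byte field the same distance round-trips unchanged.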

Reviewers: dmgreen, stoklund, petpav01

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D48937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336334 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add extra v16i16 shl x,c -> pmullw test

We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336333 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix -Wimplicit-fallthrough warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336331 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix atomic operations at O0, v3

Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the stores can cause the
atomic sequence to fail.

This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.

This version addresses issues with the initial implementation and covers
all atomic operations.

This resolves PR/32020.

Thanks to James Cowgill for reporting the issue!

Patch By: Simon Dardis

Differential Revision: https://reviews.llvm.org/D31287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336328 91177308-0d34-0410-b5e6-96231b3b80d8

[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses

Resolves:
Unsupported ARM Neon intrinsics in Target-specific DAG combine
function for VLDDUP
https://bugs.llvm.org/show_bug.cgi?id=38031

Related diff: D48439

Differential Revision: https://reviews.llvm.org/D48920

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336325 91177308-0d34-0410-b5e6-96231b3b80d8

Reverting r336322 for now, as it causes an assert failure
in TableGen, for which there is already a patch in Phabricator
(D48937) that needs to be committed first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336324 91177308-0d34-0410-b5e6-96231b3b80d8

Partial revert of "NFC - Various typo fixes in tests"

This partially reverts r336268 since it causes buildbot failures.

Added FIXME at the places where the CHECKs are misspelled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336323 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for predicated FP rounding instructions.

This patch also adds instructions for predicated FP square-root and
reciprocal exponent.

The added instructions are:
- FRINTI  Round to integral value (current FPCR rounding mode)
- FRINTX  Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA  Round to integral value (to nearest, with ties away from zero)
- FRINTN  Round to integral value (to nearest, with ties to even)
- FRINTZ  Round to integral value (toward zero)
- FRINTM  Round to integral value (toward minus Infinity)
- FRINTP  Round to integral value (toward plus Infinity)
- FSQRT   Floating-point square root
- FRECPX  Floating-point reciprocal exponent
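
As a scalar illustration of the fixed rounding modes listed above, the sketch below maps five of them onto standard `<cmath>` functions. The `Frint` enum and the helper names are hypothetical and exist only for this example; FRINTI and FRINTX depend on the dynamic FPCR mode, and their closest standard analogues would be `std::nearbyint` and `std::rint`, which are not modelled here.

```cpp
#include <cmath>

// Hypothetical names for this sketch; not LLVM or ACLE identifiers.
enum class Frint { A, N, Z, M, P };

// Round to nearest with ties to even. std::round breaks ties away from
// zero, so an exact tie is re-rounded through x/2, which always lands on
// the even neighbour.
inline double round_ties_even(double x) {
  if (std::fabs(x - std::trunc(x)) == 0.5)
    return 2.0 * std::round(x / 2.0);
  return std::round(x);
}

// Scalar model of one lane for the fixed-mode FRINT variants.
inline double round_to_integral(Frint mode, double x) {
  switch (mode) {
  case Frint::A: return std::round(x);      // nearest, ties away from zero
  case Frint::N: return round_ties_even(x); // nearest, ties to even
  case Frint::Z: return std::trunc(x);      // toward zero
  case Frint::M: return std::floor(x);      // toward minus infinity
  case Frint::P: return std::ceil(x);       // toward plus infinity
  }
  return x; // not reached for valid modes
}
```

The difference between FRINTA and FRINTN only shows up on exact ties such as 2.5, where the former gives 3 and the latter gives 2.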

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336322 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ParallelDSP: only support i16 loads for now

We were miscompiling i8 loads, so reject them as unsupported narrow operations
for now.

Differential Revision: https://reviews.llvm.org/D48944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336319 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABD

This patch implements the following varieties:

- Unpredicated signed max,   e.g. smax z0.h, z1.h, #-128
- Unpredicated signed min,   e.g. smin z0.h, z1.h, #-128

- Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255
- Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255

- Predicated signed max,     e.g. smax z0.h, p0/m, z0.h, z1.h
- Predicated signed min,     e.g. smin z0.h, p0/m, z0.h, z1.h
- Predicated signed abd,     e.g. sabd z0.h, p0/m, z0.h, z1.h

- Predicated unsigned max,   e.g. umax z0.h, p0/m, z0.h, z1.h
- Predicated unsigned min,   e.g. umin z0.h, p0/m, z0.h, z1.h
- Predicated unsigned abd,   e.g. uabd z0.h, p0/m, z0.h, z1.h
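
A lane-wise scalar model may help when reading the forms above. The sketch assumes the destructive encoding used in the predicated examples, where the first source is also the destination, and merging predication (`/m`), under which inactive lanes keep their previous value. The type aliases and function names are hypothetical, a fixed lane count stands in for the scalable vector length, and the absolute difference is kept modulo 2^16, which is an assumption of this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-length stand-ins for a .h vector and a predicate.
template <std::size_t N> using VecH = std::array<std::int16_t, N>;
template <std::size_t N> using Pred = std::array<bool, N>;

// Models `smax zdn.h, pg/m, zdn.h, zm.h`: active lanes take the signed
// maximum, inactive lanes are left unchanged.
template <std::size_t N>
void smax_merging(VecH<N>& zdn, const Pred<N>& pg, const VecH<N>& zm) {
  for (std::size_t i = 0; i < N; ++i)
    if (pg[i]) zdn[i] = std::max(zdn[i], zm[i]);
}

// Models `sabd zdn.h, pg/m, zdn.h, zm.h`: signed absolute difference,
// computed in 32 bits and then narrowed to the low 16 bits.
template <std::size_t N>
void sabd_merging(VecH<N>& zdn, const Pred<N>& pg, const VecH<N>& zm) {
  for (std::size_t i = 0; i < N; ++i) {
    if (!pg[i]) continue;
    const std::int32_t a = zdn[i], b = zm[i];
    const std::int32_t diff = a > b ? a - b : b - a;
    zdn[i] = static_cast<std::int16_t>(static_cast<std::uint16_t>(diff));
  }
}

// Models the unpredicated immediate form of smax with a signed 8-bit
// immediate applied to every lane.
template <std::size_t N>
void smax_imm(VecH<N>& zdn, std::int8_t imm) {
  const std::int16_t bound = imm;
  for (auto& lane : zdn) lane = std::max(lane, bound);
}
```

The unsigned variants would follow the same shape with `std::uint16_t` lanes and an unsigned 8-bit immediate.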

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336317 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Optimize codegen for conversions of int to float128

Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336316 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove X86 specific scalar FMA intrinsics and upgrade to target independent FMA and extractelement/insertelement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336315 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9][NFC] add back-end tests for passing homogeneous fp128 aggregates by value

Tests to verify that we are passing fp128 via VSX registers as per ABI.
These are related to clang commit rL336308.

Differential Revision: https://reviews.llvm.org/D48310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336314 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add tests for passing float128 in VSX reg for non-homogeneous aggregates

Add missing testcase for rL336310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336313 91177308-0d34-0410-b5e6-96231b3b80d8

[demangler] Avoid alignment warning

The alignment specified by a constant for the field
`BumpPointerAllocator::InitialBuffer` exceeded the alignment
guaranteed by `malloc` and `new` on Windows. This change sets
the alignment value to that of `long double`, which is defined
by the target platform.
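
A minimal sketch of the idea, assuming a class with an inline character buffer: tying the buffer's alignment to `long double` keeps it within what the platform's default allocation functions guarantee. The class name `BumpBufferSketch` and the buffer size are hypothetical; only the field name `InitialBuffer` comes from the description above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the allocator described above; only the
// alignment of the inline buffer is modelled.
class BumpBufferSketch {
public:
  static constexpr std::size_t BufferSize = 4096; // assumed size

  // True if the inline buffer starts on a long double boundary.
  bool buffer_is_aligned() const {
    return reinterpret_cast<std::uintptr_t>(InitialBuffer) %
               alignof(long double) == 0;
  }

private:
  // Platform-defined alignment instead of a hard-coded constant.
  alignas(long double) char InitialBuffer[BufferSize];
};

// The class inherits exactly the alignment of its most-aligned member.
static_assert(alignof(BumpBufferSketch) == alignof(long double),
              "buffer alignment should match long double");
// long double is a scalar type, so it never exceeds max_align_t.
static_assert(alignof(long double) <= alignof(std::max_align_t),
              "long double alignment is covered by default allocation");
```

Because the requirement never exceeds `alignof(std::max_align_t)`, objects of this type can be created with plain `new` without an over-alignment warning.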

It fixes https://bugs.llvm.org/show_bug.cgi?id=37944.

Differential Revision: https://reviews.llvm.org/D48889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336311 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Ensure float128 in non-homogeneous aggregates are passed via VSX reg

Non-homogeneous aggregates are passed in consecutive GPRs, split between GPRs and
memory, or entirely in memory. This patch ensures that float128 members of
non-homogeneous aggregates are passed via VSX registers.

This is done by custom-lowering a bitcast of a build_pair(i64, i64) to float128
into a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336310 91177308-0d34-0410-b5e6-96231b3b80d8