git.osdn.net Git - android-x86/external-llvm.git/log

[BPF] generate R_BPF_NONE relocation for BTF DataSec variables

The variables in BTF DataSec type encode in-section offset.
R_BPF_NONE should be generated instead of R_BPF_64_32.

Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D62460

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361742 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

    Details: To make instruction selection really divergence driven it is necessary to assign
             the correct register classes to the cross block values beforehand. For the divergent targets
             same value type requires different register classes dependent on the value divergence.

    Reviewers: rampitec, nhaehnle

    Differential Revision: https://reviews.llvm.org/D59990

    This commit was reverted because of the build failure.
    The reason was mlformed patch.
    Build failure fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361741 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA][Scheduler] Improved critical memory dependency computation.

This fixes a problem where back-pressure increases caused by register
dependencies were not correctly notified if execution was also delayed by memory
dependencies.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361740 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] GetDemandedBits - cleanup to more closely match SimplifyDemandedBits. NFCI.

Prep work before adding demanded elts support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361739 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] MaskedValueIsZero - add demanded elements implementation

Will be used in an upcoming patch but I've updated the original implementation to call this to ensure test coverage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361738 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Refactor the logic that computes the critical memory dependency info. NFCI

CriticalRegDep has been renamed CriticalDependency, and it is now used by class
Instruction to store information about the critical register dependency and the
critical memory dependency. No functional change intendend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361737 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] back out all SwitchInst commits

They caused the sanitizer builds to fail.

My suspicion is the change the countLeadingZeros().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361736 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add shuffle combining support for ISD::ANY_EXTEND_VECTOR_INREG

Reuses what we already have in place for ISD::ZERO_EXTEND_VECTOR_INREG just with a different sentinel

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361734 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, one more fixed test from previous push.

The old test was checking for a stupid subtract one that is a transform that
makes the code woorse.

The constant-islands-jump-table.ll test wants the code a specific way,
that makes sense, so I will submit code to fix that one.

Sorry that I really didn't know how to run the test suite before this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361733 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL361731 : [LLParser] Fix uninitialized variable warnings. NFCI.

These 3 variables cause quite a few warnings in the scan-build report on llvm.
........
Revert accidental commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361732 91177308-0d34-0410-b5e6-96231b3b80d8

[LLParser] Fix uninitialized variable warnings. NFCI.

These 3 variables cause quite a few warnings in the scan-build report on llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361731 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, fix failing tests from last patches.

No problems with the transforms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361730 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] prevent crashing with invalid extractelement index

This was found/reduced from a fuzzer report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361729 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361728 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize

Rather than gating on "isSwitchDense" (resulting in necessesarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.

This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.

Be careful to not generate worse code, by introducing a
SubThreshold heuristic.

Instead of just sorting by signed, generalize the finding of the
best base.

And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32768 and 32769, of which
32769 would be interpreted as -32768, and now the code thinks
the table is size 65536.

(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)

And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".

Also, fix generation of invalid LLVM-IR: shl by bit-width.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361727 91177308-0d34-0410-b5e6-96231b3b80d8

[SimpligyCFG] NFC, remove GCD that was only used for powers of two

and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361726 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, update Switch tests to HEAD so I can see if my changes change anything

Also add baseline tests to show effect of later patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361725 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] make countLeadingZeros() and countTrailingZeros() return unsigned

This matches countLeadingOnes() and countTrailingOnes(), and
APInt's countLeadingZeros() and countTrailingZeros().

(as well as __builtin_clzll())

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361724 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Base computeOverflowForUnsignedMul() on ConstantRange code; NFCI

The implementation in ValueTracking and ConstantRange are equally
powerful, reuse the one in ConstantRange, which will make this easier
to extend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361723 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r361664

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361722 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Refactor OptimizeOverflowCheck; NFCI

Extract method to compute overflow based on binop and signedness,
and then make the result handling code generic. This extends the
always-overflow handling to signed muls, but has currently no effect,
as we don't compute always overflow for them (thus NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361721 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Remove OverflowCheckFlavor; NFC

Instead pass binary op and signedness. The extra enum only makes
things more complicated in this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361720 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select fp16 fma

This adds a pattern for fma, similar to the float and double patterns.

Differential Revision: https://reviews.llvm.org/D62330

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361719 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select a number of fp16 rounding functions

This add patterns for fp16 round and ceil etc. Same as the float and double
patterns.

Differential Revision: https://reviews.llvm.org/D62326

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361718 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Promote various fp16 math intrinsics

Promote a number of fp16 math intrinsics to float, so that the relevant float
math routines can be used. Copysign is expanded so as to be handled in-place.

Differential Revision: https://reviews.llvm.org/D62325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361717 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] combineBitcastvxi1 - peek through bitops to determine size of original vector

We were only testing for direct SETCC results - this allows us to peek through AND/OR/XOR combinations of the comparison results as well.

There's a missing SEXT(PACKSS) fold that I need to investigate for v8i1 cases before I can enable it there as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361716 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select fp16 fabs

This adds a pattern for the fabs intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62324

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361715 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select fp16 fsqrt

This adds a pattern for the sqrt intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361714 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Promote fp16 frem

Promote fp16 frem operations on ARM to floats so they call fmodf.

Differential Revision: https://reviews.llvm.org/D62321

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361713 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add some base fullfp16 tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361712 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add missing R_PPC_* relocation types

While people mostly care about 64-bit, some systems need basic lib32
support. The plan is to make lld (see PR40888) capable of linking some
applications (PR40888).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361711 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Added condition assumption for unreachable blocks

Summary: PR41688

Reviewers: spatel, efriedma, craig.topper, hfinkel, reames

Reviewed By: hfinkel

Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361707 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] lowerBuildVectorToBitOp - support build_vector(shift()) -> shift(build_vector(),C)

Commonly occurs in sign-extension cases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361706 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add Accessor for Mach-O Universal Binary Slices

Summary: Allow for retrieving an object file corresponding to an architecture-specific slice in a Mach-O universal binary file.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60378

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361705 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Combine fminnum/fmaxnum with non-nan operand to fmin/fmax

If we have a known non-nan operand, place it in the second operand
of fmin/fmax that is returned if either operand is nan.

Differential Revision: https://reviews.llvm.org/D62448

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361704 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI][CVP] Add support for saturating add/sub

Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.

Differential Revision: https://reviews.llvm.org/D62447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361703 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] vector-sext - cleanup prefix lists

Add X32-SSE common prefix to merge some checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361702 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] define binops as a superset of commutative binops

The test diffs show improved vector narrowing for integer min/max opcodes because
those were all absent from the list. I'm not sure if we can expose functional diffs
for all of the moved/added opcodes though.

It seems like we are missing an AVX512 opportunity to use 256-bit ops in place of
512-bit ops on some tests/targets, but I think that can be a follow-up.

Preliminary steps to make sure the callers are not misusing these queries:
rL361268
rL361547

Differential Revision: https://reviews.llvm.org/D62191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361701 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add tests for min/maxnum with const operand; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361700 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Fix test by regenerating checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361699 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Remove unnecessary checks for empty GNWR; NFC

The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.

To make this more obviously true, replace the conversative return
in makeGNWR with an assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361698 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Make tests more robust for new optimizations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361697 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] soften assertion when legalizing narrow vector FP ops

The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not
be so aggressive with an assert here. There's no telling what out-of-tree
targets look like.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361696 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Update test checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361695 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Add tests for saturating add/sub ranges; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361694 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI][CVP] Calculate with.overflow result range

In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.

Differential Revision: https://reviews.llvm.org/D60656

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361693 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI] Extract helper for binary range calculations; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361692 91177308-0d34-0410-b5e6-96231b3b80d8

[X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly.

INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags.

This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg.

One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think its important. Not sure which operand its supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input.

Differential Revision: https://reviews.llvm.org/D61472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361691 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add zero idioms to the haswell, broadwell, and skylake schedule models. Add 256-bit fp xor to sandybridge zero idioms

This copies the Sandy Bridge zero idiom support to later CPUs. Adding the AVX2 and AVX512F/VL instructions as appropriate.

Differential Revision: https://reviews.llvm.org/D62360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361690 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][llvm-mca] Add zero idiom tests for Intel CPUs. NFC

This pre-commits tests for D62360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361689 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."

Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361688 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[Analysis] Link library dependencies to Analysis plugins"

This reverts commit r361340. The following builder has been broken for
the past few days because of this commit:

http://green.lab.llvm.org/green/job/clang-stage2-cmake-RgSan/

Also revert r361399, which was committed to fix r361340.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361685 91177308-0d34-0410-b5e6-96231b3b80d8

Rename clangToolingRefactor to clangToolingRefactoring for consistency with its directory

See "[cfe-dev] The name of clang/lib/Tooling/Refactoring".

Differential Revision: https://reviews.llvm.org/D62420

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361684 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-dwarfdump: Don't error on mixed units using/not using str_offsets

This lead to errors when dumping binaries with v4 and v5 units linked
together (but could've also errored on v5 units that did/didn't use
str_offsets).

Also improves error handling and messages around invalid str_offsets
contributions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361683 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Make FP constraint checks consider possible use/def banks

In a few places in getInstrMapping, we check if use/def instructions for the
instruction we're mapping have floating point constraints.

We can improve this check and reduce the number of copies in GISel-compiled code
if we make a couple observations:

- For a def instruction, it only matters if the def instruction must always
output a value stored on a FPR

- For a use instruction, it only matters if the use instruction must always
only take in values stored in FPRs

This adds two new functions:

- onlyUsesFP
- onlyDefinesFP

Then we can use those when we're checking the uses/defs instead.

Without this patch, the load, unmerge, store, and select in the added test
would have unnecessary copies.

Differential Revision: https://reviews.llvm.org/D62426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361679 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] NFC: Factor out HasFPConstraints into a proper function

Factor it out into a function, and replace places where we had the same check
with the new function.

Differential Revision: https://reviews.llvm.org/D62421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361677 91177308-0d34-0410-b5e6-96231b3b80d8

[dwarfdump] Add flag to limit the number of parents DIEs

This adds `-parent-recurse-depth` which limits the number of parent DIEs
being dumped.

Differential revision: https://reviews.llvm.org/D62359

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361671 91177308-0d34-0410-b5e6-96231b3b80d8

Implement call lowering without parameters on AIX

Summary:dd
This patch implements call lowering for calls without parameters
on AIX as initial support.

Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma

Differential Revision: https://reviews.llvm.org/D61948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361669 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Improve register bank mappings for G_SELECT

The fcsel and csel instructions differ in only the register banks they work on.

So, they're entirely interchangeable otherwise.

With this in mind, this does two things:

- Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as
the outputs.
- Teach it to choose the best register bank mapping based off the constraints
of the inputs and outputs.

The "best" in this case means the one that requires the smallest number of
copies to properly emit a fcsel/csel.

For example, if the inputs are all already going to be on FPRs, we should
emit a fcsel, even if the output is a GPR. This costs one copy to produce the
result, but saves us from copying the inputs into GPRs.

Also update the regbank-select.mir to check that we end up with the right
select instruction.

Differential Revision: https://reviews.llvm.org/D62267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361665 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] check for INLINEASM_BR along w/ INLINEASM

Summary:
It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Reviewers: t.p.northover, peter.smith

Reviewed By: peter.smith

Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62402

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361661 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM

Summary:
We were observing failures for arm32 allyesconfigs of the Linux kernel
with the asm goto Clang patch, where ldr's were being generated to
offsets too far away to encode in imm12.

It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Link: https://github.com/ClangBuiltLinux/linux/issues/490
Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover

Reviewed By: peter.smith

Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62400

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361659 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spills

If some lanes weren't active on entry to the function, this could
clobber their VGPR values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361655 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Boost inline threshold with addrspacecasted alloca arguments

This was skipping GetUnderlyingObject for nonprivate addresses, but an
alloca could also be found through an addrspacecast if it's flat.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361649 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] update test to be independent of instcombine; NFC

This is a regression test for vectorization, so remove instcombine
from the RUN line and adjust the comparison predicates to show what
the vectorizer is creating rather than how instcombine cleans it up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361648 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Fix issues building runtimes

This resolves two issues:
(1) LIBCXX_HEADER_DIR is a very misleadingly named variable because it shouldn't be set to the header directory, instead it needs to be the root binary dir.
(2) If you build runtimes without libcxx, we can't depend on the libcxx header target, so we should instaed refer to it by the variable name which will be unset if libcxx isn't present.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361646 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

Details: To make instruction selection really divergence driven it is necessary to assign
the correct register classes to the cross block values beforehand. For the divergent targets
same value type requires different register classes dependent on the value divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361644 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] - Strip undefined symbols if they are no longer referenced following --only-section

This is https://bugs.llvm.org/show_bug.cgi?id=40004.

In this patch I teach llvm-objcopy to remove undefined symbols if
them are not used anymore after applying -j/--only-section option.

Differential revision: https://reviews.llvm.org/D62317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361642 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r361607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361640 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Zero-initialize field CRD in InstructionBase. Also run clang-format on a couple of files. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361637 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Implement GNU-style output for dynamic table

GNU readelf tool prints slightly different dynamic table "header" and
surrounds dynamic tag names by brackets. This patch implements the same
formatting for GNU-style output of the `llvm-readobj`.

LLVM
```
DynamicSection [ (13 entries)
  Tag        Type                 Name/Value
  0x00000006 SYMTAB               0x168
  ...
]
```

GNU
```
Dynamic section at offset 0x1d0 contains 13 entries:
  Tag        Type                 Name/Value
  0x00000006 (SYMTAB)             0x168
  ...
```

Differential Revision: https://reviews.llvm.org/D62256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361633 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Remove CRBits Copy Of Unset/set CBit

For the situation, where we generate the following code:

       crxor 8, 8, 8
       < Some instructions>
.LBB0_1:
       < Some instructions>
       cror 1, 8, 8

cror (COPY of CRbit) depends on the result of the crxor instruction.
CR8 is known to be zero as crxor is equivalent to CRUNSET. We can simply use
crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on
any previous instruction.

This patch will optimize it to:

        < Some instructions>
.LBB0_1:
        < Some instructions>
        cror 1, 1, 1

Patch By: Victor Huang (NeHuang)

Differential Revision: https://reviews.llvm.org/D62044

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361632 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r361630 "[llvm-readelf] - Allow dumping of the .dynamic section even if there is no PT_DYNAMIC header."

It broke BB:
http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/3748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361631 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readelf] - Allow dumping of the .dynamic section even if there is no PT_DYNAMIC header.

It is now possible after D61937 was landed and was discussed
in it's review comments. It is not consistent with GNU, which
does not output .dynamic section content in this case for
no visible reason.

Differential revision: https://reviews.llvm.org/D62179

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361630 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support SVE2 String Processing Group

Summary:
Patch adds support for the SVE2 character match instructions MATCH and NMATCH.

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62206

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361627 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj][mips] Align GOT columns headers properly in 64-bit case

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361626 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support SVE2 Narrowing Group

Summary:
Patch adds support for the following instructions:

SVE2 bitwise shift right narrow:
    * SQSHRUNB, SQSHRUNT, SQRSHRUNB, SQRSHRUNT, SHRNB, SHRNT, RSHRNB, RSHRNT,
      SQSHRNB, SQSHRNT, SQRSHRNB, SQRSHRNT, UQSHRNB, UQSHRNT, UQRSHRNB,
      UQRSHRNT

SVE2 integer add/subtract narrow high part:
    * ADDHNB, ADDHNT, RADDHNB, RADDHNT, SUBHNB, SUBHNT, RSUBHNB, RSUBHNT

SVE2 saturating extract narrow:
    * SQXTNB, SQXTNT, UQXTNB, UQXTNT, SQXTUNB, SQXTUNT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361624 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support SVE2 Accumulate Group

Summary:
Patch adds support for the following instructions:

SVE2 bitwise shift and insert:
    * SRI, SLI

SVE2 bitwise shift right and accumulate:
    * SSRA, USRA, SRSRA, URSRA

SVE2 complex integer add:
    * CADD, SQCADD

SVE2 integer absolute difference and accumulate:
    * SABA, UABA

SVE2 integer absolute difference and accumulate long:
    * SABALB, SABALT, UABALB, UABALT

SVE2 integer add/subtract long with carry:
    * ADCLB, ADCLT, SBCLB, SBCLT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62204

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361622 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump][test] Fix for spurious matches against file paths

r361479 added tests that did --implicit-check-not=main, but a user found
that they failed on his machine, due to it having 'main' in a file path
printed earlier in the output. This test fixes this issue by making the
check pattern more explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361621 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] computeKnownBits - support constant pool values from target

This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets.

computeKnownBits then uses this function to improve codegen, notably vector code after legalization.

A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit.

This required a couple of fixes:
* SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR).
* Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets.

Differential Revision: https://reviews.llvm.org/D61887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361620 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: add PMULLB/PMULLT instructions

Summary:
This patch adds support for the polynomial multiplication instructions
PMULLB/PMULLT. The 64-bit source and 128-bit destination element
variants are enabled with crypto extensions (+sve2-aes), similar to the
NEON PMULL2 instruction. All other variants are enabled with +sve2.

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62145

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361619 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: add integer add/sub long/wide instructions

Summary:
Patch adds support for the following instructions:

SVE2 integer add/subtract long:
    * SADDLB, SADDLT, UADDLB, UADDLT, SSUBLB, SSUBLT, USUBLB, USUBLT,
      SABDLB, SABDLT, UABDLB, UABDLT

SVE2 integer add/subtract wide:
    * SADDWB, SADDWT, UADDWB, UADDWT, SSUBWB, SSUBWT, USUBWB, USUBWT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62142

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361615 91177308-0d34-0410-b5e6-96231b3b80d8

Use the DataLayout::typeSizeEqualsStoreSize helper. NFC

Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361613 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: add various bitwise shift instructions

Summary:
This patch adds support for the SVE2 saturating/rounding bitwise shift
left (predicated) group of instructions:

* SRSHL, URSHL, SRSHLR, URSHLR, SQSHL, UQSHL, SQRSHL, UQRSHL,
SQSHLR, UQSHLR, SQRSHLR, UQRSHLR

Immediate forms of the SQSHL and UQSHL instructions are also added to
the existing SVE bitwise shift by immediate (predicated) group, as well
as three new instructions SRSHR/URSHR/SQSHLU. The new instructions in
this group are encoded similarly and are implemented using the same
TableGen class with a minimal change (1 bit in encoding).

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62140

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361612 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: add saturating add/sub instructions

Summary:
Patch adds support for the following instructions:

* SQADD, UQADD, SUQADD, USQADD
* SQSUB, UQSUB, SQSUBR, UQSUBR

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361611 91177308-0d34-0410-b5e6-96231b3b80d8

StructurizeCFG: Relax uniformity checks.

This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:

1. All conditional branches that are direct children are uniform.
2. And either:
  a. All sub-regions are uniform.
  b. There is one or less conditional branches among the direct
     children.

Differential Revision: https://reviews.llvm.org/D62198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361610 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: fix overlapping bit

Summary:
Bit 20 in sve2_int_arith_pred TableGen class was overlapping. The
encodings are not affected as bit 20 is defined by the opc bits
and this was overwriting the earlier error of setting bit 20 to 0.

Raised by Momchil: https://reviews.llvm.org/D62130

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62292

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361609 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: support swifterror attribute on AArch64.

swifterror marks an argument as a register pretending to be a pointer, so we
need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the
infrastructure can be reused from the DAG world.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361608 91177308-0d34-0410-b5e6-96231b3b80d8

CodeGen: factor out swifterror value tracking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361607 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Always check that `shift and add` optimization is efficient.

The D45316 introduced the `shouldTransformMulToShiftsAddsSubs` function
to check that breaking down constant multiplications into a series
of shifts, adds, and subs is efficient. Unfortunately, this function
does not check maximum number of steps on all paths of the algorithm.
This patch fixes this bug.

Fix for PR41929.

Differential Revision: https://reviews.llvm.org/D62166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361606 91177308-0d34-0410-b5e6-96231b3b80d8

[DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores

Summary:
The DeadStoreElimination pass now skips doing
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores is having a store size that is different
from the size of the type being stored.

This solves problems seen in
https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.

The content and location of the padding bits is not
formally described (or undefined) in the LangRef
at the moment. So the solution is chosen based on
that we cannot assume anything about the padding bits
when having a store that clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949

Reviewers: spatel, efriedma, fhahn

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62250

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361605 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ARMExpandPseudoInsts: add debug messages

This pass wasn't printing any messages at all, which I find really inconvenient
while debugging/tracing things. It now dumps the before and after of expanded
instructions. It doesn't do this yet for all instructions, but this is a good
start I guess.

Differential Revision: https://reviews.llvm.org/D62297

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361604 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add a specific heuristic to schedule the addi before the load
When we are scheduling the load and addi, if all other heuristic didn't take effect,
we will try to schedule the addi before the load, to hide the latency, and avoid the
true dependency added by RA. And this only take effects for Power9.

Differential Revision: https://reviews.llvm.org/D61930

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361600 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case that was supposed to go with r360102.

Found in my working area. Guess I forgot 'git add' before committing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361599 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] SwitchInst: Introduce wrapper for prof branch_weights handling

This patch introduces a wrapper class that re-implements
several mutator methods of SwitchInst to handle changes
of prof branch_weights metadata along with remove/add
switch case methods.
Subsequent patches will use this wrapper to implement
prof branch_weights metadata handling for SwitchInst.

Reviewers: davidx, eraman, reames, chandlerc
Reviewed By: davidx
Differential Revision: https://reviews.llvm.org/D62122

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361596 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-nm] Fix Bug 41353 - unique symbols printed as D instead of u

Summary:
https://bugs.llvm.org/show_bug.cgi?id=41353

I'm new to LLVM and C++ so please do not hesitate to iterate with me on this fix.

Patch by Mike Pozulp!

Reviewers: rupprecht, zbrid, grimar, jhenderson

Reviewed By: rupprecht, jhenderson

Subscribers: jhenderson, chrisjackson, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361595 91177308-0d34-0410-b5e6-96231b3b80d8

Fix BUILD_SHARED_LIBS builds after r361567

Also fixed a comment I noticed while debugging this build

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361591 91177308-0d34-0410-b5e6-96231b3b80d8

Clarify how musttail can be used to create forwarding thunks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361590 91177308-0d34-0410-b5e6-96231b3b80d8

dwarfdump: Deterministically... determine whether parsing a DWARF32 or DWARF64 str_offsets header

Rather than trying one and then the other - use the kind of the CU to
select which kind of header to parse.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361589 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Preserve X8 for thunks ending in variadic musttail calls

Summary:
On Windows, X8 may be used to pass in the address of an aggregate that
is returned indirectly. Therefore, it should be forwarded to variadic
musttail calls and preserved in thunks.

Fixes PR41997

Reviewers: mgrang, efriedma

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62344

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361585 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add nvcast patterns for v2f32 -> v1f64

Summary: Constant stores of f32 values can create such NvCast nodes.

Reviewers: t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62285

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361584 91177308-0d34-0410-b5e6-96231b3b80d8