git.osdn.net Git - android-x86/external-llvm.git/log

[X86] Add PreprocessISelDAG support for turning ISD::FP_TO_SINT/UINT into X86ISD::CVTTP2SI/CVTTP2UI and to reduce the number of isel patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364887 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Implement the areMemAccessesTriviallyDisjoint hook
Once this hook is implemented, the memory dependencies in the scheduling dependency graph are modeled more precisely,
and there are more opportunities to reorder loads/stores, since under some conditions they have no dependency on each other.
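
As a rough illustration of the kind of question such a hook answers (a minimal sketch, not the PowerPC implementation; the function name and the byte-based offset/width representation are assumptions made for this example): two accesses off the same base register cannot overlap when the lower one ends at or before the point where the higher one begins.

```python
def trivially_disjoint(offset_a, width_a, offset_b, width_b):
    """Return True if two accesses off the SAME base register cannot overlap.

    Hypothetical helper: offsets and widths are in bytes. A real hook would
    also have to prove that both accesses use the same, unmodified base.
    """
    if offset_a <= offset_b:
        low_off, low_width, high_off = offset_a, width_a, offset_b
    else:
        low_off, low_width, high_off = offset_b, width_b, offset_a
    # Disjoint when the lower access ends at or before the higher one begins.
    return low_off + low_width <= high_off
```

When the check succeeds, a scheduler has no reason to keep an ordering edge between the two accesses.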

Differential Revision: https://reviews.llvm.org/D63804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364886 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Exploit more opportunities in the TransformFPLoadStorePair transformation

For a given floating point load / store pair, if the load value isn't used by any other operations,
then consider transforming the pair to integer load / store operations if the target deems the transformation profitable.

We can also exploit this in many more cases where there are other operation nodes with a chain operand between the load/store pair,
so long as we keep the original chain ordering. We only replace the register used by the load/store, from float to integer.

I only added a test case for ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled for the ARM target.

Differential Revision: https://reviews.llvm.org/D60601

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364883 91177308-0d34-0410-b5e6-96231b3b80d8

Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element

This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e)

This crashes as reported on the commit thread. Repro instructions TBD.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364876 91177308-0d34-0410-b5e6-96231b3b80d8

[cmake] With utils disabled, don't build tblgen in cross mode

Summary:
In cross mode, we build a separate NATIVE tblgen that runs on the
host and is used during the build. Separately, we have a flag that
disables building all executables in utils/. Of course generally,
this doesn't turn off tblgen, since we need that during the build.
In cross mode, however, that tblgen is useless since we never
actually use it. Furthermore, it can be actively problematic if the
cross toolchain doesn't like building executables for whatever reason.
And even if building executables works fine, we can at least save
compile time by omitting it from the target build. There are two changes
needed to make this happen:
- Stop creating a dependency from the native tool to the target tool.
  No such dependency is required for a correct build, so I'm not entirely
  sure why it was there in the first place.
- If utils were disabled on the CMake command line and we're in cross mode,
  respect that by excluding it from the install target (using EXCLUDE_FROM_ALL).

Reviewers: smeenai
Differential Revision: https://reviews.llvm.org/D64032

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364872 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO] Update ICP pass for recent byval type changes

Fixes verifier errors encountered in PR42413.

Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D63842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364861 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Correct properties for adjcallstack* pseudos

These should be SALU writes, and these are lowered to instructions
that def SCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364859 91177308-0d34-0410-b5e6-96231b3b80d8

Fix broken C++ mode comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364858 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine][NFCI] Update test cases in onehot_merge.ll

Use both one bit and signbit shifting to check for one bit merge.

Reviewers: lebedev.ri, spatel, efriedma, craig.topper

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63903

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364857 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce more checks for power-of-2-or-zero using ctpop

Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).
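
For readers unfamiliar with the idiom, the equivalence between a classic power-of-2 bit trick and a population count can be checked directly. This is only a sketch of the arithmetic; the exact IR patterns matched by the commit are not shown here, and the helper names are made up for the example.

```python
def ctpop(x):
    """Population count of a non-negative integer."""
    return bin(x).count("1")

def is_pow2_or_zero(x):
    # Classic bit trick: clearing the lowest set bit leaves nothing behind.
    return (x & (x - 1)) == 0

def is_pow2(x):
    # Same test, but excluding zero.
    return x != 0 and is_pow2_or_zero(x)

# Exhaustively compare both forms over all 8-bit values.
for x in range(256):
    assert is_pow2_or_zero(x) == (ctpop(x) < 2)
    assert is_pow2(x) == (ctpop(x) == 1)
```

The "including zero" variant corresponds to `ctpop(x) < 2`, the "excluding zero" variant to `ctpop(x) == 1`.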

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364856 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use v4i32 vzloads instead of v2i64 for vpmovzx/vpmovsx patterns where only 32-bits are loaded.

v2i64 vzload defines a 64-bit memory access. It doesn't look like
we have any coverage for this either way.

Also remove some vzload usages where the instruction loads only
16-bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364851 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for MIPSeh_return[32|64] instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364850 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add virtualization ASE to P5600 scheduling definitions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364849 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for LONG_BRANCH_* instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364848 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove several bad load folding isel patterns for VPMOVZX/VPMOVSX.

These patterns all matched a v2i64 vzload which only loads 64-bits
to instructions that load a full 128-bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364847 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [SLP] Look-ahead operand reordering heuristic.

This reverts r364478 (git commit 574cb0eb3a7ac95e62d223a60bef891171dfe321)

The patch is causing compilation timeouts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364846 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] More commutative tests for "shift direction in bittest" (PR42466)

'and' is commutative, so even if we don't want to touch shift-of-const,
we still need to check the other hand of 'and'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364844 91177308-0d34-0410-b5e6-96231b3b80d8

Testing commit access through minor formatting change

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364843 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Try to widen merges with other merges

If the requested source type can be used as a merge source type, create
a merge of merges. This avoids creating large, illegal extensions and
bit-ops directly to the result type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364841 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Revert accidental change to test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364839 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Correct v4f32->v2i64 cvt(t)ps2(u)qq memory isel patterns

These instructions only read 64-bits of memory so we shouldn't
allow a full vector width load to be pattern matched in case it
is marked volatile.

Instead allow vzload or scalar_to_vector+load.

Also add a DAG combine to turn full vector loads into vzload when
used by one of these instructions if the load isn't volatile.

This fixes another case for PR42079

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364838 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Handle more input argument intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364836 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364835 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize workgroup ID intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364834 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize workitem ID intrinsics

Tests don't cover the masked input path since non-kernel arguments
aren't lowered yet.

Test is copied directly from the existing test, with 2 additions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364833 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Custom lower control flow intrinsics

Replace the brcond for the 2 cases that act as branches. For now
follow how the current system works, although I think we can
eventually get rid of the pseudos.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364832 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Handle 16-bit SALU min/max

This needs to be extended to s32, and expanded into cmp+select. This
is relying on the fact that widenScalar happens to leave the
instruction in place, but this isn't a guaranteed property of
LegalizerHelper.
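
A minimal sketch of the arithmetic involved, assuming the widening uses sign extension for signed min/max and zero extension for unsigned ones (the function names are hypothetical and this models values only, not the legalizer itself):

```python
def sext16(x):
    """Sign-extend the low 16 bits of x to a Python int."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def smin16_via_s32(a, b):
    """Signed 16-bit min computed on widened values with cmp+select."""
    wa, wb = sext16(a), sext16(b)   # widen the inputs
    cond = wa < wb                  # compare
    wide = wa if cond else wb       # select
    return wide & 0xFFFF            # truncate back to 16 bits

def umin16_via_s32(a, b):
    """Unsigned 16-bit min: zero-extend, then cmp+select."""
    wa, wb = a & 0xFFFF, b & 0xFFFF
    return wa if wa < wb else wb
```

The same bit pattern `0xFFFF` is the smaller operand in the signed case (it is -1) and the larger one in the unsigned case, which is why the extension kind has to match the operation.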

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364831 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Lower SALU min/max to cmp+select

Use a change observer to apply a register bank to the newly created
intermediate result register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364830 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Avoid SFB - Fix inconsistent codegen with/without debug info(2)

The function findPotentialBlockers may consider debug info instructions as
potential blockers and may stop searching for a store-load pair prematurely.

This patch corrects this and tests the cases where the store is separated
from the load by more than InspectionLimit debug instructions.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D62408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364829 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Add tests for add legalization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364828 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize s16 add/sub/mul

If this is scalar, promote to s32. Use a new observer class to assign
the register bank of newly created registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364827 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECT

The condition register bank must be scc or vcc so that a copy will be
inserted, which will be lowered to a compare.

Currently greedy unnecessarily forces using a VCC select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364825 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466)

https://rise4fun.com/Alive/8O1
https://bugs.llvm.org/show_bug.cgi?id=42466

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364824 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Verify G_MERGE_VALUES operand sizes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364822 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel]: Allow backends to custom legalize Intrinsics

https://reviews.llvm.org/D31359

Add a hook "legalizeIntrinsic" to allow backends to override this
and custom lower/legalize intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364821 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghalt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364819 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize s16 fcmp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364817 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement lower for min/max

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364816 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GFX10: implement ds_ordered_count changes

Summary:
ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.

Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364815 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Support GDS atomics

Summary:
Original patch by Marek Olšák

Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364814 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swap

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364811 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64/GlobalISel: Fix trying to select invalid MIR

Physical registers are not allowed to be a phi operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364810 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelane

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364808 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fail instead of assert when selecting loads

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364807 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Complete implementation of G_GEP

Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364806 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_PHI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364805 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Try to select VOP3 form of add

There are several things broken, but at least emit the right thing for
gfx9.

The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364804 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add widenSubVector to size in bits helper. NFCI.

We can already widenSubVector to a specific type (of the same scalar type) - this variant just specifies the target vector size.

This will be useful when CombineShuffleWithExtract relaxes the need to have the same scalar type for all shuffle operand subvector sources.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364803 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlane

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364801 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-readelf] Expand llvm-readelf documentation

Previously, the llvm-readelf documentation was essentially just a list
of differences to llvm-readobj. Since llvm-readelf is the more likely
go-to tool for many people migrating to the LLVM toolchain, it seems like
it would be helpful to document all the switches in the llvm-readelf
document too. This change expands the options listed accordingly.
Additionally, they are unlikely to care what the differences are to
llvm-readobj, since they won't be familiar with the latter as there is
no GNU equivalent, so this change moves the "differences" section to
llvm-readobj's documentation.

Reviewed by: peter.smith

Differential Revision: https://reviews.llvm.org/D63826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364800 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Implement select for 32-bit G_ADD

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364797 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix MVE_VQxDMLxDH instruction class

Summary:
According to the ARMARM, the VQDMLADH, VQRDMLADH, VQDMLSDH and
VQRDMLSDH instructions handle their results as follows: "The base
variant writes the results into the lower element of each pair of
elements in the destination register, whereas the exchange variant
writes to the upper element in each pair". I.e., the initial content
of the output register affects the result. As usual, we model this
with an additional input.

Also, for 32-bit variants Qd is not allowed to be the same register as
Qm and Qn; we use @earlyclobber to indicate this.

This patch also changes vpred_r to vpred_n because the instructions
don't have an explicit 'inactive' operand.

Reviewers: dmgreen, ostannard, simon_tatham

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364796 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_BRCOND for vcc

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364795 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE: support QQPRRegClass and QQQQPRRegClass

Summary:
QQPRRegClass and QQQQPRRegClass are used by the
interleaving/deinterleaving loads/stores to represent sequences of
consecutive SIMD registers.

Reviewers: ostannard, simon_tatham, dmgreen

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364794 91177308-0d34-0410-b5e6-96231b3b80d8

Update email address in CODE_OWNERS

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364793 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)

Summary:
Note that this pattern is not unhandled by instcombine per se:
it does somehow end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]
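
The identity behind the fold follows from two's complement, where `~X == -X - 1`, so `(Y + ~X) + 1 == Y - X` with wrapping arithmetic. A small exhaustive check over 8-bit values (a sketch; the bit width and function names are chosen for illustration):

```python
BITS = 8
MASK = (1 << BITS) - 1

def before(x, y):
    # (Y + ~X) + 1, with every step wrapped to BITS bits.
    not_x = ~x & MASK
    partial = (y + not_x) & MASK
    return (partial + 1) & MASK

def after(x, y):
    # Y - X, wrapped to BITS bits.
    return (y - x) & MASK

for x in range(1 << BITS):
    for y in range(1 << BITS):
        assert before(x, y) == after(x, y)
```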

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364792 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Shift amount reassociation in bittest (PR42399)

Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but I'm not sure what rules there should be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].
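
One instance of the rewrite can be checked by brute force. The sketch below covers only the `shl`/`lshr` combination with the `ne` predicate, uses a 6-bit width to keep the search small, and uses helper names invented for this example:

```python
BITS = 6
MASK = (1 << BITS) - 1

def shl(v, n):
    return (v << n) & MASK

def lshr(v, n):
    return (v & MASK) >> n

def original(x, y, q, k):
    # icmp ne (and (shl x, Q), (lshr y, K)), 0
    return (shl(x, q) & lshr(y, k)) != 0

def folded(x, y, q, k):
    # icmp ne (and (shl x, Q+K), y), 0
    return (shl(x, q + k) & y) != 0

def check_all():
    for q in range(BITS):
        for k in range(BITS - q):          # enforce (Q+K) u< bitwidth
            for x in range(1 << BITS):
                for y in range(1 << BITS):
                    if original(x, y, q, k) != folded(x, y, q, k):
                        return False
    return True

assert check_all()
```

Both forms test the same pairs of bits: bit `j` of `y` against bit `j - (Q+K)` of `x`, for every `j` that survives the shifts.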

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364791 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Custom-lower UADDO(x, 1) and USUBO(x, 1)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364790 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_FRAME_INDEX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364789 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GFX10: fix scratch resource descriptor

Summary:
The stride should depend on the wave size, not the hardware generation.

Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be
relevant.

Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6

Reviewers: arsenm, rampitec, mareko

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364788 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Make s16 select legal

This is easy to handle and avoids legalization artifacts which are
likely to obscure combines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364787 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_BRCOND for scc conditions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364786 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Tolerate copies with no type set

isVCC has the same bug, but isn't used in a context where it can cause
a problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364784 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix tests using the default alloca address space

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364783 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select src modifiers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364782 91177308-0d34-0410-b5e6-96231b3b80d8

Fixup r364512

Fix stack-use-after-scope errors from r364512. One instance was already
fixed in r364611 - this patch simplifies that fix and addresses one more
instance of similar code.

Discussed in: https://reviews.llvm.org/D63905

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364778 91177308-0d34-0410-b5e6-96231b3b80d8

[UpdateTestChecks][PowerPC] Avoid empty string when scrubbing loop comments

Summary:
SCRUB_LOOP_COMMENT_RE was introduced in https://reviews.llvm.org/D31285
This works for some loops.

However, we may generate lines that contain only a loop comment.
Since we don't scrub leading whitespace, this leaves an empty
line there, and FileCheck complains about it.

eg: llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll:27:15:
error: found empty check string with prefix 'CHECK:'
; CHECK-NEXT:

This prevented us from using `update_llc_test_checks.py` in quite a few cases.

We should still keep the comment token there, so that we can safely
scrub the loop comment without breaking FileCheck.
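
The difference between dropping the comment and keeping its token can be sketched as follows. The pattern and function names below are hypothetical stand-ins written for this illustration; the real `SCRUB_LOOP_COMMENT_RE` and the PowerPC scrubber may differ.

```python
import re

# Hypothetical pattern for illustration only.
LOOP_COMMENT_RE = re.compile(r"# =>This Inner Loop Header:.*|# in Loop:.*")

def scrub_drop(asm):
    # Removing the whole comment can leave a whitespace-only line.
    return LOOP_COMMENT_RE.sub("", asm)

def scrub_keep_token(asm):
    # Keeping the '#' token guarantees the line is never empty.
    return LOOP_COMMENT_RE.sub("#", asm)

comment_only = "        # =>This Inner Loop Header: Depth=1"
print(repr(scrub_drop(comment_only)))
print(repr(scrub_keep_token(comment_only)))
```

With the first variant a comment-only line turns into pure whitespace, which is what produces an empty check string; with the second the line still carries a `#`.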

Reviewers: timshen, hfinkel, lebedev.ri, RKSimon

Subscribers: nemanjai, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364775 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern.

As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.

Also, there should likely be a *separate* fold to do this reordering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364772 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Rework VLCR algorithm

Add code to catch pattern for commutative instructions for VLCR.

Patch by Suyog Sarda.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364770 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Convert some places to Register

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364769 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix RegBankSelect for G_FCANONICALIZE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364768 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix RegBankSelect for G_BUILD_VECTOR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364767 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fail on store to 32-bit address space

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364766 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Improve icmp selection coverage.

Select s64 eq/ne scalar icmp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364765 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold (PR42459)

So we do indeed have this fold, but only if +1 is not the last operation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364764 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for WWM/WQM

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364763 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Use vcc reg bank for amdgcn.wqm.vote

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364762 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix scc->vcc copy handling

This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.

Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364761 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Use and instead of BFE with inline immediate

Zext from s1 is the only case where this should do anything with the
current legal extensions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364760 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add GINodeEquiv for min/max

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364759 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add DAG compat for G_FCANONICALIZE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364758 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for MSA and ASE instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364757 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for atomic instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364756 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for ADJCALLSTACKDOWN, ADJCALLSTACKUP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364755 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Call isLoopExiting for blocks in the loop.

isLoopExiting should only be called for blocks in the loop. A follow
up patch makes this requirement an assertion.

I've updated the usage here, to only match for actual exit blocks. Previously,
it would also match blocks not in the loop.
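
The requirement can be pictured with a toy CFG model (a sketch only; the data structures and function names are assumptions and do not mirror the LLVM API): the query is meaningful only for blocks inside the loop, so callers filter on membership first.

```python
def is_loop_exiting(block, loop_blocks, successors):
    """True if `block`, which must be in the loop, has a successor outside it."""
    assert block in loop_blocks, "only valid for blocks inside the loop"
    return any(s not in loop_blocks for s in successors.get(block, ()))

def exiting_blocks(candidates, loop_blocks, successors):
    """Check loop membership before asking whether a block is exiting."""
    return [b for b in candidates
            if b in loop_blocks and is_loop_exiting(b, loop_blocks, successors)]

loop = {"header", "latch"}
cfg = {
    "header": ["latch", "exit"],
    "latch": ["header"],
    "other": ["exit"],   # branches to the exit block but is not in the loop
    "exit": [],
}
```

Here `other` reaches `exit` too, but it is not part of the loop, so it must not be reported as an exiting block.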

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D63980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364750 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Tests for ((~x) + y) + 1 -> y - x fold (PR42459)

Note that this pattern is not unhandled by instcombine per se:
it does somehow end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

https://bugs.llvm.org/show_bug.cgi?id=42459
https://rise4fun.com/Alive/EPO0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364749 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Add break; to the last switch case

As suggested by jrtc27 in the post-commit review of D60528.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364746 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] CombineShuffleWithExtract - updated description comments. NFCI.

CombineShuffleWithExtract no longer requires that both shuffle ops are extract_subvectors, from the same type or from the same size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364745 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Do minnum->minimum at legalization time instead of building time

The SDAGBuilder behavior stems from the days when we didn't have fast
math flags available in SDAG. We do now and doing the transformation in
the legalizer has the advantage that it also works for vector types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364743 91177308-0d34-0410-b5e6-96231b3b80d8

[benchmark] Disable CMake get_git_version

Disabled CMake get_git_version as it is meaningless for this in-tree
build, and hardcoded a null version.

Not using get_git_version avoids a refresh of the git index that is
executed by get_git_version. Refreshing the index can take a
considerable amount of time if the index needs to be refreshed
(particularly with the mono repo). This situation can arise when
building shared source on a host in VMs.

Differential Revision: https://reviews.llvm.org/D63925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364742 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Tests for x - ~(y) -> x + y + 1 fold (PR42457)

https://bugs.llvm.org/show_bug.cgi?id=42457
https://rise4fun.com/Alive/iFhE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364739 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Omit 'urem' where possible

This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364738 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Copy test for omit urem when possible from TargetLowering

Was added in D63390 / rL364286 to backend, but it makes sense to also handle it here.
https://rise4fun.com/Alive/Zsln

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364737 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Avoid adding too much indirection to pointer-valued variables

This patch addresses PR41675, where a stack-pointer variable is dereferenced
too many times by its location expression, presenting a value on the stack as
the pointer to the stack.

The difference between a stack *pointer* DBG_VALUE and one that refers to a
value on the stack, is currently the indirect flag. However the DWARF backend
will also try to guess whether something is a memory location or not, based
on whether there is any computation in the location expression. By simply
prepending the stack offset to existing expressions, we can accidentally
convert a register location into a memory location, which introduces a
surprise (and unintended) dereference.

The solution is to add DW_OP_stack_value whenever we add a DIExpression
computation to a stack *pointer*. It's an implicit location computed on the
expression stack, thus needs to be flagged as a stack_value.

For the edge case where the offset is zero and the location could be a register
location, DIExpression::prepend will still generate opcodes, and thus
DW_OP_stack_value must still be added.

Differential Revision: https://reviews.llvm.org/D63429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364736 91177308-0d34-0410-b5e6-96231b3b80d8

[SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst

Differential Revision: https://reviews.llvm.org/D60606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364734 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] WLS/LE Code Generation

Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
   to generate intrinsics that guard the loop entry, as well as setting
   the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
   ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
   t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
   instruction.

Differential Revision: https://reviews.llvm.org/D63816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364733 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add more load folding tests for vcvt(t)ps2(u)qq showing missed foldings. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364730 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improve the type checking fast-isel handling of vector bitcasts.

We had a bunch of vector size legality checks for the source type
based on feature flags, but we didn't check the destination type at
all beyond ensuring that it was a "simple" type. But this allowed
the destination to be i128 which isn't legal.

This commit changes the code to use TLI's isTypeLegal logic in
place of all the subtarget checks. It then additionally checks
that the source and dest are vectors.

Fixes 42452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364729 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a DAG combine to replace vector loads feeding a v4i32->v2f64 CVTSI2FP/CVTUI2FP node with a vzload.

But only when the load isn't volatile.

This improves load folding during isel where we only have vzload
and scalar_to_vector+load patterns. We can't have full vector load
isel patterns for the same volatile load issue.

Also add some missing masked cvtsi2fp/cvtui2fp with vzload patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364728 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add some additional load folding tests to vec_int_to_fp.ll/vec_int_to_fp-widen.ll and disable the peephole pass.

Also copy some missing test cases from vec_int_to_fp.ll to vec_int_to_fp-widen.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364727 91177308-0d34-0410-b5e6-96231b3b80d8