git.osdn.net Git - android-x86/external-llvm.git/log

Revert "[NewPM] Port Sancov"

This reverts commit 5652f35817f07b16f8b3856d594cc42f4d7ee29c.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366153 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Teach convertToThreeAddress to handle SUB with immediate

We mostly avoid sub with immediate but there are a couple of cases that can create them. One is the add 128, %rax -> sub -128, %rax trick in isel. The other is when a SUB immediate gets created for a compare where both the flags and the subtract value are used. If we are unable to linearize the SelectionDAG to satisfy the flag user and the sub result user from the same instruction, we will clone the sub immediate for the two uses. The one that produces flags will eventually become a compare. The other will have its flag output dead, and could then be considered for LEA creation.

I added additional test cases to add.ll to show that the sub -128 trick gets converted to LEA and a case where we don't need to convert it.
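
The reasoning behind the sub -128 trick can be sketched as follows. This is a simplified model, not the LLVM implementation: the function names are hypothetical, and it only assumes that an 8-bit sign-extended immediate covers -128..127, so +128 does not fit but -128 does. When the flags are dead, either form corresponds to an LEA with the original addend as displacement.

```python
from typing import Tuple


def fits_in_imm8(value: int) -> bool:
    """True if value is representable as a sign-extended 8-bit immediate."""
    return -128 <= value <= 127


def pick_add_or_sub(addend: int) -> Tuple[str, int]:
    """Pick 'add addend' or the equivalent 'sub -addend'.

    Prefers the form whose immediate fits in 8 bits (hypothetical helper).
    """
    if not fits_in_imm8(addend) and fits_in_imm8(-addend):
        return ("sub", -addend)
    return ("add", addend)


def lea_displacement(opcode: str, imm: int) -> int:
    """Displacement an LEA would need to compute the same value.

    Only meaningful when the flag output of the add/sub is dead,
    because LEA does not write flags.
    """
    return imm if opcode == "add" else -imm
```

For an addend of 128 this picks `("sub", -128)`, and `lea_displacement("sub", -128)` gives back 128.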

This showed up in the current codegen for PR42571.

Differential Revision: https://reviews.llvm.org/D64574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366151 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add missing utility methods for exnref type

Summary:
This adds missing utility methods and copy instruction handling for
`exnref` type and also adds tests.

`tee` instruction tests are missing because `isTee` is currently only
used in ExplicitLocals pass and testing that pass in mir requires
serialization of stackified registers in mir files, which is a bit
nontrivial because `MachineFunctionInfo` only has info of vreg numbers
(which are large integers) but not the mir's register numbers. But this
change is quite trivial anyway.

Reviewers: tlively

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64705

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366149 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readelf] Print "File: lib.a(file.o)" info when dumping archive files.

Match GNU readelf.

https://bugs.llvm.org/show_bug.cgi?id=35351

Reviewers: jhenderson, grimar, MaskRay, rupprecht

Reviewed by: jhenderson, MaskRay, grimar

Differential Revision: https://reviews.llvm.org/D64361

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366147 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Rename except_ref type to exnref

Summary:
We agreed to rename `except_ref` to `exnref` for consistency with other
reference types in
https://github.com/WebAssembly/exception-handling/issues/79. This also
renames WebAssemblyInstrExceptRef.td to WebAssemblyInstrRef.td in order
to use the file for other reference types in future.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64703

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366145 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [llvm-lipo] Implement -create (with hardcoded alignments)

This reverts r366142 (git commit 67cee1dc7ee285b03372eb818a3894d35efa7394)

The test is failing on the Windows buildbots. Reverting while I
investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366144 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-lipo] Implement -create (with hardcoded alignments)

Creates universal binary output file from input files. Currently uses
hard coded value for alignment. Want to get the create functionality
approved before implementing the alignment function.

Patch by Anusha Basana <anusha.basana@gmail.com>

Differential Revision: https://reviews.llvm.org/D64102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366142 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Simplify regcopy.mir

Summary:
This deletes the ll templates from the functions because they don't need
them (mir files need ll templates only when they have function calls or
BB names that are not numbers).

This also renames the filename to `reg-copy.mir`, because I'm planning
to add some more `reg-*.mir` soon.

Reviewers: tlively

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366140 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Assembler: support special floats: infinity / nan

Summary:
These are emitted as identifiers by the InstPrinter, so we should
parse them as such. These could potentially clash with symbols of
the same name, but that is out of our (the WebAssembly backend) control.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64770

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366139 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Enable merging m0 initializations.

Summary:
Enable hoisting and merging m0 defs that are initialized with the same
immediate value. Fix a bug where removed instructions are not considered
to interfere with other inits, and make sure not to hoist inits before block
prologues.

Reviewers: rampitec, arsenm

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366135 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Print BEQZL and BNEZL pseudo instructions

One of the reasons is to be compatible with GNU tools.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366133 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Use standalone MUBUF load patterns

We already do this for the flat and DS instructions, although it is
certainly uglier and more verbose.

This will allow using separate pattern definitions for extload and
zextload. Currently we get away with using a single PatFrag with
custom predicate code to check if the extension type is a zextload or
anyextload. The generic mechanism the global isel emitter understands
treats these as mutually exclusive. I was considering making the
pattern emitter accept zextload or sextload extensions for anyextload
patterns, but in global isel, the different extending loads have
distinct opcodes, and there is currently no mechanism for an opcode
matcher to try multiple (and there probably is very little need for
one beyond this case).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366132 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll+LoopUnswitch] do not transform loops containing callbr

Summary:
There is currently a correctness issue when unrolling loops containing
callbr's where their indirect targets are being updated correctly to the
newly created labels, but their operands are not. This manifests in
unrolled loops where the second and subsequent copies of callbr
instructions have blockaddresses of the label from the first instance of
the unrolled loop, which would result in nonsensical runtime control
flow.

For now, conservatively do not unroll the loop. In the future, I think
we can pursue unrolling such loops provided we transform the cloned
callbr's operands correctly.

Such a transform and its legalities are being discussed in:
https://reviews.llvm.org/D64101

Link: https://bugs.llvm.org/show_bug.cgi?id=42489
Link: https://groups.google.com/forum/#!topic/clang-built-linux/z-hRWP9KqPI
Reviewers: fhahn, hfinkel, efriedma

Reviewed By: fhahn, hfinkel, efriedma

Subscribers: efriedma, hiraditya, zzheng, dmgreen, llvm-commits, pirama, kees, nathanchance, E5ten, craig.topper, chandlerc, glider, void, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64368

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366130 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen/GlobalISel: Fix handling of truncstore patterns

This was failing to import the AMDGPU truncstore patterns. The
truncating stores from 32-bit to 8/16 were then somehow being
incorrectly selected to a 4-byte store.

A separate check is emitted for the LLT size in comparison to the
specific memory VT, which looks strange to me but makes sense based on
the hierarchy of PatFrags used for the default truncstore PatFrags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366129 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Add address space to matchers

Currently AMDGPU uses a CodePatPred to check address spaces from the
MachineMemOperand. Introduce a new first class property so that the
existing patterns can be easily modified to use the new generated
predicate, which will also be handled for GlobalISel.

I would prefer these to match against the pointer type of the
instruction, but that would be difficult to get working with
SelectionDAG compatibility. This is much easier for now and will avoid
a painful tablegen rewrite for all the loads and stores.

I'm also not sure if there's a better way to encode multiple address
spaces in the table, rather than putting the number to expect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366128 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Allow scalar s1 and/or/xor

If a 1-bit value is in a 32-bit VGPR, the scalar opcodes set SCC to
whether the result is 0. If the inputs are SCC, these can be copied to
a 32-bit SGPR to produce an SCC result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366125 91177308-0d34-0410-b5e6-96231b3b80d8

ARM MTE stack sanitizer.

Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.

It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differences to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.

The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.

Reviewers: pcc, hctim, vitalybuka, ostannard

Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366123 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_AND/G_OR/G_XOR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366121 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Don't constrain source register of VCC copies

This is a hack until I come up with a better way of dealing with the
pseudo-register banks used for boolean values. If the use instruction
constrains the register, the selector for the def instruction won't
see that the bank was VCC. A 1-bit SReg_32 could ambiguously have
been SCCRegBank or VCCRegBank in wave32.

This is necessary to successfully select branches with an and/or/xor
condition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366120 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix selecting vcc->vcc bank copies

The extra test change is correct, although how it arrives there is a
bug that needs work. With wave32, the test for isVCC ambiguously
reports true for an SCC or VCC source. A new allocatable pseudo
register class for SCC may be necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366119 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366118 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix handling of sgpr (not scc bank) s1 to VCC

This was emitting a copy from a 32-bit register to a 64-bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366117 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Custom legalize G_INSERT_VECTOR_ELT

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366116 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Custom legalize G_EXTRACT_VECTOR_ELT

Turn the constant cases into G_EXTRACTs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366115 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix G_ICMP for wave32

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366114 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement narrowScalar for vector extract/insert indexes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366113 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix missing immarg from interp intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366110 91177308-0d34-0410-b5e6-96231b3b80d8

[FileCheck] Store line numbers as optional values

Summary:
Processing of command-line variable definitions and the logic around
implicit not directives both reuse parsing code that expects a line
number to be defined. So far, a special line number of 0 was used for
those users of the parsing code where a line number does not make sense.
This commit instead represents line numbers as Optional values so that
they can be None for those cases.
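
The difference between a sentinel line number and an optional one can be sketched as below. This is an illustrative Python analogue only: FileCheck itself is C++, and the function name and the returned strings are hypothetical, not taken from the patch.

```python
from typing import Optional


def describe_location(line_number: Optional[int]) -> str:
    """Describe where a parsed expression came from.

    None means "no line number applies" (for example a definition given
    on the command line), so no magic value such as 0 is needed.
    """
    if line_number is None:
        return "<no line number>"
    return f"line {line_number}"
```

With an optional value, a caller cannot mistake the "no line" case for a real line, which a sentinel of 0 does not prevent.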

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64639

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366109 91177308-0d34-0410-b5e6-96231b3b80d8

[cmake] Don't set install rules for tblgen if building utils is disabled

Summary:
This is a follow up to D64032. Afterwards if building utils is disabled
and cross compilation is attempted, CMake will complain that adding
`install()` directives to targets with EXCLUDE_FROM_ALL set is "undefined".
Indeed, it appears depending on the CMake version and the selected
Generator, the install rule will error because the underlying target isn't
built. Fix that by not adding the install rule if building utils is not
requested. Note that this doesn't prevent building tblgen as a
dependency in a non-cross build, even if building tools is disabled.

Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D64225

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366108 91177308-0d34-0410-b5e6-96231b3b80d8

Expand comment about how StringsToBuckets was computed, and add more entries

The construction was explained in
https://reviews.llvm.org/D44810?id=139526#inline-391999 but reading the code
shouldn't require hunting down old reviews to understand it.

The precomputed list was missing an entry for the empty list case, and
one entry at the very end. (The current last entry is the last one where
3 * BucketCount fits in a signed int, but the reference implementation
uses unsigneds as far as I can tell, so there's room for one more entry.)

No behavior change for inputs seen in practice.

Differential Revision: https://reviews.llvm.org/D64738

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366107 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE vector for 64bit types

We need to make sure that we are sensibly dealing with vectors of types v2i64
and v2f64, even if most of the time we cannot generate native operations for
them. This mostly adds a lot of testing, plus fixes up a couple of the issues
found. And, or and xor can be legal for v2i64, and shifts combining needs a
slight fixup.

Differential Revision: https://reviews.llvm.org/D64316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366106 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Assembler: recognize .init_array as data section.

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366104 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Widen vector extracts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366103 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Handle llvm.amdgcn.if.break

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366102 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove reserved value accidentally left in for gfx908

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366101 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select llvm.amdgcn.end.cf

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366099 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try to keep FP casted+truncated+extracted vector element out of GPRs

inttofp (trunc (extelt X, 0)) --> inttofp (extelt (bitcast X), 0)

We have pseudo-vectorization of scalar int to FP casts, so this tries to
make that more likely by replacing a truncate with a bitcast. I didn't see
any test diffs starting from 'uitofp', so I left that as a TODO. We can't
only match the shorter trunc+extract pattern because there's an opposing
transform somewhere, so we infinite loop. Waiting to try this during
lowering is another possibility.
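
Why the truncate can be replaced by a bitcast can be checked with a small model. This sketch assumes a little-endian lane layout (as on x86) and uses a 2 x i64 vector viewed as 4 x i32 purely as an example; the helper names are hypothetical and the commit does not restrict itself to these types.

```python
import struct
from typing import Sequence


def trunc_of_extract(v2i64: Sequence[int]) -> int:
    """trunc (extelt X, 0): take 64-bit element 0, keep the low 32 bits."""
    return v2i64[0] & 0xFFFFFFFF


def extract_of_bitcast(v2i64: Sequence[int]) -> int:
    """extelt (bitcast X), 0: view the same 128 bits as four 32-bit lanes."""
    raw = struct.pack("<2Q", *v2i64)      # little-endian 2 x i64
    return struct.unpack("<4I", raw)[0]   # reinterpret as 4 x i32, lane 0
```

Both functions return the same value for any input, so the integer never has to leave the vector register before the int-to-FP cast.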

A motivating case is shown in PR39975 and included in the test diffs here:
https://bugs.llvm.org/show_bug.cgi?id=39975

Differential Revision: https://reviews.llvm.org/D64710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366098 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-lib] Add a dependency to intrinsics_gen to the LLVMLibDriver build

Summary:
Occasionally the build of LLVMLibDriver will fail because Attributes.inc has not been generated yet. Add an explicit dependency, so that we can guarantee that the file has been generated before LLVMLibDriver is built.

##[error]llvm\include\llvm\IR\Attributes.h(73,0): Error C1083: Cannot open include file: 'llvm/IR/Attributes.inc': No such file or directory
llvm\include\llvm/IR/Attributes.h(73): fatal error C1083: Cannot open include file: 'llvm/IR/Attributes.inc': No such file or directory [LLVMLibDriver.vcxproj]

Reviewers: asmith

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64357

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366097 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Return UNDEF from LowerScalarImmediateShift when the shift amount is out of range.

I think we only turn out of range shifts to undef when
all elements are out of range or the shift amount is a splat out
of range. I'm not sure which, I didn't check.

During lowering we can split a shift where some elements
are out of range into multiple shifts. This can create a
new shift with a splat shift amount that is out of range.

This patch returns undef for this case.

Fixes PR42615.

Differential Revision: https://reviews.llvm.org/D64699

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366096 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add 24-bit mul intrinsics

Insert these during codegenprepare.

This works around a DAG issue where generic combines eliminate the and
asserting the high bits are zero, which then exposes an unknown read
source to the mul combine. It isn't worth the hassle of trying to
insert an AssertZext or something to try to deal with it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366094 91177308-0d34-0410-b5e6-96231b3b80d8

Add some release notes for 9.0 release

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366093 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Copy missing predicate from pseudo to real

NFC at the moment, needed for a future commit.

Differential Revision: https://reviews.llvm.org/D64761

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366092 91177308-0d34-0410-b5e6-96231b3b80d8

[FunctionAttrs] Remove readonly and writeonly assertion

There are scenarios where mutually recursive functions may cause the SCC
to contain both read only and write only functions. This removes an
assertion when adding read attributes which caused a crash with the
provided test case, and instead just doesn't add the attributes.

Patch by Luke Lau <luke.lau@intel.com>

Differential Revision: https://reviews.llvm.org/D60761

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366090 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Minor formatting in ARMInstrMVE.td. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366089 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select easy cases for G_BUILD_VECTOR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366087 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for G_CONCAT_VECTORS

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366086 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for reductions that might be better with more horizontal ops; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366082 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "r366069: [PatternMatch] Implement matching code for LibFunc"

Reason: the change introduced a layering violation by adding a
dependency on IR to Analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366081 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-nm] Fix inconsistent grammar

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366080 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Regenerated packss.ll test file.

Not sure what went wrong in rL366077....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366079 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add PACKSS with zero shuffle masks.

This is an example of expansion due to D61129 - it should combine back to a PACKSS with a zero operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366077 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Start adding ORCv1 to ORCv2 transition tips to the ORCv2 doc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366075 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] fixed scheduler crash in gfx908

For some reason the scheduler can send down an SUnit without an
instruction.

Differential Revision: https://reviews.llvm.org/D64709

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366074 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Add a note on how to locally tell git to ignore build dir

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366072 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC][GFX9][GFX10] Added support of GET_DOORBELL message

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D64729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366071 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch] Implement matching code for LibFunc

Summary: Provides m_LibFunc pattern that can be used to match LibFuncs.

Reviewers: spatel, hfinkel, efriedma, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D42047

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366069 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Corrected encoding of src0 for DS_GWS_* instructions

See bug 42599: https://bugs.llvm.org/show_bug.cgi?id=42599

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D64716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366067 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] isTargetShuffleEquivalent - assert the expected mask is correctly formed. NFCI.

While we don't make any assumptions about the actual mask, assert that the expected mask only contains valid mask element values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366066 91177308-0d34-0410-b5e6-96231b3b80d8

[Testing] Add missing "REQUIRES: asserts"

This broke after r366048 / https://reviews.llvm.org/D63923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366065 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Remove "else-after-return". NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366064 91177308-0d34-0410-b5e6-96231b3b80d8

PDB HashTable: Make iterator key type const

Having the hash table key change during iteration is bad, so make it
impossible. Nothing relied on the key type not being const.

(This is also necessary to be able to call the const version of
iterator_facade_base::operator->(). Nothing calls this, and nothing
will, but I tried using it locally during development and it took me a
while to understand what was going wrong.)

Also rename the iterator typedef to const_iterator.

No behavior change.

Differential Revision: https://reviews.llvm.org/D64641

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366060 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r366052 "[obj2yaml] - Rework tool's error reporting logic for ELF target."

No changes, LLD code was updated in r366057.

Original commit message:

ELF.h contains two getSymbol methods
which seem to be used only from obj2yaml.

One of these methods calls the other, which in turn
contains an untested error message which doesn't
provide enough information.

The problem is that after improving just that message,
obj2yaml will not show it
("Error reading file: yaml: Invalid data was
encountered while parsing the file" message will be shown instead),
because the tool's internal error handling is based on the ErrorOr<> class, which
stores an error code and as a result can only show a predefined error string, which
isn't very useful.

In this patch, I rework obj2yaml's error reporting system
for ELF targets to use the Error and Expected<> classes.
Also, I improve the error message produced
by getSymbol for demonstration of the new functionality.

Differential revision: https://reviews.llvm.org/D64631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366058 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE Vector Shifts

This adds basic lowering for MVE shifts. There are many shifts in MVE, but the
instructions handled here are:
VSHL (imm)
VSHRu (imm)
VSHRs (imm)
VSHL (vector)
VSHL (register)

MVE, like NEON before it, doesn't have shift right by a vector (or register).
We instead have to negate the amount and shift in the opposite direction. This
means we have to convert any SHR's into a form of SHL (that is still signed or
unsigned) with a negated condition and selecting from there. MVE still does
have shifting by an immediate for SHL, ASR and LSR.

This adds lowering for these and for register forms, which work well for shift
lefts but may require an extra fold of neg(vdup(x)) -> vdup(neg(x)) to potentially
work optimally for right shifts.
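
The "negate the amount and shift left" idea can be modelled as below. This is a deliberately simplified sketch, not the ARM lowering code: it assumes 32-bit lanes, shift amounts in the range 0..31, and hypothetical helper names; details of the real instructions (such as which bits of the amount register are read) are not modelled.

```python
from typing import List, Sequence

MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def vshl_u(lanes: Sequence[int], amounts: Sequence[int]) -> List[int]:
    """Unsigned shift left by a signed per-lane amount.

    A negative amount shifts right (logical) instead.
    """
    out = []
    for x, s in zip(lanes, amounts):
        x &= MASK32
        out.append((x << s) & MASK32 if s >= 0 else x >> -s)
    return out


def vshl_s(lanes: Sequence[int], amounts: Sequence[int]) -> List[int]:
    """Signed variant: a negative amount is an arithmetic shift right."""
    out = []
    for x, s in zip(lanes, amounts):
        x = to_signed32(x)
        out.append(to_signed32(x << s) if s >= 0 else x >> -s)
    return out


def lshr(lanes: Sequence[int], amounts: Sequence[int]) -> List[int]:
    """Logical shift right expressed as an unsigned left shift by -amount."""
    return vshl_u(lanes, [-s for s in amounts])


def ashr(lanes: Sequence[int], amounts: Sequence[int]) -> List[int]:
    """Arithmetic shift right expressed as a signed left shift by -amount."""
    return vshl_s(lanes, [-s for s in amounts])
```

The signed and unsigned variants have to stay distinct because they differ only in how the vacated high bits are filled on a right shift.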

Differential Revision: https://reviews.llvm.org/D64212

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366056 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Move Shifts after Bits. NFC

This just moves the shift instruction definitions further down the
ARMInstrMVE.td file, to make positioning patterns slightly more natural.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366054 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r366052 "[obj2yaml] - Rework tool's error reporting logic for ELF target."

Seems it broke LLD:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/48434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366053 91177308-0d34-0410-b5e6-96231b3b80d8

[obj2yaml] - Rework tool's error reporting logic for ELF target.

ELF.h contains two getSymbol methods
which seem to be used only from obj2yaml.

One of these methods calls the other, which in turn
contains an untested error message which doesn't
provide enough information.

The problem is that after improving just that message,
obj2yaml will not show it
("Error reading file: yaml: Invalid data was
encountered while parsing the file" message will be shown instead),
because the tool's internal error handling is based on the ErrorOr<> class, which
stores an error code and as a result can only show a predefined error string, which
isn't very useful.

In this patch, I rework obj2yaml's error reporting system
for ELF targets to use the Error and Expected<> classes.
Also, I improve the error message produced
by getSymbol for demonstration of the new functionality.

Differential revision: https://reviews.llvm.org/D64631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366052 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Adjust how NEON shifts are lowered

This adjusts the way that we lower NEON shifts to use a DAG target node, not
via a neon intrinsic. This is useful for handling MVE shift operations in the
same way. It also renames some of the immediate shift nodes for
consistency, and moves some of the processing of immediate shifts into
LowerShift allowing it to capture more cases.

Differential Revision: https://reviews.llvm.org/D64426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366051 91177308-0d34-0410-b5e6-96231b3b80d8

[Loop Peeling] Fix the bug with IDom setting for exit loops

It is possible that a loop exit has two predecessors in the loop body.
In this case, after peeling, the iDom of the exit should be a clone of
the iDom of the original exit, not a clone of a block coming to this exit.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64618

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366050 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Pass unfiltered list of arguments to getIntrinsicInstCost.

We do not compute the scalarization overhead in getVectorIntrinsicCost
and TTI::getIntrinsicInstrCost requires the full arguments list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366049 91177308-0d34-0410-b5e6-96231b3b80d8

[Loop Peeling] Enable peeling for loops with multiple exits

This CL enables peeling of loops with multiple exits where
one exit is from the latch and the others are basic blocks with
a call to deopt.

The peeling is enabled under the flag which is false by default.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D63923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366048 91177308-0d34-0410-b5e6-96231b3b80d8

DeveloperPolicy: fix a typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366046 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Deduce "nonnull" attribute

Summary:
Porting nonnull attribute to attributor.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366043 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUtils] Extend the scope of getLoopEstimatedTripCount

With this patch the getLoopEstimatedTripCount function will
also accept loops with more than one exit, provided that
all exits except the latch block end with a call to deopt.

These side exits should not impact the estimated trip count.

Reviewers: reames, mkuper, danielcdh
Reviewed By: reames
Subscribers: fhahn, lebedev.ri, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64553

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366042 91177308-0d34-0410-b5e6-96231b3b80d8

Remove set but unused variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366041 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopInfo] Introduce getUniqueNonLatchExitBlocks utility function

Extract the code from LoopUnrollRuntime into utility function to
re-use it in D63923.

Reviewers: reames, mkuper
Reviewed By: reames
Subscribers: fhahn, hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64548

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366040 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Support fp128 libcalls

On PowerPC, IEEE 754 quadruple-precision libcall names use "kf" instead of "tf".

In libgcc, libgcc/config/rs6000/float128-sed converts TF names to KF
names. This patch implements its 24 substitution rules.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D64282

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366039 91177308-0d34-0410-b5e6-96231b3b80d8
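
A minimal sketch of the TF-to-KF renaming described in the fp128 libcall commit above. The table holds only a handful of illustrative names; it is not the set of 24 substitution rules from float128-sed, and the function name is hypothetical.

```python
# Illustrative subset only; the real rule set lives in libgcc's float128-sed.
TF_TO_KF = {
    "__addtf3": "__addkf3",
    "__subtf3": "__subkf3",
    "__multf3": "__mulkf3",
    "__divtf3": "__divkf3",
}


def ppc_fp128_libcall(name):
    """Return the PowerPC 'kf' spelling of a libcall, or the name unchanged."""
    return TF_TO_KF.get(name, name)


print(ppc_fp128_libcall("__addtf3"))  # __addkf3
print(ppc_fp128_libcall("memcpy"))    # memcpy
```

An explicit table is used here instead of a blanket string replacement so that unrelated names containing "tf" are left alone.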

[BPF] add unit tests for preserve_{array,union,struct}_access_index intrinsics

This is a followup patch for https://reviews.llvm.org/D61810/new/,
which adds new intrinsics preserve_{array,union,struct}_access_index.

Currently, only BPF backend utilizes preserve_{array,union,struct}_access_index
intrinsics, so all tests are compiled with BPF target.

https://reviews.llvm.org/D61524 already added some tests for these
intrinsics, but some of them are pretty complex.
This patch added a few unit test cases focusing on individual intrinsic
functions.

Also made a few clarifications to the language reference for these intrinsics.

Differential Revision: https://reviews.llvm.org/D64606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366038 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][PowerPC] Add the test block-placement.mir

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366037 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Look through constant Int2Ptr/Ptr2Int expressions

Summary:
This is analogous to the int2ptr/ptr2int instruction handling introduced
in D54956.

Reviewers: fhahn, efriedma, spatel, nlopes, sanjoy, lebedev.ri

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64708

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366036 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Separate the memory size of vzext_load/vextract_store from the element size of the result type. Use them to improve the codegen of v2f32 loads/stores with sse1 only.

Summary:
SSE1 only supports v4f32, but it does have instructions like movlps/movhps that load/store 64 bits of memory.

This patch breaks the connection between the node VT of the vzext_load/vextract_store patterns and the memory VT, enabling a v4f32 node with a 64-bit memory VT. I've used i64 as the memory VT here. I've written the PatFrag predicate to just check the store size, not the specific VT. I think the VT will only matter for CSE purposes. We could use v2f32, but if we want to start using these operations in more places, a simple integer type might make the most sense.

I'd like to maybe use this same thing for SSE2 and later as well, but that will need more work to be supported by EltsFromConsecutiveLoads to avoid regressing lit tests. I'd maybe also like to combine bitcasts with these load/store nodes now that the types are disconnected. And I'd also like to consider canonicalizing (scalar_to_vector + load) to vzext_load.

If you want I can split the mechanical tablegen stuff where I added the 32/64 off from the sse1 change.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64528

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366034 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetParser][ARM] Account for dependencies when processing target features

Teaches ARM::appendArchExtFeatures to account for dependencies when processing
target features: for example, with -march=armv8.1-m.main+mve.fp+nofp,
mve.fp should be discarded too. (Split from D63936)

Differential Revision: https://reviews.llvm.org/D64048

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366031 91177308-0d34-0410-b5e6-96231b3b80d8
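
The dependency handling described in the TargetParser commit above can be sketched as follows. The dependency table, the function name, and the simple left-to-right processing are assumptions for illustration; ARM::appendArchExtFeatures itself works on LLVM's own feature tables.

```python
# Hypothetical dependency table: an extension maps to the extensions it needs.
DEPENDS_ON = {"mve.fp": {"mve", "fp"}}


def resolve_extensions(march):
    """Split '-march' text into (base, enabled extensions), honouring 'no<ext>'."""
    base, *modifiers = march.split("+")
    enabled = []
    for mod in modifiers:
        if mod.startswith("no"):
            removed = mod[2:]
            # Drop the extension itself and anything that depends on it.
            enabled = [
                ext for ext in enabled
                if ext != removed and removed not in DEPENDS_ON.get(ext, set())
            ]
        elif mod not in enabled:
            enabled.append(mod)
    return base, enabled


print(resolve_extensions("armv8.1-m.main+mve.fp+nofp"))
# ('armv8.1-m.main', [])
```

Without the dependency check, "+nofp" would leave "mve.fp" in the list even though it cannot work with floating point disabled.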

[LV] Exclude loop-invariant inputs from scalar cost computation.

Loop invariant operands do not need to be scalarized, as we are using
the values outside the loop. We should ignore them when computing the
scalarization overhead.

Fixes PR41294

Reviewers: hsaito, rengolin, dcaballe, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D59995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366030 91177308-0d34-0410-b5e6-96231b3b80d8
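
A toy cost model for the idea in the loop vectorizer commit above: operands that are loop-invariant are skipped when summing the scalarization overhead. The function name, the per-lane extract cost, and the flat cost formula are assumptions, not the vectorizer's actual cost computation.

```python
def scalarization_overhead(operands, loop_invariant, vector_width, extract_cost=1):
    """Sum a per-lane extract cost over operands that vary inside the loop.

    operands       -- operand names of the instruction being scalarized
    loop_invariant -- set of operand names defined outside the loop
    vector_width   -- number of lanes (the vectorization factor)
    """
    return sum(
        vector_width * extract_cost
        for op in operands
        if op not in loop_invariant  # invariant values are used as scalars already
    )


print(scalarization_overhead(["a", "n"], {"n"}, 4))  # 4
print(scalarization_overhead(["a", "n"], set(), 4))  # 8
```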

[clang][Driver][ARM] Favor -mfpu over default CPU features

When processing the command line options march, mcpu and mfpu, we store
the implied target features in a vector. The change D62998 introduced a
temporary vector, where the processed features get accumulated. When
calling DecodeARMFeaturesFromCPU, which sets the default features for
the specified CPU, we certainly don't want to override the features
that have been explicitly specified on the command line. Therefore, the
default features should appear first in the final vector. This problem
became evident once I added the missing (unhandled) target features in
ARM::getExtensionFeatures.

Differential Revision: https://reviews.llvm.org/D63936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366027 91177308-0d34-0410-b5e6-96231b3b80d8
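
Why the ordering in the driver commit above matters can be shown with a small sketch, assuming that later entries in the feature vector override earlier ones. The feature names and helper functions are hypothetical, not the clang driver's code.

```python
def effective_features(feature_list):
    """Fold '+name' / '-name' entries into a dict; later entries win."""
    state = {}
    for entry in feature_list:
        state[entry[1:]] = entry[0] == "+"
    return state


def combine(cpu_defaults, explicit):
    """Place CPU defaults first so explicit command line choices override them."""
    return cpu_defaults + explicit


defaults = ["+vfp4", "+neon"]   # hypothetical defaults implied by -mcpu
explicit = ["-vfp4"]            # hypothetical result of an explicit -mfpu
print(effective_features(combine(defaults, explicit)))
# {'vfp4': False, 'neon': True}
```

Appending the defaults after the explicit entries would flip "vfp4" back on, which is the behaviour the commit avoids.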

[GitSVN][NFC] Mark dry-run commits as such in the log output

Summary: This helps to avoid worries about the "dry run flag" while testing.

Reviewers: jyknight, rnk, mehdi_amini

Subscribers: bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64697

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366023 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add assume context test; NFC

Baseline test for D37215.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366021 91177308-0d34-0410-b5e6-96231b3b80d8

[Hashing] hash_1to3_bytes - avoid trunc(v + zext(x)) NFCI.

MSVC complains about the extension to uint64_t for an addition followed by truncation back to uint32_t - add an explicit uint32_t cast to avoid this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366020 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add test for sub-with-flags opportunity (PR40483); NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366019 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit "[BitcodeReader] Validate OpNum, before accessing Record array."

This recommits r365750 (git commit 8b222ecf2769ee133691f208f6166ce118c4a164)

Original message:

   Currently invalid bitcode files can cause a crash, when OpNum exceeds
   the number of elements in Record, like in the attached bitcode file.

   The test case was generated by clusterfuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15698

   Reviewers: t.p.northover, thegameg, jfb

   Reviewed By: jfb

   Differential Revision: https://reviews.llvm.org/D64507

   llvm-svn: 365750

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366018 91177308-0d34-0410-b5e6-96231b3b80d8
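
The check described in the recommitted BitcodeReader change above amounts to validating an index before using it. A minimal sketch, with a hypothetical exception type and helper name rather than the reader's real interfaces:

```python
class InvalidRecord(Exception):
    """Raised when a record does not contain the requested operand."""


def read_operand(record, op_num):
    """Return (operand, next index), rejecting out-of-range operand numbers."""
    if op_num < 0 or op_num >= len(record):
        raise InvalidRecord(
            f"operand {op_num} out of range for record of size {len(record)}"
        )
    return record[op_num], op_num + 1


print(read_operand([7, 8], 1))  # (8, 2)
try:
    read_operand([7, 8], 2)
except InvalidRecord as err:
    print("rejected:", err)
```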

[BitcodeReader] Use tighter upper bound to validate forward references.

At the moment, bitcode files with invalid forward references can easily
cause the bitcode reader to run out of memory, by creating a forward
reference with a very high index.

We can use the size of the bitcode file as an upper bound, because a
valid bitcode file can never contain more records. This should be
sufficient to fail early in most cases. The only exception is large
files with invalid forward references close to the file size.

There are a couple of clusterfuzz runs that fail with out-of-memory
because of very high forward references and they should be fixed by this
patch.

A concrete example for this is D64507, which causes out-of-memory on
systems with low memory, like the hexagon upstream bots.

Reviewers: t.p.northover, thegameg, jfb, efriedma, hfinkel

Reviewed By: jfb

Differential Revision: https://reviews.llvm.org/D64577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366017 91177308-0d34-0410-b5e6-96231b3b80d8
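
The bound described in the forward reference commit above can be sketched as a toy value list that refuses to grow past the file size. The class, its fields, and the exact comparison are assumptions for illustration, not the bitcode reader's data structures.

```python
class ForwardRefTooLarge(Exception):
    """Raised when a forward reference index cannot belong to a valid file."""


class ValueList:
    def __init__(self, file_size_bytes):
        # A valid file cannot hold more records than it has bytes.
        self.upper_bound = file_size_bytes
        self.values = []

    def get_forward_ref(self, index):
        """Return the slot for `index`, growing the list only within the bound."""
        if index > self.upper_bound:
            raise ForwardRefTooLarge(f"index {index} exceeds bound {self.upper_bound}")
        while len(self.values) <= index:
            self.values.append(None)  # placeholder for a not-yet-defined value
        return self.values[index]


refs = ValueList(file_size_bytes=1024)
refs.get_forward_ref(10)
print(len(refs.values))  # 11
try:
    refs.get_forward_ref(10**12)
except ForwardRefTooLarge as err:
    print("rejected:", err)
```

The check fails before any allocation, so a huge index in a small file no longer exhausts memory; as the commit notes, a large file with an invalid index close to its size would still pass.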

VirtRegMap - add missing initializers. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366016 91177308-0d34-0410-b5e6-96231b3b80d8

SlotIndexes - add missing initializer. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366015 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Add missing initializers for OutlinedFunction. NFCI.

Appeases MSVC/cppcheck.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366014 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove offset of 8 from the call to FuseInst for UNPCKLPDrr folding added in r365287.

This was copy/pasted from above and I forgot to change it. We just
need the default offset of 0 here.

Fixes PR42616.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366011 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor][Fix] Never override given argument numbers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366009 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add sign and zero extend patterns for MVE

The vmovlb instructions can be used to sign or zero extend vector registers
between types. This adds some patterns for them and relevant testing. The
VBICIMM generation is also put behind a hasNEON check (as is already done for
VORRIMM).

Code originally by David Sherwood.

Differential Revision: https://reviews.llvm.org/D64069

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366008 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE VNEG instruction patterns

This selects integer VNEG instructions, which can be especially useful with shifts.

Differential Revision: https://reviews.llvm.org/D64204

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366006 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE integer abs

Similar to floating point abs, we also have instructions for integers.

Differential Revision: https://reviews.llvm.org/D64027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366005 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE integer min and max

This simply makes the MVE integer min and max instructions legal and adds the
relevant patterns for them.

Differential Revision: https://reviews.llvm.org/D64026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366004 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE VRINT support

This adds support for the floor/ceil/trunc/... series of instructions,
converting to various forms of VRINT. They use the same suffixes as their
floating point counterparts. There is no VRINTR, so nearbyint is expanded.

Also added a copysign test, to show it is expanded.

Differential Revision: https://reviews.llvm.org/D63985

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366003 91177308-0d34-0410-b5e6-96231b3b80d8
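
The selection described in the VRINT commit above can be pictured as a lookup from rounding operation to VRINT variant. The table below follows the usual Arm rounding-mode suffixes and is an illustrative assumption, not a copy of the patterns in the patch; the function name is hypothetical.

```python
# Assumed mapping from rounding operation to VRINT variant (by suffix).
VRINT_FOR_ROUNDING_OP = {
    "floor": "vrintm",  # round toward minus infinity
    "ceil": "vrintp",   # round toward plus infinity
    "trunc": "vrintz",  # round toward zero
    "round": "vrinta",  # round to nearest, ties away from zero
    "rint": "vrintx",   # round using the current mode
}


def select_vrint(op):
    """Return the VRINT variant for `op`, or None if the op must be expanded."""
    return VRINT_FOR_ROUNDING_OP.get(op)


print(select_vrint("floor"))      # vrintm
print(select_vrint("nearbyint"))  # None
```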

[ARM] MVE minnm and maxnm instructions

This adds the patterns for minnm and maxnm from the fminnum and fmaxnum nodes,
similar to scalar types.

Original patch by Simon Tatham

Differential Revision: https://reviews.llvm.org/D63870

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366002 91177308-0d34-0410-b5e6-96231b3b80d8