git.osdn.net Git - android-x86/external-llvm.git/log

Fuzzer: remove temporary files after we're done with them.

These were just copies of the relevant fuzzer binary with (presumably)
meaningful suffixes, but accounted for more than 10% of my build
directory (> 8GB). Hard drive space is cheap, but not that cheap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326710 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Resolve all template args simultaneously in ResolveMulticlassDefArgs

Summary:
Use the new resolver interface more explicitly, and avoid traversing
all the initializers multiple times.

Change-Id: I679e86988b309d19f25e6cca8b0b14ea150198a6

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43654

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326708 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Resolve all template args simultaneously in AddSubMultiClass

Summary:
Use the new resolver interface more explicitly, and avoid traversing
all the initializers multiple times.

Change-Id: Ia4dcc6d42dd8b65e6079d318c6a202f36f320fee

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43653

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326707 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Resolve all template args simultaneously in AddSubClass

Summary:
Use the new resolver interface more explicitly, and avoid traversing
all the initializers multiple times.

Add a test case for a pattern that was broken by an earlier version
of this change.

An additional change is that we now remove *all* template arguments
after resolving them.

Change-Id: I86c828c8cc84c18b052dfe0f64c0d5cbf3c4e13c

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326706 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Reimplement !foreach using the resolving mechanism

Summary:
This changes the syntax of !foreach so that the first "parameter" is
a new syntactic variable: !foreach(x, lst, expr) will define the
variable x within the scope of expr, and evaluation of the !foreach
will substitute elements of the given list (or dag) for x in expr.

Aside from leading to a nicer syntax, this allows more complex
expressions where x is deeply nested, or even constant expressions
in which x does not occur at all.

!foreach is currently not actually used anywhere in trunk, but I
plan to use it in the AMDGPU backend. If out-of-tree targets are
using it, they can adjust to the new syntax very easily.

Change-Id: Ib966694d8ab6542279d6bc358b6f4d767945a805

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits, tpr

Differential Revision: https://reviews.llvm.org/D43651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326705 91177308-0d34-0410-b5e6-96231b3b80d8
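
To illustrate the substitution semantics described for !foreach(x, lst, expr)
in r326705, here is a minimal Python model. It is only a sketch: the Var class,
the tuple encoding of expressions, and the function names are assumptions made
for this example and are not TableGen's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A syntactic variable such as the x in !foreach(x, lst, expr)."""
    name: str


def substitute(expr, var, value):
    """Return expr with every occurrence of var replaced by value."""
    if isinstance(expr, Var):
        return value if expr == var else expr
    if isinstance(expr, tuple):  # an operator applied to sub-expressions
        return tuple(substitute(e, var, value) for e in expr)
    return expr  # a constant leaf


def foreach(var, lst, expr):
    """Model of !foreach(var, lst, expr): one copy of expr per list element."""
    return [substitute(expr, var, elem) for elem in lst]


x = Var("x")
# x is nested two levels deep inside expr.
nested = foreach(x, [1, 2], ("add", ("mul", x, 2), 1))
# expr is a constant in which x does not occur at all.
constant = foreach(x, [1, 2], "c")
```

Both cases named in the message are covered: a deeply nested x, and a constant
expression that simply gets repeated once per list element.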

TableGen: Introduce an abstract variable resolver interface

Summary:
The intention is to allow us to more easily restructure how resolving is
done, e.g. resolving multiple variables simultaneously, or using the
resolving mechanism to implement !foreach.

Change-Id: I4b976b54a32e240ad4f562f7eb86a4d663a20ea8

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43564

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326704 91177308-0d34-0410-b5e6-96231b3b80d8
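
The idea behind r326704, an abstract resolver that can replace several
variables in one traversal, can be sketched in Python as follows. The class
and function names, and the convention that variable references are strings
starting with "$", are hypothetical and chosen only for this illustration.

```python
from abc import ABC, abstractmethod


class Resolver(ABC):
    """Hypothetical stand-in for an abstract variable resolver."""

    @abstractmethod
    def resolve(self, name):
        """Return the replacement for name, or None to leave it unresolved."""


class MapResolver(Resolver):
    """Resolves several variables at once from a name -> value mapping."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def resolve(self, name):
        return self.mapping.get(name)


def resolve_references(expr, resolver):
    """Walk expr once, asking the resolver about every variable reference."""
    if isinstance(expr, str) and expr.startswith("$"):
        value = resolver.resolve(expr[1:])
        return expr if value is None else value
    if isinstance(expr, tuple):
        return tuple(resolve_references(e, resolver) for e in expr)
    return expr


template = ("concat", "$prefix", ("shl", "$width", 1), "$unbound")
resolved = resolve_references(template, MapResolver({"prefix": "v_", "width": 32}))
```

Because the traversal only talks to the Resolver interface, a different
resolver (for example one that binds a single loop variable) could be plugged
in without touching the walk itself.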

Pass Divergence Analysis data to Selection DAG to drive divergence
dependent instruction selection.

Differential revision: https://reviews.llvm.org/D35267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326703 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add more missing instructions to the Power 9 scheduler

Adding more instructions using InstRW so that we can move away from ItinRW
and ultimately have a complete Power 9 scheduler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326701 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Allow NAME in template arguments in defm in multiclass

Summary:
NAME has already worked for def in a multiclass, since the (prototype)
record including its NAME variable is created before parsing the
superclasses. Since defm's do not have an associated single record,
support for NAME has to be implemented differently here.

Original test cases provided by Artem Belevich (tra)

Change-Id: I933b74f328c0ff202e7dc23a35b78f3505760cc9

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43656

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326700 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Use DefInit::getDef() instead of the type's getRecord()

The former simply makes more sense: we want to access the data here in
the backend, not information about the type.

More importantly, removing users of RecordRecTy::getRecord() allows us
more freedom to refactor the frontend.

Change-Id: Iee8905fd22cdb9b11c42ca03246c03d8fe4dd77f

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326699 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmp] We can discard initial blocks that do other work

Summary:
We can discard initial blocks that do other work; we do not need to
limit ourselves to just the first block in the chain.

Reviewers: courbet, davide

Reviewed By: courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44029

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326698 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add validation to reloc section

We now check that relocation offsets are within range and that the
relocation index is valid.

Also updated tests which contained invalid Wasm files that were
previously not checked.

Differential Revision: https://reviews.llvm.org/D43684

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326697 91177308-0d34-0410-b5e6-96231b3b80d8
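
The two checks named in r326697 (offset within range, index valid) can be
sketched in Python as below. The Reloc fields and the section_size and
num_symbols parameters are hypothetical simplifications, not the actual Wasm
object reader's data structures.

```python
from dataclasses import dataclass


@dataclass
class Reloc:
    offset: int  # byte offset into the target section (hypothetical field)
    index: int   # index of the entity the reloc refers to (hypothetical field)


def validate_relocs(relocs, section_size, num_symbols):
    """Return a list of error strings; an empty list means every reloc passed."""
    errors = []
    for i, reloc in enumerate(relocs):
        if not 0 <= reloc.offset < section_size:
            errors.append(f"reloc {i}: offset {reloc.offset} out of range")
        if not 0 <= reloc.index < num_symbols:
            errors.append(f"reloc {i}: invalid index {reloc.index}")
    return errors
```

Collecting errors instead of stopping at the first one is a choice made for
this sketch only; a real reader may reject the file on the first failure.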

[ARM][Asm] VMOVSRR and VMOVRRS need sequential S registers

These instructions require that the two S registers are adjacent (but not the R
registers), because only the first register is included in the encoding, but we
were not checking this in the assembler.

Differential revision: https://reviews.llvm.org/D44084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326696 91177308-0d34-0410-b5e6-96231b3b80d8
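
The adjacency rule from r326696 can be expressed as a small check. This is a
Python sketch with hypothetical helper names; it assumes registers are written
as s0 through s31 and does not model the assembler's real operand parsing.

```python
import re


def s_register_index(name):
    """Parse 's0'..'s31' into its integer index; raise ValueError otherwise."""
    match = re.fullmatch(r"s(\d+)", name.lower())
    if match is None or int(match.group(1)) > 31:
        raise ValueError(f"not a single-precision register: {name}")
    return int(match.group(1))


def are_sequential_s_registers(first, second):
    """True only if second is the register immediately after first."""
    return s_register_index(second) == s_register_index(first) + 1
```

Only the first register is encoded, so a pair such as s4, s6 cannot be
represented and should be diagnosed rather than silently assembled.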

[WebAssembly] Reorder reloc sections to come between symtab and name

This is required in order to enable relocs to be validated
as they are read in.

Also update tests with new section ordering.

Differential Revision: https://reviews.llvm.org/D43940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326694 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix tests with invalid yaml (required CODE section missing)

Differential Revision: https://reviews.llvm.org/D44023

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326692 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Attach a name to globals similarly to function naming

This allows LLD to print the name for an InputGlobal when encountering
an error.

Differential Revision: https://reviews.llvm.org/D44033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326691 91177308-0d34-0410-b5e6-96231b3b80d8

Fix location of comment in EmitPopInst

Comment about folding return in LDM was not moved along with the
corresponding code in r242714. This commit fixes that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326690 91177308-0d34-0410-b5e6-96231b3b80d8

[Bash-autocompletion] Pass all flags in shell command-line to Clang

Previously, we passed "#" to --autocomplete to indicate that cc1 flags
should be enabled. For example, when -cc1 or -Xclang was passed to bash, bash
executed `clang --autocomplete=#-<flag they want to complete>`.

However, this was not a good implementation because it left the parsing of
-Xclang and -cc1 to the shell. So I changed this to pass all the flags the
shell has, so that Clang can handle them internally.

I had to change many testcases because the API spec changed quite a lot.

Reviewers: teemperor, v.g.vassilev

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D39342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326684 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps][NFC] Improve logging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326683 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Replace usages of X86Subtarget::hasFp256 with hasAVX. Remove hasFP256.

Almost none of these usages were FP specific. And we had no clear guidelines on when to use hasAVX vs hasFP256.

I might also remove hasInt256 since it's an alias for hasAVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326682 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a DAG combine to turn stores of vXi1 constants into scalar stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326679 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a 32-bit mode command line to avx512-mask-op.ll. Add tests for storing v2i1 and v4i1 constants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326678 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add a peekThroughBitcast to MergeStoresOfConstantsOrVecElts to fix a crash if we are storing a bitcast of a constant.

Loading a constant into a k-register in AVX512 requires a bitcast from a scalar constant. In the test case here we have a k-register store that gets split into multiple parts on KNL. MergeConsecutiveStores sees each of these pieces as a consecutive store and looks through the bitcast to find the underlying scalar constant. But when we went to create the combined store we didn't look through the same bitcast.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326677 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][X87] Add X87 folded integer arithmetic tests

Add tests for FIADD/FISUB/FISUBR/FIMUL/FIDIV/FIDIVR

Shows we have more FILD stack usage than necessary (arg load, spill, reload to x87)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326674 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][MMX] Remove completed _mm_cvtsi32_si64 todo

rL322525 - mmx zero constant support
rL322553 - mmx i32 zero extended value
rL326497 - mmx i64 general constant handling

Not all constants are folded; we generate some on the GPRs (similar to SSE build vector) where appropriate

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326673 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix unused variable in release builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326672 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Combine (store (v1i1 (scalar_to_vector (i8 X)))) -> (store (i8 X)).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326670 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Lower v1i1/v2i1/v4i1/v8i1 load/stores to i8 load/store during op legalization if AVX512DQ is not supported.

We were previously doing this with isel patterns. Moving it to op legalization gives us a chance to see the required bitcast earlier. And it lets us remove some isel patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326669 91177308-0d34-0410-b5e6-96231b3b80d8

[CallSiteSplitting] fix use-after-free

Iterating through predecessors of `TailBB` while removing their
terminators leads to a use-after-free, because the predecessor list is
changing on each removal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326668 91177308-0d34-0410-b5e6-96231b3b80d8
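
The hazard fixed in r326668 is mutating a sequence while iterating over it.
Python has no use-after-free, so the sketch below is only an analogy: the
buggy loop silently skips elements instead of touching freed memory. The
function names are hypothetical, and whether the actual fix snapshots the
predecessor list in this way is an assumption.

```python
def remove_all_unsafe(preds):
    """Buggy pattern: mutates preds while iterating over it."""
    for pred in preds:
        preds.remove(pred)  # shifts later elements, so some get skipped
    return preds


def remove_all_safe(preds):
    """Safer pattern: iterate over a snapshot, mutate the original."""
    for pred in list(preds):
        preds.remove(pred)
    return preds
```

Taking the snapshot first means the loop never observes the container in a
half-modified state, which is the same general remedy that applies to a
predecessor list that changes on each removal.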

[CallSiteSplitting] properly split musttail calls

Summary:
`musttail` calls can't be naively split. The split blocks must
include not only the call instruction itself, but also (optional)
`bitcast` and `return` instructions that follow it.

Clone `bitcast` and `ret`, place them into the split blocks, and
remove the tail block when done.

Reviewers: junbuml, mcrosier, davidxl, davide, fhahn

Reviewed By: fhahn

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D43729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326666 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add test for vectors with undef elts; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326661 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] (~X) - (~Y) --> Y - X

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326660 91177308-0d34-0410-b5e6-96231b3b80d8
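
The fold in r326660 follows from ~X = -X - 1 in two's complement, so
(~X) - (~Y) = (-X - 1) - (-Y - 1) = Y - X. The Python sketch below checks this
exhaustively for a small bit width; the function name and the choice of 8 bits
are assumptions for illustration.

```python
def fold_holds(bits=8):
    """Exhaustively check (~x) - (~y) == y - x for all bits-wide values."""
    mask = (1 << bits) - 1
    for x in range(1 << bits):
        for y in range(1 << bits):
            lhs = ((~x & mask) - (~y & mask)) & mask  # wraparound subtraction
            rhs = (y - x) & mask
            if lhs != rhs:
                return False
    return True
```

The masking models fixed-width wraparound arithmetic, so the check also covers
the cases where the subtraction overflows.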

[InstCombine] add tests for notnotsub; NFC

As shown in D44043, we may need this fold in the backend,
but it's also missing in the IR optimizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326659 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] This bit-test TODO has been moved in PR36551

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326658 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove 'else' after return. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326642 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Revert r325320: Import global variables

This caused some links to fail with ThinLTO due to missing symbols as
well as causing some binaries to have failures at runtime. We're working
with the author to get a test case, but want to get the tree green
again.

Further, it appears to introduce a data race. While the test usage of
threads was disabled in r325361 & r325362, that isn't an acceptable fix.
I've reverted both of these as well. This code needs to be thread safe.
Test cases for this are already on the original commit thread.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326638 91177308-0d34-0410-b5e6-96231b3b80d8

[LegalizeVectorTypes] When scalarizing the operand of a unary op like TRUNC, use a SCALAR_TO_VECTOR rather than a single element BUILD_VECTOR to convert back to a vector type.

X86 considers v1i1 a legal type under AVX512 and as such a truncate from a v1iX type to v1i1 can be turned into a scalar truncate plus a conversion to v1i1. We would much prefer a v1i1 SCALAR_TO_VECTOR over a one element BUILD_VECTOR.

During lowering we were detecting the v1i1 BUILD_VECTOR as a splat BUILD_VECTOR like we try to do for v2i1/v4i1/etc. In this case we create (select i1 splat_elt, v1i1 all-ones, v1i1 all-zeroes). That goes through some more legalization and we end up with a CMOV choosing between 0 and 1 in scalar and a scalar_to_vector.

Arguably we could detect the v1i1 BUILD_VECTOR and do this better in X86 target code. But just using a SCALAR_TO_VECTOR in legalization is much easier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326637 91177308-0d34-0410-b5e6-96231b3b80d8

Implementation of MRI "delete" command.

Differential Revision: https://reviews.llvm.org/D43989

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326636 91177308-0d34-0410-b5e6-96231b3b80d8

[AggressiveInstCombine] Use use_empty() instead of !getNumUses(), NFC

use_empty() runs in O(1), whereas getNumUses() runs in O(# uses).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326635 91177308-0d34-0410-b5e6-96231b3b80d8
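
The complexity difference cited in r326635 can be seen in a toy model where
uses are kept in a linked list. This Python sketch is a simplification with
hypothetical method names; it is not LLVM's Value class.

```python
class Value:
    """Toy value whose uses form a singly linked list (a simplification)."""

    def __init__(self):
        self.use_head = None  # first use node, or None when there are no uses

    def add_use(self, user):
        self.use_head = (user, self.use_head)  # prepend a (user, next) node

    def use_empty(self):
        return self.use_head is None  # O(1): looks only at the list head

    def get_num_uses(self):
        count, node = 0, self.use_head
        while node is not None:  # O(number of uses): walks the whole list
            count, node = count + 1, node[1]
        return count
```

When the only question is whether any use exists, checking the head avoids
walking a list that may be long.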

[InstCombine] rearrange visitFMul; NFCI

Put the simplest non-FMF folds first, so it's easier to
see what's left to fix/group/add with the FMF folds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326632 91177308-0d34-0410-b5e6-96231b3b80d8

Add DBG_VALUE support to the linear DAG scheduler

The fast/linear DAG scheduler doesn't lower DBG_VALUEs except for
function entry nodes.

Patch by Joshua Cranmer!

Differential Revision: https://reviews.llvm.org/D43028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326631 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-symbolizer] Use correct path when resolving .gnu_debuglink in .debug

Summary:
The symbolizer was checking for .debug as a subdirectory of the
binary file itself, not of the directory containing the binary. This led to
a failure to find split debug info when it was contained in a .debug directory.

Reviewers: rnk, glider, zturner

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D44025

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326630 91177308-0d34-0410-b5e6-96231b3b80d8
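
The path mix-up described in r326630 can be shown with two small functions.
This is a Python sketch with hypothetical function names and example paths; it
models only the path construction, not the symbolizer's lookup logic.

```python
from pathlib import PurePosixPath


def debuglink_candidate_buggy(binary_path, debuglink_name):
    """Old behaviour as described: treats the binary itself as a directory."""
    return PurePosixPath(binary_path) / ".debug" / debuglink_name


def debuglink_candidate_fixed(binary_path, debuglink_name):
    """Fixed behaviour: .debug sits in the directory containing the binary."""
    return PurePosixPath(binary_path).parent / ".debug" / debuglink_name
```

For a binary at /usr/bin/foo, the first form looks under /usr/bin/foo/.debug,
which cannot exist because foo is a file, while the second looks under
/usr/bin/.debug.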

[Utils] Salvage debug info in block simplification

In stage2 -O3 builds of llc, this results in small but measurable
increases in the number of variables with locations, and in the number
of unique source variables overall.

(According to llvm-dwarfdump --statistics, there are 123 additional
variables with locations, which is just a 0.006% improvement).

The size of the .debug_loc section of the llc dsym increases by 0.004%.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326629 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Generate valignb for shifting shuffles (instead of vdelta)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326627 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Implement MC relaxations for compressed instructions.

Summary:
This patch implements relaxation for RISCV in the MC layer.
The following relaxations are currently handled:
1) Relax C_BEQZ to BEQ and C_BNEZ to BNE.
2) Relax C_J $imm to JAL x0, $imm and C_JAL to JAL ra, $imm.

Reviewers: asb, llvm-commits, efriedma

Reviewed By: asb

Subscribers: shiva0217

Differential Revision: https://reviews.llvm.org/D43055

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326626 91177308-0d34-0410-b5e6-96231b3b80d8
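
A rough Python sketch of the relaxation decision in r326626: keep the
compressed instruction when the branch or jump offset fits its immediate, and
fall back to the full-width instruction otherwise. The table layout and the
function names are hypothetical. The offset widths are those of the RVC
CB-format (9-bit signed) and CJ-format (12-bit signed) encodings. C_BNEZ maps
to BNE here, for which bnez is the assembler alias with x0 as second operand.

```python
# Compressed opcode -> (relaxed opcode, signed bit width of its offset field).
RELAXATIONS = {
    "C_BEQZ": ("BEQ", 9),
    "C_BNEZ": ("BNE", 9),
    "C_J": ("JAL", 12),    # relaxed form links to x0
    "C_JAL": ("JAL", 12),  # relaxed form links to ra
}


def fits_signed(value, bits):
    """True if value is representable as a bits-wide signed integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def select_opcode(opcode, offset):
    """Keep the compressed form if offset fits, otherwise relax it."""
    relaxed, bits = RELAXATIONS[opcode]
    return opcode if fits_signed(offset, bits) else relaxed
```

The sketch ignores the requirement that offsets be even and does not model how
operands are rewritten during relaxation.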

Make llvm::djbHash an inline function.

Differential Revision: https://reviews.llvm.org/D43644

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326625 91177308-0d34-0410-b5e6-96231b3b80d8

[Utils] Salvage debug info in recursive inst deletion

In stage2 -O3 builds of llc, this results in a 0.3% increase in the
number of variables with locations, and a 0.2% increase in the number of
unique source variables overall.

The size of the .debug_loc section of the llc dsym increases by 0.5%.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326621 91177308-0d34-0410-b5e6-96231b3b80d8

[unittests] Make some parseIR calls more readable, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326620 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Avoid cast ExprType to wasm::ValType

This cast was causing invalid signatures to be written
for libcall functions.

Add an MC test which includes a call to builtin memcpy.

Differential Revision: https://reviews.llvm.org/D44037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326618 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again.

Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends.

Differential Revision: https://reviews.llvm.org/D44038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326617 91177308-0d34-0410-b5e6-96231b3b80d8
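
The notion in r326617 of "the minimal Type the constant can fit into" can be
illustrated in Python with a round-trip test. This sketch assumes only the
IEEE half, float and double formats and uses hypothetical function names; it
is not the InstCombine code.

```python
import struct


def fits_exactly(value, fmt):
    """True if value survives a round trip through the given struct format."""
    try:
        return struct.unpack(fmt, struct.pack(fmt, value))[0] == value
    except OverflowError:  # magnitude too large for the narrower format
        return False


def minimal_fp_type(value):
    """Smallest of half/float/double that represents value exactly."""
    for name, fmt in (("half", "<e"), ("float", "<f")):
        if fits_exactly(value, fmt):
            return name
    return "double"
```

Returning a type name rather than a narrowed constant mirrors the change
described above: the legality checks only need the size, and the narrower
constant does not have to be created up front.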

[SystemZ] Fix test cases after r326613

I forgot to check in the updated test cases after the r326613 commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326616 91177308-0d34-0410-b5e6-96231b3b80d8

Reland "[WebAssembly] More uses of uint8_t for single byte values"

Summary:
The original change was D43991 (rL326541) and was reverted by rL326571 and
rL326572. This also adds the necessary MCCodeEmitter patch.

Reviewers: sbc100

Subscribers: jfb, dschuff, sbc100, jgravelle-google, sunfish, llvm-commits, ncw

Differential Revision: https://reviews.llvm.org/D44034

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326614 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Allow LRV/STRV with volatile memory accesses

The byte-swapping loads and stores do not actually perform multiple
accesses to their memory operand, so they are OK to use with volatile
memory operands as well. Remove overly cautious check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326613 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Add support for anyregcc calling convention

This adds back-end support for the anyregcc calling convention
for use with patchpoints.

Since all registers are considered call-saved with anyregcc
(except for 0 and 1 which may still be clobbered by PLT stubs
and the like), this required adding support for saving and
restoring vector registers in prologue/epilogue code for the
first time. This is not used by any other calling convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326612 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Support stackmaps and patchpoints

This adds back-end support for the @llvm.experimental.stackmap and
@llvm.experimental.patchpoint intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326611 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Fix common-code users of stack size

On SystemZ we need to provide a register save area of 160 bytes to
any called function. This size needs to be added when allocating
stack in the function prologue. However, it was not accounted for
as part of MachineFrameInfo::getStackSize(); instead the back-end
used a private routine getAllocatedStackSize().

This is OK for code-gen purposes, but it breaks other users of
the getStackSize() routine, in particular it breaks the recently-
added -stack-size-section feature.

Fix this by updating the main stack size tracked by common code
(in emitPrologue) instead of using the private routine.

No change in code generation intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326610 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Support vector registers in inline asm

This adds support for specifying vector registers for use with inline
asm statements, either via the 'v' constraint or by explicit register
names (v0 ... v31).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326609 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] partly fix FMF for fmul+log2 fold

The code was checking that all of the instructions in the
sequence are 'fast', but that's not necessary. The final
multiply is all that we need to check (tests adjusted).
The fmul doesn't need to be fully 'fast' either, but that
can be another patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326608 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for rL169025; NFC

This narrow fold was added with no motivation or test cases
a bit over 5 years ago. Removing a constant operand is a
good canonicalization? We should handle Y*2.0 too then?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326606 91177308-0d34-0410-b5e6-96231b3b80d8

Fix more spelling mistakes in comments of LLVM Analysis passes

Patch by Reshabh Sharma!

Differential Revision: https://reviews.llvm.org/D43939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326601 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch, InstSimplify] fix m_NaN to work with vector constants and use it

This is NFC for the moment (and independent of any potential NaN semantic
controversy). Besides making the code in InstSimplify easier to read, the
motivation is to eventually allow undef elements in vector constants to
match too. A proposal to add the base logic for that is in D43792.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326600 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Handle VACOPY in isel lowering

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326599 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][BTVER2] Fix throughput of YMM bitwise instructions

These instructions are double-pumped: they are split into two 128-bit ops, which then pass through either FPU pipe.

Found while testing llvm-mca (D43951)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326597 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Reject xmm16-31 in inline asm constraints when AVX512 is disabled

Fixes PR36532

Differential Revision: https://reviews.llvm.org/D43960

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326596 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Allow (fptrunc (fpext X)) to be reduced to a single fpext/fptrunc

If we are only truncating bits from the extend we should be able to just use a smaller extend.

If we are truncating more than the extend we should be able to just use a fptrunc since the presence of the fpext shouldn't affect rounding.

Differential Revision: https://reviews.llvm.org/D43970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326595 91177308-0d34-0410-b5e6-96231b3b80d8
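
The case split in r326595 can be written as a small decision function. This
Python sketch uses a hypothetical function name and assumes the types involved
are IEEE half, float and double, identified by bit width. It relies on fpext
being exact, so the intermediate type does not influence the result.

```python
def fold_trunc_of_ext(src_bits, dst_bits):
    """Single cast equivalent to fptrunc(fpext X), with X of width src_bits
    and the final result of width dst_bits."""
    if dst_bits == src_bits:
        return "none"     # the two casts cancel out
    if dst_bits > src_bits:
        return "fpext"    # only part of the extension was truncated away
    return "fptrunc"      # truncated below the original width
```

For example, extending a float to double and truncating to half behaves like a
direct fptrunc from float to half.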

[X86][x32] Save callee-save register used as base pointer for x32 ABI

For the x32 ABI, since the base pointer register (EBX) is a callee save register
it should be saved before use.

This fixes https://bugs.llvm.org/show_bug.cgi?id=36011

Differential Revision: https://reviews.llvm.org/D42358

Patch by Pratik Bhatu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326593 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fold variable into assert.

Avoids unused variable warnings in Release mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326592 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Add utils/update_cc_test_checks.py

A utility to update LLVM IR in C/C++ FileCheck test files.

Example RUN lines in .c/.cc test files:

// RUN: %clang -S -Os -DXX %s -o - | FileCheck %s
// RUN: %clangxx -S -Os %s -o - | FileCheck -check-prefix=IR %s

Usage:

% utils/update_cc_test_checks.py --llvm-bin=release/bin test/a.cc
% utils/update_cc_test_checks.py --c-index-test=release/bin/c-index-test --clang=release/bin/clang /tmp/c/a.cc

    // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
    // RUN: %clang -emit-llvm -S -Os -DXX %s -o - | FileCheck -check-prefix=AA %s
    // RUN: %clangxx -emit-llvm -S -Os %s -o - | FileCheck -check-prefix=BB %s
    using T =
    #ifdef XX
        int __attribute__((vector_size(16)))
    #else
        short __attribute__((vector_size(16)))
    #endif
        ;

    // AA-LABEL: _Z3fooDv4_i:
    // AA:       entry:
    // AA-NEXT:    %add = shl <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>
    // AA-NEXT:    ret <4 x i32> %add
    //
    // BB-LABEL: _Z3fooDv8_s:
    // BB:       entry:
    // BB-NEXT:    %add = shl <8 x i16> %a, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
    // BB-NEXT:    ret <8 x i16> %add
    T foo(T a) {
      return a + a;
    }

Differential Revision: https://reviews.llvm.org/D42712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326591 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: InstrMapping for G_ZEXT

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326589 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: InstrMapping for G_TRUNC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326588 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define InstrMappings for G_FCMP

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326587 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for @llvm.minnum

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326586 91177308-0d34-0410-b5e6-96231b3b80d8

LoopUnroll: respect pragma unroll when AllowRemainder is disabled

Currently when AllowRemainder is disabled, the pragma unroll count is not
respected even though there is no remainder. This bug causes a loop to be
fully unrolled in many cases even though the user specifies an unroll
count. It especially affects OpenCL/CUDA, since in many cases a loop
contains convergent instructions and currently AllowRemainder is
disabled for such loops.

Differential Revision: https://reviews.llvm.org/D43826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326585 91177308-0d34-0410-b5e6-96231b3b80d8
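
The rule described in r326585 can be sketched as a predicate on the trip count.
This Python model is hypothetical in its names and in returning None for the
cases the message does not describe; it is not the LoopUnroll pass logic.

```python
def pragma_unroll_count(trip_count, pragma_count, allow_remainder):
    """Return the pragma count if it can be honoured, else None."""
    if pragma_count < 1:
        raise ValueError("pragma unroll count must be positive")
    # With no remainder loop needed, the pragma count can be respected even
    # when AllowRemainder is disabled.
    if allow_remainder or trip_count % pragma_count == 0:
        return pragma_count
    return None  # not modelled: left to the pass's other heuristics
```

With a trip count of 16 and a pragma count of 4 there is no remainder, so the
count of 4 is returned regardless of the AllowRemainder setting.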

[ARM] Fix access to stack arguments when re-aligning SP in Armv6m

When an Armv6m function dynamically re-aligns the stack, access to incoming
stack arguments (and to stack area, allocated for register varargs) is done via
SP, which is incorrect, as the SP is offset by an unknown amount relative to the
value of SP upon function entry.

This patch fixes it by making access to "fixed" frame objects be done via FP
when the function needs stack re-alignment. It also changes the access to
"fixed" frame objects to be done via FP (instead of using R6/BP) for the case
when the stack frame contains variable sized objects. This should allow more
objects to fit within the immediate offset of the load instruction.

All of the above via a small refactoring to reuse the existing
`ARMFrameLowering::ResolveFrameIndexReference`.

Differential Revision: https://reviews.llvm.org/D43566

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326584 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Revert accidentally submitted failing test case.

Reverts r326574.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326582 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add missing instructions to the Power 9 scheduler

Adding more instructions using InstRW so that we can move away from ItinRW
and ultimately have a complete Power 9 scheduler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326578 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Check function type indexes

Also update tests containing invalid Wasm files, exposed by the check

Differential Revision: https://reviews.llvm.org/D43954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326577 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Add LLVM for Grad Students to Contributing page.

Adrian Sampson's blog post provides a good and relatively up-to-date
introduction to LLVM. I think this post could be helpful for people wanting
to get started with LLVM.

Reviewers: asb, tonic, silvas, probinson, kristof.beyls, rengolin

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D42904

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326576 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Revert 324317 "Enable the MergeICmps Pass by default."

While working on PR36557.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326575 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Add the test case from PR36557.

Summary: See PR36557.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326574 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit: Remove an extraneous space. NFC

Test commit access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326573 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[WebAssembly] More uses of uint8_t" and "[WebAssembly] Update tests"

This reverts commits r326541 and r326571.

The tests were correct, and were updated with incorrect expectations.
The original commit was broken and should be reverted to get things back
to a working state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326572 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Update tests after r326541

r326541 slightly increased the size of WebAssembly object files
and it broke test/MC/WebAssembly/global-ctor-dtor.ll.

This commit updates the test to unbreak it, also mentioned this to the
author of the original commit in case they don't want it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326571 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix codegen for VLD3/VLD4/VST3/VST4 with WB

Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is
broken due to 2 separate bugs:

1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing
   rules to expand them to non pseudo MIR. These are selected for
   ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD.

2) Selection of the right VLD/VST instruction is broken for loads and
   stores of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called
   with the MIR opcode for fixed writeback (i.e. increment is access size)
   and call getVLDSTRegisterUpdateOpcode() to select an opcode with
   register writeback if the base register update is of a different size.
   Since getVLDSTRegisterUpdateOpcode() only knows about
   VLD1/VLD2/VST1/VST2, the call is currently conditional on the number
   of elements in the vector.

   However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for
   loads and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not
   updated, which later leads to a fixed writeback instruction being
   constructed with an extra operand for the register writeback.

This patch addresses the two issues as follows:
- it adds the necessary mapping from VLD1d64TPseudoWB_register and
  VLD1d64QPseudoWB_register to VLD1d64Twb_register and
  VLD1d64Qwb_register respectively. Like for the existing _fixed
  variants, the cost of these is bumped for unaligned access.
- it changes the logic in SelectVLD and SelectVST to call isVLDfixed
  and isVSTfixed respectively to decide whether the opcode should be
  updated. It also reworks the logic and comments for pushing the
  writeback offset operand and r0 operand to clarify the logic:
  writeback offset needs to be pushed if it's a register writeback,
  r0 needs to be pushed if not and the instruction is a
  VLD1/VLD2/VST1/VST2.

Reviewers: rengolin, t.p.northover, samparker

Reviewed By: samparker

Patch by Thomas Preud'homme <thomas.preudhomme@arm.com>

Differential Revision: https://reviews.llvm.org/D42970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326570 91177308-0d34-0410-b5e6-96231b3b80d8

[LV][CFG] Add irreducible CFG detection for outer loops

This patch adds support for detecting outer loops with irreducible control
flow in LV. Current detection uses SCCs and only works for innermost loops.
This patch adds a utility function that works on any CFG, given its RPO
traversal and its LoopInfoBase. This function is a generalization
of isIrreducibleCFG from lib/CodeGen/ShrinkWrap.cpp. The code in
lib/CodeGen/ShrinkWrap.cpp is also updated to use the new generic utility
function.
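
For readers unfamiliar with the property being detected, here is a minimal,
self-contained sketch of an irreducibility check on a plain adjacency-list
CFG. It is not the utility added by this patch: that one takes an RPO
traversal and a LoopInfoBase, whereas this sketch swaps the loop information
for a naive iterative dominator computation. It relies on the textbook
criterion that a CFG is reducible iff every retreating edge of a depth-first
traversal targets a node that dominates its source. The Graph type and all
function names below are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CFG representation: G[N] lists the successors of node N.
using Graph = std::vector<std::vector<std::size_t>>;

// Depth-first post-order walk starting at N.
static void postOrder(const Graph &G, std::size_t N, std::vector<bool> &Seen,
                      std::vector<std::size_t> &Out) {
  Seen[N] = true;
  for (std::size_t S : G[N])
    if (!Seen[S])
      postOrder(G, S, Seen, Out);
  Out.push_back(N);
}

// Reverse post-order of the nodes reachable from Entry.
std::vector<std::size_t> reversePostOrder(const Graph &G, std::size_t Entry) {
  std::vector<bool> Seen(G.size(), false);
  std::vector<std::size_t> Order;
  postOrder(G, Entry, Seen, Order);
  return std::vector<std::size_t>(Order.rbegin(), Order.rend());
}

// Returns true if the part of G reachable from Entry is irreducible.
bool containsIrreducibleCFG(const Graph &G, std::size_t Entry) {
  assert(Entry < G.size() && "entry must be a node of the graph");
  const std::size_t N = G.size();
  const std::vector<std::size_t> RPO = reversePostOrder(G, Entry);

  // RPO number of every reachable node, plus reachable predecessors.
  std::vector<std::size_t> Num(N, N);
  for (std::size_t I = 0; I < RPO.size(); ++I)
    Num[RPO[I]] = I;
  std::vector<std::vector<std::size_t>> Preds(N);
  for (std::size_t U : RPO)
    for (std::size_t S : G[U])
      Preds[S].push_back(U);

  // Dom[V][D] is true if D dominates V. Start from "everything dominates
  // everything" and shrink the sets until a fixed point is reached.
  std::vector<std::vector<bool>> Dom(N, std::vector<bool>(N, true));
  Dom[Entry].assign(N, false);
  Dom[Entry][Entry] = true;
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t V : RPO) {
      if (V == Entry)
        continue;
      std::vector<bool> New(N, true);
      for (std::size_t P : Preds[V])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      New[V] = true;
      if (New != Dom[V]) {
        Dom[V] = std::move(New);
        Changed = true;
      }
    }
  }

  // An edge U -> S is retreating if S does not come after U in RPO. It is
  // a proper back edge only if S dominates U; anything else means the CFG
  // has a cycle with more than one entry.
  for (std::size_t U : RPO)
    for (std::size_t S : G[U])
      if (Num[S] <= Num[U] && !Dom[U][S])
        return true;
  return false;
}
```

With this sketch, a natural loop such as 0 -> 1 -> 2 -> 1 is reported as
reducible, while a two-entry cycle such as 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 1 is
reported as irreducible. The dominator computation is quadratic per pass and
is only meant for small examples.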

Patch by Diego Caballero <diego.caballero@intel.com>

Differential Revision: https://reviews.llvm.org/D40874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326568 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for @llvm.maxnum

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326567 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove old UNIMPLEMENTED list

All of these are implemented and have appropriate test coverage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326553 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] More uses of uint8_t for single byte values

Summary: It looks like this was missing from D43921.

Reviewers: sbc100

Subscribers: jfb, dschuff, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D43991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326541 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Added a couple of C LTO API interfaces to control the cache policy.
- thinlto_codegen_set_cache_size_bytes to control the absolute size of the cache directory.
- thinlto_codegen_set_cache_size_files to control the number of files in the cache directory.
These functions have been supported in the C++ LTO API for a long time, but were absent from the C LTO API.

Differential Revision: https://reviews.llvm.org/D42446

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326537 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GCN: Promote i16 ctpop

i16 capable ASICs do not support i16 operands for this instruction.
Add tablegen pattern to merge chained i16 additions.
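
As a rough illustration of why promoting the operation is safe, the sketch
below computes a 16-bit population count with a 32-bit operation. It assumes
the promotion zero-extends the i16 operand, so the upper 16 bits add nothing
to the count. This is plain C++ written for illustration, not the backend
code or the tablegen pattern from this change, and the helper name
ctpop16ViaI32 is hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Population count of a 16-bit value, computed on a 32-bit value the way
// a promoted i16 ctpop would be (assumption: promotion by zero-extension).
unsigned ctpop16ViaI32(std::uint16_t X) {
  const std::uint32_t Wide = X; // zero-extension: upper 16 bits are 0
  const unsigned Count =
      static_cast<unsigned>(std::bitset<32>(Wide).count());
  assert(Count <= 16 && "zero-extended bits must not add to the count");
  return Count;
}
```

A sign-extending promotion would not have this property for values with the
top bit set, which is why the sketch is explicit about zero-extension.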

Differential Revision: https://reviews.llvm.org/D43985

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326535 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FPTOSI

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326534 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FPTOUI

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326533 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FMUL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326532 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add more test cases to fpextend.ll.

This includes the test cases from D43970 and additional tests for combining (fptrunc (binop (fpext), (fpext))) where the pre-extended types don't match the trunc and therefore can't be completely removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326528 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FADD

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326526 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_SHL

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326525 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_XOR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326524 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_AND

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326523 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Gather EH instructions in one place. NFC.

Summary:
- Gather EH instructions in one place for easy tracking (more will be
added later)
- Variable name change

Reviewers: dschuff

Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D43742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326522 91177308-0d34-0410-b5e6-96231b3b80d8

[ArgumentPromotion] don't break musttail invariant PR36543

Summary:
Do not break musttail invariant by promoting arguments of musttail
callee or caller.

Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk

Reviewed By: rnk

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D43926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326521 91177308-0d34-0410-b5e6-96231b3b80d8