git.osdn.net Git - android-x86/external-llvm.git/log

[dsymutil] Correctly handle DW_TAG_label

This patch contains logic for handling DW_TAG_label that's present in
darwin's dsymutil implementation, but not yet upstream.

Differential revision: https://reviews.llvm.org/D43438

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325600 91177308-0d34-0410-b5e6-96231b3b80d8

[vim] Recognize more FileCheck comments

Summary:
Currently vim syntax highlighting recognizes 'CHECK:' as a special
comment, but not CHECK-DAG, CHECK-NOT and other CHECKs. This patch
adds rules for these comments.

Reviewers: chandlerc, compnerd, rogfer01

Reviewed By: rogfer01

Subscribers: rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D43289

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325599 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] remove unneeded dyn_cast to prevent unused variable warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325597 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] remove compound fdiv pattern folds

These are fdiv-with-constant-divisor, so they already become
reciprocal multiplies. The last gap for vector ops should be
closed with rL325590.

It's possible that we're missing folds for some edge cases
with denormal intermediate constants after deleting these,
but there are no tests for those patterns, and it would be
better to handle denormals more consistently (and less
conservatively) as noted in TODO comments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325595 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C)
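
As an illustration of the X/C --> X * (1/C) rewrite with a non-splat constant divisor, here is a minimal C++ sketch. It is not the InstCombine code: `Vec4`, `fdivByConst` and `fmulByRecip` are names made up for this example. The divisors are all powers of two, so each reciprocal is exactly representable and both forms give identical results. For other constants the two forms may differ in the last bit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-element "vector" type used only for this sketch.
using Vec4 = std::array<double, 4>;

// Original form: element-wise division by a constant vector.
Vec4 fdivByConst(const Vec4 &X, const Vec4 &C) {
  Vec4 R{};
  for (std::size_t I = 0; I < X.size(); ++I)
    R[I] = X[I] / C[I];
  return R;
}

// Folded form: element-wise multiply by the reciprocal of each constant.
Vec4 fmulByRecip(const Vec4 &X, const Vec4 &C) {
  Vec4 R{};
  for (std::size_t I = 0; I < X.size(); ++I) {
    assert(C[I] != 0.0 && "sketch assumes non-zero constant divisors");
    R[I] = X[I] * (1.0 / C[I]); // 1/C can be computed once, ahead of time
  }
  return R;
}
```

With divisors {2, 4, 8, 16}, `fdivByConst` and `fmulByRecip` return the same vector for any finite input.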

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325590 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Correct the definition of cvt.d.w

An upcoming patch, D41434, changes the ordering of the matcher table
for assembly. This patch corrects the definition of the normal MIPS
cvt.d.w not to be available in microMIPS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325589 91177308-0d34-0410-b5e6-96231b3b80d8

[DEBUGINFO] Add support for emission of the inlined strings.

Summary:
Patch adds an option for emission of inlined strings rather than
.debug_str section.

Reviewers: echristo, jlebar

Subscribers: eraman, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D43390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325583 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when needed

Current implementation always allocates the parameter save area conservatively
for fastcc functions. There is no reason to allocate the parameter save area if
all the parameters can be passed via registers.

Differential Revision: https://reviews.llvm.org/D42602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325581 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Fix alignment calculation of stack objects in Hexagon bit tracker

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325580 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Regenerate XOR tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325579 91177308-0d34-0410-b5e6-96231b3b80d8

[VectorLegalizer] Fix uint64_t typo in ExpandUINT_TO_FLOAT (PR36391)

ExpandUINT_TO_FLOAT can accept vXi32 or vXi64 inputs, so we need to use a uint64_t shift to generate the 2^(BW/2) constant.

No test case, unfortunately, as no upstream target uses this, but it's affecting a downstream target.
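
The class of bug described here can be sketched as follows. The original code is not shown in the message, so this only illustrates why the shift operand must be 64 bits wide. `twoPowHalfWidth` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Returns 2^(BW/2) for an element width BW of 32 or 64 bits.
// Hypothetical helper for illustration; not the LLVM implementation.
uint64_t twoPowHalfWidth(unsigned BW) {
  assert((BW == 32 || BW == 64) && "sketch handles i32 and i64 only");
  // Writing `1 << (BW / 2)` would shift a 32-bit int. For BW == 64 the
  // shift amount is 32, which is undefined for a 32-bit operand, so the
  // constant 1 has to be widened to 64 bits first.
  return UINT64_C(1) << (BW / 2);
}
```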

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325578 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Mark -1 as cheap in xor's for thumb1

We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1
would not otherwise be considered a cheap constant. This prevents the
-1's from being pulled out into constants and potentially hoisted.

Differential Revision: https://reviews.llvm.org/D43451

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325573 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mc] - Produce R_X86_64_PLT32 for "call/jmp foo".

For instructions like call foo and jmp foo, this patch changes the
relocation produced from R_X86_64_PC32 to R_X86_64_PLT32.
The relocation can be used as a marker for 32-bit PC-relative branches.
The linker will reduce a PLT32 relocation to PC32 if the function is defined locally.

Differential revision: https://reviews.llvm.org/D43383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325569 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] stop buffer_store being moved illegally

Summary:
The machine instruction scheduler was illegally moving a buffer store
past a buffer load with the same descriptor and offset. Fixed by marking
buffer ops as mayAlias and isAliased. This may be overly conservative,
and we may need to revisit.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D43332

Change-Id: Iff3173d9e0653e830474546276ab9d30318b8ef7

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325567 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] - Don't crash on unclosed frame.

llvm-mc can crash when
there is a .cfi_startproc without a matching .cfi_endproc:

.text
.globl foo
foo:
.cfi_startproc

Testcase shows the issue, patch fixes it.

Differential revision: https://reviews.llvm.org/D43456

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325564 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][CET]: Adding full coverage of MC encoding for the CET instructions.<NFC>

NFC.
Adding MC regression tests to cover the CET instructions.
This patch is part of a larger task to cover MC encoding of all X86 isa sets started in revision: https://reviews.llvm.org/D39952

Reviewers: zvi, craig.topper, RKSimon, AndreiGrischenko, oren_ben_simhon
Differential Revision: https://reviews.llvm.org/D41329

Change-Id: I9c133d4ba07508ce8fd738a1230edd586e2c2f1b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325561 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics.

The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325559 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove GCCBuiltin from a bunch of intrinsics that aren't used by clang and should be removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325552 91177308-0d34-0410-b5e6-96231b3b80d8

Report fatal error in the case of out of memory

This is the second part of recommit of r325224. The previous part was
committed in r325426, which deals with C++ memory allocation. Solution
for C memory allocation involved functions `llvm::malloc` and similar.
This was a fragile solution because it caused ambiguity errors in some
cases. In this commit the new functions have names like `llvm::safe_malloc`.

The relevant part of original comment is below, updated for new function
names.

Analysis of failures in the case of out-of-memory errors can be tricky on
Windows. Such an error emerges at the point where the memory allocation function
fails, but manifests itself when the null pointer is used. These two points
may be distant from each other. Besides, subsequent runs may not exhibit the
allocation error.

In some cases memory is allocated by a call to some of C allocation
functions, malloc, calloc and realloc. They are used for interoperability
with C code, when allocated object has variable size and when it is
necessary to avoid call of constructors. In many calls the result is not
checked for null pointer. To simplify checks, new functions are defined
in the namespace 'llvm': `safe_malloc`, `safe_calloc` and `safe_realloc`.
They behave as corresponding standard functions but produce fatal error if
allocation fails. This change replaces the standard functions like 'malloc'
in the cases when the result of the allocation function is not checked
for null pointer.

Finally, there is plain C code that uses malloc and similar functions. If
the result is not checked, an assert statement is added.
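
A minimal sketch of the idea behind `safe_malloc`, `safe_calloc` and `safe_realloc`, assuming a stand-in fatal-error reporter. The `sketch` namespace and `reportFatal` are made up for this example, and the real LLVM error handler is not reproduced here.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

namespace sketch {

// Stand-in for a fatal-error reporter (assumption; not the LLVM handler).
[[noreturn]] inline void reportFatal(const char *Msg) {
  std::fprintf(stderr, "fatal error: %s\n", Msg);
  std::abort();
}

// Each wrapper behaves like the standard function but never returns null.
// Note: this sketch treats every null result as a failure, including the
// null that some platforms return for zero-size requests.
inline void *safe_malloc(std::size_t Sz) {
  void *Result = std::malloc(Sz);
  if (Result == nullptr)
    reportFatal("Allocation failed");
  return Result;
}

inline void *safe_calloc(std::size_t Count, std::size_t Sz) {
  void *Result = std::calloc(Count, Sz);
  if (Result == nullptr)
    reportFatal("Allocation failed");
  return Result;
}

inline void *safe_realloc(void *Ptr, std::size_t Sz) {
  void *Result = std::realloc(Ptr, Sz);
  if (Result == nullptr)
    reportFatal("Allocation failed");
  return Result;
}

} // namespace sketch
```

Callers can then use the returned pointer without a null check, because the failure is reported at the allocation site rather than at a later null dereference.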

Differential Revision: https://reviews.llvm.org/D43010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325551 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first.

This is a follow on commit to r[x] where we fix the other direction of copy.
For this case, after converting the source from gpr32 -> fpr32, we use a
subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land.

https://reviews.llvm.org/D43444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325550 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Mark XOP vpmac* and vpmadc intrinsics as being commutative so that tablegen will generate patterns with the load in operand 0.

This allows loads to be folded during isel without the peephole pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325548 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make XOP VPCOM instructions commutable to fold loads during isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325547 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make a helper function for commuting AVX512 VPCMP immediates since we do it in two places.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325546 91177308-0d34-0410-b5e6-96231b3b80d8

[GISel]: Add pattern matchers for G_BITCAST/PTRTOINT/INTTOPTR

Adds pattern matchers for the above along with unit tests for the same.
https://reviews.llvm.org/D43479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325542 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] use CreateWithCopiedFlags to reduce code; NFCI

Also, move the folds with constants closer to make it easier to follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325541 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[mem2reg] Use range loops (NFCI)"

This reverts commit r325532.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325539 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use vpmovq2m/vpmovd2m for truncate to vXi1 when possible.

Previously we used vptestmd, but the scheduling data for SKX says vpmovq2m/vpmovd2m is lower latency. We already used vpmovb2m/vpmovw2m for byte/word truncates. So this is more consistent anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325534 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math

It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this
should be conservatively better than what we have right now. GCC allows this with
only -freciprocal-math.

The last test is changed to show a case that is expected to fold, but we need D43398.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325533 91177308-0d34-0410-b5e6-96231b3b80d8

[mem2reg] Use range loops (NFCI)

Summary:
Several for loops in PromoteMemoryToRegister.cpp leave their increment
expression empty, instead incrementing the iterator within the for loop
body. I believe this is because these loops were previously implemented
as while loops; see https://reviews.llvm.org/rL188327.

Incrementing the iterator within the body of the for loop instead of
in its increment expression makes it seem like the iterator will be
modified or conditionally incremented within the loop, but that is not
the case in these loops.

Instead, use range loops.
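
The two loop shapes can be contrasted in a small sketch. The functions below are illustrative only and are not taken from PromoteMemoryToRegister.cpp.

```cpp
#include <cassert>
#include <vector>

// Before: the increment expression is left empty and the iterator is
// advanced inside the body, which suggests (wrongly) that the loop may
// skip or modify elements conditionally.
int sumOldStyle(const std::vector<int> &Values) {
  int Sum = 0;
  for (auto I = Values.begin(), E = Values.end(); I != E;) {
    int V = *I++;
    Sum += V;
  }
  return Sum;
}

// After: a range loop makes it obvious that every element is visited once.
int sumRangeLoop(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```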

Test Plan: `check-llvm`

Reviewers: davide, bkramer

Reviewed By: davide, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43473

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325532 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] refactor fdiv with constant dividend folds; NFC

The last fold that used to be here was not necessary. That's a
combination of 2 folds (and there's a regression test to show that).

The transforms are guarded by isFast(), but that should be loosened.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325531 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] move fdiv tests; NFC

Also, use vector constants just to prove that already works.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325530 91177308-0d34-0410-b5e6-96231b3b80d8

[Coroutines] Move debug statement before assert

Summary:
Move a debug statement to above where an assertion is hit, so that the debug
statement can be inspected before a stack trace.

Test Plan: `check-llvm`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325529 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Use the full filename in --add-gnu-debuglink

Summary:
The current implementation was writing the file name without the extension
whereas GNU objcopy writes the full filename. With this change GDB will now
load the .debug file instead of silently ignoring it.
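
The difference between the two behaviours can be sketched as below. The helper names are hypothetical and this is not the llvm-objcopy code. The sketch also assumes that only the file name, without leading directories, is recorded in the link.

```cpp
#include <cassert>
#include <string>

// Full file name, extension included (the behaviour after this change,
// as described above). Leading directories are stripped (assumption).
std::string baseName(const std::string &Path) {
  const std::string::size_type Slash = Path.find_last_of('/');
  return Slash == std::string::npos ? Path : Path.substr(Slash + 1);
}

// File name with the extension removed (the previous behaviour, as
// described above), which does not name the actual .debug file.
std::string stemOnly(const std::string &Path) {
  const std::string Name = baseName(Path);
  const std::string::size_type Dot = Name.find_last_of('.');
  return Dot == std::string::npos ? Name : Name.substr(0, Dot);
}
```

For a path such as `build/out/app.debug`, `baseName` yields `app.debug` while `stemOnly` yields `app`.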

Reviewers: jakehehrlich, jhenderson

Reviewed By: jakehehrlich

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325528 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Stop swapping the operands of AVX512 setge.

We swapped the operands and used setle, but I don't see any reason to do that. I think this is a holdover from SSE where we swap and then invert to use pcmpgt. But with AVX512 we don't want an invert so we won't use pcmpgt. So there's no need to swap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325527 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Reduce the number of isel pattern variations needed for VPTESTM/VPTESTNM matching.

Canonicalize EQ/NE PCMPM to have build vector all zeros on the RHS so we don't have to pattern match it in both locations. This significantly reduces the number of isel patterns needed since we also had to multiply it out with loads being in either operand of the 'and' input node and in the 'and' masking node.

This removes over 24000 bytes from the isel table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325526 91177308-0d34-0410-b5e6-96231b3b80d8

bitcode support change for fast flags compatibility

Summary: Following the discussion, each vendor needs a way to keep the old fast flags and the new fast flags in the auto-upgrade path of the IR upgrader. This revision addresses that issue.

Patched by Michael Berg

Reviewers: qcolombet, hans, steven_wu

Reviewed By: qcolombet, steven_wu

Subscribers: dexonsmith, vsk, mehdi_amini, andrewrk, MatzeB, wristow, spatel

Differential Revision: https://reviews.llvm.org/D43253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325525 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Make note of existing waitcnt instrs; this is add-on work related to suppression of redundant waitcnt instrs. It is necessary to make note of these existing waitcnt instrs so that we do not fall into an infinite loop when handling loops. Also, [NFC] some minor code clean-up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325524 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] ComputeKnownBits - add support for SMIN+SMAX clamp patterns

If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO) then we can deduce that the number of signbits (zeros/ones) will be at least the minimum of the LO and HI constants.

ComputeKnownBits equivalent of D43338.

Differential Revision: https://reviews.llvm.org/D43463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325521 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Increased vector length for global/constant loads.

Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords.

Author: FarhanaAleen

Reviewed By: rampitec

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D43275

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325518 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Update DominatorTree compare in case roots are different

The compare function, unusually, returns false on same, true on
different. This fixes the conditions for different roots.

Reviewed as a part of D41298.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325517 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Refactor AppleAccelTable

Summary:
This commit separates the abstract accelerator table data structure
from the code for writing out an on-disk representation of a specific
accelerator table format. The idea is that former (now called
AccelTable<T>) can be reused for the DWARF v5 accelerator tables
as-is, without any further customizations.

Some bits of the emission code (now living in the EmissionContext class)
can be reused for DWARF v5 as well, but the subtle differences in the
layout of various subtables mean the sharing is not always possible.
(Also, the individual emit*** functions are fairly simple so there's a
tradeoff between making a bigger general-purpose function, and two
smaller targeted functions.)

Another advantage of this setup is that more of the serialization logic
can be hidden in the .cpp file -- I have moved declarations of the
header and all the emission functions there.

Reviewers: JDevlieghere, aprantl, probinson, dblaikie

Subscribers: echristo, clayborg, vleschuk, llvm-commits

Differential Revision: https://reviews.llvm.org/D43285

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325516 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI CostModel] change default cost of FP ops to 1 (PR36280)

This change was mentioned at least as far back as:
https://bugs.llvm.org/show_bug.cgi?id=26837#c26
...and I found a real program that is harmed by this:
Himeno running on AMD Jaguar gets 6% slower with SLP vectorization:
https://bugs.llvm.org/show_bug.cgi?id=36280
...but the change here appears to solve that bug only accidentally.

The div/rem costs for x86 look very wrong in some cases, but that's already true,
so we can fix those in follow-up patches. There's also evidence that more cost model
changes are needed to solve SLP problems as shown in D42981, but that's an independent
problem (though the solution may be adjusted after this change is made).

Differential Revision: https://reviews.llvm.org/D43079

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325515 91177308-0d34-0410-b5e6-96231b3b80d8

Bring back r323297.

It was reverted because it broke the grub build. The grub build broke
because grub does its own relocation processing and was
not handling R_386_PLT32. Since grub has no dynamic linker, the fix is
trivial: handle R_386_PLT32 exactly like R_386_PC32.

On the report it was noted that they are using
-fno-integrated-assembler. The upstream GAS (starting with
451875b4f976a527395e9303224c7881b65e12ed) will already be producing a
R_386_PLT32 anyway, so they have to update their code one way or the
other.

Original message:

Don't assume a null GV is local for ELF and MachO.

This is already a simplification, and should help with avoiding a plt
reference when calling an intrinsic with -fno-plt.

With this change we return false for null GVs, so the caller only
needs to check the new metadata to decide if it should use foo@plt or
*foo@got.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325514 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Fix tests breaking after r325505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325512 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Add GraphTraits for FunctionSummaries

Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.

Third attempt - moved function from lambda to static function due to build failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325506 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[CodeGen] Move printing '\n' from MachineInstr::print to MachineBasicBlock::print"

This reverts commit r324681.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325505 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] combineTruncateWithSat - use truncateVectorWithPACK down to 64-bit subvectors

Add support for chaining PACKSS/PACKUS down to 64-bit vectors by using only a single 128-bit input.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325494 91177308-0d34-0410-b5e6-96231b3b80d8

[Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics

With this patch in place, when a new-format TBAA tag is available
for a memory-transfer intrinsic call, we prefer propagating that
new-format tag. Otherwise, we fall back to the old approach where
we try to construct a proper TBAA access tag from 'tbaa.struct'
metadata.

Differential Revision: https://reviews.llvm.org/D41543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325488 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-opt-fuzzer] Add another pack of passes for continuous fuzzing

Differential Revision: https://reviews.llvm.org/D43384

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325487 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Set the program address space in the data layout

This adds the program memory address space setting to the AVR data
layout.

This setting was very recently added under r325479.

At the moment, there are no uses of this setting. In the future, things
such as switch lookup tables should reside there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325481 91177308-0d34-0410-b5e6-96231b3b80d8

Add default address space for functions to the data layout (1/3)

Summary:
This adds initial support for letting targets specify which address
spaces their functions should reside in by default.

If a function is created by a frontend, it will get the default address space specified in the DataLayout, unless the frontend explicitly uses a more general `llvm::Function` constructor. Function address spaces will become a part of the bitcode and textual IR forms, as we do not have access to a data layout whilst parsing LL.

It will be possible to write IR that explicitly has `addrspace(n)` on a function. In this case, the function will reside in the specified space, ignoring the default in the DL.

This is the first step towards placing functions into the correct
address space for Harvard architectures.

Full patchset
* Add program address space to data layout D37052
* Require address space to be specified when creating functions D37054
* [clang] Require address space to be specified when creating functions D37057

Reviewers: pcc, arsenm, kparzysz, hfinkel, theraven

Reviewed By: theraven

Subscribers: arichardson, simoncook, rengolin, wdng, uabelho, bjope, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D37052

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325479 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Fix a lowering bug in AVRISelLowering.cpp

The parseFunctionArgs() method was directly reading the
arguments from a Function object, but it should have used the
arguments supplied by the SelectionDAGBuilder.

This was causing
the lowering code to only lower one argument, not two in some cases.

Thanks to @brainlag on GitHub for coming up with the working fix!

Patch-by: @brainlag on GitHub
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325474 91177308-0d34-0410-b5e6-96231b3b80d8

Add LanaiMCTargetDesc.h to LanaiInstrInfo.h to make it self-contained
with instruction enum definitions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325473 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Correct a typo I made in combineToExtendCMOV recently.

We're accidentally checking that the same node is a constant twice instead of checking the other node.

This isn't a functional problem since we didn't do anything below that explicitly requires constants. It just means we may have introduced a sign_extend or zero_extend that won't fold out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325469 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch, InstSimplify] enhance m_AllOnes() to ignore undef elements in vectors

Loosening the matcher definition reveals a subtle bug in InstSimplify (we should not
assume that because an operand constant matches that it's safe to return it as a result).

So I'm making that change here too (that diff could be independent, but I'm not sure how
to reveal it before the matcher change).

This also seems like a good reason to *not* include matchers that capture the value.
We don't want to encourage the potential misstep of propagating undef values when it's
not allowed/intended.

I didn't include the capture variant option here or in the related rL325437 (m_One),
but it already exists for other constant matchers.
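
The loosened matcher and the related InstSimplify hazard can be sketched as follows. This is a toy model, not the PatternMatch implementation: `Elt`, `matchAllOnesIgnoringUndef` and `allOnesResult` are hypothetical names, and requiring at least one defined element is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a constant vector element that may be undef.
struct Elt {
  bool IsUndef;
  int32_t Value;
};

// Loosened matcher: every defined element must be -1 (all ones); undef
// elements are ignored. At least one defined element is required here.
bool matchAllOnesIgnoringUndef(const std::vector<Elt> &Vec) {
  bool SawDefined = false;
  for (const Elt &E : Vec) {
    if (E.IsUndef)
      continue;
    if (E.Value != -1)
      return false;
    SawDefined = true;
  }
  return SawDefined;
}

// When a fold wants to produce "all ones", build a fresh constant instead
// of returning the matched operand, which may still hold undef elements.
std::vector<Elt> allOnesResult(std::size_t NumElts) {
  return std::vector<Elt>(NumElts, Elt{false, -1});
}
```

Returning the matched operand itself would propagate its undef lanes into the result, which is the misstep the message warns about.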

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325466 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests with vector undef elts; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325465 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unused assertion variable warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325464 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied
to gpr register banks.

PR36345.

rdar://36478867

Differential Revision: https://reviews.llvm.org/D43310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325463 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits.

These are needed for operations on fp16 types in a later patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325462 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch] reformatting and comment clean-ups; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325461 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Replace hand-written scope_exit with make_scope_exit.

No functionality change intended.
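
For readers unfamiliar with the idiom, a scope-exit helper can be sketched as below. This is a simplified stand-in written for this example, not the LLVM header. `ScopeExit` and `makeScopeExit` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Runs the stored callable when the object goes out of scope.
template <typename Callable> class ScopeExit {
  Callable Fn;
  bool Engaged = true; // false once the guard has been moved from

public:
  explicit ScopeExit(Callable F) : Fn(std::move(F)) {}
  ScopeExit(ScopeExit &&Other)
      : Fn(std::move(Other.Fn)), Engaged(Other.Engaged) {
    Other.Engaged = false;
  }
  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;
  ~ScopeExit() {
    if (Engaged)
      Fn();
  }
};

// Factory that deduces the callable type, so lambdas can be used directly.
template <typename Callable> ScopeExit<Callable> makeScopeExit(Callable F) {
  return ScopeExit<Callable>(std::move(F));
}
```

A guard created with `auto Guard = makeScopeExit([&] { /* cleanup */ });` runs its cleanup on every exit path from the enclosing scope, which is what a hand-written version would otherwise have to replicate.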

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325460 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Coalesce Copy Zero during instruction selection

Add special case for copy of zero to avoid a double copy.

Differential Revision: https://reviews.llvm.org/D36104

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325459 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] Return true in enableMultipleCopyHints().

Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: Yonghong Song

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325457 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make masked pcmpeq commutable during isel so we can fold loads in other operand to the shorter encoding.

Previously we used the immediate encoding if the load was in operand 0 and the short encoding if the load was in operand 1.

This added an insane number of bytes to the size of the isel table. I'm wondering if we should always use the immediate form during isel and change to the short form during emission. This would remove the need to pattern match every combination for both the immediate form and the short form during isel. We could do the same with vpcmpgt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325456 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add -show-mc-encoding to the avx512-vec-cmp.ll test and add test case to show that we're failing to use the shorter pcmpeq encoding when the memory argument is the first argument.

This can't be spotted without showing the encodings since they have the same mnemonic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325455 91177308-0d34-0410-b5e6-96231b3b80d8

Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries

Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.

Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).

Reverted due to buildbot failures

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325454 91177308-0d34-0410-b5e6-96231b3b80d8

Fix Wparentheses warning. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325451 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] ComputeNumSignBits - add support for SMIN+SMAX clamp patterns

If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO) then we can deduce that the number of signbits will be at least the minimum of the LO and HI constants.

I haven't bothered with the UMIN/UMAX equivalent as (1) we don't have any current use cases and (2) I wonder if we'd be better off immediately falling back to ComputeKnownBits for UMIN/UMAX which already has optimization patterns useful for unsigned cases.
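
The clamp reasoning can be checked with a small scalar sketch, assuming 32-bit values. `numSignBits` and `clampedSignBits` are hypothetical helpers, not the SelectionDAG code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of leading bits that are copies of the sign bit (including the
// sign bit itself) in a 32-bit value.
unsigned numSignBits(int32_t V) {
  const uint32_t Bits = static_cast<uint32_t>(V);
  const uint32_t Sign = Bits >> 31;
  unsigned Count = 0;
  for (int I = 31; I >= 0 && ((Bits >> I) & 1u) == Sign; --I)
    ++Count;
  return Count;
}

// Lower bound on the sign bits of any X clamped to [Lo, Hi]: the minimum
// of the sign-bit counts of the two constant bounds.
unsigned clampedSignBits(int32_t Lo, int32_t Hi) {
  assert(Lo <= Hi && "sketch expects an ordered clamp range");
  return std::min(numSignBits(Lo), numSignBits(Hi));
}
```

For a clamp to [-128, 127] both bounds have 25 sign bits, so every clamped value has at least 25 sign bits in a 32-bit type.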

Differential Revision: https://reviews.llvm.org/D43338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325450 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] SimplifyDemandedVectorElts - add support for VECTOR_INSERT_ELT

Differential Revision: https://reviews.llvm.org/D43431

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325449 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Add GraphTraits for FunctionSummaries

Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.

Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325448 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS][MSA] Convert vector integer min/max opcodes to use generic implementation

Found while investigating D43338

Simon^3 - the LLVM project needs more Simons.

Differential Revision: https://reviews.llvm.org/D43433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325447 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add LLVM tests for the vcvtr builtins

Follow up of Clang commit r325351; this adds the LLVM tests, which
were also missing.

Differential Revision: https://reviews.llvm.org/D43395

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325443 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Revert r324172 now r323991 was reverted

This fixes the build, now that r325421 was committed to revert r323991.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325441 91177308-0d34-0410-b5e6-96231b3b80d8

Made test dbg_value_fastisel.ll specific to AArch64 fast-isel.

Some buildbots failed on this test (rL325438) because they don't
build all targets. I set the triple to aarch64 and moved the test
to test/CodeGen/AArch64/fast-isel-dbg-value.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325440 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add 'sahf' to getHostCPUFeatures so -march=native will pick it up correctly.

Summary: We probably mostly get this right due to family/model/stepping mapping to CPU names. But we should detect it explicitly.

Reviewers: RKSimon, echristo, dim, spatel

Reviewed By: dim

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43418

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325439 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo][FastISel] Fix dropping dbg.value()

Summary:
https://llvm.org/PR36263 shows that when compiling at -O0 a dbg.value()
instruction (that remains from an original dbg.declare()) is dropped
by FastISel. Since FastISel selects instructions by iterating a basic
block backwards, it drops the dbg.value if one of its operands is not
yet instantiated by a previously selected instruction.

Instead of calling 'lookUpRegForValue()' we can call 'getRegForValue()',
which will insert a placeholder for the operand to be filled in
when continuing the instruction selection.
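
The difference between the two lookups can be illustrated with a toy
value-to-register map. This is only a sketch: 'RegMap', 'lookUp' and
'getOrCreate' are hypothetical names, not FastISel's API, and register 0
is assumed to mean "no register assigned yet".

```cpp
#include <string>
#include <unordered_map>

// Toy stand-in for a value-to-register map (hypothetical, not LLVM code).
class RegMap {
  std::unordered_map<std::string, unsigned> Regs;
  unsigned NextReg = 1; // 0 is reserved to mean "no register yet"

public:
  // Lookup only: returns 0 when the value has no register, which is the
  // situation that leads to the debug value being dropped.
  unsigned lookUp(const std::string &Value) const {
    auto It = Regs.find(Value);
    return It == Regs.end() ? 0 : It->second;
  }

  // Lookup or create: hands out a placeholder register that the defining
  // instruction can reuse when it is selected later.
  unsigned getOrCreate(const std::string &Value) {
    unsigned &Reg = Regs[Value];
    if (Reg == 0)
      Reg = NextReg++;
    return Reg;
  }
};
```

With a backwards walk over the block, the user of a value is visited
before its definition, so only the second style of lookup yields a
usable register at that point.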

Reviewers: aprantl, dblaikie, probinson

Reviewed By: aprantl

Subscribers: llvm-commits, dstenb, JDevlieghere

Differential Revision: https://reviews.llvm.org/D43386

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325438 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch] enhance m_One() to ignore undef elements in vectors
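
As a rough illustration of what "ignore undef elements" means, the sketch
below models a constant vector as optional lanes, where an empty lane
stands for undef. The name 'matchesOneIgnoringUndef' is hypothetical, and
requiring at least one defined lane is an assumption of this sketch, not
something stated in this commit.

```cpp
#include <optional>
#include <vector>

// A lane is either a known integer or undef (std::nullopt).
using Lane = std::optional<int>;

// Returns true if every defined lane equals 1. Undef lanes are skipped.
// Assumption of this sketch: an all-undef vector does not match.
bool matchesOneIgnoringUndef(const std::vector<Lane> &Elts) {
  bool SawDefined = false;
  for (const Lane &E : Elts) {
    if (!E)
      continue; // undef lane: acts as a wildcard
    if (*E != 1)
      return false;
    SawDefined = true;
  }
  return SawDefined;
}
```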

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325437 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify, InstCombine] add tests with vector undef elts; NFC

These would fold if the m_One pattern matcher accounted for undef elts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325436 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][3DNow!] Add PFRCP reg-reg disassembler test case (PR21168)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325435 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] move select undef cond fold with other constant cond folds; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325434 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Implement dynamic stack probing for windows

This makes sure that alloca() function calls properly probe the
stack as needed.

Differential Revision: https://reviews.llvm.org/D42356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325433 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unused variable warning. NFCI.

We were casting to AArch64InstrInfo but only using it for static methods which some compilers complain about.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325432 91177308-0d34-0410-b5e6-96231b3b80d8

[dwarfdump] Fix spurious verification errors for DW_AT_location attributes

Verifying any DWARF file that is optimized and contains at least one tag
with a DW_AT_location with a location list offset as a
DW_FORM_dataXXX results in dwarfdump spuriously claiming that the
location list is invalid.

Differential revision: https://reviews.llvm.org/D40199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325430 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Remove simplifyShuffleMask - now handled more generally by SimplifyDemandedVectorElts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325429 91177308-0d34-0410-b5e6-96231b3b80d8

Fix signed/unsigned comparison warning in AsmGenMatcher generated code. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325428 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Removed assert on missing CountVarDIE

Summary:
The assert that a DISubrange's CountVarDIE is available fails
when the dbg.value() has been optimized away for any reason.
Asserting on that is heavy-handed, so this patch removes the assert
and instead skips generating the 'count' expression.

Addresses http://llvm.org/PR36263 .

Reviewers: aprantl, dblaikie, probinson

Reviewed By: aprantl

Subscribers: JDevlieghere, llvm-commits, dstenb

Differential Revision: https://reviews.llvm.org/D43387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325427 91177308-0d34-0410-b5e6-96231b3b80d8

Report fatal error in the case of out of memory

This is a partial recommit of r325224, reverted in r325227. The relevant
part of the original comment is below.

Analyzing failures caused by out-of-memory errors can be tricky on
Windows. Such an error emerges at the point where the memory allocation
function fails, but manifests itself only when the null pointer is used.
These two points may be distant from each other. Besides, subsequent runs
may not exhibit the allocation error.

Usual programming practice does not require checking the result of
'operator new' because it throws 'std::bad_alloc' on allocation failure.
However, LLVM is usually built with exceptions turned off, so 'new' can
return a null pointer. This change installs a custom new handler, which
causes a fatal error when memory runs out. The handler is installed
automatically prior to the call to 'main', during construction of a static
object defined in 'lib/Support/ErrorHandling.cpp'. If the application does
not use this file, the handler may be installed manually by a call to
'llvm::install_out_of_memory_new_handler', declared in
'include/llvm/Support/ErrorHandling.h'.
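
The mechanism described above can be sketched with the standard
'std::set_new_handler' facility. This is a minimal illustration, not the
code from the patch: 'reportOutOfMemory', 'installOutOfMemoryNewHandler'
and 'NewHandlerInstaller' are hypothetical names, and the handler here
simply prints a message and aborts.

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical handler: invoked by 'operator new' when allocation fails.
// It must not return normally unless more memory has become available,
// so it reports the error and terminates.
static void reportOutOfMemory() {
  std::fputs("fatal error: out of memory\n", stderr);
  std::abort();
}

// Hypothetical stand-in for a manual installation entry point.
void installOutOfMemoryNewHandler() {
  std::set_new_handler(reportOutOfMemory);
}

namespace {
// A static object whose constructor runs before 'main' and installs the
// handler automatically for any program that links this file.
struct NewHandlerInstaller {
  NewHandlerInstaller() { installOutOfMemoryNewHandler(); }
};
NewHandlerInstaller Installer;
} // namespace
```

The point of failing inside the handler is that the error is reported at
the allocation site rather than at a later null pointer dereference.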

Differential Revision: https://reviews.llvm.org/D43010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325426 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Return true in enableMultipleCopyHints().

Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: Stanislav Mekhanoshin, Tom Stellard.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325425 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"

This reverts commit r323991.

This commit breaks targets that don't model all the register constraints
in TableGen. So far the workaround was to set
hasExtraXXXRegAllocReq, but it turns out that this doesn't cover all the
cases.
For instance, when mutating an instruction (like in the lowering of
COPYs) the isRenamable flag is not properly updated. The same problem
will happen when attaching a machine operand from one instruction to
another.

Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325421 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG, X86] Revert r324797, r324491, and r324359.

Sadly, r324359 caused at least PR36312. There is a patch out for review
but it seems to be taking a bit and we've already had these crashers in
tree for too long. We're hitting this PR in real code now and are
blocked on shipping new compilers as a consequence so I'm reverting us
back to green.

Sorry for the churn due to the stacked changes that I had to revert. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325420 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add vector select tests with undef elts in condition; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325419 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Turn selects with constant condition into vector shuffles during DAG combine

Summary:
Currently we convert to shuffles during lowering. This moves it to DAG combine so hopefully we can get it done before type legalization has to extend the condition.

I believe in some cases we're creating SHRUNKBLENDs that end up with constant conditions because we see the extend on the condition and think it's a dynamic select before DAG combine gets a chance to constant fold the extend. We could add combines to turn SHRUNKBLENDs with constant condition back to vselect. But it seemed like it might be better to just send them to shuffles as early as possible so they never get a chance to become SHRUNKBLENDs. This is the reason some tests went from blends controlled by a constant pool load to just a move.

Some of the constant pool entries changed because the sign_extend introduced by type legalization turned undef elements in the select condition into 0s, while the select->shuffle conversion used -1 in the shuffle mask. So now the shuffle lowering can do what it wants with them.
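
A sketch of the select-to-shuffle mapping, written as standalone C++
rather than DAG combine code. It assumes the usual two-input shuffle
convention where indices 0..N-1 pick from the first operand and N..2N-1
from the second; 'selectToShuffleMask' and 'CondLane' are hypothetical
names.

```cpp
#include <optional>
#include <vector>

// Each condition lane is true, false, or undef (std::nullopt).
using CondLane = std::optional<bool>;

// Builds a shuffle mask equivalent to select(Cond, LHS, RHS):
//   true lane  -> index I      (element I of LHS)
//   false lane -> index I + N  (element I of RHS)
//   undef lane -> -1           (lowering may pick anything)
std::vector<int> selectToShuffleMask(const std::vector<CondLane> &Cond) {
  const int N = static_cast<int>(Cond.size());
  std::vector<int> Mask(N);
  for (int I = 0; I < N; ++I) {
    if (!Cond[I])
      Mask[I] = -1;
    else
      Mask[I] = *Cond[I] ? I : I + N;
  }
  return Mask;
}
```

For a four-lane condition of true, false, undef, true this gives the mask
0, 5, -1, 3.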

I'll remove the lowering code as a follow up. We might be able to simplify some of the pre-checks for SHRUNKBLEND as the FIXME there says.

Reviewers: spatel, RKSimon, efriedma, zvi, andreadb

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43367

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325417 91177308-0d34-0410-b5e6-96231b3b80d8

Remove "--full-shutdown" and instead use an environment variable LLD_IN_TEST.

We are running lld tests with "--full-shutdown" option because we don't
want to call _exit() in lld if it is running tests. Regular shutdown
is needed for leak sanitizer.

This patch changes how we tell lld that it is running tests.
Now "--full-shutdown" is removed, and LLD_IN_TEST environment variable
is used instead.

This patch enables full shutdown on all ports, e.g. ELF, COFF and wasm.
Previously, we enabled it only for ELF.
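
A minimal sketch of the idea, not lld's actual code: 'shouldExitEarly'
and 'finish' are hypothetical names, and treating any set value of
LLD_IN_TEST as "running tests" is an assumption of this sketch.

```cpp
#include <cstdlib>

// Decides whether the fast exit path may be used. Assumption: the mere
// presence of LLD_IN_TEST means a full shutdown is wanted.
bool shouldExitEarly() { return std::getenv("LLD_IN_TEST") == nullptr; }

// Hypothetical exit path: in normal runs, terminate immediately without
// running destructors; under test, return so that a regular shutdown
// happens and a leak checker can inspect the process.
int finish(int Status) {
  if (shouldExitEarly())
    std::_Exit(Status);
  return Status;
}
```

Reading the decision from the environment keeps it out of the command
line, so every port can share the same check.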

Differential Revision: https://reviews.llvm.org/D43410

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325413 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Allow indexing to request backend to ignore the module

Summary:
Gold plugin does not add pass to ThinLTO modules without useful symbols.
In this case ThinLTO can't create the corresponding index file and some features, like CFI,
cannot be processed by the backend correctly without the index.
Given that we don't need the backend output we can request it to avoid
processing the module. This is implemented by this patch using new
"SkipModuleByDistributedBackend" flag.

Reviewers: pcc, tejohnson

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D42995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325411 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove unused private member of AMDGPUTargetELFStreamer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325408 91177308-0d34-0410-b5e6-96231b3b80d8

Run these tests, the errors were old and not valid anymore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325407 91177308-0d34-0410-b5e6-96231b3b80d8

Remove an unused function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325403 91177308-0d34-0410-b5e6-96231b3b80d8

Silence an unsigned vs signed compare warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325402 91177308-0d34-0410-b5e6-96231b3b80d8

[GISel]: Make GlobalISelEmitter rule prioritization compatible with selectionDAG

This patch changes GlobalISelEmitter to rank patterns similarly to how the
DAG does it (i.e. it computes a score for a pattern and adds the added
complexity to it).
This is so that the decision tree for GISelSelector remains compatible
with that of SelectionDAG.
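
The ranking can be pictured as a sort on a combined score. The sketch
below is illustrative only: the 'Pattern' struct, its fields and
'rankPatterns' are hypothetical, and ordering by descending score with
ties kept in their original order is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical pattern record: a computed complexity plus an explicitly
// specified added complexity.
struct Pattern {
  std::string Name;
  int Complexity;
  int AddedComplexity;
};

// The score is the computed complexity with the added complexity on top.
int score(const Pattern &P) { return P.Complexity + P.AddedComplexity; }

// Orders patterns so that higher scores are tried first. stable_sort
// keeps the original order for equal scores.
void rankPatterns(std::vector<Pattern> &Patterns) {
  std::stable_sort(Patterns.begin(), Patterns.end(),
                   [](const Pattern &A, const Pattern &B) {
                     return score(A) > score(B);
                   });
}
```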

https://reviews.llvm.org/D43270

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325401 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Bring elf flags in sync with the spec

- Add MACH flags
- Add XNACK flag
- Add reserved flags
- Minor cleanups in docs

Differential Revision: https://reviews.llvm.org/D43356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325399 91177308-0d34-0410-b5e6-96231b3b80d8