git.osdn.net Git - android-x86/external-llvm.git/log

[docs] Updated docs to work with Doxygen 1.8.11

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262786 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Improved VPERMILPS variable shuffle mask decoding.

Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool.

Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization

Followup to D17681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262784 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] AMD Bobcat CPU (btver1) doesn't support XSAVE

btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't support XSAVE.

Differential Revision: http://reviews.llvm.org/D17683

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262782 91177308-0d34-0410-b5e6-96231b3b80d8

Support: catch invalid accesses

It is possible to invoke these methods on an invalid input resulting in an
invalid substring construction. It seems that we do not have unit tests for
these methods. Tests to ensure that the invalid call is caught to follow in
clang.

Resolves PR26839.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262778 91177308-0d34-0410-b5e6-96231b3b80d8

ExecutionEngine: tweak debug log

Add a newline to separate the log message. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262777 91177308-0d34-0410-b5e6-96231b3b80d8

Replace GlobalScopeAsm[GlobalScopeAsm.size()-1] with GlobalScopeAsm.back(), NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262775 91177308-0d34-0410-b5e6-96231b3b80d8

Add DAG mutation interface to the post-RA scheduler

Differential Revision: http://reviews.llvm.org/D17868

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262774 91177308-0d34-0410-b5e6-96231b3b80d8

[aa-eval] Enhance the comments to better describe the overview of why
this pass exists.

This is based on feedback received when moving this comment from the source
file to a new header file.

Differential Revision: http://reviews.llvm.org/D17476

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262769 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: Remap subregister lanemasks before exchanging operands

Rematerializing and merging into a bigger register class at the same
time requires remapping the subregister range lanemasks to the
new register class.

This fixes http://llvm.org/PR26805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262768 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags

Copy coalescing with subregister liveness enabled can reveal undef uses.
Previously this was only checked for the SrcReg in updateRegDefsUses(),
but we need to check DstReg as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262767 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterPressure: Small cleanup

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262766 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix the lowering of setjmp intrinsic on i386.
When the lowering of the setjmp intrinsic requires
a global base pointer to be set, make sure such pointer
gets defined by the CGBR pass.

This fixes PR26742.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262762 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing triple in my previous commit!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262760 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Do not use cmpxchgXXb when we need the base pointer (RBX).
cmpxchgXXb uses RBX as one of its implicit arguments, i.e., when
we use that instruction we need to clobber RBX. This is generally
fine, except when RBX is a reserved register because in that case,
the register allocator will not track its value and will not
save and restore it when interferences occur.

rdar://problem/24851412

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262759 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for masked loads with constant masks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262758 91177308-0d34-0410-b5e6-96231b3b80d8

[libfuzzer] adding std::string to allowed adaptable arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262757 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build breakage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262756 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Support cleaning more than 2**16 bytes of stack

The x86 ret instruction has a 16 bit immediate indicating how many bytes
to pop off of the stack beyond the return address.

There is a problem when extremely large structs are passed by value: we
might not be able to fit the number of bytes to pop into the return
instruction.

To fix this, expand RET_FLAG a little later and use a special sequence
to clean the stack:

pop  %ecx     ; return address is now in %ecx
add  $n, %esp ; clean the stack
push %ecx     ; bring the return address back on the stack
ret           ; pop the return address and jmp to its value
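
As a rough illustration of the choice described above (not LLVM's actual lowering code), the sketch below picks between a plain `ret $n` and the pop/add/push sequence. The helper name `emit_return` and the constant `RET_IMM_LIMIT` are hypothetical and exist only for this example.

```python
# Hypothetical helper for illustration only; not LLVM's implementation.
RET_IMM_LIMIT = 1 << 16  # the ret immediate is 16 bits wide


def emit_return(bytes_to_pop):
    """Return x86 (AT&T syntax) instructions for a callee-cleanup return."""
    if bytes_to_pop == 0:
        return ["ret"]
    if bytes_to_pop < RET_IMM_LIMIT:
        # The byte count fits in the 16-bit immediate.
        return [f"ret ${bytes_to_pop}"]
    # Too large for the immediate: clean the stack by hand.
    return [
        "pop %ecx",                    # return address is now in %ecx
        f"add ${bytes_to_pop}, %esp",  # clean the stack
        "push %ecx",                   # put the return address back
        "ret",                         # pop the return address and jump to it
    ]
```

Calling `emit_return(8)` yields a single `ret $8`, while any count of 65536 or more yields the four-instruction sequence.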

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262755 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] log less when re-loading files; fix a silly bug: when running single files actually run all of them, not just the first one

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262754 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI] Fix a bug which prevented use of !range metadata within a query

The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further. The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262751 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO] Add a commandline option to control number of the VP annotation metadata.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262750 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Fix divrem combine not to assume div/rem type is simple.

The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262746 91177308-0d34-0410-b5e6-96231b3b80d8

Fix new gold test to specify emulation mode.

The thinlto_linkonceresolution.ll gold linker test introduced in r262727
included a target triple, but didn't set the emulation mode, which is
necessary since the default linker target may be different.

Patch by H.J. Lu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262745 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add another possible code-size optimization to README.txt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262740 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Merging 64-bit divmod lib calls into one

When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for values of up to 32 bits. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262738 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Add support for spilling SGPRs to scratch buffer

Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262732 91177308-0d34-0410-b5e6-96231b3b80d8

Fix bot failure from r262721: unintended change in gold-plugin save-temps

The split code gen task ID should not be appended to save-temps output
file when the parallelism factor is 1 (not actually splitting).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262731 91177308-0d34-0410-b5e6-96231b3b80d8

[Statepoint docs] Delete trailing whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262730 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter

Summary:
This allows us to use virtual registers when we need extra registers
for inserting spill instructions in SIRegisterInfo::eliminateFrameIndex().

Once all the frame indices have been eliminated, the
PrologEpilogueInserter does an extra pass over the program to replace
all virtual registers with physical ones.

This allows us to make more efficient use of our emergency spill slots,
so we only need to create one.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17591

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262728 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Ensure prevailing linkonce emitted as weak in ThinLTO backends

Summary:
Since IR files are all compiled into separate independent object files
in ThinLTO mode, each prevailing linkonce symbol must be emitted in its
object file even if it is no longer referenced there, e.g. if no
references remain in the module after inlining, since it may be
referenced by another ThinLTO compiled object file. This is done by
changing LDPR_PREVAILING_DEF_IRONLY* symbols to LDPR_PREVAILING_DEF,
which converts the prevailing linkonce to weak. We also don't need the
other prevailing IRONLY handling for internalization, which is not
currently performed for ThinLTO.

Test case included.

Reviewers: davidxl, rafael

Subscribers: rafael, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262727 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Fix lowering of calls with the return type of i1

This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll
when run with -debug-only=isel

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262726 91177308-0d34-0410-b5e6-96231b3b80d8

[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated.

Author: milena.vujosevic.janicic
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D17373

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262725 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Launch importing backends in parallel threads from gold plugin

Summary:
Launch ThinLTO backends (LTO and codegen pipelines with importing) in
parallel using a ThreadPool, after creating the combined index.
The number of threads is controlled by the existing -jobs gold plugin
option, or the hardware concurrency if not specified.
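
A minimal sketch of the thread-count rule described above, using Python's standard thread pool rather than LLVM's ThreadPool. The function `run_backends` and its parameters are hypothetical stand-ins; `jobs` plays the role of the -jobs plugin option.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_backends(modules, backend, jobs=None):
    """Run `backend` on every module in parallel and return the results in order.

    `jobs` models the -jobs option; when it is not given, fall back to
    the hardware concurrency reported by the OS (at least one thread).
    """
    workers = jobs if jobs else (os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(backend, modules))
```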

The old behavior of exiting after creating the combined index can be
invoked via a new thinlto-index-only plugin option.

This commit involves just the ThinLTO-specific pieces of D15390, the NFC
and other restructuring pieces were committed independently:
  r262677: Add hardware_concurrency interface to llvm::thread (NFC)
  r262719: Change split code gen to use ThreadPool
  r262721: Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC)

Reviewers: pcc, joker.eph, rafael

Subscribers: rafael, davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D15390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262724 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC)

This is the NFC part remaining from D15390, which refactors the
current codegen() into a CodeGen class with various modular methods and
other helper functions that will be used by the follow-on ThinLTO piece.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262721 91177308-0d34-0410-b5e6-96231b3b80d8

Change split code gen to use ThreadPool

Part of D15390.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262719 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests

None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262718 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit access

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262714 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262710 91177308-0d34-0410-b5e6-96231b3b80d8

test commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262709 91177308-0d34-0410-b5e6-96231b3b80d8

Make headers self-contained again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262702 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics

These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the
GL_ARB_shader_image_load_store extension.

Initial change by Nicolai Hähnle

Differential Revision: http://reviews.llvm.org/D17401

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262701 91177308-0d34-0410-b5e6-96231b3b80d8

Annotate our undefined behaviour to sneak it past the sanitizers

We have known UB in some ilists where we static cast half nodes to
(larger) derived types and use the address. See llvm.org/PR26753.

This needs to be fixed, but in the meantime it'd be nice if running
ubsan didn't complain. This adds annotations in the two places where
ubsan complains while running check-all of a sanitized clang build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262683 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a memory leak.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262682 91177308-0d34-0410-b5e6-96231b3b80d8

CodeGen: Tune the SmallVector size in LiveRange

The vast majority of LiveRanges (i.e., 4/5) have exactly 1 segment and 1
value number, and a good chunk of the rest have 2 of each, so
allocating space for 4 is wasteful. This is especially noticeable when
dealing with a very large number of vregs, and I have an internal case
where dropping this to 2 shaves over 5% off of peak memory when
compiling a particularly large function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262681 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a use-after-free bug introduced in r262636

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262679 91177308-0d34-0410-b5e6-96231b3b80d8

Add hardware_concurrency interface to llvm::thread (NFC)

Part of D15390.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262677 91177308-0d34-0410-b5e6-96231b3b80d8

[gold] Handle modules that are not included in the link.

Gold has a newly added LDPT_GET_SYMBOLS_V3 callback that can
distinguish between a module that is not included in the link, and
one that is included but has its entire interface preempted by others.

Fixes PR26674.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262676 91177308-0d34-0410-b5e6-96231b3b80d8

Fix memory leak in tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262674 91177308-0d34-0410-b5e6-96231b3b80d8

[libfuzzer] arbitrary function adapter.

The adapter automates converting a sequence of bytes into arbitrary
arguments.

Differential Revision: http://reviews.llvm.org/D17829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262673 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Add a description of current problem areas to the statepoint docs

Triggered by a question on llvm-dev about status

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262671 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Combine A->B->A BitCast

This patch enhances InstCombine to handle the following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262670 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/CodeGen/ARM/rem_crash.ll: Specify an explicit triple to avoid unsupported targets.

Otherwise, when targeting win32, we see:

LLVM ERROR: CPU: 'generic' does not support ARM mode execution!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262668 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] when interrupted, call _Exit() instead of exit()

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262667 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.

PSHUFB decoder was assuming that input was 128 or 256-bit vector only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262661 91177308-0d34-0410-b5e6-96231b3b80d8

[RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.

The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a
linker-mangled symbol name, but it calls through to dlsym to do the lookup (via
DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled
symbol name.

Historically we've attempted to "demangle" by removing leading '_'s on all
platforms, and fallen back to an extra search if that failed. That's broken, as
it can cause symbols to resolve incorrectly on platforms that don't do mangling
if you query '_foo' and the process also happens to contain a 'foo'.

Fix this by demangling conditionally based on the host platform. That's safe
here because this function is specifically for symbols in the host process, so
the usual cross-process JIT lookup concerns don't apply.
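
The conditional demangling described above can be sketched as follows. This is a simplified model, not the RTDyldMemoryManager code: `process_symbols` stands in for a dlsym-style lookup, and the flag `host_prefixes_underscore` is a hypothetical way of expressing "the host platform prepends '_' to symbol names".

```python
def strip_linker_prefix(name, host_prefixes_underscore):
    """Undo linker mangling only if the host platform applies it."""
    if host_prefixes_underscore and name.startswith("_"):
        return name[1:]
    return name


def lookup(name, process_symbols, host_prefixes_underscore):
    """Look up a linker-mangled name in a table of unmangled process symbols."""
    unmangled = strip_linker_prefix(name, host_prefixes_underscore)
    # No fallback search with a stripped name: on hosts that do not
    # mangle, '_foo' must never resolve to an unrelated 'foo'.
    return process_symbols.get(unmangled)
```

With a table containing both `foo` and `_foo`, a query for `_foo` resolves to `foo` only on a host that mangles; elsewhere it resolves to `_foo` itself.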

M unittests/ExecutionEngine/ExecutionEngineTest.cpp
M lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262657 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] "constant fold" an experimental hidden option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262648 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Remove dead code from an old experiment

This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262646 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)

Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}
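
To see why the two functions above agree, the following sketch models the IR in Python: a `<4 x i32>` is a list of four unsigned 32-bit integers and the bitcasts reinterpret the same 16 bytes (little-endian layout is assumed here). The helper names are made up for this example.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def bitcast_4xi32_to_2xi64(v):
    """Reinterpret four 32-bit lanes as two 64-bit lanes."""
    return list(struct.unpack("<2Q", struct.pack("<4I", *v)))


def bitcast_2xi64_to_4xi32(v):
    """Reinterpret two 64-bit lanes as four 32-bit lanes."""
    return list(struct.unpack("<4I", struct.pack("<2Q", *v)))


def ashr31(x):
    """Model 'ashr <4 x i32> %x, 31': all ones if the sign bit is set, else 0."""
    return [MASK32 if e & 0x80000000 else 0 for e in x]


def is_negative_original(x):
    not_ = [e ^ MASK32 for e in ashr31(x)]   # xor with -1 on i32 lanes
    bc = bitcast_4xi32_to_2xi64(not_)
    notnot = [e ^ MASK64 for e in bc]        # xor with -1 on i64 lanes
    return bitcast_2xi64_to_4xi32(notnot)


def is_negative_simplified(x):
    return ashr31(x)
```

Because xor with all ones flips every bit regardless of lane width, the two xors cancel across the bitcast and both functions return the same lanes.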

Differential Revision: http://reviews.llvm.org/D17583

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262645 91177308-0d34-0410-b5e6-96231b3b80d8

Fix breakage caused by r262636.

Use LLVM_ATTRIBUTE_UNUSED instead of __attribute__((unused))

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262643 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantRange] Rename test; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262640 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Prove no-overflow via constant ranges

Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.
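
The idea of inferring no-wrap flags from ranges can be sketched with plain inclusive intervals. This is a simplification and not the ScalarEvolution or ConstantRange code: ranges here are non-wrapping `(lo, hi)` tuples, and the function names are hypothetical.

```python
def add_is_nuw(a, b, bits):
    """True if a + b cannot wrap as an unsigned `bits`-wide add.

    a and b are inclusive (lo, hi) ranges of unsigned values.
    """
    return a[1] + b[1] <= (1 << bits) - 1


def add_is_nsw(a, b, bits):
    """True if a + b cannot wrap as a signed `bits`-wide add.

    a and b are inclusive (lo, hi) ranges of signed values.
    """
    smin = -(1 << (bits - 1))
    smax = (1 << (bits - 1)) - 1
    return smin <= a[0] + b[0] and a[1] + b[1] <= smax
```

For example, with 8-bit values, operands known to lie in [0, 100] and [0, 155] can be added with nuw, since the largest possible sum is 255.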

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262639 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Be less eager about demoting zexts to sexts

After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges). This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension. This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262638 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges

This will be used in a later patch to ScalarEvolution. Right now only
the unit tests exercise the newly added code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262637 91177308-0d34-0410-b5e6-96231b3b80d8

Infrastructure for PGO enhancements in inliner

This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262636 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS

The variable mask forms of VPERMILPD/VPERMILPS were only partially implemented, with much of the work still performed as an intrinsic.

This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle.

Differential Revision: http://reviews.llvm.org/D17681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262635 91177308-0d34-0410-b5e6-96231b3b80d8

Use LineLocation instead of CallsiteLocation to index callsite profile.

Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specify the callee name. Remove the callee function name from the key, and put it in the value (FunctionSamples).
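
The change of key can be pictured with a small data-structure sketch. The class and field names below are loose Python approximations chosen for this example, not the actual LLVM declarations.

```python
from dataclasses import dataclass, field
from typing import Dict, NamedTuple


class LineLocation(NamedTuple):
    """Key: a line offset plus a discriminator identifies one callsite."""
    line_offset: int
    discriminator: int


@dataclass
class FunctionSamples:
    """Value: the callee name now lives here instead of in the key."""
    name: str
    total_samples: int = 0
    callsite_samples: Dict[LineLocation, "FunctionSamples"] = field(
        default_factory=dict
    )
```

A caller's profile then maps `LineLocation(3, 1)` directly to the inlined callee's samples, and the callee name is read from the value.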

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262634 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.

lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262633 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Pulled out repeated code testing for constant vector shift amount. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262631 91177308-0d34-0410-b5e6-96231b3b80d8

The MCU target has its own ABI; however, the X86 interrupt handler calling convention overrides this ABI.
Fixed the ordering to check first for the X86 interrupt handler and then for the MCU target.

Differential Revision: http://reviews.llvm.org/D17801

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262628 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't assume that shuffle non-mask operands start at #0.

That's not the case for VPERMV/VPERMV3, which cover all possible
combinations (the C intrinsics use a different order; the AVX vs
AVX512 intrinsics are different still).

Since:
r246981 AVX-512: Lowering for 512-bit vector shuffles.
VPERMV is recognized in getTargetShuffleMask.

This breaks assumptions in most callers, as they expect
the non-mask operands to start at index 0.
VPERMV has the mask as operand #0; VPERMV3 has it in the middle.

Instead of the faulty assumption, have getTargetShuffleMask return
its operands as well.

One alternative we considered was to change the operand order of
VPERMV, but we agreed to stick to the instruction order, as there
is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular).

Differential Revision: http://reviews.llvm.org/D17041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262627 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUtils, LV] Fix PR26734

The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262624 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] fold 'isPositive' vector integer operations (PR26819)

This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26819

Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that
in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might
come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.

The patch is based on the x86 equivalent:
http://reviews.llvm.org/rL262036

Differential Revision: http://reviews.llvm.org/D17834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262623 91177308-0d34-0410-b5e6-96231b3b80d8

AVX512: Combine AND + TESTM instructions.

Differential Revision: http://reviews.llvm.org/D17844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262621 91177308-0d34-0410-b5e6-96231b3b80d8

Making rem_crash.ll target-specific

This test failed on some ARM bots after a divmod change because it was
running on a native llc instead of a targeted one. This makes sure the test
is target-specific (as intended), and also copies it to the ARM and AArch64
directories. If it is also supposed to work on other architectures, I'll
leave that as an exercise to the respective maintainers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262620 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Add calling convention parser tokens

Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser.

Reviewers: arsenm

Subscribers: dylanmckay, llvm-commits

Differential Revision: http://reviews.llvm.org/D16348

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262600 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG

Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262599 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[ARM] Merging 64-bit divmod lib calls into one"

This reverts commit r262507, which broke some ARM buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262594 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM][AVX512] PSRLWI Change imm8 to int

Differential Revision: http://reviews.llvm.org/D17753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262592 91177308-0d34-0410-b5e6-96231b3b80d8

TTI: Fix not using overload of getIntrinsicInstrCost

This was always calling the generic version, so the target
custom implementation was never called.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262585 91177308-0d34-0410-b5e6-96231b3b80d8

[BranchFolding] Change function name related with merging MMOs. NFC

Summary:
Removing MMOs is not our preferred behavior any more.

Reviewers: mcrosier, reames

Differential Revision: http://reviews.llvm.org/D17668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262580 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Insert two S_NOP instructions for every high level source statement.

Patch by: Konstantin Zhuravlyov

Summary: Tools such as a debugger need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before the first isa instruction of the high level source statement, and one after the last isa instruction of the high level source statement. Further, the debugger may replace S_NOP instructions with S_TRAP instructions based on user input.

Reviewers: tstellarAMD, arsenm

Subscribers: echristo, dblaikie, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17454

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262579 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs

Summary:
When there were no free SGPRs, we were trying to move this value into
some of the reserved registers which was causing a segmentation fault.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262577 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Enable forwarding bool arguments in tail calls (PR26305)

The code was previously not able to track a boolean argument
at a call site back to the formal argument of the caller.

Differential Revision: http://reviews.llvm.org/D17786

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262575 91177308-0d34-0410-b5e6-96231b3b80d8

[PPCVSXFMAMutate] Temporarily disable this pass

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262573 91177308-0d34-0410-b5e6-96231b3b80d8

[MBP] Rename a confusing variable and add clarifying comments

Was discussed as part of http://reviews.llvm.org/D17830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262571 91177308-0d34-0410-b5e6-96231b3b80d8

[lanai] Fixing file path used in test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262567 91177308-0d34-0410-b5e6-96231b3b80d8

TargetSchedule: Allow explicit Unsupported markers in InstRW

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262549 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Accept itinerary data when checking for schedmodel completeness

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262548 91177308-0d34-0410-b5e6-96231b3b80d8

[MBP] Avoid placing random blocks between loop preheader and header

If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning the predecessors of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact that the backedge predecessor was already part of the existing loop chain (oops!).

Differential Revision: http://reviews.llvm.org/D17830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262547 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't give catch objects a displacement of zero

Catch objects with a displacement of zero do not initialize a catch
object. The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up with a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects. We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262546 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] add tests to demonstrate existing codegen for PR26819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262540 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Simplify boolean conditional return statements

Patch by Richard Thomson

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262536 91177308-0d34-0410-b5e6-96231b3b80d8

[MBP] Remove overly verbose debug output

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262531 91177308-0d34-0410-b5e6-96231b3b80d8

Explode store of arrays in instcombine

Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a series of stores for each element, allowing them to be optimized.

Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17828

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262530 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-nm] Restore the previous behaviour (pre r262525).

It broke some buildbots.

Pointy-hat to: me

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262529 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-nm] Fix rendering of -s grouping with all the other options.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262525 91177308-0d34-0410-b5e6-96231b3b80d8

[MBP] Adjust debug output to be more focused and approachable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262522 91177308-0d34-0410-b5e6-96231b3b80d8

Unpack array of all sizes in InstCombine

Summary: This is another step toward improving fca support. This unpacks a load of an array into a series of loads of the array's elements.

Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15890

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262521 91177308-0d34-0410-b5e6-96231b3b80d8

Really fix ASAN leak/etc issues with MemorySSA unittests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262519 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] add -Werror for libFuzzer build rule

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262517 91177308-0d34-0410-b5e6-96231b3b80d8