git.osdn.net Git - android-x86/external-llvm.git/log

Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI.

BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.

Relanding after correcting base address matching check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321389 91177308-0d34-0410-b5e6-96231b3b80d8

[git-llvm] Handle files ignored by svn correctly

Summary: Correctly handle files ignored by svn (such as .o files,
which are ignored by default) by adding "--no-ignore" flag to "svn
status" and "svn add".

Differential Revision: https://reviews.llvm.org/D41404

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321388 91177308-0d34-0410-b5e6-96231b3b80d8

Unbreak the build. Combining chrono with Optional is annoying.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321387 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] MC: Fix for address taken aliases

Previously, taking the address for an alias would result in:
"Symbol not found in table index space"

Increase test coverage for weak aliases.

This code should be more efficient too as it avoids building
the `IsAddressTaken` set.

Differential Revision: https://reviews.llvm.org/D41510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321384 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA] Allow reordering of loads that alias in the presence of volatile loads.

Summary:
Make MemorySSA allow reordering of two loads that may alias, when one is volatile.
This makes MemorySSA less conservative and behaving the same as the AliasSetTracker.
For more context, see D16875.

LLVM language reference: "The optimizers must not change the number of volatile operations or change their order of execution relative to other volatile operations. The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."

Reviewers: george.burgess.iv, dberlin

Subscribers: sanjoy, reames, hfinkel, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D41525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321382 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI."

which was causing miscompilations in for some test-suite components.

This reverts commit 3e9de9ff0f3162920a2a3cba51c7dc14b54b4d16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321380 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Don't do if-conversion if there is a long dependence chain

If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.

This patch checks for the long dependence chain, and give up if-conversion if find one.

Differential Revision: https://reviews.llvm.org/D39352

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321377 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO][CachePruning] explicitly disable pruning

In https://reviews.llvm.org/rL321077 and https://reviews.llvm.org/D41231 I fixed a regression in the c-api which prevented the pruning from being *effectively* disabled.

However this approach, helpfully recommended by @labath, is cleaner.
It is also nice to remove the weasel words about effectively disabling from the api comments.

Differential Revision: https://reviews.llvm.org/D41497

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321376 91177308-0d34-0410-b5e6-96231b3b80d8

(Re-landing) Expose a TargetMachine::getTargetTransformInfo function

Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321375 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Corrected handling of negative expressions

See bug 35716: https://bugs.llvm.org/show_bug.cgi?id=35716

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41488

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321372 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Reverse the order of operands in the ISD::ADD created by TargetLowering::getVectorElementPointer so that the FrameIndex is on the left.

This seems to improve X86's ability to match this into an address computation. Otherwise the other operand gets assigned to the base register and the stack pointer + frame index ends up in the index register. But index registers can't encode ESP/RSP so we end up having to move it into another register to meet the constraint.

I could try to improve the address matcher in X86, but swapping the producer seemed easier. Several other places already have the operands in this order so this is at least consistent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321370 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When lowering insert_vector_elt/extract_vector_elt of vXi1 with a non-constant index just use either a 128-bit type or the vXi8 type with the correct number of elements.

Despite what the comment said there isn't better codegen for 512-bit vectors. The 128/256/512 bit implementation jus stores to memory and loads an element. There's no advantage to doing that with a larger size. In fact in many cases it causes a stack realignment and generates worse code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321369 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improve the printing of address mode during isel matching.

Fix some inconsistent new line behavior and only print the FrameIndex when the address mode is a FrameIndexBase addressing mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321368 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Corrected parsing of optional operands for ds_swizzle_b32

See bug 35645: https://bugs.llvm.org/show_bug.cgi?id=35645

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321367 91177308-0d34-0410-b5e6-96231b3b80d8

[InlineCost] Find more free binary operations

Currently, inline cost model considers a binary operator as free only if both
its operands are constants. Some simple cases are missing such as a + 0, a - a,
etc. This patch modifies visitBinaryOperator() to call SimplifyBinOp() without
going through simplifyInstruction() to get rid of the constant restriction.
Thus, visitAnd() and visitOr() are not needed.

Differential Revision: https://reviews.llvm.org/D41494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321366 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI.

BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321364 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registers

See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561

This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic.

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41437

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321359 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add test case to check that calls to mcount follow long calls / short calls options. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321357 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32

Mark conversions between pointers and 32-bit scalars as legal, map them
to the GPR and select to a simple COPY.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321356 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Support pointer constants

Pointer constants are pretty rare, since we usually represent them as
integer constants and then cast to pointer. One notable exception is the
null pointer constant, which is represented directly as a G_CONSTANT 0
with pointer type. Mark it as legal and make sure it is selected like
any other integer constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321354 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Revert r321259

Improve ReduceLoadWidth for SRL Patch is causing an issue on the
PPC64 BE santizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321349 91177308-0d34-0410-b5e6-96231b3b80d8

Rewrite the cached map used for locating the most precise DIE among
inlined subroutines for a given address.

This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.

This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.

It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.

Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]

All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.

Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.

Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.

Differential Revision: https://reviews.llvm.org/D40987

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321345 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add missing initialization for the HasPREFETCHWT1 subtarget variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321340 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Enable PRFCHW feature on KNL/KNM and all CPUs inherited from Broadwell.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321336 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add prefetchwt1 instruction and overhaul priorities and isel enabling for prefetch instructions.

Previously prefetch was only considered legal if sse was enabled, but it should be supported with 3dnow as well.

The prfchw flag now imply at least some form of prefetch without the write hint is available, either the sse or 3dnow version. This is true even if 3dnow and sse are explicitly disabled.

Similarly prefetchwt1 feature implies availability of prefetchw and the the prefetcht0/1/2/nta instructions. This way we can support _MM_HINT_ET0 using prefetchw and _MM_HINT_ET1 with prefetchwt1. And its assumed that if we have levels for the write hint we would have levels for the non-write hint, thus why we enable the sse prefetch instructions.

I believe this behavior is consistent with gcc. I've updated the prefetch.ll to test all of these combinations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321335 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use SIGN_EXTEND to implement ANY_EXTEND from vXi1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321334 91177308-0d34-0410-b5e6-96231b3b80d8

inline-fp.ll was moved in r321332; delete it properly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321333 91177308-0d34-0410-b5e6-96231b3b80d8

[Inliner] Restrict soft-float inlining penalty.

The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.

While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.

Differential Revision: https://reviews.llvm.org/D41522

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321332 91177308-0d34-0410-b5e6-96231b3b80d8

Add hasProfileData() to check if a function has profile data. NFC.

Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData
that does the same thing. This refactoring is useful to do before adding
synthetic function entry counts but also a useful cleanup IMO even
otherwise. I have used hasProfileData instead of hasRealProfileData as
David had earlier suggested since I think profile implies "real" and I
use the phrase "synthetic entry count" and not "synthetic profile count"
but I am fine calling it hasRealProfileData if you prefer.

Reviewers: davidxl, silvas

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41461

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321331 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Fix formatting bug with r321295. This fixes a MIPS buildbot failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321330 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use SIGN_EXTEND rather than ZERO_EXTEND for lowering extract_vector_elt from vXi1 with a non-const index.

We have a better range of instructions we can use if we can fill with the value i1 value rather than zeroing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321315 91177308-0d34-0410-b5e6-96231b3b80d8

[ModRefInfo] Add must alias info to ModRefInfo.

Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.

Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.

Reviewers: dberlin, hfinkel, george.burgess.iv

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38862

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321309 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When lowering truncates to vXi1, don't sign extend i16/i8 types to 512-bit if we have VLX.

This should only affect what we do for v8i16. Previously we went to v8i64, but if we have VLX we only need v8i32. This prevents an unnecessary zmm usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321303 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF v5] Rework of string offsets table reader

Reorganizes the DWARF consumer to derive the string offsets table
contribution's format from the contribution header instead of
(incorrectly) from the unit's format.

Reviewers: JDevliegehere, aprantl

Differential Revision: https://reviews.llvm.org/D41146

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321295 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Promote v8i1 shuffles to v8i32 instead of v8i64 if we have VLX.

We should have equally good shuffle options for v8i32 with VLX. This was spotted during my attempts to remove 512-bit vectors from SKX.

We still use 512-bits for v16i1, v32i1, and v64i1. I'm less sure we can handle those well with narrower vectors. i32 and i64 element sizes get the best shuffle support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321291 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Split large PAVGB/PAVGW vectors to legal widths

Patch to allow detectAVGPattern handle vectors larger than the legal size (128 SSE2, 256 AVX2, 512 AVX512BW), splitting the vectors accordingly.

Differential Revision: https://reviews.llvm.org/D41440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321288 91177308-0d34-0410-b5e6-96231b3b80d8

[YAML] Refactor escaping unittests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321284 91177308-0d34-0410-b5e6-96231b3b80d8

[YAML] Fix UTF-8 handling

Previous YAML quoting patches broke UTF-8 printing in YAML: see https://reviews.llvm.org/D41290#961801.

Differential Revision: https://reviews.llvm.org/D41490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321283 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Print more helpful information in case of type contradiction

Dump the failing TreePattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321282 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Remove (xor (xor x, c1), c2) -> (xor x, (xor c1, c2)) fold. NFCI.

More general cases are already handled by constant canonicalization and then the ReassociateOps call at line 5327

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321280 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Generalize (or (and X, c1), c2) -> (and (or X, c2), c1|c2) combine to work on non-splat vectors

The knownbits_mask_or_shuffle_uitofp change is interesting - shuffle combines manage to kick in, removing the AND constant mask load. For targets with fast-variable-shuffle this should reduce further to VPOR+VPSHUFB+VCVTDQ2PS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321279 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add (or (and X, c1), c2) -> (and (or X, c2), c1|c2) non-splat vector test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321278 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix parest build failure in SPEC2017.

The build failure was caused by an assertion in pre-legalization DAGCombine:

Combining: t6: ppcf128 = uint_to_fp t5
... into: t20: f32 = PPCISD::FCFIDUS t19

which is clearly wrong since ppcf128 are definitely different type with f32 and
we cannot change the node value type when do DAGCombine. The fix is don't
handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and
leave it to downstream to legalize it and expand it to small legal types.

Differential Revision: https://reviews.llvm.org/D41411

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321276 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Generalize (and (or x, C), D) -> D iff (C & D) == D combine to work on non-splat vectors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321275 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix the invalid EVA test

During the review of D40362 I spotted that this test wasn't actually
testing the eva instructions due to '-mattr==eva', rather than '-mattr=+eva',
which resulted in test having no effect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321273 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add (and (or x, C), D) -> D iff (C & D) == D non-splat vector test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321268 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add v48i8 AVG test case, based on discussion on D41440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321261 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Improve ReduceLoadWidth for SRL

If the SRL node is only used by an AND, we may be able to set the
ExtVT to the width of the mask, making the AND redundant. To support
this, another check has been added in isLegalNarrowLoad which queries
whether the load is valid.

Differential Revision: https://reviews.llvm.org/D41350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321259 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Remove MemoryBuffer::getNewUninitMemBuffer

There is nothing useful that can be done with a read-only uninitialized
buffer without const_casting its contents to initialize it. A better
solution is to obtain a writable buffer
(WritableMemoryBuffer::getNewUninitMemBuffer), and then convert it to a
read-only buffer after initialization. All callers of this function have
already been updated to do this, so this function is now unused.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321257 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Armv8-R DFB instruction

Implement MC support for the Armv8-R 'Data Full Barrier' instruction.

Differential Revision: https://reviews.llvm.org/D41430

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321256 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Fix ambiguous call to the `printNumber`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321254 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Support 'GNU' style for MIPS GOT/PLT dumping

This change adds `printMipsGOT` and `printMipsPLT` methods to the
`DumpStyle` class and overrides them in the `GNUStyle` and `LLVMStyle`
descendants. To pass information about GOT/PLT layout into these
methods, the `MipsGOTParser` class has been extended to hold all
necessary data.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321253 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use PSHUFB for v32i16 shuffles before falling back to VPERMW/VPERMI2W.

PSHUFB has the ability to implicitly 0 elements which VPERMI2W can't do. So give a chance to use it first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321251 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use VPERMI2B for v16i8 shuffles if we have VBMI+VLX and would have otherwise used two PSHUFBs ORed together.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321249 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use VPERMB/VPERMI2B for v32i8 shuffle lowering if VBMI and VLX are supported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321248 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add avx512vbmi command lines to vector-shuffle-256-v32.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321247 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Remove unneeded sub-directory

This is the only wasm def (and likely likely will be
for the foreseeable) file so no need for a sub-directory

Differential Revision: https://reviews.llvm.org/D41476

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321246 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Expose a TargetMachine::getTargetTransformInfo function"

This reverts commit r321234. It breaks the -DBUILD_SHARED_LIBS=ON build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321243 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix local references to weak aliases

When weak aliases are used with in same translation
unit we need to be able to directly reference to alias
and not just the thing it is aliases.  We do this by
defining both a wasm import and a wasm export in this
case that result in a single Symbol.  This change is
a partial revert of rL314245.  A corresponding lld
change address the previous issues we had with this.

See: https://github.com/WebAssembly/tool-conventions/issues/34

Differential Revision: https://reviews.llvm.org/D41472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321242 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Avoid quadratic on a predecessors number behavior in instruction sinking.

If a block has N predecessors, then the current algorithm will try to
sink common code to this block N times (whenever we visit a
predecessor). Every attempt to sink the common code includes going
through all predecessors, so the complexity of the algorithm becomes
O(N^2).
With this patch we try to sink common code only when we visit the block
itself. With this, the complexity goes down to O(N).
As a side effect, the moment the code is sunk is slightly different than
before (the order of simplifications has been changed), that's why I had
to adjust two tests (note that neither of the tests is supposed to test
SimplifyCFG):
* test/CodeGen/AArch64/arm64-jumptable.ll - changes in this test mimic
the changes that previous implementation of SimplifyCFG would do.
* test/CodeGen/ARM/avoid-cpsr-rmw.ll - in this test I disabled common
code sinking by a command line flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321236 91177308-0d34-0410-b5e6-96231b3b80d8

Expose a TargetMachine::getTargetTransformInfo function

Summary:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321234 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to pacify 4.8.5 with makeArrayRef

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321233 91177308-0d34-0410-b5e6-96231b3b80d8

[orc][cmake] Check if 8 byte atomics require libatomic for unittest

rL319838 introduced SymbolStringPool which uses 8 byte atomics for
reference counters. On systems which do not support such atomics
natively such as MIPS32, explicitly add libatomic as one of the
libraries for SymbolStringPool's unittest.

Reviewers: lhames, beanz

Differential Revision: https://reviews.llvm.org/D41010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321225 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Optimize {s,u}{add,sub}.with.overflow.

The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG. This commit ports that code to the ARM backend.

Differential revision: https://reviews.llvm.org/D35635

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321224 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Use ArrayRef member functions instead of custom ones

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321221 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Allow construction of HVX vector predicates

Handle BUILD_VECTOR of boolean values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321220 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Legalize vector elements to i32 in buildVector32/64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321218 91177308-0d34-0410-b5e6-96231b3b80d8

Do not generate an empty switch statement as it causes MSVC to issue diagnostics about switch statements without case or default labels.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321217 91177308-0d34-0410-b5e6-96231b3b80d8

bpf: add support for objdump -print-imm-hex

Add support for 'objdump -print-imm-hex' for imm64, operand imm
and branch target. If user programs encode immediate values
as hex numbers, such an option will make it easy to correlate
asm insns with source code. This option also makes it easy
to correlate imm values with insn encoding.

There is one changed behavior in this patch. In old way, we
print the 64bit imm as u64:
  O << (uint64_t)Op.getImm();
and the new way is:
  O << formatImm(Op.getImm());

The formatImm is defined in llvm/MC/MCInstPrinter.h as
  format_object<int64_t> formatImm(int64_t Value)

So the new way to print 64bit imm is i64 type.
If a 64bit value has the highest bit set, the old way
will print the value as a positive value and the
new way will print as a negative value. The new way
is consistent with x86_64.
For the code (see the test program):
...
if (a == 0xABCDABCDabcdabcdULL)
...
x86_64 objdump, with and without -print-imm-hex, looks like:
48 b8 cd ab cd ab cd ab cd ab   movabsq $-6067004223159161907, %rax
48 b8 cd ab cd ab cd ab cd ab   movabsq $-0x5432543254325433, %rax

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321215 91177308-0d34-0410-b5e6-96231b3b80d8

PR35705: Fix Chapter 9 example code for API changes to DIBuilder

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321214 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Refactor DomainReassignment pass to make the Closure class not stores references to the main data structures of the pass itself

Multiple Closure objects can be created and stored for a single function. It's not a good idea to devote so many fields of it to storing pointers and references to global data structures of the pass. The closure class should only store the things needed to represent the closure itself.

This patch refactors many of the methods of Closure to belong to the pass object and to pass around a reference to the current Closure. The Closure class gains a few simple methods to add instructions and edges, and to return iterators to edges and instructions

Differential Revision: https://reviews.llvm.org/D41327

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321213 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Allow setting SDNodeProperties on intrinsics

Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.

Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321212 91177308-0d34-0410-b5e6-96231b3b80d8

[ICP] Expose unconditional call promotion interface

This patch modifies the indirect call promotion utilities by exposing and using
an unconditional call promotion interface. The unconditional promotion
interface (i.e., call promotion without creating an if-then-else) can be used
if it's known that an indirect call has only one possible callee. The existing
conditional promotion interface uses this unconditional interface to promote an
indirect call after it has been versioned and placed within the "then" block.

A consequence of unconditional promotion is that the fix-up operations for phi
nodes in the normal destination of invoke instructions are changed. This is
necessary because the existing implementation assumed that an invoke had been
versioned, creating a "merge" block where a return value bitcast could be
placed. In the new implementation, the edge between a promoted invoke's parent
block and its normal destination is split if needed to add a bitcast for the
return value. If the invoke is also versioned, the phi node merging the return
value of the promoted and original invoke instructions is placed in the "merge"
block.

Differential Revision: https://reviews.llvm.org/D40751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321210 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove zext from vXi32 to vXi64 on indices of gather/scatter instructions if we can prove the pre-extended value is positive.

Gather/scatter can implicitly sign extend from i32->i64 on indices. So if we know the sign bit of the input to a zext is 0 we can use the implicit extension.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321209 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: Tolerate non-MemSDNodes for OPC_RecordMemRef

When intrinsics are allowed to have mem operands, there
are two ways this can happen. First is an intrinsic
that is marked has having a mem operand, but is not handled
by getTgtMemIntrinsic.

The second way can occur even for intrinsics which do not
have a mem operand. It seems the selector table does
some kind of sorting based on the opcode, and the
mem ref recording can happen in the same scope for
intrinsics that both do and do not have mem refs.
I haven't been able to figure out exactly why this happens
(although it happens even with the matcher optimizations disabled).
I'm not sure if it's worth trying to avoid hitting this for
these nodes since I think it's still reasonable to handle
this in case getTgtMemIntrinic is not implemented.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321208 91177308-0d34-0410-b5e6-96231b3b80d8

Improve the test for r320216. NFC.

Patch by Matthew Voss!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321207 91177308-0d34-0410-b5e6-96231b3b80d8

[opt-viewer] Also demangle indirect-call promotion targets

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321206 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Added an assert to make sure that the MBBI iterator is valid.

The function createTailCallBranchInstr assumes that the iterator MBBI is valid.
However, only one use of MBBI is guarded in the function.
Fix this by adding an assert.

Differential Revision: https://reviews.llvm.org/D41358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321205 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Fix condition on overlapping store check.

Prevent overlapping store elision when overlapping store is
pre-inc/dec as analysis is wrong in these cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321204 91177308-0d34-0410-b5e6-96231b3b80d8

[hwasan] Implement -fsanitize-recover=hwaddress.

Summary: Very similar to AddressSanitizer, with the exception of the error type encoding.

Reviewers: kcc, alekseyshl

Subscribers: cfe-commits, kubamracek, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D41417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321203 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU, AsmParser] Enable the mnemonic spell corrector.

Patch by Dmitry Venikov

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321202 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Implement the fusing of MUL+SUBADD to FMSUBADD

This patch turns shuffles of fadd/fsub with fmul into fmsubadd.

Patch by Dmitry Venikov

Differential Revision: https://reviews.llvm.org/D40335

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321200 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO] Function section hotness prefix should look at all blocks

Summary:
The function section prefix for PGO based layout (e.g. hot/unlikely)
should look at the hotness of all blocks not just the entry BB.
A function with a cold entry but a very hot loop should be placed in the
hot section, for example, so that it is located close to other hot
functions it may call. For SamplePGO it was already looking at the
branch weights on calls, and I made that code conditional on whether
this is SamplePGO since it was essentially a noop for instrumentation
PGO anyway.

Reviewers: davidxl

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D41395

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321197 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add debug location to new caller.

Reviewers: rnk, aprantl, majnemer

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D414

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321191 91177308-0d34-0410-b5e6-96231b3b80d8

[JumpTables] Let targets decide which switch instructions are suitable

This commits the non-controversial part of https://reviews.llvm.org/D41029
(making the queries virtual). The PPC-specific portion of this will be
committed in a subsequent patch once some of the finer points are ironed out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321182 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r320548:[SLP] Vectorize jumbled memory loads

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321181 91177308-0d34-0410-b5e6-96231b3b80d8

Add optional SelectionDAG* parameter to SValue::dump and SDValue::dumpr

These functions simply call their counterparts in the associated SDNode,
which do take an optional SelectionDAG. This change makes the legalization
debug trace a little easier to read, since target-specific nodes will
now have their names shown instead of "Unknown node #123".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321180 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Fix Typo. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321179 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Initial adaptation of MCAsmStreamer/MCTargetStreamer for debug info in Cuda.

Summary:
Initial changes in interfaces of MCAsmStreamer/MCTargetStreamer for
correct debug info emission for Cuda.
1. PTX foramt does not support `.ascii` directives. Added the ability to
nullify it.
2. The initial function label must follow the first debug `.loc`
directive, not be followed by.
3. DWARF sections must be enclosed in braces.

Reviewers: hfinkel, probinson, jlebar, rafael, echristo

Subscribers: sdardis, nemanjai, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D40033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321178 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Adjust the value type for BCvt in LowerFormalArguments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321177 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel][tablegen] Allow ImmLeaf predicates to use InstructionSelector members

NFC for currently supported targets. This resolves a problem encountered by
targets such as RISCV that reference `Subtarget` in ImmLeaf predicates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321176 91177308-0d34-0410-b5e6-96231b3b80d8

Allow to apply cherry-picks when building Docker images.

Reviewers: mehdi_amini, ioeric, klimek

Reviewed By: ioeric

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321175 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Remove unnecessary DoExtraAnalysis guard (silent bug)

canVectorize is only checking if the loop has a normalized pre-header if DoExtraAnalysis is true.
This doesn't make sense to me because reporting analysis information shouldn't alter legality
checks. This is probably the result of a last minute minor change before committing (?).

Patch by Diego Caballero.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D40973

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321172 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX2] Split more shuffle tests into 'slow' and 'fast' variable shuffles

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321171 91177308-0d34-0410-b5e6-96231b3b80d8

Trivial commit to force LLVM to run TableGen for Mips target after
a change to the AsmMatcherEmitter, and should fix the buildbot
failure on llvm-clang-x86_64-expensive-checks-win.

The issue is also described here:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119617.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321170 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetParser] Check size before accessing architecture version.

Summary:
This fixes a crash when invalid -march options like `armv` are provided.

Based on a patch by Will Lovett.

Reviewers: rengolin, samparker, mcrosier

Reviewed By: samparker

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D41429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321166 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Fix assertion in RegBankSelect

We get an assertion in RegBankSelect for code along the lines of
my_32_bit_int = my_64_bit_int, which tends to translate into a 64-bit
load, followed by a G_TRUNC, followed by a 32-bit store. This appears in
a couple of places in the test-suite.

At the moment, the legalizer doesn't distinguish between integer and
floating point scalars, so a 64-bit load will be marked as legal for
targets with VFP, and so will the rest of the sequence, leading to a
slightly bizarre G_TRUNC reaching RegBankSelect.

Since the current support for 64-bit integers is rather immature, this
patch works around the issue by explicitly handling this case in
RegBankSelect and InstructionSelect. In the future, we may want to
revisit this decision and make sure 64-bit integer loads are narrowed
before reaching RegBankSelect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321165 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Lower unsigned saturation to USAT

Summary:
Implement lower of unsigned saturation on an interval [0, k] where k + 1 is a power of two using USAT instruction in a similar way to how [~k, k] is lowered using SSAT on ARM models that supports it.

Patch by Marten Svanfeldt

Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn

Reviewed By: fhahn

Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D41348

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321164 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Re-submit patch series for ZIP1/ZIP2

This patch resubmits the SVE ZIP1/ZIP2 patch series consisting of
of r320992, r320986, r320973, and r320970 by reverting
https://reviews.llvm.org/rL321024.

The issue that caused r321024 has been addressed in https://reviews.llvm.org/rL321158,
so this patch-series should be safe to resubmit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321163 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: fix one more place movi.2d could be created.

Somehow got missed out of r320965.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321162 91177308-0d34-0410-b5e6-96231b3b80d8