
[Support] Remove redundant qualifiers in YAMLTraits (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344166 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[OptRemarks] Add library for parsing optimization remarks"

This reverts commit 1cc98e6672b6319fdb00b70dd4474aabdadbe193.

Seems to break bots: http://lab.llvm.org:8011/builders/clang-x86_64-linux-abi-test/builds/33398/steps/build-unified-tree/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344164 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Fix the artifact combiner to fold G_IMPLICIT_DEF properly

Summary:
GlobalISel generates incorrect code because the legalizer artifact
combiner assumes `G_[SZ]EXT (G_IMPLICIT_DEF)` is equivalent to
`G_IMPLICIT_DEF`.

Replace `G_[SZ]EXT (G_IMPLICIT_DEF)` with 0 because the top bits
must be 0 for G_ZEXT and either all 0s or all 1s for G_SEXT.
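
To see why the old fold was unsound, consider extending an 8-bit value to
32 bits. The sketch below is plain Python rather than LLVM code, and the
helper names `zext` and `sext` are made up for illustration. It enumerates
every possible 8-bit input and checks that some 32-bit patterns can never be
produced by either extension, while 0 always can.

```python
def zext(value, from_bits):
    """Zero-extend: every bit above from_bits is 0."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign-extend: every bit above from_bits copies the sign bit."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


# All results an 8-bit -> 32-bit extension of an arbitrary input can produce.
zext_results = {zext(v, 8) for v in range(256)}
sext_results = {sext(v, 8, 32) for v in range(256)}

# 0x100 has only bit 8 set. Neither extension can produce it, but an
# unconstrained 32-bit undefined value could take that pattern.
impossible = 0x100
print(impossible in zext_results, impossible in sext_results)

# 0 is reachable from both extensions (input 0), so folding to 0 is sound.
print(0 in zext_results, 0 in sext_results)
```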

Reviewers: aditya_nandakumar, dsanders, aemerson, javed.absar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D52996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344163 91177308-0d34-0410-b5e6-96231b3b80d8

[OptRemarks] Add library for parsing optimization remarks

Add a library that parses optimization remarks (currently YAML, so based
on the YAMLParser).

The goal is to be able to provide tools a remark parser that is not
completely dependent on YAML, in case we decide to change the format
later.

It exposes a C API which takes a handler that is called with the remark
structure.

It adds a libLLVMOptRemark.a static library, and it's used in-tree by
the llvm-opt-report tool (from which the parser has been mostly moved
out).

Differential Revision: https://reviews.llvm.org/D52776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344162 91177308-0d34-0410-b5e6-96231b3b80d8

[VPlan] Fix CondBit quoting in dumpBasicBlock

Quotes were being printed for VPInstructions but not the rest.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344161 91177308-0d34-0410-b5e6-96231b3b80d8

Change the timestamp of llvmcache-foo file to meet the thinLTO prune policy

The test fails randomly when it is run in a loop with the command
"while llvm-lit test/tools/gold/X86/cache.ll ; do true; done". The llvmcache-foo file is younger than llvmcache-349F039B8EB076D412007D82778442BED3148C4E and llvmcache-A8107945C65C2B2BBEE8E61AA604C311D60D58D6, but because of limited timestamp precision the three timestamps are usually identical. Given identical timestamps, the prune policy removes the larger file first, so the foo file is usually removed first because it is larger. After foo is deleted the total file size is under the threshold, which is what the test case expects.

However, sometimes the precision is high enough to show that the timestamps of llvmcache-349F039B8EB076D412007D82778442BED3148C4E and llvmcache-A8107945C65C2B2BBEE8E61AA604C311D60D58D6 are older than that of foo, so those two files are deleted first. Since the total file size is still above the threshold after deleting the two files, the foo file is also deleted. The test case then fails, because it expects only one file to be deleted instead of three.

The fix is to change the timestamp of llvmcache-foo file to meet the thinLTO prune policy.
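
The ordering described above can be sketched as a small simulation. This is
not the ThinLTO implementation. The file names, sizes, timestamps and the
threshold are hypothetical values chosen only to reproduce the two outcomes,
and the last case assumes the fix makes foo clearly the oldest file.

```python
def prune(files, max_total_size):
    """Return the names removed so that the remaining size fits the limit.

    files is a list of (name, mtime, size) tuples. Older files are removed
    first; among files with the same timestamp the larger one goes first,
    as the commit message describes.
    """
    order = sorted(files, key=lambda f: (f[1], -f[2]))
    total = sum(size for _, _, size in files)
    removed = []
    for name, _, size in order:
        if total <= max_total_size:
            break
        removed.append(name)
        total -= size
    return removed


LIMIT = 50  # hypothetical size threshold

# Coarse timestamps: all files look equally old, so size breaks the tie.
coarse = [("foo", 10, 80), ("cache-a", 10, 10), ("cache-b", 10, 10)]
# Fine timestamps: foo is measurably younger than the two real cache files.
fine = [("foo", 11, 80), ("cache-a", 10, 10), ("cache-b", 10, 10)]
# Assumed fix: foo gets a timestamp that makes it the oldest file.
fixed = [("foo", 1, 80), ("cache-a", 10, 10), ("cache-b", 10, 10)]

for label, files in (("coarse", coarse), ("fine", fine), ("fixed", fixed)):
    print(label, prune(files, LIMIT))
```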

Patch by Luo Yuanke.

Differential Revision: https://reviews.llvm.org/D52452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344158 91177308-0d34-0410-b5e6-96231b3b80d8

Relax trivial cast requirements in CallPromotionUtils

Differential Revision: https://reviews.llvm.org/D52792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344153 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix always true assert

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344151 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Minor refactoring in preparation for a patch that will fully fix PR36671. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344149 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Pass Instruction instead of bare Opcode

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344145 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][BtVer2] Add two more move-elimination tests. NFC

These should test all the optimizable moves on Jaguar.
A follow-up patch will teach llvm-mca how to recognize these optimizable register moves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344144 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Code simplification

Summary: Simplify code by having LLVMState hold the RegisterAliasingTrackerCache.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344143 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Improve Load-Store Forwarding

Summary:
Extend analysis forwarding loads from preceding stores to work with
extended loads and truncated stores to the same address so long as the
load is fully subsumed by the store.
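
As a rough illustration of the idea (not the DAGCombiner code), the sketch
below forwards a truncating store to an extending load at the same address.
It assumes a little-endian layout, and the helper names `forward` and
`via_memory` are hypothetical. Forwarding is refused whenever the load would
read bits the store did not write.

```python
def forward(stored_value, store_bits, load_bits, signed):
    """Value an extending load would see, computed without touching memory.

    Returns None when the load is wider than the store, because the load is
    then not fully covered by the stored bytes.
    """
    if load_bits > store_bits:
        return None
    value = stored_value & ((1 << load_bits) - 1)
    if signed and value >> (load_bits - 1):
        value -= 1 << load_bits
    return value


def via_memory(stored_value, store_bits, load_bits, signed):
    """Reference result: actually write the bytes, then read them back."""
    truncated = stored_value & ((1 << store_bits) - 1)
    memory = truncated.to_bytes(store_bits // 8, "little")
    return int.from_bytes(memory[: load_bits // 8], "little", signed=signed)


# A 32-bit value stored through a 16-bit truncating store, then read back
# with an 8-bit zero-extending and an 8-bit sign-extending load.
print(forward(0x1234FF80, 16, 8, signed=False))
print(forward(0x1234FF80, 16, 8, signed=True))
```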

Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll tests are
deleted as they no longer seem to be relevant.

Reviewers: RKSimon, rnk, kparzysz, javed.absar

Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D49200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344142 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] allow single source horizontal op matching (PR39195)

This is intended to restore horizontal codegen to what it looked like before IR demanded elements improved in:
rL343727

As noted in PR39195:
https://bugs.llvm.org/show_bug.cgi?id=39195
...horizontal ops can be worse for performance than a shuffle+regular binop, so I've added a TODO. Ideally, we'd
solve that in a machine instruction pass, but a quicker solution will be adding a 'HasFastHorizontalOp' feature
bit to deal with it here in the DAG.

Differential Revision: https://reviews.llvm.org/D52997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344141 91177308-0d34-0410-b5e6-96231b3b80d8

Lift VFS from clang to llvm (NFC)

This patch moves the virtual file system from clang to llvm so it can be
used by more projects.

Concretely the patch:
- Moves VirtualFileSystem.{h|cpp} from clang/Basic to llvm/Support.
- Moves the corresponding unit test from clang to llvm.
- Moves the vfs namespace from clang::vfs to llvm::vfs.
- Formats the lines affected by this change, mostly this is the result of
the added llvm namespace.

RFC on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/126657.html

Differential revision: https://reviews.llvm.org/D52783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344140 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix function return generation so it doesn't return register 0

When fillMachineFunction generates a return on targets without a return opcode
(such as AArch64) it should pass an empty set of registers as the return
registers, not 0 which means register number zero.

Differential Revision: https://reviews.llvm.org/D53074

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344139 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] SimplifyDemandedBits - rename demanded mask args. NFCI.

Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344138 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Fix typo

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344137 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] SimplifyDemandedBits - pull out repeated getOperands. NFCI.

Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344136 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG."

This reverts commit r344120.

It was causing buildbot failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344135 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] fix assert in !cast when used out of definition in a multiclass

Differential Revision: https://reviews.llvm.org/D53068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344134 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Add root node back to work list after successful SimplifyDemandedBits/SimplifyDemandedVectorElts

Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful.

Differential Revision: https://reviews.llvm.org/D53026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344132 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix broken build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344131 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Simplify code now that Instruction has more semantics

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53065

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344130 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Temporarily disable high VFs with integer div/rem.

Until the MI scheduler is clever enough to avoid spilling in a vectorized loop
with many (scalar) DLRs, it is better to avoid high vectorization factors (8
and above).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344129 91177308-0d34-0410-b5e6-96231b3b80d8

Fix an ordering bug in the scalarizer.

I've added a new test case that causes the scalarizer to try and use
dead-and-erased values - caused by the basic blocks not being in
domination order within the function. To fix this, instead of iterating
through the blocks in function order, I walk them in reverse post order.
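
A minimal sketch of the traversal, in Python rather than the pass's C++.
The tiny CFG and block names are hypothetical. Reverse post order visits a
block only after every block that dominates it, which is the property the
fix relies on, whereas the function's layout order gives no such guarantee.

```python
def reverse_post_order(entry, successors):
    """Return the blocks reachable from entry in reverse post order."""
    visited, post = set(), []

    def dfs(block):
        visited.add(block)
        for succ in successors.get(block, []):
            if succ not in visited:
                dfs(succ)
        post.append(block)  # a block is finished after all its successors

    dfs(entry)
    return post[::-1]


# "use" is laid out before "def" in the function, although "def" dominates it.
layout_order = ["entry", "use", "def"]
cfg = {"entry": ["def"], "def": ["use"], "use": []}

rpo = reverse_post_order("entry", cfg)
print(layout_order, rpo)
```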

Differential Revision: https://reviews.llvm.org/D52540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344128 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Remove unused variable, add more semantics to Instruction.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53062

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344127 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.

When SimplifyCFG changes the PHI node into a select instruction, the debug line records become ambiguous. This causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D52887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344120 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove FeatureRTM from Skylake processor list

Summary:
There are a LOT of Skylakes and later without TSX-NI. Examples:
- SKL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3-20-GHz-
- KBL: https://ark.intel.com/products/97540/Intel-Core-i7-7560U-Processor-4M-Cache-up-to-3-80-GHz-
- KBL-R: https://ark.intel.com/products/149091/Intel-Core-i7-8565U-Processor-8M-Cache-up-to-4-60-GHz-
- CNL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3_20-GHz

This feature seems to be present only on high-end desktop and server
chips (I can't find any SKX without it). This commit leaves it disabled
for all processors, but it can be re-enabled for specific builds with
-mrtm.

Patch by Thiago Macieira

Reviewers: erichkeane, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344116 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Take better care when computing needed vector registers in TTI.

A new function, getNumVectorRegs(), is now used instead of getNumberOfParts()
to compute the number of needed vector registers. This is to make sure that
the number of vector registers (and typically operations) required for a
vector type is accurate.

getNumberOfParts(), which was previously used, works by splitting the vector
type until it is legal, and gives incorrect results for types with a non
power of two number of elements (rare).

A new static function, getScalarSizeInBits(), also checks for a pointer
type and returns 64U for it (since otherwise it gets a value of 0). It is
used in a few places where Ty may be a pointer.
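
The counting can be sketched as a ceiling division over the total bit width.
This is an illustration in Python, not the TTI code. It assumes 128-bit
vector registers, and the function signatures are hypothetical stand-ins
for the C++ helpers named above.

```python
VECTOR_REG_BITS = 128  # assumption: one vector register holds 128 bits


def scalar_size_in_bits(elem_bits, is_pointer=False):
    """Element width, treating pointers as 64 bits wide."""
    return 64 if is_pointer else elem_bits


def num_vector_regs(num_elements, elem_bits, is_pointer=False):
    """Registers needed to hold the whole vector, rounded up."""
    total_bits = num_elements * scalar_size_in_bits(elem_bits, is_pointer)
    return -(-total_bits // VECTOR_REG_BITS)  # ceiling division


print(num_vector_regs(4, 32))                  # <4 x i32>
print(num_vector_regs(5, 64))                  # <5 x i64>, not a power of two
print(num_vector_regs(2, 0, is_pointer=True))  # vector of two pointers
```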

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344115 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Make LocationSizes carry an 'imprecise' bit

There are places where we need to merge multiple LocationSizes of
different sizes into one, and get a sensible result.

There are other places where we want to optimize aggressively based on
the value of a LocationSize (e.g. how can a store of four bytes be to
an area of storage that's only two bytes large?)

This patch makes LocationSize hold an 'imprecise' bit to note whether
the LocationSize can be treated as an upper-bound and lower-bound for
the size of a location, or just an upper-bound.

This concludes the series of patches leading up to this, the most recent
of which is r344108.

Fixes PR36228.

Differential Revision: https://reviews.llvm.org/D44748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344114 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Make a variable const

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344113 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] Add a test case for extract and store patterns

An upcoming patch will change the codegen for these patterns. This test case is
added now so that the patch can show the differences in codegen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344112 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Fix the 'call.ll' CodeGen test

Commit r343851 changed the format of the generated instructions.

An unnecessary load has been removed. Previously, a value would be moved
from r24 into a temporary register just to be copied into r30 before the
indirect call. Now, codegen immediately loads r24 into r30, saving a
MOVW instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344111 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8
An ISD::SIGN_EXTEND_INREG operation on v2i16 or v2i8 types triggers an assert because the operation is registered as custom for these types.
The type legalization phase therefore enters the custom hook, which does not handle ISD::SIGN_EXTEND_INREG and falls through into an unreachable assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52449

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344109 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Make LocationSize pretty-printing more descriptive

This is the third patch in a series intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context. The second patch was r344013.

The intent is to make the output of printing a LocationSize more
precise. The main motivation for this is that we plan to add a bit to
distinguish whether a given LocationSize is an upper-bound or is
precise; making that information available in pretty-printing is nice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344108 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix fneg lowering

Summary:
Subtraction from zero and floating point negation do not have the same
semantics, so fix lowering.
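
A quick way to see the difference is the sign of zero under IEEE 754
arithmetic. The check below uses Python floats purely as an illustration
and does not involve the WebAssembly backend.

```python
import math

x = 0.0
subtracted = 0.0 - x  # +0.0 minus +0.0 gives +0.0
negated = -x          # negation flips the sign bit, giving -0.0

# The two results compare equal, so look at the sign bit instead.
print(math.copysign(1.0, subtracted), math.copysign(1.0, negated))
```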

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344107 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Improve comments for SIMD instruction definitions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344106 91177308-0d34-0410-b5e6-96231b3b80d8

[sancov] Generalize the code to get the previous instruction to multiple architectures

sancov subtracts one from the address to get the previous instruction,
which makes sense on x86_64, but not on other platforms.
This change ensures that the offset is correct for different platforms.
The logic for computing the offset is copied from sanitizer_common.
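
The shape of such a per-architecture offset can be sketched as a lookup.
The table below is illustrative only: its two entries are assumptions based
on instruction encodings (variable-length on x86_64, fixed 4 bytes on
AArch64), not values copied from sancov or sanitizer_common, and the
function name is hypothetical.

```python
# Bytes to step back from a return address so that the result lies inside
# the calling instruction. Illustrative values, see the assumptions above.
STEP_BACK = {
    "x86_64": 1,   # variable-length encoding: one byte back is enough
    "aarch64": 4,  # every instruction is 4 bytes long
}


def previous_instruction_pc(pc, arch):
    """Address guaranteed to fall inside the instruction before pc."""
    return pc - STEP_BACK[arch]


print(hex(previous_instruction_pc(0x1004, "x86_64")))
print(hex(previous_instruction_pc(0x1004, "aarch64")))
```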

Differential Revision: https://reviews.llvm.org/D53039

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344103 91177308-0d34-0410-b5e6-96231b3b80d8

[opt] Change the parameter of OptTable::PrintHelp from Name to Usage and don't append "[options] <inputs>"

Summary:
Previously, "[options] <inputs>" was unconditionally appended to the `Name` parameter. It is more flexible to change its semantics to `Usage` and let the user customize the usage line.

% llvm-objcopy
...
USAGE: llvm-objcopy <input> [ <output> ] [options] <inputs>

With this patch:

% llvm-objcopy
...
USAGE: llvm-objcopy input [output]

Reviewers: rupprecht, alexshap, jhenderson

Reviewed By: rupprecht

Subscribers: jakehehrlich, mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344097 91177308-0d34-0410-b5e6-96231b3b80d8

[git-llvm] Fix some issues surrounding EOL conversion on Windows.

This patch fixes three issues.

The first is that we didn't consider files which are explicitly
set to eolstyle CRLF in the repo, and there are a handful of
these.

Second is that dos2unix doesn't have a -q option in GnuWin32,
so this codepath wasn't working properly.

Finally with newer versions of Python (or newer versions of Git,
or some combination of the two) patches can't be applied when
we treat stdin as text, because Python silently undoes all the
work we did to convert the newlines to LF using dos2unix by
using universal_newlines=True and then converting them *back*
to CRLF. So we need to add a way to force stdin to be treated
as binary, and use it when LF-newlines are required.
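
The third point can be sketched with the standard `subprocess` module. This
is not the git-llvm code: the helper name and the echo child process are
hypothetical. The idea is to hand the child raw bytes and leave
`universal_newlines` unset, so Python performs no newline translation on
the way in or out.

```python
import subprocess
import sys


def run_with_binary_stdin(cmd, data):
    """Feed bytes to a child process without any newline translation."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(input=data)
    return out


# Child that copies stdin to stdout byte for byte.
echo = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

patch = b"line one\nline two\n"
print(run_with_binary_stdin(echo, patch) == patch)
```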

Differential Revision: https://reviews.llvm.org/D51444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344095 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Handle V128 register class in explicit locals pass

Summary:
Also add tests to catch crashes in passes that are not normally run in
tests.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344094 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops

When the target has "bit preserving fp logic", we already do the following
combines:
(bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X
(bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X

This patch just extends that to also combine:
(bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X)

This is worthwhile because some targets have fnabs, and even those that
don't can efficiently lower both the fabs and the fneg.
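
The three identities can be checked on 32-bit floats with the `struct`
module. This is only a numeric illustration of the bit patterns, not
DAGCombiner code, and the helper names are made up.

```python
import struct


def f32_bits(x):
    """Bit pattern of x as an IEEE 754 single-precision float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(bits):
    """Float whose single-precision bit pattern is bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


SIGN_BIT = 0x80000000
ABS_MASK = 0x7FFFFFFF

for x in (1.5, -1.5, 0.0, -2.0):
    bits = f32_bits(x)
    assert bits_f32(bits & ABS_MASK) == abs(x)    # and -> fabs
    assert bits_f32(bits ^ SIGN_BIT) == -x        # xor -> fneg
    assert bits_f32(bits | SIGN_BIT) == -abs(x)   # or  -> fneg (fabs)
```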

Differential revision: https://reviews.llvm.org/D44548

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344093 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix sanitizer bot failure from 344085

Fix the memory issue exposed by sanitizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344092 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] Commit nabs test case in preparation for committing D44548

This just adds the test case so that the different code gen is clearly visible
when the DAG Combine lands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344091 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Move test for r343954 into x86 subdirectory

This test uses an x86 triple, so it needs to be in the x86 specific
test directory.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344087 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Improve readability of SIMD instructions (NFC)

Summary:
- Categorize instructions into the categories as in the SIMD spec
- Move SIMD-related definition to WebAssemblyInstrSIMD.td
- Put definition and use of patterns together
- Add newlines here and there

Reviewers: tlively

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53045

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344086 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r343993: [X86] condition branches folding for three-way conditional codes

Fix the memory issue exposed by sanitizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344085 91177308-0d34-0410-b5e6-96231b3b80d8

[FPEnv] PatternMatcher support for checking FNEG ignoring signed zeros

https://reviews.llvm.org/D52934

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344084 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reverse 'trunc X to <N x i1>' canonicalization

icmp ne (and X, 1), 0 --> trunc X to N x i1

Ideally, we'd do the same for scalars, but there will likely be
regressions unless we add more trunc folds as we're doing here
for vectors.

The motivating vector case is from PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
  %c = fcmp ole <4 x float> %x, %y
  %s = sext <4 x i1> %c to <4 x i32>
  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
  %cond = or <4 x i32> %s1, %s2
  %condtr = trunc <4 x i32> %cond to <4 x i1>
  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
  ret <4 x float> %r
}

Here's a sampling of the vector codegen for that case using
mask+icmp (current behavior) vs. trunc (with this patch):

AVX before:

vcmpleps %xmm1, %xmm0, %xmm0
vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps %xmm0, %xmm1, %xmm0
vandps LCPI0_0(%rip), %xmm0, %xmm0
vxorps %xmm1, %xmm1, %xmm1
vpcmpeqd %xmm1, %xmm0, %xmm0
vblendvps %xmm0, %xmm3, %xmm2, %xmm0

AVX after:

vcmpleps %xmm1, %xmm0, %xmm0
vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps %xmm0, %xmm1, %xmm0
vblendvps %xmm0, %xmm2, %xmm3, %xmm0

AVX512f before:

vcmpleps %xmm1, %xmm0, %xmm0
vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps %xmm0, %xmm1, %xmm0
vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
vptestnmd %zmm1, %zmm0, %k1
vblendmps %zmm3, %zmm2, %zmm0 {%k1}

AVX512f after:

vcmpleps %xmm1, %xmm0, %xmm0
vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps %xmm0, %xmm1, %xmm0
vpslld $31, %xmm0, %xmm0
vptestmd %zmm0, %zmm0, %k1
vblendmps %zmm2, %zmm3, %zmm0 {%k1}

AArch64 before:

fcmge v0.4s, v1.4s, v0.4s
zip1 v1.4s, v0.4s, v0.4s
zip2 v0.4s, v0.4s, v0.4s
orr v0.16b, v1.16b, v0.16b
movi v1.4s, #1
and v0.16b, v0.16b, v1.16b
cmeq v0.4s, v0.4s, #0
bsl v0.16b, v3.16b, v2.16b

AArch64 after:

fcmge v0.4s, v1.4s, v0.4s
zip1 v1.4s, v0.4s, v0.4s
zip2 v0.4s, v0.4s, v0.4s
orr v0.16b, v1.16b, v0.16b
bsl v0.16b, v2.16b, v3.16b

PowerPC-le before:

xvcmpgesp 34, 35, 34
vspltisw 0, 1
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxlxor 35, 35, 35
xxland 34, 0, 32
vcmpequw 2, 2, 3
xxsel 34, 36, 37, 34

PowerPC-le after:

xvcmpgesp 34, 35, 34
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxsel 34, 37, 36, 0

Differential Revision: https://reviews.llvm.org/D52747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344082 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Fix another bug in globals stream name lookup.

When we're on the last bucket the computation is tricky.
We were failing when the last bucket contained multiple
matches. Added a new test for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344081 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Make -S an alias for --strip-all

-S should be an alias for --strip-all not --strip-all-gnu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344080 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-dwarfdump: Extend --name to also search DW_AT_linkage_name.

rdar://problem/45132695

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344079 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Promote and rename private symbols inside the CompileOnDemand layer,
rather than require them to have been promoted before being passed in.

Dropping this precondition is better for layer composition (CompileOnDemandLayer
was the only one that placed pre-conditions on the modules that could be added).
It also means that the promoted private symbols do not show up in the target
JITDylib's symbol table. Instead, they are confined to the hidden implementation
dylib that contains the actual definitions.

For the 403.gcc testcase this cut down the public symbol table size from ~15,000
symbols to ~4000, substantially reducing symbol dependence tracking costs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344078 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Implement hasBitPreservingFPLogic for types that can be supported

This is the PPC-specific non-controversial part of
https://reviews.llvm.org/D44548 that simply enables this combine for PPC
since PPC has these instructions.
This commit will allow the target-independent portion to be truly target
independent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344077 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When lowering unsigned v2i64 setcc without SSE42, flip the sign bits in the v2i64 type then bitcast to v4i32.

This may give slightly better opportunities for DAG combine to simplify with the operations before the setcc. It also matches the type the xors will eventually be promoted to anyway so it saves a legalization step.
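
The reason flipping the sign bits works is that XOR with the sign bit maps
unsigned order onto signed order. The sketch below checks this on 64-bit
values in Python. It illustrates the arithmetic only, not the X86 lowering
code, and the helper names are hypothetical.

```python
BITS = 64
SIGN_BIT = 1 << (BITS - 1)


def to_signed(value):
    """Reinterpret a 64-bit pattern as a two's complement signed integer."""
    return value - (1 << BITS) if value & SIGN_BIT else value


def unsigned_gt_via_signed(a, b):
    """Unsigned a > b, computed with a signed compare after a sign flip."""
    return to_signed(a ^ SIGN_BIT) > to_signed(b ^ SIGN_BIT)


samples = [0, 1, SIGN_BIT - 1, SIGN_BIT, SIGN_BIT + 1, (1 << BITS) - 1]
for a in samples:
    for b in samples:
        assert unsigned_gt_via_signed(a, b) == (a > b)
```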

Almost all of the test changes are because our constant pool entry is now v2i64 instead of v4i32 on 64-bit targets. On 32-bit targets getConstant should be emitting a v4i32 build_vector and a v4i32->v2i64 bitcast.

There are a couple test cases where it appears we now combine a bitwise not with one of these xors which caused a new constant vector to be generated. This prevented a constant pool entry from being shared. But if that's an issue we're concerned about, it seems we need to address it another way than just relying on a bitcast to hide it.

This came about from experiments I've been trying with pushing the promotion of and/or/xor to vXi64 later than LegalizeVectorOps where it is today. We run LegalizeVectorOps in a bottom up order. So the and/or/xor are promoted before their users are legalized. The bitcasts added for the promotion act as a barrier to computeKnownBits if we try to use it during vector legalization of a later operation. So by moving the promotion out we can hopefully get better results from computeKnownBits/computeNumSignBits like in LowerTruncate on AVX512. I've also looked at running LegalizeVectorOps in a top down order like LegalizeDAG, but that's showing some other issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344071 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Check that lowered type is floating point before calling isFabsFree

In the case of soft-fp (e.g. fp128 under wasm) the result of
getTypeLegalizationCost() can be an integer type even if the input is
floating point (See LegalizeTypeAction::TypeSoftenFloat).

Before calling isFabsFree() (which asserts if given a non-fp
type) we need to check that the result is fp. This is safe since
fabs is certainly not free in the soft-fp case.

Fixes PR39168

Differential Revision: https://reviews.llvm.org/D52899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344069 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Make llvm-dwarfdump display the .debug_loc.dwo section. Fixes PR38991.

Reviewer: dblaikie

Differential Revision: https://reviews.llvm.org/D52444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344068 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for extract subvector shuffles; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344067 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing space

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344064 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Fix failure on big endian machines.

We changed an ArrayRef<uint8_t> to an ArrayRef<uint32_t>, but
it needs to be an ArrayRef<support::ulittle32_t>.
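
The difference between a native-endian and an explicitly little-endian
32-bit read can be shown with Python's `struct` module. This only
illustrates the byte-order issue and is not the PDB reader; the sample
bytes are hypothetical.

```python
import struct

# Four bytes as they would appear in a file that stores 0x12345678
# in little-endian order.
raw = bytes([0x78, 0x56, 0x34, 0x12])

# An explicit little-endian read gives the same answer on every host.
little = struct.unpack("<I", raw)[0]

# A big-endian interpretation of the same bytes, which is what a native
# 32-bit read would produce on a big-endian machine.
big = struct.unpack(">I", raw)[0]

print(hex(little), hex(big))
```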

We also change ArrayRef<> to FixedStreamArray<>. Technically
an ArrayRef<> will work, but it can cause a copy in the underlying
implementation if the memory is not contiguous, and there's no
reason not to use a FixedStreamArray<>.

Thanks to nemanjai@ and thakis@ for helping me track this down
and confirm the fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344063 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Autogenerate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344060 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][x86] add tests for bitcasted fnabs; NFC

Alternate target coverage for D44548.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344059 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] make helper function 'static'; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344056 91177308-0d34-0410-b5e6-96231b3b80d8

Fix function case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344051 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix invalid return type and add a Dump function.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344050 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] use demanded bits to simplify masked store codegen

As noted in D52747, if we prefer IR to use trunc for bool vectors rather
than and+icmp, we can expose codegen shortcomings as seen here with masked store.

Replace a hard-coded PCMPGT simplification with the more general demanded bits call
to improve things.

Differential Revision: https://reviews.llvm.org/D52964

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344048 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG and CONCAT_VECTORS support to SimplifyDemandedBits

Fix for AVX1 masked load/store regression on D52964

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344043 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix FDE/CFI encoding in case of N32 ABI

For the O32 and N32 ABIs the FDE/CFI encoding should be `DW_EH_PE_sdata4`;
only the N64 ABI uses `DW_EH_PE_sdata8`. To cover all cases this patch checks
the code pointer size and sets up the correct FDE/CFI encoding type.
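
The selection rule can be sketched as a function of the code pointer size.
This is an illustration, not the MIPS backend code: the function name and
the ABI table are hypothetical, while the two constants are the standard
DW_EH_PE pointer-encoding values.

```python
DW_EH_PE_sdata4 = 0x0B
DW_EH_PE_sdata8 = 0x0C

# Code pointer size in bytes for each MIPS ABI mentioned above.
CODE_POINTER_SIZE = {"O32": 4, "N32": 4, "N64": 8}


def fde_cfi_encoding(code_pointer_size):
    """Pick the FDE/CFI encoding from the code pointer size in bytes."""
    return DW_EH_PE_sdata8 if code_pointer_size == 8 else DW_EH_PE_sdata4


for abi, size in CODE_POINTER_SIZE.items():
    print(abi, hex(fde_cfi_encoding(size)))
```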

Differential revision: https://reviews.llvm.org/D52876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344040 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Set pointer size to 4 bytes for N32 ABI

CodePointerSize and CalleeSaveStackSlotSize values are used in DWARF
generation. In case of MIPS it's incorrect to check for Triple::isMIPS64()
only, because this function returns true for the N32 ABI too.

Currently we have no way to recognize N32 if it's specified by a command
line option and is not a part of a target triple. So we check for
Triple::GNUABIN32 only. It's better than nothing.

Differential revision: https://reviews.llvm.org/D52874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344039 91177308-0d34-0410-b5e6-96231b3b80d8

Fix buildbot failures with the newly added test case (triple was missing).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344037 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Remove self-copies in pre-emit peephole

There are occasionally instances where AADB rewrites registers in such a way
that a reg-reg copy becomes a self-copy. Such an instruction is obviously
redundant and can be removed. This patch does precisely that.
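
The idea can be sketched outside of LLVM as follows. This is a simplified
model, not the peephole code itself: `CopyInstr` and `removeSelfCopies` are
hypothetical names, and registers are modeled as plain integers.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified stand-in for a register-to-register copy instruction.
struct CopyInstr {
  unsigned Dst;
  unsigned Src;
};

// Erase every copy whose destination equals its source and return how many
// instructions were removed.
inline std::size_t removeSelfCopies(std::vector<CopyInstr> &Copies) {
  auto NewEnd =
      std::remove_if(Copies.begin(), Copies.end(),
                     [](const CopyInstr &C) { return C.Dst == C.Src; });
  std::size_t Removed = static_cast<std::size_t>(Copies.end() - NewEnd);
  Copies.erase(NewEnd, Copies.end());
  return Removed;
}
```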

Note that this will not remove various nop's that we insert (which are
themselves just self-copies). The reason those are left alone is that all of
them have their own opcodes (that just encode to a self-copy).

What prompted this patch is the fact that these self-copies sometimes end up
using registers that make the instruction a priority-setting nop, thereby
having a significant effect on performance.

Differential revision: https://reviews.llvm.org/D52432

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344036 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix wrong index type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344032 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix unused lambda capture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344029 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Use accessors for Operand.

Summary:
This moves checking logic into the accessors and makes the structure smaller.
It will also help when/if Operands are generated from the TD files.

Subscribers: tschuett, courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D52982

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344028 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Force the alignment of the `data` field of `IntervalMap`

Summary:
This patch forces the alignment of the `data` field of `IntervalMap`.
This is needed because x86 MSVC does not automatically apply alignments
greater than 4 bytes (that is, without `__declspec(align(...))`),
even if `alignof` reports such an alignment. Consider the example:

https://godbolt.org/z/zIPa_G

Here `alignof` for both `S0` and `S1` returns `8`, but only `S1` is really
aligned on x86. The explanation of this behavior is here:

https://docs.microsoft.com/en-us/cpp/build/conflicts-with-the-x86-compiler
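
For readers without access to the links, forcing the alignment of a field
looks roughly like the sketch below. It is not the `IntervalMap` code and not
the godbolt example; the struct names are hypothetical, and it uses standard
C++11 `alignas` as an assumed way to spell the request.

```cpp
#include <cstddef>

// Without an explicit request, a char buffer only needs 1-byte alignment,
// so it directly follows the one-byte tag.
struct Unforced {
  char Tag;
  char Data[16];
};

// With alignas, the buffer (and therefore the whole struct) is explicitly
// required to be 8-byte aligned.
struct Forced {
  char Tag;
  alignas(8) char Data[16];
};

static_assert(alignof(Forced) == 8, "explicit alignment raises alignof");
```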

Reviewers: bkramer, stoklund, hans, rnk

Reviewed By: rnk

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344027 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[ADT] Change the `IntervalMap` alignment assert for x86 MSVC"

This reverts commit 7f9eb168a9a8f5ff4fc931a00aec43e8706afecb.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344020 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX1] Enable *_EXTEND_VECTOR_INREG lowering of 256-bit vectors

As discussed on D52964, this adds 256-bit *_EXTEND_VECTOR_INREG lowering support for AVX1 targets to help improve SimplifyDemandedBits handling.

Differential Revision: https://reviews.llvm.org/D52980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344019 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Change the `IntervalMap` alignment assert for x86 MSVC

Summary:
This patch forces the alignment of the `data` field of `IntervalMap`.
This is needed because x86 MSVC does not automatically apply alignments
greater than 4 bytes (that is, without `__declspec(align(...))`),
even if `alignof` reports such an alignment. Consider the example:

https://godbolt.org/z/zIPa_G

Here `alignof` for both `S0` and `S1` returns `8`, but only `S1` is really
aligned on x86. The explanation of this behavior is here:

https://docs.microsoft.com/en-us/cpp/build/conflicts-with-the-x86-compiler

Reviewers: bkramer, stoklund, hans, rnk

Reviewed By: rnk

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344018 91177308-0d34-0410-b5e6-96231b3b80d8

[CFG Printer] Add support for writing the dot files with a custom prefix.

Use this to direct these files to a specific location in the test suite
so that we don't write files out to random directories (or fail if the
working directory isn't writable).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344014 91177308-0d34-0410-b5e6-96231b3b80d8

Make LocationSize a proper Optional type; NFC

This is the second in a series of changes intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context. The first change being r344012.

Since I was requested to do all of this with post-commit review, this is
about as small as I can make this patch.

This patch makes LocationSize into an actual type that wraps a uint64_t;
users are required to call getValue() in order to get the size now. If
the LocationSize has an Unknown size (e.g. if LocSize ==
MemoryLocation::UnknownSize), getValue() will assert.
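
A minimal sketch of the kind of wrapper described above, assuming an all-ones
sentinel for the unknown size. `SizeOrUnknown` and `UnknownSentinel` are
hypothetical names; the real LocationSize interface may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sentinel for "size not known".
constexpr std::uint64_t UnknownSentinel = ~std::uint64_t(0);

// Wraps a uint64_t so that callers must ask for the value explicitly.
class SizeOrUnknown {
  std::uint64_t Raw;

public:
  explicit SizeOrUnknown(std::uint64_t Size) : Raw(Size) {}

  static SizeOrUnknown unknown() { return SizeOrUnknown(UnknownSentinel); }

  bool hasValue() const { return Raw != UnknownSentinel; }

  // Asserts instead of silently handing back the sentinel.
  std::uint64_t getValue() const {
    assert(hasValue() && "getting the value of an unknown size");
    return Raw;
  }
};
```

The point of the explicit accessor is that every read of the size sits next
to a visible check, rather than relying on callers remembering to compare
against the sentinel.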

This also adds DenseMap specializations for LocationInfo, which required
taking two more values from the set of values LocationInfo can
represent. Hence, heavy users of multi-exabyte arrays or structs may
observe slightly lower-quality code as a result of this change.

The intent is for getValue()s to be very close to a corresponding
hasValue() (which is often spelled `!= MemoryLocation::UnknownSize`).
Sadly, small diff context appears to crop that out sometimes, and the
last change in DSE does require a bit of nonlocal reasoning about
control-flow. :/

This also removes an assert, since it's now redundant with the assert in
getValue().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344013 91177308-0d34-0410-b5e6-96231b3b80d8

Use locals instead of struct fields; NFC

This is one of a series of changes intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context.

Since I was requested to do all of this with post-commit review, this is
about as small as I can make it (beyond committing changes to these few
files separately, but they're incredibly similar in spirit, so...)

On its own, this change doesn't make a great deal of sense. I plan on
having a follow-up Real Soon Now(TM) to make the bits here make more
sense. :)

In particular, the next change in this series is meant to make
LocationSize an actual type, which you have to call .getValue() on in
order to get at the uint64_t inside. Hence, this change refactors code
so that:
- we only need to call the soon-to-come getValue() once in most cases,
and
- said call to getValue() happens very close to a piece of code that
checks if the LocationSize has a value (e.g. if it's != UnknownSize).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344012 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-link: Improve diagnostic for module-level metadata mismatch

This might produce hard-to-read or illegible diagnostics for especially
weird or non-trivial module metadata, but integers are about all we are
using these days, so this seems more useful than not.

Patch based on work by Kristina Brooks - thanks!

Differential Revision: https://reviews.llvm.org/D52952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344011 91177308-0d34-0410-b5e6-96231b3b80d8

ExpandPostRAPseudos: Fix alldefsAreDead() not removing operands

One case left around nonsensical operands for the KILL instruction,
which the machine verifier checks for nowadays. While this should not
hurt in release builds, we should fix the machine verifier errors anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344008 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Legalize i64 add

Custom legalize s64 G_ADD for MIPS32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344007 91177308-0d34-0410-b5e6-96231b3b80d8

TwoAddressInstructionPass: Modernize/fix some comments; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344006 91177308-0d34-0410-b5e6-96231b3b80d8

PHIElimination: Remove wrong comment; NFC

The comment contradicted the code. Looking at the history, the feature
was implemented a day after the comment was written, without dropping the
comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344005 91177308-0d34-0410-b5e6-96231b3b80d8

MachineFunctionPrinterPass: Declare SlotIndexes as used if available; NFC

This makes print-machineinstrs print the slot indexes in more
situations. NFC for normal compilation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344004 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344002 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] fix a bug in global stream name lookup.

When we're looking up a record in the last hash bucket chain, we
need to be careful with the end-offset calculation.
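
To illustrate the kind of edge case involved, here is a sketch under an
assumed layout, not the PDB reader's actual data structures: each bucket
stores only the start index of its chain in a flat record array, so the end
of a chain is the start of the next one, except for the last bucket.
`bucketRange`, `BucketStarts` and `NumRecords` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open range [Begin, End) of records in the given bucket.
// BucketStarts[i] is the index of the first record of bucket i.
inline std::pair<std::size_t, std::size_t>
bucketRange(const std::vector<std::size_t> &BucketStarts, std::size_t Bucket,
            std::size_t NumRecords) {
  std::size_t Begin = BucketStarts[Bucket];
  // The last bucket has no successor to read the end from, so its chain
  // ends at the total record count.
  bool IsLast = Bucket + 1 == BucketStarts.size();
  std::size_t End = IsLast ? NumRecords : BucketStarts[Bucket + 1];
  return {Begin, End};
}
```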

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344001 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Fix debug information label tests

Remove the space in the asm check so that the expression is more general
and can also capture MIPS labels, which can be surrounded by parentheses, e.g.:

.4byte ($tmp1) # DW_AT_low_pc

Also change optimization level to O0 because the DW_TAG_label does not
appear on MIPS when -O2 is used.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D52901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343999 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Revert r343993 condition branches folding for three-way conditional codes

Some buildbots failed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343998 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] simplify code for fmul with constant fold; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343997 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Prefer isTypeLegal over checking isSimple in a DAG combine.

Simple types are a superset of the types that any in-tree target in LLVM could possibly treat as legal. This means the behavior of using isSimple to check for a supported type for X86 could change over time. For example, this would change if a v256i1 type was added to MVT in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343995 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for phaddd/phaddw; NFC

More tests related to PR39195:
https://bugs.llvm.org/show_bug.cgi?id=39195

If we limit the horizontal codegen, it may require different
constraints for FP and integer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343994 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] condition branches folding for three-way conditional codes

This patch implements a pass that optimizes condition branches on x86 by
taking advantage of the three-way conditional code generated by compare
instructions.

Currently, it tries to hoist EQ and NE conditional branches to a dominating
conditional branch where the same EQ/NE condition code is computed.
An example:

bb_0:
  cmp %0, 19
  jg bb_1
  jmp bb_2
bb_1:
  cmp %0, 40
  jg bb_3
  jmp bb_4
bb_4:
  cmp %0, 20
  je bb_5
  jmp bb_6

Here we could combine the two compares in bb_0 and bb_4 and have the
following code:

bb_0:
  cmp %0, 20
  jg bb_1
  jl bb_2
  jmp bb_5
bb_1:
  cmp %0, 40
  jg bb_3
  jmp bb_6

For the case of %0 == 20 (bb_5), we eliminate two jumps, and the control height
for bb_6 is also reduced. bb_4 is gone after the optimization.
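
The equivalence of the two control-flow graphs above can be checked with a
small model. This is only a sketch: it assumes signed comparisons (as `jg`
and `jl` imply), and the function and enumerator names are hypothetical.

```cpp
// The four exit blocks reachable in the example.
enum Block { BB2, BB3, BB5, BB6 };

// Control flow before the transformation.
inline Block before(int X) {
  if (X > 19) {                // bb_0: cmp %0, 19 ; jg bb_1
    if (X > 40)                // bb_1: cmp %0, 40 ; jg bb_3
      return BB3;
    if (X == 20)               // bb_4: cmp %0, 20 ; je bb_5
      return BB5;
    return BB6;                //       jmp bb_6
  }
  return BB2;                  // bb_0: jmp bb_2
}

// Control flow after the transformation: one compare against 20 in bb_0
// feeds three outcomes (greater, less, equal).
inline Block after(int X) {
  if (X > 20) {                // bb_0: cmp %0, 20 ; jg bb_1
    if (X > 40)                // bb_1: cmp %0, 40 ; jg bb_3
      return BB3;
    return BB6;                //       jmp bb_6
  }
  if (X < 20)                  // bb_0: jl bb_2
    return BB2;
  return BB5;                  // bb_0: jmp bb_5 (the X == 20 case)
}
```

For X == 20 the first version takes three compares to reach bb_5, while the
second reaches it after a single compare.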

This optimization is motivated by the branch pattern generated by switch
lowering: we always have a pivot-1 compare for the inner nodes, and then do a
pivot compare again at the leaf (as in the pattern above).

This pass currently is enabled on Intel's Sandybridge and later arches. Some
reviewers pointed out that on some arches (like AMD Jaguar), this pass may
increase branch density to the point where it hurts the performance of the
branch predictor.

Differential Revision: https://reviews.llvm.org/D46662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343993 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions

Emit a waterfall loop in the general case for a potentially-divergent Rsrc
operand. When practical, avoid this by using Addr64 instructions.

Recommits r341413 with changes to update the MachineDominatorTree when present.

Differential Revision: https://reviews.llvm.org/D51742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343992 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX2] Enable ZERO_EXTEND_VECTOR_INREG lowering of 256-bit vectors

Some necessary yak shaving before lowering *_EXTEND_VECTOR_INREG 256-bit vectors on AVX1 targets as suggested by D52964.

Differential Revision: https://reviews.llvm.org/D52970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343991 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] make horizontal binop matching clearer; NFCI

The instructions are complicated, so this code will
probably never be very obvious, but hopefully this
makes it better.

As shown in PR39195:
https://bugs.llvm.org/show_bug.cgi?id=39195
...we need to improve the matching to not miss cases
where we're h-opping on 1 source vector, and that
should be a small patch after this rearranging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343989 91177308-0d34-0410-b5e6-96231b3b80d8

[TailCallElim] Enable marking of calls with byval as tails

In r339636 the alias analysis rules were changed with regard to tail calls
and byval arguments. Previously, tail calls were assumed not to alias
allocas from the current frame. This assumption is no longer made
for arguments with the byval attribute.

This patch aligns TailCallElim with the new rule. Tail marking can now be
more aggressive and mark more calls as tails, e.g.:

define void @test() {
  %f = alloca %struct.foo
  call void @bar(%struct.foo* byval %f)
  ret void
}

define void @test2(%struct.foo* byval %f) {
  call void @bar(%struct.foo* byval %f)
  ret void
}

define void @test3(%struct.foo* byval %f) {
  %agg.tmp = alloca %struct.foo
  %0 = bitcast %struct.foo* %agg.tmp to i8*
  %1 = bitcast %struct.foo* %f to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 40, i1 false)
  call void @bar(%struct.foo* byval %agg.tmp)
  ret void
}

The problematic case where a byval parameter is captured by a call is still
handled correctly, and will not be marked as a tail (see PR7272).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343986 91177308-0d34-0410-b5e6-96231b3b80d8