git.osdn.net Git - android-x86/external-llvm.git/log

[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha

When we have an offset into a global, etc. that is accessed relative to the TOC
base pointer, and the offset is larger than the minimum alignment of the global
itself and the TOC base pointer (which is 8-byte aligned), we can still fold
the @toc@l into the memory access, but we must update the addis instruction's
symbol reference with the offset as the symbol addend. When there is only one
use of the addi to be folded and only one use of the addis that would need its
symbol's offset adjusted, then we can make the adjustment and fold the @toc@l
into the memory access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280545 91177308-0d34-0410-b5e6-96231b3b80d8

[Sparc] Mark i128 shift libcalls unavailable in 32-bit mode.

Recently, llvm wants to emit calls to these functions, while it didn't
seem to be an issue before. Not sure why. Nor do I know why only these
three are important to disable, out of all of the i128 libcalls.

Nevertheless, many other targets have this snippet of code, so, just
copying it to sparc as well, to unbreak things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280537 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements.

Fixes R600 piglit regressions since r280298

Differential Revision: https://reviews.llvm.org/D24174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280535 91177308-0d34-0410-b5e6-96231b3b80d8

Setting fp trapping mode and denormal type: this is an improvement on
r280246 that calculates compatibility of function attributes in
a better way.

Differential Revision: https://reviews.llvm.org/D24070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280534 91177308-0d34-0410-b5e6-96231b3b80d8

Do not consider subreg defs as reads when computing subrange liveness

Subregister definitions are considered uses for the purpose of tracking
liveness of the whole register. At the same time, when calculating live
interval subranges, subregister defs should not be treated as uses.

Differential Revision: https://reviews.llvm.org/D24190

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280532 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] auto-generate assertions for tighter checking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280531 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Don't pass a global CL option as an argument. NFC.

Differential Revision: https://reviews.llvm.org/D24199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280527 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/R600: Expand unaligned writes to local and global AS

LOCAL and GLOBAL AS only
PRIVATE needs special treatment

Differential Revision: https://reviews.llvm.org/D23971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280526 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Reorganize store tests

Split by AS.
Merge with some previously failing tests.

Differential Revision: https://reviews.llvm.org/D23969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280523 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Use the correct max CV record length of 0xFF00

Previously we were splitting our records at 0xFFFF bytes, which the
Microsoft tools don't like.

Should fix failure on the new Windows self-host buildbot.

This length appears in microsoft-pdb/PDB/dbi/dbiimpl.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280522 91177308-0d34-0410-b5e6-96231b3b80d8

IfConversion: Add assertions that both sides of a diamond don't pred-clobber.

One side of a diamond may end with a predicate clobbering instruction.
That side of the diamond has to be if-converted second. Both sides can't
clobber the predicate or the ifconversion is invalid. This is checked
elsewhere, but add an assert as a safety check. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280518 91177308-0d34-0410-b5e6-96231b3b80d8

IfConversion: Fix bug introduced by rescanning diamonds.

Passing the wrong values for predicate-clobbering. Simple to miss.
Added an assert to make this easier to catch in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280517 91177308-0d34-0410-b5e6-96231b3b80d8

Fix up comment from r280442, noticed by Justin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280508 91177308-0d34-0410-b5e6-96231b3b80d8

Split the store of a wide value merged from an int-fp pair into multiple stores.

For the store of a wide value merged from a pair of values, especially an int-fp pair,
it is sometimes more efficient to split it into separate narrow stores, which can
remove the bitwise instructions or sink them to colder places.

For now the feature is only enabled on the x86 target, and only the store of an int-fp
pair is split. The scope may be extended in the future if performance data supports it.
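
As a rough illustration of why the split is legal, the sketch below compares the bytes written by one merged 64-bit store with the bytes written by two narrow stores. It assumes a little-endian layout with the integer in the low half and the float in the high half; the function names are hypothetical and this is not the code from the patch.

```python
import struct

def merged_store(int_part: int, fp_part: float) -> bytes:
    """One 64-bit store of a value built with shift/or from an int-fp pair."""
    # bitcast float -> i32
    fp_bits = struct.unpack("<I", struct.pack("<f", fp_part))[0]
    # zext both halves, shift the fp bits up, or them together
    wide = (fp_bits << 32) | (int_part & 0xFFFFFFFF)
    return struct.pack("<Q", wide)

def split_stores(int_part: int, fp_part: float) -> bytes:
    """Two 32-bit stores to adjacent addresses; no bitwise ops needed."""
    low = struct.pack("<I", int_part & 0xFFFFFFFF)  # store at offset 0
    high = struct.pack("<f", fp_part)               # store at offset 4
    return low + high

same_bytes = merged_store(7, 1.5) == split_stores(7, 1.5)
print(same_bytes)
```

The two narrow stores produce the same memory image without the shift and or, which is what lets those instructions be removed or sunk.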

Differential Revision: https://reviews.llvm.org/D22840

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280505 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fold insertelement of constant into shuffle with constant operand (PR29126)

The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards
shrinking that to a single shufflevector.

Note that the transform is intentionally limited to shuffles that are equivalent to vector
selects to avoid creating arbitrary shuffle masks that may not lower well.

This should solve PR29126:
https://llvm.org/bugs/show_bug.cgi?id=29126

Differential Revision: https://reviews.llvm.org/D23886

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280504 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/LTO] Simplify. No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280503 91177308-0d34-0410-b5e6-96231b3b80d8

Quick fix to make LIT_PRESERVES_TMP work again

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280502 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Clean up temporary files created by tests

Do this by creating a temp directory in the normal system temp
directory, and cleaning it up on exit.

It is still possible for this temp directory to leak if Python exits
abnormally, but this is probably good enough for now.
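
The approach described above can be sketched roughly as follows. This is a minimal illustration, not lit's actual code: run_with_temp_dir and fake_tests are hypothetical names, and the "lit_tmp_" prefix is an assumption.

```python
import os
import shutil
import tempfile

def run_with_temp_dir(run_tests):
    """Run `run_tests(tmp_dir)` and remove the directory afterwards."""
    # Created under the normal system temp directory.
    tmp_dir = tempfile.mkdtemp(prefix="lit_tmp_")
    try:
        return run_tests(tmp_dir)
    finally:
        # Not reached if the interpreter is killed, so a leak is still possible.
        shutil.rmtree(tmp_dir, ignore_errors=True)

def fake_tests(tmp_dir):
    # Hypothetical test body that leaves a file behind.
    with open(os.path.join(tmp_dir, "scratch.txt"), "w") as f:
        f.write("data")
    return tmp_dir

used_dir = run_with_temp_dir(fake_tests)
print(os.path.exists(used_dir))
```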

Fixes PR18335

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280501 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Update known test failures

Fixed an issue with the experimental C headers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280498 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Ensure reverse interleaved group GEPs remain uniform

For uniform instructions, we're only required to generate a scalar value for
the first vector lane of each unroll iteration. Thus, if we have a reverse
interleaved group, computing the member index off the scalar GEP corresponding
to the last vector lane of its pointer operand technically makes the GEP
non-uniform. We should compute the member index off the first scalar GEP
instead.

I've added the updated member index computation to the existing reverse
interleaved group test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280497 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify code a bit. No functional change intended.

We don't need to call `GetCompareTy(LHS)' every single time true or false is
returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280491 91177308-0d34-0410-b5e6-96231b3b80d8

fix documentation comments; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280489 91177308-0d34-0410-b5e6-96231b3b80d8

[instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN.

This patch fixes a crash caused by an incorrect folding of an ordered comparison
between a packed floating point vector and a splat vector of NaN.

An ordered comparison between a vector and a constant vector of NaN, should
always be folded into a constant vector where each element is i1 false.

Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar
'false'. Later on, this would cause an assertion failure, since the value type
of the folded value doesn't match the expected value type of the uses of the
original instruction: "Assertion failed: New->getType() == getType() &&
"replaceAllUses of value with new value of different type!".

This patch fixes the issue and adds a test case to the already existing test
InstSimplify/floating-point-compares.ll.
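
To illustrate the expected result type, here is a small model of an element-wise ordered compare. It is only a sketch: fcmp_ord_lt is a hypothetical helper, and Python lists stand in for vectors.

```python
import math

def fcmp_ord_lt(lhs, rhs):
    """Element-wise ordered less-than: false whenever either operand is NaN."""
    return [
        (not math.isnan(a)) and (not math.isnan(b)) and a < b
        for a, b in zip(lhs, rhs)
    ]

vec = [1.0, -2.5, math.inf, 0.0]
nan_splat = [math.nan] * len(vec)

# Every lane is false, but the result still has one entry per lane.
folded = fcmp_ord_lt(vec, nan_splat)
print(folded)
```

The folded value keeps the same number of lanes as the operands, so replacing the compare with a single scalar false changes the type.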

Differential Revision: https://reviews.llvm.org/D24143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280488 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift.

This fixes a regression introduced by revision 268094.
Revision 268094 added the following dag combine rule:
// trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2

That rule converts a truncate of a shift-by-constant into a shift of a truncated
value. We do this only if the shift count is less than half the size in bits of
the truncated value (K < vt.size / 2).

The problem is that the constraint on the shift count is incorrect, so the rule
doesn't work well in some cases involving vector types. The combine rule should
have been written instead like this:
// trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits()

Basically, if K is smaller than the "scalar size in bits" of the truncated value
then we know that by "sinking" the truncate into the operand of the shift we
would never accidentally make the shift undefined.

This patch fixes the check on the shift count, and adds test cases to make sure
that we don't regress the behavior.
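
The bound can be checked with a small integer model. This is a sketch under stated assumptions: one lane of a <4 x i32> to <4 x i16> truncate is modelled with plain integers, an undefined shift is modelled as None, and the helper names are hypothetical.

```python
def trunc_of_shl(x, k, wide_bits, narrow_bits):
    """trunc (shl x, K): shift in the wide type, then keep the low bits."""
    shifted = (x << k) & ((1 << wide_bits) - 1)
    return shifted & ((1 << narrow_bits) - 1)

def shl_of_trunc(x, k, narrow_bits):
    """shl (trunc x), K: None models a shift that is undefined (K too large)."""
    if k >= narrow_bits:
        return None
    narrow_mask = (1 << narrow_bits) - 1
    return ((x & narrow_mask) << k) & narrow_mask

WIDE, NARROW, LANES = 32, 16, 4
samples = [0, 1, 0x1234ABCD, 0xFFFFFFFF, 0x80000001]

# For every K below the scalar size, both forms agree.
safe = all(
    trunc_of_shl(x, k, WIDE, NARROW) == shl_of_trunc(x, k, NARROW)
    for x in samples
    for k in range(NARROW)
)

# The old bound used the whole vector size: (4 * 16) / 2 = 32.
old_bound = (NARROW * LANES) // 2
bad_k = 20  # accepted by the old bound, but >= 16
print(safe, bad_k < old_bound, shl_of_trunc(1, bad_k, NARROW))
```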

Differential Revision: https://reviews.llvm.org/D24154

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280482 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed a typo (LLVM/Support/CFG.h -> LLVM/IR/CFG.h)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280481 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Try to fix an MSVC2013 failure due to finding a template
constructor when trying to do copy construction by adding an explicit
move constructor.

Will watch the bots to discover if this is sufficient.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280479 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test for insertelementinsts with constants.

Added a test that shows that several insertelementinsts with constant
indexes/data are not folded into a single shuffleinst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280474 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] - Fix possible crash in match() of llvm::Regex.

A crash was possible if the match() method
was called on an object that had been moved from or on an object
created with the default constructor.

Testcases updated.

Differential Revision: https://reviews.llvm.org/D24123

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280473 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] - Teach readobj to print DT_AUXILIARY dynamic tag in human readable form.

Previously DT_AUXILIARY was unknown, patch fixes that.

Differential revision: https://reviews.llvm.org/D24138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280471 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Add a workaround to fix PR30188

We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280470 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Move tests for masked floating point logical operations to avx512dqvl-intrinsics-upgrade.ll since they have now been autoupgraded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280467 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Remove floating point logical operation intrinsics and replace them with native IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280466 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add more patterns for masked and broadcasted logical operations where the select or broadcast has a floating point type.

These are needed in order to remove the masked floating point logical operation intrinsics and use native IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280465 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add execution domain fixing for logical operations with broadcast loads. This builds on the handling of masked ops since we need to keep element size the same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280464 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Strengthen some SDNode type constraints.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280463 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add NoVLX Predicates to some patterns so they don't rely on pattern ordering to be lower priority than their equivalent VLX pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280462 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Fix another typo in the Error/Expected docs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280461 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Fix a couple of typos in the Error/Expected docs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280460 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Fix some missing fields in OrcRemoteTargetClient's move constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280459 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing &. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280458 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] hasAndNotCompare should return true

As Sanjay suggested when he added the hook, PPC should return true from
hasAndNotCompare. We have an efficient negated 'and' on PPC (which can feed a
compare).

Fixes PR27203.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280457 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Fail testing if a googletest executable crashes during test discovery

googletest formatted tests are discovered by running the test executable.
Previously testing would silently succeed if the test executable crashed
during the discovery process. Now testing fails with "error: unable to
discover google-tests ..." if the test executable exits with a non-zero status.
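
A minimal sketch of the new behavior, assuming discovery is done by running a listing command (googletest executables list their tests with --gtest_list_tests). discover_tests is a hypothetical helper, not lit's real function, and small Python one-liners stand in for the test executable.

```python
import subprocess
import sys

def discover_tests(command):
    """Run a test-listing command; fail loudly on a non-zero exit status."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            "unable to discover google-tests: exit status %d" % result.returncode
        )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

# Stand-ins for a test executable that lists its tests and one that crashes.
ok_cmd = [sys.executable, "-c", "print('FooTest.'); print('  Bar')"]
crash_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

listed = discover_tests(ok_cmd)
try:
    discover_tests(crash_cmd)
    crashed_silently = True
except RuntimeError:
    crashed_silently = False
print(listed, crashed_silently)
```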

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280455 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add a pattern for a runtime bit check

Following a suggestion by Sanjay, we should lower:

  %shl = shl i32 1, %y
  %and = and i32 %x, %shl
  %cmp = icmp eq i32 %and, %shl
  ret i1 %cmp

into:

  subfic r4, r4, 32
  rlwnm r3, r3, r4, 31, 31

Add this pattern and some associated patterns for the 64-bit case and the
not-equal case. Fixes PR27356.
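
The equivalence of the two sequences can be checked with a small 32-bit model. This is only a sketch: the helpers are hypothetical, and it assumes %y is in the range 0..31 so that the IR shift is defined.

```python
MASK32 = 0xFFFFFFFF

def bit_check_ir(x, y):
    """The IR sequence: (x & (1 << y)) == (1 << y)."""
    shl = (1 << y) & MASK32
    return int((x & shl) == shl)

def rotl32(value, amount):
    amount &= 31  # rlwnm only uses the low 5 bits of the shift register
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def bit_check_ppc(x, y):
    """The PPC sequence: subfic r4, r4, 32 ; rlwnm r3, r3, r4, 31, 31."""
    r4 = (32 - y) & MASK32    # subfic r4, r4, 32
    return rotl32(x, r4) & 1  # mask 31..31 keeps only the least significant bit

samples = [0, 1, 0x80000000, 0xDEADBEEF, 0x0000FFFF, MASK32]
equivalent = all(
    bit_check_ir(x, y) == bit_check_ppc(x, y)
    for x in samples
    for y in range(32)
)
print(equivalent)
```

Rotating left by 32 - y is a rotate right by y, which moves bit y of the input into the least significant position.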

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280454 91177308-0d34-0410-b5e6-96231b3b80d8

revert r280429 and r280425:

r280425 | dehao | 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) | 9 lines

Refactor LICM pass in preparation for LoopSink pass.

Summary: LoopSink pass uses some common functions in LICM. This patch refactors the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).

r280429 | dehao | 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) | 9 lines

Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.

Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280453 91177308-0d34-0410-b5e6-96231b3b80d8

revert r280432:

r280432 | dehao | 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) | 9 lines

Explicitly require DominatorTreeAnalysis pass for instsimplify pass.

Summary: DominatorTreeAnalysis is always required by instsimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280452 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/Transforms/GCOVProfiling/three-element-mdnode.ll: Use %/T instead of %T, not to emit backslashes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280451 91177308-0d34-0410-b5e6-96231b3b80d8

bugpoint: clang-format all of bugpoint. NFC

I'm going to clean up the APIs here a bit and touch many many lines
anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280450 91177308-0d34-0410-b5e6-96231b3b80d8

raw_pwrite_stream_test.cpp: _putenv_s() may be assumed as win32-generic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280449 91177308-0d34-0410-b5e6-96231b3b80d8

IfConversion: Don't count branches in # of duplicates.

If the entire blocks match, we would count the branch instructions
toward the number of duplicated instructions. This doesn't match what we
do elsewhere, and was causing a bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280448 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Add a unittest for invalidating module analyses with an SCC pass.

This wasn't really well explicitly tested with a nice unittest before.
It seems good to have reasonably broken out unittests for this kind of
functionality as I'm working on other invalidation features to make sure
none of the existing ones regress.

This still has too much duplicated code, I plan to factor that out in
a subsequent commit to use common helpers for repeated parts of this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280447 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] (NFC) Split the IR parsing into a fixture so that I can split out
more testing into other test routines while using the same core module.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280446 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a real temp file leak in FileOutputBuffer

If we failed to commit the buffer but did not die to a signal, the temp
file would remain on disk on Windows. Having an open file mapping and
file handle prevents the file from being deleted. I am choosing not to
add an assertion of success on the temp file removal, since virus
scanners and other environmental things can often cause removal to fail
in real world tools.

Also fix more temp file leaks in unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280445 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] (NFC) Refactor the CGSCC pass manager tests to use lambda-based
passes.

This simplifies the test some and makes it more focused and clear what
is being tested. It will also make it much easier to extend with further
testing of different pass behaviors.

I've also replaced a pointless module pass with running the requires
pass directly as that is all that it was really doing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280444 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix some temp file leaks in SupportTests, PR18335

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280443 91177308-0d34-0410-b5e6-96231b3b80d8

[CFGPrinter] Display branch weight on the edges

Summary:
This is pretty useful especially in connection with
BFI's -view-block-freq-propagation-dags. It helped me to track down the
bug that is being fixed in D24118.

While -view-block-freq-propagation-dags displays the high-level
information with static heuristics included (and block frequencies), the
new thing only shows the raw weight as presented by PGO without any of
the static estimates. This helps to distinguish what has been
measured vs. estimated.

For the sample loop in D24118, -view-block-freq-propagation-dags=integer
looks like this:

https://reviews.llvm.org/F2381352

While with -view-cfg-only you can see the underlying branch weights:

https://reviews.llvm.org/F2392296

Reviewers: dexonsmith, bogner, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280442 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Don't apply the PPC64 address-formation peephole for offsets greater than 7

When applying our address-formation PPC64 peephole, we are reusing the @ha TOC
addis value with the low parts associated with different offsets (i.e.
different effective symbol addends). We were assuming this was okay so long as
the offsets were less than the alignment of the global variable being accessed.
This ignored the fact, however, that the TOC base pointer itself need only be
8-byte aligned. As a result, what we were doing is legal only for offsets less
than 8 regardless of the alignment of the object being accessed.
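
The arithmetic behind this limit can be sketched as follows. The model is an assumption-laden illustration: toc_offset stands for the symbol's address minus the TOC base (assumed to be a multiple of 8), ha and lo follow the usual @ha/@l definitions, and reuse_is_safe is a hypothetical helper.

```python
def ha(value):
    """@ha: high 16 bits, adjusted for the sign of the low 16 bits."""
    return ((value + 0x8000) >> 16) & 0xFFFF

def lo(value):
    """@l: low 16 bits, read as the sign-extended displacement."""
    low = value & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low

def address(high, low):
    return (high << 16) + low

def reuse_is_safe(toc_offset, addend):
    # Reuse the addis computed for the symbol alone with the low part
    # computed for symbol + addend.
    rebuilt = address(ha(toc_offset), lo(toc_offset + addend))
    return rebuilt == toc_offset + addend

# With only 8-byte alignment known, addends below 8 never change @ha.
small_addends_ok = all(
    reuse_is_safe(off, add)
    for off in range(0, 0x40000, 8)
    for add in range(8)
)

# An addend of 8 can cross the 0x8000 boundary and change @ha.
print(small_addends_ok, reuse_is_safe(0x7FF8, 8))
```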

Fixes PR28727.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280441 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Don't consider fusion in PPC64 address-formation peephole

The logic in this function assumes that the P8 supports fusion of addis/addi,
but it does not. As a result, there is no advantage to restricting our peephole
application, merging addi instructions into dependent memory accesses, even
when the addi has multiple users, regardless of whether or not we're optimizing
for size.

We might need something like this again for the P9; I suspect we'll revisit
this code when we work on P9 tuning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280440 91177308-0d34-0410-b5e6-96231b3b80d8

Explicitly require DominatorTreeAnalysis pass for instsimplify pass.

Summary: DominatorTreeAnalysis is always required by instsimplify.

Reviewers: davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280432 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAGBuilder] Add const to relevant places

Reviewers: hans, evandro, sebpop

Differential Revision: https://reviews.llvm.org/D24112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280430 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.

Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280429 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself.

Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking.

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24170

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280427 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor LICM pass in preparation for LoopSink pass.

Summary: LoopSink pass uses some common functions in LICM. This patch refactors the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24168

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280425 91177308-0d34-0410-b5e6-96231b3b80d8

[Legalizer] Don't throw away false low half when expanding GE/LE SETCC

When expanding a SETCC for which the low half is known to evaluate to false,
we can only throw it away for LT/GT comparisons, not LE/GE.
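
A simplified model shows the difference. It assumes an unsigned compare expanded as a select on equality of the high halves, uses 4-bit halves so the check can be exhaustive, and uses hypothetical helper names; it is not the legalizer's code.

```python
BITS = 4
MASK = (1 << BITS) - 1

def halves(value):
    return value >> BITS, value & MASK

def compare(a, b, strict):
    return a < b if strict else a <= b

def expanded(a, b, strict):
    """Select-based expansion: low compare if the high halves are equal."""
    (a_hi, a_lo), (b_hi, b_lo) = halves(a), halves(b)
    return compare(a_lo, b_lo, strict) if a_hi == b_hi else compare(a_hi, b_hi, strict)

def shortcut_is_sound(strict):
    """Is it safe to return only the high compare when the low one is false?"""
    for a in range(1 << (2 * BITS)):
        for b in range(1 << (2 * BITS)):
            (a_hi, a_lo), (b_hi, b_lo) = halves(a), halves(b)
            if compare(a_lo, b_lo, strict):
                continue  # only look at cases where the low half is false
            if compare(a_hi, b_hi, strict) != expanded(a, b, strict):
                return False
    return True

lt_sound = shortcut_is_sound(strict=True)   # LT/GT style compare
le_sound = shortcut_is_sound(strict=False)  # LE/GE style compare
print(lt_sound, le_sound)
```

For LE, equal high halves make the high compare true even though the low half says the full compare is false, so the low half cannot be dropped.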

This fixes PR29170.

Differential Revision: https://reviews.llvm.org/D24151

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280424 91177308-0d34-0410-b5e6-96231b3b80d8

Make the coding standards a bit more clear that we prefer the fancy new
auto-brief format for doxygen comments. Most notable is switching to
that in the example doxygen comment. I've also tweaked the wording but
am happy to tweak it further if others have suggestions here.

Mostly doing this to capture something I and others have been writing
consistently and repeatedly in code reviews.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280419 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes

Prior to this, we could generate a vector_shuffle from an IR shuffle when the
size of the result was exactly the sum of the sizes of the input vectors.
If the output vector was narrower - e.g. a <12 x i8> being formed by a shuffle
with two <8 x i8> inputs - we would lower the shuffle to a sequence of extracts
and inserts.

Instead, we can form a larger vector_shuffle, and then extract a subvector
of the right size - e.g. shuffle the two <8 x i8> inputs into a <16 x i8>
and then extract a <12 x i8>.
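
The widen-then-extract idea can be sketched with lists standing in for vectors. This is an illustration only: the helper names are hypothetical, None models an undef lane, and rounding the mask length up to a multiple of the source length is an assumption about how the padded width is chosen.

```python
UNDEF = None

def ir_shuffle(lhs, rhs, mask):
    """Reference semantics: mask indexes into the concatenation lhs + rhs."""
    both = lhs + rhs
    return [UNDEF if i is UNDEF else both[i] for i in mask]

def widened_shuffle(lhs, rhs, mask):
    """Pad inputs and mask with undef, shuffle wide, then extract a subvector."""
    src_len = len(lhs)
    padded_len = -(-len(mask) // src_len) * src_len  # round up to a multiple
    pad = [UNDEF] * (padded_len - src_len)
    wide_lhs, wide_rhs = lhs + pad, rhs + pad
    # Indices into rhs move up because lhs is now padded_len lanes wide.
    wide_mask = [
        i if i is UNDEF or i < src_len else i - src_len + padded_len
        for i in mask
    ]
    wide_mask += [UNDEF] * (padded_len - len(mask))
    wide_result = ir_shuffle(wide_lhs, wide_rhs, wide_mask)
    return wide_result[:len(mask)]  # extract the <12 x i8> part

lhs = list(range(8))
rhs = list(range(100, 108))
mask = [0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13]  # 12 result lanes

narrow = ir_shuffle(lhs, rhs, mask)
wide = widened_shuffle(lhs, rhs, mask)
print(narrow == wide)
```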

This also includes a target-specific X86 combine that in the presence of
AVX2 combines:
  (vector_shuffle <mask> (concat_vectors t1, undef)
                         (concat_vectors t2, undef))
into:
  (vector_shuffle <mask> (concat_vectors t1, t2), undef)
in cases where this allows us to form VPERMD/VPERMQ.

(This is not a separate commit, as that pattern does not appear without
the DAGBuilder change.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280418 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add asm.js-style setjmp/longjmp handling for wasm (reland r280302)

Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism.

Reviewers: jpp, dschuff

Subscribers: jfb, dschuff

Differential Revision: https://reviews.llvm.org/D24121

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280415 91177308-0d34-0410-b5e6-96231b3b80d8

bugpoint: clang-format and modernize comments in ListReducer. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280414 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: add a G_PHI instruction to give phis a type.

They're another source of generic vregs, which are going to need a type on the
definition when we remove the register width from MachineRegisterInfo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280412 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the ASan fuse-lld.cc test after LLD r280012

With that change, images built with 'lld-link /debug' always have a
debug directory. If no PDB filename was passed on the command line, then
the filename in the executable is empty.

PDB information would never work anyway if the PDB file name is empty,
so go ahead and try DWARF in that case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280410 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI)

We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer
have to create temporary vectors with insertelement instructions when handling
pointer induction variables. This case was mistakenly missed from r279649 when
refactoring the other scalarization code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280405 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests to show potential shuffle+insert folds

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280403 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Loosen memory folding requirements for cvtdq2pd and cvtps2pd instructions.

According to the spec, the cvtdq2pd and cvtps2pd instructions don't require their memory operand
to be aligned to 16 bytes. This patch removes this requirement from the memory folding table.

Differential Revision: https://reviews.llvm.org/D23919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280402 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add runtime metadata for pointee alignment of argument.

Add runtime metadata for pointee alignment of pointer type kernel argument. The key is KeyArgPointeeAlign and the value is a 32 bit unsigned integer.

Differential Revision: https://reviews.llvm.org/D24145

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280399 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/LTO] Simplify a bit. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280396 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Connecting check-all and test-depends targets correctly

My previous attempt at this connected the sub-project check targets to the test-depends target instead of to the check-all target. That resulted in the tests running multiple times on bots that built "test-depends" and "check-all" in separate build invocations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280392 91177308-0d34-0410-b5e6-96231b3b80d8

Rename some variables to have meaningful names. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280391 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Move VectorParts allocation and mapping into PHI widening (NFC)

This patch moves the allocation of VectorParts for PHI nodes into the actual
PHI widening code. Previously, we allocated these VectorParts in
vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon
returning, we would then map the VectorParts in VectorLoopValueMap. This
behavior is problematic for the cases where we only want to generate a scalar
version of a PHI node. For example, if in the future we only generate a scalar
version of an induction variable, we would end up inserting an empty vector
entry into the map once we return to vectorizeBlockInLoop. We now no longer
need to pass VectorParts to the various PHI widening functions, and we can keep
VectorParts allocation as close as possible to the point at which they are
actually mapped in VectorLoopValueMap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280390 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Properly propagate the TypeLeafKind through the pipeline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280388 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Don't fold a trunc if it feeds an anyext

Legalization tends to create anyext(trunc) patterns. This should always be
combined - into either a single trunc, a single ext, or nothing if the
types match exactly. But if we happen to combine the trunc first, we may pull
the trunc away from the anyext or make it implicit (e.g. the truncate(extract)
-> extract(bitcast) fold).

To prevent this, we can avoid doing the fold, similarly to how we already handle
fpround(fpextend).

Differential Revision: https://reviews.llvm.org/D23893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280386 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: MIMG TD Refactoring.

Summary:
Created a new td file MIMGInstructions.td which contains all definitions
of MIMG related instructions.

Reviewed by:
kzhuravl, vpykhtin

Differential Revision:
http://reviews.llvm.org/D24106

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280385 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Use multiprocessing by default on Windows

Apparently nobody evaluated multiprocessing on Windows since Daniel
enabled multiprocessing on Unix in r193279. It works so far as I can
tell.

Today this is worth about an 8x speedup (631.29s to 73.25s) on my 24
core Windows machine. Hopefully this will improve Windows buildbot cycle
time, where currently it takes more time to run check-all than it does
to self-host with assertions enabled:
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/20
build stage 2 ninja all ( 28 mins, 22 secs )
ninja check 2 stage 2 ( 37 mins, 38 secs )

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280382 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Revive LLVM_*_DIRS variables

This is a partial revert of r280013. Brad King pointed out these variable names are matching CMake conventions, so we should preserve them.

I've also added a direct mapping of the LLVM_*_DIR variables which we need to make projects support building in and out of tree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280380 91177308-0d34-0410-b5e6-96231b3b80d8

[EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSA

Previous change broke the C API for creating an EarlyCSE pass w/
MemorySSA by adding a bool parameter to control whether MemorySSA was
used or not. This broke the OCaml bindings. Instead, change the old C
API entry point back and add a new one to request an EarlyCSE pass with
MemorySSA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280379 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Include missed file from previous commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280377 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Dropped (V)CVTPD2PS intrinsic patterns now that it's bound to X86vfpround

It now uses X86vfpround patterns directly instead.

Followup to D23797

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280376 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] interAptiv based generic schedule model

This scheduler describes a processor which covers all MIPS ISAs based
around the interAptiv and P5600 timings.

Reviewers: vkalintiris, dsanders

Differential Revision: https://reviews.llvm.org/D23551

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280374 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Fix LLVM_ENABLE_EH and LLVM_ENABLE_RTTI on MSVC

Patch by Johannes Sebastian Mueller-Roemer.

Differential Revision: https://reviews.llvm.org/D23645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280371 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] remove fold of an icmp pattern that should never happen

While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger
this code path.

The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic()
or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast().

Differential Revision: https://reviews.llvm.org/D24031

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280370 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Deal with undefs when extending live intervals

Reapply r280275, since MSVC accepts r280358.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280369 91177308-0d34-0410-b5e6-96231b3b80d8

Optimized FMA intrinsic + FNEG, like
-(a*b+c)

and FNEG + FMA, like
a*b-c or (-a)*b+c.
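
The algebra behind these patterns can be checked with exact rational arithmetic. The sketch assumes a hypothetical fused helper with two sign flags. It is not the patch's code and does not model any particular instruction set.

```python
from fractions import Fraction


def fused(a, b, c, negate_product=False, negate_addend=False):
    """Hypothetical fused multiply-add with optional sign flips.

    Computes (+/-)(a*b) + (+/-)c, where each flag flips one sign.
    """
    product = a * b
    if negate_product:
        product = -product
    addend = -c if negate_addend else c
    return product + addend


a, b, c = Fraction(3, 2), Fraction(-5, 4), Fraction(7, 8)

# -(a*b + c): negating the result flips both signs.
assert -fused(a, b, c) == fused(a, b, c, negate_product=True, negate_addend=True)
# a*b - c: negating the addend.
assert fused(a, b, -c) == fused(a, b, c, negate_addend=True)
# (-a)*b + c: negating one multiplicand negates the product.
assert fused(-a, b, c) == fused(a, b, c, negate_product=True)
```

In each case the separate negation disappears into the sign flags of a single fused operation.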

The bug description is here: https://llvm.org/bugs/show_bug.cgi?id=28892

Differential revision: https://reviews.llvm.org/D23313

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280368 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches

This was a real restriction in the original version of SinkThenElseCodeToEnd. Now that it has been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc. (Assume for the sake of argument that x() has side effects; if something is safe to speculate we could indeed sink it regardless, but that is not possible in the general case and causes many extra selects.)

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.
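
The split can be modelled on a toy CFG as follows. This is a minimal sketch, assuming a CFG stored as a dict from block name to successor list and treating any block with exactly one successor as ending in an unconditional branch. The function name is hypothetical and this is not the SimplifyCFG implementation.

```python
def split_unconditional_preds(cfg, target, split_name="sink.split"):
    """Route the unconditional predecessors of `target` through one new block.

    cfg maps a block name to the list of its successor names.
    The input is left untouched; a new dict is returned.
    """
    preds = [b for b, succs in cfg.items() if target in succs]
    uncond = [b for b in preds if cfg[b] == [target]]
    cond = [b for b in preds if cfg[b] != [target]]
    new_cfg = {b: list(succs) for b, succs in cfg.items()}
    # Only split when it helps: several unconditional arcs to merge and
    # at least one conditional arc that blocks sinking into `target`.
    if len(uncond) < 2 or not cond:
        return new_cfg
    for b in uncond:
        new_cfg[b] = [split_name]
    new_cfg[split_name] = [target]
    return new_cfg


cfg = {
    "if1": ["x1", "if2"],
    "x1": ["end"],
    "if2": ["x2", "end"],  # the implicit empty 'else' goes straight to end
    "x2": ["end"],
    "end": [],
}
new_cfg = split_unconditional_preds(cfg, "end")
for block, succs in new_cfg.items():
    print(block, "->", succs)
```

After the split, both x1 and x2 branch to sink.split, which has only unconditional predecessors, so the common call can be sunk there.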

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280364 91177308-0d34-0410-b5e6-96231b3b80d8

Add an optional parameter with a list of undefs to extendToIndices

Reapply r280268, hopefully in a version that MSVC likes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280358 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Properly handle escape characters in Attribute::getAsString()

If an attribute name has special characters such as '\01', it is not
printed properly in LLVM assembly language format. Since the format
expects special characters to be preserved as they are, the output has
to use escape sequences to make them printable.

Before:
  attributes #0 = { ... "counting-function"="^A__gnu_mcount_nc" ...

After:
  attributes #0 = { ... "counting-function"="\01__gnu_mcount_nc" ...
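
A minimal sketch of this kind of escaping, assuming the LLVM assembly convention of a backslash followed by two hex digits for bytes that cannot be printed directly. The function name is hypothetical, and the real printing is done in C++ inside LLVM.

```python
def escape_attr_string(raw: bytes) -> str:
    """Escape a byte string for use inside a double-quoted IR string."""
    out = []
    for byte in raw:
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")             # a backslash is doubled
        elif 0x20 <= byte <= 0x7E and ch != '"':
            out.append(ch)                 # printable ASCII is kept as is
        else:
            out.append("\\%02X" % byte)    # everything else becomes \XX
    return "".join(out)


print(escape_attr_string(b"\x01__gnu_mcount_nc"))  # prints \01__gnu_mcount_nc
```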

Reviewers: hfinkel, rengolin, rjmccall, compnerd

Subscribers: nemanjai, mcrosier, hans, shenhan, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D23792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280357 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd

r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink.

The problem was that instructions whose operands were non-identical but trivially the same needed extra logic so that we could still sink them aggressively. For example:

  %a = load i32* %b          %d = load i32* %b
  %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans.
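
The two stages can be sketched on a toy representation. This is a heavily simplified model, not the SimplifyCFG code: instructions are (name, opcode, operands) tuples, "sinkable" only means matching opcode and operand count, and an operand merge is treated as free when all of its values were themselves marked sinkable by the scan. All names are hypothetical.

```python
def plan_sinking(blocks, max_phis=1):
    """Return how many trailing instructions can be sunk from all blocks.

    blocks: one instruction list per predecessor block, terminators omitted.
    """
    # Stage 1, "scan": walk the blocks in lockstep from the end and record,
    # per row, the operand positions whose values differ between blocks.
    rows = []
    sinkable = set()
    depth = min(len(b) for b in blocks)
    for i in range(1, depth + 1):
        row = [b[-i] for b in blocks]
        if len({inst[1] for inst in row}) != 1:
            break  # different opcodes
        if len({len(inst[2]) for inst in row}) != 1:
            break  # different operand counts
        merges = [vals for vals in zip(*(inst[2] for inst in row))
                  if len(set(vals)) > 1]
        rows.append(merges)
        sinkable.update(inst[0] for inst in row)

    # Stage 2, "sink": count only merges that really need a PHI, skipping
    # values that will be sunk as well.
    sunk = 0
    for merges in rows:
        needed = [vals for vals in merges
                  if not all(v in sinkable for v in vals)]
        if len(needed) > max_phis:
            break
        sunk += 1
    return sunk


left = [("%a", "load", ("%b",)), ("%c", "gep", ("%a", "0"))]
right = [("%d", "load", ("%b",)), ("%e", "gep", ("%d", "1"))]
print(plan_sinking([left, right]))  # prints 2
```

For the load/gep example, the gep row differs in two operand positions, but the (%a, %d) merge is not counted because both loads are sunk too, so only the constant index needs a PHI.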

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280351 91177308-0d34-0410-b5e6-96231b3b80d8

Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC

LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible
__builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently
broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is
lowered using:

ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)

where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86,
FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not
work for PowerPC. Because of the way that the stack layout works, the canonical
frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC
(there is a lower save-area offset as well), so it is not just a matter of
implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its
semantics -- we can do that, since it is currently used only for
@llvm.eh.dwarf.cfa lowering, but it is better to directly lower the CFA construct
itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips
currently does this, but by using a custom lowering for ADD that specifically
recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern.
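
The default lowering described above is plain address arithmetic, as the following sketch shows. The function name, the 8-byte slot size (assuming 64-bit x86) and the frame address are illustrative assumptions. No PowerPC offset is modelled because the message gives no value for it.

```python
def default_cfa(frame_addr: int, frame_to_args_offset: int = 0) -> int:
    """Model of the default expansion ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)."""
    return frame_addr + frame_to_args_offset


SLOT_SIZE = 8             # assumption: 64-bit x86, one slot is 8 bytes
frame_addr = 0x7FFF0000   # made-up frame address, for illustration only

# Targets that keep the default get FRAMEADDR unchanged.
print(hex(default_cfa(frame_addr)))
# x86 lowers FRAME_TO_ARGS_OFFSET to 2*SlotSize.
print(hex(default_cfa(frame_addr, 2 * SLOT_SIZE)))
```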

This change introduces an ISD::EH_DWARF_CFA node, which by default expands using
the existing logic, but can be directly lowered by the target. Mips is updated
to use this method (which simplifies its implementation, and I suspect makes it
more robust), and PowerPC is updated to do the same.

Fixes PR26761.

Differential Revision: https://reviews.llvm.org/D24038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280350 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Scalar Memory instructions TD refactoring

Differential revision: https://reviews.llvm.org/D23996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280349 91177308-0d34-0410-b5e6-96231b3b80d8

Add a counter-function insertion pass

As discussed in https://reviews.llvm.org/D22666, our current mechanism to
support -pg profiling, where we insert calls to mcount(), or some similar
function, is fundamentally broken. We insert these calls in the frontend, which
means they get duplicated when inlining, and so the accumulated execution
counts for the inlined-into functions are wrong.

Because we don't want the presence of these functions to affect optimization,
they should be inserted in the backend. Here's a pass which would do just that.
The knowledge of the name of the counting function lives in the frontend, so
we're passing it here as a function attribute. Clang will be updated to use
this mechanism.
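
The duplication problem can be reproduced with a toy model. This is a minimal sketch, assuming functions are lists of operations and inlining is plain body substitution. The helper names are hypothetical and nothing here mirrors the actual pass.

```python
def insert_counter(funcs, counter="mcount"):
    """Prepend a call to the counting function to every function body."""
    return {name: [("call", counter)] + body for name, body in funcs.items()}


def inline_into(funcs, caller, callee):
    """Replace each call to `callee` in `caller` with the callee's body."""
    new_body = []
    for op in funcs[caller]:
        if op == ("call", callee):
            new_body.extend(funcs[callee])
        else:
            new_body.append(op)
    result = dict(funcs)
    result[caller] = new_body
    return result


def count_counter_calls(body, counter="mcount"):
    return sum(1 for op in body if op == ("call", counter))


funcs = {"helper": [("work",)], "main": [("call", "helper"), ("work",)]}

# Frontend-style: insert first, then inline. The callee's counter call is copied.
frontend = inline_into(insert_counter(funcs), "main", "helper")
# Backend-style: inline first, then insert. Each function gets exactly one call.
backend = insert_counter(inline_into(funcs, "main", "helper"))

print(count_counter_calls(frontend["main"]))  # prints 2
print(count_counter_calls(backend["main"]))   # prints 1
```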

Differential Revision: https://reviews.llvm.org/D22825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280347 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Fix a warning introduced in r280339 due to the member
initializers not being in the same order as the members.

Specifically, 'preg' is the first member followed by 'error', so they
will be initialized in that order and should be written in the member
initializer list in that order.

For the constructor in question, there is no change in behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280345 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Fix nondeterministic iteration order

We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet.
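
The difference matters because a pointer set iterates in an order that depends on the pointer values, while a set vector iterates in insertion order. A rough Python analogue of the latter, with a hypothetical class that only mimics the idea and not the LLVM API:

```python
class SetVector:
    """Set with deterministic, insertion-ordered iteration."""

    def __init__(self):
        self._items = []    # keeps insertion order
        self._seen = set()  # gives fast membership tests

    def insert(self, item):
        """Add item if absent; return True when it was newly inserted."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self._items.append(item)
        return True

    def __contains__(self, item):
        return item in self._seen

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


sv = SetVector()
for block in ["bb3", "bb1", "bb3", "bb2"]:
    sv.insert(block)
print(list(sv))  # prints ['bb3', 'bb1', 'bb2']
```

Any transform that walks such a container makes its changes in the same order on every run.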

Should fix stage3 convergence builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280342 91177308-0d34-0410-b5e6-96231b3b80d8

Commit of forgotten header for r280339 "[LLVM/Support] - Create no-arguments constructor for llvm::Regex"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280340 91177308-0d34-0410-b5e6-96231b3b80d8