git.osdn.net Git - android-x86/external-llvm.git/log

[PM] Revert r280447: Add a unittest for invalidating module analyses with an SCC pass.

This was mistakenly committed. The world isn't ready for this test, the
test code has horrible debugging code in it that should never have
landed in tree, it currently passes because of bugs elsewhere, and it
needs to be rewritten to not be susceptible to passing for the wrong
reasons.

I'll re-land this in a better form when the prerequisite patches land.

So sorry that I got this mixed into a series of commits that *were*
ready to land. I shouldn't have. =[ What's worse is that it stuck around
for so long and I discovered it while fixing the underlying bug that
caused it to pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280620 91177308-0d34-0410-b5e6-96231b3b80d8

[LCG] Clean up and make NDEBUG verify calls more rigorous with
make_scope_exit now that we have that utility.

This makes the code much more clear and readable by isolating the check.
It also makes it easy to go through and make sure all the interesting
update routines have a start and end verify so we don't slowly let the
graph drift into an invalid state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280619 91177308-0d34-0410-b5e6-96231b3b80d8

[LCG] A NFC refactoring to extract the logic for doing
a postorder-sequence based update after edge insertion into a generic
helper function.

This separates the SCC-specific logic into two fairly simple lambdas and
extracts the rest into a generic helper template function. I think this
is a net win on its own merits because it disentangles different pieces
of the algorithm. Now there is one place that does the two-step
partition to identify a set of newly connected components and at the
same time update the postorder sequence.

However, I'm also hoping to re-use this an upcoming patch to update
a cached post-order sequence of RefSCCs when doing the analogous update
to the RefSCC graph, and I don't want to have two copies.

The diff is quite messy but this really is just moving things around and
making types generic rather than specific.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280618 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing
memcpy with ld/st.

When InstCombine replaces a memcpy with loads+stores it does not copy over the
llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes
that.

Differential Revision: https://reviews.llvm.org/D23499

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280617 91177308-0d34-0410-b5e6-96231b3b80d8

[ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine.

ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The
practical impact of this change is that ORC users no longer need to link MCJIT
to use ObjectCaches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280616 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280615 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Zero-extend constants in FastISel

As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which
are illegal types promoted to i32 on PowerPC, is a choice constrained by
assumptions within the infrastructure. Specifically, the logic in
FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI
operands will be zero extended, and so, at least when materializing constants
that are PHI operands, we must do the same.

The rest of our fast-isel implementation does not appear to depend on the fact
that we were sign-extending i8/i16 constants, and all other targets also appear
to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we
had been doing this only for i1 constants, and sign-extending the others).

Fixes PR27721.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280614 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to native IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280611 91177308-0d34-0410-b5e6-96231b3b80d8

Fix inliner funclet unwind memoization

Summary:
The inliner may need to determine where a given funclet unwinds to,
and this determination may depend on other funclets throughout the
funclet tree.  The code that performs this walk in getUnwindDestToken
memoizes results to avoid redundant computations.  In the case that
a funclet's unwind destination is derived from its ancestor, there's
code to walk back down the tree from the ancestor updating the memo
map of its descendants to record the unwind destination.  This change
fixes that code to account for the case that some descendant has a
different unwind destination, which can happen if that unwind dest
is a descendant of the EHPad being queried and thus didn't determine
its unwind destination.

Also update test inline-funclets.ll, which is supposed to cover such
scenarios, to include a case that fails an assertion without this fix
but passes with it.

Fixes PR29151.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280610 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Combine some of the strings in autoupgrade code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280603 91177308-0d34-0410-b5e6-96231b3b80d8

Cleanup : Use metadata preserving API for branch creation

Use the wrapper API in IRBuilder that does meta data copy
to create new branch in LoopUnswitch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280602 91177308-0d34-0410-b5e6-96231b3b80d8

[Profile] preserve branch metadata lowering select in CGP

CGP currently drops select's MD_prof profile data when
generating conditional branch which can lead to bad
code layout. The patch fixes the issue.

Differential Revision: http://reviews.llvm.org/D24169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280600 91177308-0d34-0410-b5e6-96231b3b80d8

Fix ThinLTO crash with debug info

Because the recent change about ODR type uniquing in the context,
we can reach types defined in another module during IR linking.
This triggered some assertions in case we IR link without starting
from an empty module. To alleviate that, we can self-map metadata
defined in the destination module so that they won't be visited.

Differential Revision: https://reviews.llvm.org/D23841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280599 91177308-0d34-0410-b5e6-96231b3b80d8

Strip trailing whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280598 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Set sizes of spill pseudos

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280595 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix adding duplicate implicit exec uses

I'm not sure if this should be considered a bug in
copyImplicitOps or not, but implicit operands that are part
of the static instruction definition should not be copied.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280594 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280593 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280592 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Reduce the duration of whole-quad-mode

Summary:
This contains two changes that reduce the time spent in WQM, with the
intention of reducing bandwidth required by VMEM loads:

1. Sampling instructions by themselves don't need to run in WQM, only their
   coordinate inputs need it (unless of course there is a dependent sampling
   instruction). The initial scanInstructions step is modified accordingly.

2. When switching back from WQM to Exact, switch back as soon as possible.
   This affects the logic in processBlock.

This should always be a win or at best neutral.

There are also some cleanups (e.g. remove unused ExecExports) and some new
debugging output.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D22092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280590 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix an interaction between WQM and polygon stippling

Summary:
This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.

The underlying problem is as follows: the prolog part contains the polygon
stippling sequence, i.e. a kill. The main part then enables WQM based on the
_reduced_ exec mask, effectively undoing most of the polygon stippling.

Since we cannot know whether polygon stippling will be used, the main part
of a non-monolithic shader must always return to exact mode to fix this
problem.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280589 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Do basic folding of class intrinsic

This allows more of the OCML builtin library to be
constant folded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280586 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix spilling of m0

readlane/writelane do not support using m0 as the output/input.
Constrain the register class of spill vregs to try to avoid this,
but also handle spilling of the physreg when necessary by inserting
an additional copy to a normal SGPR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280584 91177308-0d34-0410-b5e6-96231b3b80d8

Improve debug error message with register name

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280583 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280581 91177308-0d34-0410-b5e6-96231b3b80d8

Make lit/util.py py3-compatible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280579 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r280549.

The test it added doesn't pass:
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15318/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Apdbdump-yaml-types.test

Command Output (stdout):
--
$ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\llvm-pdbdump.EXE" "pdb2yaml" "-tpi-stream" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB/Inputs/empty.pdb"
$ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\FileCheck.EXE" "-check-prefix=YAML" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test"
# command stderr:
D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test:36:7: error: expected string not found in input
YAML: Name: apartment
^
<stdin>:153:10: note: scanning from here
Value: 161
^

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280577 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Use std::list in SparseBitVector, NFC

The only intrusive thing about SparseBitVector's usage of ilist<> was
that new was usually called externally. There were no custom traits.

It seems like the reason to switch to ilist in r41855 was to avoid
pointer invalidation, but std::list<> has that feature too. Maybe
std::list<>::emplace makes this a little more obvious than it was then.

Switch over to std::list<> and simplify the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280573 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics

PowerPC assembly code in the wild, so it seems, has things like this:

bc+ 12, 28, .L9

This is a bit odd because the '+' here becomes part of the BO field, and the BO
field is otherwise the first operand. Nevertheless, the ISA specification does
clearly say that the +- hint syntax applies to all conditional-branch mnemonics
(that test either CTR or a condition register, although not the forms which
check both), both basic and extended, so this is supposed to be valid.

This introduces some asm-parser-only definitions which take only the upper
three bits from the specified BO value, and the lower two bits are implied by
the +- suffix (via some associated aliases).

Fixes PR23646.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280571 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Do not inherit from std::iterator in ilist_iterator

Inheriting from std::iterator uses more boiler-plate than manual
typedefs. Avoid that in both ilist_iterator and
MachineInstrBundleIterator.

This has the side effect of removing ilist_iterator from certain ADL
lookups in namespace std; calls to std::next need to be qualified by
"std::" that didn't have to before. The one case of this in-tree was
operating on a temporary, so I used the more compact operator++.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280570 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Split out iplist_impl from iplist, NFC

Split out iplist_impl from iplist, and change SymbolTableList to inherit
directly from iplist_impl.  This makes it more straightforward to add
new template paramaters to iplist [*]:
- iplist_impl takes a "base" list that provides the intrusive
  functionality (usually simple_ilist<T>) and a traits class.
- iplist no longer takes a "Traits" template parameter.  It only takes
  the value_type, T, and instantiates iplist_impl with simple_ilist<T>
  and ilist_traits<T>.
- SymbolTableList now inherits from iplist_impl, instead of iplist.

Note for out-of-tree code: if you have an iplist whose second template
parameter was *not* the default (i.e., not ilist_traits<YourT>), you
have three options:
- Stop using a custom traits class, and instead specialize
  ilist_traits<YourT>.  This is the usual thing to do.
- Specialize iplist<YourT> to pass your custom traits class into
  iplist_impl.
- Create your own trivial list type that passes your custom traits class
  into iplist_impl (see SymbolTableList<> for an example).

[*]: The eventual goal is to start tracking a sentinel bit on the
MachineInstr list even when LLVM_ENABLE_ABI_BREAKING_CHECKS is off,
which will enable MachineBasicBlock::reverse_iterator to have normal
list invalidation semantics that matching the new
iplist<>::reverse_iterator from r280032.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280569 91177308-0d34-0410-b5e6-96231b3b80d8

Fix buildbot error.

Add -mtriple=x86_64-unknown-linux-gnu for the test and move it to CodeGen/X86.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280568 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Rename NodeTy to T in iplist/ilist template parameters

And use other typedefs so that the next rename has a smaller diff.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280567 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Remove external uses of ilist_iterator, NFC

Delete the dead code for Write(ilist_iterator) in the IR Verifier,
inline report(ilist_iterator) at its call sites in the MachineVerifier,
and use simple_ilist<>::iterator in SymbolTableListTraits.

The only remaining reference to ilist_iterator outside of the ilist
implementation is from MachineInstrBundleIterator. I'll get rid of that
in a follow-up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280565 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Fix up IListTest.privateNode and get it passing

This test was using the wrong type, and so not actually testing much.
ilist_iterator constructors weren't going through ilist_node_access, so
they didn't actually work with private inheritance.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280564 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev

These few book-III instructions are used by the Linux kernel.

Partially fixes PR24796.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280560 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add support for the extended dcbf form and mnemonics

dcbf has an optional hint-like field, add support for the extended form and the
associated mnemonics (dcbfl and dcbflp).

Partially fixes PR24796.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280559 91177308-0d34-0410-b5e6-96231b3b80d8

(LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block:
1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not.
0xNN and NNh may come with optional U or L suffix.
2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not.
NNb may come with optional U or L suffix.

Differential Revision: https://reviews.llvm.org/D22112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280555 91177308-0d34-0410-b5e6-96231b3b80d8

Make sure to maintain register liveness when generating predicated instructions.

Author: Krzysztof Parzyszek <kparzysz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D24209

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280552 91177308-0d34-0410-b5e6-96231b3b80d8

gitignore: ignore VS Code editor files

Summary: VS code creates .vscode folder to keep its stuff that we really don't need in git.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24211

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280551 91177308-0d34-0410-b5e6-96231b3b80d8

lit: print process output, if getting the list of google-tests failed.

Summary:
This is a follow up to r280455, where a check for the process exit code
was introduced. Some ASAN bots throw this error now, but it's impossible
to understand what's wrong with them, and the issue is not reproducible.

Reviewers: vitalybuka

Differential Revision: https://reviews.llvm.org/D24210

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280550 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Make FieldList records print as a yaml sequence.

Before we were kind of imitating the behavior of a Yaml sequence
by outputting each record one after the other. This makes it a
little cumbersome when we want to go the other direction -- from
Yaml to Pdb. So this treats FieldList records as no different than
any other list of records, by printing them as a Yaml sequence with
the exact same format.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280549 91177308-0d34-0410-b5e6-96231b3b80d8

[Profile] handle select instruction in 'expect' lowering

Builtin expect lowering currently ignores select. This patch
fixes the issue

Differential Revision: http://reviews.llvm.org/D24166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280547 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha

When we have an offset into a global, etc. that is accessed relative to the TOC
base pointer, and the offset is larger than the minimum alignment of the global
itself and the TOC base pointer (which is 8-byte aligned), we can still fold
the @toc@ha into the memory access, but we must update the addis instruction's
symbol reference with the offset as the symbol addend. When there is only one
use of the addi to be folded and only one use of the addis that would need its
symbol's offset adjusted, then we can make the adjustment and fold the @toc@l
into the memory access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280545 91177308-0d34-0410-b5e6-96231b3b80d8

[Sparc] Mark i128 shift libcalls unavailable in 32-bit mode.

Recently, llvm wants to emit calls to these functions, while it didn't
seem to be an issue before. Not sure why. Nor do I know why only these
three are important to disable, out of all of the i128 libcalls.

Nevertheless, many other targets have this snippet of code, so, just
copying it to sparc as well, to unbreak things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280537 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements.

Fixes R600 piglit regressions since r280298

Differential Revision: https://reviews.llvm.org/D24174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280535 91177308-0d34-0410-b5e6-96231b3b80d8

Setting fp trapping mode and denormal type: this an improvement of
r280246 and calculates compatibility of functions attributes in
a better way.

Differential Revision: https://reviews.llvm.org/D24070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280534 91177308-0d34-0410-b5e6-96231b3b80d8

Do not consider subreg defs as reads when computing subrange liveness

Subregister definitions are considered uses for the purpose of tracking
liveness of the whole register. At the same time, when calculating live
interval subranges, subregister defs should not be treated as uses.

Differential Revision: https://reviews.llvm.org/D24190

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280532 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] auto-generate assertions for tighter checking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280531 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Don't pass a global CL option as an argument. NFC.

Differential Revision: https://reviews.llvm.org/D24199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280527 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/R600: Expand unaligned writes to local and global AS

LOCAL and GLOBAL AS only
PRIVATE needs special treatment

Differential Revision: https://reviews.llvm.org/D23971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280526 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Reorganize store tests

Split by AS.
Merge with some prviously failing tests.

Differential Revision: https://reviews.llvm.org/D23969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280523 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Use the correct max CV record length of 0xFF00

Previously we were splitting our records at 0xFFFF bytes, which the
Microsoft tools don't like.

Should fix failure on the new Windows self-host buildbot.

This length appears in microsoft-pdb/PDB/dbi/dbiimpl.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280522 91177308-0d34-0410-b5e6-96231b3b80d8

IfConversion: Add assertions that both sides of a diamond don't pred-clobber.

One side of a diamond may end with a predicate clobbering instruction.
That side of the diamond has to be if-converted second. Both sides can't
clobber the predicate or the ifconversion is invalid. This is checked
elsewhere, but add an assert as a safety check. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280518 91177308-0d34-0410-b5e6-96231b3b80d8

IfConversion: Fix bug introduced by rescanning diamonds.

Passing the wrong values for predicate-clobbering. Simple to miss.
Added an assert to make this easier to catch in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280517 91177308-0d34-0410-b5e6-96231b3b80d8

Fix up comment from r280442, noticed by Justin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280508 91177308-0d34-0410-b5e6-96231b3b80d8

Split the store of a wide value merged from an int-fp pair into multiple stores.

For the store of a wide value merged from a pair of values, especially int-fp pair,
sometimes it is more efficent to split it into separate narrow stores, which can
remove the bitwise instructions or sink them to colder places.

Now the feature is only enabled on x86 target, and only store of int-fp pair is
splitted. It is possible that the application scope gets extended with perf evidence
support in the future.

Differential Revision: https://reviews.llvm.org/D22840

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280505 91177308-0d34-0410-b5e6-96231b3b80d8

[InsttCombine] fold insertelement of constant into shuffle with constant operand (PR29126)

The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards
shrinking that to a single shufflevector.

Note that the transform is intentionally limited to shuffles that are equivalent to vector
selects to avoid creating arbitrary shuffle masks that may not lower well.

This should solve PR29126:
https://llvm.org/bugs/show_bug.cgi?id=29126

Differential Revision: https://reviews.llvm.org/D23886

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280504 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/LTO] Simplify. No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280503 91177308-0d34-0410-b5e6-96231b3b80d8

Quick fix to make LIT_PRESERVES_TMP work again

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280502 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Clean up temporary files created by tests

Do this by creating a temp directory in the normal system temp
directory, and cleaning it up on exit.

It is still possible for this temp directory to leak if Python exits
abnormally, but this is probably good enough for now.

Fixes PR18335

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280501 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Update known test failures

Fixed an issue with the experimental C headers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280498 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Ensure reverse interleaved group GEPs remain uniform

For uniform instructions, we're only required to generate a scalar value for
the first vector lane of each unroll iteration. Thus, if we have a reverse
interleaved group, computing the member index off the scalar GEP corresponding
to the last vector lane of its pointer operand technically makes the GEP
non-uniform. We should compute the member index off the first scalar GEP
instead.

I've added the updated member index computation to the existing reverse
interleaved group test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280497 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify code a bit. No functional change intended.

We don't need to call `GetCompareTy(LHS)' every single time true or false is
returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280491 91177308-0d34-0410-b5e6-96231b3b80d8

fix documentation comments; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280489 91177308-0d34-0410-b5e6-96231b3b80d8

[instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN.

This patch fixes a crash caused by an incorrect folding of an ordered comparison
between a packed floating point vector and a splat vector of NaN.

An ordered comparison between a vector and a constant vector of NaN, should
always be folded into a constant vector where each element is i1 false.

Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar
'false'. Later on, this would cause an assertion failure, since the value type
of the folded value doesn't match the expected value type of the uses of the
original instruction: "Assertion failed: New->getType() == getType() &&
"replaceAllUses of value with new value of different type!".

This patch fixes the issue and adds a test case to the already existing test
InstSimplify/floating-point-compares.ll.

Differential Revision: https://reviews.llvm.org/D24143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280488 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift.

This fixes a regression introduced by revision 268094.
Revision 268094 added the following dag combine rule:
// trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2

That rule converts a truncate of a shift-by-constant into a shift of a truncated
value. We do this only if the shift count is less than half the size in bits of
the truncated value (K < vt.size / 2).

The problem is that the constraint on the shift count is incorrect, so the rule
doesn't work well in some cases involving vector types. The combine rule should
have been written instead like this:
// trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits()

Basically, if K is smaller than the "scalar size in bits" of the truncated value
then we know that by "sinking" the truncate into the operand of the shift we
would never accidentally make the shift undefined.

This patch fixes the check on the shift count, and adds test cases to make sure
that we don't regress the behavior.

Differential Revision: https://reviews.llvm.org/D24154

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280482 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed a typo (LLVM/Support/CFG.h -> LLVM/IR/CFG.h)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280481 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Try to fix an MSVC2013 failure due to finding a template
constructor when trying to do copy construction by adding an explicit
move constructor.

Will watch the bots to discover if this is sufficient.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280479 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test for insertelementinsts with constants.

Added a tests that shows that several insertelementinsts with constant
indexes/data are not folded into a single shuffleinst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280474 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] - Fix possible crash in match() of llvm::Regex.

Crash was possible if match() method
was called on object that was moved or object
created with empty constructor.

Testcases updated.

DIfferential revision: https://reviews.llvm.org/D24123

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280473 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] - Teach readobj to print DT_AUXILIARY dynamic tag in human readable form.

Previously DT_AUXILIARY was unknown, patch fixes that.

Differential revision: https://reviews.llvm.org/D24138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280471 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Add a workaround to fix PR30188

We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280470 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Move tests for masked floating point logical operations to avx512dqvl-intrinsics-upgrade.ll since they have now been autoupgraded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280467 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Remove floating point logical operation instrinsics and replace them with native IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280466 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add more patterns for masked and broadcasted logical operations where the select or broadcast has a floating point type.

These are needed in order to remove the masked floating point logical operation intrinsics and use native IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280465 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add execution domain fixing for logical operations with broadcast loads. This builds on the handling of masked ops since we need to keep element size the same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280464 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Strengthen some SDNode type constraints.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280463 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add NoVLX Predicates to some patterns so they don't rely on pattern ordering to be lower priority than their equivalent VLX pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280462 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Fix another typo in the Error/Expected docs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280461 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Fix a couple of typos in the Error/Expected docs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280460 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Fix some missing fields in OrcRemoteTargetClient's move constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280459 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing &. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280458 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] hasAndNotCompare should return true

As Sanjay suggested when he added the hook, PPC should return true from
hasAndNotCompare. We have an efficient negated 'and' on PPC (which can feed a
compare).

Fixes PR27203.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280457 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Fail testing if a googletest executable crashes during test discovery

googletest formatted tests are discovered by running the test executable.
Previously testing would silently succeed if the test executable crashed
during the discovery process. Now testing fails with "error: unable to
discover google-tests ..." if the test executable exits with a non-zero status.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280455 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add a pattern for a runtime bit check

Following a suggestion by Sanjay, we should lower:

  %shl = shl i32 1, %y
  %and = and i32 %x, %shl
  %cmp = icmp eq i32 %and, %shl
  ret i1 %cmp

into:

  subfic r4, r4, 32
  rlwnm r3, r3, r4, 31, 31

Add this pattern and some associated patterns for the 64-bit case and the
not-equal case. Fixes PR27356.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280454 91177308-0d34-0410-b5e6-96231b3b80d8

revert r280429 and r280425:

r280425 | dehao | 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) | 9 lines

Refactor LICM pass in preparation for LoopSink pass.

Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).

r280429 | dehao | 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) | 9 lines

Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.

Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280453 91177308-0d34-0410-b5e6-96231b3b80d8

revert r280432:

r280432 | dehao | 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) | 9 lines

Explicitly require DominatorTreeAnalysis pass for instsimplify pass.

Summary: DominatorTreeAnalysis is always required by instsimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280452 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/Transforms/GCOVProfiling/three-element-mdnode.ll: Use %/T instead of %T, not to emit backslashes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280451 91177308-0d34-0410-b5e6-96231b3b80d8

bugpoint: clang-format all of bugpoint. NFC

I'm going to clean up the APIs here a bit and touch many many lines
anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280450 91177308-0d34-0410-b5e6-96231b3b80d8

raw_pwrite_stream_test.cpp: _putenv_s() may be assumed as win32-generic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280449 91177308-0d34-0410-b5e6-96231b3b80d8

IfConversion: Don't count branches in # of duplicates.

If the entire blocks match, we would count the branch instructions
toward the number of duplicated instructions. This doesn't match what we
do elsewhere, and was causing a bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280448 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Add a unittest for invalidating module analyses with an SCC pass.

This wasn't really well explicitly tested with a nice unittest before.
It seems good to have reasonably broken out unittests for this kind of
functionality as I'm workin go other invalidation features to make sure
none of the existing ones regress.

This still has too much duplicated code, I plan to factor that out in
a subsequent commit to use common helpers for repeated parts of this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280447 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] (NFC) Split the IR parsing into a fixture so that I can split out
more testing into other test routines while using the same core module.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280446 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a real temp file leak in FileOutputBuffer

If we failed to commit the buffer but did not die to a signal, the temp
file would remain on disk on Windows. Having an open file mapping and
file handle prevents the file from being deleted. I am choosing not to
add an assertion of success on the temp file removal, since virus
scanners and other environmental things can often cause removal to fail
in real world tools.

Also fix more temp file leaks in unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280445 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] (NFC) Refactor the CGSCC pass manager tests to use lambda-based
passes.

This simplifies the test some and makes it more focused and clear what
is being tested. It will also make it much easier to extend with further
testing of different pass behaviors.

I've also replaced a pointless module pass with running the requires
pass directly as that is all that it was really doing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280444 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix some temp file leaks in SupportTests, PR18335

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280443 91177308-0d34-0410-b5e6-96231b3b80d8

[CFGPrinter] Display branch weight on the edges

Summary:
This is pretty useful especially in connection with
BFI's -view-block-freq-propagation-dags. It helped me to track down the
bug that is being fixed in D24118.

While -view-block-freq-propagation-dags displays the high-level
information with static heuristics included (and block frequencies), the
new thing only shows the raw weight as presented by PGO without any of
the static estimates. This helps to distinguished what has been
measured vs. estimated.

For the sample loop in D24118, -view-block-freq-propagation-dags=integer
looks like this:

https://reviews.llvm.org/F2381352

While with -view-cfg-only you can see the underlying branch weights:

https://reviews.llvm.org/F2392296

Reviewers: dexonsmith, bogner, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280442 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Don't apply the PPC64 address-formation peephole for offsets greater than 7

When applying our address-formation PPC64 peephole, we are reusing the @ha TOC
addis value with the low parts associated with different offsets (i.e.
different effective symbol addends). We were assuming this was okay so long as
the offsets were less than the alignment of the global variable being accessed.
This ignored the fact, however, that the TOC base pointer itself need only be
8-byte aligned. As a result, what we were doing is legal only for offsets less
than 8 regardless of the alignment of the object being accessed.

Fixes PR28727.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280441 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Don't consider fusion in PPC64 address-formation peephole

The logic in this function assumes that the P8 supports fusion of addis/addi,
but it does not. As a result, there is no advantage to restricting our peephole
application, merging addi instructions into dependent memory accesses, even
when the addi has multiple users, regardless of whether or not we're optimizing
for size.

We might need something like this again for the P9; I suspect we'll revisit
this code when we work on P9 tuning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280440 91177308-0d34-0410-b5e6-96231b3b80d8

Explicitly require DominatorTreeAnalysis pass for instsimplify pass.

Summary: DominatorTreeAnalysis is always required by instsimplify.

Reviewers: davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280432 91177308-0d34-0410-b5e6-96231b3b80d8