git.osdn.net Git - android-x86/external-llvm.git/log

[lit, shtest-timeout] Always use an internal shell for the shtest-timeout to diagnose buildbot failures

Summary:
Right now this test is failing on the buildbots on Windows, but we have a very similar setup where the test passes. The test is meant to check that specifying a timeout works correctly by running an infinite loop and having it time out; on the buildbot, the infinite loop doesn't actually execute. This change runs all of the tests in the set using an internal shell rather than an external shell. I expect this will make the test pass, which would mean that either the way the external shell is invoked or the external shell setup on the buildbots is not correct. Regardless of whether the test passes with this change, we'll need to undo this change and have a real fix.

@gkistanova was able to get logs from the buildbot to rule out a number of theories as to why this test is failing, but they didn't have enough information to confirm exactly what the issue is. The purpose of this change is to narrow it down, but if someone has a local repro and can aid in debugging, that would make it much speedier (and less prone to making the bots fail).

Reviewers: gkistanova, asmith, zturner, modocache, rnk, delcypher

Reviewed By: rnk

Subscribers: delcypher, llvm-commits, gkistanova

Differential Revision: https://reviews.llvm.org/D51326

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340840 91177308-0d34-0410-b5e6-96231b3b80d8

[debuginfo] generate debug info with asm+.file

Summary:
For assembly input files, generate debug info even when the .file
directive is present, provided it does not include a file-number
argument. Fixes PR38695.

Reviewers: probinson, sidneym

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D51315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340839 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] CodeGenDAGPatterns::GenerateVariants - basic caching of matching predicates

CodeGenDAGPatterns::GenerateVariants is a costly function in many tblgen commands (33.87% of the total runtime of x86 -gen-dag-isel), and due to the O(N^2) nature of the function, there are a high number of repeated comparisons of the pattern's vector<Predicate>.

This initial patch at least avoids repeating these comparisons for every Variant in a pattern. I began investigating caching all the matches before entering the loop but hit issues with how best to store the data and how to update the cache as patterns were added.

Saves around 15secs in debug builds of x86 -gen-dag-isel.
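
A minimal sketch of the caching idea, not the actual TableGen code: whether two patterns have equal predicate lists does not depend on the variant being checked, so the comparison can be done once per pattern and reused for each of its variants. `Pattern`, its fields, and both helper functions are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    predicates: tuple  # stand-in for the pattern's vector<Predicate>
    tree: str          # stand-in for the pattern's DAG
    variants: list     # stand-ins for the generated variant DAGs


def find_duplicates_naive(patterns):
    """Re-compare predicate lists for every variant of every pattern."""
    comparisons, dups = 0, []
    for i, pat in enumerate(patterns):
        for variant in pat.variants:
            for j, other in enumerate(patterns):
                comparisons += 1
                if pat.predicates != other.predicates:
                    continue
                if variant == other.tree:
                    dups.append((i, variant, j))
    return dups, comparisons


def find_duplicates_cached(patterns):
    """Compare predicate lists once per pattern, then reuse the result."""
    comparisons, dups = 0, []
    for i, pat in enumerate(patterns):
        matching = []
        for j, other in enumerate(patterns):
            comparisons += 1
            if pat.predicates == other.predicates:
                matching.append(j)
        for variant in pat.variants:
            for j in matching:
                if variant == patterns[j].tree:
                    dups.append((i, variant, j))
    return dups, comparisons
```

Both functions report the same duplicates; the cached form performs one predicate comparison per pair of patterns instead of one per variant per pattern.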

Differential Revision: https://reviews.llvm.org/D51035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340837 91177308-0d34-0410-b5e6-96231b3b80d8

[benchmark] Stop building benchmarks by default

Although the benchmark regex-related build issue seems to be
fixed, it appears that benchmark library triggers some stage 2 clang-cl
bugs:

http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/13495/steps/build%20stage%202/logs/stdio

The only sensible option now is to prevent benchmark library from
building in the default configuration.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340836 91177308-0d34-0410-b5e6-96231b3b80d8

[Inliner] Attribute callsites with inline remarks

Summary:
Sometimes, when reading an output *.ll file, it is not easy to understand why some callsites are not inlined. We can read the output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.

An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.

For example, in the provided test the resulting function //@test1// has two callsites of //@bar//, and the inline remarks report different reasons for the missed inlining:
  remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
  remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive

It is not clear which remark corresponds to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
  define void @test1() {
    call void @bar(i1 true) #0
    call void @bar(i1 false) #2
    ret void
  }
  attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
  ..
  attributes #2 = { "inline-remark"="(cost=never): recursive" }

Patch by: yrouban (Yevgeny Rouban)

Reviewers: xbolva00, tejohnson, apilipenko

Reviewed By: xbolva00, tejohnson

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D50435

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340834 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix copy paste mistake in vector-idiv-v2i32.ll. Add missing test case.

Some of the test cases contained the same load twice instead of a different load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340833 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Add support for a16 modifier for gfx9

Summary:
Adding support for a16 for gfx9. A16 bit replaces r128 bit for gfx9.

Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50575

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340831 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Initialize each element in vector TimelineView::UsedBuffers to a default invalid buffer descriptor. NFCI

Also change the default buffer size for UsedBuffer entries to -1 (i.e. "unknown
size"). No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340830 91177308-0d34-0410-b5e6-96231b3b80d8

[benchmark] Fix buildbots failing to identify regex support

This is cleanup after newly introduced google/benchmark library
(rL340809). Many buildbots fail to identify regex engine support, so
this should presumably fix the issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340827 91177308-0d34-0410-b5e6-96231b3b80d8

Clarify comment in the string-offsets-table-order.ll test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340826 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][TimelineView] Force the same number of executions for every entry in the 'wait-times' table.

This patch also uses colors to highlight problematic wait-time entries.
A problematic entry is an entry with a high wait time that tends to match (or
exceed) the size of the scheduler's buffer.

Color RED is used if an instruction had to wait an average number of cycles
which is bigger than (or equal to) the size of the underlying scheduler's
buffer.
Color YELLOW is used if the time (in cycles) spent waiting for the
operands or pipeline resources is bigger than half the size of the underlying
scheduler's buffer.
Color MAGENTA is used if an instruction does not consume buffer resources
according to the scheduling model.
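
The color rules above can be sketched as a small selection function. This is an illustration of the thresholds as stated in this message, not the TimelineView code; the function name, the use of `None` for "does not consume buffer resources", and the `None` return for an unhighlighted entry are all assumptions.

```python
def wait_time_color(avg_wait_cycles, buffer_size):
    """Pick a highlight color for one 'wait-times' entry.

    buffer_size is None when the instruction consumes no scheduler
    buffer resources (an assumption made for this sketch).
    """
    if buffer_size is None:
        return "MAGENTA"
    if avg_wait_cycles >= buffer_size:
        return "RED"
    if avg_wait_cycles > buffer_size / 2:
        return "YELLOW"
    return None  # not a problematic entry: no highlight
```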

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340825 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] ImmutableList no longer requires elements to be copy constructible

ImmutableList used to require elements to have a copy constructor for no
good reason; this patch fixes that.
It also required, but did not enforce, that its elements be trivially
destructible, so a new static_assert is added to guard against misuse.

Differential Revision: https://reviews.llvm.org/D49985

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340824 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Pass an instruction reference when notifying event listeners about reserved/released buffer resources. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340821 91177308-0d34-0410-b5e6-96231b3b80d8

[CloneFunction] Constant fold terminators before checking single predecessor

Summary:
This fixes PR31105.

There is code trying to delete dead code that does so by e.g. checking if
the single predecessor of a block is the block itself.

That check fails on a block like this
bb:
   br i1 undef, label %bb, label %bb
since that has two (identical) predecessors.

However, after the check for dead blocks there is a call to
ConstantFoldTerminator on the basic block, and that call simplifies the
block to
bb:
   br label %bb

Therefore we now do the call to ConstantFoldTerminator before the check if
the block is dead, so it can realize that it really is.

The original behavior led to the block not being removed, but it was
simplified as above, and then we did a call to
    Dest->replaceAllUsesWith(&*I);
with old and new being equal, and an assertion triggered.
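
The ordering problem can be reproduced on a toy control-flow graph. This is a simplified model with hypothetical helper names, not the CloneFunction code: predecessors are counted per edge, so a conditional branch with two identical targets yields two predecessor entries until the terminator is folded.

```python
def predecessors(cfg, block):
    """One entry per incoming edge, so duplicate edges are preserved."""
    return [src for src, succs in cfg.items() for dst in succs if dst == block]


def single_predecessor(cfg, block):
    preds = predecessors(cfg, block)
    return preds[0] if len(preds) == 1 else None


def fold_terminator(cfg, block):
    """Turn a conditional branch with identical targets into a plain branch."""
    succs = cfg[block]
    if len(succs) == 2 and succs[0] == succs[1]:
        cfg[block] = [succs[0]]


def is_dead_self_loop(cfg, block):
    """The dead-block check described above: the only predecessor is itself."""
    return single_predecessor(cfg, block) == block
```

With `cfg = {"bb": ["bb", "bb"]}`, the check fails before `fold_terminator` runs and succeeds afterwards, which is why the fold has to come first.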

Reviewers: chandlerc, fhahn

Reviewed By: fhahn

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D51280

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340820 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Use std::move where possible in InstructionMemo constructor. NFCI.

Requested in post-commit review for rL339670

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340819 91177308-0d34-0410-b5e6-96231b3b80d8

[GVNHoist] Prune out useless CHI insertions

Fix for the out-of-memory error when compiling SemaChecking.cpp
with GVNHoist and ubsan enabled. I've used a cache for inserted
CHIs to avoid excessive memory usage.

Differential Revision: https://reviews.llvm.org/D50323

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340818 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Apply another commit to comply with old CMake

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340817 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Improve variable scalar shift of vXi8 vectors (PR34694)

This patch creates the shift mask and actual shift using the vXi16 vector shift ops.
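
A scalar model of the technique, assuming a left shift and little-endian lanes: pairs of bytes are shifted as one 16-bit lane, and a per-byte mask then clears the bits that crossed over from the neighbouring byte. The function name is hypothetical and this does not reproduce the actual lowering code.

```python
def shl_bytes_via_i16(data, amt):
    """Shift every byte left by amt using only 16-bit lane shifts plus a mask."""
    assert len(data) % 2 == 0 and 0 <= amt < 8
    # The mask is produced by the same kind of shift applied to all-ones.
    mask = (0xFF << amt) & 0xFF
    out = []
    for i in range(0, len(data), 2):
        lane = data[i] | (data[i + 1] << 8)  # two bytes viewed as one i16 lane
        lane = (lane << amt) & 0xFFFF        # the 16-bit shift
        out.append(lane & 0xFF & mask)       # low byte
        out.append((lane >> 8) & mask)       # high byte, leaked bits cleared
    return out
```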

Differential Revision: https://reviews.llvm.org/D51263

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340813 91177308-0d34-0410-b5e6-96231b3b80d8

[benchmark] Silence warning by applying upstream patch

Compiling benchmark library (introduced in D50894) with the latest
bootstrapped Clang produces a lot of warnings; this issue was addressed
in the upstream patch I pushed earlier.

Upstream patch:
https://github.com/google/benchmark/commit/f85304e4e3a0e4e1bf15b91720df4a19e90b589f

`README.LLVM` notes were updated to reflect the latest changes.

Reviewed by: lebedev.ri

Differential Revision: https://reviews.llvm.org/D51342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340811 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Avoid vector extraction/insertion for non-constant uniform shifts

As discussed on D51263, we're better off using byte shifts to clear the upper bits on pre-SSE41 hardware.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340810 91177308-0d34-0410-b5e6-96231b3b80d8

Pull google/benchmark library to the LLVM tree

This patch pulls google/benchmark v1.4.1 into the LLVM tree so that any
project could use it for benchmark generation. A dummy benchmark is
added to `llvm/benchmarks/DummyYAML.cpp` to validate the correctness of
the build process.

The current version does not utilize LLVM LNT and LLVM CMake
infrastructure, but that might be sufficient for most users. Two
introduced CMake variables:

* `LLVM_INCLUDE_BENCHMARKS` (`ON` by default) generates benchmark
  targets
* `LLVM_BUILD_BENCHMARKS` (`OFF` by default) adds generated
  benchmark targets to the list of default LLVM targets (i.e. if `ON`
  benchmarks will be built upon standard build invocation, e.g. `ninja` or
  `make` with no specific targets)

List of modifications:

* `BENCHMARK_ENABLE_TESTING` is disabled
* `BENCHMARK_ENABLE_EXCEPTIONS` is disabled
* `BENCHMARK_ENABLE_INSTALL` is disabled
* `BENCHMARK_ENABLE_GTEST_TESTS` is disabled
* `BENCHMARK_DOWNLOAD_DEPENDENCIES` is disabled

Original discussion can be found here:
http://lists.llvm.org/pipermail/llvm-dev/2018-August/125023.html

Reviewed by: dberris, lebedev.ri

Subscribers: ilya-biryukov, ioeric, EricWF, lebedev.ri, srhines,
dschuff, mgorny, krytarowski, fedor.sergeev, mgrang, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340809 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] A loop can never contain Ret instruction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340808 91177308-0d34-0410-b5e6-96231b3b80d8

Fix in getAllocationDataForFunction

Summary:
Correct the check to use the set-like behaviour of AllocType: it should
check for a subset, not an exact value.
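
The difference between an exact-value test and a subset test on a flag set can be sketched as follows. The flag names and values are hypothetical and simplified; they are not the real AllocType enumerators.

```python
from enum import Flag


class AllocKind(Flag):
    # Hypothetical flags for illustration only.
    OP_NEW = 1
    MALLOC = 2
    CALLOC = 4


MALLOC_OR_CALLOC = AllocKind.MALLOC | AllocKind.CALLOC


def matches_exact(fn_kind, wanted):
    """Exact comparison: rejects a function whose kind is only part of 'wanted'."""
    return fn_kind == wanted


def matches_subset(fn_kind, wanted):
    """Subset comparison: every bit of fn_kind must be present in 'wanted'."""
    return (fn_kind & wanted) == fn_kind
```

A `MALLOC` function queried with `MALLOC_OR_CALLOC` is rejected by the exact test but accepted by the subset test.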

Reviewers: theraven

Reviewed By: theraven

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340807 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix some comments to refer to KORTEST not KTEST. NFC

KTEST is a different instruction. All of this code uses KORTEST.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340799 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target.

Summary:
I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there...

I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))).

Reviewers: efriedma, atanasyan, arsenm

Reviewed By: efriedma

Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50491

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340797 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Extend (add (sext x), cst) --> (sext (add x, cst')) and (add (zext x), cst) --> (zext (add x, cst')) to work for vectors
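
A per-lane arithmetic check of the sext form, as a sketch only. It assumes 8-bit lanes, a splat constant that fits in the narrow type, and that the rewrite is only sound when the narrow add does not overflow; the exact conditions the patch checks are not restated here, and all function names are hypothetical.

```python
def sext(value, bits):
    """Interpret the low 'bits' bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def narrow_add_overflows(x, c, bits=8):
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return not (lo <= x + c <= hi)


def add_of_sext(lanes, c, bits=8):
    """(add (sext x), cst), lane by lane."""
    return [sext(x, bits) + c for x in lanes]


def sext_of_add(lanes, c, bits=8):
    """(sext (add x, cst')), lane by lane, with a wrapping narrow add."""
    return [sext(x + c, bits) for x in lanes]
```

The two forms agree on every lane where the narrow add does not overflow, and differ on a lane where it does.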

Differential Revision: https://reviews.llvm.org/D51236

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340796 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC] Remove Darwin support from POWER backend.

This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.

Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.

Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.

llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.

All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.

Phabricator Review: https://reviews.llvm.org/D50988

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340795 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[CodeGenPrepare] Scan past debug intrinsics to find select candidates (NFC)"

This causes crashes due to the interleaved dbg.value intrinsics being
left at the end of basic blocks, causing the actual terminators (br,
etc) to be not where they should be (not at the end of the block),
leading to later crashes.

Further discussion on the original commit thread.

This reverts commit r340368.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340794 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA] Add NDEBUG checks to verifiers; NFC

verify*() methods are intended to have no side-effects (unless we detect
broken MSSA, in which case they assert()), and all of the other verify
methods are wrapped by `#ifndef NDEBUG`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340793 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fix formatting; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340790 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test cases for D51236. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340789 91177308-0d34-0410-b5e6-96231b3b80d8

[RuntimeDyld] Add test case that was accidentally left out of r340125.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340788 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow shuffle+binop canonicalization with widening shuffles

This lines up with the behavior of an existing transform where if both
operands of the binop are shuffled, we allow moving the binop before the
shuffle regardless of whether the shuffle changes the size of the vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340787 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Add unit tests for the new RTDyldObjectLinkingLayer2 class.

The new unit tests match the old ones, which will remain in tree until the
old RTDyldObjectLinkingLayer is removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340786 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add AVX runs to show more potential scalar->vector mov opportunities; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340785 91177308-0d34-0410-b5e6-96231b3b80d8

[PATCH] [InstCombine] Fix issue in the simplification of pow() with nested exp{,2}()

Fix the issue of duplicating the call to `exp{,2}()` when it's nested in
`pow()`, as exposed by rL340462.

Differential revision: https://reviews.llvm.org/D51194

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340784 91177308-0d34-0410-b5e6-96231b3b80d8

s/std::set/DenseSet/; NFC

We only use this set for `insert` and `count`, so a hashing container
seems better here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340783 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix incorrect phi values in the epilog and kernel

The code that generates the loop definition operand for phis
in the epilog and kernel is incorrect in some cases.

In the kernel, when a phi refers to another phi, the code that
updates PhiOp2 needs to include the stage difference between
the two phis.

In the epilog, the check for using the loop definition instead
of the phi definition uses the StageDiffAdj value (the difference
between the phi stage and the loop definition stage), but the
adjustment is not needed to determine if the current stage
contains an iteration with the loop definition.

Differential Revision: https://reviews.llvm.org/D51167

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340782 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] TableGen backend for stackifying instructions

Summary:
The new stackification backend generates the giant switch statement
used to translate instructions to their stackified forms. I did this
because it was more interesting than adding all the different vector
versions of the various SIMD instructions to the switch statement
manually.

Reviewers: aardappel, aheejin, dschuff

Subscribers: mgorny, sbc100, jgravelle-google, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D51318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340781 91177308-0d34-0410-b5e6-96231b3b80d8

Update the Visual Studio Integration from user feedback.

This patch removes the MSBuild warnings about options that
clang-cl ignores.  It also adds several additional fields to
the LLVM Configuration options page.  The first is that it
adds support for LLD!  To give the user flexibility though,
we don't want to force LLD to always-on, and if we're not
forcing LLD then we might as well not force clang-cl either.
So we add options that can enable or disable lld, clang-cl,
or any combination of the two.  Whenever one is disabled,
it falls back to the Microsoft equivalent.

Additionally, for each of clang-cl and lld-link, we add a new
configuration setting that allows Additional Options to be
passed for that specific tool only.  This is similar to the
C/C++ > Command Line > Additional Options entry box, but
it serves the use case where a user switches back and forth
between the toolsets in their vcxproj, but where cl.exe
won't accept some options that clang-cl will.  In this case
you can pass those options in the clang-cl additional options
and whenever clang-cl is disabled (or the other toolset is
selected entirely), those options won't get passed at all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340780 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs"

This reverts r319889.

Unfortunately, wrapping flags are not a part of SCEV's identity (they
do not participate in computing a hash value or in equality
comparisons) and in fact they could be assigned after the fact w/o
rebuilding a SCEV.

Grep for const_cast's to see quite a few examples, apparently all
for AddRec's at the moment.

So, if 2 expressions get built in 2 slightly different ways: one with
flags set in the beginning, the other with the flags attached later
on, we may end up with 2 expressions which are exactly the same but
have their operands swapped in one of the commutative N-ary
expressions, and at least one of them will have "sorted by complexity"
invariant broken.

2 identical SCEV's won't compare equal by pointer comparison as they
are supposed to.

A real-world reproducer is added as a regression test: the issue
described causes 2 identical SCEV expressions to have different order
of operands and therefore compare not equal, which in its turn
prevents LoadStoreVectorizer from vectorizing a pair of consecutive
loads.

On a larger example (the source of the test attached, which is a
bugpoint) I have seen even weirder behavior: adding a constant to an
existing SCEV changes the order of the existing terms, for instance,
getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)).

Differential Revision: https://reviews.llvm.org/D40645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340777 91177308-0d34-0410-b5e6-96231b3b80d8

Set line endings to Windows on MSBuild files.

Normally we force Unix line endings in the repository, but since these are Windows files which are consumed by Microsoft tools that we don't have the source of, we should probably err on the side of caution and force CRLF.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340776 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Reverse the check prefixes in the test added in r340774.

The 32-bit and 64-bit checks were reversed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340775 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases to show current codegen of v2i32 div/rem in 32-bit and 64-bit modes

In particular this shows that we end up using libcalls in 32-bit mode even for division by constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340774 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for possibly avoiding scalar->vector move; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340773 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Remove unused include. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340768 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: Check transformed type for forming fminnum/fmaxnum from vselect

Follow up to r340655 to fix vector types which are split.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340766 91177308-0d34-0410-b5e6-96231b3b80d8

MachineVerifier: Fix assert on implicit virtreg use

If the liveness of a physical register was invalid, this
was attempting to iterate the subregisters of all register
uses of the instruction, which would assert when it
encountered an implicit virtual register operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340763 91177308-0d34-0410-b5e6-96231b3b80d8

LangRef: Clarify expected sNaN behavior for minnum/maxnum

This matches the de-facto behavior based on constant folding
and the default lowering to fmin/fmax.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340762 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][MC] Support expressions in getMemRIX16Encoding.

Loosens an assert in getMemRIX16Encoding that restricts DQ-form instructions to
using an immediate, so that we can assemble instructions like lxv/stxv where the
offset is an expression.

Differential Revision: https://reviews.llvm.org/D51122

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340761 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Implement isLegalToVectorizeLoadChain

This lets LSV nicely split up underaligned chains.

Differential Revision: https://reviews.llvm.org/D51306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340760 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When lowering v32i8 MULHS/MULHU, shuffle after the PACKUS rather than before.

We're using a 256-bit PACKUS to do the truncation, but that instruction operates on 128-bit lanes. So previously we shuffled first to rearrange the lanes. But that requires 2 shuffles. Instead we can shuffle after the PACKUS using a single VPERMQ. This matches what our normal LowerTRUNCATE code does when it uses PACKUS.
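
The lane behaviour can be modelled on plain lists. This sketch assumes the two inputs are the low and high halves of a 32-element i16 vector and that VPERMQ is used with the qword order 0, 2, 1, 3; the function names are hypothetical and the model ignores everything except element placement and unsigned saturation.

```python
def packus_256(a, b):
    """Model of a 256-bit PACKUS on i16 inputs: packs within each 128-bit lane."""
    def sat(w):
        return max(0, min(255, w))  # unsigned saturation to a byte

    out = []
    for lane in range(2):
        out += [sat(w) for w in a[8 * lane: 8 * lane + 8]]
        out += [sat(w) for w in b[8 * lane: 8 * lane + 8]]
    return out


def vpermq(byte_vec, order):
    """Model of VPERMQ: reorder the four 64-bit chunks of a 256-bit vector."""
    qwords = [byte_vec[8 * i: 8 * i + 8] for i in range(4)]
    return [byte for q in order for byte in qwords[q]]
```

Packing `a = [0..15]` and `b = [16..31]` leaves the qwords in the order a-low, b-low, a-high, b-high; a single `vpermq(..., [0, 2, 1, 3])` restores element order.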

Differential Revision: https://reviews.llvm.org/D51284

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340757 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add support for matching paddus patterns where one of the vectors is a constant.

InstCombine mucks these up a bit. So we need to do some additional pattern matching to fix it. There are still a few special cases not handled, but this covers the general case.

Differential Revision: https://reviews.llvm.org/D50952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340756 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Introduce the llvm-mca library and organize the directory accordingly. NFC.

Summary:
This patch introduces llvm-mca as a library.  The driver (llvm-mca.cpp), views, and stats, are not part of the library.
Those are separate components that are not required for the functioning of llvm-mca.

The directory has been organized as follows:
All library source files now reside in:
  - `lib/HardwareUnits/` - All subclasses of HardwareUnit (these represent the simulated hardware components of a backend).
      (LSUnit does not inherit from HardwareUnit, but Scheduler does which uses LSUnit).
  - `lib/Stages/` - All subclasses of the pipeline stages.
  - `lib/` - This is the root of the library and contains library code that does not fit into the Stages or HardwareUnit subdirs.

All library header files now reside in the `include` directory and mimic the same layout as the `lib` directory mentioned above.

In the (near) future we would like to move the library (include and lib) contents from tools and into the core of llvm somewhere.
That change would allow various analysis and optimization passes to make use of MCA  functionality for things like cost modeling.

I left all of the non-library code just where it has always been, in the root of the llvm-mca directory.
The include directives for the non-library source file have been updated to refer to the llvm-mca library headers.
I updated the llvm-mca/CMakeLists.txt file to include the library headers, but I made the non-library code
explicitly reference the library's 'include' directory.  Once we eventually (hopefully) migrate the MCA library
components into llvm the include directives used by the non-library source files will be updated to point to the
proper location in llvm.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D50929

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340755 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Remove unused method. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340754 91177308-0d34-0410-b5e6-96231b3b80d8

[lit, python] Remove quotes around %python in cache.ll

Summary: We needed quotes around %python before to make python work correctly (on Windows) if the path contains spaces. I recently made a change so that %python now inherently has quotes, so now adding quotes around %python makes the test fail because the quotes cancel each other.
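
How doubled quotes can cancel is easy to see with POSIX-style tokenization. This is only an analogy using Python's `shlex`; the quoting rules of lit and of Windows shells differ in detail, and the interpreter path below is a made-up example.

```python
import shlex

# Hypothetical interpreter path containing a space.
PYTHON = "C:/Program Files/python.exe"


def tokens_for(substituted_python):
    """Tokenize a command line after %python has been substituted."""
    return shlex.split(f"{substituted_python} -c pass")


quoted_once = tokens_for(f'"{PYTHON}"')     # %python already carries quotes
quoted_twice = tokens_for(f'""{PYTHON}""')  # the test adds its own quotes too
```

With one layer of quotes the path stays a single token; with two layers each pair closes immediately and the path splits at the space.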

Reviewers: asmith, inglorion

Subscribers: mehdi_amini, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51244

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340753 91177308-0d34-0410-b5e6-96231b3b80d8

Use a lambda for calls to ::open in RetryAfterSignal

In Bionic, open can be overloaded for _FORTIFY_SOURCE support, causing
compile errors of RetryAfterSignal due to overload resolution. Wrapping
the call in a lambda avoids this.

Based on a patch by Chih-Wei Huang <cwhuang@linux.org.tw>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340751 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Added default stack-only instruction mode for MC.

Summary:
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll

tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*

Reviewers: dschuff, sunfish

Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D51241

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340750 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Improved report generated by the SchedulerStatistics view.

Before this patch, the SchedulerStatistics only printed the maximum number of
buffer entries consumed in each scheduler's queue at a given point of the
simulation.

This patch restructures the reported table, and adds an extra field named
"Average number of used buffer entries" to it.
This patch also uses different colors to help identifying bottlenecks caused by
high scheduler's buffer pressure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340746 91177308-0d34-0410-b5e6-96231b3b80d8

fix comment typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340744 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] add helper query for binops; NFC

We will also use this in a planned enhancement for vector insertelement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340741 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Revert commit r339779

This commit has caused failures in some internal benchmarks. Temporarily
reverting this patch until the issue can be diagnosed and fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340740 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Adding the test pointing to the fail case of D45653

Summary:
This commit adds the case of tail calling a sret function from a non-sret
function when both functions have the C calling convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340737 91177308-0d34-0410-b5e6-96231b3b80d8

[Sparc] Avoid writing outside array in applyFixup

Summary: If an object file ends with a relocation that is smaller
than 4 bytes we will write outside the Data array and trigger an
"Invalid index" assertion.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D50971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340736 91177308-0d34-0410-b5e6-96231b3b80d8
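
The Sparc applyFixup commit above concerns a write that can run past the end of the data buffer. A minimal sketch of the general idea follows, assuming the fix amounts to writing only as many bytes as the relocation covers. `applyFixupBytes` is a hypothetical helper, not the function in the Sparc backend, and big-endian byte order is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: OR the low NumBytes bytes of Value into Data,
// starting at Offset, most significant byte first (big-endian).
void applyFixupBytes(std::vector<uint8_t> &Data, std::size_t Offset,
                     uint64_t Value, unsigned NumBytes) {
  assert(NumBytes <= 8 && "fixup wider than the value type");
  assert(Offset + NumBytes <= Data.size() && "fixup would write past the end");
  for (unsigned I = 0; I != NumBytes; ++I) {
    unsigned Shift = (NumBytes - 1 - I) * 8;
    Data[Offset + I] |= static_cast<uint8_t>((Value >> Shift) & 0xff);
  }
}
```

With a fixed 4-byte loop, a 2-byte relocation at the last two bytes of the buffer would touch two bytes that do not exist; sizing the loop by the relocation width avoids that.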

[NFC][X86] Fix `sibcall.ll` formatting

Summary:
Remove unnecessary lines from `sibcall.ll` and rename labels according
to @RKSimon's recommendations in the D45653 conversation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340735 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Recommit r340016 after fixing the reported issue

The internal benchmark failure reported by Google was due to a missing
check for the result type for the sign-extend and shift DAG. This commit
adds the check and re-commits the patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340734 91177308-0d34-0410-b5e6-96231b3b80d8

[Sparc] Add support for the cycle counter available in GR740

Summary: The GR740 provides an up cycle counter in the registers ASR22
and ASR23. As these registers cannot be read together atomically, we only
use the value of ASR23 for llvm.readcyclecounter(). The ASR23 register
holds the 32 LSBs of the up-counter.
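
Because only the low 32 bits are read, the counter wraps around. A small sketch of how a caller might still measure an interval, assuming the counter wraps at most once between the two readings (`elapsedCycles` is a hypothetical helper):

```cpp
#include <cstdint>

// Start and End are two readings of a 32-bit up-counter. Unsigned
// subtraction is modulo 2^32, so the result is correct as long as the
// counter wrapped at most once between the readings.
uint32_t elapsedCycles(uint32_t Start, uint32_t End) {
  return static_cast<uint32_t>(End - Start);
}
```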

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: jfb, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340733 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Try to make buildbot happy about virtual destructors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340732 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Split logic of ImplicitControlFlowTracking to allow generalization

We have a class `ImplicitControlFlowTracking` which allows us to keep track of
instructions that can abnormally exit and answer queries like "whether or not
there is side-exiting instruction above this instruction in its block".

We may want to have similar tracking for other types of "special" instructions,
for example instructions that write memory.

This patch separates ImplicitControlFlowTracking into two classes, isolating all
general logic not related to implicit control flow into its parent class. We can
later make another child of this class to keep track of instructions that write
memory.
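
A simplified sketch of the shape of this split is shown below. All names are hypothetical stand-ins rather than the classes in the patch, and the sketch rescans the block on each query instead of caching results as a real implementation probably would.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction: just the two properties the sketch cares about.
struct Inst {
  bool MayExitAbnormally;
  bool WritesMemory;
};
using Block = std::vector<Inst>;

// Parent class: generic "is there a special instruction above this one?"
// logic, independent of what "special" means.
class SpecialInstTracking {
public:
  virtual ~SpecialInstTracking() = default;

  // True if a special instruction appears strictly before B[Index].
  bool hasSpecialInstBefore(const Block &B, std::size_t Index) const {
    for (std::size_t I = 0; I < Index && I < B.size(); ++I)
      if (isSpecial(B[I]))
        return true;
    return false;
  }

protected:
  virtual bool isSpecial(const Inst &I) const = 0;
};

// Child for implicit control flow: instructions that may exit abnormally.
class ImplicitControlFlowSketch : public SpecialInstTracking {
protected:
  bool isSpecial(const Inst &I) const override { return I.MayExitAbnormally; }
};

// A possible later child: instructions that write memory.
class MemoryWriteSketch : public SpecialInstTracking {
protected:
  bool isSpecial(const Inst &I) const override { return I.WritesMemory; }
};
```

Only the predicate differs between the two children, which is what makes the parent reusable.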

The motivation for that is that we want to make these checks efficiently in the
patch https://reviews.llvm.org/D50891.

NOTE: The naming of the parent class is not super cool, but the other options we
have are hardly better. Please feel free to rename it as NFC if you think you've
found a more informative name for it.

Differential Revision: https://reviews.llvm.org/D50954
Reviewed By: fedor.sergeev

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340728 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Expose an easier helper function for getting names for relocation types

The existing method is protected, and requires using DataRefImpl
and SmallVector.

Differential Revision: https://reviews.llvm.org/D50995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340725 91177308-0d34-0410-b5e6-96231b3b80d8

[Sparc] Custom bitcast between f64 and v2i32

Summary:
Currently bitcasting constants from f64 to v2i32 is done by storing the
value to the stack and then loading it again. This is not necessary, but
seems to happen because v2i32 is a valid type for Sparc V8. If it had not
been legal, we would have gotten help from the type legalizer.

This patch tries to do the same work as the legalizer would have done by
bitcasting the floating point constant and splitting the value up into a
vector of two i32 values.
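
The value-level effect of that split can be sketched in plain C++. This assumes an IEEE-754 64-bit double and places the high word in element 0, as a big-endian target would; `splitF64Bits` is a hypothetical helper, not code from the patch.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a double and split them into two 32-bit words.
// Element 0 holds the high word (big-endian element order is an assumption).
std::array<uint32_t, 2> splitF64Bits(double D) {
  uint64_t Bits;
  static_assert(sizeof(Bits) == sizeof(D), "expected a 64-bit double");
  std::memcpy(&Bits, &D, sizeof(Bits));
  return {static_cast<uint32_t>(Bits >> 32), static_cast<uint32_t>(Bits)};
}
```

Since the input is a constant, both words are known at compile time, so no store and reload through the stack is needed.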

Reviewers: venkatra, jyknight

Reviewed By: jyknight

Subscribers: glaubitz, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D49219

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340723 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] atomic_store_nn have a different layout to regular store

We cannot directly reuse the patterns of StPat because for some reason the store
DAG node and the atomic_store_nn DAG nodes put the ptr and the value in
different positions. Currently we attempt to store the address to an address
formed by the value.

Differential Revision: https://reviews.llvm.org/D51217

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340722 91177308-0d34-0410-b5e6-96231b3b80d8

Fix this file to have the necessary standard library includes and use
the `std::` namespace. Should fix a number of build bots as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340721 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Cleanup the LowerMULH code by hoisting some commonalities between the vXi32 and vXi8 handling. NFCI

vXi32 support was recently moved from LowerMUL_LOHI to LowerMULH.

This commit shares the getOperand calls, switches both to use common IsSigned flag, and hoists the NumElems/NumElts variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340720 91177308-0d34-0410-b5e6-96231b3b80d8

[MS Demangler] Add virtual destructor.

Silence -Wnon-virtual-dtor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340711 91177308-0d34-0410-b5e6-96231b3b80d8

[MS Demangler] Re-write the Microsoft demangler.

This is a pretty large refactor / re-write of the Microsoft
demangler.  The previous one was a little hackish because it
evolved as I was learning about all the various edge cases,
exceptions, etc.  It didn't have a proper AST and so there was
lots of custom handling of things that should have been much
more clean.

Taking what was learned from that experience, it's now
re-written with a completely redesigned and much more sensible
AST.  It's probably still not perfect, but at least it's
comprehensible now to someone else who wants to come along
and make some modifications or read the code.

Incidentally, this fixed a couple of bugs, so I've enabled
the tests which now pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340710 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.

Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340708 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add FeatureCMOV explicitly to all CPUs that support it. Remove FeatureCMOV implication from Feature64Bit and FeatureSSE1

Summary:
Previously most CPUs inherited cmov support through Feature64Bit (or FeatureCMPXCHG16B implying Feature64Bit) or FeatureSSE1.

This has the surprising side effect that -mattr=-cmov causes an assert to fire in 64-bit mode because it clears the Feature64Bit. Or in 32-bit mode, -mattr=-cmov disables any sse/avx features which seems surprising.

This patch removes the implication and instead updates hasCMOV in X86Subtarget to check SSE1 or is64Bit in addition to the regular cmov flag. This should keep most things working the way they did before. I don't believe there is a way to specify "-cmov" directly from clang so this should only affect our lower level tools.
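
The resulting check can be pictured as a predicate over three independent flags. This is a sketch with hypothetical field names, not the X86Subtarget code itself.

```cpp
// Hypothetical stand-in for the subtarget; field names are assumptions.
struct SubtargetSketch {
  bool HasCMovFlag; // the explicit cmov feature bit
  bool HasSSE1;
  bool Is64Bit;

  // cmov is usable if it was requested directly, or if SSE1 or 64-bit
  // mode is enabled, without any of the feature bits implying another.
  bool hasCMov() const { return HasCMovFlag || HasSSE1 || Is64Bit; }
};
```

Under this scheme, clearing the cmov bit no longer clears the 64-bit or SSE bits, because the dependency lives in the query rather than in the feature implications.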

This does stop -mattr=cx16(cmpxchg16b) from implying cmov is enabled via the 64bit flag as you can see from one of the changed tests. But that was a 32-bit test so I don't know why it enabled cx16 anyway.

For the other test I had to add -sse to override the new sse check in hasCMOV.

Reviewers: RKSimon, DavidKreitzer, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D51228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340707 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add FeatureCMOV to athlon and athlon-tbird cpus.

Summary: This matches gcc and one cpuid dump I found online. Given that these are considered 7th generation x86 CPUs it seems likely they support cmov since cmov was added by Intel in their 6th generation.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340706 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG][x86] turn insertelement into undef with variable index into splat

I noticed this along with the patterns in D51125, but when the index is variable,
we don't convert insertelement into a build_vector.

For x86, that means these get expanded at legalization time into the loading/spilling
code that we see in the tests. I think it's always better to avoid going to memory on
these, and we get the optimal 'broadcast' if it's available.
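
Why a splat is a valid replacement can be shown with a toy model: every lane other than the inserted one is undef, so it may take any value, including the inserted scalar. The sketch below uses a fixed 4-lane vector and a hypothetical function name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Model of "insertelement undef, X, Index" on a 4-lane vector. The lanes
// other than Index are undef, so filling all of them with X is a legal
// choice, and the result no longer depends on Index.
std::array<int, 4> insertIntoUndefAsSplat(int X, std::size_t Index) {
  assert(Index < 4 && "out-of-range index");
  (void)Index; // only used by the assertion above
  std::array<int, 4> V;
  V.fill(X);
  return V;
}
```

Whatever the runtime index is, the lane it selects holds X, which is all the original operation guaranteed.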

I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have
regression tests that would be affected (although I did not check what would happen in
those cases). In the most basic cases shown here, AArch64 would probably do much
better with a splat.

Differential Revision: https://reviews.llvm.org/D51186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340705 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Remove a workaround for systems lacking 8-byte atomics.

SymbolStringPool ref counts are now size_t, rather than uint64_t, so I do not
think this is necessary any more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340704 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Do not include non-global symbols in getObjectSymbolFlags.

Private symbols are not visible outside the object file, and so not defined by
the object file from ORC's perspective.

No test case yet. Ideally this would be a unit test parsing a checked-in binary,
but I am not aware of any way to reference the LLVM source root from a unit
test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340703 91177308-0d34-0410-b5e6-96231b3b80d8

Replace fancy use of initializer lists with simple functions that return
vectors, and move this test code into an anonymous namespace.

Hoping that this will avoid hitting an MSVC bug that causes it to crash
and burn pretty spectacularly. Also, this degree of clever use of
initializer lists seems somewhat questionable in general. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340702 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Replace `isa<TerminatorInst>` with `isTerminator()`.

This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.

All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340701 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid specializing a variadic member template in a way that seems to not
agree with MSVC.

There isn't actually a need for specialization here as we can write the
code generically and just have a test that will fold away as a constant.
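
The general technique can be sketched as follows. This is an illustration with hypothetical names, not the code touched by this commit: instead of specializing the member template for the empty pack, the generic body tests a compile-time constant.

```cpp
#include <cstddef>

struct ArgCounter {
  // One generic definition handles every pack size. sizeof...(Ts) is a
  // compile-time constant, so the compiler can fold the branch away.
  template <typename... Ts>
  static std::size_t countOrDefault(const Ts &...) {
    if (sizeof...(Ts) == 0)
      return 1; // hypothetical default for the empty pack
    return sizeof...(Ts);
  }
};
```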

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340700 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Sink `isExceptional` predicate to `Instruction`, rename it to
`isExceptionalTerminator` and implement it for opcodes as well following
the common pattern in `Instruction`.

Part of removing `TerminatorInst` from the `Instruction` type hierarchy
to make it easier to share logic and interfaces between instructions
that are both terminators and not terminators.
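
The pattern of pairing an opcode-level static predicate with an instance-level wrapper can be sketched with a toy type. The opcode list is a small illustrative subset and `InstSketch` is a hypothetical stand-in for `Instruction`.

```cpp
// Toy opcode set; only a few opcodes are modelled here.
enum class Opcode { Ret, Br, Invoke, Resume, Add, Load };

struct InstSketch {
  Opcode Op;

  // Opcode-level predicates: usable without an instruction object.
  static bool isTerminator(Opcode Code) {
    switch (Code) {
    case Opcode::Ret:
    case Opcode::Br:
    case Opcode::Invoke:
    case Opcode::Resume:
      return true;
    default:
      return false;
    }
  }
  static bool isExceptionalTerminator(Opcode Code) {
    return Code == Opcode::Invoke || Code == Opcode::Resume;
  }

  // Instance-level wrappers: replace a check against a dedicated
  // terminator subclass with a query on the opcode.
  bool isTerminator() const { return isTerminator(Op); }
  bool isExceptionalTerminator() const { return isExceptionalTerminator(Op); }
};
```

With the predicate keyed on the opcode, no separate class in the type hierarchy is needed to ask whether an instruction ends a block.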

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340699 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Begin removal of TerminatorInst by removing successor manipulation.

The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.

The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and update the coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.

Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.

This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html

There will be a series of much more mechanical changes following this
one to complete this move.

Differential Revision: https://reviews.llvm.org/D47467

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340698 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Legalize i8 and i16 add

Legalize G_ADD for types smaller than i32.
LegalizationArtifactCombiner replaces extend instructions with appropriate
bitwise instructions.
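
In value terms, the narrow add is performed at 32 bits and the extensions become bitwise operations. A sketch of that arithmetic follows, with hypothetical helper names. The signed variant relies on arithmetic right shift of negative values, which is guaranteed from C++20 and is what mainstream compilers do in earlier standards.

```cpp
#include <cstdint>

// i8 add done as a 32-bit add; truncating back is an AND with 0xFF,
// which is also how a zero-extend of the result can be expressed.
uint8_t addI8Widened(uint8_t A, uint8_t B) {
  uint32_t Wide = static_cast<uint32_t>(A) + static_cast<uint32_t>(B);
  return static_cast<uint8_t>(Wide & 0xFFu);
}

// Sign-extend the low 8 bits of V using shift-left then arithmetic
// shift-right (assumes arithmetic shift for negative values, see above).
int32_t sextInReg8(uint32_t V) {
  return static_cast<int32_t>(V << 24) >> 24;
}
```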

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D51213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340697 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix typo in comment, expect->except. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340695 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases for D50952, paddus patterns involving constants. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340694 91177308-0d34-0410-b5e6-96231b3b80d8

[C-API][DIBuilder] Use NameLen in LLVMDIBuilderCreateParameterVariable

Summary: NameLen wasn't being used, which caused the parameters in gdb to be very long in my case and caused crashes in others. Please also perform the correct magical incantations to have this be applied to the LLVM 7 branch.

Reviewers: whitequark, CodaFi

Reviewed By: CodaFi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51141

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340691 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Replace support for vXi32 SMUL_LOHI/UMUL_LOHI with MULHS/MULHU support instead.

Summary:
The only time vector SMUL_LOHI/UMUL_LOHI nodes are created is during division/remainder lowering. If one is created before op legalization, generic DAGCombine immediately turns that SMUL_LOHI/UMUL_LOHI into a MULHS/MULHU since only the upper half is used. That node will stick around through vector op legalization and will be turned back into UMUL_LOHI/SMUL_LOHI during op legalization. It will then be custom lowered by the X86 backend. Due to this two-step lowering, the vector shuffles created by the custom lowering get legalized after their inputs rather than before. This prevents the shuffles from being combined with any build_vector of constants.

This patch changes vXi32 to use MULHS/MULHU instead. This is what the later DAG combine did anyway. But by skipping the change back to UMUL_LOHI/SMUL_LOHI we lower it before any constant BUILD_VECTORS. This allows the vector_shuffle creation to constant fold with the build_vectors. This accounts for the test changes here.
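
For readers unfamiliar with these nodes, MULHS/MULHU yield only the upper half of a widening multiply, which is the part division-by-constant lowering needs. A scalar sketch with hypothetical helper names is below; the divide-by-3 constant is the standard 32-bit magic number, and `mulhs` assumes arithmetic right shift for negative values.

```cpp
#include <cstdint>

// Upper 32 bits of the unsigned 32x32 -> 64-bit product.
uint32_t mulhu(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// Upper 32 bits of the signed 32x32 -> 64-bit product.
int32_t mulhs(int32_t A, int32_t B) {
  return static_cast<int32_t>((static_cast<int64_t>(A) * B) >> 32);
}

// Unsigned division by 3 using only the high half of a multiply.
uint32_t udiv3(uint32_t X) { return mulhu(X, 0xAAAAAAABu) >> 1; }
```

The low half of the product is never used in `udiv3`, which is why a LOHI node that computes both halves is immediately narrowed to the high-half form.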

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51254

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340690 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG][X86] Reorder the operands of the MaskedStoreSDNode to put the value first.

Summary:
Previously the value being stored was the last operand in the SDNode. This caused the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions.

This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly.

X86 is currently the only in-tree target that uses this SDNode. Not sure if there are any users out of tree.

Reviewers: RKSimon, delena, hfinkel, eli.friedman

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50402

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340689 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make sure type is a vector before calling VT.getVectorNumElements() in combineLoopMAddPattern

Fixes PR38700.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340688 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wunused-function warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340687 91177308-0d34-0410-b5e6-96231b3b80d8

Remove superfluous semicolon. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340686 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try harder to use broadcast to load a scalar into vector reg

This is a preliminary step for a preliminary step for D50992.
I noticed that x86 often misses chances to load a scalar directly
into a vector register.

So this patch is just allowing more of those cases to match a
broadcast op in lowerBuildVectorAsBroadcast(). The old code comment
said it doesn't make sense to use a broadcast when we're loading a
single element and everything else is undef, but I think that's the
best case in the improved tests in insert-loaded-scalar.ll. We avoid
scalar-to-vector-register move and/or less efficient shuffling.

Note that there are some existing types that were already producing
a broadcast, but that happens semi-accidentally. I.e., it's not
happening as part of lowerBuildVectorAsBroadcast(). The build vector
gets expanded into load + shuffle, and then shuffle lowering produces
the broadcast.

Description of the other test diffs:
1. avx-basic.ll - replacing load+shuffle is a win.
2. sse3-avx-addsub-2.ll - vmovddup vs. vbroadcastss is neutral
3. sse41.ll - don't care - we convert that intrinsic to generic IR now, so this test is deprecated
4. vector-shuffle-128-v8.ll / vector-shuffle-256-v16.ll - pshufb alternatives with an extra instruction are not obviously bad

Differential Revision: https://reviews.llvm.org/D51125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340685 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Add support for multi-dword s.buffer.load intrinsic

Summary:
Patch by Marek Olsak and David Stuttard, both of AMD.

This adds a new amdgcn intrinsic supporting s.buffer.load, in particular
multiple dword variants. These are convenient to use from some front-end
implementations.

Also modified the existing llvm.SI.load.const intrinsic to common up the
underlying implementation.

This modification also requires that we can lower to non-uniform loads correctly
by splitting larger dword variants into sizes supported by the non-uniform
versions of the load.
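
The splitting step can be pictured as chunking a dword count into pieces no larger than what a single non-uniform load handles. The sketch below assumes a maximum of 4 dwords per piece purely for illustration; `splitDwordLoad` is a hypothetical helper and not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Split a load of NumDwords into pieces of at most MaxPerLoad dwords.
// MaxPerLoad = 4 is an assumed limit for this illustration.
std::vector<unsigned> splitDwordLoad(unsigned NumDwords,
                                     unsigned MaxPerLoad = 4) {
  assert(MaxPerLoad != 0 && "piece size must be non-zero");
  std::vector<unsigned> Pieces;
  while (NumDwords != 0) {
    unsigned Piece = std::min(NumDwords, MaxPerLoad);
    Pieces.push_back(Piece);
    NumDwords -= Piece;
  }
  return Pieces;
}
```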

V2: Addressed minor review comments.
V3: i1 glc is now i32 cachepolicy for consistency with buffer and
tbuffer intrinsics, plus fixed formatting issue.
V4: Added glc test.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D51098

Change-Id: I83a6e00681158bb243591a94a51c7baa445f169b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340684 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for shuffle+binop transform; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340683 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make requested test changes from D50636

The tests were relying on X / X -> 1 and X % X -> 0 combines not happening in the DAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340682 91177308-0d34-0410-b5e6-96231b3b80d8