[RegisterBankInfo] Ignore InstrMappings that create impossible to repair operands

Summary:
This is a follow-up to r303043. In computeMapping(), we need to disqualify an
InstrMapping if it would be impossible to repair one of the registers in the
instruction to match the mapping.

This change is needed in order to be able to define an instruction
mapping for G_SELECT for the AMDGPU target and will be tested
by test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir

Reviewers: ab, qcolombet, t.p.northover, dsanders

Reviewed By: qcolombet

Subscribers: tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D49735

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337882 91177308-0d34-0410-b5e6-96231b3b80d8

[profile] Support profiling runtime on Fuchsia

This ports the profiling runtime to Fuchsia and enables the
instrumentation. Unlike on other platforms, Fuchsia doesn't use
files to dump the instrumentation data, since on Fuchsia the filesystem
may not be accessible to the instrumented process. We instead use
the data sink to pass the profiling data to the system, the same way
sanitizer runtimes do.

Differential Revision: https://reviews.llvm.org/D47208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337881 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Teach the x86 speculative load hardening pass to harden
against v1.2 BCBS attacks directly.

Attacks using spectre v1.2 (a subset of BCBS) are described in the paper
here:
https://people.csail.mit.edu/vlk/spectre11.pdf

The core idea is to speculatively store over the address in a vtable,
jumptable, or other target of indirect control flow that will be
subsequently loaded. Speculative execution after such a store can
forward the stored value to subsequent loads, and if called or jumped
to, the speculative execution will be steered to this potentially
attacker controlled address.

Up until now, this could be mitigated by enabling retpolines. However,
that is a relatively expensive technique to mitigate this particular
flavor. Especially because in most cases SLH will have already mitigated
this. To fully mitigate this with SLH, we need to do two core things:
1) Unfold loads from calls and jumps, allowing the loads to be post-load
hardened.
2) Force hardening of incoming registers even if we didn't end up
needing to harden the load itself.

The reason we need to do these two things is because hardening calls and
jumps from this particular variant is importantly different from
hardening against leak of secret data. Because the "bad" data here isn't
a secret, but in fact speculatively stored by the attacker, it may be
loaded from any address, regardless of whether it is read-only memory,
mapped memory, or a "hardened" address. The only 100% effective way to
harden these instructions is to harden their operand itself. But to
the extent possible, we'd like to take advantage of all the other
hardening going on, we just need a fallback in case none of that
happened to cover the particular input to the control transfer
instruction.

For users of SLH, currently they are paying 2% to 6% performance overhead
for retpolines, but this mechanism is expected to be substantially
cheaper. However, it is worth reminding folks that this does not
mitigate all of the things retpolines do -- most notably, variant #2 is
not in *any way* mitigated by this technique. So users of SLH may still
want to enable retpolines, and the implementation is carefully designed to
gracefully leverage retpolines to avoid the need for further hardening
here when they are enabled.

Differential Revision: https://reviews.llvm.org/D49663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337878 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use a shift plus an lea for multiplying by a constant that is a power of 2 plus 2/4/8.

The LEA allows us to combine an add and the multiply by 2/4/8 together so we just need a shift for the larger power of 2.
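
As a rough illustration of the arithmetic (not the backend code itself), the shift-plus-LEA idea can be modeled in plain C++. The helper names `lea` and `mulPow2PlusScale` are made up for this sketch; `lea` just mimics the base + index * scale addressing form.

```cpp
#include <cassert>
#include <cstdint>

// Models the x86 addressing form used by LEA: base + index * scale,
// where scale must be 1, 2, 4 or 8. (Hypothetical helper.)
uint64_t lea(uint64_t base, uint64_t index, uint64_t scale) {
  return base + index * scale;
}

// Computes x * (2^shift + scale) with one shift and one LEA.
uint64_t mulPow2PlusScale(uint64_t x, unsigned shift, uint64_t scale) {
  uint64_t shifted = x << shift;  // SHL: x * 2^shift
  return lea(shifted, x, scale);  // LEA: shifted + x * scale
}

// Spot checks against a plain multiply.
void checkPow2PlusScale() {
  assert(mulPow2PlusScale(7, 5, 2) == 7u * 34u);  // 34 = 32 + 2
  assert(mulPow2PlusScale(7, 4, 4) == 7u * 20u);  // 20 = 16 + 4
  assert(mulPow2PlusScale(7, 6, 8) == 7u * 72u);  // 72 = 64 + 8
}
```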

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337875 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Expand mul by pow2 + 2 using a shift and two adds similar to what we do for pow2 - 2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337874 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use a two lea sequence for multiply by 37, 41, and 73.

These fit a pattern used by 11, 21, and 19.
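
The shared pattern appears to be x * (a * s + 1) with a in {3, 5, 9} and s in {2, 4, 8}. A small C++ model of that reading follows; the helper names are hypothetical and the decomposition per constant is an assumption based on the arithmetic, not a copy of the lowering code.

```cpp
#include <cassert>
#include <cstdint>

// Models LEA's base + index * scale form (hypothetical helper).
uint64_t lea(uint64_t base, uint64_t index, uint64_t scale) {
  return base + index * scale;
}

// Two LEAs: the first forms x * (innerScale + 1), the second adds x back
// after scaling the intermediate by outerScale.
uint64_t mulTwoLea(uint64_t x, uint64_t innerScale, uint64_t outerScale) {
  uint64_t t = lea(x, x, innerScale);  // x * 3, x * 5 or x * 9
  return lea(x, t, outerScale);        // x + t * outerScale
}

void checkTwoLea() {
  assert(mulTwoLea(13, 4, 2) == 13u * 11u);  // 11 = 5 * 2 + 1
  assert(mulTwoLea(13, 8, 2) == 13u * 19u);  // 19 = 9 * 2 + 1
  assert(mulTwoLea(13, 4, 4) == 13u * 21u);  // 21 = 5 * 4 + 1
  assert(mulTwoLea(13, 8, 4) == 13u * 37u);  // 37 = 9 * 4 + 1
  assert(mulTwoLea(13, 4, 8) == 13u * 41u);  // 41 = 5 * 8 + 1
  assert(mulTwoLea(13, 8, 8) == 13u * 73u);  // 73 = 9 * 8 + 1
}
```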

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337871 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases for multiply by 37, 41, and 73.

These can all be handled with 2 LEAs similar to what we do for 11, 19, 21.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337870 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Change multiply by 26 to use two multiplies by 5 and an add instead of multiply by 3 and 9 and a subtract.

Same number of operations, but ending in an add is friendlier due to it being commutable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337869 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Fix for PR38110, LV encountered llvm_unreachable()

Summary: truncateToMinimalBitWidths() doesn't handle all Instructions, and the worst case is a compiler crash via llvm_unreachable(). The fix is to add a case to handle PHINode and to change the worst case to a no-op (instead of a compiler crash).

Reviewers: sbaranga, mssimpso, hsaito

Reviewed By: hsaito

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49461

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337861 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw><nsw> transform

if the top level addition in (D + (C-D + x + ...)) could be proven to
not wrap, where the choice of D also maximizes the number of trailing
zeroes of (C-D + x + ...), ensuring homogeneous behaviour of the
transformation and better canonicalization of such expressions.

This enables better canonicalization of expressions like

  1 + zext(5 + 20 * %x + 24 * %y)  and
      zext(6 + 20 * %x + 24 * %y)

which both get transformed to

  2 + zext(4 + 20 * %x + 24 * %y)

This pattern is common in address arithmetic and the transformation
makes it easier for passes like LoadStoreVectorizer to prove that 2 or
more memory accesses are consecutive and optimize (vectorize) them.
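
To make the example concrete, here is a brute-force check in C++ that models the narrow type as i8 and the wide type as i32 (both widths are assumptions picked for the sketch). Since 20 * x + 24 * y always has two trailing zero bits, pulling D = 1 out of C = 5, or D = 2 out of C = 6, cannot wrap, while pulling out D = 5 can.

```cpp
#include <cassert>
#include <cstdint>

// Narrow (i8) sum C + 20*x + 24*y, wrapping modulo 256.
uint8_t narrowSum(uint8_t c, uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(c + 20u * x + 24u * y);
}

// zext(C + 20*x + 24*y)
uint32_t before(uint8_t c, uint8_t x, uint8_t y) {
  return narrowSum(c, x, y);
}

// D + zext((C - D) + 20*x + 24*y)
uint32_t after(uint8_t c, uint8_t d, uint8_t x, uint8_t y) {
  return d + static_cast<uint32_t>(
                 narrowSum(static_cast<uint8_t>(c - d), x, y));
}

// True if both forms agree for every i8 value of x and y.
bool transformHoldsForAllInputs(uint8_t c, uint8_t d) {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if (before(c, x, y) != after(c, d, x, y))
        return false;
  return true;
}

void checkZextSplit() {
  assert(transformHoldsForAllInputs(5, 1));  // zext(5 + ...) == 1 + zext(4 + ...)
  assert(transformHoldsForAllInputs(6, 2));  // zext(6 + ...) == 2 + zext(4 + ...)
}
```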

Reviewed By: mzolotukhin

Differential Revision: https://reviews.llvm.org/D48853

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337859 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When expanding a multiply by the negation of one less than a power of 2, like -31, don't generate a negate of a subtract that we'll never optimize.

We generated a subtract for the power of 2 minus one then negated the result. The negate can be optimized away by swapping the subtract operands, but DAG combine doesn't know how to do that and we don't add any of the new nodes to the worklist anyway.

This patch makes us explicitly emit the swapped subtract.
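
For illustration only, the two forms can be compared in C++ using wrapping unsigned arithmetic. The function names are invented for this sketch and the shift amount 5 corresponds to the -31 case.

```cpp
#include <cassert>
#include <cstdint>

// Old shape: form x * (2^n - 1) with a shift and a subtract, then negate.
uint64_t mulNegPow2MinusOneOld(uint64_t x, unsigned n) {
  uint64_t t = (x << n) - x;  // x * (2^n - 1)
  return 0 - t;               // extra negate
}

// New shape: swap the subtract operands so no negate is needed.
uint64_t mulNegPow2MinusOneNew(uint64_t x, unsigned n) {
  return x - (x << n);        // x * -(2^n - 1)
}

void checkNegPow2MinusOne() {
  // Both compute 5 * -31 = -155 in two's complement.
  assert(mulNegPow2MinusOneOld(5, 5) == mulNegPow2MinusOneNew(5, 5));
  assert(mulNegPow2MinusOneNew(5, 5) == uint64_t(0) - 155u);
}
```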

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337858 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Generalize the multiply by 30 lowering to generic multiply by power of 2 minus 2.

Use a left shift and 2 subtracts like we do for 30. Move this out from behind the slow lea check since it doesn't even use an LEA.

Use this for multiply by 14 as well.
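
A minimal C++ model of the shift-and-two-subtracts shape, assuming the constant is 2^shift - 2 (the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// x * (2^shift - 2) as one left shift followed by two subtracts of x.
uint64_t mulPow2MinusTwo(uint64_t x, unsigned shift) {
  uint64_t r = x << shift;  // x * 2^shift
  r -= x;                   // x * (2^shift - 1)
  r -= x;                   // x * (2^shift - 2)
  return r;
}

void checkPow2MinusTwo() {
  assert(mulPow2MinusTwo(9, 5) == 9u * 30u);  // 30 = 32 - 2
  assert(mulPow2MinusTwo(9, 4) == 9u * 14u);  // 14 = 16 - 2
}
```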

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337856 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add tests for weaker memory consistency orderings

Summary:
Currently all wasm atomic memory access instructions are sequentially
consistent, so even if LLVM IR specifies weaker orderings than that, we
should upgrade them to sequential ordering and treat them in the same
way.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D49194

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337854 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Change multiply by 19 to use (9 * X) * 2 + X instead of (5 * X) * 4 - X.

The new lowering can be done in 2 LEAs. The old code took 1 LEA, 1 shift, and 1 sub.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337851 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner][NFC] Move outlined function remark into its own function

This pulls the OutlinedFunction remark out into its own function to make
the code a bit easier to read.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337849 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner][NFC] Move target frame info into OutlinedFunction

Just some gardening here.

Similar to how we moved call information into Candidates, this moves outlined
frame information into OutlinedFunction. This allows us to remove
TargetCostInfo entirely.

Anywhere where we returned a TargetCostInfo struct, we now return an
OutlinedFunction. This establishes OutlinedFunctions as more of a general
repeated sequence, and Candidates as occurrences of those repeated sequences.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337848 91177308-0d34-0410-b5e6-96231b3b80d8

Put "built-in" function definitions in global Used list, for LTO. (fix bug 34169)

When building with LTO, builtin functions that are defined but whose calls have not been inserted yet get internalized. The Global Dead Code Elimination phase in the new LTO implementation then removes these function definitions. Later optimizations add calls to those functions, and the linker then dies complaining that there are no definitions. This CL fixes the new LTO implementation to check if a function is builtin, and if so, to not internalize (and later DCE) the function. As part of this fix I needed to move the RuntimeLibcalls.{def,h} files from the CodeGen subdirectory to the IR subdirectory. I have updated all the files that accessed those two files to access their new location.

Fixes PR34169

Patch by Caroline Tice!

Differential Revision: https://reviews.llvm.org/D49434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337847 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Teach the x86 backend that it can fold between TCRETURNm* and TCRETURNr* and fix latent bugs with register class updates.

Summary:
Enabling this fully exposes a latent bug in the instruction folding: we
never update the register constraints for the register operands when
fusing a load into another operation. The fused form could, in theory,
have different register constraints on its operands. And in fact,
TCRETURNm* needs its memory operands to use tailcall compatible
registers.

I've updated the folding code to re-constrain all the registers after
they are mapped onto their new instruction.

However, we still can't enable folding in the general case from
TCRETURNr* to TCRETURNm* because doing so may require more registers to
be available during the tail call. If the call itself uses all but one
register, and the folded load would require both a base and index
register, there will not be enough registers to allocate the tail call.

It would be better, IMO, to teach the register allocator to *unfold*
TCRETURNm* when it runs out of registers (or specifically check the
number of registers available during the TCRETURNr*) but I'm not going
to try and solve that for now. Instead, I've just blocked the forward
folding from r -> m, leaving LLVM free to unfold from m -> r as that
doesn't introduce new register pressure constraints.

The down side is that I don't have anything that will directly exercise
this. Instead, I will be immediately using this in my SLH patch. =/

Still worse, without allowing the TCRETURNr* -> TCRETURNm* fold, I don't
have any tests that demonstrate the failure to update the memory operand
register constraints. This patch still seems correct, but I'm nervous
about the degree of testing due to this.

Suggestions?

Reviewers: craig.topper

Subscribers: sanjoy, mcrosier, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D49717

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337845 91177308-0d34-0410-b5e6-96231b3b80d8

[Inliner] Teach inliner to merge 'min-legal-vector-width' function attribute

When we inline a function with a min-legal-vector-width attribute we need to make sure the caller also ends up with at least that vector width.

This patch is necessary to make always_inline functions like intrinsics propagate their min-legal-vector-width. Though nothing uses min-legal-vector-width yet.

A future patch will add heuristics to prevent inlining when the vector widths are mismatched. But that code would need to be in inline cost analysis, which is separate from the code added here.

Differential Revision: https://reviews.llvm.org/D49162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337844 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case to show failure to combine away negates that may be created by mul by constant expansion.

Mul by constant can expand to a sequence that ends with a negate. If the next instruction is an add or sub we might be able to fold the negate away.

We currently fail to do this because we explicitly don't add anything to the DAG combine worklist when we expand multiplies. This is primarily to keep the multiply from being reformed, but we should consider adding the users to the worklist.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337843 91177308-0d34-0410-b5e6-96231b3b80d8

[docker] Fix LLVM_EXTERNAL_PROJECTS cmake variable value

Summary:
LLVM_ENABLE_PROJECTS expects a semicolon separated project list.

Fixes PR38158.

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337842 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner][NFC] Make Candidates own their call information

Before this, TCI contained all the call information for each Candidate.

This moves that information onto the Candidates. As a result, each Candidate
can now supply how it ought to be called. Thus, Candidates will be able to,
say, call the same function in cheaper ways when possible. This also removes
that information from TCI, since it's no longer used there.

A follow-up patch for the AArch64 outliner will demonstrate this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337840 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner][NFC] Move missed opt remark into its own function

Having the missed remark code in the middle of `findCandidates` made the
function hard to follow. This yanks that out into a new function,
`emitNotOutliningCheaperRemark`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337839 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner][NFC] Sink some candidate logic into OutlinedFunction

Just some simple gardening to improve clarity.

Before, we had something along the lines of

1) Create a std::vector of Candidates
2) Create an OutlinedFunction
3) Create a std::vector of pointers to Candidates
4) Copy those over to the OutlinedFunction and the Candidate list

Now, OutlinedFunctions create the Candidate pointers. They're still copied
over to the main list of Candidates, but it makes it a bit clearer what's
going on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337838 91177308-0d34-0410-b5e6-96231b3b80d8

Use SCEV to avoid inserting some bounds checks.

This patch uses SCEV to avoid inserting some bounds checks when they are not needed. This slightly improves the performance of code compiled with the bounds check sanitizer.

Differential Revision: https://reviews.llvm.org/D49602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337830 91177308-0d34-0410-b5e6-96231b3b80d8

[PredicateInfo] Use custom mangling to support ssa_copy with unnamed types.

This is a workaround and it would be better to fix this generally, but
doing it generally is quite tricky. See D48541 and PR38117.

Doing it in PredicateInfo directly allows us to use the type address to
differentiate different unnamed types, because neither the created
declarations nor the ssa_copy calls should be visible after
PredicateInfo got destroyed.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D49126

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337828 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix local dynamic TLS with Sym64

For the final DTPREL addition, rather than a lui/daddiu/daddu triple,
LLVM was erroneously emitting a daddiu/daddiu pair, treating the %dtprel_hi
as if it were a %dtprel_lo, since Mips::Hi expands unshifted for Sym64.
Instead, use a new TlsHi node and, although unnecessary due to the exact
structure of the nodes emitted, use TlsHi for local exec too to prevent
future bugs. Also garbage-collect the unused TprelLo and TlsGd nodes,
and TprelHi since its functionality is provided by the new common TlsHi node.

Patch by James Clarke.

Differential revision: https://reviews.llvm.org/D49259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337827 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Extract the core register hardening logic to a low-level
helper and restructure the post-load hardening to use this.

This isn't as trivial as I would have liked because the post-load
hardening used a trick that only works for it where it swapped in
a temporary register to the load rather than replacing anything.
However, there is a simple way to do this without that trick that allows
this to easily reuse a friendly API for hardening a value in a register.
That API will in turn be usable in subsequent patches.

This also technically changes the position at which we insert the subreg
extraction for the predicate state, but that never resulted in an actual
instruction and so tests don't change at all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337825 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Tidy up a comment, using doxygen structure and wording it to
be more accurate and understandable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337822 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Disable ARMCodeGenPrepare by default

ARM Stage 2 builders have been suspiciously broken since the pass was
committed. Disabling to hopefully fix the bots and give me time to
debug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337821 91177308-0d34-0410-b5e6-96231b3b80d8

ADT: Shrink SmallVector size 0 to 16B on 64-bit platforms

SmallVectorTemplateCommon wants to know the address of the first element
so it can detect whether it's in "small size" mode.

The old implementation split the small array, creating the storage for
the first element in SmallVectorTemplateCommon, and pulling the rest
into SmallVectorStorage where we know the size of the array.  This
bloats SmallVector size 0 by the larger of sizeof(void*) and sizeof(T),
and we're not even using the storage.

The new implementation leaves the full small storage to
SmallVectorStorage.  To calculate the offset of the first element in
SmallVectorTemplateCommon, we just need to know how far to jump, which
we can calculate out-of-band.  One subtlety is that we need
SmallVectorStorage to be properly aligned even when the size is 0, to be
sure that (for large alignments) we actually have the padding and it's
well defined to do the pointer math.
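
One way to picture the out-of-band offset calculation is a layout-only helper struct that places a header-sized member before an element-sized member and asks for the offset of the latter. This is a sketch under assumptions: `VecHeader`, `HeaderThenElt` and `firstElOffset` are illustrative names, not the ADT's own types.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the size-independent part of the vector (hypothetical).
struct VecHeader {
  void *BeginX;
  unsigned Size;
  unsigned Capacity;
};

// Layout-only struct: never instantiated, used just to ask the compiler
// where a T would land if it were placed right after the header.
template <typename T> struct HeaderThenElt {
  alignas(VecHeader) char Header[sizeof(VecHeader)];
  alignas(T) char FirstEl[sizeof(T)];
};

// Byte offset from the start of the header to the first inline element.
template <typename T> constexpr std::size_t firstElOffset() {
  return offsetof(HeaderThenElt<T>, FirstEl);
}

// An over-aligned element type, to show that padding is accounted for.
struct alignas(32) OverAligned {
  char Bytes[32];
};

void checkFirstElOffset() {
  assert(firstElOffset<char>() == sizeof(VecHeader));
  assert(firstElOffset<OverAligned>() % alignof(OverAligned) == 0);
}
```

With this shape the header no longer needs to carry storage for a first element, which is the saving the commit describes for the size 0 case.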

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337820 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r334887: [SmallSet] Add SmallSetIterator.

Updated to make sure we properly construct/destroy SetIter if it has
non-trivial ctors/dtors, like in MSVC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337818 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[DebugInfo] Generate DWARF debug information for labels."

This reverts commit b454fa1b4079b6c0a5b1565982d16516385838d7.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337812 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Clean up and convert test to use generated CHECK lines.

This test was already checking microscopic behavior of tail call under
specific conditions. This just makes the CHECK lines much more
consistent, clear, and easily updated when intentional changes are made.

I've also switched the test to consistently name the entry block and to
order the helper declarations and comments for specific tests in the
more usual locations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337806 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Update the CHECK lines of this test to use the latest patterns
from the script. This minimizes the diff in subsequent changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337805 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Generate DWARF debug information for labels.

There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

Differential Revision: https://reviews.llvm.org/D45556

Patch by Hsiangkai Wang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337799 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize G_INSERT

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49601

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337798 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-xray: Broken chrome trace event format output

Summary:
A missing comma separator for EXIT and TAIL_EXIT RecordTypes caused invalid
JSON output for Chrome Trace Event Format.

Reviewers: dberris

Reviewed By: dberris

Subscribers: sammccall, kpw, llvm-commits

Differential Revision: https://reviews.llvm.org/D49687

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337795 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Remove unnecessary legality constraint for G_EXTRACT

Summary:
We were marking G_EXTRACT operations unsupported if the output type
was larger than the input type. I don't see how this could ever actually
happen, so I dropped the constraint. Doing this makes it possible to
reuse the same legality code for G_INSERT.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337794 91177308-0d34-0410-b5e6-96231b3b80d8

Add PerfJITEventListener for perf profiling support.

This new JIT event listener supports generating profiling data for
the linux 'perf' profiling tool, allowing it to generate function and
instruction level profiles.

Currently this functionality is not enabled by default, but must be
enabled with LLVM_USE_PERF=yes.  Given that the listener has no
dependencies, it might be sensible to enable by default once the
initial issues have been shaken out.

I followed existing precedent in registering the listener by default
in lli. Should there be a decision to enable this by default on linux,
that should probably be changed.

Please note that until https://reviews.llvm.org/D47343 is resolved,
using this functionality with mcjit rather than orcjit will not
reliably work.

Disregarding the previous comment, here's an example:

$ cat /tmp/expensive_loop.c

#include <stdbool.h>
#include <stdint.h>

bool stupid_isprime(uint64_t num)
{
        if (num == 2)
                return true;
        if (num < 1 || num % 2 == 0)
                return false;
        for(uint64_t i = 3; i < num / 2; i+= 2) {
                if (num % i == 0)
                        return false;
        }
        return true;
}

int main(int argc, char **argv)
{
        int numprimes = 0;

        for (uint64_t num = argc; num < 100000; num++)
        {
                if (stupid_isprime(num))
                        numprimes++;
        }

        return numprimes;
}

$ clang -ggdb -S -c -emit-llvm /tmp/expensive_loop.c -o /tmp/expensive_loop.ll

$ perf record -o perf.data -g -k 1 ./bin/lli -jit-kind=mcjit /tmp/expensive_loop.ll 1

$ perf inject --jit -i perf.data -o perf.jit.data

$ perf report -i perf.jit.data
-   92.59%  lli      jitted-5881-2.so                   [.] stupid_isprime
     stupid_isprime
     main
     llvm::MCJIT::runFunction
     llvm::ExecutionEngine::runFunctionAsMain
     main
     __libc_start_main
     0x4bf6258d4c544155
+    0.85%  lli      ld-2.27.so                         [.] do_lookup_x

And line-level annotations also work:
       │              for(uint64_t i = 3; i < num / 2; i+= 2) {
       │1 30:   movq   $0x3,-0x18(%rbp)
  0.03 │1 38:   mov    -0x18(%rbp),%rax
  0.03 │        mov    -0x10(%rbp),%rcx
       │        shr    $0x1,%rcx
  3.63 │     ┌──cmp    %rcx,%rax
       │     ├──jae    6f
       │     │                if (num % i == 0)
  0.03 │     │  mov    -0x10(%rbp),%rax
       │     │  xor    %edx,%edx
 89.00 │     │  divq   -0x18(%rbp)
       │     │  cmp    $0x0,%rdx
  0.22 │     │↓ jne    5f
       │     │                        return false;
       │     │  movb   $0x0,-0x1(%rbp)
       │     │↓ jmp    73
       │     │        }
  3.22 │1 5f:│↓ jmp    61
       │     │        for(uint64_t i = 3; i < num / 2; i+= 2) {

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D44892

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337789 91177308-0d34-0410-b5e6-96231b3b80d8

[Debugify] Export per-pass debug info loss statistics

Add a -debugify-export option to opt. This exports per-pass `debugify`
loss statistics to a file in CSV format.

For some interesting numbers on debug value loss during an -O2 build
of the sqlite3 amalgamation, see the review thread.

Differential Revision: https://reviews.llvm.org/D49003

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337787 91177308-0d34-0410-b5e6-96231b3b80d8

[Debugify] Move interface definitions to a header, NFC

This is a minor cleanup in preparation for a change to export DI
statistics from -check-debugify. To do that, it would be cleaner to have
a dedicated header for the debugify interface.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337786 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Simplify the code for hardening a loaded value. NFC.

This is in preparation for extracting this into a re-usable utility in
this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337785 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Remove complex SHRX-based post-load hardening.

This code was really nasty, had several bugs in it originally, and
wasn't carrying its weight. While on Zen we have all 4 ports available
for SHRX, on all of the Intel parts with Agner's tables, SHRX can only
execute on 2 ports, giving it 1/2 the throughput of OR.

Worse, all too often this pattern required two SHRX instructions in
a chain, hurting the critical path by a lot.

Even if we end up needing to save/restore EFLAGS, that is no longer so
bad. We pay for a uop to save the flag, but we very likely get fusion
when it is used by forming a test/jCC pair or something similar. In
practice, I don't expect the SHRX to be a significant savings here, so
I'd like to avoid the complex code required. We can always resurrect
this if/when someone has a specific performance issue addressed by it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337781 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Use deque in place of SmallVector to fix use-after-free issue

Summary: SmallVector's elements are moved when the vector resizes, which invalidates references to them and causes a use-after-free.
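
The property being relied on is that std::deque does not relocate existing elements when it grows at either end. A small standalone check follows (the function name is made up for this sketch); it does not reproduce the DWARF code path.

```cpp
#include <cassert>
#include <deque>

// Takes the address of the first element, grows the deque, and reports
// whether that address still refers to the same live element.
bool dequeKeepsElementAddresses(int extraElements) {
  std::deque<int> d;
  d.push_back(42);
  const int *first = &d.front();
  for (int i = 0; i < extraElements; ++i)
    d.push_back(i);  // push_back never invalidates references into a deque
  return first == &d.front() && *first == 42;
}

void checkDequeStability() {
  assert(dequeKeepsElementAddresses(100000));
}
```

Doing the same with a contiguous vector would leave `first` dangling as soon as a reallocation happens, which is the failure mode described above.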

Reviewers: probinson, dblaikie

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D49702

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337772 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in test/CodeGen/Mips/dins.ll

Differential Revision: https://reviews.llvm.org/D49704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337771 91177308-0d34-0410-b5e6-96231b3b80d8

Embed a template specialization in a namespace to work around a gcc bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337770 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF v5] Refactor range lists dumping by using a more generic way of handling tables of lists.
The intent is to use it for location list tables as well. Change is almost NFC with the exception
of the spelling of some strings used during dumping (all lowercase now).

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D49500

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337763 91177308-0d34-0410-b5e6-96231b3b80d8

[LTO] Handle __imp_ (dllimport) symbols consistently with lld

Summary:
Similar to what lld already does for dllimport symbols which are
prefaced with __imp_ (see lld patch r240620), strip off the __imp_
prefix in LTO. Otherwise we can get 2 separate GlobalResolution for
a single symbol, the dllimport declaration, and the definition, which
leads to incorrect LTO handling.
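
The prefix handling amounts to something like the following C++ sketch. `stripImpPrefix` is a hypothetical helper written for illustration, not the function used in LTO or lld.

```cpp
#include <cassert>
#include <string>

// Returns the symbol name with a leading "__imp_" removed, if present,
// so the dllimport reference and the definition resolve to one name.
std::string stripImpPrefix(const std::string &name) {
  const std::string prefix = "__imp_";
  if (name.compare(0, prefix.size(), prefix) == 0)
    return name.substr(prefix.size());
  return name;
}

void checkStripImpPrefix() {
  assert(stripImpPrefix("__imp_foo") == "foo");
  assert(stripImpPrefix("foo") == "foo");
}
```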

Fixes PR38105.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337762 91177308-0d34-0410-b5e6-96231b3b80d8

[demangler] call terminate() if allocation failed

We really should set *status to memory_alloc_failure, but we need to refactor
the demangler a bit to properly propagate the failure up the stack. Until then,
it's better to explicitly terminate than rely on a null dereference crash.
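
The interim behavior can be pictured as an allocation wrapper that fails loudly. This is only a sketch: `allocateOrTerminate` is an invented name and the demangler's real allocator is structured differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>

// Allocates at least one byte, or calls std::terminate() on failure
// instead of handing a null pointer back to the caller.
void *allocateOrTerminate(std::size_t bytes) {
  void *p = std::malloc(bytes == 0 ? 1 : bytes);
  if (p == nullptr)
    std::terminate();  // explicit, rather than a later null dereference
  return p;
}

void checkAllocateOrTerminate() {
  void *p = allocateOrTerminate(64);
  assert(p != nullptr);
  std::free(p);
}
```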

rdar://31240372

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337759 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Add a separate flag for skipping comdat constant sections for MinGW. NFC.

This actually has nothing to do with the associative comdat sections
that aren't supported by GNU binutils ld.

Clarify the comments from SVN r335918 and use a separate flag for it.

Differential Revision: https://reviews.llvm.org/D49645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337757 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Fix assembly output of comdat sections without an attached symbol

Since SVN r335286, the .xdata sections are produced without an attached
symbol, which requires using a different syntax when printing assembly
output.

Instead of the usual syntax of '.section <name>,"dr",discard,<symbol>',
use '.section <name>,"dr"' + '.linkonce discard' (which is what GCC
uses for all assembly output).

This fixes PR38254.

Differential Revision: https://reviews.llvm.org/D49651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337756 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Use MCAsmInfoMicrosoft and MCAsmInfoGNUCOFF as base classes

This matches the structure used on X86 and ARM. This requires
a little bit of duplication of the parts that are equal in both
AArch64 COFF variants though.

Before SVN r335286, these classes didn't add anything that MCAsmInfoCOFF
didn't, but now they do.

This makes AArch64 match X86 in how comdat is used for float constants
for MinGW.

Differential Revision: https://reviews.llvm.org/D49637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337755 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Fix the llvm::Optional data formatter

The llvm::Optional data formatter needs to look through the `Storage`
container if it's present.

Before:

220 if (Op && Op->getOp() != dwarf::DW_OP_LLVM_fragment)
-> 221 HasComplexExpression = true;
222
223 // If the register can only be described by a complex expression (i.e.,
224 // multiple subregisters) it doesn't safely compose with another complex
Target 0: (llc) stopped.
(lldb) p Op
(llvm::Optional<llvm::DIExpression::ExprOperand>) $0 = None

After:

(lldb) p Op
(llvm::Optional<llvm::DIExpression::ExprOperand>) $0 =
(llvm::DIExpression::ExprOperand) storage = {
Op = 0x000000010603d460
}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337752 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Reduce DanglingDebugInfo memory traffic, NFC

This avoids approx. 2 x 10^5 DenseMap insertions in both non-debug and
debug -O2 builds of the sqlite3 amalgamation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337751 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Ensure the TargetLibraryInfo is constructed early enough

Summary:
Without this change, the WholeProgramDevirt pass, which requires the
TargetLibraryInfo, will construct one from the default triple.

Fixes PR38139.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49278

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337750 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugCounters] Keep track of total counts

This patch makes debug counters keep track of the total number of times
we've called `shouldExecute` for each counter, so it's easier to build
automated tooling on top of these.
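
The idea can be sketched as follows. This is a minimal illustration, not
LLVM's DebugCounter implementation: the `Counter` struct, its field names
and the skip/stop-after semantics are assumptions made for the example.
The point is only that the total is bumped on every query, whether or not
the query returns true.

```cpp
// Hypothetical counter state (names are illustrative, not LLVM's).
struct Counter {
  long Count = 0;      // total number of shouldExecute() queries so far
  long Skip = 0;       // number of initial queries answered with "no"
  long StopAfter = -1; // queries answered "yes" after skipping; -1 = no limit
};

// Records the query in the running total, then decides whether to execute.
bool shouldExecute(Counter &C) {
  long Index = C.Count++; // always counted, even when the answer is false
  if (Index < C.Skip)
    return false;
  if (C.StopAfter >= 0 && Index >= C.Skip + C.StopAfter)
    return false;
  return true;
}
```

Tooling can read `Count` after a run to learn how many queries happened,
for example to pick bisection bounds.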

A patch to print these counts is coming soon.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D49560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337748 91177308-0d34-0410-b5e6-96231b3b80d8

[gdb] Fix SmallVector pretty printer after r337514

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337747 91177308-0d34-0410-b5e6-96231b3b80d8

ConstantFolding: Avoid a crash.

Summary:
Check if the parent basic block and caller exist
before calling CS.getCaller when constant folding
the strip.invariant.group intrinsic.

This avoids a crash when the function containing the intrinsic
is being inlined. The instruction is checked for any simplification
but has not yet been added to a basic block.

Reviewers: Prazek, rsmith, efriedma

Reviewed By: efriedma

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D49690

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337742 91177308-0d34-0410-b5e6-96231b3b80d8

Re-land r335297 "[X86] Implement more of x86-64 large and medium PIC code models"

Don't try to generate large PIC code for non-ELF targets. Neither COFF
nor MachO have relocations for large position independent code, and
users have been using "large PIC" code models to JIT 64-bit code for a
while now. With this change, if they are generating ELF code, their
JITed code will truly be PIC, but if they target MachO or COFF, it will
contain 64-bit immediates that directly reference external symbols. For
a JIT, that's perfectly fine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337740 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][docs] Define IPC where it is first mentioned. NFC.

Expand the abbreviation where it is first used, and use IPC elsewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337739 91177308-0d34-0410-b5e6-96231b3b80d8

Fix RegScavenger::unprocess

RegScavenger::unprocess walks backward, so it should undo the effects
of defs before undoing effects of kills. Previously it did things in
the opposite order, leaving a register apparently unused (dead) in the
case where an instruction both used (killed) and defined a register.
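
A toy model of the ordering issue, assuming liveness is tracked as a plain
set of register numbers (the real RegScavenger is more involved; `Instr`,
`process`, `unprocess` and `unprocessWrongOrder` here are illustrative
names only):

```cpp
#include <set>
#include <vector>

using Reg = unsigned;

struct Instr {
  std::vector<Reg> Kills; // registers whose last use is this instruction
  std::vector<Reg> Defs;  // registers written by this instruction
};

// Forward step: operands are read (and killed) before results are written.
void process(std::set<Reg> &Live, const Instr &I) {
  for (Reg R : I.Kills) Live.erase(R);
  for (Reg R : I.Defs) Live.insert(R);
}

// Backward step: reverse the forward order, so undo defs first, then kills.
void unprocess(std::set<Reg> &Live, const Instr &I) {
  for (Reg R : I.Defs) Live.erase(R);
  for (Reg R : I.Kills) Live.insert(R);
}

// The ordering described as buggy above: kills undone before defs.
void unprocessWrongOrder(std::set<Reg> &Live, const Instr &I) {
  for (Reg R : I.Kills) Live.insert(R);
  for (Reg R : I.Defs) Live.erase(R);
}
```

For an instruction that both kills and defines register 1, `unprocess`
restores the register as live before the instruction, while
`unprocessWrongOrder` leaves it marked dead.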

Differential Revision: https://reviews.llvm.org/D42200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337735 91177308-0d34-0410-b5e6-96231b3b80d8

Add inline asm aliasing test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337734 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[docs] Add support for Markdown documentation in Sphinx"

Looks like this bot hasn't been updated yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337731 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Add support for Markdown documentation in Sphinx

Differential Revision: https://reviews.llvm.org/D44910

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337730 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] Add default sh_entsize for dynamic sections

The dynamic section holds a table, so sh_entsize may be set. As the
dynamic section entry size never changes, we can default it to the size
of a dynamic entry.
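
A rough sketch of the defaulting rule, assuming the usual ELF layout of a
dynamic entry (a tag followed by a value, 8 bytes on ELF32 and 16 bytes on
ELF64). The `Dyn32`/`Dyn64` structs and `dynamicEntSize` are hypothetical
stand-ins, not yaml2obj code:

```cpp
#include <cstdint>

// Simplified dynamic entry layouts: a tag followed by a value/pointer.
struct Dyn32 { std::int32_t Tag; std::uint32_t Val; };
struct Dyn64 { std::int64_t Tag; std::uint64_t Val; };

static_assert(sizeof(Dyn32) == 8, "ELF32 dynamic entry is 8 bytes");
static_assert(sizeof(Dyn64) == 16, "ELF64 dynamic entry is 16 bytes");

// Uses the explicitly requested entry size when one is given, otherwise
// falls back to the size of one dynamic entry for the ELF class.
template <typename DynT>
std::uint64_t dynamicEntSize(const std::uint64_t *Explicit) {
  return Explicit ? *Explicit : sizeof(DynT);
}
```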

Differential Revision: https://reviews.llvm.org/D49619

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337725 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Handle unnamed globals in HexagonConstExpr

Instead of comparing names, compare positions in the parent module.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337723 91177308-0d34-0410-b5e6-96231b3b80d8

[Demangle] Attempt to fix arena memory leak

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337720 91177308-0d34-0410-b5e6-96231b3b80d8

Fixing a typo; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337719 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Move the shtest-xunit-output check lines into shtest-format

These two tests are operating on the same test suite, which causes
them to be racy about writing temporary files and can cause spurious
failures. Merge them into one test to avoid the issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337718 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Use unique_ptr to fix memory leak introduced in r337701

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337714 91177308-0d34-0410-b5e6-96231b3b80d8

OpChain has subclasses, so add a virtual destructor.

Summary:
OpChain has subclasses, so add a virtual destructor.

This fixes an issue when deleting subclasses of OpChain (see MatchSMLAD() specifically) in r337701.
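
For illustration, a stripped-down version of the pattern. The two structs
below are simplified stand-ins that borrow the names OpChain and BinOpChain,
and `destroyThroughBase` is a hypothetical helper. Without the virtual
destructor, deleting the derived object through the base pointer would be
undefined behaviour:

```cpp
#include <memory>

// Simplified stand-in for the base class.
struct OpChain {
  virtual ~OpChain() = default; // makes deletion via OpChain* well defined
};

// Simplified stand-in for a subclass; counts how often its destructor runs.
struct BinOpChain : OpChain {
  explicit BinOpChain(int &Counter) : Destroyed(Counter) {}
  ~BinOpChain() override { ++Destroyed; }
  int &Destroyed;
};

// Destroys a BinOpChain through a base-class pointer and reports how many
// derived destructors ran.
int destroyThroughBase() {
  int Destroyed = 0;
  {
    std::unique_ptr<OpChain> Chain = std::make_unique<BinOpChain>(Destroyed);
  } // Chain goes out of scope here and deletes through OpChain*
  return Destroyed;
}
```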

Reviewers: javed.absar

Subscribers: llvm-commits, SjoerdMeijer, samparker

Differential Revision: https://reviews.llvm.org/D49681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337713 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Follow-up to r337709.

Fix double-free.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337711 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add doFinalization() to ARMCodeGenPrepare pass.

Attempt to fix the leak introduced in r337687 and make sanitizer
buildbots green again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337709 91177308-0d34-0410-b5e6-96231b3b80d8

[Legalize] Elide MERGE_VALUES created by scalarizeVectorLoad.

scalarizeVectorLoad creates MERGE_VALUES nodes which are immediately
decomposed in expandLoad. Elide the node in these cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337708 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add more checks to the tls.ll test case. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337705 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM][NFC] ParallelDSP reorganisation

In preparing to allow ARMParallelDSP pass to parallelise more than
smlads, I've restructured some elements:

- The ParallelMAC struct has been renamed to BinOpChain.
- The BinOpChain struct holds two value lists: LHS and RHS, as well
  as inheriting from the OpChain base class.
- The OpChain struct holds all the values of the represented chain
  and has had the memory locations functionality inserted into it.
- ParallelMACList becomes OpChainList and it now holds pointers
  instead of objects.

Differential Revision: https://reviews.llvm.org/D49020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337701 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Fix dumpSU() method in SystemZHazardRecognizer.

Two minor issues: The new MCD SchedWrite name does not contain "Unit" like
all the others, so a check is needed. Also, print "LSU" instead of "LS".

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337700 91177308-0d34-0410-b5e6-96231b3b80d8

[FPEnv] Legalize double width StrictFP vector operations

Differential Revision: https://reviews.llvm.org/D48809

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337698 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Fix LLVM_YAML_IS_DOCUMENT_LIST_VECTOR

The docs incorrectly said to repeat std::vector inside
LLVM_YAML_IS_DOCUMENT_LIST_VECTOR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337695 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ARMCodeGenPrepare backend pass

Arm specific codegen prepare is implemented to perform type promotion
on icmp operands, which can enable the removal of uxtb and uxth
(unsigned extend) instructions. This is possible because performing
type promotion before ISel alleviates this duty from the DAG builder
which has to perform legalisation, but has a limited view on data
ranges.

The pass visits any instruction operand of an icmp and creates a
worklist to traverse the use-def tree to determine whether the values
can simply be promoted. Our concern is values in the registers
overflowing the narrow (i8, i16) data range, so instructions marked
with nuw can be promoted easily. For add and sub instructions, we are
able to use the parallel dsp instructions to operate on scalar data
types and avoid overflowing bits. Underflowing adds and subs are also
permitted when the result is only used by an unsigned icmp.

Differential Revision: https://reviews.llvm.org/D48832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337687 91177308-0d34-0410-b5e6-96231b3b80d8

[GVN] Don't use the eliminated load as an available value in phi construction

In ConstructSSAForLoadSet if an available value is actually the load that we're
doing SSA construction to eliminate, then we can omit it as SSAUpdate will add
in the value for the phi that will be replacing it anyway. This can result in
simpler IR which can allow further optimisation.

Differential Revision: https://reviews.llvm.org/D44160

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337686 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSAUpdater] Update Phi operands after trivial Phi elimination

Bug fix for PR37445. The underlying problem and its fix are similar to PR37808.
The bug lies in MemorySSAUpdater::getPreviousDefRecursive(), where PhiOps is
computed before the call to tryRemoveTrivialPhi() and it ends up being out of
date, pointing to stale data. We have now turned each of the PhiOps into a
TrackingVH<MemoryAccess>.

Differential Revision: https://reviews.llvm.org/D49425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337680 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Add a UniqueStringSaver: like StringSaver, but deduplicating.

Summary: Clarify contract of StringSaver (it null-terminates, callers rely on it).
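
A minimal sketch of what a deduplicating saver could look like. This is not
the LLVM class: `TinyUniqueSaver` is a hypothetical miniature built on
std::unordered_set rather than on LLVM's allocator-backed storage.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical miniature of a deduplicating string saver.
class TinyUniqueSaver {
  // Node-based container: stored strings keep their address across inserts.
  std::unordered_set<std::string> Pool;

public:
  // Returns a null-terminated copy owned by the saver; equal inputs share
  // one stored copy, so they yield the same pointer.
  const char *save(const std::string &S) {
    return Pool.insert(S).first->c_str();
  }

  // Number of distinct strings stored so far.
  std::size_t size() const { return Pool.size(); }
};
```

Saving the same contents twice returns the same pointer, which is the
property that distinguishes this from a plain copying saver.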

Reviewers: hokein

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49596

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337677 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][MCA] ZnVer1: Update RegisterFile to identify false dependencies on partially written registers.

Summary:
Pretty mechanical follow-up for D49196.

As microarchitecture.pdf notes, "20 AMD Ryzen pipeline",
"20.8 Register renaming and out-of-order schedulers":
  The integer register file has 168 physical registers of 64 bits each.
  The floating point register file has 160 registers of 128 bits each.
"20.14 Partial register access":
  The processor always keeps the different parts of an integer register together.
  ...
  An instruction that writes to part of a register will therefore have a false dependence
  on any previous write to the same register or any part of it.

Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh

Reviewed By: GGanesh

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D49393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337676 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][MCA] ZnVer1: add partial-reg-update tests

Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh

Reviewed By: GGanesh

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D49392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337675 91177308-0d34-0410-b5e6-96231b3b80d8

[GVNHoist] safeToHoistLdSt allows illegal hoisting

Bug fix for PR36787. When reasoning if it's safe to hoist a load we
want to make sure that the defining memory access dominates the new
insertion point of the hoisted instruction. safeToHoistLdSt calls
firstInBB(InsertionPoint,DefiningAccess) which returns false if
InsertionPoint == DefiningAccess, and therefore it falsely thinks
it's safe to hoist.
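
The edge case can be shown with a toy model in which instructions of one
basic block are identified by their position. The functions below are
illustrative assumptions shaped after the description above, not the actual
GVNHoist code:

```cpp
// Strict "comes first" test: false when both arguments are the same
// instruction, which is the behaviour the description relies on.
bool firstInBB(int A, int B) { return A < B; }

// Check shaped like the buggy one: hoisting is rejected only when the
// insertion point strictly precedes the defining access.
bool safeToHoistBuggy(int InsertionPoint, int DefiningAccess) {
  return !firstInBB(InsertionPoint, DefiningAccess);
}

// Check that also rejects the case where the two coincide, so the defining
// access must come strictly before the insertion point.
bool safeToHoistFixed(int InsertionPoint, int DefiningAccess) {
  return firstInBB(DefiningAccess, InsertionPoint);
}
```

With both positions equal, `safeToHoistBuggy` returns true while
`safeToHoistFixed` returns false.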

Differential Revision: https://reviews.llvm.org/D49555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337674 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Fix a bug where we would harden tail calls twice -- once as
a call, and then again as a return.

Also added a comment to try and explain better why we would be doing
what we're doing when hardening the (non-call) returns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337673 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Add a test covering indirect forms of control flow. NFC.

This specifically covers different ways of making indirect calls and
jumps. There are some bugs in SLH that I will be fixing in subsequent
patches where the diff in the generated instructions makes the bug fix
much more clear, so just checking in a baseline of this test to start.

I'm also going to be adding direct mitigation for variant 1.2 which this
file very specifically tests in the various forms it can arise on x86.
Again, the diff to the generated instructions should make the change for
that much more clear, so having the test as a baseline seems useful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337672 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/SLH] Rename and comment the main hardening function. NFC.

This provides an overview of the algorithm used to harden specific
loads. It also brings our terminology further in line with
hardening rather than checking.

Differential Revision: https://reviews.llvm.org/D49583

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337667 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit, fix a minor typo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337657 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove the max vector width restriction from combineLoopMAddPattern and rely on splitOpsAndApply to handle splitting.

This seems to be a net improvement. There's still an issue under avx512f where we have a 512-bit vpaddd, but not vpmaddwd, so we end up doing two 256-bit vpmaddwds and inserting the results before a 512-bit vpaddd. It might be better to do two 512-bit paddds with zeros in the upper half. Same number of instructions, but it breaks a dependency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337656 91177308-0d34-0410-b5e6-96231b3b80d8

[ORE] Move loop invariant ORE checks outside the PM loop.

Summary:
This takes 22ms out of ~20s compiling sqlite3.c because we call it
for every unit of compilation and every pass.

Reviewers: paquette, anemet

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D49586

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337654 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAGBuilder] Use APInt::isZero instead of comparing APInt::getZExtValue to 0 in a place where we can't be sure contents of the APInt fit in a uint64_t.

This is used on an extract vector element index, which in most cases is going to be an i32 or i64, and the element will be a valid element number. But it is possible to construct IR with a larger type and a large out-of-range value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337652 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAGBuilder] Restrict vector reduction check to types with a power of 2 number of elements.

The check for the shuffle usages probably isn't correct for non power of 2 vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337651 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add more MADD recurrence test cases with larger and narrower vector widths.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337650 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][docs] Add documentation for the statistic outputs from mca. NFC

Summary: The original text was lifted from the MCA README. I re-ran the dot-product example and updated the output seen in the docs. I also added a few paragraphs discussing the instruction issued and retired histograms, as well as discussing the register file stats.

Reviewers: andreadb, RKSimon, courbet, gbedwell, filcab

Reviewed By: andreadb

Subscribers: tschuett

Differential Revision: https://reviews.llvm.org/D49614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337648 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Factor out register class selection for global base register. NFC

Factor out register class selection for global base register into a
separate function to avoid a long chain of ternary operators.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337647 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Move out the WrapperPat declaration from the NotInMicroMips predicate

This is a follow-up to rL335185. That commit added some WrapperPat
patterns for the microMIPS target. But the declaration of the WrapperPat
class is under the NotInMicroMips predicate, so the microMIPS patterns
cannot be selected because the predicate (Subtarget->inMicroMipsMode()) &&
(!Subtarget->inMicroMipsMode()) is always false.

This change moves the WrapperPat class declaration out from under the
NotInMicroMips predicate and enables the microMIPS WrapperPat patterns.

Differential revision: https://reviews.llvm.org/D49533

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337646 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-undname] Flush output before demangling.

If an error occurs and we write it to stderr, it could appear
before we wrote the mangled name which we're undecorating.
By flushing stdout first, we ensure that the messages are always
sequenced in the correct order.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337645 91177308-0d34-0410-b5e6-96231b3b80d8