git.osdn.net Git - android-x86/external-llvm.git/log

[globalisel][irtranslator] Verify that DILocations aren't lost in translation

Summary:
Also fix a couple of bugs where DILocations are lost. EntryBuilder wasn't passing
on debug locations for PHIs, constants, GLOBAL_VALUE, etc.

Reviewers: aprantl, vsk, bogner, aditya_nandakumar, volkan, rtereshin, aemerson

Reviewed By: aemerson

Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53740

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345743 91177308-0d34-0410-b5e6-96231b3b80d8

MachineModuleInfo: Initialize DbgInfoAvailable depending on debug_cus existing

Before this patch DbgInfoAvailable was set to true in
DwarfDebug::beginModule() or CodeViewDebug::CodeViewDebug(). This made
MIR testing weird since passes would suddenly stop dealing with debug
info just because we stopped the pipeline before the debug printers.

This patch changes the logic to initialize DbgInfoAvailable based on
whether debug_compile_units exist in the LLVM Module. The debug
printers may then override it with false if debug printing is
disabled.

Differential Revision: https://reviews.llvm.org/D53885

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345740 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] refactor fabs+fcmp fold; NFC

Also, remove/replace/minimize/enhance the tests for this fold.
The code drops FMF, so it needs more tests and at least 1 fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345734 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Make sure not to use GP-relative addressing with PIC

Make sure that -relocation-model=pic prevents use of GP-relative
addressing modes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345731 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Remove namespace prefixes made redundant by r345612. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345730 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] fold 'fcmp nnan ult X, 0.0' when X is not negative

This is the inverted case for the transform added with D53874 / rL345725.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345728 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add assertion that InstSimplify has folded a fabs+fcmp; NFC

The 'OLT' case was updated at rL266175, so I assume it was just an
oversight that 'UGE' was not included because that patch handled
both predicates in InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345727 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negative

This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086
...but given the current implementation (no FMF on casts), this is likely the only way to predicate the
transform.
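A minimal C++ sketch of why the fold is sound, assuming X comes from an
unsigned-to-float conversion (one way a value can be known not to be negative).
The helper names are illustrative and not taken from the patch. With NaN ruled
out, 'oge X, 0.0' is always true and the inverted 'ult X, 0.0' is always false;
the NaN checks show the case that nnan has to exclude for other non-negative
values.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// 'fcmp oge X, 0.0': true only if X is not NaN and X >= 0.0.
bool fcmpOgeZero(float X) { return !std::isnan(X) && X >= 0.0f; }

// 'fcmp ult X, 0.0': true if X is NaN or X < 0.0 (the inverse of oge).
bool fcmpUltZero(float X) { return std::isnan(X) || X < 0.0f; }

// Models uitofp: the result is never negative and never NaN.
float fromUnsigned(std::uint32_t U) { return static_cast<float>(U); }

// A NaN input, the case that the nnan flag rules out.
float quietNaN() { return std::numeric_limits<float>::quiet_NaN(); }
```

For any unsigned input, fcmpOgeZero(fromUnsigned(U)) holds and
fcmpUltZero(fromUnsigned(U)) does not; for quietNaN() the results flip.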

This is part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

Differential Revision: https://reviews.llvm.org/D53874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345725 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll

Unlike its legacy counterpart, the new pass manager's LoopUnrollPass does
not provide any means to select which flavors of unrolling to run
(runtime, peeling, partial); it relies on global defaults.

In some cases we need the ability to run a restricted LoopUnroll that
does more than LoopFullUnroll.

Introduced LoopUnrollOptions to select optional unroll behaviors.
Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.

Reviewers: chandlerc, tejohnson
Differential Revision: https://reviews.llvm.org/D53440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345723 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for fcmp and known positive; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345722 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Fold 0 div/rem X to 0
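
The identity behind this fold can be sanity-checked outside the compiler. A
small sketch, assuming 8-bit operands and skipping X == 0, where the division
is undefined anyway (the function name is illustrative):

```cpp
#include <cstdint>

// Checks 0 / X == 0 and 0 % X == 0 for every nonzero 8-bit divisor,
// interpreted both as signed and as unsigned.
bool zeroDividendFoldsToZero() {
  for (int V = -128; V <= 127; ++V) {
    if (V == 0)
      continue; // division by zero is undefined, so it places no constraint
    const std::int8_t S = static_cast<std::int8_t>(V);
    const std::uint8_t U = static_cast<std::uint8_t>(V);
    if (std::int8_t{0} / S != 0 || std::int8_t{0} % S != 0)
      return false;
    if (std::uint8_t{0} / U != 0 || std::uint8_t{0} % U != 0)
      return false;
  }
  return true;
}
```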

Reviewers: RKSimon, spatel, javed.absar, craig.topper, t.p.northover

Reviewed By: RKSimon

Subscribers: craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D52504

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345721 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Rewrite SILowerI1Copies to always stay on SALU

Summary:
Instead of writing boolean values temporarily into 32-bit VGPRs
if they are involved in PHIs or are observed from outside a loop,
we use bitwise masking operations to combine lane masks in a way
that is consistent with wave control flow.

Move SIFixSGPRCopies to before this pass, since that pass
incorrectly attempts to move SGPR phis to VGPRs.

This should recover most of the code quality that was lost with
the bug fix in "AMDGPU: Remove PHI loop condition optimization".

There are still some relevant cases where code quality could be
improved, in particular:

- We often introduce redundant masks with EXEC. Ideally, we'd
  have a generic computeKnownBits-like analysis to determine
  whether masks are already masked by EXEC, so we can avoid this
  masking both here and when lowering uniform control flow.

- The criterion we use to determine whether a def is observed
  from outside a loop is conservative: it doesn't check whether
  (loop) branch conditions are uniform.

Change-Id: Ibabdb373a7510e426b90deef00f5e16c5d56e64b

Reviewers: arsenm, rampitec, tpr

Subscribers: kzhuravl, jvesely, wdng, mgorny, yaxunl, dstuttard, t-tye, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53496

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345719 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove PHI loop condition optimization

Summary:
The optimization to early break out of loops if all threads are dead was
never fully implemented.

But the PHI node analysis is actually causing a number of problems, so
remove all the extra code for it.

(This does actually regress code quality in a few places because it
ends up relying more heavily on phis of i1, which we don't do a
great job with. However, since it fixes real bugs in the wild, we
should take this change. I have some prototype changes to improve
i1 lowering in general -- not just for control flow -- which should
help recover the code quality, I just need to make those changes
fit for general consumption. -- Nicolai)

Change-Id: I6fc6c6c8961857ac6009fcfb9f7e5e48dc23fbb1
Patch-by: Christian König <christian.koenig@amd.com>
Reviewers: arsenm, rampitec, tpr

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D53359

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345718 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] fold icmp based on range of abs/nabs

This is a fix for PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

We managed to get some of these patterns using computeKnownBits in D47041, but that
can't be used for nabs(). Instead, put in some range-based logic, so we can fold
both abs/nabs with icmp with a constant value.

Alive proofs:
https://rise4fun.com/Alive/21r

Name: abs_nsw_is_positive
  %cmp = icmp slt i32 %x, 0
  %negx = sub nsw i32 0, %x
  %abs = select i1 %cmp, i32 %negx, i32 %x
  %r = icmp sgt i32 %abs, -1
    =>
  %r = i1 true

Name: abs_nsw_is_not_negative
  %cmp = icmp slt i32 %x, 0
  %negx = sub nsw i32 0, %x
  %abs = select i1 %cmp, i32 %negx, i32 %x
  %r = icmp slt i32 %abs, 0
    =>
  %r = i1 false

Name: nabs_is_negative_or_0
  %cmp = icmp slt i32 %x, 0
  %negx = sub i32 0, %x
  %nabs = select i1 %cmp, i32 %x, i32 %negx
  %r = icmp slt i32 %nabs, 1
    =>
  %r = i1 true

Name: nabs_is_not_over_0
  %cmp = icmp slt i32 %x, 0
  %negx = sub i32 0, %x
  %nabs = select i1 %cmp, i32 %x, i32 %negx
  %r = icmp sgt i32 %nabs, 0
    =>
  %r = i1 false
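
The same range facts can be checked exhaustively on a narrower type. The sketch
below is not part of the patch: it assumes i8 instead of i32 and models the
wrapping 'sub' by hand, with illustrative function names.

```cpp
#include <cstdint>

// Two's-complement negation of an i8 value, computed in int so there is no
// signed-overflow UB; -(-128) wraps back to -128 like 'sub i8 0, %x'.
std::int8_t wrappingNeg(std::int8_t X) {
  const int N = -static_cast<int>(X);
  return static_cast<std::int8_t>(N == 128 ? -128 : N);
}

// abs(x)  = x < 0 ? -x : x
std::int8_t absWrap(std::int8_t X) { return X < 0 ? wrappingNeg(X) : X; }

// nabs(x) = x < 0 ? x : -x
std::int8_t nabs(std::int8_t X) { return X < 0 ? X : wrappingNeg(X); }

// nabs is never positive, so 'slt nabs, 1' is true and 'sgt nabs, 0' is false.
bool nabsIsNeverPositive() {
  for (int V = -128; V <= 127; ++V)
    if (nabs(static_cast<std::int8_t>(V)) > 0)
      return false;
  return true;
}

// With 'sub nsw' the minimum value yields poison, so it is excluded here;
// for every other input abs is never negative.
bool absNswIsNeverNegative() {
  for (int V = -127; V <= 127; ++V)
    if (absWrap(static_cast<std::int8_t>(V)) < 0)
      return false;
  return true;
}
```

Note that absWrap(-128) is still -128, which is why the abs folds above depend
on the nsw flag while the nabs folds do not.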

Differential Revision: https://reviews.llvm.org/D53844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345717 91177308-0d34-0410-b5e6-96231b3b80d8

[tblgen][PredicateExpander] Add the ability to describe more complex constraints on instruction operands.

Before this patch, class PredicateExpander only knew how to expand simple
predicates that performed checks on instruction operands.
In particular, the new scheduling predicate syntax was not rich enough to
express checks like this one:

Foo(MI->getOperand(0).getImm()) == ExpectedVal;

Here, the immediate operand value at index zero is passed as input to function
Foo, and ExpectedVal is compared against the value returned by function Foo.
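
To make the shape of such a check concrete, here is a self-contained sketch.
Operand and Instr are minimal stand-ins, not LLVM's MachineOperand and
MachineInstr, and the body of Foo is a made-up helper chosen only so that the
example has something to compute.

```cpp
#include <cstdint>
#include <vector>

// Minimal stand-in for an instruction operand (hypothetical type).
struct Operand {
  std::int64_t Imm;
  std::int64_t getImm() const { return Imm; }
};

// Minimal stand-in for an instruction (hypothetical type).
struct Instr {
  std::vector<Operand> Ops;
  const Operand &getOperand(unsigned Idx) const { return Ops.at(Idx); }
};

// Hypothetical target helper standing in for 'Foo': keeps the low 4 bits.
std::int64_t Foo(std::int64_t Imm) { return Imm & 0xF; }

// The check from the text: operand 0's immediate goes through Foo and the
// result is compared against ExpectedVal.
bool checkOperand0(const Instr *MI, std::int64_t ExpectedVal) {
  return Foo(MI->getOperand(0).getImm()) == ExpectedVal;
}
```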

While this predicate pattern doesn't show up in any X86 model, it shows up in
other upstream targets. So, being able to support those predicates is
fundamental if we want to be able to modernize all the scheduling models
upstream.

With this patch, we allow users to specify if a register/immediate operand value
needs to be passed as input to a function as part of the predicate check. Now,
register/immediate operand checks all derive from base class CheckOperandBase.

This patch also changes where TIIPredicate definitions are expanded by the
instruction info emitter. Before, definitions were expanded in class
XXXGenInstrInfo (where XXX is a target name).
With the introduction of this new syntax, we may want to have TIIPredicates
expanded directly in XXXInstrInfo. That is because functions used by the new
operand predicates may only exist in the derived class (i.e. XXXInstrInfo).

This patch is a non-functional change for the existing scheduling models.
In the future, we will be able to use this richer syntax to better describe complex
scheduling predicates, and expose them to llvm-mca.

Differential Revision: https://reviews.llvm.org/D53880

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345714 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Add tests for loop-simplifycfg for further development

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345713 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Remove illegal comparison of singular iterators from SmallSetTest

This removes the assertion that a copy of a moved-from SmallSetIterator
equals the original, which is illegal due to SmallSetIterator including
an instance of a standard `std::set` iterator.

C++ [iterator.requirements.general] states that comparing singular
iterators has an undefined result:

> Iterators can also have singular values that are not associated with
> any sequence. [...] Results of most expressions are undefined for
> singular values; the only exceptions are destroying an iterator that
> holds a singular value, the assignment of a non-singular value to an
> iterator that holds a singular value, and, for iterators that satisfy
> the Cpp17DefaultConstructible requirements, using a value-initialized
> iterator as the source of a copy or move operation.

This assertion triggers the following error in the GNU C++ Library in
debug mode under EXPENSIVE_CHECKS:

  /usr/include/c++/8.2.1/debug/safe_iterator.h:518:
  Error: attempt to compare a singular iterator to a singular iterator.

  Objects involved in the operation:
      iterator "lhs" @ 0x0x7fff86420670 {
        state = singular;
      }
      iterator "rhs" @ 0x0x7fff86420640 {
        state = singular;
      }

Patch by Eugene Sharygin.

Reviewers: fhahn, dblaikie, chandlerc

Reviewed By: fhahn, dblaikie

Differential Revision: https://reviews.llvm.org/D53793

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345712 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] support image load/store a16

Our a16 support was only enabled for sample/gather and buffer
load/store, but not for image load/store operations (which take an i16
as the pixel index rather than a half).

Fix our isel lowering and add test cases to prove it out.

Differential Revision: https://reviews.llvm.org/D53750

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345710 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVars] Strengthen restriction in rewriteLoopExitValues

For some unclear reason rewriteLoopExitValues considers recalculation
after the loop profitable if it has some "soft uses" outside the loop (i.e. any
use other than call and return), even if we have proved that it has a user inside
the loop which we think will not be optimized away.

There is no existing unit test that would explain this. This patch provides an
example where rematerialisation of the exit value is not profitable, yet it passes
this check due to the presence of a "soft use" outside the loop.

It makes no sense to recalculate the value on exit if we are going to compute it
anyway because of some irremovable user within the loop. This patch disallows
applying this transform in the described situation.

Differential Revision: https://reviews.llvm.org/D51581
Reviewed By: etherzhhb

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345708 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Support vectorization of interleave-groups that require an epilog under
optsize using masked wide loads

Under Opt for Size, the vectorizer does not vectorize interleave-groups that
have gaps at the end of the group (such as a loop that reads only the even
elements: a[2*i]) because that implies that we'll require a scalar epilogue
(which is not allowed under Opt for Size). This patch extends the support for
masked-interleave-groups (introduced by D53011 for conditional accesses) to
also cover the case of gaps in a group of loads. Targets that enable the
masked-interleave-group feature don't have to invalidate interleave-groups of
loads with gaps; they could now use masked wide-loads and shuffles (if that's
what the cost model selects).
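
A scalar emulation may help picture the problem and the fix. This is only a
sketch: the vectorization factor of 4, the function names and the use of
std::vector are assumptions, and N is assumed to be a multiple of the
vectorization factor. The buffer holds exactly 2*N - 1 elements, so an unmasked
wide load in the last step would touch one element past the end; that extra
element is the gap that otherwise forces a scalar epilogue.

```cpp
#include <cstddef>
#include <vector>

// Scalar loop from the text: reads only the even elements a[2*i].
int sumEvenScalar(const std::vector<int> &A, std::size_t N) {
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += A.at(2 * I);
  return Sum;
}

// Emulated masked wide load: each step covers VF * Factor consecutive
// elements, but the mask enables only the lanes that belong to the group,
// so the gap lanes (including the one past the end) are never read.
int sumEvenMaskedWide(const std::vector<int> &A, std::size_t N) {
  constexpr std::size_t VF = 4;     // assumed vectorization factor
  constexpr std::size_t Factor = 2; // interleave factor (stride of the access)
  int Sum = 0;
  for (std::size_t I = 0; I < N; I += VF) {
    const std::size_t Base = Factor * I;
    for (std::size_t Lane = 0; Lane < VF * Factor; ++Lane) {
      const bool MaskBit = Lane % Factor == 0; // odd lanes are gaps
      if (MaskBit)
        Sum += A.at(Base + Lane); // at() would throw on an out-of-bounds read
    }
  }
  return Sum;
}
```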

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345705 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Mark syms/t flags as NotHidden. NFC.

Slight improvement to help output of llvm-objdump that exposes the
shorter -t flag for -syms instead of it being hidden away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345704 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add --reloc alias for -r (PR39407)

This addresses PR39407 (https://bugs.llvm.org/show_bug.cgi?id=39407)
improving compatibility with GNU binutils counterparts.

Reviewed By: kristina

Patch by Higuoxing (Xing).

Differential Revision: https://reviews.llvm.org/D53804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345703 91177308-0d34-0410-b5e6-96231b3b80d8

[MSan] another take at instrumenting inline assembly - now with calls

Turns out it's not always possible to figure out whether an asm()
statement argument points to a valid memory region.
One example would be per-CPU objects in the Linux kernel, for which the
addresses are calculated using the FS register and a small offset in the
.data..percpu section.
To avoid pulling all sorts of checks into the instrumentation, we replace
actual checking/unpoisoning code with calls to
msan_instrument_asm_load(ptr, size) and
msan_instrument_asm_store(ptr, size) functions in the runtime.

This patch doesn't implement the runtime hooks in compiler-rt, as there's
been no demand for assembly instrumentation in userspace apps so far.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345702 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM64] [Windows] Exception handling support in frame lowering

Emit pseudo instructions indicating unwind codes corresponding to each
instruction inside the prologue/epilogue. These are used by the MCLayer to
populate the .xdata section.

Differential Revision: https://reviews.llvm.org/D50288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345701 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Mark condition flags and x16/x17 as clobbered when calling __chkstk

This is similar to SVN r311061 for ARM.

Differential Revision: https://reviews.llvm.org/D53878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345698 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] support '--syms' as an alias of -t

This adds support for '--syms' as an alias of '-t' for llvm-objdump,
fixing PR39406 (https://bugs.llvm.org/show_bug.cgi?id=39406).

Patch by Higuoxing (Xing).

Differential Revision: https://reviews.llvm.org/D53803

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345697 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Fix hex printing of uint64_t values.

A plain "%x" format string will drop the high 32 bits. Use the PRIx64 macro
instead.
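
A short sketch of the difference, using only the standard library. The helper
names are made up for illustration. Because passing a uint64_t to "%x"
directly is undefined behavior, the truncation is made explicit with a mask,
assuming a 32-bit unsigned int.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats all 64 bits: PRIx64 expands to the right length modifier.
std::string hex64(std::uint64_t V) {
  char Buf[17]; // 16 hex digits plus the terminating NUL
  std::snprintf(Buf, sizeof(Buf), "%" PRIx64, V);
  return Buf;
}

// Shows what survives when only the low 32 bits reach "%x".
std::string hexLow32(std::uint64_t V) {
  char Buf[9]; // 8 hex digits plus the terminating NUL
  std::snprintf(Buf, sizeof(Buf), "%x",
                static_cast<unsigned>(V & 0xFFFFFFFFu));
  return Buf;
}
```

For the value 0x123456789abcdef0, hex64 keeps every digit while hexLow32
returns only "9abcdef0".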

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345696 91177308-0d34-0410-b5e6-96231b3b80d8

2nd attempt to fix ambiguities because of ADL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345690 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix ambiguities with C++17 headers in unittest

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345689 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Revert r345546: Refactor range list extraction and dumping

This patch caused some internal tests to break which are being investigated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345687 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Delete a redundant override whose base is empty

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345684 91177308-0d34-0410-b5e6-96231b3b80d8

Use llvm::any_of instead of std::any_of. NFC
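
The appeal of the range form is that the begin/end pair disappears from the
call site. The wrapper below is a hypothetical stand-in written for this
example, in the spirit of llvm::any_of; it is not LLVM's implementation.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical range-based wrapper around std::any_of.
template <typename Range, typename Pred>
bool anyOf(Range &&R, Pred P) {
  return std::any_of(std::begin(R), std::end(R), P);
}

// Call site using the range form: no explicit begin()/end().
bool hasNegative(const std::vector<int> &V) {
  return anyOf(V, [](int X) { return X < 0; });
}
```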

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345683 91177308-0d34-0410-b5e6-96231b3b80d8

Use the container form llvm::sort(C)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345682 91177308-0d34-0410-b5e6-96231b3b80d8

Don't duplicate function/class name at the beginning of the comment. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345681 91177308-0d34-0410-b5e6-96231b3b80d8

ADT/STLExtras: Introduce llvm::empty; NFC

This is modeled after C++17 std::empty().
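
For readers who have not used std::empty(), a rough sketch of the idea. isEmpty
is a hypothetical name, and comparing begin with end is an assumption about how
such a helper can be written, not a copy of the STLExtras code.

```cpp
#include <iterator>
#include <vector>

// A range is empty when its begin and end iterators compare equal.
template <typename Range>
bool isEmpty(const Range &R) {
  return std::begin(R) == std::end(R);
}
```

This works for containers and for built-in arrays alike, since std::begin and
std::end accept both.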

Differential Revision: https://reviews.llvm.org/D53909

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345679 91177308-0d34-0410-b5e6-96231b3b80d8

DWARFVerifier: make the verifier more comprehensive for objects

Make the code do what was mentioned in the comment: only skip the CU types.
This enables the lexical blocks to be verified as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345675 91177308-0d34-0410-b5e6-96231b3b80d8

MachineOperand/MIParser: Do not print debug-use flag, infer it

The debug-use flag must be set exactly for uses on DBG_VALUEs. This is
so obvious that it can be trivially inferred while parsing. This will
reduce noise when printing by omitting information that has little
value to the user.

The parser will keep recognizing the flag for compatibility with old
`.mir` files.

Differential Revision: https://reviews.llvm.org/D53903

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345671 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM][NFC] Make tests immune to better div optimizations

Summary: Related to D52504

Reviewers: spatel

Reviewed By: spatel

Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345665 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r345542: AMDGPU: Enable code object v3 by default

It breaks mesa.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345662 91177308-0d34-0410-b5e6-96231b3b80d8

[FPEnv] Add constrained intrinsics for MAXNUM and MINNUM

Differential Revision: https://reviews.llvm.org/D53216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345650 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] use 'match' to reduce code; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345647 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Teach the move-free-before-null-test optimization how to deal with no-op casts

InstCombine features an optimization that essentially replaces:
if (a)
  free(a)
with:
free(a)

Right now, this optimization is gated by the minsize attribute and therefore
we only perform it if we can prove that we are going to be able to eliminate
the branch and the destination block.

However, when casts are involved the optimization would fail to apply, because
it was not smart enough to realize that the casts can also be moved out of the
destination block, and that doing so is harmless to performance since they are
just no-ops.
E.g.,
foo(int *a)
if (a)
  free((char*)a)

This wouldn't be optimized by instcombine, because:
- We would refuse to hoist the `bitcast i32* %a to i8*` into the source block.
- We would fail to see that `bitcast i32* %a to i8*` and %a are the same value.

This patch fixes both these problems:
- It teaches the pattern matching of the comparison how to look
  through casts.
- It checks whether the additional instructions in the destination block
  can be hoisted and are harmless performance-wise.
- It hoists all the code of the destination block into the source block.
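
At the source level, the shapes discussed in this commit look roughly like the
sketch below. The function names are illustrative. The rewrite is valid because
the C standard defines free() on a null pointer as a no-op, so the guard only
costs a branch.

```cpp
#include <cstdlib>

// Guarded form: the branch exists only to skip the call for null.
void releaseGuarded(int *A) {
  if (A)
    std::free(A);
}

// Unguarded form the optimization produces; free(nullptr) does nothing.
void releaseUnguarded(int *A) { std::free(A); }

// Guarded form with a no-op pointer cast, the case this patch adds.
void releaseThroughCast(int *A) {
  if (A)
    std::free(reinterpret_cast<char *>(A));
}

// Runs every form on a null pointer and on a fresh heap allocation;
// returns true once all calls have completed.
bool exerciseAllForms() {
  releaseGuarded(nullptr);
  releaseUnguarded(nullptr);
  releaseThroughCast(nullptr);
  releaseGuarded(static_cast<int *>(std::malloc(sizeof(int))));
  releaseUnguarded(static_cast<int *>(std::malloc(sizeof(int))));
  releaseThroughCast(static_cast<int *>(std::malloc(sizeof(int))));
  return true;
}
```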

Differential Revision: D53356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345644 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try to make test immune to better div optimization; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345642 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF, ARM64] Make sure to forward arguments from vararg to musttail vararg

Summary:
    Thunk functions in Windows are vararg functions that call a musttail function
    to pass the arguments after the fixup is done.  We need to make sure that we
    forward the arguments from the caller vararg to the callee vararg function.
    This is the same mechanism that is used for Windows on X86.

Reviewers: ssijaric, eli.friedman, TomTan, mgrang, mstorsjo, rnk, compnerd, efriedma

Reviewed By: efriedma

Subscribers: efriedma, kristof.beyls, chrib, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53843

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345641 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try to make test immune to better div optimization; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345640 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try to make test immune to better div optimization; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345639 91177308-0d34-0410-b5e6-96231b3b80d8

[ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only used inside loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345638 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Fix for big endian in ForwardStoreValueToDirectLoad

Summary:
Normalize the offset for endianness before checking
if the store covers the load in ForwardStoreValueToDirectLoad.

Without this we missed out on some optimizations for big
endian targets. For example, with a 4-byte store followed
by a 1-byte load that loads the least significant byte of the
stored value, the STCoversLD check would fail (see @test4 in
test/CodeGen/AArch64/load-store-forwarding.ll).

This patch also fixes a problem seen in an out-of-tree target.
The target has i40 as a legal type, it is big endian,
and the StoreSize for i40 is 48 bits. So when normalizing
the offset for endianness we need to take the StoreSize into
account (assuming that padding added when storing into
a larger StoreSize always is added at the most significant
end).
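
The arithmetic can be sketched as follows. This is an illustration under
stated assumptions, not the DAGCombiner code: sizes and offsets are in bytes,
the helper names are made up, and padding is assumed to sit at the most
significant end as described above.

```cpp
#include <cstdint>

// Converts a byte offset in memory into an offset counted from the least
// significant byte of the stored value. StoreSize includes any padding.
std::int64_t normalizeOffset(bool IsBigEndian, std::int64_t MemOffset,
                             std::int64_t StoreSize, std::int64_t LoadSize) {
  if (!IsBigEndian)
    return MemOffset; // little endian: memory offset 0 is already the LSB
  return StoreSize - LoadSize - MemOffset;
}

// The store covers the load when the normalized byte range lies inside the
// ValueSize bytes that actually hold the value (padding excluded).
bool storeCoversLoad(bool IsBigEndian, std::int64_t MemOffset,
                     std::int64_t StoreSize, std::int64_t ValueSize,
                     std::int64_t LoadSize) {
  const std::int64_t Off =
      normalizeOffset(IsBigEndian, MemOffset, StoreSize, LoadSize);
  return Off >= 0 && Off + LoadSize <= ValueSize;
}
```

For the 4-byte store and 1-byte load example, the least significant byte lives
at memory offset 3 on a big endian target and normalizes to 0. For an i40
stored in 6 bytes, the least significant byte is at memory offset 5 and also
normalizes to 0, while memory offset 0 is padding and is not covered.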

Reviewers: niravd

Reviewed By: niravd

Subscribers: javed.absar, kristof.beyls, llvm-commits, uabelho

Differential Revision: https://reviews.llvm.org/D53776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345636 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] [Windows] SEH opcodes should be scheduling boundaries.

Prevents the post-RA scheduler from modifying the prologue sequences
emitted by frame lowering. This is roughly similar to what we do for
other targets: TargetInstrInfo::isSchedulingBoundary checks
isPosition(), which checks for CFI_INSTRUCTION.

isSEHInstruction is taken from D50288; it'll land with whatever patch
lands first.

Differential Revision: https://reviews.llvm.org/D53851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345634 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Create proper memoperand for multi-vector stores

Re-apply r345315 with testcase fixes.

Include all of the store's source vector operands when creating the
MachineMemOperand. Previously, we were missing the first operand,
making the store size seem smaller than it really is.

Differential Revision: https://reviews.llvm.org/D52816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345631 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] In lowerVectorShuffleAsBroadcast, make peeking through CONCAT_VECTORS work correctly if we already walked through a bitcast that changed the element size.

The CONCAT_VECTORS case was using the original mask element count to determine how to adjust the broadcast index. But if we looked through a bitcast the original mask size doesn't tell us anything about the concat_vectors.

This patch switches to using the concat_vectors input element count directly instead.

Differential Revision: https://reviews.llvm.org/D53823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345626 91177308-0d34-0410-b5e6-96231b3b80d8

[GCOV] Function counters are wrong when on one line

Summary:
After commit https://reviews.llvm.org/rL344228, function definitions have a counter, but when a function is on one line the counter is wrong (e.g. void foo() { }).
I added a test in: https://reviews.llvm.org/D53601

Reviewers: marco-c

Reviewed By: marco-c

Subscribers: llvm-commits, sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D53600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345624 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Add const variants for BaseIndexOffset functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345623 91177308-0d34-0410-b5e6-96231b3b80d8

Fix printing bug in pdb2yaml.

We were using the wrong enum table when mapping enum values
to strings for public symbol flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345622 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Define base function on DWARFDie reverse iterators

This defines member function base on the specialization of
std::reverse_iterator for DWARFDie::iterator as required by C++
[reverse.iter.conv].
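
As a reminder of what base() is expected to return, here is a sketch using the
standard std::reverse_iterator over a std::vector rather than
DWARFDie::iterator. The function name is illustrative.

```cpp
#include <iterator>
#include <vector>

// For every dereferenceable reverse iterator RI, *RI is the element just
// before RI.base(); rbegin().base() is end() and rend().base() is begin().
bool baseIsOnePastCurrent(const std::vector<int> &V) {
  for (auto RI = V.rbegin(); RI != V.rend(); ++RI)
    if (&*RI != &*std::prev(RI.base()))
      return false;
  return V.rbegin().base() == V.end() && V.rend().base() == V.begin();
}
```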

This fixes unit test DWARFDebugInfoTest.cpp under EXPENSIVE_CHECKS which
currently can't be built due to GNU C++ Library calling this member
function in debug mode.

This fixes https://llvm.org/PR38785

Patch by: Eugene Sharygin

Differential revision: https://reviews.llvm.org/D53792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345621 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Simplify LRV/STRV ISD nodes

The LRV and STRV nodes carry an extra operand to indicate the
type of the memory access. This is redundant, since the nodes
are actually of class MemIntrinsicNode and therefore hold that
same information already as MemoryVT.

NFC intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345618 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)

Correct costing of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.

Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!

I've done my best to fix a number of vectorizer uses:

SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.

LV - I'm not clear on what the SK_ExtractSubvector should represent for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.

Differential Revision: https://reviews.llvm.org/D53573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345617 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add preliminary tests for nested min/max combines. NFC

Summary: As requested in D53774.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53875

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345616 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for fcmp folds; NFC

This is part of a problem noted in PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345615 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix --keep-global-symbol/--globalize-symbol for undefined symbols.

Summary: --keep-global-symbol and --globalize-symbol don't make sense for undefined symbols, so they should be ignored for those symbols. This matches GNU objcopy behavior.

Reviewers: jhenderson, alexshap, jakehehrlich, espindola

Reviewed By: jhenderson, jakehehrlich

Subscribers: emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D53733

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345614 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] use getFltSemantics() instead of duplicating it; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345613 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Move namespace mca inside llvm::

Summary: This allows removing `using namespace llvm;` in those *.cpp files.

When we want to revisit the decision (everything resides in llvm::mca::*) in the future, we can move things to a nested namespace of llvm::mca::, to conceptually make them separate from the rest of llvm::mca::*

Reviewers: andreadb, mattd

Reviewed By: andreadb

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D53407

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345612 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] try to turn shuffle into insertelement

shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'
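
One concrete instance of this pattern, modeled on 4-element vectors. This is a
sketch: Vec4, the helper names and the chosen mask are assumptions made for
illustration. Mask values 0..3 select from the first operand and 4..7 from the
second, as in shufflevector.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<int, 4>;

// insertelement: returns V with lane Idx replaced by Scalar.
Vec4 insertElement(Vec4 V, int Scalar, std::size_t Idx) {
  V[Idx] = Scalar;
  return V;
}

// shufflevector on two 4-element inputs.
Vec4 shuffle(const Vec4 &Op0, const Vec4 &Op1,
             const std::array<std::size_t, 4> &Mask) {
  Vec4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = Mask[I] < 4 ? Op0[Mask[I]] : Op1[Mask[I] - 4];
  return R;
}
```

With the scalar inserted at lane 2 of the first operand and the mask
{4, 2, 6, 7}, the shuffle takes lanes 0, 2 and 3 from V1 and lane 1 from the
inserted scalar, which is the same as inserting the scalar into V1 at lane 1.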

The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start
with or instcombine has converted insert/extract to shuffles.

Independent of that, an insertelement is always a simpler op for IR
analysis vs. a shuffle, so we should transform to insert when possible.

I don't think there's any codegen concern here - if a target can't insert
a scalar directly to some fixed element in a vector (x86?), then this
should get expanded to the insert+shuffle that we started with.

Differential Revision: https://reviews.llvm.org/D53507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345607 91177308-0d34-0410-b5e6-96231b3b80d8

[SchedModel] Fix for read advance cycles with implicit pseudo operands.

The SchedModel allows the addition of ReadAdvances to express that certain
operands of the instructions are needed at a later point than the others.

RegAlloc may add pseudo operands that are not part of the instruction
descriptor, and therefore cannot have any read advance entries. This meant
that in some cases the desired read advance was nullified by such a pseudo
operand, which still had the original latency.

This patch fixes this by making sure that such pseudo operands get a zero
latency during DAG construction.

Review: Matthias Braun, Ulrich Weigand.
https://reviews.llvm.org/D49671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345606 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorizer]  Fix for cost values of memory accesses.

This commit is a combination of two patches:

* "Fix in getScalarizationOverhead()"

   If target returns false in TTI.prefersVectorizedAddressing(), it means the
   address registers will not need to be extracted. Therefore, there should
   be no operands scalarization overhead for a load instruction.

* "Don't pass the instruction pointer from getMemInstScalarizationCost."

   Since VF is always > 1, this is a cost query for an instruction in the
   vectorized loop and it should not be evaluated within the scalar
   context of the instruction.

Review: Ulrich Weigand, Hal Finkel
https://reviews.llvm.org/D52351
https://reviews.llvm.org/D52417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345603 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] narrow vector binops when extraction is cheap

Narrowing vector binops came up in the demanded bits discussion in D52912.

I don't think we're going to be able to do this transform in IR as a canonicalization
because of the risk of creating unsupported widths for vector ops, but we already have
a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is
currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression
test diffs).

This is artificially limited to not look through bitcasts because there are so many
test diffs already, but that's marked with a TODO and is a small follow-up.

Differential Revision: https://reviews.llvm.org/D53784

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345602 91177308-0d34-0410-b5e6-96231b3b80d8

[FIX][AArch64] Add support for UDF instruction

Fix: Simplify test files from rL345581 that were failing
on Windows bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345601 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] fix build warning for mismatched signs in compare; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345598 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.

Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a
load. This patch adds a check for this.
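
The rule stated above can be sketched as a small standalone predicate. The Opcode enum and the function names here are hypothetical and only illustrate the operand check; they are not the SystemZ TTI code, which this log does not show.

```cpp
#include <cassert>

// Hypothetical subset of binary opcodes, for illustration only.
enum class Opcode { Add, Mul, Sub, SDiv, UDiv };

bool isCommutative(Opcode Op) {
  return Op == Opcode::Add || Op == Opcode::Mul;
}

// OperandIdx 0 is the LHS, 1 is the RHS.
// A commutative operation can swap its operands, so a load feeding either
// side may be folded. For Sub, SDiv and UDiv only the RHS qualifies.
bool canFoldLoadIntoOperand(Opcode Op, unsigned OperandIdx) {
  assert(OperandIdx < 2 && "binary operators have two operands");
  if (isCommutative(Op))
    return true;
  return OperandIdx == 1;
}
```

For example, canFoldLoadIntoOperand(Opcode::Sub, 0) is false while canFoldLoadIntoOperand(Opcode::Sub, 1) is true.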

Review: Ulrich Weigand
https://reviews.llvm.org/D53791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345596 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Re-enable the machine verifier after fixing more tests

Was disabled again in r345528. Hopefully this fixes the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345593 91177308-0d34-0410-b5e6-96231b3b80d8

[llc] Error out when -print-machineinstrs is used with an unknown pass

We used to assert instead of reporting an error.

PR39494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345589 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-size] Reject unknown radix values

This addresses https://bugs.llvm.org/show_bug.cgi?id=39403 by making
-radix an enumeration option with 8, 10, and 16 as the only accepted
values.
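
As a rough illustration of treating the radix as an enumeration rather than a free-form integer, the sketch below accepts only the three values named above and rejects everything else. The Radix enum and parseRadix are hypothetical names; the actual patch presumably relies on the LLVM CommandLine library, which is not reproduced here.

```cpp
#include <optional>
#include <string_view>

// Hypothetical enumeration of the accepted radix values.
enum class Radix : unsigned { Octal = 8, Decimal = 10, Hex = 16 };

// Returns the radix for an accepted argument, or std::nullopt for anything
// else (including values such as "2" that are valid integers but unsupported).
std::optional<Radix> parseRadix(std::string_view Arg) {
  if (Arg == "8")
    return Radix::Octal;
  if (Arg == "10")
    return Radix::Decimal;
  if (Arg == "16")
    return Radix::Hex;
  return std::nullopt;
}
```

A caller would report an error and exit when parseRadix returns std::nullopt instead of continuing with an unsupported radix.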

Reviewed by: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D53799

Patch by Eugene Sharygin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345588 91177308-0d34-0410-b5e6-96231b3b80d8

[FIX][AArch64] Add support for UDF instruction

Fix wrong test files submitted
in rL345581

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345587 91177308-0d34-0410-b5e6-96231b3b80d8

[SROA] Use offset sizes from the DataLayout instead of the pointer sizes.

This fixes an assertion when constant folding a GEP where part of the offset
was in i32 (IndexSize, as per DataLayout) and part was in i64 (PointerSize),
as exercised by the newly created test case.

Differential Revision: https://reviews.llvm.org/D52609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345585 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][BMI1] X86DAGToDAGISel: select BEXTR from x & (-1 >> (32 - y)) pattern

Summary:
The final pattern.
There are no test changes:
* We are looking for the pattern with one-use of its mask,
* If the mask is one-use, D48768 will unfold it into pattern d.
* Thus, the tests have extra-use on the mask.
* Thus, only the BMI2 BZHI can be tested, and it already worked.
* So there is no BMI1 test coverage, we just assume it works since it uses the same codepath.
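
For reference, the source-level meaning of the pattern in the subject line can be sketched in plain C++. The function name is hypothetical, and the sketch assumes the shift is a logical (unsigned) right shift and that y is in [1, 32], since a shift by 32 would be undefined. It only shows what value the pattern computes, not how instruction selection matches it.

```cpp
#include <cassert>
#include <cstdint>

// Computes x & (-1 >> (32 - y)): keeps the low Y bits of X and clears the rest.
// Assumes 1 <= Y <= 32 so that the shift amount stays within [0, 31].
std::uint32_t extractLowBits(std::uint32_t X, unsigned Y) {
  assert(Y >= 1 && Y <= 32 && "shift amount 32 - Y must be in [0, 31]");
  const std::uint32_t AllOnes = UINT32_MAX;   // the "-1" of the pattern
  const std::uint32_t Mask = AllOnes >> (32 - Y);
  return X & Mask;
}
```

For instance, extractLowBits(0xDEADBEEF, 8) keeps only the lowest byte, 0xEF.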

Reviewers: craig.topper, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53575

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345584 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add extra-uses on the mask of pattern c of extract-{low,}bits.ll tests

Summary:
Because of D48768, that pattern is always unfolded into pattern d,
thus we had no test coverage.

Reviewers: RKSimon, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345583 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add support for UDF instruction

Summary: Add support for AArch64 UDF instruction.
UDF - Permanently Undefined generates an Undefined
Instruction exception (ESR_ELx.EC = 0b000000).

Reviewers: DavidSpickett, javed.absar, t.p.northover

Reviewed By: javed.absar

Subscribers: nhaehnle, kristof.beyls

Differential Revision: https://reviews.llvm.org/D53319

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345581 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes

Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.

This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle.

Differential Revision: https://reviews.llvm.org/D53760

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345578 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Improve X div/rem Y fold if single bit element type

Summary: Tests by @spatel, thanks

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: sdardis, atanasyan, llvm-commits, spatel

Differential Revision: https://reviews.llvm.org/D52668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345575 91177308-0d34-0410-b5e6-96231b3b80d8

[LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with vector output type and a vector input type that needs to be widened

Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed. This will avoid going through the stack and allows us to remove a custom version of this legalization from X86.

Reviewers: efriedma, RKSimon

Reviewed By: efriedma

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345567 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add test case for D53229. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345566 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Cleanup the code in LowerFABSorFNEG and LowerFCOPYSIGN a little. NFC

Use SelectionDAG::EVTToAPFloatSemantics. Make the LogicVT calculation in LowerFABSorFNEG similar to LowerFCOPYSIGN. Use APInt::getSignedMaxValue instead of ~APInt::getSignMask.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345565 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Stop changing f128 fand/for/fxor to v2i64.

The additional patterns don't cost us much and it seems better than changing element widths.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345564 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove custom BUILD_VECTOR combine

This was looping in a testcase and removing it
now slightly improves a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345560 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Use scavengeRegisterBackwards

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345559 91177308-0d34-0410-b5e6-96231b3b80d8

Remove dead declaration

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345555 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typos in comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345554 91177308-0d34-0410-b5e6-96231b3b80d8

Pass TRI to printReg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345553 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unneeded friend declarations that clang-cl warns on

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345549 91177308-0d34-0410-b5e6-96231b3b80d8

[AliasSetTracker] Cleanup addPointer interface. [NFCI]

Summary:
Attempting to simplify the addPointer interface.
Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call.

Reviewers: reames, mkazantsev

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53836

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345548 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF][NFC] Refactor range list extraction and dumping

The purpose of this patch is twofold:
- Fold pre-DWARF v5 functionality into v5 to eliminate the need for 2 different
  versions of range list handling. We get rid of DWARFDebugRangelist{.cpp,.h}.
- Templatize the handling of range list tables so that location list handling
  can take advantage of it as well. Location list and range list tables have the
  same basic layout.

A non-NFC version of this patch was previously submitted with r342218, but it caused
errors with some TSan tests. This patch has no functional changes. The difference to
the non-NFC patch is that there are no changes to rangelist dumping in this patch.

Differential Revision: https://reviews.llvm.org/D53545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345546 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Move elf-specific code into subfolder

In this diff the elf-specific code is moved into the subfolder ELF
(and factored out from llvm-objcopy.cpp).

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D53790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345544 91177308-0d34-0410-b5e6-96231b3b80d8

Add parens to fix incorrect assert check.

&& has higher priority than ||, so this assert works really oddly. Add
parens to match the programmer's intent.
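
A small standalone illustration of the precedence issue follows. The condition in the actual assert is not shown in this log, so A, B and C are placeholders. Because && binds tighter than ||, an unparenthesized `A || B && C` is parsed like the first function below, which is not the same as the second.

```cpp
// How the compiler groups `A || B && C` when no parentheses are written.
bool asParsed(bool A, bool B, bool C) {
  return A || (B && C);
}

// The grouping a reader might have intended: either A or B, and also C.
bool asIntended(bool A, bool B, bool C) {
  return (A || B) && C;
}
```

The two differ whenever A is true and C is false: asParsed(true, false, false) is true, while asIntended(true, false, false) is false, so an assert written without parentheses can pass when it was meant to fire.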

Change-Id: I3abe1361ee0694462190c5015779db664012f3d4

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345543 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Enable code object v3 by default

Differential Revision: https://reviews.llvm.org/D53525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345542 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for abs/nabs+icmp folding; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345541 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] NFC. Factor out runtime-loop.ll common test behavior.

Adding COMMON prefix to get common part handled there.
Needed to simplify test changes for D53440.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345538 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Inherit target features from parent function

If a function has target features, it may contain instructions that aren't
represented in the default set of instructions. If the outliner pulls out one
of these instructions, and the function doesn't have the right attributes
attached, we'll run into an LLVM error explaining that the target doesn't
support the necessary feature for the instruction.

This makes outlined functions inherit target features from their parents.

It also updates the machine-outliner.ll test to check that we're properly
inheriting target features.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345535 91177308-0d34-0410-b5e6-96231b3b80d8

Relax fast register allocator related test cases; NFC

- Relax hard coded registers and stack frame sizes
- Some test cleanups
- Change phi-dbg.ll to match on mir output after phi elimination instead
of going through the whole codegen pipeline.

This is in preparation for https://reviews.llvm.org/D52010
I'm committing upfront all the test changes that work independently both
before and after that change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345532 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Set isMachineVerifierClean() back to false (PR27481)

Put back the isMachineVerifierClean() override removed at rL345513 to fix Windows ThinLTO tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345528 91177308-0d34-0410-b5e6-96231b3b80d8

[HotColdSplitting] Allow outlining single-block cold regions

It can be profitable to outline single-block cold regions because they
may be large.

Allow outlining single-block regions if they have over some threshold of
non-debug, non-terminator instructions. I chose 3 as the threshold after
experimenting with several internal frameworks.

In practice, reducing the threshold further did not give much
improvement, whereas increasing it resulted in substantial regressions.
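
The heuristic described above can be sketched as a simple instruction count. The Inst struct and function name are hypothetical stand-ins, not the HotColdSplitting implementation, and reading "over some threshold" as a strict greater-than comparison is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction: only the two properties the
// heuristic cares about are modeled.
struct Inst {
  bool IsDebug;
  bool IsTerminator;
};

// Returns true if a single-block cold region has more than Threshold
// instructions that are neither debug instructions nor terminators.
// The default of 3 is the threshold mentioned in the commit message.
bool worthOutliningSingleBlock(const std::vector<Inst> &Block,
                               std::size_t Threshold = 3) {
  std::size_t Count = 0;
  for (const Inst &I : Block)
    if (!I.IsDebug && !I.IsTerminator)
      ++Count;
  return Count > Threshold;
}
```

Under these assumptions, a block with four ordinary instructions plus a terminator qualifies, while a block with three ordinary instructions, a debug instruction and a terminator does not.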

Differential Revision: https://reviews.llvm.org/D53824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345524 91177308-0d34-0410-b5e6-96231b3b80d8