git.osdn.net Git - android-x86/external-llvm.git/log

Make PredIteratorCache size() logically const. Do not require copying predecessors to get size.

Summary:
Every single benchmark i can run, on large and small cfgs, fully
connected, etc, across 3 different platforms (x86, arm., and PPC) says
that the current pred iterator cache is a losing proposition.

I can't find a case where it's faster than just walking preds, and in some cases, it's 5-10% slower.

This is due to copying the preds.
It also degrades into copying the entire cfg.

The one operation that is occasionally faster is the cached size.
This makes that operation faster by not relying on having the copies available.

I'm not even sure that is faster enough to be worth it. I, again, have
trouble finding cases where this takes long enough in a pass to be
worth caching compared to a million other things they could cache or
improve.

My suggestion:
We next remove the get() interface.
We do stronger benchmarking of size().
We probably end up killing this entire cache.
/

Reviewers: chandlerc

Subscribers: aemerson, llvm-commits, trentxintong

Differential Revision: https://reviews.llvm.org/D30873

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297733 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297731 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Fix -Wreorder warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297729 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typos in ADCE comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297726 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Out of range shifts might be undef

If it is possible for the RHS of a shift operation to be greater than or equal
to the bit-width, then the result might be undef, and we can't report any known
bits.

In some cases, this was allowing a transformation in instcombine which widened
an undef value from i1 to i32, increasing the range of values that a function
could return.

Differential revision: https://reviews.llvm.org/D30781

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297724 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Move SMULW[B|T] isel to DAG Combine

Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using tablegen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandBits. Added a couple of
extra tests.

Differential Revision: https://reviews.llvm.org/D30708

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297716 91177308-0d34-0410-b5e6-96231b3b80d8

Disable Callee Saved Registers

Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list.
The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee.
The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee.
Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span).
The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments.
The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC.

Differential Revision: https://reviews.llvm.org/D28566

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297715 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Use iPTR instead of i64 in patterns for extract_subvector/insert_subvector index.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297707 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add test cases that demonstrate some patterns that don't work correctly in 32-bit mode. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297706 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved

getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.

Tests updates:

Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll

The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.

Review: Hal Finkel
https://reviews.llvm.org/D29540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Pre-emptively fix more places in fastisel where we might copy a VK1 register into a AH/BH/CH/DH register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297704 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing condprop-xfail.ll that contains the remaining xfail'd tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297699 91177308-0d34-0410-b5e6-96231b3b80d8

Recommitting Craig Topper's patch now that r296476 has been recommitted.

When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.

This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297698 91177308-0d34-0410-b5e6-96231b3b80d8

In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.

    Recommiting with compiler time improvements

    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297695 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] Reorder includes in test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297692 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] Fix compilation of CustomCrossOverAndMutateTest on Windows

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297690 91177308-0d34-0410-b5e6-96231b3b80d8

Add the beginning of PDB diffing support.

For now this only diffs the stream directory and the MSF
Superblock. Future patches will drill down into individual
streams to find out where the differences lie.

Differential Revision: https://reviews.llvm.org/D30908

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297689 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Debug Info: Add basic support for external types references."

This reverts commit r242302. External type refs of this form were
never used by any LLVM frontend so this is effectively dead code.
(They were introduced to support clang module debug info, but in the
end we came up with a better design that doesn't use this feature at
all.)

rdar://problem/25897929

Differential Revision: https://reviews.llvm.org/D30917

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297684 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: We pass rle-nonlocal, we just perform the replacement in a way that keeps the old name instead of the new one

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297683 91177308-0d34-0410-b5e6-96231b3b80d8

[Thumb1] combine ADDC/SUBC with a negative immediate

Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400

Reviewers: efriedma, jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297682 91177308-0d34-0410-b5e6-96231b3b80d8

Make FileOutputBuffer fail early if you pass a directory.

Previously, it created a temporary directory and then failed when
FileOutputBuffer tried to rename that file to the destination file
(which is actually a directory name).

Differential Revision: https://reviews.llvm.org/D30912

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297679 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix another case where we are copying from a mask register using AH/BH/CH/DH with fastisel.

Fixes PR32256. Still planning to do an audit for other possible cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297678 91177308-0d34-0410-b5e6-96231b3b80d8

Fix llvm-symbolizer to navigate both DW_AT_abstract_origin and DW_AT_specification in a single chain

In a recent refactoring (r291959) this regressed to only following one
or the other, not both, in a single chain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297676 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused lambda capture

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297675 91177308-0d34-0410-b5e6-96231b3b80d8

Fix sign compare warning in unit test by using an explicit unsigned literal suffix

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297674 91177308-0d34-0410-b5e6-96231b3b80d8

[IPRA] Change algorithm for RegUsageInfoCollector.

The previous algorithm for RegUsageInfoCollector had pretty bad
performance on architectures with a lot of registers that alias
a lot one another, because we potentially iterate for every register
over all the aliasing registers. This costs even more if the function
is small and doesn't define a lot of registers.
This patch changes the algorithm to one that while iterating over
all the registers it will iterate over the aliasing registers only
if the register itself is defined.
This should be faster based on the assumption that only a subset
of the whole LLVM registers set is actually defined in the function.

Differential Revision: https://reviews.llvm.org/D30880

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297673 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Follow-up for "Test directory iterators and recursive directory iterators with broken symlinks."

Fix the test by sorting the result vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297672 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Translate ConstantDataVector

Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab

Reviewed By: qcolombet, dsanders, ab

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297670 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Test directory iterators and recursive directory iterators with broken symlinks.

This commit adds a unit test to the file system tests to verify the behavior of
the directory iterator and recursive directory iterator with broken symlinks.

This test is Unix only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297669 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "GlobalISel: move vector extract/insert inside generic opcode region."

I was writing against an earlier branch and Volkan had already fixed this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297668 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][MMX] Fix folding of shift value loads to cover whole 64-bits

rL230225 made the assumption that only the lower 32-bits of an MMX register load is used as a shift value, when in fact the whole 64-bits are reloaded and treated as a i64 to determine the shift value.

This patch reverts rL230225 to ensure that the whole 64-bits of memory are folded and ensures that the upper 32-bit are zero'd for cases where the shift value has come from a scalar source.

Found during fuzz testing.

Differential Revision: https://reviews.llvm.org/D30833

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297667 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: move vector extract/insert inside generic opcode region.

Otherwise they won't be legalized or selected, causing instruction selection to
fail horribly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297666 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r295004 (Add MXCSR) due to errors reported by MachineVerifier

I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297664 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Update PRE_ISEL_GENERIC_OPCODE_END marker

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297663 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Re-use TM.getNullPointerValue

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297662 91177308-0d34-0410-b5e6-96231b3b80d8

Bring back r297624.

The issues was just a missing REQUIRES in the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297661 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] move tests for PR31028 from CGP

Hopefully, this will make sense with a forthcoming patch. If not, we can move these back.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297660 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Treat 0 as private null pointer in addrspacecast lowering

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297658 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Fix crash when multiple raw_fd_ostreams to stdout are created."

This reverts commit r297624.
It was failing on the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297657 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some indenting and line-wrapping issues identified in ProgrammersManual. Make description of debugCounters a little clearer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297656 91177308-0d34-0410-b5e6-96231b3b80d8

[Outliner] Add tail call support

This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.

Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail called
outlined function saves one instruction since no return has to be inserted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297653 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Lower AVX2 gather intrinsics similar to AVX-512. Apply the same input source optimizations to break execution dependencies.

For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297652 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] If gather mask is all ones, force the input to a zero vector.

We were already forcing undef inputs to become a zero vector, this now catches an all ones mask too.

Ideally we'd use undef and let execution dep fix handle picking the best register/clearance for the undef, but I don't think it can handle the early clobber today.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297651 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold icmp/fcmp into icmp intrinsic

The typical use is a library vote function which
compares to 0. Fold the user condition into the intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297650 91177308-0d34-0410-b5e6-96231b3b80d8

[Linker] Provide callback for internalization

Differential Revision: https://reviews.llvm.org/D30738

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297649 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Enhance SDTCisSameNumEltsAs to work with scalar types and use it on extend/trunc/round operations.

Currently we don't enforce that ISD::ANY_EXTEND, ZERO_EXTEND, SIGN_EXTEND, TRUNC, FP_ROUND, FP_EXTEND have the same number of elements(including scalar) between their input and output. Though we have them documented as such. Up until a few months ago x86 created nodes that violated this rule. That's all been fixed now, and we should enforce the rule going forward.

In order to do this we need to allow SDTCisSameNumEltsAs to support scalar types and not enforce being a vector. If one type is scalar we will force the other type to also be scalar.

Differential Revision: https://reviews.llvm.org/D30878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297648 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing include on <limits>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297646 91177308-0d34-0410-b5e6-96231b3b80d8

API gardening: Rename FindAllocaDbgValue to findDbgValue (NFC)
and use have it use SmallVectorImpl.

There is nothing specific about allocas in this function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297643 91177308-0d34-0410-b5e6-96231b3b80d8

Use numeric_limits<size_t>::max() instead of size_t(-1).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297641 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a warning due to signed/unsigned comparison.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297639 91177308-0d34-0410-b5e6-96231b3b80d8

Use the new member accessors of llvm::enumerate.

The value_type is no longer a struct, it's a class whose
members you have to access via a method.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297635 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Improve the genericity of llvm::enumerate().

There were some issues in the implementation of enumerate()
preventing it from being used in various contexts. These were
all related to the fact that it did not supporter llvm's
iterator_facade_base class. So this patch adds support for that
and additionally exposes a new helper method to_vector() that
will evaluate an entire range and store the results in a
vector.

Differential Revision: https://reviews.llvm.org/D30853

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297633 91177308-0d34-0410-b5e6-96231b3b80d8

Remove an unused variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297632 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] add tests for PR31028; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297629 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbdump] Add support for dumping symbols from Yaml -> PDB.

Previously we could round-trip type records from PDB -> Yaml ->
PDB, but for symbols we could only go from PDB -> Yaml. This
completes the round-tripping for symbols as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297625 91177308-0d34-0410-b5e6-96231b3b80d8

Fix crash when multiple raw_fd_ostreams to stdout are created.

If raw_fd_ostream is constructed with the path of "-", it claims
ownership of the stdout file descriptor. This means that it closes
stdout when it is destroyed. If there are multiple users of
raw_fd_ostream wrapped around stdout, then a crash can occur because
of operations on a closed stream.

An example of this would be running something like "clang -S -o - -MD
-MF - test.cpp". Alternatively, using outs() (which creates a local
version of raw_fd_stream to stdout) anywhere combined with such a
stream usage would cause the crash.

The fix duplicates the stdout file descriptor when used within
raw_fd_ostream, so that only that particular descriptor is closed when
the stream is destroyed.

Patch by James Henderson!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297624 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] GlobalISel: Support SP in regbankselect

We used to hit an unreachable in getRegBankFromRegClass when dealing with the
stack pointer. This commit adds support for the GPRsp reg class.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297621 91177308-0d34-0410-b5e6-96231b3b80d8

Reverting r297617 because it broke some bots:

http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/49970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297618 91177308-0d34-0410-b5e6-96231b3b80d8

Add support for getting file system permissions and implement sys::fs::permissions to set them.

Patch by James Henderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297617 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Map Sched Read/Write resources for Falkor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297611 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Set memcheck metadata also for VF==1

This commit is a follow-up on r297580. It fixes the FIXME added temporarily
by that commit to keep the removal of Unroller's specialized version of
scalarizeInstruction() an NFC. See https://reviews.llvm.org/D30715 for details.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297610 91177308-0d34-0410-b5e6-96231b3b80d8

ARMDisassembler: loop over ARM decode tables

Loop over the ARM decode tables; this is a clean-up to reduce some code
duplication.

Differential Revision: https://reviews.llvm.org/D30814

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297608 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/RelocVisitor: Handle R_AMDGPU_ABS64

Test is in the separate patch.

Differential Revision: https://reviews.llvm.org/D30027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297604 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add EVEX2VEX test cases for the cvt instructions fixed in r297599 and r297600.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297603 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[AVX-512] EVEX2VEX, don't reject intrinsic instructions when both have a memory operand. We should just continue to check other operands instead."

This reverts r297596.

There were other issues that were making this not work that have been fixed now. Reverting this results in a more accurate table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297602 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add VEX_WIG to VEX vcvtsd2ss/vcvtss2sd intrinsic instructions so they can be correctly matched by EVEX2VEX table generation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297601 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Use sse_loadf32/f64 for vcvtss2sd and vcvtsd2ss intrinsic patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297600 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Use sse_load_f64/f32 in VCVTSS2SI/VCVTSD2SI patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297599 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] EVEX2VEX, don't reject intrinsic instructions when both have a memory operand. We should just continue to check other operands instead.

This exposed that we have several intrinsic instructions that have identical TSFlags to other instructions. We should merge their patterns and kill of the duplicate. I'll fix that in a follow up patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297596 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Minor formatting tweaks in EVEX to VEX tables. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297595 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove unused SDTypeProfile. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297594 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Lower SSE/AVX cmpps/pd intrinsics directly to X86ISD::CMPP SDNodes.

This allows us to remove a duplicate set of patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297593 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix the valid immediates for the scatter/gather prefetch intrinsics.

The immediate should be 1 or 2, not 0 or 1. This was found while adding bounds checking to clang. In fact the existing clang builtin test failed if we ran it all the way to assembly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297591 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] don't blindly transform SETB into SBB

I noticed unnecessary 'sbb' instructions in D30472 and while looking at 'ptest' codegen recently.
This happens because we were transforming any 'setb' - even when we only wanted a single-bit result.

This patch moves those transforms under visitAdd/visitSub, so we we're only creating sbb/adc when it
is a win. I don't know why we need a SETCC_CARRY node type, but I'm not proposing to change that
existing behavior in this patch.

Also, I'm skeptical that sbb/adc are a win for all micro-arches, so I added comments to the test files
where this transform still fires.

The test changes here are all cases where we no longer produce sbb/adc. Avoiding partial register
stalls (generating an xor to clear a register) is not handled in some cases, but that's a separate
issue.

Differential Revision: https://reviews.llvm.org/D30611

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297586 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI] Add Datalayout to the class LazyValueInfo since all its Impls require it. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297583 91177308-0d34-0410-b5e6-96231b3b80d8

Remove CRC32 instructions from AArch64InstrInfo::hasShiftedReg

Summary:
A53 scheduler causes an assertion failure on all CRC instructions:
include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand
&llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i <
getNumOperands() && "getOperand() out of range!"' failed.

The case statements corresponding to CRC instructions are incorrect and should
be removed.

Also adding a testcase while on this.

Reviewers: t.p.northover, javed.absar, apazos, rengolin

Reviewed By: rengolin

Subscribers: evandro, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30274

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297582 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add vector zext tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297581 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] A unified scalarizeInstruction() for Vectorizer and Unroller; NFC

Unroller's specialized scalarizeInstruction() is mostly duplicating Vectorizer's
variant. OTOH Vectorizer's scalarizeInstruction() already supports the special
case of VF==1 except for avoiding mask-bit extraction in that case. This patch
removes Unroller's specialized version in favor of a unified method.

The only functional difference between the two variants seems to be setting
memcheck metadata for loads and stores only in Vectorizer's variant, which is a
bug in Unroller. To keep this patch an NFC the unified method doesn't set
memcheck metadata for VF==1.

Differential Revision: https://reviews.llvm.org/D30715

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297580 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297579 91177308-0d34-0410-b5e6-96231b3b80d8

Split NewGVN class into a legacy pass and an impl, instead of a merged class.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297576 91177308-0d34-0410-b5e6-96231b3b80d8

Add documentation on debug counters to Programmers Manual.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297575 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix a bad use of a high GR8 register after copying from a mask register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask.

I'm pretty sure there are more problems lurking here. But I think this fixes PR32241.

I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297574 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add test case for PR32241. Fix coming in another commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297573 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Remove unused field in X86VectorVTInfo tablegen class.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297572 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Improve extraction of elements from v16i8 (pre-SSE41)

Without SSE41 (pextrb) we currently extract byte elements from a vector by spilling to stack and reloading the byte.

This patch is an initial attempt at using MOVD/PEXTRW to extract the relevant DWORD/WORD from the vector and then shift+truncate to collect the correct byte.

Extraction of multiple bytes this way would result in code bloat, but as explained in the patch we could probably afford to be more aggressive with the supported extractions before again falling back on spilling - possibly through counting the number of extracts and which DWORD/WORD they originate?

Differential Revision: https://reviews.llvm.org/D29841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297568 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unnecessary whitespace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297567 91177308-0d34-0410-b5e6-96231b3b80d8

Fix signed/unsigned comparison warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297565 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add avx2 gather tests cases that show a failure to remove zeroing of the source when the mask is all ones.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297564 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove unnecessary commented out code. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297563 91177308-0d34-0410-b5e6-96231b3b80d8

Fix signed/unsigned comparison warnings

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297561 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wsentinel warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297560 91177308-0d34-0410-b5e6-96231b3b80d8

Use setBits in SelectionDAG

Summary: As per title.

Reviewers: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30836

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297559 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove packf16 intrinsic

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297557 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Keep track of modifiers when converting v_mac to v_mad

Since v_max_f32_e64/v_max_f16_e64 can be folded if the target
instruction supports the clamp bit, we also need to maintain
modifiers when converting v_mac to v_mad.

This fixes a rendering issue with Dirt Rally because a v_mac
instruction with the clamp bit set was converted to a v_mad
but that bit was lost during the conversion.

Fixes: e184e01dd79 ("AMDGPU: Fold FP clamp as modifier bit")

Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297556 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] add more iterations to LLVMFuzzer-Memcmp64BytesTest

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297554 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add a DenseMapInfo<T> for shorts.

Differential Revision: https://reviews.llvm.org/D30857

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297552 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] reduce the number of vector resizes during merge (https://github.com/google/oss-fuzz/issues/445)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297551 91177308-0d34-0410-b5e6-96231b3b80d8

Fix line endings of DenseMapInfo.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297550 91177308-0d34-0410-b5e6-96231b3b80d8

Remove eol-style:native from DenseMapInfo.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297549 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Add a formatv provider for Twine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297548 91177308-0d34-0410-b5e6-96231b3b80d8