git.osdn.net Git - android-x86/external-llvm.git/log

[AArch64] Create proper memoperand for multi-vector stores

Re-apply r345315 with testcase fixes.

Include all of the store's source vector operands when creating the
MachineMemOperand. Previously, we were missing the first operand,
making the store size seem smaller than it really is.

Differential Revision: https://reviews.llvm.org/D52816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345631 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] In lowerVectorShuffleAsBroadcast, make peeking through CONCAT_VECTORS work correctly if we already walked through a bitcast that changed the element size.

The CONCAT_VECTORS case was using the original mask element count to determine how to adjust the broadcast index. But if we looked through a bitcast the original mask size doesn't tell us anything about the concat_vectors.

This patch switches to using the concat_vectors input element count directly instead.

Differential Revision: https://reviews.llvm.org/D53823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345626 91177308-0d34-0410-b5e6-96231b3b80d8

[GCOV] Function counters are wrong when on one line

Summary:
After commit https://reviews.llvm.org/rL344228, the function definitions have a counter but when on one line the counter is wrong (e.g. void foo() { })
I added a test in: https://reviews.llvm.org/D53601

Reviewers: marco-c

Reviewed By: marco-c

Subscribers: llvm-commits, sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D53600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345624 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Add const variants for BaseIndexOffset functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345623 91177308-0d34-0410-b5e6-96231b3b80d8

Fix printing bug in pdb2yaml.

We were using the wrong enum table when mapping enum values
to strings for public symbol flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345622 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Define base function on DWARFDie reverse iterators

This defines member function base on the specialization of
std::reverse_iterator for DWARFDie::iterator as required by C++
[reverse.iter.conv].
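
For context, `base()` on a `std::reverse_iterator` returns the wrapped forward iterator, which points one past the element the reverse iterator refers to. A minimal sketch using `std::vector` rather than `DWARFDie::iterator` (the helper name `forwardIndexOf` is made up for illustration and is not part of the patch):

```cpp
#include <iterator>
#include <vector>

// Hypothetical helper: returns the forward-order index of the element that
// a reverse iterator refers to. Relies on the standard relationship
// &*rit == &*(rit.base() - 1).
inline int forwardIndexOf(const std::vector<int> &V,
                          std::vector<int>::const_reverse_iterator RIt) {
  // base() yields the wrapped forward iterator, one past the referenced
  // element.
  std::vector<int>::const_iterator Fwd = RIt.base();
  return static_cast<int>(std::distance(V.begin(), Fwd)) - 1;
}
```

Library code that calls `base()` on a reverse iterator, as the debug-mode checks mentioned below do, needs that member to exist on any specialization.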

This fixes unit test DWARFDebugInfoTest.cpp under EXPENSIVE_CHECKS which
currently can't be built due to GNU C++ Library calling this member
function in debug mode.

This fixes https://llvm.org/PR38785

Patch by: Eugene Sharygin

Differential revision: https://reviews.llvm.org/D53792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345621 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Simplify LRV/STRV ISD nodes

The LRV and STRV nodes carry an extra operand to indicate the
type of the memory access. This is redundant, since the nodes
are actually of class MemIntrinsicNode and therefore hold that
same information already as MemoryVT.

NFC intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345618 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)

Correct costing of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.

Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!

I've done my best to fix a number of vectorizer uses:

SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.

LV - I'm not clear on what the SK_ExtractSubvector should represent for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.

Differential Revision: https://reviews.llvm.org/D53573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345617 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add preliminary tests for nested min/max combines. NFC

Summary: As requested in D53774.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53875

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345616 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for fcmp folds; NFC

This is part of a problem noted in PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345615 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix --keep-global-symbol/--globalize-symbol for undefined symbols.

Summary: --keep-global-symbol and --globalize-symbol don't make sense for undefined symbols, so they should be ignored for those symbols. This matches GNU objcopy behavior.

Reviewers: jhenderson, alexshap, jakehehrlich, espindola

Reviewed By: jhenderson, jakehehrlich

Subscribers: emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D53733

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345614 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] use getFltSemantics() instead of duplicating it; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345613 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Move namespace mca inside llvm::

Summary: This allows us to remove `using namespace llvm;` in those *.cpp files

When we want to revisit the decision (everything resides in llvm::mca::*) in the future, we can move things to a nested namespace of llvm::mca::, to conceptually make them separate from the rest of llvm::mca::*
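
The reason the using-directive becomes unnecessary is ordinary C++ name lookup: code inside a nested namespace also searches the enclosing one. A minimal sketch, with a stand-in type (`StringRefLike` and `describe` are hypothetical and do not exist in LLVM):

```cpp
#include <string>

namespace llvm {
// Stand-in for a real LLVM type; hypothetical, for illustration only.
struct StringRefLike {
  std::string Data;
};

namespace mca {
// Inside llvm::mca, names from llvm:: resolve without qualification or a
// using-directive, because the enclosing namespace is searched too.
inline std::string describe(const StringRefLike &S) {
  return "mca:" + S.Data;
}
} // namespace mca
} // namespace llvm
```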

Reviewers: andreadb, mattd

Reviewed By: andreadb

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D53407

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345612 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] try to turn shuffle into insertelement

shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'
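
The pattern can be checked on a small model. The sketch below uses 4-element integer arrays in place of IR vectors; `Vec4`, `Mask4`, `shuffle` and `insertElement` are illustrative names, not LLVM APIs. Mask values 0..3 select from the first operand and 4..7 from the second.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<int, 4>;

// Two-operand shuffle: mask values 0..3 select from A, 4..7 select from B.
inline Vec4 shuffle(const Vec4 &A, const Vec4 &B, const Mask4 &M) {
  Vec4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = M[I] < 4 ? A[M[I]] : B[M[I] - 4];
  return R;
}

// Returns a copy of V with element Idx replaced by Scalar.
inline Vec4 insertElement(Vec4 V, int Scalar, std::size_t Idx) {
  V[Idx] = Scalar;
  return V;
}
```

With a mask such as `{4, 5, 1, 7}`, every lane but one is taken from V1 in place, and the remaining lane reads the inserted scalar (index 1 of the first operand), so the shuffle equals inserting that scalar into V1 at lane 2.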

The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start
with or instcombine has converted insert/extract to shuffles.

Independent of that, an insertelement is always a simpler op for IR
analysis vs. a shuffle, so we should transform to insert when possible.

I don't think there's any codegen concern here - if a target can't insert
a scalar directly to some fixed element in a vector (x86?), then this
should get expanded to the insert+shuffle that we started with.

Differential Revision: https://reviews.llvm.org/D53507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345607 91177308-0d34-0410-b5e6-96231b3b80d8

[SchedModel] Fix for read advance cycles with implicit pseudo operands.

The SchedModel allows the addition of ReadAdvances to express that certain
operands of the instructions are needed at a later point than the others.

RegAlloc may add pseudo operands that are not part of the instruction
descriptor, and therefore cannot have any read advance entries. This meant
that in some cases the desired read advance was nullified by such a pseudo
operand, which still had the original latency.

This patch fixes this by making sure that such pseudo operands get a zero
latency during DAG construction.

Review: Matthias Braun, Ulrich Weigand.
https://reviews.llvm.org/D49671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345606 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorizer]  Fix for cost values of memory accesses.

This commit is a combination of two patches:

* "Fix in getScalarizationOverhead()"

   If target returns false in TTI.prefersVectorizedAddressing(), it means the
   address registers will not need to be extracted. Therefore, there should
   be no operands scalarization overhead for a load instruction.

* "Don't pass the instruction pointer from getMemInstScalarizationCost."

   Since VF is always > 1, this is a cost query for an instruction in the
   vectorized loop and it should not be evaluated within the scalar
   context of the instruction.

Review: Ulrich Weigand, Hal Finkel
https://reviews.llvm.org/D52351
https://reviews.llvm.org/D52417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345603 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] narrow vector binops when extraction is cheap

Narrowing vector binops came up in the demanded bits discussion in D52912.

I don't think we're going to be able to do this transform in IR as a canonicalization
because of the risk of creating unsupported widths for vector ops, but we already have
a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is
currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression
test diffs).

This is artificially limited to not look through bitcasts because there are so many
test diffs already, but that's marked with a TODO and is a small follow-up.

Differential Revision: https://reviews.llvm.org/D53784

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345602 91177308-0d34-0410-b5e6-96231b3b80d8

[FIX][AArch64] Add support for UDF instruction

Fix: Simplify test files from rL345581 that were failing
on Windows bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345601 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] fix build warning for mismatched signs in compare; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345598 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.

Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a
load. This patch adds a check for this.

Review: Ulrich Weigand
https://reviews.llvm.org/D53791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345596 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Re-enable the machine verifier after fixing more tests

Was disabled again in r345528. Hopefully this fixes the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345593 91177308-0d34-0410-b5e6-96231b3b80d8

[llc] Error out when -print-machineinstrs is used with an unknown pass

We used to assert instead of reporting an error.

PR39494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345589 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-size] Reject unknown radix values

This addresses https://bugs.llvm.org/show_bug.cgi?id=39403 by making
-radix an enumeration option with 8, 10, and 16 as the only accepted
values.
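
The idea of restricting an option to an enumerated set can be sketched independently of LLVM's command-line library. The following is a standalone illustration, not the code from the patch; `Radix` and `parseRadix` are hypothetical names.

```cpp
// Hypothetical enumeration of the only accepted radix values.
enum class Radix { Octal = 8, Decimal = 10, Hexadecimal = 16 };

// Returns true and sets Out when Value is one of the accepted radices;
// returns false (leaving Out untouched) for anything else.
inline bool parseRadix(unsigned Value, Radix &Out) {
  switch (Value) {
  case 8:
    Out = Radix::Octal;
    return true;
  case 10:
    Out = Radix::Decimal;
    return true;
  case 16:
    Out = Radix::Hexadecimal;
    return true;
  default:
    return false;
  }
}
```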

Reviewed by: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D53799

Patch by Eugene Sharygin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345588 91177308-0d34-0410-b5e6-96231b3b80d8

[FIX][AArch64] Add support for UDF instruction

Fix wrong test files submitted
in rL345581

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345587 91177308-0d34-0410-b5e6-96231b3b80d8

[SROA] Use offset sizes from the DataLayout instead of the pointer sizes.

This fixes an assertion when constant folding a GEP where part of the offset
was in i32 (IndexSize, as per DataLayout) and part in i64 (PointerSize) in
the newly created test case.

Differential Revision: https://reviews.llvm.org/D52609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345585 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][BMI1] X86DAGToDAGISel: select BEXTR from x & (-1 >> (32 - y)) pattern

Summary:
The final pattern.
There are no test changes:
* We are looking for the pattern with one-use of its mask,
* If the mask is one-use, D48768 will unfold it into pattern d.
* Thus, the tests have extra-use on the mask.
* Thus, only the BMI2 BZHI can be tested, and it already worked.
* So there is no BMI1 test coverage, we just assume it works since it uses the same codepath.
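
For reference, the source-level shape of the pattern in the title, written as a plain C++ function. This is only a sketch of what the expression computes, not of the instruction selection code; `extractLowBits` is an illustrative name.

```cpp
#include <cstdint>

// Keeps the low Y bits of X using the mask (-1 >> (32 - Y)).
// Valid only for 1 <= Y <= 32 here; Y == 0 would shift by 32, which is
// undefined behavior in C++.
inline std::uint32_t extractLowBits(std::uint32_t X, unsigned Y) {
  std::uint32_t Mask = ~std::uint32_t{0} >> (32u - Y);
  return X & Mask;
}
```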

Reviewers: craig.topper, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53575

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345584 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add extra-uses on the mask of pattern c of extract-{low,}bits.ll tests

Summary:
Because of the D48768, that pattern is always unfolded into pattern d,
thus we had no test coverage.

Reviewers: RKSimon, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345583 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add support for UDF instruction

Summary: Add support for AArch64 UDF instruction.
UDF - Permanently Undefined generates an Undefined
Instruction exception (ESR_ELx.EC = 0b000000).

Reviewers: DavidSpickett, javed.absar, t.p.northover

Reviewed By: javed.absar

Subscribers: nhaehnle, kristof.beyls

Differential Revision: https://reviews.llvm.org/D53319

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345581 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes

Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.

This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle.

Differential Revision: https://reviews.llvm.org/D53760

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345578 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Improve X div/rem Y fold if single bit element type

Summary: Tests by @spatel, thanks

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: sdardis, atanasyan, llvm-commits, spatel

Differential Revision: https://reviews.llvm.org/D52668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345575 91177308-0d34-0410-b5e6-96231b3b80d8

[LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with vector output type and a vector input type that needs to be widened

Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed. This will avoid going through the stack and allows us to remove a custom version of this legalization from X86.

Reviewers: efriedma, RKSimon

Reviewed By: efriedma

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345567 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add test case for D53229. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345566 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Cleanup the code in LowerFABSorFNEG and LowerFCOPYSIGN a little. NFC

Use SelectionDAG::EVTToAPFloatSemantics. Make the LogicVT calculation in LowerFABSorFNEG similar to LowerFCOPYSIGN. Use APInt::getSignedMaxValue instead of ~APInt::getSignMask.
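
The last substitution relies on both expressions producing the same bit pattern: inverting a mask with only the sign bit set leaves every bit except the sign bit set, which is the maximum signed value. A sketch on plain 32-bit integers rather than APInt (the constant and function names are made up for illustration, and `float` is assumed to be IEEE-754 binary32):

```cpp
#include <cstdint>
#include <cstring>

// Only the top bit set.
constexpr std::uint32_t SignMask32 = 0x80000000u;
// Largest signed 32-bit value: every bit set except the top one.
constexpr std::uint32_t SignedMax32 = 0x7FFFFFFFu;

static_assert(SignedMax32 == static_cast<std::uint32_t>(~SignMask32),
              "inverting the sign mask gives the signed maximum");
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Clears the sign bit of F by AND-ing its bits with SignedMax32.
inline float absViaMask(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  Bits &= SignedMax32;
  std::memcpy(&F, &Bits, sizeof F);
  return F;
}
```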

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345565 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Stop changing f128 fand/for/fxor to v2i64.

The additional patterns don't cost us much and it seems better than changing element widths.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345564 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove custom BUILD_VECTOR combine

This was looping in a testcase and removing it
now slightly improves a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345560 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Use scavengeRegisterBackwards

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345559 91177308-0d34-0410-b5e6-96231b3b80d8

Remove dead declaration

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345555 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typos in comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345554 91177308-0d34-0410-b5e6-96231b3b80d8

Pass TRI to printReg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345553 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unneeded friend declarations that clang-cl warns on

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345549 91177308-0d34-0410-b5e6-96231b3b80d8

[AliasSetTracker] Cleanup addPointer interface. [NFCI]

Summary:
Attempting to simplify the addPointer interface.
Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call.

Reviewers: reames, mkazantsev

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53836

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345548 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF][NFC] Refactor range list extraction and dumping

The purpose of this patch is twofold:
- Fold pre-DWARF v5 functionality into v5 to eliminate the need for 2 different
  versions of range list handling. We get rid of DWARFDebugRangelist{.cpp,.h}.
- Templatize the handling of range list tables so that location list handling
  can take advantage of it as well. Location list and range list tables have the
  same basic layout.

A non-NFC version of this patch was previously submitted with r342218, but it caused
errors with some TSan tests. This patch has no functional changes. The difference to
the non-NFC patch is that there are no changes to rangelist dumping in this patch.

Differential Revision: https://reviews.llvm.org/D53545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345546 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Move elf-specific code into subfolder

In this diff the elf-specific code is moved into the subfolder ELF
(and factored out from llvm-objcopy.cpp).

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D53790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345544 91177308-0d34-0410-b5e6-96231b3b80d8

Add parens to fix incorrect assert check.

&& has higher priority than ||, so this assert works really oddly. Add
parens to match the programmer's intent.
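
A small sketch of the precedence issue, using generic conditions rather than the assert from the patch (the function names are illustrative). The first function spells out how `A || B && C` is parsed; the second is the grouping one gets only with explicit parentheses.

```cpp
// How the compiler parses `A || B && C`: && binds tighter than ||.
inline bool asParsed(bool A, bool B, bool C) { return A || (B && C); }

// What a reader may have meant if the || was supposed to group first.
inline bool withParens(bool A, bool B, bool C) { return (A || B) && C; }
```

The two differ whenever `A` is true and `C` is false: `asParsed(true, false, false)` is true, while `withParens(true, false, false)` is false.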

Change-Id: I3abe1361ee0694462190c5015779db664012f3d4

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345543 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Enable code object v3 by default

Differential Revision: https://reviews.llvm.org/D53525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345542 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for abs/nabs+icmp folding; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345541 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] NFC. Factor out runtime-loop.ll common test behavior.

Adding COMMON prefix to get common part handled there.
Needed to simplify test changes for D53440.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345538 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Inherit target features from parent function

If a function has target features, it may contain instructions that aren't
represented in the default set of instructions. If the outliner pulls out one
of these instructions, and the function doesn't have the right attributes
attached, we'll run into an LLVM error explaining that the target doesn't
support the necessary feature for the instruction.

This makes outlined functions inherit target features from their parents.

It also updates the machine-outliner.ll test to check that we're properly
inheriting target features.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345535 91177308-0d34-0410-b5e6-96231b3b80d8

Relax fast register allocator related test cases; NFC

- Relax hard coded registers and stack frame sizes
- Some test cleanups
- Change phi-dbg.ll to match on mir output after phi elimination instead
of going through the whole codegen pipeline.

This is in preparation for https://reviews.llvm.org/D52010
I'm committing all the test changes upfront that work before and after
independently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345532 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Set isMachineVerifierClean() back to false (PR27481)

Put back the isMachineVerifierClean() override removed at rL345513 to fix Windows ThinLTO tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345528 91177308-0d34-0410-b5e6-96231b3b80d8

[HotColdSplitting] Allow outlining single-block cold regions

It can be profitable to outline single-block cold regions because they
may be large.

Allow outlining single-block regions if they have over some threshold of
non-debug, non-terminator instructions. I chose 3 as the threshold after
experimenting with several internal frameworks.

In practice, reducing the threshold further did not give much
improvement, whereas increasing it resulted in substantial regressions.

Differential Revision: https://reviews.llvm.org/D53824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345524 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Lower away condition truncations for scalar selects

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345521 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] getFauxShuffleMask - Fix shuffle mask adjustment for multiple inserted subvectors

Part of the issue discovered in PR39483, although it's not fully exposed until I reapply rL345395 (by reverting rL345451)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345520 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add AES to KNL CPUs to match clang.

I believe this was lost from KNL when AES was pushed from Westmere to Skylake recently. KNL used to inherit from IVB.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345519 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fixed return value causing warning and regression

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345518 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Rename FP16FML instruction format (NFC)

Rename SIMDThreeSameMult (etc.) to SIMDThreeSameVectorFML (etc.) to follow
usual naming convention, and add some comments in the .td files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345515 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Match v_swap_b32

Differential Revision: https://reviews.llvm.org/D52677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345514 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Enable the MachineVerifier by default

The machine verifier was disabled for x86 by default. There are now only
9 tests failing, compared to what previously was between 20 and 30.

This is a good opportunity to file bugs for all the remaining issues,
then explicitly disable the failing tests and enable the machine
verifier by default.

This allows us to avoid adding new tests that break the verifier.

PR27481

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345513 91177308-0d34-0410-b5e6-96231b3b80d8

[Intrinsic] Signed and Unsigned Saturation Subtraction Intrinsics

Add an intrinsic that takes 2 integers and performs saturation subtraction on
them.
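
Saturation subtraction clamps the result to the range of the type instead of wrapping. A minimal sketch of the signed and unsigned semantics for 32-bit integers, written as ordinary C++ rather than as calls to the new intrinsics (`ssubSat` and `usubSat` are illustrative names):

```cpp
#include <cstdint>
#include <limits>

// Signed: compute in 64 bits, then clamp to the int32_t range.
inline std::int32_t ssubSat(std::int32_t A, std::int32_t B) {
  const std::int64_t Wide =
      static_cast<std::int64_t>(A) - static_cast<std::int64_t>(B);
  if (Wide > std::numeric_limits<std::int32_t>::max())
    return std::numeric_limits<std::int32_t>::max();
  if (Wide < std::numeric_limits<std::int32_t>::min())
    return std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(Wide);
}

// Unsigned: the result cannot go below zero.
inline std::uint32_t usubSat(std::uint32_t A, std::uint32_t B) {
  return A > B ? A - B : 0u;
}
```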

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D53783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345512 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Return address signing B key support

- Add support to generate AUTIBSP, PACIBSP, RETAB instructions for return
address signing
- The key used to sign the function is controlled by the function attribute
"sign-return-address-key"

Differential Revision: https://reviews.llvm.org/D51427

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345511 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add Builder Bindings to Common Memory Intrinsics

Summary: Add IRBuilder bindings for memmove, memcpy, and memset.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: harlanhaskins, llvm-commits

Differential Revision: https://reviews.llvm.org/D53555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345508 91177308-0d34-0410-b5e6-96231b3b80d8

[git/svn] Ignore Visual Studio's CMakeSettings.json.

When using Visual Studio's built-in support for CMake, the CMakeSettings.json contains the build configurations (build dir, generator, toolchain, cmake variables, etc). It is specific to the build machine, therefore should not be versioned.

Differential Revision: https://reviews.llvm.org/D53775

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345504 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Don't crash when using `-a` on non-archives

This fixes PR39402. The crash was caused when dereferencing nullptr in
DumpObject and printArchiveChild.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D53690

Patch by Xing GUO

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345503 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove outdated test

This test breaks the X86 MachineVerifier. It looks like the MIR part is
completely useless.

The original author suggests that it can be removed.

Differential Revision: https://reviews.llvm.org/D53767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345501 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Lower to mca::Instruction before the pipeline is run.

Before this change, the lowering of instructions from llvm::MCInst to
mca::Instruction was done as part of the first stage of the pipeline (i.e. the
FetchStage). In particular, FetchStage was responsible for picking the next
instruction from the source sequence, and lowering it to an mca::Instruction with
the help of an object of class InstrBuilder.

The dependency on InstrBuilder was problematic for a number of reasons. Class
InstrBuilder only knows how to lower from llvm::MCInst to mca::Instruction.
That means, it is hard to support a different scenario where instructions
in input are not instances of class llvm::MCInst. Even if we managed to
specialize InstrBuilder, and generalize most of its internal logic, the
dependency on InstrBuilder in FetchStage would have caused more trouble (other
than complicating the pipeline logic).

With this patch, the lowering step is done before the pipeline is run. The
pipeline is no longer responsible for lowering from MCInst to mca::Instruction.
As a consequence of this, the FetchStage no longer needs to interact with an
InstrBuilder. The mca::SourceMgr class now simply wraps a reference to a
sequence of mca::Instruction objects.
This simplifies the logic of FetchStage, and increases the usability of it. As
a result, on a debug build, we see a 7-9% speedup; on a release build, the
speedup is around 3-4%.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345500 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][UpdateTestChecks] Don't try to align blocks that have already been subject to alignment in update_mca_test_checks.py

This fixes PR39466.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345499 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add '--full-contents' as alias for '-s'

This fixes PR39404.

Reviewed By: jhenderson

Patch by Xing Guo

Differential Revision: https://reviews.llvm.org/D53576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345495 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM][NFC] Fix test inlineasm-X-allocation.ll

Differential Revision: https://reviews.llvm.org/D53748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345491 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Force floating point values in constant pool decoding to print in scientific notation so they can't be confused with integers.

When the floating point constants are whole numbers they have no decimal point so look like integers, but mean something very different in something like an 'and' instruction.

Ideally we would just print a decimal point and a 0, but I couldn't see how to make APFloat::toString do that.
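
The ambiguity is easy to reproduce with standard formatting. The sketch below uses `snprintf` rather than `APFloat::toString`, so it only illustrates the general problem, not the exact output of the patch; the function names are made up.

```cpp
#include <cstdio>
#include <string>

// General format: a whole-number value prints with no decimal point.
inline std::string formatGeneral(double V) {
  char Buf[64];
  std::snprintf(Buf, sizeof Buf, "%g", V);
  return Buf;
}

// Scientific format: always shows a mantissa and an exponent.
inline std::string formatScientific(double V) {
  char Buf[64];
  std::snprintf(Buf, sizeof Buf, "%e", V);
  return Buf;
}
```

Here `formatGeneral(4.0)` yields text indistinguishable from the integer 4, while the scientific form always carries an exponent marker.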

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345488 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Recognize constant splats in LowerFCOPYSIGN.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345484 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case to show failure to handle splat vectors in the constant check in LowerFCOPYSIGN.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345483 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Revert "DebugInfo: reduce DIE range verification on object files""

This reverts commit 836c763dadbd9478fa35b1a291a38bf17aa206ba. Default
initialize the values that MSAN caught.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345482 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Fix bad indentation. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345481 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix SNB counter definition and handling.

Summary: SNB is the only one that has P23 as a single proc res.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345480 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Move i64/vXi64 to f32/vXf32 UINT_TO_FP handling to TargetLowering::expandUINT_TO_FP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345478 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][NFC] sse42-schedule.ll: disable XOP for BdVer2 tests

Else we are clearly testing the wrong instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345476 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][NFC] sse41-schedule.ll: disable XOP for BdVer2 tests

Else we are clearly testing the wrong instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345475 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][NFC] sse2-schedule.ll: disable XOP for BdVer2 tests

Else we are clearly testing the wrong instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345474 91177308-0d34-0410-b5e6-96231b3b80d8

[VectorLegalizer] Enable TargetLowering::expandFP_TO_UINT support.

Add vector support to TargetLowering::expandFP_TO_UINT.

This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunc this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345473 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Better constant vector support for FCOPYSIGN.

Enable constant folding when both operands are vectors of constants.

Turn into FNEG/FABS when the RHS is a splat constant vector.
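
The scalar identity behind the second fold: when the sign operand is a known constant, copysign reduces to an absolute value, negated if the constant is negative. A sketch using `<cmath>` on scalars rather than DAG nodes (`foldCopysign` is an illustrative name):

```cpp
#include <cmath>

// Equivalent of copysign(X, C) when the sign of C is known:
// fabs(X) for a non-negative sign, -fabs(X) for a negative one.
inline double foldCopysign(double X, double C) {
  return std::signbit(C) ? -std::fabs(X) : std::fabs(X);
}
```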

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345469 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases showing missed opportunities for optimizing vector fcopysign when the RHS is a splat constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345468 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] collect_and_build_with_pgo.py: revert part already fixed in rL345461

The change was inadvertently included in my last commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345467 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Fix _run_benchmark in collect_and_build_with_pgo.py

Summary: Also fix a FIXME in _build_stage1_clang: clang llvm-profdata profile are sufficient

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53795

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345466 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r344172: [LV] Add a new reduction pattern match

This patch has caused fast-math issues in the reduction pattern.

Will re-work and land again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345465 91177308-0d34-0410-b5e6-96231b3b80d8

AMD BdVer2 (Piledriver) Initial Scheduler model

Summary:
# Overview
This is somewhat partial.
* Latencies are good {F7371125}
  * All of these remaining inconsistencies //appear// to be noise/noisy/flaky.
* NumMicroOps are somewhat good {F7371158}
  * Most of the remaining inconsistencies are from `Ld` / `Ld_ReadAfterLd` classes
* Actual unit occupation (pipes, `ResourceCycles`) are undiscovered lands, i did not really look there.
  They are basically a verbatim copy from `btver2`
* Many `InstRW`. And there are still inconsistencies left...

To be noted:
I think this is the first new schedule profile produced with the new next-gen tools like llvm-exegesis!

# Benchmark
I realize that isn't what was suggested, but i'll start with some "internal" public real-world benchmark i understand - [[ https://github.com/darktable-org/rawspeed | RawSpeed raw image decoding library ]].
Diff (the exact clang from trunk without/with this patch):
```
Comparing /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench to /home/lebedevri/rawspeed/build-new/src/utilities/rsbench/rsbench
Benchmark                                                                                        Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_pvalue                             0.0000          0.0000      U Test, Repetitions: 25 vs 25
Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_mean                              -0.0607         -0.0604           234           219           233           219
Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_median                            -0.0630         -0.0626           233           219           233           219
Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_stddev                            +0.2581         +0.2587             1             2             1             2
Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_pvalue                             0.0000          0.0000      U Test, Repetitions: 25 vs 25
Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_mean                              -0.0770         -0.0767           144           133           144           133
Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_median                            -0.0767         -0.0763           144           133           144           133
Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_stddev                            -0.4170         -0.4156             1             0             1             0
Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_pvalue                                          0.0000          0.0000      U Test, Repetitions: 25 vs 25
Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_mean                                           -0.0271         -0.0270           463           450           463           450
Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_median                                         -0.0093         -0.0093           453           449           453           449
Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_stddev                                         -0.7280         -0.7280            13             4            13             4
Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_pvalue                                          0.0004          0.0004      U Test, Repetitions: 25 vs 25
Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_mean                                           -0.0065         -0.0065           569           565           569           565
Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_median                                         -0.0077         -0.0077           569           564           569           564
Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_stddev                                         +1.0077         +1.0068             2             5             2             5
Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_pvalue                                          0.0220          0.0199      U Test, Repetitions: 25 vs 25
Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_mean                                           +0.0006         +0.0007           312           312           312           312
Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_median                                         +0.0031         +0.0032           311           312           311           312
Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_stddev                                         -0.7069         -0.7072             4             1             4             1
Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_pvalue                                          0.0004          0.0004      U Test, Repetitions: 25 vs 25
Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_mean                                           -0.0015         -0.0015           141           141           141           141
Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_median                                         -0.0010         -0.0011           141           141           141           141
Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_stddev                                         -0.1486         -0.1456             0             0             0             0
Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_pvalue                                          0.6139          0.8766      U Test, Repetitions: 25 vs 25
Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_mean                                           -0.0008         -0.0005            60            60            60            60
Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_median                                         -0.0006         -0.0002            60            60            60            60
Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_stddev                                         -0.1467         -0.1390             0             0             0             0
Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_pvalue                                          0.0137          0.0137      U Test, Repetitions: 25 vs 25
Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_mean                                           +0.0002         +0.0002           275           275           275           275
Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_median                                         -0.0015         -0.0014           275           275           275           275
Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_stddev                                         +3.3687         +3.3587             0             2             0             2
Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_pvalue                                     0.4041          0.3933      U Test, Repetitions: 25 vs 25
Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_mean                                      +0.0004         +0.0004            67            67            67            67
Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_median                                    -0.0000         -0.0000            67            67            67            67
Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_stddev                                    +0.1947         +0.1995             0             0             0             0
Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_pvalue                              0.0074          0.0001      U Test, Repetitions: 25 vs 25
Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_mean                               -0.0092         +0.0074           547           542            25            25
Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_median                             -0.0054         +0.0115           544           541            25            25
Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_stddev                             -0.4086         -0.3486             8             5             0             0
Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_pvalue                                        0.3320          0.0000      U Test, Repetitions: 25 vs 25
Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_mean                                         +0.0015         +0.0204           218           218            12            12
Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_median                                       +0.0001         +0.0203           218           218            12            12
Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_stddev                                       +0.2259         +0.2023             1             1             0             0
GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_pvalue                                      0.0000          0.0001      U Test, Repetitions: 25 vs 25
GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_mean                                       -0.0209         -0.0179            96            94            90            88
GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_median                                     -0.0182         -0.0155            95            93            90            88
GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_stddev                                     -0.6164         -0.2703             2             1             2             1
Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_pvalue                                     0.0000          0.0000      U Test, Repetitions: 25 vs 25
Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_mean                                      -0.0098         -0.0098           176           175           176           175
Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_median                                    -0.0126         -0.0126           176           174           176           174
Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_stddev                                    +6.9789         +6.9157             0             2             0             2
Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_pvalue                 0.0000          0.0000      U Test, Repetitions: 25 vs 25
Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_mean                  -0.0237         -0.0238           474           463           474           463
Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_median                -0.0267         -0.0267           473           461           473           461
Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_stddev                +0.7179         +0.7178             3             5             3             5
Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_pvalue                   0.6837          0.6554      U Test, Repetitions: 25 vs 25
Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_mean                    -0.0014         -0.0013          1375          1373          1375          1373
Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_median                  +0.0018         +0.0019          1371          1374          1371          1374
Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_stddev                  -0.7457         -0.7382            11             3            10             3
Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_pvalue                                        0.0000          0.0000      U Test, Repetitions: 25 vs 25
Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_mean                                         -0.0080         -0.0289            22            22            10            10
Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_median                                       -0.0070         -0.0287            22            22            10            10
Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_stddev                                       +1.0977         +0.6614             0             0             0             0
Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_pvalue                                       0.0000          0.0000      U Test, Repetitions: 25 vs 25
Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_mean                                        +0.0132         +0.0967            35            36            10            11
Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_median                                      +0.0132         +0.0956            35            36            10            11
Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_stddev                                      -0.0407         -0.1695             0             0             0             0
Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_pvalue                                      0.0000          0.0000      U Test, Repetitions: 25 vs 25
Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_mean                                       +0.0331         +0.1307            13            13             6             6
Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_median                                     +0.0430         +0.1373            12            13             6             6
Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_stddev                                     -0.9006         -0.8847             1             0             0             0
Pentax/645Z/IMGP2837.PEF/threads:8/real_time_pvalue                                            0.0016          0.0010      U Test, Repetitions: 25 vs 25
Pentax/645Z/IMGP2837.PEF/threads:8/real_time_mean                                             -0.0023         -0.0024           395           394           395           394
Pentax/645Z/IMGP2837.PEF/threads:8/real_time_median                                           -0.0029         -0.0030           395           394           395           393
Pentax/645Z/IMGP2837.PEF/threads:8/real_time_stddev                                           -0.0275         -0.0375             1             1             1             1
Phase One/P65/CF027310.IIQ/threads:8/real_time_pvalue                                          0.0232          0.0000      U Test, Repetitions: 25 vs 25
Phase One/P65/CF027310.IIQ/threads:8/real_time_mean                                           -0.0047         +0.0039           114           113            28            28
Phase One/P65/CF027310.IIQ/threads:8/real_time_median                                         -0.0050         +0.0037           114           113            28            28
Phase One/P65/CF027310.IIQ/threads:8/real_time_stddev                                         -0.0599         -0.2683             1             1             0             0
Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_pvalue                          0.0000          0.0000      U Test, Repetitions: 25 vs 25
Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_mean                           +0.0206         +0.0207           405           414           405           414
Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_median                         +0.0204         +0.0205           405           414           405           414
Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_stddev                         +0.2155         +0.2212             1             1             1             1
Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_pvalue                         0.0000          0.0000      U Test, Repetitions: 25 vs 25
Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_mean                          -0.0109         -0.0108           147           145           147           145
Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_median                        -0.0104         -0.0103           147           145           147           145
Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_stddev                        -0.4919         -0.4800             0             0             0             0
Samsung/NX3000/_3184416.SRW/threads:8/real_time_pvalue                                         0.0000          0.0000      U Test, Repetitions: 25 vs 25
Samsung/NX3000/_3184416.SRW/threads:8/real_time_mean                                          -0.0149         -0.0147           220           217           220           217
Samsung/NX3000/_3184416.SRW/threads:8/real_time_median                                        -0.0173         -0.0169           221           217           220           217
Samsung/NX3000/_3184416.SRW/threads:8/real_time_stddev                                        +1.0337         +1.0341             1             3             1             3
Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_pvalue                                         0.0001          0.0001      U Test, Repetitions: 25 vs 25
Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_mean                                          -0.0019         -0.0019           194           193           194           193
Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_median                                        -0.0021         -0.0021           194           193           194           193
Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_stddev                                        -0.4441         -0.4282             0             0             0             0
Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_pvalue                                0.0000          0.4263      U Test, Repetitions: 25 vs 25
Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_mean                                 +0.0258         -0.0006            81            83            19            19
Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_median                               +0.0235         -0.0011            81            82            19            19
Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_stddev                               +0.1634         +0.1070             1             1             0             0
```
{F7443905}
If we look at the `_mean`s, the time column, the biggest win is `-7.7%` (`Canon/EOS 5D Mark II/10.canon.sraw2.cr2`),
and the biggest loss is `+3.3%` (`Panasonic/DC-GH5S/P1022085.RW2`).
Overall: mean `-0.7436%`, median `-0.23%`, `cbrt(sum(time^3))` = `-8.73%`
Looks good so far i'd say.
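
For reference, a short Python sketch of how the three summary figures can be recomputed from the table. The list holds the "Time" column of the 25 `_mean` rows, copied from the diff above. The variable names are illustrative, and reading `cbrt(sum(time^3))` as a signed cube root of the summed cubes is an assumption.

```python
import math
import statistics

# Relative change in real time for each `_mean` row, in table order.
time_mean_deltas = [
    -0.0607, -0.0770, -0.0271, -0.0065, +0.0006,
    -0.0015, -0.0008, +0.0002, +0.0004, -0.0092,
    +0.0015, -0.0209, -0.0098, -0.0237, -0.0014,
    -0.0080, +0.0132, +0.0331, -0.0023, -0.0047,
    +0.0206, -0.0109, -0.0149, -0.0019, +0.0258,
]

mean_delta = statistics.mean(time_mean_deltas)
median_delta = statistics.median(time_mean_deltas)

# Cubing keeps the sign and weights large changes far more than small ones.
cube_sum = sum(d ** 3 for d in time_mean_deltas)
cbrt_delta = math.copysign(abs(cube_sum) ** (1.0 / 3.0), cube_sum)

print(f"mean   {mean_delta:+.4%}")
print(f"median {median_delta:+.2%}")
print(f"cbrt   {cbrt_delta:+.2%}")
```

The cube-root aggregate is dominated by the two Canon sRAW rows, which is why it is so much larger in magnitude than the mean.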

llvm-exegesis details:
{F7371117} {F7371125}
{F7371128} {F7371144} {F7371158}

Reviewers: craig.topper, RKSimon, andreadb, courbet, avt77, spatel, GGanesh

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52779

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345463 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][X86] Baseline tests for AMD BdVer2 (Piledriver) Scheduler model

Adding the baseline tests in a preparatory NFC commit,
so that the actual commit shows the *diff*.

Yes, i'm aware that a few of these codegen-based sched tests
are testing wrong instructions, i will fix that afterwards.

For https://reviews.llvm.org/D52779

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345462 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Run tests in the proper directory.

The intent here was to run check-llvm/check-clang in the instrumented
clang's build directory, not the maybe-not-yet-created uninstrumented
clang's. Oops. :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345461 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] LowerVSELECT - pull out repeated getOperand(). NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345458 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "DebugInfo: reduce DIE range verification on object files"

This reverts commits r345441 and r345444, they were causing msan
buildbot failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345457 91177308-0d34-0410-b5e6-96231b3b80d8

[Local] Keep K's range if K does not move when combining metadata.

As K has to dominate I, IIUC I's range metadata must be a subset of
K's. After Eli's recent clarification to the LangRef, loading a value
outside of the range is undefined behavior.
Therefore if I's range contains elements outside of K's range and we would load
one such value, K would cause undefined behavior.

In cases like hoisting/sinking, we still want the most generic range
over all code paths to/from the hoist/sink point. As suggested in the
patches related to D47339, I will refactor the handling of those
scenarios and try to decouple it from this function as a follow-up, once
we have switched to a similar handling of metadata in most of
combineMetadata.

I updated some tests checking mostly the merging of metadata to keep the
metadata of the dominating load. The most interesting one is probably test8 in
test/Transforms/JumpThreading/thread-loads.ll. It contained a comment
about the alias metadata preventing us from eliminating the branch, but it
seems like the actual problem currently is that we merge the ranges of
both loads and cannot eliminate the icmp afterwards. With this patch, we
manage to eliminate the icmp, as the range of the first load excludes 8.
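
A toy Python model of the difference, not the actual combineMetadata code. Ranges are half-open `[lo, hi)` pairs as in `!range` metadata. The concrete bounds and the helper names are hypothetical, and the merge is simplified to a single covering range.

```python
from typing import Tuple

Range = Tuple[int, int]  # half-open [lo, hi)


def union(a: Range, b: Range) -> Range:
    """Conservative merge: the smallest single range covering both inputs."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def may_equal(r: Range, constant: int) -> bool:
    """Can a value constrained to `r` ever compare equal to `constant`?"""
    return r[0] <= constant < r[1]


k_range = (0, 8)   # hypothetical range on the dominating load K
i_range = (0, 16)  # hypothetical range on the load I that K replaces

merged = union(k_range, i_range)

# Merging the ranges: 8 is still possible, so `icmp eq %v, 8` must stay.
assert may_equal(merged, 8)

# Keeping K's range: 8 is excluded, so the compare folds to false.
assert not may_equal(k_range, 8)
```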

Reviewers: efriedma, nlopes, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D51629

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345456 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] make test immune to improved extraction in D53784; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345455 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wdocumentation warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345454 91177308-0d34-0410-b5e6-96231b3b80d8

Regenerate FP_TO_INT tests.

Precursor to fix for PR17686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345453 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Move LegalizeDAG FP_TO_UINT handling to TargetLowering::expandFP_TO_UINT. NFCI.

First step towards fixing PR17686 and adding vector support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345452 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL345395: [X86][SSE] Move 2-input limit up from getFauxShuffleMask to resolveTargetShuffleInputs
Makes no difference to actual shuffle decoding yet, but merges all the existing limits in one place for when proper support is fixed.
........
It's been reported that this is causing out-of-trunk failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345451 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM64][Windows] MCLayer support for exception handling

Add ARM64 unwind codes to MCLayer, as well as SEH directives that will be emitted
by the frame lowering patch to follow. We only emit unwind codes into
object files for now.

Differential Revision: https://reviews.llvm.org/D50166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345450 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add some isel patterns for scalar_to_vector/extract_vector_element that use the avx512 extended register classes when they are available.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345448 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r345169 [along with its llvm counterpart r345170] as it makes Halide builds timeout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345447 91177308-0d34-0410-b5e6-96231b3b80d8

test: add missing -triple

Ensure that the test builds for x86_64 as it is an assembly test. This
should repair the buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345444 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add missing assignment to Itinerary in Call_nr

The class definition for Call_nr has the itinerary as a
parameter, but the value is never assigned to the Itinerary
field for the instruction. This means the compiler is unable
to schedule and packetize the instruction correctly because
these instructions will not have any resource descriptions.
I don't have a specific test case, but the ps_call_nr.ll
test failed with a proposed patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345442 91177308-0d34-0410-b5e6-96231b3b80d8