git.osdn.net Git - android-x86/external-llvm.git/log

[AMDGPU] Define code object identification string used in AMDHSA runtimes.

Differential Revision: https://reviews.llvm.org/D44718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328669 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo source value

Summary:
Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing"
broke -print-machineinstrs for us on AMDGPU, because we have custom
pseudo source values, and MIR serialization does not implement that.

This commit at least restores the functionality of -print-machineinstrs,
even if it does not properly implement the missing MIR serialization
functionality.

Differential Revision: https://reviews.llvm.org/D44871

Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328668 91177308-0d34-0410-b5e6-96231b3b80d8

Initialize variable added in r328617.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328667 91177308-0d34-0410-b5e6-96231b3b80d8

[YAML] Remove unit test of multibyte non-printable escaping that uses C++11 escapes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328665 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes

Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves an SSE->GPR transfer.

Differential Revision: https://reviews.llvm.org/D44924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328664 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and DW_RLE_base_address

Reviewers: dblakie, aprantl

Differential Revision: https://reviews.llvm.org/D44811

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328662 91177308-0d34-0410-b5e6-96231b3b80d8

[YAML] Escape non-printable multibyte UTF8 in Output::scalarString.

The existing YAML Output::scalarString code path includes a partial and
incorrect implementation of YAML escaping logic. In particular, the logic put
in place in rL321283 escapes non-printable bytes only if they are not part of a
multibyte UTF8 sequence; implicitly this means that all multibyte UTF8
sequences -- printable and non -- are passed through verbatim.

The simplest solution to this is to direct the Output::scalarString method to
use the standalone yaml::escape function, and this _almost_ works, except that
the existing code in that function _over_ escapes: any multibyte UTF8 sequence
is escaped, even printable ones. While this is permitted for YAML, it is also
more aggressive (and hard to read for non-English locales) than necessary,
and the entire point of rL321283 was to back off such aggressive over-escaping.

So in this change, I have both redirected Output::scalarString to use
yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to
non-printables. This preserves behaviour of any existing clients while giving
them a path to more moderate escaping should they desire.
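
A rough sketch of the "escape only non-printables" policy described above, written in Python rather than LLVM's C++. The names `escape_scalar` and `escape_printable` are hypothetical, and `str.isprintable()` is only a stand-in for whatever printability test yaml::escape really applies.

```python
def _hex_escape(code):
    """Pick the shortest YAML numeric escape that fits the code point."""
    if code <= 0xFF:
        return "\\x%02X" % code
    if code <= 0xFFFF:
        return "\\u%04X" % code
    return "\\U%08X" % code


def escape_scalar(text, escape_printable=False):
    """Escape text for a double-quoted YAML scalar (illustrative only)."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in ("\\", '"'):
            out.append("\\" + ch)
        elif code < 0x80:
            # ASCII: pass printable characters, escape control characters.
            out.append(ch if ch.isprintable() else _hex_escape(code))
        elif ch.isprintable() and not escape_printable:
            # Printable non-ASCII text stays readable.
            out.append(ch)
        else:
            out.append(_hex_escape(code))
    return "".join(out)
```

With `escape_printable=True` the sketch mimics the older, more aggressive behaviour in which every non-ASCII character is escaped.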

Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun

Reviewed By: thegameg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328661 91177308-0d34-0410-b5e6-96231b3b80d8

80-line wrap. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328660 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills

Previously, this was not done if the function had no calls in it. It
is still a possible issue with any callable function, regardless
of whether calls are present.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328659 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Set natural stack alignment in DataLayout

Only 4 byte alignment is ever useful, so increasing anything
beyond this may require realigning the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328656 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Simplify DWARFAddressRange::contains

This transform is valid because the ranges have been validated (LowPC <= HighPC).
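
To illustrate why validated ranges permit a simpler test, here is a small Python check. The "verbose" form is an assumed shape of the pre-change code, not taken from the log, and all function names are hypothetical.

```python
from itertools import product


def contains_verbose(lo, hi, rlo, rhi):
    # Assumed pre-simplification shape: test both endpoints against both bounds.
    return lo <= rlo <= hi and lo <= rhi <= hi


def contains_simple(lo, hi, rlo, rhi):
    # With lo <= hi and rlo <= rhi guaranteed, two comparisons suffice.
    return lo <= rlo and rhi <= hi


def forms_agree(limit=6):
    """Brute-force check that both forms agree on every pair of valid ranges."""
    ranges = [(lo, hi) for lo, hi in product(range(limit), repeat=2) if lo <= hi]
    return all(
        contains_verbose(*a, *b) == contains_simple(*a, *b)
        for a in ranges
        for b in ranges
    )
```

The check is exhaustive only over a small integer domain; it is meant as intuition, not as a proof.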

Differential Revision: https://reviews.llvm.org/D44772

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328655 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO] Fix branch probability remarks assert

Fixed counter/weight overflow that leads to an assertion. Also fixed the help
string for pgo-emit-branch-prob option.

Differential Revision: https://reviews.llvm.org/D44809

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328653 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix crash when MachinePointerInfo invalid

The combine on a select of a load only triggers for
addrspace 0, and discards the MachinePointerInfo. The
conservative default needs to be used for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328652 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix register name format in tests

These were changed to match the asm output name a long time ago,
although I think the old tablegenerated names still work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328651 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix FP restore from being reordered with stack ops

In a function, s5 is used as the frame base SGPR. If a function
is calling another function, during the call sequence
it is copied to a preserved SGPR and restored.

Before it was possible for the scheduler to move stack operations
before the restore of s5, since there's nothing to associate
a frame index access with the restore.

Add an implicit use of s5 to the adjcallstack pseudo which ends
the call sequence to prevent this from happening. I'm not 100%
satisfied with this solution, but I'm not sure what else would be
better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328650 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Implement TTI::shouldMaximizeVectorBandwidth

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328648 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Fix the resource list for the COPY instruction.

The COPY instruction was listed as a 4 cycle instruction.
It is now listed correctly as a 2 cycle ALU instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328647 91177308-0d34-0410-b5e6-96231b3b80d8

Remap values in PromotedFloats

Summary: When a node is about to be erased from ReplacedValues, we should also remap its corresponding values in PromotedFloats.

Patch by Yan Luo (Yan.Luo2@synopsys.com)

Reviewers: pirama

Reviewed By: pirama

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D44872

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328644 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a recurring typo in load-combine tests

   %tmp = bitcast i32* %arg to i8*
   %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0
-  %tmp2 = load i8, i8* %tmp, align 1
+  %tmp2 = load i8, i8* %tmp1, align 1

This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328642 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Rudimentary support for auto-vectorization for HVX

This implements a set of TTI functions that the loop vectorizer uses.
The only purpose of this is to enable testing. Auto-vectorization is
disabled by default, enabled by -hexagon-autohvx.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328639 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Decorate AArch64 instrs with OPERAND_PCREL

Summary:
This is a canonical way to teach objdump to print the target
symbols for branches when disassembling AArch64 code.

Reviewers: evandro, t.p.northover, espindola

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D44851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328638 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] OptPassGate extracted from OptBisect

Summary:
This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes.

This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect.

Patch by Yevgeny Rouban.

Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov
Reviewed By: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D44821

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328637 91177308-0d34-0410-b5e6-96231b3b80d8

Use .set instead of = when printing assignment in assembly output

On Hexagon "x = y" is a syntax used in most instructions, and is not
treated as a directive.

Differential Revision: https://reviews.llvm.org/D44256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328635 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target

The default implementation returns false and keeps the current behavior.

Differential Revision: https://reviews.llvm.org/D44735

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328632 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] pass the correct set of used registers in checkRAT.

We were incorrectly initializing the array of used registers in method checkRAT.
As a consequence, the number of register file stalls was misreported.

Added a test to cover this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328629 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler class

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328620 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Secure PLT support

This patch supports secure PLT mode for PowerPC 32 architecture.

Differential Revision: https://reviews.llvm.org/D42112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328617 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS] Add static_assert that all Fixups are handled in getFixupKind

Summary:
I recently added a new Fixup kind to our fork of LLVM but forgot to add
it to the table in MipsAsmBackend.cpp. With this static_assert the error
would have been caught instead of zero-initializing the array entries for
the new fixups.

Reviewers: sdardis, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44895

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328616 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll][NFC] Remove redundant canPeel check

We check `canPeel` twice: when evaluating the number of iterations to be peeled
and within the method `peelLoop` that performs peeling. This method is only
executed if the calculated peel count is positive. Thus, the check in `peelLoop` can
never fail. This patch replaces this check with an assert.

Differential Revision: https://reviews.llvm.org/D44919
Reviewed By: fhahn

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328615 91177308-0d34-0410-b5e6-96231b3b80d8

[IRCE] Enable decreasing loops of non-const bound

As a follow-up to r328480, this updates the logic for the decreasing
safety checks in a similar manner:
- CanBeMax is replaced by CannotBeMaxInLoop which queries
  isLoopEntryGuardedByCond on the maximum value.
- SumCanReachMin is replaced by isSafeDecreasingBound which includes
  some logic from parseLoopStructure and, again, has been updated to
  use isLoopEntryGuardedByCond on the given bounds.

Differential Revision: https://reviews.llvm.org/D44776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328613 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix comments in getExact()

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328612 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Make exact taken count calculation more optimistic

Currently, `getExact` fails if it sees two exit counts in different blocks. There is
no solid reason to do so, given that we only calculate exact non-taken count
for exiting blocks that dominate latch. Using this fact, we can simply take min
out of all exits of all blocks to get the exact taken count.

This patch makes the calculation more optimistic while enforcing our assumption
with asserts. It allows us to calculate exact backedge taken count in trivial loops
like

  for (int i = 0; i < 100; i++) {
    if (i > 50) break;
    . . .
  }
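
The "min over all exits" idea can be checked on the loop above with a small Python simulation. This is only a brute-force sketch: the helper names are hypothetical, and SCEV derives these counts symbolically rather than by iterating.

```python
def simulate_backedges():
    """Count how often the example loop jumps back to its header."""
    taken = 0
    i = 0
    while i < 100:      # exit in the loop header
        if i > 50:      # early exit inside the body
            break
        i += 1          # latch
        taken += 1      # backedge to the header
    return taken


def exit_count(exit_cond, limit=1000):
    """Backedges taken before exit_cond(i) first holds, looking at one exit alone."""
    for i in range(limit):
        if exit_cond(i):
            return i
    raise ValueError("exit never taken within limit")


header_exit = exit_count(lambda i: not (i < 100))
break_exit = exit_count(lambda i: i > 50)
exact = min(header_exit, break_exit)
```

Here `exact` agrees with `simulate_backedges()`, which is the property the commit relies on when every exiting block dominates the latch.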

Differential Revision: https://reviews.llvm.org/D44676
Reviewed By: fhahn

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328611 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Add one more case in computeConstantDifference

This patch teaches `computeConstantDifference` to handle calculation of constant
difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`.
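
A minimal Python sketch of that case, assuming expressions are modelled as a symbolic base plus a constant offset. The `AddExpr` type and the function below are hypothetical stand-ins, not SCEV's data structures.

```python
from typing import NamedTuple, Optional


class AddExpr(NamedTuple):
    base: str    # opaque symbolic part, e.g. "X"
    offset: int  # constant addend


def constant_difference(first: AddExpr, second: AddExpr) -> Optional[int]:
    """Return second - first when both share the same symbolic base."""
    if first.base != second.base:
        return None  # the difference is not a known constant
    return second.offset - first.offset
```

For `(X + 3)` and `(X + 10)` the sketch returns 7; for different bases it gives up and returns `None`.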

Differential Revision: https://reviews.llvm.org/D43759
Reviewed By: anna

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328609 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineScheduler] Add itinerary to schedcover.py. Make default work in the command line filter

Summary:
This patch adds itinerary support to the schedcover.py script. I've been trying to use this script to figure out why SSE and AVX instructions are ending up in separate tablegen scheduler classes and sometimes it's because we are using different itineraries.

Rather than using None to indicate the default scheduler model, I now use the string "default". I had to hack around the sorting a little to keep "default" at the beginning. But this also makes it so you can specify "default" on the command line to just get the defaults.

I also fixed the regular expression code so that the no_default wasn't evaluated twice.
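
One way to keep "default" at the front of an otherwise alphabetical list is a tuple sort key. The snippet below is a sketch of that idea, not the code in schedcover.py, and the model names in the usage note are illustrative.

```python
def ordered_models(models):
    """Sort model names alphabetically, but always place "default" first."""
    # False sorts before True, so the entry equal to "default" leads.
    return sorted(models, key=lambda name: (name != "default", name))
```

For example, `ordered_models(["SandyBridgeModel", "default", "BtVer2Model"])` puts `"default"` ahead of the two named models.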

Reviewers: RKSimon, atrick, jmolloy, javed.absar

Reviewed By: javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328608 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Revert "[lit] Generalized /dev/null support on Windows.""

Summary:
This reverts commit r328596.

This version checks that the arguments are strings before testing whether they contain "/dev/null".
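
The string check can be sketched as follows. The function name and the replacement path are hypothetical; lit's real implementation is not shown in this log and may differ.

```python
def remap_dev_null(args, replacement):
    """Replace "/dev/null" inside string arguments; leave other objects alone."""
    remapped = []
    for arg in args:
        # Non-string arguments do not support the "in" test used here,
        # so they are passed through untouched.
        if isinstance(arg, str) and "/dev/null" in arg:
            arg = arg.replace("/dev/null", replacement)
        remapped.append(arg)
    return remapped
```

Without the `isinstance` guard, a non-string argument could raise a `TypeError` in the membership test.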

Reviewers: rnk

Reviewed By: rnk

Subscribers: delcypher, llvm-commits

Differential Revision: https://reviews.llvm.org/D44914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328603 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add RUN for target before roundss; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328601 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Temporarily disable shtest-timeout.py on darwin

Disabled until fixed in order to avoid random failures on green dragon.

rdar://problem/38774530

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328598 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[lit] Generalized /dev/null support on Windows."

This reverts commit ca7fdbb974384ce5a05528b22a41d46b1cc13e92.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328596 91177308-0d34-0410-b5e6-96231b3b80d8

Add a build dependency from libMC to libDebugInfoCodeView to match the reality of header dependencies here

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328595 91177308-0d34-0410-b5e6-96231b3b80d8

Move CVDebugRecord from CodeView to Object to fix layering

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328593 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for ftrunc; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328592 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfoPDB] Print the method name along with the variant value

Before this change, using dumpProperties() with PDBSymbolData
would look like this:

  get_locationType: 3
  1

After this change:

  get_locationType: 3
  get_value: 1

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328590 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Generalized /dev/null support on Windows.

Generalized /dev/null remapping on Windows, and added test.

Reviewers: rnk

Reviewed By: rnk

Subscribers: amccarth, zturner, delcypher, llvm-commits

Differential Revision: https://reviews.llvm.org/D44771

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328589 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328587 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA

This method is used to find line numbers for PDBSymbolData
that have an invalid virtual address.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328586 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA

These are used in finding line numbers for PDBSymbolData

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328585 91177308-0d34-0410-b5e6-96231b3b80d8

Fix newlines. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328583 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add WriteCRC32 scheduler class

Currently CRC32 instructions use the WriteFAdd class. This patch splits them off into their own class; at the moment it is still mostly a duplicate of WriteFAdd, but it can now be tweaked on a target-by-target basis.

Differential Revision: https://reviews.llvm.org/D44647

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328582 91177308-0d34-0410-b5e6-96231b3b80d8

Use local symbols for creating .stack-size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328581 91177308-0d34-0410-b5e6-96231b3b80d8

Fix go bindings test when using goma distributed build tool

Goma[1] is a distributed build system similar to distcc and icecc
primarily used to compile Chromium. The client is open source, and
hopefully soon the server will be as well. The intended usage model is
similar to most distributed build systems: prefix gomacc onto your
compiler command line, and it transparently distributes compilation.

The go lit config wants to determine the host compiler binary, so it
needs some extra logic to avoid looking at these prefixes.

[1] https://chromium.googlesource.com/infra/goma/client/
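
The extra logic amounts to skipping launcher prefixes when looking for the compiler binary. A minimal sketch, assuming the command line is a whitespace-separated string and that `gomacc` is the only wrapper of interest (both are assumptions, and the names are hypothetical):

```python
import os

# Assumed set of launcher names; the commit message only mentions gomacc.
WRAPPER_NAMES = {"gomacc"}


def host_compiler(command_line):
    """Return the first token that is not a known compiler launcher."""
    for token in command_line.split():
        if os.path.basename(token) not in WRAPPER_NAMES:
            return token
    return None
```

So `host_compiler("/opt/goma/gomacc /usr/bin/clang++")` yields the clang++ path rather than the wrapper.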

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328580 91177308-0d34-0410-b5e6-96231b3b80d8

Use correct format specifier.
Review comment on r328235 by James Henderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328578 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA] Fix exponential compile-time updating MemorySSA.

MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for
each block, it computes the previous definition for each predecessor,
then takes those definitions and combines them. But currently it doesn't
remember results which it already computed; this means it can visit the
same block multiple times, which adds up to exponential time overall.

To fix this, this patch adds a cache. If we computed the result for a
block already, we don't need to visit it again because we'll come up
with the same result. Well, unless we RAUW a MemoryPHI; in that case,
the TrackingVH will be updated automatically.

This matches the original source paper for this algorithm.
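
The effect of the cache can be seen on a toy acyclic CFG. The sketch below is not MemorySSAUpdater: it ignores loops and phi creation, uses a frozenset as a stand-in for "the combined definition", and all names are hypothetical.

```python
def count_visits(preds, start, use_cache):
    """Walk predecessors recursively and count how many blocks are visited."""
    visits = 0
    cache = {}

    def previous_def(block):
        nonlocal visits
        if use_cache and block in cache:
            return cache[block]
        visits += 1
        # Stand-in for "combine the definitions found in all predecessors".
        result = frozenset({block}).union(*(previous_def(p) for p in preds[block]))
        if use_cache:
            cache[block] = result
        return result

    previous_def(start)
    return visits


def diamond_chain(n):
    """Build n stacked diamonds: join_k <- {left_k, right_k} <- join_(k-1)."""
    preds = {"join0": []}
    for k in range(1, n + 1):
        preds[f"left{k}"] = [f"join{k - 1}"]
        preds[f"right{k}"] = [f"join{k - 1}"]
        preds[f"join{k}"] = [f"left{k}", f"right{k}"]
    return preds
```

On ten stacked diamonds the uncached walk visits 4093 blocks, while the cached walk visits each of the 31 blocks once.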

The testcase isn't really a test for the bug, but it adds coverage for
the case where tryRemoveTrivialPhi erases an existing PHI node. (It's
hard to write a good regression test for a performance issue.)

Differential Revision: https://reviews.llvm.org/D44715

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328577 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Assertion failure in HexagonSubtarget.cpp

In restoreLatency, replace range-for loop with std::find.

Patch by Jyotsna Verma.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328574 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Add (U)COMISS/(U)COMISD scheduler costs

Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328573 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Add more checks to a test case. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328572 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32

Summary:
Re-lands r328386 and r328443, reverting r328482.

Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small
parameters in i8 and i16 do not end up in the SysV register parameters
(EDI, ESI, etc).

I added tests for how we receive small parameters, since that is the
important part. It's always safe to store more bytes than will be read,
but the assumptions you make when loading them are what really matter.

I also tested this by self-hosting clang and it passed tests on win64.

Reviewers: mstorsjo, hans

Subscribers: hiraditya, mstorsjo, llvm-commits

Differential Revision: https://reviews.llvm.org/D44900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328570 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881)

Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.

These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).

Differential Revision: https://reviews.llvm.org/D44879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328566 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h

This header also wasn't self contained/modular - but with no users, it
didn't seem worth fixing because it'd break so easily again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328565 91177308-0d34-0410-b5e6-96231b3b80d8

[XCore] Change std::sort to llvm::sort in response to r327219

Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
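
The shuffle-before-sort idea can be sketched in Python, whose built-in sort is stable, so shuffling first is what makes the order of equal-key elements arbitrary. The helper names are hypothetical and this is not the llvm::sort implementation.

```python
import random


def shuffled_sort(items, key, rng):
    """Shuffle first, then sort: elements with equal keys land in arbitrary order."""
    items = list(items)
    rng.shuffle(items)
    items.sort(key=key)
    return items


def depends_on_tie_order(items, key, trials=50, seed=0):
    """True if different shuffles produce different sorted results."""
    rng = random.Random(seed)
    results = {tuple(shuffled_sort(items, key, rng)) for _ in range(trials)}
    return len(results) > 1
```

Code that silently depends on the order of equal-key elements shows up as differing results across runs; with unique keys the output never varies.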

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer to the comments section in D44363 for a list of all the required patches.

Reviewers: dblaikie, RKSimon, robertlytton

Reviewed By: robertlytton

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44875

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328564 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Implement 'cat' command for internal shell

Fixes PR36449

Patch by Chamal de Silva

Differential Revision: https://reviews.llvm.org/D43501

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328563 91177308-0d34-0410-b5e6-96231b3b80d8

Delete pdbutil diff mode.

This has been made obsolete by the fact that almost all of the
things it previously checked for are no longer relevant since
we can just compare bytes in a lot of places.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328562 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add more lit tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328561 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] improve code comment; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328560 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9]Legalize and emit code for quad-precision convert from double-precision

Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328558 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place.

A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.

Differential Revision: https://reviews.llvm.org/D43086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328556 91177308-0d34-0410-b5e6-96231b3b80d8

Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores

Disable https://reviews.llvm.org/D40196 by setting the option
hoist-const-stores to false, since the s390 buildbot is failing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328555 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Several node-ordering fixes

First, we change the heuristic that is used to ignore the recurrent
node-sets in the node ordering. In certain cases it's not important
to focus on the recurrent node-sets. Instead, the algorithm begins
by considering all the instructions in the node ordering step.

Second, a minor change to the bottom up traversal, which needs to
consider loop carried dependences (modeled as anti dependences).
Previously, these instructions were skipped, which caused problems
because the instruction ends up having both predecessors and
successors in the schedule.

Third, consider anti-dependences as a tie breaker when choosing
between instructions in the node ordering. We want to make sure
that the source of the anti-dependence does not end up with both
predecessors and successors in the final node ordering.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328554 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Improve disassembler error handling

Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.

While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.

The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.
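
The recovery strategy can be sketched with a toy decoder. This simplification assumes every instruction is a single little-endian 32-bit word and takes a hypothetical `decode` callable that returns `None` for unknown words; the real disassembler is more involved.

```python
import struct


def disassemble(data, decode):
    """Emit one line per 32-bit word; unknown words become .long directives."""
    lines = []
    offset = 0
    while offset + 4 <= len(data):
        (word,) = struct.unpack_from("<I", data, offset)
        text = decode(word)
        lines.append(text if text is not None else ".long 0x%08X" % word)
        offset += 4  # always skip a whole word, never a single byte
    # Bytes left over when the section length is not a multiple of 4.
    for byte in data[offset:]:
        lines.append(".byte 0x%02X" % byte)
    return lines
```

Advancing by four bytes after a failed decode keeps later words aligned, which is what avoids the cascade of bogus output described above.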

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44685

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328553 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs

We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328551 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Check for affine expression in isLoopCarriedOrder

The pipeliner must add a loop carried dependence between two memory
operations if the base register is not an affine (linear) expression.
The current implementation doesn't check how the base register is
defined, which allows non-affine expressions, and then the pipeliner
does not add a loop carried dependence when one is needed.

This patch adds code to isLoopCarriedOrder that checks if the base
register of the memory operations is defined by a phi, and the loop
definition for the phi is a constant increment value. This is a very
simple check for a linear expression.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328550 91177308-0d34-0410-b5e6-96231b3b80d8

Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328549 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328548 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Add missing loop carried dependences

The pipeliner is not adding a dependence edge for a loop carried
dependence, and ends up scheduling a load from iteration n prior
to an aliased store in iteration n-1.

The code that adds the loop carried dependences in the pipeliner
doesn't check if the memory objects for loads and stores are
"identified" (i.e., distinct) objects. If they are not, then the
code that adds the dependences needs to be conservative. The
objects can be used to check dependences only when they are
distinct objects.

The code that checks for loop carried dependences has been updated
to classify loads and stores that are not identified as "unknown"
values. A store with an "unknown" value can potentially create
a loop carried dependence with any pending load.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328547 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Add a test case. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328546 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix renaming in pipeliner when eliminating phis

The phi renaming code in the pipeliner uses the wrong value when
rewriting phi uses, which results in an undefined value. In this
case, the original phi is no longer needed due to the order of
instruction in the pipelined loop. The pipeliner was assuming, in
this case, that the phi loop definition should be used to
rewrite the uses. However, the pipeliner needs to check to make
sure that the loop definition has already been scheduled. If not,
then the phi initial value needs to be used instead.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328545 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix number of phis to generate in the epilog

The pipeliner was generating too many phis in the epilog blocks, which
caused incorrect code generation when rewriting an instruction that uses
the phi.

In this case, there are 3 prolog and epilog stages. An existing phi was
scheduled at stage 1. When generating the code for the 2nd epilog an
extra new phi was generated.

To fix this, we need to update the code that calculates the maximum
number of phis that can be generated, which is based upon the current
prolog stage and the stage of the original phi. In this case, when the
prolog stage is 1 and the original phi stage is 1, the maximum number
of phis to generate is 2.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328543 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Use latency to compute RecMII

The patch contains several changes needed to pipeline an example
that was transformed so that a Phi with a subreg is converted to
copies.

The pipeliner wasn't working for a couple of reasons.
- The RecMII was 3 instead of 2 due to the extra copies.
- Copy instructions contained a latency of 1.
- The node order algorithm was not choosing the best "bottom"
node, which caused an instruction to be scheduled that had a
predecessor and successor already scheduled.
- Updated the Hexagon Machine Scheduler to check if the node is
latency bound when adding the cost for a 0-latency dependence.

The RecMII was 3 because the computation looks at the number of
nodes in the recurrence. The extra copy is an extra node but
it shouldn't increase the latency. The new RecMII computation
looks at the latency of the instructions in the recurrence. We
changed the latency of the dependence of a copy to 0. The latency
computation for the copy also checks the use of the copy (similar
to a reg_sequence).
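
The difference between the two computations can be sketched on a three-node recurrence. The `distance` parameter, the floor of 1, and the instruction names are assumptions made for illustration; the pipeliner's actual formula is not given in this log.

```python
import math


def rec_mii_by_nodes(recurrence, distance=1):
    """Older estimate as described above: count the nodes in the recurrence."""
    return math.ceil(len(recurrence) / distance)


def rec_mii_by_latency(recurrence, distance=1):
    """Newer estimate: sum the instruction latencies around the recurrence."""
    total = sum(latency for _, latency in recurrence)
    return max(1, math.ceil(total / distance))


# A recurrence with one extra copy whose dependence latency is 0.
recurrence = [("add", 1), ("copy", 0), ("mul", 1)]
```

Counting nodes gives 3 for this recurrence, while summing latencies gives 2, matching the "3 instead of 2" situation in the commit message.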

The node order algorithm was not choosing the last instruction
in the recurrence for a bottom up traversal. This happened when the
last instruction was a copy. A check was added when choosing the
instruction to check for NodeNum if the maxASAP is the same. This
means that the scheduler will not end up with another node in
the recurrence that has both a predecessor and successor already
scheduled.

The cost computation in Hexagon Machine Scheduler adds cost when
an instruction can be packetized with a zero-latency instruction.
We should only do this if the schedule is latency bound.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328542 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328541 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix assert caused by pipeliner serialization

The pipeliner is asserting because the serialization step that
occurs at the end is deleting an instruction. The assert
occurs later on because there is a use without a definition.

The problem occurs when an instruction defines a value used
by a REG_SEQUENCE and that value is used by a COPY instruction.
The latencies between these instructions are zero, so they are
put into the same packet. The serialization code is unable to
handle this correctly, and ends up putting the REG_SEQUENCE
before its definition.

There is special code in the serialization step that attempts
to handle zero-cost instructions (phis, copy, reg_sequence)
differently than regular instructions. Unfortunately, this means
the order does not come out correct.

This patch simplifies the code by changing the separate steps for
handling zero-cost and regular instructions. Only phis are
handled separately now, since they should occur first. Then, this
patch adds checks to make sure the MoveUse is set to the smallest
value if there are multiple uses in a cycle.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328540 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reassociate loop invariant GEP chains to enable LICM

This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1

In this example %idx.ext1 is a loop invariant. It will be moved above the use of
loop induction variable %idx.ext such that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sunk down
in the GEP chain. This patch will produce the following output:

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
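
The reordering above can be pictured as a stable partition of the GEP
chain's offsets: loop-invariant offsets move to the front, loop-variant
ones sink to the end. The sketch below is only a model of that idea; the
function name is hypothetical, and treating the constant -1 as loop
invariant is an assumption of the sketch. It does not model inbounds or
any other legality checks.

```python
def reassociate_gep_offsets(offsets, loop_invariant):
    # Stable partition: keep the relative order inside each group so the
    # invariant prefix can be hoisted as a unit.
    invariant = [o for o in offsets if o in loop_invariant]
    variant = [o for o in offsets if o not in loop_invariant]
    return invariant + variant

# Offsets applied to %win in the original chain, in order.
chain = ["%idx.ext", "%idx.ext1", "-1"]
# Assumption: %idx.ext1 and the constant -1 do not change inside the loop.
invariant = {"%idx.ext1", "-1"}

reordered = reassociate_gep_offsets(chain, invariant)
print(reordered)  # ['%idx.ext1', '-1', '%idx.ext']
```

The first two GEPs of the reordered chain depend only on loop-invariant
values, which is what allows LICM to hoist them.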

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328539 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Enable more base+offset dependence changes in pipeliner

The pipeliner changes dependences between base+offset instructions
(loads and stores) so that the instructions have more flexibility
to be scheduled with respect to each other. This occurs when the
pipeliner is able to compute that the instructions will not alias
if their order is changed. The previous code enforced the alias
property by checking if the base register is the same, and that the
offset values are either both positive or negative.

This patch improves the alias check by using the API
areMemAccessesTriviallyDisjoint instead. This enables more cases,
especially if the offset is a negative value. The pipeliner uses
the function by creating a new instruction with the offset used
in the next iteration.
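
A minimal sketch of the kind of check described here, assuming that two
accesses off the same base register are disjoint when their byte ranges
do not overlap. The helper names and the increment parameter are
hypothetical; the real target hook has a different interface and handles
more cases.

```python
def trivially_disjoint(base_a, off_a, size_a, base_b, off_b, size_b):
    # Different bases: nothing is known, so answer conservatively.
    if base_a != base_b:
        return False
    # Same base: disjoint if the lower range ends before the higher starts.
    if off_a <= off_b:
        return off_a + size_a <= off_b
    return off_b + size_b <= off_a

def disjoint_in_next_iteration(base, off_a, size_a, off_b, size_b, increment):
    # Model the second access as it would appear one iteration later,
    # i.e. with the base advanced by the loop increment.
    return trivially_disjoint(base, off_a, size_a,
                              base, off_b + increment, size_b)

print(trivially_disjoint("r0", -8, 4, "r0", -4, 4))      # True
print(trivially_disjoint("r0", -8, 4, "r0", -6, 4))      # False
print(disjoint_in_next_iteration("r0", 0, 4, 0, 4, 4))   # True
```

Because the comparison works on ranges rather than on the sign of the
offsets, negative offsets are handled the same way as positive ones.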

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328538 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix calculation when reusing phis

A schedule may require that a phi from the original loop is used in
multiple iterations in the scheduled loop. When this occurs, we generate
multiple phis in the pipelined loop to save the value across iterations.

When we generate the new phis and update the register names in the
pipelined loop, the pipeliner attempts to reuse a previously generated
phi, when possible. The calculation for the name of the new phi needs
to account for the version/iteration of the original phi. Also, in the
epilog, the code only needs to check backwards for a previous iteration
until reaching the first prolog block.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328537 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328536 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix check for order dependences when finalizing instructions

The code in orderDependence that looks at the order dependences between
instructions was processing all the successor and predecessor order
dependences. However, we really only want to check for an order dependence
for instructions scheduled in the same cycle.

Also, fixed how the pipeliner handles output dependences. An output
dependence is also a potential loop carried dependence. The pipeliner
didn't handle this case properly so an invalid schedule could be created
that allowed an output dependence to be scheduled in the next iteration
at the same cycle.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328516 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Fix in the pipeliner phi reuse code

When the definition of a phi is used by a phi in the next iteration,
the pipeliner was assuming that the definition is processed first.
Because of the assumption, an incorrect phi name was used. This patch
has a check to see if the phi definition has been processed already.

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328510 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Pipeliner should mark physical registers as used

The software pipeliner attempts to delete dead instructions after
generating the pipelined loop. The code looks for uses of each
instruction. Physical registers should be treated differently because
the use chains do not exist. The code that checks for dead
instructions should assume that definitions of physical registers
are used if the operand doesn't contain the dead flag.
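
The rule can be sketched as follows. The Def record and the use-count
map are hypothetical stand-ins for machine operands and register use
chains, and the sketch ignores side effects such as stores or calls.

```python
from dataclasses import dataclass

@dataclass
class Def:
    reg: str
    is_physical: bool
    dead_flag: bool = False

def is_instruction_dead(defs, virtual_use_counts):
    for d in defs:
        if d.is_physical:
            # No use chain exists for physical registers, so assume the
            # value is used unless the operand is explicitly marked dead.
            if not d.dead_flag:
                return False
        elif virtual_use_counts.get(d.reg, 0) > 0:
            # Virtual register with at least one remaining use.
            return False
    return True

print(is_instruction_dead([Def("r0", True)], {}))                  # False
print(is_instruction_dead([Def("r0", True, dead_flag=True)], {}))  # True
print(is_instruction_dead([Def("%v1", False)], {"%v1": 0}))        # True
```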

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328509 91177308-0d34-0410-b5e6-96231b3b80d8

[Pipeliner] Correctly update memoperands in the epilog

The pipeliner needs to be conservative when updating the memoperands
of instructions in the epilog. Previously, the pipeliner was changing
the offset of the memoperand based upon the scheduling stage. However,
that is incorrect when control flow branches around the kernel code.
The bug enabled a load and store to the same stack offset to be swapped.

This patch fixes the bug by updating the size of the memoperands to be
UINT_MAX. This conservative value means that dependences will be created
between other loads and stores.
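
To illustrate why the conservative size forces a dependence, the sketch
below models an overlap test in which a size of UINT_MAX means "unknown
extent". The function name and the 32-bit value of UINT_MAX are
assumptions of the sketch, not the actual alias query used by the
scheduler.

```python
UINT_MAX = 2**32 - 1  # assumed 32-bit unsigned maximum

def may_overlap(off_a, size_a, off_b, size_b):
    # An unknown size cannot be proven disjoint from anything.
    if size_a == UINT_MAX or size_b == UINT_MAX:
        return True
    # Known sizes: the half-open byte ranges overlap if each one starts
    # before the other ends.
    return off_a < off_b + size_b and off_b < off_a + size_a

print(may_overlap(0, 4, 8, 4))          # False: ranges are disjoint
print(may_overlap(0, UINT_MAX, 8, 4))   # True: size is unknown
print(may_overlap(16, UINT_MAX, 8, 4))  # True: size is unknown
```

With the size set to UINT_MAX, an epilog load and store to the same
stack slot can no longer be considered independent and swapped.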

Patch by Brendon Cahoon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328508 91177308-0d34-0410-b5e6-96231b3b80d8

[demangler] Fix a bug in r328464 found by oss-fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328507 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Give priority to post-incrementing memory accesses in LSR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328506 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs

Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

This also adds missing vcvttss2si tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328505 91177308-0d34-0410-b5e6-96231b3b80d8

Migrate dockerfiles to use multi-stage builds.

Summary:
We previously emulated multi-stage builds using two dockerfiles.
Native support from Docker allows us to merge them into one,
simplifying our scripts.

For more details about multi-stage builds, see:
https://docs.docker.com/develop/develop-images/multistage-build/

Reviewers: mehdi_amini, klimek, sammccall

Reviewed By: sammccall

Subscribers: llvm-commits, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D44787

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328503 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] distribute fmul over fadd/fsub

This replaces a large chunk of code that was looking for compound
patterns that include these sub-patterns. Existing tests ensure that
all of the previous examples are still folded as expected.
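
As a small numeric illustration of the sub-pattern, the sketch below
distributes a multiply by a constant over an add with a constant operand,
so that the product of the two constants can be folded ahead of time.
The exact shape of the pattern, (x + c1) * c2, is an assumption of this
sketch. The values are chosen to be exactly representable; in general the
two forms can round differently, which is why fast-math flags gate the
transform.

```python
def original(x, c1, c2):
    # One add followed by one multiply.
    return (x + c1) * c2

def distributed(x, c1, c2):
    # c1 * c2 involves only constants, so a compiler could fold it.
    folded = c1 * c2
    return x * c2 + folded

x, c1, c2 = 3.0, 0.5, 4.0
print(original(x, c1, c2))     # 14.0
print(distributed(x, c1, c2))  # 14.0
```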

We still need to loosen the FMF check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328502 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs

These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328501 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Fix how views are added to the InstructionTables.

This should fix the stack-use-after-scope reported by the asan buildbots after
revision 328493.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328499 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] check uses before creating instructions for fmul distribution

As the tests show, we could create extra instructions without any obvious benefit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328498 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs

The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328497 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes

Differential revision: https://reviews.llvm.org/D44820

Change-Id: I732979e2964006aa15d78a333d8886e6855f319a

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328496 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Add a flag -instruction-info to enable/disable the instruction info view.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328493 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Update the commandline docs after r328305.

Document that flag -resource-pressure can be used to enable/disable the resource
pressure view. This change should have been part of r328305.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328492 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Btver2] Double the AGU and schedule pipe resources for YMM

For 256-bit instructions, both the AGUs and the schedule pipes are double pumped, in addition to the functional units, which we already model.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328491 91177308-0d34-0410-b5e6-96231b3b80d8