git.osdn.net Git - android-x86/external-llvm.git/log

Renovate CMake files in the `llvm-(cfi-verify|exegesis|mca)` tools.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342148 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reorder folds to reduce chance of infinite loops

I don't have a test case for this, but it's motivated by
the discussion in D51964, and I've added TODO comments for
the better fix - move simplifications into instsimplify
because that's more efficient and reduces risk of infinite
loops in instcombine caused by transforms trying to do the
opposite folds.

In this case, we know that the transform that tries to move
'not' through min/max can be fooled by the multiple uses
of a value in another min/max, so try to squash the
foldSPFofSPF() patterns first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342147 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Allow truncs as sources in ARM CGP

We previously only allowed truncs as sinks, but now allow them as
sources too. We do this by checking that the result type is the
narrow type that we're trying to optimise for.

Differential Revision: https://reviews.llvm.org/D51978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342141 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix FixConst for ARMCodeGenPrepare

Part of FixConsts wrongly assumes either a 8- or 16-bit constant
which can result in the wrong constants being generated during
promotion.

Differential Revision: https://reviews.llvm.org/D52032

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342140 91177308-0d34-0410-b5e6-96231b3b80d8

[MC/Dwarf] Unclamp DWARF linetables format on Darwin.

In r319995, we fixed the line table format to version 2 on Darwin
because dsymutil didn't yet understand the new format which caused test
failures for the LLDB bots. This has been resolved in the meantime so
there's no reason to keep this limitation.

rdar://problem/35968332

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342136 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix not preserving alignent in call setups

If an argument was passed on the stack, this
was using the default alignment.

I'm not sure there's an observable change from this. This
was observable due to bugs in expansion of unaligned
loads and stores, but since that is fixed I don't think
this matters much.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342133 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: Fix expansion of unaligned FP loads and stores

This was trying to scalarizing a scalar FP type,
resulting in an assert.

Fixes unaligned f64 stack stores for AMDGPU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342132 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix some outdated datalayouts in tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342131 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unused variable warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342128 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.

The Technical Reference Manuals for these two CPUs state that branching
to an unaligned 32-bit instruction incurs an extra pipeline reload
penalty. That's bad.

This also enables the optimization at -Os since it costs on average one
byte per loop in return for 1 cycle per iteration, which is pretty good
going.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342127 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Bug fixes for FDR custom event and arg-logging

Summary:
This change has a number of fixes for FDR mode in compiler-rt along with
changes to the tooling handling the traces in llvm.

In the runtime, we do the following:

- Advance the "last record" pointer appropriately when writing the
  custom event data in the log.

- Add XRAY_NEVER_INSTRUMENT in the rewinding routine.

- When collecting the argument of functions appropriately marked, we
  should not attempt to rewind them (and reset the counts of functions
  that can be re-wound).

In the tooling, we do the following:

- Remove the state logic in BlockIndexer and instead rely on the
  presence/absence of records to indicate blocks.

- Move the verifier into a loop associated with each block.

Reviewers: mboerger, eizan

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D51965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342122 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Load divergence predicate refactoring

Differential revision: https://reviews.llvm.org/D51931

Reviewers: rampitec

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342120 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Enable the mnemonic spell corrector

This implements suggesting alternative mnemonics when an invalid one is
specified. For example `addru $9, $6, 17767` leads to the following
error message:

error: unknown instruction, did you mean: add, addiu, addu, maddu?

Differential revision: https://reviews.llvm.org/D40646

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342119 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Remove dead parameter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342118 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC] Split BenchmarkRunner class

Summary:
The snippet-generation part goes to the SnippetGenerator class.

This will allow benchmarking arbitrary code (see PR38437).

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D51979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342117 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed.

Differential revision: https://reviews.llvm.org/D51975

Reviewers: rampitec

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342115 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Type legalize v2i32 div/rem by scalarizing rather than promoting

Summary:
Previously we type legalized v2i32 div/rem by promoting to v2i64. But we don't support div/rem of vectors so op legalization would then scalarize it using i64 scalar ops since it doesn't know about the original promotion. 64-bit scalar divides on Intel hardware are known to be slow and in 32-bit mode they require a libcall.

This patch switches type legalization to do the scalarizing itself using i32.

It looks like the division by power of 2 optimization is still kicking in and leaving the code as a vector. The division by other constant optimization doesn't kick in pre type legalization since it ignores illegal types. And previously, after type legalization we scalarized the v2i64 since we don't have v2i64 MULHS/MULHU support.

Another option might be to widen v2i32 to v4i32 so we could do division by constant optimizations, but we'd have to be careful to only do that for constant divisors or we risk scalaring to 4 scalar divides.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342114 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: correct the relocation type for `bl` on WoA

The `IMAGE_REL_ARM_BRANCH20T` applies only to a `b.w` instruction. A
thumb-2 `bl` should be relocated using a `IMAGE_REL_ARM_BRANCH24T`.
Correct the relocation that we emit in such a case.

Resolves PR38620! Based on the patch by Jordan Rhee!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342109 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Add Requires: asserts where needed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342108 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Use expensive asserts in relevant LICM tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342107 91177308-0d34-0410-b5e6-96231b3b80d8

Remove isAsCheapAsAMove from v128.const

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342106 91177308-0d34-0410-b5e6-96231b3b80d8

Remove isAsCheapAsAMove from mem ops

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342105 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add missing SIMD instruction attributes

Summary:
These attributes are copied from equivalent instructions in
WebAssemblyInstrInfo.td.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342104 91177308-0d34-0410-b5e6-96231b3b80d8

STLExtras: Add some more algorithm wrappers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342102 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfo/PDB: Remove unused member

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342101 91177308-0d34-0410-b5e6-96231b3b80d8

dwarfdump: Improve performance on large DWP files

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342099 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] improve formatting for select+setcc code; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342095 91177308-0d34-0410-b5e6-96231b3b80d8

fix 80-column violation with clang-format

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342094 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Remove all clone() methods.

These are dead code and encourage poor usage patterns, so I'm
removing them. They weren't called anywhere anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342093 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Use shuffles when lowering "gather" shufflevectors

Shufflevector instructions in LLVM IR that extract a subset of elements
of a longer input into a shorter vector can be done using VECTOR_SHUFFLEs.
This will avoid expanding them into constly extracts and inserts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342091 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Improve the selection algorithm in scalarizeShuffle

Use topological ordering for newly generated nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342090 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] sys::fs::directory_entry includes the file_type.

This is available on most platforms (Linux/Mac/Win/BSD) with no extra syscalls.
On other platforms (e.g. Solaris) we stat() if this information is requested.

This will allow switching clang's VFS to efficiently expose (path, type) when
traversing a directory. Currently it exposes an entire Status, but does so by
calling fs::status() on all platforms.
Almost all callers only need the path, and all callers only need (path, type).

Patch by sammccall (Sam McCall)

Differential Revision: https://reviews.llvm.org/D51918

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342089 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-cov] Delete custom JSON serialization code (NFC)

Teach llvm-cov to use the new llvm JSON library, and remove some
redundant/brittle JSON serialization tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342088 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Merge ExecutionSessionBase with ExecutionSession by moving a couple of
template methods in JITDylib out-of-line.

This also splits JITDylib::define into a pair of template methods, one taking an
lvalue reference and the other an rvalue reference. This simplifies the
templates at the cost of a small amount of code duplication.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342087 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Add a special 'main' JITDylib that is created on ExecutionSession
construction, a new convenience lookup method, and add-to layer methods.

ExecutionSession now creates a special 'main' JITDylib upon construction. All
subsequently created JITDylibs are added to the main JITDylib's search order by
default (controlled by the AddToMainDylibSearchOrder parameter to
ExecutionSession::createDylib). The main JITDylib's search order will be used in
the future to properly handle cross-JITDylib weak symbols, with the first
definition in this search order selected.

This commit also adds a new ExecutionSession::lookup convenience method that
performs a blocking lookup using the main JITDylib's search order, as this will
be a very common operation for clients.

Finally, new convenience overloads of IRLayer and ObjectLayer's add methods are
introduced that add the given program representations to the main dylib, which
is likely to be the common case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342086 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Make tied inline asm operands work again

Summary:
rL341389 broke code with tied register operands in inline assembly. For
example, `asm("" : "=r"(var) : "0"(var));`
The code above specifies the input operand to be in the same register
with the output operand, tying the two register. This patch makes this
kind of code work again.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, eraman, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342084 91177308-0d34-0410-b5e6-96231b3b80d8

revert r341288 - [Reassociate] swap binop operands to increase factoring potential

This causes or exposes indeterminism that is visible in the output of -reassociate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342083 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for unsigned add overflow; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342082 91177308-0d34-0410-b5e6-96231b3b80d8

Guard FMF context by excluding some FP operators from FPMathOperator

Summary:
Some FPMathOperators succeed and the retrieve FMF context when they never have it, we should omit these cases to keep from removing FMF context.

For instance when we visit some FPMathOperator mapped Instructions which never have FMF flags and a Node was associated which does have FMF flags, that Node today will have all its flags cleared via the intersect operation. With this change, we exclude associating Nodes that never have FPMathOperator status under FMF.

Reviewers: spatel, wristow, arsenm, hfinkel, aemerson

Reviewed By: spatel

Subscribers: llvm-commits, wdng

Differential Revision: https://reviews.llvm.org/D51145

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342081 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Emit old fpo data to the PDB file.

r342003 added support for emitting FPO data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB
file.  However, that is not the end of the story.  FPO can end
up in two different destinations in a PDB, each corresponding to
a different FPO data source.

The case handled by r342003 involves copying data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the
"New FPO" stream in the PDB, which is then referred to by the
DBI stream.  The case handled by this patch involves copying
records from the .debug$F section of an object file to the "FPO"
stream (or perhaps more aptly, the "Old FPO" stream) in the PDB
file, which is also referred to by the DBI stream.

The formats are largely similar, and the difference is mostly
only visible in masm generated object files, such as some of the
low-level CRT object files like memcpy.  MASM doesn't appear to
support writing the DEBUG_S_FRAMEDATA subsection, and instead
just writes these records to the .debug$F section.

Although clang-cl does not emit a .debug$F section ever, lld still
needs to support it so we have good debugging for CRT functions.

Differential Revision: https://reviews.llvm.org/D51958

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342080 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Use legalized type for extracted elements in scalarizeShuffle

Scalarization of a shuffle will break up the source vectors into individual
elements, and use them to assemble the resulting vector. An element type of
a legal vector type may not necessarily be a legal scalar type, so make
sure that the extracted values are extended to a legal scalar type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342079 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Print all kernel descriptor directives (including the ones with default values)

Change by Tony Tye

Differential Revision: https://reviews.llvm.org/D51954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342077 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Drop newly-added interference-tests-for-high-bit-check.ll

Now that i have actually double-checked, no,
there is no such interference possible...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342076 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] R38708 - inefficient pattern for high-bits checking.

More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA
https://godbolt.org/z/o4RB8D

Also, we need to be careful not to skip some patters...

https://bugs.llvm.org/show_bug.cgi?id=38708

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342074 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Re-apply r341982 after fixing the layering issue

Move isa version determination into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue when we
output wrong isa version in the object code when features
of a particular CPU are altered (i.e. gfx902 w/o xnack
used to result in gfx900).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342069 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Inefficient pattern for high-bits checking (PR38708)

Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only `n` bits wide (where `n` is a variable.)
There are **many** ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

Let's handle the second variant first, since it is much simpler.
https://rise4fun.com/Alive/LYjY

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51985

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342067 91177308-0d34-0410-b5e6-96231b3b80d8

[objcopy] make objcopy follow program header standards

Submitted on behalf of Armando Montanez (amontanez@google.com).

Objects with unused program headers copied by objcopy would always have
nonzero values for program header offset and program header entry size.
While technically valid, this atypical behavior triggers warnings in some
tools. This change sets the two fields to zero when the program header is
unused, better fitting the general expectations for unused program header
data.

Section headers behaved somewhat similarly (though only with the entry size),
and are fixed in this revision as well.

Differential Revision: https://reviews.llvm.org/D51961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342065 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] SIMD comparisons

Summary:
Match the ordering semantics of non-vector comparisons. For
floating point comparisons that do not correspond to instructions, the
tests check that some vector comparison instruction was emitted but do
not care about the full implementation.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342064 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Tighten f64<->f16 conversion requirements

Fix missing Requires fields.

Patch by Bernard Ogden (bogden)

Reviewers: SjoerdMeijer, javed.absar, t.p.northover

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D51631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342061 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove isel patterns for ADCX instruction

There's no advantage to this instruction unless you need to avoid touching other flag bits. It's encoding is longer, it can't fold an immediate, it doesn't write all the flags.

I don't think gcc will generate this instruction either.

Fixes PR38852.

Differential Revision: https://reviews.llvm.org/D51754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342059 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch] Use generic One,Two,ThreeOps_match classes (NFC).

Currently we have a few duplicated matcher classes, which all do pretty
much the same thing. This patch introduces generic
One,Tow,ThreeOps_match classes which take the opcode the match as
template argument.

Reviewers: SjoerdMeijer, dneilson, spatel, arsenm

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D51044

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342058 91177308-0d34-0410-b5e6-96231b3b80d8

Reverting r342048, which caused UBSan failures in dsymutil.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342056 91177308-0d34-0410-b5e6-96231b3b80d8

[GVNHoist] computeInsertionPoints() miscalculates IDF

Fix for https://bugs.llvm.org/show_bug.cgi?id=38912.

In GVNHoist::computeInsertionPoints() we iterate over the Value
Numbers and calculate the Iterated Dominance Frontiers without
clearing the IDFBlocks vector first. IDFBlocks ends up accumulating
an insane number of basic blocks, which bloats the compilation time
of SemaChecking.cpp with ubsan enabled.

Differential Revision: https://reviews.llvm.org/D51980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342055 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] R38708 - inefficient pattern for high-bits checking.

The simplest pattern for now:
https://rise4fun.com/Alive/LYjY
https://godbolt.org/z/o4RB8D

https://bugs.llvm.org/show_bug.cgi?id=38708

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342054 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Implement aarch64_vector_pcs codegen support.

This patch adds codegen support for the saving/restoring
V8-V23 for functions specified with the aarch64_vector_pcs
calling convention attribute, as added in patch D51477.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D51479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342049 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Refactoring range list dumping to fold DWARF v4 functionality into v5 handling

Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.

Reviewer: dblaikie, JDevlieghere, vleschuk

Differential revision: https://reviews.llvm.org/D51081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342048 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Made numerous methods of ImmutableList const

Also added ImmutableList<T>::iterator::operator->.

Differential Revision: https://reviews.llvm.org/D51881

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342045 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] Ensure splitgep gives deterministic output

The output of splitLargeGEPOffsets does not appear to be deterministic because
of the way that we iterate over a DenseMap. I've changed it to a MapVector for
consistent output.

The test here isn't particularly great, only showing a consmetic difference in
output. The original reproducer is much larger but show a diffierence in
instruction ordering, leading to different codegen.

Differential Revision: https://reviews.llvm.org/D51851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342043 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Follow-up to rL342033

Fixed typo which can cause segfault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342040 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Put an alignment on generated switch tables

Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51800

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342039 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] NFC: Refactoring to prepare for vector PCS.

This patch refactors several parts of AArch64FrameLowering
so that it can be easily extended to support saving/restoring
of FPR128 (Q) registers.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D51478

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342038 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis][NFC]Remove dead function parameter

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342035 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Exchange MAC operands in ARMParallelDSP

SMLAD and SMLALD instructions also come in the form of SMLADX and
SMLALDX which perform an exchange on their second operand. To support
this, more of the loads in the MAC candidates are compared for
sequential access and a boolean value has been added to BinOpChain.

AddMACCandiate has been refactored into a small pattern matching
state machine to reduce the amount of duplicated code, but also to
enable the matching to be more flexible. CreateParallelMACPairs now
iterates through all the candidates to find parallel ones.

Differential Revision: https://reviews.llvm.org/D51424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342033 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Allow bitcasts in ARMCodeGenPrepare

Allow bitcasts in the use-def chains, treating them as sources.

Differential Revision: https://reviews.llvm.org/D50758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342032 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add parsing of aarch64_vector_pcs attribute.

This patch adds parsing support for the 'aarch64_vector_pcs'
calling convention attribute to calls and function declarations.

More information describing the vector ABI and procedure call standard
can be found here:

https://developer.arm.com/products/software-development-tools/\
hpc/arm-compiler-for-hpc/vector-function-abi

Reviewers: t.p.northover, rnk, rengolin, javed.absar, thegameg, SjoerdMeijer

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D51477

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342030 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)

Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use
them for VPlan outside LoopVectorize.cpp

Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00

Reviewed By: rengolin, xbolva00

Differential Revision: https://reviews.llvm.org/D49488

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342027 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused include from IVDescriptors.cpp.

This fixes a layering violation:

Analysis/IVDescrtors.cpp can't include Transforms/Utils/BasicBlockUtils.h,
since TransformUtils depends on Analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342024 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."

This reverts commit r341982.

The change introduced a layering violation. Reverting to unbreak
our integrate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342023 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Remove some code from PromoteIntOp_MGATHER that handles UpdateNodeOperands returning an existing node instead of updating.

I suspect this became unecessary when the CSE of mgather was fixed in r338080. It may still be possible to hit this if we widen the element type of a gather outside of type legalization and the promote the mask of a separate gather node so they become the same. But that seems pretty unlikely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342022 91177308-0d34-0410-b5e6-96231b3b80d8

Break LoopUtils into an Analysis file.

Summary:
The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory.

The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html).

Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h|cpp to Analysis/IVDescriptors.h|cpp.

Reviewers: dmgreen, llvm-commits, hfinkel

Reviewed By: dmgreen

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D51153

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342016 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32

Summary:
In GNUX23, is64BitMode returns true, but pointers are 32-bits. So we shouldn't copy pointer values into RSI/RDI since the widths don't match.

Fixes PR38865 despite what the title says. I think the llvm_unreachable in the copyPhysReg code tricked the optimizer and made the fatal error trigger.

Reviewers: rnk, efriedma, MatzeB, echristo

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342015 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Remove some unused typedefs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342013 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Add codegen size remarks to the MachineOutliner

Since the outliner is a module pass, it doesn't get codegen size remarks like
the other codegen passes do. This adds size remarks *to* the outliner.

This is kind of a workaround, so it's peppered with FIXMEs; size remarks
really ought to not ever be handled by the pass itself. However, since the
outliner is the only "MachineModulePass", this works for now. Since the
entire purpose of the MachineOutliner is to produce code size savings, it
really ought to be included in codgen size remarks.

If we ever go ahead and make a MachineModulePass (say, something similar to
MachineFunctionPass), then all of this ought to be moved there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342009 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add folds for unsigned-overflow compares

Name: op_ugt_sum
  %a = add i8 %x, %y
  %r = icmp ugt i8 %x, %a
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

Name: sum_ult_op
  %a = add i8 %x, %y
  %r = icmp ult i8 %a, %x
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

https://rise4fun.com/Alive/ZRxI

AFAICT, this doesn't interfere with any add-saturation patterns
because those have >1 use for the 'add'. But this should be
better for IR analysis and codegen in the basic cases.

This is another fold inspired by PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342004 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Write FPO Data to the PDB.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342003 91177308-0d34-0410-b5e6-96231b3b80d8

[cmake] Speed up check-llvm 5x by delay loading shell32 and ole32

Summary:
Previously, check-llvm on my Windows 10 workstation took about 300s to
run, and it would lock up my mouse. Now, it takes just under 60s.
Previously running the tests only used about 15% of the available CPU
time, now it uses up to 60%.

Shell32.dll and ole32.dll have direct dependencies on user32.dll and
gdi32.dll. These DLLs are mostly used to for Windows GUI functionality,
which most LLVM tools don't need. It seems that these DLLs acquire and
release some system resources on startup and exit, and that requires
acquiring the same highly contended lock discussed in this post:
https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/

Most LLVM tools still have a transitive dependency on
SHGetKnownFolderPathW, which is used to look up the user home directory
and local AppData directory. We also use SHFileOperationW to recursively
delete a directory, but only LLDB and clang-doc use this today. At some
point, we might consider removing these last shell32.dll deps, but for
now, I would like to take this free performance.

Many thanks to Bruce Dawson for suggesting this fix. He looked at an ETW
trace of an LLVM test run, noticed these DLLs, and suggested avoiding
them.

Reviewers: zturner, pcc, thakis

Subscribers: mgorny, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D51952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342002 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[GVNHoist] Re-enable GVNHoist by default"

This reverts rL341954.

The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:

[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342001 91177308-0d34-0410-b5e6-96231b3b80d8

Apply local fixes intended to be part of r341999.'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342000 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Decode and dump FP regs from S_FRAMEPROC records

Summary:
There are two registers encoded in the S_FRAMEPROC flags: one for locals
and one for parameters. The encoding is described by the
ExpandEncodedBasePointerReg function in cvinfo.h. Two bits are used to
indicate one of four possible values:

  0: no register - Used when there are no variables.
  1: SP / standard - Variables are stored relative to the standard SP
     for the ISA.
  2: FP - Variables are addressed relative to the ISA frame
     pointer, i.e. EBP on x86. If realignment is required, parameters
     use this. If a dynamic alloca is used, locals will be EBP relative.
  3: Alternative - Variables are stored relative to some alternative
     third callee-saved register. This is required to address highly
     aligned locals when there are dynamic stack adjustments. In this
     case, both the incoming SP saved in the standard FP and the current
     SP are at some dynamic offset from the locals. LLVM uses ESI in
     this case, MSVC uses EBX.

Most of the changes in this patch are to pass around the CPU so that we
can decode these into real, named architectural registers.

Subscribers: hiraditya

Differential Revision: https://reviews.llvm.org/D51894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341999 91177308-0d34-0410-b5e6-96231b3b80d8

[object] Improve the performance of getSymbols used by ArchiveWriter

In this diff we adjust the code of getSymbols to avoid creating LLVMContext when it's not necessary.
Without this patch when the function getSymbols was called on a MachO object with a __bitcode section
it was parsing the embedded bitcode and then ignoring the result.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D51759

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341998 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add folds for icmp with xor mask constant

These are the folds in Alive;
Name: xor_ult
Pre: isPowerOf2(-C1)
%xor = xor i8 %x, C1
%r = icmp ult i8 %xor, C1
=>
%r = icmp ugt i8 %x, ~C1

Name: xor_ugt
Pre: isPowerOf2(C1+1)
%xor = xor i8 %x, C1
%r = icmp ugt i8 %xor, C1
=>
%r = icmp ugt i8 %x, C1

https://rise4fun.com/Alive/Vty

The ugt case in its simplest form was already handled by DemandedBits,
but that's not ideal as shown in the multi-use test.

I'm not sure if these are all of the symmetrical folds, but I adjusted
the existing code for one of the folds to try to show the similarities.

There's no obvious connection, but this is another preliminary step
for PR14613...
https://bugs.llvm.org/show_bug.cgi?id=14613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341997 91177308-0d34-0410-b5e6-96231b3b80d8

add IR flags to MI

Summary: Initial support for nsw, nuw and exact flags in MI

Reviewers: spatel, hfinkel, wristow

Reviewed By: spatel

Subscribers: nlopes

Differential Revision: https://reviews.llvm.org/D51738

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341996 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[SanitizerCoverage] Create comdat for global arrays."

This reverts r341987 since it will cause trouble when there's a module
ID collision.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341995 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for icmp with xor; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341993 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Quote arguments containing \n on Windows

Fixes at_file.c test failure caused by r341988. We may want to change
how we treat \n in our tokenizer, but this is probably a good fix
regardless, since we can invoke all kinds of programs with different
interpretations of the command line quoting rules.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341992 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Avoid calling CommandLineToArgvW from shell32.dll

Summary:
Shell32.dll depends on gdi32.dll and user32.dll, which are mostly DLLs
for Windows GUI functionality. LLVM's utilities don't typically need GUI
functionality, and loading these DLLs seems to be slowing down startup.
Also, we already have an implementation of Windows command line
tokenization in cl::TokenizeWindowsCommandLine, so we can just use it.

The goal is to get the original argv in UTF-8, so that it can pass
through most LLVM string APIs. A Windows process starts life with a
UTF-16 string for its command line, and it can be retreived with
GetCommandLineW from kernel32.dll.

Previously, we would:
1. Get the wide command line
2. Call CommandLineToArgvW to handle quoting rules and separate it into
   arguments.
3. For each wide argument, expand wildcards (* and ?) using
   FindFirstFileW.
4. Convert each argument to UTF-8

Now we:
1. Get the wide command line, convert the whole thing to UTF-8
2. Tokenize the UTF-8 command line with cl::TokenizeWindowsCommandLine
3. For each argument, expand wildcards if present
   - This requires converting back to UTF-16 to call FindFirstFileW
   - Results of FindFirstFileW must be converted back to UTF-8

Reviewers: zturner

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51941

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341988 91177308-0d34-0410-b5e6-96231b3b80d8

[SanitizerCoverage] Create comdat for global arrays.

Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341987 91177308-0d34-0410-b5e6-96231b3b80d8

Update MemorySSA in LoopUnswitch.

Summary:
Update MemorySSA in old LoopUnswitch pass.
Actual dependency and update is disabled by default.

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45301

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341984 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination
into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue when we
output wrong isa version in the object code when features
of a particular CPU are altered (i.e. gfx902 w/o xnack
used to result in gfx900).

Differential Revision: https://reviews.llvm.org/D51890

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341982 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] enhance vector demanded elements to look at a vector select condition operand

I noticed that we were not back-propagating undef lanes to shuffle masks when we have a
shuffle that reduces the vector width. This is part of investigating/solving PR38691:
https://bugs.llvm.org/show_bug.cgi?id=38691

The DAG equivalent was proposed with:
D51696

Differential Revision: https://reviews.llvm.org/D51433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341981 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Delay calculation of Cycles per Resources, separate the cycles and resource quantities.

Summary:
This patch removes the storing of accumulated floating point data
within the llvm-mca library.

This patch splits-up the two quantities: cycles and number of resource units.
By splitting-up these two quantities, we delay the calculation of "cycles per resource unit"
until that value is read, reducing the chance of accumulating floating point error.

I considered using the APFloat, but after measuring performance, for a large (many iteration)
sample, I decided to go with this faster solution.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: llvm-commits, javed.absar, tschuett, gbedwell

Differential Revision: https://reviews.llvm.org/D51903

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341980 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for add-with-overflow compares; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341979 91177308-0d34-0410-b5e6-96231b3b80d8

[gcov] Fix branch counters with switch statements (fix PR38821)

Right now, the counters are added in regards of the number of successors
for a given BasicBlock: it's good when we've only 1 or 2 successors (at
least with BranchInstr). But in the case of a switch statement, the
BasicBlock after switch has several predecessors and we need know from
which BB we're coming from.

So the idea is to revert what we're doing: add a PHINode in each block
which will select the counter according to the incoming BB.  They're
several pros for doing that:

- we fix the "switch" bug
- we remove the function call to "__llvm_gcov_indirect_counter_increment"
  and the lookup table stuff
- we replace by PHINodes, so the optimizer will probably makes a better
  job.

Patch by calixte!

Differential Revision: https://reviews.llvm.org/D51619

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341977 91177308-0d34-0410-b5e6-96231b3b80d8

Add some context to fatal verifier errors

Summary: Add function name when verification fails as an initial breadcrumb for debugging.

Patch by David Callahan.

Reviewers: mehdi_amini, modocache

Reviewed By: modocache

Subscribers: llvm-commits, modocache

Differential Revision: https://reviews.llvm.org/D51386

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341974 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Prefer unpckhpd over movhlps in isel for fake unary cases

In r337348, I changed lowering to prefer X86ISD::UNPCKL/UNPCKH opcodes over MOVLHPS/MOVHLPS for v2f64 {0,0} and {1,1} shuffles when we have SSE2. This enabled the removal of a bunch of weirdly bitcasted isel patterns in r337349. To avoid changing the tests I placed a gross hack in isel to still emit movhlps instructions for fake unary unpckh nodes. A similar hack was not needed for unpckl and movlhps because we do execution domain switching for those. But unpckh and movhlps have swapped operand order.

This patch removes the hack.

This is a code size increase since unpckhpd requires a 0x66 prefix and movhlps does not. But if that's a big concern we should be using movhlps for all unpckhpd opcodes and let commuteInstruction turnit into unpckhpd when its an advantage.

Differential Revision: https://reviews.llvm.org/D49499

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341973 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Teach X86FastISel::X86SelectRet to use EAX for the sret pointer in GNUX32

GNUX32 uses 32-bit pointers despite is64BitMode being true. So we should use EAX to return the value.

Fixes ones of the failures from PR38865.

Differential Revision: https://reviews.llvm.org/D51940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341972 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fix incorrect usage of getPrimitiveSizeInBits when we should be using the element size for vectors

For vectors, getPrimitiveSizeInBits returns the full vector width. This code should using the element size for vectors. This could be fixed by calling getScalarSizeInBits, but its even easier to just get it from the APInt we're checking.

Differential Revision: https://reviews.llvm.org/D51938

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341971 91177308-0d34-0410-b5e6-96231b3b80d8

[CallSiteSplitting] Add debug location to created PHI nodes.

There are 2 cases when we create PHI nodes:
* For the result of the call that was duplicated in the split blocks.
   Those PHI nodes should have the debug location of the call.

* For values produced before the call. Those instructions need to be
   duplicated in the split blocks and the PHI nodes should have the
   debug locations of those instructions.

Fixes PR37962.

Reviewers: junbuml, gbedwell, vsk

Reviewed By: junbuml

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341970 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Lower dbg.declare into indirect DBG_VALUE

Summary:
D31439 changed the semantics of dbg.declare to take the address of a
variable as the first argument, making it indirect. It specifically
updated FastISel for this change here:

https://reviews.llvm.org/D31439#change-WVArzi177jPl

GlobalISel needs to follow suit, or else it will be missing a level of
indirection in the generated debuginfo. This problem was seen in a Rust
debuginfo test on aarch64, since GlobalISel is used at -O0 for aarch64.

https://github.com/rust-lang/rust/issues/49807
https://bugzilla.redhat.com/show_bug.cgi?id=1611597
https://bugzilla.redhat.com/show_bug.cgi?id=1625768

Reviewers: dblaikie, aprantl, t.p.northover, javed.absar, rnk

Reviewed By: rnk

Subscribers: #debug-info, rovka, kristof.beyls, JDevlieghere, llvm-commits, tstellar

Differential Revision: https://reviews.llvm.org/D51749

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341969 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopInfo][FIX] Remove leftover dump in unit test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341968 91177308-0d34-0410-b5e6-96231b3b80d8