git.osdn.net Git - android-x86/external-llvm.git/log

Mehdi Amini [Thu, 31 Mar 2016 21:55:35 +0000 (21:55 +0000)]

Revert "Add disk_space() to llvm::fs"

Breaks windows bot.
This reverts commit r265050.
This reverts commit r265055.

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265062 91177308-0d34-0410-b5e6-96231b3b80d8

Evgeniy Stepanov [Thu, 31 Mar 2016 21:55:11 +0000 (21:55 +0000)]

Preserve blockaddress use edges in the module splitter.

"blockaddress" cannot apply to an external function. All
blockaddress constant uses must belong to the same module as the
definition of the target function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265061 91177308-0d34-0410-b5e6-96231b3b80d8

David Majnemer [Thu, 31 Mar 2016 21:29:57 +0000 (21:29 +0000)]

[NVPTX] Infer __nvvm_reflect as nounwind, readnone

This patch simply mirrors the attributes we give to @llvm.nvvm.reflect
onto the __nvvm_reflect libdevice call. This shaves away about 30% of the
code in libdevice because of CSE opportunities. It also helps us
figure out that libdevice implementations of transcendental functions
don't have side effects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265060 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Thu, 31 Mar 2016 21:13:49 +0000 (21:13 +0000)]

Wdocumentation parameter fix

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265055 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjay Patel [Thu, 31 Mar 2016 21:00:48 +0000 (21:00 +0000)]

fix typo; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265054 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Thu, 31 Mar 2016 20:57:36 +0000 (20:57 +0000)]

Fixed signed/unsigned warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265052 91177308-0d34-0410-b5e6-96231b3b80d8

Jun Bum Lim [Thu, 31 Mar 2016 20:53:47 +0000 (20:53 +0000)]

[AArch64] Allow loads with imp-def to be handled in getMemOpBaseRegImmOfsWidth()

Summary:
This change allows loads with imp-def to be clustered in the machine-scheduler pass.
areMemAccessesTriviallyDisjoint() can also handle loads with imp-def.

Reviewers: mcrosier, jmolloy, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18665

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265051 91177308-0d34-0410-b5e6-96231b3b80d8

Mehdi Amini [Thu, 31 Mar 2016 20:48:27 +0000 (20:48 +0000)]

Add disk_space() to llvm::fs

Summary: Adapted from Boost::filesystem.

Reviewers: bruno, silvas

Subscribers: tberghammer, danalbert, llvm-commits, srhines

Differential Revision: http://reviews.llvm.org/D18467

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265050 91177308-0d34-0410-b5e6-96231b3b80d8

Hal Finkel [Thu, 31 Mar 2016 20:45:00 +0000 (20:45 +0000)]

[PowerPC] Cleanup test/CodeGen/PowerPC/qpx-load-splat.ll

Removing unnecessary attributes and metadata...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265049 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjay Patel [Thu, 31 Mar 2016 20:40:32 +0000 (20:40 +0000)]

[x86] add memset tests to show another potential improvement

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265048 91177308-0d34-0410-b5e6-96231b3b80d8

Hal Finkel [Thu, 31 Mar 2016 20:39:41 +0000 (20:39 +0000)]

[PowerPC] Add a late MI-level pass for QPX load/splat simplification

Chapter 3 of the QPX manual states that, "Scalar floating-point load
instructions, defined in the Power ISA, cause a replication of the source data
across all elements of the target register." Thus, if we have a load followed
by a QPX splat (from the first lane), the splat is redundant. This adds a late
MI-level pass to remove the redundant splats in some of these cases
(specifically when both occur in the same basic block).

This optimization is scheduled just prior to post-RA scheduling. It can't happen
before anything that might replace the load with some already-computed quantity
(i.e. store-to-load forwarding).
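
As a rough illustration of the idea (not the actual pass), the redundant-splat removal can be modelled on a toy instruction list. The `Op`, `Inst`, and `removeRedundantSplats` names are hypothetical, and the sketch assumes SSA-like registers that are each defined once within a single basic block, which is simpler than what a late MI-level pass has to handle.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Toy opcodes: a scalar FP load (which replicates its value across all
// lanes), a splat from lane 0, and everything else.
enum class Op { ScalarFPLoad, SplatLane0, Other };

struct Inst {
  Op op;
  int dst;  // register defined by this instruction
  int src;  // register read, or -1 if none
};

// Returns a copy of the block with splats of load results dropped.
// Assumption: every register is defined at most once in the block.
std::vector<Inst> removeRedundantSplats(const std::vector<Inst> &block) {
  std::unordered_set<int> replicated;      // results of scalar FP loads
  std::unordered_map<int, int> forwarded;  // removed splat dst -> load dst
  std::vector<Inst> out;
  for (Inst inst : block) {
    // Redirect uses of a removed splat to the load it duplicated.
    auto fwd = forwarded.find(inst.src);
    if (fwd != forwarded.end())
      inst.src = fwd->second;
    // A lane-0 splat of an already replicated value is redundant.
    if (inst.op == Op::SplatLane0 && replicated.count(inst.src)) {
      forwarded[inst.dst] = inst.src;
      continue;
    }
    if (inst.op == Op::ScalarFPLoad)
      replicated.insert(inst.dst);
    out.push_back(inst);
  }
  return out;
}
```

A splat whose source was not produced by a scalar load in the same block is left untouched, mirroring the "same basic block" restriction above.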

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265047 91177308-0d34-0410-b5e6-96231b3b80d8

Hans Wennborg [Thu, 31 Mar 2016 20:27:30 +0000 (20:27 +0000)]

Revert r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)"

I think it might have caused these build breakages:
http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/7234/steps/build%20stage%202/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19566/steps/run%20tests/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265046 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Thu, 31 Mar 2016 20:26:30 +0000 (20:26 +0000)]

[X86][SSE] Some basic tests for variable shuffles

We don't really support non-constant shuffle masks, but these tests are for cases where BUILD_VECTOR is made up from vector extracts (as well as undef/zero scalars).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265045 91177308-0d34-0410-b5e6-96231b3b80d8

Evgeniy Stepanov [Thu, 31 Mar 2016 20:21:31 +0000 (20:21 +0000)]

Preserve extern_weak linkage in CloneModule.

Only force "extern" linkage if the function used to be a definition
in the source module. Declarations keep their original linkage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265043 91177308-0d34-0410-b5e6-96231b3b80d8

Chris Bieneman [Thu, 31 Mar 2016 20:03:19 +0000 (20:03 +0000)]

[CMake] Provide the ability to skip stripping when generating dSYMs

For debugging it is useful to be able to generate dSYM files but not strip the executables. This change adds the ability to skip stripping by setting LLVM_EXTERNALIZE_DEBUGINFO_SKIP_STRIP=On.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265041 91177308-0d34-0410-b5e6-96231b3b80d8

Benjamin Kramer [Thu, 31 Mar 2016 19:42:04 +0000 (19:42 +0000)]

[ARM] Expand v1i64 and v2i64 ctpop.

The default is legal, which results in 'Cannot select' errors. This is
triggered during selfhost due to a recent cost model change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265040 91177308-0d34-0410-b5e6-96231b3b80d8

Hans Wennborg [Thu, 31 Mar 2016 19:26:24 +0000 (19:26 +0000)]

[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)

For code such as:

  void f(int, int);
  void g() {
      f(1, 2);
  }

compiled for 32-bit X86 Linux, Clang would previously generate:

  subl    $12, %esp
  subl    $8, %esp
  pushl   $2
  pushl   $1
  calll   f
  addl    $16, %esp
  addl    $12, %esp
  retl

This patch fixes that by merging adjacent stack adjustments in
eliminateCallFramePseudoInstr().
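
The merging step can be sketched on a toy instruction stream. This is only a model under stated assumptions: `Inst` and `mergeAdjacentSPAdjusts` are hypothetical names, stack adjustments are represented as signed byte counts (negative for `subl`, positive for `addl`), and nothing here reflects the real MachineInstr API.

```cpp
#include <vector>

// Toy instruction: either a stack-pointer adjustment or anything else.
struct Inst {
  bool isSPAdjust;
  int amount;  // negative models "subl $N, %esp", positive "addl $N, %esp"
};

// Folds runs of adjacent stack adjustments into a single adjustment.
std::vector<Inst> mergeAdjacentSPAdjusts(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  for (const Inst &I : in) {
    if (I.isSPAdjust && !out.empty() && out.back().isSPAdjust) {
      out.back().amount += I.amount;
      // Adjustments that cancel out completely can be dropped.
      if (out.back().amount == 0)
        out.pop_back();
      continue;
    }
    out.push_back(I);
  }
  return out;
}
```

Applied to the sequence quoted above, the two leading `subl` instructions would fold into one adjustment of 20 bytes and the two trailing `addl` instructions into one of 28 bytes.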

Differential Revision: http://reviews.llvm.org/D18627

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265039 91177308-0d34-0410-b5e6-96231b3b80d8

Hans Wennborg [Thu, 31 Mar 2016 18:33:38 +0000 (18:33 +0000)]

Change eliminateCallFramePseudoInstr() to return an iterator

This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from
PrologEpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.
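
The same "return the next iterator" convention is what `std::list::erase` uses, so the calling pattern can be sketched with a standard container. `eliminatePseudo` and `removePseudos` are hypothetical stand-ins, and negative integers play the role of pseudo instructions; this is not the PEI code.

```cpp
#include <list>

// Hypothetical stand-in for a function that erases the instruction at `it`
// and returns an iterator to the instruction that follows it.
std::list<int>::iterator eliminatePseudo(std::list<int> &block,
                                         std::list<int>::iterator it) {
  return block.erase(it);
}

// The caller simply continues from the returned iterator; it needs no
// bookkeeping to find its place again after the erase.
int removePseudos(std::list<int> &block) {
  int removed = 0;
  for (auto it = block.begin(); it != block.end();) {
    if (*it < 0) {  // negative values model pseudo instructions
      it = eliminatePseudo(block, it);
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```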

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265036 91177308-0d34-0410-b5e6-96231b3b80d8

Daniel Dunbar [Thu, 31 Mar 2016 18:22:55 +0000 (18:22 +0000)]

[lit][googletest] Handle upstream gtest output

Summary:
Upstream googletest prints "Running main() from gtest_main.cc" to stdout prior
to running tests. LLVM removed that print statement in r61540. If a user were
to use lit to run tests that use upstream googletest, however, lit
reports "Running main()" as an invalid test name.

To avoid such a failure, add an extra conditional to `formats/googletest.py`.
Also add tests to demonstrate the modified behavior.
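
The extra conditional can be pictured with a small parser for a test listing. Note that lit's real implementation is the Python file named above; the C++ function below is only a model, `parseGTestList` is a hypothetical name, and the listing layout (suite names flush left, test names indented by two spaces) is an assumption about googletest's `--gtest_list_tests` output.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Turns a googletest-style listing into fully qualified test names,
// skipping the "Running main() from ..." banner if it is present.
std::vector<std::string> parseGTestList(const std::string &output) {
  std::vector<std::string> tests;
  std::string suite, line;
  std::istringstream in(output);
  while (std::getline(in, line)) {
    if (line.empty())
      continue;
    // The banner is not a test name, so ignore it.
    if (line.rfind("Running main() from", 0) == 0)
      continue;
    if (line.compare(0, 2, "  ") == 0)
      tests.push_back(suite + line.substr(2));  // indented: a test name
    else
      suite = line;                             // flush left: "Suite."
  }
  return tests;
}
```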

Reviewers: abdulras, ddunbar

Subscribers: ddunbar, llvm-commits, kastiglione

Differential Revision: http://reviews.llvm.org/D18606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265034 91177308-0d34-0410-b5e6-96231b3b80d8

Jacques Pienaar [Thu, 31 Mar 2016 17:58:55 +0000 (17:58 +0000)]

[lanai] isBrImm should accept any non-constant immediate.

isBrImm should accept any non-constant immediate. Previously it accepted only LanaiMCExpr immediates, which was wrong.

Differential Revision: http://reviews.llvm.org/D18571

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265032 91177308-0d34-0410-b5e6-96231b3b80d8

Ehsan Amiri [Thu, 31 Mar 2016 17:47:17 +0000 (17:47 +0000)]

[PPC] basic support for Power 9 direct move instructions

http://reviews.llvm.org/D18097

Initial support does not include any patterns to generate these instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265031 91177308-0d34-0410-b5e6-96231b3b80d8

Rong Xu [Thu, 31 Mar 2016 17:39:33 +0000 (17:39 +0000)]

[PGO] use emplace_back. NFC.

Use emplace_back instead of push_back for simplicity.
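
For readers unfamiliar with the difference, a minimal sketch follows. The `Edge` type and `buildEdges` function are hypothetical and unrelated to the PGO code touched by this commit.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Edge {
  int src;
  int dst;
  std::string label;
  Edge(int s, int d, std::string l) : src(s), dst(d), label(std::move(l)) {}
};

std::vector<Edge> buildEdges() {
  std::vector<Edge> edges;
  // push_back needs an Edge object, so a temporary is spelled out.
  edges.push_back(Edge(0, 1, "entry"));
  // emplace_back forwards the constructor arguments and builds in place.
  edges.emplace_back(1, 2, "loop");
  return edges;
}
```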

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265030 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjay Patel [Thu, 31 Mar 2016 17:30:06 +0000 (17:30 +0000)]

[x86] use SSE/AVX ops for non-zero memsets (PR27100)

Move the memset check down to the CPU-with-slow-SSE-unaligned-memops case: this allows fast
targets to take advantage of SSE/AVX instructions and prevents slow targets from stepping
into a codegen sinkhole while trying to splat a byte into an XMM reg.

Follow-on bugs exposed by the current codegen are:
https://llvm.org/bugs/show_bug.cgi?id=27141
https://llvm.org/bugs/show_bug.cgi?id=27143

Differential Revision: http://reviews.llvm.org/D18566

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265029 91177308-0d34-0410-b5e6-96231b3b80d8

Valery Pykhtin [Thu, 31 Mar 2016 17:28:46 +0000 (17:28 +0000)]

[AMDGPU] enable a few disassembler tests that were mistakenly marked as FIXME.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265028 91177308-0d34-0410-b5e6-96231b3b80d8

Hans Wennborg [Thu, 31 Mar 2016 16:42:10 +0000 (16:42 +0000)]

More checks in win32-seh-nested-finally.ll after comment on r264966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265027 91177308-0d34-0410-b5e6-96231b3b80d8

Ulrich Weigand [Thu, 31 Mar 2016 16:38:57 +0000 (16:38 +0000)]

[PowerPC] Attempt to fix fast-isel-i64offset.ll failure

The test case added in r265023 is failing on ninja-x64-msvc-RA-centos6.
Update the test to make less specific assumptions on code generation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265026 91177308-0d34-0410-b5e6-96231b3b80d8

Xinliang David Li [Thu, 31 Mar 2016 16:22:17 +0000 (16:22 +0000)]

Minor code cleanup /NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265025 91177308-0d34-0410-b5e6-96231b3b80d8

Stephan Bergmann [Thu, 31 Mar 2016 15:42:01 +0000 (15:42 +0000)]

Don't use potentially invalidated iterator

If the lhs is evaluated before the rhs, FuncletI's operator-> can trigger the

assert(isHandleInSync() && "invalid iterator access!");

at include/llvm/ADT/DenseMap.h:1061. (Happens e.g. when compiled with GCC 6.)
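
The hazard is easiest to see with a standard container. The sketch below uses `std::unordered_map` rather than DenseMap, and `copyColor` is a hypothetical helper; the point is only that the value is read through the iterator before anything that may insert into the map runs. The C++ dialect LLVM targeted at the time did not specify which side of an assignment is evaluated first.

```cpp
#include <unordered_map>

// Copies the value stored under `from` to the key `to`.
// Returns the copied value, or -1 if `from` is not present.
int copyColor(std::unordered_map<int, int> &colors, int from, int to) {
  auto it = colors.find(from);
  if (it == colors.end())
    return -1;
  // Risky form: colors[to] = it->second;
  // If colors[to] is evaluated first and inserts a new element, the table
  // may grow and `it` may be invalidated before it is dereferenced.
  int value = it->second;  // read through the iterator first
  colors[to] = value;      // only then perform the possibly inserting access
  return value;
}
```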

Differential Revision: http://reviews.llvm.org/D18440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265024 91177308-0d34-0410-b5e6-96231b3b80d8

Ulrich Weigand [Thu, 31 Mar 2016 15:37:06 +0000 (15:37 +0000)]

[PowerPC] Correctly compute 64-bit offsets in fast isel

PPCSimplifyAddress contains this code:

  IntegerType *OffsetTy = ((VT == MVT::i32) ? Type::getInt32Ty(*Context)
                                            : Type::getInt64Ty(*Context));

to determine the type to be used for an index register, if one needs
to be created.  However, the "VT" here is the type of the data being
loaded or stored, *not* the type of an address.  This means that if
a data element of type i32 is accessed using an index that does
not fit into 32 bits, a wrong address is computed here.

Note that PPCFastISel is only ever used on 64-bit currently, so the type
of an address is actually *always* MVT::i64.  Other parts of the code,
even in this same PPCSimplifyAddress routine, already rely on that fact.
Thus, this patch changes the code to simply unconditionally use
Type::getInt64Ty(*Context) as OffsetTy.
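
The effect of picking the wrong offset type can be shown with plain integer arithmetic. This is a simplified model, not fast-isel code: both function names are hypothetical, and the narrow case is modelled as keeping only the low 32 bits of the index.

```cpp
#include <cstdint>

// Models an index register that can only hold 32 bits: the upper half of
// the index is lost before the addition.
uint64_t addrWithI32Offset(uint64_t base, int64_t index) {
  uint64_t truncated = static_cast<uint32_t>(index);
  return base + truncated;
}

// Models the fixed behaviour: the index keeps all 64 bits.
uint64_t addrWithI64Offset(uint64_t base, int64_t index) {
  return base + static_cast<uint64_t>(index);
}
```

With an index of 2^32 + 8, the first function lands only 8 bytes past the base, while the second produces the intended address.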

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265023 91177308-0d34-0410-b5e6-96231b3b80d8

Nemanja Ivanovic [Thu, 31 Mar 2016 15:26:37 +0000 (15:26 +0000)]

[PowerPC] Basic support for P9 atomic loads and stores

This patch corresponds to review:
http://reviews.llvm.org/D18032

This patch provides asm implementation for the following instructions:
lwat, ldat, stwat, stdat, ldmx, mcrxrx

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265022 91177308-0d34-0410-b5e6-96231b3b80d8

Jun Bum Lim [Thu, 31 Mar 2016 14:47:24 +0000 (14:47 +0000)]

[AArch64] Handle missing store pair opportunity

Summary:
This change handles a missed store pair opportunity where the first store
instruction stores zero and is followed by a non-zero store. For example, this
change converts:

  str wzr, [x8]
  str w1, [x8, #4]
into:
  stp wzr, w1, [x8]

Reviewers: jmolloy, t.p.northover, mcrosier

Subscribers: flyingforyou, aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265021 91177308-0d34-0410-b5e6-96231b3b80d8

Ulrich Weigand [Thu, 31 Mar 2016 14:44:50 +0000 (14:44 +0000)]

[PowerPC] Remove incorrect use of COPY_TO_REGCLASS in fast isel

The fast isel pass currently emits a COPY_TO_REGCLASS node to convert
from a F4RC to a F8RC register class during conversion of a
floating-point number to integer. There is actually no support in the
common code instruction printers to emit COPY_TO_REGCLASS nodes, so the
PowerPC back-end has special code there to simply ignore
COPY_TO_REGCLASS.

This is correct *if and only if* the source and destination registers of
COPY_TO_REGCLASS are the same (except for the different register class).
But nothing guarantees this to be the case, and if the register
allocator does end up allocating source and destination to different
registers after all, the back-end simply generates incorrect code. I've
included a test case that shows such incorrect code generation.

However, it seems that COPY_TO_REGCLASS is actually not intended to be
used at the MI layer at all. It is used during SelectionDAG, but always
lowered to a plain COPY before emitting MI. Other back-ends' fast isel
passes never emit COPY_TO_REGCLASS at all. I suspect it is simply wrong
for the PowerPC back-end to emit it here.

This patch changes the PowerPC back-end to directly emit COPY instead of
COPY_TO_REGCLASS and removes the special handling in the instruction
printers.

Differential Revision: http://reviews.llvm.org/D18605

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265020 91177308-0d34-0410-b5e6-96231b3b80d8

Daniel Sanders [Thu, 31 Mar 2016 14:34:00 +0000 (14:34 +0000)]

[mips] Range check simm16

Summary:
There are too many instructions to exhaustively test so addiu and lwc2 are
used as representative examples.

It should be noted that many memory instructions that should have simm16
range checking do not because it is also necessary to support the macro
of the same name which accepts simm32. The range checks for these occur in
the macro expansion.
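
A signed-immediate range check of this kind boils down to a two-sided comparison. The helper below is a hypothetical sketch (LLVM has a similar `isInt<N>` utility in its support library); it is not the assembler's operand-matching code.

```cpp
#include <cstdint>

// True if `v` is representable as an N-bit two's-complement immediate,
// i.e. lies in [-2^(N-1), 2^(N-1) - 1].
template <unsigned N>
constexpr bool fitsSigned(int64_t v) {
  static_assert(N > 0 && N < 64, "bit width out of range");
  return v >= -(int64_t(1) << (N - 1)) && v < (int64_t(1) << (N - 1));
}
```

For simm16 this accepts -32768 through 32767 and rejects everything else.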

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18437

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265019 91177308-0d34-0410-b5e6-96231b3b80d8

Daniel Sanders [Thu, 31 Mar 2016 14:23:20 +0000 (14:23 +0000)]

[mips] Range check simm11 and mem_simm11.

Summary:
ldc2/sdc2 now emit slightly worse diagnostics for MIPS-I. The problem
is that they don't trigger the custom parser because all the candidates
are disabled by feature bits. On all other subtargets, the diagnostics are
accurate but are subject to the usual issues of needing to report multiple
ways to correct the code (e.g. smaller offset, enable a CPU feature) but
only being able to report one error.

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18436

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265018 91177308-0d34-0410-b5e6-96231b3b80d8

Dmitry Polukhin [Thu, 31 Mar 2016 14:16:21 +0000 (14:16 +0000)]

[IFUNC] Introduce GlobalIndirectSymbol as a base class for alias and ifunc

This patch is a part of http://reviews.llvm.org/D15525

The GlobalIndirectSymbol class contains the common implementation for both
aliases and ifuncs. This patch should be an NFC change that just prepares
common code for ifunc support.

Differential Revision: http://reviews.llvm.org/D18433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265016 91177308-0d34-0410-b5e6-96231b3b80d8

Sam Kolton [Thu, 31 Mar 2016 14:15:04 +0000 (14:15 +0000)]

[AMDGPU] Disassembler: support for DPP

Review: http://reviews.llvm.org/D18642

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265015 91177308-0d34-0410-b5e6-96231b3b80d8

Daniel Sanders [Thu, 31 Mar 2016 14:12:01 +0000 (14:12 +0000)]

[mips] Split mem_msa into range checked mem_simm10 and mem_simm10_lsl[123]

Summary:
Also, made test_mi10.s formatting consistent with the majority of the
MC tests.
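
Judging by the operand names, `mem_simm10_lsl[123]` presumably denotes a 10-bit signed immediate scaled by 2, 4, or 8; that reading is an assumption, not something the commit message states. Under it, a range check for a scaled immediate could look like the hypothetical helper below.

```cpp
#include <cstdint>

// True if `v` is a multiple of 2^Shift and v / 2^Shift fits in a signed
// Bits-wide field. Both template parameters are illustrative.
template <unsigned Bits, unsigned Shift>
constexpr bool fitsScaledSigned(int64_t v) {
  static_assert(Bits > 0 && Bits + Shift < 64, "field too wide");
  return v % (int64_t(1) << Shift) == 0 &&
         v / (int64_t(1) << Shift) >= -(int64_t(1) << (Bits - 1)) &&
         v / (int64_t(1) << Shift) < (int64_t(1) << (Bits - 1));
}
```

For a 10-bit field shifted by 2, offsets must be multiples of 4 between -2048 and 2044.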

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18435

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265014 91177308-0d34-0410-b5e6-96231b3b80d8

Nirav Dave [Thu, 31 Mar 2016 13:40:55 +0000 (13:40 +0000)]

Prevent X86ISelLowering from merging volatile loads

Change isConsecutiveLoads to check that loads are non-volatile as this
is a requirement for any load merges. Propagate change to two callers.
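
The shape of such a check can be sketched on a toy load description. `Load` and `isConsecutiveLoad` are hypothetical names, and the real function works on SelectionDAG nodes rather than a struct like this.

```cpp
#include <cstdint>

// Toy description of a load: a base object id, a byte offset from it,
// the access size in bytes, and whether the access is volatile.
struct Load {
  int base;
  int64_t offset;
  unsigned bytes;
  bool isVolatile;
};

// Two loads are merge candidates only if neither is volatile and the
// second starts exactly where the first ends.
bool isConsecutiveLoad(const Load &first, const Load &second) {
  if (first.isVolatile || second.isVolatile)
    return false;
  return first.base == second.base &&
         second.offset == first.offset + static_cast<int64_t>(first.bytes);
}
```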

Reviewers: RKSimon

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265013 91177308-0d34-0410-b5e6-96231b3b80d8

Daniel Sanders [Thu, 31 Mar 2016 13:15:23 +0000 (13:15 +0000)]

[mips] Range check simm9 and fix a bug this revealed.

Summary:
The bug was that microMIPS's [ls]w[lr]e instructions claimed to support a
12-bit offset when it is only 9-bit.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D18434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265010 91177308-0d34-0410-b5e6-96231b3b80d8

Benjamin Kramer [Thu, 31 Mar 2016 10:42:40 +0000 (10:42 +0000)]

[TTI] Let the cost model estimate ctpop costs based on legality

PPC has a vector popcount; this lets the vectorizer use the correct cost
for it. Tweak the X86 test to use an intrinsic that's actually scalarized (we
have a somewhat efficient lowering for vector popcount using SSE, and the
cost model finds that now).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265005 91177308-0d34-0410-b5e6-96231b3b80d8

Zlatko Buljan [Thu, 31 Mar 2016 08:51:24 +0000 (08:51 +0000)]

[mips][microMIPS] Implement MFC*, MFHC* and DMFC* instructions

Differential Revision: http://reviews.llvm.org/D17334

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265002 91177308-0d34-0410-b5e6-96231b3b80d8

Jeroen Ketema [Thu, 31 Mar 2016 08:39:42 +0000 (08:39 +0000)]

Silence warnings in OCaml bindings

* LLVMDisposeMessage lives in llvm-c/Core.h, include this file where necessary
* LLVMAddTargetData has been removed, follow suit in the bindings

Differential Revision: http://reviews.llvm.org/D18633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265001 91177308-0d34-0410-b5e6-96231b3b80d8

Jonas Paulsson [Thu, 31 Mar 2016 08:00:14 +0000 (08:00 +0000)]

Indentation fix in SystemZInstrInfo.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265000 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjoy Das [Thu, 31 Mar 2016 05:14:34 +0000 (05:14 +0000)]

[InstCombine] Fix incorrect rule from rL236202

The rule for SMIN introduced in rL236202 doesn't work as advertised: the
check for Pred == ICmpInst::ICMP_SGT was missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264996 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjoy Das [Thu, 31 Mar 2016 05:14:29 +0000 (05:14 +0000)]

Delete trailing whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264995 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjoy Das [Thu, 31 Mar 2016 05:14:26 +0000 (05:14 +0000)]

[SCEV] Track NoWrap properties using MatchBinaryOp, NFC

This way once we teach MatchBinaryOp to map more things into arithmetic,
the non-wrapping add recurrence construction would understand it too.
Right now MatchBinaryOp still only understands arithmetic, so this is
solely a code-reorganization change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264994 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjoy Das [Thu, 31 Mar 2016 05:14:22 +0000 (05:14 +0000)]

[SCEV] NFC code motion to simplify later change

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264993 91177308-0d34-0410-b5e6-96231b3b80d8

Craig Topper [Thu, 31 Mar 2016 04:37:41 +0000 (04:37 +0000)]

[X86] Use MVT instead of EVT in code called after legalization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264992 91177308-0d34-0410-b5e6-96231b3b80d8

Davide Italiano [Thu, 31 Mar 2016 03:40:07 +0000 (03:40 +0000)]

[DebugInfo] Subprograms should belong to a CU.

Start fixing tests accordingly. There are still
about 35 failures before we can enable this check
in the IR verifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264990 91177308-0d34-0410-b5e6-96231b3b80d8

Hal Finkel [Thu, 31 Mar 2016 02:56:05 +0000 (02:56 +0000)]

[PowerPC] Load two floats directly instead of using one 64-bit integer load

When dealing with complex<float>, and similar structures with two
single-precision floating-point numbers, especially when such things are being
passed around by value, we'll sometimes end up loading both float values by
extracting them from one 64-bit integer load. It looks like this:

  t13: i64,ch = load<LD8[%ref.tmp]> t0, t6, undef:i64
      t16: i64 = srl t13, Constant:i32<32>
    t17: i32 = truncate t16
  t18: f32 = bitcast t17
    t19: i32 = truncate t13
  t20: f32 = bitcast t19

The problem, especially before the P8 where those bitcasts aren't legal (and
get expanded via the stack), is that it would have been better to use two
floating-point loads directly. Here we add a target-specific DAGCombine to do
just that. In short, we turn:

  ld 3, 0(5)
  stw 3, -8(1)
  rldicl 3, 3, 32, 32
  stw 3, -4(1)
  lfs 3, -4(1)
  lfs 0, -8(1)

into:

  lfs 3, 4(5)
  lfs 0, 0(5)
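
At the value level, the shift/truncate/bitcast chain in the DAG dump above recovers two floats from one 64-bit integer. The sketch below models just that arithmetic; the helper names are hypothetical, and which half corresponds to which memory offset depends on the target's byte order, which the code deliberately does not model.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

// Bit-for-bit reinterpretation, the equivalent of the DAG's "bitcast".
float bitsToFloat(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

uint32_t floatToBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Mirrors the srl/truncate/bitcast sequence: the upper and lower 32 bits
// of the loaded word are each reinterpreted as a float.
std::pair<float, float> splitViaInteger(uint64_t word) {
  float hi = bitsToFloat(static_cast<uint32_t>(word >> 32));
  float lo = bitsToFloat(static_cast<uint32_t>(word));
  return {hi, lo};
}
```

Loading the two floats directly yields the same pair of values without the detour through an integer register and the stack.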

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264988 91177308-0d34-0410-b5e6-96231b3b80d8

Sean Silva [Thu, 31 Mar 2016 01:47:33 +0000 (01:47 +0000)]

Fix case confusion.

The test case was defining and using a function 'notExported()', but
the FileCheck checks were checking for the name 'not_exported'. This
changes the test to use 'notExported' across the board. Also, the test
defined a function 'not_defined()', but doesn't have any checks related
to it. For consistency, this name is changed to 'notDefined'. A later
commit will add checks for 'notDefined'.

Patch by Warren Ristow!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264984 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjoy Das [Thu, 31 Mar 2016 00:18:46 +0000 (00:18 +0000)]

Introduce a @llvm.experimental.guard intrinsic

Summary:
As discussed on llvm-dev[1].

This change adds the basic boilerplate code around having this intrinsic
in LLVM:

- Changes in Intrinsics.td, and the IR Verifier
- A lowering pass to lower @llvm.experimental.guard to normal
control flow
- Inliner support

[1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html

Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18527

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264976 91177308-0d34-0410-b5e6-96231b3b80d8

Hans Wennborg [Wed, 30 Mar 2016 23:55:22 +0000 (23:55 +0000)]

Add some more triples after r264966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264972 91177308-0d34-0410-b5e6-96231b3b80d8

Hans Wennborg [Wed, 30 Mar 2016 23:38:01 +0000 (23:38 +0000)]

[X86] Enable call frame optimization ("mov to push") not only for optsize (PR26325)

The size savings are significant, and from what I can tell, both ICC and GCC do this.

Differential Revision: http://reviews.llvm.org/D18573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264966 91177308-0d34-0410-b5e6-96231b3b80d8

Matthias Braun [Wed, 30 Mar 2016 22:46:04 +0000 (22:46 +0000)]

CodeGen: Factor out code for tail call result compatibility check; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264959 91177308-0d34-0410-b5e6-96231b3b80d8

Matthias Braun [Wed, 30 Mar 2016 22:45:58 +0000 (22:45 +0000)]

Avoid unnecessary #include; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264958 91177308-0d34-0410-b5e6-96231b3b80d8

Paul Robinson [Wed, 30 Mar 2016 22:41:06 +0000 (22:41 +0000)]

Update copyright year to 2016.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264954 91177308-0d34-0410-b5e6-96231b3b80d8

Matt Arsenault [Wed, 30 Mar 2016 22:28:52 +0000 (22:28 +0000)]

AMDGPU: Add frexp_exp intrinsic

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264944 91177308-0d34-0410-b5e6-96231b3b80d8

Matt Arsenault [Wed, 30 Mar 2016 22:28:26 +0000 (22:28 +0000)]

AMDGPU: Constant folding for frexp_mant

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264943 91177308-0d34-0410-b5e6-96231b3b80d8

Teresa Johnson [Wed, 30 Mar 2016 22:17:28 +0000 (22:17 +0000)]

Use existing PrintEscapedString in AssemblyWriter

r264884 introduced a helper to escape the backslashes in the source file
path, but I have since discovered an existing mechanism to escape strings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264936 91177308-0d34-0410-b5e6-96231b3b80d8

Peter Collingbourne [Wed, 30 Mar 2016 22:05:13 +0000 (22:05 +0000)]

Cloning: Reduce complexity of debug info cloning and fix correctness issue.

Commit r260791 contained an error in that it would introduce a cross-module
reference in the old module. It also introduced O(N^2) complexity in the
module cloner by requiring the entire module to be visited for each function.
Fix both of these problems by avoiding use of the CloneDebugInfoMetadata
function (which is only designed to do intra-module cloning) and cloning
function-attached metadata in the same way that we clone all other metadata.

Differential Revision: http://reviews.llvm.org/D18583

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264935 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjay Patel [Wed, 30 Mar 2016 21:38:20 +0000 (21:38 +0000)]

fix typos

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264933 91177308-0d34-0410-b5e6-96231b3b80d8

Aaron Ballman [Wed, 30 Mar 2016 21:30:00 +0000 (21:30 +0000)]

Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC.
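
A typical way to address this class of warning, sketched here with a hypothetical `bitMask` helper rather than any of the changed LLVM code, is to make the shifted operand 64 bits wide before shifting.

```cpp
#include <cstdint>

// Builds a mask with bit `n` set, for n in [0, 63].
// Writing `1 << n` would shift a 32-bit int and only then widen the
// result, which is what the warning points at; widening the operand
// first makes the shift itself 64-bit.
uint64_t bitMask(unsigned n) {
  return uint64_t(1) << n;
}
```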

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264929 91177308-0d34-0410-b5e6-96231b3b80d8

Matt Arsenault [Wed, 30 Mar 2016 21:15:18 +0000 (21:15 +0000)]

LegalizeDAG: Don't replace vector store with integer if not legal

For the same reason as the corresponding load change.

Note that ExpandStore is completely broken for vector stores with
non-byte-sized elements, but this preserves the current broken behavior, which
has tests for it. The behavior should be the same, but this now introduces a new
typed store that is incorrectly split later rather than splitting it directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264928 91177308-0d34-0410-b5e6-96231b3b80d8

Matt Arsenault [Wed, 30 Mar 2016 21:15:10 +0000 (21:15 +0000)]

LegalizeDAG: Don't replace vector load with integer unless legal

On AMDGPU we want to be able to promote i64/f64 loads to v2i32.
If the access is unaligned, the legalizer would conclude that, since i64
is legal, the load should be converted back to i64, resulting in an endless
legalization loop.

Extract the logic for scalarizing the load into a new TargetLowering
function, where this can also replace the custom function AMDGPU
has for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264927 91177308-0d34-0410-b5e6-96231b3b80d8

David Majnemer [Wed, 30 Mar 2016 21:12:06 +0000 (21:12 +0000)]

[IndVarSimplify] Don't insert after a catchswitch

Widening a PHI requires us to insert a trunc.
The logical place for this trunc is in the same BB as the PHI.
This is not possible if the BB is terminated by a catchswitch.

This fixes PR27133.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264926 91177308-0d34-0410-b5e6-96231b3b80d8

Justin Lebar [Wed, 30 Mar 2016 20:52:40 +0000 (20:52 +0000)]

Add #include <functional> to PassManagerBuilder, now that it uses std::function. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264923 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Wed, 30 Mar 2016 20:52:24 +0000 (20:52 +0000)]

[X86][AVX] Ensure EltsFromConsecutiveLoads tests the entire vector for consecutive loads/zeros

Fix for an issue introduced in D17297, where we were breaking early from the loop detecting consecutive loads, which could leave us thinking a consecutive load with zeros was possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264922 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Justin Lebar [Wed, 30 Mar 2016 20:40:11 +0000 (20:40 +0000)]

[NVPTX] Make NVVMReflect a function pass.

Summary:
Currently it's a module pass. Make it a function pass so that we can
move it to PassManagerBuilder's EP_EarlyAsPossible extension point,
which only accepts function passes.

Reviewers: rnk

Subscribers: tra, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D18615

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264919 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Justin Lebar [Wed, 30 Mar 2016 20:39:29 +0000 (20:39 +0000)]

[PassManager] Make PassManagerBuilder::addExtension take an std::function, rather than a function pointer.

Summary:
This gives callers flexibility to pass lambdas with captures, which lets
callers avoid the C-style void*-ptr closure style. (Currently, callers
in clang store state in the PassManagerBuilderBase arg.)

No functional change, and the new API is backwards-compatible.
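
As a rough illustration of why std::function is more flexible than a function pointer here, the sketch below uses a hypothetical ExtensionRegistry (not the real PassManagerBuilder API) whose callbacks append pass names to a pipeline. The names ExtensionRegistry, addVerifier and buildExample are assumptions made up for this example.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a builder that stores extension callbacks.
class ExtensionRegistry {
public:
  using ExtensionFn = std::function<void(std::vector<std::string> &)>;

  // Accepts plain functions, function pointers, and capturing lambdas alike.
  void addExtension(ExtensionFn Fn) { Extensions.push_back(std::move(Fn)); }

  // Runs every registered callback and returns the pass names they added.
  std::vector<std::string> run() const {
    std::vector<std::string> Pipeline;
    for (const ExtensionFn &Fn : Extensions)
      Fn(Pipeline);
    return Pipeline;
  }

private:
  std::vector<ExtensionFn> Extensions;
};

// An old-style callback: a plain function still converts to std::function,
// which is what keeps such an API backwards-compatible.
inline void addVerifier(std::vector<std::string> &Pipeline) {
  Pipeline.push_back("verify");
}

// Registers one plain function and one lambda that captures state by value,
// so no void*-style side channel is needed to carry Arch.
inline std::vector<std::string> buildExample(const std::string &Arch) {
  ExtensionRegistry Registry;
  Registry.addExtension(addVerifier);
  Registry.addExtension([Arch](std::vector<std::string> &Pipeline) {
    Pipeline.push_back("reflect-" + Arch);
  });
  return Registry.run();
}
```

Calling buildExample("sm_35") yields a two-entry pipeline whose second entry carries the captured string.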

Reviewers: chandlerc

Subscribers: joker.eph, cfe-commits

Differential Revision: http://reviews.llvm.org/D18613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264918 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Justin Bogner [Wed, 30 Mar 2016 20:36:07 +0000 (20:36 +0000)]

test: Remove a test for a transform that hasn't existed in 5 years.

The TailDup transform was removed in r138841 in 2011, along with most
of the tests for it. This test, however, was missed. Probably because
it had already been XFAIL'd for 3 years at that point (since r52243!)
and continued to fail when the opt flag for -tailduplicate stopped
being valid.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264916 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Hal Finkel [Wed, 30 Mar 2016 19:54:56 +0000 (19:54 +0000)]

Add a copy constructor to StringMap

There is code under review that requires StringMap to have a copy constructor,
and this makes StringMap more consistent with our other containers (like
DenseMap) that have copy constructors.

Differential Revision: http://reviews.llvm.org/D18506

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264906 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Hal Finkel [Wed, 30 Mar 2016 19:37:08 +0000 (19:37 +0000)]

[LoopVectorize] Don't vectorize loops when everything will be scalarized

This change prevents the loop vectorizer from vectorizing when all of the vector
types it generates will be scalarized. I've run into this problem on the PPC's QPX
vector ISA, which only holds floating-point vector types. The loop vectorizer
will, however, happily vectorize loops with purely integer computation. Here's
an example:

  LV: The Smallest and Widest types: 32 / 32 bits.
  LV: The Widest register is: 256 bits.
  LV: Found an estimated cost of 0 for VF 1 For instruction:   %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ]
  LV: Found an estimated cost of 0 for VF 1 For instruction:   %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25
  LV: Found an estimated cost of 0 for VF 1 For instruction:   %2 = trunc i64 %indvars.iv25 to i32
  LV: Found an estimated cost of 1 for VF 1 For instruction:   store i32 %2, i32* %arrayidx, align 4
  LV: Found an estimated cost of 1 for VF 1 For instruction:   %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1
  LV: Found an estimated cost of 1 for VF 1 For instruction:   %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600
  LV: Found an estimated cost of 0 for VF 1 For instruction:   br i1 %exitcond27, label %for.cond.cleanup, label %for.body
  LV: Scalar loop costs: 3.
  LV: Found an estimated cost of 0 for VF 2 For instruction:   %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ]
  LV: Found an estimated cost of 0 for VF 2 For instruction:   %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25
  LV: Found an estimated cost of 0 for VF 2 For instruction:   %2 = trunc i64 %indvars.iv25 to i32
  LV: Found an estimated cost of 2 for VF 2 For instruction:   store i32 %2, i32* %arrayidx, align 4
  LV: Found an estimated cost of 1 for VF 2 For instruction:   %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1
  LV: Found an estimated cost of 1 for VF 2 For instruction:   %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600
  LV: Found an estimated cost of 0 for VF 2 For instruction:   br i1 %exitcond27, label %for.cond.cleanup, label %for.body
  LV: Vector loop of width 2 costs: 2.
  LV: Found an estimated cost of 0 for VF 4 For instruction:   %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ]
  LV: Found an estimated cost of 0 for VF 4 For instruction:   %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25
  LV: Found an estimated cost of 0 for VF 4 For instruction:   %2 = trunc i64 %indvars.iv25 to i32
  LV: Found an estimated cost of 4 for VF 4 For instruction:   store i32 %2, i32* %arrayidx, align 4
  LV: Found an estimated cost of 1 for VF 4 For instruction:   %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1
  LV: Found an estimated cost of 1 for VF 4 For instruction:   %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600
  LV: Found an estimated cost of 0 for VF 4 For instruction:   br i1 %exitcond27, label %for.cond.cleanup, label %for.body
  LV: Vector loop of width 4 costs: 1.
  ...
  LV: Selecting VF: 8.
  LV: The target has 32 registers
  LV(REG): Calculating max register usage:
  LV(REG): At #0 Interval # 0
  LV(REG): At #1 Interval # 1
  LV(REG): At #2 Interval # 2
  LV(REG): At #4 Interval # 1
  LV(REG): At #5 Interval # 1
  LV(REG): VF = 8

The problem is that the cost model here is not wrong, exactly. Since all of
these operations are scalarized, their cost (aside from the uniform ones) is
indeed VF*(scalar cost), just as the model suggests. In fact, the larger the VF
picked, the lower the relative overhead from the loop itself (and the
induction-variable update and check), and so in a sense, picking the largest VF
here is the right thing to do.
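
The arithmetic in the log above can be reproduced with a small sketch. It assumes the per-lane loop cost is the sum of the instruction costs divided by VF using integer division; that is an assumption which happens to match the printed numbers, not a statement about the vectorizer's actual code. The helper names are hypothetical.

```cpp
#include <numeric>
#include <vector>

// Per-lane loop cost: total instruction cost divided by the vectorization
// factor. Integer division is an assumption that reproduces the log values.
inline unsigned costPerLane(const std::vector<unsigned> &InstCosts,
                            unsigned VF) {
  unsigned Total = std::accumulate(InstCosts.begin(), InstCosts.end(), 0u);
  return Total / VF;
}

// Instruction costs for the loop in the log, in order:
// phi, getelementptr, trunc, store, add, icmp, br.
// Only the store is scalarized, so it costs VF * (scalar cost of 1);
// the induction-variable add and the compare are uniform and stay at 1.
inline std::vector<unsigned> loopCosts(unsigned VF) {
  return {0, 0, 0, VF * 1, 1, 1, 0};
}
```

With these inputs, VF 1 gives 3, VF 2 gives 2, and VF 4 gives 1, matching the "Scalar loop costs" and "Vector loop of width N costs" lines: the fixed loop overhead is amortized over more lanes as VF grows.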

The problem is that vectorizing like this, where all of the vectors will be
scalarized in the backend, isn't really vectorizing, but rather interleaving.
By itself, this would be okay, but then the vectorizer itself also interleaves,
and that's where the problem manifests itself. There aren't actually enough
scalar registers to support the normal interleave factor multiplied by a factor
of VF (8 in this example). In other words, the problem with this is that our
register-pressure heuristic does not account for scalarization.

While we might want to improve our register-pressure heuristic, I don't think
this is the right motivating case for that work. Here we have a more-basic
problem: The job of the vectorizer is to vectorize things (interleaving aside),
and if the IR it generates won't generate any actual vector code, then
something is wrong. Thus, if every type looks like it will be scalarized (i.e.
will be split into VF or more parts), then don't consider that VF.
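
A minimal sketch of the rule stated above, assuming we already know how many parts the target splits each vector type into at a candidate VF. The helper names and the treatment of an empty type list are assumptions for illustration, not the code from this change.

```cpp
#include <algorithm>
#include <vector>

// NumPartsPerType[i] is the (hypothetical) number of registers the target
// needs for the i-th vector type at the candidate VF. A type split into VF
// or more parts is effectively scalarized.
inline bool allTypesScalarized(const std::vector<unsigned> &NumPartsPerType,
                               unsigned VF) {
  // Assumption: with no vector types at all there is nothing to reject.
  if (NumPartsPerType.empty())
    return false;
  return std::all_of(NumPartsPerType.begin(), NumPartsPerType.end(),
                     [VF](unsigned Parts) { return Parts >= VF; });
}

// VF 1 is the scalar loop and is always a valid candidate; any other VF is
// skipped when every type it would create ends up scalarized.
inline bool shouldConsiderVF(const std::vector<unsigned> &NumPartsPerType,
                             unsigned VF) {
  return VF == 1 || !allTypesScalarized(NumPartsPerType, VF);
}
```

For the example above, every type at VF 8 would be split into 8 parts, so that VF is rejected; if even one type fits in a single register, the VF stays in consideration.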

This is not a problem specific to PPC/QPX, however. The problem comes up under
SSE on x86 too, and as such, this change fixes PR26837 too. I've added Sanjay's
reduced test case from PR26837 to this commit.

Differential Revision: http://reviews.llvm.org/D18537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264904 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Rong Xu [Wed, 30 Mar 2016 18:37:52 +0000 (18:37 +0000)]

[PGO] PGOFuncName in LTO optimizations

PGOFuncNames are used as the key to retrieve the Function definition from the
MD5 stored in the profile. For internal linkage functions, we prefix the source
file name to the PGOFuncNames. LTO's internalization privatizes many global linkage
symbols. This happens after value profile annotation, but those internal
linkage functions should not have a source prefix. To differentiate compiler
generated internal symbols from original ones, PGOFuncName metadata is
created and attached to the original internal symbols in the value profile
annotation step. If a symbol does not have the metadata, its original linkage
must be non-internal.

Also add a new map that maps PGOFuncName's MD5 value to the function definition.
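
The naming scheme described above can be modelled roughly as follows. This is a simplified sketch: the FunctionInfo struct, the ':' separator, and the use of an empty string to mean "no metadata" are all assumptions made for the example.

```cpp
#include <string>

enum class Linkage { External, Internal };

// Hypothetical model of a function; PGONameMetadata is empty when no
// metadata has been attached.
struct FunctionInfo {
  std::string Name;
  Linkage Link;
  std::string PGONameMetadata;
};

// Value profile annotation step: functions that are internal in the source
// get a file-name-prefixed name recorded as metadata.
inline void annotate(FunctionInfo &F, const std::string &SourceFile) {
  if (F.Link == Linkage::Internal)
    F.PGONameMetadata = SourceFile + ":" + F.Name;
}

// LTO internalization: changes the linkage but leaves metadata untouched.
inline void internalize(FunctionInfo &F) { F.Link = Linkage::Internal; }

// Name used as the profile key. If there is no metadata, the original
// linkage must have been non-internal, so no prefix is applied.
inline std::string profileName(const FunctionInfo &F) {
  return F.PGONameMetadata.empty() ? F.Name : F.PGONameMetadata;
}
```

A function that was external at annotation time keeps its unprefixed key even after internalize() runs, which is the distinction the metadata exists to preserve.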

Differential Revision: http://reviews.llvm.org/D17895

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264902 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Reid Kleckner [Wed, 30 Mar 2016 18:19:39 +0000 (18:19 +0000)]

[cmake] Instead of testing char16_t for MSVC compat, directly ask cl.exe its version

Credit to Aaron Ballman for thinking of this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264886 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Teresa Johnson [Wed, 30 Mar 2016 18:15:08 +0000 (18:15 +0000)]

Restore "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly"

This restores commit 264869, with a fix for windows bots to properly
escape '\' in the path when serializing out. Added test.
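
To illustrate the kind of escaping involved, here is a small sketch that writes a backslash, a double quote, or any non-printable byte as a backslash followed by two hex digits. The function name is hypothetical and the exact escaping rule is an assumption modelled on how quoted strings are usually written in LLVM assembly; it is not the code from this commit.

```cpp
#include <cctype>
#include <string>

// Escapes '\\', '"', and non-printable bytes as '\' plus two hex digits so
// that the result can sit inside a double-quoted string.
inline std::string escapeForAssembly(const std::string &In) {
  static const char Hex[] = "0123456789ABCDEF";
  std::string Out;
  for (unsigned char C : In) {
    if (std::isprint(C) && C != '\\' && C != '"') {
      Out += static_cast<char>(C);
    } else {
      Out += '\\';
      Out += Hex[C >> 4];
      Out += Hex[C & 0x0F];
    }
  }
  return Out;
}
```

Under this scheme a Windows path such as C:\src\a.c is written as C:\5Csrc\5Ca.c, so a reader that decodes hex escapes gets the original path back instead of misreading "\s" or "\a" as an escape.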

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264884 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chad Rosier [Wed, 30 Mar 2016 18:08:51 +0000 (18:08 +0000)]

[AArch64] Fix warnings pointed out by Hal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264882 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Reid Kleckner [Wed, 30 Mar 2016 17:30:26 +0000 (17:30 +0000)]

[cmake] Add -fms-compatibility-version=19 when clang-cl gives errors about char16_t

What we are really trying to do here is to figure out if we are using
the 2015 STL. Unfortunately, so far as I know the MSVC STL does not
define a version macro that we can check directly. Instead I wrote a
check to see if char16_t works.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264881 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Reid Kleckner [Wed, 30 Mar 2016 17:28:21 +0000 (17:28 +0000)]

[cmake] Allow EH usage with clang-cl

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264880 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Rong Xu [Wed, 30 Mar 2016 16:56:31 +0000 (16:56 +0000)]

[PGO] Use ArrayRef in annotateValueSite()

Use ArrayRef for annotateValueSite's parameter instead of an array
and its size.
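
The general idea can be sketched with a tiny non-owning view type. ArrayView below is a hypothetical stand-in written for this example, not llvm::ArrayRef itself, and sumCounts is an invented function rather than annotateValueSite.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal non-owning view over contiguous elements (illustrative only).
template <typename T> class ArrayView {
public:
  ArrayView(const T *Data, std::size_t Size) : Data(Data), Size(Size) {}
  ArrayView(const std::vector<T> &V) : Data(V.data()), Size(V.size()) {}
  template <std::size_t N>
  ArrayView(const T (&Arr)[N]) : Data(Arr), Size(N) {}

  const T *begin() const { return Data; }
  const T *end() const { return Data + Size; }
  std::size_t size() const { return Size; }

private:
  const T *Data;
  std::size_t Size;
};

// Before: pointer and length travel separately and can get out of sync.
inline std::uint64_t sumCountsOld(const std::uint64_t *Counts, std::size_t N) {
  std::uint64_t Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += Counts[I];
  return Sum;
}

// After: one parameter carries both, and callers can pass arrays or vectors.
inline std::uint64_t sumCounts(ArrayView<std::uint64_t> Counts) {
  std::uint64_t Sum = 0;
  for (std::uint64_t C : Counts)
    Sum += C;
  return Sum;
}
```

Both a C array and a std::vector convert implicitly at the call site, so the caller never has to spell out the length.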

Differential Revision: http://reviews.llvm.org/D18568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264879 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Tom Stellard [Wed, 30 Mar 2016 16:35:13 +0000 (16:35 +0000)]

AMDGPU/SI: Improve MachineSchedModel definition

This patch contains a few improvements to the model, including:

- Using a single resource with a defined buffer size for each memory unit.
- Setting the IssueWidth correctly.
- Fixing latency values for memory instructions.

shader-db stats:

16429 shaders in 3231 tests
Totals:
SGPRS: 318232 -> 312328 (-1.86 %)
VGPRS: 208996 -> 209346 (0.17 %)
Code Size: 7147044 -> 7166440 (0.27 %) bytes
LDS: 83 -> 83 (0.00 %) blocks
Scratch: 1862656 -> 1459200 (-21.66 %) bytes per wave
Max Waves: 49182 -> 49243 (0.12 %)
Wait states: 0 -> 0 (0.00 %)

Differential Revision: http://reviews.llvm.org/D18453

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264877 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Tom Stellard [Wed, 30 Mar 2016 16:35:09 +0000 (16:35 +0000)]

AMDGPU/SI: Enable lanemask tracking in misched

Summary:
This results in higher register usage, but should make it easier for
the compiler to hide latency.

This patch is a prerequisite for some more scheduler improvements, and I
think the increased register usage with this patch is acceptable, because
when combined with the scheduler improvements, the total register usage
will decrease.

shader-db stats:

2382 shaders in 478 tests
Totals:
SGPRS: 48672 -> 49088 (0.85 %)
VGPRS: 34148 -> 34847 (2.05 %)
Code Size: 1285816 -> 1289128 (0.26 %) bytes
LDS: 28 -> 28 (0.00 %) blocks
Scratch: 492544 -> 573440 (16.42 %) bytes per wave
Max Waves: 6856 -> 6846 (-0.15 %)
Wait states: 0 -> 0 (0.00 %)

Depends on D18451

Reviewers: nhaehnle, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264876 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jonas Paulsson [Wed, 30 Mar 2016 16:11:58 +0000 (16:11 +0000)]

[SystemZ] Add nop and nopr InstAliases.

For compatibility with GAS, nop and nopr are recognized as aliases for
bc and bcr, respectively. A mask of 0 turns these instructions
effectively into no-operations.
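
Why a mask of 0 makes a branch-on-condition a no-op can be shown with a small model. It assumes the usual 4-bit mask layout in which the leftmost mask bit (value 8) selects condition code 0 and the rightmost (value 1) selects condition code 3; the function names are hypothetical.

```cpp
#include <cstdint>

// The branch is taken when the mask bit selected by the current condition
// code (0..3) is set. Bit value 8 corresponds to CC 0, value 1 to CC 3.
inline bool branchTaken(std::uint8_t Mask, unsigned CC) {
  return (Mask & (0x8u >> CC)) != 0;
}

// A mask behaves as a no-operation when no condition code can trigger it.
inline bool isNopMask(std::uint8_t Mask) {
  for (unsigned CC = 0; CC < 4; ++CC)
    if (branchTaken(Mask, CC))
      return false;
  return true;
}
```

Only a mask of 0 satisfies isNopMask, while a mask of 15 branches unconditionally.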

Reviewed by Ulrich Weigand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264875 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Nirav Dave [Wed, 30 Mar 2016 15:41:12 +0000 (15:41 +0000)]

Remove HasFnAttribute guards to getFnAttribute calls

These checks are redundant and can be removed

Reviewers: hans

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D18564

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264872 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Teresa Johnson [Wed, 30 Mar 2016 15:16:04 +0000 (15:16 +0000)]

Revert "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly"

This reverts commit r264869. I am seeing Windows bot failures due to the
"\" in the path being mishandled (it seems to be interpreted wrongly at
some point, and llvm-as | llvm-dis is yielding some junk characters).
Need to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264871 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Simon Pilgrim [Wed, 30 Mar 2016 14:14:00 +0000 (14:14 +0000)]

[X86][XOP] BITREVERSE lowering using VPPERM

XOP's VPPERM can apply a per-byte 'permute operation' as part of shuffling the bytes of a 128-bit vector; in this case we use it to perform BITREVERSE in a single instruction.
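
A full bit reversal decomposes into two steps that map naturally onto a byte shuffle with a per-byte operation: reverse the byte order, and reverse the bits inside each byte. The scalar sketch below models that decomposition for a 32-bit value; it is an illustration with hypothetical function names, not the actual lowering.

```cpp
#include <cstdint>

// Reverses the 8 bits of one byte by swapping nibbles, then bit pairs,
// then adjacent bits.
inline std::uint8_t reverseByteBits(std::uint8_t B) {
  B = static_cast<std::uint8_t>(((B & 0xF0u) >> 4) | ((B & 0x0Fu) << 4));
  B = static_cast<std::uint8_t>(((B & 0xCCu) >> 2) | ((B & 0x33u) << 2));
  B = static_cast<std::uint8_t>(((B & 0xAAu) >> 1) | ((B & 0x55u) << 1));
  return B;
}

// Reverses all 32 bits: byte I of the input lands in byte 3 - I of the
// output (the "shuffle"), with its bits reversed (the per-byte operation).
inline std::uint32_t bitReverse32(std::uint32_t V) {
  std::uint32_t R = 0;
  for (unsigned I = 0; I < 4; ++I) {
    std::uint8_t Byte = static_cast<std::uint8_t>(V >> (8 * I));
    R |= static_cast<std::uint32_t>(reverseByteBits(Byte)) << (8 * (3 - I));
  }
  return R;
}
```

For example, 0x12345678 becomes 0x1E6A2C48, and applying the function twice returns the original value.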

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264870 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Teresa Johnson [Wed, 30 Mar 2016 14:00:02 +0000 (14:00 +0000)]

[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly

Summary:
This change serializes out and in the SourceFileName to LLVM assembly
so that it is preserved through "llvm-dis | llvm-as". This is
necessary to ensure that the global identifiers created for local values
in the module summary index are the same even if the bitcode is
streamed out and read back from LLVM assembly.

Serializing the summary itself to LLVM assembly is in progress.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18588

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264869 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Simon Pilgrim [Wed, 30 Mar 2016 13:55:00 +0000 (13:55 +0000)]

[X86][SSE] Test the legalization of vector comparison results

We are currently doing a REALLY bad job of packing results of vector comparisons into the legalized <X x i1> result equivalents - a mixture of PACKSS/PMOVMSKB would be much better here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264867 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Benjamin Kramer [Wed, 30 Mar 2016 12:31:51 +0000 (12:31 +0000)]

[NVPTX] Avoid temporary std::string and make single-use function local to the cpp file.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264861 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Marianne Mailhot-Sarrasin [Wed, 30 Mar 2016 12:20:53 +0000 (12:20 +0000)]

gold-plugin: Fixed typo in an error message.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264860 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Simon Pilgrim [Wed, 30 Mar 2016 11:43:26 +0000 (11:43 +0000)]

[X86][SSE] Added tests for clearing upper bits of vector elements

Patterns based on PR6455

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264857 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

James Molloy [Wed, 30 Mar 2016 10:11:43 +0000 (10:11 +0000)]

[VectorUtils] Don't try and truncate PHIs to a smaller bitwidth

We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means.

However, we weren't bailing out in all the places we should have, and we ended up returning a PHI to be truncated, which has caused PR27018.

This fixes PR27018.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264852 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Wed, 30 Mar 2016 08:41:59 +0000 (08:41 +0000)]

[x86] Fix a horrible bug in our lowering of x86 floating point atomic
operations.

Specifically, we had code that tried to badly approximate reconstructing
all of the possible variations on addressing modes in two x86
instructions based on those in one pseudo instruction. This is not the
first bug uncovered with doing this, so stop doing it altogether.
Instead generically and pedantically copy every operand from the address
over to both new instructions, and strip kill flags from any register
operands.

This fixes a subtle bug seen in the wild where we would mysteriously
drop parts of the addressing mode, causing for example the index
argument in the added test case to just be completely ignored.
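
The effect of dropping part of an addressing mode can be seen in a toy model. The AddressMode struct below is a hypothetical simplification (base, scale, index, displacement, with the segment omitted), and copyDroppingIndex only imitates the symptom described above; it is not the code that was removed.

```cpp
#include <cstdint>

// Simplified model of an x86 memory operand.
struct AddressMode {
  std::uint64_t Base;
  std::uint64_t Scale;
  std::uint64_t Index;
  std::int64_t Disp;
};

// Effective address = base + scale * index + displacement.
inline std::uint64_t effectiveAddress(const AddressMode &AM) {
  return AM.Base + AM.Scale * AM.Index + static_cast<std::uint64_t>(AM.Disp);
}

// The safe approach: copy every component to each new instruction.
inline AddressMode copyAll(const AddressMode &AM) { return AM; }

// Imitates the bug: the index is lost when the operand is reconstructed.
inline AddressMode copyDroppingIndex(const AddressMode &AM) {
  return AddressMode{AM.Base, 1, 0, AM.Disp};
}
```

With an array at 0x1000, 8-byte elements and index 5, the full copy addresses 0x1028, while the reconstruction that loses the index addresses 0x1000, the first element.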

Hypothetically, this was an extremely bad miscompile because it actually
caused a predictable and leverageable write of a 64-bit quantity to an
unintended offset (the first element of the array instead of whatever
other element was intended). As a consequence, in theory this could even
have introduced security vulnerabilities.

However, this was only something that could happen with an atomic
floating point add. No other operation could trigger this bug, so it
seems extremely unlikely to have occurred widely in the wild.

But it did in fact occur, and frequently in scientific applications
which were using relaxed atomic updates of a floating point value after
adding a delta. Those would end up being quite badly miscompiled by
LLVM, which is how we found this. Of course, this often looks like
a race condition in the code, but it was actually a miscompile.

I suspect that this whole RELEASE_FADD thing was a complete mistake.
There is no such operation, and I worry that anything other than add
will get remarkably worse code generation. But that's not for this
change....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264845 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Wed, 30 Mar 2016 05:26:43 +0000 (05:26 +0000)]

[CodeGen] Mark EVT:getExtendedSizeInBits() as LLVM_READONLY.

I think I had tried this a long time back and some bots failed. Hoping that was with an older gcc and maybe now it will work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264840 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jingyue Wu [Wed, 30 Mar 2016 05:05:40 +0000 (05:05 +0000)]

[docs] Add gpucc publication and tutorial.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264839 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Duncan P. N. Exon Smith [Wed, 30 Mar 2016 04:32:29 +0000 (04:32 +0000)]

IR: Constify LLVMContext::discardValueNames, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264823 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Duncan P. N. Exon Smith [Wed, 30 Mar 2016 04:21:52 +0000 (04:21 +0000)]

BitcodeReader: Fix weird whitespace, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264822 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

George Burgess IV [Wed, 30 Mar 2016 03:12:08 +0000 (03:12 +0000)]

[MemorySSA] Make the visitor more careful with calls.

Prior to this patch, the MemorySSA caching visitor would cache all
calls that it visited. When paired with phi optimization, this can be
problematic. Consider:

define void @foo() {
  ; 1 = MemoryDef(liveOnEntry)
  call void @clobberFunction()
  br i1 undef, label %if.end, label %if.then

if.then:
  ; MemoryUse(??)
  call void @readOnlyFunction()
  ; 2 = MemoryDef(1)
  call void @clobberFunction()
  br label %if.end

if.end:
  ; 3 = MemoryPhi(...)
  ; MemoryUse(?)
  call void @readOnlyFunction()
  ret void
}

When optimizing MemoryUse(?), we visit defs 1 and 2, so we note to
cache them later. We ultimately end up not being able to optimize
past the Phi, so we set MemoryUse(?) to point to the Phi. We then
cache the clobbering call for def 1 to be the Phi.

This commit changes this behavior so that we wipe out any calls
added to VisitedCalls while visiting the defs of a phi we couldn't
optimize.
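
The rollback described above can be sketched as a checkpoint on the list of visited calls. The Query struct, the integer call ids, and the CanOptimizePastPhi flag are hypothetical simplifications for illustration; the real walker works on MemoryAccess objects.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical query state: calls seen so far that may be cached later.
struct Query {
  std::vector<int> VisitedCalls;
};

// Visits the calls reachable through a phi's defs. If we cannot optimize
// past the phi, every call noted during this visit is discarded so that no
// stale clobber gets cached for it. Returns whether the phi was optimized.
inline bool visitPhiDefs(Query &Q, const std::vector<int> &CallsOnPaths,
                         bool CanOptimizePastPhi) {
  const std::size_t Checkpoint = Q.VisitedCalls.size();
  for (int Call : CallsOnPaths)
    Q.VisitedCalls.push_back(Call);
  if (!CanOptimizePastPhi) {
    Q.VisitedCalls.resize(Checkpoint); // wipe out only the new entries
    return false;
  }
  return true;
}
```

Entries recorded before the phi was reached survive the rollback; only the calls added during the failed visit are dropped.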

Aside: With this patch, we now can bootstrap clang/LLVM without a
single MemorySSA verifier failure. Woohoo. :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264820 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Wed, 30 Mar 2016 03:10:24 +0000 (03:10 +0000)]

[x86] Extract a helper function to compute the full addressing mode from
an x86 MachineInstr's operands. This will be super useful to fix some
bad atomics code in my next commit.

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264819 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Xinliang David Li [Wed, 30 Mar 2016 02:16:07 +0000 (02:16 +0000)]

[PGO] Handle invoke inst in IR based icall instrumentation

Differential Revision: http://reviews.llvm.org/D18580

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264818 91177308-0d34-0410-b5e6-96231b3b80d8

external/llvm
