OSDN Git Service

android-x86/external-llvm.git
6 years agoFixed a bug in splitting Scatter operation in the Type Legalizer.
Elena Demikhovsky [Mon, 11 Sep 2017 06:18:15 +0000 (06:18 +0000)]
Fixed a bug in splitting Scatter operation in the Type Legalizer.
After a Scatter operation is split, the order of the new instructions must be well defined: Lo goes before Hi. Otherwise the semantics of Scatter (from LSB to MSB) are broken.
I'm chaining the 2 nodes to prevent reordering.
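
A minimal scalar sketch of why the order matters, assuming a simplified model in which a scatter writes its lanes one by one from the lowest lane to the highest. The names scatterRange and splitScatter are hypothetical and are not part of the Type Legalizer; the sketch only shows that running the Hi half first changes the result when two lanes target the same address.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of a scatter over lanes [Begin, End): lanes are written from
// the lowest lane to the highest, so on duplicate indices the highest lane wins.
void scatterRange(std::vector<int> &Mem, const std::vector<std::size_t> &Idx,
                  const std::vector<int> &Val, std::size_t Begin,
                  std::size_t End) {
  assert(Idx.size() == Val.size() && End <= Idx.size());
  for (std::size_t Lane = Begin; Lane < End; ++Lane)
    Mem[Idx[Lane]] = Val[Lane];
}

// Hypothetical split of one scatter into a Lo half and a Hi half.
// HiFirst = false is the required order (Lo, then Hi).
std::vector<int> splitScatter(std::vector<int> Mem,
                              const std::vector<std::size_t> &Idx,
                              const std::vector<int> &Val, bool HiFirst) {
  const std::size_t Half = Idx.size() / 2;
  if (HiFirst) {
    scatterRange(Mem, Idx, Val, Half, Idx.size());
    scatterRange(Mem, Idx, Val, 0, Half);
  } else {
    scatterRange(Mem, Idx, Val, 0, Half);
    scatterRange(Mem, Idx, Val, Half, Idx.size());
  }
  return Mem;
}
```

With indices {1, 2, 1, 3}, lanes 0 and 2 both write to address 1. Only the Lo-then-Hi order leaves the value from lane 2 in memory, which is what the unsplit scatter would produce.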

Differential Revision https://reviews.llvm.org/D37670

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312894 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[ORC] Kill off a dead typedef.
Lang Hames [Mon, 11 Sep 2017 01:09:46 +0000 (01:09 +0000)]
[ORC] Kill off a dead typedef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312893 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoUse llvm_unreachable for unknown TargetCostKind.
Simon Pilgrim [Sun, 10 Sep 2017 18:42:23 +0000 (18:42 +0000)]
Use llvm_unreachable for unknown TargetCostKind.

TargetTransformInfo::getInstructionCost's switch covers all TargetCostKind cases so we shouldn't return for a default case.
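
The pattern can be sketched as follows, assuming a stand-in for llvm_unreachable so that the example builds without LLVM headers. The function costFor, the helper unreachableStandIn, and the returned numbers are hypothetical placeholders; only the enumerator names come from the getInstructionCost commit.

```cpp
#include <cassert>
#include <cstdlib>

enum TargetCostKind { TCK_RecipThroughput, TCK_Latency, TCK_CodeSize };

// Hypothetical stand-in for llvm_unreachable: aborts if control ever gets here.
[[noreturn]] inline void unreachableStandIn() { std::abort(); }

// The switch names every enumerator and has no default label, so the compiler
// can warn when a new enumerator is added but not handled here.
int costFor(TargetCostKind Kind) {
  switch (Kind) {
  case TCK_RecipThroughput:
    return 1; // placeholder value
  case TCK_Latency:
    return 4; // placeholder value
  case TCK_CodeSize:
    return 2; // placeholder value
  }
  // Reached only if Kind holds a value outside the enum.
  unreachableStandIn();
}
```

Placing the unreachable marker after the switch, instead of returning a dummy value from a default case, keeps the covered-switch diagnostics useful.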

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312888 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][SSE] Tidyup + clang-format combineX86ShuffleChain call. NFCI.
Simon Pilgrim [Sun, 10 Sep 2017 18:18:45 +0000 (18:18 +0000)]
[X86][SSE] Tidyup + clang-format combineX86ShuffleChain call. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312887 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][SSE] Move combineTo call out of combineX86ShufflesConstants. NFCI.
Simon Pilgrim [Sun, 10 Sep 2017 18:10:49 +0000 (18:10 +0000)]
[X86][SSE] Move combineTo call out of combineX86ShufflesConstants. NFCI.

Move towards making it possible to use the shuffle combines for cases where we don't want to call DCI.CombineTo() with the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312886 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[InstSimplify] refactor udiv/urem code and add tests; NFCI
Sanjay Patel [Sun, 10 Sep 2017 17:55:08 +0000 (17:55 +0000)]
[InstSimplify] refactor udiv/urem code and add tests; NFCI

This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312885 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][SSE] Move combineTo call out of combineX86ShuffleChain. NFCI.
Simon Pilgrim [Sun, 10 Sep 2017 14:06:41 +0000 (14:06 +0000)]
[X86][SSE] Move combineTo call out of combineX86ShuffleChain. NFCI.

First step towards making it possible to use the shuffle combines for cases where we don't want to call DCI.CombineTo() with the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312884 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAdded a test that demonstrates a bug in Scatter scheduling.
Elena Demikhovsky [Sun, 10 Sep 2017 13:20:42 +0000 (13:20 +0000)]
Added a test that demonstrates a bug in Scatter scheduling.
The bug is going to be fixed in an upcoming patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312883 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][X86AsmParser] adding const on InlineAsmIdentifierInfo in CreateMemForInlineAsm...
Coby Tayree [Sun, 10 Sep 2017 12:21:24 +0000 (12:21 +0000)]
[X86][X86AsmParser] adding const on InlineAsmIdentifierInfo in CreateMemForInlineAsm. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312881 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRevert "adding autoUpgrade support to broadcast[f|i]32x2 intrinsics"
Uriel Korach [Sun, 10 Sep 2017 09:07:21 +0000 (09:07 +0000)]
Revert "adding autoUpgrade support to broadcast[f|i]32x2 intrinsics"

This reverts commit r312879 - An accidental partial commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312880 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoadding autoUpgrade support to broadcast[f|i]32x2 intrinsics
Uriel Korach [Sun, 10 Sep 2017 08:40:13 +0000 (08:40 +0000)]
adding autoUpgrade support to broadcast[f|i]32x2 intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312879 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoTest commit
Uriel Korach [Sun, 10 Sep 2017 08:31:22 +0000 (08:31 +0000)]
Test commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312878 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SCEV] Re-arrange public and private sections to be contiguous; NFC
Sanjoy Das [Sun, 10 Sep 2017 03:54:22 +0000 (03:54 +0000)]
[SCEV] Re-arrange public and private sections to be contiguous; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312876 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Add v2i4 store test case (PR20012)
Simon Pilgrim [Sat, 9 Sep 2017 20:28:50 +0000 (20:28 +0000)]
[X86] Add v2i4 store test case (PR20012)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312874 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Add v2i2 test case (PR20011)
Simon Pilgrim [Sat, 9 Sep 2017 20:22:35 +0000 (20:22 +0000)]
[X86] Add v2i2 test case (PR20011)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312873 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][FMA] Regenerate FMA tests
Simon Pilgrim [Sat, 9 Sep 2017 19:25:59 +0000 (19:25 +0000)]
[X86][FMA] Regenerate FMA tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312871 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoMerge isKnownNonNull into isKnownNonZero
Nuno Lopes [Sat, 9 Sep 2017 18:23:11 +0000 (18:23 +0000)]
Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that treated allocas in a non-zero address space as always non-null.

Differential Revision: https://reviews.llvm.org/D37628

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312869 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][SSE] i32 vector multiplications test cases from PR6399
Simon Pilgrim [Sat, 9 Sep 2017 18:18:17 +0000 (18:18 +0000)]
[X86][SSE] i32 vector multiplications test cases from PR6399

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312868 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][MOVBE] Fix typo in MOVBE scheduling test names
Simon Pilgrim [Sat, 9 Sep 2017 17:52:44 +0000 (17:52 +0000)]
[X86][MOVBE] Fix typo in MOVBE scheduling test names

Copy+paste is not your friend

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312867 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Don't disable slow INC/DEC if optimizing for size
Craig Topper [Sat, 9 Sep 2017 17:11:59 +0000 (17:11 +0000)]
[X86] Don't disable slow INC/DEC if optimizing for size

Summary:
Just because INC/DEC is a little slow on some processors doesn't mean we shouldn't prefer it when optimizing for size.

This appears to match gcc behavior.

Reviewers: chandlerc, zvi, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37177

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312866 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[CMake] Update GetSVN.cmake to handle repo
MinSeong Kim [Sat, 9 Sep 2017 14:17:52 +0000 (14:17 +0000)]
[CMake] Update GetSVN.cmake to handle repo

Summary:
When repo is used with git, the 'clang --version' option does not display
the correct revision information (i.e. the git hash on TOP). This patch
changes the output as follows:

clang version 6.0.0 --->
clang version 6.0.0 (clang version) (llvm version)

This happens because repo also creates a .git/svn folder, as git-svn does.
A repo-with-git checkout therefore uses the "git svn info" command, which
only works for git-svn, and gets empty revision information. To correctly
distinguish between git-svn and repo with git, the folder hierarchy checked
for git-svn should be .git/svn/refs, because the "git svn info" command
depends on the revision data in .git/svn/refs. With this patch, repo with
git falls through to the third macro, get_source_info_git, in the
get_source_info function, which correctly retrieves the revision
information using the "git log ..." command.

This patch is tested with git, svn, git-svn, and repo with git.

Reviewers: llvm-commits, probinson, rnk

Reviewed By: rnk

Subscribers: rnk, mehdi_amini, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D35532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312864 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[DivRemPairs] split tests per target to account for bots that don't build for all...
Sanjay Patel [Sat, 9 Sep 2017 14:10:59 +0000 (14:10 +0000)]
[DivRemPairs] split tests per target to account for bots that don't build for all targets

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312863 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[DivRempairs] add a pass to optimize div/rem pairs (PR31028)
Sanjay Patel [Sat, 9 Sep 2017 13:38:18 +0000 (13:38 +0000)]
[DivRempairs] add a pass to optimize div/rem pairs (PR31028)

This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented
as an independent pass, so there's no stretching of scope and feature creep for an existing pass.
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).
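
As an illustration of the arithmetic behind decomposing a remainder, here is a minimal sketch. It assumes plain C++ integer semantics and is not the pass itself; the struct DivRem and the function divRemDecomposed are hypothetical names. The remainder is rebuilt from the quotient, so only one division is needed.

```cpp
#include <cassert>
#include <cstdint>

struct DivRem {
  std::int64_t Quot;
  std::int64_t Rem;
};

// One division; the remainder is recomputed as A - (A / B) * B.
// Preconditions: B != 0, and not (A == INT64_MIN && B == -1), which overflows.
DivRem divRemDecomposed(std::int64_t A, std::int64_t B) {
  assert(B != 0 && "division by zero");
  const std::int64_t Q = A / B;
  return {Q, A - Q * B};
}
```

Because C++ integer division truncates toward zero, the recomputed value matches the result of the % operator for negative operands as well.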

Differential Revision: https://reviews.llvm.org/D37121

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoCoverageMappingTest.cpp: Suppress warnings. [-Wdocumentation]
NAKAMURA Takumi [Sat, 9 Sep 2017 06:19:53 +0000 (06:19 +0000)]
CoverageMappingTest.cpp: Suppress warnings. [-Wdocumentation]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312861 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Call removeDeadNode when we're done doing custom isel for mul, div and test
Craig Topper [Sat, 9 Sep 2017 05:57:20 +0000 (05:57 +0000)]
[X86] Call removeDeadNode when we're done doing custom isel for mul, div and test

Summary:
Once we've done our custom isel for these nodes, I think we should be calling removeDeadNode to prune them out of the DAG. Table-driven isel ultimately either calls morphNodeTo, which modifies a node and doesn't leave dead nodes, or emits new nodes and then calls removeDeadNode as part of Opc_CompleteMatch.

If you run a simple multiply test case like this through llc with -debug you'll see a umul_lohi node get printed as part of the dump for Instruction Selection ends.

```
define i64 @foo(i64 %a, i64 %b) local_unnamed_addr #0 {
entry:
  %conv = zext i64 %a to i128
  %conv1 = zext i64 %b to i128
  %mul = mul nuw nsw i128 %conv1, %conv
  %shr = lshr i128 %mul, 64
  %conv2 = trunc i128 %shr to i64
  ret i64 %conv2
}
```

Reviewers: RKSimon, spatel, zvi, guyblank, niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312857 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Use ReplaceNode instead of ReplaceUses when converting X86ISD::SHRUNKBLEND...
Craig Topper [Sat, 9 Sep 2017 05:57:19 +0000 (05:57 +0000)]
[X86] Use ReplaceNode instead of ReplaceUses when converting X86ISD::SHRUNKBLEND to ISD::VSELECT during isel.

This ensures that the SHRUNKBLEND node gets erased immediately.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312856 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[sanitizer-coverage] call appendToUsed once per module, not once per function (which...
Kostya Serebryany [Sat, 9 Sep 2017 05:30:13 +0000 (05:30 +0000)]
[sanitizer-coverage] call appendToUsed once per module, not once per function (which is too slow)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312855 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SLP] Fix buildbots, NFC.
Alexey Bataev [Sat, 9 Sep 2017 02:08:45 +0000 (02:08 +0000)]
[SLP] Fix buildbots, NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312853 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRegAllocFast: Fix warning; NFC
Matthias Braun [Sat, 9 Sep 2017 01:16:59 +0000 (01:16 +0000)]
RegAllocFast: Fix warning; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312852 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRegAllocFast: Cleanup; NFC
Matthias Braun [Sat, 9 Sep 2017 00:52:46 +0000 (00:52 +0000)]
RegAllocFast: Cleanup; NFC

- Use range based for
- Variable names should start with upper case
- Add `const`
- Change class name to match filename
- Fix doxygen comments
- Use MCPhysReg instead of unsigned
- Use references instead of pointers where things cannot be nullptr
- Misc coding style improvements

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312846 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRegAllocFast: Move vector to class level to avoid reallocation; NFC
Matthias Braun [Sat, 9 Sep 2017 00:52:45 +0000 (00:52 +0000)]
RegAllocFast: Move vector to class level to avoid reallocation; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312845 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRegAllocFast: Remove write-only set; NFC
Matthias Braun [Sat, 9 Sep 2017 00:52:42 +0000 (00:52 +0000)]
RegAllocFast: Remove write-only set; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312844 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoPPC: Don't select lxv/stxv for insufficiently aligned stack slots.
Kyle Butt [Sat, 9 Sep 2017 00:37:56 +0000 (00:37 +0000)]
PPC: Don't select lxv/stxv for insufficiently aligned stack slots.

The lxv/stxv instructions require an offset that is a multiple of 16
(offset % 16 == 0). Previously we were selecting lxv/stxv for loads and stores
to the stack where the offset from the slot was a multiple of 16, but the stack
slot itself was not aligned to 16 or more bytes.
When the frame gets lowered these transform to r(1|31) + slot + offset.
If slot is not aligned, slot + offset may not be a multiple of 16.
Now we require an alignment of 16 bytes or more to select lxv/stxv for stack slots.

Includes a testcase that shows both sufficiently and insufficiently aligned
stack slots.
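
The arithmetic can be sketched like this, assuming byte offsets held in plain unsigned integers. The functions displacementIsEncodable and canSelectLxvStxv are hypothetical names for illustration, not PPC backend code.

```cpp
#include <cassert>
#include <cstdint>

// After frame lowering the access becomes base + SlotPos + Offset.
// This checks whether the combined displacement is a multiple of 16.
bool displacementIsEncodable(std::uint64_t SlotPos, std::uint64_t Offset) {
  return (SlotPos + Offset) % 16 == 0;
}

// Conservative check made before the slot position is known: if the slot is
// aligned to a multiple of 16 and the offset is a multiple of 16, the sum is too.
bool canSelectLxvStxv(std::uint64_t SlotAlign, std::uint64_t Offset) {
  return SlotAlign >= 16 && SlotAlign % 16 == 0 && Offset % 16 == 0;
}
```

For example, an 8-byte-aligned slot placed at position 8 with an access offset of 16 gives a displacement of 24, which is not a multiple of 16, even though the offset alone is.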

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312843 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agobpf: fix test failures due to previous bpf change of assembly code syntax
Yonghong Song [Sat, 9 Sep 2017 00:11:13 +0000 (00:11 +0000)]
bpf: fix test failures due to previous bpf change of assembly code syntax

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312840 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[AMDGPU] Remove unused function. NFCI.
Davide Italiano [Fri, 8 Sep 2017 23:54:11 +0000 (23:54 +0000)]
[AMDGPU] Remove unused function. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312836 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[TargetTransformInfo] Remove the extra "default" in a switch that all enum values...
Guozhi Wei [Fri, 8 Sep 2017 23:34:28 +0000 (23:34 +0000)]
[TargetTransformInfo] Remove the extra "default" in a switch that all enum values has been covered.

In TargetTransformInfo::getInstructionCost, the switch statement already covers all enum values, so the default case is unnecessary and may cause an error with the option -Werror,-Wcovered-switch-default. Remove it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312834 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agobpf: proper print imm64 expression in inst printer
Yonghong Song [Fri, 8 Sep 2017 23:32:38 +0000 (23:32 +0000)]
bpf: proper print imm64 expression in inst printer

Fixed an issue in printImm64Operand: if the value is an
expression, the expression is now printed properly. Currently,
it will print
  r1 = <MCOperand Expr:(tx_port)>ll
With the patch, the printout will be
  r1 = tx_port

Suggested-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312833 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[TargetTransformInfo] Add a new public interface getInstructionCost
Guozhi Wei [Fri, 8 Sep 2017 22:29:17 +0000 (22:29 +0000)]
[TargetTransformInfo] Add a new public interface getInstructionCost

The current TargetTransformInfo supports a throughput cost model and a code size model, but different optimizations sometimes also need an instruction latency cost model. Hal suggested that we need a single public interface to query the different costs of an instruction, so I proposed the following interface:

  enum TargetCostKind {
    TCK_RecipThroughput, ///< Reciprocal throughput.
    TCK_Latency,         ///< The latency of instruction.
    TCK_CodeSize         ///< Instruction code size.
  };

  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;

All clients should mainly use this function to query the cost of an instruction. The parameter <kind> specifies the desired cost model.

This patch also provides a simple default implementation of getInstructionLatency.

The default getInstructionLatency provides latency numbers for only a small number of instruction classes, and those latency numbers are only reasonable for modern OOO processors. It can be extended in the following ways:

   Add more detail into this function.
   Add getXXXLatency function and call it from here.
   Implement target specific getInstructionLatency function.

Differential Revision: https://reviews.llvm.org/D37170

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[CMake][runtimes] Use the same configuration for non-target and "default" target
Petr Hosek [Fri, 8 Sep 2017 22:26:50 +0000 (22:26 +0000)]
[CMake][runtimes] Use the same configuration for non-target and "default" target

The default host target for builtins and runtimes has special behavior
on some platforms, e.g. on Linux both i386 and x86_64 targets are being
built. Specifying "default" as a target name should lead to the same
behavior, which wasn't the case in the past. This patch unifies the
configuration between the non-target and "default" target to produce the
same behavior by moving the default configuration into a function that
can be used from both paths.

Differential Revision: https://reviews.llvm.org/D37450

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312831 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoMigrate llvm-symbolizer tests to not use %T
David Blaikie [Fri, 8 Sep 2017 21:10:01 +0000 (21:10 +0000)]
Migrate llvm-symbolizer tests to not use %T

(context around the %T removal here: https://reviews.llvm.org/D35396 )

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312828 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-cov] Use portable output redirection in a test
Vedant Kumar [Fri, 8 Sep 2017 20:24:23 +0000 (20:24 +0000)]
[llvm-cov] Use portable output redirection in a test

A follow-up to a test fix (r312825).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312826 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-cov] Try to appease a Windows bot
Vedant Kumar [Fri, 8 Sep 2017 20:18:17 +0000 (20:18 +0000)]
[llvm-cov] Try to appease a Windows bot

On a Windows bot, I see a FileCheck error where the source being matched
over no longer exists, i.e it seems like it's FileCheck'ing some stale
output:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/4747

You can see "// CHECK: [[@LINE]]|{{ +}Marker at 19:3 = 1" in the
FileCheck stderr, but that CHECK line doesn't exist.

Remove the input file to FileCheck before running the test, to try and
appease the bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312825 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Start using !con operator
Matt Arsenault [Fri, 8 Sep 2017 19:09:13 +0000 (19:09 +0000)]
AMDGPU: Start using !con operator

We have a lot of operand definition work essentially producing
every valid permutation of operands to work around building
operand lists based on the instruction features. Apparently tablegen
already has a mostly undocumented operator to concat dags which
simplifies this.

Convert one simple place to use this. The BUF instruction definitions
have much more complicated logic that can be totally rewritten now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312822 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-cov] Disable name-compression in a test binary
Vedant Kumar [Fri, 8 Sep 2017 19:08:39 +0000 (19:08 +0000)]
[llvm-cov] Disable name-compression in a test binary

This should fix the lld bot:

The Buildbot has detected a new failure on builder llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast while building cfe.
Full details are available at:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/16993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312821 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Recompute scc liveness
Matt Arsenault [Fri, 8 Sep 2017 18:51:26 +0000 (18:51 +0000)]
AMDGPU: Recompute scc liveness

The various scalar bit operations set SCC,
so when one is erased or moved, SCC liveness needs to be recomputed.
Not sure why the existing tests don't fail on this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312819 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[Coverage] Build sorted and unique segments
Vedant Kumar [Fri, 8 Sep 2017 18:44:50 +0000 (18:44 +0000)]
[Coverage] Build sorted and unique segments

A coverage segment contains a starting line and column, an execution
count, and some other metadata. Clients of the coverage library use
segments to prepare line-oriented reports.

Users of the coverage library depend on segments being unique and sorted
in source order. Currently this is not guaranteed (this is why the clang
change which introduced deferred regions was reverted).

This commit documents the "unique and sorted" condition and asserts that
it holds. It also fixes the SegmentBuilder so that it produces correct
output in some edge cases.
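
The "unique and sorted" condition can be sketched as a single check, assuming a simplified segment that holds only a line, a column, and a count. The Segment struct and the function names here are hypothetical and do not mirror the coverage library's types.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

struct Segment {
  unsigned Line;
  unsigned Col;
  unsigned long long Count;
};

// Source order: compare by line first, then by column.
bool startsBefore(const Segment &L, const Segment &R) {
  return std::tie(L.Line, L.Col) < std::tie(R.Line, R.Col);
}

// Strictly increasing start positions imply both "sorted" and "unique".
// adjacent_find returns end() when no neighbouring pair violates the order.
bool isSortedAndUnique(const std::vector<Segment> &Segs) {
  return std::adjacent_find(Segs.begin(), Segs.end(),
                            [](const Segment &L, const Segment &R) {
                              return !startsBefore(L, R);
                            }) == Segs.end();
}
```

A check of this shape is cheap enough to run inside an assertion after the segments are built.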

Testing: I've added unit tests for some edge cases. I've also checked
that the new SegmentBuilder implementation is fully covered. Apart from
running check-profile and the llvm-cov tests, I've successfully used a
stage1 llvm-cov to prepare a coverage report for an instrumented clang
binary.

Differential Revision: https://reviews.llvm.org/D36813

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312817 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-cov] Fix a lifetime issue
Vedant Kumar [Fri, 8 Sep 2017 18:44:49 +0000 (18:44 +0000)]
[llvm-cov] Fix a lifetime issue

This fixes an issue where a std::string was moved to a constructor
which accepted a StringRef.
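
A sketch of this class of bug, using std::string_view as a stand-in for StringRef so that the example builds without LLVM. The types ViewHolder and OwningHolder and the function makeHolder are hypothetical, and the commit's actual fix may differ; the sketch shows one common remedy, which is to let the object own its string.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Non-owning: the view is only valid while the viewed string is alive.
// Passing std::move(SomeString) here moves nothing; the holder still just
// points at the caller's buffer and dangles once that buffer is destroyed.
struct ViewHolder {
  std::string_view Name;
  explicit ViewHolder(std::string_view N) : Name(N) {}
};

// Owning: takes the string by value and moves it into the member, so the
// object's lifetime no longer depends on the caller's string.
struct OwningHolder {
  std::string Name;
  explicit OwningHolder(std::string N) : Name(std::move(N)) {}
};

// Safe even though Local goes out of scope: its contents are moved into the
// returned object.
OwningHolder makeHolder() {
  std::string Local = "coverage-report";
  return OwningHolder(std::move(Local));
}
```

Returning a ViewHolder built from Local in makeHolder would compile, but reading its Name afterwards would be undefined behaviour.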

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312816 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[Coverage] Define LineColPair for convenience. NFC.
Vedant Kumar [Fri, 8 Sep 2017 18:44:48 +0000 (18:44 +0000)]
[Coverage] Define LineColPair for convenience. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312815 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[Coverage] Report errors when reading malformed source regions
Vedant Kumar [Fri, 8 Sep 2017 18:44:47 +0000 (18:44 +0000)]
[Coverage] Report errors when reading malformed source regions

Each source region has a start and end location. Report an error when
the end location precedes the start location.
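
A well-formedness check of this kind can be sketched as below, assuming locations are (line, column) pairs and that a region whose end equals its start is allowed. LineCol and isWellFormedRegion are hypothetical names, not the coverage reader's API.

```cpp
#include <cassert>
#include <tuple>

struct LineCol {
  unsigned Line;
  unsigned Col;
};

// A region is malformed when its end comes before its start in source order.
bool isWellFormedRegion(LineCol Start, LineCol End) {
  return std::tie(Start.Line, Start.Col) <= std::tie(End.Line, End.Col);
}
```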

The old lineExecutionCounts.covmapping test actually had a buggy source
region in it. This commit introduces a regenerated copy of the coverage
and moves the old copy to malformedRegions.covmapping, for a test.

Differential Revision: https://reviews.llvm.org/D37387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312814 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-cov] Unify region marker placement between text/html modes
Vedant Kumar [Fri, 8 Sep 2017 18:44:46 +0000 (18:44 +0000)]
[llvm-cov] Unify region marker placement between text/html modes

Make sure that the text and html emitters always emit the same set of
region markers, and avoid emitting redundant markers for line segments
which don't end on the line they start on.

This is related to D35925, and depends on D36014

Differential Revision: https://reviews.llvm.org/D36020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312813 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[x86] Fix GCC pedantic warnings about default arguments for lambdas.
Chandler Carruth [Fri, 8 Sep 2017 18:23:42 +0000 (18:23 +0000)]
[x86] Fix GCC pedantic warnings about default arguments for lambdas.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312809 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Simplify the slow-incdec test and add test cases with optsize.
Craig Topper [Fri, 8 Sep 2017 17:33:54 +0000 (17:33 +0000)]
[X86] Simplify the slow-incdec test and add test cases with optsize.

I think we want to consider using inc/dec with optsize.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312804 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SLPVectorizer] Add struct InstructionsState that holds information about analysis...
Dinar Temirbulatov [Fri, 8 Sep 2017 17:08:17 +0000 (17:08 +0000)]
[SLPVectorizer] Add struct InstructionsState that holds information about analysis of vector to be vectorized.

Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D37212

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312802 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoFix a bug for rL312641.
Wei Mi [Fri, 8 Sep 2017 16:44:52 +0000 (16:44 +0000)]
Fix a bug for rL312641.

rL312641 allowed llvm.memcpy/memset/memmove to be tail calls when the parent
function returns the intrinsic's first argument. However, on the arm-none-eabi
platform, llvm.memcpy is expanded to __aeabi_memcpy, which doesn't
have a return value. The fix is to check that the libcall name after expansion
matches "memcpy/memset/memmove" before allowing those intrinsics to be
tail calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312799 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoPreserve existing regs when adding pristines to LivePhysRegs/LiveRegUnits
Krzysztof Parzyszek [Fri, 8 Sep 2017 16:29:50 +0000 (16:29 +0000)]
Preserve existing regs when adding pristines to LivePhysRegs/LiveRegUnits

Differential Revision: https://reviews.llvm.org/D37600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312797 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SLP] Fix the warning about paths not returning the value, NFC.
Alexey Bataev [Fri, 8 Sep 2017 14:32:20 +0000 (14:32 +0000)]
[SLP] Fix the warning about paths not returning the value, NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312793 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SLP] Support for horizontal min/max reduction.
Alexey Bataev [Fri, 8 Sep 2017 13:49:36 +0000 (13:49 +0000)]
[SLP] Support for horizontal min/max reduction.

SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.
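
For readers unfamiliar with the term, a horizontal min reduction collapses the lanes of one vector into a single scalar. A minimal scalar sketch follows, assuming eight int lanes and a log-step scheme; horizontalMin is a hypothetical name and this is not the SLP vectorizer's code.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Log-step horizontal min: each step halves the active width by combining
// lane I with lane I + Width, which models a shuffle followed by a min.
int horizontalMin(std::array<int, 8> V) {
  for (std::size_t Width = V.size() / 2; Width > 0; Width /= 2)
    for (std::size_t I = 0; I < Width; ++I)
      V[I] = std::min(V[I], V[I + Width]);
  return V[0];
}
```

Eight lanes need three combining steps instead of seven sequential comparisons, which is where the benefit of vectorizing such reductions comes from.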

Differential revision: https://reviews.llvm.org/D27846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Added PR31045 test case
Simon Pilgrim [Fri, 8 Sep 2017 10:49:11 +0000 (10:49 +0000)]
[X86] Added PR31045 test case

Reduced version of 'addr-calc-crash.ll' that was included in D27044, that had been fixed already by D31286/rL298633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312786 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRe-enable "[IRCE] Identify loops with latch comparison against current IV value"
Max Kazantsev [Fri, 8 Sep 2017 10:15:05 +0000 (10:15 +0000)]
Re-enable "[IRCE] Identify loops with latch comparison against current IV value"

Re-applying after the found bug was fixed.

Differential Revision: https://reviews.llvm.org/D36215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312783 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[dwarfdump] Verify line table prologue
Jonas Devlieghere [Fri, 8 Sep 2017 09:48:51 +0000 (09:48 +0000)]
[dwarfdump] Verify line table prologue

This patch adds prologue verification, which is already present in
Apple's dwarfdump. It checks for invalid directory indices and warns
about duplicate file paths.

Differential revision: https://reviews.llvm.org/D37511

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312782 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Adding a test point for PR34149 'Suboptimal codegen for "fast" minnum and maxnum'
Jatin Bhateja [Fri, 8 Sep 2017 09:15:36 +0000 (09:15 +0000)]
[X86] Adding a test point for PR34149 'Suboptimal codegen for "fast" minnum and maxnum'

Differential Revision: https://reviews.llvm.org/D37614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312778 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-dlltool] Mention arm64 in the lists of architecture alternatives
Martin Storsjo [Fri, 8 Sep 2017 06:49:46 +0000 (06:49 +0000)]
[llvm-dlltool] Mention arm64 in the lists of architecture alternatives

This was missed in SVN r310223 when arm64 support was added.

Differential Revision: https://reviews.llvm.org/D37588

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312776 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agodiff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transform...
Max Kazantsev [Fri, 8 Sep 2017 04:26:41 +0000 (04:26 +0000)]
diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
index f72a808..9fa49fd 100644
--- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
+++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
@@ -450,20 +450,10 @@ struct LoopStructure {
   // equivalent to:
   //
   // intN_ty inc = IndVarIncreasing ? 1 : -1;
-  // pred_ty predicate = IndVarIncreasing
-  //                         ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT
-  //                         : IsSignedPredicate ? ICMP_SGT : ICMP_UGT;
+  // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT;
   //
-  //
-  // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt);
-  //      iv = IndVarNext)
+  // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase)
   //   ... body ...
-  //
-  // Here IndVarBase is either current or next value of the induction variable.
-  // in the former case, IsIndVarNext = false and IndVarBase points to the
-  // Phi node of the induction variable. Otherwise, IsIndVarNext = true and
-  // IndVarBase points to IV increment instruction.
-  //

   Value *IndVarBase;
   Value *IndVarStart;
@@ -471,13 +461,12 @@ struct LoopStructure {
   Value *LoopExitAt;
   bool IndVarIncreasing;
   bool IsSignedPredicate;
-  bool IsIndVarNext;

   LoopStructure()
       : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr),
         LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr),
         IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr),
-        IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {}
+        IndVarIncreasing(false), IsSignedPredicate(true) {}

   template <typename M> LoopStructure map(M Map) const {
     LoopStructure Result;
@@ -493,7 +482,6 @@ struct LoopStructure {
     Result.LoopExitAt = Map(LoopExitAt);
     Result.IndVarIncreasing = IndVarIncreasing;
     Result.IsSignedPredicate = IsSignedPredicate;
-    Result.IsIndVarNext = IsIndVarNext;
     return Result;
   }

@@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
     return false;
   };

-  // `ICI` can either be a comparison against IV or a comparison of IV.next.
-  // Depending on the interpretation, we calculate the start value differently.
+  // `ICI` is interpreted as taking the backedge if the *next* value of the
+  // induction variable satisfies some constraint.

-  // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch
-  // comparisons happens against the IV before or after its value is
-  // incremented. Two valid combinations for them are:
-  //
-  // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false },
-  // 2) { iv.next; true }.
-  //
-  // The latch comparison happens against IndVarBase which can be either current
-  // or next value of the induction variable.
   const SCEVAddRecExpr *IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV);
   bool IsIncreasing = false;
   bool IsSignedPredicate = true;
-  bool IsIndVarNext = false;
   ConstantInt *StepCI;
   if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) {
     FailureReason = "LHS in icmp not induction variable";
     return None;
   }

-  const SCEV *IndVarStart = nullptr;
-  // TODO: Currently we only handle comparison against IV, but we can extend
-  // this analysis to be able to deal with comparison against sext(iv) and such.
-  if (isa<PHINode>(LeftValue) &&
-      cast<PHINode>(LeftValue)->getParent() == Header)
-    // The comparison is made against current IV value.
-    IndVarStart = IndVarBase->getStart();
-  else {
-    // Assume that the comparison is made against next IV value.
-    const SCEV *StartNext = IndVarBase->getStart();
-    const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
-    IndVarStart = SE.getAddExpr(StartNext, Addend);
-    IsIndVarNext = true;
-  }
+  const SCEV *StartNext = IndVarBase->getStart();
+  const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
+  const SCEV *IndVarStart = SE.getAddExpr(StartNext, Addend);
   const SCEV *Step = SE.getSCEV(StepCI);

   ConstantInt *One = ConstantInt::get(IndVarTy, 1);
@@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
   Result.IndVarIncreasing = IsIncreasing;
   Result.LoopExitAt = RightValue;
   Result.IsSignedPredicate = IsSignedPredicate;
-  Result.IsIndVarNext = IsIndVarNext;

   FailureReason = nullptr;

@@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd(
                                       BranchToContinuation);

     NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader);
-    auto *FixupValue =
-        LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN;
-    NewPHI->addIncoming(FixupValue, RRI.ExitSelector);
+    NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch),
+                        RRI.ExitSelector);
     RRI.PHIValuesAtPseudoExit.push_back(NewPHI);
   }

@@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop *L, LPPassManager &LPM) {
   }
   LoopStructure LS = MaybeLoopStructure.getValue();
   const SCEVAddRecExpr *IndVar =
-      cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase));
-  if (LS.IsIndVarNext)
-    IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar,
-                                                  SE.getSCEV(LS.IndVarStep)));
+      cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep)));

   Optional<InductiveRangeCheck::Range> SafeIterRange;
   Instruction *ExprInsertPt = Preheader->getTerminator();
diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll
deleted file mode 100644
index afea0e6..0000000
--- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll
+++ /dev/null
@@ -1,182 +0,0 @@
-; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 | FileCheck %s
-
-; Check that IRCE is able to deal with loops where the latch comparison is
-; done against current value of the IV, not the IV.next.
-
-; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-
-; SLT condition for increasing loop from 0 to 100.
-define void @test_01(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK:      test_01
-; CHECK:        entry:
-; CHECK-NEXT:     %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT:     [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK:        loop:
-; CHECK-NEXT:     %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT:     %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT:     %abc = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK:        in.bounds:
-; CHECK-NEXT:     %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT:     store i32 0, i32* %addr
-; CHECK-NEXT:     %next = icmp slt i32 %idx, 100
-; CHECK-NEXT:     [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK:        main.exit.selector:
-; CHECK-NEXT:     %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT:     [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100
-; CHECK-NEXT:     br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK:        loop.postloop:
-; CHECK-NEXT:    %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT:     %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT:     %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT:     br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
-  %len = load i32, i32* %a_len_ptr, !range !0
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %abc = icmp slt i32 %idx, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp slt i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; ULT condition for increasing loop from 0 to 100.
-define void @test_02(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK:      test_02
-; CHECK:        entry:
-; CHECK-NEXT:     %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT:     [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK:        loop:
-; CHECK-NEXT:     %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT:     %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT:     %abc = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK:        in.bounds:
-; CHECK-NEXT:     %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT:     store i32 0, i32* %addr
-; CHECK-NEXT:     %next = icmp ult i32 %idx, 100
-; CHECK-NEXT:     [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK:        main.exit.selector:
-; CHECK-NEXT:     %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT:     [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100
-; CHECK-NEXT:     br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK:        loop.postloop:
-; CHECK-NEXT:    %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT:     %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT:     %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT:     br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
-  %len = load i32, i32* %a_len_ptr, !range !0
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %abc = icmp ult i32 %idx, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp ult i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; Same as test_01, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_03(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK:      test_03
-
-entry:
-  %len = load i64, i64* %a_len_ptr, !range !1
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %idx.ext = sext i32 %idx to i64
-  %abc = icmp slt i64 %idx.ext, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp slt i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; Same as test_02, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_04(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK:      test_04
-
-entry:
-  %len = load i64, i64* %a_len_ptr, !range !1
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %idx.ext = sext i32 %idx to i64
-  %abc = icmp ult i64 %idx.ext, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp ult i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-!0 = !{i32 0, i32 50}
-!1 = !{i64 0, i64 50}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312775 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoFix a crash when emitting debug info for multi-reg function arguments
Adrian Prantl [Fri, 8 Sep 2017 02:31:37 +0000 (02:31 +0000)]
Fix a crash when emitting debug info for multi-reg function arguments
by reusing more of the existing machinery

This is a follow-up to r312169.
Thanks to Björn Pettersson for the testcase!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312773 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPC
Dean Michael Berris [Fri, 8 Sep 2017 01:47:56 +0000 (01:47 +0000)]
[XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPC

Summary:
This fixes code-gen for XRay in PPC. The regression wasn't caught by
codegen tests, which we add in this change.

What happened was the following:

- For tail exits, we used to unconditionally prepend the returns/exits
  with a pseudo-instruction that gets lowered to the instrumentation
  sled (and leave the actual return/exit instruction as-is).
- Changes to the XRay instrumentation pass caused the tail exits to
  suddenly also emit the tail exit pseudo-instruction, since the check
  for whether a return instruction was also a call instruction meant it
  was a tail exit instruction.
- None of the tests caught the regression either due to non-existent
  tests, or the tests being disabled/removed for continuous breakage.

This change re-introduces some of the basic tests and verifies that
we're back to a state that allows the back-end to generate appropriate
XRay instrumented binaries for PPC in the presence of tail exits.

Reviewers: echristo, timshen

Subscribers: nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D37570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312772 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[x86] Flesh out the custom ISel for RMW arithmetic ops with used flags to
Chandler Carruth [Fri, 8 Sep 2017 00:17:12 +0000 (00:17 +0000)]
[x86] Flesh out the custom ISel for RMW arithmetic ops with used flags to
cover the bitwise operators.

Nothing really exciting here, this just stamps out the rest of the core
operations that can RMW memory and set flags.

Still not implemented here: ADC, SBB. Those will require more
interesting logic to channel the flags *in*, and I'm not currently
planning to try to tackle that. It might be interesting for someone who
wants to improve our code generation for bignum implementations.

Differential Revision: https://reviews.llvm.org/D37141

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312768 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoWholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat.
Peter Collingbourne [Fri, 8 Sep 2017 00:10:53 +0000 (00:10 +0000)]
WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat.

This is required when targeting COFF, as the comdat name must match
one of the names of the symbols in the comdat.

Differential Revision: https://reviews.llvm.org/D37550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312767 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[x86] Extend the manual ISel of `add` and `sub` with both RMW memory
Chandler Carruth [Thu, 7 Sep 2017 23:54:24 +0000 (23:54 +0000)]
[x86] Extend the manual ISel of `add` and `sub` with both RMW memory
operands and used flags to support matching immediate operands.

This is a bit trickier than register operands, and we still want to fall
back on register operands even for things that appear to be
"immediates" when they won't actually select into the operation's
immediate operand. This also requires us to handle things like selecting
`sub` vs. `add` to minimize the number of bits needed to represent the
immediate, and picking the shortest immediate encoding. In order to do
that, we in turn need to scan to make sure that CF isn't used as it will
get inverted.
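
The `sub` vs. `add` choice described above can be illustrated with a small
standalone sketch. This is not the code from the patch: `RMWChoice`,
`fitsInt8` and `pickAddOrSub` are hypothetical names, and the sketch assumes
only two immediate encodings (sign-extended 8-bit and 32-bit). The
interesting case is an immediate of 128, which needs 32 bits for `add` but
only 8 bits when the operation is flipped to `sub` with -128. The flip is
skipped when CF is used, since the message above notes that CF gets
inverted.

```cpp
#include <cassert>
#include <cstdint>

enum class Op { Add, Sub };

// Hypothetical result type: which opcode to emit and how wide the
// immediate field has to be.
struct RMWChoice {
  Op Opcode;
  int64_t Imm;
  unsigned ImmBits; // 8 or 32 in this sketch
};

static bool fitsInt8(int64_t V) { return V >= -128 && V <= 127; }

// Picks between `add Imm` and `sub -Imm` (or vice versa) so that the
// immediate uses the shorter encoding. The flip is only allowed when the
// carry flag has no users, because add and sub produce inverted CF.
RMWChoice pickAddOrSub(Op Requested, int64_t Imm, bool CarryFlagUsed) {
  assert(Imm >= INT32_MIN && Imm <= INT32_MAX && "sketch handles imm32 only");
  RMWChoice C{Requested, Imm, fitsInt8(Imm) ? 8u : 32u};
  int64_t Negated = -Imm; // cannot overflow: Imm is within the int32 range
  if (!CarryFlagUsed && C.ImmBits == 32 && fitsInt8(Negated)) {
    C.Opcode = Requested == Op::Add ? Op::Sub : Op::Add;
    C.Imm = Negated;
    C.ImmBits = 8;
  }
  return C;
}
```

With these assumptions, `pickAddOrSub(Op::Add, 128, false)` selects `sub`
with an 8-bit immediate of -128, while the same request with CF in use
keeps `add` and the 32-bit immediate.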

The end result seems very nice though, and we're now generating
optimal instruction sequences for these patterns IMO.

A follow-up patch will further expand this to other operations with RMW
memory operands. But handling `add` and `sub` is a useful starting point
to flesh out the machinery and make sure interesting and complex cases
can be handled.

Thanks to Craig Topper who provided a few fixes and improvements to this
patch in addition to the review!

Differential Revision: https://reviews.llvm.org/D37139

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312764 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoDon't call exit from cl::PrintHelpMessage.
Rafael Espindola [Thu, 7 Sep 2017 23:30:48 +0000 (23:30 +0000)]
Don't call exit from cl::PrintHelpMessage.

Most callers were not expecting the exit(0) and trying to exit with a
different value.

This also adds back the call to cl::PrintHelpMessage in llvm-ar.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312761 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[Bitcode] Fix some Clang-tidy modernize-use-using and Include What You Use warnings...
Eugene Zelenko [Thu, 7 Sep 2017 23:28:24 +0000 (23:28 +0000)]
[Bitcode] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312760 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoSink some IntrinsicInst.h and Intrinsics.h out of llvm/include
Reid Kleckner [Thu, 7 Sep 2017 23:27:44 +0000 (23:27 +0000)]
Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include

Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312759 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRevert r312318, r312325, r312424, r312489
Richard Trieu [Thu, 7 Sep 2017 23:20:35 +0000 (23:20 +0000)]
Revert r312318, r312325, r312424, r312489

r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318

Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312758 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[llvm-objcopy] Add support for special section indexes in symbol table greater than...
Petr Hosek [Thu, 7 Sep 2017 23:02:50 +0000 (23:02 +0000)]
[llvm-objcopy] Add support for special section indexes in symbol table greater than SHN_LORESERVE

As is, indexes above SHN_LORESERVE will not be handled correctly because
they'll be treated as indexes of sections rather than special values
that should just be copied. This change adds support for copying them.
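
A minimal sketch of the distinction, not taken from llvm-objcopy itself:
the constant values are the standard ELF ones (SHN_UNDEF = 0,
SHN_LORESERVE = 0xff00, SHN_ABS = 0xfff1, SHN_COMMON = 0xfff2), but the
names `isSpecialSectionIndex`, `remapSymbolSectionIndex` and the
`NewIndexOf` table are hypothetical and only illustrate "copy special
values, remap real section indexes".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard ELF values, renamed here to keep the sketch self-contained.
constexpr uint16_t ShnUndef = 0;
constexpr uint16_t ShnLoReserve = 0xff00;
constexpr uint16_t ShnAbs = 0xfff1;
constexpr uint16_t ShnCommon = 0xfff2;

// True when the st_shndx value does not name an actual section.
bool isSpecialSectionIndex(uint16_t Index) {
  return Index == ShnUndef || Index >= ShnLoReserve;
}

// NewIndexOf[Old] gives the position of section Old in the output file
// (hypothetical table). Special values are copied through unchanged.
uint16_t remapSymbolSectionIndex(uint16_t OldIndex,
                                 const std::vector<uint16_t> &NewIndexOf) {
  if (isSpecialSectionIndex(OldIndex))
    return OldIndex;
  assert(OldIndex < NewIndexOf.size() && "symbol refers to unknown section");
  return NewIndexOf[OldIndex];
}
```

The sketch copies every reserved value verbatim, including 0xffff
(SHN_XINDEX); whether the real tool treats that value separately is not
stated in the message above.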

Patch by Jake Ehrlich

Differential Revision: https://reviews.llvm.org/D37393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312756 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoMove duplicate helpers from DbgValueInst / DbgDeclareInst to DbgInfoIntrinsic
Reid Kleckner [Thu, 7 Sep 2017 22:46:24 +0000 (22:46 +0000)]
Move duplicate helpers from DbgValueInst / DbgDeclareInst to DbgInfoIntrinsic

NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312754 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agollvm-ar: exit with 1 if there is an error.
Rafael Espindola [Thu, 7 Sep 2017 22:20:38 +0000 (22:20 +0000)]
llvm-ar: exit with 1 if there is an error.

This is pr34396.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312752 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[DWARF] Line 0 should not have a discriminator.
Paul Robinson [Thu, 7 Sep 2017 22:15:44 +0000 (22:15 +0000)]
[DWARF] Line 0 should not have a discriminator.
It's meaningless and takes up extra space in the line table.

Differential Revision: https://reviews.llvm.org/D37364

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312751 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoFix llvm-xray tests to avoid subshells
Reid Kleckner [Thu, 7 Sep 2017 21:28:09 +0000 (21:28 +0000)]
Fix llvm-xray tests to avoid subshells

We already use pipefail to detect failure of a redirected command, so
the "|| echo failure" construct was unnecessary.

These tests run and pass on Windows now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312747 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[ORC] Add ErrorSuccess and void specializations to AsyncHandlerTraits.
Lang Hames [Thu, 7 Sep 2017 21:04:00 +0000 (21:04 +0000)]
[ORC] Add ErrorSuccess and void specializations to AsyncHandlerTraits.

This will allow async handlers to be added that return void or Error::success().
Such handlers are expected to be common, since one of the primary uses of
addAsyncHandler is to run the body of the handler in a detached thread, in which
case the main handler returns immediately and does not need to provide an Error
value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312746 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[yaml2obj][ELF] Add support for symbol indexes greater than SHN_LORESERVE
Petr Hosek [Thu, 7 Sep 2017 20:44:16 +0000 (20:44 +0000)]
[yaml2obj][ELF] Add support for symbol indexes greater than SHN_LORESERVE

Right now Symbols must be either undefined or defined in a specific
section. Some symbols have section indexes like SHN_ABS however. This
change adds support for outputting symbols that have such section
indexes.

Patch by Jake Ehrlich

Differential Revision: https://reviews.llvm.org/D37391

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312745 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoCOFF: PDB: Allow multiple modules with the same name.
Peter Collingbourne [Thu, 7 Sep 2017 20:39:46 +0000 (20:39 +0000)]
COFF: PDB: Allow multiple modules with the same name.

It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").

Differential Revision: https://reviews.llvm.org/D37589

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312744 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRemove dead code. NFCI.
Peter Collingbourne [Thu, 7 Sep 2017 19:17:30 +0000 (19:17 +0000)]
Remove dead code. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312740 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[XRay][tools] Disable windows for tests that use an unsupported shell redirect.
Keith Wyss [Thu, 7 Sep 2017 19:10:34 +0000 (19:10 +0000)]
[XRay][tools] Disable windows for tests that use an unsupported shell redirect.

The tests are filechecking against stderr and use some magic to make stdout go
away and pipe stderr to FileCheck. This broke bots on windows.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312739 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[CUDA] Added rudimentary support for CUDA-9 and sm_70.
Artem Belevich [Thu, 7 Sep 2017 18:14:32 +0000 (18:14 +0000)]
[CUDA] Added rudimentary support for CUDA-9 and sm_70.

For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.

On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.

Differential Revision: https://reviews.llvm.org/D37576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312734 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[XRay][tools] Function call stack based analysis tooling for XRay traces
Keith Wyss [Thu, 7 Sep 2017 18:07:48 +0000 (18:07 +0000)]
[XRay][tools] Function call stack based analysis tooling for XRay traces

Second try after fixing a code san problem with iterator reference types.

This change introduces a subcommand to the llvm-xray tool called
"stacks" which allows for analysing XRay traces provided as inputs and
accounting time to stacks instead of just individual functions. This
gives us a more precise view of where in a program the latency is
actually attributed.

The tool uses a trie data structure to keep track of the caller-callee
relationships as we process the XRay traces. In particular, we keep
track of the function call stack as we enter functions. While we're
doing this we're adding nodes in a trie and indicating a "calls"
relationship between the caller (current top of the stack) and the callee
(the new top of the stack). When we push function ids onto the stack, we
keep track of the timestamp (TSC) for the enter event.

When exiting functions, we are able to account the duration by getting
the difference between the timestamp of the exit event and the
corresponding entry event in the stack. This works even if we somehow
miss the exit events for intermediary functions (i.e. if the exit event
is not cleanly associated with the enter event at the top of the stack).
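
The accounting scheme described in the two paragraphs above can be sketched
roughly as follows. This is a simplified illustration, not the llvm-xray
implementation: `Event`, `TrieNode` and `StackTrie` are hypothetical types,
timestamps are assumed to be monotonic, and frames whose exit events were
missed are simply dropped without being charged any time (an assumption of
this sketch).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

enum class EventKind { Enter, Exit };

struct Event {
  EventKind Kind;
  int32_t FuncId;
  uint64_t TSC;
};

// One node per distinct call path; children are keyed by callee id.
struct TrieNode {
  int32_t FuncId = 0;
  uint64_t Calls = 0;
  uint64_t TotalTicks = 0;
  std::map<int32_t, std::unique_ptr<TrieNode>> Callees;
};

class StackTrie {
public:
  void account(const Event &E) {
    if (E.Kind == EventKind::Enter) {
      // The caller is the current top of the stack (or the root).
      TrieNode *Parent = Stack.empty() ? &Root : Stack.back().first;
      std::unique_ptr<TrieNode> &Slot = Parent->Callees[E.FuncId];
      if (!Slot) {
        Slot.reset(new TrieNode());
        Slot->FuncId = E.FuncId;
      }
      Stack.emplace_back(Slot.get(), E.TSC);
      return;
    }
    // Exit: find the nearest matching enter, searching from the top so
    // that missed exits of intermediary functions do not block us.
    for (std::size_t I = Stack.size(); I-- > 0;) {
      if (Stack[I].first->FuncId != E.FuncId)
        continue;
      TrieNode *Node = Stack[I].first;
      ++Node->Calls;
      Node->TotalTicks += E.TSC - Stack[I].second;
      Stack.erase(Stack.begin() + static_cast<std::ptrdiff_t>(I), Stack.end());
      return;
    }
    // No matching enter on the stack: this sketch ignores the event.
  }

  const TrieNode &root() const { return Root; }

private:
  TrieNode Root;
  std::vector<std::pair<TrieNode *, uint64_t>> Stack;
};
```

Feeding enter events for functions 1, 2 and 3 followed by exits for 2 and
1 only still attributes the full durations to the paths 1 and 1 -> 2; the
node for 1 -> 2 -> 3 exists in the trie but records no completed call.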

The output of the tool currently provides just the top N leaf functions
that contribute the most latency, and the top N stacks that have the
most frequency. In the future we can provide more sophisticated query
mechanisms and potentially an export to database feature to make offline
analysis of the stack traces possible with existing tools.

Differential revision: D34863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312733 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Start selecting v_mad_mix_f32
Matt Arsenault [Thu, 7 Sep 2017 18:05:07 +0000 (18:05 +0000)]
AMDGPU: Start selecting v_mad_mix_f32

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312732 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoDAG: Allow creating extract_vector_elt post-legalize
Matt Arsenault [Thu, 7 Sep 2017 17:24:43 +0000 (17:24 +0000)]
DAG: Allow creating extract_vector_elt post-legalize

Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.

This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements, but also adds a few
regressions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312730 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Handle non-temporal loads and stores
Konstantin Zhuravlyov [Thu, 7 Sep 2017 17:14:54 +0000 (17:14 +0000)]
AMDGPU: Handle non-temporal loads and stores

Differential Revision: https://reviews.llvm.org/D36862

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312729 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Handle more than one memory operand in SIMemoryLegalizer
Konstantin Zhuravlyov [Thu, 7 Sep 2017 16:14:21 +0000 (16:14 +0000)]
AMDGPU: Handle more than one memory operand in SIMemoryLegalizer

Differential Revision: https://reviews.llvm.org/D37397

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312725 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[ARM] Remove redundant vcvt patterns.
Benjamin Kramer [Thu, 7 Sep 2017 14:52:26 +0000 (14:52 +0000)]
[ARM] Remove redundant vcvt patterns.

These don't add any value as they're just compositions of existing
patterns. However, they can confuse the cost logic in ISel, leading to
duplicated vcvt instructions like in PR33199.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312724 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF...
Michael Zuckerman [Thu, 7 Sep 2017 14:02:13 +0000 (14:02 +0000)]
[X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF{8|16|32} stride 3).

This patch expands the support of lowerInterleavedLoad to {8|16|32}x8i stride 3.

LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}), and we plan to include the store (deinterleaved side).

The patch goal is to optimize the following sequence:
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7

into

a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7
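
As a scalar reference for the transformation shown above, the following
sketch splits a stride-3 interleaved sequence into its three component
sequences. It only illustrates the data movement; the patch itself lowers
this to vector shuffles, and the helper name `deinterleaveStride3` is
hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Element I of the wide input belongs to stream I % 3 (a, b or c).
template <typename T>
std::array<std::vector<T>, 3> deinterleaveStride3(const std::vector<T> &Wide) {
  assert(Wide.size() % 3 == 0 && "input must hold whole (a, b, c) groups");
  std::array<std::vector<T>, 3> Out;
  for (std::size_t I = 0; I < Wide.size(); ++I)
    Out[I % 3].push_back(Wide[I]);
  return Out;
}
```

For an input holding a0 b0 c0 a1 b1 c1 and so on, `Out[0]` collects the
a elements, `Out[1]` the b elements and `Out[2]` the c elements, matching
the three rows of the target layout.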

Reviewers
1. zvi
2. igor
3. guyblank
4. dorit
5. Ayal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312722 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[mips] Use RegisterMCAsmBackend to register all MIPS asm backends. NFC
Simon Atanasyan [Thu, 7 Sep 2017 12:54:26 +0000 (12:54 +0000)]
[mips] Use RegisterMCAsmBackend to register all MIPS asm backends. NFC

This change converts the `MipsAsmBackend` constructor to the "standard"
form. It makes it possible to use `RegisterMCAsmBackend` for the backend
registrations. Now we pass a `Triple` instance to the `MipsAsmBackend`
ctor and deduce all required options like endianness and bitness from
the triple. We still need to implement explicit ABI checking for
providing correct options to backends.

Differential revision: https://reviews.llvm.org/D37519

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312720 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[MachineCombiner] Update instruction depths incrementally for large BBs.
Florian Hahn [Thu, 7 Sep 2017 12:49:39 +0000 (12:49 +0000)]
[MachineCombiner] Update instruction depths incrementally for large BBs.

Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and its relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.
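
The idea of computing depths incrementally while walking a block once can
be sketched as below. This is an illustration under simplifying
assumptions, not the MachineTraceMetrics code: `DepthTracker` is a
hypothetical class, each instruction has a single latency, and operands are
given as indices of earlier instructions in the same block. The depth of an
instruction is taken to be the largest "depth plus latency" over the
instructions that define its operands.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class DepthTracker {
public:
  // Records the next instruction in program order and returns its depth.
  // Only the already-known depths of its operands are consulted, so no
  // earlier result has to be recomputed.
  unsigned append(unsigned Latency, const std::vector<std::size_t> &Operands) {
    unsigned Depth = 0;
    for (std::size_t Op : Operands) {
      assert(Op < Depths.size() && "operand must be defined earlier");
      Depth = std::max(Depth, Depths[Op] + Latencies[Op]);
    }
    Depths.push_back(Depth);
    Latencies.push_back(Latency);
    return Depth;
  }

  unsigned depth(std::size_t Index) const { return Depths.at(Index); }

private:
  std::vector<unsigned> Depths;
  std::vector<unsigned> Latencies;
};
```

Each `append` call costs time proportional to the number of operands, so a
whole block is handled in one linear pass. The sketch does not attempt the
critical path length, which the message above says cannot be updated this
way.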

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312719 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[MachineTraceMetrics] Add computeDepth function (NFCI).
Florian Hahn [Thu, 7 Sep 2017 11:51:30 +0000 (11:51 +0000)]
[MachineTraceMetrics] Add computeDepth function (NFCI).

Summary:
This function is used in D36619 to update the instruction depths
incrementally.

Reviewers: efriedma, Gerolf, MatzeB, fhahn

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312714 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[Sparc][NFC] Clean up SelectCC lowering
Alex Bradbury [Thu, 7 Sep 2017 11:30:55 +0000 (11:30 +0000)]
[Sparc][NFC] Clean up SelectCC lowering
The ARM, BPF, MSP430, Sparc and Mips backends all use a similar code sequence
for lowering SelectCC. As pointed out by @reames in D29937, this code isn't
particularly clear and in most of these backends doesn't actually match the
comments. This patch makes the code sequence clearer for the Sparc backend
through better variable naming and more accurate comments (e.g. we are
inserting triangle control flow, _not_ diamond). There is no functional
change.

Differential Revision: https://reviews.llvm.org/D37194

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312713 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoFixing incorrectly capitalised regexps.
Benjamin Kramer [Thu, 7 Sep 2017 09:54:03 +0000 (09:54 +0000)]
Fixing incorrectly capitalised regexps.

Patch by Sam Allen!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312709 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRevert "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing...
Jonas Paulsson [Thu, 7 Sep 2017 09:13:17 +0000 (09:13 +0000)]
Revert "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing them"

This temporarily reverts commit 463fa38 (r311401).

See https://bugs.llvm.org/show_bug.cgi?id=34502

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312708 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[x86] Update to cmov promotion tests for D36711; NFC
Alexander Ivchenko [Thu, 7 Sep 2017 08:59:05 +0000 (08:59 +0000)]
[x86] Update to cmov promotion tests for D36711; NFC

Adding i8 -> [i16, i32, i64] and i32 -> i64 cases.
This way we can see what the current codegen looks like.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312707 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoX86: Improve AVX512 fptoui lowering
Zvi Rackover [Thu, 7 Sep 2017 07:40:34 +0000 (07:40 +0000)]
X86: Improve AVX512 fptoui lowering

Summary:
Add patterns for
  fptoui <16 x float> to <16 x i8>
  fptoui <16 x float> to <16 x i16>

Reviewers: igorb, delena, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312704 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Force shuffle lowering to only create X86ISD::VPERM2X128 with 64-bit element...
Craig Topper [Thu, 7 Sep 2017 06:11:10 +0000 (06:11 +0000)]
[X86] Force shuffle lowering to only create X86ISD::VPERM2X128 with 64-bit element types so we can remove some patterns from isel.

Intrinsic handling is still creating these nodes with 32-bit elements as well. But at least this gets rid of 8 and 16.

Ideally, someday we'll convert the intrinsics to generic vector shuffles and remove the intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312702 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Don't legalize i16 extloads to i32 with legal i16
Matt Arsenault [Thu, 7 Sep 2017 05:37:34 +0000 (05:37 +0000)]
AMDGPU: Don't legalize i16 extloads to i32 with legal i16

Keeping non-i16 extloads makes it easier to match some new
gfx9 load instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312699 91177308-0d34-0410-b5e6-96231b3b80d8