OSDN Git Service
Saleem Abdulrasool [Mon, 29 Aug 2016 20:42:03 +0000 (20:42 +0000)]
ExecutionEngine: fix a bug in the movt/movw relocator
According to the ARM ARM (Architecture Reference Manual), 4 bytes are needed
for a shift instead of 8. This was causing the movt instruction to write to a
different register sometimes.
Patch by Walter Erquinigo!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280005
91177308-0d34-0410-b5e6-
96231b3b80d8
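The commit message does not show the relocation code, so the following is only a minimal sketch of why a wrong shift or mask can redirect movt to another register. It assumes the ARM (A1) encoding of MOVW/MOVT, where the 16-bit immediate is split into imm4 (bits 19:16) and imm12 (bits 11:0) with the destination register Rd sitting between them in bits 15:12. `patchMovImm16` is a hypothetical helper, not the function touched by the patch.

```cpp
#include <cstdint>

// Hypothetical helper: write a 16-bit immediate into an ARM A1-encoded
// MOVW/MOVT instruction word.
// Assumed field layout: imm4 = bits 19:16, Rd = bits 15:12, imm12 = bits 11:0.
std::uint32_t patchMovImm16(std::uint32_t insn, std::uint16_t imm) {
  const std::uint32_t immMask = 0x000F0FFFu;  // imm4 and imm12 only, Rd excluded
  const std::uint32_t imm12 = imm & 0xFFFu;   // low 12 bits of the value
  const std::uint32_t imm4 = (imm >> 12) & 0xFu;  // high 4 bits of the value
  // Clear the old immediate fields, then place imm4 above Rd and imm12 below it.
  return (insn & ~immMask) | (imm4 << 16) | imm12;
}
```

If the high bits of the immediate were placed at the wrong position, they would land in bits 15:12 and overwrite Rd, which matches the symptom described in the message.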
Chris Bieneman [Mon, 29 Aug 2016 20:18:52 +0000 (20:18 +0000)]
[CMake] Builtins build needs LLVM_*_OUTPUT_INTDIR variables
This allows the builtins archives to build into the correct subdirectory under the binary dir. Addresses the issue discussed in D24001.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280002
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthew Simpson [Mon, 29 Aug 2016 20:14:04 +0000 (20:14 +0000)]
[LV] Move insertelement sequence after scalar definitions
After r279649 when getting a vector value from VectorLoopValueMap, we create an
insertelement sequence on-demand if the value has been scalarized instead of
vectorized. We previously inserted this insertelement sequence before the
value's first vector user. However, this insert location is problematic if that
user is the phi node of a first-order recurrence. With this patch, we move the
insertelement sequence after the last scalar instruction we created when
scalarizing the value. Thus, the value's vector definition in the new loop will
immediately follow its scalar definitions. This should fix PR30183.
Reference: https://llvm.org/bugs/show_bug.cgi?id=30183
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280001
91177308-0d34-0410-b5e6-
96231b3b80d8
Krzysztof Parzyszek [Mon, 29 Aug 2016 19:50:15 +0000 (19:50 +0000)]
Propagate TBAA info in SelectionDAG::getIndexedLoad
Patch by Pranav Bhandarkar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279998
91177308-0d34-0410-b5e6-
96231b3b80d8
Douglas Katzman [Mon, 29 Aug 2016 19:42:57 +0000 (19:42 +0000)]
[Myriad]: add missing 'mcpu' values
Should have been done with r276646.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279996
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Mon, 29 Aug 2016 19:42:52 +0000 (19:42 +0000)]
AMDGPU/SI: Implement a custom MachineSchedStrategy
Summary:
GCNSchedStrategy re-uses most of GenericScheduler; it just uses
a different method to compute the excess and critical register
pressure limits.
It's not enabled by default. To enable it, pass -misched=gcn
to llc.
Shader DB stats:
32464 shaders in 17874 tests
Totals:
SGPRS: 1542846 -> 1643125 (6.50 %)
VGPRS: 1005595 -> 904653 (-10.04 %)
Spilled SGPRs: 29929 -> 27745 (-7.30 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 36688188 -> 37034900 (0.95 %) bytes
LDS: 1913 -> 1913 (0.00 %) blocks
Max Waves: 254101 -> 265125 (4.34 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 1338220 -> 1438499 (7.49 %)
VGPRS: 886221 -> 785279 (-11.39 %)
Spilled SGPRs: 29869 -> 27685 (-7.31 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 34315716 -> 34662428 (1.01 %) bytes
LDS: 1551 -> 1551 (0.00 %) blocks
Max Waves: 188127 -> 199151 (5.86 %)
Wait states: 0 -> 0 (0.00 %)
Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23688
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279995
91177308-0d34-0410-b5e6-
96231b3b80d8
Vitaly Buka [Mon, 29 Aug 2016 19:28:34 +0000 (19:28 +0000)]
[asan] Enable new stack poisoning with store instruction by default
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23968
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279993
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Mon, 29 Aug 2016 19:27:20 +0000 (19:27 +0000)]
GlobalISel: switch to SmallVector for pending legalizations.
std::queue was doing far too many heap allocations to be healthy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279992
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Mon, 29 Aug 2016 19:15:22 +0000 (19:15 +0000)]
AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler
Summary:
The SILoadStoreOptimizer can now look ahead more than one instruction when
looking for instructions to merge, which greatly improves the number of
loads/stores that we are able to merge.
Moving the pass before scheduling avoids increasing register pressure after
the scheduler, so that the scheduler's register pressure estimates will be
more accurate. It also gives more consistent results, since it is no longer
affected by minor scheduling changes.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23814
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279991
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Mon, 29 Aug 2016 19:12:20 +0000 (19:12 +0000)]
ASan: remove variable only used in assertions build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279990
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Mon, 29 Aug 2016 19:07:16 +0000 (19:07 +0000)]
GlobalISel: legalize frem to a libcall on AArch64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279988
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Mon, 29 Aug 2016 19:07:08 +0000 (19:07 +0000)]
GlobalISel: rework CallLowering so that it can be used for libcalls too.
There should be no functional change here, I'm just making the implementation
of "frem" (to libcall) legalization easier for a followup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279987
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 29 Aug 2016 19:01:48 +0000 (19:01 +0000)]
AMDGPU/R600: Fix fixups used for constant arrays
Fixes bug 29289
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279986
91177308-0d34-0410-b5e6-
96231b3b80d8
Kyle Butt [Mon, 29 Aug 2016 18:27:12 +0000 (18:27 +0000)]
IfConversion: Fix branch predication bug.
This bug shows up with diamonds that share unpredicable, unanalyzable branches.
There's an included test case from Hexagon. What was happening was that we were
attempting to predicate the branch instruction despite the fact that it was
checked to be the same. Now for unanalyzable branches we skip over the branch
instructions when predicating the block.
Differential Revision: https://reviews.llvm.org/D23939
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279985
91177308-0d34-0410-b5e6-
96231b3b80d8
Vitaly Buka [Mon, 29 Aug 2016 18:17:21 +0000 (18:17 +0000)]
Use store operation to poison allocas for lifetime analysis.
Summary:
Calling __asan_poison_stack_memory and __asan_unpoison_stack_memory for small
variables is too expensive.
Code is disabled by default and can be enabled by -asan-experimental-poisoning.
PR27453
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23947
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279984
91177308-0d34-0410-b5e6-
96231b3b80d8
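To illustrate the idea of poisoning by plain stores rather than by runtime calls, here is a toy model of shadow memory in which one shadow byte describes 8 application bytes. Everything in it (`ToyShadow`, `kPoisoned`, the flat vector) is an assumption made for the sketch; the real instrumentation emits stores into the process's shadow region and handles partially addressable granules, which this model ignores.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kGranularity = 8;   // application bytes per shadow byte
constexpr std::uint8_t kPoisoned = 0xf8;  // marker value chosen for this sketch

// Toy shadow memory for a small "stack frame" of appSize bytes.
struct ToyShadow {
  std::vector<std::uint8_t> bytes;

  explicit ToyShadow(std::size_t appSize) : bytes(appSize / kGranularity, 0) {}

  // Poison [offset, offset + size) with direct stores to the shadow bytes.
  // Assumes offset and size are multiples of kGranularity.
  void poison(std::size_t offset, std::size_t size) { fill(offset, size, kPoisoned); }

  // Unpoison the same kind of range by storing zero.
  void unpoison(std::size_t offset, std::size_t size) { fill(offset, size, 0); }

  bool isPoisoned(std::size_t offset) const {
    return bytes[offset / kGranularity] != 0;
  }

 private:
  void fill(std::size_t offset, std::size_t size, std::uint8_t value) {
    for (std::size_t i = offset / kGranularity; i < (offset + size) / kGranularity; ++i)
      bytes[i] = value;
  }
};
```

For a small variable the loop body is one or two byte stores, which is the cost the message contrasts with a call into the runtime.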
Vitaly Buka [Mon, 29 Aug 2016 17:41:29 +0000 (17:41 +0000)]
[asan] Separate calculation of ShadowBytes from calculating ASanStackFrameLayout
Summary: No functional changes, just refactoring to make D23947 simpler.
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23954
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279982
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Mon, 29 Aug 2016 17:14:08 +0000 (17:14 +0000)]
[SimplifyCFG] Hoisting invalidates metadata
We forgot to remove optimization metadata when performing hoisting during
FoldTwoEntryPHINode.
This fixes PR29163.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279980
91177308-0d34-0410-b5e6-
96231b3b80d8
Reid Kleckner [Mon, 29 Aug 2016 16:35:43 +0000 (16:35 +0000)]
Make vec_fabs.ll pass with MSVC 2013
We should revert this change once we drop support for MSVC 2013.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279979
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Mon, 29 Aug 2016 16:22:23 +0000 (16:22 +0000)]
[gold] Fix test accidentally regressed for newer gold
With r279911 I accidentally regressed the gold/X86/start-lib-common.ll
test for newer golds (v1.12+) that honor the --start-lib/--end-lib.
Remove the alignment which should not be there to make this work with
both old and new gold linkers.
Additionally, now that we have a subdirectory for v1.12+ gold tests,
copy this test there and check specifically for the v1.12+ behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279977
91177308-0d34-0410-b5e6-
96231b3b80d8
Evandro Menezes [Mon, 29 Aug 2016 16:04:37 +0000 (16:04 +0000)]
[AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279976
91177308-0d34-0410-b5e6-
96231b3b80d8
Anna Thomas [Mon, 29 Aug 2016 15:41:59 +0000 (15:41 +0000)]
[StatepointsForGC] Rematerialize in the presence of PHIs
Summary:
While walking the use chain for identifying rematerializable values in RS4GC,
add the case where the current value and base value are the same PHI nodes.
This will aid rematerialization of geps and casts instead of relocating.
Reviewers: sanjoy, reames, igor
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23920
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279975
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Mon, 29 Aug 2016 15:33:01 +0000 (15:33 +0000)]
[LTO] Remove extraneous output
Remove some debugging output to stderr that snuck in with r279576.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279974
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 29 Aug 2016 15:27:17 +0000 (15:27 +0000)]
[Constant] remove fdiv and frem from canTrap()
Assuming the default FP env, we should not treat fdiv and frem any differently in terms of
trapping behavior than any other FP op. I.e., FP ops do not trap with the default FP env.
This matches how we treat the fdiv/frem in IR with isSafeToSpeculativelyExecute() and in
the backend after:
https://reviews.llvm.org/rL279970
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279973
91177308-0d34-0410-b5e6-
96231b3b80d8
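The claim that fdiv and frem do not trap under the default FP environment can be observed from C++ on a platform with IEEE 754 floats (an assumption of this sketch; it can be checked with `std::numeric_limits<float>::is_iec559`). The helper names below are illustrative only.

```cpp
#include <cmath>

// Divide through volatile operands so the compiler performs the division at
// run time instead of folding it away.
float divideAtRuntime(float numerator, float denominator) {
  volatile float n = numerator;
  volatile float d = denominator;
  return n / d;
}

// Floating-point remainder, the C++ counterpart of frem.
float remainderAtRuntime(float numerator, float denominator) {
  volatile float n = numerator;
  volatile float d = denominator;
  return std::fmod(n, d);
}
```

Under those assumptions, dividing by zero yields an infinity or a NaN and execution continues, so there is no trap for the optimizer to preserve.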
Sanjay Patel [Mon, 29 Aug 2016 14:57:53 +0000 (14:57 +0000)]
[SimplifyCFG] rename test file, regenerate checks, and add test
The fdiv test shows a problem similar to:
https://reviews.llvm.org/rL279970
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279972
91177308-0d34-0410-b5e6-
96231b3b80d8
Gor Nishanov [Mon, 29 Aug 2016 14:34:12 +0000 (14:34 +0000)]
[Coroutines] Part 9: Add cleanup subfunction.
Summary:
[Coroutines] Part 9: Add cleanup subfunction.
This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces the expected result (see test/Transform/Coroutines/ex3.ll).
Intrinsic Changes:
* coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine.
* coro.id gets an extra parameter that points back to a coroutine function. This makes it possible to check whether a coro.id describes the enclosing function or belongs to a different function that was later inlined.
CoroSplit now creates three subfunctions:
# f$resume - resume logic
# f$destroy - cleanup logic, followed by a deallocation code
# f$cleanup - just the cleanup code
During devirtualization, the CoroElide pass replaces coro.destroy with either f$destroy or f$cleanup, depending on whether heap elision is performed.
Other fixes, improvements:
* Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point.
* Switched to using a variable-width suspend index field (no longer limited to 32 bits; the index field can be as small as i1 or as large as i<whatever-size_t-is>).
Reviewers: majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23844
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279971
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 29 Aug 2016 13:32:41 +0000 (13:32 +0000)]
[TargetLowering] remove fdiv and frem from canOpTrap() (PR29114)
Assuming the default FP env, we should not treat fdiv and frem any differently in terms of
trapping behavior than any other FP op. I.e., FP ops do not trap with the default FP env.
This matches how we treat these ops in IR with isSafeToSpeculativelyExecute(). There's a
similar bug in Constant::canTrap().
This bug manifests in PR29114:
https://llvm.org/bugs/show_bug.cgi?id=29114
...as a sequence of scalar divisions instead of a vector division on x86 for a <3 x float>
type.
Differential Revision: https://reviews.llvm.org/D23974
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279970
91177308-0d34-0410-b5e6-
96231b3b80d8
Krzysztof Parzyszek [Mon, 29 Aug 2016 13:15:35 +0000 (13:15 +0000)]
Do not use MRI::getMaxLaneMaskForVReg as a mask covering whole register
MRI::getMaxLaneMaskForVReg does not always cover the whole register.
For example, on X86 the upper 16 bits of EAX cannot be accessed via
any subregister. Consequently, there is no lane mask that only covers
that part of EAX. The getMaxLaneMaskForVReg will return the union of
the lane masks for all subregisters, and in case of EAX, that union
will not cover the upper 16 bits.
This fixes https://llvm.org/bugs/show_bug.cgi?id=29132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279969
91177308-0d34-0410-b5e6-
96231b3b80d8
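The EAX example can be pictured with a toy model that tracks which bits of a 32-bit register are reachable through its subregisters. Note the simplification: real lane masks are abstract lane bits rather than register bit positions, and `ToySubReg` and `coveredBits` are hypothetical names used only here.

```cpp
#include <cstdint>
#include <initializer_list>

// A subregister described by the register bits it overlays (toy model).
struct ToySubReg {
  unsigned firstBit;
  unsigned numBits;
};

// Union of the bits covered by all listed subregisters.
std::uint32_t coveredBits(std::initializer_list<ToySubReg> subRegs) {
  std::uint32_t covered = 0;
  for (const ToySubReg &s : subRegs) {
    // Build a field of numBits ones; guard the full-width case to avoid
    // shifting a 32-bit value by 32.
    std::uint32_t field = (s.numBits >= 32) ? 0xFFFFFFFFu : ((1u << s.numBits) - 1u);
    covered |= field << s.firstBit;
  }
  return covered;
}
```

With AL (bits 0 to 7), AH (bits 8 to 15) and AX (bits 0 to 15) as the only subregisters, the union is 0x0000FFFF, so treating it as "the whole register" misses the upper 16 bits.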
Tom Stellard [Mon, 29 Aug 2016 13:06:10 +0000 (13:06 +0000)]
AMDGPU/SI: Improve register allocation hints for sopk instructions
Summary:
For shrinking SOPK instructions, we were creating a hint to tell the
register allocator to use the register allocated for src0 for the dst
operand as well. However, this seems to not work sometimes depending
on the order virtual registers are assigned physical registers.
To fix this, I've added a second allocation hint which does the reverse,
asks that the register allocated for dst is used for src0.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23862
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279968
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Mon, 29 Aug 2016 12:47:22 +0000 (12:47 +0000)]
Use the correct ctor/dtor section for dynamic-no-pic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279967
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Mon, 29 Aug 2016 12:41:32 +0000 (12:41 +0000)]
Mark test as XFAIL instead of disabling it everywhere.
There is no lit feature 'X86' so this test is just disabled completely.
Make it XFAIL until a solution is found.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279966
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Mon, 29 Aug 2016 12:33:42 +0000 (12:33 +0000)]
Move code only used by codegen out of MC. NFC.
MC itself never needs to know about these sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279965
91177308-0d34-0410-b5e6-
96231b3b80d8
Haojian Wu [Mon, 29 Aug 2016 12:26:33 +0000 (12:26 +0000)]
Fix -Wunused-but-set-variable warning.
Summary: A follow-up fix on r279958.
Reviewers: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23989
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279964
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Mon, 29 Aug 2016 12:05:32 +0000 (12:05 +0000)]
AMDGPU/SI: Query AA, if available, in areMemAccessesTriviallyDisjoint()
Summary:
The SILoadStoreOptimizer will need to use AliasAnalysis here in order to
move it before scheduling.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23813
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279963
91177308-0d34-0410-b5e6-
96231b3b80d8
Igor Breger [Mon, 29 Aug 2016 09:12:31 +0000 (09:12 +0000)]
Fixed a bug in type legalizer for masked gather.
The problem occurs when the node isn't updated in place and UpdateNodeOperation() returns a node that already exists.
In this case an assert fails in PromoteIntegerOperand(), because N has 2 results (val + chain).
Differential Revision: http://reviews.llvm.org/D23756
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279961
91177308-0d34-0410-b5e6-
96231b3b80d8
Igor Breger [Mon, 29 Aug 2016 08:52:52 +0000 (08:52 +0000)]
[AVX512] In some cases KORTEST instruction may be used instead of ZEXT + TEST sequence.
Differential Revision: http://reviews.llvm.org/D23490
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279960
91177308-0d34-0410-b5e6-
96231b3b80d8
Haojian Wu [Mon, 29 Aug 2016 08:48:15 +0000 (08:48 +0000)]
[InstructionSelect] NumBlocks isn't defined in DEBUG build.
Summary: A follow-up fix for http://llvm.org/viewvc/llvm-project?view=revision&revision=279905.
Reviewers: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23985
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279959
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 29 Aug 2016 04:49:31 +0000 (04:49 +0000)]
[X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just create a ConstantFPSDNode and let that be lowered.
This allows broadcast loads to be used when available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279958
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 29 Aug 2016 04:49:27 +0000 (04:49 +0000)]
[AVX-512] Always use v8i64 when converting 512-bit FAND/FOR/FXOR/FANDN to integer operations when DQI isn't supported. This is consistent with the recent changes to promote logical operations to i64 vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279957
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 29 Aug 2016 04:49:24 +0000 (04:49 +0000)]
[AVX-512] Add 512-bit fabs tests with and without AVX512DQ.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279956
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Mon, 29 Aug 2016 00:54:29 +0000 (00:54 +0000)]
[Orc] Simplify LogicalDylib and move it back inside CompileOnDemandLayer. Also
switch to using one indirect stub manager per logical dylib rather than one per
input module.
LogicalDylib is a helper class used by the CompileOnDemandLayer to manage
symbol resolution between modules during lazy compilation. In particular, it
ensures that internal symbols resolve correctly even in the case where multiple
input modules contain the same internal symbol name (which must be promoted
to external hidden linkage so that functions in any given module can be split
out by lazy compilation). LogicalDylib's resolution scheme (before this commit)
required one stub-manager per input module. This made recompilation of functions
(by adding a module containing a new definition) difficult, as the stub manager
for any given symbol was bound to the module that supplied the original
definition. By using one stub manager for the whole logical dylib, symbols can
be more easily replaced, although support for doing this is not included in this
patch (it will be implemented in a follow up).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279952
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sun, 28 Aug 2016 22:20:51 +0000 (22:20 +0000)]
[AVX-512] Add support for selecting 512-bit VPABSB/VPABSW when BWI is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279951
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sun, 28 Aug 2016 22:20:48 +0000 (22:20 +0000)]
[AVX-512] Add patterns for selecting 128/256-bit EVEX VPABS instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279950
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sun, 28 Aug 2016 22:20:45 +0000 (22:20 +0000)]
[AVX-512] Add testcases showing that we don't emit 512-bit vpabsb/vpabsw. Will be fixed in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279949
91177308-0d34-0410-b5e6-
96231b3b80d8
Sylvestre Ledru [Sun, 28 Aug 2016 20:29:18 +0000 (20:29 +0000)]
Fix some typos in the doc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279943
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Sun, 28 Aug 2016 18:31:32 +0000 (18:31 +0000)]
[x86] add tests for <3 x N> vector types (PR29114)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279939
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Sun, 28 Aug 2016 18:18:00 +0000 (18:18 +0000)]
[InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat constant vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279937
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Sun, 28 Aug 2016 17:27:14 +0000 (17:27 +0000)]
[X86][AVX512] Only combine EVEX targets shuffles to shuffles of the same number of vector elements
Overeager combining prevents the correct folding of writemasks.
At the moment this occurs for ALL EVEX shuffles; in the future we need to check that the user of the root shuffle is a VSELECT that can fold to a writemask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279934
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Sun, 28 Aug 2016 16:17:58 +0000 (16:17 +0000)]
[PowerPC] Implement lowering for atomicrmw min/max/umin/umax
Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279933
91177308-0d34-0410-b5e6-
96231b3b80d8
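The patch itself is not shown here. Conceptually, an atomic min is a read-modify-write retry loop, and the same shape can be written in portable C++ with compare-and-swap. The sketch below is an analogy, not the PowerPC lowering, and `fetchMin` is a hypothetical name.

```cpp
#include <atomic>

// Atomically set `target` to min(target, value) and return the previous
// value, mirroring the result convention of atomicrmw.
int fetchMin(std::atomic<int> &target, int value) {
  int observed = target.load();
  // Retry while our value is still smaller and the exchange has not gone
  // through. On failure, compare_exchange_weak reloads `observed` with the
  // current contents, so the comparison is redone against fresh data.
  while (value < observed && !target.compare_exchange_weak(observed, value)) {
  }
  return observed;
}
```

The max, umin and umax variants differ only in the comparison (and, for the unsigned ones, in the integer type).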
Elena Demikhovsky [Sun, 28 Aug 2016 08:53:53 +0000 (08:53 +0000)]
[Loop Vectorizer] Fixed memory conflict checks.
Fixed a bug in run-time checks for possible memory conflicts inside loop.
The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size".
Differential revision: https://reviews.llvm.org/D23176
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279930
91177308-0d34-0410-b5e6-
96231b3b80d8
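A minimal sketch of the boundary calculation described above, assuming half-open byte ranges and plain integer addresses. `AccessRange`, `makeRange` and `mayConflict` are hypothetical names, not the vectorizer's own.

```cpp
#include <cstdint>

// Half-open byte range [low, high) touched by a strided access in the loop.
struct AccessRange {
  std::uintptr_t low;
  std::uintptr_t high;
};

// High is "last memory access pointer + element size", so the bytes of the
// final element are part of the range.
AccessRange makeRange(std::uintptr_t firstAccess, std::uintptr_t lastAccess,
                      std::uintptr_t elementSize) {
  return {firstAccess, lastAccess + elementSize};
}

// Two ranges may conflict when they overlap.
bool mayConflict(const AccessRange &a, const AccessRange &b) {
  return a.low < b.high && b.low < a.high;
}
```

For 4-byte elements, an access group whose last element starts at address 12 occupies bytes up to 15. A second group starting at 14 therefore overlaps it, which a High of 12 (without the element size) would fail to report.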
Craig Topper [Sun, 28 Aug 2016 06:06:28 +0000 (06:06 +0000)]
[AVX-512] Promote AND/OR/XOR to v2i64/v4i64/v8i64 even when we have AVX512F/AVX512VL.
Previously we weren't creating masked logical operations if bitcasts appeared between the logic operation and the select. The IR optimizers can move bitcasts across logic operations and create these cases. To minimize the number of cases we need to handle, this change promotes all logic ops to an i64 vector type just like when only SSE or AVX is available.
Unfortunately, this also has the consequence of making it difficult to select unmasked VPANDD/VPORD/VPXORD in all the cases it was previously used. This is the cause of most of the test change. This shouldn't result in any functional change though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279929
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sun, 28 Aug 2016 06:06:24 +0000 (06:06 +0000)]
[AVX-512] Add tests to show that we don't select masked logic ops if there are bitcasts between the logic op and the select.
This is taken from optimized IR of clang test cases for masked logic ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279928
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sun, 28 Aug 2016 06:06:21 +0000 (06:06 +0000)]
[X86] Rename PABSB/D/W instructions to be consistent with SSE/AVX instructions instead of ending 128/256. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279927
91177308-0d34-0410-b5e6-
96231b3b80d8
Jan Vesely [Sat, 27 Aug 2016 19:09:43 +0000 (19:09 +0000)]
AMDGPU/R600: Enable Load combine
Fix and improve tests
Differential Revision: https://reviews.llvm.org/D23899
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279925
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 17:13:43 +0000 (17:13 +0000)]
[X86] Rename predicate function that detects if requires one of the REX.B, REX.X or REX.R bits. Its old name conflicted with a function in the X86II namespace that doesn't quite do the same thing. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279924
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 17:13:41 +0000 (17:13 +0000)]
[X86] Keep looping over operands looking for byte registers even if we already found a register that requires a REX prefix. Otherwise we don't error if a high byte register is used after SPL/BPL/DIL/SIL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279923
91177308-0d34-0410-b5e6-
96231b3b80d8
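The fix amounts to not stopping the operand scan early. A toy version of such a check is sketched below, with operands reduced to three kinds; `ByteRegKind` and `operandsEncodable` are names invented for this illustration and are not taken from the X86 backend.

```cpp
#include <vector>

// Simplified operand classification for the sketch.
enum class ByteRegKind {
  Plain,     // encodable with or without a REX prefix
  NeedsRex,  // e.g. SPL/BPL/SIL/DIL, which require a REX prefix
  HighByte   // AH/BH/CH/DH, which cannot be encoded with a REX prefix
};

// Returns false when the operand mix cannot be encoded. Every operand is
// inspected: there is deliberately no early exit once a REX-requiring
// register has been seen, so a high-byte register later in the list is
// still noticed.
bool operandsEncodable(const std::vector<ByteRegKind> &operands) {
  bool needsRex = false;
  bool usesHighByte = false;
  for (ByteRegKind kind : operands) {
    needsRex |= (kind == ByteRegKind::NeedsRex);
    usesHighByte |= (kind == ByteRegKind::HighByte);
  }
  return !(needsRex && usesHighByte);
}
```

Because both flags are accumulated over the whole list, the result does not depend on operand order.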
Craig Topper [Sat, 27 Aug 2016 17:13:37 +0000 (17:13 +0000)]
[X86] Include XMM/YMM/ZMM16-23 in X86II::isX86_64ExtendedReg. This feels more consistent with its name and simplifies assembler code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279922
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 17:13:34 +0000 (17:13 +0000)]
[X86] Don't allow DR8-DR15 to be assembled in 32-bit mode. Add missing test for CR8-CR15.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279921
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 05:26:54 +0000 (05:26 +0000)]
[X86] Remove stale comment about FixupBWInsts pass being off by default. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279915
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 05:22:15 +0000 (05:22 +0000)]
[AVX-512] Allow EVEX encoding unordered/ordered/equal/notequal VCMPPS/PD/SS/SD to be commuted just like the SSE and AVX counterparts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279914
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 05:22:12 +0000 (05:22 +0000)]
[X86] Enable FR32/FR64 cmpeq/cmpne/cmpunord/cmpord to be commuted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279913
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Aug 2016 05:22:08 +0000 (05:22 +0000)]
[AVX-512] Add load folding for EVEX vcmpps/pd/ss/sd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279912
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Sat, 27 Aug 2016 04:41:22 +0000 (04:41 +0000)]
[LTO] Don't create a new common unless merged has different size
Summary:
This addresses a regression in common handling from the new LTO
API in r278338. Only create a new common if the size is different.
The type comparison against an array type fails when the size is
different but not an array. GlobalMerge does not handle the
array types as well and we lose some global merging opportunities.
Reviewers: mehdi_amini
Subscribers: junbuml, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23955
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279911
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 03:39:27 +0000 (03:39 +0000)]
AMDGPU: Mark sched model complete
Fixes bug 26800
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279910
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 03:00:51 +0000 (03:00 +0000)]
AMDGPU: Remove unneeded implicit exec uses/defs
SI_BREAK, SI_IF_BREAK, and SI_ELSE_BREAK do not def exec.
SI_IF_BREAK and SI_ELSE_BREAK do not read it either.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279909
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Sat, 27 Aug 2016 02:59:24 +0000 (02:59 +0000)]
[Orc] Explicitly specify type for assignment.
This should fix the MSVC errors in
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15120
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279908
91177308-0d34-0410-b5e6-
96231b3b80d8
Sebastian Pop [Sat, 27 Aug 2016 02:48:41 +0000 (02:48 +0000)]
GVN-hoist: invalidate MD cache (PR29144)
Without invalidating the entries in the MD cache we would try to access instructions
that were removed in previous iterations of hoisting.
Differential Revision: https://reviews.llvm.org/D23927
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279907
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Sat, 27 Aug 2016 02:38:27 +0000 (02:38 +0000)]
[RegBankSelect] Do not abort when the target wants to fall back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279906
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Sat, 27 Aug 2016 02:38:24 +0000 (02:38 +0000)]
[InstructionSelect] Do not abort when the target wants to fall back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279905
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Sat, 27 Aug 2016 02:38:21 +0000 (02:38 +0000)]
[MachineLegalize] Do not abort when the target wants to fall back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279904
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 01:32:27 +0000 (01:32 +0000)]
AMDGPU: Select mulhi 24-bit instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279902
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 01:00:37 +0000 (01:00 +0000)]
AMDGPU: Move cndmask pseudo to be isel pseudo
There's only one use of this for the convenience
of a pattern. I think v_mov_b64_pseudo should also be
moved, but SIFoldOperands does currently make use of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279901
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 00:51:02 +0000 (00:51 +0000)]
AMDGPU: Fix sched type for branches
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279900
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 00:42:21 +0000 (00:42 +0000)]
AMDGPU: Remove register operand from si_mask_branch
It isn't used for anything, and is also misleading since
it could be spilled at the end of the block, so it can't be relied
on. There ends up being a verifier error about using an undefined
register since the spill kills the register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279899
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 27 Aug 2016 00:21:22 +0000 (00:21 +0000)]
AMDGPU: Improve error reporting for maximum branch distance
Unfortunately this seems to only help the assembler diagnostic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279895
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Sat, 27 Aug 2016 00:19:51 +0000 (00:19 +0000)]
[CMake] Only generate Components.cmake if components are specified
Generating the Components import file is useless if there are no components coming in from the runtimes configuration, so we should skip generation in that case.
This also should fix the configuration error that Renato reported on llvm-dev.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279893
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Sat, 27 Aug 2016 00:19:05 +0000 (00:19 +0000)]
[ORC] Fix typo in LogicalDylib, add unit test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279892
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Sat, 27 Aug 2016 00:18:31 +0000 (00:18 +0000)]
[GlobalISel] Add a fallback path to SDISel.
When global-isel fails on a MachineFunction MF, MF will be cleaned up
and given to SDISel.
Thanks to this fallback, we can already perform correctness tests even if
we support only a small portion of the functions in a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279891
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Sat, 27 Aug 2016 00:18:28 +0000 (00:18 +0000)]
[AArch64][CallLowering] Do not assert for not implemented part.
When doing the ABI lowering, report a failure to the caller instead of
asserting. This gives a chance for the caller to recover.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279890
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Sat, 27 Aug 2016 00:18:24 +0000 (00:18 +0000)]
[GlobalISel] Teach the core pipeline not to run if ISel failed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279889
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Kuperstein [Sat, 27 Aug 2016 00:10:24 +0000 (00:10 +0000)]
[X86] Add baseline test for "odd" shuffles. NFC.
Adds a baseline test for lowering shuffles where the width of the output
vector is not twice the size of the input vectors. Many of those sequences
are suboptimal, and will hopefully be improved in follow-up patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279888
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 26 Aug 2016 23:49:05 +0000 (23:49 +0000)]
[IRTranslator] Do not abort when the target wants to fall back.
Every pass in the GlobalISel pipeline will need to do something similar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279886
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 26 Aug 2016 23:49:01 +0000 (23:49 +0000)]
[MFProperties] Introduce a FailedISel property.
This is used to communicate that the instruction selection pipeline
failed at some point.
Another way to achieve that would be to have some kind of conditional
scheduling in the PassManager, such that we only schedule a pass based
on the success/failure of another one. The property approach has the
advantage of being lightweight and solving the problem at hand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279885
91177308-0d34-0410-b5e6-
96231b3b80d8
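As a rough illustration of the property approach described in r279885, the sketch below models a per-function property set as a bitset and has a later pass skip its work when the failure flag is set. The class FunctionProperties, its Selected property, and runLaterPass are hypothetical stand-ins made up for this note; only the FailedISel name comes from the commit message, and none of this is the actual LLVM code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-function property set (not LLVM's class).
class FunctionProperties {
public:
  enum class Property : std::size_t { Selected, FailedISel, Count };

  bool has(Property P) const { return Bits.test(index(P)); }

  FunctionProperties &set(Property P) {
    Bits.set(index(P));
    return *this;
  }

  // Clear a single property.
  FunctionProperties &reset(Property P) {
    Bits.reset(index(P));
    return *this;
  }

  // Clear every property in one go.
  FunctionProperties &reset() {
    Bits.reset();
    return *this;
  }

private:
  static std::size_t index(Property P) {
    assert(P != Property::Count && "Count is not a real property");
    return static_cast<std::size_t>(P);
  }

  std::bitset<static_cast<std::size_t>(Property::Count)> Bits;
};

// A later pass in the pipeline: it returns false (did nothing) when an
// earlier pass recorded that instruction selection failed.
bool runLaterPass(const FunctionProperties &Props) {
  if (Props.has(FunctionProperties::Property::FailedISel))
    return false; // skipped
  return true;    // the pass would do its real work here
}
```

The point of the design is that no scheduling logic is needed in the pass manager: each pass pays for one bit test and decides for itself whether to run.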
Teresa Johnson [Fri, 26 Aug 2016 23:29:14 +0000 (23:29 +0000)]
[ThinLTO] Move loading of cache entry to client
Summary:
Have the cache pass back the path to the cache entry when it
is ready to be loaded, instead of a buffer.
For gold-plugin we can simply pass this file back to gold directly,
which avoids expensive writing of a separate tmp file. Ensure
the cache entry is not deleted on cleanup by adjusting the setting
of the IsTemporary flags.
Moved the loading of the buffer into llvm-lto2 to maintain current
behavior.
Reviewers: mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23946
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279883
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Kaylor [Fri, 26 Aug 2016 23:11:48 +0000 (23:11 +0000)]
Adding document describing the use of the -opt-bisect-limit option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279881
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 26 Aug 2016 22:32:59 +0000 (22:32 +0000)]
[TargetPassConfig] Add a target hook to know what GlobalISel should do on error.
By default, this hook tells GlobalISel to abort (report a fatal error)
when it encounters an error. The alternative will be to fall back on
SDISel.
This fallback will be removed when the bring-up of GlobalISel is over.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279879
91177308-0d34-0410-b5e6-
96231b3b80d8
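The abort-or-fall-back decision described in r279879 can be sketched as a small driver. Everything below is a toy model under stated assumptions: ToyFunction, ToyPassConfig, its AbortOnError flag, tryGlobalISel, and selectInstructions are hypothetical names invented for illustration, and an exception stands in for a fatal error. It is not the real TargetPassConfig interface.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a machine function; all names here are hypothetical.
struct ToyFunction {
  std::vector<std::string> Instrs; // code emitted so far
  bool Selected = false;

  // Drop everything, as if the function had just been created.
  void reset() {
    Instrs.clear();
    Selected = false;
  }
};

struct ToyPassConfig {
  // Hypothetical hook: true means a GlobalISel error is fatal (the default).
  bool AbortOnError = true;
};

enum class SelectedBy { GlobalISel, SelectionDAG };

// Pretend selector that leaves partial output behind when it gives up.
bool tryGlobalISel(ToyFunction &F, bool Supported) {
  F.Instrs.push_back("partial");
  if (!Supported)
    return false;
  F.Selected = true;
  return true;
}

SelectedBy selectInstructions(ToyFunction &F, const ToyPassConfig &Cfg,
                              bool Supported) {
  if (tryGlobalISel(F, Supported))
    return SelectedBy::GlobalISel;

  // Default behaviour: treat the failure as fatal.
  if (Cfg.AbortOnError)
    throw std::runtime_error("unable to select function");

  // Fallback: clean up the half-selected function, then hand it to the
  // older selector.
  F.reset();
  F.Instrs.push_back("sdisel");
  F.Selected = true;
  assert(F.Instrs.size() == 1 && "fallback must start from a clean function");
  return SelectedBy::SelectionDAG;
}
```

With AbortOnError left at its default, an unsupported function raises the error; with it set to false, the same function comes back selected by the fallback path with no leftover partial output.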
Quentin Colombet [Fri, 26 Aug 2016 22:32:57 +0000 (22:32 +0000)]
[IRTranslator][NFC] Use DEBUG_TYPE instead of repeating the name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279878
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 26 Aug 2016 22:32:55 +0000 (22:32 +0000)]
[SelectionDAG] Do not run the ISel process on already selected code.
Right now, this cannot happen, but with the fall back path of GlobalISel
it will show up eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279877
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 26 Aug 2016 22:32:53 +0000 (22:32 +0000)]
[MachineFunction] Introduce a reset method.
This method resets the state of a MachineFunction as if it had
just been created. This will be used during the bring-up of GlobalISel to
provide a way to fall back on SelectionDAG. That way, we can start doing
correctness testing even if we are not able to select all functions via
the global instruction selector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279876
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Fri, 26 Aug 2016 22:29:36 +0000 (22:29 +0000)]
TableGen: Switch from a std::map to a DenseMap in CodeGenSubRegIndex. NFC
This mapping is between pointers, which DenseMap is particularly good
at. Most targets aren't really affected, but if there's a lot of
subregister composition this can shave off a good chunk of time from
generating registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279875
91177308-0d34-0410-b5e6-
96231b3b80d8
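To show the shape of a pointer-to-pointer composition map like the one r279875 mentions, here is a minimal sketch. It deliberately uses std::unordered_map as a stand-in for a hash-based container so that it builds without LLVM headers; it is not llvm::DenseMap, and the SubRegIndex struct with its compose and addComposite members is a simplified, hypothetical model rather than TableGen's CodeGenSubRegIndex.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Simplified, hypothetical model of a sub-register index.
struct SubRegIndex {
  std::string Name;

  // For an index B, records the result of composing this index with B.
  // Keys and values are plain pointers, so hashing only looks at the
  // address; an ordered map would instead walk a tree of comparisons.
  std::unordered_map<const SubRegIndex *, const SubRegIndex *> Composed;

  // Returns the recorded composite, or nullptr if none is known.
  const SubRegIndex *compose(const SubRegIndex *B) const {
    auto It = Composed.find(B);
    return It == Composed.end() ? nullptr : It->second;
  }

  // Records a composite; returns false if one was already present, in
  // which case the existing entry is left untouched.
  bool addComposite(const SubRegIndex *B, const SubRegIndex *Result) {
    assert(B && Result && "composition needs two valid indices");
    return Composed.emplace(B, Result).second;
  }
};
```

Neither container changes which composites are stored; the difference is lookup cost when many compositions are recorded, which is the case the commit message singles out.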
Quentin Colombet [Fri, 26 Aug 2016 22:09:11 +0000 (22:09 +0000)]
[MFProperties] Introduce a reset method with no argument.
This method resets all the properties in one go.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279874
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Fri, 26 Aug 2016 22:09:08 +0000 (22:09 +0000)]
[MFProperties][NFC] Rename clear into reset to match BitVector naming.
The name clear is used to reset all the bits in bitvectors, and using it
to reset just properties was confusing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279873
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Fri, 26 Aug 2016 21:36:47 +0000 (21:36 +0000)]
AMDGPU/SI: Canonicalize offset order for merged DS instructions
Summary:
If the scheduler clusters the loads, then the offsets will be sorted,
but it is possible for the scheduler to schedule loads together
without explicitly clustering them, which would give us non-sorted
offsets.
Also, we will want to do this if we move the load/store optimizer before
the scheduler.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23776
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279870
91177308-0d34-0410-b5e6-
96231b3b80d8
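The canonicalization in r279870 amounts to ordering the two offsets of a merged access and moving each data operand along with its offset. The sketch below shows that idea only; MergedAccess, its fields, and makeCanonical are hypothetical names, and the precondition that the two offsets differ is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <utility>

// Hypothetical description of two accesses merged into one instruction.
struct MergedAccess {
  unsigned Offset0; // smaller offset after canonicalization
  unsigned Offset1; // larger offset after canonicalization
  unsigned Data0;   // operand that belongs to Offset0
  unsigned Data1;   // operand that belongs to Offset1
};

// Build the merged form so that Offset0 < Offset1, whatever order the
// scheduler happened to leave the two original accesses in.
MergedAccess makeCanonical(unsigned OffA, unsigned DataA, unsigned OffB,
                           unsigned DataB) {
  assert(OffA != OffB && "this sketch assumes two distinct offsets");
  if (OffA > OffB) {
    // Swap the offsets and their operands together so each value still
    // goes to (or comes from) the same address.
    std::swap(OffA, OffB);
    std::swap(DataA, DataB);
  }
  return {OffA, OffB, DataA, DataB};
}
```

Because the operands are swapped with the offsets, both input orders produce the same merged access, which is what makes the result independent of scheduling order.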
Tom Stellard [Fri, 26 Aug 2016 21:16:40 +0000 (21:16 +0000)]
XXX
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279868
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Fri, 26 Aug 2016 21:16:37 +0000 (21:16 +0000)]
AMDGPU/SI: Use a better method for determining the largest pressure sets
Summary:
There are a few different sgpr pressure sets, but we only care about
the one which covers all of the sgprs. We were using hard-coded
register pressure set names to determine the reg set id for the
biggest sgpr set. However, we were using the wrong name, and this
method is pretty fragile, since the reg pressure set names may
change.
The new method just looks for the pressure set that contains the most
reg units and sets that set as our SGPR pressure set. We've also
adopted the same technique for determining our VGPR pressure set.
Reviewers: arsenm
Subscribers: MatzeB, arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23687
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279867
91177308-0d34-0410-b5e6-
96231b3b80d8
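The selection rule from r279867 (pick the pressure set that contains the most register units, once for SGPRs and once for VGPRs) is a plain arg-max and can be sketched without any name matching. The PressureSet struct, its unit counters, and largestSet are hypothetical simplifications for this note; the real code queries register info rather than a precomputed table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of one register pressure set.
struct PressureSet {
  std::string Name;   // kept for debugging only; never used for lookup
  unsigned SGPRUnits; // scalar register units covered by this set
  unsigned VGPRUnits; // vector register units covered by this set
};

// Return the index of the set covering the most units of one kind. The
// kind is chosen with a pointer to member, so the same loop serves both
// SGPRs and VGPRs. On a tie the earliest set wins.
std::size_t largestSet(const std::vector<PressureSet> &Sets,
                       unsigned PressureSet::*Units) {
  assert(!Sets.empty() && "need at least one pressure set");
  std::size_t Best = 0;
  for (std::size_t I = 1; I < Sets.size(); ++I)
    if (Sets[I].*Units > Sets[Best].*Units)
      Best = I;
  return Best;
}
```

Calling largestSet(Sets, &PressureSet::SGPRUnits) and largestSet(Sets, &PressureSet::VGPRUnits) yields the two set ids, and renaming a pressure set cannot break either lookup.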
Chris Bieneman [Fri, 26 Aug 2016 20:34:11 +0000 (20:34 +0000)]
[CMake] Expose runtime component check targets
This will expose the check targets for runtime project components into the top-level build. It will enable exposing targets like check-asan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279861
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Fri, 26 Aug 2016 20:21:05 +0000 (20:21 +0000)]
[Inliner] Report when inlining fails because callee's def is unavailable
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.
Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered. I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf, quickly tailing off. As before, without -Rpass-with-hotness
there is no overhead.
Because this could be a pretty noisy diagnostic, it is currently
qualified as 'verbose'. As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.
Reviewers: eraman, chandlerc, davidxl, hfinkel
Subscribers: tejohnson, Prazek, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D23415
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279860
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Fri, 26 Aug 2016 20:19:35 +0000 (20:19 +0000)]
Make writeToResolutionFile a static helper.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279859
91177308-0d34-0410-b5e6-
96231b3b80d8
Kyle Butt [Fri, 26 Aug 2016 20:12:40 +0000 (20:12 +0000)]
TailDuplication: Record blocks that received the duplicated block. NFC.
This will allow tail duplication during layout to handle the cfg changes more
cleanly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279858
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 26 Aug 2016 20:08:57 +0000 (20:08 +0000)]
[CMake] Fixing LLVM_INCLUDE_TESTS for runtimes directory
We need to explicitly pass LLVM_INCLUDE_TESTS through from the top-level to the runtimes configuration because it isn't in LLVMConfig.cmake.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279857
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Fri, 26 Aug 2016 20:07:15 +0000 (20:07 +0000)]
Streamline LTO getComdat invocation (NFC)
We have already obtained a pointer to the underlying GlobalObject, so
use it directly to find the comdat rather than calling
GlobalValue::getComdat, which would do the same thing again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279856
91177308-0d34-0410-b5e6-
96231b3b80d8