OSDN Git Service
Chandler Carruth [Fri, 17 Feb 2017 00:29:59 +0000 (00:29 +0000)]
FileCheck-ize some tests in test/CodeGen/X86/
Patch by Jorge Gorbe!
Differential Revision: https://reviews.llvm.org/D29807
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295386 91177308-0d34-0410-b5e6-96231b3b80d8
Teresa Johnson [Fri, 17 Feb 2017 00:21:19 +0000 (00:21 +0000)]
Handle link of NoDebug CU with a CU that has debug emission enabled
Summary:
This is an issue both with regular and Thin LTO. When we link together
a DICompileUnit that is marked NoDebug (e.g. when compiling with -g0
but applying an AutoFDO profile, which requires location tracking
in the compiler) and a DICompileUnit with debug emission enabled,
we can have failures during dwarf debug generation. Specifically,
when we have inlined from the NoDebug compile unit into the debug
compile unit, we can fail during construction of the abstract and
inlined scope DIEs. This is because the SPMap does not include NoDebug
CUs (they are skipped in the debug_compile_units_iterator).
This patch fixes the failures by skipping locations from NoDebug CUs
when extracting lexical scopes.
Reviewers: dblaikie, aprantl
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D29765
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295384 91177308-0d34-0410-b5e6-96231b3b80d8
Eugene Zelenko [Fri, 17 Feb 2017 00:00:09 +0000 (00:00 +0000)]
[IR] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295383 91177308-0d34-0410-b5e6-96231b3b80d8
Zachary Turner [Thu, 16 Feb 2017 23:35:45 +0000 (23:35 +0000)]
[pdb] Add the ability to resolve TypeServer PDBs.
Some PDBs or object files can contain references to other PDBs
where the real type information lives. When this happens,
all type indices in the original PDB are meaningless because
their records are not there.
With this patch we add the ability to pull type info from those
secondary PDBs.
Differential Revision: https://reviews.llvm.org/D29973
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295382 91177308-0d34-0410-b5e6-96231b3b80d8
Wei Mi [Thu, 16 Feb 2017 21:27:31 +0000 (21:27 +0000)]
[LSR] Prevent formula with SCEVAddRecExpr type of Reg from Sibling loops
In rL294814, we allowed formulae with a SCEVAddRecExpr-type Reg from loops
other than the current loop. This is good for the case where the induction
variable of an outer loop is used in an expression in an inner loop. But it is
very bad to allow such a Reg from a sibling loop, because we may need to add
lsr.iv in other sibling loops when SCEV-expanding those SCEVAddRecExpr-type
expressions. For the testcase below, one loop can end up with a bunch of
lsr.iv inserted because of LSR for other loops.
// The induction variable j from a loop in the middle will have initial
// value generated from previous sibling loop and exit value used by its
// next sibling loop.
void goo(long i, long j);
long cond;
void foo(long N) {
  long i = 0;
  long j = 0;
  i = 0; do { goo(i, j); i++; j++; } while (cond);
  i = 0; do { goo(i, j); i++; j++; } while (cond);
  i = 0; do { goo(i, j); i++; j++; } while (cond);
  i = 0; do { goo(i, j); i++; j++; } while (cond);
  i = 0; do { goo(i, j); i++; j++; } while (cond);
  i = 0; do { goo(i, j); i++; j++; } while (cond);
}
The fix is to only allow formulae with a SCEVAddRecExpr-type Reg from the
current loop or its parents.
Differential Revision: https://reviews.llvm.org/D30021
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295378 91177308-0d34-0410-b5e6-96231b3b80d8
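The rule from the [LSR] fix above (accept a Reg only if its loop is the current loop or one of its parents) can be sketched without any LLVM types. This is only an illustration under assumptions: LoopNode and isCurrentOrParentLoop are hypothetical names, not LLVM's Loop API or the code from D30021.

```cpp
#include <cassert>

// Hypothetical stand-in for a loop in a loop nest; not LLVM's Loop class.
struct LoopNode {
  const LoopNode *Parent = nullptr; // enclosing loop, or nullptr if top-level
};

// Returns true if Candidate is Current itself or one of its enclosing loops.
// Sibling loops (same parent, different node) are rejected.
bool isCurrentOrParentLoop(const LoopNode *Candidate, const LoopNode *Current) {
  assert(Candidate && Current && "both loops must be non-null");
  for (const LoopNode *L = Current; L; L = L->Parent)
    if (L == Candidate)
      return true;
  return false;
}
```

Walking the parent chain keeps the check linear in nesting depth, and a sibling loop can never be reached that way.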
David Blaikie [Thu, 16 Feb 2017 20:55:48 +0000 (20:55 +0000)]
Fix -Wunused-lambda-capture by removing some unused lambda captures
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295373 91177308-0d34-0410-b5e6-96231b3b80d8
Benjamin Kramer [Thu, 16 Feb 2017 20:26:51 +0000 (20:26 +0000)]
[MachinePipeliner] Remove redundant destructor. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295372 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Thu, 16 Feb 2017 20:25:23 +0000 (20:25 +0000)]
[Hexagon] Start using regmasks on calls
All the cool targets are doing it...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295371 91177308-0d34-0410-b5e6-96231b3b80d8
Erich Keane [Thu, 16 Feb 2017 20:19:49 +0000 (20:19 +0000)]
Change default TimerGroup singleton to use magic statics
TimerGroup was showing up as a leak in valgrind, and
used some pretty complex code to implement a singleton.
This patch replaces the implementation with a vastly simpler
one.
Differential Revision: https://reviews.llvm.org/D28367
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295370 91177308-0d34-0410-b5e6-96231b3b80d8
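For readers unfamiliar with the term used in the TimerGroup change above, "magic statics" are function-local statics, whose initialization has been thread-safe since C++11. A minimal sketch, assuming a hypothetical Registry class (this is not the TimerGroup code from D28367):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration; not LLVM's TimerGroup.
class Registry {
public:
  explicit Registry(std::string Name) : Name(std::move(Name)) {}
  const std::string &name() const { return Name; }

private:
  std::string Name;
};

// The function-local static is constructed once, on the first call, and the
// language guarantees that concurrent first calls do not race.
Registry &getDefaultRegistry() {
  static Registry Default("default");
  return Default;
}
```

Every call returns a reference to the same object, so no hand-written locking or lazy-initialization bookkeeping is needed.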
Krzysztof Parzyszek [Thu, 16 Feb 2017 19:28:06 +0000 (19:28 +0000)]
[RDF] Aggregate shadow phi uses into one cluster when propagating live info
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295366 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Thu, 16 Feb 2017 19:17:36 +0000 (19:17 +0000)]
[X86][SSE] Add PR31309 test case (load-extend i32 to i128).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295363 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 16 Feb 2017 19:09:04 +0000 (19:09 +0000)]
AMDGPU: Remove llvm.AMDGPU.cube intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295359 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 16 Feb 2017 19:08:58 +0000 (19:08 +0000)]
AMDGPU: Remove llvm.AMDGPU.rsq intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295358 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Thu, 16 Feb 2017 19:04:42 +0000 (19:04 +0000)]
Re-apply r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)"
The original commit was reverted in r283329 due to a miscompile in
Chromium. That turned out to be the same issue as PR31257, which was
fixed in r295262.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295357 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Thu, 16 Feb 2017 18:53:04 +0000 (18:53 +0000)]
[RDF] Differentiate between defining and clobbering nodes
Defining nodes should not alias with one another, while clobbering
nodes can. When pushing defs on stacks, push clobbers first, link
non-clobbering defs, then push the defs.
The data flow in a statement is now: uses -> clobbers -> defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295356 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Thu, 16 Feb 2017 18:48:33 +0000 (18:48 +0000)]
Refactor DebugHandlerBase a bit to common non-debug-having-function filtering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295354 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 16 Feb 2017 18:46:24 +0000 (18:46 +0000)]
InstCombine: Canonicalize fast fmuladd to fmul + fadd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295353 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Thu, 16 Feb 2017 18:45:23 +0000 (18:45 +0000)]
[RDF] Move normalize(RegisterRef) to PhysicalRegisterInfo
Remove the duplicate from DFG and make some members of PRI private.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295351 91177308-0d34-0410-b5e6-96231b3b80d8
Andrea Di Biagio [Thu, 16 Feb 2017 18:25:37 +0000 (18:25 +0000)]
x86 interrupt calling convention: only save xmm registers if the target supports SSE
The existing code always saves the xmm registers for 64-bit targets even if the
target doesn't support SSE (which is common for kernels). Thus, the compiler
inserts movaps instructions which lead to CPU exceptions when an interrupt
handler is invoked.
This commit fixes this bug by returning a register set without xmm registers
from getCalleeSavedRegs and getCallPreservedMask for such targets.
Patch by Philipp Oppermann.
Differential Revision: https://reviews.llvm.org/D29959
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295347 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 16 Feb 2017 18:15:16 +0000 (18:15 +0000)]
[x86] add more tests of select of constants; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295346 91177308-0d34-0410-b5e6-96231b3b80d8
Artur Pilipenko [Thu, 16 Feb 2017 17:07:27 +0000 (17:07 +0000)]
[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine
Resubmit -r295314 with PowerPC and AMDGPU tests updated.
Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patterns.
Reviewed By: filcab
Differential Revision: https://reviews.llvm.org/D29591
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295336 91177308-0d34-0410-b5e6-96231b3b80d8
Sjoerd Meijer [Thu, 16 Feb 2017 15:52:22 +0000 (15:52 +0000)]
[AArch64] AArch64AsmParser clean up of isImmediate functions. NFC
Regression test neon-diagnostics.s needed changing because it now
produces a more specific diagnostic about the immediate ranges. One
change in the expected error message is not obvious, but there are multiple
candidates and it happens to pick the immediate diagnostic.
Differential Revision: https://reviews.llvm.org/D29939
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295331 91177308-0d34-0410-b5e6-96231b3b80d8
Dan Gohman [Thu, 16 Feb 2017 15:21:37 +0000 (15:21 +0000)]
[WebAssembly] Add a cast to void to fix an unused private member warning, for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295327 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Thu, 16 Feb 2017 15:11:49 +0000 (15:11 +0000)]
[X86] Remove local areOnlyUsersOf helper and use SDNode::areOnlyUsersOf instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295326 91177308-0d34-0410-b5e6-96231b3b80d8
Marshall Clow [Thu, 16 Feb 2017 14:37:03 +0000 (14:37 +0000)]
Remove uses of deprecated std::random_shuffle in the LLVM code base. Reviewed as https://reviews.llvm.org/D29780.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295325 91177308-0d34-0410-b5e6-96231b3b80d8
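As background for the std::random_shuffle removal above: the usual replacement is std::shuffle with an explicit random engine. A minimal sketch, where shuffleWithSeed is a hypothetical helper name and not something taken from D29780:

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Shuffles Values in place using an explicitly seeded engine. Unlike the
// deprecated std::random_shuffle, the source of randomness is passed in.
void shuffleWithSeed(std::vector<int> &Values, unsigned Seed) {
  std::mt19937 Generator(Seed);
  std::shuffle(Values.begin(), Values.end(), Generator);
}
```

Passing the engine explicitly also makes a shuffle reproducible for a given seed within one standard library implementation, which tends to be convenient in tests.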
Diana Picus [Thu, 16 Feb 2017 14:10:50 +0000 (14:10 +0000)]
[ARM] GlobalISel: Select floating point loads
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295321 91177308-0d34-0410-b5e6-96231b3b80d8
Artur Pilipenko [Thu, 16 Feb 2017 13:04:46 +0000 (13:04 +0000)]
Revert -r295314 "[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine"
This change causes some of AMDGPU and PowerPC tests to fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295316 91177308-0d34-0410-b5e6-96231b3b80d8
Artur Pilipenko [Thu, 16 Feb 2017 12:53:26 +0000 (12:53 +0000)]
[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine
Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patterns.
Reviewed By: filcab
Differential Revision: https://reviews.llvm.org/D29591
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295314 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 12:19:57 +0000 (12:19 +0000)]
[ARM] GlobalISel: Select G_SEQUENCE and G_EXTRACT
Since they're only used for passing around double precision floating point
values into the general purpose registers, we'll lower them to VMOVDRR and
VMOVRRD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295310 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 12:19:52 +0000 (12:19 +0000)]
[ARM] GlobalISel: Select double G_FADD and copies
Just use VADDD if available, bail out if not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295309 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 11:25:09 +0000 (11:25 +0000)]
[ARM] GlobalISel: Assert that we don't use the FPR bank if we don't have VFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295308 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 11:00:31 +0000 (11:00 +0000)]
[ARM] GlobalISel: Add reg bank mappings for G_SEQUENCE and G_EXTRACT
Support G_SEQUENCE and G_EXTRACT as needed for passing double precision floating
point values in the soft-fp float mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295306 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 10:12:49 +0000 (10:12 +0000)]
[ARM] GlobalISel: Make the FPR bank 64-bit wide
Also add mappings for single and double precision FP, and use them for G_FADD
and G_LOAD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295302 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 09:09:49 +0000 (09:09 +0000)]
[ARM] GlobalISel: Legalize 64-bit G_FADD and G_LOAD
For now we just mark them as legal all the time and let the other passes bail
out if they can't handle it. In the future, we'll want to move more of the
brains into the legalizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295300 91177308-0d34-0410-b5e6-96231b3b80d8
NAKAMURA Takumi [Thu, 16 Feb 2017 08:22:08 +0000 (08:22 +0000)]
RWMutex.h: Use llvm-config.h instead of config.h in installed headers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295297 91177308-0d34-0410-b5e6-96231b3b80d8
Diana Picus [Thu, 16 Feb 2017 07:53:07 +0000 (07:53 +0000)]
[ARM] GlobalISel: Lower double precision FP args
For the hard float calling convention, we just use the D registers.
For the soft-fp calling convention, we use the R registers and move values
to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we
make sure to honor the endianness of the target, since the CCAssignFn doesn't do
that for us.
For pure soft float targets, we still bail out because we don't support the
libcalls yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295295 91177308-0d34-0410-b5e6-96231b3b80d8
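The soft-fp case described in the double precision FP args change above boils down to carrying one 64-bit value in two 32-bit registers, with the order of the halves depending on endianness. A rough sketch of that idea in plain C++, assuming an IEEE 754 64-bit double; splitDouble, joinDouble and the IsBigEndian flag are hypothetical names, not part of the ARM GlobalISel code:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Splits a double into the two 32-bit words that would travel in a pair of
// general purpose registers. On a big-endian target the high word comes first.
std::pair<std::uint32_t, std::uint32_t> splitDouble(double Value,
                                                    bool IsBigEndian) {
  static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  std::uint32_t Lo = static_cast<std::uint32_t>(Bits);
  std::uint32_t Hi = static_cast<std::uint32_t>(Bits >> 32);
  return IsBigEndian ? std::make_pair(Hi, Lo) : std::make_pair(Lo, Hi);
}

// Inverse of splitDouble: rebuilds the double from the two register words.
double joinDouble(std::uint32_t First, std::uint32_t Second, bool IsBigEndian) {
  std::uint64_t Hi = IsBigEndian ? First : Second;
  std::uint64_t Lo = IsBigEndian ? Second : First;
  std::uint64_t Bits = (Hi << 32) | Lo;
  double Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}
```

The split and join must agree on the word order, which is why the endianness has to be handled explicitly when the calling-convention assignment does not do it.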
Craig Topper [Thu, 16 Feb 2017 07:35:23 +0000 (07:35 +0000)]
[AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295294 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Thu, 16 Feb 2017 06:31:54 +0000 (06:31 +0000)]
[AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics.
The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295290 91177308-0d34-0410-b5e6-96231b3b80d8
Rui Ueyama [Thu, 16 Feb 2017 02:56:06 +0000 (02:56 +0000)]
Split WinCOFFObjectWriter::writeSection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295276 91177308-0d34-0410-b5e6-96231b3b80d8
Rui Ueyama [Thu, 16 Feb 2017 02:35:48 +0000 (02:35 +0000)]
Split WinCOFFObjectWriter::writeObject function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295273 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 16 Feb 2017 02:01:17 +0000 (02:01 +0000)]
AMDGPU: Remove llvm.SI.sendmsg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295270 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 16 Feb 2017 02:01:13 +0000 (02:01 +0000)]
AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics
Update test uses with expansion in terms of new intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295269 91177308-0d34-0410-b5e6-96231b3b80d8
Rui Ueyama [Thu, 16 Feb 2017 01:41:04 +0000 (01:41 +0000)]
Remove useless local variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295268 91177308-0d34-0410-b5e6-96231b3b80d8
Rui Ueyama [Thu, 16 Feb 2017 01:06:45 +0000 (01:06 +0000)]
Rename variables to match the LLVM style.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295265 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Thu, 16 Feb 2017 00:04:05 +0000 (00:04 +0000)]
[X86] Re-enable conditional tail calls and fix PR31257.
This reverts r294348, which removed support for conditional tail calls
due to the PR above. It fixes the PR by marking live registers as
implicitly used and defined by the now predicated tailcall. This is
similar to how IfConversion predicates instructions.
Differential Revision: https://reviews.llvm.org/D29856
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295262 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Wed, 15 Feb 2017 23:48:38 +0000 (23:48 +0000)]
PMB: Add an importing WPD pass to the start of the ThinLTO backend pipeline.
Differential Revision: https://reviews.llvm.org/D30008
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295260 91177308-0d34-0410-b5e6-96231b3b80d8
Teresa Johnson [Wed, 15 Feb 2017 23:45:21 +0000 (23:45 +0000)]
Collapse my two entries in CODE_OWNERS.txt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295259 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Northover [Wed, 15 Feb 2017 23:22:50 +0000 (23:22 +0000)]
GlobalISel: legalize va_arg on AArch64.
Uses a Custom implementation because the slot sizes being a multiple of the
pointer size isn't really universal, even for the architectures that do have a
simple "void *" va_list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295255 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Northover [Wed, 15 Feb 2017 23:22:33 +0000 (23:22 +0000)]
GlobalISel: support translating va_arg
Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also
need to attach the required alignment info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295254 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Berlin [Wed, 15 Feb 2017 23:16:20 +0000 (23:16 +0000)]
Implement intrinsic mangling for literal struct types.
Fixes PR 31921
Summary:
PredicateInfo requires an ugly workaround to try to avoid literal
struct types due to the intrinsic mangling not being implemented.
This workaround actually does not work in all cases (you can hit the
assert by bootstrapping with -print-predicateinfo), and can't be made
to work without DFS'ing the type (i.e. copying getMangledStr and using a
version that detects if it would crash).
Rather than do that, I just implemented the mangling. It seems
simple, since they are unified structurally.
Looking at the overloaded-mangling testcase we have, it actually turns
out the gc intrinsics will *also* crash if you try to use a literal
struct. Thus, the testcase added fails before this patch, and works
after, without needing to resort to predicateinfo.
Reviewers: chandlerc, davide
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29925
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295253 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Wed, 15 Feb 2017 22:23:04 +0000 (22:23 +0000)]
AMDGPU: Remove dead node definitions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295247 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Wed, 15 Feb 2017 22:19:06 +0000 (22:19 +0000)]
Fix typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295246 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Wed, 15 Feb 2017 22:17:09 +0000 (22:17 +0000)]
AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295244 91177308-0d34-0410-b5e6-96231b3b80d8
Eugene Zelenko [Wed, 15 Feb 2017 22:17:02 +0000 (22:17 +0000)]
[Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295243 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Wed, 15 Feb 2017 22:02:42 +0000 (22:02 +0000)]
DAG: Do not scalarize fsub if fneg is legal
Tests will be included with future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295242 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Wed, 15 Feb 2017 21:56:51 +0000 (21:56 +0000)]
Re-apply r295110 and r295144 with a fix for the ASan issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295241 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Wed, 15 Feb 2017 21:50:34 +0000 (21:50 +0000)]
AMDGPU: Replace assert with report_fatal_error
Also use a more refined condition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295239 91177308-0d34-0410-b5e6-96231b3b80d8
Keno Fischer [Wed, 15 Feb 2017 21:42:42 +0000 (21:42 +0000)]
[GlobalObject] Fix setSection("")
Summary:
In rL291613, the section name was interned in LLVMContext. However,
this broke the ability to remove the section from a GlobalObject,
because it tried to intern empty strings, which is not allowed.
Fix that and add an appropriate regression test.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D29795
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295238 91177308-0d34-0410-b5e6-96231b3b80d8
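The setSection("") problem described above can be pictured with a small interning pool that refuses empty strings, plus a setter that treats the empty name as "clear" instead of interning it. This is a sketch under assumptions: StringPool and Global are hypothetical types, not LLVMContext or GlobalObject, and the actual fix in D29795 may be structured differently.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical interning pool; it refuses empty strings, mirroring the
// restriction described above.
class StringPool {
public:
  const std::string *intern(const std::string &S) {
    assert(!S.empty() && "cannot intern an empty string");
    // std::set keeps element addresses stable, so the pointer stays valid.
    return &*Pool.insert(S).first;
  }

private:
  std::set<std::string> Pool;
};

// Hypothetical object with an optional, interned section name.
struct Global {
  const std::string *Section = nullptr;

  void setSection(StringPool &Pool, const std::string &Name) {
    // An empty name means "no section": clear instead of interning.
    Section = Name.empty() ? nullptr : Pool.intern(Name);
  }
  bool hasSection() const { return Section != nullptr; }
};
```

Handling the empty case before reaching the pool keeps the pool's invariant intact while still letting callers remove a section.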
Sanjay Patel [Wed, 15 Feb 2017 21:31:34 +0000 (21:31 +0000)]
[InstCombine] improve formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295237 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Wed, 15 Feb 2017 21:10:09 +0000 (21:10 +0000)]
AssumptionCache: Disable the verifier by default, move it behind a hidden cl::opt and verify from releaseMemory().
This is a short term solution to the problem that many passes currently fail
to update the assumption cache. In the long term the verifier should not
be controllable with a flag. We should either fix all passes to correctly
update the assumption cache and enable the verifier unconditionally or
somehow arrange for the assumption list to be updated automatically by passes.
Differential Revision: https://reviews.llvm.org/D30003
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295236 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Wed, 15 Feb 2017 21:09:00 +0000 (21:09 +0000)]
[X86][SSE] Don't call EltsFromConsecutiveLoads if any element is missing.
Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't bother calling for the remaining elements as EltsFromConsecutiveLoads will fail anyhow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295235 91177308-0d34-0410-b5e6-96231b3b80d8
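The speedup in the EltsFromConsecutiveLoads change above is an early exit: once one per-element lookup fails, the remaining lookups are skipped because the combined result cannot succeed. A generic sketch of that pattern, where collectElements and the Lookup callback are hypothetical and only stand in for the per-element query:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Gathers one element per index into Out. Stops at the first index for which
// Lookup reports failure, since the overall operation needs every element.
bool collectElements(unsigned NumElts,
                     const std::function<bool(unsigned, int &)> &Lookup,
                     std::vector<int> &Out) {
  Out.clear();
  for (unsigned I = 0; I != NumElts; ++I) {
    int Elt = 0;
    if (!Lookup(I, Elt))
      return false; // bail out early: later lookups cannot help
    Out.push_back(Elt);
  }
  return true;
}
```

The saving is proportional to how early the first failing element appears.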
Arnold Schwaighofer [Wed, 15 Feb 2017 20:43:43 +0000 (20:43 +0000)]
AddressSanitizer: don't track swifterror memory addresses
They are register promoted by ISel and so it makes no sense to treat them as
memory.
Inserting calls to the address sanitizer would also generate invalid IR.
You would hit:
"swifterror value can only be loaded and stored from, or as a swifterror
argument!"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295230 91177308-0d34-0410-b5e6-96231b3b80d8
Ahmed Bougacha [Wed, 15 Feb 2017 20:38:31 +0000 (20:38 +0000)]
[AArch64] Make am_ldrlit an iPTR - not OtherVT - operand. NFC-ish.
am_ldrlit diverged from am_brcond in r207105, but kept the OtherVT
operand type. It made sense for branch targets, as those are
represented as MVT::Other in SDAG. But loads operate on pointers.
This shouldn't have an observable effect on any in-tree code, but helps
make the patterns consistent for external users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295229 91177308-0d34-0410-b5e6-96231b3b80d8
Ahmed Bougacha [Wed, 15 Feb 2017 20:38:28 +0000 (20:38 +0000)]
[OptDiag] Pass const Values/Types to Argument. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295228 91177308-0d34-0410-b5e6-96231b3b80d8
Ahmed Bougacha [Wed, 15 Feb 2017 20:38:22 +0000 (20:38 +0000)]
[IR] Accept 'const Type &' in the Type operator<<. NFC.
Type::print is const; there's no reason for the operator not to be.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295227 91177308-0d34-0410-b5e6-96231b3b80d8
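The reasoning in the operator<< change above (a const print method means the stream operator can take a const reference) can be shown with a toy type. Ty is a hypothetical class and std::ostream is used instead of LLVM's raw_ostream, so this is an analogy rather than the patched code:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical type with a const print method; not LLVM's Type.
class Ty {
public:
  explicit Ty(std::string Name) : Name(std::move(Name)) {}
  void print(std::ostream &OS) const { OS << Name; }

private:
  std::string Name;
};

// Taking 'const Ty &' lets callers stream const objects and temporaries.
std::ostream &operator<<(std::ostream &OS, const Ty &T) {
  T.print(OS);
  return OS;
}
```

With a non-const reference parameter, streaming a const object or a temporary would not compile even though printing never modifies it.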
Tobias Edler von Koch [Wed, 15 Feb 2017 20:36:36 +0000 (20:36 +0000)]
[LTO] Add ability to emit assembly to new LTO API
Summary:
Add a field to LTO::Config, CGFileType, to select the file type to emit (object
or assembly). This is useful for testing and to implement -save-temps.
Reviewers: tejohnson, mehdi_amini, pcc
Reviewed By: mehdi_amini
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D29475
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295226 91177308-0d34-0410-b5e6-96231b3b80d8
Kyle Butt [Wed, 15 Feb 2017 19:49:14 +0000 (19:49 +0000)]
Codegen: Make chains from trellis-shaped CFGs
Lay out trellis-shaped CFGs optimally.
A trellis of the shape below:
    A     B
    |\   /|
    | \ / |
    |  X  |
    | / \ |
    |/   \|
    C     D
would be laid out A; B->C ; D by the current layout algorithm. Now we identify
trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an
increasing number of predecessors. A trellis is a group of 2 or more
predecessor blocks that all have the same successors.
Because of this, we can tail duplicate to extend existing trellises.
As an example consider the following CFG:
    B   D   F   H
   / \ / \ / \ / \
  A---C---E---G---Ret
Where A,C,E,G are all small (Currently 2 instructions).
The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret.
The current code will copy C into B, E into D and G into F and yield the layout
A,C,B(C),E,D(E),G,F(G),H,Ret
; External callees referenced by the test function below.
declare void @a()
declare void @b()
declare void @c()
declare void @d()

define void @straight_test(i32 %tag) {
entry:
  br label %test1
test1: ; A
  %tagbit1 = and i32 %tag, 1
  %tagbit1eq0 = icmp eq i32 %tagbit1, 0
  br i1 %tagbit1eq0, label %test2, label %optional1
optional1: ; B
  call void @a()
  br label %test2
test2: ; C
  %tagbit2 = and i32 %tag, 2
  %tagbit2eq0 = icmp eq i32 %tagbit2, 0
  br i1 %tagbit2eq0, label %test3, label %optional2
optional2: ; D
  call void @b()
  br label %test3
test3: ; E
  %tagbit3 = and i32 %tag, 4
  %tagbit3eq0 = icmp eq i32 %tagbit3, 0
  br i1 %tagbit3eq0, label %test4, label %optional3
optional3: ; F
  call void @c()
  br label %test4
test4: ; G
  %tagbit4 = and i32 %tag, 8
  %tagbit4eq0 = icmp eq i32 %tagbit4, 0
  br i1 %tagbit4eq0, label %exit, label %optional4
optional4: ; H
  call void @d()
  br label %exit
exit:
  ret void
}
Here is the layout after D27742:
straight_test: # @straight_test
; ... Prologue elided
; BB#0: # %entry ; A (merged with test1)
; ... More prologue elided
mr 30, 3
andi. 3, 30, 1
bc 12, 1, .LBB0_2
; BB#1: # %test2 ; C
rlwinm. 3, 30, 0, 30, 30
beq 0, .LBB0_3
b .LBB0_4
.LBB0_2: # %optional1 ; B (copy of C)
bl a
nop
rlwinm. 3, 30, 0, 30, 30
bne 0, .LBB0_4
.LBB0_3: # %test3 ; E
rlwinm. 3, 30, 0, 29, 29
beq 0, .LBB0_5
b .LBB0_6
.LBB0_4: # %optional2 ; D (copy of E)
bl b
nop
rlwinm. 3, 30, 0, 29, 29
bne 0, .LBB0_6
.LBB0_5: # %test4 ; G
rlwinm. 3, 30, 0, 28, 28
beq 0, .LBB0_8
b .LBB0_7
.LBB0_6: # %optional3 ; F (copy of G)
bl c
nop
rlwinm. 3, 30, 0, 28, 28
beq 0, .LBB0_8
.LBB0_7: # %optional4 ; H
bl d
nop
.LBB0_8: # %exit ; Ret
ld 30, 96(1) # 8-byte Folded Reload
addi 1, 1, 112
ld 0, 16(1)
mtlr 0
blr
The tail-duplication has produced some benefit, but it has also produced a
trellis which is not laid out optimally. With this patch, we improve the layouts
of such trellises, and decrease the cost calculation for tail-duplication
accordingly.
This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have
back edges, which is a negative, but it has a bigger compensating
positive, which is that it handles the case where there are long strings
of skipped blocks much better than the original layout. Both layouts
handle runs of executed blocks equally well. Branch prediction also
improves if there is any correlation between subsequent optional blocks.
Here is the resulting concrete layout:
straight_test: # @straight_test
; BB#0: # %entry ; A (merged with test1)
mr 30, 3
andi. 3, 30, 1
bc 12, 1, .LBB0_4
; BB#1: # %test2 ; C
rlwinm. 3, 30, 0, 30, 30
bne 0, .LBB0_5
.LBB0_2: # %test3 ; E
rlwinm. 3, 30, 0, 29, 29
bne 0, .LBB0_6
.LBB0_3: # %test4 ; G
rlwinm. 3, 30, 0, 28, 28
bne 0, .LBB0_7
b .LBB0_8
.LBB0_4: # %optional1 ; B (Copy of C)
bl a
nop
rlwinm. 3, 30, 0, 30, 30
beq 0, .LBB0_2
.LBB0_5: # %optional2 ; D (Copy of E)
bl b
nop
rlwinm. 3, 30, 0, 29, 29
beq 0, .LBB0_3
.LBB0_6: # %optional3 ; F (Copy of G)
bl c
nop
rlwinm. 3, 30, 0, 28, 28
beq 0, .LBB0_8
.LBB0_7: # %optional4 ; H
bl d
nop
.LBB0_8: # %exit
Differential Revision: https://reviews.llvm.org/D28522
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295223 91177308-0d34-0410-b5e6-96231b3b80d8
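The definition used in the trellis change above (a group of 2 or more predecessor blocks that all have the same successors) can be checked directly on a successor map. The sketch below only tests that definition; Block, SuccessorMap and formsTrellis are hypothetical names, and nothing here reflects how the block placement code in D28522 actually detects or lays out trellises.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Block = std::string;
using SuccessorMap = std::map<Block, std::set<Block>>;

// Returns true if Preds contains at least two blocks and every one of them
// has exactly the same, non-empty set of successors in CFG.
bool formsTrellis(const std::vector<Block> &Preds, const SuccessorMap &CFG) {
  if (Preds.size() < 2)
    return false;
  auto First = CFG.find(Preds.front());
  if (First == CFG.end() || First->second.empty())
    return false;
  for (const Block &B : Preds) {
    auto It = CFG.find(B);
    if (It == CFG.end() || It->second != First->second)
      return false;
  }
  return true;
}
```

For the first diagram in the message, A and B both branch to {C, D}, so the pair {A, B} satisfies the check.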
Xinliang David Li [Wed, 15 Feb 2017 19:21:04 +0000 (19:21 +0000)]
include function name in dot filename
Differential Revision: http://reviews.llvm.org/D29975
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295220 91177308-0d34-0410-b5e6-96231b3b80d8
Arnold Schwaighofer [Wed, 15 Feb 2017 18:57:06 +0000 (18:57 +0000)]
ThreadSanitizer: don't track swifterror memory addresses
They are register promoted by ISel and so it makes no sense to treat them as
memory.
Inserting calls to the thread sanitizer would also generate invalid IR.
You would hit:
"swifterror value can only be loaded and stored from, or as a swifterror
argument!"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295215 91177308-0d34-0410-b5e6-96231b3b80d8
Michael Kuperstein [Wed, 15 Feb 2017 18:37:26 +0000 (18:37 +0000)]
[DAG] Don't try to create an INSERT_SUBVECTOR with an illegal source
We currently can't legalize those, but we should really not be creating
them in the first place, since legalization would probably look similar to the
way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD.
This fixes PR311956.
Differential Revision: https://reviews.llvm.org/D29961
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295213 91177308-0d34-0410-b5e6-96231b3b80d8
Dehao Chen [Wed, 15 Feb 2017 17:54:39 +0000 (17:54 +0000)]
Expose getBaseDiscriminatorFromDiscriminator, getDuplicationFactorFromDiscriminator and getCopyIdentifierFromDiscriminator API so that downstream tools can use them to get the correct encoding.
Summary: Discriminators are now encoded with rich information. This patch exposes the encoding API to downstream tools.
Reviewers: davidxl, hfinkel
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29852
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295210 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 15 Feb 2017 17:42:58 +0000 (17:42 +0000)]
[Inline] add tests to show attribute information loss; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295209 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Wed, 15 Feb 2017 17:41:33 +0000 (17:41 +0000)]
[X86][SSE] Propagate undef upper elements from scalar_to_vector during shuffle combining
Only do this for integer types currently; for float types (in particular insertps), load folding often fails with this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295208 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Wed, 15 Feb 2017 17:19:50 +0000 (17:19 +0000)]
[AMDGPU] Revert failed scheduling
This patch reverts a region's scheduling to the original untouched state
if we have decreased occupancy.
In addition, it switches to using the TargetRegisterInfo occupancy callback
for pressure limits instead of gradually increasing limits which were
just passed by. We are going to stay with the best schedule, so we do
not need to tolerate worsened scheduling anymore.
Differential Revision: https://reviews.llvm.org/D29971
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295206 91177308-0d34-0410-b5e6-96231b3b80d8
Anna Thomas [Wed, 15 Feb 2017 17:08:29 +0000 (17:08 +0000)]
Revert "[JumpThreading] Thread through guards"
This reverts commit r294617.
We fail on an assert while trying to get a condition from an
unconditional branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295200 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Wed, 15 Feb 2017 16:48:45 +0000 (16:48 +0000)]
[X86] Regenerate scalar stack reload test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295195 91177308-0d34-0410-b5e6-96231b3b80d8
David Bozier [Wed, 15 Feb 2017 16:03:22 +0000 (16:03 +0000)]
Fix unittest for buildbot with mips host (32bit big endian) from r295174
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295188 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 15 Feb 2017 15:22:18 +0000 (15:22 +0000)]
[InlineFunction] use getFunction(); NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295185 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Wed, 15 Feb 2017 15:11:36 +0000 (15:11 +0000)]
Fix spelling mistake - paramater -> parameter. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295182
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 15 Feb 2017 15:08:38 +0000 (15:08 +0000)]
[InlineFunction] use getCaller(); NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295181
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 15 Feb 2017 14:56:11 +0000 (14:56 +0000)]
[InlineFunction] use range-for loop; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295179
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Wed, 15 Feb 2017 14:06:17 +0000 (14:06 +0000)]
[X86] Regenerate i64 ext-load on 32-bit target tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295177
91177308-0d34-0410-b5e6-
96231b3b80d8
David Bozier [Wed, 15 Feb 2017 13:40:05 +0000 (13:40 +0000)]
Attempt to fix buildbots after commit of r295173.
Unit tests needed to check on the endianness of the host platform. (Test was failing for big endian hosts).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295174
91177308-0d34-0410-b5e6-
96231b3b80d8
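As an illustration of the host-endianness check mentioned in r295174 above, here is a minimal sketch of how a test could detect byte order portably. The helper names firstByteInMemory and isLittleEndianHost are hypothetical and are not taken from the LLVM unit test.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the byte stored at the lowest address of a
// 32-bit value. memcpy avoids any aliasing problems.
unsigned char firstByteInMemory(std::uint32_t Value) {
  unsigned char First = 0;
  std::memcpy(&First, &Value, 1);
  return First;
}

// Hypothetical helper: a little-endian host stores the least significant
// byte first, so the probe value 1 has 0x01 at its lowest address.
bool isLittleEndianHost() {
  return firstByteInMemory(1u) == 1;
}
```

A test can then pick its expected bytes per host, for example expecting 0x04 first for 0x01020304 on a little-endian host and 0x01 first on a big-endian one.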
David Bozier [Wed, 15 Feb 2017 12:58:41 +0000 (12:58 +0000)]
Fix incorrect formatting of DataRefImpl members in operator<< function
Changed format specifiers to use format macro constant for pointer type.
Moved the width part of the format specifier to the correct place for formatting members a and b.
Added a unit test to confirm the output.
Differential Revision: https://reviews.llvm.org/D28957
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295173
91177308-0d34-0410-b5e6-
96231b3b80d8
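The format-specifier fix in r295173 above can be illustrated with a small sketch. It assumes zero-padded hex output; the helpers formatPointerHex and formatU32Hex are hypothetical and are not the DataRefImpl operator<< code. The point is that the width belongs between '%' and the format macro constant, because the macro expands to the length modifier plus the conversion character.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: print a pointer-sized integer as 16 zero-padded hex
// digits. PRIxPTR supplies the correct length modifier for uintptr_t.
std::string formatPointerHex(std::uintptr_t Value) {
  char Buf[32];
  int Len = std::snprintf(Buf, sizeof(Buf), "0x%016" PRIxPTR, Value);
  assert(Len > 0 && static_cast<std::size_t>(Len) < sizeof(Buf));
  return std::string(Buf, static_cast<std::size_t>(Len));
}

// Hypothetical helper: print a 32-bit member as 8 zero-padded hex digits.
// The width "08" again precedes the PRIx32 macro.
std::string formatU32Hex(std::uint32_t Value) {
  char Buf[16];
  int Len = std::snprintf(Buf, sizeof(Buf), "0x%08" PRIx32, Value);
  assert(Len > 0 && static_cast<std::size_t>(Len) < sizeof(Buf));
  return std::string(Buf, static_cast<std::size_t>(Len));
}
```

With these helpers, formatU32Hex(0xab) yields "0x000000ab" and formatPointerHex(0x1234) yields "0x0000000000001234".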
Simon Pilgrim [Wed, 15 Feb 2017 11:46:15 +0000 (11:46 +0000)]
[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs
Add support for specifying an UNPCK input as ZERO; this particularly improves ZEXT cases with non-zero offsets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295169
91177308-0d34-0410-b5e6-
96231b3b80d8
Sagar Thakur [Wed, 15 Feb 2017 10:48:11 +0000 (10:48 +0000)]
[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el
Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit.
Reviewed by sdardis, dberris
Differential: D27697
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295164
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Jasper [Wed, 15 Feb 2017 09:56:08 +0000 (09:56 +0000)]
Revert r295110 and r295144.
This fails under ASAN:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/798/steps/check-llvm%20asan/logs/stdio
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295162
91177308-0d34-0410-b5e6-
96231b3b80d8
Ayman Musa [Wed, 15 Feb 2017 08:12:16 +0000 (08:12 +0000)]
[X86][AVX] Remove REX_W from AVX instructions.
REX_W has no meaning in VEX-encoded AVX instructions.
Differential Revision: https://reviews.llvm.org/D29894
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295157
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 15 Feb 2017 06:58:47 +0000 (06:58 +0000)]
[X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types
Summary:
We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.
As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.
I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.
This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused.
Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0.
Reviewers: delena, RKSimon, zvi
Reviewed By: zvi
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D28747
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295155
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 15 Feb 2017 06:51:39 +0000 (06:51 +0000)]
[AVX-512] Add PACKSS/PACKUS instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295154
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 15 Feb 2017 05:57:16 +0000 (05:57 +0000)]
[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask
Summary:
The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in an extract and creates a properly aligned start index for the extract.
This range finding is unnecessary; we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index, we can't use an extract.
Reviewers: zvi, RKSimon
Reviewed By: zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29926
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295152
91177308-0d34-0410-b5e6-
96231b3b80d8
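The single-pass idea described in r295152 above can be sketched as follows. This is a simplified standalone illustration, not the SelectionDAGBuilder code: the type ExtractPlan and the function planExtract are hypothetical, and the mask is assumed to hold indices into two concatenated inputs of SrcNumElts elements each, with negative values meaning undef.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical result: whether both inputs can be replaced by extracts, and
// the aligned start index per input (-1 if that input is never referenced).
struct ExtractPlan {
  bool CanExtract;
  std::array<int, 2> StartIdx;
};

ExtractPlan planExtract(const std::vector<int> &Mask, int SrcNumElts) {
  const int MaskNumElts = static_cast<int>(Mask.size());
  assert(MaskNumElts > 0 && SrcNumElts > MaskNumElts);

  ExtractPlan Plan{true, {-1, -1}};
  for (int Idx : Mask) {
    if (Idx < 0)
      continue; // Undef element: places no constraint on either input.

    // Indices >= SrcNumElts refer to the second input.
    int Input = 0;
    if (Idx >= SrcNumElts) {
      Input = 1;
      Idx -= SrcNumElts;
    }

    // Round down to a multiple of the mask width to get an aligned start.
    int NewStart = Idx - (Idx % MaskNumElts);

    // Fail if the window runs past the input, or if this input already
    // committed to a different start index.
    if (NewStart + MaskNumElts > SrcNumElts ||
        (Plan.StartIdx[Input] >= 0 && Plan.StartIdx[Input] != NewStart)) {
      Plan.CanExtract = false;
      break;
    }
    Plan.StartIdx[Input] = NewStart;
  }
  return Plan;
}
```

For example, with 8-element inputs the mask {4, 5, 6, 7} yields start index 4 for the first input, while {3, 4, 5, 6} straddles two aligned windows and is rejected.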
Lang Hames [Wed, 15 Feb 2017 05:39:35 +0000 (05:39 +0000)]
[Orc][RPC] Add an AsyncHandlerTraits specialization for non-value-type response
handler args.
The specialization just inherits from the std::decay'd response handler type.
This allows member functions (via MemberFunctionWrapper) to be used as async
handlers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295151
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Wed, 15 Feb 2017 03:50:01 +0000 (03:50 +0000)]
AssumptionCache: Update documentation comment.
The comment was somewhat misleading in that it implied that passes were not
responsible for adding new assumptions to the assumption cache. This new
wording now explicitly mentions that they are required to do so.
Differential Revision: https://reviews.llvm.org/D29977
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295148
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Wed, 15 Feb 2017 03:01:11 +0000 (03:01 +0000)]
SimplifyCFG: Register cloned assume intrinsics with assumption cache when creating critical edge.
Differential Revision: https://reviews.llvm.org/D29976
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295145
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Wed, 15 Feb 2017 02:13:08 +0000 (02:13 +0000)]
WholeProgramDevirt: Separate the code that applies optzns from the code that decides whether to apply them. NFCI.
The idea is that the apply* functions will also be called when importing
devirt optimizations.
Differential Revision: https://reviews.llvm.org/D29745
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295144
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Wed, 15 Feb 2017 01:48:33 +0000 (01:48 +0000)]
Revert r295138: Instead of a series of string operations, use snprintf().
This broke buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295142
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Wed, 15 Feb 2017 01:09:40 +0000 (01:09 +0000)]
Instead of a series of string operations, use snprintf().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295138
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Wed, 15 Feb 2017 01:09:20 +0000 (01:09 +0000)]
Return early. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295137
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Wed, 15 Feb 2017 01:09:01 +0000 (01:09 +0000)]
Use LLVM-style naming scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295136
91177308-0d34-0410-b5e6-
96231b3b80d8
Stanislav Mekhanoshin [Wed, 15 Feb 2017 01:03:59 +0000 (01:03 +0000)]
[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups
This patch corrects the maximum workgroups per CU if we have big
workgroups (more than 128). This calculation contributes to the
occupancy calculation with respect to LDS size.
Differential Revision: https://reviews.llvm.org/D29974
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295134
91177308-0d34-0410-b5e6-
96231b3b80d8