OSDN Git Service
Jason Ekstrand [Tue, 7 Jun 2016 23:53:19 +0000 (16:53 -0700)]
isl/state: Don't use designated initializers for the surface state
While designated initializers are nice, they also force us to put some
things in the initializer and some things later. Surface state setup is
complicated enough that this really hurts readability in the long run.
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
caf2af4181c66df8af31662de22120dcf1d16c7c)
Jason Ekstrand [Fri, 3 Jun 2016 01:43:59 +0000 (18:43 -0700)]
genxml/gen8,9: Prefix the multisample format enum with MSFMT
This is what gen7 does and it's nice to have a prefix
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
de1d194856ddcfc946df2df0f076cb42ff1c165d)
Jason Ekstrand [Sat, 11 Jun 2016 04:11:02 +0000 (21:11 -0700)]
i965/gen4: Subtract 1 from buffer sizes
The PRM states that the values put in Width, Height, and Depth should be
various bits from the value size - 1. We seem to have done this wrong
more-or-less from the start.
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
2a1cc94d27c80929d91e38b4843333a5408d563e)
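The "size - 1" rule from the i965/gen4 commit above can be sketched in C as follows. This is only an illustration: the helper name, the struct, and the field widths (7 bits for Width, 13 for Height, 7 for Depth) are assumptions made for the example and should be checked against the PRM, not read as the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three surface-state fields. */
struct buffer_dims {
   uint32_t width;
   uint32_t height;
   uint32_t depth;
};

/* Split (size - 1) across Width, Height and Depth.
 * The bit widths below are assumed for illustration only. */
static struct buffer_dims
split_buffer_size(uint32_t size)
{
   assert(size > 0);
   const uint32_t v = size - 1;   /* the fields encode size - 1, not size */
   struct buffer_dims d = {
      .width  = v & 0x7f,            /* assumed: low 7 bits */
      .height = (v >> 7) & 0x1fff,   /* assumed: next 13 bits */
      .depth  = (v >> 20) & 0x7f,    /* assumed: next 7 bits */
   };
   return d;
}
```

Under these assumptions a 128-entry buffer fits entirely in Width (127), whereas encoding the raw size would spill one bit into Height.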
Jason Ekstrand [Tue, 7 Jun 2016 02:15:39 +0000 (19:15 -0700)]
i965/fs: Use a default Y coordinate of 0 for TXF on gen9+
Previously, we were incrementing length but not actually putting anything
in the Y coordinate. This meant that 1-D TXF operations had a garbage
array index. If the surface is emitted as 1-D non-array, the coordinate
gets discarded and it works fine. If it happens to be bound as an array
surface, it may count as an out-of-bounds array access and you get zero.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
0195299c868ec99bc6c595c641da81bb2632252e)
Jason Ekstrand [Sat, 4 Jun 2016 21:32:37 +0000 (14:32 -0700)]
i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
1436238b75e0352439306f120ac1ca03c9fc7df3)
Jason Ekstrand [Sat, 4 Jun 2016 06:25:19 +0000 (23:25 -0700)]
i965/blorp/gen8: Use the correct max level and layer in emit_surface_states
We were adding in the base which is wrong because the values given in the
miptree are relative to zero and not the base layer/level.
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
620f81d2edb20ffd9803ee318f60845441459fac)
Jason Ekstrand [Thu, 9 Jun 2016 21:57:33 +0000 (14:57 -0700)]
i965: Drop the maximum 3D texture size to 512 on Sandy Bridge
The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be
set to the depth of a 3-D texture when rendering. Unfortunately, that
field is only 9 bits on Sandy Bridge and prior so we can't actually bind
a 3-D texture for rendering if it has depth > 512. On Ivy Bridge, this
field was bumped to 11 bits so we can go all the way up to 2048. On Iron
Lake and prior, we don't support layered rendering and we use OffsetX/Y
hacks to render to particular layers so 2048 is ok there too.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
6ba88bce64b343761aabe3a6c7ee285c6020a959)
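The limits quoted in the Sandy Bridge commit above follow from the field width: an n-bit field allows depths up to 2^n, which is consistent with the field storing depth minus one (an assumption here, not stated in the commit). A minimal sketch, where the function names and the numeric `gen` parameter (6 for Sandy Bridge, 7 and later for Ivy Bridge onward, 5 and lower for Iron Lake and prior) are hypothetical:

```c
#include <assert.h>

/* Largest depth reachable with a field of the given width,
 * matching the commit text (9 bits -> 512, 11 bits -> 2048). */
static unsigned
max_depth_for_field_bits(unsigned bits)
{
   assert(bits < 32);
   return 1u << bits;
}

/* Hypothetical per-generation limit for 3-D textures. */
static unsigned
max_3d_texture_size(int gen)
{
   if (gen == 6)
      return max_depth_for_field_bits(9);    /* Sandy Bridge: 9-bit field */
   if (gen >= 7)
      return max_depth_for_field_bits(11);   /* Ivy Bridge and later: 11 bits */
   return 2048;   /* Iron Lake and prior: no layered rendering, field unused */
}
```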
Jason Ekstrand [Wed, 22 Jun 2016 18:11:29 +0000 (11:11 -0700)]
i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctly
This is basically a direct translation of what we do for gen7.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
0f9cd74aab021da81a7e5a2f0fbf66213471628f)
Jason Ekstrand [Wed, 22 Jun 2016 04:58:23 +0000 (21:58 -0700)]
i965/gen4: Pull texture formats from the texture object not the miptree
This makes texture views sort-of work. It doesn't add full texture view
support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats
piglit test on Iron Lake.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
ee39d3ba918de9d52d79bdee6db2c120bcf0f28e)
Ilia Mirkin [Tue, 21 Jun 2016 20:16:17 +0000 (16:16 -0400)]
glsl: only match gl_FragData and not gl_SecondaryFragDataEXT
There's special logic around finding gl_FragData. It latches onto any
array with FRAG_RESULT_DATA0. However gl_SecondaryFragDataEXT[], added
by GL_EXT_blend_func_extended, fits those parameters as well. The real
frag data array should have index 0 though, so we can use that to
distinguish them.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96617
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
36ed1b695e5a0ae5714b79cae3a089b5e7e8bd29)
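The distinction described in the glsl commit above amounts to one extra condition on the output's index. The sketch below is not the compiler's code: the function name, the flat parameter list, and the numeric value given to FRAG_RESULT_DATA0 are placeholders chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder value; the real enum lives in Mesa's headers. */
enum { FRAG_RESULT_DATA0 = 2 };

/* An output is treated as gl_FragData only if it is an array at
 * FRAG_RESULT_DATA0 *and* has index 0.  gl_SecondaryFragDataEXT[]
 * shares the location but is expected to carry a non-zero index. */
static bool
is_frag_data_array(int location, int index, bool is_array)
{
   assert(index >= 0);
   return is_array && location == FRAG_RESULT_DATA0 && index == 0;
}
```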
Ilia Mirkin [Sun, 19 Jun 2016 01:54:37 +0000 (21:54 -0400)]
nv50,nvc0: fix start_instance in manual push path
The start instance is applied as an offset into the buffer directly,
ignoring the divisor, not as an instance id offset that respects the
divisor.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit
1f4bca798dda155ad0615ba81d8373c771d1ec94)
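The difference between the two interpretations of start_instance in the nv50,nvc0 commit above is easiest to see as arithmetic. This is a sketch with hypothetical function names, not the driver's push code: the first function adds the start instance directly to the buffer element, the second (the behaviour the commit describes as wrong) folds it into the instance id before dividing.

```c
#include <assert.h>

/* Start instance applied directly as a buffer offset, ignoring the divisor. */
static unsigned
instance_element(unsigned start_instance, unsigned instance_id,
                 unsigned divisor)
{
   assert(divisor > 0);
   return start_instance + instance_id / divisor;
}

/* Start instance treated as an instance id offset that respects the
 * divisor; shown only for contrast. */
static unsigned
instance_element_as_id_offset(unsigned start_instance, unsigned instance_id,
                              unsigned divisor)
{
   assert(divisor > 0);
   return (start_instance + instance_id) / divisor;
}
```

With start_instance = 3, instance_id = 4 and divisor = 2 the two formulas give elements 5 and 3 respectively; they agree only when the divisor is 1.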
Ilia Mirkin [Sun, 19 Jun 2016 04:43:06 +0000 (00:43 -0400)]
translate: fix start_instance parameter in sse version
The generic version gets this right already, but this was using an
incorrect formula in SSE.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit
5b0d64886dfe9d42d02666ee1b07f2aa375197a5)
Jason Ekstrand [Tue, 21 Jun 2016 22:32:09 +0000 (15:32 -0700)]
anv/cmd: Dirty descriptor sets when a new pipeline is bound
Ever since
c2581a9375ea, the binding table layout has depended on the
pipeline. This means that whenever we change pipelines we also need to
re-emit binding tables for the new layout.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
35b53c8d47d3a0b53ee2549d73296d5db8e3cca0)
Jason Ekstrand [Tue, 21 Jun 2016 22:31:14 +0000 (15:31 -0700)]
anv/cmd: Move emit_descriptor_pointers to genX_cmd_buffer.c
It's tiny and fully generic so there's really no reason for it to be in a
gen7-specific file.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
2bfe0c33748b9fd96d48cb93656b6dc643bf024e)
Jason Ekstrand [Tue, 21 Jun 2016 22:28:15 +0000 (15:28 -0700)]
anv/cmd: Move flush_descriptor_sets to anv_cmd_buffer.c
There's no good reason for recompiling it
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
9df4d6bb36268d5dd248b872611e3787de9608be)
Jason Ekstrand [Tue, 21 Jun 2016 06:41:11 +0000 (23:41 -0700)]
spirv: Use the system value version of gl_FrontFace
SPIR-V treats it as an input but NIR wants the system value. This
shouldn't have been too much of a surprise given that we have to do the
same conversion in the GLSL IR to NIR pass.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
295e03c980a7ff6dde77abcb6bbfa2f8d015323b)
Kenneth Graunke [Tue, 14 Jun 2016 06:09:31 +0000 (23:09 -0700)]
i965: Reorganize prog_data->total_scratch code a bit.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit
40013c50333caf7a4a66204ac29695aad0d9b06d)
Emil Velikov [Tue, 21 Jun 2016 12:32:04 +0000 (13:32 +0100)]
Update version to 12.0.0-rc4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Nicolai Hähnle [Fri, 10 Jun 2016 13:59:58 +0000 (15:59 +0200)]
st/mesa: flush bitmap cache before CopyImageSubData
Found by inspection.
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit
f9ddd52317caf14a21ec7c040fd4bb944f9842e4)
Nicolai Hähnle [Thu, 9 Jun 2016 10:22:31 +0000 (12:22 +0200)]
st/mesa: flush bitmap cache before texture functions
As far as I can tell, a sequence of glBitmap followed by texture functions
that refer to a texture bound as the framebuffer is well within what should
be allowed.
Found by inspection.
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit
e7fff3cfe156e13198107e5e76a77fb79ed02173)
Nicolai Hähnle [Thu, 9 Jun 2016 10:12:34 +0000 (12:12 +0200)]
st/mesa: flush bitmap cache before compute dispatch
In the unlikely case that a program uses glBitmap to render to a framebuffer
whose texture is bound in a compute shader.
Found by inspection.
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit
c542b7e43d3a504456518c9f407e21c4e7e5fa88)
Kenneth Graunke [Wed, 8 Jun 2016 23:09:02 +0000 (16:09 -0700)]
i965: Fix multiplication of immediates on Cherryview/Broxton.
Cherryview and Broxton don't support DW x DW multiplication. We have
piles of code to handle this, but apparently weren't retyping in the
immediate case.
For example,
tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes
makes the simulator angry about instructions such as:
mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D
Just retype to W or UW. It should be safe on all platforms.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
cd89c834a8b3b4e5f5874c8e1f90c9b01d541181)
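As a rough illustration of the retyping in the Cherryview/Broxton commit above: an immediate such as 0x00000003:D carries no information beyond 16 bits, so it can be expressed as a word. The helpers below are hypothetical and assume an explicit range check before narrowing; the commit does not say how the driver decides that the value fits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a signed 32-bit immediate survives narrowing to a 16-bit W. */
static bool
imm_fits_w(int32_t value)
{
   return value >= INT16_MIN && value <= INT16_MAX;
}

/* True if an unsigned 32-bit immediate survives narrowing to a 16-bit UW. */
static bool
imm_fits_uw(uint32_t value)
{
   return value <= UINT16_MAX;
}

/* Narrow a D immediate to W; the caller must have checked that it fits. */
static int16_t
retype_imm_to_w(int32_t value)
{
   assert(imm_fits_w(value));
   return (int16_t)value;
}
```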
Jason Ekstrand [Tue, 14 Jun 2016 15:40:49 +0000 (08:40 -0700)]
anv: Add proper support for depth clamping
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
eb6764c4a73006eee32e19e3afc6eab100a2ce16)
Jason Ekstrand [Tue, 14 Jun 2016 15:15:34 +0000 (08:15 -0700)]
anv/cmd_buffer: Split emit_viewport in two
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
8a46b505cb2c7255ad430b56c1ce0dfa9c13c559)
Jason Ekstrand [Tue, 14 Jun 2016 00:09:37 +0000 (17:09 -0700)]
anv/cmd_buffer: Set depth/stencil extent based on the image
It used to be based on the framebuffer which isn't quite right.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
20e95a746df34923eb4aac5e7f1ab6d722432d89)
Jason Ekstrand [Wed, 15 Jun 2016 21:30:33 +0000 (14:30 -0700)]
anv/cmd_buffer: Don't crash if push constants are provided for missing stages
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
b65f2e4163c9180e6a022c0afec018b08e4c5aa5)
Jason Ekstrand [Thu, 16 Jun 2016 17:57:39 +0000 (10:57 -0700)]
anv/pipeline: Do invariance propagation on SPIR-V shaders
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
e6c2fe451962e364f30f689dc48c34e2b6161b25)
Jason Ekstrand [Mon, 13 Jun 2016 21:41:05 +0000 (14:41 -0700)]
nir/alu_to_scalar: Respect the exact ALU operation qualifier
Just setting builder->exact isn't sufficient because that only applies to
instructions that are built with the builder but instructions created
manually and only inserted using the builder are left alone.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
bec07b729242f6a2dcf5a12ce75bf8b07ea658e0)
Jason Ekstrand [Mon, 13 Jun 2016 19:47:19 +0000 (12:47 -0700)]
nir: Add a pass for propagating invariant decorations
This pass is similar to propagate_invariance in the GLSL compiler. The
real "output" of this pass is that any algebraic operations which are
eventually consumed by an invariant variable get marked as "exact".
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
202751fbb7e3d35c1aa84f325f862245dab67f6c)
Jason Ekstrand [Sat, 18 Jun 2016 19:30:36 +0000 (12:30 -0700)]
nir/algebraic: Remove imprecise flog2 optimizations
While mathematically correct, these two optimizations result in an
expression with substantially lower precision than the original. For any
positive finite floating-point value, log2(x) is well-defined and finite.
More precisely, it is in the range [-150, 150] so any sum of logarithms
log2(a) + log2(b) is also well-defined and finite as long as a and b are
both positive and finite. However, if a and b are either very small or
very large, their product may get flushed to infinity or zero causing
log2(a * b) to be nowhere close to log2(a) + log2(b).
This imprecision was causing incorrect rendering in The Talos Principle because
part of its HDR rendering process involves doing 8 texture operations,
clamping the result to [0, 65000], taking a dot-product with a constant,
and then taking the log2. This is done 6 or 8 times and summed to produce
the final result which is written to a red texture. In cases where you
have a region of the screen that is very dark, it can end up getting a
result value of -inf which is not what is intended.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96425
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
68e308d85355079ad93bd4e16cba164784740fdf)
Nicolai Hähnle [Fri, 17 Jun 2016 08:48:53 +0000 (10:48 +0200)]
radeonsi: fix calculation of valid RB mask per SE
The old calculation treated too many RBs as disabled.
Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit
c95175581e983642dc4b23d059e6eaff5b79d2db)
Nicolai Hähnle [Fri, 17 Jun 2016 08:30:44 +0000 (10:30 +0200)]
radeonsi: raise SI_PM4_MAX_DW
The old limit, introduced in commit
afa752d3f03ac6697581ff5d324e8ac0512ef513,
was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs.
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit
6c2e63698290d3ea868eefcc3e4dd51dc1e16c64)
Roland Scheidegger [Sun, 19 Jun 2016 01:56:11 +0000 (03:56 +0200)]
gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer either)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).
This addresses https://llvm.org/bugs/show_bug.cgi?id=28176
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Aaron Watry <awatry@gmail.com>
(cherry picked from commit
b0cf99165af445adc4c5c1f66a3a3e0d882211cd)
Ilia Mirkin [Sun, 19 Jun 2016 04:28:36 +0000 (00:28 -0400)]
nvc0: don't make use of push hint if there are no non-const user vbos
This makes the check match up with what we do on nv50 as well - there's no
point in switching over to the push path if everything's in managed
buffers. This can happen when a shader uses a vertex without an enabled
array - we end up passing it a constant attribute.
This also has the effect of "fixing" some flickering in Talos. I have no
idea why. I've stared at the push logic forwards, backwards, and
sideways. By always forcing the push path (which is slow), the
flickering also goes away, but other rendering is still wrong
(specifically draw 383068 as identified in the bug). However by not
switching over to the push path, draw 383068 is correct.
Note that other flickering remains in Talos, like the red/green
walls/floors. This takes care of the shadow flickering though.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit
154c0a42a23187c61ea0a1307198fae667398eba)
Ilia Mirkin [Sat, 18 Jun 2016 19:22:09 +0000 (15:22 -0400)]
gk104/ir: fix tex use generation to be more careful about eliding uses
If we have a loop, instructions before the tex might be added as tex
uses, and those may in fact dominate all other uses of the tex results.
This however doesn't mean that we don't need a texbar after the tex.
Only check if uses dominate each other if they are dominated by the tex.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565
Fixes:
7752bbc44 (gk104/ir: simplify and fool-proof texbar algorithm)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit
1804aa0b80cf5b1ee5d97bc33a12808c78673a12)
Samuel Iglesias Gonsálvez [Mon, 13 Jun 2016 06:29:53 +0000 (08:29 +0200)]
i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT
From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region
Restrictions, page 844:
"When source or destination datatype is 64b or operation is integer DWord
multiply, indirect addressing must not be used."
v2:
- Fix it for Broxton too.
v3:
- Simplify code by using subscript() and not creating a new num_components
variable (Kenneth).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
bdab572a86f27b92ba10124f85d278e9c8861fff)
Iago Toral Quiroga [Mon, 13 Jun 2016 07:13:23 +0000 (03:13 -0400)]
i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT
From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine,
Register Region Restrictions:
"When source or destination is 64b (...), regioning in Align1
must follow these rules:
1. Source and destination horizontal stride must be aligned to
the same qword.
(...)"
v2:
- Fix it for Broxton too.
v3:
- Remove inst->regs_written change as it is not necessary (Ken)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
0177dbb6c2fe876a9761a4a97eec44accfa4c007)
Ian Romanick [Mon, 13 Jun 2016 16:59:10 +0000 (09:59 -0700)]
mesa: If validation fails in a debug context just emit a debug message
There are quite a few pipelines that desktop applications (including a
bunch of piglit tests) can expect to have run but don't meet the GLES
requirements. Instead of failing validation, just emit a debug message.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit
6bec55a780b0e95445c6d77c6e35cc0c74290ac0)
Ian Romanick [Mon, 13 Jun 2016 22:22:34 +0000 (15:22 -0700)]
glsl: Always strip arrayness in precision_qualifier_allowed
Previously some callers of precision_qualifier_allowed would strip the
arrayness from the type and some would not. As a result, some places
would not notice that float[6], for example, needed a precision
qualifier.
Fixes the new piglit test no-default-float-array-precision.frag.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit
9c872820413f6183db0eb47828a7afcf703f9930)
Kenneth Graunke [Wed, 1 Jun 2016 07:08:55 +0000 (00:08 -0700)]
i965: Use a uniform for gl_PatchVerticesIn in the TCS on Gen8+.
We still need to recompile the passthrough shader when this value
changes, as it also affects the output vertex count. But otherwise,
we can eliminate recompiles on Gen8+.
We probably want to do this for Gen7 as well, but that requires
rewriting the input release code to use a loop, which is a trade-off
I'd need to consider in more detail.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit
c319512e16f19bf1f558670981bbb4af510ba9f4)
Kenneth Graunke [Fri, 27 May 2016 03:21:58 +0000 (20:21 -0700)]
glsl: Optionally lower TCS gl_PatchVerticesIn to a uniform.
i965 has no special hardware for this, so the best way to implement
this is to pass it in via a uniform.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit
2b867264d2cce59bd65bd3599ff0e3c5439bc9d4)
Kenneth Graunke [Wed, 1 Jun 2016 07:08:55 +0000 (00:08 -0700)]
i965: Use a uniform for gl_PatchVerticesIn in the TES.
Fixes three GL44-CTS.tessellation_shader subtests:
- max_patch_vertices
- single.max_patch_vertices
- tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn
These use gl_PatchVerticesIn in the TES, but don't link against a
TCS (which would allow the linker to lower it to a constant). We had
no handling for the system value in the backend, so it would just
assert fail.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit
1bc194cd64085d07f1aae319cb6fb3c99d69aaeb)
Kenneth Graunke [Fri, 27 May 2016 03:21:58 +0000 (20:21 -0700)]
glsl: Optionally lower TES gl_PatchVerticesIn to a uniform.
i965 has no special hardware for this, so we need to pass this value in
as a uniform (unless the TES is linked against a TCS, in which case the
linker can just replace this with a constant).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit
0be210513797d3a0245588df915b9510c201bea4)
Nicolai Hähnle [Fri, 13 May 2016 06:48:04 +0000 (01:48 -0500)]
mesa/main: fix integer overflows in _mesa_image_offset
Found using -fsanitize=undefined.
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit
6510e0734563ff8d30e45b8781153367db15cc5b)
Kenneth Graunke [Fri, 27 May 2016 02:56:48 +0000 (19:56 -0700)]
mesa: Pass gl_constant_value union into _mesa_fetch_state().
We've had some trouble in the past with copying integers around via
float pointers, as the C compiler sometimes uses x87 floating point
registers to load values on 32-bit systems. Passing the
gl_constant_value union should be safer.
To avoid churn, this patch creates a "GLfloat *value" variable so
existing uses can stay the same.
Not observed to fix anything, but I was in the area adding more integer
state vars, and thought it'd be wise.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit
8b408972ff5476f1e23ad24a329f89442e6df054)
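The idea behind passing a union in the _mesa_fetch_state() commit above is that an integer is stored and read through an integer member, so it never has to travel through a floating-point load. The union and functions below are simplified, hypothetical stand-ins and not Mesa's gl_constant_value definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a constant-value union. */
union constant_value {
   float f;
   int32_t i;
   uint32_t u;
};

/* Store an integer state value through the integer member. */
static void
fetch_int_state(int32_t state, union constant_value *val)
{
   assert(val != NULL);
   val->i = state;
}

/* Store and read back; the bits are never interpreted as a float. */
static int32_t
round_trip_int(int32_t state)
{
   union constant_value v;
   fetch_int_state(state, &v);
   return v.i;
}
```

The value 0x7f800001 is a useful probe because, read as a float, it is a signalling NaN bit pattern, the sort of value a floating-point register copy might alter.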
Emil Velikov [Wed, 15 Jun 2016 08:21:11 +0000 (09:21 +0100)]
Update version to 12.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Nicolai Hähnle [Tue, 14 Jun 2016 16:00:13 +0000 (18:00 +0200)]
radeonsi: mark buffer texture range valid for shader images
When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.
This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels,
reported by Michel Dänzer.
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit
a64c7cd2bac33a3a2bf908b5ef538dff03b93b73)
Back-ported from commit
a64c7cd2bac33a3a2bf908b5ef538dff03b93b73:
- include util/u_format.h
- code was extracted to si_set_shader_image in master, move it back
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
--
src/gallium/drivers/radeonsi/si_descriptors.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
Ilia Mirkin [Sat, 28 May 2016 18:23:35 +0000 (14:23 -0400)]
nv50/ir: record number of threads in a compute shader
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit
27a51ff9b420909334898785cf194b5998776e88)
Ilia Mirkin [Sat, 28 May 2016 18:28:07 +0000 (14:28 -0400)]
nvc0/ir: limit max number of regs based on availability in SM
This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit
1f895caba0accc0af3e637d6193ac0b673ce98bc)
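The limits in the nvc0/ir commit above are consistent with dividing a fixed register file among the threads of a block. In the sketch below the function name is hypothetical, and the register file sizes (32768 and 65536) and per-thread caps (63 and 255) used in the note are assumptions chosen to reproduce the commit's 32 and 64 figures, not values taken from the driver.

```c
#include <assert.h>

/* Registers each thread may use: the share of the register file
 * available per thread, capped at the per-thread maximum. */
static unsigned
max_regs_per_thread(unsigned regfile_size, unsigned threads,
                    unsigned per_thread_cap)
{
   assert(threads > 0);
   const unsigned available = regfile_size / threads;
   return available < per_thread_cap ? available : per_thread_cap;
}
```

With 1024 threads this gives 32768 / 1024 = 32 and 65536 / 1024 = 64; with 256 threads the per-thread cap becomes the limiting factor instead.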
Tomasz Figa [Mon, 13 Jun 2016 10:53:21 +0000 (19:53 +0900)]
i965: Check return value of screen->image.loader->getBuffers (v2)
The images struct is an uninitialized local variable on the stack. If the
callback returns 0, the struct might not have been updated and so should
be considered uninitialized. Currently the code ignores the return value,
which (depending on stack contents) might end up in reading a non-zero
value from images.image_mask and dereferencing further fields.
Another solution would be to initialize image_mask with 0, but checking
the return value seems more sensible and it is what Gallium is doing.
v2: fix typos in commit message,
fix indentation,
remove unnecessary parentheses and pointer dereference to keep line
length reasonable.
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
e7ab358e8186dd8651cf920d4db1500c60ccd2fc)
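The pattern fixed by the getBuffers commit above looks roughly like the following. All names here (the struct, the callback type, the two sample loaders) are hypothetical stand-ins for the DRI image loader interface; the point is only that the output struct is read after the callback reports success and not otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output struct filled in by the loader callback. */
struct image_list {
   unsigned image_mask;
};

/* Hypothetical callback type: returns non-zero on success. */
typedef int (*get_buffers_fn)(struct image_list *out);

/* Sample loader that fails and leaves the struct untouched. */
static int
failing_loader(struct image_list *out)
{
   (void)out;
   return 0;
}

/* Sample loader that succeeds and reports one image. */
static int
working_loader(struct image_list *out)
{
   out->image_mask = 0x1;
   return 1;
}

/* Returns the image mask, or -1 if the loader failed. */
static int
query_image_mask(get_buffers_fn get_buffers)
{
   struct image_list images;   /* uninitialized, as in the commit */

   assert(get_buffers != NULL);
   if (!get_buffers(&images))
      return -1;               /* never read images on failure */

   return (int)images.image_mask;
}
```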
Dylan Baker [Mon, 13 Jun 2016 18:19:18 +0000 (11:19 -0700)]
isl: Replace bash generator with python generator
This replaces the current bash generator with a python based generator
using mako. It's quite fast and works with both python 2.7 and python
3.5, and should work with 3.3+ and maybe even 3.2.
It produces an almost identical file except for minor layout changes,
and the addition of a "generated file, do not edit" warning.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
5a87bc718197deab7577a028c64a7f591bbfaec4)
Bas Nieuwenhuizen [Mon, 6 Jun 2016 20:49:57 +0000 (22:49 +0200)]
radeonsi: Reinitialize all descriptors in CE preamble.
This fixes a problem with the CE preamble and restoring only stuff in the
preamble when needed.
To illustrate suppose we have two graphics IB's 1 and 2, which are submitted in
that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we
have a context switch at the start of IB 1, but not between IB 1 and IB 2.
The old code put the CE RAM loads in the preamble of IB 2. As the preamble of
IB 1 does not have the loads and the preamble of IB 2 does not get executed, the
old values are not loaded into CE RAM.
Fix this by always restoring the entire CE RAM.
v2: - Just load all descriptor set buffers instead of loading and storing the entire
CE RAM.
- Leave the ce_ram_dirty tracking in place for the non-preamble case.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Note: This commit differs from the one in master -
54f755fa0fd
("radeonsi: Reinitialize all descriptors in CE preamble.")
Emil Velikov [Tue, 14 Jun 2016 14:52:41 +0000 (15:52 +0100)]
cherry-ignore: drop the "i965 bring back INTEL_PRECISE_TRIG"
The commit that removes it isn't in branch, thus there's nothing to do
here.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Samuel Iglesias Gonsálvez [Thu, 9 Jun 2016 11:03:59 +0000 (13:03 +0200)]
i965: Defeat the register stride checker in pull uniform messages.
Pulling DF uniforms from the pull constant buffer generates messages like:
send(4) g12<1>DF g12<0,1,0>F
sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1
which produces GPU hangs in Cherryview/Braswell:
"For 64-bit Align1 operation or multiplication of dwords in CHV,
source horizontal stride must be aligned to qword."
This seems to be documented in the Cherryview PRM, Volume 7, Page 843:
"When source or destination datatype is 64b or operation is integer
DWord multiply, regioning in Align1 must follow these rules:
1. Source and Destination horizontal stride must be aligned to the
same qword."
We should set the destination type to UD, D, or F so that
the register stride checker doesn't notice. The destination type of
send messages is basically irrelevant anyway.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
a0ed8503b753574b14df3dc280fd917ae7c207f8)
Kenneth Graunke [Wed, 8 Jun 2016 23:24:50 +0000 (16:24 -0700)]
i965: Defeat the register stride checker in URB reads.
Pulling DF inputs from the URB generates messages like:
send(8) g23<1>DF g1<8,8,1>UD
urb 3 SIMD8 read mlen 1 rlen 2 { align1 1Q };
which makes the simulator angry:
"For 64-bit Align1 operation or multiplication of dwords in CHV,
source horizontal stride must be aligned to qword."
This seems to be documented in the Cherryview PRM, Volume 7, Page 823:
"When source or destination datatype is 64b or operation is integer
DWord multiply, regioning in Align1 must follow these rules:
1. Source and Destination horizontal stride must be aligned to the
same qword."
Setting the source horizontal stride to QWord is insane, as it's the
message header containing 8 URB handles in a single 32-bit DWord.
Instead, we should whack the destination type to UD, D, or F so that
the register stride checker doesn't notice. The destination type of
send messages is basically irrelevant anyway.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
ed3ba651f6faa4ea94dde16fa880781090785477)
Kenneth Graunke [Wed, 8 Jun 2016 22:55:18 +0000 (15:55 -0700)]
i965: Fix issues with number of VS URB entries on Cherryview/Broxton.
Cherryview/Broxton annoyingly have a minimum number of VS URB entries
of 34, which is not a multiple of 8. When the VS size is less than 9,
the number of VS entries has to be a multiple of 8.
Notably, BLORP programmed the minimum number of VS URB entries (34), with
a size of 1 (less than 9), which is invalid.
It seemed like this could be a problem in the regular URB code as well,
so I went ahead and updated that to be safe.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
9f37df06dafbf54cec6749543cac1baa77d0b5e2)
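The alignment rule described in the commit above can be sketched in C as follows. This is only an illustration, not the driver's code: the helper name sketch_align_vs_entries is hypothetical, and rounding the count up (rather than down) is an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical helper: make a VS URB entry count legal.
 * Per the commit message, the count must be a multiple of 8
 * whenever the VS entry size is less than 9. */
static unsigned
sketch_align_vs_entries(unsigned entries, unsigned vs_size)
{
   assert(vs_size > 0);
   if (vs_size < 9)
      return (entries + 7u) & ~7u;   /* round up to a multiple of 8 */
   return entries;
}
```

With the Cherryview/Broxton minimum of 34 entries and a size of 1, this sketch yields 40 entries.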
Timothy Arceri [Tue, 14 Jun 2016 00:13:41 +0000 (10:13 +1000)]
glsl: make sure UBO arrays are sized in ES
This check was removed in
5b2675093e86; add it back in.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
https://bugs.freedesktop.org/show_bug.cgi?id=96349
(cherry picked from commit
b010fa85675b98962426fe8961466fbae2d25499)
Vedran Miletić [Mon, 6 Jun 2016 10:43:33 +0000 (12:43 +0200)]
clover: Update OpenCL version string to match OpenGL
Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For
both, always append git version suffix from git_sha1.h.
v5: move semicolon to same line as MESA_GIT_SHA1.
v4: drop #ifdef guards.
v3: add missing include.
v2: change CL_DEVICE_VERSION as well.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit
4825264f75c83576f251290547f121f066b46a70)
Squashed with commit
clover: Include generated sources in AM_CPPFLAGS
git_sha1.c is generated in $(top_builddir)/src.
Fixes out-of-tree builds since
4825264f75c83576.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit
fafe026dbe0680c971bf3ba2452954eea84287f2)
Francisco Jerez [Sat, 11 Jun 2016 00:55:39 +0000 (17:55 -0700)]
i965/fs: Fix regs_written for SIMD-lowered instructions some more.
ISTR having suggested this during review of the recent FP64 changes to
the SIMD lowering pass, but it doesn't look like it was taken into
account in the end. Using the fs_reg::component_size helper instead
of this open-coded variant makes sure that the stride is taken into
account correctly. Fixes at least the following piglit tests with
spilling forced on (since otherwise regs_written would be calculated
incorrectly and the spilling code would be rather confused about how
much data needs to be spilled):
spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
bd9f9726519fad94e88b9266b0c255aa00251f4d)
Francisco Jerez [Fri, 10 Jun 2016 23:41:59 +0000 (16:41 -0700)]
i965: Fix cross-primitive scratch corruption when changing the per-thread allocation.
I haven't found any mention of this in the hardware docs, but
experimentally what seems to be going on is that when the per-thread
scratch slot size is changed between two pipelined draw calls, shader
invocations using the old and new scratch size setting may end up
being executed in parallel, causing their scratch offset calculations
to be based on a different partitioning of the scratch space, which
can cause their thread-local scratch space to overlap leading to
cross-thread scratch corruption.
I've been experimenting with alternative workarounds, like emitting a
PIPE_CONTROL with DC flush and CS stall between draw (or dispatch
compute) calls using different per-thread scratch allocation settings,
or avoiding reuse of the scratch BO if the per-thread scratch
allocation doesn't exactly match the original. Both seem to be as
effective as this workaround, but they have potential performance
implications, while this should be basically for free.
Fixes over 40 failures in our CI system with spilling forced on
(including CTS, dEQP and Piglit failures) on a number of different
platforms from Gen4 to Gen9. The 'glsl-max-varyings' piglit test
seems to be able to reproduce this bug consistently in the vertex
shader on at least Gen4, Gen8 and Gen9 with spilling forced on.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
a84b5d43e2e54dbebe3600111f4f35c29411f831)
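The overlap described in the commit above can be illustrated with a small C sketch. It assumes a simple addressing model in which thread N owns the byte range [N * slot_size, (N + 1) * slot_size); that model and all names here are assumptions for illustration, since the commit message itself notes the behaviour is not documented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open byte range [start, end) inside the scratch buffer. */
struct sketch_range {
   uint32_t start;
   uint32_t end;
};

/* Assumed model: each thread's slot starts at thread_id * slot_size. */
static struct sketch_range
sketch_scratch_range(uint32_t thread_id, uint32_t slot_size)
{
   assert(slot_size > 0);
   struct sketch_range r = { thread_id * slot_size,
                             (thread_id + 1) * slot_size };
   return r;
}

/* Two half-open ranges overlap if each starts before the other ends. */
static bool
sketch_ranges_overlap(struct sketch_range a, struct sketch_range b)
{
   return a.start < b.end && b.start < a.end;
}
```

Under this model, thread 2 with a 1 kB slot and thread 1 with a 2 kB slot both touch bytes starting at offset 2048, which is the kind of cross-thread overlap the commit describes.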
Francisco Jerez [Mon, 13 Jun 2016 21:56:22 +0000 (14:56 -0700)]
i965: Keep track of the per-thread scratch allocation in brw_stage_state.
This will be used to find out what per-thread slot size a previously
allocated scratch BO was used with in order to fix a hardware race
condition without introducing additional stalls or memory allocations.
Instead of calling brw_get_scratch_bo() manually from the various
codegen functions, call a new helper function that keeps track of the
per-thread scratch size and conditionally allocates a larger scratch
BO.
v2: Handle BO allocation manually instead of relying on
brw_get_scratch_bo (Ken).
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
d960284e447df9b1563deef0ce950617decfba63)
Francisco Jerez [Thu, 9 Jun 2016 00:53:24 +0000 (17:53 -0700)]
i965: Fix scratch overallocation if the original slot size was already a power of two.
The bitwise arithmetic trick used in brw_get_scratch_size() to clamp
the scratch allocation to 1KB has the unintended side effect that it
will cause us to allocate 2x the required amount of scratch space if
the original per-thread scratch size happened to be already a power of
two. Instead use the obvious MAX2 idiom to clamp the scratch
allocation to the expected range.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
013ae4a70aeb40dc74e53943824bff33dda109e1)
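The difference between the two clamping approaches in the commit above can be sketched in C. The exact bitwise expression used by brw_get_scratch_size() is not quoted in the commit message, so the "OR in the low ten bits" form below is an assumption that merely reproduces the described 2x behaviour; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Smallest power of two that is >= x. */
static uint32_t
sketch_next_pow2(uint32_t x)
{
   assert(x <= 0x80000000u);
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* Assumed form of the bitwise trick: forcing the low ten bits on
 * guarantees at least 1 kB, but pushes exact powers of two (>= 2 kB)
 * past themselves, so they round up to the next power. */
static uint32_t
sketch_scratch_size_bitwise(uint32_t size)
{
   return sketch_next_pow2(size | 1023u);
}

/* The MAX2 idiom: round up first, then clamp to the 1 kB minimum. */
static uint32_t
sketch_scratch_size_max2(uint32_t size)
{
   return SKETCH_MAX2(1024u, sketch_next_pow2(size));
}
```

For a 2048-byte request the bitwise variant returns 4096 while the MAX2 variant returns 2048; for small requests both return 1024.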
Kenneth Graunke [Mon, 13 Jun 2016 19:18:23 +0000 (12:18 -0700)]
i965: Fix encode_slm_size() to take a generation, not a device info.
In the Vulkan driver, we have the generation number (a compile time
constant) but not necessarily the brw_device_info struct. I meant
to rework the function to take a generation number instead of a
brw_device_info pointer to accommodate this. But I forgot, and left
it taking a brw_device_info pointer, while making Vulkan pass the
generation number (8, 9, ...) directly. This led to crashes.
Brown paper bag fix for commit
87d062a94080373995170f51063a9649.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
5a0d294d38505ae61293ae1a9184e1b3228ef2af)
Kenneth Graunke [Sun, 12 Jun 2016 22:44:55 +0000 (15:44 -0700)]
i965: Don't leak scratch BOs for TCS/TES.
These need to be freed too.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
667e5cec760d1908af73a40de28c53848b5b70a0)
Nanley Chery [Thu, 9 Jun 2016 21:48:00 +0000 (14:48 -0700)]
anv/pipeline: Don't dereference NULL dynamic state pointers
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.
This fixes a segfault seen in the McNopper demo, VKTS_Example09.
v3 (Jason Ekstrand):
- Fix disabled rasterization check
- Revert opaque detection of color attachment usage
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
a4a59172482d50318a5ae7f99021bcf0125e0f53)
Nanley Chery [Thu, 9 Jun 2016 19:12:29 +0000 (12:12 -0700)]
anv: Document and rename anv_pipeline_init_dynamic_state()
To reduce confusion, clarify that the state being copied is not dynamic.
This agrees with the Vulkan spec's usage of the term. Various sections
specify that the various pipeline state which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
a0d84a9ef9df69606a928cf7dca8f2b80dea1c36)
Samuel Pitoiset [Mon, 13 Jun 2016 15:13:28 +0000 (17:13 +0200)]
nvc0/ir: clamp the UBO index for compute on Kepler
We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
7f257abc1bdd153b3981efffc3f201e1ea5fe843)
Jimmy Berry [Thu, 21 Apr 2016 13:05:41 +0000 (15:05 +0200)]
st/va: hardlink driver instances to gallium_drv_video.so
Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.
Note: some versions of libva can detect the gallium name and use the
backend, although that behaviour seems inconsistent since it only works
for some platforms/backends.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
0c0f841e5de27d01312f8857641668ca439b1ab1)
Emil Velikov [Fri, 10 Jun 2016 19:45:01 +0000 (20:45 +0100)]
swr: automake: add missing -I flag
When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.
Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
fcb5a75a666fa8eb8efe8e7d45316549b4c53ef3)
Emil Velikov [Fri, 10 Jun 2016 17:47:32 +0000 (18:47 +0100)]
automake: add SWR to `make distcheck' gallium drivers
This will allow us to catch missing files and build issues before getting
the tarball out for general consumption.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
f4d26856df2628b6e0fdeee1e9be36427d8f7b74)
Emil Velikov [Fri, 10 Jun 2016 16:46:24 +0000 (17:46 +0100)]
configure.ac: strip out the llvm-config -march/mtune flags
Otherwise drivers such as SWR that depend on providing their own values
will fail to build.
v2: Add -mcpu for good measure (Chuck)
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit
bab5ab69402594637359289c1b5ec6491e91d252)
Chuck Atkins [Fri, 10 Jun 2016 14:44:28 +0000 (10:44 -0400)]
swr: Add missing headers for package inclusion
CC: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
c86fcaca72ed500eac63d5633e0d7ffb77de9acf)
Emil Velikov [Wed, 8 Jun 2016 14:36:18 +0000 (15:36 +0100)]
automake: get in-tree `make distclean' working again.
With an earlier commit we handled `make distclean' for out-of-tree
builds, yet we failed to account for the fact that for in-tree builds the
test condition returns 1. Thus the target was effectively considered
"failed".
Fixes:
b7f7ec78435 ("mesa: automake: distclean git_sha1.h when building
OOT")
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit
8229fe68b5d19c4aaf674474300319b5f69260b7)
Kenneth Graunke [Tue, 7 Jun 2016 04:37:34 +0000 (21:37 -0700)]
i965: Use the correct number of threads for compute shaders.
We were programming the number of threads per subslice, when we should
have been programming the total number of threads on the GPU as a whole.
Thanks to Curro and Jordan for helping track this down!
On Skylake GT3e:
- Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
- Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
- Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.
On Broadwell GT2:
- Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
- Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
- Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
0.255654% (n=25).
On Haswell GT3e:
- Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
by roughly 1.10x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
- Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
0.432771% (n=64).
On Ivybridge GT2:
- Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
by roughly 1.03x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.25x.
- No change in Synmark's Gl43CSCloth (n=28).
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
0fb85ac08d61d365e67c8f79d6955e9f89543560)
Kenneth Graunke [Fri, 10 Jun 2016 01:13:26 +0000 (18:13 -0700)]
i965: Assert that the scratch spaces are in range.
I don't know that anything actually guarantees this, but if we exceed
the limits, we may end up overflowing and trashing random buffers that
happen to be nearby in the VMA space, leading to rendering corruption,
hangs, or worse.
We should really fix this properly. However, the pitfall has existed
for ages, so for now we should at least detect it.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
1db37ebecf5af55215ace3801f8dbb8b10c5305e)
Kenneth Graunke [Fri, 10 Jun 2016 00:30:40 +0000 (17:30 -0700)]
i965: Fix CS scratch size calculations on Ivybridge and Baytrail.
These are linear, not powers of two, and much more limited.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
a42a93dc123163f84058f3886e5ce1b02b9856f5)
Kenneth Graunke [Thu, 9 Jun 2016 23:56:31 +0000 (16:56 -0700)]
i965: Fix Haswell CS per-thread scratch space encoding.
Most scratch stages use power of two sizes, in kilobytes, where
0 means 1kB. But compute shaders on Haswell have a minimum of 2kB,
and use a representation where 0 = 2kB.
This meant that we were effectively telling the hardware to allocate
each thread twice as much space as we meant to, while simultaneously
not allocating that much space in the buffer, leading to overflows.
Note that the existing code is completely wrong for Ivybridge,
but that will take additional work to sort out, so I've left it
as is for now. A subsequent commit will take care of that.
Together with the previous patches, this fixes rendering corruption
on Synmark's Gl43CSDof on Haswell.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
147a90d82a5de637f968e0d5f383cabcb792f1ce)
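The two encodings described in the commit above differ only in which size maps to a field value of 0. A minimal C sketch, with a hypothetical function name and the assumption that the caller passes a power-of-two size at or above the stage's minimum:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: encode a per-thread scratch size as a
 * power-of-two field. Most stages use 0 = 1 kB; per the commit
 * message, Haswell compute shaders use 0 = 2 kB. */
static unsigned
sketch_encode_per_thread_scratch(uint32_t bytes, bool haswell_cs)
{
   const uint32_t min_bytes = haswell_cs ? 2048u : 1024u;

   /* Assumed precondition: a power of two, no smaller than the minimum. */
   assert(bytes >= min_bytes && (bytes & (bytes - 1)) == 0);

   unsigned field = 0;
   while ((min_bytes << field) < bytes)
      field++;
   return field;
}
```

Encoding 2 kB with the generic rule gives 1, which Haswell CS hardware would read as 4 kB; this matches the "twice as much space as we meant to" symptom in the commit message.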
Kenneth Graunke [Thu, 9 Jun 2016 23:11:46 +0000 (16:11 -0700)]
i965: Account for poor address calculations in Haswell CS scratch size.
Curro figured this out by investigating the simulator. Apparently
there's also a workaround in the Windows driver. I'm not sure it's
actually documented anywhere.
We were underallocating the scratch buffer by a factor of 128/70.
v2: Rename threads_per_subslice to scratch_ids_per_subslice
(suggested by Jordan Justen).
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
a7d029d3dfac1da2701be75ff4d1589ac562e916)
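The 128/70 factor in the commit above can be made concrete with a short C sketch. It reads 70 as the number of real hardware threads per subslice and 128 as the number of scratch IDs the hardware addresses per subslice; that reading is an assumption based on the ratio and on the scratch_ids_per_subslice rename, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total scratch buffer size needed when every
 * addressable scratch ID gets its own per-thread slot. */
static uint64_t
sketch_cs_scratch_bytes(uint32_t per_thread_bytes, uint32_t subslices,
                        uint32_t ids_per_subslice)
{
   assert(per_thread_bytes > 0 && subslices > 0 && ids_per_subslice > 0);
   return (uint64_t)per_thread_bytes * subslices * ids_per_subslice;
}
```

With 1 kB per thread and two subslices, sizing by 70 gives 143360 bytes while sizing by 128 gives 262144 bytes, so a buffer sized by the thread count would be too small by the stated factor.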
Kenneth Graunke [Tue, 7 Jun 2016 04:37:34 +0000 (21:37 -0700)]
i965: Allocate scratch space for the maximum number of compute threads.
We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.
Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
EUs and threads are executing. We need to allocate enough
for every possible thread that could run.
Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
2213ffdb4bb79856f0556bdf2bfd4bdf57720232)
Kenneth Graunke [Thu, 9 Jun 2016 06:36:16 +0000 (23:36 -0700)]
i965: Set subslice_total on Gen7/7.5 platforms.
We'll use this for compute shader thread counts and scratch space
calculations shortly.
Note that subslices are referred to as "half slices" on Ivybridge.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
9cd8f95809c21330e4ccbfbe80ee2eea0f7906ae)
Kenneth Graunke [Thu, 9 Jun 2016 05:21:22 +0000 (22:21 -0700)]
i965: Fix shared local memory size for Gen9+.
Skylake changes the representation of shared local memory size:
Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
-------------------------------------------------------------------
Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
-------------------------------------------------------------------
Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
The old formula would substantially underallocate the amount of space.
This fixes GPU hangs on Skylake when running with full thread counts.
v2: Fix the Vulkan driver too, use a helper function, and fix the table
in the comments and commit message.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
87d062a94080373995170f51063a9649c96c6dea)
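The table in the commit above can be turned into a small encoder. The following C sketch is derived only from that table and is not the driver's encode_slm_size(); the name sketch_encode_slm_size is hypothetical, and rounding a pre-Gen9 request of 1 kB or 2 kB up to 4 kB (where the table says "none") is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoder for the shared local memory size field.
 * It takes a generation number rather than a device-info struct. */
static uint32_t
sketch_encode_slm_size(unsigned gen, uint32_t bytes)
{
   assert(gen >= 7 && bytes <= 64u * 1024u);
   if (bytes == 0)
      return 0;

   /* Smallest encodable size: 1 kB on Gen9+, 4 kB on Gen7-8. */
   uint32_t size = (gen >= 9) ? 1024u : 4096u;
   uint32_t steps = 0;
   while (size < bytes) {
      size <<= 1;
      steps++;
   }

   if (gen >= 9)
      return steps + 1;      /* 1 kB -> 1, 2 kB -> 2, ..., 64 kB -> 7 */
   return size / 4096u;      /* 4 kB -> 1, 8 kB -> 2, ..., 64 kB -> 16 */
}
```

For a 64 kB request this returns 16 on Gen8 and 7 on Gen9, matching the last column of the table.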
Ilia Mirkin [Fri, 10 Jun 2016 03:45:22 +0000 (23:45 -0400)]
mesa: add drawbuffer argument to ClearNamedFramebufferfi
This was fixed in revision 47 of the ARB_dsa spec on Oct 22, 2015. Since
it's horrible to have differing APIs across library versions, we should
attempt to minimize the impact by backporting it as far as possible and
hope no one notices.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
7d7e015381b25ec639633b63d01d851bc32edf23)
Ilia Mirkin [Fri, 10 Jun 2016 04:43:13 +0000 (00:43 -0400)]
GL: update glcorearb.h to svn 32433
This brings in the fixed glClearNamedFramebufferfi definition, as well
as a lot of GLsizei -> GLsizeiptr changes.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
92351a71a81edb53164f1d62b854036e031bb4a1)
Ilia Mirkin [Fri, 10 Jun 2016 02:55:18 +0000 (22:55 -0400)]
GL: update glext to svn 32957
This brings in defines from GL_EXT_window_rectangles and fixes the
glClearNamedFramebufferfi definition.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
f81374fd3e24650adc90d589a307e7cc2f2fa714)
Anuj Phogat [Fri, 11 Dec 2015 22:41:31 +0000 (14:41 -0800)]
gallium: Fix region overlap conditions for rectangles with a shared edge
From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1)
to the destination rectangle bounded by the locations (dstX0, dstY0)
and (dstX1, dstY1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."
So, the rectangles sharing just an edge shouldn't overlap.
-----------
| |
------- ---
| | |
| | |
------- ---
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
466b3201633a61bc9adfb38397a6fe776cb1cfe3)
Anuj Phogat [Fri, 11 Dec 2015 22:41:30 +0000 (14:41 -0800)]
mesa: Fix region overlap conditions for rectangles with a shared edge
From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1)
to the destination rectangle bounded by the locations (dstX0, dstY0)
and (dstX1, dstY1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."
So, the rectangles sharing just an edge shouldn't overlap.
-----------
| |
------- ---
| | |
| | |
------- ---
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
f8679badd423b61b3a49e1138445f9f3d740fdde)
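The inclusive-lower, exclusive-upper rule quoted in the two commits above leads to a strict-inequality overlap test. A minimal C sketch, with a hypothetical function name and the assumption that each rectangle is already normalized (X0 <= X1 and Y0 <= Y1, no flipped blits):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical overlap test for half-open rectangles
 * [X0, X1) x [Y0, Y1). Strict comparisons mean that rectangles
 * sharing only an edge or a corner do not count as overlapping. */
static bool
sketch_regions_overlap(int srcX0, int srcY0, int srcX1, int srcY1,
                       int dstX0, int dstY0, int dstX1, int dstY1)
{
   assert(srcX0 <= srcX1 && srcY0 <= srcY1);
   assert(dstX0 <= dstX1 && dstY0 <= dstY1);

   return srcX0 < dstX1 && dstX0 < srcX1 &&
          srcY0 < dstY1 && dstY0 < srcY1;
}
```

Using <= in place of < would report an overlap for rectangles that merely touch, which is the shared-edge case these commits address.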
Jason Ekstrand [Fri, 10 Jun 2016 19:42:52 +0000 (12:42 -0700)]
anv/entrypoints: Rework #if guards
This reworks the #if guards a bit. When Emil originally wrote them, he
just guarded everything. However, part of what anv_entrypoints_gen.py
generates is a hash table for looking up entrypoints based on their name.
This table *cannot* get out of sync between C and python regardless of
preprocessor flags. In order to prevent this, this commit makes us use
void pointers in the dispatch table for those entrypoints which aren't
available. This means that the dispatch table size and entry order is
constant and it should never get out-of-sync with the python.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
8d37556ec9d7fbbffc5497388a52998ae4fe75de)
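The idea in the commit above, keeping every slot in the dispatch table regardless of preprocessor flags, can be sketched in C as follows. The struct, the typedefs, and the SKETCH_HAVE_PLATFORM macro are all hypothetical stand-ins, and the sketch assumes data pointers and function pointers have the same size, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_pfn_create)(void);
typedef void (*sketch_pfn_platform)(void);
typedef void (*sketch_pfn_destroy)(void);

/* Hypothetical dispatch table: the platform-specific slot is always
 * present, typed when the platform is enabled and a void pointer
 * placeholder otherwise, so later slots never shift. */
struct sketch_dispatch {
   sketch_pfn_create create;
#ifdef SKETCH_HAVE_PLATFORM
   sketch_pfn_platform platform_entry;
#else
   void *platform_entry;
#endif
   sketch_pfn_destroy destroy;
};

/* Slot index of 'destroy'; identical with or without the platform. */
static size_t
sketch_destroy_index(void)
{
   assert(sizeof(void *) == sizeof(sketch_pfn_destroy));
   return offsetof(struct sketch_dispatch, destroy) / sizeof(void *);
}
```

Because the table layout does not depend on the macro, an index computed by an external generator stays valid for every build configuration.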
Jason Ekstrand [Fri, 10 Jun 2016 19:30:05 +0000 (12:30 -0700)]
anv/entrypoints: Use the function pointer types provided by vulkan.h
This is a bit cleaner than generating the types ourselves when making the
table.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
9ed0d9dd06414b30d28d7d1301a980784e22d8d6)
Jason Ekstrand [Mon, 6 Jun 2016 21:29:18 +0000 (14:29 -0700)]
anv/entrypoints: Emit #if guards for all platforms
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
d1a53f91ee950720b54c35b7d61f0213659533de)
Nicolai Hähnle [Mon, 6 Jun 2016 21:15:10 +0000 (23:15 +0200)]
st/mesa: use base level size as "guess" when available
When an application specifies mip levels _before_ setting a mipmap texture
filter, we will initially guess a single texture level. When the second level
image is created, we try to allocate the full texture -- however, we get the
base level size guess wrong if that size is odd. This leads to yet another
re-allocation of the texture later during st_finalize_texture.
Even worse, this re-allocation breaks a (reasonable) assumption made by
st_generate_mipmaps, because the re-allocation in the finalization call will
again allocate a single-level pipe texture (based on the non-mipmap texture
filter!). As a result, mipmap generation fails in interesting ways.
All of this can be avoided by just using the fact that we already know the
size of the base level.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit
42624ea837e8f422f1cd04403af915bd7f218b8d)
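Why the guess goes wrong for odd base sizes, as the commit above describes, can be shown with a little integer arithmetic. The C sketch below assumes the guess is made by shifting a level's size left by the level number; that is an assumption about the guessing code, and both function names are hypothetical.

```c
#include <assert.h>

/* Size of mip level 'level' for a given base size: each level halves
 * the previous one, rounding down, with a minimum of 1. */
static unsigned
sketch_mip_size(unsigned base, unsigned level)
{
   assert(level < 32);
   unsigned s = base >> level;
   return s > 0 ? s : 1;
}

/* Assumed guess: scale the level's size back up by 2^level. The bits
 * lost to rounding down cannot be recovered this way. */
static unsigned
sketch_guess_base(unsigned level_size, unsigned level)
{
   assert(level < 32);
   return level_size << level;
}
```

A base width of 5 has a level-1 width of 2, and guessing back from that gives 4 rather than 5; an even base such as 8 round-trips correctly.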
Jason Ekstrand [Fri, 10 Jun 2016 16:43:45 +0000 (09:43 -0700)]
anv: Remove the PhysicalDeviceLimits FINISHME
At this point, the limits are probably more-or-less correct. If there is
an invalid limit, that's a bug, not a FINISHME.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
a1e69930e43f2876f662042bb94b76124dbe7dfc)
Jason Ekstrand [Mon, 6 Jun 2016 18:20:44 +0000 (11:20 -0700)]
anv/pipeline_cache: Allow for a zero-sized cache
This gets ANV_ENABLE_PIPELINE_CACHE=false working again.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
4f5bbf804b5590631abb7ff36b74871a0725f8fa)
Jason Ekstrand [Mon, 6 Jun 2016 18:12:27 +0000 (11:12 -0700)]
anv/pipeline: Store the (set, binding, index) triple in the bind map
This way the bind map (which we're caching) is mostly independent of
the pipeline layout. The only coupling remaining is that we pull the array
size of a binding out of the layout. However, that size is also specified
in the shader and should always match so it's not really coupled. This
fixes rendering issues in Dota 2.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
a1a25db69926604d579139d1d497f1566ec16ac7)
Jason Ekstrand [Mon, 6 Jun 2016 16:15:03 +0000 (09:15 -0700)]
anv/descriptor_set: Ensure that bindings are always in increasing order
Since applications are allowed to specify some set of bindings which need
not be dense they also need not be in order. For most things, this doesn't
matter, but it could result in getting the wrong dynamic offsets. This adds a
quick-and-dirty sort to ensure that everything is always in increasing
order of binding index.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
c13c5ac561f6475d08c35d2a88a829e6ce36e98c)
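The ordering step in the commit above can be sketched in C. The commit message only says "quick-and-dirty sort", so the insertion sort below is an assumption, and the struct and function names are hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one descriptor set binding. */
struct sketch_binding {
   uint32_t binding;      /* binding index chosen by the application */
   uint32_t array_size;   /* number of descriptors in the binding */
};

/* Insertion sort by binding index. Binding lists are small, so the
 * quadratic worst case is not a concern for a sketch like this. */
static void
sketch_sort_bindings(struct sketch_binding *b, size_t count)
{
   assert(b != NULL || count == 0);
   for (size_t i = 1; i < count; i++) {
      struct sketch_binding key = b[i];
      size_t j = i;
      while (j > 0 && b[j - 1].binding > key.binding) {
         b[j] = b[j - 1];
         j--;
      }
      b[j] = key;
   }
}
```

Once the bindings are in increasing order, anything assigned by walking the list, such as dynamic offsets, comes out the same no matter what order the application supplied them in.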
Jason Ekstrand [Mon, 6 Jun 2016 16:12:50 +0000 (09:12 -0700)]
anv/descriptor_set: Add a type field in debug builds
This allows for some extra validation and makes it easier to see what's
going on when poking around in gdb.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
e2265926f25235f9be833984a5e365889a70ea74)
Jason Ekstrand [Mon, 6 Jun 2016 16:12:20 +0000 (09:12 -0700)]
anv/descriptor_set: Set array_size to zero for non-existent descriptors
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
cd21015abd134aa815398e90714631d3f6601294)
Leo Liu [Thu, 9 Jun 2016 16:53:54 +0000 (12:53 -0400)]
vl/dri3: support receiving new pixmap for front buffer
With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, we should get a new
front buffer for it.
This also fixes Totem player playback corruption.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
2ad443e4cc0a72b7d0b28195b5810cbf197961cb)
Leo Liu [Thu, 9 Jun 2016 17:11:52 +0000 (13:11 -0400)]
vl/dri3: get Makefile properly
In the original commit, the "if HAVE_DRI3" macro was in Makefile.sources.
This file is shared with SCons, which is not able to parse the macro, so
the SCons build failed. Jose quickly proposed two approaches and pushed a
quick fix using the second one; thanks, Jose, for the solutions and fixes.
This patch implements Jose's first approach, which is more proper because
the dri3 C file should not be built when DRI3 is not enabled.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
0ef8500aabb5430eae76919438fcf61020b7eb8e)
Daniel Czarnowski [Wed, 10 Feb 2016 17:36:05 +0000 (09:36 -0800)]
glx: fix crash with bad fbconfig
GLX documentation states:
glXCreateNewContext can generate the following errors: (...)
GLXBadFBConfig if config is not a valid GLXFBConfig
The function now checks whether the given config is valid and sets the
proper error code.
Fixes currently crashing glx-fbconfig-bad Piglit test.
v2: coding style cleanups (Emil, Topi)
use DefaultScreen macro (Emil)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
cf804b4455fac9e585b3600a8318caaced9c23de)
Jason Ekstrand [Thu, 9 Jun 2016 02:56:46 +0000 (19:56 -0700)]
i965: Emit surface states for extra planes prior to gen8
When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8
but not for gen7 or earlier. It all works, we just need to emit the states
for the extra planes.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
037ce5d7343829a69ec9c7361a0964bc1366b019)