OSDN Git Service

android-x86/external-mesa.git
8 years agoi965: Defeat the register stride checker in pull uniform messages.
Samuel Iglesias Gonsálvez [Thu, 9 Jun 2016 11:03:59 +0000 (13:03 +0200)]
i965: Defeat the register stride checker in pull uniform messages.

Pulling DF uniforms from pull constant buffer generates messages like:
    send(4)         g12<1>DF        g12<0,1,0>F
         sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1

which produces GPU hangs in Cherryview/Braswell:

    "For 64-bit Align1 operation or multiplication of dwords in CHV,
     source horizontal stride must be aligned to qword."

This seems to be documented in the Cherryview PRM, Volume 7, Page 843:

    "When source or destination datatype is 64b or operation is integer
     DWord multiply, regioning in Align1 must follow these rules:

     1. Source and Destination horizontal stride must be aligned to the
        same qword."

We should set the destination type to UD, D, or F so that
the register stride checker doesn't notice.  The destination type of
send messages is basically irrelevant anyway.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a0ed8503b753574b14df3dc280fd917ae7c207f8)

8 years agoi965: Defeat the register stride checker in URB reads.
Kenneth Graunke [Wed, 8 Jun 2016 23:24:50 +0000 (16:24 -0700)]
i965: Defeat the register stride checker in URB reads.

Pulling DF inputs from the URB generates messages like:

   send(8)         g23<1>DF        g1<8,8,1>UD
                   urb 3 SIMD8 read mlen 1 rlen 2      { align1 1Q };

which makes the simulator angry:

   "For 64-bit Align1 operation or multiplication of dwords in CHV,
    source horizontal stride must be aligned to qword."

This seems to be documented in the Cherryview PRM, Volume 7, Page 823:

   "When source or destination datatype is 64b or operation is integer
    DWord multiply, regioning in Align1 must follow these rules:

    1. Source and Destination horizontal stride must be aligned to the
       same qword."

Setting the source horizontal stride to QWord is insane, as it's the
message header containing 8 URB handles in a single 32-bit DWord.
Instead, we should whack the destination type to UD, D, or F so that
the register stride checker doesn't notice.  The destination type of
send messages is basically irrelevant anyway.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit ed3ba651f6faa4ea94dde16fa880781090785477)

8 years agoi965: Fix issues with number of VS URB entries on Cherryview/Broxton.
Kenneth Graunke [Wed, 8 Jun 2016 22:55:18 +0000 (15:55 -0700)]
i965: Fix issues with number of VS URB entries on Cherryview/Broxton.

Cherryview/Broxton annoyingly have a minimum number of VS URB entries
of 34, which is not a multiple of 8.  When the VS size is less than 9,
the number of VS entries has to be a multiple of 8.

Notably, BLORP programmed the minimum number of VS URB entries (34), with
a size of 1 (less than 9), which is invalid.

It seemed like this could be a problem in the regular URB code as well,
so I went ahead and updated that to be safe.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9f37df06dafbf54cec6749543cac1baa77d0b5e2)

8 years agoglsl: make sure UBO arrays are sized in ES
Timothy Arceri [Tue, 14 Jun 2016 00:13:41 +0000 (10:13 +1000)]
glsl: make sure UBO arrays are sized in ES

This check was removed in 5b2675093e86 add it back in.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
https://bugs.freedesktop.org/show_bug.cgi?id=96349
(cherry picked from commit b010fa85675b98962426fe8961466fbae2d25499)

8 years agoclover: Update OpenCL version string to match OpenGL
Vedran Miletić [Mon, 6 Jun 2016 10:43:33 +0000 (12:43 +0200)]
clover: Update OpenCL version string to match OpenGL

Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For
both, always append git version suffix from git_sha1.h.

v5: move semicolon to same line as MESA_GIT_SHA1.
v4: drop #ifdef guards.
v3: add missing include.
v2: change CL_DEVICE_VERSION as well.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 4825264f75c83576f251290547f121f066b46a70)

Squashed with commit

clover: Include generated sources in AM_CPPFLAGS

git_sha1.c is generated in $(top_builddir)/src.

Fixes out-of-tree builds since 4825264f75c83576.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit fafe026dbe0680c971bf3ba2452954eea84287f2)

8 years agoi965/fs: Fix regs_written for SIMD-lowered instructions some more.
Francisco Jerez [Sat, 11 Jun 2016 00:55:39 +0000 (17:55 -0700)]
i965/fs: Fix regs_written for SIMD-lowered instructions some more.

ISTR having suggested this during review of the recent FP64 changes to
the SIMD lowering pass, but it doesn't look like it was taken into
account in the end.  Using the fs_reg::component_size helper instead
of this open-coded variant makes sure that the stride is taken into
account correctly.  Fixes at least the following piglit tests with
spilling forced on (since otherwise regs_written would be calculated
incorrectly and the spilling code would be rather confused about how
much data needs to be spilled):

 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit bd9f9726519fad94e88b9266b0c255aa00251f4d)

8 years agoi965: Fix cross-primitive scratch corruption when changing the per-thread allocation.
Francisco Jerez [Fri, 10 Jun 2016 23:41:59 +0000 (16:41 -0700)]
i965: Fix cross-primitive scratch corruption when changing the per-thread allocation.

I haven't found any mention of this in the hardware docs, but
experimentally what seems to be going on is that when the per-thread
scratch slot size is changed between two pipelined draw calls, shader
invocations using the old and new scratch size setting may end up
being executed in parallel, causing their scratch offset calculations
to be based in a different partitioning of the scratch space, which
can cause their thread-local scratch space to overlap leading to
cross-thread scratch corruption.

I've been experimenting with alternative workarounds, like emitting a
PIPE_CONTROL with DC flush and CS stall between draw (or dispatch
compute) calls using different per-thread scratch allocation settings,
or avoiding reuse of the scratch BO if the per-thread scratch
allocation doesn't exactly match the original.  Both seem to be as
effective as this workaround, but they have potential performance
implications, while this should be basically for free.

Fixes over 40 failures in our CI system with spilling forced on
(including CTS, dEQP and Piglit failures) on a number of different
platforms from Gen4 to Gen9.  The 'glsl-max-varyings' piglit test
seems to be able to reproduce this bug consistently in the vertex
shader on at least Gen4, Gen8 and Gen9 with spilling forced on.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a84b5d43e2e54dbebe3600111f4f35c29411f831)

8 years agoi965: Keep track of the per-thread scratch allocation in brw_stage_state.
Francisco Jerez [Mon, 13 Jun 2016 21:56:22 +0000 (14:56 -0700)]
i965: Keep track of the per-thread scratch allocation in brw_stage_state.

This will be used to find out what per-thread slot size a previously
allocated scratch BO was used with in order to fix a hardware race
condition without introducing additional stalls or memory allocations.
Instead of calling brw_get_scratch_bo() manually from the various
codegen functions, call a new helper function that keeps track of the
per-thread scratch size and conditionally allocates a larger scratch
BO.

v2: Handle BO allocation manually instead of relying on
    brw_get_scratch_bo (Ken).

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d960284e447df9b1563deef0ce950617decfba63)

8 years agoi965: Fix scratch overallocation if the original slot size was already a power of...
Francisco Jerez [Thu, 9 Jun 2016 00:53:24 +0000 (17:53 -0700)]
i965: Fix scratch overallocation if the original slot size was already a power of two.

The bitwise arithmetic trick used in brw_get_scratch_size() to clamp
the scratch allocation to 1KB has the unintended side effect that it
will cause us to allocate 2x the required amount of scratch space if
the original per-thread scratch size happened to be already a power of
two.  Instead use the obvious MAX2 idiom to clamp the scratch
allocation to the expected range.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 013ae4a70aeb40dc74e53943824bff33dda109e1)

8 years agoi965: Fix encode_slm_size() to take a generation, not a device info.
Kenneth Graunke [Mon, 13 Jun 2016 19:18:23 +0000 (12:18 -0700)]
i965: Fix encode_slm_size() to take a generation, not a device info.

In the Vulkan driver, we have the generation number (a compile time
constant) but not necessarily the brw_device_info struct.  I meant
to rework the function to take a generation number instead of a
brw_device_info pointer to accomodate this.  But I forgot, and left
it taking a brw_device_info pointer, while making Vulkan pass the
generation number (8, 9, ...) directly.  This led to crashes.

Brown paper bag fix for commit 87d062a94080373995170f51063a9649.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5a0d294d38505ae61293ae1a9184e1b3228ef2af)

8 years agoi965: Don't leak scratch BOs for TCS/TES.
Kenneth Graunke [Sun, 12 Jun 2016 22:44:55 +0000 (15:44 -0700)]
i965: Don't leak scratch BOs for TCS/TES.

These need to be freed too.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 667e5cec760d1908af73a40de28c53848b5b70a0)

8 years agoanv/pipeline: Don't dereference NULL dynamic state pointers
Nanley Chery [Thu, 9 Jun 2016 21:48:00 +0000 (14:48 -0700)]
anv/pipeline: Don't dereference NULL dynamic state pointers

Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.

This fixes a segfault seen in the McNopper demo, VKTS_Example09.

v3 (Jason Ekstrand):
   - Fix disabled rasterization check
   - Revert opaque detection of color attachment usage

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a4a59172482d50318a5ae7f99021bcf0125e0f53)

8 years agoanv: Document and rename anv_pipeline_init_dynamic_state()
Nanley Chery [Thu, 9 Jun 2016 19:12:29 +0000 (12:12 -0700)]
anv: Document and rename anv_pipeline_init_dynamic_state()

To reduce confusion, clarify that the state being copied is not dynamic.

This agrees with the Vulkan spec's usage of the term. Various sections
specify that the various pipeline state which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0d84a9ef9df69606a928cf7dca8f2b80dea1c36)

8 years agonvc0/ir: clamp the UBO index for compute on Kepler
Samuel Pitoiset [Mon, 13 Jun 2016 15:13:28 +0000 (17:13 +0200)]
nvc0/ir: clamp the UBO index for compute on Kepler

We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f257abc1bdd153b3981efffc3f201e1ea5fe843)

8 years agost/va: hardlink driver instances to gallium_drv_video.so
Jimmy Berry [Thu, 21 Apr 2016 13:05:41 +0000 (15:05 +0200)]
st/va: hardlink driver instances to gallium_drv_video.so

Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.

Note: some versions of libva can detect the gallium name and use the
backend. Although that behaviour seems inconsistent since it only works
for some platforms/backends.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0c0f841e5de27d01312f8857641668ca439b1ab1)

8 years agoswr: automake: add missing -I flag
Emil Velikov [Fri, 10 Jun 2016 19:45:01 +0000 (20:45 +0100)]
swr: automake: add missing -I flag

When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.

Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fcb5a75a666fa8eb8efe8e7d45316549b4c53ef3)

8 years agoautomake: add SWR to `make distcheck' gallium drivers
Emil Velikov [Fri, 10 Jun 2016 17:47:32 +0000 (18:47 +0100)]
automake: add SWR to `make distcheck' gallium drivers

Will allows us to catch missing files and build issues before getting
the tarball out for general consumption.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f4d26856df2628b6e0fdeee1e9be36427d8f7b74)

8 years agoconfigure.ac: strip out the llvm-config -march/mtune flags
Emil Velikov [Fri, 10 Jun 2016 16:46:24 +0000 (17:46 +0100)]
configure.ac: strip out the llvm-config -march/mtune flags

Otherwise drivers such as SWR that depend on providing their own values
will fail to build.

v2: Add -mcpu for good measure (Chuck)

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit bab5ab69402594637359289c1b5ec6491e91d252)

8 years agoswr: Add missing headers for package inclusion
Chuck Atkins [Fri, 10 Jun 2016 14:44:28 +0000 (10:44 -0400)]
swr: Add missing headers for package inclusion

CC: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c86fcaca72ed500eac63d5633e0d7ffb77de9acf)

8 years agoautomake: get in-tree `make distclean' working again.
Emil Velikov [Wed, 8 Jun 2016 14:36:18 +0000 (15:36 +0100)]
automake: get in-tree `make distclean' working again.

With earlier commit we've handled the `make distclean' out of tree
build, yet we failed to attribute that for in-tree builds the test
condition will return 1. Thus effectively the target will be considered
as "failed".

Fixes: b7f7ec78435 ("mesa: automake: distclean git_sha1.h when building
OOT")
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 8229fe68b5d19c4aaf674474300319b5f69260b7)

8 years agoi965: Use the correct number of threads for compute shaders.
Kenneth Graunke [Tue, 7 Jun 2016 04:37:34 +0000 (21:37 -0700)]
i965: Use the correct number of threads for compute shaders.

We were programming the number of threads per subslice, when we should
have been programming the total number of threads on the GPU as a whole.

Thanks to Curro and Jordan for helping track this down!

On Skylake GT3e:
- Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
- Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
- Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.

On Broadwell GT2:
- Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
- Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
- Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
  0.255654% (n=25).

On Haswell GT3e:
- Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
  by roughly 1.10x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
- Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
  0.432771% (n=64).

On Ivybridge GT2:
- Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
  by roughly 1.03x.
- Improves performance in Synmark's G/43CSDof by roughly 1.25x.
- No change in Synmark's Gl43CSCloth (n=28).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0fb85ac08d61d365e67c8f79d6955e9f89543560)

8 years agoi965: Assert that the scratch spaces are in range.
Kenneth Graunke [Fri, 10 Jun 2016 01:13:26 +0000 (18:13 -0700)]
i965: Assert that the scratch spaces are in range.

I don't know that anything actually guarantees this, but if we exceed
the limits, we may end up overflowing and trashing random buffers that
happen to be nearby in the VMA space, leading to rendering corruption,
hangs, or worse.

We should really fix this properly.  However, the pitfall has existed
for ages, so for now we should at least detect it.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 1db37ebecf5af55215ace3801f8dbb8b10c5305e)

8 years agoi965: Fix CS scratch size calculations on Ivybridge and Baytrail.
Kenneth Graunke [Fri, 10 Jun 2016 00:30:40 +0000 (17:30 -0700)]
i965: Fix CS scratch size calculations on Ivybridge and Baytrail.

These are linear, not powers of two, and much more limited.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a42a93dc123163f84058f3886e5ce1b02b9856f5)

8 years agoi965: Fix Haswell CS per-thread scratch space encoding.
Kenneth Graunke [Thu, 9 Jun 2016 23:56:31 +0000 (16:56 -0700)]
i965: Fix Haswell CS per-thread scratch space encoding.

Most scratch stages use power of two sizes, in kilobytes, where
0 means 1kB.  But compute shaders on Haswell have a minimum of 2kB,
and use a representation where 0 = 2kB.

This meant that we were effectively telling the hardware to allocate
each thread twice as much space as we meant to, while simultaneously
not allocating that much space in the buffer, leading to overflows.

Note that the existing code is completely wrong for Ivybridge,
but that will take additional work to sort out, so I've left it
as is for now.  A subsequent commit will take care of that.

Together with the previous patches, this fixes rendering corruption
on Synmark's Gl43CSDof on Haswell.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 147a90d82a5de637f968e0d5f383cabcb792f1ce)

8 years agoi965: Account for poor address calculations in Haswell CS scratch size.
Kenneth Graunke [Thu, 9 Jun 2016 23:11:46 +0000 (16:11 -0700)]
i965: Account for poor address calculations in Haswell CS scratch size.

Curro figured this out by investigating the simulator.  Apparently
there's also a workaround in the Windows driver.  I'm not sure it's
actually documented anywhere.

We were underallocating the scratch buffer by a factor of 128/70.

v2: Rename threads_per_subslice to scratch_ids_per_subslice
    (suggested by Jordan Justen).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a7d029d3dfac1da2701be75ff4d1589ac562e916)

8 years agoi965: Allocate scratch space for the maximum number of compute threads.
Kenneth Graunke [Tue, 7 Jun 2016 04:37:34 +0000 (21:37 -0700)]
i965: Allocate scratch space for the maximum number of compute threads.

We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.

Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
threads, EUs, and threads are executing.  We need to allocate enough
for every possible thread that could run.

Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2213ffdb4bb79856f0556bdf2bfd4bdf57720232)

8 years agoi965: Set subslice_total on Gen7/7.5 platforms.
Kenneth Graunke [Thu, 9 Jun 2016 06:36:16 +0000 (23:36 -0700)]
i965: Set subslice_total on Gen7/7.5 platforms.

We'll use this for compute shader thread counts and scratch space
calculations shortly.

Note that subslices are referred to as "half slices" on Ivybridge.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9cd8f95809c21330e4ccbfbe80ee2eea0f7906ae)

8 years agoi965: Fix shared local memory size for Gen9+.
Kenneth Graunke [Thu, 9 Jun 2016 05:21:22 +0000 (22:21 -0700)]
i965: Fix shared local memory size for Gen9+.

Skylake changes the representation of shared local memory size:

 Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
 -------------------------------------------------------------------
 Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
 -------------------------------------------------------------------
 Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |

The old formula would substantially underallocate the amount of space.
This fixes GPU hangs on Skylake when running with full thread counts.

v2: Fix the Vulkan driver too, use a helper function, and fix the table
    in the comments and commit message.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 87d062a94080373995170f51063a9649c96c6dea)

8 years agomesa: add drawbuffer argument to ClearNamedFramebufferfi
Ilia Mirkin [Fri, 10 Jun 2016 03:45:22 +0000 (23:45 -0400)]
mesa: add drawbuffer argument to ClearNamedFramebufferfi

This was fixed in revision 47 of the ARB_dsa spec in Oct 22, 2015. Since
it's horrible to have differing APIs across library versions, we should
attempt to minimize the impact by backporting it as far as possible and
hope no one notices.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d7e015381b25ec639633b63d01d851bc32edf23)

8 years agoGL: update glcorearb.h to svn 32433
Ilia Mirkin [Fri, 10 Jun 2016 04:43:13 +0000 (00:43 -0400)]
GL: update glcorearb.h to svn 32433

This brings in the fixed glClearNamedFramebufferfi definition, as well
as a lot of GLsizei -> GLsizeiptr changes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92351a71a81edb53164f1d62b854036e031bb4a1)

8 years agoGL: update glext to svn 32957
Ilia Mirkin [Fri, 10 Jun 2016 02:55:18 +0000 (22:55 -0400)]
GL: update glext to svn 32957

This brings in defines from GL_EXT_window_rectangles and fixes the
glClearNamedFramebufferfi definition.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f81374fd3e24650adc90d589a307e7cc2f2fa714)

8 years agogallium: Fix region overlap conditions for rectangles with a shared edge
Anuj Phogat [Fri, 11 Dec 2015 22:41:31 +0000 (14:41 -0800)]
gallium: Fix region overlap conditions for rectangles with a shared edge

>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
 -----------
|           |
 ------- ---
|       |   |
|       |   |
 ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 466b3201633a61bc9adfb38397a6fe776cb1cfe3)

8 years agomesa: Fix region overlap conditions for rectangles with a shared edge
Anuj Phogat [Fri, 11 Dec 2015 22:41:30 +0000 (14:41 -0800)]
mesa: Fix region overlap conditions for rectangles with a shared edge

>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
 rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
 to the destination rectangle bounded by the locations (dstX0, dstY 0)
 and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
 while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
     -----------
    |           |
     ------- ---
    |       |   |
    |       |   |
     ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit f8679badd423b61b3a49e1138445f9f3d740fdde)

8 years agoanv/entrypoints: Rework #if guards
Jason Ekstrand [Fri, 10 Jun 2016 19:42:52 +0000 (12:42 -0700)]
anv/entrypoints: Rework #if guards

This reworks the #if guards a bit.  When Emil originally wrote them, he
just guarded everything.  However, part of what anv_entrypoints_gen.py
generates is a hash table for looking up entrypoints based on their name.
This table *cannot* get out of sync between C and python regardless of
preprocessor flags.  In order to prevent this, this commit makes us use
void pointers in the dispatch table for those entrypoints which aren't
available.  This means that the dispatch table size and entry order is
constant and it should never get out-of-sync with the python.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8d37556ec9d7fbbffc5497388a52998ae4fe75de)

8 years agoanv/entrypoints: Use the function pointer types provided by vulkan.h
Jason Ekstrand [Fri, 10 Jun 2016 19:30:05 +0000 (12:30 -0700)]
anv/entrypoints: Use the function pointer types provided by vulkan.h

This is a bit cleaner than generating the types ourselves when making the
table.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9ed0d9dd06414b30d28d7d1301a980784e22d8d6)

8 years agoanv/entrypoints: Emit #if guards for all platforms
Jason Ekstrand [Mon, 6 Jun 2016 21:29:18 +0000 (14:29 -0700)]
anv/entrypoints: Emit #if guards for all platforms

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d1a53f91ee950720b54c35b7d61f0213659533de)

8 years agost/mesa: use base level size as "guess" when available
Nicolai Hähnle [Mon, 6 Jun 2016 21:15:10 +0000 (23:15 +0200)]
st/mesa: use base level size as "guess" when available

When an applications specifies mip levels _before_ setting a mipmap texture
filter, we will initially guess a single texture level. When the second level
image is created, we try to allocate the full texture -- however, we get the
base level size guess wrong if that size is odd. This leads to yet another
re-allocation of the texture later during st_finalize_texture.

Even worse, this re-allocation breaks a (reasonable) assumption made by
st_generate_mipmaps, because the re-allocation in the finalization call will
again allocate a single-level pipe texture (based on the non-mipmap texture
filter!). As a result, mipmap generation fails in interesting ways.

All of this can be avoided by just using the fact that we already know the
size of the base level.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 42624ea837e8f422f1cd04403af915bd7f218b8d)

8 years agoanv: Remove the PhysicalDeviceLimits FINISHME
Jason Ekstrand [Fri, 10 Jun 2016 16:43:45 +0000 (09:43 -0700)]
anv: Remove the PhysicalDeviceLimits FINISHME

At this point, the limits are probably more-or-less correct.  If there is
an invalid limit, that's a bug not a FINSHME.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1e69930e43f2876f662042bb94b76124dbe7dfc)

8 years agoanv/pipeline_cache: Allow for an zero-sized cache
Jason Ekstrand [Mon, 6 Jun 2016 18:20:44 +0000 (11:20 -0700)]
anv/pipeline_cache: Allow for an zero-sized cache

This gets ANV_ENABLE_PIPELINE_CACHE=false working again.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4f5bbf804b5590631abb7ff36b74871a0725f8fa)

8 years agoanv/pipeline: Store the (set, binding, index) tripple in the bind map
Jason Ekstrand [Mon, 6 Jun 2016 18:12:27 +0000 (11:12 -0700)]
anv/pipeline: Store the (set, binding, index) tripple in the bind map

This way the the bind map (which we're caching) is mostly independent of
the pipeline layout.  The only coupling remaining is that we pull the array
size of a binding out of the layout.  However, that size is also specified
in the shader and should always match so it's not really coupled.  This
rendering issues in Dota 2.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1a25db69926604d579139d1d497f1566ec16ac7)

8 years agoanv/descriptor_set: Ensure that bindings are always in increasing order
Jason Ekstrand [Mon, 6 Jun 2016 16:15:03 +0000 (09:15 -0700)]
anv/descriptor_set: Ensure that bindings are always in increasing order

Since applications are allowed to specify some set of bindings which need
not be dense they also need not be in order.  For most things, this doesn't
matter, but it could result getting the wrong dynamic offsets. This adds a
quick-and-dirty sort to ensure that everything is always in increasing
order of binding index.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c13c5ac561f6475d08c35d2a88a829e6ce36e98c)

8 years agoanv/descriptor_set: Add a type field in debug builds
Jason Ekstrand [Mon, 6 Jun 2016 16:12:50 +0000 (09:12 -0700)]
anv/descriptor_set: Add a type field in debug builds

This allows for some extra validation and makes it easier to see what's
going on when poking around in gdb.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e2265926f25235f9be833984a5e365889a70ea74)

8 years agoanv/descriptor_set: Set array_size to zero for non-existant descriptors
Jason Ekstrand [Mon, 6 Jun 2016 16:12:20 +0000 (09:12 -0700)]
anv/descriptor_set: Set array_size to zero for non-existant descriptors

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd21015abd134aa815398e90714631d3f6601294)

8 years agovl/dri3: support receiving new pixmap for front buffer
Leo Liu [Thu, 9 Jun 2016 16:53:54 +0000 (12:53 -0400)]
vl/dri3: support receiving new pixmap for front buffer

With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, should get a new
front buffer for it.

This also fixes Totem player playback corruption.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2ad443e4cc0a72b7d0b28195b5810cbf197961cb)

8 years agovl/dri3: get Makefile properly
Leo Liu [Thu, 9 Jun 2016 17:11:52 +0000 (13:11 -0400)]
vl/dri3: get Makefile properly

From original commit, the macro "if HAVE_DRI3" was in Makefile.sources,
this file is shared with SCons, SCons is not able to parse this marco,
the SCons build failed. Jose quickly gave two approaches and quick fix
with his second approach, thanks Jose for the solutions and fixes.

This patch is Jose's first approach, and it's more proper, because the
dri3 c file should not be included to build when DRI3 is not enabled.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ef8500aabb5430eae76919438fcf61020b7eb8e)

8 years agoglx: fix crash with bad fbconfig
Daniel Czarnowski [Wed, 10 Feb 2016 17:36:05 +0000 (09:36 -0800)]
glx: fix crash with bad fbconfig

GLX documentation states:
glXCreateNewContext can generate the following errors: (...)
GLXBadFBConfig if config is not a valid GLXFBConfig

Function checks if the given config is a valid config and sets proper
error code.

Fixes currently crashing glx-fbconfig-bad Piglit test.

v2: coding style cleanups (Emil, Topi)
    use DefaultScreen macro (Emil)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf804b4455fac9e585b3600a8318caaced9c23de)

8 years agoi965: Emit surface states for extra planes prior to gen8
Jason Ekstrand [Thu, 9 Jun 2016 02:56:46 +0000 (19:56 -0700)]
i965: Emit surface states for extra planes prior to gen8

When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8
but not for gen7 or earlier.  It all works, we just need to emit the states
for the extra planes.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 037ce5d7343829a69ec9c7361a0964bc1366b019)

8 years agovirgl: fix checking fences
Marc-André Lureau [Tue, 7 Jun 2016 12:54:34 +0000 (14:54 +0200)]
virgl: fix checking fences

When calling virgl_fence_wait() with timeout=0,
virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE
for a busy resource, whereace virgl_fence_wait() should return TRUE for
a completed (non-busy) resource.

This fixes running supertuxkart in a VM (I could not reproduce locally
with vtest though there is a similar fix)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dc81b3ad43dde0815baf957e7cf4c633d6f350f8)

8 years agost/mesa: directly compute level=0 texture size in st_finalize_texture
Nicolai Hähnle [Tue, 7 Jun 2016 20:40:49 +0000 (22:40 +0200)]
st/mesa: directly compute level=0 texture size in st_finalize_texture

The width0/height0/depth0 on stObj may not have been set at this point.
Observed in a trace that set up levels 2..9 of a 2d texture, and set the base
level to 2, with height 1. This made the guess logic always bail.

Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat
redundant storage of width0/height0/depth0 and makes sure we always compute
pipe texture sizes that are compatible with the base level image of the
GL texture.

Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul.

v2:
- try to re-use an existing pipe texture when possible
- handle a corner case where the base level is not level 0 and it is of
  size 1x1x1

v3:
- ptHeight = ptWidth in cube map 1x1 case (suggested by Brian)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit bd5c41fe5fdfbef80959b5738b0372b81bef1f2f)

8 years agost/mesa: use buffer usage history to set dirty flags for revalidation
Ilia Mirkin [Sat, 4 Jun 2016 17:26:46 +0000 (13:26 -0400)]
st/mesa: use buffer usage history to set dirty flags for revalidation

We were previously unconditionally doing this for arrays and ubo's, and
ignoring texture/storage/atomic buffers. Instead use the usage history
to determine which atoms need to be revalidated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6e6fd911da8a1d9cd62fe0a8a4cc0fb7bdccfe02)

8 years agogallium/radeon: don't allocate DCC for non-renderable texture formats
Marek Olšák [Sun, 5 Jun 2016 23:29:14 +0000 (01:29 +0200)]
gallium/radeon: don't allocate DCC for non-renderable texture formats

R9G9B9E5 is the only uncompressed one hopefully.

This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d4d733e39de2fc75aaa17d95998abdf19219cb38)

8 years agotgsi/scan: add uses_derivatives (v2)
Nicolai Hähnle [Wed, 1 Jun 2016 11:17:29 +0000 (13:17 +0200)]
tgsi/scan: add uses_derivatives (v2)

v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d3a584defec988faa09960adea90545399440827)

8 years agost/mesa: revalidate image atoms when a texture is updated
Ilia Mirkin [Sat, 4 Jun 2016 17:25:35 +0000 (13:25 -0400)]
st/mesa: revalidate image atoms when a texture is updated

A texture may be redefined with _NEW_TEXTURE, which might have been
bound to a shader image slot. We have to revalidate the image atoms to
pick up on the new resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c81b090c920f90bf86a34c978e10ff336d1edbc0)

8 years agogk104/ir: fix conditions for adding a texbar
Ilia Mirkin [Tue, 7 Jun 2016 01:25:05 +0000 (21:25 -0400)]
gk104/ir: fix conditions for adding a texbar

Sometimes a register source can actually be double- or even quad-wide.
We must make sure that the inserted texbars take that width into
account.

Based on an earlier patch by Samuel Pitoiset.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 71ad8a173f5c64d6384c13f04361455571c42ffe)

8 years agoi965/gen8: fix cull distance emission for tessellation shaders.
Dave Airlie [Tue, 7 Jun 2016 00:27:44 +0000 (10:27 +1000)]
i965/gen8: fix cull distance emission for tessellation shaders.

This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c295923d139b2c2daf169c94d9edcca23527289b)

8 years agonv50/ir: use round toward 0 when converting doubles to integers
Samuel Pitoiset [Mon, 6 Jun 2016 19:12:15 +0000 (21:12 +0200)]
nv50/ir: use round toward 0 when converting doubles to integers

Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.

This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08ddfe7b2fa9f577ba00c8c05c5604460942f5a8)

8 years agomesa/program_resource: return -1 for index if no location.
Dave Airlie [Mon, 23 May 2016 20:41:21 +0000 (06:41 +1000)]
mesa/program_resource: return -1 for index if no location.

The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."

This fixes:
GL45-CTS.program_interface_query.output-built-in

[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07403014c3a29bfdecc89def187389ac9f208529)

8 years agoradeonsi: set descriptor dirty mask on shader buffer unbind
Nicolai Hähnle [Fri, 3 Jun 2016 13:17:25 +0000 (15:17 +0200)]
radeonsi: set descriptor dirty mask on shader buffer unbind

Found randomly while skimming the code. This might have caused VM faults in
robustness tests.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ec2b52e2d9d7114df6e6023eec5949c4c6a76d52)

8 years agoi965/gs/scalar: Fix load input for doubles
Samuel Iglesias Gonsálvez [Fri, 27 May 2016 09:59:48 +0000 (11:59 +0200)]
i965/gs/scalar: Fix load input for doubles

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b648ec17c2934802dd56452d11d78ec2d525a06)

8 years agoi965/fs: fix offset when loading double vector input varyings
Samuel Iglesias Gonsálvez [Thu, 26 May 2016 05:56:37 +0000 (07:56 +0200)]
i965/fs: fix offset when loading double vector input varyings

When we are not packing a double input varying, we might need to
read its data in a non-aligned to 64-bit offset, so we read
the wrong data. This is happening when using explicit locations
in varyings because Mesa disables packing varying for that case.

const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value could be not aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d6f82a294ad1ab1eab0020cf65df5ecc9591272)

8 years agoi965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings
Samuel Iglesias Gonsálvez [Thu, 26 May 2016 05:56:38 +0000 (07:56 +0200)]
i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings

Data starts at suboffet 3 in 32-bit units (12 bytes), so it is not
64-bit aligned and the current implementation fails to read the data
properly. Instead, when there is is a double input varying, read it as
vector of floats with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb30727648fea301cfff1647d947bfab540c3bf6)

8 years agoglsl: geom shader max_vertices layout must match.
Dave Airlie [Fri, 3 Jun 2016 00:45:07 +0000 (10:45 +1000)]
glsl: geom shader max_vertices layout must match.

From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs".
"all geometry shader output vertex count declarations in a
program must declare the same count."

Fixes:
GL45-CTS.geometry_shader.output.conflicted_output_vertices_max

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4c863993780a11cea6f88fa0682796bee5794042)

8 years agoi965: don't use NumLayers for 3D textures.
Dave Airlie [Fri, 3 Jun 2016 01:36:38 +0000 (11:36 +1000)]
i965: don't use NumLayers for 3D textures.

For 3D textures we shouldn't be using NumLayers, we need
to get it from the depth.

This fixes:
GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ff2e569153d48bda347be729fc441852ab293138)

8 years agoglsl: for anonymous struct matching use without_array() (v3)
Dave Airlie [Mon, 6 Jun 2016 00:33:51 +0000 (10:33 +1000)]
glsl: for anonymous struct matching use without_array() (v3)

With tessellation shaders we can have cases where we have
arrays of anon structs, so make sure we match using without_array().

Fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in

v2:
test lengths match as well (Ilia)
v3:
descend array lengths to check for matches as well (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1f66a4b68955b9d6cae629cec90e5e0f301c6a7a)

8 years agoglsl/ast: don't crash when func_name is NULL
Dave Airlie [Tue, 3 May 2016 04:39:06 +0000 (14:39 +1000)]
glsl/ast: don't crash when func_name is NULL

This fixes a crash in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

If we can't find the func_name in one of these paths,
we have emitted an earlier error so just return here.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6702c1581092bce230742d02ebf9325f68bd247a)

8 years agoglsl: handle ast_aggregate in has_sequence_subexpression. (v2)
Dave Airlie [Tue, 3 May 2016 07:16:27 +0000 (17:16 +1000)]
glsl: handle ast_aggregate in has_sequence_subexpression. (v2)

GL43-CTS.compute_shader.work-group-size does
uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 };

The initializer triggers the GLSL 4.30/GLES3 tests
for constant sequence subexpressions, so it doesn't
happen unless you are using those, so just return
false as this path is now reachable.

v2: update commit msg with diagnosis
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4336196b7fc61166a36babdabdf5dd004ec9ee55)

8 years agonv50,nvc0: fix BGR10_A2UI vertex format
Ilia Mirkin [Sun, 5 Jun 2016 19:00:36 +0000 (15:00 -0400)]
nv50,nvc0: fix BGR10_A2UI vertex format

This is mostly academic as this is not reachable from GL, which only has
the packed RGB10_A2UI vertex format.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 092ec3920f7f9a57fcc1c859a477e891752f5c1f)

8 years agonvc0: do not clear surfaces bins in the validate function
Samuel Pitoiset [Sun, 5 Jun 2016 16:53:26 +0000 (18:53 +0200)]
nvc0: do not clear surfaces bins in the validate function

We should not call nouveau_bufctx_reset() inside a validate function.
This only affects Fermi where images are aliased between 3D and CP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit be365f34f04112572550337f387b61ed1ba69acd)

8 years agonvc0: re-validate images after launching a grid on Fermi
Samuel Pitoiset [Sun, 5 Jun 2016 16:01:19 +0000 (18:01 +0200)]
nvc0: re-validate images after launching a grid on Fermi

Images invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a computer shader
to help in fixing 3D<->CP interaction.

However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.

This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d3ecfb338d79d9f0ccae22b316653af461032a)

8 years agonvc0: reduce overhead from always marking images dirty
Ilia Mirkin [Sat, 4 Jun 2016 18:13:38 +0000 (14:13 -0400)]
nvc0: reduce overhead from always marking images dirty

We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fd6bbc2ee205ed02f66a8d8ef5b2adf4005d588c)

8 years agonvc0: reduce overhead from always marking buffers dirty
Ilia Mirkin [Sat, 4 Jun 2016 17:50:21 +0000 (13:50 -0400)]
nvc0: reduce overhead from always marking buffers dirty

We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0f673db6f08995312a2f5ed04aba191d59910099)

8 years agonvc0: fix memory barrier flag handling
Ilia Mirkin [Fri, 3 Jun 2016 01:36:04 +0000 (21:36 -0400)]
nvc0: fix memory barrier flag handling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e8ee161b160f6ff18134ff279b046f2e056bccd0)

8 years agonvc0: mark bound buffer range valid
Ilia Mirkin [Fri, 3 Jun 2016 01:42:14 +0000 (21:42 -0400)]
nvc0: mark bound buffer range valid

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 29abbeecd80bb2cbf48c2c4c13155463d60c70f5)

8 years agor600g: write WAIT_UNTIL in the correct place
Marek Olšák [Tue, 31 May 2016 21:07:15 +0000 (23:07 +0200)]
r600g: write WAIT_UNTIL in the correct place

This has been wrong all along. Fixing this will allow removing useless
cache flushes.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 7746903d3a68c97e86b88d5aa16995015b4db4ba)

8 years agoanv/blit: Use CLAMP_TO_EDGE for scaled blits
Jason Ekstrand [Thu, 2 Jun 2016 23:34:11 +0000 (16:34 -0700)]
anv/blit: Use CLAMP_TO_EDGE for scaled blits

When upscaling you can end up interpolating between the edge pixel and one
past the edge.  Using CLAMP_TO_EDGE seems like the most reasonable thing to
do in this case.  This fixes two of the new Vulkan CTS tests in
dEQP-VK.api.copy_and_blit.blit_image.*

Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 441194edd91c2ca9d2b219f7e16ba2c7783396df)

8 years agoanv/copy: Account for the anv_surface.offset when creating a blit2d_surf
Jason Ekstrand [Thu, 2 Jun 2016 23:25:44 +0000 (16:25 -0700)]
anv/copy: Account for the anv_surface.offset when creating a blit2d_surf

This was causing problems if the user tried to copy to/from the stencil
portion of a combined depth/stencil image.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9313a568169bddabaac675361497efff4a0709d6)

8 years agonir/spirv: Make a decoration switch complete
Jason Ekstrand [Thu, 2 Jun 2016 21:36:58 +0000 (14:36 -0700)]
nir/spirv: Make a decoration switch complete

Getting rid of the default case makes the compiler warn if we are missing
cases.  While we're here, we also add the one missing case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 526a8de22d85bcdc3a46b2e16e0073527616d95d)

8 years agonir/spirv: Make unhandled decorations and capabilities non-fatal
Jason Ekstrand [Thu, 2 Jun 2016 21:34:15 +0000 (14:34 -0700)]
nir/spirv: Make unhandled decorations and capabilities non-fatal

glslang frequently throw bogus decorations into shaders.  While we are free
to assert-fail, it's a bit nicer to the application to just warn.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 62c6e94bd69a1287ebb4908f4ee67b8f730466cc)

8 years agonir/spirv: Add a way to print non-fatal warnings
Jason Ekstrand [Thu, 2 Jun 2016 21:32:56 +0000 (14:32 -0700)]
nir/spirv: Add a way to print non-fatal warnings

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed14d21d04bf5d58cfadac525b8fd17552378507)

8 years agonir/spirv: Add string lookup tables for a couple of SPIR-V enums
Jason Ekstrand [Thu, 2 Jun 2016 21:06:30 +0000 (14:06 -0700)]
nir/spirv: Add string lookup tables for a couple of SPIR-V enums

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e46a5d1551854b98b0ca3c773a17f3ea5d2f7c6)

8 years agonir/spirv: Complete the list of capabilities
Jason Ekstrand [Thu, 2 Jun 2016 20:43:19 +0000 (13:43 -0700)]
nir/spirv: Complete the list of capabilities

Previously we supported a subset of capabilities and just left a default
case for the others.  It's time to stop being lazy and actually audit the
capabilities.  This should bring them up-to-date with reality.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5a1e56f344c0007f877ada78721c33d4fe61419a)

8 years agoanv/pipeline: Add support for early depth stencil
Jason Ekstrand [Wed, 1 Jun 2016 03:16:01 +0000 (20:16 -0700)]
anv/pipeline: Add support for early depth stencil

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fa958e95b9ffd61283d20d5d14a0ac007c1def7)

8 years agoi965/fs Add a wm_prog_data bit for has_side_effects
Jason Ekstrand [Thu, 2 Jun 2016 01:46:30 +0000 (18:46 -0700)]
i965/fs Add a wm_prog_data bit for has_side_effects

This is more accurate than calling
_mesa_active_fragment_shader_has_side_effects because it looks at whether
or not the SSBOs, images, or atomic buffers are actually written rather
than just existing in the program.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3fb289f957a8a27349a6f7df03983f92d9b6cf64)

8 years agoanv/pipeline: Silently pass tests if depth or stencil is missing
Jason Ekstrand [Wed, 1 Jun 2016 05:23:18 +0000 (22:23 -0700)]
anv/pipeline: Silently pass tests if depth or stencil is missing

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 56a178922fa200c98cbbb177e5fab106ad01072b)

8 years agoanv/pipeline: Unify gen7/8 emit_ds_state
Jason Ekstrand [Wed, 1 Jun 2016 05:19:53 +0000 (22:19 -0700)]
anv/pipeline: Unify gen7/8 emit_ds_state

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bc7f7e195389e334ae8a5ee7d6f85b9e27983ca4)

8 years agogenxml/gen6,7,75: s/BackFace/Backface
Jason Ekstrand [Wed, 1 Jun 2016 05:15:38 +0000 (22:15 -0700)]
genxml/gen6,7,75: s/BackFace/Backface

This is more consistent with gen8+

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdc3c5dd05dc072fe9b4975091308d02e6df6037)

8 years agonir/spirv: Handle the WorkgroupSize builtin decoration
Jason Ekstrand [Wed, 1 Jun 2016 18:20:22 +0000 (11:20 -0700)]
nir/spirv: Handle the WorkgroupSize builtin decoration

This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests
in the latest dev version of the Vulkan CTS.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1f7b54ed299bea95f774e7d8baa181c11118b3fe)

8 years agonir/spirv: Use breaks instead of returns in constant handling
Jason Ekstrand [Wed, 1 Jun 2016 17:34:04 +0000 (10:34 -0700)]
nir/spirv: Use breaks instead of returns in constant handling

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b26cdd65e8e588ed7bdd7cca0b205579132e1261)

8 years agoanv/pipeline: Refactor specialization constant handling a bit
Jason Ekstrand [Tue, 31 May 2016 23:27:19 +0000 (16:27 -0700)]
anv/pipeline: Refactor specialization constant handling a bit

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a19ae36ce52e72e2b93251ba0a43d683bd5e58cc)

8 years agonir/lower_indirect_derefs: Use the direct array deref for recursion
Jason Ekstrand [Tue, 31 May 2016 22:02:10 +0000 (15:02 -0700)]
nir/lower_indirect_derefs: Use the direct array deref for recursion

This fixes about 100 of the new Vulkan CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 45542f554ca01b00b3d4674cf90575dff7904736)

8 years agoanv/clear: Handle ClearImage on 3-D images
Jason Ekstrand [Tue, 31 May 2016 18:26:06 +0000 (11:26 -0700)]
anv/clear: Handle ClearImage on 3-D images

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 59f06ac3892fefd55b8f1371b48f9e0d99cc7c23)

8 years agoRevert "i965/fs: Allow scalar source regions on SNB math instructions."
Francisco Jerez [Fri, 3 Jun 2016 19:32:15 +0000 (12:32 -0700)]
Revert "i965/fs: Allow scalar source regions on SNB math instructions."

This reverts commit c1107cec44ab030c7fcc97c67baa12df1cc9d7b5.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB, the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode.  Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):

   es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
   es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
   es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_float_frag_xvary
   es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
   es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
   es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7244dc1e0651958b62222cafb15e34487851a6cd)

8 years agoi965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).
Francisco Jerez [Wed, 1 Jun 2016 23:27:52 +0000 (16:27 -0700)]
i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).

The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction.  This prevents
cmod propagation from (mis)optimizing:

 cmp.z.f0 tmp, ...
 mov.z.f0 null, tmp

into:

 cmp.z.f0 tmp, ...

which gives the negation of the flag result of the original sequence.
I originally noticed this while working on SIMD32 in the scalar
back-end, but the same scenario is likely to be possible in vec4
programs so this commit ports the bugfix with the same name from the
scalar back-end to the vec4 cmod propagation pass.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a2135c6fd95d3e48f222a702c2b814e3cf37eb7d)

8 years agoanv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS
Emil Velikov [Fri, 3 Jun 2016 23:20:53 +0000 (00:20 +0100)]
anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS

Otherwise we will fail to find the headers in some scenarios.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
(cherry picked from commit 7a3a0d921235712fa8d22f85552cc382a793ce95)

8 years agomesa/get: return correct value for layer provoking vertex.
Dave Airlie [Fri, 3 Jun 2016 02:26:05 +0000 (12:26 +1000)]
mesa/get: return correct value for layer provoking vertex.

This fixes:
GL45-CTS.geometry_shader.layered_rendering.layered_rendering

on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d10ae20b9678f1a5b8a81716c68e612662665277)

8 years agonvc0: mark buffer texture range valid for shader images
Samuel Pitoiset [Thu, 2 Jun 2016 22:00:27 +0000 (00:00 +0200)]
nvc0: mark buffer texture range valid for shader images

Loosely based on radeonsi (Thanks to Nicolai).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 28590eb9492f2e06c95483df7bd7f67b0fee7b8e)

8 years agoi965/fs: Reindent emit_zip().
Francisco Jerez [Fri, 27 May 2016 08:02:19 +0000 (01:02 -0700)]
i965/fs: Reindent emit_zip().

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 060c8d245deb83aeb412de98810cad6052aafb78)

8 years agoi965/fs: Skip SIMD lowering destination zipping if possible.
Francisco Jerez [Fri, 27 May 2016 07:45:04 +0000 (00:45 -0700)]
i965/fs: Skip SIMD lowering destination zipping if possible.

Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out safely without breaking the program are rather
complex: The destination must be exactly one component of at most the
execution width of the lowered instruction, and all source regions of
the instruction must be either fully disjoint from the destination or
be aligned with it group by group.

v2: Don't handle partial source-destination overlap for simplicity
    (Jason).  No instruction count regressions with respect to v1 in
    either shader-db or the few FP64 shader_runner test-cases with
    partial overlap I've checked manually.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7aa76d66a1f5edad9e8c1d54aafdce99ffa6c345)

8 years agoblorp: Fix 16x multisample scaled blits
Anuj Phogat [Thu, 2 Jun 2016 18:05:44 +0000 (11:05 -0700)]
blorp: Fix 16x multisample scaled blits

Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled
(with added 16x sample support) now passes with this patch.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 75da9c9933a97e6f2baf0884b98350df800ee785)

8 years agomesa/copyimage: report INVALID_VALUE for missing cube face
Dave Airlie [Thu, 2 Jun 2016 04:13:18 +0000 (14:13 +1000)]
mesa/copyimage: report INVALID_VALUE for missing cube face

The specs says INVALID_VALUE for exceeding dimensions,
which is really what is happening here.

This fixes:
GL45-CTS.copy_image.non_existent_mipmap

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit af7bf610cf74c6805f42babbcf85bc88b2b9453d)