OSDN Git Service

android-x86/external-mesa.git

Kenneth Graunke [Thu, 12 Jan 2017 05:38:52 +0000 (21:38 -0800)]
i965: Always scissor on Gen6-7.5 instead of disabling guardband.

Previously we disabled the guardband when the viewport was smaller than
the framebuffer on Gen6-7.5, to prevent portions of primitives from
being drawn outside of the viewport.  On Gen8+, we relied on the viewport
extents test to effectively scissor this away for us.

We can simply always enable scissoring instead.  We already include the
viewport in the scissor rectangle, so this will effectively do the
viewport extents test for us.  (The only difference is that the scissor
rectangle doesn't support sub-pixel values.  I think that's okay.)

Given that the viewport extents test is essentially a second scissor,
and is enabled for basically all 3D drawing on Gen8+, it stands to
reason that scissoring is cheap.  Enabling the guardband reduces the
cost of clipping, which is expensive.

The Windows driver appears to never disable guardband clipping, and
appears to use scissoring in this case.  I don't know if they leave
it on universally though.

This fixes misrendering in Blender, where the "floor plane" grid lines
started rendering at wrong angles after I disabled XY clipping of line
primitives.  Enabling the guardband seems to solve the issue.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99339
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ece0e535a44c228dd994861592deb155c14740d8)

Jason Ekstrand [Sat, 21 Jan 2017 11:50:42 +0000 (03:50 -0800)]
i965: Use a better guardband calculation.

(Patch co-authored by Jason and Ken.)

We scaled the guardband based on the viewport size, but failed to
take into account the translation portion of the viewport transform.

This meant the guardband was always centered around the origin.
We want it to be centered around the screen-space drawing area,
which is the intersection of the viewport and the render target.

At best, getting this wrong would reduce the guardband's effectiveness
in some cases.  At worst, it might break things - objects outside of the
guardband are trivially rejected, so getting the guardband in the wrong
place and leaving guardband clipping enabled could cause problems.
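The centering described above can be sketched in one dimension. This is a minimal illustration, not the code from this patch: the helper name, the `screen = scale * ndc + translate` form of the viewport transform, the `[0, fb_size]` render target, and the caller-supplied half-extent `gb_half` are all assumptions, and `scale` is assumed to be non-zero.

```c
/* Hypothetical helpers; written out to avoid depending on libm. */
static float min_f(float a, float b) { return a < b ? a : b; }
static float max_f(float a, float b) { return a > b ? a : b; }
static float abs_f(float a) { return a < 0.0f ? -a : a; }

/* Hypothetical 1D guardband calculation.  Assumes the viewport transform
 * is screen = scale * ndc + translate, with scale != 0. */
static void
guardband_1d(float scale, float translate, float fb_size, float gb_half,
             float *ndc_min, float *ndc_max)
{
   /* Screen-space viewport extents; scale may be negative (flipped). */
   float vp_min = translate - abs_f(scale);
   float vp_max = translate + abs_f(scale);

   /* Drawing area: intersection of the viewport and the render target. */
   float area_min = max_f(vp_min, 0.0f);
   float area_max = min_f(vp_max, fb_size);
   if (area_min > area_max) {
      /* Empty intersection: fall back to the viewport itself. */
      area_min = vp_min;
      area_max = vp_max;
   }

   /* Center the guardband on the drawing area, in screen space. */
   float center = 0.5f * (area_min + area_max);

   /* Invert the viewport transform to get back to NDC. */
   float a = (center - gb_half - translate) / scale;
   float b = (center + gb_half - translate) / scale;

   /* A negative scale swaps the two ends. */
   *ndc_min = min_f(a, b);
   *ndc_max = max_f(a, b);
}
```

When the viewport hangs off the edge of the render target, the result is no longer symmetric around zero in NDC, which is the effect of accounting for the translation.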

v2: drop clamping of positive maximums.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f3c068c5c89c8c3dce257ecc2b640f375d3f4836)

Kenneth Graunke [Sat, 21 Jan 2017 22:10:15 +0000 (14:10 -0800)]
i965: Combine the Gen6 SF and Clip viewport atoms.

The next patch will make the guardband calculation dependent on the
transformation matrix.  Instead of computing it in both atoms, just
combine them into a single atom.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 89ad7f1be6a607b33ffb388516b5d0547b491c33)

Dave Airlie [Tue, 7 Feb 2017 00:31:11 +0000 (10:31 +1000)]
radv: pass FMASK alignment to application

As was done for dcc and cmask.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 90ac2285f0900972d2a3e7a034b51ee4de374ffb)

Bas Nieuwenhuizen [Mon, 6 Feb 2017 23:45:11 +0000 (00:45 +0100)]
radv: Pass DCC alignment to application.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
(cherry picked from commit 47ca0f537dfbc03f0eb0cb12fdee06dbe664fbc7)

Bas Nieuwenhuizen [Mon, 6 Feb 2017 23:24:16 +0000 (00:24 +0100)]
radv: Pass CMASK alignment to application.

CMASK alignment can be greater than image data alignment, so pass
it to the app so that it knows what alignment the backing memory
should have.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit eb01b20cc41e9501062eb25034069e484f8b1899)

Dave Airlie [Mon, 6 Feb 2017 02:40:45 +0000 (02:40 +0000)]
radv/ac: avoid the fmask path when doing txs.

This fixes the vulkan samples deferredmultisampling test.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a864ef7f4824a8319b74d4cf1c82e2dd25672ff1)

Bruce Cherniak [Thu, 2 Feb 2017 20:15:08 +0000 (14:15 -0600)]
swr: [rasterizer core] Remove dead code Clipper::ClipScalar()

Clipper::ClipScalar() is dead code and should be removed.  It is causing
an error with gcc-7 because it references a now defunct member.

v2: includes bugzilla reference, same code change

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633
CC: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit bf29495dcdb290c8b15cacd2001603b8ae5d36c8)

Nicolai Hähnle [Fri, 27 Jan 2017 10:55:14 +0000 (11:55 +0100)]
dri/common: clear the loaderPrivate pointer in driDestroyDrawable

The GLX specification says about glXDestroyPixmap:

    "The storage for the GLX pixmap will be freed when it is not current
     to any client."

We're not really following this language to the letter: some of the storage
is freed immediately (in particular, the dri3_drawable, which contains both
GLXDRIdrawable and loader_dri3_drawable). So we NULL out the pointers to
that freed storage; the previous patches added the corresponding NULL-pointer
checks.

This fixes memory corruption in piglit
./bin/glx-visuals-depth/stencil -pixmap -auto

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7be0e602ed82d25b9f0db77748618c663d9cbfe7)

Nicolai Hähnle [Thu, 2 Feb 2017 17:06:27 +0000 (18:06 +0100)]
glx: guard swap-interval functions against destroyed drawables

The GLX specification says about glXDestroyPixmap:

    "The storage for the GLX pixmap will be freed when it is not current
     to any client."

So arguably, functions like glXSwapIntervalMESA can be called after
glXDestroyPixmap has been called for the currently bound GLXPixmap.
In that case, the GLXDRIDrawable no longer exists, and so we just skip
those calls.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f446f3fb33528eebe9b120340fca3ac5c5ba518d)

Nicolai Hähnle [Thu, 2 Feb 2017 17:01:06 +0000 (18:01 +0100)]
glx/dri3: guard in_current_context against a disappeared drawable

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 21ec35566be2c1aca07083a67f462618ae15fa86)

Nicolai Hähnle [Fri, 27 Jan 2017 10:58:41 +0000 (11:58 +0100)]
glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion

With a subsequent patch, we might see NULL loaderPrivates, e.g. when
a DRIdrawable is flushed whose corresponding GLXDRIdrawable was destroyed.
This resulted in a crash, since the loader vs. DRI3 drawable structures
have a non-zero offset.
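Why a NULL input crashes here can be sketched with stand-in types. The struct and function names below are hypothetical, not the real GLX/DRI3 definitions; the only property carried over from the message is that the embedded loader structure sits at a non-zero offset inside its container.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the loader and GLX-side drawable structures. */
struct loader_drawable {
   int width, height;
};

struct glx_drawable {
   int xid;                        /* precedes 'loader': offset is non-zero */
   struct loader_drawable loader;
};

/* Converts a pointer to the embedded member back to its container. */
static struct glx_drawable *
loader_to_glx_drawable(struct loader_drawable *draw)
{
   /* Without this check, NULL would turn into (NULL - offset), a bogus
    * non-NULL pointer that later NULL checks cannot catch. */
   if (draw == NULL)
      return NULL;

   return (struct glx_drawable *)
      ((char *)draw - offsetof(struct glx_drawable, loader));
}
```

Callers can then test the result against NULL in the usual way, which is what a destroyed drawable requires.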

Fixes glx-visuals-{depth,stencil} -pixmap

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 40c304fc065259c07c0f4f7a35efedd205e4f250)

Dave Airlie [Fri, 3 Feb 2017 03:26:13 +0000 (03:26 +0000)]
radv: fix shared memory load/stores.

If we have an indirect index here we need to scale it by attribute slots,
e.g. if this is vec2[256] then we get an indir_index in the 0..255 range
but the vec2s are aligned inside vec4 slots. So scale the indir index,
then extract the channels.
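The scaling can be illustrated with plain index arithmetic. This is only a model of the layout described above, not the LLVM IR the driver emits; the function name and the dword-based addressing are assumptions.

```c
/* Assumed layout: every array element occupies one vec4 slot (4 dwords),
 * however many components the element type really has. */
enum { DWORDS_PER_SLOT = 4 };

/* Hypothetical helper: dword offset of one channel of element
 * 'indir_index' of an array that starts at slot 'base_slot'. */
static unsigned
shared_dword_offset(unsigned base_slot, unsigned indir_index, unsigned channel)
{
   /* Scale the index by the slot size first, then pick the channel. */
   return (base_slot + indir_index) * DWORDS_PER_SLOT + channel;
}
```

For a vec2[256] array, element 3 channel 1 lands at dword 13 rather than dword 7, because the two unused channels of each slot still take up space.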

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 106a51440d018031b94c91758eecc7424a3bb5ee)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/amd/common/ac_nir_to_llvm.c

Dave Airlie [Fri, 3 Feb 2017 01:46:24 +0000 (01:46 +0000)]
radv/ac: correctly size shared memory usage.

We count the number of slots used, but slots are vec4 sized,
so we have to scale by 16 not 4.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a1a8aef4c9dbdf254036adada95f0d6e394c5d6a)

Samuel Pitoiset [Thu, 2 Feb 2017 17:40:18 +0000 (18:40 +0100)]
winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()

cs can be NULL when it comes from r600_buffer_map_sync_with_rings()
to avoid doing the same checks. It was checked for write mappings
but not for read mappings.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit af303abcdbeac3b90fb760de19bed56cc40cfff4)

Ilia Mirkin [Thu, 26 Jan 2017 03:31:58 +0000 (22:31 -0500)]
st/mesa: MAX_VARYING is the max supported number of patch varyings, not min

This fixes
GL45-CTS.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
on nouveau. We only support 30 patch varyings (as 2 vec4 slots end up
being used for tess level settings), but were getting 32 exposed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d3f9ed71c71637a91ecf15f50dbe7578a65d57e)

Ilia Mirkin [Wed, 1 Feb 2017 21:11:41 +0000 (16:11 -0500)]
vbo: process buffer binding state changes on draw when recording

The VBO module keeps track of any vbo buffers. It updates this list when
receiving an InvalidateState call, however this never happens when
recording draws right now. Make sure that we do all the usual state
updates when recording draws so that the VBO list may be kept up to
date.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99631
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e73f87fcbdcb12b0b8d28c4ca3444bfb7669bca5)

Marek Olšák [Thu, 2 Feb 2017 18:42:22 +0000 (19:42 +0100)]
Revert "radeonsi: decrease the number of texture slots to 24"

This reverts commit bdd860e3076655519d45bd66936ef7be9b7dda63.

Requested by a game developer.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit dfe111368d11aaffae7f8738c858c335cdec1e9d)

Nanley Chery [Wed, 1 Feb 2017 03:01:18 +0000 (19:01 -0800)]
anv/pass: Store the depth-stencil attachment's last subpass index

Commit 968ffd6c868af7226e8f889573eef709888151cb stored the last subpass
index of all the attachments but that of the depth-stencil attachment.
This could cause depth buffers used in multiple subpasses not to be in
the requested final layout. Fix this error.
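The bookkeeping at issue can be sketched as follows. The structures and names are hypothetical and much simpler than the driver's (input attachments are left out entirely); the point is only that the depth-stencil attachment needs the same "last use" update as the color attachments.

```c
#include <stdint.h>

#define NO_ATTACHMENT UINT32_MAX   /* hypothetical "unused" marker */
#define MAX_COLOR 8

/* Hypothetical, simplified subpass description. */
struct subpass {
   uint32_t color_count;
   uint32_t color[MAX_COLOR];      /* attachment indices */
   uint32_t depth_stencil;         /* attachment index or NO_ATTACHMENT */
};

/* last_use[a] receives the index of the last subpass using attachment a.
 * The caller sizes last_use to cover every attachment index that appears. */
static void
record_last_subpass(const struct subpass *subpasses, uint32_t subpass_count,
                    uint32_t *last_use)
{
   for (uint32_t s = 0; s < subpass_count; s++) {
      for (uint32_t c = 0; c < subpasses[s].color_count; c++) {
         if (subpasses[s].color[c] != NO_ATTACHMENT)
            last_use[subpasses[s].color[c]] = s;
      }

      /* The case the message describes as missing: depth-stencil too. */
      if (subpasses[s].depth_stencil != NO_ATTACHMENT)
         last_use[subpasses[s].depth_stencil] = s;
   }
}
```

With the depth-stencil update in place, a depth buffer used in subpasses 0 and 2 is recorded as last used in subpass 2, so the final layout transition can be scheduled after it.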

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 043d92fef9315dcc303f36d472eb38b5511bb2cd)

Matt Turner [Tue, 24 Jan 2017 00:48:01 +0000 (16:48 -0800)]
vulkan: Don't install vk_platform.h or vulkan.h.

These files belong to the vulkan loader.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 045f38a50759bb225cb179703bc7050f6de752b1)

Tapani Pälli [Thu, 19 Jan 2017 07:10:34 +0000 (09:10 +0200)]
android: correct typo in build

Fixes: 63c58dfc653c499aab5b8d0ea07f1dc1af88c856
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4148881513b1cba6d4737803cc903546e59d5b91)

Emil Velikov [Mon, 6 Feb 2017 13:18:13 +0000 (13:18 +0000)]
Update version to 17.0.0-rc3

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Lucas Stach [Mon, 21 Nov 2016 10:54:25 +0000 (11:54 +0100)]
etnaviv: force vertex buffers through the MMU

This fixes a vertex data corruption issue if some of the vertex streams
go through the MMU and some don't.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit e158b7497103f145a9236a70183e07c37a9e13f7)
Nominated-by: Christian Gmeiner <christian.gmeiner@gmail.com>

Christian König [Thu, 19 Jan 2017 12:44:34 +0000 (13:44 +0100)]
st/va: make sure that we call begin_frame() only once v2

This fixes "st/va: delay calling begin_frame until we have all parameters".

v2: call begin frame after decoder (re)creation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
(cherry picked from commit 1338d912f52b69f76ef75d1ad313893db77d4da8)

Nayan Deshmukh [Thu, 19 Jan 2017 09:29:28 +0000 (14:59 +0530)]
st/vdpau: only send buffers with B8G8R8A8 format to X

PresentPixmap only works if the pixmap depth matches the
window depth, otherwise it returns a BadMatch protocol error.
Even if the depths match, the result won't look correct
if the VDPAU RGB component order doesn't match the X11 one, so
we only allow the X11 format.
For other buffers we copy them to a buffer which is sent to X.

v2: only send buffers with format VDP_RGBA_FORMAT_B8G8R8A8
v3: reword commit message
v4: add comment explaining the code

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 31908d6a4a3309f4cd4b953d6eecdf41595b1299)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99637
Nominated-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Nominated-by: Michel Dänzer <michel.daenzer@amd.com> (IRC)

Mauro Rossi [Mon, 30 Jan 2017 19:57:30 +0000 (20:57 +0100)]
android: fix llvm, elf dependencies for M, N releases

These changes set the correct llvm version and elf include path,
which differ for Marshmallow and Nougat.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9c45bb731c97d1f02f83b872c67b2c1b04ec3a41)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
Android.common.mk

Jason Ekstrand [Tue, 31 Jan 2017 03:53:17 +0000 (19:53 -0800)]
anv: Improve flushing around STATE_BASE_ADDRESS

It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is.  We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards.  The only thing the PRM says is that, on gen8+
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS
but experimentation has indicated that doing so does nothing whatsoever.

Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again.  In particular:

 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
    really know whether or not it's pipelined.

 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
    is a compute shader.

 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
    because the state may be getting cached there (we don't really know).

Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92128590bc78bcbbfb19144c7004b31d6405bbcb)

Jason Ekstrand [Tue, 31 Jan 2017 23:06:56 +0000 (15:06 -0800)]
anv: Flush render cache before STATE_BASE_ADDRESS on gen7

We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed.  Recently, when trying to update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related.  This commit fixes them.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f1f9794118008bcdc13d93ee709022d21cc4156d)

Jason Ekstrand [Fri, 27 Jan 2017 20:31:40 +0000 (12:31 -0800)]
isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell

This causes hangs on Broadwell if you try to render to it.  I have no
idea how we managed to not hit this earlier.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4871930451215fd8673f7e213a88aa48e5ecaad3)

Jason Ekstrand [Fri, 27 Jan 2017 20:32:05 +0000 (12:32 -0800)]
intel/blorp: Handle clearing of A4B4G4R4 on all platforms

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0348b5a0b679a78b3f49d41f980dec6066cc541)

Wladimir J. van der Laan [Fri, 25 Nov 2016 06:42:43 +0000 (06:42 +0000)]
etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers

This fixes rendering of full-screen quads (and other screen-filling
geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op
on other hardware.

- It looks like SE_CLIP registers were not set at all.
  I'm amazed that rendering worked without them. Emit them to
  avoid issues on gc3000.

- Define constants
  ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
  ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
  ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
  ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)

  These demarcate the margin (fixp16) between the computed sizes and the
  value sent to the chip. I have set these to the numbers used by the
  Vivante driver for gc2000. I am not sure whether any old hardware was
  relying on the old numbers, or whether those were just a guess. But if
  so, these need to be moved to the _specs structure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 56314f5bafdfeb514adf8401c52f216bd430bbb2)

Wladimir J. van der Laan [Tue, 31 Jan 2017 08:23:51 +0000 (09:23 +0100)]
etnaviv: Generate new sin/cos instructions on GC3000

Shaders using sin/cos instructions were not working on GC3000.

The reason for this turns out to be that these chips implement sin/cos
in a different way (but using the same opcodes):

- Need their input scaled by 1/pi instead of 2/pi.

- Output an x and y component, which need to be multiplied to
  get the result.

- tex_amode needs to be set to 1.

Add a new bit to the compiler specs and generate these instructions
as necessary.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit fe3bb8cdb519a01e6315ce6f142827aece3d4a41)

Nanley Chery [Mon, 30 Jan 2017 20:27:15 +0000 (12:27 -0800)]
anv/cmd_buffer: Use the proper depth input attachment surface state

Commit 2852efcda40274acf3272611c6a3b7731523a72d moved the location of
the depth input attachment surface state from the render pass to the
image view, but failed to update the surface state location used when
emitting the binding table. Fix this by loading the surface state from
the correct location.

Fixes:
dEQP-VK.renderpass.formats.d16_unorm.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.d32_sfloat.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*
dEQP-VK.renderpass.attachment_allocation.input_output.93
dEQP-VK.renderpass.attachment_allocation.input_output.92
dEQP-VK.renderpass.attachment_allocation.input_output.82
dEQP-VK.renderpass.attachment_allocation.input_output.46

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 33e0c5d003658320f5005e26caf55bbcfbe1fbb2)

Bartosz Tomczyk [Tue, 31 Jan 2017 11:02:20 +0000 (12:02 +0100)]
glsl: fix heap-buffer-overflow

The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.
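The two `+1` terms can be seen in a small standalone helper. The function below is a hypothetical illustration of the pattern, not the GLSL code being patched.

```c
#include <string.h>

/* Hypothetical helper: removes the first "[...]" subscript from 'name'
 * in place, e.g. "foo[3].bar" becomes "foo.bar". */
static void
strip_first_subscript(char *name)
{
   char *open = strchr(name, '[');
   if (open == NULL)
      return;

   char *end = strchr(open, ']');
   if (end == NULL)
      return;

   /* Source: end + 1 starts one past the ']'.
    * Length: strlen(end + 1) + 1 covers the tail plus its '\0', so the
    * result is terminated and no byte past the buffer is touched. */
   memmove(open, end + 1, strlen(end + 1) + 1);
}
```

Using `memmove` rather than `memcpy` matters here because the source and destination ranges overlap.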

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit fc27181f9e51441a26b7eb4f62794b5e9a994644)

Wladimir J. van der Laan [Wed, 7 Dec 2016 12:59:54 +0000 (12:59 +0000)]
etnaviv: Cannot render to rb-swapped formats

Exposing rb swapped (or other swizzled) formats for rendering would
involve swizzling in the pixel shader. This is not the case at the
moment, so reject requests for creating such surfaces.

(GPUs that need an extra resolve step anyway due to multiple pixel
pipes, such as gc2000, might also do this swap in the resolve operation.
But this would be tricky to keep track of)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 658568941d5e232d690e1ffbcddbd6ea9685693a)

Christian Gmeiner [Tue, 31 Jan 2017 08:10:27 +0000 (09:10 +0100)]
etnaviv: Avoid infinite loop in find_frame()

Use of unsigned loop control variable with '>= 0' would lead
to infinite loop.

Reported by clang:

etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression
>= 0 is always true [-Wtautological-compare]
   for (unsigned sp = c->frame_sp; sp >= 0; sp--)
                                   ~~ ^  ~

v2: Simply use the same datatype as c->frame_sp is using.
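A downward search with a signed counter looks like the sketch below. The frame structure and function name are hypothetical; only the loop shape is taken from the warning quoted above.

```c
/* Hypothetical, simplified frame stack entry. */
struct frame {
   int type;
};

/* Searches from index 'sp' down to 0 for a frame of the given type.
 * Returns its index, or -1 if there is none. */
static int
find_frame_index(const struct frame *stack, int sp, int type)
{
   /* 'i' is signed, so 'i >= 0' becomes false once i-- goes below 0.
    * With an unsigned counter, i-- at 0 wraps to UINT_MAX and the
    * condition stays true forever. */
   for (int i = sp; i >= 0; i--) {
      if (stack[i].type == type)
         return i;
   }
   return -1;
}
```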

CC: <mesa-stable@lists.freedesktop.org>
Reported-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 82fe240a9912d78bc2eec513c1139c918c5f189f)

Dave Airlie [Tue, 31 Jan 2017 00:09:11 +0000 (10:09 +1000)]
radv/ac: apply slice rounding to 1d arrays as well.

Fixes:
dEQP-VK.glsl.texture_functions.texture.*1darray*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8477aa71d902d6a6fd89741151f8d119a72a7dc0)

Dave Airlie [Mon, 30 Jan 2017 19:19:56 +0000 (05:19 +1000)]
radv/ac: implement txs for buffer textures.

This fixes a bunch of buffer related:
dEQP-VK.memory.pipeline_barrier.*
tests, that were crashing in LLVM due to this being missing.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ecd426490b043aac6a5db0a6e0feaa39f6d9c54)

Dave Airlie [Mon, 30 Jan 2017 18:50:30 +0000 (04:50 +1000)]
radv/ac: handle nir irem opcode.

This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opsrem.*

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ecc3fa3ba3967624f67abe8d8188102a08c20d7c)

Dave Airlie [Mon, 30 Jan 2017 06:13:30 +0000 (16:13 +1000)]
radv/ac: fix multisample subpass image.

We weren't adding the fragment position properly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 059dd171759bb89d915c049de1ca1c93865c21d3)

Dave Airlie [Mon, 30 Jan 2017 03:17:05 +0000 (13:17 +1000)]
radv: handle transfer_write as a dst flag.

It appears we can get image barriers like:
    srcStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dstStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dependencyFlags:                VkDependencyFlags = 0
    memoryBarrierCount:             uint32_t = 0
    pMemoryBarriers:                const VkMemoryBarrier* = NULL
    bufferMemoryBarrierCount:       uint32_t = 0
    pBufferMemoryBarriers:          const VkBufferMemoryBarrier* = NULL
    imageMemoryBarrierCount:        uint32_t = 1
    pImageMemoryBarriers:           const VkImageMemoryBarrier* = 0x7ffc882367b0
        pImageMemoryBarriers[0]:        const VkImageMemoryBarrier = 0x7ffc882367b0:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0x2df55e0
            subresourceRange:               VkImageSubresourceRange = 0x7ffc882367e0:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 1
                baseArrayLayer:                 uint32_t = 0
                layerCount:                     uint32_t = 1

This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here,
not sure if this is too large a hammer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a1c1ba7d5649cca450ca81bf87be36c035a01db0)

Marek Olšák [Sun, 29 Jan 2017 22:59:59 +0000 (23:59 +0100)]
radeonsi: don't invoke DCC decompression in update_all_texture_descriptors

This fixes a bug uncovered by the 17-part patch series, specifically:
  "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"

If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a0740d59aa97a08d89998cb57138e8217a331af6)

Bartosz Tomczyk [Mon, 30 Jan 2017 13:07:45 +0000 (14:07 +0100)]
r600: Fix stack overflow

Commit 7b5878ee0491e7a93914389a8369cd6752b9757d increased the number of
outputs to 64, but left the output array intact. This caused a stack overflow
when the number of outputs is bigger than 32. Found by ASAN.

Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a41f2527ae8ae5432b99c88863fbdf2f0b5f04ad)

Kenneth Graunke [Sat, 21 Jan 2017 04:33:57 +0000 (20:33 -0800)]
i965: Support the force_glsl_version driconf option.

Gallium drivers have had this for a while.  It makes sense to support
it consistently across drivers, so expose it in i965 as well.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2f7a7ae13196c58eddea75fc51637f8c2a8579b0)

Kenneth Graunke [Thu, 26 Jan 2017 09:27:42 +0000 (01:27 -0800)]
i965: Fix check for negative pitch in can_do_fast_copy_blit().

At this point, the pitch is in bytes.  We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K.  This
means the bit 15 trick won't work.

The caller now has signed integers anyway, so just pass those through
and do the obvious check.
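Why a bit-15 test stops working for byte pitches can be shown with two small predicates. The exact form of the old check is an assumption here, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of a "bit 15" sign test.  It only works while the
 * magnitude of the value fits in 15 bits, i.e. abs(pitch) < 32K. */
static bool
bit15_set(int32_t pitch)
{
   return (pitch & (1 << 15)) != 0;
}

/* The obvious check on a signed integer. */
static bool
pitch_is_negative(int32_t pitch)
{
   return pitch < 0;
}
```

A positive pitch of 40000 bytes has bit 15 set, so the first predicate would misreport it as negative, while the second gives the right answer for any magnitude.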

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 02216a1ddf2bcafb86fda352e514f27ab6f7a4fa)

7 years agoi965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.
Kenneth Graunke [Wed, 25 Jan 2017 08:59:42 +0000 (00:59 -0800)]
i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.

Applications may delete a shader program, create a new one, and bind it
before the next draw.  With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program.  In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:

      if (brw->vertex_program != ctx->VertexProgram._Current) {
         brw->vertex_program = ctx->VertexProgram._Current;
         brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
      }

Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on.  This causes utter chaos.

As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them.  Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.

The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none.  This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time.  Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...

Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.

Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.

Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7c5629a26912af9164bda17b7af5370f6b4302e4)
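
The pointer-reuse problem in the commit above can be sketched in plain C.
This is a simplified model, not the i965 code: struct driver_ctx,
detect_new_program and unbind_deleted_program are hypothetical names, and the
`dirty` flag stands in for a bit such as BRW_NEW_VERTEX_PROGRAM.

```c
#include <stdbool.h>
#include <stddef.h>

struct program { int num_ssbos; };

struct driver_ctx {
    const struct program *bound;   /* last program the driver saw */
    bool dirty;                    /* stand-in for a NewDriverState bit */
};

/* Pointer-compare change detection, as in the snippet quoted in the commit. */
void detect_new_program(struct driver_ctx *ctx, const struct program *current)
{
    if (ctx->bound != current) {
        ctx->bound = current;
        ctx->dirty = true;
    }
}

/* Called when a program is deleted: forget the cached pointer so that a new
 * object allocated at the same address is still seen as a change. */
void unbind_deleted_program(struct driver_ctx *ctx, const struct program *dead)
{
    if (ctx->bound == dead)
        ctx->bound = NULL;
}
```

Without the unbind step, a new program that happens to land at the old
address compares equal and the dirty flag is never raised.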

7 years agoradv/ac: Use base in push constant loads.
Bas Nieuwenhuizen [Sat, 28 Jan 2017 00:32:20 +0000 (01:32 +0100)]
radv/ac: Use base in push constant loads.

Apparently the source is not an address but an offset, so we actually
need to use the base.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96c60b7f07e626d9ca0fc5789117f0c725ba1da2)

7 years agoconfigure.ac: list radeon in --with-vulkan-drivers help string
Emil Velikov [Fri, 27 Jan 2017 18:29:38 +0000 (18:29 +0000)]
configure.ac: list radeon in --with-vulkan-drivers help string

Analogous to what we do for the dri and gallium drivers.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit cb6be5c8c05fa1af20ebd4f014d686244826f987)

7 years agoradv: automake: Don't install vk_platform.h or vulkan.h.
Emil Velikov [Fri, 27 Jan 2017 18:05:13 +0000 (18:05 +0000)]
radv: automake: Don't install vk_platform.h or vulkan.h.

These files belong to the vulkan loader.

Identical to
045f38a5075 vulkan: Don't install vk_platform.h or vulkan.h.

Cc: Dave Airlie <airlied@redhat.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6f2dec0a235694779310979fd1cbf48a8d7ba27b)

7 years agomesa/tests: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:49 +0000 (15:45 +0000)]
mesa/tests: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 091f2b8c98937138c17a1ddf4b16d17f31a20020)

7 years agodri/osmesa: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:48 +0000 (15:45 +0000)]
dri/osmesa: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6ba96bdcabc4c9c89827603907aba1b7dd5e9972)

7 years agodri/swrast: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:47 +0000 (15:45 +0000)]
dri/swrast: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ede4ff9adc5653db56e7edac53c012fc431647dc)

7 years agoradeon, r200: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:46 +0000 (15:45 +0000)]
radeon, r200: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5a0ba1e5de088bf71bc407db9837235b18dca936)

7 years agomapi: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:45 +0000 (15:45 +0000)]
mapi: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee5de93269cee8abec2dd4d938c5e745983bf98e)

7 years agoloader: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:44 +0000 (15:45 +0000)]
loader: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit af860850a0c4096f9bb27faa4d9c0061fb437a9b)

7 years agoglx/windows: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:43 +0000 (15:45 +0000)]
glx/windows: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 912b4f5472007dbd06c922cb1c315cb4814c4010)

7 years agoglx/apple: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:42 +0000 (15:45 +0000)]
glx/apple: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
(cherry picked from commit 5b874cee099122bc9c0625254895e63843ae2260)

7 years agoglx: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:41 +0000 (15:45 +0000)]
glx: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d66f9e6d930a9fcfa23129dce18d5359cc6e00f4)

7 years agod3dadapter9: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:40 +0000 (15:45 +0000)]
d3dadapter9: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d221bf9b91900c62069cc447dc214c04a9e5261c)

7 years agost/dri: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:39 +0000 (15:45 +0000)]
st/dri: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 517f34b4be0ac4a5a508ccb6dcaeca3c975585b0)

7 years agoclover: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:37 +0000 (15:45 +0000)]
clover: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 65d5a60caca632a7c03cd1dc554645f27f408f37)

7 years agoegl: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:36 +0000 (15:45 +0000)]
egl: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c5921ae0d2fc37699c7ebbd693a2e850a5371204)

7 years agoi915: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:35 +0000 (15:45 +0000)]
i915: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 90ac5c339e2c360712d663e3eab76c4a4abf2487)

7 years agoi965: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:34 +0000 (15:45 +0000)]
i965: automake: include builddir prior to srcdir

The latter can contain stale generated files, which, as-is, we'll end up
using.

Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4622c75dfbf5c89630d1860037dcb8b7910c0820)

7 years agofreedreno: automake: correctly set MKDIR_GEN
Emil Velikov [Mon, 16 Jan 2017 15:45:33 +0000 (15:45 +0000)]
freedreno: automake: correctly set MKDIR_GEN

Analogous to previous commit.

Fixes: 4610e5ef28e "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
(cherry picked from commit a922c821255bfac22cf705244e5bd303a626bb55)

7 years agoi965: automake: correctly set MKDIR_GEN
Emil Velikov [Mon, 16 Jan 2017 15:45:32 +0000 (15:45 +0000)]
i965: automake: correctly set MKDIR_GEN

Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5eed48d23780e812ca5c53fecc59a419962c7dd6)

7 years agovulkan/wsi: Lower the maximum image sizes
Jason Ekstrand [Wed, 25 Jan 2017 01:10:45 +0000 (17:10 -0800)]
vulkan/wsi: Lower the maximum image sizes

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit d6397dd62542215e655c0cab557729474c2ae973)

7 years agovulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
Jason Ekstrand [Wed, 25 Jan 2017 00:43:15 +0000 (16:43 -0800)]
vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 659edd9f5ce995aa47e2ab02425508cc29140cce)

7 years agovulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
Jason Ekstrand [Wed, 25 Jan 2017 00:43:01 +0000 (16:43 -0800)]
vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit dc578ef060f6b92e6fd2f77bb6454a5fb22a471c)

7 years agomesa: move variable declaration to where it's used
Emil Velikov [Thu, 26 Jan 2017 13:18:41 +0000 (13:18 +0000)]
mesa: move variable declaration to where it's used

The variable replacement was unused when building w/o
ENABLE_SHADER_CACHE. Since we can mix variable declarations and code,
move it to where it's used.

Fixes: 9f8dc3bf03e "utils: build sha1/disk cache only with
Android/Autoconf"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6a5850b04a86c2287112047c6cad500136d18df5)

7 years agoconfigure.ac: Require LLVM for r300 only on x86 and x86_64
Andreas Boll [Tue, 24 Jan 2017 15:44:12 +0000 (16:44 +0100)]
configure.ac: Require LLVM for r300 only on x86 and x86_64

b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.

r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before b3119a3.

Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1f2a890ace07f2709ca8f7ef5fe12051222aafed)

7 years agospirv: handle undefined components for OpVectorShuffle
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:40 +0000 (16:57 +0000)]
spirv: handle undefined components for OpVectorShuffle

Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bbe8705c579c3e464615a0ca9b2eb4bd3c16aad3)

7 years agospirv: handle OpUndef as part of the variable parsing pass
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:25 +0000 (16:57 +0000)]
spirv: handle OpUndef as part of the variable parsing pass

Looking at the following bit of SPIR-V shader:

...
%zero        = OpConstant %i32 0
%ivec3_0     = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef  = OpUndef %ivec3
%sc_0        = OpSpecConstant %i32 0
%sc_1        = OpSpecConstant %i32 0
%sc_2        = OpSpecConstant %i32 0
...

Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit df7063cba35ea273827ba60f643596cd80539458)
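
The parsing issue in the commit above can be modelled with a small C sketch.
It is an assumption-laden toy, not the Mesa SPIR-V parser: the opcode enum is
simplified and hypothetical (it does not use the SPIR-V numeric values), and
the `undef_in_preamble` flag only exists to contrast the old and fixed
behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical opcode set; not the SPIR-V numeric values. */
enum op { OP_TYPE, OP_CONSTANT, OP_CONSTANT_COMPOSITE, OP_SPEC_CONSTANT,
          OP_UNDEF, OP_FUNCTION };

/* Opcodes handled by the types/variables pass.  `undef_in_preamble`
 * switches between the old behaviour (false) and the fixed one (true). */
static bool is_preamble_op(enum op o, bool undef_in_preamble)
{
    switch (o) {
    case OP_TYPE:
    case OP_CONSTANT:
    case OP_CONSTANT_COMPOSITE:
    case OP_SPEC_CONSTANT:
        return true;
    case OP_UNDEF:
        return undef_in_preamble;
    default:
        return false;
    }
}

/* Number of leading instructions consumed by the types/variables pass. */
size_t preamble_length(const enum op *ops, size_t count, bool undef_in_preamble)
{
    size_t i = 0;
    while (i < count && is_preamble_op(ops[i], undef_in_preamble))
        i++;
    return i;
}
```

For the instruction sequence quoted in the commit, stopping at OpUndef leaves
the three spec constants outside the pass; accepting it covers them.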

7 years agoanv: fix descriptor pool internal size allocation
Lionel Landwerlin [Thu, 26 Jan 2017 11:06:53 +0000 (11:06 +0000)]
anv: fix descriptor pool internal size allocation

The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
when setting up the internals.

Fixes a crash due to out of bound memory access in:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

v2: Drop debug traces (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3421106ec332bf3a943ccf9447edf00dc7f3618)

7 years agoi965: Make intelEmitCopyBlit not truncate large strides.
Kenneth Graunke [Sun, 22 Jan 2017 09:44:08 +0000 (01:44 -0800)]
i965: Make intelEmitCopyBlit not truncate large strides.

When trying to blit larger tiled surfaces, the pitch can be larger than
32768 bytes, which means it won't fit in a GLshort.  Passing it in will
truncate the stride to 0, which has...surprising results.

The pitch can be up to 32,768 DWords, or 128kB.  We measure it in bytes,
but divide by 4 when programming it.  So we need to handle values up to
131,072.  Switch from GLshort to int32_t to avoid the truncation.

Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage
at widths greater than 8192.

v2: Use int32_t as negative values can be used (Jason).

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f8f7ea508be7fe7222cd19e0d59574cfea2decf0)
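
The truncation described in the commit above can be shown with a short C
sketch. The helper names are hypothetical. To stay within well-defined C it
looks at the low 16 bits through an unsigned conversion rather than casting
to a signed 16-bit type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low 16 bits of a pitch, i.e. what survives being squeezed into a 16-bit
 * parameter.  Conversion to an unsigned type is well defined in C. */
uint16_t pitch_low16(int32_t pitch)
{
    return (uint16_t)pitch;
}

/* True if the pitch is representable in a signed 16-bit value. */
bool pitch_fits_int16(int32_t pitch)
{
    return pitch >= INT16_MIN && pitch <= INT16_MAX;
}
```

A byte pitch of 65536 or 131072 has all of its low 16 bits clear, which is
how a large stride can end up as 0 once it passes through a 16-bit type.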

7 years agoi965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
Kenneth Graunke [Tue, 24 Jan 2017 08:45:53 +0000 (00:45 -0800)]
i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.

SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register.  With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky.  Use a UW type to avoid this.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit fcf723b647f36fa174d29b1fe6a9732637a1f8d1)

7 years agoanv/lower_input_attachments: honor sample index parameter to subpassLoad()
Iago Toral Quiroga [Wed, 25 Jan 2017 14:04:35 +0000 (15:04 +0100)]
anv/lower_input_attachments: honor sample index parameter to subpassLoad()

According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:

gvec4 subpassLoad(gsubpassInput   subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);

So the multisampled case always receives an explicit sample index that we
should use. The current implementation was ignoring this parameter
and using gl_SampleID value instead.

Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9b25769da63999fa65a70a14194a452c49d18f3e)

7 years agoi965: Fix fast depth clears for surfaces with a dimension of 16384.
Kenneth Graunke [Mon, 23 Jan 2017 19:57:21 +0000 (11:57 -0800)]
i965: Fix fast depth clears for surfaces with a dimension of 16384.

I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong.  But it turns out that
there is a legitimate reason to use it, so let's do so.

The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5106df85da20d57007e89262472bb1624afbdaba)

7 years agoanv: set command buffer to NULL when allocations fail
Lionel Landwerlin [Wed, 25 Jan 2017 16:22:40 +0000 (16:22 +0000)]
anv: set command buffer to NULL when allocations fail

The spec section 5.2 says:

   "vkAllocateCommandBuffers can be used to create multiple command
   buffers. If the creation of any of those command buffers fails, the
   implementation must destroy all successfully created command buffer
   objects from this command, set all entries of the pCommandBuffers
   array to VK_NULL_HANDLE and return the error."

Fixes:
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25e21cb8d065799888e0c5db80b0e616ffddc560)
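
The rule quoted from the spec above can be sketched in C without any Vulkan
headers. Everything here is hypothetical: struct cmd_buffer stands in for a
command buffer object, NULL stands in for VK_NULL_HANDLE, and the `fail_at`
parameter only exists to simulate an allocation failure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a command buffer object. */
struct cmd_buffer { int level; };

/* Allocate `count` buffers into `out`.  `fail_at` simulates an allocation
 * failure at that index (pass a value >= count for no simulated failure).
 * Returns 0 on success, -1 on failure.  On failure, every buffer created by
 * this call is destroyed and all `count` entries are set to NULL. */
int allocate_cmd_buffers(struct cmd_buffer **out, size_t count, size_t fail_at)
{
    size_t i;

    for (i = 0; i < count; i++) {
        out[i] = (i == fail_at) ? NULL : malloc(sizeof(struct cmd_buffer));
        if (out[i] == NULL)
            break;
    }
    if (i == count)
        return 0;

    /* Roll back the buffers that were created before the failure. */
    for (size_t j = 0; j < i; j++)
        free(out[j]);
    /* Clear the whole array, including entries that were never reached. */
    for (size_t j = 0; j < count; j++)
        out[j] = NULL;
    return -1;
}
```

The second loop matters: entries past the failing index were never written,
so they must be cleared explicitly rather than left with stale contents.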

7 years agoradv: program a default point size.
Dave Airlie [Wed, 18 Jan 2017 03:46:43 +0000 (13:46 +1000)]
radv: program a default point size.

Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.

This preemptively fixes a bunch of geom shader tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2ab2be092d15ddb449b4a215609705bae68089a0)

7 years agoradeonsi: handle first_non_void correctly in si_create_vertex_elements
Marek Olšák [Fri, 20 Jan 2017 15:02:04 +0000 (16:02 +0100)]
radeonsi: handle first_non_void correctly in si_create_vertex_elements

This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit eac7df43ca05abd9992b305e078e88fe7b7f8c91)

7 years agost/mesa: destroy pipe_context before destroying st_context (v2)
Marek Olšák [Fri, 20 Jan 2017 01:26:42 +0000 (02:26 +0100)]
st/mesa: destroy pipe_context before destroying st_context (v2)

If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.

Firefox with WebGL2 enabled hits this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456

v2: protect against a double destroy in st_create_context_priv and callers.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d9ef54923804d5fe44a1d3ad5c29e9b8e8382359)

7 years agomesa: Don't advertise GL_OES_read_format in core profile
Ian Romanick [Mon, 23 Jan 2017 17:57:15 +0000 (09:57 -0800)]
mesa: Don't advertise GL_OES_read_format in core profile

OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.

The functionality is also included in GL_ARB_ES2_compatibility.  Every
OpenGL core-profile driver currently exposes both extensions.  I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems.  No functionality
is removed.

I have left this extension in place for compatibility profile.  There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.

Three other alternatives were considered:

 - Remove the string from compatibility profile drivers but leave the
   functionality in place.

 - Add a flag to expose the extension string, and set it in every OpenGL
   driver that does not expose GL_ARB_ES2_compatibility (and those
   drivers only).  I tried this.  You can't have two instances of an
   extension in the extension table (one dummy_true for ES1 and one with
   a flag for compatibility profile), so the implementation requires a
   bit of effort.

 - Only expose the extension in compatibility if the version is less
   than 2.0.  I didn't see an easy way to do this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c4a0c1efff77ed90dfebf969348cd440195ae38a)

7 years agogallivm: (trivial) fix ddiv cpu implementation
Roland Scheidegger [Mon, 23 Jan 2017 17:04:12 +0000 (18:04 +0100)]
gallivm: (trivial) fix ddiv cpu implementation

We can't use the cpu implementation of fdiv, as this one uses a different
lp_build_context, which causes an assertion failure.
Just use the default fdiv action (there is no fast rcp for doubles which we
could potentially use anyway).

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 25208949d7293aa060a3416f8cf3cdb3ca1fbfdd)

7 years agotgsi: implement ddiv opcode
Roland Scheidegger [Mon, 23 Jan 2017 17:10:44 +0000 (18:10 +0100)]
tgsi: implement ddiv opcode

softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64,
so we really need to support that opcode.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 3b575a955c1d84744a65160e2c45f0ce407effd8)

7 years agoi965/blorp: Use the correct ISL format for combined depth/stencil
Jason Ekstrand [Mon, 23 Jan 2017 18:53:13 +0000 (10:53 -0800)]
i965/blorp: Use the correct ISL format for combined depth/stencil

In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format.  In this case, we handle
stencil properly but we give blorp the wrong ISL format.  Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong
size and was causing GPU hangs.

Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c180f9633421a526f3ea6746cf38b809e7c1abb)

7 years agoi965/blorp: Add also depth and stencil buffers to render cache
Topi Pohjolainen [Thu, 19 Jan 2017 08:11:42 +0000 (10:11 +0200)]
i965/blorp: Add also depth and stencil buffers to render cache

v2 (Jason, Curro): Add stencil also even though it is not
                   enabled yet.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit ba6399df9462d78eda2a5de7c2940d8cf9d27f95)

7 years agoconfigure.ac: move require_dri_shared_libs_and_glapi() before its users
Emil Velikov [Thu, 19 Jan 2017 15:19:56 +0000 (15:19 +0000)]
configure.ac: move require_dri_shared_libs_and_glapi() before its users

Otherwise we'll get a lovely message as below:
"require_dri_shared_libs_and_glapi: command not found"

Cc: Steven Newbury <steve@snewbury.org.uk>
Reported-by: Steven Newbury <steve@snewbury.org.uk>
Fixes: da410e6afad "configure: explicitly require shared glapi for
enable-dri"
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Steven Newbury <steve@snewbury.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5872850b8868f00c031d21387b0516d844d070be)

7 years agoUpdate version to 17.0.0-rc2
Emil Velikov [Wed, 25 Jan 2017 13:24:27 +0000 (13:24 +0000)]
Update version to 17.0.0-rc2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

7 years agoi965/blorp: Make post draw flush more explicit
Topi Pohjolainen [Tue, 17 Jan 2017 10:00:37 +0000 (12:00 +0200)]
i965/blorp: Make post draw flush more explicit

Blits do not need any special treatment as the target buffer
object is added to render cache just as one does for normal draw.
Color clears and resolves in turn require explicit "end of pipe
synchronization". It is not clear what this means exactly but the
assumption is that render cache flush with command stream stall
should be sufficient.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 180653c357d19ca88f7895f59874a58fac99cc53)

7 years agoi965/gen6: Issue direct depth stall and flush after depth clear
Topi Pohjolainen [Tue, 17 Jan 2017 09:48:49 +0000 (11:48 +0200)]
i965/gen6: Issue direct depth stall and flush after depth clear

instead of calling unconditionally brw_emit_mi_flush() which
does:

   brw_emit_pipe_control_flush(brw,
                                PIPE_CONTROL_DEPTH_CACHE_FLUSH |
                                PIPE_CONTROL_RENDER_TARGET_FLUSH |
                                PIPE_CONTROL_CS_STALL);

   brw_emit_pipe_control_flush(brw,
                                PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
                                PIPE_CONTROL_CONST_CACHE_INVALIDATE);

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 46b346899d98e29943f8cd74c25bcb8d2f868a49)

7 years agoi965: Make depth clear flushing more explicit
Topi Pohjolainen [Tue, 17 Jan 2017 09:44:52 +0000 (11:44 +0200)]
i965: Make depth clear flushing more explicit

Current blorp logic issues unconditional "flush everything"
(see brw_emit_mi_flush()) after each render. For example, all
blits issue this unconditionally which shouldn't be needed if
they set render cache properly so that subsequent renders do
necessary flushing before drawing.

In case of piglit:

ext_framebuffer_multisample-accuracy all_samples depth_draw small

intel_hiz_exec() is always preceded by a blorp blit, and the
unconditional flush appears to hide the lack of stalls and flushes
in depth clears. By removing the brw_emit_mi_flush() I get GPU
hangs.

This patch adds the stalls and flushes mandated by the spec
and gets rid of those hangs.

v2 (Jason, Ken): Document the rationale for separating
                 depth cache flush and stall on Gen7.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e6da6943fed1228c551af1f0e1a405b6d67b41ae)

7 years agoi965/blorp: Use the render cache mechanism instead of explicit flushing
Topi Pohjolainen [Tue, 17 Jan 2017 09:04:22 +0000 (11:04 +0200)]
i965/blorp: Use the render cache mechanism instead of explicit flushing

by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush().
The latter splits the flush in two:

   brw_emit_pipe_control_flush(brw,
                               PIPE_CONTROL_DEPTH_CACHE_FLUSH |
                               PIPE_CONTROL_RENDER_TARGET_FLUSH |
                               PIPE_CONTROL_CS_STALL);

   brw_emit_pipe_control_flush(brw,
                               PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
                               PIPE_CONTROL_CONST_CACHE_INVALIDATE);

instead of

   int flags = PIPE_CONTROL_NO_WRITE | PIPE_CONTROL_RENDER_TARGET_FLUSH;
   if (brw->gen >= 6) {
      flags |= PIPE_CONTROL_INSTRUCTION_INVALIDATE |
               PIPE_CONTROL_CONST_CACHE_INVALIDATE |
               PIPE_CONTROL_DEPTH_CACHE_FLUSH |
               PIPE_CONTROL_VF_CACHE_INVALIDATE |
               PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
               PIPE_CONTROL_CS_STALL;
   }
   brw_emit_pipe_control_flush(brw, flags);

v2 (Jason): Check that destination exists before trying to add to
            render cache. Depth clears and resolves don't have it.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4840a53e902b0f2b9841d9dbb90e479a3688153d)

7 years agoradeonsi: always set the TCL1_ACTION_ENA when invalidating L2
Marek Olšák [Fri, 20 Jan 2017 00:13:39 +0000 (01:13 +0100)]
radeonsi: always set the TCL1_ACTION_ENA when invalidating L2

Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 573bf0940a08e18a511e338de478f30fd95a1590)

7 years ago radv: don't resubmit the same cs over and over while tracing
Grazvydas Ignotas [Mon, 23 Jan 2017 21:16:42 +0000 (23:16 +0200)]
radv: don't resubmit the same cs over and over while tracing

Fixes: 97dfff54 ("radv: Dump command buffer on hang.")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f65b3641c3233f1697b96ea8126b578dae6de4f1)

7 years ago swr: Align query results allocation
George Kyriazis [Wed, 18 Jan 2017 23:09:08 +0000 (17:09 -0600)]
swr: Align query results allocation

Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.

Fixes crash when compiling with clang.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 00847e4f14dd237dfcdb2c3d15be1325a08ccf5a)
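
The commit does not spell out why the crash occurred. One plausible reading is that plain malloc() only guarantees alignment suitable for fundamental types, so a struct with cache-line-aligned members can end up misaligned while the compiler emits code that assumes the declared alignment. The C11 sketch below shows the general technique of allocating such a struct with an aligned allocator. `struct query_result_sketch`, `query_result_alloc()` and the 64-byte line size are assumptions for illustration, not the swr types or its allocation helper.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 64   /* assumed cache line size */

/* Hypothetical result struct with one cache-line-aligned member. The
 * member's alignment raises the alignment of the whole struct. */
struct query_result_sketch {
   alignas(CACHE_LINE) uint64_t counters[8];
   uint64_t timestamp;
};

/* aligned_alloc() expects a size that is a multiple of the alignment. */
static_assert(sizeof(struct query_result_sketch) % CACHE_LINE == 0,
              "struct size must be a multiple of its alignment");

/* Allocate a zeroed result struct at its required alignment.
 * Release it with free(). Returns NULL on allocation failure. */
static struct query_result_sketch *query_result_alloc(void)
{
   struct query_result_sketch *r =
      aligned_alloc(alignof(struct query_result_sketch), sizeof(*r));
   if (r != NULL)
      memset(r, 0, sizeof(*r));
   return r;
}
```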

7 years ago swr: Prune empty nodes in CalculateProcessorTopology.
Bruce Cherniak [Thu, 19 Jan 2017 21:44:52 +0000 (15:44 -0600)]
swr: Prune empty nodes in CalculateProcessorTopology.

CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes.  There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating an empty "0" node and causing a
crash in CreateThreadPool.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b829206b0739925501bcc68233437d6d03b79795)
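
The failure mode and the pruning step can be sketched in a few lines of C. This is a simplified model under stated assumptions, not the swr implementation: `struct node_sketch`, `build_nodes()`, `prune_empty_nodes()` and `MAX_NODES` are hypothetical, and the "physical id" values are passed in as an array instead of being parsed from /proc/cpuinfo.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8   /* arbitrary limit for this sketch */

struct node_sketch {
   unsigned thread_count;
};

/* Bucket threads by "physical id", using the id directly as the node index.
 * Ids that start at 1 leave node 0 empty. Returns the number of node slots
 * in use (highest id + 1). */
static size_t build_nodes(const unsigned *physical_ids, size_t n_threads,
                          struct node_sketch *nodes)
{
   size_t n_nodes = 0;

   for (size_t i = 0; i < MAX_NODES; i++)
      nodes[i].thread_count = 0;

   for (size_t i = 0; i < n_threads; i++) {
      assert(physical_ids[i] < MAX_NODES);
      nodes[physical_ids[i]].thread_count++;
      if ((size_t)physical_ids[i] + 1 > n_nodes)
         n_nodes = (size_t)physical_ids[i] + 1;
   }
   return n_nodes;
}

/* Compact the array in place, dropping nodes that own no threads.
 * Returns the number of nodes kept. */
static size_t prune_empty_nodes(struct node_sketch *nodes, size_t n_nodes)
{
   size_t kept = 0;

   for (size_t i = 0; i < n_nodes; i++) {
      if (nodes[i].thread_count > 0)
         nodes[kept++] = nodes[i];
   }
   return kept;
}
```

With ids {1, 1, 2, 2}, build_nodes() reports three slots, the first of which is empty; prune_empty_nodes() reduces that to two populated nodes, so later code never sees a node without threads.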

7 years ago st/glsl_to_tgsi: use DDIV instead of DRCP + DMUL
Nicolai Hähnle [Mon, 16 Jan 2017 15:43:54 +0000 (16:43 +0100)]
st/glsl_to_tgsi: use DDIV instead of DRCP + DMUL

Fixes GL45-CTS.gpu_shader_fp64.built_in_functions.

v2: use DDIV unconditionally (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cfabbbcfd778cc404813c9f05a9ef79efe531980)
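
The commit message does not say why the test failed with DRCP + DMUL. A likely factor is rounding: computing a reciprocal and then multiplying rounds twice, while a direct divide rounds once. The C sketch below shows that effect on the CPU with IEEE 754 doubles. It is only an analogy for the shader lowering; `div_via_rcp()` and `div_direct()` are hypothetical names.

```c
#include <assert.h>

/* Division lowered to reciprocal + multiply (the DRCP + DMUL shape).
 * The volatile store keeps the compiler from folding the two steps
 * back into a single divide. */
static double div_via_rcp(double a, double b)
{
   assert(b != 0.0);              /* keep the sketch to finite results */
   volatile double rcp = 1.0 / b; /* first rounding */
   return a * rcp;                /* second rounding */
}

/* Direct division (the DDIV shape): a single correctly rounded operation. */
static double div_direct(double a, double b)
{
   assert(b != 0.0);
   return a / b;
}
```

For a = b = 49.0, div_direct() returns exactly 1.0, while div_via_rcp() returns the next double below 1.0, because 1.0 / 49.0 is rounded down before the multiply.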

7 years ago glsl: split DIV_TO_MUL_RCP into single- and double-precision flags
Nicolai Hähnle [Mon, 16 Jan 2017 15:39:06 +0000 (16:39 +0100)]
glsl: split DIV_TO_MUL_RCP into single- and double-precision flags

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b71c415c3d288da4b5f533ece42f50f4f20a8c33)

7 years ago r600: implement DDIV
Nicolai Hähnle [Thu, 19 Jan 2017 13:44:57 +0000 (14:44 +0100)]
r600: implement DDIV

Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e4f8f9a638c1ffb9b76840b088290f11f0f91813)