OSDN Git Service

android-x86/external-mesa.git
7 years agogallium/targets: fix bool setting on BE architectures
Ilia Mirkin [Tue, 18 Apr 2017 04:00:40 +0000 (00:00 -0400)]
gallium/targets: fix bool setting on BE architectures

val_bool and val_int are in a union. val_bool gets the first byte, which
happens to work on LE when setting via the int, but breaks on BE. By
setting the value properly, we are able to use DRI3 on BE architectures.
Tested by running glxgears with a NV34 in a G5 PPC.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
[Emil Velikov: squash the vmwgfx hunk]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 6af14778a3f68030c4ad6426c75fe25d726235d5)

7 years agost/mesa: don't cast the incomplete framebufer to st_framebuffer
Nicolai Hähnle [Fri, 21 Apr 2017 13:06:47 +0000 (15:06 +0200)]
st/mesa: don't cast the incomplete framebufer to st_framebuffer

The incomplete framebuffer is set for a surfaceless context. This leads to
the following error in piglit spec@egl_khr_surfaceless_context@viewport:

==26703==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f6886e43240 at pc 0x7f68854db0fd bp 0x7ffca404b3b0 sp 0x7ffca404b3a0
READ of size 8 at 0x7f6886e43240 thread T0
    #0 0x7f68854db0fc in st_viewport ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57
    #1 0x556840176cdb in main tests/egl/spec/egl_khr_surfaceless_context/viewport.c:101
    #2 0x7f688edcf3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
    #3 0x556840176e19 in _start (/home/nha/amd/piglit/bin/egl-surfaceless-context-viewport+0xe19)

0x7f6886e43240 is located 32 bytes to the left of global variable 'DummyRenderbuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:69:31' (0x7f6886e43260) of size 112
0x7f6886e43240 is located 8 bytes to the right of global variable 'IncompleteFramebuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:73:30' (0x7f6886e42de0) of size 1112
SUMMARY: AddressSanitizer: global-buffer-overflow ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57 in st_viewport

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek@olsak@amd.com>
(cherry picked from commit 19b61799e3d06795d783b34fdbbf8474ef1e9a7c)

7 years agoutil/disk_cache: remove percentage based max cache limit
Timothy Arceri [Fri, 28 Apr 2017 03:05:52 +0000 (13:05 +1000)]
util/disk_cache: remove percentage based max cache limit

The more I think about it the more this seems like a bad idea.
When we were deleting old cache dirs this wasn't so bad as it
was unlikely we would ever hit the actual limit before things
were cleaned up. Now that we only start cleaning up old cache
items once the limit is reached the a percentage based max
cache limit is more risky.

For the inital release of shader cache I think its better to
stick to a more conservative cache limit, at least until we
have some way of cleaning up the cache more aggressively.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 22fa3d90a92c1628215d0f5fccbe1116d4f5147f)

7 years agodisk_cache: use block size rather than file size
Timothy Arceri [Thu, 27 Apr 2017 01:15:30 +0000 (11:15 +1000)]
disk_cache: use block size rather than file size

The majority of cache files are less than 1kb this resulted in us
greatly miscalculating the amount of disk space used by the cache.

Using the number of blocks allocated to the file is more
conservative and less likely to cause issues.

This change will result in cache sizes being miscalculated further
until old items added with the previous calculation have all been
removed. However I don't see anyway around that, the previous
patch should help limit that problem.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 4e1f3afea9bdeddb0d21f00d25319bce580d80c3)

7 years agodisk_cache: reduce default cache size to 5% of filesystem
Timothy Arceri [Thu, 27 Apr 2017 01:15:29 +0000 (11:15 +1000)]
disk_cache: reduce default cache size to 5% of filesystem

Modern disks are extremely large and are only going to get bigger.
Usage has shown frequent Mesa upgrades can result in the cache
growing very fast i.e. wasting a lot of disk space unnecessarily.

5% seems like a more reasonable default.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit ce412371513c90bf9156f22c3567ee57750ef264)

7 years agoanv: Don't place scratch buffers above the 32-bit boundary
Jason Ekstrand [Sat, 22 Apr 2017 22:51:01 +0000 (15:51 -0700)]
anv: Don't place scratch buffers above the 32-bit boundary

This fixes rendering corruptions in DOOM.  Hopefully, it will also make
Jenkins a bit more stable as we've been seeing some random failures and
GPU hangs ever since turning on 48bit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620
Fixes: 651ec926fc1 "anv: Add support for 48-bit addresses"
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c43b4bc85eddba8bc31665cfee5928bed8343516)

7 years agoradeonsi: adjust ESGS ring buffer size computation on VI
Marek Olšák [Sat, 22 Apr 2017 12:47:03 +0000 (14:47 +0200)]
radeonsi: adjust ESGS ring buffer size computation on VI

Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 3f2a0649abc982fe5de647a96fbe354aa9e41a59)

7 years agoradeonsi/gfx9: don't set deprecated field PARTIAL_ES_WAVE_ON
Marek Olšák [Sun, 23 Apr 2017 18:29:04 +0000 (20:29 +0200)]
radeonsi/gfx9: don't set deprecated field PARTIAL_ES_WAVE_ON

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 80814819c28353a38c03d4cdba39983b8cf260ac)

7 years agoradeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register
Marek Olšák [Sun, 23 Apr 2017 18:14:42 +0000 (20:14 +0200)]
radeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 60a20e6879e4ce0911b12848ffd9e372f096590e)

7 years agoradeonsi/gfx9: add a workaround for viewing a slice of 3D as a 2D image
Marek Olšák [Sun, 23 Apr 2017 22:01:06 +0000 (00:01 +0200)]
radeonsi/gfx9: add a workaround for viewing a slice of 3D as a 2D image

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8e8570a9e8bae7f4d3ad623475dfadc715a828d7)

7 years agoradeonsi/gfx9: fix 1D array shader images
Marek Olšák [Sun, 23 Apr 2017 21:29:20 +0000 (23:29 +0200)]
radeonsi/gfx9: fix 1D array shader images

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 482e6b07cc6ce4b2ceac8188be19dbf252eaecde)

7 years agoradeonsi/gfx9: fix most things wrong with shader images
Marek Olšák [Sun, 23 Apr 2017 21:06:38 +0000 (23:06 +0200)]
radeonsi/gfx9: fix most things wrong with shader images

There are 2 major hw changes:
- The address must always point to the address of level 0. GFX9 tiling
  modes don't allow binding to a non-0 level.
- 3D must always be bound as 3D, because 2D and 3D use entirely different
  tiling modes, and the texture target determines which set of modes is
  used.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5c94779585e24e8bd1bd41707521584af4251de3)

7 years agoradeonsi/gfx9: fix texture buffer objects and image buffers with IDXEN==0
Marek Olšák [Sun, 23 Apr 2017 19:32:22 +0000 (21:32 +0200)]
radeonsi/gfx9: fix texture buffer objects and image buffers with IDXEN==0

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 65e0c3fba74ee98cacadbba4bd005b930609b65e)

7 years agointel/fs: Take into account amount of data read in spilling cost heuristic.
Francisco Jerez [Thu, 20 Apr 2017 18:42:27 +0000 (11:42 -0700)]
intel/fs: Take into account amount of data read in spilling cost heuristic.

Until now the spilling cost calculation was neglecting the amount of
data read from the register during the spilling cost calculation.
This caused it to make suboptimal decisions in some cases leading to
higher memory bandwidth usage than necessary.

Improves Unigine Heaven performance by ~4% on BDW, reversing an
unintended FPS regression from my previous commit
147e71242ce539ff28e282f009c332818c35f5ac with n=12 and statistical
significance 5%.  In addition SynMark2 OglCSDof performance is
improved by an additional ~5% on SKL, and a Kerbal Space Program
apitrace around the Moho planet I can provide on request improves by
~20%.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 58324389be7bc7c5e10093b9cc0a8efa9b4c93a9)

7 years agointel/fs: Use regs_written() in spilling cost heuristic for improved accuracy.
Francisco Jerez [Thu, 20 Apr 2017 18:44:01 +0000 (11:44 -0700)]
intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy.

This is what we use later on to compute the number of registers that
will actually get spilled to memory, so it's more likely to match
reality than the current open-coded approximation.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ecc19e12dca95d2571d3761dea6dec24b061013c)

7 years agoi965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
Kenneth Graunke [Fri, 21 Apr 2017 08:28:13 +0000 (01:28 -0700)]
i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

opt_register_coalesce() was optimizing sequences such as:

   mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
   mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
   mov(8) m4.zw:F, vgrf5.xxxy:F

into:

   mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
   mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D

This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy.  But the MACH instruction expects
.z to contain attr18.x * attr19.x.  Bogus results ensue.

No change in shader-db on Haswell.  Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2faf227ec2e22c7a37e0a54783a3f0a0062ac852)

Squashed with commit:

i965/vec4: Use reads_accumulator_implicitly(), not MACH checks.

Curro pointed out that I should not just check for MACH, but use
the reads_accumulator_implicitly() helper, which would also prevent
the same bug with MAC and SADA2 (if we ever decide to use them).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 6b10c37b9c3a73add73f444fe1aee73c9ec82c94)

7 years agoUpdate version to 17.1.0-rc2
Emil Velikov [Mon, 24 Apr 2017 14:23:07 +0000 (15:23 +0100)]
Update version to 17.1.0-rc2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agost/clover: add space between < and ::
Emil Velikov [Wed, 19 Apr 2017 10:35:10 +0000 (11:35 +0100)]
st/clover: add space between < and ::

As pointed out by compiler

./llvm/codegen.hpp:52:22: error: ‘<::’ cannot begin a template-argument list [-fpermissive]
./llvm/codegen.hpp:52:22: note: ‘<:’ is an alternate spelling for ‘[’. Insert whitespace between ‘<’ and ‘::’

Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
(cherry picked from commit dd6ec78b4fc1208c9ec330642ad42361fea91678)

7 years agoconfigure.ac: Fix typos.
Vinson Lee [Thu, 20 Apr 2017 21:48:50 +0000 (14:48 -0700)]
configure.ac: Fix typos.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b81d85f1754928139f9f01474495e024946aa1b4)

7 years agomesa: validate sampler type across the whole program
Timothy Arceri [Fri, 21 Apr 2017 07:04:10 +0000 (17:04 +1000)]
mesa: validate sampler type across the whole program

Currently we were only making sure types were the same within a
single stage. This looks to have regressed with 953a0af8e3f73.

Fixes: 953a0af8e3f73 ("mesa: validate sampler uniforms during gluniform calls")

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
https://bugs.freedesktop.org/show_bug.cgi?id=97524
(cherry picked from commit d682f8aa8e0edd166166f87fcd774dd2d57b4180)

7 years agomesa/glthread: correctly compare thread handles
Emil Velikov [Thu, 20 Apr 2017 15:24:08 +0000 (16:24 +0100)]
mesa/glthread: correctly compare thread handles

As mentioned in the manual - comparing pthread_t handles via the C
comparison operator is incorrect and pthread_equal() should be used
instead.

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: d8d81fbc316 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 52df318d61f4892dbbaa8f0da4787f25caf1b0d1)

7 years agost/mesa: automake: honour the vdpau header install location
Emil Velikov [Sun, 16 Apr 2017 14:46:28 +0000 (15:46 +0100)]
st/mesa: automake: honour the vdpau header install location

If VDPAU is installed in the non-default location, we'll fail to find
the headers and error at build time.

../../src/gallium/include/state_tracker/vdpau_dmabuf.h:37:25: fatal error: vdpau/vdpau.h: No such file or directory
 #include <vdpau/vdpau.h>
                         ^

Fixes: faba96bc60b ("st/vdpau: add new interop interface")
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 51c0c213b7fa53b249e9fcb9004a3ba1076fe773)

7 years agoconfigure.ac: check require_basic_egl only if egl enabled
Emil Velikov [Sun, 16 Apr 2017 00:46:59 +0000 (01:46 +0100)]
configure.ac: check require_basic_egl only if egl enabled

Fixes: 1ac40173c2a ("configure.ac: simplify EGL requirements for drivers dependent on EGL")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4516bfbd309a6996c18f577de47b13e33dce0828)

7 years agoconfigure.ac: manually expand PKG_CHECK_VAR
Emil Velikov [Tue, 18 Apr 2017 10:33:42 +0000 (11:33 +0100)]
configure.ac: manually expand PKG_CHECK_VAR

The macro is introduced with pkgconfig v0.28 which isn't universally
available. Thus it will error at configure stage.

Reported-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Fixes: ce562f9e3fa ("EGL: Implement the libglvnd interface for EGL (v3)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 179e21a72045ceb09cebf4c426bae16580fdd438)

7 years agoutil/queue: don't hang at exit
Rob Clark [Tue, 18 Apr 2017 01:20:37 +0000 (21:20 -0400)]
util/queue: don't hang at exit

So atexit() is horrible and 4aea8fe7 is probably not a good idea.  But
add an extra layer of duct-tape to the problem.  Otherwise we hit a
situation where app using an atexit() handler that runs later than ours
doesn't hang when trying to tear down a context.

 (gdb) bt
 #0  util_queue_killall_and_wait (queue=queue@entry=0x52bc80) at ../../../src/util/u_queue.c:264
 #1  0x0000007fb6c380c0 in atexit_handler () at ../../../src/util/u_queue.c:51
 #2  0x0000007fb7730e2c in __run_exit_handlers () from /lib64/libc.so.6
 #3  0x0000007fb7730e5c in exit () from /lib64/libc.so.6
 #4  0x0000007fb7ce17dc in piglit_report_result (result=PIGLIT_PASS) at /home/robclark/src/piglit/tests/util/piglit-util.c:267
 #5  0x0000007fb7ef99f8 in process_next_event (x11_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:139
 #6  0x0000007fb7ef9a90 in enter_event_loop (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153
 #7  0x0000007fb7ef8e50 in run_test (gl_fw=0x432c20, argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88
 #8  0x0000007fb7edb890 in piglit_gl_test_run (argc=1, argv=0x7ffffff588, config=0x7ffffff400) at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:203
 #9  0x0000000000401224 in main (argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/bugs/drawbuffer-modes.c:46
 (gdb) c
 Continuing.
 [Thread 0x7fb67580c0 (LWP 3471) exited]
 ^C
 Thread 1 "drawbuffer-mode" received signal SIGINT, Interrupt.
 0x0000007fb72dda34 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
 (gdb) bt
 #0  0x0000007fb72dda34 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
 #1  0x0000007fb6c38304 in cnd_wait (mtx=0x5bdc90, cond=0x5bdcc0) at ../../../include/c11/threads_posix.h:159
 #2  util_queue_fence_wait (fence=0x5bdc90) at ../../../src/util/u_queue.c:106
 #3  0x0000007fb6daac70 in fd_batch_sync (batch=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:233
 #4  batch_reset (batch=batch@entry=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:183
 #5  0x0000007fb6daa5e0 in batch_flush (batch=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:290
 #6  fd_batch_flush (batch=0x5bdc70, sync=<optimized out>) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:308
 #7  0x0000007fb6daba2c in fd_bc_flush (cache=0x461220, ctx=0x52b920) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch_cache.c:141
 #8  0x0000007fb6dac954 in fd_context_flush (pctx=0x52b920, fence=0x0, flags=<optimized out>) at ../../../../../src/gallium/drivers/freedreno/freedreno_context.c:54
 #9  0x0000007fb6b43294 in st_glFlush (ctx=<optimized out>) at ../../../src/mesa/state_tracker/st_cb_flush.c:121
 #10 0x0000007fb69a84e8 in _mesa_make_current (newCtx=newCtx@entry=0x0, drawBuffer=drawBuffer@entry=0x0, readBuffer=readBuffer@entry=0x0) at ../../../src/mesa/main/context.c:1654
 #11 0x0000007fb6b7ca58 in st_api_make_current (stapi=<optimized out>, stctxi=0x0, stdrawi=0x0, streadi=0x0) at ../../../src/mesa/state_tracker/st_manager.c:827
 #12 0x0000007fb6cc87e8 in dri_unbind_context (cPriv=<optimized out>) at ../../../../../src/gallium/state_trackers/dri/dri_context.c:217
 #13 0x0000007fb6cc80b0 in driUnbindContext (pcp=0x5271e0) at ../../../../../../src/mesa/drivers/dri/common/dri_util.c:591
 #14 0x0000007fb7d1da08 in MakeContextCurrent (dpy=0x433380, draw=0, read=0, gc_user=0x0) at ../../../src/glx/glxcurrent.c:214
 #15 0x0000007fb7a8d5e0 in glx_platform_make_current () from /lib64/libwaffle-1.so.0
 #16 0x0000007fb7a894e4 in waffle_make_current () from /lib64/libwaffle-1.so.0
 #17 0x0000007fb7ef8c60 in piglit_wfl_framework_teardown (wfl_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_wfl_framework.c:628
 #18 0x0000007fb7ef939c in piglit_winsys_framework_teardown (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:238
 #19 0x0000007fb7ef9c30 in destroy (gl_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:212
 #20 0x0000007fb7edb7c4 in destroy () at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:184
 #21 0x0000007fb7730e2c in __run_exit_handlers () from /lib64/libc.so.6
 #22 0x0000007fb7730e5c in exit () from /lib64/libc.so.6
 #23 0x0000007fb7ce17dc in piglit_report_result (result=PIGLIT_PASS) at /home/robclark/src/piglit/tests/util/piglit-util.c:267
 #24 0x0000007fb7ef99f8 in process_next_event (x11_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:139
 #25 0x0000007fb7ef9a90 in enter_event_loop (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153
 #26 0x0000007fb7ef8e50 in run_test (gl_fw=0x432c20, argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88
 #27 0x0000007fb7edb890 in piglit_gl_test_run (argc=1, argv=0x7ffffff588, config=0x7ffffff400) at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:203
 #28 0x0000000000401224 in main (argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/bugs/drawbuffer-modes.c:46
 (gdb) r

Fixes: 4aea8fe7 ("gallium/u_queue: fix random crashes when the app calls exit()")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6fb7935dedc87ffd767a2999f402ce1a46d18cce)

7 years agoconfigure.ac: print deprecation warning as needed
Emil Velikov [Mon, 17 Apr 2017 14:07:44 +0000 (15:07 +0100)]
configure.ac: print deprecation warning as needed

The warning should be printed only when one explicitly uses the
deprecated configure toggle.

Fixes: 7748c3f5eb1 ("configure.ac: deprecate --with-egl-platforms over
--with-platforms")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9915753e631a203d7f22bfa308fbf688644a8a37)

7 years agost/mesa: invalidate the readpix cache in st_indirect_draw_vbo
Marek Olšák [Sun, 9 Apr 2017 14:03:59 +0000 (16:03 +0200)]
st/mesa: invalidate the readpix cache in st_indirect_draw_vbo

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7cd6e2df65de9e2f0d77022a64c4e48ca2ebcb33)

7 years agowinsys/sw/dri: don't use GNU void pointer arithmetic
Emil Velikov [Sun, 16 Apr 2017 13:39:03 +0000 (14:39 +0100)]
winsys/sw/dri: don't use GNU void pointer arithmetic

Resolves build issues like the following:

src/gallium/winsys/sw/dri/dri_sw_winsys.c:203:31: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith]
        data = dri_sw_dt->data + (dri_sw_dt->stride * box->y) + box->x * blsize;
                               ^
src/gallium/winsys/sw/dri/dri_sw_winsys.c:203:62: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith]
        data = dri_sw_dt->data + (dri_sw_dt->stride * box->y) + box->x * blsize;
                                                              ^

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 309f4067a795219027f523bf0733692e48f2fd58)

7 years agovbo: fix gl_DrawID handling in glMultiDrawArrays
Nicolai Hähnle [Fri, 7 Apr 2017 16:24:44 +0000 (18:24 +0200)]
vbo: fix gl_DrawID handling in glMultiDrawArrays

Fixes a bug in
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 51deba0eb35d0d27560bb7dad24b8d39abb58be6)

7 years agomesa: move glMultiDrawArrays to vbo and fix error handling
Nicolai Hähnle [Fri, 7 Apr 2017 16:20:34 +0000 (18:20 +0200)]
mesa: move glMultiDrawArrays to vbo and fix error handling

When any count[i] is negative, we must skip all draws.

Moving to vbo makes the subsequent change easier.

v2:
- provide the function in all contexts, including GLES
- adjust validation accordingly to include the xfb check
v3:
- fix mix-up of pre- and post-xfb prim count (Nils Wallménius)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 42d5465b9ba85b4918b9e6fb57994720e3c8a80b)

7 years agomesa: extract need_xfb_remaining_prims_check
Nicolai Hähnle [Thu, 13 Apr 2017 19:06:11 +0000 (21:06 +0200)]
mesa: extract need_xfb_remaining_prims_check

The same logic needs to be applied to glMultiDrawArrays.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 756e9ebbdd84018382908d3556973a62dbda09ca)

7 years agomesa: fix remaining xfb prims check for GLES with multiple instances
Nicolai Hähnle [Thu, 13 Apr 2017 18:53:17 +0000 (20:53 +0200)]
mesa: fix remaining xfb prims check for GLES with multiple instances

Found by inspection.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ea9a8940cadb30ac8d72a26b82bdb54872c0e199)

7 years agoanv/cmd_buffer: Disable CCS on BDW input attachments
Nanley Chery [Thu, 13 Apr 2017 16:52:31 +0000 (09:52 -0700)]
anv/cmd_buffer: Disable CCS on BDW input attachments

The description under RENDER_SURFACE_STATE::RedClearColor says,

   For Sampling Engine Multisampled Surfaces and Render Targets:
    Specifies the clear value for the red channel.
   For Other Surfaces:
    This field is ignored.

This means that the sampler on BDW doesn't support CCS.

Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit d9d793696bf54e970491302605a1efd0aa182d1b)

7 years agoanv: blorp: flush memory after copy
Lionel Landwerlin [Mon, 17 Apr 2017 21:45:08 +0000 (14:45 -0700)]
anv: blorp: flush memory after copy

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d71efbe5f2a0ff934b8e9eeb96cd680a83bc0259)

7 years agofreedreno: fix crash if ctx torn down with no rendering
Rob Clark [Sat, 15 Apr 2017 16:32:17 +0000 (12:32 -0400)]
freedreno: fix crash if ctx torn down with no rendering

In this case, ctx->flush_queue would not have been initialized.

Fixes: 0b613c20 ("freedreno: enable draw/batch reordering by default")
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit d4601b0efc7f5e24e3f39fefa8e29e79560245ce)

7 years agoUpdate version to 17.1.0-rc1
Emil Velikov [Mon, 17 Apr 2017 13:51:04 +0000 (14:51 +0100)]
Update version to 17.1.0-rc1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoRevert "docs: add 17.2.0-devel release notes template, bump version"
Emil Velikov [Mon, 17 Apr 2017 13:30:44 +0000 (14:30 +0100)]
Revert "docs: add 17.2.0-devel release notes template, bump version"

This reverts commit 47dd2544e194029bdafb58b93d08c510571655d5.

Should have landed in the master branch

7 years agodocs: add 17.2.0-devel release notes template, bump version
Emil Velikov [Mon, 17 Apr 2017 13:27:10 +0000 (14:27 +0100)]
docs: add 17.2.0-devel release notes template, bump version

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoconfigure.ac: deprecate --with-egl-platforms over --with-platforms
Emil Velikov [Mon, 17 Apr 2017 12:29:41 +0000 (13:29 +0100)]
configure.ac: deprecate --with-egl-platforms over --with-platforms

Currently the former controls more than just EGL. With follow-up commits
we'll unwind and fix things so that one can build the different drivers
with said platform support.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoconfigure: remove egl platforms check
Emil Velikov [Thu, 8 Dec 2016 19:21:44 +0000 (19:21 +0000)]
configure: remove egl platforms check

The configure option is used by more than just EGL and with next commit
we'll rename it accordingly. Thus having the check will (and is atm)
incorrect.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agotravis: remove unneeded dri3/present proto requirement
Emil Velikov [Thu, 8 Dec 2016 19:21:42 +0000 (19:21 +0000)]
travis: remove unneeded dri3/present proto requirement

Signed-off-by: Emil Velikov <emil.lvelikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agoconfigure: remove unneeded dri3/present proto requirements
Emil Velikov [Thu, 8 Dec 2016 19:21:41 +0000 (19:21 +0000)]
configure: remove unneeded dri3/present proto requirements

We are not using either of these. The respecive xcb packages are used
instead.

v2: Rebase, reword commit message.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agoEGL: Implement the libglvnd interface for EGL (v3)
Kyle Brenneman [Wed, 4 Jan 2017 18:31:58 +0000 (11:31 -0700)]
EGL: Implement the libglvnd interface for EGL (v3)

The new interface mostly just sits on top of the existing library.

The only change to the existing EGL code is to split the client
extension string into platform extensions and everything else. On
non-glvnd builds, eglQueryString will just concatenate the two strings.

The EGL dispatch stubs are all generated. The script is based on the one
used to generate entrypoints in libglvnd itself.

v2: [Kyle]
 - Rebased against master.
 - Reworked the EGL makefile to use separate libraries
 - Made the EGL code generation scripts work with Python 2 and 3.
 - Change gen_egl_dispatch.py to use argparse for the command line arguments.
 - Assorted formatting and style cleanup in the Python scripts.

v3: [Emil Velikov]
 - Rebase
 - Remove separate glvnd glx/egl configure toggles

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoandroid: add marshal_generated c and h files to generated sources
Tapani Pälli [Thu, 16 Mar 2017 09:24:38 +0000 (11:24 +0200)]
android: add marshal_generated c and h files to generated sources

Fixes: efd63e2 ("mesa: Connect the generated GL command marshalling code to the build.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoconfigure.ac: honour --disable-libunwind if the .pc file is present
Emil Velikov [Thu, 13 Apr 2017 18:47:17 +0000 (19:47 +0100)]
configure.ac: honour --disable-libunwind if the .pc file is present

We should check the presence in order to determine if we should
[implicitly] set the CFLAGS/LIBS

v2: Drop spurious OMX hunk (Eric)

Cc: Eric Anholt <eric@anholt.net>
Reported-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agodocs: document the C++14 SWR requirement
Emil Velikov [Wed, 12 Apr 2017 14:11:14 +0000 (15:11 +0100)]
docs: document the C++14 SWR requirement

Earlier commit bumped the requirement for the SWR driver.

v2: Fold the note with the LLVM 3.9 one (Tim)

Fixes: 3c52a7316a1 ("swr: [configure.ac/scons] require c++14")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
7 years agowinsys/amdgpu: init buffer_indices_hashlist with memset()
Samuel Pitoiset [Fri, 14 Apr 2017 16:32:25 +0000 (18:32 +0200)]
winsys/amdgpu: init buffer_indices_hashlist with memset()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: simplify amdgpu_cs_add_buffer() a bit
Samuel Pitoiset [Fri, 14 Apr 2017 16:32:24 +0000 (18:32 +0200)]
winsys/amdgpu: simplify amdgpu_cs_add_buffer() a bit

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoi965/drm: Delete NULL check in brw_bo_unmap().
Kenneth Graunke [Wed, 12 Apr 2017 16:30:48 +0000 (09:30 -0700)]
i965/drm: Delete NULL check in brw_bo_unmap().

I accidentally moved the bo->bufmgr dereference above the NULL check
when cleaning up this code.

While passing NULL to free() is a common pattern...passing NULL to
unmap seems pretty bad.  You really ought to know whether you have
a buffer or not.  We don't want to paper over bugs like that.  So,
just drop the NULL check altogether.

CID: 1405006

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agointel/decoder: Fix is_header_field starting condition.
Kenneth Graunke [Wed, 12 Apr 2017 16:53:44 +0000 (09:53 -0700)]
intel/decoder: Fix is_header_field starting condition.

Starting positions >= 32 are not part of the header, rather than >.

Caught by Coverity, which found that "bits <<= field->start" may shift
by 32, which has undefined behavior.

CID: 1404968

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoi965/drm: Remove dead return in brw_bo_busy()
Kenneth Graunke [Wed, 12 Apr 2017 16:33:39 +0000 (09:33 -0700)]
i965/drm: Remove dead return in brw_bo_busy()

If ret is 0, we return.  If ret is not 0, we return.  This is dead.

CID: 1405013 (Structurally dead code (UNREACHABLE))

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoandroid: amd/addrlib: trivial fix for gfx9 support
Mauro Rossi [Sat, 1 Apr 2017 10:48:44 +0000 (12:48 +0200)]
android: amd/addrlib: trivial fix for gfx9 support

Fixes the following build error:

external/mesa/src/amd/addrlib/gfx9/gfx9addrlib.cpp:36:10: fatal error: 'gfx9_gb_reg.h' file not found
         ^
1 error generated.

Fixes: 7f160ef "amd/addrlib: import gfx9 support"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
7 years agonir: Add GLSL_TYPE_[U]INT64 to some switch statements
Jason Ekstrand [Fri, 14 Apr 2017 21:41:43 +0000 (14:41 -0700)]
nir: Add GLSL_TYPE_[U]INT64 to some switch statements

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agogallium/radeon: always flush asynchronously and wait after begin_new_cs
Marek Olšák [Thu, 13 Apr 2017 21:46:59 +0000 (23:46 +0200)]
gallium/radeon: always flush asynchronously and wait after begin_new_cs

This hides the overhead of everything in the driver after the CS flush and
before returning from pipe_context::flush.
Only microbenchmarks will benefit.

+2% FPS for glxgears.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: remove local variable 'mod' from si_compile_tgsi_shader
Marek Olšák [Wed, 15 Mar 2017 19:50:35 +0000 (20:50 +0100)]
radeonsi: remove local variable 'mod' from si_compile_tgsi_shader

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: add si_shader_selector::vs_needs_prolog
Marek Olšák [Tue, 21 Feb 2017 19:32:51 +0000 (20:32 +0100)]
radeonsi: add si_shader_selector::vs_needs_prolog

cleanup

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: don't set VGT_GS_MODE as part of the GS state
Marek Olšák [Wed, 12 Apr 2017 15:48:16 +0000 (17:48 +0200)]
radeonsi: don't set VGT_GS_MODE as part of the GS state

The VS state sets it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: don't allow user indices with indirect draws
Marek Olšák [Sun, 2 Apr 2017 14:22:54 +0000 (16:22 +0200)]
radeonsi: don't allow user indices with indirect draws

Not possible with GL and it will make future gallium rework easier.
(also it's something I wouldn't like to support)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: merge two if (indirect) statements
Marek Olšák [Sun, 2 Apr 2017 13:27:02 +0000 (15:27 +0200)]
radeonsi: merge two if (indirect) statements

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: don't mark non-dirty textures with CMASK as compressed
Marek Olšák [Fri, 7 Apr 2017 10:36:59 +0000 (12:36 +0200)]
radeonsi: don't mark non-dirty textures with CMASK as compressed

because the compression is skipped with non-dirty textures.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agodocs: Document interaction Fixes tag and stable branches.
Bas Nieuwenhuizen [Fri, 14 Apr 2017 21:39:15 +0000 (23:39 +0200)]
docs: Document interaction Fixes tag and stable branches.

For the next time I forget.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoglsl: don't run the GLSL pre-processor when we are skipping compilation
Timothy Arceri [Mon, 10 Apr 2017 01:48:49 +0000 (11:48 +1000)]
glsl: don't run the GLSL pre-processor when we are skipping compilation

This moves the hashing of shader source for the cache lookup to before
the preprocessor.  In our experience, shaders are unlikely to hash the
same after preprocessing if they didn't hash the same before, so we can
skip preprocessing for cache hits.

Improves Deus Ex start-up times with a warm cache from ~30 seconds to
~22 seconds.

Also fixes the leaking of state.

V2: fix indentation

v3: add the value of MESA_EXTENSION_OVERRIDE to the hash of the shader.

Tested-by (v2): Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agoglsl: delay optimisations on individual shaders when cache is available
Timothy Arceri [Mon, 10 Apr 2017 01:48:48 +0000 (11:48 +1000)]
glsl: delay optimisations on individual shaders when cache is available

Due to a max limit of 65,536 entries on the index table that we use to
decide if we can skip compiling individual shaders, it is very likely
we will have collisions.

To avoid doing too much work when the linked program may be in the
cache this patch delays calling the optimisations until link time.

Improves cold cache start-up times on Deus Ex by ~20 seconds.

When deleting the cache index to simulate a worst case scenario
of collisions in the index, warm cache start-up time improves by
~45 seconds.

V2: fix indentation, make sure to call optimisations on cache
fallback, make sure optimisations get called for XFB.

Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoanv: Add the pci_id into the shader cache UUID
Jason Ekstrand [Sat, 25 Feb 2017 00:36:00 +0000 (16:36 -0800)]
anv: Add the pci_id into the shader cache UUID

This prevents a user from using a cache created on one hardware
generation on a different one.  Of course, with Intel hardware, this
requires moving their drive from one machine to another but it's still
possible and we should prevent it.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
7 years agoetnaviv: native fence fd support
Philipp Zabel [Wed, 12 Apr 2017 10:31:01 +0000 (12:31 +0200)]
etnaviv: native fence fd support

This adds native fence fd support to etnaviv, similarly to commit
0b98e84e9ba0 ("freedreno: native fence fd"), enabled for kernel
driver version 1.1 or later.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agodocs: mark GL_ARB_vertex_attrib_64bit and OpenGL 4.2 as supported by i965/gen7+
Francisco Jerez [Fri, 14 Apr 2017 22:59:52 +0000 (15:59 -0700)]
docs: mark GL_ARB_vertex_attrib_64bit and OpenGL 4.2 as supported by i965/gen7+

v2 (Andreas Boll):
- Mark GL 4.1 as supported by i965/gen7+
- Mark GL_ARB_shader_precision as supported by i965/gen7+
- Update release notes

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965: enable OpenGL 4.2 in Ivybridge
Juan A. Suarez Romero [Wed, 29 Mar 2017 09:41:35 +0000 (11:41 +0200)]
i965: enable OpenGL 4.2 in Ivybridge

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965: enable ARB_shader_precision in gen7+
Samuel Iglesias Gonsálvez [Mon, 17 Oct 2016 14:40:06 +0000 (14:40 +0000)]
i965: enable ARB_shader_precision in gen7+

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965: enable ARB_vertex_attrib_64bit for gen7+
Juan A. Suarez Romero [Fri, 21 Oct 2016 14:57:25 +0000 (16:57 +0200)]
i965: enable ARB_vertex_attrib_64bit for gen7+

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoswr: Fix swr osmesa build
George Kyriazis [Fri, 14 Apr 2017 18:56:09 +0000 (13:56 -0500)]
swr: Fix swr osmesa build

Use GALLIUM_SWR to standardize

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoetnaviv: SINGLE_BUFFER support on GC3000
Wladimir J. van der Laan [Fri, 14 Apr 2017 07:44:27 +0000 (09:44 +0200)]
etnaviv: SINGLE_BUFFER support on GC3000

This patch adds support for the SINGLE_BUFFER feature on GC3000
GPUs, which allows rendering to a single buffer using multiple pixel
pipes.

This feature is always used when it is available, which means that
multi-tiled formats are no longer being used in that case, and all
buffers will be normal (super)tiled. This mimics the behavior of the
blob on GC3000.

- Because the same format can be used to render to and texture from,
  this avoids an extra resolve pass when rendering to texture.

- i.MX6qp includes a PRE which can scan-out directly from tiled formats,
  avoiding untiling overhead.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: Update includes from rnndb
Wladimir J. van der Laan [Fri, 14 Apr 2017 07:41:03 +0000 (09:41 +0200)]
etnaviv: Update includes from rnndb

Update to etna_viv commit 8486a97.

austriancoder: changed patch to include isa redefinition fix.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: Add chipMinorFeatures4 and 5
Wladimir J. van der Laan [Fri, 14 Apr 2017 07:39:52 +0000 (09:39 +0200)]
etnaviv: Add chipMinorFeatures4 and 5

Request chipMinorFeatures bitfields 4 and 5 from the
drm driver.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: resolve tile status when flushing resource
Philipp Zabel [Wed, 12 Apr 2017 14:13:37 +0000 (16:13 +0200)]
etnaviv: resolve tile status when flushing resource

When passing render buffers from EGL clients to a wayland compositor,
the resource tile status must be resolved because otherwise the tile
status is lost in the transfer and cleared parts of the buffer will
contain old contents.

The same applies when sampling directly from a renderable resource.

lst: Add seqno tracking, to skip flush when not needed.

Fixes: aadcb5e94b35 ("etnaviv: enable TS, but disable autodisable")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: stop repeatedly resolving an unchanged resource into its scanout prime buffer
Philipp Zabel [Wed, 12 Apr 2017 14:13:36 +0000 (16:13 +0200)]
etnaviv: stop repeatedly resolving an unchanged resource into its scanout prime buffer

Before resolving a resource into its scanout prime buffer, check that
the prime resource is actually older. If it is not, the resolve is an
expensive no-op, and we better skip it.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoswr: Add polygon stipple support
George Kyriazis [Sat, 1 Apr 2017 01:09:57 +0000 (20:09 -0500)]
swr: Add polygon stipple support

Add polygon stipple functionality to the fragment shader.

Explicitly turn off polygon stipple for lines and points, since we
do them using tris.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agodocs/relnotes: add GL_ARB_gpu_shader_fp64 support on i965/ivybridge
Samuel Iglesias Gonsálvez [Wed, 5 Apr 2017 04:23:43 +0000 (06:23 +0200)]
docs/relnotes: add GL_ARB_gpu_shader_fp64 support on i965/ivybridge

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
7 years agodocs: mark GL_ARB_gpu_shader_fp64 and OpenGL 4.0 as supported by i965/gen7+
Samuel Iglesias Gonsálvez [Tue, 11 Oct 2016 08:59:52 +0000 (10:59 +0200)]
docs: mark GL_ARB_gpu_shader_fp64 and OpenGL 4.0 as supported by i965/gen7+

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965: enable OpenGL 4.0 to Ivybridge/Baytrail
Samuel Iglesias Gonsálvez [Fri, 26 Aug 2016 05:39:04 +0000 (07:39 +0200)]
i965: enable OpenGL 4.0 to Ivybridge/Baytrail

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965: enable ARB_gpu_shader_fp64 for Ivybridge/Baytrail
Samuel Iglesias Gonsálvez [Fri, 26 Aug 2016 05:37:42 +0000 (07:37 +0200)]
i965: enable ARB_gpu_shader_fp64 for Ivybridge/Baytrail

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965: Use correct VertStride on align16 instructions.
Matt Turner [Fri, 20 Jan 2017 21:35:33 +0000 (13:35 -0800)]
i965: Use correct VertStride on align16 instructions.

In commit c35fa7a, we changed the "width" of DF source registers to 2,
which is conceptually fine. Unfortunately a VertStride of 2 is not
allowed by align16 instructions on IVB/BYT, and the regular VertStride
of 4 works fine in any case.

See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
for example:

cmp.ge.f0(8)    g18<1>DF        g1<0>.xyxyDF    -g8<2>DF        { align16 1Q };
        ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
cmp.ge.f0(8)    g19<1>DF        g1<0>.xyxyDF    -g9<2>DF        { align16 2N };
        ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed

v2:
- Add spec quote (Curro).
- Change the condition to only BRW_VERTICAL_STRIDE_2 (Curro)

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4/dce: improve track of partial flag register writes
Samuel Iglesias Gonsálvez [Fri, 17 Mar 2017 10:57:25 +0000 (11:57 +0100)]
i965/vec4/dce: improve track of partial flag register writes

This is required for correctness in presence of multiple 4-wide flag
writes (e.g. 4-wide instructions with a conditional mod set) which
update a different portion of the same 8-bit flag subregister.

Right now we keep track of flag dataflow with 8-bit granularity and
consider flag writes to have killed any previous definition of the
same subregister even if the write was less than 8 channels wide,
which can cause live flag register updates to be dead
code-eliminated incorrectly.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: don't do horizontal stride on some register file types
Samuel Iglesias Gonsálvez [Fri, 17 Mar 2017 10:55:49 +0000 (11:55 +0100)]
i965/vec4: don't do horizontal stride on some register file types

horiz_offset() shouldn't be doing anything for scalar registers,
because all channels of any SIMD instructions will end up reading or
writing the same component of the register, so shifting the register
offset would be wrong.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Re-implement in terms of is_uniform() for
  simplicity.  Pass argument by const reference.  Clarify commit
  message. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: Fix exec size for MOVs {SET,PICK}_{HIGH,LOW}_32BIT.
Matt Turner [Fri, 20 Jan 2017 21:35:32 +0000 (13:35 -0800)]
i965/vec4: Fix exec size for MOVs {SET,PICK}_{HIGH,LOW}_32BIT.

Otherwise for a pack_double_2x32_split opcode, we emit:

   vec1 64 ssa_135 = pack_double_2x32_split ssa_133, ssa_134
mov(8)          g5<1>UD         g5<4>.xUD                       { align16 1Q compacted };
mov(8)          g7<2>UD         g5<4,4,1>UD                     { align1 1Q };
        ERROR: When the destination spans two registers, the source must span two registers
               (exceptions for scalar source and packed-word to packed-dword expansion)
mov(8)          g8<2>UD         g5.4<4,4,1>UD                   { align1 2N };
        ERROR: The offset from the two source registers must be the same
mov(8)          g5<1>UD         g6<4>.xUD                       { align16 1Q compacted };
mov(8)          g7.1<2>UD       g5<4,4,1>UD                     { align1 1Q };
        ERROR: When the destination spans two registers, the source must span two registers
               (exceptions for scalar source and packed-word to packed-dword expansion)
mov(8)          g8.1<2>UD       g5.4<4,4,1>UD                   { align1 2N };
        ERROR: The offset from the two source registers must be the same

The intention was to emit mov(4)s for the instructions that have ERROR
annotations.

See tests/spec/arb_gpu_shader_fp64/execution/vs-isinf-dvec.shader_test
for example.

v2 (Samuel):
- Instead of setting the exec size to a fixed value, don't double it
(Curro).
- Add PICK_{HIGH,LOW}_32BIT to the condition.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Trivial rebase changes. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: use vec4_builder to emit instructions in setup_imm_df()
Samuel Iglesias Gonsálvez [Tue, 7 Mar 2017 09:29:53 +0000 (10:29 +0100)]
i965/vec4: use vec4_builder to emit instructions in setup_imm_df()

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Drop useless vec4_visitor dependencies.  Demote to
  static stand-alone function.  Don't write unused components in the
  result.  Use vec4_builder interface for register allocation. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: consider subregister offset in live variables
Juan A. Suarez Romero [Fri, 23 Sep 2016 15:57:39 +0000 (15:57 +0000)]
i965/vec4: consider subregister offset in live variables

Take into account offset values less than a full register (32 bytes)
when getting the var from register.

This is required when dealing with an operation that writes half of the
register (like one d2x in IVB/BYT, which uses exec_size == 4).

v2:
- Take in account this offset < 32 in liveness analysis too (Curro)

v3:
- Change formula in var_from_reg() (Curro)
- Remove useless changes (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: fix assert to detect SIMD lowered DF instructions in IVB
Francisco Jerez [Wed, 12 Apr 2017 23:54:49 +0000 (16:54 -0700)]
i965/vec4: fix assert to detect SIMD lowered DF instructions in IVB

On IVB, DF instructions have lowered the SIMD width to 4 but the
exec_size will be later doubled. Fix the assert to avoid crashing in
this case.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Simplify assert.  Except for the 'inst->group % 4
  == 0' part the assertion was redundant with the previous assertion. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: split VEC4_OPCODE_FROM_DOUBLE into one opcode per destination's type
Samuel Iglesias Gonsálvez [Fri, 24 Mar 2017 07:46:13 +0000 (08:46 +0100)]
i965/vec4: split VEC4_OPCODE_FROM_DOUBLE into one opcode per destination's type

This way we can set the destination type as double to all these new opcodes,
avoiding any optimizer's confusion that was happening before.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Drop no_spill workaround originally needed due to
  the bogus destination type of VEC4_OPCODE_FROM_DOUBLE. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: split d2x conversion and data gathering from one opcode to two explicit...
Samuel Iglesias Gonsálvez [Wed, 8 Mar 2017 08:27:49 +0000 (09:27 +0100)]
i965/vec4: split d2x conversion and data gathering from one opcode to two explicit ones

When doing a 64-bit to a smaller data type size conversion, the destination should
be aligned to 64-bits. Because of that, we need to gather the data after the
actual conversion.

Until now, these two operations were done by VEC4_OPCODE_FROM_DOUBLE but
now we split them explicitely in two different instructions:
VEC4_OPCODE_FROM_DOUBLE just do the conversion and
VEC4_OPCODE_PICK_LOW_32BIT will gather the data.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: fix VEC4_OPCODE_FROM_DOUBLE for IVB/BYT
Juan A. Suarez Romero [Fri, 23 Sep 2016 09:57:43 +0000 (09:57 +0000)]
i965/vec4: fix VEC4_OPCODE_FROM_DOUBLE for IVB/BYT

In the generator we must generate slightly different code for
Ivybridge/Baytrail, because of the way the stride works in
this hardware.

v2:
- Use stride and don't need to fix dst (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: keep original type when dealing with null registers
Juan A. Suarez Romero [Mon, 12 Sep 2016 16:06:22 +0000 (16:06 +0000)]
i965/vec4: keep original type when dealing with null registers

Keep the original type when dealing with null registers. Especially
because we do no want to introduce an implicit conversion between
types that could affect the conditional flags.

This affects especially when the original type is DF, and we are working
on Ivybridge/Baytrail.

v2 (Curro)
- Fix typo.
- Use retype() instead of applying the type directly.
- Remove unneeded retype.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/vec4: split DF instructions and later double its execsize in IVB/BYT
Samuel Iglesias Gonsálvez [Mon, 29 Aug 2016 08:10:30 +0000 (10:10 +0200)]
i965/vec4: split DF instructions and later double its execsize in IVB/BYT

We need to split DF instructions in two on IVB/BYT as it needs an
execsize 8 to process 4 DF values (one GRF in total).

v2:
- Rename helper and make it static inline function (Matt).
- Fix indention and add braces (Matt).

v3:
- Don't edit IR instruction when doubling exec_size (Curro)
- Add comment into the code (Curro).
- Manage ARF registers like the others (Curro)

v4:
- Add get_exec_type() function and use it to calculate the execution
  size.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Fix bogus 'type != BAD_FILE' check.  Take
  destination type as execution type where there is no valid source.
  Assert-fail if the deduced execution type is byte.  Clarify comment
  in get_lowered_simd_width().  Move SIMD width workaround outside of
  'if (...inst->size_written > REG_SIZE)' conditional block, since the
  problem should be independent of whether the amount of data written
  by the instruction is greater or lower than a GRF.  Drop redundant
  is_ivb_df definition.  Drop bogus inst->exec_size < 8 check.
  Simplify channel group assertion. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/fs: lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT
Samuel Iglesias Gonsálvez [Thu, 25 Aug 2016 14:05:24 +0000 (16:05 +0200)]
i965/fs: lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT

The hardware applies the same channel enable signals to both halves of
the compressed instruction which will be just wrong under non-uniform
control flow. Fix this by splitting those instructions to SIMD4.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/fs: Get 64-bit indirect moves working on IVB.
Francisco Jerez [Thu, 9 Feb 2017 18:16:58 +0000 (10:16 -0800)]
i965/fs: Get 64-bit indirect moves working on IVB.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
7 years agoi965: Use source region <1,2,0> when converting to DF.
Matt Turner [Fri, 13 Jan 2017 02:05:58 +0000 (18:05 -0800)]
i965: Use source region <1,2,0> when converting to DF.

Doing so allows us to use a single MOV in VEC4_OPCODE_TO_DOUBLE instead
of two.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
7 years agoi965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT
Juan A. Suarez Romero [Wed, 3 Aug 2016 11:51:44 +0000 (11:51 +0000)]
i965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT

According to the IVB and HSW PRMs:

"2.When the destination requires two registers and the sources are
 indirect, the sources must use 1x1 regioning mode."

So for DF instructions the execution size is not limited by the number
of address registers that are available, but by the EU decompression
logic not handling VxH indirect addressing correctly.

This patch limits the SIMD width to 4 in this case.

v2:
- Fix typo (Matt).
- Fix condition (Curro)

v3:
- Add spec quote (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/fs: fix dst stride in IVB/BYT type conversions
Juan A. Suarez Romero [Fri, 20 Jan 2017 07:50:50 +0000 (08:50 +0100)]
i965/fs: fix dst stride in IVB/BYT type conversions

When converting a DF to 32-bit conversions, we set dst stride to 2,
to fulfill alignment restrictions because the upper Dword of every
Qword will be written with undefined value.

But in IVB/BYT, this is not necessary, as each DF conversion already
writes 2, the first one the real value, and the second one a 0.
That is, IVB/BYT already set stride = 2 implicitly, so we must set it to
1 explicitly to avoid ending up with stride = 4.

v2:
- Fix typo (Matt)

v3:
- Fix stride in the destination's brw_reg, don't modity IR (Curro)

v4:
- Remove 'is_dst' argument of brw_reg_from_fs_reg() (Curro)
- Fix comment (Curro).
- Relax hstride assert (Curro)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Minor spelling fixes. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoi965/fs: rename lower_d2x to lower_conversions
Samuel Iglesias Gonsálvez [Tue, 14 Mar 2017 07:17:36 +0000 (08:17 +0100)]
i965/fs: rename lower_d2x to lower_conversions

v2:
- Change the name to lower_conversions.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agoRevert "i965/fs: Don't emit SEL instructions for type-converting MOVs."
Samuel Iglesias Gonsálvez [Tue, 28 Mar 2017 04:25:13 +0000 (06:25 +0200)]
Revert "i965/fs: Don't emit SEL instructions for type-converting MOVs."

This reverts commit 7dccd38b400d3a65da20ddefe282a7bb0b7ccb58.

d2x pass fixes SEL instructions when there is a type conversion
by doing a SEL without type conversion and then convert the result.
This pass also takes into account the non-uniform control flow.

Then, 7dccd38b400d3a65da20ddefe282a7bb0b7ccb58 is not needed anymore.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoi965/fs: generalize the legalization d2x pass
Samuel Iglesias Gonsálvez [Fri, 20 Jan 2017 07:47:05 +0000 (08:47 +0100)]
i965/fs: generalize the legalization d2x pass

Generalize it to lower any unsupported narrower conversion.

v2 (Curro):
- Add supports_type_conversion()
- Reuse existing intruction instead of cloning it.
- Generalize d2x to narrower and equal size conversions.

v3 (Curro):
- Make supports_type_conversion() const and improve it.
- Use foreach_block_and_inst to process added instructions.
- Simplify code.
- Add assert and improve comments.
- Remove redundant mov.
- Remove useless comment.
- Remove saturate == false assert and add support for saturation
  when fixing the conversion.
- Add get_exec_type() function.

v4 (Curro):
- Use get_exec_type() function to get sources' type.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>