git.osdn.net Git - android-x86/external-mesa.git/log

radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)

This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.

v2: get it right, reverse the polarity.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 36a1b61321561634c6b243cf876c347fef73dfa4)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/amd/vulkan/radv_meta_resolve.c

egl/x11: don't leak xfixes_query in the error path

If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c961b679fe16fc98c3d04d611abc287f1bcc07b5)

intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.

If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.

Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 271fa3a684ef0eefe99087c13d1abb099784163f)

i965/blit: Remember to include miptree buffer offset in relocs

Remember to add the offset to the start of the buffer in the relocation
or else we write 0xff into random bytes elsewhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit fb63c43fd1b7adb5cb4f34e7616e7d564ca178e5)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/mesa/drivers/dri/i965/intel_pixel_bitmap.c

i965: Delete pitch alignment assertion in get_blit_intratile_offset_el.

The cacheline alignment restriction is on the base address; the pitch
can be anything.

Fixes assertion failures when using primus (say, on glxgears, which
creates a 300x300 linear BGRX surface with a pitch of 1200):

intel_blit.c:190: get_blit_intratile_offset_el: Assertion `mt->surf.row_pitch % 64 == 0' failed.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 595a47b8293b1d97a3ae7dbfa8db703bfb4e7aae)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/mesa/drivers/dri/i965/intel_blit.c

ac/nir: fix lsb emission

This makes it match radeonsi. The LLVM backend itself will emit the
correct instruction, but LLVM might do incorrect optimizations since it
thinks the output is undefined when the input is 0, even though it's not
supposed to be. We really need a new intrinsic, or for the backend to
become smarter and recognize this pattern.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
(cherry picked from commit 6d731c5651ea98551e0bf0c1a8896d5ea63558d5)
[Andres Gomez: nir_to_llvm_context not yet converted into ac_llvm_context]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/amd/common/ac_nir_to_llvm.c

docs: add sha256 checksums for 17.1.6

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

docs: add release notes for 17.1.6

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Update version to 17.1.6

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

swr/rast: fix scons gen_knobs.h dependency

Copy/paste error was duplicating a gen_knobs.cpp rule.

Fixes: 5079c277b57 ("swr: [scons] Fix windows build")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit e4a6ae06cf01a21d7fe32e3ff2fc441102d68f82)

radv: Don't underflow non-visible VRAM size.

In some APU situations the reported visible size can be larger than
VRAM size. This properly clamps the value.

Surprisingly both CTS and spec seem to allow a heap type with size 0,
so this seemed like the easiest option to me.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 4ae84efbc5c "radv: Use enum for memory heaps."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8229706ad86b27ed571f17872006a488fcd35378)
[Emil Velikov: branch uses radeon_info::visible_vram_size]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/amd/vulkan/radv_device.c

spirv: Fix SpvImageFormatR16ui

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.1 17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 95c6a97464e7baaca6e09f829da0be5ac8c50297)

dri3: Wait for all pending swapbuffers to be scheduled before touching the front

This implements a wait for glXWaitGL, glXCopySubBuffer, dri flush_front and
creation of fake front until all pending SwapBuffers have been committed to
hardware. Among other things this fixes piglit glx-copy-sub-buffers on dri3.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 185ef06fd2db782d9d3d6046580f7cece02c4797)

gallium/radeon: fix ARB_query_buffer_object conversion to boolean

The issue here is that the immediate is treated as a 64-bit value,
and fetching it does not work reliably with swizzles that are different
from xy and zw.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit da83687c4ba7e9022f6f14176393a9e3c6391ed5)

fixup! cherry-ignore: add a bunch more commits to the list

nir: fix algebraic optimizations

The optimizations are only valid for 32-bit integers. They were
mistakenly firing for 64-bit integers as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit de914615753678c5514733a37ac7d0360a43e525)

cherry-ignore: add a bunch more commits to the list

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

etnaviv: fix memory leak when BO allocation fails

The resource struct is already allocated at this point and should be
freed properly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
(cherry picked from commit 4fb9f97047eb1e43c47cb7cacba27ccd20383eff)

st/dri: Check get-handle return value in queryImage

In the DRIImage queryImage hook, check if resource_get_handle() failed
and return FALSE if so.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b4a18f13ce7f0e7d0307fb3388819345616752ce)
[Emil Velikov: drop offset and modifier hunks - not in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/state_trackers/dri/dri2.c

radv: for stencil only set Z tile mode index to same value

On SI this was causing a hang in
dEQP-VK.pipeline.render_to_image.core.2d_array.mipmap.r16g16_sint_s8_uint

This was due to not handling the tile mode index for depth like
I fixed previously for new GPUs.

Fixes: 01d0c5a9 (radv: fix stencil regression since new addrlib import)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 800d1622096ca52b955bdfc20eb770b80ef15221)
[Emil Velikov: XXX]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/amd/vulkan/radv_device.c

radv/ac: port SI TC L1 write corruption fix.

This ports 72e46c988 to radv.
radeonsi: apply a TC L1 write corruption workaround for SI

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e77ff11ffe1a52b8e17a847f263746c849db3f11)

radv/ac: realign SI workaround with radeonsi.

This ports: da7453666ae
radeonsi: don't apply the Z export bug workaround to Hainan
to radv.

Just noticed in passing.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a81e99f50a718790de379087c9f5a636e32b2a28)

radv: fix buffer views on SI/CIK.

Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
on SI/CIK with radv.

Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ca82ef5ac75e50abb109986b55002cca24f7c0fb)

radv: fix non-0 based layer clears.

If the layer base was > 0, it wasn't getting passed as the start
instance or getting added in the shaders.

Fixes CTS dEQP-VK.api.image_clearing.core.clear_color_attachment.2d_r8_uint_multiple_layers

Fixes: 7e0382fb (radv: add support for layered clears (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 75392e76adf143070a5f208febd8da31b39b7676)

swr: remove unneeded fallback strcasecmp define

The last user of the function was removed with earlier commit.

Fixes: 50842e8a931 ("swr: replace gallium->swr format enum conversion")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit a0755f2e6a1b41b2c5e295fa5ff8eb8dfbf5eb41)

i965: use strtol to convert the integer deviceID override

One can override the deviceID, by setting the INTEL_DEVID_OVERRIDE
variable. A few symbolic names or a numerical value for the actual
device ID is accepted.

At the same time we're using strtod (string to double) to convert the
string to a decimal numeral. A seeming thinko, made by the original
commit that introduces the code in libdrm_intel and got here with the
import.

Fixes: 514db96c117a ("i965: Import libdrm_intel.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 647b5a18df6e423e1a15d92bc767ba0cf04493a3)

anv/pipeline: do not use BITFIELD64_BIT()

In the previous commit, forgot to apply v2 suggestions.

Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check
enable vertex inputs)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 5cd4ece34ebdc1383f1e2376c88097d06544e2f6)

travis: lower SWR requirement to GCC 4.8, aka std=c++11

With ealier commit we relaxed the requirement from C++14 to C++11.
Update the build script so that it

Cc: Tim Rowley <timothy.o.rowley@intel.com
Fixes: 0b80b025021 ("swr: relax c++ requirement from c++14 to c++11")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 459274144dbd9227a57858316b996cede9094bca)

i965: Resolve framebuffers before signaling the fence

From KHR_fence_sync:

  When the condition of the sync object is satisfied by the fence
  command, the sync is signaled by the associated client API context,
  causing any eglClientWaitSyncKHR commands (see below) blocking on
  <sync> to unblock. The only condition currently supported is
  EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR, which is satisfied by
  completion of the fence command corresponding to the sync object,
  and all preceding commands in the associated client API context's
  command stream. The sync object will not be signaled until all
  effects from these commands on the client API's internal and
  framebuffer state are fully realized. No other state is affected by
  execution of the fence command.

If clients are passing the fence fd (from EGL_ANDROID_native_fence_sync)
to a compositor, that fence must only be signaled once the framebuffer
is resolved and not before as is currently the case.

v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad)

Reported-by: Sergi Granell <xerpi.g.12@gmail.com>
Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sergi Granell <xerpi.g.12@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Chad Versace <chadversary@chromium.org>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 618be8cc1ad1760103930b69ffbf528d7b861ab3)

bin/cherry-ignore: add radeonsi "fix of a fix"

The commit addresses an earlier fix, which did not land in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add yet another bindless textures fix

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "st/glsl_to_tgsi: fix getting the image type for array of structs"

Addresses commit which did not land in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add bindless textures fix

The bindless work did not land in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: ignore reverted st/mesa commit

Applied to master and reverted shortly afterwords.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add a couple of radeonsi/gfx9 commits

They depend on the merged shaders (re)work which landed past the 17.1
branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "swr: fix transform feedback logic"

Explicit 17.2 nomination, since it depends on refactoring past the 17.1
branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "swr/rast: non-regex knob fallback code for gcc < 4.9"

Addresses commit merged past the 17.1 brancpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add a couple of radeon commits

Both are explicit 17.2 nominations, since they depend on work which
landed past the 17.1 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

gallium/radeon: make S_FIXED function signed and move it to shared code

This fixes a bug uncovered by:
2412c4c81ea0488df865817a0de91ec46e359b72
util: Make CLAMP turn NaN into MIN.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 433f6f7ac9ed6624fec02cc055c3bfa247dba185)

radeonsi/gfx9: reduce max threads per block to 1024 on gfx9+

The number of supported waves per thread group has been reduced to 16
with gfx9. Trying to use 32 waves causes hangs, and barriers might
not work correctly with > 16 waves.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a0e6b9a2db5aa5f06a4f60d270aca8344e7d8b3f)
[Emil Velikov: add a HAVE_LLVM check, as applicable in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeon/r600_pipe_common.c

radeonsi: fix detection of DRAW_INDIRECT_MULTI on SI

The firmware version numbers for SI were wrong. The new numbers are probably
too conservative (we don't have a definitive answer by the firmware team),
but DRAW_INDIRECT_MULTI has been confirmed to work with these versions on
Tahiti (by Gustaw) and on Verde (by myself).

While this is technically adding a feature, it's a feature we thought we had
for a long time. The change is small enough and we're early enough in the 17.2
release cycle that it should still go in.

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 65fbaab0b74b6b5a2ac483d48beeefa0a29ff15e)

anv: only expose up to 28 vertex attributes

The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 31f1863ace73d31a579e5c36252a957818ad09cf)

anv/cmd_buffer: fix off by one error in assertion

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a848e693efc8e2a1d355dc1076409968b374153f)

cherry-ignore: add "i965: Fix = vs == in MCS aux usage assert."

Addesses 0f9b609cf4f, which landed shortly before the 17.2 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "i965: Fix offset addition in get_isl_surf"

Addesses 63a43f41619, which landed shortly before the 17.2 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

i965: perf: flush batchbuffers at the beginning of queries

As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batchs.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never now what's the fill rate of the
current batch buffer.

If we move the flushing at the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).

Fixes: adafe4b733c02 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9f439ae1201cb049ffedb9b0e2d4f393fb0a761e)

broadcom/vc4: Prefer blit via rendering to the software fallback.

I don't know how I managed to leave this here for so long. Found when
working on a 1:1 overlapping blit extension for X11.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 93fec49a75ce799bb6fe167f9409fd553a5781c6)

etnaviv: Clear lbl_usage array correctly

Fill the entire array instead of just a quarter. This avoids
crashes with large shaders.
(currently this never causes a problem because shaders larger than 2048/4
instructions are not supported by this driver on any hardware, but it will
cause problems in the future)

Fixes: ec436051899 ("etnaviv: fix shader miscompilation with more than 16 labels")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 15a1ceb127b70ac98b03cae051927f75fb7ee204)

swr: don't forget to link AVX/AVX2 against pthreads

Seems like the backends have been using pthreads since day one, yet
we've been missing the link.

With later commit we'll fix a typo, hence the libraries will be build
with -Wl,no-undefined, aka failing the build on unresolved symbols.

v2: Split from a larger patch.

Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: c6e67f5a9373e916a8d2 "gallium/swr: add OpenSWR rasterizer"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 33d397ada50a1d1f485205e847003dc48146ec19)
[Emil Velikov: add PTHREAD_LIBS to COMMON_LIBADD]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/swr/Makefile.am

cherry-ignore: add "anv: Transition MCS buffers from the undefined layout"

Depends on earlier refactoring commit 6235f08ff8870636d89d2181e0a9dfc3ebec7b45

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

swr/rast: quit using linux-specific gettid()

Linux-specific gettid() syscall shouldn't be used in portable code.
Fix does assume a 1:1 thread:LWP architecture, but works for our
current target platforms and can be revisited later if needed.

Fixes unresolved symbol in linux scons builds.

v2: add comment in code about the 1:1 assumption.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit d1e7153228304eb1be85580cbfdea1a57c5f203b)

gallium/util: fix nondeterministic avx512 detection

cpuid.7 requires cx=0 to select the extended feature leaf.

avx512 detection was using the non-indexed cpuid resulting
in random non-detection of avx512.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 131b9f644cbe70728ba02878483e22459400bcb4)

anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT

We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.

Found by assertion failures while enabling VK_ANDROID_native_buffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5d6905211355464de4885492511e5f9d936cc058)

swrast: add dri2ConfigQueryExtension to the correct extension list

The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.

v2: Rebase

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
(cherry picked from commit 7791949dadd5af707055d0076874177e5e8e2133)
[Emil Velikov: drop st/dri hunk, squash correct swrast piece]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

nir: Use nir_src_copy instead of direct assignments.

If the source is an indirect register, there is ralloc'd data. Copying
with a direct assignment will copy the pointer, but the data will still
belong to the old instruction's memory context. Since we're lowering
and throwing away instructions, that could free the data by mistake.

Instead, use nir_src_copy, which properly handles this.

This is admittedly not a common case, so I think the bug is real,
but unlikely to be hit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 0320bb2c6cb27370e2389b392b63f8d05c7cb4c7)
[Emil Velikov: drop nir_lower_atomics_to_ssbo.c - not in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/compiler/nir/nir_lower_atomics_to_ssbo.c

nir: fix nir_opt_copy_prop_vars() for arrays of arrays

Previously we only incremented the guide for a single
dimension/wildcard.

V2: rework logic to avoid code duplication

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3f0fb23b039443d581d221b1fe9158f9cc81ccd6)

nir/vars_to_ssa: Handle missing struct members in foreach_deref_node

This can happen if, for instance, you have an array of structs and there
are both direct and wildcard references to the same struct and some
members only have direct or only have indirect.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecf91898e0a8e144adb82d72aecf1224e77ee31b)

anv/image: Add INPUT_ATTACHMENT to the list of required usages

From the Vulkan 1.0.53 spec VU for vkCreateImageView:

    "image must have been created with a usage value containing at least
    one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
    VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
    VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from out list.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c5700ed72e765043bb1c8523a05ade235496e053)

anv: Stop leaking the no_aux sampler surface state

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cbdfd1daa24ee9a7a612f7b0e9aa4610af05e211)

anv/cmd_buffer: Properly handle render passes with 0 attachments

We were early returning and never created the NULL surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bd41564746ca4f4bd46185b99754eaa012c359e5)

st/va: Fix scaling list ordering for H.265

Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order. Therefore, rearrange the
elements when copying.

v2: Move scan tables to vl_zscan.c.
Fix type in size assertion.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 63dcfed81f011dae5ca68af3369433be28135415)

radv: advertise v6 of the wayland surface extension

Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4168c162c5bcbbfc6c712466b9c3d7d0dbac06e5)

anv: advertise v6 of the wayland surface extension

Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 43c188f9708b3e80b9f1c9c4c6bb16ac94b5ce5e)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/intel/vulkan/anv_device.c

configure: only install khrplatform.h if needed

khrplatform.h is only used by EGL and GLES; let's only install it when
one of those is enabled.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jussi Kukkonen <jussi.kukkonen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8821ef4be1009328fc0bbf651feda6377efcd6b6)

anv/pipeline: use unsigned long long constant to check enable vertex inputs

When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking if the enabled bits
in inputs_read.

But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.

Thus, use 1ull, which is an unsigned long long value.

This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 28d0c38d85d94cab23667049f03ea072b8e7907c)

radeonsi/gfx9: fix crash building monolithic merged ES-GS shader

Forwarding from the ES prolog to the ES just barely exceeds the current
maximum array size when 16 vertex attributes are used. Give it a decent
bump to account for merged shaders having up to 32 user SGPRs.

Fixes a crash in GL45-CTS.multi_bind.draw_bind_vertex_buffers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c22e3c5373ad0df160f13fe8271c32e8d7e61b43)
[Emil Velikov: resolve trivial conflicts - drop initial[] hunk]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c

loader/dri3: Use dri3_find_back in loader_dri3_swap_buffers_msc

If the application hasn't done any drawing since the last call, we
would reuse the same back buffer which was used for the previous swap,
which may not have completed yet. This could result in various issues
such as tearing or application hangs.

In the normal case, the behaviour is unchanged.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97957
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101683
Cc: mesa-stable@lists.freedesktop.org
[Michel Dänzer: Make Thomas' fix from bugzilla actually work as
intended, write commit log]

(cherry picked from commit 81fb1547772d42c527318837d4207ecdb6899e5d)

st/mesa: always unconditionally revalidate main framebuffer after SwapBuffers

This fixes the black Feral launcher window.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101867

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
(cherry picked from commit 7257c171e9eadc05903140cffa26a253f0d0178a)

nv50/ir: fix threads calculation for non-compute shaders

We were using the "cp" union fields, which are only valid for compute
shaders. The threads calculation affects the available GPRs, so just
pick a small number for other shader types to avoid limiting available
registers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3645268748c44825ce8d37bf03f684731eb2652a)

cherry-ignore: add "anv: Round u_vector element sizes to a power of two"

The commit addresses issue brought up with 08413a81b93dc537fb0c3.
With the latter missing in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

svga: fix texture swizzle writemasking

Commit bfe1e7737a76e3b046 changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.

This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).

Fixes issues seen with ETQW apitrace.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit f7e78abdf45b26f3991dc336120162ae01b208f1)

docs: add sha256 checksums for 17.1.5

Signed-off-by: Andres Gomez <agomez@igalia.com>

docs: add release notes for 17.1.5

Signed-off-by: Andres Gomez <agomez@igalia.com>

Update version to 17.1.5

Signed-off-by: Andres Gomez <agomez@igalia.com>

spirv: Fix reaching unreachable for compare exchange on images

We were hitting the
unreachable("Invalid image opcode")
near the end of vtn_handle_image when parsing the
SpvOpAtomicCompareExchange opcode.

v2: Add stable CC.
v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel
capability which is not exposed in Vulkan, and spirv_to_nir is not used
for OpenCL which does support it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b117f59710e62f4afa5781c554f8113e2b0df9cc)

svga: fix PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE value

This query is supposed to return the max texture buffer size/width in
texels, not size in bytes. Divide by 16 (the largest format size) to
return texels.

Fixes Piglit arb_texture_buffer_object-max-size test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by :Charmaine Lee <charmainel@vmware.com>

(cherry picked from commit 3b28eaabf603657c388caa72bc92b1b660d00b2a)

st/wgl: improve selection of pixel format

Current selection of pixel format does not enforce the request of
stencil or depth buffer if the color depth is not the same as
requested.

For instance, GLUT requests a 32-bit color buffer with an 8-bit
stencil buffer, but because color buffers are only 24-bit, no
priority is given to creating a stencil buffer.

This patch gives more priority to the creation of requested buffers
and less priority to the difference in bit depth.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101703
Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 80c6598cdba36edb43d618f97175103e560d61a1)

svga: fixed surface size to include array size

This patch fixes the total surface size in surface cache
to include array size as well.

Tested with MTT glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit adead35320c0afe95f3f170a6047905179f8c6c3)

svga: loop over box.depth for ReadBack_image on each slice

piglit test ext_texture_array-gen-mipmap is fixed with this patch.

Tested with mtt piglit, glretrace, viewperf and conform. No regression.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 31fe1d10b291bcd1b9ee376d53db05028719831d)

glsl: do not call link_xfb_stride_layout_qualifiers() for fragment shaders

xfb only applies to the latest stage before the fragment shader, so
there is no need to invoke it in the fragment shader.

Fixes:
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list_and_api

v2: do reset only if shaders provide an explicit stride

v3: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
(Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 860919a3b237386cba5b2951ae520bf6734fd17e)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

intel/isl: Add the maximum surface size limit

V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and
2^38 bytes for gen9+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit c07271fef095164c8bcfb54fdc95567c3774a866)

intel/isl: Use uint64_t to store total surface size

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 70229782370c7ed9a63e05689f4d8bfc80128dd9)

gallium/radeon: fix a possible crash for buffer exports

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit e6dbe975eff8e23992c9d9a72ce302896b5fecfc)

etnaviv: don't dereference etna_resource pointer if allocation fails

The check for the pointer being non-NULL was being done too late.

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit a6893a50c8ae5b68e4175366dac718ee9f6fa9d1)

svga: clamp device line width to at least 1 to fix HWv8 line stippling

The line stipple fallback code for virtual HW version 8 didn't work.

With HW version 8, we were getting zero when querying the max line
widths (AA and non-AA).  This means we were setting the draw module's
wide line threshold to zero.  This caused the wide line stage to always
get enabled.  That caused the line stipple module to fall because the
wide line stage was clobbering the rasterization state with a state
object setting the line stipple pattern to 0xffff.

Now the wide_lines variable in draw's validate_pipeline() will not
be incorrectly set.

Also improve debug output.

BTW, also this fixes several other piglit tests: polygon-mode,
primitive- restart-draw-mode, and line-flat-clip-color since they
all use the draw module fallback.

See VMware bug 1895811.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit c2b92dada076afc303e31e3d029256d234254c27)

Squashed with:

svga: adjust line subpixel position for HWv8

This fixes two regressions on HWv8:
  Piglit gl-1.0-ortho-pos
  Piglit/glean fbo
This was caused by commit c2b92dada076a "svga: clamp device line width
to at least 1 to fix HWv8 line stippling"

This also fixes two conform tests: Vertex Order and Polygon Face

No Piglit/conform changes with HWv9 or later.

VMware bug 1905053

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 5b8d33acefa9adbf1f0c9ff10f1933a0b3a5c66b)

ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics

The NIR parameters are ordered "compare, data", matching GLSL, but both
the image and buffer LLVM intrinsics take them the other way around.
This is already handled correctly for SSBO atomics.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
(cherry picked from commit c2a5cb64272da3cd8d97b0a58da6c6992b0417d3)

etnaviv: fix refcnt initialization in etna_screen

Despite being a member of the etna_screen struct, 'refcnt' is used by
the winsys-specific logic to track the reference count of the object
managed in a hash table. When the count reaches zero, the pipe screen
is removed from the table and destroyed.

Fix the logic by initializing the refcnt to 1 when screen created.
This initialization is done in etna_screen_create(), to follow the
same logic as in freedreno and virgl.

Fixes: c9e8b49b885 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5d8514de14bd27170293bb373e06f5ff43c708ad)

swr/rast: Correctly allocate SWR_STATS memory as cacheline aligned

Cacheline alignment of SWR_STATS to prevent sharing of cachelines
between threads (performance).

Gets rid of gcc-7.1 warning about using c++17's over-aligned new
feature.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit bab03c06fc79ec5624982777684d0c5f123c127c)

swr/rast: _mm*_undefined_* implementations for gcc<4.9

Define these in terms of setzero for ancient gcc versions which don't
have the undefined intrinsics.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit f0a22956be4802e01f2b4f3244f011212626f12d)

Squashed with:

swr: modifications to allow gcc-4.8 compilation

Code unconditionally used avx2 intrinsics in the avx compilation.

The simd intrinsics library we used has diverged significantly between
branch and master; the non-“undefined intrinsics” portion is specific
to the branch.

This complements:
f0a22956be “swr/rast: _mm*_undefined_* implementations for gcc<4.9”

And makes this branch equivalent with the additional master patch:
d50ef7332c “swr/rast: don't use _mm256_fmsub_ps in AVX code”

scons: Check for xlocale.h before defining HAVE_XLOCALE_H.

Don't assume the header is present on some platforms - use the more
robust CheckHeader() instead.

glibc 2.26 removed xlocale.h.
https://sourceware.org/glibc/wiki/Release/2.26#Removal_of_.27xlocale.h.27

Fix this build error with glibc 2.26.

Compiling src/util/strtod.c ...
src/util/strtod.c:32:10: fatal error: xlocale.h: No such file or directory
#include <xlocale.h>
^~~~~~~~~~~

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101657
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c5d0dc7fa5566941a49ede8c83a0cfe0a33a3d7f)

mesa/main: Move NULL pointer check.

In blit_framebuffer we're already doing a NULL
pointer check for readFb and drawFb so it makes
sense to do it before we actually use the pointers.

CID: 1412569
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit b3b61211157ab934f1898d3519e7288c1fd89d80)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

st/va: Fix leak in VAAPI subpictures

sampler view allocated in vaAssociateSubpicture is not cleared
in vaiDeassociateSubpicture.

Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit b1a359b7d8a0559412d253101e930a6a45d9af7a)

glsl: gl_Max{Vertex,Fragment}UniformComponents exist in all desktop GL versions

The current implementation assumed that these were replaced in GLSL >= 4.10
by gl_Max{Vertex,Fragment}UniformVectors, however this is not true: both
built-ins should be produced from GLSL 4.10 onwards.

This was raised by new CTS tests that are in development.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b70d6a2de1c90409c7a2e0d6484f350558f5c2ac)

intel: common: Fix link failure with standalone Android build

Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).

Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.

Fixes: d5b355ce5fd ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 50a8a7377ae071d5b4b927e9055a7ec8391acc59)

glsl: check if any of the named builtins are available first

_mesa_glsl_has_builtin_function is used to determine whether any variant
of a builtin are available, for the purpose of enforcing the GLSL ES
3.00+ rule that overloads or overrides of builtins are disallowed.

However the builtin_builder contains information on all builtins,
irrespective of parse state, or versions, or extension enablement. As a
result we would say that a builtin existed even if it was not actually
available.

To resolve this, first check if at least one signature is available for
a builtin before returning true.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101666
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 880f21f55d579fe2183255d031c23343da30f69e)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

nir/spirv: Use the type from the deref for atomics

Previously, we were using the type of the variable which is incorrect.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit a10d887ad1eba7baca74b8fbd9c9ec171dbcf3a7)

ac/nir: implement 64-bit packing and unpacking

We implement the split opcodes, and tell NIR to lower the original ones.
The lowering to LLVM is a little more complicated, but NIR can optimize
the split ones a little better, and some NIR lowering passes that we
might want to use (particularly for doubles) emit the split ones.

This should fix pack/unpackDouble2x32, which seems like a bug since when
we enabled the Float64 capability. It will also fix pack/unpackInt2x32
when we enable the Int64 capability.

Fixes: 798ae37c ("radv: Enable Float64 support.")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7168425dd77f37fa048de5a4639619763556c331)

spirv: fix OpBitcast when the src and dst bitsize are different (v3)

Before, we were just implementing it with a move, which is incorrect
when the source and destination have different bitsizes. To implement
it properly, we need to use the 64-bit pack/unpack opcodes. Since
glslang uses OpBitcast to implement packInt2x32 and unpackInt2x32, this
should fix them on anv (and radv once we enable the int64 capability).

v2: make supporting non-32/64 bit easier (Jason)
v3: add another assert (Jason)

Fixes: b3135c3c ("anv: Advertise shaderInt64 on Broadwell and above")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 196e6b60b1e392c5e55c07a9f9b4e85dad52fb66)

st/mesa: release EGLImage on EGLImageTarget* error

The smapi->get_egl_image() call in st_egl_image_get_surface() stores a
reference to the EGLImage's texture in stimg.texture. That reference is
released via pipe_resource_reference(&stimg.texture, NULL) before stimg
goes out of scope at the end of the function, but not in the error path
if !is_format_supported().

Fixes: 83e9de25f325 ("st/mesa: EGLImageTarget* error handling")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7d7bcd65d6019dfb63f31138a426fe2a043016db)

winsys/radeon: only call pb_slabs_reclaim when slabs are actually used

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100242
Fixes: fb827c055cb1 ("winsys/radeon: enable buffer allocation from slabs")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b0b4b5e8f7a25dd11c1662d339d68c9733e9b2dc)