git.osdn.net Git - android-x86/external-mesa.git/log

cherry-ignore: add yet another bindless textures fix

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "st/glsl_to_tgsi: fix getting the image type for array of structs"

Addresses commit which did not land in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add bindless textures fix

The bindless work did not land in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: ignore reverted st/mesa commit

Applied to master and reverted shortly afterwards.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add a couple of radeonsi/gfx9 commits

They depend on the merged shaders (re)work which landed past the 17.1
branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "swr: fix transform feedback logic"

Explicit 17.2 nomination, since it depends on refactoring past the 17.1
branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "swr/rast: non-regex knob fallback code for gcc < 4.9"

Addresses commit merged past the 17.1 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add a couple of radeon commits

Both are explicit 17.2 nominations, since they depend on work which
landed past the 17.1 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

gallium/radeon: make S_FIXED function signed and move it to shared code

This fixes a bug uncovered by:
2412c4c81ea0488df865817a0de91ec46e359b72
util: Make CLAMP turn NaN into MIN.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 433f6f7ac9ed6624fec02cc055c3bfa247dba185)

radeonsi/gfx9: reduce max threads per block to 1024 on gfx9+

The number of supported waves per thread group has been reduced to 16
with gfx9. Trying to use 32 waves causes hangs, and barriers might
not work correctly with > 16 waves.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a0e6b9a2db5aa5f06a4f60d270aca8344e7d8b3f)
[Emil Velikov: add a HAVE_LLVM check, as applicable in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeon/r600_pipe_common.c

radeonsi: fix detection of DRAW_INDIRECT_MULTI on SI

The firmware version numbers for SI were wrong. The new numbers are probably
too conservative (we don't have a definitive answer from the firmware team),
but DRAW_INDIRECT_MULTI has been confirmed to work with these versions on
Tahiti (by Gustaw) and on Verde (by myself).

While this is technically adding a feature, it's a feature we thought we had
for a long time. The change is small enough and we're early enough in the 17.2
release cycle that it should still go in.

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 65fbaab0b74b6b5a2ac483d48beeefa0a29ff15e)

anv: only expose up to 28 vertex attributes

The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 31f1863ace73d31a579e5c36252a957818ad09cf)
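
The arithmetic in the anv vertex attribute commit above can be sketched as follows. This is only an illustration of the numbers quoted in the message; the constant and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical names; the values are the ones quoted in the commit message. */
enum {
    EU_GRF_LIMIT              = 128, /* GRFs available per EU thread */
    MAX_URB_ENTRY_READ_LENGTH = 15,  /* max "Vertex URB Entry Read Length" in SIMD8 */
    GRFS_PER_READ_LENGTH_UNIT = 8,   /* the message multiplies the read length by 8 */
    GRFS_PER_VERTEX_ELEMENT   = 4,
    RESERVED_SLOTS            = 2    /* VertexIndex/InstanceIndex and DrawID */
};

static int max_exposed_vertex_attributes(void)
{
    int grfs = MAX_URB_ENTRY_READ_LENGTH * GRFS_PER_READ_LENGTH_UNIT; /* 120 */
    assert(grfs <= EU_GRF_LIMIT);      /* the read length is the tighter bound */

    int elements = grfs / GRFS_PER_VERTEX_ELEMENT;                     /* 30 */
    return elements - RESERVED_SLOTS;                                  /* 28 */
}
```

The read-length bound (120 GRFs) is tighter than the 128 GRF limit, which is why the result is 28 rather than 30 or 32.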

anv/cmd_buffer: fix off by one error in assertion

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a848e693efc8e2a1d355dc1076409968b374153f)

cherry-ignore: add "i965: Fix = vs == in MCS aux usage assert."

Addresses 0f9b609cf4f, which landed shortly before the 17.2 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

cherry-ignore: add "i965: Fix offset addition in get_isl_surf"

Addresses 63a43f41619, which landed shortly before the 17.2 branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

i965: perf: flush batchbuffers at the beginning of queries

As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batches.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never know what's the fill rate of the
current batch buffer.

If we move the flushing to the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).

Fixes: adafe4b733c02 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9f439ae1201cb049ffedb9b0e2d4f393fb0a761e)

broadcom/vc4: Prefer blit via rendering to the software fallback.

I don't know how I managed to leave this here for so long. Found when
working on a 1:1 overlapping blit extension for X11.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 93fec49a75ce799bb6fe167f9409fd553a5781c6)

etnaviv: Clear lbl_usage array correctly

Fill the entire array instead of just a quarter. This avoids
crashes with large shaders.
(currently this never causes a problem because shaders larger than 2048/4
instructions are not supported by this driver on any hardware, but it will
cause problems in the future)

Fixes: ec436051899 ("etnaviv: fix shader miscompilation with more than 16 labels")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 15a1ceb127b70ac98b03cae051927f75fb7ee204)
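
A minimal sketch of one common way to end up clearing only a quarter of an array, assuming 4-byte elements: passing an element count where memset expects a byte count. The array name mirrors the etnaviv commit title above; NUM_LABELS and the helper functions are hypothetical, and this is not claimed to be the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_LABELS 2048 /* hypothetical size, echoing the 2048 figure in the message */

/* Returns how many leading entries hold -1. */
static size_t count_cleared(const int *lbl_usage, size_t n)
{
    size_t i = 0;
    assert(lbl_usage != NULL);
    while (i < n && lbl_usage[i] == -1)
        i++;
    return i;
}

/* Buggy variant: memset's third argument is a byte count, but an element
 * count is passed, so only NUM_LABELS bytes are written. */
static size_t clear_by_element_count(void)
{
    int lbl_usage[NUM_LABELS] = {0};
    memset(lbl_usage, -1, NUM_LABELS);
    return count_cleared(lbl_usage, NUM_LABELS);
}

/* Fixed variant: sizeof covers the whole array in bytes. */
static size_t clear_by_sizeof(void)
{
    int lbl_usage[NUM_LABELS] = {0};
    memset(lbl_usage, -1, sizeof(lbl_usage));
    return count_cleared(lbl_usage, NUM_LABELS);
}
```

With a 4-byte int, the buggy variant initializes only the first NUM_LABELS / 4 entries, which matches the "2048/4 instructions" threshold mentioned in the message.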

swr: don't forget to link AVX/AVX2 against pthreads

Seems like the backends have been using pthreads since day one, yet
we've been missing the link.

With a later commit we'll fix a typo, hence the libraries will be built
with -Wl,no-undefined, aka failing the build on unresolved symbols.

v2: Split from a larger patch.

Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: c6e67f5a9373e916a8d2 "gallium/swr: add OpenSWR rasterizer"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 33d397ada50a1d1f485205e847003dc48146ec19)
[Emil Velikov: add PTHREAD_LIBS to COMMON_LIBADD]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/swr/Makefile.am

cherry-ignore: add "anv: Transition MCS buffers from the undefined layout"

Depends on earlier refactoring commit 6235f08ff8870636d89d2181e0a9dfc3ebec7b45

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

swr/rast: quit using linux-specific gettid()

Linux-specific gettid() syscall shouldn't be used in portable code.
Fix does assume a 1:1 thread:LWP architecture, but works for our
current target platforms and can be revisited later if needed.

Fixes unresolved symbol in linux scons builds.

v2: add comment in code about the 1:1 assumption.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit d1e7153228304eb1be85580cbfdea1a57c5f203b)

gallium/util: fix nondeterministic avx512 detection

cpuid.7 requires ecx=0 to select the extended feature leaf.

avx512 detection was using the non-indexed cpuid resulting
in random non-detection of avx512.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 131b9f644cbe70728ba02878483e22459400bcb4)
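
As a rough illustration of the indexed query described in the avx512 commit above, the sketch below reads CPUID leaf 7 with the sub-leaf explicitly set to 0. It assumes GCC or Clang on x86 (the `<cpuid.h>` helpers `__get_cpuid_max` and `__cpuid_count`); the function names are hypothetical and this is not Mesa's u_cpu_detect code. A complete AVX-512 check would also need to confirm operating system support for the wider register state, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define AVX512F_EBX_BIT (1u << 16) /* CPUID.(EAX=7,ECX=0):EBX bit 16 */

static bool ebx_reports_avx512f(unsigned ebx)
{
    return (ebx & AVX512F_EBX_BIT) != 0;
}

/* Reads EBX of CPUID leaf 7, sub-leaf 0. Returns false if unavailable. */
static bool query_leaf7_ebx(unsigned *ebx_out)
{
    assert(ebx_out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (__get_cpuid_max(0, NULL) < 7u)
        return false;
    /* The second argument loads ECX; leaving it unset is what makes the
     * result depend on whatever happened to be in the register. */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;
    *ebx_out = ebx;
    return true;
#else
    return false; /* not an x86 target */
#endif
}

static bool cpu_has_avx512f(void)
{
    unsigned ebx = 0;
    return query_leaf7_ebx(&ebx) && ebx_reports_avx512f(ebx);
}
```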

anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT

We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.

Found by assertion failures while enabling VK_ANDROID_native_buffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5d6905211355464de4885492511e5f9d936cc058)

swrast: add dri2ConfigQueryExtension to the correct extension list

The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.

v2: Rebase

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
(cherry picked from commit 7791949dadd5af707055d0076874177e5e8e2133)
[Emil Velikov: drop st/dri hunk, squash correct swrast piece]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

nir: Use nir_src_copy instead of direct assignments.

If the source is an indirect register, there is ralloc'd data. Copying
with a direct assignment will copy the pointer, but the data will still
belong to the old instruction's memory context. Since we're lowering
and throwing away instructions, that could free the data by mistake.

Instead, use nir_src_copy, which properly handles this.

This is admittedly not a common case, so I think the bug is real,
but unlikely to be hit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 0320bb2c6cb27370e2389b392b63f8d05c7cb4c7)
[Emil Velikov: drop nir_lower_atomics_to_ssbo.c - not in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/compiler/nir/nir_lower_atomics_to_ssbo.c

nir: fix nir_opt_copy_prop_vars() for arrays of arrays

Previously we only incremented the guide for a single
dimension/wildcard.

V2: rework logic to avoid code duplication

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3f0fb23b039443d581d221b1fe9158f9cc81ccd6)

nir/vars_to_ssa: Handle missing struct members in foreach_deref_node

This can happen if, for instance, you have an array of structs and there
are both direct and wildcard references to the same struct and some
members only have direct or only have indirect.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecf91898e0a8e144adb82d72aecf1224e77ee31b)

anv/image: Add INPUT_ATTACHMENT to the list of required usages

From the Vulkan 1.0.53 spec VU for vkCreateImageView:

    "image must have been created with a usage value containing at least
    one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
    VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
    VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c5700ed72e765043bb1c8523a05ade235496e053)

anv: Stop leaking the no_aux sampler surface state

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cbdfd1daa24ee9a7a612f7b0e9aa4610af05e211)

anv/cmd_buffer: Properly handle render passes with 0 attachments

We were early returning and never created the NULL surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bd41564746ca4f4bd46185b99754eaa012c359e5)

st/va: Fix scaling list ordering for H.265

Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order. Therefore, rearrange the
elements when copying.

v2: Move scan tables to vl_zscan.c.
Fix type in size assertion.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 63dcfed81f011dae5ca68af3369433be28135415)
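
The reordering described in the st/va commit above can be sketched as below. This assumes the up-right diagonal scan that H.265 uses for scaling lists and builds the table at run time; the actual tables in vl_zscan.c are not reproduced here, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fills scan[i] with the raster index (y * size + x) of the i-th entry in
 * up-right diagonal order: each anti-diagonal runs from bottom-left to
 * top-right. */
static void build_diag_scan(int size, uint8_t *scan)
{
    int i = 0;
    for (int d = 0; d < 2 * size - 1; d++) {   /* anti-diagonal: x + y == d */
        for (int x = 0; x <= d; x++) {
            int y = d - x;
            if (x < size && y < size)
                scan[i++] = (uint8_t)(y * size + x);
        }
    }
    assert(i == size * size);
}

/* Copies a raster-order list into diagonal order. */
static void raster_to_diag(const uint8_t *raster, uint8_t *diag, int size)
{
    uint8_t scan[64];
    assert(size > 0 && size * size <= 64);
    build_diag_scan(size, scan);
    for (int i = 0; i < size * size; i++)
        diag[i] = raster[scan[i]];
}
```

For a 4x4 list whose raster entries are 0..15, the diagonal order produced by this sketch starts 0, 4, 1, 8, 5, 2.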

radv: advertise v6 of the wayland surface extension

Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4168c162c5bcbbfc6c712466b9c3d7d0dbac06e5)

anv: advertise v6 of the wayland surface extension

Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 43c188f9708b3e80b9f1c9c4c6bb16ac94b5ce5e)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/intel/vulkan/anv_device.c

configure: only install khrplatform.h if needed

khrplatform.h is only used by EGL and GLES; let's only install it when
one of those is enabled.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jussi Kukkonen <jussi.kukkonen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8821ef4be1009328fc0bbf651feda6377efcd6b6)

anv/pipeline: use unsigned long long constant to check enable vertex inputs

When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking the enabled bits
in inputs_read.

But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.

Thus, use 1ull, which is an unsigned long long value.

This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 28d0c38d85d94cab23667049f03ea072b8e7907c)
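
The shift-width problem in the anv/pipeline commit above can be sketched as follows. EXAMPLE_GENERIC0 is a hypothetical stand-in for VERT_ATTRIB_GENERIC0, chosen so that location 15 lands on bit 32 as in the message; the function names are illustrative only. Note that in C, shifting a 32-bit int by 32 or more is undefined behavior, so the sketch never performs the narrow shift and only shows the 64-bit form.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_GENERIC0 17u /* hypothetical stand-in for VERT_ATTRIB_GENERIC0 */

/* Builds the mask with a 64-bit constant so bits 32..63 are representable. */
static uint64_t input_mask(unsigned location)
{
    unsigned bit = EXAMPLE_GENERIC0 + location;
    assert(bit < 64);
    return 1ull << bit;
}

static int input_is_read(uint64_t inputs_read, unsigned location)
{
    return (inputs_read & input_mask(location)) != 0;
}
```

Truncating `input_mask(15)` to 32 bits yields 0, which is the lost-bit effect the commit message describes.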

radeonsi/gfx9: fix crash building monolithic merged ES-GS shader

Forwarding from the ES prolog to the ES just barely exceeds the current
maximum array size when 16 vertex attributes are used. Give it a decent
bump to account for merged shaders having up to 32 user SGPRs.

Fixes a crash in GL45-CTS.multi_bind.draw_bind_vertex_buffers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c22e3c5373ad0df160f13fe8271c32e8d7e61b43)
[Emil Velikov: resolve trivial conflicts - drop initial[] hunk]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c

loader/dri3: Use dri3_find_back in loader_dri3_swap_buffers_msc

If the application hasn't done any drawing since the last call, we
would reuse the same back buffer which was used for the previous swap,
which may not have completed yet. This could result in various issues
such as tearing or application hangs.

In the normal case, the behaviour is unchanged.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97957
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101683
Cc: mesa-stable@lists.freedesktop.org
[Michel Dänzer: Make Thomas' fix from bugzilla actually work as
intended, write commit log]

(cherry picked from commit 81fb1547772d42c527318837d4207ecdb6899e5d)

st/mesa: always unconditionally revalidate main framebuffer after SwapBuffers

This fixes the black Feral launcher window.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101867

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
(cherry picked from commit 7257c171e9eadc05903140cffa26a253f0d0178a)

nv50/ir: fix threads calculation for non-compute shaders

We were using the "cp" union fields, which are only valid for compute
shaders. The threads calculation affects the available GPRs, so just
pick a small number for other shader types to avoid limiting available
registers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3645268748c44825ce8d37bf03f684731eb2652a)

cherry-ignore: add "anv: Round u_vector element sizes to a power of two"

The commit addresses an issue brought up with 08413a81b93dc537fb0c3,
and the latter is missing in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

svga: fix texture swizzle writemasking

Commit bfe1e7737a76e3b046 changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.

This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).

Fixes issues seen with ETQW apitrace.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit f7e78abdf45b26f3991dc336120162ae01b208f1)

docs: add sha256 checksums for 17.1.5

Signed-off-by: Andres Gomez <agomez@igalia.com>

docs: add release notes for 17.1.5

Signed-off-by: Andres Gomez <agomez@igalia.com>

Update version to 17.1.5

Signed-off-by: Andres Gomez <agomez@igalia.com>

spirv: Fix reaching unreachable for compare exchange on images

We were hitting the
unreachable("Invalid image opcode")
near the end of vtn_handle_image when parsing the
SpvOpAtomicCompareExchange opcode.

v2: Add stable CC.
v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel
capability which is not exposed in Vulkan, and spirv_to_nir is not used
for OpenCL which does support it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b117f59710e62f4afa5781c554f8113e2b0df9cc)

svga: fix PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE value

This query is supposed to return the max texture buffer size/width in
texels, not size in bytes. Divide by 16 (the largest format size) to
return texels.

Fixes Piglit arb_texture_buffer_object-max-size test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>

(cherry picked from commit 3b28eaabf603657c388caa72bc92b1b660d00b2a)

st/wgl: improve selection of pixel format

Current selection of pixel format does not enforce the request of
stencil or depth buffer if the color depth is not the same as
requested.

For instance, GLUT requests a 32-bit color buffer with an 8-bit
stencil buffer, but because color buffers are only 24-bit, no
priority is given to creating a stencil buffer.

This patch gives more priority to the creation of requested buffers
and less priority to the difference in bit depth.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101703
Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 80c6598cdba36edb43d618f97175103e560d61a1)

svga: fixed surface size to include array size

This patch fixes the total surface size in surface cache
to include array size as well.

Tested with MTT glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit adead35320c0afe95f3f170a6047905179f8c6c3)

svga: loop over box.depth for ReadBack_image on each slice

piglit test ext_texture_array-gen-mipmap is fixed with this patch.

Tested with mtt piglit, glretrace, viewperf and conform. No regression.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 31fe1d10b291bcd1b9ee376d53db05028719831d)

glsl: do not call link_xfb_stride_layout_qualifiers() for fragment shaders

xfb only applies to the last stage before the fragment shader, so
there is no need to invoke it in the fragment shader.

Fixes:
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list_and_api

v2: do reset only if shaders provide an explicit stride

v3: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
(Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 860919a3b237386cba5b2951ae520bf6734fd17e)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

intel/isl: Add the maximum surface size limit

V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and
2^38 bytes for gen9+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit c07271fef095164c8bcfb54fdc95567c3774a866)

intel/isl: Use uint64_t to store total surface size

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 70229782370c7ed9a63e05689f4d8bfc80128dd9)
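
A minimal sketch of the two intel/isl commits above taken together: the size limits quoted in the first message, checked against a total held in a 64-bit integer as in the second. The function names, the pitch-times-rows size computation, and the choice of an inclusive comparison are assumptions for illustration, not the isl implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limits as quoted in the commit message: 2^31 bytes before gen9, 2^38 on gen9+. */
static uint64_t max_surface_size_bytes(int gen)
{
    assert(gen > 0);
    return gen >= 9 ? (UINT64_C(1) << 38) : (UINT64_C(1) << 31);
}

/* The total is computed in 64 bits; a 32-bit total would wrap at 2^32 and
 * could make an oversized surface look small. */
static bool surface_size_ok(int gen, uint64_t row_pitch_bytes, uint64_t rows)
{
    uint64_t total = row_pitch_bytes * rows;
    return total <= max_surface_size_bytes(gen);
}
```

For example, a 65536-byte pitch with 65536 rows is 2^32 bytes: over the pre-gen9 limit, under the gen9+ limit, and exactly 0 if the product were truncated to 32 bits.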

gallium/radeon: fix a possible crash for buffer exports

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit e6dbe975eff8e23992c9d9a72ce302896b5fecfc)

etnaviv: don't dereference etna_resource pointer if allocation fails

The check for the pointer being non-NULL was being done too late.

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit a6893a50c8ae5b68e4175366dac718ee9f6fa9d1)

svga: clamp device line width to at least 1 to fix HWv8 line stippling

The line stipple fallback code for virtual HW version 8 didn't work.

With HW version 8, we were getting zero when querying the max line
widths (AA and non-AA).  This means we were setting the draw module's
wide line threshold to zero.  This caused the wide line stage to always
get enabled.  That caused the line stipple module to fail because the
wide line stage was clobbering the rasterization state with a state
object setting the line stipple pattern to 0xffff.

Now the wide_lines variable in draw's validate_pipeline() will not
be incorrectly set.

Also improve debug output.

BTW, this also fixes several other piglit tests: polygon-mode,
primitive-restart-draw-mode, and line-flat-clip-color since they
all use the draw module fallback.

See VMware bug 1895811.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit c2b92dada076afc303e31e3d029256d234254c27)

Squashed with:

svga: adjust line subpixel position for HWv8

This fixes two regressions on HWv8:
  Piglit gl-1.0-ortho-pos
  Piglit/glean fbo
This was caused by commit c2b92dada076a "svga: clamp device line width
to at least 1 to fix HWv8 line stippling"

This also fixes two conform tests: Vertex Order and Polygon Face

No Piglit/conform changes with HWv9 or later.

VMware bug 1905053

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 5b8d33acefa9adbf1f0c9ff10f1933a0b3a5c66b)

ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics

The NIR parameters are ordered "compare, data", matching GLSL, but both
the image and buffer LLVM intrinsics take them the other way around.
This is already handled correctly for SSBO atomics.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
(cherry picked from commit c2a5cb64272da3cd8d97b0a58da6c6992b0417d3)

etnaviv: fix refcnt initialization in etna_screen

Despite being a member of the etna_screen struct, 'refcnt' is used by
the winsys-specific logic to track the reference count of the object
managed in a hash table. When the count reaches zero, the pipe screen
is removed from the table and destroyed.

Fix the logic by initializing the refcnt to 1 when the screen is created.
This initialization is done in etna_screen_create(), to follow the
same logic as in freedreno and virgl.

Fixes: c9e8b49b885 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5d8514de14bd27170293bb373e06f5ff43c708ad)

swr/rast: Correctly allocate SWR_STATS memory as cacheline aligned

Cacheline alignment of SWR_STATS to prevent sharing of cachelines
between threads (performance).

Gets rid of gcc-7.1 warning about using c++17's over-aligned new
feature.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit bab03c06fc79ec5624982777684d0c5f123c127c)

swr/rast: _mm*_undefined_* implementations for gcc<4.9

Define these in terms of setzero for ancient gcc versions which don't
have the undefined intrinsics.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit f0a22956be4802e01f2b4f3244f011212626f12d)

Squashed with:

swr: modifications to allow gcc-4.8 compilation

Code unconditionally used avx2 intrinsics in the avx compilation.

The simd intrinsics library we used has diverged significantly between
branch and master; the non-“undefined intrinsics” portion is specific
to the branch.

This complements:
f0a22956be “swr/rast: _mm*_undefined_* implementations for gcc<4.9”

And makes this branch equivalent with the additional master patch:
d50ef7332c “swr/rast: don't use _mm256_fmsub_ps in AVX code”

scons: Check for xlocale.h before defining HAVE_XLOCALE_H.

Don't assume the header is present on some platforms - use the more
robust CheckHeader() instead.

glibc 2.26 removed xlocale.h.
https://sourceware.org/glibc/wiki/Release/2.26#Removal_of_.27xlocale.h.27

Fix this build error with glibc 2.26.

Compiling src/util/strtod.c ...
src/util/strtod.c:32:10: fatal error: xlocale.h: No such file or directory
#include <xlocale.h>
^~~~~~~~~~~

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101657
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c5d0dc7fa5566941a49ede8c83a0cfe0a33a3d7f)

mesa/main: Move NULL pointer check.

In blit_framebuffer we're already doing a NULL
pointer check for readFb and drawFb so it makes
sense to do it before we actually use the pointers.

CID: 1412569
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit b3b61211157ab934f1898d3519e7288c1fd89d80)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

st/va: Fix leak in VAAPI subpictures

The sampler view allocated in vaAssociateSubpicture is not cleared
in vaDeassociateSubpicture.

Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit b1a359b7d8a0559412d253101e930a6a45d9af7a)

glsl: gl_Max{Vertex,Fragment}UniformComponents exist in all desktop GL versions

The current implementation assumed that these were replaced in GLSL >= 4.10
by gl_Max{Vertex,Fragment}UniformVectors, however this is not true: both
built-ins should be produced from GLSL 4.10 onwards.

This was raised by new CTS tests that are in development.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b70d6a2de1c90409c7a2e0d6484f350558f5c2ac)

intel: common: Fix link failure with standalone Android build

Some reshuffle in the Makefiles under src/intel resulted in Android
libraries no longer being linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).

Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
the affected module on Android NDK builds.

Fixes: d5b355ce5fd ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 50a8a7377ae071d5b4b927e9055a7ec8391acc59)

glsl: check if any of the named builtins are available first

_mesa_glsl_has_builtin_function is used to determine whether any variant
of a builtin is available, for the purpose of enforcing the GLSL ES
3.00+ rule that overloads or overrides of builtins are disallowed.

However the builtin_builder contains information on all builtins,
irrespective of parse state, or versions, or extension enablement. As a
result we would say that a builtin existed even if it was not actually
available.

To resolve this, first check if at least one signature is available for
a builtin before returning true.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101666
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 880f21f55d579fe2183255d031c23343da30f69e)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

nir/spirv: Use the type from the deref for atomics

Previously, we were using the type of the variable which is incorrect.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit a10d887ad1eba7baca74b8fbd9c9ec171dbcf3a7)

ac/nir: implement 64-bit packing and unpacking

We implement the split opcodes, and tell NIR to lower the original ones.
The lowering to LLVM is a little more complicated, but NIR can optimize
the split ones a little better, and some NIR lowering passes that we
might want to use (particularly for doubles) emit the split ones.

This should fix pack/unpackDouble2x32, which seems to have been broken
since we enabled the Float64 capability. It will also fix
pack/unpackInt2x32 when we enable the Int64 capability.

Fixes: 798ae37c ("radv: Enable Float64 support.")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7168425dd77f37fa048de5a4639619763556c331)

spirv: fix OpBitcast when the src and dst bitsize are different (v3)

Before, we were just implementing it with a move, which is incorrect
when the source and destination have different bitsizes. To implement
it properly, we need to use the 64-bit pack/unpack opcodes. Since
glslang uses OpBitcast to implement packInt2x32 and unpackInt2x32, this
should fix them on anv (and radv once we enable the int64 capability).

v2: make supporting non-32/64 bit easier (Jason)
v3: add another assert (Jason)

Fixes: b3135c3c ("anv: Advertise shaderInt64 on Broadwell and above")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 196e6b60b1e392c5e55c07a9f9b4e85dad52fb66)
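
A minimal sketch of what the 64-bit pack/unpack operations compute, written
as plain C rather than NIR or SPIR-V. The function names below only mirror
the split opcodes for readability and are not Mesa API. It shows why a
bitcast between a pair of 32-bit values and one 64-bit value cannot be a
plain move.

```c
#include <stdint.h>

/* Combine two 32-bit halves into one 64-bit value, low half first. */
static uint64_t pack_64_2x32_split(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Extract the low 32 bits of a 64-bit value. */
static uint32_t unpack_64_2x32_split_x(uint64_t v)
{
    return (uint32_t)v;
}

/* Extract the high 32 bits of a 64-bit value. */
static uint32_t unpack_64_2x32_split_y(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```

Packing and then unpacking returns the original halves, which is the
behaviour packInt2x32 and unpackInt2x32 rely on.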

st/mesa: release EGLImage on EGLImageTarget* error

The smapi->get_egl_image() call in st_egl_image_get_surface() stores a
reference to the EGLImage's texture in stimg.texture. That reference is
released via pipe_resource_reference(&stimg.texture, NULL) before stimg
goes out of scope at the end of the function, but not in the error path
if !is_format_supported().

Fixes: 83e9de25f325 ("st/mesa: EGLImageTarget* error handling")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7d7bcd65d6019dfb63f31138a426fe2a043016db)

winsys/radeon: only call pb_slabs_reclaim when slabs are actually used

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100242
Fixes: fb827c055cb1 ("winsys/radeon: enable buffer allocation from slabs")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b0b4b5e8f7a25dd11c1662d339d68c9733e9b2dc)

cherry-ignore: i965: Fix anisotropic filtering for mag filter

extra: References 6a7c5257cac, but only because the later f8d69beed49
introduced a regression, and the latter didn't land.

Signed-off-by: Andres Gomez <agomez@igalia.com>

swr: Limit memory held by defer deleted resources.

This patch limits the number of items on the fence work queue (the
deferred deletion list) by submitting a sync fence when the queue size
exceeds a threshold.  This initiates deferred deletion of all resources
on the list and decreases the total amount of memory held waiting for
"deferred deletion".

This resolves bug 101467 filed against swr for the piglit
streaming-texture-leak test.  For those running on smaller memory
(16GB?) systems, this will prevent oom-killer.

Thus far, we have not seen any real world applications that exhibit
behavior like the streaming-texture-leak test; as any form of pipeline
flush will trigger the defer queue and properly free any retained
allocations.  But, this addresses those as well.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 32c1a54bd01465e77a8e26b9cc8d2487b31509c5)

etnaviv: fix shader miscompilation with more than 16 labels

The labels array may change its virtual address on a reallocation, so
it is invalid to cache pointers into the array. Rather than using the
pointer directly, remember the array index.

Fixes miscompilation of shaders in glmark2 ideas, leading to GPU hangs.

Fixes: c9e8b49b (etnaviv: gallium driver for Vivante GPUs)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit ec43605189907fa327a4a7f457aa3c822cfdea5d)
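
The pattern behind this fix can be sketched as follows. This is not the
etnaviv compiler code: the struct and function names are hypothetical, and
the initial capacity of 16 is an assumption chosen to match the commit
title. The point is that realloc() may move the array, so callers keep an
index and look the element up again when they need it.

```c
#include <stdlib.h>

struct label {
    int inst_idx;
};

struct label_array {
    struct label *data;
    size_t count;
    size_t capacity;
};

/* Append a label and return its index, or (size_t)-1 if allocation fails.
 * The array may move in memory here, so pointers into it become stale. */
static size_t label_array_add(struct label_array *arr, int inst_idx)
{
    if (arr->count == arr->capacity) {
        size_t new_cap = arr->capacity ? arr->capacity * 2 : 16;
        struct label *p = realloc(arr->data, new_cap * sizeof(*p));
        if (!p)
            return (size_t)-1;
        arr->data = p;
        arr->capacity = new_cap;
    }
    arr->data[arr->count].inst_idx = inst_idx;
    return arr->count++;
}

/* Resolve an index to a pointer only at the moment of use. */
static struct label *label_array_get(struct label_array *arr, size_t idx)
{
    return idx < arr->count ? &arr->data[idx] : NULL;
}
```

An index stays valid across any number of later additions, whereas a
pointer taken before the 17th label is added may dangle afterwards.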

ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffers

The buffer intrinsics should be used instead of the image ones.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 909184ac9cf59f23803915773f5659f05c161394)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

ac/nir: Make intrinsic_name buffer long enough

When using cmpswap on an image, it was being truncated to
llvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely.

v2: Add stable CC

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 6fc41bb4d59313985f67b1276d1fd1225be09426)
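
The truncation described above is easy to reproduce with snprintf(). The
sketch below is not the ac/nir code: the helper name, the 32-byte buffer
size and the "v4i32" coords suffix are assumptions for illustration. It
checks snprintf()'s return value to detect that the name did not fit.

```c
#include <stdio.h>
#include <stddef.h>

/* Write "<base>.<coords_type>" into buf. Returns 1 if the whole name fit,
 * 0 if it was truncated (snprintf reports the length it wanted to write). */
static int build_intrinsic_name(char *buf, size_t size,
                                const char *base, const char *coords_type)
{
    int needed = snprintf(buf, size, "%s.%s", base, coords_type);
    return needed >= 0 && (size_t)needed < size;
}
```

With a 32-byte buffer, the 32-character base name alone already loses its
final character to the terminating NUL, and the suffix is dropped entirely.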

i965: Always set AALINEDISTANCE_TRUE on Sandybridge.

We set this unconditionally on every other platform. Zero (Manhattan)
isn't even listed as an option in the Sandybridge docs - only "true".

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 4878ab9bd4281c4554254bbb0c62faae453bb863)

i965: Use true AA line distance on G45/Ironlake.

The original Broadwater and Crestline platforms computed antialiased
line distances using "manhattan" distance, aka a + b = c.  Eaglelake
and Cantiga added "true" distance, which apparently does something
like max(a, b) + min(a, b) / 4.  Not exactly "true", but at least
more accurate.

The G45 documentation indicates that the old manhattan distance setting
is "only for debug purposes" and should never be used.  The Ironlake
documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it
does still contain the narrative about the feature.

At any rate, we should use the more accurate mode.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit b625bcc601a16fab1962f9ed569700d3d08738b9)
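
To make the difference between the two modes concrete, here is a small
sketch of both formulas as quoted in the message above. The "true" variant
uses the max(a, b) + min(a, b) / 4 approximation, which the message itself
only describes as what the hardware "apparently" does, so treat it as
illustrative. The function names are made up for this example.

```c
/* Absolute value without pulling in libm. */
static float absf(float v)
{
    return v < 0.0f ? -v : v;
}

/* "Manhattan" distance: a + b. */
static float aa_distance_manhattan(float a, float b)
{
    return absf(a) + absf(b);
}

/* Approximate "true" distance: max(a, b) + min(a, b) / 4. */
static float aa_distance_true_approx(float a, float b)
{
    float x = absf(a);
    float y = absf(b);
    float hi = x > y ? x : y;
    float lo = x > y ? y : x;
    return hi + lo / 4.0f;
}
```

For a = 3 and b = 4 the Euclidean length is 5; the manhattan mode gives 7,
while the approximation gives 4.75.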

radeon/winsys: Limit max allocation size to 70% of VRAM

The CL CTS queries the max allocation size, and then attempts to
allocate buffers of that size. If not enough contiguous RAM/VRAM is
available, this causes errors in the radeon kernel module due to
inability to allocate the required memory.

It's a bit of a hack, but experimentally on my system, I can use ~3/4
of the card's VRAM for a single global/constant buffer allocation given
current GUI/compositor use.

For a 1GB Pitcairn (HD7850) this gets me from the reported clinfo values of:
Global memory size                              2143076352 (1.996GiB)
Max memory allocation                           1500153446 (1.397GiB)
Max constant buffer size                        1500153446 (1.397GiB)

To:
Global memory size                              2143076352 (1.996GiB)
Max memory allocation                           751619276 (716MiB)
Max constant buffer size                        751619276 (716MiB)

Fixes: OpenCL CTS test/conformance/api/min_max_mem_alloc_size,
       OpenCL CTS test/conformance/api/min_max_constant_buffer_size

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e4d06e4c531157f1f3e4683487ee9c81fa0cff9b)
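
The clinfo figures above are consistent with taking 70% of the reported
global memory size before the change and 70% of the 1 GiB of VRAM after it.
The helper below is only a worked check of that arithmetic using integer
math; its name is hypothetical, and the winsys code may compute the limit
differently.

```c
#include <stdint.h>

/* 70% of a memory size in bytes, rounded down. */
static uint64_t seventy_percent(uint64_t bytes)
{
    return bytes * 7 / 10;
}
```

Applied to 2143076352 this gives 1500153446, and applied to 1073741824
(1 GiB) it gives 751619276, matching the two "Max memory allocation" lines.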

draw: check for line_width != 1.0f in validate_pipeline()

We shouldn't use the wide line stage if the line width is 1.
This check isn't strictly needed because all drivers are (now)
specifying a wide line threshold of at least 1.0 pixels, but
let's play it safe.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit c8f344ed2d471f0e012205aecfae4aa765d9fffb)

docs: add sha256 checksums for 17.1.4

Signed-off-by: Andres Gomez <agomez@igalia.com>

docs: add release notes for 17.1.4

Signed-off-by: Andres Gomez <agomez@igalia.com>

Update version to 17.1.4

Signed-off-by: Andres Gomez <agomez@igalia.com>

cherry-ignore: bin/get-fixes-pick-list.sh: better identify multiple "fixes:" tags

fixes: Genuine false positive.

Signed-off-by: Andres Gomez <agomez@igalia.com>

cherry-ignore: 17.1.4 rejected commits

stable: rejected commits.

Signed-off-by: Andres Gomez <agomez@igalia.com>

Fix khrplatform.h not installed if EGL is disabled.

KHR/khrplatform.h is required by the EGL, GLES and VG headers, but is
only installed if Mesa3d is compiled with EGL support.

This patch installs this header file unconditionally.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77240
Signed-off-by: Eric Le Bihan <eric.le.bihan.dev@free.fr>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2154defcd698c7f9862bd235925cac75c0d5a520)

Android: major/minor/makedev live in <sys/sysmacros.h>

sysmacros.h was getting implicitly included in types.h until recently in
AOSP master. Define MAJOR_IN_SYSMACROS to explicitly include sysmacros.h.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
(cherry picked from commit e8f82bfd520e7f7185d178cdbecbb9b1bf2b2c1c)

radeonsi: include ac_binary.h for struct ac_shader_binary

The header embeds the struct so it needs the header inclusion instead of
the dummy forward declaration.

Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Tom Stellard <tstellar@redhat.com>
Fixes: 32206c5e560 ("radeonsi: Add radeon_shader_binary member to struct
si_shader")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 1f958c1337290b4062a77f79fc101bb9f4bdf515)

nv50/ir: fix combineLd/St to update existing records as necessary

Previously the logic would decide that the record is kept, which
translates into keep = false in the caller, which meant that these
passes did not run.

While it's right that keep = false which means that a new record does
not need to be added, we do still have to perform the usual list
maintenance. It's easiest to do this pre-merge rather than post.

The lowering that clip/cull distance passes produce triggers this bug in
TCS (since reading outputs is done differently in other stages), but it
should be possible to achieve it with the right sequence of regular
reads/writes.

Fixes: KHR-GL45.cull_distance.functional
Fixes: generated_tests/spec/arb_tessellation_shader/execution/tes-input/tes-input-gl_ClipDistance.shader_test
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4a79f2be337cef920fc8ea5048fabc106bac492e)

nv50/ir: fetch indirect sources BEFORE the op that uses them

All the BuildUtil helpers just insert the operation into the current BB.
So we have to take care that any fetchSrc() operations happen before the
operation whose setIndirect() it goes into.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8c02ee4a8b0bea5dda3ced341dce81f340457c95)

i965: update MaxTextureRectSize to match PRMs and comply with OpenGL 4.1+

We were exposing 4096, but we can do up to 8192 in Gen4-6 and up to
16384 in gen7+. OpenGL 4.1+ requires at least 16384.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b72b7c541dd81890e04652373f24840f580123ed)

amd/common: fix off-by-one in sid_tables.py

The very last entry in the sid_strings_offsets table ended up missing,
leading to out-of-bounds reads and potential crashes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 67e49a7f6570b8691d9405cb65f263b87817fe71)

egl/display: make platform detection thread-safe

Imagine there are 2 threads that both call _eglGetNativePlatform()
simultaneously:
- thread 1 completes the first "if (native_platform ==
  _EGL_INVALID_PLATFORM)" check and is preempted to do something else
- thread 2 executes the whole function, does "native_platform =
  _EGL_NATIVE_PLATFORM" and just before returning it's preempted
- thread 1 wakes up and calls _eglGetNativePlatformFromEnv() which
  returns _EGL_INVALID_PLATFORM because no env vars are set, updates
  native_platform and then gets preempted again
- thread 2 wakes up and returns wrong _EGL_INVALID_PLATFORM

Solve this by doing the detection in a local var and only overwriting
the global one at the end, if no other thread has updated it since.

This means the platform detected in the thread might not be the platform
returned by the function, but this is a different issue that will need
to be discussed when this becomes possible.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101252
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 311c09165881111c4a596ca7e7b4bce89b059e0f)
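
The approach described above (detect into a local, then publish only if
nobody else has) can be sketched with C11 atomics. This is an assumption
about one way to express it, not the code in the commit: the enum, the
detect_platform() stub and the use of <stdatomic.h> are all illustrative.

```c
#include <stdatomic.h>

enum platform {
    PLATFORM_INVALID = 0,
    PLATFORM_X11,
    PLATFORM_DRM
};

static _Atomic int native_platform = PLATFORM_INVALID;

/* Hypothetical stand-in for the env/build-time detection logic. */
static int detect_platform(void)
{
    return PLATFORM_X11;
}

static int get_native_platform(void)
{
    int cached = atomic_load(&native_platform);
    if (cached != PLATFORM_INVALID)
        return cached;

    /* Do all the work on a local so other threads never see a
     * half-finished or temporarily invalid value. */
    int detected = detect_platform();

    /* Publish only if the global is still unset. On failure, 'expected'
     * is updated to the value another thread stored first. */
    int expected = PLATFORM_INVALID;
    if (!atomic_compare_exchange_strong(&native_platform, &expected, detected))
        return expected;
    return detected;
}
```

As the message notes, a thread that loses the race returns the other
thread's result rather than its own detection.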

egl/display: only detect the platform once

My refactor missed the fact that `native_platform` is static.
Add the proper guard around the detection code, as it might not be
necessary, and only print the debug message when a detection was
actually performed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101252
Fixes: 7adb9b094894a512c019 ("egl/display: remove unnecessary code and
make it easier to read")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4ca9ae587c083b6f03feb65b4ce84929109d5d59)

gallium/util: Break recursion in pipe_resource_reference

The function calling itself recursively prevented it from being inlined, resulting
in a copy being generated in every compilation unit referencing it. This
bloated the text segment of the Gallium mega-driver *_dri.so by ~4%,
and might also have impacted performance.

Fixes: ecd6fce2611e ("mesa/st: support lowering multi-planar YUV")
v2:
* Add comment above pipe_resource_next_reference [Samuel Pitoiset]
v3:
* Use loop to unreference the full chain of resources referenced via
the next members [Timothy Arceri]
v4:
* Stop chasing ->next chain at the first sub-resource which isn't
destroyed [Nicolai Hähnle]

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 176e761513f9f9502248c0c8dad133d2d9f28d2d)
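
A simplified sketch of the v3/v4 idea: walk the ->next chain in a loop
instead of recursing, and stop at the first sub-resource that survives.
The struct here is a toy with a plain (non-atomic) refcount and a
'destroyed' flag standing in for the real destroy callback; none of these
names are the Gallium ones.

```c
#include <stddef.h>

struct resource {
    int refcount;
    int destroyed;
    struct resource *next;
};

/* Stand-in for the driver's destroy hook. */
static void resource_destroy(struct resource *res)
{
    res->destroyed = 1;
}

/* Drop one reference on res. Each resource that dies releases the
 * reference it held on ->next, handled iteratively. The walk stops at
 * the first resource whose refcount does not reach zero. */
static void resource_unref_chain(struct resource *res)
{
    while (res && --res->refcount == 0) {
        struct resource *next = res->next;
        resource_destroy(res);
        res = next;
    }
}
```

Because the function no longer calls itself, a compiler is free to inline
it at each call site.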

i915: Fix wpos_tex vs. -1 comparison

wpos_tex used to be a GLuint so assigning -1 to it and
later comparing with -1 worked correctly, but commit
c349031c27b7 ("i915: Fix texcoord vs. varying collision in
fragment programs") changed wpos_tex to uint8_t and hence
broke the comparison. To fix this define a more explicit
invalid value for wpos_tex.

gcc warns us:
i915_fragprog.c:1255:57: warning: comparison is always true due to limited range of data type [-Wtype-limits]
    if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
                                                         ^

And clang says:
i915_fragprog.c:1255:57: warning: comparison of constant -1 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
   if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
                                            ~~~~~~~~~~~ ^  ~~

Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: c349031c27b7 ("i915: Fix texcoord vs. varying collision in fragment programs")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c1eedb43f32f6a3733f26e7918eb028f68bd60a4)
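
The underlying C behaviour can be shown in isolation. When a uint8_t is
compared with an int, it is promoted to an int in the range 0..255, so it
can never equal -1. The sketch below passes the sentinel as a parameter so
that it compiles without the warnings quoted above; WPOS_TEX_INVALID is a
hypothetical name for the explicit invalid value.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical explicit sentinel that fits in a uint8_t. */
#define WPOS_TEX_INVALID 0xff

/* Mirrors "p->wpos_tex != sentinel" with wpos_tex stored as uint8_t. */
static bool wpos_tex_is_set(uint8_t wpos_tex, int sentinel)
{
    return wpos_tex != sentinel;
}
```

Storing -1 into the uint8_t yields 255, so a test against -1 always
reports "set", while a test against 0xff behaves as intended.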

Squashed with commit:

i915: Always emit W on gen3

Unlike the older gen2 hardware, gen3 performs perspective
correct interpolation even for the primary/secondary colors.
To do that it naturally needs us to emit W for the vertices.

Currently we emit W only when at least one texture coordinate
set gets emitted. This means the interpolation of color will
change depending on whether texcoords/varyings are used or not.
That's probably not what anyone would expect, so let's just
always emit W to get consistent behaviour. Trying to avoid
emitting W seems like more hassle than it's worth, especially
as bspec seems to suggest that the hardware will perform the
perspective division anyway.

This used to be broken until it was accidentally fixed in
commit c349031c27b7 ("i915: Fix texcoord vs. varying collision
in fragment programs") by introducing a bug that made the driver
always emit W. After fixing that bug in commit c1eedb43f32f
("i915: Fix wpos_tex vs. -1 comparison") we went back to the
old behaviour and caused an apparent regression.

Fixes: c1eedb43f32f ("i915: Fix wpos_tex vs. -1 comparison")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101451
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0eef03a6f2f7fa7968accaa2ab2c3d7431e984b8)

etnaviv: only flush resource to self if no scanout buffer exists

Currently a resource flush may trigger a self resolve, even if a scanout buffer
exists, but is up to date. If a scanout buffer exists we only ever want to
flush the resource to the scanout buffer. This fixes a performance regression.

Fixes: dda956340ce9 (etnaviv: resolve tile status when flushing resource)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 28550c787595f04453d2a39f46f570a891368fcf)

egl_dri2: swrastGetDrawableInfo: set *x, *y [v2]

In swrastGetDrawableInfo, set *x and *y, not just *w and *h;
this fixes a crash later in drisw_update_tex_buffer when the
(formerly) uninitialized x and y values are used to construct
an address in a call to llvmpipe_transfer_map.

Fixes crash in Piglit test
"spec@egl 1.4@eglcreatepbuffersurface and then glclear"
(<piglit dir>/bin/egl-create-pbuffer-surface -auto)
that occurred intermittently, e.g. when the uninitialized x and y in
drisw_update_tex_buffer just happened to contain absurd non-zero values.

v2: Initialize whether the function succeeds or fails, just like *w/*h.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 162c42f8edde4a2c13b1eb5c0f9f0828441ed4c8)
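
The general shape of the fix is to give every output parameter a defined
value before any early return. The function below is a hypothetical
stand-in, not swrastGetDrawableInfo itself; the 640x480 size is a
placeholder for whatever a real query would report.

```c
#include <stdbool.h>

static bool get_drawable_info(bool query_ok, int *x, int *y, int *w, int *h)
{
    /* Define every output first, so no caller ever reads garbage,
     * regardless of which path returns. */
    *x = *y = *w = *h = 0;

    if (!query_ok)
        return false;

    *w = 640;   /* placeholder size */
    *h = 480;
    return true;
}
```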

nv50/ir: Properly fold constants in SPLIT operation

Fixes: b7d9677d ("nv50/ir: constant fold OP_SPLIT")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit afb8f2d4a346041adf54d45729963a55a625ac1f)

i965: Clamp clear colors to the representable range

Starting with Sky Lake, we can clear to arbitrary floats or integers.
Unfortunately, the hardware isn't particularly smart when it comes to
sampling from that clear color. If the clear color is out of range for
the surface format, it will happily return whatever we put in the
surface state packet unmodified. In order to avoid returning bogus
values for surfaces with a limited range, we need to do some clamping.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit f1fa4be871e13c68b50685aaf64dc095b49ed0b5)
[Andres Gomez: override_color still a gl_color_union]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/mesa/drivers/dri/i965/brw_meta_util.c
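
As a rough illustration of the clamping described in this commit, the
sketch below clamps one clear-color channel to the range its format can
represent. The enum and function are hypothetical and cover only three
simple cases; the driver has to handle many more formats, including
integer ones.

```c
enum channel_type {
    CHANNEL_UNORM,   /* representable range [0, 1]  */
    CHANNEL_SNORM,   /* representable range [-1, 1] */
    CHANNEL_FLOAT    /* left unmodified in this sketch */
};

static float clamp_clear_channel(float v, enum channel_type type)
{
    switch (type) {
    case CHANNEL_UNORM:
        return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    case CHANNEL_SNORM:
        return v < -1.0f ? -1.0f : (v > 1.0f ? 1.0f : v);
    default:
        return v;
    }
}
```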

gallium/vbuf: avoid segfault when we get invalid glDrawRangeElements()

A common user error is to call glDrawRangeElements() with the 'end'
argument being one too large.  If we use the vbuf module to translate
some vertex attributes this error can cause us to read past the end of
the mapped hardware buffer, resulting in a crash.

This patch adjusts the vertex count to avoid that issue.  Typically,
the vertex_count gets decremented by one.

This fixes crashes with the Unigine Tropics and Sanctuary demos with older
VMware hardware versions.  The issue isn't hit with VGPU10 because we
don't hit this fallback.

No piglit changes.

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d8148ed10ae5faea6f88f2f964797f4b0590c083)
[Andres Gomez: pipe_vertex_buffer hadn't shrunk yet]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/gallium/auxiliary/util/u_vbuf.c
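
One way to picture the adjustment is to bound the vertex count by how many
whole vertices actually fit in the mapped buffer. The helper below is a
sketch under that assumption; its name and parameters are invented here,
and the logic in u_vbuf.c may differ.

```c
#include <stddef.h>

/* Return how many of 'count' vertices, starting at 'start', can be read
 * without running past the end of a buffer of 'buffer_size' bytes. */
static unsigned clamp_vertex_count(unsigned start, unsigned count,
                                   size_t buffer_size, size_t offset,
                                   size_t stride, size_t attrib_size)
{
    if (buffer_size < offset + attrib_size)
        return 0;                 /* not even one vertex fits */
    if (stride == 0)
        return count;             /* every vertex reads the same element */

    size_t max_vertices = (buffer_size - offset - attrib_size) / stride + 1;
    if (start >= max_vertices)
        return 0;
    if (count > max_vertices - start)
        count = (unsigned)(max_vertices - start);
    return count;
}
```

With a 160-byte buffer of 16-byte vertices, a request for 11 vertices
(an 'end' that is one too large) is reduced to 10.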