OSDN Git Service
Dave Airlie [Mon, 30 Jan 2017 18:50:30 +0000 (04:50 +1000)]
radv/ac: handle nir irem opcode.
This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opsrem.*
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit
ecc3fa3ba3967624f67abe8d8188102a08c20d7c)
Dave Airlie [Mon, 30 Jan 2017 06:13:30 +0000 (16:13 +1000)]
radv/ac: fix multisample subpass image.
We weren't adding the fragment position properly.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit
059dd171759bb89d915c049de1ca1c93865c21d3)
Dave Airlie [Mon, 30 Jan 2017 03:17:05 +0000 (13:17 +1000)]
radv: handle transfer_write as a dst flag.
It appears we can get image barriers like:
srcStageMask: VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
dstStageMask: VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
dependencyFlags: VkDependencyFlags = 0
memoryBarrierCount: uint32_t = 0
pMemoryBarriers: const VkMemoryBarrier* = NULL
bufferMemoryBarrierCount: uint32_t = 0
pBufferMemoryBarriers: const VkBufferMemoryBarrier* = NULL
imageMemoryBarrierCount: uint32_t = 1
pImageMemoryBarriers: const VkImageMemoryBarrier* = 0x7ffc882367b0
pImageMemoryBarriers[0]: const VkImageMemoryBarrier = 0x7ffc882367b0:
sType: VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
pNext: const void* = NULL
srcAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
dstAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
oldLayout: VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
newLayout: VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
srcQueueFamilyIndex: uint32_t =
4294967295
dstQueueFamilyIndex: uint32_t =
4294967295
image: VkImage = 0x2df55e0
subresourceRange: VkImageSubresourceRange = 0x7ffc882367e0:
aspectMask: VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
baseMipLevel: uint32_t = 0
levelCount: uint32_t = 1
baseArrayLayer: uint32_t = 0
layerCount: uint32_t = 1
This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here,
not sure if this is a too large hammer.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit
a1c1ba7d5649cca450ca81bf87be36c035a01db0)
Marek Olšák [Sun, 29 Jan 2017 22:59:59 +0000 (23:59 +0100)]
radeonsi: don't invoke DCC decompression in update_all_texture_descriptors
This fixes a bug uncovered by the 17-part patch series, specifically:
"gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"
If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
a0740d59aa97a08d89998cb57138e8217a331af6)
Bartosz Tomczyk [Mon, 30 Jan 2017 13:07:45 +0000 (14:07 +0100)]
r600: Fix stack overflow
Commit
7b5878ee0491e7a93914389a8369cd6752b9757d increased number of
outputs to 64, but left output array intact. This caused stack overflow
when number of outputs is bigger then 32. Found by ASAN.
Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
a41f2527ae8ae5432b99c88863fbdf2f0b5f04ad)
Kenneth Graunke [Sat, 21 Jan 2017 04:33:57 +0000 (20:33 -0800)]
i965: Support the force_glsl_version driconf option.
Gallium drivers have had this for a while. It makes sense to support
it consistently across drivers, so expose it in i965 as well.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
2f7a7ae13196c58eddea75fc51637f8c2a8579b0)
Kenneth Graunke [Thu, 26 Jan 2017 09:27:42 +0000 (01:27 -0800)]
i965: Fix check for negative pitch in can_do_fast_copy_blit().
At this point, the pitch is in bytes. We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K. This
means the bit 15 trick won't work.
The caller now has signed integers anyway, so just pass those through
and do the obvious check.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
02216a1ddf2bcafb86fda352e514f27ab6f7a4fa)
Kenneth Graunke [Wed, 25 Jan 2017 08:59:42 +0000 (00:59 -0800)]
i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.
Applications may delete a shader program, create a new one, and bind it
before the next draw. With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program. In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:
if (brw->vertex_program != ctx->VertexProgram._Current) {
brw->vertex_program = ctx->VertexProgram._Current;
brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
}
Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on. This causes utter chaos.
As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them. Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.
The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none. This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time. Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...
Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.
Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.
Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
7c5629a26912af9164bda17b7af5370f6b4302e4)
Bas Nieuwenhuizen [Sat, 28 Jan 2017 00:32:20 +0000 (01:32 +0100)]
radv/ac: Use base in push constant loads.
Apparently the source is not an address but an offset, so we actually
need to use the base.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
96c60b7f07e626d9ca0fc5789117f0c725ba1da2)
Emil Velikov [Fri, 27 Jan 2017 18:29:38 +0000 (18:29 +0000)]
configure.ac: list radeon in --with-vulkan-drivers help string
Analogous to what we do for the dri and gallium drivers.
Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@colllabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit
cb6be5c8c05fa1af20ebd4f014d686244826f987)
Emil Velikov [Fri, 27 Jan 2017 18:05:13 +0000 (18:05 +0000)]
radv: automake: Don't install vk_platform.h or vulkan.h.
These files belong to the vulkan loader.
Identical to
045f38a5075 vulkan: Don't install vk_platform.h or vulkan.h.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit
6f2dec0a235694779310979fd1cbf48a8d7ba27b)
Emil Velikov [Mon, 16 Jan 2017 15:45:49 +0000 (15:45 +0000)]
mesa/tests: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
091f2b8c98937138c17a1ddf4b16d17f31a20020)
Emil Velikov [Mon, 16 Jan 2017 15:45:48 +0000 (15:45 +0000)]
dri/osmesa: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
6ba96bdcabc4c9c89827603907aba1b7dd5e9972)
Emil Velikov [Mon, 16 Jan 2017 15:45:47 +0000 (15:45 +0000)]
dri/swrast: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
ede4ff9adc5653db56e7edac53c012fc431647dc)
Emil Velikov [Mon, 16 Jan 2017 15:45:46 +0000 (15:45 +0000)]
radeon, r200: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
5a0ba1e5de088bf71bc407db9837235b18dca936)
Emil Velikov [Mon, 16 Jan 2017 15:45:45 +0000 (15:45 +0000)]
mapi: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
ee5de93269cee8abec2dd4d938c5e745983bf98e)
Emil Velikov [Mon, 16 Jan 2017 15:45:44 +0000 (15:45 +0000)]
loader: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
af860850a0c4096f9bb27faa4d9c0061fb437a9b)
Emil Velikov [Mon, 16 Jan 2017 15:45:43 +0000 (15:45 +0000)]
glx/windows: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
912b4f5472007dbd06c922cb1c315cb4814c4010)
Emil Velikov [Mon, 16 Jan 2017 15:45:42 +0000 (15:45 +0000)]
glx/apple: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
(cherry picked from commit
5b874cee099122bc9c0625254895e63843ae2260)
Emil Velikov [Mon, 16 Jan 2017 15:45:41 +0000 (15:45 +0000)]
glx: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
d66f9e6d930a9fcfa23129dce18d5359cc6e00f4)
Emil Velikov [Mon, 16 Jan 2017 15:45:40 +0000 (15:45 +0000)]
d3dadapter9: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
d221bf9b91900c62069cc447dc214c04a9e5261c)
Emil Velikov [Mon, 16 Jan 2017 15:45:39 +0000 (15:45 +0000)]
st/dri: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
517f34b4be0ac4a5a508ccb6dcaeca3c975585b0)
Emil Velikov [Mon, 16 Jan 2017 15:45:37 +0000 (15:45 +0000)]
clover: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
65d5a60caca632a7c03cd1dc554645f27f408f37)
Emil Velikov [Mon, 16 Jan 2017 15:45:36 +0000 (15:45 +0000)]
egl: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
c5921ae0d2fc37699c7ebbd693a2e850a5371204)
Emil Velikov [Mon, 16 Jan 2017 15:45:35 +0000 (15:45 +0000)]
i915: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
90ac5c339e2c360712d663e3eab76c4a4abf2487)
Emil Velikov [Mon, 16 Jan 2017 15:45:34 +0000 (15:45 +0000)]
i965: automake: include builddir prior to srcdir
The latter can contain stale generated file, which, as-is, we'll end up
using.
Fixes:
bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit
4622c75dfbf5c89630d1860037dcb8b7910c0820)
Emil Velikov [Mon, 16 Jan 2017 15:45:33 +0000 (15:45 +0000)]
freedreno: automake: correctly set MKDIR_GEN
Analogous to previous commit.
Fixes:
4610e5ef28e "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
(cherry picked from commit
a922c821255bfac22cf705244e5bd303a626bb55)
Emil Velikov [Mon, 16 Jan 2017 15:45:32 +0000 (15:45 +0000)]
i965: automake: correctly set MKDIR_GEN
Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.
Fixes:
bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit
5eed48d23780e812ca5c53fecc59a419962c7dd6)
Jason Ekstrand [Wed, 25 Jan 2017 01:10:45 +0000 (17:10 -0800)]
vulkan/wsi: Lower the maximum image sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit
d6397dd62542215e655c0cab557729474c2ae973)
Jason Ekstrand [Wed, 25 Jan 2017 00:43:15 +0000 (16:43 -0800)]
vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit
659edd9f5ce995aa47e2ab02425508cc29140cce)
Jason Ekstrand [Wed, 25 Jan 2017 00:43:01 +0000 (16:43 -0800)]
vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit
dc578ef060f6b92e6fd2f77bb6454a5fb22a471c)
Emil Velikov [Thu, 26 Jan 2017 13:18:41 +0000 (13:18 +0000)]
mesa: move variable declaration to where its used
The variable replacement was unused when building w/o
ENABLE_SHADER_CACHE. Since we can mix variable declarations and code,
move it to where its used.
Fixes:
9f8dc3bf03e "utils: build sha1/disk cache only with
Android/Autoconf"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
6a5850b04a86c2287112047c6cad500136d18df5)
Andreas Boll [Tue, 24 Jan 2017 15:44:12 +0000 (16:44 +0100)]
configure.ac: Require LLVM for r300 only on x86 and x86_64
b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.
r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before
b3119a3.
Fixes:
b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
1f2a890ace07f2709ca8f7ef5fe12051222aafed)
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:40 +0000 (16:57 +0000)]
spirv: handle undefined components for OpVectorShuffle
Fixes:
dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
bbe8705c579c3e464615a0ca9b2eb4bd3c16aad3)
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:25 +0000 (16:57 +0000)]
spirv: handle OpUndef as part of the variable parsing pass
Looking at the following bit of SPIRV shader :
...
%zero = OpConstant %i32 0
%ivec3_0 = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef = OpUndef %ivec3
%sc_0 = OpSpecConstant %i32 0
%sc_1 = OpSpecConstant %i32 0
%sc_2 = OpSpecConstant %i32 0
...
Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
df7063cba35ea273827ba60f643596cd80539458)
Lionel Landwerlin [Thu, 26 Jan 2017 11:06:53 +0000 (11:06 +0000)]
anv: fix descriptor pool internal size allocation
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
on when setting up the internals.
Fixes a crash due to out of bound memory access in:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
v2: Drop debug traces (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
c3421106ec332bf3a943ccf9447edf00dc7f3618)
Kenneth Graunke [Sun, 22 Jan 2017 09:44:08 +0000 (01:44 -0800)]
i965: Make intelEmitCopyBlit not truncate large strides.
When trying to blit larger tiled surfaces, the pitch can be larger than
32768 bytes, which means it won't fit in a GLshort. Passing it in will
truncate the stride to 0, which has...surprising results.
The pitch can be up to 32,768 DWords, or 128kB. We measure it in bytes,
but divide by 4 when programming it. So we need to handle values up to
131,072. Switch from GLshort to int32_t to avoid the truncation.
Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage
at widths greater than 8192.
v2: Use int32_t as negative values can be used (Jason).
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
f8f7ea508be7fe7222cd19e0d59574cfea2decf0)
Kenneth Graunke [Tue, 24 Jan 2017 08:45:53 +0000 (00:45 -0800)]
i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register. With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky. Use a UW type to avoid this.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit
fcf723b647f36fa174d29b1fe6a9732637a1f8d1)
Iago Toral Quiroga [Wed, 25 Jan 2017 14:04:35 +0000 (15:04 +0100)]
anv/lower_input_attachments: honor sample index parameter to subpassLoad()
According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:
gvec4 subpassLoad(gsubpassInput subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);
So the multisampled case always receives an explicit sample index that we
should use. The current implementation was ignoring this parameter
and using gl_SampleID value instead.
Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
9b25769da63999fa65a70a14194a452c49d18f3e)
Kenneth Graunke [Mon, 23 Jan 2017 19:57:21 +0000 (11:57 -0800)]
i965: Fix fast depth clears for surfaces with a dimension of 16384.
I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong. But it turns out that
there is a legitimate reason to use it, so let's do so.
The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit
5106df85da20d57007e89262472bb1624afbdaba)
Lionel Landwerlin [Wed, 25 Jan 2017 16:22:40 +0000 (16:22 +0000)]
anv: set command buffer to NULL when allocations fail
The spec section 5.2 says:
"vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to VK_NULL_HANDLE and return the error."
Fixes:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
25e21cb8d065799888e0c5db80b0e616ffddc560)
Dave Airlie [Wed, 18 Jan 2017 03:46:43 +0000 (13:46 +1000)]
radv: program a default point size.
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.
This preempt fixes a bunch of geom shader tests.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit
2ab2be092d15ddb449b4a215609705bae68089a0)
Marek Olšák [Fri, 20 Jan 2017 15:02:04 +0000 (16:02 +0100)]
radeonsi: handle first_non_void correctly in si_create_vertex_elements
This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
eac7df43ca05abd9992b305e078e88fe7b7f8c91)
Marek Olšák [Fri, 20 Jan 2017 01:26:42 +0000 (02:26 +0100)]
st/mesa: destroy pipe_context before destroying st_context (v2)
If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.
Firefox with WebGL2 enabled hits this bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456
v2: protect against a double destroy in st_create_context_priv and callers.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
d9ef54923804d5fe44a1d3ad5c29e9b8e8382359)
Ian Romanick [Mon, 23 Jan 2017 17:57:15 +0000 (09:57 -0800)]
mesa: Don't advertise GL_OES_read_format in core profile
OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.
The functionality is also included in GL_ARB_ES2_compatibility. Ever
OpenGL core-profile driver currently exposes both extensions. I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems. No functionality
is removed.
I have left this extension in place for compatibility profile. There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.
Three other other alternatives considered:
- Remove the string from compatibility profile drivers but leave the
functionality in place.
- Add a flag to expose the extension string, and set it in every OpenGL
driver that does not expose GL_ARB_ES2_compatibility (and those
drivers only). I tried this. You can't have two instances of an
extension in the extension table (one dummy_true for ES1 and one with
a flag for compatibility profile), so the implementation requires a
bit of effort.
- Only expose the extension in compatibility if the version is less
than 2.0. I didn't see an easy way to do this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit
c4a0c1efff77ed90dfebf969348cd440195ae38a)
Roland Scheidegger [Mon, 23 Jan 2017 17:04:12 +0000 (18:04 +0100)]
gallivm: (trivial) fix ddiv cpu implementation
we can't use the cpu implementation of fdiv, as this one uses different
lp_build_context, which causes assertion failure.
Just use default fdiv action (there is no fast rcp for doubles which we
could potentially use anyway).
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit
25208949d7293aa060a3416f8cf3cdb3ca1fbfdd)
Roland Scheidegger [Mon, 23 Jan 2017 17:10:44 +0000 (18:10 +0100)]
tgsi: implement ddiv opcode
softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64,
so we really need to support that opcode.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit
3b575a955c1d84744a65160e2c45f0ce407effd8)
Jason Ekstrand [Mon, 23 Jan 2017 18:53:13 +0000 (10:53 -0800)]
i965/blorp: Use the correct ISL format for combined depth/stencil
In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format. In this case, we handle
stencil properly but we give blorp the wrong ISL format. Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT which is the wrong
size was causing GPU hangs.
Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
4c180f9633421a526f3ea6746cf38b809e7c1abb)
Topi Pohjolainen [Thu, 19 Jan 2017 08:11:42 +0000 (10:11 +0200)]
i965/blorp: Add also depth and stencil buffers to render cache
v2 (Jason, Curro): Add stencil also even though it is not
enabled yet.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit
ba6399df9462d78eda2a5de7c2940d8cf9d27f95)
Emil Velikov [Thu, 19 Jan 2017 15:19:56 +0000 (15:19 +0000)]
configure.ac: move require_dri_shared_libs_and_glapi() before its users
Otherwise we'll get a lovely message as below:
"require_dri_shared_libs_and_glapi: command not found"
Cc: Steven Newbury <steve@snewbury.org.uk>
Reported-by: Steven Newbury <steve@snewbury.org.uk>
Fixes:
da410e6afad "configure: explicitly require shared glapi for
enable-dri"
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Steven Newbury <steve@snewbury.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
5872850b8868f00c031d21387b0516d844d070be)
Emil Velikov [Wed, 25 Jan 2017 13:24:27 +0000 (13:24 +0000)]
Update version to 17.0.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Topi Pohjolainen [Tue, 17 Jan 2017 10:00:37 +0000 (12:00 +0200)]
i965/blorp: Make post draw flush more explicit
Blits do not need any special treatment as the target buffer
object is added to render cache just as one does for normal draw.
Color clears and resolves in turn require explicit "end of pipe
synchronization". It is not clear what this means exactly but the
assumption is that render cache flush with command stream stall
should be sufficient.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
180653c357d19ca88f7895f59874a58fac99cc53)
Topi Pohjolainen [Tue, 17 Jan 2017 09:48:49 +0000 (11:48 +0200)]
i965/gen6: Issue direct depth stall and flush after depth clear
instead of calling unconditionally brw_emit_mi_flush() which
does:
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_DEPTH_CACHE_FLUSH |
PIPE_CONTROL_RENDER_TARGET_FLUSH |
PIPE_CONTROL_CS_STALL);
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
PIPE_CONTROL_CONST_CACHE_INVALIDATE);
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
46b346899d98e29943f8cd74c25bcb8d2f868a49)
Topi Pohjolainen [Tue, 17 Jan 2017 09:44:52 +0000 (11:44 +0200)]
i965: Make depth clear flushing more explicit
Current blorp logic issues unconditional "flush everything"
(see brw_emit_mi_flush()) after each render. For example, all
blits issue this unconditionally which shouldn't be needed if
they set render cache properly so that subsequent renders do
necessary flushing before drawing.
In case of piglit:
ext_framebuffer_multisample-accuracy all_samples depth_draw small
intel_hiz_exec() is always preceded by blorb blit and the
unconditional flush looks to hide the lack of stall and flushes
in depth clears. By removing the brw_emit_mi_flush() I get gpu
hangs.
This patch adds the stalls and flushes mandated by the spec
and gets rid of those hangs.
v2 (Jason, Ken): Document the rational for separating
depth cache flush and stall on Gen7.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
e6da6943fed1228c551af1f0e1a405b6d67b41ae)
Topi Pohjolainen [Tue, 17 Jan 2017 09:04:22 +0000 (11:04 +0200)]
i965/blorp: Use the render cache mechanism instead of explicit flushing
by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush().
The latter splits the flush in two:
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_DEPTH_CACHE_FLUSH |
PIPE_CONTROL_RENDER_TARGET_FLUSH |
PIPE_CONTROL_CS_STALL);
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
PIPE_CONTROL_CONST_CACHE_INVALIDATE);
instead of
int flags = PIPE_CONTROL_NO_WRITE | PIPE_CONTROL_RENDER_TARGET_FLUSH;
if (brw->gen >= 6) {
flags |= PIPE_CONTROL_INSTRUCTION_INVALIDATE |
PIPE_CONTROL_CONST_CACHE_INVALIDATE |
PIPE_CONTROL_DEPTH_CACHE_FLUSH |
PIPE_CONTROL_VF_CACHE_INVALIDATE |
PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
PIPE_CONTROL_CS_STALL;
}
brw_emit_pipe_control_flush(brw, flags);
v2 (Jason): Check that destination exists before trying to add to
render cache. Depth clears and resolves don't have it.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit
4840a53e902b0f2b9841d9dbb90e479a3688153d)
Marek Olšák [Fri, 20 Jan 2017 00:13:39 +0000 (01:13 +0100)]
radeonsi: always set the TCL1_ACTION_ENA when invalidating L2
Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.
Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
573bf0940a08e18a511e338de478f30fd95a1590)
Grazvydas Ignotas [Mon, 23 Jan 2017 21:16:42 +0000 (23:16 +0200)]
radv: don't resubmit the same cs over and over while tracing
Fixes:
97dfff54 ("radv: Dump command buffer on hang.")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
f65b3641c3233f1697b96ea8126b578dae6de4f1)
George Kyriazis [Wed, 18 Jan 2017 23:09:08 +0000 (17:09 -0600)]
swr: Align query results allocation
Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.
Fixes crash when compiling with clang.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit
00847e4f14dd237dfcdb2c3d15be1325a08ccf5a)
Bruce Cherniak [Thu, 19 Jan 2017 21:44:52 +0000 (15:44 -0600)]
swr: Prune empty nodes in CalculateProcessorTopology.
CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes. There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating and empty "0" node and causing a
crash in CreateThreadPool.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
b829206b0739925501bcc68233437d6d03b79795)
Nicolai Hähnle [Mon, 16 Jan 2017 15:43:54 +0000 (16:43 +0100)]
st/glsl_to_tgsi: use DDIV instead of DRCP + DMUL
Fixes GL45-CTS.gpu_shader_fp64.built_in_functions.
v2: use DDIV unconditionally (Roland)
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
cfabbbcfd778cc404813c9f05a9ef79efe531980)
Nicolai Hähnle [Mon, 16 Jan 2017 15:39:06 +0000 (16:39 +0100)]
glsl: split DIV_TO_MUL_RCP into single- and double-precision flags
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
b71c415c3d288da4b5f533ece42f50f4f20a8c33)
Nicolai Hähnle [Thu, 19 Jan 2017 13:44:57 +0000 (14:44 +0100)]
r600: implement DDIV
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
e4f8f9a638c1ffb9b76840b088290f11f0f91813)
Nicolai Hähnle [Thu, 19 Jan 2017 13:44:24 +0000 (14:44 +0100)]
r600: factor out cayman_emit_unary_double_raw
We will use it for DDIV.
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
488560cfe6ee2206f7a7f894694ebc43b419be61)
Nicolai Hähnle [Thu, 19 Jan 2017 13:38:54 +0000 (14:38 +0100)]
r600: double multiply can handle only one multiply at a time
It seems clear that trying to multiply two pairs of doubles would result
in the temporary register getting overwritten by the second pair. So
make the code more explicit.
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
76b02d2fe1df5351f67f53d07b37952043f0a84c)
Rob Clark [Sun, 22 Jan 2017 18:38:43 +0000 (13:38 -0500)]
freedreno/a5xx: set frag shader threadsize
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
31daeb5bf14334bc0d39f28c9102cd15d834abfc)
Rob Clark [Sun, 22 Jan 2017 17:23:27 +0000 (12:23 -0500)]
freedreno/a5xx: set fragcoordxy properly
What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into
bary.f. We were incorrectly setting both this and gl_FragCoord.xy to
the same register resulting in all sorts of hilarity.
Fixes stk, vdrift, 0ad, probably a bunch others.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
8d6af93e76bb9e592293b632b22b2b756cc0cae8)
Rob Clark [Mon, 16 Jan 2017 19:02:54 +0000 (14:02 -0500)]
freedreno/a5xx: fix psize
Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on
a5xx.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
6cc93bedc15d09395ab6a92a0a129d06a8cd8ae8)
Rob Clark [Sun, 15 Jan 2017 18:19:47 +0000 (13:19 -0500)]
freedreno/a5xx: srgb fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
141a4f86d6b9c0c4dbde511b741576a103f8f7ff)
Rob Clark [Sun, 15 Jan 2017 13:43:44 +0000 (08:43 -0500)]
freedreno/a5xx: fix int vbos
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
69fbb458cf59fbab5f6675ad256a266b04d54700)
Rob Clark [Sat, 14 Jan 2017 12:59:42 +0000 (07:59 -0500)]
freedreno/a5xx: fix clear for uint/sint formats
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
16671e970444f154ffa60d2aaadee4d065eb6103)
Rob Clark [Wed, 11 Jan 2017 16:31:40 +0000 (11:31 -0500)]
freedreno/a5xx: fix cull state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
4d9aa4f67d6316feea93901bf29b76a68c4333cd)
Rob Clark [Wed, 11 Jan 2017 16:30:21 +0000 (11:30 -0500)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
4c39458460075f6c1ea9e4607769513b96c6dd82)
Jason Ekstrand [Thu, 19 Jan 2017 19:28:31 +0000 (11:28 -0800)]
nir/search: Use the correct bit size for integer comparisons
The previous code always compared integers as 64-bit. Due to variations
in sign-extension in the code generated by nir_opt_algebraic.py, this
meant that nir_search doesn't always do what you want. Instead, 32-bit
values should be matched as 32-bit and 64-bit values should be matched
as 64-bit. While we're here we unify the unsigned and signed paths.
Now that we're using the right bit size, they should be the same since
the only difference we had before was sign extension.
This gets the UE4 bitfield_extract optimization working again. It had
stopped working due to the constant 0xff00ff00 getting sign-extended
when it shouldn't have.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
bb96b034616d9d099752efb005b5c05e8644059c)
Jason Ekstrand [Fri, 20 Jan 2017 20:27:34 +0000 (12:27 -0800)]
intel/blorp/copy: Properly handle clear colors for CCS_E images
In order to handle CCS_E, we stomp the image format to a UINT format and
then do some bitcasting logic in the shader. This works fine since SKL
render compression only considers the channel layout of the format and
not the format itself. In order for this to work on images that have
been fast-cleared, we need to also convert the clear color so that, when
interpreted as UINT, it provides the same bit value as it would have in
the original format. This fixes a bunch of OpenGL ES CTS tests for
copy_image when we start using CCS more aggressively.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
817f9e3b17c784cbe40639a4b370edc762bd2513)
Andres Rodriguez [Wed, 18 Jan 2017 22:48:36 +0000 (17:48 -0500)]
radv: fix include order for installed headers v2
In situations where libdrm_amdgpu and mesa are installed to the same
location, the mesa installed headers will take precedence over the git
source headers.
This is due to the AMDGPU_CFLAGS containing the install directory.
This situation can cause build errors if the git version of a header is
newer than the currently installed version of a header (e.g. git pull
updates vulkan.h)
Note: using the same install prefix for mesa and libdrm is probably a
common occurrence since it is described in the radeonBuildHowTo wiki:
https://www.x.org/wiki/radeonBuildHowTo/
v2: added sign-off
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
a3ad6a34c6ba222ec93a2cfd0cac205c62574eb7)
Andres Rodriguez [Wed, 18 Jan 2017 23:07:56 +0000 (18:07 -0500)]
vulkan/wsi: clarify the severity of lack of DRI3 v2
The current message sounds like a small warning, clarify that it can
result in lack of presentation support and application crashes.
v2: add "if they do" (Bas)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98263
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Jason ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
e0674e740bf84085dec898ffd87bdeb2027e620f)
Lionel Landwerlin [Thu, 19 Jan 2017 16:20:00 +0000 (16:20 +0000)]
anv: don't require render target isl bit for depth/stencil surfaces
Blorp can deal with depth/stencil surfaces blits/copies without the
render target requirement. Also having both render target and
depth/stencil requirement is incompatible from isl's point of view.
This fixes an image creation issue in the high level quality settings
of the Unity3D player, which requires a depth texture with src/dst
transfer & 4x multisampling.
v2: Simply aspect checking condition (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
74c23bde5b8155a84233265c56bedac8f38de14e)
Lionel Landwerlin [Fri, 13 Jan 2017 16:08:28 +0000 (16:08 +0000)]
spirv: don't assert with location decorations on non i/o variables
Some applications might add location decoration to samplers. Rather
than raising an error it seems it would make more sense to just
discard these decorations.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
8a28e764d0e28d0d4dfa3b81b89fa3baf30e94f2)
Samuel Pitoiset [Fri, 20 Jan 2017 00:19:49 +0000 (01:19 +0100)]
gallium/hud: add missing break in hud_cpufreq_graph_install()
Fixes:
e99b9395bef "gallium/hud: Add support for CPU frequency monitoring"
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit
383fc8e9f340e80695aca2cd585957af0e081eb9)
Marek Olšák [Wed, 18 Jan 2017 21:15:35 +0000 (22:15 +0100)]
radeonsi: don't forget to add HTILE to the buffer list for texturing
This fixes VM faults. Discovered by Samuel Pitoiset.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450
Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit
e490b7812cae778c61004971d86dc8299b6cd240)
Nicolai Hähnle [Wed, 18 Jan 2017 08:28:47 +0000 (09:28 +0100)]
radeonsi: fix texture gather on stencil textures
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.
A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.
Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.
Fixes GL45-CTS.texture_cube_map_array.sampling.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit
3cd092c41508dde2e6259f09df1736911a828548)
Zachary Michaels [Thu, 19 Jan 2017 09:50:16 +0000 (10:50 +0100)]
radeonsi: Always leave poly_offset in a valid state
This commit makes si_update_poly_offset set poly_offset to NULL if
uses_poly_offset is false. This way poly_offset either points into the
currently queued rasterizer, or it is NULL.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit
d7d32b3bfe86bd89d94d59393907bce1cb9dab7c)
Nicolai Hähnle [Mon, 16 Jan 2017 11:13:50 +0000 (12:13 +0100)]
mesa/main: fix meta caller of _mesa_ClampColor
Since _mesa_ClampColor properly checks for support of the API function
now, it's meta callers need to check support as well.
Fixes:
963311b71f ("mesa/main: fix version/extension checks in _mesa_ClampColor")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99401
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit
a7c635ec6589f600f0d52d0097774ea0b938de9f)
Dave Airlie [Thu, 19 Jan 2017 04:39:10 +0000 (14:39 +1000)]
gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIAN
This fixes the build on ppc/s390.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit
ef71b867ee152d8161a8c7320e89843801236249)
Emil Velikov [Wed, 18 Jan 2017 20:12:04 +0000 (20:12 +0000)]
Update version to 17.0.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 18 Jan 2017 19:48:37 +0000 (19:48 +0000)]
utils: really remove the __END_DECLS macro
Fixes:
d1efa09d342 "util: import sha1 implementation from OpenBSD"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
ea8b2624c8da1061e93124a760cae2ffb5f027ad)
Emil Velikov [Wed, 18 Jan 2017 19:40:31 +0000 (19:40 +0000)]
utils: build sha1/disk cache only with Android/Autoconf
Earlier commit imported a SHA1 implementation and relaxed the SHA1 and
disk cache handling, broking the Windows builds.
Restrict things for now until we get to a proper fix.
Fixes:
d1efa09d342 "util: import sha1 implementation from OpenBSD"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
9f8dc3bf03ec825bae7041858dda6ca2e9a34363)
Emil Velikov [Fri, 13 Jan 2017 16:51:31 +0000 (16:51 +0000)]
util: import sha1 implementation from OpenBSD
At the moment we support 5+ different implementations each with varying
amount of bugs - from thread safely problems [1], to outright broken
implementation(s) [2]
In order to accommodate these we have 150+ lines of configure script and
extra two configure toggles. Whist an actual implementation being
~200loc and our current compat wrapping ~250.
Let's not forget that different people use different code paths, thus
effectively makes it harder to test and debug since the default
implementation is automatically detected.
To minimise all these lovely experiences, import the "100% Public
Domain" OpenBSD sha1 implementation. Clearly document any changes needed
to get building correctly, since many/most of those can be upstreamed
making future syncs easier.
As an added bonus this will avoid all the 'fun' experiences trying to
integrate it with the Android and SCons builds.
v2: Manually expand __BEGIN_DECLS/__END_DECLS and document (Tapani).
Furthermore it seems that some games (or surrounding runtime) static
link against OpenSSL resulting in conflicts. For more information see
the discussion thread [3]
Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904
Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967
[3] https://lists.freedesktop.org/archives/mesa-dev/2017-January/140748.html
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Jonathan Gray <jsg@jsg.id.au>
Tested-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Kenneth Graunke [Fri, 11 Nov 2016 22:52:36 +0000 (14:52 -0800)]
i965: Make brw_cache_item structure private to brw_program_cache.c.
struct brw_cache_item is an implementation detail of the program cache.
We don't need to make those internals available to the entire driver.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Marek Olšák [Tue, 17 Jan 2017 21:03:23 +0000 (22:03 +0100)]
radeonsi: determine in advance which VBOs should be added to the buffer list
v2: now it should be correct
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 20:55:59 +0000 (21:55 +0100)]
radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 20:49:50 +0000 (21:49 +0100)]
radeonsi: reject invalid vertex buffer indices at state creation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 20:30:23 +0000 (21:30 +0100)]
radeonsi: use a global dirty mask for shader pointers
Only vertex buffers use a separate bool flag.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 19:46:39 +0000 (20:46 +0100)]
radeonsi: use a bitmask-based loop in si_decompress_textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 19:10:15 +0000 (20:10 +0100)]
radeonsi: skip an unnecessary mutex lock for L2 prefetches
the mutex lock is inside util_range_add.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 17:42:23 +0000 (18:42 +0100)]
radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 17:40:26 +0000 (18:40 +0100)]
radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE
the next commit will use it in a clever way, because the CP DMA prefetch
doesn't need this.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 17:04:13 +0000 (18:04 +0100)]
radeonsi: use the correct target machine when building shader variants
If the shader selector is created with a different context than
the shader variant, we should use the calling context's target machine
for the shader variant.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Jan 2017 16:39:16 +0000 (17:39 +0100)]
radeonsi: move shader pipe context state into a separate structure
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Ben Widawsky [Fri, 2 Dec 2016 19:10:47 +0000 (11:10 -0800)]
i965: Fix SURFACE_STATE to handle non-zero aux offsets
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>