git.osdn.net Git - uclinux-h8/linux.git/log

drm/i915: Implement dynamic GuC WOPCM offset and size calculation

Hardware may have specific restrictions on GuC WOPCM offset and size. On
Gen9, the value of the GuC WOPCM size register needs to be larger than the
value of GuC WOPCM offset register + a Gen9 specific offset (144KB) for
reserved GuC WOPCM. Fail to enforce such a restriction on GuC WOPCM size
will lead to GuC firmware execution failures. On the other hand, with
current static GuC WOPCM offset and size values (512KB for both offset and
size), the GuC WOPCM size verification will fail on Gen9 even if it can be
fixed by lowering the GuC WOPCM offset by calculating its value based on
HuC firmware size (which is likely less than 200KB on Gen9), so that we can
have a GuC WOPCM size value which is large enough to pass the GuC WOPCM
size check.

This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to
24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support
to verify the GuC WOPCM size aganist the Gen9 hardware restrictions. To
meet all above requirements, let's provide dynamic partitioning of the
WOPCM that will be based on platform specific HuC/GuC firmware sizes.

v2:
- Removed intel_wopcm_init (Ville/Sagar/Joonas)
- Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar)
- Removed unnecessary function calls (Joonas)
- Init GuC WOPCM partition as soon as firmware fetching is completed

v3:
- Fixed indentation issues (Chris)
- Removed layering violation code (Chris/Michal)
- Created separat files for GuC wopcm code  (Michal)
- Used inline function to avoid code duplication (Michal)

v4:
- Preset the GuC WOPCM top during early GuC init (Chris)
- Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed

v5:
- Moved GuC DMA WOPCM register updating code into intel_wopcm.c
- Took care of the locking status before writing to GuC DMA
   Write-Once registers. (Joonas)

v6:
- Made sure the GuC WOPCM size to be multiple of 4K (4K aligned)

v8:
- Updated comments and fixed naming issues (Sagar/Joonas)
- Updated commit message to include more description about the hardware
   restriction on GuC WOPCM size (Sagar)

v9:
- Minor changes variable names and code comments (Sagar)
- Added detailed GuC WOPCM layout drawing (Sagar/Michal)
- Refined macro definitions to be reader friendly (Michal)
- Removed redundent check to valid flag (Michal)
- Unified first parameter for exported GuC WOPCM functions (Michal)
- Refined the name and parameter list of hardware restriction checking
   functions (Michal)

v10:
- Used shorter function name for internal functions (Joonas)
- Moved init-ealry function into c file (Joonas)
- Consolidated and removed redundant size checks (Joonas/Michal)
- Removed unnecessary unlikely() from code which is only called once
   during boot (Joonas)
- More fixes to kernel-doc format and content (Michal)
- Avoided the use of PAGE_MASK for 4K pages (Michal)
- Added error log messages to error paths (Michal)

v11:
- Replaced intel_guc_wopcm with more generic intel_wopcm and attached
   intel_wopcm to drm_i915_private instead intel_guc (Michal)
- dynamic calculation of GuC non-wopcm memory start (a.k.a WOPCM Top
   offset from GuC WOPCM base) (Michal)
- Moved WOPCM marco definitions into .c source file (Michal)
- Exported WOPCM layout diagram as kernel-doc (Michal)

v12:
- Updated naming, function kernel-doc to align with new changes (Michal)

v13:
- Updated the ordering of s-o-b/cc/r-b tags (Sagar)
- Corrected one tense error in comment (Sagar)
- Corrected typos and removed spurious comments (Joonas)

Bspec: 12690

Signed-off-by: Jackie Li <yaodong.li@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com

drm/i915/guc: Rename guc_ggtt_offset to intel_guc_ggtt_offset

GuC related exported functions should start with "intel_guc_" prefix and
pass intel_guc as the first parameter since its GuC related. Current
guc_ggtt_offset() failed to follow this code convention and this is a
problem for future patches that needs to access intel_guc data to verify
the GGTT offset against the GuC WOPCM top.

This patch renames the guc_ggtt_offset to intel_guc_ggtt_offset and updates
the related code to pass intel_guc pointer to this function call, so that
we can have a unified coding style for GuC code and also enable the future
patches to get GuC related data from intel_guc to do the offset
verification. Meanwhile, this patch also moves the GUC_GGTT_TOP from
intel_guc_regs.h to intel_guc.h since it is not GuC register related
definition.

v8:
- Fixed coding style issues and moved GUC_GGTT_TOP to intel_guc.h (Sagar)
- Updated commit message to explain to reason and motivation to add
intel_guc as the first parameter of intel_guc_ggtt_offset (Chris)

v9:
- Fixed code alignment issue due to line break (Chris)

v10:
- Removed unnecessary comments, redundant code and avoided reuse variable
to avoid potential issues (Joonas)

v13:
- Updated the ordering of s-o-b/cc/r-b tags (Sagar)

Signed-off-by: Jackie Li <yaodong.li@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-1-git-send-email-yaodong.li@intel.com

drm/i915/psr: Comment to clarify SRD_DEBUG is called PSR_MASK SKL+

What was called SRD_DEBUG(0x6F860) on HSW and BDW was renamed to PSR_MASK
SKL onwards, add a note next to the macro definition.
There is also a different PSR_DEBUG on SKL+ to add to the confusion.

Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313040954.6289-1-dhinakaran.pandiyan@intel.com

drm/i915: Show GEM_TRACE when detecting a failed GPU idle

If we timeout waiting for the GPU to idle, something went seriously
wrong. We currently dump the engine state, but we can also dump the
ftrace buffer showing our last operations (when available).

In passing, note that since commit 559e040f1f08 ("drm/i915: Show the GPU
state when declaring wedged", we now show the engine state twice, once
in detecting the failed idle and then again on declaring wedged.

v2: ftrace_dump() takes a parameter specifying whether to dump all cpu
buffers or the local cpu's.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309101114.1138-1-chris@chris-wilson.co.uk

drm/i915: Move CUR SURFLIVE definition to a better place.

No functional change. But let's keep definitions clean
and cursor related register definitions together.

v2: Fix caps x no caps on same reg. Change name to match
    original reg name. (by Ville).
    Also fix name on code s/surlive/surflive and on subject
    s/cur_surlife/cur surflive/.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180312210528.7905-1-rodrigo.vivi@intel.com

drm/i915/psr: Remove PSR active flag from debugfs

The flag becomes misleading with flips and cursor moves not modifying it's
state as HW takes care of exiting PSR (when HW tracking is enabled)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313044211.27105-1-dhinakaran.pandiyan@intel.com

drm/i915/psr: Use more PSR HW tracking.

So far we are using frontbuffer tracking for everything
and ignoring that PSR has a HW capable HW tracking for many
modern usages of GPU on Core platforms and newer Atom ones.

One reason for that is that we were trying to keep same
infrastructure in place for VLV/CHV than the rest of platforms.
But also because when this infrastructure was created
the front-buffer-tracking origin wasn't that good and stable
how it is today after Paulo reworked it to attend FBC cases.

However this PSR implementation without HW tracking died
on gen8LP. And newer platforms are starting to demand more HW
tracking specially with PSR2 cases in mind.

By disabling and re-enabling PSR totally every time we believe
someone is going to change the front buffer content we don't
allow PSR HW tracking to do this job and specially compromising
the whole idea of PSR2 case where the HW tracking detect only
the damaged area and do a partial screen update.

So, from now on, on the platforms that has hw_tracking let's
rely more on HW tracking.

This also is the case in used by other drivers and more validated
by SV teams. So I hope that this will lead us to less misterious
bugs.

v2: Only do this for platform that actually has hw tracking.

v3 from DK
Do this only for flips, small gradual changes are better.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Cc: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Jose Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-3-dhinakaran.pandiyan@intel.com

drm/i915/frontbuffer: HW tracking for cursor moves to fix PSR lags.

DRM_IOCTL_MODE_CURSOR results in frontbuffer flush before the cursor
plane MMIOs are written to. But this flush should not be necessary for
PSR as hardware tracking triggers PSR exit when MMIOs are written. As
for FBC, the spec says "Flips or changes to plane size and panning" cause
FBC to be nuked. Use origin == ORIGIN_FLIP so that features can ignore
cursor updates in their frontbuffer_flush implementations.

/sys/kernel/debug/dri/0/i915_fbc_status shows
"Compressing: yes" when I move the cursor around.

v3: Use ORIGIN_FLIP now that pin_to_display does not flush frontbuffer.
v2: Update comment in i915_gem_object_pin_to_display_plane. (Chris)

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-2-dhinakaran.pandiyan@intel.com

drm/i915/frontbuffer: Pull frontbuffer_flush out of gem_obj_pin_to_display

i915_gem_obj_pin_to_display() calls frontbuffer_flush with origin set to
DIRTYFB. The callers however are at a vantage point to decide if hardware
frontbuffer tracking can do the flush for us. For example, legacy cursor
updates, like flips, write to MMIO registers, which then triggers PSR flush
by the hardware. Moving frontbuffer_flush out will enable us to skip a
software initiated flush by setting origin to FLIP. Thanks to Chris for the
idea.

v2:
Rebased due to Ville adding intel_plane_pin_fb().
Minor code reordering as fb_obj_flush doesn't need struct_mutex (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-1-dhinakaran.pandiyan@intel.com

drm/i915: Use sseu size for determining eu_regs[]

eu_regs[] is written 2*max_slices times (like s_reg[]) but oddly read
2*max_slices + max_subslices/2 times. Allocate the array large enough
for the writes to avoid overwriting our stack and worry about the logic
later.

Fixes: 7aa0b14ede64 ("drm/i915: Remove variable length arrays from sseu debugfs printers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105479
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313113149.1094-1-chris@chris-wilson.co.uk

drm/i915: Warn against variable length arrays

VLA are strongly discouraged in the kernel due to ambiguity they impose
on the limited stack space and security concerns over manipulating the
stack frame. Add -Wvla to our compiler flags so that CI rejects them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313004055.25411-2-chris@chris-wilson.co.uk

drm/i915: Remove variable length arrays from sseu debugfs printers

In order to enable -Wvla to prevent new variable length arrays being
used in i915.ko, we first must remove the existing VLA. Inside
i915_print_sseu_info(), VLA are used as the actual size of the sseu
depends on platform. Replace the VLA with the maximum required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313004055.25411-1-chris@chris-wilson.co.uk

drm/i915/uc: Sanitize uC together with GEM

Instead of dancing around uC on reset/suspend/resume scenarios,
explicitly sanitize uC when we sanitize GEM to force uC reload
and start from known beginning.

v2: don't forget about reset path (Daniele)
sanitize uc before gem initiated full reset (Daniele)
v3: drop redundant disable_communication in init_hw (Daniele)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180312130308.22952-3-michal.wajdeczko@intel.com

drm/i915/uc: Sanitize uC options early

We are sanitizing uC related modparams together with other driver
modparams in intel_sanitize_options called from i915_driver_init_hw,
but this is too late for us as we will want to use USES_GUC/USES_HUC
macros at earlier stage. Since our sanitizing does not require any
MMIO access, we can do it in intel_uc_init_early right after we resolve
firmware names.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180312130308.22952-2-michal.wajdeczko@intel.com

drm/i915: Remove the impedance mismatch around intel_engine_enable_signaling

There is some redundancy between dma_fence->ops->enable_signaling (via
i915_fence_enable_signaling) and our backend,
intel_engine_enable_signaling() in that both levels recheck the fence
status multiple times. If we convert intel_engine_enable_signaling() to
return the information desired by dma_fence->ops->enable_signaling, we
can reduce i915_fence_enable_signaling to a simple stub and avoid
trying to reinterpret the same information.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308140732.25090-1-chris@chris-wilson.co.uk

drm/i915/psr: Display WA 0884 applied broadly for more HW tracking.

WA 0884:bxt:all,cnl:*:A - "When FBC is enabled with eDP PSR,
the CPU host modify writes may not get updated on the Display
as expected.
WA: Write 0x00000000 to CUR_SURFLIVE_A with every CPU
host modify write to trigger PSR exit."

We can also find on spec other cases where they describe
bogus writes to cursor registers to force PSR exit with
HW tracking. And it was confirmed by HW engineers that
this Wa can be safely applied for any frontbuffer activity.

So let's use this more and more here instead of forcibly
disable and re-enable PSR everytime that we have a simple
reliable flush case.

Other commits improve the fbcon/fbdev use a lot, but this
approach is the only when where we can get a fully reliable
console with no slowness or missed frames and PSR still
enabled and active.

v2: - Rebase on drm-tip
    - (DK) Add a comment to explain that WA
    tells about writing 0 to CUR_SURFLIVE_A but we write to
    CUR_SURFLIVE(pipe).
v3: Wa doesn't work on PSR2.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309005218.26772-1-rodrigo.vivi@intel.com

drm/i915: Move i915_gpu_error into its own header

Error state management code was moved into separate .c unit
but we didn't move related definitions into own header.

v2: move also intel_display_error_state forward decl
fix ("Prefer 'unsigned int' to bare use of 'unsigned'")
warnings detected by checkpatch in moved code (Michal)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-5-michal.wajdeczko@intel.com

drm/i915: Make header i915_pmu.h more robust

Definitions in i915_pmu.h header depend on other types and
declarations that were not explicitly included. Fix that by
adding related headers and forward declarations.
While here, change license text to SPDX format.

v2: don't drop "intel_ringbuffer.h" (Tvrtko)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-4-michal.wajdeczko@intel.com

drm/i915: Change parameters order in i915_gem_batch_pool_init

Function i915_gem_batch_pool_init() failed to follow obj-verb
naming schema. Fix that by swapping function parameters.
While here, change license text to SPDX format.

v2: use intel_engine_init_batch_pool (Chris) as proxy (Michal)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-3-michal.wajdeczko@intel.com

drm/i915: Include i915_reg.h in intel_ringbuffer.h

Header intel_ringbuffer.h is using definitions from i915_reg.h
but forget to include it. Remove this hidden dependency by
explicitly include missing header.

v2: add reminder (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-2-michal.wajdeczko@intel.com

drm/i915/guc: Move GuC notification handling to separate function

To allow future code reuse. While here, fix comment style.

v2: Notifications are a separate thing - rename the handler (Sagar)

Suggested-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308154707.21716-3-michal.winiarski@intel.com

drm/i915/guc: Create common entry points for log register/unregister

We have many functions responsible for allocating different parts of
GuC log runtime called from multiple places. Let's stick with keeping
everything in guc_log_register instead.

v2: Use more generic intel_uc_register name, keep using "misc" suffix (Michał)
s/dev_priv/i915 (Sagar)
Make guc_log_relay_* static (sparse)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308154707.21716-2-michal.winiarski@intel.com

drm/i915/guc: Tidy guc_log_control

We plan to decouple log runtime (mapping + relay) from verbosity control.
Let's tidy the code now to reduce the churn in the following patches.

v2: Tidy macros, keep debug messages, use helper var for enable,
correct typo (Michał)
Fix incorrect input validaction (Sagar)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308154707.21716-1-michal.winiarski@intel.com

drm/i915: Remove unused DP_LINK_CHECK_TIMEOUT

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308232421.14049-2-lyude@redhat.com

drm/i915: Only call tasklet_kill() on the first prepare_reset

tasklet_kill() will spin waiting for the current tasklet to be executed.
However, if tasklet_disable() has been called, then the tasklet is never
executed but permanently put back onto the runlist until
tasklet_enable() is called. Ergo, we cannot use tasklet_kill() inside a
disable/enable pair. This is the case when we call set-wedge from inside
i915_reset(), and another request was submitted to us concurrent to the
reset.

Fixes: 963ddd63c314 ("drm/i915: Suspend submission tasklets around wedging")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-6-chris@chris-wilson.co.uk

drm/i915: Wrap engine->schedule in RCU locks for set-wedge protection

Similar to the staging around handling of engine->submit_request, we
need to stop adding to the execlists->queue prior to calling
engine->cancel_requests. cancel_requests will move requests from the
queue onto the timeline, so if we add a request onto the queue after that
point, it will be lost.

Fixes: af7a8ffad9c5 ("drm/i915: Use rcu instead of stop_machine in set_wedged")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-5-chris@chris-wilson.co.uk

drm/i915: Include ring->emit in debugging

Include ring->emit and ring->space alongside ring->(head,tail) when
printing debug information.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-4-chris@chris-wilson.co.uk

drm/i915: Update ring position from request on retiring

When wedged, we do not update the ring->tail as we submit the requests
causing us to leak the ring->space upon cleaning up the wedged driver.
We can just use the value stored in rq->tail, and keep the submission
backend details away from set-wedge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-3-chris@chris-wilson.co.uk

drm/i915: Finish the wait-for-wedge by retiring all the inflight requests

Before we reset the GPU after marking the device as wedged, we wait for
all the remaining requests to be completed (and marked as EIO).
Afterwards, we should flush the request lists so the next batch start
with the driver in an idle state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-1-chris@chris-wilson.co.uk

drm/i915/icl: do not save DDI A/E sharing bit for ICL

We don't want to preserve the DDI A 4 lane bit on ICL.

Fixes: 3d2011cfa41f ("drm/i915/icl: remove port A/E lane sharing limitation.")
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306104155.3526-1-jani.nikula@intel.com

drm/i915: Push irq_shift from gen8_cs_irq_handler() to caller

Originally we were inlining gen8_cs_irq_handler() and so expected the
compiler to constant-fold away the irq_shift (so we had hardcoded it as
opposed to use engine->irq_shift). However, we dropped the inline given
the proliferation of gen8_cs_irq_handler()s. If we pull the shifting
of the iir into the caller, we can shrink the code still further:

add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-34 (-34)
Function                                     old     new   delta
gen8_cs_irq_handler                          123     118      -5
gen8_gt_irq_handler                          261     248     -13
gen11_irq_handler                            722     706     -16

v2: Drop gen11_cs_irq_handler now that it is a simple
stub around gen8_cs_irq_handler (Daniele)

References: 5d3d69d5c119 ("drm/i915: Stop inlining the execlists IRQ handler")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309010808.11921-1-chris@chris-wilson.co.uk

drm/i915: Index the ring frequency table by HW frequency range

When reporting the frequency table stored in the punit, report the full
range and not just the user restricted frequency range. In the process
keep the code to set the frequency table and read it the same.

v3: As we haven't separated the sb_lock from the pcu_lock yet, there's a
cycle between the pcu_lock and intel_runtime_pm_get.

References: f936ec34dea8 ("drm/i915/skl: Updated the i915_ring_freq_table debugfs function")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20180308142648.4016-2-chris@chris-wilson.co.uk

drm/i915: Kick the rps worker when changing the boost frequency

The boost frequency is only applied from the RPS worker while someone is
waiting on a request and requested a boost. As such, when the user
wishes to change the frequency, we have to kick the worker in order to
re-evaluate whether to apply the boost frequency.

v2: Check num_waiters to decide if we should kick the worker to handle
boosting.

Fixes: 29ecd78d3b79 ("drm/i915: Define a separate variable and control for RPS waitboost frequency")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308142648.4016-1-chris@chris-wilson.co.uk

drm/i915: Handle pipe CRC around enabling/disabling pipe.

This will get rid of the following error:
[   74.730271] WARNING: CPU: 4 PID: 0 at drivers/gpu/drm/drm_vblank.c:614 drm_calc_vbltimestamp_from_scanoutpos+0x13e/0x2f0
[   74.730311] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep broadcom ghash_clmulni_intel snd_hda_core bcm_phy_lib snd_pcm tg3 lpc_ich mei_me mei prime_numbers
[   74.730353] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G     U           4.16.0-rc2-CI-CI_DRM_3822+ #1
[   74.730355] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
[   74.730359] RIP: 0010:drm_calc_vbltimestamp_from_scanoutpos+0x13e/0x2f0
[   74.730361] RSP: 0018:ffff88022fb03d10 EFLAGS: 00010086
[   74.730365] RAX: ffffffffa0291d20 RBX: ffff88021a180000 RCX: 0000000000000001
[   74.730367] RDX: ffffffff820e7db8 RSI: 0000000000000001 RDI: ffffffff82068cea
[   74.730369] RBP: ffff88022fb03d70 R08: 0000000000000000 R09: ffffffff815d26d0
[   74.730371] R10: 0000000000000000 R11: ffffffffa0161ca0 R12: 0000000000000001
[   74.730373] R13: ffff880212448008 R14: ffff880212448330 R15: 0000000000000000
[   74.730376] FS:  0000000000000000(0000) GS:ffff88022fb00000(0000) knlGS:0000000000000000
[   74.730378] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.730380] CR2: 000055edcbec9000 CR3: 0000000002210001 CR4: 00000000000606e0
[   74.730382] Call Trace:
[   74.730385]  <IRQ>
[   74.730397]  drm_get_last_vbltimestamp+0x36/0x50
[   74.730401]  drm_update_vblank_count+0x64/0x240
[   74.730409]  drm_crtc_accurate_vblank_count+0x41/0x90
[   74.730453]  display_pipe_crc_irq_handler+0x176/0x220 [i915]
[   74.730497]  i9xx_pipe_crc_irq_handler+0xfe/0x150 [i915]
[   74.730537]  ironlake_irq_handler+0x618/0xa30 [i915]
[   74.730548]  __handle_irq_event_percpu+0x3c/0x340
[   74.730556]  handle_irq_event_percpu+0x1b/0x50
[   74.730561]  handle_irq_event+0x2f/0x50
[   74.730566]  handle_edge_irq+0xe4/0x1b0
[   74.730572]  handle_irq+0x11/0x20
[   74.730576]  do_IRQ+0x5e/0x120
[   74.730584]  common_interrupt+0x84/0x84
[   74.730586]  </IRQ>
[   74.730591] RIP: 0010:cpuidle_enter_state+0xaa/0x350
[   74.730593] RSP: 0018:ffffc9000008beb8 EFLAGS: 00000212 ORIG_RAX: ffffffffffffffde
[   74.730597] RAX: ffff880226b80040 RBX: 000000000031fc3e RCX: 0000000000000001
[   74.730599] RDX: 0000000000000000 RSI: ffffffff8210fb59 RDI: ffffffff820c02e7
[   74.730601] RBP: 0000000000000004 R08: 00000000000040af R09: 0000000000000018
[   74.730603] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004
[   74.730606] R13: ffffe8ffffd00430 R14: 0000001166120bf4 R15: ffffffff82294460
[   74.730621]  ? cpuidle_enter_state+0xa6/0x350
[   74.730629]  do_idle+0x188/0x1d0
[   74.730636]  cpu_startup_entry+0x14/0x20
[   74.730641]  start_secondary+0x129/0x160
[   74.730646]  secondary_startup_64+0xa5/0xb0
[   74.730660] Code: e1 48 c7 c2 b8 7d 0e 82 be 01 00 00 00 48 c7 c7 ea 8c 06 82 e8 64 ec ff ff 48 8b 83 c8 07 00 00 48 83 78 28 00 0f 84 e2 fe ff ff <0f> 0b 45 31 ed e9 db fe ff ff 41 b8 d3 4d 62 10 89 c8 6a 03 41
[   74.730754] ---[ end trace 14b1345705b68565 ]---

Changes since v1:
- Don't try to apply CRC workaround when enabling pipe, it should already be enabled.
Changes since v2:
- Make crc functions for !DEBUGFS case inline.
- Pass intel_crtc to crc functions.
- Add comments to callsites.
Changes since v3:
- Cache selected source to pipe_crc->source.
- Set pipe_crc->skipped to MIN_INT during disable to close a race condition.
Changes since v4:
- Handle fallout from setting pipe_crc->source in irq handler.

Cc: Marta Löfstedt <marta.lofstedt@intel.com>
Reported-by: Marta Löfstedt <marta.lofstedt@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105185
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308120202.52446-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Only prune fences after wait-for-all

Currently, we only allow ourselves to prune the fences so long as
all the waits completed (i.e. all the fences we checked were signaled),
and that the reservation snapshot did not change across the wait.
However, if we only waited for a subset of the reservation object, i.e.
just waiting for the last writer to complete as opposed to all readers
as well, then we would erroneously conclude we could prune the fences as
indeed although all of our waits were successful, they did not represent
the totality of the reservation object.

v2: We only need to check the shared fences due to construction (i.e.
all of the shared fences will be later than the exclusive fence, if
any).

Fixes: e54ca9774777 ("drm/i915: Remove completed fences after a wait")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307171303.29466-1-chris@chris-wilson.co.uk

drm/i915: Update DRIVER_DATE to 20180308

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge tag 'gvt-next-2018-03-08' of https://github.com/intel/gvt-linux into drm-intel-next-queued

gvt-next-2018-03-08

- big refactor for shadow ppgtt (Changbin)
- KBL context save/restore via LRI cmd (Weinan)
- misc smatch fixes (Zhenyu)
- Properly unmap dma for guest page (Changbin)
- other misc fixes (Xiong, etc.)

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308023152.oi4ialn5uxetbruf@zhen-hp.sh.intel.com

drm/i915: add schedule out notification of preempted but completed request

There is one corner case missing schedule out notification of the preempted
request. The preempted request is just completed when preemption happen,
then it will be canceled and won't be resubmitted later, GVT-g will lost
the schedule out notification.

Here add schedule out notification if found the preempted request has been
completed.

v2:
- refine description, add completed check and notification in
execlists_cancel_port_requests. (Chris)

v3:
- use ternary confitional, remove local variable. (Tvrtko)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520302557-25079-1-git-send-email-weinan.z.li@intel.com

drm/i915: expose rcs topology through query uAPI

With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available. Here we introduce a more detailed way of querying the
Gen's GPU topology that doesn't aggregate numbers.

This is essential for monitoring parts of the GPU with the OA unit,
because counters need to be normalized to the number of
EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
not gives us sufficient information.

The Mesa series making use of this API is :

    https://patchwork.freedesktop.org/series/38795/

As a bonus we can draw representations of the GPU :

    https://imgur.com/a/vuqpa

v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
    Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
    Add uapi macros to read data from *_info structs (Tvrtko)

v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko)

v4: factorize query item writting (Tvrtko)
    tweak uapi struct/define names (Tvrtko)

v5: Replace ALIGN() macro (Chris)

v6: Updated uapi comments (Tvrtko)
    Moved flags != 0 checks into vfuncs (Tvrtko)

v7: Use access_ok() before copying anything, to avoid overflows (Chris)
    Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)

v8: Tweak uapi comments style to match the coding style (Lionel)

v9: Fix error in comment about computation of enabled subslice (Tvrtko)

v10: Fix/update comments in uAPI (Sagar)

v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
     drm_i915_query_topology_info (Joonas)

v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas)

v13: Fix comment in uAPI (Joonas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-7-lionel.g.landwerlin@intel.com

drm/i915: add query uAPI

There are a number of information that are readable from hardware
registers and that we would like to make accessible to userspace. One
particular example is the topology of the execution units (how are
execution units grouped in subslices and slices and also which ones
have been fused off for die recovery).

At the moment the GET_PARAM ioctl covers some basic needs, but
generally is only able to return a single value for each defined
parameter. This is a bit problematic with topology descriptions which
are array/maps of available units.

This change introduces a new ioctl that can deal with requests to fill
structures of potentially variable lengths. The user is expected fill
a query with length fields set at 0 on the first call, the kernel then
sets the length fields to the their expected values. A second call to
the kernel with length fields at their expected values will trigger a
copy of the data to the pointed memory locations.

The scope of this uAPI is only to provide information to userspace,
not to allow configuration of the device.

v2: Simplify dispatcher code iteration (Tvrtko)
    Tweak uapi drm_i915_query_item structure (Tvrtko)

v3: Rename pad fields into flags (Chris)
    Return error on flags field != 0 (Chris)
    Only copy length back to userspace in drm_i915_query_item (Chris)

v4: Use array of functions instead of switch (Chris)

v5: More comments in uapi (Tvrtko)
    Return query item errors in length field (All)

v6: Tweak uapi comments style to match the coding style (Lionel)

v7: Add i915_query.h (Joonas)

v8: (Lionel) Change the behavior of the item iterator to report
    invalid queries into the query item rather than stopping the
    iteration. This enables userspace applications to query newer
    items on older kernels and only have failure on the items that are
    not supported.

v9: Edit copyright headers (Joonas)

v10: Typos & comments in uapi (Joonas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-6-lionel.g.landwerlin@intel.com

drm/i915: add rcs topology to error state

This might be useful information for developers looking at an error
state.

v2: Place topology towards the end of the error state (Chris)

v3: Reuse common printing code (Michal)

v4: Make this a one-liner (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-5-lionel.g.landwerlin@intel.com

drm/i915/debugfs: add rcs topology entry

While the end goal is to make this information available to userspace
through a new ioctl, there is no reason we can't display it in a human
readable fashion through debugfs.

slice0: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice1: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice2: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)

v2: Reformat debugfs printing (Tvrtko)
Use the new EU mask helper (Tvrtko)

v3: Move printing code to intel_device_info.c to be shared with error
state (Michal)

v4: Bump u8 to u16 when using sseu_get_eus() (Lionel)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-4-lionel.g.landwerlin@intel.com

drm/i915/debugfs: reuse max slice/subslices already stored in sseu

Now that we have that information in topology fields, let's just reuse it.

v2: Style tweaks (Tvrtko)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-3-lionel.g.landwerlin@intel.com

drm/i915: store all subslice masks

Up to now, subslice mask was assumed to be uniform across slices. But
starting with Cannonlake, slices can be asymmetric (for example slice0
has different number of subslices as slice1+). This change stores all
subslices masks for all slices rather than having a single mask that
applies to all slices.

v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
    Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
    Readability changes (Tvrtko)
    Add EU index helper (Tvrtko)

v3: Turn ALIGN(v, 8) / 8 into DIV_ROUND_UP(v, BITS_PER_BYTE) (Tvrtko)
    Reuse sseu_eu_idx() for setting eu_mask on CHV (Tvrtko)
    Reformat debug prints for subslices (Tvrtko)

v4: Change eu_mask helper into sseu_set_eus() (Tvrtko)

v5: With Haswell reporting masks & counts, bump sseu_*_eus() functions
    to use u16 (Lionel)

v6: Fix sseu_get_eus() for > 8 EUs per subslice (Lionel)

v7: Change debugfs enabels for number of subslices per slice, will
    need a small igt/pm_sseu change (Lionel)
    Drop subslice_total field from sseu_dev_info, rely on
    sseu_subslice_total() to recompute the value instead (Lionel)

v8: Remove unused function compute_subslice_total() (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-2-lionel.g.landwerlin@intel.com

drm/i915/guc: work around gcc-4.4.4 union initializer issue

gcc-4.4.4 has problems with initalizers of anon unions.

drivers/gpu/drm/i915/intel_guc_log.c: In function 'guc_log_control':
drivers/gpu/drm/i915/intel_guc_log.c:64: error: unknown field 'logging_enabled' specified in initializer

Work around this.

Fixes: 35fe703c3161 ("drm/i915/guc: Change values for i915_guc_log_control")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308001333.rI2vrNRTY%akpm@linux-foundation.org

drm/i915/cnl: Add Wa_2201832410

"Clock gating bug in GWL may not clear barrier state when an EOT
is received, causing a hang the next time that barrier is used."

HSDES: 2201832410

Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307220912.3681-1-rodrigo.vivi@intel.com

drm/i915/icl: Gen11 forcewake support

The main difference with previous GENs is that starting from Gen11
each VCS and VECS engine has its own power well, which only exist
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.

BSpec: 18331

v2: fix fwtable, use array to test shadow tables, create new
    accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
  - Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
  - Use the BIT macro for forcewake domains (Daniele)
  - Add a comment about the range ordering (Oscar)
  - Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
    to GEN >= 11
v10: Generate is_genX_shadowed with a macro (Daniele)
     Include gen11_fw_ranges in the selftest (Michel)
v11: Simplify FORCEWAKE_ALL, new line between NEEDS_FORCEWAKEs (Tvrtko)

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-6-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/icl: Add Indirect Context Offset for Gen11

v2: rebased to intel_lr_indirect_ctx_offset
v3: rebase, move define to intel_lrc_reg.h

BSpec: 11740
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-5-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/icl: Enhanced execution list support

Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue
(the ExecLists Submission Queue - ELSQ), which is then loaded on the
HW. When writing to the ELSP register, the lrcs are written cyclically
in the queue from position 0 to position 7. Alternatively, it is
possible to write directly in the individual positions of the queue
using the ELSQC registers. To be able to re-use all the existing code
we're using the latter method and we're currently limiting ourself to
only using 2 elements.

v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
preemption (Daniele), reuse engine->execlists.elsp (Chris)
v6: use has_logical_ring_elsq to differentiate the new paths
v7: add preemption support, rename els to submit_reg (Chris)
v8: save the ctrl register inside the execlists struct, drop CSB
handling updates (superseded by preempt_complete_status) (Chris)
v9: s/drm_i915_gem_request/i915_request (Mika)
v10: resolved conflict in inject_preempt_context (Mika)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-4-mika.kuoppala@linux.intel.com

drm/i915/icl: new context descriptor support

Starting from Gen11 the context descriptor format has been updated in
the HW. The hw_id field has been considerably reduced in size and engine
class and instance fields have been added.

There is a slight name clashing issue because the field that we call
hw_id is actually called SW Context ID in the specs for Gen11+.

With the current size of the hw_id field we can have a maximum of 2k
contexts at any time, but we could use the sw_counter field (which is sw
defined) to increase that because the HW requirement is that
engine_id + sw id + sw_counter is a unique number.
GuC uses a similar method to support more contexts but does its tracking
at lrc level. To avoid doing an implementation that will need to be
reworked once GuC support lands, defer it for now and mark it as TODO.

v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
v3: rebased, bring back lost code from i915_gem_context.c
v4: make TODO comment more generic
v5: be consistent with bit ordering, add extra checks (Chris)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/icl: Correctly initialize the Gen11 engines

Gen11 has up to 4 VCS and up to 2 VECS engines, this patch adds mmio
base definitions for all of them.

Bspec: 20944
Bspec: 7021

v2: Set the correct mmio_base in intel_engines_init_mmio; updating the
base mmio values any later would cause incorrect reads in
i915_gem_sanitize (Michel).

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-2-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Assert that the request is indeed complete when signaled from irq

After we call dma_fence_signal(), confirm that the request was indeed
complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305104105.8296-1-chris@chris-wilson.co.uk

drm/i915: Handle changing enable_fbc parameter at runtime better.

If i915.enable_fbc is cleared at runtime, but FBC was previously enabled
then we don't disable FBC until the next time the crtc is disabled.

Make sure that if the module param is changed, we disable FBC in
intel_fbc_post_update so we never have to worry about disabling.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305123608.20665-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Track whether the DP link is trained or not

LSPCON likes to throw short HPDs during the enable seqeunce prior to the
link being trained. These obviously result in the channel CR/EQ check
failing and thus we schedule a pointless hotplug work to retrain the
link. Avoid that by ignoring the bad CR/EQ status until we've actually
initially trained the link.

I've not actually investigated to see what LSPCON is trying to signal
with the short pulse. But as long as it signals anything I think we're
supposed to check the link status anyway, so I don't really see other
good ways to solve this. I've not seen these short pulses being
generated by normal DP sinks.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-5-ville.syrjala@linux.intel.com

drm/i915: Nuke intel_dp->channel_eq_status

intel_dp->channel_eq_status is used in exactly one function, and we
don't need it to persist between calls. So just go back to using a
local variable instead.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-4-ville.syrjala@linux.intel.com

drm/i915: Move SST DP link retraining into the ->post_hotplug() hook

Doing link retraining from the short pulse handler is problematic since
that might introduce deadlocks with MST sideband processing. Currently
we don't retrain MST links from this code, but we want to change that.
So better to move the entire thing to the hotplug work. We can utilize
the new encoder->hotplug() hook for this.

The only thing we leave in the short pulse handler is the link status
check. That one still depends on the link parameters stored under
intel_dp, so no locking around that but races should be mostly harmless
as the actual retraining code will recheck the link state if we
end up there by mistake.

v2: Rebase due to ->post_hotplug() now being just ->hotplug()
Check the connector type to figure out if we should do
the HDMI thing or the DP think for DDI

[pushed with whitespace changes for sparse]
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-3-ville.syrjala@linux.intel.com

drm/i915: Reinitialize sink scrambling/TMDS clock ratio on HPD

The LG 4k TV I have doesn't deassert HPD when I turn the TV off, but
when I turn it back on it will pulse the HPD line. By that time it has
forgotten everything we told it about scrambling and the clock ratio.
Hence if we want to get a picture out if it again we have to tell it
whether we're currently sending scrambled data or not. Implement
that via the encoder->hotplug() hook.

v2: Force a full modeset to not follow the HDMI 2.0 spec more
closely (Shashank)

[pushed with whitespace fixes to make sparse happy]
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com

drm/i915: Convert intel_hpd_irq_event() into an encoder hotplug hook

Allow encoders to customize their hotplug processing by moving the
intel_hpd_irq_event() code into an encoder hotplug vfunc. Currently
only SDVO needs this to re-enable hotplug signalling in the SDVO
chip. We'll use this same hook for DP/HDMI link management later.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com

drm/i915/cnp: Document WaSouthDisplayDisablePWMCGEGating

No functional change since WA is already applied.
But since it has different names on different databases,
let's document it here to avoid future confusion.

Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306012812.19779-1-rodrigo.vivi@intel.com

drm/i915/cnl: document WaVFUnitClockGatingDisable

No functional change. WA is already properly applied.
but in different databases it has different names.
Let's document all of them to avoid future confusion.

Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306012000.18928-1-rodrigo.vivi@intel.com

drm/i915/psr: Update PSR2 resolution check for Cannonlake

In fact, apply the Cannonlake resolution check for all >= Gen-10 platforms
to be safe.

v3: Update GLK too. (Ville)
Longer variable names.
if-else in place of ternary operator.
v2: Use local variables for resolution limits and print them (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Elio Martinez Monroy <elio.martinez.monroy@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306203355.29292-1-dhinakaran.pandiyan@intel.com

drm/i915: Flush waiters on seqno wraparound

Previously, we would spin waiting for all waiters to wake up and notice
their request had completed before we would reset the seqno upon
wraparound. However, we can mark their waits as complete and wake them
up directly using the existing machinery for handling the flushing of
missed wakeups when idling.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-2-chris@chris-wilson.co.uk

drm/i915: Stop kicking the signaling thread on seqno wraparound

Since commit fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted
signalers"), we cancel the signaler when retiring the request and so
upon wraparound, where we wait for all requests to be retired, we no
longer need to spin waiting for the signaling thread to release its
references to the in-flight requests, and so we can assert that the
signaler is idle.

References: fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-1-chris@chris-wilson.co.uk

drm/i915/breadcrumbs: Assert all missed breadcrumbs were signaled

When parking the engines and their breadcrumbs, if we have waiters left
then they missed their wakeup. Verify that each waiter's seqno did
complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-2-chris@chris-wilson.co.uk

drm/i915/breadcrumbs: Reduce signaler rbtree to a sorted list

The goal here is to try and reduce the latency of signaling additional
requests following the wakeup from interrupt by reducing the list of
to-be-signaled requests from an rbtree to a sorted linked list. The
original choice of using an rbtree was to facilitate random insertions
of request into the signaler while maintaining a sorted list. However,
if we assume that most new requests are added when they are submitted,
we see those new requests in execution order making a insertion sort
fast, and the reduction in overhead of each signaler iteration
significant.

Since commit 56299fb7d904 ("drm/i915: Signal first fence from irq handler
if complete"), we signal most fences directly from notify_ring() in the
interrupt handler greatly reducing the amount of work that actually
needs to be done by the signaler kthread. All the thread is then
required to do is operate as the bottom-half, cleaning up after the
interrupt handler and preparing the next waiter. This includes signaling
all later completed fences in a saturated system, but on a mostly idle
system we only have to rebuild the wait rbtree in time for the next
interrupt. With this de-emphasis of the signaler's role, we want to
rejig it's datastructures to reduce the amount of work we require to
both setup the signal tree and maintain it on every interrupt.

References: 56299fb7d904 ("drm/i915: Signal first fence from irq handler if complete")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-1-chris@chris-wilson.co.uk

drm/i915/error: capture uc_state after gen_state

error->device_info.has_guc, which we check in capture_uc_state, is set
in capture_gen_state, so the latter needs to be performed first.

v2: rebased

Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Fixes: 7d41ef3479a6 (drm/i915: Add Guc/HuC firmware details to error state)
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-3-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/error: standardize function style in error capture

some of the static functions used from capture() have the "i915_"
prefix while other don't; most of them take i915 as a parameter, but one
of them derives it internally from error->i915. Let's be consistent by
avoiding prefix for static functions and by getting i915 from
error->i915. While at it, s/dev_priv/i915 in functions that don't
perform register reads.

v2: take i915 from error->i915 (Michal), s/dev_priv/i915,
update commit message

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-2-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/error: remove unused gen8_engine_sync_index

Leftover from Gen8 ringbuffer support removal

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-1-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gvt: Return error at the failure of finding page_track

In XenGT, ioreq copy is used to trap mmio write and ppgtt write. Both
of them are memory write, ioreq handler couldn't distinguish them. So
ioreq handler probe the ppgtt write handler, if it is succuess, this
ioreq is ppgtt write, otherwise it is mmio write.

So ppgtt write handler should return an error at the failure of finding
page track, it is fatal to implement ioreq handler in XenGT.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Release gvt->lock at the failure of finding page track

page_track_handler take lock at the beginning, the lock should be released
at the failure of finding page track. Otherwise deadlock will happen.

Fixes: e502a2af4c35 ("drm/i915/gvt: Provide generic page_track infrastructure for write-protected page")
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/kvmgt: Add kvmgt debugfs entry nr_cache_entries under vgpu

Add a new debugfs entry kvmgt_nr_cache_entries under vgpu which shows
the number of entry in dma cache.

$ cat /sys/kernel/debug/gvt/vgpu1/kvmgt_nr_cache_entries
10101

v3: fix compiling error for some configuration. (Xiong Zhang <xiong.y.zhang@intel.com>)
v2: keep debugfs layout flat.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix guest vGPU hang caused by very high dma setup overhead

The implementation of current kvmgt implicitly setup dma mapping at MPT
API gfn_to_mfn. First this design against the API's original purpose.
Second, there is no unmap hit in this design. The result is that the
dma mapping keep growing larger and larger. For mutl-vm case, they will
consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree
entries crated in the IOMMU IOVA allocator. Finally, single IOVA
allocation can take as long as ~70ms. Such latency is intolerable.

To address both above issues, this patch introduced two new MPT API:
o dma_map_guest_page - setup dma map for guest page
o dma_unmap_guest_page - cancel dma map for guest page

The kvmgt implements these 2 API. And to reduce dma setup overhead for
duplicated pages (eg. scratch pages), two caches are used: one is for
mapping gfn to struct gvt_dma, another is for mapping dma addr to
struct gvt_dma.

With these 2 new API, the gtt now is able to cancel dma mapping when page
table is invalidated. The dma mapping is not in a gradual increase now.

v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point.

Cc: Hang Yuan <hang.yuan@intel.com>
Cc: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix check error on hws_pga_write() fail message

Fix below check error by using proper failure message output.

drivers/gpu/drm/i915//gvt/handlers.c:1392 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR()
drivers/gpu/drm/i915//gvt/handlers.c:1402 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR()

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix one indent error

Fix below warning:

drivers/gpu/drm/i915//gvt/handlers.c:323 gdrst_mmio_write() warn: inconsistent indenting

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix check error on fence mmio handler

Fix below error with minor code refactor.

CHECK drivers/gpu/drm/i915//gvt/handlers.c
drivers/gpu/drm/i915//gvt/handlers.c:203 sanitize_fence_mmio_access() error: 'vgpu' dereferencing possible ERR_PTR()

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix check error of vgpu create failure message

Fix check error at

CHECK drivers/gpu/drm/i915//gvt/kvmgt.c
drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454)

For failed vgpu create, just show error return in failure message.

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix vGPU sched timeslice calculation warning

Fix below warning by using proper ktime helper to calculate timeslice.

CHECK drivers/gpu/drm/i915//gvt/sched_policy.c
drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1
drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: remove gvt max port definition

Remove GVT-g private max port definition but use i915 one.

Fix error caused by:
drivers/gpu/drm/i915//gvt/handlers.c:871 dp_aux_ch_ctl_mmio_write() error: buffer overflow 'display->ports' 5 <= 5

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix one gvt_vgpu_error() use in dmabuf.c

Fix below warning with proper usage.

CHECK drivers/gpu/drm/i915//gvt/dmabuf.c
drivers/gpu/drm/i915//gvt/dmabuf.c:462 intel_vgpu_get_dmabuf() error: 'vgpu' dereferencing possible ERR_PTR()

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: init mmio by lri command in vgpu inhibit context

There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g,
vgpu can't get the correct default context by updating the registers before
inhibit context submission. It always get back the hardware default value
unless the inhibit context submission happened before the 1st time
forcewake put. With this wrong default context, vgpu will run with
incorrect state and meet unknown issues.

The solution is initialize these mmios by adding lri command in ring buffer
of the inhibit context, then gpu hardware has no chance to go down RC6 when
lri commands are right being executed, and then vgpu can get correct
default context for further use.

v3:
- fix code fault, use 'for' to loop through mmio render list(Zhenyu)

v4:
- save the count of engine mmio need to be restored for inhibit context and
refine some comments. (Kevin)

v5:
- code rebase

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: add interface to check if context is inhibit

No functional change, just for easy to use.

v4:
- refine comment (Kevin)

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: add define GEN9_MOCS_SIZE

No functional change. This defination will also be used in future patchesi.

v4:
- refine patch description (Kevin)

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Define PTE addr mask with GENMASK_ULL

Define the masks better.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Manage shadow pages with radix tree

We don't know how many page tables will be shadowed. It varies
considerably corresponding to guest load. Radix tree is a better
choice for us. Since Page Frame Number is used as key so most of
the bits are common.

Here is some performance data (duration in us) of looking up a
element:
Before: (aka. ppgtt_find_shadow_page)
0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325
After: (aka. intel_vgpu_find_spt_by_mfn)
0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108

This time I didn't get the early data of hash table. The data is
measured when desktop is shown.

As last change, the overall benchmark almost is not changed, but
we get better scalability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Provide generic page_track infrastructure for write-protected page

This patch provide generic page_track infrastructure for write-protected
guest page. The old page_track logic gets rewrote and now stays in a new
standalone page_track.c. This page track infrastructure can be both used
by vGUC and GTT shadowing.

The important change is that it uses radix tree instead of hash table.
We don't have a predictable number of pages that will be tracked.

Here is some performance data (duration in us) of looking up a element:
Before: (aka. intel_vgpu_find_tracked_page)
0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291
After: (aka. intel_vgpu_find_page_track)
0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105

The hash table has good performance at beginning, but turns bad with
more pages being tracked even no 3D applications are running. As
expected, radix tree has stable duration and very quick.

The overall benchmark (tested with Heaven Benchmark) marginally improved
since this is not the bottleneck. What we benefit more from this change
is scalability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Don't extend page_track to mpt layer

Don't extend page_track to mpt layer. Keep MPT simple and clean.
Meanwhile remove gtt.n_tracked_guest_page which doesn't make much
sense.

v2: clean up gtt.n_tracked_guest_page.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rename mpt api {set, unset}_wp_page to {enable, disable}_page_track

The kvmgt's implementation of mpt api {set,unset}_wp_page is not real
write-protection - the data get written before invoke this two api.
As discussed, change the mpt api to match the real behavior.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rename shadow_page to short name spt

The target structure of some functions is struct intel_vgpu_ppgtt_spt and
their names are xxx_shadow_page. It should be xxx_shadow_page_table. Let's
use short name 'spt' instead to reduce the length. As well as the hash
table name.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rework shadow page management code

This is a another big one and the GVT shadow page management code is
heavily refined.

The new code only use struct intel_vgpu_ppgtt_spt to represent a vgpu
shadow page table - w/ or wo/ a guest page associated with. A pure shadow
page (no guest page associated) will be used to shadow splited 2M huge
gtt. In this case, the spt.guest_page.gfn should be a zero.

To search a existed shadow page table, we have two new interfaces:
- intel_vgpu_find_spt_by_gfn(), find a spt by guest gfn. It must not
be a pure spt.
- intel_vgpu_find_spt_by_mfn, Find the spt using shadow page mfn in
shadowed PTE.

The oos_page management is remained as what is was.

v2: Split some changes into small standalone patches.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine pte shadowing process

Make the shadow PTE population code clear. Later we will add huge gtt
support based on this.

v2:
- rebase to latest code.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Use standard pte bit definition

GTT entry has similar format with the CPU PTE. We'd prefer named macro
instead of hardcode.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Factor out intel_vgpu_{get, put}_ppgtt_mm interface

Factor out these two interfaces so we can kill some duplicated code in
scheduler.c.

v2:
- rename to intel_vgpu_{get,put}_ppgtt_mm
- refine handle_g2v_notification

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rename ggtt related functions to be more specific

Accurate names help to avoid confusing so improve readability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add verbose gtt shadow logs

This add a new macro gvt_vdbg_mm() to print more verbose logs for
gtt shadowing. The added verbose logs are very useful for debugging.
gvt_vdbg_mm() only comes into effect if VERBOSE_DEBUG is defined by
the developer.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine ggtt_set_shadow_entry

Less code and use existed helper ggtt_set_host_entry.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine ggtt and ppgtt root entry ops

Separate ggtt and ppgtt since they are different. A little more code but
straightforward.

And move these helpers to gtt.c since that is the only client.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine the intel_vgpu_mm reference management

If we manage an object with a reference count, then its life cycle
must flow the reference count operations. Meanwhile, change the
operation functions to generic name *put* and *get*.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rework shadow graphic memory management code

This is a big one and the GVT shadow graphic memory management code is
heavily refined. The new code is more straightforward with less code.

The struct intel_vgpu_mm is restructured to be clearly defined, use
accurate names and some of the original fields are removed which are
really redundant.

Now we only manage ppgtt mm object with mm->ppgtt_mm.lru_list. No need
to mix ppgtt and ggtt together, since one vGPU only has one ggtt object.

v4: Don't invoke ppgtt_free_all_shadow_page before intel_vgpu_destroy_all_ppgtt_mm.
v3: Add GVT_RING_CTX_NR_PDPS to avoid confusing about the PDPs.
v2: Split some changes into small standalone patches.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/icl: Ringbuffer interrupt handling

On Gen11 interrupt masks need to be clear to allow C6 entry.
We keep them all enabled knowing that we generate extra
interrupts.

v2: Rebase.
v3: Remove gen 11 extra check in logical_render_ring_init.
v4: Rebase fixes.
v5: Rebase/refactor.
v6: Rebase.
v7: Rebase.
v8: Update comment and commit message (Daniele)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-1-mika.kuoppala@linux.intel.com

drm/i915: Unwind vma pinning for intel_pin_and_fence_fb_obj error path

If we fail to acquire a fence when we must, we must unwind before
reporting the error. Otherwise, we lose tracking of the vma pinning and
eventually hit a bug like

<3>[   46.163202] i915_vma_unpin:333 GEM_BUG_ON(!i915_vma_is_pinned(vma))
<4>[   46.163424] ------------[ cut here ]------------
<2>[   46.163429] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:333!
<4>[   46.163444] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
<0>[   46.163451] Dumping ftrace buffer:
<0>[   46.163457] ---------------------------------
<0>[   46.163630]    <...>-84      1.... 46260767us : i915_gem_object_unpin_from_display_plane: i915_vma_unpin:333 GEM_BUG_ON(!i915_vma_is_pinned(vma))
<0>[   46.163635] ---------------------------------
<4>[   46.163638] Modules linked in: vgem i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich mei_me e1000e mei prime_numbers
<4>[   46.163667] CPU: 1 PID: 84 Comm: kworker/u16:1 Tainted: G     U           4.16.0-rc3-gc07ef2c77d14-kasan_18+ #1
<4>[   46.163671] Hardware name: Dell Inc. OptiPlex 755                 /0PU052, BIOS A08 02/19/2008
<4>[   46.163743] Workqueue: events_unbound intel_atomic_commit_work [i915]
<4>[   46.163809] RIP: 0010:i915_gem_object_unpin_from_display_plane+0x253/0x2f0 [i915]
<4>[   46.163813] RSP: 0018:ffff8800624cfb48 EFLAGS: 00010286
<4>[   46.163818] RAX: 000000000000000c RBX: ffff880064446c40 RCX: ffff8800653135b8
<4>[   46.163822] RDX: dffffc0000000000 RSI: 0000000000000054 RDI: ffff8800651e30d0
<4>[   46.163825] RBP: 00000000000003d0 R08: 0000000000000001 R09: ffff8800651e3158
<4>[   46.163829] R10: 0000000000000000 R11: ffff8800651e30f0 R12: 0000000000000001
<4>[   46.163832] R13: ffff880054c58620 R14: 0000000000000000 R15: dffffc0000000000
<4>[   46.163836] FS:  0000000000000000(0000) GS:ffff880066040000(0000) knlGS:0000000000000000
<4>[   46.163840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   46.163843] CR2: 00007f1fc6fb0000 CR3: 00000000526fe000 CR4: 00000000000006e0
<4>[   46.163846] Call Trace:
<4>[   46.163918]  intel_unpin_fb_vma+0xbd/0x300 [i915]
<4>[   46.163990]  intel_cleanup_plane_fb+0x99/0xc0 [i915]
<4>[   46.163998]  drm_atomic_helper_cleanup_planes+0x166/0x280
<4>[   46.164071]  intel_atomic_commit_tail+0x1594/0x33a0 [i915]
<4>[   46.164081]  ? process_one_work+0x66e/0x1460
<4>[   46.164151]  ? skl_update_crtcs+0x9c0/0x9c0 [i915]
<4>[   46.164157]  ? lock_acquire+0x13d/0x390
<4>[   46.164161]  ? lock_acquire+0x13d/0x390
<4>[   46.164169]  process_one_work+0x71a/0x1460
<4>[   46.164175]  ? __schedule+0x838/0x1e50
<4>[   46.164182]  ? pwq_dec_nr_in_flight+0x2b0/0x2b0
<4>[   46.164188]  ? _raw_spin_lock_irq+0xa/0x40
<4>[   46.164194]  worker_thread+0xdf/0xf60
<4>[   46.164204]  ? process_one_work+0x1460/0x1460
<4>[   46.164209]  kthread+0x2cf/0x3c0
<4>[   46.164213]  ? _kthread_create_on_node+0xa0/0xa0
<4>[   46.164218]  ret_from_fork+0x3a/0x50
<4>[   46.164227] Code: e8 78 d9 cd e8 48 8b 35 cc 9e 47 00 49 c7 c0 c0 31 84 c0 b9 4d 01 00 00 48 c7 c2 e0 80 84 c0 48 c7 c7 0e bb 57 c0 e8 5d 4b df e8 <0f> 0b 48 c7 c1 c0 30 84 c0 ba 4e 01 00 00 48 c7 c6 e0 80 84 c0
<1>[   46.164368] RIP: i915_gem_object_unpin_from_display_plane+0x253/0x2f0 [i915] RSP: ffff8800624cfb48

Fixes: 85798ac9b35f ("drm/i915: Fail if we can't get a fence for gen2/3 tiled scanout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305103312.29492-1-chris@chris-wilson.co.uk