git.osdn.net Git - qmiga/qemu.git/log

Andrey Smirnov [Fri, 9 Feb 2018 10:40:29 +0000 (10:40 +0000)]

i.MX: Add code to emulate i.MX2 watchdog IP block

Add enough code to emulate the i.MX2 watchdog IP block so that a
machine running a Linux guest can be rebooted.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Andrey Smirnov [Fri, 9 Feb 2018 10:40:29 +0000 (10:40 +0000)]

i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks

Add the minimal code needed to allow an upstream Linux guest to boot.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Andrey Smirnov [Fri, 9 Feb 2018 10:40:29 +0000 (10:40 +0000)]

hw: i.MX: Convert i.MX6 to use TYPE_IMX_USDHC

Convert i.MX6 to use TYPE_IMX_USDHC since that's what real HW comes
with.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Andrey Smirnov [Fri, 9 Feb 2018 10:40:29 +0000 (10:40 +0000)]

sdhci: Add i.MX specific subtype of SDHCI

The IP block found on several generations of the i.MX family does not
use a vanilla SDHCI implementation and comes with a number of quirks.

Introduce an i.MX SDHCI subtype of the SDHCI block to add the code
necessary to support an unmodified Linux guest driver.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMM: define and use ESDHC_UNDOCUMENTED_REG27]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Ard Biesheuvel [Fri, 9 Feb 2018 10:40:29 +0000 (10:40 +0000)]

target/arm: enable user-mode SHA-3, SM3, SM4 and SHA-512 instruction support

Add support for the new ARMv8.2 SHA-3, SM3, SM4 and SHA-512 instructions to
AArch64 user mode emulation.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 20180207111729.15737-6-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Ard Biesheuvel [Fri, 9 Feb 2018 10:40:28 +0000 (10:40 +0000)]

target/arm: implement SM4 instructions

This implements emulation of the new SM4 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 20180207111729.15737-5-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Ard Biesheuvel [Fri, 9 Feb 2018 10:40:28 +0000 (10:40 +0000)]

target/arm: implement SM3 instructions

This implements emulation of the new SM3 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 20180207111729.15737-4-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Ard Biesheuvel [Fri, 9 Feb 2018 10:40:28 +0000 (10:40 +0000)]

target/arm: implement SHA-3 instructions

This implements emulation of the new SHA-3 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 20180207111729.15737-3-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Ard Biesheuvel [Fri, 9 Feb 2018 10:40:28 +0000 (10:40 +0000)]

target/arm: implement SHA-512 instructions

This implements emulation of the new SHA-512 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 20180207111729.15737-2-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Peter Maydell [Fri, 9 Feb 2018 10:40:28 +0000 (10:40 +0000)]

target/arm: Handle exceptions during exception stack pop

Handle possible MPU faults, SAU faults or bus errors when
popping register state off the stack during exception return.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-8-git-send-email-peter.maydell@linaro.org

Peter Maydell [Fri, 9 Feb 2018 10:40:28 +0000 (10:40 +0000)]

target/arm: Make exception vector loads honour the SAU

Make the load of the exception vector from the vector table honour
the SAU and any bus error on the load (possibly provoking a derived
exception), rather than simply aborting if the load fails.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-7-git-send-email-peter.maydell@linaro.org

Peter Maydell [Fri, 9 Feb 2018 10:40:27 +0000 (10:40 +0000)]

target/arm: Make v7m_push_callee_stack() honour MPU

Make v7m_push_callee_stack() honour the MPU by using the
new v7m_stack_write() function. We return a flag to indicate
whether the pushes failed, which we can then use in
v7m_exception_taken() to cause us to handle the derived
exception correctly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1517324542-6607-6-git-send-email-peter.maydell@linaro.org

Peter Maydell [Fri, 9 Feb 2018 10:40:27 +0000 (10:40 +0000)]

target/arm: Make v7M exception entry stack push check MPU

The memory writes done to push registers on the stack
on exception entry in M profile CPUs are supposed to
go via MPU permissions checks, which may cause us to
take a derived exception instead of the original one if
the MPU lookup fails. We were implementing these as
always-succeeds direct writes to physical memory.
Rewrite v7m_push_stack() to do the necessary checks.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-5-git-send-email-peter.maydell@linaro.org

Peter Maydell [Fri, 9 Feb 2018 10:40:27 +0000 (10:40 +0000)]

target/arm: Add ignore_stackfaults argument to v7m_exception_taken()

In the v8M architecture, if the process of taking an exception
results in a further exception, this is called a derived exception
(for example, an MPU exception when writing the exception frame to
memory). If the derived exception happens while pushing the initial
stack frame, we must ignore any subsequent exception that may occur
while pushing the callee-saves registers.

In preparation for making the stack writes check for exceptions,
add a return value from v7m_push_stack() and a new parameter to
v7m_exception_taken(), so that the former can tell the latter that
it needs to ignore failures to write to the stack. We also plumb
the argument through to v7m_push_callee_stack(), which is where
the code to ignore the failures will be.

(Note that the v8M ARM pseudocode structures this slightly differently:
derived exceptions cause the attempt to process the original
exception to be abandoned; then at the top level it calls
DerivedLateArrival to prioritize the derived exception and call
TakeException from there. We choose to let the NVIC do the prioritization
and continue forward with a call to TakeException which will then
take either the original or the derived exception. The effect is
the same, but this structure works better for QEMU because we don't
have a convenient top level place to do the abandon-and-retry logic.)
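
The flag plumbing described in this message can be sketched as a small toy
model in C. This is an illustration under stated assumptions, not QEMU's
code: ToyCpu, toy_stack_write(), toy_push_words() and toy_exception_entry()
are hypothetical names, the "MPU" is a single address threshold, and both
frames are assumed to be eight words long.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU state: a stack pointer, a one-line "MPU", and a fault counter. */
typedef struct {
    uint32_t sp;
    uint32_t writable_lo;   /* writes below this address fault */
    int derived_pended;     /* derived exceptions made pending so far */
} ToyCpu;

/* Returns true if the write succeeded. A fault pends a derived
 * exception unless the caller asked for faults to be ignored. */
static inline bool toy_stack_write(ToyCpu *cpu, uint32_t addr,
                                   bool ignore_fault)
{
    if (addr >= cpu->writable_lo) {
        return true;
    }
    if (!ignore_fault) {
        cpu->derived_pended++;
    }
    return false;
}

/* Push nwords below SP, stopping at the first failed write.
 * Returns true if the push failed. */
static inline bool toy_push_words(ToyCpu *cpu, uint32_t nwords,
                                  bool ignore_faults)
{
    uint32_t frame = cpu->sp - 4u * nwords;
    bool ok = true;

    for (uint32_t i = 0; i < nwords && ok; i++) {
        ok = toy_stack_write(cpu, frame + 4u * i, ignore_faults);
    }
    cpu->sp = frame;
    return !ok;
}

/* Exception entry: the result of the first push becomes the
 * "ignore stack faults" argument for the callee-saves push. */
static inline void toy_exception_entry(ToyCpu *cpu)
{
    bool push_failed = toy_push_words(cpu, 8, false);

    toy_push_words(cpu, 8, push_failed);
}
```

In this model a stack that faults everywhere pends exactly one derived
exception (from the initial frame), while a stack that faults only in the
callee-saves region still pends one from the second push.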

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-4-git-send-email-peter.maydell@linaro.org

Peter Maydell [Fri, 9 Feb 2018 10:40:27 +0000 (10:40 +0000)]

target/arm: Split "get pending exception info" from "acknowledge it"

Currently armv7m_nvic_acknowledge_irq() does three things:
 * make the current highest priority pending interrupt active
 * return a bool indicating whether that interrupt is targeting
   Secure or NonSecure state
 * implicitly tell the caller which is the highest priority
   pending interrupt by setting env->v7m.exception

We need to split these jobs, because v7m_exception_taken()
needs to know whether the pending interrupt targets Secure so
it can choose to stack callee-saves registers or not, but it
must not make the interrupt active until after it has done
that stacking, in case the stacking causes a derived exception.
Similarly, it needs to know the number of the pending interrupt
so it can read the correct vector table entry before the
interrupt is made active, because vector table reads might
also cause a derived exception.

Create a new armv7m_nvic_get_pending_irq_info() function which simply
returns information about the highest priority pending interrupt, and
use it to rearrange the v7m_exception_taken() code so we don't
acknowledge the exception until we've done all the things which could
possibly cause a derived exception.
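
The query/acknowledge split can be sketched with a toy interrupt controller
in C. This is a hedged illustration only: ToyNvic and the toy_* functions
are hypothetical names, priorities are plain integers (lower value wins),
and the model ignores Secure/NonSecure state and preemption by already
active exceptions.

```c
#include <stdbool.h>

#define TOY_NUM_IRQ 8

typedef struct {
    bool pending[TOY_NUM_IRQ];
    bool active[TOY_NUM_IRQ];
    int prio[TOY_NUM_IRQ];      /* lower value = higher priority */
} ToyNvic;

/* Query only: report the highest priority pending exception, or -1.
 * Nothing in the controller state changes. */
static inline int toy_get_pending_irq(const ToyNvic *n)
{
    int best = -1;

    for (int i = 0; i < TOY_NUM_IRQ; i++) {
        if (n->pending[i] && (best < 0 || n->prio[i] < n->prio[best])) {
            best = i;
        }
    }
    return best;
}

/* Commit: move the highest priority pending exception to active. */
static inline int toy_acknowledge_irq(ToyNvic *n)
{
    int irq = toy_get_pending_irq(n);

    if (irq >= 0) {
        n->pending[irq] = false;
        n->active[irq] = true;
    }
    return irq;
}

/* Exception entry: everything that can fault happens between the query
 * and the acknowledge, so a derived exception can still win. */
static inline int toy_take_exception(ToyNvic *n, bool stacking_faults,
                                     int derived_irq)
{
    if (toy_get_pending_irq(n) < 0) {
        return -1;
    }
    if (stacking_faults) {
        n->pending[derived_irq] = true;     /* derived exception */
    }
    return toy_acknowledge_irq(n);          /* highest priority now */
}
```

Because the acknowledge step re-runs the ordinary "highest priority pending"
search, the original exception simply stays pending when a higher priority
derived exception is taken instead.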

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1517324542-6607-3-git-send-email-peter.maydell@linaro.org

Peter Maydell [Fri, 9 Feb 2018 10:40:27 +0000 (10:40 +0000)]

target/arm: Add armv7m_nvic_set_pending_derived()

In order to support derived exceptions (exceptions generated in
the course of trying to take an exception), we need to be able
to handle prioritizing whether to take the original exception
or the derived exception.

We do this by introducing a new function
armv7m_nvic_set_pending_derived() which the exception-taking code in
helper.c will call when a derived exception occurs. Derived
exceptions are dealt with mostly like normal pending exceptions, so
we share the implementation with the armv7m_nvic_set_pending()
function.

Note that the way we structure this is significantly different
from the v8M Arm ARM pseudocode: that does all the prioritization
logic in the DerivedLateArrival() function, whereas we choose to
let the existing "identify highest priority exception" logic
do the prioritization for us. The effect is the same, though.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-2-git-send-email-peter.maydell@linaro.org

Peter Maydell [Thu, 8 Feb 2018 17:41:15 +0000 (17:41 +0000)]

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180208' into staging

tcg generic vectors

# gpg: Signature made Thu 08 Feb 2018 16:47:16 GMT
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180208:
  tcg/aarch64: Add vector operations
  tcg/i386: Add vector operations
  target/arm: Use vector infrastructure for aa64 orr/bic immediate
  target/arm: Use vector infrastructure for aa64 multiplies
  target/arm: Use vector infrastructure for aa64 compares
  target/arm: Use vector infrastructure for aa64 constant shifts
  target/arm: Use vector infrastructure for aa64 dup/movi
  target/arm: Use vector infrastructure for aa64 mov/not/neg
  target/arm: Use vector infrastructure for aa64 add/sub/logic
  target/arm: Align vector registers
  tcg/optimize: Handle vector opcodes during optimize
  tcg: Add generic vector helpers with a scalar operand
  tcg: Add generic helpers for saturating arithmetic
  tcg: Add generic vector ops for multiplication
  tcg: Add generic vector ops for comparisons
  tcg: Add generic vector ops for constant shifts
  tcg: Add generic vector expanders
  tcg: Standardize integral arguments to expanders
  tcg: Add types and basic operations for host vectors
  tcg: Allow multiple word entries into the constant pool

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Richard Henderson [Tue, 12 Sep 2017 05:09:28 +0000 (22:09 -0700)]

tcg/aarch64: Add vector operations

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Thu, 17 Aug 2017 21:47:43 +0000 (14:47 -0700)]

tcg/i386: Add vector operations

The x86 vector instruction set is extremely irregular. With newer
editions, Intel has filled in some of the blanks. However, we don't
get many 64-bit operations until SSE4.2, introduced in 2009.

The subsequent edition was for AVX1, introduced in 2011, which added
three-operand addressing, and adjusts how all instructions should be
encoded.

Given the relatively narrow two-year window between what is possible
to support and what is desirable to support, and to vastly simplify
code maintenance, I am only planning to support AVX1 and later CPUs.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Fri, 12 Jan 2018 22:09:25 +0000 (14:09 -0800)]

target/arm: Use vector infrastructure for aa64 orr/bic immediate

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Tue, 21 Nov 2017 10:21:28 +0000 (11:21 +0100)]

target/arm: Use vector infrastructure for aa64 multiplies

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Fri, 17 Nov 2017 20:05:16 +0000 (21:05 +0100)]

target/arm: Use vector infrastructure for aa64 compares

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Fri, 17 Nov 2017 17:27:45 +0000 (18:27 +0100)]

target/arm: Use vector infrastructure for aa64 constant shifts

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Mon, 13 Nov 2017 18:31:31 +0000 (19:31 +0100)]

target/arm: Use vector infrastructure for aa64 dup/movi

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Tue, 17 Oct 2017 19:35:21 +0000 (12:35 -0700)]

target/arm: Use vector infrastructure for aa64 mov/not/neg

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Mon, 14 Aug 2017 21:46:55 +0000 (14:46 -0700)]

target/arm: Use vector infrastructure for aa64 add/sub/logic

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Tue, 12 Sep 2017 13:50:01 +0000 (06:50 -0700)]

target/arm: Align vector registers

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Wed, 22 Nov 2017 08:07:11 +0000 (09:07 +0100)]

tcg/optimize: Handle vector opcodes during optimize

Trivial move and constant propagation. Some identity and constant
function folding, but nothing that requires knowledge of the size
of the vector element.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Thu, 21 Dec 2017 18:58:36 +0000 (10:58 -0800)]

tcg: Add generic vector helpers with a scalar operand

Use dup to convert a non-constant scalar to a third vector.

Add addition, multiplication, and logical operations with an immediate.
Add addition, subtraction, multiplication, and logical operations with
a non-constant scalar. Allow for the front-end to build operations in
which the scalar operand comes first.
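
As a rough sketch of the "dup the scalar, then reuse the vector operation"
idea, the following C models a vector as a small fixed array. ToyVec,
TOY_LANES and the toy_* functions are hypothetical names chosen for this
illustration; they are not the TCG expander API.

```c
#include <stdint.h>

#define TOY_LANES 4

typedef struct {
    uint32_t lane[TOY_LANES];
} ToyVec;

/* Broadcast one scalar into every lane of a new vector. */
static inline ToyVec toy_dup(uint32_t s)
{
    ToyVec v;

    for (int i = 0; i < TOY_LANES; i++) {
        v.lane[i] = s;
    }
    return v;
}

static inline ToyVec toy_add(ToyVec a, ToyVec b)
{
    for (int i = 0; i < TOY_LANES; i++) {
        a.lane[i] += b.lane[i];
    }
    return a;
}

static inline ToyVec toy_sub(ToyVec a, ToyVec b)
{
    for (int i = 0; i < TOY_LANES; i++) {
        a.lane[i] -= b.lane[i];
    }
    return a;
}

/* vector + scalar: the scalar becomes a third, temporary vector. */
static inline ToyVec toy_adds(ToyVec a, uint32_t s)
{
    return toy_add(a, toy_dup(s));
}

/* scalar - vector: the same trick with the scalar operand first. */
static inline ToyVec toy_rsubs(uint32_t s, ToyVec a)
{
    return toy_sub(toy_dup(s), a);
}
```

With a = {1, 2, 3, 4}, toy_adds(a, 10) gives {11, 12, 13, 14} and
toy_rsubs(10, a) gives {9, 8, 7, 6}.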

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Thu, 14 Dec 2017 16:45:20 +0000 (10:45 -0600)]

tcg: Add generic helpers for saturating arithmetic

No vector ops as yet. SSE only has direct support for 8- and 16-bit
saturation; handling 32- and 64-bit saturation is much more expensive.
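
For readers unfamiliar with the term, here is a minimal scalar sketch of
what saturating addition computes per element. The function names are
hypothetical and are not the helpers added by this commit; the 64-bit
variant assumes a GCC or Clang compiler because it relies on
__builtin_add_overflow.

```c
#include <stdint.h>

/* Unsigned 8-bit: clamp to UINT8_MAX instead of wrapping around. */
static inline uint8_t toy_usadd8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;

    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Signed 32-bit: a wider type makes the range check straightforward. */
static inline int32_t toy_ssadd32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + b;

    if (sum > INT32_MAX) {
        return INT32_MAX;
    }
    if (sum < INT32_MIN) {
        return INT32_MIN;
    }
    return (int32_t)sum;
}

/* Signed 64-bit: no wider type is available, so detect overflow
 * explicitly. Signed overflow only happens when both operands have
 * the same sign, so the sign of 'a' picks the bound. */
static inline int64_t toy_ssadd64(int64_t a, int64_t b)
{
    int64_t sum;

    if (__builtin_add_overflow(a, b, &sum)) {
        return a < 0 ? INT64_MIN : INT64_MAX;
    }
    return sum;
}
```

The extra overflow test in the wider cases hints at why saturation without
direct hardware support costs more than a plain add.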

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Tue, 21 Nov 2017 09:11:14 +0000 (10:11 +0100)]

tcg: Add generic vector ops for multiplication

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Fri, 17 Nov 2017 19:47:42 +0000 (20:47 +0100)]

tcg: Add generic vector ops for comparisons

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Fri, 17 Nov 2017 13:35:11 +0000 (14:35 +0100)]

tcg: Add generic vector ops for constant shifts

Opcodes are added for scalar and vector shifts but, considering their
varied semantics, they are not exposed to the front ends. They are
still provided in case they are needed for backend expansion.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Fri, 15 Sep 2017 21:11:45 +0000 (14:11 -0700)]

tcg: Add generic vector expanders

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Thu, 4 Jan 2018 15:44:17 +0000 (07:44 -0800)]

tcg: Standardize integral arguments to expanders

Some functions use intN_t arguments, some use uintN_t, and some just
use "unsigned". To aid putting function pointers in tables, we
need consistency.
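
The motivation can be sketched as follows: a table of function pointers has
a single element type, so every entry must share one exact signature. The
ToyOp type and the toy_* names below are hypothetical and only illustrate
the constraint; they are not the expander prototypes touched by this commit.

```c
#include <stdint.h>

/* One agreed signature for every table entry. */
typedef uint32_t ToyOp(uint32_t a, uint32_t b);

static uint32_t toy_add(uint32_t a, uint32_t b) { return a + b; }
static uint32_t toy_sub(uint32_t a, uint32_t b) { return a - b; }
static uint32_t toy_and(uint32_t a, uint32_t b) { return a & b; }

enum { TOY_ADD, TOY_SUB, TOY_AND, TOY_NB };

/* If one of these took "int32_t" or "unsigned long" instead, its address
 * would not be a valid initializer here without a cast. */
static ToyOp *const toy_table[TOY_NB] = {
    [TOY_ADD] = toy_add,
    [TOY_SUB] = toy_sub,
    [TOY_AND] = toy_and,
};

static inline uint32_t toy_dispatch(int op, uint32_t a, uint32_t b)
{
    return toy_table[op](a, b);
}
```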

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Thu, 14 Sep 2017 20:53:46 +0000 (13:53 -0700)]

tcg: Add types and basic operations for host vectors

Nothing uses or enables them yet.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Richard Henderson [Thu, 9 Nov 2017 19:24:08 +0000 (20:24 +0100)]

tcg: Allow multiple word entries into the constant pool

This will be required for storing vector constants.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Peter Maydell [Thu, 8 Feb 2018 14:31:51 +0000 (14:31 +0000)]

Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into staging

# gpg: Signature made Thu 08 Feb 2018 01:29:22 GMT
# gpg:                using RSA key CA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/staging-pull-request:
  docs: Add docs/devel/testing.rst
  qapi: Add NVMe driver options to the schema
  docs: Add section for NVMe VFIO driver
  block: Move NVMe constants to a separate header
  qemu-img: Map bench buffer
  block/nvme: Implement .bdrv_(un)register_buf
  block: Introduce buf register API
  block: Add VFIO based NVMe driver
  util: Introduce vfio helpers
  stubs: Add stubs for ram block API
  curl: convert to CoQueue
  coroutine-lock: make qemu_co_enter_next thread-safe
  coroutine-lock: convert CoQueue to use QemuLockable
  lockable: add QemuLockable
  test-coroutine: add simple CoMutex test
  docker: change Fedora base image to fedora:27

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Peter Maydell [Thu, 8 Feb 2018 10:16:59 +0000 (10:16 +0000)]

Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging

# gpg: Signature made Wed 07 Feb 2018 17:00:12 GMT
# gpg:                using RSA key 7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/bitmaps-pull-request:
  hbitmap: fix missing restore count when finish deserialization

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Fam Zheng [Thu, 1 Feb 2018 02:20:46 +0000 (10:20 +0800)]

docs: Add docs/devel/testing.rst

To make our efforts on QEMU testing easier for contributors to consume,
let's add a document. For example, Patchew reports build errors on
patches that should be relatively easy to reproduce with a few steps, and
it is much nicer if there is documentation that it can refer to.

This focuses on how to run existing tests and how to write new test
cases, without going into the frameworks themselves.

The VM based testing section is moved from tests/vm/README which now
is a single line pointing to the new doc.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180201022046.9425-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:09:01 +0000 (14:09 +0800)]

qapi: Add NVMe driver options to the schema

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-10-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:09:00 +0000 (14:09 +0800)]

docs: Add section for NVMe VFIO driver

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-9-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:08:59 +0000 (14:08 +0800)]

block: Move NVMe constants to a separate header

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-8-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:08:58 +0000 (14:08 +0800)]

qemu-img: Map bench buffer

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-7-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:08:57 +0000 (14:08 +0800)]

block/nvme: Implement .bdrv_(un)register_buf

Forward these two calls to the IOVA manager.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-6-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:08:56 +0000 (14:08 +0800)]

block: Introduce buf register API

Allow a block driver to map and unmap a buffer for later I/O, as a performance
hint.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-5-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:08:55 +0000 (14:08 +0800)]

block: Add VFIO based NVMe driver

This is a new protocol driver that exclusively opens a host NVMe
controller through VFIO. It achieves better latency than linux-aio by
completely bypassing the host kernel VFS/block layer.

    $rw-$bs-$iodepth  linux-aio     nvme://
    ----------------------------------------
    randread-4k-1     10.5k         21.6k
    randread-512k-1   745           1591
    randwrite-4k-1    30.7k         37.0k
    randwrite-512k-1  1945          1980

    (unit: IOPS)

The driver also integrates with the polling mechanism of iothread.

This patch is co-authored by Paolo and me.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180116060901.17413-4-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Tue, 16 Jan 2018 06:08:54 +0000 (14:08 +0800)]

util: Introduce vfio helpers

This is a library to manage the host vfio interface, which could be used
to implement userspace device driver code in QEMU such as NVMe or net
controllers.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Fam Zheng [Wed, 10 Jan 2018 09:18:38 +0000 (17:18 +0800)]

stubs: Add stubs for ram block API

These functions will be wanted by block-obj-y but the actual definition
is in obj-y, so stub them to keep the linker happy.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180110091846.10699-2-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Paolo Bonzini [Sat, 3 Feb 2018 15:39:35 +0000 (10:39 -0500)]

curl: convert to CoQueue

Now that CoQueues can use a QemuMutex for thread-safety, there is no
need for curl to roll its own coroutine queue. Coroutines can be
placed directly on the queue instead of using a list of CURLAIOCBs.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-6-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Paolo Bonzini [Sat, 3 Feb 2018 15:39:34 +0000 (10:39 -0500)]

coroutine-lock: make qemu_co_enter_next thread-safe

qemu_co_queue_next does not need to release and re-acquire the mutex,
because the queued coroutine does not run immediately.  However, this
does not hold for qemu_co_enter_next.  Now that qemu_co_queue_wait
can synchronize (via QemuLockable) with code that is not running in
coroutine context, it's important that code using qemu_co_enter_next
can easily use a standardized locking idiom.

First of all, qemu_co_enter_next must use aio_co_wake to restart the
coroutine.  Second, the function gains a second argument, a QemuLockable*,
and the comments of qemu_co_queue_next and qemu_co_queue_restart_all
are adjusted to clarify the difference.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-5-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Paolo Bonzini [Sat, 3 Feb 2018 15:39:33 +0000 (10:39 -0500)]

coroutine-lock: convert CoQueue to use QemuLockable

There are cases in which a queued coroutine must be restarted from
non-coroutine context (with qemu_co_enter_next). In these cases,
qemu_co_enter_next also needs to be thread-safe, but it cannot use
a CoMutex and so cannot use qemu_co_queue_wait. Use QemuLockable so
that the CoQueue can interchangeably use CoMutex or QemuMutex.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-4-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Paolo Bonzini [Sat, 3 Feb 2018 15:39:32 +0000 (10:39 -0500)]

lockable: add QemuLockable

QemuLockable is a polymorphic lock type that takes an object and
knows which function to use for locking and unlocking. The
implementation could use C11 _Generic, but since the support is
not very widespread I am instead using __builtin_choose_expr and
__builtin_types_compatible_p, which are already used by
include/qemu/atomic.h.

QemuLockable can be used to implement lock guards, or to pass around
a lock in such a way that a function can release it and re-acquire it.
The next patch will do this for CoQueue.
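
The dispatch technique named above can be sketched in a few lines of C.
This is a simplified model and not the contents of QEMU's lockable header:
ToyMutex, ToySpin, ToyLockable and the TOY_* macros are hypothetical names,
and the code assumes a GCC or Clang compiler for the two builtins.

```c
/* Two unrelated toy lock types (stand-ins, not QEMU's). */
typedef struct { int held; int lock_calls; } ToyMutex;
typedef struct { int held; } ToySpin;

static inline void toy_mutex_lock(void *opaque)
{
    ToyMutex *m = opaque;

    m->held = 1;
    m->lock_calls++;
}
static inline void toy_mutex_unlock(void *opaque) { ((ToyMutex *)opaque)->held = 0; }
static inline void toy_spin_lock(void *opaque)    { ((ToySpin *)opaque)->held = 1; }
static inline void toy_spin_unlock(void *opaque)  { ((ToySpin *)opaque)->held = 0; }

typedef void ToyLockFn(void *opaque);

/* Type-erased handle: the object plus the functions that operate on it. */
typedef struct {
    void *object;
    ToyLockFn *lock;
    ToyLockFn *unlock;
} ToyLockable;

#define TOY_IS(x, type) __builtin_types_compatible_p(__typeof__(x), type)

/* Pick a function at compile time from the static type of x. An
 * unsupported type selects (void)0, which fails to compile when used
 * as an initializer. */
#define TOY_PICK(x, mutex_fn, spin_fn)                                \
    __builtin_choose_expr(TOY_IS(x, ToyMutex *), &mutex_fn,           \
    __builtin_choose_expr(TOY_IS(x, ToySpin *), &spin_fn, (void)0))

#define TOY_MAKE_LOCKABLE(x)                                          \
    ((ToyLockable){                                                   \
        .object = (x),                                                \
        .lock = TOY_PICK(x, toy_mutex_lock, toy_spin_lock),           \
        .unlock = TOY_PICK(x, toy_mutex_unlock, toy_spin_unlock),     \
    })

static inline void toy_lockable_lock(ToyLockable *l)   { l->lock(l->object); }
static inline void toy_lockable_unlock(ToyLockable *l) { l->unlock(l->object); }

/* A callee that releases and re-acquires the caller's lock without
 * knowing what kind of lock it is. */
static inline void toy_yield_unlocked(ToyLockable *l)
{
    toy_lockable_unlock(l);
    /* ... a real caller would wait or do other work here ... */
    toy_lockable_lock(l);
}
```

A caller would write, for example, ToyLockable l = TOY_MAKE_LOCKABLE(&m);
and then pass &l to code such as toy_yield_unlocked(), which is the
"release it and re-acquire it" pattern the message describes.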

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-3-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Paolo Bonzini [Sat, 3 Feb 2018 15:39:31 +0000 (10:39 -0500)]

test-coroutine: add simple CoMutex test

In preparation for adding a similar test using QemuLockable, add a very
simple testcase that has two interleaved calls to lock and unlock.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-2-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Paolo Bonzini [Fri, 12 Jan 2018 11:11:43 +0000 (12:11 +0100)]

docker: change Fedora base image to fedora:27

Using "fedora:latest" makes behavior different depending on when you
actually pulled the image from the docker repository. In my case,
the supposedly "latest" image was a Fedora 25 download from 8 months
ago, and the new "test-debug" test was failing.

Use "27" to improve reproducibility and make it clear when the image
is obsolete.

Cc: Fam Zheng <famz@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1515755504-21341-1-git-send-email-pbonzini@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

Peter Maydell [Wed, 7 Feb 2018 23:02:18 +0000 (23:02 +0000)]

Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging

# gpg: Signature made Wed 07 Feb 2018 16:32:36 GMT
# gpg:                using RSA key 7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  ide-test: test trim requests

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Peter Maydell [Wed, 7 Feb 2018 20:40:36 +0000 (20:40 +0000)]

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* socket option parsing fix (Daniel)
* SCSI fixes (Fam)
* Readline double-free fix (Greg)
* More HVF attribution fixes (Izik)
* WHPX (Windows Hypervisor Platform Extensions) support (Justin)
* POLLHUP handler (Klim)
* ivshmem fixes (Ladi)
* memfd memory backend (Marc-André)
* improved error message (Marcelo)
* Memory fixes (Peter Xu, Zhecheng)
* Remove obsolete code and comments (Peter M.)
* qdev API improvements (Philippe)
* Add CONFIG_I2C switch (Thomas)

# gpg: Signature made Wed 07 Feb 2018 15:24:08 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (47 commits)
  Add the WHPX acceleration enlightenments
  Introduce the WHPX impl
  Add the WHPX vcpu API
  Add the Windows Hypervisor Platform accelerator.
  tests/test-filter-redirector: move close()
  tests: use memfd in vhost-user-test
  vhost-user-test: make read-guest-mem setup its own qemu
  tests: keep compiling failing vhost-user tests
  Add memfd based hostmem
  memfd: add hugetlbsize argument
  memfd: add hugetlb support
  memfd: add error argument, instead of perror()
  cpus: join thread when removing a vCPU
  cpus: hvf: unregister thread with RCU
  cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug
  cpus: dummy: unregister thread with RCU, exit loop on unplug
  cpus: kvm: unregister thread with RCU
  cpus: hax: register/unregister thread with RCU, exit loop on unplug
  ivshmem: Disable irqfd on device reset
  ivshmem: Improve MSI irqfd error handling
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# cpus.c

Liang Li [Wed, 7 Feb 2018 16:35:49 +0000 (11:35 -0500)]

hbitmap: fix missing restore count when finish deserialization

The .count field of HBitmap is not set in
hbitmap_deserialize_finish(); set it to the right value.
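
A minimal sketch of the problem, assuming a flat (non-hierarchical) bitmap
rather than the real HBitmap layout: raw words are written during
deserialization, so the cached count of set bits has to be recomputed in the
finish step. FlatBitmap and its method names are hypothetical.

```python
class FlatBitmap:
    """Toy bitmap that caches the number of set bits in .count."""

    WORD_BITS = 64

    def __init__(self, nbits):
        self.nbits = nbits
        nwords = (nbits + self.WORD_BITS - 1) // self.WORD_BITS
        self.words = [0] * nwords
        self.count = 0

    def set(self, bit):
        word, off = divmod(bit, self.WORD_BITS)
        if not (self.words[word] >> off) & 1:
            self.words[word] |= 1 << off
            self.count += 1

    def serialize(self):
        return list(self.words)

    def deserialize_part(self, start_word, words):
        # Raw words are copied in; .count is stale until the finish step.
        if start_word + len(words) > len(self.words):
            raise ValueError("part does not fit in the bitmap")
        self.words[start_word:start_word + len(words)] = words

    def deserialize_finish(self):
        # Restore the cached count from the words that were loaded.
        self.count = sum(bin(w).count("1") for w in self.words)
```

Without the last line of deserialize_finish(), a freshly loaded bitmap would
report a count of zero regardless of its contents.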

Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Liang Li <liliangleo@didichuxing.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180118131308.GA2181@liangdeMacBook-Pro.local
Signed-off-by: John Snow <jsnow@redhat.com>

Peter Maydell [Wed, 7 Feb 2018 16:26:01 +0000 (16:26 +0000)]

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-02-06' into staging

Error reporting patches for 2018-02-06

# gpg: Signature made Tue 06 Feb 2018 19:48:30 GMT
# gpg:                using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2018-02-06:
  tcg: Replace fprintf(stderr, "*\n" with error_report()
  hw/xen*: Replace fprintf(stderr, "*\n" with error_report()
  hw/sparc*: Replace fprintf(stderr, "*\n" with error_report()
  hw/sd: Replace fprintf(stderr, "*\n" with DPRINTF()
  hw/ppc: Replace fprintf(stderr, "*\n" with error_report()
  hw/pci*: Replace fprintf(stderr, "*\n" with error_report()
  hw/openrisc: Replace fprintf(stderr, "*\n" with error_report()
  hw/moxie: Replace fprintf(stderr, "*\n" with error_report()
  hw/mips: Replace fprintf(stderr, "*\n" with error_report()
  hw/lm32: Replace fprintf(stderr, "*\n" with error_report()
  hw/dma: Replace fprintf(stderr, "*\n" with error_report()
  hw/arm: Replace fprintf(stderr, "*\n" with error_report()
  audio: Replace AUDIO_FUNC with __func__
  error: Improve documentation of error_append_hint()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Anton Nefedov [Wed, 7 Feb 2018 16:25:22 +0000 (11:25 -0500)]

ide-test: test trim requests

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1516611841-5526-1-git-send-email-anton.nefedov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>

Peter Maydell [Wed, 7 Feb 2018 14:38:53 +0000 (14:38 +0000)]

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20180206.0' into staging

VFIO updates 2018-02-06

- SPAPR in-kernel TCE acceleration (Alexey Kardashevskiy)

- MSI-X relocation (Alex Williamson)

- Add missing platform mutex init (Eric Auger)

- Redundant variable cleanup (Alexey Kardashevskiy)

- Option to disable GeForce quirks (Alex Williamson)

# gpg: Signature made Tue 06 Feb 2018 18:21:22 GMT
# gpg:                using RSA key 239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-update-20180206.0:
  vfio/pci: Add option to disable GeForce quirks
  vfio/common: Remove redundant copy of local variable
  hw/vfio/platform: Init the interrupt mutex
  vfio/pci: Allow relocating MSI-X MMIO
  qapi: Create DEFINE_PROP_OFF_AUTO_PCIBAR
  vfio/pci: Emulate BARs
  vfio/pci: Add base BAR MemoryRegion
  vfio/pci: Fixup VFIOMSIXInfo comment
  spapr/iommu: Enable in-kernel TCE acceleration via VFIO KVM device
  vfio/spapr: Use iommu memory region's get_attr()
  memory/iommu: Add get_attr()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Justin Terry (VM) [Mon, 22 Jan 2018 21:07:49 +0000 (13:07 -0800)]

Add the WHPX acceleration enlightenments

Implements the WHPX accelerator cpu enlightenments to actually use the whpx-all
accelerator on Windows platforms.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1516655269-1785-5-git-send-email-juterry@microsoft.com>
[Register/unregister VCPU thread with RCU. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Justin Terry (VM) [Mon, 22 Jan 2018 21:07:48 +0000 (13:07 -0800)]

Introduce the WHPX impl

Implements the Windows Hypervisor Platform accelerator (WHPX) target, which
acts as a hypervisor accelerator for QEMU on the Windows platform. This gives
QEMU much greater speed than the emulated x86_64 paths that are taken on
Windows today.

1. Adds support for vPartition management.
2. Adds support for vCPU management.
3. Adds support for MMIO/PortIO.
4. Registers the WHPX ACCEL_CLASS.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1516655269-1785-4-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Justin Terry (VM) [Mon, 22 Jan 2018 21:07:47 +0000 (13:07 -0800)]

Add the WHPX vcpu API

Adds support for the Windows Hypervisor Platform accelerator (WHPX) stubs and
introduces the whpx.h sysemu API for vcpu scheduling and management.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1516655269-1785-3-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Justin Terry (VM) [Mon, 22 Jan 2018 21:07:46 +0000 (13:07 -0800)]

Add the Windows Hypervisor Platform accelerator.

Introduces the configure support for the new Windows Hypervisor Platform that
allows for hypervisor acceleration from usermode components on the Windows
platform.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1516655269-1785-2-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Klim Kireev [Thu, 1 Feb 2018 13:48:31 +0000 (16:48 +0300)]

tests/test-filter-redirector: move close()

Since we have a separate handler for POLLHUP, which drops data
after the connection is closed, we need to fix this test, because
it sends data and immediately closes the socket, creating a race condition.
In some cases the client on the other end of the socket closes it before
reading the data. To prevent this, I suggest closing the socket after receiving.

Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
Message-Id: <20180201134831.17709-1-klim.kireev@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:57 +0000 (14:27 +0100)]

tests: use memfd in vhost-user-test

This will exercise the memfd memory backend and should generally be
better for testing than memory-backend-file (thanks to anonymous files
and sealing).

If memfd is available, it is preferred.

However, in order to check that file & memfd backends both work
correctly, the read-guest-mem test is checked explicitly for each.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-8-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:56 +0000 (14:27 +0100)]

vhost-user-test: make read-guest-mem setup its own qemu

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:55 +0000 (14:27 +0100)]

tests: keep compiling failing vhost-user tests

Let's protect the failing tests under a QTEST_VHOST_USER_FIXME
environment variable, so we keep compiling the tests and we can easily
run them.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-6-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:54 +0000 (14:27 +0100)]

Add memfd based hostmem

Add a new memory backend, similar to hostmem-file, except that it
doesn't need to create files. It also enforces memory sealing.

This backend is mainly useful for sharing the memory with other
processes.

Note that Linux supports transparent huge-pages of shmem/memfd memory
since 4.8. It is relatively easier to set up THP than a dedicated
hugepage mount point, by using "madvise" in
/sys/kernel/mm/transparent_hugepage/shmem_enabled.

Since 4.14, memfd allows setting the hugetlb requirement explicitly.

Pending for merge in 4.16 is memfd sealing support for hugetlb backed
memory.

Usage:
-object memory-backend-memfd,id=mem1,size=1G

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:53 +0000 (14:27 +0100)]

memfd: add hugetlbsize argument

Learn to specify hugetlb size as qemu_memfd_create() argument.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:52 +0000 (14:27 +0100)]

memfd: add hugetlb support

Linux commit 749df87bd7bee5a79cef073f5d032ddb2b211de8 (v4.14-rc1)
added a new flag MFD_HUGETLB to memfd_create() that specifies that the file
to be created resides in the hugetlbfs filesystem. This is the
generic hugetlbfs filesystem not associated with any specific mount
point.

hugetlbfs does not support sealing operations in v4.14, therefore
specifying MFD_ALLOW_SEALING with MFD_HUGETLB will result in EINVAL.

However, I added sealing support in "[PATCH v3 0/9] memfd: add sealing
to hugetlb-backed memory" series, queued in -mm tree for v4.16.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Thu, 1 Feb 2018 13:27:51 +0000 (14:27 +0100)]

memfd: add error argument, instead of perror()

This will allow callers to silence the error report when the call is
allowed to fail.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo Bonzini [Tue, 30 Jan 2018 15:40:12 +0000 (16:40 +0100)]

cpus: join thread when removing a vCPU

If no one joins the thread, its associated memory is leaked.

Reported-by: CheneyLin <linzc@zju.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo Bonzini [Tue, 30 Jan 2018 16:05:21 +0000 (11:05 -0500)]

cpus: hvf: unregister thread with RCU

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo Bonzini [Tue, 30 Jan 2018 16:05:06 +0000 (11:05 -0500)]

cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug

Keep running until cpu_can_run(cpu) becomes false, for consistency
with other accelerators.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo Bonzini [Tue, 30 Jan 2018 16:04:53 +0000 (11:04 -0500)]

cpus: dummy: unregister thread with RCU, exit loop on unplug

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo Bonzini [Tue, 30 Jan 2018 16:04:36 +0000 (11:04 -0500)]

cpus: kvm: unregister thread with RCU

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo Bonzini [Tue, 30 Jan 2018 15:28:49 +0000 (16:28 +0100)]

cpus: hax: register/unregister thread with RCU, exit loop on unplug

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Ladi Prosek [Mon, 11 Dec 2017 07:21:10 +0000 (08:21 +0100)]

ivshmem: Disable irqfd on device reset

The effects of ivshmem_enable_irqfd() were not undone on device reset.

This manifested as:
ivshmem_add_kvm_msi_virq: Assertion `!s->msi_vectors[vector].pdev' failed.

when irqfd was enabled before reset and then enabled again after reset, making
ivshmem_enable_irqfd() run for the second time.

To reproduce, run:

  ivshmem-server

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then install the Windows driver, at the time of writing available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

and crash-reboot the guest by inducing a BSOD.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-Id: <20171211072110.9058-5-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Ladi Prosek [Mon, 11 Dec 2017 07:21:09 +0000 (08:21 +0100)]

ivshmem: Improve MSI irqfd error handling

Adds a rollback path to ivshmem_enable_irqfd() and fixes
ivshmem_disable_irqfd() to bail if irqfd has not been enabled.
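
The two behaviours described above can be sketched generically as follows.
This is a hedged illustration of the rollback pattern, not the ivshmem code;
IrqfdState and the setup/teardown callbacks are hypothetical names.

```python
class IrqfdState:
    """Enables a set of vectors all-or-nothing and disables them safely."""

    def __init__(self, nvectors, setup, teardown):
        self.nvectors = nvectors
        self.setup = setup          # called once per vector on enable
        self.teardown = teardown    # undoes setup for one vector
        self.enabled = False

    def enable(self):
        done = []
        try:
            for v in range(self.nvectors):
                self.setup(v)
                done.append(v)
        except Exception:
            # Rollback path: undo only what was actually set up.
            for v in reversed(done):
                self.teardown(v)
            raise
        self.enabled = True

    def disable(self):
        # Bail out if enable() never completed; nothing to undo.
        if not self.enabled:
            return
        for v in reversed(range(self.nvectors)):
            self.teardown(v)
        self.enabled = False
```

With this shape, a failure halfway through enable() leaves no partially
configured vectors behind, and a later disable() is a harmless no-op.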

To reproduce, run:

  ivshmem-server -n 0

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then load, unload, and load again the Windows driver, at the time of writing
available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-4-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Ladi Prosek [Mon, 11 Dec 2017 07:21:08 +0000 (08:21 +0100)]

ivshmem: Always remove irqfd notifiers

As of commit 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

ivshmem: msix_set_vector_notifiers failed
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.

if MSI-X is repeatedly enabled and disabled on the ivshmem device, for example
by loading and unloading the Windows ivshmem driver. This is because
msix_unset_vector_notifiers() doesn't call any of the release notifier callbacks
since MSI-X is already disabled at that point (msix_enabled() returning false
is how this transition is detected in the first place). Thus ivshmem_vector_mask()
doesn't run and when MSI-X is subsequently enabled again ivshmem_vector_unmask()
fails.

This is fixed by keeping track of unmasked vectors and making sure that
ivshmem_vector_mask() always runs on MSI-X disable.
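
A minimal sketch of that bookkeeping, assuming a simplified device model in
which unmasking a vector installs a route and masking removes it. MsixDevice
and its method names are hypothetical and only mirror the idea, not the
actual ivshmem data structures.

```python
class MsixDevice:
    """Tracks which vectors are unmasked so disable can always undo them."""

    def __init__(self, nvectors):
        self.unmasked = [False] * nvectors
        self.routes = set()  # vectors that currently have a route installed

    def vector_unmask(self, v):
        if v in self.routes:
            # Corresponds to the failure seen when a stale route was left over.
            raise RuntimeError("vector %d already has a route" % v)
        self.routes.add(v)
        self.unmasked[v] = True

    def vector_mask(self, v):
        if not self.unmasked[v]:
            return
        self.routes.discard(v)
        self.unmasked[v] = False

    def msix_disabled(self):
        # The generic layer no longer calls the release callbacks at this
        # point, so mask every vector that is still recorded as unmasked.
        for v, on in enumerate(self.unmasked):
            if on:
                self.vector_mask(v)
```

Repeated enable/disable cycles then start from a clean state each time.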

Fixes: 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-3-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Ladi Prosek [Mon, 11 Dec 2017 07:21:07 +0000 (08:21 +0100)]

ivshmem: Don't update non-existent MSI routes

As of commit 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

if the ivshmem device is configured with more vectors than what the server
supports. This is caused by the ivshmem_vector_unmask() being called on
vectors that have not been initialized by ivshmem_add_kvm_msi_virq().

This commit fixes it by adding a simple check to the mask and unmask
callbacks.

Note that the opposite mismatch, if the server supplies more vectors than
what the device is configured for, is already handled and leads to output
like:

  Too many eventfd received, device has 1 vectors

To reproduce the assert, run:

  ivshmem-server -n 0

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then load the Windows driver, at the time of writing available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.

Fixes: 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-2-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Klim Kireev [Thu, 25 Jan 2018 13:51:29 +0000 (16:51 +0300)]

chardev/char-socket: add POLLHUP handler

The following behavior was observed for QEMU configured by libvirt
to use guest agent as usual for the guests without virtio-serial
driver (Windows or the guest remaining in BIOS stage).

In QEMU, on the first connection to a listening character device socket,
the listening socket is removed from poll just after the accept().
virtio_serial_guest_ready() returns 0 and the descriptor
of the connected Unix socket is removed from poll, and it will
not be present in poll() until the guest initializes the driver
and changes the state of the serial port to "guest connected".

In libvirt, connect() to the guest agent is performed on restart and
is run under the VM state lock. connect() is blocking and can
wait forever.
In this case libvirt cannot perform ANY operation on that VM.

The bug can be easily reproduced this way:

Terminal 1:
qemu-system-x86_64 -m 512 -device pci-serial,chardev=serial1 -chardev socket,id=serial1,path=/tmp/console.sock,server,nowait
(virtio-serial and isa-serial also fit)

Terminal 2:
minicom -D unix\#/tmp/console.sock
(type something and press enter)
C-a x (to exit)

Do 3 times:
minicom -D unix\#/tmp/console.sock
C-a x

It needs 4 connections, because the first one is accepted by QEMU, then two are queued by
the kernel, and the 4th blocks.

The problem is that QEMU doesn't add a read watcher after a successful read
until the guest device wants to acquire the received data, so
I propose to install a separate POLLHUP watcher regardless of
whether the device waits for data or not.
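
The mechanism this relies on can be shown with a small, hedged example:
poll(2) reports POLLHUP even when no events are requested, so a watcher with
an empty interest set notices a disconnect without consuming any data. The
helper name wait_for_hangup is made up for this sketch, and the behaviour
shown is that of Linux Unix stream sockets.

```python
import select
import socket


def wait_for_hangup(sock, timeout_ms):
    """Return True if the peer has hung up, without reading any data."""
    poller = select.poll()
    # No POLLIN here: pending data is deliberately ignored. POLLHUP is
    # delivered regardless of the requested event mask.
    poller.register(sock.fileno(), 0)
    for _fd, revents in poller.poll(timeout_ms):
        if revents & select.POLLHUP:
            return True
    return False
```

For example, with a, b = socket.socketpair(), calling b.close() makes
wait_for_hangup(a, 1000) return True even if b sent data that a never read.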

Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
Message-Id: <20180125135129.9305-1-klim.kireev@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Peter Xu [Mon, 22 Jan 2018 06:02:44 +0000 (14:02 +0800)]

memory: do explicit cleanup when remove listeners

When unregistering memory listeners, we should call, e.g.,
region_del() (and possibly other undo operations) on every existing
memory region section there, otherwise we may leak resources that are
held during the region_add(). This patch undoes the stuff for the
listeners, which emulates the case when the address space is set from
current to an empty state.
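
A minimal sketch of the idea, assuming a toy model in which a listener takes
one reference per region in region_add() and drops it in region_del().
Region, RefListener and AddressSpace are hypothetical stand-ins, not the
QEMU memory API.

```python
class Region:
    def __init__(self, name):
        self.name = name
        self.refs = 0


class RefListener:
    """Listener that holds a reference on every region it has been shown."""

    def region_add(self, region):
        region.refs += 1

    def region_del(self, region):
        region.refs -= 1


class AddressSpace:
    def __init__(self, regions):
        self.regions = list(regions)
        self.listeners = []

    def register(self, listener):
        self.listeners.append(listener)
        for region in self.regions:
            listener.region_add(region)

    def unregister(self, listener):
        # Replay the transition to an empty address space so that whatever
        # region_add() acquired is released again.
        for region in reversed(self.regions):
            listener.region_del(region)
        self.listeners.remove(listener)
```

If unregister() only removed the listener from the list, every region would
keep the reference taken in register().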

I found this problem when debugging a refcount leak issue that leads to
a lost device unplug event (please see the "Bug:" line below). In that
case, the leaked resource is the PCI BAR memory region refcount.
And since memory regions do not keep their own refcount but use that of
their owners, the refcount of the vfio-pci device (which is the owner of
the PCI BAR memory regions) is leaked, and the event goes missing.

We had encountered similar issues before and fixed them in another
way (ee4c112846, "vhost: Release memory references on cleanup"). This
patch can be seen as a more high-level fix of similar problems that are
caused by the resource leaks from memory listeners. So now we can remove
the explicit unref of memory regions since that'll be done altogether
during unregistering of listeners now.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1531393
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-5-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Peter Xu [Mon, 22 Jan 2018 06:02:43 +0000 (14:02 +0800)]

vfio: listener unregister before unset container

After the next patch, listener unregister will need the container to be
alive. Let's move this unregister phase to be before unset container,
since that operation will free the backend container in the kernel;
otherwise we'll get these after the next patch:

qemu-system-x86_64: VFIO_UNMAP_DMA: -22
qemu-system-x86_64: vfio_dma_unmap(0x559bf53a4590, 0x0, 0xa0000) = -22 (Invalid argument)

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-4-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Peter Xu [Mon, 22 Jan 2018 06:02:42 +0000 (14:02 +0800)]

arm: postpone device listener unregister

This is preparation for a follow-up patch that calls region_del() in
memory_listener_unregister(); otherwise all device addresses attached to
kvm_devices_head will be reset before calling kvm_arm_set_device_addr.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-3-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Peter Xu [Mon, 22 Jan 2018 06:02:41 +0000 (14:02 +0800)]

vhost: add traces for memory listeners

Trace these operations on two memory listeners. This helps to verify the
new memory listener fix, and the traces are good to keep there.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-2-peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Tue, 16 Jan 2018 15:11:52 +0000 (16:11 +0100)]

ucontext: annotate coroutine stack for ASAN

It helps ASAN to detect more leaks on coroutine stacks, and to get rid
of some extra warnings.

Before:

tests/test-coroutine -p /basic/lifecycle
/basic/lifecycle: ==20781==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
==20781==WARNING: ASan is ignoring requested __asan_handle_no_return:
stack top: 0x7ffcb184d000; bottom 0x7ff6c4cfd000; size: 0x0005ecb50000
(25446121472)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
OK

After:

tests/test-coroutine -p /basic/lifecycle
/basic/lifecycle: ==21110==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
OK

Similar work would need to be done for sigaltstack & windows fibers
to have similar coverage. Since ucontext is preferred, I didn't bother
checking the other coroutine implementations for now.

Update travis to fix the build with ASAN annotations.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Marc-André Lureau [Tue, 16 Jan 2018 15:11:51 +0000 (16:11 +0100)]

build-sys: add --enable-sanitizers

Typical slowdown introduced by AddressSanitizer is 2x.
UBSan shouldn't have much impact on runtime cost.

Enable it by default when --enable-debug, unless --disable-sanitizers.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Peter Maydell [Wed, 7 Feb 2018 12:07:23 +0000 (12:07 +0000)]

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180206a' into staging

Migration pull 2018-02-06

This is based off Juan's last pull with a few extras, but
also removing:
   Add migration xbzrle test
   Add migration precopy test

As well as my normal test boxes, I also gave it a test
on a 32 bit ARM box and it seems happy (a Calxeda highbank)
and a big-endian power box.

Dave

# gpg: Signature made Tue 06 Feb 2018 15:33:31 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180206a:
  migration: incoming postcopy advise sanity checks
  migration: Don't leak IO channels
  migration: Recover block devices if failure in device state
  tests: Adjust sleeps for migration test
  tests: Create migrate-start-postcopy command
  tests: Add deprecated commands migration test
  tests: Use consistent names for migration
  tests: Consolidate accelerators declaration
  tests: Remove deprecated migration tests commands
  migration: Drop current address parameter from save_zero_page()
  migration: use s->threshold_size inside migration_update_counters
  migration/savevm.c: set MAX_VM_CMD_PACKAGED_SIZE to 1ul << 32
  migration: Route errors down through migration_channel_connect
  migration: Allow migrate_fd_connect to take an Error *

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Peter Maydell [Tue, 6 Feb 2018 19:28:08 +0000 (19:28 +0000)]

Merge remote-tracking branch 'remotes/ehabkost/tags/python-next-pull-request' into staging

Python queue, 2018-02-05

# gpg: Signature made Mon 05 Feb 2018 23:07:57 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/python-next-pull-request: (21 commits)
  docker: change Fedora images to run with python3
  travis: improve python version test coverage
  ui: update keycodemapdb to get py3 fixes
  input: add missing JIS keys to virtio input
  qemu.py: don't launch again before shutdown()
  qemu.py: cleanup redundant calls in launch()
  qemu.py: use poll() instead of 'returncode'
  qemu.py: always cleanup on shutdown()
  qemu.py: refactor launch()
  qemu.py: better control of created files
  qemu.py: remove unused import
  configure: allow use of python 3
  scripts: ensure signrom treats data as bytes
  qapi: force a UTF-8 locale for running Python
  qapi: ensure stable sort ordering when checking QAPI entities
  qapi: remove '-q' arg to diff when comparing QAPI output
  qapi: Adapt to moved location of 'maketrans' function in py3
  qapi: adapt to moved location of StringIO module in py3
  qapi: Use OrderedDict from standard library if available
  qapi: use items()/values() intead of iteritems()/itervalues()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Alex Williamson [Tue, 6 Feb 2018 18:08:27 +0000 (11:08 -0700)]

vfio/pci: Add option to disable GeForce quirks

These quirks are necessary for GeForce, but not for Quadro/GRID/Tesla
assignment.  Leaving them enabled is fully functional and provides the
most compatibility, but due to the unique NVIDIA MSI ACK behavior[1],
it also introduces latency in re-triggering the MSI interrupt.  This
overhead is typically negligible, but has been shown to adversely
affect some (very) high interrupt rate applications.  This adds the
vfio-pci device option "x-no-geforce-quirks=" which can be set to
"on" to disable this additional overhead.

A follow-on optimization for GeForce might be to make use of an
ioeventfd to allow KVM to trigger an irqfd in the kernel vfio-pci
driver, avoiding the bounce through userspace to handle this device
write.

[1] Background: the NVIDIA driver has been observed to issue a write
to the MMIO mirror of PCI config space in BAR0 in order to allow the
MSI interrupt for the device to retrigger.  Older reports indicated a
write of 0xff to the (read-only) MSI capability ID register, while
more recently a write of 0x0 is observed at config space offset 0x704,
non-architected, extended config space of the device (BAR0 offset
0x88704).  Virtualization of this range is only required for GeForce.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Alexey Kardashevskiy [Tue, 6 Feb 2018 18:08:27 +0000 (11:08 -0700)]

vfio/common: Remove redundant copy of local variable

There is already @hostwin in vfio_listener_region_add() so there is no
point in having the other one.

Fixes: 2e4109de8e58 ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Eric Auger [Tue, 6 Feb 2018 18:08:26 +0000 (11:08 -0700)]

hw/vfio/platform: Init the interrupt mutex

Add the initialization of the mutex protecting the interrupt list.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Alex Williamson [Tue, 6 Feb 2018 18:08:26 +0000 (11:08 -0700)]

vfio/pci: Allow relocating MSI-X MMIO

Recently proposed vfio-pci kernel changes (v4.16) remove the
restriction preventing userspace from mmap'ing PCI BARs in areas
overlapping the MSI-X vector table.  This change is primarily intended
to benefit host platforms which make use of system page sizes larger
than the PCI spec recommendation for alignment of MSI-X data
structures (i.e. not x86_64).  In the case of POWER systems, the SPAPR
spec requires the VM to program MSI-X using hypercalls, rendering the
MSI-X vector table unused in the VM view of the device.  However,
ARM64 platforms also support 64KB pages and rely on QEMU emulation of
MSI-X.  Regardless of the kernel driver allowing mmaps overlapping
the MSI-X vector table, emulation of the MSI-X vector table also
prevents direct mapping of device MMIO spaces overlapping this page.
Thanks to the fact that PCI devices have a standard self-discovery
mechanism, we can try to resolve this by relocating the MSI-X data
structures, either by creating a new PCI BAR or extending an existing
BAR and updating the MSI-X capability for the new location.  There's
even a very slim chance that this could benefit devices which do not
adhere to the PCI spec alignment guidelines on x86_64 systems.

This new x-msix-relocation option accepts the following choices:

  off: Disable MSI-X relocation, use native device config (default)
  auto: Use a known good combination for the platform/device (none yet)
  bar0..bar5: Specify the target BAR for MSI-X data structures

If compatible, the target BAR will either be created or extended and
the new portion will be used for MSI-X emulation.
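As a rough illustration of the accepted values, the option string could be validated along these lines. This is a sketch only; the function name is hypothetical and it does not reflect how QEMU's property parser is actually written.

```python
def parse_msix_relocation(value: str):
    """Map an option string to 'off', 'auto', or a BAR index 0..5.

    Hypothetical helper: mirrors the choices listed in the commit message.
    """
    if value in ("off", "auto"):
        return value
    # Accept exactly "bar0".."bar5"; there are only six BAR slots.
    if len(value) == 4 and value.startswith("bar") and value[3] in "012345":
        return int(value[3])
    raise ValueError(f"invalid x-msix-relocation value: {value!r}")


print(parse_msix_relocation("off"))   # off
print(parse_msix_relocation("bar5"))  # 5
```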

The first obvious user question with this option is how to determine
whether a given platform and device might benefit from this option.
In most cases, the answer is that it won't, especially on x86_64.
Devices often dedicate an entire BAR to MSI-X and therefore no
performance sensitive registers overlap the MSI-X area.  Take for
example:

  # lspci -vvvs 0a:00.0
  0a:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection
  ...
      Region 0: Memory at db680000 (32-bit, non-prefetchable) [size=512K]
      Region 3: Memory at db7f8000 (32-bit, non-prefetchable) [size=16K]
  ...
      Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
          Vector table: BAR=3 offset=00000000
          PBA: BAR=3 offset=00002000

This device uses the 16K bar3 for MSI-X with the vector table at
offset zero and the pending bit array at offset 8K, fully honoring
the PCI spec alignment guidance.  The data sheet specifically refers
to this as an MSI-X BAR.  This device would not see a benefit from
MSI-X relocation regardless of the platform, regardless of the page
size.

However, here's another example:

  # lspci -vvvs 02:00.0
  02:00.0 Serial Attached SCSI controller: xxxxxxxx
  ...
      Region 0: I/O ports at c000 [size=256]
      Region 1: Memory at ef640000 (64-bit, non-prefetchable) [size=64K]
      Region 3: Memory at ef600000 (64-bit, non-prefetchable) [size=256K]
  ...
      Capabilities: [c0] MSI-X: Enable+ Count=16 Masked-
          Vector table: BAR=1 offset=0000e000
          PBA: BAR=1 offset=0000f000

Here the MSI-X data structures are placed on separate 4K pages at the
end of a 64KB BAR.  If our host page size is 4K, we're likely fine,
but at 64KB page size, MSI-X emulation at that location prevents the
entire BAR from being directly mapped into the VM address space.
Overlapping performance sensitive registers then starts to be a very
likely scenario on such a platform.  At this point, the user could
enable tracing on vfio_region_read and vfio_region_write to determine
more conclusively if device accesses are being trapped through QEMU.
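The page-size effect described here can be estimated with a small calculation. The sketch below is a simplified model, not QEMU's logic: it assumes every host page touched by the vector table or the PBA must be trapped, and it uses the standard MSI-X layout (16 bytes per vector table entry, one pending bit per vector packed into 64-bit words). The function name is hypothetical.

```python
MSIX_ENTRY_SIZE = 16  # bytes per MSI-X vector table entry


def mappable_bytes(bar_size, table_off, pba_off, count, page_size):
    """Bytes of the BAR that can still be directly mapped (simplified model)."""
    table = (table_off, table_off + count * MSIX_ENTRY_SIZE)
    # The PBA holds one bit per vector, rounded up to whole 64-bit words.
    pba = (pba_off, pba_off + ((count + 63) // 64) * 8)

    trapped_pages = set()
    for start, end in (table, pba):
        first = start // page_size
        last = (end - 1) // page_size
        trapped_pages.update(range(first, last + 1))

    total_pages = -(-bar_size // page_size)  # ceiling division
    free_pages = total_pages - len(trapped_pages)
    return min(free_pages * page_size, bar_size)


# Second example device: 64K bar1, table at 0xe000, PBA at 0xf000, 16 vectors.
print(mappable_bytes(64 * 1024, 0xE000, 0xF000, 16, 4 * 1024))   # 57344
print(mappable_bytes(64 * 1024, 0xE000, 0xF000, 16, 64 * 1024))  # 0
```

With 4K pages only the last two pages are trapped; with 64KB pages the single page covering the BAR is trapped, so nothing can be mapped directly.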

Upon finding a device and platform in need of MSI-X relocation, the
next problem is how to choose a target PCI BAR to host the MSI-X data
structures.  A few key rules to keep in mind for this selection
include:

* There are only 6 BAR slots, bar0..bar5
* 64-bit BARs occupy two BAR slots, 'lspci -vvv' lists the first slot
* PCI BARs are always a power of 2 in size, extending == doubling
* The maximum size of a 32-bit BAR is 2GB
* MSI-X data structures must reside in an MMIO BAR

Using these rules, we can evaluate each BAR of the second example
device above as follows:

bar0: I/O port BAR, incompatible with MSI-X tables
bar1: BAR could be extended, incurring another 64KB of MMIO
bar2: Unavailable, bar1 is 64-bit, this register is used by bar1
bar3: BAR could be extended, incurring another 256KB of MMIO
bar4: Unavailable, bar3 is 64-bit, this register is used by bar3
bar5: Available, empty BAR, minimum additional MMIO

A secondary optimization we might wish to make in relocating MSI-X
is to minimize the additional MMIO required for the device, therefore
we might test the available choices in order of preference as bar5,
bar1, and finally bar3.  The original proposal for this feature
included an 'auto' option which would choose bar5 in this case, but
various drivers have been found that make assumptions about the
properties of the "first" BAR or the size of BARs such that there
appears to be no foolproof automatic selection available, requiring
known good combinations to be sourced from users.  This patch is
pre-enabled for an 'auto' selection making use of a validated lookup
table, but no entries are yet identified.
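The selection rules and the preference order above can be sketched as follows. This is an illustrative model only: the `Bar` type and both function names are hypothetical, and a cost of 0 for an empty slot merely stands in for "minimum additional MMIO" since the real size of a new BAR is not modeled here.

```python
from typing import NamedTuple


class Bar(NamedTuple):
    kind: str       # "io" or "mem"
    is_64bit: bool
    size: int       # bytes


MAX_32BIT_BAR = 2 * 1024**3  # a 32-bit BAR cannot exceed 2GB


def evaluate(bars):
    """Map each slot 0..5 to (action, extra_mmio); bars maps slot -> Bar."""
    verdicts = {}
    upper_halves = set()
    for slot in range(6):
        bar = bars.get(slot)
        if slot in upper_halves:
            verdicts[slot] = ("unavailable", None)   # used by a 64-bit BAR
        elif bar is None:
            verdicts[slot] = ("create", 0)           # placeholder cost
        elif bar.kind == "io":
            verdicts[slot] = ("incompatible", None)  # MSI-X needs MMIO
        else:
            if bar.is_64bit:
                upper_halves.add(slot + 1)
            if not bar.is_64bit and bar.size >= MAX_32BIT_BAR:
                verdicts[slot] = ("incompatible", None)  # cannot double
            else:
                verdicts[slot] = ("extend", bar.size)    # extending == doubling
    return verdicts


def preference(verdicts):
    """Usable slots ordered by least additional MMIO."""
    usable = [(cost, slot) for slot, (action, cost) in verdicts.items()
              if action in ("create", "extend")]
    return [f"bar{slot}" for cost, slot in sorted(usable)]


example = {
    0: Bar("io", False, 256),
    1: Bar("mem", True, 64 * 1024),
    3: Bar("mem", True, 256 * 1024),
}
print(preference(evaluate(example)))  # ['bar5', 'bar1', 'bar3']
```

As the commit message notes, an ordering like this is not a safe automatic choice, because some drivers make assumptions about the first BAR or about BAR sizes.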

Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Alex Williamson [Tue, 6 Feb 2018 18:08:26 +0000 (11:08 -0700)]

qapi: Create DEFINE_PROP_OFF_AUTO_PCIBAR

Add an option which allows the user to specify a PCI BAR number,
including an 'off' and 'auto' selection.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Alex Williamson [Tue, 6 Feb 2018 18:08:25 +0000 (11:08 -0700)]

vfio/pci: Emulate BARs

The kernel emulates PCI BAR register access much as QEMU does, so up
until now we've used the kernel's emulation for things like BAR sizing
and storing the BAR address.  However, if we intend to resize BARs or
add BARs that don't exist on the physical device, we need to switch to
the pure QEMU emulation of the BAR.
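For readers unfamiliar with BAR sizing, the sketch below models a 32-bit memory BAR register in software and probes its size the standard way (write all ones, read back, invert the address mask). It is a toy model with hypothetical names, not QEMU or kernel code; it only illustrates why the emulator that owns the register decides what size the guest sees.

```python
class EmulatedBar32:
    """Toy model of a 32-bit memory BAR register."""

    def __init__(self, size, flags=0):
        if size < 16 or size & (size - 1):
            raise ValueError("BAR size must be a power of two, at least 16")
        self.size = size
        self.flags = flags & 0xF  # low four bits are read-only type bits
        self.addr = 0

    def write(self, value):
        # Only address bits above the size are writable.
        self.addr = value & ~(self.size - 1) & 0xFFFFFFFF

    def read(self):
        return self.addr | self.flags


def probe_size(bar):
    """Standard sizing sequence: save, write all ones, read, restore."""
    saved = bar.read()
    bar.write(0xFFFFFFFF)
    mask = bar.read() & ~0xF
    bar.write(saved)
    return (~mask & 0xFFFFFFFF) + 1


physical = EmulatedBar32(64 * 1024)
extended = EmulatedBar32(2 * 64 * 1024)  # same BAR, doubled by the emulator
print(probe_size(physical))  # 65536
print(probe_size(extended))  # 131072
```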

Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Alex Williamson [Tue, 6 Feb 2018 18:08:25 +0000 (11:08 -0700)]

vfio/pci: Add base BAR MemoryRegion

Add one more layer to our stack of MemoryRegions.  This base region
allows us to register BARs independently of the vfio region or to
extend the size of BARs which do map to a region.  This will be
useful when we want hypervisor-defined BARs or sections of BARs,
for purposes such as relocating MSI-X emulation.  We therefore call
msix_init() based on this new base MemoryRegion, while the quirks,
which only modify regions, still operate on those sub-MemoryRegions.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Alex Williamson [Tue, 6 Feb 2018 18:08:25 +0000 (11:08 -0700)]

vfio/pci: Fixup VFIOMSIXInfo comment

The fields were removed in the referenced commit, but the comment
still mentions them.

Fixes: 2fb9636ebf24 ("vfio-pci: Remove unused fields from VFIOMSIXInfo")
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
