OSDN Git Service

qmiga/qemu.git
3 years agotarget/arm: Correctly initialize MDCR_EL2.HPMN
Daniel Müller [Wed, 10 Feb 2021 17:41:22 +0000 (09:41 -0800)]
target/arm: Correctly initialize MDCR_EL2.HPMN

When working with performance monitoring counters, we look at
MDCR_EL2.HPMN as part of the check whether a counter is enabled. This
check fails, because MDCR_EL2.HPMN is reset to 0, meaning that no
counters are "enabled" for < EL2.
That's in violation of the Arm specification, which states that

> On a Warm reset, this field [MDCR_EL2.HPMN] resets to the value in
> PMCR_EL0.N

That's also what a comment in the code acknowledges, but the necessary
adjustment seems to have been forgotten when support for more counters
was added.
This change fixes the issue by setting the reset value to PMCR.N, which
is four.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/arm: versal: Use nr_apu_cpus in favor of hard coding 2
Edgar E. Iglesias [Wed, 10 Feb 2021 14:20:48 +0000 (15:20 +0100)]
hw/arm: versal: Use nr_apu_cpus in favor of hard coding 2

Use nr_apu_cpus in favor of hard coding 2.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210210142048.3125878-2-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoaccel/tcg: Add URL of clang bug to comment about our workaround
Peter Maydell [Fri, 29 Jan 2021 13:03:30 +0000 (13:03 +0000)]
accel/tcg: Add URL of clang bug to comment about our workaround

In cpu_exec() we have a longstanding workaround for compilers which
do not correctly implement the part of the sigsetjmp()/siglongjmp()
spec which requires that local variables which are not changed
between the setjmp and the longjmp retain their value.

I recently ran across the upstream clang bug report for this; add a
link to it to the comment describing the workaround, and generally
expand the comment, so that we have a reasonable chance in future of
understanding why it's there and determining when we can remove it,
assuming clang eventually fixes the bug.

Remove the /* buggy compiler */ comments on the #else and #endif:
they don't add anything to understanding and are somewhat misleading
since they're sandwiching the code path for *non*-buggy compilers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20210129130330.30820-1-peter.maydell@linaro.org

3 years agoarm: Update infocenter.arm.com URLs
Peter Maydell [Fri, 5 Feb 2021 17:14:56 +0000 (17:14 +0000)]
arm: Update infocenter.arm.com URLs

Update infocenter.arm.com URLs for various pieces of Arm
documentation to the new developer.arm.com equivalents.  (There is a
redirection in place from the old URLs, but we might as well update
our comments in case the redirect ever disappears in future.)

This patch covers all the URLs which are not MPS2/SSE-200/IoTKit
related (those are dealt with in a different patch).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210205171456.19939-1-peter.maydell@linaro.org

3 years agotarget/arm: Set ID_PFR0.DIT to 1 for "max" 32-bit CPU
Rebecca Cran [Mon, 8 Feb 2021 06:57:00 +0000 (23:57 -0700)]
target/arm: Set ID_PFR0.DIT to 1 for "max" 32-bit CPU

Enable FEAT_DIT for the "max" 32-bit CPU.

Signed-off-by: Rebecca Cran <rebecca@nuviainc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210208065700.19454-5-rebecca@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/arm: Set ID_AA64PFR0.DIT and ID_PFR0.DIT to 1 for "max" AA64 CPU
Rebecca Cran [Mon, 8 Feb 2021 06:56:59 +0000 (23:56 -0700)]
target/arm: Set ID_AA64PFR0.DIT and ID_PFR0.DIT to 1 for "max" AA64 CPU

Enable FEAT_DIT for the "max" AARCH64 CPU.

Signed-off-by: Rebecca Cran <rebecca@nuviainc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210208065700.19454-4-rebecca@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/arm: Support AA32 DIT by moving PSTATE_SS from cpsr into env->pstate
Rebecca Cran [Mon, 8 Feb 2021 06:56:58 +0000 (23:56 -0700)]
target/arm: Support AA32 DIT by moving PSTATE_SS from cpsr into env->pstate

cpsr has been treated as being the same as spsr, but it isn't.
Since PSTATE_SS isn't in cpsr, remove it and move it into env->pstate.

This allows us to add support for CPSR_DIT, adding helper functions
to merge SPSR_ELx to and from CPSR.

Signed-off-by: Rebecca Cran <rebecca@nuviainc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210208065700.19454-3-rebecca@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/arm: Add support for FEAT_DIT, Data Independent Timing
Rebecca Cran [Mon, 8 Feb 2021 06:56:57 +0000 (23:56 -0700)]
target/arm: Add support for FEAT_DIT, Data Independent Timing

Add support for FEAT_DIT. DIT (Data Independent Timing) is a required
feature for ARMv8.4. Since virtual machine execution is largely
nondeterministic and TCG is outside of the security domain, it's
implemented as a NOP.

Signed-off-by: Rebecca Cran <rebecca@nuviainc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210208065700.19454-2-rebecca@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/arm: Remove GPIO from unimplemented NPCM7XX
Hao Wu [Fri, 29 Jan 2021 00:58:40 +0000 (16:58 -0800)]
hw/arm: Remove GPIO from unimplemented NPCM7XX

NPCM7XX GPIO devices have been implemented in hw/gpio/npcm7xx-gpio.c. So
we removed them from the unimplemented devices list.

Reviewed-by: Doug Evans<dje@google.com>
Reviewed-by: Tyrong Ting<kfting@nuvoton.com>
Signed-off-by: Hao Wu<wuhaotsh@google.com>
Message-id: 20210129005845.416272-2-wuhaotsh@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/arm: Fix SCR RES1 handling
Mike Nawrocki [Wed, 3 Feb 2021 16:55:52 +0000 (11:55 -0500)]
target/arm: Fix SCR RES1 handling

The FW and AW bits of SCR_EL3 are RES1 only in some contexts. Force them
to 1 only when there is no support for AArch32 at EL1 or above.

The reset value will be 0x30 only if the CPU is AArch64-only; if there
is support for AArch32 at EL1 or above, it will be reset to 0.

Also adds helper function isar_feature_aa64_aa32_el1 to check if AArch32
is supported at EL1 or above.

Signed-off-by: Mike Nawrocki <michael.nawrocki@gtri.gatech.edu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210203165552.16306-2-michael.nawrocki@gtri.gatech.edu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/arm: Don't migrate CPUARMState.features
Aaron Lindsay [Wed, 3 Feb 2021 16:13:40 +0000 (11:13 -0500)]
target/arm: Don't migrate CPUARMState.features

As feature flags are added or removed, the meanings of bits in the
`features` field can change between QEMU versions, causing migration
failures. Additionally, migrating the field is not useful because it is
a constant function of the CPU being used.

Fixes: LP:1914696
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into...
Peter Maydell [Wed, 10 Feb 2021 15:42:19 +0000 (15:42 +0000)]
Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging

Pull request

v4:
 * Add PCI_EXPRESS Kconfig dependency to fix s390x in "multi-process: setup PCI
   host bridge for remote device" [Philippe and Thomas]

# gpg: Signature made Wed 10 Feb 2021 09:26:14 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha-gitlab/tags/block-pull-request: (27 commits)
  docs: fix Parallels Image "dirty bitmap" section
  multi-process: perform device reset in the remote process
  multi-process: Retrieve PCI info from remote process
  multi-process: create IOHUB object to handle irq
  multi-process: Synchronize remote memory
  multi-process: PCI BAR read/write handling for proxy & remote endpoints
  multi-process: Forward PCI config space acceses to the remote process
  multi-process: add proxy communication functions
  multi-process: introduce proxy object
  multi-process: setup memory manager for remote device
  multi-process: Associate fd of a PCIDevice with its object
  multi-process: Initialize message handler in remote device
  multi-process: define MPQemuMsg format and transmission functions
  io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers
  io: add qio_channel_writev_full_all helper
  multi-process: setup a machine object for remote device process
  multi-process: setup PCI host bridge for remote device
  multi-process: Add config option for multi-process QEMU
  memory: alloc RAM from file at offset
  multi-process: add configure and usage information
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210210' into staging
Peter Maydell [Wed, 10 Feb 2021 13:38:27 +0000 (13:38 +0000)]
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210210' into staging

ppc patch queue for 20201-02-10

Here's the latest batch of patches for the ppc target and machine
types.  Highlights are:
 * Several fixes for E500 from Bin Meng
 * Fixes and cleanups for PowerNV from Cédric Le Goater
 * Assorted other fixes and cleanups

# gpg: Signature made Wed 10 Feb 2021 06:16:53 GMT
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dg-gitlab/tags/ppc-for-6.0-20210210:
  target/ppc: Add E500 L2CSR0 write helper
  hw/net: fsl_etsec: Reverse the RCTRL.RSF logic
  hw/ppc: e500: Fill in correct <clock-frequency> for the serial nodes
  hw/ppc: e500: Use a macro for the platform clock frequency
  ppc/pnv: Set default RAM size to 1 GB
  spapr_numa.c: fix ibm,max-associativity-domains calculation
  spapr_numa.c: create spapr_numa_initial_nvgpu_numa_id() helper
  spapr: move spapr_machine_using_legacy_numa() to spapr_numa.c
  ppc/pnv: Introduce a LPC FW memory region attribute to map the PNOR
  ppc/pnv: Remove default disablement of the PNOR contents
  ppc/pnv: Discard internal BMC initialization when BMC is external
  ppc/pnv: Simplify pnv_bmc_create()
  ppc/pnv: Use skiboot addresses to load kernel and ramfs
  ppc/xive: Add firmware bit when dumping the ENDs
  ppc/pnv: Add trace events for PCI event notification
  target/ppc: Remove unused MMU definitions
  spapr: Adjust firmware path of PCI devices
  spapr.c: add 'name' property for hotplugged CPUs nodes
  spapr.c: use g_auto* with 'nodename' in CPU DT functions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agodocs: fix Parallels Image "dirty bitmap" section
Denis V. Lunev [Thu, 28 Jan 2021 17:13:13 +0000 (20:13 +0300)]
docs: fix Parallels Image "dirty bitmap" section

Original specification says that l1 table size if 64 * l1_size, which
is obviously wrong. The size of the l1 entry is 64 _bits_, not bytes.
Thus 64 is to be replaces with 8 as specification says about bytes.

There is also minor tweak, field name is renamed from l1 to l1_table,
which matches with the later text.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20210128171313.2210947-1-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[Replace the original commit message "docs: fix mistake in dirty bitmap
feature description" as suggested by Eric Blake.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: perform device reset in the remote process
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:21 +0000 (11:46 -0500)]
multi-process: perform device reset in the remote process

Perform device reset in the remote process when QEMU performs
device reset. This is required to reset the internal state
(like registers, etc...) of emulated devices

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 7cb220a51f565dc0817bd76e2f540e89c2d2b850.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: Retrieve PCI info from remote process
Jagannathan Raman [Fri, 29 Jan 2021 16:46:20 +0000 (11:46 -0500)]
multi-process: Retrieve PCI info from remote process

Retrieve PCI configuration info about the remote device and
configure the Proxy PCI object based on the returned information

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 85ee367bbb993aa23699b44cfedd83b4ea6d5221.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: create IOHUB object to handle irq
Jagannathan Raman [Fri, 29 Jan 2021 16:46:19 +0000 (11:46 -0500)]
multi-process: create IOHUB object to handle irq

IOHUB object is added to manage PCI IRQs. It uses KVM_IRQFD
ioctl to create irqfd to injecting PCI interrupts to the guest.
IOHUB object forwards the irqfd to the remote process. Remote process
uses this fd to directly send interrupts to the guest, bypassing QEMU.

Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 51d5c3d54e28a68b002e3875c59599c9f5a424a1.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: Synchronize remote memory
Jagannathan Raman [Fri, 29 Jan 2021 16:46:18 +0000 (11:46 -0500)]
multi-process: Synchronize remote memory

Add ProxyMemoryListener object which is used to keep the view of the RAM
in sync between QEMU and remote process.
A MemoryListener is registered for system-memory AddressSpace. The
listener sends SYNC_SYSMEM message to the remote process when memory
listener commits the changes to memory, the remote process receives
the message and processes it in the handler for SYNC_SYSMEM message.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 04fe4e6a9ca90d4f11ab6f59be7652f5b086a071.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: PCI BAR read/write handling for proxy & remote endpoints
Jagannathan Raman [Fri, 29 Jan 2021 16:46:17 +0000 (11:46 -0500)]
multi-process: PCI BAR read/write handling for proxy & remote endpoints

Proxy device object implements handler for PCI BAR writes and reads.
The handler uses BAR_WRITE/BAR_READ message to communicate to the
remote process with the BAR address and value to be written/read.
The remote process implements handler for BAR_WRITE/BAR_READ
message.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: a8b76714a9688be5552c4c92d089bc9e8a4707ff.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: Forward PCI config space acceses to the remote process
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:16 +0000 (11:46 -0500)]
multi-process: Forward PCI config space acceses to the remote process

The Proxy Object sends the PCI config space accesses as messages
to the remote process over the communication channel

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: d3c94f4618813234655356c60e6f0d0362ff42d6.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: add proxy communication functions
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:15 +0000 (11:46 -0500)]
multi-process: add proxy communication functions

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: d54edb4176361eed86b903e8f27058363b6c83b3.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: introduce proxy object
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:14 +0000 (11:46 -0500)]
multi-process: introduce proxy object

Defines a PCI Device proxy object as a child of TYPE_PCI_DEVICE.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: b5186ebfedf8e557044d09a768846c59230ad3a7.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: setup memory manager for remote device
Jagannathan Raman [Fri, 29 Jan 2021 16:46:13 +0000 (11:46 -0500)]
multi-process: setup memory manager for remote device

SyncSysMemMsg message format is defined. It is used to send
file descriptors of the RAM regions to remote device.
RAM on the remote device is configured with a set of file descriptors.
Old RAM regions are deleted and new regions, each with an fd, is
added to the RAM.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 7d2d1831d812e85f681e7a8ab99e032cf4704689.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: Associate fd of a PCIDevice with its object
Jagannathan Raman [Fri, 29 Jan 2021 16:46:12 +0000 (11:46 -0500)]
multi-process: Associate fd of a PCIDevice with its object

Associate the file descriptor for a PCIDevice in remote process with
DeviceState object.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: f405a2ed5d7518b87bea7c59cfdf334d67e5ee51.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: Initialize message handler in remote device
Jagannathan Raman [Fri, 29 Jan 2021 16:46:11 +0000 (11:46 -0500)]
multi-process: Initialize message handler in remote device

Initializes the message handler function in the remote process. It is
called whenever there's an event pending on QIOChannel that registers
this function.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 99d38d8b93753a6409ac2340e858858cda59ab1b.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: define MPQemuMsg format and transmission functions
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:10 +0000 (11:46 -0500)]
multi-process: define MPQemuMsg format and transmission functions

Defines MPQemuMsg, which is the message that is sent to the remote
process. This message is sent over QIOChannel and is used to
command the remote process to perform various tasks.
Define transmission functions used by proxy and by remote.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 56ca8bcf95195b2b195b08f6b9565b6d7410bce5.1611938319.git.jag.raman@oracle.com

[Replace struct iovec send[2] = {0} with {} to make clang happy as
suggested by Peter Maydell <peter.maydell@linaro.org>.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agoio: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:09 +0000 (11:46 -0500)]
io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers

Adds qio_channel_readv_full_all_eof() and qio_channel_readv_full_all()
to read both data and FDs. Refactors existing code to use these helpers.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: b059c4cc0fb741e794d644c144cc21372cad877d.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agoio: add qio_channel_writev_full_all helper
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:08 +0000 (11:46 -0500)]
io: add qio_channel_writev_full_all helper

Adds qio_channel_writev_full_all() to transmit both data and FDs.
Refactors existing code to use this helper.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 480fbf1fe4152495d60596c9b665124549b426a5.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: setup a machine object for remote device process
Jagannathan Raman [Fri, 29 Jan 2021 16:46:07 +0000 (11:46 -0500)]
multi-process: setup a machine object for remote device process

x-remote-machine object sets up various subsystems of the remote
device process. Instantiate PCI host bridge object and initialize RAM, IO &
PCI memory regions.

Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: c537f38d17f90453ca610c6b70cf3480274e0ba1.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: setup PCI host bridge for remote device
Jagannathan Raman [Fri, 29 Jan 2021 16:46:06 +0000 (11:46 -0500)]
multi-process: setup PCI host bridge for remote device

PCI host bridge is setup for the remote device process. It is
implemented using remote-pcihost object. It is an extension of the PCI
host bridge setup by QEMU.
Remote-pcihost configures a PCI bus which could be used by the remote
PCI device to latch on to.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 0871ba857abb2eafacde07e7fe66a3f12415bfb2.1611938319.git.jag.raman@oracle.com

[Added PCI_EXPRESS condition in hw/remote/Kconfig since remote-pcihost
needs PCIe. This solves "make check" failure on s390x. Fix suggested by
Philippe Mathieu-Daudé <philmd@redhat.com> and Thomas Huth
<thuth@redhat.com>.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agotarget/ppc: Add E500 L2CSR0 write helper
Bin Meng [Wed, 10 Feb 2021 02:45:52 +0000 (10:45 +0800)]
target/ppc: Add E500 L2CSR0 write helper

Per EREF 2.0 [1] chapter 3.11.2:

The following bits in L2CSR0 (exists in the e500mc/e5500/e6500 core):

- L2FI  (L2 cache flash invalidate)
- L2FL  (L2 cache flush)
- L2LFC (L2 cache lock flash clear)

when set, a cache operation is initiated by hardware, and these bits
will be cleared when the operation is complete.

Since we don't model cache in QEMU, let's add a write helper to emulate
the cache operations completing instantly.

[1] https://www.nxp.com/files-static/32bit/doc/ref_manual/EREFRM.pdf

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <1612925152-20913-1-git-send-email-bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agohw/net: fsl_etsec: Reverse the RCTRL.RSF logic
Bin Meng [Wed, 10 Feb 2021 02:10:21 +0000 (10:10 +0800)]
hw/net: fsl_etsec: Reverse the RCTRL.RSF logic

Per MPC8548ERM [1] chapter 14.5.3.4.1:

When RCTRL.RSF is 1, frames less than 64 bytes are accepted upon
a DA match. But currently QEMU does the opposite. This commit
reverses the RCTRL.RSF testing logic to match the manual.

Due to the reverse of the logic, certain guests may potentially
break if they don't program eTSEC to have RCTRL.RSF bit set.
When RCTRL.RSF is 0, short frames are silently dropped, however
as of today both slirp and tap networking do not pad short frames
(e.g.: an ARP packet) to the minimum frame size of 60 bytes. So
ARP requests will be dropped, preventing the guest from becoming
visible on the network.

The same issue was reported on e1000 and vmxenet3 before, see:

commit 78aeb23eded2 ("e1000: Pad short frames to minimum size (60 bytes)")
commit 40a87c6c9b11 ("vmxnet3: Pad short frames to minimum size (60 bytes)")

[1] https://www.nxp.com/docs/en/reference-manual/MPC8548ERM.pdf

Fixes: eb1e7c3e5146 ("Add Enhanced Three-Speed Ethernet Controller (eTSEC)")
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <1612923021-19746-1-git-send-email-bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agohw/ppc: e500: Fill in correct <clock-frequency> for the serial nodes
Bin Meng [Wed, 3 Feb 2021 14:24:48 +0000 (22:24 +0800)]
hw/ppc: e500: Fill in correct <clock-frequency> for the serial nodes

At present the <clock-frequency> property of the serial node is
populated with value zero. U-Boot's ns16550 driver is not happy
about this, so let's fill in a meaningful value.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1612362288-22216-2-git-send-email-bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agohw/ppc: e500: Use a macro for the platform clock frequency
Bin Meng [Wed, 3 Feb 2021 14:24:47 +0000 (22:24 +0800)]
hw/ppc: e500: Use a macro for the platform clock frequency

At present the platform clock frequency is using a magic number.
Convert it to a macro and use it everywhere.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1612362288-22216-1-git-send-email-bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Set default RAM size to 1 GB
Cédric Le Goater [Fri, 29 Jan 2021 11:17:19 +0000 (12:17 +0100)]
ppc/pnv: Set default RAM size to 1 GB

The memory layout of the PowerNV machine is defined as :

  #define KERNEL_LOAD_BASE ((void *)0x20000000)
  #define KERNEL_LOAD_SIZE 0x08000000

  #define INITRAMFS_LOAD_BASE KERNEL_LOAD_BASE + KERNEL_LOAD_SIZE
  #define INITRAMFS_LOAD_SIZE 0x08000000

  #define SKIBOOT_BASE 0x30000000
  #define SKIBOOT_SIZE 0x01c10000

  #define CPU_STACKS_BASE (SKIBOOT_BASE + SKIBOOT_SIZE)
  #define STACK_SHIFT 15
  #define STACK_SIZE (1 << STACK_SHIFT)

The overall size of the CPU stacks is (max PIR + 1) * 32K and the
machine easily reaches 800MB of minimum required RAM.

Any value below will result in a skiboot crash :

    [    0.034949905,3] MEM: Partial overlap detected between regions:
    [    0.034959039,3] MEM: ibm,firmware-stacks [0x31c10000-0x3a450000] (new)
    [    0.034968576,3] MEM: ibm,firmware-allocs-memory@0 [0x31c10000-0x38400000]
    [    0.034980367,3] Out of memory adding skiboot reserved areas
    [    0.035074945,3] ***********************************************
    [    0.035093627,3] < assert failed at core/mem_region.c:1129 >
    [    0.035104247,3]     .
    [    0.035108025,3]      .
    [    0.035111651,3]       .
    [    0.035115231,3]         OO__)
    [    0.035119198,3]        <"__/
    [    0.035122980,3]         ^ ^

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210129111719.790692-1-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr_numa.c: fix ibm,max-associativity-domains calculation
Daniel Henrique Barboza [Thu, 28 Jan 2021 17:42:13 +0000 (14:42 -0300)]
spapr_numa.c: fix ibm,max-associativity-domains calculation

The current logic for calculating 'maxdomain' making it a sum of
numa_state->num_nodes with spapr->gpu_numa_id. spapr->gpu_numa_id is
used as a index to determine the next available NUMA id that a
given NVGPU can use.

The problem is that the initial value of gpu_numa_id, for any topology
that has more than one NUMA node, is equal to numa_state->num_nodes.
This means that our maxdomain will always be, at least, twice the
amount of existing NUMA nodes. This means that a guest with 4 NUMA
nodes will end up with the following max-associativity-domains:

rtas/ibm,max-associativity-domains
                 00000004 00000008 00000008 00000008 00000008

This overtuning of maxdomains doesn't go unnoticed in the guest, being
detected in SLUB during boot:

 dmesg | grep SLUB
[    0.000000] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=4, Nodes=8

SLUB is detecting 8 total nodes, with 4 nodes being online.

This patch fixes ibm,max-associativity-domains by considering the amount
of NVGPUs NUMA nodes presented in the guest, instead of just
spapr->gpu_numa_id.

Reported-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210128174213.1349181-4-danielhb413@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr_numa.c: create spapr_numa_initial_nvgpu_numa_id() helper
Daniel Henrique Barboza [Thu, 28 Jan 2021 17:42:12 +0000 (14:42 -0300)]
spapr_numa.c: create spapr_numa_initial_nvgpu_numa_id() helper

We'll need to check the initial value given to spapr->gpu_numa_id when
building the rtas DT, so put it in a helper for easier access and to
avoid repetition.

Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210128174213.1349181-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr: move spapr_machine_using_legacy_numa() to spapr_numa.c
Daniel Henrique Barboza [Thu, 28 Jan 2021 17:42:11 +0000 (14:42 -0300)]
spapr: move spapr_machine_using_legacy_numa() to spapr_numa.c

This function is used only in spapr_numa.c.

Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210128174213.1349181-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Introduce a LPC FW memory region attribute to map the PNOR
Cédric Le Goater [Tue, 26 Jan 2021 17:10:59 +0000 (18:10 +0100)]
ppc/pnv: Introduce a LPC FW memory region attribute to map the PNOR

This to map the PNOR from the machine init handler directly and finish
the cleanup of the LPC model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-8-clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Remove default disablement of the PNOR contents
Cédric Le Goater [Tue, 26 Jan 2021 17:10:58 +0000 (18:10 +0100)]
ppc/pnv: Remove default disablement of the PNOR contents

On PowerNV systems, the BMC is in charge of mapping the PNOR contents
on the LPC FW address space using the HIOMAP protocol. Under QEMU, we
emulate this behavior and we also add an extra control on the flash
accesses by letting the HIOMAP command handler decide whether the
memory region is accessible or not depending on the firmware requests.

However, this behavior is not compatible with hostboot like firmwares
which need this mapping to be always available. For this reason, the
PNOR memory region is initially disabled for skiboot mode only.

This is badly placed under the LPC model and requires the use of the
machine. Since it doesn't add much, simply remove the initial setting.
The extra control in the HIOMAP command handler will still be performed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-7-clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Discard internal BMC initialization when BMC is external
Cédric Le Goater [Tue, 26 Jan 2021 17:10:57 +0000 (18:10 +0100)]
ppc/pnv: Discard internal BMC initialization when BMC is external

The PowerNV machine can be run with an external IPMI BMC device
connected to a remote QEMU machine acting as BMC, using these options :

  -chardev socket,id=ipmi0,host=localhost,port=9002,reconnect=10 \
  -device ipmi-bmc-extern,id=bmc0,chardev=ipmi0 \
  -device isa-ipmi-bt,bmc=bmc0,irq=10 \
  -nodefaults

In that case, some aspects of the BMC initialization should be
skipped, since they rely on the simulator interface.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-6-clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Simplify pnv_bmc_create()
Cédric Le Goater [Tue, 26 Jan 2021 17:10:56 +0000 (18:10 +0100)]
ppc/pnv: Simplify pnv_bmc_create()

and reuse pnv_bmc_set_pnor() to share the setting of the PNOR.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Use skiboot addresses to load kernel and ramfs
Cédric Le Goater [Tue, 26 Jan 2021 17:10:55 +0000 (18:10 +0100)]
ppc/pnv: Use skiboot addresses to load kernel and ramfs

The current settings are useful to load large kernels (with debug) but
it moves the initrd image in a memory region not protected by
skiboot. If skiboot is compiled with DEBUG=1, memory poisoning will
corrupt the initrd.

Cc: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-4-clg@kaod.org>
Reviewed-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/xive: Add firmware bit when dumping the ENDs
Cédric Le Goater [Tue, 26 Jan 2021 17:10:54 +0000 (18:10 +0100)]
ppc/xive: Add firmware bit when dumping the ENDs

ENDs allocated by OPAL for the HW thread VPs are tagged as owned by FW.
Dump the state in 'info pic'.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Add trace events for PCI event notification
Cédric Le Goater [Tue, 26 Jan 2021 17:10:53 +0000 (18:10 +0100)]
ppc/pnv: Add trace events for PCI event notification

On POWER9 systems, PHB controllers signal the XIVE interrupt controller
of a source interrupt notification using a store on a MMIO region. Add
traces for such events.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Remove unused MMU definitions
Philippe Mathieu-Daudé [Wed, 27 Jan 2021 23:24:01 +0000 (00:24 +0100)]
target/ppc: Remove unused MMU definitions

Remove these confusing and unused definitions.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210127232401.3525126-1-f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr: Adjust firmware path of PCI devices
Greg Kurz [Fri, 22 Jan 2021 17:01:57 +0000 (18:01 +0100)]
spapr: Adjust firmware path of PCI devices

It is currently not possible to perform a strict boot from USB storage:

$ qemu-system-ppc64 -accel kvm -nodefaults -nographic -serial stdio \
-boot strict=on \
-device qemu-xhci \
-device usb-storage,drive=disk,bootindex=0 \
-blockdev driver=file,node-name=disk,filename=fedora-ppc64le.qcow2

SLOF **********************************************************************
QEMU Starting
 Build Date = Jul 17 2020 11:15:24
 FW Version = git-e18ddad8516ff2cf
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
                     00 0000 (D) : 1b36 000d    serial bus [ usb-xhci ]
No NVRAM common partition, re-initializing...
Scanning USB
  XHCI: Initializing
    USB Storage
       SCSI: Looking for devices
          101000000000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
Using default console: /vdevice/vty@71000000

  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Trying to load:  from: /pci@800000020000000/usb@0/storage@1/disk@101000000000000 ...
E3405: No such device

E3407: Load failed

  Type 'boot' and press return to continue booting the system.
  Type 'reset-all' and press return to reboot the system.

Ready!
0 >

The device tree handed over by QEMU to SLOF indeed contains:

qemu,boot-list =
"/pci@800000020000000/usb@0/storage@1/disk@101000000000000 HALT";

but the device node is named usb-xhci@0, not usb@0.

This happens because the firmware names of PCI devices returned
by get_boot_devices_list() come from pcibus_get_fw_dev_path(),
while the sPAPR PHB code uses a different naming scheme for
device nodes. This inconsistency has always been there but it was
hidden for a long time because SLOF used to rename USB device
nodes, until this commit, merged in QEMU 4.2.0 :

commit 85164ad4ed9960cac842fa4cc067c6b6699b0994
Author: Alexey Kardashevskiy <aik@ozlabs.ru>
Date:   Wed Sep 11 16:24:32 2019 +1000

    pseries: Update SLOF firmware image

    This fixes USB host bus adapter name in the device tree to match QEMU's
    one.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fortunately, sPAPR implements the firmware path provider interface.
This provides a way to override the default firmware paths.

Just factor out the sPAPR PHB naming logic from spapr_dt_pci_device()
to a helper, and use it in the sPAPR firmware path provider hook.

Fixes: 85164ad4ed99 ("pseries: Update SLOF firmware image")
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210122170157.246374-1-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr.c: add 'name' property for hotplugged CPUs nodes
Daniel Henrique Barboza [Wed, 20 Jan 2021 23:23:05 +0000 (20:23 -0300)]
spapr.c: add 'name' property for hotplugged CPUs nodes

In the CPU hotunplug bug [1] the guest kernel throws a scary
message in dmesg:

pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16

The reason isn't related to the bug though. This happens because the
kernel file arch/powerpc/platform/pseries/hotplug-cpu.c, function
dlpar_cpu_remove(), is not finding the device_node.name of the offending
CPU.

We're not populating the 'name' property for hotplugged CPUs. Since the
kernel relies on device_node.name for identifying CPU nodes, and the
CPUs that are coldplugged has the 'name' property filled by SLOF, this
is creating an unneeded inconsistency between hotplug and coldplug CPUs
in the kernel.

Let's fill the 'name' property for hotplugged CPUs as well. This will
make the guest dmesg throws a less intimidating message when we try to
unplug the last online CPU:

pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9@1, rc: -16

[1] https://bugzilla.redhat.com/1911414

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210120232305.241521-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr.c: use g_auto* with 'nodename' in CPU DT functions
Daniel Henrique Barboza [Wed, 20 Jan 2021 23:23:04 +0000 (20:23 -0300)]
spapr.c: use g_auto* with 'nodename' in CPU DT functions

Next patch will use the 'nodename' string in spapr_core_dt_populate()
after the point it's being freed today.

Instead of moving 'g_free(nodename)' around, let's do a QoL change in
both CPU DT functions where 'nodename' is being freed, and use
g_autofree to avoid the 'g_free()' call altogether.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210120232305.241521-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agomulti-process: Add config option for multi-process QEMU
Jagannathan Raman [Fri, 29 Jan 2021 16:46:05 +0000 (11:46 -0500)]
multi-process: Add config option for multi-process QEMU

Add configuration options to enable or disable multiprocess QEMU code

Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 6cc37253e35418ebd7b675a31a3df6e3c7a12dc1.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomemory: alloc RAM from file at offset
Jagannathan Raman [Fri, 29 Jan 2021 16:46:04 +0000 (11:46 -0500)]
memory: alloc RAM from file at offset

Allow RAM MemoryRegion to be created from an offset in a file, instead
of allocating at offset of 0 by default. This is needed to synchronize
RAM between QEMU & remote process.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: add configure and usage information
Elena Ufimtseva [Fri, 29 Jan 2021 16:46:03 +0000 (11:46 -0500)]
multi-process: add configure and usage information

Adds documentation explaining the command-line arguments needed
to use multi-process.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 49f757a84e5dd6fae14b22544897d1124c5fdbad.1611938319.git.jag.raman@oracle.com

[Move orphan docs/multi-process.rst document into docs/system/ and add
it to index.rst to prevent Sphinx "document isn't included in any
toctree" error.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agomulti-process: add the concept description to docs/devel/qemu-multiprocess
John G Johnson [Fri, 29 Jan 2021 16:46:02 +0000 (11:46 -0500)]
multi-process: add the concept description to docs/devel/qemu-multiprocess

Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 02a68adef99f5df6a380bf8fd7b90948777e411c.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agoget_maintainer: update repo URL to GitLab
Stefan Hajnoczi [Mon, 11 Jan 2021 11:50:17 +0000 (11:50 +0000)]
get_maintainer: update repo URL to GitLab

qemu.org is running out of bandwidth and the QEMU project is moving
towards a gating CI on GitLab. Use the GitLab repos instead of qemu.org
(they will become mirrors).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210111115017.156802-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agopc-bios: update mirror URLs to GitLab
Stefan Hajnoczi [Mon, 11 Jan 2021 11:50:16 +0000 (11:50 +0000)]
pc-bios: update mirror URLs to GitLab

qemu.org is running out of bandwidth and the QEMU project is moving
towards a gating CI on GitLab. Use the GitLab repos instead of qemu.org
(they will become mirrors).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210111115017.156802-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agodocs: update README to use GitLab repo URLs
Stefan Hajnoczi [Mon, 11 Jan 2021 11:50:15 +0000 (11:50 +0000)]
docs: update README to use GitLab repo URLs

qemu.org is running out of bandwidth and the QEMU project is moving
towards a gating CI on GitLab. Use the GitLab repos instead of qemu.org
(they will become mirrors).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210111115017.156802-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agogitlab-ci: remove redundant GitLab repo URL command
Stefan Hajnoczi [Mon, 11 Jan 2021 11:50:14 +0000 (11:50 +0000)]
gitlab-ci: remove redundant GitLab repo URL command

It is no longer necessary to point .gitmodules at GitLab repos when
running in GitLab CI since they are now used all the time.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210111115017.156802-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agogitmodules: use GitLab repos instead of qemu.org
Stefan Hajnoczi [Mon, 11 Jan 2021 11:50:13 +0000 (11:50 +0000)]
gitmodules: use GitLab repos instead of qemu.org

qemu.org is running out of bandwidth and the QEMU project is moving
towards a gating CI on GitLab. Use the GitLab repos instead of qemu.org
(they will become mirrors).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210111115017.156802-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years ago.github: point Repo Lockdown bot to GitLab repo
Stefan Hajnoczi [Mon, 11 Jan 2021 11:50:12 +0000 (11:50 +0000)]
.github: point Repo Lockdown bot to GitLab repo

Use the GitLab repo URL as the main repo location in order to reduce
load on qemu.org.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20210111115017.156802-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 years agoMerge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging
Peter Maydell [Tue, 9 Feb 2021 13:24:37 +0000 (13:24 +0000)]
Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging

Emulated NVMe device updates

  * deallocate or unwritten logical block error feature (me)
  * dataset management command (me)
  * compare command (Gollu Appalanaidu)
  * namespace types (Niklas Cassel)
  * zoned namespaces (Dmitry Fomichev)
  * smart critical warning toggle (Zhenwei Pi)
  * allow cmb and pmr to coexist (me)
  * pmr rds/wds support (Naveen Nagar)
  * cmb v1.4 logic (Padmakar Kalghatgi)

And a lot of smaller fixes from Gollu Appalanaidu and Minwoo Im.

# gpg: Signature made Tue 09 Feb 2021 07:25:18 GMT
# gpg:                using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg:                 aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468  4272 63D5 6FC5 E55D A838
#      Subkey fingerprint: 5228 33AA 75E2 DCE6 A247  66C0 4DE1 AF31 6D4F 0DE9

* remotes/nvme/tags/nvme-next-pull-request: (56 commits)
  hw/block/nvme: refactor the logic for zone write checks
  hw/block/nvme: fix zone boundary check for append
  hw/block/nvme: fix wrong parameter name 'cross_read'
  hw/block/nvme: align with existing style
  hw/block/nvme: fix set feature save field check
  hw/block/nvme: fix set feature for error recovery
  hw/block/nvme: error if drive less than a zone size
  hw/block/nvme: lift cmb restrictions
  hw/block/nvme: bump to v1.4
  hw/block/nvme: move cmb logic to v1.4
  hw/block/nvme: add PMR RDS/WDS support
  hw/block/nvme: disable PMR at boot up
  hw/block/nvme: remove redundant zeroing of PMR registers
  hw/block/nvme: rename PMR/CMB shift/mask fields
  hw/block/nvme: allow cmb and pmr to coexist
  hw/block/nvme: move msix table and pba to BAR 0
  hw/block/nvme: indicate CMB support through controller capabilities register
  hw/block/nvme: fix 64 bit register hi/lo split writes
  hw/block/nvme: add size to mmio read/write trace events
  hw/block/nvme: trigger async event during injecting smart warning
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging
Peter Maydell [Tue, 9 Feb 2021 10:04:51 +0000 (10:04 +0000)]
Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* Fuzzing improvements (Qiuhao, Alexander)
* i386: Fix BMI decoding for instructions with the 0x66 prefix (David)
* initial attempt at fixing event_notifier emulation (Maxim)
* i386: PKS emulation, fix for "qemu-system-i386 -cpu host" (myself)
* meson: RBD test fixes (myself)
* meson: TCI warnings (Philippe)
* Leaner build for --disable-guest-agent, --disable-system and
  --disable-tools (Philippe, Stefan)
* --enable-tcg-interpreter fix (Richard)
* i386: SVM feature bits (Wei)
* KVM bugfix (Thomas H.)
* Add missing MemoryRegionOps callbacks (PJP)

# gpg: Signature made Mon 08 Feb 2021 14:15:35 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (46 commits)
  target/i386: Expose VMX entry/exit load pkrs control bits
  target/i386: Add support for save/load IA32_PKRS MSR
  imx7-ccm: add digprog mmio write method
  tz-ppc: add dummy read/write methods
  spapr_pci: add spapr msi read method
  nvram: add nrf51_soc flash read method
  prep: add ppc-parity write method
  vfio: add quirk device write method
  pci-host: designware: add pcie-msi read method
  hw/pci-host: add pci-intack write method
  cpu-throttle: Remove timer_mod() from cpu_throttle_set()
  replay: rng-builtin support
  pc-bios/descriptors: fix paths in json files
  replay: fix replay of the interrupts
  accel/kvm/kvm-all: Fix wrong return code handling in dirty log code
  qapi/meson: Restrict UI module to system emulation and tools
  qapi/meson: Restrict system-mode specific modules
  qapi/meson: Remove QMP from user-mode emulation
  qapi/meson: Restrict qdev code to system-mode emulation
  meson: Restrict emulation code
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/philmd-gitlab/tags/integration-testing-20210208...
Peter Maydell [Mon, 8 Feb 2021 20:22:54 +0000 (20:22 +0000)]
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/integration-testing-20210208' into staging

Integration testing patches

Tests added:
- Armbian 20.08 on Orange Pi PC (Philippe)
- MPC8544ds machine (Thomas)
- Virtex-ml507 ppc machine (Thomas)
- Re-enable the microblaze test (Thomas)

Various fixes and documentation improvements from Cleber.

# gpg: Signature made Mon 08 Feb 2021 20:19:12 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* remotes/philmd-gitlab/tags/integration-testing-20210208:
  Acceptance Tests: remove unnecessary tag from documentation example
  Acceptance tests: clarify ssh connection failure reason
  tests/acceptance/virtiofs_submounts: required space between IP and port
  tests/acceptance/virtiofs_submounts: standardize port as integer
  tests/acceptance/virtiofs_submounts: use a virtio-net device instead
  tests/acceptance/virtiofs_submounts: do not ask for ssh key password
  tests/acceptance/virtiofs_submounts: use workdir property
  tests/acceptance/boot_linux: rename misleading cloudinit method
  tests/acceptance/boot_linux: fix typo on cloudinit error message
  tests/acceptance: Re-enable the microblaze test
  tests/acceptance: Add a test for the virtex-ml507 ppc machine
  tests/acceptance: Test the mpc8544ds machine
  tests/acceptance: Move the pseries test to a separate file
  tests/acceptance: Test U-Boot/Linux from Armbian 20.08 on Orange Pi PC
  tests/acceptance: Extract do_test_arm_orangepi_armbian_uboot() method
  tests/acceptance: Introduce tesseract_ocr() helper
  tests/acceptance: Extract tesseract_available() helper in new namespace

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/block/nvme: refactor the logic for zone write checks
Klaus Jensen [Tue, 19 Jan 2021 13:21:50 +0000 (14:21 +0100)]
hw/block/nvme: refactor the logic for zone write checks

Refactor the zone write check logic such that the most "meaningful"
error is returned first. That is, first, if the zone is not writable,
return an appropriate status code for that. Then, make sure we are
actually writing at the write pointer and finally check that we do not
cross the zone write boundary. This aligns with the "priority" of status
codes for zone read checks.

Also add a couple of additional descriptive trace events and remove an
always true assert.

Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix zone boundary check for append
Klaus Jensen [Tue, 19 Jan 2021 11:42:58 +0000 (12:42 +0100)]
hw/block/nvme: fix zone boundary check for append

When a zone append is processed the controller checks that validity of
the write before assigning the LBA to the append command. This causes
the boundary check to be wrong.

Fix this by checking the write *after* assigning the LBA. Remove the
append special case from the nvme_check_zone_write and open code it in
nvme_do_write, assigning the slba when basic sanity checks have been
performed. Then check the validity of the resulting write like any other
write command.

In the process, also fix a missing endianness conversion for the zone
append ALBA.

Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix wrong parameter name 'cross_read'
Minwoo Im [Tue, 26 Jan 2021 00:19:24 +0000 (09:19 +0900)]
hw/block/nvme: fix wrong parameter name 'cross_read'

The actual parameter name is 'cross_read' rather than 'cross_zone_read'.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: align with existing style
Gollu Appalanaidu [Sun, 24 Jan 2021 16:30:24 +0000 (22:00 +0530)]
hw/block/nvme: align with existing style

Change status checks to align with the existing style and remove the
explicit check against NVME_SUCCESS.

Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix set feature save field check
Gollu Appalanaidu [Sun, 24 Jan 2021 15:35:32 +0000 (21:05 +0530)]
hw/block/nvme: fix set feature save field check

Currently, no features are saveable, so the current check is not wrong,
but add a check against the feature capabilities to make sure this will
not regress if saveable features are added later.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix set feature for error recovery
Gollu Appalanaidu [Sun, 24 Jan 2021 15:54:40 +0000 (21:24 +0530)]
hw/block/nvme: fix set feature for error recovery

Only enable DULBE if the namespace supports it.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: error if drive less than a zone size
Minwoo Im [Fri, 15 Jan 2021 12:19:20 +0000 (21:19 +0900)]
hw/block/nvme: error if drive less than a zone size

If a user assigns a backing device with less capacity than the size of a
single zone, the namespace capacity will be reported as zero and the
kernel will silently fail to allocate the namespace.

This patch errors out in case that the backing device cannot accomodate
at least a single zone.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
[k.jensen: small fixup in the error and commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: lift cmb restrictions
Klaus Jensen [Thu, 17 Dec 2020 23:32:57 +0000 (00:32 +0100)]
hw/block/nvme: lift cmb restrictions

The controller now implements v1.4 and we can lift the restrictions on
CMB Data Pointer and Command Independent Locations Support (CDPCILS) and
CMB Data Pointer Mixed Locations Support (CDPMLS) since the device
really does not care about mixed host/cmb pointers in those cases.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: bump to v1.4
Klaus Jensen [Wed, 13 Jan 2021 09:19:44 +0000 (10:19 +0100)]
hw/block/nvme: bump to v1.4

With the new CMB logic in place, bump the implemented specification
version to v1.4 by default.

This requires adding the setting the CNTRLTYPE field and modifying the
VWC field since 0x00 is no longer a valid value for bits 2:1.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: move cmb logic to v1.4
Padmakar Kalghatgi [Thu, 17 Dec 2020 23:32:16 +0000 (00:32 +0100)]
hw/block/nvme: move cmb logic to v1.4

Implement v1.4 logic for configuring the Controller Memory Buffer. By
default, the v1.4 scheme will be used (CMB must be explicitly enabled by
the host), so drivers that only support v1.3 will not be able to use the
CMB anymore.

To retain the v1.3 behavior, set the boolean 'legacy-cmb' nvme device
parameter.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: add PMR RDS/WDS support
Naveen Nagar [Fri, 13 Nov 2020 05:30:05 +0000 (11:00 +0530)]
hw/block/nvme: add PMR RDS/WDS support

Add support for the PMRMSCL and PMRMSCU MMIO registers. This allows
adding RDS/WDS support for PMR as well.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: disable PMR at boot up
Klaus Jensen [Fri, 18 Dec 2020 12:04:19 +0000 (13:04 +0100)]
hw/block/nvme: disable PMR at boot up

The PMR should not be enabled at boot up. Disable the PMR MemoryRegion
initially and implement MMIO for PMRCTL, allowing the host to enable the
PMR explicitly.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove redundant zeroing of PMR registers
Klaus Jensen [Fri, 18 Dec 2020 12:03:09 +0000 (13:03 +0100)]
hw/block/nvme: remove redundant zeroing of PMR registers

The controller registers are initially zero. Remove the redundant
zeroing.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: rename PMR/CMB shift/mask fields
Klaus Jensen [Fri, 18 Dec 2020 11:54:45 +0000 (12:54 +0100)]
hw/block/nvme: rename PMR/CMB shift/mask fields

Use the correct field names.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: allow cmb and pmr to coexist
Klaus Jensen [Fri, 13 Nov 2020 08:57:13 +0000 (09:57 +0100)]
hw/block/nvme: allow cmb and pmr to coexist

With BAR 4 now free to use, allow PMR and CMB to be enabled
simultaneously.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: move msix table and pba to BAR 0
Klaus Jensen [Fri, 13 Nov 2020 08:50:33 +0000 (09:50 +0100)]
hw/block/nvme: move msix table and pba to BAR 0

In the interest of supporting both CMB and PMR to be enabled on the same
device, move the MSI-X table and pending bit array out of BAR 4 and into
BAR 0.

This is a simplified version of the patch contributed by Andrzej
Jakowski (see [1]). Leaving the CMB at offset 0 removes the need for
changes to CMB address mapping code.

  [1]: https://lore.kernel.org/qemu-devel/20200729220107.37758-3-andrzej.jakowski@linux.intel.com/

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Tested-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: indicate CMB support through controller capabilities register
Andrzej Jakowski [Fri, 13 Nov 2020 07:00:47 +0000 (08:00 +0100)]
hw/block/nvme: indicate CMB support through controller capabilities register

This patch sets CMBS bit in controller capabilities register when user
configures NVMe driver with CMB support, so capabilites are correctly
reported to guest OS.

Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
Reviewed-by: Maxim Levitsky <mlevitsky@gmail.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix 64 bit register hi/lo split writes
Klaus Jensen [Mon, 18 Jan 2021 06:31:45 +0000 (07:31 +0100)]
hw/block/nvme: fix 64 bit register hi/lo split writes

64 bit registers like ASQ and ACQ should be writable by both a hi/lo 32
bit write combination as well as a plain 64 bit write. The spec does not
define ordering on the hi/lo split, but the code currently assumes that
the low order bits are written first. Additionally, the code does not
consider that another address might already have been written into the
register, causing the OR'ing to result in a bad address.

Fix this by explicitly overwriting only the low or high order bits for
32 bit writes.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
3 years agohw/block/nvme: add size to mmio read/write trace events
Klaus Jensen [Mon, 18 Jan 2021 06:30:50 +0000 (07:30 +0100)]
hw/block/nvme: add size to mmio read/write trace events

Add the size of the mmio read/write to the trace event.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: trigger async event during injecting smart warning
zhenwei pi [Fri, 15 Jan 2021 03:27:02 +0000 (11:27 +0800)]
hw/block/nvme: trigger async event during injecting smart warning

During smart critical warning injection by setting property from QMP
command, also try to trigger asynchronous event.

Suggested by Keith, if a event has already been raised, there is no
need to enqueue the duplicate event any more.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
[k.jensen: fix typo in commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: add smart_critical_warning property
zhenwei pi [Fri, 15 Jan 2021 03:27:01 +0000 (11:27 +0800)]
hw/block/nvme: add smart_critical_warning property

There is a very low probability that hitting physical NVMe disk
hardware critical warning case, it's hard to write & test a monitor
agent service.

For debugging purposes, add a new 'smart_critical_warning' property
to emulate this situation.

The orignal version of this change is implemented by adding a fixed
property which could be initialized by QEMU command line. Suggested
by Philippe & Klaus, rework like current version.

Test with this patch:
1, change smart_critical_warning property for a running VM:
 #virsh qemu-monitor-command nvme-upstream '{ "execute": "qom-set",
  "arguments": { "path": "/machine/peripheral-anon/device[0]",
  "property": "smart_critical_warning", "value":16 } }'
2, run smartctl in guest
 #smartctl -H -l error /dev/nvme0n1

  === START OF SMART DATA SECTION ===
  SMART overall-health self-assessment test result: FAILED!
  - volatile memory backup device has failed

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agonvme: introduce bit 5 for critical warning
zhenwei pi [Fri, 15 Jan 2021 03:27:00 +0000 (11:27 +0800)]
nvme: introduce bit 5 for critical warning

According to NVM Express v1.4, Section 5.14.1.2 ("SMART / Health
Information"), introduce bit 5 for "Persistent Memory Region has become
read-only or unreliable".

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
[k.jensen: minor brush ups in commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix zone write finalize
Klaus Jensen [Tue, 12 Jan 2021 09:32:37 +0000 (10:32 +0100)]
hw/block/nvme: fix zone write finalize

The zone write pointer is unconditionally advanced, even for write
faults. Make sure that the zone is always transitioned to Full if the
write pointer reaches zone capacity.

Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unused argument in nvme_ns_setup
Minwoo Im [Sun, 17 Jan 2021 14:53:35 +0000 (23:53 +0900)]
hw/block/nvme: remove unused argument in nvme_ns_setup

nvme_ns_setup() finally does not have nothing to do with NvmeCtrl
instance.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: split setup and register for namespace
Minwoo Im [Sun, 17 Jan 2021 14:53:34 +0000 (23:53 +0900)]
hw/block/nvme: split setup and register for namespace

In NVMe, namespace is being attached to process I/O.  We register NVMe
namespace to a controller via nvme_register_namespace() during
nvme_ns_setup().  This is main reason of receiving NvmeCtrl object
instance to this function to map the namespace to a controller.

To make namespace instance more independent, it should be split into two
parts: setup and register.  This patch split them into two differnt
parts, and finally nvme_ns_setup() does not have nothing to do with
NvmeCtrl instance at all.

This patch is a former patch to introduce NVMe subsystem scheme to the
existing design especially for multi-path.  In that case, it should be
split into two to make namespace independent from a controller.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unused argument in nvme_ns_init_blk
Minwoo Im [Sun, 17 Jan 2021 14:53:33 +0000 (23:53 +0900)]
hw/block/nvme: remove unused argument in nvme_ns_init_blk

Removed no longer used aregument NvmeCtrl object in nvme_ns_init_blk().

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: open code for volatile write cache
Minwoo Im [Sun, 17 Jan 2021 14:53:32 +0000 (23:53 +0900)]
hw/block/nvme: open code for volatile write cache

Volatile Write Cache(VWC) feature is set in nvme_ns_setup() in the
initial time.  This feature is related to block device backed,  but this
feature is controlled in controller level via Set/Get Features command.

This patch removed dependency between nvme and nvme-ns to manage the VWC
flag value.  Also, it open coded the Get Features for VWC to check all
namespaces attached to the controller, and if false detected, return
directly false.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
[k.jensen: report write cache preset if present on ANY namespace]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unused argument in nvme_ns_init_zoned
Minwoo Im [Sun, 17 Jan 2021 14:53:31 +0000 (23:53 +0900)]
hw/block/nvme: remove unused argument in nvme_ns_init_zoned

nvme_ns_init_zoned() has no use for given NvmeCtrl object.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Correct error status for unaligned ZA
Dmitry Fomichev [Mon, 18 Jan 2021 03:39:17 +0000 (12:39 +0900)]
hw/block/nvme: Correct error status for unaligned ZA

TP 4053 says (in section 2.3.1.1) -
... if a Zone Append command specifies a ZSLBA that is not the lowest
logical block address in that zone, then the controller shall abort
that command with a status code of Invalid Field In Command.

In the code, Zone Invalid Write is returned instead, fix this.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unnecessary check for append
Klaus Jensen [Thu, 10 Dec 2020 07:11:32 +0000 (08:11 +0100)]
hw/block/nvme: remove unnecessary check for append

nvme_io_cmd already checks if the namespace supports the Zone Append
command, so the removed check is dead code.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: add missing string representations for commands
Klaus Jensen [Wed, 9 Dec 2020 22:55:06 +0000 (23:55 +0100)]
hw/block/nvme: add missing string representations for commands

Add missing string representations for a couple of new commands.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: zero out zones on reset
Klaus Jensen [Wed, 9 Dec 2020 22:43:15 +0000 (23:43 +0100)]
hw/block/nvme: zero out zones on reset

The zoned command set specification states that "All logical blocks in a
zone *shall* be marked as deallocated when [the zone is reset]". Since
the device guarantees 0x00 to be read from deallocated blocks we have to
issue a pwrite_zeroes since we cannot be sure that a discard will do
anything. But typically, this will be achieved with an efficient
unmap/discard operation.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: enum style fix
Klaus Jensen [Wed, 9 Dec 2020 22:12:49 +0000 (23:12 +0100)]
hw/block/nvme: enum style fix

Align with existing style and use a typedef for header-file enums.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: merge implicitly/explicitly opened processing masks
Klaus Jensen [Wed, 9 Dec 2020 22:00:20 +0000 (23:00 +0100)]
hw/block/nvme: merge implicitly/explicitly opened processing masks

Implicitly and explicitly opended zones are always bulk processed
together, so merge the two processing masks.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: fix shutdown/reset logic
Klaus Jensen [Wed, 9 Dec 2020 12:10:45 +0000 (13:10 +0100)]
hw/block/nvme: fix shutdown/reset logic

A shutdown is only about flushing stuff. It is the host that should
delete any queues, so do not perform a reset here.

Also, on shutdown, make sure that the PMR is flushed if in use.

Fixes: 368f4e752cf9 ("hw/block/nvme: Process controller reset and shutdown differently")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: conditionally enable DULBE for zoned namespaces
Klaus Jensen [Mon, 11 Jan 2021 12:52:40 +0000 (13:52 +0100)]
hw/block/nvme: conditionally enable DULBE for zoned namespaces

The device uses the BDRV_BLOCK_ZERO flag to determine the "deallocated"
status of logical blocks. Since the zoned namespaces command set
specification defines that logical blocks SHALL be marked as deallocated
when the zone is in the Empty or Offline states, DULBE can only be
supported if the zone size is a multiple of the calculated deallocation
granularity (reported in NPDG) which depends on the underlying block
device cluster size (if applicable) or the configured
discard_granularity.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix for non-msix machines
Klaus Jensen [Tue, 12 Jan 2021 12:30:26 +0000 (13:30 +0100)]
hw/block/nvme: fix for non-msix machines

Commit 1c0c2163aa08 ("hw/block/nvme: verify msix_init_exclusive_bar()
return value") had the unintended effect of breaking support on
several platforms not supporting MSI-X.

Still check for errors, but only report that MSI-X is unsupported
instead of bailing out.

Fixes: 1c0c2163aa08 ("hw/block/nvme: verify msix_init_exclusive_bar() return value")
Fixes: fbf2e5375e33 ("hw/block/nvme: Verify msix_vector_use() returned value")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Document zoned parameters in usage text
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:10 +0000 (05:04 +0900)]
hw/block/nvme: Document zoned parameters in usage text

Added brief descriptions of the new device properties that are
now available to users to configure features of Zoned Namespace
Command Set in the emulator.

This patch is for documentation only, no functionality change.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>