OSDN Git Service

qmiga/qemu.git
3 years agoppc/pnv: Introduce a LPC FW memory region attribute to map the PNOR
Cédric Le Goater [Tue, 26 Jan 2021 17:10:59 +0000 (18:10 +0100)]
ppc/pnv: Introduce a LPC FW memory region attribute to map the PNOR

This to map the PNOR from the machine init handler directly and finish
the cleanup of the LPC model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-8-clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Remove default disablement of the PNOR contents
Cédric Le Goater [Tue, 26 Jan 2021 17:10:58 +0000 (18:10 +0100)]
ppc/pnv: Remove default disablement of the PNOR contents

On PowerNV systems, the BMC is in charge of mapping the PNOR contents
on the LPC FW address space using the HIOMAP protocol. Under QEMU, we
emulate this behavior and we also add an extra control on the flash
accesses by letting the HIOMAP command handler decide whether the
memory region is accessible or not depending on the firmware requests.

However, this behavior is not compatible with hostboot like firmwares
which need this mapping to be always available. For this reason, the
PNOR memory region is initially disabled for skiboot mode only.

This is badly placed under the LPC model and requires the use of the
machine. Since it doesn't add much, simply remove the initial setting.
The extra control in the HIOMAP command handler will still be performed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-7-clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Discard internal BMC initialization when BMC is external
Cédric Le Goater [Tue, 26 Jan 2021 17:10:57 +0000 (18:10 +0100)]
ppc/pnv: Discard internal BMC initialization when BMC is external

The PowerNV machine can be run with an external IPMI BMC device
connected to a remote QEMU machine acting as BMC, using these options :

  -chardev socket,id=ipmi0,host=localhost,port=9002,reconnect=10 \
  -device ipmi-bmc-extern,id=bmc0,chardev=ipmi0 \
  -device isa-ipmi-bt,bmc=bmc0,irq=10 \
  -nodefaults

In that case, some aspects of the BMC initialization should be
skipped, since they rely on the simulator interface.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-6-clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Simplify pnv_bmc_create()
Cédric Le Goater [Tue, 26 Jan 2021 17:10:56 +0000 (18:10 +0100)]
ppc/pnv: Simplify pnv_bmc_create()

and reuse pnv_bmc_set_pnor() to share the setting of the PNOR.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Use skiboot addresses to load kernel and ramfs
Cédric Le Goater [Tue, 26 Jan 2021 17:10:55 +0000 (18:10 +0100)]
ppc/pnv: Use skiboot addresses to load kernel and ramfs

The current settings are useful to load large kernels (with debug) but
it moves the initrd image in a memory region not protected by
skiboot. If skiboot is compiled with DEBUG=1, memory poisoning will
corrupt the initrd.

Cc: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-4-clg@kaod.org>
Reviewed-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/xive: Add firmware bit when dumping the ENDs
Cédric Le Goater [Tue, 26 Jan 2021 17:10:54 +0000 (18:10 +0100)]
ppc/xive: Add firmware bit when dumping the ENDs

ENDs allocated by OPAL for the HW thread VPs are tagged as owned by FW.
Dump the state in 'info pic'.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pnv: Add trace events for PCI event notification
Cédric Le Goater [Tue, 26 Jan 2021 17:10:53 +0000 (18:10 +0100)]
ppc/pnv: Add trace events for PCI event notification

On POWER9 systems, PHB controllers signal the XIVE interrupt controller
of a source interrupt notification using a store on a MMIO region. Add
traces for such events.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Remove unused MMU definitions
Philippe Mathieu-Daudé [Wed, 27 Jan 2021 23:24:01 +0000 (00:24 +0100)]
target/ppc: Remove unused MMU definitions

Remove these confusing and unused definitions.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210127232401.3525126-1-f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr: Adjust firmware path of PCI devices
Greg Kurz [Fri, 22 Jan 2021 17:01:57 +0000 (18:01 +0100)]
spapr: Adjust firmware path of PCI devices

It is currently not possible to perform a strict boot from USB storage:

$ qemu-system-ppc64 -accel kvm -nodefaults -nographic -serial stdio \
-boot strict=on \
-device qemu-xhci \
-device usb-storage,drive=disk,bootindex=0 \
-blockdev driver=file,node-name=disk,filename=fedora-ppc64le.qcow2

SLOF **********************************************************************
QEMU Starting
 Build Date = Jul 17 2020 11:15:24
 FW Version = git-e18ddad8516ff2cf
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
                     00 0000 (D) : 1b36 000d    serial bus [ usb-xhci ]
No NVRAM common partition, re-initializing...
Scanning USB
  XHCI: Initializing
    USB Storage
       SCSI: Looking for devices
          101000000000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
Using default console: /vdevice/vty@71000000

  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Trying to load:  from: /pci@800000020000000/usb@0/storage@1/disk@101000000000000 ...
E3405: No such device

E3407: Load failed

  Type 'boot' and press return to continue booting the system.
  Type 'reset-all' and press return to reboot the system.

Ready!
0 >

The device tree handed over by QEMU to SLOF indeed contains:

qemu,boot-list =
"/pci@800000020000000/usb@0/storage@1/disk@101000000000000 HALT";

but the device node is named usb-xhci@0, not usb@0.

This happens because the firmware names of PCI devices returned
by get_boot_devices_list() come from pcibus_get_fw_dev_path(),
while the sPAPR PHB code uses a different naming scheme for
device nodes. This inconsistency has always been there but it was
hidden for a long time because SLOF used to rename USB device
nodes, until this commit, merged in QEMU 4.2.0 :

commit 85164ad4ed9960cac842fa4cc067c6b6699b0994
Author: Alexey Kardashevskiy <aik@ozlabs.ru>
Date:   Wed Sep 11 16:24:32 2019 +1000

    pseries: Update SLOF firmware image

    This fixes USB host bus adapter name in the device tree to match QEMU's
    one.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fortunately, sPAPR implements the firmware path provider interface.
This provides a way to override the default firmware paths.

Just factor out the sPAPR PHB naming logic from spapr_dt_pci_device()
to a helper, and use it in the sPAPR firmware path provider hook.

Fixes: 85164ad4ed99 ("pseries: Update SLOF firmware image")
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210122170157.246374-1-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr.c: add 'name' property for hotplugged CPUs nodes
Daniel Henrique Barboza [Wed, 20 Jan 2021 23:23:05 +0000 (20:23 -0300)]
spapr.c: add 'name' property for hotplugged CPUs nodes

In the CPU hotunplug bug [1] the guest kernel throws a scary
message in dmesg:

pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16

The reason isn't related to the bug though. This happens because the
kernel file arch/powerpc/platform/pseries/hotplug-cpu.c, function
dlpar_cpu_remove(), is not finding the device_node.name of the offending
CPU.

We're not populating the 'name' property for hotplugged CPUs. Since the
kernel relies on device_node.name for identifying CPU nodes, and the
CPUs that are coldplugged has the 'name' property filled by SLOF, this
is creating an unneeded inconsistency between hotplug and coldplug CPUs
in the kernel.

Let's fill the 'name' property for hotplugged CPUs as well. This will
make the guest dmesg throws a less intimidating message when we try to
unplug the last online CPU:

pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9@1, rc: -16

[1] https://bugzilla.redhat.com/1911414

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210120232305.241521-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr.c: use g_auto* with 'nodename' in CPU DT functions
Daniel Henrique Barboza [Wed, 20 Jan 2021 23:23:04 +0000 (20:23 -0300)]
spapr.c: use g_auto* with 'nodename' in CPU DT functions

Next patch will use the 'nodename' string in spapr_core_dt_populate()
after the point it's being freed today.

Instead of moving 'g_free(nodename)' around, let's do a QoL change in
both CPU DT functions where 'nodename' is being freed, and use
g_autofree to avoid the 'g_free()' call altogether.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210120232305.241521-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoMerge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging
Peter Maydell [Tue, 9 Feb 2021 13:24:37 +0000 (13:24 +0000)]
Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging

Emulated NVMe device updates

  * deallocate or unwritten logical block error feature (me)
  * dataset management command (me)
  * compare command (Gollu Appalanaidu)
  * namespace types (Niklas Cassel)
  * zoned namespaces (Dmitry Fomichev)
  * smart critical warning toggle (Zhenwei Pi)
  * allow cmb and pmr to coexist (me)
  * pmr rds/wds support (Naveen Nagar)
  * cmb v1.4 logic (Padmakar Kalghatgi)

And a lot of smaller fixes from Gollu Appalanaidu and Minwoo Im.

# gpg: Signature made Tue 09 Feb 2021 07:25:18 GMT
# gpg:                using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg:                 aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468  4272 63D5 6FC5 E55D A838
#      Subkey fingerprint: 5228 33AA 75E2 DCE6 A247  66C0 4DE1 AF31 6D4F 0DE9

* remotes/nvme/tags/nvme-next-pull-request: (56 commits)
  hw/block/nvme: refactor the logic for zone write checks
  hw/block/nvme: fix zone boundary check for append
  hw/block/nvme: fix wrong parameter name 'cross_read'
  hw/block/nvme: align with existing style
  hw/block/nvme: fix set feature save field check
  hw/block/nvme: fix set feature for error recovery
  hw/block/nvme: error if drive less than a zone size
  hw/block/nvme: lift cmb restrictions
  hw/block/nvme: bump to v1.4
  hw/block/nvme: move cmb logic to v1.4
  hw/block/nvme: add PMR RDS/WDS support
  hw/block/nvme: disable PMR at boot up
  hw/block/nvme: remove redundant zeroing of PMR registers
  hw/block/nvme: rename PMR/CMB shift/mask fields
  hw/block/nvme: allow cmb and pmr to coexist
  hw/block/nvme: move msix table and pba to BAR 0
  hw/block/nvme: indicate CMB support through controller capabilities register
  hw/block/nvme: fix 64 bit register hi/lo split writes
  hw/block/nvme: add size to mmio read/write trace events
  hw/block/nvme: trigger async event during injecting smart warning
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging
Peter Maydell [Tue, 9 Feb 2021 10:04:51 +0000 (10:04 +0000)]
Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* Fuzzing improvements (Qiuhao, Alexander)
* i386: Fix BMI decoding for instructions with the 0x66 prefix (David)
* initial attempt at fixing event_notifier emulation (Maxim)
* i386: PKS emulation, fix for "qemu-system-i386 -cpu host" (myself)
* meson: RBD test fixes (myself)
* meson: TCI warnings (Philippe)
* Leaner build for --disable-guest-agent, --disable-system and
  --disable-tools (Philippe, Stefan)
* --enable-tcg-interpreter fix (Richard)
* i386: SVM feature bits (Wei)
* KVM bugfix (Thomas H.)
* Add missing MemoryRegionOps callbacks (PJP)

# gpg: Signature made Mon 08 Feb 2021 14:15:35 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (46 commits)
  target/i386: Expose VMX entry/exit load pkrs control bits
  target/i386: Add support for save/load IA32_PKRS MSR
  imx7-ccm: add digprog mmio write method
  tz-ppc: add dummy read/write methods
  spapr_pci: add spapr msi read method
  nvram: add nrf51_soc flash read method
  prep: add ppc-parity write method
  vfio: add quirk device write method
  pci-host: designware: add pcie-msi read method
  hw/pci-host: add pci-intack write method
  cpu-throttle: Remove timer_mod() from cpu_throttle_set()
  replay: rng-builtin support
  pc-bios/descriptors: fix paths in json files
  replay: fix replay of the interrupts
  accel/kvm/kvm-all: Fix wrong return code handling in dirty log code
  qapi/meson: Restrict UI module to system emulation and tools
  qapi/meson: Restrict system-mode specific modules
  qapi/meson: Remove QMP from user-mode emulation
  qapi/meson: Restrict qdev code to system-mode emulation
  meson: Restrict emulation code
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/philmd-gitlab/tags/integration-testing-20210208...
Peter Maydell [Mon, 8 Feb 2021 20:22:54 +0000 (20:22 +0000)]
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/integration-testing-20210208' into staging

Integration testing patches

Tests added:
- Armbian 20.08 on Orange Pi PC (Philippe)
- MPC8544ds machine (Thomas)
- Virtex-ml507 ppc machine (Thomas)
- Re-enable the microblaze test (Thomas)

Various fixes and documentation improvements from Cleber.

# gpg: Signature made Mon 08 Feb 2021 20:19:12 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* remotes/philmd-gitlab/tags/integration-testing-20210208:
  Acceptance Tests: remove unnecessary tag from documentation example
  Acceptance tests: clarify ssh connection failure reason
  tests/acceptance/virtiofs_submounts: required space between IP and port
  tests/acceptance/virtiofs_submounts: standardize port as integer
  tests/acceptance/virtiofs_submounts: use a virtio-net device instead
  tests/acceptance/virtiofs_submounts: do not ask for ssh key password
  tests/acceptance/virtiofs_submounts: use workdir property
  tests/acceptance/boot_linux: rename misleading cloudinit method
  tests/acceptance/boot_linux: fix typo on cloudinit error message
  tests/acceptance: Re-enable the microblaze test
  tests/acceptance: Add a test for the virtex-ml507 ppc machine
  tests/acceptance: Test the mpc8544ds machine
  tests/acceptance: Move the pseries test to a separate file
  tests/acceptance: Test U-Boot/Linux from Armbian 20.08 on Orange Pi PC
  tests/acceptance: Extract do_test_arm_orangepi_armbian_uboot() method
  tests/acceptance: Introduce tesseract_ocr() helper
  tests/acceptance: Extract tesseract_available() helper in new namespace

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/block/nvme: refactor the logic for zone write checks
Klaus Jensen [Tue, 19 Jan 2021 13:21:50 +0000 (14:21 +0100)]
hw/block/nvme: refactor the logic for zone write checks

Refactor the zone write check logic such that the most "meaningful"
error is returned first. That is, first, if the zone is not writable,
return an appropriate status code for that. Then, make sure we are
actually writing at the write pointer and finally check that we do not
cross the zone write boundary. This aligns with the "priority" of status
codes for zone read checks.

Also add a couple of additional descriptive trace events and remove an
always true assert.

Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix zone boundary check for append
Klaus Jensen [Tue, 19 Jan 2021 11:42:58 +0000 (12:42 +0100)]
hw/block/nvme: fix zone boundary check for append

When a zone append is processed the controller checks that validity of
the write before assigning the LBA to the append command. This causes
the boundary check to be wrong.

Fix this by checking the write *after* assigning the LBA. Remove the
append special case from the nvme_check_zone_write and open code it in
nvme_do_write, assigning the slba when basic sanity checks have been
performed. Then check the validity of the resulting write like any other
write command.

In the process, also fix a missing endianness conversion for the zone
append ALBA.

Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix wrong parameter name 'cross_read'
Minwoo Im [Tue, 26 Jan 2021 00:19:24 +0000 (09:19 +0900)]
hw/block/nvme: fix wrong parameter name 'cross_read'

The actual parameter name is 'cross_read' rather than 'cross_zone_read'.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: align with existing style
Gollu Appalanaidu [Sun, 24 Jan 2021 16:30:24 +0000 (22:00 +0530)]
hw/block/nvme: align with existing style

Change status checks to align with the existing style and remove the
explicit check against NVME_SUCCESS.

Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix set feature save field check
Gollu Appalanaidu [Sun, 24 Jan 2021 15:35:32 +0000 (21:05 +0530)]
hw/block/nvme: fix set feature save field check

Currently, no features are saveable, so the current check is not wrong,
but add a check against the feature capabilities to make sure this will
not regress if saveable features are added later.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix set feature for error recovery
Gollu Appalanaidu [Sun, 24 Jan 2021 15:54:40 +0000 (21:24 +0530)]
hw/block/nvme: fix set feature for error recovery

Only enable DULBE if the namespace supports it.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: error if drive less than a zone size
Minwoo Im [Fri, 15 Jan 2021 12:19:20 +0000 (21:19 +0900)]
hw/block/nvme: error if drive less than a zone size

If a user assigns a backing device with less capacity than the size of a
single zone, the namespace capacity will be reported as zero and the
kernel will silently fail to allocate the namespace.

This patch errors out in case that the backing device cannot accomodate
at least a single zone.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
[k.jensen: small fixup in the error and commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: lift cmb restrictions
Klaus Jensen [Thu, 17 Dec 2020 23:32:57 +0000 (00:32 +0100)]
hw/block/nvme: lift cmb restrictions

The controller now implements v1.4 and we can lift the restrictions on
CMB Data Pointer and Command Independent Locations Support (CDPCILS) and
CMB Data Pointer Mixed Locations Support (CDPMLS) since the device
really does not care about mixed host/cmb pointers in those cases.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: bump to v1.4
Klaus Jensen [Wed, 13 Jan 2021 09:19:44 +0000 (10:19 +0100)]
hw/block/nvme: bump to v1.4

With the new CMB logic in place, bump the implemented specification
version to v1.4 by default.

This requires adding the setting the CNTRLTYPE field and modifying the
VWC field since 0x00 is no longer a valid value for bits 2:1.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: move cmb logic to v1.4
Padmakar Kalghatgi [Thu, 17 Dec 2020 23:32:16 +0000 (00:32 +0100)]
hw/block/nvme: move cmb logic to v1.4

Implement v1.4 logic for configuring the Controller Memory Buffer. By
default, the v1.4 scheme will be used (CMB must be explicitly enabled by
the host), so drivers that only support v1.3 will not be able to use the
CMB anymore.

To retain the v1.3 behavior, set the boolean 'legacy-cmb' nvme device
parameter.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: add PMR RDS/WDS support
Naveen Nagar [Fri, 13 Nov 2020 05:30:05 +0000 (11:00 +0530)]
hw/block/nvme: add PMR RDS/WDS support

Add support for the PMRMSCL and PMRMSCU MMIO registers. This allows
adding RDS/WDS support for PMR as well.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: disable PMR at boot up
Klaus Jensen [Fri, 18 Dec 2020 12:04:19 +0000 (13:04 +0100)]
hw/block/nvme: disable PMR at boot up

The PMR should not be enabled at boot up. Disable the PMR MemoryRegion
initially and implement MMIO for PMRCTL, allowing the host to enable the
PMR explicitly.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove redundant zeroing of PMR registers
Klaus Jensen [Fri, 18 Dec 2020 12:03:09 +0000 (13:03 +0100)]
hw/block/nvme: remove redundant zeroing of PMR registers

The controller registers are initially zero. Remove the redundant
zeroing.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: rename PMR/CMB shift/mask fields
Klaus Jensen [Fri, 18 Dec 2020 11:54:45 +0000 (12:54 +0100)]
hw/block/nvme: rename PMR/CMB shift/mask fields

Use the correct field names.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: allow cmb and pmr to coexist
Klaus Jensen [Fri, 13 Nov 2020 08:57:13 +0000 (09:57 +0100)]
hw/block/nvme: allow cmb and pmr to coexist

With BAR 4 now free to use, allow PMR and CMB to be enabled
simultaneously.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: move msix table and pba to BAR 0
Klaus Jensen [Fri, 13 Nov 2020 08:50:33 +0000 (09:50 +0100)]
hw/block/nvme: move msix table and pba to BAR 0

In the interest of supporting both CMB and PMR to be enabled on the same
device, move the MSI-X table and pending bit array out of BAR 4 and into
BAR 0.

This is a simplified version of the patch contributed by Andrzej
Jakowski (see [1]). Leaving the CMB at offset 0 removes the need for
changes to CMB address mapping code.

  [1]: https://lore.kernel.org/qemu-devel/20200729220107.37758-3-andrzej.jakowski@linux.intel.com/

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Tested-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: indicate CMB support through controller capabilities register
Andrzej Jakowski [Fri, 13 Nov 2020 07:00:47 +0000 (08:00 +0100)]
hw/block/nvme: indicate CMB support through controller capabilities register

This patch sets CMBS bit in controller capabilities register when user
configures NVMe driver with CMB support, so capabilites are correctly
reported to guest OS.

Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
Reviewed-by: Maxim Levitsky <mlevitsky@gmail.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix 64 bit register hi/lo split writes
Klaus Jensen [Mon, 18 Jan 2021 06:31:45 +0000 (07:31 +0100)]
hw/block/nvme: fix 64 bit register hi/lo split writes

64 bit registers like ASQ and ACQ should be writable by both a hi/lo 32
bit write combination as well as a plain 64 bit write. The spec does not
define ordering on the hi/lo split, but the code currently assumes that
the low order bits are written first. Additionally, the code does not
consider that another address might already have been written into the
register, causing the OR'ing to result in a bad address.

Fix this by explicitly overwriting only the low or high order bits for
32 bit writes.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
3 years agohw/block/nvme: add size to mmio read/write trace events
Klaus Jensen [Mon, 18 Jan 2021 06:30:50 +0000 (07:30 +0100)]
hw/block/nvme: add size to mmio read/write trace events

Add the size of the mmio read/write to the trace event.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: trigger async event during injecting smart warning
zhenwei pi [Fri, 15 Jan 2021 03:27:02 +0000 (11:27 +0800)]
hw/block/nvme: trigger async event during injecting smart warning

During smart critical warning injection by setting property from QMP
command, also try to trigger asynchronous event.

Suggested by Keith, if a event has already been raised, there is no
need to enqueue the duplicate event any more.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
[k.jensen: fix typo in commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: add smart_critical_warning property
zhenwei pi [Fri, 15 Jan 2021 03:27:01 +0000 (11:27 +0800)]
hw/block/nvme: add smart_critical_warning property

There is a very low probability that hitting physical NVMe disk
hardware critical warning case, it's hard to write & test a monitor
agent service.

For debugging purposes, add a new 'smart_critical_warning' property
to emulate this situation.

The orignal version of this change is implemented by adding a fixed
property which could be initialized by QEMU command line. Suggested
by Philippe & Klaus, rework like current version.

Test with this patch:
1, change smart_critical_warning property for a running VM:
 #virsh qemu-monitor-command nvme-upstream '{ "execute": "qom-set",
  "arguments": { "path": "/machine/peripheral-anon/device[0]",
  "property": "smart_critical_warning", "value":16 } }'
2, run smartctl in guest
 #smartctl -H -l error /dev/nvme0n1

  === START OF SMART DATA SECTION ===
  SMART overall-health self-assessment test result: FAILED!
  - volatile memory backup device has failed

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agonvme: introduce bit 5 for critical warning
zhenwei pi [Fri, 15 Jan 2021 03:27:00 +0000 (11:27 +0800)]
nvme: introduce bit 5 for critical warning

According to NVM Express v1.4, Section 5.14.1.2 ("SMART / Health
Information"), introduce bit 5 for "Persistent Memory Region has become
read-only or unreliable".

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
[k.jensen: minor brush ups in commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix zone write finalize
Klaus Jensen [Tue, 12 Jan 2021 09:32:37 +0000 (10:32 +0100)]
hw/block/nvme: fix zone write finalize

The zone write pointer is unconditionally advanced, even for write
faults. Make sure that the zone is always transitioned to Full if the
write pointer reaches zone capacity.

Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unused argument in nvme_ns_setup
Minwoo Im [Sun, 17 Jan 2021 14:53:35 +0000 (23:53 +0900)]
hw/block/nvme: remove unused argument in nvme_ns_setup

nvme_ns_setup() finally does not have nothing to do with NvmeCtrl
instance.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: split setup and register for namespace
Minwoo Im [Sun, 17 Jan 2021 14:53:34 +0000 (23:53 +0900)]
hw/block/nvme: split setup and register for namespace

In NVMe, namespace is being attached to process I/O.  We register NVMe
namespace to a controller via nvme_register_namespace() during
nvme_ns_setup().  This is main reason of receiving NvmeCtrl object
instance to this function to map the namespace to a controller.

To make namespace instance more independent, it should be split into two
parts: setup and register.  This patch split them into two differnt
parts, and finally nvme_ns_setup() does not have nothing to do with
NvmeCtrl instance at all.

This patch is a former patch to introduce NVMe subsystem scheme to the
existing design especially for multi-path.  In that case, it should be
split into two to make namespace independent from a controller.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unused argument in nvme_ns_init_blk
Minwoo Im [Sun, 17 Jan 2021 14:53:33 +0000 (23:53 +0900)]
hw/block/nvme: remove unused argument in nvme_ns_init_blk

Removed no longer used aregument NvmeCtrl object in nvme_ns_init_blk().

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: open code for volatile write cache
Minwoo Im [Sun, 17 Jan 2021 14:53:32 +0000 (23:53 +0900)]
hw/block/nvme: open code for volatile write cache

Volatile Write Cache(VWC) feature is set in nvme_ns_setup() in the
initial time.  This feature is related to block device backed,  but this
feature is controlled in controller level via Set/Get Features command.

This patch removed dependency between nvme and nvme-ns to manage the VWC
flag value.  Also, it open coded the Get Features for VWC to check all
namespaces attached to the controller, and if false detected, return
directly false.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
[k.jensen: report write cache preset if present on ANY namespace]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unused argument in nvme_ns_init_zoned
Minwoo Im [Sun, 17 Jan 2021 14:53:31 +0000 (23:53 +0900)]
hw/block/nvme: remove unused argument in nvme_ns_init_zoned

nvme_ns_init_zoned() has no use for given NvmeCtrl object.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Correct error status for unaligned ZA
Dmitry Fomichev [Mon, 18 Jan 2021 03:39:17 +0000 (12:39 +0900)]
hw/block/nvme: Correct error status for unaligned ZA

TP 4053 says (in section 2.3.1.1) -
... if a Zone Append command specifies a ZSLBA that is not the lowest
logical block address in that zone, then the controller shall abort
that command with a status code of Invalid Field In Command.

In the code, Zone Invalid Write is returned instead, fix this.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove unnecessary check for append
Klaus Jensen [Thu, 10 Dec 2020 07:11:32 +0000 (08:11 +0100)]
hw/block/nvme: remove unnecessary check for append

nvme_io_cmd already checks if the namespace supports the Zone Append
command, so the removed check is dead code.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: add missing string representations for commands
Klaus Jensen [Wed, 9 Dec 2020 22:55:06 +0000 (23:55 +0100)]
hw/block/nvme: add missing string representations for commands

Add missing string representations for a couple of new commands.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: zero out zones on reset
Klaus Jensen [Wed, 9 Dec 2020 22:43:15 +0000 (23:43 +0100)]
hw/block/nvme: zero out zones on reset

The zoned command set specification states that "All logical blocks in a
zone *shall* be marked as deallocated when [the zone is reset]". Since
the device guarantees 0x00 to be read from deallocated blocks we have to
issue a pwrite_zeroes since we cannot be sure that a discard will do
anything. But typically, this will be achieved with an efficient
unmap/discard operation.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: enum style fix
Klaus Jensen [Wed, 9 Dec 2020 22:12:49 +0000 (23:12 +0100)]
hw/block/nvme: enum style fix

Align with existing style and use a typedef for header-file enums.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: merge implicitly/explicitly opened processing masks
Klaus Jensen [Wed, 9 Dec 2020 22:00:20 +0000 (23:00 +0100)]
hw/block/nvme: merge implicitly/explicitly opened processing masks

Implicitly and explicitly opended zones are always bulk processed
together, so merge the two processing masks.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: fix shutdown/reset logic
Klaus Jensen [Wed, 9 Dec 2020 12:10:45 +0000 (13:10 +0100)]
hw/block/nvme: fix shutdown/reset logic

A shutdown is only about flushing stuff. It is the host that should
delete any queues, so do not perform a reset here.

Also, on shutdown, make sure that the PMR is flushed if in use.

Fixes: 368f4e752cf9 ("hw/block/nvme: Process controller reset and shutdown differently")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
3 years agohw/block/nvme: conditionally enable DULBE for zoned namespaces
Klaus Jensen [Mon, 11 Jan 2021 12:52:40 +0000 (13:52 +0100)]
hw/block/nvme: conditionally enable DULBE for zoned namespaces

The device uses the BDRV_BLOCK_ZERO flag to determine the "deallocated"
status of logical blocks. Since the zoned namespaces command set
specification defines that logical blocks SHALL be marked as deallocated
when the zone is in the Empty or Offline states, DULBE can only be
supported if the zone size is a multiple of the calculated deallocation
granularity (reported in NPDG) which depends on the underlying block
device cluster size (if applicable) or the configured
discard_granularity.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix for non-msix machines
Klaus Jensen [Tue, 12 Jan 2021 12:30:26 +0000 (13:30 +0100)]
hw/block/nvme: fix for non-msix machines

Commit 1c0c2163aa08 ("hw/block/nvme: verify msix_init_exclusive_bar()
return value") had the unintended effect of breaking support on
several platforms not supporting MSI-X.

Still check for errors, but only report that MSI-X is unsupported
instead of bailing out.

Fixes: 1c0c2163aa08 ("hw/block/nvme: verify msix_init_exclusive_bar() return value")
Fixes: fbf2e5375e33 ("hw/block/nvme: Verify msix_vector_use() returned value")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Document zoned parameters in usage text
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:10 +0000 (05:04 +0900)]
hw/block/nvme: Document zoned parameters in usage text

Added brief descriptions of the new device properties that are
now available to users to configure features of Zoned Namespace
Command Set in the emulator.

This patch is for documentation only, no functionality change.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Support Zone Descriptor Extensions
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:08 +0000 (05:04 +0900)]
hw/block/nvme: Support Zone Descriptor Extensions

Zone Descriptor Extension is a label that can be assigned to a zone.
It can be set to an Empty zone and it stays assigned until the zone
is reset.

This commit adds a new optional module property,
"zoned.descr_ext_size". Its value must be a multiple of 64 bytes.
If this value is non-zero, it becomes possible to assign extensions
of that size to any Empty zones. The default value for this property
is 0, therefore setting extensions is disabled by default.

Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Introduce max active and open zone limits
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:07 +0000 (05:04 +0900)]
hw/block/nvme: Introduce max active and open zone limits

Add two module properties, "zoned.max_active" and "zoned.max_open"
to control the maximum number of zones that can be active or open.
Once these variables are set to non-default values, these limits are
checked during I/O and Too Many Active or Too Many Open command status
is returned if they are exceeded.

Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Support Zoned Namespace Command Set
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:06 +0000 (05:04 +0900)]
hw/block/nvme: Support Zoned Namespace Command Set

The emulation code has been changed to advertise NVM Command Set when
"zoned" device property is not set (default) and Zoned Namespace
Command Set otherwise.

Define values and structures that are needed to support Zoned
Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator.
Define trace events where needed in newly introduced code.

In order to improve scalability, all open, closed and full zones
are organized in separate linked lists. Consequently, almost all
zone operations don't require scanning of the entire zone array
(which potentially can be quite large) - it is only necessary to
enumerate one or more zone lists.

Handlers for three new NVMe commands introduced in Zoned Namespace
Command Set specification are added, namely for Zone Management
Receive, Zone Management Send and Zone Append.

Device initialization code has been extended to create a proper
configuration for zoned operation using device properties.

Read/Write command handler is modified to only allow writes at the
write pointer if the namespace is zoned. For Zone Append command,
writes implicitly happen at the write pointer and the starting write
pointer value is returned as the result of the command. Write Zeroes
handler is modified to add zoned checks that are identical to those
done as a part of Write flow.

Subsequent commits in this series add ZDE support and checks for
active and open zone limits.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com>
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agonvme: Make ZNS-related definitions
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:05 +0000 (05:04 +0900)]
nvme: Make ZNS-related definitions

Define values and structures that are needed to support Zoned
Namespace Command Set (NVMe TP 4053).

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Support allocated CNS command variants
Niklas Cassel [Tue, 8 Dec 2020 20:04:04 +0000 (05:04 +0900)]
hw/block/nvme: Support allocated CNS command variants

Many CNS commands have "allocated" command variants. These include
a namespace as long as it is allocated, that is a namespace is
included regardless if it is active (attached) or not.

While these commands are optional (they are mandatory for controllers
supporting the namespace attachment command), our QEMU implementation
is more complete by actually providing support for these CNS values.

However, since our QEMU model currently does not support the namespace
attachment command, these new allocated CNS commands will return the
same result as the active CNS command variants.

The reason for not hooking up this command completely is because the
NVMe specification requires the namespace management command to be
supported if the namespace attachment command is supported.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Add support for Namespace Types
Niklas Cassel [Tue, 8 Dec 2020 20:04:03 +0000 (05:04 +0900)]
hw/block/nvme: Add support for Namespace Types

Define the structures and constants required to implement
Namespace Types support.

Namespace Types introduce a new command set, "I/O Command Sets",
that allows the host to retrieve the command sets associated with
a namespace. Introduce support for the command set and enable
detection for the NVM Command Set.

The new workflows for identify commands rely heavily on zero-filled
identify structs. E.g., certain CNS commands are defined to return
a zero-filled identify struct when an inactive namespace NSID
is supplied.

Add a helper function in order to avoid code duplication when
reporting zero-filled identify structures.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Add Commands Supported and Effects log
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:02 +0000 (05:04 +0900)]
hw/block/nvme: Add Commands Supported and Effects log

This log page becomes necessary to implement to allow checking for
Zone Append command support in Zoned Namespace Command Set.

This commit adds the code to report this log page for NVM Command
Set only. The parts that are specific to zoned operation will be
added later in the series.

All incoming admin and i/o commands are now only processed if their
corresponding support bits are set in this log. This provides an
easy way to control what commands to support and what not to
depending on set CC.CSS.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agoMerge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20210208a' into...
Peter Maydell [Mon, 8 Feb 2021 18:23:47 +0000 (18:23 +0000)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20210208a' into staging

Migration pull 2021-02-08

v2
  Dropped vmstate: Fix memory leak in vmstate_handle_alloc
    Broke on Power
  Added migration: only check page size match if RAM postcopy is enabled

# gpg: Signature made Mon 08 Feb 2021 11:28:14 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20210208a: (27 commits)
  migration: only check page size match if RAM postcopy is enabled
  migration: introduce snapshot-{save, load, delete} QMP commands
  iotests: fix loading of common.config from tests/ subdir
  iotests: add support for capturing and matching QMP events
  migration: introduce a delete_snapshot wrapper
  migration: wire up support for snapshot device selection
  migration: control whether snapshots are ovewritten
  block: rename and alter bdrv_all_find_snapshot semantics
  block: allow specifying name of block device for vmstate storage
  block: add ability to specify list of blockdevs during snapshot
  migration: stop returning errno from load_snapshot()
  migration: Make save_snapshot() return bool, not 0/-1
  block: push error reporting into bdrv_all_*_snapshot functions
  migration: Display the migration blockers
  migration: Add blocker information
  migration: Fix a few absurdly defective error messages
  migration: Fix cache_init()'s "Failed to allocate" error messages
  migration: Clean up signed vs. unsigned XBZRLE cache-size
  migration: Fix migrate-set-parameters argument validation
  migration: introduce 'userfaultfd-wrlat.py' script
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/block/nvme: Combine nvme_write_zeroes() and nvme_write()
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:01 +0000 (05:04 +0900)]
hw/block/nvme: Combine nvme_write_zeroes() and nvme_write()

Move write processing to nvme_do_write() that now handles both WRITE
and WRITE ZEROES. Both nvme_write() and nvme_write_zeroes() become
inline helper functions.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Separate read and write handlers
Dmitry Fomichev [Tue, 8 Dec 2020 20:04:00 +0000 (05:04 +0900)]
hw/block/nvme: Separate read and write handlers

The majority of code in nvme_rw() is becoming read- or write-specific.
Move these parts to two separate handlers, nvme_read() and nvme_write()
to make the code more readable and to remove multiple is_write checks
that has been present in the i/o path.

This is a refactoring patch, no change in functionality.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Generate namespace UUIDs
Dmitry Fomichev [Tue, 8 Dec 2020 20:03:59 +0000 (05:03 +0900)]
hw/block/nvme: Generate namespace UUIDs

In NVMe 1.4, a namespace must report an ID descriptor of UUID type
if it doesn't support EUI64 or NGUID. Add a new namespace property,
"uuid", that provides the user the option to either specify the UUID
explicitly or have a UUID generated automatically every time a
namespace is initialized.

Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: Process controller reset and shutdown differently
Dmitry Fomichev [Tue, 8 Dec 2020 20:03:58 +0000 (05:03 +0900)]
hw/block/nvme: Process controller reset and shutdown differently

Controller reset ans subsystem shutdown are handled very much the same
in the current code, but some of the steps should be different in these
two cases.

Introduce two new functions, nvme_reset_ctrl() and nvme_shutdown_ctrl(),
to separate some portions of the code from nvme_clear_ctrl(). The steps
that are made different between reset and shutdown are that BAR.CC is not
reset to zero upon the shutdown and namespace data is flushed to
backing storage as a part of shutdown handling, but not upon reset.

Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: fix bad clearing of CAP
Klaus Jensen [Tue, 8 Dec 2020 07:43:04 +0000 (08:43 +0100)]
hw/block/nvme: fix bad clearing of CAP

Commit 37712e00b1f0 ("hw/block/nvme: factor out pmr setup") changed the
control flow such that the CAP register is erronously cleared after
nvme_init_pmr() has configured it. Since the entire NvmeCtrl structure
is zero-filled initially, there is no need for the explicit clearing, so
just remove it.

Fixes: 37712e00b1f0 ("hw/block/nvme: factor out pmr setup")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
3 years agohw/block/nvme: add compare command
Gollu Appalanaidu [Mon, 16 Nov 2020 10:14:02 +0000 (11:14 +0100)]
hw/block/nvme: add compare command

Add the Compare command.

This implementation uses a bounce buffer to read in the data from
storage and then compare with the host supplied buffer.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: rebased]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
3 years agohw/block/nvme: add the dataset management command
Klaus Jensen [Wed, 21 Oct 2020 12:03:19 +0000 (14:03 +0200)]
hw/block/nvme: add the dataset management command

Add support for the Dataset Management command and the Deallocate
attribute. Deallocation results in discards being sent to the underlying
block device. Whether of not the blocks are actually deallocated is
affected by the same factors as Write Zeroes (see previous commit).

     format | discard | dsm (512B)  dsm (4KiB)  dsm (64KiB)
    --------------------------------------------------------
      qcow2    ignore   n           n           n
      qcow2    unmap    n           n           y
      raw      ignore   n           n           n
      raw      unmap    n           y           y

Again, a raw format and 4KiB LBAs are preferable.

In order to set the Namespace Preferred Deallocate Granularity and
Alignment fields (NPDG and NPDA), choose a sane minimum discard
granularity of 4KiB. If we are using a passthru device supporting
discard at a 512B granularity, user should set the discard_granularity
property explicitly. NPDG and NPDA will also account for the
cluster_size of the block driver if required (i.e. for QCOW2).

See NVM Express 1.3d, Section 6.7 ("Dataset Management command").

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
3 years agonvme: add namespace I/O optimization fields to shared header
Klaus Jensen [Fri, 23 Oct 2020 06:07:50 +0000 (08:07 +0200)]
nvme: add namespace I/O optimization fields to shared header

This adds the NPWG, NPWA, NPDG, NPDA and NOWS family of fields to the
shared nvme.h header for use by later patches.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
3 years agohw/block/nvme: add dulbe support
Klaus Jensen [Wed, 14 Oct 2020 07:55:08 +0000 (09:55 +0200)]
hw/block/nvme: add dulbe support

Add support for reporting the Deallocated or Unwritten Logical Block
Error (DULBE).

Rely on the block status flags reported by the block layer and consider
any block with the BDRV_BLOCK_ZERO flag to be deallocated.

Multiple factors affect when a Write Zeroes command result in
deallocation of blocks.

  * the underlying file system block size
  * the blockdev format
  * the 'discard' and 'logical_block_size' parameters

     format | discard | wz (512B)  wz (4KiB)  wz (64KiB)
    -----------------------------------------------------
      qcow2    ignore   n          n          y
      qcow2    unmap    n          n          y
      raw      ignore   n          y          y
      raw      unmap    n          y          y

So, this works best with an image in raw format and 4KiB LBAs, since
holes can then be punched on a per-block basis (this assumes a file
system with a 4kb block size, YMMV). A qcow2 image, uses a cluster size
of 64KiB by default and blocks will only be marked deallocated if a full
cluster is zeroed or discarded. However, this *is* consistent with the
spec since Write Zeroes "should" deallocate the block if the Deallocate
attribute is set and "may" deallocate if the Deallocate attribute is not
set. Thus, we always try to deallocate (the BDRV_REQ_MAY_UNMAP flag is
always set).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
3 years agohw/block/nvme: pull aio error handling
Klaus Jensen [Tue, 10 Nov 2020 07:53:20 +0000 (08:53 +0100)]
hw/block/nvme: pull aio error handling

Add a new function, nvme_aio_err, to handle errors resulting from AIOs
and use this from the callbacks.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
3 years agohw/block/nvme: remove superfluous NvmeCtrl parameter
Klaus Jensen [Mon, 9 Nov 2020 11:23:18 +0000 (12:23 +0100)]
hw/block/nvme: remove superfluous NvmeCtrl parameter

nvme_check_bounds has no use of the NvmeCtrl parameter; remove it.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
3 years agoAcceptance Tests: remove unnecessary tag from documentation example
Cleber Rosa [Wed, 3 Feb 2021 17:23:38 +0000 (12:23 -0500)]
Acceptance Tests: remove unnecessary tag from documentation example

The ":avocado: enable" is not necessary and was removed in 9531d26c,
so let's remove from the docs.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210203172357.1422425-4-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agoAcceptance tests: clarify ssh connection failure reason
Cleber Rosa [Wed, 3 Feb 2021 17:23:47 +0000 (12:23 -0500)]
Acceptance tests: clarify ssh connection failure reason

If the connection to the ssh server fails, it may indeed be a "sshd"
issue, but it may also not be that.  Let's state what we know: the
establishment of the connection from the client side was not possible.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210203172357.1422425-13-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/virtiofs_submounts: required space between IP and port
Cleber Rosa [Wed, 3 Feb 2021 17:23:44 +0000 (12:23 -0500)]
tests/acceptance/virtiofs_submounts: required space between IP and port

AFAICT, there should not be a situation where IP and port do not have
at least one whitespace character separating them.

This may be true for other '\s*' patterns in the same regex too.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210203172357.1422425-10-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/virtiofs_submounts: standardize port as integer
Cleber Rosa [Wed, 3 Feb 2021 17:23:43 +0000 (12:23 -0500)]
tests/acceptance/virtiofs_submounts: standardize port as integer

Instead of having to cast it whenever it's going to be used, let's
standardize it as an integer, which is the data type that will be
used most often.

Given that the regex will only match digits, it's safe that we'll
end up getting a integer, but, it could as well be a zero.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Beraldo Leal <bleal@redhat.com>
Message-Id: <20210203172357.1422425-9-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/virtiofs_submounts: use a virtio-net device instead
Cleber Rosa [Wed, 3 Feb 2021 17:23:41 +0000 (12:23 -0500)]
tests/acceptance/virtiofs_submounts: use a virtio-net device instead

In a virtiofs based tests, it seems safe to assume that the guest will
be capable of a virtio-net device.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210203172357.1422425-7-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/virtiofs_submounts: do not ask for ssh key password
Cleber Rosa [Wed, 3 Feb 2021 17:23:40 +0000 (12:23 -0500)]
tests/acceptance/virtiofs_submounts: do not ask for ssh key password

Tests are supposed to be non-interactive, and ssh-keygen is asking for
a passphrase when creating a key.  Let's set an empty passphrase to
avoid the prompt.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Beraldo Leal <bleal@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210203172357.1422425-6-crosa@redhat.com>
[PMD: Reword description per Alex Bennée comment]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/virtiofs_submounts: use workdir property
Cleber Rosa [Wed, 3 Feb 2021 17:23:39 +0000 (12:23 -0500)]
tests/acceptance/virtiofs_submounts: use workdir property

For Avocado Instrumented based tests, it's a better idea to just use
the property.  The environment variable is a fall back for tests not
written using that Python API.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Beraldo Leal <bleal@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reference: https://avocado-framework.readthedocs.io/en/84.0/api/test/avocado.html#avocado.Test.workdir
Message-Id: <20210203172357.1422425-5-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/boot_linux: rename misleading cloudinit method
Cleber Rosa [Wed, 3 Feb 2021 17:23:37 +0000 (12:23 -0500)]
tests/acceptance/boot_linux: rename misleading cloudinit method

There's no downloading happening on that method, so let's call it
"prepare" instead.  While at it, and because of it, the current
"prepare_boot" and "prepare_cloudinit" are also renamed.

The reasoning here is that "prepare_" methods will just work on the
images, while "set_up_" will make them effective to the VM that will
be launched.  Inspiration comes from the "virtiofs_submounts.py"
tests, which this expects to converge more into.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Beraldo Leal <bleal@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210203172357.1422425-3-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance/boot_linux: fix typo on cloudinit error message
Cleber Rosa [Wed, 3 Feb 2021 17:23:36 +0000 (12:23 -0500)]
tests/acceptance/boot_linux: fix typo on cloudinit error message

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Beraldo Leal <bleal@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210203172357.1422425-2-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agotests/acceptance: Re-enable the microblaze test
Thomas Huth [Thu, 28 Jan 2021 15:28:15 +0000 (16:28 +0100)]
tests/acceptance: Re-enable the microblaze test

The microblaze kernel sometimes gets stuck during boot (ca. 1 out of 200
times), so we disabled the corresponding acceptance tests some months
ago. However, it's likely better to check that the kernel is still
starting than to not testing it at all anymore. Move the test to
a separate file, enable it again and check for an earlier console
message that should always appear.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20210128152815.585478-1-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3 years agoMerge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2021-02-08' into staging
Peter Maydell [Mon, 8 Feb 2021 16:12:20 +0000 (16:12 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2021-02-08' into staging

QAPI patches patches for 2021-02-08

# gpg: Signature made Mon 08 Feb 2021 13:54:26 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2021-02-08:
  qapi: enable strict-optional checks
  qapi: type 'info' as Optional[QAPISourceInfo]
  qapi/gen: Drop support for QAPIGen without a file name
  qapi/commands: Simplify command registry generation
  qapi/gen: Support switching to another module temporarily
  qapi/gen: write _genc/_genh access shims
  qapi: centralize the built-in module name definition
  qapi/gen: Combine ._add_[user|system]_module
  qapi: use './builtin' as the built-in module name
  qapi: use explicitly internal module names
  qapi/gen: Replace ._begin_system_module()
  qapi: centralize is_[user|system|builtin]_module methods
  qapi/gen: inline _wrap_ifcond into end_if()
  qapi/main: handle theoretical None-return from re.match()
  qapi/events: fix visit_event typing
  qapi/commands: assert arg_type is not None

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/i386: Expose VMX entry/exit load pkrs control bits
Chenyi Qiang [Fri, 5 Feb 2021 08:33:25 +0000 (16:33 +0800)]
target/i386: Expose VMX entry/exit load pkrs control bits

Expose the VMX exit/entry load pkrs control bits in
VMX_TRUE_EXIT_CTLS/VMX_TRUE_ENTRY_CTLS MSRs to guest, which supports the
PKS in nested VM.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210205083325.13880-3-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agotarget/i386: Add support for save/load IA32_PKRS MSR
Chenyi Qiang [Fri, 5 Feb 2021 08:33:24 +0000 (16:33 +0800)]
target/i386: Add support for save/load IA32_PKRS MSR

PKS introduces MSR IA32_PKRS(0x6e1) to manage the supervisor protection
key rights. Page access and writes can be managed via the MSR update
without TLB flushes when permissions change.

Add the support to save/load IA32_PKRS MSR in guest.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210205083325.13880-2-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoimx7-ccm: add digprog mmio write method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:32 +0000 (17:11 +0530)]
imx7-ccm: add digprog mmio write method

Add digprog mmio write method to avoid assert failure during
initialisation.

Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-9-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agotz-ppc: add dummy read/write methods
Prasad J Pandit [Tue, 11 Aug 2020 11:41:31 +0000 (17:11 +0530)]
tz-ppc: add dummy read/write methods

Add tz-ppc-dummy mmio read/write methods to avoid assert failure
during initialisation.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <20200811114133.672647-8-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agospapr_pci: add spapr msi read method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:30 +0000 (17:11 +0530)]
spapr_pci: add spapr msi read method

Add spapr msi mmio read method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-7-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agonvram: add nrf51_soc flash read method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:29 +0000 (17:11 +0530)]
nvram: add nrf51_soc flash read method

Add nrf51_soc mmio read method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <20200811114133.672647-6-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoprep: add ppc-parity write method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:28 +0000 (17:11 +0530)]
prep: add ppc-parity write method

Add ppc-parity mmio write method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <20200811114133.672647-5-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agovfio: add quirk device write method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:27 +0000 (17:11 +0530)]
vfio: add quirk device write method

Add vfio quirk device mmio write method to avoid NULL pointer
dereference issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-4-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agopci-host: designware: add pcie-msi read method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:26 +0000 (17:11 +0530)]
pci-host: designware: add pcie-msi read method

Add pcie-msi mmio read method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-3-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agohw/pci-host: add pci-intack write method
Prasad J Pandit [Tue, 11 Aug 2020 11:41:25 +0000 (17:11 +0530)]
hw/pci-host: add pci-intack write method

Add pci-intack mmio write method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agocpu-throttle: Remove timer_mod() from cpu_throttle_set()
Utkarsh Tripathi [Thu, 31 Dec 2020 13:13:04 +0000 (13:13 +0000)]
cpu-throttle: Remove timer_mod() from cpu_throttle_set()

During migrations, after each iteration, cpu_throttle_set() is called,
which irrespective of input, re-arms the timer according to value of
new_throttle_pct. This causes cpu_throttle_thread() to be delayed in
getting scheduled and consqeuntly lets guest run for more time than what
the throttle value should allow. This leads to spikes in guest throughput
at high cpu-throttle percentage whenever cpu_throttle_set() is called.

A solution would be not to modify the timer immediately in
cpu_throttle_set(), instead, only modify throttle_percentage so that the
throttle would automatically adjust to the required percentage when
cpu_throttle_timer_tick() is invoked.

Manually tested the patch using following configuration:

Guest:
Centos7 (3.10.0-123.el7.x86_64)
Total Memory - 64GB , CPUs - 16
Tool used - stress (1.0.4)
Workload - stress --vm 32 --vm-bytes 1G --vm-keep

Migration Parameters:
Network Bandwidth - 500MBPS
cpu-throttle-initial - 99

Results:
With timer_mod(): fails to converge, continues indefinitely
Without timer_mod(): converges in 249 sec

Signed-off-by: Utkarsh Tripathi <utkarsh.tripathi@nutanix.com>
Message-Id: <1609420384-119407-1-git-send-email-utkarsh.tripathi@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoreplay: rng-builtin support
Pavel Dovgalyuk [Wed, 3 Feb 2021 06:00:12 +0000 (09:00 +0300)]
replay: rng-builtin support

This patch enables using rng-builtin with record/replay
by making the callbacks deterministic.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161233201286.170686.7858208964037376305.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agopc-bios/descriptors: fix paths in json files
Sergei Trofimovich [Sun, 31 Jan 2021 14:34:34 +0000 (14:34 +0000)]
pc-bios/descriptors: fix paths in json files

Before the change /usr/share/qemu/firmware/50-edk2-x86_64-secure.json
contained the relative path:
            "filename": "share/qemu/edk2-x86_64-secure-code.fd",
            "filename": "share/qemu/edk2-i386-vars.fd",

After then change the paths are absolute:
            "filename": "/usr/share/qemu/edk2-x86_64-secure-code.fd",
            "filename": "/usr/share/qemu/edk2-i386-vars.fd",

The regression appeared in qemu-5.2.0 (seems to be related
to meson port).

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "Marc-André Lureau" <marcandre.lureau@redhat.com>
CC: "Philippe Mathieu-Daudé" <philmd@redhat.com>
Bug: https://bugs.gentoo.org/766743
Bug: https://bugs.launchpad.net/qemu/+bug/1913012
Signed-off-by: Jannik Glückert <jannik.glueckert@gmail.com>
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Message-Id: <20210131143434.2513363-1-slyfox@gentoo.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoreplay: fix replay of the interrupts
Pavel Dovgalyuk [Mon, 1 Feb 2021 07:05:27 +0000 (10:05 +0300)]
replay: fix replay of the interrupts

Sometimes interrupt event comes at the same time with
the virtual timers. In this case replay tries to proceed
the timers, because deadline for them is zero.
This patch allows processing interrupts and exceptions
by entering the vCPU execution loop, when deadline is zero,
but checkpoint associated with virtual timers is not ready
to be replayed.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161216312794.2030770.1709657858900983160.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoaccel/kvm/kvm-all: Fix wrong return code handling in dirty log code
Thomas Huth [Fri, 29 Jan 2021 08:43:54 +0000 (09:43 +0100)]
accel/kvm/kvm-all: Fix wrong return code handling in dirty log code

The kvm_vm_ioctl() wrapper already returns -errno if the ioctl itself
returned -1, so the callers of kvm_vm_ioctl() should not check for -1
but for a value < 0 instead.

This problem has been fixed once already in commit b533f658a98325d0e4
but that commit missed that the ENOENT error code is not fatal for
this ioctl, so the commit has been reverted in commit 50212d6346f33d6e
since the problem occurred close to a pending release at that point
in time. The plan was to fix it properly after the release, but it
seems like this has been forgotten. So let's do it now finally instead.

Resolves: https://bugs.launchpad.net/qemu/+bug/1294227
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210129084354.42928-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoqapi/meson: Restrict UI module to system emulation and tools
Philippe Mathieu-Daudé [Fri, 22 Jan 2021 20:44:41 +0000 (21:44 +0100)]
qapi/meson: Restrict UI module to system emulation and tools

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210122204441.2145197-13-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoqapi/meson: Restrict system-mode specific modules
Philippe Mathieu-Daudé [Fri, 22 Jan 2021 20:44:40 +0000 (21:44 +0100)]
qapi/meson: Restrict system-mode specific modules

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210122204441.2145197-12-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoqapi/meson: Remove QMP from user-mode emulation
Philippe Mathieu-Daudé [Fri, 22 Jan 2021 20:44:39 +0000 (21:44 +0100)]
qapi/meson: Remove QMP from user-mode emulation

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210122204441.2145197-11-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>