git.osdn.net Git - qmiga/qemu.git/log

tcg: Split helper-proto.h

Create helper-proto-common.h without the target specific portion.
Use that in tcg-op-common.h. Include helper-proto.h in target/arm
and target/hexagon before helper-info.c.inc; all other targets are
already correct in this regard.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Split helper-gen.h

Create helper-gen-common.h without the target specific portion.
Use that in tcg-op-common.h. Reorg headers in target/arm to
ensure that helper-gen.h is included before helper-info.c.inc.
All other targets are already correct in this regard.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Split tcg_gen_callN

Make tcg_gen_callN a static function. Create tcg_gen_call[0-7]
functions for use by helper-gen.h.inc.

Removes a multiplicty of calls to __stack_chk_fail, saving up
to 143kiB of .text space as measured on an x86_64 host.

Old New Less %Change
8888680 8741816 146864 1.65% qemu-system-aarch64
5911832 5856152 55680 0.94% qemu-system-riscv64
5816728 5767512 49216 0.85% qemu-system-mips64
6707832 6659144 48688 0.73% qemu-system-ppc64

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Move temp_idx and tcgv_i32_temp debug out of line

Removes a multiplicity of calls to __assert_fail, saving up
to 360kiB of .text space as measured on an x86_64 host.

Old New Less %Change
9257272 8888680 368592 3.98% qemu-system-aarch64
6100968 5911832 189136 3.10% qemu-system-riscv64
5839112 5707032 132080 2.26% qemu-system-mips
4447608 4341752 105856 2.38% qemu-system-s390x

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Pass TCGHelperInfo to tcg_gen_callN

In preparation for compiling tcg/ only once, eliminate
the all_helpers array. Instantiate the info structs for
the generic helpers in accel/tcg/, and the structs for
the target-specific helpers in each translate.c.

Since we don't see all of the info structs at startup,
initialize at first use, using g_once_init_* to make
sure we don't race while doing so.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Move TCGHelperInfo and dependencies to tcg/helper-info.h

This will be required outside of tcg-internal.h soon.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Remove outdated comments in helper-head.h

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Include helper-gen.h where needed

This had been included via tcg-op-common.h via tcg-op.h,
but that is going away. In idef-parser.y, shuffle some
tcg related includes into a more logical order.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/arm: Include helper-gen.h in translator.h

This had been included via tcg-op-common.h via tcg-op.h,
but that is going away.

It is needed for inlines within translator.h, so we might as well
do it there and not individually in each translator c file.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Split tcg/tcg-op-common.h from tcg/tcg-op.h

Create tcg/tcg-op-common.h, moving everything that does not concern
TARGET_LONG_BITS or TCGv. Adjust tcg/*.c to use the new header
instead of tcg-op.h, in preparation for compiling tcg/ only once.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Move TCGv, dup_const_tl definitions to tcg-op.h

These two items are the last uses of TARGET_LONG_BITS within tcg.h,
and are more in common with the other "_tl" definitions within that file.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Split out tcg/oversized-guest.h

Move a use of TARGET_LONG_BITS out of tcg/tcg.h.
Include the new file only where required.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/arm: Fix test of TCG_OVERSIZED_GUEST

The symbol is always defined, even if to 0.  We wanted to test for
TCG_OVERSIZED_GUEST == 0.

This fixed, the #error is reached while building arm-softmmu, because
TCG_OVERSIZED_GUEST is not true (nor supposed to be true) for arm32
guest on a 32-bit host.  But that's ok, because this feature doesn't
apply to arm32.  Add an #ifdef for TARGET_AARCH64.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Split out tcg-target-reg-bits.h

Often, the only thing we need to know about the TCG host
is the register size.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

*: Add missing includes of tcg/tcg.h

This had been pulled in from exec/cpu_ldst.h, via exec/exec-all.h,
but the include of tcg.h will be removed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/*: Add missing includes of tcg/debug-assert.h

This had been pulled in from tcg/tcg.h, via exec/cpu_ldst.h,
via exec/exec-all.h, but the include of tcg.h will be removed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/avr: Add missing includes of qemu/error-report.h

This had been pulled in from tcg/tcg.h, via exec/cpu_ldst.h,
via exec/exec-all.h, but the include of tcg.h will be removed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Add tlb_fast_offset to TCGContext

Disconnect the layout of ArchCPU from TCG compilation.
Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Widen CPUTLBEntry comparators to 64-bits

This makes CPUTLBEntry agnostic to the address size of the guest.
When 32-bit addresses are in effect, we can simply read the low
32 bits of the 64-bit field. Similarly when we need to update
the field for setting TLB_NOTDIRTY.

For TCG backends that could in theory be big-endian, but in
practice are not (arm, loongarch, riscv), use QEMU_BUILD_BUG_ON
to document and ensure this is not accidentally missed.

For s390x, which is always big-endian, use HOST_BIG_ENDIAN anyway,
to document the reason for the adjustment.

For sparc64 and ppc64, always perform a 64-bit load, and rely on
the following 32-bit comparison to ignore the high bits.

Rearrange mips and ppc if ladders for clarity.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg: Move TCG_TYPE_TL from tcg.h to tcg-op.h

Removes the only use of TARGET_LONG_BITS from tcg.h, which is to be
target independent. Move the symbol to a define in tcg-op.h, which
will continue to be target dependent. Rather than complicate matters
for the use in tb_gen_code(), expand the definition there.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/sparc64: Remove TARGET_LONG_BITS, TCG_TYPE_TL

All uses replaced with TCGContext.addr_type.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/s390x: Remove TARGET_LONG_BITS, TCG_TYPE_TL

All uses replaced with TCGContext.addr_type.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/riscv: Remove TARGET_LONG_BITS, TCG_TYPE_TL

All uses replaced with TCGContext.addr_type.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/ppc: Remove TARGET_LONG_BITS, TCG_TYPE_TL

All uses replaced with TCGContext.addr_type.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'pull-block-2023-06-05' of https://gitlab.com/hreitz/qemu into staging

Block patches

- Fix padding of unaligned vectored requests to match the host alignment
  for vectors with 1023 or 1024 buffers
- Refactor and fix bugs in parallels's image check functionality
- Add an option to the qcow2 driver to retain (qcow2-level) allocations
  on discard requests from the guest (while still forwarding the discard
  to the lower level and marking the range as zero)

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmR+AT4SHGhyZWl0ekBy
# ZWRoYXQuY29tAAoJEKH6QNCYAZzfnboQAKD6YrreZLoseomRfqOAoApSf6yOdcHk
# 6kfsvzwzjosomsF1Pkzm4851vX5PyDqTdeu0iViM+pxanVO1b494q1P4VcAERqMB
# iZVs68R6M0l6HV9btWFGm+ibHJf4FapdntkIdwog1ka5TIhw5oDWCVNLigjhIoRv
# sM37Bgf14kC3sFTR++0HESsyU1eUP5gJjwJbPZ2IgJBmzYay0is1z5nHA/3VUswu
# 8dKnGQDsv62EtlK7PK8cU2BhLOeNi6Wr3bAb6Wf2QLB5e0qRb7oAkqNx5/UcTznk
# a3XMC1aiWhYvM/+DaYIpQUcIPgA8xQ1KHKeD6WjbGfLgZBqseX0aGWMByUsiY8Bo
# +BPIBnUDrbiPnAKB/XLQfnzlE+s7121/JpEbB7AkZqVFRGuw8Wur4tbc2fzvy8Pw
# x/uQfv3ZPi/2Lf6u7hv/TVHubXi8jucVgx3Ubu5Jeo3901S4/KOQBQ4BQ/GYIGQX
# 38ijSROcEd0eQJ1mTKPEctouxjSZCghNSbrn9DfsL1V3VWqWNKKGCU3hM+RQ1SJT
# 688qvnyYt8QZfTsiDSHR/GfKsufG0DkoqE7c9IhSEPohecAH8Rrc3HcLut7fuwD2
# gCFQhm68CPwwRmBjPCY6Zi1RDzeOyFBSWN31T6t0yTb4OHJ/3/cSZVBJtwwkOVbx
# zwabHDNdY5Kw
# =GuoL
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 05 Jun 2023 08:37:34 AM PDT
# gpg:                using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF
# gpg:                issuer "hreitz@redhat.com"
# gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00  4D34 A1FA 40D0 9801 9CDF

* tag 'pull-block-2023-06-05' of https://gitlab.com/hreitz/qemu:
  qcow2: add discard-no-unref option
  parallels: Incorrect condition in out-of-image check
  parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD
  parallels: Move statistic collection to a separate function
  parallels: Move check of leaks to a separate function
  parallels: Fix statistics calculation
  parallels: Move check of cluster outside image to a separate function
  parallels: Move check of unclean image to a separate function
  parallels: Use generic infrastructure for BAT writing in parallels_co_check()
  parallels: create parallels_set_bat_entry_helper() to assign BAT value
  parallels: Fix image_end_offset and data_end after out-of-image check
  parallels: Fix high_off calculation in parallels_co_check()
  parallels: Out of image offset in BAT leads to image inflation
  iotests/iov-padding: New test
  util/iov: Remove qemu_iovec_init_extended()
  block: Collapse padded I/O vecs exceeding IOV_MAX
  util/iov: Make qiov_slice() public

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'qemu-sparc-20230605' of https://github.com/mcayland/qemu into staging

qemu-sparc queue

# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmR9xHseHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfFsgH/0lV2y9CphGdWwtV
# S/00CRhp5n0kv3P2m4yDvXBhBgMa8zS6m/ZHhcfuU5xOxkJAnPudvXiZucHP4TSy
# 6KFBgVi2+xBW1YJf2UEHlBG2cDnVDHtYQze6Ak3S+W5sVeZZU8n1k8k8kRwpuNDS
# io23oKXzzKMYErlwy12bEHEF7vpfcG9HktjpJxUtppkNDuPTysyKSA6BavEm3mJC
# MlIypbxqxgBvu2SJETeP/m1yLQOaswViCNkqNqUH+MgEc1ubtSCXEuXTfSHIsxLR
# 224kRPFciY1sm3NvG9yz75UYSq1JUvUC5mNIIqwjb4MZJ+YHqFb3S/6ilCYChQ8V
# IBPyvH0=
# =pxbI
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 05 Jun 2023 04:18:19 AM PDT
# gpg:                using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg:                issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* tag 'qemu-sparc-20230605' of https://github.com/mcayland/qemu:
  hw/isa/i82378: Remove unused "io" attribute
  hw/arm/omap: Remove unused omap_uart_attach()
  hw/timer/i8254_common: Share "iobase" property via base class

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

qcow2: add discard-no-unref option

When we for example have a sparse qcow2 image and discard: unmap is enabled,
there can be a lot of fragmentation in the image after some time. Especially on VM's
that do a lot of writes/deletes.
This causes the qcow2 image to grow even over 110% of its virtual size,
because the free gaps in the image get too small to allocate new
continuous clusters. So it allocates new space at the end of the image.

Disabling discard is not an option, as discard is needed to keep the
incremental backup size as low as possible. Without discard, the
incremental backups would become large, as qemu thinks it's just dirty
blocks but it doesn't know the blocks are unneeded.
So we need to avoid fragmentation but also 'empty' the unneeded blocks in
the image to have a small incremental backup.

In addition, we also want to send the discards further down the stack, so
the underlying blocks are still discarded.

Therefor we introduce a new qcow2 option "discard-no-unref".
When setting this option to true, discards will no longer have the qcow2
driver relinquish cluster allocations. Other than that, the request is
handled as normal: All clusters in range are marked as zero, and, if
pass-discard-request is true, it is passed further down the stack.
The only difference is that the now-zero clusters are preallocated
instead of being unallocated.
This will avoid fragmentation on the qcow2 image.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-Id: <20230605084523.34134-2-jean-louis@dupond.be>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Incorrect condition in out-of-image check

All the offsets in the BAT must be lower than the file size.
Fix the check condition for correct check.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20230424093147.197643-13-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD

Replace the way we use mutex in parallels_co_check() for simplier
and less error prone code.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20230424093147.197643-12-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Move statistic collection to a separate function

We will add more and more checks so we need a better code structure
in parallels_co_check. Let each check performs in a separate loop
in a separate helper.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230424093147.197643-11-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Move check of leaks to a separate function

We will add more and more checks so we need a better code structure
in parallels_co_check. Let each check performs in a separate loop
in a separate helper.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Message-Id: <20230424093147.197643-10-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Fix statistics calculation

Exclude out-of-image clusters from allocated and fragmented clusters
calculation.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Message-Id: <20230424093147.197643-9-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Move check of cluster outside image to a separate function

We will add more and more checks so we need a better code structure in
parallels_co_check. Let each check performs in a separate loop in a
separate helper.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20230424093147.197643-8-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Move check of unclean image to a separate function

We will add more and more checks so we need a better code structure
in parallels_co_check. Let each check performs in a separate loop
in a separate helper.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230424093147.197643-7-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Use generic infrastructure for BAT writing in parallels_co_check()

BAT is written in the context of conventional operations over the image
inside bdrv_co_flush() when it calls parallels_co_flush_to_os() callback.
Thus we should not modify BAT array directly, but call
parallels_set_bat_entry() helper and bdrv_co_flush() further on. After
that there is no need to manually write BAT and track its modification.

This makes code more generic and allows to split parallels_set_bat_entry()
for independent pieces.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20230424093147.197643-6-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: create parallels_set_bat_entry_helper() to assign BAT value

This helper will be reused in next patches during parallels_co_check
rework to simplify its code.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230424093147.197643-5-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Fix image_end_offset and data_end after out-of-image check

Set data_end to the end of the last cluster inside the image. In such a
way we can be sure that corrupted offsets in the BAT can't affect on the
image size. If there are no allocated clusters set image_end_offset by
data_end.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20230424093147.197643-4-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Fix high_off calculation in parallels_co_check()

Don't let high_off be more than the file size even if we don't fix the
image.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230424093147.197643-3-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

parallels: Out of image offset in BAT leads to image inflation

data_end field in BDRVParallelsState is set to the biggest offset present
in BAT. If this offset is outside of the image, any further write will
create the cluster at this offset and/or the image will be truncated to
this offset on close. This is definitely not correct.

Raise an error in parallels_open() if data_end points outside the image
and it is not a check (let the check to repaire the image). Set data_end
to the end of the cluster with the last correct offset.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Message-Id: <20230424093147.197643-2-alexander.ivanov@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

iotests/iov-padding: New test

Test that even vectored IO requests with 1024 vector elements that are
not aligned to the device's request alignment will succeed.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-5-hreitz@redhat.com>

util/iov: Remove qemu_iovec_init_extended()

bdrv_pad_request() was the main user of qemu_iovec_init_extended().
HEAD^ has removed that use, so we can remove qemu_iovec_init_extended()
now.

The only remaining user is qemu_iovec_init_slice(), which can easily
inline the small part it really needs.

Note that qemu_iovec_init_extended() offered a memcpy() optimization to
initialize the new I/O vector. qemu_iovec_concat_iov(), which is used
to replace its functionality, does not, but calls qemu_iovec_add() for
every single element. If we decide this optimization was important, we
will need to re-implement it in qemu_iovec_concat_iov(), which might
also benefit its pre-existing users.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-4-hreitz@redhat.com>

block: Collapse padded I/O vecs exceeding IOV_MAX

When processing vectored guest requests that are not aligned to the
storage request alignment, we pad them by adding head and/or tail
buffers for a read-modify-write cycle.

The guest can submit I/O vectors up to IOV_MAX (1024) in length, but
with this padding, the vector can exceed that limit.  As of
4c002cef0e9abe7135d7916c51abce47f7fc1ee2 ("util/iov: make
qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the
limit, instead returning an error to the guest.

To the guest, this appears as a random I/O error.  We should not return
an I/O error to the guest when it issued a perfectly valid request.

Before 4c002cef0e9abe7135d7916c51abce47f7fc1ee2, we just made the vector
longer than IOV_MAX, which generally seems to work (because the guest
assumes a smaller alignment than we really have, file-posix's
raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and
so emulate the request, so that the IOV_MAX does not matter).  However,
that does not seem exactly great.

I see two ways to fix this problem:
1. We split such long requests into two requests.
2. We join some elements of the vector into new buffers to make it
   shorter.

I am wary of (1), because it seems like it may have unintended side
effects.

(2) on the other hand seems relatively simple to implement, with
hopefully few side effects, so this patch does that.

To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request()
is effectively replaced by the new function bdrv_create_padded_qiov(),
which not only wraps the request IOV with padding head/tail, but also
ensures that the resulting vector will not have more than IOV_MAX
elements.  Putting that functionality into qemu_iovec_init_extended() is
infeasible because it requires allocating a bounce buffer; doing so
would require many more parameters (buffer alignment, how to initialize
the buffer, and out parameters like the buffer, its length, and the
original elements), which is not reasonable.

Conversely, it is not difficult to move qemu_iovec_init_extended()'s
functionality into bdrv_create_padded_qiov() by using public
qemu_iovec_* functions, so that is what this patch does.

Because bdrv_pad_request() was the only "serious" user of
qemu_iovec_init_extended(), the next patch will remove the latter
function, so the functionality is not implemented twice.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-3-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

util/iov: Make qiov_slice() public

We want to inline qemu_iovec_init_extended() in block/io.c for padding
requests, and having access to qiov_slice() is useful for this.  As a
public function, it is renamed to qemu_iovec_slice().

(We will need to count the number of I/O vector elements of a slice
there, and then later process this slice.  Without qiov_slice(), we
would need to call qemu_iovec_subvec_niov(), and all further
IOV-processing functions may need to skip prefixing elements to
accomodate for a qiov_offset.  Because qemu_iovec_subvec_niov()
internally calls qiov_slice(), we can just have the block/io.c code call
qiov_slice() itself, thus get the number of elements, and also create an
iovec array with the superfluous prefixing elements stripped, so the
following processing functions no longer need to skip them.)

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-2-hreitz@redhat.com>

hw/isa/i82378: Remove unused "io" attribute

The attribute isn't used since commit 5c9736789b79ea49cd236ac326f0a414f63b1015
"i82378: Cleanup implementation".

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20230523195608.125820-4-shentey@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

hw/arm/omap: Remove unused omap_uart_attach()

The function is unused since commit
bdad3654d3c55f478e538037d9eccd204e5fc8ee ('hw/arm/nseries: Remove
invalid/unnecessary n8x0_uart_setup()').

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230523195608.125820-3-shentey@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

hw/timer/i8254_common: Share "iobase" property via base class

Both TYPE_KVM_I8254 and TYPE_I8254 have their own but same implementation of
the "iobase" property. The storage for the property already resides in
PITCommonState, so also move the property definition there.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230523195608.125820-2-shentey@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Merge tag 'pull-loongarch-20230605' of https://gitlab.com/gaosong/qemu into staging

Fixes Coverity CID: 1512452, 1512453
Fixes: 78464f023b54 ("hw/loongarch/virt: Modify ipi as percpu device")

# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEIAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZH1THwAKCRBAov/yOSY+
# 37u0BADoodYhyV+jhfBCwKvZbE7yI3vvqcC1hdw8/jclLcIfYVsgUJ+qETmk9s/X
# h/WGevhwADb+fBcBIYEBWeFFfFdBlEDxua/ew0eFlRcB3HoIHRnJJT9L+rZ4ifNW
# b8HDhv/LjDj9dpQ449wpGlSOiSST3+AiCpKkH36KUOEfLP029Q==
# =3GIb
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 04 Jun 2023 08:14:39 PM PDT
# gpg:                using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C  6C2C 40A2 FFF2 3926 3EDF

* tag 'pull-loongarch-20230605' of https://gitlab.com/gaosong/qemu:
  hw/intc/loongarch_ipi: Bring back all 4 IPI mailboxes

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

hw/intc/loongarch_ipi: Bring back all 4 IPI mailboxes

As per "Loongson 3A5000/3B5000 Processor Reference Manual",
Loongson 3A5000's IPI implementation have 4 mailboxes per
core.

However, in 78464f023b54 ("hw/loongarch/virt: Modify ipi as
percpu device"), the number of IPI mailboxes was reduced to
one, which mismatches actual hardware.

It won't affect LoongArch based system as LoongArch boot code
only uses the first mailbox, however MIPS based Loongson boot
code uses all 4 mailboxes.

Fixes Coverity CID: 1512452, 1512453
Fixes: 78464f023b54 ("hw/loongarch/virt: Modify ipi as percpu device")
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230521102307.87081-2-jiaxun.yang@flygoat.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>

Merge tag 'migration-20230602-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20230602 vintage)

This PULL request get:
- All migration-test patches except last one (daniel)
- Documentation about live test cases (peter)

Please apply.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmR5yRwACgkQ9IfvGFhy
# 1yOLQQ/+NsrXEj7Bwp2PdGo+wBRkq4Gah/mc9F00pqtJc2CGNWgfgDohhZjBrhRv
# cTABsfEcIKgCYqGYwVCklJGlUMzxlJPPcMfvou5SWN59E4FBFSg4DWaBfDPCS8LW
# yjnz0JcpxJ+Ge0eqP6xpTPKQ0YGisdav/PjF8GZewBCjyrhZop062a92B2t59D8Y
# shJYKaZZU/5/4zx6KqOm9OClD/yJ+w5q6cGn89/rFE0RMSVywZ3Y1O8/LwIAEP6U
# oj88rczh3geGlsmtPIeyhA3BdnYuPonmyLz8CINFH9+y2tR9l1dN59q1uwEOIvff
# BhJvxTNmkTvsi5zeAmbp2CYmRTwhBmlanh8v2OLNj8zlt0cHYNpiYUZO9qxCHIyT
# LnNTTYhrpqAqINdm+Z8c3ymDKkTz0KECBa45hdFtNB4ZOXPDQTHVqkQRfe3CxDKz
# f/WM4TxHEzVMw/Ow1K9Kbk7/AEwIV6Ol2BSf9D+ZcU4ydmu6ENhV9G4cQ9Orlv8I
# opychxf+O/b6yhVFq7J1ufDhfn3aWQmUQC06npEgfrIV/fLrXhYfs2CXkNZs78v6
# MTMNPNBN/UasM8hx+ldsjZEHf625lO3eNWoNY1Xxog5YICnNLA+tG6n69uybew2+
# UOVyoHwX7iqaToK6bQNCS4H/PjCp3v7fzw1Nsz48Pfaklpivz/k=
# =4exy
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 02 Jun 2023 03:49:00 AM PDT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [unknown]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* tag 'migration-20230602-pull-request' of https://gitlab.com/juan.quintela/qemu:
  qtest/migration: Document live=true cases
  tests/qtest: make more migration pre-copy scenarios run non-live
  tests/qtest: distinguish src/dst migration VM stop/resume events
  tests/qtest: capture RESUME events during migration
  tests/qtest: replace wait_command() with qtest_qmp_assert_success
  tests/qtest: switch to using event callbacks for STOP event
  tests/qtest: get rid of some 'qtest_qmp' usage in migration test
  tests/qtest: get rid of 'qmp_command' helper in migration test
  tests/qtest: add support for callback to receive QMP events
  tests/qtest: add various qtest_qmp_assert_success() variants

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'pull-nbd-2023-06-01-v2' of https://repo.or.cz/qemu/ericb into staging

nbd and misc patches for 2023-06-01

- Eric Blake: Fix iotest 104 for NBD
- Eric Blake: Improve qcow2 spec on padding bytes
- Eric Blake: Fix read-beyond-bounds bug in qemu_strtosz

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmR6JzEACgkQp6FrSiUn
# Q2oGwgf+PIaN8iedQo5KR08OEf9YxJXab7nL5Oh12+ZvrPOt8XoJcd585KblQ1YI
# 3bGC4CO1l4QO3xmKltVHi7hnlX+3/8WMEvh0jBQBG1AjjPCi5Y1A/gGTEJFX60Ux
# /ffEpo8+1vaHQ8srkxBMWIvpF/dYRaMXSm/CP5SNqTllTalTR46YHKL9odXTzIeN
# 0Zu9UQw/Jwp5A9/8KB+0M9SYXA6zOEmEqEyOwVESEAU2Lm7titwqdBny6GZc6DH6
# Sa2lKO0qQA/e9ya6jHm2c9ycoNCtQ/2VR8QuCd6WCf9DX8q/9RhdiJir+EK5gqp6
# 3JRUFtx783d0BwPnUDUqPawi4txFtw==
# =Cg4e
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 02 Jun 2023 10:30:25 AM PDT
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]

* tag 'pull-nbd-2023-06-01-v2' of https://repo.or.cz/qemu/ericb: (21 commits)
  cutils: Improve qemu_strtosz handling of fractions
  cutils: Improve qemu_strtod* error paths
  cutils: Use parse_uint in qemu_strtosz for negative rejection
  cutils: Set value in all integral qemu_strto* error paths
  cutils: Set value in all qemu_strtosz* error paths
  test-cutils: Add more coverage to qemu_strtosz
  numa: Check for qemu_strtosz_MiB error
  cutils: Allow NULL str in qemu_strtosz
  test-cutils: Refactor qemu_strtosz tests for less boilerplate
  test-cutils: Prepare for upcoming semantic change in qemu_strtosz
  test-cutils: Add coverage of qemu_strtod
  cutils: Allow NULL endptr in parse_uint()
  cutils: Adjust signature of parse_uint[_full]
  cutils: Document differences between parse_uint and qemu_strtou64
  cutils: Fix wraparound parsing in qemu_strtoui
  test-cutils: Test more integer corner cases
  test-cutils: Test integral qemu_strto* value on failures
  test-cutils: Use g_assert_cmpuint where appropriate
  test-cutils: Avoid g_assert in unit tests
  qcow2: Explicit mention of padding bytes
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

cutils: Improve qemu_strtosz handling of fractions

We have several limitations and bugs worth fixing; they are
inter-related enough that it is not worth splitting this patch into
smaller pieces:

* ".5k" should work to specify 512, just as "0.5k" does
* "1.9999k" and "1." + "9"*50 + "k" should both produce the same
  result of 2048 after rounding
* "1." + "0"*350 + "1B" should not be treated the same as "1.0B";
  underflow in the fraction should not be lost
* "7.99e99" and "7.99e999" look similar, but our code was doing a
  read-out-of-bounds on the latter because it was not expecting ERANGE
  due to overflow. While we document that scientific notation is not
  supported, and the previous patch actually fixed
  qemu_strtod_finite() to no longer return ERANGE overflows, it is
  easier to pre-filter than to try and determine after the fact if
  strtod() consumed more than we wanted.  Note that this is a
  low-level semantic change (when endptr is not NULL, we can now
  successfully parse with a scale of 'E' and then report trailing
  junk, instead of failing outright with EINVAL); but an earlier
  commit already argued that this is not a high-level semantic change
  since the only caller passing in a non-NULL endptr also checks that
  the tail is whitespace-only.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1629
Fixes: cf923b78 ("utils: Improve qemu_strtosz() to have 64 bits of precision", 6.0.0)
Fixes: 7625a1ed ("utils: Use fixed-point arithmetic in qemu_strtosz", 6.0.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-20-eblake@redhat.com>
[eblake: tweak function comment for accuracy]

cutils: Improve qemu_strtod* error paths

Previous patches changed all integral qemu_strto*() error paths to
guarantee that *value is never left uninitialized.  Do likewise for
qemu_strtod.  Also, tighten qemu_strtod_finite() to never return a
non-finite value (prior to this patch, we were rejecting "inf" with
-EINVAL and unspecified result 0.0, but failing "9e999" with -ERANGE
and HUGE_VAL - which is infinite on IEEE machines - despite our
function claiming to recognize only finite values).

Auditing callers, we have no external callers of qemu_strtod, and
among the callers of qemu_strtod_finite:

- qapi/qobject-input-visitor.c:qobject_input_type_number_keyval() and
  qapi/string-input-visitor.c:parse_type_number() which reject all
  errors (does not matter what we store)

- utils/cutils.c:do_strtosz() incorrectly assumes that *endptr points
  to '.' on all failures (that is, it is not distinguishing between
  EINVAL and ERANGE; and therefore still does the WRONG THING for
  "9.9e999".  The change here does not entirely fix that (a later
  patch will tackle this more systematically), but at least it fixes
  the read-out-of-bounds first diagnosed in
  https://gitlab.com/qemu-project/qemu/-/issues/1629

- our testsuite, which we can update to match what we document

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
CC: qemu-stable@nongnu.org
Message-Id: <20230522190441.64278-19-eblake@redhat.com>

cutils: Use parse_uint in qemu_strtosz for negative rejection

Rather than open-coding two different ways to check for an unwanted
negative sign, reuse the same code in both functions.  That way, if we
decide down the road to accept "-0" instead of rejecting it, we have
fewer places to change.  Also, it means we now get ERANGE instead of
EINVAL for negative values in qemu_strtosz, which is reasonable for
what it represents.  This in turn changes the expected output of a
couple of iotests.

The change is not quite complete: negative fractional scaled values
can trip us up.  This will be fixed in a later patch addressing other
issues with fractional scaled values.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-18-eblake@redhat.com>

cutils: Set value in all integral qemu_strto* error paths

Our goal in writing qemu_strtoi() and friends is to have an interface
harder to abuse than libc's strtol(). Leaving the return value
uninitialized on some but not all error paths does not lend itself
well to this goal; and our documentation wasn't helpful on what to
expect.

Note that the previous patch changed all qemu_strtosz() EINVAL error
paths to slam value to 0 rather than stay uninitialized, even when the
EINVAL eror occurs because of trailing junk. But for the remaining
integral qemu_strto*, it's easier to return the parsed value than to
force things back to zero, in part because of how check_strtox_error
works; in part because people expect that from libc strto* (while
there is no libc strtosz to compare to), and in part because doing so
creates less churn in the testsuite.

Here, the list of affected callers is much longer ('git grep
"qemu_strto[ui]" "*.c" "**/*.c" | grep -v tests/ |wc -l' outputs 107,
although a few of those are the implementation in in cutils.c), so
touching as little as possible is the wisest course of action.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-17-eblake@redhat.com>

cutils: Set value in all qemu_strtosz* error paths

Making callers determine whether or not *value was populated on error
is not nice for usability.  Pre-patch, we have unit tests that check
that *result is left unchanged on most EINVAL errors and set to 0 on
many ERANGE errors.  This is subtly different from libc strtoumax()
behavior which returns UINT64_MAX on ERANGE errors, as well as
different from our parse_uint() which slams to 0 on EINVAL on the
grounds that we want our functions to be harder to mis-use than
strtoumax().

Let's audit callers:

- hw/core/numa.c:parse_numa() fixed in the previous patch to check for
  errors

- migration/migration-hmp-cmds.c:hmp_migrate_set_parameter(),
  monitor/hmp.c:monitor_parse_arguments(),
  qapi/opts-visitor.c:opts_type_size(),
  qapi/qobject-input-visitor.c:qobject_input_type_size_keyval(),
  qemu-img.c:cvtnum_full(), qemu-io-cmds.c:cvtnum(),
  target/i386/cpu.c:x86_cpu_parse_featurestr(), and
  util/qemu-option.c:parse_option_size() appear to reject all failures
  (although some with distinct messages for ERANGE as opposed to
  EINVAL), so it doesn't matter what is in the value parameter on
  error.

- All remaining callers are in the testsuite, where we can tweak our
  expectations to match our new desired behavior.

Advancing to the end of the string parsed on overflow (ERANGE), while
still returning 0, makes sense (UINT64_MAX as a size is unlikely to be
useful); likewise, our size parsing code is complex enough that it's
easier to always return 0 when endptr is NULL but trailing garbage was
found, rather than trying to return the value of the prefix actually
parsed (no current caller cared about the value of the prefix).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-16-eblake@redhat.com>

test-cutils: Add more coverage to qemu_strtosz

Add some more strings that the user might send our way. In
particular, some of these additions include FIXME comments showing
where our parser doesn't quite behave the way we want.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-15-eblake@redhat.com>

numa: Check for qemu_strtosz_MiB error

As shown in the previous commit, qemu_strtosz_MiB sometimes leaves the
result value untouched (we have to audit further to learn that in that
case, the QAPI generator says that visit_type_NumaOptions() will have
zero-initialized it), and sometimes leaves it with the value of a
partial parse before -EINVAL occurs because of trailing garbage.
Rather than blindly treating any string the user may throw at us as
valid, we should check for parse failures.

Fixes: cc001888 ("numa: fixup parsed NumaNodeOptions earlier", v2.11.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-14-eblake@redhat.com>

cutils: Allow NULL str in qemu_strtosz

All the other qemu_strto* and parse_uint allow a NULL str. Having
qemu_strtosz not crash on qemu_strtosz(NULL, NULL, &value) is an easy
fix that adds some consistency between our string parsers.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-13-eblake@redhat.com>

test-cutils: Refactor qemu_strtosz tests for less boilerplate

No need to copy-and-paste lots of boilerplate per string tested, when
we can consolidate that behind helper functions.  Plus, this adds a
bit more coverage (we now test all strings both with and without
endptr, whereas before some tests skipped the NULL endptr case), which
exposed a SEGFAULT on qemu_strtosz(NULL, NULL, &val) that will be
fixed in an upcoming patch.

Note that duplicating boilerplate has one advantage lost here - a
failed test tells you which line number failed; but a helper function
does not show the call stack that reached the failure.  Since we call
the helper more than once within many of the "unit tests", even the
unit test name doesn't point out which call is failing.  But that only
matters when tests fail (they normally pass); at which point I'm
debugging the failures under gdb anyways, so I'm not too worried about
it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-12-eblake@redhat.com>

test-cutils: Prepare for upcoming semantic change in qemu_strtosz

A quick search for 'qemu_strtosz' in the code base shows that outside
of the testsuite, the ONLY place that passes a non-NULL pointer to
@endptr of any variant of a size parser is in hmp.c (the 'o' parser of
monitor_parse_arguments), and that particular caller warns of
"extraneous characters at the end of line" unless the trailing bytes
are purely whitespace. Thus, it makes no semantic difference at the
high level whether we parse "1.5e1k" as "1" + ".5e1" + "k" (an attempt
to use scientific notation in strtod with a scaling suffix of 'k' with
no trailing junk, but which qemu_strtosz says should fail with
EINVAL), or as "1.5e" + "1k" (a valid size with scaling suffix of 'e'
for exabytes, followed by two junk bytes) - either way, any user
passing such a string will get an error message about a parse failure.

However, an upcoming patch to qemu_strtosz will fix other corner case
bugs in handling the fractional portion of a size, and in doing so, it
is easier to declare that qemu_strtosz() itself stops parsing at the
first 'e' rather than blindly consuming whatever strtod() will
recognize. Once that is fixed, the difference will be visible at the
low level (getting a valid parse with trailing garbage when @endptr is
non-NULL, while continuing to get -EINVAL when @endptr is NULL); this
is easier to demonstrate by moving the affected strings from
test_qemu_strtosz_invalid() (which declares them as always -EINVAL) to
test_qemu_strtosz_trailing() (where @endptr affects behavior, for now
with FIXME comments).

Note that a similar argument could be made for having "0x1.5" or
"0x1M" parse as 0x1 with ".5" or "M" as trailing junk, instead of
blindly treating it as -EINVAL; however, as these cases do not suffer
from the same problems as floating point, they are not worth changing
at this time.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-11-eblake@redhat.com>

test-cutils: Add coverage of qemu_strtod

It's hard to tweak code for consistency if I can't prove what will or
won't break from those tweaks.  Time to add unit tests for
qemu_strtod() and qemu_strtod_finite().

Among other things, I wrote a check whether we have C99 semantics for
strtod("0x1") (which MUST parse hex numbers) rather than C89 (which
must stop parsing at 'x').  These days, I suspect that is okay; but if
it fails CI checks, knowing the difference will help us decide what we
want to do about it.  Note that C2x, while not final at the time of
this patch, has been considering whether to make strtol("0b1") parse
as 1 with no slop instead of the C17 parse of 0 with slop "b1"; that
decision may also bleed over to strtod().  But for now, I didn't think
it worth adding unit tests on that front (to strtol or strtod) as
things may still change.

Likewise, there are plenty more corner cases of strtod proper that I
don't explicitly test here, but there are enough unit tests added here
that it covers all the branches reached in our wrappers.  In
particular, it demonstrates the difference on when *value is left
uninitialized, which an upcoming patch will normalize.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-10-eblake@redhat.com>

cutils: Allow NULL endptr in parse_uint()

All the qemu_strto*() functions permit a NULL endptr, just like their
libc counterparts, leaving parse_uint() as the oddball that caused
SEGFAULT on NULL and required the user to call parse_uint_full()
instead.  Relax things for consistency, even though the testsuite is
the only impacted caller.  Add one more unit test to ensure even
parse_uint_full(NULL, 0, &value) works.  This also fixes our code to
uniformly favor EINVAL over ERANGE when both apply.

Also fixes a doc mismatch @v vs. a parameter named value.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-9-eblake@redhat.com>

cutils: Adjust signature of parse_uint[_full]

It's already confusing that we have two very similar functions for
wrapping the parse of a 64-bit unsigned value, differing mainly on
whether they permit leading '-'.  Adjust the signature of parse_uint()
and parse_uint_full() to be like all of qemu_strto*(): put the result
parameter last, use the same types (uint64_t and unsigned long long
have the same width, but are not always the same type), and mark
endptr const (this latter change only affects the rare caller of
parse_uint).  Adjust all callers in the tree.

While at it, note that since cutils.c already includes:

    QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long));

we are guaranteed that the result of parse_uint* cannot exceed
UINT64_MAX (or the build would have failed), so we can drop
pre-existing dead comparisons in opts-visitor.c that were never false.

Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-8-eblake@redhat.com>
[eblake: Drop dead code spotted by Markus]
Signed-off-by: Eric Blake <eblake@redhat.com>

cutils: Document differences between parse_uint and qemu_strtou64

These two functions are subtly different, and not just because of
swapped parameter order. It took me adding better unit tests to
figure out why. Document the differences to make it more obvious to
developers trying to pick which one to use, as well as to aid in
upcoming semantic changes.

While touching the documentation, adjust a mis-statement: parse_uint
does not return -EINVAL on invalid base, but assert()s, like all the
other qemu_strto* functions that take a base argument.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-7-eblake@redhat.com>

cutils: Fix wraparound parsing in qemu_strtoui

While we were matching 32-bit strtol in qemu_strtoi, our use of a
64-bit parse was leaking through for some inaccurate answers in
qemu_strtoui in comparison to a 32-bit strtoul (see the unit test for
examples). The comment for that function even described what we have
to do for a correct parse, but didn't implement it correctly: since
strtoull checks for overflow against the wrong values and then
negates, we have to temporarily undo negation before checking for
overflow against our desired value.

Our int wrappers would be a lot easier to write if libc had a
guaranteed 32-bit parser even on platforms with 64-bit long.

Whether we parse C2x binary strings like "0b1000" is currently up to
what libc does; our unit tests intentionally don't cover that at the
moment, though.

Fixes: 473a2a331e ("cutils: add qemu_strtoi & qemu_strtoui parsers for int/unsigned int types", v2.12.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Message-Id: <20230522190441.64278-6-eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>

test-cutils: Test more integer corner cases

We have quite a few undertested and underdocumented integer parsing
corner cases.  To ensure that any changes we make in the code are
intentional rather than accidental semantic changes, it is time to add
more unit tests of existing behavior.

In particular, this demonstrates that parse_uint() and qemu_strtou64()
behave differently.  For "-0", it's hard to argue why parse_uint needs
to reject it (it's not a negative integer), but the documentation sort
of mentions it; but it is intentional that all other negative values
are treated as ERANGE with value 0 (compared to qemu_strtou64()
treating "-2" as success and UINT64_MAX-1, for example).

Also, when mixing overflow/underflow with a check for no trailing
junk, parse_uint_full favors ERANGE over EINVAL, while qemu_strto[iu]*
favor EINVAL.  This behavior is outside the C standard, so we can pick
whatever we want, but it would be nice to be consistent.

Note that C requires that "9223372036854775808" fail strtoll() with
ERANGE/INT64_MAX, but "-9223372036854775808" pass with INT64_MIN; we
weren't testing this.  For strtol(), the behavior depends on whether
long is 32- or 64-bits (the cutoff point either being the same as
strtoll() or at "-2147483648").  Meanwhile, C is clear that
"-18446744073709551615" pass stroull() (but not strtoll) with value 1,
even though we want it to fail parse_uint().  And although
qemu_strtoui() has no C counterpart, it makes more sense if we design
it like 32-bit strtoul() (that is, where "-4294967296" be an alternate
acceptable spelling for "1", but "-0xffffffff00000001" should be
treated as overflow and return 0xffffffff rather than 1).  We aren't
there yet, so some of the tests added in this patch have FIXME
comments.

However, note that C2x will (likely) be adding a SILENT semantic
change, where C17 strtol("0b1", &ep, 2) returns 0 with ep="b1", but
C2x will have it return 1 with ep="".  I did not feel like adding
testing for those corner cases, in part because the next version of C
is not standard and libc support for binary parsing is not yet
wide-spread (as of this patch, glibc.git still misparses bare "0b":
https://sourceware.org/bugzilla/show_bug.cgi?id=30371).

Message-Id: <20230522190441.64278-5-eblake@redhat.com>
[eblake: fix a few typos spotted by Hanna]
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
[eblake: fix typo on platforms with 32-bit long]
Signed-off-by: Eric Blake <eblake@redhat.com>

test-cutils: Test integral qemu_strto* value on failures

We are inconsistent on the contents of *value after a strto* parse
failure.  I found the following behaviors:

- parse_uint() and parse_uint_full(), which document that *value is
  slammed to 0 on all EINVAL failures and 0 or UINT_MAX on ERANGE
  failures, and has unit tests for that (note that parse_uint requires
  non-NULL endptr, and does not fail with EINVAL for trailing junk)

- qemu_strtosz(), which leaves *value untouched on all failures (both
  EINVAL and ERANGE), and has unit tests but not documentation for
  that

- qemu_strtoi() and other integral friends, which document *value on
  ERANGE failures but is unspecified on EINVAL (other than implicitly
  by comparison to libc strto*); there, *value is untouched for NULL
  string, slammed to 0 on no conversion, and left at the prefix value
  on NULL endptr; unit tests do not consistently check the value

- qemu_strtod(), which documents *value on ERANGE failures but is
  unspecified on EINVAL; there, *value is untouched for NULL string,
  slammed to 0.0 for no conversion, and left at the prefix value on
  NULL endptr; there are no unit tests (other than indirectly through
  qemu_strtosz)

- qemu_strtod_finite(), which documents *value on ERANGE failures but
  is unspecified on EINVAL; there, *value is left at the prefix for
  'inf' or 'nan' and untouched in all other cases; there are no unit
  tests (other than indirectly through qemu_strtosz)

Upcoming patches will change behaviors for consistency, but it's best
to first have more unit test coverage to see the impact of those
changes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-4-eblake@redhat.com>

test-cutils: Use g_assert_cmpuint where appropriate

When debugging test failures, seeing unsigned values as large positive
values rather than negative values matters (assuming glib 2.78+; given
that I just fixed a bug in glib 2.76 [1] where g_assert_cmpuint
displays signed instead of unsigned values). No impact when the test
is passing, but using a consistent style will matter more in upcoming
test additions. Also, some tests are better with cmphex.

While at it, fix some spacing and minor typing issues spotted nearby.

[1] https://gitlab.gnome.org/GNOME/glib/-/issues/2997

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230522190441.64278-3-eblake@redhat.com>

test-cutils: Avoid g_assert in unit tests

glib documentation[1] is clear: g_assert() should be avoided in unit
tests because it is ineffective if G_DISABLE_ASSERT is defined; unit
tests should stick to constructs based on g_assert_true() instead.
Note that since commit 262a69f428, we intentionally state that you
cannot define G_DISABLE_ASSERT while building qemu; but our code can
be copied to other projects without that restriction, so we should be
consistent.

For most of the replacements in this patch, using g_assert_cmpstr()
would be a regression in quality - although it would helpfully display
the string contents of both pointers on test failure, here, we really
do care about pointer equality, not just string content equality. But
when a NULL pointer is expected, g_assert_null works fine.

[1] https://libsoup.org/glib/glib-Testing.html#g-assert

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230522190441.64278-2-eblake@redhat.com>

qcow2: Explicit mention of padding bytes

Although we already covered the need for padding bytes with our
changes in commit 3ae3fcfa, commit 66fcbca5 (both v5.0.0) added one
byte and relied on the rest of the text for implicitly covering 7
padding bytes. For consistency with other parts of the header (such
as the header extension format listing padding from n - m, or the
snapshot table entry listing variable padding), we might as well call
out the remaining 7 bytes as padding until such time (as any) as they
gain another meaning.

Signed-off-by: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230522184631.47211-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

iotests: Fix test 104 under NBD

In the past, commit a231cb27 ("iotests: Fix 104 for NBD", v2.3.0)
added an additional filter to _filter_img_info to rewrite NBD URIs
into the expected output form. This recently broke when we tweaked
tests to run in a per-format directory, which did not match the regex,
because _img_info itself is now already changing
SOCK_DIR=/tmp/tmpphjfbphd/raw-nbd-104 into
/tmp/tmpphjfbphd/IMGFMT-nbd-104 prior to _img_info_filter getting a
chance to further filter things.

While diagnosing the problem, I also noticed some filter lines
rendered completely useless by a typo when we switched from TCP to
Unix sockets for NBD (in shell, '\\+' is different from "\\+" (one
gives two backslash to the regex, matching the literal 2-byte sequence
<\+> after a single digit; the other gives one backslash to the regex,
as the metacharacter \+ to match one or more of <[0-9]>); since the
literal string <nbd://127.0.0.1:0\+> is not a valid URI, that regex
hasn't been matching anything for years so it is fine to just drop it
rather than fix the typo.

Fixes: f3923a72 ("iotests: Switch nbd tests to use Unix rather than TCP", v4.2.0)
Fixes: 5ba7db09 ("iotests: always use a unique sub-directory per test", v8.0.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230519150216.2599189-1-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

qtest/migration: Document live=true cases

Document every single live=true use cases on why it should be done in the
live manner. Also document on the parameter so new precopy cases should
always use live=off unless with explicit reasonings.

Cc: Thomas Huth <thuth@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230601172935.175726-1-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: make more migration pre-copy scenarios run non-live

There are 27 pre-copy live migration scenarios being tested. In all of
these we force non-convergence and run for one iteration, then let it
converge and wait for completion during the second (or following)
iterations. At 3 mbps bandwidth limit the first iteration takes a very
long time (~30 seconds).

While it is important to test the migration passes and convergence
logic, it is overkill to do this for all 27 pre-copy scenarios. The
TLS migration scenarios in particular are merely exercising different
code paths during connection establishment.

To optimize time taken, switch most of the test scenarios to run
non-live (ie guest CPUs paused) with no bandwidth limits. This gives
a massive speed up for most of the test scenarios.

For test coverage the following scenarios are unchanged

* Precopy with UNIX sockets
* Precopy with UNIX sockets and dirty ring tracking
* Precopy with XBZRLE
* Precopy with UNIX compress
* Precopy with UNIX compress (nowait)
* Precopy with multifd

On a test machine this reduces execution time from 13 minutes to
8 minutes.

Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-10-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: distinguish src/dst migration VM stop/resume events

The 'got_stop' and 'got_resume' global variables apply to the src and
dst migration VM respectively. Change their names to make this explicit
to developers.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-9-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: capture RESUME events during migration

When running migration tests we monitor for a STOP event so we can skip
redundant waits. This will be needed for the RESUME event too shortly.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-8-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: replace wait_command() with qtest_qmp_assert_success

Most usage of wait_command() is followed by qobject_unref(), which
is just a verbose re-implementation of qtest_qmp_assert_success().

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-7-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: switch to using event callbacks for STOP event

Change the migration test to use the new qtest event callback to watch
for the stop event. This ensures that we only watch for the STOP event
on the source QEMU. The previous code would set the single 'got_stop'
flag when either source or dest QEMU got the STOP event.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-6-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: get rid of some 'qtest_qmp' usage in migration test

Some of the usage is just a verbose way of re-inventing the
qtest_qmp_assert_success(_ref) methods.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-5-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: get rid of 'qmp_command' helper in migration test

This function duplicates logic of qtest_qmp_assert_success_ref.
The qtest_qmp_assert_success_ref method has better diagnostics
on failure because it prints the entire QMP response, instead
of just asserting on existance of the 'error' key.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-4-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: add support for callback to receive QMP events

Currently code must call one of the qtest_qmp_event* functions to
fetch events. These are only usable if the immediate caller knows
the particular event they want to capture, and are only interested
in one specific event type. Adding ability to register an event
callback lets the caller capture a range of events over any period
of time.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-3-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/qtest: add various qtest_qmp_assert_success() variants

Add several counterparts of qtest_qmp_assert_success() that can

* Use va_list instead of ...
* Accept a list of FDs to send
* Return the response data

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-2-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Merge tag 'migration-20230601-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20230601)

Hi

In this series:
- improve background migration (fiona)
- improve vmstate failure states (vladimir)
- dropped all the RDMA cleanups

Please, apply.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmR5LAoACgkQ9IfvGFhy
# 1yPOuxAA5QzUfswCIWehrkY8FApDjsecPDq5R6hS1p5pDZkHTF5y6j49J93I65o2
# E4qr4l0+DJvBvnxTW29JQ1i0RPqJHnJBFC9Ib4o0NaA/7iRP1sTwYxIN4wWZz6H/
# pqG3oQC0WPPqgj9tHSgDW4TNkFjETYaezTN8nddNmyiaO9UxNuR5ZKbeYMroVlfp
# KbnAYfXV6CyXKUZFT32BYcajYBDZAqBCO6y3gEn77KPlT1/TqnucoYEVuNudq5SE
# jeCamTzoAQ6SIRFM/eY+aASxdsSryqDS/WLqBFsleXs1kkJ6mkDnNels4HqS+xs9
# p2Vhv/59ktoC57XsRgTgzEklAaSHunZivcQkc5szyGVE5TZyKtWg8WhA6rvlqbjK
# lb3kKpvtVi73+pAWU0hhKFdnCrB6ieCHI70CJ5mpiIu3MzLUyrNJOK4FoKNoJDOD
# Dp45DK+W/EMg51pXyHJZZqHM1p0GGj0fmhv5T05nJ590fIWV4iqDdHFxsMZ9vEPN
# iEvAB7/pXz+yECznDFrp2e047rshOGaKKNSW3zl3/7D32Ds2FKur76dL4BoymztW
# HHLxmRWn8HmHMoKYLoawWVmCBsDqy8BFct+rHbA6h/0nSCPYIUCmMrSYVqajUbXD
# Ulkh4KNQGoBCzp5Toa0dYEXVc891wVOw4k8PTARwf2OYskSkT88=
# =Km5X
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Jun 2023 04:38:50 PM PDT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [unknown]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* tag 'migration-20230601-pull-request' of https://gitlab.com/juan.quintela/qemu:
  migration: stop tracking ram writes when cancelling background migration
  migration: restore vmstate on migration failure
  migration: switch from .vm_was_running to .vm_old_state
  runstate: drop unused runstate_store()
  migration: never fail in global_state_store()
  runstate: add runstate_get()

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

migration: stop tracking ram writes when cancelling background migration

Currently, it is only done when the iteration finishes successfully.
Not cleaning up the userfaultfd write protection can lead to
symptoms/issues such as the process hanging in memmove or GDB not
being able to attach.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20230526115908.196171-1-f.ebner@proxmox.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

migration: restore vmstate on migration failure

1. Otherwise failed migration just drops guest-panicked state, which is
   not good for management software.

2. We do keep different paused states like guest-panicked during
   migration with help of global_state state.

3. We do restore running state on source when migration is cancelled or
   failed.

4. "postmigrate" state is documented as "guest is paused following a
   successful 'migrate'", so originally it's only for successful path
   and we never documented current behavior.

Let's restore paused states like guest-panicked in case of cancel or
fail too. Allow same transitions like for inmigrate state.

This commit changes the behavior that was introduced by commit
42da5550d6 "migration: set state to post-migrate on failure" and
provides a bit different fix on related
  https://bugzilla.redhat.com/show_bug.cgi?id=1355683

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-6-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>

migration: switch from .vm_was_running to .vm_old_state

No logic change here, only refactoring. That's a preparation for next
commit where we finally restore the stopped vm state on migration
failure or cancellation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-5-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>

runstate: drop unused runstate_store()

The function is unused since previous commit. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-4-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>

migration: never fail in global_state_store()

Actually global_state_store() can never fail. Let's get rid of extra
error paths.

To make things clear, use new runstate_get() and use same approach for
global_state_store() and global_state_store_running().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-3-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>

runstate: add runstate_get()

It's necessary to restore the state after failed/cancelled migration in
further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-2-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

- Stefano Garzarella's blkio block driver 'fd' parameter
- My thread-local blk_io_plug() series

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmR4uHoACgkQnKSrs4Gr
# c8hFBAgAo+SFrOteYgdELM9s0EWb0AU39MTOyNXW7i5mPZNXrn5J7pfRD/5wvI6l
# wl5GNMQ+M5HVYO7CumKWr4M1IpKV5Jin6FN/2h15fWkeg17lBOmNHUF+LctLYQbq
# HwtNA4hdw1+SEv8kQLBgiqSJMqWcn80X09emgPMCIwET9zxokRYwVjQJx2alM5bd
# SqgitDp5qlHyj5HQPX2orT9KrXYWQdGr8i50bn0S67r1wdqTRMu93wrWdEUUncId
# 7otlUaq8cARbRMJzIwDmy/cF24Ynr0wCJb4aHW+trRtf+PNgx1Ki+YOiz+LFyjq7
# t6KOMeignzhz9Uzq8EVG4XW8SHpGkw==
# =Ms48
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Jun 2023 08:25:46 AM PDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]

* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
  qapi: add '@fdset' feature for BlockdevOptionsVirtioBlkVhostVdpa
  block/blkio: use qemu_open() to support fd passing for virtio-blk
  block: remove bdrv_co_io_plug() API
  block/linux-aio: convert to blk_io_plug_call() API
  block/io_uring: convert to blk_io_plug_call() API
  block/blkio: convert to blk_io_plug_call() API
  block/nvme: convert to blk_io_plug_call() API
  block: add blk_io_plug_call() API

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

This pull request contains Alex Bennée's vcpu trace events removal patches.

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmR4tAMACgkQnKSrs4Gr
# c8ht/AgAiVslnH4vmD5IZloBHVRNEZKifODZbHW75yDgIirj/MhqlXPZ7bWoGwTN
# MLsTVuihhYnJBQKknN7lKyhkoQjgiJSkYhQbXSlcN7T3UE0+iG47FSudYTLDZSov
# M5wu1Edzi4q1uWr7ZIn/NS39iHVvQ7fdDMosHQmI0HKl25yx5936c0T2A4yyj96e
# LEtg4wLKo1uRgEMvCWrpiDz8ohNVwexAxCggwHE17tCebBmik+2cBEWAS+fcTbSr
# Nx3yWRat5VbqHOe3ghudLMNXHySQjNYrexULOVzyUUoaqUDt2eWCr9A4312BflEl
# 8U9FFl99BZX5rWkyUzsHxEmPlRsazQ==
# =oMRe
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Jun 2023 08:06:43 AM PDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]

* tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu:
  accel/tcg: include cs_base in our hash calculations
  hw/9pfs: use qemu_xxhash4
  tcg: remove the final vestiges of dstate
  trace: remove control-vcpu.h
  trace: remove code that depends on setting vcpu
  qapi: make the vcpu parameters deprecated for 8.1
  docs/deprecated: move QMP events bellow QMP command section
  scripts/qapi: document the tool that generated the file
  trace: remove vcpu_id from the TraceEvent structure
  trace-events: remove the remaining vcpu trace events
  *-user: remove the guest_user_syscall tracepoints

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

qapi: add '@fdset' feature for BlockdevOptionsVirtioBlkVhostVdpa

The virtio-blk-vhost-vdpa driver in libblkio 1.3.0 supports the fd
passing through the new 'fd' property.

Since now we are using qemu_open() on '@path' if the virtio-blk driver
supports the fd passing, let's announce it.
In this way, the management layer can pass the file descriptor of an
already opened vhost-vdpa character device. This is useful especially
when the device can only be accessed with certain privileges.

Add the '@fdset' feature only when the virtio-blk-vhost-vdpa driver
in libblkio supports it.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230530071941.8954-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block/blkio: use qemu_open() to support fd passing for virtio-blk

Some virtio-blk drivers (e.g. virtio-blk-vhost-vdpa) supports the fd
passing. Let's expose this to the user, so the management layer
can pass the file descriptor of an already opened path.

If the libblkio virtio-blk driver supports fd passing, let's always
use qemu_open() to open the `path`, so we can handle fd passing
from the management layer through the "/dev/fdset/N" special path.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230530071941.8954-2-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

accel/tcg: include cs_base in our hash calculations

We weren't using cs_base in the hash calculations before. Since the
arm front end moved a chunk of flags in a378206a20 (target/arm: Move
mode specific TB flags to tb->cs_base) they comprise of an important
part of the execution state.

Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8()
to accommodate it.

My initial benchmark shows very little difference in the
runtime.

Before:

armhf

➜  hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot"
Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.627 s ±  2.708 s    [User: 34.309 s, System: 1.797 s]
  Range (min … max):   22.345 s … 29.864 s    20 runs

arm64

➜  hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.559 s ±  2.917 s    [User: 189.115 s, System: 4.089 s]
  Range (min … max):   59.997 s … 70.153 s    10 runs

After:

armhf

Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.223 s ±  2.151 s    [User: 34.284 s, System: 1.906 s]
  Range (min … max):   22.000 s … 28.476 s    20 runs

arm64

hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.769 s ±  1.978 s    [User: 188.431 s, System: 5.269 s]
  Range (min … max):   60.285 s … 66.868 s    10 runs

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230526165401.574474-12-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

hw/9pfs: use qemu_xxhash4

No need to pass zeros as we have helpers that do that for us.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-11-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-10-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

tcg: remove the final vestiges of dstate

Now we no longer have dynamic state affecting things we can remove the
additional fields in cpu.h and simplify the TB hash calculation.

For the benchmark:

    hyperfine -w 2 -m 20 \
      "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \
        -machine type=virt,highmem=off \
        -display none -m 2048 \
        -serial mon:stdio \
        -netdev user,id=unet,hostfwd=tcp::2222-:22 \
        -device virtio-net-pci,netdev=unet \
        -device virtio-scsi-pci \
        -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \
        -device scsi-hd,drive=hd -smp 4 \
        -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \
        -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \
        -snapshot"

It has a marginal effect on runtime, before:

  Time (mean ± σ):     26.279 s ±  2.438 s    [User: 41.113 s, System: 1.843 s]
  Range (min … max):   24.420 s … 32.565 s    20 runs

after:

  Time (mean ± σ):     24.440 s ±  2.885 s    [User: 34.474 s, System: 2.028 s]
  Range (min … max):   21.663 s … 29.937 s    20 runs

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-10-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace: remove control-vcpu.h

Now we no longer have vcpu controlled trace events we can excise the
code that allows us to query its status.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-9-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-8-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace: remove code that depends on setting vcpu

Now we no longer have any events that are for vcpus we can start
excising the code from the trace control. As the vcpu parameter is
encoded as part of QMP we just stub out the has_vcpu/vcpu parameters
rather than alter the API.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-8-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-7-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

qapi: make the vcpu parameters deprecated for 8.1

I don't think I can remove the parameters directly but certainly mark
them as deprecated.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-7-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-6-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

docs/deprecated: move QMP events bellow QMP command section

Also rename the section to make the fact this is part of the
management protocol even clearer.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-6-alex.bennee@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

scripts/qapi: document the tool that generated the file

This makes it a little easier for developers to find where things
where being generated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-5-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-5-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>