OSDN Git Service

qmiga/qemu.git
10 months agotarget/riscv/cpu.c: consider user option with RVG
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:23 +0000 (10:24 -0300)]
target/riscv/cpu.c: consider user option with RVG

Enabling RVG will enable a set of extensions that we're not checking if
the user was okay enabling or not. And in this case we want to error
out, instead of ignoring, otherwise we will be inconsistent enabling RVG
without all its extensions.

After this patch, disabling ifencei or icsr while enabling RVG will
result in error:

$ ./build/qemu-system-riscv64 -M virt -cpu rv64,g=true,Zifencei=false --nographic
qemu-system-riscv64: RVG requires Zifencei but user set Zifencei to false

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-21-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: honor user choice in cpu_cfg_ext_auto_update()
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:22 +0000 (10:24 -0300)]
target/riscv/cpu.c: honor user choice in cpu_cfg_ext_auto_update()

Add a new cpu_cfg_ext_is_user_set() helper to check if an extension was
set by the user in the command line. Use it inside
cpu_cfg_ext_auto_update() to verify if the user set a certain extension
and, if that's the case, do not change its value.

This will make us honor user choice instead of overwriting the values.
Users will then be informed whether they're using an incompatible set of
extensions instead of QEMU setting a magic value that works.

The reason why we're not implementing user choice for MISA extensions
right now is because, today, we do not silently change any MISA bit
during realize() time (we do warn when enabling bits if RVG is enabled).
We do that - a lot - with multi-letter extensions though, so we're
handling the most immediate concern first.

After this patch, we'll now error out if the user explicitly set 'zce' to true
and 'zca' to false:

$ ./build/qemu-system-riscv64 -M virt -cpu rv64,zce=true,zca=false -nographic
qemu-system-riscv64: Zcf/Zcd/Zcb/Zcmp/Zcmt extensions require Zca extension

This didn't happen before because we were enabling 'zca' if 'zce' was enabled
regardless if the user set 'zca' to false.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-20-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv: use isa_ext_update_enabled() in init_max_cpu_extensions()
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:21 +0000 (10:24 -0300)]
target/riscv: use isa_ext_update_enabled() in init_max_cpu_extensions()

Before adding support to detect if an extension was user set we need to
handle how we're enabling extensions in riscv_init_max_cpu_extensions().
object_property_set_bool() calls the set() callback for the property,
and we're going to use this callback to set the 'multi_ext_user_opts'
hash.

This means that, as is today, all extensions we're setting for the 'max'
CPU will be seen as user set in the future. Let's change set_bool() to
isa_ext_update_enabled() that will just enable/disable the flag on a
certain offset.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-19-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: introduce RISCVCPUMultiExtConfig
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:20 +0000 (10:24 -0300)]
target/riscv/cpu.c: introduce RISCVCPUMultiExtConfig

If we want to make better decisions when auto-enabling extensions during
realize() we need a way to tell if an user set an extension manually.
The RISC-V KVM driver has its own solution via a KVMCPUConfig struct
that has an 'user_set' flag that is set during the Property set()
callback. The set() callback also does init() time validations based on
the current KVM driver capabilities.

For TCG we would want a 'user_set' mechanic too, but we would look
ad-hoc via cpu_cfg_ext_auto_update() if a certain extension was user set
or not. If we copy what was made in the KVM side we would look for
'user_set' for one into 60+ extension structs spreaded in 3 arrays
(riscv_cpu_extensions, riscv_cpu_experimental_exts,
riscv_cpu_vendor_exts).

We'll still need an extension struct but we won't be using the
'user_set' flag:

- 'RISCVCPUMultiExtConfig' will be our specialized structure, similar to what
we're already doing with the MISA extensions in 'RISCVCPUMisaExtConfig'.
DEFINE_PROP_BOOL() for all 3 extensions arrays were replaced by
MULTI_EXT_CFG_BOOL(), a macro that will init our specialized struct;

- the 'multi_ext_user_opts' hash will be used to store the offset of each
extension that the user set via the set() callback, cpu_set_multi_ext_cfg().
For now we're just initializing and populating it - next patch will use
it to determine if a certain extension was user set;

- cpu_add_multi_ext_prop() is a new helper that will replace the
qdev_property_add_static() calls that our macros are doing to populate
user properties. The macro was renamed to ADD_CPU_MULTIEXT_PROPS_ARRAY()
for clarity. Note that the non-extension properties in
riscv_cpu_options[] still need to be declared via qdev().

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-18-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: use cpu_cfg_ext_auto_update() during realize()
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:19 +0000 (10:24 -0300)]
target/riscv/cpu.c: use cpu_cfg_ext_auto_update() during realize()

Let's change the other instances in realize() where we're enabling an
extension based on a certain criteria (e.g. it's a dependency of another
extension).

We're leaving icsr and ifencei being enabled during RVG for later -
we'll want to error out in that case. Every other extension enablement
during realize is now done via cpu_cfg_ext_auto_update().

The end goal is that only cpu init() functions will handle extension
flags directly via "cpu->cfg.ext_N = true|false".

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-17-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: introduce cpu_cfg_ext_auto_update()
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:18 +0000 (10:24 -0300)]
target/riscv/cpu.c: introduce cpu_cfg_ext_auto_update()

During realize() time we're activating a lot of extensions based on some
criteria, e.g.:

    if (cpu->cfg.ext_zk) {
        cpu->cfg.ext_zkn = true;
        cpu->cfg.ext_zkr = true;
        cpu->cfg.ext_zkt = true;
    }

This practice resulted in at least one case where we ended up enabling
something we shouldn't: RVC enabling zca/zcd/zcf when using a CPU that
has priv_spec older than 1.12.0.

We're also not considering user choice. There's no way of doing it now
but this is about to change in the next few patches.

cpu_cfg_ext_auto_update() will check for priv version mismatches before
enabling extensions. If we have a mismatch between the current priv
version and the extension we want to enable, do not enable it. In the
near future, this same function will also consider user choice when
deciding if we're going to enable/disable an extension or not.

For now let's use it to handle zca/zcd/zcf enablement if RVC is enabled.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-16-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv: make CPUCFG() macro public
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:17 +0000 (10:24 -0300)]
target/riscv: make CPUCFG() macro public

The RISC-V KVM driver uses a CPUCFG() macro that calculates the offset
of a certain field in the struct RISCVCPUConfig. We're going to use this
macro in target/riscv/cpu.c as well in the next patches. Make it public.

Rename it to CPU_CFG_OFFSET() for more clarity while we're at it.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-15-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: use offset in isa_ext_is_enabled/update_enabled
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:16 +0000 (10:24 -0300)]
target/riscv/cpu.c: use offset in isa_ext_is_enabled/update_enabled

We'll have future usage for a function where, given an offset of the
struct RISCVCPUConfig, the flag is updated to a certain val.

Change all existing callers to use edata->ext_enable_offset instead of
'edata'.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-14-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv: deprecate the 'any' CPU type
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:15 +0000 (10:24 -0300)]
target/riscv: deprecate the 'any' CPU type

The 'any' CPU type was introduced in commit dc5bd18fa5725 ("RISC-V CPU
Core Definition"), being around since the beginning. It's not an easy
CPU to use: it's undocumented and its name doesn't tell users much about
what the CPU is supposed to bring. 'git log' doesn't help us either in
knowing what was the original design of this CPU type.

The closest we have is a comment from Alistair [1] where he recalls from
memory that the 'any' CPU is supposed to behave like the newly added
'max' CPU. He also suggested that the 'any' CPU should be removed.

The default CPUs are rv32 and rv64, so removing the 'any' CPU will have
impact only on users that might have a script that uses '-cpu any'.
And those users are better off using the default CPUs or the new 'max'
CPU.

We would love to just remove the code and be done with it, but one does
not simply remove a feature in QEMU. We'll put the CPU in quarantine
first, letting users know that we have the intent of removing it in the
future.

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg02891.html

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-13-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agoavocado, risc-v: add tuxboot tests for 'max' CPU
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:14 +0000 (10:24 -0300)]
avocado, risc-v: add tuxboot tests for 'max' CPU

Add smoke tests to ensure that we'll not break the 'max' CPU type when
adding new frozen/ratified RISC-V extensions.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv: add 'max' CPU type
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:13 +0000 (10:24 -0300)]
target/riscv: add 'max' CPU type

The 'max' CPU type is used by tooling to determine what's the most
capable CPU a current QEMU version implements. Other archs such as ARM
implements this type. Let's add it to RISC-V.

What we consider "most capable CPU" in this context are related to
ratified, non-vendor extensions. This means that we want the 'max' CPU
to enable all (possible) ratified extensions by default. The reasoning
behind this design is (1) vendor extensions can conflict with each other
and we won't play favorities deciding which one is default or not and
(2) non-ratified extensions are always prone to changes, not being
stable enough to be enabled by default.

All this said, we're still not able to enable all ratified extensions
due to conflicts between them. Zfinx and all its dependencies aren't
enabled because of a conflict with RVF. zce, zcmp and zcmt are also
disabled due to RVD conflicts. When running with 64 bits we're also
disabling zcf.

MISA bits RVG, RVJ and RVV are also being set manually since they're
default disabled.

This is the resulting 'riscv,isa' DT for this new CPU:

rv64imafdcvh_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_
zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbkb_zbkc_zbkx_zbs_zk_zkn_zknd_
zkne_zknh_zkr_zks_zksed_zksh_zkt_zve32f_zve64f_zve64d_
smstateen_sscofpmf_sstc_svadu_svinval_svnapot_svpbmt

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-11-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: limit cfg->vext_spec log message
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:12 +0000 (10:24 -0300)]
target/riscv/cpu.c: limit cfg->vext_spec log message

Inside riscv_cpu_validate_v() we're always throwing a log message if the
user didn't set a vector version via 'vext_spec'.

We're going to include one case with the 'max' CPU where env->vext_ver
will be set in the cpu_init(). But that alone will not stop the "vector
version is not specified" message from appearing. The usefulness of this
log message is debatable for the generic CPUs, but for a 'max' CPU type,
where we are supposed to deliver a CPU model with all features possible,
it's strange to force users to set 'vext_spec' to get rid of this
message.

Change riscv_cpu_validate_v() to not throw this log message if
env->vext_ver is already set.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-ID: <20230912132423.268494-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: add riscv_cpu_add_kvm_unavail_prop_array()
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:11 +0000 (10:24 -0300)]
target/riscv/cpu.c: add riscv_cpu_add_kvm_unavail_prop_array()

Use a helper in riscv_cpu_add_kvm_properties() to eliminate some of its
code repetition.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230912132423.268494-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: add riscv_cpu_add_qdev_prop_array()
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:10 +0000 (10:24 -0300)]
target/riscv/cpu.c: add riscv_cpu_add_qdev_prop_array()

The code inside riscv_cpu_add_user_properties() became quite repetitive
after recent changes. Add a helper to hide the repetition away.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: split vendor exts from riscv_cpu_extensions[]
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:09 +0000 (10:24 -0300)]
target/riscv/cpu.c: split vendor exts from riscv_cpu_extensions[]

Our goal is to make riscv_cpu_extensions[] hold only ratified,
non-vendor extensions.

Create a new riscv_cpu_vendor_exts[] array for them, changing
riscv_cpu_add_user_properties() and riscv_cpu_add_kvm_properties()
accordingly.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: split non-ratified exts from riscv_cpu_extensions[]
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:08 +0000 (10:24 -0300)]
target/riscv/cpu.c: split non-ratified exts from riscv_cpu_extensions[]

Create a new riscv_cpu_experimental_exts[] to store the non-ratified
extensions properties. Once they are ratified we'll move them back to
riscv_cpu_extensions[].

riscv_cpu_add_user_properties() and riscv_cpu_add_kvm_properties() are
changed to keep adding non-ratified properties to users.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv: add DEFINE_PROP_END_OF_LIST() to riscv_cpu_options[]
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:07 +0000 (10:24 -0300)]
target/riscv: add DEFINE_PROP_END_OF_LIST() to riscv_cpu_options[]

Add DEFINE_PROP_END_OF_LIST() and eliminate the ARRAY_SIZE() usage when
iterating in the riscv_cpu_options[] array, making it similar to what
we already do when working with riscv_cpu_extensions[].

We also have a more sophisticated motivation behind this change. In the
future we might need to export riscv_cpu_options[] to other files, and
ARRAY_LIST() doesn't work properly in that case because the array size
isn't exposed to the header file. Here's a future sight of what we would
deal with:

./target/riscv/kvm.c:1057:5: error: nested extern declaration of 'riscv_cpu_add_misa_properties' [-Werror=nested-externs]
n file included from ../target/riscv/kvm.c:19:
home/danielhb/work/qemu/include/qemu/osdep.h:473:31: error: invalid application of 'sizeof' to incomplete type 'const RISCVCPUMultiExtConfig[]'
 473 | #define ARRAY_SIZE(x) ((sizeof(x) / sizeof((x)[0])) + \
     |                               ^
./target/riscv/kvm.c:1047:29: note: in expansion of macro 'ARRAY_SIZE'
1047 |         for (int i = 0; i < ARRAY_SIZE(_array); i++) { \
     |                             ^~~~~~~~~~
./target/riscv/kvm.c:1059:5: note: in expansion of macro 'ADD_UNAVAIL_KVM_PROP_ARRAY'
1059 |     ADD_UNAVAIL_KVM_PROP_ARRAY(obj, riscv_cpu_extensions);
     |     ^~~~~~~~~~~~~~~~~~~~~~~~~~
home/danielhb/work/qemu/include/qemu/osdep.h:473:31: error: invalid application of 'sizeof' to incomplete type 'const RISCVCPUMultiExtConfig[]'
 473 | #define ARRAY_SIZE(x) ((sizeof(x) / sizeof((x)[0])) + \
     |                               ^
./target/riscv/kvm.c:1047:29: note: in expansion of macro 'ARRAY_SIZE'
1047 |         for (int i = 0; i < ARRAY_SIZE(_array); i++) { \

Homogenize the present and change the future by using
DEFINE_PROP_END_OF_LIST() in riscv_cpu_options[].

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: split kvm prop handling to its own helper
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:06 +0000 (10:24 -0300)]
target/riscv/cpu.c: split kvm prop handling to its own helper

Future patches will split the existing Property arrays even further, and
the existing code in riscv_cpu_add_user_properties() will start to scale
bad with it because it's dealing with KVM constraints mixed in with TCG
constraints. We're going to pay a high price to share a couple of common
lines of code between the two.

Create a new kvm_riscv_cpu_add_kvm_properties() helper that will be
forked from riscv_cpu_add_user_properties() if we're running KVM. The
helper includes all properties that a KVM CPU will add. The rest of
riscv_cpu_add_user_properties() body will then be relieved from having
to deal with KVM constraints.

The helper was declared in kvm_stubs.h, while being implemented in
cpu.c, to allow '--enable-debug' builds to work. The compiler won't
remove the kvm_riscv_cpu_add_kvm_properties() reference when
'kvm_enabled()' is false if we end up with an unused function. Even
though being a KVM only helper we can't implement it in kvm.c due to its
many dependencies inside cpu.c, so make it public in kvm_riscv.h and
keep its implementation in cpu.c for now. We'll move it to kvm.c in the
near future.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: skip 'bool' check when filtering KVM props
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:05 +0000 (10:24 -0300)]
target/riscv/cpu.c: skip 'bool' check when filtering KVM props

After the introduction of riscv_cpu_options[] all properties in
riscv_cpu_extensions[] are booleans. This check is now obsolete.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agotarget/riscv/cpu.c: split CPU options from riscv_cpu_extensions[]
Daniel Henrique Barboza [Tue, 12 Sep 2023 13:24:04 +0000 (10:24 -0300)]
target/riscv/cpu.c: split CPU options from riscv_cpu_extensions[]

We'll add a new CPU type that will enable a considerable amount of
extensions. To make it easier for us we'll do a few cleanups in our
existing riscv_cpu_extensions[] array.

Start by splitting all CPU non-boolean options from it. Create a new
riscv_cpu_options[] array for them. Add all these properties in
riscv_cpu_add_user_properties() as it is already being done today.

'mmu' and 'pmp' aren't really extensions in the usual way we think about
RISC-V extensions. These are closer to CPU features/options, so move
both to riscv_cpu_options[] too. In the near future we'll need to match
all extensions with all entries in isa_edata_arr[], and so it happens
that both 'mmu' and 'pmp' do not have a riscv,isa string (thus, no priv
spec version restriction). This further emphasizes the point that these
are more a CPU option than an extension.

No functional changes made.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
10 months agoMerge tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu into staging
Stefan Hajnoczi [Wed, 11 Oct 2023 13:43:10 +0000 (09:43 -0400)]
Merge tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu into staging

testing, gdbstub and plugin updates

  - enable more sbsa-ref tests in avocado
  - add swtpm to the package lists
  - reduce avocado noise in gitlab by limiting tests
  - make docker engine choice driven by configure and enable override
  - remove unneeded gcc suffix on some cross compilers
  - fix some NULL returns in gdbstub
  - improve locking in execlog plugin
  - introduce the GDBFeature structure
  - consistently set gdb_core_xml_file
  - use cleaner escaping for gdb xml
  - drop ancient gdb_has_xml() test
  - disable multi-instruction GUSA emulation when plugins enabled
  - fix some coverity issues in plugins

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmUmVJkACgkQ+9DbCVqe
# KkTfNggAiS02FcL3faGjAN9+60xhvEQ3DJjI473hjvFWu0bSkQTjObcQqGc+V7Cw
# 9yNtnxOOWB6KdAU8At7HlVqiUXeyTCJB7Att5/UgNUZj63j+cs7PXb4p7cVCcJOc
# 17zni22tnmCBcC8wZaz0yj68jaftL3hz1QNUZOmv6CBt42q0+/4g1WKfaJ+w+SbK
# T7cJEiMDObm8qeNAAXpDLB+9v3bRDxMZ8hFJ3p3CatQC8jbDrkuH7RrVPHDWiWQx
# w0uXpUHlZEOVX23v6+iIoeb8YQW2bZI9UsfeyIHJlENaVgyL200LHgLvvAE4Qd63
# dCtfQUZzj4t9sfoL4XgxaB7G4qtXTg==
# =7PLI
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 03:54:01 EDT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu: (25 commits)
  contrib/plugins: fix coverity warning in hotblocks
  contrib/plugins: fix coverity warning in lockstep
  contrib/plugins: fix coverity warning in cache
  plugins: Set final instruction count in plugin_gen_tb_end
  target/sh4: Disable decode_gusa when plugins enabled
  accel/tcg: Add plugin_enabled to DisasContextBase
  gdbstub: Replace gdb_regs with an array
  gdbstub: Remove gdb_has_xml variable
  target/ppc: Remove references to gdb_has_xml
  target/arm: Remove references to gdb_has_xml
  gdbstub: Use g_markup_printf_escaped()
  hw/core/cpu: Return static value with gdb_arch_name()
  target/arm: Move the reference to arm-core.xml
  gdbstub: Introduce GDBFeature structure
  contrib/plugins: Use GRWLock in execlog
  plugins: Check if vCPU is realized
  gdbstub: Fix target.xml response
  gdbstub: Fix target_xml initialization
  configure: remove gcc version suffixes
  configure: allow user to override docker engine
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 months agoMerge tag 'migration-20231011-pull-request' of https://gitlab.com/juan.quintela/qemu...
Stefan Hajnoczi [Wed, 11 Oct 2023 13:42:39 +0000 (09:42 -0400)]
Merge tag 'migration-20231011-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20231011 edition)

Hi

In this pull request:

- Markus RDMA cleanup series
- recover fixes from peter
- migration capability from fabiano
- negative migration test from fabiano.

Please, pull.

Thanks, Juan.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUmaQ8ACgkQ9IfvGFhy
# 1yN9fA//SBnea3Wl2158J673l5aaI8Vp/1PjfzvNdcr/6EQbXZBgug+haQ3n5Hhf
# USNRhemrCkpZAGCUf07g9pfF4R/Jsq1OkOrWF4e6gAaZPNU4V5F7VKBk8pmFMLtr
# Kk2XgnH2ZPaFEvts0qBrOfvDHH8gOzzjpF2HGrioM8Zr3p1JHz9OqJoSyawLF0U7
# YFTq2jJSgaOQ6ax1+L8hLLuXlmNccBaTWT8Cv0rbPEgcwrJOM/wMfmd6O39ps929
# yS5NnxqqkrprTDjmeGOgOQd0Cy/flinnzmu+BVMO6/ns9Hu6q1TGG6D+DOBdgmHH
# jq7Ej5VILtXWOoZtXLHqA1Xt73ciVlmditVupoC+5vtIJou2JseClutOp98qxxzV
# llMF7ldHbRTWnu7qIrwv2OINarowR0pIZfkJqBc6dNHHScwMCnX5L9YAvNePEo2V
# 1oJpbqW7mmgwdlFAiKFD+AE6qUWxcnzOvPf+fzWrJMi507Kv5nmxQWTHw9dsFs7k
# neWnK21t0s2t77+vVBtLlr06JESG+WndzvQsXKZu8Pd0+ASnzpX8pRVzxEPk5EiH
# fT9bhXOCvxTTHulznjkOApODE5NF+KlHAFXU87cSIkdi/6JfzcvTe6KeeIPC248Q
# jk3nVlhds1xajTcPAK7HF5Ta6R8rNdTZ6q/kFNhLaTGqv9agxDU=
# =hekO
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 05:21:19 EDT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* tag 'migration-20231011-pull-request' of https://gitlab.com/juan.quintela/qemu: (65 commits)
  migration: Add migration_rp_wait|kick()
  migration: Remember num of ramblocks to sync during recovery
  qemufile: Always return a verbose error
  migration: Introduce migrate_has_error()
  migration: Display error in query-migrate irrelevant of status
  migration/rdma: Replace flawed device detail dump by tracing
  migration/rdma: Use error_report() & friends instead of stderr
  migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings
  migration/rdma: Silence qemu_rdma_register_and_get_keys()
  migration/rdma: Silence qemu_rdma_block_for_wrid()
  migration/rdma: Don't report received completion events as error
  migration/rdma: Silence qemu_rdma_reg_control()
  migration/rdma: Silence qemu_rdma_connect()
  migration/rdma: Silence qemu_rdma_resolve_host()
  migration/rdma: Convert qemu_rdma_alloc_pd_cq() to Error
  migration/rdma: Convert qemu_rdma_post_recv_control() to Error
  migration/rdma: Convert qemu_rdma_post_send_control() to Error
  migration/rdma: Convert qemu_rdma_write() to Error
  migration/rdma: Convert qemu_rdma_write_one() to Error
  migration/rdma: Convert qemu_rdma_write_flush() to Error
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 months agoMerge tag 'audio-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging
Stefan Hajnoczi [Wed, 11 Oct 2023 13:42:14 +0000 (09:42 -0400)]
Merge tag 'audio-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

Audio es1370 fix & cleanups

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmUmQIQcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5UiED/0VjK4J6I+MYljyJePj
# ki62XC/dtojuxkE+nwsTrU54u86776TLmthRYyHOffSs8mDpk0ynY6sWD99T4r1b
# 9Szmm9FJM658RSpN9i128JAiHKD9Vth7UUpekrLIBgRsW5lGuQFapHouTB1wa3hm
# 8ZszUNAGEpAgGM2My9t+pvGy8lrOZan2ZAlQzKfMc+msYTnivK6huzjVw/xUc0kD
# vzfpVMgKntvl391IBf9hFBBrJCWRtDrydtPl1/PxpVj/96FGIxlx+o30X8z9hSzA
# yg96XfwmS/mgqFsw8QqtFyrkSE3iD2yzcrepjtK+erwcv6M0BVG38wn89F571Lcl
# Ga8PcqBd7pP0nMXrgjbX8qB6t58TPipEbFMeS2iabO4pwz0jy9aCVQJ2GZqHNqaF
# VoMN5hEDBhanxFcNEcjSWZEfJOvK7uVLAZ2Xdqcrsm50ESTu55H/BIQWFipTGlqu
# +BlvfpwVYXGziuDsj8h4cUGvyuMY02XWsBU9qvl5aCiyWoyED0ZlR2UJ0jVjn68K
# RUbTrxSMF9hLgI1j8FYkNgXMBuR4+Sopc7vp//AXpp92/yxTNCbe44Y06C9iaib1
# 0zrYhd4rfTn0MX+YM9IUKzvoNSVY6VxHoTucyTM9HcT7XtCnmwnMzjY4yGxQ6k3V
# X0WBPpH4mCyXla7TJ0zwbnvH0A==
# =B/KL
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 02:28:20 EDT
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* tag 'audio-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  hw/audio/es1370: trace lost interrupts
  hw/audio/es1370: change variable type and name
  hw/audio/es1370: block structure coding style fixes
  hw/audio/es1370: remove #ifdef ES1370_VERBOSE to avoid bit rot
  hw/audio/es1370: remove #ifdef ES1370_DEBUG to avoid bit rot
  hw/audio/es1370: remove unused dolog macro
  hw/audio/es1370: replace bit-rotted code with tracepoints
  hw/audio/es1370: reset current sample counter

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 months agomigration: Add migration_rp_wait|kick()
Peter Xu [Wed, 4 Oct 2023 22:02:37 +0000 (18:02 -0400)]
migration: Add migration_rp_wait|kick()

It's just a simple wrapper for rp_sem on either wait() or kick(), make it
even clearer on how it is used.  Prepared to be used even for other things.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20231004220240.167175-8-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
10 months agomigration: Remember num of ramblocks to sync during recovery
Peter Xu [Wed, 4 Oct 2023 22:02:36 +0000 (18:02 -0400)]
migration: Remember num of ramblocks to sync during recovery

Instead of only relying on the count of rp_sem, make the counter be part of
RAMState so it can be used in both threads to synchronize on the process.

rp_sem will be further reused in follow up patches, as a way to kick the
main thread, e.g., on recovery failures.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-7-peterx@redhat.com>

10 months agoqemufile: Always return a verbose error
Peter Xu [Wed, 4 Oct 2023 22:02:35 +0000 (18:02 -0400)]
qemufile: Always return a verbose error

There're a lot of cases where we only have an errno set in last_error but
without a detailed error description.  When this happens, try to generate
an error contains the errno as a descriptive error.

This will be helpful in cases where one relies on the Error*.  E.g.,
migration state only caches Error* in MigrationState.error.  With this,
we'll display correct error messages in e.g. query-migrate when the error
was only set by qemu_file_set_error().

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-6-peterx@redhat.com>

10 months agomigration: Introduce migrate_has_error()
Peter Xu [Wed, 4 Oct 2023 22:02:32 +0000 (18:02 -0400)]
migration: Introduce migrate_has_error()

Introduce a helper to detect whether MigrationState.error is set for
whatever reason.

This is preparation work for any thread (e.g. source return path thread) to
setup errors in an unified way to MigrationState, rather than relying on
its own way to set errors (mark_source_rp_bad()).

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-3-peterx@redhat.com>

10 months agomigration: Display error in query-migrate irrelevant of status
Peter Xu [Wed, 4 Oct 2023 22:02:31 +0000 (18:02 -0400)]
migration: Display error in query-migrate irrelevant of status

Display it as long as being set, irrelevant of FAILED status.  E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.

The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.

This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>

10 months agomigration/rdma: Replace flawed device detail dump by tracing
Markus Armbruster [Thu, 28 Sep 2023 13:20:19 +0000 (15:20 +0200)]
migration/rdma: Replace flawed device detail dump by tracing

qemu_rdma_dump_id() dumps RDMA device details to stdout.

rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
and qemu_rdma_resolve_host() to show source device details.
rdma_start_incoming_migration() arranges its call via
rdma_accept_incoming_migration() and qemu_rdma_accept() to show
destination device details.

Two issues:

1. rdma_start_outgoing_migration() can run in HMP context.  The
   information should arguably go the monitor, not stdout.

2. ibv_query_port() failure is reported as error.  Its callers remain
   unaware of this failure (qemu_rdma_dump_id() can't fail), so
   reporting this to the user as an error is problematic.

Fixable, but the device detail dump is noise, except when
troubleshooting.  Tracing is a better fit.  Similar function
qemu_rdma_dump_id() was converted to tracing in commit
733252deb8b (Tracify migration/rdma.c).

Convert qemu_rdma_dump_id(), too.

While there, touch up qemu_rdma_dump_gid()'s outdated comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-54-armbru@redhat.com>

10 months agomigration/rdma: Use error_report() & friends instead of stderr
Markus Armbruster [Thu, 28 Sep 2023 13:20:18 +0000 (15:20 +0200)]
migration/rdma: Use error_report() & friends instead of stderr

error_report() obeys -msg, reports the current error location if any,
and reports to the current monitor if any.  Reporting to stderr
directly with fprintf() or perror() is wrong, because it loses all
this.

Fix the offenders.  Bonus: resolves a FIXME about problematic use of
errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-53-armbru@redhat.com>

10 months agomigration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings
Markus Armbruster [Thu, 28 Sep 2023 13:20:17 +0000 (15:20 +0200)]
migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_source_init(), qemu_rdma_connect(),
rdma_start_incoming_migration(), and rdma_start_outgoing_migration()
violate this principle: they call error_report() via
qemu_rdma_cleanup().

Moreover, qemu_rdma_cleanup() can't fail.  It is called on error
paths, and QIOChannel close and finalization.  Are the conditions it
reports really errors?  I doubt it.

Downgrade qemu_rdma_cleanup()'s errors to warnings.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-52-armbru@redhat.com>

10 months agomigration/rdma: Silence qemu_rdma_register_and_get_keys()
Markus Armbruster [Thu, 28 Sep 2023 13:20:16 +0000 (15:20 +0200)]
migration/rdma: Silence qemu_rdma_register_and_get_keys()

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_write_one() violates this principle: it reports errors to
stderr via qemu_rdma_register_and_get_keys().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up: silence qemu_rdma_register_and_get_keys().  I believe
the caller's error reports suffice.  If they don't, we need to convert
to Error instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-51-armbru@redhat.com>

10 months agomigration/rdma: Silence qemu_rdma_block_for_wrid()
Markus Armbruster [Thu, 28 Sep 2023 13:20:15 +0000 (15:20 +0200)]
migration/rdma: Silence qemu_rdma_block_for_wrid()

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_post_send_control(), qemu_rdma_exchange_get_response(), and
qemu_rdma_write_one() violate this principle: they call
error_report(), fprintf(stderr, ...), and perror() via
qemu_rdma_block_for_wrid(), qemu_rdma_poll(), and
qemu_rdma_wait_comp_channel().  I elected not to investigate how
callers handle the error, i.e. precise impact is not known.

Clean this up by dropping the error reporting from qemu_rdma_poll(),
qemu_rdma_wait_comp_channel(), and qemu_rdma_block_for_wrid().  I
believe the callers' error reports suffice.  If they don't, we need to
convert to Error instead.

Bonus: resolves a FIXME about problematic use of errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-50-armbru@redhat.com>

10 months agomigration/rdma: Don't report received completion events as error
Markus Armbruster [Thu, 28 Sep 2023 13:20:14 +0000 (15:20 +0200)]
migration/rdma: Don't report received completion events as error

When qemu_rdma_wait_comp_channel() receives an event from the
completion channel, it reports an error "receive cm event while wait
comp channel,cm event is T", where T is the numeric event type.
However, the function fails only when T is a disconnect or device
removal.  Events other than these two are not actually an error, and
reporting them as an error is wrong.  If we need to report them to the
user, we should use something else, and what to use depends on why we
need to report them to the user.

For now, report this error only when the function actually fails.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-49-armbru@redhat.com>

10 months agomigration/rdma: Silence qemu_rdma_reg_control()
Markus Armbruster [Thu, 28 Sep 2023 13:20:13 +0000 (15:20 +0200)]
migration/rdma: Silence qemu_rdma_reg_control()

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_source_init() and qemu_rdma_accept() violate this principle:
they call error_report() via qemu_rdma_reg_control().  I elected not
to investigate how callers handle the error, i.e. precise impact is
not known.

Clean this up by dropping the error reporting from
qemu_rdma_reg_control().  I believe the callers' error reports
suffice.  If they don't, we need to convert to Error instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-48-armbru@redhat.com>

10 months agomigration/rdma: Silence qemu_rdma_connect()
Markus Armbruster [Thu, 28 Sep 2023 13:20:12 +0000 (15:20 +0200)]
migration/rdma: Silence qemu_rdma_connect()

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_connect() violates this principle: it calls error_report()
and perror().  I elected not to investigate how callers handle the
error, i.e. precise impact is not known.

Clean this up: replace perror() by changing error_setg() to
error_setg_errno(), and drop error_report().  I believe the callers'
error reports suffice then.  If they don't, we need to convert to
Error instead.

Bonus: resolves a FIXME about problematic use of errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-47-armbru@redhat.com>

10 months agomigration/rdma: Silence qemu_rdma_resolve_host()
Markus Armbruster [Thu, 28 Sep 2023 13:20:11 +0000 (15:20 +0200)]
migration/rdma: Silence qemu_rdma_resolve_host()

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_resolve_host() violates this principle: it calls
error_report().

Clean this up: drop error_report().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-46-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_alloc_pd_cq() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:10 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_alloc_pd_cq() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_source_init() violates this principle: it calls
error_report() via qemu_rdma_alloc_pd_cq().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_alloc_pd_cq() to Error.

The conversion loses a piece of advice on one of two failure paths:

    Your mlock() limits may be too low. Please check $ ulimit -a # and search for 'ulimit -l' in the output

Not worth retaining.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-45-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_post_recv_control() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:09 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_post_recv_control() to Error

Just for symmetry with qemu_rdma_post_send_control().  Error messages
lose detail I consider of no use to users.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-44-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_post_send_control() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:08 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_post_send_control() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_exchange_send() violates this principle: it calls
error_report() via qemu_rdma_post_send_control().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_post_send_control() to Error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-43-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_write() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:07 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_write() to Error

Just for consistency with qemu_rdma_write_one() and
qemu_rdma_write_flush(), and for slightly simpler code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-42-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_write_one() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:06 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_write_one() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_write_flush() violates this principle: it calls
error_report() via qemu_rdma_write_one().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_write_one() to Error.  Bonus:
resolves a FIXME about problematic use of errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-41-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_write_flush() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:05 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_write_flush() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qio_channel_rdma_writev() violates this principle: it calls
error_report() via qemu_rdma_write_flush().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_write_flush() to Error.

Necessitates setting an error when qemu_rdma_write_one() failed.
Since this error will go away later in this series, simply use "FIXME
temporary error message" there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-40-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_reg_whole_ram_blocks() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:04 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_reg_whole_ram_blocks() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_exchange_send() violates this principle: it calls
error_report() via callback qemu_rdma_reg_whole_ram_blocks().  I
elected not to investigate how callers handle the error, i.e. precise
impact is not known.

Clean this up by converting the callback to Error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-39-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_exchange_get_response() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:03 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_exchange_get_response() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_exchange_send() and qemu_rdma_exchange_recv() violate this
principle: they call error_report() via
qemu_rdma_exchange_get_response().  I elected not to investigate how
callers handle the error, i.e. precise impact is not known.

Clean this up by converting qemu_rdma_exchange_get_response() to
Error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-38-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_exchange_send() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:02 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_exchange_send() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qio_channel_rdma_writev() violates this principle: it calls
error_report() via qemu_rdma_exchange_send().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_exchange_send() to Error.

Necessitates setting an error when qemu_rdma_post_recv_control(),
callback(), or qemu_rdma_exchange_get_response() failed.  Since these
errors will go away later in this series, simply use "FIXME temporary
error message" there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-37-armbru@redhat.com>

10 months agomigration/rdma: Convert qemu_rdma_exchange_recv() to Error
Markus Armbruster [Thu, 28 Sep 2023 13:20:01 +0000 (15:20 +0200)]
migration/rdma: Convert qemu_rdma_exchange_recv() to Error

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qio_channel_rdma_readv() violates this principle: it calls
error_report() via qemu_rdma_exchange_recv().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_exchange_recv() to Error.

Necessitates setting an error when qemu_rdma_exchange_get_response()
failed.  Since this error will go away later in this series, simply
use "FIXME temporary error message" there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-36-armbru@redhat.com>

10 months agomigration/rdma: Drop "@errp is clear" guards around error_setg()
Markus Armbruster [Thu, 28 Sep 2023 13:20:00 +0000 (15:20 +0200)]
migration/rdma: Drop "@errp is clear" guards around error_setg()

These guards are all redundant now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-35-armbru@redhat.com>

10 months agomigration/rdma: Fix error handling around rdma_getaddrinfo()
Markus Armbruster [Thu, 28 Sep 2023 13:19:59 +0000 (15:19 +0200)]
migration/rdma: Fix error handling around rdma_getaddrinfo()

qemu_rdma_resolve_host() and qemu_rdma_dest_init() iterate over
addresses to find one that works, holding onto the first Error from
qemu_rdma_broken_ipv6_kernel() for use when no address works.  Issues:

1. If @errp was &error_abort or &error_fatal, we'd terminate instead
   of trying the next address.  Can't actually happen, since no caller
   passes these arguments.

2. When @errp is a pointer to a variable containing NULL, and
   qemu_rdma_broken_ipv6_kernel() fails, the variable no longer
   contains NULL.  Subsequent iterations pass it again, violating
   Error usage rules.  Dangerous, as setting an error would then trip
   error_setv()'s assertion.  Works only because
   qemu_rdma_broken_ipv6_kernel() and the code following the loops
   carefully avoids setting a second error.

3. If qemu_rdma_broken_ipv6_kernel() fails, and then a later iteration
   finds a working address, @errp still holds the first error from
   qemu_rdma_broken_ipv6_kernel().  If we then run into another error,
   we report the qemu_rdma_broken_ipv6_kernel() failure instead.

4. If we don't run into another error, we leak the Error object.

Use a local error variable, and propagate to @errp.  This fixes 3. and
also cleans up 1 and partly 2.

Free this error when we have a working address.  This fixes 4.

Pass the local error variable to qemu_rdma_broken_ipv6_kernel() only
until it fails.  Pass null on any later iterations.  This cleans up
the remainder of 2.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-34-armbru@redhat.com>

10 months agomigration/rdma: Retire macro ERROR()
Markus Armbruster [Thu, 28 Sep 2023 13:19:58 +0000 (15:19 +0200)]
migration/rdma: Retire macro ERROR()

ERROR() has become "error_setg() unless an error has been set
already".  Hiding the conditional in the macro is in the way of
further work.  Replace the macro uses by their expansion, and delete
the macro.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-33-armbru@redhat.com>

10 months agomigration/rdma: Delete inappropriate error_report() in macro ERROR()
Markus Armbruster [Thu, 28 Sep 2023 13:19:57 +0000 (15:19 +0200)]
migration/rdma: Delete inappropriate error_report() in macro ERROR()

Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

Macro ERROR() violates this principle.  Delete the error_report()
there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-32-armbru@redhat.com>

10 months agomigration/rdma: Plug a memory leak and improve a message
Markus Armbruster [Thu, 28 Sep 2023 13:19:56 +0000 (15:19 +0200)]
migration/rdma: Plug a memory leak and improve a message

When migration capability @rdma-pin-all is true, but the server cannot
honor it, qemu_rdma_connect() calls macro ERROR(), then returns
success.

ERROR() sets an error.  Since qemu_rdma_connect() returns success, its
caller rdma_start_outgoing_migration() duly assumes @errp is still
clear.  The Error object leaks.

ERROR() additionally reports the situation to the user as an error:

    RDMA ERROR: Server cannot support pinning all memory. Will register memory dynamically.

Is this an error or not?  It actually isn't; we disable @rdma-pin-all
and carry on.  "Correcting" the user's configuration decisions that
way feels problematic, but that's a topic for another day.

Replace ERROR() by warn_report().  This plugs the memory leak, and
emits a clearer message to the user.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-31-armbru@redhat.com>

10 months agomigration/rdma: Check negative error values the same way everywhere
Markus Armbruster [Thu, 28 Sep 2023 13:19:55 +0000 (15:19 +0200)]
migration/rdma: Check negative error values the same way everywhere

When a function returns 0 on success, negative value on error,
checking for non-zero suffices, but checking for negative is clearer.
So do that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-30-armbru@redhat.com>

10 months agomigration/rdma: Drop superfluous assignments to @ret
Markus Armbruster [Thu, 28 Sep 2023 13:19:54 +0000 (15:19 +0200)]
migration/rdma: Drop superfluous assignments to @ret

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-29-armbru@redhat.com>

10 months agomigration/rdma: Replace int error_state by bool errored
Markus Armbruster [Thu, 28 Sep 2023 13:19:53 +0000 (15:19 +0200)]
migration/rdma: Replace int error_state by bool errored

All we do with the value of RDMAContext member @error_state is test
whether it's zero.  Change to bool and rename to @errored.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-28-armbru@redhat.com>

10 months agomigration/rdma: Dumb down remaining int error values to -1
Markus Armbruster [Thu, 28 Sep 2023 13:19:52 +0000 (15:19 +0200)]
migration/rdma: Dumb down remaining int error values to -1

This is just to make the error value more obvious.  Callers don't
mind.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-27-armbru@redhat.com>

10 months agomigration/rdma: Return -1 instead of negative errno code
Markus Armbruster [Thu, 28 Sep 2023 13:19:51 +0000 (15:19 +0200)]
migration/rdma: Return -1 instead of negative errno code

Several functions return negative errno codes on failure.  Callers
check for specific codes exactly never.  For some of the functions,
callers couldn't check even if they wanted to, because the functions
also return negative values that aren't errno codes, leaving readers
confused on what the function actually returns.

Clean up and simplify: return -1 instead of negative errno code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-26-armbru@redhat.com>

10 months agomigration/rdma: Fix rdma_getaddrinfo() error checking
Markus Armbruster [Thu, 28 Sep 2023 13:19:50 +0000 (15:19 +0200)]
migration/rdma: Fix rdma_getaddrinfo() error checking

rdma_getaddrinfo() returns 0 on success.  On error, it returns one of
the EAI_ error codes like getaddrinfo() does, or -1 with errno set.
This is broken by design: POSIX implicitly specifies the EAI_ error
codes to be non-zero, no more.  They could clash with -1.  Nothing we
can do about this design flaw.

Both callers of rdma_getaddrinfo() only recognize negative values as
error.  Works only because systems elect to make the EAI_ error codes
negative.

Best not to rely on that: change the callers to treat any non-zero
value as failure.  Also change them to return -1 instead of the value
received from getaddrinfo() on failure, to avoid positive error
values.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-25-armbru@redhat.com>

10 months agomigration/rdma: Fix QEMUFileHooks method return values
Markus Armbruster [Thu, 28 Sep 2023 13:19:49 +0000 (15:19 +0200)]
migration/rdma: Fix QEMUFileHooks method return values

The QEMUFileHooks methods don't come with a written contract.  Digging
through the code calling them, we find:

* save_page():

  Negative values RAM_SAVE_CONTROL_DELAYED and
  RAM_SAVE_CONTROL_NOT_SUPP are special.  Any other negative value is
  an unspecified error.

  qemu_rdma_save_page() returns -EIO or rdma->error_state on error.  I
  believe the latter is always negative.  Nothing stops either of them
  to clash with the special values, though.  Feels unlikely, but fix
  it anyway to return only the special values and -1.

* before_ram_iterate(), after_ram_iterate():

  Negative value means error.  qemu_rdma_registration_start() and
  qemu_rdma_registration_stop() comply as far as I can tell.  Make
  them comply *obviously*, by returning -1 on error.

* hook_ram_load:

  Negative value means error.  rdma_load_hook() already returns -1 on
  error.  Leave it alone.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-24-armbru@redhat.com>

10 months agomigration/rdma: Drop dead qemu_rdma_data_init() code for !@host_port
Markus Armbruster [Thu, 28 Sep 2023 13:19:48 +0000 (15:19 +0200)]
migration/rdma: Drop dead qemu_rdma_data_init() code for !@host_port

qemu_rdma_data_init() neglects to set an Error when it fails because
@host_port is null.  Fortunately, no caller passes null, so this is
merely a latent bug.  Drop the flawed code handling null argument.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-23-armbru@redhat.com>

10 months agomigration/rdma: Fix qemu_get_cm_event_timeout() to always set error
Markus Armbruster [Thu, 28 Sep 2023 13:19:47 +0000 (15:19 +0200)]
migration/rdma: Fix qemu_get_cm_event_timeout() to always set error

qemu_get_cm_event_timeout() neglects to set an error when it fails
because rdma_get_cm_event() fails.  Harmless, as its caller
qemu_rdma_connect() substitutes a generic error then.  Fix it anyway.

qemu_rdma_connect() also sets the generic error when its own call of
rdma_get_cm_event() fails.  Make the error handling more obvious: set
a specific error right after rdma_get_cm_event() fails.  Delete the
generic error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-22-armbru@redhat.com>

10 months agomigration/rdma: Fix qemu_rdma_broken_ipv6_kernel() to set error
Markus Armbruster [Thu, 28 Sep 2023 13:19:46 +0000 (15:19 +0200)]
migration/rdma: Fix qemu_rdma_broken_ipv6_kernel() to set error

qemu_rdma_resolve_host() and qemu_rdma_dest_init() try addresses until
they find on that works.  If none works, they return the first Error
set by qemu_rdma_broken_ipv6_kernel(), or else return a generic one.

qemu_rdma_broken_ipv6_kernel() neglects to set an Error when
ibv_open_device() fails.  If a later address fails differently, we use
that Error instead, or else the generic one.  Harmless enough, but
needs fixing all the same.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-21-armbru@redhat.com>

10 months agomigration/rdma: Replace dangerous macro CHECK_ERROR_STATE()
Markus Armbruster [Thu, 28 Sep 2023 13:19:45 +0000 (15:19 +0200)]
migration/rdma: Replace dangerous macro CHECK_ERROR_STATE()

Hiding return statements in macros is a bad idea.  Use a function
instead, and open code the return part.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-20-armbru@redhat.com>

10 months agomigration/rdma: Fix io_writev(), io_readv() methods to obey contract
Markus Armbruster [Thu, 28 Sep 2023 13:19:44 +0000 (15:19 +0200)]
migration/rdma: Fix io_writev(), io_readv() methods to obey contract

QIOChannelClass methods qio_channel_rdma_readv() and
qio_channel_rdma_writev() violate their method contract when
rdma->error_state is non-zero:

1. They return whatever is in rdma->error_state then.  Only -1 will be
   fine.  -2 will be misinterpreted as "would block".  Anything less
   than -2 isn't defined in the contract.  A positive value would be
   misinterpreted as success, but I believe that's not actually
   possible.

2. They neglect to set an error then.  If something up the call stack
   dereferences the error when failure is returned, it will crash.  If
   it ignores the return value and checks the error instead, it will
   miss the error.

Crap like this happens when return statements hide in macros,
especially when their uses are far away from the definition.

I elected not to investigate how callers are impacted.

Expand the two bad macro uses, so we can set an error and return -1.
The next commit will then get rid of the macro altogether.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-19-armbru@redhat.com>

10 months agomigration/rdma: Ditch useless numeric error codes in error messages
Markus Armbruster [Thu, 28 Sep 2023 13:19:43 +0000 (15:19 +0200)]
migration/rdma: Ditch useless numeric error codes in error messages

Several error messages include numeric error codes returned by failed
functions:

* ibv_poll_cq() returns an unspecified negative value.  Useless.

* rdma_accept and rdma_get_cm_event() return -1.  Useless.

* qemu_rdma_poll() returns either -1 or an unspecified negative
  value.  Useless.

* qemu_rdma_block_for_wrid(), qemu_rdma_write_flush(),
  qemu_rdma_exchange_send(), qemu_rdma_exchange_recv(),
  qemu_rdma_write() return a negative value that may or may not be an
  errno value.  While reporting human-readable errno
  information (which a number is not) can be useful, reporting an
  error code that may or may not be an errno value is useless.

Drop these error codes from the error messages.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-18-armbru@redhat.com>

10 months agomigration/rdma: Fix or document problematic uses of errno
Markus Armbruster [Thu, 28 Sep 2023 13:19:42 +0000 (15:19 +0200)]
migration/rdma: Fix or document problematic uses of errno

We use errno after calling Libibverbs functions that are not
documented to set errno (manual page does not mention errno), or where
the documentation is unclear ("returns [...] the value of errno on
failure").  While this could be read as "sets errno and returns it",
a glance at the source code[*] kills that hope:

    static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr,
                                    struct ibv_send_wr **bad_wr)
    {
            return qp->context->ops.post_send(qp, wr, bad_wr);
    }

The callback can be

    static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
                              struct ibv_send_wr **bad)
    {
            /* This version of driver supports RAW QP only.
             * Posting WR is done directly in the application.
             */
            return EOPNOTSUPP;
    }

Neither of them touches errno.

One of these errno uses is easy to fix, so do that now.  Several more
will go away later in the series; add temporary FIXME commments.
Three will remain; add TODO comments.  TODO, not FIXME, because the
bug might be in Libibverbs documentation.

[*] https://github.com/linux-rdma/rdma-core.git
    commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-17-armbru@redhat.com>

10 months agomigration/rdma: Use bool for two RDMAContext flags
Markus Armbruster [Thu, 28 Sep 2023 13:19:41 +0000 (15:19 +0200)]
migration/rdma: Use bool for two RDMAContext flags

@error_reported and @received_error are flags.  The latter is even
assigned bool true.  Change them from int to bool.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-16-armbru@redhat.com>

10 months agomigration/rdma: Make qemu_rdma_buffer_mergeable() return bool
Markus Armbruster [Thu, 28 Sep 2023 13:19:40 +0000 (15:19 +0200)]
migration/rdma: Make qemu_rdma_buffer_mergeable() return bool

qemu_rdma_buffer_mergeable() is semantically a predicate.  It returns
int 0 or 1.  Return bool instead, and fix the function name's
spelling.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-15-armbru@redhat.com>

10 months agomigration/rdma: Drop qemu_rdma_search_ram_block() error handling
Markus Armbruster [Thu, 28 Sep 2023 13:19:39 +0000 (15:19 +0200)]
migration/rdma: Drop qemu_rdma_search_ram_block() error handling

qemu_rdma_search_ram_block() can't fail.  Return void, and drop the
unreachable error handling.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-14-armbru@redhat.com>

10 months agomigration/rdma: Drop rdma_add_block() error handling
Markus Armbruster [Thu, 28 Sep 2023 13:19:38 +0000 (15:19 +0200)]
migration/rdma: Drop rdma_add_block() error handling

rdma_add_block() can't fail.  Return void, and drop the unreachable
error handling.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-13-armbru@redhat.com>

10 months agomigration/rdma: Eliminate error_propagate()
Markus Armbruster [Thu, 28 Sep 2023 13:19:37 +0000 (15:19 +0200)]
migration/rdma: Eliminate error_propagate()

When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-12-armbru@redhat.com>

10 months agomigration/rdma: Put @errp parameter last
Markus Armbruster [Thu, 28 Sep 2023 13:19:36 +0000 (15:19 +0200)]
migration/rdma: Put @errp parameter last

include/qapi/error.h demands:

 * - Functions that use Error to report errors have an Error **errp
 *   parameter.  It should be the last parameter, except for functions
 *   taking variable arguments.

qemu_rdma_connect() does not conform.  Clean it up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-11-armbru@redhat.com>

10 months agomigration/rdma: Fix qemu_rdma_accept() to return failure on errors
Markus Armbruster [Thu, 28 Sep 2023 13:19:35 +0000 (15:19 +0200)]
migration/rdma: Fix qemu_rdma_accept() to return failure on errors

qemu_rdma_accept() returns 0 in some cases even when it didn't
complete its job due to errors.  Impact is not obvious.  I figure the
caller will soon fail again with a misleading error message.

Fix it to return -1 on any failure.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-10-armbru@redhat.com>

10 months agomigration/rdma: Give qio_channel_rdma_source_funcs internal linkage
Markus Armbruster [Thu, 28 Sep 2023 13:19:34 +0000 (15:19 +0200)]
migration/rdma: Give qio_channel_rdma_source_funcs internal linkage

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-9-armbru@redhat.com>

10 months agomigration/rdma: Clean up two more harmless signed vs. unsigned issues
Markus Armbruster [Thu, 28 Sep 2023 13:19:33 +0000 (15:19 +0200)]
migration/rdma: Clean up two more harmless signed vs. unsigned issues

qemu_rdma_exchange_get_response() compares int parameter @expecting
with uint32_t head->type.  Actual arguments are non-negative
enumeration constants, RDMAControlHeader uint32_t member type, or
qemu_rdma_exchange_recv() int parameter expecting.  Actual arguments
for the latter are non-negative enumeration constants.  Change both
parameters to uint32_t.

In qio_channel_rdma_readv(), loop control variable @i is ssize_t, and
counts from 0 up to @niov, which is size_t.  Change @i to size_t.

While there, make qio_channel_rdma_readv() and
qio_channel_rdma_writev() more consistent: change the former's @done
to ssize_t, and delete the latter's useless initialization of @len.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-8-armbru@redhat.com>

10 months agomigration/rdma: Fix unwanted integer truncation
Markus Armbruster [Thu, 28 Sep 2023 13:19:32 +0000 (15:19 +0200)]
migration/rdma: Fix unwanted integer truncation

qio_channel_rdma_readv() assigns the size_t value of qemu_rdma_fill()
to an int variable before it adds it to @done / subtracts it from
@want, both size_t.  Truncation when qemu_rdma_fill() copies more than
INT_MAX bytes.  Seems vanishingly unlikely, but needs fixing all the
same.

Fixes: 6ddd2d76ca6f (migration: convert RDMA to use QIOChannel interface)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-7-armbru@redhat.com>

10 months agomigration/rdma: Consistently use uint64_t for work request IDs
Markus Armbruster [Thu, 28 Sep 2023 13:19:31 +0000 (15:19 +0200)]
migration/rdma: Consistently use uint64_t for work request IDs

We use int instead of uint64_t in a few places.  Change them to
uint64_t.

This cleans up a comparison of signed qemu_rdma_block_for_wrid()
parameter @wrid_requested with unsigned @wr_id.  Harmless, because the
actual arguments are non-negative enumeration constants.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-6-armbru@redhat.com>

10 months agomigration/rdma: Drop fragile wr_id formatting
Markus Armbruster [Thu, 28 Sep 2023 13:19:30 +0000 (15:19 +0200)]
migration/rdma: Drop fragile wr_id formatting

wrid_desc[] uses 4001 pointers to map four integer values to strings.

print_wrid() accesses wrid_desc[] out of bounds when passed a negative
argument.  It returns null for values 2..1999 and 2001..3999.

qemu_rdma_poll() and qemu_rdma_block_for_wrid() print wrid_desc[wr_id]
and passes print_wrid(wr_id) to tracepoints.  Could conceivably crash
trying to format a null string.  I believe access out of bounds is not
possible.

Not worth cleaning up.  Dumb down to show just numeric wr_id.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-5-armbru@redhat.com>

10 months agomigration/rdma: Clean up rdma_delete_block()'s return type
Markus Armbruster [Thu, 28 Sep 2023 13:19:29 +0000 (15:19 +0200)]
migration/rdma: Clean up rdma_delete_block()'s return type

rdma_delete_block() always returns 0, which its only caller ignores.
Return void instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-4-armbru@redhat.com>

10 months agomigration/rdma: Clean up qemu_rdma_data_init()'s return type
Markus Armbruster [Thu, 28 Sep 2023 13:19:28 +0000 (15:19 +0200)]
migration/rdma: Clean up qemu_rdma_data_init()'s return type

qemu_rdma_data_init() return type is void *.  It actually returns
RDMAContext *, and all its callers assign the value to an
RDMAContext *.  Unclean.

Return RDMAContext * instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-3-armbru@redhat.com>

10 months agomigration/rdma: Clean up qemu_rdma_poll()'s return type
Markus Armbruster [Thu, 28 Sep 2023 13:19:27 +0000 (15:19 +0200)]
migration/rdma: Clean up qemu_rdma_poll()'s return type

qemu_rdma_poll()'s return type is uint64_t, even though it returns 0,
-1, or @ret, which is int.  Its callers assign the return value to int
variables, then check whether it's negative.  Unclean.

Return int instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-2-armbru@redhat.com>

10 months agomigration: Allow RECOVER->PAUSED convertion for dest qemu
Peter Xu [Wed, 4 Oct 2023 22:02:39 +0000 (18:02 -0400)]
migration: Allow RECOVER->PAUSED convertion for dest qemu

There's a bug on dest that if a double fault triggered on dest qemu (a
network issue during postcopy-recover), we won't set PAUSED correctly
because we assumed we always came from ACTIVE.

Fix that by always overwriting the state to PAUSE.

We could also check for these two states, but maybe it's an overkill.  We
did the same on the src QEMU to unconditionally switch to PAUSE anyway.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-10-peterx@redhat.com>

10 months agotests/qtest: migration: Add support for negative testing of qmp_migrate
Fabiano Rosas [Wed, 12 Jul 2023 19:07:41 +0000 (16:07 -0300)]
tests/qtest: migration: Add support for negative testing of qmp_migrate

There is currently no way to write a test for errors that happened in
qmp_migrate before the migration has started.

Add a version of qmp_migrate that ensures an error happens. To make
use of it a test needs to set MigrateCommon.result as
MIG_TEST_QMP_ERROR.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-6-farosas@suse.de>

10 months agomigration: Set migration status early in incoming side
Fabiano Rosas [Wed, 12 Jul 2023 19:07:40 +0000 (16:07 -0300)]
migration: Set migration status early in incoming side

We are sending a migration event of MIGRATION_STATUS_SETUP at
qemu_start_incoming_migration but never actually setting the state.

This creates a window between qmp_migrate_incoming and
process_incoming_migration_co where the migration status is still
MIGRATION_STATUS_NONE. Calling query-migrate during this time will
return an empty response even though the incoming migration command
has already been issued.

Commit 7cf1fe6d68 ("migration: Add migration events on target side")
has added support to the 'events' capability to the incoming part of
migration, but chose to send the SETUP event without setting the
state. I'm assuming this was a mistake.

This introduces a change in behavior, any QMP client waiting for the
SETUP event will hang, unless it has previously enabled the 'events'
capability. Having the capability enabled is sufficient to continue to
receive the event.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-5-farosas@suse.de>

10 months agotests/qtest: migration: Use migrate_incoming_qmp where appropriate
Fabiano Rosas [Wed, 12 Jul 2023 19:07:39 +0000 (16:07 -0300)]
tests/qtest: migration: Use migrate_incoming_qmp where appropriate

Use the new migrate_incoming_qmp helper in the places that currently
open-code calling migrate-incoming.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-4-farosas@suse.de>

10 months agotests/qtest: migration: Add migrate_incoming_qmp helper
Fabiano Rosas [Wed, 12 Jul 2023 19:07:38 +0000 (16:07 -0300)]
tests/qtest: migration: Add migrate_incoming_qmp helper

file-based migration requires the target to initiate its migration after
the source has finished writing out the data in the file. Currently
there's no easy way to initiate 'migrate-incoming', allow this by
introducing migrate_incoming_qmp helper, similarly to migrate_qmp.

Also make sure migration events are enabled and wait for the incoming
migration to start before returning. This avoid a race when querying
the migration status too soon after issuing the command.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-3-farosas@suse.de>

10 months agotests/qtest: migration: Expose migrate_set_capability
Fabiano Rosas [Wed, 12 Jul 2023 19:07:37 +0000 (16:07 -0300)]
tests/qtest: migration: Expose migrate_set_capability

The following patch will make use of this function from within
migrate-helpers.c, so move it there.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-2-farosas@suse.de>

10 months agomigration/qmp: Fix crash on setting tls-authz with null
Peter Xu [Tue, 5 Sep 2023 16:23:32 +0000 (12:23 -0400)]
migration/qmp: Fix crash on setting tls-authz with null

QEMU will crash if anyone tries to set tls-authz (which is a type
StrOrNull) with 'null' value.  Fix it in the easy way by converting it to
qstring just like the other two tls parameters.

Cc: qemu-stable@nongnu.org # v4.0+
Fixes: d2f1d29b95 ("migration: add support for a "tls-authz" migration parameter")
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230905162335.235619-2-peterx@redhat.com>

10 months agocontrib/plugins: fix coverity warning in hotblocks
Alex Bennée [Mon, 9 Oct 2023 16:41:04 +0000 (17:41 +0100)]
contrib/plugins: fix coverity warning in hotblocks

Coverity complains that we have an unbalance use of mutex leading to
potential deadlocks.

Fixes: CID 1519048
Fixes: a208ba09bd ("tests/plugin: add a hotblocks plugin")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-26-alex.bennee@linaro.org>

10 months agocontrib/plugins: fix coverity warning in lockstep
Alex Bennée [Mon, 9 Oct 2023 16:41:03 +0000 (17:41 +0100)]
contrib/plugins: fix coverity warning in lockstep

Coverity complains that e don't check for a truncation when copying in
the path. Bail if we can't copy the whole path into sockaddr.

Fixes: CID 1519045
Fixes: CID 1519046
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-25-alex.bennee@linaro.org>

10 months agocontrib/plugins: fix coverity warning in cache
Alex Bennée [Mon, 9 Oct 2023 16:41:02 +0000 (17:41 +0100)]
contrib/plugins: fix coverity warning in cache

Coverity complains that appends_stats_line can be fed a 0 leading
to the undefined behaviour of a divide by 0.

Fixes: CID 1519044
Fixes: CID 1519047
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-24-alex.bennee@linaro.org>

10 months agoplugins: Set final instruction count in plugin_gen_tb_end
Matt Borgerson [Mon, 9 Oct 2023 16:41:01 +0000 (17:41 +0100)]
plugins: Set final instruction count in plugin_gen_tb_end

Translation logic may partially decode an instruction, then abort and
remove the instruction from the TB. This can happen for example when an
instruction spans two pages. In this case, plugins may get an incorrect
result when calling qemu_plugin_tb_n_insns to query for the number of
instructions in the TB. This patch updates plugin_gen_tb_end to set the
final instruction count.

Signed-off-by: Matt Borgerson <contact@mborgerson.com>
[AJB: added g_assert to defed API]
Message-Id: <CADc=-s5RwGViNTR-h5cq3np673W3RRFfhr4vCGJp0EoDUxvhog@mail.gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-23-alex.bennee@linaro.org>

10 months agotarget/sh4: Disable decode_gusa when plugins enabled
Richard Henderson [Mon, 9 Oct 2023 16:41:00 +0000 (17:41 +0100)]
target/sh4: Disable decode_gusa when plugins enabled

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230824181233.1568795-3-richard.henderson@linaro.org>
[AJB: fixed s/cpu_env/tcg_env/ during re-base]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-22-alex.bennee@linaro.org>

10 months agoaccel/tcg: Add plugin_enabled to DisasContextBase
Richard Henderson [Mon, 9 Oct 2023 16:40:59 +0000 (17:40 +0100)]
accel/tcg: Add plugin_enabled to DisasContextBase

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230824181233.1568795-2-richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-21-alex.bennee@linaro.org>

10 months agogdbstub: Replace gdb_regs with an array
Akihiko Odaki [Mon, 9 Oct 2023 16:40:58 +0000 (17:40 +0100)]
gdbstub: Replace gdb_regs with an array

An array is a more appropriate data structure than a list for gdb_regs
since it is initialized only with append operation and read-only after
initialization.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230912224107.29669-13-akihiko.odaki@daynix.com>
[AJB: fixed a checkpatch violation]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-20-alex.bennee@linaro.org>

10 months agogdbstub: Remove gdb_has_xml variable
Akihiko Odaki [Mon, 9 Oct 2023 16:40:57 +0000 (17:40 +0100)]
gdbstub: Remove gdb_has_xml variable

GDB has XML support since 6.7 which was released in 2007.
It's time to remove support for old GDB versions without XML support.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230912224107.29669-12-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-19-alex.bennee@linaro.org>

10 months agotarget/ppc: Remove references to gdb_has_xml
Akihiko Odaki [Mon, 9 Oct 2023 16:40:56 +0000 (17:40 +0100)]
target/ppc: Remove references to gdb_has_xml

GDB has XML support since 6.7 which was released in 2007.
It's time to remove support for old GDB versions without XML support.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230912224107.29669-11-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-18-alex.bennee@linaro.org>

10 months agotarget/arm: Remove references to gdb_has_xml
Akihiko Odaki [Mon, 9 Oct 2023 16:40:55 +0000 (17:40 +0100)]
target/arm: Remove references to gdb_has_xml

GDB has XML support since 6.7 which was released in 2007.
It's time to remove support for old GDB versions without XML support.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230912224107.29669-10-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-17-alex.bennee@linaro.org>

10 months agogdbstub: Use g_markup_printf_escaped()
Akihiko Odaki [Mon, 9 Oct 2023 16:40:54 +0000 (17:40 +0100)]
gdbstub: Use g_markup_printf_escaped()

g_markup_printf_escaped() is a safer alternative to simple printf() as
it automatically escapes values.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230912224107.29669-9-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-16-alex.bennee@linaro.org>

10 months agohw/core/cpu: Return static value with gdb_arch_name()
Akihiko Odaki [Mon, 9 Oct 2023 16:40:53 +0000 (17:40 +0100)]
hw/core/cpu: Return static value with gdb_arch_name()

All implementations of gdb_arch_name() returns dynamic duplicates of
static strings. It's also unlikely that there will be an implementation
of gdb_arch_name() that returns a truly dynamic value due to the nature
of the function returning a well-known identifiers. Qualify the value
gdb_arch_name() with const and make all of its implementations return
static strings.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230912224107.29669-8-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-15-alex.bennee@linaro.org>