KVM: PPC: Book3S HV: Fix decrementer migration
author Fabiano Rosas <farosas@linux.ibm.com>
Tue, 16 Aug 2022 22:25:17 +0000 (19:25 -0300)
committer Michael Ellerman <mpe@ellerman.id.au>
Tue, 27 Sep 2022 15:07:19 +0000 (01:07 +1000)

We used to have a workaround[1] for a hang during migration that was
made ineffective when we converted the decrementer expiry to be
relative to guest timebase.

The point of the workaround was that in the absence of an explicit
decrementer expiry value provided by userspace during migration, KVM
needs to initialize dec_expires to a value that will result in an
expired decrementer after subtracting the current guest timebase. That
stops the vcpu from hanging after migration due to a decrementer
that's too large.

Since dec_expires is now relative to the guest timebase, its
initialization needs to be relative to the guest timebase as well;
otherwise we end up with a decrementer expiry that is still larger
than the guest timebase.
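
The arithmetic behind this can be sketched in a small standalone C model.
It is only an illustration, not kernel code: the helper names
(guest_tb, dec_remaining, init_expiry_host_relative,
init_expiry_guest_relative, decrementer_model_demo) and the sample
values are hypothetical. It assumes the guest timebase is the host
timebase plus tb_offset, that the time left on the decrementer is
dec_expires minus the current guest timebase, and that the guest
timebase is behind the destination host's (a negative offset).

```c
#include <assert.h>
#include <stdint.h>

/* Guest timebase: host timebase plus the offset set by userspace.
 * Unsigned wraparound models a negative offset. */
static inline uint64_t guest_tb(uint64_t host_tb, uint64_t tb_offset)
{
	return host_tb + tb_offset;
}

/* Ticks left before the decrementer fires; <= 0 means already expired. */
static inline int64_t dec_remaining(uint64_t dec_expires, uint64_t host_tb,
				    uint64_t tb_offset)
{
	return (int64_t)(dec_expires - guest_tb(host_tb, tb_offset));
}

/* Default removed by this patch: host timebase at vcpu creation. */
static inline uint64_t init_expiry_host_relative(uint64_t host_tb)
{
	return host_tb;
}

/* Default added by this patch: guest timebase when the offset is set. */
static inline uint64_t init_expiry_guest_relative(uint64_t host_tb,
						  uint64_t tb_offset)
{
	return host_tb + tb_offset;
}

/* Hypothetical sample values: guest timebase is 2^24 ticks behind the host. */
void decrementer_model_demo(void)
{
	uint64_t host_tb = 1ULL << 40;
	uint64_t tb_offset = (uint64_t)0 - (1ULL << 24);

	/* Host-relative default: the decrementer still has 2^24 ticks to run. */
	assert(dec_remaining(init_expiry_host_relative(host_tb),
			     host_tb, tb_offset) == (int64_t)(1LL << 24));

	/* Guest-relative default: expired as soon as the vcpu runs. */
	assert(dec_remaining(init_expiry_guest_relative(host_tb, tb_offset),
			     host_tb + 100, tb_offset) == -100);
}
```

In this model the host-relative default leaves a positive remainder equal
to the magnitude of the offset, which grows with the gap between the two
timebases, while the guest-relative default is already at or below zero.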

[1] https://git.kernel.org/torvalds/c/5855564c8ab2

Fixes: 3c1a4322bba7 ("KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220816222517.1916391-1-farosas@linux.ibm.com
arch/powerpc/kvm/book3s_hv.c
arch/powerpc/kvm/powerpc.c

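diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 57d0835..917abda 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c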
@@ -2517,10 +2517,24 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
                r = set_vpa(vcpu, &vcpu->arch.dtl, addr, len);
                break;
        case KVM_REG_PPC_TB_OFFSET:
+       {
                /* round up to multiple of 2^24 */
-               vcpu->arch.vcore->tb_offset =
-                       ALIGN(set_reg_val(id, *val), 1UL << 24);
+               u64 tb_offset = ALIGN(set_reg_val(id, *val), 1UL << 24);
+
+               /*
+                * Now that we know the timebase offset, update the
+                * decrementer expiry with a guest timebase value. If
+                * the userspace does not set DEC_EXPIRY, this ensures
+                * a migrated vcpu at least starts with an expired
+                * decrementer, which is better than a large one that
+                * causes a hang.
+                */
+               if (!vcpu->arch.dec_expires && tb_offset)
+                       vcpu->arch.dec_expires = get_tb() + tb_offset;
+
+               vcpu->arch.vcore->tb_offset = tb_offset;
                break;
+       }
        case KVM_REG_PPC_LPCR:
                kvmppc_set_lpcr(vcpu, set_reg_val(id, *val), true);
                break;
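diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index fb14907..757491d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c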
@@ -786,7 +786,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
        hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
        vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
-       vcpu->arch.dec_expires = get_tb();
 
 #ifdef CONFIG_KVM_EXIT_TIMING
        mutex_init(&vcpu->arch.exit_timing_lock);