From fece27760ff57a0ca16c37e144f6a58a2f1851e9 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 2 May 2018 20:04:12 -0700 Subject: [PATCH] rcu: Suppress false-positive preempted-task splats Consider the following sequence of events in a PREEMPT=y kernel: 1. All CPUs corresponding to a given rcu_node structure go offline. A new grace period starts just after the CPU-hotplug code path does its synchronize_rcu() for the last CPU, so at least this CPU is present in that structure's ->qsmask. 2. Before the grace period ends, a CPU comes back online, and not just any CPU, but the one corresponding to a non-zero bit in the leaf rcu_node structure's ->qsmask. 3. A task running on the newly onlined CPU is preempted while in an RCU read-side critical section. Because this CPU's ->qsmask bit is net, not only does this task queue itself on the leaf rcu_node structure's ->blkd_tasks list, it also sets that structure's ->gp_tasks pointer to reference it. 4. The grace period started in #1 above comes to an end. This results in rcu_gp_cleanup() being invoked, which, among other things, checks to make sure that there are no tasks blocking the just-ended grace period, that is, that all ->gp_tasks pointers are NULL. The ->gp_tasks pointer corresponding to the task preempted in #3 above is non-NULL, which results in a splat. This splat is a false positive. The task's RCU read-side critical section cannot have begun before the just-ended grace period because this would mean either: (1) The CPU came online before the grace period started, which cannot have happened because the grace period started before that CPU was all the way offline, or (2) The task started its RCU read-side critical section on some other CPU, but then it would have had to have been preempted before migrating to this CPU, which would mean that it would have instead queued itself on that other CPU's rcu_node structure. This commit eliminates this false positive by adding code to the end of rcu_cleanup_dying_idle_cpu() that reports a quiescent state to RCU, which has the side-effect of clearing that CPU's ->qsmask bit, preventing the above scenario. This approach has the added benefit of more promptly reporting quiescent states corresponding to offline CPUs. Note well that the call to rcu_report_qs_rnp() reporting the quiescent state must come -before- the clearing of this CPU's bit in the leaf rcu_node structure's ->qsmaskinitnext field. Otherwise, lockdep-RCU will complain bitterly about quiescent states coming from an offline CPU. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index a7773fc72b0c..72adf97458e3 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3731,6 +3731,11 @@ static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp) /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ mask = rdp->grpmask; raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */ + if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */ + /* Report quiescent state -before- changing ->qsmaskinitnext! */ + rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); + raw_spin_lock_irqsave_rcu_node(rnp, flags); + } rnp->qsmaskinitnext &= ~mask; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } -- 2.11.0