Observed issue
==============
The urcu wait_gp() implements a futex wait/wakeup scheme identical to
the workqueue code, which has an issue with spurious wakeups.

A spurious wakeup can cause wait_gp() to return while rcu_gp.futex is
still -1, which is unexpected. Each following loop iteration in
wait_for_readers() then decrements rcu_gp.futex again, taking it below
-1. The loop actively uses CPU while the value keeps decreasing, until
it either wraps around to 0 through underflow or the input_readers list
is found to be empty.

When the input_readers list is found to be empty, the state is reset to
0, which restores a correct futex state for the following calls to
wait_for_readers().

This issue causes unexpectedly high CPU use, but does not lead to data
corruption.
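
The runaway decrement described above can be illustrated with a small
single-threaded toy model. This is only a sketch, not the urcu code: the
names toy_wait_gp_spurious() and toy_wait_for_readers() are hypothetical,
the wait always returns as if the wakeup were spurious, and the registry
lock, memory barriers and active-attempt threshold are left out.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for wait_gp() before the fix: it returns at
 * once, as if FUTEX_WAIT had reported a spurious wakeup, and leaves
 * the futex word negative.
 */
static void toy_wait_gp_spurious(int32_t *futex_word)
{
	(void)futex_word;	/* value left untouched */
}

/*
 * Toy model of the waiter loop: every pass decrements the futex word
 * before waiting. spurious_wakeups is the number of spurious returns
 * seen before the (modelled) input_readers list becomes empty.
 * Returns the lowest value the futex word reached.
 */
static int32_t toy_wait_for_readers(int32_t *futex_word, int spurious_wakeups)
{
	int32_t lowest = *futex_word;

	for (;;) {
		(*futex_word)--;	/* models uatomic_dec(&rcu_gp.futex) */
		if (*futex_word < lowest)
			lowest = *futex_word;
		if (spurious_wakeups-- == 0) {
			*futex_word = 0;	/* list empty: state restored */
			break;
		}
		toy_wait_gp_spurious(futex_word);
	}
	return lowest;
}
```

In this model, starting from 0 with four spurious wakeups, the word drops
to -5 before being reset to 0, which mirrors the "below -1" state the
description refers to.
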
Cause
=====
From futex(2):
FUTEX_WAIT
Returns 0 if the caller was woken up. Note that a wake-up can
also be caused by common futex usage patterns in unrelated code
that happened to have previously used the futex word's memory
location (e.g., typical futex-based implementations of Pthreads
mutexes can cause this under some conditions). Therefore, callers
should always conservatively assume that a return value of 0
can mean a spurious wake-up, and use the futex word's value
(i.e., the user-space synchronization scheme) to decide whether
to continue to block or not.
Solution
========
We therefore need to validate whether the value differs from -1 in
user-space after the call to FUTEX_WAIT returns 0.
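
The re-check can be sketched in isolation as follows. This is a hedged
illustration rather than the patched function: wait_fn is a hypothetical
hook standing in for futex_async(..., FUTEX_WAIT, -1, ...), C11 atomics
replace the uatomic API, and the helper returns an error code where the
real code calls urcu_die(). demo_wait() and demo_run() are hypothetical
helpers that inject three spurious wakeups.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical wait hook: returns 0 on wakeup, or -1 with errno set. */
typedef int (*wait_fn)(_Atomic int32_t *word, int32_t expected);

/*
 * Block while *word is -1. Returns 0 once the value differs from -1,
 * or an errno value on an unexpected error.
 */
static int wait_while_minus_one(_Atomic int32_t *word, wait_fn wait)
{
	while (atomic_load(word) == -1) {
		if (wait(word, -1) == 0)
			continue;	/* possibly spurious: re-check the value */
		switch (errno) {
		case EWOULDBLOCK:
			return 0;	/* value already changed */
		case EINTR:
			break;		/* interrupted by a signal: re-check */
		default:
			return errno;	/* unexpected error */
		}
	}
	return 0;
}

static int demo_calls;

/* Fake wait: the first three returns are spurious, the fourth is genuine. */
static int demo_wait(_Atomic int32_t *word, int32_t expected)
{
	(void)expected;
	if (++demo_calls > 3)
		atomic_store(word, 0);	/* the waker changed the value */
	return 0;
}

/* Returns how many times the wait hook ran, or -1 on error. */
static int demo_run(void)
{
	_Atomic int32_t word = -1;

	demo_calls = 0;
	if (wait_while_minus_one(&word, demo_wait) != 0)
		return -1;
	return demo_calls;
}
```

Because the loop condition reads the value itself, a return of 0 from the
wait never ends the loop on its own; only an observed value other than -1
does.
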
Known drawbacks
===============
None.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I83942e24c32e77395ff25b466f1b1640422b9eb5
smp_mb_master();
/* Temporarily unlock the registry lock. */
mutex_unlock(&rcu_registry_lock);
- if (uatomic_read(&rcu_gp.futex) != -1)
- goto end;
- while (futex_async(&rcu_gp.futex, FUTEX_WAIT, -1,
- NULL, NULL, 0)) {
+ while (uatomic_read(&rcu_gp.futex) == -1) {
+ if (!futex_async(&rcu_gp.futex, FUTEX_WAIT, -1, NULL, NULL, 0)) {
+ /*
+ * Prior wakeups queued by unrelated code
+ * using the same address can cause futex wait to
+ * return 0 even though the futex value is still
+ * -1 (spurious wakeups). Check the value again
+ * in user-space to validate whether it really
+ * differs from -1.
+ */
+ continue;
+ }
/* Value already changed. */
goto end;
case EINTR:
/* Retry if interrupted by signal. */
- break; /* Get out of switch. */
+ break; /* Get out of switch. Check again. */
default:
/* Unexpected error. */
urcu_die(errno);