Fix: waiter: futex wait: handle spurious futex wakeups
Observed issue
==============
The waiter lttng_waiter_wait() implements a futex wait/wakeup
scheme similar to the liburcu workqueue code, which has an issue with
spurious wakeups.
A spurious wakeup on lttng_waiter_wait can cause
lttng_waiter_wait to reach label skip_futex_wait with a
waiter->state state of WAITER_WAITING, which is unexpected. It would
cause busy-waiting on WAITER_TEARDOWN state to start early. The
wait-teardown stage is done with WAIT_ATTEMPTS active attempts,
following by attempts spaced by 10ms sleeps. I do not expect that these
spurious wakeups will cause user-observable effects other than being
slightly less efficient that it should be.
This issue will cause spurious unexpected high CPU use, but will not
lead to data corruption.
Cause
=====
From futex(5):
FUTEX_WAIT
Returns 0 if the caller was woken up. Note that a wake-up can
also be caused by common futex usage patterns in unrelated code
that happened to have previously used the futex word's memory
location (e.g., typical futex-based implementations of Pthreads
mutexes can cause this under some conditions). Therefore, call‐
ers should always conservatively assume that a return value of 0
can mean a spurious wake-up, and use the futex word's value
(i.e., the user-space synchronization scheme) to decide whether
to continue to block or not.
Solution
========
We therefore need to validate whether the value differs from
WAITER_WAITING in user-space after the call to FUTEX_WAIT returns 0.
Known drawbacks
===============
None.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ida9905d1f0b5d9543c8b85ecbd7d748a6f7c1c97
This page took 0.027639 seconds and 4 git commands to generate.