From: Mathieu Desnoyers Date: Wed, 22 Jun 2022 20:28:53 +0000 (-0400) Subject: Fix: workqueue: futex wait: handle spurious futex wakeups X-Git-Tag: v0.12.4~16 X-Git-Url: https://git.lttng.org./?a=commitdiff_plain;h=998a67fe12ae9c054f7e6ef129e51292ad426438;p=userspace-rcu.git Fix: workqueue: futex wait: handle spurious futex wakeups Observed issue ============== The workqueue thread futex_wait() returns with a workqueue->futex state of -1, which is unexpected. In this situation, the workqueue thread is observed to use 99% of CPU as workqueue->futex values are decremented to very low negative values while the workqueue is empty. This issue will cause spurious unexpected high CPU use, but will not lead to data corruption. Cause ===== From futex(5): FUTEX_WAIT Returns 0 if the caller was woken up. Note that a wake-up can also be caused by common futex usage patterns in unrelated code that happened to have previously used the futex word's memory location (e.g., typical futex-based implementations of Pthreads mutexes can cause this under some conditions). Therefore, call‐ ers should always conservatively assume that a return value of 0 can mean a spurious wake-up, and use the futex word's value (i.e., the user-space synchronization scheme) to decide whether to continue to block or not. Solution ======== We therefore need to validate whether the value differs from -1 in user-space after the call to FUTEX_WAIT returns 0. Known drawbacks =============== None. Signed-off-by: Mathieu Desnoyers Change-Id: Id024e7d3b2dab75d30fc01280fd27e5f2d8af0d1 --- diff --git a/src/workqueue.c b/src/workqueue.c index 94bd514..1e0cbba 100644 --- a/src/workqueue.c +++ b/src/workqueue.c @@ -132,18 +132,29 @@ static int set_thread_cpu_affinity(struct urcu_workqueue *workqueue __attribute_ static void futex_wait(int32_t *futex) { + int ret; + /* Read condition before read futex */ cmm_smp_mb(); - if (uatomic_read(futex) != -1) - return; - while (futex_async(futex, FUTEX_WAIT, -1, NULL, NULL, 0)) { + while (uatomic_read(futex) == -1) { + if (!futex_async(futex, FUTEX_WAIT, -1, NULL, NULL, 0)) { + /* + * Prior queued wakeups queued by unrelated code + * using the same address can cause futex wait to + * return 0 even through the futex value is still + * -1 (spurious wakeups). Check the value again + * in user-space to validate whether it really + * differs from -1. + */ + continue; + } switch (errno) { - case EWOULDBLOCK: + case EAGAIN: /* Value already changed. */ return; case EINTR: /* Retry if interrupted by signal. */ - break; /* Get out of switch. */ + break; /* Get out of switch. Check again. */ default: /* Unexpected error. */ urcu_die(errno);