From: Mathieu Desnoyers
Date: Thu, 23 Jun 2022 19:58:04 +0000 (-0400)
Subject: Fix: sessiond wait futex: handle spurious futex wakeups
X-Git-Tag: v2.13.4~14
X-Git-Url: https://git.lttng.org./?a=commitdiff_plain;h=362ea43ac96394dc68dc22677cd9b84f5fa29b76;p=lttng-ust.git

Fix: sessiond wait futex: handle spurious futex wakeups

Observed issue
==============

The LTTng-UST scheme for letting listener threads wait for the session
daemon to wake up a futex is similar to the liburcu workqueue code,
which has an issue with spurious wakeups.

This wait/wakeup scheme is only used after the LTTng-UST listener
thread has been unable to connect to the session daemon.

A spurious wakeup on wait_for_sessiond can cause wait_for_sessiond to
return with a sock_info->wait_shm_mmap state of 0, which is unexpected.
However, this should not cause any user-observable issues other than
using slightly more CPU time than strictly needed, because this
spurious wakeup will only cause an additional connection attempt to
the session daemon to fail.

Cause
=====

From futex(2):

       FUTEX_WAIT
              Returns 0 if the caller was woken up. Note that a wake-up
              can also be caused by common futex usage patterns in
              unrelated code that happened to have previously used the
              futex word's memory location (e.g., typical futex-based
              implementations of Pthreads mutexes can cause this under
              some conditions). Therefore, callers should always
              conservatively assume that a return value of 0 can mean a
              spurious wake-up, and use the futex word's value (i.e.,
              the user-space synchronization scheme) to decide whether
              to continue to block or not.

Solution
========

We therefore need to validate whether the value differs from 0 in
user-space after the call to FUTEX_WAIT returns 0.

Known drawbacks
===============

None.

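The re-check pattern described under "Solution" can be sketched outside
of LTTng-UST as follows. This is a minimal, Linux-only illustration and
not the code from this patch: futex_call(), wait_until_nonzero() and
wake_waiters() are hypothetical helper names, and the raw futex system
call is used instead of lttng_ust_futex_async().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper: glibc exposes no futex() function. */
static long futex_call(int32_t *uaddr, int op, int32_t val)
{
	return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/*
 * Block until *word differs from 0.
 * Returns 0 on success, -1 on an unexpected error (errno is set).
 */
static int wait_until_nonzero(int32_t *word)
{
	while (!__atomic_load_n(word, __ATOMIC_SEQ_CST)) {
		if (!futex_call(word, FUTEX_WAIT, 0)) {
			/* Woken up, possibly spuriously: re-check the value. */
			continue;
		}
		switch (errno) {
		case EAGAIN:	/* Value was already != 0 at wait time. */
		case EINTR:	/* Interrupted by a signal. */
			continue;	/* Re-evaluate the loop condition. */
		default:
			return -1;
		}
	}
	return 0;
}

/* Publish the new value first, then wake every waiter on the word. */
static int wake_waiters(int32_t *word)
{
	__atomic_store_n(word, 1, __ATOMIC_SEQ_CST);
	return futex_call(word, FUTEX_WAKE, INT_MAX) < 0 ? -1 : 0;
}
```

Because the loop condition is the futex word itself, a return value of
0 from FUTEX_WAIT is never trusted on its own: the waiter only leaves
the loop once it observes a nonzero value in user-space.
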
Signed-off-by: Mathieu Desnoyers
Change-Id: I468d8ff302f467ee9924e6edb04476fcb031b4b9
---

diff --git a/src/lib/lttng-ust/lttng-ust-comm.c b/src/lib/lttng-ust/lttng-ust-comm.c
index ed5bb8c8..a46ca4e4 100644
--- a/src/lib/lttng-ust/lttng-ust-comm.c
+++ b/src/lib/lttng-ust/lttng-ust-comm.c
@@ -1757,18 +1757,25 @@ void wait_for_sessiond(struct sock_info *sock_info)
 
 	DBG("Waiting for %s apps sessiond", sock_info->name);
 	/* Wait for futex wakeup */
-	if (uatomic_read((int32_t *) sock_info->wait_shm_mmap))
-		goto end_wait;
-
-	while (lttng_ust_futex_async((int32_t *) sock_info->wait_shm_mmap,
-			FUTEX_WAIT, 0, NULL, NULL, 0)) {
+	while (!uatomic_read((int32_t *) sock_info->wait_shm_mmap)) {
+		if (!lttng_ust_futex_async((int32_t *) sock_info->wait_shm_mmap, FUTEX_WAIT, 0, NULL, NULL, 0)) {
+			/*
+			 * Prior queued wakeups queued by unrelated code
+			 * using the same address can cause futex wait to
+			 * return 0 even through the futex value is still
+			 * 0 (spurious wakeups). Check the value again
+			 * in user-space to validate whether it really
+			 * differs from 0.
+			 */
+			continue;
+		}
 		switch (errno) {
-		case EWOULDBLOCK:
+		case EAGAIN:
 			/* Value already changed. */
 			goto end_wait;
 		case EINTR:
 			/* Retry if interrupted by signal. */
-			break;	/* Get out of switch. */
+			break;	/* Get out of switch. Check again. */
 		case EFAULT:
 			wait_poll_fallback = 1;
 			DBG(