call_rcu: use futex for wakeup scheme
If I remove the URCU_CALL_RCU_RT flag from the rbtree single writer
test, thus using the pthread_cond_signal mechanism, there is a huge
slowdown: without cpu affinity for the worker threads, it crawls to 129
updates/s (looks like mutex contention between the thread calling
call_rcu and the call_rcu thread). Adding CPU affinity to the per-cpu
call_rcu threads, I get 546 updates/s, which is slightly better (better
cache locality, and maybe the mutex contention is not as bad thanks to
the two threads sharing the same CPU).
So I decided to try replacing pthread_cond_wait/signal with my
futex-based implementation I use for the rest of the urcu lib: it has
the advantage of removing the mutex from the call_rcu() execution
entirely, sampling the "futex" variable without any mutex whatsoever for
the case where no wakeup is needed.
Disabling URCU_CALL_RCU_RT flag, with per-cpu affined call_rcu threads,
with my futex-based wakeup implementation, I get 55754 updates/s (even
better than with URCU_CALL_RCU_RT flag!).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
This page took 0.025328 seconds and 4 git commands to generate.