Fix: urcu-bp: re-initialize list head on library exit
In case an application would try to create threads after the urcu-bp
library destructor has run, make sure the arena chunk list is
re-initialized after the memory mappings are unmapped.
"make distcheck" marks each source file on the srcdir in the extracted
dist tarball read-only. The examples copy from the srcdir into the
builddir before running the "make" examples, but this keeps the
read-only flag on the builddir directories, which fails the build
because the resulting objects cannot be created.
Fix this by ensuring the copied target directory for each example is
user-writeable.
Introduce __cds_wfcq_head_cast and cds_wfcq_head_cast for compability
of wfcqueue with c++. Those are effect-less in C, where transparent
unions are supported. However, in C++, those transform struct
cds_wfcq_head and struct __cds_wfcq_head pointers to
cds_wfcq_head_ptr_t.
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Marek Vasut [Tue, 9 Feb 2016 18:30:22 +0000 (19:30 +0100)]
Support for NIOS2 architecture
Add support for the Altera NIOS2 CPU archirecture. The atomic operations
are handled by the GCC. The memory barriers on this systems are entirely
trivial too, since the CPU does not support SMP at all.
Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Add a urcu_ref_get_safe API, which returns a boolean. It takes the value
"false" if a LONG_MAX overflow would occur (get is not performed in this
case), or true otherwise.
It warns the user (at compile-time) if the return value is unchecked.
The urcu refcounting API features a look and feel similar to the Linux
kernel reference counting API, which has been the subject of
CVE-2016-0728 (use-after-free). Therefore, improve the urcu refcounting
API by dealing with reference counting overflow.
For urcu_ref_get(), handle this by comparing the prior value with
LONG_MAX before updating it with a cmpxchg. When an overflow would
occur, trigger a abort() rather than allowing the overflow (which is a
use-after-free security concern).
For urcu_ref_get_unless_zero(), in addition to compare the prior value
to 0, also compare it to LONG_MAX, and return failure (false) in both
cases.
Fix: compat_futex should work-around futex signal-restart kernel bug
When testing liburcu on a 3.18 Linux kernel, 2-core MIPS (cpu model :
Ingenic JZRISC V4.15 FPU V0.0), we notice that a blocked sys_futex
FUTEX_WAIT returns -1, errno=ENOSYS when interrupted by a SA_RESTART
signal handler. This spurious ENOSYS behavior causes hangs in liburcu
0.9.x. Running a MIPS 3.18 kernel under a QEMU emulator exhibits the
same behavior. This might affect earlier kernels.
This issue appears to be fixed in 3.19 since commit e967ef022 "MIPS: Fix
restart of indirect syscalls", but nevertheless, we should try to handle
this kernel bug more gracefully than a user-space hang due to unexpected
spurious ENOSYS return value.
Therefore, fallback on the "async-safe" version of compat_futex in those
situations where FUTEX_WAIT returns ENOSYS. This async-safe fallback has
the nice property of being OK to use concurrently with other FUTEX_WAKE
and FUTEX_WAIT futex() calls, because it's simply a busy-wait scheme.
The 4.2 kernel on parisc, and likely newer kernels too, are also
affected by a similar issue.
Didier Nadeau [Wed, 16 Dec 2015 20:02:47 +0000 (15:02 -0500)]
Support for Xeon-Phi with newer MPSS
The Xeon-Phi is now considered as a new architecture instead of a
vendor in MPSS version 3.4.4. This change is backward compatible with
previous MPSS versions.
Now that the membarrier system call is allocated on sparc, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
Now that the membarrier system call is allocated on hppa (parisc),
allocate its number in our architecture header if the system headers
don't allocate it. This allows using the membarrier system call as soon
as implemented in the kernel, even if the distribution has old kernel
headers.
The signal-based urcu flavor calls smp_mb_master() within the wait_gp()
function. Since commit "Fix: deadlock when thread join is issued in
read-side C.S.", wait_gp() is called without the registry lock held.
Ensure that the registry lock is only released around the wait per se,
not around the call to smp_mb_master(), otherwise we end up iterating on
a non-consistent thread registry in smp_mb_master().
- Migrate benchmarks and regression tests to tap,
- Replace the "bench" make target by "short_bench" and "long_bench".
The short benchmark is 3 seconds per test, and the long one is 30
seconds per test,
- make regtest now invokes the benchmarks with only 1 second per
benchmark.
- Now use "nproc" command to detect the number of available CPUs rather
than hardcoding a value.
- rcutorture in "stress" mode is now executed.
Now that the membarrier system call is allocated on tile, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
Do so by creating headers specifically for tile, which rely on the
gcc atomic and memory barrier builtins.
Now that the membarrier system call is allocated on ia64, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
Do so by creating headers specifically for ia64, which rely on the
gcc atomic and memory barrier builtins.
Now that the membarrier system call is allocated on aarch64, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
Do so by creating headers specifically for aarch64, which rely on the
gcc atomic and memory barrier builtins.
powerpc64le has been originally added to urcu with the "gcc" generic
architecture support. After testing, it appears that the "ppc"
architecture works as well.
Move to the "ppc" architecture so it becomes the same as other powerpc
32/64 (big endian) architectures.
Doing so wires up the membarrier system call on powerpc64le.
Now that the membarrier system call is allocated on ARM, allocate its
number in our architecture header if the system headers don't allocate
it. This allows using the membarrier system call as soon as implemented
in the kernel, even if the distribution has old kernel headers.
Now that the membarrier system call is allocated on s390/s390x, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
Now that the membarrier system call is allocated on powerpc, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
The documentation of the RCU-based synchronization technique in lfstack
is too strict. It currently states that the cds_lfs_node structure
cannot be overwritten before a grace period has passed. However, lfstack
pop only use the next pointer as the replacement value when doing the
cmpxchg on the head. After the node has been pop'd from the stack,
concurrent cmpxchg trying to pop that same node will necessarily fail as
long as there is a grace period before pop/pop_all and re-adding the
node into the stack.
It is therefore sufficient to wait for a grace period between:
1) pop/pop_all and
2) freeing the node (to ensure existence for concurrent pop trying to
read node->next) or re-adding the node into the stack.
This node re-use constraint relaxation is only possible because we don't
care about node->next content read by concurrent pop: it will be simply
discarded by the cmpxchg on head. Be careful not to apply this relaxed
constraint to other data structures which care about the content of the
node's next pointer (e.g. wfstack).
This relaxed constraint allows implementing efficient free-lists (memory
allocation) with a lock-free allocation/free based on lfstack: it allows
re-using the memory backing the free-list node immediately after
allocation. The only requirement with respect to this use-case is to
wait for a grace period before putting the node back into the free-list.
Also update the test_urcu_lfs to poison the next pointer immediately
after pop/pop_all to make sure we test this relaxed constraint.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Lai Jiangshan <jiangshanlai@gmail.com> CC: lttng-dev@lists.lttng.org CC: rp@svcs.cs.pdx.edu
Cleanup: tests: Branch condition evaluates to a garbage value
scan-build reported this:
Logic error Branch condition evaluates to a garbage value tests
/benchmark /test_urcu_hash_rw.c 170
Logic error Branch condition evaluates to a garbage value tests
/benchmark /test_urcu_hash_rw.c 274
It should never happen based on code review, but silence this warning by
initializing to NULL.
CID 1021635 (#1 of 2): Unchecked return value (CHECKED_RETURN)7.
check_return: Calling pthread_mutex_unlock without checking return value
(as is done elsewhere 29 out of 33 times).
CID 1021634 (#2 of 2): Unchecked return value (CHECKED_RETURN)12.
check_return: Calling pthread_mutex_unlock without checking return value
(as is done elsewhere 29 out of 33 times).
CID 1021642 (#1 of 2): Side effect in assertion
(ASSERT_SIDE_EFFECT)assert_side_effect: Argument test_array of assert()
has a side effect because the variable is volatile. The containing
function might work differently in a non-debug build.
Now that the membarrier system call is allocated on x86 32/64, allocate
its number in our architecture header if the system headers don't
allocate it. This allows using the membarrier system call as soon as
implemented in the kernel, even if the distribution has old kernel
headers.
Allows getting a reference atomically if the reference count is not
zero. Returns true if the reference is taken, false otherwise. This
needs to be used in conjunction with another synchronization technique
(e.g. RCU or mutex) to ensure existence of the reference count.
Fix: dynamic fallback to compat futex on sys_futex ENOSYS
Some MIPS processors (e.g. Cavium Octeon II) dynamically check if the
CPU supports ll/sc within sys_futex, and return a ENOSYS errno if they
don't, even though the architecture implements sys_futex.
Handle this situation by always building the sys_futex compatibility
layer, and fall-back on it if sys_futex return a ENOSYS errno. This is
a tiny compat layer which adds very little space overhead.
This adds an unlikely branch on return from sys_futex, which should
not be an issue performance-wise (we've already taken a system call).
Since this is a fall-back mode, don't try to be clever, and don't cache
the result, so that the common cases (architectures with a properly
working sys_futex) don't get two conditional branches, just one.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Michael Jeanson <mjeanson@efficios.com> CC: Jon Bernard <jbernard@debian.org>
Use the urcu_assert() macro (enabled on DEBUG_RCU) to check for
unmatched rcu_read_lock() that eventually leads to nesting counter
overflow in urcu.h and urcu-bp.h. This won't necessarily point the the
exact rcu_read_lock() that is unmatched, but will at least detect the
overflow condition.
Use the urcu_assert() macro (enabled on DEBUG_RCU) to check for
unmatched rcu_read_unlock() that leads to nesting counter underflow in
urcu.h and urcu-bp.h.
Add a "registered" flag to urcu.c and urcu-qsbr.c, set/cleared when a
thread is registered and unregistered. Add corresponding asserts in
those functions checking if a thread is registered or unregistered more
than once (which would be a bug in the way the application uses urcu).
Move the checks enabled on RCU_DEBUG to a single header: urcu/debug.h.
Add checks for the registered flag in RCU read-side lock functions (new
urcu_assert() checks, which are only built-in if RCU_DEBUG is defined at
compile-time).
From Coverity:
CID 1021642 (#1 of 3): Side effect in assertion
(ASSERT_SIDE_EFFECT)assert_side_effect: Argument test_array of assert()
has a side effect because the variable is volatile. The containing
function might work differently in a non-debug build.
sys_membarrier underwent changes between its original implementation and
its upcoming inclusion into the Linux kernel. Update its use to follow
those changes.
Should the prior user-space code be built against a kernel header that
defines SYS_membarrier, and executed against that kernel, the following
scenarios may happen:
- -1 will be returned with EINVAL errno if the 2nd argument (flags) is
non-zero (the previous ABI expected a single argument),
- (MEMBARRIER_EXPEDITED | MEMBARRIER_QUERY) defined as
(1 << 0) | (1 << 16) will return -1 with EINVAL errno, because valid
commands are now one-hot.
Therefore, should an incompatible user-space code try to use
sys_membarrier, it will simply think that the system does not have
membarrier support due to the negative return value upon query.
Khem Raj [Sun, 23 Aug 2015 04:38:30 +0000 (21:38 -0700)]
uatomic: Specify complete types for atomic function calls
This was unearthed by clang compiler where it complained about parameter
mismatch, gcc doesnt notice this
urcu/uatomic/generic.h:190:10: error: address argument to atomic builtin
must be a pointer to integer or pointer ('void *' invalid)
return __sync_add_and_fetch_4(addr, val);
Fix: handle sys_futex() FUTEX_WAIT interrupted by signal
We need to handle EINTR returned by sys_futex() FUTEX_WAIT, otherwise a
signal interrupting this system call could make sys_futex return too
early, and therefore cause a synchronization issue.
Ensure that the futex compatibility layer returns meaningful errors and
errno when using poll() or pthread cond variables.
Reported-by: Gerd Gerats <geg@ngncc.de> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Alan Stern <stern@rowland.harvard.edu> CC: lttng-dev@lists.lttng.org CC: rp@svcs.cs.pdx.edu Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Make call_rcu_thread() affine itself more persistently
Currently, URCU simply fails if a call_rcu_thread() fails to affine
itself. This is problematic when execution is constrained by cgroup
and hotunplugged CPUs. This commit therefore makes call_rcu_thread()
retry setting its affinity every 256 grace periods, but only if it
detects that it migrated to a different CPU. Since sched_getcpu() is
cheap on many architectures, this check is less costly than going
through a system call.
Reported-by: Michael Jeanson <mjeanson@efficios.com> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
/usr/include/features.h:148:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
From http://man7.org/linux/man-pages/man7/feature_test_macros.7.html:
_BSD_SOURCE (deprecated since glibc 2.20)
[...]
Since glibc 2.20, this macro is deprecated. It now has the same effect
as defining _DEFAULT_SOURCE, but generates a compile-time warning
(unless _DEFAULT_SOURCE is also defined). Use _DEFAULT_SOURCE instead.
To allow code that requires _BSD_SOURCE in glibc 2.19 and earlier and
_DEFAULT_SOURCE in glibc 2.20 and later to compile without warnings,
define both _BSD_SOURCE and _DEFAULT_SOURCE.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Can block a thread on join, and thus have the side-effect of deadlocking
a thread doing a pthread_join while within a RCU read-side critical
section. This join would be awaiting for completion of register_thread or
rcu_unregister_thread, which may never complete because the rcu_gp_lock
is held by synchronize_rcu executed from another thread.
One solution to fix this is to add a new lock, rcu_registry_lock. This
lock now protects the thread registry. It is released between iterations
on the registry by synchronize_rcu, thus allowing thread
registration/unregistration to complete even though synchronize_rcu is
awaiting for RCU read-side critical sections to complete.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Eugene Ivanov <Eugene.Ivanov@orc-group.com> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Stephen Hemminger <stephen@networkplumber.org>
Luca Boccassi [Wed, 25 Mar 2015 19:39:00 +0000 (19:39 +0000)]
Mark braced-groups within expressions with __extension__
Braced-groups within expressions are not valid ISO C, so
if a macro uses them and it's included in a project built
with -pedantic, the build will fail. GCC and CLANG do
support them as extension, so marking them as such allows
the build to complete even with -pedantic.
The Userspace RCU compatibility layer around sys_futex has a race
condition which makes pretty much all "benchmark" tests hang pretty
quickly on non-Linux systems (tested on Mac OS X).
I narrowed it down to a bug in compat_futex_noasync: this compat layer
uses a single pthread mutex and condition variable for all callers,
independently of their uaddr. The FUTEX_WAKE performs a pthread cond
broadcast to all waiters. FUTEX_WAIT must then compare *uaddr with val
to see which thread has been awakened.
Unfortunately, the check was not done again after each return from
pthread_cond_wait(), thus causing the race.
This race affects threads using the futex_noasync() compatibility layer
concurrently, thus it affects only on non-Linux systems.