- cds_hlist_for_each()
- cds_hlist_for_each_safe()
- CDS_HLIST_HEAD()
- CDS_HLIST_HEAD_INIT()
- cds_hlist_for_each_entry_2() (takes 3 argument, like the Linux kernel
API),
- cds_hlist_for_each_entry_safe_2() (takes 4 arguments, like the Linux
kernel API),
- cds_hlist_for_each_rcu()
- cds_hlist_for_each_entry_rcu_2() (takes 3 arguments, like the Linux
kernel API).
Left cds_hlist_for_each_entry(), cds_hlist_for_each_entry_safe() and
cds_hlist_for_each_entry_rcu() as-is (different from the ones found in
the Linux kernel) because those APIs were already exposed by Userspace
RCU.
in cds_list_add_rcu, use rcu_assign_pointer to update head->next
atomically and provide the memory barrier before publishing head->next.
Notice that we don't need the wmb() prior to store to prev, because RCU
traversals only go forward, and thus only use "next".
in cds_list_del_rcu, use CMM_STORE_SHARED() to store to elem->prev->next
atomically.
* Lai Jiangshan (laijs@cn.fujitsu.com) wrote:
> Hi, Mathieu,
>
> There is a big compatible problem in URCU which should be fix in next round.
>
> LB: liburcu built on the system which has sys_membarrier().
> LU: liburcu built on the system which does NOT have sys_membarrier().
>
> LBM: liburcu-mb ....
> LUM: liburcu-mb ...
>
> AB: application(-lliburcu) built on the system which has sys_membarrier().
> AU: application(-lliburcu) built on the system which does NOT have
> sys_membarrier().
>
> ABM application(-lliburcu-mb) ...
> AUM application(-lliburcu-mb) ...
>
> AB/AU + LB/LU: 4 combinations
> ABM/AUM + LBM/LUM: 4 combinations
>
> I remember some of the 8 combinations can't works due to symbols are
> miss match. only LU+AB and LB+AU ?
>
> could you check it?
>
> How to fix it: In LU and AU, keep all the symbol name/ABI as LA and
> AB, but only the behaviors falls back to URCU_MB.
Define membarrier() as -ENOSYS when SYS_membarrier is not found in the
system headers. Check dynamically for membarrier availability to ensure
ABI compatibility between applications and librairies.
Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: Use a filled signal mask to disable all signals
Changelog from David Pelton's original patch:
While using lttng-ust with an application that was calling fork()
with pending signals, I found that all signals were getting unmasked
shortly before the underlying call to fork(). After some
investigation, I found that the rcu_bp_before_fork() function was
unmasking all signals. Based on the comments for this function, it
should be masking all signals. Inspection of the rest of the code
in urcu-bp.c revealed the same pattern in two other functions.
This patch changes the code to use a filled signal mask to disable
all signals. The change to rcu_bp_before_fork() addressed the
problem I was seeing while using lttng-ust. The changes to the
other two functions appear to fix other instances of the same
problem.
Updates by Mathieu Desnoyers:
- Use SIG_BLOCK instead of SIG_SETMASK when setting a filled mask. This
has the same behavior in this case (since we're blocking all signals),
but is semantically neater: if we ever some signals from that mask,
we'd like to to a union with the signal mask already blocked by the
application.
- Also fix incorrect signal masking in compat_arch_x86.c.
Reported-by: David Pelton <dpelton@ciena.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Lai Jiangshan [Mon, 6 May 2013 12:42:27 +0000 (08:42 -0400)]
urcu: avoid false sharing for rcu_gp_ctr
@rcu_gp_ctr and @registry share the same cache line, it causes
false sharing and slowdown both of the read site and update site.
Fix: Use different cache line for them.
Although rcu_gp_futex is updated less than rcu_gp_ctr, but
they always be accessed at almost the same time, so we also move rcu_gp_futex
to the cacheline of rcu_gp_ctr to reduce the cacheline-usage or cache-missing
of read site.
Lai Jiangshan [Mon, 6 May 2013 12:32:02 +0000 (08:32 -0400)]
urcu: make the code of urcu-qsbr as normal urcu
urcu-qsbr's read site's quiescence is much longer than normal urcu ==>
synchronize_rcu() is much slower ==>
rcu_gp_ctr is updated much less ==>
the whole urcu-qsbr will not be slowed down by false sharing of rcu_gp_ctr.
But this patch makes sense to keep the code of urcu-qsbr like normal urcu,
better readability and maintenance.
- thread is online (QSBR),
- thread is nested within read-side critical section (other flavors),
This is useful for libraries that need to know if QSBR is online in
order to save the original state temporarily so it can be restored
before returning to the caller.
Eventually, this API can be called by a "debugging" implementation of
rcu_dereference() and other urcu-pointer.h API members to check that no
RCU pointer is read outside of RCU read-side critical sections.
Some sparc Debian setups advertise a "sparc" host cpu (rather than
sparc64).
In all cases, I think it should be safe to add a "sparc" entry to
userspace RCU configure.ac upstream, e.g.
[sparc], [ARCHTYPE="sparc64"],
in the event someone would launch the build on an environment not
supporting sparc v9, the build would fail because the 32-bit compiler
would not be able to generate sparc v9 instructions (unless
explicitely instructed to do so by the -m32 -Wa,-Av9a flags).
Fix hurd-i386: move cpuset tests outside of sched_setaffinity conditional
Comment about introduction of cpuset.h within urcu tests:
> Unfortunately it doesn't work, because sched_setaffinity is for now
> just a fail-stub on hurd-i386, and thus configure considers it as
> missing, and thus the CPU_SET test is disabled completely.
>
> I however guess you could just disable defining your own cpu_set_t
> when !HAVE_SCHED_SETAFFINITY, since it is probably used only for using
> sched_setaffinity.
Fix by moving cpu_set_t, CPU_SET and CPU_ZERO tests outside of the
sched_setaffinity conditional.
Reported-by: Samuel Thibault <sthibault@debian.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix tests: finer-grained use of CPU_SET, CPU_ZERO and cpu_set_t
Noticed build failure at
https://buildd.debian.org/status/package.php?p=liburcu :
Tail of log for liburcu on hurd-i386:
test_urcu.c:110:0: warning: "CPU_SET" redefined [enabled by default]
In file included from /usr/include/pthread/pthread.h:50:0,
from /usr/include/pthread.h:2,
from test_urcu.c:26:
/usr/include/sched.h:80:0: note: this is the location of the previous definition
make[3]: *** [test_urcu.o] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all] Error 2
dh_auto_build: make -j1 returned exit code 2
make: *** [build-arch] Error 2
dpkg-buildpackage: error: debian/rules build-arch gave error exit status 2
make[3]: Entering directory `/build/buildd-liburcu_0.7.6-1-hurd-i386-wGBAtt/liburcu-0.7.6/tests'
CC test_urcu.o
make[3]: Leaving directory `/build/buildd-liburcu_0.7.6-1-hurd-i386-wGBAtt/liburcu-0.7.6/tests'
make[2]: Leaving directory `/build/buildd-liburcu_0.7.6-1-hurd-i386-wGBAtt/liburcu-0.7.6'
Fix build on architectures with HAVE_SCHED_GETCPU but without HAVE_SYSCONF
Noticed on: https://buildd.debian.org/status/package.php?p=liburcu
Tail of log for liburcu on kfreebsd-amd64:
CC urcu.lo
In file included from urcu.c:450:0:
urcu-call-rcu-impl.h:145:12: error: static declaration of 'sched_getcpu' follows non-static declaration
In file included from /usr/include/sched.h:43:0,
from /usr/include/pthread.h:20,
from urcu.c:30:
/usr/include/x86_64-kfreebsd-gnu/bits/sched.h:65:12: note: previous declaration of 'sched_getcpu' was here
make[3]: *** [urcu.lo] Error 1
make[3]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-amd64-nnkICd/liburcu-0.7.6'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-amd64-nnkICd/liburcu-0.7.6'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-amd64-nnkICd/liburcu-0.7.6'
dh_auto_build: make -j1 returned exit code 2
make: *** [build-arch] Error 2
Tail of log for liburcu on kfreebsd-i386:
CC urcu.lo
In file included from urcu.c:450:0:
urcu-call-rcu-impl.h:145:12: error: static declaration of 'sched_getcpu' follows non-static declaration
In file included from /usr/include/sched.h:43:0,
from /usr/include/pthread.h:20,
from urcu.c:30:
/usr/include/i386-kfreebsd-gnu/bits/sched.h:65:12: note: previous declaration of 'sched_getcpu' was here
make[3]: *** [urcu.lo] Error 1
make[3]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-i386-sWzNKU/liburcu-0.7.6'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-i386-sWzNKU/liburcu-0.7.6'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-i386-sWzNKU/liburcu-0.7.6'
dh_auto_build: make -j1 returned exit code 2
make: *** [build-arch] Error 2
CMM_STORE_SHARED(x, v) is a macro that really acts like an assignment
expression, e.g.:
x = v;
but internally also has "mc" barriers (useful for cache-incoherent
architectures).
The issue here is that (x = v) can evaluate to "v", but very often we're
not interested to use the assignment expression result. When we have an
explicit assignment, the compiler won't complain that the result of this
expression is unused, but given that the added barrier requires that we
make this macro evaluate explicitly to a value, clang complains.
Fix this by adding "_v = _v" at the last line of the macro, thus
performing what would appear like an effect-less assignment, but
actually tricks clang into thinking we are evaluating to an assignment
expression, thus suppressing the warning.
I've had a report of someone running into issues with the RCU lock-free
hash table by embedding the struct cds_lfht_node into a packed structure
by mistake, thus not respecting alignment requirements stated in
urcu/rculfhash.h. Assertions on "replace" and "add" operations should
catch this, but I notice that we should add assertions on the
REMOVAL_OWNER_FLAG to cover all possible misalignments.
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>