Library major version number (soname) increment to 3
Due to the urcu/tls-compat.h TLS fallback symbol clash fix, we need to
bump the Userspace RCU soname major version number so we don't have to
keep erroneous usage detection code around.
Vladimir Nikulichev noticed crashes when using this setup. The problem
can be pinpointed to a missing macro expansion in urcu/tls-compat.h:
looking at the output of
nm tests/unit/.libs/test_urcu_multiflavor :
U __tls_access_rcu_reader
this seems to be the issue. We're missing macro expansion in
tls-compat.h. With this commit, it becomes:
U __tls_access_rcu_reader_bp
U __tls_access_rcu_reader_mb
U __tls_access_rcu_reader_memb
U __tls_access_rcu_reader_sig
Please note that this affects an unusual configuration of userspace RCU
(with TLS pthread key fallback), needed for some BSD that don't support
compiler TLS. Strictly speaking, this requires bumping the URCU library
soname version major number, because it breaks the ABI presented to
applications on those unusual configurations.
A following commit will handle the ABI migration: for stable releases
(stable-0.7 and stable-0.8 branches), the ABI is kept compatible, and
bogus usage are detected. For the upcoming stable-0.9, the soname will
simply be bumped.
Reported-by: Vladimir Nikulichev <nvs@tbricks.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
When compiling code using the rcu_xchg_pointer() family of functions,
with the following define:
#define URCU_INLINE_SMALL_FUNCTIONS
prior to including urcu headers, when compiling with gcc with
-Wsign-compare and -Wextra, gcc warns about:
urcu-xchg.c: In function ‘reload’:
urcu-xchg.c:19:1: warning: ordered comparison of pointer with integer zero [-Wextra]
urcu-xchg.c:19:1: warning: signed and unsigned type in conditional expression [-Wsign-compare]
For the "ordered comparison of pointer with integer zero" warning, fix
this by comparing (type) -1 against (type) 0 instead of just 0, so if
"type" is a pointer type, this pointer type will be applied to the right
operand too, thus fixing the warning.
For the "signed and unsigned type in conditional expression" warning, we
need caa_cast_long_keep_sign() to always evaluate to the same type
signedness. In order to do so, when we need to sign-extend the value,
cast it to unsigned long after first casting it to long.
Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
* Dmitri Shubin <sbn@tbricks.com> wrote:
> Shouldn't the condition in line 94 actually be
>
> 94 #if (!defined(BUILD_QSBR_LIB) && !defined(RCU_DEBUG))
>
> So when RCU_DEBUG is _not_ defined we get static inlines for
> rcu_read_{,un}lock() ?
Indeed!
Reported-by: Dmitri Shubin <sbn@tbricks.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
This fixes an issue that appears after this recent urcu-bp fix is
applied:
Fix: urcu-bp: Bulletproof RCU arena resize bug
Prior to this fix, on Linux at least, the behavior was to allocate
(and leak) one memory map region per reader thread. It worked, except
for the unfortunate leak. The fact that it worked, even though not the
way we had intended it to, is is why testing did not raise any red flag.
That state of affairs has prevailed for a long time, but it was
side-tracking some issues. After fixing the underlying bug that was
causing the memory map leak, another issue appears.
The garbage collection scheme reclaiming the thread tracking structures
in urcu-bp fails in stress tests to due a bug in glibc (tested against
glibc 2.13 and 2.17). Under this workload, on a 2-core/hyperthreaded i7:
./test_urcu_bp 40 4 10
we can easily trigger a segmentation fault in the pthread_kill() code.
Program terminated with signal 11, Segmentation fault.
Backtrace:
#0 __pthread_kill (threadid=140723681437440, signo=0) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:42
42 ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c: No such file or directory.
(gdb) bt full
#0 __pthread_kill (threadid=140723681437440, signo=0) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:42
__x = <optimized out>
pd = 0x7ffcc90b2700
tid = <optimized out>
val = <optimized out>
#1 0x0000000000403009 in rcu_gc_registry () at ../../urcu-bp.c:437
tid = 140723681437440
ret = 0
chunk = 0x7ffcca0b8000
rcu_reader_reg = 0x7ffcca0b8120
__PRETTY_FUNCTION__ = "rcu_gc_registry"
#2 0x0000000000402b9c in synchronize_rcu_bp () at ../../urcu-bp.c:230
cur_snap_readers = {next = 0x7ffcb4888cc0, prev = 0x7ffcb4888cc0}
qsreaders = {next = 0x7ffcb4888cd0, prev = 0x7ffcb4888cd0}
newmask = {__val = {18446744067267100671, 18446744073709551615 <repeats 15 times>}}
oldmask = {__val = {0, 140723337334144, 0, 0, 0, 140723690351643, 0, 140723127058464, 4, 0, 140723698253920, 140723693868864, 4096, 140723690370432, 140723698253920, 140723059951840}}
ret = 0
__PRETTY_FUNCTION__ = "synchronize_rcu_bp"
#3 0x0000000000401803 in thr_writer (_count=0x76b2f0) at test_urcu_bp.c:223
count = 0x76b2f0
new = 0x7ffca80008c0
old = 0x7ffca40008c0
#4 0x00007ffcc9c83f8e in start_thread (arg=0x7ffcb4889700) at pthread_create.c:311
__res = <optimized out>
pd = 0x7ffcb4889700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140723337336576, 6546223316613858487, 0, 140723698253920, 140723693868864, 4096, -6547756131873848137,
-6547872135220034377}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = 0
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#5 0x00007ffcc99ade1d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
It appears that the memory backing the thread information can be
relinquished by NPTL concurrently with execution of pthread_kill()
targeting an already joined thread and cause this segfault. We were
using pthread_kill(tid, 0) to discover if the target thread was alive or
not, as documented in pthread_kill(3):
If sig is 0, then no signal is sent, but error checking is still per‐
formed; this can be used to check for the existence of a thread ID.
but it appears that the glibc implementation is racy.
Instead of using the racy pthread_kill implementation, implement cleanup
using a pthread_key destroy notifier for a dummy key. This notifier is
called for each thread exit and destroy.
It is not correct to move the registry address range, since there are
external references from reader threads. This will trigger on workloads
with many threads.
Typically, on Linux, mremap can expand the existing range, which is OK.
However, if there is not enough space around the existing range, it may
try to map it at a different address, which is incorrect.
It is more likely that this bug will be observed on operating systems
where urcu uses the mmap/munmap fallback instead of mremap.
Moreover, prior to commit:
"Fix: urcu-bp: Bulletproof RCU arena resize bug"
this issue was hidden by the fact that each thread ended up with their
own memory mapping (leaked), on Linux at least.
compat_futex.c has one instance included in each urcu shared object, as
well as within some of the test applications. However, it is expected
that an entire program interact with the same lock and completion
variables. Therefore, define them as globally visible, but weak, so the
entire program agree on which object should be used.
Reported-by: Vladimir Nikulichev <nvs@tbricks.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
compat_arch_x86.c is linked into many .so and even into test programs.
The basic problem with this is that it contains a statically defined
mutex, which will fail to protect concurrent use of this compat code by
different shared objects.
Fix this by defining both the mutex (now called __urcu_x86_compat_mutex)
and __rcu_cas_avail as weak symbols. Therefore, the first symbol that
gets loaded in a program will by used by everyone.
Reported-by: Vladimir Nikulichev <nvs@tbricks.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> While trying to use the BP flavor of RCU I ran into random crashes. I
> tracked it down to issues with resizing of the BP RCU memory pool.
>
> The problem is in the urcu-bp.c file in the resize_arena() function.
> On successful allocation / remapping the len member of the
> registry_arena struct is never set anywhere function. On the second
> resize of the arena the code in resize_arena() still thinks the
> previous size is equal to the original mapping size. I've fixed this
> issue locally by just adding the following code at the bottom of
> resize_arena().
Good catch !!
However, I think your fix misses one case: if we happen to re-use the
same region, we want to update the length too.
Fix: hash table growth (for small tables) should be limited
Buckets with many entries encountered in a hash table could cause it to
grow to a large size, beyond the scope for which this mechanism is
expected to play a role when node accounting is available. Indeed, when
the hash table grows to larger size, split-counter node accounting is
expected to deal with resize/shrink rather than relying on an heuristic
based on the largest bucket size.
This is fixing an issue where we see hash tables sometimes reaching 65k
entries index (65536*8 = 524288 bytes) for a workload limited to adding
1000 entries and then removing all of them, done in a loop (random
keys).
- cds_hlist_for_each()
- cds_hlist_for_each_safe()
- CDS_HLIST_HEAD()
- CDS_HLIST_HEAD_INIT()
- cds_hlist_for_each_entry_2() (takes 3 argument, like the Linux kernel
API),
- cds_hlist_for_each_entry_safe_2() (takes 4 arguments, like the Linux
kernel API),
- cds_hlist_for_each_rcu()
- cds_hlist_for_each_entry_rcu_2() (takes 3 arguments, like the Linux
kernel API).
Left cds_hlist_for_each_entry(), cds_hlist_for_each_entry_safe() and
cds_hlist_for_each_entry_rcu() as-is (different from the ones found in
the Linux kernel) because those APIs were already exposed by Userspace
RCU.
in cds_list_add_rcu, use rcu_assign_pointer to update head->next
atomically and provide the memory barrier before publishing head->next.
Notice that we don't need the wmb() prior to store to prev, because RCU
traversals only go forward, and thus only use "next".
in cds_list_del_rcu, use CMM_STORE_SHARED() to store to elem->prev->next
atomically.
* Lai Jiangshan (laijs@cn.fujitsu.com) wrote:
> Hi, Mathieu,
>
> There is a big compatible problem in URCU which should be fix in next round.
>
> LB: liburcu built on the system which has sys_membarrier().
> LU: liburcu built on the system which does NOT have sys_membarrier().
>
> LBM: liburcu-mb ....
> LUM: liburcu-mb ...
>
> AB: application(-lliburcu) built on the system which has sys_membarrier().
> AU: application(-lliburcu) built on the system which does NOT have
> sys_membarrier().
>
> ABM application(-lliburcu-mb) ...
> AUM application(-lliburcu-mb) ...
>
> AB/AU + LB/LU: 4 combinations
> ABM/AUM + LBM/LUM: 4 combinations
>
> I remember some of the 8 combinations can't works due to symbols are
> miss match. only LU+AB and LB+AU ?
>
> could you check it?
>
> How to fix it: In LU and AU, keep all the symbol name/ABI as LA and
> AB, but only the behaviors falls back to URCU_MB.
Define membarrier() as -ENOSYS when SYS_membarrier is not found in the
system headers. Check dynamically for membarrier availability to ensure
ABI compatibility between applications and librairies.
Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>