Fix: work around glibc __nptl_setxid vs clone hang
When a hash table resize thread exits, it ends up setting a "locked"
state within libc pthread. This deadlocks with seteuid/setegid called
from the cloned process in runas.c if runas() is called exactly when a
resize thread exits.
Temporarily fix this issue by adding a mutex around the resize
operation, which provides mutual exclusion with runas() usage.
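The locking scheme can be sketched roughly as follows. This is only an
illustration, not the code of this commit: libc_state_lock,
resize_worker(), resize_with_lock(), run_as_locked() and sample_cmd()
are hypothetical names, and fork() stands in for the clone() call used
by runas.c. The resize thread is created and joined under the lock, so
its exit cannot overlap with the creation of the child.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock: serializes resize thread lifetime against run_as. */
static pthread_mutex_t libc_state_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the hash table resize work. */
static void *resize_worker(void *arg)
{
	int *resized = arg;

	*resized = 1;
	return NULL;	/* Thread exit happens while the spawner holds the lock. */
}

/* Run the resize thread from creation to exit under the lock. 0 on success. */
static int resize_with_lock(void)
{
	pthread_t tid;
	int resized = 0, ret;

	pthread_mutex_lock(&libc_state_lock);
	ret = pthread_create(&tid, NULL, resize_worker, &resized);
	if (ret == 0)
		ret = pthread_join(tid, NULL);
	pthread_mutex_unlock(&libc_state_lock);
	return (ret == 0 && resized) ? 0 : -1;
}

/* Example command run in the child (illustration only). */
static int sample_cmd(void)
{
	return 7;
}

/* Stand-in for runas(): create the child only while no resize thread can exit. */
static int run_as_locked(int (*cmd)(void))
{
	pid_t pid;
	int status = 0;

	pthread_mutex_lock(&libc_state_lock);
	pid = fork();
	if (pid == 0)
		_exit(cmd());	/* Child: real code would call setegid()/seteuid() first. */
	pthread_mutex_unlock(&libc_state_lock);
	if (pid < 0 || waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With this sketch, run_as_locked(sample_cmd) returns the exit status of
the child, and a concurrent resize_with_lock() waits until the child
has been created.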
We should investigate whether we want to properly call exec() from the
runas.c clone child before touching any non-async-signal-safe libc call.
However, given that this change is more intrusive, let's first use this
mutex-based work-around.
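For reference, the exec()-based alternative could look like the sketch
below. It is an assumption about one possible shape, not a planned
implementation: exec_child_and_wait() is a hypothetical helper, fork()
stands in for clone(), and the credential change would have to happen
in the exec'd program. The child only calls execv() and _exit(), both
of which are async-signal-safe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] in a fresh process image.
 * Returns the exit status of the program, or -1 on error.
 */
static int exec_child_and_wait(char *const argv[])
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: only async-signal-safe calls before the new image. */
		execv(argv[0], argv);
		_exit(127);	/* Reached only if execv() failed. */
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would pass a NULL-terminated argument vector, for example
{ "/bin/sh", "-c", "exit 3", NULL }, and get the exit status back.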
Before this fix, running 1000 instances of "demo-trace 300" with
sessiond running as root, and:
    lttng create
    lttng enable-event -u -a
    lttng start
would sometimes lead to consumerd hang with the following clone child
backtrace:
setxid_mark_thread (cmdp=<optimized out>, t=0x7f52dd47c700)
at allocatestack.c:995
995 allocatestack.c: No such file or directory.
(gdb) bt full
at allocatestack.c:995
ch = <optimized out>
at allocatestack.c:1088
t = 0x80
signalled = <optimized out>
result = <optimized out>
runp = 0x7f52dd47c9c0
at ../sysdeps/unix/sysv/linux/setegid.c:44
__p = 0xfffffffffffffe00
__cmd = {syscall_no = 119, id = {-1, 1000, -1}, cntr = 0}
result = <optimized out>
data = 0x7f52e66e1930
writelen = <optimized out>
writeleft = <optimized out>
index = <optimized out>
sendret = {i = 0, c = "\000\000\000"}
ret = <optimized out>
__func__ = "child_run_as"
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
No locals.
No symbol table info available.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>