Michael Jeanson [Fri, 4 Jun 2021 19:15:38 +0000 (15:15 -0400)]
fix: allow building with userspace-rcu 0.13
The liburcu detection code was checking for the 'synchronize_rcu_bp'
symbol in liburcu-bp which was dropped from the ABI in 0.13, add a
fallback check for 'urcu_bp_synchronize_rcu' which replaced it.
Also drop the check for 'call_rcu_bp' which we don't use and was just a
leftover of version check for 0.6 which has since been superseded.
Both those changes allow to build against liburcu >= 0.13.
Change-Id: I85c613fa0a41a4f78466501280768c4a4d4de12e Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Testing with a fixed number of loops per-thread only works if the
workload is distributed perfectly across CPUs. For instance, if a lock
is held in the workload (e.g. internally by open() and close()), those
may cause starvation of some threads, and therefore cause the benchmark
to be wrong because it will wait for the slowest thread to complete its
loops.
It is also not good for testing overcommit of threads vs cpus.
Change the test to report the number of loops performed in a given wall
time, and use this to report the average and std.dev. of tracing
overhead per event on each active CPU.
Change the benchmark workload to be only CPU-bound and not generate
system calls to minimize the inherent non-scalability of the workload
(e.g. locks held within the kernel).
Kill session daemon after script, conditionally drop caches if root
(else it is forbidden to do so), handle cleanup on trap, use snapshot
(flight recorder) tracing to benchmark ring buffer, output the result in
nanoseconds, removing trailing numbers after dot.
Before this, the test was pretty much always resulting in an output of
"0s" extra overhead due to truncation of significant numbers in the
output.
Fix: bytecode linker: validate event and field array/sequence encoding
The bytecode linker should only allow linking filter expressions loading
fields which are string-encoded arrays and sequence for comparison
against a string, and reject arrays and sequences without encoding, so
the filter interpreter does not attempt to load non-NULL terminated
arrays/sequences as if they were strings.
Fix: uninitialized variable in lib_ring_buffer_channel_switch_timer_start
Found by Coverity:
** CID 1447027: Uninitialized variables (UNINIT)
/libringbuffer/ring_buffer_frontend.c: 810 in lib_ring_buffer_channel_switch_timer_start()
>>> CID 1447027: Uninitialized variables (UNINIT)
>>> Using uninitialized value "sev". Field "sev._sigev_un" is uninitialized when calling "timer_create".
Fix: Use unix socket peercred for pid, uid, gid credentials
Currently, the session daemon trust the pid, ppid, uid, and gid values
passed by the application, but should really validate the uid using unix
socket peercred. This fix uses the peercred values rather than the
values provided by the application on registration for:
- pid, uid and gid on Linux,
- uid and gid on FreeBSD.
This should improve how the session daemon deals with containerized
applications on Linux as well. Applications are required to be either in
the same pid namespace, or in a pid namespace nested within the pid
namespace of the lttng-sessiond, so the session daemon can map the
application pid to something meaningful within its own pid namespace.
Applications in a unrelated (disjoint) pid namespace will be refused by
the session daemon.
About the uid and gid with user namespaces on Linux, those will provide
meaningful IDs if the application user namespace is either the same as
the user namespace of the session daemon, or a nested user namespace.
Otherwise, the IDs will be that of /proc/sys/kernel/overflowuid and
/proc/sys/kernel/overflowgid, which typically maps to nobody.nogroup on
current distributions.
Given that fetching the parent pid (ppid) of the application would
require to use /proc/<pid>/status (which is racy wrt pid reuse), expose
the ppid provided by the application on registration instead, but only
in situations where the application sits in the same pid namespace as
the session daemon (on Linux), which is detected by checking if the pid
provided by the application matches the pid obtained using unix socket
credentials. The ppid is only used for logging/debugging purposes in the
session daemon anyway, so it is OK to use the value provided by the
application for that purpose.
If the command callback takes ownership of a pointer or file descriptor,
it sets them to NULL or -1. Therefore, the caller can always try to free
the pointer, or close it if it is greater or equal to 0.
If the command callback takes ownership of a pointer or file descriptor,
it sets them to NULL or -1. Therefore, the caller can always try to free
the pointer, or close it if it is greater or equal to 0.
Fix: Use default visibility for tracepoint provider symbol
When building a probe provider `someprobe` with -fvisibility=hidden into
a shared library, the `__tracepoint_provider_someprobe` symbol is hidden,
which does not allow tracepoint instrumentation to link to it.
Fix this by using the "default" visibility attribute.
For a shared library built with -fvisibility=hidden, this changes the
output of nm for that symbol from:
The newly-released autoconf 2.70 introduces a number of breaking
changes [1] and is being rolled-out by some distros.
Amongst those changes, the AC_PROG_CC_STDC macro is marked as obsolete
and was merged into AC_PROG_CC, which we already use. On 2.70, this
results in a warning which we handle as an error.
A version check is added to invoke the AC_PROG_CC_STDC macro only when
running a pre-2.70 version of autoconf, fixing the issue.
[1] https://lwn.net/Articles/839395/
[ Backport by Mathieu Desnoyers: removed ax_pthread.m4 bits which are
not present in stable-2.12 and prior. ]
Fix: ustctl_release_object: eliminate double-close/free on error
When ustctl_release_object returns an error, it is unclear to the caller
what close/free side effects were effectively performed and which were
not. So the only courses of action are to either leak file descriptors
or memory, or call ustctl_release_object again which can trigger double
close or double free.
Fix this by setting the file descriptors to -1 after successful close,
and pointers to NULL after successful free.
Michael Jeanson [Mon, 13 Jul 2020 19:22:35 +0000 (15:22 -0400)]
tests: return the proper TAP exit code
The C TAP library provides the 'exit_status()' function that will return
the proper exit code according to the number of tests that succeeded or
failed.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I786527dfa9cfe2d1a7c8bc80086d54186f60b4d9
Fix: libc-wrapper: undef temporary token rather than value
The lttng-ust malloc wrapper defines pthread_mutex_lock/unlock
preprocessor tokens to ust_malloc_spin_lock/unlock around the definition
of a TLS variable, which uses pthread mutexes when relying on the
pthread key fallback. Undefining those tokens should be done on the
preprocessor token, rather than its value.
Michael Jeanson [Fri, 26 Jun 2020 19:44:20 +0000 (15:44 -0400)]
Fix: support compile units including 'sys/sdt.h' without defining SDT_USE_VARIADIC
Instead of using SDT_USE_VARIADIC from 'sys/sdt.h', use our own namespaced
macros since the instrumented application might already have included
'sys/sdt.h' without variadic support.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Idfece1a65a5296a90d33370bc8d73ea554c14b0f
Background
==========
When userspace events with exclusions are enabled by the CLI user, the
session daemon enables the events in a 3-steps process using the
following ustctl commands:
1. `LTTNG_UST_EVENT` to create an event and get an handle on it,
2. `LTTNG_UST_EXCLUSION` to attach exclusions, and
3. `LTTNG_UST_ENABLE` to activate the tracing of this event.
Also, the session daemon uses the `LTTNG_UST_SESSION_START` to start the
tracing of events. In various use cases, this can happen before OR after
the 3-steps process above.
Scenario
========
If the`LTTNG_UST_SESSION_START` is done before the 3-steps process the
tracer will end up not considering the exclusions and trace all events
matching the event name glob pattern provided at step #1.
This is due to the fact that when we do step #1, we end up calling
`lttng_session_lazy_sync_enablers()`. This function will sync the event
enablers if the session is active (which it is in our scenario because
of the _SESSION_START).
Syncing the enablers will then match event name glob patterns provided
by the user to the event descriptors of the tracepoints and attach
probes in their corresponding callsite on matches.
All of this is done before we even received the exclusions in step #2.
Problem
=======
The problem is that we attach probes to tracepoints before being sure we
received the entire description of the events.
Step #2 may reduced the set of tracepoints to trace so we must wait
until exclusions are all received to attached probes to tracepoints.
Solution
========
Only match event names and exclusions (and ultimately, register probes)
on enabled enablers to ensure that no other modifications will be
applied to the event.
Event enablers are enabled when at step #3.
Note
====
Filters are not causing problems here because adding a filter won't
change what tracepoints are to be traced.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id984f266d976f346b001db81cd8a2b74965b5ef2
Michael Jeanson [Wed, 22 Apr 2020 20:32:51 +0000 (16:32 -0400)]
fix: Java examples CLASSPATH override
Variables provided to make as arguments, called 'command variables' can't
be reassigned a new value in the Makefile. Since the CLASSPATH is now
passed this way in the glue between automake and the example makefiles,
use different names for the internal variables.
Change-Id: Id6289273f211f544a66d933a96f06df75243415f Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
When reading from a TCP socket, there is no guarantee that the
read will return all the requested data. We need to loop and continue
reading until we gather all the expected data.
The stream shm FDs are allocated by the consumer process, and then
passed to the applications over unix sockets. When opening those
file descriptors on reception, the FD_CLOEXEC flag is not set.
In a fork + exec scenario, parent process streams shm FDs and channel
wake FDs are present in the resulting child process.
Set FD_CLOEXEC on reception (ustcomm_recv_fds_unix_sock) to
prevent such scenario.
Change-Id: Id58077b272be9c1ab239846639ffd8103b3d50f1 Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: tracepoint.h: Disable address sanitizer on pointer array section variables
The tracepoint header declares pointer global variables meant to be
placed contiguously within the __tracepoints_ptrs section, and then used
as an array of pointers when loading an executable or shared object.
Clang Address Sanitizer adds redzones around each variable, thus leading to
detection of a global buffer overflow.
Those redzones should not be placed within this section, because it
defeats its purpose. Therefore, teach asan not to add redzones
around those variables with an attribute.
Note that there does not appear to be any issue with gcc (tested with
gcc-8 with address sanitization enabled), and gcc ignores the
no_sanitize_address attribute when applied to a global variable.
jhash.h implements "special" code for valgrind because it reads memory
out-of-bound (and then applies a mask) when reading strings.
Considering that lttng-ust does not use jhash.h in a fast-path, remove
this "optimization" and use the verifiable VALGRIND code instead. This
fixes an ASan splat.
Fix: lttng-ust-comm.c: return number of fd rather size of array
There are two conflicting comments for this function. One says it
returns the size of the received data and the other says it returns the
number of fd received.
It's more useful to receive the number of fd.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I74084b461d396c3e623fa55100e6dd7e59dbea83
Michael Jeanson [Thu, 16 Jan 2020 15:59:14 +0000 (10:59 -0500)]
Fix: build with -fno-common
GCC 10 will default to building with -fno-common, this inhibits the
linker from merging multiple tentative definitions of a symbol in an
archive. Keep only the declaration in the libustsnprintf.la convenience
library.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8fb7c72811ce7e62f10342f55fcabeeabfdd4c67
Fix: uninitialized variable in lib_ring_buffer_reserve_committed
This internal function implemented in libringbuffer is not used within
lttng-ust actually, but uses an uninitialized variable:
As reported by clang:
./frontend_internal.h:263:75: warning: variable 'idx' is uninitialized when used here [-Wuninitialized]
struct commit_counters_hot *cc_hot = shmp_index(handle, buf->commit_hot, idx);
^~~
./shm.h:74:86: note: expanded from macro 'shmp_index'
____ptr_ret = (__typeof__(____ptr_ret)) _shmp_offset((handle)->table, &(ref)._ref, index, sizeof(*____ptr_ret)); \
^~~~~
./frontend_internal.h:262:27: note: initialize the variable 'idx' to silence this warning
unsigned long offset, idx, commit_count;
^
= 0
In file included from ring_buffer_backend.c:29:
In file included from ./backend.h:33:
./frontend_internal.h:263:75: warning: variable 'idx' is uninitialized when used here [-Wuninitialized]
struct commit_counters_hot *cc_hot = shmp_index(handle, buf->commit_hot, idx);
^~~
./shm.h:74:86: note: expanded from macro 'shmp_index'
____ptr_ret = (__typeof__(____ptr_ret)) _shmp_offset((handle)->table, &(ref)._ref, index, sizeof(*____ptr_ret)); \
^~~~~
./frontend_internal.h:262:27: note: initialize the variable 'idx' to silence this warning
unsigned long offset, idx, commit_count;
^
= 0
Introduce a new lttng_perf_lock to protect the lttng perf context
data structures from concurrent modifications and from fork. This
lock can be nested within the ust_lock, but never the opposite.
This removes the circular locking dependency involving urcu bp.
Fix: fd tracker: do not allow signal handlers to close lttng-ust FDs
Split the thread_fd_tracking state from the ust_fd_mutex_nest used to
track whether a signal handler is nested over a fd tracker lock.
lttng-ust listener threads need to invoke
lttng_ust_fd_tracker_register_thread() so the fd tracker can
distinguish them from application threads.
Otherwise, using ust_fd_mutex_nest to try to distinguish between
ust and application threads makes it possible for signal handlers
to appear as if they are ust listener threads, and thus attempt to
close UST file descriptors.
Fix: fd tracker: provide async-signal-safety for close wrapper
close(3) is part of the async-signal-safe functions. Therefore, it is
expected that the close wrapper provided by liblttng-ust-fd-tracker
behaves in a async-signal-safe way.
Use a similar strategy as ust_lock() does: disable signals when taking
and releasing the lock, and keep track of nesting with a TLS variable.
This ensures signals are restored to their original state when close(3)
ends up being invoked.
If fork() is performed while other threads are holding the fd tracker
lock, it will stay in locked state in the child process and eventually
cause a deadlock.
One way to solve this is to hold the fd tracker lock across fork(), in
the same way we do for the ust_lock. This ensures no other threads are
holding that lock in the parent, and therefore provides a consistent
lock state in the child.
Fix: wait for initial statedump before proceeding to the main program
In the case of short lived applications, the application may exit before
the initial statedump has completed.
Higher-level trace analysis features such as translating addresses to
symbols rely on statedump. That information is required for those
analyses to work on such short-lived applications.
Force the statedump to occur before handing the control to the
application.
Jonathan Rajotte [Mon, 29 Jul 2019 18:49:59 +0000 (14:49 -0400)]
Use MAP_POPULATE to reduce pagefault when available
Any ring buffer configuration bigger than PAGE_SIZE would result
in an increased latency for the first tracepoint hit (1200ns) landing on a
new PAGE_SIZE sized chunk of the mapped memory. This happens at least
for the first ring buffer traversal.
To alleviate this we can use MAP_POPULATE that will "prefault" the page
tables.
A similar flag seems to exist on freebsd (MAP_PREFAULT_READ) but I do
not have access to a system to test it and ensure it does indeed results
in the same effect. It mostly indicates that it prefaults for the
read case so I doubt it is the case.
Default to using MAP_POPULATE on Linux only for now. Support of
prefaulting on other platforms will be added as needed.
Commit 973eac638e4fd introduces an uninitialised value that may prevent
shared memory from being allocated. The compiler didn't give any warning
because the pointer to the value is sent to a function that don't do anything
with it. We simply pass NULL to that function.
-Waddress-of-packed-member, enabled by default, warns about an
unaligned pointer value from the address of a packed member of a
struct or union.
The warning is triggered in some place in LTTng-UST in cases where we
pass a pointer to get a result. Rather than passing the pointer directly
from the struct member, we get the result into a local storage, then
write into in the struct.
Michael Jeanson [Mon, 3 Jun 2019 19:25:32 +0000 (15:25 -0400)]
Fix: namespace our gettid wrapper
Since glibc 2.30, a gettid wrapper was added that conflicts with our
static declaration. Namespace our wrapper so there is no conflict,
we'll add support for the glibc provided wrapper in a further commit.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: bitfield: shift undefined/implementation defined behaviors
bitfield.h uses the left shift operator with a left operand which
may be negative. The C99 standard states that shifting a negative
value is undefined.
When building with -Wshift-negative-value, we get this gcc warning:
In file included from /home/smarchi/src/babeltrace/include/babeltrace/ctfser-internal.h:44:0,
from /home/smarchi/src/babeltrace/ctfser/ctfser.c:42:
/home/smarchi/src/babeltrace/include/babeltrace/ctfser-internal.h: In function ‘bt_ctfser_write_unsigned_int’:
/home/smarchi/src/babeltrace/include/babeltrace/bitfield-internal.h:116:24: error: left shift of negative value [-Werror=shift-negative-value]
mask = ~((~(type) 0) << (__start % ts)); \
^
/home/smarchi/src/babeltrace/include/babeltrace/bitfield-internal.h:222:2: note: in expansion of macro ‘_bt_bitfield_write_le’
_bt_bitfield_write_le(ptr, type, _start, _length, _v)
^~~~~~~~~~~~~~~~~~~~~
/home/smarchi/src/babeltrace/include/babeltrace/ctfser-internal.h:418:3: note: in expansion of macro ‘bt_bitfield_write_le’
bt_bitfield_write_le(mmap_align_addr(ctfser->base_mma) +
^~~~~~~~~~~~~~~~~~~~
This boils down to the fact that the expression ~((uint8_t)0) has type
"signed int", which is used as an operand of the left shift. This is due
to the integer promotion rules of C99 (6.3.3.1):
If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are unchanged
by the integer promotions.
We also need to cast the result explicitly into the left hand
side type to deal with:
warning: large integer implicitly truncated to unsigned type [-Woverflow]
The C99 standard states that a right shift has implementation-defined
behavior when shifting a signed negative value. Add a preprocessor check
that the compiler provides the expected behavior, else provide an
alternative implementation which guarantees the intended behavior.
A preprocessor check is also added to ensure that the compiler
representation for signed values is two's complement, which is expected
by this header.
Document that this header strictly respects the C99 standard, with
the exception of its use of __typeof__.
Fix: alignment of ring buffer shm space reservation
commit a9ff648cc "Implement file-backed ring buffer" changes the order
of backend fields with respect to the frontend per-subbuffer
commit_counters_hot and commit_counters_cold arrays, but does not change
that order when calculating the space needed in the initial pass.
This discrepancy can be an issue for field alignment calculation.
Let's analyse the situation. If the incorrect position of alignment
calculation leads to a larger space reserved than the actual
allocations, no ill effect will be perceived by the user. However,
if space calculation is less than the allocations, it will cause the
ring buffer (and thus channel) creation to fail.
The fields that are incorrectly misplaced in size calculation (in
officially released versions) are:
* struct commit_counters_hot is aligned on CAA_CACHE_LINE_SIZE,
* struct commit_counters_cold is aligned on CAA_CACHE_LINE_SIZE,
Those are placed after (should be before) the backend fields:
* struct lttng_ust_lib_ring_buffer_backend_pages_shmp aligned on the
natural alignment of ssize_t,
* alignment on page size,
* struct lttng_ust_lib_ring_buffer_backend_pages, aligned on the natural
alignment of ssize_t,
* struct lttng_ust_lib_ring_buffer_backend_subbuffer, aligned on natural
alignment of unsigned long,
* struct lttng_ust_lib_ring_buffer_backend_counts, aligned on natural
alignment of uint64_t.
The largest alignment is the alignment on page size in the backend
fields. If we have a channel configured within specific ranges of
sub-buffer count, we should reach commit counters array dimensions
which cause the page size alignment to be lower than it should be in
the space calculation, and therefore leads to a problematic scenario
where space allocation will fail, thus leading to channel creation
failures.
Allocate the memory used by the ts_end field added by commit 6c737d05.
When allocating lots of subbuffer for a channel (512 or more),
zalloc_shm() will fail to allocate all the objects because the allocated
memory map didn't take account the newly added field.
Fix: timestamp_end field should include all events within sub-buffer
Fix for timestamp_end not including all events within sub-buffer. This
happens if a thread is preempted/interrupted for a long time between
reserve and commit (e.g. in the middle of a packet), which causes the
timestamp used for timestamp_end field of the packet header to be lower
than the timestamp of the last events in the buffer (those following the
event that was preempted/interrupted between reserve and commit).
The fix involves sampling the timestamp when doing the last space
reservation in a sub-buffer (which necessarily happens before doing the
delivery after its last commit). Save this timestamp temporarily in a
per-sub-buffer control area (we have exclusive access to that area until
we increment the commit counter).
Then, that timestamp value will be read when delivering the sub-buffer,
whichever event or switch happens to be the last to increment the commit
counter to perform delivery. The timestamp value can be read without
worrying about concurrent access, because at that point sub-buffer
delivery has exclusive access to the sub-buffer.
This ensures the timestamp_end value is always larger or equal to the
timestamp of the last event, always below or equal the timestamp_begin
of the following packet, and always below or equal the timestamp of the
first event in the following packet.
This changes the layout of the ring buffer shared memory area, so we
need to bump the LTTNG_UST_ABI version from 7.2 to 8.0, thus requiring
locked-step upgrade between liblttng-ust in applications, session
daemon, and consumer daemon. This fix therefore cannot be backported
to existing stable releases.
Fix: don't access packet header for stream_id and stream_instance_id getters
The stream ID and stream instance ID are invariant for a stream, so
there is no point reading them from the packet header currently owned by
the consumer (between get/put subbuf).
Actually, the consumer try to access the stream_id from the live timer
when sending a live beacon without getting the reader subbuffer first.
Doing so is racy against producers. In typical live scenarios
(non-overwrite channels), the producers will always write the same
stream id and stream instance id values at the same header offsets,
which will "work", except for the initial state of an empty buffer:
the value "0" will be returned (erroneously).
For the less frequently used scenario of a live session with "overwrite"
channels, this is handled by issuing a CHAN_WARN_ON, which disables
tracing for the channel, and prints warning to the consumerd console
when running consumerd with LTTNG_UST_DEBUG=1.
In the case where a ring buffer does not have any data ready, it makes
no sense to try to get a subbuffer for reading anyway, so the approach
was broken.
So return the stream id and stream instance id from the internal
data structures rather than reading it from the ring buffer.
Michael Jeanson [Wed, 20 Mar 2019 15:07:35 +0000 (11:07 -0400)]
compat: work around broken _SC_NPROCESSORS_CONF on MUSL libc
On MUSL libc the _SC_NPROCESSORS_CONF sysconf will report the number of
CPUs allocated to the task based on the affinity mask instead of the
total number of CPUs configured on the system.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Move wait_shm_mmap initialization to library constructor
Prevent us from deadlocking ourself if some glibc implementation
decide to hold the dl_load_* locks on fork operation.
This happens on Yocto Rocko and up when performing python tracing (import
lttngust). Why Yocto decided to patch glibc this way is a mystery
(ongoing effort) [1][2][3].
Anyhow, we can prevent this by moving the initialization of the
wait_shm_mmap to the library constructor since the dl_load_* locks are
nestable mutex.
Nothing in the git log for the wait_shm_mmap indicate a specific reason
to why it was done inside the listener thread. Doing it inside
wait_for_sessiond can help in some corner cases were /dev/shm
(or the shm path) files are unlinked. This is not much of an advantage.
Omair Majid [Thu, 11 Oct 2018 18:28:49 +0000 (14:28 -0400)]
Fix: address shellcheck warnings/errors in example scripts
ShellCheck points out a number of warnings in the example scripts. In
particular, a number of normal and special shell variables are not
quoted correctly.
Fix: check for event class/instance prototype mismatch
The TP_ARGS() for an event instance belonging to an event class
must have compatible types with the event class TP_ARGS().
Failure to follow this rule leads to a prototype mismatch between the
tracepoint call site and the probe function. A common effect perceived
is that events with prototype mismatch between call site and probe
function are never traced.
Fix this by enforcing a compile-time check of the event instance and
class prototypes, similarly to what is done in LTTng modules.
Fix: race between statedump and library destructor
The locking scheme for ust_lock() returns a teardown state (variable
lttng_ust_comm_should_quit) which is set by library destructor with lock
held.
It requires that when ust listener threads use this lock to protect
against concurrent accesses to a data structure, in addition to take
the lock, they need to check the return value of ust_lock() and
skip their critical section entirely if the return value indicates
that teardown is ongoing.
Iteration over all loaded libraries by lttng_ust_dl_update() starts by
iter_begin which grabs the lock, and sets data->cancel state
appropriately if teardown is ongoing. Then extract_bin_info_events()
uses the data->cancel state to skip over use of the protected structures
as needed, but iter_end() fails to take this data->cancel state into
account. Therefore, it can access data structures concurrently while
their teardown is ongoing which leads to crashes.
procname
Thread name, as set by exec(3) or prctl(2). It is recommended
that programs set their thread name with prctl(2) before
hitting the first tracepoint for that thread.
We can rightfully expect that this applies to the first thread created
within a child process upon fork. Reset the procname cache in the child
on fork.
Move symbol preventing unloading of probe providers
Issue
=====
Calling dlclose on the probe provider library that first loaded
__tracepoints__disable_destructors in the symbol table does not
unregister the probes from the callsites as the destructors are not
executed.
The __tracepoints__disable_destructors weak symbol is exposed by probe
providers, liblttng-ust.so and liblttng-ust-tracepoint.so libraries. If
a probe provider is loaded first into the address space, its definition
is bound to the symbol. All the subsequent loaded libraries using the
symbol will use the existing definition of the symbol, thus creating a
situation where liblttng-ust.so or liblttng-ust-tracepoint.so depend on
the probe provider library.
This prevents the dynamic loader from unloading the library as it is
still in use by other libraries. Because of this, the execution of its
destructors and the unregistration of the probes is postponed. Since the
unregistration of the probes is postponed, event will be generated if
the callsite is executed even though the probes should not be loaded.
Solution
========
To overcome this issue, we no longer expose this symbol in the
tracepoint.h file to remove the explicit dependency of the probe
provider on the symbol. We instead use the existing dlopen handle on
liblttng-ust-tracepoint.so and use dlsym to get handles on functions
that disable and get the state of the destructors.
Version compatibility
=====================
- This change is backward compatible with UST applications and libraries
built on lttng-ust version before 2.11. Those applications will use
the __tracepoints__disable_destructors symbol that is now exposed
as a weak symbol in the liblttng-ust-tracepoint.so library. This
symbol is alway checked in 2.11 in case an old app is running.
- Applications built with this change will also work in older versions
of lttng-ust as there is a check to see if the new destructor state
checking method should be used, if it is not we fallback to a
compatibility method. To ensure compatibility in this case, we also
look up and keep up-to-date the __tracepoints__disable_destructors
value using the dlopen-dlsym combo.
- A mix of applications/probes builds in part against 2.10 and 2.11
also work. When setting the destructor state from a binary built
against 2.11 headers, both old/new states are set, so a binary built
against 2.10 will correctly see the old state. When querying the state
from a binary built against 2.11 headers, both old and new states are
queried, so if the state has been set from a binary built against
2.10 headers, the old state will be set.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>