Kienan Stewart [Thu, 2 May 2024 20:51:45 +0000 (16:51 -0400)]
docs: Correct GitHub URLs in lttng-ust.3
The branches follow the format `stable-X.YZ` rather than `vX.YZ`.
Furthermore, when rendering the man pages from source, the URLs were
omitted completely as the subsitution `{lttng_version}` was not
defined. This hasn't been an issue for the published HTML versions as
those are produced via a different script in the `lttng-www` project
which presumably sets the substitution properly.
Change-Id: Ib96c99df13ddf724e128f95e7ce7c74b2c10c766
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 2 May 2024 14:41:49 +0000 (10:41 -0400)]
fix: handle EINTR correctly in get_cpu_mask_from_sysfs
If the read() in get_cpu_mask_from_sysfs() fails with EINTR, the code is
supposed to retry, but the while loop condition has (bytes_read > 0),
which is false when read() fails with EINTR. The result is that the code
exits the loop, having only read part of the string.
Use (bytes_read != 0) in the while loop condition instead, since the
(bytes_read < 0) case is already handled in the loop.
Original fix in liburcu from Benjamin Marzinski <bmarzins@redhat.com>:
commit
9922f33e2986 ("fix: handle EINTR correctly in get_cpu_mask_from_sysfs")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I885a0fb98e5a7cfb9a8bd180c8e64b20926ff58c
Mathieu Desnoyers [Fri, 19 Apr 2024 18:09:36 +0000 (14:09 -0400)]
Version 2.12.10
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia7eb272d7a0bb0a3d56fcfb713566314de588628
Mathieu Desnoyers [Thu, 18 Apr 2024 15:25:55 +0000 (11:25 -0400)]
Add close_range wrapper to liblttng-ust-fd.so
glibc 2.34 implements close_range(2), which is used by the ssh client
(amongst others). This needs to be overridden to make sure ssh does not
close lttng-ust file descriptors.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic4e0046499e1f010395aec71a48316b9d1e9bf3f
Olivier Dion [Thu, 21 Mar 2024 18:42:13 +0000 (14:42 -0400)]
lttng-ust(3): Fix wrong len_type for sequence
`len_type' of a sequence field must be of type unsigned integer. Some
provided examples in the man page were incorrectly using a type signed
integer, resulting in correct compilation, but error while decoding.
Change-Id: Icc685b330d0704660b36f703075f453d71c5e4cb
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 15 Jan 2024 18:36:29 +0000 (13:36 -0500)]
Fix: libc wrapper: use initial-exec for malloc_nesting TLS
Use the initial-exec TLS model for the malloc_nesting nesting guard
variable to ensure that the glibc implementation of the TLS access don't
trigger infinite recursion by calling the memory allocator wrapper
functions, which can happen with global-dynamic.
Considering that the libc wrapper is meant to be loaded with LD_PRELOAD
anyway (never with dlopen(3)), we always expect the libc to have enough
space to hold the malloc_nesting variable.
In addition to change the malloc_nesting from global-dynamic to
initial-exec, this removes the URCU TLS compatibility layer from the
libc wrapper, which is a good thing: this compatibility layer relies
on pthread key and calloc internally, which makes it a bad fit for TLS
accesses guarding access to malloc wrappers, due to possible infinite
recursion.
Link: https://lists.lttng.org/pipermail/lttng-dev/2024-January/030697.html
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I72c42bc09c1a06e2922b184b85abeb9c94200ee2
Mathieu Desnoyers [Wed, 10 Jan 2024 19:54:07 +0000 (14:54 -0500)]
Version 2.12.9
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I47df837a4eeb9d48bcb7cd4c9f9b70e64182647c
Kienan Stewart [Wed, 6 Dec 2023 18:37:47 +0000 (13:37 -0500)]
fix: invoke MKDIR_P before changing directories
In autoconf < 2.72d `AC_PROG_MKDIR_P` may fall back to using
`install-sh` and that may be referenced as a relative path.
To avoid issues with relative paths causing the command to not be
found, the build directories are created before changing the working
directory.
One way to to test the behaviour prior to this commit is to configure
the build similar to the following:
./configure MKDIR_P="$(realpath --relative-to="$(pwd)" \
$(command -v mkdir))" BUILD_EXAMPLES_FROM_TREE=1
Fixes https://bugs.lttng.org/issues/1404
Change-Id: I2d66254cd8c208f9236d55c6ef1b83c580560c7c
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 17 Oct 2023 19:02:44 +0000 (15:02 -0400)]
fix: clean java inner class files in examples
Java classes that contain inner classes will result in additional class
files being created when compiled in the form of
'Class$InnerClass.class'. Expand the clean target to delete those
additional files.
Change-Id: I0ed7939dcaefa5ca26db9438f7a9b34e57d78f21
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Olivier Dion [Tue, 15 Aug 2023 14:47:06 +0000 (10:47 -0400)]
ustfork: Fix warning about volatile qualifier
Clang is strict about the volatile qualifier on function pointers. It
also wants pointers to be passed to atomic builtins, even for
functions. Therefore, use the addresses of function pointers even if
unnecessary according to C standard.
Change-Id: I5d553a46671cc4bfbe8de5cec2425201459f60d2
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Olivier Dion [Wed, 9 Aug 2023 21:35:40 +0000 (17:35 -0400)]
ustfork: Fix possible race conditions
Assuming that `dlsym(RTLD_NEXT, "symbol")' is invariant for "symbol",
then we could think that memory operations on the `plibc_func' pointers can
be safely done without atomics.
However, consider what would happen if a load to a`plibc_func' pointer
is torn apart by the compiler. Then a thread could see:
1) NULL
2) The stored value as returned by a dlsym() call
3) A mix of 1) and 2)
The same goes for other optimizations that a compiler is authorized to
do (e.g. store tearing, load fusing).
One could question whether such race condition is even possible for the
clone(2) wrapper. Indeed, a thread must be cloned to get into
existence. Therefore, the main thread would always store the value of
`plibc_func' at least once before creating the first sibling thread,
preventing any possible race condition for this wrapper. However, this
assume that the main thread will not call the clone system call directly
before calling the libc wrapper! Thus, to be on the safe side, we do the
same for the clone wrapper.
Fix the race conditions by using the uatomic_read/uatomic_set functions,
on access to `plibc_func' pointers.
Change-Id: Ic4be25983b8836d2b333f367af9c18d2f6b75879
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 12 Aug 2023 17:19:52 +0000 (13:19 -0400)]
Fix: FreeBSD: Pass flags arguments to rfork wrapper
Backported from:
commit
e2a195a6849 ("Fix warnings on FreeBSD")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I12aa474cd057645337f03de22d7c6010266398f3
Michael Jeanson [Wed, 14 Jun 2023 20:55:28 +0000 (16:55 -0400)]
fix: python agent: use stdlib distutils when setuptools is installed
When the setuptools package is installed, it monkey patches the standard
library distutils even if the user code doesn't import setuptools.
This results in a failure to install the python agent in a directory
which ins't in the current PYTHONPATH. To allow this setuptools requires
the '--single-version-externally-managed' options which is not
implemented in distutils.
To resolve this, force the use of distutils for python < 3.12 even when
setuptools is installed with the 'SETUPTOOLS_USE_DISTUTILS' environment
variable and use the proper setuptools option with python >= 3.12 which
doesn't include distutils anymore.
Change-Id: Idf477ca61bed460c9f6be7f481fe3b84624f328c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 14 Jun 2023 19:58:32 +0000 (15:58 -0400)]
fix: python agent: install on Debian python >= 3.10
Starting with Debian's Python 3.10, the default install scheme is
'posix_local' which is a Debian specific scheme based on 'posix_prefix'
but with an added 'local' prefix. This is the default so users doing
system wide manual installations of python modules end up in
'/usr/local'. This interferes with our autotools based install which
already defaults to '/usr/local' and expect a provided prefix to be used
verbatim.
Monkeypatch sysconfig to override this scheme and use 'posix_prefix' instead.
Change-Id: I08fe77b6c8807515765e3ad0344aa6849e573b90
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 14 Jun 2023 20:21:32 +0000 (16:21 -0400)]
fix: python agent: Add a dependency on generated files
This allows files to be regenerated at build time if the template was
modified since the last build.
Change-Id: I2f98d6b726552efd91719ada9637d2fc2909fbb3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 13 Jun 2023 15:40:20 +0000 (11:40 -0400)]
python: use setuptools with python >= 3.12
Since 'distutils' will be removed in Python 3.12, use setuptools instead
to build the python agent.
See https://peps.python.org/pep-0632/
Change-Id: I101f0ce0ecd9bd8c198eb2a8d4dd535a46c7a0a0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 7 Jun 2023 14:32:14 +0000 (10:32 -0400)]
Version 2.12.8
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3d375552200c91887f1f89cf36097d587cb56eb9
Jérémie Galarneau [Thu, 30 Mar 2023 18:56:15 +0000 (14:56 -0400)]
Fix: segmentation fault on filter interpretation in "switch" mode
When building the interpreter with `INTERPRETER_USE_SWITCH`, I get the
following crash when interpreting a bytecode:
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0 0x00007f5789aee443 in lttng_bytecode_interpret (ust_bytecode=0x555dfe90a650, interpreter_stack_data=0x7ffd12615500 "", probe_ctx=0x7ffd12615620,
caller_ctx=0x7ffd126154bc) at lttng-bytecode-interpreter.c:885
#1 0x00007f5789af4da2 in lttng_ust_interpret_event_filter (event=0x555dfe90a580, interpreter_stack_data=0x7ffd12615500 "", probe_ctx=0x7ffd12615620,
event_filter_ctx=0x0) at lttng-bytecode-interpreter.c:2548
#2 0x0000555dfe02d2d4 in lttng_ust__event_probe__tp___the_string (__tp_data=0x555dfe90a580, i=0, arg_i=2, str=0x7ffd12617cfa "hypothec") at ././tp.h:16
#3 0x0000555dfe02cac0 in lttng_ust_tracepoint_cb_tp___the_string (str=0x7ffd12617cfa "hypothec", arg_i=2, i=0)
at /tmp/lttng-master/src/lttng-tools/tests/utils/testapp/gen-ust-nevents-str/tp.h:16
#4 main (argc=39, argv=0x7ffd12615818) at gen-ust-nevents-str.cpp:38
This appears to be caused by `bytecode->data` being used to determine
the `start_pc` address. In my case, `data` is NULL. A quick look around
the code seems to show that this member is not used except during the
transmission of the bytecode.
I am basing the fix on the implementation of START_OP in the default
case which uses `code` in lieu of `data` and can confirm that it fixes
the crash on my end.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0773df385b8e90728b60503016dec4b46d902234
Mathieu Desnoyers [Tue, 21 Feb 2023 19:43:29 +0000 (14:43 -0500)]
Fix: Reevaluate LTTNG_UST_TRACEPOINT_DEFINE each time tracepoint.h is included
Fix issues with missing symbols in use-cases where tracef.h is included
before defining LTTNG_UST_TRACEPOINT_DEFINE, e.g.:
#include <lttng/tracef.h>
#define LTTNG_UST_TRACEPOINT_DEFINE
#include <provider.h>
It is caused by the fact that tracef.h includes tracepoint.h in a
context which has LTTNG_UST_TRACEPOINT_DEFINE undefined, and this is not
re-evaluated for the following includes.
Fix this by lifting the definition code in tracepoint.h outside of the
header include guards, and #undef the old LTTNG_UST__DEFINE_TRACEPOINT
before re-defining it to its new semantic. Use a new
_LTTNG_UST_TRACEPOINT_DEFINE_ONCE include guard within the
LTTNG_UST_TRACEPOINT_DEFINE defined case to ensure symbols are not
duplicated.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id48a5f73527d8de5fc1b3605c1d858d848b78633
Mathieu Desnoyers [Fri, 17 Feb 2023 21:08:46 +0000 (16:08 -0500)]
Fix: trace events in C constructors/destructors
Adding a priority (150) to the tracepoint and tracepoint provider
constructors/destructors ensures that we trace tracepoints located
within C constructors/destructors with a higher priority value,
including the default init priority of 65535, when the tracepoint vs
tracepoint definition vs tracepoint probe provider are in different
compile units (and in various link order one compared to another).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia8e36317ae058402cdb81cb921da69cfa97a2f82
Mathieu Desnoyers [Thu, 2 Feb 2023 15:25:57 +0000 (10:25 -0500)]
Fix: use unaligned pointer accesses for lttng_inline_memcpy
lttng_inline_memcpy receives pointers which can be unaligned. This
causes issues (traps) specifically on arm 32-bit with 8-byte strings
(including \0).
Use unaligned pointer accesses for loads/stores within
lttng_inline_memcpy instead.
There is an impact on code generation on some architectures. Using the
following test code on godbolt.org:
void copy16_aligned(void *dest, void *src) {
*(uint16_t *)dest = *(uint16_t *) src;
}
void copy16_unaligned(void *dest, void *src) {
STORE_UNALIGNED_INT(uint16_t, dest, LOAD_UNALIGNED_INT(uint16_t, src));
}
void copy32_aligned(void *dest, void *src) {
*(uint32_t *)dest = *(uint32_t *) src;
}
void copy32_unaligned(void *dest, void *src) {
STORE_UNALIGNED_INT(uint32_t, dest, LOAD_UNALIGNED_INT(uint32_t, src));
}
void copy64_aligned(void *dest, void *src) {
*(uint64_t *)dest = *(uint64_t *) src;
}
void copy64_unaligned(void *dest, void *src) {
STORE_UNALIGNED_INT(uint64_t, dest, LOAD_UNALIGNED_INT(uint64_t, src));
}
The resulting assembler (gcc 12.2.0 in -O2) between aligned and
unaligned:
- x86-32: unchanged.
- x86-64: unchanged.
- powerpc32: unchanged.
- powerpc64: unchanged.
- arm32: 16 and 32-bit copy: unchanged. Added code for 64-bit unaligned copy.
- aarch64: unchanged.
- mips32: added code for unaligned.
- mips64: added code for unaligned.
- riscv: added code for unaligned.
If we want to improve the situation on mips and riscv, this would
require introducing a new "lttng_inline_integer_copy" and expose
additional ring buffer client APIs in addition to event_write() which
take integers as inputs. Let's not introduce that complexity yet until
it is justified.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1e6471d4607ac6aff89f16ef24d5370e804b7612
Mathieu Desnoyers [Fri, 30 Sep 2022 18:36:27 +0000 (14:36 -0400)]
Version 2.12.7
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie7ee76315f3966f3c0f9bad0d374287442f7f8a2
Mathieu Desnoyers [Fri, 30 Sep 2022 15:38:57 +0000 (11:38 -0400)]
Fix: bytecode validator: reject specialized load field/context ref instructions
Reject specialized load ref and get context ref instructions so a
bytecode crafted with nefarious intent cannot read a memory area larger
than the memory targeted by the instrumentation.
This prevents bytecode received from the session daemon from performing
out of bound memory accesses and from disclosing the content of
application memory beyond what has been targeted by the instrumentation.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I9bb027e58312c125aa4a9cba5d8f4b5ceb31f4f6
Mathieu Desnoyers [Thu, 29 Sep 2022 20:50:09 +0000 (16:50 -0400)]
Fix: bytecode validator: reject specialized load instructions
Reject specialized load instructions so a bytecode crafted with
nefarious intent cannot read a memory area larger than the memory
targeted by the instrumentation.
This prevents bytecode received from the session daemon from performing
out of bound memory accesses and from disclosing the content of
application memory beyond what has been targeted by the instrumentation.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id81d73e890b29fe2ac6073681ef1faffc52ebfa7
Jérémie Galarneau [Wed, 14 Sep 2022 12:37:35 +0000 (13:37 +0100)]
Fix: lttng-ust-comm: wait on wrong child process
The code currently assumes that the forked process is the only child
process at that point in time. However, there can be unreaped child
processes as reported in the original bug.
From wait(3), as currently used, "status is requested for any child
process."
Using the pid explicitly ensures a wait on the expected child process.
More context is available at:
https://bugs.lttng.org/issues/1359
Fixes #1359
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8a4621d79c61f7dfefde5c2b94bdee9752e1973d
Michael Jeanson [Thu, 7 Jul 2022 21:01:54 +0000 (17:01 -0400)]
fix: 'make dist' without javah
Don't use 'BUILT_SOURCES' for the header file generated by javah /
javac, files added to this target will be generated on 'make dist'
regardless of the configuration or presence of the required tools.
Add proper make dependencies between the different targets instead of
using 'all-local'.
Set JAVAROOT to a temporary directory to properly clean class files and
avoid confusing javah when it's used to generate the JNI header.
Change-Id: I8544d0418039ba667d062cb01c924368ab702ab7
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 19 Aug 2022 20:20:55 +0000 (16:20 -0400)]
Version 2.12.6
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id8598030a26685551a0f168497e8f806fc7461d5
Michael Jeanson [Mon, 1 Aug 2022 17:44:08 +0000 (13:44 -0400)]
cleanup: remove stale comment
Change-Id: I339fe13ff2d124fbf0a91223c090921902cb965d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 29 Jul 2022 14:43:15 +0000 (10:43 -0400)]
fix: Unify possible CPU number fallback
The MUSL specific fallback to get the number of possible CPUs in the
system has the same issue with hot-unplugged CPUs as the Glibc
implementation we worked around by using the possible CPU mask from
sysfs.
To address this, unify our fallback code across all C libraries to get
the maximum CPU id from the directories in "/sys/devices/system/cpu".
Change-Id: I5541742dc1de8e011a942880825fa88c656f0905
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 27 Jul 2022 15:45:26 +0000 (11:45 -0400)]
Fix: remove unused align.h header
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3a3ea23e36de36ac59b0d350b8ac2bdcfca4a139
Michael Jeanson [Wed, 27 Jul 2022 14:54:53 +0000 (10:54 -0400)]
fix: removed accidental VLA in _get_num_possible_cpus()
The LTTNG_UST_PAGE_SIZE define can either point to a literal value or
the sysconf() function making buf[] a VLA. Replace this by a
cpumask specifc define that will always be a literal value.
Change-Id: I8d329f314878e8018939f979861918969e3ec8ac
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 25 Jul 2022 18:32:36 +0000 (14:32 -0400)]
Add unit tests for num possible cpus
Change-Id: I90eff0090b28cef64a8f4a5bd9745971ed89c711
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 25 Jul 2022 18:21:12 +0000 (14:21 -0400)]
fix: num_possible_cpus() with hot-unplugged CPUs
We rely on sysconf(_SC_NPROCESSORS_CONF) to get the maximum possible
number of CPUs that can be attached to the system for the lifetime of an
application. We use this value to allocate an array of per-CPU buffers
that is indexed by the numerical id of the CPUs.
As such we expect that the highest possible CPU id would be one less
than the number returned by sysconf(_SC_NPROCESSORS_CONF) which is
unfortunatly not always the case and can vary across libc
implementations and versions.
Glibc up to 2.35 will count the number of "cpuX" directories in
"/sys/devices/system/cpu" which doesn't include CPUS that were
hot-unplugged.
This information is however provided by the kernel in
"/sys/devices/system/cpu/possible" in the form of a mask listing all the
CPUs that could possibly be hot-plugged in the system.
This patch changes the implementation of num_possible_cpus() to first
try parsing the possible CPU mask to extract the highest possible value
and if this fails fallback to the previous behavior.
Change-Id: I1a3cb1a446154ec443a391d6689cb7d4165726fd
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
takeshi.iwanari [Fri, 24 Jun 2022 13:17:39 +0000 (22:17 +0900)]
Fix: Use negative value for error code of lttng_ust_ctl_duplicate_ust_object_data
[As is]
- `lttng_ust_ctl_duplicate_ust_object_data` function is called by the following functions:
- `event_notifier_error_accounting_register_app` (lttng-tools)
- `duplicate_stream_object` (lttng-tools)
- `duplicate_channel_object` (lttng-tools)
- `lttng_ust_ctl_duplicate_ust_object_data` function returns positive value (= errno = 24 = EMFILE) when system call `dup` returns error
- However, `duplicate_stream_object` and `duplicate_channel_object` functions expect negative value as error code
- As a result, these functions cannot handle error and segmentation fault occurs when using `stream->handle`
[Proposal]
- Currently, `lttng_ust_ctl_duplicate_ust_object_data` function returns either positive or negative value when error happens
- It looks convention is using negative value for error code (e.g. `-ENOMEM` )
- So, I propose to change `errno` to `-errno`
Signed-off-by: takeshi.iwanari <takeshi.iwanari@tier4.jp>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iccb01930413ecd5a8c58ad267e9c4eca53694dc7
Mathieu Desnoyers [Thu, 23 Jun 2022 19:58:04 +0000 (15:58 -0400)]
Fix: sessiond wait futex: handle spurious futex wakeups
Observed issue
==============
LTTng-UST scheme for letting listener threads wait on session daemon
to wake up a futex is similar to the liburcu workqueue code, which has
an issue with spurious wakeups.
This wait/wakeup scheme is only used after the LTTng-UST listener thread
has been unable to connect to the session daemon.
A spurious wakeup on wait_for_sessiond can cause wait_for_sessiond to
return with a sock_info->wait_shm_mmap state of 0, which is unexpected.
However, this should not cause any user-observable issues other than
using slightly more CPU time than strictly needed, because this spurious
wakeup will only cause an additional connection attempt to the session
daemon to fail.
Cause
=====
From futex(5):
FUTEX_WAIT
Returns 0 if the caller was woken up. Note that a wake-up can
also be caused by common futex usage patterns in unrelated code
that happened to have previously used the futex word's memory
location (e.g., typical futex-based implementations of Pthreads
mutexes can cause this under some conditions). Therefore, call‐
ers should always conservatively assume that a return value of 0
can mean a spurious wake-up, and use the futex word's value
(i.e., the user-space synchronization scheme) to decide whether
to continue to block or not.
Solution
========
We therefore need to validate whether the value differs from 0 in
user-space after the call to FUTEX_WAIT returns 0.
Known drawbacks
===============
None.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I468d8ff302f467ee9924e6edb04476fcb031b4b9
Mathieu Desnoyers [Fri, 3 Jun 2022 19:53:42 +0000 (15:53 -0400)]
Version 2.12.5
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I72006c47fe7476369d27e8a4e36b68bd70ee1d83
Mathieu Desnoyers [Wed, 6 Apr 2022 15:16:06 +0000 (11:16 -0400)]
Document ust lock async-signal-safety
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie150d5757cc050b0262dcea20f20c1da4963a27e
Mathieu Desnoyers [Wed, 6 Apr 2022 14:55:11 +0000 (10:55 -0400)]
Fix: don't use strerror() from ust lock nocheck
ust_lock_nocheck is meant to be async-signal-safe for use from the
fork() override helper (and fork(2) is async-signal-safe).
Remove calls to strerror() from ust lock functions and from the
cancelstate helper because strerror is not async-signal-safe and indeed
allocates memory.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I461f3631a24e71232d987b0a984b4942903bf9ac
Mathieu Desnoyers [Fri, 3 Jun 2022 19:45:31 +0000 (15:45 -0400)]
Fix: remove non-async-signal-safe fflush from ERR()
Commit
ff1fedb9f2e8 ("usterr: make error reporting functions signal safe")
changed the logging printout mechanism to use patient_write() to a file
descriptor to ensure signal-safety of the ERR() logging mechanism.
However, the fflush(stderr) was left in place, although it was useless.
Unfortunately, fflush() is not async-signal-safe.
Fix this by removing this fflush() call.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I13754acd914c4a9f71014a1e332c3fb25197a669
Jonathan Rajotte [Tue, 17 May 2022 18:09:42 +0000 (14:09 -0400)]
Fix: statedump: invalid read during iter_end
Scenario
=========
Using a modified doc/examples/easy-ust/sample.c and dummy shared objects:
int main(int argc, char **argv)
{
int i = 0;
void *handle_cat;
void *handle_dog;
void (*func_print_name_cat)(const char*);
void (*func_print_name_dog)(const char*);
handle_dog = dlopen("./libdog.so", RTLD_NOW);
handle_cat = dlopen("./libcat.so", RTLD_NOW);
*(void**)(&func_print_name_dog) = dlsym(handle_dog, "print_name");
*(void**)(&func_print_name_cat) = dlsym(handle_cat, "print_name");
for (i = 0; i < 5; i++) {
tracepoint(sample_component, message, "Hello World");
usleep(1);
}
printf("Run `lttng regenerate statedump. Press enter \n");
getchar();
dlclose(handle_dog);
printf("Run `lttng regenerate statedump. Press enter \n");
getchar();
dlclose(handle_cat);
return 0;
}
On lttng side:
lttng create
lttng enable-event -u -a
lttng start
valgrind sample
Issue `lttng regenerate statedump` as the app suggest.
The second `lttng regenerate statedump` results in:
==934747== Invalid read of size 8
==934747== at 0x48BA90F: iter_end (lttng-ust-statedump.c:439)
==934747== by 0x48BAD73: lttng_ust_dl_update (lttng-ust-statedump.c:586)
==934747== by 0x48BADC0: do_baddr_statedump (lttng-ust-statedump.c:599)
==934747== by 0x48BAE62: do_lttng_ust_statedump (lttng-ust-statedump.c:633)
==934747== by 0x489F820: lttng_handle_pending_statedump (lttng-events.c:969)
==934747== by 0x488C000: handle_pending_statedump (lttng-ust-comm.c:717)
==934747== by 0x488DCF7: handle_message (lttng-ust-comm.c:1110)
==934747== by 0x48905EA: ust_listener_thread (lttng-ust-comm.c:1756)
==934747== by 0x4B62608: start_thread (pthread_create.c:477)
==934747== by 0x4A4D162: clone (clone.S:95)
==934747== Address 0x4c4ea88 is 4,152 bytes inside a block of size 4,176 free'd
==934747== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==934747== by 0x48B9588: free_dl_node (lttng-ust-statedump.c:123)
==934747== by 0x48BA90A: iter_end (lttng-ust-statedump.c:450)
==934747== by 0x48BAD73: lttng_ust_dl_update (lttng-ust-statedump.c:586)
==934747== by 0x48BADC0: do_baddr_statedump (lttng-ust-statedump.c:599)
==934747== by 0x48BAE62: do_lttng_ust_statedump (lttng-ust-statedump.c:633)
==934747== by 0x489F820: lttng_handle_pending_statedump (lttng-events.c:969)
==934747== by 0x488C000: handle_pending_statedump (lttng-ust-comm.c:717)
==934747== by 0x488DCF7: handle_message (lttng-ust-comm.c:1110)
==934747== by 0x48905EA: ust_listener_thread (lttng-ust-comm.c:1756)
==934747== by 0x4B62608: start_thread (pthread_create.c:477)
==934747== by 0x4A4D162: clone (clone.S:95)
==934747== Block was alloc'd at
==934747== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==934747== by 0x48B936A: zmalloc (helper.h:27)
==934747== by 0x48B936A: alloc_dl_node (lttng-ust-statedump.c:85)
==934747== by 0x48B98F7: find_or_create_dl_node (lttng-ust-statedump.c:184)
==934747== by 0x48BA205: extract_baddr (lttng-ust-statedump.c:339)
==934747== by 0x48BABC6: extract_bin_info_events (lttng-ust-statedump.c:528)
==934747== by 0x4A8D2F4: dl_iterate_phdr (dl-iteratephdr.c:75)
==934747== by 0x48BAD4C: lttng_ust_dl_update (lttng-ust-statedump.c:583)
==934747== by 0x48BADC0: do_baddr_statedump (lttng-ust-statedump.c:599)
==934747== by 0x48BAE62: do_lttng_ust_statedump (lttng-ust-statedump.c:633)
==934747== by 0x489F820: lttng_handle_pending_statedump (lttng-events.c:969)
==934747== by 0x488C000: handle_pending_statedump (lttng-ust-comm.c:717)
==934747== by 0x488DCF7: handle_message (lttng-ust-comm.c:1110)
==934747==
Cause
=========
Nodes can be removed during the `cds_hlist_for_each_entry_2` iteration which
is not meant to be used when items are removed within the traversal.
Solution
=========
Use `cds_hlist_for_each_entry_safe_2`.
Change-Id: Ibf3d94a4d6f7abac19ed9740eeacfbcb1bdf1f4f
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 30 Mar 2022 16:10:53 +0000 (12:10 -0400)]
Fix: bytecode interpreter context_get_index() leaves byte order uninitialized
Observed Issue
==============
With lttng-ust 2.13, when using the event notification capture feature
to capture a context field, e.g. '$ctx.cpu_id', the captured value is
often observed in reverse byte order.
This issue is not visible in lttng-ust 2.12 because it does not
implement the event notification capture feature. However, it would
become observable if a lttng-tools emits a filter bytecode
BYTECODE_OP_GET_SYMBOL instruction to load the context value. For
compatibility purposes, lttng-tools only uses
BYTECODE_OP_GET_CONTEXT_REF to load the filter context fields,
but nothing prevents a future lttng-tools version from using
BYTECODE_OP_GET_SYMBOL instead.
Cause
=====
Within the bytecode interpreter, context_get_index() leaves the "rev_bo"
field uninitialized in the top of stack.
Solution
========
Initialize the rev_bo field based on the context field type
reserve_byte_order field.
Known drawbacks
===============
None.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I74996d501cee3c269658d98dfc0d0050b74c5ddb
Mathieu Desnoyers [Fri, 25 Mar 2022 18:00:18 +0000 (14:00 -0400)]
Version 2.12.4
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifdb806cb6fb3c282131e5417019d0fdef96faeda
Michael Jeanson [Thu, 17 Mar 2022 17:45:51 +0000 (13:45 -0400)]
fix: __STDC_VERSION__ can be undefined in C++
Caught on SLES12 with g++ 4.8 when enabling '-Wundef'.
Change-Id: Ib027f224a4f0ef021beb1709d8a626db62fe6d9c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 9 Mar 2022 16:54:33 +0000 (11:54 -0500)]
Fix: concurrent exec(2) file descriptor leak
If exec(2) is executed by the application concurrently with LTTng-UST
listener threads between the creation of a file descriptor with
socket(2), recvmsg(2), or pipe(2) and call to fcntl(3) FD_CLOEXEC, those
file descriptors will stay open after the exec, which is not intended.
As a consequence, shared memory files for ring buffers can stay present
on the file system for long-running traced processes.
Use:
- pipe2(2) O_CLOEXEC (supported since Linux 2.6.27, and by FreeBSD),
- socket(2) SOCK_CLOEXEC (supported since Linux 2.6.27, and by FreeBSD),
- recvmsg(2) MSG_CMSG_CLOEXEC (supported since Linux 2.6.23 and by FreeBSD),
rather than fcntl(2) FD_CLOEXEC to make sure the file descriptors are
closed on exec immediately upon their creation.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id2167cf99d7cb8a8425fc0dc13745f023a504562
Michael Jeanson [Thu, 10 Feb 2022 15:25:02 +0000 (15:25 +0000)]
Add 'domain' parameter to the Log4j 2.x agent
The initial Log4j 2.x agent commit only implemented a compatibility mode
to be used with the existing LOG4J domain in lttng-tools.
In this mode the agent converts the new Log4j 2.x loglevel values to
their corresponding Log4j 1.x values in the same way the upstream
compatibility bridge does.
This is great when doing in-place migration using the upstream
compatibility bridge but doesn't cover the usecase of an application
that natively uses Log4j 2.x.
This commit adds a new mandatory 'domain' parameter to the Log4j2 agent
which currently only implements the 'LOG4J' compatibility domain in
preparation to adding a 'LOG4J2' domain.
The configuration for a single appender in Log4j 1.x compat mode will
now look like this:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
<Appenders>
<Lttng name="LTTNG" domain="LOG4J"/>
</Appenders>
<Loggers>
<Root level="all">
<AppenderRef ref="LTTNG"/>
</Root>
</Loggers>
</Configuration>
Change-Id: I7fd5f79ad58c77175714bd4198d8ff5db2e6b846
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 2 Feb 2022 19:04:50 +0000 (19:04 +0000)]
fix: Convert custom loglevels in Log4j 2.x agent
The loglevel integer representation has changed between log4j 1.x and
2.x, we currently convert the standard loglevels but passthrough the
custom ones.
This can be problematic when using severity ranges as custom loglevels
won't be properly filtered.
Use the same strategy as the upstream Log4j 2.x compatibility layer by
converting the custom loglevels to their equivalent standard loglevel
value.
Change-Id: I8cbd4706cb774e334380050cf0b407e19d7bc7c4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 28 Jan 2022 18:58:12 +0000 (18:58 +0000)]
fix: coverity reported null returns in Log4j2 agent
According to the log4j javadoc, these methods should not return null but
since it's reported by Coverity, add the null checks.
*** CID
1469124: Null pointer dereferences (NULL_RETURNS)
src/lib/lttng-ust-java-agent/java/lttng-ust-agent-log4j2/org/lttng/ust/agent/log4j2/LttngLogAppender.java:
194 in org.lttng.ust.agent.log4j2.LttngLogAppender.append(org.apache.logging.log4j.core.LogEvent)()
*** CID
1469123: Null pointer dereferences (NULL_RETURNS)
src/lib/lttng-ust-java-agent/java/lttng-ust-agent-log4j2/org/lttng/ust/agent/log4j2/LttngLogAppender.java:
167 in org.lttng.ust.agent.log4j2.LttngLogAppender.append(org.apache.logging.log4j.core.LogEvent)()
Change-Id: Ib992b3cc6848492cfb6e7d8fec6ce3898d962db4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 6 Jan 2022 19:36:46 +0000 (14:36 -0500)]
Add a Log4j 2.x Java agent
This adds a new agent to the LTTng-UST Java agents suite supporting the
Log4j 2.x logging backend.
This new agent can be enabled with the following configure options :
$ export CLASSPATH=/path/to/log4j-core.jar:/path/to/log4j-api.jar
$ ./configure --enable-java-agent-log4j2
This backport differs from the master branch for the
'--enable-java-agent-all' option won't select this new agent since we
wanted to avoid introducing a new dependency in existing configurations.
The name of the new agent jar file is "lttng-ust-agent-log4j2.jar".
It will be installed in the arch-agnostic "$prefix/share/java" path
e.g: "/usr/share/java".
It uses the same jni library "liblttng-ust-log4j-jni.so" as the Log4j 1.x agent.
The agent was designed as a mostly drop-in replacement for applications
upgrading from Log4j 1.x to 2.x. It requires no modification to the
tracing configuration as it uses the same domain "-l / LOG4J" and the
loglevels integer representations are converted to the Log4j 1.x values
(excluding custom loglevels).
The recommended way to use this agent with Log4j 2.x is to add an
"Lttng" Appender with an arbiraty name and associate it with one or more
Logger using an AppenderRef.
For example, here is a basic log4j2 xml configuration that would send
all logging statements exlusively to an lttng appender:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
<Appenders>
<Lttng name="LTTNG"/>
</Appenders>
<Loggers>
<Root level="all">
<AppenderRef ref="LTTNG"/>
</Root>
</Loggers>
</Configuration>
More examples can be found in the 'doc/examples' directory.
The implementation of the appender is based on this[1] great guide by
Keith D. Gregory which is so much more detailed than the official
documentation, my thanks to him.
[1] https://www.kdgregory.com/index.php?page=logging.log4j2Plugins
Change-Id: I34593c9a4c3140c8839cef8b58cc85745fe9f47f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 5 Jan 2022 21:11:27 +0000 (16:11 -0500)]
Version 2.12.3
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifbf2f5d985efc5358255e2dd9f068b4a94485638
Mathieu Desnoyers [Thu, 9 Dec 2021 19:43:06 +0000 (14:43 -0500)]
Fix: ust-cancelstate: include string.h for strerror
strerror() is provided by string.h, not error.h. Also error.h is not
present on FreeBSD, which causes the build to fail.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id5e29df184c1f24659e15bc16c73da01fc819905
Mathieu Desnoyers [Thu, 9 Dec 2021 17:46:08 +0000 (12:46 -0500)]
Fix: lttng-ust-fd: remove lttng_ust_common_ctor call
Introduced by bogus backport of ("fix: liblttng-ust-fd async-signal-safe
close()"). This constructor does not exist in stable-2.12.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iab19505d7269a044b94572975eeba87bb44e9dfa
Mathieu Desnoyers [Thu, 9 Sep 2021 16:49:26 +0000 (12:49 -0400)]
Fix: nestable pthread cancelstate
The pthread cancelstate disable performed to ensure threads are not
cancelled while holding mutexes which are used in library destructors
does not currently support that those mutexes may be nested. It
generates error messages when using the fork and fd helpers when running
with LTTNG_UST_DEBUG=1. The effect of this is that the pthread
cancelstate can be re-enabled too soon when the first unlock is
performed (in a nested lock scenario), thus allowing the thread to be
cancelled while still holding a lock, and causing a deadlock on
application exit.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ife8b1fee04c7d7c480e59bdfc158abdee771994c
Jonathan Rajotte [Mon, 6 Dec 2021 14:20:58 +0000 (09:20 -0500)]
Fix: abort on decrement_sem_count during concurrent tracing start and teardown
Observed issue
==============
The following backtrace has been reported:
#0 __GI_raise (sig=sig@entry=6)
at /usr/src/debug/glibc/2.31/git/sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000007f90b3fdd4 in __GI_abort () at /usr/src/debug/glibc/2.31/git/stdlib/abort.c:79
#2 0x0000007f90b4bf50 in __assert_fail_base (fmt=0x7f90c3da98 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0x7f9112cb90 "uatomic_read(&sem_count) >= count",
file=file@entry=0x7f9112cb30 "/usr/src/debug/lttng-ust/2_2.13.0-r0/lttng-ust-2.13.0/src/lib/lttng-ust/lttng-ust-comm.c",
line=line@entry=664, function=function@entry=0x7f911317e8 <__PRETTY_FUNCTION__.10404> "decrement_sem_count")
at /usr/src/debug/glibc/2.31/git/assert/assert.c:92
#3 0x0000007f90b4bfb4 in __GI___assert_fail (assertion=assertion@entry=0x7f9112cb90 "uatomic_read(&sem_count) >= count",
file=file@entry=0x7f9112cb30 "/usr/src/debug/lttng-ust/2_2.13.0-r0/lttng-ust-2.13.0/src/lib/lttng-ust/lttng-ust-comm.c",
line=line@entry=664, function=function@entry=0x7f911317e8 <__PRETTY_FUNCTION__.10404> "decrement_sem_count")
at /usr/src/debug/glibc/2.31/git/assert/assert.c:101
#4 0x0000007f910e3830 in decrement_sem_count (count=<optimized out>)
at /usr/src/debug/lttng-ust/2_2.13.0-r0/lttng-ust-2.13.0/src/lib/lttng-ust/lttng-ust-comm.c:664
#5 0x0000007f910e5d28 in handle_pending_statedump (sock_info=0x7f9115c608 <global_apps>)
at /usr/src/debug/lttng-ust/2_2.13.0-r0/lttng-ust-2.13.0/src/lib/lttng-ust/lttng-ust-comm.c:737
#6 handle_message (lum=0x7f8dde46d8, sock=3, sock_info=0x7f9115c608 <global_apps>)
at /usr/src/debug/lttng-ust/2_2.13.0-r0/lttng-ust-2.13.0/src/lib/lttng-ust/lttng-ust-comm.c:1410
#7 ust_listener_thread (arg=0x7f9115c608 <global_apps>)
at /usr/src/debug/lttng-ust/2_2.13.0-r0/lttng-ust-2.13.0/src/lib/lttng-ust/lttng-ust-comm.c:2055
#8 0x0000007f90af73e0 in start_thread (arg=0x7fc27a82f6)
at /usr/src/debug/glibc/2.31/git/nptl/pthread_create.c:477
#9 0x0000007f90bead5c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
It turns out that the main thread is at that point iterating over the
libraries destructors:
Thread 3 (LWP 1983):
#0 0x0000007f92a68a0c in _dl_fixup (l=0x7f9054e510, reloc_arg=432)
at /usr/src/debug/glibc/2.31/git/elf/dl-runtime.c:69
#1 0x0000007f92a6ea3c in _dl_runtime_resolve () at ../sysdeps/aarch64/dl-trampoline.S:100
#2 0x0000007f905170f8 in __do_global_dtors_aux () from <....>/crash/work/rootfs/usr/lib/libbsd.so.0
#3 0x0000007f92a697f8 in _dl_fini () at /usr/src/debug/glibc/2.31/git/elf/dl-fini.c:138
#4 0x0000007f90b54864 in __run_exit_handlers (status=0, listp=0x7f90c65648 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
at /usr/src/debug/glibc/2.31/git/stdlib/exit.c:108
#5 0x0000007f90b549f4 in __GI_exit (status=<optimized out>)
at /usr/src/debug/glibc/2.31/git/stdlib/exit.c:139
#6 0x0000000000404c98 in a_function_name (....) at main.c:152
#7 0x0000000000404a98 in main (argc=3, argv=0x7fc27a8858, env=0x7fc27a8878) at main.c:97
Cause
=====
An enable command is processed at the same time that the lttng-ust
destructor is run. At the end of the command handling,
`handle_pending_statedump` is called. Multiple variables from the
`sock_info` struct are checked outside the UST lock at that point.
lttng-ust-comm.c +1406:
/*
* Performed delayed statedump operations outside of the UST
* lock. We need to take the dynamic loader lock before we take
* the UST lock internally within handle_pending_statedump().
*/
handle_pending_statedump(sock_info);
Namely:
registration_done
statedump_pending
initial_statedump_done
`statedump_pending` is set during the enable command
(`lttng_session_statedump`, lttng-events.c +631) in the same thread.
As for `registration_done` and `initial_statedump_done` they are invariant
from the registration of the app until `lttng_ust_cleanup` is called.
`cleanup_sock_info` called by `lttng_ust_cleanup`, itself called by
`lttng_ust_exit` resets the `registration_done` and
`initial_statedump_done` fields. Note that these operations are done
outside of the listener thread.
Note that by that point `lttng_ust_exit` expects all "getters" on
`sock_info` to fail while trying to acquire the UST lock due to
`lttng_ust_comm_should_quit` set to 1. Note that the listener threads
can still exist because we do not join them, we only execute
pthread_cancel which is async.
Clearly we are missing mutual exclusion provided by locking
when accessing `registration_done` and `initial_statedump_done`.
Solution
========
Here we can do better and simply not require any mutual exclusion based on locking.
`registration_done` and `initial_statedump_done` only need to be reset
to zero when we are not actually exiting (`lttng_ust_after_fork_child`).
In this case, no concurrent listener thread exists at that point
that could access those fields during the reset. Hence we can move the
reset to only the non-exiting code path and alleviate the current
situation.
Known drawbacks
===============
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I45ba3eaee20c49a3988837a87fa680ce0a6ed953
Michael Jeanson [Tue, 9 Mar 2021 17:38:06 +0000 (12:38 -0500)]
fix: liblttng-ust-fd async-signal-safe close()
"close(2)" is documented as async-signal-safe (see signal-safety(7)) and
as such our override function should also be. This means we shouldn't do
lazy initialization of the function pointer to the original libc close
symbol in the override function.
If we move the initialization of the function pointer in the library
constructor we risk breaking other constructors that may run before ours
and call close().
A compromise is to explicitly do the initialization in the constructor
but keep a lazy init scheme if close() is called before it is executed.
The dlsym call is now done only once, if it fails the wrappers will
return -1 and set errno to ENOSYS.
Change-Id: I05c66d2f5d51b2022c6803ca215340fb9c00f5a8
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 10 Nov 2021 20:10:18 +0000 (15:10 -0500)]
Set git-review branch to stable-2.12
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7ffd689a7702db6b3b4aecf01deaa22401f3fb4d
Michael Jeanson [Wed, 10 Nov 2021 20:00:34 +0000 (15:00 -0500)]
fix: link benchmark with pthread
In commit
aefa2e417b0d0a3c43cb4d078fa4a2d4cfbf429c we removed '-lpthread'
from the global link flags. Some toolchains don't require to explicitly
link with pthread but some do, add it to the benchmark specific link
flags.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1a52d1f5363a86e17f82bcf56bd2273179c4328e
Jonathan Rajotte [Mon, 8 Nov 2021 20:00:24 +0000 (15:00 -0500)]
Fix: liblttng-ust-ctl have dependencies on liburcu
Observed issue
==============
On a yocto build the liblttng-ust-ctl shared object have dependencies on
liburcu.
(The 'not found's are irrelevant here)
./liblttng-ust-ctl.so: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./liblttng-ust-ctl.so)
linux-vdso.so.1 (0x00007ffc07910000)
libpseudo.so => /root/oe-core/build/tmp-glibc/sysroots-components/x86_64/pseudo-native/usr/lib/pseudo/lib64/libpseudo.so (0x00007f8fa3e0d000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f8fa3c05000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8fa3a01000)
liburcu-bp.so.6 => not found
liburcu-cds.so.6 => not found
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8fa37e2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8fa33f1000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8fa4069000)
On my ubuntu system:
linux-vdso.so.1 (0x00007ffcd20aa000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007faaf5282000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faaf527c000)
libnuma.so.1 => /lib/x86_64-linux-gnu/libnuma.so.1 (0x00007faaf526f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faaf524c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faaf505a000)
/lib64/ld-linux-x86-64.so.2 (0x00007faaf52ed000)
Cause
=====
The default behavior for AC_CHECK_LIB, when the `action-if-found` is NOT
defined, is to prepend the library to LIBS. [1]
"
If action-if-found is not specified, the default action prepends
-llibrary to LIBS and defines ‘HAVE_LIBlibrary’ (in all capitals).
"
It is important to note that the LIBS variable is used for ALL linking.
This is normally not a problem for most distributions since they force
the use of `--as-needed` at the toolchain level (gcc specs) (for example
debian [2]). One could also pass the `--as-needed` flag manually but
libtool reorganize flags in the case of shared object creation [3].
In our case, we normally explicitly state the dependencies via the *_LIBADD
automake clause. We do not rely on the LIBS variable.
For the current tree, when the configure script is run, the LIBS
variable has the following value:
LIBS='-lnuma -lurcu-bp -lurcu-bp -lurcu-cds -lpthread '
The current configure.ac does define what seems to be `empty but
defined` clause for the `action-if-found` as "[]". This is not a valid
"empty but defined" `action-if-find` clause and end up generating the
default behavior as defined in [1]. "[ ]" would be but is supposedly not
the case for all autoconf version.
This leads to unnecessary dependencies for most of the shared objects, at
link time, generated when using a distro that do not enforce the
`--as-needed` flag on linking.
Solution
========
Define an actual no-op shell operation `:` for the `action-if-found`
parameter.
For libnuma define HAVE_LIBNUMA manually. This prevent the prepending to
LIBS while providing the HAVE_LIB* default behavior.
Known drawbacks
=========
None.
References
==========
[1] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Libraries.html
[2] https://salsa.debian.org/toolchain-team/gcc/-/blob/master/debian/patches/gcc-as-needed.diff
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=347650
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8fbb41eb85a98800902fbd23bfbea3ed41faaf26
Simon Marchi [Thu, 2 Sep 2021 02:30:02 +0000 (22:30 -0400)]
Fix: add extern "C" to two header files
These are needed to build some lttng-tools binary as C++ programs.
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iba97d9cc52f86fd01cc24111c53a85340595e4c4
Mathieu Desnoyers [Fri, 14 May 2021 19:02:08 +0000 (15:02 -0400)]
Version 2.12.2
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I787648f1b103a8d18150e64a09a57104db293004
Mathieu Desnoyers [Wed, 31 Mar 2021 14:27:32 +0000 (10:27 -0400)]
tests: benchmark: improve benchmark scalability accuracy
Testing with a fixed number of loops per-thread only works if the
workload is distributed perfectly across CPUs. For instance, if a lock
is held in the workload (e.g. internally by open() and close()), those
may cause starvation of some threads, and therefore cause the benchmark
to be wrong because it will wait for the slowest thread to complete its
loops.
It is also not good for testing overcommit of threads vs cpus.
Change the test to report the number of loops performed in a given wall
time, and use this to report the average and std.dev. of tracing
overhead per event on each active CPU.
Change the benchmark workload to be only CPU-bound and not generate
system calls to minimize the inherent non-scalability of the workload
(e.g. locks held within the kernel).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5245f36831875bd9f87854618a4ed0cb31e56a4d
Mathieu Desnoyers [Wed, 31 Mar 2021 00:05:44 +0000 (20:05 -0400)]
tests: benchmark: use cpu-bound workload, calculate average and std.dev.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If5cd9d920dd0cfe0e235b5386359080bc99efbb2
Mathieu Desnoyers [Tue, 30 Mar 2021 20:33:26 +0000 (16:33 -0400)]
tests: improve benchmark script
Kill session daemon after script, conditionally drop caches if root
(else it is forbidden to do so), handle cleanup on trap, use snapshot
(flight recorder) tracing to benchmark ring buffer, output the result in
nanoseconds, removing trailing numbers after dot.
Before this, the test was pretty much always resulting in an output of
"0s" extra overhead due to truncation of significant numbers in the
output.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia7844f3a52821da972b24bf9f39158c24329f63a
Mathieu Desnoyers [Thu, 25 Mar 2021 18:28:20 +0000 (14:28 -0400)]
compiler warning cleanup: is_signed_type: compare -1 to 1
Comparing -1 to 0 triggers compiler warnings (gcc -Wtype-limits and
-Wbool-compare) and Coverity warning "Macro compares unsigned to 0".
Comparing -1 to 1 instead takes care of silencing those warnings while
keeping the same behavior.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5003ce1f80d34ca6713bab43e8114a23f2d8c1d4
Mathieu Desnoyers [Mon, 22 Mar 2021 21:32:59 +0000 (17:32 -0400)]
Fix: properly compare type enumeration
Fixes: b82404da07 ("Fix: bytecode linker: validate event and field array/sequence encoding")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4a430d36ee08dc65242a5e6ca6bd0a6a689df873
Michael Jeanson [Thu, 18 Mar 2021 20:25:34 +0000 (16:25 -0400)]
fix: use the configured cmake binary
Use the configure cmake binary in the examples Makefile instead of a
direct call.
Change-Id: Idb66642345e551b427e359b9cd58f49aad2a8502
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 22 Mar 2021 18:23:44 +0000 (14:23 -0400)]
Fix: bytecode linker: validate event and field array/sequence encoding
The bytecode linker should only allow linking filter expressions loading
fields which are string-encoded arrays and sequence for comparison
against a string, and reject arrays and sequences without encoding, so
the filter interpreter does not attempt to load non-NULL terminated
arrays/sequences as if they were strings.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic13fbbb0d601eddbb7d98f4a5e13fe3f45612fd8
Michael Jeanson [Wed, 24 Feb 2021 20:12:12 +0000 (15:12 -0500)]
fix: unix socket peercred on FreeBSD
Include 'sys/un.h' for LOCAL_PEERCRED.
The uid member of 'struct xucred' is 'cr_uid'.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I6d94e5ad53292401afebeae0cbbba82e7568a847
Mathieu Desnoyers [Thu, 25 Feb 2021 17:40:48 +0000 (12:40 -0500)]
Fix: uninitialized variable in lib_ring_buffer_channel_switch_timer_start
Found by Coverity:
** CID
1447027: Uninitialized variables (UNINIT)
/libringbuffer/ring_buffer_frontend.c: 810 in lib_ring_buffer_channel_switch_timer_start()
>>> CID
1447027: Uninitialized variables (UNINIT)
>>> Using uninitialized value "sev". Field "sev._sigev_un" is uninitialized when calling "timer_create".
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iaa66e5612ff1c51b50c4c0b8f30f3695e1b03153
Mathieu Desnoyers [Mon, 12 Oct 2020 20:52:03 +0000 (16:52 -0400)]
Fix: Use unix socket peercred for pid, uid, gid credentials
Currently, the session daemon trust the pid, ppid, uid, and gid values
passed by the application, but should really validate the uid using unix
socket peercred. This fix uses the peercred values rather than the
values provided by the application on registration for:
- pid, uid and gid on Linux,
- uid and gid on FreeBSD.
This should improve how the session daemon deals with containerized
applications on Linux as well. Applications are required to be either in
the same pid namespace, or in a pid namespace nested within the pid
namespace of the lttng-sessiond, so the session daemon can map the
application pid to something meaningful within its own pid namespace.
Applications in a unrelated (disjoint) pid namespace will be refused by
the session daemon.
About the uid and gid with user namespaces on Linux, those will provide
meaningful IDs if the application user namespace is either the same as
the user namespace of the session daemon, or a nested user namespace.
Otherwise, the IDs will be that of /proc/sys/kernel/overflowuid and
/proc/sys/kernel/overflowgid, which typically maps to nobody.nogroup on
current distributions.
Given that fetching the parent pid (ppid) of the application would
require to use /proc/<pid>/status (which is racy wrt pid reuse), expose
the ppid provided by the application on registration instead, but only
in situations where the application sits in the same pid namespace as
the session daemon (on Linux), which is detected by checking if the pid
provided by the application matches the pid obtained using unix socket
credentials. The ppid is only used for logging/debugging purposes in the
session daemon anyway, so it is OK to use the value provided by the
application for that purpose.
Fixes: #1286
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I94742e57dad642106908d09e2c7e395993c2c48f
Mathieu Desnoyers [Tue, 15 Dec 2020 14:01:38 +0000 (09:01 -0500)]
Fix: stream fd leaks on error
Use a regular pattern for all commands:
If the command callback takes ownership of a pointer or file descriptor,
it sets them to NULL or -1. Therefore, the caller can always try to free
the pointer, or close it if it is greater or equal to 0.
This eliminates memory and fd leaks on error.
Change-Id: I447129ab1672ce4fc6cf3c0baf18dbf27bfcfaf8
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 15 Dec 2020 13:48:59 +0000 (08:48 -0500)]
Fix: channel leak on error
Use a regular pattern for all commands:
If the command callback takes ownership of a pointer or file descriptor,
it sets them to NULL or -1. Therefore, the caller can always try to free
the pointer, or close it if it is greater or equal to 0.
This eliminates memory and fd leaks on error.
Change-Id: I0f9cba5fc0f4c095c8ec8f3e8970de8a10386876
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 15 Dec 2020 14:04:16 +0000 (09:04 -0500)]
Fix: lttng_abi_map_channel should be static
Change-Id: If0434525f083dc20b50a7628926dadb483d30c88
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 17 Feb 2021 20:14:22 +0000 (15:14 -0500)]
Version 2.12.1
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id0c0c121d8694ab7aaa6aae473559766e03baa29
Mathieu Desnoyers [Sun, 27 Dec 2020 19:48:46 +0000 (14:48 -0500)]
Fix: Use default visibility for tracepoint provider symbol
When building a probe provider `someprobe` with -fvisibility=hidden into
a shared library, the `__tracepoint_provider_someprobe` symbol is hidden,
which does not allow tracepoint instrumentation to link to it.
Fix this by using the "default" visibility attribute.
For a shared library built with -fvisibility=hidden, this changes the
output of nm for that symbol from:
000000000000417c b __tracepoint_provider_someprobe
(local BSS symbol)
to
000000000000417c B __tracepoint_provider_someprobe
(global (external) BSS symbol)
Fixes: #1296
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I308374f6cac58cca86e8eb19c872286a3da21a75
Jérémie Galarneau [Mon, 11 Jan 2021 17:09:05 +0000 (12:09 -0500)]
Fix: configure: support Autoconf 2.70
The newly-released autoconf 2.70 introduces a number of breaking
changes [1] and is being rolled-out by some distros.
Amongst those changes, the AC_PROG_CC_STDC macro is marked as obsolete
and was merged into AC_PROG_CC, which we already use. On 2.70, this
results in a warning which we handle as an error.
A version check is added to invoke the AC_PROG_CC_STDC macro only when
running a pre-2.70 version of autoconf, fixing the issue.
[1] https://lwn.net/Articles/839395/
[ Backport by Mathieu Desnoyers: removed ax_pthread.m4 bits which are
not present in stable-2.12 and prior. ]
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2f4eac788638924aa4960b54374d09de73957925
Christophe Bedard [Sun, 21 Jun 2020 17:54:30 +0000 (13:54 -0400)]
Fix: ust-tracepoint: make sure to expand tracepoint provider token
Using a #defined TRACEPOINT_PROVIDER with the
TRACEPOINT_EVENT_{CLASS,INSTANCE}() macros led to compilation errors,
e.g.:
gcc -I/home/chris/lttng-mwe/app/../lib -c myapp.c
gcc -o myapp myapp.o -L/home/chris/lttng-mwe/app/../lib -Wl,-rpath=/home/chris/lttng-mwe/app/../lib -lmytps -ldl
/usr/bin/ld: /home/chris/lttng-mwe/app/../lib/libmytps.so: undefined reference to `__tracepoint_provider_mismatch_TRACEPOINT_PROVIDER'
collect2: error: ld returned 1 exit status
make: *** [Makefile:9: myapp] Error 1
Use intermediate TRACEPOINT_EVENT_{CLASS,INSTANCE}() macros to make sure
that the tracepoint provider name token gets expanded.
Fixes: #1273
Signed-off-by: Christophe Bedard <bedard.christophe@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I68550a510c809f1d5f86a7e1d966dcecc3780dda
Mathieu Desnoyers [Tue, 22 Sep 2020 20:38:51 +0000 (16:38 -0400)]
Fix: ustctl_release_object: eliminate double-close/free on error
When ustctl_release_object returns an error, it is unclear to the caller
what close/free side effects were effectively performed and which were
not. So the only courses of action are to either leak file descriptors
or memory, or call ustctl_release_object again which can trigger double
close or double free.
Fix this by setting the file descriptors to -1 after successful close,
and pointers to NULL after successful free.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 13 Jul 2020 19:22:35 +0000 (15:22 -0400)]
tests: return the proper TAP exit code
The C TAP library provides the 'exit_status()' function that will return
the proper exit code according to the number of tests that succeeded or
failed.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I786527dfa9cfe2d1a7c8bc80086d54186f60b4d9
Jonathan Rajotte [Wed, 17 Jun 2020 22:40:22 +0000 (18:40 -0400)]
Fix: python agent: 'time' has no attribute 'clock'
The time.clock() function was removed in python 3.8 and is marked as
deprecated since python 3.3. See PEP 418 for more details [1].
Solution
=====
When the python version is greater than 3.2 use the
`time.perf_counter()` function [2]. Otherwise, fall back to
`time.clock()`.
Introduce a compat module to the lttngust agent package providing the
`_clock` function.
[1] https://www.python.org/dev/peps/pep-0418/
[2] https://docs.python.org/3/library/time.html#time.perf_counter
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3d6d8b24309d45d43b634dc2a6b4d5dbc12da3aa
Christophe Bedard [Mon, 1 Jun 2020 20:11:08 +0000 (16:11 -0400)]
Fix: libc-wrapper: undef temporary token rather than value
The lttng-ust malloc wrapper defines pthread_mutex_lock/unlock
preprocessor tokens to ust_malloc_spin_lock/unlock around the definition
of a TLS variable, which uses pthread mutexes when relying on the
pthread key fallback. Undefining those tokens should be done on the
preprocessor token, rather than its value.
Signed-off-by: Christophe Bedard <bedard.christophe@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2053b79c88000e272c29b25ca105b1352ecfabd7
Michael Jeanson [Fri, 26 Jun 2020 19:44:20 +0000 (15:44 -0400)]
Fix: support compile units including 'sys/sdt.h' without defining SDT_USE_VARIADIC
Instead of using SDT_USE_VARIADIC from 'sys/sdt.h', use our own namespaced
macros since the instrumented application might already have included
'sys/sdt.h' without variadic support.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Idfece1a65a5296a90d33370bc8d73ea554c14b0f
Francis Deslauriers [Tue, 5 May 2020 15:51:58 +0000 (11:51 -0400)]
Fix: event probes attached before event enabled
Background
==========
When userspace events with exclusions are enabled by the CLI user, the
session daemon enables the events in a 3-steps process using the
following ustctl commands:
1. `LTTNG_UST_EVENT` to create an event and get an handle on it,
2. `LTTNG_UST_EXCLUSION` to attach exclusions, and
3. `LTTNG_UST_ENABLE` to activate the tracing of this event.
Also, the session daemon uses the `LTTNG_UST_SESSION_START` to start the
tracing of events. In various use cases, this can happen before OR after
the 3-steps process above.
Scenario
========
If the`LTTNG_UST_SESSION_START` is done before the 3-steps process the
tracer will end up not considering the exclusions and trace all events
matching the event name glob pattern provided at step #1.
This is due to the fact that when we do step #1, we end up calling
`lttng_session_lazy_sync_enablers()`. This function will sync the event
enablers if the session is active (which it is in our scenario because
of the _SESSION_START).
Syncing the enablers will then match event name glob patterns provided
by the user to the event descriptors of the tracepoints and attach
probes in their corresponding callsite on matches.
All of this is done before we even received the exclusions in step #2.
Problem
=======
The problem is that we attach probes to tracepoints before being sure we
received the entire description of the events.
Step #2 may reduced the set of tracepoints to trace so we must wait
until exclusions are all received to attached probes to tracepoints.
Solution
========
Only match event names and exclusions (and ultimately, register probes)
on enabled enablers to ensure that no other modifications will be
applied to the event.
Event enablers are enabled when at step #3.
Note
====
Filters are not causing problems here because adding a filter won't
change what tracepoints are to be traced.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id984f266d976f346b001db81cd8a2b74965b5ef2
Mathieu Desnoyers [Thu, 23 Apr 2020 16:51:40 +0000 (12:51 -0400)]
Fix: agent: read: end of loop condition should exclude 0
Causes the agent to hang endlessly waiting for completion of the
loop.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 22 Apr 2020 20:32:51 +0000 (16:32 -0400)]
fix: Java examples CLASSPATH override
Variables provided to make as arguments, called 'command variables' can't
be reassigned a new value in the Makefile. Since the CLASSPATH is now
passed this way in the glue between automake and the example makefiles,
use different names for the internal variables.
Change-Id: Id6289273f211f544a66d933a96f06df75243415f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 22 Apr 2020 18:07:46 +0000 (14:07 -0400)]
Add missing files to .gitignore
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5050772c58342d3eb956e0cbd56983fbb1ed3b70
Michael Jeanson [Wed, 22 Apr 2020 18:37:55 +0000 (14:37 -0400)]
fix: Add CLASSPATH to autoconf precious variables
This will record the value of CLASSPATH in the autoconf cache and ensure
it's used when calling make even if it's not exported to the
environment.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia06070dd352f9d3b6956a1ec0f4eb1d024f73179
Mathieu Desnoyers [Wed, 22 Apr 2020 18:05:41 +0000 (14:05 -0400)]
Fix: Java agent: close session daemon socket on error
When catching an error, close the socket used to communicate with the
session daemon rather than leaking it, before retrying to connect.
Also, when sleep is interrupted, there is no point in printing
out the stack trace. Just retry connection immediately.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 22 Apr 2020 17:30:40 +0000 (13:30 -0400)]
Fix: Java agent: handle partial payload read
When reading from a TCP socket, there is no guarantee that the
read will return all the requested data. We need to loop and continue
reading until we gather all the expected data.
This fixes flakiness of the lttng-ust-java-tests.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 9 Apr 2020 15:29:18 +0000 (11:29 -0400)]
Fix: namespace contexts !CONFIG_RCU_TLS variable initialization
The namespace contexts introduced in lttng-ust 2.12 require to
initialize TLS variable to nonzero values. However, in !CONFIG_RCU_TLS
(compatibility mode using pthread setspecific), this initialization
does not build.
Use the new DEFINE_URCU_TLS_INIT from liburcu when building
!CONFIG_RCU_TLS to fix this issue. Since this requires a dependency on
a new liburcu version, only !CONFIG_RCU_TLS adds this dependency in the
fix. A followup cleanup patch will use DEFINE_URCU_TLS_INIT as we add
a strict version dependency.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 8 Apr 2020 13:30:27 +0000 (09:30 -0400)]
Version 2.12.0
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 27 Mar 2020 14:29:52 +0000 (10:29 -0400)]
Version 2.12.0-rc3
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Mon, 9 Mar 2020 23:24:07 +0000 (19:24 -0400)]
Fix: v.u.d might be uninitialized
lttng-filter-interpreter.c:301:17: warning: ‘v.u.d’ may be used uninitialized in this function [-Wmaybe-uninitialized]
ptr->u.d = v.u.d;
Indeed it seems that the value is never fetched if we compare to other
call sites and the dynamic call site further below.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I542d6db7514a2c39afe3613228577f34f194d672
Mathieu Desnoyers [Wed, 4 Mar 2020 21:11:39 +0000 (16:11 -0500)]
Version 2.12.0-rc2
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Mon, 2 Mar 2020 19:21:33 +0000 (14:21 -0500)]
Fix: set FD_CLOEXEC on incoming FDs.
The stream shm FDs are allocated by the consumer process, and then
passed to the applications over unix sockets. When opening those
file descriptors on reception, the FD_CLOEXEC flag is not set.
In a fork + exec scenario, parent process streams shm FDs and channel
wake FDs are present in the resulting child process.
Set FD_CLOEXEC on reception (ustcomm_recv_fds_unix_sock) to
prevent such scenario.
Change-Id: Id58077b272be9c1ab239846639ffd8103b3d50f1
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 18 Feb 2020 00:25:01 +0000 (19:25 -0500)]
Fix: tracepoint.h: Disable address sanitizer on pointer array section variables
The tracepoint header declares pointer global variables meant to be
placed contiguously within the __tracepoints_ptrs section, and then used
as an array of pointers when loading an executable or shared object.
Clang Address Sanitizer adds redzones around each variable, thus leading to
detection of a global buffer overflow.
Those redzones should not be placed within this section, because it
defeats its purpose. Therefore, teach asan not to add redzones
around those variables with an attribute.
Note that there does not appear to be any issue with gcc (tested with
gcc-8 with address sanitization enabled), and gcc ignores the
no_sanitize_address attribute when applied to a global variable.
Fixes: #1238
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 18 Feb 2020 00:31:41 +0000 (19:31 -0500)]
Fix: jhash.h: remove out-of-bound reads
jhash.h implements "special" code for valgrind because it reads memory
out-of-bound (and then applies a mask) when reading strings.
Considering that lttng-ust does not use jhash.h in a fast-path, remove
this "optimization" and use the verifiable VALGRIND code instead. This
fixes an ASan splat.
Fixes: #1238
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Mon, 17 Feb 2020 15:46:45 +0000 (10:46 -0500)]
Cleanup: remove trailing white spaces across project
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I15202338465ee56d33316cbc632d9e3bf44ee31e
Francis Deslauriers [Tue, 26 Nov 2019 16:16:30 +0000 (11:16 -0500)]
Fix: lttng-ust-comm.c: return number of fd rather size of array
There are two conflicting comments for this function. One says it
returns the size of the received data and the other says it returns the
number of fd received.
It's more useful to receive the number of fd.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I74084b461d396c3e623fa55100e6dd7e59dbea83
Francis Deslauriers [Fri, 6 Dec 2019 22:49:35 +0000 (17:49 -0500)]
liblttng-ust: exit loop early on event enabler match
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0fa3215f7cd6a2d32ac00d66cf5fc184abd14612
Francis Deslauriers [Wed, 27 Nov 2019 20:16:06 +0000 (15:16 -0500)]
Add git-review config
Add .gitreview for contributors wishing to use gerrit for patch
reviews.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3ba853d0d68a660ac5d2d24c757e0f28b634977c
This page took 0.054 seconds and 4 git commands to generate.