Mathieu Desnoyers [Fri, 12 Jul 2024 15:35:41 +0000 (11:35 -0400)]
Temporarily Revert "Introduce sync vs unsync enablers"
This reverts commit 8a5c7efa50f9dce0360611b16323462c77f07321 in
preparation for merging the Trace Hit Counters feature.
This commit will be reintroduced later on top.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If335556a1a6b285edcf0b3ff8a61ddb2831f018c
Mathieu Desnoyers [Tue, 11 Jun 2024 20:56:07 +0000 (16:56 -0400)]
Fix: test_benchmark: do not match "CPU(s) scaling MHz:"
Do not match "CPU(s) scaling MHz:" line, it breaks the script.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifa392d018590e098dae75acef2b8265c8714c4cb
Mathieu Desnoyers [Thu, 9 May 2024 19:09:17 +0000 (15:09 -0400)]
ust-fd: Add close_range declaration
Old libc headers do not contain a declaration of close_range(). Emit our
own declaration to prevent compiler warnings.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If6ca8193895efbb6ce1ba46e092939b8099bcff6
Mathieu Desnoyers [Thu, 2 May 2024 21:22:14 +0000 (17:22 -0400)]
Rename "tsc" to "timestamp"
Naming timestamps "TSC" or "tsc" is an historical artefact dating from
the implementation of libringbuffer, where the initial intent was to use
the x86 "rdtsc" instruction directly, which ended up not being what was
done in reality.
Rename uses of "TSC" and "tsc" to "timestamp" to clarify things and to
avoid requiring reviewers to be fluent in the x86 instruction set.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8e7e2ad9cd2d2427485fc6adbc340fccde14ca2f
Kienan Stewart [Thu, 2 May 2024 20:51:45 +0000 (16:51 -0400)]
docs: Correct GitHub URLs in lttng-ust.3
The branches follow the format `stable-X.YZ` rather than `vX.YZ`.
Furthermore, when rendering the man pages from source, the URLs were
omitted completely as the substitution `{lttng_version}` was not
defined. This hasn't been an issue for the published HTML versions as
those are produced via a different script in the `lttng-www` project
which presumably sets the substitution properly.
Change-Id: Ib96c99df13ddf724e128f95e7ce7c74b2c10c766
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 2 May 2024 14:41:49 +0000 (10:41 -0400)]
fix: handle EINTR correctly in get_cpu_mask_from_sysfs
If the read() in get_cpu_mask_from_sysfs() fails with EINTR, the code is
supposed to retry, but the while loop condition has (bytes_read > 0),
which is false when read() fails with EINTR. The result is that the code
exits the loop, having only read part of the string.
Use (bytes_read != 0) in the while loop condition instead, since the
(bytes_read < 0) case is already handled in the loop.
Original fix in liburcu from Benjamin Marzinski <bmarzins@redhat.com>:
commit 9922f33e2986 ("fix: handle EINTR correctly in get_cpu_mask_from_sysfs")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I885a0fb98e5a7cfb9a8bd180c8e64b20926ff58c
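The loop-condition problem described above can be sketched as follows. This is a minimal illustration, not the actual get_cpu_mask_from_sysfs() code: the helper name `read_retry_eintr` and its signature are assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the LTTng-UST function): read up to
 * max_bytes from fd into buf, retrying when read() is interrupted.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_retry_eintr(int fd, char *buf, size_t max_bytes)
{
	size_t total = 0;
	ssize_t bytes_read;

	do {
		bytes_read = read(fd, buf + total, max_bytes - total);
		if (bytes_read < 0) {
			if (errno == EINTR)
				continue;	/* jumps to the loop condition below */
			return -1;
		}
		total += (size_t) bytes_read;
	/*
	 * With (bytes_read > 0) here, the EINTR "continue" above would
	 * leave the loop, because bytes_read is -1 at that point.
	 */
	} while (total < max_bytes && bytes_read != 0);

	return (ssize_t) total;
}
```

In a do/while loop, `continue` evaluates the condition before starting the next iteration, which is why the condition must let a negative `bytes_read` through.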
Mathieu Desnoyers [Wed, 20 Mar 2024 20:47:39 +0000 (16:47 -0400)]
Introduce LTTNG_UST_MAP_POPULATE_POLICY environment variable
Problem Statement
-----------------
commit 4d4838bad480 ("Use MAP_POPULATE to reduce pagefault when available")
was first introduced in tag v2.11.0 and never backported to stable
branches. Its purpose was to reduce the tracer fast-path latency caused
by handling minor page faults the first time a given application writes
to each page of the ring buffer after mapping them. The discussion
thread leading to this commit can be found here [1]. When using
LTTng-UST for diagnosing real-time applications with very strict
constraints, this added latency is unwanted.
That commit introduced the MAP_POPULATE flag when mapping the ring
buffer pages, which causes the kernel to pre-populate the page table
entries (PTE).
This has, however, unintended consequences for the following scenarios:
* Short-lived applications which write very little to the ring buffer end
up taking more time to start, because of the time it takes to
pre-populate all the ring buffer pages, even though they typically won't
be used by the application.
* Containerized workloads using cpusets will also end up having longer
application startup time than strictly required, and will populate
PTE for ring buffers of CPUs which are not present in the cpuset.
There are, therefore, two sets of irreconcilable requirements:
short-lived and containerized workloads benefit from lazily populating
the PTE, whereas real-time workloads benefit from pre-populating them.
This will therefore require a tunable environment variable that will let
the end-user choose the behavior for each application.
Solution
--------
Allow users to specify whether they want to pre-populate
shared memory pages within the application with an environment
variable.
LTTNG_UST_MAP_POPULATE_POLICY
    If set, override the policy used to populate shared memory pages within the
    application. The expected values are:
    none
        Do not pre-populate any pages, take minor faults on first access while
        tracing.
    cpu_possible
        Pre-populate pages for all possible CPUs in the system, as listed by
        /sys/devices/system/cpu/possible.
    Default: none. If the policy is unknown, use the default.
Choice of the default
---------------------
Given that users with strict real-time constraints already have to set up
their tracing with specific options (see the "--read-timer"
lttng-enable-channel(3) option [2]), it makes sense that the default
is to lazily populate the ring buffer PTE, and require users with
real-time constraints to explicitly enable the pre-populate through an
environment variable.
Effect on default behavior
--------------------------
The default behavior for ring buffer PTE mapping will be changing across
LTTng-UST versions in the following way:
- 2.10 and earlier: lazily populate PTE,
- 2.11-2.13: pre-populate PTE,
- 2.14: lazily populate PTE.
LTTng-UST 2.14 will revert to the 2.10 lazy populate scheme by
default.
[1] https://lists.lttng.org/pipermail/lttng-dev/2019-July/thread.html#29094
[2] https://lttng.org/docs/v2.13/#doc-channel-timers
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I6743b08cd1fe0d956caaf6aad63005555bb9640e
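As a rough sketch of the policy described above, the value of the environment variable could be mapped to mmap(2) flags along these lines. The enum, the function names, and the use of MAP_SHARED as the base flag are assumptions for illustration; they are not taken from the LTTng-UST sources.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

enum populate_policy {
	POPULATE_NONE,		/* lazily populate, the default */
	POPULATE_CPU_POSSIBLE,	/* pre-populate for all possible CPUs */
};

/* Hypothetical parser: unset or unknown values fall back to "none". */
static enum populate_policy parse_populate_policy(const char *value)
{
	if (value && strcmp(value, "cpu_possible") == 0)
		return POPULATE_CPU_POSSIBLE;
	return POPULATE_NONE;
}

/* Hypothetical helper: choose the mmap() flags for a ring buffer mapping. */
static int mmap_flags_for_policy(enum populate_policy policy)
{
	int flags = MAP_SHARED;

#ifdef MAP_POPULATE
	if (policy == POPULATE_CPU_POSSIBLE)
		flags |= MAP_POPULATE;
#endif
	return flags;
}
```

A caller would typically pass `getenv("LTTNG_UST_MAP_POPULATE_POLICY")` to the parser once at startup and reuse the result for every mapping.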
Mathieu Desnoyers [Thu, 18 Apr 2024 15:25:55 +0000 (11:25 -0400)]
Add close_range wrapper to liblttng-ust-fd.so
glibc 2.34 implements close_range(2), which is used by the ssh client
(amongst others). This needs to be overridden to make sure ssh does not
close lttng-ust file descriptors.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic4e0046499e1f010395aec71a48316b9d1e9bf3f
Kienan Stewart [Thu, 14 Mar 2024 15:39:12 +0000 (11:39 -0400)]
docs: Add supported versions and fix-backport policy
Change-Id: I9ec43912652fc713484959e9315765f7e9d29a3e
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 9 Feb 2024 19:48:29 +0000 (14:48 -0500)]
docs: Add cases in which tracepoints in ctors/dtors may not work
Change-Id: I52666810322e26b3841ea1bca6f588b6c3e6f3f8
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Olivier Dion [Thu, 21 Mar 2024 18:51:07 +0000 (14:51 -0400)]
ust-tracepoint-event: Add static check of sequences length type
Enforce required unsigned type for length of sequence at compile time.
Change-Id: Ia8668a80eb0c0b81e8c03b208d7581e34af313fd
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
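One common way to enforce an unsigned type at compile time is sketched below. The macro names are hypothetical and this is not necessarily how the LTTng-UST headers implement the check; it only illustrates the technique with C11 `_Static_assert`.

```c
#include <stddef.h>

/* Hypothetical macro: true when converting -1 to the type yields a positive value. */
#define IS_UNSIGNED_TYPE(type)	((type) -1 > (type) 0)

/* Hypothetical macro: reject signed length types at compile time. */
#define ASSERT_UNSIGNED_LEN_TYPE(type)				\
	_Static_assert(IS_UNSIGNED_TYPE(type),			\
		"sequence length type must be unsigned")

ASSERT_UNSIGNED_LEN_TYPE(unsigned int);
ASSERT_UNSIGNED_LEN_TYPE(size_t);
/* ASSERT_UNSIGNED_LEN_TYPE(int); would fail to compile. */
```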
Olivier Dion [Thu, 21 Mar 2024 18:42:13 +0000 (14:42 -0400)]
lttng-ust(3): Fix wrong len_type for sequence
`len_type' of a sequence field must be an unsigned integer type. Some
examples provided in the man page incorrectly used a signed integer
type, which compiles correctly but causes errors while decoding.
Change-Id: Icc685b330d0704660b36f703075f453d71c5e4cb
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 9 Feb 2024 19:30:48 +0000 (14:30 -0500)]
python: log exception details when agent thread cannot start
Change-Id: If9d58f066d513f63428bbc07190a956571532655
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 9 Feb 2024 19:18:59 +0000 (14:18 -0500)]
Fix: python lttngust agent fails when LTTNG_UST_APP_PATH is not set
Observed issue
==============
lttng-tools `tests/regression/ust/python-logging/test_python_logging`
had the following failures:
```
not ok 14 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 27 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 40 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 53 - Found 0 / 5 events matching 'python-ev-test1' out of 0 events
not ok 66 - Found 0 / 1 events matching 'python-ev-test2' amongst 0 events
not ok 74 - Found 0 / 1 events matching 'python-ev-test2' amongst 0 events
not ok 82 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 98 - Found 0 / 1 events matching 'python-ev-test2' amongst 0 events
not ok 109 - Found 0 / 5 events matching 'python-ev-test1' out of 0 events
not ok 115 - Found 0 events matching 'python-ev-test1'
not ok 121 - Found 0 / 1 events matching 'python-ev-test2' amongst 0 events
not ok 127 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 134 - Found 0 / 10 events matching 'python-ev-test1' amongst 0 events
not ok 140 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 146 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
not ok 157 - Found 0 / 5 events matching 'python-ev-test1' amongst 0 events
```
Cause
=====
When the use of `LTTNG_UST_APP_PATH` was introduced[1], no default
value for `ust_app_port` was set. In the case where
`LTTNG_UST_APP_PATH` is not set in the environment the condition for
starting with the `ust_app_port` is still checked, causing the
following exception:
```
[2559145.907503] LTTng-UST warning: _init_threads(): cannot create client threads: cannot access local variable 'ust_app_port' where it is not associated with a value
```
Solution
========
Provide a known default value for `ust_app_port`.
Known drawbacks
===============
None.
References
==========
[1]: https://github.com/lttng/lttng-ust/commit/c0f6fb054d2f16518d047a6adf7e8aa81eff5403
Change-Id: I92242ccd056dd91505156e4e8df812639eaef570
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 27 Nov 2023 17:02:40 +0000 (12:02 -0500)]
Add initial support for the multiple LTTNG_UST_APP_PATHs
The `$LTTNG_UST_APP_PATH` is split using ':' as a separator. There is
no provision for escaping the ':' separator.
Paths after the first path will be ignored for the moment and a
warning emitted.
Change-Id: I619a3578e00fd3c758d616b99b443fc15a1477df
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
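The splitting behaviour described above could look roughly like this. The helper name `first_app_path` and the warning text are assumptions for illustration, not the code from this commit.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of the first
 * ':'-separated path in value, warning when further paths follow.
 * The caller frees the result. Returns NULL on allocation failure.
 */
static char *first_app_path(const char *value)
{
	const char *sep = strchr(value, ':');
	size_t len = sep ? (size_t) (sep - value) : strlen(value);
	char *path = malloc(len + 1);

	if (!path)
		return NULL;
	memcpy(path, value, len);
	path[len] = '\0';
	if (sep && sep[1] != '\0')
		fprintf(stderr, "warning: ignoring paths after '%s'\n", path);
	return path;
}
```

Because there is no escaping, a directory whose name contains ':' cannot be expressed with this scheme.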
Kienan Stewart [Tue, 28 Nov 2023 19:39:23 +0000 (14:39 -0500)]
Fix java client connection path when LTTNG_UST_APP_PATH is set
When LTTNG_UST_CTL_PATH is set for `lttng-sessiond`, the agent port is
at `$LTTNG_UST_CTL_PATH/agent.port`, not
`$LTTNG_UST_CTL_PATH/.lttng/agent.port`.
Change-Id: I79419f36cbd802da06acd68f58e437b0d4eb3856
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 20 Oct 2023 19:20:45 +0000 (15:20 -0400)]
Introduce LTTNG_UST_APP_PATH environment variable
Introduce an environment variable to specify a path under which unix sockets
used for the communication between the application (tracee) instrumented
with `liblttng-ust` and the LTTng session and consumer daemons (part of
the LTTng-tools project) are located. When `$LTTNG_UST_APP_PATH` is
specified, only this path is considered for connecting to a session
daemon. Setting this environment variable disables connection to root
and per-user session daemons.
The `$LTTNG_UST_APP_PATH` target directory must exist and be accessible
by the user before the application is executed for tracing to work.
This environment variable affects the Java and Python agents in the same
way.
This environment variable on the LTTng-UST application side is meant to
be used with a new LTTNG_UST_CTL_PATH on the lttng sessiond side.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4784f4565514a9771827603bd0bebabbeb37a7ad
Mathieu Desnoyers [Fri, 20 Oct 2023 18:52:05 +0000 (14:52 -0400)]
Rename "global" sock_info field to "multi_user"
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ife23f632db641e476e2e059031a2f956af4da72d
Mathieu Desnoyers [Mon, 15 Jan 2024 18:36:29 +0000 (13:36 -0500)]
Fix: libc wrapper: use initial-exec for malloc_nesting TLS
Use the initial-exec TLS model for the malloc_nesting nesting guard
variable to ensure that the glibc implementation of the TLS access doesn't
trigger infinite recursion by calling the memory allocator wrapper
functions, which can happen with global-dynamic.
Considering that the libc wrapper is meant to be loaded with LD_PRELOAD
anyway (never with dlopen(3)), we always expect the libc to have enough
space to hold the malloc_nesting variable.
In addition to changing malloc_nesting from global-dynamic to
initial-exec, this removes the URCU TLS compatibility layer from the
libc wrapper, which is a good thing: this compatibility layer relies
on pthread key and calloc internally, which makes it a bad fit for TLS
accesses guarding access to malloc wrappers, due to possible infinite
recursion.
Link: https://lists.lttng.org/pipermail/lttng-dev/2024-January/030697.html
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I72c42bc09c1a06e2922b184b85abeb9c94200ee2
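A minimal sketch of a nesting guard declared with the initial-exec TLS model follows. The `tls_model` attribute is a GCC/Clang extension; the wrapper and counter names are hypothetical, and the real wrappers override malloc() itself rather than exposing a separate function.

```c
#include <stdlib.h>

/*
 * Nesting guard using the initial-exec TLS model: the access is a
 * fixed offset from the thread pointer and does not call into the
 * dynamic TLS allocation path.
 */
static __thread __attribute__((tls_model("initial-exec"))) int nesting;

static int traced_calls;	/* hypothetical stand-in for emitting an event */

/* Hypothetical wrapper illustrating the guard. */
static void *wrapped_malloc(size_t size)
{
	void *ptr;

	nesting++;
	ptr = malloc(size);
	if (nesting == 1)
		traced_calls++;	/* only trace the outermost call */
	nesting--;
	return ptr;
}
```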
Michael Jeanson [Thu, 6 Jul 2023 18:40:01 +0000 (14:40 -0400)]
Tests: implement REUSE with SPDX identifiers
The SPDX identifiers [1] are a legally binding shorthand, which can be
used instead of the full boilerplate text. This is the first step
towards implementing the full REUSE spec [2] to help with copyright and
licensing audits and compliance.
This will reduce a lot of manual work required for the licensing audit
required in Debian on each update.
For files that lacked copyright and licensing information, I used the
following guidelines. If a clear author could be determined from the git
history use it, otherwise use 'EfficiOS Inc.'. For build system files,
use 'MIT', for documentation 'CC-BY-4.0' and for data files 'CC0-1.0'.
Freeform text files were converted to Markdown to allow licensing
comments.
[1] https://spdx.org/ids-how
[2] https://reuse.software/tutorial/
Change-Id: I3c391f15d97b5958bdfacc17eb4ab2abafd9d99d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 6 Jul 2023 18:40:15 +0000 (14:40 -0400)]
doc: implement REUSE with SPDX identifiers
The SPDX identifiers [1] are a legally binding shorthand, which can be
used instead of the full boilerplate text. This is the first step
towards implementing the full REUSE spec [2] to help with copyright and
licensing audits and compliance.
This will reduce a lot of manual work required for the licensing audit
required in Debian on each update.
For files that lacked copyright and licensing information, I used the
following guidelines. If a clear author could be determined from the git
history use it, otherwise use 'EfficiOS Inc.'. For build system files,
use 'MIT', for documentation 'CC-BY-4.0' and for data files 'CC0-1.0'.
Freeform text files were converted to Markdown to allow licensing
comments.
[1] https://spdx.org/ids-how
[2] https://reuse.software/tutorial/
Change-Id: Idc57357d401c4efac4d0c641108607236fa8ecd4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 6 Jul 2023 18:40:39 +0000 (14:40 -0400)]
include: implement REUSE with SPDX identifiers
The SPDX identifiers [1] are a legally binding shorthand, which can be
used instead of the full boilerplate text. This is the first step
towards implementing the full REUSE spec [2] to help with copyright and
licensing audits and compliance.
This will reduce a lot of manual work required for the licensing audit
required in Debian on each update.
For files that lacked copyright and licensing information, I used the
following guidelines. If a clear author could be determined from the git
history use it, otherwise use 'EfficiOS Inc.'. For build system files,
use 'MIT', for documentation 'CC-BY-4.0' and for data files 'CC0-1.0'.
Freeform text files were converted to Markdown to allow licensing
comments.
[1] https://spdx.org/ids-how
[2] https://reuse.software/tutorial/
Change-Id: Ie6e7d801e879f78ee34361c2327338ac8b60c92b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 6 Jul 2023 18:40:50 +0000 (14:40 -0400)]
src: implement REUSE with SPDX identifiers
The SPDX identifiers [1] are a legally binding shorthand, which can be
used instead of the full boilerplate text. This is the first step
towards implementing the full REUSE spec [2] to help with copyright and
licensing audits and compliance.
This will reduce a lot of manual work required for the licensing audit
required in Debian on each update.
For files that lacked copyright and licensing information, I used the
following guidelines. If a clear author could be determined from the git
history use it, otherwise use 'EfficiOS Inc.'. For build system files,
use 'MIT', for documentation 'CC-BY-4.0' and for data files 'CC0-1.0'.
Freeform text files were converted to Markdown to allow licensing
comments.
[1] https://spdx.org/ids-how
[2] https://reuse.software/tutorial/
Change-Id: I5bebf12931a64f29fa84ee3947b165d0624db13a
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Wed, 6 Dec 2023 18:37:47 +0000 (13:37 -0500)]
fix: invoke MKDIR_P before changing directories
In autoconf < 2.72d `AC_PROG_MKDIR_P` may fall back to using
`install-sh` and that may be referenced as a relative path.
To avoid issues with relative paths causing the command to not be
found, the build directories are created before changing the working
directory.
One way to test the behaviour prior to this commit is to configure
the build similar to the following:
./configure MKDIR_P="$(realpath --relative-to="$(pwd)" \
$(command -v mkdir))" BUILD_EXAMPLES_FROM_TREE=1
Fixes https://bugs.lttng.org/issues/1404
Change-Id: I2d66254cd8c208f9236d55c6ef1b83c580560c7c
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 24 Nov 2023 14:12:37 +0000 (09:12 -0500)]
docs: Update contributing guide
Indicate that Gerrit (https://review.lttng.org) is the principal place
where patches are submitted and reviewed, rather than the mailing list.
Based on feedback received on the mailing list:
https://lists.lttng.org/pipermail/lttng-dev/2023-November/030670.html
Change-Id: I3543659b0d02ecd672f2c8a45d23975c271628f9
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 6 Jul 2023 16:03:04 +0000 (12:03 -0400)]
Build system: implement REUSE with SPDX identifiers
The SPDX identifiers [1] are a legally binding shorthand, which can be
used instead of the full boilerplate text. This is the first step
towards implementing the full REUSE spec [2] to help with copyright and
licensing audits and compliance.
This will reduce a lot of manual work required for the licensing audit
required in Debian on each update.
For files that lacked copyright and licensing information, I used the
following guidelines. If a clear author could be determined from the git
history use it, otherwise use 'EfficiOS Inc.'. For build system files,
use 'MIT', for documentation 'CC-BY-4.0' and for data files 'CC0-1.0'.
Freeform text files were converted to Markdown to allow licensing
comments.
[1] https://spdx.org/ids-how
[2] https://reuse.software/tutorial/
Change-Id: I75f6120ff1241e1f6ea5aac44cd87c89d7fd21e3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 25 Apr 2023 13:22:14 +0000 (09:22 -0400)]
Disallow building static libraries
We don't officially support statically linking the LTTng-UST tracer; however,
the autotools build system still allows building static libraries.
Make it impossible to build static libraries without modifying the
configure.ac script to make it explicit.
Adding support for static linking would require a lot of additional
testing and would also require applications to explicitly call the
library constructor.
Change-Id: I55d7d9db9fa9c8623305901d085aef1a33286f28
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 14 Dec 2023 15:46:56 +0000 (10:46 -0500)]
fix: -Wsingle-bit-bitfield-constant-conversion with clang16
We get the following warning with Clang 16:
lttng-ust-abi.c:558:38: warning: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]
lttng_chan_buf->priv->parent.tstate = 1;
My understanding is that there is no bug because we only check if the
values are zero or not, so we can silence the warning by making the
variables unsigned.
Change-Id: Ic4e02164d5adf4271fa24e5b13e5d320ae19de2e
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
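The difference between the two bit-field declarations can be illustrated as follows. The struct and function names are hypothetical. Storing 1 in a signed one-bit field is an implementation-defined conversion; GCC and Clang store -1, which is the value change the warning reports.

```c
struct state_signed {
	signed int tstate:1;	/* can represent only 0 and -1 */
};

struct state_unsigned {
	unsigned int tstate:1;	/* can represent 0 and 1 */
};

/* Assigning the constant 1 to this field is what Clang 16 flags. */
static int store_signed(int value)
{
	struct state_signed s = { 0 };

	s.tstate = value;
	return s.tstate;
}

/* The unsigned field holds 1 unchanged, so no warning is needed. */
static int store_unsigned(unsigned int value)
{
	struct state_unsigned s = { 0 };

	s.tstate = value;
	return s.tstate;
}
```

Code that only tests the field against zero behaves the same with either declaration, which matches the reasoning in the commit message.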
Mathieu Desnoyers [Fri, 20 Oct 2023 15:57:20 +0000 (11:57 -0400)]
Revert "Add support for LTTNG_UST_HOME"
This reverts commit 90d125c709f566f3663bf84677f100134cc618e0.
After discussion with Jeremie, we want to introduce two (not one)
environment variables:
- LTTNG_UST_APP_PATH,
- LTTNG_UST_CTL_PATH.
to accommodate use-cases where a sessiond within a container is traced by
a sessiond in the parent container. In that situation, we want the
sessiond in the parent container to access the tracee through the
LTTNG_UST_CTL_PATH, without making the unix sockets for tracing control
visible to the child container.
Therefore, remove the LTTNG_UST_HOME environment variable before it is
added into an official release.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 20 Oct 2023 15:56:58 +0000 (11:56 -0400)]
Revert "Cleanup: remove leftover comment"
This reverts commit 5c0cb615bd9744f061ea318f829e0aa147b05958.
Will also revert "Add support for LTTNG_UST_HOME" in a follow up revert.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 19 Oct 2023 17:56:44 +0000 (13:56 -0400)]
Cleanup: remove leftover comment
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifd2ead39c7d7ebe9d50ec20b834c81cd4caa5e1d
Michael Jeanson [Tue, 17 Oct 2023 19:02:44 +0000 (15:02 -0400)]
fix: clean java inner class files in examples
Java classes that contain inner classes will result in additional class
files being created when compiled in the form of
'Class$InnerClass.class'. Expand the clean target to delete those
additional files.
Change-Id: I0ed7939dcaefa5ca26db9438f7a9b34e57d78f21
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 19 Oct 2023 17:34:06 +0000 (13:34 -0400)]
Cleanup: remove whitespaces at EOL in lttng-ust.pc.in
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1da0f8291929c6638bea44748e6e223ba32b36dc
Jonathan Rajotte [Tue, 28 Sep 2021 21:47:31 +0000 (17:47 -0400)]
Add support for LTTNG_UST_HOME
Namespacing the LTTNG_HOME env variable facilitates the work carried out to
have a way to trace the tracer (lttng-sessiond). This also fits with
the work done lately to namespace lttng-ust.
The LTTNG_HOME environment variable is used by lttng-sessiond to set up
the whole tracing environment for the application to be traced. When
lttng-ust is loaded by the lttng-sessiond to be traced, the fact that it
reuses the `LTTNG_HOME` set for the lttng-sessiond prevents us from
specifying an external lttng-sessiond home.
Although it would be possible for the lttng-sessiond to "trace" itself
(self tracing), it makes more sense, in our testing environment, to have
a supplementary lttng-sessiond handling the tracing of the
lttng-sessiond under test.
Note that some work will be carried out to limit the use of LTTNG_HOME to
set up the tracing environment by lttng-sessiond and liblttng-ctl APIs,
but it will be a long effort. Providing `LTTNG_UST_HOME` allows us to
start dogfooding today.
`LTTNG_HOME` is still used as a fallback to `LTTNG_UST_HOME` to preserve
backward compatibility.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I6aed21fd70d1b79b6768d237f59cc80612938d65
Kienan Stewart [Wed, 11 Oct 2023 14:28:40 +0000 (10:28 -0400)]
Log path used in connection attempts
Motivated by feedback on the lttng-dev mailing list that a user couldn't
find the socket path used when debugging connection issues of their
UST application.
Refs https://bugs.lttng.org/issues/1393
Change-Id: I42c8bb9ae372683a16f176caf87ac394f816955e
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 6 Oct 2023 15:48:01 +0000 (11:48 -0400)]
Introduce sync vs unsync enablers
Eliminate iteration over unmodified enablers when synchronizing the
enablers vs event state.
The intent is to turn an O(m*n) algorithm (m = number of enablers, n =
number of event probes) into an O(n) one when enabling many additional
events while tracing is active.
This change is done both for event enablers and for event notifier
enablers.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifdadbafbf899ce5f3c5f8eb22409ac0c4af3139c
Jérémie Galarneau [Wed, 27 Sep 2023 13:41:04 +0000 (09:41 -0400)]
Fix: misaligned urcu reader accesses
Running the LTTng-tools tests (test_valid_filter, for example) under
address sanitizer results in the following warning:
/usr/include/lttng/urcu/static/urcu-ust.h:155:6: runtime error: member access within misaligned address 0x7fc45db3a020 for type 'struct lttng_ust_urcu_reader', which requires 128 byte alignment
0x7fc45db3a020: note: pointer points here
c4 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
While the node member of lttng_ust_urcu_reader has an "aligned"
attribute of CAA_CACHE_LINE_SIZE, the compiler can't ensure the
alignment of members for dynamically allocated instances.
The `data` pointer is changed from char* to struct
lttng_ust_urcu_reader*, allowing the compiler to enforce the expected
alignment constraints.
Since `data` was addressed in bytes, the code using this field is
adapted to use element counts. As the chunks are only used to allocate
reader instances (and not other types), it makes the code a bit easier
to read.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic826cf94444681bea3a192d3a9f4262a0961e948
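The effect of changing the storage type can be sketched with two simplified chunk layouts. All struct and field names here are hypothetical, and the 128-byte cache line size is assumed from the warning above; the `aligned` attribute is a GCC/Clang extension.

```c
#include <stddef.h>

#define CACHE_LINE_SIZE	128	/* assumed, matching the alignment in the warning */

/* Hypothetical, simplified reader: one member carries the alignment. */
struct reader {
	unsigned long ctr;
	char node[16] __attribute__((aligned(CACHE_LINE_SIZE)));
};

/* Byte-addressed storage: data starts right after the header fields. */
struct chunk_bytes {
	size_t capacity;	/* in bytes */
	size_t used;		/* in bytes */
	char data[];
};

/* Typed storage: the compiler pads the header so data[0] is aligned. */
struct chunk_typed {
	size_t capacity;	/* in elements */
	size_t used;		/* in elements */
	struct reader data[];
};
```

With the typed layout, `data[i]` is correctly aligned as long as the chunk itself starts on a suitably aligned address, for example a page-aligned mapping.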
Olivier Dion [Tue, 22 Aug 2023 15:28:36 +0000 (11:28 -0400)]
ustfork: Initialize libc pointers in constructor
Instead of resolving individual libc functions lazily at their call
site, resolve every libc function in a global constructor. This improves
error reporting for the user, by only emitting a single warning for each
failed symbol lookup.
Change-Id: I47504846e44a68366870b983ff556158e634cf83
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Olivier Dion [Tue, 15 Aug 2023 14:47:06 +0000 (10:47 -0400)]
ustfork: Fix warning about volatile qualifier
Clang is strict about the volatile qualifier on function pointers. It
also wants pointers to be passed to atomic builtins, even for
functions. Therefore, use the addresses of function pointers even if
unnecessary according to the C standard.
Change-Id: I5d553a46671cc4bfbe8de5cec2425201459f60d2
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Olivier Dion [Wed, 9 Aug 2023 21:35:40 +0000 (17:35 -0400)]
ustfork: Fix possible race conditions
Assuming that `dlsym(RTLD_NEXT, "symbol")' is invariant for "symbol",
then we could think that memory operations on the `plibc_func' pointers can
be safely done without atomics.
However, consider what would happen if a load to a `plibc_func' pointer
is torn apart by the compiler. Then a thread could see:
1) NULL
2) The stored value as returned by a dlsym() call
3) A mix of 1) and 2)
The same goes for other optimizations that a compiler is authorized to
do (e.g. store tearing, load fusing).
One could question whether such a race condition is even possible for the
clone(2) wrapper. Indeed, a thread must be cloned to get into
existence. Therefore, the main thread would always store the value of
`plibc_func' at least once before creating the first sibling thread,
preventing any possible race condition for this wrapper. However, this
assumes that the main thread will not call the clone system call directly
before calling the libc wrapper! Thus, to be on the safe side, we do the
same for the clone wrapper.
Fix the race conditions by using the uatomic_read/uatomic_set functions,
on access to `plibc_func' pointers.
Change-Id: Ic4be25983b8836d2b333f367af9c18d2f6b75879
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
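The access pattern described above can be sketched as follows. The commit uses uatomic_read/uatomic_set from liburcu; this example swaps in the GCC/Clang `__atomic` builtins with relaxed ordering so that it stays self-contained. `real_func` and `resolve_func` are hypothetical stand-ins for the libc symbol and the dlsym(RTLD_NEXT, ...) lookup.

```c
#include <stddef.h>

typedef int (*func_t)(int);

/* Hypothetical stand-in for the libc function being wrapped. */
static int real_func(int x)
{
	return x + 1;
}

/* Hypothetical stand-in for dlsym(RTLD_NEXT, "symbol"). */
static func_t resolve_func(void)
{
	return real_func;
}

static func_t plibc_func;

static int wrapper(int x)
{
	/* A single atomic load: never observes a torn pointer value. */
	func_t func = __atomic_load_n(&plibc_func, __ATOMIC_RELAXED);

	if (!func) {
		func = resolve_func();
		/* Racing threads store the same value, so any store order is fine. */
		__atomic_store_n(&plibc_func, func, __ATOMIC_RELAXED);
	}
	return func(x);
}
```

Note that the builtins are given the address of the function pointer, in line with the Clang requirement mentioned in the "Fix warning about volatile qualifier" commit.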
Mathieu Desnoyers [Wed, 21 Jun 2023 19:36:49 +0000 (15:36 -0400)]
Fix: tracepoint: Remove trailing \ at the end of macro
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia6dba969704d64d0e31f7d6b3667996101c50f70
Michael Jeanson [Wed, 14 Jun 2023 20:27:39 +0000 (16:27 -0400)]
Show python agent install output in verbose builds
When running 'make V=1' print the output of the python agent install
command to help with debugging.
Change-Id: I1c34f1c4302b914fa4c75fbdfbe8527886652565
Signed-off-by: Michael Jeanson <mjeanson@debian.org>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 14 Jun 2023 20:55:28 +0000 (16:55 -0400)]
fix: python agent: use stdlib distutils when setuptools is installed
When the setuptools package is installed, it monkey patches the standard
library distutils even if the user code doesn't import setuptools.
This results in a failure to install the python agent in a directory
which isn't in the current PYTHONPATH. To allow this, setuptools requires
the '--single-version-externally-managed' option, which is not
implemented in distutils.
To resolve this, force the use of distutils for python < 3.12 even when
setuptools is installed with the 'SETUPTOOLS_USE_DISTUTILS' environment
variable and use the proper setuptools option with python >= 3.12 which
doesn't include distutils anymore.
Change-Id: Idf477ca61bed460c9f6be7f481fe3b84624f328c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 14 Jun 2023 19:58:32 +0000 (15:58 -0400)]
fix: python agent: install on Debian python >= 3.10
Starting with Debian's Python 3.10, the default install scheme is
'posix_local' which is a Debian specific scheme based on 'posix_prefix'
but with an added 'local' prefix. This is the default so users doing
system wide manual installations of python modules end up in
'/usr/local'. This interferes with our autotools based install which
already defaults to '/usr/local' and expects a provided prefix to be used
verbatim.
Monkeypatch sysconfig to override this scheme and use 'posix_prefix' instead.
Change-Id: I08fe77b6c8807515765e3ad0344aa6849e573b90
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 14 Jun 2023 20:21:32 +0000 (16:21 -0400)]
fix: python agent: Add a dependency on generated files
This allows files to be regenerated at build time if the template was
modified since the last build.
Change-Id: I2f98d6b726552efd91719ada9637d2fc2909fbb3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 13 Jun 2023 15:40:20 +0000 (11:40 -0400)]
python: use setuptools with python >= 3.12
Since 'distutils' will be removed in Python 3.12, use setuptools instead
to build the python agent.
See https://peps.python.org/pep-0632/
Change-Id: I101f0ce0ecd9bd8c198eb2a8d4dd535a46c7a0a0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jérémie Galarneau [Thu, 30 Mar 2023 18:56:15 +0000 (14:56 -0400)]
Fix: segmentation fault on filter interpretation in "switch" mode
When building the interpreter with `INTERPRETER_USE_SWITCH`, I get the
following crash when interpreting a bytecode:
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0 0x00007f5789aee443 in lttng_bytecode_interpret (ust_bytecode=0x555dfe90a650, interpreter_stack_data=0x7ffd12615500 "", probe_ctx=0x7ffd12615620,
caller_ctx=0x7ffd126154bc) at lttng-bytecode-interpreter.c:885
#1 0x00007f5789af4da2 in lttng_ust_interpret_event_filter (event=0x555dfe90a580, interpreter_stack_data=0x7ffd12615500 "", probe_ctx=0x7ffd12615620,
event_filter_ctx=0x0) at lttng-bytecode-interpreter.c:2548
#2 0x0000555dfe02d2d4 in lttng_ust__event_probe__tp___the_string (__tp_data=0x555dfe90a580, i=0, arg_i=2, str=0x7ffd12617cfa "hypothec") at ././tp.h:16
#3 0x0000555dfe02cac0 in lttng_ust_tracepoint_cb_tp___the_string (str=0x7ffd12617cfa "hypothec", arg_i=2, i=0)
at /tmp/lttng-master/src/lttng-tools/tests/utils/testapp/gen-ust-nevents-str/tp.h:16
#4 main (argc=39, argv=0x7ffd12615818) at gen-ust-nevents-str.cpp:38
This appears to be caused by `bytecode->data` being used to determine
the `start_pc` address. In my case, `data` is NULL. A quick look around
the code seems to show that this member is not used except during the
transmission of the bytecode.
I am basing the fix on the implementation of START_OP in the default
case which uses `code` in lieu of `data` and can confirm that it fixes
the crash on my end.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0773df385b8e90728b60503016dec4b46d902234
Jérémie Galarneau [Thu, 16 Mar 2023 16:17:23 +0000 (12:17 -0400)]
Fix: `ip` context is expressed as a base-10 field
The base for UST context field `ip` was changed from 16 (hexadecimal) to
10 (decimal), most likely an unintentional copy&paste error in
4e48b5d.
Base 16 is more common for addresses, hence this change should probably
be reverted.
Reported-by: Thomas Gatterweh <thomas.gatterweh@siemens.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ibb28a4768e99e1089577babf2fd74476ae367a89
Mathieu Desnoyers [Wed, 8 Mar 2023 21:18:21 +0000 (16:18 -0500)]
Fix: c99: use __asm__ __volatile__
Allow building with -std=c99 by using __asm__ __volatile__ rather than
asm volatile.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7d81d34e62c37f417b7d6deeec432e8794abf345
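As an illustration of the change above: in strict ISO modes such as -std=c99, GCC and Clang do not treat `asm` as a keyword, while the `__asm__` and `__volatile__` spellings remain available. The following is a minimal sketch, not code from the tree; `compiler_barrier` and `publish` are hypothetical names.

```c
/*
 * Compiler-only barrier written with the double-underscore spellings,
 * which GCC and Clang accept even with -std=c99.
 * Both function names are illustrative, not LTTng-UST APIs.
 */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ("" : : : "memory");
}

/* Store a value, then prevent the compiler from reordering around it. */
static int publish(int *slot, int value)
{
    *slot = value;
    compiler_barrier();
    return *slot;
}
```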
Mathieu Desnoyers [Wed, 8 Mar 2023 20:58:33 +0000 (15:58 -0500)]
Fix: c99: static assert: clang build fails due to multiple typedef
Unlike c11, c99 does not allow redefinition of the same typedef, and
clang is strict about it. Building code with tracepoints with -std=c99
with clang fails with:
warning: redefinition of typedef 'lttng_ust_static_assert_Tracepoint_name_length_is_too_long' is a C11 feature [-Wtypedef-redefinition]
Fix this by placing the (potentially negative size) array as argument to
a function prototype instead.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I06b6edbcd93f43f349451c23b0520df59f4fb346
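The technique described above can be sketched as follows. This is an assumption-laden illustration rather than the macro used in the tree: `NAME_STATIC_ASSERT`, `TRACEPOINT_NAME_MAX` and the provider name are hypothetical. The array parameter has size 1 when the condition holds and -1 (a compile error) when it does not, and repeating a compatible function prototype is valid C99, unlike repeating a typedef.

```c
/*
 * Static assertion expressed as a function prototype. The function is
 * never defined or called; only its declaration is checked.
 * Names below are illustrative only.
 */
#define NAME_STATIC_ASSERT(cond, ident) \
    void name_static_assert_##ident(char assertion_failed[2 * !!(cond) - 1])

enum { TRACEPOINT_NAME_MAX = 16 };

/* Declaring the same prototype twice is allowed in C99. */
NAME_STATIC_ASSERT(sizeof("my_provider:ev") <= TRACEPOINT_NAME_MAX,
        name_length_is_too_long);
NAME_STATIC_ASSERT(sizeof("my_provider:ev") <= TRACEPOINT_NAME_MAX,
        name_length_is_too_long);
```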
Mathieu Desnoyers [Tue, 21 Feb 2023 19:29:49 +0000 (14:29 -0500)]
Fix: Reevaluate LTTNG_UST_TRACEPOINT_DEFINE each time tracepoint.h is included
Fix issues with missing symbols in use-cases where tracef.h is included
before defining LTTNG_UST_TRACEPOINT_DEFINE, e.g.:
#include <lttng/tracef.h>
#define LTTNG_UST_TRACEPOINT_DEFINE
#include <provider.h>
It is caused by the fact that tracef.h includes tracepoint.h in a
context which has LTTNG_UST_TRACEPOINT_DEFINE undefined, and this is not
re-evaluated for the following includes.
Fix this by lifting the definition code in tracepoint.h outside of the
header include guards, and #undef the old LTTNG_UST__DEFINE_TRACEPOINT
before re-defining it to its new semantic. Use a new
_LTTNG_UST_TRACEPOINT_DEFINE_ONCE include guard within the
LTTNG_UST_TRACEPOINT_DEFINE defined case to ensure symbols are not
duplicated.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0ef720435003a7ca0bfcf29d7bf27866c5ff8678
Mathieu Desnoyers [Thu, 16 Feb 2023 19:25:16 +0000 (14:25 -0500)]
Fix: trace events in C++ constructors/destructors
Wrap constructor and destructor functions to invoke them as functions with
the constructor/destructor GNU C attributes, which ensures that those
constructors/destructors are ordered before/after C++
constructors/destructors.
Wrap constructor and destructor functions as the constructor/destructor of a
variable defined within an anonymous namespace when building as C++ with
LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP defined. With this option,
there are no guarantees that the events in C++ constructors/destructors will
be traced.
Fixes: 05bfa3dc3a6e ("Fix: generate probe registration constructor as a C++ constuctor")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If058b15af6b4d8852fa29d0a21b8233bcb4b43a2
Mathieu Desnoyers [Thu, 16 Feb 2023 21:41:19 +0000 (16:41 -0500)]
Fix: trace events in C constructors/destructors
Adding a priority (150) to the tracepoint and tracepoint provider
constructors/destructors ensures that we trace tracepoints located
within C constructors/destructors with a higher priority value,
including the default init priority of 65535, when the tracepoint vs
tracepoint definition vs tracepoint probe provider are in different
compile units (and in various link order one compared to another).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia8e36317ae058402cdb81cb921da69cfa97a2f82
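The ordering effect of a constructor priority can be sketched with the GNU C attribute directly. This is a minimal illustration assuming GCC or Clang on an ELF platform; the function and variable names are hypothetical and only the value 150 comes from the commit message.

```c
/*
 * A constructor with an explicit priority runs before constructors that
 * have a numerically higher priority, and before constructors declared
 * without a priority.
 */
static int early_ready;
static int default_ctor_saw_ready;

__attribute__((constructor(150)))
static void early_init(void)
{
    early_ready = 1;
}

__attribute__((constructor))
static void default_init(void)
{
    /* Records whether the priority-150 constructor already ran. */
    default_ctor_saw_ready = early_ready;
}
```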
Mathieu Desnoyers [Thu, 2 Feb 2023 15:25:57 +0000 (10:25 -0500)]
Fix: use unaligned pointer accesses for lttng_inline_memcpy
lttng_inline_memcpy receives pointers which can be unaligned. This
causes issues (traps) specifically on arm 32-bit with 8-byte strings
(including \0).
Use unaligned pointer accesses for loads/stores within
lttng_inline_memcpy instead.
There is an impact on code generation on some architectures. Using the
following test code on godbolt.org:
void copy16_aligned(void *dest, void *src) {
*(uint16_t *)dest = *(uint16_t *) src;
}
void copy16_unaligned(void *dest, void *src) {
STORE_UNALIGNED_INT(uint16_t, dest, LOAD_UNALIGNED_INT(uint16_t, src));
}
void copy32_aligned(void *dest, void *src) {
*(uint32_t *)dest = *(uint32_t *) src;
}
void copy32_unaligned(void *dest, void *src) {
STORE_UNALIGNED_INT(uint32_t, dest, LOAD_UNALIGNED_INT(uint32_t, src));
}
void copy64_aligned(void *dest, void *src) {
*(uint64_t *)dest = *(uint64_t *) src;
}
void copy64_unaligned(void *dest, void *src) {
STORE_UNALIGNED_INT(uint64_t, dest, LOAD_UNALIGNED_INT(uint64_t, src));
}
The resulting assembler (gcc 12.2.0 in -O2) between aligned and
unaligned:
- x86-32: unchanged.
- x86-64: unchanged.
- powerpc32: unchanged.
- powerpc64: unchanged.
- arm32: 16 and 32-bit copy: unchanged. Added code for 64-bit unaligned copy.
- aarch64: unchanged.
- mips32: added code for unaligned.
- mips64: added code for unaligned.
- riscv: added code for unaligned.
If we want to improve the situation on mips and riscv, this would
require introducing a new "lttng_inline_integer_copy" and expose
additional ring buffer client APIs in addition to event_write() which
take integers as inputs. Let's not introduce that complexity yet until
it is justified.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1e6471d4607ac6aff89f16ef24d5370e804b7612
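For readers unfamiliar with unaligned accessors, one common way to write them is with memcpy, which compilers typically lower to a plain load/store where the architecture allows it. This is a sketch only: the helper names are hypothetical and this is not necessarily how LOAD_UNALIGNED_INT and STORE_UNALIGNED_INT are implemented.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit integer from a pointer with no alignment guarantee. */
static uint32_t load_unaligned_u32(const void *src)
{
    uint32_t value;

    memcpy(&value, src, sizeof(value));
    return value;
}

/* Store a 32-bit integer to a pointer with no alignment guarantee. */
static void store_unaligned_u32(void *dest, uint32_t value)
{
    memcpy(dest, &value, sizeof(value));
}
```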
Jérémie Galarneau [Fri, 29 Jul 2022 21:40:45 +0000 (17:40 -0400)]
ust-ctl: allow runtime version checks
Officially, building (and dynamically linking) mismatching LTTng-UST
and LTTng-tools versions is unsupported.
At build time, LTTng-tools ensures that both versions match. However, it
remains possible for a user to mistakenly deploy LTTng-tools built
against liblttng-ust-ctl v2.x and liblttng-ust-ctl.so from a different
LTTng-UST release. Since the soname major version of liblttng-ust-ctl is
not bumped at every release, this would allow the LTTng-tools binary to
load.
In practice, this is unlikely to work since new symbols are introduced
at almost every release cycle. However, it isn't guaranteed.
In the case of a recent change -- removing the underscore prefix of
enumeration mappings used by a variant -- we don't change the ABI, but
we rely on the LTTNG_UST_ABI_MAJOR_VERSION to indicate whether or not
the fix is present to change the interpretation of existing fields.
Adding lttng_ust_ctl_get_version() provides an additional safety net
to check, at runtime, that the version of liblttng-ust-ctl.so that
is loaded matches that of LTTng-tools.
Technically, only major and minor versions are necessary. I propose
including the patchlevel version for future-proofing should we want to
work around known bugs in the future.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I14bee30bb5d2109be0c8c015ed845d70df16630b
Jérémie Galarneau [Fri, 29 Jul 2022 19:32:57 +0000 (15:32 -0400)]
dynamic-type: remove underscore prefix from mapping names
Dynamic types are expressed by LTTng-UST as a variant + its tag.
Currently (v2.13 and older), the names of the variant's choices and the
enumeration's entries do not match: enumeration entry names are prefixed
with an underscore.
This worked with older versions of LTTng-tools since all fields in the
TSDL metadata were "escaped" by prepending an underscore, causing the
variant choices to match the enumeration mapping names.
Following a rewrite of the TSDL producer, that no longer systematically
prepends an underscore, I noticed this discrepancy.
Newer LTTng-tools versions using the v10 protocol will assume that a
variant field's choices match its tag's mapping names exactly, while a
work-around for older supported versions (8 and 9) will strip the
leading underscores from the enumeration mapping names when this
specified combination is found.
Note that the "oldest compatible" version remains unchanged as this
change is not ABI breaking.
There is one scenario where this fix can cause a problem: running an older
LTTng-tools (e.g. 2.13) linked against a recent LTTng-UST (2.14+). This
configuration is _not_ supported and versions are properly checked at
build time by LTTng-tools.
In that problematic case, the older LTTng-tools would expect the
enumeration mappings to be prefixed with an underscore and produce an
invalid CTF trace.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia7e78096a9c31cd4c0574d599c961067d8f03791
Mathieu Desnoyers [Tue, 25 Oct 2022 16:32:12 +0000 (12:32 -0400)]
Relicense common/smp.c common/smp.h to MIT
EfficiOS owns all copyright of these files. Relicense them to MIT.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Icdeabc471fe9d0f4a756b047d10841391bfaf060
Mathieu Desnoyers [Fri, 30 Sep 2022 14:20:29 +0000 (10:20 -0400)]
Fix: bytecode validator: reject specialized load field/context ref instructions
Reject specialized load ref and get context ref instructions so a
bytecode crafted with nefarious intent cannot read a memory area larger
than the memory targeted by the instrumentation.
This prevents bytecode received from the session daemon from performing
out of bound memory accesses and from disclosing the content of
application memory beyond what has been targeted by the instrumentation.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ica16b97167d391d86e73b05fbf0210ff52b9c9f1
Mathieu Desnoyers [Thu, 29 Sep 2022 19:37:47 +0000 (15:37 -0400)]
Fix: bytecode validator: reject specialized load instructions
Reject specialized load instructions so a bytecode crafted with
nefarious intent cannot read a memory area larger than the memory
targeted by the instrumentation.
This prevents bytecode received from the session daemon from performing
out of bound memory accesses and from disclosing the content of
application memory beyond what has been targeted by the instrumentation.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1f90379455699cf0ad09159c11a12dcd53070f6a
Mathieu Desnoyers [Wed, 28 Sep 2022 14:41:08 +0000 (10:41 -0400)]
Fix: event notification capture: validate buffer length
Validate that the buffer length is large enough to hold empty capture
fields.
If the buffer is initially not large enough to hold empty capture fields
for each field to capture, discard the notification.
If after capturing a field there is not enough room anymore in the
buffer to write empty capture fields, skip the offending large field by
writing an empty capture field in its place.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I819f2b3cdd7f23cd97e35ec5e5f615ef7d740dc5
Mathieu Desnoyers [Tue, 27 Sep 2022 20:14:26 +0000 (16:14 -0400)]
Fix: event notification capture error handling
When the captured fields end up taking more than
PIPE_BUF - sizeof(struct lttng_ust_abi_event_notifier_notification) - 1
bytes of space for the msgpack message, the notification append capture
fails.
Currently, the result is that the msgpack buffer will contain a (likely
corrupted) truncated msgpack data.
Handle those overflow errors, and when they are encountered, reset the
msgpack writer position to skip the problematic captured field entirely.
Change-Id: I7ba1bf06aa72512fc73211a1d8ae6823d0e8d7ff
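The rewind-on-overflow idea from the two capture fixes above can be sketched on a generic bounded writer. Everything here is hypothetical (the struct, the function names and the tag byte); the one-byte placeholder is assumed to be the msgpack nil value 0xc0, and the real msgpack writer in the tree is not reproduced.

```c
#include <stddef.h>
#include <string.h>

#define EMPTY_FIELD_MARKER 0xc0 /* assumption: one-byte placeholder */

struct capture_writer { /* hypothetical bounded writer */
    char *buf;
    size_t pos;
    size_t len;
};

static int writer_append(struct capture_writer *w, const void *data, size_t n)
{
    if (n > w->len - w->pos)
        return -1; /* would overflow the buffer */
    memcpy(w->buf + w->pos, data, n);
    w->pos += n;
    return 0;
}

/*
 * Returns 0 if the field was written, 1 if it was replaced by the empty
 * marker after rewinding, -1 if not even the marker fits.
 */
static int append_field_or_empty(struct capture_writer *w, unsigned char tag,
        const void *payload, size_t n)
{
    size_t saved_pos = w->pos;
    unsigned char empty = EMPTY_FIELD_MARKER;

    if (writer_append(w, &tag, 1) == 0 && writer_append(w, payload, n) == 0)
        return 0;
    w->pos = saved_pos; /* discard the partially written field */
    return writer_append(w, &empty, 1) == 0 ? 1 : -1;
}
```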
Jérémie Galarneau [Wed, 14 Sep 2022 12:37:35 +0000 (13:37 +0100)]
Fix: lttng-ust-comm: wait on wrong child process
The code currently assumes that the forked process is the only child
process at that point in time. However, there can be unreaped child
processes as reported in the original bug.
From wait(3), as currently used, "status is requested for any child
process."
Using the pid explicitly ensures a wait on the expected child process.
More context is available at:
https://bugs.lttng.org/issues/1359
Fixes #1359
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8a4621d79c61f7dfefde5c2b94bdee9752e1973d
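The difference between waiting on any child and waiting on a specific pid can be sketched with plain POSIX calls. The helper names below are hypothetical and the code is an illustration, not the fix itself.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with the given code. */
static pid_t spawn_exiting_child(int exit_code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(exit_code);
    return pid; /* -1 on fork failure */
}

/*
 * Wait for one specific child. Other unreaped children are left alone,
 * which wait() would not guarantee.
 */
static int wait_for_child(pid_t pid)
{
    int status;

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With two outstanding children, each call returns the status of the pid it was given, whichever child exited first.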
Michael Jeanson [Thu, 7 Jul 2022 21:01:54 +0000 (17:01 -0400)]
fix: 'make dist' without javah
Don't use 'BUILT_SOURCES' for the header file generated by javah /
javac, files added to this target will be generated on 'make dist'
regardless of the configuration or presence of the required tools.
Add proper make dependencies between the different targets instead of
using 'all-local'.
Set JAVAROOT to a temporary directory to properly clean class files and
avoid confusing javah when it's used to generate the JNI header.
Change-Id: I8544d0418039ba667d062cb01c924368ab702ab7
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 1 Aug 2022 17:44:08 +0000 (13:44 -0400)]
cleanup: remove stale comment
Change-Id: I339fe13ff2d124fbf0a91223c090921902cb965d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 17 Aug 2022 19:10:58 +0000 (15:10 -0400)]
Fix: disable array/sequence compile-time type check in C
Disable this compile-time check in C. Indeed, the C implementation of
lttng_ust_is_pointer_type does not support opaque pointer types, because
it relies on pointer arithmetic.
Therefore, remove this check to keep supporting opaque pointers as
array/sequence elements in probe providers.
The worst that could happen is that users providing an unsupported
type as array/sequence element will end up with a meaningless integer
field.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0fa170f7af7fc016027685e48076ebaf0366cc5b
Mathieu Desnoyers [Mon, 1 Aug 2022 16:22:59 +0000 (12:22 -0400)]
Fix: add missing tracelog-internal.h to makefile
Missing from make dist.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I66ce16b69f003893bac95a54b0a4bb09370ab3bb
Norbert Lange [Mon, 1 Aug 2022 14:37:00 +0000 (16:37 +0200)]
lttng_ust_init_thread: call urcu_register_thread
Eagerly register the thread, and avoid taking mutex during the
first tracepoint.
Signed-off-by: Norbert Lange <nolange79@gmail.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Norbert Lange [Mon, 1 Aug 2022 14:36:59 +0000 (16:36 +0200)]
lttng_ust_init_thread: initialise cached context values
Modify all relevant *_alloc_tls functions so that they take
flags for 'init'. Rename them to init_thread for consistency.
So far define one flag LTTNG_UST_INIT_THREAD_CONTEXT_CACHE,
this will warm up cached values so less is done during
the first tracepoint.
The function 'lttng_ust_init_thread' will use all available
flags, software can opt-in to do work early instead
of lazily during tracepoints.
Signed-off-by: Norbert Lange <nolange79@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Norbert Lange [Mon, 1 Aug 2022 14:35:24 +0000 (16:35 +0200)]
Improve tracef/tracelog to use the stack for small strings
Support two common cases, one being that the resulting message is
small enough to fit into an on-stack buffer.
The second being the common 'printf("%s", "Message")' scheme.
Unfortunately, iterating a va_list is destructive,
so it has to be copied before calling vprintf.
The implementation was moved to a separate file,
used by both tracef.c and tracelog.c.
Signed-off-by: Norbert Lange <nolange79@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
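The va_list constraint mentioned above can be illustrated with a small formatter that tries a caller-provided stack buffer first and falls back to the heap. This is a sketch under assumptions: `format_message` is a hypothetical helper, not the function introduced by the commit.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Returns stack_buf when the message fits, otherwise a malloc'd buffer
 * also stored in *heap_buf (the caller frees it). Returns NULL on error.
 */
static const char *format_message(char *stack_buf, size_t stack_len,
        char **heap_buf, const char *fmt, ...)
{
    va_list ap, ap_copy;
    const char *result = NULL;
    int len;

    *heap_buf = NULL;
    va_start(ap, fmt);
    /* The first vsnprintf consumes ap, so keep a copy for a second pass. */
    va_copy(ap_copy, ap);
    len = vsnprintf(stack_buf, stack_len, fmt, ap);
    if (len < 0)
        goto end;
    if ((size_t) len < stack_len) {
        result = stack_buf;
        goto end;
    }
    *heap_buf = malloc((size_t) len + 1);
    if (!*heap_buf)
        goto end;
    vsnprintf(*heap_buf, (size_t) len + 1, fmt, ap_copy);
    result = *heap_buf;
end:
    va_end(ap_copy);
    va_end(ap);
    return result;
}
```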
Michael Jeanson [Fri, 29 Jul 2022 15:12:57 +0000 (11:12 -0400)]
fix: add missing closedir in _get_max_cpuid_from_sysfs()
As reported by Coverity:
*** CID 1490849: (RESOURCE_LEAK)
/src/common/smp.c: 84 in _get_max_cpuid_from_sysfs()
78 * CPU num of 0.
79 */
80 if (max_cpuid < 0 || max_cpuid > INT_MAX)
81 max_cpuid = -1;
82
83 end:
>>> CID 1490849: (RESOURCE_LEAK)
>>> Variable "cpudir" going out of scope leaks the storage it points to.
84 return max_cpuid;
85 }
86
87 /*
88 * As a fallback to parsing the CPU mask in "/sys/devices/system/cpu/possible",
89 * iterate on all the folders in "/sys/devices/system/cpu" that start with
Change-Id: I2048e2473d66aaa2a275fe2923da84a7e105f235
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 28 Jul 2022 14:17:48 +0000 (10:17 -0400)]
Add more unit tests for possible_cpus_array_len
Change-Id: If0b7fb9183936f00ac90349fb32f1db57f124602
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 27 Jul 2022 18:23:41 +0000 (14:23 -0400)]
Clarify terminology around cpu ids and array length
Rename 'num_possible_cpus' to 'possible_cpus_array_len' to make it
clearer that we use this value to create arrays of per-CPU elements.
Change-Id: Ie5dc9293a95bf321f8add7e9c44ac677bc1fe539
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 27 Jul 2022 20:19:35 +0000 (16:19 -0400)]
fix: Unify possible CPU number fallback
The MUSL specific fallback to get the number of possible CPUs in the
system has the same issue with hot-unplugged CPUs as the Glibc
implementation we worked around by using the possible CPU mask from
sysfs.
To address this, unify our fallback code across all C libraries to get
the maximum CPU id from the directories in "/sys/devices/system/cpu".
Change-Id: I5541742dc1de8e011a942880825fa88c656f0905
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 27 Jul 2022 14:54:53 +0000 (10:54 -0400)]
fix: removed accidental VLA in _get_num_possible_cpus()
The LTTNG_UST_PAGE_SIZE define can either point to a literal value or
the sysconf() function making buf[] a VLA. Replace this by a
cpumask specific define that will always be a literal value.
Change-Id: I8d329f314878e8018939f979861918969e3ec8ac
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 25 Jul 2022 19:53:17 +0000 (15:53 -0400)]
Fix: file descriptor leak in get_possible_cpu_mask_from_sysfs
Found by Coverity:
*** CID 1490808: Resource leaks (RESOURCE_LEAK)
/src/common/smp.c: 125 in get_possible_cpu_mask_from_sysfs()
119 max_bytes - total_bytes_read);
120
121 if (bytes_read < 0) {
122 if (errno == EINTR) {
123 continue; /* retry operation */
124 } else {
>>> CID 1490808: Resource leaks (RESOURCE_LEAK)
>>> Handle variable "fd" going out of scope leaks the handle.
125 return -1;
126 }
127 }
128
129 total_bytes_read += bytes_read;
130 assert(total_bytes_read <= max_bytes);
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I372b1fa2d454eeaa6462fe9c13692983369bea6b
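One way to avoid this class of leak is to route every exit after a successful open() through a single cleanup label. The following sketch assumes a hypothetical helper `read_file_prefix`; it is not the function from src/common/smp.c.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to max_bytes from path into buf. Returns the number of bytes
 * read, or -1 on error. The descriptor is closed on every path.
 */
static ssize_t read_file_prefix(const char *path, char *buf, size_t max_bytes)
{
    ssize_t ret = -1, bytes_read;
    size_t total = 0;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1; /* nothing to clean up yet */
    while (total < max_bytes) {
        bytes_read = read(fd, buf + total, max_bytes - total);
        if (bytes_read < 0) {
            if (errno == EINTR)
                continue; /* retry operation */
            goto end; /* error: still closes fd below */
        }
        if (bytes_read == 0)
            break; /* end of file */
        total += (size_t) bytes_read;
    }
    ret = (ssize_t) total;
end:
    (void) close(fd);
    return ret;
}
```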
Michael Jeanson [Wed, 20 Jul 2022 19:02:41 +0000 (15:02 -0400)]
Add unit tests for num possible cpus
Change-Id: I90eff0090b28cef64a8f4a5bd9745971ed89c711
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 20 Jul 2022 18:49:56 +0000 (14:49 -0400)]
fix: num_possible_cpus() with hot-unplugged CPUs
We rely on sysconf(_SC_NPROCESSORS_CONF) to get the maximum possible
number of CPUs that can be attached to the system for the lifetime of an
application. We use this value to allocate an array of per-CPU buffers
that is indexed by the numerical id of the CPUs.
As such we expect that the highest possible CPU id would be one less
than the number returned by sysconf(_SC_NPROCESSORS_CONF) which is
unfortunately not always the case and can vary across libc
implementations and versions.
Glibc up to 2.35 will count the number of "cpuX" directories in
"/sys/devices/system/cpu" which doesn't include CPUs that were
hot-unplugged.
This information is however provided by the kernel in
"/sys/devices/system/cpu/possible" in the form of a mask listing all the
CPUs that could possibly be hot-plugged in the system.
This patch changes the implementation of num_possible_cpus() to first
try parsing the possible CPU mask to extract the highest possible value
and if this fails fall back to the previous behavior.
Change-Id: I1a3cb1a446154ec443a391d6689cb7d4165726fd
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
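Extracting the highest CPU id from the possible-CPU file can be sketched as below, assuming the file content is a comma-separated list of ids and ranges such as "0-7" or "0,2-5,9" followed by a newline. The function name is hypothetical, the parser is deliberately lenient, and the real implementation in src/common/smp.c is not reproduced here.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Return the highest CPU id found in a list such as "0-3,8-11",
 * or -1 if the string is empty or malformed.
 */
static int max_cpu_id_from_list(const char *list)
{
    const char *p = list;
    long max_id = -1;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long val;

        errno = 0;
        val = strtol(p, &end, 10);
        if (end == p || errno != 0 || val < 0 || val > INT_MAX)
            return -1;
        if (val > max_id)
            max_id = val;
        p = end;
        /* Skip the range or list separator, if any. */
        if (*p == '-' || *p == ',')
            p++;
    }
    return (int) max_id;
}
```

The array length for per-CPU data would then be this value plus one.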
Michael Jeanson [Thu, 21 Jul 2022 13:10:30 +0000 (09:10 -0400)]
fix: Disable warnings for GNU extensions on Clang
Some versions of Clang enable '-Wgnu' in '-Wall'. Since we rely on
GNUisms in the code, this results in numerous errors. Check if the
compiler accepts '-Wno-gnu' to disable those warnings.
Change-Id: I9d1126744e427a6cf7c18e219cae5431227a43c0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 21 Jul 2022 15:51:11 +0000 (11:51 -0400)]
fix: clang warning '-Wnull-pointer-subtraction' in lttng_ust_is_pointer_type
Some versions of Clang enable '-Wnull-pointer-subtraction' in '-Wall'
which results in the following message:
././ust-utils-common.h:166:2: warning: performing pointer subtraction with a null pointer has undefined behavior [-Wnull-pointer-subtraction]
ok_is_pointer_type(void *);
^~~~~~~~~~~~~~~~~~~~~~~~~~
././ust-utils-common.h:120:5: note: expanded from macro 'ok_is_pointer_type'
ok(lttng_ust_is_pointer_type(_type) == true, "lttng_ust_is_pointer_type - '" lttng_ust_stringify(_type) "' is a pointer")
~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../include/lttng/ust-utils.h:71:45: note: expanded from macro 'lttng_ust_is_pointer_type'
(lttng_ust_is_integer_type(typeof(((type)0 - (type)0))) && !lttng_ust_is_integer_type(type))
^
Since this macro is used only to determine if the type is a pointer we
can use any value other than NULL and thus not depend on undefined
behavior.
Change-Id: Iab7a182f580ce7431a817ab006ecdf3f1da09ae0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
takeshi.iwanari [Fri, 24 Jun 2022 13:17:39 +0000 (22:17 +0900)]
Fix: Use negative value for error code of lttng_ust_ctl_duplicate_ust_object_data
[As is]
- `lttng_ust_ctl_duplicate_ust_object_data` function is called by the following functions:
- `event_notifier_error_accounting_register_app` (lttng-tools)
- `duplicate_stream_object` (lttng-tools)
- `duplicate_channel_object` (lttng-tools)
- `lttng_ust_ctl_duplicate_ust_object_data` function returns a positive value (= errno = 24 = EMFILE) when the system call `dup` returns an error
- However, `duplicate_stream_object` and `duplicate_channel_object` functions expect a negative value as error code
- As a result, these functions cannot handle the error and a segmentation fault occurs when using `stream->handle`
[Proposal]
- Currently, `lttng_ust_ctl_duplicate_ust_object_data` function returns either a positive or a negative value when an error happens
- The convention appears to be to use a negative value for error codes (e.g. `-ENOMEM`)
- So, I propose to change `errno` to `-errno`
Signed-off-by: takeshi.iwanari <takeshi.iwanari@tier4.jp>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iccb01930413ecd5a8c58ad267e9c4eca53694dc7
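The negative-errno convention proposed above can be sketched on a small wrapper around dup(). `dup_fd_checked` is a hypothetical helper used only for illustration; it is not part of liblttng-ust-ctl.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Duplicate fd into *new_fd. Returns 0 on success and a negative errno
 * value on failure, so callers can test "ret < 0" uniformly.
 */
static int dup_fd_checked(int fd, int *new_fd)
{
    int ret = dup(fd);

    if (ret < 0)
        return -errno; /* negative, never a bare positive errno */
    *new_fd = ret;
    return 0;
}
```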
Mathieu Desnoyers [Thu, 23 Jun 2022 19:58:04 +0000 (15:58 -0400)]
Fix: sessiond wait futex: handle spurious futex wakeups
Observed issue
==============
The LTTng-UST scheme for letting listener threads wait on session daemon
to wake up a futex is similar to the liburcu workqueue code, which has
an issue with spurious wakeups.
This wait/wakeup scheme is only used after the LTTng-UST listener thread
has been unable to connect to the session daemon.
A spurious wakeup on wait_for_sessiond can cause wait_for_sessiond to
return with a sock_info->wait_shm_mmap state of 0, which is unexpected.
However, this should not cause any user-observable issues other than
using slightly more CPU time than strictly needed, because this spurious
wakeup will only cause an additional connection attempt to the session
daemon to fail.
Cause
=====
From futex(5):
FUTEX_WAIT
Returns 0 if the caller was woken up. Note that a wake-up can
also be caused by common futex usage patterns in unrelated code
that happened to have previously used the futex word's memory
location (e.g., typical futex-based implementations of Pthreads
mutexes can cause this under some conditions). Therefore, callers
should always conservatively assume that a return value of 0
can mean a spurious wake-up, and use the futex word's value
(i.e., the user-space synchronization scheme) to decide whether
to continue to block or not.
Solution
========
We therefore need to validate whether the value differs from 0 in
user-space after the call to FUTEX_WAIT returns 0.
Known drawbacks
===============
None.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I468d8ff302f467ee9924e6edb04476fcb031b4b9
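The solution described above amounts to re-checking the futex word in a loop around FUTEX_WAIT. A minimal Linux-only sketch follows; the function name is hypothetical, the `__atomic_load_n` builtin assumes GCC or Clang, and this is not the wait_for_sessiond code itself.

```c
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Block until *word becomes non-zero. A return of 0 from FUTEX_WAIT is
 * treated as a possibly spurious wakeup: the value is always re-read
 * before deciding to stop waiting.
 */
static int wait_until_nonzero(int32_t *word)
{
    while (__atomic_load_n(word, __ATOMIC_ACQUIRE) == 0) {
        long ret = syscall(SYS_futex, word, FUTEX_WAIT, 0, NULL, NULL, 0);

        /* EAGAIN: value already changed. EINTR: interrupted by a signal. */
        if (ret == -1 && errno != EAGAIN && errno != EINTR)
            return -errno;
    }
    return 0;
}
```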
Mathieu Desnoyers [Tue, 5 Oct 2021 17:26:46 +0000 (13:26 -0400)]
tracepoints: increase dlopen failure message level from debug to critical
Print the failure message associated with failing to find
lttng-ust-tracepoint.so as a "Critical: " message, because when this
situation occurs, it indeed makes part of that application's
instrumentation invisible to the tracer.
Similarly to debug messages, this critical message is only shown if
LTTNG_UST_DEBUG is defined for the compile unit or if the
LTTNG_UST_DEBUG environment variable is set.
In addition however, if LTTNG_UST_ABORT_ON_CRITICAL is defined at
compile-time, or if the application is run with the
LTTNG_UST_ABORT_ON_CRITICAL environment variable set, the construction
will call abort() on failure to find lttng-ust-tracepoint.so.
This should make it easier for end-users to identify deployment issues
which prevent the lttng-ust tracer from being aware of application
tracepoints.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2ddcfd593eae699a2c18ef85049ac2239dd41411
Mathieu Desnoyers [Wed, 6 Apr 2022 15:16:06 +0000 (11:16 -0400)]
Document ust lock async-signal-safety
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie150d5757cc050b0262dcea20f20c1da4963a27e
Mathieu Desnoyers [Wed, 6 Apr 2022 14:55:11 +0000 (10:55 -0400)]
Fix: don't use strerror() from ust lock nocheck
ust_lock_nocheck is meant to be async-signal-safe for use from the
fork() override helper (and fork(2) is async-signal-safe).
Remove calls to strerror() from ust lock functions and from the
cancelstate helper because strerror is not async-signal-safe and indeed
allocates memory.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I461f3631a24e71232d987b0a984b4942903bf9ac
Mathieu Desnoyers [Wed, 6 Apr 2022 14:17:15 +0000 (10:17 -0400)]
Fix: remove non-async-signal-safe fflush from ERR()
Commit ff1fedb9f2e8 ("usterr: make error reporting functions signal safe")
changed the logging printout mechanism to use patient_write() to a file
descriptor to ensure signal-safety of the ERR() logging mechanism.
However, the fflush(stderr) was left in place, although it was useless.
Unfortunately, fflush() is not async-signal-safe.
Fix this by removing this fflush() call.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I13754acd914c4a9f71014a1e332c3fb25197a669
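As an illustration of async-signal-safe reporting, the sketch below writes a message with write(2) only, avoiding stdio and strerror(). Both helper names are hypothetical stand-ins; the patient_write() function mentioned above is not reproduced here.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on EINTR and on partial writes. */
static ssize_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t ret = write(fd, buf + done, len - done);

        if (ret < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t) ret;
    }
    return (ssize_t) done;
}

/*
 * Emit a NUL-terminated message without calling fflush(), fprintf() or
 * strerror(). errno is saved and restored for the interrupted code.
 */
static ssize_t log_signal_safe(int fd, const char *msg)
{
    int saved_errno = errno;
    size_t len = 0;
    ssize_t ret;

    while (msg[len] != '\0')
        len++;
    ret = write_all(fd, msg, len);
    errno = saved_errno;
    return ret;
}
```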
Mathieu Desnoyers [Fri, 20 May 2022 16:00:08 +0000 (12:00 -0400)]
Fix: Pointers are rejected by integer element compile time assertion for array and sequence
Commit 2df82195d140b ("Add compile time assertion that array and
sequence have integer elements") introduced a check to validate that
sequences and arrays only contain integers. This was meant to refuse
arrays of double/float which are not supported.
However, as a side-effect, this also refuses arrays and sequences of
pointers, which were accepted prior to lttng-ust 2.13.
Introduce a lttng_ust_is_pointer_type() and use it in the array/sequence
type validation. The trick here is to use the fact that a difference
between two pointers in C is an integer. Therefore, we can validate that
an argument type is a pointer similarly to C++ is_pointer.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: #1355
Change-Id: I7c96d24ab68fb711f85eccdb781a3c513b45c5dc
Jonathan Rajotte [Tue, 17 May 2022 18:09:42 +0000 (14:09 -0400)]
Fix: statedump: invalid read during iter_end
Scenario
=========
Using a modified doc/examples/easy-ust/sample.c and dummy shared objects:
int main(int argc, char **argv)
{
	int i = 0;
	void *handle_cat;
	void *handle_dog;
	void (*func_print_name_cat)(const char*);
	void (*func_print_name_dog)(const char*);
	handle_dog = dlopen("./libdog.so", RTLD_NOW);
	handle_cat = dlopen("./libcat.so", RTLD_NOW);
	*(void**)(&func_print_name_dog) = dlsym(handle_dog, "print_name");
	*(void**)(&func_print_name_cat) = dlsym(handle_cat, "print_name");
	for (i = 0; i < 5; i++) {
		tracepoint(sample_component, message, "Hello World");
		usleep(1);
	}
	printf("Run `lttng regenerate statedump`. Press enter \n");
	getchar();
	dlclose(handle_dog);
	printf("Run `lttng regenerate statedump`. Press enter \n");
	getchar();
	dlclose(handle_cat);
	return 0;
}
On lttng side:
lttng create
lttng enable-event -u -a
lttng start
valgrind sample
Issue `lttng regenerate statedump` as the app suggests.
The second `lttng regenerate statedump` results in:
==934747== Invalid read of size 8
==934747== at 0x48BA90F: iter_end (lttng-ust-statedump.c:439)
==934747== by 0x48BAD73: lttng_ust_dl_update (lttng-ust-statedump.c:586)
==934747== by 0x48BADC0: do_baddr_statedump (lttng-ust-statedump.c:599)
==934747== by 0x48BAE62: do_lttng_ust_statedump (lttng-ust-statedump.c:633)
==934747== by 0x489F820: lttng_handle_pending_statedump (lttng-events.c:969)
==934747== by 0x488C000: handle_pending_statedump (lttng-ust-comm.c:717)
==934747== by 0x488DCF7: handle_message (lttng-ust-comm.c:1110)
==934747== by 0x48905EA: ust_listener_thread (lttng-ust-comm.c:1756)
==934747== by 0x4B62608: start_thread (pthread_create.c:477)
==934747== by 0x4A4D162: clone (clone.S:95)
==934747== Address 0x4c4ea88 is 4,152 bytes inside a block of size 4,176 free'd
==934747== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==934747== by 0x48B9588: free_dl_node (lttng-ust-statedump.c:123)
==934747== by 0x48BA90A: iter_end (lttng-ust-statedump.c:450)
==934747== by 0x48BAD73: lttng_ust_dl_update (lttng-ust-statedump.c:586)
==934747== by 0x48BADC0: do_baddr_statedump (lttng-ust-statedump.c:599)
==934747== by 0x48BAE62: do_lttng_ust_statedump (lttng-ust-statedump.c:633)
==934747== by 0x489F820: lttng_handle_pending_statedump (lttng-events.c:969)
==934747== by 0x488C000: handle_pending_statedump (lttng-ust-comm.c:717)
==934747== by 0x488DCF7: handle_message (lttng-ust-comm.c:1110)
==934747== by 0x48905EA: ust_listener_thread (lttng-ust-comm.c:1756)
==934747== by 0x4B62608: start_thread (pthread_create.c:477)
==934747== by 0x4A4D162: clone (clone.S:95)
==934747== Block was alloc'd at
==934747== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==934747== by 0x48B936A: zmalloc (helper.h:27)
==934747== by 0x48B936A: alloc_dl_node (lttng-ust-statedump.c:85)
==934747== by 0x48B98F7: find_or_create_dl_node (lttng-ust-statedump.c:184)
==934747== by 0x48BA205: extract_baddr (lttng-ust-statedump.c:339)
==934747== by 0x48BABC6: extract_bin_info_events (lttng-ust-statedump.c:528)
==934747== by 0x4A8D2F4: dl_iterate_phdr (dl-iteratephdr.c:75)
==934747== by 0x48BAD4C: lttng_ust_dl_update (lttng-ust-statedump.c:583)
==934747== by 0x48BADC0: do_baddr_statedump (lttng-ust-statedump.c:599)
==934747== by 0x48BAE62: do_lttng_ust_statedump (lttng-ust-statedump.c:633)
==934747== by 0x489F820: lttng_handle_pending_statedump (lttng-events.c:969)
==934747== by 0x488C000: handle_pending_statedump (lttng-ust-comm.c:717)
==934747== by 0x488DCF7: handle_message (lttng-ust-comm.c:1110)
==934747==
Cause
=========
Nodes can be removed during the `cds_hlist_for_each_entry_2` iteration which
is not meant to be used when items are removed within the traversal.
Solution
=========
Use `cds_hlist_for_each_entry_safe_2`.
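The difference between the two traversal styles can be sketched on a plain singly-linked list. The types and function names below are hypothetical and do not use the urcu hlist API; the point is only that the "safe" style saves the next pointer before the loop body is allowed to free the current node.

```c
#include <stdlib.h>

struct node {
	struct node *next;
	int marked;	/* set when the node was seen during the last update */
};

/* Push a new node at the head of the list; returns 0 on success. */
static int list_push(struct node **head, int marked)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return -1;
	n->marked = marked;
	n->next = *head;
	*head = n;
	return 0;
}

/*
 * Free every unmarked node and clear the mark on the others.
 * "next" is read before the body may free "pos", so the loop never
 * dereferences freed memory. A plain "pos = pos->next" step would read
 * from a freed node here.
 */
static int remove_unmarked(struct node **head)
{
	struct node *pos, *next, **link = head;
	int removed = 0;

	for (pos = *head; pos != NULL; pos = next) {
		next = pos->next;
		if (!pos->marked) {
			*link = next;
			free(pos);
			removed++;
		} else {
			pos->marked = 0;
			link = &pos->next;
		}
	}
	return removed;
}

/* Count the nodes in the list. */
static int list_length(const struct node *head)
{
	int n = 0;

	for (; head != NULL; head = head->next)
		n++;
	return n;
}
```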
Change-Id: Ibf3d94a4d6f7abac19ed9740eeacfbcb1bdf1f4f
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 4 Apr 2022 19:55:07 +0000 (15:55 -0400)]
Cleanup: tracepoint event: use different prefixes for provider and event descriptors
Prefix the provider array of descriptor identifier with
"lttng_ust__provider_event_desc___" to distinguish it from the event
descriptor identifier.
This does not cause any conflict even when the event has the same name
as the provider because the provider is included in the event name. But
providing different prefixes is cleaner.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If8b38d2765de61fd59558fe7c6223d1fdbb3349b
Mathieu Desnoyers [Wed, 30 Mar 2022 16:10:53 +0000 (12:10 -0400)]
Fix: bytecode interpreter context_get_index() leaves byte order uninitialized
Observed Issue
==============
When using the event notification capture feature to capture a context
field, e.g. '$ctx.cpu_id', the captured value is often observed in
reverse byte order.
Cause
=====
Within the bytecode interpreter, context_get_index() leaves the "rev_bo"
field uninitialized in the top of stack.
This only affects the event notification capture bytecode because the
BYTECODE_OP_GET_SYMBOL bytecode instruction (as of lttng-tools 2.13)
is only generated for capture bytecode in lttng-tools. Therefore, only
capture bytecode targeting contexts are affected by this issue. The
reason why lttng-tools uses the "legacy" bytecode instruction to get
context (BYTECODE_OP_GET_CONTEXT_REF) for the filter bytecode is to
preserve backward compatibility of filtering when interacting with
applications linked against LTTng-UST 2.12.
Solution
========
Initialize the rev_bo field based on the context field type
reverse_byte_order field.
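A simplified sketch of why the missing initialization matters. The structures and function names below are hypothetical stand-ins for the interpreter stack entry and the context field type; they are not the lttng-ust definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an integer context field. */
struct field_type_sketch {
	bool reverse_byte_order;
};

/* Hypothetical top-of-stack entry of the interpreter. */
struct stack_entry_sketch {
	uint32_t value;
	bool rev_bo;	/* whether the value must be byte-swapped when read */
};

static uint32_t swap32(uint32_t v)
{
	return ((v & 0x000000ffU) << 24) | ((v & 0x0000ff00U) << 8) |
		((v & 0x00ff0000U) >> 8) | ((v & 0xff000000U) >> 24);
}

/*
 * Load a context field on the stack. Setting rev_bo from the field type
 * is the step that was missing: without it, rev_bo keeps whatever value
 * was left in the stack entry.
 */
static void load_context_field(struct stack_entry_sketch *top,
		const struct field_type_sketch *type, uint32_t raw)
{
	top->value = raw;
	top->rev_bo = type->reverse_byte_order;
}

/* Read the value as the capture would, honoring rev_bo. */
static uint32_t capture_value(const struct stack_entry_sketch *top)
{
	return top->rev_bo ? swap32(top->value) : top->value;
}
```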
Known drawbacks
===============
None.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I74996d501cee3c269658d98dfc0d0050b74c5ddb
Michael Jeanson [Thu, 17 Mar 2022 17:45:51 +0000 (13:45 -0400)]
fix: __STDC_VERSION__ can be undefined in C++
Caught on SLES12 with g++ 4.8 when enabling '-Wundef'.
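A minimal sketch of the usual guard for this situation: test that the macro is defined before comparing its value, so that a C++ compiler which does not define __STDC_VERSION__ never evaluates the undefined identifier. The EXAMPLE_HAVE_C99 macro is a hypothetical name chosen for this illustration.

```c
/*
 * "defined(X) && X >= ..." short-circuits in the preprocessor, so the
 * comparison is only evaluated when the macro exists.
 */
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
# define EXAMPLE_HAVE_C99 1
#else
# define EXAMPLE_HAVE_C99 0
#endif

/* Returns 1 when compiled as C99 or later, 0 otherwise (e.g. as C++). */
static int example_have_c99(void)
{
	return EXAMPLE_HAVE_C99;
}
```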
Change-Id: Ib027f224a4f0ef021beb1709d8a626db62fe6d9c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 9 Mar 2022 21:42:15 +0000 (16:42 -0500)]
Fix: sample discarded events count before reserve
Sampling the discarded events count in the buffer_end callback is done
out of order, and may therefore include increments performed by following
events (in following packets) if the thread doing the end-of-packet
event write is preempted for a long time.
Sampling the event discarded counts before reserving space for the last
event in a packet, and keeping this as part of the private ring buffer
context, should fix this race.
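The ordering described above can be sketched as follows. The structures and function names are hypothetical, and the sketch is single-threaded: real ring buffer code would read the counter with the appropriate atomic or per-CPU accessors.

```c
#include <stdint.h>

/* Hypothetical channel state shared by all writers. */
struct channel_sketch {
	uint64_t records_lost;	/* incremented whenever an event is discarded */
};

/* Hypothetical private context of one reserve/commit operation. */
struct reserve_ctx_sketch {
	uint64_t records_lost_sample;	/* snapshot taken before reserving */
};

/* Called before reserving space for an event. */
static void sample_before_reserve(const struct channel_sketch *chan,
		struct reserve_ctx_sketch *ctx)
{
	ctx->records_lost_sample = chan->records_lost;
}

/*
 * Called when the packet is closed. It uses the snapshot rather than
 * re-reading the shared counter, which may already include discards
 * that belong to following packets.
 */
static uint64_t packet_discarded_count(const struct reserve_ctx_sketch *ctx)
{
	return ctx->records_lost_sample;
}
```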
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib59b634bbaefd2444751547d20a891c9dd93cd73
Mathieu Desnoyers [Thu, 10 Mar 2022 14:58:37 +0000 (09:58 -0500)]
Fix: ring buffer event counter
When compiling with -DLTTNG_RING_BUFFER_COUNT_EVENTS, the lttng-ust
libringbuffer can count events (with additional overhead). This is never
used or enabled by default. Fix this code so it compiles again when the
define is enabled.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3aeeff7995d66a07316cc5c535b5271536a89636
Mathieu Desnoyers [Wed, 9 Mar 2022 16:54:33 +0000 (11:54 -0500)]
Fix: concurrent exec(2) file descriptor leak
If exec(2) is executed by the application concurrently with LTTng-UST
listener threads between the creation of a file descriptor with
socket(2), recvmsg(2), or pipe(2) and the call to fcntl(2) FD_CLOEXEC, those
file descriptors will stay open after the exec, which is not intended.
As a consequence, shared memory files for ring buffers can stay present
on the file system for long-running traced processes.
Use:
- pipe2(2) O_CLOEXEC (supported since Linux 2.6.27, and by FreeBSD),
- socket(2) SOCK_CLOEXEC (supported since Linux 2.6.27, and by FreeBSD),
- recvmsg(2) MSG_CMSG_CLOEXEC (supported since Linux 2.6.23 and by FreeBSD),
rather than fcntl(2) FD_CLOEXEC to make sure the file descriptors are
closed on exec immediately upon their creation.
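A short sketch of creating descriptors that are close-on-exec from the start, so there is no window between creation and a later fcntl(2) call. The wrapper function names are hypothetical; pipe2(2), socket(2) with SOCK_CLOEXEC, and fcntl(2) F_GETFD are the real Linux interfaces.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for pipe2(2) */
#endif
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a pipe whose two ends are close-on-exec at creation time. */
static int create_cloexec_pipe(int fds[2])
{
	return pipe2(fds, O_CLOEXEC);
}

/* Create a UNIX stream socket that is close-on-exec at creation time. */
static int create_cloexec_unix_socket(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
}

/* Returns nonzero if the close-on-exec flag is set on fd. */
static int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	return flags >= 0 && (flags & FD_CLOEXEC) != 0;
}
```

For descriptors received over a UNIX socket, the equivalent is to pass MSG_CMSG_CLOEXEC in the flags argument of recvmsg(2).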
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id2167cf99d7cb8a8425fc0dc13745f023a504562
Michael Jeanson [Thu, 10 Feb 2022 15:25:54 +0000 (15:25 +0000)]
Add LOG4J2 domain to the Log4j 2.x agent
This commit adds a new LOG4J2 domain with native Log4j 2.x loglevels in
addition to the existing LOG4J domain. Both domains can be used
individually or at the same time by instantiating one or more appenders.
For example, a single appender in Log4j 2.x native mode:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Lttng name="LTTNG" domain="LOG4J2"/>
  </Appenders>
  <Loggers>
    <Root level="all">
      <AppenderRef ref="LTTNG"/>
    </Root>
  </Loggers>
</Configuration>
Two appenders to enable both domains:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Lttng name="LTTNG1" domain="LOG4J"/>
    <Lttng name="LTTNG2" domain="LOG4J2"/>
  </Appenders>
  <Loggers>
    <Root level="all">
      <AppenderRef ref="LTTNG1"/>
      <AppenderRef ref="LTTNG2"/>
    </Root>
  </Loggers>
</Configuration>
Change-Id: I9a41e107d19d5a0efffe055edf0ca5211a7f6d6b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 10 Feb 2022 15:25:02 +0000 (15:25 +0000)]
Add 'domain' parameter to the Log4j 2.x agent
The initial Log4j 2.x agent commit only implemented a compatibility mode
to be used with the existing LOG4J domain in lttng-tools.
In this mode the agent converts the new Log4j 2.x loglevel values to
their corresponding Log4j 1.x values in the same way the upstream
compatibility bridge does.
This is great when doing in-place migration using the upstream
compatibility bridge but doesn't cover the use case of an application
that natively uses Log4j 2.x.
This commit adds a new mandatory 'domain' parameter to the Log4j2 agent
which currently only implements the 'LOG4J' compatibility domain in
preparation for adding a 'LOG4J2' domain.
The configuration for a single appender in Log4j 1.x compat mode will
now look like this:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Lttng name="LTTNG" domain="LOG4J"/>
  </Appenders>
  <Loggers>
    <Root level="all">
      <AppenderRef ref="LTTNG"/>
    </Root>
  </Loggers>
</Configuration>
Change-Id: I7fd5f79ad58c77175714bd4198d8ff5db2e6b846
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 2 Feb 2022 19:04:50 +0000 (19:04 +0000)]
fix: Convert custom loglevels in Log4j 2.x agent
The loglevel integer representation has changed between Log4j 1.x and
2.x. We currently convert the standard loglevels but pass the custom
ones through unchanged.
This can be problematic when using severity ranges as custom loglevels
won't be properly filtered.
Use the same strategy as the upstream Log4j 2.x compatibility layer by
converting the custom loglevels to their equivalent standard loglevel
value.
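As a rough illustration (written in C rather than the agent's Java), the conversion could map any Log4j 2.x integer level to the standard level with the greatest value not exceeding it, then return the Log4j 1.x value of that standard level. The rounding rule is an assumption based on how Log4j 2.x resolves custom levels to standard levels, and the function and table names are hypothetical; only the numeric level values come from the two Log4j versions.

```c
#include <stddef.h>
#include <stdint.h>

struct level_pair {
	int32_t log4j2;	/* Log4j 2.x integer level */
	int32_t log4j1;	/* equivalent Log4j 1.x integer level */
};

/* Standard levels, sorted by increasing Log4j 2.x value. */
static const struct level_pair standard_levels[] = {
	{ 0,         INT32_MAX },	/* OFF */
	{ 100,       50000 },		/* FATAL */
	{ 200,       40000 },		/* ERROR */
	{ 300,       30000 },		/* WARN */
	{ 400,       20000 },		/* INFO */
	{ 500,       10000 },		/* DEBUG */
	{ 600,       5000 },		/* TRACE */
	{ INT32_MAX, INT32_MIN },	/* ALL */
};

/*
 * Convert a standard or custom Log4j 2.x level to a Log4j 1.x value by
 * picking the last standard level whose 2.x value is <= the input.
 */
static int32_t log4j2_to_log4j1_level(int32_t level)
{
	int32_t result = standard_levels[0].log4j1;
	size_t i;

	for (i = 0; i < sizeof(standard_levels) / sizeof(standard_levels[0]); i++) {
		if (standard_levels[i].log4j2 > level)
			break;
		result = standard_levels[i].log4j1;
	}
	return result;
}
```

With this rule, a custom level of 350 is reported as WARN (30000), so it falls inside a severity range that includes WARN instead of being compared using its raw 2.x value.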
Change-Id: I8cbd4706cb774e334380050cf0b407e19d7bc7c4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 28 Jan 2022 18:58:12 +0000 (18:58 +0000)]
fix: coverity reported null returns in Log4j2 agent
According to the log4j javadoc, these methods should not return null but
since it's reported by Coverity, add the null checks.
*** CID 1469124: Null pointer dereferences (NULL_RETURNS)
src/lib/lttng-ust-java-agent/java/lttng-ust-agent-log4j2/org/lttng/ust/agent/log4j2/LttngLogAppender.java: 194 in org.lttng.ust.agent.log4j2.LttngLogAppender.append(org.apache.logging.log4j.core.LogEvent)()
*** CID 1469123: Null pointer dereferences (NULL_RETURNS)
src/lib/lttng-ust-java-agent/java/lttng-ust-agent-log4j2/org/lttng/ust/agent/log4j2/LttngLogAppender.java: 167 in org.lttng.ust.agent.log4j2.LttngLogAppender.append(org.apache.logging.log4j.core.LogEvent)()
Change-Id: Ib992b3cc6848492cfb6e7d8fec6ce3898d962db4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 25 Jan 2022 02:08:35 +0000 (21:08 -0500)]
Fix: ustcomm: serialize variant_nestable type
LTTng-UST 2.13 serializes the contents of the variant_nestable union
field, but keeps the "atype" as lttng_ust_ctl_atype_variant.
It happens to work by pure chance because the binary layout of the
variant_nestable and legacy.variant union fields are the same, except
for the alignment field of variant_nestable which is zeroed padding in
the legacy.variant. Therefore, as long as the variant_nestable has a
padding of 0, everything works out fine (which is currently the case).
But it's better to fix this discrepancy in case we ever plan to use a
nonzero variant alignment.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I96a3e1f6bfbe410ed61ea59313eb49b6c4f4b40d
Michael Jeanson [Thu, 6 Jan 2022 19:36:46 +0000 (14:36 -0500)]
Add a Log4j 2.x Java agent
This adds a new agent to the LTTng-UST Java agents suite supporting the
Log4j 2.x logging backend.
This new agent can be enabled with 2 different configure options:
1) Java agent with Log4j 2.x support:
$ export CLASSPATH=/path/to/log4j-core.jar:/path/to/log4j-api.jar
$ ./configure --enable-java-agent-log4j2
2) Java agent with JUL + Log4j + Log4j2 support
$ export CLASSPATH=/path/to/log4j-core.jar:/path/to/log4j-api.jar:/path/to/log4j-1.2.jar
$ ./configure --enable-java-agent-all
The name of the new agent jar file is "lttng-ust-agent-log4j2.jar".
It will be installed in the arch-agnostic "$prefix/share/java" path
e.g: "/usr/share/java".
It uses the same jni library "liblttng-ust-log4j-jni.so" as the Log4j 1.x agent.
The agent was designed as a mostly drop-in replacement for applications
upgrading from Log4j 1.x to 2.x. It requires no modification to the
tracing configuration as it uses the same domain "-l / LOG4J" and the
loglevels integer representations are converted to the Log4j 1.x values
(excluding custom loglevels).
The recommended way to use this agent with Log4j 2.x is to add an
"Lttng" Appender with an arbitrary name and associate it with one or more
Loggers using an AppenderRef.
For example, here is a basic log4j2 xml configuration that would send
all logging statements exclusively to an lttng appender:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Lttng name="LTTNG"/>
  </Appenders>
  <Loggers>
    <Root level="all">
      <AppenderRef ref="LTTNG"/>
    </Root>
  </Loggers>
</Configuration>
More examples can be found in the 'doc/examples' directory.
The implementation of the appender is based on this[1] great guide by
Keith D. Gregory which is so much more detailed than the official
documentation, my thanks to him.
[1] https://www.kdgregory.com/index.php?page=logging.log4j2Plugins
Change-Id: I34593c9a4c3140c8839cef8b58cc85745fe9f47f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 18 Jan 2022 19:14:33 +0000 (19:14 +0000)]
Fix: may be used uninitialized on powerpc
Fix the following warning on powerpc:
In file included from ../../src/common/counter/counter-internal.h:16,
from ../../src/common/counter/counter-api.h:16,
from counter-clients/percpu-64-modular.c:12:
In function ‘__lttng_counter_add_percpu’,
inlined from ‘lttng_counter_add’ at ../../src/common/counter/counter-api.h:265:10,
inlined from ‘counter_add’ at counter-clients/percpu-64-modular.c:53:9:
include/urcu/compiler.h:25:42: warning: ‘move_sum’ may be used uninitialized [-Wmaybe-uninitialized]
25 | #define caa_unlikely(x) __builtin_expect(!!(x), 0)
| ^~~~~
../../src/common/counter/counter-api.h:244:13: note: in expansion of macro ‘caa_unlikely’
244 | if (caa_unlikely(move_sum))
| ^~~~~~~~~~~~
In file included from counter-clients/percpu-64-modular.c:12:
counter-clients/percpu-64-modular.c: In function ‘counter_add’:
../../src/common/counter/counter-api.h:237:17: note: ‘move_sum’ declared here
237 | int64_t move_sum;
| ^~~~~~~~
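The usual remedy for this class of warning is to give the variable a defined value at its declaration, so that every path reaching the final test reads an initialized value. The sketch below is a simplified, hypothetical function written for illustration; it is not the code from counter-api.h and the actual change in this commit may differ.

```c
#include <stdint.h>

enum counter_mode_sketch {
	MODE_PERCPU,
	MODE_GLOBAL,
};

/*
 * Add v to a per-CPU slot, and move the slot content to the global sum
 * once its magnitude exceeds the threshold. Returns 0 on success.
 */
static int counter_add_sketch(enum counter_mode_sketch mode,
		int64_t *percpu, int64_t *global,
		int64_t v, int64_t threshold)
{
	int64_t move_sum = 0;	/* initialized: only one branch assigns it */

	switch (mode) {
	case MODE_PERCPU:
		*percpu += v;
		if (*percpu > threshold || *percpu < -threshold) {
			move_sum = *percpu;
			*percpu = 0;
		}
		break;
	case MODE_GLOBAL:
		*global += v;
		break;
	default:
		return -1;
	}
	if (move_sum)
		*global += move_sum;
	return 0;
}
```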
Change-Id: I65dc61a567c0337735124a35f1af96697d416054
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>