Simon Marchi [Tue, 20 Jun 2023 20:32:51 +0000 (16:32 -0400)]
format-cpp: run clang-format in parallel
Use the -P option of GNU xargs to run multiple instances of clang-format
in parallel, which speeds up the execution quite a bit (depending on the
number of cores, of course).
Inspired by this babeltrace commit:
http://git.efficios.com/?p=babeltrace.git;a=commit;h=66c3bce11973e6e96a3791c378a9e5f98ddaa280
Change-Id: I201535244ef4c3614dfd742ae6f1c427994e6147
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Jul 2023 18:09:00 +0000 (14:09 -0400)]
Fix: format-cpp: don't pass -i twice to clang-format
From Simon Marchi's original commit message:
I'm trying to run format-cpp, with clang-format-14 in my PATH, and I
get a ton of these messages:
clang-format-14: for the -i option: may only occur zero or one times!
This is because -i is present in the FORMATTER variables, as well as in
the command line where the formatter is invoked.
Remove the one in the command-line.
Instead of assuming the FORMATTER variable contains '-i', assume it
doesn't since the options are not semantically part of the formatter's
name. The '-i' option is passed to the formatter invocation directly
since it is always needed.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0c26ff0d8d4d99b3f161ca9e5aa94ff867e3f916
Michael Jeanson [Tue, 27 Jun 2023 17:33:18 +0000 (17:33 +0000)]
Tests: python: path-like object introduced in python 3.6
Prior to Python 3.6, the os.path functions (e.g. os.path.exists())
expected a string or bytes object for the pathname. Use a compat method
to convert the path-like object to a string on interpreters that lack
PEP-519 [1] support.
Traceback (most recent call last):
File "tests/regression/tools/context/test_ust.py", line 156, in <module>
tap, test_env, lttngtest.VpidContextType(), lambda test_app: test_app.vpid
File "tests/regression/tools/context/test_ust.py", line 114, in test_static_context
test_app = test_env.launch_wait_trace_test_application(50)
File "tests/utils/lttngtest/environment.py", line 541, in launch_wait_trace_test_application
wait_before_exit_file_path,
File "tests/utils/lttngtest/environment.py", line 163, in __init__
self._wait_for_file_to_be_created(pathlib.Path(app_ready_file_path))
File "tests/utils/lttngtest/environment.py", line 168, in _wait_for_file_to_be_created
if os.path.exists(sync_file_path):
File "/usr/lib/python3.5/genericpath.py", line 19, in exists
os.stat(path)
TypeError: argument should be string, bytes or integer, not PosixPath
[1] https://peps.python.org/pep-0519/
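A minimal sketch of such a compat conversion, for illustration only. The
helper name `path_to_str` is an assumption and is not taken from the
lttngtest sources; it uses `os.fspath()` where available (Python 3.6+)
and falls back to `str()` otherwise.

```python
import os
import pathlib


def path_to_str(path):
    """Return a plain string for a str or pathlib path object.

    Hypothetical helper for illustration; the name is an assumption.
    """
    if isinstance(path, str):
        return path
    if hasattr(os, "fspath"):
        # PEP 519 support (Python 3.6+).
        return os.fspath(path)
    # Older interpreters: pathlib objects convert cleanly through str().
    return str(path)


# Safe on interpreters with or without PEP 519 support.
exists = os.path.exists(path_to_str(pathlib.Path(".")))
```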
Change-Id: I783e36f61223d44667294ccbf4b3aec5bff68701
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Jul 2023 16:26:42 +0000 (12:26 -0400)]
Fix: lttng-add-context: context type options possible null dereference
Coverity reports that:
** CID 1518091: Null pointer dereferences (FORWARD_NULL)
/src/bin/lttng/commands/add_context.cpp: 820 in destroy_ctx_type(<unnamed>::ctx_type *)(
Free application context options only if type->opt isn't null.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Icc27d04480c4821ed33127f5baf293510cdb314e
Jérémie Galarneau [Thu, 29 Jun 2023 18:04:37 +0000 (14:04 -0400)]
Fix: consumerd: slow metadata push slows down application registration
Issue observed
--------------
When rotating the channels of a session configured with a "per-pid"
buffer sharing policy, applications with a long registration
timeout (e.g. LTTNG_UST_REGISTER_TIMEOUT=-1, see LTTNG-UST(3)) sometimes
experience long start-up times.
Cause
-----
The session list lock is held during the registration of an application
and during the setup of a rotation.
While setting up a rotation in the userspace domain, the session daemon
flushes its metadata cache to the userspace consumer daemon and waits
for a confirmation that all metadata emitted before that point in time
has been serialized (whether on disk or sent through a network output).
As the consumer daemon waits for the metadata to be consumed, it
periodically checks the metadata stream's output position with a 200ms
delay (see DEFAULT_METADATA_AVAILABILITY_WAIT_TIME).
In practice, in per-uid mode, this delay is seldom encountered since
the metadata has already been pushed by the consumption thread.
Moreover, if it was not, a single polling iteration will typically
suffice.
However, in per-pid buffering mode and with a sustained "heavy" data
production rate, this delay becomes problematic since:
- metadata is pushed for every application,
- the delay is hit almost systematically as the consumption thread
is busy and has to catch up to consume the most recent metadata.
Hence, some rotation setups can easily take multiple seconds (at least
200ms per application). This makes the locking scheme employed on that
path unsuitable as it blocks some operations (like application
registrations) for an extended period of time.
Solution
--------
The polling "back-off" delay is eliminated by using a waiter that allows
the consumer daemon thread that runs the metadata push command to
wake up whenever the criteria used to evaluate the "pushed" metadata
position are changed.
Those criteria are:
- the metadata stream's pushed position
- the lifetime of the metadata channel's stream
- the status of the session's endpoint
Whenever those states are affected, the waiters are woken up to force a
re-evaluation of the metadata cache flush position and, eventually,
cause the metadata push command to complete.
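The idea of replacing the polling delay with a wake-up on state change
can be sketched as follows. This is only an illustration built on
Python's `threading.Condition`; the actual change uses a waiter queue
adapted from urcu-wait.h, and every class and method name below is
hypothetical.

```python
import threading


class MetadataPushWaiter:
    """Illustrative stand-in for the waiter; all names are hypothetical."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pushed_position = 0
        self._stream_alive = True
        self._endpoint_active = True

    def set_pushed_position(self, position):
        with self._cond:
            self._pushed_position = position
            # Wake every waiter so it re-evaluates its criteria.
            self._cond.notify_all()

    def close_stream(self):
        with self._cond:
            self._stream_alive = False
            self._cond.notify_all()

    def deactivate_endpoint(self):
        with self._cond:
            self._endpoint_active = False
            self._cond.notify_all()

    def wait_for_push(self, target_position, timeout=None):
        """Block until the target is reached or waiting became pointless.

        Returns True only if the target position was reached.
        """
        def done():
            return (self._pushed_position >= target_position
                    or not self._stream_alive
                    or not self._endpoint_active)

        with self._cond:
            self._cond.wait_for(done, timeout)
            return self._pushed_position >= target_position
```

A thread calling `wait_for_push()` sleeps until one of the three criteria
changes, instead of re-checking the position after a fixed delay.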
Note
----
The waiter queue is adapted from urcu-wait.h of liburcu (also LGPL
licensed).
Change-Id: Ib86c2e878abe205c73f930e6de958c0b10486a37
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Kienan Stewart [Thu, 29 Jun 2023 14:32:59 +0000 (10:32 -0400)]
Docs: Fix broken reference in lttng-add-trigger
Change-Id: I4068570d188fbf75e402898234944b6e21cfa2a1
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Kienan Stewart [Thu, 6 Jul 2023 20:20:24 +0000 (16:20 -0400)]
Docs: Fix broken reference to lttng-concepts(7) man page
Change-Id: Iaa700e06ec98a3a451f10b4e287c7b28e6ff4524
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Kienan Stewart [Wed, 21 Jun 2023 13:39:06 +0000 (09:39 -0400)]
Tests: Preemptively fail infinite blocking tests when low on disk space
In the system tests run by LAVA, the infinite blocking tests were
hanging when the system under test ran out of disk space. This is the
expected behaviour of the failing test, but the condition can be
detected and the tests preemptively failed with a clear error of what
needs to be addressed in the system being tested.
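A minimal sketch of this kind of pre-flight check, written in Python for
illustration. The function name and the way the threshold is chosen are
assumptions; the actual tests may implement the check differently.

```python
import shutil


def require_free_space(path, required_bytes):
    """Fail early when the filesystem holding `path` is low on space.

    Hypothetical helper: returns the number of free bytes, or raises
    RuntimeError with a message that names the condition to fix.
    """
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        raise RuntimeError(
            "Not enough disk space under {!r}: {} bytes free, "
            "{} bytes required".format(path, free, required_bytes)
        )
    return free
```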
Change-Id: I9e6126408b57c2cd5aa64c2e360e0672f9eb2314
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 17 May 2023 17:41:03 +0000 (13:41 -0400)]
Fix: sessiond: bad fd used while rotating exiting app's buffers
Issue observed
--------------
From bug #1372:
We are observing seemingly random crashes in the LTTng consumer daemon
when tracing a C++ application with LTTng-UST. Our workload has a single
printf-like tracepoint, where each string is in the order of 1kb and the
total output is around 30MB/s.
LTTng is set up with a single session and channel enabling this
tracepoint, and we enabled rotation with a maximum size of 100MB or
every 30 seconds. We are periodically starting new traced processes and
the system runs close to 100% CPU load. This ran on an AWS
Graviton2 (ARM) instance with CentOS 7 and a 5.4 kernel, using LTTng-UST
2.13.5 and LTTng-tools 2.13.8.
The first reported error is a write to a bad file descriptor (-1),
apparently when waking up the metadata poll thread during a rotation.
Cause
-----
Inspecting the logs, we see that the metadata channel with key 574 has a
negative poll fd write end which causes the write in
consumer_metadata_wakeup_pipe to fail because of an invalid file
descriptor:
DBG1 - 15:12:13.271001175 [6593/6605]: Waking up metadata poll thread (writing to pipe): channel name = 'metadata', channel key = 574 (in consumer_metadata_wakeup_pipe() at consumer.c:888)
DBG3 - 15:12:13.271010093 [6593/6605]: write() fd = -1 (in consumer_metadata_wakeup_pipe() at consumer.c:892)
PERROR - 15:12:13.271014655 [6593/6605]: Failed to write to UST metadata pipe while attempting to wake-up the metadata poll thread: Bad file descriptor (in consumer_metadata_wakeup_pipe() at consumer.c:907)
Error: Failed to dump the metadata cache
Error: Rotate channel failed
Meanwhile, a lot of applications seem to be unregistering. Notably, the
application associated with that metadata channel is being torn down.
Leading up to the use of a bad file descriptor, the chain of events is:
1) The "rotation" thread starts to issue "Consumer rotate channel" on
key 574 (@ `15:12:12.865621802`), but blocks on the consumer socket
lock. We can deduce this from the fact that thread "6605" in the
consumer wakes up to process an unrelated command originating from the
same socket.
We don't see that command being issued by the session daemon, most
likely because it occurs just before the captured logs start. All
call sites that use this socket take the socket lock, issue their
command, wait for a reply, and release the socket lock.
2) The application unregisters (@ `15:12:13.269722736`). The
`registry_session`, which owns the metadata contents, is destroyed
during `delete_ust_app_session` which is done directly as a consequence
of the app unregistration (through a deferred RCU call), see
`ust_app_unregister`.
This is problematic since the consumer will request the metadata during
the rotation of the metadata channel. In the logs, we can see that
the "close_metadata" command blocks on the consumer socket lock.
However, the problem occurs when the `manage-apps` thread acquires the lock
before the "rotation" thread. In this instance, the "close-metadata"
command is performed by the consumer daemon, closing the metadata
poll file descriptor.
3) As the "close_metadata" command completes, the rotation thread
successfully acquires the socket lock. It is not aware of the
unregistration of the application and of the subsequent tear-down of the
application, registry, and channels since it was already iterating on
the application's channels.
The consumer starts to process the channel rotation command (@
`15:12:13.270633213`) which fails on the metadata poll fd.
Essentially, we must ensure that the lifetime of metadata
channel/streams exceeds any ongoing rotation, and prevent a rotation
from being launched when an application is being torn-down in per-PID
buffering mode.
The problem is fairly hard to reproduce as it requires threads to
wake up in the problematic order described above. I don't have a
straightforward reproducer for the moment.
Solution
--------
During the execution of a rotation on a per-pid session, the session
daemon iterates on all applications to rotate their data and metadata
channels.
The `ust_app` itself is correctly protected: it is owned by an RCU HT
(`ust_app_ht`) and the RCU read lock is acquired as required to protect
the lifetime of the storage of `ust_app`. However, there is no way to
lock an `ust_app` instance itself.
The rotation command assumes that if it finds the `ust_app`, it will be
able to rotate all of its channels. This isn't true: the `ust_app` can
be unregistered by the `manage-applications` thread which monitors the
application sockets for their deaths in order to tear down the
applications.
The `ust_app` doesn't directly own its channels; they are owned by an
`ust_app_session` which, itself, has a `lock` mutex. Also, the metadata
of the application is owned by the "session registry", which itself can
also be locked.
At a high-level, we want to ensure that the metadata isn't closed while
a rotation is being set up. The registry lock could provide this
guarantee. However, it currently needs to remain unlocked during the
setup of the rotation as it is used when providing the metadata to the
consumer daemon.
Taking the registry lock over the duration of the setup would result in
a deadlock like so:
- the consumer buffer consumption thread consumed a data buffer and attempts
a metadata sync,
- the command handling thread of the consumer daemon attempts to rotate
any stream that is already at its rotation position and locks on the
channel lock held by the consumption thread,
- the metadata sync launches a metadata request against the session
daemon which attempts to refresh the metadata contents through the
command socket,
- the command handling thread never services the metadata "refresh" sent
by the session daemon since it is locked against the same channel as
the buffer consumption thread, resulting in a deadlock.
Instead, a different approach is required: extending the lifetime of the
application's channels over the duration of the setup of a rotation.
To do so, the `ust_app` structure (which represents a registered
application) is now reference-counted. A reference is acquired over the
duration of the rotation's setup phase. This reference transitively
holds a reference to the application's tracing buffers.
Note that taking a reference doesn't prevent applications from
unregistering; it simply defers the reclamation of their buffers to the
end of the rotation setup.
As the rotation completes its setup phase, the references to the
application (and thus, its tracing buffers) are released, allowing the
reclamation of all buffering resources.
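The deferred-reclamation behaviour can be sketched as follows. This is a
Python illustration only, not the `ust_app` implementation; the class,
method, and attribute names are hypothetical.

```python
import threading


class RefCountedApp:
    """Illustrative reference-counted application; names are hypothetical."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._refs = 1  # Reference owned by the application registry.
        self.buffers_reclaimed = False

    def get(self):
        """Try to acquire a reference; fails once reclamation has started."""
        with self._lock:
            if self._refs == 0:
                return False
            self._refs += 1
            return True

    def put(self):
        """Release a reference; the last release reclaims the buffers."""
        with self._lock:
            assert self._refs > 0
            self._refs -= 1
            last = self._refs == 0
        if last:
            self.buffers_reclaimed = True


def setup_rotation(apps, sample_positions):
    """Hold a reference on each app while its positions are sampled."""
    held = [app for app in apps if app.get()]
    try:
        for app in held:
            sample_positions(app)
    finally:
        for app in held:
            app.put()
```

If an application unregisters while `setup_rotation()` is running, its
registry reference is dropped immediately, but the buffers are only
reclaimed when the rotation releases its own reference.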
Note that the setup phase of the rotation doesn't last long so it
shouldn't significantly change the observable behaviour in terms of
memory usage. The setup phase mostly consists in sampling the
consumption/production positions of all buffers in order to establish a
switch-over point between the old and new files.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8dc1ee45dd00c85556dd70d34a3af4f3a4d4e7cb
Jérémie Galarneau [Mon, 24 Jul 2023 20:45:07 +0000 (16:45 -0400)]
Fix: sessiond: leak of application context in channel
Issue observed
--------------
ASAN generates the following report when the session daemon exits after
running the tests/regression/tools/context/test_ust.py test suite.
lttng-sessiond: ==930543==ERROR: LeakSanitizer: detected memory leaks
lttng-sessiond: Direct leak of 8 byte(s) in 1 object(s) allocated from:
lttng-sessiond: 0 0x7f8d1706c33a in __interceptor_strdup /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_interceptors.cpp:454
lttng-sessiond: 1 0x55e36fa6d107 in alloc_ust_app_ctx /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:1368
lttng-sessiond: 2 0x55e36fa82f73 in create_ust_app_channel_context /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:2912
lttng-sessiond: 3 0x55e36fa9eeac in ust_app_channel_create /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:5062
lttng-sessiond: 4 0x55e36faa9fef in find_or_create_ust_app_channel /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:5936
lttng-sessiond: 5 0x55e36faab610 in ust_app_synchronize_all_channels /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:6147
lttng-sessiond: 6 0x55e36faac12e in ust_app_synchronize /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:6208
lttng-sessiond: 7 0x55e36faacc29 in ust_app_global_update(ltt_ust_session*, ust_app*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:6268
lttng-sessiond: 8 0x55e36faa910e in ust_app_start_trace_all(ltt_ust_session*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:5850
lttng-sessiond: 9 0x55e36f920343 in cmd_start_trace(ltt_session*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:2826
lttng-sessiond: 10 0x55e36f9ffac5 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1779
lttng-sessiond: 11 0x55e36fa077c0 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2588
lttng-sessiond: 12 0x55e36f9e4d85 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:67
lttng-sessiond: 13 0x7f8d15c9d44a (/usr/lib/libc.so.6+0x8744a) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
lttng-sessiond: Direct leak of 5 byte(s) in 1 object(s) allocated from:
lttng-sessiond: 0 0x7f8d1706c33a in __interceptor_strdup /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_interceptors.cpp:454
lttng-sessiond: 1 0x55e36fa6d059 in alloc_ust_app_ctx /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:1367
lttng-sessiond: 2 0x55e36fa82f73 in create_ust_app_channel_context /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:2912
lttng-sessiond: 3 0x55e36fa9eeac in ust_app_channel_create /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:5062
lttng-sessiond: 4 0x55e36faa9fef in find_or_create_ust_app_channel /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:5936
lttng-sessiond: 5 0x55e36faab610 in ust_app_synchronize_all_channels /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:6147
lttng-sessiond: 6 0x55e36faac12e in ust_app_synchronize /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:6208
lttng-sessiond: 7 0x55e36faacc29 in ust_app_global_update(ltt_ust_session*, ust_app*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:6268
lttng-sessiond: 8 0x55e36faa910e in ust_app_start_trace_all(ltt_ust_session*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/ust-app.cpp:5850
lttng-sessiond: 9 0x55e36f920343 in cmd_start_trace(ltt_session*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:2826
lttng-sessiond: 10 0x55e36f9ffac5 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1779
lttng-sessiond: 11 0x55e36fa077c0 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2588
lttng-sessiond: 12 0x55e36f9e4d85 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:67
lttng-sessiond: 13 0x7f8d15c9d44a (/usr/lib/libc.so.6+0x8744a) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
lttng-sessiond: SUMMARY: AddressSanitizer: 13 byte(s) leaked in 2 allocation(s).
Cause
-----
In the case of application contexts, alloc_ust_app_ctx() copies the
provider and application context names. However, these fields are not
free'd by delete_ust_app_ctx().
Solution
--------
The application context and provider names are free'd during
delete_ust_app_ctx() when the context type is LTTNG_UST_ABI_CONTEXT_APP.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0759018ec1811cf6246b5a80d4f5a7545c63910a
Jérémie Galarneau [Mon, 24 Jul 2023 20:10:50 +0000 (16:10 -0400)]
Fix: lttng-add-context: leak of application context parameters
Issue observed
--------------
ASAN reports the following leak when running the
tests/regression/tools/context/test_ust.py test suite:
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x7f32e5ae0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x5653e1092088 in zmalloc_internal ../../../src/common/macros.hpp:60
#2 0x5653e10922b3 in char* calloc<char>(unsigned long) string-utils/../macros.hpp:113
#3 0x5653e119d68f in get_context_type commands/add_context.cpp:1012
#4 0x5653e119ddf5 in cmd_add_context(int, char const**) commands/add_context.cpp:1059
#5 0x5653e11e12e7 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:237
#6 0x5653e11e2027 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
#7 0x5653e11e24e1 in _main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:474
#8 0x5653e11e25bd in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:485
#9 0x7f32e3e3984f (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x7f32e5ae0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x5653e1092088 in zmalloc_internal ../../../src/common/macros.hpp:60
#2 0x5653e10922b3 in char* calloc<char>(unsigned long) string-utils/../macros.hpp:113
#3 0x5653e119d2ae in get_context_type commands/add_context.cpp:1003
#4 0x5653e119ddf5 in cmd_add_context(int, char const**) commands/add_context.cpp:1059
#5 0x5653e11e12e7 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:237
#6 0x5653e11e2027 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
#7 0x5653e11e24e1 in _main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:474
#8 0x5653e11e25bd in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:485
#9 0x7f32e3e3984f (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
Cause
-----
The context and provider names are dynamically allocated by
get_context_type() and stored in ctx_type. However, destroy_ctx_type()
never frees those members when the structure is of type
CONTEXT_APP_CONTEXT.
Solution
--------
Free both names when an application context type is destroyed.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I86dde1eed9f0cc63499c936cf373b094168035e2
Jérémie Galarneau [Mon, 24 Jul 2023 19:25:39 +0000 (15:25 -0400)]
Fix: sessiond: memory leak of lttng_pipe structure
Issue observed
--------------
When running the session daemon under ASAN, the following report is
produced:
Direct leak of 104 byte(s) in 1 object(s) allocated from:
#0 0x7f93866e0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x55c55a7c4963 in zmalloc_internal /home/simark/src/lttng-tools/src/common/macros.hpp:60
#2 0x55c55a7c4973 in lttng_pipe* zmalloc<lttng_pipe>() /home/simark/src/lttng-tools/src/common/macros.hpp:88
#3 0x55c55a7c26eb in _pipe_create /home/simark/src/lttng-tools/src/common/pipe.cpp:111
#4 0x55c55a7c351d in lttng_pipe_open(int) /home/simark/src/lttng-tools/src/common/pipe.cpp:185
#5 0x55c55a586dd6 in operator() /home/simark/src/lttng-tools/src/bin/lttng-sessiond/rotation-thread.cpp:403
#6 0x55c55a58744a in lttng::sessiond::rotation_thread::rotation_thread(lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&) /home/simark/src/lttng-tools/src/bin/lttng-sessiond/rotation-thread.cpp:402
#7 0x55c55a46377f in std::unique_ptr<lttng::sessiond::rotation_thread, std::default_delete<lttng::sessiond::rotation_thread> > lttng::make_unique<lttng::sessiond::rotation_thread, lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&>(lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&) /home/simark/src/lttng-tools/src/common/make-unique.hpp:18
#8 0x55c55a455024 in _main /home/simark/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1773
#9 0x55c55a455c2e in main /home/simark/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1982
#10 0x7f9385c1484f (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
Cause
-----
On destruction, the std::unique_ptr wrapper of
lttng_pipe (lttng_pipe::uptr) invokes `lttng_pipe_close` (which only
closes the file descriptors of the underlying pipe) rather than
`lttng_pipe_destroy` which closes the file descriptors _and_ frees the
memory allocated by lttng_pipe_open.
Currently, the rotation thread is the only user of this wrapper (through
its quit_pipe).
Solution
--------
The deleter of lttng_pipe::uptr is replaced to invoke lttng_pipe_destroy.
Fixes #1380
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5715ac6131c5aa134cfd18d8b677f31aabed36f0
Jérémie Galarneau [Fri, 14 Jul 2023 21:26:26 +0000 (17:26 -0400)]
event-rule: set event rule loglevel to domain specific value when unset
The various domains that support log levels define different values for
ALL/UNSET that are not -1. Using these domain values makes reasoning
about the code simpler as -1 does not necessarily have a defined meaning
in all domains.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2e818d42f72d3b12e44e375ce30367bf1f6d5463
Kienan Stewart [Wed, 28 Jun 2023 15:28:55 +0000 (11:28 -0400)]
Fix: sessiond: preserve jul/log4j domain loglevels
Issue observed
==============
Following dcd24bbf7dbc74e3584d1d0d52715e749023c452, the
lttng-ust-java-tests started failing with a number of errors such as
the following [1]:
```
org.opentest4j.AssertionFailedError: expected: java.util.Collections$SingletonList@3270d194<[Event name = eventA, Log level selector = (LTTNG_EVENT_LOGLEVEL_ALL), Filter string = logger_name == "eventA"]> but was: java.util.ArrayList@235834f2<[Event name = eventA, Log level selector = (LTTNG_EVENT_LOGLEVEL_ALL), Filter string = logger_name == "eventA"]>
at org.lttng.ust.agent.integration.client.TcpClientIT.testEnableEvent(TcpClientIT.java:187
```
While the assertion failure print out looks like the events are the
same, there is a difference between the objects which is not
printed: the loglevel integer value. For example:
```
eventA [level -2147483648, type 0]: logger_name == "eventA"
eventB [level -2147483648, type 0]: logger_name == "eventB"
eventA [level -1, type 0]: logger_name == "eventA"
eventB [level -1, type 0]: logger_name == "eventB"
```
Cause
=====
When events are created from payloads in
`src/common/event.cpp:lttng_event_create_from_payload`, the loglevel
value is coerced to `-1` when the loglevel_type is
LTTNG_EVENT_LOGLEVEL_ALL.
Consider the event created in `lttng enable-event --jul
eventName`. The loglevel_type and loglevel will be set as follows:
* loglevel_type: LTTNG_EVENT_LOGLEVEL_ALL (0)
* loglevel: LTTNG_EVENT_LOGLEVEL_JUL_ALL (-2147483648)
The event created is then serialized and sent to the sessiond which
recreates it from the payload removing the value set initially.
The normalization performed in `src/bin/lttng-sessiond/cmd.cpp` has
the same effect.
Solution
========
Remove the normalization of the loglevel to -1 when events with
loglevel_type LTTNG_EVENT_LOGLEVEL_ALL are created from payloads or
processed via `_cmd_enable_event`.
A test has been added to confirm that the modification doesn't regress
on the issue flagged in https://bugs.lttng.org/issues/1373 which led
to the normalization changes being made.
This change isn't an exhaustive audit of the packet outputs which may
or may not leak the '-1' "unset" value. One potential change to the
normalization may be to have the normalization be domain-aware and
always default to the domain's "ALL" value. Note that not all domains
have the concept of an "unset" level.
References
==========
[1] https://ci.lttng.org/job/lttng-ust-java-tests_master_build/java_version=java-11-openjdk,platform=bionic-amd64/3302/consoleFull
Refs: #1373
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iac653157190b61b44d5ff18ac968fef58330a106
Michael Jeanson [Thu, 27 Apr 2023 18:47:53 +0000 (14:47 -0400)]
Fix: lttng-elf: add missing include for uint64_t
Change-Id: Ic10f9c51197d1c6942ce76a7ee4be4a75f51db47
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 15 Jun 2023 20:38:08 +0000 (16:38 -0400)]
Tests: add a recording rule listing test
Add a test that validates that recording rules can be added to a channel
and listed back. Most of the changes are to the lttngtest package in
order to provide the requisite capabilities to the test itself.
This test is also intended as a reproducer for #1373.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1c6b5e072934ba9760cb1f838395e18a4c6bb211
Jérémie Galarneau [Wed, 14 Jun 2023 23:03:12 +0000 (19:03 -0400)]
Fix: sessiond: crash enabling event rules that differ only by loglevel type
Issue observed
--------------
Summarizing bug #1373, an assertion fails when enabling two event-rules
that only differ by their log level selection type (all, range, or
single).
This issue can be reproduced by launching an instrumented
application (which remains active over the duration of this test) and
running:
$ lttng create test_session
$ lttng enable-channel --userspace test_channel
$ lttng start test_session
$ lttng enable-event --userspace --session test_session --channel test_channel 'l*' --loglevel-only=TRACE_DEBUG_LINE
$ lttng enable-event --userspace --session test_session --channel test_channel 'l*' --loglevel=TRACE_DEBUG_LINE
Cause
-----
A number of sites conflate log level values and type. A clean-up had
been performed previously as of 2106efa08 and through follow-up commits.
However, some instances were apparently missed at the time.
As such, add_unique_ust_app_event mixed loglevel values and types when
initializing the key used for the unique insertion. The log level type,
for its part, is completely ignored.
Going back to the reproducer above, the first insertion succeeds as
expected. The second insertion fails since there is already an app event
rule with the `TRACE_DEBUG_LINE` log level type.
Moreover, the matching function only matches on the log level
type (which is really the value), which is also a bug.
The "matching" function is invoked on the key of the second event rule
and the first event rule since the hashing is only performed on the
event-rule's name pattern, resulting in a collision.
Solution
--------
Both the log level value and log level types are used to perform the
matching within the ust-app module. This implies extending the
ust_app_ht_key to store the log level value.
To simplify the matching logic (which attempted to handle 0 and -1
having the same meaning when the "ALL" log level type was used), the log
level value is normalized to '-1' throughout.
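The matching rule described above can be sketched as a toy table keyed on
the name pattern, the log level type, and the log level value. This is a
Python illustration only, not the ust-app code; the class names are
hypothetical and the numeric log level used in the usage note is a
placeholder.

```python
from enum import Enum


class LoglevelType(Enum):
    ALL = 0
    RANGE = 1
    SINGLE = 2


class EventRuleTable:
    """Toy table of event rules keyed on name, log level type and value."""

    def __init__(self):
        self._rules = set()

    def add_unique(self, name, loglevel_type, loglevel_value):
        """Insert a rule; return False if an identical rule already exists."""
        if loglevel_type is LoglevelType.ALL:
            # Normalize the value so that "ALL" rules always compare equal.
            loglevel_value = -1
        key = (name, loglevel_type, loglevel_value)
        if key in self._rules:
            return False
        self._rules.add(key)
        return True
```

With such a key, `add_unique("l*", LoglevelType.SINGLE, 13)` followed by
`add_unique("l*", LoglevelType.RANGE, 13)` both succeed, since the two
rules differ by their log level type.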
Fixes #1373
Reported-by: Bogdan Codres <bogdan.codres@windriver.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I869a0fb7a6554da7d84bc71df6ee91a7e46937cc
Jérémie Galarneau [Fri, 16 Jun 2023 21:01:13 +0000 (17:01 -0400)]
Tests fix: lttngtest: quote session output path
The output path used with the `--output` option must be quoted since it
can contain spaces.
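A minimal sketch of quoting the path when building a command line as a
string, assuming `shlex.quote()` is used. The helper name is hypothetical
and this is not necessarily how lttngtest builds its commands.

```python
import shlex


def build_create_session_command(session_name, output_path):
    """Build an `lttng create` command line with a quoted output path.

    Hypothetical helper for illustration.
    """
    return "lttng create {} --output={}".format(
        shlex.quote(session_name), shlex.quote(str(output_path))
    )
```

A path such as `/tmp/my traces` then reaches the client as a single
argument instead of being split on the space.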
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7bf557f64d3fb72a10230ad9da9d59872672e11c
Jérémie Galarneau [Thu, 15 Jun 2023 20:37:33 +0000 (16:37 -0400)]
Tests: remove accidental import of cgi.test
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I17c1776e8981e212636b65244a58389f194b5d4c
Jérémie Galarneau [Thu, 15 Jun 2023 16:16:59 +0000 (12:16 -0400)]
Tests: lttngtest: raise NotImplementedError in abstract class methods
It is preferable to error-out when a Controller doesn't support a method
rather than silently failing.
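A minimal sketch of the pattern, with hypothetical class and method names
that are not taken from lttngtest:

```python
class Controller:
    """Hypothetical abstract controller; method names are illustrative."""

    def create_session(self, name):
        raise NotImplementedError

    def rotate_session(self, name):
        raise NotImplementedError


class CreateOnlyController(Controller):
    """Partial implementation: only session creation is supported."""

    def __init__(self):
        self.sessions = []

    def create_session(self, name):
        self.sessions.append(name)
        return name
```

Calling `rotate_session()` on `CreateOnlyController` raises
`NotImplementedError` instead of returning `None` and letting the test
carry on as if the rotation had happened.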
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3751bf4cff97f400ae53e07efb2740e8426992e9
Jérémie Galarneau [Thu, 15 Jun 2023 16:14:01 +0000 (12:14 -0400)]
Tests: lttngtest: add methods to control session rotations
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8b8513fa37e6a51e034e5832499085c501e34ac4
Jérémie Galarneau [Thu, 15 Jun 2023 16:09:05 +0000 (12:09 -0400)]
Tests: lttngctl.py: allow the creation of per-pid buffers
Add BufferSharingPolicy which can be used with the add_channel method of
a Session to create per-pid channels.
This is done to allow tests for the per-pid buffer sharing mode to be
written using the python utils.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ice3b5d3a8db69c95aaa632c45f42069ab2be590c
Jérémie Galarneau [Thu, 15 Jun 2023 16:00:40 +0000 (12:00 -0400)]
Tests: introduce WaitTraceTestApplicationGroup
A new test helper, WaitTraceTestApplicationGroup, is added to make it
easier to write tests that require a group of applications.
A WaitTraceTestApplicationGroup can be used to:
- launch a given number of applications,
- make them trace (almost) all at once,
- wait for the group of applications to complete
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I18023f1c507993a3e5647f87cd74ead3be392163
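The launch, trace together, then wait sequence listed in this commit can be sketched with threads standing in for test applications. The `ApplicationGroup` class and its method names are hypothetical; the real helper launches processes and is not reproduced here.

```python
import threading


class ApplicationGroup:
    """Hypothetical stand-in: each thread plays the role of one application."""

    def __init__(self, count):
        self._go = threading.Event()
        self._lock = threading.Lock()
        self.traced = []
        # Launch a given number of "applications"; they block until released.
        self._threads = [
            threading.Thread(target=self._run, args=(index,)) for index in range(count)
        ]
        for thread in self._threads:
            thread.start()

    def _run(self, index):
        self._go.wait()
        with self._lock:
            self.traced.append(index)

    def trace(self):
        # Release every application (almost) at once.
        self._go.set()

    def wait_for_exit(self):
        # Wait for the whole group to complete.
        for thread in self._threads:
            thread.join()
```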
Jérémie Galarneau [Mon, 15 May 2023 19:49:04 +0000 (15:49 -0400)]
Clean-up: sessiond: ust-app: ua_sess is never populated on failure
When find_or_create_ust_app_session() fails, it doesn't populate its
return parameter. Therefore, it is unnecessary to destroy the app
session when it returns < 0.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8f5cc75f718d96d32fb67fc67135034eb95365d7
Jérémie Galarneau [Wed, 3 May 2023 19:24:51 +0000 (15:24 -0400)]
Tests: user space context: remove unneeded test import
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2c109562022fdb4934fdd950ce106cfd66509cd3
Jérémie Galarneau [Wed, 3 May 2023 18:55:09 +0000 (14:55 -0400)]
Tests: environment: base WaitTraceTestApplication on gen-ust-events
WaitTraceTestApplication wraps gen-ust-nevents, but that test app offers
few synchronization points compared to gen-ust-events.
Synchronization points signalling that the application has reached its
`main()` and that it is about to emit its first event are added to
gen-ust-events. WaitTraceTestApplication is adapted to use those new
options.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idf717fbaa9108d48a3f7d2b26946a4e5c5dfffd5
Jérémie Galarneau [Wed, 3 May 2023 18:14:51 +0000 (14:14 -0400)]
Tests: environment: use a context manager to restore original signal handler
The original signal handler for SIGUSR1 was not restored when launching
the session daemon. This didn't cause any issue, but is unexpected and
would have been a real head scratcher if we wanted to use it later.
To prevent us from forgetting to restore signals when using a
_SignalWaitQueue, a context manager is provided to handle that
automatically.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I86599846ad2000af5d1e3c5cb2a0a73fb1c91455
Jérémie Galarneau [Sat, 4 Feb 2023 00:24:54 +0000 (19:24 -0500)]
common: Add a default nullptr argument to make_unique_wrapper
When wrapping C libraries that return unmanaged pointers,
lttng::make_unique_wrapper makes it easier to locally "wrap" returned
pointers.
auto val = lttng::make_unique_wrapper<struct foo *,
lttng::free>(some_func());
However, in its current form, a nullptr must be passed to define an
alias:
using my_type_uptr =
decltype(lttng::make_unique_wrapper<lttng_session,
lttng::free>(nullptr));
Adding a default nullptr argument cuts down a bit of boilerplate.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9d05d8162d28cc156b1e9ec6fe623f1cc02e9c8e
Kienan Stewart [Fri, 9 Jun 2023 20:30:10 +0000 (16:30 -0400)]
Tests: Reduce runtime for high_throughput_limits test
The test currently takes about 2 hours to run. The iteration count and
range of bandwidth limits have been adjusted to reduce the runtime,
while hopefully covering the same goal: attempting to catch rare race
conditions in low-bandwidth scenarios.
The iteration count has been set so that some events should be dropped,
and a diagnostic message is emitted if no events are dropped.
Change-Id: I670414b60c6881c3a6f7aafd2c1b0c4540e6707f
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 23 May 2023 19:22:04 +0000 (15:22 -0400)]
fix: tests: grep for '$key =' in metadata
Always grep for '$key =' to avoid a collision with a value, for example
if you are looking for the 'domain' key and your hostname contains
'domain'.
While we are here, add a bunch of logging to facilitate debugging in the
future.
Change-Id: I09197169ab7f842921c9139fdeb36007d7b20cfb
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
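The key/value collision described in this commit can be reproduced with a short sketch. The test itself uses grep; the Python below is only an illustration, and the `find_env_value` helper and the sample metadata are hypothetical.

```python
import re

# Hypothetical metadata excerpt in which the hostname contains the text "domain".
METADATA = 'env {\n\thostname = "mydomain-host";\n\tdomain = "ust";\n};\n'


def find_env_value(metadata_text, key):
    # Match "<key> =" at the start of a line so that a value which merely
    # contains the key text is not mistaken for the key itself.
    pattern = re.compile(
        r"^\s*" + re.escape(key) + r"\s*=\s*(.*?);?\s*$", re.MULTILINE
    )
    match = pattern.search(metadata_text)
    return match.group(1) if match else None


# A plain substring search hits the hostname line first.
first_substring_hit = next(line for line in METADATA.splitlines() if "domain" in line)
domain_value = find_env_value(METADATA, "domain")
```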
Jérémie Galarneau [Fri, 9 Jun 2023 20:28:09 +0000 (16:28 -0400)]
Build fix: workaround g++ 4.8 decltype handling bug
g++ 4.8.5 fails to build with the following error:
g++ -std=gnu++11 -DHAVE_CONFIG_H -I../../../include -I../../../include -I../../../src -I../../../src -include config.h -I/home/mjeanson/opt/include -I/home/mjeanson/opt/include -DINSTALL_BIN_PATH=\""/home/mjeanson/opt/bin"\" -fvisibility=hidden -fvisibility-inlines-hidden -fno-strict-aliasing -Wall -Wextra -Wmissing-declarations -Wundef -Wredundant-decls -Wshadow -Wsuggest-attribute=format -Wwrite-strings -Wformat=2 -Wstrict-aliasing -Wmissing-noreturn -Wlogical-op -Winit-self -Wno-incomplete-setjmp-declaration -Wno-gnu-folding-constant -Wno-sign-compare -pthread -Wno-shadow -Wno-missing-field-initializers -MT commands/lttng-list_triggers.o -MD -MP -MF commands/.deps/lttng-list_triggers.Tpo -c -o commands/lttng-list_triggers.o `test -f 'commands/list_triggers.cpp' || echo './'`commands/list_triggers.cpp
In file included from commands/../utils.hpp:12:0,
from commands/../command.hpp:12,
from commands/list_triggers.cpp:8:
../../../src/common/container-wrapper.hpp: In instantiation of ‘typename std::conditional<std::is_pointer<_Dp>::value, ElementType, ElementType&>::type lttng::utils::random_access_container_wrapper<ContainerType, ElementType, ContainerOperations>::operator[](std::size_t) [with ContainerType = const lttng_action*; ElementType = const lttng_action*; ContainerOperations = lttng::ctl::details::const_action_list_operations; typename std::conditional<std::is_pointer<_Dp>::value, ElementType, ElementType&>::type = const lttng_action*; std::size_t = long unsigned int]’:
../../../src/common/container-wrapper.hpp:78:21: required from ‘typename std::conditional<std::is_pointer<U>::value, IteratorElementType, IteratorElementType&>::type lttng::utils::random_access_container_wrapper<ContainerType, ElementType, ContainerOperations>::_iterator<IteratorContainerType, IteratorElementType>::operator*() const [with IteratorContainerType = lttng::utils::random_access_container_wrapper<const lttng_action*, const lttng_action*, lttng::ctl::details::const_action_list_operations>; IteratorElementType = const lttng_action*; ContainerType = const lttng_action*; ElementType = const lttng_action*; ContainerOperations = lttng::ctl::details::const_action_list_operations; typename std::conditional<std::is_pointer<U>::value, IteratorElementType, IteratorElementType&>::type = const lttng_action*]’
commands/list_triggers.cpp:1030:66: required from here
../../../src/common/container-wrapper.hpp:133:69: error: ‘const’ qualifiers cannot be applied to ‘lttng::utils::random_access_container_wrapper<const lttng_action*, const lttng_action*, lttng::ctl::details::const_action_list_operations>&’
const auto& const_this = static_cast<const decltype(*this)&>(*this);
^
This bug was fixed in g++ 5.0, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60420
In this case, we can simply restate the class' type to work around the
issue since the problem is confined to the handling of decltype
declaration specifiers.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3ba6b012af0f43f7cd06d780e6800c42e16cc66c
Jérémie Galarneau [Thu, 8 Jun 2023 18:53:15 +0000 (14:53 -0400)]
Wrap calls to fmt::format to catch formatting exceptions
fmt::format throws when a formatting error is encountered.
Unfortunately, we can't ensure complete coverage of all logging call
sites (e.g. error paths) and it is not desirable for such an exception
to be thrown in those cases.
The formatting error is returned as the formatted string so that it ends
up in the logs or exception messages more or less transparently.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1cb33a5fe87221139eaf9de918b47e0397daa89c
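The approach of returning the formatting error as the formatted string can be sketched as follows. The commit wraps the C++ fmt::format; this is the same idea transposed to Python's `str.format` purely for illustration. The `safe_format` name and the error message wording are hypothetical.

```python
def safe_format(format_string, *args, **kwargs):
    try:
        return format_string.format(*args, **kwargs)
    except (IndexError, KeyError, ValueError) as error:
        # Return the error as the result so that it ends up in the log
        # instead of propagating from a logging call site.
        return "Failed to format string: {!r}".format(error)


ok = safe_format("{} events in {}", 3, "channel0")
bad = safe_format("{} events in {}", 3)  # one argument is missing
```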
Jérémie Galarneau [Thu, 8 Jun 2023 18:20:09 +0000 (14:20 -0400)]
Wrap main functions to handle uncaught exceptions
Coverity reports multiple instances where formatting facilities can
throw (e.g. invalid format string).
A wrapper to handle formatting exceptions is added in a follow-up
change, but it is a good practice to catch exceptions at the top level
nonetheless.
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie55af338b9ef1f7d6e8825055cfc2e7037cdd80e
Jérémie Galarneau [Thu, 8 Jun 2023 17:42:50 +0000 (13:42 -0400)]
Fix: container-wrapper: size container operation can throw
1512923 Uncaught exception
If the exception is ever thrown, the program will crash.
In lttng::utils::random_access_container_wrapper<lttng_action const *, lttng_action const *, lttng::ctl::details::const_action_list_operations>::size(): A C++ exception is thrown but never caught (CWE-248)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5f8bffc64fb239e59b272985f6b3c959d238da0a
Jérémie Galarneau [Wed, 7 Jun 2023 20:57:37 +0000 (16:57 -0400)]
common: silence bogus coverity warning
Coverity reports:
1512891 Uninitialized scalar variable
The variable will contain an arbitrary value left from earlier computations.
In lttng_event_serialize(lttng_event const *, unsigned int, char
const * const *, char const *, unsigned long, lttng_bytecode *, lttng_payload *): Use of an uninitialized variable (CWE-457)
This warning is bogus since lttng_event_exclusion_comm contains a single
field which is already initialized and is packed (no padding possible).
Initialize the header explicitly to silence the warning.
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia1eeee779168b3ac0eb9d0796d503b2d9ab225f2
Jérémie Galarneau [Wed, 7 Jun 2023 20:28:27 +0000 (16:28 -0400)]
Fix: coverity warns of uncaught exception
Coverity warns that some container operations used by
random_access_container_wrapper can throw even though methods are marked
as noexcept.
CID 1512893 (#1-2 of 2): Uncaught exception (UNCAUGHT_EXCEPT)
exn_spec_violation: An exception of type lttng::invalid_argument_error is thrown but the exception specification noexcept doesn't allow it to be thrown. This will result in a call to terminate().
The noexcept specifier is removed from operator* and end() of
random_access_container_wrapper's iterator implementation.
To make this a bit clearer, a bounds check is performed in operator[]
directly which will make errors easier to catch.
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I31d51e8709d33b3c80d64c8c05a23e519e3a93e7
Michael Jeanson [Wed, 7 Jun 2023 14:42:57 +0000 (10:42 -0400)]
Build fix: brace-enclosed initializer lists error with g++ 4.8
Looks like g++ 4.8 is confused by a single argument brace enclosed
initializer list:
utils.hpp: In constructor 'lttng::cli::session_list::session_list(lttng::cli::session_list&&)':
utils.hpp:112:38: error: call of overloaded 'random_access_container_wrapper(<brace-enclosed initializer list>)' is ambiguous
{ std::move(original._container) })
^
Change-Id: I19da292ed9a49bada7dbda5753bf1bd1442e612f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 6 Jun 2023 18:09:44 +0000 (14:09 -0400)]
clang-tidy: ignore generated filter-parser.hpp
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I92adc9c9fca588aad2dece474e89e2daa88f36d0
Jérémie Galarneau [Tue, 6 Jun 2023 17:43:28 +0000 (13:43 -0400)]
Fix clang-tidy cppcoreguidelines-pro-type-const-cast warning
clang-tidy reports:
cppcoreguidelines-pro-type-const-cast; do not use const_cast
The const_cast adds a const qualifier so this warning seems a bit
strict. Regardless, we can dodge the whole question by passing the
exclusion_list as `const char * const *`, which is closer to the
original intention of the API anyhow.
For more information on the safety of these types of casts, see:
https://isocpp.org/wiki/faq/const-correctness#constptrptr-conversion
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia3129b7d1ed4e450f3f2d63920d2fd67c66a6d73
Jérémie Galarneau [Tue, 6 Jun 2023 15:20:06 +0000 (11:20 -0400)]
fmtlib: backport upstream fixes to suppress bogus gcc 13.1 warnings
gcc 13.1 erroneously warns of dangling references when using our custom
formatters. This was reported to both fmtlib and gcc and fixes have been
provided, but are not released yet.
This change backports two fixes from the master branch to our vendored
version:
https://github.com/fmtlib/fmt/commit/f61f15cc5b11582d50d02ba0514c5344f7b2600e
https://github.com/fmtlib/fmt/commit/ef55d4f52ec527668a8e910a56ea79d9b939dbc2
For more information on the issue, see:
https://github.com/fmtlib/fmt/issues/3415
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=6b927b1297e66e26e62e722bf15c921dcbbd25b9
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I30bbbbe5e0aa2729e50228acdb528ee060d9df23
Jérémie Galarneau [Tue, 23 May 2023 23:35:10 +0000 (19:35 -0400)]
Fix: sessiond: incorrect use of exclusions array leads to crash
Issue observed
--------------
When using the CLI to list the configuration of a session that has an
event rule which makes use of multiple exclusions, the session daemon
crashes with the following stack trace:
(gdb) bt
#0 0x00007fa9ed401445 in ?? () from /usr/lib/libc.so.6
#1 0x0000560cd5fc5199 in lttng_strnlen (str=0x615f6f6c6c6568 <error: Cannot access memory at address 0x615f6f6c6c6568>, max=256) at ../../src/common/compat/string.h:19
#2 0x0000560cd5fc6b39 in lttng_event_serialize (event=0x7fa9cc01d8b0, exclusion_count=2, exclusion_list=0x7fa9cc011794, filter_expression=0x0, bytecode_len=0, bytecode=0x0, payload=0x7fa9d3ffda88) at event.c:767
#3 0x0000560cd5f380b5 in list_lttng_ust_global_events (nb_events=<synthetic pointer>, reply_payload=0x7fa9d3ffda88, ust_global=<optimized out>, channel_name=<optimized out>) at cmd.c:472
#4 cmd_list_events (domain=<optimized out>, session=<optimized out>, channel_name=<optimized out>, reply_payload=0x7fa9d3ffda88) at cmd.c:3860
#5 0x0000560cd5f6d76a in process_client_msg (cmd_ctx=0x7fa9d3ffa710, sock=0x7fa9d3ffa5b0, sock_error=0x7fa9d3ffa5b4) at client.c:1890
#6 0x0000560cd5f6f876 in thread_manage_clients (data=0x560cd7879490) at client.c:2629
#7 0x0000560cd5f65a54 in launch_thread (data=0x560cd7879500) at thread.c:66
#8 0x00007fa9ed32d44b in ?? () from /usr/lib/libc.so.6
#9 0x00007fa9ed3b0e40 in ?? () from /usr/lib/libc.so.6
Cause
-----
lttng_event_serialize expects a `char **` list of exclusion names, as
provided by the other callsite in liblttng-ctl. However, the callsite in
list_lttng_ust_global_events passes a pointer to the exclusions as stored
in lttng_event_exclusion.
lttng_event_exclusion contains an array of fixed-length strings (with a
stride of 256 bytes) which isn't an expected layout for
lttng_event_serialize.
Solution
--------
A temporary array of pointers is constructed before invoking
lttng_event_serialize to construct a list of exclusions with the layout
that lttng_event_serialize expects.
The array itself is reused for all events, limiting the number of
allocations.
Note
----
None.
Change-Id: I266a1cc9e9f18e0476177a0047b1d8f468110575
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
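The layout mismatch described in this commit can be illustrated with a small sketch. It models the packed exclusion names as a byte buffer with a 256-byte stride (the stride comes from the commit message) and rebuilds a list of individual names from it. The helper names and the sample names are hypothetical, and the Python only mirrors the idea of the C fix.

```python
NAME_STRIDE = 256  # fixed-length strings, 256 bytes apart (from the commit message)


def pack_exclusions(names):
    # Lay the names out back to back, each padded with NUL bytes to the stride.
    buffer = bytearray()
    for name in names:
        encoded = name.encode("ascii")
        assert len(encoded) < NAME_STRIDE
        buffer += encoded.ljust(NAME_STRIDE, b"\0")
    return bytes(buffer)


def exclusion_names(packed, count):
    # Build the list-of-names layout that a serializer expecting one entry
    # per name can consume.
    names = []
    for index in range(count):
        slot = packed[index * NAME_STRIDE:(index + 1) * NAME_STRIDE]
        names.append(slot.split(b"\0", 1)[0].decode("ascii"))
    return names


packed = pack_exclusions(["hello_a", "world_b"])
# Reading the first 8 bytes of the packed strings as a little-endian pointer
# gives the bogus address seen in frame #1 of the stack trace.
bogus_pointer = int.from_bytes(packed[:8], "little")
```

The value of `bogus_pointer` is 0x615f6f6c6c6568, which is consistent with string data having been dereferenced as if it were an array of pointers.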
Jérémie Galarneau [Tue, 6 Jun 2023 03:39:17 +0000 (23:39 -0400)]
lttng: reuse random_access_container_wrapper for session_list
Reimplement lttng::cli::session_list in terms of the
random_access_container_wrapper utility since the code is essentially
duplicated.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I99160a54d5d2dfa7cf65a6287b271e9c2238c510
Jérémie Galarneau [Mon, 5 Jun 2023 21:39:39 +0000 (17:39 -0400)]
lttng: move session_list to the lttng::cli namespace
This is a preliminary step to re-use the random_access_container util to
replace the implementation of session_list. No behaviour change is
intended by this change.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I32d0354a6985b8c6fb58813cf880ef04a10678dd
Jérémie Galarneau [Thu, 1 Jun 2023 20:19:31 +0000 (16:19 -0400)]
Provide an idiomatic c++ interface for action lists
Replace for_each macros with the use of an iterator. It is done by using
a random_access_container_wrapper util which is intended to wrap
random_access containers implemented in C.
Change-Id: I1b22725b7335f267c9b2d02fc65f9375baf37426
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 23 Jun 2021 16:33:18 +0000 (12:33 -0400)]
actions: list: Add `for_each_action_{const, mutable}()` macros
Accessing all the inner actions of an action list in a loop is a common
access pattern. This commit adds 2 `for_each` macros to iterate over all
elements either using a const or a mutable pointer.
Add a few unit tests for the list action to test these macros.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9aff0b81e1f782b5d20c3fcb82ee7028da8dd810
Francis Deslauriers [Fri, 18 Jun 2021 16:23:35 +0000 (12:23 -0400)]
trace-ust: Rename `{next, used}_channel_id` to `{next, used}_event_container_id`
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idb07462c15cbebaf37a244ad76a32e893ecbab0c
Jérémie Galarneau [Thu, 25 May 2023 23:15:20 +0000 (19:15 -0400)]
Tests fix: test_callstack: output of addr2line incorrectly parsed
Issue observed
--------------
The test_callstack test fails with GCC 13.1 with the following output:
Traceback (most recent call last):
File "/usr/lib/lttng-tools/ptest/tests/regression/././kernel//../../utils/parse-callstack.py", line 160, in <module>
main()
File "/usr/lib/lttng-tools/ptest/tests/regression/././kernel//../../utils/parse-callstack.py", line 155, in main
raise Exception('Expected function name not found in recorded callstack')
Exception: Expected function name not found in recorded callstack
ok 10 - Destroy session callstack
PASS: kernel/test_callstack 10 - Destroy session callstack
not ok 11 - Validate userspace callstack
FAIL: kernel/test_callstack 11 - Validate userspace callstack
Cause
-----
parse-callstack.py uses 'split()' to split the lines of addr2line's
output. By default, 'split()' splits a string on any whitespace.
Typically this was fine as addr2line's output doesn't contain spaces and
the function then splits on new lines.
Typical output of addr2line:
$ addr2line -e ./tests/regression/kernel//../../utils/testapp/gen-syscall-events-callstack/gen-syscall-events-callstack --functions --addresses 0x40124B
0x000000000040124b
my_gettid
/tmp/test-callstack-master/src/lttng-tools/tests/utils/testapp/gen-syscall-events-callstack/gen-syscall-events-callstack.c:40
However, with the test app compiled using gcc 13.1, a "discriminator"
annotation is present:
0x0000000000401279
fct_b
/tmp/test-callstack-master/src/lttng-tools/tests/utils/testapp/gen-syscall-events-callstack/gen-syscall-events-callstack.c:58 (discriminator 1)
Hence, by selecting the second to last element (-2, with negative
indexing), the addr2line function returns '(discriminator' as the
function name.
Solution
--------
The parsing code is changed to simply iterate on groups of 3 lines,
following addr2line's output format.
Fixes #1377
Change-Id: I8c1eab97e84ca7cad171904bed6660540061cf08
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
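The parsing change described in this commit can be sketched as follows. This is a minimal illustration and not the code of parse-callstack.py; the function name `parse_addr2line_output` and the shortened source path are hypothetical.

```python
# Sample addr2line output (--functions --addresses) with a discriminator annotation.
SAMPLE_OUTPUT = (
    "0x0000000000401279\n"
    "fct_b\n"
    "/src/gen-syscall-events-callstack.c:58 (discriminator 1)\n"
)


def parse_addr2line_output(output):
    lines = output.splitlines()
    records = []
    # Each resolved address produces three lines: the address, the function
    # name and the source location. Iterate on groups of three lines.
    for start in range(0, len(lines) - 2, 3):
        address, function, location = lines[start:start + 3]
        records.append((address, function, location))
    return records


records = parse_addr2line_output(SAMPLE_OUTPUT)
# The previous approach: split on any whitespace and take the second to last element.
naive_function = SAMPLE_OUTPUT.split()[-2]
```

On this sample, `naive_function` is `'(discriminator'`, while `records[0][1]` is `'fct_b'`.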
Michael Jeanson [Tue, 2 May 2023 21:06:29 +0000 (17:06 -0400)]
Tests: add more output to test_ust_constructor
To make debugging test failures easier, add a tap result for each lttng
command and for each expected event.
Change-Id: I54872494c967f1e13bf6d5b05dc5c202abf5865e
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 2 May 2023 14:28:54 +0000 (10:28 -0400)]
Tests: convert even more left-over type hint to type comment
Use type comments to support older python3 interpreters that can't
handle type hints (such as 3.4).
Change-Id: I4f97bc3f1e18b8701cb04175d97deb295178883b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 1 May 2023 21:34:36 +0000 (17:34 -0400)]
Docs: test environment: clarify the behavior of a TraceTestApplication
A TraceTestApplication, as opposed to a WaitTraceTestApplication, traces
immediately when it is launched. This is useful to test tracing from a
constructor, but should most likely not be used for other purposes.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7c25a8a60008d584b9376fce74fb925fc02b846a
Michael Jeanson [Mon, 1 May 2023 20:50:57 +0000 (16:50 -0400)]
Tests: convert more left-over type hint to type comment
Use type comments to support older python3 interpreters that can't
handle type hints (such as 3.4).
Change-Id: I3599311dfcc172344c759f8c0e22af640759f1f5
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 18 Apr 2023 19:02:37 +0000 (15:02 -0400)]
Port: fix -Wdeprecated-declarations warning about sprintf on macOS clang 14
Remove uses of sprintf to fix this warning:
warning: 'sprintf' is deprecated: This function is provided for
compatibility reasons only. Due to security concerns inherent in the
design of sprintf(3), it is highly recommended that you use snprintf(3)
instead. [-Wdeprecated-declarations]
Change-Id: Idf3109f2eacafe0a7d18f4c132613f2f85afa09b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 1 May 2023 17:45:27 +0000 (13:45 -0400)]
Tests: ust_constructor: convert left-over type hint to type comment
Use type comments to support older python3 interpreters that can't
handle type hints (such as 3.4).
Python 3.4 reports the following error:
File "./ust/ust-constructor/test_ust_constructor.py", line 176
client: lttngtest.Controller = lttngtest.LTTngClient(test_env, log=tap.diagnostic)
^
SyntaxError: invalid syntax
ERROR: ust/ust-constructor/test_ust_constructor.py - missing test plan
ERROR: ust/ust-constructor/test_ust_constructor.py - exited with status 1
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic0ead4b9bedfca56e8999bf4012e103801794655
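The conversion performed by this commit can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not the lttngtest ones. Variable annotations were introduced in Python 3.6 (PEP 526), which is why an older interpreter reports a syntax error; a PEP 484 type comment carries the same information in a plain comment.

```python
class Controller:
    """Hypothetical stand-in for the controller interface."""


class LTTngClient(Controller):
    """Hypothetical stand-in for a concrete controller."""


# Variable annotation, rejected by interpreters older than Python 3.6:
#     client: Controller = LTTngClient()
# Type comment, accepted by every Python 3 interpreter:
client = LTTngClient()  # type: Controller


def session_names(count):
    # type: (int) -> list
    # Function signatures can be described with a type comment as well.
    return ["session-{}".format(index) for index in range(count)]
```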
Jérémie Galarneau [Mon, 1 May 2023 15:57:48 +0000 (11:57 -0400)]
Tests: ust_constructor: convert type hints to type comments
Use type comments to support older python3 interpreters that can't
handle type hints (such as 3.4).
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id8b040fb36a4f243a3af0d731843028ad2f4388d
Jérémie Galarneau [Fri, 28 Apr 2023 14:46:08 +0000 (10:46 -0400)]
Clean-up: lttng: utils: coding style fix
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia23ad0168729819bcb0735ff8e55ce02fc9bcd20
Mathieu Desnoyers [Fri, 17 Feb 2023 20:32:25 +0000 (15:32 -0500)]
Tests: Introduce test_ust_constructor
Test instrumentation coverage of C/C++ constructors and destructors by
LTTng-UST tracepoints.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia9e5a5a57bfa7fd4316f8a914ef97effd020262e
Mathieu Desnoyers [Fri, 17 Feb 2023 14:57:03 +0000 (09:57 -0500)]
Tests: Introduce gen-ust-events-constructor test application
This test application tests C/C++ constructor/destructor instrumentation coverage.
* How to use:
lttng create
lttng enable-event -u 'tp*'
lttng start
./gen-ust-events-constructor
lttng stop
lttng view
* Before UST fixes:
[11:57:09.949917277] (+?.?????????) compudjdev tp_so:constructor_c_provider_shared_library: { cpu_id = 6 }, { }
[11:57:09.949962573] (+0.000045296) compudjdev tp_so:constructor_cplusplus_provider_shared_library: { cpu_id = 6 }, { msg = "global - shared library define and provider" }
[11:57:09.952145202] (+0.002182629) compudjdev tp:constructor_cplusplus: { cpu_id = 6 }, { msg = "global - same unit after provider" }
[11:57:09.952146517] (+0.000001315) compudjdev tp:constructor_c_across_units_after_provider: { cpu_id = 6 }, { }
[11:57:09.952146887] (+0.000000370) compudjdev tp:constructor_cplusplus: { cpu_id = 6 }, { msg = "global - across units after provider" }
[11:57:09.952634622] (+0.000487735) compudjdev tp:constructor_cplusplus: { cpu_id = 6 }, { msg = "main() local" }
[11:57:09.952635522] (+0.000000900) compudjdev tp_so:constructor_cplusplus_provider_shared_library: { cpu_id = 6 }, { msg = "main() local - shared library define and provider" }
[11:57:09.952636176] (+0.000000654) compudjdev tp_a:constructor_cplusplus_provider_static_archive: { cpu_id = 6 }, { msg = "main() local - static archive define and provider" }
[11:57:09.952636906] (+0.000000730) compudjdev tp:main: { cpu_id = 6 }, { }
[11:57:09.952637469] (+0.000000563) compudjdev tp_a:destructor_cplusplus_provider_static_archive: { cpu_id = 6 }, { msg = "main() local - static archive define and provider" }
[11:57:09.952638106] (+0.000000637) compudjdev tp_so:destructor_cplusplus_provider_shared_library: { cpu_id = 6 }, { msg = "main() local - shared library define and provider" }
[11:57:09.952638516] (+0.000000410) compudjdev tp:destructor_cplusplus: { cpu_id = 6 }, { msg = "main() local" }
[11:57:09.952681576] (+0.000043060) compudjdev tp:destructor_cplusplus: { cpu_id = 6 }, { msg = "global - across units after provider" }
[11:57:09.952682066] (+0.000000490) compudjdev tp:destructor_cplusplus: { cpu_id = 6 }, { msg = "global - same unit after provider" }
[11:57:09.952729603] (+0.000047537) compudjdev tp_so:destructor_cplusplus_provider_shared_library: { cpu_id = 6 }, { msg = "global - shared library define and provider" }
* After UST fixes:
[11:49:37.921028048] (+?.?????????) compudjdev tp_so:constructor_c_provider_shared_library: { cpu_id = 22 }, { }
[11:49:37.921033701] (+0.000005653) compudjdev tp_a:constructor_c_provider_static_archive: { cpu_id = 22 }, { }
[11:49:37.921036278] (+0.000002577) compudjdev tp_so:constructor_cplusplus_provider_shared_library: { cpu_id = 22 }, { msg = "global - shared library define and provider" }
[11:49:37.921037961] (+0.000001683) compudjdev tp_a:constructor_cplusplus_provider_static_archive: { cpu_id = 22 }, { msg = "global - static archive define and provider" }
[11:49:37.921039431] (+0.000001470) compudjdev tp:constructor_c_across_units_before_define: { cpu_id = 22 }, { }
[11:49:37.921040288] (+0.000000857) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - across units before define" }
[11:49:37.921041208] (+0.000000920) compudjdev tp:constructor_c_same_unit_before_define: { cpu_id = 22 }, { }
[11:49:37.921042021] (+0.000000813) compudjdev tp:constructor_c_same_unit_after_define: { cpu_id = 22 }, { }
[11:49:37.921042568] (+0.000000547) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit before define" }
[11:49:37.921043161] (+0.000000593) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit after define" }
[11:49:37.921044058] (+0.000000897) compudjdev tp:constructor_c_across_units_after_define: { cpu_id = 22 }, { }
[11:49:37.921044585] (+0.000000527) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - across units after define" }
[11:49:37.921045585] (+0.000001000) compudjdev tp:constructor_c_same_unit_before_provider: { cpu_id = 22 }, { }
[11:49:37.921046385] (+0.000000800) compudjdev tp:constructor_c_same_unit_after_provider: { cpu_id = 22 }, { }
[11:49:37.921046938] (+0.000000553) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit before provider" }
[11:49:37.921047548] (+0.000000610) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit after provider" }
[11:49:37.921048428] (+0.000000880) compudjdev tp:constructor_c_across_units_after_provider: { cpu_id = 22 }, { }
[11:49:37.921048918] (+0.000000490) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "global - across units after provider" }
[11:49:37.921050001] (+0.000001083) compudjdev tp:constructor_cplusplus: { cpu_id = 22 }, { msg = "main() local" }
[11:49:37.921050628] (+0.000000627) compudjdev tp_so:constructor_cplusplus_provider_shared_library: { cpu_id = 22 }, { msg = "main() local - shared library define and provider" }
[11:49:37.921051368] (+0.000000740) compudjdev tp_a:constructor_cplusplus_provider_static_archive: { cpu_id = 22 }, { msg = "main() local - static archive define and provider" }
[11:49:37.921052098] (+0.000000730) compudjdev tp:main: { cpu_id = 22 }, { }
[11:49:37.921052758] (+0.000000660) compudjdev tp_a:destructor_cplusplus_provider_static_archive: { cpu_id = 22 }, { msg = "main() local - static archive define and provider" }
[11:49:37.921053758] (+0.000001000) compudjdev tp_so:destructor_cplusplus_provider_shared_library: { cpu_id = 22 }, { msg = "main() local - shared library define and provider" }
[11:49:37.921054595] (+0.000000837) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "main() local" }
[11:49:37.921055698] (+0.000001103) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - across units after provider" }
[11:49:37.921056455] (+0.000000757) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit after provider" }
[11:49:37.921057011] (+0.000000556) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit before provider" }
[11:49:37.921057558] (+0.000000547) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - across units after define" }
[11:49:37.921058188] (+0.000000630) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit after define" }
[11:49:37.921058658] (+0.000000470) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - same unit before define" }
[11:49:37.921059168] (+0.000000510) compudjdev tp:destructor_cplusplus: { cpu_id = 22 }, { msg = "global - across units before define" }
[11:49:37.921059768] (+0.000000600) compudjdev tp_a:destructor_cplusplus_provider_static_archive: { cpu_id = 22 }, { msg = "global - static archive define and provider" }
[11:49:37.921060445] (+0.000000677) compudjdev tp_so:destructor_cplusplus_provider_shared_library: { cpu_id = 22 }, { msg = "global - shared library define and provider" }
[11:49:37.921067265] (+0.000006820) compudjdev tp:destructor_c_across_units_after_provider: { cpu_id = 22 }, { }
[11:49:37.921067901] (+0.000000636) compudjdev tp:destructor_c_same_unit_after_provider: { cpu_id = 22 }, { }
[11:49:37.921068515] (+0.000000614) compudjdev tp:destructor_c_same_unit_before_provider: { cpu_id = 22 }, { }
[11:49:37.921069128] (+0.000000613) compudjdev tp:destructor_c_across_units_after_define: { cpu_id = 22 }, { }
[11:49:37.921069831] (+0.000000703) compudjdev tp:destructor_c_same_unit_after_define: { cpu_id = 22 }, { }
[11:49:37.921070445] (+0.000000614) compudjdev tp:destructor_c_same_unit_before_define: { cpu_id = 22 }, { }
[11:49:37.921071075] (+0.000000630) compudjdev tp:destructor_c_across_units_before_define: { cpu_id = 22 }, { }
[11:49:37.921071721] (+0.000000646) compudjdev tp_a:destructor_c_provider_static_archive: { cpu_id = 22 }, { }
[11:49:37.921072605] (+0.000000884) compudjdev tp_so:destructor_c_provider_shared_library: { cpu_id = 22 }, { }
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4572c2548acf5e295f70e88137ab12b3b86d17c9
Michael Jeanson [Thu, 27 Apr 2023 17:08:16 +0000 (13:08 -0400)]
Fix: io-hint: add missing include for off_t
Change-Id: I73ca36d43c7a9b9b5f67e77e835426964b315f65
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 26 Apr 2023 22:13:02 +0000 (18:13 -0400)]
Build fix: g++ 4.8 incorrectly disambiguates enum and member
g++ 4.8 fails to build with the following error:
commands/start.cpp: In function ‘cmd_error_code {anonymous}::start_tracing(const session_spec&)’:
commands/start.cpp:123:76: error: ‘session_spec::type’ is not a class, namespace, or enumeration
if (!listing_failed && sessions.size() == 0 && spec.type == session_spec::type::NAME) {
^
commands/start.cpp:144:36: error: ‘session_spec::type’ is not a class, namespace, or enumeration
if (spec.type != session_spec::type::NAME) {
^
The `type` member is renamed to `type_` to work around this compiler bug.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id46ca4219d4d96db71a9d4523c3571303b2e97a7
Jérémie Galarneau [Fri, 21 Apr 2023 19:48:56 +0000 (15:48 -0400)]
Test: client: start, stop, destroy: add tests for --glob/--all
Test the CLI client's start, stop, and destroy commands along with their
--all and --glob options.
The tests validate that only the targeted sessions are affected by the
various commands and that the commands don't error-out when multiple
sessions are targeted.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie39e999608a063cc4573d790120fbe0896917d6f
Jérémie Galarneau [Fri, 21 Apr 2023 19:11:44 +0000 (15:11 -0400)]
lttng: rename iterator_template to _iterator
iterator_template is not meant to be used directly and is private: it
should be prefixed with an underscore.
Moreover, "template" in the name doesn't really add anything; remove it.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id891afeb596788d8d40cdf45e3f50dd8bf0427cc
Jérémie Galarneau [Fri, 21 Apr 2023 18:48:53 +0000 (14:48 -0400)]
Move create_unique_class util to the memory namespace
create_unique_class is helpful to write unique_ptr wrappers and is now
accessed in numerous places even though it lives in the `details`
namespace.
Moving it to the `memory` namespace to live with other memory management
facilities.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id4bb5100c1eb3a7e1e2d65f5b2d40ff8f97c786e
Jérémie Galarneau [Thu, 20 Apr 2023 21:01:14 +0000 (17:01 -0400)]
lttng: destroy: ensure a cmd_error_code is returned by the command
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7adef3df9b0a76de893341131cd5d0da2c9ab5df
Jérémie Galarneau [Thu, 20 Apr 2023 18:40:32 +0000 (14:40 -0400)]
Clean-up: lttng-destroy: move static symbols to anonymous namespace
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iebea98532e1c6d0c7309c0ea40e6585a25b8b3df
Jérémie Galarneau [Wed, 26 Apr 2023 02:24:17 +0000 (22:24 -0400)]
Test: mi: inverted logic in stop validation test
Stopping a session twice should succeed (no-op). Moreover, the test
mentions it tests a 'start' while it tests a 'stop' operation.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4c8f19f0ecc181921fb5e3dd4f882e3c2285adcd
Jérémie Galarneau [Thu, 20 Apr 2023 16:34:35 +0000 (12:34 -0400)]
lttng: stop: ensure a cmd_error_code is returned by the command
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iee537cbda297cee72a1d5ad49738ac347f42d3fd
Jérémie Galarneau [Thu, 20 Apr 2023 15:54:57 +0000 (11:54 -0400)]
Clean-up: lttng-stop: move static symbols to anonymous namespace
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4cb91315993ffa4f0f53698f7912540fe2aa4075
Jérémie Galarneau [Thu, 20 Apr 2023 15:47:22 +0000 (11:47 -0400)]
Clean-up: rotation-thread: disable move and copy
Disable unused move and copy constructors and assignment operators, as
reported by clang-tidy.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I610e60dd082bc7c552a6f304ea44e9970487e558
Jérémie Galarneau [Wed, 19 Apr 2023 19:15:50 +0000 (15:15 -0400)]
lttng: start: ensure a cmd_error_code is returned by the command
The start_tracing functions mix cmd_error_code and lttng_error_code
values in "raw" integers which is unexpected by the top-level client.
For instance, the client returns '80' when attempting to start a session
that is already active.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I12c15e8aa3eb960e1e47d5166307995af8c46989
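The problem described in the commit above (a raw library error code such as 80 leaking out as the client's return value) can be illustrated with a small sketch. This is Python rather than the project's C++, and everything in it is an assumption made for illustration: the CmdErrorCode members and their values, the ALREADY_ACTIVE_LIB_CODE name, and the choice to report an already-active session as a warning. Only the value 80 comes from the commit message.

```python
import enum


class CmdErrorCode(enum.IntEnum):
    """Hypothetical stand-in for the client's command-level error codes."""
    SUCCESS = 0
    ERROR = 1
    WARNING = 4


# Library-level code seen when starting an already-active session.
# The value 80 is quoted in the commit message; the name is hypothetical.
ALREADY_ACTIVE_LIB_CODE = 80


def to_cmd_error_code(lib_ret: int) -> CmdErrorCode:
    """Translate a raw library return value into a command-level code.

    The mapping is an assumption: the commit message does not say which
    command-level code is used for each library error.
    """
    if lib_ret == 0:
        return CmdErrorCode.SUCCESS
    if lib_ret == ALREADY_ACTIVE_LIB_CODE:
        return CmdErrorCode.WARNING
    return CmdErrorCode.ERROR


if __name__ == "__main__":
    # The caller only ever sees a CmdErrorCode, never the raw 80.
    print(to_cmd_error_code(80).name)
```

Returning an enumeration instead of a plain integer makes it harder for a value from one error domain to be returned where the other is expected.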
Jérémie Galarneau [Wed, 19 Apr 2023 19:15:26 +0000 (15:15 -0400)]
lttng: start: move static symbols to anonymous namespace
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I295b39aa2f286bc494a0d2eb0b5c8528e0b6afc3
Olivier Dion [Mon, 6 Feb 2023 22:19:36 +0000 (17:19 -0500)]
lttng: Add --glob option to lttng-destroy
Change-Id: I930adc74e1ab2d285f99e0aef01ba1108df089a4
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Olivier Dion [Mon, 6 Feb 2023 21:11:56 +0000 (16:11 -0500)]
lttng: Add --all, --glob options to lttng-stop
Change-Id: Ida7c8f68f9b52f7ab4414bf060383864bd2e3046
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Olivier Dion [Mon, 6 Feb 2023 21:06:10 +0000 (16:06 -0500)]
lttng: Add --all, --glob options to lttng-start
Change-Id: I8ff806c7ea7a51b8fa69206bd427b5e106706073
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Olivier Dion [Mon, 6 Feb 2023 20:55:39 +0000 (15:55 -0500)]
lttng: Add list_sessions utility
Get a list of sessions given a session specification.
The `NAME' specification returns the session that has the given
name. The `GLOB_PATTERN' specification returns all sessions that match a
pattern. The `ALL' specification returns all sessions.
Change-Id: If69698d6dd59d54ba16669571ef49ed5bb9b8997
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
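The three specification kinds described in the list_sessions commit above can be sketched as follows. This is an illustrative Python sketch, not the C++ utility itself: SpecType, SessionSpec, and the list_sessions signature shown here are hypothetical, sessions are reduced to plain names, and glob matching is approximated with the standard fnmatch module.

```python
import enum
import fnmatch
from dataclasses import dataclass
from typing import List, Optional


class SpecType(enum.Enum):
    NAME = "name"
    GLOB_PATTERN = "glob_pattern"
    ALL = "all"


@dataclass(frozen=True)
class SessionSpec:
    """Hypothetical session specification: a kind plus an optional value."""
    type: SpecType
    value: Optional[str] = None


def list_sessions(spec: SessionSpec, session_names: List[str]) -> List[str]:
    """Return the session names selected by the specification."""
    if spec.type is SpecType.ALL:
        return list(session_names)
    if spec.type is SpecType.NAME:
        # At most one session is expected to carry a given name.
        return [name for name in session_names if name == spec.value]
    if spec.type is SpecType.GLOB_PATTERN:
        return [
            name for name in session_names
            if fnmatch.fnmatchcase(name, spec.value)
        ]
    raise ValueError(f"unknown specification type: {spec.type}")


if __name__ == "__main__":
    names = ["web-1", "web-2", "db"]
    print(list_sessions(SessionSpec(SpecType.GLOB_PATTERN, "web-*"), names))
```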
Olivier Dion [Tue, 28 Feb 2023 22:02:38 +0000 (17:02 -0500)]
Fix: mi: Pass const session to mi_lttng_session
There's no reason for session not to be constant.
Change-Id: I55d63e142853f78ba643600b5c1869866ee1bb5e
Signed-off-by: Olivier Dion <odion@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 18 Apr 2023 21:57:46 +0000 (17:57 -0400)]
Clean-up: clang-tidy autofixes to eventfd and file-descriptor
eventfd must be marked explicit to prevent erroneous implicit
conversions.
A destructor is provided since other special member functions are
defined (see cppcoreguidelines-special-member-functions).
file_descriptor's constructor is replaced by a trivial constructor.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1623a2f8db0d613201a534293f5e1476b779283c
Jérémie Galarneau [Thu, 20 Apr 2023 15:07:44 +0000 (11:07 -0400)]
Tests: utils: lttng_pgrep spams output when racing with a process
lttng_pgrep is often used to check if a process is alive. As such, it is
often used on PIDs which are tearing down.
The file redirection used in `tr '\0' '\n' < /proc/"$pid"/cmdline` often
fails (which is correct) because the /proc/$pid folder no longer exists.
When this occurs, the test output is cluttered with annoying errors like:
./tests/regression/kernel//../../utils/utils.sh: line 151: /proc/845/cmdline: No such file or directory
This part of the command now runs under a subshell to hide the error
when it occurs.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6a26cb63cd56c46557a73e2e475b0cac729cc67f
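The race described in the lttng_pgrep commit above is inherent to reading /proc entries of processes that may be exiting. The actual fix is made in the shell helper; the sketch below is only a Python analogue of the same idea, with a hypothetical read_cmdline helper that treats a vanished process as an expected outcome rather than an error worth reporting.

```python
from typing import Optional


def read_cmdline(pid: int, proc_root: str = "/proc") -> Optional[str]:
    """Return the command line of `pid`, one argument per line.

    Returns None when the process disappeared before or while its
    cmdline file was read, which is expected for processes that are
    tearing down.
    """
    path = f"{proc_root}/{pid}/cmdline"
    try:
        with open(path, "rb") as cmdline_file:
            raw = cmdline_file.read()
    except (FileNotFoundError, ProcessLookupError):
        # The process is gone: stay quiet and report "not found".
        return None
    # Arguments are NUL-separated; same transformation as tr '\0' '\n'.
    return raw.replace(b"\0", b"\n").decode(errors="replace")


if __name__ == "__main__":
    import os
    print(read_cmdline(os.getpid()))
```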
Jérémie Galarneau [Wed, 19 Apr 2023 21:48:25 +0000 (17:48 -0400)]
Tests: enable test_callstack on 32-bit x86
The test application used by test_callstack defines syscall
stubs (instead of using the libc) for both 32 and 64 bit variants of
x86. This limits the architectures that can run this test.
However, the `uname` check is erroneous as it assumes that `uname -m`
will return x86 for both architectures. It doesn't; it returns i386 or
i686 (and I presume others) for 32-bit kernels.
As such, the check is modified to check for '86' which, as far as I
know, doesn't clash with other architectures.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I267a81fadb2f1db6b135c6886ae5700184f2ae5b
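The substring check described in the test_callstack commit above can be sketched as follows. The real check lives in a shell test script; this Python version, with its hypothetical is_x86_machine helper, only mirrors the logic of matching '86' in the machine name reported by `uname -m`.

```python
import platform


def is_x86_machine(machine: str) -> bool:
    """Return True for 32-bit and 64-bit x86 machine names.

    Matching on '86' covers x86_64 as well as i386 and i686, which a
    check for 'x86' alone would miss.
    """
    return "86" in machine


if __name__ == "__main__":
    # platform.machine() reports the same string as `uname -m` on Linux.
    current = platform.machine()
    print(current, is_x86_machine(current))
```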
Michael Jeanson [Wed, 19 Apr 2023 18:28:43 +0000 (14:28 -0400)]
Tests: fix: kernel/test_callstack: number of tests on i386
The number of planned tests is incorrect when not running the userspace
callstack tests, which are limited to x86_64.
Fixes the following error:
ERROR: kernel/test_callstack 12 - Wait after kill session daemon # UNPLANNED
# Looks like you planned 11 tests but ran 1 extra.
ERROR: kernel/test_callstack - too many tests run (expected 11, got 12)
ERROR: kernel/test_callstack - exited with status 1
Change-Id: Icf662968ac977b08594eca99e18476e43cc4ea79
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 18 Apr 2023 21:58:01 +0000 (17:58 -0400)]
Add getrandom compat for MacOS, FreeBSD and Cygwin
Use the BSD arc4random_buf() function which should be non-blocking on
all of these platforms.
Change-Id: Ib43373cad82373dc83995fdb3d01c2a2d43ab683
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 18 Apr 2023 18:58:04 +0000 (14:58 -0400)]
Tests: metadata-regeneration: restore date at the end of the test
The metadata-regeneration test sets the date in the past to validate
that the clock offset is re-computed when regenerating a trace's
metadata.
In doing so, it leaves the time in the past which causes 'make' to panic
when it checks for changes while continuing to evaluate the 'check'
target.
make[4]: Entering directory '/root/workspace/lttng-tools_master_rootbuild_i386/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/i386-rootnode/platform/deb11-i386/test_type/base/src/lttng-tools/tests/perf'
make[4]: Warning: File '.deps/find_event.Po' has modification time 1363500490 s in the future
============================================================================
Testsuite summary for lttng-tools 2.14.0-pre
============================================================================
# TOTAL: 0
# PASS: 0
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
make[4]: warning: Clock skew detected. Your build may be incomplete.
make[4]: Leaving directory '/root/workspace/lttng-tools_master_rootbuild_i386/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/i386-rootnode/platform/deb11-i386/test_type/base/src/lttng-tools/tests/perf'
make[3]: warning: Clock skew detected. Your build may be incomplete.
make[3]: Leaving directory '/root/workspace/lttng-tools_master_rootbuild_i386/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/i386-rootnode/platform/deb11-i386/test_type/base/src/lttng-tools/tests/perf'
make[2]: warning: Clock skew detected. Your build may be incomplete.
make[2]: Leaving directory '/root/workspace/lttng-tools_master_rootbuild_i386/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/i386-rootnode/platform/deb11-i386/test_type/base/src/lttng-tools/tests/perf'
make[1]: *** [Makefile:557: check-recursive] Error 1
The date is sampled at the beginning of the test and restored when it
ends. This leaves the system with a wrong time (offset by the duration
of the test itself). Still, it's better than being 40 years in the past.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4a128b57d4d9fd61eec097948b688060815f4a4c
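The residual offset mentioned in the metadata-regeneration commit above follows from simple arithmetic: the date sampled at the start is restored at the end, so the clock ends up late by the time the test took to run. The sketch below simulates this with a hypothetical FakeClock class; it does not touch the system clock and is not the test script's code.

```python
class FakeClock:
    """Simulated wall clock that can be set, like `date -s` would do."""

    def __init__(self, true_time: float) -> None:
        self.true_time = true_time  # what an accurate clock would show
        self.offset = 0.0           # error introduced by setting the date

    def now(self) -> float:
        return self.true_time + self.offset

    def set(self, value: float) -> None:
        self.offset = value - self.true_time

    def advance(self, seconds: float) -> None:
        self.true_time += seconds


def run_with_restored_date(clock: FakeClock, past_date: float,
                           test_duration: float) -> float:
    """Set the date in the past, 'run' the test, then restore the date.

    Returns how far behind the accurate time the clock is afterwards.
    """
    saved = clock.now()          # sampled at the beginning of the test
    try:
        clock.set(past_date)
        clock.advance(test_duration)
    finally:
        clock.set(saved)         # restored when the test ends
    return clock.true_time - clock.now()


if __name__ == "__main__":
    clock = FakeClock(1_000_000.0)
    # The remaining error equals the duration of the test.
    print(run_with_restored_date(clock, past_date=0.0, test_duration=30.0))
```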
Michael Jeanson [Tue, 18 Apr 2023 16:11:33 +0000 (12:11 -0400)]
Build fix: add a noop compat wrapper for posix_fadvise
This allows building on platforms without posix_fadvise.
Change-Id: I13fd6404cd02eb8038a97693db27d9619246401d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 18 Apr 2023 15:44:42 +0000 (11:44 -0400)]
Clean-up: error.hpp/error.cpp coding style fix
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I570dabdc2ea83bf9aded48e64fc9e1f83a061044
Michael Jeanson [Tue, 18 Apr 2023 15:29:21 +0000 (11:29 -0400)]
Build fix: missing error_get_str on non-glibc builds
The function declaration was erroneously moved inside this ifdef. This
breaks non-glibc builds.
Introduced in:
commit 003f455dab0204dd3f066ecdbea0470035f8181f
Author: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Date: Thu Apr 13 14:31:33 2023 -0400
Fix: logging: unhandled error in *_FMT macros
Coverity reports:
1508779 Uncaught exception
If the exception is ever thrown, the program will crash.
In lttng::sessiond::rotation_thread::_thread_function(): A C++ exception
is thrown but never caught (CWE-248)
The *_FMT macros, which use fmtlib, don't handle the case where
fmt::format throws. This can happen, in particular, when an invalid
format string is used.
The macros are modified to log the exception and abort.
Change-Id: I5eb3b2a673e224f3c99cae7faece31175084db9d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 18 Apr 2023 15:42:04 +0000 (11:42 -0400)]
Clean-up: remove unnecessary inclusion
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7f8b875fe2cdcf7909a953bb6061229c4a5bca71
Jérémie Galarneau [Tue, 18 Apr 2023 15:28:22 +0000 (11:28 -0400)]
Build fix: missing header on macOS
The build fails on macOS (on both ARM64 and AMD64 platforms):
exception.cpp:32:41: error: use of undeclared identifier 'error_get_str'
runtime_error(msg + ": " + std::string(error_get_str(error_code)),
^
error_get_str is defined in lttng/lttng-error.h
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I69fa0c0bc442d0cb28779f6b4c55d0b6a7196791
Michael Jeanson [Tue, 18 Apr 2023 15:06:56 +0000 (11:06 -0400)]
Build fix: failure on macOS caused by missing space in Makefile
Change-Id: Ia30100d293ad52e422998512d44173af16858257
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Jérémie Galarneau [Wed, 12 Apr 2023 18:53:37 +0000 (14:53 -0400)]
Build fix: common: eventfd only exists on Linux
The project fails to build on non-Linux platforms since eventfd is a
Linux-exclusive facility.
From the project's point of view, the eventfd util belongs in libcommon.
However, we will need to write a wrapper if it ends up being used on
non-Linux platforms.
For the moment, exclude it from the internal libcommon sources when
building on non-Linux platforms as it is only used on the session
daemon.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If528ac522d72a95e87a4f496531ea679a81030a2
Jérémie Galarneau [Mon, 17 Apr 2023 18:46:56 +0000 (14:46 -0400)]
Replace uses of the URCU tls helpers by the standard thread_local
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6751ad8c3a121638b07d333ac6f28547b25f54f2
Jérémie Galarneau [Thu, 13 Apr 2023 18:41:43 +0000 (14:41 -0400)]
Clean-up: use const reference where possible
These exceptions can all be caught by const reference.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I11a4f5018c838d77212340efa6f94d5d1c005ed6
Jérémie Galarneau [Thu, 13 Apr 2023 18:34:31 +0000 (14:34 -0400)]
Coverity warning: sessiond: uncaught exception in main
Coverity reports:
1508778 Uncaught exception
If the exception is ever thrown, the program will crash.
In main: A C++ exception is thrown but never caught (CWE-248)
In particular, Coverity reports that pthread_mutex_lock can error-out,
which would cause an lttng::posix_error exception to be thrown.
This isn't particularly likely to happen, but it is nonetheless
preferable to catch exceptions at the top-level and log them before
exiting.
Change-Id: I4f7a52db3c9cd19764e7d12fd62afa60af99dabd
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 13 Apr 2023 18:31:33 +0000 (14:31 -0400)]
Fix: logging: unhandled error in *_FMT macros
Coverity reports:
1508779 Uncaught exception
If the exception is ever thrown, the program will crash.
In lttng::sessiond::rotation_thread::_thread_function(): A C++ exception
is thrown but never caught (CWE-248)
The *_FMT macros, which use fmtlib, don't handle the case where
fmt::format throws. This can happen, in particular, when an invalid
format string is used.
The macros are modified to log the exception and abort.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7f4a066b418a9d544a679f773df7e94640755f47
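The approach described in the *_FMT commit above (catch the formatting failure, log it, then abort) can be sketched in Python as an analogue. This is not the project's macro and does not use fmtlib: log_fmt, its sink parameter, and its on_failure parameter are hypothetical names, and Python's str.format stands in for fmt::format since it also raises on an invalid format string.

```python
import os
from typing import Any, Callable


def log_fmt(fmt: str, *args: Any,
            sink: Callable[[str], None] = print,
            on_failure: Callable[[], None] = os.abort) -> None:
    """Format and log a message; on a formatting error, log it and abort."""
    try:
        message = fmt.format(*args)
    except (ValueError, IndexError, KeyError) as exc:
        # An invalid format string or missing argument ends up here
        # instead of propagating out of the logging call.
        sink(f"Failed to format log message: {exc}")
        on_failure()
        return
    sink(message)


if __name__ == "__main__":
    log_fmt("rotation of session {} completed", "my-session")
```

Making the failure handler a parameter is a choice of this sketch so that the behaviour can be exercised without terminating the process.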
Jérémie Galarneau [Mon, 17 Apr 2023 18:25:47 +0000 (14:25 -0400)]
Clean-up: common: error_log_time doesn't need to be global
error_log_time is only used in error.cpp. Reduce its visibility to the
file.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifbd821af12d7378b203cd7a2c1d48d28ed00cb40
Jérémie Galarneau [Fri, 11 Nov 2022 22:35:12 +0000 (17:35 -0500)]
Tests: select_poll_epoll: Add support for _time64
Add support for the 64-bit time_t syscalls SYS_ppoll_time64
and SYS_pselect6_time64.
These syscalls exist on 32-bit platforms since the 5.1 kernel. 32-bit
platforms with a 64-bit time_t (such as 32-bit RISC-V) only have these
and don't have the original syscalls.
In doing so, the original syscalls were renamed to add the `_time32`
suffix which causes the validation steps to fail.
This patch ensures that the 64-bit version of the pselect and ppoll
syscalls are called whenever they are available, allowing the tests to
succeed.
It also ensures that we don't attempt to use the 32-bit versions that
don't exist on newer 32-bit platforms like 32-bit RISC-V.
The test is also cleaned-up:
- The *invalid_pointer, *invalid_fd, *ulong_max, and *buffer_overflow
tests are identical except for the syscall(...) invocation. They are
combined to share as much code as possible regardless of which
syscalls the platform's ABI supports.
- Harmonized test names between the test script and test application
- Test names are printed when using the list option and used to launch
a test (rather than a numeric id)
- Allow the test application to print the list of supported tested
syscalls
Fixes: https://github.com/lttng/lttng-tools/pull/162
Change-Id: I974f780022441fedfa45414d672092606e657cf6
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
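The selection rule described in the select_poll_epoll commit above (use the 64-bit time_t variant whenever it is available, and never assume the original syscall exists) can be sketched as follows. The real test application is written in C and decides this from the SYS_* definitions available at build time; the Python helper pick_syscall and its set-of-names argument are hypothetical.

```python
from typing import Optional, Set


def pick_syscall(base: str, available: Set[str]) -> Optional[str]:
    """Choose which variant of a syscall to invoke.

    Prefers '<base>_time64' when present, falls back to the original
    name, and returns None when the platform provides neither.
    """
    time64_variant = f"{base}_time64"
    if time64_variant in available:
        return time64_variant
    if base in available:
        return base
    return None


if __name__ == "__main__":
    # A platform that only provides the 64-bit time_t variants.
    only_time64 = {"ppoll_time64", "pselect6_time64"}
    print(pick_syscall("ppoll", only_time64))
    print(pick_syscall("pselect6", only_time64))
```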
Michael Jeanson [Tue, 17 Jan 2023 22:49:42 +0000 (17:49 -0500)]
Remove fcntl wrapper
Replace the questionable sync_file_range wrapper by more descriptive
util functions. Remove the unimplemented splice wrapper; I'd rather have
a build failure / adjust the build system on some platforms than have
hard-to-diagnose runtime issues.
Change-Id: I4114d0d9765ae3d95a1488c945e5d66a20c2029d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Wed, 12 Apr 2023 18:14:21 +0000 (14:14 -0400)]
Build fix: brace-enclosed initializer lists error with g++ 4.8
A build error occurs when building using g++ 4.8:
rotation-thread.cpp: In constructor 'lttng::sessiond::rotation_thread::rotation_thread(lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&)':
rotation-thread.cpp:400:58: error: invalid initialization of non-const reference of type 'lttng::sessiond::rotation_thread_timer_queue&' from an rvalue of type '<brace-enclosed initializer list>'
_notification_thread_handle{ notification_thread_handle }
Use old-style initialization of references instead.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia3392a88b8a2d8dd8c60330c16229f507338e7cd
Michael Jeanson [Tue, 28 Mar 2023 16:06:00 +0000 (12:06 -0400)]
Fix: warning: "HAVE_GETIPNODEBYNAME" is not defined
Change-Id: I3214a63daea3c2d44f8712127cd0d776d429f130
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Christophe Bedard [Mon, 20 Feb 2023 21:20:51 +0000 (13:20 -0800)]
Extras: python bindings: update context types
This adds context types from LTTNG_EVENT_CONTEXT_CALLSTACK_KERNEL (20)
to LTTNG_EVENT_CONTEXT_TIME_NS (41) so that the list matches
lttng_event_context_type in include/lttng/event.h.
Change-Id: Ied908aa51cf75e931794acef61271468efeff6a7
Signed-off-by: Christophe Bedard <christophe.bedard@apex.ai>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Christophe Bedard [Tue, 21 Feb 2023 20:31:11 +0000 (12:31 -0800)]
Docs: lttng-add-context(1): fix typo
Change-Id: Ie6d182e00d2aae07f8dd405e3b4b9ec8205ad1da
Signed-off-by: Christophe Bedard <christophe.bedard@apex.ai>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 14 Apr 2023 18:57:06 +0000 (14:57 -0400)]
Tests: fix: test_list_triggers_cli fails to list userspace-probe-sdt trigger
The listing of triggers which use an event rule match condition
consisting of a user space probe set on an SDT probe fails since
28f23191d.
The coding style imposes an order of includes. However, the order in
which the probe declarations generated by systemtap vs sdt.h matters.
From SYSTEMTAP(2):
Sometimes, semaphore variables are not necessary nor helpful. Skipping
them can simplify the build process, by omitting the extra "test.o"
file. To skip dependence upon semaphore variables, include "<sys/sdt.h>"
within the application before "test.h":
[...]
In this mode, the ENABLED() test is fixed at 1.
The reformatted version of userspace-probe-sdt-binary.c includes sdt.h
after the probe, causing the probes to use a guarding semaphore.
Unfortunately, we can't instrument such probes and the registration of
the trigger silently fails. The silent failure is addressed by a
follow-up commit.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5146f55ed5d9d109f1f7bc32e0f7c2c8bf839f8e