Mathieu Desnoyers [Thu, 2 Dec 2021 22:33:55 +0000 (17:33 -0500)]
Fix: relayd: `!vsession->current_trace_chunk` assertion failed
Observed issue
==============
When performing:
#!/bin/bash
lttng create py_syscalls --live
lttng enable-event -u -a
lttng enable-event -k -a
lttng start
babeltrace2 -i lttng-live net://localhost/host/raton/py_syscalls
The relay daemon hits this assertion:
Thread 8 (Thread 0x7fffeeffd700 (LWP 167040) "lttng-relayd"):
#0 0x00007ffff7b1618b in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff7af5859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff7af5729 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ffff7b06f36 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00005555555889bb in viewer_session_attach (vsession=0x7fffdc001400, session=session@entry=0x7fffe8001180) at viewer-session.c:80
#5 0x000055555557bcff in viewer_attach_session (conn=0x7fffd0001140) at live.c:1275
#6 process_control (conn=0x7fffd0001140, recv_hdr=0x7fffeeffcaf0) at live.c:2341
#7 thread_worker (data=<optimized out>) at live.c:2515
#8 0x00007ffff7ccd609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#9 0x00007ffff7bf2293 in clone () from /lib/x86_64-linux-gnu/libc.so.6
Cause
=====
This assert appears to be entirely wrong.
It checks that the "viewer session" has a NULL current trace chunk when
attaching a session to a viewer session, but in the case where a viewer
session has multiple sessions (e.g. with kernel and ust tracing
combined), we are attaching each session individually to the viewer
session, and we set the current trace chunk of the viewer session when
we attach the first session to it.
So it is expected to be non-NULL when attaching the second session.
Solution
========
Remove the assertion.
Known limitations
=================
None.
Fixes: #1335
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4d8f5d5347b4588144ddf449976cae5a94b81b3a
Simon Marchi [Wed, 10 Nov 2021 13:42:25 +0000 (08:42 -0500)]
Fix: tests: fix unused-but-set warning in test_fd_tracker.c
When building with clang-14 on Ubuntu 20.04, I get:
CC test_fd_tracker.o
/home/smarchi/src/lttng-tools/tests/unit/test_fd_tracker.c:169:15: error: variable 'fds_set_to_minus_1' set but not used [-Werror,-Wunused-but-set-variable]
unsigned int fds_set_to_minus_1 = 0;
^
The compiler seems right, so remove fds_set_to_minus_1. It might be
that the intention was to assert something using this variable, but I
couldn't figure it out.
Change-Id: I12bfd07bca7829de8d5b85d375d9b52bd84d677a
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 10 Nov 2021 13:39:22 +0000 (08:39 -0500)]
Fix: sessiond: fix possible buffer overflow warning
When compiling with clang-14 on Ubuntu 20.04, I get:
CC lttng-syscall.lo
/home/smarchi/src/lttng-tools/src/bin/lttng-sessiond/lttng-syscall.c:70:13: error: 'fscanf' may overflow; destination buffer in argument 4 has size 255, but the corresponding specifier may require size 256 [-Werror,-Wfortify-source]
&index, name, &bitness) == 3) {
^
I think the compiler is right, we read a string when length up to 255 in
a buffer of size 255. We need one more byte for the NULL terminator,
fix that.
Change-Id: I6b2eec401af3ef6230dd4b6c8559032de9b54584
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 23 Aug 2021 21:12:28 +0000 (17:12 -0400)]
Fix: tests: app unregistering is not guaranteed by app lifetime
Observed issue
==============
The per-pid timer based rotation tests fail on a minimal ptest
yocto image.
The test suite report that the second archive is not empty as it
expects.
Note that the yocto/OE image is running under QEMU without
KVM.
Cause
=====
Since the image is running under QEMU without KVM, the overall
processing capability of the VM is quite limited.
The test seems to assume that between the first and the second rotation
the app will be unregistered by the time the second rotation is issued.
Note that the observable lifetime of an app is not equal to the
lttng-sessiond/consumerd app visibility since we deal with app
unregistration via a polling mechanism.
Note, that as far as I understand, this is a testing issue only.
It is still relevant in the context of rotation to validate that the second
rotation archive does NOT contain info for a "dead" app under per-pid
configuration.
Solution
========
Move the rotation timer operation after the app is registered and
considered unregistered from the point of view of
lttng-sessiond/lttng-consumerd. This should give us a more robust
approach.
Known drawbacks
=========
None.
References
==========
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie8c542d29ef8bdb325efc05de14e80b179c68754
Jonathan Rajotte [Tue, 19 Oct 2021 19:22:39 +0000 (15:22 -0400)]
Fix: lttng-ctl: tracing_group memory leaks
Observed issue
==============
liblttng-ctl leaks memory if `lttng_set_tracing_group` is called at least
1 time by an API client.
joraj@~/lttng/master/lttng-tools-dev [master][]$ valgrind --leak-check=full lttng --group=joraj list
==24823== Memcheck, a memory error detector
==24823== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==24823== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==24823== Command: lttng --group=joraj list
==24823==
Error: No session daemon is available
==24823==
==24823== HEAP SUMMARY:
==24823== in use at exit: 8 bytes in 1 blocks
==24823== total heap usage: 55 allocs, 54 frees, 87,023 bytes allocated
==24823==
==24823== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24823== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==24823== by 0x4BA7DC7: __vasprintf_internal (vasprintf.c:71)
==24823== by 0x4C4B742: __asprintf_chk (asprintf_chk.c:34)
==24823== by 0x48687D9: asprintf (stdio2.h:181)
==24823== by 0x48687D9: lttng_set_tracing_group (lttng-ctl.c:2620)
==24823== by 0x4011B89: call_init.part.0 (dl-init.c:72)
==24823== by 0x4011C90: call_init (dl-init.c:30)
==24823== by 0x4011C90: _dl_init (dl-init.c:119)
==24823== by 0x4001139: ??? (in /usr/lib/x86_64-linux-gnu/ld-2.31.so)
==24823== by 0x2: ???
==24823== by 0x1FFEFFFCFE: ???
==24823== by 0x1FFEFFFD04: ???
==24823== by 0x1FFEFFFD12: ???
==24823==
==24823== LEAK SUMMARY:
==24823== definitely lost: 8 bytes in 1 blocks
==24823== indirectly lost: 0 bytes in 0 blocks
==24823== possibly lost: 0 bytes in 0 blocks
==24823== still reachable: 0 bytes in 0 blocks
==24823== suppressed: 0 bytes in 0 blocks
==24823==
==24823== For lists of detected and suppressed errors, rerun with: -s
==24823== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Cause
=====
The allocated pointer in the library constructor is not freed on
subsequent assignation.
Solution
========
Free the pointer.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1d4c45df2764a88c74d56de691783df9215633c
Francis Deslauriers [Tue, 2 Nov 2021 13:33:02 +0000 (09:33 -0400)]
Fix: use <unistd.h> instead of <sys/unistd.h>
Fixes: #1330
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I07cabde5a0295de06f7c6f42dd12de803b57c907
Francis Deslauriers [Mon, 1 Nov 2021 19:31:26 +0000 (15:31 -0400)]
Fix: Tests: unchecked `close()` return value
CID
1465101 (#1 of 1): Unchecked return value (CHECKED_RETURN)
9. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID
1465100 (#1 of 1): Unchecked return value (CHECKED_RETURN)
4. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times)
CID
1465099 (#1 of 1): Unchecked return value (CHECKED_RETURN) 4.
check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID
1465098 (#1 of 1): Unchecked return value (CHECKED_RETURN) 4.
check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID
1465097 (#1 of 1): Unchecked return value (CHECKED_RETURN) 4.
check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
Reported-by: Coverity Scan
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8e2552c75ab7cec5aa3707e2c1c4d9f2484b501a
Jérémie Galarneau [Mon, 1 Nov 2021 19:43:55 +0000 (15:43 -0400)]
Fix: relayd: live: mishandled initial null trace chunk
Observed issue
==============
As reported in #1323 (https://bugs.lttng.org/issues/1323), crashes of
the relay daemon are observed when running the user space clear tests.
The crash occurs with the following stack trace:
#0 0x000055fbb861d6ae in urcu_ref_get_unless_zero (ref=0x28) at /usr/local/include/urcu/ref.h:85
#1 lttng_trace_chunk_get (chunk=0x0) at trace-chunk.c:1836
#2 0x000055fbb86051e2 in make_viewer_streams (relay_session=relay_session@entry=0x7f6ea002d540, viewer_session=<optimized out>, seek_t=seek_t@entry=LTTNG_VIEWER_SEEK_BEGINNING, nb_total=nb_total@entry=0x7f6ea9607b00, nb_unsent=nb_unsent@entry=0x7f6ea9607aec, nb_created=nb_created@entry=0x7f6ea9607ae8, closed=<optimized out>) at live.c:405
#3 0x000055fbb86061d9 in viewer_get_new_streams (conn=0x7f6e94000fc0) at live.c:1155
#4 process_control (conn=0x7f6e94000fc0, recv_hdr=0x7f6ea9607af0) at live.c:2353
#5 thread_worker (data=<optimized out>) at live.c:2515
#6 0x00007f6eae86a609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7 0x00007f6eae78f293 in clone () from /lib/x86_64-linux-gnu/libc.so.6
The race window during which this occurs seems very small as it can take
hours to reproduce this crash. However, a minimal reproducer could be
identified, as stated in the bug report.
Essentially, the same crash can be reproduced by attaching a live viewer
to a session that has seen events being produced, been stopped and been
cleared.
Cause
=====
The crash occurs as an attempt is made to take a reference to a viewer
session’s trace chunk as viewer streams are created. The crux of the
problem is that the code doesn’t expect a viewer session’s trace chunk
to be NULL.
The viewer session’s current trace chunk is initially set, when a viewer
attaches to the viewer session, to a copy the corresponding
relay_session’s current trace chunk.
A live session always attempts to "catch-up" to the newest available
trace chunk. This means that when a viewer reaches the end of a trace
chunk, the viewer session may not transition to the "next" one: it jumps
to the most recent trace chunk available (the one being produced by the
relay_session). Hence, if the producer performs multiple rotations
before a viewer completes the consumption of a trace chunk, it will skip
over those "intermediary" trace chunks.
A viewer session updates its current trace chunk when:
1) new viewer streams are created,
2) a new index is requested,
3) metadata is requested.
Hence, as a general principle, the viewer session will reference the
most recent trace chunk available _even if its streams do not point to
it_. It indicates which trace chunk viewer streams should transition to
when the end of their current trace chunk is reached.
The live code properly handles transitions to a null chunk. This can be
verified by attaching a viewer to a live session, stopping the session,
clearing it (thus entering a null trace chunk), and resuming tracing.
The only issue is that the case where the first trace chunk of a viewer
session is "null" (no active trace chunk) is mishandled in two places:
1) in make_viewer_streams(), where the crash is observed,
2) in viewer_get_metadata().
Solution
========
In make_viewer_streams(), it is assumed that a viewer session will have
a non-null trace chunk whenever a rotation is not ongoing. This is
reflected by the fact that a reference is always acquired on the viewer
session’s trace chunk.
That code is one of the three places that can cause a viewer session’s
trace chunk to be updated. We still want to update the viewer session to
the most recently seen trace chunk (null, in this case). However, there
is no reference to acquire and the trace chunk to use for the creation
of the viewer stream is NULL. This is properly handled by
viewer_stream_create().
The second site to change is viewer_get_metadata() which doesn’t handle
a viewer metadata stream not having an active trace chunk at all.
Thankfully, the protocol allows us to express this condition by
returning the LTTNG_VIEWER_NO_NEW_METADATA status code when a viewer
metadata stream doesn’t have an open file and doesn’t have a current
trace chunk.
Surprisingly, this bug didn’t trigger in the case where a transition to
a null chunk occurred _after_ attaching to a viewer session.
This is because viewers will typically ask for metadata as a result of an
LTTNG_VIEWER_FLAG_NEW_METADATA reply to the GET_NEXT_INDEX command. When
a session is stopped and all data was consumed, this command returns
that no new data is available, causing the viewers to wait and ask again
later.
However, when attaching, babeltrace2 (at least, and probably babeltrace 1.x)
always asks for an initial segment of metadata before asking for an
index.
Known drawbacks
===============
None.
Fixes: #1323
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I516fca60755e6897f6b7170c12d706ef57ad61a5
Francis Deslauriers [Thu, 30 Sep 2021 18:43:11 +0000 (14:43 -0400)]
Fix: configure.ac: reporting SDT uprobe as a UST feature
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I86638d6a148b04e7131e4af7ec830c5e56817fdc
Francis Deslauriers [Thu, 7 Oct 2021 18:52:27 +0000 (14:52 -0400)]
Fix: Tests: leaking epoll fd
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5ec4fcdb87159f35932c20e7314cda764d14967c
Francis Deslauriers [Mon, 25 Oct 2021 15:32:24 +0000 (11:32 -0400)]
Typo: occurences -> occurrences
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I719e26febd639f3b047b6aa6361fc6734088e871
Jérémie Galarneau [Mon, 18 Oct 2021 21:09:27 +0000 (17:09 -0400)]
Update version to v2.13.1
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 8 Jul 2021 18:17:51 +0000 (14:17 -0400)]
Fix: ust: app stuck on recv message during UST comm timeout scenario
Observed issue
==============
The following scenario lead to the UST thread to be "stuck" on recvmsg
on the notify socket.
The problem manifest itself when an application is unresponsive during
the ustctl_start_session call. Note that the default timeout for ust
communication is 5 seconds.
# Start an instrumented app
./app
gdb lttng-sessiond
# put a breakpoint on ustctl_start_session
lttng create my_session
lttng enable-event -u -a
lttng start
# The tracepoint should hit. Do not continue.
kill -s SIGSTOP $(pgrep app)
# Continue lttng-sessiond.
sleep 5 # This make sure lttng-sessiond unregister the app from its point of view
kill -s SIGCONT $(pgrep app)
gdb -p $(pgrep app)
thread apply all bt
App stack trace:
Thread 3 (Thread 0x7fe2c6f58700 (LWP 48172)):
#0 __libc_recvmsg (flags=0, msg=0x7fe2c6f56ac0, fd=4) at ../sysdeps/unix/sysv/linux/recvmsg.c:28
#1 __libc_recvmsg (fd=fd@entry=4, msg=msg@entry=0x7fe2c6f56ac0, flags=flags@entry=0) at ../sysdeps/unix/sysv/linux/recvmsg.c:25
#2 0x00007fe2c7a010ba in ustcomm_recv_unix_sock (sock=sock@entry=4, buf=buf@entry=0x7fe2c6f56ea0, len=len@entry=48) at lttng-ust-comm.c:308
#3 0x00007fe2c7a037c3 in ustcomm_register_channel (sock=4, session=session@entry=0x7fe2c0000ba0, session_objd=<optimized out>, channel_objd=<optimized out>, nr_ctx_fields=nr_ctx_fields@entry=0, ctx_fields=<optimized out>, chan_id=0x7fe2
c6f5716c, header_type=0x7fe2c0012b18) at lttng-ust-comm.c:1544
#4 0x00007fe2c7a10787 in lttng_session_enable (session=0x7fe2c0000ba0) at lttng-events.c:444
#5 0x00007fe2c7a0b785 in lttng_session_cmd (objd=1, cmd=128, arg=
140611977311672, uargs=0x7fe2c6f57800, owner=0x7fe2c7a5da00 <local_apps>) at lttng-ust-abi.c:576
#6 0x00007fe2c7a07d6d in handle_message (lum=0x7fe2c6f57590, sock=3, sock_info=0x7fe2c7a5da00 <local_apps>) at lttng-ust-comm.c:1003
#7 ust_listener_thread (arg=0x7fe2c7a5da00 <local_apps>) at lttng-ust-comm.c:1712
#8 0x00007fe2c7993609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9 0x00007fe2c78ba293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
...
Cause
=====
When the app continues after the timeout from lttng-sessiond side, the
actual start_session message is received on the application side then
UST, app side, send commands on the notify socket. On lttng-sessiond
side, the command is received but no reply is sent.
This is due to the fact that the lookup against the
ust_app_ht_by_notify_sock hash table (find_app_by_notify_sock)
return nothing since the app is unregistered at this point and the hash
table node was removed on unregistration.
Solution
========
When the app lookup fails, return an error that will trigger the cleanup
of the notify socket.
Known drawbacks
=========
None
Note
=========
Subsequent error path in reply_ust_register_channel,
add_event_ust_registry, and add_enum_ust_registry might lead to the same
type of problem since no reply is sent to the app. Still, for those
cases the complete application/notify socket should not be destroyed
since the error path relate to either a session or a sub object of a
session.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Iea0dc027ca1ee772e84c7e545114f1be69fd1f63
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 23 Jun 2021 02:17:03 +0000 (22:17 -0400)]
Fix: ust: UST communication can return -EAGAIN
Observed issue
==============
The following scenario lead to an abort on event creation. The
problem manifest itself when an application is unresponsive. Note that
the default timeout for ust communication is 5 seconds.
# Start an instrumented app
./app
gdb lttng-sessiond
# put a breakpoint on ustctl_create_event.
lttng create my_session
lttng enable-event -u -a
lttng start
# The tracepoint should hit. Do not continue.
kill -s SIGSTOP $(pgrep app)
# Continue lttng-sessiond.
# lttng-sessiond will abort.
Note that for UST this is not an expected behaviour. Expected
communication failure with a single app should not invalidate the
complete channel, compromise its setup or result in an abort.
Note that a similar scenario for the following ustctl call sites also
lead to scenario where failure of a single app lead to error reporting
and/or error propagation to upper level object.
Problematic callsites:
ustctl_set_exclusion
ustctl_set_filter
ustctl_disable_channel
These callsites are also fixed by this patch.
Cause
=====
For an unresponsive application, EAGAIN is returned and is treated as an
"unknown" hard error.
In this particular case the abort() call was introduced by commit:
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1]. It is not clear if this is
a leftover from debugging session since this is the only callsite where
an abort is issued on communication failure via ustctl.
Solution
========
Handle EAGAIN coming from ustctl_* and treat it the same way a
dying application is handled. The only minor difference is that we WARN
on communication time out. Albeit not the most useful thing for a CLI
client, it could help overall user of lttng-sessiond in time out
situation.
Most call site already handled "unknown" error correctly. For those call
site we simply end up bringing more info in regards to the timeout
issue instead of mentioning that "-11" was returned.
Note, the reclamation of "app" is handled by the poll loop and
ust_app_unregister since the socket is shutdown by lttng-ust internally
on error, including EAGAIN.
Note that the application will try to register itself back to the
lttng-sessiond based on its configuration.
Known drawbacks
=========
None
Note
==========
Some logging call sites used the ppid of the app instead of the pid.
Those have been changed to pid.
References
==========
[1] https://github.com/lttng/lttng-tools/commit/
88e3c2f5610b9ac89b0923d448fee34140fc46fb
Fixes: #1384
Change-Id: If364b5d48e7fd2b664276a0fb1b7eec2c45ed683
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 12 Jul 2021 20:44:38 +0000 (16:44 -0400)]
Fix: ust: segfault on lttng start on filter bytecode copy
Observed issue
==============
A segmentation fault is observed for multiple UST timeout scenarios.
Backtrace:
#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
#1 0x0000557fe0395df9 in copy_filter_bytecode (orig_f=0x7f9c5802b790) at ust-app.c:1196
#2 0x0000557fe0397702 in shadow_copy_event (ua_event=0x7f9c58025ff0, uevent=0x7f9c58033560) at ust-app.c:1824
#3 0x0000557fe039ac46 in create_ust_app_event (ua_sess=0x7f9c5802ec20, ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, app=0x7f9c5c001da0) at ust-app.c:3192
#4 0x0000557fe03a054d in ust_app_channel_synchronize_event (ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, ua_sess=0x7f9c5802ec20, app=0x7f9c5c001da0) at ust-app.c:5096
#5 0x0000557fe03a0772 in ust_app_synchronize (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5173
#6 0x0000557fe03a0a70 in ust_app_global_update (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5255
#7 0x0000557fe03a00e0 in ust_app_start_trace_all (usess=0x7f9c580074a0) at ust-app.c:4987
#8 0x0000557fe0355c6a in cmd_start_trace (session=0x7f9c5800a190) at cmd.c:2668
#9 0x0000557fe0382e70 in process_client_msg (cmd_ctx=0x7f9c58003d70, sock=0x7f9c74bf44e0, sock_error=0x7f9c74bf44e4) at client.c:1527
#10 0x0000557fe03848a2 in thread_manage_clients (data=0x557fe06d9440) at client.c:2200
#11 0x0000557fe037d1cb in launch_thread (data=0x557fe06d94b0) at thread.c:75
#12 0x00007f9c796af609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x00007f9c795b6293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
The scenario:
# Start an instrumented app
./app
gdb lttng-sessiond
# put a breakpoint on ustctl_set_filter
lttng create my_session
lttng enable-event -u tp:tp_test
lttng start
lttng enable-event -u __dummy --filter 'my_field == "user34"'
# The tracepoint should hit. Do not continue.
kill -s SIGSTOP $(pgrep app)
# Continue lttng-sessiond.
# enable-event will return an error. This a bug in itself, still let's
# continue with the current bug.
lttng stop
# Start a new app that will register.
./app &
sleep 1
lttng start
# lttng-sessiond should segfault.
Cause
=====
During the "lttng enable-event" command, the timeout error bubbles up
all the way to event_ust_enable_tracepoint and is different from
LTTNG_UST_ERR_EXIST. `trace_ust_destroy_event` is called and frees the
`uevent` object. Note that contrary to the comment `uevent` is added to
the channel event hash table at this point.
On the next `lttng start` command, the event node is still present in
the hash table and is iterated on. lttng-sessiond segfault on the first
data access of the previously freed memory.
The problem was introduced by commit
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1]. Which essentially move the
callsite of `add_unique_ust_event` before `ust_app_*_event_glb` calls.
Solution
========
Go to `end` label to prevent freeing of the uevent object.
Note that app synchronization should not force an error at the channel
level, since a single app can fail but the whole channel should not.
The `error` label is now obsolete.
Known drawbacks
=========
None.
References
==========
[1] https://github.com/lttng/lttng-tools/commit/
88e3c2f5610b9ac89b0923d448fee34140fc46fb
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Ifaf3f4c71bb2da869c7b441aaa4b367f8f7cbdd6
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 7 Oct 2021 20:19:41 +0000 (16:19 -0400)]
Fix: sessiond: previously created channel cannot be enabled
Observed issue
==============
A previously created channel cannot be enabled back once a session is
started.
Cause
=====
The check validating that the session was started is to early in the
`cmd_enable_channel` function.
Solution
========
Move the check at the creation code path when the channel is not found.
Known drawbacks
=========
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I8e7d62b7e97246e65f1cf9022270293a6dd34cc9
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 15 Oct 2021 21:03:38 +0000 (17:03 -0400)]
Build fix: Missing message in LTTNG_DEPRECATED invocation
Coverity scan build jobs fail since LTTNG_DEPRECATED expects a string
and none is provided at the lttng_metadata_regenerate use site.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7e6701abd24c679f578b0adead771ac93b6566cd
Francis Deslauriers [Mon, 27 Sep 2021 13:42:54 +0000 (09:42 -0400)]
Fix: notification-thread: handling event from a removed tracer event src
Issue
=====
The issue is caused by a race condition where the `lttng_poll_wait()`
returns a _REMOVE_TRACER_EVENT_SOURCE event followed by an actual
notification event on the removed event source fd.
This causes the notification thread to remove the fd from the potential
notification sources list and later fail to find that same fd in the
next iteration.
This race condition can lead to the notification thread to hang
indefinitely or to failed assertions within the `fini_thread_state()`
function.
Fix
===
When removing an tracer event source, force the notification thread
`lttng_poll_wait()` loop to restart to ignore events from the removed
fd.
Use the `restart_poll` for that purpose (see note below).
Reproducer
==========
It's easy to reproduce this issue by adding a `usleep(5000)` just before
the `lttng_poll_wait()` call in the notification thread.
Note
====
It's the second time that I fix this issue.
It was first fixed by this commit by adding the `restart_poll` flag:
commit
8b5240601e4ddf6127e4291b7194dd5179cb35b5
Author: Francis Deslauriers <francis.deslauriers@efficios.com>
Date: Thu Dec 10 15:41:29 2020 -0500
notification-thread: drain all tracer notification on removal
and later, that other commit refactored that code but accidently removed
the use of the `restart_poll`:
commit
34bf4f69e49d8a69331a6aa6826ef1f155e20ede
Author: Francis Deslauriers <francis.deslauriers@efficios.com>
Date: Wed May 26 16:05:16 2021 -0400
notification-thread: remove fd from pollset on LPOLLHUP and friends
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6da0ed4374b612934adc72fb88d5c142505c5d53
Simon Marchi [Wed, 6 Oct 2021 15:41:19 +0000 (11:41 -0400)]
include: add missing "extern"
Change-Id: I37574b25adede7c639a04c508f6e4be8256339d9
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 6 Oct 2021 14:57:24 +0000 (10:57 -0400)]
include: remove spurious spaces in condition/session-rotation.h
Change-Id: Ia525d24c3b4098dff5c50fb2c5d93c16f6e08f5c
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Tue, 5 Oct 2021 20:10:18 +0000 (16:10 -0400)]
tests: fix header of regression/ust/getcpu-override/run-getcpu-override
The "SPDX-License-Identifier:" header is not in a comment, so is
interpreted as a bash command. This is harmless, but it appears in the
test output:
ok 13 - Start tracing for session sequence-cpu
# Launching app with getcpu-plugin wrapper
./tests/regression/ust/getcpu-override//run-getcpu-override: 2: SPDX-License-Identifier:: not found
ok 14 - Application with wrapper done
Fix that, and add a proper copyright notice, based on the other files
that were added at the same time as this one.
Change-Id: Icdf5e2fd5aec4080b2e5cad10cca4813bad26394
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 5 Aug 2021 20:48:51 +0000 (16:48 -0400)]
fix: wrong define used for GCC version check
As far as I can tell, the __GNUC_MAJOR__ define has never existed, the
proper define for the major version is __GNUC__. See
https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html for
more details.
Change-Id: I0d47d524e7efd204fd2f8976311c62e872eb6170
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 4 Oct 2021 16:41:51 +0000 (12:41 -0400)]
Fix: userspace-probe: unreported error on string copy error
Issue
=====
String copy errors, either due to the length or an allocation failure,
are not reported by
lttng_userspace_probe_location_tracepoint_create_from_payload
and don't log a clear error message.
This allowed truncation bugs like the one fixed in
b45a296 to go
unnoticed.
Fix
===
Return an "invalid" status code and log a more descriptive error
message.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia07cac7cba315ea79337262e9082dd06eb60950f
Francis Deslauriers [Fri, 1 Oct 2021 20:10:24 +0000 (16:10 -0400)]
Fix: userspace-probe: truncating binary path for SDT
Issue
=====
This issue was uncovered when we enabled the testing of the SDT
userspace probe instrumentation on the CI, where the paths to file are
specially long.
The reported error is:
- rule: ma-probe-sdt (type: kernel:uprobe, location type: SDT, location: /root/workspace/dev_gerrit_lttng-tools_rootbuild/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/amd64-rootnode/test_type/base/src/lttng-tools/tests/utils/testapp/userspace-probe-sdt-binary/.libs/userspace-probe-sdt-binary:foobar:tp1)
+ rule: ma-probe-sdt (type: kernel:uprobe, location type: SDT, location: /root/workspace/dev_gerrit_lttng-tools_rootbuild/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/amd64-rootnode/test_type/base/src/lttng-tools/tests/utils/testapp/userspace-probe-sdt-binary/.libs/userspace-probe-s:foobar:tp1)
The important part to notice is that the path to the binary is truncated
compared to was is expected by the test case.
The problem is caused by the
`lttng_userspace_probe_location_tracepoint_create_from_payload()`
function that strdup() the path string using the wrong defined value.
Fix
===
Use LTTNG_PATH_MAX rather then LTTNG_SYMBOL_NAME_LEN to copy the binary
path.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I24cbf413baba405bf4c4b534ccbc2b18f8d5d43f
Jérémie Galarneau [Thu, 1 Jul 2021 19:50:47 +0000 (15:50 -0400)]
Fix: lttng: add-trigger: don't provide a default event rule type
There is no reason for an event rule to have a default type. The
--type parameter is required.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic7f03453fac410c96ca6bb3b3ca0bdfb297a10d1
Francis Deslauriers [Thu, 19 Aug 2021 21:14:46 +0000 (17:14 -0400)]
Fix: statements with side-effects in assert statements
Background
==========
When building with the NDEBUG definition the `assert()` statements are
removed.
Issue
=====
Currently, a few `assert()` statements in the code base contain
statements that have side effects and removing them changes the
behavior for the program.
Fix
===
Extract the statements with side effects out of the `assert()`
statements.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0b11c8e25c3380563332b4c0fad15f70b09a7335
Jonathan Rajotte [Thu, 16 Sep 2021 15:20:07 +0000 (11:20 -0400)]
Fix: lttng_trace_archive_location_serialize is called on freed memory
Observed issue
==============
The following backtrace have been reported [1].
#0 __GI_raise (sig=sig@entry=6) at /usr/src/debug/glibc/2.31+gitAUTOINC+
f84949f1c4-r0/git/sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000003123025528 in __GI_abort () at /usr/src/debug/glibc/2.31+gitAUTOINC+
f84949f1c4-r0/git/stdlib/abort.c:79
#2 0x0000000000419884 in lttng_trace_archive_location_serialize (location=0x7f1c9c001160, buffer=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/location.c:230
#3 0x00000000004c8f06 in lttng_evaluation_session_rotation_serialize (evaluation=0x7f1cb000a7f0, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/conditions/session-rotation.c:539
#4 0x00000000004a80fa in lttng_evaluation_serialize (evaluation=0x7f1cb000a7f0, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/evaluation.c:42
#5 0x00000000004bc24f in lttng_notification_serialize (notification=0x7f1cb961c310, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/notification.c:63
#6 0x0000000000458b7d in notification_client_list_send_evaluation (client_list=0x7f1cb0008f90, trigger=0x7f1ca40113d0, evaluation=<optimized out>, source_object_creds=0x7f1cb000a874, client_report=0x475840 <client_handle_transmission_status>, user_data=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/notification-thread-events.c:4379
#7 0x0000000000476586 in action_executor_generic_handler (item=0x7f1cb0009600, work_item=0x7f1cb000a820, executor=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:696
#8 action_work_item_execute (work_item=0x7f1cb000a820, executor=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:715
#9 action_executor_thread (_data=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:797
#10 0x0000000000462327 in launch_thread (data=0x7f1cb00060b0) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/thread.c:66
#11 0x0000003123408ea4 in start_thread (arg=<optimized out>) at /usr/src/debug/glibc/2.31+gitAUTOINC+
f84949f1c4-r0/git/nptl/pthread_create.c:477
#12 0x00000031230f8dcf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
This can be easily reproduced with the following session and trigger
configuration:
lttng create test
lttng enable-event -u -a
lttng start
# Register two similar triggers via a dummy C program since rotation
# completed condition is not exposed on the CLI for now. Yielding the
# following triggers:
lttng list-triggers
- name: trigger0
owner uid: 1000
condition: session rotation completed
session name: test
errors: none
action:notify
errors: none
- name: trigger1
owner uid: 1000
condition: session rotation completed
session name: test
errors: none
action:notify
errors: none
lttng rotate <- abort happens here.
Cause
=====
The problem lies in how the location (`lttng_trace_archive_location`)
object is assigned to the `lttng_evaluation` objects. A single location
object can end up being shared between multiple `lttng_evaluation` objects
since we iterate over all triggers and create an `lttng_evaluation` object
with the location each time as needed.
See `src/bin/lttng-sessiond/notification-thread-events.c:1956`.
The location object is then freed when the first notification is
completely serialized. The second serialization end up having a
reference to a freed `lttng_trace_archive_location` object.
Solution
========
Implement ref counting for the lttng_trace_archive_location object.
Note
=======
This also fixes a leak that was present in `cmd_destroy_session_reply`.
The location is created by `session_get_trace_archive_location` and is
never `destroyed`/`put`.
Known drawbacks
=========
None.
References
==========
[1] https://bugs.lttng.org/issues/1325
Fixes: #1325
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I99dc595ee5b0288c727b193ed061f5273752bd24
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 13 Sep 2021 20:49:48 +0000 (16:49 -0400)]
Fix: sessiond: ust session is inactive during ust_app_global_update
Observed issue
==============
The following scenario leads to an abort of lttng-sessiond.
lttng-sessiond (with kernel tracing available)
lttng create system-trace --snapshot -U /tmp/snapshot
lttng enable-channel -k system-trace --subbuf-size=4k --num-subbuf=256
lttng enable-event -c system-trace -k 'sched_wak*' -s system-trace
lttng start system-trace
lttng enable-event -u -a
Fails as expected with:
Error: Events: The command tried to enable an event in a new domain for
a session that has already been started once. (channel channel0,
session system-trace)
Launch any ust app such as easy_ust from the lttng-ust repository.
The following backtrace is generated:
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7af0859 in __GI_abort () at abort.c:79
#2 0x00007ffff7af0729 in __assert_fail_base (fmt=0x7ffff7c86588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55555564b765 "usess->active", file=0x555555649a60 "ust-app.c", line
#3 0x00007ffff7b01f36 in __GI___assert_fail (assertion=0x55555564b765 "usess->active", file=0x555555649a60 "ust-app.c", line=5123, function=0x55555564ecf0 <__PRETTY_FUNCTION__.14199> "ust_
#4 0x00005555555d1f5e in ust_app_global_update (usess=0x7fffe001fb90, app=0x7fffac000b80) at ust-app.c:5123
#5 0x00005555555b60d4 in update_ust_app (app_sock=82) at dispatch.c:71
#6 0x00005555555b7025 in thread_dispatch_ust_registration (data=0x5555556a07f0) at dispatch.c:409
#7 0x00005555555ad5ab in launch_thread (data=0x5555556a0810) at thread.c:65
#8 0x00007ffff7ce6609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9 0x00007ffff7bed293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
This also happens for the track command. You can replace the `lttng
enable-event -u -a` with `lttng track --userspace --vuid=0` then launch
an app and the same backtrace gets generated.
Cause
=====
During `process_client_msg` the `create_ust_session` function is called
and a ust session is assigned to the "system_trace" session with a
state of `active` set to 0 (false). This is not a problem.
The problem seems to lie with a single call site for
`ust_app_global_update` in `update_ust_app`. The status of the ust
session is not checked before calling the `ust_app_global_update`. It is
important to note that all `ust_app_global_update_all` callsites guard
the call with a check against the status of the session.
Solution
========
Guard the call to `ust_app_global_update` with a check of the ust
session active state.
Known drawbacks
=========
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I14d25d99d0609689247cdfa86130bd0219613581
Jonathan Rajotte [Tue, 14 Sep 2021 20:10:36 +0000 (16:10 -0400)]
Fix: common: error query for trigger action protocol error
Observed issue
==============
When listing a trigger with a single non-list action the CLI reports an
error in the protocol resulting in an output with no error accounting
for the action.
$ lttng list-triggers
- name: trigger0
owner uid: 1000
condition: session rotation ongoing
session name: test
errors: none
action:notify
Error: Failed to query errors of trigger 'trigger0' (owner uid: 1000): Protocol error occurred
Cause
=====
The `action_path` associated with the query has an index count of 0 as
it should considering that the single root element action element is not
a `list` object.
Inside `lttng_action_path_create_from_payload` a payload view is
initialized with a `len` of 0 since `header->index_count` is 0 as it
should.
The payload view is then validated and is considered invalid since the
validation check for `len` > 0. The error then bubbles up.
Solution
========
Since that the payload view is considered invalid when it is equal to
zero simply handle this special case and call directly
`lttng_action_path_create` with the appropriate parameter.
Known drawbacks
=========
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8f302c3aa78835342c665793908dc02f0a9dece4
Simon Marchi [Tue, 21 Sep 2021 14:31:55 +0000 (10:31 -0400)]
Fix: common: un-hide two rate policy functions
These functions are part of the liblttng-ctl API/ABI, they should not be
hidden.
Change-Id: Ic04bb4e7a0bfd0c7d661228b7ccf5d17dccfd9ba
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Tue, 21 Sep 2021 13:30:09 +0000 (09:30 -0400)]
Fix: include: remove unneeded declaration of lttng_session_descriptor_get_session_name
There is a declaration of lttng_session_descriptor_get_session_name in
both session-descriptor.h and session-descriptor-internal.h. Since this
is a function exposed by the API, the one in -internal.h is not needed,
remove it.
Since the removed declaration had LTTNG_HIDDEN, this has the effect of
making the lttng_session_descriptor_get_session_name symbol of
liblttng-ctl exported / part of the ABI. I think it was a mistake that
it wasn't previously exported.
Change-Id: I79d383f012d161a6df42240c6849b1b3af109def
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 8 Sep 2021 14:16:23 +0000 (10:16 -0400)]
Fix: Tests: race condition in test_ns_contexts_change
Issue
=====
The test script doesn't wait for the test application to complete before
stopping the tracing session. The race is that depending on the
scheduling the application is not always done generating events when the
session is stopped.
Fix
===
Make the test script wait for the termination of the test app before
stopping the session.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I29d9b41d2a2ed60a6c42020509c2067442ae332c
Francis Deslauriers [Tue, 7 Sep 2021 21:10:31 +0000 (17:10 -0400)]
Fix: Tests: race condition in test_event_tracker
Background
==========
The `test_event_tracker` file contains test cases when the event
generating app in executed in two distinct steps. Those two steps are
preparation and execution.
1. the preparation is the launching the app in the background, and
2. the execution is actually generating the event that should or
should not be traced depending on the test case.
This is useful to test the tracker feature since we want to ensure that
already running apps are notified properly when changing their tracking
status.
Issue
=====
The `test_event_vpid_track_untrack` test case suffers from a race
condition that is easy to reproduce on Yocto.
The issue is that sometimes events are end up the trace when none is
expected.
This is due to the absence of synchronization point at the launch of the
app which leads to the app being scheduled in-between the track-untrack
calls leading to events being recorded to the trace.
It's easy to reproduce this issue on my machine by adding a `sleep 5`
between the track and untrack calls and setting the `NR_USEC_WAIT`
variable to 1.
Fix
===
Using the testapp `--sync-before-last-event-touch` flag to make the app
create a file when all but the last event are executed. We then have the
app wait until we create a file (`--sync-before-last-event`) to generate
that last event. This way, we are sure no event will be generated when
running the track and untrack commands.
Notes
=====
- This issue affects other test cases in this file.
- This commit fixes a typo in the test header.
- This commit adds `diag` calls to help tracking to what test the output
relates to when reading the log.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia2b68128dc9a805526f9748f31ec2c2d95566f31
Jérémie Galarneau [Fri, 13 Aug 2021 15:15:15 +0000 (11:15 -0400)]
Fix: man: lttng-rotate: trace file count/size limitation does not apply
Reported-by: Zach Kramer <Zach.Kramer@cognex.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I337fd06a12d145bdd97c14b4b1894e3676945f63
Francis Deslauriers [Fri, 6 Aug 2021 13:40:20 +0000 (09:40 -0400)]
Fix: runas: less-than-zero comparison of an unsigned value
Fixes two defects found by Coverity related to unsigned integers being
treated as signed.
Reported by Coverity:
CID
1461333: Control flow issues (NO_EFFECT)
This less-than-zero comparison of an unsigned value is never true. "buf_size < 0UL".
CID
1461332: Integer handling issues (NEGATIVE_RETURNS)
"buf_size" is passed to a parameter that cannot be negative.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id6d4a71960f2ef34f14c05e66ef5d934b7a3e524
Francis Deslauriers [Fri, 23 Jul 2021 20:27:00 +0000 (16:27 -0400)]
Fix: runas: supplementary groups are ignored on lttng save
Observed issue
==============
On `lttng save` the following is reported to the user:
$ sudo -u my_user lttng save -o /tmp/my_dir my_session_name
Error: Permission denied
Note that:
* the running lttng-sessiond is root,
* "my_user" is part of the tracing group,
* "my_user" primary group is "my_user" and is part of group "my_dummy_group"
* The "/tmp/my_dir" has the following permissions:
drwxrwx--- 2 root my_dummy_group 4096 Jul 26 16:39 /tmp/my_dir/
Cause
=====
The supplementary groups are not initialized when the run-as process
demote itself to the user "my_user" to perform the recursive mkdir
required by the `lttng save` command.
From the point of the view the kernel, at the moment of performing the
mkdir call the permissions looks like this:
euid: uid of "my_user"
egid: primary gid of "my_user"
supplementary group list: "root"
Note that the kernel does not treat the presence of the root group in
the supplementary group list in any special way. Since "root gid" !=
"my_dummy_group gid" the directory creation is refused.
Solution
========
Use initgroups(3) to initialize the supplementary group list.
Known drawbacks
=========
None.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I58656a3107e4f7b59a2391a4759988401cad7a2b
Jérémie Galarneau [Tue, 3 Aug 2021 18:27:03 +0000 (14:27 -0400)]
Docs: lttng-event-rule(7): --exclude does not exist, use --exclude-name
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I92bb8e1b362d121172368897e6a9d4f538d4c68d
Francis Deslauriers [Thu, 8 Jul 2021 19:46:02 +0000 (15:46 -0400)]
sessiond: logging typo: {triger, triggger} -> trigger
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ida8faafc4c12f9817d3ee097bb648c10bd5ff854
Simon Marchi [Mon, 2 Aug 2021 01:02:39 +0000 (21:02 -0400)]
Fix: lttng: free sessions in cmd_destroy
When doing `lttng destroy`, I get:
Direct leak of 4385 byte(s) in 1 object(s) allocated from:
#0 0x7f74ae025459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x7f74add4129a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
#2 0x7f74add42b9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
#3 0x7f74add42f9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
#4 0x7f74add41714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
#5 0x7f74add41747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
#6 0x7f74add4a922 in lttng_list_sessions /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2105
#7 0x56472bcbdf80 in cmd_destroy /home/simark/src/lttng-tools/src/bin/lttng/commands/destroy.c:330
#8 0x56472bd00764 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
#9 0x56472bd01218 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
#10 0x56472bd0151a in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
#11 0x7f74ad963b24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
This is due to cmd_destroy not free'ing the result of
lttng_list_sessions. Fix that.
Change-Id: Iff2e75e6ec1cdcd0bdfdbbc3d5099422e592905b
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 2 Aug 2021 00:33:23 +0000 (20:33 -0400)]
Fix: lttng: free domains and channels in get_session_stats_str
When doing `lttng stop`, I get:
Direct leak of 656 byte(s) in 1 object(s) allocated from:
#0 0x7f970719e459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x7f9706eba29a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
#2 0x7f9706ebbb9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
#3 0x7f9706ebbf9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
#4 0x7f9706eba714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
#5 0x7f9706eba747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
#6 0x7f9706ec4604 in lttng_list_channels /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2262
#7 0x55837235c4e7 in get_session_stats_str /home/simark/src/lttng-tools/src/bin/lttng/utils.c:499
#8 0x55837235bf73 in print_session_stats /home/simark/src/lttng-tools/src/bin/lttng/utils.c:445
#9 0x55837231cc12 in stop_tracing /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:138
#10 0x55837231d062 in cmd_stop /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:229
#11 0x55837235e63e in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
#12 0x55837235f0f2 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
#13 0x55837235f3f4 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
#14 0x7f9706adcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
Direct leak of 308 byte(s) in 1 object(s) allocated from:
#0 0x7f970719e459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x7f9706eba29a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
#2 0x7f9706ebbb9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
#3 0x7f9706ebbf9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
#4 0x7f9706eba714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
#5 0x7f9706eba747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
#6 0x7f9706ec421c in lttng_list_domains /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2220
#7 0x55837235c3d3 in get_session_stats_str /home/simark/src/lttng-tools/src/bin/lttng/utils.c:484
#8 0x55837235bf73 in print_session_stats /home/simark/src/lttng-tools/src/bin/lttng/utils.c:445
#9 0x55837231cc12 in stop_tracing /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:138
#10 0x55837231d062 in cmd_stop /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:229
#11 0x55837235e63e in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
#12 0x55837235f0f2 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
#13 0x55837235f3f4 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
#14 0x7f9706adcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
This is due to the get_session_stats_str function not free'ing the
results of lttng_list_channels and lttng_list_domains. Fix that.
Change-Id: I4c200d3df41bf09bdce8eadb000abbff7fe5a751
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 2 Aug 2021 20:49:46 +0000 (16:49 -0400)]
Update version to v2.13.0
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 19 Jul 2021 21:21:17 +0000 (17:21 -0400)]
Tests fix: unix socket: leaked socket of connection to child
The child_connection socket is only used by the parent in the
credentials passing test. The teardown assumes the reverse which causes
the socket to be leaked.
1458471 Resource leak
The system resource will not be reclaimed and reused, reducing the
future availability of the resource.
In test_creds_passing: Leak of memory or pointers to system
resources (CWE-404)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2ead9abbfc189ffbdd71a27f6376d0b001cdc2a3
Jérémie Galarneau [Mon, 19 Jul 2021 21:17:39 +0000 (17:17 -0400)]
Fix: sessiond: notification: missing unlock on client skip
Skipping a client must be performed by using the dedicated "skip_client"
label which will unlock the client's lock before continuing the loop
rather than using 'continue' directly.
Currently, a client will remain locked when an hidden trigger emits
a notification to which it is subscribed.
1458230 Missing unlock
May result in deadlock if there is another attempt to acquire the lock.
In notification_client_list_send_evaluation: Missing a release of a lock
on a path (CWE-667)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8b69395b91b0ea59ae5e0beadebd9099db623121
Jérémie Galarneau [Fri, 16 Jul 2021 18:57:49 +0000 (14:57 -0400)]
Update version to v2.13.0-rc3
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 16 Jul 2021 18:47:53 +0000 (14:47 -0400)]
liblttng-ctl: hide logger_thread_name
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4eb5a86029c6220ad4f48d382ec26126fd82e443
Jérémie Galarneau [Fri, 16 Jul 2021 18:42:24 +0000 (14:42 -0400)]
liblttng-ctl: hide MI trigger command variables
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I45eec5bb0fd3353c8f1257b3c94ef08440114b21
Francis Deslauriers [Tue, 25 May 2021 19:57:59 +0000 (15:57 -0400)]
Cleanup: rename `get_domain_str()` -> `lttng_domain_type_str()`
Both functions currently exist in the code base and accomplish the same
goal. Let's keep only one of them.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2254b846f0b5bdc883c86d970fde7daffa9e6155
Jérémie Galarneau [Fri, 16 Jul 2021 17:29:07 +0000 (13:29 -0400)]
.gitignore: Add hidden trigger test
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iab0fe77c0d4607d5469a7aa57d6bd784d47d8609
Jérémie Galarneau [Thu, 15 Jul 2021 00:44:44 +0000 (20:44 -0400)]
Test: unix socket: test credential passing
Since the credential passing over UNIX sockets now makes use of the pid,
the compatiblity wrappers have become more complex as each platform
appears to define its own way of accessing this information.
This new test:
- creates a named unix socket,
- forks,
- gets the parents and child to connect,
- sends the child's credentials as a data payload and as credentials
verified by the kernel
- the parent checks that the two sets of credentials are equal.
This is more of a sanity check for the compatibility wrappers used on
non-Linux platforms.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic0a6213afca7cc95a00617b052e7a145fc88625c
Jérémie Galarneau [Wed, 14 Jul 2021 19:19:15 +0000 (15:19 -0400)]
Build fix: retrieve unix socket peer PID on non-unix platforms
The previous attempt at extending the credential retrieval wrapper was
broken and didn't build on FreeBSD, macOS, and cygwin.
A platform-specific way of retrieving the PID of a unix peer is
implemented for FreeBSD (getsockopt using LOCAL_PEERCRED, note that the
cr_pid field is only available from FreeBSD 13 and up),
macOS (getsockopt using LOCAL_PEERPID, macOS 10.8+), and
Solaris (getpeerucreds).
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifcf522c70ee4c2e0799293ae0961f41aebff5056
Jérémie Galarneau [Mon, 12 Jul 2021 22:42:57 +0000 (18:42 -0400)]
Fix: sessiond: notification: find_tracer_event_source returns NULL
Due to a bad edit of the original patch (my bad!)
find_tracer_event_source_element always returns NULL.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7febee1d803034a06d5063a2cc9179c4edef4809
Francis Deslauriers [Thu, 8 Jul 2021 16:35:58 +0000 (12:35 -0400)]
Tests: MI: add `diag` statements to test functions
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie56e23a3d0796d1edb07e2fd7cdc259816ac0133
Francis Deslauriers [Thu, 21 Jan 2021 17:00:03 +0000 (12:00 -0500)]
Cleanup: fix comments in `duplicate_{stream,channel}_object()`
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5089d09880d21842bf264f6c30ec7fd5e72b93df
Jérémie Galarneau [Fri, 9 Jul 2021 17:00:48 +0000 (13:00 -0400)]
Tests: add hidden trigger visibility test
Add a regression test for the previous commit that verifies that
internal triggers used by the session daemon to implement various
features (automatic session rotations based on their consumed size, in
this instance) are not visible to users of liblttng-ctl.
The test is written in C to use the library directly. This is needed
since the `lttng` client filters-out anonymous triggers and thus, would
not allow us to see those triggers since they are anonymous by default.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1b8fca648953b8cba49a9888593b3486457d01b2
Jérémie Galarneau [Fri, 9 Jul 2021 17:00:56 +0000 (13:00 -0400)]
Fix: sessiond: list-triggers: don't return internal triggers
The session daemon uses triggers internally. For instance, the trigger
and notification subsystem is used to implement the automatic rotation
of sessions based on a size threshold.
Currently, a user of the C API will see those internal triggers if it is
running as the same user as the session daemon. This can be unexpected
by user code that assumes it will be alone in creating triggers.
Moreover, it is possible for external users to unregister those triggers
which would cause bugs.
As the triggers gain more capabilities, it is likely that the session
daemon will keep using them to implement features internally. Thus,
an internal "is_hidden" property is introduced in lttng_trigger.
A "hidden" trigger is a trigger that is not returned by the listings.
It is used to hide triggers that are used internally by the session
daemon so that they can't be listed nor unregistered by external
clients.
This is a property that can only be set internally by the session
daemon. As such, it is not serialized nor set by a
"create_from_buffer" constructor.
The hidden property is preserved by copies.
Note that notifications originating from an "hidden" trigger will not
be sent to clients that are not within the session daemon's process.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I61b7949075172fcd428289e2eb670d03c19bdf71
Jérémie Galarneau [Thu, 8 Jul 2021 21:57:45 +0000 (17:57 -0400)]
unix: receive pid on non-linux platforms
Add a `pid` to the lttng_sock_cred structure definition used on
non-Linux platforms and receive the peer's PID when receiving
credentials.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9c92f6dda6441deca58f9cc85f846f5031cceb6e
Jérémie Galarneau [Thu, 8 Jul 2021 18:39:59 +0000 (14:39 -0400)]
Clean-up: sessiond: return an lttng_error_code from list_triggers
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5d44b508a2a5211894c0cc7b6d51a9a03dc8b3f2
Francis Deslauriers [Wed, 26 May 2021 20:05:16 +0000 (16:05 -0400)]
notification-thread: remove fd from pollset on LPOLLHUP and friends
When an app dies, it's possible that the notification thread gets an
epoll event (`LPOLLHUP`) that the socket was closed before it gets the
_REMOVE_TRACER_SOURCE command for that source.
In such cases, the notification thread should simply remove the file
descriptor from the pollset and drain the notification on that file
descriptor. It should _not_ remove the _source_element object from the
list.
The removal from the list should only be done when it receives the
_REMOVE_TRACER_SOURCE command.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9525315f9e92d0f6ae5e84e26b83a6b7207dce54
Jérémie Galarneau [Wed, 7 Jul 2021 18:59:02 +0000 (14:59 -0400)]
Tests: fix: list triggers: bc missing on system
`bc` is not part of the test suite's dependancies and can be replaced,
in this instance, by a use of `printf`.
This use of `bc` caused a number of failures on the CI's Lava workers.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1a1b24a23325754c26ebedfdb6b7728378381d97
Jérémie Galarneau [Mon, 5 Jul 2021 18:25:40 +0000 (14:25 -0400)]
Clean-up: event-expr: remove unreachable code
1452699 Logically dead code
The indicated dead code may have performed some action; that action will
never occur.
In lttng_event_expr_array_field_element_create: Code can never be
reached because of a logical contradiction (CWE-561)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I301e73c8e0cc7b9c4fb889e5bf7ef30d6ecf7d9f
Jérémie Galarneau [Mon, 5 Jul 2021 18:21:19 +0000 (14:21 -0400)]
Fix: lttng: remove-trigger: null dereference on MI initialization error
Failures to create an MI writer instance will result in a dereference of
the MI writer when attempting to close the command's output element.
1457842 Dereference after null check
Either the check against null is unnecessary, or there may be a null
pointer dereference.
In cmd_add_trigger: Pointer is checked against null but then
dereferenced anyway (CWE-476)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0bc71bf6c83df7d9d938cf93a12d5f6cf6d7ae36
Jérémie Galarneau [Mon, 5 Jul 2021 18:18:27 +0000 (14:18 -0400)]
Fix: lttng: list-trigger: leak of error query in query callbacks
1457841 Resource leak
The system resource will not be reclaimed and reused, reducing the
future availability of the resource.
In mi_error_query_trigger_callback: Leak of memory or pointers to system
resources (CWE-404)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4e2cde41d77e5299d1758e8c9387b0a1c63efd17
Jérémie Galarneau [Mon, 5 Jul 2021 18:16:00 +0000 (14:16 -0400)]
Fix: lttng: add-trigger: null dereference on MI initialization error
Failures to create an MI writer instance will result in a dereference of
the MI writer when attempting to close the command's output element.
1457842 Dereference after null check
Either the check against null is unnecessary, or there may be a null
pointer dereference.
In cmd_add_trigger: Pointer is checked against null but then
dereferenced anyway (CWE-476)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I98b844d2f1c7abd43bd42ee472759de57b34484e
Jérémie Galarneau [Wed, 30 Jun 2021 22:41:24 +0000 (18:41 -0400)]
lttng: add-trigger: print generated trigger name
Print the generated trigger name when `add-trigger` succeeds.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id858880260513b9a10c4ce5022a95c476e3e32aa
Jérémie Galarneau [Wed, 30 Jun 2021 22:47:52 +0000 (18:47 -0400)]
sessiond: generate trigger name: name triggers with the 'trigger' prefix
Generated trigger names currently have the form TN, where N is the
number of generated trigger names over the lifetime of the session
daemon.
The form 'triggerN' seems more in line with autogenerated names such
as channel names (e.g. 'channel0').
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id61cd4716bb4c080d9242853c366e14542f60f7c
Jérémie Galarneau [Thu, 1 Jul 2021 13:12:21 +0000 (09:12 -0400)]
Revert "lttng: add-trigger: print generated trigger name"
This reverts commit
8310270a50784aced2af5b21ab23bc7bd9dee47f.
This change is still under review.
Change-Id: If75aa02e2e5daa0bfbcf30bea0a2b54c4aca1fd4
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 30 Jun 2021 22:41:24 +0000 (18:41 -0400)]
lttng: add-trigger: print generated trigger name
Print the generated trigger name when `add-trigger` succeeds. Also,
no message is emited when a trigger is successfully registered as
the command will print an error message if any error occurs.
There is also no need to parrot the trigger's name if it was specified
by the user.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9607fbd358298b036bd533834143eb5e9d185cd0
Jonathan Rajotte [Mon, 7 Jun 2021 22:11:01 +0000 (18:11 -0400)]
MI: xsd: bump to 4.1
No breaking change were done to the xsd. Only objects related to
triggers, event-rules, actions, condition, and error-query were added.
They do not interfere with the current MI.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia057c0fbea34f8e5c48cb8d8d307f004acc95a00
Jonathan Rajotte [Mon, 7 Jun 2021 22:03:17 +0000 (18:03 -0400)]
Tests: trigger: mi: use utils.sh xsd versions for xml diff
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic1536218b468d300ceb3d16ca160b8a8b891edfc
Jonathan Rajotte [Mon, 7 Jun 2021 21:56:37 +0000 (17:56 -0400)]
Tests: utils: regroup xml utils to utils.sh
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idfa0f05d1bde75f4b02c903699281a86494b435f
Jonathan Rajotte [Wed, 26 May 2021 22:08:14 +0000 (18:08 -0400)]
Tests: MI: {add, list, remove}-trigger
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ica66a759d961cc122c1a1b81ce69fa54b0e78c78
Jonathan Rajotte [Thu, 27 May 2021 01:53:19 +0000 (21:53 -0400)]
MI: xsd: add objects type definition related to trigger
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If28306f8aaf24890a6d834e9ff69bd00de3da295
Jonathan Rajotte [Thu, 27 May 2021 01:51:26 +0000 (21:51 -0400)]
MI: xsd: sort output_type
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If2206e4a1c7a54d6d6bc4887c1925b12f035232b
Jonathan Rajotte [Thu, 27 May 2021 01:48:20 +0000 (21:48 -0400)]
MI: xsd: sort command_string_type
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8df9d69aeaf93050c405ff876ad697efac7c4021
Jonathan Rajotte [Wed, 26 May 2021 21:17:02 +0000 (17:17 -0400)]
Add pretty_xml utils
This util reads on stdin and outputs an indented/formatted xml.
It is equivalent to "xmllint --format -".
It will be used for MI trigger testing. For testing we will essentially
diff the output of the command against the expected output. While a
nicely formatted multi-line output is not necessary for a machine to
do the diff, the human that will have to debug it will surely appreciate
it.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1597644941c55ce3e59f7ff16f196ac36325179
Jonathan Rajotte [Wed, 26 May 2021 20:39:12 +0000 (16:39 -0400)]
Move xml utils from mi subfolder to xml-utils folder
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I268dc544bf4f72f61a701ac3efd0b12488cc2f64
Jonathan Rajotte [Fri, 28 May 2021 18:32:37 +0000 (14:32 -0400)]
Fix: lttng_triggers count is not equal to the size of the sorted trigger array
Since anonymous triggers can be present in the original lttng_triggers
and that we do not add them to the sorting list, the count to be used
while iterating on the sorted list must be the size of the list itself
and not that of lttng_triggers.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifb1802345199cb20fbb6d401f316be918b8a6443
Jonathan Rajotte [Wed, 26 May 2021 17:04:41 +0000 (13:04 -0400)]
MI: {add, list, remove} trigger
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie16c5c3a894b921e032a99ed3deda4ed5da17e78
Jonathan Rajotte [Fri, 7 May 2021 01:26:17 +0000 (21:26 -0400)]
MI: implement all objects related to trigger machine interface
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idb2045135b1ba87853d6214b149afbe27bb7a1ca
Jonathan Rajotte [Thu, 11 Feb 2021 15:40:39 +0000 (10:40 -0500)]
Move event-expr-to-bytecode to event-expr
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I74a4b823ae7bbcbb062dbb9a2a0f84785bca287a
Jonathan Rajotte [Thu, 11 Feb 2021 15:18:38 +0000 (10:18 -0500)]
Move event-expr from liblttng-ctl to libcommon
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I31c65cd7f63fa4e1c918285b02ab2ab2e82549f6
Jonathan Rajotte [Thu, 4 Feb 2021 20:57:56 +0000 (15:57 -0500)]
MI: support double element
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I97411fea238d8b1275028d3d04a6f4f376624001
Jérémie Galarneau [Wed, 16 Jun 2021 19:08:21 +0000 (15:08 -0400)]
Fix: rotation client example: leak of handle on error
1452927 Resource leak
The system resource will not be reclaimed and reused, reducing the
future availability of the resource.
In setup_session: Leak of memory or pointers to system
resources (CWE-404)
CID
1452927 (#1 of 1): Resource leak (RESOURCE_LEAK)8. leaked_storage:
Variable chan_handle going out of scope leaks the storage it points to
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4c215ac4a86f9f70fd5c9d3aa13f944d3d7a2cc7
Michael Jeanson [Mon, 14 Jun 2021 15:18:19 +0000 (11:18 -0400)]
Silence warnings on GCC 4.8 with -Wmaybe-uninitialized
We still build on SLES12 with GCC 4.8 in which '-Wmaybe-uninitialized'
doesn't seem to be the sharpest tool in the shed. Add explicit
initialization of 'ret' to silence the warnings.
Change-Id: I1f9de535b6be48357735af106ff555ab9eceb730
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Tue, 15 Jun 2021 03:07:32 +0000 (23:07 -0400)]
doc/man/common-footer.txt: add missing non-breaking space
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibefd4e7448920f0f346697eea5e1b5d250a93d1f
Philippe Proulx [Tue, 15 Jun 2021 02:52:02 +0000 (22:52 -0400)]
Rename "tracing session" -> "recording session"
Starting from LTTng 2.13, _tracing_ is defined as attempting to execute
one or more actions when emitting an event, which is very close to the
trigger definition.
To highlight that a tracing session is only about event recording,
rename this concept to _recording session_.
This patch mostly changes the manual pages, although I also updated some
C source and other files which contain user-facing text to use the new
term.
I didn't update logging messages because debugging scripts could still
refer to "tracing sessions".
The lttng-concepts(7) manual page mentions that the "recording session"
term was "tracing session" before LTTng 2.13.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I620d6b6be9e0f1dac14c0bc5e26094c3b3711c75
Philippe Proulx [Mon, 14 Jun 2021 17:05:37 +0000 (13:05 -0400)]
doc/man: use double quotes when referring to internal section
This patch adds double quotes to all the manual page internal section
references using their full name. Those references often have the
following AsciiDoc form:
See the <<id,Full section name>> section below.
With this patch, this would be converted to:
See the ``<<id,Full section name>>'' section below.
In the rendered manual page, before this patch:
See the Full section name section below.
¯¯¯¯ ¯¯¯¯¯¯¯ ¯¯¯¯
With this patch:
See the “Full section name” section below.
The purpose of this patch is, thanks to the change in
`doc/man/manpage.xsl`, to remove the italic style for the text of
internal links. Because there's no way to create dynamic internal links
in a manual page, this style causes internal links to look weird when
they're not a full section name, for example:
Note that the trigger doesn't need to [...]
¯¯¯¯¯¯¯
The HTML rendering of LTTng-tools manual pages can still benefit from
internal links. This patch makes it possible to add more internal links
without degrading the visual style of manual pages when rendered in a
terminal.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1a5ef7eab7ff1e66c137e16b51a9c9074e43f583
Philippe Proulx [Tue, 18 May 2021 14:14:47 +0000 (10:14 -0400)]
doc/man: update type/domain options for common event rule spec.
This patch updates manual pages to follow the recent `--type` and
`--domain` option changes of the lttng-add-trigger(1) command, which
accepts the common event rule specification options of
lttng-event-rule(7).
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6064734534e773bf4f03b5f1e849b57134583039
Michael Jeanson [Tue, 15 Jun 2021 19:05:16 +0000 (15:05 -0400)]
.gitreview: Set default branch to 'stable-2.13'
Change-Id: Ia321edf68795a2a560b38947d5d888536cae08fa
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 16 Jun 2021 16:10:42 +0000 (12:10 -0400)]
Fix: use of uninitialised bytes valgrind warning
Issue
=====
Valgrind reports usage of uninitialised stack allocated memory:
==
2961363== Thread 9 Client manageme:
==
2961363== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==
2961363== at 0x521418D: __libc_sendmsg (sendmsg.c:28)
==
2961363== by 0x521418D: sendmsg (sendmsg.c:25)
==
2961363== by 0x53411B: lttcomm_send_unix_sock (unix.c:294)
==
2961363== by 0x48AA8C: send_unix_sock (client.c:896)
==
2961363== by 0x484F45: thread_manage_clients (client.c:2865)
==
2961363== by 0x480FB4: launch_thread (thread.c:66)
==
2961363== by 0x5208608: start_thread (pthread_create.c:477)
==
2961363== by 0x5346292: clone (clone.S:95)
==
2961363== Address 0x7575389 is 25 bytes inside a block of size 16,384 alloc'd
==
2961363== at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==
2961363== by 0x4EB618: lttng_dynamic_buffer_set_capacity (dynamic-buffer.c:166)
==
2961363== by 0x4EB52C: lttng_dynamic_buffer_append (dynamic-buffer.c:55)
==
2961363== by 0x48CBA1: setup_lttng_msg (client.c:125)
==
2961363== by 0x48AD70: setup_lttng_msg_no_cmd_header (client.c:860)
==
2961363== by 0x489825: process_client_msg (client.c:2253)
==
2961363== by 0x484A97: thread_manage_clients (client.c:2807)
==
2961363== by 0x480FB4: launch_thread (thread.c:66)
==
2961363== by 0x5208608: start_thread (pthread_create.c:477)
==
2961363== by 0x5346292: clone (clone.S:95)
==
2961363== Uninitialised value was created by a stack allocation
==
2961363== at 0x485FE4: process_client_msg (client.c:928)
After some digging, I found that this warning was caused by the padding
of the `struct lttng_session_list_schedules_return` during the
`LTTNG_SESSION_LIST_ROTATION_SCHEDULES` command.
All the fields are of the stack allocated struct are initialised by the
designated initializer but the padding is not.
These padding bytes are reported by Valgrind as being used
uninitialised.
Fix
===
Remove the padding by adding the LTTNG_PACKED attribute to the nested
structs in `struct lttng_session_list_schedules_return`.
Notes
=====
In light of the actual root cause, this is stacktrace is not really
useful.
The realloc call to grow the buffer makes it hard to find what is the
actual uninitialised stack allocation because Valgrind reports the
realloc call as the problematic site.
I was able to track this issue by adding a "consuming" step in the
`lttng_dynamic_buffer_append()` function. This consuming step would sum
all the bytes of the `buf` parameter so as to force Valgrind to check
each byte and not wait until the `sendmsg()` call. This way, I was able
to get a more precise location of the root cause of the issue.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib4a729575e9117cf95716ad25e1417c833f4232b
Jonathan Rajotte [Mon, 7 Jun 2021 18:21:06 +0000 (14:21 -0400)]
Fix: build: libcommon fd-tracker dependency is not available
Observed issue
==============
A build configured with:
./configure -disable-bin-lttng --disable-bin-lttng-crash --disable-bin-lttng-sessiond --disable-bin-lttng-relayd
Fails at build time with:
make[3]: *** No rule to make target '../../src/common/fd-tracker/libfd-tracker.la', needed by 'libcommon.la'. Stop.
make[3]: *** Waiting for unfinished jobs....
CC lttng-elf.lo
Cause
=====
fd-tracker is required by libcommon. This is introduced by commit
8bb66c3cd60938352927ee865759433387324250 [1]
Build of libfd-tracker is disabled at the configure level by
build_lib_fd_tracker which in turn is enabled/disabled by the
--enable/disable-bin-* options.
For the observed issue, the --enable-bin-lttng-consumerd alone does not
enable the build of libfd-tracker.
Solution
========
All dependencies for libcommon are now always built. All bins require
libcommon to be present anyway.
This patch also fix a problem where the examples under the doc are build
even if liblttng-ctl is not built.
Known drawbacks
=========
None.
References
==========
[1]
http://git.lttng.org/?p=lttng-tools.git;a=commit;h=
8bb66c3cd60938352927ee865759433387324250
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I94f5d7cdadcb4f8ff9c2617a675659c1f9eb4709
Jérémie Galarneau [Wed, 9 Jun 2021 22:04:28 +0000 (18:04 -0400)]
Clean-up: mark lttng_error_query communication header as const
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I166ef90aee0d4d7da9ce1002cbbe2a35eea88757
Jérémie Galarneau [Wed, 9 Jun 2021 21:58:45 +0000 (17:58 -0400)]
Add condition-targeting error query
Notifications discarded by the tracers are reported at the level of a
trigger. As those errors are specific to triggers with an "event-rule
matches" condition, they should be reported through a condition-specific
error query.
Note that a condition error query is created from a trigger: there is no
ambiguity since, unlike actions, conditions cannot be nested.
Given the proximity of the final 2.13 release, the code which populated
trigger error query results is simply used to populate the condition
error query results when the condition is of type "event-rule matches".
No trigger-scope errors can be reported for the moment. However, such
error reports will be added in the future.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1ac3668142041beb6fd61574ccef506707c55b2
Francis Deslauriers [Tue, 8 Jun 2021 21:21:30 +0000 (17:21 -0400)]
action list: missing renames from previous name "group"
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4373984c2bea96dc67880b1bbb361fb8fbc014ca
Francis Deslauriers [Tue, 8 Jun 2021 20:14:49 +0000 (16:14 -0400)]
Cleanup: ust-app: simplify ust_app_synchronize() error paths
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7fd3636dfb1370ebe224aa2e200189b2fe99002a
Francis Deslauriers [Tue, 8 Jun 2021 19:15:00 +0000 (15:15 -0400)]
Fix: double mutex_unlock() if session is deleted
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9e640396c12496b6d6c191e838dab138679d5f5e
Francis Deslauriers [Fri, 28 May 2021 16:40:07 +0000 (12:40 -0400)]
Fix: out of sync lttng_ust_ctl_sigbug_handle() prototype
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3aa3fc711750be320433a2f4d1e7d49d47d71c44
Francis Deslauriers [Fri, 28 May 2021 20:06:09 +0000 (16:06 -0400)]
Fix: appending unallocated data from beyond exclusion entries
Issue
=====
If an exclusion string is smaller than the `LTTNG_SYMBOL_NAME_LEN`
integer, the `lttng_dynamic_buffer_append()` call will append
unallocated data to the buffer.
Fix
===
Use the `exclusion_len` value to copy the actual exclusion and pad the
remaining bytes with zeros.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I04c6681c28e82de29791541eb490158db9e503d0
Francis Deslauriers [Wed, 2 Jun 2021 19:28:24 +0000 (15:28 -0400)]
Tests: remove leftover temporary files
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie9249820643158c1572c9ce45507d1cba36bf6f7
Philippe Proulx [Thu, 10 Jun 2021 21:45:29 +0000 (17:45 -0400)]
lttng-disable-channel(1): fix typo
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I37f733bd48d17f2e6de75038918cabd95375e4e2
This page took 0.058701 seconds and 4 git commands to generate.