I get the following warning when compiling with `-Wall -Werror` with gcc
9.3.0:
In file included from ../../src/common/macros.h:15,
from ../../src/common/lttng-kernel.h:14,
from ../../src/bin/lttng-sessiond/trace-kernel.h:14,
from test_kernel_data.c:16:
In function ‘lttng_strnlen’,
inlined from ‘lttng_strncpy’ at ../../src/common/macros.h:120:6,
inlined from ‘test_create_kernel_event’ at test_kernel_data.c:136:2:
../../src/common/compat/string.h:28:8: error: ‘memchr’ reading 256 bytes from a region of size 11 [-Werror=stringop-overflow=]
28 | end = memchr(str, 0, max);
| ^~~~~~~~~~~~~~~~~~~
Fix this warning by using the RANDOM_STRING_LEN value as the max number
of bytes to copy.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: I61752ee17163c4d642aad21b296c0fc4fad5b7a6
(cherry picked from commit 1dd622b10db0821d77490c937caee80c65332f14) Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: load: incomplete error handling for load_session_from_file
This fix is adapted from a fix against the stable-2.11 branch. The
commit message of the stable-2.11 branch follows.
An equivalent fix was already in place in `load_session_from_path()`,
but the same problem as the stable-2.11 branch is present in
`load_session_from_file()`.
Here we load a xml created by lttng-tools-2.12 and try to load it using
lttng-tools 2.11. We expect this to fail on the load.
The command report an error on the stderr but the command return code
value is zero.
From lttng-ivc test runtime.log:
Command #0
Return value: 0
Command: lttng load --input-path=/home/joraj/lttng/lttng-ivc/.tox/py3/tmp/test_save_load_blocking_timeou0/save_load saved_trace
STDOUT:
Session saved_trace has been loaded successfully
STDERR:
XML Error: Element 'process_attr_trackers': This element is not expected.
Error: Session configuration file validation failed
Cause
=====
The error coming from load_session_from_file is not handled correctly.
Solution
========
Rework error handling in load_session_from_path and
load_session_from_file.
LTTNG_ERR_LOAD_SESSION_NOENT is NOT an error when session_name is
specified in load_session_from_path. In this scenario, we are actively
looking for the configuration of the session.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Ic68c253aa194bf8ab72c3c271f10d443118bdeee
Fix: lttng-load: support legacy PID tracker specification
The 2.12 release changes the way tracked process attributes are
expressed in the MI and save/restore formats. While the MI schema was
bumped to 4.0, the save/restore schema only undergoes a minor bump to
accomodate existing users.
The original commit introducing this change justified the breaking
change as saved PIDs being fairly unlikely. However, even the
'INCLUDE_ALL' policy will specify a 'trackers' node, which no longer
existed and made all existing configurations incompatible.
A legacy load path is introduced to support the former PID tracker
serialization format and preserve the compatibility with existing
configurations.
Configurations generated by the 2.11 releases are included to test
this new legacy path.
Since the function is not static and has no previous declaration, the
diagnostic is emitted. We can see that they have fixed it in SWIG
3.0.12, where the same function is defined as:
Those comments are not relevant anymore and were missed by the following
clean-up commit:
commit 6dd26587e926671cdec2545b7d3db74cbd6a7cd8
Author: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Date: Thu Jan 30 12:01:57 2020 -0500
lttng-view: clean-up: remove commented and unused references to lttv
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Ie68614c8160f62a2c045153b377f036a4399ad87 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: relayd: harmonize path format in backward-compat mode
Observed issue
==============
Currently, the relay daemon produces the following path formats
depending on the whether a tracepath is provided, the version of the
session daemon peer, and the grouping option specified on launch of
the relay daemon.
Hostname grouping, no custom trace path
pre 2.11: $BASE/$SHOSTNAME/$SESSION-$DATETIME
2.12: $BASE/$SHOSTNAME/$SESSION-$DATETIME
Hostname grouping, custom trace path
pre 2.11: $BASE/$HOSTNAME/$TRACEPATH
2.12: $BASE/$HOSTNAME/$TRACEPATH
Tracing session grouping, no custom trace path
pre 2.11: $BASE/$SESSION/$HOSTNAME-$DATETIME
2.12: $BASE/$SESSION/$HOSTNAME-$DATETIME
As you can see, there is a single case where the format
diverges based on the version of the session daemon.
Cause
=====
Pre-2.11 session daemons do not transmit a session creation time when
a TRACEPATH is specified as part of the streaming url (e.g.
`lttng create my_session --set-url net://localhost/a_path`)
Hence, the backward compatibility path formatting code does not
insert a "DATETIME" string in the resulting path.
Solution
========
The relay daemon samples the time when it creates its session and that
time is formatted into the DATETIME representation if no DATETIME is
present in the path provided by pre-2.11 peers.
Drawbacks
=========
Sampling the relay session creation time will not yield the exact same
behaviour as what a 2.11+ peer would produce, but it is a reasonable
approximation for most use-cases.
Users depending on this time being the exact same as that sampled by
the session daemon will need to adapt tools anyhow if they use the new
--group-output-by-session option, so the change doesn't introduce more
problems.
This behaviour can be surprising when snapshots are streamed by
pre-2.11 peers as the session creation DATETIME will be different for
all snapshots. This is not ideal, but still less jarring than getting
a completely different path format depending on a peer's version.
The session schema has changed to include new tracked process
attributes. This change is technically non-backward compatible if
tracked PIDs were saved.
As it is extremely unlikely that anyone saves PIDs to a session
configuration to load it in another lttng-tools version, the schema
does not allow the pre-2.12 PID tracking specification to be
expressed. In my opinion, this is unlikely enough not to warrant a
major version bump.
This can be changed if someone really encounters this in the wild or
has a legitimate use case for going through this trouble.
Simon Marchi [Tue, 31 Mar 2020 03:24:24 +0000 (23:24 -0400)]
Fix: filter-grammar-test: add dependencies between steps
If the user specifies only the -B switch of the filter-grammar-test
program, the program will try to print the bytecode without having
generated it first, leaving to a segfault. Similarly, if the user
specifies only the -b switch, the program will try to generate the
bytecode from the IR, without having generated the IR first, also
leading to a segfault.
This patch adds some kind of dependency between the steps, such that if
the user is interested in a particular step (let's say, print the
bytecode), all the required steps will also be done.
Change-Id: Idc365a12b992a950566f227759fd3223cf5c5fd8 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: relayd: assertion fails on creation of session by peer < 2.11
Observed issue
==============
An assertion that a chunk has no active directory handle fails when
creating an anonymous chunk. More specifically, this occurs when
associating an fd tracker to the newly created anonymous trace chunk.
This occurs when a session is created by a peer that is older than
2.11.
Cause
=====
Trace chunks that should monitor their file descriptors with a file
descriptor tracker must be associated with the tracker before any
other operation occurs on the chunk. This is to ensure that "raw" file
descriptors are not created when they were meant to be tracked.
Here, the credentials and session output directory are set before the
file descriptor tracker was provided to the anonymous chunk which is a
breach of the API contract (enforced by the assert()).
Solution
========
Associate the fd tracker immediately to the anonymous chunk before
providing it with a reference to the file descriptor
tracker. Moreover, a leak of the output_directory is prevented by not
setting it to NULL. The trace chunk will acquire a reference to the
trace chunk; it is not transferred to the trace chunk.
Note
====
The problem was introduced during the 2.12 release cycle (clear
feature); this doesn't need to be backported.
Fix: relayd: crash on creation of session by peer < 2.11
Observed issue
==============
A NULL pointer dereference occurs during the creation of
a session that is associated with a peer older than 2.11.
The resulting backtrace follows:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000564af45b755b in lttng_trace_chunk_set_as_owner (chunk=0x7f8ca8004730, session_output_directory=0x7f8ca8004680) at trace-chunk.c:1033
1033 if (chunk->path[0] != '\0') {
[Current thread is 1 (Thread 0x7f8cb808d700 (LWP 7300))]
#0 0x0000564af45b755b in lttng_trace_chunk_set_as_owner (chunk=0x7f8ca8004730, session_output_directory=0x7f8ca8004680) at trace-chunk.c:1033
#1 0x0000564af45a6a78 in session_set_anonymous_chunk (session=0x7f8ca8001380) at session.c:229
#2 session_create (session_name=<optimized out>, hostname=<optimized out>, base_path=<optimized out>, live_timer=<optimized out>, snapshot=<optimized out>,
sessiond_uuid=<optimized out>, id_sessiond=<optimized out>, current_chunk_id=<optimized out>, creation_time=<optimized out>, major=<optimized out>,
minor=<optimized out>, session_name_contains_creation_time=<optimized out>) at session.c:416
#3 0x0000564af459207e in relay_create_session (conn=0x7f8ca0000f60, payload=<optimized out>, recv_hdr=<optimized out>) at main.c:1428
#4 0x0000564af4594f12 in relay_process_control_command (payload=0x7f8cb808c940, header=0x7f8ca0001000, conn=0x7f8ca0000f60) at main.c:3218
#5 relay_process_control_receive_payload (conn=0x7f8ca0000f60) at main.c:3361
#6 0x0000564af45980b0 in relay_process_control (conn=0x7f8ca0000f60) at main.c:3478
#7 relay_thread_worker (data=<optimized out>) at main.c:3927
#8 0x00007f8cbba9a46f in start_thread () from /usr/lib/libpthread.so.0
#9 0x00007f8cbb9ca3d3 in clone () from /usr/lib/libc.so.6
Cause
=====
lttng_trace_chunk_set_as_owner() correctly handles the case
where a trace chunk has no output path, but expects the path
to be an empty string rather than being NULL.
This is not correct as an anonymous chunk, created in backward
compatibility mode when interacting with older peers, has no
path; the path is transmitted as part of the streams' attributes
upon their creation.
Solution
========
Simply check for a NULL pointer in the same place where the empty
chunk path string is created. The rest of the code in trace-chunk.c
doesn't assume that the chunk's path is non-NULL.
Note
====
The problem was introduced during the 2.12 release cycle (clear
feature); this doesn't need to be backported.
Fix: consumer: fallback to flush when flush empty is unsupported
Session destruction fails on older (<= 2.8) lttng-modules as the
flush_empty fails on the kernel streams during the quiet rotation.
Fallback to the regular flush as the semantics of regular rotations
are not expected here; we merely want to flush any pending data and
destroy the session.
A logging statement was apparently copy-pasted from the rotation code
and not adapted for the consumer_clear_buffer command and would
indicate that a flush operation failed when a clear operation was
performed.
Fix: sessiond: error reported on session destruction for old modules
The session destruction command will return
-LTTNG_ERR_ROTATION_NOT_AVAILABLE_KERNEL when the kernel tracer
version does not support packet sequence numbers which prevents
rotations from being performed.
It is okay to not perform an implicit rotation in this case since we
know that no rotations have occurred during the session's lifetime (as
it is not supported). Thus, the client/library only needs to stop the
session, wait for pending data, and destroy the session.
Fix: sessiond: erroneous error code returned on rotation failure
`LTTNG_ERR_KERN_CONSUMER_FAIL` is returned by the kernel domain
rotation handling code. This code is associated with a failure to
launch the kernel consumer daemon which is not the case here.
The `LTTNG_ERR_ROTATION_FAIL_CONSUMER` is used instead and returned to
the client.
Fix: lttng-destroy: missing newline on session destruction message
The lost packets/discarded events statistics are printed on the same
line as the session destruction progress message when the session is
stopped as part of the `destroy` command.
This is a consequence of printing the statistics as they are
retrieved; the statistics must be fetched before the destruction,
but the progress indicator is still being printed.
The statistics output is now formatted to a buffer and printed
after the session's destruction has completed.
Fix: tracker: NULL pointer dereference after NULL check
value_view can be NULL and must thus be checked before use.
Moreover, the fix introduced in 1ad5cb59 is erreneous: the
function must validate that either:
- value is a 'name' type, value_view is not null, and not len == 0,
- value is an integer and value_view does not contain more data.
In process_attr_value_from_comm: Pointer is checked against null but
then dereferenced anyway (CWE-476)
Fix: sessiond: NULL pointer dereference after NULL check
The process attribute value deserialization allows the buffer view to
be NULL when the value's type is not USER_NAME nor GROUP_NAME. This is
not checked when ensuring that no string is passed (len == 0) in the
case of integral values.
The trace_ust inclusion set add/remove methods do not jump to the
end label after checking the `tracker` variable. This can result
in a NULL pointer dereference when an invalid process attribute
is specified.
The same problem appears in save_process_attr_trackers() and
process_attr_value_from_comm().
Fix: sessiond: user/group name can be leaked on malformed command
process_attr_value_from_comm() can leak a copy of the user/group
name when the value type is erroneous. This is not reachable in
"normal" execution, but could be triggered by invalid "crafter"
lttng-ctl commands.
In process_attr_value_from_comm: Leak of memory or pointers to
system resources (CWE-404).
Simon Marchi [Wed, 25 Mar 2020 22:40:02 +0000 (18:40 -0400)]
configure: add -Wmissing-declarations, -Wmissing-prototypes, and more
Here's the rationale for each:
- -Wmissing-declarations: Make sure the definition of a function can
"see" a corresponding (usually in a header file), if it isn't static.
This makes sure that the declaration and definition don't go out of
sync, which can lead to hard to debug problems (because it still
builds, but the function doesn't receives what it thinks it receives).
On top of pointing out out-of-sync declarations, it can help point out
that a foo.c file misses including its header foo.c, or that a
function should actually be made static.
- -Wmissing-prototypes: makes sure that functions without parameters are
declared as `foo(void)` instead of `foo()`. In C, the former declares
a function that takes no parameters, whereas the latter declares a
function without specifying its parameters. The latter could be
called with any number of parameters, which is a recipe for confusion.
- -Wmissing-parameter-type, -Wold-style-definition,
-Wold-style-declarations, -Wstrict-prototypes: makes sure there's no
function declared with parameters without types specified, or using
the old style:
int foo(bar)
int bar;
{
...
}
Unlikely, but there's no harm in enabling them.
Change-Id: I7ddf5ff61b4466c0bd7b03485ef29156c399e2a8 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:56 +0000 (18:39 -0400)]
Fix: sessiond: make the --without-lttng-ust version of launch_application_notification_thread static
When building with --without-lttng-ust, a simple version of
launch_application_notification_thread, implemented in the header file,
is used. We get this warning:
CC main.o
In file included from /home/simark/src/lttng-tools/src/bin/lttng-sessiond/main.c:61:
/home/simark/src/lttng-tools/src/bin/lttng-sessiond/notify-apps.h:17:6: error: no previous prototype for ‘launch_application_notification_thread’ [-Werror=missing-prototypes]
17 | bool launch_application_notification_thread(int apps_cmd_notify_pipe_read_fd)
| ^~~~~~~
Make the function `static inline` to avoid that. The `inline` is not
strictly required here, but if that header ended up included by some
other source file that didn't use
launch_application_notification_thread, we would get a -Wunused-function
warning. The `inline` avoids that.
Change-Id: I19605e0594af0d7997951def2da3a6313bf65e11 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
I've fixed the -Wmissing-declarations warning in callsites.c by adding a local
declaration. That was wrong, since there is actually a callsites.h header file
that needs to be included, which contains the declaration. This is nicely
pointed out when building with clang and -Wstrict-prototypes:
CC exec_with_callsites-multi-lib-test.o
In file included from /home/simark/src/lttng-tools/tests/regression/ust/multi-lib/multi-lib-test.c:15:
/home/simark/src/lttng-tools/tests/regression/ust/multi-lib/callsites.h:10:21: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
void call_tracepoint();
^
void
Remove the local declaration and include callsites.h in callsites.c.
Change-Id: Ib656d96c2ed3b389697a2022e343e98ac0b66447 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:48 +0000 (18:39 -0400)]
Fix: relayd: cast idigit argument to unsigned char
This diagnostic is emitted when building on cygwin:
main.c:233:34: warning: array subscript has type ‘char’ [-Wchar-subscripts]
233 | if (errno != 0 || !isdigit(arg[0])) {
| ~~~^~~
This is due to passing a `char` argument to isdigit. According to the
man page of isdigit, the arguments of type `char` must be cast to
`unsigned char`, so they are able to represent the EOF value. This
patch does that, and should get rid of the warning.
Change-Id: Iaed4c0b494a79b917761e65f18388f43478b97e1 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:39 +0000 (18:39 -0400)]
Fix: tests: add `void` parameter to functions that take no parameters
In C, empty parenthesis declare a function without a prorotype (without
specifying its parameters). This is not the same as having a `void`
parameter, which declares a function which has no parameters.
It's safer to use the later, otherwise it makes it possible to
erroneously call the function with some arguments.
Change this `test_function` to add `void`. It fixes diagnostics like:
CC userspace-probe-elf-binary.o
/home/simark/src/lttng-tools/tests/utils/testapp/userspace-probe-elf-binary/userspace-probe-elf-binary.c:12:33: error: no previous prototype for ‘test_function’ [-Werror=missing-prototypes]
12 | void __attribute__ ((noinline)) test_function()
|
Change-Id: Iceb7636e44d45f51889667ec76f2c04c032b5df8 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:35 +0000 (18:39 -0400)]
Fix: common: make lttng_trace_chunk_remove_subdirectory_recursive static
This function is only used in this file, make it static. Fixes this
diagnostic:
CC trace-chunk.lo
/home/simark/src/lttng-tools/src/common/trace-chunk.c:1498:5: error: no previous prototype for ‘lttng_trace_chunk_remove_subdirectory_recursive’ [-Werror=missing-prototypes]
1498 | int lttng_trace_chunk_remove_subdirectory_recursive(struct lttng_trace_chunk *chunk,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change-Id: I19e03b64557ce845b94f02fcf0d6fbafba654d95 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:30 +0000 (18:39 -0400)]
Fix: common: add `void` parameter to log_add_time declaration
It makes it match the definition and fixes this warning:
CC error.lo
/home/smarchi/src/lttng-tools/src/common/error.c:33:13: error: no previous prototype for ‘log_add_time’ [-Werror=missing-prototypes]
const char *log_add_time(void)
^~~~~~~~~~~~
Change-Id: Ibbf5eebd8171504bc7050c754e2a5d113b6b5387 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
CONTRIBUTING.md: clarify the guidelines for commit messages
In order to streamline the code review process, I am adding a more
detailed explanation of the desired commit message format.
Of note, the commit title format `Fix: sub-system`, used informally
for a couple of months, is adopted for bug fixes. A template of the
sections expected in the commit message body of those patches is also
included.
More general guidelines are also added for feature contributions.
Fix: lttng-list: don't warn when the kernel domain has no channels
Some commands beside lttng-enable-channel have the side-effect of
creating a domain. For instance, the lttng-track and lttng-untrack
commands will implicitly create their target domains if they don't
exist. Thus, it is not unexpected for a domain to exist without
channels.
Currently, tracking process attributes in the user space and kernel
domains will result in a warning being printed when lttng-status
(or lttng-list `the_session`) is invoked.
Example output:
Tracing session arielle_bolduc: [inactive]
Trace output: /home/jgalar/lttng-traces/arielle_bolduc-20200323-191128
=== Domain: Linux kernel ===
Tracked process attributes
Process IDs: all
Virtual Process IDs: 12365, 526, 41
User IDs: all
Virtual User IDs: all
Group IDs: all
Virtual Group IDs: all
Warning: No kernel channel
=== Domain: User space ===
Buffering scheme: per-user
Tracked process attributes
Virtual Process IDs: 12365, 526, 41
Virtual User IDs: all
Virtual Group IDs: all
The warning is removed since it can only confuse users.
Refactor: lttng-ctl: follow terminology of the tracker documentation
This commit harmonizes the process attribute tracker API and
serialization formats (save/restore and MI) with the documentation
with regards to the terminology used.
The message of the parent commit adjusting the manual pages of the
lttng-track and lttng-untrack commands details those terminology
changes and their rationale.
Some problems with the API introduced during the 2.12 development
cycle are also adressed:
Type safety:
- The process attribute tracker is made type safe with regards
to the platform's native types to express process
attributes. Where the original API casted all integral values to
integers, this change introduces accessors for all process
attribute types (pid_t, uid_t, gid_t). This makes it easier to
use the API safely and without producing warnings in user's
code.
Another benefit of adopting this explicit type-safe approach is
that is will make it easier to add new attributes which are not
expressible (or non-ambiguously expressible) using `int` and
`string` types (e.g. tracking a virtual PID within a given
namespace).
Ambiguity of INCLUDE_ALL, EXCLUDE_ALL, and INCLUDE_SET states:
- The original tracker API has a notion of 'enabled' pid_tracker
which is confusing to users:
- enable = 0: everything is tracked,
- enable = 1: a list of tracked pids is provided, which may be
empty.
- pid '-1' is *special* and tracks or untracks everything.
This was replaced with a 'special' opaque value meaning 'ALL'
which, while being clearer, was still confusing and hard to
document.
The revised API explicitly expresses the notion of a tracking
policy (`enum lttng_tracking_policy`). When that policy is set
to `LTTNG_TRACKING_POLICY_INCLUDE_SET`, the inclusion set can
be queried and/or mutated.
On top of being clearer, this aligns more closely with the
internal lttng-sessiond daemon API which gets rid of a lot
of code to handle those special cases. The resulting code is
more verbose, but a lot easier to understand.
Moreover, the types introduced (e.g. lttng_process_attr_values)
are meant to be re-used if a new
`LTTNG_TRACKING_POLICY_EXCLUDE_SET` tracking policy is added in
the future.
Documentation:
- The revised API includes a complete documentation. It documents
the API usage, but also adds implementation notes such explicitly
mentionning when/where user names and group names are resolved.
Client:
- While making the changes to use this new API, some error messages
are clarified (or added). The resulting output when listing the
trackers was also changed to be more compact.
The CLI output now also makes use of the terminology used in
the documentation for all commands interacting with process
attribute trackers.
It is now also possible to specify multiple process attribute
trackers along with the --all option for the lttng-track and
lttng-untrack command. For instance: `lttng tracker --userspace
--vpid --vuid --all` is now allowed.
The same process attribute tracker can also be specified more than
once in a command, as follows:
`lttng track --userspace --vpid 43,11 --vpid 55,77`
Since the serialization had been changed during the 2.12 cycle, I
changed them further to use the API's terminology in the element
names.
Docs: overhaul of lttng-track(1) and lttng-untrack(1)
- Replace the term whitelist by "inclusion set"
The use of whitelist (and its counterpart, blacklist) is somewhat
controversial and doesn't align with the terms used in the tracker
API.
The "inclusion set" nomenclature is inspired by that used by ACL,
namely "include list" and "exclude list". Set is preferred to list
as there is no implied notion of order.
- Adapt process attribute descriptions
The process attribute descriptions were apparently copy-pasted from
the original PID description (all are referencing "process IDs").
A suitable description is added for each process attribute.
- Add new process attributes to lttng-untrack(1)
The lttng-untrack man page was not updated when the resource types
introduced in LTTng 2.12 were added to lttng-track(1).
- Group all process attributes with their 'virtual' counter-parts
The order in which process attributes were described was: pid, uid,
gid, vpid, vuid, vgid.
Grouping process attributes with their virtual counterpart (e.g. pid
followed by vpid) makes it easier to understand the difference at a
glance.
The order becomes: pid, vpid, uid, vuid, gid, vgid
- Remove all PID-specific wordings in lttng-untrack(1)
Fix: MI: bump MI schema version to 4.0 in mi-lttng.c
The MI schema has been bumped to 4.0, but the internal constant
in mi-lttng.c still indicates a 3.0 schema. Synchronize both
versions to 4.0 for the 2.12 release.
Fix: sessiond: occasional badfd error on repeated SIGTERM
The session daemon occasionally prints the following messages
when it received multiple SIGTERM signals:
PERROR - 16:50:18.505585257 [49845/49845]: write poll pipe: Bad file
descriptor (in notify_thread_pipe() at utils.c:35)
This is caused by a (somewhat inevitable) race between the teardown of
the daemon and the closing of its quit pipe. This happens more often
when kernel modules take a long time to be unloaded and the user
spams ctrl+c in the hope of convincing the daemon process to close
faster since modules are unloaded after closing the quit pipe.
Setting closed pipe fds to '-1' is safe anyway and is already
handled by the notify_thread_pipe() util.
Fix: lttng: incorrect domain list printed when no domain is provided
The following commands make use of a common utility function to
validate the count of domains specified and print an error when it
is invalid:
- lttng-enable-event,
- lttng-disable-event,
- lttng-track,
- lttng-untrack,
- lttng-add-context,
- lttng-enable-channel,
- lttng-disable-channel.
Those commands do not allow the same domains to be used. In fact, they
all expect --kernel or --userspace only, except for the
lttng-enable-event, lttng-disable-event, and lttng-add-context
commands which allow the --log4j, --jul, and --python domain options
to be used.
Currently, the error message when no domain is specified is incorrect
for all of those commands. The error reads as follows:
`Error: Please specify a domain (-k/-u/-j).`
For most commands, the -j option cannot be used. For those that allow
agent domains, the message is missing the -l and -p domains.
This ensures that the expected domains are printed for each of those
commands.
Moreover, the message is clarified by using the long form option
names.
Issue
=====
Some testcases related to `--live` sessions in
`tools/clear/{test_ust,test_kernel}` fail with the following error:
ls: cannot access '/tmp/tmp.gi1eVM3fZ9/tmp.dNL0WHRTQ3/ci-node-bionic-amd64-02-04/LUP49x7iV7xexehS*': No such file or directory
not ok 61 - Failed to list content of directory "/tmp/tmp.gi1eVM3fZ9/tmp.dNL0WHRTQ3/ci-node-bionic-amd64-02-04/LUP49x7iV7xexehS*"
FAIL: tools/clear/test_ust 61 - Failed to list content of directory "/tmp/tmp.gi1eVM3fZ9/tmp.dNL0WHRTQ3/ci-node-bionic-amd64-02-04/LUP49x7iV7xexehS*"
# Failed test 'Failed to list content of directory "/tmp/tmp.gi1eVM3fZ9/tmp.dNL0WHRTQ3/ci-node-bionic-amd64-02-04/LUP49x7iV7xexehS*"'
# in ./tools/clear//../../../utils/tap/tap.sh:fail() at line 159.
They fail because the wildcard character `*` is used in a path and is
not expanded when comes the time to list the content of the expanded
path.
In `--live` related tests, we use a wildcard in those paths because we
can't know the full name of the output directory as it's named by the
sessiond.
This bug was introduced (or rather not fixed) by the following commit:
commit 94360c17201a28466af49058735166c73f9ae130
Author: Francis Deslauriers <francis.deslauriers@efficios.com>
Date: Fri Mar 6 18:18:14 2020 -0500
Issues
======
1. Both functions aim to do the same thing.
2. The current `validate_folder_is_empty` function uses an erroneous
way of checking if a folder is empty. `ls -A` returns zero on
success; it doesn't return the number of files and sub-folders. A
correct way to get the number of elements in a directory is to pipe
the output of `ls -A` to `wc -l`.
3. The `local_path` variable is used undefined in the current
`validate_directory_empty` function.
It just happens that the `local_path` variable is defined in all
but one of the callsites of this function so those calls work as
expected.
The other call to this function was
bogus (tests/regression/tools/clear/test_ust:323) because it uses
the `$TRACE_PATH` variable name.
Fixes
=====
- Merge both functions into one, keeping the name
`validate_directory_empty`,
- Fix the emptiness check,
- Fix usages of `validate_directory_empty` in tests.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Id75baf16f42866e3b978389f9aada4a2f6b6f2ae Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Thu, 13 Feb 2020 15:35:34 +0000 (10:35 -0500)]
Ignore -Wincomplete-setjmp-declaration warnings
We currently get this when building a test that requires UST:
make[4]: Entering directory '/home/smarchi/build/lttng-tools-clang/tests/regression/ust/linking'
CC tp.lo
In file included from /home/smarchi/src/lttng-tools/tests/regression/ust/linking/tp.c:11:
In file included from /home/smarchi/src/lttng-tools/tests/regression/ust/linking/./ust_tests_demo.h:38:
In file included from /home/smarchi/install/include/lttng/tracepoint-event.h:58:
In file included from /home/smarchi/install/include/lttng/ust-tracepoint-event.h:28:
In file included from /home/smarchi/install/include/lttng/ust-events.h:41:
/usr/include/pthread.h:744:12: error: declaration of built-in
function '__sigsetjmp' requires the declaration of the 'jmp_buf'
type, commonly provided in the header
<setjmp.h>. [-Werror,-Wincomplete-setjmp-declaration]
extern int __sigsetjmp (struct __jmp_buf_tag *__env, int __savemask) __THROWNL;
^
I'm not sure what we can do about it, and I believe the warning is
bogus. I do have a definition for the "jmp_buf" type in
/usr/include/setjmp.h:
typedef struct __jmp_buf_tag jmp_buf[1];
And even with I include <setjmp.h>, the warning does not go away. I'm
not sure if that's right, but this patch disables the warning.
Change-Id: Ibe7451dc0afc9aaca59431296d42011d9e4306f9 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Thu, 13 Feb 2020 15:36:04 +0000 (10:36 -0500)]
tests: put -no-pie in LDFLAGS instead of CFLAGS
When building with clang, we get:
make[4]: Entering directory
'/home/smarchi/build/lttng-tools-clang/tests/utils/testapp/gen-syscall-events-callstack'
CC gen-syscall-events-callstack.o
clang: error: argument unused during compilation: '-no-pie'
[-Werror,-Wunused-command-line-argument]
Indeed, -no-pie should be in LDFLAGS, not CFLAGS.
To make sure that Makefiles can use `AM_LDFLAGS += ...`, I have added
an AC_SUBST for AM_LDFLAGS in configure.ac. This makes it so that if
we ever set AM_LDFLAGS for real in configure.ac, the Makefiles won't
inadvertently overwrite that value.
Change-Id: I7bad985bb135e50750917db6a928f2705a85b445 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 20:32:19 +0000 (15:32 -0500)]
Fix -Wmissing-declarations warnings in filter-parser.y
Make gc_string_append and yyerror static, as they are only used in
this file.
Remove yywrap, apparently, it's not used. This probably has to do
with the fact that our lexer uses the noyywrap option.
setstring is used by filter-lexer.c (generated from filter-lexer.l),
which has its own local setstring declaration. This is not very good,
because the prototype of setstring could change in the implementation,
and the users of the function would still compile, but then it would
probably crash inexplicably at runtime.
Instead, put the declaration in filter-parser.h, using a "%code
provides" directive. That seems to be the intent of "%code provides"
[1]:
Purpose: This is the best place to write additional definitions and
declarations that should be provided to other modules.
Simon Marchi [Mon, 25 Nov 2019 20:39:22 +0000 (15:39 -0500)]
Include cmd-2-2.h in cmd-2-1.h
Fixes:
CC cmd-2-2.o
/home/smarchi/src/lttng-tools/src/bin/lttng-relayd/cmd-2-2.c:36:5:
error: no previous declaration for ‘cmd_recv_stream_2_2’
[-Werror=missing-declarations] int cmd_recv_stream_2_2(const
struct lttng_buffer_view *payload, ^~~~~~~~~~~~~~~~~~~
... and helps ensure that the declarations are in sync with the
definitions.
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I889d7306c157b0d68d24bf610fce18c8949422d8 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 21:00:56 +0000 (16:00 -0500)]
Make create_file function static in gen-ust-tracef.c
Fixes:
CC gen-ust-tracef.o
/home/smarchi/src/lttng-tools/tests/utils/testapp/gen-ust-tracef/gen-ust-tracef.c:36:6:
error: no previous declaration for ‘create_file’
[-Werror=missing-declarations] void create_file(const char *path)
^~~~~~~~~~~
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I12359b30054da609be2aca998de960719120083e Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 21:16:32 +0000 (16:16 -0500)]
Make remove_file_from_hierarchy function static in test_directory_handle.c
Fixes:
CC test_directory_handle.o
/home/smarchi/src/lttng-tools/tests/unit/test_directory_handle.c:139:5:
error: no previous declaration for ‘remove_file_from_hierarchy’
[-Werror=missing-declarations] int
remove_file_from_hierarchy(struct
lttng_directory_handle *test_dir_handle,
^~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I060a89d88f2de8e12368496968708d273cb318f2 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 21:17:37 +0000 (16:17 -0500)]
Make fd_count function static in test_fd_tracker.c
Fixes:
CC test_fd_tracker.o
/home/smarchi/src/lttng-tools/tests/unit/test_fd_tracker.c:65:5:
error: no previous declaration for ‘fd_count’
[-Werror=missing-declarations] int fd_count(void) ^~~~~~~~
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I02775d5d601211052b4cad323bc33623beabc8f9 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 21:27:13 +0000 (16:27 -0500)]
Make functions in live_test.c static
Fixes:
CC live_test.o
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:154:5:
error: no previous declaration for ‘establish_connection’
[-Werror=missing-declarations] int establish_connection(void)
^~~~~~~~~~~~~~~~~~~~
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:198:5:
error: no previous declaration for ‘list_sessions’
[-Werror=missing-declarations] int
list_sessions(uint64_t *session_id) ^~~~~~~~~~~~~
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:245:5:
error: no previous declaration for ‘create_viewer_session’
[-Werror=missing-declarations] int create_viewer_session(void)
^~~~~~~~~~~~~~~~~~~~~
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:283:5:
error: no previous declaration for ‘attach_session’
[-Werror=missing-declarations] int attach_session(uint64_t id)
^~~~~~~~~~~~~~
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:372:5:
error: no previous declaration for ‘get_metadata’
[-Werror=missing-declarations] int get_metadata(void) ^~~~~~~~~~~~
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:466:5:
error: no previous declaration for ‘get_next_index’
[-Werror=missing-declarations] int get_next_index(void)
^~~~~~~~~~~~~~
/home/smarchi/src/lttng-tools/tests/regression/tools/live/live_test.c:631:5:
error: no previous declaration for ‘detach_viewer_session’
[-Werror=missing-declarations] int detach_viewer_session(uint64_t
id) ^~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I7dc82076439e1bccd7bc253d084291f6281f6887 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 21:29:04 +0000 (16:29 -0500)]
Make parse_arguments static in base_client.c
Fixes:
CC base_client.o
/home/smarchi/src/lttng-tools/tests/regression/tools/notification/base_client.c:60:5:
error: no previous declaration for ‘parse_arguments’
[-Werror=missing-declarations] int parse_arguments(char **argv) {
^~~~~~~~~~~~~~~
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I4e431009d6e174eaadd82d206f9bedde9efdd30b Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 21:41:29 +0000 (16:41 -0500)]
Fix all -Wmissing-declarations warning instances
This fixes all the remaining -Wmissing-declarations warning occurences.
The changes either:
- Make functions static
- Remove unused functions
- Add declarations for functions that are meant to be exported in a
shared object to override some other function (e.g.
lttng_ust_clock_plugin_init). There isn't really a better place for
these declarations.
Change-Id: If08855b75bf44dfdcfbdd654c272474ad8ebef39 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: liblttng-ctl: hide new tracker config symbols
Session configuration constants are supposed to remain internal.
Some symbols were errneously exported in previous releases, but
we mark new ones as "hidden".
Simon Marchi [Tue, 11 Feb 2020 22:29:09 +0000 (17:29 -0500)]
configure: add --enable-Werror
If one wants to build with -Werror, it's not possible to simply
configure with -Werror in the CFLAGS. This makes a bunch of configure
checks fail, which would have otherwise passed.
This patch adds an --enable-Werror option to configure, which has the
effect of adding -Werror to AM_CFLAGS. It therefore ends up in the
CFLAGS used to build the project, but it doesn't interfere with the
configure checks.
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I18c33125c717305aac8f1d8a19fee7e065d70c31 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Fri, 31 Jan 2020 18:05:20 +0000 (13:05 -0500)]
configure: use AX_APPEND_COMPILE_FLAGS to detect supported warning flags
I would eventually like to enable some additional warnings by default
when building lttng-tools. However, some warnings are
compiler-specific or are not present in older versions of some compilers
we need to support. We can therefore not add them unconditionally to
CFLAGS.
This patch uses the AX_APPEND_COMPILE_FLAGS macro to address that. This
macro tests each individual flag we pass it with the current compiler.
If it finds that the flag is supported (the compiler exits with status 0
when compiling a file with that flag), it appends it to the given
variable, WARN_CFLAGS in our case). WARN_CFLAGS is then added to
AM_CFLAGS.
With time, we'll be able to throw any warning flag in there that we
think is useful, even if just available in a recent version of one
compiler.
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: Id2ae4b4e8882af788c835ce89a979544531370e9 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: remove broken health monitoring test `test_thread_exit`
The `test_thread_exit` test uses the testpoint() infrastructure to
simulate daemon threads dying at "random" times by calling
pthread_exit() and checks that dead threads are properly reported by
the health check API.
The health check system is implemented as a list of structures that
live in the TLS of the various monitored threads. When the health
thread receives a health-check request, it iterates on this list and
reports the current health status of the threads to the client.
When a thread is stalled, this works fine; the health TLS has not been
updated in some time and that can be observed by the client. However,
for a 'spurious' thread exit, this is a bit more fragile.
Essentially, the test assumes that the thread will not have
unregistered its health TLS, but that it will remain valid until the
thread is joined.
Unfortunately, this behaviour relies on the fact that threads were not
joined until late in the tear down of the session daemon in the
past. This is no longer the case for all threads.
To provide an example of a test sequence that results in a
crash:
- the test kills the client thread,
- the session daemon received SIGTERM,
- the client thread is joined immediately,
- the next thread to shutdown (gracefully) unregisters itself from
the health monitoring sybsystem and, in doing so, accesses an
invalid element while removing itself from the health_app list,
causing a crash.
As a side note, I could not find a definitive answer on the lifetime
of TLS variables. Are they guaranteed to be accessible until a
pthread_join() or is it undefined to access them after a thread has
returned? I'm guessing this is a very internal implementation detail
of the pthread implementation being used and that we should not rely
on that behaviour.
There are multiple ways we could fix this problem, such as using
heap-allocated structures and ref-counting them to share ownership
with the health thread, using pthread_atexit() to clean-up, etc.
However, the LTTng daemons never pthread_exit() their threads to
handle errors (or even call it directly); handling this behaviour is
of dubious interest at the moment.
Fix: relayd: use of relay_session ref count before initialization
The relay_session's reference count is used before it is initialized
on multiple code paths of session_create(). The initialization of the
reference count, mutexes, and intrusive data structure nodes are
initialized earlier to make their use safe in the event of an error.
Fix: relayd: unchecked return value when opening relay socket
fd_tracker_open_unsuspendable_fd may fail because the underlying
socket() call fails or there may be too many open file descriptors at
the time of the call. In both cases, these errors must be logged and
handled.
Simon Marchi [Mon, 3 Feb 2020 22:38:23 +0000 (17:38 -0500)]
tests: append to AM_CFLAGS instead of overriding it
The Makefiles modified by this patch currently override the AM_CFLAGS
value, which means that anything put in AM_CFLAGS by configure (for
example, the warning flags) is lost.
I believe the intention is to add some flags to CFLAGS, so modify them
such that they append instead of override.
notification/Makefile.am overrides AM_LDFLAGS with nothing. It feels
like it was not the intention to actually clear that variable, but
rather that it is just the by-product of a copy paste. If it was really
the intention to clear the value of AM_LDFLAGS, there would have been a
comment to explain it, right?
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I4ef926d9135b16200e5f17d09461506a5e955068 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>