Jonathan Rajotte [Wed, 17 Jun 2020 19:55:36 +0000 (15:55 -0400)]
Fix: invalid discarded events on start/stop without event production
Observed issue
==============
On consecutive start/stop command sequences, the reported discarded event
count is N * CPU, where N is the number of start/stop pairs executed.
Note that no event generation occurred between each start/stop pair.
lttng start
lttng stop
Tracing stopped for session auto-20200616-094338
lttng start
lttng stop
Waiting for data availability
Warning: 4 events were discarded, please refer to the documentation on channel configuration.
Tracing stopped for session auto-20200616-094338
lttng start
lttng stop
Waiting for data availability
Warning: 8 events were discarded, please refer to the documentation on channel configuration.
Tracing stopped for session auto-20200616-094338
The issue was bisected down to:
commit 6f9449c22eef59294cf1e1dc3610a5cbf14baec0 (HEAD)
Author: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Date: Sun May 10 18:00:26 2020 -0400
consumerd: refactor: split read_subbuf into sub-operations
[...]
Cause
=====
The discarded event local variable in `consumer_stream_update_stats()`
is initialized with the subbuffer sequence count instead of the
subbuffer discarded event count.
Solution
========
Use the subbuffer discarded event count to initialize the variable.
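The mix-up can be illustrated with a minimal sketch. The structures and
the `update_stats()` helper below are hypothetical simplifications, not
the actual consumer daemon code; they only assume that each sub-buffer
carries both a sequence number and a running discarded event count.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a consumed sub-buffer's properties. */
struct subbuf_info {
	uint64_t sequence_number;  /* grows by one per sub-buffer */
	uint64_t events_discarded; /* running total reported by the tracer */
};

/* Hypothetical per-stream accounting state. */
struct stream_stats {
	uint64_t last_discarded_events;
	uint64_t discarded_since_last_report;
};

/* Account for the events discarded since the previous sub-buffer. */
void update_stats(struct stream_stats *stats, const struct subbuf_info *subbuf)
{
	/* The bug amounted to reading subbuf->sequence_number here. */
	const uint64_t discarded = subbuf->events_discarded;

	if (discarded >= stats->last_discarded_events) {
		stats->discarded_since_last_report +=
				discarded - stats->last_discarded_events;
	}
	stats->last_discarded_events = discarded;
}
```

With the sequence number used by mistake, every flushed sub-buffer would
look like one more discarded event, which is consistent with a count
that grows by one per CPU on each start/stop pair.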
Known drawbacks
===============
None
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5ff213d0464cdb591b550f6e610bf15085b18888
Jonathan Rajotte [Wed, 17 Jun 2020 19:05:48 +0000 (15:05 -0400)]
tests: truncate metadata file for regenerate metadata test
Truncating the metadata file ensures that we test the effect of the
regenerate metadata command. Otherwise, we simply test the command's
return value.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I762dc849f69d2cf3fe8bf73c5a77d5c2a4aa4ae5
Jérémie Galarneau [Wed, 17 Jun 2020 16:59:24 +0000 (12:59 -0400)]
Fix: consumerd: user space metadata not regenerated
Observed Issue
==============
The LTTng-IVC tests fail on the `regenerate metadata` tests which
essentially:
- Sets up a user space session
- Enables events
- Traces an application
- Stops tracing
- Validates the trace
- Truncates the metadata file (empties it)
- Starts tracing
- Regenerates the metadata
- Stops the session
- Validates the trace
The last trace validation step fails on an empty file (locally) or
a garbled file (remote).
The in-tree tests did not catch any of this since they essentially don't
test much. They verify that the command works (returns 0) but do not
validate any of its effects.
The issue was bisected down to:
commit 6f9449c22eef59294cf1e1dc3610a5cbf14baec0 (HEAD)
Author: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Date: Sun May 10 18:00:26 2020 -0400
consumerd: refactor: split read_subbuf into sub-operations
[...]
Cause
=====
The commit that introduced the issue refactored the sub-buffer
consumption loop to eliminate code duplications between the user space
and kernel consumer daemons.
In doing so, it eliminated a metadata version check from the consumption
path.
The consumption of a metadata sub-buffer follows these relevant
high-level steps:
- `get` the sub-buffer
- /!\ user space specific /!\
- if the `get` fails, attempt to flush the metadata cache's
contents to the ring-buffer
- populate `stream_subbuffer` properties (size, version, etc.)
- check the sub-buffer's version against the last known metadata
version (pre-consume step)
- if they don't match, a metadata regeneration occurred: reset the
metadata consumed position
- consume (actual write/send)
- `put` sub-buffer
[...]
As shown above, the user space consumer must manage the flushing of the
metadata cache explicitly as opposed to the kernel domain for which the
tracer performs the flushing implicitly through the `get` operation.
When the user space consumer encounters a `get` failure, it checks
whether the metadata cache still holds unflushed contents (consumed
position != cache size) and, if so, flushes the remaining contents.
However, the metadata version could have changed and yielded an
identical cache size: a regeneration without any new metadata will
yield the same cache size.
Since 6f9449c22, the metadata version check is only performed after
a successful `get`. This means that after a regeneration, `get`
never succeeds (there is seemingly nothing to consume), and the
metadata version check is never performed.
Therefore, the metadata stream is never put in the `reset` mode,
effectively not regenerating the data.
Note that producing new metadata (e.g. a newly registering app
announcing new events) would work around the problem here.
Solution
========
Add a metadata version check when failing to `get` a metadata
sub-buffer. This is done in `commit_one_metadata_packet()` when the
cache size is seen to be equal to the consumed position.
When this occurs, `consumer_stream_metadata_set_version()`, a new
consumer stream method, is invoked which sets the new metadata version,
sets the `reset` flag, and discards any previously bucketized metadata.
The metadata cache's consumed position is also reset, allowing the
cache flush to take place.
`metadata_stream_reset_cache()` is renamed to
`metadata_stream_reset_cache_consumed_position()` since its name is
misleading and since it is used as part of the fix.
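The check described in this section can be sketched as follows. The
structures and the `metadata_flush_needed()` helper are hypothetical
reductions of the real consumer state, written only to show the order
of the version check and of the cache flush decision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced state; the real structures hold much more. */
struct metadata_cache {
	uint64_t version; /* bumped by each regeneration */
	uint64_t size;    /* bytes of metadata held in the cache */
};

struct metadata_stream {
	uint64_t version;           /* last version seen by the stream */
	uint64_t consumed_position; /* bytes of the cache already flushed */
	bool reset;                 /* output must be rewritten from scratch */
};

/*
 * Called after a failed `get`. Returns true when cache contents remain
 * to be flushed to the ring-buffer.
 */
bool metadata_flush_needed(struct metadata_stream *stream,
		const struct metadata_cache *cache)
{
	if (stream->consumed_position == cache->size &&
			stream->version != cache->version) {
		/* Regeneration that yielded an identical size: start over. */
		stream->version = cache->version;
		stream->reset = true;
		stream->consumed_position = 0;
	}

	return stream->consumed_position != cache->size;
}
```

Without the version comparison, a fully consumed cache of unchanged size
would report that nothing is left to flush, which is the failure mode
described in the Cause section.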
Known drawbacks
==============
None.
Change-Id: I3b933c8293f409f860074bd49bebd8d1248b6341
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Reported-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Ovidiu Panait [Mon, 18 May 2020 13:39:26 +0000 (16:39 +0300)]
tests: gen-ust-events-ns/tp.h: Fix build with musl libc
Fix the following build error with musl libc:
In file included from ../../../../../lttng-tools-2.12.0/tests/utils/testapp/gen-ust-events-ns/tp.h:14,
from ../../../../../lttng-tools-2.12.0/tests/utils/testapp/gen-ust-events-ns/tp.c:10:
../../../../../lttng-tools-2.12.0/tests/utils/testapp/gen-ust-events-ns/tp.h:17:10: error: unknown type name 'ino_t'; did you mean 'int8_t'?
17 | TP_ARGS(ino_t, ns_ino),
| ^~~~~
../../../../../lttng-tools-2.12.0/tests/utils/testapp/gen-ust-events-ns/tp.h:17:10: error: unknown type name 'ino_t'; did you mean 'int8_t'?
17 | TP_ARGS(ino_t, ns_ino),
| ^~~~~
../../../../../lttng-tools-2.12.0/tests/utils/testapp/gen-ust-events-ns/./tp.h:17:2: error: unknown type name 'ino_t'; did you mean 'int8_t'?
17 | TP_ARGS(ino_t, ns_ino),
| ^~~~~~~
../../../../../lttng-tools-2.12.0/tests/utils/testapp/gen-ust-events-ns/./tp.h:17:2: error: unknown type name 'ino_t'; did you mean 'int8_t'?
17 | TP_ARGS(ino_t, ns_ino),
| ^~~~~~~
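The message does not spell out the change itself. POSIX declares `ino_t`
in `<sys/types.h>`, so a minimal sketch of code that builds regardless
of which other headers happen to pull the type in could look like this
(`ns_ino_as_ull()` is a hypothetical helper, not part of the test
application):

```c
#include <sys/types.h> /* POSIX: declares ino_t */

/* Hypothetical helper standing in for a tracepoint argument of type ino_t. */
unsigned long long ns_ino_as_ull(ino_t ns_ino)
{
	return (unsigned long long) ns_ino;
}
```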
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic7a73c6754fc30a62bdf6519062c07be65a2eaba
Jonathan Rajotte [Wed, 15 Jan 2020 17:00:25 +0000 (12:00 -0500)]
actions: Expose lttng_action_type_string internally
This will ease debugging output on the action handling code.
Change-Id: I81b6faf5bb8b5082edcf3895ea88c8690572475a
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 2 Dec 2019 21:41:52 +0000 (16:41 -0500)]
actions: introduce action group
This patch introduces action groups as a new kind of action.
When creating a trigger, it is only possible to attach a single action.
Action groups allow users to attach more than one action.
A group is created using lttng_action_group_create. Actions are added
to it using lttng_action_group_add_action. The group can then be used
in a trigger, like any other action.
The operations required to be implemented by actions (serialize,
create_from_buffer, validate) are implemented by executing the operation
on all elements.
Current limitations are:
- To avoid any cycle, it is not possible to add a group inside a group.
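The "execute the operation on all elements" approach and the "no group
inside a group" limitation can be sketched with a toy model. All names
below (`toy_action`, `toy_group_add()`, `toy_action_validate()`) are
hypothetical and do not reflect the signatures of the lttng_action API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_GROUP_ACTIONS 8

enum toy_action_type { TOY_ACTION_LEAF, TOY_ACTION_GROUP };

/* Hypothetical action: either a leaf or a group of leaves. */
struct toy_action {
	enum toy_action_type type;
	bool valid;                                     /* leaf only */
	struct toy_action *children[MAX_GROUP_ACTIONS]; /* group only */
	size_t child_count;                             /* group only */
};

/* Returns 0 on success, -1 if the child is a group or the group is full. */
int toy_group_add(struct toy_action *group, struct toy_action *child)
{
	if (group->type != TOY_ACTION_GROUP ||
			child->type == TOY_ACTION_GROUP ||
			group->child_count == MAX_GROUP_ACTIONS) {
		return -1;
	}

	group->children[group->child_count++] = child;
	return 0;
}

/* A group is valid only if every one of its elements is valid. */
bool toy_action_validate(const struct toy_action *action)
{
	size_t i;

	if (action->type == TOY_ACTION_LEAF) {
		return action->valid;
	}

	for (i = 0; i < action->child_count; i++) {
		if (!toy_action_validate(action->children[i])) {
			return false;
		}
	}

	return true;
}
```

Rejecting groups as children keeps the structure one level deep, so no
cycle detection is needed.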
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I2ae6aed21d9a6b45510d37a435773b1bd7d76528
Jérémie Galarneau [Wed, 5 Feb 2020 23:13:07 +0000 (18:13 -0500)]
actions: Make lttng_action reference countable
lttng_action objects will be shared with the action executor subsystem
which executes them asynchronously. This introduces an ambiguous
ownership since triggers could be unregistered while an action is
executing (or pending execution).
This also eases the ownership management of the group action's
sub-actions: clients can add the same action to an action group
multiple times without any problem.
The currently user-visible 'destroy' method simply calls the 'put'
method which preserves the expected behaviour for current users (both
internal and public) of the API.
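The relationship between 'destroy' and 'put' can be sketched as follows.
The `refcounted_action` type and its functions are hypothetical; the
plain integer counter is an assumption made for brevity, and an object
shared across threads would need an atomic counter instead.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted action object. */
struct refcounted_action {
	unsigned int refcount;
};

/* Returns a new object holding one reference, or NULL on failure. */
struct refcounted_action *refcounted_action_create(void)
{
	struct refcounted_action *action = calloc(1, sizeof(*action));

	if (action) {
		action->refcount = 1;
	}

	return action;
}

void refcounted_action_get(struct refcounted_action *action)
{
	action->refcount++;
}

/* Returns 1 when the last reference was dropped and the object freed. */
int refcounted_action_put(struct refcounted_action *action)
{
	if (!action) {
		return 0;
	}

	if (--action->refcount == 0) {
		free(action);
		return 1;
	}

	return 0;
}

/* 'destroy' keeps its name for existing users but only drops a reference. */
int refcounted_action_destroy(struct refcounted_action *action)
{
	return refcounted_action_put(action);
}
```

A caller that never takes an extra reference sees 'destroy' free the
object as before, while a holder of a second reference keeps it alive.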
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2d016a9b80e418d40c72db8e155c44e96852b33f
Simon Marchi [Mon, 2 Dec 2019 20:20:38 +0000 (15:20 -0500)]
actions: introduce snapshot session action
This patch introduces the API for the "snapshot session" action.
A snapshot session action is created using the
lttng_action_snapshot_session_create function. It is mandatory to set a
session name using lttng_action_snapshot_session_set_session_name before
using the action in a trigger.
It is possible, but optional, to provide a snapshot name with
lttng_action_snapshot_session_set_snapshot_name.
The patch adds the code for serializing the action and deserializing it
on the sessiond side, but not the code for executing it.
Change-Id: I2b76680d44bf69eb705f2a238fffef2519b82534
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 5 Jun 2020 15:30:49 +0000 (11:30 -0400)]
Clean-up: replace space by tabs
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I86adff6db8bc4bb681ffe06530fd98ffbc03d3c5
Jonathan Rajotte [Wed, 27 May 2020 22:49:25 +0000 (18:49 -0400)]
Fix: tests: output_dir contains the consumerd pipe
Prevents failure on teardown. Otherwise, testpoint fails when removing
the consumerd pipe.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9cbfa211e2545350c28e3b10a34fb00aac0493cb
Jérémie Galarneau [Wed, 10 Jun 2020 19:39:47 +0000 (15:39 -0400)]
liblttng-ctl: use lttng_payload for serialize/create_from_buffer
Some objects used in the sessiond <-> liblttng-ctl communication (e.g.
userspace probe event rules) contain file descriptors that
must be carried across process boundaries (fd passing).
Since those objects are often nested within a higher-level object
hierarchy, it makes sense to adapt the existing
serialize/create_from_buffer interface to use an lttng_payload.
An lttng_payload contains a dynamic buffer and an array of file
descriptors. Objects are expected to push their file descriptors in the
payload in the same way they currently push the bytes of their binary
representation in a dynamic buffer.
Conversely, an lttng_payload_view interface is added and contains a
buffer_view and an iterator which allows objects to `pop` a file
descriptor when appropriate.
Tests are added to validate the FD consumption behaviour depending
on the origin of payload views (root view or descendant view).
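The push/pop symmetry can be sketched with a toy model. The types and
functions below are hypothetical, use a fixed-size array, and leave out
both the byte buffer and the root/descendant view distinction; they are
not the lttng_payload interface.

```c
#include <stddef.h>

#define TOY_MAX_FDS 4

/* Hypothetical, fixed-size stand-in for a payload's fd array. */
struct toy_payload {
	int fds[TOY_MAX_FDS];
	size_t fd_count;
};

/* A view refers to the payload and tracks the next fd to hand out. */
struct toy_payload_view {
	const struct toy_payload *payload;
	size_t next_fd;
};

/* Returns 0 on success, -1 when the payload cannot hold more fds. */
int toy_payload_push_fd(struct toy_payload *payload, int fd)
{
	if (payload->fd_count == TOY_MAX_FDS) {
		return -1;
	}

	payload->fds[payload->fd_count++] = fd;
	return 0;
}

/* Returns the next fd in push order, or -1 when none are left. */
int toy_payload_view_pop_fd(struct toy_payload_view *view)
{
	if (view->next_fd >= view->payload->fd_count) {
		return -1;
	}

	return view->payload->fds[view->next_fd++];
}
```

Objects that push their descriptors in serialization order can pop them
back in the same order while deserializing, without knowing about the
objects nested around them.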
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id378d8b5a3376a074ab138a60733377e39a24133
Jérémie Galarneau [Wed, 10 Jun 2020 17:02:29 +0000 (13:02 -0400)]
common: set dynamic-buffer's data to NULL on reset()
Set 'data' to NULL after the reset of a dynamic_buffer since
re-using it (e.g. appending) will cause realloc to be called
with an invalid pointer.
Not marked as a fix as no code currently re-uses a buffer after
a reset().
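A minimal sketch of the hazard, using a hypothetical `toy_buffer` rather
than the actual dynamic buffer code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified dynamic buffer. */
struct toy_buffer {
	char *data;
	size_t size;
};

void toy_buffer_reset(struct toy_buffer *buffer)
{
	free(buffer->data);
	/* Without this, a later realloc() would receive a freed pointer. */
	buffer->data = NULL;
	buffer->size = 0;
}

/* Returns 0 on success, -1 on allocation failure. */
int toy_buffer_append(struct toy_buffer *buffer, const void *src, size_t len)
{
	char *new_data;

	if (len == 0) {
		return 0;
	}

	/* realloc(NULL, n) behaves like malloc(n). */
	new_data = realloc(buffer->data, buffer->size + len);
	if (!new_data) {
		return -1;
	}

	memcpy(new_data + buffer->size, src, len);
	buffer->data = new_data;
	buffer->size += len;
	return 0;
}
```

Because `realloc()` accepts a NULL pointer, clearing `data` on reset is
enough to make a reset buffer safe to append to again.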
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I58fd5bbcfcda9d952748bea17430e2f41b076f3c
Jérémie Galarneau [Wed, 10 Jun 2020 17:00:16 +0000 (13:00 -0400)]
Clean-up: coding style fixes in dynamic-buffer.c
1) An empty line is expected before any statement after a scope is
closed
2) Variables should be marked const where possible.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I803e62fd759348416faae0bc108cacc726ce64da
Jonathan Rajotte [Tue, 9 Jun 2020 00:29:58 +0000 (20:29 -0400)]
liblttng-ctl: add facilities for lttng_snapshot_output object
Internal:
is_equal, serialize, validate, from_buffer.
Public:
set_local_path, set_network_url, set_network_urls
These APIs are used by the upcoming "snapshot session" action
used to trigger a snapshot on a given condition.
The internal API is added to transmit a snapshot output as part of an
action while the public API is added to clean-up the current
snapshot_output API which will be used by the client to create the
snapshot_session actions.
For instance, with set_local_path, it is no longer necessary to create a
local snapshot output by calling lttng_snapshot_output_set_ctrl_url
with a "file://" protocol.
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I00f9521faf9f66890ad6ea9a05ad7f6468f805f8
Jérémie Galarneau [Thu, 13 Feb 2020 23:21:08 +0000 (18:21 -0500)]
Fix: unix: don't PERROR on EAGAIN for non-blocking sockets
EAGAIN is expected on non-blocking UNIX socket operations. Reporting
it with PERROR results in a spammy sessiond log.
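The decision can be sketched with a hypothetical helper; the function
name and its `non_blocking` parameter are assumptions made for
illustration, not the code touched by this change.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a failed socket call deserves an
 * error message, given the errno value it left behind.
 */
bool socket_error_is_loggable(int saved_errno, bool non_blocking)
{
	if (non_blocking &&
			(saved_errno == EAGAIN || saved_errno == EWOULDBLOCK)) {
		/* Expected: the operation would simply have blocked. */
		return false;
	}

	return true;
}
```

EWOULDBLOCK is checked alongside EAGAIN since POSIX allows the two to
have distinct values.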
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I58ba711dad193b8d6849501f3e090797813e18ac
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479
Simon Marchi [Mon, 2 Dec 2019 20:01:50 +0000 (15:01 -0500)]
actions: introduce rotate session action
This patch introduces the API for the "rotate session" action.
A rotate session action is created using the
lttng_action_rotate_session_create function. It is mandatory to set a
session name using lttng_action_rotate_session_set_session_name before
using the action in a trigger.
The patch adds the code for serializing the action and deserializing it
on the sessiond side, but not the code for executing it.
Change-Id: I8ed1a71a00deaa6abafaff703a8980c2c7598bda
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 2 Dec 2019 19:53:11 +0000 (14:53 -0500)]
actions: introduce stop session action
This patch introduces the API for the "stop session" action.
A stop session action is created using the
lttng_action_stop_session_create function. It is mandatory to set a
session name using lttng_action_stop_session_set_session_name before
using the action in a trigger.
The patch adds the code for serializing the action and deserializing it
on the sessiond side, but not the code for executing it.
Change-Id: Ie00d744340f85f15a333680cf86d3882bd612d1a
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Fri, 29 Nov 2019 21:48:45 +0000 (16:48 -0500)]
actions: introduce start session action
This patch introduces the API for the "start session" action.
A start session action is created using the
lttng_action_start_session_create function. It is mandatory to set a
session name using lttng_action_start_session_set_session_name before
using the action in a trigger.
The patch adds the code for serializing the action and deserializing it
on the sessiond side, but not the code for executing it.
Change-Id: I90598e25a461ccabcf7dc327aaa73b3d35e203af
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Sat, 14 Dec 2019 00:06:41 +0000 (19:06 -0500)]
actions: implement is_equal
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I9272253202cfd0d6b6fb165293a534a824fbe854
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Sat, 18 Apr 2020 04:10:41 +0000 (00:10 -0400)]
Clean-up: sort includes per clang format in action.c
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I1dd53d6c7a6e561d4537ba483f38e1ac2ace391e
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 20 Apr 2020 18:27:45 +0000 (14:27 -0400)]
format: AlignOperand introduces spaces
Observed issue
==============
t = tabs
s = space
tabs = 8 blanks
clang-format on this code will align with the operand using space.
consumed_len = sizeof(struct lttng_action_start_session_comm) +
t ssssssscomm->session_name_len;
We want:
consumed_len = sizeof(struct lttng_action_start_session_comm) +
t t comm->session_name_len;
Solution
========
Explicitly set it to false.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I39bee6d82b20f4b6f9587a2911abb183de767d25
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 May 2020 15:27:26 +0000 (11:27 -0400)]
Fix: incorrect specifier %lu used with size_t argument
Fixes the following warning on 32-bit targets:
libtool: compile: gcc -DHAVE_CONFIG_H -I../../../include -I../../../include -I../../../src -include config.h -I/build/include -I/home/jenkins/workspace/lttng-tools_master_portbuild/arch/armhf/babeltrace_version/stable-1.5/build/std/conf/std/liburcu_version/master/test_type/base/deps/build/include -Wall -Wno-incomplete-setjmp-declaration -Wdiscarded-qualifiers -Wmissing-declarations -Wmissing-prototypes -Wmissing-parameter-type -fno-strict-aliasing -pthread -g -O2 -MT consumer-stream.lo -MD -MP -MF .deps/consumer-stream.Tpo -c consumer-stream.c -fPIC -DPIC -o .libs/consumer-stream.o
In file included from ../../../src/common/common.h:12:0,
from consumer.c:25:
consumer.c: In function ‘lttng_consumer_on_read_subbuffer_mmap’:
../../../src/common/error.h:161:35: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 7 has type ‘size_t {aka unsigned int}’ [-Wformat=]
#define DBG(fmt, args...) _ERRMSG("DEBUG1", PRINT_DBG, fmt, ## args)
^
../../../src/common/error.h:136:51: note: in definition of macro ‘__lttng_print’
fprintf((type) == PRINT_MSG ? stdout : stderr, fmt, ## args); \
^~~
../../../src/common/error.h:161:27: note: in expansion of macro ‘_ERRMSG’
#define DBG(fmt, args...) _ERRMSG("DEBUG1", PRINT_DBG, fmt, ## args)
^~~~~~~
consumer.c:1688:2: note: in expansion of macro ‘DBG’
DBG("Consumer mmap write() ret %zd (len %lu)", ret, write_len);
^~~
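A minimal sketch of a portable version of that format string. The
`format_write_result()` helper is hypothetical and writes to a caller
supplied buffer instead of going through the DBG() macro.

```c
#include <stdio.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical helper that formats the debug message into `out`. */
int format_write_result(char *out, size_t out_len, ssize_t ret,
		size_t write_len)
{
	/* %zu is the size_t specifier; %zd is its signed counterpart. */
	return snprintf(out, out_len,
			"Consumer mmap write() ret %zd (len %zu)",
			ret, write_len);
}
```

On targets where `size_t` is `unsigned int`, `%lu` mismatches the
argument type, while `%zu` follows `size_t` on every target.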
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id9a571d8e94105428833baa053c6463b91484a03
Jérémie Galarneau [Thu, 14 May 2020 18:24:17 +0000 (14:24 -0400)]
Fix: consumerd: live client receives incomplete metadata
Observed issue
==============
Babeltrace 1.5.x and Babeltrace 2.x can both report errors (albeit
differently) when using the "lttng-live" protocol that imply that the
metadata they received is incomplete.
For instance, babeltrace 1.5.3 reports the following error:
```
[error] Error creating AST
[error] [Context] Cannot open_mmap_trace of format ctf.
[error] Error adding trace
[warning] [Context] Cannot open_trace of format lttng-live at path net://localhost:xxxx/host/session/live_session.
[warning] [Context] cannot open trace "net://localhost:xxxx/host/session/live_session" for reading.
[error] opening trace "net://localhost:xxxx/host/session/live_session" for reading.
[error] none of the specified trace paths could be opened.
```
While debugging both viewers, I noticed that both were attempting to
receive the available metadata before consuming the "data" streams'
content.
Typically, the following exchange between the relay daemon and the
lttng-live client occurs when the problem is observed:
bt lttng-live:
emits LTTNG_VIEWER_GET_METADATA command
relayd:
returns LTTNG_VIEWER_METADATA_OK, len = 4096 (default packet size)
bt lttng-live:
consume 4096 bytes of metadata
emits LTTNG_VIEWER_GET_METADATA command
relayd:
returns LTTNG_VIEWER_NO_NEW_METADATA
When the lttng-live client receives the LTTNG_VIEWER_NO_NEW_METADATA
status code, it attempts to parse all the metadata it has received
since the last LTTNG_VIEWER_NO_NEW_METADATA reply. In effect, it is
expected that this forms a logical unit of metadata that is parseable
on its own.
If this is the first time metadata is received for that trace, the
metadata is expected to contain a trace declaration, packet header
declaration, etc.
If metadata was already received, it is expected that the newly parsed
declarations can be "appended" to the existing trace schema.
It appears that the relay daemon sends the
LTTNG_VIEWER_NO_NEW_METADATA while the metadata it has sent up to that
point is not parseable on its own.
The live protocol description does not require or imply that a viewer
should attempt to parse metadata packets until it hopefully succeeds
at some point. Anyhow:
1) This would make it impossible for a live viewer to correctly
handle a corrupted metadata stream beyond retrying forever,
2) This behaviour is not implemented by the two reference
implementations of the protocol.
Cause
=====
The relay daemon provides a guarantee that it will send any available
metadata before allowing a data stream packet to be served to the
client.
In other words, a client requesting a data packet will receive the
LTTNG_VIEWER_FLAG_NEW_METADATA status code (and no data) if it
attempts to get a data stream packet while the relay daemon has
metadata already available.
This guarantee is properly enforced as far as I can tell. However,
looking at the consumer daemon implementation, it appears that
metadata packets are sent as soon as they are available.
A metadata packet is not guaranteed to be parseable on its own. For
instance, it can end in the middle of an event declaration.
Hence, this hints at a race involving the tracer, the consumer daemon,
the relay daemon, and the lttng-live client.
Consider the following scenario:
- Metadata packets (sub-buffers) are configured to be 4kB in size,
- a large number of kernel events are enabled (e.g. --kernel --all),
- the network connection between the consumer and relay daemons is
slow
1) The kernel tracer will produce enough TSDL metadata to fill the
first sub-buffer of the "metadata" ring-buffer and signal the
consumer daemon that a buffer is ready. The tracer then starts
writing the remaining data in the following available sub-buffers.
2) The consumer daemon metadata thread is woken up and consumes the
first metadata sub-buffer and sends it to the relay daemon.
3) A live client establishes an lttng-live connection to the relay
daemon and attempts to consume the available metadata. It receives
the first packet and, since the relay daemon doesn't know about any
follow-up metadata, receives LTTNG_VIEWER_NO_NEW_METADATA on the
next attempt.
4) Having received LTTNG_VIEWER_NO_NEW_METADATA, the lttng-live client
attempts to parse the metadata it has received and fails.
This scenario is easy to reproduce by inserting a "sleep(1)" at
src/bin/lttng-relayd/main.c:1978 (as of this revision). This simulates
a relay daemon that would be slow to receive/process metadata packets
from the consumer daemon.
This problem similarly applies to the user space tracer.
Solution
========
Having no knowledge of TSDL, the relay daemon can't "bundle" packets
of metadata until they form a "parseable unit" to send to the live
client.
To provide the parseability guarantee expected by the viewers, and by
the relay daemon implicitly, we need to ensure that the consumer
daemons only send "parseable units" of metadata to the relay daemon.
Unfortunately, the consumer daemons do not know how to parse TSDL
either. In fact, only the metadata producers are able to provide the
boundaries of the "parseable units" of metadata.
The general idea of the fix is to accumulate metadata up to a point
where a "parseable unit" boundary has been identified and send that
content in one request to the relay daemon. Note that the solution
described here only concerns the live mode. In other cases, the
mechanisms described are simply bypassed.
A "metadata bucket" is added to lttng_consumer_stream when it is
created from a live channel. This bucket is filled until the
consumption position reaches the "parseable unit" end position.
A refresher about the handling of metadata in live mode
-------------------------------------------------------
Three "events" are of interest here and can cause metadata to be
consumed more or less indirectly:
1) A metadata packet is closed, causing the metadata thread to wake
up
2) The live timer expires
3) A data sub-buffer is closed, causing the data thread to wake-up
1) The first case is simple and happens regardless of whether or not
the tracing session is in live mode. Metadata is always
consumed by the metadata thread in the same way. However, this
scenario can be "caused" by (2) and (3). See [1]. A sub-buffer is
"acquired" from the metadata ring-buffer and sent to the relayd
daemon as the payload of a "RELAYD_SEND_METADATA" command.
2) When the live timer expires [2], the 'check_stream' function is
called on all data streams of the session. As its name clearly
implies, this function is responsible for flushing all streams or
sending a "live beacon" (called an "empty index" in the code) if
there is no data to flush. Any flushed data will result in (3).
3) When a data sub-buffer is ready to be consumed, [1] is invoked
by the data thread. This function acquires a sub-buffer and sends
it to the relay daemon through the data connection.
Then, an important synchronization step takes place. The index of
the newly-sent packet will be sent through the control
connection. The relay daemon waits for both the data packet and its
matching index before making the new packet visible to live
viewers.
Since a data packet could contain data that requires "newer"
metadata to be decoded, the data thread flushes the metadata stream
and enters a "waiting" phase to pause until all metadata present in
the metadata ring buffer has been consumed [3].
At the end of this waiting phase, the data thread sends the data
packet's index to the relay daemon, allowing the relayd to make it
visible to its live clients.
How to identify a "parseable unit" boundary?
--------------------------------------------
In the case of the kernel domain, the kernel tracer produces the
actual TSDL descriptions directly. The TSDL metadata is serialized to
a metadata cache and is flushed "just in time" to the metadata
ring-buffer when a "get next" operation is performed.
There is no way, from user space, to query whether or not the metadata
cache of the kernel tracer is empty. Hence, a new
RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK command was added to
query whether or not the kernel tracer's metadata cache is empty when
acquiring a sub-buffer.
This allows the consumer daemon to identify a "coherent" position in
the metadata stream that is safe to use as a "parseable unit"
boundary.
As for the user space domain, since the session daemon is responsible
for generating the TSDL representation of the metadata, there is no
need to change LTTng-ust APIs.
The session daemon generates coherent units of metadata and adds them
to its "registry" at once (protected by the registry's lock). It then
flushes the contents to the consumer daemon and waits for that data to
be consumed before proceeding further.
On the consumer daemon side, the metadata cache is filled with the
newly-produced contents. This is done atomically with respect to
accesses to the metadata cache as all accesses happen through a
dedicated metadata cache lock.
When the consumer's metadata polling thread is woken-up, it will
attempt to acquire (`get_next`) a sub-buffer from the metadata stream
ring-buffer. If it fails, it will flush a sub-buffer's worth of
metadata to the ring-buffer and attempt to acquire a sub-buffer again.
At this point, it is possible to determine if that sub-buffer is the
last one of a parseable metadata unit: the cache must be empty and the
ring-buffer must be empty following the consumption of this
sub-buffer. When those conditions are met, the resulting metadata
`stream_subbuffer` is tagged as being `coherent`.
Metadata bucket
---------------
A helper interface, metadata_bucket, is introduced as part of this
fix. A metadata_bucket is `fill`ed with `stream_subbuffer`s, and is
eventually `flushed` when it is filled by a `coherent` sub-buffer.
As older versions of LTTng-modules must remain supported, this new
helper is not used when the
RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK operation is not
available. When the operation is available, the metadata stream's
bucketization is enabled, causing a bucket to be created and the
`consume` callback to be swapped.
The `consume` callback of the metadata streams is replaced by a new
implementation when the metadata bucketization is activated on the
stream. This implementation returns the padded size of the consumed
sub-buffer when it is added to the bucket. When the bucket is
flushed, the regular `mmap`-based consumption function is called with
the bucket's contents.
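The fill-then-flush behaviour can be sketched as follows. The
`toy_bucket` type and its functions are hypothetical and do not match
the metadata_bucket interface; in particular, the caller is assumed to
send the accumulated content itself once `toy_bucket_fill()` reports a
complete unit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified bucket accumulating sub-buffer contents. */
struct toy_bucket {
	char *content;
	size_t len;
};

/*
 * Append a sub-buffer to the bucket.
 * Returns 1 when the bucket holds a complete unit that can be sent,
 * 0 when more sub-buffers are needed, -1 on allocation failure.
 */
int toy_bucket_fill(struct toy_bucket *bucket, const char *subbuf,
		size_t subbuf_len, bool coherent)
{
	if (subbuf_len > 0) {
		char *grown = realloc(bucket->content,
				bucket->len + subbuf_len);

		if (!grown) {
			return -1;
		}

		memcpy(grown + bucket->len, subbuf, subbuf_len);
		bucket->content = grown;
		bucket->len += subbuf_len;
	}

	return coherent ? 1 : 0;
}

/* Empty the bucket once its content has been sent. */
void toy_bucket_reset(struct toy_bucket *bucket)
{
	free(bucket->content);
	bucket->content = NULL;
	bucket->len = 0;
}
```

Only the sub-buffer tagged as `coherent` triggers a send, so the
receiver is handed content that ends on a "parseable unit" boundary.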
Known drawbacks
===============
This implementation causes the consumer daemon to buffer the whole
initial unit of metadata before sending it. In practice, this is not
expected to be a problem since the largest metadata files we have seen
in real use are a couple of megabytes in size.
Beyond the (temporary) memory use, this causes the metadata thread to
block while this potentially large chunk of metadata is sent (rather
than blocking while sending 4kB at a time).
The second point is just a consequence of existing shortcomings of the
consumerd; slow IO should not affect other unrelated streams. The
fundamental problem is that blocking IO is used and we should switch
to non-blocking communication if this is a problem (as is done in the
relay daemon).
The first point is more problematic given the existing tracer APIs.
If the tracer could provide the boundary of a "parseable unit" of
metadata, we could send the header of the RELAYD_SEND_METADATA command
with that size and send the various metadata packets as they are made
available. This would make no difference to the relay daemon as it is
not blocking on that socket and will not make the metadata size change
visible to the "live server" until it has all been received.
This size can't be determined right now since it could exceed the
total size of the "metadata" ring buffer. In other words, we can't wait
for the production of metadata to complete before starting to consume.
Finally, while implementing this fix, I also realized that the
computation of the rotation position of the metadata streams is
erroneous. The rotation code makes use of the ring-buffer's positions
to determine the rotation position. However, since both user space and
kernel domains make use of a "cache" behind the ring-buffer, that
cached content must be taken into account when computing the metadata
stream's rotation position.
References
==========
[1] https://github.com/lttng/lttng-tools/blob/d5ccf8fe0/src/common/consumer/consumer.c#L3433
[2] https://github.com/lttng/lttng-tools/blob/d5ccf8fe0/src/common/consumer/consumer-timer.c#L312
[3] https://github.com/lttng/lttng-tools/blob/d5ccf8fe0/src/common/consumer/consumer-stream.c#L492
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I40ee07e5c344c72d9aae2b9b15dc36c00b21e5fa
Jérémie Galarneau [Sun, 10 May 2020 22:00:26 +0000 (18:00 -0400)]
consumerd: refactor: split read_subbuf into sub-operations
The read_subbuf code paths intertwine domain-specific operations and
metadata/data-specific logic which makes modifications error prone and
introduces a fair amount of code duplication.
lttng_consumer_read_subbuffer is effectively turned into a template
method invoking overridable callbacks making most of the consumption
logic domain and data/metadata agnostic.
The goal is not to extensively clean-up that code path. A follow-up
fix introduces metadata buffering logic which would not reasonably fit
in the current scheme. This clean-up makes it easier to safely
introduce those changes.
No changes in behaviour are intended by this change.
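The template-method shape described above can be sketched as follows. This is only an illustration under assumed names: `struct read_ops`, `read_subbuffer()` and the `toy_*` specialization are hypothetical and do not reflect the actual consumerd interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-domain callbacks; the names are illustrative only. */
struct read_ops {
	int (*get_next_subbuffer)(void *stream, size_t *len);
	int (*consume_subbuffer)(void *stream, size_t len);
	void (*put_next_subbuffer)(void *stream);
};

/* Domain-agnostic skeleton: the sequence is fixed, the steps are pluggable. */
int read_subbuffer(const struct read_ops *ops, void *stream)
{
	size_t len = 0;
	int ret;

	assert(ops && ops->get_next_subbuffer && ops->consume_subbuffer &&
			ops->put_next_subbuffer);

	ret = ops->get_next_subbuffer(stream, &len);
	if (ret) {
		/* Nothing was acquired, so there is nothing to release. */
		return ret;
	}

	ret = ops->consume_subbuffer(stream, len);
	/* The sub-buffer is released whether or not consumption succeeded. */
	ops->put_next_subbuffer(stream);
	return ret;
}

/* Toy specialization that only records what the skeleton asked of it. */
struct toy_stream {
	size_t consumed;
	int released;
};

static int toy_get_next(void *stream, size_t *len)
{
	(void) stream;
	*len = 16;
	return 0;
}

static int toy_consume(void *stream, size_t len)
{
	((struct toy_stream *) stream)->consumed += len;
	return 0;
}

static void toy_put(void *stream)
{
	((struct toy_stream *) stream)->released++;
}

const struct read_ops toy_ops = {
	.get_next_subbuffer = toy_get_next,
	.consume_subbuffer = toy_consume,
	.put_next_subbuffer = toy_put,
};
```

A kernel or user space specialization would supply its own `struct read_ops` instance, while metadata/data differences could be expressed the same way without duplicating the skeleton.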
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9366f2e2a38018ca9b617b93ad9259340180c55d
Jérémie Galarneau [Fri, 8 May 2020 20:00:11 +0000 (16:00 -0400)]
consumerd: move rotation logic to domain-agnostic read path
The "rotation ready" logic is duplicated in both user space and kernel
specializations of the read subbuffer functions.
It is moved to the domain-agnostic caller where it is needed only
once. This makes it easier to implement a follow-up fix and reduces
code duplication.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iae952a2cd52fa458cec956ae219492557e4adf79
Jérémie Galarneau [Tue, 5 May 2020 22:54:32 +0000 (18:54 -0400)]
sessiond: enforce mmap output type for kernel metadata channel
A follow-up fix causes the consumer daemon to accumulate metadata
packets into a complete "unit" that can be parsed before sending it to
the relay daemon.
The consumer daemon will also need to extract the contents of the
metadata cache when computing a rotation position (follow-up fix too).
Hence, it is not possible to rely on the splice back-end as the
consumer daemon may need to accumulate more content than can be backed
by the ring buffer's underlying pages.
In both cases, the splice output mode could still be used when
combined with a memfd, but I see no tangible benefit. Moreover, it
would require a 3.17 kernel.
Curiously the kernel metadata channel configuration appears to be
hard-coded twice; once in the ltt_kernel_session's
ltt_kernel_metadata, and once again in
kernel_consumer_add_metadata(). kernel_consumer_add_metadata is
modified to use the kernel session's metadata configuration.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia4cad82f595d3eee50d081851c234d4c2ef7ee5f
Jérémie Galarneau [Tue, 5 May 2020 19:48:05 +0000 (15:48 -0400)]
consumerd: tag metadata channel as being part of a live session
Metadata channels that are part of a live session must be handled
differently than when they are part of non-live sessions since
complete "metadata units" must be accumulated before they are
forwarded to a relay daemon.
This allows a follow-up fix to use this information since the
live_timer_interval of a metadata channel is always 0.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I53db4bc717b149ed20e0309531db6f0241e873e1
Jérémie Galarneau [Tue, 5 May 2020 17:13:03 +0000 (13:13 -0400)]
consumerd: pass channel instance to stream creation function
Both callsites of consumer_allocate_stream() set the stream's "chan"
pointer after the creation. Pass the channel directly to the stream
creation function so it can initialize the stream according to the
channel's settings.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Icea7088e7695e310585bf398e14e6443d67a30bb
Jérémie Galarneau [Mon, 4 May 2020 23:04:02 +0000 (19:04 -0400)]
consumerd: cleanup: use buffer view interface for mmap read subbuf
Replace explicit pointer + size parameters by an lttng_buffer_view
in lttng_consumer_on_read_subbuffer_mmap().
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I76f35b3e295c596cdf4bbb8a6d01168a850a721a
Jérémie Galarneau [Mon, 4 May 2020 22:21:48 +0000 (18:21 -0400)]
consumerd: move address computation from on_read_subbuffer_mmap
The computation of the subbuffer's address is moved outside of
lttng_consumer_on_read_subbuffer_mmap to make it usable with a regular
buffer. This facilitates an upcoming change.
Moreover this has the benefit of isolating domain-specific logic from
this function which is supposed to be domain-agnostic.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I16f8ccaa73804f98fa03e69136548e6d6b7782e5
Jérémie Galarneau [Wed, 29 Apr 2020 04:03:43 +0000 (00:03 -0400)]
consumerd: refactor: combine duplicated check_*_functions
The check_ust_stream and check_kernel_stream functions are identical
except for the domain-specific call to consumer_flush_*_index.
A "flush_index" callback is passed to check_stream in order to share
the rest of that code.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iafdb64192322c0106a555b67f54290dadc4f0579
Jérémie Galarneau [Wed, 29 Apr 2020 01:40:12 +0000 (21:40 -0400)]
kernel-ctl: add RING_RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK
Add a wrapper for RING_RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK
which gets the next metadata subbuffer and returns a boolean flag
indicating whether the metadata is guaranteed to be in a consistent
state at the end of this sub-buffer (can be parsed).
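As an illustration of how a caller might use such a flag, the sketch below accumulates sub-buffer contents until one is reported as consistent. `struct metadata_unit`, `metadata_unit_push()` and the fixed capacity are assumptions made for the example; they are not part of the wrapper added here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define UNIT_CAPACITY 4096

/* Hypothetical accumulator for a "parseable unit" of metadata. */
struct metadata_unit {
	char data[UNIT_CAPACITY];
	size_t len;
};

/*
 * Append one sub-buffer to the unit.
 *
 * Returns 1 when the accumulated content can be forwarded (the producer
 * reported a consistent state), 0 when more sub-buffers are needed, and
 * -1 when the content does not fit (nothing is appended in that case).
 */
int metadata_unit_push(struct metadata_unit *unit, const char *subbuf,
		size_t subbuf_len, bool coherent)
{
	assert(unit && subbuf);

	if (subbuf_len > UNIT_CAPACITY - unit->len) {
		return -1;
	}

	memcpy(unit->data + unit->len, subbuf, subbuf_len);
	unit->len += subbuf_len;
	return coherent ? 1 : 0;
}
```

A real implementation would likely grow the buffer dynamically rather than reject oversized content; the fixed capacity only keeps the sketch short.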
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I13fbdfe51c3c4ef04581409e0fbc9837ed6d555d
Jonathan Rajotte [Thu, 21 May 2020 00:53:45 +0000 (20:53 -0400)]
Fix: common: fs_handle_seek returns negative value on success
Observed issue
==============
Babeltrace 1/2 fails to fetch data from a live session.
Error:
PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_get_stream_bytes@viewer-connection.c:1593 [lttng-live] Received get_data_packet response: error
PLUGIN/CTF/MSG-ITER request_medium_bytes@msg-iter.c:546 [lttng-live] User function failed: status=ERROR
PLUGIN/CTF/MSG-ITER ctf_msg_iter_get_next_message@msg-iter.c:2881 [lttng-live] Cannot handle state: msg-it-addr=0x562d87521a40, state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_iterator_next_handle_one_active_data_stream@lttng-live.c:821 [lttng-live] CTF message iterator failed to get next message: msg-iter=0x562d87521a40, msg-iter-status=ERROR
PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_msg_iter_next@lttng-live.c:1499 [lttng-live] Error preparing the next batch of messages: live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
LIB/MSG-ITER bt_message_iterator_next@iterator.c:865 Component input port message iterator's "next" method failed: iter-addr=0x562d8751ab70, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
PLUGIN/FLT.UTILS.MUXER muxer_upstream_msg_iter_next@muxer.c:446 [muxer] Upstream iterator's next method returned an error: status=ERROR
PLUGIN/FLT.UTILS.MUXER validate_muxer_upstream_msg_iters@muxer.c:989 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x562d87515280, muxer-upstream-msg-iter-wrap-addr=0x562d8751ca90
PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1417 [muxer] Cannot get next message: comp-addr=0x562d8751a260, muxer-comp-addr=0x562d8751a2e0, muxer-msg-iter-addr=0x562d87515280, msg-iter-addr=0x562d8751aa90, status=ERROR
LIB/MSG-ITER bt_message_iterator_next@iterator.c:865 Component input port message iterator's "next" method failed: iter-addr=0x562d8751aa90, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
LIB/GRAPH consume_graph_sink@graph.c:462 Component's "consume" method failed: status=ERROR, comp-addr=0x562d8751a3d0, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x562d8751a110, comp-class-so-handle-path="/usr/local/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
CLI cmd_run@babeltrace2.c:2529 Graph failed to complete successfully
PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_session_detach@viewer-connection.c:1211 [lttng-live] Unknown detach return code 0
The relevant relayd log:
DEBUG2 - Relay get data packet (in viewer_get_packet() at live.c:1770)
PERROR - Failed to seek file system handle of viewer stream 4 to offset
2244861952: Success (in viewer_get_packet() at live.c:1810)
DEBUG1 - Sent 262156 bytes for stream 4 (in viewer_get_packet() at live.c:1852)
Cause
=====
The fs_handle_seek function calls lseek on a stream file of ~2.5GB. The
return value of the lseek call is downcast from off_t (signed 64 bit
on my system) to int. The resulting value is negative and forces an error
at the call sites.
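The truncation can be reproduced in isolation with the offset from the relayd log. The two helpers below are hypothetical stand-ins for the pre-fix and post-fix return types, not the actual fs_handle_seek code. Converting an out-of-range value to int is implementation-defined in C; common ABIs wrap modulo 2^32, which yields a negative value here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a seek wrapper returning `int` (pre-fix). */
int seek_result_as_int(int64_t offset)
{
	/* The demonstration only makes sense when int is narrower. */
	assert(sizeof(int) < sizeof(int64_t));

	/* Out-of-range conversion: implementation-defined, wraps on common ABIs. */
	return (int) offset;
}

/* Same value carried in a 64-bit type, as with a 64-bit off_t return. */
int64_t seek_result_as_int64(int64_t offset)
{
	return offset;
}
```

With the offset 2244861952, the narrowed result is negative on such platforms even though the seek itself succeeded, which matches the "Success" errno string in the log above.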
Solution
========
Use off_t as the return type.
Note that current call sites of fs_handle_seek already expect an off_t
return value.
ag fs_handle_seek:
src/bin/lttng-relayd/stream.c
249: lseek_ret = fs_handle_seek(previous_stream_file, previous_stream_copy_origin, SEEK_SET);
src/bin/lttng-relayd/live.c
1804: lseek_ret = fs_handle_seek(vstream->stream_file.handle,
src/bin/lttng-relayd/viewer-stream.c
176: lseek_ret = fs_handle_seek(
ag lseek_ret:
src/bin/lttng-relayd/stream.c
193: off_t lseek_ret, previous_stream_copy_origin;
src/bin/lttng-relayd/live.c
1760: off_t lseek_ret;
src/bin/lttng-relayd/viewer-stream.c
174: off_t lseek_ret;
Known drawbacks
=========
This limitation existed before this patch.
On 32-bit systems built without _FILE_OFFSET_BITS=64 defined at compile time
(-D_FILE_OFFSET_BITS=64), the lseek operation could fail with EOVERFLOW for
stream files bigger than 2,147,483,647 bytes. Anybody working on a 32-bit
system should be aware of this limitation.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib6c7bb3c9c402bdfbe9b447b1f8f298de6058caa
Jérémie Galarneau [Thu, 14 May 2020 20:08:56 +0000 (16:08 -0400)]
Fix: lttng: Destroying session message repeated during destruction
Observed Issue
==============
The `Destroying session X` message is repeated indefinitely whenever
the data pending phase lasts more than one iteration.
```
$ lttng destroy
Destroying session eloi_turcotte.Destroying session
eloi_turcotte.Destroying session eloi_turcotte.Destroying session
eloi_turcotte.D
```
Cause
=====
Missing check that the message has been printed.
Solution
========
Use the same check as is done later for
lttng_destruction_handle_wait_for_completion().
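The general shape of such a check is a "printed" flag guarding the message inside the polling loop. The function below is a hypothetical sketch of that pattern, not the actual client code; the loop bound stands in for the number of data pending iterations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Poll `pending_iterations` times, printing the banner once and a progress
 * dot per iteration. Returns how many times the banner was printed.
 */
int wait_with_single_message(int pending_iterations, FILE *out)
{
	bool printed_message = false;
	int banner_count = 0;

	assert(out);

	while (pending_iterations-- > 0) {
		if (!printed_message) {
			fputs("Destroying session", out);
			printed_message = true;
			banner_count++;
		}

		fputc('.', out);
	}

	return banner_count;
}
```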
Known drawbacks
===============
None.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6cd29d917925644a4994c515b4177bbd05ffa98e
Jérémie Galarneau [Fri, 15 May 2020 20:04:11 +0000 (16:04 -0400)]
Add lttng_dynamic_buffer_append_view util
Add lttng_dynamic_buffer_append_view() which appends the contents
of a buffer view to a dynamic buffer.
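Conceptually, the operation grows the destination and copies the viewed bytes at its end. The sketch below uses simplified, hypothetical `struct view` and `struct dynbuf` types and a plain realloc() growth policy; the real lttng_buffer_view and lttng_dynamic_buffer types and their allocation strategy may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical read-only window over existing memory. */
struct view {
	const char *data;
	size_t size;
};

/* Hypothetical growable buffer owning its storage. */
struct dynbuf {
	char *data;
	size_t size;
};

/* Append the bytes covered by `v` to `buf`. Returns 0 on success, -1 on error. */
int dynbuf_append_view(struct dynbuf *buf, const struct view *v)
{
	char *grown;

	assert(buf && v);

	if (v->size == 0) {
		return 0;
	}

	grown = realloc(buf->data, buf->size + v->size);
	if (!grown) {
		/* The original storage is left untouched on failure. */
		return -1;
	}

	memcpy(grown + buf->size, v->data, v->size);
	buf->data = grown;
	buf->size += v->size;
	return 0;
}
```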
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4082ba2c848b79aa2116847987067453638de441
Jérémie Galarneau [Fri, 15 May 2020 19:55:27 +0000 (15:55 -0400)]
Make lttng_dynamic_buffer_append_buffer const-correct
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6d42d2d9f8beca15b026fc43ee57270173480c2d
Jérémie Galarneau [Tue, 12 May 2020 18:37:37 +0000 (14:37 -0400)]
.gitignore: add test_buffer_view
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I620042eb36b6041887fe0d4dc7353c311d5e867a
Jérémie Galarneau [Wed, 20 May 2020 19:56:03 +0000 (15:56 -0400)]
Fix: liblttng-ctl: leak of tracker handle in lttng_[un]track_pid
The lttng_track_pid and lttng_untrack_pid functions were reimplemented
on top of the new lttng_process_attr_tracker_handle API (new in 2.12).
Neither function destroys the tracker handle on its return
path.
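A common way to avoid this class of leak is to funnel every return through a single clean-up label. The sketch below shows that pattern with a hypothetical `struct handle` and a live-handle counter used only to make the leak observable; it does not use the real lttng_process_attr_tracker_handle API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handle type; the counter makes leaks observable. */
struct handle {
	int placeholder;
};

int live_handles;

static struct handle *handle_create(void)
{
	struct handle *h = calloc(1, sizeof(*h));

	if (h) {
		live_handles++;
	}

	return h;
}

static void handle_destroy(struct handle *h)
{
	if (!h) {
		return;
	}

	assert(live_handles > 0);
	free(h);
	live_handles--;
}

/* Every exit path goes through `end`, so the handle is always released. */
int track_pid_sketch(int pid)
{
	int ret;
	struct handle *h = handle_create();

	if (!h) {
		ret = -1;
		goto end;
	}

	if (pid < 0) {
		ret = -1;
		goto end;
	}

	ret = 0;
end:
	handle_destroy(h);
	return ret;
}
```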
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ieeeace2011ba6ee1e5024306ea735f0389c3980d
Jonathan Rajotte [Tue, 19 May 2020 16:23:18 +0000 (12:23 -0400)]
Fix: common: abort on rotation after time manipulation
Observed issue
==============
Core dump:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000003eb4025548 in __GI_abort () at abort.c:79
#2 0x0000003eb402542f in __assert_fail_base (fmt=0x3eb4184ae0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
file=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=903, function=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed")
at assert.c:92
#3 0x0000003eb4033af2 in __GI___assert_fail (assertion=assertion@entry=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
file=file@entry=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=line@entry=903,
function=function@entry=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed") at assert.c:101
#4 0x000000000047f37e in lttng_trace_chunk_move_to_completed (trace_chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903
#5 0x0000000000480755 in lttng_trace_chunk_release (ref=0x7fcb5c00e598) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1117
#6 urcu_ref_put (release=<optimized out>, ref=0x7fcb5c00e598) at /usr/include/urcu/ref.h:68
#7 lttng_trace_chunk_put (chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1150
#8 0x0000000000429c22 in cmd_rotate_session (session=0x7fcb5c003ff0, rotate_return=rotate_return@entry=0x7fcb6b7ed470, quiet_rotation=quiet_rotation@entry=false)
at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/cmd.c:5037
#9 0x00000000004451d7 in process_client_msg (cmd_ctx=0x7fcb5c00e760, sock=sock@entry=0x7fcb6b7fd4c0, sock_error=sock_error@entry=0x7fcb6b7fd4c4)
at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:1852
#10 0x00000000004474c6 in thread_manage_clients (data=<optimized out>) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:2199
#11 0x00000000004422f2 in launch_thread (data=0x4f97a0) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/thread.c:75
#12 0x0000003eb4408ed4 in start_thread (arg=<optimized out>) at pthread_create.c:479
#13 0x0000003eb40f8e6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Reproduction:
Disable ntp/any time management mechanism.
lttng create
lttng enable-event -u 'lttng_ust_tracef:*'
lttng start
lttng rotate
date --set="$(date --date='-1 hour')"
lttng rotate auto-20200515-142503
Waiting for rotation to complete
Error: Failed to query the state of the rotation.
Logs:
DEBUG1 - 12:25:28.570037987 [2660/2717]: Setting trace chunk close command to "move to completed chunk folder" (in lttng_trace_chunk_set_close_command() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1073)
Error: Failed to set trace chunk close timestamp: close timestamp is before creation timestamp
Error: Failed to set the close timestamp of the current trace chunk of session "auto-20200515-142503"
lttng-sessiond: ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903: lttng_trace_chunk_move_to_completed: Assertion `(trace_chunk->timestamp_close).is_set' failed.
...
Aborted (core dumped)
root@X10SDV-8C-TLN4F:~# DEBUG1 - 12:25:29.534263017 [2739/2739]: Releasing trace chunk registry to all trace chunks (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1414)
DEBUG1 - 12:25:29.534317468 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 2, name = "20200515T122528+0000-2", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
DEBUG1 - 12:25:29.534365653 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 1, name = "20200515T142520+0000-1", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
DEBUG1 - 12:25:29.534400638 [2739/2739]: Released reference to 2 trace chunks in lttng_trace_chunk_registry_put_each_chunk() (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1447)
Error: 2 trace chunks are leaked by lttng-consumerd. This can be caused by an internal error of the session daemon.
Cause
=====
The trace_chunk->timestamp_close is not set since the result from time()
is smaller than the creation timestamp.
The close timestamp is smaller because the calendar system time is
modified by an administrator.
time() offers no monotonicity guarantee and hence is exposed to time
modification of the system.
The begin and close timestamps are strictly used in the name generation
of the chunk/archives. Given the current usage of these timestamps,
a monotonicity violation should not be a fatal error. Name uniqueness is
provided by the chunk name suffix (auto increment).
Solution
========
Do not enforce monotonicity for the begin and close timestamps but warn
on unexpected return (begin > close).
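The resulting behaviour can be sketched as follows: the close timestamp is always recorded, and an out-of-order value only produces a warning. `struct chunk_sketch` and `chunk_set_close()` are hypothetical names used for illustration, not the trace-chunk.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical, reduced view of a trace chunk's timestamps. */
struct chunk_sketch {
	time_t creation;
	time_t close;
	bool close_is_set;
};

/*
 * Record the close timestamp unconditionally.
 * Returns true when a non-monotonic timestamp was detected (warning emitted).
 */
bool chunk_set_close(struct chunk_sketch *chunk, time_t close_ts)
{
	bool warned = false;

	assert(chunk);

	if (close_ts < chunk->creation) {
		/* time() can go backwards if the system clock is changed. */
		fprintf(stderr,
				"Warning: close timestamp is before creation timestamp\n");
		warned = true;
	}

	chunk->close = close_ts;
	chunk->close_is_set = true;
	return warned;
}
```

Since the close timestamp is now always set, an assertion such as the one seen in the backtrace above can no longer be triggered by a clock adjustment alone.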
Known drawbacks
=========
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic4b17285d150358d1569d6821c451c243e64e9a1
Francis Deslauriers [Tue, 5 May 2020 16:27:52 +0000 (12:27 -0400)]
Tests: test_exclusion: exclusion after tracing active
This testcase tests for a UST bug where exclusions added while tracing
is active do not exclude the undesired events.
This UST bug was fixed by the following commit:
commit de713d8a77cbd77e60f58537b0fc222f98fde395
Author: Francis Deslauriers <francis.deslauriers@efficios.com>
Date: Tue May 5 11:51:58 2020 -0400
Fix: event probes attached before event enabled
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6d896682d5f5e16582ab195c6f4d6946de394843
Depends-on: lttng-ust: Id984f266d976f346b001db81cd8a2b74965b5ef2
Francis Deslauriers [Tue, 5 May 2020 16:19:13 +0000 (12:19 -0400)]
Tests: `gen-ust-nevents`: add syncpoints
Adds two sync points:
`--sync-in-main`: create a file when the `gen-ust-nevents` app is in
main,
`--sync-before-first-event`: wait on a file before starting to
generate any events.
Those two sync points allow for tests to do work when the testapp has
reached main BUT before any events are generated.
This is useful to perform actions on a UST channel once tracing has
started and is active on a particular app.
For example, we want to test a scenario where events are enabled once an
app is already generating other events.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id501c8b373e1a9b43aad6caef11672fe6c30a55a
Francis Deslauriers [Thu, 14 May 2020 18:56:05 +0000 (14:56 -0400)]
Tests: accept built-in kernel modules
When validating that the kernel tracer is available on the system also
consider that it might be built directly in the kernel.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2b0e1767b8a6d561ad94ba38546cb183d9e98a95
Jonathan Rajotte [Mon, 11 May 2020 18:21:51 +0000 (14:21 -0400)]
API: missing includes in lttng.h
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8ffc15f1ea855cdc6e08da6387936406ce9f7d75
Jonathan Rajotte [Mon, 11 May 2020 18:03:34 +0000 (14:03 -0400)]
API: missing clear and clear-handle includes in lttng.h
Refs: #1266
Reported-by: Shuo Yang <shuoyang@didiglobal.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia933831e806e51a071f05b2c0425799f2f6410cf
Jonathan Rajotte [Mon, 11 May 2020 18:02:00 +0000 (14:02 -0400)]
API: sort lttng.h includes
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I59699d6252c8b840cc3dc80316e0462c9420b465
Jonathan Rajotte [Mon, 11 May 2020 13:57:53 +0000 (09:57 -0400)]
Fix: API: missing end brace for C++ linkage specification.
Fixes: #1266
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia8f2b4b55acc5770297d20df1b95575cf7fa480a
Jérémie Galarneau [Tue, 5 May 2020 17:49:37 +0000 (13:49 -0400)]
README.md: fix typos in component descriptions
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I569455d5011e46cda7046d0aa0d6c507f461677d
Francis Deslauriers [Mon, 4 May 2020 19:59:22 +0000 (15:59 -0400)]
Fix: tests: `-Wstringop-overflow` warning
I get the following warning when compiling with `-Wall -Werror` with gcc
9.3.0:
In file included from ../../src/common/macros.h:15,
from ../../src/common/lttng-kernel.h:14,
from ../../src/bin/lttng-sessiond/trace-kernel.h:14,
from test_kernel_data.c:16:
In function ‘lttng_strnlen’,
inlined from ‘lttng_strncpy’ at ../../src/common/macros.h:120:6,
inlined from ‘test_create_kernel_event’ at test_kernel_data.c:136:2:
../../src/common/compat/string.h:28:8: error: ‘memchr’ reading 256 bytes from a region of size 11 [-Werror=stringop-overflow=]
28 | end = memchr(str, 0, max);
| ^~~~~~~~~~~~~~~~~~~
Fix this warning by using the RANDOM_STRING_LEN value as the max number
of bytes to copy.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: I61752ee17163c4d642aad21b296c0fc4fad5b7a6
(cherry picked from commit 1dd622b10db0821d77490c937caee80c65332f14)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 30 Apr 2020 16:33:51 +0000 (12:33 -0400)]
Improve README.md
Changes:
* Harmonize voice and style.
* Add non-breaking spaces and hyphens where missing.
* Add many internal and external links.
* Add `lttng-crash` and the Python bindings to the list of components.
* Use `≥` instead of `>=` (welcome to 2020).
* Do not specify why you need to configure with `--disable-epoll` to use
a Linux kernel < 2.6.27: I don't think the end user needs to know
about poll() vs. epoll().
* Do not mention Mathieu Desnoyers and Paul E. McKenney (this is a
README, not the `THANKS` file).
* Mention Babeltrace 2 instead of Babeltrace.
* Use `https://babeltrace.org/` instead of
`https://lttng.org/babeltrace`.
* Add kmod to the list of optional dependencies.
* Specify that you can use `--enable-embedded-help` at build
configuration time if you don't plan to have a manual pager installed
on your system.
* Don't mention the known GNU gold bug
(`http://sourceware.org/bugzilla/show_bug.cgi?id=11317`).
Again, the end user only cares about which minimal version of the tool
to use, not why.
* Use clear, ordered build steps.
For the configuration step, specify which options are available and
what they change.
* Remove the link to `doc/quickstart.txt`.
Link to the LTTng Documentation and online manual pages instead, which
are more complete.
* Remove the link to `doc/streaming-howto.txt`: the LTTng Documentation
and online manual pages cover this topic.
* Change the "Contact" section for the "Community" section and put all
the community/communication links in there.
* Remove the "Package contents" section: end users do not care about
this and we're not keeping it updated as we amend the project's source
anyway.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: I4e4c0b183d9a18243db23cde3edb608e32c5e30e
(cherry picked from commit f8bd3a12204c10743dfbc456d363e30894eddfc5)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Christophe Bedard [Wed, 8 Apr 2020 21:04:48 +0000 (17:04 -0400)]
Docs: fix comment typo in lttng-error.h
Signed-off-by: Christophe Bedard <bedard.christophe@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I735b60dd072d717e5f8ad4dd4420e37187e05de4
Jérémie Galarneau [Tue, 21 Apr 2020 21:08:10 +0000 (17:08 -0400)]
Fix: sessiond: sessiond and agent deadlock on destroy
Observed issue
--------------
While running the out-of-tree java agent tests [1], the session daemon
and agent often end up in a deadlock.
Attaching gdb to the session daemon, we can see that two threads are
blocked in an intriguing state.
Thread 13 (Thread 0x7f89027fc700 (LWP 9636)):
#0 0x00007f891e81a4cf in __lll_lock_wait () from /usr/lib/libpthread.so.0
#1 0x00007f891e812e03 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
#2 0x000055637f1fbd92 in session_lock_list () at session.c:156
#3 0x000055637f25dc47 in update_agent_app (app=0x7f88ec003480) at agent-thread.c:56
#4 0x000055637f25ec0a in thread_agent_management (data=0x556380cd2400) at agent-thread.c:426
#5 0x000055637f22fb3a in launch_thread (data=0x556380cd24a0) at thread.c:65
#6 0x00007f891e81046f in start_thread () from /usr/lib/libpthread.so.0
#7 0x00007f891e7203d3 in clone () from /usr/lib/libc.so.6
Thread 8 (Thread 0x7f8919309700 (LWP 9631)):
#0 0x00007f891e81b44d in recvmsg () from /usr/lib/libpthread.so.0
#1 0x000055637f267847 in lttcomm_recvmsg_inet_sock (sock=0x7f88ec0033c0, buf=0x7f89192f5d5c, len=4, flags=0) at inet.c:367
#2 0x000055637f2146c6 in recv_reply (sock=0x7f88ec0033c0, buf=0x7f89192f5d5c, size=4) at agent.c:275
#3 0x000055637f215202 in app_context_op (app=0x7f88ec003400, ctx=0x7f8908020900, cmd=AGENT_CMD_APP_CTX_DISABLE) at agent.c:552
#4 0x000055637f215c2d in disable_context (ctx=0x7f8908020900, domain=LTTNG_DOMAIN_JUL) at agent.c:841
#5 0x000055637f217480 in agent_destroy (agt=0x7f890801dc20) at agent.c:1326
#6 0x000055637f243448 in trace_ust_destroy_session (session=0x7f8908004010) at trace-ust.c:1408
#7 0x000055637f1fd775 in session_release (ref=0x7f8908001e70) at session.c:873
#8 0x000055637f1fb9ac in urcu_ref_put (ref=0x7f8908001e70, release=0x55637f1fd62a <session_release>) at /usr/include/urcu/ref.h:68
#9 0x000055637f1fdad2 in session_put (session=0x7f8908000d10) at session.c:942
#10 0x000055637f2369e6 in process_client_msg (cmd_ctx=0x7f890800e6e0, sock=0x7f8919308560, sock_error=0x7f8919308564) at client.c:2102
#11 0x000055637f2375ab in thread_manage_clients (data=0x556380cd1840) at client.c:2347
#12 0x000055637f22fb3a in launch_thread (data=0x556380cd18b0) at thread.c:65
#13 0x00007f891e81046f in start_thread () from /usr/lib/libpthread.so.0
#14 0x00007f891e7203d3 in clone () from /usr/lib/libc.so.6
T8 is holding the session list lock while the cmd_destroy_session
command is being processed. More specifically, it is attempting
to destroy an "agent_context" by communicating with an "agent"
application.
Meanwhile, T13 is still registering that same "agent" application.
Cause
-----
The deadlock itself is pretty simple to understand.
The "agent thread" (T13) has the responsibility of accepting new agent
application connections. When such a connection occurs, the thread
creates a new `agent_app` instance and sends the current sessions'
configuration (i.e. their event rules and contexts) to the agent
application. When that "update" is complete, a "registration done"
message is sent to the new agent application.
From the stacktrace above, we can see that T13 is attempting to update
the agent application with its initial configuration, but it is
blocked on the acquisition of the session list lock. The application's
agent is also blocked since it is waiting for the "registration done"
message before allowing tracing to proceed (not shown here, but seen
in the test logs).
Meanwhile, T8 is holding the session list lock while destroying a
session. This is expected as all client commands are executed with
this lock held. It is, amongst other reasons, used to serialize
changes to the sessions' configuration and configuration updates sent
to the tracers (i.e. because new apps appear or to keep existing
tracers in sync with the users' session configuration).
The question becomes: why is T8 tearing down an application that is
not yet registered?
First, inspecting `agent_app` immediately shows that this structure
has no built-in synchronization mechanism. Therefore, the fact that
two threads are accessing it at the same time raises a big red flag.
Speculating on the intentions of the original design, my intuition is
that the "agent_management" thread's role is limited to instantiating
an `agent_app` and synchronizing it with the various sessions'
configuration. Once that synchronization is performed, the agent
application should be published and never accessed again by the "agent
thread".
Configuration updates (i.e. new event rules, contexts) are then sent
synchronously as they are requested by a client in the context of the
client thread. Those updates are performed while holding the session
list lock.
Hence, there is only one thread that should manipulate the agent
application at any given time making an explicit `agent_app` lock
unnecessary.
Overall, this would echo what is done when a 'user space tracer'
application registers to the session daemon (see dispatch.c:368).
Evidently this isn't what is happening here.
The agent thread creates the `agent_app`, publishes it, and then
performs an "agent app update" (sending the configuration) while
holding the session list lock. This means that there is a window where
an agent application is visible to the other threads, yet has not been
properly registered.
Solution
--------
The acquisition of the session list lock is moved outside of
update_agent_app() to allow the "agent thread" to hold the session
list lock during the "configuration update" phase of the agent
application registration.
Essentially, the sequence of operations changes from:
- Agent tcp connection established
- call handle_registration()
- agent version check
- allocation of agent_app instance
- new agent_app is published through the global agent_apps_ht_by_sock
hashtable
***
it is now reachable by all other threads without any form of
exclusivity synchronization.
***
- update_agent_app
- acquire session list lock
- iterate over sessions
- send configuration
- release session list lock
- send registration done
to:
- Agent tcp connection established
- call accept_agent_registration()
- agent version check
- allocation of agent_app instance
- acquire session list lock
- update_agent_app
- iterate over sessions
- send configuration
- send registration done
- new agent_app is published through the global agent_apps_ht_by_sock
hashtable
- release session list lock
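The new ordering can be sketched as below. The lock, the structure and the function are hypothetical stand-ins (the real code sends messages over a socket and publishes through a hash table); the point is only that configuration, "registration done" and publication all happen within one critical section, with publication last.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the session list lock. */
static pthread_mutex_t session_list_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, reduced view of an agent application's registration state. */
struct agent_app_sketch {
	bool configured;
	bool registration_done;
	bool published;
};

void register_agent_app_sketch(struct agent_app_sketch *app)
{
	assert(app);

	pthread_mutex_lock(&session_list_lock_sketch);

	/* update_agent_app: iterate over sessions, send configuration. */
	app->configured = true;

	/* Send "registration done" to the agent application. */
	app->registration_done = true;

	/*
	 * Publish last: other threads can only reach the app once it is
	 * fully registered, and only after the lock is released.
	 */
	assert(app->configured && app->registration_done);
	app->published = true;

	pthread_mutex_unlock(&session_list_lock_sketch);
}
```

Any thread that takes the same lock before looking the application up therefore never observes a partially registered `agent_app`, which is the window the old sequence left open.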
Links
-----
[1] https://github.com/lttng/lttng-ust-java-tests
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia34c5ad81ed3936acbca756b425423e0cb8dbddf
Jérémie Galarneau [Wed, 22 Apr 2020 16:25:20 +0000 (12:25 -0400)]
relayd: clean-up: remove unused DATETIME_STRING_SIZE definition
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I79d6fd63537792b3031075bb102daddce5db1bde
Jérémie Galarneau [Fri, 17 Apr 2020 19:49:52 +0000 (15:49 -0400)]
Fix: load: incomplete error handling for load_session_from_file
This fix is adapted from a fix against the stable-2.11 branch. The
commit message of the stable-2.11 branch follows.
An equivalent fix was already in place in `load_session_from_path()`,
but the same problem as the stable-2.11 branch is present in
`load_session_from_file()`.
Original message:
Observed issue
==============
lttng-ivc test fails to fail.
test_save_load_blocking_timeout[lttng-tools-2.12-lttng-tools-2.11-False]
Here we load an XML file created by lttng-tools-2.12 and try to load it using
lttng-tools 2.11. We expect this to fail on the load.
The command reports an error on stderr, but the command's return code
is zero.
From lttng-ivc test runtime.log:
Command #0
Return value: 0
Command: lttng load --input-path=/home/joraj/lttng/lttng-ivc/.tox/py3/tmp/test_save_load_blocking_timeou0/save_load saved_trace
STDOUT:
Session saved_trace has been loaded successfully
STDERR:
XML Error: Element 'process_attr_trackers': This element is not expected.
Error: Session configuration file validation failed
Cause
=====
The error coming from load_session_from_file is not handled correctly.
Solution
========
Rework error handling in load_session_from_path and
load_session_from_file.
LTTNG_ERR_LOAD_SESSION_NOENT is NOT an error when session_name is
specified in load_session_from_path. In this scenario, we are actively
looking for the configuration of the session.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Ic68c253aa194bf8ab72c3c271f10d443118bdeee
Simon Marchi [Fri, 29 Nov 2019 21:48:16 +0000 (16:48 -0500)]
actions: improve logging in lttng_action_create_from_buffer
Small improvements in the logging messages. Use "Create from buffer"
instead of "Deserializing", since that's the term used in the function
names.
Change-Id: I7e688df766602cfb73bc40d87d224591c0f29534
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Fri, 29 Nov 2019 21:46:05 +0000 (16:46 -0500)]
actions: introduce lttng_action_init
This function is to be used to initialize the common portion of all
action structures. I have updated the sole currently existing action
(notify), but the function will also be used in subsequent patches.
Change-Id: I4e42554c42a4e6a5ef2da292a6dfeb708d72a602
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Thu, 28 Nov 2019 22:51:01 +0000 (17:51 -0500)]
actions: introduce function typedef for creating actions from buffer
The only existing action type, LTTNG_ACTION_TYPE_NOTIFY, does not
require deserializing additional data (on top of the action type field),
so lttng_action_create_from_buffer handles it in a trivial way.
Upcoming patches will introduce new action types which will need to
deserialize some additional data. This patch prepares
lttng_action_create_from_buffer for that by making it call an
action-specific function for deserializing this additional data. The
changes are inspired by what lttng_condition_create_from_buffer does.
No functional changes intended.
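As an illustration only, the dispatch described above can be sketched as a table of per-type creation callbacks. Every name below (`action_type`, `ACTION_TYPE_WITH_PAYLOAD`, `create_from_buffer_cb`, `action_create_from_buffer`) and the one-byte wire layout are hypothetical; this is not the lttng-tools implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action types; the first byte of the buffer holds the type. */
enum action_type {
	ACTION_TYPE_NOTIFY = 0,
	ACTION_TYPE_WITH_PAYLOAD = 1,
	ACTION_TYPE_COUNT,
};

struct action {
	enum action_type type;
	unsigned char payload;
};

/* Per-type callback: returns the number of bytes consumed, or -1 on error. */
typedef long (*create_from_buffer_cb)(const unsigned char *data, size_t len,
		struct action *out);

/* Nothing to deserialize beyond the type field. */
static long notify_create(const unsigned char *data, size_t len,
		struct action *out)
{
	(void) data;
	(void) len;
	out->payload = 0;
	return 0;
}

/* Deserializes one additional byte of type-specific data. */
static long with_payload_create(const unsigned char *data, size_t len,
		struct action *out)
{
	if (len < 1) {
		return -1;
	}
	out->payload = data[0];
	return 1;
}

static const create_from_buffer_cb create_cbs[ACTION_TYPE_COUNT] = {
	[ACTION_TYPE_NOTIFY] = notify_create,
	[ACTION_TYPE_WITH_PAYLOAD] = with_payload_create,
};

/* Reads the type field, then delegates to the type-specific callback. */
long action_create_from_buffer(const unsigned char *buf, size_t len,
		struct action *out)
{
	long consumed;

	assert(buf != NULL && out != NULL);
	if (len < 1 || buf[0] >= ACTION_TYPE_COUNT) {
		return -1;
	}
	out->type = (enum action_type) buf[0];
	consumed = create_cbs[out->type](buf + 1, len - 1, out);
	return consumed < 0 ? -1 : consumed + 1;
}
```

Adding a new action type in this sketch only requires a new callback and a new table entry; the common function stays unchanged.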
Change-Id: I469a67b744aa2cf7a45d7d970f1bbee6a994a2a9
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Fri, 29 Nov 2019 21:42:24 +0000 (16:42 -0500)]
buffer-view: introduce lttng_buffer_view_contains_string
This function can be used to safely validate that a string in a buffer
view has the length we expect. The goal is to avoid doing any reads
outside the buffer view, whatever the input is, considering that the
buffer and advertised length of the string are untrusted data.
It will be used by subsequent patches to deserialize strings from
received buffers.
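A minimal sketch of this kind of validation follows. The `byte_view` structure, the offset-based signature and the convention that the expected length includes the terminating NUL are all assumptions made for illustration; the actual lttng_buffer_view_contains_string may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a buffer view: a bounded window over bytes. */
struct byte_view {
	const char *data;
	size_t size;
};

/*
 * Returns true if the view holds, at `offset`, a string of exactly
 * `expected_len` bytes including its NUL terminator. Never reads outside
 * the view, whatever the (untrusted) offset and length are.
 */
bool view_contains_string(const struct byte_view *view, size_t offset,
		size_t expected_len)
{
	const char *str;
	const char *nul;

	assert(view != NULL && view->data != NULL);

	/* A valid string has at least its terminator. */
	if (expected_len == 0) {
		return false;
	}
	/* The advertised range must fit in the view (overflow-safe check). */
	if (offset > view->size || expected_len > view->size - offset) {
		return false;
	}
	str = view->data + offset;
	/* The first NUL must be the last byte of the advertised range. */
	nul = memchr(str, '\0', expected_len);
	return nul == str + expected_len - 1;
}
```

Using `memchr` bounded by the advertised length keeps the scan inside the range that was just validated against the view size.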
Change-Id: I8afc4b6f9334e035e0ef02cb96b157df59d8107d
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 25 Nov 2019 19:20:31 +0000 (14:20 -0500)]
Move actions source files to src/common/actions directory
Since more actions will be added, group them under an "actions"
directory. The files to be added are expected to have some pretty
generic/overloaded names, such as start-session.c and snapshot.c, so
having them under the actions directory makes it clear that they
implement the action described by the name.
Change-Id: Ia47160dd75531eb9bcf13f875f4bc6caa2391d7b
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 1 Apr 2020 21:48:09 +0000 (17:48 -0400)]
ust registry: Refactor representation of nested types
Those allow LTTng-UST to internally represent nested types properly,
and to serialize them when sending them over to the session
daemon. However, for now, the session daemon only accepts arrays and
sequences which contain an integer type, which is the only use-case
emitted by lttng-ust. Wait until we have the nested types wired up
within lttng-ust before accepting and supporting nested types so it
can be tested.
Given the size of this change to ust-metadata.c, use this opportunity
to make sure that every
if (ret)
...
in this file is changed to conform to lttng-tools coding style:
if (ret) {
...
}
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3ac44fc3a993f73d1fb08056331ad6fed7059981
Depends-on: lttng-ust: I45bb0886c5ac970c3ca75dbefcb94adb50801294
Simon Marchi [Fri, 3 Apr 2020 17:52:41 +0000 (13:52 -0400)]
common: keep libcommon_la_SOURCES list sorted
Change-Id: I8d263f4d876047074cab3169f929fd255536d702
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Fri, 3 Apr 2020 20:14:05 +0000 (16:14 -0400)]
lttng-crash(1): document the command's positional argument
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: I919329f2e072b14ad94181e98e49cb8840991225
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Fri, 3 Apr 2020 02:48:57 +0000 (22:48 -0400)]
lttng-sessiond(8): append missing argument to short options
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: Ifab424acfa6dedbe43167e69e2c2fb70f3ecf9a7
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Fri, 3 Apr 2020 02:46:40 +0000 (22:46 -0400)]
lttng-sessiond(8): sort the option list by long option name
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: Icaeabbb95fa650c059f208b1808f0b2cf3202117
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Fri, 3 Apr 2020 02:45:57 +0000 (22:45 -0400)]
lttng-relayd(8): mention the `--config` option
This patch does not document the configuration file's format. This work
is reserved for a future patch.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: If36dcaf244e194b58c1e6cacbe0e771176ac2045
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 7 Apr 2020 04:46:39 +0000 (00:46 -0400)]
Fix: lttng-load: support legacy PID tracker specification
The 2.12 release changes the way tracked process attributes are
expressed in the MI and save/restore formats. While the MI schema was
bumped to 4.0, the save/restore schema only undergoes a minor bump to
accommodate existing users.
The original commit introducing this change justified the breaking
change on the grounds that saved PIDs are fairly unlikely. However, even the
'INCLUDE_ALL' policy specifies a 'trackers' node, which no longer
exists in the schema, making all existing configurations incompatible.
A legacy load path is introduced to support the former PID tracker
serialization format and preserve the compatibility with existing
configurations.
Configurations generated by the 2.11 releases are included to test
this new legacy path.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ied6532c42cb2d1a5c9e39785cc4e47aaf89b8288
Jérémie Galarneau [Tue, 7 Apr 2020 04:14:00 +0000 (00:14 -0400)]
Fix: sessiond: invalid session configuration on EXCLUDE_ALL policy
Saving a session with a process attribute tracker that uses the
`EXCLUDE_ALL` policy results in an invalid session configuration.
Currently, a tracker of the following form is produced:
<process_attr_values>
<vpid/>
</process_attr_values>
This is invalid as per the XSD as 'vpid' is not a list; it is an
individual tracked attribute.
The appropriate '<process_attr_values/>' empty node is now produced.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia31e276fe7246a89e72d6808c9ed960fb04f1b3a
Jérémie Galarneau [Mon, 6 Apr 2020 16:39:17 +0000 (12:39 -0400)]
Fix: relayd: unchecked allocation result of unlinked file pool
`pool` is not checked for NULL after its allocation. Error out
if the allocation fails.
In lttng_unlinked_file_pool_create: Return value of function which
returns null is dereferenced without checking (CWE-476)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2a7717701cf3d11de557b9ecdc6609c1f6a1fd6f
Francis Deslauriers [Fri, 3 Apr 2020 18:57:55 +0000 (14:57 -0400)]
lttng-crash: use `spawn_viewer()` to launch trace viewer
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: I59f8d6c1189b3b3b4cfd0704ff2c8eb22e6df44f
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 3 Apr 2020 18:21:25 +0000 (14:21 -0400)]
lttng-view: clean-up: move `--viewer` code to specific file
This code will be reused by the lttng-crash utility.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Ide72ad08577d55bbf2f7833d46e734b8a680c9d2
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Thu, 2 Apr 2020 21:27:03 +0000 (17:27 -0400)]
lttng-crash: clean-up: fix alignment of format string
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: I893a8ab7cbc0ad35a4ebabcf5017dbc83be1efc3
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Thu, 2 Apr 2020 20:16:59 +0000 (16:16 -0400)]
lttng-view: clean-up: rename `parse_options()` -> `parse_viewer_option()`
This seems more representative of what this function does.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: I78e4282a13048f4a35df4ddeae91b89969b73941
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Fri, 27 Mar 2020 01:11:35 +0000 (21:11 -0400)]
Fix: python: suppress -Wmissing-prototypes warning with SWIG 3.0.10
SWIG 3.0.10 is used on SLES 12. It produces this warning:
CC lttng_wrap.lo
lttng_wrap.c:3411:1: error: no previous prototype for ‘SWIG_strnlen’ [-Werror=missing-prototypes]
3411 | SWIG_strnlen(const char* s, size_t maxlen)
| ^~~~~~
SWIG_strnlen is defined like this:
size_t
SWIG_strnlen(const char* s, size_t maxlen)
{
...
}
Since the function is not static and has no previous declaration, the
diagnostic is emitted. We can see that they have fixed it in SWIG
3.0.12, where the same function is defined as:
SWIGINTERN size_t
SWIG_strnlen(const char* s, size_t maxlen)
{
...
}
SWIGINTERN is defined as static.
We can work around the warning by adding our own declaration for
SWIG_strnlen in lttng.i(.in).
Tested by building with SWIG 3.0.10, 3.0.11 and 3.0.12.
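The effect of a prior declaration can be illustrated in plain C. The function name `bounded_strlen` is hypothetical and stands in for a generated, non-static function; this is not the content of lttng.i.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Prior declaration: with -Wmissing-prototypes, a non-static function
 * definition must be preceded by a prototype such as this one.
 */
size_t bounded_strlen(const char *s, size_t maxlen);

/* Non-static definition, similar in shape to what a generator could emit. */
size_t bounded_strlen(const char *s, size_t maxlen)
{
	size_t len = 0;

	assert(s != NULL);
	while (len < maxlen && s[len] != '\0') {
		len++;
	}
	return len;
}
```

Removing the declaration leaves the behaviour unchanged but brings back the diagnostic, which is why declaring the function ourselves is enough when the generated code cannot be modified.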
Change-Id: I7d9d93cf5fc04f2044a47ad384e16d726e27f72f
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Thu, 2 Apr 2020 20:04:49 +0000 (16:04 -0400)]
lttng-view: clean-up: use singular form for type name
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Ibeb6363def3579761e9a5a4cb76f7c6ca22d6146
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Thu, 2 Apr 2020 20:00:14 +0000 (16:00 -0400)]
lttng-view: clean-up: remove references to LTTv
Those comments are not relevant anymore and were missed by the following
clean-up commit:
commit 6dd26587e926671cdec2545b7d3db74cbd6a7cd8
Author: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Date: Thu Jan 30 12:01:57 2020 -0500
lttng-view: clean-up: remove commented and unused references to lttv
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Ie68614c8160f62a2c045153b377f036a4399ad87
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 3 Apr 2020 19:05:37 +0000 (15:05 -0400)]
Fix: relayd: harmonize path format in backward-compat mode
Observed issue
==============
Currently, the relay daemon produces the following path formats
depending on whether a trace path is provided, the version of the
session daemon peer, and the grouping option specified when launching
the relay daemon.
Hostname grouping, no custom trace path
pre 2.11: $BASE/$HOSTNAME/$SESSION-$DATETIME
2.12: $BASE/$HOSTNAME/$SESSION-$DATETIME
Hostname grouping, custom trace path
pre 2.11: $BASE/$HOSTNAME/$TRACEPATH
2.12: $BASE/$HOSTNAME/$TRACEPATH
Tracing session grouping, no custom trace path
pre 2.11: $BASE/$SESSION/$HOSTNAME-$DATETIME
2.12: $BASE/$SESSION/$HOSTNAME-$DATETIME
Tracing session grouping, custom trace path
pre 2.11: $BASE/$SESSION/$HOSTNAME/$TRACEPATH
2.12: $BASE/$SESSION/$HOSTNAME-$DATETIME/$TRACEPATH
As you can see, there is a single case where the format
diverges based on the version of the session daemon.
Cause
=====
Pre-2.11 session daemons do not transmit a session creation time when
a TRACEPATH is specified as part of the streaming url (e.g.
`lttng create my_session --set-url net://localhost/a_path`).
Hence, the backward compatibility path formatting code does not
insert a "DATETIME" string in the resulting path.
Solution
========
The relay daemon samples the time when it creates its session and that
time is formatted into the DATETIME representation if no DATETIME is
present in the path provided by pre-2.11 peers.
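As a rough sketch of the resulting layout for the "tracing session grouping, custom trace path" case, the path can be assembled from a sampled creation time as follows. The function name, its parameters and the `%Y%m%d-%H%M%S` DATETIME layout are assumptions made for illustration and are not taken from the relay daemon's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Formats $BASE/$SESSION/$HOSTNAME-$DATETIME/$TRACEPATH into `out`.
 * `creation_time` is the time sampled when the relay session was created.
 * Returns 0 on success, -1 if the result does not fit or formatting fails.
 */
int format_session_path(char *out, size_t out_size, const char *base,
		const char *session, const char *hostname,
		const struct tm *creation_time, const char *tracepath)
{
	char datetime[32];
	int ret;

	assert(out && base && session && hostname && creation_time && tracepath);

	/* Assumed DATETIME representation: YYYYMMDD-HHMMSS. */
	if (strftime(datetime, sizeof(datetime), "%Y%m%d-%H%M%S",
			creation_time) == 0) {
		return -1;
	}
	ret = snprintf(out, out_size, "%s/%s/%s-%s/%s", base, session,
			hostname, datetime, tracepath);
	return (ret < 0 || (size_t) ret >= out_size) ? -1 : 0;
}
```

Since the time is sampled on the relay daemon side, two peers creating "the same" session at slightly different moments would get different DATETIME components, which matches the drawback discussed below.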
Drawbacks
=========
Sampling the relay session creation time will not yield the exact same
behaviour as what a 2.11+ peer would produce, but it is a reasonable
approximation for most use-cases.
Users depending on this time being the exact same as that sampled by
the session daemon will need to adapt tools anyhow if they use the new
--group-output-by-session option, so the change doesn't introduce more
problems.
This behaviour can be surprising when snapshots are streamed by
pre-2.11 peers as the session creation DATETIME will be different for
all snapshots. This is not ideal, but still less jarring than getting
a completely different path format depending on a peer's version.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3316aa35a34985e83ae759851af3a899b0011789
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 3 Apr 2020 14:43:18 +0000 (10:43 -0400)]
Bump session.xsd version to 2.12
The session schema has changed to include new tracked process
attributes. This change is technically non-backward compatible if
tracked PIDs were saved.
As it is extremely unlikely that anyone saves PIDs to a session
configuration to load it in another lttng-tools version, the schema
does not allow the pre-2.12 PID tracking specification to be
expressed. In my opinion, this is unlikely enough not to warrant a
major version bump.
This can be changed if someone really encounters this in the wild or
has a legitimate use case for going through this trouble.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8fd8dac02065b09c3e3c75b38699caea3223ce50
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 2 Apr 2020 19:32:59 +0000 (15:32 -0400)]
lttng-relayd(8): normalize style and add details
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: Ifd2e90686bf9955f0c68fe158c60344d346814d0
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 2 Apr 2020 19:03:15 +0000 (15:03 -0400)]
doc/man: refer to Babeltrace 2 instead of Babeltrace 1
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: Icafd69f28304b14e49e8cf8e21d2c6ebe3e319b6
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 2 Apr 2020 18:44:36 +0000 (14:44 -0400)]
lttng-clear(1): normalize style and add details
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: I50709df27a04bbb5f300c734b9a85ae3a7c311c2
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Tue, 31 Mar 2020 03:24:24 +0000 (23:24 -0400)]
Fix: filter-grammar-test: add dependencies between steps
If the user specifies only the -B switch of the filter-grammar-test
program, the program will try to print the bytecode without having
generated it first, leading to a segfault. Similarly, if the user
specifies only the -b switch, the program will try to generate the
bytecode from the IR, without having generated the IR first, also
leading to a segfault.
This patch adds some kind of dependency between the steps, such that if
the user is interested in a particular step (let's say, print the
bytecode), all the required steps will also be done.
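The dependency chain can be sketched as follows. The `steps` structure and `resolve_step_dependencies` are hypothetical names used for illustration; only the ordering (IR, then bytecode generation, then bytecode printing) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical set of steps requested on the command line. */
struct steps {
	bool generate_ir;
	bool generate_bytecode;
	bool print_bytecode;
};

/* Enables every step that a requested step depends on. */
void resolve_step_dependencies(struct steps *s)
{
	assert(s != NULL);

	/* Printing the bytecode requires generating it. */
	if (s->print_bytecode) {
		s->generate_bytecode = true;
	}
	/* Generating the bytecode requires the IR. */
	if (s->generate_bytecode) {
		s->generate_ir = true;
	}
}
```

Checking the later step first lets a single pass propagate the requirement all the way down the chain.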
Change-Id: Idc365a12b992a950566f227759fd3223cf5c5fd8
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 2 Apr 2020 19:47:30 +0000 (15:47 -0400)]
Fix: relayd: assertion fails on creation of session by peer < 2.11
Observed issue
==============
An assertion that a chunk has no active directory handle fails when
creating an anonymous chunk. More specifically, this occurs when
associating an fd tracker to the newly created anonymous trace chunk.
This occurs when a session is created by a peer that is older than
2.11.
Cause
=====
Trace chunks that should monitor their file descriptors with a file
descriptor tracker must be associated with the tracker before any
other operation occurs on the chunk. This is to ensure that "raw" file
descriptors are not created when they were meant to be tracked.
Here, the credentials and session output directory are set before the
file descriptor tracker was provided to the anonymous chunk which is a
breach of the API contract (enforced by the assert()).
Solution
========
Associate the fd tracker with the anonymous chunk immediately after its
creation, before setting its credentials and session output
directory. Moreover, a leak of the output_directory is prevented by not
setting it to NULL. The trace chunk acquires its own reference to the
output directory; ownership is not transferred to the trace chunk.
Note
====
The problem was introduced during the 2.12 release cycle (clear
feature); this doesn't need to be backported.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I92ca2f156f500dfe02e09f8b4783447e46710246
Jérémie Galarneau [Thu, 2 Apr 2020 18:08:12 +0000 (14:08 -0400)]
Fix: relayd: crash on creation of session by peer < 2.11
Observed issue
==============
A NULL pointer dereference occurs during the creation of
a session that is associated with a peer older than 2.11.
The resulting backtrace follows:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000564af45b755b in lttng_trace_chunk_set_as_owner (chunk=0x7f8ca8004730, session_output_directory=0x7f8ca8004680) at trace-chunk.c:1033
1033 if (chunk->path[0] != '\0') {
[Current thread is 1 (Thread 0x7f8cb808d700 (LWP 7300))]
#0 0x0000564af45b755b in lttng_trace_chunk_set_as_owner (chunk=0x7f8ca8004730, session_output_directory=0x7f8ca8004680) at trace-chunk.c:1033
#1 0x0000564af45a6a78 in session_set_anonymous_chunk (session=0x7f8ca8001380) at session.c:229
#2 session_create (session_name=<optimized out>, hostname=<optimized out>, base_path=<optimized out>, live_timer=<optimized out>, snapshot=<optimized out>,
sessiond_uuid=<optimized out>, id_sessiond=<optimized out>, current_chunk_id=<optimized out>, creation_time=<optimized out>, major=<optimized out>,
minor=<optimized out>, session_name_contains_creation_time=<optimized out>) at session.c:416
#3 0x0000564af459207e in relay_create_session (conn=0x7f8ca0000f60, payload=<optimized out>, recv_hdr=<optimized out>) at main.c:1428
#4 0x0000564af4594f12 in relay_process_control_command (payload=0x7f8cb808c940, header=0x7f8ca0001000, conn=0x7f8ca0000f60) at main.c:3218
#5 relay_process_control_receive_payload (conn=0x7f8ca0000f60) at main.c:3361
#6 0x0000564af45980b0 in relay_process_control (conn=0x7f8ca0000f60) at main.c:3478
#7 relay_thread_worker (data=<optimized out>) at main.c:3927
#8 0x00007f8cbba9a46f in start_thread () from /usr/lib/libpthread.so.0
#9 0x00007f8cbb9ca3d3 in clone () from /usr/lib/libc.so.6
Cause
=====
lttng_trace_chunk_set_as_owner() correctly handles the case
where a trace chunk has no output path, but expects the path
to be an empty string rather than being NULL.
This is not correct as an anonymous chunk, created in backward
compatibility mode when interacting with older peers, has no
path; the path is transmitted as part of the streams' attributes
upon their creation.
Solution
========
Simply check for a NULL pointer in the same place where the empty
chunk path string is created. The rest of the code in trace-chunk.c
doesn't assume that the chunk's path is non-NULL.
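A minimal sketch of the check, assuming a hypothetical `chunk` structure reduced to its path; the real structure and surrounding code in trace-chunk.c are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced trace chunk: anonymous chunks have a NULL path. */
struct chunk {
	const char *path;
};

/* True only when the chunk has a non-NULL, non-empty output path. */
bool chunk_has_output_path(const struct chunk *chunk)
{
	assert(chunk != NULL);
	/* Test for NULL before looking at the first character. */
	return chunk->path != NULL && chunk->path[0] != '\0';
}
```

The short-circuit `&&` guarantees that `path[0]` is never evaluated for an anonymous chunk.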
Note
====
The problem was introduced during the 2.12 release cycle (clear
feature); this doesn't need to be backported.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iaeb41e1648d61fbbe78d70b21191fd6d720900df
Jérémie Galarneau [Thu, 2 Apr 2020 04:57:38 +0000 (00:57 -0400)]
Fix: consumer: fallback to flush when flush empty is unsupported
Session destruction fails on older (<= 2.8) lttng-modules as the
flush_empty fails on the kernel streams during the quiet rotation.
Fallback to the regular flush as the semantics of regular rotations
are not expected here; we merely want to flush any pending data and
destroy the session.
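The fallback can be sketched as below. The `stream_ops` structure, the `OP_UNSUPPORTED` code and the two stand-in operations are hypothetical; how the consumer actually detects an unsupported flush_empty is not described here, so that part is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical code returned when the tracer lacks an operation. */
#define OP_UNSUPPORTED (-1)

/* Hypothetical per-stream operations. */
struct stream_ops {
	int (*flush_empty)(void *stream);
	int (*flush)(void *stream);
};

/* Tries flush_empty first and falls back to a regular flush if needed. */
int flush_for_destroy(const struct stream_ops *ops, void *stream)
{
	int ret;

	assert(ops != NULL && ops->flush_empty != NULL && ops->flush != NULL);

	ret = ops->flush_empty(stream);
	if (ret == OP_UNSUPPORTED) {
		/* Only pending data matters here; a regular flush suffices. */
		ret = ops->flush(stream);
	}
	return ret;
}

/* Stand-in for an older tracer that does not implement flush_empty. */
int legacy_flush_empty(void *stream)
{
	(void) stream;
	return OP_UNSUPPORTED;
}

/* Stand-in regular flush that counts how many times it was called. */
int counting_flush(void *stream)
{
	int *count = stream;

	(*count)++;
	return 0;
}
```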
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifdf8a4e60b55dbf582747d71f5c2485d24c11964
Jérémie Galarneau [Thu, 2 Apr 2020 04:45:21 +0000 (00:45 -0400)]
Fix: consumerd: incorrect clear logging statement
A logging statement was apparently copy-pasted from the rotation code
and not adapted for the consumer_clear_buffer command and would
indicate that a flush operation failed when a clear operation was
performed.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I909a391fbd8b8bf48c7481d394571ad19d7f6332
Jérémie Galarneau [Thu, 2 Apr 2020 04:25:11 +0000 (00:25 -0400)]
Fix: sessiond: error reported on session destruction for old modules
The session destruction command will return
-LTTNG_ERR_ROTATION_NOT_AVAILABLE_KERNEL when the kernel tracer
version does not support packet sequence numbers which prevents
rotations from being performed.
It is okay to not perform an implicit rotation in this case since we
know that no rotations have occurred during the session's lifetime (as
it is not supported). Thus, the client/library only needs to stop the
session, wait for pending data, and destroy the session.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibccf73f08eecb6431ea3cc358bf8dd6af3ba4427
Jérémie Galarneau [Thu, 2 Apr 2020 02:16:32 +0000 (22:16 -0400)]
Fix: sessiond: erroneous error code returned on rotation failure
`LTTNG_ERR_KERN_CONSUMER_FAIL` is returned by the kernel domain
rotation handling code. This code is associated with a failure to
launch the kernel consumer daemon which is not the case here.
The `LTTNG_ERR_ROTATION_FAIL_CONSUMER` is used instead and returned to
the client.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifdc9e46addcc4dc7ca2002d6dff5b2d7ac6c31f3
Jérémie Galarneau [Wed, 1 Apr 2020 22:34:59 +0000 (18:34 -0400)]
Fix: lttng-destroy: missing newline on session destruction message
The lost packets/discarded events statistics are printed on the same
line as the session destruction progress message when the session is
stopped as part of the `destroy` command.
This is a consequence of printing the statistics as they are
retrieved; the statistics must be fetched before the destruction,
but the progress indicator is still being printed.
The statistics output is now formatted to a buffer and printed
after the session's destruction has completed.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2105056b109774a57c83be3e30984038880c0fb7
Jérémie Galarneau [Wed, 1 Apr 2020 20:16:05 +0000 (16:16 -0400)]
relayd: clean-up: reference is repeated in comment
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2d963d5a3c0a2980eb745780b342b1fa9b14184d
Michael Jeanson [Wed, 1 Apr 2020 19:13:54 +0000 (15:13 -0400)]
Typo: 'Descritptor' -> 'Descriptor'
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Change-Id: I3b616777c9d39e23b84224cb1a6a92fa43fceb45
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Wed, 1 Apr 2020 19:09:22 +0000 (15:09 -0400)]
Typo: 'Accomodate' -> 'Accommodate'
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Change-Id: I1f5408d82546db7df14d4899d4d2c52ec1421b52
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 1 Apr 2020 14:51:56 +0000 (10:51 -0400)]
Clean-up: trace-ust comment still refers to only PID trackers
An application must meet all process attribute trackers restrictions
to be traced.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9d8cd00b95c7db70ebf4fe91d6aea60646e3c525
Jérémie Galarneau [Tue, 31 Mar 2020 02:10:36 +0000 (22:10 -0400)]
Fix: tracker: NULL pointer dereference after NULL check
value_view can be NULL and must thus be checked before use.
Moreover, the fix introduced in 1ad5cb59 is erroneous: the
function must validate that either:
- value is a 'name' type, value_view is not NULL, and its length is not 0,
- value is an integer and value_view does not contain more data.
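The two conditions listed above can be sketched as a single predicate. The `value_type` enumeration, the `view` structure and the function name are hypothetical and only mirror the wording of this message, not the actual process_attr_value_from_comm code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value categories: integral values or user/group names. */
enum value_type {
	VALUE_TYPE_INTEGER,
	VALUE_TYPE_NAME,
};

/* Hypothetical buffer view carrying the optional name payload. */
struct view {
	const char *data;
	size_t size;
};

/* `value_view` may be NULL; it is checked before any dereference. */
bool value_view_is_valid(enum value_type type, const struct view *value_view)
{
	const bool has_data = value_view != NULL && value_view->size != 0;

	assert(type == VALUE_TYPE_INTEGER || type == VALUE_TYPE_NAME);

	if (type == VALUE_TYPE_NAME) {
		/* A name must come with a non-empty payload. */
		return has_data;
	}
	/* An integer must not be followed by any payload. */
	return !has_data;
}
```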
In process_attr_value_from_comm: Pointer is checked against null but
then dereferenced anyway (CWE-476)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia130ef57e10118960f1023338b90f7a10d588ee2
Jérémie Galarneau [Fri, 27 Mar 2020 15:27:13 +0000 (11:27 -0400)]
Fix: sessiond: NULL pointer dereference after NULL check
The process attribute value deserialization allows the buffer view to
be NULL when the value's type is neither USER_NAME nor GROUP_NAME. This is
not checked when ensuring that no string is passed (len == 0) in the
case of integral values.
A NULL check is added to the condition.
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I343f747c325f739196284dadd3c407cfb4084268
Jérémie Galarneau [Fri, 27 Mar 2020 15:07:10 +0000 (11:07 -0400)]
Fix: sessiond: missing goto in error handler
The trace_ust inclusion set add/remove methods do not jump to the
end label after checking the `tracker` variable. This can result
in a NULL pointer dereference when an invalid process attribute
is specified.
The same problem appears in save_process_attr_trackers() and
process_attr_value_from_comm().
The missing jump (goto) is added in all cases.
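The pattern at stake can be sketched as follows. The `trackers` structure, `get_tracker` and `inclusion_set_add` are hypothetical names chosen for illustration; they are not the trace_ust functions themselves.

```c
#include <assert.h>
#include <stddef.h>

#define ATTR_COUNT 2

/* Hypothetical tracker: one per valid process attribute. */
struct tracker {
	int value_count;
};

struct trackers {
	struct tracker by_attr[ATTR_COUNT];
};

/* Returns NULL when the process attribute is invalid. */
static struct tracker *get_tracker(struct trackers *all, int attr)
{
	return (attr >= 0 && attr < ATTR_COUNT) ? &all->by_attr[attr] : NULL;
}

int inclusion_set_add(struct trackers *all, int attr)
{
	int ret = 0;
	struct tracker *tracker;

	assert(all != NULL);

	tracker = get_tracker(all, attr);
	if (!tracker) {
		ret = -1;
		/* Without this jump, the code below dereferences NULL. */
		goto end;
	}
	tracker->value_count++;
end:
	return ret;
}
```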
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I473e008e5330a4c3820c8ab7c57ce4f2961e79b2
Jérémie Galarneau [Fri, 27 Mar 2020 15:01:05 +0000 (11:01 -0400)]
Fix: sessiond: user/group name can be leaked on malformed command
process_attr_value_from_comm() can leak a copy of the user/group
name when the value type is erroneous. This is not reachable in
"normal" execution, but could be triggered by invalid "crafted"
lttng-ctl commands.
In process_attr_value_from_comm: Leak of memory or pointers to
system resources (CWE-404).
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7ef55c0743c954a93e3d27ce17e6478708b49437
Simon Marchi [Wed, 25 Mar 2020 22:40:02 +0000 (18:40 -0400)]
configure: add -Wmissing-declarations, -Wmissing-prototypes, and more
Here's the rationale for each:
- -Wmissing-declarations: Make sure the definition of a function can
"see" a corresponding declaration (usually in a header file), if it isn't static.
This makes sure that the declaration and definition don't go out of
sync, which can lead to hard-to-debug problems (because it still
builds, but the function doesn't receive what it thinks it receives).
On top of pointing out out-of-sync declarations, it can help point out
that a foo.c file misses including its header foo.h, or that a
function should actually be made static.
- -Wmissing-prototypes: makes sure that functions without parameters are
declared as `foo(void)` instead of `foo()`. In C, the former declares
a function that takes no parameters, whereas the latter declares a
function without specifying its parameters. The latter could be
called with any number of parameters, which is a recipe for confusion.
- -Wmissing-parameter-type, -Wold-style-definition,
-Wold-style-declarations, -Wstrict-prototypes: makes sure there's no
function declared with parameters without types specified, or using
the old style:
int foo(bar)
int bar;
{
...
}
Unlikely, but there's no harm in enabling them.
Change-Id: I7ddf5ff61b4466c0bd7b03485ef29156c399e2a8
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:56 +0000 (18:39 -0400)]
Fix: sessiond: make the --without-lttng-ust version of launch_application_notification_thread static
When building with --without-lttng-ust, a simple version of
launch_application_notification_thread, implemented in the header file,
is used. We get this warning:
CC main.o
In file included from /home/simark/src/lttng-tools/src/bin/lttng-sessiond/main.c:61:
/home/simark/src/lttng-tools/src/bin/lttng-sessiond/notify-apps.h:17:6: error: no previous prototype for ‘launch_application_notification_thread’ [-Werror=missing-prototypes]
17 | bool launch_application_notification_thread(int apps_cmd_notify_pipe_read_fd)
| ^~~~~~~
Make the function `static inline` to avoid that. The `inline` is not
strictly required here, but if that header ended up included by some
other source file that didn't use
launch_application_notification_thread, we would get a -Wunused-function
warning. The `inline` avoids that.
Change-Id: I19605e0594af0d7997951def2da3a6313bf65e11
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:52 +0000 (18:39 -0400)]
Fix: tests: include callsites.h from callsites.c
In commit:
commit f12eb9c1ceb619db54be0842323a32cda12651cd
Author: Simon Marchi <simon.marchi@efficios.com>
Date: Mon Nov 25 16:41:29 2019 -0500
Fix all -Wmissing-declarations warning instances
I've fixed the -Wmissing-declarations warning in callsites.c by adding a local
declaration. That was wrong, since there is actually a callsites.h header file
that needs to be included, which contains the declaration. This is nicely
pointed out when building with clang and -Wstrict-prototypes:
CC exec_with_callsites-multi-lib-test.o
In file included from /home/simark/src/lttng-tools/tests/regression/ust/multi-lib/multi-lib-test.c:15:
/home/simark/src/lttng-tools/tests/regression/ust/multi-lib/callsites.h:10:21: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
void call_tracepoint();
^
void
Remove the local declaration and include callsites.h in callsites.c.
Change-Id: Ib656d96c2ed3b389697a2022e343e98ac0b66447
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:48 +0000 (18:39 -0400)]
Fix: relayd: cast isdigit argument to unsigned char
This diagnostic is emitted when building on cygwin:
main.c:233:34: warning: array subscript has type ‘char’ [-Wchar-subscripts]
233 | if (errno != 0 || !isdigit(arg[0])) {
| ~~~^~~
This is due to passing a `char` argument to isdigit. According to the
man page of isdigit, the argument must be representable as an
`unsigned char` or be equal to EOF, so arguments of type `char` must be
cast to `unsigned char`. This patch does that, and should get rid of
the warning.
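A small sketch of the cast, using a hypothetical helper name (`starts_with_digit`) rather than the relay daemon's actual option parsing code:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the first character of `arg` is a decimal digit. */
bool starts_with_digit(const char *arg)
{
	assert(arg != NULL);
	/*
	 * Cast to unsigned char: a plain `char` may be negative, which is
	 * outside the range isdigit() accepts (unsigned char values or EOF).
	 */
	return isdigit((unsigned char) arg[0]) != 0;
}
```

The cast matters on platforms where `char` is signed, since bytes above 0x7f would otherwise be passed as negative values.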
Change-Id: Iaed4c0b494a79b917761e65f18388f43478b97e1
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 25 Mar 2020 22:39:43 +0000 (18:39 -0400)]
Fix: tests: make some functions static
Make two functions static; they are only used in their respective files.
Change-Id: I022dd064249683c9414ab36602e9645865373c51
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>