Mathieu Desnoyers [Fri, 22 Sep 2017 21:14:16 +0000 (17:14 -0400)]
Filter: Update shifting tests
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 22 Sep 2017 00:53:01 +0000 (20:53 -0400)]
Add () for bitwise and comparator tests
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 22 Sep 2017 00:13:17 +0000 (20:13 -0400)]
Filter: Implement rshift, lshift, bit not operators
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 21 Sep 2017 20:29:10 +0000 (16:29 -0400)]
Filters: generate backward compatible "get field" and "get context" instructions
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 4 Jul 2017 20:28:54 +0000 (16:28 -0400)]
Filter: index array, sequences, implement bitwise binary operators
Add load expressions, and produce bytecode allowing indexing of array
and sequence of integers, as well as bitwise binary operators &, |, ^.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 22 Jun 2017 20:17:54 +0000 (16:17 -0400)]
Implement support for brackets in filter expressions
Extends the bytecode with new instructions.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Anders Wallin [Thu, 17 May 2018 20:50:41 +0000 (22:50 +0200)]
Tests: add session auto-loading test cases
lttng-sessiond can auto-load sessions at startup:
- with the "--load" option to lttng-sessiond, load one file
or all session files in a directory
- from session files in $LTTNG_HOME/.lttng/sessions/auto/
- from session files in $sysconfdir/lttng/sessions/auto
This test case validates the first two scenarios.
Signed-off-by: Anders Wallin <wallinux@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 5 Jun 2018 16:11:20 +0000 (12:11 -0400)]
Replace deprecated readdir_r() with readdir()
readdir_r() has been deprecated since glibc 2.24 [1].
We used readdir_r() in load_session_from_path() to be thread-safe
since this function is part of liblttng-ust-ctl. However, according
to readdir()'s man page, it's thread-safe as long as the directory
stream it operates on is not shared across threads:
In the current POSIX.1 specification (POSIX.1-2008), readdir(3) is
not required to be thread-safe. However, in modern
implementations (including the glibc implementation), concurrent
calls to readdir(3) that specify different directory streams are
thread-safe. Therefore, the use of readdir_r() is generally
unnecessary in multithreaded programs. In cases where multiple
threads must read from the same directory stream, using readdir(3)
with external synchronization is still preferable to the use of
readdir_r(), for the reasons given in the points above.
In our use-case where we open the directory stream in the same function,
we know it won't be shared across threads and thus it's safe to use
readdir(). Here is the relevant bit from the POSIX.1 [2] spec:
The returned pointer, and pointers within the structure, might be
invalidated or the structure or the storage areas might be overwritten
by a subsequent call to readdir() on the same directory stream. They shall
not be affected by a call to readdir() on a different directory stream.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=19056
[2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
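The pattern described above can be illustrated with a minimal sketch.
It is not the code of load_session_from_path(); count_dir_entries()
is a hypothetical helper that only shows a directory stream being
opened, read with readdir() and closed within a single function, so
the stream is never visible to another thread. For brevity, the sketch
does not use errno to tell a read error apart from the end of the
directory.

```c
#include <dirent.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries of `path`, skipping "." and "..".
 * Returns -1 if the directory cannot be opened.
 */
static int count_dir_entries(const char *path)
{
	int count = 0;
	struct dirent *entry;
	DIR *dir = opendir(path);

	if (!dir) {
		return -1;
	}

	/*
	 * `dir` never leaves this function, so no other thread can call
	 * readdir() on the same directory stream.
	 */
	while ((entry = readdir(dir)) != NULL) {
		if (!strcmp(entry->d_name, ".") ||
				!strcmp(entry->d_name, "..")) {
			continue;
		}
		count++;
	}

	closedir(dir);
	return count;
}
```

Each `entry` pointer is only used before the next readdir() call on
the same stream, which matches the POSIX wording quoted above.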
Jonathan Rajotte [Mon, 28 May 2018 21:31:48 +0000 (17:31 -0400)]
Bash completion: ignore namespace for xmllint parsing
xmllint cli does not "easily" support namespaces.
One can use the local-name() XPath function and other "tricks".
The simplest trick for bash completion is to ignore the namespace
altogether.
Replacing "xmlns" by "ignore" does the job.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 5 Jun 2018 15:38:08 +0000 (11:38 -0400)]
Use https in links to the lttng.org website
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 10 May 2018 21:31:10 +0000 (17:31 -0400)]
Log the session to which a ROTATE_PENDING command applies
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 10 May 2018 21:14:39 +0000 (17:14 -0400)]
Initialize relay_stream chunk_id to its session's current trace archive id
Initializing the relayd's streams with a stream_chunk_id allows the
relayd to differentiate between a stream created before the first
rotation (at chunk id == 0) vs. a stream that has been created
after the last (or pending) rotation.
Before this fix, the relayd can fail to identify that a rotation
has been completed.
This is caused by the fact that a stream's chunk id is initialized
to 0 and updated by the RELAYD_ROTATE_STREAM command to the
id of the chunk that is currently being rotated.
The 'stream->current_chunk_id.value < chunk_id' check performed
by the RELAYD_ROTATE_PENDING will cause rotations to never
complete for streams that are created between the launch of a
rotation and the check for its completion.
For example, when the relayd is checking whether the rotation id
'3' is completed, it may see streams with the default value of
their chunk id set to '0' and determine that a rotation is still
pending.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
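The effect of the initial chunk id on the pending check can be
sketched as follows. This is a simplified model, not the relayd's
code: struct sketch_stream, sketch_new_stream() and
sketch_rotation_is_pending() are hypothetical names, and treating a
stream that is still rotating as "pending" is an assumption made for
the sake of the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relayd's stream object. */
struct sketch_stream {
	uint64_t chunk_id;          /* chunk the stream currently belongs to */
	uint64_t rotate_at_seq_num; /* UINT64_MAX (-1ULL) when not rotating */
};

/* Create a stream tagged with the session's current trace archive id. */
static struct sketch_stream sketch_new_stream(uint64_t current_archive_id)
{
	struct sketch_stream stream = {
		.chunk_id = current_archive_id,
		.rotate_at_seq_num = UINT64_MAX,
	};

	return stream;
}

/* Returns true if the rotation to `chunk_id` still appears to be pending. */
static bool sketch_rotation_is_pending(const struct sketch_stream *streams,
		size_t count, uint64_t chunk_id)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (streams[i].rotate_at_seq_num != UINT64_MAX) {
			/* Assumption: a rotating stream means "pending". */
			return true;
		}
		if (streams[i].chunk_id < chunk_id) {
			/* The stream has not reached the requested chunk. */
			return true;
		}
	}
	return false;
}
```

With this model, a stream created with a chunk id of 0 while rotation
'3' is being checked keeps the rotation pending, whereas a stream
created with the session's current archive id (3) does not.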
Jérémie Galarneau [Thu, 10 May 2018 19:00:34 +0000 (15:00 -0400)]
Pass the consumerd stream's trace archive id to the relayd
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 9 May 2018 01:41:08 +0000 (21:41 -0400)]
Fix: propagate archive id to the consumer daemon on stream creation
This is the first of a series of fixes addressing a number of problems
with the way session rotation completions are handled.
Those issues can result in:
- A stop never completing,
- A rotation never completing,
- A rotation being marked as completed while the consumerd/relayd
are still writing to the completed chunk's trace archive,
resulting in a temporarily corrupted trace.
This first commit performs a relatively simple modification
to ensure that the session's current archive id is propagated to the
consumer daemon.
Detailed description of the problems
---
At the core of the problem is the fact that in per-pid buffering, we
are not guaranteed that the sessiond will be able to see an
application's channel(s) if it was torn down before (or even during)
the rotation.
When an application is torn down, it is removed from the ust_app_ht.
That doesn't mean its buffers were received by the relayd or even
consumed by the consumerd. The session daemon issues a "flush channel"
command, but there is no guarantee/synchronization to ensure the
buffers have been consumed.
The current design assumes that the sessiond knows all the channels to
rotate and that we can monitor those channels for the completion of
a rotation. Given that an application can disappear or appear while
we iterate on the ust_app_ht, this assumption does not hold. We also
don't want to prevent/delay applications from registering or exiting
just because a rotation is ongoing.
* Problem 1 *
A rename can happen before the relay has received all data for a given
chunk, leading to the data pending issue explained previously.
Rename should be performed as the last action after the rotation
has been completed since data can still be in-flight,
causing the creation of indexes upon its arrival on the relayd's end.
See: https://github.com/lttng/lttng-tools/blob/cea6c68/src/bin/lttng-sessiond/rotation-thread.c#L392
Currently, the rotation thread waits for all channels (known to the
sessiond at the start of the rotation) to have reached their rotation
point. More specifically, the consumer will write to the
channel_rotation pipe every time a channel's subbuffers have been read
up to the point of the rotation position. This does not guarantee that
the data has been committed to disk on the relay's end.
At that point, the command to rename the destination folder is sent to
the relayd and the sessiond checks for the pending rotation
periodically (every 200ms) if the output was to a relayd.
That check is assumed not to be needed when tracing locally since
reaching the rotation point implies the contents being written to
disk.
This scheme is not safe. If the sessiond sees no channel to iterate
on, it will issue the rename command immediately. If an application's
buffers were being flushed by the consumerd, the relayd will receive
the data, attempt to create index files, and fail since the folder has
been moved.
From an architectural standpoint, the rename command also leaves the
'path' of streams that were unknown to the sessiond pointing to a path
that does not exist anymore.
* Problem 2 *
In per-pid tracing mode, an application can appear after the rotation
was initiated and cause the rotate pending check to never complete.
A RELAYD_ROTATE_PENDING command is applied to a unique session id and
a chunk id.
When handling a RELAYD_ROTATE_PENDING command, the relayd will perform
the following check:
- Iterate on every stream known at that point:
- Check if the stream is rotating (stream->rotate_at_seq_num != -1ULL)
- If the stream is not rotating, "stream->chunk_id < chunk_id" is checked.
- If true, the rotation is considered incomplete.
See: https://github.com/lttng/lttng-tools/blob/cea6c68/src/bin/lttng-relayd/main.c#L2850
Given that streams, at their creation, are initialized with their
current "chunk_id" set to 0, the rotation will never be considered
complete if a stream is created between a ROTATE_STREAM and
ROTATE_PENDING command.
This can happen whenever an application is registered during a rotation.
* Problem 3 *
Since the sessiond can't accurately monitor the channels that have to
be rotated, the "rotation completed" notification (and state, if
queried with the lttng_rotation_handle_get_state() interface) is not
reliable.
A client could see that the rotation is marked as completed and
attempt to read a trace archive that has not been completely written.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 4 Jun 2018 21:28:56 +0000 (17:28 -0400)]
Typo in ust consumer log message (channek -> channel)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 10 May 2018 15:02:25 +0000 (11:02 -0400)]
Use dynamic payload for the add stream relayd command
Move away from static constant defined char array.
Keep backward compatibility.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 10 May 2018 14:13:22 +0000 (10:13 -0400)]
Dynamic payload for relayd create session command
Move away from static constant defined char array.
Perform the length check based on constant defined value on reception.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 8 May 2018 21:23:48 +0000 (17:23 -0400)]
Fix: backward relayd communication compatibility.
The size of LTTNG_HOST_NAME_MAX changed from 64 to 256 in commit id
b867041c62b48e89c9f00430cde4c33f13a2da09.
This change results in breaking compatibility with older relayd.
Freeze size of struct used for relayd communication.
Confirmed by lttng-ivc project.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
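The idea of freezing the size of a wire structure can be sketched as
follows. All names are hypothetical (the real structures and
constants are not shown in this message), and the packed attribute
is a GCC/Clang extension: the on-the-wire constant is decoupled from
the in-memory limit, and a static assertion catches any accidental
change of the layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants: only the wire limit is frozen. */
#define SKETCH_HOST_NAME_MAX      256 /* in-memory limit, free to grow */
#define SKETCH_WIRE_HOST_NAME_MAX 64  /* on-the-wire limit, frozen */

/* Hypothetical message layout shared with older peers. */
struct sketch_wire_create_session {
	char hostname[SKETCH_WIRE_HOST_NAME_MAX];
	uint32_t live_timer;
} __attribute__((packed));

/* Fails the build if the wire layout ever changes. */
_Static_assert(sizeof(struct sketch_wire_create_session) == 68,
		"wire layout of sketch_wire_create_session changed");

/* Returns -1 if `hostname` does not fit in the frozen wire field. */
static int sketch_set_hostname(struct sketch_wire_create_session *msg,
		const char *hostname)
{
	size_t len = strlen(hostname);

	if (len >= sizeof(msg->hostname)) {
		return -1;
	}
	memset(msg->hostname, 0, sizeof(msg->hostname));
	memcpy(msg->hostname, hostname, len);
	return 0;
}
```

A host name longer than the frozen field is rejected rather than
silently truncated; whether the actual fix rejects or truncates is
not stated here.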
Jérémie Galarneau [Mon, 4 Jun 2018 16:50:51 +0000 (12:50 -0400)]
Add unused attribute to lttng_to_index_major param
Suppresses an unused variable warning. The parameter is kept
since this does depend on the connection's full version. The
'minor' parameter is unused for now since there is only one
major version to support and only one major index version.
However, this could change in the future and we don't want to
have to modify all the version conversion sites.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 4 Jun 2018 15:27:37 +0000 (11:27 -0400)]
Replace strncpy by lttng_strncpy in lttngctl session configuration API
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 4 Jun 2018 15:27:08 +0000 (11:27 -0400)]
Replace strncpy by lttng_strncpy in utils_stream_file_name()
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 4 Jun 2018 15:24:26 +0000 (11:24 -0400)]
Use dynamic buffer to build session configuration path
Re-use the dynamic buffer interface to build the session
configuration path. The main benefit here is silencing a source
string truncation warning emitted by GCC 8. However, this
interface is also simpler to use than manually building the
path.
The LTTNG_PATH_MAX is still enforced, but there is no real
need to restrict paths to that size. This could be removed if
it ever poses a problem.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 4 Jun 2018 15:23:18 +0000 (11:23 -0400)]
Replace strncpy by lttng_strncpy in session config
This eliminates a warning produced by GCC 8 in that repeated
code pattern (potential truncation of the source string) and
using the lttng_strncpy macro reduces code duplication.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 1 Jun 2018 17:01:42 +0000 (13:01 -0400)]
Silence strncpy warning emitted by GCC 8 in XSD path construction
The size of the XSD's path is fully determined in this function
which makes strcpy() safe to use.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 1 Jun 2018 16:54:00 +0000 (12:54 -0400)]
Silence strncpy warning emitted by GCC 8 in lttng_strncpy()
GCC 8 warns if the destination's length is passed to strncpy,
since that could cause the destination to not be NULL-terminated.
This is not a concern for the lttng_strncpy since it checks
that the source must not be truncated. Therefore, it is safe
to simply use strcpy().
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
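The reasoning above can be sketched with a hypothetical helper,
sketch_checked_copy(), which is an illustration and not necessarily
the definition of lttng_strncpy(): once the source is known to fit in
the destination, terminator included, strcpy() cannot overflow and
always leaves the destination NULL-terminated.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for strnlen() */
#endif
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `src` into `dst` (of size `dst_len`) only
 * if it fits entirely. Returns 0 on success, -1 if the copy would
 * truncate the source.
 */
static int sketch_checked_copy(char *dst, const char *src, size_t dst_len)
{
	if (strnlen(src, dst_len) >= dst_len) {
		/* No room for the string and its terminator. */
		return -1;
	}

	/* The length was checked above, so strcpy() is safe here. */
	strcpy(dst, src);
	return 0;
}
```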
Jérémie Galarneau [Fri, 1 Jun 2018 16:50:40 +0000 (12:50 -0400)]
Silence strncpy warning emitted by GCC 8 in ini parser
While copying 'dst len' bytes in strncpy is normally risky
as the dst may not be NULL-terminated, this function ensures
that the last byte of 'dst' is NULL.
Therefore, this change is mostly made to silence GCC.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 14 Mar 2018 18:54:21 +0000 (14:54 -0400)]
Fix: use signed variable for refcounting of consumer_relayd_sock_pair
Otherwise, the refcount check performed after decrementing has no
meaning, as in the consumer_stream_relayd_close() function.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
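A minimal sketch of why the signedness matters, using hypothetical
helpers rather than the consumer's actual code: decrementing an
unsigned counter that is already 0 wraps around to UINT_MAX, so a
"negative refcount" check can never trigger, while a signed counter
yields -1.

```c
#include <limits.h>

/* Hypothetical: drop one reference from an unsigned counter. */
static unsigned int sketch_put_unsigned(unsigned int refcount)
{
	/* 0 - 1 wraps around to UINT_MAX: never "less than zero". */
	return refcount - 1;
}

/* Hypothetical: drop one reference from a signed counter. */
static int sketch_put_signed(int refcount)
{
	/* 0 - 1 is -1: an unbalanced put can be detected. */
	return refcount - 1;
}
```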
Jonathan Rajotte [Wed, 14 Mar 2018 17:46:09 +0000 (13:46 -0400)]
Cleanup: sobjd is never used by reply_ust_register_channel()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 14 Mar 2018 17:29:56 +0000 (13:29 -0400)]
Cleanup: chan is never used by save_agent_events()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 14 Mar 2018 17:26:46 +0000 (13:26 -0400)]
Cleanup: open_memstream and close_memstream compat is never used
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jérémie Galarneau [Fri, 1 Jun 2018 09:31:12 +0000 (05:31 -0400)]
Remove unnecessary inclusions of version.h
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 1 Jun 2018 09:28:45 +0000 (05:28 -0400)]
Add multilib test files to .gitignore
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 22:23:26 +0000 (18:23 -0400)]
Cleanup: ua_sess is never used by create_ust_app_channel_context()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 22:21:14 +0000 (18:21 -0400)]
Cleanup: consumer_data is never used by update_kernel_stream()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 22:19:48 +0000 (18:19 -0400)]
Cleanup: app is never used by alloc_ust_app_session()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 22:19:32 +0000 (18:19 -0400)]
Cleanup: ust_session_id unused by buffer_reg_uid_consumer_channel_key
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 22:13:39 +0000 (18:13 -0400)]
Cleanup: wpipe already contains kernel_tracer_fd
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 22:06:57 +0000 (18:06 -0400)]
Cleanup: domain type is never used by send_consumer_relayd_socket()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 21:41:34 +0000 (17:41 -0400)]
Cleanup: uid and gid are never used by run_as_noworker()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 21:37:10 +0000 (17:37 -0400)]
Cleanup: sessiond_id is never used by relayd_create_session_2_*
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 19:06:43 +0000 (15:06 -0400)]
Cleanup: sock is never used by ask_channel()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 18:40:26 +0000 (14:40 -0400)]
Cleanup: ctx is never used by monitor_timer()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 18:39:00 +0000 (14:39 -0400)]
Cleanup: signo is never used by metadata_switch_timer
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 18:27:24 +0000 (14:27 -0400)]
Cleanup: channel is never used by metadata_cache_check_version()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 16:56:12 +0000 (12:56 -0400)]
Cleanup: relayd id is never used by write_relayd_metadata_id()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 16:48:43 +0000 (12:48 -0400)]
Cleanup: attr is not used by open_ust_stream_fd()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 13 Mar 2018 16:48:22 +0000 (12:48 -0400)]
Cleanup: *_domain are never used by create_session
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 17 May 2018 18:29:17 +0000 (14:29 -0400)]
doc/man: update rotation man pages to follow API's terminology
"Manual rotation" becomes "immediate rotation".
"Automatic rotation schedule" becomes "rotation schedule". We still
write "automatic rotation" as the result of a rotation schedule being
fulfilled by its condition.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 28 May 2018 20:51:17 +0000 (16:51 -0400)]
Print consumerd32/64/kernel configuration
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 29 May 2018 18:10:03 +0000 (14:10 -0400)]
Test: change use of space for tabs in utils.sh
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 9 Feb 2018 21:56:52 +0000 (16:56 -0500)]
Tests: add duplicated providers tests
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 9 Feb 2018 21:56:51 +0000 (16:56 -0500)]
Tests: add function to validate the number of an event name in metadata
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 9 Feb 2018 21:56:50 +0000 (16:56 -0500)]
Tests: allow the use of regular expressions to match events
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 9 Feb 2018 21:56:49 +0000 (16:56 -0500)]
Fix: calling ht_{hash, match}_enum with wrong argument
ht_hash_enum and ht_match_enum are currently called with the address of the
pointer to a ust_registry_enum rather than the expected pointer to a
ust_registry_enum. This means that those function calls would end up
using garbage for hashing and comparing.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
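The class of bug described above can be sketched with hypothetical
types (struct sketch_enum and sketch_hash_enum() are not the
registry's real definitions). A callback taking a `const void *`
accepts both a pointer and the address of a pointer without any
compiler diagnostic, which is why the mistake goes unnoticed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a registry enumeration. */
struct sketch_enum {
	char name[32];
};

/* Generic hash callback: `key` must point to a struct sketch_enum. */
static unsigned long sketch_hash_enum(const void *key)
{
	const struct sketch_enum *reg_enum = key;
	unsigned long hash = 5381;
	size_t i;

	for (i = 0; reg_enum->name[i] != '\0'; i++) {
		hash = hash * 33 + (unsigned char) reg_enum->name[i];
	}
	return hash;
}

static unsigned long sketch_lookup_hash(const struct sketch_enum *reg_enum)
{
	/*
	 * Correct: pass the pointer itself. Passing `&reg_enum` would
	 * also compile, but the callback would then interpret the bytes
	 * of the pointer variable as a struct sketch_enum.
	 */
	return sketch_hash_enum(reg_enum);
}
```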
Francis Deslauriers [Fri, 9 Feb 2018 21:56:48 +0000 (16:56 -0500)]
Fix: probes should be compared strictly by events metadata
Currently, events are compared using names and signatures. Events
with different payloads but identical name and signatures could
lead to a corrupted trace because the Session Daemon would consider them
identical and give them the same event ID.
Events should be compared using the name, loglevel, fields and
model_emf_uri to ensure that their respective metadata is the same.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
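A comparison along the lines described above could look like the
following sketch. The structures are hypothetical simplifications
(the real event and field descriptions carry more information); the
point is only that the name, loglevel, fields and model_emf_uri must
all agree for two events to be considered identical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified field and event descriptions. */
struct sketch_field {
	const char *name;
	int type;
};

struct sketch_event {
	const char *name;
	int loglevel;
	const struct sketch_field *fields;
	size_t nr_fields;
	const char *model_emf_uri; /* may be NULL */
};

/* Compare two strings, either of which may be NULL. */
static bool sketch_str_equal(const char *a, const char *b)
{
	if (!a || !b) {
		return a == b;
	}
	return strcmp(a, b) == 0;
}

/* Returns true only if both events describe the same metadata. */
static bool sketch_events_match(const struct sketch_event *a,
		const struct sketch_event *b)
{
	size_t i;

	if (strcmp(a->name, b->name) != 0 ||
			a->loglevel != b->loglevel ||
			a->nr_fields != b->nr_fields ||
			!sketch_str_equal(a->model_emf_uri, b->model_emf_uri)) {
		return false;
	}

	for (i = 0; i < a->nr_fields; i++) {
		if (strcmp(a->fields[i].name, b->fields[i].name) != 0 ||
				a->fields[i].type != b->fields[i].type) {
			return false;
		}
	}
	return true;
}
```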
Julien Desfossez [Wed, 10 Jan 2018 19:49:20 +0000 (14:49 -0500)]
Test for lttng-logger
Basic test to write in /proc/lttng-logger and /dev/lttng-logger and
ensure we have the right number of events in the resulting trace.
We also test the 1024 characters limit for the payload.
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 26 May 2018 09:43:53 +0000 (05:43 -0400)]
Test mi: rename sessiond load directory constant
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 28 Feb 2018 21:06:01 +0000 (16:06 -0500)]
mi: support "add-context --list"
The symbol element is the string passed/to be passed on the cli
for the --type option.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Francis Deslauriers [Tue, 6 Feb 2018 17:04:27 +0000 (12:04 -0500)]
Fix: test_ust-dl is generated at configure-time
This file should not be in EXTRA_DIST as it's generated by autoconf and
will thus be available directly in the out-of-tree build directory.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 18 May 2018 21:45:57 +0000 (17:45 -0400)]
Fix: cmd line options overwrite env variable config options
The doc is clear about the order of precedence regarding configuration.
The command line options always override any config file or
configuration by environment variables.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
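The precedence between the command line and the environment can be
sketched as follows. sketch_resolve_option() is a hypothetical
helper, not the sessiond's option parser; it only illustrates that a
value given on the command line wins over one set through an
environment variable, with a built-in default as the last resort.

```c
#include <stddef.h>

/*
 * Hypothetical helper: pick the effective value of an option.
 * A NULL argument means "not provided by that source".
 */
static const char *sketch_resolve_option(const char *cli_value,
		const char *env_value, const char *default_value)
{
	if (cli_value) {
		/* The command line always takes precedence. */
		return cli_value;
	}
	if (env_value) {
		return env_value;
	}
	return default_value;
}
```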
Jérémie Galarneau [Fri, 18 May 2018 19:08:14 +0000 (15:08 -0400)]
Fix: perform the initialization memory barrier out of loop body
The memory barrier used by the client thread should be performed
after the lttng_sessiond_ready counter has been seen to have
reached zero.
This ensures that loads are not speculatively performed before
this point as the thread will interact with data structures
initialized by the support threads for which it was waiting for
the initialization to complete.
See the comment as to why this read barrier is promoted to a
full barrier.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
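The placement of the barrier can be sketched with C11 atomics, used
here as a stand-in for the urcu primitives (uatomic operations and
cmm_smp_mb()) that the sessiond relies on. The names and the thread
count are hypothetical. The waiting side spins on the counter with
relaxed loads and issues a single full fence once the counter has
been seen at zero, instead of one barrier per loop iteration.

```c
#include <stdatomic.h>

/* Hypothetical number of support threads to wait for. */
#define SKETCH_SUPPORT_THREAD_COUNT 2

static atomic_int sketch_ready = SKETCH_SUPPORT_THREAD_COUNT;

/* Called by each support thread once its initialization is done. */
static void sketch_notify_ready(void)
{
	/* Make the initialization stores visible before the decrement. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_sub_explicit(&sketch_ready, 1, memory_order_relaxed);
}

/* Called by the client thread before it uses the shared structures. */
static void sketch_wait_until_ready(void)
{
	while (atomic_load_explicit(&sketch_ready,
			memory_order_relaxed) != 0) {
		/* Busy-wait; real code would sleep or yield here. */
	}

	/*
	 * Single barrier, performed after the counter was seen at zero:
	 * loads of the shared structures cannot be performed before
	 * this point.
	 */
	atomic_thread_fence(memory_order_seq_cst);
}
```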
Jérémie Galarneau [Fri, 18 May 2018 19:03:13 +0000 (15:03 -0400)]
Clean-up: explicit mb before decrementing lttng_sessiond_ready
This is mostly a documentation fix as there are no thread-safety
implications to this change. uatomic_sub_return() was used since it
performs a full memory barrier before and after the atomic operation
(as per the urcu documentation).
The barrier performed after the subtraction is not needed in this
particular case. Moreover, using an explicit cmm_smp_mb() statement
makes the code clearer; see the comment as to why this barrier is
needed.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 18 May 2018 19:02:17 +0000 (15:02 -0400)]
Clean-up: use a define for support thread count
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 15 May 2018 20:19:49 +0000 (16:19 -0400)]
Port: fix format warnings on Cygwin
On Cygwin, be64toh() returns a "long long unsigned int" while the
format specifier PRIu64 expects a "long unsigned int". Both types
are 64-bit integers, so just cast the result to uint64_t to silence
the warnings.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
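The cast can be illustrated with a small sketch. sketch_format_u64()
is a hypothetical helper; its `unsigned long long` parameter stands
in for the return type of be64toh() on Cygwin. Casting to uint64_t
makes the argument match the PRIu64 conversion regardless of which
64-bit type the platform uses.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit value into `buf`.
 * Returns the value returned by snprintf().
 */
static int sketch_format_u64(char *buf, size_t len, unsigned long long value)
{
	/* The cast makes the argument type match PRIu64 on every platform. */
	return snprintf(buf, len, "%" PRIu64, (uint64_t) value);
}
```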
Michael Jeanson [Tue, 15 May 2018 20:19:48 +0000 (16:19 -0400)]
Add missing include for ssize_t on Cygwin
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 16 May 2018 22:32:38 +0000 (18:32 -0400)]
Fix: sessions with agent channels fail to load
Channels of the "agent" types cannot be created directly. They are
meant to be created implicitly through the activation of events in
their domain.
However, a user can override the default channel configuration
attributes by creating the underlying UST channel before enabling an
agent domain event.
Hence, the channel's type is substituted before the creation and
restored by the time events are created.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 16 May 2018 21:08:36 +0000 (17:08 -0400)]
Fix: don't wait for the load thread before serving client commands
Since the session loading thread uses the same communication mechanism as
the external clients, it should not be included in the set of
threads that must be launched before the sessiond starts to serve
client commands.
Since the "load session" thread is guaranteed to be the last
essential thread to be initialized, it can explicitly signal
the parents that the sessiond is ready once it is done auto-loading
session configurations.
This commit also adds a lengthy comment explaining the initialization
of the session daemon.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 16 May 2018 15:51:52 +0000 (11:51 -0400)]
Add test_utils_parse_time_suffix to .gitignore
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 9 May 2018 01:38:57 +0000 (21:38 -0400)]
Clean-up: kernel_consumer_add_stream() does not need to be public
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 9 May 2018 01:26:15 +0000 (21:26 -0400)]
Fix: sessiond fails to launch on --without-ust configuration
The sessiond will never signal that it is ready (in daemonize or
background modes) if it was built without lttng-ust. The fix in
7eac7803 made the main thread wait for the agent thread to be
ready before signalling that the session daemon is ready.
When agent tracing is not possible due to the absence of lttng-ust,
a stub function is used to launch the agent thread. This stub
must call sessiond_notify_ready() in order to unblock the main
thread.
Note that it would be _incorrect_ to not wait for the agent
thread to be launched as users expect all tracing features to
be available as soon as 'lttng-sessiond --daemonize/--background'
returns.
Not waiting for the thread to be ready caused very rare failures
of the agent tracing tests on the CI, especially on ARM and
PowerPC targets.
Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 9 May 2018 01:23:14 +0000 (21:23 -0400)]
Fix: agent thread poll set creation failure results in deadlock
Failing to initialize the agent thread's pollset will cause
the thread to exit before calling sessiond_notify_ready().
This will cause the main thread to wait forever for all threads
to be launched when such an error occurs.
The agent thread is not needed for the sessiond to work (except
to enable the tracing of Java and Python applications). Such
a failure should leave the sessiond in a useable state.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 9 May 2018 01:22:36 +0000 (21:22 -0400)]
Fix: test uses sizeof() on the wrong operand of strncpy
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 8 May 2018 16:11:19 +0000 (12:11 -0400)]
Rename kernel_consumer_send_channel_stream()
kernel_consumer_send_channel_stream() sends _all_ streams of
a given channel. It is renamed kernel_consumer_send_channel_streams()
to ensure its name is no longer misleading.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 8 May 2018 16:07:58 +0000 (12:07 -0400)]
Rename consumer_init_channel_comm_msg()
consumer_init_channel_comm_msg() is only used for the ADD_CHANNEL
command. It is therefore renamed to
consumer_init_add_channel_comm_msg()
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 8 May 2018 16:00:41 +0000 (12:00 -0400)]
Cleanup: send_fds functions are not const-correct
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 3 May 2018 20:06:11 +0000 (16:06 -0400)]
Remove unused ltt_session look-up result
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 3 May 2018 18:57:07 +0000 (14:57 -0400)]
Clean-up: reduce indentation level of create_channel_per_uid()
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 3 May 2018 18:35:24 +0000 (14:35 -0400)]
Enforce locking assumptions during channel creation
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 2 May 2018 21:43:48 +0000 (17:43 -0400)]
Cleanup: misleading create_ust_app_session() name
create_ust_app_session() does not necessarily create a
ust_app_session; it first looks for an existing one, returns it
if found, and only creates a new one otherwise.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 2 May 2018 19:27:37 +0000 (15:27 -0400)]
Rename rotate_count to current_archive_id
The ltt_session's rotate count will no longer be used only to
count the number of rotations. It will be used to tag streams
with a "trace archive chunk id" that indicates the epoch of
their creation.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 2 May 2018 18:39:14 +0000 (14:39 -0400)]
Cleanup: name of send_sessiond_channel() is misleading
This function sends a channel to the sessiond _and_ to the
relay daemon (if applicable).
Comments are updated to reflect this change and the publication
of streams towards the relay daemon is now logged.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Fri, 27 Apr 2018 21:27:29 +0000 (17:27 -0400)]
Print the git version used to build from a distribution tarball
The git version is currently omitted when building from a
distribution tarball. This patch causes 'lttng version' and
'lttng --version' to print the state of the git tree which
produced the tarball.
git describe is used to produce the description of the tree's
state, along with the "dirty" state (whether or not local
changes were present in the tree).
Note that the 'git version' will not be printed when the
distribution tarball was produced at a release tag (a tag
starting with v[0-9]).
This patch simplifies the generation of the version.h file by
generating a file that is merely included by version.h.
It also ensures that version.tmpl is no longer installed on the
system by the install target.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 1 May 2018 19:22:57 +0000 (15:22 -0400)]
Docs: lttng-version uses the intransitive form of "broke"
To indicate that something is divided, the transitive form
"broken down" is preferred.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Sat, 28 Apr 2018 00:06:08 +0000 (20:06 -0400)]
Fix: relayd streams can be leaked on connection error
There are cases where a connection error can cause streams to be
leaked.
For instance, the control connection could receive an index and
close. Since a packet is in-flight, the stream corresponding to
that index will not close. However, nothing guarantees that
the data connection will be able to receive the packet's data.
If the protocol is respected, this is not a problem. However,
a buggy consumerd or network errors can cause the streams to
remain in the "data in-flight" state and never close.
To mitigate a case observed in the field where a consumerd
would be forcibly closed (network interface brought down) and
cause leaks on the relay daemon, the session is aborted whenever
the control or data connection encounters an error. Aborting
a session causes the streams to be closed regardless of the
fact that data is in-flight.
Currently, only the control connection holds ownership of
the session object. This can cause the following scenario to leak
streams:
1) Control connection receives an index
- Stream is put in "in-flight data" mode
2) Control connection is closed/shutdown cleanly
- try_stream_close refuses to close the stream as data is in-flight,
but it puts the stream in "closed" mode. When the data is
received, the stream will be closed as soon as possible.
3) Data connection closes cleanly or due to an error
- The stream "closing" condition will never be re-evaluated.
Since the data connection has no ownership of the session, it can
never clean-up the streams that are waiting for "in-flight" data to
arrive before closing.
This patch lazily associates the data connection to its session
so that the session can be aborted whenever an error happens on
either the data or control connection.
Note that this leaves the relayd vulnerable to one case which will
still leak. If the control connection receives an index and closes
cleanly, the data connection may never have been established
with the consumer daemon, which results in a leak.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
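The scenario described in the commit above can be modelled with a small state sketch. All types and function names below (demo_stream, demo_session, demo_connection, and the helpers) are hypothetical simplifications, not the relay daemon's actual structures; only the try_stream_close name is borrowed from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_stream {
	bool data_in_flight;  /* an index was received, its data was not */
	bool close_requested; /* close deferred until the data arrives */
	bool closed;
};

struct demo_session {
	struct demo_stream *streams;
	size_t stream_count;
};

struct demo_connection {
	struct demo_session *session; /* NULL until associated */
};

/* Close the stream unless data is still expected; remember the request. */
static void try_stream_close(struct demo_stream *stream)
{
	stream->close_requested = true;
	if (!stream->data_in_flight) {
		stream->closed = true;
	}
}

/* Abort: close every stream regardless of the in-flight state. */
static void session_abort(struct demo_session *session)
{
	size_t i;

	assert(session);
	for (i = 0; i < session->stream_count; i++) {
		session->streams[i].closed = true;
	}
}

/* Lazily bind a data connection to its session on first use. */
static void data_connection_associate(struct demo_connection *conn,
		struct demo_session *session)
{
	if (!conn->session) {
		conn->session = session;
	}
}

/* Teardown: an error aborts the session, if the connection knows one. */
static void connection_close(struct demo_connection *conn, bool on_error)
{
	if (on_error && conn->session) {
		session_abort(conn->session);
	}
	conn->session = NULL;
}
```

In this model, a data connection that was never associated with a session cannot reach the streams on error, which mirrors the remaining leak mentioned in the note at the end of the commit message.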
Jérémie Galarneau [Tue, 1 May 2018 15:58:15 +0000 (11:58 -0400)]
Cleanup: fix typo in relayd comment
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 30 Apr 2018 18:27:35 +0000 (14:27 -0400)]
Fix: ret may be used uninitialized in sample_channel_positions()
sample_channel_positions() returns garbage if
cds_lfht_is_node_deleted(&stream->node.node) is true on the first
(and possibly only) iteration over the
consumer_data.stream_per_chan_id_ht hash table.
Found by scan-build.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
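The pattern reported by scan-build can be reproduced in isolation. The sketch below uses hypothetical types and an array in place of the hash table; returning 0 when every node is skipped is an assumption, since the log does not state which value the actual fix chose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a stream stored in a hash table. */
struct demo_stream {
	bool deleted;
	long position;
};

/* Sample one stream; fails on a negative position. */
static int sample_one(const struct demo_stream *stream, long *position)
{
	if (stream->position < 0) {
		return -1;
	}
	*position = stream->position;
	return 0;
}

/* Sum the positions of all live streams. Returns 0 on success. */
static int sample_all(const struct demo_stream *streams, size_t count,
		long *total)
{
	int ret = 0; /* without this, skipping every node returns garbage */
	size_t i;

	assert(total);
	*total = 0;
	for (i = 0; i < count; i++) {
		long position;

		if (streams[i].deleted) {
			continue; /* this path never assigns ret */
		}
		ret = sample_one(&streams[i], &position);
		if (ret) {
			break;
		}
		*total += position;
	}
	return ret;
}
```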
Jonathan Rajotte [Fri, 27 Apr 2018 21:20:21 +0000 (17:20 -0400)]
Cleanup: ret is unused in relay_process_data_receive_header()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 27 Apr 2018 22:23:26 +0000 (18:23 -0400)]
Fix build: in_git_repo is used before being set
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 27 Apr 2018 21:30:50 +0000 (17:30 -0400)]
Fix: partial writes of padding are not checked
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
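As an illustration of what checking for partial writes involves, here is a minimal sketch of a write loop. write_all() is a hypothetical helper, not the function used by the relay daemon; it simply resumes after short writes and EINTR until the whole buffer has been written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write exactly len bytes to fd, resuming after
 * short writes and EINTR. Returns 0 on success, -1 on error (errno set).
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *cursor = buf;

	assert(buf || len == 0);
	while (len > 0) {
		ssize_t ret = write(fd, cursor, len);

		if (ret < 0) {
			if (errno == EINTR) {
				continue; /* interrupted before writing: retry */
			}
			return -1;
		}
		/* write() may accept fewer bytes than requested. */
		cursor += ret;
		len -= (size_t) ret;
	}
	return 0;
}
```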
Jérémie Galarneau [Fri, 27 Apr 2018 20:47:04 +0000 (16:47 -0400)]
Propagate whether a connection was closed cleanly or after an error
This allows a follow-up fix that requires this distinction to
decide whether a session must be closed or aborted.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 27 Apr 2018 19:44:19 +0000 (15:44 -0400)]
Fix: relayd protocol field present from minor 8 is not checked
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 9 Apr 2018 14:23:33 +0000 (10:23 -0400)]
Add DBG statement for TCP keep-alive options
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 25 Apr 2018 18:57:29 +0000 (14:57 -0400)]
Fix: relay_recv_metadata does not check for partial write
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 21 Feb 2018 05:57:26 +0000 (00:57 -0500)]
Use non-blocking recvmsg() for data/ctrl connections of lttng-relayd
The relay daemon's use of blocking network I/O can cause severe
performance degradation when interacting with unresponsive peers.
This patch changes the recvmsg() calls to use the MSG_DONTWAIT flag
which makes the call non-blocking. The connection classes are modified
to handle the partial reception of buffers.
The sendmsg() calls are still blocking, but this is assumed to
represent a fairly minimal risk of actually blocking given that
the control protocol's replies consist of 4-byte status codes.
A similar approach could be used to make the live connections
non-blocking as that side may also suffer from the same resiliency
problems. So far, no users have reported this problem so it is
not prioritised.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
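The approach described in the commit above (recvmsg() with MSG_DONTWAIT plus bookkeeping for partially received buffers) can be sketched as follows. The rx_state structure, the rx_status values and rx_continue() are hypothetical and are not the relay daemon's connection classes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical per-connection receive state. */
struct rx_state {
	char *buf;       /* destination buffer */
	size_t expected; /* total number of bytes wanted */
	size_t received; /* bytes accumulated so far */
};

enum rx_status { RX_COMPLETE, RX_AGAIN, RX_CLOSED, RX_ERROR };

/*
 * Make as much progress as possible without blocking. The caller is
 * expected to call this again once poll()/epoll reports the socket as
 * readable, until RX_COMPLETE is returned.
 */
static enum rx_status rx_continue(int fd, struct rx_state *rx)
{
	assert(rx && rx->buf && rx->received <= rx->expected);

	while (rx->received < rx->expected) {
		struct iovec iov = {
			.iov_base = rx->buf + rx->received,
			.iov_len = rx->expected - rx->received,
		};
		struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
		ssize_t ret = recvmsg(fd, &msg, MSG_DONTWAIT);

		if (ret > 0) {
			rx->received += (size_t) ret; /* partial reception */
		} else if (ret == 0) {
			return RX_CLOSED; /* peer shut down mid-message */
		} else if (errno == EINTR) {
			continue;
		} else if (errno == EAGAIN || errno == EWOULDBLOCK) {
			return RX_AGAIN; /* nothing more to read for now */
		} else {
			return RX_ERROR;
		}
	}
	return RX_COMPLETE;
}
```

Keeping the received count in the per-connection state is what lets a single thread service many peers: an unresponsive peer only leaves its own buffer half-filled instead of stalling the loop.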
Jérémie Galarneau [Tue, 24 Apr 2018 19:58:41 +0000 (15:58 -0400)]
Fix: unprivileged sessiond agent port clashes with root sessiond
This fix addresses the same problem as reported in
f28f9e44.
The session daemon now tries to bind the agent TCP socket to a
port within a range (10 ports by default). The session daemon
will use the first available TCP port within that range when
binding to "localhost". It is still possible to restrict the
session daemon to the previous single-port behaviour by specifying
an agent port with the --agent-tcp-port PORT option. If that option
is used, the session daemon will attempt to bind to that port. If it
fails, agent tracing will be marked as disabled.
This fix is backported since the current logic of binding to a
set port means that the default configuration on Ubuntu, Debian,
and other distributions that launch an lttng-sessiond on boot does
not allow the tracing of agent domains (Java Util Logging, log4j,
and Python logging back-ends).
By default, users are not part of the tracing group and it is
not reasonable to expect users to be part of that group for
userspace tracing.
The behaviour of the "system" lttng-sessiond does not change
as it will bind on the first available port within the range.
The non-privileged session daemons that are launched afterwards
will be able to bind on other ports available within the range.
Reported-by: Deborah Barnard <starfallprojects@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
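The port-range behaviour described in the commit above can be sketched as a simple bind loop. bind_first_free_port() is a hypothetical helper, not the session daemon's code, and any range passed to it here is an arbitrary example rather than the daemon's default range.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a TCP socket to the first free loopback
 * port in [begin, end]. Returns the listening socket and stores the
 * port in *bound, or returns -1 if no port in the range is available.
 */
static int bind_first_free_port(uint16_t begin, uint16_t end,
		uint16_t *bound)
{
	uint32_t port; /* wider than uint16_t so that end == 65535 is safe */

	assert(bound && begin <= end);
	for (port = begin; port <= end; port++) {
		struct sockaddr_in addr;
		int fd = socket(AF_INET, SOCK_STREAM, 0);

		if (fd < 0) {
			return -1;
		}
		memset(&addr, 0, sizeof(addr));
		addr.sin_family = AF_INET;
		addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
		addr.sin_port = htons((uint16_t) port);
		if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
				listen(fd, 1) == 0) {
			*bound = (uint16_t) port;
			return fd;
		}
		close(fd); /* port taken (or other error): try the next one */
	}
	return -1;
}
```

With such a loop, a first daemon takes the first port of the range and a daemon started later settles on the next free one, which is the coexistence the commit message describes.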
Jérémie Galarneau [Tue, 24 Apr 2018 15:21:37 +0000 (11:21 -0400)]
Fix: erroneous use of extern keyword
The extern keyword is erroneously (or at least, uselessly) used
for an internal API where LTTNG_HIDDEN is meant to be used.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 23 Apr 2018 23:03:16 +0000 (19:03 -0400)]
Fix: failure to launch agent thread is not reported
A session daemon may fail to launch its agent thread. In such
a case, the tracing of agent domains fails silently as events
never get enabled through the agent.
The problem that was reported was caused by a second session
daemon being already bound on the agent TCP socket port, which
prevented the launch of the agent thread.
While tracing is still not possible in this situation, the user
will at least get an error saying so when enabling an event in
those domains.
Reported-by: Deborah Barnard <starfallprojects@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 23 Apr 2018 20:36:25 +0000 (16:36 -0400)]
Fix: agent may not be ready on launch
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 23 Apr 2018 19:45:12 +0000 (15:45 -0400)]
Cleanup: misleading variable name
Using "running" implies that the thread is guaranteed to be
functional/ready. The intention of those "running" flags is only
to indicate that the underlying pthread was created. The thread
may not be running anymore and these flags should not be used
to check if the thread is "ready" to process anything.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 23 Apr 2018 19:29:39 +0000 (15:29 -0400)]
Fix: checking for existing session daemon is done after daemonizing
The session daemon checks that no other session daemons are
running only after daemonizing. This means that launching the
daemon in background or daemon modes will appear to succeed even
if the launch failed due to an already present daemon.
The check is performed using both the client socket and the lock
file. This fix also addresses another problem that would cause
the pid file to be overwritten and deleted even if the session daemon
failed to launch.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
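The lock-file half of the check described in the commit above can be sketched as follows. acquire_instance_lock() is a hypothetical helper and the use of flock() is an assumption made for illustration; the log does not say which locking primitive the session daemon relies on.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: take an exclusive, non-blocking lock on a lock
 * file. Returns the descriptor holding the lock, or -1 if the file
 * cannot be opened or another instance already holds the lock.
 */
static int acquire_instance_lock(const char *path)
{
	int fd;

	assert(path);
	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0) {
		return -1;
	}
	if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
		close(fd); /* someone else is running: report it now */
		return -1;
	}
	return fd;
}
```

Calling such a helper before daemonizing lets the launching process exit with an error while it is still attached to the terminal. A flock() lock belongs to the open file description, so it remains held by the child after fork() as long as the descriptor is kept open.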