Jérémie Galarneau [Thu, 8 Nov 2018 17:09:04 +0000 (12:09 -0500)]
Remove unused nr_stream_rotate_pending from consumer channel
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 30 Oct 2018 12:47:52 +0000 (13:47 +0100)]
Fix: session destruction blocks indefinitely if rotation is ongoing
Issue
---
The destruction of an active session can hang indefinitely if it
occurs while a rotation is ongoing. This was observed when automatic
session rotations were scheduled on a time basis.
The destruction of the session causes it to be stopped. The 'stop'
command causes the session's timers to be stopped. These timers
include the rotation pending check timer.
Meanwhile, 'data pending' queries are performed against the session
until one of them returns that no data is pending.
The 'data pending' check returns that data is pending if a session
rotation is ongoing at the moment of the check.
Hence, stopping the rotation completion check timer causes the
session to remain in the 'rotation ongoing' state forever and
prevents the session destruction from completing.
Solution
---
The session's rotation schedule timer is correctly stopped when
a 'stop' is performed; we don't want new rotations to be issued
from this point. However, it is incorrect to stop the
'rotation pending check' timer at this stage if a rotation is
ongoing.
This commit leaves the 'rotation pending check' timer running,
allowing the rotation thread to update the session's rotation
state on completion of the rotation. The operations that were
performed as part of the stop command, namely renaming the
'current' chunk, are then performed from the context of the
rotation thread.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 30 Oct 2018 13:15:27 +0000 (14:15 +0100)]
Clean-up: remove non-existent function's declaration
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 17 Oct 2018 21:16:12 +0000 (17:16 -0400)]
Always choose large event header for UST channels
UST can receive the session start command before all probe provider
library constructors have completed running, therefore finding fewer
events than eventually enabled within the process. Moreover, with
per-uid buffers, many processes end up registering events into shared
buffers. Therefore, the guess based on number of events from the first
process to use the buffer is incorrect.
Considering that we typically have applications with more than 30
events, we will modify the session daemon so it selects the "large"
header type independently of the number of events.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 12 Oct 2018 23:49:42 +0000 (19:49 -0400)]
Fix relayd: stream index file created in the wrong directory
This fix addresses an issue that can cause a stream's index
file to be created in the wrong trace archive chunk's
directory.
The data connection creates a stream's first index file.
This can happen _after_ a ROTATE_STREAM command. More specifically,
the data of the first packet of a stream can be received after a
ROTATE_STREAM command.
The ROTATE_STREAM command changes the stream's path_name to point to
the "next" chunk. If a rotation is pending for a stream, as
indicated by "rotate_at_seq_num != -1ULL", it means that we are still
receiving data that belongs in the stream's former path.
In fact, we may have never received any data for this stream at this
point.
In this specific case, we must ensure that the index file is created
in the stream's former path, "prev_path_name", on reception of the
first packet's data on the data connection.
All other rotations beyond the first one are not affected by this
problem since the actual rotation operation creates the new chunk's
index file.
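As a rough sketch of the path selection described above (the struct
below is hypothetical and only mirrors the fields named in this
message; it is not relayd's actual stream structure):

```c
#include <stdint.h>

/* Hypothetical subset of a relay stream; illustration only. */
struct stream_paths {
	const char *path_name;      /* "next" chunk once ROTATE_STREAM was received */
	const char *prev_path_name; /* former chunk's path */
	uint64_t rotate_at_seq_num; /* -1ULL when no rotation is pending */
};

/* Directory in which the stream's first index file should be created. */
const char *first_index_file_path(const struct stream_paths *s)
{
	if (s->rotate_at_seq_num != -1ULL) {
		/* Rotation pending: incoming data belongs to the former chunk. */
		return s->prev_path_name;
	}
	return s->path_name;
}
```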
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 12 Oct 2018 23:49:18 +0000 (19:49 -0400)]
relayd: add payload logging to session rotation commands
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 12 Oct 2018 22:34:06 +0000 (18:34 -0400)]
relayd: rename stream prev_seq to prev_data_seq
Since there are now two "previous sequence numbers" that are
tracked, it makes sense to give them more descriptive names.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 12 Oct 2018 22:31:51 +0000 (18:31 -0400)]
Fix: take index seq number into account for rotation pending check
The rotation pending check is only performed on the sequence number of
the received data. However, it is expected that the index of the
stream has been written to disk by the time this check returns that
no rotation is pending.
This patch ensures that the minimum of the data and index
sequence numbers is used to perform this check.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 12 Oct 2018 22:22:35 +0000 (18:22 -0400)]
Fix: take index sequence number into account for data pending check
The data pending checks are only performed on the sequence number of
the received data. However, it is expected that the index of the
stream (when applicable) has been written to disk by the time this
check returns that no data is pending.
This patch ensures that the minimum of the data and index
sequence numbers is used to perform this check.
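A minimal sketch of such a check, assuming both counters hold valid
sequence numbers (no "nothing received yet" sentinel). The struct and
function names are hypothetical and do not reproduce relayd's code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a relay stream's progress. */
struct stream_progress {
	uint64_t prev_data_seq;  /* last data packet sequence number received */
	uint64_t prev_index_seq; /* last index sequence number flushed to disk */
};

/* Sequence number up to which both the data and the index are complete. */
uint64_t completed_seq(const struct stream_progress *s)
{
	return s->prev_data_seq < s->prev_index_seq ?
			s->prev_data_seq : s->prev_index_seq;
}

/* Data is pending while the completed position is below last_net_seq. */
bool data_pending(const struct stream_progress *s, uint64_t last_net_seq)
{
	return completed_seq(s) < last_net_seq;
}
```

With this approach, a stream whose data has arrived but whose index has
not yet been flushed is still reported as pending.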
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 12 Oct 2018 22:05:10 +0000 (18:05 -0400)]
relayd: keep track of prev_index_seq in relayd_stream
The rotation and data pending checks are only performed on the
sequence number of the received data. However, it is expected
that the index of the stream (when applicable) has been written
to disk when those checks say that their respective operations
have completed.
This patch only introduces a new 'prev_index_seq' position that
is updated when an index is flushed to disk.
A follow-up fix addresses the issue mentioned above.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 5 Oct 2018 15:55:39 +0000 (11:55 -0400)]
Fix: session conditions not evaluated at subscription/registration
Conditions bound to sessions (session rotation ongoing/completed)
are not automatically evaluated when a notification channel client
subscribes to them or when a client is subscribed _before_ the
trigger is created.
The problematic scenario is:
- Trigger is registered to notify on session rotation ongoing for
session 'foo',
- A rotation is launched on session foo (but not completed)
- A client subscribes to 'session rotation ongoing' notifications for
session 'foo'
In this scenario, the client would not be notified of the 'current'
state of the session.
Whether or not a client is notified of the 'current' state at the
time of subscription/registration is defined per-condition. In
the case of 'session rotation ongoing', it is desirable for clients
to be notified that the rotation is ongoing at the time of their
subscription/registration.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 5 Oct 2018 16:06:37 +0000 (12:06 -0400)]
Remove unnecessary check of output parameter
It is not necessary to check for `_notification != NULL` as it
is done at the beginning of the function. Moreover, it confuses
Coverity which warns that `notification` will be leaked if the
output parameter is NULL.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 4 Oct 2018 21:50:27 +0000 (17:50 -0400)]
Allow get_next_notification to return when interrupted
Applications (and scripts) which consume a given set of notifications
indefinitely may fail to exit if a SIGTERM handler is registered.
lttng_notification_channel_get_next_notification() blocks indefinitely
on recvmsg() until a new notification is available. The wrapper that
is used to do so automatically restarts the recvmsg() if it is
interrupted, thus not allowing clients a chance to cleanly exit.
This change causes the notification channel to wait for a message to
be available using select() before starting the actual reception of
the data and return LTTNG_NOTIFICATION_CHANNEL_STATUS_INTERRUPTED if a
signal occurs during the wait.
If data is available, it is assumed that the message is well-formed
and can promptly be received in its entirety. The goal of this change
is to give a monitoring application a chance to leave the
get_next_notification() function and check if it should exit.
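The wait step can be sketched as follows. This is an illustration
under assumptions, not the liblttng-ctl implementation: the
wait_status enum and wait_for_message() are hypothetical names, and
the actual reception of the message is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical status codes for this sketch; not the liblttng-ctl enum. */
enum wait_status {
	WAIT_READY,
	WAIT_INTERRUPTED,
	WAIT_ERROR,
};

/*
 * Block until `fd` is readable. Unlike a wrapper that restarts the call
 * on EINTR, report the interruption so the caller can decide to exit.
 */
enum wait_status wait_for_message(int fd)
{
	fd_set read_fds;
	int ret;

	if (fd < 0 || fd >= FD_SETSIZE) {
		return WAIT_ERROR;
	}

	FD_ZERO(&read_fds);
	FD_SET(fd, &read_fds);

	/* No timeout: wait until data arrives or a signal is delivered. */
	ret = select(fd + 1, &read_fds, NULL, NULL, NULL);
	if (ret < 0) {
		return errno == EINTR ? WAIT_INTERRUPTED : WAIT_ERROR;
	}
	return WAIT_READY;
}
```

A caller would only start the actual recvmsg() once WAIT_READY is
returned, and would check its own exit flag on WAIT_INTERRUPTED.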
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 4 Oct 2018 02:05:32 +0000 (22:05 -0400)]
Fix: register rotation thread as RCU thread
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 4 Oct 2018 00:50:35 +0000 (20:50 -0400)]
Docs: comment typo fix (accomodates -> accommodates)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 3 Oct 2018 15:28:11 +0000 (11:28 -0400)]
Fix: uninitialized variable may be used in local rotation check
** CID 1395985: Uninitialized variables (UNINIT)
'ret' may be left uninitialized if no consumer daemons are
iterated-upon to perform a local rotation pending check.
In practice this won't happen as that would mean that the
ltt_session has no user space nor kernel session action, thus
no rotation would be launched.
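The pattern flagged here can be illustrated with a short sketch. The
function below is hypothetical and unrelated to the session daemon's
code; it only shows why the result must be initialized when the loop
body may never run:

```c
#include <stddef.h>

/* Returns the first non-zero result in `results`, or 0 if there is none. */
int check_all(const int *results, size_t count)
{
	int ret = 0; /* initialized: the loop may not iterate at all */
	size_t i;

	for (i = 0; i < count; i++) {
		ret = results[i];
		if (ret) {
			break;
		}
	}
	return ret;
}
```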
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 2 Oct 2018 18:06:17 +0000 (14:06 -0400)]
Rename sessiond-timer.[hc] to timer.[hc]
There is no need to namespace the timer files as they are already
contained within the lttng-sessiond directory.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 21 Sep 2018 22:16:05 +0000 (18:16 -0400)]
Fix: rotation may never complete in per-PID buffering mode
Issue
-----
The current scheme to ensure that a rotation is completed
consists in the following, from the session daemon's perspective:
Iterate on all channels:
- Ask the consumerd to sample the current "write" positions
- Increment a count of channels being rotated
Wait for the consumer daemon to notify the session daemon every time
all of a channel's streams' "read" positions have reached the sampled
"write" position.
The idea behind this is making sure that all the data that was
produced before a rotation was triggered has been consumed (i.e.
been written to a local FS or streamed to the relay daemon) before
marking the rotation as completed.
However, this assumes that the session daemon is always aware of
all channels/streams that exist at the moment at which the rotation is
initiated. This is only true for the kernel domain.
In per-PID buffer mode, it is possible for an application, and its
buffers, to be torn down at any moment. Thus the following scenario
can happen:
- The application fills its buffers, causing the consumerd to fall
behind
- The application exits, leaving its full buffers behind to be
extracted by the consumer daemon
- The session daemon removes anything to do with the application from
its internal structures, including its channels
- A rotation is initiated
- The positions of the application's buffers are never sampled as the
session daemon does not see the channels when iterating on the
session's channels
Multiple bad things can happen from there.
First, the rotation can be marked as "completed" while the consumerd
is still extracting the dead application's buffers, causing readers
to consume an incomplete/corrupted trace.
Second, if the session is being streamed to a relay daemon, it is
possible for the 'rename' command to be issued before the contents
of the buffers have been written, causing indexes to fail to be
flushed (as the relay daemon attempts to write them to a now-defunct
location).
Solution
--------
Eliminate the pipe between the session daemon and consumer daemon that
is used to signify that a rotation is completed as the information is
unreliable.
The rotation thread now periodically asks the consumer daemon to check
for channels that have a pending rotation for a given session_id or
that belong to the ongoing rotation archive id.
Hence, for every stream:
- If the archive id during which it was created is '>' than that of
the ongoing rotation, we don't need to consider it
- If the current position is '>=' than the sampled rotation position,
we can consider its rotation 'done'
- If it belongs to the pending rotation archive id and doesn't have
a "target" position, it was unknown to the session daemon and the
application associated with it is dead. We must wait for the
stream to be flushed and torn down before assuming that the
rotation was completed.
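The per-stream rules above can be sketched as a single predicate. The
struct and its field names are hypothetical, not the consumer daemon's
actual data structures. The case of a stream from an older archive
without a target position is not covered by this message; the sketch
treats it as not pending, which is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stream state; field names are illustrative. */
struct stream_state {
	uint64_t created_in_archive_id; /* archive id active at creation */
	bool has_rotate_position;       /* was a target position sampled? */
	uint64_t rotate_position;       /* sampled "write" position */
	uint64_t current_position;      /* consumed ("read") position */
};

/* Returns true if this stream still blocks completion of the rotation. */
bool stream_rotation_pending(const struct stream_state *s,
		uint64_t ongoing_archive_id)
{
	if (s->created_in_archive_id > ongoing_archive_id) {
		/* Created after the rotation was launched: not considered. */
		return false;
	}
	if (s->has_rotate_position) {
		/* Done once the current position reaches the sampled one. */
		return s->current_position < s->rotate_position;
	}
	/*
	 * No target position: the stream was unknown to the session daemon
	 * and its application is dead. It stays pending for as long as it
	 * exists, that is, until it is flushed and torn down.
	 */
	return s->created_in_archive_id == ongoing_archive_id;
}
```

The rotation would be considered complete once no remaining stream
satisfies this predicate.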
Drawbacks
---------
This polling approach is somewhat inefficient and can cause rotations
to take longer to complete than necessary, especially in high-latency
networking conditions.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:15 +0000 (20:09 -0400)]
Fix: perform local data pending before checking data pending with relayd
Performing the data pending check in two phases, local and network,
reduces the total number of network operations needed.
Doing the local check first enables an early return in cases where data
is still pending locally.
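A minimal sketch of the two-phase ordering, using a hypothetical
pending_state struct that simulates both checks and counts the
simulated relayd round trips (none of these names exist in
lttng-tools):

```c
#include <stdbool.h>

/* Hypothetical state used to simulate the two checks. */
struct pending_state {
	bool local_pending;
	bool remote_pending;
	unsigned int network_queries; /* simulated relayd round trips */
};

static bool check_local(const struct pending_state *s)
{
	return s->local_pending;
}

static bool check_relayd(struct pending_state *s)
{
	s->network_queries++;
	return s->remote_pending;
}

/* Local phase first; the network phase runs only if nothing is pending locally. */
bool data_pending(struct pending_state *s)
{
	if (check_local(s)) {
		return true; /* early return, no network operation */
	}
	return check_relayd(s);
}
```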
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 18 Sep 2018 01:18:33 +0000 (21:18 -0400)]
Fix: missing header breaks the cygwin build
stddef.h must be included to use ssize_t.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:11 +0000 (20:09 -0400)]
Fix: double put on error path
Let relay_index_try_flush be responsible for the self-reference put on
error path.
Code flow of relay_index_try_flush is a bit tricky but the only error
flow (via relay_index_file_write) will always mark the index as flushed
and perform the self-reference put.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:14 +0000 (20:09 -0400)]
Fix: holding the stream lock does not equate to having data pending
The live timer can hold the stream lock while sending an empty beacon.
An empty beacon does not mean that data is still pending for the stream.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:13 +0000 (20:09 -0400)]
Fix: skip uid registry when metadata key is 0
A value of zero for the metadata key indicates that metadata was never
created/pushed to the consumer.
This can occur in scenarios where a tracker is present since metadata
might never be created/pushed.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 17 Sep 2018 22:15:11 +0000 (18:15 -0400)]
Docs: document the meaning of a ust app channel key set to 0
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 11 Sep 2018 00:09:12 +0000 (20:09 -0400)]
Fix: acquire stream lock during kernel metadata snapshot
The stream lock is not taken when interacting with the kernel
metadata stream that is created at the time a snapshot is taken.
This was noticed while reviewing the code for an unrelated reason,
so there is no known problem caused by this. Nevertheless, this
is incorrect as the stream is globally visible in the consumer.
Moreover, the stream was not cleaned-up which can cause a leak
whenever a metadata snapshot fails.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Fri, 7 Sep 2018 19:18:38 +0000 (15:18 -0400)]
Fix: skip closed session on viewer listing
There is no value in listing a closed session. A viewer cannot hook
itself to a closed session in live mode and the session is about to be
removed from the sessions hash table.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 7 Sep 2018 19:18:37 +0000 (15:18 -0400)]
Fix: use LTTNG_VIEWER_ATTACH_UNK to report a closed session
LTTNG_VIEWER_NEW_STREAMS_HUP is not a valid error number for the
LTTNG_VIEWER_ATTACH_SESSION command. This results in erroneous error
reporting on the client side.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 17 Sep 2018 16:19:40 +0000 (12:19 -0400)]
Doc: withinin -> within
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 6 Jun 2018 16:06:04 +0000 (12:06 -0400)]
Fix: cleanup relayd sockets on rotation command communication error
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 6 Jun 2018 01:00:28 +0000 (21:00 -0400)]
Fix: perform relayd socket pair cleanup on control socket error
A reference to the local context for the socket pair is used to "force" an
evaluation of the data and metadata streams since we changed the endpoint
status. This imitates what is currently done for the data socket.
This prevents hitting network timeouts multiple times in a row when an
error occurs. For now, there is no retry mechanism, hence
"terminating" all communication makes sense and prevents unwanted delays
on operations.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 13 Sep 2018 21:04:45 +0000 (17:04 -0400)]
Fix: relayd control socket mutex is not destroyed
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 3 Jul 2018 18:49:23 +0000 (14:49 -0400)]
Tests: do not bound test app iterations when in background mode
On systems with a high number of CPUs and slow disk, taking snapshots
can take a long time. When running a long regression test, the tests
sometimes outlive the test application.
The test application then exits since the required number of
iterations was completed (NR_ITER=2000000).
Set the iterations parameter to -1 to ensure the application keeps
producing events for the duration of the test.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 11 Sep 2018 19:11:39 +0000 (15:11 -0400)]
Tests: add missing rotation and autoload tests to check target
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 5 Jul 2016 19:23:42 +0000 (15:23 -0400)]
Tests: remove temporary folder
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 5 Jul 2016 18:38:46 +0000 (14:38 -0400)]
Tests: remove mi result files when done
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 14 Oct 2015 13:57:42 +0000 (09:57 -0400)]
Tests: Remove unused set +x
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 6 Oct 2015 21:10:56 +0000 (17:10 -0400)]
Tests: Kill relayd after sessiond to ensure a clean tear down
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 6 Oct 2015 16:07:41 +0000 (12:07 -0400)]
Tests: Remove unused variable
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 30 Sep 2015 22:41:30 +0000 (18:41 -0400)]
Tests: Use stop relayd from utils.sh
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 30 Sep 2015 22:38:13 +0000 (18:38 -0400)]
Tests: remove declaration already present in utils.sh
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Anders Wallin [Thu, 26 Jul 2018 07:46:28 +0000 (09:46 +0200)]
Tests: added test_autoload to noinst_SCRIPTS
Signed-off-by: Anders Wallin <wallinux@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 7 Sep 2018 14:40:04 +0000 (10:40 -0400)]
Fix: Memory leak on run_as worker restart error path
Reported-by: Coverity (1395614) Resource leak
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 7 Sep 2018 01:39:18 +0000 (21:39 -0400)]
Fix: non-zero return of open handled as error
The open() run_as wrapper marks any non-zero return value
of open() as an error, causing the transmission of the file
descriptor to be skipped.
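For illustration, open() returns a non-negative file descriptor on
success and -1 on error, so a correct failure test looks like the
hypothetical helper below (not the run_as code itself):

```c
#include <stdbool.h>

/*
 * open() returns a file descriptor >= 0 on success and -1 on error.
 * Testing `ret != 0` would wrongly treat every valid fd except 0 as
 * a failure.
 */
bool open_failed(int open_ret)
{
	return open_ret < 0;
}
```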
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 7 Sep 2018 01:25:13 +0000 (21:25 -0400)]
Fix: global run_as worker lock released during restart
The global run_as lock should not be released during the restart of
the worker as other threads could then start dispatching commands
while the worker is recovering from an error.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 6 Sep 2018 22:11:25 +0000 (18:11 -0400)]
Fix: runas worker attempts to send invalid fd to master
Commands which return a file descriptor (i.e. RUN_AS_OPEN) attempt
to send the resulting file descriptor even on failure. However,
this is not permitted by the UNIX socket interface.
As a result, skip the reception of the file descriptor payload
when a command fails. The 'master' end is also adapted to skip
the reception of the file descriptor in the case of an error.
A check has also been added to ensure that the 'master' end does
not attempt to send invalid file descriptors to the worker process.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 6 Sep 2018 21:40:06 +0000 (17:40 -0400)]
Fix runas: don't attempt close negative fd
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 5 Sep 2018 03:24:19 +0000 (23:24 -0400)]
Fix: tests: missing frame pointer for callstack test on some compiler
The callstack testcase fails when the testapp is built with GCC 8. This
is because GCC 8 may not emit frame pointers even when the
`-fno-omit-frame-pointer` flag is used.
To prevent that, we manually mark these functions with optimization level
0.
On Clang, we also need to include the `-mno-omit-leaf-frame-pointer` flag
alongside the existing `-fno-omit-frame-pointer` to ensure that
frame pointers are emitted. It's not clear if this incompatibility with
GCC is expected [1].
[1]: https://bugs.llvm.org/show_bug.cgi?id=9825
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 21:04:45 +0000 (17:04 -0400)]
Add release name and description to configure.ac
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 20:50:02 +0000 (16:50 -0400)]
Update version to v2.11.0-rc1
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 18:49:24 +0000 (14:49 -0400)]
Missing kernel test files in dist target
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Fri, 31 Aug 2018 15:59:56 +0000 (11:59 -0400)]
elf: support dynamic symbol table lookup
Background
==========
There may be two symbol tables in a shared object or executable. The
normal symbol table (.symtab) and the dynamic symbol table (.dynsym).
The normal symbol table contains lots of information, such as static
linking data, but none of it is used at runtime. This is why some
shared libraries are 'stripped', reducing the final size of the file.
Stripping an object file removes the entire .symtab section of the elf
file, amongst other things.
The dynamic symbol table contains symbols that are needed for dynamic
linking of the shared object. The symbols in that section form a subset
of the symbols contained in the normal symbol section (before
stripping). The .dynsym section is left untouched when stripping a file
as it is needed at runtime.
Current limitation
==================
The current elf parsing implementation looks for the normal symbol
section (.symtab) to find the target symbol. If the .symtab is not
found, the parsing stops and returns that the symbol was not found. As
explained in the section above, a shared library might be stripped of
its normal symbol table, but still have a dynamic symbol table (.dynsym)
containing the information of the target symbol. For example, on
distributions where libc is stripped, the malloc symbol can only be
found in the .dynsym section.
Solution
========
Look for the normal symbol section first and, if it's found, use it to
find the symbol, as was previously done. If the .symtab is absent,
try to use the dynamic symbol section instead.
This commit also adds a testcase for this feature.
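The lookup order can be sketched as follows. The section struct is a
hypothetical stand-in for parsed ELF section headers; the sketch only
shows the .symtab-then-.dynsym fallback, not the actual parsing:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal description of an ELF section. */
struct section {
	const char *name;
};

/* Returns the section named `name`, or NULL if it is absent. */
static const struct section *find_section(const struct section *sections,
		size_t count, const char *name)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (strcmp(sections[i].name, name) == 0) {
			return &sections[i];
		}
	}
	return NULL;
}

/* Prefer .symtab; fall back to .dynsym when the object is stripped. */
const struct section *pick_symbol_table(const struct section *sections,
		size_t count)
{
	const struct section *table = find_section(sections, count, ".symtab");

	if (!table) {
		table = find_section(sections, count, ".dynsym");
	}
	return table;
}
```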
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 17:56:04 +0000 (13:56 -0400)]
Fix: leak of event attributes on copy failure
Reported-by: Coverity Scan (1243042 Resource leak)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 17:47:36 +0000 (13:47 -0400)]
Test fix: check length of input string
Reported-by: Coverity Scan (395327 Unbounded source buffer)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 17:41:00 +0000 (13:41 -0400)]
Test cleanup: wrong indentation style in test_ust_data.c
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 17:39:50 +0000 (13:39 -0400)]
Test fix: leak of exclusions on allocation error
Reported-by: Coverity Scan (1395328 Resource leak)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 31 Aug 2018 17:33:51 +0000 (13:33 -0400)]
Fix: runas check fd value before calling close()
A bug could cause an 'open' command to return no FD in which
case the initial value of '-1' would be used in the call to
close().
Reported-by: Coverity Scan (1395329 Improper use of negative value)
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 20:21:42 +0000 (16:21 -0400)]
Docs: multiple rotation schedules can be active
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 20:19:53 +0000 (16:19 -0400)]
Docs: immediate rotations can be performed with active schedules
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 19:42:41 +0000 (15:42 -0400)]
Fix: ret variable is used instead of cmd_ret in disable-rotation
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 19:39:30 +0000 (15:39 -0400)]
Cleanup: unused assignation on rotation error
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 19:37:51 +0000 (15:37 -0400)]
Cleanup: unused assignation on rotation already pending
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 19:36:38 +0000 (15:36 -0400)]
Fix: unchecked writer open element return value
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 18:51:52 +0000 (14:51 -0400)]
Remove unused session current_archive_location accessor
This function was replaced by
lttng_rotation_handle_get_archive_location() which requires
an lttng_rotation_handle to be used, making its use less
error-prone.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 18:49:29 +0000 (14:49 -0400)]
Fix: incorrect error message on regenerate missing argument
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 18:45:23 +0000 (14:45 -0400)]
Fix: incorrect error message on metadata missing argument
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 18:32:10 +0000 (14:32 -0400)]
Fix: snapshot command mishandles missing arguments
The snapshot command does not print explicit errors when
arguments are missing. This commit introduces more error
reporting and ensures that lttng_error_code and cmd_error_code
values are not freely mixed.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 17:45:40 +0000 (13:45 -0400)]
Cleanup: improve readability of filter expression condition
In this situation, a logical inequality '!=' is equivalent to the
binary xor '^' that was used.
However, while it is equivalent, mixing logical ('!') and bitwise
operators ('^') makes this code harder to read than it needs to be.
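To illustrate the equivalence (the two helpers below are hypothetical
and are not the filter code itself): since '!' always yields 0 or 1,
xor and inequality of two negated operands give the same result.

```c
#include <stdbool.h>

/* Each operand of '^' or '!=' is the result of '!', so exactly 0 or 1. */
bool differ_xor(const void *a, const void *b)
{
	return !a ^ !b;
}

/* Same truth table, expressed with logical operators only. */
bool differ_neq(const void *a, const void *b)
{
	return !a != !b;
}
```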
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 17:36:37 +0000 (13:36 -0400)]
Fix: potential use of NULL path in stat() use
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 17:21:50 +0000 (13:21 -0400)]
Cleanup: unused assignment of curr_data_ptr in lttng_elf
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 17:14:33 +0000 (13:14 -0400)]
Fix: uninitialized data/ret in runas offset commands
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 17:10:40 +0000 (13:10 -0400)]
Fix: uninitialized fd value used in runas
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 17:00:53 +0000 (13:00 -0400)]
Fix: report setegid()/seteuid() failure in runas
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 30 Aug 2018 16:57:04 +0000 (12:57 -0400)]
Fix: leak of binary path on location creation error
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 22:47:44 +0000 (18:47 -0400)]
Fix: missing return value check in notification serialization
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 22:28:17 +0000 (18:28 -0400)]
Fix: possible leak of path in _utils_expand_path
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 22:12:43 +0000 (18:12 -0400)]
Fix: silent truncation in _utils_expand_path
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 22:06:25 +0000 (18:06 -0400)]
Cleanup: unused assignment of ret_code in ROTATE_CHANNEL
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:28:36 +0000 (17:28 -0400)]
Fix: passing null to closedir() on error
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:25:04 +0000 (17:25 -0400)]
Fix: unchecked access to pids array
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:23:38 +0000 (17:23 -0400)]
Fix: missing jump to error on allocation failure
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:22:44 +0000 (17:22 -0400)]
Cleanup: unused assignment of ret value
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:21:14 +0000 (17:21 -0400)]
Cleanup: unused assignment of ELF parsing error
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:17:03 +0000 (17:17 -0400)]
Fix: leak of probe_locs on error
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:10:25 +0000 (17:10 -0400)]
Fix: leak on agent event listing error
Jumping to the 'error' label after allocating tmp_events results
in a memory leak.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Aug 2018 21:06:48 +0000 (17:06 -0400)]
Fix: possible null dereference on communication error
lttng_ctl_ask_sessiond_fds_varlen() can return a positive
error code and NULL buffers if the sessiond uses a command
return code that is already negative.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
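The failure mode described in "Fix: possible null dereference on communication error" can be sketched as follows. This is a minimal illustration, not the lttng-ctl code: ask_daemon() and first_payload_byte() are hypothetical names, and the sign convention (the helper negates the daemon's code) is an assumption made for the example. The point is that a caller testing only for a negative return value would go on to dereference a NULL buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a client/daemon round trip (not the lttng-ctl
 * API). Returns the payload length on success, or the negated daemon code
 * on error. On error, *payload is left NULL.
 */
int ask_daemon(int daemon_code, char **payload)
{
	static const char reply[] = "payload";

	assert(payload);
	*payload = NULL;
	if (daemon_code != 0) {
		/* A daemon code that is already negative becomes positive here. */
		return -daemon_code;
	}
	*payload = malloc(sizeof(reply));
	if (!*payload) {
		return -1;
	}
	memcpy(*payload, reply, sizeof(reply));
	return (int) (sizeof(reply) - 1);
}

/* Returns the first payload byte, or -1 on any error. */
int first_payload_byte(int daemon_code)
{
	char *payload = NULL;
	int result = -1;
	int ret = ask_daemon(daemon_code, &payload);

	/* Checking `ret < 0` alone would miss the positive-error case. */
	if (ret <= 0 || !payload) {
		goto end;
	}
	result = payload[0];
end:
	free(payload); /* free(NULL) is a no-op. */
	return result;
}
```

Validating the buffer as well as the return code keeps the caller safe whichever sign the error takes.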
Jérémie Galarneau [Wed, 29 Aug 2018 20:56:52 +0000 (16:56 -0400)]
Fix: returned pids may be uninitialized
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 29 Aug 2018 12:37:48 +0000 (08:37 -0400)]
Fix: invalid destruction of lookup_method
When passing the lookup_method to the location create function we give
it the ownership of that structure. By setting our pointer to NULL, we
make sure the _destroy function won't free it.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
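The ownership hand-off described in "Fix: invalid destruction of lookup_method" follows a common C idiom, sketched below. The types and functions here (struct lookup_method, struct location, location_create(), build_location()) are hypothetical stand-ins for illustration, not the lttng-tools definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the lttng-tools types. */
struct lookup_method {
	int type;
};

struct location {
	struct lookup_method *method;
};

/* Takes ownership of `method` on success; returns NULL on failure. */
struct location *location_create(struct lookup_method *method)
{
	struct location *loc;

	assert(method);
	loc = calloc(1, sizeof(*loc));
	if (!loc) {
		return NULL;
	}
	loc->method = method;
	return loc;
}

/* Frees the location and the lookup method it owns. */
void location_destroy(struct location *loc)
{
	if (!loc) {
		return;
	}
	free(loc->method);
	free(loc);
}

/* Builds a location; returns NULL on error. */
struct location *build_location(void)
{
	struct location *loc = NULL;
	struct lookup_method *method = calloc(1, sizeof(*method));

	if (!method) {
		goto end;
	}
	loc = location_create(method);
	if (!loc) {
		goto end;
	}
	/* Ownership moved to `loc`: clear our pointer so cleanup skips it. */
	method = NULL;
end:
	free(method); /* No-op when ownership was transferred. */
	return loc;
}
```

Clearing the local pointer after a successful hand-off lets a single cleanup path serve both the success and error cases without a double free.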
Francis Deslauriers [Tue, 28 Aug 2018 21:09:48 +0000 (17:09 -0400)]
Fix: unused value in SDT probe description parsing
Reported-by: Coverity (1395199) Unused value
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 20:54:31 +0000 (16:54 -0400)]
Fix: use of uninitialized variable in C++ userspace-probe testapp
Reported-by: Coverity (1395206) Uninitialized scalar variable
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 20:43:12 +0000 (16:43 -0400)]
Fix: use of uninitialized value in error path
Reported-by: Coverity (1395212) Uninitialized pointer read
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 20:12:40 +0000 (16:12 -0400)]
Fix: leaking string by setting pointer to NULL before freeing it
Reported-by: Coverity (1395200) Resource leak
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 20:01:02 +0000 (16:01 -0400)]
Fix: passing negative param to dup(2) on error
Reported-by: Coverity (1395195)
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 19:38:22 +0000 (15:38 -0400)]
Fix: use-after-free in UST test case
Create a copy of the exclusion structure to be able to compare both
structures after the event is created.
Reported-by: Coverity (1395194) Read from pointer after free
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 19:10:12 +0000 (15:10 -0400)]
Fix: leak in error handling of userspace param parsing
Reported-by: Coverity (1395217) Resource leak
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 28 Aug 2018 15:25:01 +0000 (11:25 -0400)]
Fix: Remove dead code in fd passing function
Found by Coverity:
CID 1395190 (#1 of 1): Logically dead code (DEADCODE)
dead_error_begin: Execution cannot reach this statement: fprintf(stderr, "Error: Inv...
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Mon, 27 Aug 2018 21:13:07 +0000 (17:13 -0400)]
Cleanup: avoid duplicating userspace-probe desc twice
In userspace probe location _copy functions, we duplicate the strings
(e.g. function name, provider name, etc.) before passing those new
strings to the *_create_no_check function. But this function also duplicates
those strings.
To remove this double duplication, we remove the calls to lttng_strndup() in
the _copy functions and pass the pointers to those strings directly to
the _create_no_check functions.
Also, the _create_no_check functions no longer call open() needlessly
when invoked from the _copy functions, since the _copy functions must
manually set a duplicated fd (using dup(2)) to avoid a file unlink race.
Fixes Coverity resource leak issues:
1395196, 1395192, 1395205, 1395211, and 1395214
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
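The double duplication removed by "Cleanup: avoid duplicating userspace-probe desc twice" can be illustrated with the sketch below. All names (struct probe_desc, probe_desc_create(), probe_desc_copy()) are hypothetical and only model the string handling; the fd duplication with dup(2) mentioned in the commit is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description holding one owned string. */
struct probe_desc {
	char *function_name;
};

/* Returns a heap copy of `s`, or NULL on allocation failure. */
static char *copy_string(const char *s)
{
	size_t size = strlen(s) + 1;
	char *copy = malloc(size);

	if (copy) {
		memcpy(copy, s, size);
	}
	return copy;
}

/* Duplicates `function_name`; the caller keeps ownership of its argument. */
struct probe_desc *probe_desc_create(const char *function_name)
{
	struct probe_desc *desc;

	assert(function_name);
	desc = calloc(1, sizeof(*desc));
	if (!desc) {
		return NULL;
	}
	desc->function_name = copy_string(function_name);
	if (!desc->function_name) {
		free(desc);
		return NULL;
	}
	return desc;
}

/*
 * Passes the source string directly: create() already duplicates it, so
 * duplicating it here as well would leak the intermediate copy.
 */
struct probe_desc *probe_desc_copy(const struct probe_desc *src)
{
	assert(src);
	return probe_desc_create(src->function_name);
}

void probe_desc_destroy(struct probe_desc *desc)
{
	if (!desc) {
		return;
	}
	free(desc->function_name);
	free(desc);
}
```

Keeping the duplication in one place (the create function) makes the ownership rule simple: callers never give up their strings.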
Francis Deslauriers [Mon, 27 Aug 2018 20:28:26 +0000 (16:28 -0400)]
Fix: memory leak in userspace probe param parsing
Found by Coverity:
CID 1395217: (RESOURCE_LEAK)
Variable "real_target_path" going out of scope leaks the storage it
points to.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Mon, 27 Aug 2018 20:11:02 +0000 (16:11 -0400)]
Fix: missing error handling goto statement in runas
Found by Coverity:
CID 1395218: Code maintainability issues (UNUSED_VALUE)
Assigning value "-1" to "ret" here, but that stored value is overwritten
before it can be used.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Mon, 27 Aug 2018 19:41:51 +0000 (15:41 -0400)]
Fix: use-after-free on error of lttng_event creation and copy
Found by Coverity:
>>> CID 1395219: Memory - illegal accesses (USE_AFTER_FREE)
>>> Using freed pointer "new_event".
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 28 Aug 2018 23:05:25 +0000 (19:05 -0400)]
Add function instrumentation type accessors to function location type
Since the uprobe instrumentation is currently limited to function
entries, and since support for instrumenting function returns
is planned to be introduced at some point, it makes sense
to introduce an "instrumentation type" attribute on function
locations.
Currently, the only available instrumentation type is
"entry", which matches what is supported by the kernel tracer as of
2.11.
In the future, RETURN and ENTRY_RETURN/BOTH instrumentation types
could be added without changing the default behavior of rules
such as userspace probes.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
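The "instrumentation type" attribute described in the commit above can be sketched as an enumeration with a default value and a pair of accessors. The names below are hypothetical and shortened for illustration; they are not the identifiers used by the liblttng-ctl API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical enumeration; only ENTRY is supported in this sketch. */
enum instrumentation_type {
	INSTRUMENTATION_TYPE_UNKNOWN = -1,
	INSTRUMENTATION_TYPE_ENTRY = 0,
	/* RETURN and ENTRY_RETURN values could be appended here later. */
};

struct function_location {
	enum instrumentation_type type;
};

/* New locations default to ENTRY, so existing users are unaffected. */
struct function_location *function_location_create(void)
{
	struct function_location *loc = calloc(1, sizeof(*loc));

	if (loc) {
		loc->type = INSTRUMENTATION_TYPE_ENTRY;
	}
	return loc;
}

enum instrumentation_type function_location_get_instrumentation_type(
		const struct function_location *loc)
{
	return loc ? loc->type : INSTRUMENTATION_TYPE_UNKNOWN;
}

/* Returns 0 on success, -1 if the requested type is not supported. */
int function_location_set_instrumentation_type(
		struct function_location *loc,
		enum instrumentation_type type)
{
	assert(loc);
	if (type != INSTRUMENTATION_TYPE_ENTRY) {
		return -1;
	}
	loc->type = type;
	return 0;
}
```

Because the default stays ENTRY, adding new enumerators later would not alter the behavior of code that never calls the setter.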