Introduce tmp_path to ensure that no code path can possibly try to free
the return value of utils_get_home_dir(). Re-using alloc_path for both
static and dynamically allocated pointers is error-prone.
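The ownership split described above can be sketched as follows. This is only an illustration: get_home_dir_sketch() stands in for utils_get_home_dir() (assumed here to return storage the caller must not free), and build_trace_path() and its arguments are hypothetical names, not the patched function.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for utils_get_home_dir(): the caller does not own
 * the returned storage and must never free it. */
static const char *get_home_dir_sketch(void)
{
	return "/home/user";
}

/*
 * Write "<base>/traces" into out. If user_path is NULL, the base is the
 * home directory. tmp_path only ever borrows; alloc_path only ever owns.
 */
static int build_trace_path(const char *user_path, char *out, size_t len)
{
	const char *tmp_path;		/* borrowed, never freed */
	char *alloc_path = NULL;	/* owned, always freed */
	int ret;

	if (user_path) {
		alloc_path = malloc(strlen(user_path) + 1);
		if (!alloc_path) {
			return -1;
		}
		strcpy(alloc_path, user_path);
		tmp_path = alloc_path;
	} else {
		tmp_path = get_home_dir_sketch();
	}

	ret = snprintf(out, len, "%s/traces", tmp_path);
	free(alloc_path);		/* free(NULL) is a no-op */
	return (ret < 0 || (size_t) ret >= len) ? -1 : 0;
}
```

Because free() is only ever called on alloc_path, no code path can free the borrowed home directory pointer.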
Fix: Live tracing does not honor live timer after first tracefile with tracefile rotation
When we pass to the 2nd sub-file (or following sub-files) of a stream in
relayd, the live timer has no visible effect from a live reader
perspective, and then everything is flushed when we reach the following
sub-file.
This is caused by the reset of stream->total_index_received after each
tracefile rotation. It should keep on incrementing to match what is
expected by the check in check_index_status():
Fix: UST subbuffers silently dropped on moderate trace traffic
Well, it looks like we really screwed up on this one.
lttng-tools commit 02b3d1769d5f8a33e4109b1e681141c9295dfda6 introduced
an important regression for lttng-ust tracing in the consumer daemon:
after reading a sub-buffer, a check has been added to see whether there
are more sub-buffers available to read, and if it is the case, it
ensures the wakeup pipe will be awakened again.
The issue lies in the use of ustctl_put_next_subbuf() in this check.
This acts as if the sub-buffer has been read, when in reality it has not
been read. It therefore trashes the data contained by this sub-buffer.
This check should use ustctl_put_subbuf(), which does not move the
consumer position.
This is a severe bug, and the fix needs to be applied to stable-2.6,
stable-2.5, and stable-2.4.
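The difference between checking for pending data and consuming it can be illustrated with a toy model. The toy_* names below are hypothetical and do not mirror the lttng-ust ring buffer or the ustctl API; the sketch only shows why a check that moves the consumer position loses data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a stream: count of sub-buffers produced and consumed. */
struct toy_stream {
	size_t produced;
	size_t consumed;
};

/* Consume one sub-buffer: the consumer position advances. */
static bool toy_read_subbuf(struct toy_stream *s)
{
	if (s->consumed == s->produced) {
		return false;
	}
	s->consumed++;
	return true;
}

/* Correct check: reports pending data, consumer position untouched. */
static bool toy_has_pending(const struct toy_stream *s)
{
	return s->consumed != s->produced;
}

/* Buggy check: "looks" by consuming, so one sub-buffer is silently lost. */
static bool toy_has_pending_buggy(struct toy_stream *s)
{
	return toy_read_subbuf(s);
}
```

Calling toy_has_pending_buggy() after every read would discard every other sub-buffer in this model, while toy_has_pending() leaves the data in place for the next read.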
Julien Desfossez [Wed, 12 Nov 2014 23:36:17 +0000 (18:36 -0500)]
Fix: create/destroy a splice_pipe per stream
We had a per-thread splice_pipe (one for data and one for metadata), but
in case of error, we would end up filling the write side of the pipe and
never emptying it. This could lead to leaking data from one session to
the other, and could also stall the consumer trying to splice into a full
pipe.
Now we create a splice_pipe per-stream, so it is destroyed when the
session is destroyed.
David Goulet [Tue, 7 Oct 2014 19:05:48 +0000 (15:05 -0400)]
Fix: return EINVAL if agent registration fails
The errno value might be 0, in which case no error was returned. This has
been seen with an unstable Python agent code base, which means it could
happen in the future if a third party decides to create an agent.
Signed-off-by: David Goulet <dgoulet@efficios.com>
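For the agent registration fix above, the fallback can be sketched in a few lines. registration_error_code() is a hypothetical helper, not a function from lttng-tools; it only shows the idea of never reporting success when errno was left at 0.

```c
#include <errno.h>

/*
 * Hypothetical helper: turn the errno saved after a failed call into a
 * negative error code, falling back to EINVAL when errno is 0.
 */
static int registration_error_code(int saved_errno)
{
	return saved_errno != 0 ? -saved_errno : -EINVAL;
}
```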
In order to correctly handle the use-case where events are enabled
_after_ trace is started, and _after_ applications are already being
traced, the event should be created in a "disabled" state, so that it
does not trace events until its filter is attached.
This fix needs to be done both in lttng-tools and lttng-ust. In order to
keep ABI compatibility between tools and ust within a stable release
cycle, we introduce a new "disabled" field within the struct lttng_ust_event
padding (previously zeroed). Newer LTTng-UST checks this flag, and
falls back on the old racy behavior (enabling the event on creation) if it
is unset.
Therefore, old session daemon works with newer lttng-ust of the same
stable release, and vice-versa. However, building lttng-tools requires
an upgraded lttng-ust, which contains the communication protocol with
the new "disabled" field.
This patch should be backported to stable-2.4, stable-2.5, stable-2.6
branches.
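Carving a field out of zeroed padding, as described above, can be sketched like this. The two structures are hypothetical and much smaller than the real struct lttng_ust_event; the field names and sizes are assumptions chosen for illustration.

```c
#include <stddef.h>

#define SKETCH_NAME_LEN	256
#define SKETCH_PADDING	16

/* Hypothetical "old" layout: the padding is reserved and zeroed. */
struct event_old {
	int instrumentation;
	char name[SKETCH_NAME_LEN];
	char padding[SKETCH_PADDING];
};

/* Hypothetical "new" layout: one int is taken from the padding. */
struct event_new {
	int instrumentation;
	char name[SKETCH_NAME_LEN];
	int disabled;
	char padding[SKETCH_PADDING - sizeof(int)];
};

/* The ABI holds only if the size and the existing offsets are unchanged. */
_Static_assert(sizeof(struct event_old) == sizeof(struct event_new),
	"struct size must not change");
_Static_assert(offsetof(struct event_old, padding) ==
	offsetof(struct event_new, disabled),
	"new field must live where the padding started");

/* An old sender leaves the padding zeroed, so disabled reads as 0. */
static int event_starts_enabled(const struct event_new *ev)
{
	return ev->disabled == 0;
}
```

A sender built against the old layout therefore yields the old behavior (event enabled on creation), while a newer sender can set the flag.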
David Goulet [Fri, 31 Oct 2014 17:23:29 +0000 (13:23 -0400)]
Fix: UST consumer sync all available metadata
In live mode, the sync metadata function was only working on one single
metadata stream of a given session ID. However, we can have multiple
metadata streams for the same session ID, so the data was not sent
correctly in live mode for the other streams.
This fixes it by simply iterating over all metadata streams for a session
ID and syncing them all.
Signed-off-by: David Goulet <dgoulet@efficios.com>
The issue uncovered a more serious problem. The loop over the ready FDs of
the thread was exiting at each branch, so not every FD was processed. This is
problematic when the thread quit pipe is triggered at the same time as a
request for metadata from the consumer, since the metadata request could
have been ignored.
This patch makes sure we go through all FDs in the loop when the thread
quit pipe or the metadata fd is triggered.
Signed-off-by: David Goulet <dgoulet@efficios.com>
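The loop fix described above can be sketched as follows. The structures and the handle_ready_fds() function are hypothetical; the real thread works on poll events, but the control flow is the point here: remember the quit request and keep iterating instead of leaving the loop at the first matching branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one ready file descriptor. */
struct ready_fd {
	bool is_quit_pipe;
	bool is_metadata_request;
};

struct loop_result {
	bool should_quit;
	size_t metadata_handled;
};

/* Visit every ready FD; note the quit request rather than returning early. */
static struct loop_result handle_ready_fds(const struct ready_fd *fds,
		size_t nb_fd)
{
	struct loop_result res = { false, 0 };
	size_t i;

	for (i = 0; i < nb_fd; i++) {
		if (fds[i].is_quit_pipe) {
			res.should_quit = true;
			continue;	/* later FDs still have to be served */
		}
		if (fds[i].is_metadata_request) {
			res.metadata_handled++;
		}
	}
	return res;
}
```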
Julien Desfossez [Wed, 27 Aug 2014 17:59:21 +0000 (13:59 -0400)]
Fix: make sure no index is in flight before using inactivity beacons
Since the index is sent in two parts on two separate connections from
the consumer, there can be cases where we receive an inactivity beacon
between the index creation and the data reception.
This fix prevents the inactivity beacon from being used if we know a data
index is coming.
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Fix: Parenthesize previous statement when adding conditions to a filter
Not parenthesizing the clauses in a filter string causes JUL events to be
traced even though they are not enabled when an enable-event command is
issued with a filter and the --loglevel-only option.
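To see why the parentheses matter, assume C-like operator precedence, where && binds tighter than ||. Appending a condition to the filter "a == 1 || b == 2" without parentheses gives "a == 1 || b == 2 && int_loglevel == 800", which groups as "a == 1 || (b == 2 && int_loglevel == 800)". A minimal sketch of the safe construction, with a hypothetical helper name and illustrative field names:

```c
#include <stdio.h>

/*
 * Hypothetical helper: AND an extra condition onto an existing filter.
 * Both operands are parenthesized so that operators inside either one
 * cannot change the meaning of the combined expression.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int append_filter_condition(char *out, size_t len,
		const char *filter, const char *condition)
{
	int ret = snprintf(out, len, "(%s) && (%s)", filter, condition);

	return (ret < 0 || (size_t) ret >= len) ? -1 : 0;
}
```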
Fix: parse_prob_opts return the actual success of the function
This bug has been triggered by the MI merging and the use of a
command_ret in the enable_events function. Previously, enable_events was
reusing the ret variable for another operation and always replacing ret.
Parse_probe_event returned the last output of sscanf, which represents
the number of matches and not the success of the operation.
Fixes #830
Signed-off-by: Jonathan Rajotte Julien <jonathan.r.julien@gmail.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
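The sscanf pitfall described above can be sketched as follows. parse_symbol_offset() is a hypothetical parser for a "symbol+offset" string, not the lttng function; it shows the match count being translated into a 0 / negative status instead of being returned directly.

```c
#include <stdio.h>

/*
 * Hypothetical parser for "symbol+0xoffset". symbol must point to a
 * buffer of at least 64 bytes. sscanf returns the number of converted
 * items, so it is compared against the expected count rather than
 * handed back to the caller as a status.
 */
static int parse_symbol_offset(const char *opt, char *symbol,
		unsigned long *offset)
{
	int match = sscanf(opt, "%63[^+]+%lx", symbol, offset);

	return match == 2 ? 0 : -1;
}
```

Returning match directly would report 2 on success, which a caller testing "ret != 0" treats as a failure.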
It is never locked in this function, but should be. This is triggering
spurious runtime failures on my system, where it seems that sessiond was
sometimes breaking the communication pipe with liblttng-ctl when the
unbalanced unlock is reached.
This should be backported to stable-2.4 and stable-2.5.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Fix: get the stream_id when generating live beacons
When we send an empty index (beacon), we need to extract the stream_id
to avoid stalling the client on inactive streams on startup.
Since the live clients need to know this feature is implemented, we had
to bump the lttng-live protocol version.
This fix should be backported to stable-2.4 as well.
Fix: memory leak in lttng_enable_event_with_exclusions
lttng_enable_event_with_exclusions leaks a filter expression when
automatically generated filter statements are used. This happens when
loglevel and logger name filtering are used when enabling JUL events.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Fix: alignment problems on targets not supporting unaligned access.
Accessing floats, doubles and 64-bit integers at unaligned addresses is not
supported on all configurations of ARM processors, and where it is supported
it is emulated and slow. This patch replaces direct assignments with memcpy.
Signed-off-by: Fredrik Markström <fredrik.markstrom@gmail.com>
Signed-off-by: Roy Li <rongqing.li@windriver.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
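The memcpy approach from the alignment fix above can be sketched with two small helpers. The names load_u64() and store_u64() are hypothetical; the technique itself is standard C and works for any address, aligned or not.

```c
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from an address that may be unaligned. */
static uint64_t load_u64(const void *src)
{
	uint64_t v;

	memcpy(&v, src, sizeof(v));
	return v;
}

/* Write a 64-bit value to an address that may be unaligned. */
static void store_u64(void *dst, uint64_t v)
{
	memcpy(dst, &v, sizeof(v));
}
```

A direct assignment such as *(uint64_t *) ptr = v on an unaligned ptr is what these helpers replace.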
David Goulet [Wed, 7 May 2014 17:53:36 +0000 (13:53 -0400)]
Fix: JUL filtering done on the UST level
This is to support enabling all events with different loglevels in two
different sessions.
For this, if any loglevel has been defined, the 'int_loglevel' filter
is added to the UST event. The liblttng-ust-jul library has been
modified to stop filtering loglevel in the agent.
This commit adds two tests, one for back-to-back sessions that are
destroyed and a second one for a multi-loglevel session.
Signed-off-by: David Goulet <dgoulet@efficios.com>
while [ -n "$(pidof $TESTAPP_BIN)" ]; do
sleep 1
done
pass "Wait for application end"
[...]
tracing_teardown
validate_trace $EXACT_EVENT_COUNT
It is possible that the check for "pidof $TESTAPP_BIN" occurs _before_
the execve() of the applications (starting the applications in background
with & is basically a clone() + execve()). In that case pidof finds no
process, so the loop exits at once without waiting for any application to
finish, and the tracing sessions are torn down prematurely. The resulting
trace then contains only some of the events. Since we validate against a
fixed number of events, the test fails because of this scheduling race.
The fix is to start the applications in foreground instead of background.
Signed-off-by: Christian Babeux <christian.babeux@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Simon Marchi [Thu, 10 Apr 2014 15:30:19 +0000 (11:30 -0400)]
Fix: rework utils_parse_size_suffix
Ok, so there are a lot of problems with this function (sorry :|). Taking
the regex road is probably too complicated for nothing, so here is a
version without regexes.
I added many test cases as suggested by Sandeep Chaudhary and Daniel
Thibault. I tested on both Intel 32 and 64 bits.
Fixes #633
Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: David Goulet <dgoulet@efficios.com>
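A regex-free size parser along the lines of the rework above could look like the sketch below. It is not the actual utils_parse_size_suffix() implementation: the function name is hypothetical, and the accepted suffixes (k, M, G as powers of 1024) and the rejection rules are assumptions made for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse "<number>[k|M|G]" into a byte count.
 * Decimal, octal (leading 0) and hex (leading 0x) numbers are accepted.
 * Returns 0 on success, -1 on any malformed or overflowing input.
 */
static int parse_size_suffix_sketch(const char *str, uint64_t *size)
{
	unsigned long long base;
	unsigned int shift = 0;
	char *end;

	if (!str || !size || !isdigit((unsigned char) str[0])) {
		return -1;	/* rejects empty, signed and non-numeric input */
	}

	errno = 0;
	base = strtoull(str, &end, 0);
	if (errno != 0) {
		return -1;	/* out of range for strtoull */
	}

	switch (*end) {
	case 'k': shift = 10; end++; break;
	case 'M': shift = 20; end++; break;
	case 'G': shift = 30; end++; break;
	case '\0': break;
	default: return -1;	/* unknown suffix */
	}
	if (*end != '\0') {
		return -1;	/* trailing characters after the suffix */
	}
	if (base > (UINT64_MAX >> shift)) {
		return -1;	/* multiplication would overflow 64 bits */
	}

	*size = (uint64_t) base << shift;
	return 0;
}
```

The overflow check happens before the shift, so the result is never computed from a value that cannot fit.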
David Goulet [Thu, 3 Apr 2014 17:14:00 +0000 (13:14 -0400)]
Fix: don't delete stream from connection recv list
We don't need to delete them from the list during a connection destroy
because it's only a reference to the stream that might be valid or not
during the connection destroy. There is no need at all to access the
stream's pointer at that point.
David Goulet [Wed, 2 Apr 2014 14:31:34 +0000 (10:31 -0400)]
Fix: use after free of a relayd stream
A race could occur between a stream destruction and the destruction of a
control connection, which empties its recv_list. A freed stream could still
be in the list, leading to a use after free during the connection destroy.
That was triggering undefined behavior, from infinite looping to
segmentation faults.
We've observed this issue in high-load stress tests. A relayd received
all the streams but NOT the streams-sent command, which empties the list.
This can happen if a start tracing never occurred or failed on the
application side, so the close stream command is sent to the relayd,
freeing the stream before it is removed from that list.
Signed-off-by: David Goulet <dgoulet@efficios.com>
David Goulet [Tue, 1 Apr 2014 15:36:13 +0000 (11:36 -0400)]
Fix: don't print stream name in error message
The stream received, in per UID, is actually a temporary stream object
that only contains the UST object data which is the relevant part for
UST to use.
Thus, on error, the name was random data. Print the valid handle
descriptor instead of the invalid data.
Signed-off-by: David Goulet <dgoulet@efficios.com>
David Goulet [Fri, 28 Mar 2014 13:58:03 +0000 (09:58 -0400)]
Fix: take session list lock when listing tp
This is important since the list tracepoints command accesses the
application socket to ask the application for its TPs. The session list
lock protects the ordering of messages for those sockets.
This was triggering out-of-order messages between the session daemon and
an application, thus triggering undefined behavior.
Fixes #774
Signed-off-by: David Goulet <dgoulet@efficios.com>
David Goulet [Wed, 19 Mar 2014 18:34:27 +0000 (14:34 -0400)]
Fix: add consumer wake up pipe to avoid race
A UST application notifies the wait_fd pipe for every subbuffer that it
writes and that is ready to be consumed. However, on *high* load systems,
this 1:1 property can fail if the pipe gets filled up. For performance
reasons, UST will ignore this error and continue since it can't wait for
the pipe to clear up.
This triggers a race condition where we have *one* wake up on the UST
pipe for potentially multiple subbuffers. A data pending command will
wait forever on streams that still have data which the data thread couldn't
consume because of this possible 1:n race. Using the stop command
without waiting would mean a memory/fd leak of the stream.
Thus, we add a consumer wake up pipe here that notifies the data thread
if there is still data to be read after a successful read subbuffer
call. With this, we end up handling the residual buffers if any since
the data thread is always notified when there is still data to be read.
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
David Goulet [Mon, 24 Mar 2014 18:23:00 +0000 (14:23 -0400)]
Fix: allow empty URL for live session creation
This is actually very important so -C/-D can be used with lttng create
--live command and also the load command can set the control and data
URL independently.
This also adds a small test to make sure -C/-D works in live mode.
Fixes #769
Signed-off-by: David Goulet <dgoulet@efficios.com>
Lars Persson [Wed, 12 Mar 2014 09:22:40 +0000 (10:22 +0100)]
Use autoconf AM_MAINTAINER_MODE.
Give distribution maintainers the option to skip rebuilding autoconf and
automake generated files. The default behaviour is still to have the
rebuild rules enabled.
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Fix: Unchecked session pointer when destroying a connection in relayd
An unknown command currently crashes the relay daemon since
destroy_connection calls destroy_session without checking whether or not
a session is associated with the connection.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
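The missing check in the relayd connection teardown above amounts to a NULL guard. The sketch below uses hypothetical sketch_* types and functions rather than the relayd structures; it only shows that a connection which never got a session (for example after an unknown command) must skip the session teardown.

```c
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the relayd objects. */
struct sketch_session {
	int destroyed;
};

struct sketch_connection {
	struct sketch_session *session;	/* NULL until a session is attached */
};

static void sketch_destroy_session(struct sketch_session *session)
{
	session->destroyed = 1;
}

/* Only tear down the session if the connection actually has one. */
static void sketch_destroy_connection(struct sketch_connection *conn)
{
	if (!conn) {
		return;
	}
	if (conn->session) {
		sketch_destroy_session(conn->session);
		conn->session = NULL;
	}
}
```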
Fix: relayd should listen for viewers on localhost only by default
Having relayd listening by default on 0.0.0.0 (all interfaces) with a
protocol without authentication is an information leak waiting to
happen.
Users should explicitly specify if they want to listen on all
interfaces, using e.g. -L tcp://0.0.0.0:5344 (see lttng-relayd(8)
manpage for details). They should only do so if they use a firewall, or
are within a secured network.
Fixes #746
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
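At the socket level, the localhost-only default above corresponds to binding to the loopback address instead of the wildcard address. A minimal IPv4 sketch, using a hypothetical helper name and the port 5344 from the example URL in this message:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: fill the address a viewer listener would bind to.
 * The default is loopback (127.0.0.1); all interfaces (0.0.0.0) are used
 * only on explicit request.
 */
static void fill_viewer_addr(struct sockaddr_in *addr, int all_interfaces)
{
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_port = htons(5344);
	addr->sin_addr.s_addr = htonl(all_interfaces ?
			INADDR_ANY : INADDR_LOOPBACK);
}
```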
David Goulet [Thu, 27 Feb 2014 15:03:14 +0000 (10:03 -0500)]
Fix: JUL to enable user and root tracepoints
This is needed so the LTTng JUL agent can connect to both the user
and root session daemons. We have to enable a different tracepoint for the
two cases in order to avoid duplicating the trace payload in both the
user and root trace output.
Signed-off-by: David Goulet <dgoulet@efficios.com>
Conflicts:
src/bin/lttng-sessiond/lttng-sessiond.h
src/bin/lttng-sessiond/main.c