Philippe Proulx [Fri, 8 Sep 2017 02:52:48 +0000 (22:52 -0400)]
lttng-enable-event(1): filtering: specify that `$ctx.cpu_id` is available
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Olivier Blin [Fri, 27 Oct 2017 09:46:19 +0000 (11:46 +0200)]
Fix: Make version.h generation work with dash
version.h generation failed when using dash as shell:
Generating version.h... /bin/sh: 24: Syntax error: Missing '))'
dash does not handle the following construct:
git_describe="$((cd /path/to/lttng-tools/.; git describe) 2>/dev/null)"
Use backquotes instead.
The fix has been tested with dash and bash.
Signed-off-by: Olivier Blin <olivier.blin@softathome.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 12 Nov 2017 16:41:47 +0000 (11:41 -0500)]
Fix: buffer overflow warning in python bindings
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 8 Nov 2017 19:02:07 +0000 (14:02 -0500)]
Tests fix: BT2 does not output the metadata of a trace collection
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 3 Aug 2017 19:16:40 +0000 (15:16 -0400)]
Update version to v2.9.6
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 2 Aug 2017 15:34:43 +0000 (11:34 -0400)]
Fix: uninitialized return value on error path
Found by Coverity:
*** CID 1378810: Uninitialized variables (UNINIT)
/src/bin/lttng-sessiond/context.c: 73 in add_kctx_all_channels()
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 2 Aug 2017 20:49:44 +0000 (16:49 -0400)]
Fix: ensure kernel context is in a list before trying to delete it
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 31 Jul 2017 21:51:35 +0000 (17:51 -0400)]
Fix: ambiguous ownership of kernel context by multiple channels
A kernel context, when added to multiple channels, must be copied
before being added to individual channels. The current code
adds the same ltt_kernel_context structure to multiple kernel
channels which introduces a conceptual ambiguity in the ownership
of the context object.
Concretely, creating multiple kernel channels and adding a context
to all of them (by not specifying a channel name) causes the context
to be added to each channel's list of contexts, overwriting the
context's list node, and causing the channel context lists to become
corrupted. This results in crashes being observed during the
destruction of the session.
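As an illustration only, the copy-per-channel idea can be sketched as
follows. The types and function names (kernel_context, kernel_channel,
context_copy, add_context_all_channels) and the tiny list helpers are
hypothetical stand-ins, not the session daemon's actual structures.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal intrusive doubly-linked list, standing in for the real list type. */
struct list_node {
	struct list_node *prev, *next;
};

void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

void list_add(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Hypothetical, simplified context and channel types. */
struct kernel_context {
	int type;
	struct list_node node;	/* Can be linked into exactly one list. */
};

struct kernel_channel {
	struct list_node ctx_list;
};

/* Return a private copy of ctx, or NULL on allocation failure. */
struct kernel_context *context_copy(const struct kernel_context *ctx)
{
	struct kernel_context *copy = malloc(sizeof(*copy));

	if (!copy) {
		return NULL;
	}
	memcpy(copy, ctx, sizeof(*copy));
	list_init(&copy->node);
	return copy;
}

/*
 * Add a distinct copy of ctx to every channel so that each channel owns
 * its own list node. Returns 0 on success, -1 on allocation failure.
 */
int add_context_all_channels(struct kernel_channel *channels, size_t count,
		const struct kernel_context *ctx)
{
	size_t i;

	for (i = 0; i < count; i++) {
		struct kernel_context *copy = context_copy(ctx);

		if (!copy) {
			return -1;
		}
		list_add(&copy->node, &channels[i].ctx_list);
	}
	return 0;
}
```

Since every channel holds its own node, tearing down one channel's list
cannot corrupt another's.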
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 26 Jul 2017 14:52:15 +0000 (10:52 -0400)]
Fix: ret is never used on error_open code path
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 26 Jul 2017 14:29:17 +0000 (10:29 -0400)]
Fix: use error code path instead of break when errors happen before execl
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 25 Jul 2017 21:46:47 +0000 (17:46 -0400)]
Fix: wrong variable assignment on error
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 25 Jul 2017 21:20:45 +0000 (17:20 -0400)]
Fix: missing error handling in use of print_tabs()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 25 Jul 2017 15:31:02 +0000 (11:31 -0400)]
Fix: ret is used instead of err to set an error code
Use err instead of ret. ret is never used for error reporting under
error label.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 25 Jul 2017 14:45:32 +0000 (10:45 -0400)]
Fix: report error using fd instead of ret
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 28 Jul 2017 14:59:30 +0000 (10:59 -0400)]
Fix: NULL passed to memcpy in error path
CID 1378708: Null pointer dereferences (FORWARD_NULL)
Passing null pointer "data" to "memcpy", which dereferences it.
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Julien Desfossez [Tue, 25 Jul 2017 19:23:49 +0000 (15:23 -0400)]
Fix: lost packet accounting always lost on snapshot
Because of the continue when we fail to get a subbuf, the lost_packet
count is always reset to 0 before we can account it in the channel. Now
we account it directly before the continue.
Reported-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 21 Jul 2017 15:09:14 +0000 (11:09 -0400)]
Fix: report error on session listing
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 24 Jul 2017 20:07:00 +0000 (16:07 -0400)]
Fix live-comm: merge TCP socket write-write sequence in a single write
The live protocol implementation is often sending content
on TCP sockets in two separate writes. One to send a command header,
and the second one sending the command's payload. This was presumably
done under the assumption that it would not result in two separate
TCP packets being sent on the network (or that it would not matter).
Delayed ACK-induced delays were observed [1] on the second write of the
"write header, write payload" sequence and result in problematic
latency build-ups for live clients connected to moderately/highly
active sessions.
Fundamentally, this problem arises due to the combination of Nagle's
algorithm and the delayed ACK mechanism which make write-write-read
sequences on TCP sockets problematic as near-constant latency is
expected when clients can keep-up with the event production rate.
In such a write-write-read sequence, the second write is held up until
the first write is acknowledged (TCP ACK). The solution implemented
by this patch bundles the writes into a single one [2].
[1] https://github.com/tbricks/wireshark-lttng-plugin
Basic Wireshark dissector for lttng-live by Anton Smyk from Itiviti
[2] https://lists.freebsd.org/pipermail/freebsd-net/2006-January/009527.html
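A minimal sketch of the "single write" approach, assuming a hypothetical
cmd_header layout and a send_command() helper that are not the live
protocol's real definitions. A production version would also loop on
short sends.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical command header; the real live protocol header differs. */
struct cmd_header {
	uint32_t cmd;
	uint32_t payload_len;
};

/*
 * Copy the header and the payload into one buffer and hand them to the
 * kernel with a single send() call, instead of one call per part.
 * Returns the number of bytes sent, or -1 on error.
 */
ssize_t send_command(int sock, const struct cmd_header *hdr,
		const void *payload, size_t payload_len)
{
	const size_t total = sizeof(*hdr) + payload_len;
	char *buf = malloc(total);
	ssize_t ret;

	if (!buf) {
		return -1;
	}
	memcpy(buf, hdr, sizeof(*hdr));
	if (payload_len > 0) {
		memcpy(buf + sizeof(*hdr), payload, payload_len);
	}
	ret = send(sock, buf, total, 0);
	free(buf);
	return ret;
}
```

An alternative that avoids the copy would be a scatter-gather call such
as sendmsg(); the copy is used here only because it keeps the sketch
short.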
Reported-by: Anton Smyk <anton.smyk@itiviti.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 16 Jun 2017 21:23:13 +0000 (17:23 -0400)]
Fix: join consumer timer thread
Detaching the timer thread has the unfortunate side-effect of letting
the health management data structures be freed by main() while the timer
thread may still be using them (if, e.g., main() exits quickly).
Overcome this situation by tearing down and joining the timer thread.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 14 Jun 2017 18:25:46 +0000 (14:25 -0400)]
Update version to v2.9.5
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 13 Jun 2017 18:50:05 +0000 (14:50 -0400)]
Fix: test_utils_expand_path passes NULL to sprintf
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 26 May 2017 16:14:19 +0000 (18:14 +0200)]
Fix: lttng list of channels should return errors
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 26 May 2017 16:14:18 +0000 (18:14 +0200)]
Fix: discard event/lost packet counters
For per-pid buffers, we need to sum the counters for each application.
For per-uid buffers, if no application has launched yet, it should not
be considered as an error (which stops iteration on all other channels),
but rather as values of 0.
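The two cases can be sketched as below. The app_counters type and both
function names are hypothetical; a NULL registry is used here to model
"no application has launched yet".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-application (or per-UID registry) counters. */
struct app_counters {
	uint64_t discarded_events;
	uint64_t lost_packets;
};

/*
 * Per-PID buffers: each application has its own buffers, so the channel
 * totals are the sum over all applications.
 */
void sum_per_pid_counters(const struct app_counters *apps, size_t app_count,
		uint64_t *discarded, uint64_t *lost)
{
	size_t i;

	*discarded = 0;
	*lost = 0;
	for (i = 0; i < app_count; i++) {
		*discarded += apps[i].discarded_events;
		*lost += apps[i].lost_packets;
	}
}

/*
 * Per-UID buffers: a missing registry is not an error. Report zeros and
 * return 0 so that the caller keeps iterating over the other channels.
 */
int get_per_uid_counters(const struct app_counters *registry,
		uint64_t *discarded, uint64_t *lost)
{
	if (!registry) {
		*discarded = 0;
		*lost = 0;
		return 0;
	}
	*discarded = registry->discarded_events;
	*lost = registry->lost_packets;
	return 0;
}
```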
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 2 Jun 2017 18:49:20 +0000 (14:49 -0400)]
Fix: missing errno.h include in time.h compat header
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 6 Feb 2017 20:28:52 +0000 (15:28 -0500)]
Fix: registry can be null on lookup
A session teardown can be initiated by a dying application. Hence, a
session object can exist without a valid registry. As a result,
get_session_registry can return null. To prevent this, the UST
application session lock should be held, when possible, when looking up
the registry to ensure synchronization. Otherwise the presence of a
registry is not guaranteed. In such case, handling a null return value
from look-up registry function is necessary.
Core dumps, triggered by the "assert(registry)" statement found in
reply_ust_register_channel, were observed when killing instrumented
applications. In this case, obtaining the UST application lock
results in a deadlock since the lock is already held during
ust_app_global_create. Handling the null value is simpler and
corresponds with the handling of previous look-up done during the
function.
Handling of null value is also applied to:
add_event_ust_registry
add_enum_ust_registry
ust_app_snapshot_record
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 31 May 2017 21:08:23 +0000 (17:08 -0400)]
Test: Replace test relying on pselect6(2) man page ambiguity
The `pselect_fd_too_big` test is checking for the case where the `nfds`
is larger than the number of open files allowed for this process
(RLIMIT_NOFILE).
According to the ERRORS section of the pselect6(2) kernel man page[1], if
`nfds` > RLIMIT_NOFILE evaluates to true, the pselect6 syscall should
return EINVAL but the BUGS section mentions that the current
implementation ignores any FD larger than the highest numbered FD of the
current process.
This is in fact what happens. The Linux implementation of the pselect6
syscall[2] does not compare the `nfds` and RLIMIT_NOFILE, but rather caps
`nfds` to the highest numbered FD of the current process, as the BUGS
section of the man page mentions.
It was observed elsewhere that there is a discrepancy between the manual
page and the implementation[3].
As a solution, replace the current testcase with one that checks the
behaviour of the syscall when an invalid FD is passed.
[1]:http://man7.org/linux/man-pages/man2/pselect6.2.html
[2]:https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/select.c#n619
[3]:https://patchwork.kernel.org/patch/9345805/
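A rough sketch of what an "invalid FD" check can look like, written
against the libc pselect() wrapper rather than the raw pselect6 syscall
used by the test suite. The helper name pselect_on_closed_fd is made up
for this example; per pselect(2), the call is expected to fail with
EBADF.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/*
 * Call pselect() on a file descriptor that was just closed.
 * Returns the errno reported by pselect(), 0 if the call unexpectedly
 * succeeded, or -1 if the setup failed.
 */
int pselect_on_closed_fd(void)
{
	int pipe_fds[2];
	fd_set rfds;
	struct timespec timeout = { 0, 0 };
	int ret;

	if (pipe(pipe_fds) < 0) {
		return -1;
	}
	/* Close the read end so that its number becomes an invalid FD. */
	close(pipe_fds[0]);

	FD_ZERO(&rfds);
	FD_SET(pipe_fds[0], &rfds);
	ret = pselect(pipe_fds[0] + 1, &rfds, NULL, NULL, &timeout, NULL);
	/* Capture errno before any other call can overwrite it. */
	ret = (ret < 0) ? errno : 0;

	close(pipe_fds[1]);
	return ret;
}
```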
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 11 May 2017 21:53:58 +0000 (17:53 -0400)]
Fix: use "flush empty" ioctl for snapshots
When the flush empty ioctl is available, use it to produce an empty
packet at the end of the snapshot, which ensures the stream intersection
feature works.
If this specific ioctl is not available, fallback on the "flush" ioctl,
which does not produce empty packets.
In that situation, there were two prior behaviors possible for
lttng-modules: earlier versions implement a "snapshot" command which
does not perform an implicit "flush_empty". In that case, the stream
intersection feature may not be reliable. In more recent lttng-modules
versions (including the stable branch) which did not implement the
flush_empty ioctl, the snapshot ioctl implicitly performed a
flush_empty, which makes the stream intersection feature work, but has
side-effects on the snapshot ioctl performed by the live timer (produces
a stream of empty packets in live mode).
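The fallback logic can be sketched as follows. This assumes that an
unavailable "flush empty" operation simply reports a negative error
code; the flush_fn type, snapshot_flush() and the stubs are hypothetical
and do not issue any real ioctl.

```c
#include <errno.h>

/* Signature shared by the two hypothetical flush operations. */
typedef int (*flush_fn)(int stream_fd);

/*
 * Prefer "flush empty"; if it fails (for instance because the tracer
 * does not implement it), fall back on the plain flush.
 * Returns 0 on success or a negative errno value.
 */
int snapshot_flush(int stream_fd, flush_fn flush_empty, flush_fn flush)
{
	int ret = flush_empty(stream_fd);

	if (ret < 0) {
		/* No empty packet is produced on this path. */
		ret = flush(stream_fd);
	}
	return ret;
}

/* Stubs used to exercise the fallback; they do not touch any tracer. */
int stub_flush_unsupported(int stream_fd)
{
	(void) stream_fd;
	return -ENOTTY;
}

int stub_flush_ok(int stream_fd)
{
	(void) stream_fd;
	return 0;
}
```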
[ Please apply to master, 2.10, 2.9, 2.8 branches. ]
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 11 May 2017 20:00:56 +0000 (16:00 -0400)]
Fix: lttng-consumerd: cpu hotplug: send "streams_sent" command
When creating a new channel, the streams being sent to the relayd are
kept invisible to the live client until the "streams_sent" command is
received. This ensures the client does not see a partial stream set.
This "streams_sent" command needs to be sent on CPU hotplug too,
otherwise the live client handling within relayd is not aware of those
streams (they are never published).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 11 May 2017 20:00:55 +0000 (16:00 -0400)]
Fix: lttng-sessiond: cpu hotplug: send channel to consumer only once
On CPU hotplug, we currently send a duplicate of the channel key, which
allocates its own object (duplicated) within the consumerd. We want the
newly added stream to map to the pre-existing channel key, so don't send
the channel duplicate.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 11 May 2017 20:00:54 +0000 (16:00 -0400)]
Fix: lttng-sessiond: cpu hotplug stream number mismatch
The counter should be always increasing (kept in the channel), rather
than local to the function. This causes cpu hotplug handling to
disregard further streams that should be added to the consumer output
on CPU hotplug.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 18 May 2017 20:15:20 +0000 (16:15 -0400)]
Fix: consumer_timer_signal_thread_qs waits on LTTNG_CONSUMER_SIG_SWITCH
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 17 May 2017 22:36:54 +0000 (18:36 -0400)]
Fix: thread exit vs futex wait/wakeup race
relayd_live_stop performs, in this order:
CMM_STORE_SHARED(live_dispatch_thread_exit, 1); [A]
futex_nto1_wake(&viewer_conn_queue.futex); [B]
whereas thread_dispatcher does:
while (!CMM_LOAD_SHARED(live_dispatch_thread_exit)) { [1]
[...]
futex_nto1_prepare(&viewer_conn_queue.futex); [2]
[...]
futex_nto1_wait(&viewer_conn_queue.futex); [3]
Unfortunately, on the following sequence:
[1] [A] [B] [2] [3]
thread_dispatcher will end up hanging.
We need to move the live_dispatch_thread_exit load between "prepare" and
"wait" to fix this.
There are similar scenarios with relay_thread_dispatcher, and the
session daemon thread_dispatch_ust_registration, which are also fixed
here.
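The race can be reproduced with a deliberately simplified,
single-threaded model. It assumes that "prepare" stores -1 in the futex
word, that "wake" only acts when it sees -1, and that "wait" blocks
while the word is -1; all names below are hypothetical and no real futex
call is made.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model state: -1 means "the waiter announced that it intends to sleep". */
struct model {
	int32_t futex;
	bool exit_flag;
};

void model_prepare(struct model *m)
{
	m->futex = -1;
}

/* A wake-up only has an effect if the waiter has announced itself. */
void model_wake(struct model *m)
{
	if (m->futex == -1) {
		m->futex = 0;
	}
}

/* Returns true if the waiter would block. */
bool model_wait_would_block(const struct model *m)
{
	return m->futex == -1;
}

/* Buggy order [1] [A] [B] [2] [3]: the exit flag is only loaded at [1]. */
bool buggy_sequence_hangs(void)
{
	struct model m = { .futex = 0, .exit_flag = false };

	if (m.exit_flag) {			/* [1] flag not set yet */
		return false;
	}
	m.exit_flag = true;			/* [A] */
	model_wake(&m);				/* [B] no waiter yet: no effect */
	model_prepare(&m);			/* [2] */
	return model_wait_would_block(&m);	/* [3] nobody will wake us */
}

/* Fixed order: the exit flag is loaded between "prepare" and "wait". */
bool fixed_sequence_hangs(void)
{
	struct model m = { .futex = 0, .exit_flag = false };

	m.exit_flag = true;			/* [A] */
	model_wake(&m);				/* [B] */
	model_prepare(&m);			/* [2] */
	if (m.exit_flag) {			/* load moved after "prepare" */
		return false;			/* exit instead of waiting */
	}
	return model_wait_would_block(&m);
}
```

With the load placed after "prepare", either the dispatcher sees the
flag and exits, or the stop path sees -1 and issues an effective wake.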
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 15 May 2017 14:37:18 +0000 (10:37 -0400)]
Fix: stat_loc argument of waitpid() is used on error
waitpid() may leave stat_loc uninitialized on error (depending
on errno's value, see WAIT(3)).
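A small sketch of the safe pattern: the status is only inspected once
waitpid() is known to have succeeded. The helper name
run_child_and_get_exit_code is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with exit_code and return the exit status
 * collected by the parent, or -1 on error.
 */
int run_child_and_get_exit_code(int exit_code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0) {
		return -1;
	}
	if (pid == 0) {
		_exit(exit_code);
	}
	if (waitpid(pid, &status, 0) != pid) {
		/* status may be uninitialized here: do not inspect it. */
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```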
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 9 May 2017 19:46:35 +0000 (15:46 -0400)]
Fix: COMPAT_EPOLL_PROC_PATH is available from Linux 2.6.28
v2: Typo in commit message "per see" -> "per se"
Failing on opening [1] is not an error per se. [1] was
introduced in Linux 2.6.28 but epoll is available since
2.5.44. Hence, goto end and set a default value without
setting error return value.
[1] /proc/sys/fs/epoll/max_user_watches
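A sketch of the "missing file is not an error" behaviour. The
FALLBACK_MAX_WATCHES value and the read_max_user_watches() helper are
assumptions made for this example, not the compat layer's actual names
or defaults.

```c
#include <stdio.h>

#define FALLBACK_MAX_WATCHES 65535u	/* Hypothetical default value. */

/*
 * Read a positive integer limit from path. If the file cannot be opened
 * (e.g. on kernels that predate it) or cannot be parsed, fall back to
 * the default instead of reporting an error.
 */
unsigned int read_max_user_watches(const char *path)
{
	unsigned int value = 0;
	FILE *fp = fopen(path, "r");

	if (!fp) {
		return FALLBACK_MAX_WATCHES;
	}
	if (fscanf(fp, "%u", &value) != 1 || value == 0) {
		value = FALLBACK_MAX_WATCHES;
	}
	fclose(fp);
	return value;
}
```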
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 8 May 2017 12:38:37 +0000 (08:38 -0400)]
doc: how to trace consumerd with valgrind
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 8 May 2017 12:34:57 +0000 (08:34 -0400)]
Cleanup: initialize kernel ioctl ABI structures to 0
Valgrind complains that we pass uninitialized data to the kernel.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 8 May 2017 12:15:20 +0000 (08:15 -0400)]
Cleanup: initialize data to 0
Valgrind catches read of uninitialized data caused by the on-stack
"data" argument which ends up not being fully initialized (it contains a
union).
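For illustration, zeroing the whole object before filling it in covers
both padding and the unused union members. The msg type and
msg_init_value() below are hypothetical and only similar in shape to the
structures involved.

```c
#include <string.h>

/* Hypothetical message type containing a union. */
struct msg {
	int type;
	union {
		char name[32];
		long value;
	} u;
};

/* Zero the whole object first, then set only the fields in use. */
void msg_init_value(struct msg *m, long value)
{
	memset(m, 0, sizeof(*m));
	m->type = 1;
	m->u.value = value;
}
```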
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 24 Apr 2017 19:59:20 +0000 (15:59 -0400)]
Fix: assert() on null index_file in lttng_index_file_write()
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 24 Apr 2017 19:32:15 +0000 (15:32 -0400)]
Fix: fail on relayd lookup when finding a relayd is expected
An actual relayd lookup error leads to using the code path of a local
handling. Since stream->index_file is NULL when expecting a relayd, using
the code path for local handling results in an invalid access.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 21 Feb 2017 02:39:48 +0000 (21:39 -0500)]
Update version to v2.9.4
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 2 Feb 2017 22:09:43 +0000 (17:09 -0500)]
Port: Link with no-undefined on Windows
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 26 Jan 2017 20:09:22 +0000 (15:09 -0500)]
Port: win32 DLLs don't support hidden symbols
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 26 Jan 2017 20:09:21 +0000 (15:09 -0500)]
Port: add cygwin support to endian compat
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 26 Jan 2017 19:55:46 +0000 (14:55 -0500)]
Fix: Remove unused headers
This is a portability fix, these headers are unused and not available on
some platforms.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 26 Jan 2017 19:53:03 +0000 (14:53 -0500)]
Fix: tests: register thread for RCU operations.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 26 Jan 2017 19:36:45 +0000 (14:36 -0500)]
Fix: Lazily initialize max poll set size in poll compat
This was applied to the epoll implementation in commit
22dad56815ce0201c5ae7d5ef5d79cc0c6a42c5e
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 17 Jan 2017 15:08:47 +0000 (10:08 -0500)]
Fix: null dereference on error path for create_ctx_type
When zmalloc of type->opt fails, destroy_ctx_type would result in a
null dereference.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 17 Jan 2017 15:08:22 +0000 (10:08 -0500)]
Fix: test_ust_data dereference of null pointer
Skip test on NULL value to prevent null dereference.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 17 Jan 2017 15:02:08 +0000 (10:02 -0500)]
Fix: test_kernel_data dereference of null pointer
Skip tests when tested struct is null.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 13 Jan 2017 22:04:42 +0000 (17:04 -0500)]
Man: move [SESSION] before options
The previous synopsis for live mode could confuse users, since it led
to an error when trying one of the simplest create commands for a live
session that the synopsis proposes:
lttng create --live test
Other synopses are modified for symmetry.
Fixes #1081
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 11 Jan 2017 20:49:49 +0000 (15:49 -0500)]
Fix: consumerd: add missing put_subbuf for ust and kernel errors
While reading a sub-buffer, error handling needs to put the sub-buffer,
else all future attempts to use the stream will trigger warnings.
This affects recent features added to UST and kernel tracing.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 11 Jan 2017 20:49:48 +0000 (15:49 -0500)]
Fix: sessiond: only send streams to consumer once
Session daemon should not send streams to consumer daemon
repeatedly when CPU hotplug is performed while doing kernel
tracing.
This causes the consumer daemon to have multiple file descriptors
on the same stream, and thus try to perform operations like reading
a sub-buffer and checking for data pending concurrently. This triggers
safety-net warnings in the kernel tracer.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 19 Jan 2017 00:23:27 +0000 (19:23 -0500)]
Fix: consumerd main: needs to be a registered RCU thread
main->lttng_consumer_destroy->destroy_data_stream_ht requires a RCU
read-side lock.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 19 Jan 2017 00:23:26 +0000 (19:23 -0500)]
Fix: thread_dispatch_ust_registration needs to be a RCU thread
It uses a read-side lock.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 19 Jan 2017 00:23:09 +0000 (19:23 -0500)]
Fix: don't abort metadata push on closed metadata
The failure/exit of any of the consumerd, relayd or applications
(in per-PID buffer mode) will cause the metadata closed flag to
be set.
While pushing new metadata updates to the consumerd (and relayd
in streaming/live scenarios) will fail, those conditions should
be handled in-place.
Applications are _expected_ to exit during the course of a per-PID
session. However, they will typically have pushed their metadata
to the metadata cache before doing so. The session daemon must
flush the unconsumed metadata to the consumerd in this case.
Failure to answer to the metadata request originating from the
consumerd can cause it to keep the stream lock held and, thus,
prevent the channel poll thread from cleaning up on channel
close.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Nathan Lynch [Mon, 9 Jan 2017 22:14:28 +0000 (16:14 -0600)]
lttng-tools: remove bogus interpreter line from utils shell library
tests/utils/utils.sh is always sourced, never executed, and
/src/bin/bash is not a typical path for a shell interpreter. Just
delete it.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 9 Jan 2017 19:14:41 +0000 (14:14 -0500)]
Update version to v2.9.3
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 9 Jan 2017 16:23:16 +0000 (11:23 -0500)]
Fix: consumerd: order of metadata cache vs stream lock
The locking order comment in consumer.h is incorrect. First, its
description of locking order is not in sync with the comment found in
consumer-metadata-cache.h. The comment in struct consumer_metadata_cache
only states that the metadata cache lock nests inside the consumer_data
lock, and does not mention the stream lock, which implies that the
metadata cache lock does NOT nest inside the stream lock. But let's
investigate further to confirm:
* lttng_consumer_read_subbuffer() acquires the stream lock, and then
calls lttng_ustconsumer_read_subbuffer() with stream lock held,
and then invokes commit_one_metadata_packet(), which acquires the
metadata cache lock.
* lttng_ustconsumer_sync_metadata() acquires the metadata stream lock,
and calls commit_one_metadata_packet(), which takes the metadata cache
lock.
Therefore, update the comment in consumer.h to state that the metadata
cache lock nests INSIDE the stream lock, and update
consumer_del_metadata_stream() accordingly.
This should take care of fixing the locking order reversal found by
Coverity.
CID 1368314 (#1 of 1): Thread deadlock (ORDER_REVERSAL)
CID 1368319: Program hangs (ORDER_REVERSAL)
Fixes: 5feafd4130 "Fix: protect the channel's metadata stream using the metadata cache lock"
Fixes: 1ea6cc572b "Fix: lock nesting order reversed"
Fixes: fb549e7ac2 "Fix: reverse channel and metadata cache lock nesting order"
Reported-by: Coverity Scan
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 21 Dec 2016 22:59:38 +0000 (17:59 -0500)]
Fix: add missing rcu_barrier before daemon teardown
When performing the "cleanup" of sessiond, consumerd, and relayd, we
destroy data structures that may still be concurrently accessed by
call_rcu worker thread.
Ensure no more work is present in the call_rcu worker thread by issuing
an rcu_barrier. Note that this expects that call_rcu handlers do not
chain work to other call_rcu handlers.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Mon, 5 Dec 2016 20:39:26 +0000 (15:39 -0500)]
Fix: Add missing pthread.h include
Some libcs, such as musl and uClibc, require an explicit include of pthread.h.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Tue, 20 Dec 2016 21:31:26 +0000 (16:31 -0500)]
Fix: support for older versions of Babeltrace in test script
A new context field was introduced in version LTTng 2.8 that is printed
by Babeltrace prior to v1.2.5. This regex thus fails to match the
output. Since the context fields are not used by the script, we create a
non-capturing group for these fields that matches on both old and new
Babeltrace.
This is causing problems on Ubuntu 14.04 Trusty when building
lttng-tools from source and using the Babeltrace package from the
official repository (v1.2.1) to run the test suite.
Also, this patch removes commented-out and unused code in the function but
keeps the names of non-capturing groups for readability.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
CC: Philippe Proulx <pproulx@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 8 Jan 2017 19:29:09 +0000 (14:29 -0500)]
Fix: reverse channel and metadata cache lock nesting order
CID 1368319: Program hangs (ORDER_REVERSAL)
The lttng_consumer_channel lock must be nested outside of the
metadata cache lock, as indicated in the structure's comments.
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 7 Jan 2017 21:24:59 +0000 (16:24 -0500)]
Update version to v2.9.2
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 7 Jan 2017 18:42:12 +0000 (13:42 -0500)]
Fix: only lock the metadata_cache in userspace consumers
The kernel consumer, which re-uses the consumer_del_metadata_stream
function, has no metadata cache. Therefore, it can't be used to
protect the metadata stream (see 5feafd41).
However, only the userspace consumers invoke
consumer_metadata_cache_write() which the previous fix sought to
protect against. It is therefore safe to omit this lock in the
kernel consumer case.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 7 Jan 2017 17:32:13 +0000 (12:32 -0500)]
Fix: lock nesting order reversed
The lttng_consumer_stream lock must nest INSIDE the metadata
cache lock, as indicated in the structure's comments
(see consumer.h:340).
CID 1368314 (#1 of 1): Thread deadlock (ORDER_REVERSAL)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jan 2017 19:58:28 +0000 (14:58 -0500)]
Update version to v2.9.1
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 20 Dec 2016 23:25:17 +0000 (18:25 -0500)]
Fix: lttng-relayd: forcefully close stream on relayd shutdown
Add an "aborted" field to relay_session struct to indicate that on
shutdown pending data for a stream is not relevant and should not be
waited for.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 21 Dec 2016 22:56:24 +0000 (17:56 -0500)]
Fix: protect the channel's metadata stream using the metadata cache lock
The consumer_thread_data_poll and consumer_thread_metadata_poll
both access the channel's metadata stream.
During a session destruction, consumer_thread_metadata_poll will
destroy all metadata streams. However, the consumer_thread_data_poll
may still invoke a consumer_metadata_cache_write() triggered
by a "ready" subbuffer. Hence, the metadata stream must be protected
from this action by the metadata cache lock.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 20 Dec 2016 20:00:04 +0000 (15:00 -0500)]
Fix: double unlock of metadata mutex on error
lttng_ustconsumer_sync_metadata must leave the metadata lock
in its initial state. Otherwise an error may result in a
double unlock in the caller.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 15 Dec 2016 11:13:19 +0000 (12:13 +0100)]
Fix: add element length check in lttng_index_file_open
Handle cases where the index file header would contain a corrupted
value.
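A minimal sketch of such a check, assuming a hypothetical header layout
with a big-endian element length and caller-supplied bounds. The actual
check in lttng_index_file_open may use different fields and limits.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical on-disk header; fields are assumed to be big-endian. */
struct index_file_hdr {
	uint32_t magic;
	uint32_t index_major;
	uint32_t index_minor;
	uint32_t element_len;
};

/*
 * Validate the element length read from a header. Returns 0 and stores
 * the host-order length on success, or -1 if the value lies outside
 * [min_len, max_len], which is treated as a corrupted header.
 */
int index_element_len_check(const struct index_file_hdr *hdr,
		uint32_t min_len, uint32_t max_len, uint32_t *element_len)
{
	uint32_t len = ntohl(hdr->element_len);

	if (len < min_len || len > max_len) {
		return -1;
	}
	*element_len = len;
	return 0;
}
```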
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 12 Dec 2016 21:39:17 +0000 (16:39 -0500)]
Fix: free previous instance of url (alloc_url) on default live url assignation
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 15 Dec 2016 10:04:57 +0000 (11:04 +0100)]
Fix: relayd vs consumerd compatibility
relayd and consumerd 2.7 and 2.8 are expected to negotiate compatibility
with the lowest common minor version.
If a consumer daemon 2.8 interacts with a relayd 2.7, it needs to send
the index fields for ctf index 1.0. Same if a relayd 2.8 interacts with
a consumer daemon 2.7: relayd should expect ctf index 1.0 fields, and
generate a ctf index 1.0 index file layout.
If both relayd and consumerd versions are 2.8+, then we can send the ctf
index 1.1 fields over the protocol, and store them in the index files.
Whenever the relayd live viewer server opens and reads an index file,
it needs to use the file's header to figure out the index "element"
size.
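The negotiation rule described above can be sketched as below; the
index_version type and both function names are hypothetical, and only
the 2.x minor numbers are modelled.

```c
#include <stdint.h>

/* CTF index format version. */
struct index_version {
	uint32_t index_major;
	uint32_t index_minor;
};

/* Both peers settle on the lowest common minor version. */
uint32_t negotiate_minor(uint32_t relayd_minor, uint32_t consumerd_minor)
{
	return relayd_minor < consumerd_minor ? relayd_minor : consumerd_minor;
}

/* CTF index 1.1 is only used when the negotiated minor is 8 or above. */
struct index_version index_version_for_minor(uint32_t negotiated_minor)
{
	struct index_version v = { 1, negotiated_minor >= 8 ? 1u : 0u };

	return v;
}
```

For example, a 2.8 consumer daemon talking to a 2.7 relayd negotiates
minor 7 and therefore uses the CTF index 1.0 fields.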
[ Should be applied to master, stable-2.9, stable-2.8. ]
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 30 Nov 2016 17:29:06 +0000 (12:29 -0500)]
lttng-add-context(1): add missing man: prefix
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 29 Nov 2016 22:42:32 +0000 (17:42 -0500)]
Update version to v2.9.0
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 29 Nov 2016 22:40:52 +0000 (17:40 -0500)]
Add 2.9.0 release beer description
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 23:54:41 +0000 (18:54 -0500)]
lttng-add-context(1): fix style
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 23:27:53 +0000 (18:27 -0500)]
lttng-snapshot(1): fix style
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 23:27:43 +0000 (18:27 -0500)]
lttng-metadata(1): fix style
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 23:25:28 +0000 (18:25 -0500)]
doc/man: put short option's argument too
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 23:14:55 +0000 (18:14 -0500)]
Remove `metadata` command from various help resources
This command is now deprecated. Its own man page remains available,
warns the user that it's deprecated, and suggests looking at
lttng-regenerate(1) instead.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 23:06:34 +0000 (18:06 -0500)]
List the `regenerate` command in various help resources
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Mon, 28 Nov 2016 22:57:06 +0000 (17:57 -0500)]
lttng-load(1): fix synopsis and style
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Liguang Li [Mon, 28 Nov 2016 08:37:47 +0000 (16:37 +0800)]
Fix: truncate the metadata file in shm-path
In shm-path mode, the metadata is backed up to a metadata file.
When the lttng command "lttng metadata regenerate" is run to
resample the wall time following a major NTP correction, the metadata
file is not truncated and regenerated.
Add the function clear_metadata_file() to truncate and regenerate the
metadata file.
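As a rough illustration of the truncation step only, assuming the metadata file is already open as a POSIX file descriptor. The helper name truncate_and_rewind() is hypothetical and is not the clear_metadata_file() implementation:

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <unistd.h>

/*
 * Drop the current contents of an open file and move the file offset
 * back to the start, so that regenerated data is written from offset 0.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int truncate_and_rewind(int fd)
{
	if (ftruncate(fd, 0) < 0) {
		return -1;
	}
	if (lseek(fd, 0, SEEK_SET) < 0) {
		return -1;
	}
	return 0;
}
```

Rewinding matters because ftruncate(2) does not change the file offset: without the lseek(2) call, the next write would land at the old offset and leave a hole at the start of the file.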
Signed-off-by: Liguang Li <liguang.li@windriver.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 24 Nov 2016 22:14:22 +0000 (17:14 -0500)]
Load: add message indication that a name override was carried out
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 24 Nov 2016 21:44:17 +0000 (16:44 -0500)]
Load: expose overrides elements in mi
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 24 Nov 2016 19:33:32 +0000 (14:33 -0500)]
Fix: assign values to path, ctrl and data uris during configuration load
Since overrides can be partial (name only, etc.) always assign a base
value from the configuration being loaded then apply overrides.
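A minimal sketch of the "base value first, then overrides" pattern. The struct and function names are hypothetical and do not reflect the actual configuration loader:

```c
#include <stddef.h>

/* Hypothetical view of the values involved; NULL means "not set". */
struct session_uris {
	const char *path;
	const char *ctrl_url;
	const char *data_url;
};

/*
 * Start from the values found in the loaded configuration, then replace
 * only the fields that the (possibly partial) overrides provide.
 */
struct session_uris resolve_uris(const struct session_uris *loaded,
		const struct session_uris *overrides)
{
	struct session_uris out = *loaded;

	if (overrides) {
		if (overrides->path) {
			out.path = overrides->path;
		}
		if (overrides->ctrl_url) {
			out.ctrl_url = overrides->ctrl_url;
		}
		if (overrides->data_url) {
			out.data_url = overrides->data_url;
		}
	}
	return out;
}
```

With this ordering, an override that sets nothing but a name leaves every field at its loaded value instead of leaving it unassigned.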
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 24 Nov 2016 19:27:28 +0000 (14:27 -0500)]
Load: test that name override does not have side effects
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 24 Nov 2016 16:07:42 +0000 (11:07 -0500)]
Docs: remove invalid short option -U and move option descriptions
Reported-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Mon, 21 Nov 2016 17:36:00 +0000 (12:36 -0500)]
Fix: add missing refcount of loaded modules
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 18 Nov 2016 21:35:34 +0000 (16:35 -0500)]
Fix: only unload successfully loaded kernel modules
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Thu, 10 Nov 2016 20:26:35 +0000 (15:26 -0500)]
Fix: test cases now rely on explicit workloads
Run a process explicitly in the tracing session to generate the enabled events
rather than relying on the events generated by the lttng CLI.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 2 Nov 2016 07:25:25 +0000 (03:25 -0400)]
m4/pprint.m4: update with correct quoting
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Fri, 28 Oct 2016 23:01:19 +0000 (19:01 -0400)]
configure.ac: move warning to end of output for the end user
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Fri, 28 Oct 2016 22:33:19 +0000 (18:33 -0400)]
doc/man: only require asciidoc-attrs.conf when building the man pages
Situations:
* If you want to and can build the man pages:
  * If it's a tarball tree:
    * Make the man page destinations depend on asciidoc-attrs.conf.
      Since it's a generated file, its date is greater than the
      date of the prebuilt man pages, therefore the man pages are
      built again, which is a good thing because they include the
      default values of this build.
  * If it's a Git tree:
    * Always build the man pages anyway (no prebuilt man pages here).
* If you want to, but cannot build the man pages:
  * If it's a tarball tree:
    * Make the man page destinations NOT depend on asciidoc-attrs.conf,
      because its recent date would ask said destinations to be rebuilt
      and this is not possible because we don't have the tools.
      However, warn the user at configure time that the prebuilt man
      pages will be installed, which means that they will contain the
      project's default values, not this build's default values.
  * If it's a Git tree:
    * Not valid: error at configure time as usual.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 10 Nov 2016 19:47:14 +0000 (14:47 -0500)]
Test fix: increase test count in plan of test_perf_raw
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Julien Desfossez [Mon, 12 Sep 2016 20:57:10 +0000 (16:57 -0400)]
Create a dedicated test suite for Perf
Introduce the perf_regression test suite that must be run manually to
check if support for the Perf-related features is available on the
current machine. This test cannot be run automatically since there are
some platforms where it can fail (VMs, some SoCs, etc).
For now, the test only makes sure that we can trace events with perf
contexts enabled by raw ID in kernel and user-space. The test only works
if libpfm is installed on the system and fails if it is not installed.
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Nathan Lynch [Tue, 1 Nov 2016 17:25:47 +0000 (11:25 -0600)]
Tests: accommodate stricter mktemp implementations in tests
Busybox's mktemp command uses mkstemp(3) which requires the last six
characters of the template to be X's. Extend the mktemp templates
used in the test scripts.
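The underlying mkstemp(3) constraint can be illustrated directly in C. This is a sketch with hypothetical helper names, not code from the test scripts:

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <string.h>

/* Return 1 if the template ends with six 'X' characters, 0 otherwise. */
int template_has_six_x(const char *tmpl)
{
	size_t len = strlen(tmpl);

	return len >= 6 && strcmp(tmpl + len - 6, "XXXXXX") == 0;
}

/*
 * Create and open a unique temporary file. The template must be a
 * writable buffer, because mkstemp() replaces the trailing X's in place.
 * Returns the file descriptor, or -1 if the template is too short on
 * X's or mkstemp() itself fails.
 */
int open_temp_file(char *tmpl)
{
	if (!template_has_six_x(tmpl)) {
		return -1;
	}
	return mkstemp(tmpl);
}
```

A template such as "/tmp/example_XXX" is rejected here, while "/tmp/example_XXXXXX" is accepted, which is why extending the templates in the scripts satisfies the stricter implementation.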
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Anders Wallin [Thu, 20 Oct 2016 05:58:55 +0000 (07:58 +0200)]
Add version info to lttng-relayd help
The lttng-relayd man page states that the -V, --version option
is available, but it is missing in the code.
Signed-off-by: Anders Wallin <wallinux@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 20 Oct 2016 21:05:14 +0000 (17:05 -0400)]
Fix: stop sessiond threads on health thread error
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 20 Oct 2016 19:45:42 +0000 (15:45 -0400)]
Fix: stop lttng-relayd threads on health thread error
The lttng-relayd health thread may fail to initialize for
a variety of reasons (notably, an overly long Unix domain socket
address), which will cause it to never notify that it is
ready.
In such circumstances, the lttng-relayd command, in background or
daemonize mode, will never return as the daemon's "readiness"
will never be signaled.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>