Fix: stop lttng-relayd threads on health thread error
The lttng-relayd health thread may fail to initialize for
a variety of reasons (notably, an overly long Unix domain socket
address), which will cause it to never notify that it is
ready.
In such circumstances, the lttng-relayd command, in background or
daemonize mode, will never return as the daemon's "readiness"
will never be signaled.
Issuing fprintf() to stderr (thus write() to the standard error file
descriptor) within the SIGPIPE signal handler is bad: it can trigger
SIGPIPE repeatedly if the listening end has closed its end of the pipe.
Set the SIGPIPE action to SIG_IGN in relayd, sessiond, and consumerd.
This was affecting sessiond and relayd. The consumerd did not print
anything to stderr.
Tests: tap.sh spams tests' output when no plan is set
Some tests are implemented in C (using tap.h) or in Python
and don't use tap.sh's facilities. However, it is sourced
by utils.sh and prints an error message during its clean-up
because a plan was never set.
Fix: validate number of subbuffers after tweaking properties
Each of the ust and kernel channel create functions tweaks some
properties after the number of subbuffers has been validated for
overwrite channels. Move the validation after those property
modifications.
The ht_cleanup thread is shut down before the queue of rcu
callbacks is emptied by the rcu_barrier(). Since callbacks added
by call_rcu can push hash tables through the ht_cleanup pipe, we run
into cases where the clean-up thread has been shut down and
hash tables pushed through the clean-up pipe are leaked.
For channels configured with large sub-buffer size, the relayd copies
the entire trace sub-buffer (trace packet) into a large buffer, and then
copies the large buffer to disk. This is inefficient from a
cache-locality point of view.
Use a 64k buffer on the stack instead, and move the data piece-wise.
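A sketch of moving data piece-wise through a 64k stack buffer, assuming plain file descriptors on both sides. The name copy_piecewise() and its error convention are assumptions for illustration, not the relayd implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_CHUNK_SIZE	(64 * 1024)

/*
 * Copy exactly `len` bytes from `in_fd` to `out_fd` through a fixed
 * 64 KiB stack buffer. Returns 0 on success, -1 on error or if the
 * input ends before `len` bytes were read.
 * Hypothetical helper, used only for this sketch.
 */
static int copy_piecewise(int in_fd, int out_fd, size_t len)
{
	char buf[COPY_CHUNK_SIZE];

	while (len > 0) {
		size_t want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t got = read(in_fd, buf, want);

		if (got < 0) {
			if (errno == EINTR) {
				continue;
			}
			return -1;
		}
		if (got == 0) {
			return -1;	/* Input ended early. */
		}

		/* write() may be partial: loop until the chunk is flushed. */
		for (size_t off = 0; off < (size_t) got; ) {
			ssize_t put = write(out_fd, buf + off, (size_t) got - off);

			if (put < 0) {
				if (errno == EINTR) {
					continue;
				}
				return -1;
			}
			off += (size_t) put;
		}
		len -= (size_t) got;
	}
	return 0;
}
```

The working set stays at 64 KiB regardless of the sub-buffer size, which is the cache-locality argument made above.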
Jonathan Rajotte [Sat, 28 May 2016 06:34:23 +0000 (02:34 -0400)]
Fix: set the logger level to prevent unexpected level inheritance
BSF and other jars can ship with an embedded log4j.properties
file. This causes problems when launching an application with a general
class path (e.g. /usr/share/java/*) since log4j will look for a
configuration file in all loaded jars. If any contains a directive for
the root logger, it will affect every logger with no level that is
directly under the root logger. This can result in unexpected
behaviour (e.g. no events triggered).
Link: https://issues.apache.org/jira/browse/BSF-24
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 22 Mar 2016 18:12:04 +0000 (14:12 -0400)]
Fix: do not return error on LTTNG_ERR_SNAPSHOT_NODATA
A warning is fine since the user has no control over
whether or not applications (or the kernel) have
produced any event between the start of the tracing
session and the recording of the snapshot.
MI-wise, the command is not a success since nothing was
recorded. The command line return code is CMD_SUCCESS.
refs #1002
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: Set loopback address in set_ip_addr if gethostbyname2 fails
Some systems may not have "localhost" defined in accordance with IETF
RFC 6761. According to this RFC, applications may recognize
"localhost" names as special and resolve to the appropriate loopback
address.
We choose to use the system name resolution API first to honor its
network configuration. If this fails, we resolve to the appropriate
loopback address. This is done to accommodate systems which may want to
start tracing before their network is configured.
Jonathan Rajotte [Fri, 23 Oct 2015 15:32:44 +0000 (11:32 -0400)]
Fix: load event state (enabled/disabled) correctly
This bug fix is a workaround due to limitations of lttng_disable_event_ext
regarding the disabling of events with similar names but different
characteristics. Although lttng_disable_event_ext provides support for
disabling by name and filter string, it does not support exclusions.
The loading of events is split into 3 phases:
1 - Create all events regardless of their state.
2 - Disable all events.
3 - Enable only the events with the 'enabled' state.
Fixes #959
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cleanup error.h __lttng_print() used for message printing
The loglevels have never really been a mask, and it is useless to try to
use them as masks, because the compiler statically knows the value of
the loglevel requested, and can therefore optimise away all the logic.
This takes care of a Coverity warning about mixed bitwise and boolean
logic, which was technically correct, but more complex than needed.
Fix: ust-consumer: flush empty packets on snapshot channel
Snapshot operation on a non-stopped stream should use a "final" flush to
ensure empty packets are flushed, so we gather timestamps at the moment
where the snapshot is taken. This is important for streams that have a
low amount of activity, which might be on an empty packet when the
snapshot is triggered.
PRINT_ERR maps to 0x1, PRINT_WARN maps to 0x2, which is fine so far to
use as masks, but PRINT_BUG maps to 0x3, which is the same as both
PRINT_ERR and PRINT_WARN, and does not make sense to use in masks with
__lttng_print:
(type & (PRINT_WARN | PRINT_ERR | PRINT_BUG))
Fix this by ensuring PRINT_BUG has its own mask, and express all
constants as shifts to eliminate the risk of re-introducing a similar
bug in the future.
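A sketch of expressing the constants as shifts so that each type owns a distinct bit. The bit positions and the PRINT_MSG entry are assumptions for illustration. Only the names PRINT_ERR, PRINT_WARN and PRINT_BUG come from the text above.

```c
/* Each message type owns one bit, so OR-ing types yields a valid mask. */
#define PRINT_ERR	(1U << 0)
#define PRINT_WARN	(1U << 1)
#define PRINT_BUG	(1U << 2)
#define PRINT_MSG	(1U << 3)	/* Assumed extra type, for contrast. */

/*
 * Return nonzero if `type` is an error, warning or bug message.
 * Hypothetical helper, used only for this sketch.
 */
static int is_error_class(unsigned int type)
{
	return (type & (PRINT_WARN | PRINT_ERR | PRINT_BUG)) != 0;
}
```

With the previous value of 0x3, PRINT_BUG was indistinguishable from PRINT_ERR | PRINT_WARN. With one bit per type, no two constants overlap.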
We should flush the last packet after stop, not before. Otherwise, we
may end up with events written immediately after the flush, which
defeats the purpose of flushing.
Fix: UST should not generate packet at destroy after stop
In the following scenario:
- create, enable events (ust),
- start
- ...
- stop (wait for data_pending to complete)
- destroy
- rm the trace directory
We would expect that the "rm" operation would not conflict with the
consumer daemon trying to output data into the trace files, since the
"stop" operation ensured that there was no data_pending.
However, the "destroy" operation currently generates an extra packet
after the data_pending check (the "on_stream_hangup"). This causes the
consumer daemon to try to perform trace file rotation concurrently with
the trace directory removal in the scenario above, which triggers
errors. The main reason why this empty packet is generated by "destroy"
is to deal with trace start/stop scenarios which would otherwise generate
a completely empty stream.
Therefore, introduce the concept of a "quiescent stream". It is
initialized to false on stream creation (first packet is empty). When
tracing is started, it is set to false (for cases of start/stop/start).
When tracing is stopped, if the stream is not quiescent, perform a
"final" flush (which will generate an empty packet if the current packet
was empty), and set quiescent to true. On "destroy" stream and on
application hangup: if the stream is not quiescent, perform a "final"
flush, and set the quiescent state to true.
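The state transitions above can be sketched as a small state machine. The structure and function names are hypothetical, and the flush is reduced to a counter. This is not the consumer daemon's code.

```c
#include <stdbool.h>

/* Hypothetical, simplified stream; not the lttng-tools structure. */
struct stream_sketch {
	bool quiescent;
	unsigned int final_flush_count;	/* Stands in for real flushes. */
};

/* Stream creation: not quiescent, since the first packet is empty. */
static void stream_init(struct stream_sketch *s)
{
	s->quiescent = false;
	s->final_flush_count = 0;
}

/* Tracing start: data may be produced again (start/stop/start). */
static void stream_start(struct stream_sketch *s)
{
	s->quiescent = false;
}

/* Perform the "final" flush at most once per active period. */
static void stream_quiesce(struct stream_sketch *s)
{
	if (!s->quiescent) {
		s->final_flush_count++;
		s->quiescent = true;
	}
}

/* Stop, destroy and application hangup share the same transition. */
static void stream_stop(struct stream_sketch *s)
{
	stream_quiesce(s);
}

static void stream_destroy(struct stream_sketch *s)
{
	stream_quiesce(s);
}
```

In the create, start, stop, destroy scenario, the flush happens at stop and destroy finds the stream already quiescent, so no extra packet is generated.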
The test case for '*', which enables all events, is flaky by its
nature since buffers may be filled by other kernel events preventing
the test script from finding the test event (it is often discarded).
Fix: bad file descriptors on close after rotation error
Ensure we don't try to close output stream file descriptors twice when a
trace file rotation error occurs (once at tracefile rotation, once when
closing the stream). Set the fd value to -1 after the first close to
ensure we don't try to close it again.
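A sketch of the "set the fd to -1 after the first close" idiom. The helper name close_fd_once() is an assumption for illustration.

```c
#include <unistd.h>

/*
 * Close `*fd` if it is still open, then mark it as closed so a later
 * call becomes a no-op. Returns the result of close(), or 0 if there
 * was nothing to close.
 * Hypothetical helper, used only for this sketch.
 */
static int close_fd_once(int *fd)
{
	int ret;

	if (*fd < 0) {
		return 0;
	}
	ret = close(*fd);
	*fd = -1;	/* Never retry, even if close() reported an error. */
	return ret;
}
```

Calling it from both the rotation error path and the stream teardown path is then safe, because the second call sees -1 and returns immediately.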
Fix: remove logically dead code in send_channel_uid_to_ust
Found by Coverity:
at_most: At condition ret < 0, the value of ret must be at most -1.
cannot_set: At condition ret < 0, the value of ret cannot be equal
to any of {-1030, -32}.
dead_error_condition: The condition ret < 0 must be true.
2825 } else if (ret < 0) {
2826 goto error_stream_unlock;
2827 }
CID 1323135 (#1 of 1): Logically dead code
(DEADCODE)dead_error_line: Execution cannot reach this statement: goto
error_stream_unlock;.
Fix: unchecked return value in low throughput test
Found by Coverity:
CID 1019967 (#1 of 1): Unchecked return value from library
(CHECKED_RETURN)2. check_return: Calling poll(NULL, 0UL, 60000) without
checking return value. This library function may fail and return an
error code.
We really don't care whether this poll succeeds or not.
CID 1019971 (#1 of 1): Unchecked return value from library
(CHECKED_RETURN)2. check_return: Calling posix_fadvise(outfd,
orig_offset - stream->max_sb_size, stream->max_sb_size, 4) without
checking return value. This library function may fail and return an
error code.
CID 1323137 (#1 of 1): Unchecked return value (CHECKED_RETURN)30.
check_return: Calling viewer_stream_get without checking return value
(as is done elsewhere 5 out of 6 times).
Fix: unchecked return value in trace_clock_read64_monotonic
Found by Coverity:
CID 1311498 (#1 of 1): Unchecked return value (CHECKED_RETURN)1.
check_return: Calling clock_gettime without checking return value (as is
done elsewhere 8 out of 9 times).
Jonathan Rajotte [Tue, 17 May 2016 15:52:47 +0000 (11:52 -0400)]
Fix: initialize the cur_event variable before using it
CID 1243041 (#1 of 1): Uninitialized scalar variable (UNINIT)
uninit_use_in_call: Using uninitialized element of array *cur_event.name when
calling strcmp.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: Dereference after null check in sessiond set_option
Found by Coverity:
2. var_compare_op: Comparing arg to null implies that arg might be null.
CID 1256137 (#1 of 9): Dereference after null check (FORWARD_NULL)14.
var_deref_model: Passing null pointer arg to strdup, which dereferences
it.
[... same for #2 through #9 ]
This should not really be an issue since
1) options that use the "arg" parameter will not be set by popt if one
is not provided,
2) the configuration file parser will never invoke set_option with
a NULL argument; if no "value" is provided in the file, an empty
string is passed.
The second point is the reason for the "arg && arg[0] == '\0'" check;
we already know that the argument is invalid since an empty string
is never a valid argument for the supported options.
Nonetheless, it makes sense for Coverity to flag this, and moving
the check to individual cases, although very verbose, is clear.
CID 1292557 (#1 of 1): Wrong sizeof argument
(SIZEOF_MISMATCH)suspicious_sizeof: Passing argument 8UL /* sizeof
(*_pid_list) */ to function zmalloc and then casting the return value to
int * is suspicious.
CID 1242317 (#1 of 2): Integer overflowed argument (INTEGER_OVERFLOW)25.
overflow_sink: Overflowed or truncated value (or a value computed from
an overflowed or truncated value) new_nbmem * 304UL used as critical
argument to function.
CID 1242317 (#2 of 2): Integer overflowed argument (INTEGER_OVERFLOW)27.
overflow_sink: Overflowed or truncated value (or a value computed from
an overflowed or truncated value) (new_nbmem - nbmem) * 304UL used as
critical argument to function.
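The two reports above concern an element count multiplied by a 304-byte element size before being used as an allocation size. One common guard is sketched below. The helper name is hypothetical and this is not necessarily how the flagged code was fixed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute count * elem_size into `*bytes`, refusing to wrap around.
 * Returns 0 on success, -1 if the product does not fit in a size_t.
 * Hypothetical helper, used only for this sketch.
 */
static int checked_mul_size(size_t count, size_t elem_size, size_t *bytes)
{
	if (elem_size != 0 && count > SIZE_MAX / elem_size) {
		return -1;
	}
	*bytes = count * elem_size;
	return 0;
}
```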
CID 1262117 (#1 of 1): Macro compares unsigned to 0
(NO_EFFECT)unsigned_compare: This greater-than-or-equal-to-zero
comparison of an unsigned value is always true. events->nb_fd >= 0U.
Fix: illegal memory access in test_create_kernel_event
Found by Coverity:
CID 1243030 (#1 of 1): Buffer not null terminated (BUFFER_SIZE_WARNING)1.
buffer_size_warning: Calling strncpy with a maximum size argument of 256
bytes on destination array ev.name of size 256 bytes might leave the
destination string unterminated.
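The warning arises because strncpy() does not write a terminating NUL when the source is at least as long as the size argument. A sketch of a copy that always terminates follows. The helper name and its truncation return code are assumptions for illustration.

```c
#include <string.h>

/*
 * Copy `src` into `dst` (capacity `dst_size` bytes), always leaving
 * `dst` NUL-terminated. Returns 0 if `src` fit entirely, -1 if it was
 * truncated or if `dst_size` is 0.
 * Hypothetical helper, used only for this sketch.
 */
static int copy_string_terminated(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0) {
		return -1;
	}
	/* Leave room for the terminator, then write it explicitly. */
	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';
	return strlen(src) < dst_size ? 0 : -1;
}
```

Reporting truncation lets the caller reject a name that is too long instead of silently using a shortened one.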
CID 1243037 (#1 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)18. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array
send_stream.path_name of size 4096 bytes might leave the destination
string unterminated.
CID 1243037 (#2 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)18. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array
send_stream.channel_name of size 255 bytes might leave the destination
string unterminated.
Fix: illegal memory access in viewer_list_sessions
Found by Coverity:
CID 1243025 (#1 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)17. buffer_size_warning: Calling strncpy with a
maximum size argument of 64 bytes on destination array
send_session->hostname of size 64 bytes might leave the destination
string unterminated.
CID 1243025 (#2 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)17. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array
send_session->session_name of size 255 bytes might leave the destination
string unterminated.
CID 1243017 (#1 of 4): Buffer not null terminated
(BUFFER_SIZE_WARNING)14. buffer_size_warning: Calling strncpy with a
maximum size argument of 264 bytes on destination array msg.channel_name
of size 264 bytes might leave the destination string unterminated.
CID 1243017 (#2 of 4): Buffer not null terminated
(BUFFER_SIZE_WARNING)14. buffer_size_warning: Calling strncpy with a
maximum size argument of 264 bytes on destination array
msg_2_2.channel_name of size 264 bytes might leave the destination
string unterminated.
CID 1243017 (#3 of 4): Buffer not null terminated
(BUFFER_SIZE_WARNING)15. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array msg.pathname of
size 4096 bytes might leave the destination string unterminated.
CID 1243017 (#4 of 4): Buffer not null terminated
(BUFFER_SIZE_WARNING)15. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array
msg_2_2.pathname of size 4096 bytes might leave the destination string
unterminated.
Fix: illegal memory access in relayd_create_session_2_4
Found by Coverity:
CID 1243024 (#1 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)2. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array msg.session_name
of size 255 bytes might leave the destination string unterminated.
CID 1243024 (#2 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)3. buffer_size_warning: Calling strncpy with a
maximum size argument of 64 bytes on destination array msg.hostname of
size 64 bytes might leave the destination string unterminated.
CID 1323138 (#1 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)3. buffer_size_warning: Calling strncpy with a
maximum size argument of 64 bytes on destination array session->hostname
of size 64 bytes might leave the destination string unterminated.
CID 1323138 (#2 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)3. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array
session->session_name of size 255 bytes might leave the destination
string unterminated.
Found by Coverity:
CID 1243015 (#1 of 1): Buffer not null terminated
(BUFFER_SIZE_WARNING)8. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array
consumer->subdir of size 4096 bytes might leave the destination string
unterminated.
Found by Coverity:
CID 1243021 (#1 of 1): Buffer not null terminated
(BUFFER_SIZE_WARNING)25. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array (syscall_table +
index).name of size 255 bytes might leave the destination string
unterminated.
Found by Coverity:
CID 1243023 (#1 of 1): Buffer not null terminated
(BUFFER_SIZE_WARNING)3. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array pidfile_path of
size 4096 bytes might leave the destination string unterminated.
Found by Coverity:
CID 1243018 (#1 of 1): Buffer not null terminated
(BUFFER_SIZE_WARNING)11. buffer_size_warning: Calling strncpy with a
maximum size argument of 256 bytes on destination array (channels +
i).name of size 256 bytes might leave the destination string
unterminated.
Found by Coverity:
CID 1243027 (#1 of 1): Buffer not null terminated
(BUFFER_SIZE_WARNING)20. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array tmp_output.name
of size 255 bytes might leave the destination string unterminated.
CID 1243028 (#1 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)5. buffer_size_warning: Calling strncpy with a
maximum size argument of 255 bytes on destination array output->name of
size 255 bytes might leave the destination string unterminated.
CID 1243028 (#2 of 2): Buffer not null terminated
(BUFFER_SIZE_WARNING)10. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array
output->consumer->dst.trace_path of size 4096 bytes might leave the
destination string unterminated.
Fix: illegal memory access in consumer_set_network_uri
Found by Coverity:
CID 1243029 (#1 of 1): Buffer not null terminated
(BUFFER_SIZE_WARNING)31. buffer_size_warning: Calling strncpy with a
maximum size argument of 4096 bytes on destination array obj->subdir of
size 4096 bytes might leave the destination string unterminated.