David Goulet [Fri, 24 Jan 2014 17:36:44 +0000 (12:36 -0500)]
Fix: add missing JUL loglevel handling
JUL loglevels are directly mapped to the Level class from the JUL
interface. A complete listing has been added to the enable-event help
command and to the lttng.h ABI as lttng_loglevel_jul.
Signed-off-by: David Goulet <dgoulet@efficios.com>
Fix: relayd: notify parent of readiness when all threads ready
- relayd starts using the daemonize common lib,
- wait for health check thread, listener and live listener threads to be
ready before letting the parent know it is ready (in daemon and
background modes).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Ensure that relayd health check is ready to receive queries by starting
relayd in background mode with '-b' rather than putting it in background
by the shell (&).
Hash tables shared between threads should not be initialized by a
specific thread, because the other threads could start using them before
they are initialized.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Fix: race with the viewer and readiness of streams
Add a message to inform the relayd that all the streams of a given
channel were sent so it can make them available to the viewer. This
fixes a race where the viewer could start reading some streams before
having received them all.
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Should pass nesting + 1 as parameter rather than nesting++. This worked
when nesting was in one direction (due to the side effect of the first
++), but not the other.
Fixes #688
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
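The difference between nesting++ and nesting + 1 can be sketched on a
small tree walk. This is only an illustration: the struct and both
functions are hypothetical and are not taken from the patched code.

```c
#include <stddef.h>

/* Hypothetical two-child node; depth is filled in by the walk. */
struct node {
	struct node *left, *right;
	int depth;
};

/* Buggy form: nesting++ passes the OLD value, then mutates the local. */
static void label_buggy(struct node *n, int nesting)
{
	if (!n)
		return;
	n->depth = nesting;
	label_buggy(n->left, nesting++);	/* left child gets nesting */
	label_buggy(n->right, nesting);		/* right gets nesting + 1 by side effect */
}

/* Fixed form: both children are one level deeper than their parent. */
static void label_fixed(struct node *n, int nesting)
{
	if (!n)
		return;
	n->depth = nesting;
	label_fixed(n->left, nesting + 1);
	label_fixed(n->right, nesting + 1);
}
```

With a root that has two leaf children, the buggy walk labels the left
child with the parent's depth and only the right child with depth + 1,
which matches the "worked in one direction" symptom described above.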
David Goulet [Wed, 18 Dec 2013 23:34:44 +0000 (18:34 -0500)]
Fix: remove break in epoll loop of apps. thread
In a *heavy* stress test with a large number of applications (> 7000 a
second), the manage application thread could starve the delete process
by breaking just after adding an application to the poll set.
Also, we've observed that somehow the application unregister process is
not done for most of the applications when breaking the loop at each
delete from the poll set. We are still uncertain why but one theory is
that epoll detects that an I/O operation is ready (here a shutdown) and
another subsystem of the session daemon uses that socket for I/O which
flags the poll event as "has been taken care of" thus the loop never
sees it because of that break.
The notify socket thread does not use a break between poll operations
which leads us to that conclusion with the manage apps thread.
We don't use epoll with edge-trigger thus a POLLERR/POLLHUP should
always be returned as long as it's not taken care of.
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
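The level-triggered behavior relied upon above can be checked with a
small Linux-only sketch. It uses a pipe instead of the session daemon's
application sockets, and the function name is hypothetical: the read end
is registered without EPOLLET, the write end is closed, and the hang-up
should then be reported on every epoll_wait() call until it is handled.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns how many of `rounds` consecutive epoll_wait() calls report
 * EPOLLHUP on the read end of a pipe whose write end was closed,
 * or -1 if the setup fails.
 */
static int count_hup_reports(int rounds)
{
	int pipefd[2], epfd, seen = 0;
	struct epoll_event ev = { .events = EPOLLIN };	/* no EPOLLET: level-triggered */
	struct epoll_event out;

	if (pipe(pipefd) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}
	ev.data.fd = pipefd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0) {
		close(epfd);
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	close(pipefd[1]);	/* the peer goes away */

	for (int i = 0; i < rounds; i++) {
		/* Timeout 0: poll without blocking; the event is never consumed. */
		int n = epoll_wait(epfd, &out, 1, 0);

		if (n == 1 && (out.events & EPOLLHUP))
			seen++;
	}

	close(epfd);
	close(pipefd[0]);
	return seen;
}
```

Since the hang-up is never handled inside the loop, each round is
expected to see it again, so count_hup_reports(3) should return 3.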
Clang 3.3 with -O2 optimisations is especially picky about arithmetic
on NULL pointers. This undefined behavior is turned into optimized out
NULL checks by clang 3.3. Fix the undefined behavior by checking against
the pointer directly, without going back and forth around NULL with
pointer arithmetic.
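A minimal sketch of the safe pattern, assuming a container_of-style
lookup (the structs and the function name are hypothetical; the commit
does not say which code was affected). The risky variant would subtract
the member offset from a possibly-NULL pointer and only then compare the
result against NULL, which lets the compiler assume the pointer was
never NULL and drop the check.

```c
#include <stddef.h>

/* Hypothetical embedded list node and its containing structure. */
struct list_node {
	struct list_node *next;
};

struct item {
	int id;
	struct list_node node;
};

/* Check the pointer itself, before doing any arithmetic on it. */
static struct item *item_from_node(struct list_node *node)
{
	if (!node)
		return NULL;
	return (struct item *)((char *)node - offsetof(struct item, node));
}
```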
David Goulet [Thu, 28 Nov 2013 18:25:25 +0000 (13:25 -0500)]
Fix: remove assert on fd in the read/write layer
It is possible that an invalid fd is passed to read or write. This can
happen for instance if the endpoint of the transport (ex: relayd) dies
while we are actively trying to send data. It is OK to pass an invalid
fd down, since the syscall will return the right value with errno
populated with the corresponding code.
Acked-by: Julien Desfossez <julien.desfossez@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
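As a rough illustration of that reasoning, a write wrapper without the
assert could look like the sketch below. The name write_no_assert and
the EINTR retry policy are assumptions made for the example, not the
actual read/write layer.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: no assert(fd >= 0), the kernel reports a bad fd. */
static ssize_t write_no_assert(int fd, const void *buf, size_t len)
{
	ssize_t ret;

	do {
		ret = write(fd, buf, len);
	} while (ret < 0 && errno == EINTR);	/* retry only when interrupted */

	return ret;	/* -1 with errno set (EBADF for an invalid fd) on failure */
}
```

Calling write_no_assert(-1, "x", 1) returns -1 with errno set to EBADF,
which the caller can handle like any other transport error.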
David Goulet [Thu, 28 Nov 2013 18:08:10 +0000 (13:08 -0500)]
Fix: don't fail on push metadata if no channel
The comments in the code explain it well but in a nutshell, this is an
acceptable race between the creation of the metadata on the consumer
side and the push metadata from the session daemon for that channel.
This race is resolved either by the consumer requesting metadata or by
the session being stopped; in both situations the metadata is pushed
to the consumer.
Without that fix, the session daemon flags the registry's metadata to be
"closed" which usually indicates that the consumer is not responding
leading to the consumer thread exiting in the session daemon.
Acked-by: Julien Desfossez <julien.desfossez@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
David Goulet [Mon, 25 Nov 2013 19:16:39 +0000 (14:16 -0500)]
Fix: implicit conversion of enum types in consumer
This actually removes the use of LTTNG_ERR* codes in the sessiond/consumer
protocol since they should NOT be used, and some comparisons of enums
(lttng_error_code vs lttcomm_return_code) were broken and dangerous.
Signed-off-by: David Goulet <dgoulet@efficios.com>
David Goulet [Mon, 25 Nov 2013 16:33:59 +0000 (11:33 -0500)]
Tests: remove useless sleep when spawning sessiond
The new daemonize scheme returns to the parent only when the session
daemon is ready to operate, thus the sleep that was used to wait for the
session daemon to bootstrap is no longer needed.
Signed-off-by: David Goulet <dgoulet@efficios.com>
Raphaël Beamonte [Sat, 23 Nov 2013 22:32:27 +0000 (17:32 -0500)]
Tests: add symlink tests for test_utils_expand_path
These new test cases exercise the utils_expand_path
function with an existing tree using symlinks. They verify
the right behavior of the function in complex
situations.
Raphaël Beamonte [Sat, 23 Nov 2013 22:32:26 +0000 (17:32 -0500)]
Fix: utils_expand_path now works for paths that end with '/.' or '/..'
Cases where the path given ended with '/.' or '/..' were not handled
properly using the utils_expand_path function and the resulting path
was still showing this end part. This fix aims to treat that last
part and 'expand' it as expected.
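The trailing-component handling can be sketched with a purely lexical
helper. This is not the utils_expand_path code: expand_trailing is a
hypothetical name, and the sketch assumes an absolute path whose middle
components are already resolved and ignores symlinks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: rewrites `path` in place. */
static void expand_trailing(char *path)
{
	size_t len;

	assert(path && path[0] == '/');	/* sketch handles absolute paths only */
	len = strlen(path);

	if (len >= 2 && strcmp(path + len - 2, "/.") == 0) {
		path[len - 2] = '\0';		/* "/a/b/."  -> "/a/b" */
	} else if (len >= 3 && strcmp(path + len - 3, "/..") == 0) {
		char *slash;

		path[len - 3] = '\0';		/* "/a/b/.." -> "/a/b" */
		slash = strrchr(path, '/');
		if (slash)
			*slash = '\0';		/* "/a/b"    -> "/a"   */
	}

	if (path[0] == '\0')
		strcpy(path, "/");		/* never go above the root */
}
```

For example, "/a/b/." becomes "/a/b", "/a/b/.." becomes "/a", and both
"/." and "/.." collapse to "/".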
David Goulet [Thu, 21 Nov 2013 18:02:40 +0000 (13:02 -0500)]
Fix: use non block waitpid to lookup child state
When daemonizing the session daemon, if the child fails *before* it
could set the recv_child_signal variable that tells the parent to
exit, the parent process gets stuck in an infinite loop and never returns.
This commit fixes that by adding a non blocking waitpid() that monitors
the status of the child so it can exit if the child failed.
Acked-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
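The parent-side wait loop could be sketched as follows. The function
name, the `ready` flag parameter (standing in for recv_child_signal) and
the 10 ms pause are assumptions made for the example, not the daemonize
code itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical parent loop: returns 0 once *ready is set (normally by a
 * signal handler), or -1 if the child exits first or waitpid() fails.
 */
static int wait_child_ready(pid_t child, const volatile sig_atomic_t *ready)
{
	const struct timespec delay = { 0, 10 * 1000 * 1000 };	/* 10 ms */

	while (!*ready) {
		int status;
		pid_t ret = waitpid(child, &status, WNOHANG);	/* never blocks */

		if (ret == child)
			return -1;	/* child terminated before signaling readiness */
		if (ret < 0 && errno != EINTR)
			return -1;	/* e.g. ECHILD: nothing left to wait for */
		nanosleep(&delay, NULL);
	}
	return 0;
}
```

With WNOHANG, waitpid() returns 0 while the child is still running, so
the loop keeps checking the flag instead of spinning forever when the
child dies early.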
Raphaël Beamonte [Fri, 15 Nov 2013 00:58:35 +0000 (19:58 -0500)]
Remove the utils_resolve_relative function that is not useful anymore
As all of the work is now done in utils_partial_realpath and
utils_expand_path, utils_resolve_relative is not necessary
anymore and should be deleted from the sources.
Signed-off-by: Raphaël Beamonte <raphael.beamonte@gmail.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Raphaël Beamonte [Fri, 15 Nov 2013 00:58:34 +0000 (19:58 -0500)]
Change the utils_expand_path function to use utils_partial_realpath
As most of the resolve-related work can now be done using
utils_partial_realpath, the utils_expand_path function can
call it and concentrate on resolving the relative paths in
the middle of a path string, such as '/./' and '/../'.
Signed-off-by: Raphaël Beamonte <raphael.beamonte@gmail.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
Raphaël Beamonte [Fri, 15 Nov 2013 00:58:33 +0000 (19:58 -0500)]
Introduce a new utils_partial_realpath function
This new utils function partially resolves paths
using realpath. As realpath(3) cannot be used on
nonexistent paths, this function, which resolves partially
existent paths, can be used when we don't know if the path
fully exists. It first resolves the existent part using
realpath(3) and then concatenates the nonexistent part to it.
Signed-off-by: Raphaël Beamonte <raphael.beamonte@gmail.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>
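The idea can be sketched as below. This is not the utils_partial_realpath
implementation: partial_realpath_sketch is a hypothetical name, the
sketch drops trailing components until realpath(3) succeeds, and it
treats any realpath(3) failure as "does not exist". Any '.' or '..' left
in the nonexistent part is kept as-is.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed string the caller must free(), or NULL on error. */
static char *partial_realpath_sketch(const char *path)
{
	char prefix[PATH_MAX], resolved[PATH_MAX];
	size_t len = strlen(path), cut = 0, out_len;
	const char *rest;
	char *out;

	if (len == 0 || len >= sizeof(prefix))
		return NULL;
	memcpy(prefix, path, len + 1);

	for (;;) {
		char *slash;

		if (realpath(prefix, resolved)) {	/* longest existing prefix found */
			cut = strlen(prefix);
			break;
		}
		slash = strrchr(prefix, '/');
		if (!slash) {			/* relative path, nothing exists: use cwd */
			if (!realpath(".", resolved))
				return NULL;
			cut = 0;
			break;
		}
		if (slash == prefix) {		/* only the root is left */
			strcpy(resolved, "/");
			cut = 1;
			break;
		}
		*slash = '\0';			/* drop the last component and retry */
	}

	rest = path + cut;			/* nonexistent part of the input */
	while (*rest == '/')
		rest++;

	out_len = strlen(resolved) + 1 + strlen(rest) + 1;
	out = malloc(out_len);
	if (!out)
		return NULL;
	if (*rest == '\0')
		strcpy(out, resolved);
	else
		snprintf(out, out_len, "%s%s%s", resolved,
			 strcmp(resolved, "/") == 0 ? "" : "/", rest);
	return out;
}
```

For a path whose first component does not exist, the result is the
input unchanged; for a path that fully exists, the result is what
realpath(3) would return.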