Build: fallback to AC_CHECK_LIBS when looking for popt and uuid
Not all distros ship .pc files, so fall back to basic library searching if necessary.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Reviewed-by: Samuel Martin <s.martin49@gmail.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Samuel Martin [Tue, 1 Dec 2015 23:36:45 +0000 (00:36 +0100)]
tests/unit: fix object files' location
Referring to *.o files under a .libs/ directory is not recommended
because this directory belongs to libtool's internals.
Indeed, libtool decides to place the *.o files in an
implementation-specific location:
- PIC *.o files go into a .libs/ directory;
- non-PIC *.o files are generated alongside their corresponding
source files.
Using PIC objects to build executables is legitimate, though it may
introduce some minor overhead at runtime.
However, hard-coding these PIC object files in the Makefile.am to build
executables breaks the build in the case of a static-only build.
In this case, no PIC object files are generated, so the linker will not
find some of the needed object files.
Changing the paths of these dependencies fixes the static build and keeps
the shared one working, though the non-PIC object files are now always built.
Fixes #983.
Fix tested on git master and v2.6 with no change needed.
Signed-off-by: Samuel Martin <s.martin49@gmail.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Samuel Martin [Sun, 22 Nov 2015 22:38:00 +0000 (23:38 +0100)]
configure.ac: fix static build
For static builds, some extra LDFLAGS may be needed.
Using PKG_CHECK_MODULES instead of AC_CHECK_LIB for library detection
makes it possible to get all these flags. The LIBS variable can then be
extended with everything that is needed.
So, use PKG_CHECK_MODULES for popt and uuid detection, which both depend
on libintl.
This change fixes build failures triggered with Buildroot, e.g.:
http://autobuild.buildroot.net/results/0f1/0f1e015a0c5a5ac2beeb5011d31a1e0058a32a0d/build-end.log
Signed-off-by: Samuel Martin <s.martin49@gmail.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Julien Desfossez [Fri, 27 Nov 2015 17:12:44 +0000 (12:12 -0500)]
Fix: close indexes when rotating the trace files in splice mode
The consumer needs to close the old index file when doing a file
rotation before opening a new one.
The relay does not have this problem (handled with refcounts).
Fix: Check for NULL hash tables on relay daemon teardown
The relay daemon will log any "leaked" object on exit. However,
some errors encountered early-on during the daemon's
initialization may result in the teardown being executed with
uninitialized hash tables.
Mikael Beckius [Wed, 21 Oct 2015 19:48:29 +0000 (15:48 -0400)]
Fix live timer calculation error
There is a calculation error in the live timer. The variable
switch_timer_interval is expressed in microseconds, so it is not
correct to assign switch_timer_interval mod 1000000 to tv_nsec,
which is expressed in nanoseconds.
Signed-off-by: Mikael Beckius <mikael.beckius@windriver.com> Signed-off-by: Jianchuan Wang <jianchuan.wang@windriver.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
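The unit mismatch described above can be illustrated with a minimal sketch. The helper name usec_to_timespec and the two constants are hypothetical and are not taken from the lttng-tools sources; the point is only that the sub-second remainder, which is in microseconds, must be scaled by 1000 before it is stored in tv_nsec.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define USEC_PER_SEC	1000000ULL
#define NSEC_PER_USEC	1000ULL

/*
 * Hypothetical helper (not from lttng-tools): convert an interval
 * expressed in microseconds into a struct timespec.
 */
static struct timespec usec_to_timespec(uint64_t interval_usec)
{
	struct timespec ts;

	/* Whole seconds. */
	ts.tv_sec = (time_t) (interval_usec / USEC_PER_SEC);
	/*
	 * The remainder is still in microseconds: scale it to
	 * nanoseconds before storing it in tv_nsec.
	 */
	ts.tv_nsec = (long) ((interval_usec % USEC_PER_SEC) * NSEC_PER_USEC);
	return ts;
}
```

For an interval of 1500000 microseconds, this yields tv_sec = 1 and tv_nsec = 500000000, whereas assigning the raw remainder would give a tv_nsec of only 500000.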
Fix: Remove dependency on glibc 2.12 caused by pthread_setname_np
prctl() can be used to set the same attribute set by
pthread_setname_np, but doesn't introduce a dependency on a newer
glibc. Using prctl(PR_SET_NAME) introduces a soft dependency on
Linux 2.6.9. However, the worker won't fail to launch if the call
fails as it is set out of convenience (debugger output).
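A minimal sketch of the best-effort naming described above, assuming a Linux system. The helper name set_thread_name is hypothetical; only prctl() and PR_SET_NAME come from the commit message. A failure is reported but is not treated as fatal, since the name is only a debugging convenience.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper (not from lttng-tools): name the calling thread
 * using prctl(PR_SET_NAME). The kernel keeps at most 15 characters
 * plus the terminating NUL byte.
 *
 * Returns 0 on success and -1 on failure. Callers are expected to
 * carry on either way, since the name is purely cosmetic.
 */
static int set_thread_name(const char *name)
{
	if (prctl(PR_SET_NAME, (unsigned long) name, 0, 0, 0) != 0) {
		perror("prctl(PR_SET_NAME)");
		return -1;
	}
	return 0;
}
```

The name can be read back with prctl(PR_GET_NAME) into a 16-byte buffer, which is a convenient way to check the result in a test.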
Fix: Log and ignore SIGINT and SIGTERM in run_as worker
The run_as worker is in the same process group as its parent and
will receive both SIGINT and SIGTERM. However, we want to give
the worker a chance to tear itself down gracefully when its
parent closes the command socket.
The run_as worker will now ignore these signals (although it will
log them) and wait for the parent to induce the teardown.
Fix: Handle EINTR of waitpid when spawning a session daemon
waitpid may fail for various reasons, being interrupted being
the most frequent. In such a case, status is left uninitialized,
which results in the WIFSIGNALED and WIFEXITED macros returning
undefined values and in surprising logging statements such
as "killed by signal 114".
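One common way to avoid inspecting an uninitialized status is to retry waitpid() on EINTR and to evaluate the status macros only once the call has succeeded. The sketch below illustrates that pattern; the helper names are hypothetical and the actual fix in lttng-tools may be structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for the child `pid`, retrying when the
 * call is interrupted by a signal. `*status` is only meaningful when
 * 0 is returned.
 */
static int wait_for_child(pid_t pid, int *status)
{
	pid_t ret;

	do {
		ret = waitpid(pid, status, 0);
	} while (ret < 0 && errno == EINTR);

	return ret == pid ? 0 : -1;
}

/* Hypothetical helper: fork a child that exits at once with `code`. */
static pid_t spawn_child_exiting_with(int code)
{
	pid_t pid = fork();

	if (pid == 0) {
		_exit(code);
	}
	return pid;
}
```

A caller would check the return value of wait_for_child() first, and only then use WIFEXITED(), WEXITSTATUS() or WIFSIGNALED() on the status.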
Jonathan Rajotte [Mon, 21 Sep 2015 22:43:54 +0000 (18:43 -0400)]
Fix: disable kernel event based on name and event type
The -a argument is interpreted as a zero-length event name
instead of '*' which is actually a valid wildcard event
name by itself. This simplifies how a disable command is
handled by the session daemon.
The event type can now be passed as an argument and is a
new criterion when disabling kernel events. The default
is to disable events of all types.
The UST and agent domains do not yet support disabling by event
type.
e.g.:
# Only disable kernel events of type tracepoint.
lttng disable -a -k --tracepoint
# Only disable the event with name '*' and type syscall.
lttng disable -k '*' --syscall
# Disable all kernel events of all types.
lttng disable -a -k
Fixes #925
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fixes the following warning introduced by the runas worker changes. Use
the same technique used in src/bin/lttng/Makefile.am.
src/common/Makefile.am:17: warning: source file 'sessiond-comm/unix.c' is in a subdirectory,
src/common/Makefile.am:17: but option 'subdir-objects' is disabled
automake: warning: possible forward-incompatibility.
automake: At least a source file is in a subdirectory, but the 'subdir-objects'
automake: automake option hasn't been enabled. For now, the corresponding output
automake: object file(s) will be placed in the top-level directory. However,
automake: this behaviour will change in future Automake versions: they will
automake: unconditionally cause object files to be placed in the same subdirectory
automake: of the corresponding sources.
automake: You are advised to start using 'subdir-objects' option throughout your
automake: project, to avoid future incompatibilities.
Some distributions like Debian (e.g. Debian kernel 4.1.0-2-amd64) have
some grsecurity options enabled, such as CONFIG_GRKERNSEC_PERF_HARDEN.
Unfortunately, this option makes it impossible to use the SW page-fault
perf event as a normal user. It only leaves some HW events. However, we
can only use SW events within virtual machines.
Therefore, only run this test as root for now until we find a better
approach.
Some implementations of pidof (such as the one from procps-ng)
seem immune to changing a process' name using prctl() and
overwriting argv[0]. Using pgrep --full works around this
problem.
In time, we should ensure every daemon publishes a PID file
which can be reliably used by the tests.
Cleanup: remove duplicated implementation of rculfhash
lttng-tools features a duplicated copy of Userspace RCU rculfhash due to
interaction issues between runas clone() and internal libc mutexes.
Now that the runas implementation has been changed to use fork() and a
worker process, we don't need this work-around anymore. Remove the
duplicated rculfhash to lessen the maintenance burden.
Implement a proper run_as worker process scheme to fix internal libc
mutex races. Those races lead to having the internal mutex held by
another process when clone() is called, thus hanging the clone child.
Now that we create the worker process when the parent process is
still single-threaded, we don't run into those issues. Implement a
standard fork + file descriptor passing over unnamed unix sockets rather
than the prior clone + shared file descriptor table, which was causing
issues with valgrind.
This adds a new process called "lttng-runas" for each sessiond
and consumerd process.
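File descriptor passing over a unix socket is usually done with an SCM_RIGHTS control message. The following sketch shows that standard mechanism; the send_fd and recv_fd helpers are hypothetical and are not the lttng-tools implementation. In the scheme described above, the socket pair would be created with socketpair() before forking the worker, while the parent is still single-threaded.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg_buf {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send `fd` over the unix socket `sock`. */
static int send_fd(int sock, int fd)
{
	char dummy = 'F';	/* At least one byte of regular data. */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg_buf u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd from `sock`, or -1 on error. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg_buf u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1) {
		return -1;
	}
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
			cmsg->cmsg_type != SCM_RIGHTS ||
			cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
		return -1;
	}
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor is a new number in the receiving process that refers to the same open file, so no shared file descriptor table is needed between the parent and the worker.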
relay_process_data has error cases that don't print any error to the
console. Add those cases, and enhance the information provided by error
output within handle_index_data().
Fix: relayd: handle consumerd crashes without leak
We can be clever about indexes partially received in cases where we
received the data socket part, but not the control socket part: since
we're currently closing the stream on behalf of the control socket, we
*know* there won't be any more control information for this socket.
Therefore, we can destroy all indexes for which we have received only
the file descriptor (from data socket). This takes care of consumerd
crashes between sending the data and control information for a packet.
Since those are sent in that order, we take care of consumerd crashes.
Fix: LPOLLHUP and LPOLLERR when there is still data in pipe/socket
The event mask returned by poll/epoll is a bitwise mask made of all the
events observed. On bidirectional sockets, there are cases where
combinations of LPOLLHUP/LPOLLERR and LPOLLIN/LPOLLPRI can be raised at
the same time.
Currently the overall behavior in sessiond, consumerd and relayd is to
handle LPOLLHUP or LPOLLERR immediately, whether or not there is still
data to read in the socket. Unfortunately, this behavior may discard the
last information made available on the pipe or socket.
Audit all uses of LPOLLHUP and LPOLLERR on sockets on which we expect
data to ensure that we deal with LPOLLIN or LPOLLPRI, and catch the
hangup when read or recvmsg returns 0. Keep the LPOLLHUP and LPOLLERR
handling, but only when LPOLLIN is not raised, just in case some
unforeseen error happens when sending the reply.
There is one case where we can correctly handle LPOLLHUP and LPOLLERR
directly without caring about LPOLLIN: sockets where we are expected to
write and then read the reply (e.g. command sockets). It is then OK
for a dedicated thread to watch for LPOLLHUP and LPOLLERR.
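The "data first, hang-up second" rule can be sketched with plain poll(). This is an illustration only: it uses the POSIX POLLIN/POLLHUP/POLLERR flags as stand-ins for the LPOLL* compatibility flags, and drain_until_hangup is a hypothetical helper. The hang-up is detected when read() returns 0, and POLLHUP/POLLERR are acted upon only when no input is reported.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from `fd` until the peer hangs up or
 * `buf` is full. Returns the number of bytes read, or -1 on error.
 */
static ssize_t drain_until_hangup(int fd, char *buf, size_t len)
{
	size_t total = 0;

	for (;;) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLPRI };
		ssize_t ret;

		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR) {
				continue;
			}
			return -1;
		}

		if (pfd.revents & (POLLIN | POLLPRI)) {
			/* Data takes priority, even if POLLHUP is also set. */
			if (total == len) {
				return (ssize_t) total;
			}
			ret = read(fd, buf + total, len - total);
			if (ret < 0) {
				if (errno == EINTR) {
					continue;
				}
				return -1;
			}
			if (ret == 0) {
				/* Orderly hang-up, seen through read(). */
				return (ssize_t) total;
			}
			total += (size_t) ret;
			continue;
		}

		if (pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) {
			/* No data left: the hang-up can be handled now. */
			return (ssize_t) total;
		}
	}
}
```

When a writer sends a last message on a pipe and closes its end right away, this loop still returns the message, whereas reacting to POLLHUP first would discard it.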
Add rcu_read_ongoing() assertions around process_client_msg
process_client_msg expects that the RCU read-side lock is not held
when it is called. Validate this using rcu_read_ongoing() at the entry and
exit points of this function. This allows us to quickly catch unbalanced
RCU read-side locks within commands.
Philippe Proulx [Wed, 2 Sep 2015 16:55:47 +0000 (12:55 -0400)]
Fix: disable agent events by name
The event_agent_disable() function only disables the first
agent event matching a given name. However, if multiple agent
events exist with different loglevels, but share the same name,
we want all of them to be disabled at once.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
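The difference between disabling the first match and disabling every match can be shown with a simplified model. The structure and function below are hypothetical: they use a plain array rather than the session daemon's actual data structures, and only illustrate that the loop must not stop at the first event with the requested name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for an agent event. */
struct demo_agent_event {
	const char *name;
	int loglevel;
	int enabled;
};

/*
 * Disable every enabled event called `name`, whatever its loglevel.
 * Returns the number of events that were disabled.
 */
static size_t disable_events_by_name(struct demo_agent_event *events,
		size_t count, const char *name)
{
	size_t i, disabled = 0;

	for (i = 0; i < count; i++) {
		if (events[i].enabled && strcmp(events[i].name, name) == 0) {
			events[i].enabled = 0;
			disabled++;
			/* No break: other loglevels may share the name. */
		}
	}
	return disabled;
}
```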
Antoine Busque [Wed, 2 Sep 2015 00:21:00 +0000 (20:21 -0400)]
Fix: don't print the default channel name when enabling agent events
Enabling an event in the python domain erroneously reported the
channel as being the default `channel0`. Instead, don't report the
channel name when enabling an event in an agent domain.
Fixes: #910 Signed-off-by: Antoine Busque <abusque@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Antoine Busque [Tue, 1 Sep 2015 23:48:43 +0000 (19:48 -0400)]
Fix: fail gracefully on --exclude on unsupported domains
Trying to use event name exclusions on unsupported domains other than
kernel (i.e. log4j, jul, and python) would hang the client. Instead,
report the error appropriately.
Fixes: #909 Signed-off-by: Antoine Busque <abusque@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Antoine Busque [Tue, 1 Sep 2015 22:53:57 +0000 (18:53 -0400)]
Fix: correct mismatched function signatures
The extern declaration of `_lttng_create_session_ext` in `create.c`
had a superfluous `live_timer` parameter not present in the actual
function definition in `lttng_ctl.c`. The -1 value with which it was
called was therefore unused.
Signed-off-by: Antoine Busque <abusque@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 27 Aug 2015 15:52:33 +0000 (11:52 -0400)]
Daemonize sessiond on `lttng create`
Since the session daemon forked by `lttng create` shares its
standard output/error FDs when not using `--daemonize`, redirecting
the standard output/error of this command to another program "hangs"
because the session daemon never terminates.
Example that's not working (when sessiond is not running):
lttng create | wc
or:
lttng 2>&1 | wc
Using sessiond's `--daemonize` option makes it close its FDs. This
option also ensures that when the sessiond process exits, it has forked
itself as a daemon and is ready to accept commands. Therefore we don't
need to catch SIGCHLD and SIGUSR1; just waitpid() on sessiond's PID and
make sure it exited normally and with an exit status of 0 to continue.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: race between kconsumerd and sessiond on tear down
v2: minimize indentation by using return on condition.
Kconsumerd and sessiond both hold a reference on lttng-module. This can lead to a race
in modprobe_remove_lttng_all, which might fail to unload modules because
some of them do not have a reference count of zero at that time.
waitpid is used to force synchronization on the termination of the child (kconsumerd).
This has also been applied to the UST consumers for the sake of consistency.
Fixes: #878 Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>