Make sure wait/wakeup stream and channel FDs are closed at teardown.
Leaving them open led to FD leaks on the consumer daemon when the relay
daemon disconnects unexpectedly.
Recently, commits to fix SDT issues with extern C
(https://bugs.lttng.org/issues/597) brought in compile errors when the
tracepoint is defined in the same file where the tracepoint provider is
created.
This was due to the presence of extern C guards in tracepoint-event.h, a
header dedicated to tracepoint probe provider compilation. After commits
"Tracepoint probes don't need extern C", it should have gone away. This
is the main fix done by this patch.
This patch also adds missing extern C guards in ust-error.h and
ust-events.h.
Would deal with spaces in the env. var. if there are any. It does not
seem to be important in practice (currently), because automake seems to
fail on CC including spaces at configure time.
By using the timestamp sampled at space reservation when the packet is
being filled as "end timestamp" for a packet, we can ensure there is no
overlap between packet timestamp ranges, so that packet timestamp end <=
following packet's timestamp begin.
Overlap between consecutive packets becomes an issue when the end
timestamp of a packet is greater than the end timestamp of a following
packet, IOW a packet completely contains the timestamp range of a
following packet. This kind of situation does not allow trace viewers
to do binary search within the packet timestamps. This kind of situation
will typically never occur if packets are significantly larger than
event size, but this fix ensures it can never even theoretically happen.
The only case where packets can still theoretically overlap is if they
have equal begin and end timestamps, which is valid.
Jon Bernard [Fri, 15 Nov 2013 14:12:47 +0000 (09:12 -0500)]
Escape minus signs in lttng-ust-cyg-profile manpage
By default, "-" chars are interpreted as hyphens (U+2010) by groff, not
as minus signs (U+002D). Since options to programs use minus signs
(U+002D), this means for example in UTF-8 locales that you cannot cut
and paste options, nor search for them easily.
Fix: application SIGBUS when starting in parallel with sessiond
There is a race between application startup and sessiond startup, where
there is an intermediate state where applications can SIGBUS if they see
a zero-sized shm, if the shm has been created, but not ftruncated yet.
On the UST side, fix this by ensuring that UST can read the shared
memory file descriptor with a read() system call before they try
accessing it through a memory map (which triggers the SIGBUS if the
access goes beyond the file size).
On the sessiond side, another commit needs to ensure that the shared
memory is writeable by applications as long as its size is 0, which
allows applications to perform ftruncate and extend its size.
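The UST-side check described above can be sketched as follows. This is a
minimal illustration, not the code from the patch: the helper name
shm_fd_has_data is hypothetical, and it probes a single byte only, on the
assumption that one successful read() is enough to tell a zero-sized file
from one that has already been ftruncated.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): probe a shared memory file
 * descriptor with read() before accessing it through a memory map.
 * Returns 1 if one byte could be read, 0 if the file is still empty,
 * and -1 on error.
 */
static int shm_fd_has_data(int fd)
{
	char byte;
	ssize_t ret;

	/* Retry if the read is interrupted by a signal. */
	do {
		ret = read(fd, &byte, 1);
	} while (ret < 0 && errno == EINTR);
	if (ret < 0)
		return -1;
	return ret == 1 ? 1 : 0;
}
```

A caller would only mmap() and dereference the mapping once this returns
1. A read() past the end of a file returns 0 rather than raising SIGBUS,
which is what makes it a safe probe. Note that read() advances the file
offset of the descriptor; this does not affect a later mmap().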
We need to perform both the connect and the sending of the registration
message before doing the next connect, otherwise we may reach the unix
socket connect queue limit and block on the second connect while the
session daemon is awaiting the registration message of the first
connect.
This happens in scenarios where unix socket connect queues are nearly
full.
Fix: ust-comm recvmsg should handle partial receive
Handles cases where unix socket buffer is full. Without this fix, the
session daemon kicks out applications that happen to have their
registration message split into multiple recvmsg on the sessiond side.
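A receive loop that tolerates partial reads can be sketched as below.
This is an illustration under assumptions, not the ust-comm code itself:
the helper name recv_all is hypothetical, it assumes a blocking
SOCK_STREAM unix socket, and it ignores ancillary data (file descriptor
passing).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: keep calling recvmsg() until len bytes have been
 * received, the peer performs an orderly shutdown, or an error occurs.
 * Returns the number of bytes received, or -1 on error.
 */
static ssize_t recv_all(int sock, void *buf, size_t len)
{
	char *p = buf;
	size_t received = 0;

	while (received < len) {
		struct iovec iov;
		struct msghdr msg;
		ssize_t ret;

		/* Point the iovec at the part of the buffer not filled yet. */
		iov.iov_base = p + received;
		iov.iov_len = len - received;
		memset(&msg, 0, sizeof(msg));
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;

		ret = recvmsg(sock, &msg, 0);
		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* Interrupted: retry. */
			return -1;
		}
		if (ret == 0)
			break;		/* Orderly shutdown by the peer. */
		received += (size_t) ret;
	}
	return (ssize_t) received;
}
```

With such a loop, a return value shorter than the expected message size
means the peer really went away, rather than that the message happened
to arrive in several pieces.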
For the "ordered comparison of pointer with integer zero" warning, fix
this by comparing (type) -1 against (type) 0 instead of just 0, so if
"type" is a pointer type, this pointer type will be applied to the right
operand too, thus fixing the warning.
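As an illustration of the comparison described above, here is a minimal
signedness test written that way. The macro name is_signed_type is
hypothetical and used only for this sketch; the actual macro touched by
the patch is not named in this message.

```c
/*
 * Hypothetical macro name, for illustration only. Both operands are
 * cast to "type", so the comparison is always between two values of the
 * same type, whether that type is an integer or a pointer.
 */
#define is_signed_type(type)	((type) -1 < (type) 0)

/* For integer types the result is a compile-time constant. */
_Static_assert(is_signed_type(int), "int is a signed type");
_Static_assert(!is_signed_type(unsigned int), "unsigned int is unsigned");
```

Comparing against a plain 0 instead would compare a pointer with an
integer when "type" is a pointer type, which is what the warning is
about.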
Ikaheimonen, JP [Mon, 7 Oct 2013 13:33:02 +0000 (09:33 -0400)]
Add usage reference count for tracepoints
Keep track of how many libraries use a tracepoint, and disable the
tracepoint when the number of users drops to zero.
A new reference counter is added to tracepoint_entry. This keeps track
of how many callsites use that tracepoint.
When you have libraries and/or executables sharing tracepoints, you
cannot just disable your tracepoints when the library is unregistered.
You must check that the tracepoint is not used by any other libraries
before you disable it.
Function lib_disable_tracepoints becomes unnecessary, and is removed.
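The counting scheme can be sketched as below. This is a simplified model,
not the tracepoint_entry structure from the patch: the structure and
function names are hypothetical, and locking is left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a tracepoint entry. */
struct toy_tracepoint {
	int refcount;	/* Number of users of this tracepoint. */
	bool enabled;
};

/* Called when a library or executable using the tracepoint registers. */
static void toy_tracepoint_add_user(struct toy_tracepoint *tp)
{
	tp->refcount++;
}

/*
 * Called when a user unregisters. The tracepoint is disabled only when
 * the last user goes away.
 */
static void toy_tracepoint_remove_user(struct toy_tracepoint *tp)
{
	if (tp->refcount > 0 && --tp->refcount == 0)
		tp->enabled = false;
}
```

With two users of the same tracepoint, unregistering the first leaves
the tracepoint enabled for the second.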
Fix: ring buffer: handle concurrent update in nested buffer wrap around check
With stress-test loads that trigger sub-buffer switch very frequently
(small 4kB sub-buffers, frequent flush) in lttng-modules, we currently
observe this kind of warning once every few minutes:
[65335.896208] ring buffer relay-overwrite-mmap, cpu 5: records were lost. Caused by:
[65335.896208] [ 0 buffer full, 1 nest buffer wrap-around, 0 event too big ]
It appears that the check for nested buffer wrap-around does not take
into account that concurrent execution contexts (either nested for
per-cpu buffers, or from another CPU or nested for global buffers) can
update the commit_count value concurrently.
What we really want to do with this check is to ensure that if we enter
a sub-buffer that had an unbalanced reserve/commit count, assuming there
is no hope that this gets rebalanced promptly, we detect this and drop
the current event. However, in the case where the commit counter has
been concurrently updated by another reserve or a switch, we want to
retry the entire reserve operation.
One way to detect this is to sample the reserve offset twice, around the
commit counter read, along with the appropriate memory barriers.
Therefore, we can detect if the mismatch between reserve and commit
counter is actually caused by a concurrent update, which necessarily has
updated the reserve counter.
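The double-sampling idea can be sketched with C11 atomics as below. This
is a toy model under assumptions, not the ring buffer code: the type and
function names are hypothetical, the expected commit count is passed in
by the caller, and sequentially consistent atomic loads stand in for the
explicit memory barriers mentioned above.

```c
#include <stdatomic.h>

enum wrap_check {
	WRAP_OK,	/* Commit counter matches: proceed. */
	WRAP_RETRY,	/* Concurrent update detected: retry the reserve. */
	WRAP_DROP,	/* Unbalanced sub-buffer: drop the event. */
};

/* Hypothetical, simplified buffer state. */
struct toy_buf {
	atomic_ulong reserve_offset;
	atomic_ulong commit_count;
};

/*
 * first_offset is the reserve offset sampled by the caller before
 * calling this function. The reserve offset is sampled a second time
 * after reading the commit counter.
 */
static enum wrap_check toy_check_wrap(struct toy_buf *buf,
		unsigned long first_offset, unsigned long expected_commit)
{
	unsigned long commit, second_offset;

	commit = atomic_load(&buf->commit_count);
	second_offset = atomic_load(&buf->reserve_offset);

	if (commit == expected_commit)
		return WRAP_OK;
	/* The reserve offset moved: the mismatch may be transient. */
	if (second_offset != first_offset)
		return WRAP_RETRY;
	return WRAP_DROP;
}
```

Only a mismatch observed while the reserve offset stayed unchanged is
treated as a genuinely unbalanced sub-buffer.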
lib_ring_buffer_write() could be passed a length of 0. This typically
has no side-effect as far as writing into the buffers is concerned,
except for one detail: in overwrite mode, there is a check to make sure
the sub-buffer can be written into. This check is performed even if
length is 0. In the case where this would fall exactly at the end of a
sub-buffer, the check would fail, because the offset would fall exactly
at the beginning of the next sub-buffer.
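The corner case can be reproduced with a toy model such as the one below.
Everything in it is hypothetical (names, sizes, the "writable" flags),
and the early return on a zero length is an assumption about how such a
case can be handled, not a description of the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SUBBUF_SIZE	8
#define TOY_NR_SUBBUF	2

/* Hypothetical ring with a per-sub-buffer "can be written" flag. */
struct toy_ring {
	char data[TOY_SUBBUF_SIZE * TOY_NR_SUBBUF];
	bool writable[TOY_NR_SUBBUF];
};

/* Returns 0 on success, -1 if the write is refused. */
static int toy_write(struct toy_ring *rb, size_t offset,
		const void *src, size_t len)
{
	size_t pos = offset % sizeof(rb->data);
	size_t idx = pos / TOY_SUBBUF_SIZE;

	/*
	 * Assumed handling: nothing to write, so skip the sub-buffer
	 * check, which would otherwise look at the next sub-buffer when
	 * offset sits exactly at a sub-buffer boundary.
	 */
	if (len == 0)
		return 0;
	if (!rb->writable[idx])
		return -1;
	/* This toy model refuses writes crossing a sub-buffer boundary. */
	if (pos % TOY_SUBBUF_SIZE + len > TOY_SUBBUF_SIZE)
		return -1;
	memcpy(&rb->data[pos], src, len);
	return 0;
}
```

In this model, a zero-length write at offset 8 (the first byte of the
second sub-buffer) succeeds even when only the first sub-buffer is
writable, while a one-byte write at the same offset is refused.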
This commit introduces a new feature in lttng-gen-tp, but still has some
semantic issues in the notion of comments vs #include. Postpone this
feature to 2.3.
Fix: Add --no-as-needed to the demo example's Makefile
Some distributions now ship with the --as-needed linker flag
set by default (Ubuntu 13.04). This will cause the linker to
remove the references to lttng-ust from the provider objects
thus causing the application to fail when preloading them.
Fix: liblttng-ust process startup hang when sessiond is stopped
Ensure the listener thread owns socket and notify_socket, so they don't
have to hold the ust_lock() while connecting to the sessiond and reading
from this socket.
Therefore, after process fork, we can safely clean up those resources,
because the thread has been removed by the operating system. On exit,
however, let the OS teardown those sockets, so exit path does not race
with the listener thread.
Actually, $^ here is "demo.o", not "demo". Also, the libs should appear
after the objects on the command line. See the "-l" section in
http://gcc.gnu.org/onlinedocs/gcc/Link-Options.html. On most setups
this doesn't matter, since -Wl,--no-as-needed was the default pretty
much everywhere. Ubuntu decided to use -Wl,--as-needed to avoid
unnecessary dependencies, so the order becomes important. If you try
to build manually on a recent Ubuntu you will get undefined references
to dlopen and such. So this patch is good.
If you read carefully the log sent by Alexandre, you see that it is
when building the shared libs in this directory
(lttng-ust-provider-ust-tests-demo.so) that the build fails. I don't
know why it fails, but Alexandre hinted that passing "-fPIE -pie" to
build a shared library is weird (it is usually -fPIC -pic). I am not
sure where that comes from. This behaviour only happens when building
the package, not when building manually.
* Zifei Tong <soariez@gmail.com> wrote:
> I did some debugging on this issue. The problem only occurs when we
> have more than one context field.
> So this will not work either:
>
> lttng create
> lttng enable-event -a -u
> lttng add-context -u -t vpid
> lttng add-context -u -t vtid
> lttng start
> $@
> lttng stop
> sleep 1
> lttng view
> lttng destroy
>
> The problem I found is that the wrong `fields` argument is passed into
> `ustcomm_register_channel`.
> The `fields` argument passed is a pointer to the `event_field` of the
> first element in a `lttng_ctx_field` array, but not a
> `lttng_event_field` array as expected.
Fixes #529
Reported-by: Francis Giraldeau <francis.giraldeau@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Move "hello-static-lib" to doc/examples and add non-automake Makefiles
The examples are now automatically built as part of the default make
target and plain Makefiles with no dependency on automake are provided
for clarity.
Update the manpage and README to reflect the change and remove lots of
trailing whitespace.
There is not much we can do for this compatibility bug in lttng-ust 2.0
and 2.1 (already stable). Adding this check so that starting with
lttng-ust 2.2, when liblttng-ust encounters a probe provider with a
provider version major number higher than it supports, it will reject
it.
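The kind of check described above can be sketched as follows. The
descriptor layout, the names and the supported major value of 1 are all
hypothetical and chosen for illustration; they are not taken from
liblttng-ust.

```c
/* Hypothetical value: highest provider major version understood. */
#define TOY_SUPPORTED_PROVIDER_MAJOR	1U

/* Hypothetical, simplified probe provider descriptor. */
struct toy_probe_desc {
	const char *provider;
	unsigned int major;
	unsigned int minor;
};

/*
 * Returns 0 if the provider is accepted, -1 if it was built against a
 * newer major version than this library supports.
 */
static int toy_probe_register(const struct toy_probe_desc *desc)
{
	if (desc->major > TOY_SUPPORTED_PROVIDER_MAJOR)
		return -1;
	return 0;
}
```

Only the major number is compared here, so a provider with a higher
minor number and the same major number is still accepted.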
Timer management is not called under ust_lock(). It is only called from
the consumer. Add internal locking for timer start/stop and
synchronization management.
In file included from ../include/lttng/ust-tracepoint-event.h:357,
from ../include/lttng/tracepoint-event.h:62,
from lttng-ust-cyg-profile.h:63,
from lttng-ust-cyg-profile.c:27:
././lttng-ust-cyg-profile.h: In function ‘__event_prepare_filter_stack__lttng_ust_cyg_profile___func_entry’:
././lttng-ust-cyg-profile.h:35: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h:35: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h:35: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h:35: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h: In function ‘__event_prepare_filter_stack__lttng_ust_cyg_profile___func_exit’:
././lttng-ust-cyg-profile.h:46: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h:46: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h:46: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile.h:46: warning: cast from pointer to integer of different size
CCLD liblttng-ust-cyg-profile.la
CC lttng-ust-cyg-profile-fast.lo
In file included from ../include/lttng/ust-tracepoint-event.h:357,
from ../include/lttng/tracepoint-event.h:62,
from lttng-ust-cyg-profile-fast.h:59,
from lttng-ust-cyg-profile-fast.c:27:
././lttng-ust-cyg-profile-fast.h: In function ‘__event_prepare_filter_stack__lttng_ust_cyg_profile_fast___func_entry’:
././lttng-ust-cyg-profile-fast.h:35: warning: cast from pointer to integer of different size
././lttng-ust-cyg-profile-fast.h:35: warning: cast from pointer to integer of different size