Don't use ring buffer client's struct lttng_channel from ioctl which
applies to ring buffer streams, because lttng_channel is freed while lib
ring buffer stream and channel are still in use. Their lifetime persists
until the consumer daemon releases its handles on the related stream
file descriptors.
"Introduce API to remap event names exposed by LTTng"
failed to map the event names enabled by the user to tracepoint names
known to the kernel. For instance, tracing with the kmem_kmalloc event
enabled is not gathering any event. This issue applies to all tracepoint
events declared with a different name within LTTng than within the Linux
kernel.
It should use lib_ring_buffer_read_offset_address() to get the packet
being read, rather than lib_ring_buffer_offset_address(), which is only
meant to be used when writing to the packet.
By using the timestamp sampled at space reservation when the packet is
being filled as "end timestamp" for a packet, we can ensure there is no
overlap between packet timestamp ranges: each packet's end timestamp is
<= the begin timestamp of the following packet.
Overlap between consecutive packets becomes an issue when the end
timestamp of a packet is greater than the end timestamp of a following
packet, IOW a packet completely contains the timestamp range of a
following packet. This prevents trace viewers from doing a binary
search on packet timestamps. It will typically never occur if packets
are significantly larger than the event size, but this fix ensures it
cannot happen even in theory.
The only case where packets can still theoretically overlap is if they
have equal begin and end timestamps, which is valid.
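The ordering guarantee described above can be illustrated with a small
userspace sketch. The struct packet_range type and both function names
are hypothetical and not taken from LTTng; the sketch only shows why
non-overlapping ranges make a binary search on packet timestamps valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet timestamp range; names are illustrative. */
struct packet_range {
	uint64_t ts_begin;
	uint64_t ts_end;
};

/*
 * Return 1 if each packet is well formed (begin <= end) and ends no
 * later than the next packet begins (equal timestamps are allowed),
 * 0 otherwise.
 */
int packets_are_ordered(const struct packet_range *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (p[i].ts_begin > p[i].ts_end)
			return 0;
		if (i + 1 < n && p[i].ts_end > p[i + 1].ts_begin)
			return 0;
	}
	return 1;
}

/*
 * Binary search for the first packet whose end timestamp is >= ts.
 * Only meaningful when packets_are_ordered() holds, because the search
 * relies on end timestamps never decreasing. Returns n if no packet
 * matches.
 */
size_t find_packet(const struct packet_range *p, size_t n, uint64_t ts)
{
	size_t lo = 0, hi = n;

	assert(p != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (p[mid].ts_end < ts)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

A packet that completely contains a following packet, such as
{0, 50} followed by {10, 20}, makes the end timestamps decrease, which
is exactly the case the search cannot tolerate.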
lttng-statedump-impl: Use generic hard irqs for Linux >= 3.12
Quoting the original patch changelog from Otavio Salvador:
> The Linux kernel 3.12 uses the generic hard irqs system for all
> architectures and dropped the GENERIC_HARDIRQ option, as can be seen
> at the commit quoted below:
>
> ,----
> | commit 0244ad004a54e39308d495fee0a2e637f8b5c317
> | Author: Martin Schwidefsky <schwidefsky@de.ibm.com>
> | Date: Fri Aug 30 09:39:53 2013 +0200
> |
> | Remove GENERIC_HARDIRQ config option
> |
> | After the last architecture switched to generic hard irqs the config
> | options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
> | for !CONFIG_GENERIC_HARDIRQS can be removed.
> |
> | Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> `----
Introduce wrapper/irq.h to move the feature availability testing logic
into a specific wrapper header. It now tests if the kernel version is
>= 3.12 or if CONFIG_GENERIC_HARDIRQS is defined (for older kernels).
Introduce the lttng-specific CONFIG_LTTNG_HAS_LIST_IRQ to track
availability of this feature within LTTng.
Reported-by: Philippe Mangaud <r49081@freescale.com>
Reported-by: Otavio Salvador <otavio@ossystems.com.br>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
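The feature test described for wrapper/irq.h might look roughly like the
sketch below. It is not the actual header: the KERNEL_VERSION and
LINUX_VERSION_CODE stand-ins are defined here only so the fragment
compiles outside a kernel tree (the 3.12.0 value is an assumption), and
lttng_has_list_irq() is a hypothetical helper added for illustration.

```c
/*
 * Userspace stand-ins so the sketch compiles outside a kernel tree.
 * In a real module these come from <linux/version.h> and the kernel
 * configuration; the 3.12.0 value below is an assumption.
 */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(3, 12, 0)
#endif

/* Generic hard irqs: always present since 3.12, optional before. */
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 12, 0)) \
	|| defined(CONFIG_GENERIC_HARDIRQS)
#define CONFIG_LTTNG_HAS_LIST_IRQ
#endif

/* Hypothetical helper: 1 when the irq list is available, 0 otherwise. */
int lttng_has_list_irq(void)
{
#ifdef CONFIG_LTTNG_HAS_LIST_IRQ
	return 1;
#else
	return 0;
#endif
}
```

Keeping the version test in one wrapper header means the statedump code
only has to check a single LTTng-specific macro.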
Import fix from LTSI: 3.4+ RT kernels use CONFIG_PREEMPT_RT_FULL
Initial LTSI commit:
From: Paul Gortmaker <paul.gortmaker@windriver.com>
> fix reference to obsolete RT Kconfig variable.
>
> The preempt-rt patches no longer use CONFIG_PREEMPT_RT in
> the 3.4 (and newer) versions. So even though LTSI doesn't
> include RT, having this define present can lead to an easy
> to overlook bug for anyone who does try to layer RT onto
> the LTSI baseline.
>
> Update it to use the currently used define name by RT.
>
> Reported-by: Jim Somerville <Jim.Somerville@windriver.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Merged with kernel version checks for >= 3.4 to support both old and
newer kernels.
These new calls export the data required for the consumer to generate
the index while tracing:
- timestamp begin
- timestamp end
- events discarded
- content size
- packet size
- stream id
This patch allows LTTng to override the file operations of the lib ring
buffer.
For now it does not provide any additional functions, but it prepares
the work of adding LTTng-specific ioctls to the ring buffer.
Linux kernels 3.10 and 3.11 introduce a deadlock in the timekeeping
subsystem. See
http://lkml.kernel.org/r/1378943457-27314-1-git-send-email-john.stultz@linaro.org
for details. Awaiting patch merge into Linux master, stable-3.10 and
stable-3.11 for fine-grained kernel version blacklisting.
The metadata stream should only reference the metadata cache, not the
session. Otherwise, we end up in a catch-22 situation:
- Stream POLLHUP is only given when the session is destroyed, but,
- The session is only destroyed when all references to session are
released, including references from channels, but,
- If the metadata stream holds a reference on the metadata session, we
end up with a circular dependency loop.
Fix this by making sure the metadata stream uses neither the lttng
channel nor the lttng session.
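The circular dependency can be modelled with a toy reference count. The
struct session type and the functions below are hypothetical and far
simpler than the real objects; the sketch only shows that a reference
which is released on hang-up can never be released if hang-up itself
waits for the count to reach zero.

```c
#include <assert.h>

/* Toy model of a reference-counted session. */
struct session {
	int refcount;
	int destroyed;	/* set when the last reference is dropped */
};

void session_get(struct session *s)
{
	s->refcount++;
}

void session_put(struct session *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0)
		s->destroyed = 1;	/* hang-up would be signalled here */
}

/*
 * Model the teardown: the session fd and the channel drop their
 * references, and the metadata stream drops its own (if it holds one)
 * only after observing the hang-up.
 * Returns 1 if the session ends up destroyed, 0 if it is stuck.
 */
int teardown_completes(int stream_holds_session_ref)
{
	struct session s = { 0, 0 };

	session_get(&s);		/* session file descriptor */
	session_get(&s);		/* channel */
	if (stream_holds_session_ref)
		session_get(&s);	/* metadata stream */

	session_put(&s);		/* channel closed */
	session_put(&s);		/* session fd closed */

	/* The stream releases its reference only after hang-up. */
	if (stream_holds_session_ref && s.destroyed)
		session_put(&s);

	return s.destroyed;
}
```

In this model teardown_completes(1) never reaches destruction, while
teardown_completes(0) does, which mirrors the reasoning behind the fix.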
The OOPS at bug #622 is likely caused by a missing reference on the
lttng channel structure, which could lead to accessing the object after
it has been destroyed if the lttng channel file descriptor is closed
while the metadata stream fd is still in use.
However, we don't want to populate data from the metadata cache into the
stream until put_next_subbuf is issued. Add a check to ensure that it is
not populated until required.
Also, disallow get_subbuf() ioctl on metadata channel: its random-access
semantic does not play well with serialization of the metadata cache on
demand.
- Don't require 2kB of stack anymore (requiring so much kernel stack
space should be considered as a bug in itself),
- Add support for 3.10 kernel printk instrumentation.
This patch uses "Introduce __dynamic_array_enc_ext_2() and
tp_memcpy_dyn_2()".
Fix: ring buffer: handle concurrent update in nested buffer wrap around check
With stress-test loads that trigger sub-buffer switch very frequently
(small 4kB sub-buffers, frequent flush), we currently observe warnings
like the following once every few minutes:
[65335.896208] ring buffer relay-overwrite-mmap, cpu 5: records were lost. Caused by:
[65335.896208] [ 0 buffer full, 1 nest buffer wrap-around, 0 event too big ]
It appears that the check for nested buffer wrap-around does not take
into account that concurrent execution contexts (either nested for
per-cpu buffers, or from another CPU or nested for global buffers) can
update the commit_count value concurrently.
What we really want to do with this check is to ensure that if we enter
a sub-buffer that had an unbalanced reserve/commit count, assuming there
is no hope that this gets rebalanced promptly, we detect this and drop
the current event. However, in the case where the commit counter has
been concurrently updated by another reserve or a switch, we want to
retry the entire reserve operation.
One way to detect this is to sample the reserve offset twice, around the
commit counter read, along with the appropriate memory barriers.
Therefore, we can detect if the mismatch between reserve and commit
counter is actually caused by a concurrent update, which necessarily has
updated the reserve counter.
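The double-sampling idea can be sketched in userspace as follows. This
is not the ring buffer code: C11 atomics stand in for the library's own
atomic primitives and barriers, and struct buf_model, the enum and both
function names are hypothetical.

```c
#include <stdatomic.h>

enum reserve_status { RESERVE_OK, RESERVE_RETRY, RESERVE_DROP };

/* Hypothetical, heavily simplified view of one buffer. */
struct buf_model {
	atomic_ulong reserve_offset;	/* advanced by every reserve or switch */
	atomic_ulong commit_count;	/* commit counter of the target sub-buffer */
};

/*
 * Decide what to do from the three sampled values. 'expected_commit'
 * is the commit count a balanced sub-buffer would show.
 */
enum reserve_status classify_mismatch(unsigned long before,
		unsigned long commit, unsigned long after,
		unsigned long expected_commit)
{
	if (commit == expected_commit)
		return RESERVE_OK;
	if (before != after)
		return RESERVE_RETRY;	/* a concurrent update moved the reserve offset */
	return RESERVE_DROP;		/* genuinely unbalanced: drop the event */
}

/*
 * Sample the reserve offset on both sides of the commit counter read.
 * The default (sequentially consistent) atomic loads keep the three
 * reads in program order.
 */
enum reserve_status check_wrap_around(struct buf_model *b,
		unsigned long expected_commit)
{
	unsigned long before, commit, after;

	before = atomic_load(&b->reserve_offset);
	commit = atomic_load(&b->commit_count);
	after = atomic_load(&b->reserve_offset);

	return classify_mismatch(before, commit, after, expected_commit);
}
```

A caller would loop on RESERVE_RETRY and restart the whole reserve
operation, and count a lost record only on RESERVE_DROP.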
Cleanup: lib_ring_buffer_switch_new_end() only calls subbuffer_set_data_size()
lib_ring_buffer_switch_new_end() is always called when an event exactly
fills a sub-buffer, which makes padding_size always 0. However, there is
one side-effect that lib_ring_buffer_switch_new_end() needs to have: it
calls subbuffer_set_data_size() to update the size of the data to be
read from the sub-buffer.
lib_ring_buffer_write(), lib_ring_buffer_memset() and
lib_ring_buffer_copy_from_user_inatomic() could be passed a length of 0.
This typically has no side-effect as far as writing into the buffers is
concerned, except for one detail: in overwrite mode, there is a check to
make sure the sub-buffer can be written into. This check is performed
even if length is 0. In the case where this would fall exactly at the
end of a sub-buffer, the check would fail, because the offset would fall
exactly at the beginning of the next sub-buffer.
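The boundary case can be reproduced with plain arithmetic. In the sketch
below, SUBBUF_SIZE, subbuf_index() and write_allowed() are hypothetical
names, and returning early for a zero length is an assumed remedy shown
only to contrast the two behaviours.

```c
#include <assert.h>
#include <stddef.h>

#define SUBBUF_SIZE 4096UL	/* assumed sub-buffer size */

/* Index of the sub-buffer that contains byte 'offset'. */
size_t subbuf_index(size_t offset)
{
	return offset / SUBBUF_SIZE;
}

/*
 * Return 1 if a write of 'len' bytes at 'offset' may proceed given that
 * the writer owns sub-buffer 'owned', 0 if the ownership check fails.
 * With skip_empty set, zero-length writes return before the check.
 */
int write_allowed(size_t offset, size_t len, size_t owned, int skip_empty)
{
	assert(len <= SUBBUF_SIZE);	/* a record never exceeds a sub-buffer here */
	if (skip_empty && len == 0)
		return 1;	/* nothing to write, nothing to check */
	return subbuf_index(offset) == owned;
}
```

An offset of 4096 with a length of 0 maps to sub-buffer 1 even though
the writer still owns sub-buffer 0, so the check only passes when empty
writes are skipped.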
Fix: ring buffer: RING_BUFFER_FLUSH ioctl buffer corruption
lib_ring_buffer_switch_slow() clearly states:
* Note, however, that as a v_cmpxchg is used for some atomic
* operations, this function must be called from the CPU which owns the
* buffer for a ACTIVE flush.
But unfortunately, the RING_BUFFER_FLUSH ioctl does not follow these
important directives. Therefore, whenever the consumer daemon or session
daemon explicitly triggers a "flush" on a buffer, it can race with data
being written to the buffer, leading to corruption of the reserve/commit
counters, and therefore corruption of data in the buffer. It triggers
these warnings for overwrite mode buffers:
[65356.890016] WARNING: at
/home/compudj/git/lttng-modules/wrapper/ringbuffer/../../lib/ringbuffer/../../wrapper/ringbuffer/../../lib/ringbuffer/backend.h:110 lttng_event_write+0x118/0x140 [lttng_ring_buffer_client_mmap_overwrite]()
Which indicates that we are trying to write into a sub-buffer for which
we don't have exclusive access. It also causes the following warnings to
show up:
[65335.896208] ring buffer relay-overwrite-mmap, cpu 5: records were lost. Caused by:
[65335.896208] [ 0 buffer full, 80910 nest buffer wrap-around, 0 event too big ]
which is caused by a corrupted commit counter.
Fix this by sending an IPI to the CPU owning the flushed buffer for
per-cpu synchronization. For global synchronization, no IPI is needed,
since we allow writes from remote CPUs.
Cleanup: ring buffer: remove lib_ring_buffer_switch_new_end()
lib_ring_buffer_switch_new_end() is a leftover from the days where an
event that would exactly fill the current sub-buffer would automatically
trigger a sub-buffer switch into the next sub-buffer.
Even before the ring buffer code was moved into lttng-modules, this
behavior had been changed: an event that exactly fills a sub-buffer only
fills this current sub-buffer, and does not need to switch into the
next one to populate the sub-buffer header. This change was made so that
the periodic timer switch, which shares the same semantics as an event
exactly filling a sub-buffer, would not create tons of empty
sub-buffers.
However, when this change was made, lib_ring_buffer_switch_new_end() was
not removed, but clearly should have been. Its job is now performed
by the event "commit".
lib_ring_buffer_switch_new_end() has no effect, since padding_size is
always 0.
Samuel Martin [Mon, 17 Jun 2013 14:28:51 +0000 (10:28 -0400)]
Fix build and load against linux-2.6.33.x
* lttng-event.h declared but did not implement
lttng_add_perf_counter_to_ctx on kernel >=2.6.33, the implementation
was in lttng-context-perf-counters.c, which was only included for
kernel >=2.6.34. This prevented the module from being loaded.
* on kernel 2.6.33.x, lttng-context-perf-counters.c complains about
implicit declaration for {get,put}_online_cpus and
{,un}register_cpu_notifier; so fix header inclusion.
Signed-off-by: Samuel Martin <smartin@aldebaran-robotics.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Implement a cache for the kernel metadata and a list of metadata
channels.
When new metadata is appended, all metadata channels are awakened so
they can return from poll and get the newly added metadata.
This allows the metadata to be requested multiple times by creating multiple
metadata channels (useful for snapshots).
With this new feature, the poll and get_subbuf ring buffer operations
are now overridden by lttng-abi for the metadata channels, to check the
cache before doing these operations.
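A minimal userspace model of a shared cache read by several independent
streams is sketched below. All type and function names are hypothetical,
the fixed-size array is an assumption made for brevity, and the real
implementation goes through the ring buffer rather than copying
directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_CAPACITY 256	/* assumed fixed size for the sketch */

struct metadata_cache {
	char data[CACHE_CAPACITY];
	size_t used;		/* bytes appended so far */
};

struct metadata_stream {
	const struct metadata_cache *cache;
	size_t read_pos;	/* how far this stream has consumed the cache */
};

/* Append text to the cache; returns 0 on success, -1 if it does not fit. */
int cache_append(struct metadata_cache *c, const char *s)
{
	size_t len = strlen(s);

	if (len > CACHE_CAPACITY - c->used)
		return -1;
	memcpy(c->data + c->used, s, len);
	c->used += len;
	return 0;
}

/* What poll would check: is there cached metadata this stream has not read? */
int stream_has_data(const struct metadata_stream *st)
{
	return st->read_pos < st->cache->used;
}

/* Copy up to 'max' unread bytes into 'out' and advance this stream only. */
size_t stream_read(struct metadata_stream *st, char *out, size_t max)
{
	size_t avail, n;

	assert(st->read_pos <= st->cache->used);
	avail = st->cache->used - st->read_pos;
	n = avail < max ? avail : max;
	memcpy(out, st->cache->data + st->read_pos, n);
	st->read_pos += n;
	return n;
}
```

Because each stream keeps its own read position, a second stream created
later (for a snapshot, for instance) still sees the metadata from the
beginning.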
Jan Glauber [Thu, 23 May 2013 11:35:16 +0000 (07:35 -0400)]
Fix CPU hotplug section mismatches
Get rid of the following section mismatches:
WARNING: /home/jang/temp/lttng-modules-2.2.0-r0/git/lttng-tracer.o(.text+0x19dc0): Section mismatch in reference from the function lttng_add_perf_counter_to_ctx() to the function .cpuinit.text:lttng_perf_counter_cpu_hp_callback()
The function lttng_add_perf_counter_to_ctx() references
the function __cpuinit lttng_perf_counter_cpu_hp_callback().
This is often because lttng_add_perf_counter_to_ctx lacks a __cpuinit
annotation or the annotation of lttng_perf_counter_cpu_hp_callback is wrong.
WARNING: /home/jang/temp/lttng-modules-2.2.0-r0/git/lib/lttng-lib-ring-buffer.o(.text+0x1204): Section mismatch in reference from the function channel_backend_init() to the function .cpuinit.text:lib_ring_buffer_cpu_hp_callback()
The function channel_backend_init() references
the function __cpuinit lib_ring_buffer_cpu_hp_callback().
This is often because channel_backend_init lacks a __cpuinit
annotation or the annotation of lib_ring_buffer_cpu_hp_callback is wrong.
WARNING: /home/jang/temp/lttng-modules-2.2.0-r0/git/lib/lttng-lib-ring-buffer.o(.text+0x269c): Section mismatch in reference from the function channel_create() to the function .cpuinit.text:lib_ring_buffer_cpu_hp_callback()
The function channel_create() references
the function __cpuinit lib_ring_buffer_cpu_hp_callback().
This is often because channel_create lacks a __cpuinit
annotation or the annotation of lib_ring_buffer_cpu_hp_callback is wrong.
WARNING: /home/jang/temp/lttng-modules-2.2.0-r0/git/lib/lttng-lib-ring-buffer.o(.text+0x4a1c): Section mismatch in reference from the function channel_iterator_init() to the function .cpuinit.text:channel_iterator_cpu_hotplug()
The function channel_iterator_init() references
the function __cpuinit channel_iterator_cpu_hotplug().
This is often because channel_iterator_init lacks a __cpuinit
annotation or the annotation of channel_iterator_cpu_hotplug is wrong.
Signed-off-by: Jan Glauber <jan.glauber@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jon Bernard [Mon, 13 May 2013 15:38:17 +0000 (11:38 -0400)]
Remove bashism in lttng-syscalls-generate-headers.sh
Options to echo are not portable. In particular, the 'echo -e' option is
implemented by some shells, including bash, to expand escape sequences.
However, dash belongs to the other family of shells, which instead
expand escape sequences by default.
The printf command is portable and much more reliable.
We don't use this event anymore: since we write the metadata directly
into the ring buffer, there is no need for an external event. This probe was
the only one in the lttng-probe-lttng module, so we can get rid of
this module as well.