Two variables in ext4_inode_info, i_reserved_meta_blocks and
i_allocated_meta_blocks, are unused. Removing them saves a little
memory per in-memory inode and cleans up clutter in several tracepoints.
Adjust tracepoint output from ext4_alloc_da_blocks() for consistency
and fix a typo and whitespace near these changes.
Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: Sleeping function called from invalid context
It affects system call instrumentation for accept, accept4 and connect,
only on the x86-64 architecture.
We need to use the LTTng accessing functions to touch user-space memory,
which take care of disabling the page fault handler, so we don't preempt
while in preempt-off context (tracepoints disable preemption).
The "pid" notion exposed by LTTng translates to the "pgid" notion in the
Linux kernel. Therefore using "current->pid" as argument to the PID
tracker actually ends up behaving as a "tid" tracker, which matches
neither the intent nor the user-space tracer behavior.
Fix: NULL pointer dereference of THIS_MODULE with built-in modules
THIS_MODULE is defined to 0 when a module is built into the kernel [1].
This caused NULL pointer dereference when booting a kernel with the
lttng-modules built-in.
To fix this issue, add #if guard around the wrapper_lttng_fixup_sig
function checking if the MODULE macro is defined to confirm that this
piece of code will end up in a module and not in the kernel itself.
Fix: add "flush empty" ioctl for stream intersection
Changing the behavior of the "snapshot" lttng command to implicitly do a
buffer "flush" (even when current packet is empty) had unwanted
side-effects: for instance, the snapshot ABI is used by the live timer
to grab the buffer positions, and we don't want to generate useless
empty packets in that scenario.
Therefore, add the "flush empty" behavior as a new ioctl to the ring
buffer. This allows lttng-tools to perform buffer flush (even for empty
packets) when it needs to. Given that this new ioctl is added within
stable branches as well, lttng-tools always needs to handle "-ENOSYS"
gracefully.
There is no need to bump the LTTNG_MODULES_ABI_MINOR_VERSION
since the multiple wildcard feature introduced as part of the 2.10
release already bumps it from 2 to 3.
Use SIZE_MAX instead of -1ULL for size_t parameter
strutils_star_glob_match() receives a size_t. Passing -1ULL truncates
the value implicitly on systems where size_t is 32-bit. It is cleaner to
use SIZE_MAX.
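As a minimal user-space sketch of the point above (the helper names are hypothetical and not part of lttng-modules), converting -1ULL to size_t yields the same value as SIZE_MAX, but only through an implicit narrowing conversion when size_t is 32-bit. Spelling the constant as SIZE_MAX states the intent directly:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a function taking a size_t length, where the
 * maximum value means "no explicit bound, rely on the NUL terminator".
 */
static int length_is_unbounded(size_t len)
{
	return len == SIZE_MAX;
}

/*
 * What -1ULL becomes once converted to size_t. The conversion is modular,
 * so the result equals SIZE_MAX, but the upper bits are silently dropped
 * when size_t is narrower than unsigned long long.
 */
static size_t converted_minus_one(void)
{
	return (size_t) -1ULL;
}
```

Both spellings produce the same value at run time; the difference is that SIZE_MAX already has the right type and needs no conversion.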
Philippe Proulx [Sun, 19 Feb 2017 01:01:34 +0000 (20:01 -0500)]
Add support for star globbing patterns in event names
This patch adds support for full star-only globbing patterns used in
the event names (enabler names).
strutils_star_glob_match() is always used to perform the match when
the enabler is LTTNG_ENABLER_STAR_GLOB. This enabler is set when it is
detected that its name contains at least one non-escaped star with
strutils_is_star_glob_pattern().
The match is performed by strutils_star_glob_match(), the same function
that the filter interpreter uses.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Philippe Proulx [Sun, 19 Feb 2017 01:04:11 +0000 (20:04 -0500)]
Filtering: add support for star-only globbing patterns
This patch adds the support for "full" star-only globbing patterns to be
used in filter literal strings. A star-only globbing pattern is a
globbing pattern with the star (`*`) being the only special character.
This means `?` and character sets (`[abc-k]`) are not supported here. We
cannot support them without a strategy to differentiate the globbing
pattern because `?` and `[` are not special characters in filter literal
strings right now. The eventual strategy to support them would probably
look like this:
filename =* "?sys*.[ch]"
The filter bytecode generator in LTTng-tools's session daemon creates
the new FILTER_OP_LOAD_STAR_GLOB_STRING operation when the interpreter
should load a star globbing pattern literal string. Even if both
"plain" (or legacy) strings and star globbing pattern strings are literal
strings, they do not represent the same thing: the == and !=
operators act differently on them.
The validation process checks that:
1. There's no binary operator between two
FILTER_OP_LOAD_STAR_GLOB_STRING operations. It is illegal to compare
two star globbing patterns, as this is not trivial to implement, and
completely useless as far as I know.
2. Only the == and != binary operators are allowed between a
star globbing pattern and a string.
For the special case of star globbing patterns with a star at the end
only, the current behaviour is not changed to preserve a maximum of
backward compatibility. This is also why the ABI version is changed from
2.2 to 2.3, not to 3.0.
== or != operations between REG_STRING and REG_STAR_GLOB_STRING
registers are specialized to FILTER_OP_EQ_STAR_GLOB_STRING and
FILTER_OP_NE_STAR_GLOB_STRING. Which side is the actual globbing pattern
(the one with the REG_STAR_GLOB_STRING type) is checked at execution
time. The strutils_star_glob_match() function is used to perform the
match operation. See the implementation for more details.
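To illustrate what a star-only match means, here is a minimal user-space sketch. It is not the strutils_star_glob_match() implementation: the function name is hypothetical, it works on NUL-terminated strings only, and it ignores escaped stars, which the real code has to handle.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Star-only glob match: `*` matches any sequence of characters (including
 * none); every other character must match literally.
 */
static bool star_glob_match_sketch(const char *pattern, const char *candidate)
{
	const char *retry_p = NULL;	/* Pattern position after the last star. */
	const char *retry_c = NULL;	/* Candidate position to retry from. */

	while (*candidate) {
		if (*pattern == '*') {
			/* Remember where to resume; first try matching nothing. */
			retry_p = ++pattern;
			retry_c = candidate;
		} else if (*pattern == *candidate) {
			pattern++;
			candidate++;
		} else if (retry_p) {
			/* Mismatch: let the last star absorb one more character. */
			pattern = retry_p;
			candidate = ++retry_c;
		} else {
			return false;
		}
	}
	/* Candidate exhausted: only trailing stars may remain. */
	while (*pattern == '*')
		pattern++;
	return *pattern == '\0';
}
```

With this sketch, a pattern such as `sched_*` matches `sched_switch`, while `?` and `[` are treated as ordinary characters, in line with the restriction described above.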
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: use of uninitialized ret value in lttng_abi_open_metadata_stream
Fixes the following compiler warning:
lttng-abi.c: In function ‘lttng_metadata_ioctl’:
lttng-abi.c:971:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
int ret;
^
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The underlying type of `struct kref` changed in kernel 4.11 from an
atomic_t to a refcount_t. This change was introduced in kernel
commit 10383ae. This commit also added built-in overflow checks to
`kref_get()`, so we use it.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: atomic_add_unless() returns true/false rather than prior value
The previous implementation assumed that `atomic_add_unless` returned
the prior value of the atomic counter, when in fact it returns whether
the addition was performed (true) or not (false).
Since `atomic_add_unless` cannot return INT_MAX, `lttng_kref_get`
always reported that the call was successful.
This issue had a low likelihood of being triggered since the two
refcounts used with this call are both bounded by the maximum
number of file descriptors on the system.
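The return-value semantics can be modeled in user space with C11 atomics. This is only a sketch under that assumption: add_unless_sketch() and refcount_get_sketch() are hypothetical names mimicking the behavior described above, not the kernel or lttng-modules code.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Model of atomic_add_unless(): add 'a' to '*v' unless '*v' equals 'u'.
 * Returns true if the addition was performed, false otherwise. It does
 * NOT return the prior value of the counter.
 */
static bool add_unless_sketch(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	do {
		if (old == u)
			return false;
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(v, &old, old + a));
	return true;
}

/* Take a reference, refusing to overflow past INT_MAX. */
static bool refcount_get_sketch(atomic_int *refcount)
{
	return add_unless_sketch(refcount, 1, INT_MAX);
}
```

A caller that compares the result against INT_MAX would never see a failure, since the result is only ever true or false; testing the boolean directly is what makes the overflow check effective.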
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
On 32-bit systems, the algorithm within lttng-modules that ensures the
nmi-safe clock increases monotonically on a CPU assumes at least one
clock read per 32-bit LSB overflow period, which is not guaranteed. It
also has an issue on the first clock reads after module load, because
the initial value for the last LSB is 0. It can cause the time to stay
stuck at the same value for a few seconds at the beginning of the trace,
which is unfortunate for the first trace after module load, because this
is where the offset between realtime and trace_clock is sampled, which
prevents correlation of kernel and user-space traces for that session.
It only affects 32-bit systems with kernels >= 3.17.
Fix this by using the non-nmi-safe clock source on 32-bit systems.
While we are there, remove an implementation-defined C99 behavior
regarding casting u64 to long by using unsigned arithmetic instead:
turn:
if (((long) now - (long) last) < 0)
into:
if (U64_MAX / 2 < now - last)
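The unsigned form can be tried in user space as follows. This is a small sketch using uint64_t and UINT64_MAX in place of the kernel's u64 and U64_MAX; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when 'now' is behind 'last'. Unsigned subtraction wraps
 * modulo 2^64, so a "negative" difference shows up as a value in the
 * upper half of the range, with no signed cast involved.
 */
static bool clock_went_backwards(uint64_t now, uint64_t last)
{
	return UINT64_MAX / 2 < now - last;
}
```

A counter wrap-around from UINT64_MAX to 0 yields a difference of 1 and is therefore still treated as forward progress.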
from /home/compudj/git/lttng-modules/lttng-context-perf-counters.c:23:
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c: In function ‘lttng_add_perf_counter_to_ctx’:
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:22: error: ‘cpu’ undeclared (first use in this function)
for_each_online_cpu(cpu) {
^
./include/linux/cpumask.h:223:8: note: in definition of macro ‘for_each_cpu’
for ((cpu) = -1; \
^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:2: note: in expansion of macro ‘for_each_online_cpu’
for_each_online_cpu(cpu) {
^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:22: note: each undeclared identifier is reported only once for each function it appears in
for_each_online_cpu(cpu) {
^
./include/linux/cpumask.h:223:8: note: in definition of macro ‘for_each_cpu’
for ((cpu) = -1; \
^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:2: note: in expansion of macro ‘for_each_online_cpu’
for_each_online_cpu(cpu) {
^
./include/linux/cpumask.h:224:38: warning: left-hand operand of comma expression has no effect [-Wunused-value]
(cpu) = cpumask_next((cpu), (mask)), \
^
./include/linux/cpumask.h:717:36: note: in expansion of macro ‘for_each_cpu’
#define for_each_online_cpu(cpu) for_each_cpu((cpu), cpu_online_mask)
^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:2: note: in expansion of macro ‘for_each_online_cpu’
for_each_online_cpu(cpu) {
^
scripts/Makefile.build:289: recipe for target '/home/compudj/git/lttng-modules/lttng-context-perf-counters.o' failed
make[2]: *** [/home/compudj/git/lttng-modules/lttng-context-perf-counters.o] Error 1
make[2]: *** Waiting for unfinished jobs....
Fix: bump stable kernel version ranges for clock work-around
Linux commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), changed the logic to open-code
the timekeeping_get_ns() function, but forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts LTTng.
We expected Linux commit 58bfea9532 "timekeeping: Fix
__ktime_get_fast_ns() regression" to make its way into stable
kernels promptly, but it appears new stable kernel releases were
done before the fix was cherry-picked from the master branch.
We therefore need to bump the version ranges for the work-around
in lttng-modules.
Linux commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), changed the logic to open-code
the timekeeping_get_ns() function, but forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts LTTng.
The following kernel versions are affected: 4.8, 4.7.4+, 4.4.20+,
4.1.32+
We expect that the upstream fix will reach the master and stable
branches in a timely manner before the next releases, so we use 4.8.1, 4.7.7,
4.4.24, and 4.1.34 as upper bounds (exclusive).
Fall-back to the non-NMI-safe trace clock for those kernel versions.
We simply discard events from NMI context with an in_nmi() check,
as we did before Linux 3.17.
Simon Marchi [Tue, 4 Oct 2016 21:07:05 +0000 (17:07 -0400)]
Add support for i2c tracepoints
This patch teaches lttng-modules about the i2c tracepoints in the Linux
kernel.
It contains the following tracepoints:
* i2c_write
* i2c_read
* i2c_reply
* i2c_result
I translated the fields and assignments from the kernel's
include/trace/events/i2c.h as well as I could. I also tried building
this module against a kernel without CONFIG_I2C, and it built fine (the
required types are unconditionally defined). So I don't think any "#if
CONFIG_I2C" or similar are required.
A module parameter (extract_sensitive_payload) controls the extraction
of possibly sensitive data from events.
[ With edit by Mathieu Desnoyers. ]
Signed-off-by: Simon Marchi <simon.marchi@ericsson.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Version checks in makefiles should always be a disjunctive normal form
where the conjunctions consist of one or more "equals" comparisons and
at most a single greater-or-equal comparison.
Because all length parameters received for serializing data coming from
applications go through a callback, they are never constant, and it
hurts performance to perform a call to memcpy each time.
Performance: cache the backend pages pointer in context
Getting the backend pages pointer requires pointer chasing through the
ring buffer backend tables. Cache the current value so it can be re-used
for all backend write operations writing fields for the same event.
Performance: Relax atomicity constraints for crash handling
Use a store rather than a cmpxchg() for the update of the
sequential commit counter. This speeds up commit. The downside
is that short race windows between the if() check to see if the
counter is larger than the new value and the update could result
in the counter going backwards, in unlikely preemption or signal
delivery scenarios.
Accept that we may lose a few events in a crash dump for the
benefit of tracing speed.
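The trade-off can be sketched with C11 atomics. Both functions below are hypothetical user-space models of the two update strategies, not the ring buffer code itself.

```c
#include <stdatomic.h>

/* Strict variant: a compare-and-exchange loop never moves the counter back. */
static void seq_update_cmpxchg(atomic_long *seq, long new_value)
{
	long old = atomic_load(seq);

	while (old < new_value &&
	       !atomic_compare_exchange_weak(seq, &old, new_value))
		;	/* 'old' was refreshed by the failed exchange; re-check. */
}

/*
 * Relaxed variant: check, then plain store. Cheaper, but if another
 * context updates the counter between the check and the store, a larger
 * value can be overwritten by a smaller one.
 */
static void seq_update_store(atomic_long *seq, long new_value)
{
	if (atomic_load_explicit(seq, memory_order_relaxed) < new_value)
		atomic_store_explicit(seq, new_value, memory_order_relaxed);
}
```

Without concurrent updates the two variants behave identically; they differ only in the short race window between the check and the store.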
Performance: mark ring buffer do_copy callers always inline
The underlying copy operation is more efficient if the size is a
constant, which only happens if this function is inlined in the caller.
Otherwise, we end up calling memcpy for each field.
Force inlining for performance reasons for:
- lib_ring_buffer_do_strcpy,
- lib_ring_buffer_do_strcpy_from_user_inatomic,
- lib_ring_buffer_copy_from_user_inatomic.
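The effect of forced inlining can be sketched as follows, assuming a GCC or Clang compiler for the always_inline attribute. The function names are hypothetical; the real helpers are the ones listed above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * When this function is inlined into a caller passing a constant 'len',
 * the compiler can resolve the switch at compile time and turn the
 * fixed-size memcpy into a plain load/store. If it is not inlined,
 * 'len' is a run-time value and every field costs a branch and,
 * typically, a memcpy call.
 */
static inline __attribute__((always_inline))
void copy_field_sketch(void *dest, const void *src, size_t len)
{
	switch (len) {
	case 1:
		memcpy(dest, src, 1);
		break;
	case 2:
		memcpy(dest, src, 2);
		break;
	case 4:
		memcpy(dest, src, 4);
		break;
	case 8:
		memcpy(dest, src, 8);
		break;
	default:
		memcpy(dest, src, len);
	}
}

/* Caller with a compile-time constant size. */
static void write_u32_sketch(uint8_t *buf, uint32_t value)
{
	copy_field_sketch(buf, &value, sizeof(value));
}
```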
Performance: mark lib_ring_buffer_write always inline
The underlying copy operation is more efficient if the size is a
constant, which only happens if this function is inlined in the caller.
Otherwise, we end up calling memcpy for each field.
Performance improvement changelog from lttng-ust, ported back to
lttng-modules:
Disable event counting in the ring buffer, which can count the number of
events produced per ring-buffer, as well as the number of events
overwritten in overwrite mode.
This feature is currently unused anyway: it is not saved in the ring
buffer header, nor made available to lttng-tools.
This saves 70 ns/event in lttng-ust on the ARM32 Cubietruck.
Fix: handle large number of pages or subbuffers per buffer
Do not trigger kernel console warnings when we try to allocate too many
pages, or too large a kmalloc area for the page array (within a subbuffer)
or the sub-buffer array (within a buffer).
Use vmalloc/vfree for the "pages" local variable used only during
allocation, which is an array of nr_subbuf * nr_pages_per_subbuf
pointers. This ensures we do not limit the overall buffer size due to
kmalloc limitations.
Fix: unregister cpu hotplug notifier on buffer alloc error
The cpu hotplug notifier needs to be unregistered in the error path of
buffer allocation, else it eventually causes a kernel OOPS when the kernel
accesses freed memory of the notifier block.
Fixes #1031
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
sa_family is an unsigned short in sockaddr definitions. For instance,
the kernel's unix_getname() function sets addrlen to sizeof(short) as it
only returns the socket's family.
Fix: check for sizeof sa_family to save sa_family in accept and connect
The check of addrlen >= sizeof(struct sockaddr) is too restrictive
and causes sa_family to not be saved in the case of AF_UNIX sockets
as the addrlen returned by the syscall may be only sizeof(short).
Individual checks per socket family are performed anyhow in the
switch case, making this safe.
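The difference between the two checks can be sketched in user space with the POSIX socket types. The function names are hypothetical and only model the length test, not the instrumentation code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Previous check: requires room for a whole struct sockaddr. */
static bool can_read_family_strict(socklen_t addrlen)
{
	return addrlen >= sizeof(struct sockaddr);
}

/* Relaxed check: only requires room for the sa_family field itself. */
static bool can_read_family(socklen_t addrlen)
{
	return addrlen >= sizeof(sa_family_t);
}
```

An address length of sizeof(short), as described above for AF_UNIX sockets, passes the relaxed check but fails the strict one.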
This patch adds an instrumentation override for the accept4() syscall
which is almost identical to accept(), except for an additional
"flags" parameter.
A follow-up patch refactors both overrides to minimize code
duplication as is done for the select/pselect6 overrides.
Performance improvement changelog from lttng-ust, ported back to
lttng-modules:
On ARMv7l (Cubietruck), the compiler generates a function call for each
lib_ring_buffer_check_deliver, even though it typically only does an
unlikely check. Split it into an inline fast path, and a function call
for the slow path. This brings a performance gain of about 500ns/event
on the Cubietruck.
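The fast path / slow path split can be sketched as follows, assuming GCC or Clang for the noinline attribute and __builtin_expect. The structure and function names are hypothetical simplifications and do not reflect the real ring buffer state.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified buffer state. */
struct buf_sketch {
	unsigned long commit_count;
	unsigned long subbuf_size;
	unsigned long delivered;
};

/* Slow path: kept out of line, only reached when a sub-buffer is full. */
static __attribute__((noinline))
void check_deliver_slow_sketch(struct buf_sketch *buf)
{
	buf->delivered++;
	buf->commit_count = 0;
}

/*
 * Fast path: a single unlikely comparison inlined into the caller, so the
 * common case pays no function call.
 */
static inline void check_deliver_sketch(struct buf_sketch *buf)
{
	if (__builtin_expect(buf->commit_count == buf->subbuf_size, 0))
		check_deliver_slow_sketch(buf);
}
```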
Michael Jeanson [Thu, 28 Jul 2016 16:12:11 +0000 (12:12 -0400)]
Fix: Use fs_initcall instead of rootfs_initcall
The rootfs_initcall for drivers built as modules was only introduced in
kernel 3.14 by commit b46d0c46ccaa366a5bb8ac709fdf2bcaa76221fd. Use
fs_initcall instead which comes just before and exists in older kernels.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Anders Wallin [Fri, 22 Jul 2016 14:10:47 +0000 (16:10 +0200)]
Fix: Add kernel configuration for lttng clock plugin
Only one lttng clock plugin can be used when building the lttng-modules
in the kernel. To make it possible to use a custom clock plugin, it must
be possible to unconfigure the test clock plugin.
Signed-off-by: Anders Wallin <wallinux@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Anders Wallin [Fri, 22 Jul 2016 13:56:59 +0000 (15:56 +0200)]
Fix: the clock plugin must be initiated before first use of the clock
When building lttng inside the kernel, the clock plugin must be initiated
before the rest of the lttng code. Moved the module_init to
rootfs_initcall. The functionality will not change when built as a
module.
Signed-off-by: Anders Wallin <wallinux@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>