Errors returned by lttng_kernel_event_create are handled by the caller,
and may happen e.g. when a kprobe or kretprobe symbol does not exist.
It should not generate a warning in the kernel console.
Fix: circular dependency on symbol lttng_id_tracker_lookup
Adding lttng_id_tracker_lookup feature into kprobes, uprobes and
kretprobes introduces a circular dependency between lttng-tracer.ko and
the respective probe modules.
There is no real reason for having the kprobes/uprobes/kretprobes
modules separate from the tracer core, so combine those.
Commit 0badc02f82b38 ("Fix: adjust SLE version ranges to build with SP2
and SP3") introduced code duplication. Modify the version match logic to
remove duplicated code.
Also remove the confusing comment about checking if a fd exists. I
could not find one instance in the entire kernel that still matches
the description or the reason for the name fcheck.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
Michael Jeanson [Wed, 29 May 2024 19:02:15 +0000 (15:02 -0400)]
Warn and return on fd overflow fdt
The fdt should only grow and iterate_fd() holds file_lock, which should
ensure the fdt does not change while the lock is taken but be cautious
and check anyway.
Change-Id: Icd6a3263026734cbe3f296f6087f79add4148a8f Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
net: udp: add IP/port data to the tracepoint udp/udp_fail_queue_rcv_skb
The udp_fail_queue_rcv_skb() tracepoint lacks any details on the source
and destination IP/port whereas this information can be critical in case
of UDP/syslog.
Change-Id: I0c337c5817b0a120298cbf5088d60671d9625b0d Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
btrfs: move ->parent and ->ref_root into btrfs_delayed_ref_node
These two members are shared by both the tree refs and data refs, so
move them into btrfs_delayed_ref_node proper. This allows us to greatly
simplify the comparison code, as the shared refs always only sort on
parent, and the non shared refs always sort first on ref_root, and then
only data refs sort on their specific fields.
Change-Id: Ib7c92cc4bb8d674ac66ccfa25c03476f7adaaf90 Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Now that all of the delayed ref information is in the delayed ref node,
drastically simplify the delayed ref tracepoints by simply passing in
the btrfs_delayed_ref_node and populating the tracepoints with the
values from the structure itself.
Change-Id: Ic90bc23d6aa558baec33adc33b4d21e052e83375 Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 8 May 2024 18:20:30 +0000 (14:20 -0400)]
fix: Add missing 'pselect6_time32' and 'ppoll_time32' syscall overrides
The instrumentation currently has overrides to the generated syscall
tracepoints of 'ppoll' and 'pselect6' to extract additional information
from the parameters.
On arm-32 and x86-32 these 2 syscalls were renamed to 'ppoll_time32' and
'pselect6_time32' and new syscalls using 64-bit time_t were introduced
with the old names. This results in missing overrides on these
architectures for the 32-bit variants that were renamed.
Add the '_time32' overrides to restore the previous behavior.
Change-Id: I81e3a3ddc3f3cea58d86edcdf4a1fc9b600637c2 Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The canary __canary__get_pfnblock_flags_mask has never done its job of
detecting changes to the prototype of get_pfnblock_flags_mask because
it was actually calling the wrapper, because the wrapper/page_alloc.h
header maps get_pfnblock_flags_mask to wrapper_get_pfnblock_flags_mask.
Unfortunately, this wrapper is included by page_alloc.c only _after_ the
linux/pageblock-flags.h header is included, which means the
get_pfnblock_flags_mask prototype does _not_ have the wrapper prefix,
which prevents it from being useful for any kind of type validation.
This has been detected by a compiler warning stating that
wrapper_get_pfnblock_flags_mask() does not have a prior declaration.
Move the wrapper/page_alloc.h include _before_ including
pageblock-flags.h. This ensures the declaration has the wrapper_ prefix,
and therefore the compiler compares the declaration with the definition
of wrapper_get_pfnblock_flags_mask within page_alloc.c. The canary
function can be removed because it is redundant with this type check.
With this proper type check in place, we notice the following two
changes upstream:
commit 535b81e209219 ("mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()")
introduced in v5.9 removes the end_bitidx argument.
commit ca891f41c4c79 ("mm: constify get_pfnblock_flags_mask and get_pfnblock_migratetype")
introduced in v5.14 adds a const qualifier to the struct page pointer.
Adapt the code to match the evolution of this prototype.
Upstream commit f8c7511db009d ("block: make block_class constant")
makes the block_class const. Reflect this change in the lttng-modules
canary function.
Naming timestamps "TSC" or "tsc" is an historical artefact dating from
the implementation of libringbuffer, where the initial intent was to use
the x86 "rdtsc" instruction directly, which ended up not being what was
done in reality.
Rename uses of "TSC" and "tsc" to "timestamp" to clarify things and
don't require reviewers to be fluent in x86 instruction set.
Timers are added to the timer wheel off by one. This is required in
case a timer is queued directly before incrementing jiffies to prevent
early timer expiry.
When reading a timer trace and relying only on the expiry time of the timer
in the timer_start trace point and on the now in the timer_expiry_entry
trace point, it seems that the timer fires late. With the current
timer_expiry_entry trace point information only now=jiffies is printed but
not the value of base->clk. This makes it impossible to draw a conclusion
to the index of base->clk and makes it impossible to examine timer problems
without additional trace points.
Therefore add the base->clk value to the timer_expire_entry trace
point, to be able to calculate the index the timer base is located at
during collecting expired timers.
Change-Id: I2ebdbb637db0966ff51f45bf66916a59a496b50c Signed-off-by: Kienan Stewart <kstewart@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Upstream Linux commit 8690bbcf3b7 ("Introduce cpu_dcache_is_aliasing()
across all architectures") allows checking whether the architecture has
aliasing data caches more accurately. This will be present in upstream
Linux v6.9 (currently in v6.9-rc3).
I expect this to improve the ring buffer performance on ARM64 and
32-bit ARM with non-aliasing data caches.
mm: compaction: update the cc->nr_migratepages when allocating or freeing the freepages
Currently we will use 'cc->nr_freepages >= cc->nr_migratepages' comparison
to ensure that enough freepages are isolated in isolate_freepages(),
however it just decreases the cc->nr_freepages without updating
cc->nr_migratepages in compaction_alloc(), which will waste more CPU
cycles and cause too many freepages to be isolated.
So we should also update the cc->nr_migratepages when allocating or
freeing the freepages to avoid isolating excess freepages. And I can see
fewer free pages are scanned and isolated when running thpcompact on my
Arm64 server:
There are clearly multiple calls, one per component, but they cannot be
discriminated from each other.
Change the ftrace events to also print the component name, to make it clear
which part of the code is involved. This requires changing the passed value
from a struct snd_soc_card, where the DAPM context is not kwown, to a
struct snd_soc_dapm_context where it is obviously known but the a card
pointer is also available.
Kienan Stewart [Fri, 22 Mar 2024 13:55:55 +0000 (09:55 -0400)]
Fix: support ext4_journal_start on EL 8.4+
The lower value of the EL range, 240.15.1, corresponds to the first
import of EL r8 kernels into Rocky Linux's kernel staging repo.
The change may have been introduced in an earlier RHEL 8 kernel,
prior to the history of imports into Rocky.
Change-Id: Ibec02b382478bee33947d079f33835823827f4c5 Signed-off-by: Kienan Stewart <kstewart@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 22 Mar 2024 13:28:08 +0000 (09:28 -0400)]
Fix: build kvm probe on EL 8.4+
The lower value of the EL range, 240.15.1, corresponds to the first
import of EL r8 kernels into Rocky Linux's kernel staging repo.
The change may have been introduced in an earlier RHEL 8 kernel,
prior to the history of imports into Rocky.
Change-Id: Icefe472d43e28cc09746e9e046b12299609ebab1 Signed-off-by: Kienan Stewart <kstewart@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 8 Mar 2024 17:47:06 +0000 (12:47 -0500)]
Fix: Correct minimum version in jbd2 SLE kernel range
This range was introduced in commit b49650509ff072d37ec112cf45a5f14f382c9a31;
however, the range is wrong and worked because the kernel versions
(eg. `5.14.21-150400.24.100-default`) were evaluated to values
greater than `LTTNG_SLE_KERNEL_RANGE(5,14,21,24,46,1)`.
As a result builds of lttng-modules against older versions of SLE
kernels failed.
Change-Id: I23d97d84a23c7b24e957fe943932d6aefbe1b409 Signed-off-by: Kienan Stewart <kstewart@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 8 Mar 2024 16:26:02 +0000 (11:26 -0500)]
Fix: Handle recent SLE major version codes
Starting in early 2022, the SLE linux version codes changed from the
previous style `5.3.18-59.40.1` to a new convention in which the major
version is a compound number consisting of the major release version,
the service pack version, and the auxillary version (currently unused
from my understanding) similar to the following `5.3.18-150300.59.43.1`[1].
The newer values used in the SLE major version causes the integer
value to "overflow" the expected number of digits and the comparisons
may fail. The `LTTNG_SLE_KERNEL_VERSION` macro also multiplies the
`LTTNG_KERNEL_VERSION` by `100000000ULL` which doesn't work in all
situations, as the resulting value is too large to be stored fully in
an `unsigned long long`.
Example of previous results:
```
// Example range comparison. True or false depending on the value of
// `LTTNG_SLE_VERSION_CODE` and `LTTNG_LINUX_VERSION_CODE`.
LTTNG_SLE_KERNEL_RANGE(5,15,21,150400,24,46, 5,15,0,0,0,0);
`LTTNG_KERNEL_VERSION` packs the kernel version into a 32-bit integer;
however, using that type of packing on the SLE kernel version will not
work well:
In this patch, the SLE version is packed into a 64-bit integer
with 48 bits for the major version, 8 bits for each of the minor and
patch versions.
As a result of packing the SLE version into a 64-bit integer,
it is not possible to coherently combine an `LTTNG_KERNEL_VERSION` and
an `LTTNG_SLE_KERNEL_VERSION`. Doing so would require an integer
larger than 64-bits. Therefore, the `LTTNG_SLE_KERNEL_RANGE` macro has
been adjusted to perform the range comparisons using the two values
separately. The usage of the `LTTNG_SLE_KERNEL_RANGE` remains
unchanged, as `LTTNG_SLE_VERSION` is only used inside that macro.
Using the adjusted macros:
```
// Example range comparison. True or false depending on the value of
// `LTTNG_SLE_VERSION_CODE` and `LTTNG_LINUX_VERSION_CODE`.
LTTNG_SLE_KERNEL_RANGE(5,15,21,150400,24,46, 5,15,0,0,0,0);
It's possible that future releases of SLE kernels have minor or patch
values that exceed 255 (SLE15SP1 has a release using `197`, for example),
requiring an adjustment to using more bits for those fields when
packing into a 64-bit integer.
The schema of multiplying an `LTTNG_KERNEL_VERSION` by a large value
is used for other distributions. RHEL in particular uses
`100000000ULL`, which could lead to overflow issues with certain
comparisons similar to the previous behaviour of
`LTTNG_SLE_KERNEL_VERSION(5,15,0,0,0,0);`.
Martin Hicks [Fri, 26 Jan 2024 17:18:33 +0000 (12:18 -0500)]
Compile fixes for RHEL 9.3 kernels
The ranges were build tested on RHEL9.2 (5.14.0-284.11.1), RHEL9.3
(5.14.0-362.8.1) and RHEL8.9 (4.18.0-513.11.1).
This disables the kmem and compaction modules. I don't believe getting
these to compile will be easy, as the required struct declarations are
in vmlinux.h, and haven't been moved into mm/internal.h and mm/slab.h in
the RHEL sources.
Change-Id: I999c593d6850e2327f6e9df8432a4ea2325a7cea Signed-off-by: Martin Hicks <martin@sr-research.com> Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Thu, 1 Feb 2024 19:02:19 +0000 (14:02 -0500)]
Remove strlcpy usage
The replacement for `strlcpy`, `strscpy`, was introduced in Linux
4.3. As lttng-modules master aims to support Linux 4.4+ at this time,
the version check can safely be removed.
The strscpy() API is intended to be used instead of strlcpy(),
and instead of most uses of strncpy().
- Unlike strlcpy(), it doesn't read from memory beyond (src + size).
- Unlike strlcpy() or strncpy(), the API provides an easy way to check
for destination buffer overflow: an -E2BIG error return value.
- The provided implementation is robust in the face of the source
buffer being asynchronously changed during the copy, unlike the
current implementation of strlcpy().
- Unlike strncpy(), the destination buffer will be NUL-terminated
if the string in the source buffer is too long.
- Also unlike strncpy(), the destination buffer will not be updated
beyond the NUL termination, avoiding strncpy's behavior of zeroing
the entire tail end of the destination buffer. (A memset() after
the strscpy() can be used if this behavior is desired.)
- The implementation should be reasonably performant on all
platforms since it uses the asm/word-at-a-time.h API rather than
simple byte copy. Kernel-to-kernel string copy is not considered
to be performance critical in any case.
Change-Id: I31fefde148d5b63a30532fbcb4b59bb951bba902 Signed-off-by: Kienan Stewart <kstewart@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>