Mathieu Desnoyers [Mon, 18 Nov 2024 15:48:45 +0000 (10:48 -0500)]
Version 2.13.16
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0521ab7dff0124ecbf14cdbfc9df295d52c9e128
Michael Jeanson [Tue, 12 Nov 2024 16:19:23 +0000 (11:19 -0500)]
fix: mm/page_alloc: fix tracepoint mm_page_alloc_zone_locked() (v5.15.171)
See upstream backported commit:
commit
28e7a507196fefd119e7ca2286840f1a9aad5e8a
Author: Wonhyuk Yang <vvghjk1234@gmail.com>
Date: Thu May 19 14:08:54 2022 -0700
mm/page_alloc: fix tracepoint mm_page_alloc_zone_locked()
[ Upstream commit
10e0f7530205799e7e971aba699a7cb3a47456de ]
Currently, trace point mm_page_alloc_zone_locked() doesn't show correct
information.
First, when alloc_flag has ALLOC_HARDER/ALLOC_CMA, page can be allocated
from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, tracepoint use
requested migration type not MIGRATE_HIGHATOMIC and MIGRATE_CMA.
Second, after commit
44042b4498728 ("mm/page_alloc: allow high-order pages
to be stored on the per-cpu lists") percpu-list can store high order
pages. But trace point determine whether it is a refiil of percpu-list by
comparing requested order and 0.
To handle these problems, make mm_page_alloc_zone_locked() only be called
by __rmqueue_smallest with correct migration type. With a new argument
called percpu_refill, it can show roughly whether it is a refill of
percpu-list.
Link: https://lkml.kernel.org/r/20220512025307.57924-1-vvghjk1234@gmail.com
Change-Id: Ib76feb79d95e9f93c84c3aa1b946e57ac2e2666a
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 17 Oct 2024 15:56:02 +0000 (11:56 -0400)]
Fix: uprobes: make uprobe_register() return struct uprobe * (v6.12)
See upstream commits :
commit
3c83a9ad0295eb63bdeb81d821b8c3b9417fbcac
Author: Oleg Nesterov <oleg@redhat.com>
Date: Thu Aug 1 15:27:34 2024 +0200
uprobes: make uprobe_register() return struct uprobe *
This way uprobe_unregister() and uprobe_apply() can use "struct uprobe *"
rather than inode + offset. This simplifies the code and allows to avoid
the unnecessary find_uprobe() + put_uprobe() in these functions.
TODO: uprobe_unregister() still needs get_uprobe/put_uprobe to ensure that
this uprobe can't be freed before up_write(&uprobe->register_rwsem).
commit
04b01625da130c7521b768996cd5e48052198b97
Author: Peter Zijlstra <peterz@infradead.org>
Date: Tue Sep 3 10:46:00 2024 -0700
perf/uprobe: split uprobe_unregister()
With uprobe_unregister() having grown a synchronize_srcu(), it becomes
fairly slow to call. Esp. since both users of this API call it in a
loop.
Peel off the sync_srcu() and do it once, after the loop.
We also need to add uprobe_unregister_sync() into uprobe_register()'s
error handling path, as we need to be careful about returning to the
caller before we have a guarantee that partially attached consumer won't
be called anymore. This is an unlikely slow path and this should be
totally fine to be slow in the case of a failed attach.
commit
e04332ebc8ac128fa551e83f1161ab1c094d13a9
Author: Oleg Nesterov <oleg@redhat.com>
Date: Thu Aug 1 15:27:28 2024 +0200
uprobes: kill uprobe_register_refctr()
It doesn't make any sense to have 2 versions of _register(). Note that
trace_uprobe_enable(), the only user of uprobe_register(), doesn't need
to check tu->ref_ctr_offset to decide which one should be used, it could
safely pass ref_ctr_offset == 0 to uprobe_register_refctr().
Add this argument to uprobe_register(), update the callers, and kill
uprobe_register_refctr().
Change-Id: I8d1f9a5db1f19c2bc2029709ae36f82e86f6fe58
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 28 Oct 2024 20:02:59 +0000 (16:02 -0400)]
Fix: silence 'non-consumed' message for non-started sessions
Destroying a session with at least one enabled event and which has never
been started will currently result in an error message in the kernel log
about 'non-consumed data' for each of the per-cpu buffer. This happens
because a packet header is created in the buffer but never consumed if
the session is not started.
Add a check in the buffer cleanup code to avoid printing 'non-consumed
data' errors for buffers associated with a session taht was never
started.
Change-Id: I1358e1ae49d03544a961515b97b115a488434e27
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 17 Oct 2024 20:59:07 +0000 (16:59 -0400)]
fix: writeback: Refine the show_inode_state() macro definition (v6.12)
See upstream commit :
commit
459ca85ae1feff78d1518344df88bb79a092780c
Author: Julian Sun <sunjunchao2870@gmail.com>
Date: Wed Aug 28 16:13:59 2024 +0800
writeback: Refine the show_inode_state() macro definition
Currently, the show_inode_state() macro only prints
part of the state of inode->i_state. Let’s improve it
to display more of its state.
Change-Id: Idaebd56f5775205f8a5c76e117c5ab65f7f1754b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 28 Sep 2024 14:55:07 +0000 (10:55 -0400)]
Version 2.13.15
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4e26025afef07393ff04f5ddfe4a571914900c81
Kienan Stewart [Mon, 29 Jul 2024 14:23:02 +0000 (14:23 +0000)]
Fix: scsi: sd: Atomic write support added in 6.11-rc1
See upstream commit:
commit
bf4ae8f2e6407a779c0368eb0f3e047a8333be17
Author: John Garry <john.g.garry@oracle.com>
Date: Thu Jun 20 12:53:57 2024 +0000
scsi: sd: Atomic write support
Support is divided into two main areas:
- reading VPD pages and setting sdev request_queue limits
- support WRITE ATOMIC (16) command and tracing
The relevant block limits VPD page need to be read to allow the block layer
request_queue atomic write limits to be set. These VPD page limits are
described in sbc4r22 section 6.6.4 - Block limits VPD page.
There are five limits of interest:
- MAXIMUM ATOMIC TRANSFER LENGTH
- ATOMIC ALIGNMENT
- ATOMIC TRANSFER LENGTH GRANULARITY
- MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY
- MAXIMUM ATOMIC BOUNDARY SIZE
MAXIMUM ATOMIC TRANSFER LENGTH is the maximum length for a WRITE ATOMIC
(16) command. It will not be greater than the device MAXIMUM TRANSFER
LENGTH.
ATOMIC ALIGNMENT and ATOMIC TRANSFER LENGTH GRANULARITY are the minimum
alignment and length values for an atomic write in terms of logical blocks.
Unlike NVMe, SCSI does not specify an LBA space boundary, but does specify
a per-IO boundary granularity. The maximum boundary size is specified in
MAXIMUM ATOMIC BOUNDARY SIZE. When used, this boundary value is set in the
WRITE ATOMIC (16) ATOMIC BOUNDARY field - layout for the WRITE_ATOMIC_16
command can be found in sbc4r22 section 5.48. This boundary value is the
granularity size at which the device may atomically write the data. A value
of zero in WRITE ATOMIC (16) ATOMIC BOUNDARY field means that all data must
be atomically written together.
MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY is the maximum atomic write
length if a non-zero boundary value is set.
For atomic write support, the WRITE ATOMIC (16) boundary is not of much
interest, as the block layer expects each request submitted to be executed
be atomically written together.
MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY is the maximum atomic write
length if a non-zero boundary value is set.
For atomic write support, the WRITE ATOMIC (16) boundary is not of much
interest, as the block layer expects each request submitted to be executed
atomically. However, the SCSI spec does leave itself open to a quirky
scenario where MAXIMUM ATOMIC TRANSFER LENGTH is zero, yet MAXIMUM ATOMIC
TRANSFER LENGTH WITH BOUNDARY and MAXIMUM ATOMIC BOUNDARY SIZE are both
non-zero. This case will be supported.
To set the block layer request_queue atomic write capabilities, sanitize
the VPD page limits and set limits as follows:
- atomic_write_unit_min is derived from granularity and alignment values.
If no granularity value is not set, use physical block size
- atomic_write_unit_max is derived from MAXIMUM ATOMIC TRANSFER LENGTH. In
the scenario where MAXIMUM ATOMIC TRANSFER LENGTH is zero and boundary
limits are non-zero, use MAXIMUM ATOMIC BOUNDARY SIZE for
atomic_write_unit_max. New flag scsi_disk.use_atomic_write_boundary is
set for this scenario.
- atomic_write_boundary_bytes is set to zero always
SCSI also supports a WRITE ATOMIC (32) command, which is for type 2
protection enabled. This is not going to be supported now, so check for
T10_PI_TYPE2_PROTECTION when setting any request_queue limits.
To handle an atomic write request, add support for WRITE ATOMIC (16)
command in handler sd_setup_atomic_cmnd(). Flag use_atomic_write_boundary
is checked here for encoding ATOMIC BOUNDARY field.
Trace info is also added for WRITE_ATOMIC_16 command.
Change-Id: Ie072002fe2184615c72531ac081a324ef18cfb03
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 29 Jul 2024 14:14:24 +0000 (14:14 +0000)]
Fix: block_start removed from btrfs_get_extent in 6.11-rc1
See upstream commit:
commit
c77a8c61002e91d859e118008fd495efbe1d9373
Author: Qu Wenruo <wqu@suse.com>
Date: Tue Apr 30 07:53:06 2024 +0930
btrfs: remove extent_map::block_start member
The member extent_map::block_start can be calculated from
extent_map::disk_bytenr + extent_map::offset for regular extents.
And otherwise just extent_map::disk_bytenr.
And this is already validated by the validate_extent_map(). Now we can
remove the member.
However there is a special case in btrfs_create_dio_extent() where we
for NOCOW/PREALLOC ordered extents cannot directly use the resulting
btrfs_file_extent, as btrfs_split_ordered_extent() cannot handle them
yet.
So for that call site, we pass file_extent->disk_bytenr +
file_extent->num_bytes as disk_bytenr for the ordered extent, and 0 for
offset.
Change-Id: I2e3245bb0d1f5263e902659aa05848d5e231909b
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 29 Jul 2024 14:12:47 +0000 (14:12 +0000)]
Fix: block_len removed frmo btrfs_get_extent in 6.11-rc1
See upstream commit:
commit
e28b851ed9b232c3b84cb8d0fedbdfa8ca881386
Author: Qu Wenruo <wqu@suse.com>
Date: Tue Apr 30 07:53:05 2024 +0930
btrfs: remove extent_map::block_len member
The extent_map::block_len is either extent_map::len (non-compressed
extent) or extent_map::disk_num_bytes (compressed extent).
Since we already have sanity checks to do the cross-checks between the
new and old members, we can drop the old extent_map::block_len now.
For most call sites, they can manually select extent_map::len or
extent_map::disk_num_bytes, since most if not all of them have checked
if the extent is compressed.
Change-Id: Ib03fc685b4e876bf4e53afdd28ca9826342a0e4e
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 29 Jul 2024 14:11:36 +0000 (14:11 +0000)]
Fix: orig_start removed from btrfs_get_extent in 6.11-rc1
See upstream commit:
commit
4aa7b5d1784f510c0f42afc1d74efb41947221d7
Author: Qu Wenruo <wqu@suse.com>
Date: Tue Apr 30 07:53:04 2024 +0930
btrfs: remove extent_map::orig_start member
Since we have extent_map::offset, the old extent_map::orig_start is just
extent_map::start - extent_map::offset for non-hole/inline extents.
And since the new extent_map::offset is already verified by
validate_extent_map() while the old orig_start is not, let's just remove
the old member from all call sites.
Change-Id: I025a30d49b3e3ddc37d7846acc191ebbdf2ff19e
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 29 Jul 2024 14:08:32 +0000 (14:08 +0000)]
Fix: ext4_da_reserve_space changed in 6.11-rc1
See upstream commit:
commit
0d66b23d79c750276f791411d81a524549a64852
Author: Zhang Yi <yi.zhang@huawei.com>
Date: Fri May 17 20:40:02 2024 +0800
ext4: make ext4_da_reserve_space() reserve multi-clusters
Add 'nr_resv' parameter to ext4_da_reserve_space(), which indicates the
number of clusters wants to reserve, make it reserve multiple clusters
at a time.
Change-Id: Ib1ce8c3023d53a6d22ec444a435fdb3c871f64c5
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 29 Jul 2024 14:01:18 +0000 (14:01 +0000)]
Fix: kfree_skb changed in 6.11-rc1
See upstream commit:
commit
c53795d48ee8f385c6a9e394651e7ee914baaeba
Author: Yan Zhai <yan@cloudflare.com>
Date: Mon Jun 17 11:09:04 2024 -0700
net: add rx_sk to trace_kfree_skb
skb does not include enough information to find out receiving
sockets/services and netns/containers on packet drops. In theory
skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
stack for OOO packet lookup. Similarly, skb->sk often identifies a local
sender, and tells nothing about a receiver.
Allow passing an extra receiving socket to the tracepoint to improve
the visibility on receiving drops.
Change-Id: I33c8ce1a48006456f198ab1592f733b55be01016
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 26 Jul 2024 20:02:04 +0000 (16:02 -0400)]
Fix: event notifier: set eval_capture to false for kprobe and uprobe
Trying to capture fields for kprobe and uprobe, event notifications
will end up dereferencing NULL pointers. Prevent execution of capture
code in those cases.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7ee7e33b9d12986d79aabefcbd5ef91ed2c6d690
Mathieu Desnoyers [Thu, 18 Jul 2024 17:33:07 +0000 (13:33 -0400)]
Version 2.13.14
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I17fac965a7b9aebeefd65404470dd158c225b0b8
Mathieu Desnoyers [Thu, 4 Jul 2024 15:22:23 +0000 (11:22 -0400)]
kvm instrumentation: Fix kvm_mmio event NULL pointer dereference
Upstream Linux commit
e39d200fa5bf ("KVM: Fix stack-out-of-bounds read
in write_mmio") introduce a NULL pointer check within TP_fast_assign().
lttng-modules commit
33630522da97 ("Update kvm instrumentation for 4.15")
introduce use of:
ctf_sequence_hex(unsigned char, val, val, u32, len)
without the required NULL pointer check, which can trigger NULL pointer
dereference in case of unsatisfied MMIO read.
Add the missing NULL pointer check. Record a sequence of length 0 in the
trace when the val pointer is NULL.
Reported-by: Fahad Arslan <fahad.arslan@siemens.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I51a171a56af96e2cf68dba73f7eb473dd6c0ba0e
Kienan Stewart [Thu, 20 Jun 2024 15:15:59 +0000 (11:15 -0400)]
Fix: Build on CentOS 9 Stream 2024-06
Change-Id: I445d2df9d49930e28e57b6a7d075a9f68e914bad
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 25 Jun 2024 18:15:39 +0000 (14:15 -0400)]
fix: file: Rename fcheck lookup_fd_rcu (v5.10.220)
See upstream backported commit:
commit
c4716bb296504cbc64aeefb370df44e821214c44
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Fri Nov 20 17:14:27 2020 -0600
file: Rename fcheck lookup_fd_rcu
[ Upstream commit
460b4f812a9d473d4b39d87d37844f9fc30a9eb3 ]
Also remove the confusing comment about checking if a fd exists. I
could not find one instance in the entire kernel that still matches
the description or the reason for the name fcheck.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com
Change-Id: Ib880bd8feef1c5d75d2a018cd93a1d464485ab7b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 21 Nov 2022 22:26:59 +0000 (17:26 -0500)]
Cleanup: update stale file paths in LICENSE
Change-Id: I4849b19daa235b93a6435e57bd764128e43d691e
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 21 Nov 2022 22:10:48 +0000 (17:10 -0500)]
Cleanup: use SPDX v3.0 identifiers
The short form of GPL-2.0 and LGPL-2.1 were deprecated in favour of the
clearer GPL-2.0-only and GPL-2.0-or-later in the SPDX license list v3.0.
Change-Id: I8b59b3689aa38fb5f5a114f9d02f22274a5bff57
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 29 May 2024 19:02:15 +0000 (15:02 -0400)]
Warn and return on fd overflow fdt
The fdt should only grow and iterate_fd() holds file_lock, which should
ensure the fdt does not change while the lock is taken but be cautious
and check anyway.
Change-Id: Icd6a3263026734cbe3f296f6087f79add4148a8f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 27 May 2024 14:49:45 +0000 (10:49 -0400)]
fix: close_on_exec(): pass files_struct instead of fdtable (v6.10)
See upstream commit:
commit
f60d374d2cc88034385265d193a38e3f4a4b430c
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Thu Jan 4 21:35:38 2024 -0500
close_on_exec(): pass files_struct instead of fdtable
both callers are happier that way...
Change-Id: I8cdabb073c2090842b27b74954d86cb486c43b3e
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 27 May 2024 15:11:21 +0000 (11:11 -0400)]
fix: net: udp: add IP/port data to the tracepoint udp/udp_fail_queue_rcv_skb (v6.10)
See upstream commit:
commit
e9669a00bba79442dd4862c57761333d6a020c24
Author: Balazs Scheidler <bazsi77@gmail.com>
Date: Tue Mar 26 19:05:47 2024 +0100
net: udp: add IP/port data to the tracepoint udp/udp_fail_queue_rcv_skb
The udp_fail_queue_rcv_skb() tracepoint lacks any details on the source
and destination IP/port whereas this information can be critical in case
of UDP/syslog.
Change-Id: I0c337c5817b0a120298cbf5088d60671d9625b0d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 27 May 2024 17:13:15 +0000 (13:13 -0400)]
fix: btrfs: move ->parent and ->ref_root into btrfs_delayed_ref_node (v6.10)
See upstream commit:
commit
cf4f04325b2b27efa5697ba0ea4c1abdee0035b4
Author: Josef Bacik <josef@toxicpanda.com>
Date: Fri Apr 12 22:57:13 2024 -0400
btrfs: move ->parent and ->ref_root into btrfs_delayed_ref_node
These two members are shared by both the tree refs and data refs, so
move them into btrfs_delayed_ref_node proper. This allows us to greatly
simplify the comparison code, as the shared refs always only sort on
parent, and the non shared refs always sort first on ref_root, and then
only data refs sort on their specific fields.
Change-Id: Ib7c92cc4bb8d674ac66ccfa25c03476f7adaaf90
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 27 May 2024 17:04:42 +0000 (13:04 -0400)]
fix: btrfs: simplify delayed ref tracepoints (v6.10)
See upstream commit:
commit
1bff6d4f873790cfc675afce9860208576508c5a
Author: Josef Bacik <josef@toxicpanda.com>
Date: Fri Apr 12 20:27:00 2024 -0400
btrfs: simplify delayed ref tracepoints
Now that all of the delayed ref information is in the delayed ref node,
drastically simplify the delayed ref tracepoints by simply passing in
the btrfs_delayed_ref_node and populating the tracepoints with the
values from the structure itself.
Change-Id: Ic90bc23d6aa558baec33adc33b4d21e052e83375
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Martin Hicks [Fri, 17 May 2024 14:54:27 +0000 (10:54 -0400)]
Fix mm_vmscan_lru_isolate tracepoint for RHEL 9.4 kernel
Redhat has moved to using the format first found in the 6.7 kernel
for the mm_vmscan_lru_isolate tracepoint.
Change-Id: I2aa84769c0070458d902e9a0305488d6d8a380e1
Signed-off-by: Martin Hicks <martin@sr-research.com>
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 May 2024 18:51:11 +0000 (14:51 -0400)]
Version 2.13.13
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia4f3e6f2686278cfdd4aea5389c63072e0fe42cb
Mathieu Desnoyers [Thu, 9 May 2024 19:46:21 +0000 (15:46 -0400)]
splice wrapper: Fix missing declaration
Include the splice wrapper header within the splice.c implementation to
prevent missing declaration warnings.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib62f7f575324aa3268d76fb38c39ef70257155ef
Mathieu Desnoyers [Thu, 9 May 2024 18:43:05 +0000 (14:43 -0400)]
page alloc wrapper: Fix get_pfnblock_flags_mask prototype
The canary __canary__get_pfnblock_flags_mask has never done its job of
detecting changes to the prototype of get_pfnblock_flags_mask because
it was actually calling the wrapper, because the wrapper/page_alloc.h
header maps get_pfnblock_flags_mask to wrapper_get_pfnblock_flags_mask.
Unfortunately, this wrapper is included by page_alloc.c only _after_ the
linux/pageblock-flags.h header is included, which means the
get_pfnblock_flags_mask prototype does _not_ have the wrapper prefix,
which prevents it from being useful for any kind of type validation.
This has been detected by a compiler warning stating that
wrapper_get_pfnblock_flags_mask() does not have a prior declaration.
Move the wrapper/page_alloc.h include _before_ including
pageblock-flags.h. This ensures the declaration has the wrapper_ prefix,
and therefore the compiler compares the declaration with the definition
of wrapper_get_pfnblock_flags_mask within page_alloc.c. The canary
function can be removed because it is redundant with this type check.
With this proper type check in place, we notice the following two
changes upstream:
commit
535b81e209219 ("mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()")
introduced in v5.9 removes the end_bitidx argument.
commit
ca891f41c4c79 ("mm: constify get_pfnblock_flags_mask and get_pfnblock_migratetype")
introduced in v5.14 adds a const qualifier to the struct page pointer.
Adapt the code to match the evolution of this prototype.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I51b7871edfbff0f74ba1cf4d0ad988eb8d642b4e
Mathieu Desnoyers [Thu, 9 May 2024 17:55:44 +0000 (13:55 -0400)]
lttng probe: include events-internal.h
Include events-internal.h for the declarations of lttng_logger_init and
lttng_logger_exit.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I14968060c309083f90a282f186f3c635f1ebfd8d
Mathieu Desnoyers [Thu, 9 May 2024 17:54:20 +0000 (13:54 -0400)]
syscalls: Remove unused duplicated code
lttng_abi_syscall_list() was moved to src/lttng-abi.c within the 2.13
refactoring. Remove this unused copy.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iabbab3b576e3d85bc5ac9831729a5786b5c5f224
Mathieu Desnoyers [Thu, 9 May 2024 17:52:54 +0000 (13:52 -0400)]
statedump: Add missing events-internal.h include
Include events-internal.h for the declaration of
lttng_statedump_start().
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ieda391aa0d780113efe1d517d8bc5aedb2095a2c
Mathieu Desnoyers [Thu, 9 May 2024 17:50:19 +0000 (13:50 -0400)]
lttng-events: Add missing static
get_tracker() and lttng_metadata_printf() are only used within the
compile unit, mark them as static.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie454c85cc29a30d964922fcfe1f88f3fb91bbc8f
Mathieu Desnoyers [Thu, 9 May 2024 17:49:37 +0000 (13:49 -0400)]
event notifier: Add missing static
Mark capture_sequence() static because it is only used within the
compile unit.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I003e5cf016fbf2f2df24f4550a6c285e020956d0
Mathieu Desnoyers [Thu, 9 May 2024 17:49:08 +0000 (13:49 -0400)]
context callstack: Add missing static
lttng_cs_event_fields() is only used within the compile unit, mark it
static.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3435c6de411a0671de86ecdbcb3e1ea019d908e6
Mathieu Desnoyers [Thu, 9 May 2024 17:48:05 +0000 (13:48 -0400)]
lttng-clock: Add missing lttng/events-internal.h include
Needed for lttng_clock_ref and lttng_clock_unref declarations.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Idb21d942b9e2151f1d5a78b31b28ca1ac455719b
Mathieu Desnoyers [Thu, 9 May 2024 17:46:52 +0000 (13:46 -0400)]
lttng-calibrate: Add missing static and include
Include lttng/events-internal.h for the lttng_calibrate
declaration. Make lttng_calibrate_kretprobe static.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia18aa60e69cae7a16ed7d17c8297d7ad3bb3aca2
Mathieu Desnoyers [Thu, 9 May 2024 17:46:07 +0000 (13:46 -0400)]
lttng-bytecode: Remove dead code
Functions lttng_filter_enabler_attach_bytecode and
lttng_free_enabler_filter_bytecode are unused since the
refactoring of lttng 2.13. Remove them.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2abbb3906cdb5c1b7fa9c27c7871a51af7697832
Mathieu Desnoyers [Thu, 9 May 2024 17:45:45 +0000 (13:45 -0400)]
lttng-abi: Add missing static to function definitions
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Icc6928df5a47c96a560e18b2c854780afbcf6307
Mathieu Desnoyers [Thu, 9 May 2024 17:45:24 +0000 (13:45 -0400)]
ring buffer: Add missing static to function definitions
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I80c03d8c8879777e37d24c3574ba7bdea512446a
Mathieu Desnoyers [Thu, 9 May 2024 17:44:05 +0000 (13:44 -0400)]
blkdev wrapper: Fix constness warning
Upstream commit
f8c7511db009d ("block: make block_class constant")
makes the block_class const. Reflect this change in the lttng-modules
canary function.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib3aadd3dedc413f8370b9739f200ce9663c38d99
Kienan Stewart [Mon, 15 Apr 2024 13:25:26 +0000 (09:25 -0400)]
Fix: timer_expire_entry changed in 4.19.312
See upstream commit:
commit
bbb5b1c060d73ca96ccc8cceaa81f5e1a96e8fa4
Author: Anna-Maria Gleixner <anna-maria@linutronix.de>
Date: Thu Mar 21 13:09:21 2019 +0100
timer/trace: Improve timer tracing
[ Upstream commit
f28d3d5346e97e60c81f933ac89ccf015430e5cf ]
Timers are added to the timer wheel off by one. This is required in
case a timer is queued directly before incrementing jiffies to prevent
early timer expiry.
When reading a timer trace and relying only on the expiry time of the timer
in the timer_start trace point and on the now in the timer_expiry_entry
trace point, it seems that the timer fires late. With the current
timer_expiry_entry trace point information only now=jiffies is printed but
not the value of base->clk. This makes it impossible to draw a conclusion
to the index of base->clk and makes it impossible to examine timer problems
without additional trace points.
Therefore add the base->clk value to the timer_expire_entry trace
point, to be able to calculate the index the timer base is located at
during collecting expired timers.
Change-Id: I2ebdbb637db0966ff51f45bf66916a59a496b50c
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 25 Mar 2024 14:53:46 +0000 (10:53 -0400)]
Fix: dev_base_lock removed in linux 6.9-rc1
See upstream commit:
commit
1b3ef46cb7f2618cc0b507393220a69810f6da12
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Feb 13 06:32:45 2024 +0000
net: remove dev_base_lock
dev_base_lock is not needed anymore, all remaining users also hold RTNL.
Change-Id: I6b07e6eed07fd398302ca14d23162ed24d74df15
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 25 Mar 2024 14:30:32 +0000 (10:30 -0400)]
Fix: mm_compaction_migratepages changed in linux 6.9-rc1
See upstream commit:
commit
ab755bf4249b992fc2140d615ab0a686d50765b4
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date: Tue Feb 20 14:16:31 2024 +0800
mm: compaction: update the cc->nr_migratepages when allocating or freeing the freepages
Currently we will use 'cc->nr_freepages >= cc->nr_migratepages' comparison
to ensure that enough freepages are isolated in isolate_freepages(),
however it just decreases the cc->nr_freepages without updating
cc->nr_migratepages in compaction_alloc(), which will waste more CPU
cycles and cause too many freepages to be isolated.
So we should also update the cc->nr_migratepages when allocating or
freeing the freepages to avoid isolating excess freepages. And I can see
fewer free pages are scanned and isolated when running thpcompact on my
Arm64 server:
k6.7 k6.7_patched
Ops Compaction pages isolated
120692036.00
118160797.00
Ops Compaction migrate scanned
131210329.00
154093268.00
Ops Compaction free scanned
1090587971.00
1080632536.00
Ops Compact scan efficiency 12.03 14.26
Moreover, I did not see an obvious latency improvements, this is likely
because isolating freepages is not the bottleneck in the thpcompact test
case.
k6.7 k6.7_patched
Amean fault-both-1 1089.76 ( 0.00%) 1080.16 * 0.88%*
Amean fault-both-3 1616.48 ( 0.00%) 1636.65 * -1.25%*
Amean fault-both-5 2266.66 ( 0.00%) 2219.20 * 2.09%*
Amean fault-both-7 2909.84 ( 0.00%) 2801.90 * 3.71%*
Amean fault-both-12 4861.26 ( 0.00%) 4733.25 * 2.63%*
Amean fault-both-18 7351.11 ( 0.00%) 6950.51 * 5.45%*
Amean fault-both-24 9059.30 ( 0.00%) 9159.99 * -1.11%*
Amean fault-both-30 10685.68 ( 0.00%) 11399.02 * -6.68%*
Change-Id: I103a43fd1b549360b3fc978fd409b7c17ef3e192
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 25 Mar 2024 13:40:29 +0000 (09:40 -0400)]
Fix: ASoC add component to set_bias_level events in linux 6.9-rc1
See upstream commit:
commit
6ef46a69ec32fe1cf56de67742fcd01af4bf48af
Author: Luca Ceresoli <luca.ceresoli@bootlin.com>
Date: Wed Mar 6 10:30:00 2024 +0100
ASoC: trace: add component to set_bias_level trace events
The snd_soc_bias_level_start and snd_soc_bias_level_done trace events
currently look like:
aplay-229 [000] 1250.140778: snd_soc_bias_level_start: card=vscn-2046 val=1
aplay-229 [000] 1250.140784: snd_soc_bias_level_done: card=vscn-2046 val=1
aplay-229 [000] 1250.140786: snd_soc_bias_level_start: card=vscn-2046 val=2
aplay-229 [000] 1250.140788: snd_soc_bias_level_done: card=vscn-2046 val=2
kworker/u8:1-21 [000] 1250.140871: snd_soc_bias_level_start: card=vscn-2046 val=1
kworker/u8:0-11 [000] 1250.140951: snd_soc_bias_level_start: card=vscn-2046 val=1
kworker/u8:0-11 [000] 1250.140956: snd_soc_bias_level_done: card=vscn-2046 val=1
kworker/u8:0-11 [000] 1250.140959: snd_soc_bias_level_start: card=vscn-2046 val=2
kworker/u8:0-11 [000] 1250.140961: snd_soc_bias_level_done: card=vscn-2046 val=2
kworker/u8:1-21 [000] 1250.167219: snd_soc_bias_level_done: card=vscn-2046 val=1
kworker/u8:1-21 [000] 1250.167222: snd_soc_bias_level_start: card=vscn-2046 val=2
kworker/u8:1-21 [000] 1250.167232: snd_soc_bias_level_done: card=vscn-2046 val=2
kworker/u8:0-11 [000] 1250.167440: snd_soc_bias_level_start: card=vscn-2046 val=3
kworker/u8:0-11 [000] 1250.167444: snd_soc_bias_level_done: card=vscn-2046 val=3
kworker/u8:1-21 [000] 1250.167497: snd_soc_bias_level_start: card=vscn-2046 val=3
kworker/u8:1-21 [000] 1250.167506: snd_soc_bias_level_done: card=vscn-2046 val=3
There are clearly multiple calls, one per component, but they cannot be
discriminated from each other.
Change the ftrace events to also print the component name, to make it clear
which part of the code is involved. This requires changing the passed value
from a struct snd_soc_card, where the DAPM context is not kwown, to a
struct snd_soc_dapm_context where it is obviously known but the a card
pointer is also available.
With this change, the resulting trace becomes:
aplay-247 [000] 1436.357332: snd_soc_bias_level_start: card=vscn-2046 component=(none) val=1
aplay-247 [000] 1436.357338: snd_soc_bias_level_done: card=vscn-2046 component=(none) val=1
aplay-247 [000] 1436.357340: snd_soc_bias_level_start: card=vscn-2046 component=(none) val=2
aplay-247 [000] 1436.357343: snd_soc_bias_level_done: card=vscn-2046 component=(none) val=2
kworker/u8:4-215 [000] 1436.357437: snd_soc_bias_level_start: card=vscn-2046 component=
ff560000.codec val=1
kworker/u8:5-231 [000] 1436.357518: snd_soc_bias_level_start: card=vscn-2046 component=
ff320000.i2s val=1
kworker/u8:5-231 [000] 1436.357523: snd_soc_bias_level_done: card=vscn-2046 component=
ff320000.i2s val=1
kworker/u8:5-231 [000] 1436.357526: snd_soc_bias_level_start: card=vscn-2046 component=
ff320000.i2s val=2
kworker/u8:5-231 [000] 1436.357528: snd_soc_bias_level_done: card=vscn-2046 component=
ff320000.i2s val=2
kworker/u8:4-215 [000] 1436.383217: snd_soc_bias_level_done: card=vscn-2046 component=
ff560000.codec val=1
kworker/u8:4-215 [000] 1436.383221: snd_soc_bias_level_start: card=vscn-2046 component=
ff560000.codec val=2
kworker/u8:4-215 [000] 1436.383231: snd_soc_bias_level_done: card=vscn-2046 component=
ff560000.codec val=2
kworker/u8:5-231 [000] 1436.383468: snd_soc_bias_level_start: card=vscn-2046 component=
ff320000.i2s val=3
kworker/u8:5-231 [000] 1436.383472: snd_soc_bias_level_done: card=vscn-2046 component=
ff320000.i2s val=3
kworker/u8:4-215 [000] 1436.383503: snd_soc_bias_level_start: card=vscn-2046 component=
ff560000.codec val=3
kworker/u8:4-215 [000] 1436.383513: snd_soc_bias_level_done: card=vscn-2046 component=
ff560000.codec val=3
Change-Id: I959f1680c002acdf29828b968d3975247f5433d8
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 25 Mar 2024 12:54:42 +0000 (08:54 -0400)]
Fix: ASoC snd_doc_dapm on linux 6.9-rc1
See upstream commit:
commit
7df3eb4cdb6bbfa482f51548b9fd47c2723c68ba
Author: Luca Ceresoli <luca.ceresoli@bootlin.com>
Date: Wed Mar 6 10:30:01 2024 +0100
ASoC: trace: add event to snd_soc_dapm trace events
Add the event value to the snd_soc_dapm_start and snd_soc_dapm_done trace
events to make them more informative.
Trace before:
aplay-229 [000] 250.140309: snd_soc_dapm_start: card=vscn-2046
aplay-229 [000] 250.167531: snd_soc_dapm_done: card=vscn-2046
aplay-229 [000] 251.169588: snd_soc_dapm_start: card=vscn-2046
aplay-229 [000] 251.195245: snd_soc_dapm_done: card=vscn-2046
Trace after:
aplay-214 [000] 693.290612: snd_soc_dapm_start: card=vscn-2046 event=1
aplay-214 [000] 693.315508: snd_soc_dapm_done: card=vscn-2046 event=1
aplay-214 [000] 694.537349: snd_soc_dapm_start: card=vscn-2046 event=2
aplay-214 [000] 694.563241: snd_soc_dapm_done: card=vscn-2046 event=2
Change-Id: If0d33544b8dd1dfb3d12ca9390892190fc0444b0
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 22 Mar 2024 13:28:08 +0000 (09:28 -0400)]
Fix: build kvm probe on EL 8.4+
The lower value of the EL range, 240.15.1, corresponds to the first
import of EL r8 kernels into Rocky Linux's kernel staging repo.
The change may have been introduced in an earlier RHEL 8 kernel,
prior to the history of imports into Rocky.
Change-Id: Icefe472d43e28cc09746e9e046b12299609ebab1
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 22 Mar 2024 13:55:55 +0000 (09:55 -0400)]
Fix: support ext4_journal_start on EL 8.4+
The lower value of the EL range, 240.15.1, corresponds to the first
import of EL r8 kernels into Rocky Linux's kernel staging repo.
The change may have been introduced in an earlier RHEL 8 kernel,
prior to the history of imports into Rocky.
Change-Id: Ibec02b382478bee33947d079f33835823827f4c5
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Thu, 21 Mar 2024 19:16:29 +0000 (15:16 -0400)]
Fix: correct RHEL range for kmem_cache_free define
When compiling against RHEL 8.5 kernels, lttng-modules builds fail
with the following error:
```
lttng-modules/src/probes/../../include/lttng/tracepoint-event-impl.h:133:6: error: conflicting types for ‘trace_kmem_
cache_free’; have ‘void(long unsigned int, const void *)’
```
The original range was introduced in commit
89d917153fc52c1e5b0ddabf8ee078897656b263 which tested against RHEL 8.6
and not RHEL 8.5.
Change-Id: Icff98c15415ce8e1e95a10974cd65ed6e84cd00a
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 21 Mar 2024 19:45:34 +0000 (15:45 -0400)]
Version 2.13.12
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I448e493bbf064587e3b717a9f5c41c13f0569f29
Kienan Stewart [Thu, 14 Mar 2024 15:37:05 +0000 (11:37 -0400)]
docs: Add supported versions and fix-backport policy
Change-Id: I5d6da21b9541f838cb326263eff8c1448e37fc55
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 24 Nov 2023 15:09:46 +0000 (10:09 -0500)]
docs: Add links to project resources
Indicate that Gerrit (https://review.lttng.org) is the principal place
where patches are submitted and reviewed, rather than the mailing list.
Based on feedback received on the mailing list:
https://lists.lttng.org/pipermail/lttng-dev/2023-November/030670.html
Change-Id: I611deeec26393fc25c9a103c022687198100df0c
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 8 Mar 2024 17:47:06 +0000 (12:47 -0500)]
Fix: Correct minimum version in jbd2 SLE kernel range
This range was introduced in commit
b49650509ff072d37ec112cf45a5f14f382c9a31;
however, the range is wrong and worked because the kernel versions
(eg. `5.14.21-150400.24.100-default`) were evaluated to values
greater than `LTTNG_SLE_KERNEL_RANGE(5,14,21,24,46,1)`.
As a result builds of lttng-modules against older versions of SLE
kernels failed.
Change-Id: I23d97d84a23c7b24e957fe943932d6aefbe1b409
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Fri, 8 Mar 2024 16:26:02 +0000 (11:26 -0500)]
Fix: Handle recent SLE major version codes
Starting in early 2022, the SLE linux version codes changed from the
previous style `5.3.18-59.40.1` to a new convention in which the major
version is a compound number consisting of the major release version,
the service pack version, and the auxillary version (currently unused
from my understanding) similar to the following `5.3.18-150300.59.43.1`[1].
The newer values used in the SLE major version causes the integer
value to "overflow" the expected number of digits and the comparisons
may fail. The `LTTNG_SLE_KERNEL_VERSION` macro also multiplies the
`LTTNG_KERNEL_VERSION` by `100000000ULL` which doesn't work in all
situations, as the resulting value is too large to be stored fully in
an `unsigned long long`.
Example of previous results:
```
// Example range comparison. True or false depending on the value of
// `LTTNG_SLE_VERSION_CODE` and `LTTNG_LINUX_VERSION_CODE`.
LTTNG_SLE_KERNEL_RANGE(5,15,21,150400,24,46, 5,15,0,0,0,0);
// Note: values printed with `%ull`
LTTNG_SLE_KERNEL_VERSION(5,15,21,24,26,1); //
6106486698364570153
LTTNG_SLE_KERNEL_VERSION(5,15,0,0,0,0); // 0
LTTNG_KERNEL_VERSION(5,15,0); //
84869120
// Corrected SLE version codes
LTTNG_SLE_KERNEL_VERSION(5,14,21,150400,24,26); //
14918348902249793914
LTTNG_SLE_KERNEL_VERSION(5,14,21,150400,24,46); //
14918348902249793934
LTTNG_SLE_KERNEL_VERSION(5,15,0,150400,0,0)); //
6971507145825058816
```
`LTTNG_KERNEL_VERSION` packs the kernel version into a 32-bit integer;
however, using that type of packing on the SLE kernel version will not
work well:
* Major: `150400` needs 18 bits
* Minor: may exceed 127, requires 8 bits (eg. `4.12.14-150100.197.148.1`)
* Patch: may exceed 127, requires 8 bits (eg. `5.3.18-150300.59.124.1`)
In this patch, the SLE version is packed into a 64-bit integer
with 48 bits for the major version, 8 bits for each of the minor and
patch versions.
As a result of packing the SLE version into a 64-bit integer,
it is not possible to coherently combine an `LTTNG_KERNEL_VERSION` and
an `LTTNG_SLE_KERNEL_VERSION`. Doing so would require an integer
larger than 64-bits. Therefore, the `LTTNG_SLE_KERNEL_RANGE` macro has
been adjusted to perform the range comparisons using the two values
separately. The usage of the `LTTNG_SLE_KERNEL_RANGE` remains
unchanged, as `LTTNG_SLE_VERSION` is only used inside that macro.
Using the adjusted macros:
```
// Example range comparison. True or false depending on the value of
// `LTTNG_SLE_VERSION_CODE` and `LTTNG_LINUX_VERSION_CODE`.
LTTNG_SLE_KERNEL_RANGE(5,15,21,150400,24,46, 5,15,0,0,0,0);
// Note: values printed with `%ull`
LTTNG_SLE_VERSION(24,26,1); //
1579521
LTTNG_SLE_VERSION(0,0,0); // 0
LTTNG_KERNEL_VERSION(5,15,0); //
84869120
// Corrected SLE version codes
LTTNG_SLE_VERSION(150400,24,26); //
9856620570
LTTNG_SLE_VERSION(150400,24,46); //
9856620590
LTTNG_SLE_VERSION(150400,0,0)); //
9863168000
```
Known drawbacks
===============
It's possible that future releases of SLE kernels have minor or patch
values that exceed 255 (SLE15SP1 has a release using `197`, for example),
requiring an adjustment to using more bits for those fields when
packing into a 64-bit integer.
The schema of multiplying an `LTTNG_KERNEL_VERSION` by a large value
is used for other distributions. RHEL in particular uses
`100000000ULL`, which could lead to overflow issues with certain
comparisons similar to the previous behaviour of
`LTTNG_SLE_KERNEL_VERSION(5,15,0,0,0,0);`.
[1]: https://www.suse.com/support/kb/doc/?id=
000019587#SLE15SP4
Change-Id: Iaa90bfa422e47213a13829cdf008ab20d7484cab
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Tue, 27 Feb 2024 16:47:58 +0000 (11:47 -0500)]
Fix: build on sles15sp4
Introduced in 5.14.21-150400.46.1.
See SLE commit:
commit
96a814b6c528f45fc92bf8e6de90ad8923511091
Author: Petr Pavlu <petr.pavlu@suse.com>
Date: Tue Jan 24 14:52:24 2023 +0100
jbd2: use the correct print format (git-fixes).
suse-commit:
34db311bec3ca4388b82b2355eed7c08b25f5a2e
Change-Id: Ic267b9498b7f9a4a814514ff82f9226f844c133f
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Martin Hicks [Fri, 26 Jan 2024 17:18:33 +0000 (12:18 -0500)]
Compile fixes for RHEL 9.3 kernels
The ranges were build tested on RHEL9.2 (5.14.0-284.11.1), RHEL9.3
(5.14.0-362.8.1) and RHEL8.9 (4.18.0-513.11.1).
This disables the kmem and compaction modules. I don't believe getting
these to compile will be easy, as the required struct declarations are
in vmlinux.h, and haven't been moved into mm/internal.h and mm/slab.h in
the RHEL sources.
Change-Id: I999c593d6850e2327f6e9df8432a4ea2325a7cea
Signed-off-by: Martin Hicks <martin@sr-research.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 5 Feb 2024 13:52:29 +0000 (08:52 -0500)]
Fix: ext4_discard_preallocations changed in linux 6.8.0-rc3
See upstream commit:
commit
f0e54b6087de9571ec61c189d6c378b81edbe3b2
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date: Fri Jan 5 17:21:02 2024 +0800
ext4: remove 'needed' in trace_ext4_discard_preallocations
As 'needed' to trace_ext4_discard_preallocations is always 0 which
is meaningless. Just remove it.
Change-Id: Ib6b698ca553c4beebd4ca791c83bbbb927901758
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 22 Jan 2024 18:13:36 +0000 (13:13 -0500)]
Fix: btrfs_get_extent flags and compress_type changed in linux 6.8.0-rc1
See upstream commit:
commit
f86f7a75e2fb5fd7d31d00eab8a392f97ba42ce9
Author: Filipe Manana <fdmanana@suse.com>
Date: Mon Dec 4 16:20:33 2023 +0000
btrfs: use the flags of an extent map to identify the compression type
Currently, in struct extent_map, we use an unsigned int (32 bits) to
identify the compression type of an extent and an unsigned long (64 bits
on a 64 bits platform, 32 bits otherwise) for flags. We are only using
6 different flags, so an unsigned long is excessive and we can use flags
to identify the compression type instead of using a dedicated 32 bits
field.
We can easily have tens or hundreds of thousands (or more) of extent maps
on busy and large filesystems, specially with compression enabled or many
or large files with tons of small extents. So it's convenient to have the
extent_map structure as small as possible in order to use less memory.
So remove the compression type field from struct extent_map, use flags
to identify the compression type and shorten the flags field from an
unsigned long to a u32. This saves 8 bytes (on 64 bits platforms) and
reduces the size of the structure from 136 bytes down to 128 bytes, using
now only two cache lines, and increases the number of extent maps we can
have per 4K page from 30 to 32. By using a u32 for the flags instead of
an unsigned long, we no longer use test_bit(), set_bit() and clear_bit(),
but that level of atomicity is not needed as most flags are never cleared
once set (before adding an extent map to the tree), and the ones that can
be cleared or set after an extent map is added to the tree, are always
performed while holding the write lock on the extent map tree, while the
reader holds a lock on the tree or tests for a flag that never changes
once the extent map is in the tree (such as compression flags).
Change-Id: I95402d43f064c016b423b48652e4968d3db9b8a9
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 22 Jan 2024 17:17:33 +0000 (12:17 -0500)]
Fix: btrfs_chunk tracepoints changed in linux 6.8.0-rc1
See upstream commit:
commit
7dc66abb5a47778d7db327783a0ba172b8cff0b5
Author: Filipe Manana <fdmanana@suse.com>
Date: Tue Nov 21 13:38:38 2023 +0000
btrfs: use a dedicated data structure for chunk maps
Currently we abuse the extent_map structure for two purposes:
1) To actually represent extents for inodes;
2) To represent chunk mappings.
This is odd and has several disadvantages:
1) To create a chunk map, we need to do two memory allocations: one for
an extent_map structure and another one for a map_lookup structure, so
more potential for an allocation failure and more complicated code to
manage and link two structures;
2) For a chunk map we actually only use 3 fields (24 bytes) of the
respective extent map structure: the 'start' field to have the logical
start address of the chunk, the 'len' field to have the chunk's size,
and the 'orig_block_len' field to contain the chunk's stripe size.
Besides wasting a memory, it's also odd and not intuitive at all to
have the stripe size in a field named 'orig_block_len'.
We are also using 'block_len' of the extent_map structure to contain
the chunk size, so we have 2 fields for the same value, 'len' and
'block_len', which is pointless;
3) When an extent map is associated to a chunk mapping, we set the bit
EXTENT_FLAG_FS_MAPPING on its flags and then make its member named
'map_lookup' point to the associated map_lookup structure. This means
that for an extent map associated to an inode extent, we are not using
this 'map_lookup' pointer, so wasting 8 bytes (on a 64 bits platform);
4) Extent maps associated to a chunk mapping are never merged or split so
it's pointless to use the existing extent map infrastructure.
So add a dedicated data structure named 'btrfs_chunk_map' to represent
chunk mappings, this is basically the existing map_lookup structure with
some extra fields:
1) 'start' to contain the chunk logical address;
2) 'chunk_len' to contain the chunk's length;
3) 'stripe_size' for the stripe size;
4) 'rb_node' for insertion into a rb tree;
5) 'refs' for reference counting.
This way we do a single memory allocation for chunk mappings and we don't
waste memory for them with unused/unnecessary fields from an extent_map.
We also save 8 bytes from the extent_map structure by removing the
'map_lookup' pointer, so the size of struct extent_map is reduced from
144 bytes down to 136 bytes, and we can now have 30 extents map per 4K
page instead of 28.
Change-Id: Ie52b5ac83df4bc6abeb84d958c4f5d24ae0d8c75
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 22 Jan 2024 16:47:40 +0000 (11:47 -0500)]
Fix: strlcpy removed in linux 6.8.0-rc1
See upstream commit:
commit
d26270061ae66b915138af7cd73ca6f8b85e6b44
Author: Kees Cook <keescook@chromium.org>
Date: Thu Jan 18 12:31:55 2024 -0800
string: Remove strlcpy()
With all the users of strlcpy() removed[1] from the kernel, remove the
API, self-tests, and other references. Leave mentions in Documentation
(about its deprecation), and in checkpatch.pl (to help migrate host-only
tools/ usage). Long live strscpy().
The replacement interface, `strscpy`, has been available since linux
4.3, introduced in the upstream commit
30c44659f4a3e7e1f9f47e895591b4b40bf62671.
As lttng-modules master branch targets linux 4.4+ at this time,
`strlcpy` can be replaced with `strscpy`.
Change-Id: I27cdff70a504b25340cc59150ed8e959d9629e43
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 22 Jan 2024 16:33:39 +0000 (11:33 -0500)]
Fix: timer_start changed in linux 6.8.0-rc1
See upstream commit
commit
dbcdcb62b59db2cf6a24113873b90da15c6f0b19
Author: Anna-Maria Behnsen <anna-maria@linutronix.de>
Date: Fri Dec 1 10:26:26 2023 +0100
tracing/timers: Enhance timer_start tracepoint
For starting a timer, the timer is enqueued into a bucket of the timer
wheel. The bucket expiry is the defacto expiry of the timer but it is not
equal the timer expiry because of increasing granularity when bucket is in
a higher level of the wheel. To be able to figure out in a trace whether a
timer expired in time or not, the bucket expiry time is required as well.
Add bucket expiry time to the timer_start tracepoint and thereby simplify
the arguments.
Change-Id: I4868092765745b1efd0c48f13c0b837f2007dcb6
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 22 Jan 2024 16:10:37 +0000 (11:10 -0500)]
Fix: sched_stat_runtime changed in linux 6.8.0-rc1
See upstream commit:
commit
5fe6ec8f6ab549b6422e41551abb51802bd48bc7
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon Nov 6 13:41:43 2023 +0100
sched: Remove vruntime from trace_sched_stat_runtime()
Tracing the runtime delta makes sense, observer can sum over time.
Tracing the absolute vruntime makes less sense, inconsistent:
absolute-vs-delta, but also vruntime delta can be computed from
runtime delta.
Removing the vruntime thing also makes the two tracepoint sites
identical, allowing to unify the code in a later patch.
Change-Id: I24ebb4e06dbb646a1af75ac62b74f3821ff197de
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 10 Jan 2024 20:35:48 +0000 (15:35 -0500)]
Version 2.13.11
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0f91343f361271cc5b51f5fade12c7cc7ed90da4
Mathieu Desnoyers [Wed, 10 Jan 2024 01:55:58 +0000 (20:55 -0500)]
Fix: Include linux/sched/rt.h for kernels v3.9 to v3.14
From kernel v3.0 to v3.8, MAX_RT_PRIO is defined in linux/sched.h.
From kernel v3.9 to v3.14, MAX_RT_PRIO is defined in linux/sched/rt.h,
which is not included by linux/sched.h (hence this work-around).
From kernel v3.15 onwards, MAX_RT_PRIO is defined in linux/sched/prio.h,
which is included by linux/sched.h.
Add the missing linux/sched/rt.h include for the affected kernel version
range.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie7e1d9dc710621deca04553a9b5ba7f9a4d83c15
Mathieu Desnoyers [Mon, 8 Jan 2024 18:31:04 +0000 (13:31 -0500)]
Fix: Disable IBT around indirect function calls
When the Intel IBT feature is enabled, a CPU supporting this feature
validates that all indirect jumps/calls land on an ENDBR64 instruction.
The kernel seals functions which are not meant to be called indirectly,
which means that calling functions indirectly from their address fetched
using kallsyms or kprobes trigger a crash.
Use the MSR_IA32_S_CET CET_ENDBR_EN MSR bit to temporarily disable ENDBR
validation around indirect calls to kernel functions. Considering that
the main purpose of this feature is to prevent ROP-style attacks,
disabling the ENDBR validation temporarily around the call from a kernel
module does not affect the ROP protection.
Fixes #1408
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I97f5d8efce093c1e956cede1f44de2fcebf30227
Mathieu Desnoyers [Tue, 9 Jan 2024 15:36:31 +0000 (10:36 -0500)]
Inline implementation of task_prio()
The task_prio() function has been implemented as "return p->prio -
MAX_RT_PRIO;" since at least kernel v3.0, so inline it into
lttng-modules rather than using kallsyms to call the kernel
implementation.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7dd482a2da72a005c16b3e5864767b47d7bc3fd3
Mathieu Desnoyers [Tue, 9 Jan 2024 15:33:13 +0000 (10:33 -0500)]
Fix: prio context NULL pointer exception
A missing call to wrapper_task_prio_init() causes the function pointer
for task_prio to stay NULL, which triggers a OOPS when trying to use the
prio context.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I417e84cb8a07db624e682c7ec2c033fbc2a7b8e7
Mathieu Desnoyers [Mon, 18 Dec 2023 18:17:07 +0000 (13:17 -0500)]
Fix: MODULE_IMPORT_NS is introduced in kernel 5.4
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4c5faafb3a3ff8178b45c0e411113b17643bbc78
Lei wang [Mon, 18 Dec 2023 10:16:33 +0000 (05:16 -0500)]
Android: Import VFS namespace for android common kernel
Android GKI kernel add limitation on fs interface usage.
Need to import VFS namespace explicitly to make it workable
for lttng-modules.
Signed-off-by: Lei wang <quic_leiwan@quicinc.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 1 Dec 2023 14:52:08 +0000 (09:52 -0500)]
Fix: get_file_rcu is missing in kernels < 4.1
Open-code the get_file_rcu using atomic_long_inc_not_zero() for kernel
versions < 4.1.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0fa905b078165ede8b1837bb8d77891d05d0e8ed
Kienan Stewart [Mon, 20 Nov 2023 16:34:40 +0000 (11:34 -0500)]
fix: lookup_fd_rcu replaced by lookup_fdget_rcu in linux 6.7.0-rc1
See upstream commit:
commit
0ede61d8589cc2d93aa78230d74ac58b5b8d0244
Author: Christian Brauner <brauner@kernel.org>
Date: Fri Sep 29 08:45:59 2023 +0200
file: convert to SLAB_TYPESAFE_BY_RCU
In recent discussions around some performance improvements in the file
handling area we discussed switching the file cache to rely on
SLAB_TYPESAFE_BY_RCU which allows us to get rid of call_rcu() based
freeing for files completely. This is a pretty sensitive change overall
but it might actually be worth doing.
The main downside is the subtlety. The other one is that we should
really wait for Jann's patch to land that enables KASAN to handle
SLAB_TYPESAFE_BY_RCU UAFs. Currently it doesn't but a patch for this
exists.
With SLAB_TYPESAFE_BY_RCU objects may be freed and reused multiple times
which requires a few changes. So it isn't sufficient anymore to just
acquire a reference to the file in question under rcu using
atomic_long_inc_not_zero() since the file might have already been
recycled and someone else might have bumped the reference.
In other words, callers might see reference count bumps from newer
users. For this reason it is necessary to verify that the pointer is the
same before and after the reference count increment. This pattern can be
seen in get_file_rcu() and __files_get_rcu().
In addition, it isn't possible to access or check fields in struct file
without first aqcuiring a reference on it. Not doing that was always
very dodgy and it was only usable for non-pointer data in struct file.
With SLAB_TYPESAFE_BY_RCU it is necessary that callers first acquire a
reference under rcu or they must hold the files_lock of the fdtable.
Failing to do either one of this is a bug.
Thanks to Jann for pointing out that we need to ensure memory ordering
between reallocations and pointer check by ensuring that all subsequent
loads have a dependency on the second load in get_file_rcu() and
providing a fixup that was folded into this patch.
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Change-Id: Iba3663f19a54820afd31a8eeec24b3b5d4b06589
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Kienan Stewart [Mon, 20 Nov 2023 16:33:14 +0000 (11:33 -0500)]
fix: mm, vmscan signatures changed in linux 6.7.0-rc1
See upstream commit:
commit
3dfbb555c98ac55b9d911f9af0e35014b445fb41
Author: Vlastimil Babka <vbabka@suse.cz>
Date: Thu Sep 14 15:16:39 2023 +0200
mm, vmscan: remove ISOLATE_UNMAPPED
This isolate_mode_t flag is effectively unused since
89f6c88a6ab4 ("mm:
__isolate_lru_page_prepare() in isolate_migratepages_block()") as
sc->may_unmap is now checked directly (and only node_reclaim has a mode
that sets it to 0). The last remaining place is mm_vmscan_lru_isolate
tracepoint for the isolate_mode parameter. That one was mainly used to
indicate the active/inactive mode, which the trace-vmscan-postprocess.pl
script consumed, but that got silently broken. After fixing the script by
the previous patch, it does not need the isolate_mode anymore. So just
remove the parameter and with that the whole ISOLATE_UNMAPPED flag.
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie7346886d926a1a9d20bcb1570c587c5e943a1c3
Kienan Stewart [Mon, 20 Nov 2023 16:27:12 +0000 (11:27 -0500)]
fix: phys_proc_id and cpu_core_id moved in linux 6.7.0-rc1
See upstream commit:
commit
02fb601d27a7abf60d52b21bdf5b100a8d63da3f
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Mon Aug 14 10:18:30 2023 +0200
x86/cpu: Move phys_proc_id into topology info
Rename it to pkg_id which is the terminology used in the kernel.
No functional change.
See upstream commit:
commit
e95256335d45cc965cd12c423535002974313340
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Mon Aug 14 10:18:34 2023 +0200
x86/cpu: Move cpu_core_id into topology info
Rename it to core_id and stick it to the other ID fields.
No functional change.
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I574b02430210d5bb72c4b9db901d0e3a6dc7bea0
Kienan Stewart [Mon, 16 Oct 2023 14:10:09 +0000 (10:10 -0400)]
Fix build for RHEL 8.8 with linux 4.18.0-477.10.1+
4.18.0-477.10.1 introduces backports a change which updates the
`kfree_skb` trace event to the 3-argument version used in more recent
kernel versions.
Change-Id: I5a1071a59659b76e1499beae3388159ca8ced1f7
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jérémie Galarneau [Thu, 5 Oct 2023 21:02:57 +0000 (17:02 -0400)]
Fix: bytecode validator: oops during validation of immediate string
Issue observed
--------------
Running Linux 6.5.5, lttng-modules @
6be48c9f, all built with gcc
13.2.1, I got a 'BUG' in dmesg while enabling the following event
rule:
$ lttng enable-event --kernel --syscall --channel chanK --all --filter '$ctx.procname == "UST reg*"'
The relevant parts of the 'BUG' output follow:
[ +0.715480] detected buffer overflow in strnlen
[ +0.000001] kernel BUG at lib/string_helpers.c:1031!
[ +0.000008] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000003] CPU: 2 PID: 157174 Comm: Client manageme Tainted: G S U OE 6.5.5-arch1-1 #1
d82a0f532dd8cfe67d5795c1738d9c01059a0c62
[ +0.000001] RIP: 0010:fortify_panic+0x13/0x20
[ +0.000006] Code: 41 5d c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 fe 48 c7 c7 90 22 c8 86 e8 3d aa b1 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
[ +0.000002] RSP: 0018:
ffffa7c7c106f918 EFLAGS:
00010246
[ +0.000002] RAX:
0000000000000023 RBX:
000000000000000b RCX:
0000000000000000
[ +0.000002] RDX:
0000000000000000 RSI:
ffff92766e4a16c0 RDI:
ffff92766e4a16c0
[ +0.000001] RBP:
0000000000000000 R08:
0000000000000000 R09:
ffffa7c7c106f7c0
[ +0.000001] R10:
0000000000000003 R11:
ffffffff874ca068 R12:
ffff927618202480
[ +0.000001] R13:
ffff9276182024d2 R14:
ffff927453999c08 R15:
ffff9273dc7aa478
[ +0.000001] FS:
00007f06553f9680(0000) GS:
ffff92766e480000(0000) knlGS:
0000000000000000
[ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ +0.000002] CR2:
0000556d54eceaa8 CR3:
00000001ad9de002 CR4:
00000000003706e0
[ +0.000001] Call Trace:
[ +0.000002] <TASK>
[ +0.000002] ? die+0x36/0x90
[ +0.000004] ? do_trap+0xda/0x100
[ +0.000003] ? fortify_panic+0x13/0x20
[ +0.000002] ? do_error_trap+0x6a/0x90
[ +0.000002] ? fortify_panic+0x13/0x20
[ +0.000002] ? exc_invalid_op+0x50/0x70
[ +0.000003] ? fortify_panic+0x13/0x20
[ +0.000002] ? asm_exc_invalid_op+0x1a/0x20
[ +0.000005] ? fortify_panic+0x13/0x20
[ +0.000002] ? fortify_panic+0x13/0x20
[ +0.000003] bytecode_validate_overflow+0x155/0x1f0 [lttng_tracer
759e3e4fee0e774ef575e93b67e8dc7955d0c2c2]
[ +0.000330] lttng_bytecode_validate_load+0x32/0x1e0 [lttng_tracer
759e3e4fee0e774ef575e93b67e8dc7955d0c2c2]
[ +0.000183] lttng_enabler_link_bytecode+0x135/0x5a0 [lttng_tracer
759e3e4fee0e774ef575e93b67e8dc7955d0c2c2]
[ +0.000132] lttng_sync_event_list+0xef/0x650 [lttng_tracer
759e3e4fee0e774ef575e93b67e8dc7955d0c2c2]
[ +0.000123] ? __wake_up_common+0x73/0x180
[ +0.000004] lttng_session_enable+0x3e/0x130 [lttng_tracer
759e3e4fee0e774ef575e93b67e8dc7955d0c2c2]
[ +0.000121] lttng_session_ioctl+0x5db/0x720 [lttng_tracer
759e3e4fee0e774ef575e93b67e8dc7955d0c2c2]
[ +0.000120] ? __slab_free+0xf1/0x330
[ +0.000004] ? __scm_recv_common.isra.0+0x144/0x180
[ +0.000004] ? unix_stream_read_generic+0x233/0xb60
[ +0.000006] __x64_sys_ioctl+0x94/0xd0
[ +0.000004] do_syscall_64+0x5d/0x90
[ +0.000004] ? switch_fpu_return+0x50/0xe0
[ +0.000004] ? exit_to_user_mode_prepare+0x132/0x1e0
[ +0.000003] ? syscall_exit_to_user_mode+0x2b/0x40
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000003] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? syscall_exit_to_user_mode+0x2b/0x40
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? do_syscall_64+0x6c/0x90
[ +0.000002] ? exc_page_fault+0x7f/0x180
[ +0.000003] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Cause
-----
`struct load_op` has a trailing 0-length array `data` member that is
used to refer, in the context of BYTECODE_OP_LOAD_STAR_GLOB_STRING, to
an immediate string operand that follows it.
During the validation of a filtering bytecode, strnlen is properly used
to determine the size of the immediate string operand, with a `maxlen`
parameter that is used to ensure the string operand is contained within
the bytecode (see lttng-bytecode-validator.c:434).
However, recent KSPP-related changes have enabled additional overrun
checks when statically-sized and flexible arrays are used. Those are
enabled when the kernel is built with CONFIG_UBSAN_BOUNDS and/or
CONFIG_FORTIFY_SOURCE configured.
The KBUILD CFLAGS now contain `-fstrict-flex-arrays=3`, which is
recognized by gcc 13+[1] and allows proper coverage of dynamically sized
trailing arrays when those configuration options are used.
With those validations in place, the kernel assumes that the `data`
array is truly of length 0 and it BUGs to warn of an invalid access.
The commit linked above contains a number of links explaining the
rationale for transitioning uses of the trailing zero-length arrays (a
gcc extension) to C99 flexible array members (FAM).
This was discussed at this year's GNU Cauldron [2].
Solution
--------
Uses of zero-length arrays (`foo[0]`) are replaced by flexible array
members (`foo[]`). The only cases that are left untouched are those
where the zero-length array is used to indicate the end of a
structure (i.e. it doesn't indicate that a variable number of elements
follow), see the `metadata_packet_header`, `metadata_record_header`,
`event_notifier_packet_header`, and `event_notifier_record_header`
structures.
It may be desirable to use the new `counted_by` attribute for some of
those in the future (`lttng_kernel_abi_filter_bytecode`,
`lttng_kernel_abi_capture_bytecode`, and `bytecode_runtime`) [3].
Note
----
While this is tagged as a memory handling 'fix', it has no security
implication as far as I can tell. The accesses that are flagged by the
new validations were valid.
This merely allows the runtime validations to understand the memory
layout properly.
[1] https://github.com/torvalds/linux/commit/
df8fc4e934c12b906d08050d7779f292b9c5c6b5
[2] https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=get&target=Most-wanted+Security+Features+in+GCC+for+Linux+Kernel.pdf
[3] https://lwn.net/Articles/930943/
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id39b101aaafe68f8fae6b86cd61806cba8cb1e6a
Kienan Stewart [Tue, 26 Sep 2023 18:45:09 +0000 (14:45 -0400)]
fix: lttng-probe-kvm-x86-mmu build with linux 6.6
A small change was made upstream in `spte.h` that requires
`arch/x86/kvm` to be added to the search path when
building lttng-probe-kvm.x86-mmu.o.
See upstream commit :
commit
d10f3780bc2f80744d291e118c0c8bade54ed3b8
Author: Sean Christopherson <seanjc@google.com>
Date: Tue Aug 8 15:40:59 2023 -0700
KVM: x86/mmu: Include mmu.h in spte.h
Explicitly include mmu.h in spte.h instead of relying on the "parent" to
include mmu.h. spte.h references a variety of macros and variables that
are defined/declared in mmu.h, and so including spte.h before (or instead
of) mmu.h will result in build errors, e.g.
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5c3fc87d3b006cefbcca198e6e15868a342cb8dd
Michael Jeanson [Fri, 18 Aug 2023 15:28:30 +0000 (11:28 -0400)]
fix: built-in lttng with kernel >= v6.1
In kernel v6.1 the list of subdirectories was moved from Makefile to
Kbuild. Adjust our built-in.sh script to detect this change and use the
appropriate file to graft ourself to the kernel build system.
Thanks to Richa Bharti for the initial patch.
See upstream commit:
commit
5750121ae7382ebac8d47ce6d68012d6cd1d7926
Author: Masahiro Yamada <masahiroy@kernel.org>
Date: Sun Sep 25 03:19:10 2022 +0900
kbuild: list sub-directories in ./Kbuild
Use the ordinary obj-y syntax to list subdirectories.
Change-Id: Ifc0f1bdea5ee59b0e0b96cdb31c9c689deb20559
Reported-by: Richa Bharti <Richa.Bharti@siemens.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 7 Jul 2023 17:27:15 +0000 (13:27 -0400)]
fix: ubuntu kinetic kernel range for jdb2
Kinetic introduces a 'lowlatency' kernel with a different ABI number
than the 'generic' flavor, add 2 ranges accordingly.
Change-Id: I89427e30672f3f25b2f6d698d6e1cabfb45d9366
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 7 Jun 2023 14:53:24 +0000 (10:53 -0400)]
Version 2.13.10
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I9d1d358ba28dd0f9f68b54c52e784d002d9bc74c
Michael Jeanson [Fri, 14 Apr 2023 19:09:25 +0000 (15:09 -0400)]
Add support for RHEL 9.1
Change-Id: I2aaa8e385448b1e46c3c16edc4f36f2eb6906e76
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 19 Jul 2022 19:07:22 +0000 (15:07 -0400)]
Add support for RHEL 9.0
Change-Id: Ia01527c3d6243805445734f00f4f2f945efd16e7
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 29 Nov 2022 17:10:17 +0000 (12:10 -0500)]
fix: kallsyms wrapper on CONFIG_PPC64_ELF_ABI_V1
Change-Id: Ibdff5792a1511b678f7776f5d032758db739c5ad
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 7 Mar 2023 16:10:26 +0000 (11:10 -0500)]
fix: net: add location to trace_consume_skb() (v6.3)
See upstream commit :
commit
dd1b527831a3ed659afa01b672d8e1f7e6ca95a5
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Feb 16 15:47:18 2023 +0000
net: add location to trace_consume_skb()
kfree_skb() includes the location, it makes sense
to add it to consume_skb() as well.
Change-Id: I8d871187d90e7fe113a63e209b00aebe0df475f3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 7 Mar 2023 16:26:25 +0000 (11:26 -0500)]
fix: btrfs: pass find_free_extent_ctl to allocator tracepoints (v6.3)
See upstream commit :
commit
cfc2de0fce015d4249c674ef9f5e0b4817ba5c53
Author: Boris Burkov <boris@bur.io>
Date: Thu Dec 15 16:06:31 2022 -0800
btrfs: pass find_free_extent_ctl to allocator tracepoints
The allocator tracepoints currently have a pile of values from ffe_ctl.
In modifying the allocator and adding more tracepoints, I found myself
adding to the already long argument list of the tracepoints. It makes it
a lot simpler to just send in the ffe_ctl itself.
Change-Id: Iab4132a9d3df3a6369591a50fb75374b1e399fa4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 7 Mar 2023 17:05:00 +0000 (12:05 -0500)]
fix: uuid: Decouple guid_t and uuid_le types and respective macros (v6.3)
See upstream commit :
commit
5e6a51787fef20b849682d8c49ec9c2beed5c373
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Tue Jan 24 15:38:38 2023 +0200
uuid: Decouple guid_t and uuid_le types and respective macros
The guid_t type and respective macros are being used internally only.
The uuid_le has its user outside the kernel. Decouple these types and
macros, and make guid_t completely internal type to the kernel.
Change-Id: I8644fd139b0630e9cf18886b84e33bffab1e5abd
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 7 Mar 2023 16:41:14 +0000 (11:41 -0500)]
fix: mm: introduce vma->vm_flags wrapper functions (v6.3)
See upstream commit :
commit
bc292ab00f6c7a661a8a605c714e8a148f629ef6
Author: Suren Baghdasaryan <surenb@google.com>
Date: Thu Jan 26 11:37:47 2023 -0800
mm: introduce vma->vm_flags wrapper functions
vm_flags are among VMA attributes which affect decisions like VMA merging
and splitting. Therefore all vm_flags modifications are performed after
taking exclusive mmap_lock to prevent vm_flags updates racing with such
operations. Introduce modifier functions for vm_flags to be used whenever
flags are updated. This way we can better check and control correct
locking behavior during these updates.
Change-Id: I2cf662420d9d7748e5e310d3ea4bac98ba7d7f94
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 3 Mar 2023 15:39:24 +0000 (10:39 -0500)]
Version 2.13.9
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie5399d8f24102ee78aa6950222aa64289bbdb6ed
Michael Jeanson [Wed, 18 Jan 2023 21:32:04 +0000 (16:32 -0500)]
fix: jbd2: use the correct print format (v5.4.229)
See upstream commit :
commit
ecb9d0d2e123874bcdd2efdecda0f4e0c3dc566d
Author: Bixuan Cui <cuibixuan@linux.alibaba.com>
Date: Tue Oct 11 19:33:44 2022 +0800
jbd2: use the correct print format
[ Upstream commit
d87a7b4c77a997d5388566dd511ca8e6b8e8a0a8 ]
The print format error was found when using ftrace event:
<...>-1406 [000] ....
23599442.895823: jbd2_end_commit: dev 252,8 transaction -
1866216965 sync 0 head -
1866217368
<...>-1406 [000] ....
23599442.896299: jbd2_start_commit: dev 252,8 transaction -
1866216964 sync 0
Use the correct print format for transaction, head and tid.
Change-Id: Ieee3d39ed1f2515e096e87d18b5ea8f921c54bd0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 17 Jan 2023 17:16:04 +0000 (12:16 -0500)]
fix: jbd2 upper bound for v5.10.163
Use the correct upper bound of 5,11,0.
Change-Id: I435b44b940c7346ed8c3ef0d445365ed156702d0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 17 Jan 2023 16:03:12 +0000 (11:03 -0500)]
fix: jbd2: use the correct print format (v5.10.163)
See upstream commit :
commit
d87a7b4c77a997d5388566dd511ca8e6b8e8a0a8
Author: Bixuan Cui <cuibixuan@linux.alibaba.com>
Date: Tue Oct 11 19:33:44 2022 +0800
jbd2: use the correct print format
The print format error was found when using ftrace event:
<...>-1406 [000] ....
23599442.895823: jbd2_end_commit: dev 252,8 transaction -
1866216965 sync 0 head -
1866217368
<...>-1406 [000] ....
23599442.896299: jbd2_start_commit: dev 252,8 transaction -
1866216964 sync 0
Use the correct print format for transaction, head and tid.
Change-Id: I7601f5cbb86495c2607be7b11e02724c90b3ebf9
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 16 Jan 2023 20:01:51 +0000 (15:01 -0500)]
fix: btrfs: move accessor helpers into accessors.h (v6.2)
See upstream commit :
commit
07e81dc94474eb62705c6f96d9ab1a5a797b8703
Author: Josef Bacik <josef@toxicpanda.com>
Date: Wed Oct 19 10:51:00 2022 -0400
btrfs: move accessor helpers into accessors.h
This is a large patch, but because they're all macros it's impossible to
split up. Simply copy all of the item accessors in ctree.h and paste
them in accessors.h, and then update any files to include the header so
everything compiles.
Change-Id: I1f0876dd8b7a8687f6802b60c3e3baabd017cc52
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 13 Jan 2023 21:08:06 +0000 (16:08 -0500)]
Version 2.13.8
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3686e4e44475aa8cd6aa435f74e4994f5e36da2e
Michael Jeanson [Thu, 12 Jan 2023 18:52:22 +0000 (13:52 -0500)]
fix: jbd2: use the correct print format
See upstream commit :
commit
d87a7b4c77a997d5388566dd511ca8e6b8e8a0a8
Author: Bixuan Cui <cuibixuan@linux.alibaba.com>
Date: Tue Oct 11 19:33:44 2022 +0800
jbd2: use the correct print format
The print format error was found when using ftrace event:
<...>-1406 [000] ....
23599442.895823: jbd2_end_commit: dev 252,8 transaction -
1866216965 sync 0 head -
1866217368
<...>-1406 [000] ....
23599442.896299: jbd2_start_commit: dev 252,8 transaction -
1866216964 sync 0
Use the correct print format for transaction, head and tid.
Change-Id: Ic053f0e0c1e24ebc75bae51d07696aaa5e1c0094
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 1 Dec 2022 16:33:20 +0000 (11:33 -0500)]
Fix: in_x32_syscall was introduced in v4.7.0
Prior to v4.7.0, is_x32_task() was the API to query whether the current
system call is following the x32 ABI.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I783bd3bb46ec5e863ae209f79cee2f1bb415e661
Mathieu Desnoyers [Wed, 30 Nov 2022 20:41:02 +0000 (15:41 -0500)]
Explicitly skip tracing x32 system calls
x86 x32 system calls are not supported by LTTng. They are currently not
traced simply because their system call number is beyond the range of
NR_compat_syscalls.
However, this mostly happens by accident rather than by design.
Enforce this with an explicit check for in_x32_syscall(), which clearly
documents that those are not supported.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1235c32c5cf03612bf9c36785cf7c4f8f49d292b
Michael Jeanson [Thu, 24 Nov 2022 19:25:33 +0000 (14:25 -0500)]
fix: kallsyms wrapper on ppc64el
The 'PPC64_ELF_ABI_v2' macro in 'asm/types.h' was removed in v5.19 and
replaced by a config option 'CONFIG_PPC64_ELF_ABI_V2'.
See upstream commit :
commit
5b89492c03e5c0a2c259b97d7d4c1bb9b02860aa
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date: Mon May 9 07:36:08 2022 +0200
powerpc: Finalise cleanup around ABI use
Now that we have CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2,
get rid of all indirect detection of ABI version.
Link: https://lore.kernel.org/r/709d9d69523c14c8a9fba4486395dca0f2d675b1.1652074503.git.christophe.leroy@csgroup.eu
Change-Id: Ibd00e35cab5516a6224bdfa5a6b540119b42dc55
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 11 Nov 2022 15:47:54 +0000 (10:47 -0500)]
fix: Adjust ranges for RHEL 8.6 kernels
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0b2c90f3678d0fb4503f61f336a4af185de2b39d
Michael Jeanson [Tue, 8 Nov 2022 16:26:46 +0000 (11:26 -0500)]
fix: kvm-x86 requires CONFIG_KALLSYMS_ALL
Fixes: #1363
Change-Id: I6da15f77123c393ccb9109b562c7c8dc5bbb96a5
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 17 Oct 2022 17:49:51 +0000 (13:49 -0400)]
fix: mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using (v6.1)
See uptream commit:
commit
2c1d697fb8ba6d2d44f914d4268ae1ccdf025f1b
Author: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Date: Wed Aug 17 19:18:24 2022 +0900
mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
Drop kmem_alloc event class, and define kmalloc and kmem_cache_alloc
using TRACE_EVENT() macro.
And then this patch does:
- Do not pass pointer to struct kmem_cache to trace_kmalloc.
gfp flag is enough to know if it's accounted or not.
- Avoid dereferencing s->object_size and s->size when not using kmem_cache_alloc event.
- Avoid dereferencing s->name in when not using kmem_cache_free event.
- Adjust s->size to SLOB_UNITS(s->size) * SLOB_UNIT in SLOB
Change-Id: Icd7925731ed4a737699c3746cb7bb7760a4e8009
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 30 Sep 2022 21:11:06 +0000 (17:11 -0400)]
Version 2.13.7
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I28620e23756e3e91965839801ea8828b3f2b919c
Mathieu Desnoyers [Fri, 30 Sep 2022 20:19:16 +0000 (16:19 -0400)]
Fix: handle integer capture page faults as skip field
Now that we have the appropriate save/restore position mechanism for
error handling in place, we can handle page faults on integer
copy-from-user by skipping the offending captured field entirely rather
than relying on an arbitrary 0 value.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4ec6243d96753ce7e9c6230563713aeacb126567
This page took 0.058428 seconds and 4 git commands to generate.