A print format error was found when using ftrace events:
<...>-1406 [000] .... 23599442.895823: jbd2_end_commit: dev 252,8 transaction -1866216965 sync 0 head -1866217368
<...>-1406 [000] .... 23599442.896299: jbd2_start_commit: dev 252,8 transaction -1866216964 sync 0
Use the correct print format for transaction, head and tid.
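As a quick illustration (plain user-space C, not the kernel fix itself), an unsigned tid printed with a signed conversion specifier comes out negative, which is exactly the symptom in the trace above:

    #include <stdio.h>

    int main(void)
    {
        /* 2428750331 is the unsigned value behind the -1866216965
         * seen in the jbd2_end_commit line above. */
        unsigned int tid = 2428750331u;

        printf("%d\n", tid);    /* wrong: prints -1866216965 */
        printf("%u\n", tid);    /* correct: prints 2428750331 */
        return 0;
    }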
Change-Id: Ic053f0e0c1e24ebc75bae51d07696aaa5e1c0094
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
x86 x32 system calls are not supported by LTTng. They are currently not
traced simply because their system call number is beyond the range of
NR_compat_syscalls.
However, this mostly happens by accident rather than by design.
Enforce this with an explicit check for in_x32_syscall(), which clearly
documents that those are not supported.
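A minimal sketch of such a check, assuming it sits at the top of the syscall entry/exit probes (the exact config guard may differ across kernel versions):

    #ifdef CONFIG_X86_X32_ABI
        /* x32 syscall numbers are beyond NR_compat_syscalls;
         * reject them explicitly rather than by accident. */
        if (in_x32_syscall())
            return;
    #endif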
mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
Drop kmem_alloc event class, and define kmalloc and kmem_cache_alloc
using TRACE_EVENT() macro.
And then this patch does:
- Do not pass pointer to struct kmem_cache to trace_kmalloc.
gfp flag is enough to know if it's accounted or not.
- Avoid dereferencing s->object_size and s->size when not using the kmem_cache_alloc event.
- Avoid dereferencing s->name when not using the kmem_cache_free event.
- Adjust s->size to SLOB_UNITS(s->size) * SLOB_UNIT in SLOB
Change-Id: Icd7925731ed4a737699c3746cb7bb7760a4e8009
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: handle integer capture page faults as skip field
Now that we have the appropriate save/restore position mechanism for
error handling in place, we can handle page faults on integer
copy-from-user by skipping the offending captured field entirely rather
than relying on an arbitrary 0 value.
Fix: honor "user" attribute for array/sequence of user integers
The macro _lttng_kernel_static_type_integer_from_type() should map to
_lttng_kernel_static_type_integer() to pass the "_user" attribute.
Otherwise, userspace fields such as pipe2's system call fildes field (a
ctf_user_array()) can trigger NULL pointer exceptions and read arbitrary
kernel memory if the pipe2 system call receives a bogus pointer as input
while filtering/capture is accessing this field.
He Zhe [Tue, 27 Sep 2022 07:59:42 +0000 (15:59 +0800)]
wrapper: powerpc64: fix kernel crash caused by do_get_kallsyms
Kernel crashes on powerpc64 ABIv2 as follows when lttng_tracer initializes,
since do_get_kallsyms in lttng_wrapper fails to return a proper address of
kallsyms_lookup_name.
root@qemuppc64:~# lttng create trace_session --live -U net://127.0.0.1
Spawning a session daemon
lttng_kretprobes: loading out-of-tree module taints kernel.
BUG: Unable to handle kernel data access on read at 0xfffffffffffffff8
Faulting instruction address: 0xc0000000001f6fd0
Oops: Kernel access of bad area, sig: 11 [#1]
<snip>
NIP [c0000000001f6fd0] module_kallsyms_lookup_name+0xf0/0x180
LR [c0000000001f6f28] module_kallsyms_lookup_name+0x48/0x180
Call Trace:
module_kallsyms_lookup_name+0x34/0x180 (unreliable)
kallsyms_lookup_name+0x258/0x2b0
wrapper_kallsyms_lookup_name+0x4c/0xd0 [lttng_wrapper]
wrapper_get_pfnblock_flags_mask_init+0x28/0x60 [lttng_wrapper]
lttng_events_init+0x40/0x344 [lttng_tracer]
do_one_initcall+0x78/0x340
do_init_module+0x6c/0x2f0
__do_sys_finit_module+0xd0/0x120
system_call_exception+0x194/0x2f0
system_call_vectored_common+0xe8/0x278
<snip>
do_get_kallsyms makes use of kprobe_register and in turn kprobe_lookup_name
to get the address of the kernel function kallsyms_lookup_name. In case of
PPC64_ELF_ABI_v2, when kprobes are placed at function entry,
kprobe_lookup_name adjusts the global entry point of the function returned
by kallsyms_lookup_name to the local entry point (at some fixed offset from
the global one). This adjustment is needed for kprobes to work properly.
Global and local entry points are defined by the powerpc64 ABIv2.
When the local entry point is given, some instructions at the beginning of
the function are skipped, which causes the above kernel crash. We just
want to make a simple function call, which needs the global entry point.
This patch adds 4 bytes, the length of one instruction, to the address of
kallsyms_lookup_name so that it will not trigger the global-to-local
adjustment, and then subtracts 4 bytes from the returned address. See the
following kernel change for more details.
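A sketch of the resulting do_get_kallsyms() logic (hypothetical details; the real wrapper's error handling differs):

    #include <linux/kprobes.h>

    static struct kprobe kp = {
        .symbol_name = "kallsyms_lookup_name",
    #ifdef PPC64_ELF_ABI_v2
        /* Probe one instruction past the symbol so that
         * kprobe_lookup_name() does not rewrite the global entry
         * point to the local one. */
        .offset = 4,
    #endif
    };

    static unsigned long do_get_kallsyms(void)
    {
        unsigned long addr = 0;

        if (!register_kprobe(&kp)) {
            addr = (unsigned long)kp.addr;
            unregister_kprobe(&kp);
    #ifdef PPC64_ELF_ABI_v2
            addr -= 4;  /* back to the global entry point */
    #endif
        }
        return addr;
    }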
Validate that the buffer length is large enough to hold empty capture
fields.
If the buffer is initially not large enough to hold empty capture fields
for each field to capture, discard the notification.
If after capturing a field there is not enough room anymore in the
buffer to write empty capture fields, skip the offending large field by
writing an empty capture field in its place.
Now that we have the appropriate save/restore position mechanism for
error handling in place, we can handle page faults on copy-from-user by
skipping the offending captured field entirely rather than relying on an
empty string.
The "user" attribute (copy from userspace) is not applied to
sequence/array of integer field capture within event notifications. This
could eventually lead to unsafe copy of integers from user-space.
Currently, the only array/sequence of integers which are read from
user-space are the arguments to sys_select (e.g. `readfds` field). Those
are expressed as "custom" fields, which are skipped by the filter and
capture bytecode.
This is therefore not an issue with the current instrumentation, but we
should properly handle this nevertheless.
The "user" attribute (copy from userspace) is not applied to string
field capture within event notifications. This leads to copy of strings
from user-space (e.g. `filename` field from sys_open) to end up using
strlen/memcpy on user-space data. This can cause kernel OOPS due to
unhandled page faults, and it also allows reading kernel memory through
the event notification capture mechanism. As a result, the users within
the `tracing` group can read arbitrary kernel memory.
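The shape of the safe pattern is to route every user-space access through fault-safe accessors; a minimal sketch using the kernel's strncpy_from_user_nofault() (LTTng's actual helpers differ):

    #include <linux/uaccess.h>

    /* Read a user-space string without risking an unhandled page
     * fault or a read of kernel memory through a bogus pointer. */
    static long capture_user_string(char *dst, size_t dst_len,
                                    const char __user *ustr)
    {
        long len;

        len = strncpy_from_user_nofault(dst, ustr, dst_len);
        if (len < 0)
            dst[0] = '\0';  /* fault: capture an empty string */
        return len;
    }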
Fix: bytecode interpreter: LOAD_FIELD: handle user fields
The instructions for recursive traversal through composed types
are used by the capture bytecode, and by filter expressions which
access fields nested within composed types.
Instructions BYTECODE_OP_LOAD_FIELD_STRING and
BYTECODE_OP_LOAD_FIELD_SEQUENCE were leaving the "user" attribute
uninitialized. Initialize those to 0.
The handling of userspace strings and integers is missing in LOAD_FIELD
instructions. Therefore, ensure that the specialization leaves the
generic LOAD_FIELD instruction in place for userspace input.
Add a "user" attribute to:
- struct bytecode_get_index_data elem field (produced by the
specialization),
- struct vstack_load used by the specialization,
- struct load_ptr used by the interpreter,
- struct lttng_interpreter_output used by the event notification
capture.
Use this "user" attribute in dynamic_load_field() for integer, string
and string_sequence object types to ensure that the proper
userspace-aware accesses are performed when loading those fields.
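An illustrative sketch of what honoring that attribute means when loading an integer field (the real struct load_ptr layout differs; the names here are hypothetical):

    #include <linux/types.h>
    #include <linux/uaccess.h>

    struct load_ptr_sketch {
        bool user;                  /* the "user" attribute */
        const void *kptr;           /* kernel-space address */
        const void __user *uptr;    /* user-space address */
    };

    static u64 load_u64_field(const struct load_ptr_sketch *ptr)
    {
        u64 v;

        if (!ptr->user)
            return *(const u64 *)ptr->kptr;
        /* fault-safe read: never dereference user memory directly */
        if (copy_from_user_nofault(&v, ptr->uptr, sizeof(v)))
            return 0;   /* page fault: field is skipped */
        return v;
    }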
The "user" field attribute (copy from userspace) is not taken into
account in the bytecode specialization and interpreter recursive
traversal through composed types (LOAD_FIELD bytecode instructions).
Those are currently used by the event notification capture bytecode, and
by filter expressions which access fields nested within composed types.
Move the "user" attribute from the event fields to the integer and
string types. This will allow ensuring that the bytecode specialization,
interpreter and event notification output capture have access to this
user attribute even in nested types (e.g. arrays, sequences) in a
subsequent change.
Fix: incorrect stub prototypes when CONFIG_HAVE_SYSCALL_TRACEPOINTS=n
The stub prototypes do not match the expected argument types, and extra
erroneous semicolons are present. This has been fixed by a refactoring
in the master branch:
mm/tracing: add 'accounted' entry into output of allocation tracepoints
Slab caches marked with SLAB_ACCOUNT force accounting for every
allocation from this cache even if __GFP_ACCOUNT flag is not passed.
Unfortunately, at the moment this flag is not visible in ftrace output,
and this makes it difficult to analyze the accounted allocations.
This patch adds boolean "accounted" entry into trace output,
and set it to 'true' for calls used __GFP_ACCOUNT flag and
for allocations from caches marked with SLAB_ACCOUNT.
Set it to 'false' if accounting is disabled in configs.
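A sketch of the test backing the new field, close to the upstream TP_fast_assign() logic (note that struct kmem_cache internals are only visible within mm/):

    #include <linux/slab.h>

    static bool alloc_is_accounted(const struct kmem_cache *s,
                                   gfp_t gfp_flags)
    {
        if (!IS_ENABLED(CONFIG_MEMCG_KMEM))
            return false;   /* accounting disabled in configs */
        return (gfp_flags & __GFP_ACCOUNT) ||
               (s && (s->flags & SLAB_ACCOUNT));
    }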
Change-Id: I023a355b94e79931499e1a1f648e2649d6dd3c89
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Commit 2a222ca992c3 ("fs: have submit_bh users pass in op and flags
separately") renamed the jbd2_write_superblock() 'write_op' argument into
'write_flags'. Propagate this change to the jbd2_write_superblock()
callers. Additionally, change the type of 'write_flags' into blk_opf_t.
Change-Id: I65b8af95b3d07438763dd94f409c197e3b400733
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 10 Aug 2022 15:07:14 +0000 (11:07 -0400)]
fix: tie compaction probe build to CONFIG_COMPACTION
The definition of 'struct compact_control' in 'mm/internal.h' depends on
CONFIG_COMPACTION being defined. Only build the compaction probe when
this configuration option is enabled.
Thanks to Bruce Ashfield <bruce.ashfield@gmail.com> for reporting this
issue.
Change-Id: I81e77aa9c1bf10452c152d432fe5224df0db42c9
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
So, change the unsigned type to a signed type in the trace event. After
applying this patch, the cpu number will be printed as -1 instead of
4294967295, as follows.
Currently, the tracepoint mm_page_alloc_zone_locked() doesn't show correct
information.
First, when alloc_flags has ALLOC_HARDER/ALLOC_CMA, the page can be
allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, the
tracepoint uses the requested migration type, not MIGRATE_HIGHATOMIC or
MIGRATE_CMA.
Second, after commit 44042b4498728 ("mm/page_alloc: allow high-order pages
to be stored on the per-cpu lists"), the percpu list can store high-order
pages. But the tracepoint determines whether it is a refill of the percpu
list by comparing the requested order with 0.
To handle these problems, make mm_page_alloc_zone_locked() only be called
by __rmqueue_smallest with the correct migration type. With a new argument
called percpu_refill, it can roughly show whether it is a refill of the
percpu list.
Change-Id: I2e4a57393757f12b9c5a4566c4d1102ee2474a09
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: event notifier: racy use of last subbuffer record
The lttng-modules event notifiers use the ring buffer internally. When
reading the payload of the last event in a sub-buffer with a multi-part
read (e.g. two read system calls), we should not "put" the sub-buffer
holding this data, else continuing reading the data in the following
read system call can observe corrupted data if it has been concurrently
overwritten by the producer.
Fix: bytecode interpreter context_get_index() leaves byte order uninitialized
Observed Issue
==============
When using the event notification capture feature to capture a context
field, e.g. '$ctx.cpu_id', the captured value is often observed in
reverse byte order.
Cause
=====
Within the bytecode interpreter, context_get_index() leaves the "rev_bo"
field uninitialized in the top of stack.
This only affects the event notification capture bytecode because the
BYTECODE_OP_GET_SYMBOL bytecode instruction (as of lttng-tools 2.13)
is only generated for capture bytecode in lttng-tools. Therefore, only
capture bytecode targeting contexts are affected by this issue. The
reason why lttng-tools uses the "legacy" bytecode instruction to get
context (BYTECODE_OP_GET_CONTEXT_REF) for the filter bytecode is to
preserve backward compatibility of filtering when interacting with
applications linked against LTTng-UST 2.12.
Solution
========
Initialize the rev_bo field based on the context field integer type's
reverse_byte_order field.
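The essence of the fix in context_get_index(), with hypothetical local names for the top-of-stack pointer and the context field's integer type:

    /* Propagate the context type's byte order instead of leaving
     * the top-of-stack rev_bo field uninitialized. */
    ptr->rev_bo = integer_type->reverse_byte_order;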
Michael Jeanson [Tue, 31 May 2022 19:24:48 +0000 (15:24 -0400)]
fix: 'random' tracepoints removed in stable kernels
The upstream commit 14c174633f349cb41ea90c2c0aaddac157012f74 removing
the 'random' tracepoints is being backported to multiple stable kernel
branches; I don't see how that qualifies as a fix, but here we are.
Use the presence of 'include/trace/events/random.h' in the kernel source
tree instead of the rather tortuous version check to determine if we
need to build 'lttng-probe-random.ko'.
Change-Id: I8f5f2f4c9e09c61127c49c7949b22dd3fab0460d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
These explicit tracepoints aren't really used and show signs of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0b7eb8aa78b5bd2039e20ae3e1da4c5eb9018789
sched/tracing: Append prev_state to tp args instead
Commit fa2c3254d7cf (sched/tracing: Don't re-read p->state when emitting
sched_switch event, 2022-01-20) added a new prev_state argument to the
sched_switch tracepoint, before the prev task_struct pointer.
This reordering of arguments broke BPF programs that use the raw
tracepoint (e.g. tp_btf programs). The type of the second argument has
changed and existing programs that assume a task_struct* argument
(e.g. for bpf_task_storage access) will now fail to verify.
If we instead append the new argument to the end, all existing programs
would continue to work and can conditionally extract the prev_state
argument on supported kernel versions.
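With the argument appended, the upstream tracepoint prototype becomes:

    TP_PROTO(bool preempt,
             struct task_struct *prev,
             struct task_struct *next,
             unsigned int prev_state),

so existing tp_btf programs keep seeing task_struct pointers in the argument positions they were verified against.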
Change-Id: Ife2ec88a8bea2743562590cbd357068d7773863f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
mm: compaction: cleanup the compaction trace events
As Steven suggested [1], we should access the pointers from the trace
event to avoid dereferencing them to the tracepoint function when the
tracepoint is disabled.
[1] https://lkml.org/lkml/2021/11/3/409
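A sketch of the pattern with a hypothetical event: the pointer travels through TP_PROTO/TP_ARGS and is only dereferenced inside TP_fast_assign(), which runs solely when the event is enabled:

    TRACE_EVENT(mm_compaction_sketch,
        TP_PROTO(struct compact_control *cc),
        TP_ARGS(cc),
        TP_STRUCT__entry(
            __field(unsigned long, free_pfn)
        ),
        TP_fast_assign(
            /* dereference deferred until the event fires */
            __entry->free_pfn = cc->free_pfn;
        ),
        TP_printk("free_pfn=0x%lx", __entry->free_pfn)
    );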
Change-Id: I6c08250df8596e8dbc76780ae5d95c899c12e6fe
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Use rethook for kretprobe function return hooking if the arch sets
CONFIG_HAVE_RETHOOK=y. In this case, CONFIG_KRETPROBE_ON_RETHOOK is
set to 'y' automatically, and the kretprobe internal data fields
switches to use rethook. If not, it continues to use kretprobe
specific function return hooks.
Change-Id: I2b7670dc04e4769c1e3c372582ad2f555f6d7a66
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
These explicit tracepoints aren't really used and show signs of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.
Change-Id: I3b8c3e2732e7efdd76ce63204ac53a48784d0df6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
There is no good reason to keep genhd.h separate from the main blkdev.h
header that includes it. So fold the contents of genhd.h into blkdev.h
and remove genhd.h entirely.
Change-Id: I7cf2aaa3a4c133320b95f2edde49f790f9515dbd
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Drop the export of kvm_x86_ops now it is no longer referenced by SVM or
VMX. Disallowing access to kvm_x86_ops is very desirable as it prevents
vendor code from incorrectly modifying hooks after they have been set by
kvm_arch_hardware_setup(), and more importantly after each function's
associated static_call key has been updated.
No functional change intended.
Change-Id: Icee959a984570f95ab9b71354225b5aeecea7da0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The commit "fix: mm: compaction: fix the migration stats in trace_mm_compaction_migratepages() (v5.17)"
Triggers this warning:
LTTng: event provider mismatch: The event name needs to start with provider name + _ + one or more letter, provider: compaction, event name: mm_compaction_migratepages
Sampling the discarded events count in the buffer_end callback is done
out of order, and may therefore include increments performed by following
events (in following packets) if the thread doing the end-of-packet
event write is interrupted for a long time.
Sampling the event discarded counts before reserving space for the last
event in a packet, and keeping this as part of the private ring buffer
context, should fix this race.
In lttng-modules, this scenario would only happen if an interrupt
handler produces many events, when nested over an event between its
reserve and commit. Note that if lttng-modules supports faultable
tracepoints in the future, this may become easier to trigger due to
preemption.
Syscall exit should fetch the "sc_out" parameters. This issue was
introduced by commit e42c4f49c15b ("Split syscall tracepoint generation in their own files").
Fixes: e42c4f49c15b ("Split syscall tracepoint generation in their own files")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib34912005323ea34b6d11ca9acc5edf491649cdd
Rename SKB_DROP_REASON_SOCKET_FILTER, which is used as the reason for
skb drops out of the socket filter, before it's part of a released
kernel. It will be used for more protocols than just TCP in future
series.
random: rather than entropy_store abstraction, use global
Originally, the RNG used several pools, so having things abstracted out
over a generic entropy_store object made sense. These days, there's only
one input pool, and then an uneven mix of usage via the abstraction and
usage via &input_pool. Rather than this uneasy mixture, just get rid of
the abstraction entirely and have things always use the global. This
simplifies the code and makes reading it a bit easier.
Change-Id: I1a2a14d7b6e69a047804e1e91e00fe002f757431
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
btrfs: pass fs_info to trace_btrfs_transaction_commit
The root on the trans->root can be anything, and generally we're
committing from the transaction kthread so it's usually the tree_root.
Change this to just take an fs_info, and to maintain compatibility
simply put the ROOT_TREE_OBJECTID as the root objectid for the
tracepoint. This will allow us to remove trans->root.
Change-Id: Ie5a4804330edabffac0714fcb9c25b8c8599e424
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
mm: compaction: fix the migration stats in trace_mm_compaction_migratepages()
Now that migrate_pages() has changed to return the number of {normal
page, THP, hugetlb} pages instead, we should not use the return value
to calculate the number of pages migrated successfully. Instead we can
just use the 'nr_succeeded' which indicates the number of normal pages
migrated successfully to calculate the non-migrated pages in
trace_mm_compaction_migratepages().
block: don't call blk_status_to_errno in blk_update_request
We only need to call it to resolve the blk_status_t -> errno mapping for
tracing, so move the conversion into the tracepoints that are not called
at all when tracing isn't enabled.
Change-Id: Ic556cee1d82e44a93a1467f55d45b6e17a48d387
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
KVM: MMU: change tracepoints arguments to kvm_page_fault
Pass struct kvm_page_fault to tracepoints instead of extracting the
arguments from the struct. This also lets the kvm_mmu_spte_requested
tracepoint pick the gfn directly from fault->gfn, instead of using
the address.
Change-Id: I5ee3f344a8d1cd0ed185cdeecb3a3121183c9b43
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info
Extend the get_exit_info static call to provide the reason for the VM
exit. Modify relevant trace points to use this rather than extracting
the reason in the caller.
Change-Id: I28903e658eb7cbfc6666e35ba4cffba5e49d1445
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
There is an issue when associating multiple actions to a single system
call, where the first action is effectively added, but the following
actions are silently ignored.
Cause
=====
This is caused by the creation of the 2nd event being skipped:
lttng_syscall_filter_enable returns -EBUSY because the bit is already
set in the bitmap.
Solution
========
Fixing this double-add-trigger scenario requires changes to how the
lttng_syscall_filter_enable -EBUSY errors are dealt with on event
creation, but the problem runs deeper.
Given that many events may be responsible for requiring a bit to be set
in the bitmap of the syscall filter, we cannot simply clear the bit
whenever one event is disabled.
Otherwise, this will cause situations where all events for a given
system call are disabled (due to the syscall filter bit being cleared)
whenever one single event for that system call is disabled (e.g. 2
triggers on the same system call, but with different actions).
We need to keep track of all events associated with this bit in the
bitmap, and only clear the bit when the reference count reaches 0.
This fixes the double-add-trigger issue by ensuring that no -EBUSY
is returned when creating the second event, and fixes the
double-add-trigger-then-remove-trigger scenario as well.
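A minimal sketch of the reference-counting scheme (hypothetical names; the real state lives in the LTTng syscall filter):

    #include <linux/bitops.h>
    #include <linux/errno.h>
    #include <linux/kernel.h>

    /* One counter per system call number, next to the bitmap. */
    struct filter_state {
        unsigned long *bitmap;
        u32 *refcount;
    };

    static int filter_bit_get(struct filter_state *f, unsigned int nr)
    {
        if (f->refcount[nr] == U32_MAX)
            return -EOVERFLOW;
        /* No -EBUSY: a second event on the same syscall just
         * bumps the counter. */
        if (f->refcount[nr]++ == 0)
            set_bit(nr, f->bitmap);
        return 0;
    }

    static void filter_bit_put(struct filter_state *f, unsigned int nr)
    {
        /* Only the last event clears the bit. */
        if (--f->refcount[nr] == 0)
            clear_bit(nr, f->bitmap);
    }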
Daniel Gomez [Fri, 8 Oct 2021 17:38:02 +0000 (19:38 +0200)]
fix: implicit-int error in EXPORT_SYMBOL_GPL
Add module header to fix error:
error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
[-Werror=implicit-int]
Log:
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.c:54:1:
warning: data definition has no type or storage class
54 EXPORT_SYMBOL_GPL(wrapper_get_bootid);
^~~~~~~~~~~~~~~~~
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.c:54:1:
error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
[-Werror=implicit-int]
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.c:54:1:
warning: parameter names (without types) in function declaration
cc1: some warnings being treated as errors
make[3]: ***
[/workdir/build/tmp/work-shared/qt5222/kernel-source/scripts/Makefile.build:271:
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.o]
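The fix is a one-line include at the top of the file; EXPORT_SYMBOL_GPL() is only declared once <linux/module.h> (which pulls in <linux/export.h>) is included:

    /* src/wrapper/random.c */
    #include <linux/module.h>   /* for EXPORT_SYMBOL_GPL() */

    EXPORT_SYMBOL_GPL(wrapper_get_bootid);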
Signed-off-by: Daniel Gomez <daniel@qtec.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If1b96f6fcc19988b94dc09b12d071f2c61491b83
Michael Jeanson [Mon, 13 Sep 2021 18:16:22 +0000 (14:16 -0400)]
fix: Revert "Makefile: Enable -Wimplicit-fallthrough for Clang" (v5.15)
Starting with v5.15, "-Wimplicit-fallthrough=5" was added to the build
flags which requires the use of "__attribute__((__fallthrough__))" to
annotate fallthrough case statements.
It turns out that the problem with the clang -Wimplicit-fallthrough
warning is not about the kernel source code, but about clang itself, and
that the warning is unusable until clang fixes its broken ways.
In particular, when you enable this warning for clang, you not only get
warnings about implicit fallthroughs. You also get this:
warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough]
which is completely broken because it
(a) doesn't even tell you where the problem is (seriously: no line
numbers, no filename, no nothing).
(b) is fundamentally broken anyway, because there are perfectly valid
reasons to have a fallthrough statement even if it turns out that
it can perhaps not be reached.
In the kernel, an example of that second case is code in the scheduler:
    switch (state) {
    case cpuset:
        if (IS_ENABLED(CONFIG_CPUSETS)) {
            cpuset_cpus_allowed_fallback(p);
            state = possible;
            break;
        }
        fallthrough;
    case possible:
where if CONFIG_CPUSETS is enabled you actually never hit the
fallthrough case at all. But that in no way makes the fallthrough
wrong.
So the warning is completely broken, and enabling it for clang is a very
bad idea.
In the meantime, we can keep the gcc option enabled, and make the gcc
build use
-Wimplicit-fallthrough=5
which means that we will at least continue to require a proper
fallthrough statement, and that gcc won't silently accept the magic
comment versions. Because gcc does this all correctly, and while the odd
"=5" part is kind of obscure, it's documented in [1]:
"-Wimplicit-fallthrough=5 doesn’t recognize any comments as
fallthrough comments, only attributes disable the warning"
so if clang ever fixes its bad behavior we can try enabling it there again.
Change-Id: Iea69849592fb69ac04fb9bb28efcd6b8dce8ba88
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The counting 'rwsem' hackery of get|put_online_cpus() is going to be
replaced by percpu rwsem.
Rename the functions to make it clear that it's locking and not some
refcount style interface. These new functions will be used for the
preparatory patches which make the code ready for the percpu rwsem
conversion.
Rename all instances in the cpu hotplug code while at it.
Change-Id: I5a37cf5afc075a402b7347989fac637dfa60a1ed
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change the type and name of task_struct::state. Drop the volatile and
shrink it to an 'unsigned int'. Rename it in order to find all uses
such that we can use READ_ONCE/WRITE_ONCE as appropriate.
Change-Id: I3a379192d6b977753fe58d4f67833a78dd7a0a47
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
btrfs: pass btrfs_inode to btrfs_writepage_endio_finish_ordered()
There is a pretty bad abuse of btrfs_writepage_endio_finish_ordered() in
end_compressed_bio_write().
It passes compressed pages to btrfs_writepage_endio_finish_ordered(),
which is only supposed to accept inode pages.
Thankfully the important info here is the inode, so let's pass
btrfs_inode directly into btrfs_writepage_endio_finish_ordered(), and
make @page parameter optional.
By this, end_compressed_bio_write() can happily pass page=NULL while
still getting everything done properly.
Also, to cooperate with such modification, replace @page parameter for
trace_btrfs_writepage_end_io_hook() with btrfs_inode.
Although this removes page_index info, the existing start/len should be
enough for most usage.
Change-Id: If96e99c2d9533d96d9d1aa6460bb7fd3ac9ed7ab
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 13 May 2021 14:30:24 +0000 (10:30 -0400)]
Disable x86 error code bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Wed, 12 May 2021 18:20:44 +0000 (14:20 -0400)]
Disable mmap bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:22:12 +0000 (17:22 -0400)]
Disable block rwbs bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:15:15 +0000 (17:15 -0400)]
Disable sched_switch bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:09:07 +0000 (17:09 -0400)]
Disable open[at] bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:05:37 +0000 (17:05 -0400)]
Disable fcntl bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 20:16:18 +0000 (16:16 -0400)]
Disable clone bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Wed, 12 May 2021 20:21:50 +0000 (16:21 -0400)]
Add experimental bitwise enum config option
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Change-Id: Id45c7a78b280a7f35bbeafb80f2f6f5aa1ebbdc9
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The only use of I_DIRTY_TIME_EXPIRE is to detect in
__writeback_single_inode() that the inode got there because the flush
worker decided it's time to write back the dirty inode time stamps
(either because we are syncing or because of age). However, we can
detect this directly in __writeback_single_inode() and there's no need
for the strange propagation with the I_DIRTY_TIME_EXPIRE flag.
Change-Id: I6e7c0ced13acd4fcd88bcd572d0ba1f9b254c58c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sync `show_inode_state()` macro with Ubuntu 4.15 kernel
The following commit changed the `show_inode_state()` macro which
triggered a warning on our CI build:
commit 63388062bea96e5cd8b8d7abf7b7142f8666ca1f
Author: Jan Kara <jack@suse.cz>
Date: Mon Jan 25 12:37:43 2021 -0800
writeback: Drop I_DIRTY_TIME_EXPIRE
Also, this commit adds a comment to clarify why we keep these
`#if/#elif` even though we don't use the macro.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2dd53a1a286ab8a431977bda6cde01f700f0c7d9