A print format error was found when using ftrace events:
<...>-1406 [000] .... 23599442.895823: jbd2_end_commit: dev 252,8 transaction -1866216965 sync 0 head -1866217368
<...>-1406 [000] .... 23599442.896299: jbd2_start_commit: dev 252,8 transaction -1866216964 sync 0
Use the correct print format for transaction, head and tid.
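As a quick illustration (plain user-space C, not the kernel fix itself), an unsigned tid printed with a signed conversion specifier comes out negative, which is exactly the symptom in the trace above:

    #include <stdio.h>

    int main(void)
    {
        /* 2428750331 is the unsigned value behind the -1866216965
         * seen in the jbd2_end_commit line above. */
        unsigned int tid = 2428750331u;

        printf("%d\n", tid);    /* wrong: prints -1866216965 */
        printf("%u\n", tid);    /* correct: prints 2428750331 */
        return 0;
    }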
Change-Id: Ic053f0e0c1e24ebc75bae51d07696aaa5e1c0094
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
x86 x32 system calls are not supported by LTTng. They are currently not
traced simply because their system call number is beyond the range of
NR_compat_syscalls.
However, this mostly happens by accident rather than by design.
Enforce this with an explicit check for in_x32_syscall(), which clearly
documents that those are not supported.
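A minimal sketch of such a check, assuming it sits at the top of the syscall entry/exit probes (the exact config guard may differ across kernel versions):

    #ifdef CONFIG_X86_X32_ABI
        /* x32 syscall numbers are beyond NR_compat_syscalls;
         * reject them explicitly rather than by accident. */
        if (in_x32_syscall())
            return;
    #endif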
mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
Drop kmem_alloc event class, and define kmalloc and kmem_cache_alloc
using TRACE_EVENT() macro.
And then this patch does:
- Do not pass pointer to struct kmem_cache to trace_kmalloc.
gfp flag is enough to know if it's accounted or not.
- Avoid dereferencing s->object_size and s->size when not using the kmem_cache_alloc event.
- Avoid dereferencing s->name when not using the kmem_cache_free event.
- Adjust s->size to SLOB_UNITS(s->size) * SLOB_UNIT in SLOB
Change-Id: Icd7925731ed4a737699c3746cb7bb7760a4e8009
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: handle integer capture page faults as skip field
Now that we have the appropriate save/restore position mechanism for
error handling in place, we can handle page faults on integer
copy-from-user by skipping the offending captured field entirely rather
than relying on an arbitrary 0 value.
Fix: honor "user" attribute for array/sequence of user integers
The macro _lttng_kernel_static_type_integer_from_type() should map to
_lttng_kernel_static_type_integer() to pass the "_user" attribute.
Otherwise, userspace fields such as pipe2's system call fildes field (a
ctf_user_array()) can trigger NULL pointer exceptions and read arbitrary
kernel memory if the pipe2 system call receives a bogus pointer as input
while filtering/capture is accessing this field.
He Zhe [Tue, 27 Sep 2022 07:59:42 +0000 (15:59 +0800)]
wrapper: powerpc64: fix kernel crash caused by do_get_kallsyms
Kernel crashes on powerpc64 ABIv2 as follows when lttng_tracer initializes,
since do_get_kallsyms in lttng_wrapper fails to return a proper address of
kallsyms_lookup_name.
root@qemuppc64:~# lttng create trace_session --live -U net://127.0.0.1
Spawning a session daemon
lttng_kretprobes: loading out-of-tree module taints kernel.
BUG: Unable to handle kernel data access on read at 0xfffffffffffffff8
Faulting instruction address: 0xc0000000001f6fd0
Oops: Kernel access of bad area, sig: 11 [#1]
<snip>
NIP [c0000000001f6fd0] module_kallsyms_lookup_name+0xf0/0x180
LR [c0000000001f6f28] module_kallsyms_lookup_name+0x48/0x180
Call Trace:
module_kallsyms_lookup_name+0x34/0x180 (unreliable)
kallsyms_lookup_name+0x258/0x2b0
wrapper_kallsyms_lookup_name+0x4c/0xd0 [lttng_wrapper]
wrapper_get_pfnblock_flags_mask_init+0x28/0x60 [lttng_wrapper]
lttng_events_init+0x40/0x344 [lttng_tracer]
do_one_initcall+0x78/0x340
do_init_module+0x6c/0x2f0
__do_sys_finit_module+0xd0/0x120
system_call_exception+0x194/0x2f0
system_call_vectored_common+0xe8/0x278
<snip>
do_get_kallsyms makes use of kprobe_register and in turn kprobe_lookup_name
to get the address of the kernel function kallsyms_lookup_name. In case of
PPC64_ELF_ABI_v2, when kprobes are placed at function entry,
kprobe_lookup_name adjusts the global entry point of the function returned
by kallsyms_lookup_name to the local entry point (at some fixed offset from
the global one). This adjustment is needed for kprobes to work properly.
Global and local entry points are defined by the powerpc64 ABIv2.
When the local entry point is given, some instructions at the beginning of
the function are skipped, which causes the above kernel crash. We just
want to make a simple function call, which needs the global entry point.
This patch adds 4 bytes, the length of one instruction, to the address of
kallsyms_lookup_name so that it will not trigger the global-to-local
adjustment, and then subtracts 4 bytes from the returned address. See the
following kernel change for more details.
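A sketch of the resulting do_get_kallsyms() logic (hypothetical details; the real wrapper's error handling differs):

    #include <linux/kprobes.h>

    static struct kprobe kp = {
        .symbol_name = "kallsyms_lookup_name",
    #ifdef PPC64_ELF_ABI_v2
        /* Probe one instruction past the symbol so that
         * kprobe_lookup_name() does not rewrite the global entry
         * point to the local one. */
        .offset = 4,
    #endif
    };

    static unsigned long do_get_kallsyms(void)
    {
        unsigned long addr = 0;

        if (!register_kprobe(&kp)) {
            addr = (unsigned long)kp.addr;
            unregister_kprobe(&kp);
    #ifdef PPC64_ELF_ABI_v2
            addr -= 4;  /* back to the global entry point */
    #endif
        }
        return addr;
    }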
Validate that the buffer length is large enough to hold empty capture
fields.
If the buffer is initially not large enough to hold empty capture fields
for each field to capture, discard the notification.
If after capturing a field there is not enough room anymore in the
buffer to write empty capture fields, skip the offending large field by
writing an empty capture field in its place.
Now that we have the appropriate save/restore position mechanism for
error handling in place, we can handle page faults on copy-from-user by
skipping the offending captured field entirely rather than relying on an
empty string.
The "user" attribute (copy from userspace) is not applied to
sequence/array of integer field capture within event notifications. This
could eventually lead to unsafe copy of integers from user-space.
Currently, the only array/sequence of integers which are read from
user-space are the arguments to sys_select (e.g. `readfds` field). Those
are expressed as "custom" fields, which are skipped by the filter and
capture bytecode.
This is therefore not an issue with the current instrumentation, but we
should properly handle this nevertheless.
The "user" attribute (copy from userspace) is not applied to string
field capture within event notifications. This leads to copy of strings
from user-space (e.g. `filename` field from sys_open) to end up using
strlen/memcpy on user-space data. This can cause kernel OOPS due to
unhandled page faults, and it also allows reading kernel memory through
the event notification capture mechanism. As a result, the users within
the `tracing` group can read arbitrary kernel memory.
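The shape of the safe pattern is to route every user-space access through fault-safe accessors; a minimal sketch using the kernel's strncpy_from_user_nofault() (LTTng's actual helpers differ):

    #include <linux/uaccess.h>

    /* Read a user-space string without risking an unhandled page
     * fault or a read of kernel memory through a bogus pointer. */
    static long capture_user_string(char *dst, size_t dst_len,
                                    const char __user *ustr)
    {
        long len;

        len = strncpy_from_user_nofault(dst, ustr, dst_len);
        if (len < 0)
            dst[0] = '\0';  /* fault: capture an empty string */
        return len;
    }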
Fix: bytecode interpreter: LOAD_FIELD: handle user fields
The instructions for recursive traversal through composed types
are used by the capture bytecode, and by filter expressions which
access fields nested within composed types.
Instructions BYTECODE_OP_LOAD_FIELD_STRING and
BYTECODE_OP_LOAD_FIELD_SEQUENCE were leaving the "user" attribute
uninitialized. Initialize those to 0.
The handling of userspace strings and integers is missing in LOAD_FIELD
instructions. Therefore, ensure that the specialization leaves the
generic LOAD_FIELD instruction in place for userspace input.
Add a "user" attribute to:
- struct bytecode_get_index_data elem field (produced by the
specialization),
- struct vstack_load used by the specialization,
- struct load_ptr used by the interpreter,
- struct lttng_interpreter_output used by the event notification
capture.
Use this "user" attribute in dynamic_load_field() for integer, string
and string_sequence object types to ensure that the proper
userspace-aware accesses are performed when loading those fields.
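An illustrative sketch of what honoring that attribute means when loading an integer field (the real struct load_ptr layout differs; the names here are hypothetical):

    #include <linux/types.h>
    #include <linux/uaccess.h>

    struct load_ptr_sketch {
        bool user;                  /* the "user" attribute */
        const void *kptr;           /* kernel-space address */
        const void __user *uptr;    /* user-space address */
    };

    static u64 load_u64_field(const struct load_ptr_sketch *ptr)
    {
        u64 v;

        if (!ptr->user)
            return *(const u64 *)ptr->kptr;
        /* fault-safe read: never dereference user memory directly */
        if (copy_from_user_nofault(&v, ptr->uptr, sizeof(v)))
            return 0;   /* page fault: field is skipped */
        return v;
    }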
The "user" field attribute (copy from userspace) is not taken into
account in the bytecode specialization and interpreter recursive
traversal through composed types (LOAD_FIELD bytecode instructions).
Those are currently used by the event notification capture bytecode, and
by filter expressions which access fields nested within composed types.
Move the "user" attribute from the event fields to the integer and
string types. This will allow ensuring that the bytecode specialization,
interpreter and event notification output capture have access to this
user attribute even in nested types (e.g. arrays, sequences) in a
subsequent change.
Fix: incorrect stub prototypes when CONFIG_HAVE_SYSCALL_TRACEPOINTS=n
The stub prototypes do not match the expected argument types, and extra
erroneous semicolons are present. This has been fixed by a refactoring
in the master branch:
mm/tracing: add 'accounted' entry into output of allocation tracepoints
Slab caches marked with SLAB_ACCOUNT force accounting for every
allocation from this cache even if __GFP_ACCOUNT flag is not passed.
Unfortunately, at the moment this flag is not visible in ftrace output,
and this makes it difficult to analyze the accounted allocations.
This patch adds boolean "accounted" entry into trace output,
and set it to 'true' for calls used __GFP_ACCOUNT flag and
for allocations from caches marked with SLAB_ACCOUNT.
Set it to 'false' if accounting is disabled in configs.
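A sketch of the test backing the new field, close to the upstream TP_fast_assign() logic (note that struct kmem_cache internals are only visible within mm/):

    #include <linux/slab.h>

    static bool alloc_is_accounted(const struct kmem_cache *s,
                                   gfp_t gfp_flags)
    {
        if (!IS_ENABLED(CONFIG_MEMCG_KMEM))
            return false;   /* accounting disabled in configs */
        return (gfp_flags & __GFP_ACCOUNT) ||
               (s && (s->flags & SLAB_ACCOUNT));
    }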
Change-Id: I023a355b94e79931499e1a1f648e2649d6dd3c89
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Commit 2a222ca992c3 ("fs: have submit_bh users pass in op and flags
separately") renamed the jbd2_write_superblock() 'write_op' argument into
'write_flags'. Propagate this change to the jbd2_write_superblock()
callers. Additionally, change the type of 'write_flags' into blk_opf_t.
Change-Id: I65b8af95b3d07438763dd94f409c197e3b400733
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 10 Aug 2022 15:07:14 +0000 (11:07 -0400)]
fix: tie compaction probe build to CONFIG_COMPACTION
The definition of 'struct compact_control' in 'mm/internal.h' depends on
CONFIG_COMPACTION being defined. Only build the compaction probe when
this configuration option is enabled.
Thanks to Bruce Ashfield <bruce.ashfield@gmail.com> for reporting this
issue.
Change-Id: I81e77aa9c1bf10452c152d432fe5224df0db42c9
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
So, change the unsigned type to a signed type in the trace event. After
applying this patch, the cpu number will be printed as -1 instead of
4294967295, as follows.
Currently, the tracepoint mm_page_alloc_zone_locked() doesn't show correct
information.
First, when alloc_flags has ALLOC_HARDER/ALLOC_CMA, the page can be
allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, the
tracepoint uses the requested migration type, not MIGRATE_HIGHATOMIC or
MIGRATE_CMA.
Second, after commit 44042b4498728 ("mm/page_alloc: allow high-order pages
to be stored on the per-cpu lists"), the percpu list can store high-order
pages. But the tracepoint determines whether it is a refill of the percpu
list by comparing the requested order with 0.
To handle these problems, make mm_page_alloc_zone_locked() only be called
by __rmqueue_smallest with the correct migration type. With a new argument
called percpu_refill, it can roughly show whether it is a refill of the
percpu list.
Change-Id: I2e4a57393757f12b9c5a4566c4d1102ee2474a09
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fix: event notifier: racy use of last subbuffer record
The lttng-modules event notifiers use the ring buffer internally. When
reading the payload of the last event in a sub-buffer with a multi-part
read (e.g. two read system calls), we should not "put" the sub-buffer
holding this data, else continuing reading the data in the following
read system call can observe corrupted data if it has been concurrently
overwritten by the producer.
Fix: bytecode interpreter context_get_index() leaves byte order uninitialized
Observed Issue
==============
When using the event notification capture feature to capture a context
field, e.g. '$ctx.cpu_id', the captured value is often observed in
reverse byte order.
Cause
=====
Within the bytecode interpreter, context_get_index() leaves the "rev_bo"
field uninitialized in the top of stack.
This only affects the event notification capture bytecode because the
BYTECODE_OP_GET_SYMBOL bytecode instruction (as of lttng-tools 2.13)
is only generated for capture bytecode in lttng-tools. Therefore, only
capture bytecode targeting contexts are affected by this issue. The
reason why lttng-tools uses the "legacy" bytecode instruction to get
context (BYTECODE_OP_GET_CONTEXT_REF) for the filter bytecode is to
preserve backward compatibility of filtering when interacting with
applications linked against LTTng-UST 2.12.
Solution
========
Initialize the rev_bo field based on the context field integer type's
reverse_byte_order field.
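The essence of the fix in context_get_index(), with hypothetical local names for the top-of-stack pointer and the context field's integer type:

    /* Propagate the context type's byte order instead of leaving
     * the top-of-stack rev_bo field uninitialized. */
    ptr->rev_bo = integer_type->reverse_byte_order;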
Michael Jeanson [Tue, 31 May 2022 19:24:48 +0000 (15:24 -0400)]
fix: 'random' tracepoints removed in stable kernels
The upstream commit 14c174633f349cb41ea90c2c0aaddac157012f74 removing
the 'random' tracepoints is being backported to multiple stable kernel
branches; I don't see how that qualifies as a fix, but here we are.
Use the presence of 'include/trace/events/random.h' in the kernel source
tree instead of the rather tortuous version check to determine if we
need to build 'lttng-probe-random.ko'.
Change-Id: I8f5f2f4c9e09c61127c49c7949b22dd3fab0460d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
These explicit tracepoints aren't really used and show signs of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0b7eb8aa78b5bd2039e20ae3e1da4c5eb9018789
sched/tracing: Append prev_state to tp args instead
Commit fa2c3254d7cf (sched/tracing: Don't re-read p->state when emitting
sched_switch event, 2022-01-20) added a new prev_state argument to the
sched_switch tracepoint, before the prev task_struct pointer.
This reordering of arguments broke BPF programs that use the raw
tracepoint (e.g. tp_btf programs). The type of the second argument has
changed and existing programs that assume a task_struct* argument
(e.g. for bpf_task_storage access) will now fail to verify.
If we instead append the new argument to the end, all existing programs
would continue to work and can conditionally extract the prev_state
argument on supported kernel versions.
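With the argument appended, the upstream tracepoint prototype becomes:

    TP_PROTO(bool preempt,
             struct task_struct *prev,
             struct task_struct *next,
             unsigned int prev_state),

so existing tp_btf programs keep seeing task_struct pointers in the argument positions they were verified against.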
Change-Id: Ife2ec88a8bea2743562590cbd357068d7773863f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
mm: compaction: cleanup the compaction trace events
As Steven suggested [1], we should access the pointers from the trace
event to avoid dereferencing them to the tracepoint function when the
tracepoint is disabled.
[1] https://lkml.org/lkml/2021/11/3/409
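A sketch of the pattern with a hypothetical event: the pointer travels through TP_PROTO/TP_ARGS and is only dereferenced inside TP_fast_assign(), which runs solely when the event is enabled:

    TRACE_EVENT(mm_compaction_sketch,
        TP_PROTO(struct compact_control *cc),
        TP_ARGS(cc),
        TP_STRUCT__entry(
            __field(unsigned long, free_pfn)
        ),
        TP_fast_assign(
            /* dereference deferred until the event fires */
            __entry->free_pfn = cc->free_pfn;
        ),
        TP_printk("free_pfn=0x%lx", __entry->free_pfn)
    );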
Change-Id: I6c08250df8596e8dbc76780ae5d95c899c12e6fe
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Use rethook for kretprobe function return hooking if the arch sets
CONFIG_HAVE_RETHOOK=y. In this case, CONFIG_KRETPROBE_ON_RETHOOK is
set to 'y' automatically, and the kretprobe internal data fields
switches to use rethook. If not, it continues to use kretprobe
specific function return hooks.
Change-Id: I2b7670dc04e4769c1e3c372582ad2f555f6d7a66
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
These explicit tracepoints aren't really used and show signs of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.
Change-Id: I3b8c3e2732e7efdd76ce63204ac53a48784d0df6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
There is no good reason to keep genhd.h separate from the main blkdev.h
header that includes it. So fold the contents of genhd.h into blkdev.h
and remove genhd.h entirely.
Change-Id: I7cf2aaa3a4c133320b95f2edde49f790f9515dbd
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Drop the export of kvm_x86_ops now it is no longer referenced by SVM or
VMX. Disallowing access to kvm_x86_ops is very desirable as it prevents
vendor code from incorrectly modifying hooks after they have been set by
kvm_arch_hardware_setup(), and more importantly after each function's
associated static_call key has been updated.
No functional change intended.
Change-Id: Icee959a984570f95ab9b71354225b5aeecea7da0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The commit "fix: mm: compaction: fix the migration stats in trace_mm_compaction_migratepages() (v5.17)"
Triggers this warning:
LTTng: event provider mismatch: The event name needs to start with provider name + _ + one or more letter, provider: compaction, event name: mm_compaction_migratepages
Sampling the discarded events count in the buffer_end callback is done
out of order, and may therefore include increments performed by following
events (in following packets) if the thread doing the end-of-packet
event write is interrupted for a long time.
Sampling the event discarded counts before reserving space for the last
event in a packet, and keeping this as part of the private ring buffer
context, should fix this race.
In lttng-modules, this scenario would only happen if an interrupt
handler produces many events, when nested over an event between its
reserve and commit. Note that if lttng-modules supports faultable
tracepoints in the future, this may become easier to trigger due to
preemption.
Syscall exit should fetch the "sc_out" parameters. This issue was
introduced by commit e42c4f49c15b ("Split syscall tracepoint generation in their own files").
Fixes: e42c4f49c15b ("Split syscall tracepoint generation in their own files")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib34912005323ea34b6d11ca9acc5edf491649cdd
Rename SKB_DROP_REASON_SOCKET_FILTER, which is used as the reason for
skb drops out of the socket filter, before it's part of a released
kernel. It will be used for more protocols than just TCP in future
series.
random: rather than entropy_store abstraction, use global
Originally, the RNG used several pools, so having things abstracted out
over a generic entropy_store object made sense. These days, there's only
one input pool, and then an uneven mix of usage via the abstraction and
usage via &input_pool. Rather than this uneasy mixture, just get rid of
the abstraction entirely and have things always use the global. This
simplifies the code and makes reading it a bit easier.
Change-Id: I1a2a14d7b6e69a047804e1e91e00fe002f757431
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
btrfs: pass fs_info to trace_btrfs_transaction_commit
The root on the trans->root can be anything, and generally we're
committing from the transaction kthread so it's usually the tree_root.
Change this to just take an fs_info, and to maintain compatibility
simply put the ROOT_TREE_OBJECTID as the root objectid for the
tracepoint. This will allow us to remove trans->root.
Change-Id: Ie5a4804330edabffac0714fcb9c25b8c8599e424
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
mm: compaction: fix the migration stats in trace_mm_compaction_migratepages()
Now that migrate_pages() has changed to return the number of {normal
page, THP, hugetlb} pages instead, we should not use the return value
to calculate the number of pages migrated successfully. Instead we can
just use the 'nr_succeeded' which indicates the number of normal pages
migrated successfully to calculate the non-migrated pages in
trace_mm_compaction_migratepages().
block: don't call blk_status_to_errno in blk_update_request
We only need to call it to resolve the blk_status_t -> errno mapping for
tracing, so move the conversion into the tracepoints that are not called
at all when tracing isn't enabled.
Change-Id: Ic556cee1d82e44a93a1467f55d45b6e17a48d387
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
KVM: MMU: change tracepoints arguments to kvm_page_fault
Pass struct kvm_page_fault to tracepoints instead of extracting the
arguments from the struct. This also lets the kvm_mmu_spte_requested
tracepoint pick the gfn directly from fault->gfn, instead of using
the address.
Change-Id: I5ee3f344a8d1cd0ed185cdeecb3a3121183c9b43
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info
Extend the get_exit_info static call to provide the reason for the VM
exit. Modify relevant trace points to use this rather than extracting
the reason in the caller.
Change-Id: I28903e658eb7cbfc6666e35ba4cffba5e49d1445
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
There is an issue when associating multiple actions to a single system
call, where the first action is effectively added, but the following
actions are silently ignored.
Cause
=====
This is caused by the creation of the 2nd event being skipped:
lttng_syscall_filter_enable returns -EBUSY because the bit is already
set in the bitmap.
Solution
========
Fixing this double-add-trigger scenario requires changes to how the
lttng_syscall_filter_enable -EBUSY errors are dealt with on event
creation, but the problem runs deeper.
Given that many events may be responsible for requiring a bit to be set
in the bitmap of the syscall filter, we cannot simply clear the bit
whenever one event is disabled.
Otherwise, this will cause situations where all events for a given
system call are disabled (due to the syscall filter bit being cleared)
whenever one single event for that system call is disabled (e.g. 2
triggers on the same system call, but with different actions).
We need to keep track of all events associated with this bit in the
bitmap, and only clear the bit when the reference count reaches 0.
This fixes the double-add-trigger issue by ensuring that no -EBUSY
is returned when creating the second event, and fixes the
double-add-trigger-then-remove-trigger scenario as well.
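A minimal sketch of the reference-counting scheme (hypothetical names; the real state lives in the LTTng syscall filter):

    #include <linux/bitops.h>
    #include <linux/errno.h>
    #include <linux/kernel.h>

    /* One counter per system call number, next to the bitmap. */
    struct filter_state {
        unsigned long *bitmap;
        u32 *refcount;
    };

    static int filter_bit_get(struct filter_state *f, unsigned int nr)
    {
        if (f->refcount[nr] == U32_MAX)
            return -EOVERFLOW;
        /* No -EBUSY: a second event on the same syscall just
         * bumps the counter. */
        if (f->refcount[nr]++ == 0)
            set_bit(nr, f->bitmap);
        return 0;
    }

    static void filter_bit_put(struct filter_state *f, unsigned int nr)
    {
        /* Only the last event clears the bit. */
        if (--f->refcount[nr] == 0)
            clear_bit(nr, f->bitmap);
    }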
Daniel Gomez [Fri, 8 Oct 2021 17:38:02 +0000 (19:38 +0200)]
fix: implicit-int error in EXPORT_SYMBOL_GPL
Add module header to fix error:
error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
[-Werror=implicit-int]
Log:
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.c:54:1:
warning: data definition has no type or storage class
54 EXPORT_SYMBOL_GPL(wrapper_get_bootid);
^~~~~~~~~~~~~~~~~
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.c:54:1:
error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
[-Werror=implicit-int]
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.c:54:1:
warning: parameter names (without types) in function declaration
cc1: some warnings being treated as errors
make[3]: ***
[/workdir/build/tmp/work-shared/qt5222/kernel-source/scripts/Makefile.build:271:
/workdir/build/workspace/sources/lttng-modules/src/wrapper/random.o]
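The fix is a one-line include at the top of the file; EXPORT_SYMBOL_GPL() is only declared once <linux/module.h> (which pulls in <linux/export.h>) is included:

    /* src/wrapper/random.c */
    #include <linux/module.h>   /* for EXPORT_SYMBOL_GPL() */

    EXPORT_SYMBOL_GPL(wrapper_get_bootid);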
Signed-off-by: Daniel Gomez <daniel@qtec.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If1b96f6fcc19988b94dc09b12d071f2c61491b83
Michael Jeanson [Mon, 13 Sep 2021 18:16:22 +0000 (14:16 -0400)]
fix: Revert "Makefile: Enable -Wimplicit-fallthrough for Clang" (v5.15)
Starting with v5.15, "-Wimplicit-fallthrough=5" was added to the build
flags which requires the use of "__attribute__((__fallthrough__))" to
annotate fallthrough case statements.
It turns out that the problem with the clang -Wimplicit-fallthrough
warning is not about the kernel source code, but about clang itself, and
that the warning is unusable until clang fixes its broken ways.
In particular, when you enable this warning for clang, you not only get
warnings about implicit fallthroughs. You also get this:
warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough]
which is completely broken because it
(a) doesn't even tell you where the problem is (seriously: no line
numbers, no filename, no nothing).
(b) is fundamentally broken anyway, because there are perfectly valid
reasons to have a fallthrough statement even if it turns out that
it can perhaps not be reached.
In the kernel, an example of that second case is code in the scheduler:
    switch (state) {
    case cpuset:
        if (IS_ENABLED(CONFIG_CPUSETS)) {
            cpuset_cpus_allowed_fallback(p);
            state = possible;
            break;
        }
        fallthrough;
    case possible:
where if CONFIG_CPUSETS is enabled you actually never hit the
fallthrough case at all. But that in no way makes the fallthrough
wrong.
So the warning is completely broken, and enabling it for clang is a very
bad idea.
In the meantime, we can keep the gcc option enabled, and make the gcc
build use
-Wimplicit-fallthrough=5
which means that we will at least continue to require a proper
fallthrough statement, and that gcc won't silently accept the magic
comment versions. Because gcc does this all correctly, and while the odd
"=5" part is kind of obscure, it's documented in [1]:
"-Wimplicit-fallthrough=5 doesn’t recognize any comments as
fallthrough comments, only attributes disable the warning"
so if clang ever fixes its bad behavior we can try enabling it there again.
Change-Id: Iea69849592fb69ac04fb9bb28efcd6b8dce8ba88
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The counting 'rwsem' hackery of get|put_online_cpus() is going to be
replaced by percpu rwsem.
Rename the functions to make it clear that it's locking and not some
refcount style interface. These new functions will be used for the
preparatory patches which make the code ready for the percpu rwsem
conversion.
Rename all instances in the cpu hotplug code while at it.
Change-Id: I5a37cf5afc075a402b7347989fac637dfa60a1ed
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change the type and name of task_struct::state. Drop the volatile and
shrink it to an 'unsigned int'. Rename it in order to find all uses
such that we can use READ_ONCE/WRITE_ONCE as appropriate.
Change-Id: I3a379192d6b977753fe58d4f67833a78dd7a0a47
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
btrfs: pass btrfs_inode to btrfs_writepage_endio_finish_ordered()
There is a pretty bad abuse of btrfs_writepage_endio_finish_ordered() in
end_compressed_bio_write().
It passes compressed pages to btrfs_writepage_endio_finish_ordered(),
which is only supposed to accept inode pages.
Thankfully the important info here is the inode, so let's pass
btrfs_inode directly into btrfs_writepage_endio_finish_ordered(), and
make @page parameter optional.
By this, end_compressed_bio_write() can happily pass page=NULL while
still getting everything done properly.
Also, to cooperate with such modification, replace @page parameter for
trace_btrfs_writepage_end_io_hook() with btrfs_inode.
Although this removes page_index info, the existing start/len should be
enough for most usage.
Change-Id: If96e99c2d9533d96d9d1aa6460bb7fd3ac9ed7ab
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 13 May 2021 14:30:24 +0000 (10:30 -0400)]
Disable x86 error code bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Wed, 12 May 2021 18:20:44 +0000 (14:20 -0400)]
Disable mmap bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:22:12 +0000 (17:22 -0400)]
Disable block rwbs bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:15:15 +0000 (17:15 -0400)]
Disable sched_switch bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:09:07 +0000 (17:09 -0400)]
Disable open[at] bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 21:05:37 +0000 (17:05 -0400)]
Disable fcntl bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Tue, 11 May 2021 20:16:18 +0000 (16:16 -0400)]
Disable clone bitwise enum in default build
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Michael Jeanson [Wed, 12 May 2021 20:21:50 +0000 (16:21 -0400)]
Add experimental bitwise enum config option
Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.
Change-Id: Id45c7a78b280a7f35bbeafb80f2f6f5aa1ebbdc9
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The only use of I_DIRTY_TIME_EXPIRE is to detect in
__writeback_single_inode() that the inode got there because the flush
worker decided it's time to write back the dirty inode time stamps
(either because we are syncing or because of age). However, we can
detect this directly in __writeback_single_inode() and there's no need
for the strange propagation with the I_DIRTY_TIME_EXPIRE flag.
Change-Id: I6e7c0ced13acd4fcd88bcd572d0ba1f9b254c58c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sync `show_inode_state()` macro with Ubuntu 4.15 kernel
The following commit changed the `show_inode_state()` macro which
triggered a warning on our CI build:
commit 63388062bea96e5cd8b8d7abf7b7142f8666ca1f
Author: Jan Kara <jack@suse.cz>
Date: Mon Jan 25 12:37:43 2021 -0800
writeback: Drop I_DIRTY_TIME_EXPIRE
Also, this commit adds a comment to clarify why we keep these
`#if/#elif` even though we don't use the macro.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2dd53a1a286ab8a431977bda6cde01f700f0c7d9