lttng-modules.git
Michael Jeanson [Tue, 18 May 2021 15:16:34 +0000 (11:16 -0400)] 
fix: adjust ranges for RHEL 8.4

Change-Id: I9ac44467cca4850fb4051252937542d5a054ccc4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 14 May 2021 19:18:48 +0000 (15:18 -0400)] 
Version 2.12.6

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia64c94d00e43f3371482304840b8a591d9d8cdbd

Michael Jeanson [Tue, 11 May 2021 19:29:23 +0000 (15:29 -0400)] 
fix: adjust ranges for RHEL 8.2 and 8.3

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0863ac030f9fdfeb0173b843e75396acda21f3b6

Michael Jeanson [Tue, 11 May 2021 21:22:12 +0000 (17:22 -0400)] 
Disable block rwbs bitwise enum in default build

Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.

Original commit:

  commit 23634515e7271c8c8594ad87a6685232d4eff297
  Author: Geneviève Bastien <gbastien@versatic.net>
  Date:   Tue Feb 11 11:20:27 2020 -0500

    block: Make the rwbs field as a bit field enum

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iccf0d2b1e9ed44b280707fcc2b25a26ef7b4ca1f

Michael Jeanson [Tue, 11 May 2021 21:15:15 +0000 (17:15 -0400)] 
Disable sched_switch bitwise enum in default build

Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.

Original commit:

  commit 721caea47b6506f7ad9086c3e9801dc9dfe06b6a
  Author: Geneviève Bastien <gbastien@versatic.net>
  Date:   Wed Feb 12 16:58:25 2020 -0500

    sched: Make the sched_switch task state an enum

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I004616ff203c26e74fab9c19fea3ef2a86de16df

Michael Jeanson [Wed, 12 May 2021 20:21:50 +0000 (16:21 -0400)] 
Add experimental bitwise enum config option

Only generate the bitwise enumerations when
CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM is enabled, so the default build
does not generate traces which lead to warnings when viewed with
babeltrace 1.x and babeltrace 2 with default options.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id54ae3df470b9cdbc0edc0a528fa79532493d1ad
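
As a rough illustration of the gating described above, a compile-time
option can select between two field descriptions. This is only a sketch:
apart from CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM, every name is
hypothetical and does not come from the lttng-modules sources.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustration only: in the real module the option comes from Kconfig or
 * from the build command line. Leaving it undefined mimics the default build.
 */
/* #define CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM 1 */

/* Hypothetical helper: reports how a flags field would be described. */
const char *flags_field_kind(void)
{
#ifdef CONFIG_LTTNG_EXPERIMENTAL_BITWISE_ENUM
	return "bitwise enum";
#else
	return "plain field";
#endif
}
```

With the option left undefined, the helper reports the plain field, which
is the behavior the default build is meant to keep.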

Michael Jeanson [Wed, 12 May 2021 20:13:01 +0000 (16:13 -0400)] 
Add defaults to Kconfig options

Add defaults to the Kconfig options used when building in-tree that
match the default configuration when built out-of-tree.

Change-Id: I8e38a3d4fd1c13b54b5a4a8deb66c84acdfb76c0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 12 May 2021 17:35:24 +0000 (13:35 -0400)] 
Sync `show_inode_state()` macro with upstream stable kernels

The following commit was backported to multiple stable branches:

  commit 5fcd57505c002efc5823a7355e21f48dd02d5a51
  Author: Jan Kara <jack@suse.cz>
  Date:   Fri May 29 16:24:43 2020 +0200

    writeback: Drop I_DIRTY_TIME_EXPIRE

    The only use of I_DIRTY_TIME_EXPIRE is to detect in
    __writeback_single_inode() that inode got there because flush worker
    decided it's time to writeback the dirty inode time stamps (either
    because we are syncing or because of age). However we can detect this
    directly in __writeback_single_inode() and there's no need for the
    strange propagation with I_DIRTY_TIME_EXPIRE flag.

Change-Id: I6e7c0ced13acd4fcd88bcd572d0ba1f9b254c58c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 10 May 2021 15:39:24 +0000 (11:39 -0400)] 
fix: block: remove disk_part_iter (v5.12)

In v5.12 a refactoring of the genhd code was started and the symbols
related to 'disk_part_iter' were unexported. In v5.13 they were
completely removed.

This patch replaces the short-lived compat code that is specific to
v5.12 with a generic internal implementation that iterates directly on
the 'disk->part_tbl' xarray and is used on v5.12 and up.

This seems like a better option than keeping compat code that only
works on v5.12 and makes maintenance more complicated. The compat code
was backported to the stable branches but isn't yet part of a point
release, so it can be safely replaced.

See the upstream commits:

  commit 3212135a718b06be38811f2d9a320ae842e76409
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Tue Apr 6 08:23:02 2021 +0200

    block: remove disk_part_iter

    Just open code the xa_for_each in the remaining user.

  commit a33df75c6328bf40078b35f2040d8e54d574c357
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Sun Jan 24 11:02:41 2021 +0100

    block: use an xarray for disk->part_tbl

    Now that no fast path lookups in the partition table are left, there is
    no point in micro-optimizing the data structure for it.  Just use a bog
    standard xarray.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I43d8ef8463cb7a83dc8859a32dc29502cd897ddf

Mathieu Desnoyers [Wed, 12 May 2021 14:45:12 +0000 (10:45 -0400)] 
Fix: Backport of "Fix: increment buffer offset when failing to copy from user-space"

Private field was introduced in 2.13 only.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7af440269feda5500651d6ecdc1f63007910cf3d

Mathieu Desnoyers [Fri, 7 May 2021 19:03:04 +0000 (15:03 -0400)] 
Fix: increment buffer offset when failing to copy from user-space

Upon failure to copy from user-space due to a failing access_ok() check,
the ring buffer offset is not incremented, which could generate
unreadable traces because we don't account for the padding we write into
the ring buffer.

Note that this typically won't affect the common use-case of copying
strings from user-space: unless mprotect is invoked within a narrow race
window (between user strlen and user strcpy), the strlen will fail on
access_ok() when calculating the space to reserve, which matches what
happens on strcpy.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic4d9487dd8870a526bae3023bb80f5e6301cec50
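
The idea behind this fix can be sketched in userspace as follows. The
structure and function names are hypothetical, and the `src_ok` flag only
stands in for the result of the access_ok() check; the point is that the
offset advances whether or not the payload could be copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified buffer context. */
struct buf_ctx {
	unsigned char data[64];
	size_t offset;
};

/* Stand-in for a user-space copy; 'src_ok' models the access check result. */
void write_from_user(struct buf_ctx *ctx, const void *src, size_t len,
		bool src_ok)
{
	assert(ctx->offset + len <= sizeof(ctx->data));
	if (src_ok)
		memcpy(ctx->data + ctx->offset, src, len);
	else
		memset(ctx->data + ctx->offset, 0, len);	/* padding instead of payload */
	/* Advance in both cases so later fields land where readers expect them. */
	ctx->offset += len;
}
```

If the offset stayed put on the failure path, the next field would
overwrite the padding and the record layout would no longer match its
description.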

Francis Deslauriers [Mon, 10 May 2021 17:41:48 +0000 (13:41 -0400)] 
Sync `show_inode_state()` macro with Ubuntu 4.15 kernel

The following commit changed the `show_inode_state()` macro which
triggered a warning on our CI build:
  commit 63388062bea96e5cd8b8d7abf7b7142f8666ca1f
  Author: Jan Kara <jack@suse.cz>
  Date:   Mon Jan 25 12:37:43 2021 -0800

      writeback: Drop I_DIRTY_TIME_EXPIRE

Also, this commit adds a comment to clarify why we keep these
`#if/#elif` blocks even though we don't use the macro.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2dd53a1a286ab8a431977bda6cde01f700f0c7d9

He Zhe [Mon, 19 Apr 2021 09:09:28 +0000 (09:09 +0000)] 
fix: mm, tracing: kfree event name mismatching with provider kmem (v5.12)

a8bc8ae5c932 ("fix: mm, tracing: record slab name for kmem_cache_free() (v5.12)")
introduces the following call trace for kfree. This is caused by a
mismatch between the kfree event and its provider, kmem.

This patch maps kfree to kmem_kfree.

WARNING: CPU: 2 PID: 42294 at src/lttng-probes.c:81 fixup_lazy_probes+0xb0/0x1b0 [lttng_tracer]
CPU: 2 PID: 42294 Comm: modprobe Tainted: G           O      5.12.0-rc6-yoctodev-standard #1
Hardware name: Intel Corporation JACOBSVILLE/JACOBSVILLE, BIOS JBVLCRB2.86B.0014.P20.2004020248 04/02/2020
RIP: 0010:fixup_lazy_probes+0xb0/0x1b0 [lttng_tracer]
Code: 75 28 83 c3 01 3b 5d c4 74 22 48 8b 4d d0 48 63
      c3 4c 89 e2 4c 89 f6 48 8b 04 c1 4c 8b 38 4c 89
      ff e8 64 9f 4b de 85 c0 74 c3 <0f> 0b 48 8b 05 bf
      f2 1e 00 48 8d 50 e8 48 3d f0 a0 98 c0 75 18 eb
RSP: 0018:ffffb976807bfbe0 EFLAGS: 00010286
RAX: 00000000ffffffff RBX: 0000000000000004 RCX: 0000000000000004
RDX: 0000000000000066 RSI: ffffffffc03c10a7 RDI: ffffffffc03c11a1
RBP: ffffb976807bfc28 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000004
R13: ffffffffc03c2000 R14: ffffffffc03c10a7 R15: ffffffffc03c11a1
FS:  00007f0ef9533740(0000) GS:ffffa100faa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561e8f0aa000 CR3: 000000015b318000 CR4: 0000000000350ee0
Call Trace:
 lttng_probe_register+0x38/0xe0 [lttng_tracer]
 ? __event_probe__module_load+0x520/0x520 [lttng_probe_module]
 __lttng_events_init__module+0x15/0x20 [lttng_probe_module]
 do_one_initcall+0x68/0x310
 ? kmem_cache_alloc_trace+0x2ad/0x4c0
 ? do_init_module+0x28/0x280
 do_init_module+0x62/0x280
 load_module+0x26e4/0x2920
 ? kernel_read_file+0x22e/0x290
 __do_sys_finit_module+0xb1/0xf0
 __x64_sys_finit_module+0x1a/0x20
 do_syscall_64+0x38/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I00e8ee2b8c35f6f8602c88295f5113fbbd139709

Michael Jeanson [Thu, 15 Apr 2021 17:57:13 +0000 (13:57 -0400)] 
Set 'stable-2.12' branch in git review config

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I67bb1757f220f6f120b81f4dee1c6ea22724e12b

Michael Jeanson [Thu, 15 Apr 2021 17:56:24 +0000 (13:56 -0400)] 
fix backport: block: add a disk_uevent helper (v5.12)

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I717162069990577abe78e5e7fed28816f32b2c84

Michael Jeanson [Thu, 15 Apr 2021 14:53:21 +0000 (10:53 -0400)] 
fix: Adjust ranges for Ubuntu 5.4.0-67 kernel

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifa0f50ffdc946d80b67bb5ae7ca4b0aa152e825b

Michael Jeanson [Mon, 15 Mar 2021 18:54:02 +0000 (14:54 -0400)] 
fix: block: add a disk_uevent helper (v5.12)

See upstream commit:

  commit bc359d03c7ec1bf3b86d03bafaf6bbb21e6414fd
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Sun Jan 24 11:02:39 2021 +0100

    block: add a disk_uevent helper

    Add a helper to call kobject_uevent for the disk and all partitions, and
    unexport the disk_part_iter_* helpers that are now only used in the core
    block code.

Change-Id: If6e8797049642ab382d5699660ee1dd734e92c90
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 12 Apr 2021 19:00:33 +0000 (15:00 -0400)] 
Fix: properly compare type enumeration

Fixes: fead3a9cead4 ("Fix: bytecode linker: validate event and field array/sequence encoding")
Fixes: #1304
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I9b5e3df5415d56420365997ce289be7e02a0140e

Mathieu Desnoyers [Thu, 25 Mar 2021 18:20:58 +0000 (14:20 -0400)] 
compiler warning cleanup: is_signed_type: compare -1 to 1

Comparing -1 to 0 triggers compiler warnings (gcc -Wtype-limits and
-Wbool-compare) and Coverity warning "Macro compares unsigned to 0".

Comparing -1 to 1 instead takes care of silencing those warnings while
keeping the same behavior.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id42a51759a1c7c669e63588c05f9d4485304c541
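
A minimal sketch of the comparison described above. The macro body is
inferred from the commit title and may differ from the one in the tree.
Casting -1 to an unsigned type yields that type's maximum value, which is
never below 1, while a signed -1 always is.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Comparing against 1 rather than 0 avoids the "comparison is always
 * false" style of warning on unsigned types, with the same result.
 */
#define is_signed_type(type)	(((type) -1) < (type) 1)

static_assert(is_signed_type(int), "int is signed");
static_assert(!is_signed_type(unsigned int), "unsigned int is unsigned");
/* (bool) -1 converts to 1, and 1 < 1 is false. */
static_assert(!is_signed_type(bool), "bool is reported as unsigned");
```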

Mathieu Desnoyers [Mon, 22 Mar 2021 18:35:53 +0000 (14:35 -0400)] 
Fix: bytecode linker: validate event and field array/sequence encoding

The bytecode linker should only allow linking filter expressions that
load string-encoded arrays and sequences for comparison against a
string, and reject arrays and sequences without encoding, so the filter
interpreter does not attempt to load non-NUL-terminated arrays/sequences
as if they were strings.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I61213b736b2e41b55ad8d6b32a6db0f50494e316
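
The rule stated above can be sketched as a small validation function.
All type and function names here are hypothetical and only model the
decision; they are not the identifiers used by the bytecode linker.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical field model. */
enum field_type { FIELD_INTEGER, FIELD_STRING, FIELD_ARRAY, FIELD_SEQUENCE };
enum field_encoding { ENCODING_NONE, ENCODING_UTF8, ENCODING_ASCII };

struct field_desc {
	enum field_type type;
	enum field_encoding encoding;
};

/* Returns 0 when the field may be loaded as a string, -EINVAL otherwise. */
int check_string_load(const struct field_desc *field)
{
	switch (field->type) {
	case FIELD_STRING:
		return 0;
	case FIELD_ARRAY:
	case FIELD_SEQUENCE:
		/* An unencoded array or sequence is raw data, not text. */
		return field->encoding == ENCODING_NONE ? -EINVAL : 0;
	default:
		return -EINVAL;
	}
}
```

Rejecting at link time keeps the check out of the interpreter's hot path.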

Francis Deslauriers [Wed, 17 Mar 2021 14:40:56 +0000 (10:40 -0400)] 
Fix: kretprobe: null ptr deref on session destroy

The `filter_bytecode_runtime_head` list is currently not initialized for
the return event of the kretprobe. This caused a kernel null ptr
dereference when destroying a session. It can be reproduced with the
following commands:

  lttng create
  lttng enable-event -k --function=lttng_test_filter_event_write my_event
  lttng start
  lttng stop
  lttng destroy

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1162ce8b10dd7237a26331531f048346b984eee7
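
To see why an uninitialized list head leads to a NULL dereference,
consider this userspace stand-in for a circular doubly linked list. The
helper names are hypothetical and only mirror the general shape of the
kernel's list API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized, empty head points to itself. */
void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

bool list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Walks the list until it is back at the head and counts the entries. */
size_t list_count(const struct list_head *head)
{
	size_t n = 0;
	const struct list_head *pos;

	/* A zeroed, never-initialized head has next == NULL. */
	assert(head->next != NULL);
	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

A traversal on session destroy starts at `head->next`. With a zeroed
head that pointer is NULL, so the loop dereferences it on its first step.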

Michael Jeanson [Thu, 4 Mar 2021 21:50:12 +0000 (16:50 -0500)] 
fix: mm, tracing: record slab name for kmem_cache_free() (v5.12)

See upstream commit:

  commit 3544de8ee6e4817278b15fe08658de49abf58954
  Author: Jacob Wen <jian.w.wen@oracle.com>
  Date:   Wed Feb 24 12:00:55 2021 -0800

    mm, tracing: record slab name for kmem_cache_free()

    Currently, a trace record generated by the RCU core is as below.

    ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=00000000f3b49a66

    It doesn't tell us what the RCU core has freed.

    This patch adds the slab name to trace_kmem_cache_free().
    The new format is as follows.

    ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=0000000037f79c8d name=dentry
    ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=00000000f78cb7b5 name=sock_inode_cache
    ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=0000000018768985 name=pool_workqueue
    ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=000000006a6cb484 name=radix_tree_node

    We can use it to understand what the RCU core is going to free. For
    example, some users may be interested in when the RCU core starts
    freeing reclaimable slabs like dentry to reduce memory pressure.

Link: https://lkml.kernel.org/r/20201216072804.8838-1-jian.w.wen@oracle.com
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1ee2fc476614cadcc8d3ac5d8feddc7910e1aa3a

Jérémie Galarneau [Wed, 3 Mar 2021 23:52:19 +0000 (18:52 -0500)] 
Fix: filter interpreter early-exits on uninitialized value

I observed that syscall filtering on string arguments wouldn't work on
my development machines, both running 5.11.2-arch1-1 (Arch Linux).

For instance, enabling the tracing of the `openat()` syscall with the
'filename == "/proc/cpuinfo"' filter would not produce events even
though matching events were present in another session that had no
filtering active. The same problem occurred with `execve()`.

I tried a couple of kernel versions before (5.11.1 and 5.10.13, if
memory serves me well) and I had the same problem. Meanwhile, I couldn't
reproduce the problem on various Debian machines (the LTTng CI) nor on a
fresh Ubuntu 20.04 with both the stock kernel and with an updated 5.11.2
kernel.

I built the lttng-modules with the interpreter debugging printout and
saw the following warning:
  LTTng: [debug bytecode in /home/jgalar/EfficiOS/src/lttng-modules/src/lttng-bytecode-interpreter.c:bytecode_interpret@1508] Bytecode warning: loading a NULL string.

After a shedload (yes, a _shed_load) of digging, I figured that the
problem was hidden in plain sight near that logging statement.

In the `BYTECODE_OP_LOAD_FIELD_REF_USER_STRING` operation, the 'ax'
register's 'user_str' is initialized with the stack value (the user
space string's address in our case). However, a NULL check is performed
against the register's 'str' member.

I initially suspected that both members would be part of the same union
and alias each other, but they are actually contiguous in a structure.

On the unaffected machines, I could confirm that the `str` member was
left uninitialized and held a non-zero value, causing the condition to
evaluate to false.

Francis Deslauriers reproduced the problem by initializing the
interpreter stack to zero.

I am unsure of the exact kernel configuration option that reveals this
issue on Arch Linux, but my kernel has the following option enabled:

CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL:
   Zero-initialize any stack variables that may be passed by reference
   and had not already been explicitly initialized. This is intended to
   eliminate all classes of uninitialized stack variable exploits and
   information exposures.

I have not tried to build without this enabled as, anyhow, this seems
to be a legitimate issue.

I have spotted what appears to be an identical problem in
`BYTECODE_OP_LOAD_FIELD_REF_USER_SEQUENCE` and corrected it. However,
I have not exercised that code path.

The commit that introduced this problem is 5b4ad89.

The debug print-out of the `BYTECODE_OP_LOAD_FIELD_REF_USER_STRING`
operation is modified to print the user string (truncated to 31 chars).

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2da3c31b9e3ce0e1b164cf3d2711c0893cbec273
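
The bug pattern described in this commit can be reduced to the sketch
below. The structure layout and function names are hypothetical; only the
member names `str` and `user_str` come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register: two distinct, contiguous members, not a union. */
struct reg {
	const char *str;	/* kernel-space string */
	const char *user_str;	/* user-space string address */
};

/* Buggy check: tests 'str', which this operation never sets. */
bool load_user_string_buggy(struct reg *ax, const char *user_addr)
{
	ax->user_str = user_addr;
	return ax->str != NULL;
}

/* Fixed check: tests the member that was just assigned. */
bool load_user_string_fixed(struct reg *ax, const char *user_addr)
{
	ax->user_str = user_addr;
	return ax->user_str != NULL;
}
```

With a zero-initialized register the buggy variant always reports a NULL
string, which matches the behavior seen on the affected machines. With
stack garbage in `str` it happens to pass.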

Mathieu Desnoyers [Wed, 3 Mar 2021 15:10:16 +0000 (10:10 -0500)] 
Fix: memory leaks on event destroy

Both filter runtime and event enabler ref objects are owned by the
event, but are not freed upon destruction of the event object, thus
leaking memory.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ice9b1c18b47584838aea2b965494d3c8391f4c84

Mathieu Desnoyers [Wed, 17 Feb 2021 20:52:51 +0000 (15:52 -0500)] 
Version 2.12.5

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I630522a86433066c82c31e8f488803886f839db9

Michael Jeanson [Tue, 16 Feb 2021 23:08:19 +0000 (18:08 -0500)] 
fix: Adjust ranges for Ubuntu 5.8.0-44 kernel

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I419904dc9da316b38c2c16a08b6c17625b19b305

Michael Jeanson [Tue, 19 Jan 2021 16:34:25 +0000 (11:34 -0500)] 
fix: missing include for 'task_struct' in fdtable.h

In some kernel versions, linux/fdtable.h dereferences a pointer in a
forward declared 'struct task_struct' without an include of 'linux/sched.h'.

Add this missing include to the wrapper.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie9d3cc8f0e37c0a671a16ce44c9dd3a8686e0ca8

Michael Jeanson [Thu, 7 Jan 2021 17:10:23 +0000 (12:10 -0500)] 
fix: file: Rename fcheck lookup_fd_rcu (v5.11)

See upstream commit:

  commit 460b4f812a9d473d4b39d87d37844f9fc30a9eb3
  Author: Eric W. Biederman <ebiederm@xmission.com>
  Date:   Fri Nov 20 17:14:27 2020 -0600

    file: Rename fcheck lookup_fd_rcu

    Also remove the confusing comment about checking if a fd exists.  I
    could not find one instance in the entire kernel that still matches
    the description or the reason for the name fcheck.

    The need for better names became apparent in the last round of
    discussion of this set of changes[1].

    [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic44ef9b2e3e98153e2d141fc4550b61b6c011793

Mathieu Desnoyers [Wed, 10 Feb 2021 17:33:38 +0000 (12:33 -0500)] 
Fix: do not use bdi_unknown_name symbol

Use the GPL-exported bdi_dev_name introduced in kernel 5.7. Do not use
static inline bdi_dev_name in prior kernels because it uses the bdi_unknown_name
symbol which is not exported to GPL modules.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8b4e4fd84ecacef7942b308e615ca88db8dce7b6

Mathieu Desnoyers [Wed, 10 Feb 2021 16:45:42 +0000 (11:45 -0500)] 
fix: memcg: fix a crash in wb_workfn when a device disappears (5.6)

See upstream commit:

commit 68f23b89067fdf187763e75a56087550624fdbee
("memcg: fix a crash in wb_workfn when a device disappears")

It is currently backported into stable branches 5.4 and 5.5, but appears
to be missing from the 4.4, 4.9, 4.14, 4.19 LTS branches.

Implement our own lttng_bdi_dev_name wrapper to provide this fix on
builds against stable kernels which do not have this fix.

There is one user-visible change with this commit: for builds against
kernels < 4.4.0, the writeback_work_class events did use the
default_backing_dev_info to handle cases where the device is NULL,
writing "default" into the trace. This behavior is now aligned to
match what is done in kernels >= 4.4.0, which is to write "(unknown)"
into the name field.

Link: https://lore.kernel.org/r/537870616.15400.1612973059419.JavaMail.zimbra@efficios.com
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0823643aa2f9d4c2b9f2005748a2adfd4457979a
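
A NULL-safe name lookup of the kind described above might look like the
following userspace sketch. The structures are simplified stand-ins and
the function name is hypothetical; only the "(unknown)" fallback string
comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct device {
	const char *name;
};

struct backing_dev_info {
	struct device *dev;
};

/* Returns the device name, or "(unknown)" when the device is gone. */
const char *bdi_name_or_unknown(const struct backing_dev_info *bdi)
{
	if (!bdi || !bdi->dev)
		return "(unknown)";
	return bdi->dev->name;
}
```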

Mathieu Desnoyers [Fri, 5 Feb 2021 21:21:47 +0000 (16:21 -0500)] 
Fix: writeback: out-of-bound reads

Use ctf_string rather than ctf_array_text for name fields, because the
source strings are not guaranteed to be at least 32 bytes.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
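
The difference between the two field types can be pictured with a plain
C copy. This is a sketch with hypothetical names, not the ctf_* macros
themselves: a fixed 32-byte copy reads past the end of a shorter source
string, while a string copy stops at the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_FIELD_LEN 32

/*
 * String-style copy: reads only up to and including the NUL terminator,
 * so a 4-byte source is never read as if it were NAME_FIELD_LEN bytes.
 * Returns the number of bytes written, terminator included.
 */
size_t copy_name_as_string(char *dst, size_t dst_len, const char *src)
{
	size_t len = strlen(src) + 1;

	assert(len <= dst_len);
	memcpy(dst, src, len);
	return len;
}
```

A fixed-size variant would call `memcpy(dst, src, NAME_FIELD_LEN)`
unconditionally, which is the out-of-bounds read when `src` is short.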
Michael Jeanson [Tue, 9 Feb 2021 16:28:27 +0000 (11:28 -0500)] 
fix: Add one digit to RHEL major release version

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iac1572bde82541f4eeb18004534e750a53e90da8

Michael Jeanson [Tue, 9 Feb 2021 16:25:57 +0000 (11:25 -0500)] 
fix: Add one digit to SLES minor release version

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5d5811c06c45ee0877bdd47967453f467a09a3f6

Michael Jeanson [Mon, 8 Feb 2021 20:32:47 +0000 (15:32 -0500)] 
fix: RT_PATCH_VERSION is close to overflow

We allocated only 8 bits for RT_PATCH_VERSION in LTTNG_RT_VERSION_CODE.
The RT patch version for the 4.4 branch is currently 214, which is
getting close to 256. Bump it to 16 bits to avoid breakage in the
future.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If791cecf3ac8ceb2a4d0c2879410ae6b4199117d

Michael Jeanson [Tue, 9 Feb 2021 16:04:25 +0000 (11:04 -0500)] 
fix: cast LTTNG_KERNEL_VERSION/LTTNG_LINUX_VERSION_CODE to uint64_t

Cast our version macros to an unsigned 64-bit value to prevent
overflow when we append distro-specific version information.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5af74cffe57cfe44c6e4a951c7a1270028650aa5
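
The overflow this cast prevents can be shown with a small sketch. The
bit layout below (a kernel version code shifted left by 16 bits with a
distro field appended) is an assumption made for illustration, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Classic three-field version code. */
#define VERSION_CODE(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

/* 32-bit arithmetic: the major version is shifted out of range. */
uint32_t distro_code_32(uint32_t a, uint32_t b, uint32_t c, uint32_t distro)
{
	return (VERSION_CODE(a, b, c) << 16) + distro;
}

/* 64-bit arithmetic: every field keeps its place. */
uint64_t distro_code_64(uint64_t a, uint64_t b, uint64_t c, uint64_t distro)
{
	return (VERSION_CODE(a, b, c) << 16) + distro;
}
```

For version 5.8.0 with a distro value of 44, the 32-bit result loses the
major version entirely, so range comparisons against it stop being
meaningful.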

Michael Jeanson [Fri, 5 Feb 2021 20:21:55 +0000 (15:21 -0500)] 
fix: UTS_UBUNTU_RELEASE_ABI is close to overflow

We allocated only 8 bits for UTS_UBUNTU_RELEASE_ABI in
LTTNG_UBUNTU_KERNEL_VERSION. The current Xenial kernel has an ABI of 207,
which is getting close to 256. Bump it to 16 bits to avoid breakage in
the future.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id6046ac402b33c8aff577d66a7d68397a1f08d5c

Michael Jeanson [Fri, 5 Feb 2021 17:08:40 +0000 (12:08 -0500)] 
fix: sublevel version overflow in LINUX_VERSION_CODE

The 4.4.256 and 4.9.256 stable releases overflow the 8 bits allocated to
the sublevel in LINUX_VERSION_CODE, which means they report themselves
as 4.5.0 and 4.10.0 respectively. The next releases in these stable
branches will have the sublevel clamped at 255 and will thus report
themselves as 4.4.255 and 4.9.255 for all subsequent releases.

We need a way to properly detect these releases since I doubt they will
stop breaking tracepoint declarations. As a workaround, extract the
version information from the Makefile in the kernel headers and use it
to generate a version code when the sublevel is greater than or equal
to 256.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie12dec7d2f28fc83343262f42680d42a2f40b59e
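
The aliasing described above follows directly from the arithmetic of the
classic three-field encoding. In this sketch, CLASSIC_VERSION mirrors the
traditional 8-bit layout and CLAMPED_VERSION the clamping described in
the message. WIDE_VERSION is a hypothetical wider layout shown only to
illustrate how more sublevel bits restore the ordering; it is not the
encoding used by the tree.

```c
#include <assert.h>

/* Classic encoding: 8 bits each for patchlevel and sublevel. */
#define CLASSIC_VERSION(a, b, c) \
	(((a) << 16) + ((b) << 8) + (c))

/* Clamped encoding: sublevels above 255 all collapse to 255. */
#define CLAMPED_VERSION(a, b, c) \
	(((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/* Hypothetical wider encoding with 16 bits for the sublevel. */
#define WIDE_VERSION(a, b, c) \
	(((unsigned long long)(a) << 32) + \
	 ((unsigned long long)(b) << 16) + (c))

static_assert(CLASSIC_VERSION(4, 4, 256) == CLASSIC_VERSION(4, 5, 0),
	"4.4.256 aliases 4.5.0");
static_assert(CLAMPED_VERSION(4, 4, 257) == CLAMPED_VERSION(4, 4, 255),
	"clamped releases are indistinguishable");
static_assert(WIDE_VERSION(4, 4, 256) < WIDE_VERSION(4, 5, 0),
	"a wider sublevel field keeps the ordering");
```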

Michael Jeanson [Tue, 9 Feb 2021 18:36:29 +0000 (13:36 -0500)] 
Namespace kernel version macros

This patch replaces all uses of the LINUX_VERSION_CODE and
KERNEL_VERSION macros with 'LTTNG_'-prefixed versions, which will allow
us to override them.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2f6c13372e83564bd517c81ed7e2a696da8ee8ec

Mathieu Desnoyers [Fri, 22 Jan 2021 20:25:47 +0000 (15:25 -0500)] 
aarch64: blacklist gcc prior to 5.1

Linux aarch64 requires GCC 5.1 or better because prior versions perform
unsafe access to deallocated stack.

Some Linux distributions may have backported the fix, but it was never
released into earlier upstream gcc versions.

Link: https://lwn.net/Articles/842122/
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63293
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I72993e446f7f54f39d0f360273b68f194be8c13a
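
A compile-time guard of this kind can be written with the compiler's
predefined version macros. This is a sketch, not the check used in the
tree: the helper function and the error text are hypothetical, and the
`__clang__` exclusion is an assumption, added because clang also defines
`__GNUC__`.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the (major, minor) pair is older than GCC 5.1. */
bool gcc_older_than_5_1(int major, int minor)
{
	return major < 5 || (major == 5 && minor < 1);
}

/* Refuse to build on aarch64 with an affected compiler. */
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__)
# if __GNUC__ < 5 || (__GNUC__ == 5 && __GNUC_MINOR__ < 1)
#  error "aarch64 builds need GCC 5.1 or later"
# endif
#endif
```

Failing at build time is preferable here because the miscompilation
would otherwise only show up as stack corruption at run time.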

Michael Jeanson [Mon, 18 Jan 2021 19:25:49 +0000 (14:25 -0500)] 
fix: genirq: Restrict export of irq_to_desc() (v5.11)

See upstream commit:

  commit 64a1b95bb9fe3ec76e1a2cd803eff06389341ae4
  Author: Thomas Gleixner <tglx@linutronix.de>
  Date:   Thu Dec 10 20:26:06 2020 +0100

    genirq: Restrict export of irq_to_desc()

    No more (ab)use in drivers finally. There is still the modular build of
    PPC/KVM which needs it, so restrict it to this case which still makes it
    unavailable for most drivers.

Link: https://lore.kernel.org/r/20201210194045.551428291@linutronix.de
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie5a2af2f02ade07c73e1c7a8aa0fb155280b3d8b

Michael Jeanson [Wed, 13 Jan 2021 19:27:41 +0000 (14:27 -0500)] 
fix: block: merge struct block_device and struct hd_struct (v5.11)

See upstream commit:

  commit 0d02129e76edf91cf04fabf1efbc3a9a1f1d729a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Fri Nov 27 16:43:51 2020 +0100

    block: merge struct block_device and struct hd_struct

    Instead of having two structures that represent each block device with
    different life time rules, merge them into a single one.  This also
    greatly simplifies the reference counting rules, as we can use the inode
    reference count as the main reference count for the new struct
    block_device, with the device model reference front ending it for device
    model interaction.

Change-Id: I47702d1867fda0d8fc0754d761aa4d1ae702cdeb
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 7 Jan 2021 19:50:50 +0000 (14:50 -0500)] 
fix: kprobes: Remove kretprobe hash (v5.11)

See upstream commit:

  commit d741bf41d7c7db4898bacfcb020353cddc032fd8
  Author: Peter Zijlstra <peterz@infradead.org>
  Date:   Sat Aug 29 22:03:24 2020 +0900

    kprobes: Remove kretprobe hash

    The kretprobe hash is mostly superfluous, replace it with a per-task
    variable.

    This gets rid of the task hash and its related locking.

    Note that this may change the kprobes module-exported API for kretprobe
    handlers. If any out-of-tree kretprobe user uses ri->rp, use
    get_kretprobe(ri) instead.

Link: https://lore.kernel.org/r/159870620431.1229682.16325792502413731312.stgit@devnote2
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I855765f390ad7caf481ef5fea334645e852f5b0f

Michael Jeanson [Thu, 7 Jan 2021 17:01:40 +0000 (12:01 -0500)] 
fix: block: remove the request_queue argument to the block_bio_remap tracepoint (v5.11)

See upstream commit:

  commit 1c02fca620f7273b597591065d366e2cca948d8f
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:38 2020 +0100

    block: remove the request_queue argument to the block_bio_remap tracepoint

    The request_queue can trivially be derived from the bio.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic49a9c9ebeea37e4ec79c736382293a6c9ce86d3

Michael Jeanson [Thu, 7 Jan 2021 16:56:25 +0000 (11:56 -0500)] 
fix: block: remove the request_queue argument to the block_split tracepoint (v5.11)

See upstream commit:

  commit eb6f7f7cd3af0f67ce57b21fab1bc64beb643581
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:37 2020 +0100

    block: remove the request_queue argument to the block_split tracepoint

    The request_queue can trivially be derived from the bio.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I93e8c4c51ba36d22b587841e95ff4be8d5224230

3 years agofix: block: simplify and extend the block_bio_merge tracepoint class (v5.11)
Michael Jeanson [Thu, 7 Jan 2021 16:50:25 +0000 (11:50 -0500)] 
fix: block: simplify and extend the block_bio_merge tracepoint class (v5.11)

See upstream commit:

  commit e8a676d61c07eccfcd9d6fddfe4dcb630651c29a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:36 2020 +0100

    block: simplify and extend the block_bio_merge tracepoint class

    The block_bio_merge tracepoint class can be reused for most bio-based
    tracepoints.  For that it just needs to lose the superfluous q and rq
    parameters.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I90a1da20ab07605ed88f29b02f63134fa4aee6a8

3 years agofix: block: remove the request_queue to argument request based tracepoints (v5.11)
Michael Jeanson [Thu, 7 Jan 2021 16:17:20 +0000 (11:17 -0500)] 
fix: block: remove the request_queue to argument request based tracepoints (v5.11)

See upstream commit :

  commit a54895fa057c67700270777f7661d8d3c7fda88a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:39 2020 +0100

    block: remove the request_queue to argument request based tracepoints

    The request_queue can trivially be derived from the request.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ibe4a13f06ed57955fa3e0b77b87a44b7e6b57775

3 years agoVersion 2.12.4 v2.12.4
Mathieu Desnoyers [Mon, 11 Jan 2021 19:01:42 +0000 (14:01 -0500)] 
Version 2.12.4

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4b67ff8cbed189cd08d235f02e737f1721afb949

3 years agofix: adjust version range for trace_find_free_extent()
Michael Jeanson [Tue, 24 Nov 2020 16:27:18 +0000 (11:27 -0500)] 
fix: adjust version range for trace_find_free_extent()

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iaa6088092cf58b4d29d55f3ff9586c57ae272302

3 years agoImprove the release script
Michael Jeanson [Mon, 23 Nov 2020 17:15:43 +0000 (12:15 -0500)] 
Improve the release script

  * Use git-archive. This removes all custom code to clean up the repo;
    the script can now be used in an unclean repo, as the code will be
    exported from a specific tag.
  * Add parameters. This allows using the script on any machine while
    keeping the default behavior for the maintainer.

Change-Id: I9f29d0e1afdbf475d0bbaeb9946ca3216f725e86
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
3 years agoAdd release maintainer script
Mathieu Desnoyers [Mon, 23 Nov 2020 15:49:57 +0000 (10:49 -0500)] 
Add release maintainer script

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
3 years agofix: statedump: undefined symbols caused by incorrect patch backport
He Zhe [Mon, 23 Nov 2020 10:14:25 +0000 (18:14 +0800)] 
fix: statedump: undefined symbols caused by incorrect patch backport

bb346792c2cb ("fix: tracepoint: Optimize using static_call() (v5.10)")
misses three definitions and causes the following build failures.

ERROR: "__tracepoint_lttng_statedump_process_net_ns" [lttng-statedump.ko] undefined!
ERROR: "__tracepoint_lttng_statedump_process_user_ns" [lttng-statedump.ko] undefined!
ERROR: "__tracepoint_lttng_statedump_process_uts_ns" [lttng-statedump.ko] undefined!

Fixes: #1290
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: include order for older kernels
Michael Jeanson [Fri, 20 Nov 2020 16:42:30 +0000 (11:42 -0500)] 
fix: include order for older kernels

Fixes a build failure on v3.0 and v3.1.

Change-Id: Ic48512d2aa5ee46678e67d147b92dba6d0959615
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: tracepoint: Optimize using static_call() (v5.10)
Michael Jeanson [Mon, 26 Oct 2020 21:09:05 +0000 (17:09 -0400)] 
fix: tracepoint: Optimize using static_call() (v5.10)

See upstream commit :

  commit d25e37d89dd2f41d7acae0429039d2f0ae8b4a07
  Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
  Date:   Tue Aug 18 15:57:52 2020 +0200

    tracepoint: Optimize using static_call()

    Currently the tracepoint site will iterate a vector and issue indirect
    calls to however many handlers are registered (ie. the vector is
    long).

    Using static_call() it is possible to optimize this for the common
    case of only having a single handler registered. In this case the
    static_call() can directly call this handler. Otherwise, if the vector
    is longer than 1, call a function that iterates the whole vector like
    the current code.
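
The dispatch choice described above can be sketched in plain user-space C. This is only an analogy, not the kernel's static_call() API: the kernel patches the call site itself, whereas this sketch merely picks an entry point once, when handlers are registered. All names here (tp_sketch, tp_update, tp_fire, the probe_* handlers) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*probe_fn)(int *hits);

struct tp_sketch {
	probe_fn *probes;	/* registered handlers (hypothetical layout) */
	size_t nr_probes;
	/* Entry point chosen at registration time, not at each hit. */
	void (*call)(const struct tp_sketch *tp, int *hits);
};

/* No handler registered: nothing to do. */
static void tp_call_none(const struct tp_sketch *tp, int *hits)
{
	(void) tp;
	(void) hits;
}

/* Common case: exactly one handler, called without any loop. */
static void tp_call_single(const struct tp_sketch *tp, int *hits)
{
	tp->probes[0](hits);
}

/* General case: walk the whole vector of handlers. */
static void tp_call_iterate(const struct tp_sketch *tp, int *hits)
{
	size_t i;

	for (i = 0; i < tp->nr_probes; i++)
		tp->probes[i](hits);
}

/* Select the entry point whenever the handler vector changes. */
static void tp_update(struct tp_sketch *tp, probe_fn *probes, size_t nr_probes)
{
	assert(nr_probes == 0 || probes != NULL);
	tp->probes = probes;
	tp->nr_probes = nr_probes;
	if (nr_probes == 0)
		tp->call = tp_call_none;
	else if (nr_probes == 1)
		tp->call = tp_call_single;
	else
		tp->call = tp_call_iterate;
}

/* The tracepoint site only ever calls through the selected entry point. */
static void tp_fire(const struct tp_sketch *tp, int *hits)
{
	tp->call(tp, hits);
}

/* Two example handlers used to exercise the sketch. */
static void probe_add_one(int *hits) { *hits += 1; }
static void probe_add_ten(int *hits) { *hits += 10; }
```

Registering a single handler with tp_update() routes tp_fire() through tp_call_single(); registering two or more falls back to the loop, which mirrors the "otherwise iterate the whole vector" case in the description.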

Change-Id: I739dd84d62cc1a821b8bd8acff74fa29aa25d22f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed (v5.10)
Michael Jeanson [Mon, 26 Oct 2020 21:07:13 +0000 (17:07 -0400)] 
fix: KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed (v5.10)

See upstream commit :

  commit c4371c2a682e0da1ed2cd7e3c5496f055d873554
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Wed Sep 23 15:04:24 2020 -0700

    KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed

    Introduce RET_PF_FIXED and RET_PF_SPURIOUS to provide unique return
    values instead of overloading RET_PF_RETRY.  In the short term, the
    unique values add clarity to the code and RET_PF_SPURIOUS will be used
    by set_spte() to avoid unnecessary work for spurious faults.

    In the long term, TDX will use RET_PF_FIXED to deterministically map
    memory during pre-boot.  The page fault flow may bail early for benign
    reasons, e.g. if the mmu_notifier fires for an unrelated address.  With
    only RET_PF_RETRY, it's impossible for the caller to distinguish between
    "cool, page is mapped" and "darn, need to try again", and thus cannot
    handle benign cases like the mmu_notifier retry.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie0855c78852b45f588e131fe2463e15aae1bc023

4 years agofix: kvm: x86/mmu: Add TDP MMU PF handler (v5.10)
Michael Jeanson [Mon, 26 Oct 2020 18:28:35 +0000 (14:28 -0400)] 
fix: kvm: x86/mmu: Add TDP MMU PF handler (v5.10)

See upstream commit :

  commit bb18842e21111a979e2e0e1c5d85c09646f18d51
  Author: Ben Gardon <bgardon@google.com>
  Date:   Wed Oct 14 11:26:50 2020 -0700

    kvm: x86/mmu: Add TDP MMU PF handler

    Add functions to handle page faults in the TDP MMU. These page faults
    are currently handled in much the same way as the x86 shadow paging
    based MMU, however the ordering of some operations is slightly
    different. Future patches will add eager NX splitting, a fast page fault
    handler, and parallel page faults.

    Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
    machine. This series introduced no new failures.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie56959cb6c77913d2f1188b0ca15da9114623a4e

4 years agofix: KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint (v5.10)
Michael Jeanson [Mon, 26 Oct 2020 18:11:17 +0000 (14:11 -0400)] 
fix: KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint (v5.10)

See upstream commit :

  commit 235ba74f008d2e0936b29f77f68d4e2f73ffd24a
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Wed Sep 23 13:13:46 2020 -0700

    KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint

    Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in
    terms of what information is captured.  On SVM, add interrupt info and
    error code, while on VMX it adds IDT vectoring and error code.  This
    sets the stage for macrofying the kvm_exit tracepoint definition so that
    it can be reused for kvm_nested_vmexit without loss of information.

    Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter
    failed, as the field is guaranteed to be invalid.  Note, it'd be
    possible to further filter the interrupt/exception fields based on the
    VM-Exit reason, but the helper is intended only for tracepoints, i.e.
    an extra VMREAD or two is a non-issue, the failed VM-Enter case is just
    low hanging fruit.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I638fa29ef7d8bb432de42a33f9ae4db43259b915

4 years agofix: ext4: fast commit recovery path (v5.10)
Michael Jeanson [Mon, 26 Oct 2020 21:03:23 +0000 (17:03 -0400)] 
fix: ext4: fast commit recovery path (v5.10)

See upstream commit :

  commit 8016e29f4362e285f0f7e38fadc61a5b7bdfdfa2
  Author: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
  Date:   Thu Oct 15 13:37:59 2020 -0700

    ext4: fast commit recovery path

    This patch adds fast commit recovery path support for Ext4 file
    system. We add several helper functions that are similar in spirit to
    e2fsprogs journal recovery path handlers. Examples of such functions
    include a simple block allocator, an idempotent block bitmap update
    function, etc. Using these routines and the fast commit log in the fast
    commit area, the recovery path (ext4_fc_replay()) performs fast commit
    log recovery.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia65cf44e108f2df0b458f0d335f33a8f18f50baa

4 years agofix: btrfs: make ordered extent tracepoint take btrfs_inode (v5.10)
Michael Jeanson [Tue, 27 Oct 2020 16:10:05 +0000 (12:10 -0400)] 
fix: btrfs: make ordered extent tracepoint take btrfs_inode (v5.10)

See upstream commit :

  commit acbf1dd0fcbd10c67826a19958f55a053b32f532
  Author: Nikolay Borisov <nborisov@suse.com>
  Date:   Mon Aug 31 14:42:40 2020 +0300

    btrfs: make ordered extent tracepoint take btrfs_inode

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I096d0801ffe0ad826cfe414cdd1c0857cbd2b624

4 years agofix: btrfs: tracepoints: output proper root owner for trace_find_free_extent() (v5.10)
Michael Jeanson [Tue, 27 Oct 2020 15:42:23 +0000 (11:42 -0400)] 
fix: btrfs: tracepoints: output proper root owner for trace_find_free_extent() (v5.10)

See upstream commit :

  commit 437490fed3b0c9ae21af8f70e0f338d34560842b
  Author: Qu Wenruo <wqu@suse.com>
  Date:   Tue Jul 28 09:42:49 2020 +0800

    btrfs: tracepoints: output proper root owner for trace_find_free_extent()

    The current trace event always outputs results like this:

     find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
     find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
     find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)
     find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)

    It's saying we're allocating a data extent for the EXTENT tree, which is
    not even possible.

    It's because we always use EXTENT tree as the owner for
    trace_find_free_extent() without using the @root from
    btrfs_reserve_extent().

    This patch will change the parameter to use proper @root for
    trace_find_free_extent():

    Now it looks much better:

     find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=4096 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=7(CSUM_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=1(ROOT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1d674064d29b31417e2acffdeb735f5052a87032

4 years agofix: objtool: Rename frame.h -> objtool.h (v5.10)
Michael Jeanson [Mon, 26 Oct 2020 17:41:02 +0000 (13:41 -0400)] 
fix: objtool: Rename frame.h -> objtool.h (v5.10)

See upstream commit :

  commit 00089c048eb4a8250325efb32a2724fd0da68cce
  Author: Julien Thierry <jthierry@redhat.com>
  Date:   Fri Sep 4 16:30:25 2020 +0100

    objtool: Rename frame.h -> objtool.h

    Header frame.h is getting more code annotations to help objtool analyze
    object files.

    Rename the file to objtool.h.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic2283161bebcbf1e33b72805eb4d2628f4ae3e89

4 years agoFix: resource leak in id tracker
Mathieu Desnoyers [Thu, 19 Nov 2020 16:03:17 +0000 (11:03 -0500)] 
Fix: resource leak in id tracker

Memory leak found by Coverity:

CID 1412251 (#2 of 2): Resource leak (RESOURCE_LEAK)
21. leaked_storage: Variable head going out of scope leaks the storage it points to.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: strncpy equals destination size warning
Michael Jeanson [Mon, 5 Oct 2020 19:31:42 +0000 (15:31 -0400)] 
fix: strncpy equals destination size warning

Some versions of GCC when called with -Wstringop-truncation will warn
when doing a copy of the same size as the destination buffer with
strncpy :

  ‘strncpy’ specified bound 256 equals destination size [-Werror=stringop-truncation]

Since we unconditionally write '\0' in the last byte, reduce the copy
size by one.
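
A minimal sketch of the pattern, assuming a 256-byte destination buffer as in the warning above. The names copy_name() and SKETCH_NAME_LEN are hypothetical and do not come from the lttng-modules sources.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NAME_LEN	256	/* hypothetical buffer size, matching the warning text */

/*
 * Copy src into a fixed-size buffer. The copy is bounded to one byte
 * less than the buffer size because the last byte is always overwritten
 * with the terminator anyway.
 */
static void copy_name(char dst[SKETCH_NAME_LEN], const char *src)
{
	assert(dst != NULL && src != NULL);
	strncpy(dst, src, SKETCH_NAME_LEN - 1);
	dst[SKETCH_NAME_LEN - 1] = '\0';
}
```

With the bound set to the full buffer size, strncpy() could fill the buffer without a terminator, which is what the compiler flags; with the bound reduced by one the result is the same and the warning goes away.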

Change-Id: Idb907c9550817a06fc0dffc489740f63d440e7d4
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
4 years agoUpdate Changelog for Version 2.12.3 v2.12.3
Mathieu Desnoyers [Mon, 5 Oct 2020 19:29:51 +0000 (15:29 -0400)] 
Update Changelog for Version 2.12.3

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoCleanup: lttng-syscalls: silence warning about uninitialized bitmap variable
Mathieu Desnoyers [Mon, 5 Oct 2020 16:01:37 +0000 (12:01 -0400)] 
Cleanup: lttng-syscalls: silence warning about uninitialized bitmap variable

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: backport 'Add 'kernel_read' wrapper for kernels < v4.14'
Michael Jeanson [Fri, 2 Oct 2020 20:10:05 +0000 (16:10 -0400)] 
fix: backport 'Add 'kernel_read' wrapper for kernels < v4.14'

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3b558a6a4b850054d5786bdf99e0849091c83eae

4 years agoAdd 'kernel_read' wrapper for kernels < v4.14
Michael Jeanson [Fri, 2 Oct 2020 17:03:34 +0000 (13:03 -0400)] 
Add 'kernel_read' wrapper for kernels < v4.14

See upstream commit:

  commit bdd1d2d3d251c65b74ac4493e08db18971c09240
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Fri Sep 1 17:39:13 2017 +0200

    fs: fix kernel_read prototype

    Use proper ssize_t and size_t types for the return value and count
    argument, move the offset last and make it an in/out argument like
    all other read/write helpers, and make the buf argument a void pointer
    to get rid of lots of casts in the callers.
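
The prototype change described above suggests what a compatibility wrapper has to do: accept the new argument order and types, then forward to the old-style call. The sketch below runs in user space against a mock file; old_style_read() only models the older shape implied by the description (offset passed by value, char buffer, int return), and every name in it is hypothetical rather than taken from the kernel or from lttng-modules.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Mock file backed by a memory buffer (hypothetical). */
struct file_sketch {
	const char *data;
	size_t size;
};

/* Old-style shape: offset by value, char buffer, int return. */
static int old_style_read(struct file_sketch *file, off_t offset,
		char *addr, unsigned long count)
{
	size_t avail;

	if (offset < 0 || (size_t) offset >= file->size)
		return 0;
	avail = file->size - (size_t) offset;
	if (count > avail)
		count = avail;
	memcpy(addr, file->data + offset, count);
	return (int) count;
}

/*
 * New-style shape: void buffer, size_t count, ssize_t return, and the
 * offset last as an in/out argument that advances on success.
 */
static ssize_t new_style_read(struct file_sketch *file, void *buf,
		size_t count, off_t *pos)
{
	int ret;

	assert(pos != NULL);
	ret = old_style_read(file, *pos, buf, count);
	if (ret > 0)
		*pos += ret;
	return ret;
}
```

Callers written against the new-style signature can then read sequentially by passing the same pos pointer to each call.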

Change-Id: I825c3fcbcc17e9b46e2a661fadc66b52a94eb2da
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoVersion 2.12.3
Mathieu Desnoyers [Fri, 2 Oct 2020 16:22:27 +0000 (12:22 -0400)] 
Version 2.12.3

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: Use 'kernel_read' to read from procfs
Michael Jeanson [Thu, 24 Sep 2020 19:38:35 +0000 (15:38 -0400)] 
fix: Use 'kernel_read' to read from procfs

Use the 'kernel_read' helper to read files in procfs. It has been present
in the kernel since the 2.6 series and does the right thing both on kernels
that require the set_fs dance and on newer ones that don't.

Change-Id: I1a53fda379e0bb9acc79331626925bbdba63d727
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: don't allow userspace copy to read kernel memory
Michael Jeanson [Fri, 25 Sep 2020 20:05:00 +0000 (16:05 -0400)] 
fix: don't allow userspace copy to read kernel memory

This patch fixes a security issue which allows the root user to read
arbitrary kernel memory. Considering the security model used in LTTng
userspace tooling for kernel tracing, this bug also allows members of
the 'tracing' group to read arbitrary kernel memory.

Calls to __copy_from_user_inatomic() were wrongly enclosed in
set_fs(KERNEL_DS), defeating the access_ok() calls and allowing reads
from kernel memory if a kernel address is provided.

Remove all set_fs() calls around __copy_from_user_inatomic().
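
Why the set_fs(KERNEL_DS) wrapping defeats the check can be illustrated with a user-space model. This is a simplified sketch, not kernel code: it assumes an access_ok()-style range check against a per-task address limit, and the limit values and all sketch_* names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative limits: top of user space vs. the whole address space. */
#define SKETCH_USER_DS		UINT64_C(0x00007fffffffffff)
#define SKETCH_KERNEL_DS	UINT64_MAX

/* Models the per-task address limit that set_fs() changes. */
static uint64_t sketch_addr_limit = SKETCH_USER_DS;

static void sketch_set_fs(uint64_t limit)
{
	assert(limit == SKETCH_USER_DS || limit == SKETCH_KERNEL_DS);
	sketch_addr_limit = limit;
}

/* Range check in the style of access_ok(): [addr, addr + len) must fit under the limit. */
static bool sketch_access_ok(uint64_t addr, uint64_t len)
{
	return len <= sketch_addr_limit && addr <= sketch_addr_limit - len;
}
```

Under the user limit a kernel-range address is rejected; once the limit is raised to cover the whole address space the same check accepts it, so a copy guarded only by that check will read kernel memory.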

As a side effect this will allow us to support v5.10 which should remove
set_fs().

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I35e4562c835217352c012ed96a7b8f93e941381e

4 years agofix: Add a 1MB limit to lttng_strlen_user_inatomic
Michael Jeanson [Fri, 25 Sep 2020 15:23:58 +0000 (11:23 -0400)] 
fix: Add a 1MB limit to lttng_strlen_user_inatomic

The previous implementation was unbounded which could result in long
loops with preemption turned off.
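
A minimal user-space sketch of a bounded length scan, assuming the 1 MB cap named in the commit title. The real helper reads from user space and its return convention is not shown here; sketch_bounded_strlen() and SKETCH_MAX_STRLEN are hypothetical names, and the function simply behaves like strnlen().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_STRLEN	(1024UL * 1024UL)	/* 1 MB cap (assumed) */

/*
 * Count bytes up to the terminator, but never scan more than 'limit'
 * bytes, so the loop length is bounded even for unterminated input.
 */
static size_t sketch_bounded_strlen(const char *str, size_t limit)
{
	size_t len = 0;

	assert(str != NULL);
	while (len < limit && str[len] != '\0')
		len++;
	return len;
}
```

A caller would pass SKETCH_MAX_STRLEN as the limit and treat a result equal to the limit as "no terminator found within the cap".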

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I85afcd879258735bb2e7502f6016fcb2d3974cf7

4 years agofix: Adjust ranges for Ubuntu 4.15.0-119 kernel
Michael Jeanson [Wed, 23 Sep 2020 18:42:18 +0000 (14:42 -0400)] 
fix: Adjust ranges for Ubuntu 4.15.0-119 kernel

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie32f70f810c8fc756fbd31ab129aeb35500790f7

4 years agofix: Adjust ranges for Ubuntu HWE 5.0 kernels
Michael Jeanson [Wed, 16 Sep 2020 19:16:17 +0000 (15:16 -0400)] 
fix: Adjust ranges for Ubuntu HWE 5.0 kernels

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I36f2c3485dcc6ccb74ea86a7ce66fcb1662d060b

4 years agoFix: system call filter table
Mathieu Desnoyers [Tue, 28 Jan 2020 21:02:44 +0000 (16:02 -0500)] 
Fix: system call filter table

The system call filter table has effectively been unused for a long
time due to system call name prefix mismatch. This means the overhead of
selective system call tracing was larger than it should have been because
the event payload preparation would be done for all system calls as soon
as a single system call is traced.

However, fixing this underlying issue unearths several issues that crept
in unnoticed when the "enabler" concept was introduced (after the original
implementation of the system call filter table).

Here is a list of the issues which are resolved here:

- Split lttng_syscalls_unregister into an unregister and destroy
  function, thus waiting for a grace period (and therefore quiescence
  of the users) after unregistering the system call tracepoints before
  freeing the system call filter data structures. This effectively fixes
  a use-after-free.

- The state for enabling "all" system calls vs enabling specific system
  calls (and sequences of enable-disable) was incorrect with respect to
  the "enablers" semantic. This is solved by always tracking the
  bitmap of enabled system calls, and keeping this bitmap even when
  enabling all system calls. The sc_filter is now always allocated
  before system call tracing is registered to tracepoints, which means
  it does not need to be RCU dereferenced anymore.
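
The "always track the bitmap" idea from the second item can be sketched as follows. This is a simplified user-space illustration under assumed names (sc_filter_sketch, SKETCH_NR_SYSCALLS); it is not the lttng-modules data structure. The point it shows is that "enable all" sets every bit rather than discarding the bitmap, so a later per-syscall disable still has state to act on.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NR_SYSCALLS	512U	/* hypothetical table size */
#define SKETCH_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NR_WORDS \
	((SKETCH_NR_SYSCALLS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

struct sc_filter_sketch {
	unsigned long enabled[SKETCH_NR_WORDS];	/* one bit per system call */
};

static void sc_filter_init(struct sc_filter_sketch *f)
{
	memset(f->enabled, 0, sizeof(f->enabled));
}

static void sc_filter_enable(struct sc_filter_sketch *f, unsigned int nr)
{
	assert(nr < SKETCH_NR_SYSCALLS);
	f->enabled[nr / SKETCH_WORD_BITS] |= 1UL << (nr % SKETCH_WORD_BITS);
}

static void sc_filter_disable(struct sc_filter_sketch *f, unsigned int nr)
{
	assert(nr < SKETCH_NR_SYSCALLS);
	f->enabled[nr / SKETCH_WORD_BITS] &= ~(1UL << (nr % SKETCH_WORD_BITS));
}

/* "Enable all" keeps the bitmap and sets every bit. */
static void sc_filter_enable_all(struct sc_filter_sketch *f)
{
	unsigned int nr;

	for (nr = 0; nr < SKETCH_NR_SYSCALLS; nr++)
		sc_filter_enable(f, nr);
}

static bool sc_filter_test(const struct sc_filter_sketch *f, unsigned int nr)
{
	if (nr >= SKETCH_NR_SYSCALLS)
		return false;
	return (f->enabled[nr / SKETCH_WORD_BITS] >> (nr % SKETCH_WORD_BITS)) & 1UL;
}
```

Because the bitmap exists from initialization onward, the tracing path can test it unconditionally, which corresponds to the remark that the filter no longer needs to be RCU dereferenced.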

Padding fields in the ABI are reserved to select whether to:

- Trace either native or compat system call (or both, which is the
  behavior currently implemented),
- Trace either system call entry or exit (or both, which is the
  behavior currently implemented),
- Select the system call to trace by name (behavior currently
  implemented) or by system call number.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: version ranges for ext4_discard_preallocations and writeback_queue_io
Michael Jeanson [Fri, 4 Sep 2020 15:52:51 +0000 (11:52 -0400)] 
fix: version ranges for ext4_discard_preallocations and writeback_queue_io

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id4fa53cb2e713cbda651e1a75deed91013115592

4 years agofix: writeback: Fix sync livelock due to b_dirty_time processing (v5.9)
Michael Jeanson [Mon, 31 Aug 2020 18:16:01 +0000 (14:16 -0400)] 
fix: writeback: Fix sync livelock due to b_dirty_time processing (v5.9)

See upstream commit:

  commit f9cae926f35e8230330f28c7b743ad088611a8de
  Author: Jan Kara <jack@suse.cz>
  Date:   Fri May 29 16:08:58 2020 +0200

    writeback: Fix sync livelock due to b_dirty_time processing

    When we are processing writeback for sync(2), move_expired_inodes()
    didn't set any inode expiry value (older_than_this). This can result in
    writeback never completing if there's a steady stream of inodes added to
    b_dirty_time list as writeback rechecks dirty lists after each writeback
    round whether there's more work to be done. Fix the problem by using
    sync(2) start time as inode expiry value when processing b_dirty_time
    list similarly as for ordinarily dirtied inodes. This requires some
    refactoring of older_than_this handling which simplifies the code
    noticeably as a bonus.

Change-Id: I8b894b13ccc14d9b8983ee4c2810a927c319560b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: writeback: Drop I_DIRTY_TIME_EXPIRE (v5.9)
Michael Jeanson [Mon, 31 Aug 2020 15:41:38 +0000 (11:41 -0400)] 
fix: writeback: Drop I_DIRTY_TIME_EXPIRE (v5.9)

See upstream commit:

  commit 5fcd57505c002efc5823a7355e21f48dd02d5a51
  Author: Jan Kara <jack@suse.cz>
  Date:   Fri May 29 16:24:43 2020 +0200

    writeback: Drop I_DIRTY_TIME_EXPIRE

    The only use of I_DIRTY_TIME_EXPIRE is to detect in
    __writeback_single_inode() that inode got there because flush worker
    decided it's time to writeback the dirty inode time stamps (either
    because we are syncing or because of age). However we can detect this
    directly in __writeback_single_inode() and there's no need for the
    strange propagation with I_DIRTY_TIME_EXPIRE flag.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I92e37c2ff3ec36d431e8f9de5c8e37c5a2da55ea

4 years agofix: removal of [smp_]read_barrier_depends (v5.9)
Michael Jeanson [Tue, 25 Aug 2020 14:56:29 +0000 (10:56 -0400)] 
fix: removal of [smp_]read_barrier_depends (v5.9)

See upstream commits:

  commit 76ebbe78f7390aee075a7f3768af197ded1bdfbb
  Author: Will Deacon <will@kernel.org>
  Date:   Tue Oct 24 11:22:47 2017 +0100

    locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()

    In preparation for the removal of lockless_dereference(), which is the
    same as READ_ONCE() on all architectures other than Alpha, add an
    implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
    used to head dependency chains on all architectures.

  commit 76ebbe78f7390aee075a7f3768af197ded1bdfbb
  Author: Will Deacon <will.deacon@arm.com>
  Date:   Tue Oct 24 11:22:47 2017 +0100

    locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()

    In preparation for the removal of lockless_dereference(), which is the
    same as READ_ONCE() on all architectures other than Alpha, add an
    implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
    used to head dependency chains on all architectures.

Change-Id: Ife8880bd9378dca2972da8838f40fc35ccdfaaac
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: ext4: indicate via a block bitmap read is prefetched… (v5.9)
Michael Jeanson [Mon, 24 Aug 2020 19:37:50 +0000 (15:37 -0400)] 
fix: ext4: indicate via a block bitmap read is prefetched… (v5.9)

See upstream commit:

  commit ab74c7b23f3770935016e3eb3ecdf1e42b73efaa
  Author: Theodore Ts'o <tytso@mit.edu>
  Date:   Wed Jul 15 11:48:55 2020 -0400

    ext4: indicate via a block bitmap read is prefetched via a tracepoint

    Modify the ext4_read_block_bitmap_load tracepoint so that it tells us
    whether a block bitmap is being prefetched.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0e5e2c5b8004223d0928235c092449ee16a940e1

4 years agofix: ext4: limit the length of per-inode prealloc list (v5.9)
Michael Jeanson [Mon, 24 Aug 2020 19:26:04 +0000 (15:26 -0400)] 
fix: ext4: limit the length of per-inode prealloc list (v5.9)

See upstream commit:

  commit 27bc446e2def38db3244a6eb4bb1d6312936610a
  Author: brookxu <brookxu.cn@gmail.com>
  Date:   Mon Aug 17 15:36:15 2020 +0800

    ext4: limit the length of per-inode prealloc list

    In the scenario of writing sparse files, the per-inode prealloc list may
    be very long, resulting in high overhead for ext4_mb_use_preallocated().
    To circumvent this problem, we limit the maximum length of per-inode
    prealloc list to 512 and allow users to modify it.

    After patching, we observed that the sys ratio of cpu has dropped, and
    the system throughput has increased significantly. We created a process
    to write the sparse file, and the running time of the process on the
    fixed kernel was significantly reduced, as follows:

    Running time on unfixed kernel:
    [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
    real    0m2.051s
    user    0m0.008s
    sys     0m2.026s

    Running time on fixed kernel:
    [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
    real    0m0.471s
    user    0m0.004s
    sys     0m0.395s

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5169cb24853d4da32e2862a6626f1f058689b053

4 years agofix: KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only (v5.9)
Michael Jeanson [Mon, 10 Aug 2020 15:36:03 +0000 (11:36 -0400)] 
fix: KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only (v5.9)

  commit 985ab2780164698ec6e7d73fad523d50449261dd
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Mon Jun 22 13:20:32 2020 -0700

    KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only

    Make 'struct kvm_mmu_page' MMU-only, nothing outside of the MMU should
    be poking into the gory details of shadow pages.

Change-Id: Ia5c1b9c49c2b00dad1d5b17c50c3dc730dafda20
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agofix: Move mmutrace.h into the mmu/ sub-directory (v5.9)
Michael Jeanson [Mon, 10 Aug 2020 15:22:05 +0000 (11:22 -0400)] 
fix: Move mmutrace.h into the mmu/ sub-directory (v5.9)

  commit 33e3042dac6bcc33b80835f7d7b502b1d74c457c
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Mon Jun 22 13:20:29 2020 -0700

    KVM: x86/mmu: Move mmu_audit.c and mmutrace.h into the mmu/ sub-directory

    Move mmu_audit.c and mmutrace.h under mmu/ where they belong.

Change-Id: I582525ccca34e1e3bd62870364108a7d3e9df2e4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoKconfig: fix dependency issue when building in-tree without CONFIG_FTRACE
Beniamin Sandu [Thu, 13 Aug 2020 13:24:39 +0000 (16:24 +0300)] 
Kconfig: fix dependency issue when building in-tree without CONFIG_FTRACE

When building in-tree, one could disable CONFIG_FTRACE from kernel
config which will leave CONFIG_TRACEPOINTS selected by LTTNG modules,
but generate a lot of linker errors like below because it leaves out
other stuff, e.g.:

trace.c:(.text+0xd86b): undefined reference to `trace_event_buffer_reserve'
ld: trace.c:(.text+0xd8de): undefined reference to `trace_event_buffer_commit'
ld: trace.c:(.text+0xd926): undefined reference to `event_triggers_call'
ld: trace.c:(.text+0xd942): undefined reference to `trace_event_ignore_this_pid'
ld: net/mac80211/trace.o: in function `trace_event_raw_event_drv_tdls_cancel_channel_switch':

It appears to be caused by the fact that TRACE_EVENT macros in the Linux
kernel depend on the Ftrace ring buffer as soon as CONFIG_TRACEPOINTS is
enabled.

Steps to reproduce:

- Get a clone of an upstream stable kernel and use scripts/built-in.sh on it

- Configure a standard x86-64 build, enable built-in LTTNG but disable
  CONFIG_FTRACE from Kernel Hacking-->Tracers using menuconfig

- Build will fail at linking stage

Signed-off-by: Beniamin Sandu <beniaminsandu@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoVersion 2.12.2 v2.12.2
Mathieu Desnoyers [Tue, 4 Aug 2020 13:46:24 +0000 (09:46 -0400)] 
Version 2.12.2

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoFix: Lock metadata cache on session destroy
Mathieu Desnoyers [Mon, 13 Jul 2020 18:59:33 +0000 (14:59 -0400)] 
Fix: Lock metadata cache on session destroy

commit 92143b2c5656 ("Fix: metadata stream leak, missing list removal and locking")
missed taking a lock protecting the metadata stream list iteration on
session destroy. This opens a race window between iteration and item
removal/free which triggers a kernel OOPS.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoFix: metadata stream leak, missing list removal and locking
Mathieu Desnoyers [Fri, 10 Jul 2020 15:15:40 +0000 (11:15 -0400)] 
Fix: metadata stream leak, missing list removal and locking

The metadata stream is part of a list of metadata streams in the
metadata cache. Its addition to the list should be protected by
the metadata cache lock. It needs to be paired with protection
of list iteration with the same lock.

Removal from the list is entirely missing, and should be added
to lttng_metadata_ring_buffer_release (with proper locking).

This missing list removal was probably not causing issues because the
metadata stream structure was leaked: a kfree() is missing from
lttng_metadata_ring_buffer_release as well.
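
The pairing described above (list addition, iteration, and removal all under the same lock, plus freeing the stream on release) can be sketched in user space with a pthread mutex standing in for the metadata cache lock. The structures and function names are hypothetical and much simpler than the real ones.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct stream_sketch {
	struct stream_sketch *next;
};

struct cache_sketch {
	pthread_mutex_t lock;		/* stands in for the metadata cache lock */
	struct stream_sketch *head;	/* list of streams attached to the cache */
};

/* Addition to the list is done under the cache lock. */
static struct stream_sketch *cache_add_stream(struct cache_sketch *cache)
{
	struct stream_sketch *stream = calloc(1, sizeof(*stream));

	if (!stream)
		return NULL;
	pthread_mutex_lock(&cache->lock);
	stream->next = cache->head;
	cache->head = stream;
	pthread_mutex_unlock(&cache->lock);
	return stream;
}

/* Iteration takes the same lock as addition and removal. */
static size_t cache_count_streams(struct cache_sketch *cache)
{
	struct stream_sketch *stream;
	size_t count = 0;

	pthread_mutex_lock(&cache->lock);
	for (stream = cache->head; stream; stream = stream->next)
		count++;
	pthread_mutex_unlock(&cache->lock);
	return count;
}

/* Release: unlink under the lock, then free the stream. */
static void cache_release_stream(struct cache_sketch *cache,
		struct stream_sketch *stream)
{
	struct stream_sketch **pp;

	assert(stream != NULL);
	pthread_mutex_lock(&cache->lock);
	for (pp = &cache->head; *pp; pp = &(*pp)->next) {
		if (*pp == stream) {
			*pp = stream->next;
			break;
		}
	}
	pthread_mutex_unlock(&cache->lock);
	free(stream);
}
```

Leaving out either the unlink or the free() in the release path reproduces the two omissions named in the message: a stale list entry and a leaked structure.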

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years agoFix: coherent state not changed atomically with metadata written
Mathieu Desnoyers [Fri, 10 Jul 2020 14:51:26 +0000 (10:51 -0400)] 
Fix: coherent state not changed atomically with metadata written

commit 122c63cb4310 ("Fix: Implement RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK")
introduces a new ioctl which returns a flag indicating whether the
metadata is in a consistent state at the end of the sub-buffer.

That commit is meant to address metadata consistency issues observable
in live sessions.

However, the "consistent" state is false as soon as a producer is
active (between an outermost metadata_begin/end pair). Unfortunately,
if the last "RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK" operation is
done between the last metadata printf and "end" of the transaction, the
last consistency state will be false, and the consumer daemon will never
send metadata to the relay daemon. This in turn causes a live viewer to
wait for metadata endlessly.

This issue can be reproduced by running lttng-tools:
tests/regression/tools/live/test_kernel

as root in a loop.

We observe two things:
1) The poll operation blocks when there is no more metadata to send,
   which means there is no way to unblock it when the consistency state
   changes back to "true" without producing additional metadata.

2) Even if (1) were fixed, the expectation from an ABI perspective is
   that the "coherent" state is only populated when
   RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK succeeds. Therefore,
   there is no way to let user-space know about a coherency transition
   unless additional metadata is generated.

Fixing this requires holding the metadata cache lock across the entire
production of a coherent metadata transaction. This simpler scheme is
possible because the metadata is generated in a reallocated memory area
and not directly into a ring buffer anymore. This was not the case in
earlier lttng-modules versions, when the metadata was generated directly
into a ring buffer, which explains why this simpler scheme was not
implemented.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
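
The scheme described in this commit message, holding one lock across a whole
metadata transaction so that a reader can never observe a half-written state,
can be sketched in userspace C as follows. This is an assumption-laden
illustration, not the lttng-modules implementation: struct md_cache,
md_begin(), md_append(), md_end() and md_read_len() are hypothetical names, a
pthread mutex stands in for the metadata cache lock, and a single,
non-nested producer is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a metadata cache backed by reallocated memory. */
struct md_cache {
	pthread_mutex_t lock;	/* held for the whole transaction */
	char *data;		/* reallocated memory area */
	size_t len;
	bool in_transaction;	/* only modified with the lock held */
};

static void md_init(struct md_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->data = NULL;
	c->len = 0;
	c->in_transaction = false;
}

/* Begin a transaction: take the lock and keep it until md_end(). */
static void md_begin(struct md_cache *c)
{
	pthread_mutex_lock(&c->lock);
	c->in_transaction = true;
}

/* Append text to the cache; must be called between md_begin/md_end. */
static int md_append(struct md_cache *c, const char *s)
{
	size_t n = strlen(s);
	char *p;

	assert(c->in_transaction);
	p = realloc(c->data, c->len + n + 1);
	if (!p)
		return -1;
	memcpy(p + c->len, s, n + 1);
	c->data = p;
	c->len += n;
	return 0;
}

/* End the transaction: the cache becomes visible to readers atomically. */
static void md_end(struct md_cache *c)
{
	assert(c->in_transaction);
	c->in_transaction = false;
	pthread_mutex_unlock(&c->lock);
}

/* Reader side: taking the same lock guarantees a coherent snapshot. */
static size_t md_read_len(struct md_cache *c)
{
	size_t len;

	pthread_mutex_lock(&c->lock);
	assert(!c->in_transaction);
	len = c->len;
	pthread_mutex_unlock(&c->lock);
	return len;
}
```

Because the reader blocks until md_end() releases the lock, it has no
"incoherent" state to report: every length it reads corresponds to a
completed transaction.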
4 years ago  fix: include module.h for EXPORT_SYMBOL_GPL
Michael Jeanson [Tue, 7 Jul 2020 18:18:37 +0000 (14:18 -0400)] 
fix: include module.h for EXPORT_SYMBOL_GPL

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic337e1eb375791ace08560555dd02b37cbefcf25

4 years ago  fix: __lttng_vmalloc_node_range const caller introduced in v3.6
Michael Jeanson [Tue, 7 Jul 2020 17:50:15 +0000 (13:50 -0400)] 
fix: __lttng_vmalloc_node_range const caller introduced in v3.6

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib13cf03b5ab11830a8732318a12713720cf1b3e3

4 years ago  fix: version range for overflow_callback
Michael Jeanson [Tue, 7 Jul 2020 18:07:01 +0000 (14:07 -0400)] 
fix: version range for overflow_callback

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1b8f1d59552a1723d3f4ed74780a2b57d13d0e52

4 years ago  fix: global_dirty_limit was introduced in v3.1
Michael Jeanson [Tue, 7 Jul 2020 17:00:10 +0000 (13:00 -0400)] 
fix: global_dirty_limit was introduced in v3.1

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id97dbb2d0181a45c45cfed36c4be8753cabac283

4 years ago  fix: wrapper_uprobe_unregister is a void function
Michael Jeanson [Tue, 7 Jul 2020 16:21:54 +0000 (12:21 -0400)] 
fix: wrapper_uprobe_unregister is a void function

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib4438da02aac3defd1245324d1b48f400f806d58

4 years ago  fix: prior to v4.0, __vmalloc_node_range had no vm_flags param
Michael Jeanson [Tue, 7 Jul 2020 15:58:03 +0000 (11:58 -0400)] 
fix: prior to v4.0, __vmalloc_node_range had no vm_flags param

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib476e32d109298d9ca3e6b6ab7ac8f63c50fb09f

4 years ago  fix: vmalloc on v5.8 without KALLSYMS
Michael Jeanson [Tue, 7 Jul 2020 15:15:39 +0000 (11:15 -0400)] 
fix: vmalloc on v5.8 without KALLSYMS

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic945dad92e78a5bc2895a969a10c527e1349decf

4 years ago  Detect missing symbols used with kallsyms_lookup at compile time
Michael Jeanson [Thu, 14 May 2020 17:47:35 +0000 (13:47 -0400)] 
Detect missing symbols used with kallsyms_lookup at compile time

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I19a9a31c386196899517899d861fe63611272139

4 years ago  Use exported symbol bdevname() instead of disk_name()
Michael Jeanson [Thu, 2 Jul 2020 16:06:42 +0000 (12:06 -0400)] 
Use exported symbol bdevname() instead of disk_name()

bdevname() is a simple wrapper over disk_name() but has the honor of being
exported. Using it removes the need for a kallsyms wrapper.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic2b2233c4db7826175c68edea69751ddcb17a5e6

4 years ago  Add git-review config
Michael Jeanson [Fri, 3 Jul 2020 14:46:12 +0000 (10:46 -0400)] 
Add git-review config

Add .gitreview for contributors wishing to use gerrit for patch
reviews.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I663e66a433ddb645f580c4b9f885db9c3a08e02f

4 years ago  fix: mm: remove vmalloc_sync_(un)mappings() (v5.8)
Michael Jeanson [Thu, 2 Jul 2020 15:21:42 +0000 (11:21 -0400)] 
fix: mm: remove vmalloc_sync_(un)mappings() (v5.8)

See upstream commit:

  commit 73f693c3a705756032c2863bfb37570276902d7d
  Author: Joerg Roedel <jroedel@suse.de>
  Date:   Mon Jun 1 21:52:36 2020 -0700

    mm: remove vmalloc_sync_(un)mappings()

    These functions are not needed anymore because the vmalloc and ioremap
    mappings are now synchronized when they are created or torn down.

    Remove all callers and function definitions.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifdefa35b25b4906cde407360e608b77e47cc3808

4 years ago  fix: mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK (v5.8)
Michael Jeanson [Mon, 15 Jun 2020 15:12:24 +0000 (11:12 -0400)] 
fix: mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK (v5.8)

See upstream commit:

  commit 8d92890bd6b8502d6aee4b37430ae6444ade7a8c
  Author: NeilBrown <neilb@suse.de>
  Date:   Mon Jun 1 21:48:21 2020 -0700

    mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead

    After an NFS page has been written it is considered "unstable" until a
    COMMIT request succeeds.  If the COMMIT fails, the page will be
    re-written.

    These "unstable" pages are currently accounted as "reclaimable", either
    in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
    'reclaimable' count.  This might have made sense when sending the COMMIT
    required a separate action by the VFS/MM (e.g.  releasepage() used to
    send a COMMIT).  However now that all writes generated by ->writepages()
    will automatically be followed by a COMMIT (since commit 919e3bd9a875
    ("NFS: Ensure we commit after writeback is complete")) it makes more
    sense to treat them as writeback pages.

    So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in
    NR_WRITEBACK and WB_WRITEBACK.

    A particular effect of this change is that when
    wb_check_background_flush() calls wb_over_bg_threshold(), the latter
    will report 'true' a lot less often as the 'unstable' pages are no
    longer considered 'dirty' (as there is nothing that writeback can do
    about them anyway).

    Currently wb_check_background_flush() will trigger writeback to NFS even
    when there are relatively few dirty pages (if there are lots of unstable
    pages), this can result in small writes going to the server (10s of
    Kilobytes rather than a Megabyte) which hurts throughput.  With this
    patch, there are fewer writes which are each larger on average.

    Where the NR_UNSTABLE_NFS count was included in statistics
    virtual-files, the entry is retained, but the value is hard-coded as
    zero.  static trace points and warning printks which mentioned this
    counter no longer report it.

Change-Id: I18080ca62bc6c1cd7d6da4cb27cc1521fbdca5e1
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years ago  fix: block: remove the error argument to the block_bio_complete (v5.8)
Michael Jeanson [Mon, 15 Jun 2020 15:06:13 +0000 (11:06 -0400)] 
fix: block: remove the error argument to the block_bio_complete (v5.8)

See upstream commit:

  commit d24de76af836260a99ca2ba281a937bd5bc55591
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed Jun 3 07:14:43 2020 +0200

    block: remove the error argument to the block_bio_complete tracepoint

    The status can be trivially derived from the bio itself.  That also avoid
    callers like NVMe to incorrectly pass a blk_status_t instead of the errno,
    and the overhead of translating the blk_status_t to the errno in the I/O
    completion fast path when no tracing is enabled.

Fixes: 35fe0d12c8a3 ("nvme: trace bio completion")
Change-Id: I8d1463184d79bfab418a1755bfc6a0200170fff3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
4 years ago  fix: pipe_buf_operations rework (v5.8)
Michael Jeanson [Mon, 15 Jun 2020 14:51:41 +0000 (10:51 -0400)] 
fix: pipe_buf_operations rework (v5.8)

See upstream commits:

  commit c928f642c29a5ffb02e16f2430b42b876dde69de
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed May 20 17:58:16 2020 +0200

    fs: rename pipe_buf ->steal to ->try_steal

    And replace the arcane return value convention with a simple bool
    where true means success and false means failure.

    [AV: braino fix folded in]

  commit b8d9e7f2411b0744df2ec33e80d7698180fef21a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed May 20 17:58:15 2020 +0200

    fs: make the pipe_buf_operations ->confirm operation optional

    Just return 0 for success if it is not present.

  commit 76887c256744740d6121af9bc4aa787712a1f694
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed May 20 17:58:14 2020 +0200

    fs: make the pipe_buf_operations ->steal operation optional

    Just return 1 for failure if it is not present.

Change-Id: Ic185632202470db1eb5b012e95e793ff2cb26be7
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
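
The conventions introduced by the three upstream commits quoted above
(optional operations with a default result, and a bool-returning try_steal
where true means success) can be illustrated with a small userspace sketch.
The names struct buf, struct buf_ops, buf_confirm() and buf_try_steal() are
hypothetical and only mirror the shape of the kernel interface; this is not
the kernel or lttng-modules code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct buf;

/* Hypothetical operations table; both callbacks are optional. */
struct buf_ops {
	int (*confirm)(struct buf *b);		/* NULL: treated as success (0) */
	bool (*try_steal)(struct buf *b);	/* NULL: treated as failure (false) */
};

struct buf {
	const struct buf_ops *ops;
};

/* Return 0 for success when no confirm operation is provided. */
static int buf_confirm(struct buf *b)
{
	assert(b && b->ops);
	if (!b->ops->confirm)
		return 0;
	return b->ops->confirm(b);
}

/* Return false (cannot steal) when no try_steal operation is provided. */
static bool buf_try_steal(struct buf *b)
{
	assert(b && b->ops);
	if (!b->ops->try_steal)
		return false;
	return b->ops->try_steal(b);
}

/* Example callback: a buffer type that can always be stolen. */
static bool always_steal(struct buf *b)
{
	(void) b;
	return true;
}

/* An ops table may now leave both callbacks unset. */
static const struct buf_ops empty_ops = { NULL, NULL };
static const struct buf_ops stealable_ops = { .try_steal = always_steal };
```

An out-of-tree user of such a table that previously had to supply a steal
callback returning 1 for failure can, under this convention, simply leave
the pointer unset.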