git.lttng.org Git - lttng-modules.git/log

fix: kprobes: Remove kretprobe hash (v5.11)

See upstream commit:

  commit d741bf41d7c7db4898bacfcb020353cddc032fd8
  Author: Peter Zijlstra <peterz@infradead.org>
  Date:   Sat Aug 29 22:03:24 2020 +0900

    kprobes: Remove kretprobe hash

    The kretprobe hash is mostly superfluous, replace it with a per-task
    variable.

    This gets rid of the task hash and its related locking.

    Note that this may change the kprobes module-exported API for kretprobe
    handlers. If any out-of-tree kretprobe user uses ri->rp, use
    get_kretprobe(ri) instead.

Link: https://lore.kernel.org/r/159870620431.1229682.16325792502413731312.stgit@devnote2
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I717953f6364ae72803d7a8c55947662319835a5f

fix: file: Rename fcheck lookup_fd_rcu (v5.11)

See upstream commit:

  commit 460b4f812a9d473d4b39d87d37844f9fc30a9eb3
  Author: Eric W. Biederman <ebiederm@xmission.com>
  Date:   Fri Nov 20 17:14:27 2020 -0600

    file: Rename fcheck lookup_fd_rcu

    Also remove the confusing comment about checking if a fd exists.  I
    could not find one instance in the entire kernel that still matches
    the description or the reason for the name fcheck.

    The need for better names became apparent in the last round of
    discussion of this set of changes[1].

    [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iccdab43e94c62dadd3faa52c66b410f1955674fd

fix: block: remove the request_queue argument to the block_bio_remap tracepoint (v5.11)

See upstream commit:

  commit 1c02fca620f7273b597591065d366e2cca948d8f
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:38 2020 +0100

    block: remove the request_queue argument to the block_bio_remap tracepoint

    The request_queue can trivially be derived from the bio.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Idd95e1ae5175167043eeaef9bb0ab879985dd96d

fix: block: remove the request_queue argument to the block_split tracepoint (v5.11)

See upstream commit:

  commit eb6f7f7cd3af0f67ce57b21fab1bc64beb643581
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:37 2020 +0100

    block: remove the request_queue argument to the block_split tracepoint

    The request_queue can trivially be derived from the bio.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id64217d35cbae698b3ab77401c5d3277800e2a6a

fix: block: simplify and extend the block_bio_merge tracepoint class (v5.11)

See upstream commit:

  commit e8a676d61c07eccfcd9d6fddfe4dcb630651c29a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:36 2020 +0100

    block: simplify and extend the block_bio_merge tracepoint class

    The block_bio_merge tracepoint class can be reused for most bio-based
    tracepoints.  For that it just needs to lose the superfluous q and rq
    parameters.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia44719af77eeb6c116c079b0fc68330458f7a592

fix: block: remove the request_queue to argument request based tracepoints (v5.11)

See upstream commit :

  commit a54895fa057c67700270777f7661d8d3c7fda88a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Thu Dec 3 17:21:39 2020 +0100

    block: remove the request_queue to argument request based tracepoints

    The request_queue can trivially be derived from the request.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I532c928bdd36ef57d9b0b61c95fe42f5d479cefb

Version 2.11.7

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8aa77dfe301395ba850220384e6ca8a4c69ac858

fix: adjust version range for trace_find_free_extent()

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iaa6088092cf58b4d29d55f3ff9586c57ae272302

fix: backport of fix: tracepoint: Optimize using static_call() (v5.10)

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I94f2b845f11654e639f03254185980de527a4ca8

Revert "fix: include order for older kernels"

This reverts commit 2ce89d35c9477d8c17c00489c72e1548e16af9b9.

This commit is only needed for master and stable-2.12, because
stable-2.11 does not include irq_work.h.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: backport of fix: ext4: fast commit recovery path (v5.10)

Add missing '#endif'.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I43349d685d7ed740b32ce992be0c2e7e6f12c799

Improve the release script

  * Use git-archive. This removes all custom code to clean up the repo;
    the script can now be used in an unclean repo, as the code will be
    exported from a specific tag.
  * Add parameters. This allows using the script on any machine while
    keeping the default behavior for the maintainer.

Change-Id: I9f29d0e1afdbf475d0bbaeb9946ca3216f725e86
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Add release maintainer script

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: include order for older kernels

Fixes a build failure on v3.0 and v3.1.

Change-Id: Ic48512d2aa5ee46678e67d147b92dba6d0959615
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: tracepoint: Optimize using static_call() (v5.10)

See upstream commit :

  commit d25e37d89dd2f41d7acae0429039d2f0ae8b4a07
  Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
  Date:   Tue Aug 18 15:57:52 2020 +0200

    tracepoint: Optimize using static_call()

    Currently the tracepoint site will iterate a vector and issue indirect
    calls to however many handlers are registered (ie. the vector is
    long).

    Using static_call() it is possible to optimize this for the common
    case of only having a single handler registered. In this case the
    static_call() can directly call this handler. Otherwise, if the vector
    is longer than 1, call a function that iterates the whole vector like
    the current code.

Change-Id: I739dd84d62cc1a821b8bd8acff74fa29aa25d22f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
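
The single-handler optimization described in the upstream message above
can be modelled in plain userspace C. This is only a sketch of the idea,
not the kernel's static_call() machinery: a function pointer stands in
for the patched call site, and every name below (handlers, call_target,
register_handler, fire, add_one, add_ten) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 4

typedef void (*handler_fn)(int *counter);

static handler_fn handlers[MAX_HANDLERS];
static size_t nr_handlers;
static handler_fn call_target;	/* plays the role of the patched call site */

/* Slow path: walk the whole vector and call every handler. */
static void iterate_handlers(int *counter)
{
	for (size_t i = 0; i < nr_handlers; i++)
		handlers[i](counter);
}

/* Register a handler and re-point the call target accordingly. */
static void register_handler(handler_fn fn)
{
	assert(fn != NULL && nr_handlers < MAX_HANDLERS);
	handlers[nr_handlers++] = fn;
	/* One handler: call it directly. More than one: use the iterator. */
	call_target = (nr_handlers == 1) ? handlers[0] : iterate_handlers;
}

/* Tracepoint site: a single call through the current target. */
static void fire(int *counter)
{
	if (call_target)
		call_target(counter);
}

/* Two example handlers used to exercise the model. */
static void add_one(int *counter) { *counter += 1; }
static void add_ten(int *counter) { *counter += 10; }
```

With one handler registered, fire() reaches it through a single call;
once a second handler is added, the target switches to the iterating
function, which matches the fallback described in the commit message.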

fix: KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed (v5.10)

See upstream commit :

  commit c4371c2a682e0da1ed2cd7e3c5496f055d873554
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Wed Sep 23 15:04:24 2020 -0700

    KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed

    Introduce RET_PF_FIXED and RET_PF_SPURIOUS to provide unique return
    values instead of overloading RET_PF_RETRY.  In the short term, the
    unique values add clarity to the code and RET_PF_SPURIOUS will be used
    by set_spte() to avoid unnecessary work for spurious faults.

    In the long term, TDX will use RET_PF_FIXED to deterministically map
    memory during pre-boot.  The page fault flow may bail early for benign
    reasons, e.g. if the mmu_notifier fires for an unrelated address.  With
    only RET_PF_RETRY, it's impossible for the caller to distinguish between
    "cool, page is mapped" and "darn, need to try again", and thus cannot
    handle benign cases like the mmu_notifier retry.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie0855c78852b45f588e131fe2463e15aae1bc023

fix: kvm: x86/mmu: Add TDP MMU PF handler (v5.10)

See upstream commit :

  commit bb18842e21111a979e2e0e1c5d85c09646f18d51
  Author: Ben Gardon <bgardon@google.com>
  Date:   Wed Oct 14 11:26:50 2020 -0700

    kvm: x86/mmu: Add TDP MMU PF handler

    Add functions to handle page faults in the TDP MMU. These page faults
    are currently handled in much the same way as the x86 shadow paging
    based MMU, however the ordering of some operations is slightly
    different. Future patches will add eager NX splitting, a fast page fault
    handler, and parallel page faults.

    Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
    machine. This series introduced no new failures.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie56959cb6c77913d2f1188b0ca15da9114623a4e

fix: KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint (v5.10)

See upstream commit :

  commit 235ba74f008d2e0936b29f77f68d4e2f73ffd24a
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Wed Sep 23 13:13:46 2020 -0700

    KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint

    Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in
    terms of what information is captured.  On SVM, add interrupt info and
    error code, while on VMX it adds IDT vectoring and error code.  This
    sets the stage for macrofying the kvm_exit tracepoint definition so that
    it can be reused for kvm_nested_vmexit without loss of information.

    Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter
    failed, as the field is guaranteed to be invalid.  Note, it'd be
    possible to further filter the interrupt/exception fields based on the
    VM-Exit reason, but the helper is intended only for tracepoints, i.e.
    an extra VMREAD or two is a non-issue, the failed VM-Enter case is just
    low hanging fruit.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I638fa29ef7d8bb432de42a33f9ae4db43259b915

fix: ext4: fast commit recovery path (v5.10)

See upstream commit :

  commit 8016e29f4362e285f0f7e38fadc61a5b7bdfdfa2
  Author: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
  Date:   Thu Oct 15 13:37:59 2020 -0700

    ext4: fast commit recovery path

    This patch adds fast commit recovery path support for Ext4 file
    system. We add several helper functions that are similar in spirit to
    e2fsprogs journal recovery path handlers. Examples of such functions
    include a simple block allocator, an idempotent block bitmap update
    function, etc. Using these routines and the fast commit log in the fast
    commit area, the recovery path (ext4_fc_replay()) performs fast commit
    log recovery.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia65cf44e108f2df0b458f0d335f33a8f18f50baa

fix: btrfs: make ordered extent tracepoint take btrfs_inode (v5.10)

See upstream commit :

  commit acbf1dd0fcbd10c67826a19958f55a053b32f532
  Author: Nikolay Borisov <nborisov@suse.com>
  Date:   Mon Aug 31 14:42:40 2020 +0300

    btrfs: make ordered extent tracepoint take btrfs_inode

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I096d0801ffe0ad826cfe414cdd1c0857cbd2b624

fix: btrfs: tracepoints: output proper root owner for trace_find_free_extent() (v5.10)

See upstream commit :

  commit 437490fed3b0c9ae21af8f70e0f338d34560842b
  Author: Qu Wenruo <wqu@suse.com>
  Date:   Tue Jul 28 09:42:49 2020 +0800

    btrfs: tracepoints: output proper root owner for trace_find_free_extent()

    The current trace event always outputs results like this:

     find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
     find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
     find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)
     find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)

    It's saying we're allocating a data extent for the EXTENT tree, which
    is not even possible.

    It's because we always use EXTENT tree as the owner for
    trace_find_free_extent() without using the @root from
    btrfs_reserve_extent().

    This patch will change the parameter to use proper @root for
    trace_find_free_extent():

    Now it looks much better:

     find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=4096 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
     find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=7(CSUM_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
     find_free_extent: root=1(ROOT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1d674064d29b31417e2acffdeb735f5052a87032

fix: objtool: Rename frame.h -> objtool.h (v5.10)

See upstream commit :

  commit 00089c048eb4a8250325efb32a2724fd0da68cce
  Author: Julien Thierry <jthierry@redhat.com>
  Date:   Fri Sep 4 16:30:25 2020 +0100

    objtool: Rename frame.h -> objtool.h

    Header frame.h is getting more code annotations to help objtool analyze
    object files.

    Rename the file to objtool.h.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic2283161bebcbf1e33b72805eb4d2628f4ae3e89
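
A rename like this is typically handled in out-of-tree code by selecting
the header based on the kernel version. In a real module that selection
is a preprocessor #if around the #include; the sketch below expresses
the same decision as a small runtime function so it can be exercised in
userspace. TOY_KERNEL_VERSION and objtool_header are hypothetical names;
the encoding mirrors the kernel's classic KERNEL_VERSION() macro.

```c
#include <assert.h>
#include <string.h>

/* Same encoding as the kernel's classic KERNEL_VERSION() macro. */
#define TOY_KERNEL_VERSION(a, b, c) \
	((unsigned int)(((a) << 16) + ((b) << 8) + (c)))

/* Pick the header that carries the objtool annotations for a kernel. */
static const char *objtool_header(unsigned int version_code)
{
	assert(version_code != 0);
	if (version_code >= TOY_KERNEL_VERSION(5, 10, 0))
		return "linux/objtool.h";	/* after the rename */
	return "linux/frame.h";			/* before the rename */
}
```

For example, objtool_header(TOY_KERNEL_VERSION(5, 9, 0)) selects
"linux/frame.h", while any version code from 5.10.0 onward selects
"linux/objtool.h".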

fix: strncpy equals destination size warning

Some versions of GCC when called with -Wstringop-truncation will warn
when doing a copy of the same size as the destination buffer with
strncpy :

‘strncpy’ specified bound 256 equals destination size [-Werror=stringop-truncation]

Since we unconditionally write '\0' in the last byte, reduce the copy
size by one.

Change-Id: Idb907c9550817a06fc0dffc489740f63d440e7d4
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
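
The pattern described in the commit above can be sketched in userspace C
as follows. NAME_LEN and copy_name are hypothetical names chosen for the
illustration; only the 256-byte bound comes from the warning text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 256	/* hypothetical buffer size, matching the warning */

/*
 * Copy src into a NAME_LEN-byte buffer, always NUL-terminating it.
 * The bound passed to strncpy() is one less than the buffer size, so
 * the last byte is reserved for the terminator written below.
 */
static void copy_name(char dst[NAME_LEN], const char *src)
{
	assert(dst != NULL && src != NULL);
	strncpy(dst, src, NAME_LEN - 1);
	dst[NAME_LEN - 1] = '\0';
}
```

A source string longer than the buffer is truncated to NAME_LEN - 1
bytes plus the terminator; shorter strings are copied unchanged.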

Version 2.11.6

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Cleanup: lttng-syscalls: silence warning about uninitialized bitmap variable

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: backport 'Add 'kernel_read' wrapper for kernels < v4.14'

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3b558a6a4b850054d5786bdf99e0849091c83eae

Add 'kernel_read' wrapper for kernels < v4.14

See upstream commit:

  commit bdd1d2d3d251c65b74ac4493e08db18971c09240
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Fri Sep 1 17:39:13 2017 +0200

    fs: fix kernel_read prototype

    Use proper ssize_t and size_t types for the return value and count
    argument, move the offset last and make it an in/out argument like
    all other read/write helpers, and make the buf argument a void pointer
    to get rid of lots of casts in the callers.

Change-Id: I825c3fcbcc17e9b46e2a661fadc66b52a94eb2da
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
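
The shape of such a wrapper can be sketched in userspace C. This is a
model only: struct toy_file, old_read and compat_read are hypothetical
stand-ins, and plain long is used where the kernel uses loff_t and
ssize_t. The old helper takes the offset by value and returns an int;
the wrapper exposes the newer argument order with an in/out position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_file {	/* hypothetical stand-in for struct file */
	const char *data;
	size_t size;
};

/* Old-style helper: offset by value, char buffer, int return. */
static int old_read(struct toy_file *file, long offset, char *addr,
		unsigned long count)
{
	size_t avail;

	assert(file != NULL && addr != NULL && offset >= 0);
	if ((size_t)offset >= file->size)
		return 0;	/* at or past end of file */
	avail = file->size - (size_t)offset;
	if (count > avail)
		count = avail;
	memcpy(addr, file->data + offset, count);
	return (int)count;
}

/* New-style wrapper: void buffer, size_t count, in/out position last. */
static long compat_read(struct toy_file *file, void *buf, size_t count,
		long *pos)
{
	int ret = old_read(file, *pos, buf, count);

	if (ret > 0)
		*pos += ret;	/* advance the caller's position */
	return ret;
}
```

Callers written against the newer prototype can then use compat_read()
unchanged, with the position advancing across successive reads.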

fix: Use 'kernel_read' to read from procfs

Use the 'kernel_read' helper to read files in procfs. It has been present
in the kernel since the 2.6 series and does the right thing both on
kernels that require the set_fs dance and on newer ones which don't.

Change-Id: I1a53fda379e0bb9acc79331626925bbdba63d727
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: don't allow userspace copy to read kernel memory

This patch fixes a security issue which allows the root user to read
arbitrary kernel memory. Considering the security model used in LTTng
userspace tooling for kernel tracing, this bug also allows members of
the 'tracing' group to read arbitrary kernel memory.

Calls to __copy_from_user_inatomic() were wrongly enclosed in
set_fs(KERNEL_DS), defeating the access_ok() calls and allowing reads
from kernel memory if a kernel address is provided.

Remove all set_fs() calls around __copy_from_user_inatomic().

As a side effect this will allow us to support v5.10 which should remove
set_fs().

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I35e4562c835217352c012ed96a7b8f93e941381e
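
Why raising the address limit defeats the range check can be shown with
a toy userspace model. Nothing here is kernel code: USER_LIMIT,
KERNEL_LIMIT, toy_set_limit and toy_access_ok are hypothetical names
that merely mimic the roles of set_fs() and access_ok() on kernels that
still had an adjustable address limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy address space: everything at or above USER_LIMIT is "kernel". */
#define USER_LIMIT   ((uintptr_t)0x1000)
#define KERNEL_LIMIT (UINTPTR_MAX)

static uintptr_t addr_limit = USER_LIMIT;	/* models the per-task limit */

/* Stand-in for set_fs(): changes what the range check accepts. */
static void toy_set_limit(uintptr_t limit)
{
	addr_limit = limit;
}

/* Stand-in for access_ok(): is [addr, addr + len) below the limit? */
static bool toy_access_ok(uintptr_t addr, size_t len)
{
	assert(len > 0);
	return addr < addr_limit && len <= addr_limit - addr;
}
```

With the default limit, a "kernel" address such as 0x2000 is rejected.
After toy_set_limit(KERNEL_LIMIT) the same address passes the check,
which is the situation the commit removes by dropping the set_fs() calls.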

fix: Add a 1MB limit to lttng_strlen_user_inatomic

The previous implementation was unbounded, which could result in long
loops with preemption turned off.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I85afcd879258735bb2e7502f6016fcb2d3974cf7
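
The idea of capping the scan can be sketched in userspace C. This is not
the lttng_strlen_user_inatomic implementation (which reads user memory
and has its own return convention); bounded_strlen and STRLEN_MAX are
hypothetical names, and only the 1 MB figure comes from the commit title.

```c
#include <assert.h>
#include <stddef.h>

#define STRLEN_MAX (1024 * 1024)	/* 1 MB cap, as in the commit title */

/*
 * Count bytes up to the first NUL, but never scan more than max bytes.
 * Returns max when no terminator was found within the limit.
 */
static size_t bounded_strlen(const char *s, size_t max)
{
	size_t n = 0;

	assert(s != NULL);
	while (n < max && s[n] != '\0')
		n++;
	return n;
}
```

The loop therefore runs at most max iterations regardless of the input,
which bounds the time spent with preemption disabled.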

fix: Adjust ranges for Ubuntu 4.15.0-119 kernel

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie32f70f810c8fc756fbd31ab129aeb35500790f7

fix: Adjust ranges for Ubuntu HWE 5.0 kernels

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I36f2c3485dcc6ccb74ea86a7ce66fcb1662d060b

Fix: system call filter table

The system call filter table has effectively been unused for a long
time due to system call name prefix mismatch. This means the overhead of
selective system call tracing was larger than it should have been because
the event payload preparation would be done for all system calls as soon
as a single system call is traced.

However, fixing this underlying issue unearths several issues that crept
in unnoticed when the "enabler" concept was introduced (after the
original implementation of the system call filter table).

Here is a list of the issues which are resolved here:

- Split lttng_syscalls_unregister into an unregister and destroy
  function, thus waiting for a grace period (and therefore quiescence
  of the users) after unregistering the system call tracepoints before
  freeing the system call filter data structures. This effectively fixes
  a use-after-free.

- The state for enabling "all" system calls vs enabling specific system
  calls (and sequences of enable-disable) was incorrect with respect to
  the "enablers" semantic. This is solved by always tracking the
  bitmap of enabled system calls, and keeping this bitmap even when
  enabling all system calls. The sc_filter is now always allocated
  before system call tracing is registered to tracepoints, which means
  it does not need to be RCU dereferenced anymore.

Padding fields in the ABI are reserved to select whether to:

- Trace either native or compat system call (or both, which is the
  behavior currently implemented),
- Trace either system call entry or exit (or both, which is the
  behavior currently implemented),
- Select the system call to trace by name (behavior currently
  implemented) or by system call number.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
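
The "always track the bitmap" approach described above can be sketched
in userspace C. The structure and helpers below are hypothetical
(NR_SYSCALLS, struct sc_filter layout, filter_*); the point is only that
"enable all" sets every bit rather than bypassing the bitmap, so a later
disable of one system call still works.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SYSCALLS 512	/* hypothetical table size */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS ((NR_SYSCALLS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct sc_filter {
	unsigned long enabled[NR_WORDS];	/* one bit per syscall number */
};

static void filter_enable(struct sc_filter *f, unsigned int nr)
{
	assert(f != NULL && nr < NR_SYSCALLS);
	f->enabled[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

static void filter_disable(struct sc_filter *f, unsigned int nr)
{
	assert(f != NULL && nr < NR_SYSCALLS);
	f->enabled[nr / BITS_PER_WORD] &= ~(1UL << (nr % BITS_PER_WORD));
}

static bool filter_test(const struct sc_filter *f, unsigned int nr)
{
	assert(f != NULL && nr < NR_SYSCALLS);
	return (f->enabled[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/* "Enable all" keeps using the bitmap: it simply sets every bit. */
static void filter_enable_all(struct sc_filter *f)
{
	for (unsigned int nr = 0; nr < NR_SYSCALLS; nr++)
		filter_enable(f, nr);
}
```

Because the bitmap is the single source of truth, a sequence such as
enable-all followed by disabling one system call leaves every other
system call enabled.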

fix: version ranges for ext4_discard_preallocations and writeback_queue_io

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id4fa53cb2e713cbda651e1a75deed91013115592

fix: writeback: Fix sync livelock due to b_dirty_time processing (v5.9)

See upstream commit:

  commit f9cae926f35e8230330f28c7b743ad088611a8de
  Author: Jan Kara <jack@suse.cz>
  Date:   Fri May 29 16:08:58 2020 +0200

    writeback: Fix sync livelock due to b_dirty_time processing

    When we are processing writeback for sync(2), move_expired_inodes()
    didn't set any inode expiry value (older_than_this). This can result in
    writeback never completing if there's a steady stream of inodes added to
    b_dirty_time list as writeback rechecks dirty lists after each writeback
    round whether there's more work to be done. Fix the problem by using
    sync(2) start time as inode expiry value when processing b_dirty_time
    list similarly as for ordinarily dirtied inodes. This requires some
    refactoring of older_than_this handling which simplifies the code
    noticeably as a bonus.

Change-Id: I8b894b13ccc14d9b8983ee4c2810a927c319560b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: writeback: Drop I_DIRTY_TIME_EXPIRE (v5.9)

See upstream commit:

  commit 5fcd57505c002efc5823a7355e21f48dd02d5a51
  Author: Jan Kara <jack@suse.cz>
  Date:   Fri May 29 16:24:43 2020 +0200

    writeback: Drop I_DIRTY_TIME_EXPIRE

    The only use of I_DIRTY_TIME_EXPIRE is to detect in
    __writeback_single_inode() that inode got there because flush worker
    decided it's time to writeback the dirty inode time stamps (either
    because we are syncing or because of age). However we can detect this
    directly in __writeback_single_inode() and there's no need for the
    strange propagation with I_DIRTY_TIME_EXPIRE flag.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I92e37c2ff3ec36d431e8f9de5c8e37c5a2da55ea

fix: removal of [smp_]read_barrier_depends (v5.9)

See upstream commits:

  commit 76ebbe78f7390aee075a7f3768af197ded1bdfbb
  Author: Will Deacon <will@kernel.org>
  Date:   Tue Oct 24 11:22:47 2017 +0100

    locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()

    In preparation for the removal of lockless_dereference(), which is the
    same as READ_ONCE() on all architectures other than Alpha, add an
    implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
    used to head dependency chains on all architectures.

  commit 76ebbe78f7390aee075a7f3768af197ded1bdfbb
  Author: Will Deacon <will.deacon@arm.com>
  Date:   Tue Oct 24 11:22:47 2017 +0100

    locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()

    In preparation for the removal of lockless_dereference(), which is the
    same as READ_ONCE() on all architectures other than Alpha, add an
    implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
    used to head dependency chains on all architectures.

Change-Id: Ife8880bd9378dca2972da8838f40fc35ccdfaaac
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: ext4: indicate via a block bitmap read is prefetched… (v5.9)

See upstream commit:

  commit ab74c7b23f3770935016e3eb3ecdf1e42b73efaa
  Author: Theodore Ts'o <tytso@mit.edu>
  Date:   Wed Jul 15 11:48:55 2020 -0400

    ext4: indicate via a block bitmap read is prefetched via a tracepoint

    Modify the ext4_read_block_bitmap_load tracepoint so that it tells us
    whether a block bitmap is being prefetched.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0e5e2c5b8004223d0928235c092449ee16a940e1

fix: ext4: limit the length of per-inode prealloc list (v5.9)

See upstream commit:

  commit 27bc446e2def38db3244a6eb4bb1d6312936610a
  Author: brookxu <brookxu.cn@gmail.com>
  Date:   Mon Aug 17 15:36:15 2020 +0800

    ext4: limit the length of per-inode prealloc list

    In the scenario of writing sparse files, the per-inode prealloc list may
    be very long, resulting in high overhead for ext4_mb_use_preallocated().
    To circumvent this problem, we limit the maximum length of per-inode
    prealloc list to 512 and allow users to modify it.

    After patching, we observed that the sys ratio of cpu has dropped, and
    the system throughput has increased significantly. We created a process
    to write the sparse file, and the running time of the process on the
    fixed kernel was significantly reduced, as follows:

    Running time on unfixed kernel：
    [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
    real    0m2.051s
    user    0m0.008s
    sys     0m2.026s

    Running time on fixed kernel：
    [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
    real    0m0.471s
    user    0m0.004s
    sys     0m0.395s

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5169cb24853d4da32e2862a6626f1f058689b053

fix: KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only (v5.9)

  commit 985ab2780164698ec6e7d73fad523d50449261dd
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Mon Jun 22 13:20:32 2020 -0700

    KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only

    Make 'struct kvm_mmu_page' MMU-only, nothing outside of the MMU should
    be poking into the gory details of shadow pages.

Change-Id: Ia5c1b9c49c2b00dad1d5b17c50c3dc730dafda20
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: Move mmutrace.h into the mmu/ sub-directory (v5.9)

  commit 33e3042dac6bcc33b80835f7d7b502b1d74c457c
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Mon Jun 22 13:20:29 2020 -0700

    KVM: x86/mmu: Move mmu_audit.c and mmutrace.h into the mmu/ sub-directory

    Move mmu_audit.c and mmutrace.h under mmu/ where they belong.

Change-Id: I582525ccca34e1e3bd62870364108a7d3e9df2e4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Kconfig: fix dependency issue when building in-tree without CONFIG_FTRACE

When building in-tree, one could disable CONFIG_FTRACE in the kernel
config. This leaves CONFIG_TRACEPOINTS selected by the LTTng modules but
leaves out other required code, which generates many linker errors such
as:

trace.c:(.text+0xd86b): undefined reference to `trace_event_buffer_reserve'
ld: trace.c:(.text+0xd8de): undefined reference to `trace_event_buffer_commit'
ld: trace.c:(.text+0xd926): undefined reference to `event_triggers_call'
ld: trace.c:(.text+0xd942): undefined reference to `trace_event_ignore_this_pid'
ld: net/mac80211/trace.o: in function `trace_event_raw_event_drv_tdls_cancel_channel_switch':

It appears to be caused by the fact that TRACE_EVENT macros in the Linux
kernel depend on the Ftrace ring buffer as soon as CONFIG_TRACEPOINTS is
enabled.

Steps to reproduce:

- Get a clone of an upstream stable kernel and use scripts/built-in.sh on it

- Configure a standard x86-64 build, enable built-in LTTNG but disable
CONFIG_FTRACE from Kernel Hacking-->Tracers using menuconfig

- Build will fail at linking stage

Signed-off-by: Beniamin Sandu <beniaminsandu@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Version 2.11.5

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: no __lttng_vmalloc_node_range() prior to v2.6.38

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I9747c89f7b57448f4f1a5c1573ba2e81afe09a08

Fix: Lock metadata cache on session destroy

commit 92143b2c5656 ("Fix: metadata stream leak, missing list removal and locking")
missed taking a lock protecting the metadata stream list iteration on
session destroy. This opens a race window between iteration and item
removal/free which triggers a kernel OOPS.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Fix: metadata stream leak, missing list removal and locking

The metadata stream is part of a list of metadata streams in the
metadata cache. Its addition to the list should be protected by
the metadata cache lock. It needs to be paired with protection
of list iteration with the same lock.

Removal from the list is entirely missing, and should be added
to lttng_metadata_ring_buffer_release (with proper locking).

This missing list removal was probably not causing issues because the
metadata stream structure was leaked: a kfree() is missing from
lttng_metadata_ring_buffer_release as well.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Fix: coherent state not changed atomically with metadata written

commit 122c63cb4310 ("Fix: Implement RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK")
introduces a new ioctl which returns a flag indicating whether the
metadata is in consistent state at the end of the sub-buffer.

That commit is meant to address metadata consistency issues observable
in live sessions.

However, the "consistent" state is false as soon as a producer is
active (between an outermost metadata_begin/end pair). Unfortunately,
if the last "RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK" operation is
done between the last metadata printf and "end" of the transaction, the
last consistency state will be false, and the consumer daemon will never
send metadata to the relay daemon. This in turn causes a live viewer to
wait for metadata endlessly.

This issue can be reproduced by running lttng-tools:
tests/regression/tools/live/test_kernel

as root in a loop.

We observe two things:
1) the poll operation blocks when there is no more metadata to send,
   which means there is no means to unblock when the consistency state
   changes back to "true" without producing additional metadata,

2) Even if (1) were fixed, the expectation from an ABI perspective is
   that the "coherent" state is only populated when
   RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK succeeds. Therefore,
   there is no way to let user-space know about a coherency transition
   unless additional metadata is generated.

Fixing this requires holding the metadata cache lock across the entire
production of a coherent metadata transaction. This simpler scheme is
possible because the metadata is generated in a reallocated memory area
and not directly into a ring buffer anymore. This was not the case in
earlier lttng-modules versions, when the metadata was generated directly
into a ring buffer, which explains why this simpler scheme was not
implemented.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: include module.h for EXPORT_SYMBOL_GPL

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic337e1eb375791ace08560555dd02b37cbefcf25

fix: __lttng_vmalloc_node_range const caller introduced in v3.6

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib13cf03b5ab11830a8732318a12713720cf1b3e3

fix: version range for overflow_callback

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1b8f1d59552a1723d3f4ed74780a2b57d13d0e52

fix: global_dirty_limit was introduced in v3.1

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id97dbb2d0181a45c45cfed36c4be8753cabac283

fix: wrapper_uprobe_unregister is a void function

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib4438da02aac3defd1245324d1b48f400f806d58

fix: prior to v4.0, __vmalloc_node_range had no vm_flags param

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib476e32d109298d9ca3e6b6ab7ac8f63c50fb09f

fix: vmalloc on v5.8 without KALLSYMS

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic945dad92e78a5bc2895a969a10c527e1349decf

Detect missing symbols used with kallsyms_lookup at compile time

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I19a9a31c386196899517899d861fe63611272139

Use exported symbol bdevname() instead of disk_name()

bdevname() is a simple wrapper over disk_name() but has the honor of being
exported. Using it removes the need for a kallsyms wrapper.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic2b2233c4db7826175c68edea69751ddcb17a5e6

Add git-review config

Add .gitreview for contributors wishing to use gerrit for patch
reviews.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I663e66a433ddb645f580c4b9f885db9c3a08e02f

fix: mm: remove vmalloc_sync_(un)mappings() (v5.8)

See upstream commit:

  commit 73f693c3a705756032c2863bfb37570276902d7d
  Author: Joerg Roedel <jroedel@suse.de>
  Date:   Mon Jun 1 21:52:36 2020 -0700

    mm: remove vmalloc_sync_(un)mappings()

    These functions are not needed anymore because the vmalloc and ioremap
    mappings are now synchronized when they are created or torn down.

    Remove all callers and function definitions.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ifdefa35b25b4906cde407360e608b77e47cc3808

fix: backport of block_bio_complete for <= v2.6.38

The stable-2.11 branch still supports kernels before 3.0;
add the proper version checks to a patch that was backported
from master.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I9c5a66d9b68ba132bd9e752af6c069826961869a

fix: mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK (v5.8)

See upstream commit:

  commit 8d92890bd6b8502d6aee4b37430ae6444ade7a8c
  Author: NeilBrown <neilb@suse.de>
  Date:   Mon Jun 1 21:48:21 2020 -0700

    mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead

    After an NFS page has been written it is considered "unstable" until a
    COMMIT request succeeds.  If the COMMIT fails, the page will be
    re-written.

    These "unstable" pages are currently accounted as "reclaimable", either
    in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
    'reclaimable' count.  This might have made sense when sending the COMMIT
    required a separate action by the VFS/MM (e.g.  releasepage() used to
    send a COMMIT).  However now that all writes generated by ->writepages()
    will automatically be followed by a COMMIT (since commit 919e3bd9a875
    ("NFS: Ensure we commit after writeback is complete")) it makes more
    sense to treat them as writeback pages.

    So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in
    NR_WRITEBACK and WB_WRITEBACK.

    A particular effect of this change is that when
    wb_check_background_flush() calls wb_over_bg_threshold(), the latter
    will report 'true' a lot less often as the 'unstable' pages are no
    longer considered 'dirty' (as there is nothing that writeback can do
    about them anyway).

    Currently wb_check_background_flush() will trigger writeback to NFS even
    when there are relatively few dirty pages (if there are lots of unstable
    pages), this can result in small writes going to the server (10s of
    Kilobytes rather than a Megabyte) which hurts throughput.  With this
    patch, there are fewer writes which are each larger on average.

    Where the NR_UNSTABLE_NFS count was included in statistics
    virtual-files, the entry is retained, but the value is hard-coded as
    zero.  static trace points and warning printks which mentioned this
    counter no longer report it.

Change-Id: I18080ca62bc6c1cd7d6da4cb27cc1521fbdca5e1
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: block: remove the error argument to the block_bio_complete (v5.8)

See upstream commit:

  commit d24de76af836260a99ca2ba281a937bd5bc55591
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed Jun 3 07:14:43 2020 +0200

    block: remove the error argument to the block_bio_complete tracepoint

    The status can be trivially derived from the bio itself.  That also avoid
    callers like NVMe to incorrectly pass a blk_status_t instead of the errno,
    and the overhead of translating the blk_status_t to the errno in the I/O
    completion fast path when no tracing is enabled.

Fixes: 35fe0d12c8a3 ("nvme: trace bio completion")
Change-Id: I8d1463184d79bfab418a1755bfc6a0200170fff3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: pipe_buf_operations rework (v5.8)

See upstream commits:

  commit c928f642c29a5ffb02e16f2430b42b876dde69de
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed May 20 17:58:16 2020 +0200

    fs: rename pipe_buf ->steal to ->try_steal

    And replace the arcane return value convention with a simple bool
    where true means success and false means failure.

    [AV: braino fix folded in]

  commit b8d9e7f2411b0744df2ec33e80d7698180fef21a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed May 20 17:58:15 2020 +0200

    fs: make the pipe_buf_operations ->confirm operation optional

    Just return 0 for success if it is not present.

  commit 76887c256744740d6121af9bc4aa787712a1f694
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed May 20 17:58:14 2020 +0200

    fs: make the pipe_buf_operations ->steal operation optional

    Just return 1 for failure if it is not present.
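
The conventions described in the three upstream commits above can be
illustrated with a small user-space sketch. This is not kernel code: every
sketch_ name below is a hypothetical stand-in, and only the optional-callback
behavior and the bool return convention are taken from the commit messages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a pipe buffer. */
struct sketch_pipe_buffer;

/* Both callbacks are optional, mirroring the v5.8 rework. */
struct sketch_pipe_buf_operations {
	int (*confirm)(struct sketch_pipe_buffer *buf);
	bool (*try_steal)(struct sketch_pipe_buffer *buf);
};

struct sketch_pipe_buffer {
	const struct sketch_pipe_buf_operations *ops;
};

/* A missing ->confirm is treated as success (0). */
static int sketch_pipe_buf_confirm(struct sketch_pipe_buffer *buf)
{
	if (!buf->ops->confirm)
		return 0;
	return buf->ops->confirm(buf);
}

/* A missing ->try_steal is treated as failure; true means success. */
static bool sketch_pipe_buf_try_steal(struct sketch_pipe_buffer *buf)
{
	if (!buf->ops->try_steal)
		return false;
	return buf->ops->try_steal(buf);
}
```

With this convention, an operations table that leaves both callbacks unset
confirms successfully and never allows its buffers to be stolen.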

Change-Id: Ic185632202470db1eb5b012e95e793ff2cb26be7
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Version 2.11.4

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Fix: Implement RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK

Get next metadata subbuffer, returning a flag indicating whether the
metadata is guaranteed to be in a consistent state at the end of this
sub-buffer (can be parsed).

This can be used by the consumer to know whether the metadata can be
parsed at the end of this sub-buffer, which is useful to distinguish
between errors and incomplete metadata in live tracing.
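
One possible consumer-side policy built on that flag is sketched below in
user-space C. The enum, the function name and the decision rule are
assumptions made for illustration; the commit only states that the flag
helps tell errors apart from incomplete metadata.

```c
#include <stdbool.h>

/* Hypothetical outcomes for a consumer that tried to parse metadata. */
enum sketch_metadata_action {
	SKETCH_METADATA_PARSED,		/* parsing succeeded */
	SKETCH_METADATA_WAIT_MORE,	/* incomplete: fetch more sub-buffers */
	SKETCH_METADATA_ERROR,		/* coherent yet unparsable: real error */
};

/*
 * coherent: flag returned along with the metadata sub-buffer.
 * parse_ok: whether the metadata accumulated so far could be parsed.
 */
static enum sketch_metadata_action
sketch_metadata_decide(bool coherent, bool parse_ok)
{
	if (parse_ok)
		return SKETCH_METADATA_PARSED;
	/* A parse failure is only an error once the state is coherent. */
	return coherent ? SKETCH_METADATA_ERROR : SKETCH_METADATA_WAIT_MORE;
}
```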

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: vmalloc_sync_mappings was backported to v5.5.12

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie554d9c956afc2a8e114fe41e4b3c225d8af40a1

Update: Additional kernel ranges for vmalloc_sync_mappings

Some Ubuntu kernels cannot be directly mapped to an upstream stable
version. Define distro specific ranges for those (4.15, 5.0, 5.3).

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Update: Use vmalloc_sync_mappings for stable kernels

Starting from v5.4.28/v5.2.37/v4.19.113/v4.14.175/v4.9.218/v4.4.218, stable
kernel branches backported v5.6 upstream commit [1], causing the following
warnings:
...
[ 483.242037] LTTng: vmalloc_sync_all symbol lookup failed.
[ 483.257056] Page fault handler and NMI tracing might trigger faults.
...

Extend check for vmalloc_sync_mappings for stable kernels as well.

[1] https://github.com/torvalds/linux/commit/763802b53a427ed3cbd419dbba255c414fdd9e7c

[ Edit: minor coding style fix by Mathieu Desnoyers. ]
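
The version ranges listed above can be expressed as a range check, sketched
here in user-space C. The SKETCH_ macros and the function name are
hypothetical; the encoding follows the classic KERNEL_VERSION() layout, and
the ranges cover only what this commit message names (v5.6 and later, plus
the listed stable branches), ignoring any later adjustments.

```c
#include <stdbool.h>

/* Same encoding as the classic KERNEL_VERSION() macro. */
#define SKETCH_KERNEL_VERSION(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

/* True when low <= code < high. */
#define SKETCH_KERNEL_RANGE(code, a_lo, b_lo, c_lo, a_hi, b_hi, c_hi)	\
	((code) >= SKETCH_KERNEL_VERSION(a_lo, b_lo, c_lo) &&		\
	 (code) < SKETCH_KERNEL_VERSION(a_hi, b_hi, c_hi))

/* Should vmalloc_sync_mappings be used for this version code? */
static bool sketch_uses_vmalloc_sync_mappings(unsigned int code)
{
	return code >= SKETCH_KERNEL_VERSION(5, 6, 0)
		|| SKETCH_KERNEL_RANGE(code, 5, 4, 28, 5, 5, 0)
		|| SKETCH_KERNEL_RANGE(code, 5, 2, 37, 5, 3, 0)
		|| SKETCH_KERNEL_RANGE(code, 4, 19, 113, 4, 20, 0)
		|| SKETCH_KERNEL_RANGE(code, 4, 14, 175, 4, 15, 0)
		|| SKETCH_KERNEL_RANGE(code, 4, 9, 218, 4, 10, 0)
		|| SKETCH_KERNEL_RANGE(code, 4, 4, 218, 4, 5, 0);
}
```

Each stable range is bounded above by the next minor release, so that a
branch which never received the backport is not matched by accident.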

Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Fix: Use vmalloc_sync_mappings on kernel 5.6 as well

Upstream commit [1], that got rid of vmalloc_sync_all and introduced
vmalloc_sync_mappings, is a v5.6 commit:
$ git tag --contains 763802b53a427ed3cbd419dbba255c414fdd9e7c
v5.6
v5.6-rc7
v5.7-rc1
v5.7-rc2
v5.7-rc3

Extend the LINUX_VERSION_CODE check to v5.6 to fix the following warnings:
...
[ 483.242037] LTTng: vmalloc_sync_all symbol lookup failed.
[ 483.257056] Page fault handler and NMI tracing might trigger faults.
...

[1] https://github.com/torvalds/linux/commit/763802b53a427ed3cbd419dbba255c414fdd9e7c

Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: add missing guid_t type to wrapper

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0de39c24a7925b580fabbdaa12dbe05c43cfcd98

Fix: missing wrapper rename to wrapper_vmalloc_sync_mappings

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Idf7082a980c5a604bfef5c69906678b5083a9bbf

Update for kernel 5.7: use vmalloc_sync_mappings on kernels >= 5.7

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Unbreak LTTng for kernel 5.7

Linux commit 0bd476e6c67190b5eb7b6e105c8db8ff61103281 ("kallsyms:
unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()") breaks
LTTng-modules by removing symbols used by the LTTng-modules out-of-tree
tracer.

I pointed this out when the change was originally considered before the
5.7 merge window. This generated some discussion but it did not lead to
any concrete proposal to fix the issue. [1]

The commit has been merged in the 5.7 merge window. At that point, as
maintainer of LTTng, I immediately raised a flag about this issue,
proposing an alternative approach to solve this: expose the few symbols
needed by LTTng to GPL modules. This was NACKed on the grounds that the
Linux kernel cannot export GPL symbols when there are no in-tree
users. [2]

Steven Rostedt has shown interest in merging LTTng-modules upstream.
LTTng-modules being LGPL, this is very much doable. I have prepared a
tree of LTTng-modules "for upstreaming" and sent it to him privately so
he can review it. Even if in an ideal scenario LTTng-modules is merged
for the following merge window, it leaves LTTng-modules broken on the
5.7 kernel.

In order to ensure that the LTTng-modules kernel tracer continues working
for my end users on kernels 5.7 onwards, as a very last resort, it is
with great reluctance that I created this fix for LTTng modules. It
basically uses kprobes to look up the kallsyms_lookup_name symbol, and
continues using kallsyms_lookup_name as before.

Link: https://lore.kernel.org/r/20200302192811.n6o5645rsib44vco@localhost
Link: https://lore.kernel.org/r/20200409193543.18115-1-mathieu.desnoyers@efficios.com
Link: https://lwn.net/Articles/817988/
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Will Deacon <will@kernel.org>
CC: akpm@linux-foundation.org
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Masami Hiramatsu <mhiramat@kernel.org>
CC: rostedt@goodmis.org
CC: Alexei Starovoitov <ast@kernel.org>

Move lttng wrappers into own module

Currently, we only pull the wrapper symbols into a single sub-module,
either:

lttng-tracer.o:
  - wrapper/random.o
  - wrapper/trace-clock.o
  - wrapper/page_alloc.o

or

lttng-statedump.o:
  - wrapper/irqdesc.o
  - wrapper/fdtable.o

Because lttng-tracer depends on lttng-statedump, we cannot just put all
wrappers into lttng-tracer.o, because it would create a circular
dependency. This will be an issue if we introduce common wrappers which
are used in both lttng-tracer.o and in lttng-statedump.o.

Introduce a new lttng-wrapper.o to contain all wrapper symbols for all
lttng modules.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Introduce lttng_guid_gen wrapper for kernels >= 5.7.0

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Drop uuid.h wrapper

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

instrumentation: update x86 kvm instrumentation for kernel >= 5.7.0

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

instrumentation: update mm_vmscan for kernel >= 5.7.0

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Version 2.11.3

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: uaccess wrapper for CentOS >= 4.18.0-147

Fixes: #1253
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I2a79c1c0e897a6148e60e5599949cd2778d09d50

fix: ext4 instrumentation for CentOS >= 4.18.0-147

Fixes: #1253
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1fd54af16fbb02cd4b3ab7fc7d9232708088f1fd

fix: signal instrumentation for CentOS >= 4.18.0-147

Fixes: #1253
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I280013402df6f14222fbb912cdf64d80af3ab265

fix: kvm instrumentation for CentOS >= 4.18.0-147

Fixes: #1253
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ide20ebf51bec503866ffc96dda3e0b09ebeb14d6

fix: rcu instrumentation for CentOS >= 4.18.0-80

Fixes: #1253
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1735d2caa7215ce94272aaaa98cbbc8f3a10743d

Fix: update kvm instrumentation for Ubuntu 5.3.0-45

This commit introduced in 5.3.0-43 was dropped in 5.3.0-45 and reintroduced
in 5.3.0-46:

  commit 795f8a34f279e17c279bba46da10f15c5dd00264
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Fri Dec 6 15:57:14 2019 -0800

    KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM

BugLink: https://bugs.launchpad.net/bugs/1867051
    [ Upstream commit 736c291c9f36b07f8889c61764c28edce20e715d ]

Fun times!

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia5f1a4ba355f592f09e964038b6334ddb3ad5153

Fix: update kvm instrumentation for Ubuntu 5.3.0-43

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1ce5f9ebba997fcc4cfbae6901eed479e2e1a79e

Fix: update kvm instrumentation for Ubuntu 4.15.0-92

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib367b9a0ce3846f45313906e710a9a6d644e3955

Remove lttng-ftrace integration

The lttng-ftrace integration (LTTNG_KERNEL_FUNCTION instrumentation
type) has been unused for a while now. The "function" probing is actually
done with kprobes and kretprobes (LTTNG_KERNEL_KPROBE and
LTTNG_KERNEL_KRETPROBE).

Remove it so a use of kallsyms_lookup_name() can be removed as well.
Note that in the future we could add back this support by using
register_ftrace_function() which is exported to kernel modules, but
considering that we have not been using this code for a while,
just remove the implementation for now.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Remove dependency on kallsyms for splice_to_pipe (kernel 4.2+)

Upstream commit 2b514574f7e88 "net: af_unix: implement splice for stream
af_unix sockets" exported the "splice_to_pipe" symbol, so use it to
remove a dependency on kallsyms_lookup_name().

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Remove dependency on kallsyms for irq_to_desc (kernel 3.4+)

Upstream commit 3911ff30f5d "genirq: export handle_edge_irq() and
irq_to_desc()" exported the irq_to_desc symbol, so use it to remove a
dependency on kallsyms_lookup_name().

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Remove work-around for signed tracepoint module tainting (kernel 3.15+)

Upstream commit 66cc69e34e86a "Fix: module signature vs tracepoints: add
new TAINT_UNSIGNED_MODULE" fixed an issue where the kernel was
considering unsigned modules as tainting the kernel in the same way as
force-loaded modules, which was causing the tracepoints within those
modules to be hidden.

This fix was merged in kernel 3.15, so there is no use in applying this
work-around starting from that kernel.

This removes a dependency on kallsyms_lookup_name() for the symbol
"tracepoint_module_notify".

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Version 2.11.2

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Fix: rcu: Fix data-race due to atomic_t copy-by-value (5.5.6, 5.4.22)

The following upstream commit has been backported to stable kernels
5.5.6 and 5.4.22:

  commit 6cf539a87a61a4fbc43f625267dbcbcf283872ed
  Author: Marco Elver <elver@google.com>
  Date:   Wed Oct 9 17:57:43 2019 +0200

    rcu: Fix data-race due to atomic_t copy-by-value

    This fixes a data-race where `atomic_t dynticks` is copied by value. The
    copy is performed non-atomically, resulting in a data-race if `dynticks`
    is updated concurrently.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: workqueue: add worker function to workqueue_execute_end tracepoint (v5.6)

See upstream commit :

  commit 1c5da0ec7f20dfb56030fb93f7f52f48e12deb52
  Author: Daniel Jordan <daniel.m.jordan@oracle.com>
  Date:   Mon Jan 13 17:52:39 2020 -0500

    workqueue: add worker function to workqueue_execute_end tracepoint

    It's surprising that workqueue_execute_end includes only the work when
    its counterpart workqueue_execute_start has both the work and the worker
    function.

    You can't set a tracing filter or trigger based on the function, and
    postprocessing scripts interested in specific functions are harder to
    write since they have to remember the work from _start and match it up
    with the same field in _end.

    Add the function name, taking care to use the copy stashed in the
    worker since the work is no longer safe to touch.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: media: v4l2: abstract timeval handling in v4l2_buffer (v5.6)

See upstream commit :

  commit 77cdffcb0bfb87fe3645894335cb8cb94917e6ac
  Author: Arnd Bergmann <arnd@arndb.de>
  Date:   Mon Dec 16 15:15:00 2019 +0100

    media: v4l2: abstract timeval handling in v4l2_buffer

    As a preparation for adding 64-bit time_t support in the uapi,
    change the drivers to no longer care about the format of the
    timestamp field in struct v4l2_buffer.

    The v4l2_timeval_to_ns() function is no longer needed in the
    kernel after this, but there is userspace code relying on
    it to be part of the uapi header.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: rcu: Remove kfree_rcu() special casing and lazy-callback (v5.6)

See upstream commit :

  commit 77a40f97030b27b3fc1640a3ed203870f0817f57
  Author: Joel Fernandes (Google) <joel@joelfernandes.org>
  Date:   Fri Aug 30 12:36:32 2019 -0400

    rcu: Remove kfree_rcu() special casing and lazy-callback handling

    This commit removes kfree_rcu() special-casing and the lazy-callback
    handling from Tree RCU.  It moves some of this special casing to Tiny RCU,
    the removal of which will be the subject of later commits.

    This results in a nice negative delta.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: rcu: Fix data-race due to atomic_t copy-by-value (v5.6)

See upstream commit :

  commit 6cf539a87a61a4fbc43f625267dbcbcf283872ed
  Author: Marco Elver <elver@google.com>
  Date:   Wed Oct 9 17:57:43 2019 +0200

    rcu: Fix data-race due to atomic_t copy-by-value

    This fixes a data-race where `atomic_t dynticks` is copied by value. The
    copy is performed non-atomically, resulting in a data-race if `dynticks`
    is updated concurrently.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: btrfs: make btrfs_ordered_extent naming consistent (v5.6)

See upstream commit :

  commit bffe633e00fb6b904817137fc17a44b42efcd985
  Author: Omar Sandoval <osandov@fb.com>
  Date:   Mon Dec 2 17:34:19 2019 -0800

    btrfs: make btrfs_ordered_extent naming consistent with btrfs_file_extent_item

    ordered->start, ordered->len, and ordered->disk_len correspond to
    fi->disk_bytenr, fi->num_bytes, and fi->disk_num_bytes, respectively.
    It's confusing to translate between the two naming schemes. Since a
    btrfs_ordered_extent is basically a pending btrfs_file_extent_item,
    let's make the former use the naming from the latter.

    Note that I didn't touch the names in tracepoints just in case there are
    scripts depending on the current naming.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit (v5.6)

See upstream commit :

  commit 736c291c9f36b07f8889c61764c28edce20e715d
  Author: Sean Christopherson <sean.j.christopherson@intel.com>
  Date:   Fri Dec 6 15:57:14 2019 -0800

    KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM

    Convert a plethora of parameters and variables in the MMU and page fault
    flows from type gva_t to gpa_t to properly handle TDP on 32-bit KVM.

    Thanks to PSE and PAE paging, 32-bit kernels can access 64-bit physical
    addresses.  When TDP is enabled, the fault address is a guest physical
    address and thus can be a 64-bit value, even when both KVM and its guest
    are using 32-bit virtual addressing, e.g. VMX's VMCS.GUEST_PHYSICAL is a
    64-bit field, not a natural width field.

    Using a gva_t for the fault address means KVM will incorrectly drop the
    upper 32-bits of the GPA.  Ditto for gva_to_gpa() when it is used to
    translate L2 GPAs to L1 GPAs.

    Opportunistically rename variables and parameters to better reflect the
    dual address modes, e.g. use "cr2_or_gpa" for fault addresses and plain
    "addr" instead of "vaddr" when the address may be either a GVA or an L2
    GPA.  Similarly, use "gpa" in the nonpaging_page_fault() flows to avoid
    a confusing "gpa_t gva" declaration; this also sets the stage for a
    future patch to combing nonpaging_page_fault() and tdp_page_fault() with
    minimal churn.

    Sprinkle in a few comments to document flows where an address is known
    to be a GVA and thus can be safely truncated to a 32-bit value.  Add
    WARNs in kvm_handle_page_fault() and FNAME(gva_to_gpa_nested)() to help
    document such cases and detect bugs.
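
The truncation described in the upstream message can be reproduced with a
small user-space sketch. The sketch_ typedefs are hypothetical stand-ins
that model a 32-bit build, where a guest virtual address is 32 bits wide
and a guest physical address is 64 bits wide.

```c
#include <stdint.h>

/* Hypothetical stand-ins modelling a 32-bit build. */
typedef uint32_t sketch_gva_t;	/* natural width on 32-bit */
typedef uint64_t sketch_gpa_t;	/* always 64-bit */

/* Carrying the fault address in a gva_t drops the upper 32 bits. */
static uint64_t sketch_fault_addr_as_gva(uint64_t fault_addr)
{
	sketch_gva_t gva = (sketch_gva_t)fault_addr;

	return gva;
}

/* Carrying it in a gpa_t preserves the full 64-bit value. */
static uint64_t sketch_fault_addr_as_gpa(uint64_t fault_addr)
{
	sketch_gpa_t gpa = fault_addr;

	return gpa;
}
```

For a fault address of 0x123456789, the first helper yields 0x23456789
while the second returns the address unchanged.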

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: proc: decouple proc from VFS with "struct proc_ops" (v5.6)

See upstream commit :

  commit d56c0d45f0e27f814e87a1676b6bdccccbc252e9
  Author: Alexey Dobriyan <adobriyan@gmail.com>
  Date:   Mon Feb 3 17:37:14 2020 -0800

    proc: decouple proc from VFS with "struct proc_ops"

    Currently core /proc code uses "struct file_operations" for custom hooks,
    however, VFS doesn't directly call them.  Every time VFS expands
    file_operations hook set, /proc code bloats for no reason.

    Introduce "struct proc_ops" which contains only those hooks which /proc
    allows to call into (open, release, read, write, ioctl, mmap, poll).  It
    doesn't contain module pointer as well.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

fix: y2038: hide timeval/timespec/itimerval/itimerspec types (v5.6)

See upstream commit:

  commit c766d1472c70d25ad475cf56042af1652e792b23
  Author: Arnd Bergmann <arnd@arndb.de>
  Date:   Thu Feb 20 20:03:57 2020 -0800

    y2038: hide timeval/timespec/itimerval/itimerspec types

    There are no in-kernel users remaining, but there may still be users that
    include linux/time.h instead of sys/time.h from user space, so leave the
    types available to user space while hiding them from kernel space.

    Only the __kernel_old_* versions of these types remain now.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I986a813ad8b1c753ab1fa07f726b0cc481f049cb