Francis Deslauriers [Tue, 12 May 2020 19:52:41 +0000 (15:52 -0400)]
bytecode: generalize `struct lttng_filter_bytecode_node`
Rename `struct lttng_filter_bytecode_node` to `struct
lttng_bytecode_node` so it can be used by capture bytecode as well.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I27162522bb20d0fdce6af551fbd982a791d1067c
Francis Deslauriers [Tue, 12 May 2020 17:04:15 +0000 (13:04 -0400)]
Add msgpack implementation for serializing captures
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7d9cfd4906c5c047cfb4fc9467b293c4e895523d
Francis Deslauriers [Tue, 12 May 2020 16:07:50 +0000 (12:07 -0400)]
bytecode: allow interpreter to return any type
The bytecode interpreter, when used by capture bytecode, needs to return
types other than an integer or dynamic type.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I099749183fbd0622f258f9c38e37fdb167493a0b
Francis Deslauriers [Mon, 11 May 2020 20:24:31 +0000 (16:24 -0400)]
bytecode: propagate `rev_bo` of element
When specializing and executing bytecode.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I86aea5b5733f92c56564c6352bd6dcb85f6a2d30
Francis Deslauriers [Mon, 11 May 2020 20:09:20 +0000 (16:09 -0400)]
bytecode: set register type to `REG_PTR` even if not used
There was no need to set the field when using filter as the next
instruction would assume that the top of stack is a `REG_PTR`.
With the upcoming capture feature, we need to ensure this field is
consistent for extraction.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I9c60416dd452949e584fadd70b15cdc3d402aa46
Francis Deslauriers [Mon, 11 May 2020 19:57:20 +0000 (15:57 -0400)]
Add `lttng_bytecode_interpret_format_output()` for top of stack extraction
This new static function will be used to extract the register on the top of
stack after the execution of the bytecode. This is currently not used by the
filter bytecode but will be used by capture bytecode.
The returned value is saved in a tagged union struct named `struct
lttng_interpreter_output` and can be used by the caller of the interpreter
function.
Typically, this struct will be allocated on the stack to avoid dynamic
allocation inside the tracepoint probes.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1cfd3ab6e84b7e308c48ed7a8a9555a3e338eea7
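A minimal user-space sketch of the tagged-union idea described in the commit above. Every name here (`struct reg`, `struct interpreter_output`, `format_output()`, the `REG_*`/`OUTPUT_*` values) is a hypothetical stand-in chosen for illustration and is not the actual LTTng definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register types; the real interpreter defines its own set. */
enum reg_type { REG_S64, REG_U64, REG_STRING };

struct reg {
	enum reg_type type;
	union {
		int64_t s64;
		uint64_t u64;
		const char *str;
	} u;
};

enum output_type { OUTPUT_S64, OUTPUT_U64, OUTPUT_STRING };

/* Tagged union: `type` tells the caller which member of `u` is valid. */
struct interpreter_output {
	enum output_type type;
	union {
		int64_t s;
		uint64_t u;
		struct {
			const char *str;
			size_t len;
		} str;
	} u;
};

/*
 * Copy the top-of-stack register into caller-provided storage.
 * Returns 0 on success, -1 if the register type is not handled.
 */
static int format_output(const struct reg *top, struct interpreter_output *out)
{
	switch (top->type) {
	case REG_S64:
		out->type = OUTPUT_S64;
		out->u.s = top->u.s64;
		return 0;
	case REG_U64:
		out->type = OUTPUT_U64;
		out->u.u = top->u.u64;
		return 0;
	case REG_STRING:
		out->type = OUTPUT_STRING;
		out->u.str.str = top->u.str;
		out->u.str.len = strlen(top->u.str);
		return 0;
	}
	return -1;
}
```

Because the caller owns the output struct, it can be declared as a local variable, which keeps the call path free of dynamic allocation.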
Francis Deslauriers [Thu, 30 Apr 2020 21:30:45 +0000 (17:30 -0400)]
bytecode: add `REG_U64` interpreter register type
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I53c12a54cbd416617834982bbd2b7cf528d41a76
Mathieu Desnoyers [Wed, 25 Nov 2020 17:58:27 +0000 (12:58 -0500)]
Fix: filter validator: refuse string and star glob input to bitwise operation
The validator refuses input ax=string,bx=unknown, but accepts input
ax=unknown,bx=string. Both inputs should be refused.
The same goes for the star glob input.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
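A minimal sketch of the symmetric check this fix calls for, assuming a simplified set of register types. The enum values and `validate_bitwise_operands()` are hypothetical and only illustrate that both operands must be tested the same way.

```c
/* Hypothetical, simplified register types. */
enum reg_type {
	REG_UNKNOWN,
	REG_S64,
	REG_STRING,
	REG_STAR_GLOB_STRING,
};

/* A string or star glob operand is never valid for a bitwise operation. */
static int is_string_like(enum reg_type t)
{
	return t == REG_STRING || t == REG_STAR_GLOB_STRING;
}

/*
 * Return 0 if the operand types are acceptable, -1 otherwise.
 * Both operands are checked, so (string, unknown) and
 * (unknown, string) are refused alike.
 */
static int validate_bitwise_operands(enum reg_type ax, enum reg_type bx)
{
	if (is_string_like(ax) || is_string_like(bx))
		return -1;
	return 0;
}
```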
Francis Deslauriers [Tue, 5 May 2020 14:21:41 +0000 (10:21 -0400)]
Fix: bytecode: Validate register type for instructions expecting unknown type
The bytecode validator allows unknown type as input for some
instructions which are not specialized. The interpreter therefore needs
to check the register type for their input.
This requires that every instruction in the interpreter sets the
register type of the output it populates (unless it is unchanged).
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3339c36340645937b801f6bf6dbf517d06416a14
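A small sketch of what runtime type checking could look like, assuming a two-register machine with hypothetical names (`struct reg`, `op_load_s64()`, `op_bit_and()`). It is not the interpreter's actual code: the point is only that an instruction that writes a register also records its type, and an instruction whose inputs were not validated statically checks those types before use.

```c
#include <stdint.h>

/* Hypothetical, simplified register model. */
enum reg_type { REG_S64, REG_STRING };

struct reg {
	enum reg_type type;
	int64_t v;
};

/* A load populates the register, so it must also set the register type. */
static void op_load_s64(struct reg *dst, int64_t value)
{
	dst->v = value;
	dst->type = REG_S64;
}

/*
 * Bitwise AND of ax and bx, result in ax.
 * The input types are checked at run time; returns -1 on a mismatch.
 * The output type is unchanged (still REG_S64), so it is not rewritten.
 */
static int op_bit_and(struct reg *ax, const struct reg *bx)
{
	if (ax->type != REG_S64 || bx->type != REG_S64)
		return -1;
	ax->v &= bx->v;
	return 0;
}
```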
Francis Deslauriers [Wed, 1 Apr 2020 21:12:59 +0000 (17:12 -0400)]
Cleanup: Rename filter functions/fields to mention "filter"
This will be cleaner when we introduce the capture bytecode functions
and fields.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I04aca5bfd31f2526b24fe3a4b2e8f2b1c1b482f9
Francis Deslauriers [Thu, 23 Jan 2020 22:47:17 +0000 (17:47 -0500)]
Implement event notifiers for syscalls
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I31e60d0d758b93ada11d208f583d71f05168c014
Francis Deslauriers [Wed, 25 Nov 2020 15:25:54 +0000 (10:25 -0500)]
Fix: syscalls: address of statically allocated element never null
This check is intended to confirm that the table element for that syscall
is indeed populated, but it tested whether the element itself is NULL. This
was never the case because the address of an element of a statically
allocated array cannot be NULL.
Fix this by checking whether the function pointer is NULL instead. A NULL
function pointer means that the element is not populated.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1d769d6609fa4517199f022e1a262c4494c8f63a
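The pitfall fixed above can be shown with a small user-space sketch. The table layout and the names `struct syscall_entry`, `table`, and `entry_is_populated()` are hypothetical; only the pattern (test a member, not the element's address) comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical table entry: `func` stays NULL when the slot is unused. */
struct syscall_entry {
	void (*func)(void);
	const char *name;
};

static void dummy_probe(void)
{
}

/* Statically allocated table; index 1 is deliberately left unpopulated. */
static const struct syscall_entry table[3] = {
	[0] = { dummy_probe, "read" },
	[2] = { dummy_probe, "close" },
};

static int entry_is_populated(size_t id)
{
	const struct syscall_entry *entry = &table[id];

	/*
	 * `entry` points into a static array, so it can never be NULL.
	 * The meaningful test is on the function pointer it contains.
	 */
	return entry->func != NULL;
}
```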
Francis Deslauriers [Wed, 25 Nov 2020 02:18:46 +0000 (21:18 -0500)]
Rename LTTNG_SYSCALL_MATCH_ -> LTTNG_KERNEL_SYSCALL_MATCH_
This is done to keep the same naming scheme used for all ABI enums.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I8e1010ab21a47b7f1e519df498acd230315cdc26
Francis Deslauriers [Tue, 24 Nov 2020 16:08:14 +0000 (11:08 -0500)]
Allow LTTNG_KERNEL_SYSCALL_{ENTRY, EXIT}
Signed-off-by: Francis Deslauriers <fdeslaur@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1ea097797da5db474f2f33d779f66254b7979c46
Francis Deslauriers [Thu, 19 Nov 2020 22:00:19 +0000 (17:00 -0500)]
syscalls: extract `lttng_syscall_filter_enable()` for reuse
The syscall event notifiers will reuse the concept of syscall filtering
to avoid needlessly preparing arguments for disabled syscalls.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I64395a031e526e8485e10b4b72f653058c8d0a38
Francis Deslauriers [Thu, 19 Nov 2020 20:01:56 +0000 (15:01 -0500)]
Cleanup: syscall: remove unused `syscall_name` field
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5d37b253348ac4812602d89b6be39a7abd1be4ff
Michael Jeanson [Tue, 24 Nov 2020 16:27:18 +0000 (11:27 -0500)]
fix: adjust version range for trace_find_free_extent()
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iaa6088092cf58b4d29d55f3ff9586c57ae272302
Michael Jeanson [Mon, 23 Nov 2020 17:15:43 +0000 (12:15 -0500)]
Improve the release script
* Use git-archive. This removes all custom code to clean up the repo; the
script can now be used in an unclean repo, as the code is exported from
a specific tag.
* Add parameters. This allows using the script on any machine
while keeping the default behavior for the maintainer.
Change-Id: I9f29d0e1afdbf475d0bbaeb9946ca3216f725e86
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 23 Nov 2020 15:49:57 +0000 (10:49 -0500)]
Add release maintainer script
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 20 Nov 2020 16:42:30 +0000 (11:42 -0500)]
fix: include order for older kernels
Fixes a build failure on v3.0 and v3.1.
Change-Id: Ic48512d2aa5ee46678e67d147b92dba6d0959615
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 26 Oct 2020 21:09:05 +0000 (17:09 -0400)]
fix: tracepoint: Optimize using static_call() (v5.10)
See upstream commit :
commit d25e37d89dd2f41d7acae0429039d2f0ae8b4a07
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date: Tue Aug 18 15:57:52 2020 +0200
tracepoint: Optimize using static_call()
Currently the tracepoint site will iterate a vector and issue indirect
calls to however many handlers are registered (ie. the vector is
long).
Using static_call() it is possible to optimize this for the common
case of only having a single handler registered. In this case the
static_call() can directly call this handler. Otherwise, if the vector
is longer than 1, call a function that iterates the whole vector like
the current code.
Change-Id: I739dd84d62cc1a821b8bd8acff74fa29aa25d22f
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 26 Oct 2020 21:07:13 +0000 (17:07 -0400)]
fix: KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed (v5.10)
See upstream commit :
commit c4371c2a682e0da1ed2cd7e3c5496f055d873554
Author: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Wed Sep 23 15:04:24 2020 -0700
KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed
Introduce RET_PF_FIXED and RET_PF_SPURIOUS to provide unique return
values instead of overloading RET_PF_RETRY. In the short term, the
unique values add clarity to the code and RET_PF_SPURIOUS will be used
by set_spte() to avoid unnecessary work for spurious faults.
In the long term, TDX will use RET_PF_FIXED to deterministically map
memory during pre-boot. The page fault flow may bail early for benign
reasons, e.g. if the mmu_notifier fires for an unrelated address. With
only RET_PF_RETRY, it's impossible for the caller to distinguish between
"cool, page is mapped" and "darn, need to try again", and thus cannot
handle benign cases like the mmu_notifier retry.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie0855c78852b45f588e131fe2463e15aae1bc023
Michael Jeanson [Mon, 26 Oct 2020 18:28:35 +0000 (14:28 -0400)]
fix: kvm: x86/mmu: Add TDP MMU PF handler (v5.10)
See upstream commit :
commit bb18842e21111a979e2e0e1c5d85c09646f18d51
Author: Ben Gardon <bgardon@google.com>
Date: Wed Oct 14 11:26:50 2020 -0700
kvm: x86/mmu: Add TDP MMU PF handler
Add functions to handle page faults in the TDP MMU. These page faults
are currently handled in much the same way as the x86 shadow paging
based MMU, however the ordering of some operations is slightly
different. Future patches will add eager NX splitting, a fast page fault
handler, and parallel page faults.
Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This series introduced no new failures.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie56959cb6c77913d2f1188b0ca15da9114623a4e
Michael Jeanson [Mon, 26 Oct 2020 18:11:17 +0000 (14:11 -0400)]
fix: KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint (v5.10)
See upstream commit :
commit 235ba74f008d2e0936b29f77f68d4e2f73ffd24a
Author: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Wed Sep 23 13:13:46 2020 -0700
KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint
Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in
terms of what information is captured. On SVM, add interrupt info and
error code, while on VMX it adds IDT vectoring and error code. This
sets the stage for macrofying the kvm_exit tracepoint definition so that
it can be reused for kvm_nested_vmexit without loss of information.
Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter
failed, as the field is guaranteed to be invalid. Note, it'd be
possible to further filter the interrupt/exception fields based on the
VM-Exit reason, but the helper is intended only for tracepoints, i.e.
an extra VMREAD or two is a non-issue, the failed VM-Enter case is just
low hanging fruit.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I638fa29ef7d8bb432de42a33f9ae4db43259b915
Michael Jeanson [Mon, 26 Oct 2020 21:03:23 +0000 (17:03 -0400)]
fix: ext4: fast commit recovery path (v5.10)
See upstream commit :
commit 8016e29f4362e285f0f7e38fadc61a5b7bdfdfa2
Author: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Date: Thu Oct 15 13:37:59 2020 -0700
ext4: fast commit recovery path
This patch adds fast commit recovery path support for Ext4 file
system. We add several helper functions that are similar in spirit to
e2fsprogs journal recovery path handlers. Example of such functions
include - a simple block allocator, idempotent block bitmap update
function etc. Using these routines and the fast commit log in the fast
commit area, the recovery path (ext4_fc_replay()) performs fast commit
log recovery.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia65cf44e108f2df0b458f0d335f33a8f18f50baa
Michael Jeanson [Tue, 27 Oct 2020 16:10:05 +0000 (12:10 -0400)]
fix: btrfs: make ordered extent tracepoint take btrfs_inode (v5.10)
See upstream commit :
commit acbf1dd0fcbd10c67826a19958f55a053b32f532
Author: Nikolay Borisov <nborisov@suse.com>
Date: Mon Aug 31 14:42:40 2020 +0300
btrfs: make ordered extent tracepoint take btrfs_inode
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I096d0801ffe0ad826cfe414cdd1c0857cbd2b624
Michael Jeanson [Tue, 27 Oct 2020 15:42:23 +0000 (11:42 -0400)]
fix: btrfs: tracepoints: output proper root owner for trace_find_free_extent() (v5.10)
See upstream commit :
commit 437490fed3b0c9ae21af8f70e0f338d34560842b
Author: Qu Wenruo <wqu@suse.com>
Date: Tue Jul 28 09:42:49 2020 +0800
btrfs: tracepoints: output proper root owner for trace_find_free_extent()
The current trace event always output result like this:
find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)
find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)
It's saying we're allocating data extent for EXTENT tree, which is not
even possible.
It's because we always use EXTENT tree as the owner for
trace_find_free_extent() without using the @root from
btrfs_reserve_extent().
This patch will change the parameter to use proper @root for
trace_find_free_extent():
Now it looks much better:
find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=1(DATA)
find_free_extent: root=5(FS_TREE) len=4096 empty_size=0 flags=1(DATA)
find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
find_free_extent: root=7(CSUM_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
find_free_extent: root=1(ROOT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1d674064d29b31417e2acffdeb735f5052a87032
Michael Jeanson [Mon, 26 Oct 2020 17:41:02 +0000 (13:41 -0400)]
fix: objtool: Rename frame.h -> objtool.h (v5.10)
See upstream commit :
commit 00089c048eb4a8250325efb32a2724fd0da68cce
Author: Julien Thierry <jthierry@redhat.com>
Date: Fri Sep 4 16:30:25 2020 +0100
objtool: Rename frame.h -> objtool.h
Header frame.h is getting more code annotations to help objtool analyze
object files.
Rename the file to objtool.h.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic2283161bebcbf1e33b72805eb4d2628f4ae3e89
Mathieu Desnoyers [Thu, 19 Nov 2020 16:41:11 +0000 (11:41 -0500)]
Revert "Implement event notifiers for syscalls"
This reverts commit 8ced8896fe832af52b749d429b8eceb872a83d1b.
This commit was not ready and was committed by error.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 19 Nov 2020 16:03:17 +0000 (11:03 -0500)]
Fix: resource leak in id tracker
Memory leak found by Coverity:
CID 1412251 (#2 of 2): Resource leak (RESOURCE_LEAK)
21. leaked_storage: Variable head going out of scope leaks the storage it points to.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Thu, 23 Jan 2020 22:47:17 +0000 (17:47 -0500)]
Implement event notifiers for syscalls
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Ic8f17feb45aef6e933252908c761d3241123cfe4
Francis Deslauriers [Thu, 23 Jan 2020 23:31:06 +0000 (18:31 -0500)]
lttng-syscalls.c: extract function calling actual probe
This function will be reused by the event notifier infrastructure.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Iad25a44202d74eac8f75af108eb8297d82303d63
Francis Deslauriers [Wed, 5 Feb 2020 17:44:22 +0000 (12:44 -0500)]
Namespace syscall code relating to events
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia1c3cf01d82681dfc77c2786ab58259085d349c8
Francis Deslauriers [Tue, 21 Jan 2020 23:20:53 +0000 (18:20 -0500)]
Implement event notifiers for uprobes
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I28ec581f412c1633d0cdb675020ad2c642c6c768
Francis Deslauriers [Tue, 21 Jan 2020 23:45:11 +0000 (18:45 -0500)]
Namespace uprobe functions relating to events
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4130b5e9fe88ee5121646a3303b262266db658be
Francis Deslauriers [Tue, 21 Jan 2020 20:57:16 +0000 (15:57 -0500)]
doc: event notifier on kretprobe is not supported
The kretprobe behavior is to fire twice (at entry and exit of the target
function), so placing an event notifier on such a function does not make
sense at first glance. If we come up with a use case, it will be quite
easy to enable.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I6f214501706d1ef170c81b80a1f82c039d687502
Francis Deslauriers [Tue, 14 Jan 2020 20:12:28 +0000 (15:12 -0500)]
Implement event notifiers for kprobes
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I69075c51f9df4ae89457967d96863dcf370d4570
Francis Deslauriers [Tue, 14 Jan 2020 20:34:54 +0000 (15:34 -0500)]
Namespace kprobe functions relating to events
The event notifier support for kprobe will soon be introduced and kprobe
code will be reused.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia44aea762c158e922c1fafed381fca6919bea188
Francis Deslauriers [Wed, 5 Feb 2020 19:47:19 +0000 (14:47 -0500)]
Implement event notifiers for tracepoints
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I256cc2c54179255402c5b7bc7d439508f0a6adbf
Francis Deslauriers [Wed, 5 Feb 2020 19:46:08 +0000 (14:46 -0500)]
Implement event notifier probes
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I807b344edb65ad8d1343187b987f693113029794
Mathieu Desnoyers [Tue, 4 Feb 2020 20:46:41 +0000 (15:46 -0500)]
Fix: event notifier: adapt read iterator state to poll expectations
When finishing reading a subbuffer, ensure that the state of the
iterator is moved forward so that "put_subbuf" is performed before
returning to the user. This way, poll() will not return POLLIN when there
is actually no data available to read.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia94b5dcb0c72d8548325b1004f214044f50fd191
Mathieu Desnoyers [Mon, 3 Feb 2020 20:52:02 +0000 (15:52 -0500)]
Fix: event-notifier: do not flush packet if it only contains subbuf header
This poll behavior returns POLLIN in situations where there is
actually no event to read, which causes read to block when it
should not.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I59e56cd4da9907b6f9ccdc14c6037f0f72e4505e
Mathieu Desnoyers [Mon, 3 Feb 2020 19:19:13 +0000 (14:19 -0500)]
Implement lttng_event_notifier_group_notif_fops read, poll, open, release ABI
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia95c67306226202cfd10f3745ddeecb76b1ef1a7
Francis Deslauriers [Wed, 5 Feb 2020 19:44:02 +0000 (14:44 -0500)]
Implement event notifier send notification
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Change-Id: Ibdab5a25439da7d4c26e480c41cdd655a7d58d82
Francis Deslauriers [Wed, 5 Feb 2020 19:34:36 +0000 (14:34 -0500)]
Add event notifier and event notifier enabler
Idea
====
The purpose of the event notifiers is to allow the session daemon to
react to events in the tracer. For example, the user will be able to
start or stop tracing on a session when a specific tracepoint is fired.
An event notifier is really similar to a regular event. The main
difference is that when the tracepoint is fired the action of the event
notifier is to notify the session daemon about it. This mechanism will
use a special purpose ring buffer to expose these notifications to
userspace.
Unlike regular events, there is no claim on the timeliness of such
notifications.
Implementation
==============
This commit adds structures and functions related to event notifiers
mimicking what we currently do with regular events.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I962e6c7051693d6e4a79f89758f8bf1ebda6c148
Francis Deslauriers [Wed, 5 Feb 2020 19:13:42 +0000 (14:13 -0500)]
Implement event notifier group create
Event notifier groups will contain the event notifiers.
All event notifiers of a group will share the same ring buffer to save
the tracer notifications.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I7c38fbfd26517d00c8de38ca1981da623d570529
Francis Deslauriers [Fri, 13 Nov 2020 17:15:52 +0000 (12:15 -0500)]
Add token to `struct lttng_kernel_event`
This token is provided by the user when registering an event rule to the
kernel. It is going to be used to identify messages from event notifiers, and
counter bucket owners in the upcoming event notifier and map counter
features.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I58cd6b6e33b97cc21e50cc0b36bef4b9e4224423
Francis Deslauriers [Wed, 5 Feb 2020 18:51:51 +0000 (13:51 -0500)]
lttng-events: move lttng_transport_find earlier in source file
Will be used by upcoming event notifier code which is earlier in source
file.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Idb7be8c682258195166f3b513fee2aa98656de35
Mathieu Desnoyers [Mon, 3 Feb 2020 19:49:16 +0000 (14:49 -0500)]
lib ring buffer: move subbuffer_consume_record into LTTNG_RING_BUFFER_COUNT_EVENTS ifdef
When event accounting is disabled, counting of event records consumed by
the iterator should be disabled as well, otherwise it triggers
CHAN_WARN_ON() because the accounting of events produced is not
performed.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id8b9e657ee420886b409be1f05ef08a0807fefdc
Mathieu Desnoyers [Tue, 4 Feb 2020 20:44:55 +0000 (15:44 -0500)]
lib ring buffer iterator: introduce lib_ring_buffer_put_current_record
Ensure that the current subbuffer is put after client code has read the
payload of the current record.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id2173ea67213f7ef8e7395b49c5aa8fff0aefffc
Mathieu Desnoyers [Mon, 3 Feb 2020 19:06:26 +0000 (14:06 -0500)]
Introduce event notifier lib ring buffer client
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I89da147ee956f5759c49bd992bf33fe760d79591
Mathieu Desnoyers [Mon, 3 Feb 2020 19:17:53 +0000 (14:17 -0500)]
lttng_abi_create_stream_fd: expect fd name as parameter
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic9711863b58307d3ed6cc782efc78e4f59345950
Mathieu Desnoyers [Mon, 3 Feb 2020 19:09:12 +0000 (14:09 -0500)]
LTTng ring buffer clients: expect void pointer as private data to create channel
Triggers will create a channel without using the lttng_channel objects,
so allow any type of private data.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0725616c84e401c9fcbf00a405a2e2d0f1078979
Mathieu Desnoyers [Thu, 23 Jan 2020 21:02:27 +0000 (16:02 -0500)]
lib ring buffer: use irq_work for wakeup by writer
Using irq_work (like perf does) allows using an interrupt handler
firing soon after the instrumentation execution to issue the wakeups.
This allows the RING_BUFFER_WAKEUP_BY_WRITER ring buffer configuration
to be entirely lock-free, which allows using it in NMI context for
general tracing purposes.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I842ff15736f53d1283cf953804d803f70779652b
Francis Deslauriers [Fri, 17 Jan 2020 23:17:02 +0000 (18:17 -0500)]
Rename `lttng_event_{get,put}()` to `lttng_event_desc_{get,put}()`
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I99a8b4cdf191555c28da5a38a1e65661421fd7fc
Francis Deslauriers [Wed, 18 Dec 2019 22:10:32 +0000 (17:10 -0500)]
Cleanup: extract function to borrow hashlist bucket
This is going to be reused by the trigger system.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie6d032374c3991d0a75ad4737e7f082fbc1a74b1
Francis Deslauriers [Tue, 7 Jan 2020 16:00:55 +0000 (11:00 -0500)]
Decouple `struct lttng_event` from filter code
The filter infrastructure will be used by event notifiers and decoupling
this will allow for massive code reuse.
Of all `struct lttng_event`'s fields, filter code needs:
1. The `const struct lttng_event_desc *desc` field,
2. The `struct cds_list_head bytecode_runtime_head` list.
These fields are used to do the tracepoint field relocation
(`apply_field_reloc()` and `specialize_event_payload_lookup()`).
Considering that only these two fields are needed, we can pass them
directly to these functions.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If569b7d315700660aa84241d112668f2451b715a
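A rough sketch of the decoupling described above, using simplified, hypothetical types (`struct event_desc`, `struct list_head`, `struct runtime_node`, `link_bytecode()`). It only illustrates why passing the descriptor and the runtime list directly lets two unrelated owner structs share one helper; it does not reproduce the real relocation code.

```c
/* Hypothetical minimal descriptor and intrusive list types. */
struct event_desc {
	const char *name;
	unsigned int nr_fields;
};

struct list_head {
	struct list_head *next, *prev;
};

/* Two different owners that both embed the same two fields. */
struct event {
	const struct event_desc *desc;
	struct list_head bytecode_runtime_head;
	int enabled;
};

struct event_notifier {
	const struct event_desc *desc;
	struct list_head bytecode_runtime_head;
	unsigned long long token;
};

struct runtime_node {
	struct list_head node;
	unsigned int field_idx;	/* index of the field the bytecode refers to */
};

/*
 * Takes only what it needs (descriptor + list head) instead of a
 * whole `struct event`, so any owner can call it.
 * Returns -1 if the referenced field does not exist in the descriptor.
 */
static int link_bytecode(const struct event_desc *desc,
		struct list_head *head, struct runtime_node *rt)
{
	if (rt->field_idx >= desc->nr_fields)
		return -1;
	/* Append at the tail of the circular list. */
	rt->node.prev = head->prev;
	rt->node.next = head;
	head->prev->next = &rt->node;
	head->prev = &rt->node;
	return 0;
}
```

With this shape, an event calls `link_bytecode(ev->desc, &ev->bytecode_runtime_head, rt)` and an event notifier makes the same call with its own two fields.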
Francis Deslauriers [Wed, 18 Dec 2019 22:00:37 +0000 (17:00 -0500)]
Rename `lttng_create_*_if_missing()` in anticipation of event notifiers
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I58a799f992a53215ff04896b783e7ebe31965b7c
Francis Deslauriers [Thu, 5 Dec 2019 20:29:26 +0000 (15:29 -0500)]
Extract event enabler fields to specialized struct
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I356d9b91c6e20c288ca931a4d449a54b67f3937c
Francis Deslauriers [Wed, 18 Dec 2019 21:40:49 +0000 (16:40 -0500)]
Docs: explain why unused `lttng_enabler::ctx` is kept around
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: If2c6b9203ea324bb1ff4051b0a705e7303dbf3a6
Francis Deslauriers [Thu, 5 Dec 2019 19:37:57 +0000 (14:37 -0500)]
Rename `enum lttng_enabler_type` to `_format_type`
This will avoid confusion between the different types of enablers
(event notifier enablers and event enablers).
- Enabler format types describe the way the event name matching is done.
- Enabler types will describe the type of enablers (event
notifier vs event)
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic71d05159c5f244d0b1ad74f9c0ee6247fcdfbbb
Jonathan Rajotte [Thu, 21 May 2020 13:45:25 +0000 (09:45 -0400)]
Test: add signed value and enum for testings of event notifier capture
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4be725e3ed1e2f94420f4cdcf5ab6ac7962e2464
Francis Deslauriers [Wed, 30 Sep 2020 18:27:26 +0000 (14:27 -0400)]
Cleanup: remove usage of enum in ABI structures
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3730f7c0341028b25231c368166ee6e5fd74fa5d
Mathieu Desnoyers [Wed, 21 Oct 2020 16:24:40 +0000 (12:24 -0400)]
Fix: type mismatch in clone instrumentation
The data and metadata types should all agree to use "unsigned long",
else it triggers babeltrace trace parsing errors.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Geneviève Bastien [Wed, 1 Apr 2020 18:31:49 +0000 (14:31 -0400)]
syscalls: Make clone()'s `flags` field a 2 enum struct.
The clone system call has a flags field, whose values are defined in
uapi/linux/sched.h file. This field is now a struct made of 2
enumerations to make the values more readable/meaningful.
The `flags` field has two parts:
1. exit signal: the least significant byte of the `unsigned long` is
the signal the kernel needs to send to the parent process on child
exit,
2. clone options: the remaining bytes of the `unsigned long` are used as
bitwise flags for the clone options.
Those 2-in-1 fields should be printed using two different CTF fields.
Here's an example babeltrace output of the clone system call:
syscall_entry_clone: { cpu_id = 2 }, { flags = { exit_signal = ( "SIGCHLD" : container = 0x11 ), options = ( "CLONE_CHILD_CLEARTID" | "CLONE_CHILD_SETTID" : container = 0x12000 ) }
Change-Id: Ic375b59fb3b6564f036e1af24d66c0c7069b47d6
Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
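The two-part layout of `flags` described above can be sketched as a plain mask-and-split. `CLONE_EXIT_SIGNAL_MASK`, `struct clone_flags_parts`, and `split_clone_flags()` are hypothetical names used for illustration; the mask value 0xff mirrors `CSIGNAL` from uapi/linux/sched.h.

```c
/* Low byte of the clone flags carries the exit signal (CSIGNAL upstream). */
#define CLONE_EXIT_SIGNAL_MASK	0x000000ffUL

/* Hypothetical container for the two logical parts of `flags`. */
struct clone_flags_parts {
	unsigned long exit_signal;	/* signal sent to the parent on child exit */
	unsigned long options;		/* CLONE_* option bits */
};

static struct clone_flags_parts split_clone_flags(unsigned long flags)
{
	struct clone_flags_parts parts;

	parts.exit_signal = flags & CLONE_EXIT_SIGNAL_MASK;
	parts.options = flags & ~CLONE_EXIT_SIGNAL_MASK;
	return parts;
}
```

Each part can then be mapped to its own enumeration: the exit signal as a single value, the options as a set of bit flags.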
Michael Jeanson [Mon, 5 Oct 2020 19:31:42 +0000 (15:31 -0400)]
fix: strncpy equals destination size warning
Some versions of GCC when called with -Wstringop-truncation will warn
when doing a copy of the same size as the destination buffer with
strncpy:
‘strncpy’ specified bound 256 equals destination size [-Werror=stringop-truncation]
Since we unconditionally write '\0' in the last byte, reduce the copy
size by one.
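A minimal sketch of the pattern, assuming a 256-byte destination as in the warning above. The `demo_copy_name` helper and `DEMO_NAME_LEN` are hypothetical names used only for illustration.

```c
#include <string.h>

#define DEMO_NAME_LEN	256	/* hypothetical destination size */

/*
 * Copy at most DEMO_NAME_LEN - 1 bytes, then always terminate.
 * The bound no longer equals the destination size, and the result
 * is NUL-terminated even when src is too long.
 */
static void demo_copy_name(char dst[DEMO_NAME_LEN], const char *src)
{
	strncpy(dst, src, DEMO_NAME_LEN - 1);
	dst[DEMO_NAME_LEN - 1] = '\0';
}
```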
Change-Id: Idb907c9550817a06fc0dffc489740f63d440e7d4
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Michael Jeanson [Tue, 6 Oct 2020 14:29:33 +0000 (10:29 -0400)]
Set version to 2.13-pre
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I4124a7d9d9c2f7a36816b7e498ffd37ae27da604
Mathieu Desnoyers [Mon, 5 Oct 2020 16:01:37 +0000 (12:01 -0400)]
Cleanup: lttng-syscalls: silence warning about uninitialized bitmap variable
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 2 Oct 2020 17:03:34 +0000 (13:03 -0400)]
Add 'kernel_read' wrapper for kernels < v4.14
See upstream commit:
commit bdd1d2d3d251c65b74ac4493e08db18971c09240
Author: Christoph Hellwig <hch@lst.de>
Date: Fri Sep 1 17:39:13 2017 +0200
fs: fix kernel_read prototype
Use proper ssize_t and size_t types for the return value and count
argument, move the offset last and make it an in/out argument like
all other read/write helpers, and make the buf argument a void pointer
to get rid of lots of casts in the callers.
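The shape of such a compatibility wrapper can be sketched in user space. Everything below is a hypothetical stand-in: `struct demo_file`, `demo_old_read()` and `demo_read()` are not kernel or lttng-modules functions; they only mimic the old calling convention (offset by value, `char *` buffer, `int` return) and the new one (void buffer, `size_t` count, in/out position last).

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical in-memory stand-in for a file. */
struct demo_file {
	const char *data;
	size_t size;
};

/* Old-style helper: offset passed by value, char buffer, int return. */
static int demo_old_read(const struct demo_file *f, long long offset,
		char *addr, unsigned long count)
{
	if (offset < 0 || (size_t) offset >= f->size)
		return 0;
	if (count > f->size - (size_t) offset)
		count = f->size - (size_t) offset;
	memcpy(addr, f->data + offset, count);
	return (int) count;
}

/* New-style wrapper: the position is an in/out argument, updated on success. */
static ssize_t demo_read(const struct demo_file *f, void *buf, size_t count,
		long long *pos)
{
	int ret = demo_old_read(f, *pos, (char *) buf, count);

	if (ret > 0)
		*pos += ret;
	return ret;
}
```

Callers written against the new-style signature can then use the same code on both sides of the prototype change, with only the wrapper body differing.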
Change-Id: I825c3fcbcc17e9b46e2a661fadc66b52a94eb2da
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 24 Sep 2020 19:38:35 +0000 (15:38 -0400)]
fix: Use 'kernel_read' to read from procfs
Use the 'kernel_read' helper to read files in procfs. It has been present
in the kernel since the 2.6 series and does the right thing both on kernels
that require the set_fs dance and on newer ones which don't.
Change-Id: I1a53fda379e0bb9acc79331626925bbdba63d727
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 25 Sep 2020 20:05:00 +0000 (16:05 -0400)]
fix: don't allow userspace copy to read kernel memory
This patch fixes a security issue which allows the root user to read
arbitrary kernel memory. Considering the security model used in LTTng
userspace tooling for kernel tracing, this bug also allows members of
the 'tracing' group to read arbitrary kernel memory.
Calls to __copy_from_user_inatomic() were wrongly enclosed in
set_fs(KERNEL_DS), defeating the access_ok() calls and allowing reads
from kernel memory if a kernel address is provided.
Remove all set_fs() calls around __copy_from_user_inatomic().
As a side effect this will allow us to support v5.10 which should remove
set_fs().
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I35e4562c835217352c012ed96a7b8f93e941381e
Michael Jeanson [Fri, 25 Sep 2020 15:23:58 +0000 (11:23 -0400)]
fix: Add a 1MB limit to lttng_strlen_user_inatomic
The previous implementation was unbounded which could result in long
loops with preemption turned off.
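The idea of bounding the scan can be sketched in plain C. This is not the lttng-modules implementation: `demo_bounded_strlen()` is a hypothetical helper working on ordinary memory, and its return convention (bytes before the terminator, or the limit if none is found) is an assumption made for the example.

```c
#include <stddef.h>

#define DEMO_STRLEN_MAX	(1UL << 20)	/* 1 MiB upper bound on the scan */

/*
 * Return the number of bytes before the terminating NUL, or max if
 * no NUL is found within the first max bytes. The loop can therefore
 * never run for more than max iterations.
 */
static size_t demo_bounded_strlen(const char *s, size_t max)
{
	size_t len = 0;

	while (len < max && s[len] != '\0')
		len++;
	return len;
}
```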
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I85afcd879258735bb2e7502f6016fcb2d3974cf7
Michael Jeanson [Wed, 23 Sep 2020 18:42:18 +0000 (14:42 -0400)]
fix: Adjust ranges for Ubuntu 4.15.0-119 kernel
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ie32f70f810c8fc756fbd31ab129aeb35500790f7
Michael Jeanson [Wed, 16 Sep 2020 19:16:17 +0000 (15:16 -0400)]
fix: Adjust ranges for Ubuntu HWE 5.0 kernels
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I36f2c3485dcc6ccb74ea86a7ce66fcb1662d060b
Mathieu Desnoyers [Tue, 28 Jan 2020 21:02:44 +0000 (16:02 -0500)]
Fix: system call filter table
The system call filter table has effectively been unused for a long
time due to system call name prefix mismatch. This means the overhead of
selective system call tracing was larger than it should have been because
the event payload preparation would be done for all system calls as soon
as a single system call is traced.
However, fixing this underlying issue unearths several issues that crept
in unnoticed when the "enabler" concept was introduced (after the original
implementation of the system call filter table).
Here is a list of the issues which are resolved here:
- Split lttng_syscalls_unregister into an unregister and destroy
function, thus waiting for a grace period (and therefore quiescence
of the users) after unregistering the system call tracepoints before
freeing the system call filter data structures. This effectively fixes
a use-after-free.
- The state for enabling "all" system calls vs enabling specific system
calls (and sequences of enable-disable) was incorrect with respect to
the "enablers" semantic. This is solved by always tracking the
bitmap of enabled system calls, and keeping this bitmap even when
enabling all system calls. The sc_filter is now always allocated
before system call tracing is registered to tracepoints, which means
it does not need to be RCU dereferenced anymore.
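One way to read the second item is sketched below. This is an interpretation, not the actual lttng-modules data structure: `struct demo_sc_filter`, its fields and the helper names are hypothetical. The point illustrated is that the per-syscall bitmap is kept up to date independently of the "all" state, so disabling "all" falls back to the individually enabled system calls.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NR_SYSCALLS	512	/* hypothetical table size */
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITMAP_LONGS \
	((DEMO_NR_SYSCALLS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_sc_filter {
	bool all_enabled;				/* "enable all" state */
	unsigned long enabled[DEMO_BITMAP_LONGS];	/* always tracked */
};

static void demo_enable(struct demo_sc_filter *f, unsigned int nr)
{
	if (nr < DEMO_NR_SYSCALLS)
		f->enabled[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

static void demo_disable(struct demo_sc_filter *f, unsigned int nr)
{
	if (nr < DEMO_NR_SYSCALLS)
		f->enabled[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
}

/* Toggling "all" leaves the per-syscall bitmap untouched. */
static void demo_set_all(struct demo_sc_filter *f, bool on)
{
	f->all_enabled = on;
}

static bool demo_is_traced(const struct demo_sc_filter *f, unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return false;
	return f->all_enabled ||
		(f->enabled[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}
```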
Padding fields in the ABI are reserved to select whether to:
- Trace either native or compat system call (or both, which is the
behavior currently implemented),
- Trace either system call entry or exit (or both, which is the
behavior currently implemented),
- Select the system call to trace by name (behavior currently
implemented) or by system call number.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 4 Sep 2020 15:52:51 +0000 (11:52 -0400)]
fix: version ranges for ext4_discard_preallocations and writeback_queue_io
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id4fa53cb2e713cbda651e1a75deed91013115592
Michael Jeanson [Mon, 31 Aug 2020 18:16:01 +0000 (14:16 -0400)]
fix: writeback: Fix sync livelock due to b_dirty_time processing (v5.9)
See upstream commit:
commit f9cae926f35e8230330f28c7b743ad088611a8de
Author: Jan Kara <jack@suse.cz>
Date: Fri May 29 16:08:58 2020 +0200
writeback: Fix sync livelock due to b_dirty_time processing
When we are processing writeback for sync(2), move_expired_inodes()
didn't set any inode expiry value (older_than_this). This can result in
writeback never completing if there's steady stream of inodes added to
b_dirty_time list as writeback rechecks dirty lists after each writeback
round whether there's more work to be done. Fix the problem by using
sync(2) start time as inode expiry value when processing b_dirty_time
list similarly as for ordinarily dirtied inodes. This requires some
refactoring of older_than_this handling which simplifies the code
noticeably as a bonus.
Change-Id: I8b894b13ccc14d9b8983ee4c2810a927c319560b
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 31 Aug 2020 15:41:38 +0000 (11:41 -0400)]
fix: writeback: Drop I_DIRTY_TIME_EXPIRE (v5.9)
See upstream commit:
commit 5fcd57505c002efc5823a7355e21f48dd02d5a51
Author: Jan Kara <jack@suse.cz>
Date: Fri May 29 16:24:43 2020 +0200
writeback: Drop I_DIRTY_TIME_EXPIRE
The only use of I_DIRTY_TIME_EXPIRE is to detect in
__writeback_single_inode() that inode got there because flush worker
decided it's time to writeback the dirty inode time stamps (either
because we are syncing or because of age). However we can detect this
directly in __writeback_single_inode() and there's no need for the
strange propagation with I_DIRTY_TIME_EXPIRE flag.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I92e37c2ff3ec36d431e8f9de5c8e37c5a2da55ea
Michael Jeanson [Tue, 25 Aug 2020 14:56:29 +0000 (10:56 -0400)]
fix: removal of [smp_]read_barrier_depends (v5.9)
See upstream commits:
commit 76ebbe78f7390aee075a7f3768af197ded1bdfbb
Author: Will Deacon <will@kernel.org>
Date: Tue Oct 24 11:22:47 2017 +0100
locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
In preparation for the removal of lockless_dereference(), which is the
same as READ_ONCE() on all architectures other than Alpha, add an
implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
used to head dependency chains on all architectures.
commit 76ebbe78f7390aee075a7f3768af197ded1bdfbb
Author: Will Deacon <will.deacon@arm.com>
Date: Tue Oct 24 11:22:47 2017 +0100
locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
In preparation for the removal of lockless_dereference(), which is the
same as READ_ONCE() on all architectures other than Alpha, add an
implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
used to head dependency chains on all architectures.
Change-Id: Ife8880bd9378dca2972da8838f40fc35ccdfaaac
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 24 Aug 2020 19:37:50 +0000 (15:37 -0400)]
fix: ext4: indicate via a block bitmap read is prefetched… (v5.9)
See upstream commit:
commit ab74c7b23f3770935016e3eb3ecdf1e42b73efaa
Author: Theodore Ts'o <tytso@mit.edu>
Date: Wed Jul 15 11:48:55 2020 -0400
ext4: indicate via a block bitmap read is prefetched via a tracepoint
Modify the ext4_read_block_bitmap_load tracepoint so that it tells us
whether a block bitmap is being prefetched.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I0e5e2c5b8004223d0928235c092449ee16a940e1
Michael Jeanson [Mon, 24 Aug 2020 19:26:04 +0000 (15:26 -0400)]
fix: ext4: limit the length of per-inode prealloc list (v5.9)
See upstream commit:
commit 27bc446e2def38db3244a6eb4bb1d6312936610a
Author: brookxu <brookxu.cn@gmail.com>
Date: Mon Aug 17 15:36:15 2020 +0800
ext4: limit the length of per-inode prealloc list
In the scenario of writing sparse files, the per-inode prealloc list may
be very long, resulting in high overhead for ext4_mb_use_preallocated().
To circumvent this problem, we limit the maximum length of per-inode
prealloc list to 512 and allow users to modify it.
After patching, we observed that the sys ratio of cpu has dropped, and
the system throughput has increased significantly. We created a process
to write the sparse file, and the running time of the process on the
fixed kernel was significantly reduced, as follows:
Running time on unfixed kernel:
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real 0m2.051s
user 0m0.008s
sys 0m2.026s
Running time on fixed kernel:
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real 0m0.471s
user 0m0.004s
sys 0m0.395s
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5169cb24853d4da32e2862a6626f1f058689b053
Michael Jeanson [Mon, 10 Aug 2020 15:36:03 +0000 (11:36 -0400)]
fix: KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only (v5.9)
commit 985ab2780164698ec6e7d73fad523d50449261dd
Author: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Mon Jun 22 13:20:32 2020 -0700
KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only
Make 'struct kvm_mmu_page' MMU-only, nothing outside of the MMU should
be poking into the gory details of shadow pages.
Change-Id: Ia5c1b9c49c2b00dad1d5b17c50c3dc730dafda20
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 10 Aug 2020 15:22:05 +0000 (11:22 -0400)]
fix: Move mmutrace.h into the mmu/ sub-directory (v5.9)
commit 33e3042dac6bcc33b80835f7d7b502b1d74c457c
Author: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Mon Jun 22 13:20:29 2020 -0700
KVM: x86/mmu: Move mmu_audit.c and mmutrace.h into the mmu/ sub-directory
Move mmu_audit.c and mmutrace.h under mmu/ where they belong.
Change-Id: I582525ccca34e1e3bd62870364108a7d3e9df2e4
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 12 Aug 2020 20:58:26 +0000 (16:58 -0400)]
Namespace all logging statements
Add the 'LTTng:' prefix to all our logging statements to easily
distinguish them from other kernel messages.
Change-Id: I90fb4f4c75ce195734ec82946827bcf78e03429a
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Beniamin Sandu [Thu, 13 Aug 2020 13:24:39 +0000 (16:24 +0300)]
Kconfig: fix dependency issue when building in-tree without CONFIG_FTRACE
When building in-tree, one could disable CONFIG_FTRACE from kernel
config which will leave CONFIG_TRACEPOINTS selected by LTTNG modules,
but generate a lot of linker errors like below because it leaves out
other stuff, e.g.:
trace.c:(.text+0xd86b): undefined reference to `trace_event_buffer_reserve'
ld: trace.c:(.text+0xd8de): undefined reference to `trace_event_buffer_commit'
ld: trace.c:(.text+0xd926): undefined reference to `event_triggers_call'
ld: trace.c:(.text+0xd942): undefined reference to `trace_event_ignore_this_pid'
ld: net/mac80211/trace.o: in function `trace_event_raw_event_drv_tdls_cancel_channel_switch':
It appears to be caused by the fact that TRACE_EVENT macros in the Linux
kernel depend on the Ftrace ring buffer as soon as CONFIG_TRACEPOINTS is
enabled.
Steps to reproduce:
- Get a clone of an upstream stable kernel and use scripts/built-in.sh on it
- Configure a standard x86-64 build, enable built-in LTTNG but disable
CONFIG_FTRACE from Kernel Hacking-->Tracers using menuconfig
- Build will fail at linking stage
Signed-off-by: Beniamin Sandu <beniaminsandu@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Thu, 6 Aug 2020 15:03:00 +0000 (11:03 -0400)]
Fix: mmap enum flags build failures
Some of the mmap option flags are not available on all architectures and
are defined to zero by include/linux/mman.h. This is probably done as a
way to no-op the use of these flags on configurations that don't support
them.
To fix this, only define these flags in our enumeration if they are
defined and non-zero.
Also, the MAP_HUGE_{2MB,1GB} labels were mistakenly named
MAP_HUGETLB_{2MB,1GB}.
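The "defined and non-zero" guard can be sketched with the preprocessor as follows. The `DEMO_MAP_*` macros and the `demo_enum_entry` table are hypothetical placeholders, not the real mmap flags or the lttng-modules enumeration macros: one flag is supported, one is defined to zero, and one is not defined at all.

```c
#include <stddef.h>

/* Hypothetical flags for illustration only. */
#define DEMO_MAP_SUPPORTED	0x100
#define DEMO_MAP_NOOP		0	/* defined to zero on this "architecture" */
/* DEMO_MAP_MISSING is deliberately left undefined. */

struct demo_enum_entry {
	const char *label;
	unsigned long value;
};

/* Only flags that are both defined and non-zero get an entry. */
static const struct demo_enum_entry demo_mmap_options[] = {
#if defined(DEMO_MAP_SUPPORTED) && (DEMO_MAP_SUPPORTED != 0)
	{ "DEMO_MAP_SUPPORTED", DEMO_MAP_SUPPORTED },
#endif
#if defined(DEMO_MAP_NOOP) && (DEMO_MAP_NOOP != 0)
	{ "DEMO_MAP_NOOP", DEMO_MAP_NOOP },
#endif
#if defined(DEMO_MAP_MISSING) && (DEMO_MAP_MISSING != 0)
	{ "DEMO_MAP_MISSING", DEMO_MAP_MISSING },
#endif
};

static const size_t demo_mmap_options_len =
	sizeof(demo_mmap_options) / sizeof(demo_mmap_options[0]);
```

In this sketch only the first entry survives preprocessing, so a zero-valued flag never ends up as a label for the value 0.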
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I778a52a0da9da6e04231a52c7f68a22d122dfb83
Francis Deslauriers [Fri, 5 Jun 2020 15:38:14 +0000 (11:38 -0400)]
syscalls: Make mmap()'s fields `prot` and `flags` enums
The `prot` field is a simple CTF enumeration.
The `flags` field is a CTF struct of 2 CTF enumerations (`type` and
`options`). This is needed to express the two parts of this integer
flag. The 4 least significant bits of the integer are reserved to
express the type of the mapping (MAP_SHARED=0x1, MAP_PRIVATE=0x2, and
MAP_SHARED_VALIDATE=0x3).
The remaining 28 bits are used to specify optional configurations on the
mapping. As opposed to the type part, the options part is a bit flag
field where all values are powers of 2. This part can be expressed as
ORed bit flag values.
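A minimal sketch of that decomposition, using hypothetical helper names and a 4-bit type mask as described above. Whether the options part is additionally shifted in the actual CTF field is not stated here, so this example only masks.

```c
/* Hypothetical mask covering the 4 least significant bits (mapping type). */
#define DEMO_MAP_TYPE_MASK	0xfUL

/* Mapping type: a plain enumeration value, e.g. 0x1, 0x2 or 0x3. */
static unsigned long demo_mmap_type(unsigned long flags)
{
	return flags & DEMO_MAP_TYPE_MASK;
}

/* Option bits: everything above the type, an OR of power-of-2 values. */
static unsigned long demo_mmap_option_bits(unsigned long flags)
{
	return flags & ~DEMO_MAP_TYPE_MASK;
}
```

For example, flags of 0x22 decompose into type 0x2 (MAP_PRIVATE in the list above) and option bits 0x20.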
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I5ae78754b5863b31d9a3ba1b1173502e1ae284d3
Francis Deslauriers [Fri, 5 Jun 2020 22:42:54 +0000 (18:42 -0400)]
x86: add error code enum to pagefault tracepoints
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ia939eccd1a918958f6a281595e447f33da2d64f7
Michael Jeanson [Mon, 20 Jul 2020 14:48:02 +0000 (10:48 -0400)]
Fix: TAINT_UNSAFE_SMP renamed to TAINT_CPU_OUT_OF_SPEC in v3.15
See upstream commit:
commit 8c90487cdc64847b4fdd812ab3047f426fec4d13
Author: Dave Jones <davej@redhat.com>
Date: Wed Feb 26 10:49:49 2014 -0500
Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC
Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC, so we can repurpose
the flag to encompass a wider range of pushing the CPU beyond its
warranty.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I3e91df01bfbfaa6fab4e3904e59317022a9ec0f8
Francis Deslauriers [Tue, 18 Feb 2020 16:30:54 +0000 (11:30 -0500)]
module_load: change `taints` field to `ctf_enum`
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I67b5aad0bd2bc43e06a5708f0f5e1fea56f31436
Mathieu Desnoyers [Mon, 13 Jul 2020 18:59:33 +0000 (14:59 -0400)]
Fix: Lock metadata cache on session destroy
commit 92143b2c5656 ("Fix: metadata stream leak, missing list removal and locking")
missed taking a lock protecting the metadata stream list iteration on
session destroy. This opens a race window between iteration and item
removal/free which triggers a kernel OOPS.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 10 Jul 2020 15:15:40 +0000 (11:15 -0400)]
Fix: metadata stream leak, missing list removal and locking
The metadata stream is part of a list of metadata streams in the
metadata cache. Its addition to the list should be protected by
the metadata cache lock. It needs to be paired with protection
of list iteration with the same lock.
Removal from the list is entirely missing, and should be added
to lttng_metadata_ring_buffer_release (with proper locking).
This missing list removal was probably not causing issues because the
metadata stream structure was leaked: a kfree() is missing from
lttng_metadata_ring_buffer_release as well.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 10 Jul 2020 14:51:26 +0000 (10:51 -0400)]
Fix: coherent state not changed atomically with metadata written
commit 122c63cb4310 ("Fix: Implement RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK")
introduces a new ioctl which returns a flag indicating whether the
metadata is in a consistent state at the end of the sub-buffer.
That commit is meant to address metadata consistency issues observable
in live sessions.
However, the "consistent" state is false as soon as a producer is
active (between an outermost metadata_begin/end pair). Unfortunately,
if the last "RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK" operation is
done between the last metadata printf and "end" of the transaction, the
last consistency state will be false, and the consumer daemon will never
send metadata to the relay daemon. This in turn causes a live viewer to
wait for metadata endlessly.
This issue can be reproduced by running lttng-tools:
tests/regression/tools/live/test_kernel
as root in a loop.
We observe two things:
1) the poll operation blocks when there is no more metadata to send,
so there is no way to unblock when the consistency state
changes back to "true" without producing additional metadata,
2) Even if (1) was fixed, the expectation from an ABI perspective is
that the "coherent" state is only populated when
RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK succeeds. Therefore,
there is no way to let user-space know about coherency transitions
unless additional metadata is generated.
Fixing this requires holding the metadata cache lock across the entire
production of a coherent metadata transaction. This simpler scheme is
possible because the metadata is generated in a reallocated memory area
and not directly into a ring buffer anymore. This was not the case in
earlier lttng-modules versions, when the metadata was generated directly
into a ring buffer, which explains why this simpler scheme was not
implemented.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 7 Jul 2020 18:18:37 +0000 (14:18 -0400)]
fix: include module.h for EXPORT_SYMBOL_GPL
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic337e1eb375791ace08560555dd02b37cbefcf25
Michael Jeanson [Tue, 7 Jul 2020 17:50:15 +0000 (13:50 -0400)]
fix: __lttng_vmalloc_node_range const caller introduced in v3.6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib13cf03b5ab11830a8732318a12713720cf1b3e3
Michael Jeanson [Tue, 7 Jul 2020 18:07:01 +0000 (14:07 -0400)]
fix: version range for overflow_callback
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I1b8f1d59552a1723d3f4ed74780a2b57d13d0e52
Michael Jeanson [Tue, 7 Jul 2020 17:00:10 +0000 (13:00 -0400)]
fix: global_dirty_limit was introduced in v3.1
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id97dbb2d0181a45c45cfed36c4be8753cabac283
Michael Jeanson [Tue, 7 Jul 2020 16:21:54 +0000 (12:21 -0400)]
fix: wrapper_uprobe_unregister is a void function
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib4438da02aac3defd1245324d1b48f400f806d58
Michael Jeanson [Tue, 7 Jul 2020 15:58:03 +0000 (11:58 -0400)]
fix: prior to v4.0, __vmalloc_node_range had no vm_flags param
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ib476e32d109298d9ca3e6b6ab7ac8f63c50fb09f
Michael Jeanson [Tue, 7 Jul 2020 15:15:39 +0000 (11:15 -0400)]
fix: vmalloc on v5.8 without KALLSYMS
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Ic945dad92e78a5bc2895a969a10c527e1349decf