Mathieu Desnoyers [Tue, 15 Oct 2019 19:56:23 +0000 (15:56 -0400)]
README.md: cleanup formatting for bullet lists
Suggested-by: David Boles <biblioboles@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 30 Sep 2019 19:15:34 +0000 (15:15 -0400)]
Fix: btrfs: move basic block_group definitions to their own header (v5.4)
commit aac0023c2106952538414254960c51dcf0dc39e9
Author: Josef Bacik <josef@toxicpanda.com>
Date: Thu Jun 20 15:37:44 2019 -0400
btrfs: move basic block_group definitions to their own header
This is prep work for moving all of the block group cache code into its
own file.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 17 Sep 2019 18:12:07 +0000 (14:12 -0400)]
Cleanup: Silence gcc fall-through warning
Use a comment pattern recognized by gcc 7.4.0 to silence the
fall-through warning.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
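As an illustration of the comment pattern mentioned above, here is a minimal sketch. The function example_classify() is hypothetical and not part of lttng-modules; it only shows a "Fall through" comment placed right before the next case label, which is the form gcc's -Wimplicit-fallthrough is documented to recognize.

```c
/*
 * Hypothetical example, not lttng-modules code.
 * Case 0 deliberately continues into case 1.
 */
static int example_classify(int level)
{
	int score = 0;

	switch (level) {
	case 0:
		score += 10;
		/* Fall through */
	case 1:
		score += 1;
		break;
	default:
		score = -1;
		break;
	}
	return score;
}
```

The comment must be the last thing before the following case label for the compiler to treat the fall-through as intentional.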
Gabriel-Andrew Pollo-Guilbert [Mon, 16 Sep 2019 17:57:38 +0000 (13:57 -0400)]
Fix: update sched prev_state instrumentation for upstream kernel
Introduced in upstream Linux kernel 4.14.
commit efb40f588b4370ffaeffafbd50f6ff213d954254
Author: Peter Zijlstra <peterz@infradead.org>
Date: Fri Sep 22 18:19:53 2017 +0200
sched/tracing: Fix trace_sched_switch task-state printing
Introduced in upstream Linux kernel 4.15.
Backported in 13f12749af15 (4.14.64).
commit 3f5fe9fef5b2da06b6319fab8123056da5217c3f
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Wed Nov 22 13:05:48 2017 +0100
sched/debug: Fix task state recording/printout
Introduced in upstream Linux kernel 4.20.
Backported in e1e5fa73e466 (4.14.102) and fd8152818f11 (4.19.9).
commit 3054426dc68e5d63aa6a6e9b91ac4ec78e3f3805
Author: Pavankumar Kondeti <pkondeti@codeaurora.org>
Date: Tue Oct 30 12:24:33 2018 +0530
sched, trace: Fix prev_state output in sched_switch tracepoint
Signed-off-by: Gabriel-Andrew Pollo-Guilbert <gabriel.pollo-guilbert@efficios.com>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 6 Sep 2019 01:01:47 +0000 (21:01 -0400)]
Fix: gcc-9.1 stack frame size warning
gcc-9.1.0 warns about lttng_session_ioctl taking a too large frame size.
lttng-modules/lttng-abi.c:622:1: warning: the frame size of 2240 bytes
is larger than 2048 bytes [-Wframe-larger-than=]
Combine the variables used in the various cases of the switch so they are
not duplicated on the stack by the compiler.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
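One way to keep per-case variables from each taking their own stack slot is to declare them once at function scope, for instance in a union. The sketch below assumes that approach; the structure names, sizes and example_handle_cmd() are hypothetical, and the actual change to lttng_session_ioctl may be organized differently.

```c
#include <string.h>

/* Hypothetical argument structures; sizes are for illustration only. */
struct example_arg_a { char payload[512]; };
struct example_arg_b { char payload[768]; };

/*
 * A single union at function scope: the frame reserves room for the
 * largest member only, rather than one object per switch case.
 */
static size_t example_handle_cmd(int cmd)
{
	union {
		struct example_arg_a a;
		struct example_arg_b b;
	} u;

	switch (cmd) {
	case 0:
		memset(&u.a, 0, sizeof(u.a));
		return sizeof(u.a);
	case 1:
		memset(&u.b, 0, sizeof(u.b));
		return sizeof(u.b);
	default:
		return 0;
	}
}
```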
Mathieu Desnoyers [Wed, 30 May 2018 00:11:43 +0000 (02:11 +0200)]
Implement ring buffer clear
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Simon Marchi [Tue, 20 Aug 2019 01:51:52 +0000 (21:51 -0400)]
Make bitfield.h C++-friendly
This patch changes bitfield.h to be usable in C++11.
It will probably never be compiled as C++ in the context of
lttng-modules, but this is just to keep things sync'ed across projects.
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Tue, 13 Aug 2019 18:14:25 +0000 (14:14 -0400)]
Introduce LTTNG_KERNEL_SESSION_SET_CREATION_TIME
Add trace_creation_datetime to the metadata env field
This allows a viewer to get more information regarding the creation time
of a trace. This information is normally inferred from the trace
hierarchy.
The ABI expects an ISO8601 compliant string value. We leave the
formatting to the trace controller.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
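Since formatting is left to the trace controller, a controller-side helper could look like the sketch below. It assumes a UTC timestamp in the extended ISO 8601 form YYYY-MM-DDThh:mm:ssZ; the exact layout used by lttng-sessiond is not stated here, and example_format_creation_time() is a hypothetical name.

```c
#include <stddef.h>
#include <time.h>

/*
 * Format t as an ISO 8601 UTC string into buf.
 * Returns 0 on success, -1 if the conversion fails or buf is too small.
 */
static int example_format_creation_time(time_t t, char *buf, size_t len)
{
	const struct tm *utc = gmtime(&t);

	if (!utc)
		return -1;
	/* strftime() returns 0 when the result does not fit in buf. */
	return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", utc) ? 0 : -1;
}
```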
Jonathan Rajotte [Tue, 13 Aug 2019 18:14:24 +0000 (14:14 -0400)]
Add metadata env fields
Add the following fields:
- tracer_buffering_scheme
The buffering scheme used by the tracer. lttng-modules' sole
buffering scheme is "global".
- trace_name
The name of the trace. Use the session name.
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Tue, 13 Aug 2019 18:14:23 +0000 (14:14 -0400)]
Introduce LTTNG_KERNEL_SESSION_SET_NAME
The tracer controller (lttng-sessiond) can now inform the kernel tracer
of the name of the created session. This will allow the tracer to
propagate the information inside the trace metadata under an "env" field.
This information is normally inferred from the generated folder
structure where a trace rests.
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Thu, 4 Jul 2019 20:02:13 +0000 (16:02 -0400)]
Fix: do not use diagnostic pragma when GCC version is lower than 4.6.0
Officially, the diagnostic pragmas are supported starting in GCC 4.6. [1]
They were present before 4.6, but with limitations which we cannot
easily honour.
[1] https://gcc.gnu.org/gcc-4.6/changes.html
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
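A version guard of this kind can be sketched as follows. The EXAMPLE_* macro names are hypothetical and are not the ones used in lttng-modules; the sketch also defines the macros as empty when the compiler is not gcc, so that callers always find them defined.

```c
/* Hypothetical helper macros, for illustration only. */
#if defined(__GNUC__) && !defined(__clang__)
# define EXAMPLE_GCC_VERSION \
	(__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
# define EXAMPLE_GCC_VERSION	0
#endif

#if EXAMPLE_GCC_VERSION >= 40600
# define EXAMPLE_DIAG_PUSH	_Pragma("GCC diagnostic push")
# define EXAMPLE_DIAG_NO_TYPE_LIMITS \
	_Pragma("GCC diagnostic ignored \"-Wtype-limits\"")
# define EXAMPLE_DIAG_POP	_Pragma("GCC diagnostic pop")
#else
/* Older gcc or another compiler: expand to nothing. */
# define EXAMPLE_DIAG_PUSH
# define EXAMPLE_DIAG_NO_TYPE_LIMITS
# define EXAMPLE_DIAG_POP
#endif

/* The comparison is always false for unsigned v, which is what gcc warns about. */
static int example_unsigned_below_zero(unsigned int v)
{
	int ret;

	EXAMPLE_DIAG_PUSH
	EXAMPLE_DIAG_NO_TYPE_LIMITS
	ret = v < 0;
	EXAMPLE_DIAG_POP
	return ret;
}
```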
Jonathan Rajotte [Thu, 4 Jul 2019 20:02:12 +0000 (16:02 -0400)]
Fix: missing define when not building with gcc
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 24 Jun 2019 13:43:45 +0000 (09:43 -0400)]
Fix: lttng-tracepoint module notifier should return NOTIFY_OK
Module notifiers should return NOTIFY_OK on success rather than the
value 0. The return value 0 does not seem to have any ill side-effects
in the notifier chain caller, but it is preferable to respect the API
requirements in case this changes in the future.
Notifiers can encapsulate a negative errno value with
notifier_from_errno(), but this is not needed by the LTTng tracepoint
notifier.
The approach taken in this notifier is to just print a console warning
on error, because tracing failure should not prevent loading a module.
So we definitely do not want to stop notifier iteration. Returning
an error without stopping iteration is not really that useful, because
only the return value of the last callback is returned to notifier chain
caller.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
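The behaviour described above can be mimicked in user space. The sketch below is not kernel code: the NOTIFY_* values are stand-ins mirroring the kernel's <linux/notifier.h>, and the example_* functions are hypothetical. It shows a callback that always reports NOTIFY_OK, and a simplified chain walk in which only the last callback's return value reaches the caller.

```c
#include <stddef.h>

/* User-space stand-ins mirroring values from <linux/notifier.h>. */
#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000

typedef int (*example_notifier_fn)(unsigned long event);

/* Simplified chain walk: stop early only if NOTIFY_STOP_MASK is set. */
static int example_call_chain(const example_notifier_fn *chain, size_t n,
		unsigned long event)
{
	int ret = NOTIFY_DONE;
	size_t i;

	for (i = 0; i < n; i++) {
		ret = chain[i](event);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

static int example_calls;

/* Warn-and-continue policy: report success, never stop the iteration. */
static int example_tracepoint_notifier(unsigned long event)
{
	(void) event;
	example_calls++;
	return NOTIFY_OK;
}

/* Some other callback registered later on the same chain. */
static int example_other_notifier(unsigned long event)
{
	(void) event;
	example_calls++;
	return NOTIFY_DONE;
}
```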
Jérémie Galarneau [Tue, 11 Jun 2019 22:34:32 +0000 (18:34 -0400)]
Fix: Don't print ring-buffer's records count when it is not used
The teardown of a ring buffer causes a number of diagnostic messages
to be printed using printk. One of those contains the "records
count", which is only updated when lttng-modules is built with
LTTNG_RING_BUFFER_COUNT_EVENTS defined.
Move the "records count" printing to a different function and stub it
out when LTTNG_RING_BUFFER_COUNT_EVENTS is not defined
(default configuration).
This eliminates messages of the following form from the dmesg output
when an LTTng session is torn down.
[...] ring buffer relay-discard, cpu 0: 0 records written, 0 records overrun
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
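The stubbing pattern described above can be sketched in plain C as follows. Only the macro name LTTNG_RING_BUFFER_COUNT_EVENTS comes from the commit message; the structure, the function name and the use of snprintf() instead of printk() are assumptions made so the sketch builds in user space.

```c
#include <stdio.h>

/* Hypothetical per-buffer counters. */
struct example_buf {
	int cpu;
	unsigned long records_count;
	unsigned long records_overrun;
};

#ifdef LTTNG_RING_BUFFER_COUNT_EVENTS
/* Counters are maintained: format the diagnostic message. */
static int example_print_records_count(const struct example_buf *buf,
		char *out, size_t len)
{
	return snprintf(out, len,
		"cpu %d: %lu records written, %lu records overrun",
		buf->cpu, buf->records_count, buf->records_overrun);
}
#else
/* Stub: counters are never updated in this configuration, so emit nothing. */
static int example_print_records_count(const struct example_buf *buf,
		char *out, size_t len)
{
	(void) buf;
	if (len)
		out[0] = '\0';
	return 0;
}
#endif
```

Keeping one function name with two definitions lets the teardown path call it unconditionally.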
Mathieu Desnoyers [Tue, 11 Jun 2019 21:44:56 +0000 (23:44 +0200)]
Fix: do not set quiescent state on channel destroy
Setting the quiescent state to true for each stream at channel
destruction is not useful: there are no readers left anyway at
that stage.
The side-effect perceived of setting this quiescent state on
destroy is that the metadata stream ends up with an empty last
packet (due to flush_empty performed when setting the quiescent state)
which is never consumed. This shows up in the lttng-modules error
reporting.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 4 Jun 2019 18:59:26 +0000 (14:59 -0400)]
Fix: ring_buffer_frontend.c: init read timer with uninitialized flags
For the config->alloc RING_BUFFER_ALLOC_GLOBAL (metadata channel), the
read timer flags argument is uninitialized.
Found by Coverity:
CID 1401114 (#1 of 1): Uninitialized scalar variable (UNINIT)
6. uninit_use_in_call: Using uninitialized value flags when calling init_timer_key.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 22 May 2019 01:16:15 +0000 (21:16 -0400)]
Introduce callstack stackwalk implementation header
Introduce a new implementation header for the stackwalk-based API, added
in Linux 5.2 and gradually integrated within each architecture.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 22 May 2019 01:15:44 +0000 (21:15 -0400)]
Prepare callstack common code for stackwalk
Prepare the callstack common code for stackwalk implementation,
moving more legacy code to the legacy implementation header.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 21 May 2019 21:51:23 +0000 (17:51 -0400)]
Introduce callstack legacy implementation header
Split the callstack code: keep boilerplate code within the
C implementation file, and move the parts which depend on the
"legacy" (pre-stackwalk) stacktrace kernel API to a separate
implementation header.
This is a preparation step to introduce a new implementation
header for the stackwalk API, added in Linux 5.2 and gradually
integrated within each architecture.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 21 May 2019 20:33:14 +0000 (16:33 -0400)]
fix: random: only read from /dev/random after its pool has received 128 bits (v5.2)
See upstream commit:
commit eb9d1bf079bb438d1a066d72337092935fc770f6
Author: Theodore Ts'o <tytso@mit.edu>
Date: Wed Feb 20 16:06:38 2019 -0500
random: only read from /dev/random after its pool has received 128 bits
Immediately after boot, we allow reads from /dev/random before its
entropy pool has been fully initialized. Fix this so that we don't
allow this until the blocking pool has received 128 bits.
We do this by repurposing the initialized flag in the entropy pool
struct, and use the initialized flag in the blocking pool to indicate
whether it is safe to pull from the blocking pool.
To do this, we needed to rework when we decide to push entropy from the
input pool to the blocking pool, since the initialized flag for the
input pool was used for this purpose. To simplify things, we no
longer use the initialized flag for that purpose, nor do we use the
entropy_total field any more.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 21 May 2019 20:33:13 +0000 (16:33 -0400)]
fix: mm: move recent_rotated pages calculation to shrink_inactive_list() (v5.2)
See upstream commit:
commit 886cf1901db962cee5f8b82b9b260079a5e8a4eb
Author: Kirill Tkhai <ktkhai@virtuozzo.com>
Date: Mon May 13 17:16:51 2019 -0700
mm: move recent_rotated pages calculation to shrink_inactive_list()
Patch series "mm: Generalize putback functions"
putback_inactive_pages() and move_active_pages_to_lru() are almost
similar, so this patchset merges them in a single function.
This patch (of 4):
The patch moves the calculation from putback_inactive_pages() to
shrink_inactive_list(). This makes putback_inactive_pages() look more
similar to move_active_pages_to_lru().
To do that, we account activated pages in reclaim_stat::nr_activate.
Since a page may change its LRU type from anon to file cache inside
shrink_page_list() (see ClearPageSwapBacked()), we have to account pages
for both types. So, nr_activate becomes an array.
Previously we used nr_activate to account PGACTIVATE events, but now we
account them into pgactivate variable (since they are about number of
pages in general, not about sum of hpage_nr_pages).
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 21 May 2019 20:33:12 +0000 (16:33 -0400)]
fix: mm/vmscan: simplify trace_reclaim_flags and trace_shrink_flags (v5.2)
See upstream commit:
commit 60b62ff7cc4217ac3de76535fa4c1510a798dbcb
Author: Yafang Shao <laoar.shao@gmail.com>
Date: Mon May 13 17:23:08 2019 -0700
mm/vmscan: simplify trace_reclaim_flags and trace_shrink_flags
trace_reclaim_flags and trace_shrink_flags are almost the same.
We can simplify them to avoid redundant code.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 21 May 2019 20:33:11 +0000 (16:33 -0400)]
fix: mm/vmscan: drop may_writepage and classzone_idx from direct reclaim begin template (v5.2)
See upstream commit:
commit 3481c37ffa1de58ef140d0fe9eabf56305e74666
Author: Yafang Shao <laoar.shao@gmail.com>
Date: Mon May 13 17:19:14 2019 -0700
mm/vmscan: drop may_writepage and classzone_idx from direct reclaim begin template
There are three tracepoints using this template, which are
mm_vmscan_direct_reclaim_begin,
mm_vmscan_memcg_reclaim_begin,
mm_vmscan_memcg_softlimit_reclaim_begin.
Regarding mm_vmscan_direct_reclaim_begin,
sc.may_writepage is !laptop_mode, that's a static setting, and
reclaim_idx is derived from gfp_mask which is already shown in this
tracepoint.
Regarding mm_vmscan_memcg_reclaim_begin,
may_writepage is !laptop_mode too, and reclaim_idx is (MAX_NR_ZONES-1),
which are both static values.
mm_vmscan_memcg_softlimit_reclaim_begin is the same as
mm_vmscan_memcg_reclaim_begin.
So we can drop them all.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 21 May 2019 20:33:10 +0000 (16:33 -0400)]
fix: timer/trace: Improve timer tracing (v5.2)
See upstream commit:
commit f28d3d5346e97e60c81f933ac89ccf015430e5cf
Author: Anna-Maria Gleixner <anna-maria@linutronix.de>
Date: Thu Mar 21 13:09:21 2019 +0100
timer/trace: Improve timer tracing
Timers are added to the timer wheel off by one. This is required in
case a timer is queued directly before incrementing jiffies to prevent
early timer expiry.
When reading a timer trace and relying only on the expiry time of the timer
in the timer_start trace point and on the "now" value in the timer_expire_entry
trace point, it seems that the timer fires late. With the current
timer_expire_entry trace point information only now=jiffies is printed but
not the value of base->clk. This makes it impossible to draw a conclusion
to the index of base->clk and makes it impossible to examine timer problems
without additional trace points.
Therefore add the base->clk value to the timer_expire_entry trace
point, to be able to calculate the index the timer base is located at
during collecting expired timers.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 17 May 2019 14:01:36 +0000 (10:01 -0400)]
Cleanup: bitfields: streamline use of underscores
Do not prefix macro arguments with underscores. Use one leading
underscore as prefix for local variables defined within macros.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
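The naming convention can be illustrated with a small macro. example_swap() is hypothetical and unrelated to the bitfield macros themselves: its arguments (type, a, b) carry no underscore prefix, while the temporary defined inside the macro uses a single leading underscore.

```c
/*
 * Hypothetical macro following the convention: plain names for the
 * arguments, one leading underscore for the macro-local variable.
 */
#define example_swap(type, a, b)	\
do {					\
	type _tmp = (a);		\
					\
	(a) = (b);			\
	(b) = _tmp;			\
} while (0)

static void example_swap_ints(int *x, int *y)
{
	example_swap(int, *x, *y);
}
```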
Mathieu Desnoyers [Tue, 14 May 2019 14:57:13 +0000 (10:57 -0400)]
Silence compiler "always false comparison" warning
Compiling the bitfield test with gcc -Wextra generates those warnings:
../../include/babeltrace/bitfield-internal.h:38:45: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
#define _bt_is_signed_type(type) ((type) -1 < (type) 0)
This is the intent of the macro. Disable compiler warnings around use of
that macro.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
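For reference, the macro quoted in the warning above behaves as sketched below. The macro body is copied from the warning text; the two wrapper functions are hypothetical, and the pragmas show one possible way of disabling -Wtype-limits around its use, assuming a gcc recent enough to support diagnostic push/pop.

```c
/* Macro quoted from the warning above: true when (type) -1 is below zero. */
#define _bt_is_signed_type(type)	((type) -1 < (type) 0)

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wtype-limits"

/* For a signed type, (int) -1 < (int) 0 holds. */
static int example_int_is_signed(void)
{
	return _bt_is_signed_type(int);
}

/* For an unsigned type, (unsigned int) -1 is the maximum value, so the test fails. */
static int example_uint_is_signed(void)
{
	return _bt_is_signed_type(unsigned int);
}

#pragma GCC diagnostic pop
```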
Mathieu Desnoyers [Tue, 14 May 2019 14:56:14 +0000 (10:56 -0400)]
Fix: bitfield: shift undefined/implementation defined behaviors
bitfield.h uses the left shift operator with a left operand which
may be negative. The C99 standard states that shifting a negative
value is undefined.
When building with -Wshift-negative-value, we get this gcc warning:
In file included from /home/smarchi/src/babeltrace/include/babeltrace/ctfser-internal.h:44:0,
from /home/smarchi/src/babeltrace/ctfser/ctfser.c:42:
/home/smarchi/src/babeltrace/include/babeltrace/ctfser-internal.h: In function ‘bt_ctfser_write_unsigned_int’:
/home/smarchi/src/babeltrace/include/babeltrace/bitfield-internal.h:116:24: error: left shift of negative value [-Werror=shift-negative-value]
mask = ~((~(type) 0) << (__start % ts)); \
^
/home/smarchi/src/babeltrace/include/babeltrace/bitfield-internal.h:222:2: note: in expansion of macro ‘_bt_bitfield_write_le’
_bt_bitfield_write_le(ptr, type, _start, _length, _v)
^~~~~~~~~~~~~~~~~~~~~
/home/smarchi/src/babeltrace/include/babeltrace/ctfser-internal.h:418:3: note: in expansion of macro ‘bt_bitfield_write_le’
bt_bitfield_write_le(mmap_align_addr(ctfser->base_mma) +
^~~~~~~~~~~~~~~~~~~~
This boils down to the fact that the expression ~((uint8_t)0) has type
"signed int", which is used as an operand of the left shift. This is due
to the integer promotion rules of C99 (6.3.1.1):
If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are unchanged
by the integer promotions.
We also need to cast the result explicitly into the left hand
side type to deal with:
warning: large integer implicitly truncated to unsigned type [-Woverflow]
The C99 standard states that a right shift has implementation-defined
behavior when shifting a signed negative value. Add a preprocessor check
that the compiler provides the expected behavior, else provide an
alternative implementation which guarantees the intended behavior.
A preprocessor check is also added to ensure that the compiler
representation for signed values is two's complement, which is expected
by this header.
Document that this header strictly respects the C99 standard, with
the exception of its use of __typeof__.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
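The points made in this commit message can be reproduced with a short sketch. All example_* names are hypothetical and this is not the code of bitfield.h: it shows the promoted value of ~((uint8_t) 0), a mask built without left-shifting a negative value, and a right shift guarded by a preprocessor check, together with a two's complement check.

```c
#include <stdint.h>

/* Two's complement check, evaluated by the preprocessor. */
#if (-1 != ~0)
# error "This sketch assumes two's complement signed integers."
#endif

/* (uint8_t) 0 is promoted to int before ~ is applied, so the result is -1. */
static int example_promoted_value(void)
{
	return ~((uint8_t) 0);
}

/*
 * Mask with the low `bits` bits set (bits < 8). The shift is done on an
 * unsigned operand and the result is cast back explicitly to the
 * destination type.
 */
static uint8_t example_low_bits_mask_u8(unsigned int bits)
{
	return (uint8_t) ~(~0U << bits);
}

/* Right shift that keeps the sign, whatever the compiler does natively. */
static int example_rshift(int v, unsigned int n)
{
#if ((-1 >> 1) == -1)
	/* The compiler already performs an arithmetic right shift. */
	return v >> n;
#else
	/* Fallback: only shift non-negative values, then flip the bits back. */
	return v < 0 ? ~(~v >> n) : v >> n;
#endif
}
```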
Mathieu Desnoyers [Tue, 30 Apr 2019 03:44:30 +0000 (23:44 -0400)]
Fix: timestamp_end field should include all events within sub-buffer
Fix for timestamp_end not including all events within sub-buffer. This
happens if a thread is preempted/interrupted for a long time between
reserve and commit (e.g. in the middle of a packet), which causes the
timestamp used for timestamp_end field of the packet header to be lower
than the timestamp of the last events in the buffer (those following the
event that was preempted/interrupted between reserve and commit).
The fix involves sampling the timestamp when doing the last space
reservation in a sub-buffer (which necessarily happens before doing the
delivery after its last commit). Save this timestamp temporarily in a
per-sub-buffer control area (we have exclusive access to that area until
we increment the commit counter).
Then, that timestamp value will be read when delivering the sub-buffer,
whichever event or switch happens to be the last to increment the commit
counter to perform delivery. The timestamp value can be read without
worrying about concurrent access, because at that point sub-buffer
delivery has exclusive access to the sub-buffer.
This ensures the timestamp_end value is always greater than or equal to the
timestamp of the last event, always less than or equal to the timestamp_begin
of the following packet, and always less than or equal to the timestamp of the
first event in the following packet.
Fixes: #1183
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
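The three ordering guarantees listed above can be written down as a checker, for instance for use by a trace validation tool. The structure and function below are hypothetical; they only restate the invariants and do not reflect the ring buffer implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one packet (sub-buffer). */
struct example_packet {
	uint64_t timestamp_begin;
	uint64_t timestamp_end;
	uint64_t first_event_ts;
	uint64_t last_event_ts;
};

/* True when cur and the packet following it respect the three invariants. */
static bool example_timestamps_ordered(const struct example_packet *cur,
		const struct example_packet *next)
{
	return cur->timestamp_end >= cur->last_event_ts &&
		cur->timestamp_end <= next->timestamp_begin &&
		cur->timestamp_end <= next->first_event_ts;
}
```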
Michael Jeanson [Tue, 9 Apr 2019 18:12:41 +0000 (14:12 -0400)]
Fix: Remove start and number from syscall_get_arguments() args (v5.1)
commit b35f549df1d7520d37ba1e6d4a8d4df6bd52d136
Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Date: Mon Nov 7 16:26:37 2016 -0500
syscalls: Remove start and number from syscall_get_arguments() args
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
function call syscall_get_arguments() implemented in x86 was horribly
written and not optimized for the standard case of passing in 0 and 6 for
the starting index and the number of system calls to get. When looking at
all the users of this function, I discovered that all instances pass in only
0 and 6 for these arguments. Instead of having this function handle
different cases that are never used, simply rewrite it to return the first 6
arguments of a system call.
This should help out the performance of tracing system calls by ptrace,
ftrace and perf.
Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 10 Apr 2019 15:13:15 +0000 (11:13 -0400)]
lttng abi documentation: clarify getter usage requirements
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 10 Apr 2019 14:25:47 +0000 (10:25 -0400)]
Fix: don't access packet header for stream_id and stream_instance_id getters
The stream ID and stream instance ID are invariant for a stream, so
there is no point reading them from the packet header currently owned by
the consumer (between get/put subbuf).
Actually, the consumer tries to access the stream_id from the live timer
when sending a live beacon without getting the reader subbuffer first.
Doing so is racy against producers. In typical live scenarios
(non-overwrite channels), the producers will always write the same
stream id and stream instance id values at the same header offsets,
which will "work", except for the initial state of an empty buffer:
the value "0" will be returned (erroneously).
For the less frequently used scenario of a live session with "overwrite"
channels, this will trigger WARN_ON safety nets in libringbuffer. This
safety net triggers a kernel OOPS report and disables tracing for that
channel.
In the case where a ring buffer does not have any data ready, it makes
no sense to try to get a subbuffer for reading anyway, so the approach
was broken.
So return the stream id and stream instance id from the internal
data structures rather than reading it from the ring buffer.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 18 Mar 2019 20:20:36 +0000 (16:20 -0400)]
Fix: atomic_long_add_unless() returns a boolean
Because of a documentation error in older kernels, it was assumed that
atomic_long_add_unless would return the old value, but the
implementation actually returns a boolean.
Also add missing error code int 'ret' and compare against the right type
max value.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
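The boolean return convention can be shown with a user-space analogue built on C11 atomics. This is a sketch, not the kernel implementation: example_add_unless() and example_get_ref() are hypothetical names, and the -EOVERFLOW error code is an assumption.

```c
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add a to *v unless *v equals u. Returns true if the addition was
 * performed, false otherwise; it never returns the old value.
 */
static bool example_add_unless(atomic_long *v, long a, long u)
{
	long old = atomic_load(v);

	do {
		if (old == u)
			return false;
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(v, &old, old + a));
	return true;
}

/* Take a reference unless the counter already sits at the type's maximum. */
static int example_get_ref(atomic_long *refcount)
{
	if (!example_add_unless(refcount, 1, LONG_MAX))
		return -EOVERFLOW;
	return 0;
}
```

Testing the result as a boolean, and comparing against LONG_MAX rather than another type's limit, matches the two corrections described above.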
Michael Jeanson [Mon, 18 Mar 2019 20:20:35 +0000 (16:20 -0400)]
Fix: Revert "KVM: MMU: show mmu_valid_gen..." (v5.1)
See upstream commit:
commit b59c4830ca185ba0e9f9e046fb1cd10a4a92627a
Author: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Tue Feb 5 13:01:30 2019 -0800
Revert "KVM: MMU: show mmu_valid_gen in shadow page related tracepoints"
...as part of removing x86 KVM's fast invalidate mechanism, i.e. this
is one part of a revert of all patches from the series that introduced the
mechanism[1].
This reverts commit 2248b023219251908aedda0621251cffc548f258.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 18 Mar 2019 20:20:34 +0000 (16:20 -0400)]
Fix: pipe: stop using ->can_merge (v5.1)
See upstream commit:
commit 01e7187b41191376cee8bea8de9f907b001e87b4
Author: Jann Horn <jannh@google.com>
Date: Wed Jan 23 15:19:18 2019 +0100
pipe: stop using ->can_merge
Al Viro pointed out that since there is only one pipe buffer type to which
new data can be appended, it isn't necessary to have a ->can_merge field in
struct pipe_buf_operations, we can just check for a magic type.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 18 Mar 2019 20:20:33 +0000 (16:20 -0400)]
Fix: rcu: Remove wrapper definitions for obsolete RCU... (v5.1)
See upstream commit:
commit 6ba7d681aca22e53385bdb35b1d7662e61905760
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date: Wed Jan 9 15:22:03 2019 -0800
rcu: Remove wrapper definitions for obsolete RCU update functions
None of synchronize_rcu_bh, synchronize_rcu_bh_expedited, call_rcu_bh,
rcu_barrier_bh, synchronize_sched, synchronize_sched_expedited,
call_rcu_sched, rcu_barrier_sched, get_state_synchronize_sched, and
cond_synchronize_sched are actually used. This commit therefore removes
their trivial wrapper-function definitions.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 18 Mar 2019 20:20:32 +0000 (16:20 -0400)]
Fix: mm: create the new vm_fault_t type (v5.1)
See upstream commit:
commit 3d3539018d2cbd12e5af4a132636ee7fd8d43ef0
Author: Souptick Joarder <jrdr.linux@gmail.com>
Date: Thu Mar 7 16:31:14 2019 -0800
mm: create the new vm_fault_t type
Page fault handlers are supposed to return VM_FAULT codes, but some
drivers/file systems mistakenly return error numbers. Now that all
drivers/file systems have been converted to use the vm_fault_t return
type, change the type definition to no longer be compatible with 'int'.
By making it an unsigned int, the function prototype becomes
incompatible with a function which returns int. Sparse will detect any
attempts to return a value which is not a VM_FAULT code.
VM_FAULT_SET_HINDEX and VM_FAULT_GET_HINDEX values are changed to avoid
conflict with other VM_FAULT codes.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 15 Mar 2019 15:13:39 +0000 (11:13 -0400)]
Fix: extra-version-git.sh redirect stderr to /dev/null
Running make in a git repo that does not contain any tag prints:
fatal: No names found, cannot describe anything.
in the make and make clean outputs.
It's fine to have no tag name available (extra-version-git.sh will
return the value 0), but we should not print an error in the make
output. Redirect this error to /dev/null.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Michael Jeanson <mjeanson@efficios.com>
Jonathan Rajotte [Thu, 7 Mar 2019 19:58:00 +0000 (14:58 -0500)]
Move timekeeping blacklisting to a header file
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Thu, 7 Mar 2019 19:57:59 +0000 (14:57 -0500)]
Blacklist: kprobe for arm
This upstream kernel commit broke optimized kprobe.
commit e46daee53bb50bde38805f1823a182979724c229
Author: Kees Cook <keescook@chromium.org>
Date: Tue Oct 30 22:12:56 2018 +0100
ARM: 8806/1: kprobes: Fix false positive with FORTIFY_SOURCE
The arm compiler internally interprets an inline assembly label
as an unsigned long value, not a pointer. As a result, under
CONFIG_FORTIFY_SOURCE, the address of a label has a size of 4 bytes,
which was tripping the runtime checks. Instead, we can just cast the label
(as done with the size calculations earlier).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1639397
Reported-by: William Cohen <wcohen@redhat.com>
Fixes: 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions")
Cc: stable@vger.kernel.org
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: William Cohen <wcohen@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
It was introduced in the 4.20 cycle.
It was also backported to the 4.19 and 4.14 branch.
This issue is fixed upstream by [1] and is present in the 5.0 kernel
release.
[1] 0ac569bf6a7983c0c5747d6df8db9dc05bc92b6c
The fix was backported to 4.20, 4.19 and 4.14 branch.
It is included starting at:
v5.0.0
v4.20.13
v4.19.26
v4.14.104
Fixes #1174
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 14 Feb 2019 16:40:50 +0000 (11:40 -0500)]
Cleanup: tp mempool: Remove logically dead code
Found by Coverity:
CID 1391045 (#1 of 1): Logically dead code (DEADCODE)
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 9 Jan 2019 19:59:18 +0000 (14:59 -0500)]
Fix: btrfs: Remove fsid/metadata_fsid fields from btrfs_info
Introduced in v5.0.
See upstream commit:
commit de37aa513105f864d3c21105bf5542d498f21ca2
Author: Nikolay Borisov <nborisov@suse.com>
Date: Tue Oct 30 16:43:24 2018 +0200
btrfs: Remove fsid/metadata_fsid fields from btrfs_info
Currently btrfs_fs_info structure contains a copy of the
fsid/metadata_uuid fields. Same values are also contained in the
btrfs_fs_devices structure which fs_info has a reference to. Let's
reduce duplication by removing the fields from fs_info and always refer
to the ones in fs_devices. No functional changes.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 9 Jan 2019 19:59:17 +0000 (14:59 -0500)]
Fix: SUNRPC: Simplify defining common RPC trace events (v5.0)
See upstream commit:
commit dc5820bd21d84ee34770b0a1e2fca9378f8f7456
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Wed Dec 19 11:00:16 2018 -0500
SUNRPC: Simplify defining common RPC trace events
Clean up, no functional change is expected.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 9 Jan 2019 19:59:16 +0000 (14:59 -0500)]
Fix: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid
Introduced in v3.12.
See upstream commit:
commit 92cb6c5be8134db6f7c38f25f6afd13e444cebaf
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Wed Sep 4 22:09:50 2013 -0400
SUNRPC: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid
Instead of the pointer values, use the task and client identifier values
for tracing purposes.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 9 Jan 2019 19:59:15 +0000 (14:59 -0500)]
Fix: Remove 'type' argument from access_ok() function (v5.0)
See upstream commit:
commit 96d4f267e40f9509e8a66e2b39e8b95655617693
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Jan 3 18:57:57 2019 -0800
Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted in this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
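The access_ok() change above is the kind of API break an out-of-tree module usually absorbs with a version-gated wrapper. The following is a minimal userspace sketch of that pattern, not the code from this commit: every `sketch_`/`SKETCH_` name is hypothetical, the version macro is a stand-in for the kernel's `LINUX_VERSION_CODE`/`KERNEL_VERSION()`, and the stubbed range check only exists so the example compiles outside a kernel tree.

```c
#include <stddef.h>

/*
 * Userspace stand-ins so the sketch compiles outside a kernel tree.
 * Every sketch_/SKETCH_ name is hypothetical.
 */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

#ifndef SKETCH_LINUX_VERSION_CODE
#define SKETCH_LINUX_VERSION_CODE SKETCH_KERNEL_VERSION(5, 0, 0)
#endif

#define SKETCH_VERIFY_READ  0
#define SKETCH_VERIFY_WRITE 1

#if SKETCH_LINUX_VERSION_CODE >= SKETCH_KERNEL_VERSION(5, 0, 0)
/* Stub for the two-argument form: there is no 'type' parameter anymore. */
static inline int sketch_access_ok(const void *addr, size_t size)
{
	return addr != NULL && size > 0;
}
/* The wrapper keeps its three-argument shape and simply drops 'type'. */
#define sketch_compat_access_ok(type, addr, size) \
	((void)(type), sketch_access_ok(addr, size))
#else
/* Stub for the older three-argument form. */
static inline int sketch_access_ok(int type, const void *addr, size_t size)
{
	(void)type;
	return addr != NULL && size > 0;
}
#define sketch_compat_access_ok(type, addr, size) \
	sketch_access_ok(type, addr, size)
#endif
```

With a wrapper shaped like this, callers keep passing a type argument on every kernel version, and only the wrapper needs to know whether the underlying function still accepts it.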
Michael Jeanson [Thu, 6 Dec 2018 16:31:51 +0000 (11:31 -0500)]
Fix: timer instrumentation for RHEL 7.6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 23:37:45 +0000 (18:37 -0500)]
Add missing SPDX license identifiers to uprobes
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 21:11:57 +0000 (16:11 -0500)]
Drop support for kernels < 3.0 from Makefiles
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:54:41 +0000 (15:54 -0500)]
Drop support for kernels < 3.0 from writeback instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:54:27 +0000 (15:54 -0500)]
Drop support for kernels < 3.0 from workqueue instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:52:57 +0000 (15:52 -0500)]
Drop support for kernels < 3.0 from skb instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:52:41 +0000 (15:52 -0500)]
Drop support for kernels < 3.0 from scsi instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:52:31 +0000 (15:52 -0500)]
Drop support for kernels < 3.0 from sched instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:52:16 +0000 (15:52 -0500)]
Drop support for kernels < 3.0 from power instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:52:04 +0000 (15:52 -0500)]
Drop support for kernels < 3.0 from net instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:51:53 +0000 (15:51 -0500)]
Drop support for kernels < 3.0 from module instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:51:41 +0000 (15:51 -0500)]
Drop support for kernels < 3.0 from mm_vmscan instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:51:22 +0000 (15:51 -0500)]
Drop support for kernels < 3.0 from lock instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:51:14 +0000 (15:51 -0500)]
Drop support for kernels < 3.0 from kvm instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:51:04 +0000 (15:51 -0500)]
Drop support for kernels < 3.0 from kmem instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:50:51 +0000 (15:50 -0500)]
Drop support for kernels < 3.0 from jbd2 instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:50:38 +0000 (15:50 -0500)]
Drop support for kernels < 3.0 from irq instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:50:26 +0000 (15:50 -0500)]
Drop support for kernels < 3.0 from ext4 instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:46:09 +0000 (15:46 -0500)]
Drop support for kernels < 3.0 from block instrumentation
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:45:29 +0000 (15:45 -0500)]
Drop support for kernels < 3.0 from lttng-statedump-impl.c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:45:12 +0000 (15:45 -0500)]
Drop support for kernels < 3.0 from lttng-kernel-version.h
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:44:55 +0000 (15:44 -0500)]
Drop support for kernels < 3.0 from lttng-events.h
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:44:21 +0000 (15:44 -0500)]
Drop support for kernels < 3.0 from lib
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:40:23 +0000 (15:40 -0500)]
Drop spinlock.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:36:35 +0000 (15:36 -0500)]
Drop kstrtox.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:35:14 +0000 (15:35 -0500)]
Drop uuid.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:32:49 +0000 (15:32 -0500)]
Drop vzalloc.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:28:21 +0000 (15:28 -0500)]
Drop support for kernels < 3.0 from tracepoint.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:27:19 +0000 (15:27 -0500)]
Drop support for kernels < 3.0 from perf.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:26:23 +0000 (15:26 -0500)]
Drop support for kernels < 3.0 from atomic.h wrapper
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 20:24:01 +0000 (15:24 -0500)]
Drop compat patches for kernels < 2.6.36
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 8 Nov 2018 21:14:34 +0000 (16:14 -0500)]
Bump minimum kernel version to 3.0
Upstream Linux 3.0 was released 7 years ago; the oldest longterm release
still supported is 3.16. On the distro kernels side, we still cover:
RHEL / CentOS 7 and up
SLES11 SP2 and up
SLES12 and up
Ubuntu 12.04 (Precise) and up
Debian 7 (Wheezy) and up
Fedora 16 and up
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 5 Nov 2018 16:35:54 +0000 (11:35 -0500)]
Fix: ext4: adjust reserved cluster count when removing extents (v4.20)
See upstream commit :
commit 9fe671496b6c286f9033aedfc1718d67721da0ae
Author: Eric Whitney <enwlinux@gmail.com>
Date: Mon Oct 1 14:25:08 2018 -0400
ext4: adjust reserved cluster count when removing extents
Modify ext4_ext_remove_space() and the code it calls to correct the
reserved cluster count for pending reservations (delayed allocated
clusters shared with allocated blocks) when a block range is removed
from the extent tree. Pending reservations may be found for the clusters
at the ends of written or unwritten extents when a block range is removed.
If a physical cluster at the end of an extent is freed, it's necessary
to increment the reserved cluster count to maintain correct accounting
if the corresponding logical cluster is shared with at least one
delayed and unwritten extent as found in the extents status tree.
Add a new function, ext4_rereserve_cluster(), to reapply a reservation
on a delayed allocated cluster sharing blocks with a freed allocated
cluster. To avoid ENOSPC on reservation, a flag is applied to
ext4_free_blocks() to briefly defer updating the freeclusters counter
when an allocated cluster is freed. This prevents another thread
from allocating the freed block before the reservation can be reapplied.
Redefine the partial cluster object as a struct to carry more state
information and to clarify the code using it.
Adjust the conditional code structure in ext4_ext_remove_space to
reduce the indentation level in the main body of the code to improve
readability.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 5 Nov 2018 16:35:53 +0000 (11:35 -0500)]
Fix: signal: Remove SEND_SIG_FORCED (v4.20)
See upstream commit :
commit 4ff4c31a6e85f4c49fbeebeaa28018d002884b5a
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Mon Sep 3 10:39:04 2018 +0200
signal: Remove SEND_SIG_FORCED
There are no more users of SEND_SIG_FORCED so it may be safely removed.
Remove the definition of SEND_SIG_FORCED, its use in is_si_special,
its use in TP_STORE_SIGINFO, and its use in __send_signal as without
any users the uses of SEND_SIG_FORCED are now unnecessary.
This makes the code simpler, easier to understand and use. Users of
signal sending functions now no longer need to ask themselves do I
need to use SEND_SIG_FORCED.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 5 Nov 2018 16:35:52 +0000 (11:35 -0500)]
Fix: signal: Distinguish between kernel_siginfo and siginfo (v4.20)
See upstream commit :
commit ae7795bc6187a15ec51cf258abae656a625f9980
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Tue Sep 25 11:27:20 2018 +0200
signal: Distinguish between kernel_siginfo and siginfo
Linus recently observed that if we did not worry about the padding
member in struct siginfo it is only about 48 bytes, and 48 bytes is
much nicer than 128 bytes for allocating on the stack and copying
around in the kernel.
The obvious thing of only adding the padding when userspace is
including siginfo.h won't work as there are sigframe definitions in
the kernel that embed struct siginfo.
So split siginfo in two; kernel_siginfo and siginfo. Keeping the
traditional name for the userspace definition. While the version that
is used internally to the kernel and ultimately will not be padded to
128 bytes is called kernel_siginfo.
The definition of struct kernel_siginfo I have put in include/signal_types.h
A set of buildtime checks has been added to verify the two structures have
the same field offsets.
To make it easy to verify the change kernel_siginfo retains the same
size as siginfo. The reduction in size comes in a following change.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
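The build-time checks mentioned in the kernel_siginfo commit above can be illustrated with C11 static assertions. This is a simplified sketch under stated assumptions: the two structures below are hypothetical and only carry three leading fields, whereas the real siginfo layout is a larger union, and the sketch shows the eventual unpadded kernel-side structure rather than the same-size intermediate step described in the commit.

```c
#include <stddef.h>

#define SKETCH_SI_MAX_SIZE 128

/* Hypothetical kernel-internal variant: no padding to 128 bytes. */
struct sketch_kernel_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
};

/* Hypothetical userspace-visible variant: padded to 128 bytes. */
struct sketch_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
	char pad[SKETCH_SI_MAX_SIZE - 3 * sizeof(int)];
};

/* Fail the build if a shared field sits at a different offset. */
#define SKETCH_CHECK_OFFSET(field) \
	_Static_assert(offsetof(struct sketch_siginfo, field) == \
		       offsetof(struct sketch_kernel_siginfo, field), \
		       "offset mismatch: " #field)

SKETCH_CHECK_OFFSET(si_signo);
SKETCH_CHECK_OFFSET(si_errno);
SKETCH_CHECK_OFFSET(si_code);

_Static_assert(sizeof(struct sketch_siginfo) == SKETCH_SI_MAX_SIZE,
	       "padded variant must stay at 128 bytes");
```

Because the checks run at compile time, a layout drift between the two structures breaks the build instead of silently corrupting data copied between them.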
Mathieu Desnoyers [Sat, 27 Oct 2018 19:33:02 +0000 (20:33 +0100)]
statedump cpu topology: introduce LTTNG_HAVE_STATEDUMP_CPU_TOPOLOGY
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Julien Desfossez [Fri, 26 Oct 2018 19:55:18 +0000 (15:55 -0400)]
CPU topology statedump on x86
New statedump tracepoint to dump the active CPU/NUMA topology. This
makes it possible to know which CPUs are SMT siblings or on the same
socket. For now only x86 is supported because each architecture has
different fields.
The field "architecture" is statically defined and should be present in
all implementations so parsing tools know what content to expect.
Example output:
lttng_statedump_cpu_topology: { cpu_id = 3 }, { architecture = "x86",
cpu_id = 0, vendor = "GenuineIntel", family = 6, model = 142,
model_name = "Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz",
physical_id = 0, core_id = 0, cores = 2 }
Signed-off-by: Julien Desfossez <jdesfossez@digitalocean.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 26 Oct 2018 22:01:17 +0000 (18:01 -0400)]
Fix: update kvm instrumentation for SLES12 SP2 LTSS >= 4.4.121-92.92
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 24 Oct 2018 19:43:49 +0000 (20:43 +0100)]
Fix: Add missing const to lttng_tracepoint_ptr_deref prototype
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 12 Oct 2018 18:47:53 +0000 (14:47 -0400)]
Fix: adapt to kernel relative references
Upstream Linux commit 46e0c9be20 introduces relative references in the
struct tracepoint array of pointers.
Up to (including) v4.19-rc7, the upstream kernel has a type mismatch bug
that allows it to pass an out-of-bounds end of array to module
coming/going notifiers.
The fix for upstream Linux is to introduce a new type: tracepoint_ptr_t,
which can be used to adequately iterate on the array. It is introduced
prior to v4.19 as commit 9c0be3f6b5d77 "tracepoint: Fix tracepoint array
element size mismatch".
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
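To make the relative-reference idea above concrete, here is a small userspace sketch. It assumes that each array entry is a 32-bit signed offset measured from the entry's own address, and it uses hypothetical `sketch_` names throughout; it is not the kernel's tracepoint_ptr_t implementation. The point it illustrates is that the iterator must advance by the size of the entry type, not by the size of a full pointer.

```c
#include <stdint.h>

/* Hypothetical relative reference: offset from the slot's own address. */
typedef int32_t sketch_rel_ptr_t;

static inline void sketch_rel_store(sketch_rel_ptr_t *slot, const void *target)
{
	*slot = (sketch_rel_ptr_t)((intptr_t)target - (intptr_t)slot);
}

static inline void *sketch_rel_deref(const sketch_rel_ptr_t *slot)
{
	return (void *)((intptr_t)slot + *slot);
}

/* Entries and targets live in one object so the offsets fit in 32 bits. */
struct sketch_tp_table {
	sketch_rel_ptr_t entries[2];
	int targets[2];
};

/*
 * Iterate with the entry type itself. Stepping through the array as if
 * it held full-size pointers would land on the wrong end of array on a
 * 64-bit machine.
 */
static inline int sketch_sum_targets(const struct sketch_tp_table *table)
{
	const sketch_rel_ptr_t *iter;
	int sum = 0;

	for (iter = table->entries; iter < table->entries + 2; iter++)
		sum += *(const int *)sketch_rel_deref(iter);
	return sum;
}
```

A dedicated entry type gives both the producer of the array and its consumers a single definition of the element size, which is the role the commit message assigns to tracepoint_ptr_t.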
Mathieu Desnoyers [Wed, 17 Oct 2018 19:41:37 +0000 (15:41 -0400)]
Fix: sync event enablers before choosing header type
On session start, we should allocate the event IDs before figuring
out the number of events per channel and select the proper header
type.
Without this, the number of events is always perceived to be 0,
which selects the "compact" header type. For a channel containing
many events (e.g. enable-event -k -a), this selects an inefficient
header type. With this fix, it selects the "large" header type,
which is more appropriate for a larger number of event IDs.
This will lead to a reduced trace throughput for tracing workloads
that have many events.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
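The ordering problem described above can be sketched as follows. This is an illustration only: the structure, the function names, and the threshold of 31 event IDs for the compact header are assumptions made for the example, not values confirmed by this commit message.

```c
/* Hypothetical header types for a channel. */
enum sketch_header_type {
	SKETCH_HEADER_COMPACT = 1,
	SKETCH_HEADER_LARGE = 2,
};

/* Assumed limit on the number of event IDs a compact header can encode. */
#define SKETCH_COMPACT_MAX_IDS 31

struct sketch_channel {
	unsigned int pending_enablers;	/* requested events without an ID yet */
	unsigned int nr_event_ids;	/* event IDs allocated so far */
};

/* Allocate IDs for every pending enabler. */
static void sketch_sync_enablers(struct sketch_channel *chan)
{
	chan->nr_event_ids += chan->pending_enablers;
	chan->pending_enablers = 0;
}

/*
 * Session start: sync first, then choose the header type. Choosing
 * before the sync would always see nr_event_ids == 0 and pick the
 * compact header.
 */
static enum sketch_header_type sketch_session_start(struct sketch_channel *chan)
{
	sketch_sync_enablers(chan);
	return chan->nr_event_ids < SKETCH_COMPACT_MAX_IDS ?
		SKETCH_HEADER_COMPACT : SKETCH_HEADER_LARGE;
}
```

In this model, a channel with hundreds of pending enablers gets the large header only because the sync runs before the count is read.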
Francis Deslauriers [Tue, 16 Oct 2018 19:23:22 +0000 (15:23 -0400)]
Fix: implicit declarations caused by buffer size checks.
Issue
=====
Three kernel functions used in the following commit are unavailable on
some supported kernels:
commit 1f0ab1eb0409d23de5f67cc588c3ea4cee4d10e0
Prevent allocation of buffers if exceeding available memory
* si_mem_available() was added in kernel 4.6 with commit d02bd27.
* {set, clear}_current_oom_origin() were added in kernel 3.8 with commit e1e12d2f.
Solution
========
Add wrappers around these functions such that older kernels will build
with these functions defined as NOP or trivial return value.
wrapper_check_enough_free_pages() uses the si_mem_available() kernel
function to check whether the number of pages requested, passed as a
parameter, is smaller than the number of pages available on the machine.
If the si_mem_available() kernel function is unavailable, we always
return true.
wrapper_set_current_oom_origin() function wraps the
set_current_oom_origin() kernel function when it is available.
If set_current_oom_origin() is unavailable the wrapper is empty.
wrapper_clear_current_oom_origin() function wraps the
clear_current_oom_origin() kernel function when it is available.
If clear_current_oom_origin() is unavailable the wrapper is empty.
Drawbacks
=========
None.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
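The wrapper behavior described above can be sketched in userspace as shown below. The feature macro, the stubbed availability function, and its fixed return value are all hypothetical stand-ins; only the decision logic (compare against the available page count when possible, otherwise always return true) follows the commit message.

```c
#include <stdbool.h>

/* Hypothetical feature switch standing in for a kernel version check. */
#ifndef SKETCH_HAVE_SI_MEM_AVAILABLE
#define SKETCH_HAVE_SI_MEM_AVAILABLE 1
#endif

#if SKETCH_HAVE_SI_MEM_AVAILABLE
/* Stub for the kernel function: pretend 1024 pages are available. */
static long sketch_si_mem_available(void)
{
	return 1024;
}

static inline bool sketch_wrapper_check_enough_free_pages(unsigned long num_pages)
{
	long available = sketch_si_mem_available();

	return available > 0 && num_pages < (unsigned long)available;
}
#else
/* Older kernels: no estimate is available, so never refuse. */
static inline bool sketch_wrapper_check_enough_free_pages(unsigned long num_pages)
{
	(void)num_pages;
	return true;
}
#endif
```

Keeping the fallback permissive means older kernels behave exactly as they did before the buffer size check was introduced.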
Francis Deslauriers [Thu, 11 Oct 2018 21:37:00 +0000 (17:37 -0400)]
Prevent allocation of buffers if exceeding available memory
Issue
=====
The running system can be rendered unusable by creating channel
buffers larger than the available memory of the system, resulting in
random processes being killed by the OOM-killer.
These simple commands trigger the crash on my laptop with 15G of RAM:
lttng create
lttng enable-channel -k --subbuf-size=16G --num-subbuf=1 chan0
Note that the subbuf-size * num-subbuf is larger than the physical
memory.
Solution
========
Get an estimate of the number of available pages and return ENOMEM if
there are not enough pages to cover the needs of the caller. Also, mark
the calling user thread as the first target for the OOM killer in case
the estimate of available pages was wrong.
This greatly reduces the attack surface of this issue as well as reducing
its potential impact.
This approach is inspired by the one taken by the Linux kernel
trace ring buffer[1].
Drawback
========
This approach is imperfect because it's based on an estimate.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/trace/ring_buffer.c#n1172
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
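The arithmetic behind the check above can be worked through in a short sketch. The function name, its parameters, and the overflow guard are assumptions added for illustration; the commit message only states that an estimate of available pages is compared with the caller's needs and that ENOMEM is returned when there are not enough pages.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 0 when the requested channel buffers fit
 * in the estimated number of available pages, -ENOMEM otherwise.
 */
static int sketch_check_buffer_request(uint64_t subbuf_size, uint64_t num_subbuf,
				       uint64_t page_size, uint64_t available_pages)
{
	uint64_t pages_per_subbuf, num_pages;

	if (page_size == 0)
		return -EINVAL;

	/* Round each sub-buffer up to a whole number of pages. */
	pages_per_subbuf = subbuf_size / page_size + (subbuf_size % page_size != 0);

	/* Refuse requests whose page count would overflow 64 bits. */
	if (num_subbuf != 0 && pages_per_subbuf > UINT64_MAX / num_subbuf)
		return -ENOMEM;

	num_pages = pages_per_subbuf * num_subbuf;
	return num_pages < available_pages ? 0 : -ENOMEM;
}
```

With 4096-byte pages, the 16G single sub-buffer from the reproducer needs 4194304 pages, while 15G of memory provides at most 3932160, so the request is refused before any allocation takes place.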
Mathieu Desnoyers [Fri, 12 Oct 2018 19:02:06 +0000 (15:02 -0400)]
Fix: btrfs instrumentation namespacing
Triggers this warning:
[ 47.922818] WARNING: CPU: 29 PID: 1163 at /home/efficios/git/lttng-modules/lttng-probes.c:83 fixup_lazy_probes+0x1fb/0x210 [lttng_tracer]
[ 47.951661] Modules linked in: lttng_probe_compaction(O+) lttng_probe_btrfs(O) lttng_probe_block(O) lttng_ring_buffer_metadata_mmap_client(O) lttng_ring_buffer_client_mmap_overwrite(O) lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) lttng_tracer(O) lttng_statedump(O) lttng_ftrace(O) lttng_kprobes(O) lttng_clock(O) lttng_uprobes(O) lttng_lib_ring_buffer(O) lttng_kretprobes(O)
[ 48.039200] CPU: 29 PID: 1163 Comm: modprobe Tainted: G O 4.19.0-rc7+ #19
[ 48.055628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 48.078273] RIP: 0010:fixup_lazy_probes+0x1fb/0x210 [lttng_tracer]
[ 48.092257] Code: 01 48 39 d0 76 24 43 80 3c 3a 5f 75 1d 44 8b 44 24 14 4c 8b 4c 24 08 41 83 c0 01 45 39 45 10 0f 86 7a fe ff ff e9 6c ff ff ff <0f> 0b e9 6e fe ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f
[ 48.141947] RSP: 0018:ffffafe74777bc40 EFLAGS: 00010286
[ 48.153733] RAX: 00000000ffffffff RBX: dead000000000200 RCX: 0000000000000061
[ 48.173986] RDX: 0000000000000005 RSI: ffffffffc04b728c RDI: ffffffffc04b74a5
[ 48.193595] RBP: dead000000000100 R08: 0000000000000062 R09: ffffffffc04bd040
[ 48.211573] R10: ffffffffc04b74a5 R11: ffffffff920e204d R12: ffffffffffffffff
[ 48.232131] R13: ffffffffc04bd000 R14: ffffffffc03f0078 R15: 0000000000000005
[ 48.246832] FS: 00007f5495093540(0000) GS:ffff8fcf0fb40000(0000) knlGS:0000000000000000
[ 48.267475] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 48.280348] CR2: 00007ffde00f8328 CR3: 00000018078a2001 CR4: 00000000001606e0
[ 48.302404] Call Trace:
[ 48.309201] lttng_probe_register+0xd5/0xe0 [lttng_tracer]
[ 48.326993] ? __event_probe__compaction_isolate_template+0x2c0/0x2c0 [lttng_probe_compaction]
[ 48.345702] do_one_initcall+0x46/0x1c8
[ 48.360147] ? kobject_uevent_env+0x117/0x810
[ 48.370388] ? _cond_resched+0x15/0x40
[ 48.380649] ? kmem_cache_alloc_trace+0x153/0x1c0
[ 48.394706] do_init_module+0x5b/0x20b
[ 48.404412] load_module+0x2194/0x2980
[ 48.418759] ? ima_post_read_file+0xe2/0x120
[ 48.427716] ? __do_sys_finit_module+0xe9/0x110
[ 48.438226] __do_sys_finit_module+0xe9/0x110
[ 48.452983] do_syscall_64+0x65/0x190
[ 48.461521] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 48.473173] RIP: 0033:0x7f5494ba6839
[ 48.486630] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[ 48.529108] RSP: 002b:00007ffde00fb408 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 48.549652] RAX: ffffffffffffffda RBX: 000055d326ee2ac0 RCX: 00007f5494ba6839
[ 48.564325] RDX: 0000000000000000 RSI: 000055d326cbac2e RDI: 0000000000000005
[ 48.582576] RBP: 000055d326cbac2e R08: 0000000000000000 R09: 000055d326ee2ac0
[ 48.596892] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
[ 48.617576] R13: 000055d326ee2d80 R14: 0000000000040000 R15: 000055d326ee2ac0
[ 48.633713] ---[ end trace c265591e0ada440c ]---
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 10 Oct 2018 18:17:46 +0000 (14:17 -0400)]
Fix: Convert rcu tracepoints to gp_seq (v4.19)
See upstream commits :
commit 477351f7829d2268769c5d545511081555066529
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue May 1 12:54:11 2018 -0700
rcu: Convert rcu_grace_period tracepoint to gp_seq
This commit makes the rcu_grace_period tracepoint use gp_seq instead
of ->gpnum or ->completed. It also introduces a "cpuofl-bgp" string to
less obscurely indicate when a CPU has gone offline while a grace period
is waiting on it.
commit 63d86a7e85f84b8ac3b2f394570965aedbb03787
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue May 1 13:08:46 2018 -0700
rcu: Convert rcu_grace_period_init tracepoint to gp_seq
This commit makes the rcu_grace_period_init tracepoint use gp_seq instead
of ->gpnum.
commit 598ce09480efb6b48799df60c66bac70bea5ef54
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue May 1 13:35:20 2018 -0700
rcu: Convert rcu_preempt_task tracepoint to ->gp_seq
This commit makes the rcu_preempt_task tracepoint use ->gp_seq instead
of ->gpnum.
commit 865aa1e08d8aefdfd1f5d30ecfce1b8ef8cd520a
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue May 1 13:35:20 2018 -0700
rcu: Convert rcu_unlock_preempted_task tracepoint to ->gp_seq
This commit makes the rcu_unlock_preempted_task tracepoint use ->gp_seq
instead of ->gpnum.
commit db023296f0115d2fe01fdabad54678f2b806da23
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue May 1 13:35:20 2018 -0700
rcu: Convert rcu_quiescent_state_report tracepoint to ->gp_seq
This commit makes the rcu_quiescent_state_report tracepoint use ->gp_seq
instead of ->gpnum.
commit fee5997c17562e95fb1fecc142efb2da0934baa4
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue May 1 13:35:20 2018 -0700
rcu: Convert rcu_fqs tracepoint to ->gp_seq
This commit makes the rcu_fqs tracepoint use ->gp_seq instead of ->gpnum.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 7 Sep 2018 16:21:13 +0000 (12:21 -0400)]
Fix: tracing: Centralize preemptirq tracepoints (4.19)
See upstream commit:
commit c3bc8fd637a9623f5c507bd18f9677effbddf584
Author: Joel Fernandes (Google) <joel@joelfernandes.org>
Date: Mon Jul 30 15:24:23 2018 -0700
tracing: Centralize preemptirq tracepoints and unify their usage
This patch detaches the preemptirq tracepoints from the tracers and
keeps it separate.
Advantages:
* Lockdep and irqsoff event can now run in parallel since they no longer
have their own calls.
* This unifies the usecase of adding hooks to an irqsoff and irqson
event, and a preemptoff and preempton event.
3 users of the events exist:
- Lockdep
- irqsoff and preemptoff tracers
- irqs and preempt trace events
The unification cleans up several ifdefs and makes the code in preempt
tracer and irqsoff tracers simpler. It gets rid of all the horrific
ifdeferry around PROVE_LOCKING and makes configuration of the different
users of the tracepoints more easy and understandable. It also gets rid
of the time_* function calls from the lockdep hooks used to call into
the preemptirq tracer which is not needed anymore. The negative delta in
lines of code in this patch is quite large too.
In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
as a single point for registering probes onto the tracepoints. With
this, the web of config options for preempt/irq toggle tracepoints and its
users becomes:
 PREEMPT_TRACER   PREEMPTIRQ_EVENTS  IRQSOFF_TRACER PROVE_LOCKING
       |                 |     \         |           |
       \    (selects)    /      \        \ (selects) /
      TRACE_PREEMPT_TOGGLE       ---->   TRACE_IRQFLAGS
                      \                  /
                       \ (depends on)  /
                     PREEMPTIRQ_TRACEPOINTS
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 7 Sep 2018 16:21:12 +0000 (12:21 -0400)]
Fix: net: expose sk wmem in sock_exceed_buf_limit tracepoint (4.19)
See upstream commit:
commit d6f19938eb031ee2158272757db33258153ae59c
Author: Yafang Shao <laoar.shao@gmail.com>
Date: Sun Jul 1 23:31:30 2018 +0800
net: expose sk wmem in sock_exceed_buf_limit tracepoint
Currently trace_sock_exceed_buf_limit() only shows rmem info,
but wmem limit may also be hit.
So expose wmem info in this tracepoint as well.
Regarding memcg, I think it is better to introduce a new tracepoint (if
that is needed), i.e. trace_memcg_limit_hit, rather than show memcg info in
trace_sock_exceed_buf_limit.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jonathan Rajotte [Wed, 19 Sep 2018 21:48:49 +0000 (17:48 -0400)]
Fix: access migrate_disable field directly
For stable real-time kernels > 4.9, the __migrate_disabled utility symbol
is not always exported. This can result in linking problems at build time
and runtime, preventing the loading of the tracer.
The problem was reported to the RT community. [1] [2]
A solution is to access the field directly instead of using the
utility wrapper.
It is important to note that the field is now available for other
configurations than CONFIG_PREEMPT_RT_FULL. For now, we choose to
expose the migratable context only for configurations where
CONFIG_PREEMPT_RT_FULL is set.
Based on the configuration dependency of the kernels, selecting
CONFIG_PREEMPT_RT_FULL ensures the presence of the migrate_disable
field.
Initial bug report [3].
[1] https://marc.info/?l=linux-rt-users&m=153730414126984&w=2
[2] https://marc.info/?l=linux-rt-users&m=153729444223779&w=2
[3] https://lists.lttng.org/pipermail/lttng-dev/2018-September/028216.html
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
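The approach described above, reading the field instead of calling a helper that may not be exported, can be sketched like this. The task structure, the configuration macro, and the assumption that "migratable" is simply the negation of a nonzero migrate_disable value are all hypothetical stand-ins for the real kernel definitions.

```c
/* Hypothetical stand-in for the relevant part of the task structure. */
struct sketch_task {
	int migrate_disable;	/* assumed: nonzero while migration is disabled */
};

/* Hypothetical switch standing in for CONFIG_PREEMPT_RT_FULL. */
#ifndef SKETCH_CONFIG_PREEMPT_RT_FULL
#define SKETCH_CONFIG_PREEMPT_RT_FULL 1
#endif

#if SKETCH_CONFIG_PREEMPT_RT_FULL
/*
 * Only expose the context when the configuration guarantees that the
 * field exists. Reading the field directly avoids a link-time
 * dependency on a helper symbol.
 */
static inline int sketch_task_migratable(const struct sketch_task *task)
{
	return !task->migrate_disable;
}
#endif
```

A direct field read is resolved at compile time from the structure layout, so it cannot fail at module load the way an unexported symbol can.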
Mathieu Desnoyers [Fri, 7 Sep 2018 21:55:32 +0000 (17:55 -0400)]
Fix: out of memory error handling
CPU hotplug handles teardown on failure to complete adding an instance
of CPU hotplug. Trying to remove after a failed "add" on that instance
triggers a NULL pointer dereference OOPS.
Fixes: #1167
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 30 Aug 2018 15:50:33 +0000 (11:50 -0400)]
Fix: uprobes: missing break in lttng_event_ioctl()
Found by Coverity:
** CID 1395322: Control flow issues (MISSING_BREAK)
/lttng-abi.c: 1465 in lttng_event_ioctl()
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
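For readers unfamiliar with this defect class, the sketch below shows a command dispatcher with the break statement in place, and a comment marking what happens without it. The command names and return values are hypothetical and unrelated to the actual lttng_event_ioctl() commands.

```c
/* Hypothetical commands for a small ioctl-style dispatcher. */
enum sketch_cmd {
	SKETCH_CMD_ENABLE,
	SKETCH_CMD_ADD_CALLSITE,
	SKETCH_CMD_DISABLE,
};

static int sketch_dispatch(enum sketch_cmd cmd)
{
	int ret;

	switch (cmd) {
	case SKETCH_CMD_ENABLE:
		ret = 1;
		break;
	case SKETCH_CMD_ADD_CALLSITE:
		ret = 2;
		/*
		 * Without this break, execution would continue into the
		 * next case and the result would be overwritten with 3.
		 */
		break;
	case SKETCH_CMD_DISABLE:
		ret = 3;
		break;
	default:
		ret = -1;
		break;
	}
	return ret;
}
```

Static analyzers flag a case that ends without break, return, or an explicit fall-through marker because the behavior is legal C and otherwise goes unnoticed.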
Francis Deslauriers [Thu, 30 Aug 2018 01:36:47 +0000 (21:36 -0400)]
Fix: ACCESS_ONCE was removed in 4.15, use READ_ONCE instead
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
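As background for the rename above, both macros exist to force a single, non-elided read of a variable. The following userspace sketch shows the general idea with a volatile cast; the `sketch_read_once` name is hypothetical, the macro relies on the GCC/Clang `__typeof__` extension, and it is a simplification rather than the kernel's READ_ONCE() definition.

```c
/*
 * Hypothetical single-read macro: the volatile-qualified access tells
 * the compiler to perform the load as written, instead of caching or
 * merging it with neighboring accesses.
 */
#define sketch_read_once(x) (*(const volatile __typeof__(x) *)&(x))

/* Shared flag that another context is assumed to update. */
static int sketch_flag;

static inline int sketch_poll_flag(void)
{
	return sketch_read_once(sketch_flag);
}
```

Code that must build on both sides of such a rename typically funnels every use through one project-local macro, so only that definition changes with the kernel version.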
Francis Deslauriers [Wed, 29 Aug 2018 16:49:59 +0000 (12:49 -0400)]
Fix: instruction pointer has different names across arch
Different terms are used to refer to the instruction pointer depending
on the CPU architecture.
For example:
x86 -> ip
powerpc -> nip
RISC-V -> sepc
ARM -> ARM_pc
Microblaze -> pc
To fix this issue, we use the instruction_pointer() kernel function
(or macro depending on the arch) to get the right field in the pt_regs
struct for the current architecture.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
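The portability problem above can be illustrated with a small sketch in which each architecture names the field differently and a single accessor hides the difference. The structures, the `SKETCH_ARCH_*` switches, and the accessor macro are hypothetical stand-ins; in the kernel the accessor role is played by instruction_pointer() and the real pt_regs layouts are far larger.

```c
/* Hypothetical per-architecture register structures. */
#if defined(SKETCH_ARCH_POWERPC)
struct sketch_pt_regs { unsigned long nip; };
#define sketch_instruction_pointer(regs) ((regs)->nip)
#elif defined(SKETCH_ARCH_RISCV)
struct sketch_pt_regs { unsigned long sepc; };
#define sketch_instruction_pointer(regs) ((regs)->sepc)
#else /* default: x86-like naming */
struct sketch_pt_regs { unsigned long ip; };
#define sketch_instruction_pointer(regs) ((regs)->ip)
#endif

/*
 * Probe-side code only uses the accessor, so it builds unchanged
 * whichever field name the architecture chose.
 */
static inline unsigned long sketch_uprobe_record_ip(const struct sketch_pt_regs *regs)
{
	return sketch_instruction_pointer(regs);
}
```

Referring to a field such as `regs->ip` directly would compile only on the architectures that happen to use that name.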
Francis Deslauriers [Wed, 29 Aug 2018 15:45:49 +0000 (11:45 -0400)]
Fix: build failures when CONFIG_UPROBES is absent
Problems
========
- There is a typo in the struct name of the parameters of the stub version
of the lttng_uprobes_add_callsite function,
- We are building the lttng-uprobes.o object file even when
CONFIG_UPROBES is absent.
Both of these are causing build errors.
Fixes
=====
- Replace struct lttng_kernel_callsite_uprobe by struct
lttng_kernel_event_callsite,
- Only add the lttng-uprobes.o object file to the needed artefacts if
CONFIG_UPROBES is present.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Tue, 14 Nov 2017 19:23:11 +0000 (14:23 -0500)]
uprobe: Support multiple call sites for the same uprobe event
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Tue, 27 Jun 2017 20:23:27 +0000 (16:23 -0400)]
uprobe: Receive file descriptor from session instead of path to file
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Thu, 15 Jun 2017 17:40:35 +0000 (13:40 -0400)]
uprobe: Mark uprobe event as registered
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>